Thomas, Thomas Hruska wrote:
>An associative hash (a.k.a. associative array) implements either a hash >table or a B-tree - usually the latter in C++ because stl hash_map >specifically implements a hash table. > > This is however not quite what you said in your original post. In your original post you said: "An associative hash is a data structure that implements a B-tree and thus has O(log n) insert and search times." The 2 terms "associative array" and "associative hash" are not entirely synonymous or interchangeable: "associative hash" says something about the implementation detail and implies using hashing, whereas "associative array" defines an abstract datatype mapping a set of keys to a set of values. Thus, when you said "associative hash" (and did no use the more general ADT term of "associate array"), and then also stated that it was (by implication) **always** implemented as a B-Tree with O(log n) search time, it appeared like a mistake to me (or at least an incomplete answer). Hence my response. Thus, I felt that if you meant to use the the general term of an "associative map" in your answer then you should've at least mentioned both possible implementations in your answer (using B-Trees or Hash tables) with both time complexities to avoid giving others a wrong impression. That's all I was trying to point out, I apologise if this correction came over as overly critical or critical of you personally. >http://en.wikipedia.org/wiki/Associative_array > >So I'm using correct terminology. > > > Just for completeness, I'll also quote the first 2 paragraphs of the entry on "Hash table" in wikipedia (__emphasis__ mine): "In computer science, a hash table, or a __hash map__, is a data structure that associates keys with values. The primary operation it supports efficiently is a lookup: given a key (e.g. a person's name), find the corresponding value (e.g. that person's telephone number). It works by transforming the key using a hash function into a hash, a number that the hash table uses to locate the desired value. A small phone book as a hash table. Enlarge A small phone book as a hash table. Hash tables are often used to implement __associative arrays__, sets and caches. Like arrays, hash tables __provide constant-time O(1) lookup on average__, regardless of the number of items in the table. However, the rare __worst-case__ lookup time can be as bad as __O(n)__. Compared to __other__ associative array data structures, hash tables are most useful when large numbers of records of data are to be stored. Hash tables store data in pseudo-random locations, so accessing the data in a sorted manner is a very time consuming operation. __Other data structures__ such as self-balancing __binary search trees__ generally operate more slowly (since their lookup time is __O(log n)__) and are rather more complex to implement than hash tables but maintain a sorted data structure at all times." So, to be clear: Hashing has a __general__ time complexity of O(1), which degerates in the worst case to O(n). Other structures like B-trees have __worst case__ complexity of O(log n), but has other advantages like being traversable in sorted order. This was all I was trying to ensure everyone else was aware of. This did not come accross from your original answer. >Your suggestion to load the entire database into memory is not a good >one. According to the OP, it is roughly 100MB...when you take into >consideration data structures and data alignment, loading the whole >thing would probably consume at least 300MB RAM. > > > The original poster originally said 30Mb, and only later had a proper look and reported back that it was in fact a 100Mb. The original thought I had ocurred with the 30Mb size in mind. And in any case, I never actually __strongly recommended__ that he go down this road at all, I merely mentioned it as "a thought that had gone through my mind" given the orignal 30Mb size of the database, in order to share it with the rest of the group. Further, I also said "...with the amount of growth per week, such a solution may not be scalable so is __possibly not such a good idea from that point of view either...__" So, I already pointed out in my own post that it might not be such a good idea, and hadn't even considered other possible implementation details like data alignment. I'm sorry if I left the impression that I was actually recommending he actually goes down this road. >You should get yourself a copy of my book on Safe C++ Design Principles. > Your assumptions that accessing a hash table is always O(1) are false. > When a hash collision occurs, most hash tables switch to a O(n) search >algorithm (linked list). Also, as stated in the book, O(1) algorithms >usually come with a side-effect of consuming other resources...CPU, >memory, etc. In this particular case, calculating a good hash key >typically takes many CPU cycles. > > You seem to have misread what I said. I said: "B-Trees have O(log n) lookup time as you say, but hashing/hash tables/hash maps don't use B-Trees and in fact has O(1) lookup time. It takes just as long __(in general case)__ to look up an item from 100 as it does from 100 000 000." Note the "general case" comment. __Of course__ the theorectical O(1) time complexity of hash-tables deteriorates into worse than the general case due to practical considerations (collisions and whatever else) in some situations. I know that just as well as you. Indeed, the wikipedia page I mentioned also clearly states that, so to be clear, I never said or tried to say that accessing hash-tables is __always__ pedantically, strictly, O(1). I merely stated that hash tables generally have a time complexity of O(1) (which I'm sure we can agree on) and that "__in the general case__ it taks just as long to look up an item from 100 as from 100 000 000" which is approximately true. In closing, I'd like to apologise if my original reply was perhaps a bit abrupt and caused you offense. I was simply trying to add on to/correct what appeared to me to be incorrect (or incomplete) information. Sincerely, Walter Prins > > ----------------------------------------------------- Home page: http://groups.yahoo.com/group/delphi-en/ To unsubscribe: [EMAIL PROTECTED] Yahoo! Groups Links <*> To visit your group on the web, go to: http://groups.yahoo.com/group/delphi-en/ <*> To unsubscribe from this group, send an email to: [EMAIL PROTECTED] <*> Your use of Yahoo! Groups is subject to: http://docs.yahoo.com/info/terms/

