On Wed, Jun 23, 2010 at 10:42:18AM -0700, timri...@appahost.com wrote: > > The reason for the discussion was Brian's intrusion detection > implementation stored the incoming packets in a hash > table. The key to the hash table was quite > large -- inbound IP address, outbound IP address, inbound port, and > outbound port. I thought to my self, on a very large implementation (say > Google) the table could grow to a billion entries. Could a hash table > store this amount in memory? Could you allocate an array of half the > total memory? Could you allocate an array of a billion integers? Brian, > on his laptop, couldn't allocate a billion integers. But he could > allocate a billion characters (bytes). Since I thought both bytes and > integers were words, and since I thought memory stored words > like registers stored words, we had our discussion.
What started this is that having a hash provides ideally a O(1). So, I say that to achieve this, one would ideally want to have a large array to store your IP addresses and a hashing function that provides good distribution. To which, Tim pointed to using a smaller array and using chaining, because he initially thought that arrays were limited to smaller sizes than was observed by doing some simple tests. To which I replied, using a fixed size array and the chaining could cause the hash to decay into a linked list which has a O(n). And to which Tim has noted using a large fixed size array may just take up all your architectural limits of the program. I imagine that having such a large array would push the lower bound of the stack up in memory, or if allocated at runtime, would push the heap way down from the top. In my Genetic Algorithm, I used Gnome's glib to do the hashing for me. I figured that someone else already looked at this problem. In my Genetic Algorithm, I am only storing a maximum of 32 bits in the hash key. In the nProbe code, Luca Deri wrote the hash, using the key values as Tim described. I have not figured out exactly how Luca does it, but I will assume that he used a suitable method that works well enough for me. ;-) If I don't contain the scope on my Master's Project, I won't finish! Thus, this was the impetus for our discussion. ;-) brian -- Brian Lavender http://www.brie.com/brian/ "There are two ways of constructing a software design. One way is to make it so simple that there are obviously no deficiencies. And the other way is to make it so complicated that there are no obvious deficiencies." Professor C. A. R. Hoare The 1980 Turing award lecture _______________________________________________ vox-tech mailing list vox-tech@lists.lugod.org http://lists.lugod.org/mailman/listinfo/vox-tech