Re: [HACKERS] Hash support for arrays

2010-11-04 Thread Dean Rasheed
On 3 November 2010 09:24, Nicolas Barbier nicolas.barb...@gmail.com wrote: 2010/11/2 Kenneth Marshall k...@rice.edu: Given that our hash implimentation mixes the input data well (It does. I tested it.) then a simple rotate-and-xor method is all that should be needed to maintain all of the

Re: [HACKERS] Hash support for arrays

2010-11-04 Thread Kenneth Marshall
On Thu, Nov 04, 2010 at 10:00:40AM +, Dean Rasheed wrote: On 3 November 2010 09:24, Nicolas Barbier nicolas.barb...@gmail.com wrote: 2010/11/2 Kenneth Marshall k...@rice.edu: Given that our hash implimentation mixes the input data well (It does. I tested it.) then a simple

Re: [HACKERS] Hash support for arrays

2010-11-03 Thread Nicolas Barbier
2010/11/2 Kenneth Marshall k...@rice.edu: Given that our hash implimentation mixes the input data well (It does. I tested it.) then a simple rotate-and-xor method is all that should be needed to maintain all of the needed information. The original hash function has done the heavy lifting in

Re: [HACKERS] Hash support for arrays

2010-11-03 Thread Kenneth Marshall
On Wed, Nov 03, 2010 at 10:24:16AM +0100, Nicolas Barbier wrote: 2010/11/2 Kenneth Marshall k...@rice.edu: Given that our hash implimentation mixes the input data well (It does. I tested it.) then a simple rotate-and-xor method is all that should be needed to maintain all of the needed

Re: [HACKERS] Hash support for arrays

2010-11-02 Thread Robert Haas
On Sat, Oct 30, 2010 at 10:01 AM, Tom Lane t...@sss.pgh.pa.us wrote: marcin mank marcin.m...@gmail.com writes: On Sat, Oct 30, 2010 at 6:21 PM, Tom Lane t...@sss.pgh.pa.us wrote: 3. To hash, apply the element type's hash function to each array element.  Combine these values by rotating the

Re: [HACKERS] Hash support for arrays

2010-11-02 Thread Tom Lane
Robert Haas robertmh...@gmail.com writes: On Sat, Oct 30, 2010 at 10:01 AM, Tom Lane t...@sss.pgh.pa.us wrote: marcin mank marcin.m...@gmail.com writes: This would make the hash the same for arrays with elements 32 apart swapped. Well, there are *always* going to be cases where you get the

Re: [HACKERS] Hash support for arrays

2010-11-02 Thread Alvaro Herrera
Excerpts from Tom Lane's message of mar nov 02 15:21:31 -0300 2010: What concerns me about that is that it tends to push the bits to the left --- I think the effects of the earlier inputs are going to overflow out as you incorporate a lot of newer inputs. What you want is a scheme where

Re: [HACKERS] Hash support for arrays

2010-11-02 Thread Robert Haas
On Tue, Nov 2, 2010 at 11:21 AM, Tom Lane t...@sss.pgh.pa.us wrote: Robert Haas robertmh...@gmail.com writes: On Sat, Oct 30, 2010 at 10:01 AM, Tom Lane t...@sss.pgh.pa.us wrote: marcin mank marcin.m...@gmail.com writes: This would make the hash the same for arrays with elements 32 apart

Re: [HACKERS] Hash support for arrays

2010-11-02 Thread Kevin Grittner
Robert Haas robertmh...@gmail.com wrote: Tom Lane t...@sss.pgh.pa.us wrote: Robert Haas robertmh...@gmail.com writes: The goal is to make those hard to predict, though. Really? I think I don't understand when this fails isn't obviously better than being able to predict when it fails ...

Re: [HACKERS] Hash support for arrays

2010-11-02 Thread Robert Haas
On Tue, Nov 2, 2010 at 12:34 PM, Kevin Grittner kevin.gritt...@wicourts.gov wrote: Robert Haas robertmh...@gmail.com wrote: Tom Lane t...@sss.pgh.pa.us wrote: Robert Haas robertmh...@gmail.com writes: The goal is to make those hard to predict, though. Really?  I think I don't understand

Re: [HACKERS] Hash support for arrays

2010-11-02 Thread Kevin Grittner
Robert Haas robertmh...@gmail.com wrote: There's no reason that the hash value of an integer of the same size as the hash shouldn't be the integer itself, for example. It's hard to get more predictable than that, yet it works well for hash lookups. Well, no, not really. For example, it

Re: [HACKERS] Hash support for arrays

2010-11-02 Thread Tom Lane
Robert Haas robertmh...@gmail.com writes: On Tue, Nov 2, 2010 at 11:21 AM, Tom Lane t...@sss.pgh.pa.us wrote: Really?  I think I don't understand when this fails isn't obviously better than being able to predict when it fails ... Isn't that the whole point of hash functions? Collisions are

Re: [HACKERS] Hash support for arrays

2010-11-02 Thread Kenneth Marshall
On Tue, Nov 02, 2010 at 04:42:19PM -0400, Tom Lane wrote: Robert Haas robertmh...@gmail.com writes: On Tue, Nov 2, 2010 at 11:21 AM, Tom Lane t...@sss.pgh.pa.us wrote: Really? ?I think I don't understand when this fails isn't obviously better than being able to predict when it fails ...

Re: [HACKERS] Hash support for arrays

2010-11-02 Thread Robert Haas
On Nov 2, 2010, at 1:42 PM, Tom Lane t...@sss.pgh.pa.us wrote: However, this is largely beside the point, because that theory, as well as the Java code you're arguing from, has to do with the initial hashing of a raw sequence of input items. Not with combining some existing hash values. The

Re: [HACKERS] Hash support for arrays

2010-11-02 Thread Tom Lane
Robert Haas robertmh...@gmail.com writes: On Nov 2, 2010, at 1:42 PM, Tom Lane t...@sss.pgh.pa.us wrote: However, this is largely beside the point, because that theory, as well as the Java code you're arguing from, has to do with the initial hashing of a raw sequence of input items. Not with

Re: [HACKERS] Hash support for arrays

2010-11-02 Thread Sam Mason
On Sat, Oct 30, 2010 at 01:01:44PM -0400, Tom Lane wrote: marcin mank marcin.m...@gmail.com writes: This is what boost does: http://www.systomath.com/include/Boost-1_34/doc/html/boost/hash_combine.html Hmm. I am reminded of Knuth's famous dictum: never generate random numbers with a

Re: [HACKERS] Hash support for arrays

2010-11-02 Thread Ron Mayer
Tom Lane wrote: It's possible that the multiply-by-31 method is as good as the rotate-and-xor method by that measure, or even better; but it's far from obvious that it's better. And I'm not convinced that the multiply method has a pedigree that should encourage me to take it on faith. Short

Re: [HACKERS] Hash support for arrays

2010-11-01 Thread hernan gonzalez
Hmm. I am reminded of Knuth's famous dictum: never generate random numbers with a method chosen at random. Is there any actual theory behind that algorithm, and if so what is it? The combination of shifting with addition (not xor) seems more likely to lead to weird cancellations than any

[HACKERS] Hash support for arrays

2010-10-30 Thread Tom Lane
I was reminded today that I'd promised to do $SUBJECT awhile back. It's worth having so that hash joins and hash aggregation can work on array values. I believe this is a fairly straightforward exercise: 1. Add a hash opclass that accepts ANYARRAY, similar to the existing btree opclass for

Re: [HACKERS] Hash support for arrays

2010-10-30 Thread marcin mank
On Sat, Oct 30, 2010 at 6:21 PM, Tom Lane t...@sss.pgh.pa.us wrote: 3. To hash, apply the element type's hash function to each array element.  Combine these values by rotating the accumulator left one bit at each step and xor'ing in the next element's hash value. Thoughts?  In particular, is

Re: [HACKERS] Hash support for arrays

2010-10-30 Thread Tom Lane
marcin mank marcin.m...@gmail.com writes: On Sat, Oct 30, 2010 at 6:21 PM, Tom Lane t...@sss.pgh.pa.us wrote: 3. To hash, apply the element type's hash function to each array element.  Combine these values by rotating the accumulator left one bit at each step and xor'ing in the next element's

Re: [HACKERS] Hash support for arrays

2010-10-30 Thread Greg Stark
On Sat, Oct 30, 2010 at 9:21 AM, Tom Lane t...@sss.pgh.pa.us wrote: Thoughts?  In particular, is anyone aware of a better method for combining the element hash values? The obvious thing to do seems like it would be to feed the individual values back into the regular hash function. -- greg

Re: [HACKERS] Hash support for arrays

2010-10-30 Thread Tom Lane
Greg Stark gsst...@mit.edu writes: On Sat, Oct 30, 2010 at 9:21 AM, Tom Lane t...@sss.pgh.pa.us wrote: Thoughts?  In particular, is anyone aware of a better method for combining the element hash values? The obvious thing to do seems like it would be to feed the individual values back into

Re: [HACKERS] Hash support for arrays

2010-10-30 Thread Greg Stark
On Sat, Oct 30, 2010 at 1:48 PM, Tom Lane t...@sss.pgh.pa.us wrote: Hmm, you mean use hash_any on the sequence of hash values?  Interesting idea; but hash_any doesn't look very amenable to being invoked incrementally, and I don't think I want to construct a temp array with enough space for the

Re: [HACKERS] Hash support for arrays

2010-10-30 Thread Tom Lane
Greg Stark gsst...@mit.edu writes: I suppose you could use crc where our interface does allow incremental use. Hm, that's a thought. I'm not sure though whether CRC really has the behavior you want from a hash function, in particular that all the bits are independent (a/k/a taking N low-order