Re: [HACKERS] Hash support for arrays

2010-11-04 Thread Kenneth Marshall
On Thu, Nov 04, 2010 at 10:00:40AM +, Dean Rasheed wrote: > On 3 November 2010 09:24, Nicolas Barbier wrote: > > 2010/11/2 Kenneth Marshall : > > > >> Given that our hash implimentation mixes the input data well (It does. > >> I tested it.) then a simple rotate-and-xor method is all that shoul

Re: [HACKERS] Hash support for arrays

2010-11-04 Thread Dean Rasheed
On 3 November 2010 09:24, Nicolas Barbier wrote: > 2010/11/2 Kenneth Marshall : > >> Given that our hash implimentation mixes the input data well (It does. >> I tested it.) then a simple rotate-and-xor method is all that should >> be needed to maintain all of the needed information. The original >

Re: [HACKERS] Hash support for arrays

2010-11-03 Thread Kenneth Marshall
On Wed, Nov 03, 2010 at 10:24:16AM +0100, Nicolas Barbier wrote: > 2010/11/2 Kenneth Marshall : > > > Given that our hash implimentation mixes the input data well (It does. > > I tested it.) then a simple rotate-and-xor method is all that should > > be needed to maintain all of the needed informat

Re: [HACKERS] Hash support for arrays

2010-11-03 Thread Nicolas Barbier
2010/11/2 Kenneth Marshall : > Given that our hash implimentation mixes the input data well (It does. > I tested it.) then a simple rotate-and-xor method is all that should > be needed to maintain all of the needed information. The original > hash function has done the heavy lifting in this case.

Re: [HACKERS] Hash support for arrays

2010-11-02 Thread Ron Mayer
Tom Lane wrote: > It's possible that the multiply-by-31 method is as good as the > rotate-and-xor method by that measure, or even better; but it's far from > obvious that it's better. And I'm not convinced that the multiply > method has a pedigree that should encourage me to take it on faith. Sho

Re: [HACKERS] Hash support for arrays

2010-11-02 Thread Sam Mason
On Sat, Oct 30, 2010 at 01:01:44PM -0400, Tom Lane wrote: > marcin mank writes: > > This is what boost does: > > http://www.systomath.com/include/Boost-1_34/doc/html/boost/hash_combine.html > > Hmm. I am reminded of Knuth's famous dictum: "never generate random > numbers with a method chosen at

Re: [HACKERS] Hash support for arrays

2010-11-02 Thread Tom Lane
Robert Haas writes: > On Nov 2, 2010, at 1:42 PM, Tom Lane wrote: >> However, this is largely beside the point, because that theory, as well >> as the Java code you're arguing from, has to do with the initial hashing >> of a raw sequence of input items. Not with combining some existing hash >> v

Re: [HACKERS] Hash support for arrays

2010-11-02 Thread Robert Haas
On Nov 2, 2010, at 1:42 PM, Tom Lane wrote: > However, this is largely beside the point, because that theory, as well > as the Java code you're arguing from, has to do with the initial hashing > of a raw sequence of input items. Not with combining some existing hash > values. The rotate-and-xor

Re: [HACKERS] Hash support for arrays

2010-11-02 Thread Kenneth Marshall
On Tue, Nov 02, 2010 at 04:42:19PM -0400, Tom Lane wrote: > Robert Haas writes: > > On Tue, Nov 2, 2010 at 11:21 AM, Tom Lane wrote: > >> Really? ?I think "I don't understand when this fails" isn't obviously > >> better than being able to predict when it fails ... > > > Isn't that the whole poin

Re: [HACKERS] Hash support for arrays

2010-11-02 Thread Tom Lane
Robert Haas writes: > On Tue, Nov 2, 2010 at 11:21 AM, Tom Lane wrote: >> Really?  I think "I don't understand when this fails" isn't obviously >> better than being able to predict when it fails ... > Isn't that the whole point of hash functions? Collisions are > inevitable, but you want them t

Re: [HACKERS] Hash support for arrays

2010-11-02 Thread Kevin Grittner
Robert Haas wrote: >> There's no reason that the hash value of an integer of the same >> size as the hash shouldn't be the integer itself, for example. >> It's hard to get more predictable than that, yet it works well >> for hash lookups. > > Well, no, not really. For example, it may be that

Re: [HACKERS] Hash support for arrays

2010-11-02 Thread Robert Haas
On Tue, Nov 2, 2010 at 12:34 PM, Kevin Grittner wrote: > Robert Haas wrote: >> Tom Lane wrote: >>> Robert Haas writes: > The goal is to make those hard to predict, though. >>> >>> Really?  I think "I don't understand when this fails" isn't >>> obviously better than being able to predict wh

Re: [HACKERS] Hash support for arrays

2010-11-02 Thread Kevin Grittner
Robert Haas wrote: > Tom Lane wrote: >> Robert Haas writes: >>> The goal is to make those hard to predict, though. >> >> Really? I think "I don't understand when this fails" isn't >> obviously better than being able to predict when it fails ... > > Isn't that the whole point of hash function

Re: [HACKERS] Hash support for arrays

2010-11-02 Thread Robert Haas
On Tue, Nov 2, 2010 at 11:21 AM, Tom Lane wrote: > Robert Haas writes: >> On Sat, Oct 30, 2010 at 10:01 AM, Tom Lane wrote: >>> marcin mank writes: This would make the hash the same for arrays with elements 32 apart swapped. >>> >>> Well, there are *always* going to be cases where yo

Re: [HACKERS] Hash support for arrays

2010-11-02 Thread Alvaro Herrera
Excerpts from Tom Lane's message of mar nov 02 15:21:31 -0300 2010: > What concerns me about that is that it tends to push the bits to the > left --- I think the effects of the earlier inputs are going to overflow > out as you incorporate a lot of newer inputs. What you want is a scheme > where e

Re: [HACKERS] Hash support for arrays

2010-11-02 Thread Tom Lane
Robert Haas writes: > On Sat, Oct 30, 2010 at 10:01 AM, Tom Lane wrote: >> marcin mank writes: >>> This would make the hash the same for arrays with elements 32 apart swapped. >> >> Well, there are *always* going to be cases where you get the same hash >> value for two different inputs; it's un

Re: [HACKERS] Hash support for arrays

2010-11-02 Thread Robert Haas
On Sat, Oct 30, 2010 at 10:01 AM, Tom Lane wrote: > marcin mank writes: >> On Sat, Oct 30, 2010 at 6:21 PM, Tom Lane wrote: >>> 3. To hash, apply the element type's hash function to each array >>> element.  Combine these values by rotating the accumulator left >>> one bit at each step and xor'in

Re: [HACKERS] Hash support for arrays

2010-11-01 Thread hernan gonzalez
>Hmm. I am reminded of Knuth's famous dictum: "never generate random numbers with a method chosen at random". Is there any actual theory behind that algorithm, and if so what is it? The combination of shifting with addition (not xor) seems more likely to lead to weird cancellations than any impr

Re: [HACKERS] Hash support for arrays

2010-10-30 Thread Tom Lane
Greg Stark writes: > I suppose you could use crc where our interface does allow incremental use. Hm, that's a thought. I'm not sure though whether CRC really has the behavior you want from a hash function, in particular that all the bits are independent (a/k/a taking N low-order bits is a reason

Re: [HACKERS] Hash support for arrays

2010-10-30 Thread Greg Stark
On Sat, Oct 30, 2010 at 1:48 PM, Tom Lane wrote: > Hmm, you mean use hash_any on the sequence of hash values?  Interesting > idea; but hash_any doesn't look very amenable to being invoked > incrementally, and I don't think I want to construct a temp array with > enough space for the hashes of all

Re: [HACKERS] Hash support for arrays

2010-10-30 Thread Tom Lane
Greg Stark writes: > On Sat, Oct 30, 2010 at 9:21 AM, Tom Lane wrote: >> Thoughts?  In particular, is anyone aware of a better method >> for combining the element hash values? > The obvious thing to do seems like it would be to feed the individual > values back into the regular hash function. H

Re: [HACKERS] Hash support for arrays

2010-10-30 Thread Greg Stark
On Sat, Oct 30, 2010 at 9:21 AM, Tom Lane wrote: > Thoughts?  In particular, is anyone aware of a better method > for combining the element hash values? > The obvious thing to do seems like it would be to feed the individual values back into the regular hash function. -- greg -- Sent via pgsq

Re: [HACKERS] Hash support for arrays

2010-10-30 Thread Tom Lane
marcin mank writes: > On Sat, Oct 30, 2010 at 6:21 PM, Tom Lane wrote: >> 3. To hash, apply the element type's hash function to each array >> element.  Combine these values by rotating the accumulator left >> one bit at each step and xor'ing in the next element's hash value. >> >> Thoughts?  In

Re: [HACKERS] Hash support for arrays

2010-10-30 Thread marcin mank
On Sat, Oct 30, 2010 at 6:21 PM, Tom Lane wrote: > 3. To hash, apply the element type's hash function to each array > element.  Combine these values by rotating the accumulator left > one bit at each step and xor'ing in the next element's hash value. > > Thoughts?  In particular, is anyone aware

[HACKERS] Hash support for arrays

2010-10-30 Thread Tom Lane
I was reminded today that I'd promised to do $SUBJECT awhile back. It's worth having so that hash joins and hash aggregation can work on array values. I believe this is a fairly straightforward exercise: 1. Add a hash opclass that accepts ANYARRAY, similar to the existing btree opclass for ANYARR