I found an approach that works: accumulate the hash in a bigint, masking it to 
32 bits on each step, then cast the result to bit(32) and finally to int4.

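declare
  h bigint := 0;  -- wide accumulator; an int4 here raises "integer out of range"
begin
  -- the loop variable i is declared implicitly by the for loop
  for i in 1..length(str) loop
    -- keep only the low 32 bits after each step, as Java's int arithmetic does
    h := (h * 31 + ascii(substring(str, i, 1))) & 4294967295;
  end loop;
  -- reinterpret the low 32 bits as a signed int4, so the result may be negative
  return cast(cast(h as bit(32)) as int4);
end;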

I tested it with inputs that produce both positive and negative results, and 
everything looks OK.
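
For anyone who wants to cross-check the same arithmetic on the Java side, here 
is a minimal sketch. The class name HashCheck and the method name maskedHash 
are made up for illustration; the method mirrors the function body above (a 
long accumulator masked to 32 bits, then narrowed to int) and is compared with 
the built-in String.hashCode().

```java
public class HashCheck {
    // Hypothetical helper mirroring the PL/pgSQL body above: accumulate in a
    // long, keep only the low 32 bits after each step, then narrow to int.
    static int maskedHash(String s) {
        long h = 0;
        for (int i = 0; i < s.length(); i++) {
            // 0xFFFFFFFFL is 4294967295, the same mask used in the SQL version
            h = (h * 31 + s.charAt(i)) & 0xFFFFFFFFL;
        }
        // Narrowing keeps the low 32 bits and reinterprets them as signed,
        // like cast(cast(h as bit(32)) as int4)
        return (int) h;
    }

    public static void main(String[] args) {
        String[] samples = {"", "a", "hello", "polygenelubricants"};
        for (String s : samples) {
            // Print both values side by side; they should be identical
            System.out.println("'" + s + "' -> " + maskedHash(s)
                    + " / " + s.hashCode());
        }
    }
}
```

One caveat, as far as I can tell: Java's charAt returns UTF-16 code units, 
while ascii() in a UTF8 database returns the Unicode code point, so the two 
sides should agree for characters in the Basic Multilingual Plane but may 
differ for characters that need surrogate pairs in Java.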

On Sep 21, 2012, at 11:21 PM, Craig James <cja...@emolecules.com> wrote:

> On Thu, Sep 20, 2012 at 7:56 PM, Haifeng Liu <liuhaif...@live.com> wrote:
> 
> On Sep 20, 2012, at 10:34 PM, Craig James <cja...@emolecules.com> wrote:
> 
>> 
>> 
>> On Thu, Sep 20, 2012 at 1:55 AM, Haifeng Liu <liuhaif...@live.com> wrote:
>> I want to write a hash function that behaves like String.hashCode() in Java: 
>> hash = hash * 31 + s.charAt(i)... but I get an "integer out of range" error. 
>> How can I avoid this? I see that Java does not care about int overflow; the 
>> result just wraps around and can become negative.
>> 
>> 
>> Use the bitwise AND operator to mask the hash value with 0x3FFFFFF before 
>> each iteration:
>> 
>>   hash = (hash & 67108863) * 31 + s.charAt(i);
>> 
>> Craig
> 
> Thank you, I believe your solution is OK for a hash function, but I am aiming 
> to create a hash function that is consistent with the one the applications 
> use. I know PostgreSQL 9.1 has a hash function called hashtext, but I don't 
> know what algorithm it uses, and I also see that relying on it is not 
> recommended. So I am trying to create a hash function that behaves exactly the 
> same as java.lang.String.hashCode(). The latter may generate negative hash 
> values. I guess that when the number overflows, the bits out of range are 
> discarded, and if the highest bit is 1, the hash value becomes negative.
> 
> You are probably doing something where you want the application and the 
> database to implement the exact same function, but if you stick to the Java 
> built-in function, you will only have control over one implementation of that 
> function.  What happens if someone working on Java changes how the Java 
> internals work?

That's not a problem: creating a hash tool that copies the code of 
java.lang.String.hashCode(), and using that tool instead, would resolve it. The 
key point is that I know the algorithm and can reimplement it.

> 
> A better solution would be to implement your own hash function in Postgres, 
> and then once you know exactly how it will work, re-implement it in Java with 
> your own code.  That's the only way you can ensure consistency between the 
> two.
> 
> Craig
