[
https://issues.apache.org/jira/browse/HIVE-16592?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
]
Gopal V updated HIVE-16592:
---------------------------
Description:
{code}
public static int calculateLongHashCode(long key) {
  key = (~key) + (key << 21); // key = (key << 21) - key - 1;
  key = key ^ (key >>> 24);
  key = (key + (key << 3)) + (key << 8); // key * 265
  key = key ^ (key >>> 14);
  key = (key + (key << 2)) + (key << 4); // key * 21
  key = key ^ (key >>> 28);
  key = key + (key << 31);
  return (int) key;
}
{code}
This is hash64shift. It does not mix enough bits into the lower 32 bits, which
are the only bits that survive the (int) cast and are the ones used for the
bucket probes.
was:
{code}
public static int calculateLongHashCode(long key) {
  key = (~key) + (key << 21); // key = (key << 21) - key - 1;
  key = key ^ (key >>> 24);
  key = (key + (key << 3)) + (key << 8); // key * 265
  key = key ^ (key >>> 14);
  key = (key + (key << 2)) + (key << 4); // key * 21
  key = key ^ (key >>> 28);
  key = key + (key << 31);
  return (int) key;
}
{code}
This is hash64shift. It does not mix enough bits into the lower 32 bits, which
are the only bits that survive the (int) cast and are the ones used for the
bucket probes.
The 1997 document lists
{code}
public int hash6432shift(long key)
{
  key = (~key) + (key << 18); // key = (key << 18) - key - 1;
  key = key ^ (key >>> 31);
  key = key * 21; // key = (key + (key << 2)) + (key << 4);
  key = key ^ (key >>> 11);
  key = key + (key << 6);
  key = key ^ (key >>> 22);
  return (int) key;
}
{code}
as the algorithm for keeping the lower 32 bits well distributed.
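
A minimal sketch, not Hive code, for comparing how the two functions quoted above spread keys over buckets. The class name LongHashSketch, the helper distinctBuckets, the key pattern (keys that differ only in their upper 32 bits) and the bucket count of 1024 are all assumptions made for illustration. It also assumes a probe selects a bucket by masking the int hash with a power-of-two size minus one. No expected numbers are given; run it to see the counts.

```java
import java.util.BitSet;
import java.util.function.LongToIntFunction;

final class LongHashSketch {

  // 64-bit mix truncated to int, copied from calculateLongHashCode in this issue.
  static int hash64shift(long key) {
    key = (~key) + (key << 21);
    key = key ^ (key >>> 24);
    key = (key + (key << 3)) + (key << 8); // key * 265
    key = key ^ (key >>> 14);
    key = (key + (key << 2)) + (key << 4); // key * 21
    key = key ^ (key >>> 28);
    key = key + (key << 31);
    return (int) key;
  }

  // 64-bit to 32-bit variant, copied from hash6432shift in this issue.
  static int hash6432shift(long key) {
    key = (~key) + (key << 18);
    key = key ^ (key >>> 31);
    key = key * 21;
    key = key ^ (key >>> 11);
    key = key + (key << 6);
    key = key ^ (key >>> 22);
    return (int) key;
  }

  // Hypothetical helper: counts how many distinct buckets the keys land in.
  // Assumes bucketCount is a power of two so the low bits select the bucket.
  static int distinctBuckets(LongToIntFunction hash, long[] keys, int bucketCount) {
    int mask = bucketCount - 1;
    BitSet used = new BitSet(bucketCount);
    for (long key : keys) {
      used.set(hash.applyAsInt(key) & mask);
    }
    return used.cardinality();
  }

  public static void main(String[] args) {
    // Assumed key pattern: values that differ only in the upper 32 bits.
    long[] keys = new long[1024];
    for (int i = 0; i < keys.length; i++) {
      keys[i] = ((long) i) << 32;
    }
    int buckets = 1024;
    System.out.println("hash64shift   distinct buckets: "
        + distinctBuckets(LongHashSketch::hash64shift, keys, buckets));
    System.out.println("hash6432shift distinct buckets: "
        + distinctBuckets(LongHashSketch::hash6432shift, keys, buckets));
  }
}
```

A higher count means fewer keys share a bucket for that key pattern. Other key patterns and bucket counts may rank the two functions differently, so this is only a starting point for measuring the effect described in this issue.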
> Vectorization: Long hashes use hash64shift and not hash6432shift to generate
> int hashCodes
> ------------------------------------------------------------------------------------------
>
> Key: HIVE-16592
> URL: https://issues.apache.org/jira/browse/HIVE-16592
> Project: Hive
> Issue Type: Bug
> Reporter: Gopal V
> Priority: Minor
>
> {code}
> public static int calculateLongHashCode(long key) {
>   key = (~key) + (key << 21); // key = (key << 21) - key - 1;
>   key = key ^ (key >>> 24);
>   key = (key + (key << 3)) + (key << 8); // key * 265
>   key = key ^ (key >>> 14);
>   key = (key + (key << 2)) + (key << 4); // key * 21
>   key = key ^ (key >>> 28);
>   key = key + (key << 31);
>   return (int) key;
> }
> {code}
> This is hash64shift. It does not mix enough bits into the lower 32 bits,
> which are the only bits that survive the (int) cast and are the ones used
> for the bucket probes.