Joe McDonnell created IMPALA-14701:
--------------------------------------

             Summary: Use non-power-of-two sizing for large hash joins
                 Key: IMPALA-14701
                 URL: https://issues.apache.org/jira/browse/IMPALA-14701
             Project: IMPALA
          Issue Type: Task
          Components: Backend
    Affects Versions: Impala 5.0.0
            Reporter: Joe McDonnell


The hash tables for hash joins use a power of two for the number of buckets. 
This allows a simple bit-mask to derive the bucket from the lower bits of the 
hash value. For larger hash tables, the cost of hash table operations is 
dominated by memory accesses rather than actual CPU cycles. Being able to use 
a bucket count that is not a power of two would reduce memory usage and allow 
some workloads to fit in cache that would otherwise exceed it. There are a few 
different ways to efficiently map the hash to a bucket:
 * The "fast modulo" (which is not really a modulo) from 
[https://lemire.me/blog/2016/06/27/a-fast-alternative-to-the-modulo-reduction/] 
is most influenced by the high bits of the hash.
 * Compilers use strength reduction to convert a modulus by a constant into 
multiplication and shift. [https://www.vldb.org/pvldb/vol12/p502-lang.pdf] 
refers to this as "magic modulo". This should behave like a regular modulo 
and use the low bits.
 * Either way, the mapping needs to ignore the top bits that we use for 
partitioning, as those bits are constant within each partition and provide 
no entropy.
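As a sketch, the fast-modulo option could look like the following. The 
kPartitionBits constant is hypothetical, standing in for however many high 
bits the join has already consumed for partitioning:

```cpp
#include <cstdint>

// Hypothetical: number of high bits already consumed for partitioning.
// These bits are constant within a partition, so they carry no entropy.
constexpr int kPartitionBits = 4;

// Lemire's "fast modulo": maps a 32-bit hash into [0, num_buckets)
// with one multiply and one shift, no division.
inline uint32_t FastReduce(uint32_t hash, uint32_t num_buckets) {
  // Shift away the partition bits so the reduction sees only the
  // bits that vary within this partition.
  uint32_t h = hash << kPartitionBits;
  return static_cast<uint32_t>(
      static_cast<uint64_t>(h) * num_buckets >> 32);
}
```

Because h < 2^32, the result is always strictly less than num_buckets, for 
any bucket count.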

Hash joins know the number of tuples before building the hash table, so the 
bucket count can be computed once up front. It doesn't need to handle resizes.
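For illustration, the one-time sizing computation might look like this 
(kFillFactor is an invented constant, not Impala's actual load factor):

```cpp
#include <cstdint>

// Hypothetical target fill factor; Impala's real constant may differ.
constexpr double kFillFactor = 0.75;

// Choose the bucket count once from the known build-side row count.
// With the power-of-two constraint dropped, any count is legal, so the
// table can be sized to roughly what the rows actually need.
inline uint64_t ChooseNumBuckets(uint64_t num_tuples) {
  return static_cast<uint64_t>(num_tuples / kFillFactor) + 1;
}
```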

A few complications:
 * The memory for the hash table comes from a Suballocator, which only supports 
power-of-two sizes. To support other sizes, we would need a different way to 
get memory. The Suballocator adds little value here anyway: this would only be 
used for large hash tables where the bucket/hash arrays exceed the page size, 
and it also offers little benefit when the hash table is never resized.
 * Quadratic probing relies on power-of-two sizing to guarantee that it can 
reach every bucket. It can also work with prime sizes, but we may not want to 
restrict ourselves that way. For very large hash tables, linear probing should 
be fine.
 * Codegen happens before the size of the hash table is known, so we would 
need additional codegen functions specialized for large non-power-of-two hash 
tables.
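A minimal sketch of linear probing over an arbitrary bucket count, using the 
Lemire-style reduction discussed above (invented names, not Impala's 
HashTable; assumes the bucket count fits in 32 bits and the table is never 
completely full):

```cpp
#include <cstdint>
#include <optional>
#include <vector>

// Illustrative linear-probing table with a non-power-of-two bucket count.
class ProbeSketch {
 public:
  explicit ProbeSketch(size_t num_buckets) : buckets_(num_buckets) {}

  void Insert(uint64_t hash, int value) {
    size_t i = Reduce(hash);
    // Linear probing reaches every bucket regardless of table size,
    // unlike quadratic probing, which needs power-of-two or prime sizes.
    while (buckets_[i].occupied) i = (i + 1) % buckets_.size();
    buckets_[i] = {true, hash, value};
  }

  std::optional<int> Find(uint64_t hash) const {
    size_t i = Reduce(hash);
    while (buckets_[i].occupied) {
      if (buckets_[i].hash == hash) return buckets_[i].value;
      i = (i + 1) % buckets_.size();
    }
    return std::nullopt;  // Hit an empty bucket: key is absent.
  }

 private:
  struct Bucket { bool occupied = false; uint64_t hash = 0; int value = 0; };

  // Lemire-style reduction of the low 32 hash bits into [0, num_buckets).
  size_t Reduce(uint64_t hash) const {
    return static_cast<size_t>(
        static_cast<uint64_t>(static_cast<uint32_t>(hash)) *
        buckets_.size() >> 32);
  }

  std::vector<Bucket> buckets_;
};
```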



--
This message was sent by Atlassian Jira
(v8.20.10#820010)
