I've been looking into DRILL-1162, and found that a query that used to run
within certain constraints (DRILL_MAX_DIRECT_MEMORY=32G) no longer does
even though it looks like there should be plenty of memory. I took the
query in that report, and removed the last ten (redundant) join elements,
and it now fails with 32G direct memory, even though it previously ran
(although it produced the wrong results). When I check the query profile,
it only consumed around 9G -- so there should be plenty of space left
before it fails. I started looking at it in the debugger, and the
allocation failure occurs during an attempt to resize the output vector.
The allocator being used believes there's no memory left, even though its
parent has more than enough to satisfy the request.
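
To make the symptom concrete, here is a toy sketch of how a child allocator
can refuse a request while its parent still has room. This is not Drill's
allocator code: the class ToyAllocator, its per-allocator limit, and the
sizes used are all assumptions for illustration. I'm only guessing that
something like a local limit is involved.

```java
// Toy model only: NOT Drill's allocator. All names and sizes are hypothetical.
public class ToyAllocator {
    private final ToyAllocator parent; // null for the root allocator
    private final long limit;          // max bytes this allocator may hand out
    private long allocated;            // bytes currently charged to this allocator

    public ToyAllocator(ToyAllocator parent, long limit) {
        this.parent = parent;
        this.limit = limit;
    }

    // Returns true if the request fits under this allocator's own limit
    // and under every ancestor's limit; otherwise nothing is charged.
    public boolean allocate(long bytes) {
        if (allocated + bytes > limit) {
            return false; // local limit hit; the parent is never consulted
        }
        if (parent != null && !parent.allocate(bytes)) {
            return false; // an ancestor is out of room
        }
        allocated += bytes;
        return true;
    }

    public long available() {
        return limit - allocated;
    }

    public static void main(String[] args) {
        final long GB = 1L << 30;
        ToyAllocator root = new ToyAllocator(null, 32 * GB);
        ToyAllocator operator = new ToyAllocator(root, 2 * GB);

        // The operator holds a 1 GB output vector.
        boolean first = operator.allocate(1 * GB);
        // Resizing needs a new 2 GB buffer while the old one is still held.
        boolean resize = operator.allocate(2 * GB);

        System.out.println("first allocation ok: " + first);
        System.out.println("resize ok: " + resize);
        System.out.println("root bytes available: " + root.available());
    }
}
```

In this sketch the resize fails purely because of the child's own limit,
while the root still reports tens of gigabytes free, which matches what I
see in the debugger.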

I've also found another ticket with a HashJoin that fails in a similar way
even though there is plenty of memory.

Has the execution of HashJoin or its use of its result vector changed in
some way recently?
