I've been looking into DRILL-1162 and found that a query that used to run within a given memory limit (DRILL_MAX_DIRECT_MEMORY=32G) no longer does, even though it looks like there should be plenty of memory. I took the query from that report and removed the last ten (redundant) join elements; it now fails with 32G of direct memory, even though it previously ran (although it produced the wrong results). When I check the query profile, it only consumed about 9G, so there should be plenty of space left before it fails. I started looking at it in the debugger, and the allocation failure occurs during an attempt to resize the output vector. The allocator being used believes there's no memory left, even though its parent has more than enough to satisfy the request.
I've also found another ticket with a HashJoin that fails in a similar way even though there is plenty of memory. Has the execution of HashJoin, or its use of its result vector, changed in some way recently?
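
To make the symptom concrete, here is a minimal sketch of how a child allocator can refuse a request while its parent still has headroom. This is not Drill's allocator code: the `Allocator` class, the 8 GiB child cap, and the names `root` and `fragment` are all hypothetical. It only assumes that each allocator in a hierarchy enforces its own limit before consulting its parent, which may or may not be what is happening here.

```python
GIB = 1 << 30


class OutOfMemory(Exception):
    """Raised when an allocator cannot satisfy a request."""


class Allocator:
    """Hypothetical hierarchical allocator (illustration only, not Drill's)."""

    def __init__(self, name, limit, parent=None):
        self.name = name
        self.limit = limit      # per-allocator cap in bytes
        self.parent = parent    # None for the root allocator
        self.used = 0

    def allocate(self, size):
        # The local cap is checked first, so a child can fail here
        # without the parent ever being asked.
        if self.used + size > self.limit:
            raise OutOfMemory(f"{self.name}: limit {self.limit} exceeded")
        if self.parent is not None:
            self.parent.allocate(size)  # may raise; local state is untouched
        self.used += size


root = Allocator("root", 32 * GIB)
child = Allocator("fragment", 8 * GIB, parent=root)  # hypothetical 8 GiB cap

child.allocate(8 * GIB)  # fills the child's own budget

try:
    child.allocate(1 * GIB)  # e.g. resizing an output vector
    failed = False
except OutOfMemory:
    failed = True

headroom = root.limit - root.used  # memory the parent could still hand out
```

In this sketch the second request fails even though the root could still satisfy it, which matches what I see in the debugger. If something like a per-operator or per-fragment cap was introduced or changed recently, that would be one place to look.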
