Right, the HashJoin and HashTable code hasn't changed significantly in terms of memory allocation in the last several releases. You might want to look at the change history for the underlying vector allocations; I recall that variable-length vector allocations went through some changes. However, DRILL-1162 does not seem to use varchar columns (I think).

On Fri, Sep 25, 2015 at 6:51 AM, Jacques Nadeau <[email protected]> wrote:

> I don't think anyone has done much there in quite some time. I'd guess
> something external has changed that affects it. The last substantive change
> around that code (I think) was the introduction of the multiplexing work
> that Venki and Yuliya did early this year.
> On Sep 25, 2015 6:32 AM, "Chris Westin" <[email protected]> wrote:
>
> > I've been looking into DRILL-1162, and found that a query that used to run
> > within certain constraints (DRILL_MAX_DIRECT_MEMORY=32G) no longer does
> > even though it looks like there should be plenty of memory. I took the
> > query in that report, and removed the last ten (redundant) join elements,
> > and it now fails with 32G direct memory, even though it previously ran
> > (although it produced the wrong results). When I check the query profile,
> > it only consumed around ~9G -- so there should be plenty of space left
> > before it fails. I started looking at it in the debugger, and the
> > allocation failure occurs during an attempt to resize the output vector.
> > The allocator being used believes there's no memory left, even though its
> > parent has more than enough to satisfy the request.
> >
> > I've also found another ticket with a HashJoin that fails in a similar way
> > even though there is plenty of memory.
> >
> > Has the execution of HashJoin or its use of its result vector changed in
> > some way recently?
> >
>
