The property *planner.memory.max_query_memory_per_node* is a cumulative limit on the memory consumption of all the operators' minor fragments.

However, like Boaz pointed out, the truly memory-hungry operators, like the Hash, Sort and Scan operators, will take the lion's share of a query's memory. Since some operators that cannot spill to disk yet (e.g. the Hash Join) will continue to grab as much memory as they need, you can see the memory go up beyond the defined limit. So, with a 4GB limit, you should see the query constrained within that total on each node. It might be a bit higher because there are other overheads, like SCAN, but I doubt it would double!

Also, the 'direct.used' metric only shows the memory currently held by Netty, so it can be a tad misleading (unless you know for sure that each subsequent query needs more memory, in which case you'll see the value rise). Boaz might be able to point you to the 10GB limit in the code.
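In the meantime, if you want to sanity-check what the planner is actually working with, something along these lines from sqlline should do. Treat it as a rough sketch: the 4 GB value is just an example, and the exact columns returned by sys.options differ a bit between Drill versions.

-- What the planner may use per node for a query, and how wide it can go:
SELECT * FROM sys.options
WHERE name IN ('planner.memory.max_query_memory_per_node',
               'planner.width.max_per_node');

-- Direct memory as Drill itself reports it, per drillbit:
SELECT * FROM sys.memory;

-- Example only: set the per-node query budget to 4 GB cluster-wide.
ALTER SYSTEM SET `planner.memory.max_query_memory_per_node` = 4294967296;

There is also a rough back-of-the-envelope of how that budget gets split across operator instances at the very bottom of this mail, below the quoted thread.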
-----Original Message-----
From: Muhammad Gelbana [mailto:[email protected]]
Sent: Tuesday, August 15, 2017 7:43 AM
To: [email protected]
Subject: Re: direct.used on the Metrics page exceeded planner.memory.max_query_memory_per_node while running a single query

By "instance", you mean minor fragments, correct ? And does the
*planner.memory.max_query_memory_per_node* limit apply to each *type* of
minor fragments individually ? Assuming the memory limit is set to *4 GB*,
and the running query involves external sort and hash aggregates, should I
expect the query to consume at least *8 GB* ?

Would you please point out to me where in the code I can look into the
implementation of this 10GB memory limit ?

Thanks,
Gelbana

On Tue, Aug 15, 2017 at 2:40 AM, Boaz Ben-Zvi <[email protected]> wrote:

> There is this page:
> https://drill.apache.org/docs/sort-based-and-hash-based-memory-constrained-operators/
> But it seems out of date (correct for 1.10). It does not explain about
> the hash operators, except that they run till they can not allocate
> any more memory. This will happen (undocumented) at either 10GB per
> instance, or when there is no more memory at the node.
>
> Boaz
>
> On 8/14/17, 1:16 PM, "Muhammad Gelbana" <[email protected]> wrote:
>
> I'm not sure which version I was using when that happened. But that's
> some precise details you've mentioned! Is this mentioned somewhere in
> the docs ?
>
> Thanks a lot.
>
> On Aug 14, 2017 9:00 PM, "Boaz Ben-Zvi" <[email protected]> wrote:
>
> > Did your query include a hash join ?
> >
> > As of 1.11, only the External Sort and Hash Aggregate operators obey
> > the memory limit (that is, the “max query memory per node” figure is
> > divided among all the instances of these operators).
> > The Hash Join (as was before 1.11) still does not take part in this
> > memory allocation scheme, and each instance may use up to 10GB.
> >
> > Also in 1.11, the Hash Aggregate may “fall back” to the 1.10 behavior
> > (same as the Hash Join; i.e. up to 10GB) in case there is too little
> > memory per an instance (because it cannot perform memory spilling,
> > which requires some minimal memory to hold multiple batches).
> >
> > Thanks,
> >
> > Boaz
> >
> > On 8/11/17, 4:25 PM, "Muhammad Gelbana" <[email protected]> wrote:
> >
> > Sorry for the long subject !
> >
> > I'm running a single query on a single node Drill setup.
> >
> > I assumed that setting the *planner.memory.max_query_memory_per_node*
> > property controls the max amount of memory (in bytes) for each query
> > running on a single node. Which means that in my setup, the
> > *direct.used* metric in the metrics page should never exceed that
> > value in my case.
> >
> > But it did and drastically. I assigned *34359738368* (32 GB) to the
> > *planner.memory.max_query_memory_per_node* option but while monitoring
> > the *direct.used* metric, I found that it reached *51640484458* (~48 GB).
> >
> > What did I mistakenly do\interpret ?
> >
> > Thanks,
> > Gelbana
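And to put Boaz's description above into very rough numbers (this is just my back-of-the-envelope reading of the 1.11 behaviour, with made-up operator counts, not something lifted from the planner code): only the buffered operators, i.e. the External Sort and the Hash Aggregate, share the per-node budget, while each Hash Join instance can grab up to 10GB on top of it.

-- Hypothetical plan: parallelism (width) of 8, one sort and one hash agg
-- sharing a 4 GB per-node budget, plus one hash join outside the budget.
SELECT 4294967296                 AS query_budget_bytes,          -- 4 GB per node
       8 * (1 + 1)                AS buffered_op_instances,       -- 8 sorts + 8 hash aggs
       4294967296 / (8 * (1 + 1)) AS approx_bytes_per_instance,   -- ~256 MB each
       8 * 10737418240            AS hash_join_worst_case_bytes   -- 8 joins x 10 GB
FROM (VALUES(1));

So even with max_query_memory_per_node set conservatively, a plan containing hash joins can legitimately push direct memory well past that figure, which matches what you saw on the Metrics page.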
