I guess that's why someone recommended specifying a low limit, and then setting a higher one for each query before execution using the *ALTER SESSION* command. I can't remember where I read that.
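If I understood it correctly, that would look something like this (the byte values below are just examples, not recommendations):

    -- Low system-wide default (2 GB), assuming admin access
    ALTER SYSTEM SET `planner.memory.max_query_memory_per_node` = 2147483648;

    -- Raise it for this session just before a memory-hungry query (8 GB)
    ALTER SESSION SET `planner.memory.max_query_memory_per_node` = 8589934592;

    -- ... run the heavy query ...

    -- Drop back to the system default afterwards
    ALTER SESSION RESET `planner.memory.max_query_memory_per_node`;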
Thanks,
Gelbana

On Tue, Aug 15, 2017 at 7:49 PM, Kunal Khatua <[email protected]> wrote:

> The property *planner.memory.max_query_memory_per_node* is a cumulative
> limit on the memory consumption of all the operators' minor fragments.
>
> However, as Boaz pointed out, the truly memory-hungry operators, such as
> the Hash, Sort and Scan operators, will take the lion's share of a query's
> memory. Since some operators cannot yet spill to disk (e.g. Hash Join) and
> will continue to grab as much memory as they need, you can see the memory
> go beyond the defined limit.
>
> So, with a 4GB limit, you should see the query constrained within that
> total for each node. It might be a bit higher because there are other
> overheads, like SCAN, but I doubt it would double!
>
> Also, the property 'direct.used' only shows the memory currently held by
> Netty, so it could be a tad misleading (unless you know for sure that each
> subsequent query needs more memory, in which case you'll see the value
> rise).
>
> Boaz might be able to point you to the 10GB limit.
>
> -----Original Message-----
> From: Muhammad Gelbana [mailto:[email protected]]
> Sent: Tuesday, August 15, 2017 7:43 AM
> To: [email protected]
> Subject: Re: direct.used on the Metrics page exceeded
> planner.memory.max_query_memory_per_node while running a single query
>
> By "instance", you mean minor fragments, correct? And does the
> *planner.memory.max_query_memory_per_node* limit apply to each *type* of
> minor fragment individually? Assuming the memory limit is set to *4 GB*,
> and the running query involves external sorts and hash aggregates, should
> I expect the query to consume at least *8 GB*?
>
> Would you please point out where in the code I can look into the
> implementation of this 10GB memory limit?
>
> Thanks,
> Gelbana
>
> On Tue, Aug 15, 2017 at 2:40 AM, Boaz Ben-Zvi <[email protected]> wrote:
>
> > There is this page:
> > https://drill.apache.org/docs/sort-based-and-hash-based-memory-constrained-operators/
> > But it seems out of date (correct for 1.10). It does not explain the
> > hash operators, except that they run until they cannot allocate any
> > more memory. This happens (undocumented) at either 10GB per instance,
> > or when there is no more memory on the node.
> >
> > Boaz
> >
> > On 8/14/17, 1:16 PM, "Muhammad Gelbana" <[email protected]> wrote:
> >
> > I'm not sure which version I was using when that happened. But those
> > are some precise details you've mentioned! Is this mentioned somewhere
> > in the docs?
> >
> > Thanks a lot.
> >
> > On Aug 14, 2017 9:00 PM, "Boaz Ben-Zvi" <[email protected]> wrote:
> >
> > > Did your query include a hash join?
> > >
> > > As of 1.11, only the External Sort and Hash Aggregate operators obey
> > > the memory limit (that is, the “max query memory per node” figure is
> > > divided among all the instances of these operators).
> > > The Hash Join (as was the case before 1.11) still does not take part
> > > in this memory allocation scheme, and each instance may use up to
> > > 10GB.
> > >
> > > Also in 1.11, the Hash Aggregate may “fall back” to the 1.10 behavior
> > > (same as the Hash Join, i.e. up to 10GB) in case there is too little
> > > memory per instance (because it cannot perform memory spilling, which
> > > requires some minimal memory to hold multiple batches).
> > >
> > > Thanks,
> > >
> > > Boaz
> > >
> > > On 8/11/17, 4:25 PM, "Muhammad Gelbana" <[email protected]> wrote:
> > >
> > > Sorry for the long subject!
> > >
> > > I'm running a single query on a single-node Drill setup.
> > >
> > > I assumed that setting the *planner.memory.max_query_memory_per_node*
> > > property controls the max amount of memory (in bytes) for each query
> > > running on a single node, which means that in my setup, the
> > > *direct.used* metric on the metrics page should never exceed that
> > > value.
> > >
> > > But it did, and drastically. I assigned *34359738368* (32 GB) to the
> > > *planner.memory.max_query_memory_per_node* option, but while
> > > monitoring the *direct.used* metric, I found that it reached
> > > *51640484458* (~48 GB).
> > >
> > > What did I mistakenly do\interpret?
> > >
> > > Thanks,
> > > Gelbana
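
In case it helps anyone trying to reproduce this, the option value and each drillbit's direct memory can also be checked from SQL through Drill's system tables (assuming a recent 1.x release; the exact columns vary by version):

    -- Current setting of the per-node query memory limit
    SELECT * FROM sys.options
    WHERE name = 'planner.memory.max_query_memory_per_node';

    -- Direct memory reported by each drillbit
    SELECT * FROM sys.memory;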
