That would be good. If they eventually run successfully, a query profile would also be welcome.
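In the meantime, here is a rough sketch of the memory-based admission decision Jeszy described below, to make the failure mode concrete. This is a simplified illustrative model, not Impala's actual implementation; the function names and pool numbers are made up:

```python
# Simplified model of pool admission control. The key point: a query is
# charged against the pool using its *estimated* memory (unless the client
# sets mem_limit), so a bad cardinality estimate can queue a query even
# when actual cluster memory usage is low.

def mem_to_admit(estimated_mem, mem_limit=None):
    """Memory charged against the pool: mem_limit wins if set."""
    return mem_limit if mem_limit is not None else estimated_mem

def should_admit(query, pool):
    """Admit only if both the query-count and memory limits hold."""
    if pool["running_queries"] >= pool["max_queries"]:
        return False, "queued: too many running queries"
    needed = mem_to_admit(query["estimated_mem"], query.get("mem_limit"))
    if pool["admitted_mem"] + needed > pool["max_mem"]:
        return False, "queued: estimated memory would exceed pool limit"
    return True, "admitted"

GB = 1024 ** 3
pool = {"max_queries": 30, "running_queries": 5,
        "max_mem": 700 * GB, "admitted_mem": 650 * GB}

# Cardinality estimate is way off: the query is charged 80GB it won't use,
# so it queues even though only 5 of 30 query slots are taken.
print(should_admit({"estimated_mem": 80 * GB}, pool))

# Setting mem_limit overrides the estimate and lets the query in.
print(should_admit({"estimated_mem": 80 * GB, "mem_limit": 10 * GB}, pool))
```

This is why queries can sit in CREATED while the monitored usage looks healthy: the pool accounting is done on estimates, not on observed usage.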
Thanks

On Tue, Jan 31, 2017 at 4:28 PM, William Cox <[email protected]> wrote:

> Jeszy,
>
> Thanks for the suggestion. We also have a 25GB per-query limit set up.
> Queries that estimate a large size are rejected with an error stating they
> exceeded the memory limit. The queries I'm having trouble with are ones
> that have no such error but simply wait in the CREATED state. Next time it
> happens I'll see if I can grab the memory estimates and check.
> Thanks.
> -William
>
> On Tue, Jan 31, 2017 at 7:08 AM, Jeszy <[email protected]> wrote:
>
>> Hey William,
>>
>> IIUC you have configured both a memory-based upper bound and a number-of-queries
>> upper bound for the default pool. A query can get queued if it would exceed
>> either of these limits. If you're not hitting the number-of-queries one,
>> then it's probably memory, which can happen even when memory is not fully
>> utilized: unless you specify a mem_limit for the query, the estimated memory
>> requirement is used to decide whether the query should be admitted.
>> This can get out of hand when the cardinality estimation is off, either due
>> to a very complex query or because of missing or stale stats.
>>
>> This is about memory-based admission control exclusively, but I think it
>> will be helpful:
>> http://www.cloudera.com/documentation/enterprise/latest/topics/impala_admission.html#admission_memory
>>
>> HTH
>>
>> On Mon, Jan 30, 2017 at 8:31 PM, William Cox <[email protected]> wrote:
>>
>>> I'm running CDH 5.8.0-1 and Impala version 2.6.0-cdh5.8.0 RELEASE
>>> (build 8d8652f69461f0dd8d5f474573fb5de7ceb0ee6b). We have enabled
>>> resource management and allocated ~700GB of memory with 30 running queries
>>> for the default pool. Our background data jobs are Unlimited.
>>> In spite of this setup, we still encounter times where queries will be
>>> marked as CREATED and waiting for allocation when the number of running
>>> queries is well below 30 and the amount of used memory, as listed in the
>>> CDH UI, is well below 700GB.
>>>
>>> This is seemingly unpredictable. We've created extensive monitors to
>>> track the number of running queries and memory usage, but there seems to be
>>> no pattern to why or when these queries won't be submitted to the cluster.
>>>
>>> Is there some key metric that I might be missing, or do folks have any
>>> suggestions for tracking down these queries that won't be submitted?
>>> Thanks.
>>> -William
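When grabbing the memory estimates to check, the EXPLAIN output for a query includes an estimated per-host memory figure that can be compared against the pool's limit. A small sketch of that check follows; the exact line format in the example string is an assumption based on Impala 2.x EXPLAIN output and should be verified against your version:

```python
import re

# Parse a per-host memory estimate out of EXPLAIN output text and convert
# it to bytes so it can be compared against the pool limit. The line
# format ("Memory=<n><unit>") is assumed, not guaranteed.

UNITS = {"B": 1, "KB": 1024, "MB": 1024 ** 2, "GB": 1024 ** 3}

def parse_estimate(explain_text):
    """Return the estimated memory in bytes, or None if no match."""
    m = re.search(r"Memory=([\d.]+)(B|KB|MB|GB)", explain_text)
    if not m:
        return None
    return int(float(m.group(1)) * UNITS[m.group(2)])

# Hypothetical EXPLAIN header line for illustration:
header = "Estimated Per-Host Requirements: Memory=42.00GB VCores=2"
est = parse_estimate(header)
pool_limit = 700 * 1024 ** 3

print(est)               # estimate in bytes
print(est > pool_limit)  # would this estimate alone exceed the pool?
```

Logging this value alongside the monitored "actual" memory usage should show whether the queued queries carry outsized estimates.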
