Are you submitting all queries to the same coordinator? If so, you might have to increase the --fe_service_threads startup flag on that impalad to allow more concurrent client connections. That said, a single coordinator will eventually become a bottleneck anyway, so we recommend spreading queries across different impalads.
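For what it's worth, spreading load across coordinators can be as simple as round-robining the JDBC URL per connection on the client side. Here is a minimal sketch (the hostnames, class name, and port 21050 are placeholder assumptions, not anything from your cluster; the exact JDBC URL format depends on which driver you use):

```java
import java.util.List;
import java.util.concurrent.atomic.AtomicInteger;

// Hypothetical round-robin selector for coordinator JDBC URLs.
// Hostnames are placeholders; substitute your own impalad nodes.
public class CoordinatorPicker {
    private final List<String> hosts;
    private final AtomicInteger next = new AtomicInteger();

    public CoordinatorPicker(List<String> hosts) {
        this.hosts = hosts;
    }

    // Returns a JDBC URL for the next coordinator in round-robin order.
    public String nextUrl() {
        String host = hosts.get(Math.floorMod(next.getAndIncrement(), hosts.size()));
        return "jdbc:impala://" + host + ":21050";
    }

    public static void main(String[] args) {
        CoordinatorPicker picker = new CoordinatorPicker(
                List.of("worker1", "worker2", "worker3"));
        // Each new connection would use the next coordinator in the cycle.
        for (int i = 0; i < 4; i++) {
            System.out.println(picker.nextUrl());
        }
    }
}
```

A load balancer (e.g. haproxy) in front of the impalads achieves the same effect without client changes.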
On Fri, Sep 1, 2017 at 9:41 AM, Tim Armstrong <[email protected]> wrote:

> Hi Alexander,
> It's hard to know based on the information available. Query profiles
> often provide some clues here. I agree Impala should be able to max out one
> of the resources in most circumstances.
>
> On Impala 2.8 and earlier we saw behaviour similar to what you described
> when running queries with selective scans on machines with many cores:
> https://issues.apache.org/jira/browse/IMPALA-4923 . The bottleneck there
> was lock contention during memory allocation - the threads spent a lot of
> time asleep waiting to get a shared lock.
>
> On Fri, Sep 1, 2017 at 8:36 AM, Alexander Shoshin <[email protected]> wrote:
>
>> Hi,
>>
>> I am working with Impala, trying to find its maximum throughput on my
>> hardware. I have a cluster under Cloudera Manager which consists of 7
>> machines (1 master node + 6 worker nodes).
>>
>> I am running queries on Impala using JDBC. I've reached a maximum
>> throughput of 80 finished queries per minute. It doesn't grow no
>> matter how many hundreds of concurrent queries I send. The strange
>> thing is that none of the resources (memory, CPU, disk read/write, network
>> send/receive) has reached its maximum. They are all used at less than
>> half capacity.
>>
>> Could you suggest what the bottleneck might be? Could it be some Impala
>> setting that limits performance or the maximum number of concurrent
>> threads? The mem_limit option for my Impala daemons is about 70% of
>> available machine memory.
>>
>> Thanks,
>> Alexander
