Are you submitting all queries to the same coordinator? If so, you might have to increase the --fe_service_threads startup flag on that impalad to allow more concurrent client connections. That said, a single coordinator will eventually become a bottleneck anyway, so we recommend spreading queries across different impalads.
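For what it's worth, spreading load across coordinators can be as simple as round-robining the JDBC URL per connection on the client side. Here is a minimal sketch (the hostnames, class name, and port 21050 are placeholder assumptions, not anything from your cluster; the exact JDBC URL format depends on which driver you use):

```java
import java.util.List;
import java.util.concurrent.atomic.AtomicInteger;

// Hypothetical round-robin selector for coordinator JDBC URLs.
// Hostnames are placeholders; substitute your own impalad nodes.
public class CoordinatorPicker {
    private final List<String> hosts;
    private final AtomicInteger next = new AtomicInteger();

    public CoordinatorPicker(List<String> hosts) {
        this.hosts = hosts;
    }

    // Returns a JDBC URL for the next coordinator in round-robin order.
    public String nextUrl() {
        String host = hosts.get(Math.floorMod(next.getAndIncrement(), hosts.size()));
        return "jdbc:impala://" + host + ":21050";
    }

    public static void main(String[] args) {
        CoordinatorPicker picker = new CoordinatorPicker(
                List.of("worker1", "worker2", "worker3"));
        // Each new connection would use the next coordinator in the cycle.
        for (int i = 0; i < 4; i++) {
            System.out.println(picker.nextUrl());
        }
    }
}
```

A load balancer (e.g. haproxy) in front of the impalads achieves the same effect without client changes.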
On Fri, Sep 1, 2017 at 9:41 AM, Tim Armstrong <[email protected]> wrote:

> Hi Alexander,
> It's hard to know based on the information available. Query profiles
> often provide some clues here. I agree Impala should be able to max out one
> of the resources in most circumstances.
>
> On Impala 2.8 and earlier we saw behaviour similar to what you described
> when running queries with selective scans on machines with many cores:
> https://issues.apache.org/jira/browse/IMPALA-4923 . The bottleneck there
> was lock contention during memory allocation - the threads spent a lot of
> time asleep waiting to get a shared lock.
>
> On Fri, Sep 1, 2017 at 8:36 AM, Alexander Shoshin <[email protected]> wrote:
>
>> Hi,
>>
>> I am working with Impala, trying to find its maximum throughput on my
>> hardware. I have a cluster under Cloudera Manager which consists of 7
>> machines (1 master node + 6 worker nodes).
>>
>> I am running queries on Impala using JDBC. I've reached a maximum
>> throughput of 80 finished queries per minute. It doesn't grow no
>> matter how many hundreds of concurrent queries I send. The strange
>> thing is that none of the resources (memory, CPU, disk read/write, network
>> send/receive) has reached its maximum. They are all used at less than
>> half capacity.
>>
>> Could you suggest what the bottleneck might be? Could it be some Impala
>> setting that limits performance or the maximum number of concurrent
>> threads? The mem_limit option for my Impala daemons is about 70% of
>> available machine memory.
>>
>> Thanks,
>> Alexander
