Re: coprocessor cause 100% cpu

hongbin ma Mon, 21 Sep 2015 23:29:17 -0700

hi

did you forget to attach the screenshot?


On Tue, Sep 22, 2015 at 12:11 PM, vipul jhawar <[email protected]>
wrote:

> Hi
>
> We are on Kylin 0.7.2.
> A screenshot of the call stack is attached for reference.
>
> Yesterday we did some more debugging and added a timeout check in the
> coprocessor (AggregationScanner -> buildAggrCache), similar to the
> checkMemoryUsage() check already in the coprocessor. With fuzzy keys
> enabled, however, it simply stays stuck for hours.
> It does not even seem to be looping: with a timeout check of 1 min in place,
> the timeout never fired, yet the coprocessor hung for a long time and we
> had to bounce the regionserver. Could you explain what causes the
> coprocessor to hang for so long without even entering the loop? Is it just
> stuck on the scan forever?
>
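
For illustration, a minimal sketch of the kind of cooperative timeout check
described above. It is plain Java with no Kylin or HBase dependencies;
ScanDeadline, AggrLoopSketch and the iterator standing in for the region
scanner are hypothetical names, not the actual AggregationScanner code. A
check like this can only fire between rows, so if the call that fetches the
next row never returns, the deadline is never evaluated, which would be
consistent with a timeout that never triggers.

```java
import java.util.Iterator;
import java.util.concurrent.TimeUnit;

/** Hypothetical helper (not part of Kylin or HBase): a cooperative deadline for a scan loop. */
final class ScanDeadline {
    private final long deadlineNanos;

    ScanDeadline(long timeoutMillis) {
        this.deadlineNanos = System.nanoTime() + TimeUnit.MILLISECONDS.toNanos(timeoutMillis);
    }

    /** Throws once the deadline has passed; meant to be called between rows. */
    void check() {
        // Subtraction form is the overflow-safe way to compare nanoTime values.
        if (System.nanoTime() - deadlineNanos > 0) {
            throw new IllegalStateException("coprocessor scan exceeded its time budget");
        }
    }
}

final class AggrLoopSketch {
    /** Stand-in for the aggregation loop; `rows` plays the role of the region scanner. */
    static long aggregate(Iterator<Long> rows, long timeoutMillis) {
        ScanDeadline deadline = new ScanDeadline(timeoutMillis);
        long sum = 0;
        while (rows.hasNext()) {   // if this call blocks or spins, check() is never reached
            sum += rows.next();
            deadline.check();      // runs only after a row has been returned
        }
        return sum;
    }
}
```

Calling AggrLoopSketch.aggregate(rows, 60_000) would give the loop a
one-minute budget, but only for time spent between rows.
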
> When we disable the fuzzy keys, the timeout does get executed.
> On further analysis we tried reducing fuzzy_value_cap and brought it
> down to 20.
> The problem is that when fuzzy keys are switched on and the filters lead to
> an IN clause, the coprocessor is not deterministic: sometimes it goes into
> a spin and sometimes it executes fine. This is an issue because we need
> deterministic performance and do not want the coprocessor to run
> forever. Some queries run fine and are very fast, while others just get
> stuck forever.
>
> The client times out with an RPC timeout, but the coprocessor thread just
> keeps hogging the CPU.
>
> Please comment.
>
> Thanks
>
>
> On Tue, Sep 22, 2015 at 7:14 AM, hongbin ma <[email protected]> wrote:
>
>> hi vipul,
>>
>> What version are you using? Before
>> https://issues.apache.org/jira/browse/KYLIN-740 we did spot some critical
>> performance issues caused by many IN clauses. If you could provide
>> a CPU/heap analysis (on your HBase region server), it would be easier to
>> address the problem.
>>
>> On Mon, Sep 21, 2015 at 10:42 PM, vipul jhawar <[email protected]>
>> wrote:
>>
>> > Hi
>> >
>> > We have noticed a pattern that causes the coprocessor to spike the
>> > regionserver CPU to 100% over time.
>> > If we issue a query through Kylin that involves scanning a lot of data
>> > (say, multiple days with multiple filters on many dimensions), it has to
>> > scan a large number of rows. If it does not return within the required
>> > RPC timeout, the client does get an error message with the exception,
>> > but on the regionserver we see no end to the processing, and it
>> > ultimately hogs the regionserver.
>> >
>> > Are there any configs on the coprocessor that can be set so that, if
>> > processing is not completed in N time, it simply times out? That way we
>> > can look at the queries later but avoid the CPU spike, which makes the
>> > cluster unusable.
>> >
>> > Thanks
>> >
>>
>>
>>
>> --
>> Regards,
>>
>> *Bin Mahone | 马洪宾*
>> Apache Kylin: http://kylin.io
>> Github: https://github.com/binmahone
>>
>
>


-- 
Regards,

*Bin Mahone | 马洪宾*
Apache Kylin: http://kylin.io
Github: https://github.com/binmahone
