If your query has many filters or IN clauses, Kylin will generate a lot of
fuzzy keys for the HBase scan. A moderate number of fuzzy keys helps HBase
scanning, but when the number of fuzzy keys grows too large, scan performance
degrades dramatically, because FuzzyRowFilter has to explore a large space of
possibilities. There is no easy way to overcome this issue; see my patch to
HBase at:

https://issues.apache.org/jira/browse/HBASE-14269

The side-effect is the high CPU usage you're observing.
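
To give a rough feel for how quickly the number of fuzzy keys can grow, here is a minimal sketch. It assumes (this is an assumption for illustration, not something stated above) that one fuzzy key is produced per combination of filter values across the filtered dimensions, so the count is the product of the per-dimension value counts. The class and method names are hypothetical and are not Kylin code.

```java
import java.util.Arrays;
import java.util.List;

public class FuzzyKeyCount {
    // Hypothetical helper: estimates the number of fuzzy keys under the
    // assumption that every combination of filter values yields one key.
    static long estimate(List<Integer> valuesPerDimension) {
        long total = 1;
        for (int n : valuesPerDimension) {
            // multiplyExact throws ArithmeticException on overflow
            // instead of silently wrapping around.
            total = Math.multiplyExact(total, (long) n);
        }
        return total;
    }

    public static void main(String[] args) {
        // Three filtered dimensions with 20 values each.
        System.out.println(estimate(Arrays.asList(20, 20, 20)));
    }
}
```

Under that assumption, even a modest number of values per dimension multiplies into thousands of keys once several dimensions are filtered.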

So in https://issues.apache.org/jira/browse/KYLIN-740, whenever we find that
too many fuzzy keys have been generated (using a magic number as the
threshold), we discard them all and scan HBase without any fuzzy keys.
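
The idea can be sketched roughly as follows. This is only an illustration of the "discard everything above a threshold" decision, not the actual KYLIN-740 code: the class name, method name, and threshold value are all hypothetical, since the real magic number is not given in this thread.

```java
import java.util.Collections;
import java.util.List;

public class FuzzyKeyGuard {
    // Hypothetical threshold; the value actually used in KYLIN-740
    // is not stated in this thread.
    static final int MAX_FUZZY_KEYS = 100;

    // Returns the fuzzy keys to hand to the scan. If there are too many,
    // all of them are dropped and the scan runs without any fuzzy keys.
    static <T> List<T> keysForScan(List<T> fuzzyKeys) {
        if (fuzzyKeys.size() > MAX_FUZZY_KEYS) {
            return Collections.emptyList();
        }
        return fuzzyKeys;
    }
}
```

The trade-off is that a scan without fuzzy keys reads more rows, but its cost is predictable, whereas a very large fuzzy key set can keep the region server busy for a long time.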

Hope this is useful to you.

On Tue, Sep 22, 2015 at 11:26 PM, vipul jhawar <[email protected]>
wrote:

> Looks like attachments are stripped off the email.
> Here is a screenshot -
> https://monosnap.com/file/JmpHEMxJVVQUhTLxTrzE1sWDn7gXg4
>
> On Tue, Sep 22, 2015 at 5:32 PM, vipul jhawar <[email protected]>
> wrote:
>
> > Hi hongbin
> >
> > It is attached in the previous reply.
> > Attached again.
> >
> > Thanks
> >
> > On Tue, Sep 22, 2015 at 11:58 AM, hongbin ma <[email protected]>
> wrote:
> >
> >> hi
> >>
> >> did you forget to attach the screenshot?
> >>
> >> On Tue, Sep 22, 2015 at 12:11 PM, vipul jhawar <[email protected]>
> >> wrote:
> >>
> >> > Hi
> >> >
> >> > We are on Kylin 0.7.2.
> >> > A screenshot of the call stack is attached for reference.
> >> >
> >> > Yesterday we did some more debugging and added a timeout check in the
> >> > coprocessor (AggregationScanner -> buildAggrCache), similar to the
> >> > checkMemoryUsage() check in the coprocessor, but when we enabled fuzzy
> >> > keys it simply remained stuck for hours.
> >> > It is not even looping: even when we added timeout checks of 1 min, the
> >> > timeout never happened, but the coprocessor was hung for a long time and
> >> > we had to bounce the regionserver. Could you explain what is causing the
> >> > coprocessor to remain hung for so long without even looping? Is it just
> >> > stuck on the scan forever?
> >> >
> >> > After this, when we disable the fuzzy keys, the timeout does get executed.
> >> > On further analysis we tried to reduce the fuzzy_value_cap and brought it
> >> > down to 20.
> >> > The problem is that when we switch on fuzzy keys and have filters which
> >> > lead to an IN clause, the coprocessor is not deterministic: sometimes it
> >> > goes into a spin and sometimes it executes fine. This is an issue because
> >> > we need deterministic performance and do not want the coprocessor to run
> >> > forever. Some queries run fine and are very fast, and some just get stuck
> >> > forever.
> >> >
> >> > The client times out with an RPC timeout, but the coprocessor thread just
> >> > hogs the CPU.
> >> >
> >> > Please comment.
> >> >
> >> > Thanks
> >> >
> >> >
> >> > On Tue, Sep 22, 2015 at 7:14 AM, hongbin ma <[email protected]>
> >> wrote:
> >> >
> >> >> hi vipul,
> >> >>
> >> >> What version are you using? Before
> >> >> https://issues.apache.org/jira/browse/KYLIN-740 we did spot some critical
> >> >> performance issues caused by many IN clauses. If you could provide a
> >> >> CPU/heap analysis (on your HBase region server), it would be easier to
> >> >> address the problem.
> >> >>
> >> >> On Mon, Sep 21, 2015 at 10:42 PM, vipul jhawar <[email protected]>
> >> >> wrote:
> >> >>
> >> >> > Hi
> >> >> >
> >> >> > We have noticed a pattern which causes the coprocessor to spike the
> >> >> > regionserver CPU to 100% over time.
> >> >> > If we issue a query through Kylin that involves scanning a lot of data
> >> >> > (say, multiple days with multiple filters on many dimensions), it has to
> >> >> > scan a large number of rows. If it doesn't return within the required
> >> >> > RPC timeout, the client does get an error message with the exception,
> >> >> > but on the regionserver we see no end to the processing, and it
> >> >> > ultimately hogs the regionserver.
> >> >> >
> >> >> > Are there any configs on the coprocessor which can be set so that if the
> >> >> > processing is not completed in N time, it simply times out? That way we
> >> >> > can look at the queries later but avoid the CPU spike, as it makes the
> >> >> > cluster unusable.
> >> >> >
> >> >> > Thanks
> >> >> >
> >> >>
> >> >>
> >> >>
> >> >> --
> >> >> Regards,
> >> >>
> >> >> *Bin Mahone | 马洪宾*
> >> >> Apache Kylin: http://kylin.io
> >> >> Github: https://github.com/binmahone
> >> >>
> >> >
> >> >
> >>
> >>
> >> --
> >> Regards,
> >>
> >> *Bin Mahone | 马洪宾*
> >> Apache Kylin: http://kylin.io
> >> Github: https://github.com/binmahone
> >>
> >
> >
>



-- 
Regards,

*Bin Mahone | 马洪宾*
Apache Kylin: http://kylin.io
Github: https://github.com/binmahone
