Hi hongbin,

It was attached in the previous reply. Attached again.

Thanks

On Tue, Sep 22, 2015 at 11:58 AM, hongbin ma <[email protected]> wrote:

> hi
>
> did you forget to attach the screenshot?
>
> On Tue, Sep 22, 2015 at 12:11 PM, vipul jhawar <[email protected]> wrote:
>
> > Hi
> >
> > We are kylin 0.7.2 .
> > A screenshot of the call stack is attached for reference.
> >
> > Yesterday we have done some more debugging and we added a timeout check
> > in co processor AggregationScanner -> buildAggrCache
> > similar to checkMemoryUsage() check in the co processor but when we
> > enabled fuzzy keys it simply remains stuck for hours.
> > It's not even looping as even when we added timeout checks of 1 min, the
> > timeout never happened but the co processor was hung for a long time and
> > we had to bounce the regionserver. If you could explain what is causing
> > the co processor to remain hung for so long and not even loop in. Is it
> > just stuck on the scan forever.
> >
> > After this when we disable the fuzzy keys, the timeout does get executed.
> > On further analysis we tried to reduce the fuzzy_value_cap and brought it
> > down to 20.
> > The problem is that when we switch on fuzzy and have filters which lead
> > to IN clause, the co processor is not deterministic and it goes into a
> > spin sometimes and it executes fine sometimes which becomes an issue as
> > we need deterministic performance and do not want to co processor to be
> > running for ever. Some queries run fine and are very fast and some just
> > get stuck forever.
> >
> > The client time out with an rpc timeout but the co processor thread just
> > hogs the CPU.
> >
> > Please comment.
> >
> > Thanks
> >
> > On Tue, Sep 22, 2015 at 7:14 AM, hongbin ma <[email protected]> wrote:
> >
> >> hi vipul,
> >>
> >> what version are you using?
> >> before https://issues.apache.org/jira/browse/KYLIN-740 we did spot some
> >> critical performance issues caused by many IN clauses, if you could help
> >> to provide a CPU/heap analysis(on your hbase's region server) it would
> >> be easier to address the problem.
> >>
> >> On Mon, Sep 21, 2015 at 10:42 PM, vipul jhawar <[email protected]>
> >> wrote:
> >>
> >> > Hi
> >> >
> >> > Have noticed a pattern that which caused the co processor to spike the
> >> > regionserver cpu to 100% over time.
> >> > If we end up issuing a query thru kylin which may involve a scanning a
> >> > lot of data assuming multiple days with multiple filters for many
> >> > dimensions in which case it has to scan a large number of rows and if
> >> > it doesnt return in the required rpc timeout then the client does get
> >> > an error message with the exception, but on the regionserver we see no
> >> > end to processing and it ultimately hogs the regionserver.
> >> >
> >> > Are there any configs on the coprocessor which can be configured to
> >> > say that if the processing is not completed in N time, then simply
> >> > timeout as that way we can look at the queries later but avoid cpu
> >> > spike as it makes the cluster unusable.
> >> >
> >> > Thanks
> >> >
> >>
> >> --
> >> Regards,
> >>
> >> *Bin Mahone | 马洪宾*
> >> Apache Kylin: http://kylin.io
> >> Github: https://github.com/binmahone
> >>
> >
>
> --
> Regards,
>
> *Bin Mahone | 马洪宾*
> Apache Kylin: http://kylin.io
> Github: https://github.com/binmahone
>
