How frequent are the 100 concurrent queries? Also a `jstack` dump of the
hanging process might help.
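
For reference, here is a rough in-process sketch of the kind of information a `jstack` dump gives: a count of threads per state, plus the stacks of the RUNNABLE ones. It is only an illustration; `jstack` attaches to the running JVM from outside, which is what you want for a hung server. The class and method names (`ThreadStateSummary`, `countByState`, `dumpRunnable`) are made up for this example and are not part of Kylin.

```java
import java.util.EnumMap;
import java.util.Map;

public class ThreadStateSummary {

    // Count the live threads of this JVM by state (RUNNABLE, BLOCKED, WAITING, ...).
    static Map<Thread.State, Integer> countByState() {
        Map<Thread.State, Integer> counts = new EnumMap<>(Thread.State.class);
        for (Thread t : Thread.getAllStackTraces().keySet()) {
            counts.merge(t.getState(), 1, Integer::sum);
        }
        return counts;
    }

    // Render the stack of every RUNNABLE thread, similar in spirit to a jstack entry.
    static String dumpRunnable() {
        StringBuilder sb = new StringBuilder();
        for (Map.Entry<Thread, StackTraceElement[]> e : Thread.getAllStackTraces().entrySet()) {
            Thread t = e.getKey();
            if (t.getState() != Thread.State.RUNNABLE) {
                continue;
            }
            sb.append('"').append(t.getName()).append("\" ").append(t.getState()).append('\n');
            for (StackTraceElement frame : e.getValue()) {
                sb.append("    at ").append(frame).append('\n');
            }
        }
        return sb.toString();
    }

    public static void main(String[] args) {
        System.out.println(countByState());
        System.out.print(dumpRunnable());
    }
}
```

In a real dump, many query threads sitting in the same frames across several dumps taken a few seconds apart would be the thing to look for.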

We haven't done stress testing for a while. However, eBay production has
been quite stable as far as Kylin itself is concerned.

The only similar issue we had before was a query that kept the CPU fully
busy. Such a slow query cannot be interrupted, and thus cannot be killed by
BadQueryDetector. Again, a `jstack` dump should reveal this kind of issue.
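
To illustrate why a CPU-bound thread cannot simply be interrupted, here is a minimal sketch in plain Java. It is not Kylin code; `InterruptDemo` and `survivesInterrupt` are hypothetical names. `Thread.interrupt()` only sets a flag (or wakes a thread that is blocked in a wait or sleep), so a loop that never blocks and never checks the flag keeps running.

```java
import java.util.concurrent.atomic.AtomicBoolean;

public class InterruptDemo {

    // Starts a CPU-bound worker, interrupts it, and reports whether it is
    // still alive shortly afterwards. With checkInterrupt == false the loop
    // never looks at the interrupt flag, like a query stuck in pure computation.
    static boolean survivesInterrupt(boolean checkInterrupt) {
        AtomicBoolean stop = new AtomicBoolean(false); // escape hatch so the demo terminates
        Thread worker = new Thread(() -> {
            long acc = 0;
            while (!stop.get()) {
                acc = acc * 31 + 1; // busy work, no blocking calls
                if (checkInterrupt && Thread.currentThread().isInterrupted()) {
                    return; // cooperative exit
                }
            }
        });
        worker.setDaemon(true);
        worker.start();
        worker.interrupt();
        try {
            worker.join(500); // give the worker a chance to react
            boolean alive = worker.isAlive();
            stop.set(true);
            worker.join();
            return alive;
        } catch (InterruptedException e) {
            stop.set(true);
            throw new IllegalStateException("demo itself was interrupted", e);
        }
    }

    public static void main(String[] args) {
        System.out.println("ignores flag, still alive: " + survivesInterrupt(false));
        System.out.println("checks flag, still alive:  " + survivesInterrupt(true));
    }
}
```

A worker that polls the flag stops promptly; one that does not can only be stopped by its own exit condition, which is why such threads show up as RUNNABLE in the dump for a long time.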

On Thu, Nov 12, 2015 at 11:12 AM, ShaoFeng Shi <[email protected]>
wrote:

> Hi Chun En, did you analyze those “bad” SQLs to see whether they match the
> cube design well? Kylin doesn't guarantee that every query returns in a
> very short time, but 80396 seconds needs an administrator's attention. If
> the query is good, HBase is healthy, memory is sufficient, and CPU is at a
> normal level, you need to investigate what the real bottleneck is.
>
> Previously, in the eBay deployment, we encountered an extreme case (37
> dimensions, separated into a couple of aggregation groups): when a query
> crossed aggregation groups, it took a very long time. Later we identified
> the bottleneck and made an enhancement in Kylin v1.1; after that we didn't
> observe the issue again. As you already use Kylin 1.1, I don't think this
> is the case here. You may need to investigate further or provide more
> detailed information here for analysis.
>
>
>
> 2015-11-11 17:32 GMT+08:00 nichunen <[email protected]>:
>
> > Hi,
> >
> > We did a stress test of our Kylin server with 100 concurrent queries. It
> > worked fine at first, but after 1 day we could no longer query Kylin. The
> > log showed messages like "query has been running 80396 seconds", and many
> > "bad queries" were hung there. The HBase nodes were still alive, and the
> > cubes and jobs could still be listed on the pages. To check whether the
> > problem was in HBase, I restarted HBase and ran a new query. No region
> > server log showed that HBase had received the query; as far as we know, a
> > successful query produces log lines like "Klin Coprocessor start; Klin
> > Coprocessor aggregation done". And in kylin.log, there were still hung
> > queries.
> >
> >
> > Do you know what caused the problem? In our opinion, it may be one of:
> > 1. We use Kylin 1.1 on HBase 1.0.1.1 (I modified the HBase version in
> > pom.xml to build the package);
> > 2. The Tomcat max threads setting; we didn't modify any Tomcat settings;
> > 3. A problem in Kylin itself.
> >
> > BTW, we read the code of BadQueryDetector, and it seems a query thread
> > will be killed only when available memory is low and the query has been
> > running for 5 minutes. We suspect this may not be very reasonable.
> >
> > Best Regards,
> >
> >
> >
> > George/倪春恩
> >
> > Software Engineer/软件工程师
> >
> > Mobile:+86-13501723787| Fax:+8610-56842040
> >
> > 北京明略软件系统有限公司(www <http://www.semidata.com/>.mininglamp.com)
> >
> > 北京市昌平区东小口镇中东路398号中煤建设集团大厦1号楼4层
> >
> > F4,1#,Zhongmei Construction Group Plaza,398# Zhongdong Road,Changping
> > District,Beijing,102218
> >
> >
> >
> >
>
>
>
> --
> Best regards,
>
> Shaofeng Shi
>
