Re: Problem during stress test for kylin query

Adunuthula, Seshu Thu, 12 Nov 2015 05:33:08 -0800

You can scale out the Kylin deployment. I am curious to know the details of
your deployment:


- Number of Kylin server nodes
- Number of HBase nodes

Increasing these should increase your concurrency levels…


Regards
Seshu Adunuthula


On 11/12/15, 1:31 AM, "Li Yang" <[email protected]> wrote:

>How frequent are the 100 concurrent queries? Also a `jstack` dump of the
>hanging process might help.
>
>We haven't done stress testing for a while. However eBay production is
>quite stable regarding Kylin itself.
>
>The only similar issue we had before was a query that kept the CPU fully
>busy. Such a slow query cannot be interrupted, and thus cannot be killed by
>BadQueryDetector. Again, a `jstack` dump should reveal such an issue.
>
>On Thu, Nov 12, 2015 at 11:12 AM, ShaoFeng Shi <[email protected]>
>wrote:
>
>> Hi Chun En, did you analyze those “bad” SQLs to see whether they match
>> the cube design well? Kylin doesn't guarantee that every query returns in
>> a very short time, but 80396 seconds needs an administrator's attention.
>> If the query is good, HBase is healthy, memory is sufficient, and CPU is
>> at a normal level, you need to investigate what the real bottleneck is.
>>
>> Previously, in the eBay deployment, we encountered an extreme case (37
>> dimensions, separated into a couple of aggregation groups): when a query
>> crossed aggregation groups, it took a very long time. Later we identified
>> the bottleneck and made an enhancement in Kylin v1.1; after that we
>> didn't observe the issue again. As you already use Kylin 1.1, I don't
>> think it is this case. You may need to investigate further or provide
>> more detailed information here for analysis.
>>
>>
>>
>> 2015-11-11 17:32 GMT+08:00 nichunen <[email protected]>:
>>
>> > Hi,
>> >
>> > We did a stress test of our Kylin server with 100 concurrent queries.
>> > It worked fine at first, but after 1 day we could no longer query Kylin.
>> > The log showed messages like "query has been running 80396 seconds",
>> > and many "bad queries" were hung. The HBase nodes were still alive, and
>> > the cubes and jobs could still be listed on the pages. To check whether
>> > HBase was the problem, I restarted HBase and ran a new query. No region
>> > server log showed that HBase had received the query; as far as we know,
>> > a successful query produces log lines like "Kylin Coprocessor start;
>> > Kylin Coprocessor aggregation done". And according to kylin.log, queries
>> > were still hung.
>> >
>> >
>> > Do you know what caused the problem? In our opinion, it may be one of:
>> > 1. Running Kylin 1.1 on HBase 1.0.1.1 (I modified the HBase version in
>> > pom.xml to build the package);
>> > 2. The Tomcat max threads setting (we didn't modify any Tomcat settings);
>> > 3. A problem in Kylin itself.
>> >
>> > BTW, we read the code of BadQueryDetector, and it seems a query thread
>> > is killed only when available memory is low and the query has run for
>> > 5 minutes. We suspect this may not be very reasonable.
>> >
>> > Best Regards,
>> >
>> >
>> >
>> > George/倪春恩
>> >
>> > Software Engineer/软件工程师
>> >
>> > Mobile:+86-13501723787| Fax:+8610-56842040
>> >
>> > 北京明略软件系统有限公司（www <http://www.semidata.com/>.mininglamp.com）
>> >
>> > 北京市昌平区东小口镇中东路398号中煤建设集团大厦1号楼4层
>> >
>> > F4,1#,Zhongmei Construction Group Plaza,398# Zhongdong Road,Changping
>> > District,Beijing,102218
>>
>>
>>
>> --
>> Best regards,
>>
>> Shaofeng Shi
>>
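
The point quoted above, that a query busy on CPU cannot be interrupted, follows from how Java threads work in general: `Thread.interrupt()` only sets a flag, and a loop that never checks that flag or blocks simply keeps running. Below is a minimal sketch of that behaviour. It is not Kylin code; the class `InterruptDemo` and its method names are hypothetical and exist only for illustration.

```java
import java.util.concurrent.atomic.AtomicBoolean;

public class InterruptDemo {

    /** CPU-bound loop that never looks at the interrupt flag (hypothetical stand-in for a busy query). */
    public static long spinIgnoringInterrupt(AtomicBoolean stop) {
        long n = 0;
        while (!stop.get()) {   // only an external stop flag ends this loop
            n++;
        }
        return n;
    }

    /** The same loop, but cooperative: it polls the interrupt flag on every iteration. */
    public static long spinCheckingInterrupt() {
        long n = 0;
        while (!Thread.currentThread().isInterrupted()) {
            n++;
        }
        return n;
    }

    /** Starts the task, interrupts it, waits up to waitMillis, and reports whether it is still alive. */
    public static boolean aliveAfterInterrupt(Runnable task, long waitMillis) {
        Thread t = new Thread(task, "query-sim");
        t.setDaemon(true);          // do not keep the JVM alive for a stuck thread
        t.start();
        try {
            Thread.sleep(50);       // give the thread time to start spinning
            t.interrupt();          // sets the interrupt flag; does not stop the thread
            t.join(waitMillis);
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            throw new IllegalStateException("demo itself was interrupted", e);
        }
        return t.isAlive();
    }

    public static void main(String[] args) {
        AtomicBoolean stop = new AtomicBoolean(false);
        boolean stubborn = aliveAfterInterrupt(() -> spinIgnoringInterrupt(stop), 500);
        boolean cooperative = aliveAfterInterrupt(InterruptDemo::spinCheckingInterrupt, 500);
        System.out.println("ignoring-interrupt thread still alive: " + stubborn);
        System.out.println("checking-interrupt thread still alive: " + cooperative);
        stop.set(true);             // release the stubborn thread
    }
}
```

Under these assumptions the first thread should still be alive after the interrupt and the second should have exited. In a `jstack` dump, a thread of the first kind would typically appear as RUNNABLE inside the same loop across several consecutive dumps.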
