Re: Reply: Increase query performance

Adunuthula, Seshu Fri, 15 May 2015 06:30:22 -0700

As a short-term fix, does it make sense to make this a tunable parameter
and move it to a config file?
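
A minimal sketch of what a config-driven threshold could look like. The class name ScanThreshold and the property key kylin.query.scan.threshold are assumptions made up for this illustration and are not taken from the Kylin source; only the default of 4000000 and the error text come from this thread. The value is read from a java.util.Properties object and falls back to the current hard-coded default when the key is missing or invalid.

```java
import java.util.Properties;

/** Hypothetical helper for illustration only; not part of Kylin. */
final class ScanThreshold {
    /** Default copied from StorageContext.HARD_THRESHOLD as quoted in this thread. */
    static final int DEFAULT_THRESHOLD = 4000000;
    /** Assumed property key, invented for this sketch. */
    static final String KEY = "kylin.query.scan.threshold";

    private final int limit;

    private ScanThreshold(int limit) {
        this.limit = limit;
    }

    /** Reads the limit from props; uses the default if the key is absent, non-numeric, or not positive. */
    static ScanThreshold fromProperties(Properties props) {
        String raw = props.getProperty(KEY);
        if (raw == null || raw.trim().isEmpty()) {
            return new ScanThreshold(DEFAULT_THRESHOLD);
        }
        try {
            int value = Integer.parseInt(raw.trim());
            return new ScanThreshold(value > 0 ? value : DEFAULT_THRESHOLD);
        } catch (NumberFormatException e) {
            return new ScanThreshold(DEFAULT_THRESHOLD);
        }
    }

    int limit() {
        return limit;
    }

    /** Throws once the number of scanned rows goes past the configured limit. */
    void check(long scannedRows) {
        if (scannedRows > limit) {
            throw new IllegalStateException("Scan row count exceeded threshold: " + limit
                    + ", please add filter condition to narrow down backend scan range, like where clause.");
        }
    }
}
```

With something along these lines, raising the limit would be a config change plus a restart rather than a rebuild of the war. Since the check only runs at query time, it would not touch segments that are already built.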


On 5/15/15, 5:58 AM, "Huang Hua" <[email protected]> wrote:

>Hi Dong,
>
>I don't think so. You can safely change that setting, but then you need to
>recompile Kylin to generate a new WAR (don't use deploy.sh, because
>that will wipe out all your Kylin HBase metadata storage). After the WAR is
>generated, put it under the Tomcat webapps directory and restart
>Tomcat. That should work well.
>
>Best.
>Hua
>> -----Original Message-----
>> From: dev-return-1698-
>> [email protected] [mailto:dev-return-
>> [email protected]] On Behalf Of dong
>> wang
>> Sent: May 15, 2015 18:54
>> To: [email protected]
>> Subject: Re: Increase query performance
>> 
>> I found that the threshold is set in StorageContext.java. The
>> relevant piece of code is:
>> public class StorageContext {
>> 
>>     public static final int HARD_THRESHOLD = 4000000;
>> 
>> 
>> So my question is: I have already built some segments
>> successfully. If I later change the threshold to a much greater value,
>> will it affect the existing data in the cube storage?
>> 
>> 2015-05-15 18:48 GMT+08:00 dong wang <[email protected]>:
>> 
>> > Hi all, today I ran into the same problem, though my case may be
>> > stranger. The SQL is as follows:
>> > select count(*) from (select 1 from test1 where conditionx group by
>> > col1, col2, col3) t1
>> >
>> > Since the result of the subquery has more than 4000000 rows, the
>> > exception is thrown. However, the final result of the whole
>> > SQL is just 1 row. This kind of SQL is commonly used to obtain
>> > the total row count of a query for a paging feature.
>> >
>> > 2015-05-13 18:15 GMT+08:00 Parkavi Nandagopal <[email protected]>:
>> >
>> >> After getting the error below (Scan row count exceeded threshold:
>> >> 4000000), Kylin stopped/crashed automatically.
>> >> Is Kylin a single point of failure?
>> >> How can it be made highly available?
>> >>
>> >> Thanks,
>> >> Parkavi.
>> >>
>> >>
>> >> -----Original Message-----
>> >> From: Parkavi Nandagopal
>> >> Sent: Wednesday, May 13, 2015 10:49 AM
>> >> To: dev; '[email protected]'
>> >> Subject: RE: Increase query performance
>> >>
>> >> Size of my Hive fact table = 3.27 GB (row count 25,236,160)
>> >> Cube size = 2.21 GB
>> >>
>> >> I created a hierarchy dimension with 18 levels:
>> >> Col1 -> Col2 -> ... up to Col18
>> >> For these 18 levels, total cardinality = 2635
>> >>
>> >> I attached 2 log files.
>> >> Log1 - query with limit 1000000.
>> >> A partial result came back.
>> >> Log2 - clicked "show all" in the query result.
>> >> Got ERROR: exception while executing query: Scan row count exceeded
>> >> threshold: 4000000, please add filter condition to narrow down
>> >> backend scan range, like where clause.
>> >>
>> >> Thanks,
>> >> Parkavi.
>> >>
>> >> -----Original Message-----
>> >> From: hongbin ma [mailto:[email protected]]
>> >> Sent: Wednesday, May 13, 2015 7:15 AM
>> >> To: dev
>> >> Subject: Re: Increase query performance
>> >>
>> >> Before you expand your cluster, you might need to analyse why it's
>> >> delivering poor performance.
>> >>
>> >> What is the size of your Hive fact table? What is the cardinality of
>> >> the dimension columns?
>> >>
>> >> If possible, run a query and paste that query's log from
>> >> KYLIN_HOME/logs/kylin.log. We can help you check for
>> >> any abnormalities. (Make sure you write a slightly different
>> >> query, to avoid hitting the cache.)
>> >>
>> >> On Tue, May 12, 2015 at 2:04 PM, Parkavi Nandagopal
>> >> <[email protected]>
>> >> wrote:
>> >>
>> >> > Hi ,
>> >> >
>> >> > I have installed Kylin and created a cube (3 GB in size) with only
>> >> > one region server. When I query the cube data, it takes a long
>> >> > time to show the query result in the Kylin web UI.
>> >> > If I add 3 or more high-spec region server nodes, create a cube,
>> >> > and then query it, will that increase the query performance?
>> >> >
>> >> >
>> >> > Thanks,
>> >> > Parkavi.
>> >> >
>> >> >
>> >> >
>> >>
>> >>
>> >>
>> >> --
>> >> Regards,
>> >>
>> >> *Bin Mahone | 马洪宾*
>> >> Apache Kylin: http://kylin.io
>> >> Github: https://github.com/binmahone
>> >>
>> >
>> >
>
>
