Hi Parkavi,

I failed to build a cube from my fact table and lookup table (3 million rows each) with 5 dimensions. The build reported the error "Too high cardinality is not suitable for dictionary ...", so I looked for the cause and found a threshold of 2 million set in DictionaryGenerator.java, which is different from yours. I heard that you created a 2.21 GB cube from a 3.27 GB fact table (row count 25,236,160) and want to change the value of the setting. Have you changed it and rebuilt the code successfully yet?
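For what it's worth, here is a minimal sketch of the short-term fix suggested further down this thread: reading the threshold from a config file instead of hard-coding it. This is not Kylin's actual code. The class name, the property key and both methods are hypothetical, and the only value taken from the source is the 2 million default I saw.

```java
import java.util.Properties;

public class TunableThreshold {
    // Hypothetical property key for illustration; not a real Kylin setting.
    public static final String KEY = "example.dictionary.max.cardinality";

    // Default mirrors the 2 million threshold mentioned above.
    public static final int DEFAULT_THRESHOLD = 2_000_000;

    // Returns the configured threshold, or the default when the property
    // is missing, blank, non-numeric or not positive.
    public static int resolve(Properties props) {
        String raw = props.getProperty(KEY);
        if (raw == null || raw.trim().isEmpty()) {
            return DEFAULT_THRESHOLD;
        }
        try {
            int value = Integer.parseInt(raw.trim());
            return value > 0 ? value : DEFAULT_THRESHOLD;
        } catch (NumberFormatException e) {
            return DEFAULT_THRESHOLD;
        }
    }

    // Hypothetical guard: true when a column's cardinality is over the limit.
    public static boolean exceeds(long cardinality, int threshold) {
        return cardinality > threshold;
    }

    public static void main(String[] args) {
        Properties props = new Properties();
        props.setProperty(KEY, "5000000");
        int threshold = resolve(props);
        System.out.println("threshold = " + threshold);
        System.out.println("3,000,000 rows too high? " + exceeds(3_000_000L, threshold));
    }
}
```

With something like this, raising the limit would only need a config change and a restart rather than a recompile, but whether that is safe for existing cubes is exactly what I am asking about.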
Thanks
Tim

> On Mon, May 18, 2015 at 10:23 AM, dong wang <[email protected]> wrote:
>
> > Thanks hua, usually users don't need to fetch 4,000,000 + rows of the result, but for the intermediate query result, the row number may be much more than 4,000,000+ rows, in your above reply, u mentioned that we can just change the value of the setting, then rebuild the codes and restart the tomcat, is it what you have already tested? since currently there are so much data in the existing cubes, I have to make it sure that all such operations are safe to take~
> >
> > 2015-05-15 21:29 GMT+08:00 Adunuthula, Seshu <[email protected]>:
> >
> > > As a short term fix, does it make sense to make this a tunable parameter and move this to a config file?
> > >
> > > On 5/15/15, 5:58 AM, "Huang Hua" <[email protected]> wrote:
> > >
> > > > Hi Dong,
> > > >
> > > > I don't think so. You can safely change that setting but then you need to recompile kylin to generate the new war(don't use the deploy.sh because that will wipe out all your kylin hbase meta storage). After the war is generated, put that war under Tomcat webapps directory and restarts the Tomcat. That should work well.
> > > >
> > > > Best.
> > > > Hua
> > > >
> > > > > -----Original Message-----
> > > > > From: dev-return-1698-[email protected] [mailto:dev-return-[email protected]] On Behalf Of dong wang
> > > > > Sent: May 15, 2015 18:54
> > > > > To: [email protected]
> > > > > Subject: Re: Increase query performance
> > > > >
> > > > > I found the setting for the threshold locates in StorageContext.java, the related piece of codes are:
> > > > >
> > > > > public class StorageContext {
> > > > >
> > > > >     public static final int HARD_THRESHOLD = 4000000;
> > > > >
> > > > > thus, I have a question that currently I have already built some segments successfully, later on, if I change the threshold much greater, will it affect the existing data in the cube storage?
> > > > >
> > > > > 2015-05-15 18:48 GMT+08:00 dong wang <[email protected]>:
> > > > >
> > > > > > Hi all, today I also met with the same problem, however, maybe mine is much more strange, the SQL lies in the following:
> > > > > >
> > > > > > select count(* ) from (select 1 from test1 where condtionx group by col1, col2, col3) t1
> > > > > >
> > > > > > since the result of the sub query is greater than 4000000, the exception is thrown out~ however, the final row count of the the whole SQL is just 1 row, such kind of SQL is usually implemented to obtain the total row count of some queries for paging feature~
> > > > > >
> > > > > > 2015-05-13 18:15 GMT+08:00 Parkavi Nandagopal <[email protected]>:
> > > > > >
> > > > > > > After getting that below error (Scan row count exceeded threshold: 4000000), kylin is stopped/crashed automatically.
> > > > > > > Is Kylin single point of Failure?
> > > > > > > How to make it has an High availability?
> > > > > > >
> > > > > > > Thanks,
> > > > > > > Parkavi.
> > > > > > >
> > > > > > > -----Original Message-----
> > > > > > > From: Parkavi Nandagopal
> > > > > > > Sent: Wednesday, May 13, 2015 10:49 AM
> > > > > > > To: dev; '[email protected]'
> > > > > > > Subject: RE: Increase query performance
> > > > > > >
> > > > > > > Size of my hive fact table = 3.27 GB ( row count 25,236,160)
> > > > > > > Cube size = 2.21 GB
> > > > > > >
> > > > > > > I created hierarchy dimension with 18 levels.
> > > > > > > Col1 -> Col2 -> ......upto Col18
> > > > > > > For this 18 levels, total cardinality = 2635
> > > > > > >
> > > > > > > I attached 2 log files.
> > > > > > > Log1 - query with limit 1000000
> > > > > > > Partial result came.
> > > > > > > Log2 - Clicked show all in Query result.
> > > > > > > Getting ERROR : exception while executing query: Scan row count exceeded threshold: 4000000, please add filter condition to narrow down backend scan range, like where clause.
> > > > > > >
> > > > > > > Thanks,
> > > > > > > Parkavi.
> > > > > > >
> > > > > > > -----Original Message-----
> > > > > > > From: hongbin ma [mailto:[email protected]]
> > > > > > > Sent: Wednesday, May 13, 2015 7:15 AM
> > > > > > > To: dev
> > > > > > > Subject: Re: Increase query performance
> > > > > > >
> > > > > > > before you expand your cluster, you might need to analyse why it's delivering poor performance.
> > > > > > >
> > > > > > > how about the size of your hive fact table? the cardinality of the dimension columns?
> > > > > > >
> > > > > > > if possible you can run a query,and paste the query's log in KYLIN_HOME/logs/kylin.log for that query. we can help you check for any abnormalities. (make sure you're writing a slightly different query, to avoid hitting cache)
> > > > > > >
> > > > > > > On Tue, May 12, 2015 at 2:04 PM, Parkavi Nandagopal <[email protected]> wrote:
> > > > > > >
> > > > > > > > Hi ,
> > > > > > > >
> > > > > > > > I have installed kylin and created cube(3GB size) with only one region server and when I query the cube data, it is taking much time to show the query result in Kylin web UI.
> > > > > > > > If I add 3 or more region server node with high configuration and I create a cube then query the cube means will it increase the query performance?
> > > > > > > >
> > > > > > > > Thanks,
> > > > > > > > Parkavi.
> > > > > > >
> > > > > > > --
> > > > > > > Regards,
> > > > > > >
> > > > > > > *Bin Mahone*
> > > > > > > Apache Kylin: http://kylin.io
> > > > > > > Github: https://github.com/binmahone
