I am facing errors in kylin.log complaining about less than 100MB available, and then the Kylin server dies silently. The issue is caused by a high-cardinality dimension whose data snapshot is approx. 700MB. I have increased the parameter kylin.table.snapshot.max_mb to 750 – with this setting, Build Step 4 no longer complains about the snapshot exceeding 300MB (the exception "java.lang.IllegalStateException: Table snapshot should be no greater than 300 MB" is gone), but the server still dies after a while. There is plenty of free memory on the node where Kylin runs (more than 20GB free), so it seems to be a limit within Kylin itself (presumably the JVM heap) rather than on the machine. I haven't found a way to increase the Kylin memory limit so that the big snapshot won't kill the Kylin server. How can I do that?

It is urgent! :) Thanks,
ric
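A minimal sketch of raising the server heap, assuming the standard binary package where the heap is set via KYLIN_JVM_SETTINGS in bin/setenv.sh – the file location and the flag values below are illustrative and may vary by Kylin version:

    # The snapshot is materialized inside the Kylin server JVM, so the
    # heap must hold the ~700MB snapshot plus dictionaries and working
    # memory. In $KYLIN_HOME/bin/setenv.sh, raise the default -Xmx, e.g.:
    export KYLIN_JVM_SETTINGS="-Xms4g -Xmx16g -verbose:gc \
      -Xloggc:${KYLIN_HOME}/logs/kylin.gc.log"

    # The snapshot size cap itself stays in conf/kylin.properties
    # (as already set above):
    #   kylin.table.snapshot.max_mb=750

    # Restart the server so the new heap takes effect:
    ${KYLIN_HOME}/bin/kylin.sh stop
    ${KYLIN_HOME}/bin/kylin.sh start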
From: Richard Calaba (Fishbowl) [mailto:[email protected]]
Sent: Monday, June 27, 2016 5:23 PM
To: '[email protected]'
Subject: RE: Dimension table 300MB Limit

I have 2 scenarios:

1) Time-dependent attributes of customer – here it might be an option to put those into the fact table, as the values are derived from date and ID. But I need those dimensions to be "derived" from the fact table: two fields – date and ID – define the value. I have 10 fields like that in the lookup table, and since each independent (normal) dimension doubles the number of cuboids, bringing them in as normal dimensions would multiply the cuboid count (and hence the build time) by up to 2^10, right?

2) The second scenario is similar – many attributes of customer (which is the high-cardinality dimension, approx. 10 million customers) to be used as derived dimensions.

Forcing the high-cardinality dimensions into the fact table is, in my opinion, a step back – we are denormalizing the star schema.

Ric.

From: Li Yang [mailto:[email protected]]
Sent: Monday, June 27, 2016 3:45 PM
To: [email protected]
Subject: Re: Dimension table 300MB Limit

Such big dimensions had better be part of the fact table (rather than on a lookup table). The simplest way is to create a Hive view joining the old fact table and the customer table, then assign the view to be the new fact table (a sketch of this appears at the end of the thread).

On Tue, Jun 28, 2016 at 5:26 AM, Richard Calaba (Fishbowl) <[email protected]> wrote:

We have the same issue, though our size is just 700MB. So I am interested in the background info and in workarounds other than setting a higher snapshot limit – if there are any?

Ric.

From: Arun Khetarpal [mailto:[email protected]]
Sent: Monday, June 27, 2016 11:55 AM
To: [email protected]
Subject: Dimension table 300MB Limit

Hi,

We are evaluating Kylin as an analytical engine for OLAP. We are facing OOM issues when dealing with large dimensions, ~70GB (customer data), even after setting kylin.table.snapshot.max_mb to a high limit. I guess having a dictionary this big in memory will not be a solution. Is there a suggested workaround for this? Has the community done any work to get around it?

Regards,
Arun
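A minimal sketch of the Hive-view approach Li Yang describes above. The table and column names (fact_sales, dim_customer, cust_id, segment, region) are hypothetical – substitute your own schema:

    # Flatten the big customer lookup into the fact table via a Hive view,
    # then point the Kylin model at the view as its fact table.
    # All table/column names here are made up for illustration.
    hive -e "
    CREATE VIEW IF NOT EXISTS fact_sales_flat AS
    SELECT f.*,
           c.segment,
           c.region
    FROM   fact_sales f
    LEFT JOIN dim_customer c
           ON f.cust_id = c.cust_id;
    "

Once the view is assigned as the fact table in the Kylin model, the customer attributes become ordinary fact-table columns, so no lookup-table snapshot (and hence no snapshot size limit) applies to them. Note that high-cardinality columns used as normal dimensions still need dictionaries for encoding, so the dimension encoding should be chosen carefully.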
