There are options to treat columns on the fact table without triggering the
dimension explosion (just like derived dimensions). One is the "joint"
dimension introduced in v1.5. Another is the "extended column" measure. The
related documentation needs to catch up, however.
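
For illustration, a minimal sketch of the aggregation group section of a
cube descriptor using joint dimensions (the column names are invented;
columns grouped as a joint only ever appear together in cuboids, so they
do not multiply the cuboid count):

  "aggregation_groups": [
    {
      "includes": ["CAL_DT", "CUST_ID", "CUST_ATTR_1", "CUST_ATTR_2"],
      "select_rule": {
        "hierarchy_dims": [],
        "mandatory_dims": [],
        "joint_dims": [
          ["CUST_ATTR_1", "CUST_ATTR_2"]
        ]
      }
    }
  ]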

Yang

On Tue, Jun 28, 2016 at 10:09 AM, Arun Khetarpal <[email protected]> wrote:

> I agree with Ric - forcing the dimension values back onto the fact table
> may be a step back.
> I propose opening a Jira to track this issue (and possibly working on it).
> Thoughts/suggestions?
>
> Regards,
> Arun
>
>
>
> On 28 June 2016 at 06:51, Richard Calaba (Fishbowl) <[email protected]>
> wrote:
>
>> Did a little searching in bin/*.sh and found setenv.sh, so I tried setting
>> the KYLIN_JVM_SETTINGS environment variable to -Xms1024M -Xmx16g, which
>> resolved the ‘sudden’ death of the Kylin server after increasing
>> kylin.table.snapshot.max_mb.
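>>
>> For anyone else hitting this, the change amounts to something like the
>> following in bin/setenv.sh (the heap sizes are just what worked on my
>> node, not a recommendation):
>>
>> # bin/setenv.sh: give the Kylin server JVM a larger heap so that big
>> # table snapshots fit in memory (example sizes, tune for your hardware)
>> export KYLIN_JVM_SETTINGS="-Xms1024M -Xmx16g"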
>>
>>
>>
>> So far all looks good, fingers crossed :)
>>
>>
>>
>> Ric.
>>
>>
>>
>> *From:* Richard Calaba (Fishbowl) [mailto:[email protected]]
>> *Sent:* Monday, June 27, 2016 5:48 PM
>> *To:* [email protected]
>> *Cc:* 'Richard Calaba (Fishbowl)' <[email protected]>
>>
>> *Subject:* RE: Dimension table 300MB Limit
>>
>>
>>
>> I am facing errors in kylin.log complaining about less than 100MB
>> available -> then the Kylin server dies silently. The issue is caused by a
>> high-cardinality dimension which requires an approx. 700MB data snapshot. I
>> have increased the parameter *kylin.table.snapshot.max_mb* to 750MB; with
>> this setting, Build Step 4 no longer complains about the snapshot being
>> more than 300MB (the exception java.lang.IllegalStateException: Table
>> snapshot should be no greater than 300 MB is gone), but the server dies
>> after a while. There is plenty of memory free on the node where Kylin runs
>> (more than 20GB free), so it seems to be a problem with Kylin's own memory
>> limit. I didn't find a way to increase the Kylin memory limit so that the
>> big snapshot won't kill the Kylin server. How can I do that?
>>
>>
>>
>> It is urgent! :)
>>
>>
>>
>> Thanx, ric
>>
>>
>>
>> *From:* Richard Calaba (Fishbowl) [mailto:[email protected]]
>> *Sent:* Monday, June 27, 2016 5:23 PM
>> *To:* '[email protected]' <[email protected]>
>> *Subject:* RE: Dimension table 300MB Limit
>>
>>
>>
>> I have 2 scenarios:
>>
>>
>>
>> 1) Time-dependent attributes of the customer. Here it might be an option
>> to put them on the fact table, since the values are derived from date and
>> ID -> but I need those dimensions to be “derived” from the fact table (two
>> fields, date and id, define the value). I have 10 fields like that in the
>> lookup table, so bringing them in as independent (normal) dimensions would
>> multiply the cuboid count, and hence the build time, by up to 2^10 = 1024,
>> right???
>>
>>
>>
>> 2) The 2nd scenario is similar: a lot of customer attributes (customer
>> being the high-cardinality dimension, approx. 10 million customers) to be
>> used as derived dimensions.
>>
>>
>>
>> Forcing the high-cardinality dimensions into the fact table is, in my
>> opinion, a step back: we are denormalizing the star schema.
>>
>>
>>
>> Ric.
>>
>>
>>
>> *From:* Li Yang [mailto:[email protected]]
>> *Sent:* Monday, June 27, 2016 3:45 PM
>> *To:* [email protected]
>> *Subject:* Re: Dimension table 300MB Limit
>>
>>
>>
>> Such big dimensions are better kept on the fact table (rather than on a
>> lookup table). The simplest way is to create a Hive view joining the old
>> fact table and the customer table, then assign the view as the new fact
>> table.
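>>
>> A minimal sketch of such a view, assuming a fact table FACT_SALES and a
>> lookup table CUSTOMER joined on CUST_ID (all table and column names here
>> are illustrative):
>>
>> -- Hive view denormalizing customer attributes onto the fact table;
>> -- afterwards, set FACT_SALES_WIDE as the cube's fact table
>> CREATE VIEW fact_sales_wide AS
>> SELECT f.*, c.age_group, c.region
>> FROM fact_sales f
>> JOIN customer c ON f.cust_id = c.cust_id;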
>>
>>
>>
>> On Tue, Jun 28, 2016 at 5:26 AM, Richard Calaba (Fishbowl) <
>> [email protected]> wrote:
>>
>> We have the same issue, though our size is just 700MB, so we are
>> interested in the background info and in any workarounds other than
>> setting a higher snapshot limit.
>>
>>
>>
>> Ric.
>>
>>
>>
>> *From:* Arun Khetarpal [mailto:[email protected]]
>> *Sent:* Monday, June 27, 2016 11:55 AM
>> *To:* [email protected]
>> *Subject:* Dimension table 300MB Limit
>>
>>
>>
>> Hi,
>>
>>
>>
>> We are evaluating Kylin as an analytical engine for OLAP. We are facing
>> OOM issues when dealing with a large dimension, ~70GB of customer data,
>> even after setting kylin.table.snapshot.max_mb to a high limit.
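>>
>> For reference, the override we set lives in conf/kylin.properties; the
>> default is 300 (MB) and the value below is only an example:
>>
>> # raise the lookup table snapshot size limit (in MB, default 300)
>> kylin.table.snapshot.max_mb=1024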
>>
>>
>>
>> I guess holding a dictionary this big in memory will not be a solution. Is
>> there any suggested workaround for this?
>>
>>
>>
>> Has any work been done by the community to get around this?
>>
>>
>>
>> Regards,
>>
>> Arun
>>
>>
>
>
