> For 1): it seems that only the resource path, table desc, etc. are kept in memory, while a new LookupStringTable is created per query/request and holds onto data for the lifetime of the request. So once the request is done, it should be garbage collectable?

/table is just for the Hive table's schema. The lookup table content is cached in SnapshotManager, and so far it is never evicted. So if you have a lot of large lookup tables, this will be a problem.

> 3) Also the derived filter translator, is there a way to modify the 'IN_THRESHOLD' via config file?

Are you facing performance issues with a lot of IN clauses? If so, please take a look at https://issues.apache.org/jira/browse/KYLIN-740. The patch will be merged into the next release.

On Mon, Aug 31, 2015 at 9:54 PM, Abhilash L L <[email protected]> wrote:

> Sorry for the confusion,
>
> for 1) .. seems like only the resource path / table desc etc. is
> kept in memory, while a new lookupstringtable is created per query/request
> which holds onto data for the lifetime of the request. So once the request
> is done, it should be garbage collectable?
>
> 3) Also the derived filter translator, is there a way to modify the
> 'IN_THRESHOLD' via config file?
>
> Regards,
> Abhilash
>
> On Mon, Aug 31, 2015 at 7:05 PM, Abhilash L L <[email protected]>
> wrote:
>
> > Hello,
> >
> > We started noticing that the Kylin Tomcat server is taking a lot of RAM.
> > It even hit a limit of 10GB.
> >
> > After spending some time going over the code, it seems like the
> > cube enumerator is not storing anything in memory. But the lookup table
> > enumerator seems to be loading all records and storing them in memory.
> >
> > 1) What happens when there are a lot of projects defined and we end up
> > with tons of lookup tables across them? Do they get swapped out
> > automatically? I am not able to track where eviction is happening. The
> > snapshot manager has a 'removeSnapshot', but its intent seems different to
> > me.
> >
> > 2) How do we handle a really high cardinality dimension? E.g. if I have
> > sales as a fact and customers as a dimension, there will be millions of
> > customers. A store is a good candidate to keep in memory, but
> > customers are not. What's the recommended setting while creating the cube to
> > handle such a case?
> >
> > Regards,
> > Abhilash

--
Regards,

*Bin Mahone | 马洪宾*
Apache Kylin: http://kylin.io
Github: https://github.com/binmahone
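
For readers following the eviction question in this thread: below is a minimal sketch of what a size-bounded cache with least-recently-used eviction could look like. It is not Kylin code and does not reflect how SnapshotManager is implemented. The class name `BoundedSnapshotCache`, its methods, and the choice to bound by entry count (rather than by memory footprint) are all assumptions made for illustration.

```java
import java.util.LinkedHashMap;
import java.util.Map;

// Hypothetical illustration only, not part of Kylin: an LRU cache keyed by
// a snapshot's resource path, holding at most maxEntries snapshots.
final class BoundedSnapshotCache<V> {
    private final Map<String, V> cache;

    BoundedSnapshotCache(final int maxEntries) {
        // accessOrder = true keeps entries ordered from least to most
        // recently accessed, so the "eldest" entry is the LRU one.
        this.cache = new LinkedHashMap<String, V>(16, 0.75f, true) {
            @Override
            protected boolean removeEldestEntry(Map.Entry<String, V> eldest) {
                // Called after each insertion; returning true evicts the LRU entry.
                return size() > maxEntries;
            }
        };
    }

    // Returns the cached value (and marks it as recently used), or null.
    synchronized V get(String resourcePath) {
        return cache.get(resourcePath);
    }

    // Inserts or replaces a value; may evict the least recently used entry.
    synchronized void put(String resourcePath, V snapshot) {
        cache.put(resourcePath, snapshot);
    }

    // Membership check that does not change the access order.
    synchronized boolean contains(String resourcePath) {
        return cache.containsKey(resourcePath);
    }

    synchronized int size() {
        return cache.size();
    }
}
```

With a capacity of 2, putting "a" and "b", reading "a", and then putting "c" would evict "b", since it is the least recently used entry. A real fix would more likely bound by estimated memory size, because lookup tables vary widely in row count.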
