Thanks for replying, Hongbin,

     For 1), we are trying to add some sort of eviction-based cache instead
of a map. However, we are still trying to figure out what to do for 3).
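
    To illustrate what an eviction-based cache could look like, here is a
minimal sketch, assuming a simple LRU policy built on java.util.LinkedHashMap.
The class name SnapshotLruCache and the entry limit are hypothetical and not
part of Kylin; a real change would have to fit into SnapshotManager.

```java
import java.util.LinkedHashMap;
import java.util.Map;

/**
 * Hypothetical LRU cache sketch (not Kylin code): keeps at most
 * maxEntries values and evicts the least recently accessed one.
 */
class SnapshotLruCache<K, V> extends LinkedHashMap<K, V> {
    private final int maxEntries;

    SnapshotLruCache(int maxEntries) {
        // accessOrder = true: iteration order follows access recency,
        // so the eldest entry is the least recently used one.
        super(16, 0.75f, true);
        this.maxEntries = maxEntries;
    }

    @Override
    protected boolean removeEldestEntry(Map.Entry<K, V> eldest) {
        // Called by LinkedHashMap after each insertion; returning true
        // drops the least recently used entry.
        return size() > maxEntries;
    }
}
```

LinkedHashMap is not thread-safe, so if the cache is shared across request
threads it would need to be wrapped, e.g. with Collections.synchronizedMap.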

    What is the general advice? The case here is that I have order details
as a fact, and order and customer as dimensions. Each of these will run into
many millions. Also, the foreign key is not a long/bigint; it's a string that
is a combination of our custom columns. As we understand it, making it a
dictionary will not work. Please suggest what approach we should take.

Regards,
Abhilash

On Tue, Sep 1, 2015 at 4:37 PM, hongbin ma <[email protected]> wrote:

>     for 1) ..  seems like only the resource path / table desc etc. is
> kept in memory, while a new lookupstringtable is created per query/request
> which holds onto data for the lifetime of the request.  So once the request
> is done, it should be garbage collectable?
>
> /table only holds the Hive table's schema. The lookup table content is
> cached in SnapshotManager, and so far it is never evicted. So if you have
> a lot of large lookup tables, this will be a problem.
>
>
> 3) Also, for the derived filter translator, is there a way to modify
> 'IN_THRESHOLD' via a config file?
>
> Are you facing performance issues with a lot of IN clauses? If so, please
> take a look at https://issues.apache.org/jira/browse/KYLIN-740 (the patch
> will be merged into the next release).
>
> On Mon, Aug 31, 2015 at 9:54 PM, Abhilash L L <[email protected]>
> wrote:
>
> > Sorry for the confusion,
> >
> >     for 1) ..  seems like only the resource path / table desc etc. is
> > kept in memory, while a new lookupstringtable is created per query/request
> > which holds onto data for the lifetime of the request.  So once the
> > request is done, it should be garbage collectable?
> >
> >
> > 3) Also, for the derived filter translator, is there a way to modify
> > 'IN_THRESHOLD' via a config file?
> >
> >
> >
> >
> >
> > Regards,
> > Abhilash
> >
> > On Mon, Aug 31, 2015 at 7:05 PM, Abhilash L L <[email protected]>
> > wrote:
> >
> > > Hello,
> > >
> > >     We started noticing that the Kylin Tomcat server is taking a lot
> > > of RAM. It even hit a limit of 10GB.
> > >
> > >     After spending some time going over the code, it seems like the
> > > cube enumerator is not storing anything in memory. But the lookup table
> > > enumerator seems to be loading all records and storing them in memory.
> > >
> > >     1) What happens when there are a lot of projects defined and we end
> > > up with tons of lookup tables across them? Do they get swapped out
> > > automatically? I am not able to track where eviction is happening. The
> > > snapshot manager has a 'removeSnapshot', but its intent seems different
> > > to me.
> > >
> > >     2) How do we handle really high-cardinality dimensions? E.g., if I
> > > have sales as a fact and customers as a dimension, there will be millions
> > > of customers. A store is a good candidate to keep in memory, but customers
> > > are not. What's the recommended setting while creating the cube to handle
> > > such a case?
> > >
> > > Regards,
> > > Abhilash
> > >
> >
>
>
>
> --
> Regards,
>
> *Bin Mahone | 马洪宾*
> Apache Kylin: http://kylin.io
> Github: https://github.com/binmahone
>
