Re: Kylin Performance

Alberto Ramón Mon, 26 Dec 2016 09:29:07 -0800

Hello

Since v0, I have corrected the English syntax.



After tuning the cube:
  -  Use a compressed Hive input table
  -  Define Hierarchy, Joint, Dim
  -  . . .

Now: 57% of the time goes to the first steps (flat table, steps 1, 2, 3) and
43% to building the cube.

I saw that the flat table uses SEQUENCEFILE, so I tested using:
   ORC,
   ORC + Snappy
   ORC + Snappy + Vectorization

without good results. Any more ideas?


I think that 'Redistribute Flat Hive Table' is a simple count, yet it uses
*30% of the total time*.
  Is this the normal case?
  Could we approximate this count with the count of the fact table (which will
be true 99% of the time) and run it in parallel (//) with step 1? Is it
necessary to be precise?
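
One way to reason about the precision question is the minimal sketch below. It assumes the count from 'Redistribute Flat Hive Table' is used only to decide how many output files the redistributed flat table gets. The function name, the 1,000,000 rows-per-file target, and the row counts are hypothetical values chosen for illustration, not taken from Kylin's code.

```python
def estimate_file_count(row_count: int, rows_per_file: int = 1_000_000) -> int:
    """Return how many files are needed for `row_count` rows (at least 1)."""
    # Assumption: the count is only used to size the redistribution output.
    # Integer ceiling division avoids floating-point rounding surprises.
    return max(1, -(-row_count // rows_per_file))


# Hypothetical numbers: an exact flat-table count versus a fact-table
# count used as an approximation that is 1% too high.
exact_rows = 250_000_000
approx_rows = exact_rows + exact_rows // 100

exact_files = estimate_file_count(exact_rows)
approx_files = estimate_file_count(approx_rows)

print(exact_files, approx_files)
```

Under that assumption, a 1% error in the count shifts the number of files by roughly 1%, which suggests an approximate count might be good enough. Whether Kylin uses the count for anything else would still need to be checked.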

2016-12-22 14:00 GMT+01:00 Li Yang <[email protected]>:

> Very good work!
>
> Btw, we are also doing benchmarks on the SSB and TPC-H data sets, based on
> the work below. Will share more info soon.
>
> - http://www.cs.umb.edu/~poneil/StarSchemaB.PDF
> - https://github.com/hortonworks/hive-testbench
>
>
> Cheers
> Yang
>
> On Wed, Dec 21, 2016 at 8:45 PM, Alberto Ramón <[email protected]>
> wrote:
>
> > When Kylin 2149 <https://issues.apache.org/jira/browse/KYLIN-2149> is
> > solved, performance will *improve even more*, because:
> >
> > You know that 2016-05-05 belongs to May, week 18, and is a Thursday, but
> > Kylin doesn't know it.
> > It will try to calculate the combinations of 2016-05-05 with January,
> > February, March, ...; Monday, Tuesday, ...; W1, W2, ...; Q2, Q3, Q4 ==> a
> > lot of combinations are wasted.
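
The waste described above can be sketched with a little arithmetic. The cardinalities used here (12 months, 7 weekdays, 53 weeks, 4 quarters) are illustrative assumptions, and the code only counts value combinations for a single date. It does not reproduce how Kylin enumerates anything internally.

```python
import datetime
from math import prod

# Assumed cardinalities of the attributes derived from a date column.
cardinalities = {"month": 12, "weekday": 7, "week": 53, "quarter": 4}

# Without knowing the dependency, every value of every attribute could in
# principle be paired with a given date.
possible = prod(cardinalities.values())

# With the dependency known, one date maps to exactly one value of each.
day = datetime.date(2016, 5, 5)
iso_year, iso_week, iso_weekday = day.isocalendar()
derived = {
    "month": day.month,
    "weekday": iso_weekday,  # ISO numbering: 1 = Monday ... 7 = Sunday
    "week": iso_week,
    "quarter": (day.month - 1) // 3 + 1,
}

print(possible)  # candidate combinations for one date
print(derived)   # the single combination that can actually occur
```

Only one of the candidate combinations per date can ever contain data, so the rest is work that knowledge of the date hierarchy would let the engine skip.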
> >
> > 2016-12-21 12:57 GMT+01:00 Luke_Selina <[email protected]>:
> >
> > > Great, and I agree! But, like Alberto, I still have a question: why can
> > > one dim use only one rule (mandatory, joint, hierarchy) in an AGG?
> > >
> > > --
> > > View this message in context:
> > > http://apache-kylin.74782.x6.nabble.com/Kylin-Performance-tp6713p6728.html
> > > Sent from the Apache Kylin mailing list archive at Nabble.com.
> > >
> >
>
