This is by design in Kylin, and I think we have benefited a lot from it:

1) Hive joins the tables, so Kylin can focus on cubing;
2) Filter conditions can easily be applied in this step;
3) This step converts the files to a uniform file format (sequence
files so far), which eases Kylin's MR file parsing;
4) It makes it possible to use a Hive view as the fact table. (A view is
an abstract table with no underlying files, and its data may come from
multiple tables; this step shields Kylin from that limitation.)
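
To make the four points concrete, here is a rough sketch of the kind of
HiveQL such a flattening step could run. It is not the statement Kylin
actually generates: the helper name build_flat_table_hql and every table
and column name in the example are made up for illustration. It only shows
a join, a filter, and a sequence-file target combined in a single Hive
statement.

```python
def build_flat_table_hql(flat_table, fact_table, columns, joins, where=None):
    """Build a HiveQL CTAS statement that materializes a flat table.

    Hypothetical helper for illustration only; this is not Kylin's code.
    joins is a list of (join_type, table, on_condition) tuples.
    """
    select_list = ",\n  ".join(columns)
    parts = [
        f"CREATE TABLE {flat_table}",
        "STORED AS SEQUENCEFILE AS",   # point 3: one uniform file format
        f"SELECT\n  {select_list}",
        f"FROM {fact_table}",          # point 4: may be a table or a view
    ]
    for join_type, table, condition in joins:
        # point 1: Hive performs the joins
        parts.append(f"{join_type} JOIN {table} ON {condition}")
    if where:
        # point 2: filter conditions are applied in the same step
        parts.append(f"WHERE {where}")
    return "\n".join(parts) + ";"


if __name__ == "__main__":
    # All names below are assumptions, not taken from a real Kylin cube.
    print(build_flat_table_hql(
        flat_table="kylin_intermediate_sales_cube",
        fact_table="sales_view",
        columns=["sales_view.price", "dim_date.week_start"],
        joins=[("INNER", "dim_date", "sales_view.date_id = dim_date.id")],
        where="sales_view.price > 0",
    ))
```

The later MR steps then only have to read one table in one format,
whatever the original sources looked like.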


2015-09-15 16:50 GMT+08:00 hongbin ma <[email protected]>:

> I guess it's a natural solution to provide flattened data to the
> following MR steps
>
> On Tue, Sep 15, 2015 at 4:48 PM, Sarnath <[email protected]> wrote:
>
> > Hmmm. Surprising. Any reason why? Something to do with the deserializer?
> > Also, if you could approve my other email regarding serving aggregations
> > through the Elasticsearch REST API, it would be nice. Thanks
> > On Sep 15, 2015 1:15 PM, "hongbin ma" <[email protected]> wrote:
> >
> > > yes
> > >
> > > On Tue, Sep 15, 2015 at 2:56 PM, Sarnath <[email protected]> wrote:
> > >
> > > > Hi,
> > > > It seems from the Kylin source code that Kylin always creates a new
> > > > table, (effectively) duplicating the original contents for cube
> > > > creation. Is this so?
> > > > Can someone confirm?
> > > > Best, Sarnath
> > > >
> > >
> > >
> > >
> > > --
> > > Regards,
> > >
> > > *Bin Mahone | 马洪宾*
> > > Apache Kylin: http://kylin.io
> > > Github: https://github.com/binmahone
> > >
> >
>
>
>
> --
> Regards,
>
> *Bin Mahone | 马洪宾*
> Apache Kylin: http://kylin.io
> Github: https://github.com/binmahone
>
