This is by design in Kylin, and I think we benefit a lot from it:

1) Hive performs the table joins, so Kylin can focus on cubing;
2) Filter conditions can easily be applied in this step;
3) This step converts the files to a uniform file format (sequence file so far), which simplifies Kylin's MR file parsing;
4) It makes it possible to use a Hive view as the fact table. (A view is an abstract table with no underlying files, and its data may come from multiple tables; this step shields Kylin from that limitation.)
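To make the points above concrete, here is a rough sketch of the kind of HiveQL such a flattening step could issue. It is only an illustration, not Kylin's actual generator: the function name, its parameters, and the table and column names in the usage note are all hypothetical.

```python
def build_flat_table_sql(flat_table, fact_table, joins, columns, where=None):
    """Build HiveQL that materializes a joined, filtered flat table.

    Hypothetical helper for illustration only; not Kylin's real code.
    `joins` is a list of (lookup_table, join_condition) pairs.
    """
    lines = [
        # Point 3: write the result in one uniform format (sequence file).
        f"CREATE TABLE {flat_table} STORED AS SEQUENCEFILE AS",
        "SELECT " + ", ".join(columns),
        # Point 4: the fact "table" may just as well be a Hive view.
        f"FROM {fact_table}",
    ]
    # Point 1: Hive does the joins, so later MR steps read flat rows.
    for lookup_table, condition in joins:
        lines.append(f"INNER JOIN {lookup_table} ON {condition}")
    # Point 2: an optional filter is applied in the same statement.
    if where:
        lines.append(f"WHERE {where}")
    return "\n".join(lines)
```

For example, calling it with a view `sales_view` as the fact table, one lookup table `dim_date`, and a filter on `dim_date.year` yields a single CREATE TABLE ... AS SELECT statement whose output the following MR steps can read without knowing anything about the original tables.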

2015-09-15 16:50 GMT+08:00 hongbin ma <[email protected]>:

> I guess it's a natural solution to provide flatten data to the following MR
> steps
>
> On Tue, Sep 15, 2015 at 4:48 PM, Sarnath <[email protected]> wrote:
>
> > Hmmm. Surprising. Any reason why ? Sthg to do with deserializer?
> > Also, if you could approve my other email regarding serving aggregations
> > through elastic search rest API it would be nice. Thanks
> > On Sep 15, 2015 1:15 PM, "hongbin ma" <[email protected]> wrote:
> >
> > > yes
> > >
> > > On Tue, Sep 15, 2015 at 2:56 PM, Sarnath <[email protected]> wrote:
> > >
> > > > Hi,
> > > > It seems from kylin source code that kylin always creates a new table
> > > > (effectively) duplicating original contents for cube creation. Is this
> > > > so?
> > > > Can some1 confirm?
> > > > Best, Sarnath
> > > >
> > >
> > > --
> > > Regards,
> > >
> > > *Bin Mahone | 马洪宾*
> > > Apache Kylin: http://kylin.io
> > > Github: https://github.com/binmahone
> > >
> >
>
> --
> Regards,
>
> *Bin Mahone | 马洪宾*
> Apache Kylin: http://kylin.io
> Github: https://github.com/binmahone
>
