Thanks, that answers my question well.

Best,
Ravion
On Sun, Sep 2, 2018, 7:30 AM <[email protected]> wrote:

> Hello Ravion,
>
> Indeed, Kylin generates a MOLAP cube from data source tables (Hive tables,
> or other systems such as Kafka queues or JDBC sources like MySQL and
> Oracle). In a Kylin project, data sources are defined in the "Data Sources"
> section. A "Data Model" then has to be created, which specifies the
> relationships between the source tables (the joins of a star or snowflake
> schema) as well as the columns of each table that will be used as
> dimensions and those that will be used as measures. After this, *the last
> metadata layer, the "Cube", is defined, which is closely related to the
> generation and storage of the MOLAP cube in HBase.* After the first build,
> the generated MOLAP cube is stored in HBase.
>
> *The size of the generated MOLAP cube therefore depends on the definition
> of the "Cube", where the level of pre-aggregation of the data stored in the
> MOLAP cube is determined by several concepts (e.g. Normal or Derived
> dimensions).* For example, I have two Kylin cubes built on a Data Model
> over a DW in Hive. The fact table of this DW is 1 GB (ORC format with
> Snappy compression). One of the generated Kylin cubes is 1 GB, that is,
> almost the same size as the source DW in Hive (1 GB in Hive plus the cube
> in HBase). However, the other generated Kylin cube, with a different cube
> definition over the same Data Model, is 10 GB. It is larger because I
> defined more dimensions as Normal type in the cube definition, in order to
> achieve better query times.
>
> I hope this clears up your doubts.
>
> Best Regards,
>
> *Roberto Tardío Olmos*
> *Head of Big Data Analytics*
> Avenida de Brasil, 17, Planta 16. 28020 Madrid
> Landline: 91.788.34.10
>
> http://bigdata.stratebi.com/
> http://www.stratebi.com
>
> *From:* ☼ R Nair [mailto:[email protected]]
> *Sent:* Saturday, September 1, 2018 19:50
> *To:* [email protected]
> *Subject:* Data Duplication
>
> Hi all,
>
> I am new to Kylin. So here is a fundamental question: When I create a
> cube, as it's MOLAP, I believe that irrespective of the already existing
> data in HBase, Kylin will create a copy of the data in a
> cube/multidimensional format (separate from the underlying Base data) to
> help slice/dice faster. Any idea of the size of the duplicate copy created?
> Thanks
>
> Best,
> Ravion
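
A rough way to see why the cube definition drives the size described in the reply above: in a cube with no pruning, Kylin can materialize one cuboid per combination of row-key dimensions, so the count is bounded by 2^N for N such dimensions. The sketch below only counts cuboids; it does not estimate gigabytes, it ignores aggregation groups and other pruning options, and the column names are hypothetical, not taken from the thread.

```python
def cuboid_upper_bound(rowkey_dims):
    """Upper bound on cuboids for an unpruned cube: one per subset of dimensions."""
    return 2 ** len(set(rowkey_dims))


# Hypothetical column names, for illustration only.
# Design A: lookup-table attributes are declared as Derived, so only the
# foreign keys take part in the row key.
design_a = ["date_key", "product_key", "store_key"]

# Design B: the same attributes are promoted to Normal dimensions and
# therefore also take part in the row key.
design_b = design_a + [
    "year", "month", "category", "brand", "city", "region", "country",
]

print(cuboid_upper_bound(design_a))  # 8
print(cuboid_upper_bound(design_b))  # 1024
```

Each extra Normal dimension doubles this bound, which is consistent with one cube definition staying close to the source size while another over the same Data Model grows much larger. Actual storage also depends on cardinalities, encodings and compression.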
