Awesome, Shaofeng. Could you please add these to our FAQ page? Thanks.
Best Regards!
---------------------
Luke Han

2015-03-17 2:14 GMT-07:00 Abhishek Sinha <[email protected]>:

> Thanks. Good one :)
>
> On Tue, Mar 17, 2015 at 11:52 AM, hongbin ma <[email protected]> wrote:
>
> > It is quite a neat explanation of RowKey :)
> >
> > On Mon, Mar 16, 2015 at 11:15 PM, Shi, Shaofeng <[email protected]> wrote:
> >
> > > Piece of my knowledge on Kylin:
> > >
> > > On 3/17/15, 1:38 PM, "Abhishek Sinha" <[email protected]> wrote:
> > >
> > > >Hi,
> > > >
> > > >Can anyone explain the two steps in the cube build process?
> > > >
> > > >1. Why do we need to extract the distinct columns from Fact Table or
> > > >calculate the HIVE table cardinality?
> > >
> > > Kylin builds a dictionary for each column, so it needs to fetch the
> > > distinct values of each column; using dictionaries greatly reduces
> > > the storage size.
> > > The cardinality is used to optimize the row key sequence, and so to
> > > determine the roadmap of cube building, which helps 1) reduce the
> > > cube building time and 2) reduce the cube scan range, improving
> > > query performance.
> > >
> > > >2. What is the use of RowKey? How is it calculated? How does it help in
> > > >calculating HTable Region splits?
> > >
> > > RowKey is the key in Kylin's storage (HBase). It is composed of the
> > > dimensions' values (encoded in bytes). Assume your table has dimension
> > > columns A, B, C with cardinalities n1, n2, n3. In the base cuboid
> > > there will be up to n1*n2*n3 rows; each row's key is A+B+C (the
> > > concatenation of the encoded bytes). When a user sends a query like
> > > "select ... from fact where A=XX and B=YY and C=ZZ group by A, B, C",
> > > Kylin will use encode(XX) + encode(YY) + encode(ZZ) as the key to
> > > query HBase and get the pre-aggregated result.
> > >
> > > >Is there any documentation available on these? Or any research paper/book
> > > >referred during the project?
> > >
> > > Check the docs here, especially the "Design Cube in Kylin.pdf":
> > > https://github.com/KylinOLAP/Kylin/tree/master/docs
> >
>
> --
> Abhishek Sinha
> Mobile: +919035191078
> infoworks.io
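
For the FAQ entry, the dictionary and RowKey explanation quoted above could be illustrated with a toy sketch like the one below. It is not Kylin's actual code: the helper names (build_dictionary, id_width, encode, row_key), the sample rows, and the fixed-width big-endian ID encoding are all assumptions made for illustration, and a plain Python dict stands in for the HBase table.

```python
from collections import defaultdict


def build_dictionary(values):
    """Map each distinct value to a small integer ID (hypothetical helper)."""
    return {v: i for i, v in enumerate(sorted(set(values)))}


def id_width(cardinality):
    """Bytes needed to hold IDs 0..cardinality-1, at least one byte."""
    return max(1, ((cardinality - 1).bit_length() + 7) // 8)


def encode(dictionary, value):
    """Encode a value as its dictionary ID in fixed-width big-endian bytes."""
    return dictionary[value].to_bytes(id_width(len(dictionary)), "big")


def row_key(dictionaries, dims, row):
    """Concatenate the encoded dimension values in row key order."""
    return b"".join(encode(dictionaries[d], row[d]) for d in dims)


# Toy fact table with dimensions A, B, C and one measure (made-up data).
rows = [
    {"A": "US", "B": "web", "C": "2015", "amount": 10},
    {"A": "US", "B": "web", "C": "2015", "amount": 5},
    {"A": "CN", "B": "app", "C": "2014", "amount": 7},
]
dims = ["A", "B", "C"]

# Step 1: the distinct values of each column feed that column's dictionary.
dictionaries = {d: build_dictionary(r[d] for r in rows) for d in dims}

# Step 2: the base cuboid pre-aggregates the measure under each row key.
base_cuboid = defaultdict(int)
for r in rows:
    base_cuboid[row_key(dictionaries, dims, r)] += r["amount"]

# Query "where A=US and B=web and C=2015 group by A, B, C":
# encode the filter values and look the key up directly.
key = row_key(dictionaries, dims, {"A": "US", "B": "web", "C": "2015"})
result = base_cuboid.get(key)
print(key, result)
```

Only combinations that actually occur in the data produce a row, which is why n1*n2*n3 is an upper bound on the size of the base cuboid rather than an exact count.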
