Kylin encodes the dimension values with Dictionary (the default encoding) or another encoding method when composing the rowkey, so the extra rowkey overhead will be small in most cases.
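
To illustrate why dictionary encoding keeps the rowkey small, here is a minimal Python sketch. It is a toy model and not Kylin's actual dictionary implementation: the function names, the fixed-width ID layout, and the sample WeekDayID / WeekDayTxt rows are all assumptions made for illustration.

```python
# Illustrative only: a toy dictionary encoder, not Kylin's real code.

def build_dictionary(values):
    """Map each distinct value to a small integer ID (sorted for stable IDs)."""
    return {v: i for i, v in enumerate(sorted(set(values)))}

def id_width(dictionary):
    """Bytes needed to store the largest ID in the dictionary (at least 1)."""
    max_id = max(len(dictionary) - 1, 0)
    return max(1, (max_id.bit_length() + 7) // 8)

def encode_rowkey(row, dictionaries):
    """Concatenate the fixed-width ID of each dimension value in the row."""
    parts = []
    for value, d in zip(row, dictionaries):
        parts.append(d[value].to_bytes(id_width(d), "big"))
    return b"".join(parts)

# Hypothetical 1:1 pair of dimensions (ID column and text column).
rows = [(1, "Monday"), (2, "Tuesday"), (3, "Wednesday")]
dictionaries = [build_dictionary(col) for col in zip(*rows)]

# Size of the last row if stored as raw text versus as dictionary IDs.
raw_size = len(str(rows[2][0]).encode()) + len(rows[2][1].encode())
encoded = encode_rowkey(rows[2], dictionaries)
print(raw_size, len(encoded))
```

In this toy model the text column costs the same number of rowkey bytes as the ID column, because only the dictionary ID is stored in the key. That is the intuition behind the small overhead of adding the text column as a normal (joint) dimension.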

2016-12-02 17:59 GMT+08:00 Alberto Ramón <[email protected]>:

> yes, I will asume this overhead in rowKey
>
> 2016-12-02 9:58 GMT+01:00 Billy(Yiming) Liu <[email protected]>:
>
>> Using Joint Dimension for your 1:1 relation is the right design.
>>
>> 2016-12-02 0:21 GMT+08:00 Alberto Ramón <[email protected]>:
>>
>>> Nice Liu
>>>
>>> We have some cases like
>>> DayWeekTXT , DayWeekID
>>> MonthTXT, MonthID
>>>
>>> small proposal:
>>> Can would be interesting create Derived with 1:1 relation, with support
>>> for filters and Group by
>>>
>>> 2016-12-01 11:55 GMT+01:00 Billy(Yiming) Liu <[email protected]>:
>>>
>>>> The cost of joint dimension compared with extended column is you have
>>>> more columns in the HBase rowkey. It may harm the query performance. But
>>>> most time, joint dimension is still recommended, since the normal dimension
>>>> column supports much more functions than extended column, such as count(*).
>>>>
>>>> 2016-12-01 17:07 GMT+08:00 Alberto Ramón <[email protected]>:
>>>>
>>>>> Hello
>>>>> I was preparing a email with related doubts:
>>>>>
>>>>> Some times we have derived dimensions with relation 1:1, examples:
>>>>> WeekDayID & WeekDayTxt
>>>>> MonthID & WeekTxt
>>>>>
>>>>> SOL1: Derived. ID as Host and Txt Extended
>>>>> PB: You can't filter / Group by Txt
>>>>>
>>>>> SOL2: Joint. Define tuples of ID & TXT
>>>>> Some PB/limitation? (I need test this option)
>>>>>
>>>>> 2016-12-01 0:35 GMT+01:00 Billy(Yiming) Liu <[email protected]>:
>>>>>
>>>>>> Thanks, Alberto. The explanation is accurate. EXTENDED_COLUMN is only
>>>>>> used for representation, but not filtering or grouping which is done by
>>>>>> HOST_COLUMN. So EXTENDED_COLUMN is not a dimension, it works like a
>>>>>> key/value map against the HOST_COLUMN.
>>>>>>
>>>>>> If the value in EXTENDED_COLUMN is not long, you could just define
>>>>>> two dimensions with joint dimension setting, it has almost the same
>>>>>> performance impact with EXTENDED_COLUMN which reduces one dimension, but
>>>>>> better understanding.
>>>>>>
>>>>>> 2016-11-30 19:00 GMT+08:00 Alberto Ramón <[email protected]>:
>>>>>>
>>>>>>> This will help you
>>>>>>> http://kylin.apache.org/docs/howto/howto_optimize_cubes.html
>>>>>>>
>>>>>>> The idea is always, How I can reduce the number of Dimension ?
>>>>>>> If you reduce Dim, the time / resources to build the cube and final
>>>>>>> size of
>>>>>>> it decrease --> Its good
>>>>>>>
>>>>>>> An example can be DIM_Persons: Id_Person , Name, Surname, Address,
>>>>>>> .....
>>>>>>> Id_Person can be HostColumn
>>>>>>> and other columns can be calculated from ID --> are Extended
>>>>>>> Column
>>>>>>>
>>>>>>>
>>>>>>>
>>>>>>>
>>>>>>> 2016-11-30 11:35 GMT+01:00 仇同心 <[email protected]>:
>>>>>>>
>>>>>>> > Hi ,all
>>>>>>> > I don’t understand the usage scenarios of
>>>>>>> EXTENDED_COLUMN,although I saw
>>>>>>> > this article “https://issues.apache.org/jira/browse/KYLIN-1313”.
>>>>>>> > What,s the means about parameters of “Host Column” and “Extended
>>>>>>> Column”?
>>>>>>> > Why use this expression,and what aspects of optimization that this
>>>>>>> > expression solved?
>>>>>>> > Can be combined with a SQL statement to explain?
>>>>>>> >
>>>>>>> >
>>>>>>> > Thanks~
>>>>>>> >
>>>>>>>
>>>>>>
>>>>>>
>>>>>>
>>>>>> --
>>>>>> With Warm regards
>>>>>>
>>>>>> Yiming Liu (刘一鸣)
>>>>>>
>>>>>
>>>>>
>>>>
>>>>
>>>> --
>>>> With Warm regards
>>>>
>>>> Yiming Liu (刘一鸣)
>>>>
>>>
>>>
>>
>>
>> --
>> With Warm regards
>>
>> Yiming Liu (刘一鸣)
>>
>
>

--
Best regards,

Shaofeng Shi 史少锋
