Kaisen introduced extended column in his blog: https://blog.bcmeng.com/post/kylin-dimension.html#extended-column
ShaoFeng Shi <[email protected]> 于2018年11月7日周三 上午9:15写道: > One correction: from the code I see, when the extended column is not > determined, it will use the first appeared value; In your case, the first > value might be all blank, so it appeared like all empty. > The related code is: > https://github.com/apache/kylin/blob/master/core-metadata/src/main/java/org/apache/kylin/measure/extendedcolumn/ExtendedColumnMeasureType.java#L281 > > This measure has such limitations, please not use it if you are not sure > about it. I'm thinking whether we need to disable it. > > ShaoFeng Shi <[email protected]> 于2018年11月7日周三 上午9:04写道: > >> Hi Leo, >> >> "Extended column" is a special measure, which is to store a dimension >> value as a measure. It doesn't support aggregation, so if there are >> multiple values in this "measure", it will be empty. >> >> Let me give an example. >> >> Say you have a table, which has dimensions like "*user_id*", "*user_name*", >> "*age*"; Here the "*user_id*" is unique, which will be used to do >> filtering/searching; The "user_name" will not be used for filtering, but it >> is nice to show the value in the result. So, in this case, we can declare >> "user_id" and "age" as dimensions, but define "user_name" as an extended >> column. Then the cube will have 2 dimensions instead of 3, its size will be >> smaller. >> >> When you run a query like "select user_id, user_name, count(*) from >> fact_table group by user_id, user_name", it will be expected, because the >> "user_name" value is decided (明确的) under the context of a "user_id"; >> >> But if you run a query without "user_id", for example, "select age, >> user_name, count(*) from fact_table group by age, user_name", this >> column will be empty, because it has no "user_id" in the row. The extended >> column can not work in the query independently >> >> So, the conclusion is, the extended column(s) need to appear together >> with its "host" column. If its "host" column is absent, the value will be >> empty. >> >> We will add this to Kylin document. >> >> >> >> Francis Liang <[email protected]> 于2018年11月5日周一 下午5:21写道: >> >>> Actually you need at least one aggregation, such as sum(lost_day_cnt), >>> or you may try select a, b from table group by a, b, which should not be >>> the normal case however. Hope this makes sense. Best, Feng. >>> >>> >>> >>> *发件人: *宋海涛 <[email protected]> >>> *发送时间: *2018年11月5日 17:16 >>> *收件人: *[email protected] >>> *主题: *答复: Extend Column Is Empty >>> >>> >>> >>> select zone, data_from,ent_name,org_no, org_name, goods_id,goods_spec, >>> big_mc,mid_mc,sma_mc,mini_mc,lost_day_cnt, >>> >>> lost_amt,avg_paid_in_amt,dt >>> >>> from gjst.st_goods_lost_monitor_detail_day where dt='20181102' and >>> >>> zone='西南平台' and data_from='haitang' and org_no='16208' >>> >>> >>> >>> *发件人:* Francis Liang [mailto:[email protected]] >>> *发送时间:* 2018年11月5日 17:14 >>> *收件人:* [email protected] >>> *主题:* 答复: Extend Column Is Empty >>> >>> >>> >>> Could you share your sql statement? Thanks. Best, Feng. >>> >>> >>> >>> *发件人**:* 宋海涛 <[email protected]> >>> *发送时间**:* Monday, November 5, 2018 4:58:53 PM >>> *收件人**:* [email protected] >>> *主题**:* Extend Column Is Empty >>> >>> >>> >>> Hi guys, >>> >>> >>> >>> When I run a SQL which contains extended columns, I got empty >>> values for extended columns . Anyone knows why? >>> >>> For example , ENT_NAME and ORG_NAME are extended column based on >>> DATA_FROM and ORG_NO: >>> >>> [image: cid:[email protected]] >>> >>> >>> >>> Thanks a lot. >>> >>> >>> >>> Leo >>> >>> >>> >> >> >> -- >> Best regards, >> >> Shaofeng Shi 史少锋 >> >> > > -- > Best regards, > > Shaofeng Shi 史少锋 > > -- Best regards, Shaofeng Shi 史少锋
