Re: New document: "How to optimize cube build"

2017-01-25 Thread Alberto Ramón
Be careful about partitioning by "FLIGHTDATE".

From https://github.com/albertoRamon/Kylin/tree/master/KylinPerformance

*"Option 1: Use id_date as the partition column on the Hive table. This has a big
problem: the Hive metastore is meant for a few hundred partitions, not
thousands (HIVE-9452 describes an idea to solve this, but it isn't in progress)"*

Hive 2.0 will include a preview (for testing only) that addresses this.
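
To get a feel for the scale involved, here is a minimal sketch that counts how many Hive partitions a daily date column would create compared with a monthly one. The date range and the function names are assumptions chosen for illustration; they do not come from the linked notes or from Kylin.

```python
from datetime import date


def daily_partitions(start: date, end: date) -> int:
    """Number of partitions if every calendar day gets its own partition."""
    return (end - start).days + 1


def monthly_partitions(start: date, end: date) -> int:
    """Number of partitions if every calendar month gets its own partition."""
    return (end.year - start.year) * 12 + (end.month - start.month) + 1


if __name__ == "__main__":
    # Hypothetical range: ten years of flight data.
    start, end = date(2007, 1, 1), date(2016, 12, 31)
    print("daily:", daily_partitions(start, end))
    print("monthly:", monthly_partitions(start, end))
```

For this assumed ten-year range, daily partitioning gives 3653 partitions against 120 for monthly partitioning, which is the difference between "thousands" and "a few hundred" mentioned above.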

2017-01-25 9:46 GMT+01:00 ShaoFeng Shi:

> Hello,
>
> A new document has been added on cube build practices. Any suggestions or
> comments are welcome. We can update the doc later based on feedback.
>
> Here is the link:
> https://kylin.apache.org/docs16/howto/howto_optimize_build.html
>
> --
> Best regards,
>
> Shaofeng Shi 史少锋
>
>


New document: "How to optimize cube build"

2017-01-25 Thread ShaoFeng Shi
Hello,

A new document has been added on cube build practices. Any suggestions or
comments are welcome. We can update the doc later based on feedback.

Here is the link:
https://kylin.apache.org/docs16/howto/howto_optimize_build.html

-- 
Best regards,

Shaofeng Shi 史少锋


Re: RE: Re: RE: RE: more than 1 append dict for globalDict

2017-01-25 Thread ShaoFeng Shi
Hi Zhangda, yes your understanding is correct; good to know this, thanks!

2017-01-25 13:57 GMT+08:00 市场中心-ZHANGDA32698:

> Yup, the delete api works. Thanks!
>
>
>
> *From:* 星辰#&勿语 [mailto:446463...@qq.com]
> *Sent:* 2017-01-25 11:04
> *To:* 市场中心-ZHANGDA32698
> *Subject:* Re: RE: RE: more than 1 append dict for globalDict
>
>
>
> You can delete the segment with the REST API in the Kylin web UI.
>
> -- Original Message --
>
> *From:* "市场中心-ZHANGDA32698"
>
> *Sent:* 2017-01-25 (Wednesday) 10:54
>
> *To:* "user@kylin.apache.org"; "dev" <d...@kylin.apache.org>; "Yerui Sun"
>
> *Subject:* RE: RE: more than 1 append dict for globalDict
>
>
>
> Hi ShaoFeng,
>
> I see, thanks for the solution. Actually, I noticed that the extra
> dictionary appeared right after the last segment was successfully built,
> which means all segments in the cube.json point to the older
> dictionary except the last one. So, similar to your suggestion, I wonder if
> I can remove the last segment together with the extra dictionary, then
> rebuild the last segment. Is it OK to do so?
> Another thing: how do I remove an existing segment? I found
> https://issues.apache.org/jira/browse/KYLIN-1540 but it doesn’t seem to work in
> v1.6.0 yet. Can I just edit the cube.json file and remove the
> part for the segment I want to drop?
> Thanks!
>
> From: ShaoFeng Shi [mailto:shaofeng...@apache.org]
> Sent: 2017-01-24 20:27
> To: user; dev; Yerui Sun
> Subject: Re: RE: more than 1 append dict for globalDict
>
> Yerui is on vacation.
>
> If this is urgent, you can try the following (also confirmed with Yerui):
>
> 1) Back up your metadata to a local folder;
> 2) Copy the two global dictionaries to some other folder as a backup;
> 3) Of the two dictionaries, pick the bigger one (it should include all
> values in the smaller one), then in the cube JSON replace all
> references to the smaller dict with the bigger one. You can edit the cube JSON
> from Kylin's web GUI, in "Admin" -> "Edit JSON";
> 4) Remove the smaller dict from the Kylin metadata using Kylin's command,
> for example:
> "./bin/metastore.sh remove /dict/APPLYDATA_DSDCADMIN.FLY_ZHUANGHUALV_PROC1/LOGINKEY/ae5a65ca-022f-4c81-89f2-7cacb2789888.dict"
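
As an illustration of step 3 above, here is a minimal sketch that rewrites every reference to the smaller dictionary in a cube JSON document so that it points to the bigger one. It assumes the cube JSON refers to dictionaries by their full resource path as plain string values; the sample layout (`segments`, `dictionaries`), the cube name, and both paths are hypothetical placeholders, not taken from Kylin or from this thread. Back up the metadata first, as in steps 1 and 2.

```python
import json


def replace_dict_refs(node, old_path, new_path):
    """Return a copy of a parsed JSON value in which every string equal to
    old_path is replaced by new_path; everything else is copied unchanged."""
    if isinstance(node, dict):
        return {key: replace_dict_refs(value, old_path, new_path)
                for key, value in node.items()}
    if isinstance(node, list):
        return [replace_dict_refs(item, old_path, new_path) for item in node]
    if isinstance(node, str) and node == old_path:
        return new_path
    return node


if __name__ == "__main__":
    # Hypothetical paths and cube layout, for illustration only.
    smaller = "/dict/EXAMPLE_DB.EXAMPLE_TABLE/EXAMPLE_COL/smaller-dict-id.dict"
    bigger = "/dict/EXAMPLE_DB.EXAMPLE_TABLE/EXAMPLE_COL/bigger-dict-id.dict"
    cube_json = json.dumps({
        "name": "example_cube",
        "segments": [
            {"name": "seg_1", "dictionaries": {"EXAMPLE_COL": smaller}},
            {"name": "seg_2", "dictionaries": {"EXAMPLE_COL": bigger}},
        ],
    })
    fixed = replace_dict_refs(json.loads(cube_json), smaller, bigger)
    print(json.dumps(fixed, indent=2))
```

The function only touches values that match the old path exactly, so unrelated fields in the JSON are carried over as they were. The result would still need to be pasted back through "Admin" -> "Edit JSON" by hand.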
>
>
>
> 2017-01-24 20:14 GMT+08:00 ShaoFeng Shi <shaofeng...@apache.org>:
> @Yerui, do you have any suggestions on this?
>
> 2017-01-24 18:19 GMT+08:00 市场中心-ZHANGDA32698:
> Hi ShaoFeng,
>
> Thanks for your suggestion. However, since ours is a daily build, I guess
> there is no concurrency issue.
> The problem stops us from building subsequent segments. I tried removing
> one of the dicts from HBase and rebuilding, but it seems to be
> referenced somewhere else, and I got a ‘java.lang.IllegalStateException:
> No resource found at -- /dict/APPLYDATA_DSDCADMIN.FLY_ZHUANGHUALV_PROC1/LOGINKEY/ae5a65ca-022f-4c81-89f2-7cacb2789888.dict’
> error at step 3, ‘Extract Fact Table Distinct Columns’. Of course I can
> add the key back and let it continue, but unsurprisingly it then stops at
> step 4, ‘Build Dimension Dictionary’, with the same error I encountered before:
> ‘GlobalDict  should have 0 or 1 append dict but 2’.
> How can I resolve this and continue building my cube? Any suggestion is
> appreciated, thanks!
>
>
> From: ShaoFeng Shi [mailto:shaofengshi@apache.org]
> Sent: 2017-01-24 17:01
> To: user
> Subject: Re: more than 1 append dict for globalDict
>
> Hi zhangda,
>
> Do you have multiple segments (with the count distinct measure)
> building concurrently? In 1.6.0 and earlier there is a concurrency bug, I
> think; Yerui Sun fixed it in https://issues.apache.org/jira/browse/KYLIN-2192
>
> So please check whether this is the problem first. If it is, you need to add
> some control (a lock) to avoid concurrent builds for this cube.
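
One way to add such a control on the scheduling side is a simple lock file, sketched below. This is only an illustration under several assumptions: all builds for the cube are triggered from one machine, the code wrapped by the lock blocks until the build job has finished, and the lock path and the `BuildAlreadyRunning` and `cube_build_lock` names are hypothetical. It is not part of Kylin and is unrelated to the fix in KYLIN-2192.

```python
import os
from contextlib import contextmanager


class BuildAlreadyRunning(RuntimeError):
    """Raised when another build for the same cube holds the lock."""


@contextmanager
def cube_build_lock(lock_path):
    """Hold an exclusive lock file for the duration of the with-block."""
    try:
        # O_CREAT | O_EXCL fails if the file already exists, so only one
        # caller can create the lock file.
        fd = os.open(lock_path, os.O_CREAT | os.O_EXCL | os.O_WRONLY)
    except FileExistsError:
        raise BuildAlreadyRunning(lock_path) from None
    try:
        # Record the owner's process id to help with manual cleanup.
        with os.fdopen(fd, "w") as lock_file:
            lock_file.write(str(os.getpid()))
        yield
    finally:
        os.remove(lock_path)
```

A scheduler script would wrap its build call in `with cube_build_lock("/tmp/example_cube.build.lock"):` and skip or retry when `BuildAlreadyRunning` is raised. A crashed process leaves a stale lock file behind, which has to be removed by hand in this sketch.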
>
> 2017-01-24 16:33 GMT+08:00 市场中心-ZHANGDA32698:
> Hi there,
>
> I have a cube that computes some UV statistics. Since it requires a global
> count distinct operation, in the advanced dictionary settings I set the UV key
> column’s builder class to ‘GlobalDictionaryBuilder’.
> The daily build ran without any problem until yesterday, when there was a
> ‘GlobalDict  should have 0 or 1 append dict but 2’ exception. I
> checked the ‘kylin_metadata’ table in HBase and saw there were indeed
> 2 dicts:
> '/dict/APPLYDATA_DSDCADMIN.FLY_ZHUANGHUALV_PROC1/LOGINKEY/ae5a65ca-022f-4c81-89f2-7cacb2789888.dict'
> '/dict/APPLYDATA_DSDCADMIN.FLY_ZHUANGHUALV_PROC1/LOGINKEY/cedce514-657d-4032-a591-7f01d984df65.dict'
> FYI, APPLYDATA_DSDCADMIN.FLY_ZHUANGHUALV_PROC1 is the Hive table name and
> LOGINKEY is the count distinct key.
> I’m not sure what went wrong, and I suppose the global dict for a key
> should be unique. Has anyone encountered a similar error before and can
> share some ideas? Thanks!
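
The check described above can be sketched as a small script that groups dictionary resource paths by table and column and reports any column that has more than one dictionary. It assumes paths of the form `/dict/<table>/<column>/<id>.dict`, as in the two paths quoted above, and it takes the list of paths as input; how that list is read from the ‘kylin_metadata’ table is left out. The function name is hypothetical. Having more than one dictionary is only the error case for a column built with ‘GlobalDictionaryBuilder’.

```python
from collections import defaultdict


def find_duplicate_dicts(resource_paths):
    """Group '/dict/<table>/<column>/<id>.dict' paths by (table, column)
    and return only the groups that hold more than one dictionary."""
    groups = defaultdict(list)
    for path in resource_paths:
        parts = path.strip("/").split("/")
        # Ignore anything that does not look like a dictionary resource.
        if len(parts) == 4 and parts[0] == "dict" and parts[3].endswith(".dict"):
            groups[(parts[1], parts[2])].append(path)
    return {key: sorted(paths) for key, paths in groups.items() if len(paths) > 1}


if __name__ == "__main__":
    # The two paths reported in this thread.
    paths = [
        "/dict/APPLYDATA_DSDCADMIN.FLY_ZHUANGHUALV_PROC1/LOGINKEY/"
        "ae5a65ca-022f-4c81-89f2-7cacb2789888.dict",
        "/dict/APPLYDATA_DSDCADMIN.FLY_ZHUANGHUALV_PROC1/LOGINKEY/"
        "cedce514-657d-4032-a591-7f01d984df65.dict",
    ]
    for (table, column), dicts in find_duplicate_dicts(paths).items():
        print(table, column, len(dicts))
```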
>
>
>
> --
> Best regards,
>
> Shaofeng Shi 史少锋
>