Re: [DISCUSS] Upgrade Kylin's dependency to Hadoop 3 / HBase 2

2020-03-02 Thread Billy Liu
+1. Let's move to Hadoop3

With Warm regards

Billy Liu


ShaoFeng Shi wrote on Thu, Feb 27, 2020, 10:07 PM:

> Hi Yang,
>
> The main difference between 2.6 and 3.0 is the new real-time OLAP feature.
> Hadoop 2 users can select either of them, depending on whether they need the
> real-time feature.
>
> After 3.0, the next major features would be the Flink cube engine (planned
> in v3.1) and the Parquet storage (early stage, maybe in v4.0).
>
> When the Parquet storage is released, the dependency on HBase can be
> dropped; we then expect the API issue to be easier than it is today, and we
> can re-evaluate the possibility of supporting Hadoop 2.
>
> So I think the impact on today's Hadoop 2 users is acceptable. Not to mention
> that they can still compile it manually.
>
>
> Best regards,
>
> Shaofeng Shi 史少锋
> Apache Kylin PMC
> Email: shaofeng...@apache.org
>
> Apache Kylin FAQ: https://kylin.apache.org/docs/gettingstarted/faq.html
> Join Kylin user mail group: user-subscr...@kylin.apache.org
> Join Kylin dev mail group: dev-subscr...@kylin.apache.org
>
>
>
>
Li Yang wrote on Thu, Feb 27, 2020, 7:37 AM:
>
>> The proposal means Kylin 3.0 will be the last major version that supports
>> Hadoop 2.
>>
>> What will be the recommended version for Hadoop 2 users after this? I feel the
>> latest stable version of 2.6 is better than 3.0.
>>
>> Anyway, I'm fine with moving the focus to Hadoop 3. That is the direction.
>> However, we should also think about what it means for Hadoop 2 users.
>> Questions like the ones below should also be answered.
>>
>> - What is the recommended version/branch for Hadoop 2? (Btw, 3.0 does not
>> sound right here.)
>> - How that version/branch will be maintained?
>>
>> +1 in general
>>
>> Regards
>> -Yang
>>
>>
>> On Wed, Feb 26, 2020 at 5:36 PM Zhou Kang  wrote:
>>
>> > +1
>> >
>> >
>> > > On Feb 26, 2020, at 3:48 PM, ShaoFeng Shi wrote:
>> > >
>> > > Hello, Kylin users and developers,
>> > >
>> > > As we know, Hadoop 3 and HBase 2 have been released for some time. Kylin
>> > > has supported Hadoop 3 since v2.5.0, released in Sep 2018. As the APIs of
>> > > HBase 1 and 2 are incompatible, we need to keep different branches for
>> > > them, and in each release we need to build separate packages and do a
>> > > round of testing for each of them. Furthermore, Cloudera's API differences
>> > > from the Apache release make the situation worse; we need to build 4
>> > > binary packages for each release. That has consumed much of our manual
>> > > effort and computing resources.
>> > >
>> > > Today, Hadoop 3 + HBase 2 has become mature and stable enough for
>> > > production use, and we see more and more users starting to use the new
>> > > versions. We think it is time for Kylin to fully upgrade to the new
>> > > versions, so that we can focus more on Kylin itself instead of environments.
>> > >
>> > >  Here is my proposal:
>> > > 1) From Kylin 3.1,  Hadoop/HBase version upgrades to 3.1/2.1 (or a
>> close
>> > version);
>> > > 2) Hadoop 2 and HBase 1 users can use Kylin 3.0 and previous releases;
>> > > 3) We will re-evaluate the need for building binary packages for the
>> > > Cloudera release (we may raise another discussion).
>> > >
>> > > Please let us know your comments. And please also understand that with
>> > > our limited resources we couldn't support multiple Hadoop versions...
>> > >
>> > > Thanks!
>> > >
>> > > Best regards,
>> > >
>> > > Shaofeng Shi 史少锋
>> > > Apache Kylin PMC
>> > > Email: shaofeng...@apache.org
>> > >
>> > > Apache Kylin FAQ:
>> https://kylin.apache.org/docs/gettingstarted/faq.html
>> > > Join Kylin user mail group: user-subscr...@kylin.apache.org
>> > > Join Kylin dev mail group: dev-subscr...@kylin.apache.org
>> > >
>> > >
>> >
>> >
>>
>


Re: Dimension data often changes, and the historical dimensions data need to be queried. What is the recommended solution?

2019-12-25 Thread Billy Liu
I think you have figured out the solution.
SCD1 keeps only the latest dimension value; in Kylin the derived dimension is
designed for that, which queries the dimension value from the latest
snapshot.
SCD2 keeps the historical dimension values; in Kylin the normal dimension is
designed for this, which builds the historical values directly into each segment.

With Warm regards

Billy Liu


bo.hao wrote on Wed, Dec 25, 2019, 2:19 PM:

> sorry, picture of the first mail may not be displayed... here is an
> explanation:
>
> before merging:
> All fields have results.
>
> after merging:
> Data of the derived dimensions is empty, and other fields still have
> results.
>
>
>
> -- Original --
> *From:* "bo.hao";
> *Date:* Wed, Dec 25, 2019 01:45 PM
> *To:* "user";
> *Subject:* Dimension data often changes, and the historical dimensions
> data need to be queried. What is the recommended solution?
>
> Business requirement:
> Data in some dimension tables often changes. And we need to be able to
> query the historical data, both fact table and dimension table.
>
> The problem I encountered:
> If the cube is not merged, everything is ok. But after merging, the query
> result is incorrect.
> This only happens on the derived dimensions, and the normal dimensions are
> ok.
> After consulting colleagues, I learned that when querying a derived
> dimension, the data comes from a snapshot.
> When merging, only the latest snapshot is retained, and the other snapshots
> are thrown away.
>
> So what is the recommended solution for this scenario?
>
>
> For example:
> Fact table : FACT_DTAL
> Dimension table : ORG_TREE (often changes) , BUS_TYP (not often changes)
> sql:
> select a.dte, a.org, a.bus_typ, b.org as org_id, b.org_nam, b.sup_org,
> b.org_nam_1, c.cod, c.nam
> from FACT_DTAL a
> left join ORG_TREE b
> on a.dte=b.dte and a.org=b.org
> left join BUS_TYP c
> on a.bus_typ=c.cod
> order by a.dte, a.org, a.bus_typ;
>
> before merging:
>
>
> after merging :
>
>
>


Re: [DISCUSS] Upgrade Hadoop-related dependencies’ to Hadoop3 for master branch

2019-10-01 Thread Billy Liu
+1. Kylin 3 aligns with Hadoop 3

With Warm regards

Billy Liu

Luke Han wrote on Tue, Oct 1, 2019, 1:24 PM:
>
> +1, we should move on to next-gen Hadoop
>
> Best Regards!
> -
>
> Luke Han
>
>
> On Mon, Sep 30, 2019 at 12:39 AM nichunen  wrote:
>>
>> Hi all,
>>
>>
>> As more users upgrade their Hadoop to Hadoop 3, to catch up with this trend 
>> I suggest that Kylin's master branch upgrade its Hadoop-related dependencies 
>> to Hadoop 3.
>>
>>
>> So Kylin 3.0-GA will be based on Hadoop 3, and users may download packages 
>> that can run on HDP 3.x and CDH 6.x. On the other side, branch 2.6.x will 
>> still be based on Hadoop 2. In addition, we should still maintain a branch of 
>> Kylin 3.x for Hadoop 2, and a branch of Kylin 2.x for Hadoop 3, so users may 
>> package Kylin binaries by themselves.
>>
>>
>>
>> Best regards,
>>
>>
>>
>> Ni Chunen / George
>>
>>


Re: How to update the cube when historical data in the fact table is updated

2019-09-03 Thread Billy Liu
Write-back of data is not supported at present, and it is not on the roadmap.

To refresh historical data more efficiently, the key is to avoid re-building data that has not changed; data that has changed always has to be refreshed, and Kylin's unit of refresh is the segment. If you do not need cross-computations among the "paid", "invoiced", and "created" fields, you can split them into different cubes, each using its own field as the partition column; a refresh then only needs to rebuild the old and the new time ranges. The potential risk is data inconsistency across the cubes.

Generally speaking, data updates follow a hot/cold pattern: hot data changes frequently while cold data does not. You can therefore design segments of different sizes for hot and cold data to reduce the complexity of refreshes.

With Warm regards

Billy Liu
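
As a concrete sketch of refreshing a single changed segment: Kylin's cube rebuild REST API accepts buildType REFRESH, and it takes the segment range as UTC epoch milliseconds. The snippet below only builds the request payload for the 20190701 partition; the host, cube name, and credentials in the comment are placeholder assumptions, and GNU date is assumed.

```shell
# Build the JSON body for a segment refresh. Kylin's REST API expects the
# segment range as epoch milliseconds in UTC (GNU date assumed).
start_ms=$(( $(date -u -d 20190701 +%s) * 1000 ))
end_ms=$((   $(date -u -d 20190702 +%s) * 1000 ))
payload="{\"startTime\": ${start_ms}, \"endTime\": ${end_ms}, \"buildType\": \"REFRESH\"}"
echo "$payload"

# The request itself (host, cube name, and credentials are hypothetical):
#   curl -X PUT -u ADMIN:KYLIN -H 'Content-Type: application/json' \
#        -d "$payload" http://localhost:7070/kylin/api/cubes/orders_cube/rebuild
```

Scheduling one such call per changed partition keeps unchanged segments untouched, which is the efficiency point made above.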


Yun, Henry (BJ/DEL) wrote on Tue, Sep 3, 2019, 12:54 PM:

> This question is very common. We ran into a similar problem when building a
> financial-metrics platform at our firm.
>
> We not only need to query multi-dimensional financial metrics from the cube
> (e.g. Company A - North China - Product A - paid orders - 1), but some
> reporting units without a front-end system also need to enter
> multi-dimensional data in real time (e.g. Company A - Southwest - Product A -
> paid orders - ???). After the Southwest owner enters the figures, they must
> roll up to the parent region in real time for consolidated queries.
>
> In the past we could do such data entry in the multi-dimensional database
> with Hyperion Essbase. Now that we are moving to Kylin, how can this
> write-back requirement be solved? Any advice is appreciated.
>
>
>
> Thanks!
>
> Yun Ning
>
>
>
> *From:* Wang Gang (王刚) [mailto:m...@wanggang1987.com]
> *Sent:* Tuesday, September 03, 2019 11:53 AM
> *To:* d...@kylin.apache.org
> *Cc:* user@kylin.apache.org
> *Subject:* Re: How to update the cube when historical data in the fact table is updated
>
>
>
> To add some context on this scenario:
>
> There is a fact table orders, partitioned by creation time, with two
> metrics, the number of paid orders and the number of invoiced orders, both
> computed against the order creation-time partition.
>
> For the row order1, updates on 20190801 and 20190901 changed the payment
> time and the invoicing time of data in the 20190701 partition.
>
> So both the paid-orders and invoiced-orders metrics for the 20190701
> partition need to be updated.
>
> In this scenario, how should we refresh the cube?
>
>
>
>
>
> On Sep 3, 2019, at 11:44, Wang Gang (王刚) wrote:
>
>
>
> Hi All
>
>   I'm a developer on the finance platform at Suning. We are evaluating
> platforms for an upgrade of our financial-metrics platform, and Kylin is
> one of the candidates.
>
>   During our current testing we ran into the problem of updating historical
> data in the fact table, and would like to ask the developers for advice.
>
>   For example, a Hive fact table orders, partitioned by creation time:
>
> Order ID | Created  | Paid     | Invoiced
> Order1   | 20190701 | 20190801 | 20190901
> Order2   | 20190901 | 20190901 | 20190901
>
>   The partition values of order2 are consistent and never updated, so its
> cube is easy to compute. For the row order1, the updates on 20190801 and
> 20190901 changed the payment time and the invoicing time of data in the
> 20190701 partition, so the payment- and invoicing-related metrics also need
> to be updated.
>
>   How should we configure the cube and the incremental refresh to update
> the metrics over historical fact data most efficiently?
>
>   I'm new to Kylin and sincerely ask for your advice. Thanks.
>
>
>
>


Re: Welcome to use docker for kylin

2019-08-28 Thread Billy Liu
Sounds cool

With Warm regards

Billy Liu

Xiaoxiang Yu wrote on Wed, Aug 28, 2019, 6:11 PM:
>
> Great! I think it is the best way for a user who does not have a Hadoop env 
> to learn Kylin, with just one command:
>
>
>
> docker run -d \
>     -m 8G \
>     -p 7070:7070 \
>     -p 8088:8088 \
>     -p 50070:50070 \
>     -p 8032:8032 \
>     -p 8042:8042 \
>     -p 60010:60010 \
>     apachekylin/apache-kylin-standalone:3.0.0-alpha2
>
>
>
> 
>
> Best wishes,
>
> Xiaoxiang Yu
>
>
>
>
>
> From: "codingfor...@126.com" 
> Reply-To: "user@kylin.apache.org" 
> Date: Wednesday, Aug 28, 2019, 17:30
> To: "d...@kylin.apache.org" , "user@kylin.apache.org" 
> 
> Subject: Welcome to use docker for kylin
>
>
>
> Dear Apache Kylin users and developers:
>
> We provide a Docker image for Kylin to allow users to try Kylin and 
> developers to verify code modifications. The image mainly supports:
>
> - installation and deployment of Kylin and its dependent services, so Kylin 
> features can be used directly
>
> - automatically copying code into the container, to verify code changes by 
> starting Kylin once packaging is complete
>
> See for details: http://kylin.apache.org/docs/install/kylin_docker.html
>
>
>
> Todo:
>
> - Support for running integration tests
>
>
>
> If you have any questions or suggestions about using Docker for Kylin, please 
> reply to me, thank you!


Re: QUESTION

2019-07-15 Thread Billy Liu
There must be something wrong during your loading. Kylin does not import
real data from Hive; it imports only metadata during loading. Please
check the log first.

With Warm regards

Billy Liu


wangweilin wrote on Sun, Jul 14, 2019, 7:28 PM:

> I transferred more than 200 GB of data to Hive, and then when I loaded data
> from Hive to Kylin, it took a day and was still in the loading state. I would
> like to ask how long it typically takes Kylin to load large amounts of data
> from Hive, or how to determine whether it is still loading data.
> Thank you very much! Looking forward to your early reply!
>
> wangweilin
> sdnuwangwei...@163.com
>
>


Re: Kylin on parquet

2019-06-28 Thread Billy Liu
The feature is still ongoing, but I don't think it will be part of 3.0.
Kylin 3.0 alpha has been released, and most features are already finalized for
this major version.

With Warm regards

Billy Liu

David Kis wrote on Fri, Jun 28, 2019, 11:57 PM:
>
> Hi,
>
> There was a discussion earlier that Kylin would be able to use Parquet files 
> as its storage format. Are there any updates regarding this topic? Will it be 
> implemented in the upcoming 3.0 release?
>
> Thanks,
> David


Re: ask about kylin 3.0 Go-Live Date

2019-06-11 Thread Billy Liu
Before going live, more feedback is expected from beta users. There is
no confirmed date for this major release. If you meet some issues or
want to share some experience with the realtime Kylin, please update
the community.

With Warm regards

Billy Liu

Bryan Liu (CN) wrote on Tue, Jun 11, 2019, 10:57 AM:
>
> Dears,
>
>
>
>   I am from the company Homecredit China.
>
>   I would like to ask when the new version 3.0 of Kylin will launch, since 
> we have been looking for a real-time OLAP solution recently.
>
>
>
>   Waiting for your feedback.
>
> Thank you
>
>
>
> Best Regards
>
> Bryan.liu
>
> Home credit CN, Tianjing, China.


Re: Which versions of Hive, Spark, and HBase does Kylin currently support?

2019-06-02 Thread Billy Liu
Please check http://kylin.apache.org/docs/install/index.html

I think most components you are using are quite new, and not fully
verified by community users. You will have to work through the compatibility
issues carefully.

With Warm regards

Billy Liu

aohanhe wrote on Sun, Jun 2, 2019, 5:54 PM:
>
> Hi all, which versions of Hive, Spark, and HBase does Kylin currently 
> support? Has anyone tested this? I am using CentOS 7 with Hadoop 3.2.0, 
> Hive 3.1.1, Spark 2.3.0, and HBase 2.0.0, and there is no way to get it 
> running: a pile of compatibility problems at startup.
>
>
>


Re: Re: jdbc query with limit not work

2019-05-27 Thread Billy Liu
What was the result when you ran the query in the Kylin Insight page? How
did you set the limit parameter when using JDBC?

With Warm regards

Billy Liu

lk_hadoop wrote on Tue, May 28, 2019, 9:33 AM:
>
> What I want is for the limit clause to be pushed down to HBase.
>
> 2019-05-28
> 
> lk_hadoop
> 
>
> From: JiaTao Tao 
> Sent: 2019-05-27 19:47
> Subject: Re: jdbc query with limit not work
> To: "user"
> Cc:
>
> Hi
> Try setting "kylin.query.max-return-rows" to a larger value (>1042201), and 
> re-run your query.
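
As a sketch of that advice: the threshold in the error comes from the kylin.query.max-return-rows property in conf/kylin.properties, and raising it requires a Kylin restart. The value below is an example, and the snippet writes to a demo path so it can run stand-alone; in a real install the file is $KYLIN_HOME/conf/kylin.properties.

```shell
# Demo path; in a real install this would be $KYLIN_HOME/conf/kylin.properties.
CONF=/tmp/kylin-demo/conf
mkdir -p "$CONF"
# Raise the per-query row cap (example value); restart Kylin afterwards.
echo 'kylin.query.max-return-rows=2000000' >> "$CONF/kylin.properties"
grep 'kylin.query.max-return-rows' "$CONF/kylin.properties"
```

Note that raising the cap only lifts the error; as the follow-up says, it does not make the limit clause push down to storage.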
>
>
> --
>
>
> Regards!
>
> Aron Tao
>
>
lk_hadoop wrote on Mon, May 27, 2019, 11:00 AM:
>>
>> hi, all:
>> I'm using Kylin 2.6.1. When I use the JDBC driver to connect to Kylin and 
>> query data, I get this error:
>>
>> org.apache.kylin.rest.exception.InternalErrorException: Query returned 
>> 1042201 rows exceeds threshold 100
>> while executing SQL: "SELECT "SH_FETCH_SALE_BASE_FACT_ALL_NEW"."GOODS_SPEC" 
>> FROM "GJST"."SH_FETCH_SALE_BASE_FACT_ALL_NEW" 
>> "SH_FETCH_SALE_BASE_FACT_ALL_NEW"  GROUP BY 
>> "SH_FETCH_SALE_BASE_FACT_ALL_NEW"."GOODS_SPEC" ORDER BY 
>> "SH_FETCH_SALE_BASE_FACT_ALL_NEW"."GOODS_SPEC" limit 1000"
>> why ?
>>
>> 2019-05-27
>> 
>> lk_hadoop
>
>
>


Re:

2019-04-14 Thread Billy Liu
If the Power BI DirectQuery connection mode is enabled for the Kylin data
source, the queries will run on the cubes. Otherwise, the data has to
be imported by "select all" statements into Power BI, which is very
inefficient. I think Power BI cannot import the cube definition
directly, as there is only a table model in Power BI. I found a video demo
on YouTube introducing another vendor's solution for Kylin and Power
BI; check this: https://www.youtube.com/watch?v=KynF6fds0aM

With Warm regards

Billy Liu

Khalil Mejdi wrote on Sat, Apr 13, 2019, 6:38 PM:
>
> Greetings
>
> Hello, Kylin!
> I am using the Apache Kylin community edition for my project and I want to 
> confirm some information.
> Does Insight run the queries on the cubes that are saved in HBase, or on 
> the database directly?
> Does Power BI give me the ability to inspect the cubes or the database 
> (using the Kylin ODBC driver)?
>
> Regards


Re: Upgrade from 2.4.1 to 2.6.1

2019-03-19 Thread Billy Liu
Good to know. As you mentioned, querying the job list has gone from 10
seconds to 1 second; the improvement comes not only from the MySQL backend
database, but also from the query-job function itself. There are some
enhancements around loading huge metadata.

With Warm regards

Billy Liu


Iñigo Martínez wrote on Tue, Mar 19, 2019, 2:01 PM:

> Good afternoon.
>
> I want to share my experience from the tests run this morning. Our current
> production environment is Kylin 2.4.1.
>
> - First, we configured Kylin 2.6.1 to check that everything was
> running properly, and we launched several cube build processes (with the
> learn_cube example). As the metadata backend we started using MySQL instead
> of HBase.
> - Second, we created a metadata backup from Kylin 2.4.1 and restored it on
> Kylin 2.6.1. The process was really fast, and when I say fast I mean VERY VERY
> fast. In the past, the restore procedure took around 15 minutes with HBase
> as the backend; with MySQL, only a couple of minutes. No problem at all.
> - Finally, we started Kylin 2.6.1. All cube metadata definitions were in
> place. We launched several queries and all of them worked without issues.
> At this moment we had the new MySQL metadata on one side and the old HBase
> metadata in our production cluster. The Kylin tables were, of course, located
> in HBase and were shared by both environments. Of course, from this
> point on we have two different Kylin installations and we have to be careful
> with build processes in order not to delete shared segments, but this is
> another story.
>
> We can say that metadata performance is much better with MySQL as the backend.
> For example, querying the job list takes around 10 seconds when the metadata
> backend is HBase and only one second with MySQL. Remember that we have
> migrated all the metadata from HBase, so both metadata databases are
> equal in terms of size and number of entries.
>
> Over the next days we are going to evaluate other aspects of the 2.6.1
> deployment, mainly focused on performance, fixed bugs, and stability.
>
> Thank you Shao Feng for your tips.
>
>
> On Tue, Mar 19, 2019 at 11:01, Iñigo Martínez (<
> imarti...@telecoming.com>) wrote:
>
>> Hi Shao Feng.
>>
>> Yesterday I deployed a new 2.6.1 instance with both MySQL and HBase as the
>> metastore. Since I've only built a test cube, everything runs smoothly.
>> I'm going to test restoring metadata from 2.4.1 today to see if we can
>> proceed with an in-place migration. Using MySQL as the backend will probably
>> help us get more stability than with HBase. We have made huge
>> improvements tuning HBase, but some issues are still present, and having
>> thousands of builds per month dumping logs into the kylin_metadata table is
>> not very good.
>>
>> On Tue, Mar 19, 2019 at 8:48, ShaoFeng Shi ()
>> wrote:
>>
>>> Hello Inigo,
>>>
>>> This is a good question.
>>>
>>> The MySQL metadata store was introduced in Kylin 2.5, as preparation for
>>> the no-HBase deployment. There is no evidence to say the MySQL meta store
>>> will have better performance, or be more stable. But at least, when HBase
>>> has a problem, Kylin service won't be impacted.
>>>
>>> To migrate to MySQL meta store, you can just 1) dump all metadata to
>>> local disk; 2) change Kylin configuration to use MySQL meta store; 3)
>>> restore metadata from local disk. MySQL will use two tables to persist the
>>> metadata, one for static resources (project, cube, etc), the other for job
>>> outputs. But this is transparent to end user.
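
The three migration steps described above can be sketched as follows. The metastore.sh backup/restore sub-commands are the tools shipped under Kylin's bin directory, while the JDBC URL, user, password, and backup folder name are placeholders; the kylin.metadata.url format follows the Kylin MySQL metastore tutorial. The snippet writes the config to a demo path so it can run stand-alone.

```shell
# 1) Dump all metadata to local disk (run inside a real Kylin install):
#      $KYLIN_HOME/bin/metastore.sh backup     # writes meta_backups/meta_<timestamp>

# 2) Switch kylin.properties to the MySQL metastore (demo path; the JDBC
#    URL and credentials below are placeholders):
mkdir -p /tmp/kylin-meta-demo
cat > /tmp/kylin-meta-demo/kylin.properties <<'EOF'
kylin.metadata.url=kylin_metadata@jdbc,url=jdbc:mysql://localhost:3306/kylin,username=kylin,password=secret,maxActive=10,maxIdle=10
EOF

# 3) Restore the backup into the new store, then restart Kylin:
#      $KYLIN_HOME/bin/metastore.sh restore meta_backups/meta_<timestamp>
grep '@jdbc' /tmp/kylin-meta-demo/kylin.properties
```

Since the store switch only changes where metadata lives, the HBase cube data itself is untouched, which matches Iñigo's observation that both environments kept sharing the same Kylin tables.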
>>>
>>>
>>> Best regards,
>>>
>>> Shaofeng Shi 史少锋
>>> Apache Kylin PMC
>>> Email: shaofeng...@apache.org
>>>
>>> Apache Kylin FAQ: https://kylin.apache.org/docs/gettingstarted/faq.html
>>> Join Kylin user mail group: user-subscr...@kylin.apache.org
>>> Join Kylin dev mail group: dev-subscr...@kylin.apache.org
>>>
>>>
>>>
>>>
Na Zhai wrote on Sun, Mar 17, 2019, 9:28 PM:
>>>
>>>> Hi, Iñigo Martínez.
>>>>
>>>>
>>>>
>>>> If you meet too many problems with HBase, you can try using MySQL
>>>> instead. Hope this can help you:
>>>> http://kylin.apache.org/docs/tutorial/mysql_metastore.html. In HBase
>>>> there is one metadata table; in MySQL there are two metadata tables. So I
>>>> think you cannot migrate metadata from HBase to MySQL directly.
>>>>
>>>>
>>>>
>>>>
>>>>
Sent from the Mail <https://go.microsoft.com/fwlink/?LinkId=550986> app for Windows 10

Re: [Discuss] Won't ship Spark binary in Kylin binary anymore

2019-03-07 Thread Billy Liu
+1

With Warm regards

Billy Liu


Zhong, Yanghong wrote on Fri, Mar 8, 2019, 11:27 AM:

> Agree to exclude spark binary.
>
>
>
> --
>
> Best regards,
>
> Yanghong Zhong
>
>
>
> *From: *yuzhang 
> *Reply-To: *"d...@kylin.apache.org" 
> *Date: *Friday, March 8, 2019 at 11:26 AM
> *To: *"user@kylin.apache.org" 
> *Cc: *"d...@kylin.apache.org" 
> *Subject: *Re: [Discuss] Won't ship Spark binary in Kylin binary anymore
>
>
>
> Agree!
> Downloading the Spark binary when packaging Kylin has always confused me.
>
>
>
>
> *yuzhang*
>
> shifengdefan...@163.com
>
>
>
> On 3/8/2019 10:42,ShaoFeng Shi
>  wrote:
>
> Hello,
>
>
>
> As we know, Kylin ships Spark in its binary package; the total package
> becomes bigger and bigger as the version grows. The latest version (v2.6.1)
> is bigger than 350MB, which was rejected by the Apache SVN server when trying
> to upload the new package. Of the 350MB, more than 200MB is Spark, while
> Spark is not mandatory for Kylin.
>
>
>
> So I would propose to exclude Spark from Kylin's binary package, starting
> from the current v2.6.1; the user just needs to point SPARK_HOME to a folder
> with the expected Spark version, or manually download Spark and put it under
> KYLIN_HOME/spark. All other behaviors are not impacted.
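
A sketch of the two options described above; the Spark version and the mirror URL are examples and must match the Spark build your Hadoop cluster actually needs.

```shell
# Option A: point SPARK_HOME at an existing Spark installation
# (version and path are examples).
SPARK_VER=2.3.2
export SPARK_HOME="$HOME/spark-${SPARK_VER}-bin-hadoop2.7"
echo "$SPARK_HOME"

# Option B: download Spark and place it where Kylin looks by default
# (commands commented out; the mirror URL is an assumption):
#   curl -fsSL "https://archive.apache.org/dist/spark/spark-${SPARK_VER}/spark-${SPARK_VER}-bin-hadoop2.7.tgz" \
#     | tar xz -C "$KYLIN_HOME"
#   mv "$KYLIN_HOME/spark-${SPARK_VER}-bin-hadoop2.7" "$KYLIN_HOME/spark"
```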
>
>
>
> Just share your comments if any.
>
>
> Best regards,
>
>
>
> Shaofeng Shi 史少锋
>
> Apache Kylin PMC
>
> Email: shaofeng...@apache.org
>
>
>
> Apache Kylin FAQ: https://kylin.apache.org/docs/gettingstarted/faq.html
>
> Join Kylin user mail group: user-subscr...@kylin.apache.org
>
> Join Kylin dev mail group: dev-subscr...@kylin.apache.org
>
>
>
>
>
>


Re: Safe deletion of $KYLIN_HOME/tomcat/temp stuff

2019-01-25 Thread Billy Liu
Hi Xiaoxiang,

I think you could put this operational info into the Kylin FAQ; it would be quite helpful.

With Warm regards

Billy Liu


ShaoFeng Shi wrote on Mon, Jan 21, 2019, 10:22 PM:

> Thanks for Xiao Xiang's information, it's quite clear.
>
> Best regards,
>
> Shaofeng Shi 史少锋
> Apache Kylin PMC
> Work email: shaofeng@kyligence.io
> Kyligence Inc: https://kyligence.io/
>
> Apache Kylin FAQ: https://kylin.apache.org/docs/gettingstarted/faq.html
> Join Kylin user mail group: user-subscr...@kylin.apache.org
> Join Kylin dev mail group: dev-subscr...@kylin.apache.org
>
>
>
>
Iñigo Martínez wrote on Mon, Jan 21, 2019, 8:44 PM:
>
>> Thank you, Xiaoxiang.
>>
>> Completely clear. We will proceed very carefully with the .json deletion,
>> and we will only send to the trash kylin_job_meta folders created several
>> days ago. Meanwhile, I will check GitHub to find out why those folders are
>> not being deleted.
>>
>>
>> On Sat, Jan 19, 2019 at 12:34, Xiaoxiang Yu (<
>> xiaoxiang...@kyligence.io>) wrote:
>>
>>> Hi,
>>>
>>>
>>>
>>> First things about safeToDelete.tmp:
>>>
>>>
>>>
>>> It looks like 'safeToDelete.tmp' is a file that is
>>> owned/controlled by Tomcat, not Kylin. It is used to indicate that
>>> the folder tomcat/temp could be deleted safely. But that is not true
>>> when another program (in this case, Kylin) uses that folder; so,
>>> for that reason, please ignore safeToDelete.tmp and delete
>>> kylin-related files under tomcat/temp carefully.
>>>
>>> I tried to find a description of that file in the Tomcat manual, but
>>> failed. Related links:
>>>
>>>
>>> https://stackoverflow.com/questions/7112591/what-is-the-tomcat-temp-directory-in-tomcat-7
>>>
>>> https://t246osslab.wordpress.com/2017/04/26/tomcatのsafetodelete-tmpの謎を追う
>>> /
>>>
>>>
>>>
>>>
>>>
>>>
>>>
>>> Second things about olap_model_XXX.json:
>>>
>>>
>>>
>>> These files are created by
>>> org.apache.kylin.query.schema.OLAPSchemaFactory, and they are read during
>>> SQL query analysis by Apache Calcite (code in the model method of
>>> class CalciteConnectionConfigImpl and QueryConnection). So they must not be
>>> deleted. These files are very small, and the count of such files equals
>>> the count of Kylin's models.
>>>
>>>
>>>
>>> If you delete them and cannot recover them, please restart your Kylin
>>> process. They will be created automatically after the restart.
>>>
>>>
>>>
>>>
>>>
>>> Third things about folder kylin_job_metaXXX:
>>>
>>>
>>>
>>> These folders are created by
>>> org.apache.kylin.engine.mr.common.AbstractHadoopJob; the method name is
>>> dumpKylinPropsAndMetadata. Such a folder is created when a step of a
>>> kylin job starts. When each step finishes, these files should be deleted
>>> automatically (in the finally clause of the run method). If you find them in
>>> tomcat/temp and they are not fresh (for example, the last-modified
>>> timestamp is two days old), they can be deleted safely. The count of
>>> such files will grow rapidly if you have a lot of submitted jobs.
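
Based on the description above, stale kylin_job_meta folders can be swept with find. The demo below runs against a mock directory so it is safe to execute; in a real cleanup the target would be $KYLIN_HOME/tomcat/temp, and the two-day threshold mirrors the rule of thumb given above. Double-check any -exec rm before pointing it at a live directory.

```shell
# Mock tomcat/temp with one stale and one fresh job-meta folder.
TMP=$(mktemp -d)
mkdir -p "$TMP/kylin_job_meta_old" "$TMP/kylin_job_meta_new"
touch -d '3 days ago' "$TMP/kylin_job_meta_old"   # GNU touch assumed

# Remove only kylin_job_meta* folders whose mtime is more than 2 days old.
find "$TMP" -maxdepth 1 -type d -name 'kylin_job_meta*' -mtime +2 -exec rm -rf {} +
ls "$TMP"
```

Only the stale folder is removed; the .json model files discussed above are deliberately not matched by the name pattern.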
>>>
>>>
>>>
>>>
>>>
>>> You can check the source code in the GitHub repo if you are interested.
>>> If you find any mistake, please let me know. Thank you.
>>>
>>>
>>>
>>> 
>>>
>>> Best wishes,
>>>
>>> Xiaoxiang Yu
>>>
>>>
>>>
>>>
>>>
>>> *From:* Iñigo Martínez 
>>> *Reply-To:* "user@kylin.apache.org" 
>>> *Date:* Friday, Jan 18, 2019, 22:35
>>> *To:* "user@kylin.apache.org" 
>>> *Subject:* Safe deletion of $KYLIN_HOME/tomcat/temp stuff
>>>
>>>
>>>
>>> Good morning.
>>>
>>>
>>>
>>> We have detected that our Kylin installation has plenty of temporary files
>>> located at $KYLIN_HOME/tomcat/temp. Now, around 90GB after two months
>>> running.
>>>
>>>
>>>
>>> Inside this folder, there is a "safeToDelete.tmp" file, so we assumed we
>>> could flush the contents of this folder safely. However, this was not true.
>>> As soon as we deleted contents older than 2 days, a lot of errors appeared
>>> in kylin.log complaining abou

Re: [Announce] Welcome new Apache Kylin committer: ChunEn Ni (倪春恩)

2018-11-27 Thread Billy Liu
Congrats, ChunEn.

With Warm regards

Billy Liu

ShaoFeng Shi wrote on Tue, Nov 27, 2018, 3:59 PM:
>
> The Project Management Committee (PMC) for Apache Kylin
> has invited ChunEn Ni(倪春恩) to become a committer and we are pleased
> to announce that he has accepted.
>
> Congratulations and welcome, ChunEn!
>
> Shaofeng Shi
>
> On behalf of the Apache Kylin PMC
>
>


Re: [2.5.1] NoSuchMethodError: com.facebook.fb303.FacebookService

2018-11-13 Thread Billy Liu
Most of these issues are caused by a wrong classpath. Try googling some
keywords. Pointing the environment variables to the right paths will solve
your problem.

With Warm regards

Billy Liu

Pengfei Guo wrote on Wed, Nov 14, 2018, 2:32 PM:
>
> hi, all:
>
> When I build a cube, I get this error:
>
> java.lang.NoSuchMethodError: 
> com.facebook.fb303.FacebookService$Client.sendBaseOneway(Ljava/lang/String;Lorg/apache/thrift/TBase;)V
>
> My Kylin version is apache-kylin-2.5.1-bin-cdh57, and this is my first time 
> running 2.5.1.
>
> The detailed log is below:
>
> Exception in thread "main" java.lang.NoSuchMethodError: 
> com.facebook.fb303.FacebookService$Client.sendBaseOneway(Ljava/lang/String;Lorg/apache/thrift/TBase;)V
> at 
> com.facebook.fb303.FacebookService$Client.send_shutdown(FacebookService.java:436)
> at 
> com.facebook.fb303.FacebookService$Client.shutdown(FacebookService.java:430)
> at 
> org.apache.hadoop.hive.metastore.HiveMetaStoreClient.close(HiveMetaStoreClient.java:538)
> at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
> at 
> sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62)
> at 
> sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
> at java.lang.reflect.Method.invoke(Method.java:498)
> at 
> org.apache.hadoop.hive.metastore.RetryingMetaStoreClient.invoke(RetryingMetaStoreClient.java:105)
> at com.sun.proxy.$Proxy19.close(Unknown Source)
> at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
> at 
> sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62)
> at 
> sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
> at java.lang.reflect.Method.invoke(Method.java:498)
> at 
> org.apache.hadoop.hive.metastore.HiveMetaStoreClient$SynchronizedHandler.invoke(HiveMetaStoreClient.java:2084)
> at com.sun.proxy.$Proxy19.close(Unknown Source)
> at org.apache.hadoop.hive.ql.metadata.Hive.close(Hive.java:357)
> at org.apache.hadoop.hive.ql.metadata.Hive.access$000(Hive.java:153)
> at org.apache.hadoop.hive.ql.metadata.Hive$1.remove(Hive.java:173)
> at org.apache.hadoop.hive.ql.metadata.Hive.closeCurrent(Hive.java:326)
> at 
> org.apache.hadoop.hive.ql.session.SessionState.close(SessionState.java:1643)
> at org.apache.hadoop.hive.cli.CliSessionState.close(CliSessionState.java:66)
> at org.apache.hadoop.hive.cli.CliDriver.run(CliDriver.java:701)
> at org.apache.hadoop.hive.cli.CliDriver.main(CliDriver.java:634)
> at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
> at 
> sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62)
> at 
> sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
> at java.lang.reflect.Method.invoke(Method.java:498)
> at org.apache.hadoop.util.RunJar.run(RunJar.java:221)
> at org.apache.hadoop.util.RunJar.main(RunJar.java:136)
>
>
> Thank you
>
> -- Pengfei Guo
>
>


Re: kylin.log Deleted, how to restore

2018-10-22 Thread Billy Liu
It seems the file handle pointing to kylin.log was not released.
Please check whether any Kylin process is still running, and try to
kill it and restart.

With Warm regards

Billy Liu

安卫华 <158989...@qq.com> wrote on Tue, Oct 23, 2018, 10:12 AM:
>
> Thank you very much for the reply, Shaofeng!
>
>   Restarting and redeploying did not solve the problem.
>
>   Deploying on a new machine solved it.
>
>
> -- Original message --
> From: "ShaoFeng Shi";
> Sent: Monday, Oct 22, 2018, 10:24 PM
> To: "user";
> Subject: Re: kylin.log Deleted, how to restore
>
> Try a restart ?
>
安卫华 <158989...@qq.com> wrote on Mon, Oct 22, 2018, 6:52 PM:
>>
>> hi all:
>>
>> kylin.log was deleted, and logs are no longer being written to the file.
>>
>>  How can I restore it?
>>
>>
>
>
>
> --
> Best regards,
>
> Shaofeng Shi 史少锋
>


Re: [Announce] Welcome new Apache Kylin committer :Allen Ma

2018-10-16 Thread Billy Liu
Welcome Allen.

With Warm regards

Billy Liu

Luke Han wrote on Tue, Oct 16, 2018, 11:28 PM:
>
> I am very pleased to announce that the Project Management Committee (PMC) of 
> Apache Kylin has asked Allen Ma (Gang Ma) to become an Apache Kylin 
> committer, and he has already accepted.
>
> Allen has already made many contributions to the Kylin community: actively 
> answering questions, submitting patches for bug fixes, and contributing to 
> some features. We are glad to have him as our new committer.
>
> Please join me to welcome Allen.
>
> Luke Han
>
> On behalf of the Apache Kylin PPMC


Re: Kylin Interpreter on Zeppelin doesn't allow setting project name on paragraph

2018-09-27 Thread Billy Liu
Hi Shaofeng,

Defining the project name in a paragraph has been supported since Zeppelin 0.7.1.
The test cases are included here:
https://github.com/apache/zeppelin/blob/master/kylin/src/test/java/org/apache/zeppelin/kylin/KylinInterpreterTest.java
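For reference, the interpreter-level default is configured in Zeppelin's Kylin interpreter settings. A sketch of the relevant properties (property names are from the Zeppelin Kylin interpreter; the values shown are illustrative):

```
kylin.api.url       = http://localhost:7070/kylin/api/query
kylin.api.user      = ADMIN
kylin.api.password  = KYLIN
kylin.query.project = learn_kylin
```

A paragraph-level `%kylin(project_name)` then overrides `kylin.query.project` for that paragraph only.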

Hi Moises,

If that does not work, please check the exception logs again. Thank you.

With Warm regards

Billy Liu


ShaoFeng Shi  于2018年9月27日周四 上午10:50写道:

> Thank you; we will check it.
>
> Moisés Català  于2018年9月26日周三 下午10:34写道:
>
>> Hi ShaoFeng,
>>
>> I’ve created the JIra Issue
>> https://issues.apache.org/jira/browse/KYLIN-3593 related to this topic
>>
>> Thanks
>>
>>
>>
>> Moisés Català
>> Senior Data Engineer
>> La Cupula Music - Sonosuite
>> T: *+34 93 250 38 05*
>> www.lacupulamusic.com
>>
>>
>>
>> C/. Trafalgar, 10   Pral-1ª
>> 08010 Barcelona (Spain)
>>
>>
>>
>>
>>
>>
>>
>> El 25 sept 2018, a las 6:01, ShaoFeng Shi 
>> escribió:
>>
>> Hi Moises,
>>
>> I don't think it supports a project name following "%kylin". The project
>> name can only be set in the interpreter configuration with
>> "kylin.query.project".
>>
>> But I agree with you that it should be flexible to support connecting
>> with multiple projects. Would you like to report a JIRA to Kylin? Thank you!
>>
>> Moisés Català  于2018年9月19日周三 上午1:26写道:
>>
>>> Hi guys,
>>>
>>> I have started using Kylin with our Zeppelin Notebook Server, we have
>>> installed correctly the interpreter and we can run queries to the default
>>> project specified in the interpreter settings:
>>>
>>> The problem is when we try to set the project at the paragraph:
>>> 
>>> The first paragraph fails with error 500; the second paragraph does not specify
>>> the project (learn_kylin) and executes fine.
>>>
>>> Our Kylin version is 2.4.0
>>> Zeppelin Version 0.7.3
>>>
>>> Kylin interpreter settings are the default:
>>> 
>>>
>>> Am I writing the paragraph with incorrect syntax?
>>>
>>> %kylin(learn_kylin)
>>>
>>> select part_dt,count(*) from kylin_sales group by part_dt
>>>
>>> Thanks in advance
>>>
>>>
>>
>> --
>> Best regards,
>>
>> Shaofeng Shi 史少锋
>>
>>
>>
>
> --
> Best regards,
>
> Shaofeng Shi 史少锋
>
>


Re: [Announce] Apache Kylin 2.5.0 released

2018-09-19 Thread Billy Liu
Thanks Shaofeng.

I think this is one of the most remarkable releases since 2.0.
A few highlights:
  [KYLIN-3521] - Enable Cube Planner by default
  [KYLIN-2565] - Support Hadoop 3.0
  [KYLIN-3033] - Support HBase 2.0
  [KYLIN-3418] - User interface for hybrid model
  [KYLIN-3427] - Convert to HFile in Spark
  [KYLIN-3441] - Merge cube segments in Spark
  [KYLIN-3442] - Fact distinct columns in Spark
  [KYLIN-3453] - Improve cube size estimation for TOPN, COUNT DISTINCT
  and a lot of bug fixes and improvement on stability.

With Warm regards

Billy Liu
ShaoFeng Shi  于2018年9月19日周三 上午8:31写道:
>
> The Apache Kylin team is pleased to announce the immediate availability of 
> the 2.5.0 release.
>
> This is a major release after 2.4, with more than 100 enhancements and bug 
> fixes. All of the changes in this release can be found in:
> https://kylin.apache.org/docs/release_notes.html
>
> You can download the source release and binary packages from Apache Kylin's 
> download page: https://kylin.apache.org/download/
>
> Apache Kylin is an open source Distributed Analytics Engine designed to 
> provide SQL interface and multi-dimensional analysis (OLAP) on Apache Hadoop, 
> supporting extremely large datasets.
>
> Apache Kylin lets you query massive dataset at sub-second latency in 3 steps:
> 1. Identify a star schema or snowflake schema data set on Hadoop.
> 2. Build Cube on Hadoop.
> 3. Query data with ANSI-SQL and get results in sub-second, via ODBC, JDBC or 
> RESTful API.
>
> Thanks to everyone who has contributed to the 2.5.0 release.
>
> We welcome your help and feedback. For more information on how to
> report problems, and to get involved, visit the project website at
> https://kylin.apache.org/
>
> --
> Best regards,
>
> Shaofeng Shi 史少锋
>


Re: [Discuss] Exclude Spark from Kylin's binary package

2018-09-18 Thread Billy Liu
I like the idea of removing the built-in Spark distribution. I think
Kylin should work well with any Spark 2.1+.
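The proposed download-spark.sh could stay very small. A hedged sketch, assuming the script name from the proposal below; the Spark version, package name and mirror URL are illustrative, not decided values:

```shell
# Hypothetical download-spark.sh: fetch Spark into $KYLIN_HOME/spark, or do
# nothing if the user already placed (or symlinked) a Spark there.
download_spark() {
  kylin_home=$1
  spark_version=${2:-2.1.2}           # illustrative version, not an official pin
  pkg="spark-${spark_version}-bin-hadoop2.7"
  if [ -e "$kylin_home/spark" ]; then
    echo "spark already present"
    return 0
  fi
  url="https://archive.apache.org/dist/spark/spark-${spark_version}/${pkg}.tgz"
  curl -fL "$url" | tar -zx -C "$kylin_home" \
    && mv "$kylin_home/$pkg" "$kylin_home/spark"
}

# A user with a local Spark install skips the download entirely:
#   ln -s /usr/local/spark "$KYLIN_HOME/spark"
```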

With Warm regards

Billy Liu

ShaoFeng Shi  于2018年9月18日周二 下午5:35写道:
>
> Hello,
>
> Today Kylin's binary package ships with an Apache Spark binary; the total package
> is 297 MB in v2.5.0. I noticed that a package above 300 MB would
> be rejected by Apache SVN. In the whole package, Spark takes about 200
> MB, and when we upgrade to a higher version the package will get bigger.
> Without Spark, the package is around 100 MB.
>
> Since we don't customize Spark, we can provide a
> download-spark.sh script for users to download it easily. If a user already
> has Spark locally, they just need to copy it or create a symbolic link in the Kylin
> folder.
>
> This will reduce the package size by two thirds and save time/bandwidth when
> distributing Kylin. If you see a problem this change may bring, please speak up.
>
> --
> Best regards,
>
> Shaofeng Shi 史少锋
>


Re: [Announce] Apache Kylin 2.4.1 released

2018-09-09 Thread Billy Liu
Thanks Yanghong for this new release.

With Warm regards

Billy Liu

Yanghong Zhong  于2018年9月10日周一 上午9:20写道:
>
> The Apache Kylin team is pleased to announce the immediate availability of
> the 2.4.1 release.
>
> This is a bug fix release after 2.4.0, with 22 bug fixes and enhancements;
> All of the changes in this release can be found in:
> https://kylin.apache.org/docs/release_notes.html
>
> You can download the source release and binary packages from Apache Kylin's
> download page: https://kylin.apache.org/download/
>
> Apache Kylin is an open source Distributed Analytics Engine designed to
> provide SQL interface and multi-dimensional analysis (OLAP) on Apache
> Hadoop, supporting extremely large datasets.
>
> Apache Kylin lets you query massive data set at sub-second latency in 3
> steps:
> 1. Identify a star schema or snowflake schema data set on Hadoop.
> 2. Build Cube on Hadoop.
> 3. Query data with ANSI-SQL and get results in sub-second, via ODBC, JDBC
> or RESTful API.
>
> Thanks to everyone who has contributed to the 2.4.1 release.
>
> We welcome your help and feedback. For more information on how to
> report problems, and to get involved, visit the project website at
> https://kylin.apache.org/


Re: [Announce] New Apache Kylin PMC Yanghong Zhong

2018-08-30 Thread Billy Liu
Congrats, Yanghong. Welcome.

With Warm regards

Billy Liu

ShaoFeng Shi  于2018年8月30日周四 下午6:04写道:
>
> Congratulations Yanghong! Welcome to join Kylin PMC!
>
> 2018-08-30 17:56 GMT+08:00 Luke Han :
>>
>> On behalf of the Apache Kylin PMC, I am very pleased to announce
>> that Yanghong Zhong has accepted the PMC's invitation to become a
>> PMC member on the project.
>>
>> We appreciate all of Yanghong's generous contributions: many bug
>> fixes, patches, and help for many users. We are so glad to have him as our
>> new PMC member and look forward to his continued involvement.
>>
>> Congratulations and welcome, Yanghong!
>
>
>
>
> --
> Best regards,
>
> Shaofeng Shi 史少锋
>


Re: Changes In Dimensions And Historical Cube's Data

2018-08-21 Thread Billy Liu
Hello Shrikant,

I think there are some new updates to the lookup table capability. Check
out https://issues.apache.org/jira/browse/KYLIN-3221
Each dimension table is stored as a snapshot (if you are using derived
dimensions, not normal dimensions) and is connected with the corresponding
segment. With KYLIN-3221, you can refresh the existing lookup table
as needed.

With Warm regards

Billy Liu
Shrikant Bang  于2018年8月21日周二 上午11:26写道:
>
> Hello Kylin Users,
>
> I found a similar mail thread from Dec. 2015 (
> http://apache-kylin.74782.x6.nabble.com/Incremental-builds-assumptions-and-clarifications-td2736.html
>  ).
> Can someone please confirm whether this understanding also applies to the
> current version (v2.4) of Apache Kylin?
>
> Thank You,
> Shrikant Bang.
>
> On Mon, Aug 20, 2018 at 3:29 PM Shrikant Bang  wrote:
>>
>> Hi Team,
>>
>> We have a use case where a dimension's data may get modified (slowly
>> changing dimension, SCD).
>>
>>   Here is example:
>>
>> Fact :
>>user_activity (
>> user_id STRING,
>> country_code INT,
>> 
>>   )
>>
>> Dimension:
>>   country_dim (
>>   country_code INT,
>>   country_name STRING ,
>>  
>> )
>>
>>
>> Let's say we have mapped country code 1 to 'example_country1' and kept
>> building the cube for years. Now country code 1 is assigned to
>> 'new_country_code1'.
>>
>>
>> I have the queries below:
>>
>>   Is there any way to update the cube, or does it have to be rebuilt for all
>> past time segments?
>>  Can we join cube data with other dimension tables at runtime (changing
>> dimension tables), something like lookups?
>>
>>
>> Thank You,
>> Shrikant Bang.
>>
>>
>>
>>


Re: Queries For Building Cube

2018-08-13 Thread Billy Liu
Hello Shrikant,

For 1, it seems the 4 dimensions form a hierarchy. You could
define them as hierarchy dimensions in the cube, and make A a mandatory
dimension.
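In the cube descriptor, that combination is expressed as an aggregation group. A sketch using the A/B/C/D placeholders from the question (the field names follow the Kylin 2.x cube-descriptor JSON; the dimension names are the question's placeholders):

```json
"aggregation_groups": [
  {
    "includes": ["A", "B", "C", "D"],
    "select_rule": {
      "mandatory_dims": ["A"],
      "hierarchy_dims": [["B", "C", "D"]],
      "joint_dims": []
    }
  }
]
```

With A mandatory and B, C, D as one hierarchy, only the cuboids (A), (A,B), (A,B,C) and (A,B,C,D) are built, which matches the selective set asked for.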

For 2, set the date column of 'user_activity' as the partition column in the
model design. There are a few built-in formats; most date types are supported.

With Warm regards

Billy Liu
Shrikant Bang  于2018年8月13日周一 下午5:39写道:
>
> Hi Team,
>
>  We are doing a PoC on building OLAP cubes. Could you please help me
> answer the queries below?
>
> Selective Cuboids:
> We need to have selective cuboids as part of OLAP cubes.
> Let's say we have 4 dimensions: A, B, C, D; then we need just (A,B,C,D),
> (A,B,C), (A,B) and (A).
>
> Refresh Settings:
> How do we specify the partition column and its format when building the cube on the fact table?
> e.g. user_activity is partitioned by date 'yyyy-MM-dd' and the cube should be
> refreshed every day with the previous day's computation.
>
>
> Thank You,
> Shrikant Bang
>


Re: [Announce] Apache Kylin 2.3.2 released

2018-07-08 Thread Billy Liu
Thanks, Kaisen.

With Warm regards

Billy Liu

kangkaisen  于2018年7月8日周日 下午3:38写道:
>
> The Apache Kylin team is pleased to announce the immediate availability of 
> the 2.3.2 release.
>
> This is a bug fix release after 2.3.1, with 12 bug fixes and enhancements; 
> All of the changes in this release can be found in:
> https://kylin.apache.org/docs23/release_notes.html
>
> You can download the source release and binary packages from Apache Kylin's 
> download page: https://kylin.apache.org/download/
>
> Apache Kylin is an open source Distributed Analytics Engine designed to 
> provide SQL interface and multi-dimensional analysis (OLAP) on Apache Hadoop, 
> supporting extremely large datasets.
>
> Apache Kylin lets you query massive data set at sub-second latency in 3 steps:
> 1. Identify a star schema or snowflake schema data set on Hadoop.
> 2. Build Cube on Hadoop.
> 3. Query data with ANSI-SQL and get results in sub-second, via ODBC, JDBC or 
> RESTful API.
>
> Thanks everyone who have contributed to the 2.3.2 release.
>
> We welcome your help and feedback. For more information on how to
> report problems, and to get involved, visit the project website at
> https://kylin.apache.org/
>
>
> --
> Best regards,
>
> Kaisen Kang 康凯森


Re: Kylin performance

2018-06-05 Thread Billy Liu
I just checked it again. The KyBot Client download URL has been fixed.

With Warm regards

Billy Liu


Kosmachev, Dmitry  于2018年6月5日周二 下午3:54写道:

> Hi Billy!
>
>
>
> Thanks a lot!
>
> As far as I can see, the first step to start using this tool is to generate a
> diagnostic package with the KyBot Client. But the link for downloading this
> client leads to
> http://cn.kyligence.io/download/kybot/1.1.32/kybot-client-1.1.32-hbase1.x-bin.tar.gz
> with a “Nothing found” message. The Products page on this site does not include
> any links to the client. Do you know whether there are any mirrors for downloading
> the client package?
>
>
>
> BR
>
>
>
> *Dmitry Kosmachev*
> Senior Specialist
> Luxoft
>
>
>
> *From:* Billy Liu 
> *Sent:* Saturday, June 2, 2018 9:11 AM
> *To:* user 
> *Subject:* Re: Kylin performance
>
>
>
> Hi Dmitry,
>
>
>
> This is an online query-bottleneck diagnostic tool designed for Apache
> Kylin: https://kybot.io  You could give it a try, to figure out whether the
> bottleneck is in the underlying HBase service or the query did not hit the right cuboid.
>
>
> With Warm regards
>
> Billy Liu
>
>
>
>
>
> Kosmachev, Dmitry  于2018年5月29日周二 下午10:10写道:
>
> We have several cubes which are used for requests from our application.
>
> Sometimes our application slows down, and we can see in the Kylin logs that
> queries from this app run for hundreds of seconds. These queries are also
> shown as Slow Queries in the Kylin UI.
>
> The performance is the same when you run the same queries via Insight.
>
> But if you try to run the same queries later (in an hour or two, for instance)
> they take only a few seconds (from the app and from Insight). So if it gets stuck,
> it gets stuck everywhere. If it runs well, it runs well in Insight and from the app.
>
>
>
> No evidence from the HBase side, and we don't understand what the source
> of the problem is.
>
>
>
> Here is the example from log:
>
>
>
> 2018-05-29 16:43:44,097 DEBUG [BadQueryDetector]
> badquery.BadQueryHistoryManager:90 : Loaded 10 Bad Query(s)
>
> 2018-05-29 16:43:44,102 INFO  [BadQueryDetector]
> service.BadQueryDetector:170 : Problematic thread 0x5cc2f
>
> at
> org.apache.calcite.linq4j.EnumerableDefaults.groupBy_(EnumerableDefaults.java:828)
>
> at
> org.apache.calcite.linq4j.EnumerableDefaults.groupBy(EnumerableDefaults.java:761)
>
> at
> org.apache.calcite.linq4j.DefaultEnumerable.groupBy(DefaultEnumerable.java:302)
>
> at Baz.bind(Unknown Source)
>
> at
> org.apache.calcite.jdbc.CalcitePrepare$CalciteSignature.enumerable(CalcitePrepare.java:331)
>
> at
> org.apache.calcite.jdbc.CalciteConnectionImpl.enumerable(CalciteConnectionImpl.java:294)
>
> at
> org.apache.calcite.jdbc.CalciteMetaImpl._createIterable(CalciteMetaImpl.java:553)
>
> at
> org.apache.calcite.jdbc.CalciteMetaImpl.createIterable(CalciteMetaImpl.java:544)
>
> at
> org.apache.calcite.avatica.AvaticaResultSet.execute(AvaticaResultSet.java:193)
>
> at
> org.apache.calcite.jdbc.CalciteResultSet.execute(CalciteResultSet.java:67)
>
>
>
> 2018-05-29 16:44:13,329 INFO  [Query
> f8464b34-589e-49c2-8f3a-648fa009309b-379951] service.QueryService:284 :
>
> ==[QUERY]===
>
> Query Id: f8464b34-589e-49c2-8f3a-648fa009309b
>
> SQL: SELECT QUERYID,
>
> CAST(SUM(EMPLOYERCNT) as float)/COUNT(PersonId) as AvgEmployer,
>
> MAX(EMPLOYERCNT) as MaxEmployer
>
> FROM
>
> (SELECT QUERYID, PersonId, MAX(EMPLOYERCNT) as EMPLOYERCNT
>
> FROM PERSON_VIEW_V2 where QueryId IN (1237)
>
> group by QUERYID, PersonId)
>
> WHERE EMPLOYERCNT IS NOT NULL AND EMPLOYERCNT > 0
>
> group by QUERYID
>
> User: ADMIN
>
> Success: true
>
> Duration: 119.237
>
> Project: scrm
>
> Realization Names: [CUBE[name=person_cube_v2_2]]
>
> Cuboid Ids: [768]
>
> Total scan count: 44354219
>
> Total scan bytes: 3183760803
>
> Result row count: 1
>
> Accept Partial: true
>
> Is Partial Result: false
>
> Hit Exception Cache: false
>
> Storage cache used: false
>
> Is Query Push-Down: false
>
> Message: null
>
> ==[QUERY]===
>
>
>
>
>
>
>
> *Dmitry Kosmachev*
> Senior Specialist
> Luxoft
>
>
>
> *From:* Kumar, Manoj H 
> *Sent:* Tuesday, May 29, 2018 3:04 PM
> *To:* user@kylin.apache.org
> *Subject:* [Sender Auth Failure] RE: Kylin performance
>
>
>
> Can you please explain the issue in detail? Did you run the query in the “Insight”
> section first? Is it working fine there?
>
>
>
> Regar

Re: Kylin performance

2018-06-02 Thread Billy Liu
Hi Dmitry,

This is an online query-bottleneck diagnostic tool designed for Apache
Kylin: https://kybot.io  You could give it a try, to figure out whether the bottleneck
is in the underlying HBase service or the query did not hit the right cuboid.

With Warm regards

Billy Liu


Kosmachev, Dmitry  于2018年5月29日周二 下午10:10写道:

> We have several cubes which are used for requests from our application.
>
> Sometimes our application slows down, and we can see in the Kylin logs that
> queries from this app run for hundreds of seconds. These queries are also
> shown as Slow Queries in the Kylin UI.
>
> The performance is the same when you run the same queries via Insight.
>
> But if you try to run the same queries later (in an hour or two, for instance)
> they take only a few seconds (from the app and from Insight). So if it gets stuck,
> it gets stuck everywhere. If it runs well, it runs well in Insight and from the app.
>
>
>
> No evidence from the HBase side, and we don't understand what the source
> of the problem is.
>
>
>
> Here is the example from log:
>
>
>
> 2018-05-29 16:43:44,097 DEBUG [BadQueryDetector]
> badquery.BadQueryHistoryManager:90 : Loaded 10 Bad Query(s)
>
> 2018-05-29 16:43:44,102 INFO  [BadQueryDetector]
> service.BadQueryDetector:170 : Problematic thread 0x5cc2f
>
> at
> org.apache.calcite.linq4j.EnumerableDefaults.groupBy_(EnumerableDefaults.java:828)
>
> at
> org.apache.calcite.linq4j.EnumerableDefaults.groupBy(EnumerableDefaults.java:761)
>
> at
> org.apache.calcite.linq4j.DefaultEnumerable.groupBy(DefaultEnumerable.java:302)
>
> at Baz.bind(Unknown Source)
>
> at
> org.apache.calcite.jdbc.CalcitePrepare$CalciteSignature.enumerable(CalcitePrepare.java:331)
>
> at
> org.apache.calcite.jdbc.CalciteConnectionImpl.enumerable(CalciteConnectionImpl.java:294)
>
> at
> org.apache.calcite.jdbc.CalciteMetaImpl._createIterable(CalciteMetaImpl.java:553)
>
> at
> org.apache.calcite.jdbc.CalciteMetaImpl.createIterable(CalciteMetaImpl.java:544)
>
> at
> org.apache.calcite.avatica.AvaticaResultSet.execute(AvaticaResultSet.java:193)
>
> at
> org.apache.calcite.jdbc.CalciteResultSet.execute(CalciteResultSet.java:67)
>
>
>
> 2018-05-29 16:44:13,329 INFO  [Query
> f8464b34-589e-49c2-8f3a-648fa009309b-379951] service.QueryService:284 :
>
> ==[QUERY]===
>
> Query Id: f8464b34-589e-49c2-8f3a-648fa009309b
>
> SQL: SELECT QUERYID,
>
> CAST(SUM(EMPLOYERCNT) as float)/COUNT(PersonId) as AvgEmployer,
>
> MAX(EMPLOYERCNT) as MaxEmployer
>
> FROM
>
> (SELECT QUERYID, PersonId, MAX(EMPLOYERCNT) as EMPLOYERCNT
>
> FROM PERSON_VIEW_V2 where QueryId IN (1237)
>
> group by QUERYID, PersonId)
>
> WHERE EMPLOYERCNT IS NOT NULL AND EMPLOYERCNT > 0
>
> group by QUERYID
>
> User: ADMIN
>
> Success: true
>
> Duration: 119.237
>
> Project: scrm
>
> Realization Names: [CUBE[name=person_cube_v2_2]]
>
> Cuboid Ids: [768]
>
> Total scan count: 44354219
>
> Total scan bytes: 3183760803
>
> Result row count: 1
>
> Accept Partial: true
>
> Is Partial Result: false
>
> Hit Exception Cache: false
>
> Storage cache used: false
>
> Is Query Push-Down: false
>
> Message: null
>
> ==[QUERY]===
>
>
>
>
>
>
>
> *Dmitry Kosmachev*
> Senior Specialist
> Luxoft
>
>
>
> *From:* Kumar, Manoj H 
> *Sent:* Tuesday, May 29, 2018 3:04 PM
> *To:* user@kylin.apache.org
> *Subject:* [Sender Auth Failure] RE: Kylin performance
>
>
>
> Can you please explain the issue in detail? Did you run the query in the “Insight”
> section first? Is it working fine there?
>
>
>
> Regards,
>
> Manoj
>
>
>
> *From:* Kosmachev, Dmitry [mailto:dkosmac...@luxoft.com]
> *Sent:* Tuesday, May 29, 2018 3:47 PM
> *To:* user@kylin.apache.org
> *Subject:* Kylin performance
>
>
>
> Hi!
>
> We have a strange situation with Kylin performance.
>
> Sometimes queries to Kylin last hundreds or thousands of seconds.
> Considering that our web portal runs several of these queries when a user
> uses the portal, this performance issue leads to the web portal not
> working at all.
>
> Sometimes the same queries last only several seconds (without using the Kylin
> cache). We can't find any relation to HDFS or HBase performance. Are there
> any best practices to debug this situation and find the bottleneck? We
> don't understand whether Kylin or HBase is the reason.
>
> By now we discovered that our biggest cube has 500 regions in HBase. We
> have default kylin.hbase.region.cut parameter and 

Re: Building a cube by partition in Kylin: a failed job was dropped directly without being DISCARDED first, and deleting the unsuccessful segment via the REST API then reports an error

2018-04-27 Thread Billy Liu
Agree with Silas. The job-drop bug has been fixed, and you are
advised to upgrade.

With Warm regards

Billy Liu


2018-04-24 15:17 GMT+08:00 Ge Silas <go...@live.cn>:
> Hello,
>
> What is your Apache Kylin version? In latest versions, I don’t think the job 
> with error status can be dropped directly before discarding it.
>
> At this moment, you may need to download your metadata, edit it to remove this
> segment, and then upload it again, or start over by cloning the original cube.
>
> Thanks,
> Silas
>
> On Apr 23, 2018, at 3:54 PM,  <mr.jhon...@qq.com> wrote:
>
> Hello:
>   I created a cube partitioned on a time field, with scheduled builds.
> <f30ad...@f5efdc33.4a91dd5a.jpg>
> When building that time range, the job failed, and I dropped the job directly.
> Then when re-building (performing the operation shown in the screenshot above), it reports the following error:
> <5ba40...@06b12437.4a91dd5a.jpg>
> Running a refresh operation <929f3...@6b75b22f.4a91dd5a.jpg> also reports an error <be002...@6fcc1714.4a91dd5a.jpg>
> The segment still shows up under Storage <9439a...@4ab61b61.4a91dd5a.jpg><96ead...@9ee4c26e.4a91dd5a.jpg>
>
> But when I check in HBase, the table KYLIN_KE4A0U0TUZ does not exist.
> When I call Kylin's REST API from JavaScript to delete this segment, it reports an error:
> <9a430...@0003853f.4a91dd5a.jpg>
> I want to re-build this segment; what should I do? Thanks.
>
>
>


Re: Hybrid Cubes Document

2018-04-22 Thread Billy Liu
Hello Roberto,

Thanks for sharing this. Would you like to publish it on the Kylin website?


With Warm regards

Billy Liu

2018-04-20 1:18 GMT+08:00 <roberto.tar...@stratebi.com>:

> Hi,
>
>
>
> In recent days I have been doing some research on the use of hybrid cubes
> (the hybrid model). However, I only found this document
> http://kylin.apache.org/blog/2015/09/25/hybrid-model/, published on Sep 25,
> 2015. For this reason, I wrote a little guide that aims to explain its use,
> possible use cases and current limitations. I share the document through the
> following link:
>
>
>
> https://drive.google.com/open?id=1qbvB1iONBcFMFE__SuF0ayq_l1_0vwXN
>
>
>
> Please do not hesitate to correct me if you see something wrong. I have
> found this feature very interesting for mitigating the issues related to
> rebuilding the entire cube when we need to modify its definition.
> However, the hybrid model only combines the data from two cubes if a query
> uses only the common columns of these two cubes. I have analyzed this
> drawback in the document.
>
>
>
> I appreciate the help of the Kylin community and team, I hope this
> document helps.
>
>
>
> Best Regards,
>
> *Roberto Tardío Olmos*
>
> *Senior Big Data & Business Intelligence Consultant*
>
> Avenida de Brasil, 17
> <https://maps.google.com/?q=Avenida+de+Brasil,+17=gmail=g>,
> Planta 16.28020 Madrid
>
> Fijo: 91.788.34.10
>
>
>
>
>
> http://bigdata.stratebi.com/
>
>
>
> http://www.stratebi.com
>
>
>


Re: [Announce] Welcome new Apache Kylin committer: Julian Pan

2018-04-16 Thread Billy Liu
Welcome Julian, congrats.

With Warm regards

Billy Liu


2018-04-17 9:20 GMT+08:00 Ted Yu <yuzhih...@gmail.com>:
> Congratulations, Julian.
>
> On Mon, Apr 16, 2018 at 6:18 PM, Luke Han <luke...@apache.org> wrote:
>>
>> My fault; I tried to send this out one week ago, but somehow it did not
>> go out at that time. Sorry about that.
>>
>> Let's welcome our new committer.
>>
>> Thanks.
>>
>> Luke Han <luke...@apache.org>于2018年4月17日周二 上午9:16写道:
>>>
>>> I am very pleased to announce that the Project Management Committee (PMC)
>>> of Apache Kylin has asked Julian Pan to become an Apache Kylin committer, and
>>> he has already accepted.
>>>
>>> Julian has already made many contributions to the Kylin community: answering
>>> others' questions actively, submitting patches for bug fixes and contributing to
>>> some features. We are so glad to have him as our new committer.
>>>
>>> Please join me to welcome Julian.
>>>
>>> Luke Han
>>>
>>> On behalf of the Apache Kylin PPMC
>
>


[ECO] Redash Kylin Plugin

2018-04-09 Thread Billy Liu
Hello Kylin user,

Found one interesting project on Github from the Strikingly contribution.
https://github.com/strikingly/redash-kylin.

The description from project readme:
"At Strikingly we are using Apache Kylin as our BI solution to have
insight about multiple data sources. We are also using Redash, an
excellent open source dashboard service for drawing chart and
generating report.

So we made this plugin to let redash connect to Kylin without
configuring any JDBC connections. After installed, you should be able
to execute SQL query, test connections and list schemas upon a Kylin
data source."

Thanks for this contribution.

With Warm regards

Billy Liu


Re: The global dictionary does not specify the cube distinction.

2018-04-03 Thread Billy Liu
The global dictionary is designed to be shared among cubes, so it is
expected that deleting a cube does not remove the global dictionary. If
you want to remove the global dictionary, please use the storage cleanup
tool:
http://kylin.apache.org/docs23/howto/howto_cleanup_storage.html

With Warm regards

Billy Liu


2018-04-04 9:37 GMT+08:00 小村庄 <ren...@foxmail.com>:
> In the cube design, the COUNT_DISTINCT measure uses the global dictionary.
> When building the cube, a dictionary file is generated on HDFS under the
> directory "/resources/GlobalDict/dict/{database.tableName}/{column}/", and at
> the same time metadata is generated in HBase; the metadata rowkey in HBase is
> "/dict/{database.tableName}/{column}/", and its value points to the
> corresponding HDFS path.
> The problem is that the corresponding HDFS data and HBase data are not
> deleted when the cube is deleted. Moreover, neither the HDFS data path nor
> the HBase metadata rowkey specifies a concrete cube, so if a table column
> uses a global dictionary in more than one cube, multiple cubes end up
> sharing one dictionary.


Re: How to use MR to build UHC dimensions

2018-04-01 Thread Billy Liu
Hi Fei Yi,

This parameter only works for ultra high cardinality columns,
including the columns defined as "ShardBy" and "Global Dictionary".
Please check if your cube has these two definitions.

With Warm regards

Billy Liu


2018-03-30 16:45 GMT+08:00 Fei Yi <yijianhui...@gmail.com>:
> I use Kylin version 2.3.1, and set
> kylin.engine.mr.build-uhc-dict-in-additional-step=true
> kylin.snapshot.max-mb=3000
>
> but the dictionaries are still built on the Kylin server; I don't see a
> separate step to build UHC dimensions.
>
>


Re: [Announce] Apache Kylin 2.3.1 released

2018-03-28 Thread Billy Liu
Download page updated. https://kylin.apache.org/download/


With Warm regards

Billy Liu

2018-03-28 17:03 GMT+08:00 Kumar, Manoj H <manoj.h.ku...@jpmorgan.com>:

> Please advise how I can download version 2.3.1; I don't see it there:
>
>
>
> https://archive.apache.org/dist/kylin/
>
>
>
>
>
>
>
>
>
> Regards,
>
> Manoj
>
>
>
> *From:* Billy Liu [mailto:billy...@apache.org]
> *Sent:* Wednesday, March 28, 2018 9:14 AM
> *To:* dev <d...@kylin.apache.org>; user <user@kylin.apache.org>;
> annou...@apache.org
> *Subject:* [Announce] Apache Kylin 2.3.1 released
>
>
>
> The Apache Kylin team is pleased to announce the immediate
> availability of the 2.3.1 release.
>
> This is a bug fix release after 2.3.0 with 12 bug fixes and
> enhancements; All of the changes in this release can be found in:
> https://kylin.apache.org/docs23/release_notes.html
>
> You can download the source release and binary packages from Apache
> Kylin's download page: https://kylin.apache.org/download/
>
> Apache Kylin is an open source Distributed Analytics Engine designed
> to provide SQL interface and multi-dimensional analysis (OLAP) on
> Apache Hadoop, supporting extremely large datasets.
>
> Apache Kylin lets you query massive data sets at sub-second latency in 3 steps:
> 1. Identify a star schema or snowflake schema data set on Hadoop.
> 2. Build Cube on Hadoop.
> 3. Query data with ANSI-SQL and get results in sub-second, via ODBC,
> JDBC or RESTful API.
>
> Thanks to everyone who has contributed to the 2.3.1 release.
>
> We welcome your help and feedback. For more information on how to
> report problems, and to get involved, visit the project website at
> https://kylin.apache.org/
>
>
> With Warm regards
>
> Billy Liu
>
> This message is confidential and subject to terms at:
> http://www.jpmorgan.com/emaildisclaimer including on confidentiality, legal
> privilege, viruses and monitoring of electronic messages. If you are not
> the intended recipient, please delete this message and notify the sender
> immediately. Any unauthorized use is strictly prohibited.
>


[Announce] Apache Kylin 2.3.1 released

2018-03-27 Thread Billy Liu
The Apache Kylin team is pleased to announce the immediate
availability of the 2.3.1 release.

This is a bug fix release after 2.3.0 with 12 bug fixes and
enhancements; All of the changes in this release can be found in:
https://kylin.apache.org/docs23/release_notes.html

You can download the source release and binary packages from Apache
Kylin's download page: https://kylin.apache.org/download/

Apache Kylin is an open source Distributed Analytics Engine designed
to provide SQL interface and multi-dimensional analysis (OLAP) on
Apache Hadoop, supporting extremely large datasets.

Apache Kylin lets you query massive data set at sub-second latency in 3 steps:
1. Identify a star schema or snowflake schema data set on Hadoop.
2. Build Cube on Hadoop.
3. Query data with ANSI-SQL and get results in sub-second, via ODBC,
JDBC or RESTful API.

Thanks to everyone who has contributed to the 2.3.1 release.

We welcome your help and feedback. For more information on how to
report problems, and to get involved, visit the project website at
https://kylin.apache.org/


With Warm regards

Billy Liu


Re: negative result in kylin 2.3.0

2018-03-18 Thread Billy Liu
Hi  ZhangXuan,

Could you share the return type defined for your SUM measure? Some
test cases would also be welcome.

With Warm regards

Billy Liu


2018-03-16 22:49 GMT-07:00 ZhangXuan <z...@czkj1010.com>:
> Hi,
>    I successfully built a cube with 4 billion rows of data on Kylin upgraded
> from 2.2.0 to 2.3.0. When I use SQL to query the data on a large
> dimension, a negative SUM measure appears in the result, but when
> using a relatively small dimension, the data is correct. This did not
> happen before the upgrade, and an old cube built on the previous version
> also shows the problem on 2.3.0. I guess the sum is dynamically
> calculated using an int type and overflows.
>    How can I fix this? Thanks.
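The int-overflow guess can be checked outside Kylin: reinterpreting a large sum as a signed 32-bit value reproduces exactly this kind of negative artifact. A small sketch (the 3,000,000,000 total is an arbitrary example, not a figure from this report):

```shell
# Simulate a SUM that exceeds the signed 32-bit range (2^31 - 1 = 2147483647).
total=3000000000                      # true sum, held in 64-bit shell arithmetic
wrapped=$(( total & 0xFFFFFFFF ))     # keep only the low 32 bits, as an int would
if [ "$wrapped" -gt 2147483647 ]; then
  wrapped=$(( wrapped - 4294967296 )) # reinterpret the top bit as the sign bit
fi
echo "$wrapped"                       # prints -1294967296: the negative artifact
```

If this is indeed the cause, declaring the SUM measure's return type as a 64-bit type such as bigint would avoid the wraparound.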


Call for contribution: Kylin Integration with Ambari

2018-03-12 Thread Billy Liu
Hello Kylin dev & user mailing lists,

Ambari is quite useful for cluster management, including service
start/stop, install/uninstall, monitoring and configuration. Kylin
could be an additional service managed by Ambari; I know many companies
have already deployed Kylin as an Ambari service. This is a call for
contribution: if you are interested in cluster management, have skills
in Python/shell, and have experience with the Ambari project, you are
the right person the community is looking for.

Any suggestion is welcomed.

With Warm regards

Billy Liu


Re: running spark on kylin 2.2

2018-02-28 Thread Billy Liu
Any exceptions in the logs?

With Warm regards

Billy Liu


2018-02-28 22:53 GMT+08:00 Sonny Heer <sonnyh...@gmail.com>:
> Anyone know what I need to set in order for spark-submit to use the HDP
> version of spark and not the internal one?
>
> currently i see:
>
> export HADOOP_CONF_DIR=/ebs/kylin/hadoop-conf &&
> /ebs/kylin/apache-kylin-2.2.0-bin/spark/bin/spark-submit
>
>
> I see in the kylin.properties files:
> ## Spark conf (default is in spark/conf/spark-defaults.conf)
>
> Although it doesn't show how I can change this to use the HDP spark-submit.
>
> Also, HDP is on Spark version 1.6.1 and Kylin internally uses 2.x.  Not
> sure if that matters during submit.  I can't seem to get more than 2
> executors to run without it failing with other errors.  We have about 44
> slots on our cluster.
>
> Also uncommented:
> ## uncomment for HDP
>
> kylin.engine.spark-conf.spark.driver.extraJavaOptions=-Dhdp.version=current
>
> kylin.engine.spark-conf.spark.yarn.am.extraJavaOptions=-Dhdp.version=current
>
> kylin.engine.spark-conf.spark.executor.extraJavaOptions=-Dhdp.version=current
>
> see attached for other properties set.
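For the question above about pointing Kylin at HDP's Spark rather than the bundled one: Kylin's launcher falls back to the bundled $KYLIN_HOME/spark when SPARK_HOME is unset, so exporting SPARK_HOME before starting Kylin is the usual lever. A sketch with typical HDP paths — verify them on your own cluster, and note (as the thread says) that Kylin's Spark cubing expects Spark 2.x, so HDP's Spark 1.6.1 may not actually work:

```shell
# Illustrative HDP paths -- confirm them on your own cluster before use.
export SPARK_HOME=/usr/hdp/current/spark-client   # HDP's Spark instead of $KYLIN_HOME/spark
export HADOOP_CONF_DIR=/etc/hadoop/conf
echo "spark-submit that will be used: $SPARK_HOME/bin/spark-submit"
```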


Re: Questions about data integrity in Kylin

2018-02-23 Thread Billy Liu
Hi  BELLIER,

I suggest reading some Kylin introduction documents or slides. They
will explain how Kylin works, for example:
https://www.slideshare.net/XuJiang2/kylin-hadoop-olap-engine

With Warm regards

Billy Liu


2018-02-21 19:53 GMT+08:00 BELLIER Jean-luc <jean-luc.bell...@rte-france.com>:
> Hello,
>
>
>
> I was wondering about a few things :
>
> · When I launch a query using filters on PART_DT (from the sample
> model), e.g. WHERE PART_DT=’2013-12-31’, I get a result through the Kylin
> web interface, whereas it gives me an error in Hive and Impala, indicating
> an unknown type on PART_DT. Does it mean that the data are not queried
> directly in Hive, but through a “copy”? This could explain why the syntax
> “DEFAULT. does not work in the query editor.
>
> · What does happen when the Hive tables are populated ? Should I
> resynchronize the tables or not ?
>
> · How is the data integrity ensured ? As far as I can notice, there
> is no control on data through the model creation interface; this supposes
> that the data are initially well-formed. So how does Kylin manage this
> (input mistakes, …) ?
>
>
>
> Thank you in advance for your help. Have a good day.
>
>
>
> Best regards,
>
> Jean-Luc.
>
>
>
> "Ce message est destiné exclusivement aux personnes ou entités auxquelles il
> est adressé et peut contenir des informations privilégiées ou
> confidentielles. Si vous avez reçu ce document par erreur, merci de nous
> l'indiquer par retour, de ne pas le transmettre et de procéder à sa
> destruction.
>
> This message is solely intended for the use of the individual or entity to
> which it is addressed and may contain information that is privileged or
> confidential. If you have received this communication by error, please
> notify us immediately by electronic mail, do not disclose it and delete the
> original message."


Re: Use impala for intermediate table

2018-02-22 Thread Billy Liu
The release is under voting. If you want early access, please download
the package from
https://dist.apache.org/repos/dist/dev/kylin/apache-kylin-2.3.0-rc1/

With Warm regards

Billy Liu


2018-02-22 23:15 GMT+08:00 Kumar, Manoj H <manoj.h.ku...@jpmorgan.com>:
> Did the 2.3 version release fully? Can we download it from the site? Does it
> also have support for HBase namespaces?
>
> I will send the details; if there are more dimension lookups, it
> takes more time.
>
>
>
> Sent with BlackBerry Work (www.blackberry.com)
> 
> From: ShaoFeng Shi <shaofeng...@apache.org>
> Sent: 22-Feb-2018 7:46 PM
> To: user <user@kylin.apache.org>
> Subject: Re: Use impala for intermediate table
>
> Did you analyze why the first step is slow?
>
> Kylin 2.3 will support using SparkSQL to perform the first step:
>
> https://issues.apache.org/jira/browse/KYLIN-3125
>
> SparkSQL should be more general than Impala. You can take a try.
>
> 2018-02-22 19:31 GMT+08:00 Kumar, Manoj H <manoj.h.ku...@jpmorgan.com>:
>>
>> Can we use Impala for the #1 step in cube building – I believe Impala
>> is faster than Hive, and it takes time to get the first step done using Hive. Is
>> there a way we can use it? Please advise.
>>
>>
>>
>> Regards,
>>
>> Manoj
>>
>>
>>
>> This message is confidential and subject to terms at:
>> http://www.jpmorgan.com/emaildisclaimer including on confidentiality, legal
>> privilege, viruses and monitoring of electronic messages. If you are not the
>> intended recipient, please delete this message and notify the sender
>> immediately. Any unauthorized use is strictly prohibited.
>
>
>
>
> --
> Best regards,
>
> Shaofeng Shi 史少锋
>


Re: Holes in merge segments problem.

2018-02-17 Thread Billy Liu
Please google "apply patch", and build a new binary package:
http://kylin.apache.org/development/howto_package.html

With Warm regards

Billy Liu


2018-02-16 12:45 GMT+08:00 Prasanna <prasann...@trinitymobility.com>:
> Can you please suggest how to use that patch? I don’t have any idea.
> Thank you for the reply.
>
>
>
> -Original Message-
> From: Billy Liu [mailto:billy...@apache.org]
> Sent: 15 February 2018 22:16
> To: user
> Subject: Re: Holes in merge segments problem.
>
> Please check if https://issues.apache.org/jira/browse/KYLIN-3048 could help
>
> With Warm regards
>
> Billy Liu
>
>
> 2018-02-15 21:02 GMT+08:00 Prasanna <prasann...@trinitymobility.com>:
>> Hi all,
>>
>>
>>
>> I used the below REST API command to get any holes between segments:
>>
>>
>>
>> /usr/bin/curl -b /home/hdfs/cookiefile.txt -X GET
>> http://10.1.0.7:7070/kylin/api/cubes/trinityicccCube/holes
>>
>>
>>
>> It is listed as an empty array:
>>
>>
>>
>> []
>>
>>
>>
>> But when doing the Merge option in the Kylin UI, it gives a "holes between
>> segments" error. Please suggest how to solve this.
>>
>>
>
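When the /holes endpoint returns an empty array but the UI merge still fails, one way to narrow it down is to issue the merge through the documented rebuild REST API and inspect the server's response directly. A hedged sketch reusing the host, cookie file, and cube name from the message above; the timestamps are placeholders, and the command is echoed (dry run) rather than executed:

```shell
# Dry run: print the merge request instead of sending it.
CUBE=trinityicccCube
PAYLOAD='{"startTime": 0, "endTime": 1518652800000, "buildType": "MERGE"}'
echo curl -b /home/hdfs/cookiefile.txt -X PUT \
     -H "Content-Type: application/json" \
     -d "$PAYLOAD" \
     "http://10.1.0.7:7070/kylin/api/cubes/$CUBE/rebuild"
```

Remove the leading `echo` to actually send the request; the JSON error body returned by the server usually names the exact segments it considers disconnected.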


Re: Using several active cubes in same project

2018-02-15 Thread Billy Liu
If two cubes could answer your query at the same time, Kylin will
choose the lower-cost cube. That's what happened in your case. You
could disable 'cube1' manually to test 'cube2'. If you need to keep
both of them active, please define them in different projects.

With Warm regards

Billy Liu


2018-02-14 18:59 GMT+08:00 BELLIER Jean-luc <jean-luc.bell...@rte-france.com>:
> Hello
>
>
>
> I have a project with two active cubes (status READY) based on the same
> model. To be complete, cube2 is a clone of cube1.
>
> I select cube2 and then go to the ‘Insight’ tab to run a query. But in the
> result panel, I can see ‘cube1’ and not ‘cube2’.
>
>
>
> Does it mean that there must only be one active cube per project (in other
> words, should I disable cube1, or move it to another project)?
>
>
>
> Thank you for your help.
>
>
>
> Best regards,
>
> Jean-Luc.
>
>
>


Re: Holes in merge segments problem.

2018-02-15 Thread Billy Liu
Please check if https://issues.apache.org/jira/browse/KYLIN-3048 could help

With Warm regards

Billy Liu


2018-02-15 21:02 GMT+08:00 Prasanna <prasann...@trinitymobility.com>:
> Hi all,
>
>
>
> I used the below REST API command to get any holes between segments:
>
>
>
> /usr/bin/curl -b /home/hdfs/cookiefile.txt -X GET
> http://10.1.0.7:7070/kylin/api/cubes/trinityicccCube/holes
>
>
>
> It is listed as an empty array:
>
>
>
> []
>
>
>
> But when doing the Merge option in the Kylin UI, it gives a "holes between
> segments" error. Please suggest how to solve this.
>
>


New Blog: Get Your Interactive Analytics Superpower, with Apache Kylin and Apache Superset

2018-01-29 Thread Billy Liu
Thanks Joanna He and Yongjie Zhao.

The blog introducing Kylin and SuperSet integration is published at
http://kylin.apache.org/blog/2018/01/01/kylin-and-superset/

Please have a try, and share your experience with Kylin community.


Re: Towards Kylin 2.3

2018-01-28 Thread Billy Liu
Here is the entry point: https://issues.apache.org/jira/browse/KYLIN

2018-01-29 13:16 GMT+08:00 Kumar, Manoj H <manoj.h.ku...@jpmorgan.com>:

> Not yet, I have not raised it. Please let me know the JIRA link, and I will
> do that.
>
>
>
> Regards,
>
> Manoj
>
>
>
> *From:* Billy Liu [mailto:billy...@apache.org]
> *Sent:* Monday, January 29, 2018 10:44 AM
> *To:* user <user@kylin.apache.org>
> *Cc:* dev <d...@kylin.apache.org>
> *Subject:* Re: Towards Kylin 2.3
>
>
>
> Hello Kumar,
>
>
>
> Thanks for the inputs. Do we have some JIRAs on this?
>
>
>
> 2018-01-29 13:11 GMT+08:00 Kumar, Manoj H <manoj.h.ku...@jpmorgan.com>:
>
> I would like to add following
>
>
>
> -How to enable custom rules for data transformation – while
> creating the data model, there should be some way for the user to define rules on
> the data being selected. Before building the cube, some rules need to be
> applied based on the user. Most OLAP solutions like ESSBASE/T1M do provide
> these features.
>
> -Entitlement of Data – How to do that in Kylin.
>
>
>
>
>
> Regards,
>
> Manoj
>
>
>
> *From:* Billy Liu [mailto:billy...@apache.org]
> *Sent:* Monday, January 29, 2018 10:36 AM
> *To:* dev <d...@kylin.apache.org>; user <user@kylin.apache.org>
> *Subject:* Towards Kylin 2.3
>
>
>
> Hello Kylin Dev & User community,
>
>
>
> Last Kylin 2.2.0 release was on 3rd Nov 2017, three months ago. I'd like
> to propose the next 2.3.0 release in the middle of Feb 2018. I volunteer to
> be the release manager.
>
>
>
> We have already resolved 200+ issues during the 2.3.0 development cycle,
> with many exciting features. Thank you, the community.
>
>
>
> I have already marked most of these issues with Fix Version 2.3.0. If
> you have some urgent or fix-ready issues not in 2.3.0, please let me
> know. Let's work together to build a better Kylin.
>
>
>
> A new docs23 directory has also been created for the new feature documents.
>
>
>
>
>


Re: Towards Kylin 2.3

2018-01-28 Thread Billy Liu
Hello Kumar,

Thanks for the inputs. Do we have some JIRAs on this?

2018-01-29 13:11 GMT+08:00 Kumar, Manoj H <manoj.h.ku...@jpmorgan.com>:

> I would like to add following
>
>
>
> -How to enable custom rules for data transformation – while
> creating the data model, there should be some way for the user to define rules on
> the data being selected. Before building the cube, some rules need to be
> applied based on the user. Most OLAP solutions like ESSBASE/T1M do provide
> these features.
>
> -Entitlement of Data – How to do that in Kylin.
>
>
>
>
>
> Regards,
>
> Manoj
>
>
>
> *From:* Billy Liu [mailto:billy...@apache.org]
> *Sent:* Monday, January 29, 2018 10:36 AM
> *To:* dev <d...@kylin.apache.org>; user <user@kylin.apache.org>
> *Subject:* Towards Kylin 2.3
>
>
>
> Hello Kylin Dev & User community,
>
>
>
> Last Kylin 2.2.0 release was on 3rd Nov 2017, three months ago. I'd like
> to propose the next 2.3.0 release in the middle of Feb 2018. I volunteer to
> be the release manager.
>
>
>
> We have already resolved 200+ issues during the 2.3.0 development cycle,
> with many exciting features. Thank you, the community.
>
>
>
> I have already marked most of these issues with Fix Version 2.3.0. If
> you have some urgent or fix-ready issues not in 2.3.0, please let me
> know. Let's work together to build a better Kylin.
>
>
>
> A new docs23 directory has also been created for the new feature documents.
>
>


Kylin Meetup Slides

2018-01-28 Thread Billy Liu
Hello Kylin community,

Last weekend, a joint Kylin and Alluxio meetup was hosted by Kyligence in
Shanghai. Four speakers were invited to give talks about Kylin and
Alluxio technology and solutions. Here are the slides shared with you:

https://pan.baidu.com/s/1pMYBv9x

(Some of the slides are Chinese, sorry for the inconvenience)


Billy Liu


Towards Kylin 2.3

2018-01-28 Thread Billy Liu
Hello Kylin Dev & User community,

The last Kylin release, 2.2.0, was on 3rd Nov 2017, three months ago. I'd like to
propose the next 2.3.0 release in the middle of Feb 2018. I volunteer to be
the release manager.

We have already resolved 200+ issues during the 2.3.0 development cycle,
with many exciting features. Thank you, the community.

I have already marked most of these issues with Fix Version 2.3.0. If you
have some urgent or fix-ready issues not in 2.3.0, please let me know.
Let's work together to build a better Kylin.

A new docs23 directory has also been created for the new feature documents.


Re: Use kylin 2.2.0 on HBase 0.98 and Hive 0.13

2018-01-25 Thread Billy Liu
There is no binary build for HBase 0.98. You could package your own Kylin
against HBase 0.98.

2018-01-26 9:42 GMT+08:00 苏启龙 :

>
> Hi guys,
>
> We are currently using Kylin 1.6.0 on a cluster with CDH 5.2, whose
> HBase version is 0.98 and Hive version is 0.13. Now we’d love to upgrade
> Kylin to 2.2.0. The Kylin 2.2 docs say these could be supported, but we
> found some HBase class-not-defined problems.
>
> So the question is: if we want to use Kylin 2.2.0 but prefer not to
> upgrade HBase and Hive, is there any long-term solution?
>
>
> Qilong
>
> Thanks.
>


Re: MDX queries on kylin cubes.

2018-01-18 Thread Billy Liu
Do you want to have a try on this?  http://dekarlab.de/wp/?p=363


2018-01-17 12:36 GMT+08:00 Prasanna :

> Hi all,
>
>
>
>   I am using Kylin version 2.2.0. At present I use only SQL-type queries
> on Kylin cubes, like SELECT with aggregation functions. I would like to use
> MDX queries on cubes. If anybody is using them, please can you guide me? Is
> any document available regarding this?
>
>
>
>
>
> Thanks,
>
> Prasanna.P
>


Re: Re: Support embedded JSON format ?

2018-01-15 Thread Billy Liu
When you say something is wrong, please include the complete log for that error.

2018-01-15 17:26 GMT+08:00 ShaoFeng Shi :

> Only replacing the jar in Tomcat may not work; please rebuild a new binary
> package.
>
> 2018-01-15 16:43 GMT+08:00 446463...@qq.com <446463...@qq.com>:
>
> > but I can't modify the ‘_’ character,
> > so I want to modify the source code in the source-kafka module,
> > in the StreamingParser.java class:
> > ```
> > static {
> > derivedTimeColumns.put("minute_start", 1);
> > derivedTimeColumns.put("hour_start", 2);
> > derivedTimeColumns.put("day_start", 3);
> > derivedTimeColumns.put("week_start", 4);
> > derivedTimeColumns.put("month_start", 5);
> > derivedTimeColumns.put("quarter_start", 6);
> > derivedTimeColumns.put("year_start", 7);
> > defaultProperties.put(PROPERTY_TS_COLUMN_NAME, "timestamp");
> > defaultProperties.put(PROPERTY_TS_PARSER,
> > "org.apache.kylin.source.kafka.DefaultTimeParser");
> > defaultProperties.put(PROPERTY_TS_PATTERN,
> > DateFormat.DEFAULT_DATETIME_PATTERN_WITHOUT_MILLISECONDS);
> >  + defaultProperties.put(EMBEDDED_PROPERTY_SEPARATOR, ".");
> >  -  defaultProperties.put(EMBEDDED_PROPERTY_SEPARATOR, "_");
> > }
> > ```
> > and I rebuilt this module and replaced it in the Tomcat webapps,
> > but the error still happened.
> > I wonder why?
> >
> >
> >
> > 446463...@qq.com
> >
> > From: ShaoFeng Shi
> > Date: 2018-01-15 16:37
> > To: dev
> > CC: user
> > Subject: Re: Support embedded JSON format ?
> > Hi,
> >
> > It supports the embedded format, but it has a conflict with "_" in the
> > JSON.
> >
> > You can check: https://issues.apache.org/jira/browse/KYLIN-3145
> >
> > To bypass at this moment, you can create a message which has no "_" in
> the
> > property name.
> >
> >
> > 2018-01-15 16:19 GMT+08:00 446463...@qq.com <446463...@qq.com>:
> >
> > > Hi:
> > > Kylin has supported embedded JSON format since Kylin 1.6.0,
> > > and I use a streaming cube with Kafka.
> > > My data:
> > >  {
> > > "data": {
> > > "account": "5942153",
> > > "actual_amount": "30.0",
> > > "app_name": "",
> > > "app_no": "",
> > > "app_session": "xxx",
> > > "app_type_code": "xxx",
> > > "button_name": "",
> > > "dur": "",
> > > "forum_version": "new",
> > > "order_Time": "2018-01-11 16:38:43 0",
> > > "order_no": "OD20180638322140",
> > > "order_type": "xxx",
> > > "original_price": "xxx",
> > > "page_name": "xxx",
> > > "pay_result": "",
> > > "pay_type": "",
> > > "product_name": "xxx",
> > > "scan_app": "",
> > > "scan_result": ""
> > > },
> > > "device":""
> > > }
> > >  but building the cube throws an error:
> > > Property ‘app’ is not embedded format?
> > >
> > >
> > >
> > >
> > >
> > > 446463...@qq.com
> > >
> >
> >
> >
> > --
> > Best regards,
> >
> > Shaofeng Shi 史少锋
> >
>
>
>
> --
> Best regards,
>
> Shaofeng Shi 史少锋
>
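The conflict described in KYLIN-3145 can be seen with a one-line thought experiment: once nested properties are flattened with "_" as the separator, a column name cannot be split back unambiguously whenever a leaf property itself contains "_". A small illustration using bash string substitution, with a property name from the sample message:

```shell
# "data" + separator "_" + leaf "order_no" flattens to the same string as
# "data" + "order" + "no" -- the parse is ambiguous, hence the error.
col="data_order_no"
echo "split on first '_':  ${col/_/.}"    # data.order_no
echo "split on every '_':  ${col//_/.}"   # data.order.no
```

This is why switching the separator to "." (as in the patch above) removes the ambiguity for property names that contain underscores.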


Re: cachesync.Broadcaster:172 : error running wiping java.lang.IllegalArgumentException

2018-01-09 Thread Billy Liu
This document http://kylin.apache.org/docs21/install/advance_settings.html
introduces how to allocate more memory for Kylin. Could you have a try?

2018-01-09 15:42 GMT+08:00 Kumar, Manoj H <manoj.h.ku...@jpmorgan.com>:

> What's the JVM size given for Kylin? Try with the 16 GB memory setting
> mentioned in the setenv.sh file. You need a large JVM for loading billions of rows.
>
>
>
> Regards,
>
> Manoj
>
>
>
> *From:* Ruslan Dautkhanov [mailto:dautkha...@gmail.com]
> *Sent:* Tuesday, January 09, 2018 12:59 PM
> *To:* user@kylin.apache.org
> *Subject:* Re: cachesync.Broadcaster:172 : error running wiping java.lang.
> IllegalArgumentException
>
>
>
> Hi Billy,
>
>
>
> Thank you - that was it. Although it was just hiding real problem - with
> jvm memory [1].
>
>
>
> I have requested to build a cube on 12b records and can't now make Kylin
> back to stable state.
>
> Kylin crashes on startup.
>
> I have already tried to tune up Xmx and such but still no luck [2].
>
>
>
> [1]
>
>
>
> INFO: Starting ProtocolHandler ["ajp-bio-9009"]
> Jan 09, 2018 12:09:49 AM org.apache.catalina.startup.Catalina start
> INFO: Server startup in 12762 ms
> java.lang.OutOfMemoryError: Requested array size exceeds VM limit
> Dumping heap to java_pid60827.hprof ...
> Heap dump file created [1731330948 bytes in 2.963 secs]
> #
> # java.lang.OutOfMemoryError: Requested array size exceeds VM limit
> # -XX:OnOutOfMemoryError="kill -9 %p"
> #   Executing /bin/sh -c "kill -9 60827"...
> Java HotSpot(TM) 64-Bit Server VM warning: ignoring option
> MaxPermSize=256M; support was removed in 8.0
> Java HotSpot(TM) 64-Bit Server VM warning: Using incremental CMS is
> deprecated and will likely be removed in a future release
> Error occurred during initialization of VM
> java.lang.OutOfMemoryError: unable to create new native thread
> at java.lang.Thread.start0(Native Method)
> at java.lang.Thread.start(Thread.java:717)
> at java.lang.ref.Reference.(Reference.java:232)
>
>
>
>
>
> [2]
>
>
>
> export KYLIN_JVM_SETTINGS="-Xms2g -Xmx16g -Xss2g -XX:MaxPermSize=256M
> -verbose:gc -XX:+PrintGCDetails -XX:+PrintGCDateStamps
> -Xloggc:$KYLIN_HOME/logs/kylin.gc.$$ -XX:+UseGCLogFileRotation
> -XX:NumberOfGCLogFiles=10 -XX:GCLogFileSize=64M"
>
>
>
>
>
>
>
> --
> Ruslan Dautkhanov
>
>
>
> On Mon, Jan 8, 2018 at 8:29 PM, Billy Liu <billy...@apache.org> wrote:
>
> Hello Ruslan,
>
>
>
> Could you check the "kylin.server.cluster-servers" setting. There should be no
> "http://" prefix.
>
>
>
> 2018-01-09 0:59 GMT+08:00 Ruslan Dautkhanov <dautkha...@gmail.com>:
>
> Below error crashes Kylin.
>
> Attempts to start Kylin back results in the same exception.
>
> This started happening after a new cube build was scheduled..
>
> Is this a known bug and any ideas how to work around this ?
>
>
>
>
>
> 2018-01-08 09:54:06,427 INFO  [Scheduler 1698178565 Job
> b595302c-35d4-45fd-a895-6766747eeae7-81] cube.CubeManager:358 : Updating
> cube instance 'mv2cube'
>
>
>
> 2018-01-08 09:54:06,430 WARN  [Scheduler 1698178565 Job
> b595302c-35d4-45fd-a895-6766747eeae7-81] model.Segments:421 : NEW segment
> start does not fit/connect with other segments: mv2cube[2010010100_
> 2018010100]
>
>
>
> 2018-01-08 09:54:06,430 WARN  [Scheduler 1698178565 Job
> b595302c-35d4-45fd-a895-6766747eeae7-81] model.Segments:423 : NEW segment
> end does not fit/connect with other segments: mv2cube[2010010100_
> 2018010100]
>
>
>
>
> 2018-01-08 09:54:06,454 INFO  [Scheduler 1698178565 Job
> b595302c-35d4-45fd-a895-6766747eeae7-81] cli.DictionaryGeneratorCLI:57 :
> Building dictionary for MV_PROD.CONVERTED_FACT.INDIVID
> 2018-01-08 09:54:06,454 DEBUG [pool-7-thread-1] cachesync.Broadcaster:141
> : Servers in the cluster: [http://pc1udatahad03:7070]
>
>
>
>
> 2018-01-08 09:54:06,466 ERROR [pool-7-thread-1]
> cachesync.Broadcaster:172 : error running wiping
> java.lang.IllegalArgumentException: URI: http://pc1udatahad03:7070 --
> does not match pattern (?:([^:]+)[:]([^@]+)[@])?([^:]+)(?:[:](\d+))?
> at org.apache.kylin.common.restclient.RestClient.(
> RestClient.java:91)
> at org.apache.kylin.metadata.cachesync.Broadcaster$1.run(
> Broadcaster.java:144)
> at java.util.concurrent.ThreadPoolExecutor.runWorker(
> ThreadPoolExecutor.java:1149)
> at java.util.concurrent.ThreadPoolExecutor$Worker.run(
> ThreadPoolExecutor.java:624)
> at java.lang.Thread.run(Thread.java:748)
>
>
>
>
>
>
>
> This message is confidential and subject to terms at: http://
> www.jpmorgan.com/emaildisclaimer including on confidentiality, legal
> privilege, viruses and monitoring of electronic messages. If you are not
> the intended recipient, please delete this message and notify the sender
> immediately. Any unauthorized use is strictly prohibited.
>
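One detail worth flagging in the settings quoted above: -Xss sets the per-thread stack size, so -Xss2g reserves 2 GB of stack for every thread, which by itself can produce the "unable to create new native thread" failure seen in the log. A hedged alternative sizing, with illustrative heap values rather than an official recommendation:

```shell
# -Xss back to a sane per-thread stack; -Xms/-Xmx values are examples only.
export KYLIN_JVM_SETTINGS="-Xms4g -Xmx16g -Xss1m -XX:MaxPermSize=256M \
-verbose:gc -XX:+PrintGCDetails -XX:+PrintGCDateStamps \
-Xloggc:\$KYLIN_HOME/logs/kylin.gc.\$\$ -XX:+UseGCLogFileRotation \
-XX:NumberOfGCLogFiles=10 -XX:GCLogFileSize=64M"
echo "$KYLIN_JVM_SETTINGS" | grep -o 'Xss1m'
```

The "Requested array size exceeds VM limit" error earlier in the log is a separate problem (a single allocation larger than the heap can hold) and points at heap sizing or the cube design, not the stack.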


Re: cachesync.Broadcaster:172 : error running wiping java.lang.IllegalArgumentException

2018-01-08 Thread Billy Liu
Hello Ruslan,

Could you check the "kylin.server.cluster-servers" setting. There should be no
"http://" prefix.

2018-01-09 0:59 GMT+08:00 Ruslan Dautkhanov :

> Below error crashes Kylin.
> Attempts to start Kylin back results in the same exception.
> This started happening after a new cube build was scheduled..
> Is this a known bug and any ideas how to work around this ?
>
>
> 2018-01-08 09:54:06,427 INFO  [Scheduler 1698178565 Job
>> b595302c-35d4-45fd-a895-6766747eeae7-81] cube.CubeManager:358 : Updating
>> cube instance 'mv2cube'
>>
>
>
>> 2018-01-08 09:54:06,430 WARN  [Scheduler 1698178565 Job
>> b595302c-35d4-45fd-a895-6766747eeae7-81] model.Segments:421 : NEW
>> segment start does not fit/connect with other segments:
>> mv2cube[2010010100_2018010100]
>>
>
>
>> 2018-01-08 09:54:06,430 WARN  [Scheduler 1698178565 Job
>> b595302c-35d4-45fd-a895-6766747eeae7-81] model.Segments:423 : NEW
>> segment end does not fit/connect with other segments:
>> mv2cube[2010010100_2018010100]
>
>
>
>>
>> 2018-01-08 09:54:06,454 INFO  [Scheduler 1698178565 Job
>> b595302c-35d4-45fd-a895-6766747eeae7-81] cli.DictionaryGeneratorCLI:57 :
>> Building dictionary for MV_PROD.CONVERTED_FACT.INDIVID
>> 2018-01-08 09:54:06,454 DEBUG [pool-7-thread-1] cachesync.Broadcaster:141
>> : Servers in the cluster: [http://pc1udatahad03:7070]
>
>
>
>>
>> 2018-01-08 09:54:06,466 ERROR [pool-7-thread-1]
>> cachesync.Broadcaster:172 : error running wiping
>> java.lang.IllegalArgumentException: URI: http://pc1udatahad03:7070 --
>> does not match pattern (?:([^:]+)[:]([^@]+)[@])?([^:]+)(?:[:](\d+))?
>> at org.apache.kylin.common.restclient.RestClient.(
>> RestClient.java:91)
>> at org.apache.kylin.metadata.cachesync.Broadcaster$1.run(
>> Broadcaster.java:144)
>> at java.util.concurrent.ThreadPoolExecutor.runWorker(
>> ThreadPoolExecutor.java:1149)
>> at java.util.concurrent.ThreadPoolExecutor$Worker.run(
>> ThreadPoolExecutor.java:624)
>> at java.lang.Thread.run(Thread.java:748)
>
>
>
>
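The regex in the exception — (?:([^:]+)[:]([^@]+)[@])?([^:]+)(?:[:](\d+))? — accepts host, host:port, or user:pwd@host:port, but never a scheme, which is why the http:// prefix breaks the wiping broadcast. A minimal fragment of what the property should look like, with the host name taken from the log above:

```shell
# Write and show a scheme-less cluster-servers entry, as the regex expects.
cat <<'EOF' > /tmp/cluster-servers.fragment
kylin.server.cluster-servers=pc1udatahad03:7070
EOF
cat /tmp/cluster-servers.fragment
```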


Re: superset Kylin connector

2017-12-28 Thread Billy Liu
Thank you, Joanna. Very interesting project; the community is expecting
more integration between Superset and Kylin. Would you like to share more
use cases of the Superset application? Maybe we could publish them through
the Kylin blog.

2017-12-29 14:05 GMT+08:00 Joanna He :

> Hello kylin community,
>
> We recently open sourced a superset kylin connector.
> This connector has addressed major issues between superset and Kylin
> connection.
>
> Please have a try if you are interested and let me know your feedback.
>
> You may read more about the connector from this repository:
>
> https://github.com/Kyligence/kylinpy
>
>
>
> Joanna He
>


Re: Does kylin support Hadoop 3.0?

2017-12-22 Thread Billy Liu
Not yet, but there are some contributors working on that. As far as I know,
the gap is very small.

2017-12-22 10:34 GMT+08:00 jxs :

> Hi, kylin developers,
>
> My colleague is building a new small cluster with Hadoop 3.0. Since
> version 3.0 was just released, I am wondering if Kylin supports it?
> Thanks.
>


Re: Re: Re: How to config kylin to use kerberized hadoop cluster

2017-12-17 Thread Billy Liu
There is no kerberized version of Kylin, since no additional settings are
required from Kylin.

2017-12-18 14:29 GMT+08:00 bubugao0809 <bubugao0...@163.com>:

>
> So, does that mean Kylin does not need additional settings to be used on a
> kerberized Hadoop cluster?
> What's more, does Kylin have a kerberized version of its own?
>
>
>
>
> At 2017-12-18 14:23:29, "Billy Liu" <billy...@apache.org> wrote:
>
> kylin.job.status.with.kerberos=true has been deprecated since 
> https://issues.apache.org/jira/browse/KYLIN-1319, and the code was removed recently.
>
> please check your YARN configuration file. For reference: 
> http://pivotalhd-210.docs.pivotal.io/doc/2100/webhelp/topics/ConfiguringKerberosforHDFSandYARNMapReduce.html
>
>
> 2017-12-18 14:14 GMT+08:00 bubugao0809 <bubugao0...@163.com>:
>
>>
>>
>> HBase has been successfully kerberized, because otherwise HBase could not
>> be started, and mine works properly.
>> Can you tell me what setting leads to the error?
>> @{alex...@163.com}
>>
>>
>>
>> At 2017-12-18 12:47:49, "alexinx" <alex...@163.com> wrote:
>>
>> I have encountered a similar problem.
Just check your hbase's configuration; it might not use Kerberos.
>>
>>
>> On 12/18/2017 11:38,bubugao0809<bubugao0...@163.com>
>> <bubugao0...@163.com> wrote:
>>
>> Our hadoop cluster has been kerberized, including hdfs , yarn , hbase ,
>> hive
>> I don't know how to config kylin to integrate with current cluster, cause
>> it always fail in step Build N-Dimension Cuboid with :
>>
>> Exception: java.io.IOException: Failed on local exception: 
>> java.io.IOException: Server asks us to fall back to SIMPLE auth, but this 
>> client is configured to only allow secure connections.; Host Details : local 
>> host is: "dp-test-04.aispeech.com/172.16.20.159"; destination host is: 
>> "dp-test-01.aispeech.com":10020;
>> java.io.IOException: Failed on local exception: java.io.IOException: Server 
>> asks us to fall back to SIMPLE auth, but this client is configured to only 
>> allow secure connections.; Host Details : local host is: 
>> "dp-test-04.aispeech.com/172.16.20.159"; destination host is: 
>> "dp-test-01.aispeech.com":10020;
>>
>> FYI, I tried 'kylin.job.status.with.kerberos=true', but kylin reported the 
>> same error as above.
>>
>>
>


Re: Re: Strange HBase rpc operation timeout error

2017-12-17 Thread Billy Liu
Actually, in your questions there are two HBase timeouts. One is about the
cube build; the other one is metadata access.
For the first issue, please check this article:
http://kylin.apache.org/docs21/install/kylin_aws_emr.html  It introduces
how to increase the HBase rpc timeout.
For the second issue, as discussed previously, we should keep it as-is.

2017-12-17 10:37 GMT+08:00 jxs <jxsk...@126.com>:

> Hi Billy,
> Thank you for pointing out the previous discussion. But for now we are running
> a very small HBase cluster for lower cost, which has only one slave node.
> So the unsteady response time (in a range not too bad, e.g. within 1
> minute) is somehow acceptable.
> The previous timeout error just interrupted the cube building procedure;
> we don't want that.
> What is your suggestion for this use case?
>
>
>
> 在2017年12月16 11时48分, "Billy Liu"<billy...@apache.org>写道:
>
>
> Check this: http://apache-kylin.74782.x6.nabble.com/hbase-configed-with-fixed-value-td9241.html
>
> 2017-12-15 18:03 GMT+08:00 jxs <jxsk...@126.com>:
>
>> Hi,
>>
>> Finally, I found this in org.apache.kylin.storage.hbase
>> .HBaseResourceStore:
>>
>> ```
>> private StorageURL buildMetadataUrl(KylinConfig kylinConfig) throws
>> IOException {
>> StorageURL url = kylinConfig.getMetadataUrl();
>> if (!url.getScheme().equals("hbase"))
>> throw new IOException("Cannot create HBaseResourceStore. Url
>> not match. Url: " + url);
>>
>> // control timeout for prompt error report
>> Map<String, String> newParams = new LinkedHashMap<>();
>> newParams.put("hbase.client.scanner.timeout.period", "10000");
>> newParams.put("hbase.rpc.timeout", "5000");
>> newParams.put("hbase.client.retries.number", "1");
>> newParams.putAll(url.getAllParameters());
>>
>> return url.copy(newParams);
>> }
>> ```
>> Is this related to the timeout error? Why are these params hard-coded
>> instead of read from configuration, and is there any workaround for this
>> timeout error?
>>
>>
>> 在2017年12月15 16时03分, "jxs"<jxsk...@126.com>写道:
>>
>>
>> Hi, kylin users,
>>
>> I encountered a strange timeout error today when building a cube.
>>
>> By "strange", I mean the "hbase.rpc.timeout" configuration is set to 60000
>> in HBase, but I get "org.apache.hadoop.hbase.ipc.CallTimeoutException:
>> Call id=8099904, waitTime=5001, operationTimeout=5000 expired" errors.
>>
>> Kylin version 2.2.0, running on EMR; it ran without error for about
>> half a month, then suddenly it did not work. The current cube is not the
>> biggest one.
>> I am wondering where should I look, any help is appreciated.
>>
>> The traceback from log:
>>
>> ```
>> 2017-12-15 06:46:57,892 ERROR [Scheduler 2090031901
>> Job c9067736-eac7-48ad-88f3-dbd6f4e870ae-167]
>> execution.ExecutableManager:149 : fail to get job
>> output:c9067736-eac7-48ad-88f3-dbd6f4e870ae-14
>> org.apache.kylin.job.exception.PersistentException:
>> org.apache.hadoop.hbase.client.RetriesExhaustedException: Failed after
>> attempts=1, exceptions:
>> Fri Dec 15 14:46:57 GMT+08:00 2017, 
>> RpcRetryingCaller{globalStartTime=1513320412890,
>> pause=100, retries=1}, java.io.IOException: Call to
>> ip-172-31-5-71.cn-north-1.compute.internal/172.31.5.71:16020 failed on
>> local exception: org.apache.hadoop.hbase.ipc.CallTimeoutException: Call
>> id=8099904, waitTime=5001, operationTimeout=5000 expired.
>>
>> at org.apache.kylin.job.dao.ExecutableDao.getJobOutput(Executab
>> leDao.java:202)
>> at org.apache.kylin.job.execution.ExecutableManager.getOutput(
>> ExecutableManager.java:145)
>> at org.apache.kylin.job.execution.AbstractExecutable.getOutput(
>> AbstractExecutable.java:312)
>> at org.apache.kylin.job.execution.AbstractExecutable.isDiscarde
>> d(AbstractExecutable.java:392)
>> at org.apache.kylin.engine.mr.common.MapReduceExecutable.doWork
>> (MapReduceExecutable.java:149)
>> at org.apache.kylin.job.execution.AbstractExecutable.execute(
>> AbstractExecutable.java:125)
>> at org.apache.kylin.job.execution.DefaultChainedExecutable.doWo
>> rk(DefaultChainedExecutable.java:64)
>> at org.apache.kylin.job.execution.AbstractExecutable.execute(
>> AbstractExecutable.java:125)
>> at o
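On the "why are these params hard-coded" question: reading buildMetadataUrl() as quoted above, newParams.putAll(url.getAllParameters()) runs after the defaults are set, so parameters carried on the metadata URL itself should override the hard-coded timeouts. A hedged kylin.properties fragment — verify the StorageURL parameter syntax on your Kylin version before relying on it:

```shell
# Append HBase client overrides as StorageURL parameters on the metadata URL;
# the values (60000 ms rpc timeout, 3 retries) are illustrative.
cat <<'EOF' > /tmp/metadata-url.fragment
kylin.metadata.url=kylin_metadata@hbase,hbase.rpc.timeout=60000,hbase.client.retries.number=3
EOF
cat /tmp/metadata-url.fragment
```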

Re: Strange HBase rpc operation timeout error

2017-12-15 Thread Billy Liu
Check this:
http://apache-kylin.74782.x6.nabble.com/hbase-configed-with-fixed-value-td9241.html

2017-12-15 18:03 GMT+08:00 jxs :

> Hi,
>
> Finally, I found this in org.apache.kylin.storage.hbase.HBaseResourceStore:
>
> ```
> private StorageURL buildMetadataUrl(KylinConfig kylinConfig) throws IOException {
>     StorageURL url = kylinConfig.getMetadataUrl();
>     if (!url.getScheme().equals("hbase"))
>         throw new IOException("Cannot create HBaseResourceStore. Url not match. Url: " + url);
>
>     // control timeout for prompt error report
>     Map<String, String> newParams = new LinkedHashMap<>();
>     newParams.put("hbase.client.scanner.timeout.period", "1");
>     newParams.put("hbase.rpc.timeout", "5000");
>     newParams.put("hbase.client.retries.number", "1");
>     newParams.putAll(url.getAllParameters());
>
>     return url.copy(newParams);
> }
> ```
> Is this related to the timeout error? Why are these params hard-coded
> instead of read from configuration? Is there any workaround for this
> timeout error?
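A side note on the snippet above: because `newParams.putAll(url.getAllParameters())` runs after the hard-coded defaults are set, parameters embedded in the metadata URL should take precedence over them. A hypothetical kylin.properties override might look like the fragment below (the `identifier@scheme,key=value` StorageURL syntax is an assumption to verify against your Kylin version):

```properties
# Hypothetical override: append HBase client params to the metadata URL so
# they replace the hard-coded defaults (verify syntax for your Kylin version).
kylin.metadata.url=kylin_metadata@hbase,hbase.rpc.timeout=60000,hbase.client.retries.number=3
```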
>
>
> On 2017-12-15 at 16:03, "jxs" wrote:
>
>
> Hi, kylin users,
>
> I encountered a strange timeout error today when building a cube.
>
> By "strange", I mean that the "hbase.rpc.timeout" configuration is set to 6
> in HBase, but I get "org.apache.hadoop.hbase.ipc.CallTimeoutException:
> Call id=8099904, waitTime=5001, operationTimeout=5000 expired" errors.
>
> Kylin version 2.2.0, running on EMR. It ran without error for about half
> a month, then suddenly stopped working; the current cube is not the biggest one.
> I am wondering where I should look; any help is appreciated.
>
> The traceback from log:
>
> ```
> 2017-12-15 06:46:57,892 ERROR [Scheduler 2090031901 Job c9067736-eac7-48ad-88f3-dbd6f4e870ae-167] execution.ExecutableManager:149 : fail to get job output:c9067736-eac7-48ad-88f3-dbd6f4e870ae-14
> org.apache.kylin.job.exception.PersistentException: org.apache.hadoop.hbase.client.RetriesExhaustedException: Failed after attempts=1, exceptions:
> Fri Dec 15 14:46:57 GMT+08:00 2017, RpcRetryingCaller{globalStartTime=1513320412890, pause=100, retries=1}, java.io.IOException: Call to ip-172-31-5-71.cn-north-1.compute.internal/172.31.5.71:16020 failed on local exception: org.apache.hadoop.hbase.ipc.CallTimeoutException: Call id=8099904, waitTime=5001, operationTimeout=5000 expired.
>
> at org.apache.kylin.job.dao.ExecutableDao.getJobOutput(ExecutableDao.java:202)
> at org.apache.kylin.job.execution.ExecutableManager.getOutput(ExecutableManager.java:145)
> at org.apache.kylin.job.execution.AbstractExecutable.getOutput(AbstractExecutable.java:312)
> at org.apache.kylin.job.execution.AbstractExecutable.isDiscarded(AbstractExecutable.java:392)
> at org.apache.kylin.engine.mr.common.MapReduceExecutable.doWork(MapReduceExecutable.java:149)
> at org.apache.kylin.job.execution.AbstractExecutable.execute(AbstractExecutable.java:125)
> at org.apache.kylin.job.execution.DefaultChainedExecutable.doWork(DefaultChainedExecutable.java:64)
> at org.apache.kylin.job.execution.AbstractExecutable.execute(AbstractExecutable.java:125)
> at org.apache.kylin.job.impl.threadpool.DefaultScheduler$JobRunner.run(DefaultScheduler.java:144)
> at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1149)
> at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:624)
> at java.lang.Thread.run(Thread.java:748)
> Caused by: org.apache.hadoop.hbase.client.RetriesExhaustedException: Failed after attempts=1, exceptions:
> Fri Dec 15 14:46:57 GMT+08:00 2017, RpcRetryingCaller{globalStartTime=1513320412890, pause=100, retries=1}, java.io.IOException: Call to ip-172-31-5-71.cn-north-1.compute.internal/172.31.5.71:16020 failed on local exception: org.apache.hadoop.hbase.ipc.CallTimeoutException: Call id=8099904, waitTime=5001, operationTimeout=5000 expired.
> ```
>
>
>


Re: kylin1.6 cluster sometimes fails to sync newly created cubes to other nodes

2017-12-14 Thread Billy Liu
Hello Chenping,

Quite a few metadata-sync-related issues have been resolved recently, including:
https://issues.apache.org/jira/browse/KYLIN-2834
https://issues.apache.org/jira/browse/KYLIN-2858

We strongly suggest upgrading to the latest Kylin 2.2.

2017-12-12 18:12 GMT+08:00 chenping...@keruyun.com 
:

> Hi all, I've recently run into an issue with kylin1.6 on cdh5.8.4.
> The configuration of the managing Kylin node is as follows:
> ### SERVICE ###
>
> # Kylin server mode, valid value [all, query, job]
> kyin.server.mode=all
>
> # Optional information for the owner of kylin platform,
> it can be your team's email
> # Currently it will be attached to each kylin's htable attribute
> kylin.owner=who...@kylin.apache.org
>
> # List of web servers in use, this enables one web server
> instance to sync up with other servers.
> #kylin.rest.servers=localhost:7070
> kylin.rest.servers=prod-slave5:7070,prod-slave4:7070,prod-slave3:7070
>
> # Display timezone on UI,format like[GMT+N or GMT-N]
> kylin.rest.timezone=GMT+8
>
> ### SOURCE ###
>
> # Hive client, valid value [cli, beeline]
> kylin.hive.client=cli
>
> # Parameters for beeline client, only necessary if hive client is beeline
> #kylin.hive.beeline.params=-n root --hiveconf hive.security.
> authorization.sqlstd.confwhitelist.append='mapreduce.job.*|dfs.*' -u '
> jdbc:hive2://localhost:
> 1'
>
> kylin.hive.keep.flat.table=false
>
> ### STORAGE ###
>
> # The metadata store in hbase
> kylin.metadata.url=kylin_metadata@hbase
>
> # The storage for final cube file in hbase
> kylin.storage.url=hbase
>
> # Working folder in HDFS, make sure user has the right
> access to the hdfs directory
> kylin.hdfs.working.dir=/kylin
>
> # Compression codec for htable, valid value [none, snappy, lzo, gzip, lz4]
> kylin.hbase.default.compression.codec=none
>
> # HBase Cluster FileSystem, which serving hbase, format
> as hdfs://hbase-cluster:8020
> # Leave empty if hbase running on same cluster with hive and mapreduce
> #kylin.hbase.cluster.fs=
>
> # The cut size for hbase region, in GB.
> kylin.hbase.region.cut=5
>
> # The hfile size of GB, smaller hfile leading to the
> converting hfile MR has more reducers and be faster.
> # Set 0 to disable this optimization.
> kylin.hbase.hfile.size.gb=2
>
> kylin.hbase.region.count.min=1
> kylin.hbase.region.count.max=500
>
> ### JOB ###
>
> # max job retry on error, default 0: no retry
> kylin.job.retry=0
>
> # If true, job engine will not assume that hadoop CLI
> reside on the same server as it self
> # you will have to specify kylin.job.remote.cli.hostname,
>  kylin.job.remote.cli.username and kylin.job.remote.cli.password
> # It should not be set to "true" unless you're NOT
> running Kylin.sh on a hadoop client machine
> # (Thus kylin instance has to ssh to another real hadoop
> client machine to execute hbase,hive,hadoop commands)
> kylin.job.run.as.remote.cmd=false
>
> # Only necessary when kylin.job.run.as.remote.cmd=true
> kylin.job.remote.cli.hostname=
> kylin.job.remote.cli.port=22
>
> # Only necessary when kylin.job.run.as.remote.cmd=true
> kylin.job.remote.cli.username=
>
> # Only necessary when kylin.job.run.as.remote.cmd=true
> kylin.job.remote.cli.password=
>
> # Used by test cases to prepare synthetic data for sample cube
> kylin.job.remote.cli.working.dir=/tmp/kylin
>
> # Max count of concurrent jobs running
> kylin.job.concurrent.max.limit=10
>
> # Time interval to check hadoop job status
> kylin.job.yarn.app.rest.check.interval.seconds=10
>
> # Hive database name for putting the intermediate flat tables
> kylin.job.hive.database.for.intermediatetable=default
>
> # The percentage of the sampling, default 100%
> kylin.job.cubing.inmem.sampling.percent=100
>
> # Whether get job status from resource manager with
> kerberos authentication
> kylin.job.status.with.kerberos=false
>
> kylin.job.mapreduce.default.reduce.input.mb=500
>
> kylin.job.mapreduce.max.reducer.number=500
>
> kylin.job.mapreduce.mapper.input.rows=100
>
> kylin.job.step.timeout=7200
>
> ### CUBE ###
>
> # 'auto', 'inmem', 'layer' or 'random' for testing
> kylin.cube.algorithm=auto
>
> kylin.cube.algorithm.auto.threshold=8
>
> kylin.cube.aggrgroup.max.combination=4096
>
> kylin.dictionary.max.cardinality=500
>
> kylin.table.snapshot.max_mb=300
>
> ### QUERY ###
>
> kylin.query.scan.threshold=1000
>
> # 3G
> kylin.query.mem.budget=3221225472
>
> kylin.query.coprocessor.mem.gb=3
>
> # the default coprocessor timeout is (hbase.rpc.timeout
> * 0.9) / 1000 seconds,
> # you can set it to a smaller value. 0 means use default.
> # kylin.query.coprocessor.timeout.seconds=0
>
> # Enable/disable ACL check for cube query
> kylin.query.security.enabled=true
>
> kylin.query.cache.enabled=true
>
> ### SECURITY ###
>
> # Spring security profile, options: testing, ldap, saml
> # with "testing" profile, user can use pre-defined name/
> pwd like KYLIN/ADMIN to login
> kylin.security.profile=testing
>
> ### SECURITY ###
> # Default roles and admin roles in LDAP, for 

Re: Kylin Hue Integration Blog

2017-12-14 Thread Billy Liu
Very useful. Thanks, Joanna. I hope we can repost it on the Kylin website
as well.

2017-12-14 18:04 GMT+08:00 JHe :

> We have recently explored integrating Kylin with Hue and have contributed
> a blog to the Hue community.
>
> In this blog, we explained how you can integrate Kylin with Hue locally or
> on AWS EMR.
>
> If you are interested, please take a look.
> http://gethue.com/using-hue-to-interact-with-apache-kylin/
>
>
>
>
>
>
>


Re: A problem about retention rate analyze

2017-12-07 Thread Billy Liu
This is not by design. Could you show more of the exception logs?

2017-12-07 16:22 GMT+08:00 skyyws :

> Hi guys,
> I found that Kylin supports a retention-rate analysis function, so I ran
> some tests on it. The following SQL executed successfully:
>
> ---
>
> select city, version,
>        intersect_count(uuid, dt, array['20161014']) as first_day,
>        intersect_count(uuid, dt, array['20161015']) as second_day,
>        intersect_count(uuid, dt, array['20161016']) as third_day,
>        intersect_count(uuid, dt, array['20161014', '20161015']) as retention_oneday,
>        intersect_count(uuid, dt, array['20161014', '20161015', '20161016']) as retention_twoday
> from visit_log
> where dt in ('2016104', '20161015', '20161016')
> group by city, version
>
> ---
> But other SQLs failed, like these:
>
> ---
>
> select city, version,
>        intersect_count(uuid, dt, array['20161014', '20161015']) as retention_oneday
> from visit_log
> where dt in ('2016104', '20161015',)
> group by city, version
>
> ---
>
> select city, version,
>        intersect_count(uuid, dt, array['20161014', '20161015', '20161016']) as retention_twoday
> from visit_log
> where dt in ('2016104', '20161015', '20161016')
> group by city, version
>
> ---
> which means I cannot use just one intersect_count UDAF in a SQL statement;
> there must be at least two intersect_counts. Is this a bug, or is it
> designed this way?
>
> 2017-12-07
> --
> skyyws
>


Re: kylin service is getting down frequently!

2017-12-05 Thread Billy Liu
Hi Prasana,

What kinds of queries were you using? If you issued a "select *" against a
very large table, it may cause OOM. Kylin is designed to answer OLAP-style
queries very fast, not detailed row-level queries. Could you check again?

2017-12-04 21:12 GMT+08:00 Ge Silas :

> Hello,
>
> Can you share more details in kylin.log and kylin.out, which might provide
> more information to us?
>
> And when you say “getting down”, what’s the observation?
>
> Thanks,
> Silas
>
> On 4 Dec 2017, at 12:27 PM, Prasanna 
> wrote:
>
> Hi all,
>
> I am using a Hadoop cluster setup with HDP 2.4.3.0-227; all services
> are working fine in that cluster. I installed a Kylin cluster setup on 3
> servers of the same cluster. But the Kylin service frequently goes down, so
> my applications are not getting data. I didn't find out the reasons for
> Kylin stopping. Where will I get the logs for this, in kylin/logs or the
> Hadoop logs? I checked the logs in the kylin folder; they're not showing any
> errors. Can you please tell me where the problem is?
>
>
>


Re: [Discuss] Disable/hide "RAW" measure in Kylin web GUI

2017-11-28 Thread Billy Liu
+1 to turn this feature off by default. Advanced users could still enable it.

2017-11-27 13:53 GMT+08:00 Luke Han <luke...@gmail.com>:

> +1 to remove it from the new release; people could backport it to a new
> version using the previous code
>
>
> Best Regards!
> -
>
> Luke Han
>
> On Sun, Nov 26, 2017 at 9:40 PM, ShaoFeng Shi <shaofeng...@apache.org>
> wrote:
>
>> Last year I raised this discussion but didn't have follow-up action.
>>
>> Now we see there are still some new users misusing this feature and then
>> facing performance and maintenance issues.
>>
>> In Kylin 2.1, the new "query pushdown" feature can forward queries the Cube
>> cannot answer to alternative query engines like Hive / SparkSQL. The
>> raw data query is just such a scenario.
>>
>> So I think it is time to disable the RAW measure on Kylin now.  JIRA
>> created for it: https://issues.apache.org/jira/browse/KYLIN-3062
>>
>> Please comment if you see any issue.
>>
>> 2016-12-19 22:03 GMT+08:00 Billy Liu <billy...@apache.org>:
>>
>> > The experimental mode is a system-wide feature toggle. I think case by case
>> > is more flexible. Most new features could have toggles, defaulting to off.
>> >
>> > 2016-12-19 21:40 GMT+08:00 Luke Han <luke...@gmail.com>:
>> >
>> > > Beta or Experimental will also be confusing for most users.
>> > >
>> > > Maybe we could have something called an "expert" or "experimental" mode
>> > > in the system configuration.
>> > >
>> > > Users will not see such features since they will be hidden by default,
>> > > but an admin could set it to true if they are confident enough to
>> > > enable them.
>> > >
>> > > What do you think?
>> > >
>> > >
>> > > Best Regards!
>> > > -
>> > >
>> > > Luke Han
>> > >
>> > > On Mon, Dec 19, 2016 at 11:24 AM, Xiaoyu Wang <wangxy...@gmail.com>
>> > wrote:
>> > >
>> > > > I'm sorry for not maintaining it for so long!
>> > > >
>> > > > I agree with liyang on giving the RAW measure a "Beta" or similar label.
>> > > >
>> > > > I will improve it when I have time!
>> > > >
>> > > > 2016-12-19 10:21 GMT+08:00 Li Yang <liy...@apache.org>:
>> > > >
>> > > > > Or display RAW with a "Beta" or "Experimental" label to warn users
>> > > > > that it is not a mature feature?
>> > > > >
>> > > > > On Fri, Dec 16, 2016 at 12:30 PM, 康凯森 <kangkai...@qq.com> wrote:
>> > > > >
>> > > > > > +1.
>> > > > > > But the "RAW" measure is still some useful, we could improve it
>> > next
>> > > > year
>> > > > > > when we have time.
>> > > > > >
>> > > > > >
>> > > > > > -- Original message --
>> > > > > > From: "ShaoFeng Shi";<shaofeng...@apache.org>;
>> > > > > > Sent: Thursday, December 15, 2016, 12:05 PM
>> > > > > > To: "dev"<d...@kylin.apache.org>;
>> > > > > >
>> > > > > > Subject: [Discuss] Disable/hide "RAW" measure in Kylin web GUI
>> > > > > >
>> > > > > >
>> > > > > >
>> > > > > > Hello developers and users,
>> > > > > >
>> > > > > > I have a proposal I want to discuss here, which is about the "RAW"
>> > > > > > measure in Kylin.
>> > > > > >
>> > > > > > The RAW measure was developed to meet the requirement of getting
>> > > > > > raw data when users drill down from high levels to low levels. Its
>> > > > > > performance would be much better than fetching from a source like
>> > > > > > Hive, so some users like it.
>> > > > > > This blog introduces it:
>> > > > > > https://kylin.apache.org/blog/2016/05/29/raw-measure-in-kylin/
>> > > > > >
>> > > > > > While it has some limitations:
>> > > > > > 1) It always uses dictionary encoding, which means it couldn't
>> > > > > > support UHC; since raw columns are usually transaction IDs,
>> > > > > > numbers, etc., the building cost is much higher than for ordinary
>> > > > > > dimensions.
>> > > > > > 2) The raw messages for a dimension combination are persisted in
>> > > > > > one big cell; when too many rows are dumped into one cell, you
>> > > > > > will get a BufferOverflow error. This couldn't be predicted, so a
>> > > > > > modeler or analyst doesn't know whether this feature will work
>> > > > > > when he creates the cube.
>> > > > > > 3) It seems nobody maintains it.
>> > > > > >
>> > > > > > Based on the above, I propose hiding this measure in the Web GUI
>> > > > > > by default, to avoid confusing users. If someone hears about it
>> > > > > > and wants to use it, it can still be enabled by a simple setting
>> > > > > > in kylin.properties.
>> > > > > >
>> > > > > > Any input is welcomed.
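For reference, a hypothetical kylin.properties fragment illustrating such a toggle (the property name below comes from KYLIN-3062 and may differ across Kylin versions; treat it as an assumption):

```properties
# Assumed property (from KYLIN-3062; verify for your Kylin version):
# measures listed here are hidden in the web GUI.
kylin.web.hide-measures=RAW
# Setting it to an empty list would show the RAW measure again:
# kylin.web.hide-measures=
```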
>> > > > > >
>> > > > > > --
>> > > > > > Best regards,
>> > > > > >
>> > > > > > Shaofeng Shi 史少锋
>> > > > > >
>> > > > >
>> > > >
>> > >
>> >
>>
>>
>>
>> --
>> Best regards,
>>
>> Shaofeng Shi 史少锋
>>
>
>


Re: Build cube and streaming cube problems

2017-11-24 Thread Billy Liu
I suggest you separate the different questions into separate mail threads.
That makes the discussion more focused.

Kylin is designed for OLAP queries, which are aggregated queries, not for RAW
data, although it has a workaround in the RAW measure. I am not sure
whether it works for a dozen columns. Maybe you could first try with one or
two RAW measures, and separate the different RAW measures into different
HBase column mappings.

For the performance issue, it may relate to your cube design and query
pattern. Here is a tool that may help figure out the query bottleneck:
https://kybot.io It analyzes the Kylin log and shows you how the query hits
the cube.

For the failed job, please describe the issue with detailed logs, including
kylin.log and the logs of the YARN job.

For the Kafka issue, most such issues are caused by small front-end changes.
You could file a JIRA for that. To use Kafka as a data source, please export
KAFKA_HOME before you start Kylin. That's what the tutorial says.

2017-11-24 11:34 GMT+08:00 Wei Li :

> Hi all
> I installed the Kylin binary package with the HBase namespace patch
> (KYLIN-2846) and have been using it for nearly a month. My work always needs
> several dimensions (sometimes with large cardinality, such as ID numbers)
> and a dozen RAW measures.
>
> I have some questions about building cubes. There are lots of successful
> cases that build cubes with tens of billions of rows and get sub-second
> query speed, but in my actual use, cubing with tens of millions of rows
> sometimes fails, and my queries are slow with a WHERE filter and become
> slower with LIKE (10 million rows cost 40 seconds).
>
> And here is a strange phenomenon:
> I have a cube with 200 million rows, which contains three dimensions and
> no lookup table. But when I add a lookup table with 1400 rows and 4 RAW
> measures (two of them are Chinese strings), it fails at the 3rd step; the
> output is 'Job Counters \n failed reduce tasks=4'. I found that some key
> values in the fact table fall outside the inner join with the lookup table;
> does that cause the error? Are there any specific constraints when building
> a cube? For example, I noticed that the dimensions should pick up a unique
> row or an error would happen.
>
> Turning to streaming cubes, I met three problems.
> Firstly, when I add a streaming table, the Advanced Setting only has Timeout
> in the web UI; Buffer Size & Margin are missing.
> Secondly, when I save my table and browse the table schema, the Streaming
> Cluster config is blank even though it was set before, and I can't Edit; an
> error message is thrown when I click save: "Failed to deal with the
> request: SteamingConfig Illegal".
> Thirdly, after I created the model and cube successfully and came to build,
> an 'Oops... Could not find Kafka dependency' error happened. Obviously, my
> Kafka is ready, because I can consume it from Java.
>
> A long email; thanks for reading, and I hope for your reply!
>
>
> Sincerely
> Wei Li
>


Re: Can hierarchyDims contain jointDims

2017-11-20 Thread Billy Liu
Hi Doom,

Thank you for reading the code so carefully. You are welcome to
contribute on this JIRA.

2017-11-19 16:42 GMT+08:00 doom <43535...@qq.com>:

> So this issue is a work in progress, resulting in the two contradictory
> pieces of code. Thank you for making this clear to me.
>
>
> -- Original message --
> *From:* "Alberto Ramón";
> *Sent:* Saturday, November 18, 2017, 5:59 AM
> *To:* "user";
> *Subject:* Re: Can hierarchyDims contain jointDims
>
> https://issues.apache.org/jira/browse/KYLIN-2149
>
> Check this link; you need to choose between using one or the other.
> Sometimes it would be great to use both together.
>
> On 17 November 2017 at 06:43, doom <43535...@qq.com> wrote:
>
>> So what does the second code segment mean in the AggregationGroup build
>> step? Does it mean replacing the hierarchy dim with the joint dims which
>> contain it?
>>
>>
>> -- Original message --
>> *From:* "ShaoFeng Shi";;
>> *Sent:* Friday, November 17, 2017, 2:02 PM
>> *To:* "user";
>> *Subject:* Re: Can hierarchyDims contain jointDims
>>
>> Joint could not be used in the hierarchy.
>>
>> Joint means treating multiple dimensions as one: they either all appear,
>> or none do. It conflicts with hierarchy.
>>
>> 2017-11-16 21:29 GMT+08:00 doom <43535...@qq.com>:
>>
>>> Hi all,
>>> I read the source code of kylin 2.2, and found:
>>>
>>> In class CubeDesc, if hierarchyDims contains jointDims it will throw an
>>> exception.
>>> public void validateAggregationGroups() {
>>>     ...
>>>     if (CollectionUtils.containsAny(hierarchyDims, jointDims)) {
>>>         logger.error("Aggregation group " + index
>>>                 + " hierarchy dimensions overlap with joint dimensions");
>>>         throw new IllegalStateException(
>>>                 "Aggregation group " + index + " hierarchy dimensions overlap with joint dimensions: "
>>>                         + ensureOrder(CollectionUtils.intersection(hierarchyDims, jointDims)));
>>>     }
>>>
>>> But class AggregationGroup replaces the hierarchy dim with the joint dims
>>> which contain it.
>>> private void buildHierarchyMasks(RowKeyDesc rowKeyDesc) {
>>>     ...
>>>     for (int i = 0; i < hierarchy_dims.length; i++) {
>>>         TblColRef hColumn = cubeDesc.getModel().findColumn(hierarchy_dims[i]);
>>>         Integer index = rowKeyDesc.getColumnBitIndex(hColumn);
>>>         long bit = 1L << index;
>>>
>>>         // combine joint as logic dim
>>>         if (dim2JointMap.get(bit) != null) {
>>>             bit = dim2JointMap.get(bit);
>>>         }
>>>
>>>         mask.fullMask |= bit;
>>>         allMaskList.add(mask.fullMask);
>>>         dimList.add(bit);
>>>     }
>>> }
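To illustrate the mask-building logic under discussion, here is a hypothetical, simplified Python sketch (not Kylin's actual code): each dimension maps to one bit, and a bit belonging to a joint group is swapped for the group's combined mask before being OR-ed into the running hierarchy mask.

```python
# Simplified sketch of buildHierarchyMasks: each dimension gets one bit;
# a bit that belongs to a joint group is replaced by the group's combined
# mask ("combine joint as logic dim"), then OR-ed into the running mask.
def build_hierarchy_masks(hierarchy_dims, dim_bit_index, dim2joint):
    masks, full = [], 0
    for dim in hierarchy_dims:
        bit = 1 << dim_bit_index[dim]
        bit = dim2joint.get(bit, bit)  # swap in the joint group's mask
        full |= bit
        masks.append(full)
    return masks

bits = {"year": 0, "month": 1, "day": 2}
joint = {1 << 1: (1 << 1) | (1 << 2)}  # "month" and "day" form a joint group
print(build_hierarchy_masks(["year", "month"], bits, joint))  # prints [1, 7]
```

This shows why a hierarchy level inside a joint group drags the whole group into the mask, which is the apparent contradiction the thread is asking about.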
>>>
>>> Am I understanding this the wrong way?
>>>
>>>
>>>
>>
>>
>> --
>> Best regards,
>>
>> Shaofeng Shi 史少锋
>>
>>
>


Re: How to set up a local Kylin runtime environment on Windows

2017-11-16 Thread Billy Liu
Hello,

If you follow the doc but meet some problems, please describe your steps
and results (better with a screenshot) in the mail. Then others will know
where you are, and whether you went wrong somewhere.

On 2017-11-17 at 10:35, Hey <1428117...@qq.com> wrote:

> Following the Kylin official site works. You need Node.js to download the
> Kylin web UI dependencies, then just follow the official Kylin
> documentation!!!


Re: how to change the order of rowkey_columns

2017-11-14 Thread Billy Liu
You can change the rowkey order directly in the GUI via drag-and-drop.

2017-11-14 10:52 GMT+08:00 杨浩 :

> To speed up queries, we should position frequently used columns before
> others; for example, the partition date column should come first, since
> every query contains it. So how do we change the order of rowkey_columns,
> or in other words, what decides the order of rowkey_columns?
>


Re: Set up Development environment at intelliji

2017-11-12 Thread Billy Liu
Hi Kumar,

Could you check out this doc first:
http://kylin.apache.org/development/dev_env.html ?

2017-11-12 23:45 GMT+08:00 Kumar, Manoj H :

> Can you please give step-by-step instructions for the dev setup of Apache
> Kylin? Since I am not from a Java background, please give the complete
> steps to help me do my local setup. How do I reference the Hadoop
> environment (Cloudera VM) from my laptop: Spark home, Hive, HBase, etc.?
> Please give us the complete details. Thanks.
>
>
>
> Manoj
>
> This message is confidential and subject to terms at: http://
> www.jpmorgan.com/emaildisclaimer including on confidentiality, legal
> privilege, viruses and monitoring of electronic messages. If you are not
> the intended recipient, please delete this message and notify the sender
> immediately. Any unauthorized use is strictly prohibited.
>


Re: About degenerate dimensions on Kylin cubes

2017-11-09 Thread Billy Liu
Hi Roberto,

Degenerate dimensions on the fact table are not supported, I think. There are
only two types of dimensions: "normal" and "derived". All "normal" dimensions
will be precalculated into the cube, so they affect the construction cost and
query latency. If some column in the fact table does not need to be a
dimension, you could define it as an "extended column". The "extended column"
will not be precalculated.

2017-11-04 18:04 GMT+08:00 Roberto Tardío :

> Hi,
>
> I have a question about how Kylin computes degenerate dimensions, i.e.,
> dimensions on the fact table that do not need a dimension lookup table. The
> type of these dimensions is "Normal" by default, but what is the cost of
> adding these dimensions? I guess they are not used in the cuboid concept
> because they are naturally combined in the fact table, so the questions are:
>
>    - Do they add appreciable complexity to the construction process of
>      the cube?
>    - Do they affect the query latency over the built cube in any way?
>
> Thanks in advance!
> --
>
> *Roberto Tardío Olmos*
> *Senior Big Data & Business Intelligence Consultant*
> Avenida de Brasil, 17, Planta 16. 28020 Madrid
> Fijo: 91.788.34.10
>


Re: Kylin 2.1.0 new features than old versions.

2017-11-09 Thread Billy Liu
I would suggest waiting a few days. I know the bug has been fixed recently,
but the code has not been merged into master yet.

2017-11-09 14:29 GMT+08:00 Prasanna :

> Hi all,
>
>
>
> Presently I am using Kylin 1.6.0. If anybody is using the latest Kylin 2.x,
> can you please tell me what new features are available compared to older
> versions? In Kylin 1.6.0 I am facing a problem of holes between segments
> while merging. Will this problem be solved in newer versions? Will I be
> able to merge segments with holes as well? Please advise me on this.
>


Re: How to drop the segments

2017-11-08 Thread Billy Liu
http://kylin.apache.org/docs21/howto/howto_use_restapi.html
Delete Segment

DELETE /kylin/api/cubes/{cubeName}/segs/{segmentName}
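A hedged sketch of invoking this endpoint from Python's standard library (the host, credentials, cube and segment names below are placeholders; Kylin's REST API uses HTTP Basic auth):

```python
import base64
import urllib.request

def delete_segment_request(host, cube, segment, user, password):
    """Build (but do not send) Kylin's Delete Segment REST call."""
    url = "{}/kylin/api/cubes/{}/segs/{}".format(host, cube, segment)
    req = urllib.request.Request(url, method="DELETE")
    token = base64.b64encode("{}:{}".format(user, password).encode()).decode()
    req.add_header("Authorization", "Basic " + token)
    return req

req = delete_segment_request("http://localhost:7070", "my_cube",
                             "20171101000000_20171108000000", "ADMIN", "KYLIN")
print(req.get_method(), req.full_url)
# urllib.request.urlopen(req) would actually send the request.
```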

2017-11-09 14:56 GMT+08:00 Kumar, Manoj H :

> Can you please tell me how to drop a cube segment from the backend? While
> re-running the cube build, I need to drop the older segments. I am using
> Apache Kylin 2.1.
>
>
>
> Regards,
>
> Manoj
>
>
>
> This message is confidential and subject to terms at: http://
> www.jpmorgan.com/emaildisclaimer including on confidentiality, legal
> privilege, viruses and monitoring of electronic messages. If you are not
> the intended recipient, please delete this message and notify the sender
> immediately. Any unauthorized use is strictly prohibited.
>


Re: Export and Import metadata between Kylin clusters

2017-11-02 Thread Billy Liu
The requirement is clear. Could you check metastore.sh?
It has a lot of features, such as:
metastore.sh backup-cube/restore-cube
metastore.sh backup-project/restore-project

2017-11-01 21:00 GMT+08:00 Roberto Tardío <roberto.tar...@stratebi.com>:

> Hi,
>
> I think this is related to this unresolved JIRA. KYLIN-1605
> <https://issues.apache.org/jira/browse/KYLIN-1605>
>
> I mean the possibility of migrating metadata (data source, data model, cube
> definition, or the whole project) in order to rebuild the whole cube, e.g.
> on a different Hadoop cluster.
>
> Thanks!
>
> El 24/10/2017 a las 3:29, Billy Liu escribió:
>
> Hi Shaofeng,
>
> Do you think the CubeMetaExtractor and CubeMetaIngestor could do the this
> job?
>
> 2017-10-23 22:51 GMT+08:00 ShaoFeng Shi <shaofeng...@apache.org>:
>
>> BTW, would you like to report a JIRA for this? thanks.
>>
>> 2017-10-23 22:51 GMT+08:00 ShaoFeng Shi <shaofeng...@apache.org>:
>>
>>> I see; currently there is no support for this, but it should be easy.
>>> Just look at `bin/metastore.sh`; you can extend the function there: e.g.,
>>> add the project as an optional parameter (when specified, only export that
>>> project's metadata). After the cube is imported in the new environment, you can
>>> call the API to purge it (clear segments), and then rebuild it from source.
>>>
>>> 2017-10-23 19:02 GMT+08:00 Roberto Tardío <roberto.tar...@stratebi.com>:
>>>
>>>> Hi ShaoFeng Shi,
>>>>
>>>> Sorry, I forgot to give more details. I would like to export a project's
>>>> metadata, including the following:
>>>>
>>>>- Data sources (Hive Tables)
>>>>- Data Model definition
>>>>- Cube definition
>>>>
>>>> That is to say, I need the metadata necessary to build the same cube
>>>> from a similar data source on a new cluster and HBase storage. However, I
>>>> do not need data about the HBase tables. E.g., I add a new dimension on
>>>> the Pre-Production cube and I would like to update this cube definition
>>>> in the Production environment.
>>>>
>>>> Best Regards,
>>>>
>>>> El 23/10/2017 a las 9:52, ShaoFeng Shi escribió:
>>>>
>>>> Hi Roberto,
>>>>
>>>> What kind of metadata are you going to export from pre-production to
>>>> production? Is it a single cube or whole metadata set? Need more about your
>>>> scenario to understand the requirement.
>>>>
>>>> 2017-10-22 0:53 GMT+08:00 Roberto Tardío <roberto.tar...@stratebi.com>:
>>>>
>>>>> Hi,
>>>>>
>>>>> I have two Kylin servers, one for Pre-Production and another one
>>>>> for Production. What is the best way to export project metadata between
>>>>> the two? I would think that if I use the metadata backup tools, the
>>>>> metadata will be related to HBase tables in Pre-Production.
>>>>>
>>>>> Is metadata backup the best way, or what do you recommend for copying
>>>>> metadata between the pre-production and production servers?
>>>>>
>>>>> Thanks in advance,
>>>>>
>>>>> Roberto
>>>>> --
>>>>>
>>>>> *Roberto Tardío Olmos*
>>>>> *Senior Big Data & Business Intelligence Consultant*
>>>>> Avenida de Brasil, 17
>>>>> Planta 16.28020 Madrid
>>>>> Fijo: 91.788.34.10
>>>>>
>>>>
>>>>
>>>>
>>>> --
>>>> Best regards,
>>>>
>>>> Shaofeng Shi 史少锋
>>>>
>>>>
>>>> --
>>>>
>>>> *Roberto Tardío Olmos*
>>>> *Senior Big Data & Business Intelligence Consultant*
>>>> Avenida de Brasil, 17
>>>> Planta 16.28020 Madrid
>>>> Fijo: 91.788.34.10
>>>>
>>>
>>>
>>>
>>> --
>>> Best regards,
>>>
>>> Shaofeng Shi 史少锋
>>>
>>>
>>
>>
>> --
>> Best regards,
>>
>> Shaofeng Shi 史少锋
>>
>>
>
> --
>
> *Roberto Tardío Olmos*
> *Senior Big Data & Business Intelligence Consultant*
> Avenida de Brasil, 17
> Planta 16.28020 Madrid
> Fijo: 91.788.34.10
>


Re: Re: kylin2.1 can't work with hdp2.6.0

2017-11-02 Thread Billy Liu
I have checked the log file; no error or exception was found. Everything
looks fine. Could you check from the frontend? Maybe some JavaScript file
failed to load.
Please "reload metadata" again and refresh the page.

2017-11-02 10:44 GMT+08:00 lk_kylin <lk_ky...@163.com>:

> Some error messages in kylin.out about Tomcat:
>
> SEVERE: Failed to load keystore type JKS with path conf/.keystore due to /usr/local/kylin/kylin-2.1.0/tomcat/conf/.keystore (No such file or directory)
> java.io.FileNotFoundException: /usr/local/kylin/kylin-2.1.0/tomcat/conf/.keystore (No such file or directory)
>  at java.io.FileInputStream.open0(Native Method)
>  at java.io.FileInputStream.open(FileInputStream.java:195)
>  at java.io.FileInputStream.<init>(FileInputStream.java:138)
>  at java.io.FileInputStream.<init>(FileInputStream.java:93)
>  at sun.net.www.protocol.file.FileURLConnection.connect(FileURLConnection.java:90)
>  at sun.net.www.protocol.file.FileURLConnection.getInputStream(FileURLConnection.java:188)
>  at java.net.URL.openStream(URL.java:1045)
>  at org.apache.tomcat.util.file.ConfigFileLoader.getInputStream(ConfigFileLoader.java:100)
>  at org.apache.tomcat.util.net.jsse.JSSESocketFactory.getStore(JSSESocketFactory.java:470)
>  at org.apache.tomcat.util.net.jsse.JSSESocketFactory.getKeystore(JSSESocketFactory.java:381)
>  at org.apache.tomcat.util.net.jsse.JSSESocketFactory.getKeyManagers(JSSESocketFactory.java:634)
>  at org.apache.tomcat.util.net.jsse.JSSESocketFactory.getKeyManagers(JSSESocketFactory.java:574)
>  at org.apache.tomcat.util.net.jsse.JSSESocketFactory.init(JSSESocketFactory.java:519)
>  at org.apache.tomcat.util.net.jsse.JSSESocketFactory.createSocket(JSSESocketFactory.java:255)
>  at org.apache.tomcat.util.net.JIoEndpoint.bind(JIoEndpoint.java:400)
>  at org.apache.tomcat.util.net.AbstractEndpoint.init(AbstractEndpoint.java:650)
>  at org.apache.coyote.AbstractProtocol.init(AbstractProtocol.java:434)
>  at org.apache.coyote.http11.AbstractHttp11JsseProtocol.init(AbstractHttp11JsseProtocol.java:119)
>  at org.apache.catalina.connector.Connector.initInternal(Connector.java:978)
>  at org.apache.catalina.util.LifecycleBase.init(LifecycleBase.java:102)
>  at org.apache.catalina.core.StandardService.initInternal(StandardService.java:560)
>  at org.apache.catalina.util.LifecycleBase.init(LifecycleBase.java:102)
>  at org.apache.catalina.core.StandardServer.initInternal(StandardServer.java:838)
>  at org.apache.catalina.util.LifecycleBase.init(LifecycleBase.java:102)
>  at org.apache.catalina.startup.Catalina.load(Catalina.java:642)
>  at org.apache.catalina.startup.Catalina.start(Catalina.java:681)
>  at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
>  at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62)
>  at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
>  at java.lang.reflect.Method.invoke(Method.java:498)
>  at org.apache.catalina.startup.Bootstrap.start(Bootstrap.java:294)
>  at org.apache.catalina.startup.Bootstrap.main(Bootstrap.java:428)
>  at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
>  at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62)
>  at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
>  at java.lang.reflect.Method.invoke(Method.java:498)
>  at org.apache.hadoop.util.RunJar.run(RunJar.java:233)
>  at org.apache.hadoop.util.RunJar.main(RunJar.java:148)
>
> please check the attachment for details.
>
>
> 2017-11-02
> --
> lk_kylin
> --
>
> *From:* ShaoFeng Shi <shaofeng...@apache.org>
> *Sent:* 2017-10-31 16:07
> *Subject:* Re: kylin2.1 can't work with hdp2.6.0
> *To:* "user"<user@kylin.apache.org>
> *Cc:*
>
> Since 2.1, Kylin consolidates the metadata into one HBase table, so if you
> see one 'kylin_metadata', that is expected.
>
> If there is no message in kylin.log, please check logs/kylin.out; if still
> nothing, check tomcat/logs to see whether there is any clue.
>
> 2017-10-31 11:14 GMT+08:00 Billy Liu <billy...@apache.org>:
>
>> Could you check the kylin.log first?
>>
>> 2017-10-31 10:55 GMT+08:00 lk_kylin <lk_ky...@163.com>:
>>
>>> hi,all:
>>>I want to try Kylin. My Hadoop version is HDP 2.6, and I can run the sample
>>> using Kylin 2.0, but when I try 2.1 with the sample, Kylin can't load Hive
>>> tables, and I found it only creates one table, kylin_metadata, in HBase. Also,
>>> I didn't find any error message in kylin.log.
>>>
>>>
>>>
>>>
>>> 2017-10-31
>>> --
>>> lk_kylin
>>>
>>
>>
>
>
> --
> Best regards,
>
> Shaofeng Shi 史少锋
>
>
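For readers hitting the same FileNotFoundException: the HTTPS connector in tomcat/conf/server.xml references a keystore at conf/.keystore that does not exist. One option is to comment out that connector; another is to create the keystore file. Below is a minimal sketch using only the JDK (the output path and password are illustrative, and a working TLS setup would still need a real key pair, e.g. generated with `keytool -genkeypair`):

```java
import java.io.FileOutputStream;
import java.security.KeyStore;

public class CreateEmptyKeystore {
    public static void main(String[] args) throws Exception {
        // Create the JKS keystore file the connector expects; Tomcat only
        // needs the file to exist to get past the FileNotFoundException,
        // but real TLS still requires a key entry inside it.
        KeyStore ks = KeyStore.getInstance("JKS");
        char[] password = "changeit".toCharArray(); // illustrative password
        ks.load(null, password);                    // initialize empty in memory
        try (FileOutputStream out = new FileOutputStream(".keystore")) {
            ks.store(out, password);                // write the keystore file
        }
        System.out.println("created .keystore with " + ks.size() + " entries");
    }
}
```

Copy the resulting file to tomcat/conf/.keystore, or point the connector's keystoreFile attribute at it.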


Re: kylin2.1 can't work with hdp2.6.0

2017-10-30 Thread Billy Liu
Could you check the kylin.log first?

2017-10-31 10:55 GMT+08:00 lk_kylin :

> hi,all:
>I want to try Kylin. My Hadoop version is HDP 2.6, and I can run the sample
> using Kylin 2.0, but when I try 2.1 with the sample, Kylin can't load Hive
> tables, and I found it only creates one table, kylin_metadata, in HBase. Also,
> I didn't find any error message in kylin.log.
>
>
>
>
> 2017-10-31
> --
> lk_kylin
>


Re: Export and Import metadata between Kylin clusters

2017-10-23 Thread Billy Liu
Hi Shaofeng,

Do you think the CubeMetaExtractor and CubeMetaIngestor could do this
job?

2017-10-23 22:51 GMT+08:00 ShaoFeng Shi :

> BTW, would you like to report a JIRA for this? thanks.
>
> 2017-10-23 22:51 GMT+08:00 ShaoFeng Shi :
>
>> I see; currently, there is no support for this. But it should be easy.
>> Just look at `bin/metastore.sh`; you can extend the function there, e.g.,
>> add the project as an optional parameter (when specified, only export that
>> project's metadata). After the cube is imported into the new environment, you can
>> call the API to purge it (clear its segments), and then rebuild it from source.
>>
>> 2017-10-23 19:02 GMT+08:00 Roberto Tardío :
>>
>>> Hi ShaoFeng Shi,
>>>
>>> Sorry, I forgot to give more details. I would like to export a project's
>>> metadata, including the following:
>>>
>>>- Data sources (Hive Tables)
>>>- Data Model definition
>>>- Cube definition
>>>
>>> That is to say, I need the metadata necessary to build the same cube from
>>> a similar data source in a new cluster and HBase storage. However, I do not
>>> need data about the HBase tables. E.g., I add a new dimension to the
>>> Pre-Production cube and would like to update this cube definition in the
>>> Production environment.
>>>
>>> Best Regards,
>>>
>>> El 23/10/2017 a las 9:52, ShaoFeng Shi escribió:
>>>
>>> Hi Roberto,
>>>
>>> What kind of metadata are you going to export from pre-production to
>>> production? Is it a single cube or the whole metadata set? I need more
>>> details about your scenario to understand the requirement.
>>>
>>> 2017-10-22 0:53 GMT+08:00 Roberto Tardío :
>>>
 Hi,

 I have two Kylin servers, one for Pre-Production and another for
 Production. What is the best way to export project metadata between the
 two? My concern is that if I use the metadata backup tools, the metadata
 will be tied to the HBase tables in Pre-Production.

 Is metadata backup the best way, or what do you recommend to copy
 metadata between the pre-production and production servers?

 Thanks in advance,

 Roberto
 --

 *Roberto Tardío Olmos*
 *Senior Big Data & Business Intelligence Consultant*
 Avenida de Brasil, 17,
 Planta 16.28020 Madrid
 Fijo: 91.788.34.10

>>>
>>>
>>>
>>> --
>>> Best regards,
>>>
>>> Shaofeng Shi 史少锋
>>>
>>>
>>> --
>>>
>>> *Roberto Tardío Olmos*
>>> *Senior Big Data & Business Intelligence Consultant*
>>> Avenida de Brasil, 17,
>>> Planta 16.28020 Madrid
>>> Fijo: 91.788.34.10
>>>
>>
>>
>>
>> --
>> Best regards,
>>
>> Shaofeng Shi 史少锋
>>
>>
>
>
> --
> Best regards,
>
> Shaofeng Shi 史少锋
>
>


Re: [External Mail] Re: [External Mail] Reply: kylin drop cube segment HTable still exists

2017-10-18 Thread Billy Liu
Agreed. A new patch is needed in KYLIN-2846. It's still not a bug, but an
unfinished improvement, I think. Let's continue to improve this.

2017-10-18 21:05 GMT+08:00 ShaoFeng Shi :

> Hi Yaowu,
>
> Could you please update this patch in KYLIN-2846? Storage cleanup should
> also support customized namespaces. Thanks.
>
> 2017-10-18 20:37 GMT+08:00 曾耀武 :
>
>>
>> Yeah, I have checked; it was a bug, because I used the HBase namespace
>> "kylin" with the patch
>> "https://issues.apache.org/jira/secure/attachment/12885320/KYLIN-2846-001.patch"
>>
>> And I modified the code as below, and it works well:
>>
>>
>> MOMO@MOMOdeMacBook-Pro-5:~/gitworkspace/kylin2.0/kylin/dist/apache-kylin-2.1.0-bin/tool$
>> git diff ../../../server-base/src/main/java/org/apache/kylin/rest/job
>> /StorageCleanJobHbaseUtil.java
>> diff --git 
>> a/server-base/src/main/java/org/apache/kylin/rest/job/StorageCleanJobHbaseUtil.java
>> b/server-base/src/main/java/org/apache/kylin/rest/job/Storag
>> eCleanJobHbaseUtil.java
>> index 3728ea1..937d02d 100644
>> --- a/server-base/src/main/java/org/apache/kylin/rest/job/Storag
>> eCleanJobHbaseUtil.java
>> +++ b/server-base/src/main/java/org/apache/kylin/rest/job/Storag
>> eCleanJobHbaseUtil.java
>> @@ -46,15 +46,28 @@ public class StorageCleanJobHbaseUtil {
>>
>>  public static void cleanUnusedHBaseTables(boolean delete, int
>> deleteTimeout) throws IOException {
>>  Configuration conf = HBaseConfiguration.create();
>> -CubeManager cubeMgr = CubeManager.getInstance(KylinC
>> onfig.getInstanceFromEnv());
>> +KylinConfig config = KylinConfig.getInstanceFromEnv();
>> +CubeManager cubeMgr = CubeManager.getInstance(config);
>>  // get all kylin hbase tables
>>  try (HBaseAdmin hbaseAdmin = new HBaseAdmin(conf)) {
>> -String tableNamePrefix = IRealizationConstants.SharedHb
>> aseStorageLocationPrefix;
>> +String namespace = config.getHBaseStorageNameSpace();
>> +StringBuffer sb = new StringBuffer();
>> +String tableNamePrefix = null;
>> +if(namespace.equals("default")){
>> +tableNamePrefix = IRealizationConstants.SharedHb
>> aseStorageLocationPrefix;
>> +}else{
>> +sb.append(config.getHBaseStor
>> ageNameSpace()).append(":");
>> +sb.append(IRealizationConstan
>> ts.SharedHbaseStorageLocationPrefix);
>> +tableNamePrefix = sb.toString();
>> +}
>> +
>>
>>
>>
>>
>>
>> From: ShaoFeng Shi 
>> Reply-To: user 
>> Date: Tuesday, October 17, 2017, 4:16 PM
>> To: user 
>> Subject: [External Mail] Re: [External Mail] Reply: kylin drop cube segment HTable still exists
>>
>> Hi yaowu,
>>
>> The StorageCleanupJob will delete the HBase tables that have no references.
>> If you find it didn't work, please check the output log of that command;
>> it should say something.
>>
>> On Oct 17, 2017, at 3:39 PM, yuyong.zhai wrote:
>>
>>>
>>> sh metastore.sh clean --delete true
>>>
>>> sh kylin.sh org.apache.kylin.storage.hbase.util.StorageCleanupJob
>>> --delete true
>>>  Original Message
>>> *From:* 曾耀武
>>> *To:* user
>>> *Sent:* Tuesday, October 17, 2017, 15:23
>>> *Subject:* Re: [External Mail] Reply: kylin drop cube segment HTable still exists
>>>
>>>  I checked this. After running those commands, only the intermediate Hive
>>>  tables (kylin_intermediate_*) left over from failed jobs and the unused
>>> temporary directories under hdfs://nameservice1:8020/kylin/kylin-kylin_metadata/
>>> were deleted; the segment tables in HBase still exist.
>>>
>>> From: "yuyong.zhai" 
>>> Reply-To: user 
>>> Date: Tuesday, October 17, 2017, 3:04 PM
>>> To: user 
>>> Subject: [External Mail] Reply: kylin drop cube segment HTable still exists
>>>
>>> cleanup storage
>>>
>>> http://kylin.apache.org/docs15/howto/howto_cleanup_storage.html
>>>
>>>  Original Message
>>> *From:* 曾耀武
>>> *To:* user
>>> *Sent:* Tuesday, October 17, 2017, 14:02
>>> *Subject:* kylin drop cube segment HTable still exists
>>>
>>>
>>>
>>> Hi all, while cleaning up unused cubes (Kylin 2.1.0), I found that after a
>>> cube is purged and dropped, the corresponding segment HBase tables still
>>> exist in HBase; only the metadata in the kylin_metadata table is cleared.
>>> Is this specific to my setup, or are the tables kept on purpose for data
>>> safety, so that they can only be deleted manually?
>>>
>>>
>>> Best regards
>>>
>>>
>>
>>
>> --
>> Best regards,
>>
>> Shaofeng Shi 史少锋
>>
>>
>
>
> --
> Best regards,
>
> Shaofeng Shi 史少锋
>
>
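The essence of the patch quoted above is that the cleanup job must qualify the shared table-name prefix with the configured HBase namespace. A standalone sketch of that decision follows (class and method names are illustrative, not Kylin's actual identifiers; the `KYLIN_` prefix matches IRealizationConstants.SharedHbaseStorageLocationPrefix):

```java
public class TablePrefix {
    // Kylin's shared HBase table prefix ("KYLIN_" in
    // IRealizationConstants.SharedHbaseStorageLocationPrefix).
    static final String SHARED_PREFIX = "KYLIN_";

    // Tables in the default namespace keep the bare prefix; any other
    // namespace must be qualified as "namespace:prefix", otherwise the
    // cleanup job cannot match the tables it should inspect.
    static String prefixFor(String namespace) {
        if ("default".equals(namespace)) {
            return SHARED_PREFIX;
        }
        return namespace + ":" + SHARED_PREFIX;
    }

    public static void main(String[] args) {
        System.out.println(prefixFor("default")); // KYLIN_
        System.out.println(prefixFor("kylin"));   // kylin:KYLIN_
    }
}
```

Without the namespace qualifier, a scan for `KYLIN_*` in the `kylin` namespace finds nothing, which is why the segment tables were never cleaned up.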


Re: Persistent error after Kafka streaming cube source stopped

2017-10-16 Thread Billy Liu
Hi Roberto,

Could you update more logs. It should has more logs from the context.

2017-10-16 17:19 GMT+08:00 Roberto Tardío :

> Hi,
>
> I'm doing a PoC to build a cube with Kylin using streaming data from
> Kafka. The Kafka connection and cube creation work correctly. I have
> tested using the Kafka test generator included with Kylin, and I scheduled
> the process with crontab to execute every 10 minutes over a period of 24
> hours. However, when I stopped the Kafka generator and the crontab
> schedule, I got errors in Kylin:
>
>    - The Monitor page shows an error when it tries to load jobs. This affects
>      not only the Kafka test project but also the other batch projects.
>      - The Kylin log shows this:
>        ERROR [http-bio-7070-exec-4] controller.BasicController:57 :
>        org.apache.kylin.rest.exception.InternalErrorException:
>        java.lang.RuntimeException: org.apache.kylin.job.exception.PersistentException:
>        com.fasterxml.jackson.databind.JsonMappingException: No content
>        to map due to end-of-input
>        at [Source: java.io.DataInputStream@189348ee; line: 1, column: 1]
>    - I cannot purge or drop this Kafka test streaming cube.
>      - The Kylin UI shows the following error:
>        Failed to delete cube. Caused by: org.apache.kylin.job.
>        exception.hdfs.BlockMissingException: Could not obtain block:
>        BP-699932432.
>
> I have tried and received the same error with both Kylin 1.6 and Kylin 2.1.
> Kylin is still running now, but I have to solve these errors to use it in a
> normal way (e.g., building new cubes).
>
> Can anyone help me?
>
> Thanks in advance,
>
> *Roberto Tardío Olmos*
> *Senior Big Data & Business Intelligence Consultant*
> Avenida de Brasil, 17,
> Planta 16.28020 Madrid
> Fijo: 91.788.34.10
>


Re: Problem with setting queuename in Kylin

2017-10-16 Thread Billy Liu
Seems like a bug, could you open a JIRA for this?

On Oct 16, 2017, at 5:47 PM, wanglb wrote:

> After recently setting up a Kylin 2.1 environment on CDH 5.7, I ran into a
> problem when building a cube.
>
> I need to change the queue name.
> The queue parameters are configured in kylin.properties as follows:
> kylin.source.kylin.client=beeline
> kylin.engine.mr.config-override.mapreduce.job.queuename = root.cbasQueue
> kylin.source.hive.config-override.mapreduce.job.queuename = root.cbasQueue
>
> After making the changes and restarting Kylin, I ran BUILD CUBE.
> In step 1 (create intermediate flat hive table),
> the submitted MR job correctly uses the queue root.cbasQueue.
> Step 2 (redistribute flat hive table) fails; its MR job uses the default queue.
>
> Also, when adding a Hive table via "load hive table from tree", the MR job
> submitted in the background also uses the default queue.
>
> What could be the cause of this?
> Thanks very much!
>


Re: Kafka Streaming data - Error while building the Cube

2017-10-13 Thread Billy Liu
And in Kylin tutorial, the topic is kylindemo, in your sample, the topic is
kylin_demo. Please double check the topic name.

2017-10-13 14:27 GMT+08:00 Billy Liu <billy...@apache.org>:

> If you can package the source code, please try to add more debug logging
> where the Kafka consumer retrieves partition info. Check which topic and how
> many partitions you got.
>
> 2017-10-12 23:19 GMT+08:00 Kumar, Manoj H <manoj.h.ku...@jpmorgan.com>:
>
>> I guess it is in the KafkaSource class, where
>> enrichSourcePartitionBeforeBuild() gets the partition values. That is where
>> it errors out. Do we know how we can test to find out why the start and end
>> offsets come back as 0?
>>
>>
>>
>> Regards,
>>
>> Manoj
>>
>>
>>
>> *From:* Kumar, Manoj H
>> *Sent:* Thursday, October 12, 2017 3:35 PM
>> *To:* 'user@kylin.apache.org'
>> *Subject:* RE: Kafka Streaming data - Error while building the Cube
>>
>>
>>
>> Yes its there.. I could see the messages..
>>
>>
>>
>> Regards,
>>
>> Manoj
>>
>>
>>
>> *From:* Billy Liu [mailto:billy...@apache.org <billy...@apache.org>]
>> *Sent:* Thursday, October 12, 2017 3:11 PM
>>
>> *To:* user
>> *Subject:* Re: Kafka Streaming data - Error while building the Cube
>>
>>
>>
>> The STREAMING_SALES_TABLE table reads messages from the Kafka topic
>> kylin_demo, but got 0 messages.
>>
>>
>>
>> Could you check if the topic has incoming message: 
>> bin/kafka-console-consumer.sh
>> --zookeeper localhost:2181 --bootstrap-server localhost:9092 --topic
>> kylin_demo
>>
>>
>>
>> 2017-10-12 17:19 GMT+08:00 Kumar, Manoj H <manoj.h.ku...@jpmorgan.com>:
>>
>> Pls. find below information about consumer config from Kylin log file.
>>
>>
>>
>> 2017-10-11 02:11:43,787 INFO  [pool-8-thread-1]
>> threadpool.DefaultScheduler:123 : Job Fetcher: 0 should running, 0
>> actual running, 0 stopped, 0 ready, 1 already succeed, 0 error, 0
>> discarded, 0 others
>>
>> 2017-10-11 02:12:13,783 INFO  [pool-8-thread-1]
>> threadpool.DefaultScheduler:123 : Job Fetcher: 0 should running, 0
>> actual running, 0 stopped, 0 ready, 1 already succeed, 0 error, 0
>> discarded, 0 others
>>
>> 2017-10-11 02:12:40,734 INFO  [http-bio-7070-exec-3]
>> streaming.StreamingManager:222 : Reloading Streaming Metadata from folder
>> kylin_metadata(key='/streaming')@kylin_metadata@hbase
>>
>> 2017-10-11 02:12:40,760 DEBUG [http-bio-7070-exec-3]
>> streaming.StreamingManager:247 : Loaded 1 StreamingConfig(s)
>>
>> 2017-10-11 02:12:43,789 INFO  [pool-8-thread-1]
>> threadpool.DefaultScheduler:123 : Job Fetcher: 0 should running, 0
>> actual running, 0 stopped, 0 ready, 1 already succeed, 0 error, 0
>> discarded, 0 others
>>
>> 2017-10-11 02:13:13,788 INFO  [pool-8-thread-1]
>> threadpool.DefaultScheduler:123 : Job Fetcher: 0 should running, 0
>> actual running, 0 stopped, 0 ready, 1 already succeed, 0 error, 0
>> discarded, 0 others
>>
>> 2017-10-11 02:13:43,785 INFO  [pool-8-thread-1]
>> threadpool.DefaultScheduler:123 : Job Fetcher: 0 should running, 0
>> actual running, 0 stopped, 0 ready, 1 already succeed, 0 error, 0
>> discarded, 0 others
>>
>> 2017-10-11 02:14:13,789 INFO  [pool-8-thread-1]
>> threadpool.DefaultScheduler:123 : Job Fetcher: 0 should running, 0
>> actual running, 0 stopped, 0 ready, 1 already succeed, 0 error, 0
>> discarded, 0 others
>>
>> 2017-10-11 02:14:43,796 INFO  [pool-8-thread-1]
>> threadpool.DefaultScheduler:123 : Job Fetcher: 0 should running, 0
>> actual running, 0 stopped, 0 ready, 1 already succeed, 0 error, 0
>> discarded, 0 others
>>
>> 2017-10-11 02:15:03,911 DEBUG [http-bio-7070-exec-1]
>> controller.StreamingController:255 : Saving StreamingConfig
>> {"uuid":"8613b0e1-40ac-438c-bdf5-72be4d91c230","last_modifie
>> d":1507705685859,"version":"2.1.0","name":"DEFAULT.
>> STREAMING_SALES_TABLE","type":"kafka"}
>>
>> 2017-10-11 02:15:03,913 DEBUG [http-bio-7070-exec-1]
>> controller.StreamingController:273 : Saving KafkaConfig
>> {"uuid":"87dc6ab5-5141-4bd8-8e00-c16ec86dce41","last_modifie
>> d":1507705685916,"version":"2.1.0","name":"DEFAULT.
>> STREAMING_SALES_TABLE","clusters":[{"brokers":[{"id":"
>> 1","host":"sandbox&quo

Re: cannot file ojdbc6.jar while kylin startup

2017-10-12 Thread Billy Liu
You can ignore that message. It is Tomcat trying to scan the JDBC driver jar
from the environment; it is not a real Kylin error.

2017-10-12 17:19 GMT+08:00 li...@fcyun.com :

>
> Hi, all
>
> Here are my logs from the kylin.out file. Why does this error occur?
>
>
> Another confusing thing: when I start Kylin from
> /usr/local/apache-kylin-2.0.0-bin/lib/,
> it complains that "/usr/local/apache-kylin-2.0.0-bin/lib/ojdbc6.jar" was not
> found.
>
> #cd /usr/local/apache-kylin-2.0.0-bin/lib/
> #../bin/kylin.sh start
>
>
> If I start Kylin from another directory, for example
> /usr/local/apache-kylin-2.0.0-bin, then it complains that
> "/usr/local/apache-kylin-2.0.0-bin/ojdbc6.jar" was not found.
> not found.
> #cd /usr/local/apache-kylin-2.0.0-bin/
> #bin/kylin.sh start
>
>
>
>
>
>
> usage: java org.apache.catalina.startup.Catalina [ -
> config {pathname} ] [ -nonaming ]  { -help | start | stop }
> Oct 12, 2017 3:40:07 PM org.apache.catalina.core.AprLifecycleListener
> lifecycleEvent
> INFO: The APR based Apache Tomcat Native library which
> allows optimal performance in production environments was
> not found on the java.library.path: :/usr/hdp/2.5.3.0-37/
> hadoop/lib/native/Linux-amd64-64:/usr/hdp/2.5.3.0-37/hadoop/lib/native
> Oct 12, 2017 3:40:07 PM org.apache.coyote.AbstractProtocol init
> INFO: Initializing ProtocolHandler ["http-bio-7070"]
> Oct 12, 2017 3:40:07 PM org.apache.coyote.AbstractProtocol init
> INFO: Initializing ProtocolHandler ["http-bio-7443"]
> Oct 12, 2017 3:40:07 PM org.apache.coyote.AbstractProtocol init
> INFO: Initializing ProtocolHandler ["ajp-bio-9009"]
> Oct 12, 2017 3:40:07 PM org.apache.catalina.startup.Catalina load
> INFO: Initialization processed in 1397 ms
> Oct 12, 2017 3:40:07 PM org.apache.catalina.core.
> StandardService startInternal
> INFO: Starting service Catalina
> Oct 12, 2017 3:40:07 PM org.apache.catalina.core.
> StandardEngine startInternal
> INFO: Starting Servlet Engine: Apache Tomcat/7.0.69
> Oct 12, 2017 3:40:07 PM org.apache.catalina.startup.HostConfig deployWAR
> INFO: Deploying web application archive /usr/local/apache-kylin-2.0.0-bin/
> tomcat/webapps/kylin.war
> Oct 12, 2017 3:40:08 PM org.apache.tomcat.util.scan.
> StandardJarScanner scan
> WARNING: Failed to scan [file:/usr/local/apache-kylin-2.0.0-
> bin/lib/ojdbc6.jar] from classloader hierarchy
> java.io.FileNotFoundException: /usr/local/apache-kylin-2.0.
> 0-bin/lib/ojdbc6.jar (No such file or directory)
> at java.util.zip.ZipFile.open(Native Method)
> at java.util.zip.ZipFile.<init>(ZipFile.java:219)
> at java.util.zip.ZipFile.<init>(ZipFile.java:149)
> at java.util.jar.JarFile.<init>(JarFile.java:166)
> at java.util.jar.JarFile.<init>(JarFile.java:103)
> at sun.net.www.protocol.jar.URLJarFile.<init>(URLJarFile.java:93)
> at sun.net.www.protocol.jar.URLJarFile.
> getJarFile(URLJarFile.java:69)
> at sun.net.www.protocol.jar.JarFileFactory.
> get(JarFileFactory.java:99)
> at sun.net.www.protocol.jar.JarURLConnection.
> connect(JarURLConnection.java:122)
> at sun.net.www.protocol.jar.JarURLConnection.
> getJarFile(JarURLConnection.java:89)
> at org.apache.tomcat.util.scan.FileUrlJar.<init>(
> FileUrlJar.java:41)
> at org.apache.tomcat.util.scan.JarFactory.
> newInstance(JarFactory.java:34)
> at org.apache.catalina.startup.ContextConfig$
> FragmentJarScannerCallback.scan(ContextConfig.java:2679)
> at org.apache.tomcat.util.scan.StandardJarScanner.
> process(StandardJarScanner.java:259)
> ...skipping...
> at org.apache.catalina.startup.HostConfig$
> DeployWar.run(HostConfig.java:1984)
> at java.util.concurrent.Executors$RunnableAdapter.call(
> Executors.java:511)
> at java.util.concurrent.FutureTask.run(FutureTask.java:266)
> at java.util.concurrent.ThreadPoolExecutor.
> runWorker(ThreadPoolExecutor.java:1142)
> at java.util.concurrent.ThreadPoolExecutor$
> Worker.run(ThreadPoolExecutor.java:617)
> at java.lang.Thread.run(Thread.java:748)
>
> Oct 12, 2017 3:40:38 PM org.apache.catalina.startup.ContextConfig
> processResourceJARs
> SEVERE: Failed to process JAR found at URL [jar:file:/usr/
> local/apache-kylin-2.0.0-bin/lib/ojdbc6.jar!/] for static
> resources to be included in context with name [/kylin]
> Oct 12, 2017 3:40:38 PM org.apache.catalina.startup.TaglibUriRule body
> INFO: TLD skipped. URI: urn:com:sun:jersey:api:view is already defined
> Oct 12, 2017 3:40:38 PM org.apache.catalina.startup.TaglibUriRule body
> INFO: TLD skipped. URI: urn:com:sun:jersey:api:view is already defined
> Oct 12, 2017 3:40:38 PM org.apache.catalina.startup.TaglibUriRule body
> INFO: TLD skipped. URI: urn:com:sun:jersey:api:view is already defined
> Oct 12, 2017 3:40:38 PM org.apache.catalina.startup.TaglibUriRule body
> INFO: TLD skipped. URI: urn:com:sun:jersey:api:view is already defined
> Oct 12, 2017 3:40:38 PM org.apache.catalina.startup.TaglibUriRule body
> INFO: TLD skipped. URI: urn:com:sun:jersey:api:view is 

Re: Kafka Streaming data - Error while building the Cube

2017-10-12 Thread Billy Liu
 will
> seek from topic's earliest offset.
>
>
>
> 2017-10-11 20:50:42,558 INFO  [http-bio-7070-exec-8]
> utils.AppInfoParser:83 : Kafka version : 0.10.2-kafka-2.2.0
>
> 2017-10-11 20:50:42,563 INFO  [http-bio-7070-exec-8]
> utils.AppInfoParser:84 : Kafka commitId : unknown
>
> 2017-10-11 20:50:42,570 DEBUG [http-bio-7070-exec-8] kafka.KafkaSource:105
> : Seek end offsets from topic
>
> 2017-10-11 20:50:42,570 INFO  [http-bio-7070-exec-8]
> consumer.ConsumerConfig:196 : ConsumerConfig values:
>
> auto.commit.interval.ms = 5000
>
> auto.offset.reset = latest
>
> bootstrap.servers = [localhost:9092]
>
> check.crcs = true
>
> client.id =
>
> connections.max.idle.ms = 54
>
> enable.auto.commit = false
>
> exclude.internal.topics = true
>
> fetch.max.bytes = 52428800
>
> fetch.max.wait.ms = 500
>
> fetch.min.bytes = 1
>
> group.id = streaming_cube
>
> heartbeat.interval.ms = 3000
>
> interceptor.classes = null
>
> internal.leave.group.on.close = true
>
> key.deserializer = class org.apache.kafka.common.serialization.
> StringDeserializer
>
> max.partition.fetch.bytes = 1048576
>
> max.poll.interval.ms = 30
>
> max.poll.records = 500
>
> metadata.max.age.ms = 30
>
> metric.reporters = []
>
> metrics.num.samples = 2
>
> metrics.recording.level = INFO
>
> metrics.sample.window.ms = 3
>
> partition.assignment.strategy = [class org.apache.kafka.clients.
> consumer.RangeAssignor]
>
> receive.buffer.bytes = 65536
>
> reconnect.backoff.ms = 50
>
> request.timeout.ms = 305000
>
> retry.backoff.ms = 100
>
> sasl.jaas.config = null
>
> sasl.kerberos.kinit.cmd = /usr/bin/kinit
>
> sasl.kerberos.min.time.before.relogin = 6
>
> sasl.kerberos.service.name = null
>
> sasl.kerberos.ticket.renew.jitter = 0.05
>
> request.timeout.ms = 305000
>
> retry.backoff.ms = 100
>
> sasl.jaas.config = null
>
> sasl.kerberos.kinit.cmd = /usr/bin/kinit
>
> sasl.kerberos.min.time.before.relogin = 6
>
> sasl.kerberos.service.name = null
>
> sasl.kerberos.ticket.renew.jitter = 0.05
>
> sasl.kerberos.ticket.renew.window.factor = 0.8
>
> sasl.mechanism = GSSAPI
>
> security.protocol = PLAINTEXT
>
> send.buffer.bytes = 131072
>
> session.timeout.ms = 1
>
> ssl.cipher.suites = null
>
> ssl.enabled.protocols = [TLSv1.2, TLSv1.1, TLSv1]
>
> ssl.endpoint.identification.algorithm = null
>
> ssl.key.password = null
>
> ssl.keymanager.algorithm = SunX509
>
> ssl.keystore.location = null
>
> ssl.keystore.password = null
>
> ssl.keystore.type = JKS
>
> ssl.protocol = TLS
>
> ssl.provider = null
>
> ssl.secure.random.implementation = null
>
> ssl.trustmanager.algorithm = PKIX
>
> ssl.truststore.location = null
>
> ssl.truststore.password = null
>
> ssl.truststore.type = JKS
>
> value.deserializer = class org.apache.kafka.common.serialization.
> StringDeserializer
>
>
>
> 2017-10-11 20:50:42,573 INFO  [http-bio-7070-exec-8]
> utils.AppInfoParser:83 : Kafka version : 0.10.2-kafka-2.2.0
>
> 2017-10-11 20:50:42,573 INFO  [http-bio-7070-exec-8]
> utils.AppInfoParser:84 : Kafka commitId : unknown
>
> 2017-10-11 20:50:42,586 DEBUG [http-bio-7070-exec-8] kafka.KafkaSource:107
> : The end offsets are {0=0}
>
> 2017-10-11 20:50:42,588 ERROR [http-bio-7070-exec-8]
> controller.CubeController:305 : No new message comes, startOffset =
> endOffset:0
>
> java.lang.IllegalArgumentException: No new message comes, startOffset =
> endOffset:0
>
> at org.apache.kylin.source.kafka.KafkaSource.
> enrichSourcePartitionBeforeBuild(KafkaSource.java:134)
>
> at org.apache.kylin.rest.service.JobService.submitJobInternal(
> JobService.java:236)
>
> Regards,
>
> Manoj
>
>
>
> *From:* Billy Liu [mailto:billy...@apache.org]
> *Sent:* Thursday, October 12, 2017 1:06 PM
> *To:* user
> *Subject:* Re: Kafka Streaming data - Error while building the Cube
>
>
>
> Hi Kumar,
>
>
>
> Could you paste more Kafka Consumer related log in kylin.log? And also
> check from the Kafka broker side, if the Kylin clien

Re: Merge Segment ERROR

2017-10-12 Thread Billy Liu
Fixed at https://issues.apache.org/jira/browse/KYLIN-2794, and will be
released in Apache Kylin 2.2

2017-10-11 15:41 GMT+08:00 s丶影中人* <845286...@qq.com>:

> When I merge segments, I encounter an error; how can I solve it? Please...
> Log as follows:
>
> 2017-10-11 15:24:56,139 ERROR [pool-9-thread-10]
> threadpool.DefaultScheduler:145 : ExecuteException
> job:7508dfa0-5a89-4c3c-8685-701226628207
> org.apache.kylin.job.exception.ExecuteException: 
> org.apache.kylin.job.exception.ExecuteException:
> java.lang.IllegalStateException: Invalid input data. Unordered data
> cannot be split into multi trees
> at org.apache.kylin.job.execution.AbstractExecutable.
> execute(AbstractExecutable.java:135)
> at org.apache.kylin.job.impl.threadpool.DefaultScheduler$JobRunner.run(
> DefaultScheduler.java:141)
> at java.util.concurrent.ThreadPoolExecutor.runWorker(
> ThreadPoolExecutor.java:1142)
> at java.util.concurrent.ThreadPoolExecutor$Worker.run(
> ThreadPoolExecutor.java:617)
> at java.lang.Thread.run(Thread.java:745)
> Caused by: org.apache.kylin.job.exception.ExecuteException: 
> java.lang.IllegalStateException:
> Invalid input data. Unordered data cannot be split into multi trees
> at org.apache.kylin.job.execution.AbstractExecutable.
> execute(AbstractExecutable.java:135)
> at org.apache.kylin.job.execution.DefaultChainedExecutable.doWork(
> DefaultChainedExecutable.java:65)
> at org.apache.kylin.job.execution.AbstractExecutable.
> execute(AbstractExecutable.java:125)
> ... 4 more
> Caused by: java.lang.IllegalStateException: Invalid input data. Unordered
> data cannot be split into multi trees
> at org.apache.kylin.dict.TrieDictionaryForestBuilder.addValue(
> TrieDictionaryForestBuilder.java:92)
> at org.apache.kylin.dict.TrieDictionaryForestBuilder.addValue(
> TrieDictionaryForestBuilder.java:78)
> at org.apache.kylin.dict.DictionaryGenerator$NumberTrieDictForestBuilder.
> addValue(DictionaryGenerator.java:261)
> at org.apache.kylin.dict.DictionaryGenerator.buildDictionary(
> DictionaryGenerator.java:79)
> at org.apache.kylin.dict.DictionaryGenerator.buildDictionary(
> DictionaryGenerator.java:64)
> at org.apache.kylin.dict.DictionaryGenerator.mergeDictionaries(
> DictionaryGenerator.java:104)
> at org.apache.kylin.dict.DictionaryManager.mergeDictionary(
> DictionaryManager.java:275)
> at org.apache.kylin.engine.mr.steps.MergeDictionaryStep.mergeDictionaries(
> MergeDictionaryStep.java:146)
> at org.apache.kylin.engine.mr.steps.MergeDictionaryStep.
> makeDictForNewSegment(MergeDictionaryStep.java:136)
> at org.apache.kylin.engine.mr.steps.MergeDictionaryStep.
> doWork(MergeDictionaryStep.java:68)
> at org.apache.kylin.job.execution.AbstractExecutable.
> execute(AbstractExecutable.java:125)
> ... 6 more
>
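For context on the "Unordered data cannot be split into multi trees" failure fixed in KYLIN-2794: TrieDictionaryForestBuilder requires its input values in ascending order so the dictionary can be split into a forest of tries. A minimal sketch of that invariant (illustrative, not Kylin's actual implementation):

```java
public class SortedInputBuilder {
    private String last = null;

    // Mirrors the invariant the dictionary builder enforces: values must
    // arrive in ascending order so they can be split cleanly into tries.
    void addValue(String v) {
        if (last != null && v.compareTo(last) < 0) {
            throw new IllegalStateException(
                "Invalid input data. Unordered data cannot be split into multi trees");
        }
        last = v;
    }

    public static void main(String[] args) {
        SortedInputBuilder b = new SortedInputBuilder();
        b.addValue("100");
        b.addValue("200");
        try {
            b.addValue("150"); // out of order, like the failing merge above
        } catch (IllegalStateException e) {
            System.out.println(e.getMessage());
        }
    }
}
```

The KYLIN-2794 fix addresses the merge path that fed values to the builder out of order.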


Re: Kafka Streaming data - Error while building the Cube

2017-10-12 Thread Billy Liu
Hi Kumar,

Could you paste more Kafka Consumer related log in kylin.log? And also
check from the Kafka broker side, if the Kylin client has connected to
Broker.

2017-10-12 14:29 GMT+08:00 Kumar, Manoj H :

> Building the cube from the Kylin UI: although messages are there in the Kafka
> topic, Kylin is not able to read the offsets. Can someone help with this?
>
> 2017-10-11 20:50:42,573 INFO  [http-bio-7070-exec-8]
> utils.AppInfoParser:83 : Kafka version : 0.10.2-kafka-2.2.0
> 2017-10-11 20:50:42,573 INFO  [http-bio-7070-exec-8]
> utils.AppInfoParser:84 : Kafka commitId : unknown
> 2017-10-11 20:50:42,586 DEBUG [http-bio-7070-exec-8] kafka.KafkaSource:107
> : The end offsets are {0=0}
> 2017-10-11 20:50:42,588 ERROR [http-bio-7070-exec-8]
> controller.CubeController:305 : No new message comes, startOffset =
> endOffset:0
> java.lang.IllegalArgumentException: No new message comes, startOffset =
> endOffset:0
> at org.apache.kylin.source.kafka.KafkaSource.
> enrichSourcePartitionBeforeBuild(KafkaSource.java:134)
> at org.apache.kylin.rest.service.JobService.submitJobInternal(
> JobService.java:236)
> at org.apache.kylin.rest.service.JobService.submitJob(
> JobService.java:208)
> at org.apache.kylin.rest.service.JobService$$
> FastClassBySpringCGLIB$$83a44b2a.invoke()
>
> Regards,
> Manoj
>
>
> This message is confidential and subject to terms at:
> http://www.jpmorgan.com/emaildisclaimer including on confidentiality,
> legal privilege, viruses and monitoring of electronic messages. If you are
> not the intended recipient, please delete this message and notify the
> sender immediately. Any unauthorized use is strictly prohibited.
>
>
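The error above comes from a guard that compares per-partition start and end offsets and refuses to build a segment when no partition advanced. A sketch of such a check (class and method names here are illustrative; only the error message mirrors the one in the log):

```java
import java.util.Map;

public class OffsetRangeCheck {
    // Refuse to build a segment when no partition gained new messages,
    // the situation behind "No new message comes, startOffset = endOffset:0".
    static void check(Map<Integer, Long> startOffsets, Map<Integer, Long> endOffsets) {
        long newMessages = 0;
        for (Map.Entry<Integer, Long> e : endOffsets.entrySet()) {
            newMessages += e.getValue() - startOffsets.getOrDefault(e.getKey(), 0L);
        }
        if (newMessages <= 0) {
            throw new IllegalArgumentException(
                "No new message comes, startOffset = endOffset:" + newMessages);
        }
    }

    public static void main(String[] args) {
        check(Map.of(0, 0L), Map.of(0, 100L)); // 100 new messages: OK
        try {
            check(Map.of(0, 0L), Map.of(0, 0L)); // the {0=0} case from the log
        } catch (IllegalArgumentException e) {
            System.out.println(e.getMessage());
        }
    }
}
```

So the "{0=0}" end-offsets line in the log means the consumer saw the topic but found no messages past the start offset, which points at the topic name, broker address, or an already-consumed topic.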


Re: query return error result.

2017-10-11 Thread Billy Liu
Nice find. +1

2017-10-11 15:08 GMT+08:00 yu feng :

> I checked the code and found that the root cause is DumpMerger.enqueueFromDump().
>
> I created a JIRA, KYLIN-2926
>  to track the bug.
>
>
>
> 2017-10-09 10:37 GMT+08:00 yu feng :
>
>> The cube is using hllc15; we are tracing the code and trying to find the
>> reason.
>>
>> 2017-10-08 14:52 GMT+08:00 Li Yang :
>>
>>> Interesting... is it HLL count distinct or bitmap count distinct?
>>>
>>> On Wed, Sep 27, 2017 at 11:19 AM, yu feng  wrote:
>>>
 I add some log and find data from hbase is incorrect.

 2017-09-27 11:17 GMT+08:00 yu feng :

> I have a cube like this :
> dimensions : source_type, source_id, name, dt
> measures:count(distinct uid), count(1) , count(distinct buyer)
>
> I run the query :
>
> select source_type, source_id, name,
> count(distinct uid), count(uid) as cnum, count(distinct buyer) as
> buyerNum,
> count(buyer) as bnum
> from
> table_name
> where
> dt between '2017-06-01' and '2017-09-18'
> and source_id is not null
> and source_type is not null
> group by
> source_type, source_id, name
> order by buyerNum desc limit 1 offset 0
>
> return :
>
> mv | 423031 | 起点‧终站 | 193794 | 92 | 42043 | 92
>
>
>
>
>
> Obviously, this is a wrong result. I queried that source_id like this:
>
> select source_type, source_id, name,
> count(distinct uid), count(uid) as cnum, count(distinct buyer) as
> buyerNum,
> count(buyer) as bnum
> from
> vip_buying_funnel_cube_view
> where
> dt between '2017-06-01' and '2017-09-18'
> and source_id is not null
> and source_type is not null
> and source_id = '423031'
> group by
> source_type, source_id, name
> order by buyerNum desc limit 1 offset 0
>
> the result is correct:
>
> mv | 423031 | 起点‧终站 | 77 | 92 | 11 | 92
>


>>>
>>
>
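For context on KYLIN-2926: DumpMerger.enqueueFromDump() is part of a k-way merge over sorted measure dumps. The general pattern such a merger must follow, re-enqueueing the next element from the same list the popped head came from, can be sketched as follows (illustrative code, not Kylin's actual DumpMerger):

```java
import java.util.ArrayList;
import java.util.Comparator;
import java.util.List;
import java.util.PriorityQueue;

public class KWayMerge {
    // Merge several individually sorted lists into one sorted output.
    static List<Integer> merge(List<List<Integer>> sortedLists) {
        // Heap entries: {value, listIndex, offsetInList}
        PriorityQueue<int[]> heap = new PriorityQueue<>(Comparator.comparingInt(e -> e[0]));
        for (int i = 0; i < sortedLists.size(); i++) {
            if (!sortedLists.get(i).isEmpty()) {
                heap.add(new int[]{sortedLists.get(i).get(0), i, 0});
            }
        }
        List<Integer> out = new ArrayList<>();
        while (!heap.isEmpty()) {
            int[] top = heap.poll();
            out.add(top[0]);
            // Re-enqueue the next element from the SAME list the head came
            // from; pulling from the wrong source breaks the global order,
            // which can corrupt merged aggregates like the ones above.
            int next = top[2] + 1;
            if (next < sortedLists.get(top[1]).size()) {
                heap.add(new int[]{sortedLists.get(top[1]).get(next), top[1], next});
            }
        }
        return out;
    }

    public static void main(String[] args) {
        System.out.println(merge(List.of(List.of(1, 4, 7), List.of(2, 3, 8), List.of(5))));
        // [1, 2, 3, 4, 5, 7, 8]
    }
}
```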


Re: How to merge cube segments with holes?

2017-10-10 Thread Billy Liu
Another workaround is to trigger PUT kylin/api/cubes/AnalyticsCube/holes to
fill up the holes.
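The workaround above can be scripted. The sketch below builds (but does not send) the PUT request against the /holes endpoint Billy mentions; the host, cube name, and ADMIN:KYLIN credentials are the example values used elsewhere in this thread, not a real deployment.

```python
import base64
import urllib.request

def make_fill_holes_request(host, cube, user="ADMIN", password="KYLIN"):
    """Build (but do not send) the PUT request that asks Kylin to
    schedule merge jobs for any holes in a cube's segment list."""
    url = "http://%s/kylin/api/cubes/%s/holes" % (host, cube)
    token = base64.b64encode(("%s:%s" % (user, password)).encode()).decode()
    req = urllib.request.Request(url, method="PUT")
    req.add_header("Content-Type", "application/json")
    req.add_header("Authorization", "Basic " + token)
    return req

# Inspect the request without touching a live server.
req = make_fill_holes_request("192.168.1.135:7070", "AnalyticsCube")
print(req.get_method(), req.full_url)
```

Sending it is then just `urllib.request.urlopen(req)` against a reachable Kylin server (or the equivalent curl command shown in the quoted messages below).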

2017-10-10 18:01 GMT+08:00 Billy Liu <billy...@apache.org>:

> The code is a little old. Could you have a try on the latest 2.1?
>
> 2017-10-09 15:41 GMT+08:00 <prasann...@trinitymobility.com>:
>
>> Thank you for your reply Billy. I tried with 'force'=true parameter but
>> its giving same exception as,
>>
>>
>>
>> I used as below,
>>
>>
>>
>> /usr/bin/curl -b /home/hdfs/cookiefile.txt --user ADMIN:KYLIN -X PUT -H
>> 'Content-Type: application/json' -d '{"startTime":'149886720',
>> "endTime":'1507366853000',"buildType":"MERGE","force":true}'
>> http://192.168.1.135:7070/kylin/api/cubes/trinityAnalyticsCube/rebuild
>>
>>
>>
>> *EXCEPTION:*
>>
>>
>>
>> {"url":"http://ipaddress:7070/kylin/api/cubes/AnalyticsCube/rebuild","exception":"Merging
>> segments must not have holes between 
>> AnalyticsCube[2017070100_20170911192537]
>> and AnalyticsCube[20170912102545_20170912131004]"}
>>
>>
>>
>>
>>
>>
>>
>> *From:* Billy Liu [mailto:billy...@apache.org]
>> *Sent:* Sunday, October 08, 2017 1:52 PM
>>
>> *To:* user
>> *Subject:* Re: How to merge cube segments with holes?
>>
>>
>>
>> Sorry, my fault. Just check the code again, the parameter should be
>> "force", not "isForce".
>>
>>
>>
>> 2017-10-04 12:58 GMT+08:00 <prasann...@trinitymobility.com>:
>>
>> HI,
>>
>> I am using Kylin 1.6 and i tried with isForce=true parameter but its
>> giving as com.fasterxml.jackson.databind.exc.UnrecognizedPropertyException:
>> Unrecognized field \"isForce\"
>>
>>
>>
>> /usr/bin/curl -b /home/hdfs/cookiefile.txt --user ADMIN:KYLIN -X PUT -H
>> 'Content-Type: application/json' -d '{"startTime":'149886720',
>> "endTime":'1507112233000',"buildType":"MERGE","isForce":true}'
>> http://192.168.1.135:7070/kylin/api/cubes/trinityAnalyticsCube/rebuild
>>
>>
>>
>>
>>
>> {"url":"http://192.168.1.135:7070/kylin/api/cubes/AnalyticsCube/rebuild
>> ","exception":"Could not read JSON: Unrecognized field \"isForce\"
>> (class org.apache.kylin.rest.request.JobBuildRequest), not marked as
>> ignorable (5 known properties: \"endTime\", \"force\", \"startTime\",
>> \"buildType\", \"forceMergeEmptySegment\"])\n at [Source:
>> org.apache.catalina.connector.CoyoteInputStream@61b3d534; line: 1,
>> column: 87] (through reference chain: 
>> org.apache.kylin.rest.request.JobBuildRequest[\"isForce\"]);
>> nested exception is com.fasterxml.jackson.databind
>> .exc.UnrecognizedPropertyException: Unrecognized field \"isForce\"
>> (class org.apache.kylin.rest.request.JobBuildRequest), not marked as
>> ignorable (5 known properties: \"endTime\", \"force\", \"startTime\",
>> \"buildType\", \"forceMergeEmptySegment\"])\n at [Source:
>> org.apache.catalina.connector.CoyoteInputStream@61b3d534; line: 1,
>> column: 87] (through reference chain: org.apache.kylin.rest.request.
>> JobBuildRequest[\"isForce\"])"}
>>
>>
>>
>>
>>
>> I tried with force=true parameter also, its giving as,
>>
>>
>>
>> {"url":"http://192.168.1.135:7070/kylin/api/cubes/trinityAna
>> lyticsCube/rebuild","exception":"Merging segments must not have holes
>> between trinityAnalyticsCube[2017070100_20170911192537] and
>> trinityAnalyticsCube[20170912102545_20170912131004]"}
>>
>>
>>
>> Please suggest me which one  I have to use?
>>
>>
>>
>> *From:* Billy Liu [mailto:billy...@apache.org]
>> *Sent:* Wednesday, October 04, 2017 4:44 AM
>> *To:* user
>> *Subject:* Re: How to merge cube segments with holes?
>>
>>
>>
>> First, the isForceMergeEmptySegment has been deprecated, use
>> isForce instead is OK in your case.
>>
>> Second, the stacktrace shows the error is from the parameter
>> serialization issue, not from the backend code. The error line1 column 15
>> may indicate the wrong parser in $marge_start_time. Could you replace this
>> variable into the number and try again?
>>
>>
>>
>> 201

Re: How to merge cube segments with holes?

2017-10-10 Thread Billy Liu
The code is a little old. Could you have a try on the latest 2.1?

2017-10-09 15:41 GMT+08:00 <prasann...@trinitymobility.com>:

> Thank you for your reply, Billy. I tried with the 'force'=true parameter, but
> it's giving the same exception:
>
>
>
> I used as below,
>
>
>
> /usr/bin/curl -b /home/hdfs/cookiefile.txt --user ADMIN:KYLIN -X PUT -H
> 'Content-Type: application/json' -d '{"startTime":'149886720',
> "endTime":'1507366853000',"buildType":"MERGE","force":true}'
> http://192.168.1.135:7070/kylin/api/cubes/trinityAnalyticsCube/rebuild
>
>
>
> *EXCEPTION:*
>
>
>
> {"url":"http://ipaddress:7070/kylin/api/cubes/AnalyticsCube/rebuild","exception":"Merging
> segments must not have holes between 
> AnalyticsCube[2017070100_20170911192537]
> and AnalyticsCube[20170912102545_20170912131004]"}
>
>
>
>
>
>
>
> *From:* Billy Liu [mailto:billy...@apache.org]
> *Sent:* Sunday, October 08, 2017 1:52 PM
>
> *To:* user
> *Subject:* Re: How to merge cube segments with holes?
>
>
>
> Sorry, my fault. Just check the code again, the parameter should be
> "force", not "isForce".
>
>
>
> 2017-10-04 12:58 GMT+08:00 <prasann...@trinitymobility.com>:
>
> HI,
>
> I am using Kylin 1.6 and i tried with isForce=true parameter but its
> giving as com.fasterxml.jackson.databind.exc.UnrecognizedPropertyException:
> Unrecognized field \"isForce\"
>
>
>
> /usr/bin/curl -b /home/hdfs/cookiefile.txt --user ADMIN:KYLIN -X PUT -H
> 'Content-Type: application/json' -d '{"startTime":'149886720',
> "endTime":'1507112233000',"buildType":"MERGE","isForce":true}'
> http://192.168.1.135:7070/kylin/api/cubes/trinityAnalyticsCube/rebuild
>
>
>
>
>
> {"url":"http://192.168.1.135:7070/kylin/api/cubes/AnalyticsCube/rebuild",;
> exception":"Could not read JSON: Unrecognized field \"isForce\" (class
> org.apache.kylin.rest.request.JobBuildRequest), not marked as ignorable
> (5 known properties: \"endTime\", \"force\", \"startTime\", \"buildType\",
> \"forceMergeEmptySegment\"])\n at [Source: org.apache.catalina.connector.
> CoyoteInputStream@61b3d534; line: 1, column: 87] (through reference
> chain: org.apache.kylin.rest.request.JobBuildRequest[\"isForce\"]);
> nested exception is 
> com.fasterxml.jackson.databind.exc.UnrecognizedPropertyException:
> Unrecognized field \"isForce\" (class 
> org.apache.kylin.rest.request.JobBuildRequest),
> not marked as ignorable (5 known properties: \"endTime\", \"force\",
> \"startTime\", \"buildType\", \"forceMergeEmptySegment\"])\n at [Source:
> org.apache.catalina.connector.CoyoteInputStream@61b3d534; line: 1,
> column: 87] (through reference chain: org.apache.kylin.rest.request.
> JobBuildRequest[\"isForce\"])"}
>
>
>
>
>
> I tried with force=true parameter also, its giving as,
>
>
>
> {"url":"http://192.168.1.135:7070/kylin/api/cubes/
> trinityAnalyticsCube/rebuild","exception":"Merging segments must not have
> holes between trinityAnalyticsCube[2017070100_20170911192537] and
> trinityAnalyticsCube[20170912102545_20170912131004]"}
>
>
>
> Please suggest me which one  I have to use?
>
>
>
> *From:* Billy Liu [mailto:billy...@apache.org]
> *Sent:* Wednesday, October 04, 2017 4:44 AM
> *To:* user
> *Subject:* Re: How to merge cube segments with holes?
>
>
>
> First, the isForceMergeEmptySegment has been deprecated, use
> isForce instead is OK in your case.
>
> Second, the stacktrace shows the error is from the parameter serialization
> issue, not from the backend code. The error line1 column 15 may indicate
> the wrong parser in $marge_start_time. Could you replace this variable into
> the number and try again?
>
>
>
> 2017-10-03 22:34 GMT+08:30 <prasann...@trinitymobility.com>:
>
> I am using forceMergeEmptySegment parameter for merging empty segments,
> can please tell me the parameter for holes between segments.I tried what
> you suggested previous, but its giving wrong parameter as error. I am using
> the below for merging,
>
>
>
> /usr/bin/curl -b /home/hdfs/cookiefile.txt --user ADMIN:KYLIN -X PUT -H
> 'Content-Type: application/json' -d '{"startTime":'$marge_start_time',
> "endTime":'$merger_end_time',"buildType":"MERGE","
> forceMergeEmptySegment":true,"isForce&

Re: apache-kylin-2.1.0-bin-cdh57.tar.gz

2017-10-09 Thread Billy Liu
The CDH57 works for all CDH 5.7+

2017-10-10 2:35 GMT+08:00 Ruslan Dautkhanov :

> From the Downloads page I can only see apache-kylin-2.1.0-bin-cdh57.tar.gz.
> Are there more current builds, like for CDH 5.12?
> If not, does it make sense to build Kylin myself? (I am not sure what the
> advantage of a build specific to the latest CDH release would be.)
>
> Thank you,
> Ruslan
>
>


Re: How to merge cube segments with holes?

2017-10-08 Thread Billy Liu
Sorry, my fault. I just checked the code again; the parameter should be
"force", not "isForce".
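The Jackson error quoted further down actually lists the fields JobBuildRequest does accept: endTime, force, startTime, buildType, forceMergeEmptySegment. A small sketch of building the corrected request body (the timestamps are arbitrary example values):

```python
import json

def merge_payload(start_ms, end_ms, force=True):
    # JobBuildRequest (per the error message below) knows: endTime, force,
    # startTime, buildType, forceMergeEmptySegment -- "isForce" is not one of them.
    return json.dumps({
        "startTime": start_ms,
        "endTime": end_ms,
        "buildType": "MERGE",
        "force": force,  # correct field name
    })

body = merge_payload(1498867200000, 1507366853000)
print(body)
```

This string is what goes after `-d` in the curl commands quoted in this thread.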

2017-10-04 12:58 GMT+08:00 <prasann...@trinitymobility.com>:

> HI,
>
> I am using Kylin 1.6 and I tried the isForce=true parameter, but it's
> giving com.fasterxml.jackson.databind.exc.UnrecognizedPropertyException:
> Unrecognized field \"isForce\"
>
>
>
> /usr/bin/curl -b /home/hdfs/cookiefile.txt --user ADMIN:KYLIN -X PUT -H
> 'Content-Type: application/json' -d '{"startTime":'149886720',
> "endTime":'1507112233000',"buildType":"MERGE","isForce":true}'
> http://192.168.1.135:7070/kylin/api/cubes/trinityAnalyticsCube/rebuild
>
>
>
>
>
> {"url":"http://192.168.1.135:7070/kylin/api/cubes/AnalyticsCube/rebuild",;
> exception":"Could not read JSON: Unrecognized field \"isForce\" (class
> org.apache.kylin.rest.request.JobBuildRequest), not marked as ignorable
> (5 known properties: \"endTime\", \"force\", \"startTime\", \"buildType\",
> \"forceMergeEmptySegment\"])\n at [Source: org.apache.catalina.connector.
> CoyoteInputStream@61b3d534; line: 1, column: 87] (through reference
> chain: org.apache.kylin.rest.request.JobBuildRequest[\"isForce\"]);
> nested exception is 
> com.fasterxml.jackson.databind.exc.UnrecognizedPropertyException:
> Unrecognized field \"isForce\" (class 
> org.apache.kylin.rest.request.JobBuildRequest),
> not marked as ignorable (5 known properties: \"endTime\", \"force\",
> \"startTime\", \"buildType\", \"forceMergeEmptySegment\"])\n at [Source:
> org.apache.catalina.connector.CoyoteInputStream@61b3d534; line: 1,
> column: 87] (through reference chain: org.apache.kylin.rest.request.
> JobBuildRequest[\"isForce\"])"}
>
>
>
>
>
> I tried with the force=true parameter also; it's giving:
>
>
>
> {"url":"http://192.168.1.135:7070/kylin/api/cubes/
> trinityAnalyticsCube/rebuild","exception":"Merging segments must not have
> holes between trinityAnalyticsCube[2017070100_20170911192537] and
> trinityAnalyticsCube[20170912102545_20170912131004]"}
>
>
>
> Please suggest which one I should use.
>
>
>
> *From:* Billy Liu [mailto:billy...@apache.org]
> *Sent:* Wednesday, October 04, 2017 4:44 AM
> *To:* user
> *Subject:* Re: How to merge cube segments with holes?
>
>
>
> First, the isForceMergeEmptySegment has been deprecated, use
> isForce instead is OK in your case.
>
> Second, the stacktrace shows the error is from the parameter serialization
> issue, not from the backend code. The error line1 column 15 may indicate
> the wrong parser in $marge_start_time. Could you replace this variable into
> the number and try again?
>
>
>
> 2017-10-03 22:34 GMT+08:30 <prasann...@trinitymobility.com>:
>
> I am using forceMergeEmptySegment parameter for merging empty segments,
> can please tell me the parameter for holes between segments.I tried what
> you suggested previous, but its giving wrong parameter as error. I am using
> the below for merging,
>
>
>
> /usr/bin/curl -b /home/hdfs/cookiefile.txt --user ADMIN:KYLIN -X PUT -H
> 'Content-Type: application/json' -d '{"startTime":'$marge_start_time',
> "endTime":'$merger_end_time',"buildType":"MERGE","
> forceMergeEmptySegment":true,"isForce":true}' http://192.168.1.61:7070/
> kylin/api/cubes/EnvironmentDetailsCube/rebuild.
>
>
>
> error as follows,
>
>
>
> {"url":"http://10.82.0.17:7070/kylin/api/cubes/CameraAlertCube/rebuild","exception":"Could
> not read JSON: Unexpected character (',' (code 44)): expected a valid value
> (number, String, array, object, 'true', 'false' or 'null')\n at [Source:
> org.apache.catalina.connector.CoyoteInputStream@10255ad; line: 1, column:
> 15]; nested exception is com.fasterxml.jackson.core.JsonParseException:
> Unexpected character (',' (code 44)): expected a valid value (number,
> String, array, object, 'true', 'false' or 'null')\n at [Source:
> org.apache.catalina.connector.CoyoteInputStream@10255ad; line: 1, column:
> 15]"}
>
>
>
>
>
> Please provide me proper way to do this..
>
>
>
>
>
> *From:* Billy Liu [mailto:billy...@apache.org]
> *Sent:* Tuesday, October 03, 2017 7:15 PM
> *To:* user
> *Subject:* Re: How to merge cube segments with holes?
>
>
>
> The merge API has one parameter named isForce in JobBuildRequest. Could
> you set it to true and try again?
>
>
>
> 2017-10-03 20:25 GMT+08:30 <prasann...@trinitymobility.com>:
>
> Hi all,
>
>
>
> Anybody have any idea about how to merge cube segments with hole. if I
> tried to merge its giving as,
>
>
>
>
>
> Merging segments must not have holes between 
> CameraAlertCube[2017092612_20170926154956]
> and CameraAlertCube[20170926161959_20170926165002]
>
>
>
>
>
> can you please help me how to solve this problem
>
>
>
>
>
> Thanks & Regards,
>
> Prasanna.P
>
>
>
>
>
>
>


Re: How to merge cube segments with holes?

2017-10-03 Thread Billy Liu
First, isForceMergeEmptySegment has been deprecated; using
isForce instead is OK in your case.
Second, the stack trace shows the error is a parameter serialization
issue, not from the backend code. The error at line 1, column 15 may indicate
that $marge_start_time did not expand to a number. Could you replace this
variable with a literal number and try again?
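Billy's diagnosis can be reproduced locally: if the shell variable is empty or unset, the single-quoted JSON template becomes `{"startTime":,...}`, which is exactly the "Unexpected character (',')" parse error Jackson reports. A small demonstration:

```python
import json

# The curl body from this thread, with shell variables as %s placeholders.
template = '{"startTime":%s,"endTime":%s,"buildType":"MERGE","force":true}'

# With the shell variable expanded to a number, the body parses fine:
good = template % (1498867200000, 1507112233000)
assert json.loads(good)["buildType"] == "MERGE"

# With $marge_start_time empty or unset, curl sends '{"startTime":,...}',
# which no JSON parser will accept.
bad = template % ("", 1507112233000)
try:
    json.loads(bad)
    parsed_ok = True
except ValueError:
    parsed_ok = False
print(parsed_ok)  # False
```

So the fix on the shell side is to make sure the variable really holds a millisecond timestamp before the curl call runs.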

2017-10-03 22:34 GMT+08:30 <prasann...@trinitymobility.com>:

> I am using the forceMergeEmptySegment parameter for merging empty segments;
> can you please tell me the parameter for holes between segments? I tried what
> you suggested previously, but it's giving a wrong-parameter error. I am using
> the below for merging,
>
>
>
> /usr/bin/curl -b /home/hdfs/cookiefile.txt --user ADMIN:KYLIN -X PUT -H
> 'Content-Type: application/json' -d '{"startTime":'$marge_start_time',
> "endTime":'$merger_end_time',"buildType":"MERGE","
> forceMergeEmptySegment":true,"isForce":true}' http://192.168.1.61:7070/
> kylin/api/cubes/EnvironmentDetailsCube/rebuild.
>
>
>
> error as follows,
>
>
>
> {"url":"http://10.82.0.17:7070/kylin/api/cubes/CameraAlertCube/rebuild","exception":"Could
> not read JSON: Unexpected character (',' (code 44)): expected a valid value
> (number, String, array, object, 'true', 'false' or 'null')\n at [Source:
> org.apache.catalina.connector.CoyoteInputStream@10255ad; line: 1, column:
> 15]; nested exception is com.fasterxml.jackson.core.JsonParseException:
> Unexpected character (',' (code 44)): expected a valid value (number,
> String, array, object, 'true', 'false' or 'null')\n at [Source:
> org.apache.catalina.connector.CoyoteInputStream@10255ad; line: 1, column:
> 15]"}
>
>
>
>
>
> Please show me the proper way to do this.
>
>
>
>
>
> *From:* Billy Liu [mailto:billy...@apache.org]
> *Sent:* Tuesday, October 03, 2017 7:15 PM
> *To:* user
> *Subject:* Re: How to merge cube segments with holes?
>
>
>
> The merge API has one parameter named isForce in JobBuildRequest. Could
> you set it to true and try again?
>
>
>
> 2017-10-03 20:25 GMT+08:30 <prasann...@trinitymobility.com>:
>
> Hi all,
>
>
>
> Anybody have any idea about how to merge cube segments with hole. if I
> tried to merge its giving as,
>
>
>
>
>
> Merging segments must not have holes between 
> CameraAlertCube[2017092612_20170926154956]
> and CameraAlertCube[20170926161959_20170926165002]
>
>
>
>
>
> can you please help me how to solve this problem
>
>
>
>
>
> Thanks & Regards,
>
> Prasanna.P
>
>
>
>
>


Re: How to merge cube segments with holes?

2017-10-03 Thread Billy Liu
The merge API has one parameter named isForce in JobBuildRequest. Could you
set it to true and try again?

2017-10-03 20:25 GMT+08:30 :

> Hi all,
>
>
>
> Does anybody have any idea how to merge cube segments with a hole? If I
> try to merge, it's giving:
>
>
>
>
>
> Merging segments must not have holes between 
> CameraAlertCube[2017092612_20170926154956]
> and CameraAlertCube[20170926161959_20170926165002]
>
>
>
>
>
> Can you please help me solve this problem?
>
>
>
>
>
> Thanks & Regards,
>
> Prasanna.P
>
>
>


Re: Re: Segment overlap when merge

2017-09-28 Thread Billy Liu
bin/metastore.sh backup
Remove the wrong segment in /cube/XXX.json
bin/metastore.sh restore TO_META_DIR
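The middle step, editing the dumped cube JSON by hand, is easy to get wrong; it can be scripted instead. This is a sketch only (the function name and the toy segment names are made up; the real file lives under the backup directory produced by metastore.sh):

```python
import json

def drop_segment(cube_json_text, segment_name):
    """Remove one named segment from a dumped cube descriptor
    (the /cube/<name>.json file from `metastore.sh backup`)."""
    cube = json.loads(cube_json_text)
    before = len(cube["segments"])
    cube["segments"] = [s for s in cube["segments"] if s["name"] != segment_name]
    # Fail loudly if nothing (or more than one thing) was removed.
    assert len(cube["segments"]) == before - 1, "segment not found, or duplicated"
    return json.dumps(cube, indent=2)

# Toy cube with a stray segment, mirroring the structure quoted below.
raw = json.dumps({"name": "sales_cube",
                  "segments": [{"name": "2012010100_2013010100"},
                               {"name": "2013010100_2014010100"}]})
fixed = json.loads(drop_segment(raw, "2013010100_2014010100"))
print([s["name"] for s in fixed["segments"]])  # ['2012010100_2013010100']
```

After writing the edited JSON back into the backup directory, `metastore.sh restore` uploads it, as in Billy's three steps.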

2017-09-28 16:07 GMT+08:00 luoyong...@yaochufa.com <luoyong...@yaochufa.com>
:

> Is it to clear the wrong segment data in /cube/{cube_name}.json?
>
> {
>   "uuid" : "2fbca32a-a33e-4b69-83dd-0bb8b1f8c53b",
>   "last_modified" : 1506411023041,
>   "version" : "1.6.0",
>   "name" : "sales_cube",
>   "owner" : null,
>   "descriptor" : "sales_cube_desc",
>   "cost" : 50,
>   "status" : "READY",
>   "segments" : [ {
> "uuid" : "7f05f8c7-d848-49ea-9c22-ad1c15e24cdb",
> "name" : "2012010100_2013010100",
> "storage_location_identifier" : "KYLIN_MJWNSKHD2B",
> "date_range_start" : 132537600,
> "date_range_end" : 135699840,
> "source_offset_start" : 0,
> "source_offset_end" : 0,
> "status" : "READY",
> "size_kb" : 5744,
> "input_records" : 4957,
> "input_records_size" : 104480,
> "last_build_time" : 1506396911691,
> "last_build_job_id" : "735ee8d3-f83a-4c6f-a81c-da30f08e990f",
> "create_time_utc" : 1506338600562,
> "cuboid_shard_nums" : { },
> "total_shards" : 1,
> "blackout_cuboids" : [ ],
> "binary_signature" : null,
> "dictionaries" : {
>   "DEFAULT.KYLIN_CATEGORY_GROUPINGS/CATEG_LVL2_NAME" : "
> /dict/DEFAULT.KYLIN_CATEGORY_GROUPINGS/CATEG_LVL2_NAME/
> e3bf3894-4355-49c9-915f-0ef8f305f073.dict",
>   "DEFAULT.KYLIN_SALES/REGION" : "/dict/DEFAULT.
> KYLIN_SALES/REGION/a9c75c6f-f6f8-4cb2-94d2-ed30007ada27.dict",
>   "DEFAULT.KYLIN_SALES/USER_ID" : "/dict/DEFAULT.
> KYLIN_SALES/USER_ID/ca95744f-4428-40de-b66f-c9b7487395a5.dict",
>   "DEFAULT.KYLIN_SALES/LEAF_CATEG_ID" : "/dict/DEFAULT.KYLIN_CATEGORY_
> GROUPINGS/LEAF_CATEG_ID/b63352b8-f71f-4810-a2bb-aced5bea7e69.dict",
>   "DEFAULT.KYLIN_SALES/SELLER_ID" : "/dict/DEFAULT.
> KYLIN_SALES/SELLER_ID/cc240f80-0718-4e0d-b13c-dae70c8d09cf.dict",
>   "DEFAULT.KYLIN_CATEGORY_GROUPINGS/META_CATEG_NAME" : "
> /dict/DEFAULT.KYLIN_CATEGORY_GROUPINGS/META_CATEG_NAME/
> f6080fa6-0fde-4683-862c-08f63f4dfbe3.dict",
>   "DEFAULT.KYLIN_CATEGORY_GROUPINGS/CATEG_LVL3_NAME" : "
> /dict/DEFAULT.KYLIN_CATEGORY_GROUPINGS/CATEG_LVL3_NAME/
> 05f8e3e3-c422-4659-b80f-64019f21e2f8.dict",
>   "DEFAULT.KYLIN_SALES/LSTG_SITE_ID" : "/dict/DEFAULT.KYLIN_CATEGORY_
> GROUPINGS/SITE_ID/9f7cfc71-ccf7-4e20-8d87-85e7146a0a64.dict"
> },
> "snapshots" : {
>   "DEFAULT.KYLIN_CAL_DT" : "/table_snapshot/DEFAULT.
> KYLIN_CAL_DT/a291282d-6022-4410-8d22-352ae6741107.snapshot",
>   "DEFAULT.KYLIN_CATEGORY_GROUPINGS" : "/table_snapshot/
> DEFAULT.KYLIN_CATEGORY_GROUPINGS/b3875fd4-44a5-4100-
> bf08-7d54e307cf8b.snapshot"
> },
> "rowkey_stats" : [ [ "LEAF_CATEG_ID", 134, 1 ], [ "
> META_CATEG_NAME", 44, 1 ], [ "CATEG_LVL2_NAME", 94, 1 ], [ "
> CATEG_LVL3_NAME", 127, 1 ], [ "USER_ID", 3, 1 ], [ "REGION",
>  3, 1 ], [ "LSTG_SITE_ID", 8, 1 ], [ "SELLER_ID", 996, 2 ] ]
>   } ],
>   "create_time_utc" : 0,
>   "size_kb" : 5744,
>   "input_records_count" : 4957,
>   "input_records_size" : 104480
> }
> --
> luoyong...@yaochufa.com
>
>
> *From:* luoyong...@yaochufa.com
> *Sent:* 2017-09-28 12:38
> *To:* user <user@kylin.apache.org>
> *Subject:* Re: Re: Segment overlap when merge
> How do I remove that segment and restore the metadata?
>
> --
> luoyong...@yaochufa.com
>
>
> *From:* Billy Liu <billy...@apache.org>
> *Date:* 2017-09-28 12:36
> *To:* user <user@kylin.apache.org>
> *Subject:* Re: Segment overlap when merge
> The metadata has the wrong segments already. If you dump out the metadata,
> should figure out one of that segments is invalid. Remove that segment and
> restore the metadata.
>
> 2017-09-28 11:59 GMT+08:00 luoyong...@yaochufa.com <
> luoyong...@yaochufa.com>:
>
>>
>> Hi:
>>   kylin causes an exception(Segment overlap)  when merge !
>>   The segment has two 2017020800_2017020900 segment(in the
>> picture),it can not merge (cause Segment overlap)!
>>Is this a bug?
>>How to solve !
>>
>>
>> --
>> luoyong...@yaochufa.com
>>
>
>


Re: Segment overlap when merge

2017-09-27 Thread Billy Liu
The metadata already has the wrong segments. If you dump out the metadata,
you should find that one of the segments is invalid. Remove that segment and
restore the metadata.
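Spotting the invalid segment in the dump can be automated: two segments sharing the same name (as in the report below) is the "Segment overlap" situation. A minimal sketch, assuming the cube descriptor has already been loaded into a dict:

```python
from collections import Counter

def duplicate_segments(cube):
    """Return segment names that appear more than once in a cube's
    metadata -- the 'Segment overlap' situation described here."""
    counts = Counter(s["name"] for s in cube.get("segments", []))
    return [name for name, n in counts.items() if n > 1]

# Toy metadata mirroring the duplicated 2017020800_2017020900 segment below.
cube = {"segments": [{"name": "2017020800_2017020900"},
                     {"name": "2017020800_2017020900"},
                     {"name": "2017020900_2017021000"}]}
print(duplicate_segments(cube))  # ['2017020800_2017020900']
```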

2017-09-28 11:59 GMT+08:00 luoyong...@yaochufa.com 
:

>
> Hi:
>   Kylin throws an exception (Segment overlap) when merging.
>   The cube has two 2017020800_2017020900 segments (see the
> picture); it cannot merge, causing the Segment overlap.
>Is this a bug?
>How can it be solved?
>
>
> --
> luoyong...@yaochufa.com
>


Re: Kylin and SuperSet

2017-09-12 Thread Billy Liu
Kylin is driven by the community. If someone contributes a proposal or patch
for this task, the new feature could be ready soon. The community would be
excited to have this integration, but there has been no actual effort yet. I
have seen some prototypes of integration between Apache Kylin and SuperSet in
some customers' cases, but it is not available in the Kylin repo right now.

2017-09-13 7:10 GMT+08:00 Alberto Ramón :

> Hi
>
> Will be an official support of Apache Kylin on Apache SuperSet?
>
>
>


Re: Query across different MODEL/CUBES

2017-09-06 Thread Billy Liu
It sounds like a HINT in SQL.

2017-09-06 17:51 GMT+08:00 Yuxiang Mai :

> Hi all,
>
> We have some questions about queries across different models/cubes. We know
> that Kylin will evaluate the cost and select the best cube for a query. In
> our usage, this worked very well. But if we add more models with filters,
> the problem appears.
>
> Here is our scenario, 2 model comes from the same fact table: table_A.
>
> Model test1 without any filter.
> cube1 with dimension A,B,C,D,E
>
> Model test2 with filter xxx = yyy.
> cube2 with dimension A
>
>
> If we query select count(1) from table_A; Kylin engine will route the
> query to cube2. But our target will be cube1.
>
> I wonder if we can specify the model when we make the query? Because if
> someone by mistake creates a cube with overlapping dimensions in a new model
> with filter conditions, it will impact others. In our usage, we temporarily
> limit each project to one model.
>
> So, im sum, my question is:
> I wonder if we can specify the model when we make the query?
>
>
> Thanks.
>
>
> --
> Yuxiang Mai
>
>


Re: different mr config for different project or cube

2017-09-05 Thread Billy Liu
You could override the default MR config on the project level or cube level
through the GUI, not the file.

2017-09-06 10:11 GMT+08:00 yu feng :

> I remember Kylin supports using a different MR config file for different
> projects, like KYLIN-1706 and KYLIN-1706. However, I do not
> know how to use it in kylin-2.0.0.
>
> It would be appreciated if anyone could show me how to do it. Thanks a lot.
>


Re: Can KYLIN support select if ... from SQL?

2017-09-01 Thread Billy Liu
The drawback is the performance.

Kylin is a pre-calculation engine, which needs the dimensions and
measures defined before queries. In your case, the count distinct statement
will be evaluated at query run time, not served from the pre-calculated
cuboid directly.
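The view-column suggestion that follows in the quoted reply can be illustrated with toy data (all names and rows below are invented for the example): pre-computing `if(is_new = 1, item_id, NULL)` as its own column gives exactly the same distinct counts, but as a plain column Kylin can build a count-distinct measure on.

```python
# Toy rows: (shop, is_new, item_id) -- hypothetical data for illustration.
rows = [("a", 1, 101), ("a", 1, 101), ("a", 0, 102),
        ("b", 1, 103), ("b", 0, 104), ("b", 1, 105)]

# Query-time form: count(distinct if(is_new = 1, item_id, NULL)) per shop.
def distinct_if(rows, shop):
    return len({item for s, is_new, item in rows if s == shop and is_new == 1})

# Pre-computed form: a Hive view adds new_item_id = if(is_new = 1, item_id, NULL)
# up front, so the distinct count is over a plain column.
derived = [(s, item if is_new == 1 else None) for s, is_new, item in rows]

def distinct_derived(derived, shop):
    return len({item for s, item in derived if s == shop and item is not None})

for shop in ("a", "b"):
    assert distinct_if(rows, shop) == distinct_derived(derived, shop)
print(distinct_if(rows, "a"), distinct_if(rows, "b"))  # 1 2
```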

2017-09-01 23:41 GMT+08:00 Yuxiang Mai <yuxiang@gmail.com>:

> Thanks..
>
> We will try it.
>
> BTW, any drawbacks if we directly use in a where condition?
>
>
>
> On Fri, Sep 1, 2017 at 11:37 PM, Billy Liu <billy...@apache.org> wrote:
>
>> The suggested way is to define new column for the if statement, for
>> example, in a Hive view.
>>
>> 2017-09-01 23:08 GMT+08:00 Yuxiang Mai <yuxiang@gmail.com>:
>>
>>> Hi, experts
>>>
>>> We are using KYLIN for a Hive table with 3 columns with binary values: 1
>>> for true, 0 for false. For example: is_new, is_recommend, is_discount.
>>> We wonder if KYLIN the select if .. from ... like the following:
>>>
>>> hive> select shop, count(distinct if(is_new =1, item_id, NULL)) from
>>> table where dt='xxx' group by shop;
>>>
>>> or are we mandatory to use where condition is_new=1 ?
>>>
>>>
>>> Thanks for your reply.
>>>
>>>
>>> --
>>> Yuxiang Mai
>>>
>>>
>>
>
>
> --
> Yuxiang Mai
> Sun Yat-Sen Unitversity
> State Key Lab of Optoelectronic
> Materials and Technologies
>


Re: Re: where OPS_REGION=lower('Shanghai') ERROR

2017-08-31 Thread Billy Liu
Could you file a JIRA to track this issue?

2017-08-24 11:44 GMT+08:00 程 万胜 <chivise...@hotmail.com>:

> it's fine if no LOWER function used!
>
> --
> *From:* Billy Liu <billy...@apache.org>
> *Sent:* 2017-08-24 10:44
> *To:* user
> *Subject:* Re: where OPS_REGION=lower('Shanghai') ERROR
>
> What happened if no LOWER function used?
>
> 2017-08-21 11:05 GMT+08:00 程 万胜 <chivise...@hotmail.com>:
>
>> hello all:
>>
>>
>> error log:
>>
>>
>> 2017-08-21 10:26:32,624 INFO  [Query 4bb70ccb-68e7-4b11-965f-e3d8e8bfd5d9-82]
>> service.QueryService:366 : Using project: learn_kylin
>> 2017-08-21 10:26:32,625 INFO  [Query 4bb70ccb-68e7-4b11-965f-e3d8e8bfd5d9-82]
>> service.QueryService:367 : The original query:  select * from KYLIN_SALES
>> where OPS_REGION=lower('Shanghai')
>> 2017-08-21 10:26:32,630 INFO  [Query 4bb70ccb-68e7-4b11-965f-e3d8e8bfd5d9-82]
>> service.QueryService:493 : The corrected query: select * from KYLIN_SALES
>> where OPS_REGION=lower('Shanghai')
>> LIMIT 5
>> 2017-08-21 10:26:32,632 INFO  [Query 4bb70ccb-68e7-4b11-965f-e3d8e8bfd5d9-82]
>> schema.OLAPSchemaFactory:116 : Schema json:{
>> "version": "1.0",
>> "defaultSchema": "DEFAULT",
>> "schemas": [
>> {
>> "type": "custom",
>> "name": "DEFAULT",
>> "factory": "org.apache.kylin.query.schema.OLAPSchemaFactory",
>> "operand": {
>> "project": "LEARN_KYLIN"
>> },
>> "functions": [
>>{
>>name: 'PERCENTILE',
>>className: 'org.apache.kylin.measure.perc
>> entile.PercentileAggFunc'
>>},
>>{
>>name: 'INTERSECT_COUNT',
>>className: 'org.apache.kylin.measure.bitm
>> ap.BitmapIntersectDistinctCountAggFunc'
>>},
>>{
>>name: 'MASSIN',
>>className: 'org.apache.kylin.query.udf.MassInUDF'
>>},
>>{
>>name: 'CONCAT',
>>className: 'org.apache.kylin.query.udf.ConcatUDF'
>>},
>>{
>>name: 'VERSION',
>>className: 'org.apache.kylin.query.udf.VersionUDF'
>>}
>> ]
>> }
>> ]
>> }
>> 2017-08-21 10:26:32,685 INFO  [Query 4bb70ccb-68e7-4b11-965f-e3d8e8bfd5d9-82]
>> routing.QueryRouter:56 : Find candidates by table DEFAULT.KYLIN_SALES and
>> project=LEARN_KYLIN : CUBE[name=kylin_sales_cube]
>> 2017-08-21 10:26:32,686 INFO  [Query 4bb70ccb-68e7-4b11-965f-e3d8e8bfd5d9-82]
>> routing.QueryRouter:51 : Applying rule: class org.apache.kylin.query.routing
>> .rules.RemoveBlackoutRealizationsRule, realizations before:
>> [kylin_sales_cube(CUBE)], realizations after: [kylin_sales_cube(CUBE)]
>> 2017-08-21 10:26:32,686 INFO  [Query 4bb70ccb-68e7-4b11-965f-e3d8e8bfd5d9-82]
>> routing.QueryRouter:51 : Applying rule: class org.apache.kylin.query.routing
>> .rules.RemoveUncapableRealizationsRule, realizations before:
>> [kylin_sales_cube(CUBE)], realizations after: [kylin_sales_cube(CUBE)]
>> 2017-08-21 10:26:32,686 INFO  [Query 4bb70ccb-68e7-4b11-965f-e3d8e8bfd5d9-82]
>> rules.RealizationSortRule:40 : CUBE[name=kylin_sales_cube] priority 1 cost
>> 83600.
>> 2017-08-21 10:26:32,686 INFO  [Query 4bb70ccb-68e7-4b11-965f-e3d8e8bfd5d9-82]
>> routing.QueryRouter:51 : Applying rule: class 
>> org.apache.kylin.query.routing.rules.RealizationSortRule,
>> realizations before: [kylin_sales_cube(CUBE)], realizations after:
>> [kylin_sales_cube(CUBE)]
>> 2017-08-21 10:26:32,687 INFO  [Query 4bb70ccb-68e7-4b11-965f-e3d8e8bfd5d9-82]
>> routing.QueryRouter:68 : The realizations remaining:
>> [kylin_sales_cube(CUBE)] And the final chosen one is the first one
>> 2017-08-21 10:26:32,714 DEBUG [Query 4bb70ccb-68e7-4b11-965f-e3d8e8bfd5d9-82]
>> enumerator.OLAPEnumerator:109 : query storage...
>> 2017-08-21 10:26:32,714 INFO  [Query 4bb70ccb-68e7-4b11-965f-e3d8e8bfd5d9-82]
>> cube.RawQueryLastHacker:42 : No group by and aggregation found in this
>> query, will hack some result for better look of output...
>> 2017-08-21 10:26:32,714 WARN  [Query 4bb70ccb-68e7-4b11-965f-e3d8e8bfd5d9-82]
>> cube.RawQueryLastHacker:

Re: count_distinct_if, count_if and sum_if

2017-08-28 Thread Billy Liu
This kinds of contribution is very welcomed. Could you send your issue to
dev mailer, the community will help you.

2017-08-28 21:40 GMT+08:00 Alexander Sterligov :

> Hi!
>
> It would be nice to have if-functions in kylin. What do you think?
>
> I've already implemented count_distinct_if, but I have problems adding new
> functions for BasicMeasureType.
>
> Best regards,
> Alexander Sterligov
>


Re: where OPS_REGION=lower('Shanghai') ERROR

2017-08-23 Thread Billy Liu
What happened if no LOWER function used?

2017-08-21 11:05 GMT+08:00 程 万胜 :

> hello all:
>
>
> error log:
>
>
> 2017-08-21 10:26:32,624 INFO  [Query 4bb70ccb-68e7-4b11-965f-e3d8e8bfd5d9-82]
> service.QueryService:366 : Using project: learn_kylin
> 2017-08-21 10:26:32,625 INFO  [Query 4bb70ccb-68e7-4b11-965f-e3d8e8bfd5d9-82]
> service.QueryService:367 : The original query:  select * from KYLIN_SALES
> where OPS_REGION=lower('Shanghai')
> 2017-08-21 10:26:32,630 INFO  [Query 4bb70ccb-68e7-4b11-965f-e3d8e8bfd5d9-82]
> service.QueryService:493 : The corrected query: select * from KYLIN_SALES
> where OPS_REGION=lower('Shanghai')
> LIMIT 5
> 2017-08-21 10:26:32,632 INFO  [Query 4bb70ccb-68e7-4b11-965f-e3d8e8bfd5d9-82]
> schema.OLAPSchemaFactory:116 : Schema json:{
> "version": "1.0",
> "defaultSchema": "DEFAULT",
> "schemas": [
> {
> "type": "custom",
> "name": "DEFAULT",
> "factory": "org.apache.kylin.query.schema.OLAPSchemaFactory",
> "operand": {
> "project": "LEARN_KYLIN"
> },
> "functions": [
>{
>name: 'PERCENTILE',
>className: 'org.apache.kylin.measure.
> percentile.PercentileAggFunc'
>},
>{
>name: 'INTERSECT_COUNT',
>className: 'org.apache.kylin.measure.bitmap.
> BitmapIntersectDistinctCountAggFunc'
>},
>{
>name: 'MASSIN',
>className: 'org.apache.kylin.query.udf.MassInUDF'
>},
>{
>name: 'CONCAT',
>className: 'org.apache.kylin.query.udf.ConcatUDF'
>},
>{
>name: 'VERSION',
>className: 'org.apache.kylin.query.udf.VersionUDF'
>}
> ]
> }
> ]
> }
> 2017-08-21 10:26:32,685 INFO  [Query 4bb70ccb-68e7-4b11-965f-e3d8e8bfd5d9-82]
> routing.QueryRouter:56 : Find candidates by table DEFAULT.KYLIN_SALES and
> project=LEARN_KYLIN : CUBE[name=kylin_sales_cube]
> 2017-08-21 10:26:32,686 INFO  [Query 4bb70ccb-68e7-4b11-965f-e3d8e8bfd5d9-82]
> routing.QueryRouter:51 : Applying rule: class org.apache.kylin.query.
> routing.rules.RemoveBlackoutRealizationsRule, realizations before:
> [kylin_sales_cube(CUBE)], realizations after: [kylin_sales_cube(CUBE)]
> 2017-08-21 10:26:32,686 INFO  [Query 4bb70ccb-68e7-4b11-965f-e3d8e8bfd5d9-82]
> routing.QueryRouter:51 : Applying rule: class org.apache.kylin.query.
> routing.rules.RemoveUncapableRealizationsRule, realizations before:
> [kylin_sales_cube(CUBE)], realizations after: [kylin_sales_cube(CUBE)]
> 2017-08-21 10:26:32,686 INFO  [Query 4bb70ccb-68e7-4b11-965f-e3d8e8bfd5d9-82]
> rules.RealizationSortRule:40 : CUBE[name=kylin_sales_cube] priority 1 cost
> 83600.
> 2017-08-21 10:26:32,686 INFO  [Query 4bb70ccb-68e7-4b11-965f-e3d8e8bfd5d9-82]
> routing.QueryRouter:51 : Applying rule: class org.apache.kylin.query.
> routing.rules.RealizationSortRule, realizations before:
> [kylin_sales_cube(CUBE)], realizations after: [kylin_sales_cube(CUBE)]
> 2017-08-21 10:26:32,687 INFO  [Query 4bb70ccb-68e7-4b11-965f-e3d8e8bfd5d9-82]
> routing.QueryRouter:68 : The realizations remaining:
> [kylin_sales_cube(CUBE)] And the final chosen one is the first one
> 2017-08-21 10:26:32,714 DEBUG [Query 4bb70ccb-68e7-4b11-965f-e3d8e8bfd5d9-82]
> enumerator.OLAPEnumerator:109 : query storage...
> 2017-08-21 10:26:32,714 INFO  [Query 4bb70ccb-68e7-4b11-965f-e3d8e8bfd5d9-82]
> cube.RawQueryLastHacker:42 : No group by and aggregation found in this
> query, will hack some result for better look of output...
> 2017-08-21 10:26:32,714 WARN  [Query 4bb70ccb-68e7-4b11-965f-e3d8e8bfd5d9-82]
> cube.RawQueryLastHacker:73 : SUM is not defined for measure column
> SELLER_ACCOUNT:DEFAULT.KYLIN_ACCOUNT.ACCOUNT_ID, output will be
> meaningless.
> 2017-08-21 10:26:32,715 WARN  [Query 4bb70ccb-68e7-4b11-965f-e3d8e8bfd5d9-82]
> cube.RawQueryLastHacker:73 : SUM is not defined for measure column
> BUYER_ACCOUNT:DEFAULT.KYLIN_ACCOUNT.ACCOUNT_ID, output will be
> meaningless.
> 2017-08-21 10:26:32,715 WARN  [Query 4bb70ccb-68e7-4b11-965f-e3d8e8bfd5d9-82]
> cube.RawQueryLastHacker:73 : SUM is not defined for measure column
> DEFAULT.KYLIN_CATEGORY_GROUPINGS.LEAF_CATEG_ID, output will be
> meaningless.
> 2017-08-21 10:26:32,715 WARN  [Query 4bb70ccb-68e7-4b11-965f-e3d8e8bfd5d9-82]
> cube.RawQueryLastHacker:73 : SUM is not defined for measure column
> DEFAULT.KYLIN_CATEGORY_GROUPINGS.USER_DEFINED_FIELD3, output will be
> meaningless.
> 2017-08-21 10:26:32,715 WARN  [Query 4bb70ccb-68e7-4b11-965f-e3d8e8bfd5d9-82]
> cube.RawQueryLastHacker:73 : SUM is not defined for measure column
> DEFAULT.KYLIN_CATEGORY_GROUPINGS.USER_DEFINED_FIELD1, output will be
> meaningless.
> 2017-08-21 10:26:32,716 WARN  [Query 
