from:"ShaoFeng Shi"

Re: org.apache.hadoop.hive.ql.metadata.HiveException

2016-01-04 Thread ShaoFeng Shi

Hi 和风, screenshots are not search-engine friendly; please use text as much as
possible.

2016-01-05 14:34 GMT+08:00 hongbin ma <mahong...@apache.org>:

> can't see attachment. please provide detailed log
>
>
> --
> Regards,
>
> *Bin Mahone | 马洪宾*
> Apache Kylin: http://kylin.io
> Github: https://github.com/binmahone
>



-- 
Best regards,

Shaofeng Shi

Re: error when insert data @@@@@create new cube

2015-12-31 Thread ShaoFeng Shi




-- 
Best regards,

Shaofeng Shi

Re: Select * not returning any rows

2015-12-31 Thread ShaoFeng Shi

This is expected: so far, the results stored in a Kylin cube are aggregated.
There is a plan to support raw-record queries; please check:
https://issues.apache.org/jira/browse/KYLIN-1122
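
To illustrate why 'select *' against an aggregated cube can return fewer rows than the source table, here is a minimal Python sketch. It is not Kylin code; the rows and column values are hypothetical, and it only assumes that identical combinations of dimension values collapse into one stored row.

```python
from collections import Counter

# Hypothetical raw rows (dimension columns only); not taken from the thread.
raw_rows = [
    ("2015-12-01", "US"),
    ("2015-12-01", "US"),
    ("2015-12-01", "CN"),
    ("2015-12-02", "CN"),
    ("2015-12-02", "CN"),
]


def aggregate_by_all_dimensions(rows):
    """Collapse identical dimension tuples, keeping a row count per tuple."""
    counts = Counter(rows)
    return [dims + (n,) for dims, n in sorted(counts.items())]


cube_rows = aggregate_by_all_dimensions(raw_rows)
for row in cube_rows:
    print(row)
```

Five raw rows become three aggregated rows here, which matches the behavior reported below: repeated rows are grouped over all the columns in the cube.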

2015-12-31 15:35 GMT+08:00 Kiriti Sai <kiriti163.i...@gmail.com>:

> Hi,
> Thank you, Hongbin Ma, for the suggestion to upgrade.
> I've just changed the version of Kylin to 1.2, and I'm now able to see some
> results when I run 'select *' on the tables. But the number of rows
> returned is not what I expected.
> In my case, I have 100 rows in a table, some of which might be repeated.
> But when I use select *, it seems to perform a group by over all the
> columns in the cube and returns very few rows.
> Is this behavior intended or a bug?
> If it is a bug, please let me know if there is a way to correct it.
>
> Thank you guys for responding quickly on new year's eve.
> On Dec 31, 2015 4:04 PM, "Shi, Shaofeng" <shao...@ebay.com> wrote:
>
> > This "howto_upgrade" guide is specifically for the v0.6 to v0.7 upgrade,
> > not for other versions.
> >
> > From v0.7 to v1.2, the metadata is compatible. All a user needs to do is
> > back up and restore the $KYLIN_HOME/conf folder after switching to a new
> > Kylin binary. All the cube metadata and cube data are in HBase, so there
> > is no need to re-create or rebuild.
> >
> > On 12/31/15, 2:49 PM, "250635...@qq.com" <250635...@qq.com> wrote:
> >
> > >Have you checked out this one ?
> > >http://kylin.apache.org/docs/howto/howto_upgrade.html
> > >
> > >
> > >
> > >
> > >250635...@qq.com
> > >
> > >From: Kiriti Sai
> > >Date: 2015-12-31 14:36
> > >To: dev
> > >Subject: Re: Select * not returning any rows
> > >Since I'm just working with binaries, can you please explain how to
> > >upgrade from v1.1 to v1.2?
> > >Should I just extract and replace the whole folder or should I backup and
> > >restore the data also in some way?
> > >
> > >Thank you for the immediate response. :)
> > >On Dec 31, 2015 3:25 PM, "hongbin ma" <mahong...@apache.org> wrote:
> > >
> > >> i believe this issue has been fixed in v1.2, why not use the latest
> > >> version?
> > >>
> > >>
> > >> --
> > >> Regards,
> > >>
> > >> *Bin Mahone | 马洪宾*
> > >> Apache Kylin: http://kylin.io
> > >> Github: https://github.com/binmahone
> > >>
> >
> >
>



-- 
Best regards,

Shaofeng Shi

Re: org.apache.hadoop.hive.ql.metadata.HiveException

2016-01-05 Thread ShaoFeng Shi

50)
> ... 21 more
> Caused by: java.lang.reflect.InvocationTargetException
> at sun.reflect.NativeConstructorAccessorImpl.newInstance0(Native
> Method)
> at
> sun.reflect.NativeConstructorAccessorImpl.newInstance(NativeConstructorAccessorImpl.java:57)
> at
> sun.reflect.DelegatingConstructorAccessorImpl.newInstance(DelegatingConstructorAccessorImpl.java:45)
> at java.lang.reflect.Constructor.newInstance(Constructor.java:526)
> at
> org.apache.hadoop.hive.metastore.MetaStoreUtils.newInstance(MetaStoreUtils.java:1521)
> ... 31 more
> Caused by: java.lang.NoClassDefFoundError:
> javax/jdo/JDOObjectNotFoundException
> at java.lang.Class.forName0(Native Method)
> at java.lang.Class.forName(Class.java:274)
> at
> org.apache.hadoop.hive.metastore.MetaStoreUtils.getClass(MetaStoreUtils.java:1489)
> at
> org.apache.hadoop.hive.metastore.RawStoreProxy.getProxy(RawStoreProxy.java:63)
> at
> org.apache.hadoop.hive.metastore.HiveMetaStore$HMSHandler.newRawStore(HiveMetaStore.java:593)
> at
> org.apache.hadoop.hive.metastore.HiveMetaStore$HMSHandler.getMS(HiveMetaStore.java:571)
> at
> org.apache.hadoop.hive.metastore.HiveMetaStore$HMSHandler.createDefaultDB(HiveMetaStore.java:620)
> at
> org.apache.hadoop.hive.metastore.HiveMetaStore$HMSHandler.init(HiveMetaStore.java:461)
> at
> org.apache.hadoop.hive.metastore.RetryingHMSHandler.<init>(RetryingHMSHandler.java:66)
> at
> org.apache.hadoop.hive.metastore.RetryingHMSHandler.getProxy(RetryingHMSHandler.java:72)
> at
> org.apache.hadoop.hive.metastore.HiveMetaStore.newRetryingHMSHandler(HiveMetaStore.java:5762)
> at
> org.apache.hadoop.hive.metastore.HiveMetaStoreClient.<init>(HiveMetaStoreClient.java:199)
> at
> org.apache.hadoop.hive.metastore.HiveMetaStoreClient.<init>(HiveMetaStoreClient.java:181)
> at
> org.apache.hive.hcatalog.common.HiveClientCache$CacheableHiveMetaStoreClient.<init>(HiveClientCache.java:330)
> ... 36 more
> Caused by: java.lang.ClassNotFoundException:
> javax.jdo.JDOObjectNotFoundException
> at java.net.URLClassLoader$1.run(URLClassLoader.java:366)
> at java.net.URLClassLoader$1.run(URLClassLoader.java:355)
> at java.security.AccessController.doPrivileged(Native Method)
> at java.net.URLClassLoader.findClass(URLClassLoader.java:354)
> at java.lang.ClassLoader.loadClass(ClassLoader.java:425)
> at sun.misc.Launcher$AppClassLoader.loadClass(Launcher.java:308)
> at java.lang.ClassLoader.loadClass(ClassLoader.java:358)
> ... 50 more
>
>
>
>
>
>
> -- Original Message --
> From: "ShaoFeng Shi";<shaofeng...@apache.org>;
> Sent: Tuesday, January 5, 2016, 3:21 PM
> To: "dev"<dev@kylin.apache.org>;
>
> Subject: Re: org.apache.hadoop.hive.ql.metadata.HiveException
>
>
>
> Hi hefeng,
>
> It seems the hcatalog jars don't exist on your Hadoop nodes. The solution
> is to upload the jar files to an HDFS folder, and then set that path as
> the value of "kylin.job.mr.lib.dir" in kylin.properties; you can check
> this JIRA: https://issues.apache.org/jira/browse/KYLIN-1021
>
> In our env, the "kylin.job.mr.lib.dir" folder has the following 4 jar
> files, just for your reference:
> hive-common-xx.jar
> hive-exec-xx.jar
> hive-hcatalog-core-xx.jar
> hive-metastore-xx.jar
>
> Here "xx" means the version number.
>
> Just give it a try and let us know whether it works.
>
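
As a rough aid for the advice quoted above, the following Python sketch checks a directory listing against the four jar names mentioned in the message. It is only an illustration: the helper name, the sample listing and the version numbers are hypothetical, and obtaining the listing from HDFS is left out.

```python
# Jar name prefixes taken from the four files listed in the message above.
REQUIRED_JAR_PREFIXES = (
    "hive-common-",
    "hive-exec-",
    "hive-hcatalog-core-",
    "hive-metastore-",
)


def missing_jars(filenames):
    """Return the required prefixes that no .jar file in the listing matches."""
    return [
        prefix
        for prefix in REQUIRED_JAR_PREFIXES
        if not any(f.startswith(prefix) and f.endswith(".jar") for f in filenames)
    ]


# Hypothetical listing of the folder configured as kylin.job.mr.lib.dir.
listing = [
    "hive-common-1.2.1.jar",
    "hive-exec-1.2.1.jar",
    "hive-metastore-1.2.1.jar",
]

print(missing_jars(listing))
```

With this sample listing the check reports that the hive-hcatalog-core jar is absent, which is the kind of gap that could lead to class-loading errors in a job.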
> 2016-01-05 14:47 GMT+08:00 和风 <363938...@qq.com>:
>
> > Thanks for your help. error logs:
> > [pool-7-thread-1]:[2016-01-05
> >
> 14:45:31,312][INFO][org.apache.kylin.job.manager.ExecutableManager.updateJobOutput(ExecutableManager.java:241)]
> > - job id:d0e2f259-9541-4b6f-9f54-c502781549e2-00 from RUNNING to SUCCEED
> > [pool-7-thread-1]:[2016-01-05
> >
> 14:45:31,438][INFO][org.apache.kylin.job.manager.ExecutableManager.updateJobOutput(ExecutableManager.java:241)]
> > - job id:d0e2f259-9541-4b6f-9f54-c502781549e2 from RUNNING to READY
> > [pool-6-thread-1]:[2016-01-05
> >
> 14:45:31,483][INFO][org.apache.kylin.job.impl.threadpool.DefaultScheduler$FetcherRunner.run(DefaultScheduler.java:102)]
> > - CubingJob{id=d0e2f259-9541-4b6f-9f54-c502781549e2,
> name=learn_kylin_four
> > - 2015020100_2015122900 - BUILD - GMT-08:00 2016-01-04 22:44:05,
> > state=READY} prepare to schedule
> > [pool-6-thread-1]:[2016-01-05
> >
> 14:45:31,484][INFO][org.apache.kylin.job.impl.threadpool.DefaultScheduler$FetcherRunner.run(DefaultScheduler.java:106)]
> > - CubingJob{id=d0e2f259-9541-4b6f-9f54-c50278154

Re: derived dimension

2016-01-07 Thread ShaoFeng Shi

The FK/PK between fact and lookup table can be found in the data model
descriptor.
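
To make the FK/PK relationship concrete, here is a minimal Python sketch of how a derived dimension could be resolved. It is not Kylin's implementation. The table and column names (KYLIN_CAL_DT, CAL_DT, PART_DT, WEEK_BEG_DT) come from this thread, while the data values and the helper function are hypothetical. The assumption is that the cube stores the fact table's foreign key and the derived column is looked up through the lookup table's primary key at query time.

```python
# Hypothetical snapshot of the lookup table KYLIN_CAL_DT, keyed by its PK (CAL_DT).
cal_dt_lookup = {
    "2016-01-04": {"WEEK_BEG_DT": "2016-01-03"},
    "2016-01-05": {"WEEK_BEG_DT": "2016-01-03"},
    "2016-01-11": {"WEEK_BEG_DT": "2016-01-10"},
}

# Hypothetical aggregated cube rows: (PART_DT foreign key, summed measure).
cube_rows = [("2016-01-04", 10), ("2016-01-05", 5), ("2016-01-11", 7)]


def sum_by_derived(rows, lookup, derived_col):
    """Map each FK to the derived column via the PK lookup, then re-aggregate."""
    totals = {}
    for fk, measure in rows:
        key = lookup[fk][derived_col]
        totals[key] = totals.get(key, 0) + measure
    return totals


weekly = sum_by_derived(cube_rows, cal_dt_lookup, "WEEK_BEG_DT")
print(weekly)
```

The point of the sketch is that only PART_DT needs to be stored per row; WEEK_BEG_DT is determined by it through the FK to PK mapping.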

2016-01-08 3:11 GMT+08:00 Zhang, Zhong <zzh...@cardlytics.com>:

> Hi Hongbin,
>
> For the table "KYLIN_CAL_DT", the primary key is "CAL_DT" and
> the foreign key is "PART_DT".
>
> For the table "KYLIN_CATEGORY_GROUPINGS", the primary
> key is "LEAF_CATEG_ID" and "SITE_ID", and the foreign key
> is "LEAF_CATEG_ID" and "LSTG_SITE_ID".
>
> Best regards,
> Zhong
>
> -Original Message-
> From: Zhang, Zhong [mailto:zzh...@cardlytics.com]
> Sent: Thursday, January 07, 2016 1:53 PM
> To: dev@kylin.apache.org
> Subject: RE: derived dimension
>
> Hi Hongbin,
>
> Thanks so so so ... much for your kind help.
> The following is my understanding based on your excellent explanation:
> Since both the foreign key and "WEEK_BEG_DT" are in the cube and
> "WEEK_BEG_DT" can be derived from the foreign key, we mark the column
> "WEEK_BEG_DT" as a derived dimension in the UI. The same case happens for
> "USER_DEFINED_FIELD1","USER_DEFINED_FIELD3","UPD_DATE" and "UPD_USER"
> columns. Can I ask which column is the foreign key for the table
> "KYLIN_CAL_DT"?
>
> The following is another understanding based on reference link [1]. In [1]
> (page 10), "Dimensions on lookup table that can be derived by PK." It seems
> that the primary Key is the column that other columns are derived from.
> Back to the sample cube example, since both the primary key and
> "WEEK_BEG_DT" are in the cube and "WEEK_BEG_DT" can be derived from the
> primary key, we mark the column "WEEK_BEG_DT" as a derived dimension in the
> UI. I assume the primary key in the table "KYLIN_CAL_DT" is "CAL_DT"?
>
> Please help me verify the above two explanations, thanks a million.
>
> [1]
> http://www.slideshare.net/YangLi43/design-cube-in-apache-kylin?next_slideshow=5
>
> Best regards,
> Zhong
>
> -Original Message-
> From: hongbin ma [mailto:mahong...@apache.org]
> Sent: Thursday, January 07, 2016 7:30 AM
> To: dev@kylin.apache.org
> Subject: Re: derived dimension
>
> If the dimension does not explicitly specify one, the FK is the column it is
> derived from.
>
> On Thu, Jan 7, 2016 at 11:15 AM, Zhang, Zhong <zzh...@cardlytics.com>
> wrote:
>
> > Hi All,
> >
> > I'm confused by the derived dimension. The following two sentences are
> > the source that I found online to guide me use derived dimension. It's
> > kind of unclear to me.
> >
> > Dimensions on lookup table that can be derived by PK.
> > -like User ID derives [Name, Age, Gender] from [1] at page 10
> >
> > Given a value in DimA, the value of DimB is determined, so we say dimB
> > can be derived from DimA. When we build a cube that contains both DimA
> > and DimB, we simple include DimA, and marking DimB as Derived.
> > from [2]
> >
> > Let us use the sample cube "kylin_sales_cube" as the example to
> > discuss it. There are two derived dimensions: CAL_DT and CATEGORY.
> > In CAL_DT, which column derives WEEK_BEG_DT?
> > In CATEGORY, which column derives
> > "USER_DEFINED_FIELD1","USER_DEFINED_FIELD3","UPD_DATE","UPD_USER"?
> >
> > Is derived dimension used only in lookup table?
> >
> > [1]
> > http://www.slideshare.net/YangLi43/design-cube-in-apache-kylin?next_slideshow=5
> > [2]
> > https://mail-archives.apache.org/mod_mbox/incubator-kylin-dev/201507.mbox/%3c82073af5-ae07-4441-bfdf-e8b9d36ff...@163.com%3E
> >
> > Best regards,
> > Zhong
> >
>
>
>
> --
> Regards,
>
> *Bin Mahone | 马洪宾*
> Apache Kylin: http://kylin.io
> Github: https://github.com/binmahone
>



-- 
Best regards,

Shaofeng Shi

Re: Welcome new Apache Kylin committer: Yu Feng

2016-01-08 Thread ShaoFeng Shi

Welcome Feng Yu!

2016-01-09 0:46 GMT+08:00 yu feng <olaptes...@gmail.com>:

> Thanks to all of you. I am very glad to become a committer of Apache Kylin,
> and I thank the community for the recognition.
>
> I work at the Hangzhou Research Institute of Netease (www.163.com), one of
> the earliest Internet companies in China. I have focused on Kylin for the
> past six months and made some changes to adapt it to our environment. I have
> contributed some patches and a new feature to the Kylin community. In
> my spare time, I like to talk with others about Kylin on the mailing list.
> Besides OLAP and Big Data, I'm interested in distributed storage and NoSQL
> systems.
>
> Currently we use Kylin to provide fast and stable OLAP analysis
> services to multiple products in our company. We chose Kylin because it
> has a lot of advantages, including ease of use, low query latency,
> and support for standard SQL. At present, our users are very satisfied with
> the performance of Kylin too.
>
> I am very proud to be a committer of Apache Kylin, and I will always do my
> best to contribute to the Kylin community!
>
>
> 2016-01-08 23:46 GMT+08:00 Dong Li <lid...@apache.org>:
>
> > Welcome!
> >
> > Thanks,
> > Dong Li
> >
> > 2016-01-08 23:28 GMT+08:00 Luke Han <luke...@apache.org>:
> >
> > > I am very pleased to announce that the Project Management Committee
> > > (PMC) of Apache Kylin has asked Yu Feng to become an Apache Kylin
> > > committer, and she has already accepted.
> > >
> > > Yu has already made many contributions to the Kylin community: actively
> > > answering others' questions, submitting patches for bug fixes, and
> > > contributing a great feature for multiple Hive sources from different
> > > clusters.
> > >
> > > Please join me to welcome Yu.
> > >
> > > @Yu, please share with us a little about yourself.
> > >
> > > Luke Han
> > >
> > > On behalf of the Apache Kylin PPMC
> > >
> >
> >
> >
> > --
> > Thanks,
> > Dong
> >
>



-- 
Best regards,

Shaofeng Shi

[RESULT] [VOTE] Release apache-kylin-1.2

2015-12-20 Thread ShaoFeng Shi

This vote passes with 6 +1s and no 0 or -1 votes:

+1 Shaofeng Shi (binding)
+1 Jason Zhong (binding)
+1 Luke Han (binding)
+1 Hua Huang (binding)
+1 Xiaoyu Wang (binding)
+1 Dong Li (non-binding)

Thanks everyone. We’ll now roll the release out to the mirrors.

Shaofeng Shi, on behalf of Apache Kylin PPMC

Re: Welcome new Apache Kylin committer: Luwei Chen

2015-12-30 Thread ShaoFeng Shi

Welcome Luwei!

2015-12-30 21:52 GMT+08:00 hongbin ma <mahong...@apache.org>:

> welcome!
>
> On Wed, Dec 30, 2015 at 9:51 PM, 李栋 <lidong_s...@126.com> wrote:
>
> > Welcome Luwei!
> >
> > Dong Li
> >
> >
> > On 2015-12-30 21:47 , Luke Han Wrote:
> >
> > I am very pleased to announce that the Project Management Committee
> > (PMC) of Apache Kylin has asked Luwei Chen to become an Apache Kylin
> > committer, and she has already accepted.
> >
> > Luwei has already made many contributions to the Kylin community, on the
> > website, documentation, UI and other areas as well.
> >
> > Welcome Luwei, our first female committer :)
> > Please share with us a little about yourself.
> >
> > Luke
> >
> > On behalf of the Apache Kylin PPMC
> >
>
>
>
> --
> Regards,
>
> *Bin Mahone | 马洪宾*
> Apache Kylin: http://kylin.io
> Github: https://github.com/binmahone
>



-- 
Best regards,

Shaofeng Shi

Re: [Draft][REPORT] Apache Kylin - Jun 2016

2016-06-04 Thread ShaoFeng Shi

This is a very comprehensive summary, thanks Luke!

2016-06-05 7:52 GMT+08:00 Li Yang <liy...@apache.org>:

> Covers everything I know!  :-)
>
> On Wed, Jun 1, 2016 at 11:23 PM, Luke Han <luke...@apache.org> wrote:
>
> > Dear community,
> >  I have drafted the board report below for review; please help check it
> > and let me know if there's any issue.
> >  Feel free to reply here if there are more activities, community
> > developments and so on that should be included in this report.
> >
> >  I will submit this report to the board later.
> >
> >  Thanks.
> >
> > Luke
> >
> >
> > ## Description:
> > ===
> > Apache Kylin is an open source Distributed Analytics Engine designed
> > to provide SQL interface and multi-dimensional analysis (OLAP) on
> > Hadoop supporting extremely large datasets.
> >
> > ## Issues:
> > ==
> > - there are no issues requiring board attention at this time
> >
> > ## Activity:
> > ====
> > - Mailing list, JIRA, and commit activity are at or above average
> > - Yang Li presented Kylin's new architecture at Hadoop Summit EU
> > in Dublin on 2016-04-13
> > - Shaofeng Shi presented Kylin deployment practices at ITA2014
> > Big Data Event in Beijing
> > on 2016-04-22
> > - Apache Kylin meetup Beijing hosted on 2016-04-23 in Beijing,
> > engaged more than 200 participants, with 6 sessions from Luke Han,
> > Xiaoyu Wang, Yerui Sun, Dong Wang, Lei Zhao and Shaofeng Shi.
> > - Luke Han presented Kylin at Apache Big Data 2016 NA in Vancouver
> > on 2016-05-09
> > - Luke Han presented Kylin community practices at ApacheCon 2016
> > NA in Vancouver on 2016-05-13
> > - Hongbin Ma presented performance topic at HBaseCon 2016 in San
> > Francisco on 2016-05-24
> >
> > ## Community:
> > =
> > - 1 committer and 1 PMC member appointed after last report.
> > - Messages on the dev mailing list after last report: 1234
> > - Messages on the user mailing list after last report: 335
> > - 365 JIRA tickets created after last report
> > - 510 JIRA tickets closed/resolved after last report
> >
> > ## Releases:
> > 
> > - The next generation of Kylin, v1.5.0, released on 2016-03-12
> > - v1.5.1, released on 2016-04-13
> > - The latest release, v1.5.2, released on 2016-05-26
> >
>



-- 
Best regards,

Shaofeng Shi

Re: [Announce] Apache Kylin 1.5.2.1 released

2016-06-08 Thread ShaoFeng Shi

I checked other projects' announcement emails; you're right, ours lacks a
brief introduction to the project. We will add that next time. Thanks for the
suggestion!

2016-06-08 18:07 GMT+08:00 sebb <seb...@gmail.com>:

> What is the project about? Why should I be interested in it?
>
> The Announce emails are sent to people not on the developer or user lists.
> Most will have no idea what the project is about.
>
> So the e-mails should contain at least brief details of what the
> product does, and some info on why the new release might be of
> interest to them.
>
> Readers should not have to click the link to find out the basic information
> (although of course it is useful to have such links for further detail).
>
> Please can you add that information to future announce mails?
>
> Thanks.
>
>
> On 8 June 2016 at 03:09, ShaoFeng Shi <shaofeng...@apache.org> wrote:
> > The Apache Kylin team is pleased to announce the immediate availability
> of
> > the 1.5.2.1 release. The release note can be found here [1]; The source
> > code and binary package can be downloaded from Kylin's download page [2].
> >
> > The Apache Kylin Team would like to hear from you and welcomes your
> > comments and contributions.
> >
> > Thanks,
> > The Apache Kylin Team
> >
> > [1] https://kylin.apache.org/docs15/release_notes.html
> > [2] https://kylin.apache.org/download/
> >
> > --
> > Best regards,
> >
> > Shaofeng Shi
> >
>



-- 
Best regards,

Shaofeng Shi

Re: question to building cube from kafka

2016-06-07 Thread ShaoFeng Shi

Hi Jie, it needs the JSON data to be flat (no embedded attributes).
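
One possible workaround, sketched here in Python, is to flatten nested messages before they are written to Kafka. This is not a Kylin feature; the function name, the "_" separator and the value of "fa" are assumptions made for illustration, with the keys borrowed from the example in the question below.

```python
import json


def flatten(obj, prefix="", sep="_"):
    """Flatten nested dicts into one level, joining key paths with sep."""
    flat = {}
    for key, value in obj.items():
        name = f"{prefix}{sep}{key}" if prefix else key
        if isinstance(value, dict):
            flat.update(flatten(value, name, sep))
        else:
            flat[name] = value
    return flat


# Nested message shaped like the one in the question; the value of "fa"
# is hypothetical because it is not given in the thread.
nested = {"foo": {"attr1": 70, "att2": "blabla"}, "fa": 1}

flat_message = json.dumps(flatten(nested), sort_keys=True)
print(flat_message)
```

Each nested attribute then becomes a top-level field such as foo_attr1, which could be mapped to a table column. Lists and name collisions are not handled in this sketch.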

2016-06-07 14:56 GMT+08:00 Jie Tao <jie@gameforge.com>:

> It is a nice feature to build cube directly from kafka. From the example
> on your docs I see that the table schema is extracted from the input JSON.
> The question is: do your support recursive JSON structure, i.e., a JSON
> attribute is an object containing other attributes? Like:
>
> {
>   "foo": {
>     "attr1": 70,
>     "att2": "blabla"
>   },
>   "fa":
> }
>
> Cheers,
>
> Jie
>



-- 
Best regards,

Shaofeng Shi

[Announce] Apache Kylin 1.5.2.1 released

2016-06-07 Thread ShaoFeng Shi

The Apache Kylin team is pleased to announce the immediate availability of
the 1.5.2.1 release. The release note can be found here [1]; The source
code and binary package can be downloaded from Kylin's download page [2].

The Apache Kylin Team would like to hear from you and welcomes your
comments and contributions.

Thanks,
The Apache Kylin Team

[1] https://kylin.apache.org/docs15/release_notes.html
[2] https://kylin.apache.org/download/

-- 
Best regards,

Shaofeng Shi

Re: 答复: Timeout visiting cube!

2016-06-07 Thread ShaoFeng Shi

-06-08 09:10
> To: gaolv123...@163.com
> Subject: Reply: Timeout visiting cube!
>
> I'm not a Kylin developer, just a user like you; I have run into this problem before.
>
> You can try adding the following parameter to the kylin.profile configuration file; the default value is 1, and I changed it to 3 on my side.
>
> kylin.query.cube.visit.timeout.times=3
> #default is 1
>
> -Original Message-
> From: gaolv123...@163.com [mailto:gaolv123...@163.com]
> Sent: June 7, 2016 20:39
> To: dev <dev@kylin.apache.org>
> Subject: Timeout visiting cube!
>
>
> Hello:
> Every time I finish building a cube, I have to run "Update HBase Coprocessor";
> otherwise queries fail with the following error: Error while executing SQL "SELECT * FROM
> YOOSHU_BID_REQUEST_VIEWS LIMIT 10": Timeout visiting cube!
>
> After running $KYLIN_HOME/bin/kylin.sh
> org.apache.kylin.storage.hbase.util.DeployCoprocessorCLI
> $KYLIN_HOME/lib/kylin-coprocessor-*.jar all
> queries work normally.
> But after the next build, the problem appears again.
>
> How can this be resolved?
>
>
>
>
> gaolv123...@163.com
>



-- 
Best regards,

Shaofeng Shi

Re: question to TOP-N feature

2016-06-08 Thread ShaoFeng Shi

Jie, thanks for the input; we will analyze and evaluate it. Please watch
the JIRA ticket.

2016-06-08 14:11 GMT+08:00 Jie Tao <jie@gameforge.com>:

> use case: find the last 100 players that do not play much in this month
> (we have a column "playtime").
>
> Thanks,
>
> Jie
>
>
> Am 07.06.2016 um 16:56 schrieb ShaoFeng Shi:
>
>> Although there is a JIRA (
>> https://issues.apache.org/jira/browse/KYLIN-1477)
>> for this, I haven't found a case; Jie, would you mind to share your
>> scenario?
>>
>> 2016-06-07 17:57 GMT+08:00 Jie Tao <jie@gameforge.com>:
>>
>>> is it possible to use this feature to show the last_N records, i.e. let
>>> Kylin sort in ascending order rather than descending order?
>>>
>>> Cheers,
>>>
>>> Jie
>>>
>>>
>>
>>
>


-- 
Best regards,

Shaofeng Shi

[RESULT] [VOTE] Release apache-kylin-1.5.2.1 (release candidate 1)

2016-06-07 Thread ShaoFeng Shi

Thanks to everyone who has tested the release candidate and given
their comments and votes.

The tally is as follows.

5 binding +1s:
Shaofeng Shi
Dong Li
Dayue Gao
Xiaoyu Wang
Luke Han

1 non-binding +1:
Chunen Ni


No 0s or -1s.

Therefore I am delighted to announce that the proposal to release
Apache-Kylin-1.5.2.1 has passed.

-- 
Best regards,

Shaofeng Shi

Re: Timeout visiting cube

2016-06-12 Thread ShaoFeng Shi

Hi, what's your kylin/hbase/hadoop version?

2016-06-12 11:43 GMT+08:00 gaolv123...@163.com <gaolv123...@163.com>:

> Hello:
> Every time I finish building a cube, I have to run "Update HBase Coprocessor";
> otherwise queries fail with the following error: Error while executing SQL "SELECT * FROM
> YOOSHU_BID_REQUEST_VIEWS LIMIT 10": Timeout visiting cube!
>
> After running $KYLIN_HOME/bin/kylin.sh
> org.apache.kylin.storage.hbase.util.DeployCoprocessorCLI
> $KYLIN_HOME/lib/kylin-coprocessor-*.jar all
> queries work normally.
> But after the next build, the problem appears again.
>
> How can this be resolved?
>
> Caused by: org.apache.hadoop.hbase.exceptions.UnknownProtocolException:
> org.apache.hadoop.hbase.exceptions.UnknownProtocolException: No registered
> coprocessor service found for name CubeVisitService in region
> KYLIN_NDT1PYHI7P,,1465298389410.ebf41fc2bd7ac44fb2f0b14a2146ecae.
> at
> org.apache.hadoop.hbase.regionserver.HRegion.execService(HRegion.java:7457)
> at
> org.apache.hadoop.hbase.regionserver.RSRpcServices.execServiceOnRegion(RSRpcServices.java:1891)
> at
> org.apache.hadoop.hbase.regionserver.RSRpcServices.execService(RSRpcServices.java:1873)
> at
> org.apache.hadoop.hbase.protobuf.generated.ClientProtos$ClientService$2.callBlockingMethod(ClientProtos.java:32389)
> at org.apache.hadoop.hbase.ipc.RpcServer.call(RpcServer.java:2117)
> at org.apache.hadoop.hbase.ipc.CallRunner.run(CallRunner.java:104)
> at
> org.apache.hadoop.hbase.ipc.RpcExecutor.consumerLoop(RpcExecutor.java:133)
> at org.apache.hadoop.hbase.ipc.RpcExecutor$1.run(RpcExecutor.java:108)
> at java.lang.Thread.run(Thread.java:745)
>
> at sun.reflect.NativeConstructorAccessorImpl.newInstance0(Native Method)
> at
> sun.reflect.NativeConstructorAccessorImpl.newInstance(NativeConstructorAccessorImpl.java:57)
> at
> sun.reflect.DelegatingConstructorAccessorImpl.newInstance(DelegatingConstructorAccessorImpl.java:45)
> at java.lang.reflect.Constructor.newInstance(Constructor.java:526)
> at
> org.apache.hadoop.ipc.RemoteException.instantiateException(RemoteException.java:106)
> at
> org.apache.hadoop.ipc.RemoteException.unwrapRemoteException(RemoteException.java:95)
> at
> org.apache.hadoop.hbase.protobuf.ProtobufUtil.getRemoteException(ProtobufUtil.java:325)
> at
> org.apache.hadoop.hbase.protobuf.ProtobufUtil.execService(ProtobufUtil.java:1620)
> at
> org.apache.hadoop.hbase.ipc.RegionCoprocessorRpcChannel$1.call(RegionCoprocessorRpcChannel.java:92)
> at
> org.apache.hadoop.hbase.ipc.RegionCoprocessorRpcChannel$1.call(RegionCoprocessorRpcChannel.java:89)
> at
> org.apache.hadoop.hbase.client.RpcRetryingCaller.callWithRetries(RpcRetryingCaller.java:126)
> at
> org.apache.hadoop.hbase.ipc.RegionCoprocessorRpcChannel.callExecService(RegionCoprocessorRpcChannel.java:95)
> at
> org.apache.hadoop.hbase.ipc.CoprocessorRpcChannel.callMethod(CoprocessorRpcChannel.java:56)
> at
> org.apache.kylin.storage.hbase.cube.v2.coprocessor.endpoint.generated.CubeVisitProtos$CubeVisitService$Stub.visitCube(CubeVisitProtos.java:3861)
> at
> org.apache.kylin.storage.hbase.cube.v2.CubeHBaseEndpointRPC$2.call(CubeHBaseEndpointRPC.java:362)
> at
> org.apache.kylin.storage.hbase.cube.v2.CubeHBaseEndpointRPC$2.call(CubeHBaseEndpointRPC.java:358)
> at org.apache.hadoop.hbase.client.HTable$16.call(HTable.java:1751)
> ... 4 more
> Caused by:
> org.apache.hadoop.hbase.ipc.RemoteWithExtrasException(org.apache.hadoop.hbase.exceptions.UnknownProtocolException):
> org.apache.hadoop.hbase.exceptions.UnknownProtocolException: No registered
> coprocessor service found for name CubeVisitService in region
> KYLIN_NDT1PYHI7P,,1465298389410.ebf41fc2bd7ac44fb2f0b14a2146ecae.
> at
> org.apache.hadoop.hbase.regionserver.HRegion.execService(HRegion.java:7457)
> at
> org.apache.hadoop.hbase.regionserver.RSRpcServices.execServiceOnRegion(RSRpcServices.java:1891)
> at
> org.apache.hadoop.hbase.regionserver.RSRpcServices.execService(RSRpcServices.java:1873)
> at
> org.apache.hadoop.hbase.protobuf.generated.ClientProtos$ClientService$2.callBlockingMethod(ClientProtos.java:32389)
> at org.apache.hadoop.hbase.ipc.RpcServer.call(RpcServer.java:2117)
> at org.apache.hadoop.hbase.ipc.CallRunner.run(CallRunner.java:104)
> at
> org.apache.hadoop.hbase.ipc.RpcExecutor.consumerLoop(RpcExecutor.java:133)
> at org.apache.hadoop.hbase.ipc.RpcExecutor$1.run(RpcExecutor.java:108)
> at java.lang.Thread.run(Thread.java:745)
>
> at org.apache.hadoop.hbase.ipc.RpcClientImpl.call(RpcClientImpl.java:1235)
> at
> org.apache.hadoop.hbase.ipc.AbstractRpcClient.callBlockingMethod(AbstractRpcClient.java:222)
> at
> org.apache.hadoop.hbase.ipc.AbstractRpcClient$BlockingRpcChannelImplementation.callBlockingMethod(AbstractRpcClient.java:323)
> at
> org.apache.hadoop.hbase.protobuf.generated.ClientProtos$ClientService$BlockingStub.execService(ClientProtos.java:32855)
> at
> org.apache.hadoop.hbase.protobuf.ProtobufUtil.execService(ProtobufUtil.java:1616)
> ... 13 more
>
>
>
>
> gaolv123...@163.com
>



-- 
Best regards,

Shaofeng Shi

Re: Re: Timeout visiting cube

2016-06-12 Thread ShaoFeng Shi

When 1.5.1 was released, we had not yet run Kylin successfully on CDH 5.7.
Starting with 1.5.2, Kylin provides a binary package (as well as a code
branch) for CDH 5.7. Did you try the latest version? It can be downloaded from
Kylin's download page. If the issue persists after upgrading, feel free to
open a JIRA with the necessary logs, and we will investigate.

2016-06-13 9:58 GMT+08:00 gaolv123...@163.com <gaolv123...@163.com>:

> Version info:
>
> Kylin 1.5.1
> HBase 1.1.4 for apache package
> hadoop cdh5.7
>
>
>
> gaolv123...@163.com
>
> From: ShaoFeng Shi
> Date: 2016-06-13 09:51
> To: dev
> Subject: Re: Timeout visiting cube
> Hi, what's your kylin/hbase/hadoop version?
>
> 2016-06-12 11:43 GMT+08:00 gaolv123...@163.com <gaolv123...@163.com>:
>
> > Hello:
> > Every time I finish building a cube, I have to run "Update HBase Coprocessor";
> > otherwise queries fail with the following error: Error while executing SQL "SELECT * FROM
> > YOOSHU_BID_REQUEST_VIEWS LIMIT 10": Timeout visiting cube!
> >
> > After running $KYLIN_HOME/bin/kylin.sh
> > org.apache.kylin.storage.hbase.util.DeployCoprocessorCLI
> > $KYLIN_HOME/lib/kylin-coprocessor-*.jar all
> > queries work normally.
> > But after the next build, the problem appears again.
> >
> > How can this be resolved?

Re: How to do DateDiff in Kylin

2016-06-11 Thread ShaoFeng Shi

Julian, starting from which version does Calcite support the minus operation
between two dates?

I tried "select cal_dt, (current_date - cal_dt) as diff from kylin_cal_dt"
in Kylin, but get this error:

Error while executing SQL "select cal_dt, (current_date - cal_dt) as diff
from kylin_cal_dt LIMIT 5": From line 1, column 17 to line 1, column
37: Cannot apply '-' to arguments of type ' - '. Supported
form(s): ' - ' ' -
' ' - '

Kylin uses Calcite 1.6.0 now; Is an upgrade or type cast needed? Thanks!

2016-06-11 6:02 GMT+08:00 Julian Hyde <jh...@apache.org>:

> In Calcite you can do
>
>   (currentdate - cal_date) DAYS
>
> which returns an INTERVAL DAYS value.
>
> In https://issues.apache.org/jira/browse/CALCITE-1124 we added
> TIMESTAMPADD, TIMESTAMPDIFF.
>
> DATEDIFF is mentioned in
> https://issues.apache.org/jira/browse/CALCITE-759 but that has not
> been implemented.
>
> Julian
>
>
> On Fri, Jun 10, 2016 at 7:54 AM, ShaoFeng Shi <shaofeng...@apache.org>
> wrote:
> > I don't see such a function in Calcite (Kylin's SQL parser):
> > https://calcite.apache.org/docs/reference.html#datetime-functions
> >
> >
> > 2016-06-10 10:04 GMT+08:00 Uma Maheshwar Kamuni <ukam...@centramed.co>:
> >
> >> i want make DateDiff between currentdate and cal_date (which contains
> date)
> >>
> >> 
> >> From: ShaoFeng Shi <shaofeng...@apache.org>
> >> Sent: Thursday, June 9, 2016 8:22:08 PM
> >> To: dev@kylin.apache.org
> >> Subject: Re: How to do DateDiff in Kylin
> >>
> >> Hi Uma, could you please give a sample SQL with DateDiff?
> >>
> >> 2016-06-09 13:49 GMT+08:00 Uma Maheshwar Kamuni <ukam...@centramed.co>:
> >>
> >> > is there any function for DateDiff in Kylin.
> >> >
> >>
> >>
> >>
> >> --
> >> Best regards,
> >>
> >> Shaofeng Shi
> >>
> >
> >
> >
> > --
> > Best regards,
> >
> > Shaofeng Shi
>



-- 
Best regards,

Shaofeng Shi

Re: kylin_client_tool.py

2016-05-25 Thread ShaoFeng Shi

Check this: https://issues.apache.org/jira/browse/KYLIN-1249

We haven't merged that yet; you can apply the patch locally.

2016-05-25 15:45 GMT+08:00 Tao Li <tao...@envisioncn.com>:

> Hi,
>
>Where can I get the kylin_client_tool ?
>
>
> Best regards,
>
>
>
> Tao Li
>
>
>
>
>
>



-- 
Best regards,

Shaofeng Shi

Re: How to do DateDiff in Kylin

2016-06-10 Thread ShaoFeng Shi

I don't see such a function in Calcite (Kylin's SQL parser):
https://calcite.apache.org/docs/reference.html#datetime-functions


2016-06-10 10:04 GMT+08:00 Uma Maheshwar Kamuni <ukam...@centramed.co>:

> i want make DateDiff between currentdate and cal_date (which contains date)
>
> ____
> From: ShaoFeng Shi <shaofeng...@apache.org>
> Sent: Thursday, June 9, 2016 8:22:08 PM
> To: dev@kylin.apache.org
> Subject: Re: How to do DateDiff in Kylin
>
> Hi Uma, could you please give a sample SQL with DateDiff?
>
> 2016-06-09 13:49 GMT+08:00 Uma Maheshwar Kamuni <ukam...@centramed.co>:
>
> > is there any function for DateDiff in Kylin.
> >
>
>
>
> --
> Best regards,
>
> Shaofeng Shi
>



-- 
Best regards,

Shaofeng Shi

Re: How to do DateDiff in Kylin

2016-06-11 Thread ShaoFeng Shi

Thanks Julian. After I added the time unit, the SQL syntax error is gone.
However, the diff values are all "+0", which is probably a Kylin issue.

Uma, please try the query; if you see the same result, please open a Kylin
JIRA: http://issues.apache.org/jira/secure/Dashboard.jspa

Thanks;
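
For reference when checking the "+0" values, here is a minimal Python sketch
of what a day-level difference such as (current_date - cal_dt) DAY would be
expected to yield. The function name and the sample dates are made up for
illustration; this is not how Kylin or Calcite evaluates the expression.

```python
from datetime import date


def day_diff(later, earlier):
    """Whole days between two dates; conceptually (later - earlier) DAY."""
    return (later - earlier).days


# Sample dates chosen only for illustration.
print(day_diff(date(2016, 6, 11), date(2016, 6, 1)))
```

Rows where cal_dt differs from the current date would be expected to show a
non-zero diff rather than "+0".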

2016-06-12 8:18 GMT+08:00 Julian Hyde <jh...@apache.org>:

> You need to include a time unit, e.g. DAY or MONTH or YEAR TO MONTH. The
> syntax is
>
>   (datetime - datetime) timeunit
>
> e.g.
>
>   (d2 - d1) DAY
>   (d2 - d1) YEAR TO MONTH
>
> This has been in Calcite for quite a few releases.
>
>
>
> > On Jun 11, 2016, at 3:27 AM, ShaoFeng Shi <shaofeng...@apache.org>
> wrote:
> >
> > Julian, from which version Calcite supports the minus operation between
> two
> > dates?
> >
> > I tried "select cal_dt, (current_date - cal_dt) as diff from
> kylin_cal_dt"
> > in Kylin, but get this error:
> >
> > Error while executing SQL "select cal_dt, (current_date - cal_dt) as diff
> > from kylin_cal_dt LIMIT 5": From line 1, column 17 to line 1, column
> > 37: Cannot apply '-' to arguments of type ' - '. Supported
> > form(s): ' - ' ' -
> > ' ' - '
> >
> > Kylin uses Calcite 1.6.0 now; Is an upgrade or type cast needed? Thanks!
> >
> > 2016-06-11 6:02 GMT+08:00 Julian Hyde <jh...@apache.org>:
> >
> >> In Calcite you can do
> >>
> >>  (currentdate - cal_date) DAYS
> >>
> >> which returns an INTERVAL DAYS value.
> >>
> >> In https://issues.apache.org/jira/browse/CALCITE-1124 we added
> >> TIMESTAMPADD, TIMESTAMPDIFF.
> >>
> >> DATEDIFF is mentioned in
> >> https://issues.apache.org/jira/browse/CALCITE-759 but that has not
> >> been implemented.
> >>
> >> Julian
> >>
> >>
> >> On Fri, Jun 10, 2016 at 7:54 AM, ShaoFeng Shi <shaofeng...@apache.org>
> >> wrote:
> >>> I don't see such a function in Calcite (Kylin's SQL parser):
> >>> https://calcite.apache.org/docs/reference.html#datetime-functions
> >>>
> >>>
> >>> 2016-06-10 10:04 GMT+08:00 Uma Maheshwar Kamuni <ukam...@centramed.co
> >:
> >>>
> >>>> i want make DateDiff between currentdate and cal_date (which contains
> >> date)
> >>>>
> >>>> ____
> >>>> From: ShaoFeng Shi <shaofeng...@apache.org>
> >>>> Sent: Thursday, June 9, 2016 8:22:08 PM
> >>>> To: dev@kylin.apache.org
> >>>> Subject: Re: How to do DateDiff in Kylin
> >>>>
> >>>> Hi Uma, could you please give a sample SQL with DateDiff?
> >>>>
> >>>> 2016-06-09 13:49 GMT+08:00 Uma Maheshwar Kamuni <ukam...@centramed.co
> >:
> >>>>
> >>>>> is there any function for DateDiff in Kylin.
> >>>>>
> >>>>
> >>>>
> >>>>
> >>>> --
> >>>> Best regards,
> >>>>
> >>>> Shaofeng Shi
> >>>>
> >>>
> >>>
> >>>
> >>> --
> >>> Best regards,
> >>>
> >>> Shaofeng Shi
> >>
> >
> >
> >
> > --
> > Best regards,
> >
> > Shaofeng Shi
>
>


-- 
Best regards,

Shaofeng Shi

Re: Reply: Reply: Reply: snapshot table not update

2016-06-11 Thread ShaoFeng Shi

Issue created; please subscribe to it for updates:
https://issues.apache.org/jira/browse/KYLIN-1780

2016-06-12 11:54 GMT+08:00 yubo-...@yolo24.com <yubo-...@yolo24.com>:

> Yes. All rows in the table changed
>
> From: ShaoFeng Shi-2 [via Apache Kylin] [mailto:
> ml-node+s74782n4911...@n6.nabble.com]
> Sent: June 12, 2016 11:11
> To: yubo-ds1 (于渤, Big Data Center, Big Data Platform Department)
> Subject: Re: Reply: Reply: snapshot table not update
>
> Is the table content changed in between?
>
> 2016-06-12 10:55 GMT+08:00 [hidden
> email] <[hidden
> email]>:
>
> > I updated all the rows.
> >
> > Actually, I dropped the table and created it again.
> >
> > From: ShaoFeng Shi-2 [via Apache Kylin] [mailto:
> > [hidden email]]
> > Sent: June 12, 2016 10:07
> > To: yubo-ds1 (于渤, Big Data Center, Big Data Platform Department)
> > Subject: Re: Reply: snapshot table not update
> >
> > I checked the code, it is very likely a bug; Before start working on
> that,
> > I want to know what kind of change was made in your lookup table:
> > add/delete record, or update a couple of rows? Thanks.
> >
> > 2016-06-12 9:08 GMT+08:00 [hidden
> > email] <[hidden
> > email]>:
> >
> > > Yes,right
> > >
> > > From: Yang [via Apache Kylin] [mailto:[hidden
> > email]]
> > > Sent: June 12, 2016 6:32
> > > To: yubo-ds1 (于渤, Big Data Center, Big Data Platform Department)
> > > Subject: Re: snapshot table not update
> > >
> > > To confirm understanding. You built a cube, updated the lookup table in
> > > hive, and built it again. And the second build didn't pick up the
> latest
> > > lookup table. Is that correct?
> > >
> > > On Wed, Jun 8, 2016 at 11:45 AM, [hidden
> > > email] <[hidden
> > > email]>
> > > wrote:
> > >
> > > > I defined a lookup table in a cube; when the data in the lookup table
> > > > changed, the snapshot was not updated.
> > > >
> > > > I can find logs as below:
> > > >
> > > >
> > > >
> > > > 2016-06-08 11:03:07,443 INFO  [pool-5-thread-7]
> > > lookup.SnapshotManager:181
> > > > :
> > > > Loading snapshotTable from /table_snapshot/siteidmapping/8bd
> > > > d3aba-0842-432b-be99-2ba2bb1a852d.snapshot, with loadData: true
> > > > 2016-06-08 11:03:07,447 DEBUG [pool-5-thread-7]
> > > lookup.SnapshotManager:187
> > > > :
> > > > Loaded snapshot at /table_snapshot/siteidmapping/8bdd3aba-08
> > > > 42-432b-be99-2ba2bb1a852d.snapshot
> > > > 2016-06-08 11:03:07,447 INFO  [pool-5-thread-7]
> > > lookup.SnapshotManager:130
> > > > :
> > > > Identical snapshot content org.apache.kylin.dict.lookup.Snap
> > > > shotTable@9652a8b3, reuse existing snapshot at
> > > >
> > >
> >
> /table_snapshot/siteidmapping/8bdd3aba-0842-432b-be99-2ba2bb1a852d.snapshot
> > > >
> > > >
> > > > --
> > > > View this message in context:
> > > >
> > >
> >
> http://apache-kylin.74782.x6.nabble.com/snapshot-table-not-update-tp4854.html
> > > > Sent from the Apache Kylin mailing list archive at Nabble.com.
> > > >
> > >
> > > 
> > >
> > >
> > > --
> > > View this message in context:
> > >
> >
> http://apache-kylin.74782.x6.nabble.com/snapshot-table-not-update-tp4854p4906.html
> > > Sent from the Apache Kylin mailing list archive at Nabble.com.
> > >
> >
> >
> >
> > --
> > Best regards,
> >
> > Shaofeng Shi
> >
> > 

Re: Reply: Reply: snapshot table not update

2016-06-11 Thread ShaoFeng Shi

Did the table content change in between?

2016-06-12 10:55 GMT+08:00 yubo-...@yolo24.com <yubo-...@yolo24.com>:

> I updated all the rows.
>
> Actually, I dropped the table and created it again.
>
> From: ShaoFeng Shi-2 [via Apache Kylin] [mailto:
> ml-node+s74782n490...@n6.nabble.com]
> Sent: June 12, 2016 10:07
> To: yubo-ds1 (于渤, Big Data Center, Big Data Platform Department)
> Subject: Re: Reply: snapshot table not update
>
> I checked the code, it is very likely a bug; Before start working on that,
> I want to know what kind of change was made in your lookup table:
> add/delete record, or update a couple of rows? Thanks.
>
> 2016-06-12 9:08 GMT+08:00 [hidden
> email] <[hidden
> email]>:
>
> > Yes,right
> >
> > From: Yang [via Apache Kylin] [mailto:[hidden
> email]]
> > Sent: June 12, 2016 6:32
> > To: yubo-ds1 (于渤, Big Data Center, Big Data Platform Department)
> > Subject: Re: snapshot table not update
> >
> > To confirm understanding. You built a cube, updated the lookup table in
> > hive, and built it again. And the second build didn't pick up the latest
> > lookup table. Is that correct?
> >
> > On Wed, Jun 8, 2016 at 11:45 AM, [hidden
> > email] <[hidden
> > email]>
> > wrote:
> >
> > > I defined a lookup table in a cube; when the data in the lookup table
> > > changed, the snapshot was not updated.
> > >
> > > I can find logs as below:
> > >
> > >
> > >
> > > 2016-06-08 11:03:07,443 INFO  [pool-5-thread-7]
> > lookup.SnapshotManager:181
> > > :
> > > Loading snapshotTable from /table_snapshot/siteidmapping/8bd
> > > d3aba-0842-432b-be99-2ba2bb1a852d.snapshot, with loadData: true
> > > 2016-06-08 11:03:07,447 DEBUG [pool-5-thread-7]
> > lookup.SnapshotManager:187
> > > :
> > > Loaded snapshot at /table_snapshot/siteidmapping/8bdd3aba-08
> > > 42-432b-be99-2ba2bb1a852d.snapshot
> > > 2016-06-08 11:03:07,447 INFO  [pool-5-thread-7]
> > lookup.SnapshotManager:130
> > > :
> > > Identical snapshot content org.apache.kylin.dict.lookup.Snap
> > > shotTable@9652a8b3, reuse existing snapshot at
> > >
> >
> /table_snapshot/siteidmapping/8bdd3aba-0842-432b-be99-2ba2bb1a852d.snapshot
> > >
> > >
> > > --
> > > View this message in context:
> > >
> >
> http://apache-kylin.74782.x6.nabble.com/snapshot-table-not-update-tp4854.html
> > > Sent from the Apache Kylin mailing list archive at Nabble.com.
> > >
> >
> > 
> >
> >
> > --
> > View this message in context:
> >
> http://apache-kylin.74782.x6.nabble.com/snapshot-table-not-update-tp4854p4906.html
> > Sent from the Apache Kylin mailing list archive at Nabble.com.
> >
>
>
>
> --
> Best regards,
>
> Shaofeng Shi
>
> 
>
>
> --
> View this message in context:
> http://apache-kylin.74782.x6.nabble.com/snapshot-table-not-update-tp4854p4910.html
> Sent from the Apache Kylin mailing list archive at Nabble.com.
>



-- 
Best regards,

Shaofeng Shi

Re: Configuration problem

2016-06-08 Thread ShaoFeng Shi

nnectionManager$HConnectionImplementation: Closing zookeeper
> sessionid=0x15530aec581000a
> 2016-06-09 00:41:05,349 INFO  [main-EventThread] zookeeper.ClientCnxn:
> EventThread shut down
> 2016-06-09 00:41:05,354 INFO  [Thread-1] zookeeper.ZooKeeper: Session:
> 0x15530aec581000a closed




-- 
Best regards,

Shaofeng Shi

Re: stream build problem

2016-06-08 Thread ShaoFeng Shi

I guess you're running the master branch, which is unstable; I suggest using
the latest released version, 1.5.2.1. For the error you reported, a JIRA has
been created: https://issues.apache.org/jira/browse/KYLIN-1777

Thanks;

2016-06-09 10:19 GMT+08:00 ShaoFeng Shi <shaofeng...@apache.org>:

> Hi felix, which version are you using? It seems not v1.5.2;
>
> 2016-06-09 8:03 GMT+08:00 felixcui01 <cxf@gmail.com>:
>
>>
>> anyone can help? thanks in advance
>>
>> --
>> View this message in context:
>> http://apache-kylin.74782.x6.nabble.com/stream-build-problem-tp4857p4879.html
>> Sent from the Apache Kylin mailing list archive at Nabble.com.
>>
>
>
>
> --
> Best regards,
>
> Shaofeng Shi
>
>


-- 
Best regards,

Shaofeng Shi

Re: Unable build sample cube.Failing at Step 2

2016-05-30 Thread ShaoFeng Shi

What's the Kylin version? In step 2, was there any message in the "Log"
(next to "parameters")?

2016-05-30 13:23 GMT+08:00 Uma Maheshwar Kamuni <ukam...@centramed.co>:

> Hi ,
>
> Below are the versions I am using:
>
>
> Hadoop : 2.7.1
>
> HBase : 1.1.3
>
> Hive : 1.2.1
>
>
> Below is the error i am getting in log file:
> 2016-05-26 12:38:15,365 INFO  [pool-2-thread-2]
> manager.ExecutableManager:274 : job
> id:5401c534-e765-4c47-b688-72291b03ef13-01 from READY to RUNNING
> 2016-05-26 12:38:15,509 INFO  [pool-2-thread-2]
> execution.AbstractExecutable:113 : parameters of the MapReduceExecutable:
> 2016-05-26 12:38:15,509 INFO  [pool-2-thread-2]
> execution.AbstractExecutable:114 :  -conf
> /usr/local/kylin/bin/../conf/kylin_job_conf.xml -cubename kylin_sales_cube
> -output
> /kylin/kylin_metadata/kylin-5401c534-e765-4c47-b688-72291b03ef13/kylin_sales_cube/fact_distinct_columns
> -segmentname 2012010100_2014040100 -statisticsenabled true
> -statisticsoutput
> /kylin/kylin_metadata/kylin-5401c534-e765-4c47-b688-72291b03ef13/kylin_sales_cube/statistics
> -statisticssamplingpercent 100 -jobname
> Kylin_Fact_Distinct_Columns_kylin_sales_cube_Step
> 2016-05-26 12:38:26,156 INFO  [pool-2-thread-2]
> manager.ExecutableManager:274 : job
> id:5401c534-e765-4c47-b688-72291b03ef13-01 from RUNNING to ERROR
>
>
> I have checked the Resource Manager UI; even there I am not able to see any
> job.
>
> The job is not being submitted to Hadoop.
>
>
> Is there any solution for this?
> Regards,
> Mahesh
>



-- 
Best regards,

Shaofeng Shi

Re: question about the joint dimension and query

2016-05-26 Thread ShaoFeng Shi

You can give Kylin more memory in bin/setenv.sh. If the OOM error still
occurs, you will need to investigate further.

2016-05-27 11:24 GMT+08:00 耳东 <775620...@qq.com>:

> hi All:
>
>
> when I execute a sql, the kylin server is down and in the kylin.out,
> it shows as follows, where should I configure？
> # java.lang.OutOfMemoryError: Java heap space
> # -XX:OnOutOfMemoryError="kill -9 %p"
> #   Executing /bin/sh -c "kill -9 1909"...




-- 
Best regards,

Shaofeng Shi

Re: Understanding the cube building process

2016-05-26 Thread ShaoFeng Shi

FYI, the doc for streaming cubing is published on the website:
https://kylin.apache.org/docs15/tutorial/cube_streaming.html

2016-05-04 14:13 GMT+08:00 Li Yang <liy...@apache.org>:

> Shaofeng is working on a document about Kafka and streaming cubing. Let's
> wait.
>
> On Tue, May 3, 2016 at 11:26 PM, Nick Dimiduk <ndimi...@apache.org> wrote:
>
> > Very nice talk, thank you. That helped put many things into context for
> me.
> > I will resume my study of the code for understanding engine
> implementation
> > details.
> >
> > One final question -- is there a doc for getting started with the
> > experimental Kafka integration?
> >
> > Thanks,
> > Nick
> >
> > On Tue, May 3, 2016 at 2:45 AM, Li Yang <liy...@apache.org> wrote:
> >
> > > It's complicated. As of Kylin 1.5, there are two flavors of cubing
> > > algorithm. Below talk covered a bit. There's no comprehensive document
> at
> > > the moment.
> > >
> > > https://www.youtube.com/watch?v=n74zvLmIgF0
> > >
> > >
> > > On Tue, May 3, 2016 at 7:52 AM, Nick Dimiduk <ndimi...@apache.org>
> > wrote:
> > >
> > > > Hi there,
> > > >
> > > > I'm curious to understand how Kylin goes about building cubes. I've
> > > > deployed it on a single-node cluster and played around with the
> sample
> > > cube
> > > > [0]. Now i'm looking through the kylin server log and the code in the
> > > > 'engine-mr'. I'm not finding much in the way of docs in the source
> code
> > > > though :(
> > > >
> > > > Is there any presentation, blog post,  that gives and overview of
> > these
> > > > internals? I did find [1] but I'm looking go descend another level.
> I'm
> > > > curious about the various steps involved (looks like it ran 18
> "steps"
> > > and
> > > > 10 MR jobs), what they're doing. I'm also curious about the schema
> > design
> > > > for the data model in HBase.
> > > >
> > > > Thanks in advance!
> > > > -n
> > > >
> > > > [0]: http://kylin.apache.org/docs15/tutorial/kylin_sample.html
> > > > [1]: http://www.slideshare.net/YangLi43/design-cube-in-apache-kylin
> > > >
> > >
> >
>



-- 
Best regards,

Shaofeng Shi

Re: [ANNOUNCE] Apache Kylin 1.5.2 released

2016-05-26 Thread ShaoFeng Shi

The upgrade guide has been updated for this version; please read through it
if you're upgrading from an older version:

https://kylin.apache.org/docs15/howto/howto_upgrade.html

2016-05-26 17:40 GMT+08:00 Dong Li <lid...@apache.org>:

> The Apache Kylin team is pleased to announce the immediate availability of
> the 1.5.2 release. The release note can be found here [1]; The source
> code and binary package can be downloaded from Kylin's download page [2].
>
> The Apache Kylin Team would like to hear from you and welcomes your
> comments and contributions.
>
> Thanks,
> The Apache Kylin Team
>
> [1] https://kylin.apache.org/docs15/release_notes.html
> [2] https://kylin.apache.org/download/
>



-- 
Best regards,

Shaofeng Shi

Re: kylin hbase 1.2.1

2016-06-15 Thread ShaoFeng Shi

> at java.lang.reflect.Method.invoke(Method.java:606)
> at org.apache.hadoop.util.RunJar.run(RunJar.java:221)
> at org.apache.hadoop.util.RunJar.main(RunJar.java:136)
> 2016-06-15 17:06:42,218 INFO  [Thread-1]
> client.ConnectionManager$HConnectionImplementation: Closing zookeeper
> sessionid=0x155534c2e010006
> 2016-06-15 17:06:42,313 INFO  [Thread-1] zookeeper.ZooKeeper: Session:
> 0x155534c2e010006 closed
> 2016-06-15 17:06:42,329 INFO  [main-EventThread] zookeeper.ClientCnxn:
> EventThread shut down
>



-- 
Best regards,

Shaofeng Shi

Re: GHow to query TOP-N

2016-06-13 Thread ShaoFeng Shi

yes

2016-06-14 0:40 GMT+08:00 Jie Tao <jie@gameforge.com>:

>
> shall I use "select sum(price) group by seller_id" for the defined measure
> TOP_SELLER with sum(price) and seller_id?
>
> Cheers,
>
> Jie
>



-- 
Best regards,

Shaofeng Shi

Re: Reply: Re: Timeout visiting cube!

2016-06-21 Thread ShaoFeng Shi

Hi Gaolv,

Did you compare the HTable before and after running
"org.apache.kylin.storage.hbase.util.DeployCoprocessorCLI"?

You can build a new segment and describe the table in hbase shell, then run
DeployCoprocessorCLI and describe it again to see whether the table's
metadata has changed. I'm wondering whether Kylin still uses the old
coprocessor for a newly created table.
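
For the comparison step, here is a small Python sketch, assuming the two
"describe" outputs from hbase shell have been saved as text. The function
name is hypothetical, and the sketch only compares lines; it does not talk to
HBase.

```python
def changed_lines(before, after):
    """Return (removed, added) lines between two saved describe outputs."""
    before_lines = set(before.splitlines())
    after_lines = set(after.splitlines())
    removed = sorted(before_lines - after_lines)
    added = sorted(after_lines - before_lines)
    return removed, added


# Placeholder text standing in for real describe output.
removed, added = changed_lines("attr_a\nattr_b", "attr_a\nattr_c")
print("removed:", removed)
print("added:", added)
```

If both lists are empty, running DeployCoprocessorCLI did not change the
table's metadata as seen by describe.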

2016-06-13 18:00 GMT+08:00 赵天烁 <zhaotians...@meizu.com>:

> I used to have the same issue as you; I solved it by updating the HBase
> coprocessor. Here is the official guide:
> http://kylin.apache.org/docs15/howto/howto_update_coprocessor.html
>
> -
>
> 赵天烁
> Kevin Zhao
> zhaotians...@meizu.com
>
> 珠海市魅族科技有限公司
> MEIZU Technology Co., Ltd.
> 广东省珠海市科技创新海岸魅族科技楼
> MEIZU Tech Bldg., Technology & Innovation Coast
> Zhuhai, 519085, Guangdong, China
>
>
> meizu.com
>
>
>
>
> -Original Message-
> From: Li Yang [mailto:liy...@apache.org]
> Sent: June 13, 2016 16:41
> To: dev@kylin.apache.org
> Cc: 吴钰彬 <wuyu...@baixing.com>
> Subject: Re: Re: Timeout visiting cube!
>
> I suspect something went wrong when creating the HTable and registering
> the coprocessor.
>
> However without logs, I'm not sure.
>
>
> On Fri, Jun 10, 2016 at 12:27 PM, gaolv123...@163.com <gaolv123...@163.com
> >
> wrote:
>
> > Version info:
> >
> > Kylin 1.5.1
> > HBase 1.1.4 (Apache package).
> >
> > What may be the problem?
> >
> >
> >
> > gaolv123...@163.com
> >
> > From: Li Yang
> > Sent: 2016-06-08 17:24
> > To: dev
> > Cc: 吴钰彬
> > Subject: Re: Reply: Timeout visiting cube!
> > There must be something wrong during cube build. If you are using
> > 1.5.2, there is a diagnosis tool that extract useful info that helps
> diagnose.
> >
> > On Wed, Jun 8, 2016 at 10:07 AM, ShaoFeng Shi <shaofeng...@apache.org>
> > wrote:
> >
> > > Hi Gao, what's your Kylin version and HBase version?
> > >
> > > 2016-06-08 9:20 GMT+08:00 gaolv123...@163.com <gaolv123...@163.com>:
> > >
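> > > > Hello:
> > > > It is not actually a timeout; Kylin throws an exception, shown below.
> > > > This problem appears after every build, and it only goes away after
> > > > running $KYLIN_HOME/bin/kylin.sh
> > > > org.apache.kylin.storage.hbase.util.DeployCoprocessorCLI
> > > > $KYLIN_HOME/lib/kylin-coprocessor-*.jar all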
> > > >
> > > > Caused by: org.apache.hadoop.hbase.exceptions.UnknownProtocolException:
> > > > org.apache.hadoop.hbase.exceptions.UnknownProtocolException: No registered
> > > > coprocessor service found for name CubeVisitService in region
> > > > KYLIN_NDT1PYHI7P,,1465298389410.ebf41fc2bd7ac44fb2f0b14a2146ecae.
> > > > at org.apache.hadoop.hbase.regionserver.HRegion.execService(HRegion.java:7457)
> > > > at org.apache.hadoop.hbase.regionserver.RSRpcServices.execServiceOnRegion(RSRpcServices.java:1891)
> > > > at org.apache.hadoop.hbase.regionserver.RSRpcServices.execService(RSRpcServices.java:1873)
> > > > at org.apache.hadoop.hbase.protobuf.generated.ClientProtos$ClientService$2.callBlockingMethod(ClientProtos.java:32389)
> > > > at org.apache.hadoop.hbase.ipc.RpcServer.call(RpcServer.java:2117)
> > > > at org.apache.hadoop.hbase.ipc.CallRunner.run(CallRunner.java:104)
> > > > at org.apache.hadoop.hbase.ipc.RpcExecutor.consumerLoop(RpcExecutor.java:133)
> > > > at org.apache.hadoop.hbase.ipc.RpcExecutor$1.run(RpcExecutor.java:108)
> > > > at java.lang.Thread.run(Thread.java:745)
> > > >
> > > > at sun.reflect.NativeConstructorAccessorImpl.newInstance0(Native Method)
> > > > at sun.reflect.NativeConstructorAccessorImpl.newInstance(NativeConstructorAccessorImpl.java:57)
> > > > at sun.reflect.DelegatingConstructorAccessorImpl.newInstance(DelegatingConstructorAccessorImpl.java:45)
> > > > at java.lang.reflect.Constructor.newInstance(Constructor.java:526)
> > > > at org.apache.hadoop.ipc.RemoteException.instantiateException(RemoteException.java:106)
> > > > at
> > > >
> > >
> >

Re: Kylin failed to startup

2016-06-22 Thread ShaoFeng Shi

I remember CDH 5.4 isn't supported, see
https://issues.apache.org/jira/browse/KYLIN-1089

For a trial, you can use CDH 5.7 and download the matching Kylin binary
package from Kylin's download page.

2016-06-22 12:44 GMT+08:00 沙漠火狐 <278211...@qq.com>:

> Hi,
>    I'm a new Kylin user. I installed the Cloudera QuickStart VM 5.4 on
> my computer; the Kylin version is Apache Kylin v1.5.2.1
> <http://www.apache.org/dyn/closer.cgi/kylin/apache-kylin-1.5.2.1/>,
>    and I set KYLIN_HOME and CATALINA_HOME.
>
>    After finishing the installation, I entered the hbase shell and created a
> table; my Java program in Eclipse can reach it, like this:
> Configuration configuration = HBaseConfiguration.create();
>    configuration.set("hbase.zookeeper.quorum", "quickstart.cloudera");
>    configuration.set("hbase.zookeeper.property.clientPort", "2181");
>    configuration.set("hbase.defaults.for.version.skip", "true");
>    configuration.set("zookeeper.znode.parent", "/hbase");
> so I think the cluster is set up correctly.
>
>    Then I added the dfs.permissions property to hdfs-site.xml and set its
> value to false.
>
>
>    As root, I ran ./check-env.sh.
>
> But when I start Kylin with ./kylin.sh start, I get an error:
>
> Caused by: java.lang.IllegalArgumentException: File not exist by
> 'kylin_metadata@hbase': /usr/mysoft/kylin/bin/kylin_metadata@hbase
> at
> org.apache.kylin.common.persistence.FileResourceStore.(FileResourceStore.java:49)
> ... 58 more
> 2016-06-21 21:37:17,095 ERROR [localhost-startStop-1]
> persistence.ResourceStore:91 : Create new store instance failed
> java.lang.reflect.InvocationTargetException
> at sun.reflect.NativeConstructorAccessorImpl.newInstance0(Native Method)
>
> Caused by: java.lang.IllegalArgumentException: Failed to find metadata
> store by url: kylin_metadata@hbase
> at
> org.apache.kylin.common.persistence.ResourceStore.createResourceStore(ResourceStore.java:93)
> at
> org.apache.kylin.common.persistence.ResourceStore.getStore(ResourceStore.java:104)
> at org.apache.kylin.cube.CubeManager.getStore(CubeManager.java:880)
>
>
>   Is there a problem with the kylin_metadata@hbase setting in
> kylin.properties? Thanks.
>
>
>
>
>
>


-- 
Best regards,

Shaofeng Shi

Re: [jira] [Created] (KYLIN-1813) intermediate table in Hive not cleaned up

2016-06-22 Thread ShaoFeng Shi

Hi Jie, the screenshots were not attached to the JIRA; could you try again,
or send them via email so that I can attach them? Thank you.


2016-06-22 15:02 GMT+08:00 Jie Tao <jie@gameforge.com>:

> I cannot upload the files to this JIRA, so here they are.
>
> Jie
>
>
> Am 22.06.2016 um 08:59 schrieb Jie Tao (JIRA):
>
>> Jie Tao created KYLIN-1813:
>> --
>>
>>   Summary: intermediate table in Hive not cleaned up
>>   Key: KYLIN-1813
>>   URL: https://issues.apache.org/jira/browse/KYLIN-1813
>>   Project: Kylin
>>Issue Type: Bug
>>Components: General
>>  Affects Versions: v1.5.2
>>   Environment: linux
>>  Reporter: Jie Tao
>>  Priority: Minor
>>
>>
>> After the cube build failed with ERROR, I discarded the cube, but it is
>> not marked as discarded like the other jobs that I discarded while the
>> cube build was running (see job.png: the job is not marked in black and
>> offers only one choice, "diagnose"). Hence, when I manually cleaned up
>> intermediate data, I got messages like:
>>
>> Remove intermediate hive table with job id
>> 493fd20b-3074-403e-9963-fe4fb7ff7c65 with job status ERROR
>> 2016-06-17 09:37:12,648 INFO  [main StorageCleanupJob:262]: Remove
>> intermediate hive table with job id
>> 8a377e30-e3ba-4fe2-be12-e7d412afec5ewith job status ERROR
>>
>> In this case the intermediate tables were not removed from hive (see
>> hive.png). The json is also attached json.txt
>>
>>
>>
>> --
>> This message was sent by Atlassian JIRA
>> (v6.3.4#6332)
>>
>
>


-- 
Best regards,

Shaofeng Shi

Re: Extract Fact Table Distinct Columns problem

2016-06-16 Thread ShaoFeng Shi

You can read this article:
https://kylin.apache.org/blog/2015/08/13/kylin-dictionary/

2016-06-16 17:23 GMT+08:00 仇同心 <qiutong...@jd.com>:

> Kylin crew:
>
>
>
>    "Extract Fact Table Distinct Columns": OOM often happens in this step.
> I don't know what this step does; could you give a brief description of
> its function?
>
>
>
> In FactDistinctColumnsJob.class, line 80: List
> columnsNeedDict = cubeMgr.getAllDictColumnsOnFact(cubeDesc). What does this
> method do?
>
>
>
>
>
>
>
>
>
> Thanks!
>
>
>
>
>



-- 
Best regards,

Shaofeng Shi

Re: kylin intermediate tables in Hive

2016-06-17 Thread ShaoFeng Shi

This is common. If a job fails partway and you discard it, the "Garbage
collection" step is not executed, so the garbage is left behind.

This is why we still recommend running the offline cleanup periodically; it
is not perfect, but it works for most scenarios:
https://kylin.apache.org/docs15/howto/howto_cleanup_storage.html

2016-06-17 15:00 GMT+08:00 Li Yang <liy...@apache.org>:

> Woo... something new to me. Anybody knows?
>
> On Tue, Jun 14, 2016 at 6:57 PM, Jie Tao <jie@gameforge.com> wrote:
>
> > Kylin actually drops useless intermediate tables after cube building, but
> > I still see one "kylin_intermediate_cubename_searchdata" table for each
> > cube building in Hive. Are these tables still usefull for Kylin? I use
> > Kylin 1.5.2.1.
> >
> > Cheers,
> >
> > Jie
> >
>



-- 
Best regards,

Shaofeng Shi

Re: kylin intermediate tables in Hive

2016-06-17 Thread ShaoFeng Shi

BTW, are you using a view as lookup table?

2016-06-17 15:15 GMT+08:00 ShaoFeng Shi <shaofeng...@apache.org>:

> This is common. If a job fails partway and you discard it, the "Garbage
> collection" step is not executed, so the garbage is left behind.
>
> This is why we still recommend running the offline cleanup periodically; it
> is not perfect, but it works for most scenarios:
> https://kylin.apache.org/docs15/howto/howto_cleanup_storage.html
>
> 2016-06-17 15:00 GMT+08:00 Li Yang <liy...@apache.org>:
>
>> Woo... something new to me. Anybody knows?
>>
>> On Tue, Jun 14, 2016 at 6:57 PM, Jie Tao <jie@gameforge.com> wrote:
>>
>> > Kylin actually drops useless intermediate tables after cube building,
>> but
>> > I still see one "kylin_intermediate_cubename_searchdata" table for each
>> > cube building in Hive. Are these tables still usefull for Kylin? I use
>> > Kylin 1.5.2.1.
>> >
>> > Cheers,
>> >
>> > Jie
>> >
>>
>
>
>
> --
> Best regards,
>
> Shaofeng Shi
>
>


-- 
Best regards,

Shaofeng Shi

Re: kylin intermediate tables in Hive

2016-06-17 Thread ShaoFeng Shi

By default the web UI only shows jobs from the LAST ONE WEEK; please check
whether the jobs are older than that.

2016-06-17 16:58 GMT+08:00 Jie Tao <jie@gameforge.com>:

> actually I discarded all jobs and I do not see any ERROR job in the
> Monitor view of Kylin UI.
>
> Where can I see these error jobs?
>
> Jie
>
>
> Am 17.06.2016 um 10:31 schrieb ShaoFeng Shi:
>
>> Hi Jie,
>>
>> If a job is "ERROR", the intermediate hive table of it will not be
>> dropped,
>> as "ERROR" is not a final state; User can resume an "Error" job at any
>> time, so Kylin skipped to cleanup for that.
>>
>> If you discard these error jobs, and re-run the cleanup, the intermediate
>> hive table will be dropped.
>>
>> The message here is not clear, will change the wording...
>>
>> 2016-06-17 15:48 GMT+08:00 Jie Tao <jie@gameforge.com>:
>>
>> You are correct, the intermediate tables are left by fail-building. I do
>>> clean up storage based on the linked guide. Intermediate data in HDFS and
>>> Hbase are deleted, but the intermediate tables in Hive not. The command
>>> shows the tables but do not drop them. I donot have a lookup table but my
>>> fact table is a view.
>>>
>>> As I run the cleanup command,
>>> kylin_intermediate_logout_full_cube_1970010100_2015100100
>>> kylin_intermediate_logout_full_cube_1970010100_20160529010500
>>> kylin_intermediate_logout_full_cube_1970010100_2016060800
>>> kylin_intermediate_logout_full_cube_1970010100_20160608010500
>>> kylin_intermediate_logout_full_cube_1970010100_20160609010500
>>> kylin_intermediate_logout_full_cube_1970010100_2016061500
>>> kylin_intermediate_logout_full_cube_1970010100_2016062600
>>> kylin_intermediate_logout_full_cube_1970010100_20160626042000
>>> kylin_intermediate_test_cube_1970010100_20151201010500
>>> kylin_intermediate_test_cube_1970010100_20151231234000
>>> kylin_intermediate_test_cube_1970010100_20160302063000
>>> kylin_intermediate_test_cube_1970010100_2016062600
>>> kylin_intermediate_test_cube_1970010100_20160626042000
>>> kylin_intermediate_test_cube_1970010100_20160704082000
>>> Time taken: 0.189 seconds, Fetched: 14 row(s)
>>> 2016-06-17 09:37:12,645 INFO  [main StorageCleanupJob:262]: Remove
>>> intermediate hive table with job id 493fd20b-3074-403e-9963-fe4fb7ff7c65
>>> with job status ERROR
>>> 2016-06-17 09:37:12,648 INFO  [main StorageCleanupJob:262]: Remove
>>> intermediate hive table with job id 8a377e30-e3ba-4fe2-be12-e7d412afec5e
>>> with job status ERROR
>>>
>>> Best regards,
>>>
>>> Jie
>>>
>>>
>>> Am 17.06.2016 um 09:16 schrieb ShaoFeng Shi:
>>>
>>> BTW, are you using a view as lookup table?
>>>>
>>>> 2016-06-17 15:15 GMT+08:00 ShaoFeng Shi <shaofeng...@apache.org>:
>>>>
>>>> This is common; If you have a job failed in between, and you discard
>>>> that
>>>>
>>>>> job, the "Garbage collection" step will not be executed, so the
>>>>> garbages
>>>>> will be left there.
>>>>>
>>>>> This is why we still recommend user to run offline cleanup every some
>>>>> period; It is not perfert, but be good for most scenarios:
>>>>> https://kylin.apache.org/docs15/howto/howto_cleanup_storage.html
>>>>>
>>>>> 2016-06-17 15:00 GMT+08:00 Li Yang <liy...@apache.org>:
>>>>>
>>>>> Woo... something new to me. Anybody knows?
>>>>>
>>>>>> On Tue, Jun 14, 2016 at 6:57 PM, Jie Tao <jie@gameforge.com>
>>>>>> wrote:
>>>>>>
>>>>>> Kylin actually drops useless intermediate tables after cube building,
>>>>>> but
>>>>>>
>>>>>> I still see one "kylin_intermediate_cubename_searchdata" table for
>>>>>>> each
>>>>>>> cube building in Hive. Are these tables still usefull for Kylin? I
>>>>>>> use
>>>>>>> Kylin 1.5.2.1.
>>>>>>>
>>>>>>> Cheers,
>>>>>>>
>>>>>>> Jie
>>>>>>>
>>>>>>>
>>>>>>> --
>>>>> Best regards,
>>>>>
>>>>> Shaofeng Shi
>>>>>
>>>>>
>>>>>
>>>>>
>>
>


-- 
Best regards,

Shaofeng Shi

Re: DISTINCT_COUNT precise calculation question

2016-06-21 Thread ShaoFeng Shi

Yerui Sun is working on precise distinct count for all data types; please
see:
https://issues.apache.org/jira/browse/KYLIN-1379
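
Until that feature lands, the workaround mentioned in the quoted reply below is to map each string to an integer with Dense_Rank() and count on the integers. A minimal sketch of that idea in plain Python, not Kylin or Hive code; the function name and the sample values are made up for illustration:

```python
def dense_rank_mapping(values):
    """Assign each distinct value a consecutive integer ID (1, 2, 3, ...).

    This mimics what a DENSE_RANK()-based mapping table provides:
    equal strings get the same ID, different strings get different IDs,
    so there are no collisions as there could be with a hash.
    """
    return {v: i for i, v in enumerate(sorted(set(values)), start=1)}


# Hypothetical varchar column mixing English and Chinese values.
raw = ["alice", "张三", "bob", "alice", "张三", "carol"]

mapping = dense_rank_mapping(raw)
encoded = [mapping[v] for v in raw]  # integer column to count on

distinct_by_string = len(set(raw))
distinct_by_id = len(set(encoded))
print(distinct_by_string, distinct_by_id)
```

Because the mapping is one-to-one, the distinct count over the integer IDs equals the distinct count over the original strings.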


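On June 21, 2016 at 3:53 PM, Roy <aqinnxuk...@163.com> wrote:

> Hashing may produce duplicate values (collisions). Our approach is to build a
> mapping table with the Dense_Rank() function and then do the counting.
>
> Also, Kylin's count(distinct xx) currently seems to support only int columns.
>
> Hope this helps.
>
> Roy
>
>
>
>
> On 2016-06-21 15:08:46, "仇同心" <qiutong...@jd.com> wrote:
> >Hi all,
> >
> >The Hive column type is varchar, and the values contain both English letters
> >and Chinese characters. Can a precise DISTINCT_COUNT be computed on such a
> >column? If not, do you have any good suggestions?
> >
> >Thanks!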
>



-- 
Best regards,

Shaofeng Shi

Re: kylin question

2016-06-21 Thread ShaoFeng Shi

Cool, good to know you solved the issue; your feedback on Kylin is welcome!

2016-06-21 10:47 GMT+08:00 倪成伟 <549066...@qq.com>:

> You are right. I upgraded HBase to 1.1x and all the problems have been
> solved. Thank you very much!
>
> --
> View this message in context:
> http://apache-kylin.74782.x6.nabble.com/java-lang-NoSuchMethodError-org-apache-hadoop-hbase-coprocessor-RegionCoprocessorEnvironment-getRegi-tp4698p5061.html
> Sent from the Apache Kylin mailing list archive at Nabble.com.
>



-- 
Best regards,

Shaofeng Shi

Re: kylin intermediate tables in Hive

2016-06-21 Thread ShaoFeng Shi

Hi Jie, would you mind filing a JIRA for the problem you found? If you can
attach the JSON of this job and a couple of screenshots, that would be
great for analysis. Thank you!

2016-06-21 16:49 GMT+08:00 Jie Tao <jie@gameforge.com>:

> Actually the jobs have been discarded. Maybe it is a bug that the status
> of the job is still "ERROR". I looked at the jobs in the Kylin Web UI
> and found that the job was not marked in black like other
> discarded jobs, although the "action" button offers only one choice,
> "diagnose". My Kylin is 1.5.2.1.
>
> Cheers,
>
> Jie
>
>
> Am 17.06.2016 um 11:05 schrieb ShaoFeng Shi:
>
>> by default the web UI only shows the jobs in LAST ONE WEEK, pls have a
>> check.
>>
>> 2016-06-17 16:58 GMT+08:00 Jie Tao <jie@gameforge.com>:
>>
>> actually I discarded all jobs and I do not see any ERROR job in the
>>> Monitor view of Kylin UI.
>>>
>>> Where can I see these error jobs?
>>>
>>> Jie
>>>
>>>
>>> Am 17.06.2016 um 10:31 schrieb ShaoFeng Shi:
>>>
>>> Hi Jie,
>>>>
>>>> If a job is "ERROR", the intermediate hive table of it will not be
>>>> dropped,
>>>> as "ERROR" is not a final state; User can resume an "Error" job at any
>>>> time, so Kylin skipped to cleanup for that.
>>>>
>>>> If you discard these error jobs, and re-run the cleanup, the
>>>> intermediate
>>>> hive table will be dropped.
>>>>
>>>> The message here is not clear, will change the wording...
>>>>
>>>> 2016-06-17 15:48 GMT+08:00 Jie Tao <jie@gameforge.com>:
>>>>
>>>> You are correct, the intermediate tables are left by fail-building. I do
>>>>
>>>>> clean up storage based on the linked guide. Intermediate data in HDFS
>>>>> and
>>>>> Hbase are deleted, but the intermediate tables in Hive not. The command
>>>>> shows the tables but do not drop them. I donot have a lookup table but
>>>>> my
>>>>> fact table is a view.
>>>>>
>>>>> As I run the cleanup command,
>>>>> kylin_intermediate_logout_full_cube_1970010100_2015100100
>>>>> kylin_intermediate_logout_full_cube_1970010100_20160529010500
>>>>> kylin_intermediate_logout_full_cube_1970010100_2016060800
>>>>> kylin_intermediate_logout_full_cube_1970010100_20160608010500
>>>>> kylin_intermediate_logout_full_cube_1970010100_20160609010500
>>>>> kylin_intermediate_logout_full_cube_1970010100_2016061500
>>>>> kylin_intermediate_logout_full_cube_1970010100_2016062600
>>>>> kylin_intermediate_logout_full_cube_1970010100_20160626042000
>>>>> kylin_intermediate_test_cube_1970010100_20151201010500
>>>>> kylin_intermediate_test_cube_1970010100_20151231234000
>>>>> kylin_intermediate_test_cube_1970010100_20160302063000
>>>>> kylin_intermediate_test_cube_1970010100_2016062600
>>>>> kylin_intermediate_test_cube_1970010100_20160626042000
>>>>> kylin_intermediate_test_cube_1970010100_20160704082000
>>>>> Time taken: 0.189 seconds, Fetched: 14 row(s)
>>>>> 2016-06-17 09:37:12,645 INFO  [main StorageCleanupJob:262]: Remove
>>>>> intermediate hive table with job id
>>>>> 493fd20b-3074-403e-9963-fe4fb7ff7c65
>>>>> with job status ERROR
>>>>> 2016-06-17 09:37:12,648 INFO  [main StorageCleanupJob:262]: Remove
>>>>> intermediate hive table with job id
>>>>> 8a377e30-e3ba-4fe2-be12-e7d412afec5e
>>>>> with job status ERROR
>>>>>
>>>>> Best regards,
>>>>>
>>>>> Jie
>>>>>
>>>>>
>>>>> Am 17.06.2016 um 09:16 schrieb ShaoFeng Shi:
>>>>>
>>>>> BTW, are you using a view as lookup table?
>>>>>
>>>>>> 2016-06-17 15:15 GMT+08:00 ShaoFeng Shi <shaofeng...@apache.org>:
>>>>>>
>>>>>> This is common; If you have a job failed in between, and you discard
>>>>>> that
>>>>>>
>>>>>> job, the "Garbage collection" step will not be executed, so the
>>>>>>> garbages
>>>>>>> will be left there.
>>>>>>>
>>>>>>> This is why we still recommend user to run offline cleanup every some
>>>>>>> period; It is not perfert, but be good for most scenarios:
>>>>>>> https://kylin.apache.org/docs15/howto/howto_cleanup_storage.html
>>>>>>>
>>>>>>> 2016-06-17 15:00 GMT+08:00 Li Yang <liy...@apache.org>:
>>>>>>>
>>>>>>> Woo... something new to me. Anybody knows?
>>>>>>>
>>>>>>> On Tue, Jun 14, 2016 at 6:57 PM, Jie Tao <jie@gameforge.com>
>>>>>>>> wrote:
>>>>>>>>
>>>>>>>> Kylin actually drops useless intermediate tables after cube
>>>>>>>> building,
>>>>>>>> but
>>>>>>>>
>>>>>>>> I still see one "kylin_intermediate_cubename_searchdata" table for
>>>>>>>>
>>>>>>>>> each
>>>>>>>>> cube building in Hive. Are these tables still usefull for Kylin? I
>>>>>>>>> use
>>>>>>>>> Kylin 1.5.2.1.
>>>>>>>>>
>>>>>>>>> Cheers,
>>>>>>>>>
>>>>>>>>> Jie
>>>>>>>>>
>>>>>>>>>
>>>>>>>>> --
>>>>>>>>>
>>>>>>>> Best regards,
>>>>>>>
>>>>>>> Shaofeng Shi
>>>>>>>
>>>>>>>
>>>>>>>
>>>>>>>
>>>>>>>
>>
>


-- 
Best regards,

Shaofeng Shi

Re: join & left join in kylin

2016-06-20 Thread ShaoFeng Shi

Hi Yanwei,

1. Kylin doesn't directly support snowflake schema. If you want to build a
cube for such a model, first create a flat Hive view over those
tables, and then use the view as the fact table in Kylin.

2. "join" here means "inner join", I think. An "inner join" filters out the
records that don't match a PK in the lookup table; I believe you know
how that differs from "left join".
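
A minimal sketch of that difference in plain Python, not Kylin code. The rows, column names, and helper functions below are hypothetical and only illustrate how an unmatched fact row is treated:

```python
# Hypothetical fact rows; seller_id 99 has no row in the lookup table.
fact = [
    {"order_id": 1, "seller_id": 10, "price": 5.0},
    {"order_id": 2, "seller_id": 11, "price": 7.0},
    {"order_id": 3, "seller_id": 99, "price": 3.0},
]
# Hypothetical lookup table keyed by its primary key (seller_id -> country).
lookup = {10: "US", 11: "CN"}


def inner_join(fact_rows, lookup_table):
    """Keep only fact rows whose key exists in the lookup table."""
    return [
        dict(row, country=lookup_table[row["seller_id"]])
        for row in fact_rows
        if row["seller_id"] in lookup_table
    ]


def left_join(fact_rows, lookup_table):
    """Keep every fact row; unmatched rows get country=None."""
    return [
        dict(row, country=lookup_table.get(row["seller_id"]))
        for row in fact_rows
    ]


inner_rows = inner_join(fact, lookup)
left_rows = left_join(fact, lookup)
inner_total = sum(r["price"] for r in inner_rows)
left_total = sum(r["price"] for r in left_rows)
print(len(inner_rows), inner_total)
print(len(left_rows), left_total)
```

With an inner join the unmatched order disappears from every aggregate; with a left join it stays and its lookup columns are empty. So the choice depends on whether fact rows without a lookup match should be counted.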


2016-06-20 17:53 GMT+08:00 李寅威 <251469...@qq.com>:

> Hi all:
>
>
>   I have two questions as follows:
>
>
>   1.Can kylin support snowflake schema in data warehouse?
>   2.If Kylin can only support star schema, under what circumstances shall
> we use join instead of left join?
>
>
>   look forward to your help, thx~
>
>
>
>
>
>
> --
> 李寅威 | CVTE
> cor...@foxmail.com
> 广州视源电子科技股份有限公司
> Guangzhou Shiyuan Electronics Co., Ltd.




-- 
Best regards,

Shaofeng Shi

Re: Timeout visiting cube

2016-06-21 Thread ShaoFeng Shi

e.calcite.jdbc.CalciteMetaImpl.createIterable(CalciteMetaImpl.java:536)
> at
> org.apache.calcite.avatica.AvaticaResultSet.execute(AvaticaResultSet.java:187)
> at
> org.apache.calcite.jdbc.CalciteResultSet.execute(CalciteResultSet.java:65)
> at
> org.apache.calcite.jdbc.CalciteResultSet.execute(CalciteResultSet.java:44)
> at
> org.apache.calcite.avatica.AvaticaConnection$1.execute(AvaticaConnection.java:566)
> at
> org.apache.calcite.jdbc.CalciteMetaImpl.prepareAndExecute(CalciteMetaImpl.java:578)
> at
> org.apache.calcite.avatica.AvaticaConnection.prepareAndExecuteInternal(AvaticaConnection.java:571)
> at
> org.apache.calcite.avatica.AvaticaStatement.executeInternal(AvaticaStatement.java:135)
> ... 80 more
> 2016-06-16 09:06:46,289 INFO  [http-bio-7070-exec-10]
> service.QueryService:250 :
> ==[QUERY]===
> SQL: select count(distinct userid) from cortanauu
> User: ADMIN
> Success: false
> Duration: 0.0
> Project: CortanaTest
> Realization Names: [CortanaUUTest_clone] Cuboid Ids: [1048062] Total scan
> count: 0 Result row count: 0 Accept Partial: true Is Partial Result: false
> Hit Exception Cache: false Storage cache used: false
> Message: Error while executing SQL "select count(distinct userid) from
> cortanauu LIMIT 5": Timeout visiting cube!
> ==[QUERY]===
>
>
>
> -Original Message-
> From: hongbin ma [mailto:mahong...@apache.org]
> Sent: Wednesday, June 15, 2016 10:09 PM
> To: dev <dev@kylin.apache.org>
> Subject: Re: Timeout visiting cube
>
> actually, can you please attach latest 2 minutes' log before
> 2016-06-15 02:18:22,741?
> it's still incomplete for our analysis
>
>
>
>
> --
> Regards,
>
> *Bin Mahone | 马洪宾*
> Apache Kylin: http://kylin.io
> Github: https://github.com/binmahone
>



-- 
Best regards,

Shaofeng Shi

Re: Number of cube dimensions is limited to 62?

2016-06-17 Thread ShaoFeng Shi

Almost true. You can think of Kylin as 64-bit; in theory it supports up to 63
dimensions in one cube.

I believe there is no plan to extend this to 128 or more in the near term. In
most cases the number of dimensions doesn't exceed 20, so 64 is already
more than enough and already costs extra space.

With so many dimensions, there must be room for optimization. You can try
approaches such as:
1) move some columns into lookup tables, and define them as "derived"
dimensions in the cube;
2) or create multiple cubes, each serving a subset of these columns.

If you find another way, please share it with the community. Thanks.
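
A rough sketch of why a 64-bit design caps out at 63 dimensions. It assumes one bit per dimension in a signed 64-bit ID, which is my reading of the "64 bit" remark above and not a statement about Kylin's actual code; all names are hypothetical:

```python
MAX_SIGNED_64 = 2**63 - 1  # largest value of a signed 64-bit integer


def cuboid_id(selected, total_dims):
    """Build a bitmask with one bit per selected dimension.

    Dimension index 0 maps to the highest of the total_dims bits.
    """
    mask = 0
    for idx in selected:
        if not 0 <= idx < total_dims:
            raise ValueError(f"dimension index {idx} out of range")
        mask |= 1 << (total_dims - 1 - idx)
    return mask


def fits_in_signed_long(total_dims):
    """True if the mask with ALL dimensions set still fits in 63 value bits."""
    base_cuboid = (1 << total_dims) - 1
    return base_cuboid <= MAX_SIGNED_64


print(bin(cuboid_id([0, 2], total_dims=3)))
print(fits_in_signed_long(63), fits_in_signed_long(64))
```

Under that assumption the sign bit is unavailable, which leaves 63 usable bits and therefore at most 63 dimensions.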


2016-06-18 0:01 GMT+08:00 Victoria Tskhay <victoria.tsk...@glispamedia.com>:

> Hello,
>
> It looks like the max number of dimensions in one cube is 62, is that
> correct?
>
> We would like to add more than that. That may sound crazy, I know, but we
> have a special case where all the dimensions have low cardinality (3) and
> the data is very sparse. We already tried with 62 dimensions and it works
> great.
>
> Is there any way to work around that limit? What would you suggest? Thank
> you!
>
>
>
> Best regards
> --
> Victoria Tskhay
>
> *Java Backend Developer*I glispa GmbH
>
> Sonnenburger Straße 73, 10437 Berlin, Germany
> E victoria.tsk...@glispamedia.com
> Skype: vikatskhay I www.glispa.com <http://www.glispa.com>
>
> Sitz Berlin, AG Charlottenburg HRB 114678B
>



-- 
Best regards,

Shaofeng Shi

Re: How to debug sample.sh

2016-06-20 Thread ShaoFeng Shi

You can import Kylin's source code into your IDE (Eclipse or IntelliJ), and
then run org.apache.kylin.common.persistence.ResourceTool in debug mode.

2016-06-18 0:14 GMT+08:00 TTS2沉默天使 <85546...@qq.com>:

> In sample.sh there is this section:
> hbase org.apache.hadoop.util.RunJar ${job_jar}
> org.apache.kylin.common.persistence.ResourceTool upload
> ${KYLIN_HOME}/sample_cube/metadata
> How can I debug this part to step through its execution?




-- 
Best regards,

Shaofeng Shi

Re: Kylin and Tableau -- Top N query

2016-01-16 Thread ShaoFeng Shi

The Hive table contains 6 months of data but the cube was built with only 1
month. If you want to compare query results between Hive and Kylin, add a
WHERE condition to the SQL so that both compute over the same date range. I
didn't see one in your query; could you please double-check?
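
To illustrate the point about comparing over the same scope, here is a small self-contained sketch using Python's built-in sqlite3 module as a stand-in for both systems. The table names, columns, dates, and amounts are invented for the example and are not from this thread:

```python
import sqlite3

conn = sqlite3.connect(":memory:")

# Stand-in for the Hive fact table: several months of data.
conn.execute("CREATE TABLE hive_fact (dt TEXT, seller TEXT, amount REAL)")
conn.executemany(
    "INSERT INTO hive_fact VALUES (?, ?, ?)",
    [
        ("2015-08-15", "a", 100.0),
        ("2015-11-20", "b", 50.0),
        ("2016-01-05", "a", 10.0),
        ("2016-01-09", "b", 30.0),
    ],
)

# Stand-in for the cube: built from one month only.
CUBE_START, CUBE_END = "2016-01-01", "2016-02-01"
conn.execute("CREATE TABLE cube_fact (dt TEXT, seller TEXT, amount REAL)")
conn.execute(
    "INSERT INTO cube_fact SELECT * FROM hive_fact WHERE dt >= ? AND dt < ?",
    (CUBE_START, CUBE_END),
)

TOP_N = (
    "SELECT seller, SUM(amount) AS total FROM {table} {where} "
    "GROUP BY seller ORDER BY total DESC LIMIT 1"
)


def top_seller(table, where="", params=()):
    """Return the top seller by summed amount, optionally filtered."""
    return conn.execute(TOP_N.format(table=table, where=where), params).fetchone()


hive_unfiltered = top_seller("hive_fact")
cube_result = top_seller("cube_fact")
hive_filtered = top_seller(
    "hive_fact", "WHERE dt >= ? AND dt < ?", (CUBE_START, CUBE_END)
)
print(hive_unfiltered, cube_result, hive_filtered)
```

Without the date filter the two sources rank a different seller first; once the same date range is applied on the Hive side, the results agree.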

2016-01-15 23:10 GMT+08:00 sdangi <sda...@datalenz.com>:

> How much difference between Hive and Kylin?
> We have 6 months of data in Hive but the Cube in Kylin contained only 1
> month as we were testing incremental cube refresh. So Kylin did not have at
> least 5 months of data.
> a) any filtering condition in Cube descriptor?  - No filtering condition in
> the cube.  If the cube had the filtering condition and I was firing a query
> thru Tableau, why would it be any different, as it is not hitting Hive?
> b) is the Cube built with the full date range of hive table?
> No. It had only 1 month of data from hive tables.
>  c) Was the fact/lookup table data changed since cube be built?
> No.
>
>
> ShaoFeng Shi-2 wrote
> > How much difference between Hive and Kylin? Did you check some factors
> > like: a) any filtering condition in Cube descriptor? b) is the Cube built
> > with the full date range of hive table? c) Was the fact/lookup table data
> > changed since cube be built? Just some hints to exclude those mistakes.
> > Besides, you can run the SQL from Kylin UI to eliminate the possibility
> of
> > ODBC driver.
> >
> > 2016-01-15 7:50 GMT+08:00 sdangi 
>
> > sdangi@
>
> > :
> >
> >> Results from Kylin and Tableau on a live connection don't match.  Any
> >> reason?
> >> I'm creating a custom data source (Custom SQL Query) in Tableau and
> >> adding
> >> a
> >> parameter control using a query similar to below:
> >>
> >> SELECT
> >> t2.c1
> >> ,sum(t1.c2) AS c3
> >> FROM t1
> >> Inner join t2
> >> on t1.k1 = t2.k1
> >> group by t2.c1
> >> order by c3
> >> LIMIT
> > 
> >>
> >> t1 (fact) has 130MM rows and t2 (dimension) has 1.7MM
> >>
> >> The query shows different Top N records in Tableau as compared to Kylin
> >> and
> >> Hive.
> >>
> >> Thanks,
> >> Regards,
> >>
> >> --
> >> View this message in context:
> >>
> http://apache-kylin.74782.x6.nabble.com/Kylin-and-Tableau-Top-N-query-tp3250.html
> >> Sent from the Apache Kylin mailing list archive at Nabble.com.
> >>
> >
> >
> >
> > --
> > Best regards,
> >
> > Shaofeng Shi
>
>
>
> --
> View this message in context:
> http://apache-kylin.74782.x6.nabble.com/Kylin-and-Tableau-Top-N-query-tp3250p3269.html
> Sent from the Apache Kylin mailing list archive at Nabble.com.
>



-- 
Best regards,

Shaofeng Shi

Re: TopN Results Differ in Hive and Kylin

2016-01-16 Thread ShaoFeng Shi

How are you connecting to Kylin: ODBC, JDBC, REST API, or the web
UI?

Re: beg suggestions to speed up the Kylin cube build

2016-01-14 Thread ShaoFeng Shi

Cube build performance is largely determined by your Hadoop cluster's
capacity. You can inspect the MR jobs' statistics to
identify potential bottlenecks.



2016-01-15 7:19 GMT+08:00 zhong zhang <zzaco...@gmail.com>:

> Hi All,
>
> We are trying to build a nine-dimension cube:
> eight mandatory dimensions and one hierarchy
> dimension. The fact table is like 20G. Two lookup
> tables are 1.3M and 357k separately. It takes like
> 3 hours to go to 30% progress which is kind of slow.
>
> We'd like to know are there suggestions to speed up
> the Kylin cube build. We got a suggestion from
> a slide said that sort the dimension based on the
> cardinality. Are there any other ways we can try?
>
> We also noticed that only half of the memory and
> half of the CPU are used during the cube build.
> Are there any ways to fully utilize the resource?
>
> Looking forward to hear from you.
>
> Best regards,
> Zhong
>



-- 
Best regards,

Shaofeng Shi

Re: Kylin and Tableau -- Top N query

2016-01-14 Thread ShaoFeng Shi

How large is the difference between Hive and Kylin? Did you check factors
like: a) is there any filter condition in the cube descriptor? b) was the cube
built with the full date range of the Hive table? c) has the fact/lookup table
data changed since the cube was built? These are just hints to rule out common
mistakes. Besides, you can run the SQL from the Kylin UI to rule out the ODBC
driver.

2016-01-15 7:50 GMT+08:00 sdangi <sda...@datalenz.com>:

> Results from Kylin and Tableau on a live connection don't match.  Any
> reason?
> I'm creating a custom data source (Custom SQL Query) in Tableau and adding
> a
> parameter control using a query similar to below:
>
> SELECT
> t2.c1
> ,sum(t1.c2) AS c3
> FROM t1
> Inner join t2
> on t1.k1 = t2.k1
> group by t2.c1
> order by c3
> LIMIT 
>
> t1 (fact) has 130MM rows and t2 (dimension) has 1.7MM
>
> The query shows different Top N records in Tableau as compared to Kylin and
> Hive.
>
> Thanks,
> Regards,
>
> --
> View this message in context:
> http://apache-kylin.74782.x6.nabble.com/Kylin-and-Tableau-Top-N-query-tp3250.html
> Sent from the Apache Kylin mailing list archive at Nabble.com.
>



-- 
Best regards,

Shaofeng Shi

Re: beg suggestions to speed up the Kylin cube build

2016-01-14 Thread ShaoFeng Shi

In Meng's case, writing 5 GB takes 40 minutes, which is really slow. The
bottleneck should be the HDFS write: the cuboids have already been calculated,
and that step only converts them to HFile format, with no other computation.
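
For a sense of scale, a back-of-the-envelope calculation with the numbers quoted in this thread (5 GB in 40 minutes; 10 GB per region; reducer count equal to region count, as described in the quoted messages below). The function names are mine, and this is plain arithmetic rather than a model of the actual job:

```python
import math


def write_throughput_mb_per_s(size_gb, minutes):
    """Average write rate in MB/s for size_gb written in the given minutes."""
    return size_gb * 1024 / (minutes * 60)


def region_count(cube_size_gb, region_size_gb):
    """Number of regions (and, per the thread, reducers) for a cube size."""
    return max(1, math.ceil(cube_size_gb / region_size_gb))


throughput = write_throughput_mb_per_s(5, 40)
single_region = region_count(5, 10)
large_cube = region_count(100, 10)

print(f"{throughput:.2f} MB/s")
print(single_region, large_cube)
```

Roughly 2 MB/s from a single reducer is far below what an HDFS write normally sustains, which is why the write path and the reducer count are the first things to look at.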

2016-01-15 15:36 GMT+08:00 hongbin ma <mahong...@apache.org>:

> if it works I'd love to see the change
>
> On Fri, Jan 15, 2016 at 3:35 PM, hongbin ma <mahong...@apache.org> wrote:
>
> > I'm not sure if it will work, does hbase bulk load allow that?
> >
> > On Fri, Jan 15, 2016 at 2:28 PM, Yerui Sun <sunye...@gmail.com> wrote:
> >
> >> hongbin，
> >>
> >> I understand how the number of reducers is determined, and it could be
> >> improved.
> >>
> >> Supposed that we got 100GB data after cuboid building, and with setting
> >> that 10GB per region. For now, 10 split keys was calculated, and 10
> region
> >> created, 10 reducer used in ‘convert to hfile’ step.
> >>
> >> With optimization, we could calculate 100 (or more) split keys, and use
> >> all them in ‘covert to file’ step, but sampled 10 keys in them to create
> >> regions. The result is still 10 region created, but 100 reducer used in
> >> ‘convert to file’ step. Of course, the hfile created is also 100, and
> load
> >> 10 files per region. That’s should be fine, doesn’t affect the query
> >> performance dramatically.
> >>
> >> > 在 2016年1月15日，13:53，hongbin ma <mahong...@apache.org> 写道：
> >> >
> >> > hi, yerui,
> >> >
> >> > the reason why the number of "convert to hfile" reducers is small is
> >> > because each region's output will become a htable region. Too many
> >> regions
> >> > will be a burden to hbase cluster. In our production env we have cubes
> >> that
> >> > are 10T+, guess how many regions will it populate?
> >> >
> >> > What's more Kylin provides different profiles to control the expected
> >> > region size (thus controlling the number of regions & parallelism of
> >> > "create htable" reducer), you can modify it depending on your cube
> >> size. In
> >> > 2.x it's basically 10G for small cubes, 20G for medium cubes and 100G.
> >> > However this is a manual work when creating cube, and I admit the
> value
> >> > settings for the three profiles is still discussable.
> >> >
> >> >
> >> >
> >> >
> >> > On Fri, Jan 15, 2016 at 11:29 AM, Yerui Sun <sunye...@gmail.com>
> wrote:
> >> >
> >> >> Agreed with 梁猛.
> >> >>
> >> >> Actually we found the same issue, the number of reducers is too small
> >> in
> >> >> step ‘convert to hfile’, which is same as the region count.
> >> >>
> >> >> I think we could increase the number of reducers, to improve
> >> performance.
> >> >> If anyone has interesting in this, we could discuss more about the
> >> solution.
> >> >>
> >> >>> 在 2016年1月15日，09:46，13802880...@139.com 写道：
> >> >>>
> >> >>> actually，I found the last step " convert to hfile"  take too much
> >> time,
> >> >> more than 40 minutes for single region(use small, and result file
> >> about 5GB）
> >> >>>
> >> >>>
> >> >>>
> >> >>> 中国移动广东有限公司 网管中心 梁猛
> >> >>> 13802880...@139.com
> >> >>>
> >> >>> From: ShaoFeng Shi
> >> >>> Date: 2016-01-15 09:40
> >> >>> To: dev
> >> >>> Subject: Re: beg suggestions to speed up the Kylin cube build
> >> >>> The cube build performance is much determined by your Hadoop
> cluster's
> >> >>> capacity. You can do some inspection with the MR job's statistics to
> >> >>> analysis the potential bottlenecks.
> >> >>>
> >> >>>
> >> >>>
> >> >>> 2016-01-15 7:19 GMT+08:00 zhong zhang <zzaco...@gmail.com>:
> >> >>>
> >> >>>> Hi All,
> >> >>>>
> >> >>>> We are trying to build a nine-dimension cube:
> >> >>>> eight mandatory dimensions and one hierarchy
> >> >>>> dimension. The fact table is like 20G. Two lookup
> >> >>>> tables are 1.3M and 357k separately. It takes like
> >> >>>> 3 hours to go to 30% progress which is kind of slow.
> >> >>>>
> >> >>>> We'd like to know are there suggestions to speed up
> >> >>>> the Kylin cube build. We got a suggestion from
> >> >>>> a slide said that sort the dimension based on the
> >> >>>> cardinality. Are there any other ways we can try?
> >> >>>>
> >> >>>> We also noticed that only half of the memory and
> >> >>>> half of the CPU are used during the cube build.
> >> >>>> Are there any ways to fully utilize the resource?
> >> >>>>
> >> >>>> Looking forward to hear from you.
> >> >>>>
> >> >>>> Best regards,
> >> >>>> Zhong
> >> >>>>
> >> >>>
> >> >>>
> >> >>>
> >> >>> --
> >> >>> Best regards,
> >> >>>
> >> >>> Shaofeng Shi
> >> >>
> >> >>
> >> >
> >> >
> >> > --
> >> > Regards,
> >> >
> >> > *Bin Mahone | 马洪宾*
> >> > Apache Kylin: http://kylin.io
> >> > Github: https://github.com/binmahone
> >>
> >>
> >
> >
> > --
> > Regards,
> >
> > *Bin Mahone | 马洪宾*
> > Apache Kylin: http://kylin.io
> > Github: https://github.com/binmahone
> >
>
>
>
> --
> Regards,
>
> *Bin Mahone | 马洪宾*
> Apache Kylin: http://kylin.io
> Github: https://github.com/binmahone
>



-- 
Best regards,

Shaofeng Shi

Re: kylin.hdfs.working.dir becoming too large

2016-06-28 Thread ShaoFeng Shi

Did you try running the cleanup as described in this guide?
https://kylin.apache.org/docs15/howto/howto_cleanup_storage.html

The files in the "cuboid" folder are the calculated cube data. They are kept
only for later segment merges; if you don't plan to merge, you can
delete them manually, and queries will not be affected.

2016-06-28 15:51 GMT+08:00 alaleiwang <alaleiw...@sohu-inc.com>:

> hi：
> we found the hdfs storage defined by kylin.hdfs.working.dir becoming
> too
> large,and seems to be larger than kylin related hbase storage,what are
> these
> storage for? is it safe to remove some data in this dir?
> some sub directory  for example:
>  <http://apache-kylin.74782.x6.nabble.com/file/n5146/kylin.png>
>
> --
> View this message in context:
> http://apache-kylin.74782.x6.nabble.com/kylin-hdfs-working-dir-becoming-too-large-tp5146.html
> Sent from the Apache Kylin mailing list archive at Nabble.com.
>



-- 
Best regards,

Shaofeng Shi

Re: kylin intermediate tables in Hive

2016-06-17 Thread ShaoFeng Shi

Hi Jie,

If a job is in "ERROR" state, its intermediate Hive table will not be dropped,
because "ERROR" is not a final state: the user can resume an "ERROR" job at any
time, so Kylin skips the cleanup for it.

If you discard these error jobs and re-run the cleanup, the intermediate
Hive tables will be dropped.

The log message here is not clear; we will change the wording...
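
The rule described above can be sketched as follows. This is an illustration only, not the StorageCleanupJob code: the state names in FINAL_STATES, the table names, and the function are all assumptions made for the example:

```python
# Assumed set of final job states; "ERROR" is deliberately not included
# because an errored job can still be resumed.
FINAL_STATES = {"FINISHED", "DISCARDED"}


def droppable_tables(job_status_by_table):
    """Return intermediate tables whose owning job is in a final state."""
    return sorted(
        table
        for table, status in job_status_by_table.items()
        if status in FINAL_STATES
    )


# Hypothetical intermediate tables and the status of the job that made them.
tables = {
    "kylin_intermediate_cube_a": "ERROR",
    "kylin_intermediate_cube_b": "DISCARDED",
    "kylin_intermediate_cube_c": "RUNNING",
}

before_discard = droppable_tables(tables)

# Discarding the errored job moves it to a final state,
# so the next cleanup run may drop its table.
tables["kylin_intermediate_cube_a"] = "DISCARDED"
after_discard = droppable_tables(tables)

print(before_discard)
print(after_discard)
```

This matches the behaviour Jie saw: the cleanup listed the tables of the "ERROR" jobs but left them in place until those jobs were discarded.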

2016-06-17 15:48 GMT+08:00 Jie Tao <jie@gameforge.com>:

> You are correct, the intermediate tables are left by fail-building. I do
> clean up storage based on the linked guide. Intermediate data in HDFS and
> Hbase are deleted, but the intermediate tables in Hive not. The command
> shows the tables but do not drop them. I donot have a lookup table but my
> fact table is a view.
>
> As I run the cleanup command,
> kylin_intermediate_logout_full_cube_1970010100_2015100100
> kylin_intermediate_logout_full_cube_1970010100_20160529010500
> kylin_intermediate_logout_full_cube_1970010100_2016060800
> kylin_intermediate_logout_full_cube_1970010100_20160608010500
> kylin_intermediate_logout_full_cube_1970010100_20160609010500
> kylin_intermediate_logout_full_cube_1970010100_2016061500
> kylin_intermediate_logout_full_cube_1970010100_2016062600
> kylin_intermediate_logout_full_cube_1970010100_20160626042000
> kylin_intermediate_test_cube_1970010100_20151201010500
> kylin_intermediate_test_cube_1970010100_20151231234000
> kylin_intermediate_test_cube_1970010100_20160302063000
> kylin_intermediate_test_cube_1970010100_2016062600
> kylin_intermediate_test_cube_1970010100_20160626042000
> kylin_intermediate_test_cube_1970010100_20160704082000
> Time taken: 0.189 seconds, Fetched: 14 row(s)
> 2016-06-17 09:37:12,645 INFO  [main StorageCleanupJob:262]: Remove
> intermediate hive table with job id 493fd20b-3074-403e-9963-fe4fb7ff7c65
> with job status ERROR
> 2016-06-17 09:37:12,648 INFO  [main StorageCleanupJob:262]: Remove
> intermediate hive table with job id 8a377e30-e3ba-4fe2-be12-e7d412afec5e
> with job status ERROR
>
> Best regards,
>
> Jie
>
>
> Am 17.06.2016 um 09:16 schrieb ShaoFeng Shi:
>
>> BTW, are you using a view as lookup table?
>>
>> 2016-06-17 15:15 GMT+08:00 ShaoFeng Shi <shaofeng...@apache.org>:
>>
>> This is common; If you have a job failed in between, and you discard that
>>> job, the "Garbage collection" step will not be executed, so the garbages
>>> will be left there.
>>>
>>> This is why we still recommend user to run offline cleanup every some
>>> period; It is not perfert, but be good for most scenarios:
>>> https://kylin.apache.org/docs15/howto/howto_cleanup_storage.html
>>>
>>> 2016-06-17 15:00 GMT+08:00 Li Yang <liy...@apache.org>:
>>>
>>> Woo... something new to me. Anybody knows?
>>>>
>>>> On Tue, Jun 14, 2016 at 6:57 PM, Jie Tao <jie@gameforge.com> wrote:
>>>>
>>>> Kylin actually drops useless intermediate tables after cube building,
>>>>>
>>>> but
>>>>
>>>>> I still see one "kylin_intermediate_cubename_searchdata" table for each
>>>>> cube building in Hive. Are these tables still usefull for Kylin? I use
>>>>> Kylin 1.5.2.1.
>>>>>
>>>>> Cheers,
>>>>>
>>>>> Jie
>>>>>
>>>>>
>>>
>>> --
>>> Best regards,
>>>
>>> Shaofeng Shi
>>>
>>>
>>>
>>
>


-- 
Best regards,

Shaofeng Shi

Re: re-use Hive and Hbase from another kylin cluster issue

2016-02-07 Thread ShaoFeng Shi

What are the versions of your new and old Kylin?

Sent from Outlook Mobile

On Sat, Feb 6, 2016 at 2:49 PM -0800, "greg gu"  wrote:

I originally had a Kylin cluster; I deleted it without deleting the HBase and
Hive storage.
Now I have created a new Kylin cluster and would like to re-use the existing
HBase and Hive tables.
In the new cluster, I can find the projects of the old cluster, but when I
create new projects they don't show up, and I cannot use the old projects
either.

Is this scenario supported?

thanks,

Re: [VOTE] Release apache-kylin-2.0-alpha (release candidate 1)

2016-02-09 Thread ShaoFeng Shi

+1 binding

I checked the md5 hash, and verified that the source package file is signed
by Li Yang.

"mvn test" failed on the first run because of a hard-coded path
("file:///tmp/kylin/cuboidstatistics/") in
FactDistinctColumnsReducerTest.java.
After I granted permission on that path, "mvn test" succeeded. This is a minor
issue which should be fixed in the next release.
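
For readers who want to repeat the hash check on a release candidate, here is a minimal sketch in Python. It is not part of the Kylin release tooling; the function names are illustrative, and the published checksum is assumed to be a plain hex string. It covers only the md5/sha1 comparison, not the signature check.

```python
import hashlib


def file_digest(path, algorithm="md5", chunk_size=1 << 20):
    """Return the hex digest of a file, reading it in fixed-size chunks."""
    digest = hashlib.new(algorithm)
    with open(path, "rb") as handle:
        # iter() with a sentinel stops when read() returns empty bytes.
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def matches_published(path, published_hex, algorithm="md5"):
    """Compare a local file against a published checksum (hex string)."""
    return file_digest(path, algorithm) == published_hex.strip().lower()
```

A call such as matches_published("apache-kylin-2.0-alpha-src.tar.gz", "<md5 from the vote mail>") would return True only when the downloaded file matches the announced hash.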



2016-02-09 22:43 GMT+08:00 hongbin ma <mahong...@apache.org>:

> +1 binding
>
> mvn test passed
>
> On Tue, Feb 9, 2016 at 9:52 PM, Dong Li <lid...@apache.org> wrote:
>
> > +1(no binding)
> >
> > build success
> > mvn test passed
> >
> > Thanks,
> > Dong Li
> >
> > 2016-02-09 21:46 GMT+08:00 Adunuthula, Seshu <sadunuth...@ebay.com>:
> >
> > > -1 Kylin 2.0 is not ready for release...
> > >
> > >
> > >
> > > On 2/9/16, 5:13 AM, "Li Yang" <liy...@apache.org> wrote:
> > >
> > > >Hi all,
> > > >
> > > >I have created a build for Apache Kylin 2.0-alpha, release candidate 1.
> > > >It is alpha due to the large number of new features and improvements
> > > >accumulated, and I want to be cautious. Yet it is still well tested. Cubes
> > > >(hundreds of TB) have been rebuilt and compared with the previous version
> > > >to ensure correctness and performance improvement.
> > > >
> > > >Changes highlights:
> > > >
> > > >[KYLIN-875] - A plugin-able architecture, to allow alternative cube engine
> > > >/ storage engine / data source.
> > > >[KYLIN-1245] - A better MR cubing algorithm, about 1.5 times faster than
> > > >1.x by comparing hundreds of jobs.
> > > >[KYLIN-942] - A better storage engine, makes query roughly 2 times faster
> > > >(especially for slow queries) than 1.x by comparing tens of thousands
> > > >sqls.
> > > >[KYLIN-738] - Streaming cubing EXPERIMENTAL support, source from kafka,
> > > >build cube in-mem at minutes interval
> > > >[KYLIN-943] - TopN pre-calculation (more UDFs coming)
> > > >[KYLIN-1065] - ODBC compatible with Tableau 9.1, MS Excel, MS PowerBI
> > > >[KYLIN-1219] - Kylin support SSO with Spring SAML
> > > >
> > > >
> > > >Thanks to everyone who has contributed to this release. Here's release
> > > >notes:
> > > >https://kylin.apache.org/docs/release_notes.html
> > > >
> > > >The commit to be voted upon:
> > > >
> > > >https://github.com/apache/kylin/commit/dc10a38360c4c014dd9853c6bae0d5a2c115c4c5
> > > >
> > > >The artifacts to be voted on are located here:
> > > >https://dist.apache.org/repos/dist/dev/kylin/apache-kylin-2.0-alpha-rc1/
> > > >
> > > >The hashes of the artifacts are as follows:
> > > >apache-kylin-2.0-alpha-src.tar.gz.md5 e2d78d1a99e49c7fbdd2aa26d2cf09e5
> > > >apache-kylin-2.0-alpha-src.tar.gz.sha1 a29f9623486adb7467bee94056a0403888fdccf5
> > > >
> > > >A staged Maven repository is available for review at:
> > > >https://repository.apache.org/content/repositories/orgapachekylin-1018/
> > > >
> > > >Release artifacts are signed with the following key:
> > > >https://people.apache.org/keys/committer/liyang.asc
> > > >
> > > >Please vote on releasing this package as Apache Kylin 2.0-alpha.
> > > >
> > > >The vote is open for the next 72 hours and passes if a majority of at
> > > >least three +1 PPMC votes are cast.
> > > >
> > > >[ ] +1 Release this package as Apache Kylin 1.2
> > > >[ ]  0 I don't feel strongly about it, but I'm okay with the release
> > > >[ ] -1 Do not release this package because...
> > > >
> > > >
> > > >Here is my vote:
> > > >
> > > >+1 (binding)
> > > >
> > > >--
> > > >Best regards,
> > > >
> > > >Yang
> > >
> > >
> >
>
>
>
> --
> Regards,
>
> *Bin Mahone | 马洪宾*
> Apache Kylin: http://kylin.io
> Github: https://github.com/binmahone
>



-- 
Best regards,

Shaofeng Shi

Re: Kylin.sandbox=true in config

2016-01-29 Thread ShaoFeng Shi

Hi Greg, please ignore "deploy.env" or leave it as "DEV", we will remove it
from there as it is confusing:
https://issues.apache.org/jira/browse/KYLIN-1383

2016-01-29 11:15 GMT+08:00 greg gu <gug...@hotmail.com>:

> thanks for your help,
>
> another configuration is
> deploy.env=Dev|QA|prod
>
> what are the differences among the three values?
>
> Greg
>
> Sent from my iPhone
>
> > On Jan 28, 2016, at 7:06 PM, ShaoFeng Shi <shaofeng...@apache.org>
> wrote:
> >
> > Setting Kylin.sandbox=false makes Kylin use LDAP for user authentication;
> > it has no relationship with the cluster. Please change it back to true to
> > use the testing profile.
> >
> > 2016-01-29 10:52 GMT+08:00 Jian Zhong <hellowode...@gmail.com>:
> >
> >> what's the error in $KYLIN_HOME/tomcat/logs/kylin.log when you start the
> >> server?
> >>
> >>> On Fri, Jan 29, 2016 at 10:20 AM, greg gu <gug...@hotmail.com> wrote:
> >>>
> >>> Hi,
> >>>
> >>> I have a 4-node Hadoop cluster. I tried setting Kylin.sandbox=false in
> >>> kylin.properties and then started Kylin, but the Kylin web site stopped
> >>> working after the change.
> >>>
> >>> I use ssh tunneling to visit the web site. The URL I used is
> >>> http://localhost:7070/kylin
> >>>
> >>> Before the change, the above web site worked and showed the Kylin UI. After
> >>> I changed sandbox to false, the browser says "the web page cannot be
> >>> found".
> >>>
> >>> Could you let me know if I need to change Kylin.sandbox to false?
> >>>
> >>> thanks,
> >>> Greg
> >
> >
> >
> > --
> > Best regards,
> >
> > Shaofeng Shi
>



-- 
Best regards,

Shaofeng Shi

Re: elastic search as kylin storage engine

2016-01-27 Thread ShaoFeng Shi

Hi Ze, from what you mentioned, it seems there is no dictionary encoding
and no pre-calculation (cube build); is that right? How about the query
performance and storage space on ES in your experiments? Thanks for the info.

2016-01-28 10:03 GMT+08:00 zeLiu <liu...@wanda.cn>:

> Hi Shaofeng,
> I am trying to review the code of 2.x. Compared with 1.x, the code changes
> in 2.x are very large.
> This version only implements the functionality; there are a lot of places
> that need to be optimized in performance and architecture.
>
>
> Because I do not know how to modify the webapp code, a cube is treated as
> an ES task if its description contains "ES".
> The build job has four steps: "Create Intermediate Flat Hive Table" >
> "Bulk Index and Load Data to ES" > "Update Cube Info" > "Garbage Collection".
> "Bulk Index and Load Data to ES" is a new MapReduce job which is used for
> mapping and bulk indexing.
> The central class is "org.apache.kylin.job.hadoop.cube.BulkESMapper".
>
> The query is based on SQLDigest and returns an ITupleIterator.
> When the description of the cube contains "ES", the "StorageEngineFactory"
> will return
> "org.apache.kylin.storage.elasticsearch.ElasticSearchStorageEngine".
> If there is no aggregation, it executes
> "org.apache.kylin.storage.elasticsearch.SerializedElasticSearchTupleIterator";
> otherwise, it executes
> "org.apache.kylin.storage.elasticsearch.SerializedElasticAggregationTupleIterator".
>
> The count() function is blocked before the SQLDigest, so count() is not
> supported yet.
>
> thanks!
>
> --
> View this message in context:
> http://apache-kylin.74782.x6.nabble.com/elastic-search-as-kylin-storage-engine-tp3429p3470.html
> Sent from the Apache Kylin mailing list archive at Nabble.com.
>



-- 
Best regards,

Shaofeng Shi

Re: Exact distinct count support

2016-01-28 Thread ShaoFeng Shi

Does this match your case? https://issues.apache.org/jira/browse/KYLIN-1186

2016-01-28 17:42 GMT+08:00 Abhilash L L <abhil...@infoworks.io>:

> +user ml
>
> Regards,
> Abhilash
>
> On Thu, Jan 28, 2016 at 11:32 AM, Abhilash L L <abhil...@infoworks.io>
> wrote:
>
> > Hello,
> >
> >    Is there a way to ask Kylin to get an exact distinct count? From what we
> > understand, we can choose between hllc(10) and hllc(16).
> >
> >    I understand that for every cuboid, you will need to go through the
> > whole data set again, but with the new cubing algo (2.x branch) it should be
> > simpler to add?
> >
> >    If it is not currently present, are there any plans to introduce this?
> >
> > Regards,
> > Abhilash
> >
>



-- 
Best regards,

Shaofeng Shi

Re: N Cuboids preparation MapReduce - Trying to avoid multiple stage read

2016-02-01 Thread ShaoFeng Shi

Is your idea similar to the algorithm we call "fast cubing"?
https://kylin.apache.org/blog/2015/08/15/fast-cubing/



2016-02-01 15:11 GMT+08:00 Ilamparithi M <mailtoilampari...@gmail.com>:

> One more item I haven't specified is
> the number of key_groups being sent to each subsequent MapReduce job.
> The current design of Kylin is very optimal in terms of taking
> minimal key_groups to the next stage.
>
> But looking at the approach I was thinking about (emitting cuboid-based
> keys from a one-stage mapper with the C_Id approach), the combiner becomes key,
> as it would group at the mapper side and bring down the number of
> values to be transferred to the reducer side.
>
> -Ilamparithi M.
>
> --
> View this message in context:
> http://apache-kylin.74782.x6.nabble.com/N-Cuboids-preparation-MapReduce-Trying-to-avoid-multiple-stage-read-tp3528p3531.html
> Sent from the Apache Kylin mailing list archive at Nabble.com.
>



-- 
Best regards,

Shaofeng Shi

Re: StringIndexOutOfBoundsException: String index out of range: -1

2016-02-03 Thread ShaoFeng Shi

Kylin uses HCatalog to read the Hive table; ideally HCatalog will
understand the different formats and partitions. I tried to search whether
HCatalog supports bucketed tables, but found no related discussion. Could
you please report a JIRA with your findings? First we can fix the string
index out of bounds error, and then look into the Hive source issue.

2016-02-03 22:09 GMT+08:00 <h...@uni.de>:

> Hi,
>
> we found the reason for the empty output files: the Hive tables are
> bucketed. It looks like Kylin does not support bucketed tables and is
> looking in the wrong folder for the necessary files.
>
> Can anyone confirm this?
>
>
> 2016-01-29 7:34 GMT+01:00  <h...@uni.de>:
> > Hi,
> >
> > the output file is actually empty (that's probably the cause for "out
> > of range -1" -> length (0)-1 = -1). There is no output logging which
> > could be used to investigate why the file is actually empty. Any hints
> > on how we can debug why it is empty?
> >
> >
> > 2016-01-29 2:52 GMT+01:00 hongbin ma <mahong...@apache.org>:
> >> HiveColumnCardinalityUpdateJob
> >> desc in source code:
> >>
> >> /**
> >>  * This job will update save the cardinality result into Kylin table
> >> metadata store.
> >>  * @author shaoshi
> >>  */
> >>
> >>
> >>
> >> it does not belong to a cubing job, it's a separate task to help modeling.
> >> can you check the output in /tmp/kylin/cardinality/KYLIN_DK.DIM_DTM? it
> >> seems the content format is not as expected:
> >>
> >> https://github.com/apache/kylin/blob/kylin-1.2/job/src/main/java/org/apache/kylin/job/hadoop/cardinality/HiveColumnCardinalityUpdateJob.java#L113
> >>
> >>
> >>
> >> --
> >> Regards,
> >>
> >> *Bin Mahone | 马洪宾*
> >> Apache Kylin: http://kylin.io
> >> Github: https://github.com/binmahone
>



-- 
Best regards,

Shaofeng Shi

Re: how to bootstrap kylin example project/cube in 2.x-staging

2016-02-03 Thread ShaoFeng Shi

Edward, you can run org.apache.kylin.job.DeployLocalMetaToRemoteTest

2016-02-04 12:10 GMT+08:00 Edward Zhang <yonzhang2...@apache.org>:

> Hi,
>
> I am working on branch 2.x-staging, how do we bootstrap kylin example
> project/cube which is under /examples.
>
> (I started kylin server in IDE, but need run job to get example
> project/cube)
>
> Thanks
>



-- 
Best regards,

Shaofeng Shi

Re: only one reducer in job

2016-02-02 Thread ShaoFeng Shi

KYLIN-1066 <https://issues.apache.org/jira/browse/KYLIN-1066> is unrelated
to your issue; it was an intermediate issue during the development of v2.0. You
can see that its "affected Version" and "fixed Version" are both "v2.0".

The "Kylin Hive Column Cardinality Job" uses 1 reducer to merge the
HyperLogLog counters from the mappers, to do a rough estimation of the column
cardinality. As the output from each mapper is a list of HLL
objects instead of the full distinct values, the data size is small (1KB *
# columns), so using 1 reducer to merge all output should be more efficient.
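
To illustrate why a single reducer is enough, here is a rough sketch in Python of merging HyperLogLog counters. It is not Kylin's implementation: the register arrays are plain lists and the function names are illustrative. It only relies on the standard property that the union of two HLL sketches of the same size is the element-wise maximum of their registers.

```python
def merge_registers(left, right):
    """Merge two HyperLogLog register arrays by element-wise maximum."""
    if len(left) != len(right):
        raise ValueError("register arrays must have the same length")
    return [max(a, b) for a, b in zip(left, right)]


def reduce_column_counters(mapper_outputs):
    """Fold per-mapper {column: registers} dicts into one dict per column."""
    merged = {}
    for output in mapper_outputs:
        for column, registers in output.items():
            if column in merged:
                merged[column] = merge_registers(merged[column], registers)
            else:
                # Copy so the mapper's own list is never mutated.
                merged[column] = list(registers)
    return merged
```

The reducer's input grows with the number of mappers and columns, not with the number of rows, which is why the merge stays cheap.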

Besides, this job is not a step in cube building, and is invisible in the UI
so far. Are you sure it is the slow one that you observed?


2016-02-03 8:11 GMT+08:00 greg gu <gug...@hotmail.com>:

> By the way, the job step that uses 1 reducer is "Kylin Hive Column
> Cardinality Job ", is this expected?
>
> > From: gug...@hotmail.com
> > To: dev@kylin.apache.org
> > Subject: only one reducer in job
> > Date: Tue, 2 Feb 2016 11:31:37 -0800
> >
> > When I process the cube, I found there is only one reducer, which causes
> > the job to run for a very long time.
> > I found this https://issues.apache.org/jira/browse/KYLIN-1066, which
> > mentions the issue is fixed.
> >
> > Is there a way to change the number of reducers?
> >
> > Thanks,
> >
> >
> >
>
>



-- 
Best regards,

Shaofeng Shi

Re: Exact distinct count support

2016-01-28 Thread ShaoFeng Shi

What's the cardinality of the dimension whose distinct values you want to
count? Integer's range is enough for most cases; if your case is within
this scope, you can try the bitmap with integer, but you need to map each value
to a unique id and use that within the bitmap. For example, if you want to
count distinct users, use the numeric user_id instead of the email address. As
for supporting other data types, as Hongbin mentioned, the storage cost is very
high, so we don't have that plan.
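
The idea of mapping values to integer ids and counting them in a bitmap can be sketched as follows. This is a toy illustration in Python, not the KYLIN-1186 code: the class and function names are hypothetical, and a Python integer stands in for a real (compressed) bitmap.

```python
class IdBitmap:
    """Exact distinct counter over non-negative integer ids."""

    def __init__(self):
        self.bits = 0  # bit i is set when id i has been seen

    def add(self, value_id):
        if value_id < 0:
            raise ValueError("ids must be non-negative")
        self.bits |= 1 << value_id

    def union(self, other):
        """Combine two bitmaps, as when aggregating pre-computed cells."""
        merged = IdBitmap()
        merged.bits = self.bits | other.bits
        return merged

    def count(self):
        return bin(self.bits).count("1")


def build_id_map(values):
    """Assign each distinct value (e.g. an email address) a dense integer id."""
    ids = {}
    for value in values:
        ids.setdefault(value, len(ids))
    return ids
```

Because the union of two bitmaps is still exact, counts can be combined across cells without double counting, which approximate counters only achieve within an error bound. The price is the storage cost mentioned above.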





2016-01-28 20:54 GMT+08:00 hongbin ma <mahong...@apache.org>:

> KYLIN-1186 <https://issues.apache.org/jira/browse/KYLIN-1186> is not a
> mature feature yet and it only supports integer
> we don't yet have plans to support any other forms of precise distinct
> count, as it is too expensive to pre-calculate
>
> On Thu, Jan 28, 2016 at 6:56 PM, Abhilash L L <abhil...@infoworks.io>
> wrote:
>
> > Thanks ShaoFeng Shi,
> >
> > We might need for other data types as well
> >
> > date & string
> >
> >  (eg, distinct count of dates of certain activity)
> >
> > So in the rest call instead of hllc return type it should be bitmap for
> > int,tinyint etc ?
> >
> > And we still send it as hllc for other data types ?
> >
> >
> > Also in one of the comments, it said we cast long to int. Won't we be
> > losing data due to truncation?
> >
> >
> > Regards,
> > Abhilash
> >
> > On Thu, Jan 28, 2016 at 3:43 PM, ShaoFeng Shi <shaofeng...@apache.org>
> > wrote:
> >
> > > Does this match your case?
> > > https://issues.apache.org/jira/browse/KYLIN-1186
> > >
> > > 2016-01-28 17:42 GMT+08:00 Abhilash L L <abhil...@infoworks.io>:
> > >
> > > > +user ml
> > > >
> > > > Regards,
> > > > Abhilash
> > > >
> > > > On Thu, Jan 28, 2016 at 11:32 AM, Abhilash L L <
> abhil...@infoworks.io>
> > > > wrote:
> > > >
> > > > > Hello,
> > > > >
> > > > > >    Is there a way to ask Kylin to get an exact distinct count? From what we
> > > > > > understand, we can choose between hllc(10) and hllc(16).
> > > > > >
> > > > > >    I understand that for every cuboid, you will need to go through the
> > > > > > whole data set again, but with the new cubing algo (2.x branch) it should be
> > > > > > simpler to add?
> > > > > >
> > > > > >    If it is not currently present, are there any plans to introduce this?
> > > > >
> > > > > Regards,
> > > > > Abhilash
> > > > >
> > > >
> > >
> > >
> > >
> > > --
> > > Best regards,
> > >
> > > Shaofeng Shi
> > >
> >
>
>
>
> --
> Regards,
>
> *Bin Mahone | 马洪宾*
> Apache Kylin: http://kylin.io
> Github: https://github.com/binmahone
>



-- 
Best regards,

Shaofeng Shi

Re: [jira] [Created] (KYLIN-1089) Kylin failed to run on CDH with HBase 1.0

2016-02-27 Thread ShaoFeng Shi

Hi sdangi, we have made no progress on this; please go ahead and share your
findings with the community when you finish. Whether it works or not, the
result would be helpful for CDH users, thank you!

2016-02-27 23:53 GMT+08:00 sdangi <sda...@datalenz.com>:

> Team -- Any successful builds against CDH5.5.x? I have just attempted it with
> changes in the job/storage package to fix the HBase interface changes, borrowed
> from the 1.1.4 branch. The 1.0.0-cdh5.5.x version of HBase from Cloudera does
> not align with 1.0.0 from Apache. However, I have got a clean build and am
> testing it.
>
>
>
> The pom is as below:
>
>
> 2.6.0-cdh5.5.2
> 2.6.0-cdh5.5.2
> 3.4.5-cdh5.5.2
> 1.1.0-cdh5.5.2
>
> 1.1.0-cdh5.5.2
>
> 1.0.0-cdh5.5.2
>
>
>
> [INFO]
> [INFO] Reactor Summary:
> [INFO]
> [INFO] Kylin:HadoopOLAPEngine ... SUCCESS [  0.919 s]
> [INFO] Kylin:AtopCalcite ........ SUCCESS [  2.660 s]
> [INFO] Kylin:Common ............. SUCCESS [ 46.248 s]
> [INFO] Kylin:Metadata ........... SUCCESS [  5.385 s]
> [INFO] Kylin:Dictionary ......... SUCCESS [  1.558 s]
> [INFO] Kylin:Cube ............... SUCCESS [  3.652 s]
> [INFO] Kylin:InvertedIndex ...... SUCCESS [  0.635 s]
> [INFO] Kylin:Job ................ SUCCESS [  7.054 s]
> [INFO] Kylin:Storage ............ SUCCESS [  3.900 s]
> [INFO] Kylin:Query .............. SUCCESS [  1.272 s]
> [INFO] Kylin:JDBC ............... SUCCESS [  2.235 s]
> [INFO] Kylin:RESTServer ......... SUCCESS [ 11.829 s]
> [INFO] Kylin:Monitor ............ SUCCESS [  1.129 s]
> [INFO]
>
>
> I will report any issues on running and building the cubes.
>
>
> --
> View this message in context:
> http://apache-kylin.74782.x6.nabble.com/jira-Created-KYLIN-1089-Kylin-failed-to-run-on-CDH-with-HBase-1-0-tp2027p3742.html
> Sent from the Apache Kylin mailing list archive at Nabble.com.
>



-- 
Best regards,

Shaofeng Shi

Re: [jira] [Created] (KYLIN-1089) Kylin failed to run on CDH with HBase 1.0

2016-02-28 Thread ShaoFeng Shi

More detailed logs are needed for analysis, such as the query log (in
kylin.log), the HBase region server side log, etc.

2016-02-28 11:25 GMT+08:00 sdangi <sda...@datalenz.com>:

> I have some good news. I have rebuilt the current Kylin Master1.2 against
> CDH5.5.3. I had to bring in some of the 1.1.3 branch changes dealing with the
> Cloudera API around additional methods for the RegionScanner interface.
>
> Also, I had to exclude some HBase (server) dependencies in the pom and upgrade
> the curator binaries. It compiles and builds OK, the server starts OK, and the
> Kylin sample cube build and query work fine. I applied this to our ongoing POC
> with over 1B rows. Build and most queries work. However, there is one query
> against the time dimension (which I had reported to Luke directly due to its
> sensitive nature) that is throwing a new error. Earlier it was an
> AbstractMethod error related to the getBatch interface of Scanner, which is
> now fixed due to binary compatibility with CDH5.5.2.
>
> But now, it fails with
>
>
> Caused by:
> org.apache.hadoop.hbase.ipc.RemoteWithExtrasException(org.apache.hadoop.hbase.DoNotRetryIOException):
> org.apache.hadoop.hbase.DoNotRetryIOException:
> java.lang.AbstractMethodError
> at org.apache.hadoop.hbase.ipc.RpcServer.call(RpcServer.java:2065)
> at org.apache.hadoop.hbase.ipc.CallRunner.run(CallRunner.java:107)
> at org.apache.hadoop.hbase.ipc.RpcExecutor.consumerLoop(RpcExecutor.java:130)
> at org.apache.hadoop.hbase.ipc.RpcExecutor$1.run(RpcExecutor.java:107)
> at java.lang.Thread.run(Thread.java:745)
> Caused by: java.lang.AbstractMethodError
> at org.apache.hadoop.hbase.regionserver.RSRpcServices.scan(RSRpcServices.java:2278)
> at org.apache.hadoop.hbase.protobuf.generated.ClientProtos$ClientService$2.callBlockingMethod(ClientProtos.java:32205)
> at org.apache.hadoop.hbase.ipc.RpcServer.call(RpcServer.java:2034)
> ... 4 more
>
> at org.apache.hadoop.hbase.ipc.RpcClientImpl.call(RpcClientImpl.java:1219)
> at org.apache.hadoop.hbase.ipc.AbstractRpcClient.callBlockingMethod(AbstractRpcClient.java:216)
> at org.apache.hadoop.hbase.ipc.AbstractRpcClient$BlockingRpcChannelImplementation.callBlockingMethod(AbstractRpcClient.java:300)
> at org.apache.hadoop.hbase.protobuf.generated.ClientProtos$ClientService$BlockingStub.scan(ClientProtos.java:32651)
> at org.apache.hadoop.hbase.client.ScannerCallable.call(ScannerCallable.java:213)
>
>
> I have remote debugged it.  Here is the stack
>
> <http://apache-kylin.74782.x6.nabble.com/file/n3745/Debugger_stack.png>
>
> A similar query (see below) on the sample cube works great.
>
> //this works
> SELECT SUM(PRICE) FROM  KYLIN_SALES
> WHERE PART_DT=DATE'2012-01-01'
>
> //this does not (1B rows in fact table)
> SELECT
>   CURRENCY
> ,SUM(TXN_AMT)  TOT_USD_TXN_AMT
> FROM TXN_FCT_ORC as TXN_FCT_ORC
> INNER JOIN DIM_ORC as DT_DIM_ORC
> ON BOOK_DT_KEY = DT_DIM_ORC.DT_KEY
> INNER JOIN CCY_DIM_ORC as CCY_DIM_ORC
> ON TXN_FCT_ORC.RMTR_CCY_KEY = CCY_DIM_ORC.CCY_KEY
> WHERE DT_DIM_ORC.DT_KEY=date'2015-03-03'
> GROUP BY
> CCY_DIM_ORC.CCY_NM, DT_DIM_ORC.CDR_YR
> ORDER BY TOT_USD_TXN_AMT  DESC
>
>
>
>
> --
> View this message in context:
> http://apache-kylin.74782.x6.nabble.com/jira-Created-KYLIN-1089-Kylin-failed-to-run-on-CDH-with-HBase-1-0-tp2027p3745.html
> Sent from the Apache Kylin mailing list archive at Nabble.com.
>



-- 
Best regards,

Shaofeng Shi

Re: no projects in the Kylin UI

2016-01-25 Thread ShaoFeng Shi

Zhong, thanks for the update; and hope you have run kylin successfully in the 
POC.

Sent from Outlook Mobile




On Mon, Jan 25, 2016 at 5:54 AM -0800, "zhong zhang" <zzaco...@gmail.com> wrote:










Hi Dong, Jia, Shaofeng and all,

The problem is solved. I'm so sorry for the late update.
And thanks so much for taking time to help. Sorry once
again.

Best regards,
Zhong

B

On Sun, Jan 24, 2016 at 8:47 AM, ShaoFeng Shi 
wrote:

> I think Zhong has fixed the problem himself. Others, please ignore.
>
> For people who ask questions on the mailing list: when you find the root cause,
> please post a quick update here; that is not only for knowledge sharing, but
> also saves other people's time. This is common sense.
>
>
>
> 2016-01-24 19:38 GMT+08:00 Jian Zhong :
>
> > can't see your pic, did you create your project first?
> >
> > On Sat, Jan 23, 2016 at 4:17 PM, Dong Li  wrote:
> >
> > > Hello,
> > >
> > > The picture didn't show up.
> > > Also, could you please attach some log from
> > > $KYLIN_HOME/tomcat/logs/kylin.log.
> > >
> > > Thanks,
> > > Dong Li
> > >
> > > 2016-01-23 0:42 GMT+08:00 zhong zhang :
> > >
> > > > [image: Inline image 1]
> > > > Hi All,
> > > >
> > > > Just as the pic shows, there is no projects shown in the UI.
> > > > Can anyone give some ideas to fix this problem?
> > > >
> > > >
> > > > Best regards,
> > > > Zhong
> > > >
> > >
> > >
> > >
> > > --
> > > Thanks,
> > > Dong
> > >
> >
>
>
>
> --
> Best regards,
>
> Shaofeng Shi
>

Re: internal hive table and build the cube backward

2016-01-19 Thread ShaoFeng Shi

Only the first step, actually: Kylin runs the "hive -e" command to create an
intermediate table; the following steps run MR over the files under
that table.

2016-01-20 4:18 GMT+08:00 zhong zhang <zzaco...@gmail.com>:

> Hi Yu and Everyone,
>
> Just to add a little: Hive is definitely involved in the steps "Create
> Intermediate Flat Hive Table" and "Build Dimension Dictionary". The question
> is whether Hive is involved in the following steps of building cuboids.
>
> Best regards,
> Zhong
>
> On Sun, Jan 17, 2016 at 10:35 PM, yu feng <olaptes...@gmail.com> wrote:
>
> > Firstly, Kylin does not distinguish which kind of table it is in Hive; as
> > long as you can query it in Hive, the table can be a normal table, an
> > external table, a view, or a table with some serdes.
> > Secondly, I think it is hard to build a cube backward along time in Kylin.
> > Maybe someone has some good ideas on this point.
> >
> > 2016-01-18 11:04 GMT+08:00 zhong zhang <zzaco...@gmail.com>:
> >
> > > Hi All,
> > >
> > > I'm wondering whether I can build the Kylin cube backward along time. More
> > > specifically, can I build the cube from the current time to six months ago,
> > > then from six months ago to 12 months ago, and so on? In this way, I can
> > > have the latest six months' cube result first.
> > >
> > > It's well known that the input of a Kylin cube is a Hive table. Does it make
> > > any difference whether an internal or an external Hive table is used when
> > > building the cube?
> > >
> > > Best regards,
> > > Zhong
> > >
> >
>



-- 
Best regards,

Shaofeng Shi

Re: Patch Review Request for EAGLE-1363 and EAGLE-1355

2016-01-27 Thread ShaoFeng Shi

Thanks Hao!

2016-01-27 21:54 GMT+08:00 hongbin ma <mahong...@apache.org>:

> merged
>
> thank you for your contribution Hao
>
>
> --
> Regards,
>
> *Bin Mahone | 马洪宾*
> Apache Kylin: http://kylin.io
> Github: https://github.com/binmahone
>



-- 
Best regards,

Shaofeng Shi

Re: Hive error when running kylin

2016-04-06 Thread ShaoFeng Shi

Hive may report a wrong class path, which causes the NoClassDefFoundError.

Try running $KYLIN_HOME/bin/find-hive-dependency.sh, get the output, and then
check whether the paths/files exist and are accessible.
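
As a rough aid for that check, the sketch below (Python, not part of Kylin) walks a colon-separated dependency string such as the "hive dependency:" value printed at startup and reports each entry's status. The function name and status labels are illustrative; an entry ending in "*" is assumed to be a directory wildcard.

```python
import os


def check_classpath(classpath, separator=":"):
    """Return (entry, status) pairs for every entry of a classpath string."""
    report = []
    for entry in classpath.split(separator):
        # A trailing "*" means "all jars in this directory"; test the directory.
        target = os.path.dirname(entry) if entry.endswith("*") else entry
        if entry == "":
            status = "empty"  # e.g. produced by "::" in the string
        elif not os.path.exists(target):
            status = "missing"
        elif not os.access(target, os.R_OK):
            status = "unreadable"
        else:
            status = "ok"
        report.append((entry, status))
    return report
```

An "empty" entry, like the one produced by the "::" in the dependency string quoted below, suggests that one of the lookups in the script returned nothing.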

2016-04-06 15:24 GMT+08:00 Yagyank Chadha <yagy...@gmail.com>:

> HI kylin developers,
>
> I installed Hadoop, Hive and HBase to work with Kylin. Everything was
> working fine until I changed something (which I don't exactly know), and I
> started getting the following error whenever I run Kylin:
> KYLIN_HOME is set to /usr/lib/kylin/bin/../
> kylin.security.profile is set to testing
> Exception in thread "main" java.lang.NoClassDefFoundError:
> org/apache/hadoop/hive/ql/CommandNeedRetryException
> at java.lang.Class.forName0(Native Method)
> at java.lang.Class.forName(Class.java:348)
> at org.apache.hadoop.util.RunJar.run(RunJar.java:214)
> at org.apache.hadoop.util.RunJar.main(RunJar.java:136)
> Caused by: java.lang.ClassNotFoundException:
> org.apache.hadoop.hive.ql.CommandNeedRetryException
> at java.net.URLClassLoader.findClass(URLClassLoader.java:381)
> at java.lang.ClassLoader.loadClass(ClassLoader.java:424)
> at sun.misc.Launcher$AppClassLoader.loadClass(Launcher.java:331)
> at java.lang.ClassLoader.loadClass(ClassLoader.java:357)
> ... 4 more
> HIVE_CONF is set to: /usr/lib/hive/apache-hive-1.2.1-bin/conf/, use it
> to locate hive configurations.
> HCAT_HOME is set to: /usr/lib/hive/apache-hive-1.2.1-bin/hcatalog, use
> it to find hcatalog path:
> dirname: missing operand
> Try 'dirname --help' for more information.
> find: cannot search `': No such file or directory
> hive dependency:
>
> /usr/lib/hive/apache-hive-1.2.1-bin/conf/::/usr/lib/hive/apache-hive-1.2.1-bin/hcatalog/share/hcatalog/hive-hcatalog-core-1.2.1.jar
> hbase dependency: /usr/lib/hbase/lib/hbase-common-0.98.18-hadoop2.jar
> KYLIN_JVM_SETTINGS is -Xms1024M -Xmx4096M -XX:MaxPermSize=128M
> -verbose:gc -XX:+PrintGCDetails -XX:+PrintGCDateStamps
> -Xloggc:/usr/lib/kylin/bin/..//logs/kylin.gc -XX:+UseGCLogFileRotation
> -XX:NumberOfGCLogFiles=10 -XX:GCLogFileSize=64M
> KYLIN_DEBUG_SETTINGS is not set, will not enable remote debuging
> KYLIN_LD_LIBRARY_SETTINGS is not set, Usually it's okay unless you want
> to specify your own native path
> A new Kylin instance is started by root, stop it using "kylin.sh stop"
> Please visit http://:7070/kylin
> You can check the log at /usr/lib/kylin/bin/..//logs/kylin.log
>
> --
> Regards
> *Yagyank chadha*
>
> *Undergraduate student*
> *Computer Science Engineering*
> *Thapar University, Patiala*
>



-- 
Best regards,

Shaofeng Shi

Re: Re: Cube bulid success,insight no display table

2016-04-08 Thread ShaoFeng Shi

Just ensure the "kylin.rest.servers" property includes all servers,
even if there is only one machine. Sometimes users forget to update this
property after expanding Kylin to a cluster, which would cause the issue you
mentioned; that's why I was asking about this.

If your configurations are all good, please try a few more times and check
whether there is any http error in logs/kylin.log. So far we haven't
received any reports of this issue in v1.5.0.
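
A quick way to sanity-check that property is sketched below in Python. It is not a Kylin tool: it assumes kylin.properties uses simple key=value lines and that "kylin.rest.servers" holds a comma-separated list of host:port entries; the host names in the usage note are hypothetical.

```python
def parse_properties(text):
    """Parse simple key=value lines, skipping blanks and '#' comments."""
    props = {}
    for raw in text.splitlines():
        line = raw.strip()
        if not line or line.startswith("#") or "=" not in line:
            continue
        key, value = line.split("=", 1)
        props[key.strip()] = value.strip()
    return props


def missing_rest_servers(props, expected):
    """Return the expected host:port entries absent from kylin.rest.servers."""
    raw = props.get("kylin.rest.servers", "")
    configured = {item.strip() for item in raw.split(",") if item.strip()}
    return sorted(set(expected) - configured)
```

For example, with "kylin.rest.servers=host1:7070" in the file and expected servers ["host1:7070", "host2:7070"], the second function returns ["host2:7070"], the node that would be missed.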

2016-04-08 17:55 GMT+08:00 Roy <aqinnxuk...@163.com>:

> Our Kylin is in cluster mode, but for now we have started just one machine,
> listening on port 7070. So do we need to modify some config?
>
>
>
>
>
>
>
>
> At 2016-04-08 17:02:32, "ShaoFeng Shi" <shaofeng...@apache.org> wrote:
> >Is your Kylin in singleton mode or cluster mode? Does it listen on port
> >7070?
> >
> >2016-04-08 16:05 GMT+08:00 Roy <aqinnxuk...@163.com>:
> >
> >> Hi guys,
> >>
> >> We use Kylin v1.5. Sometimes, when a cube builds successfully and its status
> >> is ready, no tables are available in the Insight tab.
> >>
> >> Best Regards
> >>
> >> Roy He
> >>
> >
> >
> >
> >--
> >Best regards,
> >
> >Shaofeng Shi
>

-- 
Best regards,

Shaofeng Shi

Re: [VOTE] Release apache-kylin-1.5.1 (release candidate 1)

2016-04-09 Thread ShaoFeng Shi

+1 (binding)

verified signature, md5 and sha hash; mvn test also passed;

2016-04-09 16:01 GMT+08:00 王晓雨 <wangxiao...@jd.com>:

> +1 (binding)
>
> mvn test passed
> signature verified
>
>
> > On April 9, 2016, at 15:03, Li Yang <liy...@apache.org> wrote:
> >
> > +1 binding
> >
> > mvn test pass
> >
> > java version "1.7.0_71"
> > OpenJDK Runtime Environment (rhel-2.5.3.1.el6-x86_64 u71-b14)
> > OpenJDK 64-Bit Server VM (build 24.65-b04, mixed mode)
> >
> >
> > On Fri, Apr 8, 2016 at 2:43 PM, Dong Li <lid...@apache.org> wrote:
> >
> >> Hi all,
> >>
> >>
> >> I have created a build for Apache Kylin 1.5.1, release candidate 1.
> >>
> >>
> >> Changes highlights:
> >> [KYLIN-1122] - Kylin support detail data query from fact table
> >> [KYLIN-1492] - Custom dimension encoding
> >> [KYLIN-1495] - Metadata upgrade from 1.0~1.3 to 1.5, including metadata
> >> correction, relevant tools, etc.
> >> [KYLIN-1534] - Cube specific config, override global kylin.properties
> >> [KYLIN-1546] - Tool to dump information for diagnosis
> >>
> >> Thanks to everyone who has contributed to this release.
> >> Here’s release notes:
> >>
> >>
> https://issues.apache.org/jira/secure/ReleaseNote.jspa?projectId=12316121&version=12335346
> >>
> >>
> >> The commit to be voted upon:
> >>
> >>
> https://github.com/apache/kylin/commit/aa98875c1b603e79b866b5e91bc3288e61a0b679
> >>
> >>
> >> Its hash is aa98875c1b603e79b866b5e91bc3288e61a0b679.
> >>
> >>
> >> The artifacts to be voted on are located here:
> >> https://dist.apache.org/repos/dist/dev/kylin/apache-kylin-1.5.1-rc1/
> >>
> >>
> >> The hashes of the artifacts are as follows:
> >> apache-kylin-1.5.1-src.tar.gz.md5 75df97f689d81f58eff47d1f51cdd45d
> >> apache-kylin-1.5.1-src.tar.gz.sha1 8c8266f8fe96665f8520108b75a4491246615ce8
> >>
> >>
> >> A staged Maven repository is available for review at:
> >> https://repository.apache.org/content/repositories/orgapachekylin-1024
> >>
> >>
> >> Release artifacts are signed with the following key:
> >> https://people.apache.org/keys/committer/lidong.asc
> >>
> >>
> >> Please vote on releasing this package as Apache Kylin 1.5.1.
> >>
> >>
> >> The vote is open for the next 72 hours and passes if a majority of
> >> at least three +1 PPMC votes are cast.
> >>
> >>
> >> [ ] +1 Release this package as Apache Kylin 1.5.1
> >> [ ] 0 I don't feel strongly about it, but I'm okay with the release
> >> [ ] -1 Do not release this package because...
> >>
> >>
> >> Here is my vote:
> >>
> >>
> >> +1 (binding)
> >>
> >>
> >> Thanks,
> >> Dong Li
>
>


-- 
Best regards,

Shaofeng Shi

Re: Welcome new committer and PMC member

2016-04-11 Thread ShaoFeng Shi

Welcome Yanghong, Xiaoyu and Dong!

2016-04-11 20:07 GMT+08:00 Adunuthula, Seshu <sadunuth...@ebay.com>:

> Yanghong, Congratulations on becoming a committer.
>
> Xiaoyu, Dong, Congratulations on becoming the PMC members. These
> activities show the maturing of Kylin PMC.
>
> Regards
> Seshu Adunuthula
>
>
> On 4/9/16, 12:26 AM, "Luke Han" <luke...@apache.org> wrote:
>
> >I am very pleased to announce that the Project Management Committee
> >(PMC) of Apache Kylin has asked Yanghong Zhong to become a committer, and
> >Xiaoyu Wang and Dong Li to become PMC members, and they have all accepted.
> >
> >They have made significant contributions and patches, and actively
> >answer others' questions and issues in Kylin's community.
> >Welcome:)
> >
> >Luke
> >
> >On behalf of the Apache Kylin PPMC
>
>


-- 
Best regards,

Shaofeng Shi

Re: How to resolve this problem

2016-04-12 Thread ShaoFeng Shi

Did you try to google this error? Please take a try if you haven't.

2016-04-12 16:01 GMT+08:00 耳东 <775620...@qq.com>:

> Error: java.lang.RuntimeException: java.lang.ClassNotFoundException:
> Class org.apache.hive.hcatalog.mapreduce.HCatInputFormat not found
> at org.apache.hadoop.conf.Configuration.getClass(Configuration.java:2112)
> at org.apache.hadoop.mapreduce.task.JobContextImpl.getInputFormatClass(JobContextImpl.java:184)
> at org.apache.hadoop.mapred.MapTask.runNewMapper(MapTask.java:749)
> at org.apache.hadoop.mapred.MapTask.run(MapTask.java:341)
> at org.apache.hadoop.mapred.YarnChild$2.run(YarnChild.java:163)
> at java.security.AccessController.doPrivileged(Native Method)
> at javax.security.auth.Subject.doAs(Subject.java:415)
> at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1776)
> at org.apache.hadoop.mapred.YarnChild.main(YarnChild.java:158)
> Caused by: java.lang.ClassNotFoundException: Class
> org.apache.hive.hcatalog.mapreduce.HCatInputFormat not found
> at org.apache.hadoop.conf.Configuration.getClassByName(Configuration.java:2018)
> at org.apache.hadoop.conf.Configuration.getClass(Configuration.java:2110)
> ... 8 more




-- 
Best regards,

Shaofeng Shi

Re: an error occurred when build a sample cube at step 5:create HTable

2016-04-12 Thread ShaoFeng Shi

Good catch, and thanks for sharing; keep going.

2016-04-12 15:43 GMT+08:00 qing·ye <1813724...@qq.com>:

> I resolved the problem.
>
> I checked my HDFS log and found the following information:
>
> 2016-04-12 12:05:05,726 ERROR [RS_OPEN_REGION-slave2:16020-0] handler.OpenRegionHandler: Failed open of region=KYLIN_VKRC32OKFP,,1460433926913.73fb906719a75b2733f046e87fbe8105., starting to roll back the global memstore size.
> org.apache.hadoop.hbase.DoNotRetryIOException: Compression algorithm 'snappy' previously failed test.
>     at org.apache.hadoop.hbase.util.CompressionTest.testCompression(CompressionTest.java:91)
>     at org.apache.hadoop.hbase.regionserver.HRegion.checkCompressionCodecs(HRegion.java:6300)
>     at org.apache.hadoop.hbase.regionserver.HRegion.openHRegion(HRegion.java:6251)
>     at org.apache.hadoop.hbase.regionserver.HRegion.openHRegion(HRegion.java:6218)
>     at org.apache.hadoop.hbase.regionserver.HRegion.openHRegion(HRegion.java:6189)
>     at org.apache.hadoop.hbase.regionserver.HRegion.openHRegion(HRegion.java:6145)
>     at org.apache.hadoop.hbase.regionserver.HRegion.openHRegion(HRegion.java:6096)
>     at org.apache.hadoop.hbase.regionserver.handler.OpenRegionHandler.openRegion(OpenRegionHandler.java:362)
>     at org.apache.hadoop.hbase.regionserver.handler.OpenRegionHandler.process(OpenRegionHandler.java:129)
>     at org.apache.hadoop.hbase.executor.EventHandler.run(EventHandler.java:129)
>     at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1145)
>     at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:615)
>     at java.lang.Thread.run(Thread.java:745)
> 2016-04-12 12:05:05,727 INFO [RS_OPEN_REGION-slave2:16020-0] coordination.ZkOpenRegionCoordination: Opening of region {ENCODED => 73fb906719a75b2733f046e87fbe8105, NAME => 'KYLIN_VKRC32OKFP,,1460433926913.73fb906719a75b2733f046e87fbe8105.', STARTKEY => '', ENDKEY => '\x00\x01'} failed, transitioning from OPENING to FAILED_OPEN in ZK, expecting version 1
> 2016-04-12 12:05:05,775 INFO [PriorityRpcServer.handler=18,queue=0,port=16020] regionserver.RSRpcServices: Open KYLIN_VKRC32OKFP,\x00\x01,1460433926913.06978b9fb1e423563a5aae7e1df044d8.
>
> Then I realized that my compression configuration was probably wrong,
> so I disabled Snappy compression by deleting the related configuration.
>
> To disable compression for MR jobs, modify
> $KYLIN_HOME/conf/kylin_job_conf.xml by removing all configuration entries
> related to compression (just grep for the keyword "compress"). To disable
> compression for HBase tables, open $KYLIN_HOME/conf/kylin.properties
> and remove the line starting with kylin.hbase.default.compression.codec.
>
> Finally, I restarted my service and it works well.
>
> --
> View this message in context:
> http://apache-kylin.74782.x6.nabble.com/an-error-occurred-when-build-a-sample-cube-at-step-5-create-HTable-tp4102p4140.html
> Sent from the Apache Kylin mailing list archive at Nabble.com.
>
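
For readers who prefer to script the cleanup described in the quoted message, here is a minimal sketch in Python. It assumes the job configuration follows the usual Hadoop layout (a <configuration> root with <property>/<name>/<value> children). The SAMPLE text is a made-up stand-in, not the real contents of kylin_job_conf.xml, and strip_compression_properties is a hypothetical helper name.

```python
import xml.etree.ElementTree as ET

# Made-up stand-in for a Hadoop-style job configuration file.
SAMPLE = """<configuration>
  <property><name>mapreduce.map.output.compress</name><value>true</value></property>
  <property><name>mapreduce.job.split.metainfo.maxsize</name><value>-1</value></property>
  <property><name>mapreduce.output.fileoutputformat.compress.codec</name><value>org.apache.hadoop.io.compress.SnappyCodec</value></property>
</configuration>"""


def strip_compression_properties(xml_text):
    """Drop every <property> whose <name> contains "compress".

    Returns the rewritten XML text and the list of removed property names.
    """
    root = ET.fromstring(xml_text)
    removed = []
    # Iterate over a copy because we mutate the tree while looping.
    for prop in list(root.findall("property")):
        name = (prop.findtext("name") or "").strip()
        if "compress" in name.lower():
            root.remove(prop)
            removed.append(name)
    return ET.tostring(root, encoding="unicode"), removed


if __name__ == "__main__":
    new_xml, removed = strip_compression_properties(SAMPLE)
    print("removed:", removed)
    print(new_xml)
```

Matching on the property name only mirrors the "grep the keyword compress" advice. Review the removed names before overwriting a real file, and keep a backup.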



-- 
Best regards,

Shaofeng Shi

Re: Does INMEM Cubing algorithm release in v1.3?

2016-04-06 Thread ShaoFeng Shi

In-mem cubing (also called "fast cubing") is only available from v1.5.0 onward.

You can find two blog posts about "Fast cubing" at
https://kylin.apache.org/blog/

2016-04-06 15:20 GMT+08:00 Mars J <xujiao.myc...@gmail.com>:

> Hi,
> Is the inmem cubing algorithm released in v1.3-hbase1.1.3?
>
> I guess the answer is no, because I looked through all the v1.3 logs
> produced by what I executed, and there is nothing like 'The cube algorithm
> for...', which I found in org.apache.kylin.engine.mr.steps
> (logger.info("The cube algorithm for " + seg + " is " + alg);)
>
> Are there any docs describing the inmem cubing algorithm? Is this one (
> http://www.infoq.com/cn/articles/apache-kylin-algorithm)?
>



-- 
Best regards,

Shaofeng Shi

Re: Can we choose layered cubing or im-memory cubing manually

2016-03-26 Thread ShaoFeng Shi

Hi Yanghong,

What is the detailed error in such a failed MR job?

2016-03-26 7:43 GMT+08:00 Zhong, Yanghong <nju_zyh081251...@126.com>:

> For some cube building jobs, Kylin chooses in-memory cubing. However, this
> choice is not good for some users due to its large memory cost. Customers may
> be able to tolerate a long cube building time, but not a large memory cost,
> which may lead to cube building MR job failure. Therefore, it may be better
> to provide a parameter for setting whether the strategy is decided
> automatically or manually.
>
> Best regards,
> Yanghong Zhong
> yangzh...@ebay.com
>



-- 
Best regards,

Shaofeng Shi

Re: About the function of the "Refresh" button under tab "System"

2016-03-26 Thread ShaoFeng Shi

My suggestion is "don't make it too smart"; the configurations in
kylin.properties should be static and are not intended to be updated often.
Some configurations are very basic and impact the whole system; it is
not easy to change them dynamically without a restart.

You also need to consider cluster mode: the "Refresh"/"Set" request is
only sent to one REST server, so for it to take effect all nodes need to be
notified/synchronized.

2016-03-26 20:15 GMT+08:00 hongbin ma <mahong...@apache.org>:

> On Sat, Mar 26, 2016 at 7:36 AM, Zhong, Yanghong <nju_zyh081251...@126.com
> >
> wrote:
>
> > After a property be updated by “Set Config”, the left TextArea’s contents
> > be updated automatically.
> >
>
> this  looks great to me
> 
>
>
> > After the “refresh” button clicked for “Server Config”, kylin loads
> > contents from the configuration file “kylin.properties” and updates the
> UI.
> >
>
> I'm not sure if this looks intuitive to users. It makes it look like
> "reset" config rather than "refresh" config. After all the changes made in
> "Set config" is not persisted to the configuration file. A brutal reload
> will clear all the changes being made.
>
>
>
> --
> Regards,
>
> *Bin Mahone | 马洪宾*
> Apache Kylin: http://kylin.io
> Github: https://github.com/binmahone
>

-- 
Best regards,

Shaofeng Shi

Re: [VOTE] Release apache-kylin-1.3 (release candidate 1)

2016-03-07 Thread ShaoFeng Shi

er.java:50)

at org.junit.runners.ParentRunner$3.run(ParentRunner.java:238)
at org.junit.runners.ParentRunner$1.schedule(ParentRunner.java:63)
at org.junit.runners.ParentRunner.runChildren(ParentRunner.java:236)
at org.junit.runners.ParentRunner.access$000(ParentRunner.java:53)
at org.junit.runners.ParentRunner$2.evaluate(ParentRunner.java:229)
at org.junit.internal.runners.statements.RunBefores.evaluate(RunBefores.java:26)
at org.junit.internal.runners.statements.RunAfters.evaluate(RunAfters.java:27)
at org.junit.runners.ParentRunner.run(ParentRunner.java:309)
at org.apache.maven.surefire.junit4.JUnit4Provider.execute(JUnit4Provider.java:264)
at org.apache.maven.surefire.junit4.JUnit4Provider.executeTestSet(JUnit4Provider.java:153)
at org.apache.maven.surefire.junit4.JUnit4Provider.invoke(JUnit4Provider.java:124)
at org.apache.maven.surefire.booter.ForkedBooter.invokeProviderInSameClassLoader(ForkedBooter.java:200)
at org.apache.maven.surefire.booter.ForkedBooter.runSuitesInProcess(ForkedBooter.java:153)
at org.apache.maven.surefire.booter.ForkedBooter.main(ForkedBooter.java:103)

Tests run: 2, Failures: 1, Errors: 0, Skipped: 1, Time elapsed: 0.131 sec <<< FAILURE! - in org.apache.kylin.storage.test.StorageTest

test01(org.apache.kylin.storage.test.StorageTest)  Time elapsed: 0.13 sec <<< FAILURE!
java.lang.AssertionError: null
at org.junit.Assert.fail(Assert.java:86)
at org.junit.Assert.assertTrue(Assert.java:41)
at org.junit.Assert.assertNotNull(Assert.java:621)
at org.junit.Assert.assertNotNull(Assert.java:631)
at org.apache.kylin.storage.test.StorageTest.setUp(StorageTest.java:77)



2016-03-08 11:04 GMT+08:00 Dong Li <lid...@apache.org>:

> +1 (non-binding)
>
>
> mvn test passed
>
>
> Thanks,
> Dong Li
>
>
> Original Message
> Sender: hongbin ma <mahong...@apache.org>
> Recipient: dev...@kylin.apache.org
> Date: Tuesday, Mar 8, 2016 10:56
> Subject: [VOTE] Release apache-kylin-1.3 (release candidate 1)
>
>
> Hi all,
>
> I have created a build for Apache Kylin 1.3, release candidate 1.
>
> Changes highlights:
> [KYLIN-1323] - Improve performance of converting data to hfile
> [KYLIN-1186] - Support precise Count Distinct using bitmap
> [KYLIN-976] - Support Custom Aggregation Types
> [KYLIN-1054] - Support Hive client Beeline
> [KYLIN-1128] - Clone Cube Metadata
>
> Thanks to everyone who has contributed to this release.
> Here’s the full release notes:
> https://issues.apache.org/jira/secure/ReleaseNote.jspa?projectId=12316121version=1265
>
> The commit to be voted upon:
> https://github.com/apache/kylin/commit/99b50e1ce517a1be8fe3e87108999d347aa4929b
>
> Its hash is 99b50e1ce517a1be8fe3e87108999d347aa4929b.
>
> The artifacts to be voted on are located here:
> https://dist.apache.org/repos/dist/dev/kylin/apache-kylin-1.3-rc1/
>
> The hashes of the artifacts are as follows:
> apache-kylin-1.3-src.tar.gz.md5 15aad4c3f7e435d30477ce5fae8bea30
> apache-kylin-1.3-src.tar.gz.sha1 09995c94cfd8d8b2bb1798126157093da9142762
>
> A staged Maven repository is available for review at:
> https://repository.apache.org/content/repositories/orgapachekylin-1019/
>
> Release artifacts are signed with the following key:
> https://people.apache.org/keys/committer/mahongbin.asc
>
> Please vote on releasing this package as Apache Kylin 1.3.
>
> The vote is open for the next 72 hours and passes if a majority of
> at least three +1 PPMC votes are cast.
>
> [ ] +1 Release this package as Apache Kylin 1.3
> [ ] 0 I don't feel strongly about it, but I'm okay with the release
> [ ] -1 Do not release this package because...
>
> Here is my vote:
>
> +1 (binding)
>
> --
> Regards,
>
> *Bin Mahone | 马洪宾*
> Apache Kylin: http://kylin.io
> Github: https://github.com/binmahone




-- 
Best regards,

Shaofeng Shi

Re: multiple user in kylin

2016-03-08 Thread ShaoFeng Shi

The "testing" profile has three built-in accounts: ADMIN(KYLIN),
MODELER(MODELER), ANALYST(ANALYST), each represents a role. To have more
accounts, you can manually edit the kylinSecurity.xml as
https://github.com/apache/kylin/blob/master/server/src/main/resources/kylinSecurity.xml#L154

However, the "testing" profile is only meant for testing or POC purposes. In a
production deployment, we suggest using LDAP to manage users and roles;
Kylin supports LDAP integration.

2016-03-08 14:24 GMT+08:00 Ding Dinghua <dingdinghu...@gmail.com>:

> Hi:
> Does kylin support multiple user besides ADMIN?
> This may be useful when integrating kylin in our workflow, we may need
> to
> do isolation work between different users in hdfs/mr/hive/hbase.
>
>  Thanks.
>
> --
> Ding Dinghua
>



-- 
Best regards,

Shaofeng Shi

Re: [jira] [Created] (KYLIN-1089) Kylin failed to run on CDH with HBase 1.0

2016-03-03 Thread ShaoFeng Shi

Hi sdangi, you can find how to make a patch in:
https://kylin.apache.org/development/howto_contribute.html

Thanks!

2016-03-03 21:02 GMT+08:00 sdangi <sda...@datalenz.com>:

> Team -- More than happy to.  Can some one guide me on how to get the patch
> out?
>
> Thanks,
> Regards,
>
> --
> View this message in context:
> http://apache-kylin.74782.x6.nabble.com/jira-Created-KYLIN-1089-Kylin-failed-to-run-on-CDH-with-HBase-1-0-tp2027p3784.html
> Sent from the Apache Kylin mailing list archive at Nabble.com.
>



-- 
Best regards,

Shaofeng Shi

Re: VER1.5 -- Cannot find rowkey column DT_KEY in cube CubeDesc [name=TEST_CUBE]

2016-03-28 Thread ShaoFeng Shi

This is a bug, could you please report it as a JIRA?

To bypass this error for now, please use the FK on fact table as the
dimension ("TXN_BOOK_DT_KEY" in this case).
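
To illustrate the workaround, here is a minimal sketch in Python that maps a lookup table's primary-key column to the fact table's foreign-key column using the join definitions from the model quoted below. It is not Kylin's own resolution logic. fact_column_for is a hypothetical helper name, and LOOKUPS is copied by hand from the quoted model JSON.

```python
# Join definitions copied from the "lookups" section of the quoted model.
LOOKUPS = [
    {
        "table": "SCHM.DT_DIM_ORC",
        "join": {
            "type": "inner",
            "primary_key": ["DT_KEY"],
            "foreign_key": ["TXN_BOOK_DT_KEY"],
        },
    },
    {
        "table": "SCHM.CST_DIM_ORC",
        "join": {
            "type": "inner",
            "primary_key": ["CST_KEY"],
            "foreign_key": ["FIRM_CST_KEY"],
        },
    },
]


def fact_column_for(lookups, table, column):
    """Return the fact-table FK if `column` is a join PK of `table`.

    Columns that are not part of a join key are returned unchanged.
    """
    for lookup in lookups:
        if lookup["table"] != table:
            continue
        join = lookup["join"]
        # primary_key and foreign_key are parallel lists.
        for pk, fk in zip(join["primary_key"], join["foreign_key"]):
            if pk == column:
                return fk
    return column


if __name__ == "__main__":
    print(fact_column_for(LOOKUPS, "SCHM.DT_DIM_ORC", "DT_KEY"))
    print(fact_column_for(LOOKUPS, "SCHM.CST_DIM_ORC", "CST_NM"))
```

With the quoted model, DT_KEY resolves to TXN_BOOK_DT_KEY, while CST_NM is not a join key and stays as it is.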

2016-03-28 22:35 GMT+08:00 sdangi <sda...@datalenz.com>:

> I have designed model/cubes in the past on version 1.2 and 1.3 no issue.
> I'm
> hitting this issue with 1.5.  Please check the model and cube JSON and let
> me know if there is anything that stands out to cause this.
>
>
> *Error Message
> Cannot find rowkey column DT_KEY in cube CubeDesc [name=TEST_CUBE]*
>
>
> MODEL:
> 
> {
>   "uuid": "dd8395e2-0da3-48b1-8a0c-4165d477e7c5",
>   "version": "1.5.0",
>   "name": "TEST_MODEL",
>   "description": "",
>   "lookups": [
>     {
>       "table": "SCHM.DT_DIM_ORC",
>       "join": {
>         "type": "inner",
>         "primary_key": [
>           "DT_KEY"
>         ],
>         "foreign_key": [
>           "TXN_BOOK_DT_KEY"
>         ]
>       }
>     },
>     {
>       "table": "SCHM.CST_DIM_ORC",
>       "join": {
>         "type": "inner",
>         "primary_key": [
>           "CST_KEY"
>         ],
>         "foreign_key": [
>           "FIRM_CST_KEY"
>         ]
>       }
>     }
>   ],
>   "dimensions": [
>     {
>       "table": "SCHM.TXN_FCT_ORC_SM",
>       "columns": []
>     },
>     {
>       "table": "SCHM.DT_DIM_ORC",
>       "columns": [
>         "DT_KEY"
>       ]
>     },
>     {
>       "table": "SCHM.CST_DIM_ORC",
>       "columns": [
>         "CST_NM"
>       ]
>     }
>   ],
>   "metrics": [
>     "USD_TXN_AMT"
>   ],
>   "capacity": "MEDIUM",
>   "last_modified": 1459175903495,
>   "fact_table": "SCHM.TXN_FCT_ORC_SM",
>   "filter_condition": "",
>   "partition_desc": {
>     "partition_date_column": "SCHM.TXN_FCT_ORC_SM.TXN_BOOK_DT_KEY",
>     "partition_time_column": null,
>     "partition_date_start": 0,
>     "partition_date_format": "-MM-dd",
>     "partition_time_format": "HH:mm:ss",
>     "partition_type": "APPEND",
>     "partition_condition_builder": "org.apache.kylin.metadata.model.PartitionDesc$DefaultPartitionConditionBuilder"
>   }
> }
>
>
> CUBE:
> ===
>
> {
>   "name": "TEST_CUBE",
>   "model_name": "TEST_MODEL",
>   "description": "",
>   "dimensions": [
>     {
>       "name": "CST_DIM_CST_NM",
>       "table": "SCHM.CST_DIM_ORC",
>       "derived": null,
>       "column": "CST_NM"
>     },
>     {
>       "name": "DT_DIM_DT_KEY",
>       "table": "SCHM.DT_DIM_ORC",
>       "derived": null,
>       "column": "DT_KEY"
>     }
>   ],
>   "measures": [
>     {
>       "name": "_COUNT_",
>       "function": {
>         "expression": "COUNT",
>         "returntype": "bigint",
>         "parameter": {
>           "type": "constant",
>           "value": "1",
>           "next_parameter": null
>         }
>       }
>     },
>     {
>       "name": "USD_TXN_AMT",
>       "function": {
>         "expression": "SUM",
>         "returntype": "decimal(32,8)",
>         "parameter": {
>           "type": "column",
>           "value": "USD_TXN_AMT",
>           "next_parameter": null
>         }
>       }
>     }
>   ],
>   "rowkey": {
>     "rowkey_columns": [
>       {
>         "column": "CST_NM",
>         "encoding": "dict"
>       },
>       {
>         "column": "DT_KEY",
>         "encoding": "dict"
>       }
>     ]
>   },
>   "aggregation_groups": [
>     {
>       "includes": [
>         "CST_NM",
>         "DT_KEY"
>       ],
>       "select_rule": {
>         "hierarchy_dims": [],
>         "mandatory_dims": [],
>         "joint_dims": []
>       }
>     }
>   ],
>   "partition_date_start": 138853440,
>   "notify_list": [],
>   "hbase_mapping": {
>     "column_family": [
>       {
>         "name": "f1",
>         "columns": [
>           {
>             "qualifier": "m",
>             "measure_refs": [
>               "_COUNT_",
>               "USD_TXN_AMT"
>             ]
>           }
>         ]
>       }
>     ]
>   },
>   "retention_range": "0",
>   "auto_merge_time_ranges": [
>     60480,
>     241920
>   ],
>   "engine_type": 2,
>   "storage_type": 2
> }
>
> Thanks,
> Regards,
>
> --
> View this message in context:
> http://apache-kylin.74782.x6.nabble.com/VER1-5-Cannot-find-rowkey-column-DT-KEY-in-cube-CubeDesc-name-TEST-CUBE-tp3982.html
> Sent from the Apache Kylin mailing list archive at Nabble.com.
>



-- 
Best regards,

Shaofeng Shi

Re: Kylin UI: Enable partition date column to support date and hour as separate columns

2016-03-30 Thread ShaoFeng Shi

I see; the issue should be fixed in the next release.

I have a question, which may have been discussed before: would using one
timestamp column be better than using multiple columns for partitioning?
Timestamp is a standard data type like Date, and there are many
functions to extract the week, day, hour, and minute from it. If a user
keeps the hour/minute in separate columns, it is also easy to convert
them into a timestamp.
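
As a rough illustration of that conversion, here is a minimal sketch in Python using only the standard library. The "yyyy-MM-dd" style date string plus integer hour/minute layout is an assumption about how the separate columns might look, and both helper names are hypothetical.

```python
from datetime import datetime, timedelta


def to_timestamp(date_str, hour, minute=0):
    """Combine a date column and hour/minute columns into one timestamp."""
    day = datetime.strptime(date_str, "%Y-%m-%d")
    return day + timedelta(hours=hour, minutes=minute)


def split_timestamp(ts):
    """Go the other way: extract the date, hour and minute parts."""
    return ts.strftime("%Y-%m-%d"), ts.hour, ts.minute


if __name__ == "__main__":
    ts = to_timestamp("2016-03-29", 16, 35)
    print(ts.isoformat())
    print(split_timestamp(ts))
```

The round trip loses nothing at minute granularity, which is the point of the question: one timestamp column carries the same information as the separate date and hour columns.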

2016-03-30 20:00 GMT+08:00 Jian Zhong <hellowode...@gmail.com>:

> UI is not ready, I'm working with Dipesh on
> https://issues.apache.org/jira/browse/KYLIN-1441
>
> and for backend, need to check the multiple days issue, we have a jira here
>
> https://issues.apache.org/jira/browse/KYLIN-1513?filter=-3
>
>
> On Wed, Mar 30, 2016 at 11:07 AM, ShaoFeng Shi <shaofeng...@apache.org>
> wrote:
>
> > @Dipesh @Jason, could you please answer Delu's question? I'm also curious
> > about this; thank you.
> >
> > 2016-03-30 10:56 GMT+08:00 Delu Zhu <delu...@yahoo.com.invalid>:
> >
> > > Hi,
> > >
> > >
> > >
> > > Is there any beta version of this feature, so I could try it out on my
> > > side?
> > >
> > > Thanks
> > > Delu
> > >
> > >   From: Delu Zhu <delu...@yahoo.com.INVALID>
> > >  To: "dev@kylin.apache.org" <dev@kylin.apache.org>
> > >  Sent: Tuesday, March 29, 2016 4:35 PM
> > >  Subject: Kylin UI: Enable partition date column to support date and hour
> > > as separate columns
> > >
> > > Hi Kylin Developers,
> > >
> > > I'm working on migrating our hourly cube building pipeline to Kylin, and
> > > according to your official docs, task KYLIN-1427 has been included in the
> > > release notes of apache/kylin to enable the partition date column to
> > > support date and hour as separate columns.
> > >
> > > This is good news, but after upgrading to Kylin 1.5, I found this feature
> > > has not been enabled on the Kylin UI yet.
> > >
> > > So when will KYLIN-1441 (the corresponding changes on the UI side) be
> > > released?
> > >
> > > Thanks
> > > Delu
> > >
> > >
> > >
> > >
> > >
> > >
> > >
> > >
> > >
> >
> >
> >
> > --
> > Best regards,
> >
> > Shaofeng Shi
> >
>



-- 
Best regards,

Shaofeng Shi

Re: VER1.5 -- Cannot find rowkey column DT_KEY in cube CubeDesc [name=TEST_CUBE]

2016-03-30 Thread ShaoFeng Shi

I reproduced this problem from the GUI, so either this is a regression or the
fix was only made in the 1.x branch; we need to check this.

2016-03-30 18:21 GMT+08:00 Li Yang <liy...@apache.org>:

> The column under the "rowkey" section should be "TXN_BOOK_DT_KEY" instead of
> "DT_KEY". If the cube was created through the GUI, then please report a JIRA.
>
> By the way, this bug was fixed in an earlier version. Has it come back again?
>
> On Tue, Mar 29, 2016 at 11:21 AM, ShaoFeng Shi <shaofeng...@apache.org>
> wrote:
>
> > This is a bug, could you please report it as a JIRA?
> >
> > To bypass this error for now, please use the FK on fact table as the
> > dimension ("TXN_BOOK_DT_KEY" in this case).
> >
> > 2016-03-28 22:35 GMT+08:00 sdangi <sda...@datalenz.com>:
> >
> > > I have designed model/cubes in the past on version 1.2 and 1.3 no
> issue.
> > > I'm
> > > hitting this issue with 1.5.  Please check the model and cube JSON and
> > let
> > > me know if there is anything that stands out to cause this.
> > >
> > >
> > > *Error Message
> > > Cannot find rowkey column DT_KEY in cube CubeDesc [name=TEST_CUBE]*
> > >
> > >
> > > MODEL:
> > > 
> > > {
> > >   "uuid": "dd8395e2-0da3-48b1-8a0c-4165d477e7c5",
> > >   "version": "1.5.0",
> > >   "name": "TEST_MODEL",
> > >   "description": "",
> > >   "lookups": [
> > > {
> > >   "table": "SCHM.DT_DIM_ORC",
> > >   "join": {
> > > "type": "inner",
> > > "primary_key": [
> > >   "DT_KEY"
> > > ],
> > > "foreign_key": [
> > >   "TXN_BOOK_DT_KEY"
> > > ]
> > >   }
> > > },
> > > {
> > >   "table": "SCHM.CST_DIM_ORC",
> > >   "join": {
> > > "type": "inner",
> > > "primary_key": [
> > >   "CST_KEY"
> > > ],
> > > "foreign_key": [
> > >   "FIRM_CST_KEY"
> > > ]
> > >   }
> > > }
> > >   ],
> > >   "dimensions": [
> > > {
> > >   "table": "SCHM.TXN_FCT_ORC_SM",
> > >   "columns": []
> > > },
> > > {
> > >   "table": "SCHM.DT_DIM_ORC",
> > >   "columns": [
> > > "DT_KEY"
> > >   ]
> > > },
> > > {
> > >   "table": "SCHM.CST_DIM_ORC",
> > >   "columns": [
> > > "CST_NM"
> > >   ]
> > > }
> > >   ],
> > >   "metrics": [
> > > "USD_TXN_AMT"
> > >   ],
> > >   "capacity": "MEDIUM",
> > >   "last_modified": 1459175903495,
> > >   "fact_table": "SCHM.TXN_FCT_ORC_SM",
> > >   "filter_condition": "",
> > >   "partition_desc": {
> > > "partition_date_column": "SCHM.TXN_FCT_ORC_SM.TXN_BOOK_DT_KEY",
> > > "partition_time_column": null,
> > > "partition_date_start": 0,
> > > "partition_date_format": "-MM-dd",
> > > "partition_time_format": "HH:mm:ss",
> > > "partition_type": "APPEND",
> > > "partition_condition_builder":
> > >
> > >
> >
> "org.apache.kylin.metadata.model.PartitionDesc$DefaultPartitionConditionBuilder"
> > >   }
> > > }
> > >
> > >
> > > CUBE:
> > > ===
> > >
> > > {
> > >   "name": "TEST_CUBE",
> > >   "model_name": "TEST_MODEL",
> > >   "description": "",
> > >   "dimensions": [
> > > {
> > >   "name": "CST_DIM_CST_NM",
> > >   "table": "SCHM.CST_DIM_ORC&quo

Re: V1.5 query no results

2016-03-30 Thread ShaoFeng Shi

Hi Jie,

I compared the Hadoop versions in 1.5 and 1.3; they are almost the same,
except for hbase-hadoop2.version, which is 0.98.8 in v1.3 but 0.98.4 in
v1.5.0. Maybe this has some impact; if you can't wait for the next
release, you can clone Kylin's v1.5.0 tag, change the pom.xml, make a new
binary package, and then deploy it in your cluster to see whether it solves
this issue.

Thank you for reporting this issue to Kylin!
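
The "change the pom.xml" step can be done by hand, but here is a minimal sketch in Python of the edit itself. It assumes the version is declared as a plain <hbase-hadoop2.version> element. POM_FRAGMENT is a made-up snippet rather than Kylin's real pom.xml, set_pom_property is a hypothetical helper name, and the version strings are the ones mentioned in this mail.

```python
import re

# Made-up stand-in for the <properties> block of a pom.xml.
POM_FRAGMENT = """<properties>
    <hbase-hadoop2.version>0.98.4</hbase-hadoop2.version>
</properties>"""


def set_pom_property(pom_text, name, value):
    """Replace the text of the single <name>...</name> element in pom_text.

    Raises ValueError unless exactly one such element is found, so a
    misspelled property name does not pass silently.
    """
    pattern = re.compile(
        r"(<%s>)([^<]*)(</%s>)" % (re.escape(name), re.escape(name))
    )
    new_text, count = pattern.subn(
        lambda m: m.group(1) + value + m.group(3), pom_text
    )
    if count != 1:
        raise ValueError("expected exactly one <%s>, found %d" % (name, count))
    return new_text


if __name__ == "__main__":
    print(set_pom_property(POM_FRAGMENT, "hbase-hadoop2.version", "0.98.8"))
```

A text-level replacement is used instead of an XML parser so that comments and formatting in the file are left untouched.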



2016-03-29 22:41 GMT+08:00 Jie Tao <jie@gameforge.com>:

>
> I see errors in the log:
> java.lang.IllegalArgumentException: No enum constant
> org.apache.hadoop.mapreduce
> .JobCounter.VCORES_MILLIS_REDUCES
>
> The problem may be my Hadoop version 2.7, according to this JIRA:
> https://issues.apache.org/jira/browse/KYLIN-1183.
>
> But: v1.2 and v1.3 run on Hadoop 2.7 without any problem. Maybe you can
> check v1.5 to find what was changed. Your suggested Hadoop version is
> 2.4-2.7. So I work with Hadoop 2.7.
>
> Best regards,
>
> Jie
>
>
> Am 29.03.2016 um 12:35 schrieb ShaoFeng Shi:
>
>> I checked this query in my sandbox with 1.5.0 binary package, it can
>> return
>> records;
>>
>> Just check a couple of things:
>> 1. whether the hive table has data;
>> 2. whether the cube has built the date range which covers 2012 to 2013;
>>
>>
>> 2016-03-29 17:06 GMT+08:00 Jie Tao <jie@gameforge.com>:
>>
>> Hi,
>>>
>>> I installed v1.5 and built the sample cube. All was fine. But when I
>>> query
>>> with
>>>
>>> select part_dt, sum(price) as total_selled, count(distinct seller_id) as
>>> sellers from kylin_sales group by part_dt order by part_dt
>>>
>>> I got No Result / Results (0). Kylin.log shows:
>>>
>>> User: ADMIN
>>> Success: true
>>> Duration: 0.172
>>> Project: learn_kylin
>>> Realization Names: [kylin_sales_cube]
>>> Cuboid Ids: [64]
>>> Total scan count: 0
>>> Result row count: 0
>>> Accept Partial: true
>>> Is Partial Result: false
>>> Hit Exception Cache: false
>>> Storage cache used: false
>>> Message: null
>>>
>>>
>>> I created a cube with my own table and also no results. I installed v1.2
>>> earlier but I deleted all tables (including kylin_metadata) in Hbase. I
>>> think there should be no problem now with version conflict. But I did not
>>> delte Hive table.
>>>
>>> What is wrong?
>>>
>>> Best regards,
>>>
>>> Jie
>>>
>>>
>>
>>
>


-- 
Best regards,

Shaofeng Shi

Re: V1.5 query no results

2016-03-29 Thread ShaoFeng Shi

I checked this query in my sandbox with 1.5.0 binary package, it can return
records;

Just check a couple of things:
1. whether the hive table has data;
2. whether the cube has built the date range which covers 2012 to 2013;


2016-03-29 17:06 GMT+08:00 Jie Tao <jie@gameforge.com>:

> Hi,
>
> I installed v1.5 and built the sample cube. All was fine. But when I query
> with
>
> select part_dt, sum(price) as total_selled, count(distinct seller_id) as
> sellers from kylin_sales group by part_dt order by part_dt
>
> I got No Result / Results (0). Kylin.log shows:
>
> User: ADMIN
> Success: true
> Duration: 0.172
> Project: learn_kylin
> Realization Names: [kylin_sales_cube]
> Cuboid Ids: [64]
> Total scan count: 0
> Result row count: 0
> Accept Partial: true
> Is Partial Result: false
> Hit Exception Cache: false
> Storage cache used: false
> Message: null
>
>
> I created a cube with my own table and also no results. I installed v1.2
> earlier but I deleted all tables (including kylin_metadata) in Hbase. I
> think there should be no problem now with version conflict. But I did not
> delte Hive table.
>
> What is wrong?
>
> Best regards,
>
> Jie
>



-- 
Best regards,

Shaofeng Shi

Re: V1.5 query no results

2016-03-31 Thread ShaoFeng Shi

v1.5.1 is on the way; it will include a couple of bug fixes. The
release date hasn't been decided yet. You will receive the vote email when it
kicks off.

2016-03-31 14:54 GMT+08:00 Jie Tao <jie@gameforge.com>:

> Thanks for the message. You are right, it may be Hbase problem. I have
> hbase-0.98.17-hadoop2. It may not be the Hadoop problem because I tested
> with Hadoop 2.6 and got the same error:
>
> Java.lang.IllegalArgumentException: No enum constant
> org.apache.hadoop.mapreduce
> .JobCounter.VCORES_MILLIS_REDUCES
>
> When is the  next release? I will wait. Now I work with v1.3. It is fine
> with my Hbase and Hadoop.
>
> Best regards,
>
> Jie
>
>
>
> Am 30.03.2016 um 16:36 schrieb ShaoFeng Shi:
>
>> Hi Jie,
>>
>> I compared the hadoop version in 1.5 and 1.3, they are almost the same,
>> except hbase-hadoop2.version, which is 0.98.8 on v1.3 but 0.98.4 on
>> v1.5.0). Maybe it has some impactions; if you couldn't wait for next
>> release, you can clone Kylin's v1.5.0 tag, change the pom.xml, make a new
>> binary package, and then deploy it in your cluster to see whether it solve
>> this issue.
>>
>> Thank you for reporting issue to Kylin!
>>
>>
>>
>> 2016-03-29 22:41 GMT+08:00 Jie Tao <jie@gameforge.com>:
>>
>> I see errors in the log:
>>> java.lang.IllegalArgumentException: No enum constant
>>> org.apache.hadoop.mapreduce
>>> .JobCounter.VCORES_MILLIS_REDUCES
>>>
>>> The problem may be my Hadoop version 2.7, according to this JIRA:
>>> https://issues.apache.org/jira/browse/KYLIN-1183.
>>>
>>> But: v1.2 and v1.3 run on Hadoop 2.7 without any problem. Maybe you can
>>> check v1.5 to find what was changed. Your suggested Hadoop version is
>>> 2.4-2.7. So I work with Hadoop 2.7.
>>>
>>> Best regards,
>>>
>>> Jie
>>>
>>>
>>> Am 29.03.2016 um 12:35 schrieb ShaoFeng Shi:
>>>
>>> I checked this query in my sandbox with 1.5.0 binary package, it can
>>>> return
>>>> records;
>>>>
>>>> Just check a couple of things:
>>>> 1. whether the hive table has data;
>>>> 2. whether the cube has built the date range which covers 2012 to 2013;
>>>>
>>>>
>>>> 2016-03-29 17:06 GMT+08:00 Jie Tao <jie@gameforge.com>:
>>>>
>>>> Hi,
>>>>
>>>>> I installed v1.5 and built the sample cube. All was fine. But when I
>>>>> query
>>>>> with
>>>>>
>>>>> select part_dt, sum(price) as total_selled, count(distinct seller_id)
>>>>> as
>>>>> sellers from kylin_sales group by part_dt order by part_dt
>>>>>
>>>>> I got No Result / Results (0). Kylin.log shows:
>>>>>
>>>>> User: ADMIN
>>>>> Success: true
>>>>> Duration: 0.172
>>>>> Project: learn_kylin
>>>>> Realization Names: [kylin_sales_cube]
>>>>> Cuboid Ids: [64]
>>>>> Total scan count: 0
>>>>> Result row count: 0
>>>>> Accept Partial: true
>>>>> Is Partial Result: false
>>>>> Hit Exception Cache: false
>>>>> Storage cache used: false
>>>>> Message: null
>>>>>
>>>>>
>>>>> I created a cube with my own table and also no results. I installed
>>>>> v1.2
>>>>> earlier but I deleted all tables (including kylin_metadata) in Hbase. I
>>>>> think there should be no problem now with version conflict. But I did
>>>>> not
>>>>> delte Hive table.
>>>>>
>>>>> What is wrong?
>>>>>
>>>>> Best regards,
>>>>>
>>>>> Jie
>>>>>
>>>>>
>>>>>
>>>>
>>
>


-- 
Best regards,

Shaofeng Shi

Re: How to create a streaming model.

2016-04-13 Thread ShaoFeng Shi

It's hard to describe all the steps in a few words. We owe a tutorial for this;
let me open a task for it. Please watch it for any updates:

https://issues.apache.org/jira/browse/KYLIN-1582

2016-04-13 19:39 GMT+08:00 陈佛林 <chenfo...@gmail.com>:

> There is no example or guide to tell me how to create a streaming model.
>
> I don't know how to configure kafka topics,is it just a simple string(such
> as "topic1")?
> and where to configure  kafka address
>



-- 
Best regards,

Shaofeng Shi

Re: version question

2016-04-14 Thread ShaoFeng Shi

https://kylin.apache.org/docs15/install/hadoop_env.html

2016-04-14 18:18 GMT+08:00 耳东 <775620...@qq.com>:

> hi all:
>
>
>The latest apache kylin should be installed on what version of hive
> hbase and hadoop?




-- 
Best regards,

Shaofeng Shi

Re: Question about cube size estimation in Kylin 1.5

2016-04-26 Thread ShaoFeng Shi

Hi Dayue,

Could you please open a JIRA for this, and make it configurable? As far as I
know, Kylin now allows cube-level configurations to overwrite kylin.properties;
with this you can customize the magic number at the cube level.

Thanks;
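
For reference, the estimation formula quoted below can be written out as a small Python sketch. The coefficient values (0.05 / 0.25) are the ones reported in the quoted mail. regions_by_ceiling is only a plain ceiling division used for comparison and is not claimed to be how Kylin actually cuts regions. Both function names are hypothetical.

```python
import math


def estimate_cuboid_size(rows, row_size, memory_hungry):
    """size(cuboid) = rows(cuboid) x row_size(cuboid) x coefficient."""
    coefficient = 0.05 if memory_hungry else 0.25
    return rows * row_size * coefficient


def regions_by_ceiling(size_gb, region_cut_gb):
    """Illustrative only: regions needed if split purely by size."""
    return max(1, math.ceil(size_gb / region_cut_gb))


if __name__ == "__main__":
    # Hypothetical cuboid: 1000 rows of 100 bytes each.
    print(estimate_cuboid_size(1000, 100, memory_hungry=False))
    print(estimate_cuboid_size(1000, 100, memory_hungry=True))
    # Actual sizes and region cuts from the two test cubes in the mail.
    print(regions_by_ceiling(49, 5), regions_by_ceiling(144, 10))
```

With the actual sizes reported below (49GB at a 5GB cut, 144GB at a 10GB cut), a plain ceiling split would give 10 and 15 regions, compared with the 2 regions that were created from the estimates.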

2016-04-25 15:01 GMT+08:00 Li Yang <liy...@apache.org>:

> The magic coefficient is due to hbase compression on keys and values, the
> final cube size is much smaller than the sum of all keys and all values.
> That's why multiplying the coefficient. It's totally by experience at the
> moment. It should vary depends on the key encoding and compression applied
> to HTable.
>
> At the minimal, we should make it configurable I think.
>
> On Mon, Apr 18, 2016 at 4:38 PM, Dayue Gao <dayue_...@163.com> wrote:
>
> > Hi everyone,
> >
> >
> > I made several cubing tests on 1.5 and found most of the time was spent
> on
> > the "Convert Cuboid Data to HFile" step due to lack of reducer
> parallelism.
> > It seems that the estimated cube size is too small compared to the actual
> > size, which leads to small number of regions (hence reducers) to be
> > created. The setup and result of the tests are like:
> >
> >
> > Cube#1: source_record=11998051, estimated_size=8805MB, coefficient=0.25,
> > region_cut=5GB, #regions=2, actual_size=49GB
> > Cube#2: source_record=123908390, estimated_size=4653MB, coefficient=0.05,
> > region_cut=10GB, #regions=2, actual_size=144GB
> >
> >
> > The "coefficient" is from CubeStatsReader#estimateCuboidStorageSize,
> which
> > looks mysterious to me. Currently the formula for cuboid size estimation
> is
> >
> >
> >   size(cuboid) = rows(cuboid) x row_size(cuboid) x coefficient
> >   where coefficient = has_memory_hungry_measures(cube) ? 0.05 : 0.25
> >
> >
> > Why do we multiply by the coefficient? And why is it five times smaller in
> > the memory hungry case? Could someone explain the rationale behind it?
> >
> >
> > Thanks, Dayue
> >
> >
> >
> >
> >
> >
> >
> >
>



-- 
Best regards,

Shaofeng Shi

Re: KYLIN-955 not work

2016-04-28 Thread ShaoFeng Shi

hi wang,

Could you please reopen KYLIN-955 and add your findings there? Thanks for
your feedback.

2016-04-28 12:16 GMT+08:00 alaleiwang <alaleiw...@sohu-inc.com>:

> As an LDAP non-admin user, HiveColumnCardinalityJob fails with
> mapreduce.job.queue defined in the file kylin_job.xml.
> I noticed that the same issue was fixed by KYLIN-955 after Kylin 1.3, so why
> does it still not work on my Kylin 1.5.1 (Hadoop 2.4.1)?
>
> I reviewed the related patch, and suspect the following code should be
> changed from:
>  conf.addResource(jobEngineConfig.getHadoopJobConfFilePath(null));
> to:
>  conf.addResource(new
> Path(jobEngineConfig.getHadoopJobConfFilePath(null)));
>
> difference between public void addResource(String name) and public void
> addResource(Path file) can be checked in:
>
> https://hadoop.apache.org/docs/r2.4.1/api/org/apache/hadoop/conf/Configuration.html#addResource(org.apache.hadoop.conf.Configuration)
>
> --
> View this message in context:
> http://apache-kylin.74782.x6.nabble.com/KYLIN-955-not-work-tp4313.html
> Sent from the Apache Kylin mailing list archive at Nabble.com.
>



-- 
Best regards,

Shaofeng Shi

Re: cube data and insight are not synchronized

2016-04-28 Thread ShaoFeng Shi

Hi Tao, this is a cache bug in Kylin 1.5.1; see
https://issues.apache.org/jira/browse/KYLIN-1612

2016-04-27 20:44 GMT+08:00 Tao Li(Internship) <tao...@envisioncn.com>:

> Hi,
>
>    The cube has been built successfully, but the tables are empty in the
> Insight page. After restarting the Kylin server, the tables show normally. Is
> there any way to synchronize the data manually?
>
>
> Best regards,
>
>
>
> Tao Li
>
>
>
>
>
>



-- 
Best regards,

Shaofeng Shi

Re: how to get the rate value

2016-04-26 Thread ShaoFeng Shi

Hi Dong, could you please open a Kylin JIRA to track this issue?
https://issues.apache.org/jira/secure/Dashboard.jspa

Thanks!
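
One possible explanation, which this thread does not confirm, is integer division: if both sums are integer-typed, a ratio below 1 truncates to 0 and a ratio between 1 and 2 truncates to 1. The sketch below reproduces that pattern in Python with the figures from the first row of the table quoted below. It only illustrates the arithmetic; it is not a verified description of how Kylin evaluates the query.

```python
succ, att = 336981, 368366  # first data row of the table quoted below

# Integer (floor) division discards the fractional part.
print(succ // att)  # 0
print(att // succ)  # 1

# Promoting one operand to floating point keeps the fraction.
ratio = float(succ) / att
print(round(ratio, 4))
```

If this is the cause, the SQL-side equivalent would be to cast one operand, for example CAST(sum(a) AS DOUBLE) / sum(b). Whether that works around the problem in this Kylin version is untested here.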

2016-04-26 20:56 GMT+08:00 耳东 <775620...@qq.com>:

> Hi all:
>
>
>   I want to get a value defined as sum(a)/sum(b). How can I do this kind
> of analysis?
>
>   I built a cube that has sum(a) and sum(b), but when I execute
> “select sum(a)/sum(b) from table1 group by c”, the result is wrong:
> sum(a)/sum(b) is always 0 and sum(b)/sum(a) is always 1.
>
>
>  MMENE_NAME   SUCC   ATT   SUCC/ATT
>  CSMME15BZX   336981   368366   1
>  CSMME32BZX   338754   366842   1
>  CSMME07BZX   687965   747694   1
>  CSMME03BHW   703269   747623   1
>  CSMME12BZX   705856   764656   1
>  CSMME16BHW   1962293142173   1
>
>
>MMENE_NAME   SUCC   ATT   ATT/SUCC
>  CSMME15BZX   336981   368366   0
>  CSMME32BZX   338754   366842   0
>  CSMME07BZX   687965   747694   0
>  CSMME03BHW   703269   747623   0
>  CSMME12BZX   705856   764656   0
>  CSMME16BHW   1962293142173   0




-- 
Best regards,

Shaofeng Shi

Re: Question about cube size estimation in Kylin 1.5

2016-04-26 Thread ShaoFeng Shi

The issue is very likely related to
https://issues.apache.org/jira/browse/KYLIN-1624. You can wait for v1.5.2,
or pick the HLL-related commits that Yang made on the master branch
yesterday.
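
As a rough cross-check of the figures quoted further down in this thread, the sketch below works the quoted formula backwards: it computes the coefficient that would have made each estimate match the actual size, and how many regions the actual size would need at the stated region cut. This is an illustration rather than Kylin code; the function names are hypothetical and it assumes 1GB = 1024MB.

```python
import math

MB_PER_GB = 1024  # assumption: the sizes in the thread use binary units


def implied_coefficient(coefficient, estimated_mb, actual_gb):
    """Coefficient that would have made the estimate equal the actual size."""
    return coefficient * (actual_gb * MB_PER_GB) / estimated_mb


def regions_needed(size_mb, region_cut_gb):
    """Number of regions if each region holds at most region_cut_gb."""
    return math.ceil(size_mb / (region_cut_gb * MB_PER_GB))


# Figures quoted in the thread for the two test cubes.
cubes = {
    "Cube#1": dict(coefficient=0.25, estimated_mb=8805, actual_gb=49, region_cut_gb=5),
    "Cube#2": dict(coefficient=0.05, estimated_mb=4653, actual_gb=144, region_cut_gb=10),
}

for name, c in cubes.items():
    implied = implied_coefficient(c["coefficient"], c["estimated_mb"], c["actual_gb"])
    by_actual = regions_needed(c["actual_gb"] * MB_PER_GB, c["region_cut_gb"])
    print(f"{name}: implied coefficient ~{implied:.2f}, regions at actual size: {by_actual}")
```

With these inputs the implied coefficient comes out above 1 for both cubes, which fits the report that the estimate was far too small to produce enough regions.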


2016-04-26 17:49 GMT+08:00 ShaoFeng Shi <shaofeng...@apache.org>:

> Hi Dayue,
>
> Could you please open a JIRA for this and make it configurable? As far as I
> know, Kylin now allows cube-level configurations to override kylin.properties;
> with this you can customize the magic number at the cube level.
>
> Thanks;
>
> 2016-04-25 15:01 GMT+08:00 Li Yang <liy...@apache.org>:
>
>> The magic coefficient is there because of HBase compression on keys and
>> values: the final cube size is much smaller than the sum of all keys and all
>> values. That's why we multiply by the coefficient. It's purely empirical at
>> the moment. It should vary depending on the key encoding and compression
>> applied to the HTable.
>>
>> At a minimum, we should make it configurable, I think.
>>
>> On Mon, Apr 18, 2016 at 4:38 PM, Dayue Gao <dayue_...@163.com> wrote:
>>
>> > Hi everyone,
>> >
>> >
>> > I made several cubing tests on 1.5 and found most of the time was spent
>> on
>> > the "Convert Cuboid Data to HFile" step due to lack of reducer
>> parallelism.
>> > It seems that the estimated cube size is too small compared to the
>> actual
>> > size, which leads to small number of regions (hence reducers) to be
>> > created. The setup and result of the tests are like:
>> >
>> >
>> > Cube#1: source_record=11998051, estimated_size=8805MB, coefficient=0.25,
>> > region_cut=5GB, #regions=2, actual_size=49GB
>> > Cube#2: source_record=123908390, estimated_size=4653MB,
>> coefficient=0.05,
>> > region_cut=10GB, #regions=2, actual_size=144GB
>> >
>> >
>> > The "coefficient" is from CubeStatsReader#estimateCuboidStorageSize,
>> which
>> > looks mysterious to me. Currently the formula for cuboid size
>> estimation is
>> >
>> >
>> >   size(cuboid) = rows(cuboid) x row_size(cuboid) x coefficient
>> >   where coefficient = has_memory_hungry_measures(cube) ? 0.05 : 0.25
>> >
>> >
>> > Why do we multiply by the coefficient? And why is it five times smaller in
>> > the memory-hungry case? Could someone explain the rationale behind it?
>> >
>> >
>> > Thanks, Dayue
>> >
>> >
>> >
>> >
>> >
>> >
>> >
>> >
>>
>
>
>
> --
> Best regards,
>
> Shaofeng Shi
>
>


-- 
Best regards,

Shaofeng Shi

Re: multi hadoop cluster

2016-04-26 Thread ShaoFeng Shi

If the cube has no data, you'd better first check the Hive source table.

Kylin runs the "hive -e" shell command to generate the intermediate flat table
in the first step of the cube job. If your "hive" command can read the right
table, Kylin should be able to as well.

The intermediate Hive table is dropped at the end of the cube build, but
you can rebuild the cube and check the intermediate table's content before
the build finishes. Or you can directly run the SQL that creates the
intermediate table (it can be found in the "parameter" of the first step).

With the intermediate table, you can check whether it has data. If it has
none, you can further check the filter conditions; sometimes a mismatched
date format causes this.

If the intermediate table has data while the cube has none, then that is a
problem, but I have almost never seen this before.
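
To illustrate how a date-format mismatch can silently empty the flat table, here is a small sketch. It assumes, hypothetically, that the partition column stores dates as yyyyMMdd strings while the filter uses yyyy-MM-dd literals like those in the query quoted below, and that the comparison is a plain string comparison. The helper name in_range is made up for this example.

```python
def in_range(value, start, end):
    """Mimic the filter `col >= start AND col < end` on plain strings."""
    return start <= value < end


start, end = "2016-03-01", "2016-04-13"  # literals from the quoted query

print(in_range("2016-03-15", start, end))  # True: formats match
print(in_range("20160315", start, end))    # False: same day, different format
```

The second call fails only because "0" sorts after "-", so the value is not below the upper bound. No error is raised, and the filter simply matches no rows.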

2016-04-27 10:08 GMT+08:00 bitbean <bitb...@qq.com>:

> Not quite a match.
>
>
> My case: the local Hive and the remote Hive share the metadata database, so
> my local Hive can see the fact table on the remote HDFS.
>
>
> I use the fact table on the remote HDFS. My trouble now is that Kylin builds
> the cube in 5 minutes, but the resulting HTable has no data.
>
>
> I guess Kylin doesn't fetch data from the remote HDFS, and Kylin doesn't
> tell me that it can't fetch the data.
>
>
> So can you help me resolve it?
>
>
>
>
> -- Original Message --
> From: "ShaoFeng Shi";<shaofeng...@apache.org>;
> Sent: Tuesday, April 26, 2016, 5:42 PM
> To: "dev"<dev@kylin.apache.org>;
>
> Subject: Re: multi hadoop cluster
>
>
>
> will this match your case?
> https://issues.apache.org/jira/browse/KYLIN-1172
>
> 2016-04-26 16:55 GMT+08:00 bitbean <bitb...@qq.com>:
>
> > Hi all,
> >
> >  I am encountering a problem with multiple Hadoop clusters.
> >
> >
> >  Kylin submits jobs to YARN on one HDFS, but my fact table is on another
> > HDFS. The two Hadoop clusters use the same MySQL to store metadata.
> >
> >
> > When I build a cube, the first step creates the intermediate table and
> > inserts data from the fact table.
> >
> >
> > But I can't access the fact table in Kylin's Hive.
> >
> >
> >  For example, the first step is as below:
> >
> >
> > "kylin_intermediate_cube8_2016030100_2016041300 SELECT
> > PARTNER_USR_DOC_BASIC_INFO_FT0_S.PHONE_PROVINCE_IND
> > FROM WLT_PARTNER.PARTNER_USR_DOC_BASIC_INFO_FT0_S as
> > PARTNER_USR_DOC_BASIC_INFO_FT0_S
> > WHERE (PARTNER_USR_DOC_BASIC_INFO_FT0_S.PT_LOG_D >= '2016-03-01' AND
> > PARTNER_USR_DOC_BASIC_INFO_FT0_S.PT_LOG_D < '2016-04-13')"
> >
> >
> >
> >  Table "PARTNER_USR_DOC_BASIC_INFO_FT0_S" is located at
> > "hdfs://hadoop2NameNode/wlt_partner/PARTNER_USR_DOC_BASIC_INFO_FT0_S"
> >
> >
> > but "kylin_intermediate_cube8_2016030100_2016041300" is located at
> > "hdfs://bihbasemaster/"
> >
> >
> > They are different clusters.
> >
> >
> > Currently, no error is shown in the web UI at step 1.
> >
> >
> >  When the cube is done, there is nothing in the HTable. What can I do?
>
>
>
>
> --
> Best regards,
>
> Shaofeng Shi
>



-- 
Best regards,

Shaofeng Shi

Re: Re: Re:Re: No detail message when Error.

2016-04-26 Thread ShaoFeng Shi

33:54,774 DEBUG [http-bio-7070-exec-9]
> service.QueryService:321 : done column metas
> 2016-04-26 13:33:56,732 DEBUG [http-bio-7070-exec-6]
> service.AdminService:90 : Get Kylin Runtime Config
> 2016-04-26 13:33:56,732 DEBUG [http-bio-7070-exec-3]
> controller.UserController:64 : authentication.getPrincipal() is
> org.springframework.security.core.userdetails.User@3b40b2f: Username:
> ADMIN; Password: [PROTECTED]; Enabled: true; AccountNonExpired: true;
> credentialsNonExpired: true; AccountNonLocked: true; Granted Authorities:
> ROLE_ADMIN,ROLE_ANALYST,ROLE_MODELER
> 2016-04-26 13:33:56,826 DEBUG [http-bio-7070-exec-8]
> controller.ProjectController:97 : authentication.getPrincipal() is
> org.springframework.security.core.userdetails.User@3b40b2f: Username:
> ADMIN; Password: [PROTECTED]; Enabled: true; AccountNonExpired: true;
> credentialsNonExpired: true; AccountNonLocked: true; Granted Authorities:
> ROLE_ADMIN,ROLE_ANALYST,ROLE_MODELER
> 2016-04-26 13:33:56,891 DEBUG [http-bio-7070-exec-1]
> service.QueryService:289 : getting table metas
> 2016-04-26 13:33:56,893 DEBUG [http-bio-7070-exec-1]
> service.QueryService:307 : getting column metas
> 2016-04-26 13:33:56,905 DEBUG [http-bio-7070-exec-1]
> service.QueryService:321 : done column metas
> 2016-04-26 13:34:12,127 INFO  [pool-4-thread-1]
> threadpool.DefaultScheduler:106 : Job Fetcher: 0 running, 0 actual running,
> 0 ready, 43 others
>
>
>
>
>
>
> At 2016-04-25 17:00:28, "Roger Shi" <rogershijich...@hotmail.com> wrote:
> >Hi Roy,
> >
> >Would you please provide all related logs fetched from
> >$KYLIN_HOME/logs/kylin.log? It will help in reproducing the issue and
> >finding the root cause.
> >
> >--
> >View this message in context:
> http://apache-kylin.74782.x6.nabble.com/No-detail-message-when-Error-tp4152p4260.html
> >Sent from the Apache Kylin mailing list archive at Nabble.com.
>



-- 
Best regards,

Shaofeng Shi

Re: cardinality number limit about raw expression

2016-04-26 Thread ShaoFeng Shi

The raw measure needs to encode the column values with a dictionary, and a
dictionary is not good for ultra-high cardinality. That's why it
complains. You can try the following as workarounds:

1) cut a big segment into several segments, if you were trying to build a
large data set at once;
2) set "kylin.dictionary.max.cardinality" in conf/kylin.properties to a
bigger value (default is 500).
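
Workaround 1 amounts to building the same date range in smaller pieces. A minimal sketch of computing such pieces follows. The name split_range and the 7-day step are illustrative assumptions, and each resulting range would still have to be submitted as its own segment build in Kylin.

```python
from datetime import date, timedelta


def split_range(start, end, days):
    """Yield consecutive [seg_start, seg_end) ranges covering [start, end)."""
    step = timedelta(days=days)
    seg_start = start
    while seg_start < end:
        seg_end = min(seg_start + step, end)
        yield seg_start, seg_end
        seg_start = seg_end


for seg_start, seg_end in split_range(date(2016, 3, 1), date(2016, 3, 20), days=7):
    print(seg_start.isoformat(), "->", seg_end.isoformat())
```

The ranges are half-open, so adjacent segments share a boundary date without overlapping, and the last segment is shortened to end exactly at the requested end date.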

2016-04-27 10:55 GMT+08:00 yubo-...@yolo24.com <yubo-...@yolo24.com>:

> hi all,
>
> I am using 1.5.1 for testing.
> When I add the raw expression on one column of the model, I get the following
> error message in the log file:
>
> Too high cardinality is not suitable for dictionary -- cardinality:
> 10886118
>
> My questions are:
>
> 1. Does this mean that the raw expression only allows a limited
> cardinality?
> 2. How do I modify the configuration of this limit for the raw
> expression (measure)?
>
> --
> View this message in context:
> http://apache-kylin.74782.x6.nabble.com/cardinality-number-limit-about-raw-expression-tp4286.html
> Sent from the Apache Kylin mailing list archive at Nabble.com.
>



-- 
Best regards,

Shaofeng Shi

Re: multi hadoop cluster

2016-04-26 Thread ShaoFeng Shi

Will this match your case? https://issues.apache.org/jira/browse/KYLIN-1172

2016-04-26 16:55 GMT+08:00 bitbean <bitb...@qq.com>:

> Hi all,
>
>  I am encountering a problem with multiple Hadoop clusters.
>
>
>  Kylin submits jobs to YARN on one HDFS, but my fact table is on another
> HDFS. The two Hadoop clusters use the same MySQL to store metadata.
>
>
> When I build a cube, the first step creates the intermediate table and
> inserts data from the fact table.
>
>
> But I can't access the fact table in Kylin's Hive.
>
>
>  For example, the first step is as below:
>
>
> "kylin_intermediate_cube8_2016030100_2016041300 SELECT
> PARTNER_USR_DOC_BASIC_INFO_FT0_S.PHONE_PROVINCE_IND
> FROM WLT_PARTNER.PARTNER_USR_DOC_BASIC_INFO_FT0_S as
> PARTNER_USR_DOC_BASIC_INFO_FT0_S
> WHERE (PARTNER_USR_DOC_BASIC_INFO_FT0_S.PT_LOG_D >= '2016-03-01' AND
> PARTNER_USR_DOC_BASIC_INFO_FT0_S.PT_LOG_D < '2016-04-13')"
>
>
>
>  Table "PARTNER_USR_DOC_BASIC_INFO_FT0_S" is located at
> "hdfs://hadoop2NameNode/wlt_partner/PARTNER_USR_DOC_BASIC_INFO_FT0_S"
>
>
> but "kylin_intermediate_cube8_2016030100_2016041300" is located at
> "hdfs://bihbasemaster/"
>
>
> They are different clusters.
>
>
> Currently, no error is shown in the web UI at step 1.
>
>
>  When the cube is done, there is nothing in the HTable. What can I do?




-- 
Best regards,

Shaofeng Shi

Re: The three default tables are not visible after installing Kylin

2016-04-28 Thread ShaoFeng Shi

> 2016-04-28 19:27:26,435 INFO  [pool-2-thread-1]
> threadpool.DefaultScheduler:106 : Job Fetcher: 0 running, 0 actual running,
> 0 ready, 17 others
> 2016-04-28 19:27:44,195 DEBUG [http-bio-7070-exec-2]
> controller.UserController:64 : authentication.getPrincipal() is
> org.springframework.security.core.userdetails.User@3b40b2f: Username:
> ADMIN; Password: [PROTECTED]; Enabled: true; AccountNonExpired: true;
> credentialsNonExpired: true; AccountNonLocked: true; Granted Authorities:
> ROLE_ADMIN,ROLE_ANALYST,ROLE_MODELER
> 2016-04-28 19:27:45,455 DEBUG [http-bio-7070-exec-9]
> controller.UserController:64 : authentication.getPrincipal() is
> org.springframework.security.core.userdetails.User@3b40b2f: Username:
> ADMIN; Password: [PROTECTED]; Enabled: true; AccountNonExpired: true;
> credentialsNonExpired: true; AccountNonLocked: true; Granted Authorities:
> ROLE_ADMIN,ROLE_ANALYST,ROLE_MODELER
> 2016-04-28 19:28:18,014 INFO  [pool-2-thread-1]
> threadpool.DefaultScheduler:106 : Job Fetcher: 0 running, 0 actual running,
> 0 ready, 17 others
>
>   --
>   --
>  Interface Platform Project Team - Tian Wei
>  Phone: 150 1410 4760
>  External email: 372861...@qq.com
>  Internal email: tian.wei...@zte.com.cn
>  QQ:   372861232
>  --
>
>
>
>
>
>
>
>  -- Original Message --
>   From: "lidong";<lid...@apache.org>;
>  Sent: Thursday, April 28, 2016, 7:32 PM
>  To: "dev"<dev@kylin.apache.org>;
>
>  Subject: Re: The three default tables are not visible after installing Kylin
>
>
>
> Please provide the log, otherwise we are not able to help.
>
>
> Thanks,
> Dong
>
>
> Original Message
> Sender:︶ㄣ溡簡~~372861...@qq.com
> Recipient:dev...@kylin.apache.org
> Date:Thursday, Apr 28, 2016 19:11
> Subject: Re: The three default tables are not visible after installing Kylin
>
>
> Yes, please help take a look.
>
>
>
>
> -- Original Message --
> From: "lidong";lid...@apache.org;
> Sent: Thursday, April 28, 2016, 7:08 PM
> To: "dev"dev@kylin.apache.org;
> Subject: Re: The three default tables are not visible after installing Kylin
>
>
> Have you checked logs/kylin.log?
>
>
> Thanks,
> Dong
>
>
> Original Message
> Sender:︶ㄣ溡簡~~372861...@qq.com
> Recipient:dev...@kylin.apache.org
> Date:Thursday, Apr 28, 2016 18:52
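> Subject: The three default tables are not visible after installing Kylin
>
>
> Deployed in distributed cluster mode; the versions are as follows:
> hadoop-2.4.0.tar.gz hbase-0.98.3-hadoop2-bin.tar.gz
> apache-hive-0.13.1-bin.tar.gz apache-kylin-1.5.1-bin.tar.gz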
> --
>



-- 
Best regards,

Shaofeng Shi

Re: [VOTE] Release apache-kylin-1.5.2 (release candidate 3)

2016-05-24 Thread ShaoFeng Shi

+1 (binding)

Verified the md5 and sha1 hashes and the signature; mvn test passed on my
laptop with Java HotSpot(TM) 1.7.0_71.
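
For readers who want to repeat the hash check, a minimal sketch using Python's standard hashlib module is shown below. The helper names digest_of and matches are hypothetical; the file name and sha1 value in the usage comment are the ones published in the vote mail quoted below.

```python
import hashlib


def digest_of(path, algorithm):
    """Return the hex digest of a file, reading it in 1 MiB chunks."""
    h = hashlib.new(algorithm)
    with open(path, "rb") as f:
        for chunk in iter(lambda: f.read(1 << 20), b""):
            h.update(chunk)
    return h.hexdigest()


def matches(path, algorithm, expected_hex):
    """Compare a file's digest with the published value, ignoring case."""
    return digest_of(path, algorithm) == expected_hex.strip().lower()


# Usage, after downloading the artifact to the current directory:
# matches("apache-kylin-1.5.2-src.tar.gz", "sha1",
#         "57c6e33dfcad34ddcfc805965b00ea6f1d04422f")
```

Matching hashes only show that the download is intact; checking the signature against the release manager's key is a separate step that this sketch does not cover.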



2016-05-24 13:40 GMT+08:00 Li Yang <liy...@apache.org>:

> +1 (binding)
>
> mvn test pass
>
> java version "1.7.0_79"
> OpenJDK Runtime Environment (rhel-2.5.5.1.el6_6-x86_64 u79-b14)
> OpenJDK 64-Bit Server VM (build 24.79-b02, mixed mode)
>
>
> On Mon, May 23, 2016 at 4:29 PM, Dong Li <lid...@apache.org> wrote:
>
> > Hi all,
> >
> > I have created a build for Apache Kylin 1.5.2, release candidate 3.
> >
> > Changes highlights:
> >
> >- [KYLIN-1077] - Support Hive View as Lookup Table
> >- [KYLIN-1515] - Make Kylin run on MapR
> >- [KYLIN-1600] - Download diagnosis zip from GUI
> >- [KYLIN-1672] - support kylin on cdh 5.7
> >
> >
> > Thanks to everyone who has contributed to this release.
> > Here are the release notes:
> >
> >
> https://issues.apache.org/jira/secure/ReleaseNote.jspa?projectId=12316121&version=12335528
> >
> > The commit to be voted upon:
> > https://github.com/apache/kylin/commit/af2646b72fbb6dc81699ad6661303fd612a2eebf
> >
> > Its hash is af2646b72fbb6dc81699ad6661303fd612a2eebf.
> >
> > The artifacts to be voted on are located here:
> > https://dist.apache.org/repos/dist/dev/kylin/apache-kylin-1.5.2-rc3/
> >
> > The hashes of the artifacts are as follows:
> > apache-kylin-1.5.2-src.tar.gz.md5  11542b6fd4cfc9ed844e8f626d5960b7
> > apache-kylin-1.5.2-src.tar.gz.sha1 57c6e33dfcad34ddcfc805965b00ea6f1d04422f
> >
> > A staged Maven repository is available for review at:
> > https://repository.apache.org/content/repositories/orgapachekylin-1028/
> >
> > Release artifacts are signed with the following key:
> > https://people.apache.org/keys/committer/lidong.asc
> >
> > Please vote on releasing this package as Apache Kylin 1.5.2.
> >
> > The vote is open for the next 72 hours and passes if a majority of
> > at least three +1 PPMC votes are cast.
> >
> > [ ] +1 Release this package as Apache Kylin 1.5.2
> > [ ] 0 I don't feel strongly about it, but I'm okay with the release
> > [ ] -1 Do not release this package because...
> >
> > Here is my vote:
> > +1 (binding)
> >
> > Thanks,
> > Dong Li
> >
>



-- 
Best regards,

Shaofeng Shi

Re: Help to go through your JIRA tickets

2016-05-11 Thread ShaoFeng Shi

Thanks, Dong, for the reminder; I have updated most of the JIRAs owned by
me. Looking forward to seeing v1.5.2.

2016-05-11 15:38 GMT+08:00 lidong <lid...@apache.org>:

> Hello contributors,
>
>
> We’re preparing to release v1.5.2.
>
>
> Please search the JIRA issues with:
> “project = KYLIN AND (fixVersion = v1.5.2) AND resolution = Unresolved AND
> assignee = currentUser() ORDER BY due ASC, priority DESC, created ASC”
> and clean up the JIRA tickets related to the v1.5.2 release: if an issue is
> already fixed, resolve it. Otherwise please uncheck “fix version:
> v1.5.2”, or label it as “fix version: v1.5.3”.
>
>
> It is also preferable to go through your recent JIRA tickets
> with “project = KYLIN AND assignee = currentUser() ORDER BY updated DESC” to
> see whether the fix version of any resolved issue is v1.5.2.
>
>
> Thanks,
> Dong Li




-- 
Best regards,

Shaofeng Shi

Re: About Cube List

2016-05-16 Thread ShaoFeng Shi

Mars, you're correct. Would you like to contribute a patch to Kylin for
this? A mature project depends on everyone's contributions.

2016-05-17 11:36 GMT+08:00 Mars J <xujiao.myc...@gmail.com>:

> Yes, I have done that. One project has about 10 models, and one model has
> about 30 cubes.
>
> 2016-05-17 11:33 GMT+08:00 hongbin ma <mahong...@apache.org>:
>
> > Would you consider using projects to manage your models?
> >
> > On Tue, May 17, 2016 at 11:27 AM, Mars J <xujiao.myc...@gmail.com>
> wrote:
> >
> > > Hi,
> > >    It's inconvenient to show more than 15 or 30 cubes in the list in the
> > > 'Model' tab. Even though there is a 'More' button, it's still not quick
> > > to find a specific cube.
> > >    I suggest adding a cube filter search bar under 'Model', the way
> > > Monitor does.
> > >
> > > Best Wishes!
> > >
> >
> >
> >
> > --
> > Regards,
> >
> > *Bin Mahone | 马洪宾*
> > Apache Kylin: http://kylin.io
> > Github: https://github.com/binmahone
> >
>



-- 
Best regards,

Shaofeng Shi
