[ANNOUNCE] New Committer: Jiatao Tao

2019-06-12 Thread ShaoFeng Shi
The Project Management Committee (PMC) for Apache Kylin
has invited Jiatao Tao to become a committer and we are pleased
to announce that he has accepted.

Thanks for all your hard work Jiatao; we look forward to more
contributions!

Please join me in extending congratulations to Jiatao!

Best regards,

Shaofeng Shi 史少锋
Apache Kylin PMC
Email: shaofeng...@apache.org

Apache Kylin FAQ: https://kylin.apache.org/docs/gettingstarted/faq.html
Join Kylin user mail group: user-subscr...@kylin.apache.org
Join Kylin dev mail group: dev-subscr...@kylin.apache.org


Re: AppendTrieDictionary can't retrieve value from id

2019-06-12 Thread ShaoFeng Shi
Hello,

In your cube, "CARD_ID" is used as both a dimension and a bitmap (count
distinct) measure. This is not allowed currently: to encode the column as an
integer for the bitmap, Kylin uses the "global dictionary", and the "global
dictionary" cannot be used for dimension encoding.

Please remove it from the dimensions in this cube and build again. If
you need it as a dimension, create another cube.
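
The rule above can be checked before a build with a few lines of Python. This is only a minimal sketch: the input shapes (a list of dimension column names and a list of (function, return type, column) tuples) are a hypothetical simplification, not Kylin's actual cube descriptor format, and the function name is made up for illustration.

```python
def find_conflicting_columns(dimensions, measures):
    """Return columns used both as a dimension and as a bitmap count-distinct measure.

    `dimensions` is an iterable of column names. `measures` is an iterable of
    (function, return_type, column) tuples. Both shapes are hypothetical.
    """
    dims = {name.upper() for name in dimensions}
    conflicts = []
    for function, return_type, column in measures:
        is_bitmap_distinct = (
            function.upper() == "COUNT_DISTINCT" and return_type.lower() == "bitmap"
        )
        col = column.upper()
        # Report each conflicting column once, in the order it is first seen.
        if is_bitmap_distinct and col in dims and col not in conflicts:
            conflicts.append(col)
    return conflicts
```

For the cube in this thread, passing "CARD_ID" in both lists would report it as a conflict, which matches the advice to drop it from the dimensions.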

Best regards,

Shaofeng Shi 史少锋
Apache Kylin PMC
Email: shaofeng...@apache.org





On Tue, Jun 11, 2019 at 4:12 PM lk_hadoop wrote:

> Hi all,
> My model JSON string is:
>
>
> {
>   "uuid": "c28014c1-7dae-6900-6264-8794b683ffa7",
>   "last_modified": 1560226964070,
>   "version": "2.6.1.0",
>   "name": "scrm_model",
>   "owner": "ADMIN",
>   "is_draft": false,
>   "description": "###",
>   "fact_table": "GJST.SH_FETCH_SALE_BASE_FACT_ALL_NEW",
>   "lookups": [
>     {
>       "table": "TEST.MEMBERSHIP_PRECISE_SELLING_EXTEND_V4",
>       "kind": "FACT",
>       "alias": "MEMBERSHIP_PRECISE_SELLING_EXTEND_V4",
>       "join": {
>         "type": "inner",
>         "primary_key": [
>           "MEMBERSHIP_PRECISE_SELLING_EXTEND_V4.CARD_ID"
>         ],
>         "foreign_key": [
>           "SH_FETCH_SALE_BASE_FACT_ALL_NEW.CARD_ID"
>         ]
>       }
>     }
>   ],
>   "dimensions": [
>     {
>       "table": "SH_FETCH_SALE_BASE_FACT_ALL_NEW",
>       "columns": [
>         "DATES",
>         "CARD_ID",
>         "TGOODS_ID",
>         "ENT_NAME",
>         "ORG_NAME",
>         "DATA_FROM",
>         "GOODS_NAME",
>         "ORG_NO",
>         "ATC1_NEW",
>         "ATC2_NEW",
>         "ATC3_NEW",
>         "ATC4_NEW",
>         "GOODS_ID"
>       ]
>     },
>     {
>       "table": "MEMBERSHIP_PRECISE_SELLING_EXTEND_V4",
>       "columns": [
>         "CARD_ID",
>         "USER_ID",
>         "SEX",
>         "AGE",
>         "BIRTHDAYS",
>         "NAME",
>         "NICK_NAME",
>         "IS_SUBSCRIBE_WX",
>         "IS_RECEIVE_CARD",
>         "SUBSCRIBE_TIME",
>         "SUBSCRIBE_STORE",
>         "ACTIVATE_TIME",
>         "ACTIVATE_STORE",
>         "FIRST_BUY_DATE",
>         "RECENT_CONSUME_DATE",
>         "RECENT_CONSUMPTION_INTERVAL_DAY",
>         "GAOXUEYA_BUYS",
>         "GAOXUEYA_FLAG",
>         "GAOXUEZHI_BUYS",
>         "GAOXUEZHI_FLAG",
>         "TANGNIAOBING_BUYS",
>         "TANGNIAOBING_FLAG",
>         "TOTAL_POINTS",
>         "REMAINDER_POINTS",
>         "TOTAL_COUPONS_NUMBER",
>         "AVAILABLE_COUPONS_NUMBER",
>         "TOTAL_USE_COUPONS_NUMBER",
>         "MAINTAIN_NUMBERS",
>         "MAINTAIN_TYPE",
>         "MARKET_PROGRAM",
>         "RECENT_MAINTAIN_INTERVAL_DAY",
>         "BELONG_STORE",
>         "BUSINESS_ID"
>       ]
>     }
>   ],
>   "metrics": [
>     "SH_FETCH_SALE_BASE_FACT_ALL_NEW.PAID_IN_AMT",
>     "SH_FETCH_SALE_BASE_FACT_ALL_NEW.TBILL_CODE",
>     "SH_FETCH_SALE_BASE_FACT_ALL_NEW.PROFIT"
>   ],
>   "filter_condition": "SH_FETCH_SALE_BASE_FACT_ALL_NEW.data_from_new <>'' and SH_FETCH_SALE_BASE_FACT_ALL_NEW.card_id is not null",
>   "partition_desc": {
>     "partition_date_column": "SH_FETCH_SALE_BASE_FACT_ALL_NEW.CDT",
>     "partition_time_column": null,
>     "partition_date_start": 0,
>     "partition_date_format": "-MM-dd",
>     "partition_time_format": "HH:mm:ss",
>     "partition_type": "APPEND",
>     "partition_condition_builder": "org.apache.kylin.metadata.model.PartitionDesc$DefaultPartitionConditionBuilder"
>   },
>   "capacity&

Re: Row Level Security and Column Level security on Kylin

2019-06-11 Thread ShaoFeng Shi
Today Kylin's ACL works at the table level; we think this is good enough for
99% of scenarios.

Row and column level control is one of the enterprise features. Kyligence's
Kylin distribution has that feature out of the box, but it is not open source.

Best regards,

Shaofeng Shi 史少锋
Apache Kylin PMC
Email: shaofeng...@apache.org





On Tue, Jun 11, 2019 at 6:31 AM bajpais wrote:

> Hi all ,
> I have a requirement to implement row level security and column level
> security on Kylin cube.
> Is this possible?
> I would appreciate if someone can share their approach as to how they
> implemented this for their use case?
>
> I am not sure whether we can create a view on top of the cube to implement
> the security, or whether we need to join with a security table and embed
> this as part of cube creation. The disadvantage I see in embedding the data
> as part of the join is that redundant data will be stored for each user who
> has access to the same set of rows in the security table.
>
> --
> Sent from: http://apache-kylin.74782.x6.nabble.com/
>


Re: How do I unsubscribe from the mailing list

2019-06-10 Thread ShaoFeng Shi
Hello,

Please send an empty email to dev-unsubscr...@kylin.apache.org; the list
server will reply asking you to confirm. Just reply to that email to confirm
the unsubscription.

The process for the u...@kylin.apache.org mailing list is the same; just
replace "dev" with "user".

Best regards,

Shaofeng Shi 史少锋
Apache Kylin PMC
Email: shaofeng...@apache.org





On Tue, Jun 11, 2019 at 10:50 AM 天涯 <401223...@qq.com> wrote:

> How do I unsubscribe from the mailing list? Thanks.


Re: ask about kylin 3.0 Go-Live Date

2019-06-10 Thread ShaoFeng Shi
Hello Bryan,

Kylin 3.0 alpha has been released, and the beta version is coming soon.
Have you tried the alpha version, or have you encountered any problems with
it? You can download it from the Kylin website, and the documentation is also
there:

https://kylin.apache.org/download/
https://kylin.apache.org/docs30/

Best regards,

Shaofeng Shi 史少锋
Apache Kylin PMC
Email: shaofeng...@apache.org





On Tue, Jun 11, 2019 at 10:36 AM Bryan Liu (CN) wrote:

> Dears,
>
>   I am from company Homecredit China .
>   I would like to ask when the new version 3.0 of Kylin will launch,
> because we have recently been looking for a real-time OLAP solution.
>
>   Waiting for your feedback.
> Thank you
>
> Best Regards
> Bryan.liu
> Homecredit CN, Tianjing, China.
>


[New blog] Kylin upgrade from 2.4.1 to 2.6.1

2019-06-10 Thread ShaoFeng Shi
Hello Kylin users,

A new article from community user Iñigo Martinez has been published on
the Kylin website.

In this article, Iñigo describes how to prepare and carry out an upgrade
from Kylin v2.4 to v2.6 with minimal downtime and low risk. He also shares
his impressions of the newer version. We hope this article helps you upgrade
smoothly. Here is the blog link:
https://kylin.apache.org/blog/2019/05/29/kylin-2.4.1-to-2.6.1/

Thanks again to Iñigo for the contribution!

Best regards,

Shaofeng Shi 史少锋
Apache Kylin PMC
Email: shaofeng...@apache.org



[ANNOUNCE] Gang Ma joins the Apache Kylin PMC

2019-06-02 Thread ShaoFeng Shi
On behalf of the Apache Kylin PMC, I am pleased to announce that Gang Ma
(马刚) has accepted our invitation to become a PMC member on the Apache Kylin
project. We appreciate Gang stepping up to take more responsibility in the
Kylin project.

Please join me in welcoming Gang to the Kylin PMC!

Best Regards,

Shaofeng Shi 史少锋
Apache Kylin PMC
Email: shaofeng...@apache.org



Re: Need More info to use KYLIN

2019-06-01 Thread ShaoFeng Shi
Hello Imran,

What is the "different approach"? Could you please elaborate?

Best regards,

Shaofeng Shi 史少锋
Apache Kylin PMC
Email: shaofeng...@apache.org





On Sat, Jun 1, 2019 at 10:48 PM Imran Samed wrote:

> Hi Kylin Team
>
> It's my first time using *KYLIN* for my project, but I want to use it with a
> different approach.
> So I need your help.
>
> I love to hear back from you.
>
>
> Thanks and Warm Regards
> Imran Samed
>


Re: Fixes for SonarQube issues

2019-05-30 Thread ShaoFeng Shi
Hi Diego,

Contributions are welcome! The previous JIRA was closed, but you can report a
new one.

I remember that in Jan 2019 a change in SonarCloud or TravisCI caused the
Sonar results to fail to upload to SonarCloud, so we disabled the Sonar check
temporarily. This means the results on SonarCloud might be out of date. We
can try again to see whether the environment problem has been fixed.

Best regards,

Shaofeng Shi 史少锋
Apache Kylin PMC
Email: shaofeng...@apache.org





On Thu, May 30, 2019 at 11:36 AM Billy Liu wrote:

> The contribution is welcome. Fixing Sonar issues is quite a good idea for
> beginners.
>
> With Warm regards
>
> Billy Liu
>
> On Thu, May 30, 2019 at 9:05 AM Diego Marcílio wrote:
> >
> > Hello,
> >
> > I'm willing to contribute with fixes for some SonarQube issues. I'm
> > checking the project on SonarCloud (
> > https://sonarcloud.io/dashboard?id=org.apache.kylin%3Akylin).
> >
> > There was an issue in JIRA with this goal but it's closed now.
> > https://issues.apache.org/jira/browse/KYLIN-3597
> >
> > Is that something of interest for the project? If yes, should a new issue
> > be opened?
> >
> > Thank you,
> > Diego.
>


Plan to host the first "Kylin Data Summit" event

2019-05-30 Thread ShaoFeng Shi
Hello Kylin developers and users,



We (Kyligence Inc) plan to host the first "Kylin Data Summit" event in
Shanghai, China. This event will provide a place to share and discuss
the technology and trends in the Big Data domain. The presenters and target
audience are big data engineers, data analysts and others who are
interested in the data analysis domain. I’m writing to the community for
approval to use the Apache Kylin trademark for this event. After
getting Kylin PMC approval, we will submit the request to the ASF VP of
Brand Management. The process is described at
https://www.apache.org/foundation/marks/events.html#approval



The information about this event is as follows:

   - *What is the topic focus of the event*

Big Data, OLAP, Apache Kylin and other big data technologies like Apache
Hadoop, Apache Spark, etc.

   - *Who is organising the event*

Kyligence Inc. and InfoQ China

   - *When is the event*

12th, July.

   - *How many attendees are expected*

500 expected attendees.

   - *How much PMC involvement is there already*

Some Apache Kylin PMC members are involved in the organization of the Kylin
Data Summit, such as Billy Liu, Yang Li and Shaofeng Shi. Several PMC members
will speak at this event: Luke, Yanghong Zhong.



   - *Which marks are requested*

The name and logo of Apache Kylin, the Apache feather logo.

   - *How would you propose that the ASF will be listed as a community
   partner?*

Community Partner

   - *How will the event selection work?*

We have a selection group for this event, formed by the PMC members who will
attend the event. We will invite community users to prepare proposals,
and the selection group will then vote on them.

   - *Is this for profit or non-profit?*

For profit, with some free tickets for community contributors.

   - *The event’s related site and marketing materials*

They are still under construction.



Please share your comments or suggestions, thank you!


Best regards,

Shaofeng Shi 史少锋
Apache Kylin PMC
Email: shaofeng...@apache.org



Re: Maven repository URL could not be accessed (http://repository.kyligence.io:8081/repository/maven-public/)

2019-05-28 Thread ShaoFeng Shi
Hello Sam,

Our engineers will change the service port to a default port. But this may
not happen quickly because they are busy until the end of June. Thank you
for the feedback!

Best regards,

Shaofeng Shi 史少锋
Apache Kylin PMC
Email: shaofeng...@apache.org





On Wed, May 22, 2019 at 9:19 AM Sam Lau - ITD wrote:

> Dear,
>
>
>
> Maven raises the exception below after the dependency
> *org.apache.kylin:kylin-core-metadata* is imported.
>
> My company network only allows PCs to reach outside URLs on ports 80 and
> 443.
>
> Port 8081 is not allowed in my company network.
>
>
>
> Could you help update the repository URL (using http:// or
> https://... with the default port)?
>
> Caused by: org.eclipse.aether.transfer.ArtifactTransferException: Failure
> to transfer org.apache.calcite:calcite-core:pom:1.16.0-kylin-r2 from
> http://repository.kyligence.io:8081/repository/maven-public/ was
> cached in the local repository, resolution will not be reattempted until
> the update interval of kyligence has elapsed or updates are forced.
> Original error: Could not transfer artifact
> org.apache.calcite:calcite-core:pom:1.16.0-kylin-r2 from/to kyligence (
> http://repository.kyligence.io:8081/repository/maven-public/): connect
> timed out
>
> at org.eclipse.aether.internal.impl.DefaultUpdateCheckManager.newException(DefaultUpdateCheckManager.java:240)
> at org.eclipse.aether.internal.impl.DefaultUpdateCheckManager.checkArtifact(DefaultUpdateCheckManager.java:208)
> at org.eclipse.aether.internal.impl.DefaultArtifactResolver.gatherDownloads(DefaultArtifactResolver.java:563)
> at org.eclipse.aether.internal.impl.DefaultArtifactResolver.performDownloads(DefaultArtifactResolver.java:481)
> at org.eclipse.aether.internal.impl.DefaultArtifactResolver.resolve(DefaultArtifactResolver.java:399)
>
>
>
> https://github.com/apache/kylin/blob/master/pom.xml#L1098
>
>
>
> http://repository.kyligence.io:8081/repository/maven-public/
>
>
>
>
>
> Best regards,
>
> Sam
>
>


Re: spark.yarn.executor.memoryOverhead' has been deprecated problem

2019-05-28 Thread ShaoFeng Shi
Hello Nithya,

You can remove "kylin.engine.spark-conf.spark.yarn.executor.memoryOverhead"
from conf/kylin.properties and kylin-defaults.properties. The
"kylin-defaults.properties" file is packaged in the
kylin-core-common-.jar, which is in the
tomcat/webapps/kylin/WEB-INF/lib/ folder; you need to unzip the jar, edit the
file, and then re-package it.
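
The unzip, edit and re-package step can also be scripted. Below is a minimal sketch in Python, assuming the jar is an ordinary zip archive and that the property sits on a single `key=value` line. The function name is hypothetical, and the result should be written to a new file so the original jar stays available as a backup.

```python
import io
import zipfile


def strip_property(jar_bytes, entry_name, key):
    """Return a copy of the jar with lines for `key` removed from one entry."""
    out = io.BytesIO()
    with zipfile.ZipFile(io.BytesIO(jar_bytes)) as src, \
            zipfile.ZipFile(out, "w", zipfile.ZIP_DEFLATED) as dst:
        for info in src.infolist():
            data = src.read(info.filename)
            if info.filename == entry_name:
                # latin-1 round-trips every byte, so unrelated lines are preserved.
                lines = data.decode("latin-1").splitlines()
                kept = [ln for ln in lines if ln.split("=", 1)[0].strip() != key]
                data = ("\n".join(kept) + "\n").encode("latin-1")
            dst.writestr(info.filename, data)
    return out.getvalue()
```

A typical use would read the jar bytes from disk, call `strip_property(data, "kylin-defaults.properties", "kylin.engine.spark-conf.spark.yarn.executor.memoryOverhead")`, and write the returned bytes to a new jar file.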

Could you please report this as a JIRA so that the developers can handle
it? A patch is also welcome.

BTW, please subscribe to dev@kylin.apache.org (drop an email to
dev-subscr...@kylin.apache.org) before sending an email; otherwise your
email will be pending until someone manually accepts it.

Best regards,

Shaofeng Shi 史少锋
Apache Kylin PMC
Email: shaofeng...@apache.org





On Wed, May 29, 2019 at 9:24 AM nithya.mb4...@gmail.com wrote:

> Can someone please respond what is the solution for this?
>
> --
> Sent from: http://apache-kylin.74782.x6.nabble.com/
>


Re: Error when querying with left join

2019-05-28 Thread ShaoFeng Shi
Define another cube with an inner join. Kylin will automatically match the
join type in the query against the cube/model definition.

Best regards,

Shaofeng Shi 史少锋
Apache Kylin PMC
Email: shaofeng...@apache.org





On Tue, May 28, 2019 at 3:16 PM Gods_Dusk <197795...@qq.com> wrote:

> inner join, and I modify to left join, it works well, thx a lot. besides,
> if
> I want to use both inner and left, how can I do?
>
>
> --
> Sent from: http://apache-kylin.74782.x6.nabble.com/
>


Re: Maven repository URL could not be accessed (http://repository.kyligence.io:8081/repository/maven-public/)

2019-05-21 Thread ShaoFeng Shi
Hello Sam,

I'm checking it. However, there might be some restrictions that prevent us
from providing the service on port 80/443.

As a temporary solution, maybe I can send you the mirror files, and you can
extract them into your local Maven repository.

Best regards,

Shaofeng Shi 史少锋
Apache Kylin PMC
Email: shaofeng...@apache.org







[Announce] Apache Kylin 2.6.2 released

2019-05-19 Thread ShaoFeng Shi
The Apache Kylin team is pleased to announce the immediate availability of
the 2.6.2 release.

This is a bugfix release after 2.6.1, with 9 enhancements and 27 bug fixes.
All of the changes in this release can be found in:
https://kylin.apache.org/docs/release_notes.html

You can download the source release and binary packages from Apache Kylin's
download page: https://kylin.apache.org/download/

Apache Kylin is an open source Distributed Analytics Engine designed to
provide SQL interface and multi-dimensional analysis (OLAP) on Apache
Hadoop, supporting extremely large datasets.

Apache Kylin lets you query massive datasets at sub-second latency in 3
steps:
1. Identify a star schema or snowflake schema data set on Hadoop.
2. Build Cube on Hadoop.
3. Query data with ANSI-SQL and get results in sub-second, via ODBC, JDBC
or RESTful API.

Thanks to everyone who has contributed to the 2.6.2 release.

We welcome your help and feedback. For more information on how to
report problems, and to get involved, visit the project website at
https://kylin.apache.org/

Best regards,

Shaofeng Shi 史少锋
Apache Kylin PMC
Email: shaofeng...@apache.org



[RESULT][VOTE] Release apache-kylin-2.6.2 (RC1)

2019-05-17 Thread ShaoFeng Shi
Thanks to everyone who has tested the release candidate and given
their comments and votes.

The tally is as follows.

4 binding +1s:
Shaofeng Shi
Dong Li
Kaisen Kang
Billy Liu


7 non-binding +1s:
Chunen Ni
Jiatao Tao
Temple Zhou
Na Zhai
Xiaoxiang Yu
Chao Long
Jianhua Peng


No 0s or -1s.

Therefore I am delighted to announce that the proposal to release
Apache-Kylin-2.6.2 has passed.


Best regards,

Shaofeng Shi 史少锋
Apache Kylin PMC
Email: shaofeng...@apache.org



[VOTE] Release apache-kylin-2.6.2 (RC1)

2019-05-13 Thread ShaoFeng Shi
Hi all,

I have created a build for Apache Kylin 2.6.2, release candidate 1.

Change highlights:
[KYLIN-3892] - Set cubing job priority
[KYLIN-3839] - Storage clean up after refreshing or deleting a segment
[KYLIN-3873] - Fix inappropriate use of memory in SparkFactDistinct.java
[KYLIN-3905] - Enable shrunken dictionary default
[KYLIN-3922] - Fail to update coprocessor when run DeployCoprocessorCLI
[KYLIN-3936] - MR/Spark task will still run after the job is stopped.


Thanks to everyone who has contributed to this release.
Here are the release notes:
https://issues.apache.org/jira/secure/ReleaseNote.jspa?projectId=12316121=12345051

The commit to be voted upon:

https://github.com/apache/kylin/commit/c507ae29fa64bc7234efd6a002dcfe990969ad35

Its hash is c507ae29fa64bc7234efd6a002dcfe990969ad35.

The artifacts to be voted on are located here:
https://dist.apache.org/repos/dist/dev/kylin/apache-kylin-2.6.2-rc1/

The hash of the artifact is as follows:
apache-kylin-2.6.2-source-release.zip.sha256
db2ab59d3e66d635462e9c9ef49fd7ca29342f07ff4eea0730e52777287e2ebf
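
Voters who want to check a downloaded artifact against the published hash could use something like the following. It is a minimal sketch using Python's standard `hashlib`; the function names and the local file path passed in are hypothetical, and it does not replace verifying the GPG signature.

```python
import hashlib


def sha256_of_file(path, chunk_size=1 << 20):
    """Compute the SHA-256 hex digest of a file, reading it in chunks."""
    digest = hashlib.sha256()
    with open(path, "rb") as handle:
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def matches_published_hash(path, published_hex):
    """Compare the file's digest with the hex string from the .sha256 file."""
    return sha256_of_file(path) == published_hex.strip().lower()
```

Calling `matches_published_hash` with the path of the downloaded source zip and the hex string listed above should return True for an intact download.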

A staged Maven repository is available for review at:
https://repository.apache.org/content/repositories/orgapachekylin-1062/

Release artifacts are signed with the following key:
https://people.apache.org/keys/committer/shaofengshi.asc

Please vote on releasing this package as Apache Kylin 2.6.2.

The vote is open for the next 72 hours and passes if a majority of
at least three +1 PMC votes are cast.

[ ] +1 Release this package as Apache Kylin 2.6.2
[ ]  0 I don't feel strongly about it, but I'm okay with the release
[ ] -1 Do not release this package because...


Here is my vote:

+1 (binding)

Best regards,

Shaofeng Shi 史少锋
Apache Kylin PMC
Email: shaofeng...@apache.org



[jira] [Created] (KYLIN-3999) Enable dynamic column by default

2019-05-10 Thread Shaofeng SHI (JIRA)
Shaofeng SHI created KYLIN-3999:
---

 Summary: Enable dynamic column by default
 Key: KYLIN-3999
 URL: https://issues.apache.org/jira/browse/KYLIN-3999
 Project: Kylin
  Issue Type: Improvement
  Components: Query Engine
Reporter: Shaofeng SHI


More and more users expect to use the "SUM(CASE WHEN)" feature and get an
error. The reason is that the dynamic column is disabled by default. We
should consider enabling it by default:

 

kylin.query.enable-dynamic-column=true



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


Re: Cartesian Join not supported

2019-05-09 Thread ShaoFeng Shi
Hi Nithya,

The fact table and the lookup table need to be joined explicitly. That means
the SQL needs to be:

select bla, bla FROM FACT inner join PRODUCT on FACT.fk = PRODUCT.pk group
by x, y
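
To see why a comma-separated FROM clause without a join condition is a Cartesian product, here is a small illustration. It runs against an in-memory SQLite database, not Kylin, and the sample rows and the function name are made up; only the table and column names follow the thread.

```python
import sqlite3


def join_row_counts():
    """Return (cartesian_rows, joined_rows) for a tiny FACT/PRODUCT pair."""
    conn = sqlite3.connect(":memory:")
    conn.executescript("""
        CREATE TABLE PRODUCT (PRODUCT_ID INTEGER PRIMARY KEY, PROD_DESC TEXT);
        CREATE TABLE FACT (PRODUCT_ID INTEGER, AMOUNT REAL);
        INSERT INTO PRODUCT VALUES (1, 'apple'), (2, 'pear');
        INSERT INTO FACT VALUES (1, 10.0), (1, 5.0), (2, 7.0);
    """)
    # No join condition: every FACT row is paired with every PRODUCT row.
    cartesian = conn.execute("SELECT COUNT(*) FROM FACT, PRODUCT").fetchone()[0]
    # Explicit join on the key: each FACT row matches exactly one PRODUCT row.
    joined = conn.execute(
        "SELECT COUNT(*) FROM FACT INNER JOIN PRODUCT "
        "ON FACT.PRODUCT_ID = PRODUCT.PRODUCT_ID"
    ).fetchone()[0]
    conn.close()
    return cartesian, joined
```

With 3 fact rows and 2 product rows, the first query pairs all 6 combinations while the explicit join keeps the 3 matching rows. Kylin rejects the first form instead of computing it.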


Best regards,

Shaofeng Shi 史少锋
Apache Kylin PMC
Email: shaofeng...@apache.org





On Thu, May 9, 2019 at 3:08 PM nithya.mb4...@gmail.com wrote:

> Hello,
> I have created a cube in KYLIN that has TOP_N measure. When I run the query
> in Insight I am getting Cartesian Join Not supported error. I have given
> the
> details in the below email thread. Can you please let me know If I am doing
> anything wrong? Or is my usage wrong? If yes, how to use this measure?
>
>
>
> Below are complete details:
>
> Fact: Product_ID, Amount.
> Product Dimension: Product_ID, PROD_DESC
> TIME Dimension: Time_ID, TIME_DESC
>
> In Kylin Cube:
> Dimensions: TIME.TIME_DESC
>
> Measure: TOP_N.
> Column: FACT.Amount
> GROUP BY: PRODUCT.PROD_DESC
>
> Query:
> SELECT SUM(FACT.AMOUNT), PRODUCT.PROD_DESC
> FROM FACT,PRODUCT
> GROUP BY PRODUCT.PROD_DESC
> ORDER BY SUM(FACT.AMOUNT) DESC;
>
> Error:
> Cartesian join not supported. While executing SQL:
> SELECT SUM(FACT.AMOUNT), PRODUCT.PROD_DESC
> FROM FACT,PRODUCT
> GROUP BY PRODUCT.PROD_DESC
> ORDER BY SUM(FACT.AMOUNT) DESC;
>
> Regards,
> Nithya
>
> --
> Sent from: http://apache-kylin.74782.x6.nabble.com/
>


Re: [DISCUSSION] Don't need to purge existing segment of cube to add new measures in Kylin

2019-04-26 Thread ShaoFeng Shi
Hi Yuzhang,

Please open a JIRA for this enhancement; if it can be implemented in an
elegant way, that will be great!

Best regards,

Shaofeng Shi 史少锋
Apache Kylin PMC
Email: shaofeng...@apache.org





On Tue, Apr 23, 2019 at 8:56 AM yuzhang wrote:

> Hi Shaofeng:
> We also ran some experiments on adding a measure after the cube has been
> built and encountered a byte error at the very start. The default mapping
> strategy between the HBase store and the measure definition is "multiple
> measures are stored in one column of a column family", which may cause a
> byte error when a new measure is inserted into the original measure
> sequence. Adding a column for the new measure may be better, I think.
>
> I just have a preliminary idea, maybe impractical for now, about the
> measure management design.
> Dimensions and metrics are defined once the model is designed. The measures
> aggregate the metrics over different dimensions to observe the data entities
> represented by the model. All of these belong to the design of the 'logical
> view', I think. The cube is a materialized view of this logical model; it is
> the bridge between the logical view and the physical storage (and the
> highway is set up). The life cycle of a measure may depend on the model
> rather than the cube.
>
> Based on this design, measure management can be set up after the model
> design is completed. We can define the measures based on the model. Cubes
> under the model can reuse those measures and build their segment data. When
> a SQL query arrives, the Kylin query server needs to find the suitable model
> with a suitable measure, then find the available cube.
>
> Of course, such a design change would have a very large impact on the
> existing Kylin architecture, and the query and metadata layers would change
> substantially. So it seems that it is still on paper.
> A more realistic or transitional design is to extend the metadata of
> the measure. Just as CubeDesc defines the schema and a related
> CubeInstance manages the built segments, MeasureDesc can also have a
> MeasureInstance to manage the segments containing it.
> I observed that Kylin's query service generates a GridTable for mapping
> between logical views and HBase physical storage: Cuboid + Measure -> Grid
> Table <- HBase store. This Grid Table is generated based on CubeDesc, and
> the mapping is performed for each segment. Therefore, in the mapping
> stage, the metadata makes it possible to know which columns of the Grid
> Table cannot be obtained from the current segment. So the measure data can
> be selectively read at the RS backend.
> But its life cycle is the same as MeasureDesc, managed by CubeDesc.
>
> Regarding adding dimensions to the same cube, we also need to consider
> aggregation groups and Rowkey order. I am curious and interested in how you
> implemented it.
>
>
>
>   Best regards
>
>
>   yuzhang
>
> yuzhang
> shifengdefan...@163.com
>
> On 4/22/2019 09:05, ShaoFeng Shi wrote:
>
> Hi Yuzhang,
>
> Glad to see such a discussion. Supporting "schema change" in a friendly
> way is what we should work on in the next phase, as we see this requirement
> is stronger than before.
>
> Last week I also tried 1) adding a dimension after the cube had been built,
> and 2) adding a measure after the cube had been built.
>
> For 1) I have an idea; the first try was successful, and I want to
> discuss it with the community some day.
>
> Attempt 2) failed: after a new measure was added, the query failed, and
> on the HBase RS side there was a byte parsing error. I didn't continue after
> that.
>
> Could you elaborate on your idea that "the measures of the analysis system
> can be decoupled from the materialized view(cube) and have their own
> management system"? Have you got a rough design for it? Thank you!
>
> Best regards,
>
> Shaofeng Shi 史少锋
> Apache Kylin PMC
> Email: shaofeng...@apache.org
>
>
>
>
>
> On Sun, Apr 21, 2019 at 8:08 PM yuzhang wrote:
>
> Hi

Re: Cube designed with 16 dimensions (star schema) stays stuck at this step; how to handle it?

2019-04-25 Thread ShaoFeng Shi
Hi,

For a 16-dimension cube, you need to optimize the dimension combinations
with "aggregation group", "mandatory", "hierarchy" and other settings. Otherwise,
the build may put a heavy workload on your cluster. You can check the MR job in the
YARN RM to see which mapper or reducer is the slowest.
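
To see why pruning matters, here is a rough sketch in Python (not Kylin code) of how the cuboid count shrinks for a single aggregation group. The function name estimate_cuboids and the example rule sizes are assumptions made for illustration. The counting model is simplified and ignores details such as multiple aggregation groups, so the numbers Kylin reports can differ.

```python
def estimate_cuboids(total_dims, mandatory=0, hierarchies=(), joints=()):
    """Rough cuboid count for one aggregation group (simplified model).

    total_dims  -- number of dimensions in the group
    mandatory   -- number of mandatory dimensions (always present)
    hierarchies -- sizes of hierarchy rules, e.g. (3,) for a 3-level hierarchy
    joints      -- sizes of joint rules, e.g. (4,) for 4 dimensions kept together
    """
    used = mandatory + sum(hierarchies) + sum(joints)
    if used > total_dims:
        raise ValueError("rule sizes exceed the number of dimensions")

    free_dims = total_dims - used
    count = 2 ** free_dims      # each unconstrained dimension is present or absent
    for size in hierarchies:
        count *= size + 1       # a k-level hierarchy allows only its k + 1 prefixes
    for _ in joints:
        count *= 2              # a joint group is either fully present or absent
    return count                # mandatory dimensions contribute a factor of 1


if __name__ == "__main__":
    print(estimate_cuboids(16))  # 65536 combinations without any rules
    print(estimate_cuboids(16, mandatory=2, hierarchies=(3,), joints=(4,)))  # 1024
```

In this model, 16 unconstrained dimensions give 65536 combinations, while two mandatory dimensions, one 3-level hierarchy and one 4-dimension joint group bring it down to 1024.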

Best regards,

Shaofeng Shi 史少锋
Apache Kylin PMC
Email: shaofeng...@apache.org





On Wed, Apr 24, 2019 at 10:02 PM Na Zhai wrote:

> Hi, gaofeng5096.
>
> I cannot see your picture; you can add it as an attachment. Can you provide
> more information, such as Model.json, Cube.json and the Kylin version?
>
> From: gaofeng5...@capinfo.com.cn
> Sent: Tuesday, April 23, 2019 10:34:50 PM
> To: dev
> Subject: Cube designed with 16 dimensions (star schema) stays stuck at this step; how to handle it?
>
> Hello, Kylin keeps getting stuck during preprocessing and no error is reported. What is going on?
>
> [inline image]
>
> gaofeng5...@capinfo.com.cn
>


Re: [DISCUSSION] Don't need to purge existing segment of cube to add new measures in Kylin

2019-04-21 Thread ShaoFeng Shi
Hi Yuzhang,

Glad to see such a discussion. Supporting "schema change" in a friendly
way is what we should work on in the next phase, as this requirement is
stronger than before.

Last week I also tried 1) adding a dimension after the cube has been built,
and 2) adding a measure after the cube has been built.

For 1) I have an idea; the first try was successful, and I want to
discuss it with the community some day.

2) failed: after a new measure was added, the query failed and there was a
byte parsing error on the HBase RS side. I didn't continue with that.

Could you elaborate on your idea that "the measures of the analysis system can
be decoupled from the materialized view(cube) and have their own management
system"? Do you have a rough design for it? Thank you!

Best regards,

Shaofeng Shi 史少锋
Apache Kylin PMC
Email: shaofeng...@apache.org





On Sun, Apr 21, 2019 at 8:08 PM yuzhang wrote:

> Hi JiaTao:
> Maybe an optional auto-complete mechanism among the different measure
> views is necessary, isn't it?
>
>
> yuzhang
>
>
> shifengdefan...@163.com
> On 4/20/2019 11:38, JiaTao Tao wrote:
> Hi
>
> The idea of letting Kylin add measures dynamically is impressive.
>
> But in my opinion, once you add a measure, the existing segments should
> also calculate the new measure (just add a new measure column). Users can
> have many cubes, and a cube can have many segments; if the measure view is
> different in each segment, it will increase the burden on the user.
>
> --
>
>
> Regards!
>
> Aron Tao
>
> On Sat, Apr 20, 2019 at 1:43 AM yuzhang wrote:
>
> Hi dear Kylin users and development team:
> There are some things I want to discuss with the community.
> As a representative MOLAP engine, Kylin uses pre-aggregation strategies
> to provide high-concurrency, second-level response analysis
> capabilities, but it also loses some flexibility.
> The limitation that existing segments must be purged before an additional
> measure can be added causes a lot of repeated calculation and unnecessary
> disk IO. Such waste should be avoided, especially in a MOLAP engine.
> For example, there is a cubeA with one measure m1 and segments over time
> range 1 (tr1). Now the user adds one measure m2, but doesn't want to clear the
> segments over tr1. The value of m2 will exist in tr2, the segments built
> subsequently. Of course, tr1 doesn't contain values of m2, which will be
> understood by a user who knows a little about MOLAP. Querying over tr1 and tr2
> is valid for both m1 and m2, but the result of m2 over tr1 will be null.
> It would be better to remind the user that the measure is missing. Moreover,
> refreshing will supply m2 to the segments over tr1.
> Currently, Kylin's storage engine uses HBase. The measures are aggregated
> values based on combinations of various dimension members and are stored in a
> column of a Column Family in HBase. For the same cube, adding a new measure
> will add a column to the HBase table (mapping) and will take effect in the
> next build. For the existing HTables (segments), the new column is allowed
> to be missing. Refreshing old existing segments will add a new column in
> their HTable to store the new measure. The value of the new measure is
> aggregated according to the combination of dimension members in the rowkey,
> without recalculating existing measures.
> Now, for additional measures and even additional dimensions, Kylin's
> current solution is Hybrid, but we found the following shortcomings during
> use:
> 1. Management costs: similar cubes have to be maintained repeatedly, and most
> of them share many dimensions and indicators. If you want to
> perform optimization operations such as pruning, you need to configure all
> of these cubes.
> 2. A large number of cubes: the initial analysis of the business is not
> stable, and analysts often need to add some measures. Cubes
> are added continuously to the Hybrid group, which produces a lot of
> cubes.
> 3. Repeated calculation: if you want to drop the old cube in the Hybrid
> group, you need to build the latest cube by computing historical data to
> cover the old cube.
> These result in a lot of waste.
> In addition, I felt that the metadata about measures was not ideal
> while applying Kylin.
> 1. As measures are one of the most important concerns of analysts, if the measures of
> the analysis system can be decoupled from the materialized view(cube) and
> have their own management system, it may be more flexible.
> 2. Once the dimensions have been chosen in cube design, its cuboids
> are confirmed no matter the

Re: Re: Deploy Apache Kylin with Standalone HBase Cluster

2019-04-18 Thread ShaoFeng Shi
Thanks for sharing. Does beeline with the "zookeeper discoverer model" also
work in the end?

Best regards,

Shaofeng Shi 史少锋
Apache Kylin PMC
Email: shaofeng...@apache.org





On Thu, Apr 18, 2019 at 10:34 AM lk_hadoop wrote:

> I made a mistake; both clusters work fine with NN HA configured.
>
> 2019-04-18
>
> lk_hadoop
>
>
>
> From: "lk_hadoop"
> Sent: 2019-04-17 15:20
> Subject: Re: Deploy Apache Kylin with Standalone HBase Cluster
> To: "dev"
> Cc:
>
> Finally I succeeded with the read-write separation deployment. Two
> points caused my failure:
> 1. When using beeline to connect to Hive, do not use the zookeeper discoverer
> model; connect to one of the hiveservers directly.
> 2. Do not configure NN HA to connect to the HBase cluster. Although I
> configured kylin.storage.hbase.cluster-hdfs-config-file=hbase.hdfs.xml,
> the job failed at the step: Convert Cuboid Data to HFile.
> Error message:
> java.lang.RuntimeException: Could not find any configured addresses for
> URI
> hdfs://nameservice1/user/mykylin/kylin_metadata/kylin-4fdee76b-6b73-087a-b9ad-6cf17dd84aad/kylin_sales_cube/hfile
>
>
> 2019-04-17
>
> lk_hadoop
>
>
>
> From: "lk_hadoop"
> Sent: 2019-04-16 15:30
> Subject: Deploy Apache Kylin with Standalone HBase Cluster
> To: "dev"
> Cc:
>
> Hi all,
> I want to try the read-write separation deployment. Should the standalone
> HBase cluster use the same HDFS as the main cluster? My HBase
> cluster is completely separate from the main cluster, and both clusters' HDFS
> use NN HA. I can't succeed with the read-write separation deployment.
>
> 2019-04-16
>
>
> lk_hadoop


[New blog] "Real-time Streaming Design in Apache Kylin"

2019-04-17 Thread ShaoFeng Shi
Hello,

Gang Ma, the core developer of Kylin Real-time OLAP, just composed a tech
blog on this feature. It will help to understand the purpose, the
architecture and the design. Welcome to read and share with others:

https://kylin.apache.org/blog/2019/04/12/rt-streaming-design/

Best regards,

Shaofeng Shi 史少锋
Apache Kylin PMC
Email: shaofeng...@apache.org



[Announce] Apache Kylin 3.0.0-alpha released

2019-04-16 Thread ShaoFeng Shi
The Apache Kylin team is pleased to announce the immediate availability of
the 3.0.0-alpha release.

This is the alpha release of v3.0, which introduces the new Real-time OLAP
feature; All of the changes in this release can be found in:
https://kylin.apache.org/docs/release_notes.html

You can download the source release and binary packages from Apache Kylin's
download page: https://kylin.apache.org/download/

Apache Kylin is an open source Distributed Analytics Engine designed to
provide SQL interface and multi-dimensional analysis (OLAP) on Apache
Hadoop, supporting extremely large datasets.

Apache Kylin lets you query massive datasets at sub-second latency in 3
steps:
1. Identify a star schema or snowflake schema data set on Hadoop.
2. Build Cube on Hadoop.
3. Query data with ANSI-SQL and get results in under a second, via ODBC, JDBC
or RESTful API.

Thanks to everyone who has contributed to the 3.0.0-alpha release.

We welcome your help and feedback. For more information on how to
report problems, and to get involved, visit the project website at
https://kylin.apache.org/

Best regards,

Shaofeng Shi 史少锋
Apache Kylin PMC
Email: shaofeng...@apache.org



Re: [ANNOUNCE] Kaisen Kang joins the Apache Kylin PMC

2019-04-15 Thread ShaoFeng Shi
Congratulations, Kaisen!

Best regards,

Shaofeng Shi 史少锋
Apache Kylin PMC
Email: shaofeng...@apache.org





On Tue, Apr 16, 2019 at 1:09 PM Luke Han wrote:

> On behalf of the Apache Kylin PMC I am pleased to announce that Kaisen Kang
> has accepted our invitation to become a PMC member on the Apache Kylin
> project. We appreciate Kaisen stepping up to take more responsibility in
> the Kylin project.
>
> Please join me in welcoming Kaisen to the Kylin PMC!
>
> Best Regards,
>
> Luke
>


Re: Re:RE: [VOTE] Release apache-kylin-3.0.0-alpha (RC1)

2019-04-13 Thread ShaoFeng Shi
Luke,

We couldn't reproduce your problem. What's your java version?

java -version
java version "1.8.0_111"
Java(TM) SE Runtime Environment (build 1.8.0_111-b14)
Java HotSpot(TM) 64-Bit Server VM (build 25.111-b14, mixed mode)

[INFO]
[INFO] Apache Kylin 3.0.0-alpha ... SUCCESS [ 3.455 s]
[INFO] Apache Kylin - Core Common ... SUCCESS [ 17.187 s]
[INFO] Apache Kylin - Core Metadata ... SUCCESS [ 47.146 s]
[INFO] Apache Kylin - Core Dictionary ... SUCCESS [03:30 min]
[INFO] Apache Kylin - Core Cube ... SUCCESS [01:13 min]
[INFO] Apache Kylin - Core Metrics ... SUCCESS [ 1.917 s]
[INFO] Apache Kylin - Core Job ... SUCCESS [01:00 min]
[INFO] Apache Kylin - Core Storage ... SUCCESS [ 8.465 s]
[INFO] Apache Kylin - Stream Core ... SUCCESS [ 48.921 s]
[INFO] Apache Kylin - MapReduce Engine ... SUCCESS [ 22.799 s]
[INFO] Apache Kylin - Spark Engine ... SUCCESS [ 34.518 s]
[INFO] Apache Kylin - Hive Source ... SUCCESS [ 7.270 s]
[INFO] Apache Kylin - DataSource SDK ... SUCCESS [ 9.339 s]
[INFO] Apache Kylin - Jdbc Source ... SUCCESS [ 19.476 s]
[INFO] Apache Kylin - Kafka Source ... SUCCESS [ 7.034 s]
[INFO] Apache Kylin - Cache ... SUCCESS [ 8.015 s]
[INFO] Apache Kylin - HBase Storage ... SUCCESS [ 33.344 s]
[INFO] Apache Kylin - Query ... SUCCESS [ 10.554 s]
[INFO] Apache Kylin - Metrics Reporter Hive ... SUCCESS [ 2.452 s]
[INFO] Apache Kylin - Metrics Reporter Kafka ... SUCCESS [ 1.040 s]
[INFO] Apache Kylin - Stream Source Kafka ... SUCCESS [ 4.048 s]
[INFO] Apache Kylin - Stream Coordinator ... SUCCESS [ 12.630 s]
[INFO] Apache Kylin - Stream Receiver ... SUCCESS [ 8.060 s]
[INFO] Apache Kylin - Stream Storage ... SUCCESS [ 1.847 s]
[INFO] Apache Kylin - REST Server Base ... SUCCESS [ 17.870 s]
[INFO] Apache Kylin - REST Server ... SUCCESS [02:02 min]
[INFO] Apache Kylin - JDBC Driver ... SUCCESS [ 5.836 s]
[INFO] Apache Kylin - Assembly ... SUCCESS [ 7.037 s]
[INFO] Apache Kylin - Tool ... SUCCESS [ 14.425 s]
[INFO] Apache Kylin - Tool Assembly ... SUCCESS [ 0.789 s]
[INFO] Apache Kylin - Integration Test ... SUCCESS [ 17.012 s]
[INFO] Apache Kylin - Tomcat Extension 3.0.0-alpha ... SUCCESS [ 0.905 s]
[INFO]
[INFO] BUILD SUCCESS
[INFO]
[INFO] Total time: 14:01 min
[INFO] Finished at: 2019-04-14T10:41:04+08:00

Best regards,

Shaofeng Shi 史少锋
Apache Kylin PMC
Email: shaofeng...@apache.org





On Sat, Apr 13, 2019 at 6:14 AM Luke Han wrote:

> I hit this issue:
>
> [ERROR] Failed to execute goal
> org.apache.maven.plugins:maven-surefire-plugin:2.21.0:test (default-test)
> on project kylin-storage-hbase: There are test failures.
>
> Any luck on your side?
>
> [ERROR] Tests run: 1, Failures: 0, Errors: 1, Skipped: 0, Time
> elapsed: 206.786 s <<< FAILURE! - in
> org.apache.kylin.storage.hbase.cube.v2.coprocessor.endpoint.CubeVisitServiceTest
>
> [ERROR]
> org.apache.kylin.storage.hbase.cube.v2.coprocessor.endpoint.CubeVisitServiceTest
> Time elapsed: 206.786 s  <<< ERROR!
>
> java.io.IOException: Shutting down
>
> at
>
> org.apache.kylin.storage.hbase.cube.v2.coprocessor.endpoint.CubeVisitServiceTest.setupBeforeClass(CubeVisitServiceTest.java:155)
>
> Caused by: java.lang.RuntimeException: Master not initialized after
> 20ms seconds
>
> at
>
> org.apache.kylin.storage.hbase.cube.v2.coprocessor.endpoint.CubeVisitServiceTest.setupBeforeClass(CubeVisitServiceTest.java:155)
>
> Best Regards!
> -
>
> Luke Han
>
>
> On Fri, Apr 12, 2019 at 10:00 AM Na Zhai  wrote:
>
> > +1
> >
> >
> >
> > mvn test passed
> >
> >
> >
> > From: Ma Gang
> > Sent: Thursday, April 11, 2019 3:01:59 PM
> > To: dev@kylin.apache.org
> > Subject: Re:RE: [VOTE] Release apache-kylin-3.0.0-alp

[RESULT][VOTE] Release apache-kylin-3.0.0-alpha (RC1)

2019-04-11 Thread ShaoFeng Shi
Thanks to everyone who has tested the release candidate and given
their comments and votes.

The tally is as follows:

3 binding +1s:
Shaofeng Shi
Billy Liu
Dong Li

1 non-binding +1:
Gang Ma

No 0s or -1s.

Therefore I am delighted to announce that the proposal to release
Apache-Kylin-3.0.0-alpha has passed.

Best regards,

Shaofeng Shi 史少锋
Apache Kylin PMC
Email: shaofeng...@apache.org



Apache Kylin Meetup @Beijing, April 13, 2019

2019-04-10 Thread ShaoFeng Shi
Hello Kylin users,



There will be a Kylin Meetup this Saturday (4/13) in Beijing, China.
Engineers from Xiaomi (小米), 58.com Inc. (58集团), and Kyligence will share
their use cases and experiences with Kylin.



Date: Saturday, April 13, 2019

Time: 1:00 PM - 5:30 PM

Location: 3WCoffee, No.70 West Street, Haidian District, Beijing

Language: Chinese

Fee: Free!



There are some seats left, so if you can come, please register here:
https://www.huodongxing.com/event/7484371439700.


Best regards,

Shaofeng Shi 史少锋
Apache Kylin PMC
Email: shaofeng...@apache.org



[VOTE] Release apache-kylin-3.0.0-alpha (RC1)

2019-04-07 Thread ShaoFeng Shi
Hi all,

I have created a build for Apache Kylin 3.0.0-alpha, release candidate 1.

Change highlights:
[KYLIN-3654] - Kylin Real-time Streaming
[KYLIN-3795] - Submit Spark jobs via Apache Livy
[KYLIN-3716] - FastThreadLocal replaces ThreadLocal
[KYLIN-3867] - Enable JDBC to use key store & trust store for https
connection
[KYLIN-3905] - Enable shrunken dictionary default
[KYLIN-3820] - Add a curator-based job scheduler
[KYLIN-3839] - Storage clean up after the refreshing and deleting a segment

Thanks to everyone who has contributed to this release.
Here are the release notes:
https://issues.apache.org/jira/secure/ReleaseNote.jspa?projectId=12316121&version=12345219

The commit to be voted upon:

https://github.com/apache/kylin/commit/8872e28de06b05b11a423f32ff62a5d00ed84813

Its hash is 8872e28de06b05b11a423f32ff62a5d00ed84813.

The artifacts to be voted on, including the source package and two
pre-compiled binary packages, are located here:
https://dist.apache.org/repos/dist/dev/kylin/apache-kylin-3.0.0-alpha-rc1/

The hashes of the artifacts are as follows:
apache-kylin-3.0.0-alpha-source-release.zip.sha256
0cdaa465dd2f80335807a89d39f4599b2ce638a267a03742827c5336e81e86fa
apache-kylin-3.0.0-alpha-bin-hbase1x.tar.gz.sha256
7edd41c522b641aad02f386ea5ea639fb574ae9e0934ea01be0ef6cb2c090ea9
apache-kylin-3.0.0-alpha-bin-cdh57.tar.gz.sha256
7f351edfaad6a5541390581d4ead12fefc4bb9ca837e97f54121a94a2bfeecac
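
A published SHA-256 value can be checked against a downloaded file with a few lines of Python. This is a minimal sketch using only the standard library; the helper names sha256_of and matches_published_hash, and the idea of passing the file path and hash on the command line, are assumptions for illustration and not part of the release process.

```python
import hashlib
import sys


def sha256_of(path, chunk_size=1 << 20):
    """Return the hex SHA-256 digest of a file, reading it in chunks."""
    digest = hashlib.sha256()
    with open(path, "rb") as handle:
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def matches_published_hash(path, published_hex):
    """Compare a local file against a published hex digest (case-insensitive)."""
    return sha256_of(path) == published_hex.strip().lower()


# Usage (paths are placeholders for a locally downloaded artifact):
#   python verify.py apache-kylin-3.0.0-alpha-source-release.zip \
#       0cdaa465dd2f80335807a89d39f4599b2ce638a267a03742827c5336e81e86fa
if __name__ == "__main__" and len(sys.argv) == 3:
    ok = matches_published_hash(sys.argv[1], sys.argv[2])
    print("hash matches" if ok else "hash MISMATCH")
```

Checking the hash only shows the download is intact; the signature check with the key listed in this mail is a separate step.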

A staged Maven repository is available for review at:
https://repository.apache.org/content/repositories/orgapachekylin-1061/

Release artifacts are signed with the following key:
https://people.apache.org/keys/committer/shaofengshi.asc

Please vote on releasing this package as Apache Kylin 3.0.0-alpha.

The vote is open for the next 72 hours and passes if a majority of
at least three +1 PMC votes are cast.

[ ] +1 Release this package as Apache Kylin 3.0.0-alpha
[ ]  0 I don't feel strongly about it, but I'm okay with the release
[ ] -1 Do not release this package because...
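
The passing rule stated above can be sketched as a small Python function. This only illustrates the rule as worded in this mail (at least three binding +1 votes, and more binding +1 than -1 votes); the function name and the encoding of votes as +1, 0 and -1 integers are assumptions.

```python
def vote_passes(binding_votes):
    """Return True if a release vote passes.

    binding_votes -- iterable of +1, 0 or -1, one entry per PMC member vote.
    Non-binding votes are not counted and should not be passed in.
    """
    votes = list(binding_votes)
    plus_ones = sum(1 for vote in votes if vote == 1)
    minus_ones = sum(1 for vote in votes if vote == -1)
    # Needs a minimum of three +1 votes and a majority of +1 over -1.
    return plus_ones >= 3 and plus_ones > minus_ones


if __name__ == "__main__":
    print(vote_passes([1, 1, 1]))  # True
    print(vote_passes([1, 1]))     # False
```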


Here is my vote:

+1 (binding)

Best regards,

Shaofeng Shi 史少锋
Apache Kylin PMC
Email: shaofeng...@apache.org



Re: [DISCUSS] Kylin 3.0 alpha and beta release before GA

2019-03-28 Thread ShaoFeng Shi
Hi Vino, thank you for raising this.

The Flink cubing engine is also a good candidate feature. Besides its
performance compared with Spark, we're also eager to see whether it can help
stream processing. Discussion on this is welcome.

Best regards,

Shaofeng Shi 史少锋
Apache Kylin PMC
Email: shaofeng...@apache.org





On Wed, Mar 27, 2019 at 7:35 PM vino yang wrote:

> Hi all,
>
> Thanks for your effort, ShaoFeng! Look forward to Kylin 3.0.
>
> The Flink cube engine still has some minor issues and documentation to be done.
>
> I hope it can be included in Kylin 3.0.
>
> Best,
> Vino
>
>
> On Tue, Mar 26, 2019 at 10:33 PM ShaoFeng Shi wrote:
>
>> Just now, we merged the real-time implementation into the master branch.
>> Some PRs were held in the past days for this merge; We will resume the PR
>> merge soon. Thanks for your support!
>>
>> Best regards,
>>
>> Shaofeng Shi 史少锋
>> Apache Kylin PMC
>> Email: shaofeng...@apache.org
>>
>>
>>
>>
>>
>> On Tue, Mar 26, 2019 at 6:58 PM Chao Long wrote:
>>
>> > Good feature, looking forward to it!
>> > --
>> > Best Regards,
>> > Chao Long
>> >
>> >
>> > -- Original Message --
>> > From: "ShaoFeng Shi";
>> > Sent: Monday, March 25, 2019 9:24 AM
>> > To: "Apache Kylin PMC";"dev"<dev@kylin.apache.org>;"user";
>> >
>> > Subject: [DISCUSS] Kylin 3.0 alpha and beta release before GA
>> >
>> >
>> >
>> > Hello,
>> >
>> > About two months ago, we raised the "[Discuss] Moving toward Apache Kylin
>> > 3.0" thread in the developer group, and all agreed to use 3.0 as the next
>> > major release version when the Real-Time feature is released. Now we're
>> > merging the code from the RT feature branch into the master branch.
>> >
>> > Although this feature has been in production with certain early users, it
>> > has not been widely evaluated by the community. I would like to propose
>> > releasing an alpha and a beta before the GA release, just as we did
>> > in Kylin v2.0. This gives our users enough time to evaluate it; on the
>> > other side, it gives the developers time to hear feedback, improve
>> > stability and performance, and catch up on documentation and other work.
>> >
>> > A rough plan is:
>> > - April, 3.0 alpha release
>> > - June, 3.0 beta release
>> > - July to Aug, 3.0 GA release
>> >
>> > Before 3.0 GA, the v2.6 branch will roll out bug-fix releases at a steady
>> > pace: usually one version every 1-2 months, depending on the severity of
>> > the reported issues.
>> >
>> > We warmly welcome community users to join the 3.0 alpha and beta.
>> > Please share your comments here. Thank you for your support of Apache
>> > Kylin!
>> >
>> > Best regards,
>> >
>> > Shaofeng Shi 史少锋
>> > Apache Kylin PMC
>> > Email: shaofeng...@apache.org
>> >
>>
>


Re: Getting all available tables and table metadata through ODBC

2019-03-27 Thread ShaoFeng Shi
Few people know how to debug ODBC; I'm afraid I cannot offer much help.

Best regards,

Shaofeng Shi 史少锋
Apache Kylin PMC
Email: shaofeng...@apache.org





On Fri, Mar 22, 2019 at 4:53 PM killyu wrote:

> Hello,
>   When writing a C# program, I can connect to Kylin through ODBC
> successfully, but an error occurs when I try to get the list of tables with
> the following code:
> var odbcConnection = new OdbcConnection(ConnectionString);
> using (odbcConnection)
> {
>     odbcConnection.Open();
>     var table = odbcConnection.GetSchema("Tables");
>
>     foreach (DataRow tableRow in table.Rows)
>     {
>         foreach (var column in tableRow.ItemArray)
>         {
>             Console.Write(column + ", ");
>         }
>
>         Console.WriteLine();
>     }
> }
> The line that fails is var table = odbcConnection.GetSchema("Tables"); and
> the error message is:
> Arithmetic operation resulted in an overflow.
>at System.Data.Odbc.OdbcDataReader.BuildMetaDataInfo()
>at System.Data.Odbc.OdbcDataReader.GetSchemaTable()
>at
> System.Data.Odbc.OdbcMetaDataFactory.NewDataTableFromReader(IDataReader
> reader, Object[]& values, String tableName)
>at
> System.Data.Odbc.OdbcMetaDataFactory.DataTableFromDataReader(IDataReader
> reader, String tableName)
>at System.Data.Odbc.OdbcMetaDataFactory.GetTablesCollection(String[]
> restrictions, OdbcConnection connection, Boolean isTables)
>at System.Data.Odbc.OdbcMetaDataFactory.PrepareCollection(String
> collectionName, String[] restrictions, DbConnection connection)
>at System.Data.ProviderBase.DbMetaDataFactory.GetSchema(DbConnection
> connection, String collectionName, String[] restrictions)
>at System.Data.Odbc.OdbcConnection.GetSchema(String collectionName,
> String[] restrictionValues)
>at KylinConnectTest.Program.Main(String[] args) in
> C:\vsprojects\temp\Test\MysqlConnectorTest\KylinConnectTest\Program.cs:line
> 27
>
> However, odbcConnection.GetSchema(); runs correctly.
> Is this approach feasible? Or is there another way to get information about
> all tables?


Re: [DISCUSS] Kylin 3.0 alpha and beta release before GA

2019-03-26 Thread ShaoFeng Shi
Just now, we merged the real-time implementation into the master branch.
Some PRs were held in the past days for this merge; We will resume the PR
merge soon. Thanks for your support!

Best regards,

Shaofeng Shi 史少锋
Apache Kylin PMC
Email: shaofeng...@apache.org





On Tue, Mar 26, 2019 at 6:58 PM Chao Long wrote:

> Good feature, looking forward to it!
> --
> Best Regards,
> Chao Long
>
>
> -- Original Message --
> From: "ShaoFeng Shi";
> Sent: Monday, March 25, 2019 9:24 AM
> To: "Apache Kylin PMC";"dev"<dev@kylin.apache.org>;"user";
>
> Subject: [DISCUSS] Kylin 3.0 alpha and beta release before GA
>
>
>
> Hello,
>
> About two months ago, we raised the "[Discuss] Moving toward Apache Kylin
> 3.0" thread in the developer group, and all agreed to use 3.0 as the next
> major release version when the Real-Time feature is released. Now we're
> merging the code from the RT feature branch into the master branch.
>
> Although this feature has been in production with certain early users, it
> has not been widely evaluated by the community. I would like to propose
> releasing an alpha and a beta before the GA release, just as we did
> in Kylin v2.0. This gives our users enough time to evaluate it; on the
> other side, it gives the developers time to hear feedback, improve
> stability and performance, and catch up on documentation and other work.
>
> A rough plan is:
> - April, 3.0 alpha release
> - June, 3.0 beta release
> - July to Aug, 3.0 GA release
>
> Before 3.0 GA, the v2.6 branch will roll out bug-fix releases at a steady
> pace: usually one version every 1-2 months, depending on the severity of
> the reported issues.
>
> We warmly welcome community users to join the 3.0 alpha and beta.
> Please share your comments here. Thank you for your support of Apache Kylin!
>
> Best regards,
>
> Shaofeng Shi 史少锋
> Apache Kylin PMC
> Email: shaofeng...@apache.org
>


[DISCUSS] Kylin 3.0 alpha and beta release before GA

2019-03-24 Thread ShaoFeng Shi
Hello,

About two months ago, we raised the "[Discuss] Moving toward Apache Kylin
3.0" thread in the developer group, and all agreed to use 3.0 as the next
major release version when the Real-Time feature is released. Now we're
merging the code from the RT feature branch into the master branch.

Although this feature has been in production with certain early users, it
has not been widely evaluated by the community. I would like to propose
releasing an alpha and a beta before the GA release, just as we did
in Kylin v2.0. This gives our users enough time to evaluate it; on the
other side, it gives the developers time to hear feedback, improve
stability and performance, and catch up on documentation and other work.

A rough plan is:
- April, 3.0 alpha release
- June, 3.0 beta release
- July to Aug, 3.0 GA release

Before 3.0 GA, the v2.6 branch will roll out bug-fix releases at a steady
pace: usually one version every 1-2 months, depending on the severity of
the reported issues.

We warmly welcome community users to join the 3.0 alpha and beta.
Please share your comments here. Thank you for your support of Apache Kylin!

Best regards,

Shaofeng Shi 史少锋
Apache Kylin PMC
Email: shaofeng...@apache.org



Re: Kylin ODBC driver is removed from download page

2019-03-23 Thread ShaoFeng Shi
Hello Kylin users,

The Kylin PMC has discussed this issue, and here is the conclusion:


   - Apache Kylin will release the ODBC driver in source code format,
   under Apache License v2;
   - If you need the binary, you can compile and build it from the source code
   on your own, or seek a third-party build.


Best regards,

Shaofeng Shi 史少锋
Apache Kylin PMC
Email: shaofeng...@apache.org





On Tue, Mar 12, 2019 at 9:39 AM ShaoFeng Shi wrote:

> Hello Kylin users,
>
> The Kylin ODBC driver was removed from the Apache Kylin download web page
> yesterday. The reason is that the ODBC driver binary package (.exe files)
> wasn't formally voted on by the PMC (although it has been there for some
> time). To be compliant with ASF policy, we removed it. The PMC will
> discuss the follow-up actions, but there is no ETA for when it can be
> restored.
>
> If you still need the ODBC driver, you can try to compile it from the
> source code ("odbc" folder in Kylin source release).
>
> On behalf of Apache Kylin PMC
>
> Best regards,
>
> Shaofeng Shi 史少锋
> Apache Kylin PMC
> Email: shaofeng...@apache.org
>
>
>
>


Re: Kylin Support Spark 1.6 version when start supporting spark 2.6

2019-03-21 Thread ShaoFeng Shi
The latest Spark release is v2.4; Kylin is using Spark v2.3.2 now, and so far
there is no plan to upgrade. If you find that a newer Spark version has
advantages, please share them with the community, thanks!

Best regards,

Shaofeng Shi 史少锋
Apache Kylin PMC
Email: shaofeng...@apache.org





On Thu, Mar 21, 2019 at 9:18 PM Na Zhai wrote:

> Hi, rsanadhya.
>
>
>
> What's your Kylin version? Kylin v2.6 supports Spark v2.3.2.
> Please refer to this issue:
> https://issues.apache.org/jira/browse/KYLIN-3272. And why do you want to use
> Spark v2.6?
>
>
>
> From: rsanad...@gmail.com
> Sent: Tuesday, March 19, 2019 2:27:19 PM
> To: dev@kylin.apache.org
> Subject: Kylin Support Spark 1.6 version when start supporting spark 2.6
>
> Hi all,
>
> I wanted to check: Kylin currently supports Spark 1.6, which is older,
> and the latest Spark 2.6 is available. When will Kylin start supporting Spark
> 2.6?
>
> Thanks,
> Rahul
>
> --
> Sent from: http://apache-kylin.74782.x6.nabble.com/
>


Re: Hbase table is always empty when build with spark

2019-03-19 Thread ShaoFeng Shi
Hi Alex,

Could you please file a JIRA issue for Kylin, or send a pull request if you
already have a hot-fix? Thank you!

Best regards,

Shaofeng Shi 史少锋
Apache Kylin PMC
Email: shaofeng...@apache.org





On Mon, Feb 25, 2019 at 5:18 PM mailpig wrote:

> Sure, the Hive table is not empty and the output directory of the HFile also
> has data.
>
> <http://apache-kylin.74782.x6.nabble.com/file/t635/IMG20190225_171051.png>
>
>
> After setting mapreduce.job.outputformat.class in the job config, loading the
> HFile into HBase succeeds.
> Besides that, I found that the source code had the above config in the first
> commit:
> ...
> HTable table = new HTable(hbaseConf,
>         cubeSegment.getStorageLocationIdentifier());
> try {
>     HFileOutputFormat2.configureIncrementalLoadMap(job, table);
> } catch (IOException ioe) {
>     // this can be ignored.
>     logger.debug(ioe.getMessage(), ioe);
> }
> ...
> But after commit 76c9c960be542c919301c72b34c7ae5ce6f1ec1c, the above
> config was deleted, and I don't know why. Please check.
>
> --
> Sent from: http://apache-kylin.74782.x6.nabble.com/
>


[RESULT][VOTE] Release apache-kylin-2.6.1 binary packages

2019-03-18 Thread ShaoFeng Shi
Thanks to everyone who has verified the binary packages.

The tally is as follows.

3 binding +1s:
Shaofeng Shi
Billy Liu
Yang Li

1 non-binding +1s:
Jianhua Peng

No 0s or -1s.

Therefore I am delighted to announce that the proposal to release
Apache-Kylin-2.6.1 binary package has passed.

Best regards,

Shaofeng Shi 史少锋
Apache Kylin PMC
Email: shaofeng...@apache.org

Apache Kylin FAQ: https://kylin.apache.org/docs/gettingstarted/faq.html
Join Kylin user mail group: user-subscr...@kylin.apache.org
Join Kylin dev mail group: dev-subscr...@kylin.apache.org


Re: [VOTE] Release apache-kylin-2.6.1 binary packages

2019-03-18 Thread ShaoFeng Shi
Thanks to Billy, Yang, and Jianhua for the verification! I will send out the
result soon.

Best regards,

Shaofeng Shi 史少锋
Apache Kylin PMC
Email: shaofeng...@apache.org

Apache Kylin FAQ: https://kylin.apache.org/docs/gettingstarted/faq.html
Join Kylin user mail group: user-subscr...@kylin.apache.org
Join Kylin dev mail group: dev-subscr...@kylin.apache.org




Jianhua Peng wrote on Mon, Mar 18, 2019 at 8:55 PM:

>
> +1
> On 2019/03/15 08:57:57, ShaoFeng Shi  wrote:
> > Hi all,
> >
> > The source code of apache-kylin-2.6.1 was released last week, on 3/8. We
> > have now prepared the binary packages of v2.6.1 for users' convenience.
> > Please review the binary packages and give your vote.
> >
> > The packages are at:
> > https://dist.apache.org/repos/dist/dev/kylin/apache-kylin-2.6.1-rc1/
> >
> > The hashes of the artifacts are as follows:
> > apache-kylin-2.6.1-bin-hbase1x.tar.gz -
> > f91f3ff0d6426f84e752cc1178fd704895842e9464ce5cd31c099b1f31eb6b68
> > apache-kylin-2.6.1-bin-hadoop3.tar.gz -
> > 6f06e94055d7639729f7879508669375a80eddd76c2a4880da38a0f7f223de44
> > apache-kylin-2.6.1-bin-cdh57.tar.gz  -
> > b5038da13bfbf7fbba9a46b4675b587c882a8e152d244b063c4a610d6000bd55
> > apache-kylin-2.6.1-bin-cdh60.tar.gz  -
> > d1ba39a6e288131a89e3c8e4d0959fd3c05c4ed42df1164df4d2ec9ddf55f92f
> >
> > The checks should include:
> >
> >- sigs and hashes must be OK
> >- the package must contain the correct NOTICE and LICENSE files for the
> >included content
> >- the package must not contain any content not derived from the source.
> >- in the case of bundled binaries, reviewers must check that all
> >contents are represented in the LICENSE (and NOTICE file if required).
> >The bundle must not contain any files that are prohibited from
> >distribution (category X).
> >
> >
> > Here is my vote:
> > +1 (binding)
> >
> > Thank you!
> >
> > Best regards,
> >
> > Shaofeng Shi 史少锋
> > Apache Kylin PMC
> > Email: shaofeng...@apache.org
> >
> > Apache Kylin FAQ: https://kylin.apache.org/docs/gettingstarted/faq.html
> > Join Kylin user mail group: user-subscr...@kylin.apache.org
> > Join Kylin dev mail group: dev-subscr...@kylin.apache.org
> >
>


Re: Reply: How kylin store data in Hbase ?

2019-03-18 Thread ShaoFeng Shi
Hi Rahul,

Please check these slides, which I made for last year's HBaseCon; pages 16/17
explain how Kylin stores cubes in HBase:

https://www.slideshare.net/ShiShaoFeng1/apache-kylin-on-hbase-extreme-olap-engine-for-big-data

Best regards,

Shaofeng Shi 史少锋
Apache Kylin PMC
Email: shaofeng...@apache.org

Apache Kylin FAQ: https://kylin.apache.org/docs/gettingstarted/faq.html
Join Kylin user mail group: user-subscr...@kylin.apache.org
Join Kylin dev mail group: dev-subscr...@kylin.apache.org




rsanad...@gmail.com wrote on Mon, Mar 18, 2019 at 4:13 PM:

> Hello Na Zhai,
>
> 2 mandatory dimensions and no hierarchy dimensions.
>
> Thanks,
> Rahul S
>
> --
> Sent from: http://apache-kylin.74782.x6.nabble.com/
>


Re: [Discussion] Enable shrunken dictionary by default

2019-03-17 Thread ShaoFeng Shi
+1.

Thanks to Xiaoxiang for raising this; Kylin has some advanced but hidden
features. As these features become stable, we should enable them by default
to benefit all users.

Please also raise similar discussions if you wish to enable other good
features.

Best regards,

Shaofeng Shi 史少锋
Apache Kylin PMC
Email: shaofeng...@apache.org

Apache Kylin FAQ: https://kylin.apache.org/docs/gettingstarted/faq.html
Join Kylin user mail group: user-subscr...@kylin.apache.org
Join Kylin dev mail group: dev-subscr...@kylin.apache.org




Zhong, Yanghong wrote on Mon, Mar 18, 2019 at 10:39 AM:

> +1.
>
> Best regards,
> Yanghong Zhong
>
> On 2019/3/18, 10:27 AM, "Xiaoxiang Yu"  wrote:
>
> Dear all,
> I suggest enabling "kylin.dictionary.shrunken-from-global-enabled" by
> default (it is currently disabled by default), because I found that
> enabling it speeds up the cube build when the cube has a count
> distinct (bitmap) measure on a high-cardinality column. This feature was
> contributed in KYLIN-3491.
>
> When a count distinct (bitmap) measure is used on a high-cardinality
> column (which requires a global dictionary), the Build Base Cuboid step
> needs frequent cache swaps, so it cannot finish within a reasonable
> period. KYLIN-3491 adds a new step that builds a separate dictionary for
> each InputSplit before the Build Base Cuboid step. Each mapper of the
> Build Base Cuboid step then only has to fetch a smaller dictionary for
> itself (without unused values) instead of the larger global dictionary.
> This reduces cache swapping and makes the Build Base Cuboid step run
> much faster.
>
> In my test environment, the Hadoop cluster is a CDH cluster with 56
> vcores and 110GB of memory. I created a model with a fact table
> (153326740 rows) and three dimension tables; there are three count
> distinct (bitmap) measures, and the largest single-column cardinality is
> 55200325. With ShrunkenDict disabled, Build Base Cuboid could not
> complete in 22 hours. By comparison, with ShrunkenDict enabled, the
> build completed in a reasonable time (the extra dictionary step took 5
> minutes, Build Base Cuboid took 5 minutes).
>
>
> https://user-images.githubusercontent.com/14030549/54363305-ad25e200-46a5-11e9-8bc7-fe2c385c0278.png
>
> If you want to know more, please check
> https://issues.apache.org/jira/browse/KYLIN-3491.
> If you have any suggestions, please let me know.
>
> 
> Best wishes,
> Xiaoxiang Yu
>
>
>
>
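The idea in the quoted proposal (each mapper fetches only the dictionary entries its own InputSplit needs, rather than the whole global dictionary) can be illustrated with a toy sketch. This is not Kylin's implementation: the class `ShrunkenDictSketch`, its methods, and the use of in-memory maps are all hypothetical, and a real global dictionary is a disk-backed trie rather than a `Map`.

```java
import java.util.Arrays;
import java.util.Collection;
import java.util.HashMap;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

/** Toy illustration only; names and data structures are hypothetical, not Kylin's. */
final class ShrunkenDictSketch {

    /** Assigns each distinct value a stable integer id, in first-seen order. */
    static Map<String, Integer> buildGlobal(List<String> allValues) {
        Map<String, Integer> dict = new LinkedHashMap<>();
        for (String v : allValues) {
            dict.putIfAbsent(v, dict.size());
        }
        return dict;
    }

    /** Keeps only the entries that occur in one split; ids stay identical to the global ones. */
    static Map<String, Integer> shrink(Map<String, Integer> global, Collection<String> splitValues) {
        Map<String, Integer> shrunken = new HashMap<>();
        for (String v : splitValues) {
            Integer id = global.get(v);
            if (id != null) {
                shrunken.put(v, id);
            }
        }
        return shrunken;
    }

    public static void main(String[] args) {
        Map<String, Integer> global = buildGlobal(Arrays.asList("a", "b", "c", "d", "a"));
        Map<String, Integer> forSplit = shrink(global, Arrays.asList("c", "a", "c"));
        System.out.println("global size = " + global.size() + ", split size = " + forSplit.size());
    }
}
```

Because the ids in the per-split dictionary are taken from the global one, bitmaps built by different mappers remain comparable; only the amount of dictionary data each mapper has to load changes.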


[jira] [Created] (KYLIN-3878) NPE to run sonar analysis

2019-03-15 Thread Shaofeng SHI (JIRA)
Shaofeng SHI created KYLIN-3878:
---

 Summary: NPE to run sonar analysis
 Key: KYLIN-3878
 URL: https://issues.apache.org/jira/browse/KYLIN-3878
 Project: Kylin
  Issue Type: Test
  Components: Tools, Build and Test
Reporter: Shaofeng SHI


mvn sonar:sonar -Dsonar.host.url=https://sonarcloud.io 
-Dsonar.organization=kylin -e

[INFO] 
[INFO] BUILD FAILURE
[INFO] 
[INFO] Total time: 03:13 min
[INFO] Finished at: 2019-03-15T14:42:16Z
[INFO] 
[ERROR] Failed to execute goal 
org.sonarsource.scanner.maven:sonar-maven-plugin:3.6.0.1398:sonar (default-cli) 
on project kylin: null: MojoExecutionException: NullPointerException -> [Help 1]
org.apache.maven.lifecycle.LifecycleExecutionException: Failed to execute goal 
org.sonarsource.scanner.maven:sonar-maven-plugin:3.6.0.1398:sonar (default-cli) 
on project kylin: null
 at org.apache.maven.lifecycle.internal.MojoExecutor.execute 
(MojoExecutor.java:213)
 at org.apache.maven.lifecycle.internal.MojoExecutor.execute 
(MojoExecutor.java:154)
 at org.apache.maven.lifecycle.internal.MojoExecutor.execute 
(MojoExecutor.java:146)
 at org.apache.maven.lifecycle.internal.LifecycleModuleBuilder.buildProject 
(LifecycleModuleBuilder.java:117)
 at org.apache.maven.lifecycle.internal.LifecycleModuleBuilder.buildProject 
(LifecycleModuleBuilder.java:81)
 at 
org.apache.maven.lifecycle.internal.builder.singlethreaded.SingleThreadedBuilder.build
 (SingleThreadedBuilder.java:56)
 at org.apache.maven.lifecycle.internal.LifecycleStarter.execute 
(LifecycleStarter.java:128)
 at org.apache.maven.DefaultMaven.doExecute (DefaultMaven.java:305)
 at org.apache.maven.DefaultMaven.doExecute (DefaultMaven.java:192)
 at org.apache.maven.DefaultMaven.execute (DefaultMaven.java:105)
 at org.apache.maven.cli.MavenCli.execute (MavenCli.java:956)
 at org.apache.maven.cli.MavenCli.doMain (MavenCli.java:290)
 at org.apache.maven.cli.MavenCli.main (MavenCli.java:194)
 at sun.reflect.NativeMethodAccessorImpl.invoke0 (Native Method)
 at sun.reflect.NativeMethodAccessorImpl.invoke 
(NativeMethodAccessorImpl.java:62)
 at sun.reflect.DelegatingMethodAccessorImpl.invoke 
(DelegatingMethodAccessorImpl.java:43)
 at java.lang.reflect.Method.invoke (Method.java:498)
 at org.codehaus.plexus.classworlds.launcher.Launcher.launchEnhanced 
(Launcher.java:289)
 at org.codehaus.plexus.classworlds.launcher.Launcher.launch (Launcher.java:229)
 at org.codehaus.plexus.classworlds.launcher.Launcher.mainWithExitCode 
(Launcher.java:415)
 at org.codehaus.plexus.classworlds.launcher.Launcher.main (Launcher.java:356)
Caused by: org.apache.maven.plugin.MojoExecutionException
 at org.sonarsource.scanner.maven.bootstrap.ScannerBootstrapper.execute 
(ScannerBootstrapper.java:67)
 at org.sonarsource.scanner.maven.SonarQubeMojo.execute (SonarQubeMojo.java:104)
 at org.apache.maven.plugin.DefaultBuildPluginManager.executeMojo 
(DefaultBuildPluginManager.java:137)
 at org.apache.maven.lifecycle.internal.MojoExecutor.execute 
(MojoExecutor.java:208)
 at org.apache.maven.lifecycle.internal.MojoExecutor.execute 
(MojoExecutor.java:154)
 at org.apache.maven.lifecycle.internal.MojoExecutor.execute 
(MojoExecutor.java:146)
 at org.apache.maven.lifecycle.internal.LifecycleModuleBuilder.buildProject 
(LifecycleModuleBuilder.java:117)
 at org.apache.maven.lifecycle.internal.LifecycleModuleBuilder.buildProject 
(LifecycleModuleBuilder.java:81)
 at 
org.apache.maven.lifecycle.internal.builder.singlethreaded.SingleThreadedBuilder.build
 (SingleThreadedBuilder.java:56)
 at org.apache.maven.lifecycle.internal.LifecycleStarter.execute 
(LifecycleStarter.java:128)
 at org.apache.maven.DefaultMaven.doExecute (DefaultMaven.java:305)
 at org.apache.maven.DefaultMaven.doExecute (DefaultMaven.java:192)
 at org.apache.maven.DefaultMaven.execute (DefaultMaven.java:105)
 at org.apache.maven.cli.MavenCli.execute (MavenCli.java:956)
 at org.apache.maven.cli.MavenCli.doMain (MavenCli.java:290)
 at org.apache.maven.cli.MavenCli.main (MavenCli.java:194)
 at sun.reflect.NativeMethodAccessorImpl.invoke0 (Native Method)
 at sun.reflect.NativeMethodAccessorImpl.invoke 
(NativeMethodAccessorImpl.java:62)
 at sun.reflect.DelegatingMethodAccessorImpl.invoke 
(DelegatingMethodAccessorImpl.java:43)
 at java.lang.reflect.Method.invoke (Method.java:498)
 at org.codehaus.plexus.classworlds.launcher.Launcher.launchEnhanced 
(Launcher.java:289)
 at org.codehaus.plexus.classworlds.launcher.Launcher.launch (Launcher.java:229)
 at org.codehaus.plexus.classworlds.launcher.Launcher.mainWithExitCode 
(Launcher.java:415)
 at org.codehaus.plexus.classworlds.launcher.Launcher.main (Launcher.java:356)
Caused by: java.lang.NullPointerException
 at org.A.E.

[VOTE] Release apache-kylin-2.6.1 binary packages

2019-03-15 Thread ShaoFeng Shi
Hi all,

The source code of apache-kylin-2.6.1 was released last week, on 3/8. We have
now prepared the binary packages of v2.6.1 for users' convenience.
Please review the binary packages and give your vote.

The packages are at:
https://dist.apache.org/repos/dist/dev/kylin/apache-kylin-2.6.1-rc1/

The hashes of the artifacts are as follows:
apache-kylin-2.6.1-bin-hbase1x.tar.gz -
f91f3ff0d6426f84e752cc1178fd704895842e9464ce5cd31c099b1f31eb6b68
apache-kylin-2.6.1-bin-hadoop3.tar.gz -
6f06e94055d7639729f7879508669375a80eddd76c2a4880da38a0f7f223de44
apache-kylin-2.6.1-bin-cdh57.tar.gz  -
b5038da13bfbf7fbba9a46b4675b587c882a8e152d244b063c4a610d6000bd55
apache-kylin-2.6.1-bin-cdh60.tar.gz  -
d1ba39a6e288131a89e3c8e4d0959fd3c05c4ed42df1164df4d2ec9ddf55f92f
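
A reviewer checking the hashes listed above could compare a downloaded package against its published value roughly as follows. This is a minimal sketch under two assumptions: that the 64-hex-digit values are SHA-256 digests (the message does not name the algorithm), and that a small standalone helper is acceptable; `Sha256Check` and its methods are hypothetical names, not part of Kylin or of the ASF release tooling.

```java
import java.io.IOException;
import java.io.InputStream;
import java.nio.file.Files;
import java.nio.file.Paths;
import java.security.MessageDigest;
import java.security.NoSuchAlgorithmException;

/** Hypothetical helper: compares a file's SHA-256 digest with a published hex value. */
final class Sha256Check {

    /** Streams the input through SHA-256 and returns the digest as lowercase hex. */
    static String sha256Hex(InputStream in) throws IOException, NoSuchAlgorithmException {
        MessageDigest md = MessageDigest.getInstance("SHA-256");
        byte[] buf = new byte[8192];
        int n;
        while ((n = in.read(buf)) != -1) {
            md.update(buf, 0, n);
        }
        StringBuilder hex = new StringBuilder();
        for (byte b : md.digest()) {
            hex.append(String.format("%02x", b));
        }
        return hex.toString();
    }

    /** Case-insensitive comparison that tolerates surrounding whitespace in the published value. */
    static boolean matches(String actualHex, String publishedHex) {
        return actualHex.equalsIgnoreCase(publishedHex.trim());
    }

    public static void main(String[] args) throws Exception {
        if (args.length != 2) {
            System.out.println("usage: java Sha256Check.java <package-file> <published-hex-digest>");
            return;
        }
        try (InputStream in = Files.newInputStream(Paths.get(args[0]))) {
            System.out.println(matches(sha256Hex(in), args[1]) ? "OK" : "MISMATCH");
        }
    }
}
```

A matching digest only shows that the download is intact; the signature check against the release manager's key is a separate step.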

The checks should include:

   - sigs and hashes must be OK
   - the package must contain the correct NOTICE and LICENSE files for the
   included content
   - the package must not contain any content not derived from the source.
   - in the case of bundled binaries, reviewers must check that all
   contents are represented in the LICENSE (and NOTICE file if required).
   The bundle must not contain any files that are prohibited from
   distribution (category X).


Here is my vote:
+1 (binding)

Thank you!

Best regards,

Shaofeng Shi 史少锋
Apache Kylin PMC
Email: shaofeng...@apache.org

Apache Kylin FAQ: https://kylin.apache.org/docs/gettingstarted/faq.html
Join Kylin user mail group: user-subscr...@kylin.apache.org
Join Kylin dev mail group: dev-subscr...@kylin.apache.org


Apache Kylin binary package temporarily removed from download page

2019-03-15 Thread ShaoFeng Shi
Hello Kylin users,

We found that the binary packages have some issues (missing NOTICE and LICENSE
files), so we have temporarily removed them from the download page and the
Apache mirrors.

We will re-package them and then call a vote in the PMC. Once they are
approved, we will publish them again. If you need Kylin during this period,
you can download the source release and compile it by
following the related document.

Thanks for your patience!

Best regards,

Shaofeng Shi 史少锋
Apache Kylin PMC
Email: shaofeng...@apache.org

Apache Kylin FAQ: https://kylin.apache.org/docs/gettingstarted/faq.html
Join Kylin user mail group: user-subscr...@kylin.apache.org
Join Kylin dev mail group: dev-subscr...@kylin.apache.org


Re: Kylin out-of-memory problem

2019-03-13 Thread ShaoFeng Shi
A slow query is a kind of warning. As the administrator, you need to
analyze why the query was so slow. In many cases it is due to poor cube
design, for example the dimension order in the Rowkey, the
aggregation groups, etc. You can search for Kylin best practices, or
check the FAQ: https://kylin.apache.org/docs/gettingstarted/faq.html

Besides, analyzing a memory dump is also recommended.

Best regards,

Shaofeng Shi 史少锋
Apache Kylin PMC
Email: shaofeng...@apache.org

Apache Kylin FAQ: https://kylin.apache.org/docs/gettingstarted/faq.html
Join Kylin user mail group: user-subscr...@kylin.apache.org
Join Kylin dev mail group: dev-subscr...@kylin.apache.org




wanmin_ws wrote on Wed, Mar 13, 2019 at 5:51 PM:

>
>
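> --
> From: wanmin_ws
> Sent: Tuesday, March 12, 2019, 16:58
> To: yuzhang
> Subject: Re: Kylin out-of-memory problem
>
> No extra configuration was done. Kylin simply crashed during the query peak,
> and once it crashed, of course no jobs could run. The kylin.log file looks
> normal; this is the only error message, otherwise the troubleshooting would
> not have taken so long. I would like to ask: 1. Can too many slow queries
> cause Kylin to crash? 2. What is Kylin's query concurrency? 3. Does every
> query need to create a session?
> Thank you.
> --
> From: yuzhang
> Sent: Saturday, March 9, 2019, 11:33
> To: dev@kylin.apache.org ; wanmin...@aliyun.com <
> wanmin...@aliyun.com>
> Cc: wanmin...@aliyun.com ; dev@kylin.apache.org <
> dev@kylin.apache.org>
> Subject: Re: Kylin out-of-memory problem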
>
>  Hi, wanmin
> I am interested in this problem and am doing some research on it.
> When the Kylin query instance went down, were the job and all other
> instances running normally? Was most of the concurrent load query
> requests? Did you redeploy or shut down the Kylin web app by hand before
> the exception occurred? Have any extra configurations been set in Tomcat?
> As Shaofeng Shi said, the log information is limited. More logs, or a
> way to reproduce the error, would be helpful.
>
>
>   Best regards
>
>   yuzhang
>
>
>
>
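> yuzhang
>
> shifengdefan...@163.com
>  Signature customized by NetEase Mail Master
> On March 7, 2019 at 11:08, wanmin_ws wrote:  Hello, could you please help answer this?
> --
> From: wanmin_ws
> Sent: Wednesday, March 6, 2019, 10:55
> To: wanmin_ws ; dev
> Cc: dev
> Subject: Re: Kylin out-of-memory problem
>
> Kylin 2.5.0 on the HDP big data platform, with 5 Kylin nodes in total: one
> "all" node, one "job" node, and three "query" nodes.
> The nodes that went down were all query nodes, but this error is reported
> on all 5 nodes.
> --
> From: yuzhang
> Sent: Tuesday, March 5, 2019, 17:37
> To: wanmin_ws
> Cc: dev
> Subject: Re: Kylin out-of-memory problem
>
> Hi, could you describe your deployment environment, your Kylin version, and
> the number of concurrent queries?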
>
>
> yuzhang
> Email:shifengdefan...@163.com
>
> Signature is customized by Netease Mail Master
>
> On March 5, 2019 at 17:27, wanmin_ws wrote:
> This problem only occurs during query peaks. Is it related to slow
> queries? The error is reported in kylin.out, and it shows that Tomcat
> cannot open more sessions.
> --
> From: ShaoFeng Shi
> Sent: Tuesday, March 5, 2019, 17:23
> To: dev ; wanmin_ws
> Cc: user
> Subject: Re: Kylin out-of-memory problem
>
> Hi Min,
>
> The log information is so limited that we don't know what may have caused this.
> I highly recommend that you do some analysis from the following perspectives:
> 1) check the log files in the "logs/" and "tomcat/logs" folders;
> 2) use jmap and jhat to analyze the heap usage;
> 3) use jstack to analyze the thread information;
> 4) check your cube definition to see whether there is a UHC dimension
> for which dictionary encoding is used.
>
> Best regards,
>
> Shaofeng Shi 史少锋
> Apache Kylin PMC
> Work email: shaofeng@kyligence.io
> Kyligence Inc: https://kyligence.io/
>
> Apache Kylin FAQ: https://kylin.apache.org/docs/gettingstarted/faq.html
> Join Kylin user mail group: user-subscr...@kylin.apache.org
> Join Kylin dev mail group: dev-subscr...@kylin.apache.org
>
>
>
>
> wanmin_ws wrote on Tue, Mar 5, 2019 at 5:03 PM:
>
>  Error description: during peak access periods, Kylin goes down. The log is
>  shown below, but I don't know what to do about it. Could you please help me
>  take a look? This problem has been bothering us for a long time.
>  Log:
>  SEVERE: The web application [/kylin] created a ThreadLocal with key of type
>  [java.lang.ThreadLocal] (value [java.lang.ThreadLocal@6cc63032]) and a
>  value of type [org.apache.kylin.rest.msg.Message] (value
>  [org.apache.kylin.rest.msg.Message@3ad18e2f]) but failed to remove it
>  when the web application was stopped. Threads are going to be renewed over
>  time to try and avoid a probable memory leak.
>  Mar 01, 2019 9:50:52 AM org.apache.catalina.loader.WebappClassLoaderBase
>  checkThreadLocalMapForLeaks
>  SEVERE: The web application [/kylin] created a ThreadLocal with key of type
>  [java.lang.ThreadLocal] (value [java.lang.ThreadLocal@76397470]) and a
>  value of type [org.apache.kylin.com

Kylin ODBC driver is removed from download page

2019-03-11 Thread ShaoFeng Shi
Hello Kylin users,

The Kylin ODBC driver was removed from the Apache Kylin download web page
yesterday. The reason is that the ODBC driver binary package (.exe files)
was never formally voted on by the PMC (although it has been there for some
time). To comply with ASF policy, we removed it. The PMC will
discuss the follow-up actions, but there is no ETA for when it can be
restored.

If you still need the ODBC driver, you can try to compile it from the
source code ("odbc" folder in Kylin source release).

On behalf of Apache Kylin PMC

Best regards,

Shaofeng Shi 史少锋
Apache Kylin PMC
Email: shaofeng...@apache.org

Apache Kylin FAQ: https://kylin.apache.org/docs/gettingstarted/faq.html
Join Kylin user mail group: user-subscr...@kylin.apache.org
Join Kylin dev mail group: dev-subscr...@kylin.apache.org


[jira] [Created] (KYLIN-3862) Check the binary packages

2019-03-09 Thread Shaofeng SHI (JIRA)
Shaofeng SHI created KYLIN-3862:
---

 Summary: Check the binary packages
 Key: KYLIN-3862
 URL: https://issues.apache.org/jira/browse/KYLIN-3862
 Project: Kylin
  Issue Type: Task
Reporter: Shaofeng SHI


As to the approval of binary packages:
 
It's not possible in general to check the exact contents of a binary, however 
there are some checks that should be made:
- sigs and hashes must be OK
- the package must contain the correct NOTICE and LICENSE files for the 
included content
- the package must not contain any content not derived from the source.
- in the case of bundled binaries, reviewers must check that all contents are 
represented in the LICENSE (and NOTICE file if required).
The bundle must not contain any files that are prohibited from distribution 
(category X).



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Created] (KYLIN-3863) Check the binary packages

2019-03-09 Thread Shaofeng SHI (JIRA)
Shaofeng SHI created KYLIN-3863:
---

 Summary: Check the binary packages
 Key: KYLIN-3863
 URL: https://issues.apache.org/jira/browse/KYLIN-3863
 Project: Kylin
  Issue Type: Task
Reporter: Shaofeng SHI


As to the approval of binary packages:
 
It's not possible in general to check the exact contents of a binary, however 
there are some checks that should be made:
- sigs and hashes must be OK
- the package must contain the correct NOTICE and LICENSE files for the 
included content
- the package must not contain any content not derived from the source.
- in the case of bundled binaries, reviewers must check that all contents are 
represented in the LICENSE (and NOTICE file if required).
The bundle must not contain any files that are prohibited from distribution 
(category X).



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


Re: [Discuss] Won't ship Spark binary in Kylin binary anymore

2019-03-09 Thread ShaoFeng Shi
Thanks, guys; I have updated the installation guide for this change:
https://kylin.apache.org/docs/install/index.html

Best regards,

Shaofeng Shi 史少锋
Apache Kylin PMC
Email: shaofeng...@apache.org

Apache Kylin FAQ: https://kylin.apache.org/docs/gettingstarted/faq.html
Join Kylin user mail group: user-subscr...@kylin.apache.org
Join Kylin dev mail group: dev-subscr...@kylin.apache.org




JiaTao Tao wrote on Sat, Mar 9, 2019 at 9:28 PM:

> +1
>
>
> --
>
>
> Regards!
>
> Aron Tao
>
> ShaoFeng Shi wrote on Fri, Mar 8, 2019 at 2:43 AM:
>
>> Hello,
>>
>> As we know, Kylin ships Spark in its binary package. The total package
>> grows bigger with each version; the latest version (v2.6.1)
>> is bigger than 350MB, and it was rejected by the Apache SVN server on
>> upload. Of the 350MB, more than 200MB is Spark, while
>> Spark is not mandatory for Kylin.
>>
>> So I propose excluding Spark from Kylin's binary package, starting from the
>> current v2.6.1; the user just needs to point SPARK_HOME to a folder
>> containing the expected Spark version, or manually download Spark and put
>> it in KYLIN_HOME/spark.  All other behaviors are not impacted.
>>
>> Just share your comments if any.
>>
>> Best regards,
>>
>> Shaofeng Shi 史少锋
>> Apache Kylin PMC
>> Email: shaofeng...@apache.org
>>
>> Apache Kylin FAQ: https://kylin.apache.org/docs/gettingstarted/faq.html
>> Join Kylin user mail group: user-subscr...@kylin.apache.org
>> Join Kylin dev mail group: dev-subscr...@kylin.apache.org
>>
>>
>>


[Announce] Apache Kylin 2.6.1 released

2019-03-08 Thread ShaoFeng Shi
The Apache Kylin team is pleased to announce the immediate availability of
the 2.6.1 release.

This is a bugfix release after 2.6.0, with 7 enhancements and 19 bug fixes;
All of the changes in this release can be found in:
https://kylin.apache.org/docs/release_notes.html

You can download the source release and binary packages from Apache Kylin's
download page: https://kylin.apache.org/download/

Apache Kylin is an open source Distributed Analytics Engine designed to
provide SQL interface and multi-dimensional analysis (OLAP) on Apache
Hadoop, supporting extremely large datasets.

Apache Kylin lets you query massive datasets at sub-second latency in 3
steps:
1. Identify a star schema or snowflake schema data set on Hadoop.
2. Build a Cube on Hadoop.
3. Query data with ANSI SQL and get results in sub-second time, via ODBC, JDBC
or RESTful API.

Thanks to everyone who has contributed to the 2.6.1 release.

We welcome your help and feedback. For more information on how to
report problems, and to get involved, visit the project website at
https://kylin.apache.org/

Best regards,

Shaofeng Shi 史少锋
Apache Kylin PMC
Email: shaofeng...@apache.org

Apache Kylin FAQ: https://kylin.apache.org/docs/gettingstarted/faq.html
Join Kylin user mail group: user-subscr...@kylin.apache.org
Join Kylin dev mail group: dev-subscr...@kylin.apache.org


[Discuss] Won't ship Spark binary in Kylin binary anymore

2019-03-07 Thread ShaoFeng Shi
Hello,

As we know, Kylin ships Spark in its binary package. The total package
grows bigger with each version; the latest version (v2.6.1)
is bigger than 350MB, and it was rejected by the Apache SVN server on
upload. Of the 350MB, more than 200MB is Spark, while
Spark is not mandatory for Kylin.

So I propose excluding Spark from Kylin's binary package, starting from the
current v2.6.1; the user just needs to point SPARK_HOME to a folder
containing the expected Spark version, or manually download Spark and put
it in KYLIN_HOME/spark.  All other behaviors are not impacted.

Just share your comments if any.

Best regards,

Shaofeng Shi 史少锋
Apache Kylin PMC
Email: shaofeng...@apache.org

Apache Kylin FAQ: https://kylin.apache.org/docs/gettingstarted/faq.html
Join Kylin user mail group: user-subscr...@kylin.apache.org
Join Kylin dev mail group: dev-subscr...@kylin.apache.org


[RESULT][VOTE] Release apache-kylin-2.6.1 (RC1)

2019-03-07 Thread ShaoFeng Shi
Thanks to everyone who has tested the release candidate and given
their comments and votes.

The tally is as follows.

4 binding +1s:
Shaofeng Shi
Luke Han
Billy Liu
Yanghong Zhong

8 non-binding +1s:
Temple Zhou
Chao Long
Jiatao Tao
Xiaoxiang Yu
Zhengshuai Peng
Rongchuan Jin
Chunen Ni
Yichen Zhou

No 0s or -1s.

Therefore I am delighted to announce that the proposal to release
Apache-Kylin-2.6.1 has passed.

Best regards,

Shaofeng Shi 史少锋
Apache Kylin PMC
Email: shaofeng...@apache.org

Apache Kylin FAQ: https://kylin.apache.org/docs/gettingstarted/faq.html
Join Kylin user mail group: user-subscr...@kylin.apache.org
Join Kylin dev mail group: dev-subscr...@kylin.apache.org


Re: [VOTE] Release apache-kylin-2.6.1 (RC1)

2019-03-07 Thread ShaoFeng Shi
Thanks, guys; the vote has lasted more than 72 hours, I will send out the
summary soon.

Best regards,

Shaofeng Shi 史少锋
Apache Kylin PMC
Work email: shaofeng@kyligence.io
Kyligence Inc: https://kyligence.io/

Apache Kylin FAQ: https://kylin.apache.org/docs/gettingstarted/faq.html
Join Kylin user mail group: user-subscr...@kylin.apache.org
Join Kylin dev mail group: dev-subscr...@kylin.apache.org




Yichen Zhou wrote on Thu, Mar 7, 2019 at 5:35 PM:

> +1
> mvn test passed
>
> -Yichen
>
> On Wed, Mar 6, 2019 at 4:44 AM Zhong, Yanghong  >
> wrote:
>
> > +1 binding
> >
> > mvn test passed
> >
> > LM-SHC-16507566:verify yangzhong$ mvn -v
> > Apache Maven 3.5.4 (1edded0938998edf8bf061f1ceb3cfdeccf443fe;
> > 2018-06-18T02:33:14+08:00)
> > Maven home: /usr/local/mvn/apache-maven-3.5.4
> > Java version: 1.8.0_192, vendor: Oracle Corporation, runtime:
> > /Library/Java/JavaVirtualMachines/jdk1.8.0_192.jdk/Contents/Home/jre
> > Default locale: en_CN, platform encoding: UTF-8
> > OS name: "mac os x", version: "10.14.3", arch: "x86_64", family: "mac"
> >
> > --
> > Best regards,
> > Yanghong Zhong
> >
> > On 2019/3/6, 8:24 PM, "zjsy...@163.com on behalf of nichunen" <
> > zjsy...@163.com on behalf of n...@apache.org> wrote:
> >
> > +1
> > mvn test passed
> > build success
> >
> > md5 verified
> >
> >
> > --
> >
> >
> > Best regards,
> >
> >
> >
> > Ni Chunen / George
> >
> >
> >
> >
> > At 2019-03-06 19:48:59, "Rongchuan Jin" 
> > wrote:
> > >+1 mvn test passed
> > >Best Regards
> > >Rongchuan Jin
> > >
> > >
> > >
> > >On 2019/3/6, 7:37 PM, "Billy Liu" wrote:
> > >
> > >+1 binding
> > >
> > >mvn test passed
> > >
> > >Apache Maven 3.5.4 (1edded0938998edf8bf061f1ceb3cfdeccf443fe;
> > >2018-06-18T02:33:14+08:00)
> > >Maven home: /usr/local/Cellar/maven/3.5.4
> > >Java version: 1.8.0_91, vendor: Oracle Corporation, runtime:
> > >
> > /Library/Java/JavaVirtualMachines/jdk1.8.0_91.jdk/Contents/Home/jre
> > >Default locale: en_US, platform encoding: UTF-8
> > >OS name: "mac os x", version: "10.14.3", arch: "x86_64", family:
> > "mac"
> > >
> > >With Warm regards
> > >
> > >Billy Liu
> > >
> > >Luke Han wrote on Wed, Mar 6, 2019 at 6:29 PM:
> > >>
> > >> +1 binding
> > >>
> > >    > Best Regards!
> > >> -
> > >>
> > >> Luke Han
> > >>
> > >>
> > >> On Wed, Mar 6, 2019 at 5:55 PM PENG Zhengshuai <
> > cosine...@hotmail.com>
> > >> wrote:
> > >>
> > >> > +1
> > >> >
> > >> > > On Mar 5, 2019, at 1:56 PM, JiaTao Tao <
> taojia...@gmail.com>
> > wrote:
> > >> > >
> > >> > > +1
> > >> > >
> > >> > > --
> > >> > >
> > >> > >
> > >> > > Regards!
> > >> > >
> > >> > > Aron Tao
> > >> > >
> > >> > >
> > >> > >
> > >> > > ShaoFeng Shi wrote on Mon, Mar 4, 2019 at 10:35 AM:
> > >> > >
> > >> > >> Hi all,
> > >> > >>
> > >> > >> I have created a build for Apache Kylin 2.6.1, release
> > candidate 1.
> > >> > >>
> > >> > >> Changes highlights:
> > >> > >> [KYLIN-3494] - Build cube with spark reports
> > >> > ArrayIndexOutOfBoundsException
> > >> > >> [KYLIN-3537] - Use Spark to build Cube on Yarn failed at
> > Setp8 on HDP3.
> > >> > >> [KYLIN-3815] - Unexpected behavior when joining the
> > streaming table and
> > >> > >> hive table
> > >> > >> [KYL

Re: Kylin out-of-memory problem

2019-03-05 Thread ShaoFeng Shi
Hi Min,

The log information is so limited that we don't know what may have caused this.
I highly recommend that you do some analysis from the following perspectives:
1) check the log files in the "logs/" and "tomcat/logs" folders;
2) use jmap and jhat to analyze the heap usage;
3) use jstack to analyze the thread information;
4) check your cube definition to see whether there is a UHC dimension
for which dictionary encoding is used.
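
jmap and jstack inspect a running JVM from the outside. Where attaching those tools is inconvenient, a rough in-process view of the same two signals (heap usage and thread states) can be obtained from the standard `java.lang.management` beans. The sketch below is only a lightweight complement to the tools named above, not a replacement for a heap dump; the class `JvmSnapshot` is a hypothetical name and is not part of Kylin.

```java
import java.lang.management.ManagementFactory;
import java.lang.management.MemoryUsage;
import java.lang.management.ThreadInfo;
import java.util.EnumMap;
import java.util.Map;

/** Hypothetical helper: summarizes heap usage and thread states of the current JVM. */
final class JvmSnapshot {

    /** Counts live threads per state (RUNNABLE, BLOCKED, WAITING, ...). */
    static Map<Thread.State, Integer> threadStateCounts() {
        Map<Thread.State, Integer> counts = new EnumMap<>(Thread.State.class);
        ThreadInfo[] infos = ManagementFactory.getThreadMXBean().dumpAllThreads(false, false);
        for (ThreadInfo info : infos) {
            if (info != null) {
                counts.merge(info.getThreadState(), 1, Integer::sum);
            }
        }
        return counts;
    }

    /** Used heap as a fraction of the maximum heap (or of committed heap if no maximum is defined). */
    static double heapUsedFraction() {
        MemoryUsage heap = ManagementFactory.getMemoryMXBean().getHeapMemoryUsage();
        long limit = heap.getMax() > 0 ? heap.getMax() : heap.getCommitted();
        return (double) heap.getUsed() / limit;
    }

    public static void main(String[] args) {
        System.out.println("heap used fraction: " + heapUsedFraction());
        System.out.println("thread states: " + threadStateCounts());
    }
}
```

A steadily rising heap fraction or a growing count of BLOCKED or WAITING threads during the query peak would be a reason to take a full heap dump or thread dump with the external tools.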

Best regards,

Shaofeng Shi 史少锋
Apache Kylin PMC
Work email: shaofeng@kyligence.io
Kyligence Inc: https://kyligence.io/

Apache Kylin FAQ: https://kylin.apache.org/docs/gettingstarted/faq.html
Join Kylin user mail group: user-subscr...@kylin.apache.org
Join Kylin dev mail group: dev-subscr...@kylin.apache.org




wanmin_ws wrote on Tue, Mar 5, 2019 at 5:03 PM:

> Error description: during peak access periods, Kylin goes down. The log is
> shown below, but I don't know what to do about it. Could you please help me
> take a look? This problem has been bothering us for a long time.
> Log:
> SEVERE: The web application [/kylin] created a ThreadLocal with key of type
> [java.lang.ThreadLocal] (value [java.lang.ThreadLocal@6cc63032]) and a
> value of type [org.apache.kylin.rest.msg.Message] (value
> [org.apache.kylin.rest.msg.Message@3ad18e2f]) but failed to remove it
> when the web application was stopped. Threads are going to be renewed over
> time to try and avoid a probable memory leak.
> Mar 01, 2019 9:50:52 AM org.apache.catalina.loader.WebappClassLoaderBase
> checkThreadLocalMapForLeaks
> SEVERE: The web application [/kylin] created a ThreadLocal with key of type
> [java.lang.ThreadLocal] (value [java.lang.ThreadLocal@76397470]) and a
> value of type [org.apache.kylin.common.util.ImplementationSwitch] (value
> [org.apache.kylin.common.util.ImplementationSwitch@26f33af5]) but failed
> to remove it when the web application was stopped. Threads are going to be
> renewed over time to try and avoid a probable memory leak.
> Mar 01, 2019 9:50:52 AM org.apache.coyote.AbstractProtocol stop
> INFO: Stopping ProtocolHandler ["http-bio-7070"]
> Mar 01, 2019 9:50:52 AM org.apache.coyote.AbstractProtocol stop
> INFO: Stopping ProtocolHandler ["ajp-bio-9009"]
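
The SEVERE messages in the quoted log are printed by Tomcat while the web application is stopping, when it finds values still held in a `ThreadLocal` on pooled threads; they describe leftover state at shutdown rather than the reason the node went down. The general pattern that avoids such leftovers is to clear the `ThreadLocal` in a `finally` block. The sketch below shows that pattern only; `RequestContext` is a hypothetical class and does not represent how Kylin manages its own `Message` or `ImplementationSwitch` thread-locals.

```java
import java.util.function.Supplier;

/** Hypothetical example of scoping a ThreadLocal value to one unit of work. */
final class RequestContext {

    private static final ThreadLocal<String> CURRENT = new ThreadLocal<>();

    /** Returns the value bound to the calling thread, or null if none is bound. */
    static String current() {
        return CURRENT.get();
    }

    /** Binds the value for the duration of the work and always removes it afterwards. */
    static <T> T withContext(String value, Supplier<T> work) {
        CURRENT.set(value);
        try {
            return work.get();
        } finally {
            // Without this, the value would stay attached to a pooled thread after the request ends.
            CURRENT.remove();
        }
    }

    public static void main(String[] args) {
        String seen = withContext("query-1", RequestContext::current);
        System.out.println("inside: " + seen + ", after: " + current());
    }
}
```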


[VOTE] Release apache-kylin-2.6.1 (RC1)

2019-03-04 Thread ShaoFeng Shi
Hi all,

I have created a build for Apache Kylin 2.6.1, release candidate 1.

Changes highlights:
[KYLIN-3494] - Build cube with spark reports ArrayIndexOutOfBoundsException
[KYLIN-3537] - Use Spark to build Cube on Yarn failed at Setp8 on HDP3.
[KYLIN-3815] - Unexpected behavior when joining the streaming table and
hive table
[KYLIN-3828] - ArrayIndexOutOfBoundsException thrown when building a
streaming cube with empty data in its first dimension
[KYLIN-3833] - Potential OOM in Spark Extract Fact Table Distinct Columns
step
[KYLIN-3826] - MergeCuboidJob only uploads necessary segment's dictionary

Thanks to everyone who has contributed to this release.
Here are the release notes:
https://issues.apache.org/jira/secure/ReleaseNote.jspa?projectId=12316121&version=12344845

The commit to be voted upon:

https://github.com/apache/kylin/commit/270cfe68ecc94c66141b29e2ccf20b9ec25e23dd

Its hash is 270cfe68ecc94c66141b29e2ccf20b9ec25e23dd.

The artifacts to be voted on are located here:
https://dist.apache.org/repos/dist/dev/kylin/apache-kylin-2.6.1-rc1/

The hash of the artifact is as follows:
apache-kylin-2.6.1-source-release.zip.sha256
961b8c8d0e781fe7936efb7f33cebb9661b4fbf83082669769a41b47cea19001
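
For reviewers who want to check the published SHA-256 value against a downloaded artifact, the following is a minimal sketch in Python. The helper names sha256_of_file and matches_published_hash are illustrative assumptions, not part of any Kylin or Apache tooling; the local file path is whatever you downloaded the source release to.

```python
import hashlib


def sha256_of_file(path, chunk_size=1 << 20):
    """Return the hex SHA-256 digest of the file at `path`, read in chunks."""
    digest = hashlib.sha256()
    with open(path, "rb") as handle:
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def matches_published_hash(path, published_hex):
    """Compare the local digest with the published one, ignoring case and whitespace."""
    return sha256_of_file(path) == published_hex.strip().lower()
```

A typical call would pass the local path of the downloaded source-release zip together with the hash string quoted above; a False result means the download should not be trusted.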

A staged Maven repository is available for review at:
https://repository.apache.org/content/repositories/orgapachekylin-1060/

Release artifacts are signed with the following key:
https://people.apache.org/keys/committer/shaofengshi.asc

Please vote on releasing this package as Apache Kylin 2.6.1.

The vote is open for the next 72 hours and passes if a majority of
at least three +1 PMC votes are cast.

[ ] +1 Release this package as Apache Kylin 2.6.1
[ ]  0 I don't feel strongly about it, but I'm okay with the release
[ ] -1 Do not release this package because...
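
As a rough illustration of the passing rule stated above, here is a small Python sketch. It encodes one reading of "a majority of at least three +1 PMC votes" (at least three binding +1 votes, and more +1 than -1); the function name and the vote encoding are assumptions made for this example only.

```python
def release_vote_passes(binding_votes):
    """binding_votes: iterable of +1, 0 or -1 cast by PMC members.

    Assumed reading of the rule: at least three +1 votes,
    and more +1 votes than -1 votes.
    """
    votes = list(binding_votes)
    plus_ones = sum(1 for vote in votes if vote == 1)
    minus_ones = sum(1 for vote in votes if vote == -1)
    return plus_ones >= 3 and plus_ones > minus_ones
```

Non-binding votes are simply left out of the input in this sketch.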


Here is my vote:

+1 (binding)

Best regards,

Shaofeng Shi 史少锋
Apache Kylin PMC
Work email: shaofeng@kyligence.io
Kyligence Inc: https://kyligence.io/

Apache Kylin FAQ: https://kylin.apache.org/docs/gettingstarted/faq.html
Join Kylin user mail group: user-subscr...@kylin.apache.org
Join Kylin dev mail group: dev-subscr...@kylin.apache.org


[New Doc] Enable distributed cache for Kylin v2.6

2019-03-01 Thread ShaoFeng Shi
Hello,

Kylin v2.6 introduced a set of improvements to the query cache, including a
more precise policy for flushing the cache, a by-segment storage-level cache,
and the use of Memcached as the distributed cache. The eBay Kylin team has
shown that these enhancements can greatly improve query performance and
throughput. A short guide has now been added to the "Kylin configuration" page:

https://kylin.apache.org/docs/install/configuration.html#distributed-cache

You can read more in JIRA KYLIN-2895
<https://issues.apache.org/jira/browse/KYLIN-2895> (and its sub-tasks);
Feedback is welcome.

Best regards,

Shaofeng Shi 史少锋
Apache Kylin PMC
Work email: shaofeng@kyligence.io
Kyligence Inc: https://kyligence.io/

Apache Kylin FAQ: https://kylin.apache.org/docs/gettingstarted/faq.html
Join Kylin user mail group: user-subscr...@kylin.apache.org
Join Kylin dev mail group: dev-subscr...@kylin.apache.org


Re: I wonder about the qualifier assignment within a column family for measures

2019-02-24 Thread ShaoFeng Shi
Hi Zhang Yu,

Kylin's metadata structure allows the measures to be divided across one or
more CFs/columns. To keep things simple, the web GUI only asks the user to
select the CF, and then combines all the measures in a CF into one column. You
can customize the policy to further divide the measures into different columns
if you think that is better. Please share your findings with us if it does
benefit performance.
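
To make the two layouts concrete, here is a minimal Python sketch. It is not Kylin code: the function names and the dictionary shape are assumptions, and the qualifier names "M", "M1", "M2" are taken from the question quoted below.

```python
def one_column_per_family(family_to_measures):
    """Layout described for the web GUI: each column family gets a single
    column (qualifier "M") holding all measures assigned to that family."""
    return {cf: {"M": list(measures)} for cf, measures in family_to_measures.items()}


def one_column_per_measure(family_to_measures):
    """Alternative layout: each measure gets its own qualifier (M1, M2, ...)."""
    return {
        cf: {f"M{index}": [measure] for index, measure in enumerate(measures, start=1)}
        for cf, measures in family_to_measures.items()
    }
```

With the second layout a scan could in principle request only the qualifiers it needs, at the cost of more cells per row; whether that helps is exactly the kind of finding worth measuring.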

Thank you!

Best regards,

Shaofeng Shi 史少锋
Apache Kylin PMC
Work email: shaofeng@kyligence.io
Kyligence Inc: https://kyligence.io/

Apache Kylin FAQ: https://kylin.apache.org/docs/gettingstarted/faq.html
Join Kylin user mail group: user-subscr...@kylin.apache.org
Join Kylin dev mail group: dev-subscr...@kylin.apache.org




yuzhang wrote on Thu, Feb 21, 2019 at 8:19 PM:

> Hi, dear team, while reading the source code that creates the HTable and
> CubeDescCreator, I found there is only one column ("M") per column
> family (F1, F2, F3), and the column "M" contains all measures that are
> assigned to this column family. HBaseReadonlyStore then loads all of
> this column (or Cell) data into a buffer on the region server, and only
> returns the selected measures to the query server. I wonder why Kylin doesn't
> assign a column qualifier (like M1, M2) per measure?
>
> If my understanding of this code is incorrect, please let me
> know.
> I look forward to your reply.
>
> shifengdefannao
> Email:shifengdefan...@163.com
>
> Signature is customized by Netease Mail Master


Re: got exception when start kylin

2019-02-24 Thread ShaoFeng Shi
Please check more of the logs, both in /logs and tomcat/logs. I remember
that some users have encountered the "/tomcat/conf/.keystore (No such
file or directory)" error before, but it was not the root cause.

Best regards,

Shaofeng Shi 史少锋
Apache Kylin PMC
Work email: shaofeng@kyligence.io
Kyligence Inc: https://kyligence.io/

Apache Kylin FAQ: https://kylin.apache.org/docs/gettingstarted/faq.html
Join Kylin user mail group: user-subscr...@kylin.apache.org
Join Kylin dev mail group: dev-subscr...@kylin.apache.org




Na Zhai wrote on Sun, Feb 24, 2019 at 8:15 PM:

> Hi, chenyang
>
>
>
> Have you modified any configuration of Tomcat?
>
>
>
> Sent from Mail<https://go.microsoft.com/fwlink/?LinkId=550986> for Windows 10
>
>
>
> 
> From: chenyang
> Sent: Friday, February 22, 2019 4:30:20 PM
> To: dev@kylin.apache.org
> Subject: got exception when start kylin
>
> Could someone help me resolve this WARN:
> ```
> org.apache.catalina.LifecycleException: Failed to initialize component
> [Connector[org.apache.coyote.http11.Http11Protocol-7443]]
> at org.apache.catalina.util.LifecycleBase.init(LifecycleBase.java:112)
> at org.apache.catalina.core.StandardService.initInternal(StandardService.java:552)
> at org.apache.catalina.util.LifecycleBase.init(LifecycleBase.java:107)
> at org.apache.catalina.core.StandardServer.initInternal(StandardServer.java:875)
> at org.apache.catalina.util.LifecycleBase.init(LifecycleBase.java:107)
> at org.apache.catalina.startup.Catalina.load(Catalina.java:632)
> at org.apache.catalina.startup.Catalina.start(Catalina.java:669)
> at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
> at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62)
> at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
> at java.lang.reflect.Method.invoke(Method.java:498)
> at org.apache.catalina.startup.Bootstrap.start(Bootstrap.java:353)
> at org.apache.catalina.startup.Bootstrap.main(Bootstrap.java:493)
> at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
> at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62)
> at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
> at java.lang.reflect.Method.invoke(Method.java:498)
> at org.apache.hadoop.util.RunJar.run(RunJar.java:313)
> at org.apache.hadoop.util.RunJar.main(RunJar.java:227)
> Caused by: org.apache.catalina.LifecycleException: Protocol handler initialization failed
> at org.apache.catalina.connector.Connector.initInternal(Connector.java:995)
> at org.apache.catalina.util.LifecycleBase.init(LifecycleBase.java:107)
> ... 18 more
> Caused by: java.lang.IllegalArgumentException: /home/deployer/apache-kylin-2.5.2-bin-cdh60/tomcat/conf/.keystore (No such file or directory)
> at org.apache.tomcat.util.net.AbstractJsseEndpoint.createSSLContext(AbstractJsseEndpoint.java:115)
> at org.apache.tomcat.util.net.AbstractJsseEndpoint.initialiseSsl(AbstractJsseEndpoint.java:86)
> at org.apache.tomcat.util.net.NioEndpoint.bind(NioEndpoint.java:244)
> at org.apache.tomcat.util.net.AbstractEndpoint.init(AbstractEndpoint.java:1087)
> at org.apache.tomcat.util.net.AbstractJsseEndpoint.init(AbstractJsseEndpoint.java:265)
> at org.apache.coyote.AbstractProtocol.init(AbstractProtocol.java:581)
> at org.apache.coyote.http11.AbstractHttp11Protocol.init(AbstractHttp11Protocol.java:68)
> at org.apache.catalina.connector.Connector.initInternal(Connector.java:993)
> ... 19 more
> Caused by: java.io.FileNotFoundException: /home/deployer/apache-kylin-2.5.2-bin-cdh60/tomcat/conf/.keystore (No such file or directory)
> at java.io.FileInputStream.open0(Native Method)
> at java.io.FileInputStream.open(FileInputStream.java:195)
> at java.io.FileInputStream.<init>(FileInputStream.java:138)
> at java.io.FileInputStream.<init>(FileInputStream.java:93)
> at sun.net.www.protocol.file.FileURLConnection.connect(FileURLConnection.java:90)
> at sun.net.www.protocol.file.FileURLConnection.getInputStream(FileURLConnection.java:188)
> at org.apache.tomcat.util.file.ConfigFileLoader.getInputStream(ConfigFileLoader.java:89)
> at org.apache.tomcat.util.net.SSLUtilBase.getStore(SSLUtilBase.java:140)
> at org.apache.tomcat.util.net

Re: can not open kylin web ui

2019-02-24 Thread ShaoFeng Shi
Remember to add the "/kylin" context on the URL, sometimes people may
forget: http://hostname:7070/kylin
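
As a small illustration of the point about the context path, the sketch below builds the web UI address from its parts in Python. The function name and its defaults are assumptions for this example; only the port 7070 and the "/kylin" context come from this thread.

```python
from urllib.parse import urlunsplit


def kylin_web_url(hostname, port=7070, context="/kylin"):
    """Build the web UI URL, always including the context path."""
    return urlunsplit(("http", f"{hostname}:{port}", context, "", ""))
```

When port forwarding is involved, the hostname and port would be the forwarded ones on the host machine, while the "/kylin" context stays the same.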

Best regards,

Shaofeng Shi 史少锋
Apache Kylin PMC
Work email: shaofeng@kyligence.io
Kyligence Inc: https://kyligence.io/

Apache Kylin FAQ: https://kylin.apache.org/docs/gettingstarted/faq.html
Join Kylin user mail group: user-subscr...@kylin.apache.org
Join Kylin dev mail group: dev-subscr...@kylin.apache.org




JiaTao Tao wrote on Sat, Feb 23, 2019 at 12:09 PM:

> Hi
>
> You can check the files in "${KYLIN_HOME}/logs" and see whether anything
> unexpected occurred.
>
> --
>
>
> Regards!
>
> Aron Tao
>
> hetadesai56 wrote on Fri, Feb 22, 2019 at 2:34 PM:
>
> > Hi,
> >
> > I am working on HDP 2.6.5 on VirtualBox. I have installed
> > apache-kylin-2.6.0-bin. Kylin started successfully, but the Web UI is not
> > working. I set up port forwarding in VirtualBox for Kylin's default port 7070.
> >
> > How can I resolve this?
> >
> > Thank you,
> > Heta
> >
> > --
> > Sent from: http://apache-kylin.74782.x6.nabble.com/
> >
>


Re: Hbase table is always empty when build with spark

2019-02-24 Thread ShaoFeng Shi
Hello Alex,

Interesting; We didn't observe such an issue. Can you confirm your hive
table has the data, instead of an input error? Does the problem get solved
after setting "mapreduce.job.outputformat.class"?

Thanks for the feedback!

Best regards,

Shaofeng Shi 史少锋
Apache Kylin PMC
Work email: shaofeng@kyligence.io
Kyligence Inc: https://kyligence.io/

Apache Kylin FAQ: https://kylin.apache.org/docs/gettingstarted/faq.html
Join Kylin user mail group: user-subscr...@kylin.apache.org
Join Kylin dev mail group: dev-subscr...@kylin.apache.org




mailpig wrote on Wed, Feb 20, 2019 at 11:18 AM:

> In kylin-2.5.2, the resulting HBase table is always empty when I
> build a cube with Spark.
> I found that the step "Load HFile to HBase Table" logs some warnings:
> 2019-01-27 00:49:30,067 WARN [Scheduler 448149092 Job
> 89a25959-e12d-7a5e-0ecb-80c978533eab-6419]
> mapreduce.LoadIncrementalHFiles:204 : Skipping non-directory
> hdfs://test/kylin/kylin_metadata/kylin-89a25959-e12d-7a5e-0ecb-80c978533eab/test_UUID_spark/hfile/_SUCCESS
> 2019-01-27 00:49:30,068 WARN [Scheduler 448149092 Job
> 89a25959-e12d-7a5e-0ecb-80c978533eab-6419]
> mapreduce.LoadIncrementalHFiles:204 : Skipping non-directory
> hdfs://test/kylin/kylin_metadata/kylin-89a25959-e12d-7a5e-0ecb-80c978533eab/test_UUID_spark/hfile/part-r-0
> 2019-01-27 00:49:30,068 WARN [Scheduler 448149092 Job
> 89a25959-e12d-7a5e-0ecb-80c978533eab-6419]
> mapreduce.LoadIncrementalHFiles:204 : Skipping non-directory
> hdfs://test/kylin/kylin_metadata/kylin-89a25959-e12d-7a5e-0ecb-80c978533eab/test_UUID_spark/hfile/part-r-1
>
> After reading the source code, I found that the step "Convert Cuboid Data to
> HFile" with Spark has a bug. That step's output directory should have a
> subdirectory per column family. Indeed, SparkCubeHFile must set
> mapreduce.job.outputformat.class to HFileOutputFormat2.class.
>
> Please check whether I am correct!
>
> --
> Sent from: http://apache-kylin.74782.x6.nabble.com/
>


[jira] [Created] (KYLIN-3826) MergeCuboidJob only uploads necessary segment's dictionary

2019-02-23 Thread Shaofeng SHI (JIRA)
Shaofeng SHI created KYLIN-3826:
---

 Summary: MergeCuboidJob only uploads necessary segment's dictionary
 Key: KYLIN-3826
 URL: https://issues.apache.org/jira/browse/KYLIN-3826
 Project: Kylin
  Issue Type: Improvement
  Components: Job Engine
Reporter: Shaofeng SHI


At yesterday's Kylin meetup, Zhang Wei mentioned that "MergeCuboidJob"
uploads the metadata of all segments, which takes extra time when the number
of segments is large. This is unnecessary.



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Created] (KYLIN-3825) Add ACL Rest APIs to document

2019-02-22 Thread Shaofeng SHI (JIRA)
Shaofeng SHI created KYLIN-3825:
---

 Summary: Add ACL Rest APIs to document
 Key: KYLIN-3825
 URL: https://issues.apache.org/jira/browse/KYLIN-3825
 Project: Kylin
  Issue Type: Improvement
Reporter: Shaofeng SHI






--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Created] (KYLIN-3823) Release v2.6.1

2019-02-22 Thread Shaofeng SHI (JIRA)
Shaofeng SHI created KYLIN-3823:
---

 Summary: Release v2.6.1
 Key: KYLIN-3823
 URL: https://issues.apache.org/jira/browse/KYLIN-3823
 Project: Kylin
  Issue Type: Task
Reporter: Shaofeng SHI






--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


Re: Re: [Discuss] Moving toward Apache Kylin 3.0

2019-02-19 Thread ShaoFeng Shi
Thanks for the feedback. Since there was no objection, the version
on the current master branch has been updated to 3.0.0-SNAPSHOT.

Best regards,

Shaofeng Shi 史少锋
Apache Kylin PMC
Work email: shaofeng@kyligence.io
Kyligence Inc: https://kyligence.io/

Apache Kylin FAQ: https://kylin.apache.org/docs/gettingstarted/faq.html
Join Kylin user mail group: user-subscr...@kylin.apache.org
Join Kylin dev mail group: dev-subscr...@kylin.apache.org




Yichen Zhou wrote on Fri, Jan 25, 2019 at 4:14 PM:

> +1
>
> Regards,
> Yichen
>
> On Thu, Jan 24, 2019 at 11:49 PM nichunen  wrote:
>
> > +1
> >
> >
> > --
> >
> >
> > Best regards,
> >
> >
> >
> > Ni Chunen / George
> >
> >
> >
> >
> > At 2019-01-25 15:48:21, "Billy Liu"  wrote:
> > >+1
> > >
> > >That's cool. Let's move to the real-time scenario.
> > >
> > >With Warm regards
> > >
> > >Billy Liu
> > >
> > >
> > >Temple Zhou wrote on Fri, Jan 25, 2019 at 3:32 PM:
> > >
> > >> +1
> > >> Real-time streaming feature may be expected by many people.
> > >> ———
> > >> Best wishes~
> > >> Temple Zhou
> > >>
> > >>  Original Message
> > >> *Sender:* zhan shaoxiong
> > >> *Recipient:* dev@kylin.apache.org
> > >> *Date:* Friday, Jan 25, 2019 15:27
> > >> *Subject:* Re: [Discuss] Moving toward Apache Kylin 3.0
> > >>
> > >> +1
> > >> thanks
> > >>
> > >> On 2019/1/23 at 3:57 PM, "ShaoFeng Shi" wrote:
> > >>
> > >> Hi Kylin developers,
> > >>
> > >> Last week, Kylin released v2.6.0, with the enhanced & distributed query
> > >> cache and the JDBC data source SDK. After this release, the next batch of
> > >> candidate features includes real-time streaming, parquet storage, and druid
> > >> storage. These features were developed over the past 1-2 years by different
> > >> Kylin players and were open sourced in the past 6 months. They have already
> > >> been staged in separate branches and are under evaluation by the community.
> > >> We have received much feedback from the community.
> > >>
> > >> These candidate features are big supplements to the as-is Kylin functions. For
> > >> example, the real-time streaming feature will bring Kylin from batch &
> > >> historical analytics into real-time analytics. The parquet storage will
> > >> make the deployment more flexible and more cloud-friendly. Of course,
> > >> stabilizing and improving these features needs additional time and effort.
> > >>
> > >> So, when we merge and release them, we'd better give them a new version
> > >> number so that users can clearly see the difference from the current 2.x
> > >> versions. I discussed this with several developers offline, and we think it is
> > >> time to move toward Kylin 3.0. So, if one of the above features is merged,
> > >> the version will be 3.0. The current 2.6 will be maintained until 3.x is
> > >> ready for production use.
> > >>
> > >> Your comments, ideas, and suggestions are welcome!
> > >>
> > >> Best regards,
> > >>
> > >> Shaofeng Shi 史少锋
> > >> Apache Kylin PMC
> > >> Work email: shaofeng@kyligence.io
> > >> Kyligence Inc: https://kyligence.io/
> > >>
> > >> Apache Kylin FAQ: https://kylin.apache.org/docs/gettingstarted/faq.html
> > >> Join Kylin user mail group: user-subscr...@kylin.apache.org
> > >> Join Kylin dev mail group: dev-subscr...@kylin.apache.org
> > >>
> > >>
> >
>


Re: Kylin go to hdfs to find jar file

2019-02-19 Thread ShaoFeng Shi
Did you configure HDFS as the defaultFS in core-site.xml?
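
One quick way to check this is to read the fs.defaultFS property from core-site.xml. The Python sketch below is only an illustration: the function name is an assumption, and it takes the XML text as input rather than guessing where your Hadoop configuration directory lives.

```python
import xml.etree.ElementTree as ET


def read_default_fs(core_site_xml_text):
    """Return the value of fs.defaultFS from core-site.xml content, or None if unset."""
    root = ET.fromstring(core_site_xml_text)
    for prop in root.iter("property"):
        if prop.findtext("name") == "fs.defaultFS":
            return prop.findtext("value")
    return None
```

If the result is None or does not start with "hdfs://", the cluster's default file system is not HDFS, which may be relevant to where the jar file is looked up.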

Best regards,

Shaofeng Shi 史少锋
Apache Kylin PMC
Work email: shaofeng@kyligence.io
Kyligence Inc: https://kyligence.io/

Apache Kylin FAQ: https://kylin.apache.org/docs/gettingstarted/faq.html
Join Kylin user mail group: user-subscr...@kylin.apache.org
Join Kylin dev mail group: dev-subscr...@kylin.apache.org




jiangxiaoma111 <369806...@qq.com> wrote on Tue, Feb 19, 2019 at 6:02 PM:

> Did you find a solution to this exception?
>
> --
> Sent from: http://apache-kylin.74782.x6.nabble.com/
>


Apache Kylin Meetup @Shanghai, 23 Feb 2019

2019-02-18 Thread ShaoFeng Shi
Hello,

There will be a Kylin meetup this Saturday afternoon in Shanghai,
China. Engineers from China UnionPay (银联), Ctrip (携程), eBay and Kyligence
will share their use cases and experiences with Kylin.

There are some seats left, so if you can come, please register here for free:

http://www.huodongxing.com/event/4476570217900

Best regards,

Shaofeng Shi 史少锋
Apache Kylin PMC
Work email: shaofeng@kyligence.io
Kyligence Inc: https://kyligence.io/

Apache Kylin FAQ: https://kylin.apache.org/docs/gettingstarted/faq.html
Join Kylin user mail group: user-subscr...@kylin.apache.org
Join Kylin dev mail group: dev-subscr...@kylin.apache.org


Re: [kylin] How can a left join match records that are null

2019-02-18 Thread ShaoFeng Shi
Can you use some value (like "unknown") to represent NULL? NULL is not
equal to anything, including NULL:

https://stackoverflow.com/questions/1843451/why-does-null-null-evaluate-to-false-in-sql-server
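
The effect can be reproduced with any SQL engine. The sketch below uses Python's built-in sqlite3 module purely to illustrate the semantics; it is not Kylin, and the table names a and a0, the column names, and the function name are taken from or invented for this example. It compares a plain equality join with one where NULL is mapped to the placeholder "unknown".

```python
import sqlite3


def joined_regions(use_placeholder):
    """Left join a to a0 on region, optionally mapping NULL to 'unknown' first."""
    conn = sqlite3.connect(":memory:")
    conn.executescript(
        """
        CREATE TABLE a (region TEXT);
        CREATE TABLE a0 (region TEXT, label TEXT);
        INSERT INTO a VALUES ('east'), (NULL);
        INSERT INTO a0 VALUES ('east', 'East'), (NULL, 'No region');
        """
    )
    if use_placeholder:
        condition = "COALESCE(a.region, 'unknown') = COALESCE(a0.region, 'unknown')"
    else:
        condition = "a.region = a0.region"  # NULL = NULL is not true, so no match
    rows = conn.execute(
        f"SELECT a.region, a0.label FROM a LEFT JOIN a0 ON {condition}"
    ).fetchall()
    conn.close()
    return rows
```

With the plain condition the NULL row of a finds no partner; with the placeholder it matches the 'No region' row. For a cube, the same idea would mean replacing NULL with the placeholder in the source data (for example in a view) before building.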

Best regards,

Shaofeng Shi 史少锋
Apache Kylin PMC
Work email: shaofeng@kyligence.io
Kyligence Inc: https://kyligence.io/

Apache Kylin FAQ: https://kylin.apache.org/docs/gettingstarted/faq.html
Join Kylin user mail group: user-subscr...@kylin.apache.org
Join Kylin dev mail group: dev-subscr...@kylin.apache.org




Na Zhai wrote on Sun, Feb 17, 2019 at 8:58 PM:

> Hi, chen
>
> I cannot see your picture; you could try adding it as an attachment.
> Is there any error message? Maybe you can try changing null to another
> word, like “unknown”.
>
> Best wishes!
>
> Sent from Mail<https://go.microsoft.com/fwlink/?LinkId=550986> for Windows 10
>
> 
> From: chen snowlake
> Sent: Thursday, February 14, 2019 6:07:01 PM
> To: dev@kylin.apache.org
> Subject: [kylin] How can a left join match records that are null
>
> Hi:
> Hello everyone. When joining, the data in my join condition contains null records, and they cannot be joined.
> How should such a query be written in Kylin? Many thanks.
> For example:
> [cid:image003.png@01D4C490.0483D3B0]
>
> … left Join … on ((a.region = a0.region) or (a.region is null and
> a0.region is null )): a condition like this does not join the null rows
>
> [cid:image006.png@01D4C490.0483D3B0]
> SnowLake
> Email:che...@outlook.com
>
>


Re: Unexpected behavior when joining streaming table and hive table

2019-02-13 Thread ShaoFeng Shi
Using hour as the partition column should be fine. From the data, it seems
the declared column sequence does not match the persisted data.

Lifan, I see you posted the cube JSON; could you please also provide the
model's JSON? That would help to analyze the problem. Thank you!

Best regards,

Shaofeng Shi 史少锋
Apache Kylin PMC
Work email: shaofeng@kyligence.io
Kyligence Inc: https://kyligence.io/

Apache Kylin FAQ: https://kylin.apache.org/docs/gettingstarted/faq.html
Join Kylin user mail group: user-subscr...@kylin.apache.org
Join Kylin dev mail group: dev-subscr...@kylin.apache.org




Xiaoxiang Yu wrote on Thu, Feb 14, 2019 at 10:34 AM:

> Hi, lifei
>
> After checking your model.json, I found that you use "HOUR_START" as your
> partition_date_column, which is not correct.
> I think you should change it to "timestamp" and try again.
>
> Source code at
> https://github.com/apache/kylin/blob/master/source-kafka/src/main/java/org/apache/kylin/source/kafka/TimedJsonStreamParser.java#L111
>
> If you find any mistake, please let me know.
>
> 
> Best wishes,
> Xiaoxiang Yu
>
>
> On [DATE], "[NAME]" <[ADDRESS]> wrote:
>
> Hello, I am evaluating Kylin and tried to join streaming table and hive
> table, but now got unexpected behavior.
>
> All the scripts can be found in
> https://gist.github.com/OstCollector/a4ac396e3169aa42a416d96db3021195
> (may need to modify some script to match the environments)
>
> Environment:
> Centos 7
> Hadoop on CDH-5.8
> dedicated Kafka-2.1 (not included in CDH)
>
> How to reproduce this problem:
>
> 1. run gen_station.pl to generate dim table data
> 2. run import-data.sh to build dim table in Hive
> 3. run factdata.pl and pipe its output into kafka
> 4. create tables TEST_WEATHER.STATION_INFO(hive)
> TEST_WEATHER.WEATHER(streaming) in Kylin
> 5. create model and cube in Kylin, join WEATHER.STATION_ID = STATION.ID
> 6. build the cube
>
> Expected behavior:
> The cube is built correctly and I can get data when I query it.
>
> Actual behavior:
> On apache-kylin-2.6.0-bin-cdh57: build failed at step #2 (Create
> Intermediate Flat Hive Table)
> On apache-kylin-2.5.2-bin-cdh57: got empty cube
>
> I also tried this case without streaming, with the format of the timestamp
> column changed to "%Y-%m-%d %H:%M:%S", and an additional table to store the
> mapping of timestamp and {hour,day,month,year}_start.
> In this case, the cube is built as expected.
>
>
> In both failed cases, the intermediate fact table on Hive built in step #2
> seems to have the wrong column order.
> e.g. on version 2.5.2-cdh57, the schema and content of the temp table are
> shown below:
>
> CREATE EXTERNAL TABLE IF NOT EXISTS
> kylin_intermediate_weather_f32241e6_53c6_2949_b737_d9a88a4618df_fact
> (
> DAY_START date
> ,YEAR_START date
> ,STATION_ID string
> ,QUARTER_START date
> ,MONTH_START date
> ,TEMPERATURE bigint
> ,HOUR_START timestamp
> )
> STORED AS SEQUENCEFILE
> LOCATION
>
> 'hdfs://hz-dev-hdfs-service/user/admin/kylin-2/kylin_metadata/kylin-5dbe40eb-55ba-2245-c0b5-1e9efcb67937/kylin_intermediate_weather_f32241e6_53c6_2949_b737_d9a88a4618df_fact';
> ALTER TABLE
> kylin_intermediate_weather_f32241e6_53c6_2949_b737_d9a88a4618df_fact
> SET TBLPROPERTIES('auto.purge'='true');
>
> hive> select * from
> kylin_intermediate_weather_f32241e6_53c6_2949_b737_d9a88a4618df_fact
> limit 10;
> OK
> NULL  2010-01-01  2010-01-01  2010-01-01  2010-01-01  NULL  NULL
> NULL  2009-01-01  2009-10-01  2009-12-01  2009-12-31  NULL  NULL
> NULL  2009-01-01  2009-10-01  2009-12-01  2009-12-31  NULL  NULL
> NULL  2009-01-01  2009-10-01  2009-12-01  2009-12-31  NULL  NULL
> NULL  2009-01-01  2009-10-01  2009-12-01  2009-12-31  NULL  NULL
> NULL  2010-01-01  2010-01-01  2010-01-01  2010-01-01  NULL  NULL
> NULL  2010-01-01  2010-01-01  2010-01-01  2010-01-01  NULL  NULL
> NULL  2009-01-01  2009-10-01  2009-12-01  2009-12-31  NULL  NULL
> NULL  2009-01-01  2009-10-01  2009-12-01  2009-12-31  NULL  NULL
> NULL  2010-01-01  2010-01-01  2010-01-01  2010-01-01  NULL  NULL
> Time taken: 0.421 seconds, Fetched: 10 row(s)
>
> While the content of the temp file is:
>

Re: Kylin exist same segment

2019-02-10 Thread ShaoFeng Shi
Hi Bingmei,

Kylin doesn't allow two segments to have the same or overlapping time ranges,
which is why you got the above error. You said you already have two segments
for the same range; could you please provide the Cube instance JSON? Is your
Kylin a multi-node cluster? If yes, how many "job" or "all" nodes are in this
cluster?
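
The overlap rule can be pictured with the short Python sketch below. It is not Kylin's implementation: the function names are made up for this example, and it assumes segment ranges are half-open (start inclusive, end exclusive), so that adjacent segments do not conflict.

```python
def ranges_overlap(first, second):
    """True if two (start, end) ranges share any point, treating them as [start, end)."""
    first_start, first_end = first
    second_start, second_end = second
    return first_start < second_end and second_start < first_end


def find_conflicts(existing_segments, new_segment):
    """Return the existing segments whose range overlaps the new segment's range."""
    return [seg for seg in existing_segments if ranges_overlap(seg, new_segment)]
```

Under this rule, a build request for 2019020303_2019020304 conflicts with an existing segment covering exactly the same range, which matches the "Segments overlap" message quoted below.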

Best regards,

Shaofeng Shi 史少锋
Apache Kylin PMC
Work email: shaofeng@kyligence.io
Kyligence Inc: https://kyligence.io/

Apache Kylin FAQ: https://kylin.apache.org/docs/gettingstarted/faq.html
Join Kylin user mail group: user-subscr...@kylin.apache.org
Join Kylin dev mail group: dev-subscr...@kylin.apache.org




Na Zhai wrote on Thu, Feb 7, 2019 at 1:35 AM:

> Hi, XiebingMei.
>
> I cannot reproduce this case. Can you provide more
> information? When you built the segment for the third time, was it in the
> build phase? If there is already a job for this segment, the system will raise
> this error.
>
> Sent from Mail<https://go.microsoft.com/fwlink/?LinkId=550986> for Windows 10
>
>
>
> 
> From: XiebingMei
> Sent: Sunday, February 3, 2019 11:48:02 AM
> To: dev@kylin.apache.org
> Subject: Kylin exist same segment
>
> Hi Team,
>
> On Kylin 2.5.1, I built the same cube segment
> (2019020303_2019020304) twice, and both builds
> succeeded. When I built the same segment a third time, the system reported:
>
> "Segments overlap: cube_name[2019020303_2019020304] and
> cube_name[2019020303_2019020304]"
>
> {"code":"999","data":null,"msg":"Segments overlap:
> cube_name[2019020303_2019020304] and
>
> cube_name[2019020303_2019020304]","stacktrace":"org.apache.kylin.rest.exception.InternalErrorException:
> Segments overlap: cube_name[2019020303_2019020304] and
> cube_name[2019020303_2019020304]\n
> at
>
> org.apache.kylin.rest.controller.CubeController.buildInternal(CubeController.java:402)\n
> at
>
> org.apache.kylin.rest.controller.CubeController.rebuild(CubeController.java:354)\n
> at
>
> org.apache.kylin.rest.controller.CubeController.build(CubeController.java:343)\n
> at sun.reflect.GeneratedMethodAccessor257.invoke(Unknown Source)\n
> at
>
> sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)\n
> at java.lang.reflect.Method.invoke(Method.java:498)\n
> at
>
> org.springframework.web.method.support.InvocableHandlerMethod.doInvoke(InvocableHandlerMethod.java:205)\n
> at
>
> org.springframework.web.method.support.InvocableHandlerMethod.invokeForRequest(InvocableHandlerMethod.java:133)\n
> at
>
> org.springframework.web.servlet.mvc.method.annotation.ServletInvocableHandlerMethod.invokeAndHandle(ServletInvocableHandlerMethod.java:97)\n
> at
>
> org.springframework.web.servlet.mvc.method.annotation.RequestMappingHandlerAdapter.invokeHandlerMethod(RequestMappingHandlerAdapter.java:827)\n
> at
>
> org.springframework.web.servlet.mvc.method.annotation.RequestMappingHandlerAdapter.handleInternal(RequestMappingHandlerAdapter.java:738)\n
> at
>
> org.springframework.web.servlet.mvc.method.AbstractHandlerMethodAdapter.handle(AbstractHandlerMethodAdapter.java:85)\n
> at
>
> org.springframework.web.servlet.DispatcherServlet.doDispatch(DispatcherServlet.java:967)\n
> at
>
> org.springframework.web.servlet.DispatcherServlet.doService(DispatcherServlet.java:901)\n
> at
>
> org.springframework.web.servlet.FrameworkServlet.processRequest(FrameworkServlet.java:970)\n
> at
>
> org.springframework.web.servlet.FrameworkServlet.doPut(FrameworkServlet.java:883)\n
> at javax.servlet.http.HttpServlet.service(HttpServlet.java:653)\n
> at
>
> org.springframework.web.servlet.FrameworkServlet.service(FrameworkServlet.java:846)\n
> at javax.servlet.http.HttpServlet.service(HttpServlet.java:731)\n
> at
>
> org.apache.catalina.core.ApplicationFilterChain.internalDoFilter(ApplicationFilterChain.java:303)\n
> at
>
> org.apache.catalina.core.ApplicationFilterChain.doFilter(ApplicationFilterChain.java:208)\n
> at org.apache.tomcat.websocket.server.WsFilter.doFilter(WsFilter.java:52)\n
> at
>
> org.apache.catalina.core.ApplicationFilterChain.internalDoFilter(ApplicationFilterChain.java:241)\n
> at
>
> org.apache.catalina.core.ApplicationFilterChain.doFilter(ApplicationFilterChain.java:208)\n
> at
>
> org.springframework.security.web.FilterChainProxy$VirtualFilterChain.doFilter(FilterChainProxy.java:317)\n
> at
>
> org.springframework.security.web.access.intercept.FilterSecurityInterceptor.invoke(FilterSecurityInterceptor.java:127)\n
> at
>
> org.springframework.security.web.access.intercept.FilterSecurity

Re: Reply: kylin manual merge error

2019-02-10 Thread ShaoFeng Shi
From the source code where the NPE is thrown, it seems the cube
statistics file wasn't found in the Kylin metadata store. Your metadata
appears to be incomplete:

https://github.com/apache/kylin/blob/2.2.x/engine-mr/src/main/java/org/apache/kylin/engine/mr/steps/MergeStatisticsStep.java#L80

Best regards,

Shaofeng Shi 史少锋
Apache Kylin PMC
Work email: shaofeng@kyligence.io
Kyligence Inc: https://kyligence.io/

Apache Kylin FAQ: https://kylin.apache.org/docs/gettingstarted/faq.html
Join Kylin user mail group: user-subscr...@kylin.apache.org
Join Kylin dev mail group: dev-subscr...@kylin.apache.org




王林 <1059790...@qq.com> wrote on Thu, Jan 31, 2019 at 5:11 PM:

> 2.2.0
>
>
>
>
> -- Original Message --
> From: "Na Zhai";
> Sent: Thursday, January 31, 2019 2:54 PM
> To: "dev@kylin.apache.org";
>
> Subject: Reply: kylin manual merge error
>
>
>
> Hi, wanglin.
>
> What’s your Kylin version? There is an issue about auto merge:
> https://issues.apache.org/jira/browse/KYLIN-3718.
>
> But I think your error is not related to that issue. It may be caused
> by the deletion of the cube_statistics directory.
>
>
>
> Sent from Mail<https://go.microsoft.com/fwlink/?LinkId=550986> for Windows 10
>
>
>
> 
> From: 王林 <1059790...@qq.com>
> Sent: Monday, January 28, 2019 10:23:37 AM
> To: dev
> Subject: kylin manual merge error
>
> Hello:
>   I created a cube with Kylin and enabled auto merge, with merge periods of 7 days and 28 days.
> However, I found that Kylin's auto merge did not take effect, so I merged the cube manually. Merging the most recent few days worked without problems, but merging older segments failed with the following error:
>
>
> Failed step: #2 Step Name: Merge Cuboid Statistics Duration: 0.01 mins Waiting: 0 seconds
> Error log: java.lang.NullPointerException
> at org.apache.kylin.engine.mr.steps.MergeStatisticsStep.doWork(MergeStatisticsStep.java:80)
> at org.apache.kylin.job.execution.AbstractExecutable.execute(AbstractExecutable.java:125)
> at org.apache.kylin.job.execution.DefaultChainedExecutable.doWork(DefaultChainedExecutable.java:64)
> at org.apache.kylin.job.execution.AbstractExecutable.execute(AbstractExecutable.java:125)
> at org.apache.kylin.job.impl.threadpool.DefaultScheduler$JobRunner.run(DefaultScheduler.java:144)
> at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1145)
> at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:615)
> at java.lang.Thread.run(Thread.java:745)
> What could be the cause?
> Thanks


Re: kylin auto merge problem

2019-02-10 Thread ShaoFeng Shi
Hi Lin,

Could you please describe the problem you found in detail? Thanks!

Best regards,

Shaofeng Shi 史少锋
Apache Kylin PMC
Work email: shaofeng@kyligence.io
Kyligence Inc: https://kyligence.io/

Apache Kylin FAQ: https://kylin.apache.org/docs/gettingstarted/faq.html
Join Kylin user mail group: user-subscr...@kylin.apache.org
Join Kylin dev mail group: dev-subscr...@kylin.apache.org




王林 <1059790...@qq.com> wrote on Thu, Jan 31, 2019 at 11:57 AM:

> Hello:
>  Kylin's auto merge was not taking effect, so I traced the source code.
> When an auto merge is triggered:
> public void updateOnNewSegmentReady(String cubeName) {
>     final KylinConfig kylinConfig = KylinConfig.getInstanceFromEnv();
>     String serverMode = kylinConfig.getServerMode();
>     if (Constant.SERVER_MODE_JOB.equals(serverMode.toLowerCase())
>             || Constant.SERVER_MODE_ALL.equals(serverMode.toLowerCase())) {
>         CubeInstance cube = getCubeManager().getCube(cubeName);
>         if (cube != null) {
>             CubeSegment seg = cube.getLatestBuiltSegment();
>             if (seg != null && seg.getStatus() == SegmentStatusEnum.READY) {
>                 keepCubeRetention(cubeName);
>                 mergeCubeSegment(cubeName);
>             }
>         }
>     }
> }
>
>
>
> private void mergeCubeSegment(String cubeName) {
>     CubeInstance cube = getCubeManager().getCube(cubeName);
>     if (!cube.needAutoMerge())
>         return;
>
>     synchronized (CubeService.class) {
>         try {
>             cube = getCubeManager().getCube(cubeName);
>             SegmentRange offsets = cube.autoMergeCubeSegments();
>             if (offsets != null) {
>                 // This call triggers the merge for a streaming cube, but my cube
>                 // is not a streaming cube, so an exception is thrown.
>                 CubeSegment newSeg = getCubeManager().mergeSegments(cube, null, offsets, true);
>                 logger.debug("Will submit merge job on " + newSeg);
>                 DefaultChainedExecutable job = EngineFactory.createBatchMergeJob(newSeg, "SYSTEM");
>                 getExecutableManager().addJob(job);
>             } else {
>                 logger.debug("Not ready for merge on cube " + cubeName);
>             }
>         } catch (IOException e) {
>             logger.error("Failed to auto merge cube " + cubeName, e);
>         }
>     }
> }
>
>
>
> public CubeSegment mergeSegments(CubeInstance cube, TSRange tsRange,
>         SegmentRange segRange, boolean force) throws IOException {
>     if (cube.getSegments().isEmpty())
>         throw new IllegalArgumentException("Cube " + cube + " has no segments");
>
>     checkInputRanges(tsRange, segRange);
>     checkBuildingSegment(cube);
>     checkCubeIsPartitioned(cube);
>
>     if (cube.getSegments().getFirstSegment().isOffsetCube()) {
>         // offset cube, merge by date range?
>         if (segRange == null && tsRange != null) {
>             Pair<CubeSegment, CubeSegment> pair = cube.getSegments(SegmentStatusEnum.READY)
>                     .findMergeOffsetsByDateRange(tsRange, Long.MAX_VALUE);
>             if (pair == null)
>                 throw new IllegalArgumentException(
>                         "Find no segments to merge by " + tsRange + " for cube " + cube);
>             segRange = new SegmentRange(pair.getFirst().getSegRange().start,
>                     pair.getSecond().getSegRange().end);
>         }
>         tsRange = null;
>         Preconditions.checkArgument(segRange != null);
>     } else {
>         segRange = null;
>         Preconditions.checkArgument(tsRange != null); // the exception is thrown here
>     }
>
>
>
> From tracing the code, auto merge seems to have a problem. I am using the kylin 2.2.x source; is there a bug?


[New Document] Kylin SQL reference

2019-01-30 Thread ShaoFeng Shi
Hello Kylin users,

A new document has been added to the Apache Kylin website introducing the
SQL grammar, functions, and data types that Kylin supports. We believe it
will help new users. Many thanks to Na Zhai, who drafted this doc and
verified the sample queries.

Here is the link:

English:
https://kylin.apache.org/docs/tutorial/sql_reference.html

Chinese:
https://kylin.apache.org/cn/docs/tutorial/sql_reference.html

Best regards,

Shaofeng Shi 史少锋
Apache Kylin PMC
Work email: shaofeng@kyligence.io
Kyligence Inc: https://kyligence.io/

Apache Kylin FAQ: https://kylin.apache.org/docs/gettingstarted/faq.html
Join Kylin user mail group: user-subscr...@kylin.apache.org
Join Kylin dev mail group: dev-subscr...@kylin.apache.org


[jira] [Created] (KYLIN-3795) Submit Spark jobs via Apache Livy

2019-01-29 Thread Shaofeng SHI (JIRA)
Shaofeng SHI created KYLIN-3795:
---

 Summary: Submit Spark jobs via Apache Livy
 Key: KYLIN-3795
 URL: https://issues.apache.org/jira/browse/KYLIN-3795
 Project: Kylin
  Issue Type: New Feature
  Components: Spark Engine
Reporter: Shaofeng SHI


Livy is a REST service for Spark. Some users use Livy as their interface
to Spark. Kylin could gain the capability to submit Spark jobs via Livy.

https://livy.incubator.apache.org/



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Created] (KYLIN-3796) MongoDB as data source

2019-01-29 Thread Shaofeng SHI (JIRA)
Shaofeng SHI created KYLIN-3796:
---

 Summary: MongoDB as data source
 Key: KYLIN-3796
 URL: https://issues.apache.org/jira/browse/KYLIN-3796
 Project: Kylin
  Issue Type: New Feature
  Components: Others
Reporter: Shaofeng SHI






--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Created] (KYLIN-3793) org.apache.kylin.source.kafka.util.KafkaSampleProducer exit after generating 1 message

2019-01-28 Thread Shaofeng SHI (JIRA)
Shaofeng SHI created KYLIN-3793:
---

 Summary: org.apache.kylin.source.kafka.util.KafkaSampleProducer 
exit after generating 1 message
 Key: KYLIN-3793
 URL: https://issues.apache.org/jira/browse/KYLIN-3793
 Project: Kylin
  Issue Type: Bug
  Components: NRT Streaming
Affects Versions: v2.6.0
Reporter: Shaofeng SHI






--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[Discuss] Moving toward Apache Kylin 3.0

2019-01-22 Thread ShaoFeng Shi
Hi Kylin developers,

Last week, Kylin released v2.6.0, with the enhanced, distributed query
cache and the JDBC data source SDK. After this release, the next batch of
candidate features includes real-time streaming, Parquet storage, and Druid
storage. These features were developed over the past one to two years by
different Kylin players and were open sourced in the past six months. They
have already been staged in separate branches and are under evaluation by
the community. We have received much feedback from the community.

These candidate features are major additions to Kylin's existing
functionality. For example, the real-time streaming feature will take Kylin
from batch and historical analytics into real-time analytics. The Parquet
storage will make deployment more flexible and more cloud-friendly. Of
course, stabilizing and improving these features needs additional time and
effort.

So, when we merge and release them, we had better give the result a new
version number so that users can clearly tell the difference from the
current 2.x versions. I discussed this with several developers offline, and
we think it is time to move toward Kylin 3.0. So, once one of the above
features is merged, the version will be 3.0. The current 2.6 will be
maintained until 3.x is ready for production use.

Your comments, ideas, and suggestions are welcome!

Best regards,

Shaofeng Shi 史少锋
Apache Kylin PMC
Work email: shaofeng@kyligence.io
Kyligence Inc: https://kyligence.io/

Apache Kylin FAQ: https://kylin.apache.org/docs/gettingstarted/faq.html
Join Kylin user mail group: user-subscr...@kylin.apache.org
Join Kylin dev mail group: dev-subscr...@kylin.apache.org


Re: ERROR context.ContextLoader:350 : Context initialization failed

2019-01-21 Thread ShaoFeng Shi
It seems you're running Kylin from an IDE. Please make sure the
hbase-site.xml under "/home/hadoop/Desktop/kylin-2.3.x
(2)/server/../examples/test_case_data/sandbox" is valid for your
environment. If you look at that file, it uses "sandbox.hortonworks.com" as
the ZooKeeper host name; make sure this host name resolves on your machine.
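
A quick way to check this from the same machine is to ask the JVM to
resolve the host name. The following is only a minimal sketch: the class
name ZkHostCheck and its resolves helper are made up for illustration and
are not part of Kylin, and it only checks name resolution, not whether
ZooKeeper is actually listening on that host.

```java
import java.net.InetAddress;
import java.net.UnknownHostException;

// Hypothetical helper, not Kylin code: checks whether a host name resolves locally.
class ZkHostCheck {

    /** Returns true if the host name resolves to an address on this machine. */
    static boolean resolves(String host) {
        try {
            InetAddress.getByName(host);
            return true;
        } catch (UnknownHostException e) {
            return false;
        }
    }

    public static void main(String[] args) {
        // Default is the host name used by the sandbox hbase-site.xml mentioned above.
        String host = args.length > 0 ? args[0] : "sandbox.hortonworks.com";
        if (resolves(host)) {
            System.out.println(host + " resolves");
        } else {
            System.out.println(host + " does NOT resolve; check /etc/hosts or hbase-site.xml");
        }
    }
}
```

If the name does not resolve, either add an entry for it to the hosts file
or change the host name in hbase-site.xml to one that is reachable.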

Best regards,

Shaofeng Shi 史少锋
Apache Kylin PMC
Work email: shaofeng@kyligence.io
Kyligence Inc: https://kyligence.io/

Apache Kylin FAQ: https://kylin.apache.org/docs/gettingstarted/faq.html
Join Kylin user mail group: user-subscr...@kylin.apache.org
Join Kylin dev mail group: dev-subscr...@kylin.apache.org




Na Zhai wrote on Thu, Jan 17, 2019 at 5:33 PM:

> Hi, Jock.
>
>    Did you follow the instructions on this page?
> http://kylin.apache.org/development/dev_env.html. I also found “ClusterId
> read in ZooKeeper is null” in the log that you provided, so I think there
> is a problem with your development environment.
>
>
>
> Sent from Mail<https://go.microsoft.com/fwlink/?LinkId=550986> for Windows 10
>
>
>
> 
> From: Jock 
> Sent: Wednesday, January 16, 2019 5:17:39 PM
> To: dev@kylin.apache.org
> Subject: ERROR context.ContextLoader:350 : Context initialization failed
>
> Hello,
>
>
> I am getting the following error when I try to run the Kylin 2.3 source
> code in IntelliJ IDEA, and I have no idea what causes it. Could you give
> me some suggestions?
>
>
> Really appreciated.
>
>
> 2019-01-16 16:34:01,778 INFO  [main] common.KylinConfig:378 : Setting
> sandbox env, KYLIN_CONF=/home/hadoop/Desktop/kylin-2.3.x
> (2)/server/../examples/test_case_data/sandbox
> 2019-01-16 16:34:01,801 INFO  [main] util.ClassUtil:40 : Adding path
> /home/hadoop/Desktop/kylin-2.3.x
> (2)/server/../examples/test_case_data/sandbox to class path
> 2019-01-16 16:34:01,807 INFO  [main] common.KylinConfig:319 : Loading
> kylin-defaults.properties from
> /home/hadoop/Desktop/kylin-2.3.x%20(2)/core-common/target/classes/kylin-defaults.properties
> 2019-01-16 16:34:01,891 INFO  [main] common.KylinConfig:274 : Use
> KYLIN_CONF=/home/hadoop/Desktop/kylin-2.3.x
> (2)/server/../examples/test_case_data/sandbox
> 2019-01-16 16:34:01,896 INFO  [main] common.KylinConfig:99 : Initialized a
> new KylinConfig from getInstanceFromEnv : 1308927845
> 2019-01-16 16:34:01,897 INFO  [main] common.KylinConfigBase:1074 :
> override kylin.engine.mr.job-jar to /home/hadoop/Desktop/kylin-2.3.x
> (2)/server/../assembly/target/kylin-assembly-2.3.2-SNAPSHOT-job.jar
> 2019-01-16 16:34:01,901 INFO  [main] common.KylinConfigBase:919 : override
> kylin.storage.hbase.coprocessor-local-jar to
> /home/hadoop/Desktop/kylin-2.3.x
> (2)/server/../storage-hbase/target/kylin-storage-hbase-2.3.2-SNAPSHOT-coprocessor.jar
> 一月 16, 2019 4:34:03 下午 org.apache.catalina.core.AprLifecycleListener
> lifecycleEvent
> 信息: The APR based Apache Tomcat Native library which allows optimal
> performance in production environments was not found on the
> java.library.path:
> /usr/java/packages/lib/amd64:/usr/lib64:/lib64:/lib:/usr/lib
> 一月 16, 2019 4:34:05 下午 org.apache.coyote.AbstractProtocol init
> 信息: Initializing ProtocolHandler ["http-bio-7070"]
> 一月 16, 2019 4:34:05 下午 org.apache.catalina.core.StandardService
> startInternal
> 信息: Starting service Tomcat
> 一月 16, 2019 4:34:05 下午 org.apache.catalina.core.StandardEngine
> startInternal
> 信息: Starting Servlet Engine: Apache Tomcat/7.0.85
> 一月 16, 2019 4:34:06 下午 org.apache.catalina.startup.ContextConfig
> getDefaultWebXmlFragment
> 信息: No global web.xml found
> 一月 16, 2019 4:34:33 下午 org.apache.catalina.startup.TldConfig execute
> 信息: At least one JAR was scanned for TLDs yet contained no TLDs. Enable
> debug logging for this logger for a complete list of JARs that were scanned
> but no TLDs were found in them. Skipping unneeded JARs during scanning can
> improve startup time and JSP compilation time.
> 一月 16, 2019 4:34:33 下午 org.apache.catalina.core.ApplicationContext log
> 信息: No Spring WebApplicationInitializer types detected on classpath
> 一月 16, 2019 4:34:33 下午 org.apache.catalina.core.ApplicationContext log
> 信息: Initializing Spring root WebApplicationContext
> 2019-01-16 16:34:36,912 DEBUG [localhost-startStop-1]
> security.PasswordPlaceholderConfigurer:174 : Loading properties file from
> InputStream resource [resource loaded through InputStream]
> 2019-01-16 16:34:38,213 INFO  [localhost-startStop-1]
> metrics.MetricsManager:135 : Kylin metrics monitor is not enabled!!!
> 2019-01-16 16:34:39,843 INFO  [localhost-startStop-1]
> init.InitialTaskManager:38 : Kylin service is starting.
> 2019-01-16 16:34:40,114 INFO  [localhost-startStop-1]
> persistence.ResourceStore:86 : Using meta

Re: v2.6.0 download for Hadoop 3.1.1 / Hbase 2

2019-01-18 Thread ShaoFeng Shi
The download page has been updated with Kylin 2.6 for Hadoop 3 packages,
please check:

https://kylin.apache.org/download/

Best regards,

Shaofeng Shi 史少锋
Apache Kylin PMC
Work email: shaofeng@kyligence.io
Kyligence Inc: https://kyligence.io/

Apache Kylin FAQ: https://kylin.apache.org/docs/gettingstarted/faq.html
Join Kylin user mail group: user-subscr...@kylin.apache.org
Join Kylin dev mail group: dev-subscr...@kylin.apache.org




ShaoFeng Shi wrote on Wed, Jan 16, 2019 at 10:59 PM:

> Hi Jon,
>
> Yes, there will be. It will be uploaded a little later, as there are some
> code conflicts to resolve. Stay tuned.
>
> Btw, please subscribe to the mailing list before sending to it; otherwise,
> your email is blocked unless someone manually approves it. (To subscribe,
> send an email to dev-subscr...@kylin.apache.org, and then confirm via the
> reply email.)
>
> Best regards,
>
> Shaofeng Shi 史少锋
> Apache Kylin PMC
> Work email: shaofeng@kyligence.io
> Kyligence Inc: https://kyligence.io/
>
> Apache Kylin FAQ: https://kylin.apache.org/docs/gettingstarted/faq.html
> Join Kylin user mail group: user-subscr...@kylin.apache.org
> Join Kylin dev mail group: dev-subscr...@kylin.apache.org
>
>
>
>
> Jon Shoberg wrote on Mon, Jan 14, 2019 at 9:46 AM:
>
>> Will there be a binary download for v2.6.0 and Hadoop 3.1.1 / HBase 2?
>>
>> The HBase compatibility matrix lists Hadoop 3.1.1 as supported with HBase 2.
>>
>> Following the Hadoop releases, I believe 3.1.1 is being promoted as
>> production ready.
>>
>> Thanks! J
>>
>


Re: A Survey on The Usage of Sonar Cloud

2019-01-17 Thread ShaoFeng Shi
Hello Edna,

I have finished the survey, using my email shaofeng...@apache.org. I
appreciate your and SonarCloud's support for the open source community!

Best regards,

Shaofeng Shi 史少锋
Apache Kylin PMC
Work email: shaofeng@kyligence.io
Kyligence Inc: https://kyligence.io/

Apache Kylin FAQ: https://kylin.apache.org/docs/gettingstarted/faq.html
Join Kylin user mail group: user-subscr...@kylin.apache.org
Join Kylin dev mail group: dev-subscr...@kylin.apache.org




Edna Dias Canedo wrote on Wed, Jan 16, 2019 at 10:13 PM:

> Dear all,
>
> I am investigating how Apache development teams use static analysis
> tools (in particular SonarQube). To this end, I kindly ask you to answer a
> small survey on this topic. The survey is available at:
>
> https://canedo.typeform.com/to/JxPfG6
>
> All the best.
>
> --
> Professora Dra. Edna Dias Canedo
> Department of Computer Science
> University of Brasília (UnB), Campus Darcy Ribeiro
>


New tech blog: Introducing Kylin JDBC data source SDK

2019-01-17 Thread ShaoFeng Shi
Hello,

There is a new tech blog from Youcheng Zhang and Dong Li on the new feature
"Data source SDK" in Kylin 2.6:

https://kylin.apache.org/blog/2019/01/16/introduce-data-source-sdk-v2.6.0/

Best regards,

Shaofeng Shi 史少锋
Apache Kylin PMC
Work email: shaofeng@kyligence.io
Kyligence Inc: https://kyligence.io/

Apache Kylin FAQ: https://kylin.apache.org/docs/gettingstarted/faq.html
Join Kylin user mail group: user-subscr...@kylin.apache.org
Join Kylin dev mail group: dev-subscr...@kylin.apache.org


New tech blog: How cisco team improved Kylin's throughput by five times

2019-01-16 Thread ShaoFeng Shi
Hello,

The engineer Li Zongwei from Cisco China has composed a tech blog on how he
identified a performance bottleneck in Kylin and the QPS before and after
this hotfix. The fix is released in Kylin 2.5.2 and 2.6.0.

English version:
http://kyligence.io/2019/01/11/how-ciscos-big-data-team-improved-apache-kylins-high-concurrent-throughput-by-5x/

Chinese version:
http://kyligence.io/zh/2019/01/07/how-cisco-big-data-team-increased-kylin-throughput-five-times/

Best regards,

Shaofeng Shi 史少锋
Apache Kylin PMC
Work email: shaofeng@kyligence.io
Kyligence Inc: https://kyligence.io/

Apache Kylin FAQ: https://kylin.apache.org/docs/gettingstarted/faq.html
Join Kylin user mail group: user-subscr...@kylin.apache.org
Join Kylin dev mail group: dev-subscr...@kylin.apache.org


Re: v2.6.0 download for Hadoop 3.1.1 / Hbase 2

2019-01-16 Thread ShaoFeng Shi
Hi Jon,

Yes, there will be. It will be uploaded a little later, as there are some
code conflicts to resolve. Stay tuned.

Btw, please subscribe to the mailing list before sending to it; otherwise,
your email is blocked unless someone manually approves it. (To subscribe,
send an email to dev-subscr...@kylin.apache.org, and then confirm via the
reply email.)

Best regards,

Shaofeng Shi 史少锋
Apache Kylin PMC
Work email: shaofeng@kyligence.io
Kyligence Inc: https://kyligence.io/

Apache Kylin FAQ: https://kylin.apache.org/docs/gettingstarted/faq.html
Join Kylin user mail group: user-subscr...@kylin.apache.org
Join Kylin dev mail group: dev-subscr...@kylin.apache.org




Jon Shoberg wrote on Mon, Jan 14, 2019 at 9:46 AM:

> Will there be a binary download for v2.6.0 and Hadoop 3.1.1 / HBase 2?
>
> The HBase compatibility matrix lists Hadoop 3.1.1 as supported with HBase 2.
>
> Following the Hadoop releases, I believe 3.1.1 is being promoted as
> production ready.
>
> Thanks! J
>


Re: Quicksight integration with apache kylin

2019-01-14 Thread ShaoFeng Shi
I don't have experience with AWS QuickSight. Does it support a generic JDBC
connector?

If it is still unclear, maybe you can try AWS's support channels.

Best regards,

Shaofeng Shi 史少锋
Apache Kylin PMC
Work email: shaofeng@kyligence.io
Kyligence Inc: https://kyligence.io/

Apache Kylin FAQ: https://kylin.apache.org/docs/gettingstarted/faq.html
Join Kylin user mail group: user-subscr...@kylin.apache.org
Join Kylin dev mail group: dev-subscr...@kylin.apache.org




Na Zhai wrote on Mon, Jan 14, 2019 at 1:09 PM:

> Hi, venkateshd.
>
>    If you succeed, you are welcome to write an article about it for
> the community.
>
>
>
>Best wishes!
>
>
>
> Sent from Mail<https://go.microsoft.com/fwlink/?LinkId=550986> for Windows 10
>
>
>
> 
> From: venkateshd 
> Sent: Saturday, January 12, 2019 8:36:58 AM
> To: dev@kylin.apache.org
> Subject: Quicksight integration with apache kylin
>
> Has anyone successfully integrated Apache Kylin with AWS QuickSight? If so,
> could you post some information about it? As far as I can tell, there is no
> third-party library available for making this integration possible.
>
> If someone has developed a package for integrating these two components,
> please let me know.
>
> --
> Sent from: http://apache-kylin.74782.x6.nabble.com/
>


Re: Kylin 2.6.0 fails to build cube with spark

2019-01-11 Thread ShaoFeng Shi
Hi Hubert,

In the original log file, I see hbase-server and hbase-common are on the
spark command:

--jars 
*/usr/lib/hbase/lib/hbase-common-1.4.2.jar*,*/usr/lib/hbase/lib/hbase-server-1.4.2.jar*,/usr/lib/hbase/lib/hbase-client-1.4.2.jar,/usr/lib/hbase/lib/hbase-protocol-1.4.2.jar,/usr/lib/hbase/lib/hbase-hadoop-compat-1.4.2.jar,/usr/lib/hbase/lib/htrace-core-3.1.0-incubating.jar,/usr/lib/hbase/lib/metrics-core-2.2.0.jar,/usr/lib/hbase/lib/hbase-hadoop-compat-1.4.2.jar,/usr/lib/hbase/lib/hbase-hadoop2-compat-1.4.2.jar,


Did you add other jars?


Best regards,

Shaofeng Shi 史少锋
Apache Kylin PMC
Work email: shaofeng@kyligence.io
Kyligence Inc: https://kyligence.io/

Apache Kylin FAQ: https://kylin.apache.org/docs/gettingstarted/faq.html
Join Kylin user mail group: user-subscr...@kylin.apache.org
Join Kylin dev mail group: dev-subscr...@kylin.apache.org




hubert stefani wrote on Fri, Jan 11, 2019 at 6:20 PM:

>  Indeed. After adding the hbase-server and hbase-common jars, the problem
> seems to be fixed.
>
>
> On Friday, January 11, 2019 at 11:11:40 UTC+1, ShaoFeng Shi <
> shaofeng...@apache.org> wrote:
>
>  It seems some HBase classes are missing. HFile.class should be in the
> hbase-server jar; I am not sure whether this is an EMR packaging issue. You
> can unzip the HBase jar files to see which jar contains the class, and then
> add it to the spark/lib folder.
>
> java.lang.NoClassDefFoundError: Could not initialize class
> org.apache.hadoop.hbase.io.hfile.HFile
> at org.apache.hadoop.hbase.mapreduce.HFileOutputFormat2$1.getNewWriter(HFileOutputFormat2.java:305)
> at org.apache.hadoop.hbase.mapreduce.HFileOutputFormat2$1.write(HFileOutputFormat2.java:229)
> at org.apache.hadoop.hbase.mapreduce.HFileOutputFormat2$1.write(HFileOutputFormat2.java:167)
> at org.apache.spark.internal.io.HadoopMapReduceWriteConfigUtil.write(SparkHadoopWriter.scala:356)
> at org.apache.spark.internal.io.SparkHadoopWriter$$anonfun$4.apply(SparkHadoopWriter.scala:130)
> at org.apache.spark.internal.io.SparkHadoopWriter$$anonfun$4.apply(SparkHadoopWriter.scala:127)
> at org.apache.spark.util.Utils$.tryWithSafeFinallyAndFailureCallbacks(Utils.scala:1415)
>
>
>
> Best regards,
>
> Shaofeng Shi 史少锋
> Apache Kylin PMC
> Work email: shaofeng@kyligence.io
> Kyligence Inc: https://kyligence.io/
>
> Apache Kylin FAQ: https://kylin.apache.org/docs/gettingstarted/faq.html
> Join Kylin user mail group: user-subscr...@kylin.apache.org
> Join Kylin dev mail group: dev-subscr...@kylin.apache.org
>
>
>
>
> hubert stefani wrote on Fri, Jan 11, 2019 at 4:52 PM:
>
> > hello,
> >
> > we are testing the 2.6 RC and we are facing a systematic issue when
> > building cubes with the Spark engine (even with the sample cube), whereas
> > the MapReduce engine succeeds.
> >
> > The job fails at step #8 (Step Name: Convert Cuboid Data to HFile)
> > with the following error (the full log is available as an attachment):
> >
> > ClassNotFoundException: org.apache.hadoop.hbase.metrics.MetricRegistry
> >
> > We run Kylin on AWS EMR 5.13 (it also failed with 5.17).
> >
> > Do you have any idea why this happens?
> > Hubert
> >


Re: Kylin 2.6.0 fails to build cube with spark

2019-01-11 Thread ShaoFeng Shi
It seems some HBase classes are missing. HFile.class should be in the
hbase-server jar; I am not sure whether this is an EMR packaging issue. You
can unzip the HBase jar files to see which jar contains the class, and then
add it to the spark/lib folder.

java.lang.NoClassDefFoundError: Could not initialize class
org.apache.hadoop.hbase.io.hfile.HFile
at org.apache.hadoop.hbase.mapreduce.HFileOutputFormat2$1.getNewWriter(HFileOutputFormat2.java:305)
at org.apache.hadoop.hbase.mapreduce.HFileOutputFormat2$1.write(HFileOutputFormat2.java:229)
at org.apache.hadoop.hbase.mapreduce.HFileOutputFormat2$1.write(HFileOutputFormat2.java:167)
at org.apache.spark.internal.io.HadoopMapReduceWriteConfigUtil.write(SparkHadoopWriter.scala:356)
at org.apache.spark.internal.io.SparkHadoopWriter$$anonfun$4.apply(SparkHadoopWriter.scala:130)
at org.apache.spark.internal.io.SparkHadoopWriter$$anonfun$4.apply(SparkHadoopWriter.scala:127)
at org.apache.spark.util.Utils$.tryWithSafeFinallyAndFailureCallbacks(Utils.scala:1415)
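
The jar lookup suggested above can also be scripted instead of unzipping
each file by hand. The sketch below is an illustration only, not Kylin or
HBase code: JarClassFinder and findJarsContaining are hypothetical names,
and the default directory /usr/lib/hbase/lib is an assumption based on the
HBase jar paths reported for this EMR setup. It uses only java.util.jar
from the JDK.

```java
import java.io.File;
import java.io.IOException;
import java.util.ArrayList;
import java.util.List;
import java.util.jar.JarFile;

// Hypothetical helper, not Kylin code: finds which jars in a folder contain a class.
class JarClassFinder {

    /** Returns the names of jars directly under libDir that contain the given class. */
    static List<String> findJarsContaining(File libDir, String className) throws IOException {
        // A class a.b.C is stored in a jar as the entry a/b/C.class
        String entryName = className.replace('.', '/') + ".class";
        List<String> hits = new ArrayList<>();
        File[] jars = libDir.listFiles((dir, name) -> name.endsWith(".jar"));
        if (jars == null) {
            return hits; // libDir is not a directory or cannot be read
        }
        for (File jar : jars) {
            try (JarFile jf = new JarFile(jar)) {
                if (jf.getEntry(entryName) != null) {
                    hits.add(jar.getName());
                }
            }
        }
        return hits;
    }

    public static void main(String[] args) throws IOException {
        // Assumed defaults; pass your own lib folder and class name as arguments.
        File libDir = new File(args.length > 0 ? args[0] : "/usr/lib/hbase/lib");
        String cls = args.length > 1 ? args[1] : "org.apache.hadoop.hbase.io.hfile.HFile";
        for (String jar : findJarsContaining(libDir, cls)) {
            System.out.println(jar);
        }
    }
}
```

Any jar printed for the missing class is a candidate to add to the Spark
classpath.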



Best regards,

Shaofeng Shi 史少锋
Apache Kylin PMC
Work email: shaofeng@kyligence.io
Kyligence Inc: https://kyligence.io/

Apache Kylin FAQ: https://kylin.apache.org/docs/gettingstarted/faq.html
Join Kylin user mail group: user-subscr...@kylin.apache.org
Join Kylin dev mail group: dev-subscr...@kylin.apache.org




hubert stefani wrote on Fri, Jan 11, 2019 at 4:52 PM:

> hello,
>
> we are testing the 2.6 RC and we are facing a systematic issue when
> building cubes with the Spark engine (even with the sample cube), whereas
> the MapReduce engine succeeds.
>
> The job fails at step #8 (Step Name: Convert Cuboid Data to HFile)
> with the following error (the full log is available as an attachment):
>
> ClassNotFoundException: org.apache.hadoop.hbase.metrics.MetricRegistry
>
> We run Kylin on AWS EMR 5.13 (it also failed with 5.17).
>
> Do you have any idea why this happens?
> Hubert
>


Re: [VOTE] Release apache-kylin-2.6.0 (RC1)

2019-01-09 Thread ShaoFeng Shi
Checked the source package, the signature, and the sha256 hash;

Mvn package and test are all successful with jdk 1.8.0_111 on Mac;

+1 (binding)

Best regards,

Shaofeng Shi 史少锋
Apache Kylin PMC
Work email: shaofeng@kyligence.io
Kyligence Inc: https://kyligence.io/

Apache Kylin FAQ: https://kylin.apache.org/docs/gettingstarted/faq.html
Join Kylin user mail group: user-subscr...@kylin.apache.org
Join Kylin dev mail group: dev-subscr...@kylin.apache.org




JiaTao Tao wrote on Thu, Jan 10, 2019 at 9:42 AM:

> +1
> mvn test passed
>
> Yanghong Zhong wrote on Wed, Jan 9, 2019 at 2:46 AM:
>
> > Hi all,
> >
> > I have created a build for Apache Kylin 2.6.0, release candidate 1.
> >
> > Changes highlights:
> > [KYLIN-2895] - Refine query cache by changing the query cache expiration
> > strategy by signature checking and introducing memcached as distributed
> > cache
> > [KYLIN-2932] - Simplify the thread model for in-memory cubing
> > [KYLIN-3021] - Check MapReduce job failed reason and include the
> > diagnostics into email notification
> > [KYLIN-3272] - Upgrade Spark dependency to 2.3.2
> > [KYLIN-3540] - Improve Mandatory Cuboid Recommendation Algorithm
> > [KYLIN-3552] - Data Source SDK to ingest data from different JDBC sources
> > [KYLIN-3611] - Upgrade Tomcat to 7.0.91, 8.5.34 or later
> > [KYLIN-3656] - Improve HLLCounter performance
> > [KYLIN-3700] - Quote sql identities when creating flat table
> > [KYLIN-3729] - CLUSTER BY CAST(field AS STRING) will accelerate base
> cuboid
> > build with UHC global dict
> >
> > Thanks to everyone who has contributed to this release.
> > Here are the release notes:
> >
> >
> https://issues.apache.org/jira/secure/ReleaseNote.jspa?projectId=12316121&version=12344003
> >
> > The commit to be voted upon:
> >
> >
> >
> https://github.com/apache/kylin/commit/8737bc1f555a2789a67462c8f8420b6ab3be97ce
> >
> > Its hash is 8737bc1f555a2789a67462c8f8420b6ab3be97ce.
> >
> > The artifacts to be voted on are located here:
> > https://dist.apache.org/repos/dist/dev/kylin/apache-kylin-2.6.0-rc1/
> >
> > The hash of the artifact is as follows:
> > apache-kylin-2.6.0-source-release.zip.sha256
> >
> > 3621750945823ff4f0c4124b6d5b5c7164d9b08686729352ea22b2f486958d2a
> >
> > A staged Maven repository is available for review at:
> > https://repository.apache.org/content/repositories/orgapachekylin-1059/
> >
> > Release artifacts are signed with the following key:
> > https://people.apache.org/keys/committer/nju_yaho.asc
> >
> > Please vote on releasing this package as Apache Kylin 2.6.0.
> >
> > The vote is open for the next 72 hours and passes if a majority of
> > at least three +1 PMC votes are cast.
> >
> > [ ] +1 Release this package as Apache Kylin 2.6.0
> > [ ]  0 I don't feel strongly about it, but I'm okay with the release
> > [ ] -1 Do not release this package because...
> >
> >
> > Here is my vote:
> >
> > +1 (binding)
> >
> > Best regards,
> > Yanghong Zhong
> > eBay Inc.
> >
>
>
> --
>
>
> Regards!
>
> Aron Tao
>


Redash v6 adds Kylin as data source

2019-01-07 Thread ShaoFeng Shi
Hi Kylin user,

Redash v6 just announced the new data source support of Apache Kylin;
Please check this post:
https://blog.redash.io/just-in-time-for-christmas-redash-v6-70cb23dfbbf3

Best regards,

Shaofeng Shi 史少锋
Apache Kylin PMC
Work email: shaofeng@kyligence.io
Kyligence Inc: https://kyligence.io/

Apache Kylin FAQ: https://kylin.apache.org/docs/gettingstarted/faq.html
Join Kylin user mail group: user-subscr...@kylin.apache.org
Join Kylin dev mail group: dev-subscr...@kylin.apache.org


Re: Can multiple machines be configured to build cubes at the same time?

2019-01-07 Thread ShaoFeng Shi
Please check "Enable multiple job engines" in
https://kylin.apache.org/docs/install/advance_settings.html

Best regards,

Shaofeng Shi 史少锋
Apache Kylin PMC
Work email: shaofeng@kyligence.io
Kyligence Inc: https://kyligence.io/

Apache Kylin FAQ: https://kylin.apache.org/docs/gettingstarted/faq.html
Join Kylin user mail group: user-subscr...@kylin.apache.org
Join Kylin dev mail group: dev-subscr...@kylin.apache.org




Na Zhai wrote on Mon, Jan 7, 2019 at 8:36 PM:

> Hi, 奥威软件.
>
>    Do you mean building different cubes at the same time? If so, yes, that is possible.
>
>
>
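> Sent from Mail<https://go.microsoft.com/fwlink/?LinkId=550986> for Windows 10
>
>
>
> 
> From: 奥威软件 <3513797...@qq.com>
> Sent: Monday, January 7, 2019 6:55:30 PM
> To: dev
> Subject: Can multiple machines be configured to build cubes at the same time?
>
> As the subject says:
>  Can multiple machines be configured to build cubes at the same time?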
>


Re: Increment Upload in Kylin

2019-01-06 Thread ShaoFeng Shi
This sounds like a multi-level partitioning scenario: time/date is the
first-level partition, and another dimension is the second-level partition.

Best regards,

Shaofeng Shi 史少锋
Apache Kylin PMC
Work email: shaofeng@kyligence.io
Kyligence Inc: https://kyligence.io/

Apache Kylin FAQ: https://kylin.apache.org/docs/gettingstarted/faq.html
Join Kylin user mail group: user-subscr...@kylin.apache.org
Join Kylin dev mail group: dev-subscr...@kylin.apache.org




somu0...@gmail.com wrote on Mon, Jan 7, 2019 at 10:16 AM:

> Is there any feature in Kylin that performs an incremental update without
> refreshing the complete cube? For example, if one dimension gets new data
> every day, Kylin should compute only the new data without refreshing the
> complete cube, which would save cube build time. Could you please tell me
> whether such a feature is available in Kylin?
>
> --
> Sent from: http://apache-kylin.74782.x6.nabble.com/
>


Re: Hello, I hit a problem in the Hive garbage cleanup step when building a cube, and found the cause after tracing the source code.

2019-01-03 Thread ShaoFeng Shi
Hi, thanks for reporting this. It is very interesting (I have never heard of
it before). Did you observe this problem only occasionally, or can it be
reproduced consistently (the file exists, but Kylin reports that it does
not)?

Can you try adding some logging there and then check whether it is really
misreporting? Thank you!
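
For the extra logging, something along these lines could be placed next to
the existence check. It is a sketch under assumptions: ConfFileProbe and
describe are hypothetical names, not Kylin code, and the default path in
main is a placeholder. It prints what both java.io.File and
java.nio.file.Files report for the same path. If Files.exists and
Files.notExists are both false, the JVM could not determine the file's
status (for example because of a permission problem), which is different
from the file being absent.

```java
import java.io.File;
import java.nio.file.Files;
import java.nio.file.Path;

// Hypothetical diagnostic helper, not Kylin code.
class ConfFileProbe {

    /** Builds a one-line description of how the JVM currently sees the file. */
    static String describe(File f) {
        Path p = f.toPath();
        return "path=" + f.getAbsolutePath()
                + " File.exists=" + f.exists()
                + " Files.exists=" + Files.exists(p)
                + " Files.notExists=" + Files.notExists(p)
                + " readable=" + Files.isReadable(p);
    }

    public static void main(String[] args) {
        // Placeholder default; pass the real path of kylin_hive_conf.xml as an argument.
        File conf = new File(args.length > 0 ? args[0] : "kylin_hive_conf.xml");
        System.out.println(describe(conf));
    }
}
```

Logging this line just before the exception is thrown would show whether
the path is the expected one and whether the two APIs disagree.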

Best regards,

Shaofeng Shi 史少锋
Apache Kylin PMC
Work email: shaofeng@kyligence.io
Kyligence Inc: https://kyligence.io/

Apache Kylin FAQ: https://kylin.apache.org/docs/gettingstarted/faq.html
Join Kylin user mail group: user-subscr...@kylin.apache.org
Join Kylin dev mail group: dev-subscr...@kylin.apache.org




冯广彬 wrote on Thu, Jan 3, 2019 at 3:58 PM:

> Here is the detailed description.
> My error stack trace:
> org.apache.kylin.job.exception.ExecuteException:
> org.apache.kylin.job.exception.ExecuteException:
> java.lang.RuntimeException: Failed to read kylin_hive_conf.xml
> at org.apache.kylin.job.execution.AbstractExecutable.execute(AbstractExecutable.java:179)
> at org.apache.kylin.job.impl.threadpool.DefaultScheduler$JobRunner.run(DefaultScheduler.java:113)
> at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1149)
> at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:624)
> at java.lang.Thread.run(Thread.java:748)
> Caused by: org.apache.kylin.job.exception.ExecuteException:
> java.lang.RuntimeException: Failed to read kylin_hive_conf.xml
> at org.apache.kylin.job.execution.AbstractExecutable.execute(AbstractExecutable.java:179)
> at org.apache.kylin.job.execution.DefaultChainedExecutable.doWork(DefaultChainedExecutable.java:70)
> at org.apache.kylin.job.execution.AbstractExecutable.execute(AbstractExecutable.java:164)
> ... 4 more
> Caused by: java.lang.RuntimeException: Failed to read kylin_hive_conf.xml
> at org.apache.kylin.common.util.SourceConfigurationUtil.loadXmlConfiguration(SourceConfigurationUtil.java:83)
> at org.apache.kylin.common.util.SourceConfigurationUtil.loadHiveConfiguration(SourceConfigurationUtil.java:57)
> at org.apache.kylin.common.util.HiveCmdBuilder.<init>(HiveCmdBuilder.java:46)
> at org.apache.kylin.source.hive.GarbageCollectionStep.cleanUpIntermediateFlatTable(GarbageCollectionStep.java:61)
> at org.apache.kylin.source.hive.GarbageCollectionStep.doWork(GarbageCollectionStep.java:48)
> at org.apache.kylin.job.execution.AbstractExecutable.execute(AbstractExecutable.java:164)
>
>
>
> I downloaded the source code and looked at the class
> org.apache.kylin.common.util.SourceConfigurationUtil.
> I found that it performs this check when loading the configuration file:
>     if (!confFile.exists()) {
>         if (checkExist)
>             throw new RuntimeException("Failed to read " + xmlFileName);
>         else
>             return confProps;
>     }
>
> Later, after reading
> https://bugs.java.com/bugdatabase/view_bug.do?bug_id=5003595
> I found that on Linux this method may return false even though the file
> exists. Could you review whether this is a real problem, and if it is, how
> can I work around it?
> Thanks!
>


Re: Fw: Please share a public maven setting.xml configuration

2019-01-02 Thread ShaoFeng Shi
yes, please go ahead.

Best regards,

Shaofeng Shi 史少锋
Apache Kylin PMC
Work email: shaofeng@kyligence.io
Kyligence Inc: https://kyligence.io/

Apache Kylin FAQ: https://kylin.apache.org/docs/gettingstarted/faq.html
Join Kylin user mail group: user-subscr...@kylin.apache.org
Join Kylin dev mail group: dev-subscr...@kylin.apache.org




冯广彬 wrote on Thu, Jan 3, 2019 at 2:37 PM:

> Hello Shaofeng Shi, I have run into some problems. Can we communicate in Chinese?
>
> ShaoFeng Shi wrote on Sat, Dec 29, 2018 at 5:45 PM:
>
> > I have no customization in maven setting, but I have a local proxy
> server:
> >
> > export http_proxy=http://127.0.0.1:1087;
> > export https_proxy=http://127.0.0.1:1087;
> >
> > Best regards,
> >
> > Shaofeng Shi 史少锋
> > Apache Kylin PMC
> > Work email: shaofeng@kyligence.io
> > Kyligence Inc: https://kyligence.io/
> >
> > Apache Kylin FAQ: https://kylin.apache.org/docs/gettingstarted/faq.html
> > Join Kylin user mail group: user-subscr...@kylin.apache.org
> > Join Kylin dev mail group: dev-subscr...@kylin.apache.org
> >
> >
> >
> >
> > _ wrote on Sat, Dec 29, 2018 at 5:38 PM:
> >
> > >
> > >
> > >
> > >
> > > - Forwarded mail -
> > > From: _ 
> > > Date: 12/29/2018 16:44
> > > To: dev 
> > > Cc:
> > > Subject: Please share a public maven setting.xml configuration
> > > Hi all:
> > >    I want to compile the kylin-on-parquet source code, but some jars
> > > cannot be found with my Maven configuration. Could anybody share a
> > > working settings.xml? Many thanks!
> > >
> > >
> > >
> > >
> > >
> > >
> > >
> >
>


Re: Re: Kylin will not delete old hbase table when refresh the segment

2019-01-02 Thread ShaoFeng Shi
JIRA is created as below; anyone who wants to contribute can leave a
comment under it, then I will change the assignee to you:

https://issues.apache.org/jira/browse/KYLIN-3753

Best regards,

Shaofeng Shi 史少锋
Apache Kylin PMC
Work email: shaofeng@kyligence.io
Kyligence Inc: https://kyligence.io/

Apache Kylin FAQ: https://kylin.apache.org/docs/gettingstarted/faq.html
Join Kylin user mail group: user-subscr...@kylin.apache.org
Join Kylin dev mail group: dev-subscr...@kylin.apache.org




Ma Gang wrote on Wed, Dec 19, 2018 at 2:39 PM:

> +1. The behavior can be the same as for merging segments: old HTables can
> be deleted directly. I am not sure whether it makes sense to keep the old
> HTables in order to be able to roll back to the old data.
>
> At 2018-12-17 17:49:50, "ShaoFeng Shi"  wrote:
> >It can be improved to do the cleanup automatically. I heard that some
> >community users have implemented this; a patch is welcome!
> >
> >Best regards,
> >
> >Shaofeng Shi 史少锋
> >Apache Kylin PMC
> >Work email: shaofeng@kyligence.io
> >Kyligence Inc: https://kyligence.io/
> >
> >
> >
> >
> >
> >Chao Long wrote on Mon, Dec 17, 2018 at 5:40 PM:
> >
> >> Hi,
> >> You can use the storage cleanup tool:
> >> http://kylin.apache.org/docs/howto/howto_cleanup_storage.html
> >>
> >>
> >> --
> >> Best Regards,
> >> Chao Long
> >>
> >>
> >> -- Original --
> >> From:  "mailpig";
> >> Date:  Mon, Dec 17, 2018 05:16 PM
> >> To:  "dev";
> >>
> >> Subject:  Kylin will not delete old hbase table when refresh the segment
> >>
> >>
> >>
> >> Hi, my Kylin version is 2.1.0 and it has been running for a year.
> >> However, there are now too many tables in HBase. I found that when I
> >> refresh a segment, Kylin does not delete the old HBase table. Is there
> >> a tool to delete the old HBase tables?
> >>
> >> --
> >> Sent from: http://apache-kylin.74782.x6.nabble.com/
>


[jira] [Created] (KYLIN-3753) Delete old hbase table when refresh the segment

2019-01-02 Thread Shaofeng SHI (JIRA)
Shaofeng SHI created KYLIN-3753:
---

 Summary: Delete old hbase table when refresh the segment
 Key: KYLIN-3753
 URL: https://issues.apache.org/jira/browse/KYLIN-3753
 Project: Kylin
  Issue Type: Improvement
  Components: Job Engine
Reporter: Shaofeng SHI






--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


Re: Re: Evaluate Kylin on Parquet

2019-01-01 Thread ShaoFeng Shi
Hi Yang,

The real-time streaming feature is also under review and testing now. I
think when they (new storage and real-time) are ready to be merged, we can
propose to jump the version to 3.0.

Best regards,

Shaofeng Shi 史少锋
Apache Kylin PMC
Work email: shaofeng@kyligence.io
Kyligence Inc: https://kyligence.io/





Li Yang wrote on Tue, Jan 1, 2019 at 12:40 PM:

> From the discussion, apparently a new storage will be added sooner or later.
>
> Will it be a new big version of Kylin? Like Apache Kylin 3.0? Also how
> about the migration from old storage? I assume old cube data has to be
> transformed and loaded into the new storage.
>
> Yang
>
> On Sat, Dec 29, 2018 at 5:52 PM ShaoFeng Shi 
> wrote:
>
>> Thanks very much to Yiming and Jiatao for their comments; they are very
>> valuable. There are many improvements that can be made to this new
>> storage. We welcome all kinds of contributions and would like to improve
>> it together with the community in 2019!
>>
>> Best regards,
>>
>> Shaofeng Shi 史少锋
>> Apache Kylin PMC
>> Work email: shaofeng@kyligence.io
>> Kyligence Inc: https://kyligence.io/
>>
>>
>>
>>
>>
>> JiaTao Tao wrote on Wed, Dec 19, 2018 at 8:44 PM:
>>
>> > Hi all,
>> >
>> > I fully agree with Yiming, and here I expand a little more on
>> > "Distributed computing".
>> >
>> > As Yiming mentioned, Kylin parses the query into an execution plan
>> > using Calcite (Kylin rewrites the execution plan because the data in
>> > cubes is already aggregated, so the original plan cannot be used
>> > directly). The plan is a tree: each node represents a specific
>> > calculation, and data flows from bottom to top with each calculation
>> > applied along the way.
>> > [image: image.png]
>> > (Pic from https://blog.csdn.net/yu616568/article/details/50838504, a
>> > really good blog.)
>> >
>> > At present, Kylin does almost all of these calculations on its own
>> > node; in other words, we cannot fully use the power of the cluster, and
>> > that node is a SPOF. Hence this design: we visit the tree *and
>> > transform each node into operations on Spark's DataFrames (i.e. "DF").*
>> >
>> > More specifically, we visit the nodes recursively until we meet the
>> > "TableScan" node (like pushing onto a stack). E.g., in the above
>> > diagram, the first node we meet is a "Sort" node; we just visit its
>> > child(ren), and we keep visiting each node's child(ren) until we meet
>> > a "TableScan" node.
>> >
>> > In the "TableScan" node, we generate the initial DF; the DF is then
>> > popped up to the "Filter" node, which applies its own operation, like
>> > "df.filter(xxx)". In the end, every node's operation has been applied
>> > to this DF, and the final call chain looks like:
>> > "df.filter(xxx).select(xxx).agg(xxx).sort(xxx)".
>> >
>> > Once we have the final DataFrame and trigger the calculation, the rest
>> > is handled by Spark. We gain tremendous benefits at the computation
>> > level; more details can be found in my previous post:
>> > http://apache-kylin.74782.x6.nabble.com/Re-DISCUSS-Columnar-storage-engine-for-Apache-Kylin-tc12113.html
>> > .
>> >
>> >
>> > --
>> >
>> >
>> > Regards!
>> >
>> > Aron Tao
>> >
>> >
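>> > 许益铭 wrote on Wed, Dec 19, 2018 at 11:40 AM:
>> >
>> >> hi All!
>> >> Regarding the issues raised by Chao Long, I have the following views:
>> >>
>> >> 1. At present, our architecture is divided into two layers: the storage
>> >> layer and the computing layer. In the storage layer we have already
>> >> made some optimizations, doing pre-aggregation there to reduce the
>> >> amount of data returned. However, runtime aggregation and joins happen
>> >> on the Kylin server side, so serialization is unavoidable, and this
>> >> architecture easily leads to a single-point bottleneck. If the runtime
>> >> agg or join involves a large amount of data, query performance drops
>> >> sharply and the Kylin server suffers heavy GC.
>> >>
>> >> 2. On the dictionary: it was originally introduced to align rowkeys in
>> >> HBase and also to save some storage. But it brings another problem:
>> >> HBase has difficulty handling variable-length string dimensions. For a
>> >> high-cardinality, variable-length dimension, we usually have to build a
>> >> very large dictionary or assign a fairly large fixed length, which
>> >> doubles the storage; and because the dictionary is large, query
>> >> performance suffers badly (GC). With columnar storage we would not
>> >> need to worry about this.
>> >>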

Re: Re: Evaluate Kylin on Parquet

2018-12-29 Thread ShaoFeng Shi
Thanks very much to Yiming and Jiatao for their comments; they are very
valuable. There are many improvements that can be made to this new storage.
We welcome all kinds of contributions and would like to improve it together
with the community in 2019!

Best regards,

Shaofeng Shi 史少锋
Apache Kylin PMC
Work email: shaofeng@kyligence.io
Kyligence Inc: https://kyligence.io/





JiaTao Tao wrote on Wed, Dec 19, 2018 at 8:44 PM:

> Hi all,
>
> I fully agree with Yiming, and here I expand a little more on
> "Distributed computing".
>
> As Yiming mentioned, Kylin parses the query into an execution plan
> using Calcite (Kylin rewrites the execution plan because the data in cubes
> is already aggregated, so the original plan cannot be used directly). The
> plan is a tree: each node represents a specific calculation, and data flows
> from bottom to top with each calculation applied along the way.
> [image: image.png]
> (Pic from https://blog.csdn.net/yu616568/article/details/50838504, a
> really good blog.)
>
> At present, Kylin does almost all of these calculations on its own
> node; in other words, we cannot fully use the power of the cluster, and
> that node is a SPOF. Hence this design: we visit the tree *and
> transform each node into operations on Spark's DataFrames (i.e. "DF").*
>
> More specifically, we visit the nodes recursively until we meet the
> "TableScan" node (like pushing onto a stack). E.g., in the above
> diagram, the first node we meet is a "Sort" node; we just visit its
> child(ren), and we keep visiting each node's child(ren) until we meet
> a "TableScan" node.
>
> In the "TableScan" node, we generate the initial DF; the DF is then
> popped up to the "Filter" node, which applies its own operation, like
> "df.filter(xxx)". In the end, every node's operation has been applied
> to this DF, and the final call chain looks like:
> "df.filter(xxx).select(xxx).agg(xxx).sort(xxx)".
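
The traversal described above can be sketched without Spark. This is only an illustration under stated assumptions: PlanNode, FakeDF and the node-to-method mapping are hypothetical stand-ins, not Kylin or Calcite classes, and FakeDF merely records the chained calls instead of running them.

```python
from dataclasses import dataclass, field
from typing import List


@dataclass
class PlanNode:
    """One node of a query plan tree (hypothetical stand-in for a Calcite rel node)."""
    op: str                      # "TableScan", "Filter", "Project", "Aggregate", "Sort"
    arg: str = "xxx"             # placeholder for the node's expression
    children: List["PlanNode"] = field(default_factory=list)


class FakeDF:
    """Records the chained calls as a string instead of running Spark."""

    def __init__(self, chain: str = "df"):
        self.chain = chain

    def _call(self, method: str, arg: str) -> "FakeDF":
        return FakeDF(f"{self.chain}.{method}({arg})")

    def filter(self, arg: str) -> "FakeDF":
        return self._call("filter", arg)

    def select(self, arg: str) -> "FakeDF":
        return self._call("select", arg)

    def agg(self, arg: str) -> "FakeDF":
        return self._call("agg", arg)

    def sort(self, arg: str) -> "FakeDF":
        return self._call("sort", arg)


# Which DataFrame method each plan node maps to (an assumed mapping).
OP_TO_METHOD = {"Filter": "filter", "Project": "select",
                "Aggregate": "agg", "Sort": "sort"}


def visit(node: PlanNode) -> FakeDF:
    """Recurse down to TableScan, then apply each node's operation on the way back up."""
    if node.op == "TableScan":
        return FakeDF()                      # the initial DF
    child_df = visit(node.children[0])       # single-child nodes only in this sketch
    return getattr(child_df, OP_TO_METHOD[node.op])(node.arg)


plan = PlanNode("Sort", children=[
    PlanNode("Aggregate", children=[
        PlanNode("Project", children=[
            PlanNode("Filter", children=[PlanNode("TableScan")])])])])

print(visit(plan).chain)  # df.filter(xxx).select(xxx).agg(xxx).sort(xxx)
```

The recursion plays the role of the stack: nothing is applied on the way down, and each node's operation is applied as the calls return. Joins, which have two children, are left out of this sketch.
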
>
> Once we have the final DataFrame and trigger the calculation, the rest
> is handled by Spark. We gain tremendous benefits at the computation
> level; more details can be found in my previous post:
> http://apache-kylin.74782.x6.nabble.com/Re-DISCUSS-Columnar-storage-engine-for-Apache-Kylin-tc12113.html
> .
>
>
> --
>
>
> Regards!
>
> Aron Tao
>
>
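> 许益铭 wrote on Wed, Dec 19, 2018 at 11:40 AM:
>
>> hi All!
>> Regarding the issues raised by Chao Long, I have the following views:
>>
>> 1. At present, our architecture is divided into two layers: the storage
>> layer and the computing layer. In the storage layer we have already made
>> some optimizations, doing pre-aggregation there to reduce the amount of
>> data returned. However, runtime aggregation and joins happen on the Kylin
>> server side, so serialization is unavoidable, and this architecture
>> easily leads to a single-point bottleneck. If the runtime agg or join
>> involves a large amount of data, query performance drops sharply and the
>> Kylin server suffers heavy GC.
>>
>> 2. On the dictionary: it was originally introduced to align rowkeys in
>> HBase and also to save some storage. But it brings another problem: HBase
>> has difficulty handling variable-length string dimensions. For a
>> high-cardinality, variable-length dimension, we usually have to build a
>> very large dictionary or assign a fairly large fixed length, which
>> doubles the storage; and because the dictionary is large, query
>> performance suffers badly (GC). With columnar storage we would not need
>> to worry about this.
>>
>> 3. To use Parquet's page index, we must convert the TupleFilter into a
>> Parquet filter, which is a considerable amount of work. Moreover, our
>> data is all encoded, and Parquet's page index filters only on the
>> per-page min and max, so binary data cannot be filtered.
>>
>> I think using Spark as our computing engine can solve all of the problems
>> above:
>>
>> 1. Distributed computing
>> After SQL is parsed and optimized by Calcite, a tree of OLAP rel nodes is
>> generated. Spark's Catalyst likewise parses SQL into a tree and then
>> automatically optimizes it into DataFrame computations. If the Calcite
>> plan can be converted into a Spark plan, we get distributed computing:
>> Calcite is only responsible for parsing SQL and returning the result set,
>> which reduces the pressure on the Kylin server.
>>
>> 2. Remove the dictionary
>> A dictionary is good at reducing storage pressure for low and medium
>> cardinality, but the downside is that its data files cannot be used
>> without the dictionary. I suggest not considering dictionary-type
>> encoding at the beginning, keeping the system as simple as possible, and
>> just using Parquet's page-level dictionary by default.
>>
>> 3. Store columns in Parquet with their real types instead of binary
>> As noted above, Parquet's filtering ability on binary is very weak,
>> whereas with primitive types we can directly use Spark's vectorized
>> read, speeding up data reading and computation.
>>
>> 4. Use Spark's Parquet integration
>> Spark already integrates with Parquet, and Spark's pushed filters are
>> already converted into filters that Parquet can use; we only need to
>> upgrade the Parquet version and make small modifications to get Parquet's
>> page index capability.
>>
>> 5. Index server
>> As JiaTao Tao described, the index server is divided into a file index
>> and a page index. Dictionary filtering is nothing more than a kind of
>> file index, so we can plug in an index server here.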
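
Point 3 of the message above says that Parquet's page index filters only on each page's min and max. A minimal sketch of that idea follows, assuming a hypothetical PageStats record and a simple range predicate; it is not Parquet's actual API, only an illustration of why comparable, typed min/max values matter.

```python
from dataclasses import dataclass
from typing import List


@dataclass
class PageStats:
    """Min/max statistics for one page of a column (hypothetical structure)."""
    page_id: int
    min_value: int
    max_value: int


def pages_to_read(pages: List[PageStats], lower: int, upper: int) -> List[int]:
    """Return ids of pages whose [min, max] range overlaps lower <= value <= upper."""
    return [p.page_id for p in pages
            if p.max_value >= lower and p.min_value <= upper]


# A typed integer column whose pages cover narrow ranges: most pages are skipped.
narrow_pages = [PageStats(0, 1, 100), PageStats(1, 101, 200), PageStats(2, 201, 300)]
print(pages_to_read(narrow_pages, 120, 180))  # [1]

# Pages whose ranges all overlap the predicate cannot be skipped at all.
wide_pages = [PageStats(0, 3, 298), PageStats(1, 1, 300), PageStats(2, 2, 299)]
print(pages_to_read(wide_pages, 120, 180))    # [0, 1, 2]
```

The pruning decision only compares the predicate bounds with the stored min and max, so it helps only when those statistics are meaningful for the column's values, which is the argument for storing real column types.
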

Re: Fw: Please share a public maven setting.xml configuration

2018-12-29 Thread ShaoFeng Shi
I have no customization in my Maven settings, but I use a local proxy server:

export http_proxy=http://127.0.0.1:1087;
export https_proxy=http://127.0.0.1:1087;
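
If a build does not pick up the shell proxy variables, the same proxy could instead be declared in the proxies section of Maven's ~/.m2/settings.xml. The following is a minimal sketch, assuming the proxy URL from the exports above; the helper name maven_proxy_xml and the id "local-proxy" are hypothetical, and the output still has to be placed inside a <proxies> element by hand.

```python
import xml.etree.ElementTree as ET
from urllib.parse import urlparse


def maven_proxy_xml(proxy_url: str, proxy_id: str = "local-proxy") -> str:
    """Render a <proxy> element for the <proxies> section of Maven's settings.xml."""
    parsed = urlparse(proxy_url)
    proxy = ET.Element("proxy")
    for tag, text in [
        ("id", proxy_id),                 # arbitrary label, hypothetical
        ("active", "true"),
        ("protocol", parsed.scheme),      # "http" for the URL used here
        ("host", parsed.hostname),
        ("port", str(parsed.port)),
    ]:
        ET.SubElement(proxy, tag).text = text
    return ET.tostring(proxy, encoding="unicode")


print(maven_proxy_xml("http://127.0.0.1:1087"))
```
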

Best regards,

Shaofeng Shi 史少锋
Apache Kylin PMC
Work email: shaofeng@kyligence.io
Kyligence Inc: https://kyligence.io/





_ wrote on Sat, Dec 29, 2018 at 5:38 PM:

>
>
>
>
> - Forwarded mail -
> From: _ 
> Date: 12/29/2018 16:44
> To: dev 
> Cc:
> Subject: Please share a public maven setting.xml configuration
> Hi all:
>    I want to compile the kylin-on-parquet source code, but some jars
> cannot be found with my Maven configuration. Can anybody share a
> working settings.xml? Many thanks!
>
>
>
>
>
>
>


Re: Kylin real-time streaming is ready on realtime-streaming branch

2018-12-28 Thread ShaoFeng Shi
I created a new component, "Real-time Streaming", in JIRA; please use it
for real-time related issues.

The old "Streaming" component has been renamed to "NRT Streaming".

Best regards,

Shaofeng Shi 史少锋
Apache Kylin PMC
Work email: shaofeng@kyligence.io
Kyligence Inc: https://kyligence.io/





ShaoFeng Shi wrote on Wed, Dec 26, 2018 at 9:26 AM:

> Thanks for the information. I'm trying to catch up.
>
> Best regards,
>
> Shaofeng Shi 史少锋
> Apache Kylin PMC
> Work email: shaofeng@kyligence.io
> Kyligence Inc: https://kyligence.io/
>
>
>
>
>
> Xiaoxiang Yu wrote on Sun, Dec 23, 2018 at 9:06 PM:
>
>> Hi everyone. I have been reading the source code of real-time streaming
>> and found an approach that may be helpful to others who are interested in
>> this feature. If you are interested in eBay's new real-time streaming
>> solution but don't know how it may help you, the following link will
>> help you run or debug it on your laptop.
>>
>>
>>
>>
>> https://github.com/hit-lacus/hit-lacus.github.io/issues/13#issuecomment-448449318
>>
>>
>>
>>
>>
>> 
>>
>> Best wishes,
>>
>> Xiaoxiang Yu
>>
>>
>>
>>
>>
>> *From**: *Ma Gang 
>> *Reply-To**: *"u...@kylin.apache.org" 
>> *Date**: *Sunday, December 23, 2018 13:33
>> *To**: *kylin_dev , kylin_user <
>> u...@kylin.apache.org>
>> *Subject**: *Kylin real-time streaming is ready on realtime-streaming branch
>>
>>
>>
>> Hi all,
>>
>>
>>
>> The Kylin real-time streaming feature has been staged in the Kylin code
>> repository for public review and evaluation. You can check out the
>> "realtime-streaming" branch to read the code, and make a binary build to
>> run an example. The detailed design doc and usage doc can be found in the
>> attachments of the JIRA: https://issues.apache.org/jira/browse/KYLIN-3654.
>>
>>
>>
>> This is just the first version; any comments and pull requests are welcome!
>>
>>
>>
>> Thanks,
>>
>> Ma,Gang
>>
>>
>>
>>
>>
>>
>


Re: A collection of pitfalls from setting up a Kylin 2.5.2 service; anyone who hits the same problems is welcome to use it

2018-12-28 Thread ShaoFeng Shi
No problem, welcome to try Kylin!

Best regards,

Shaofeng Shi 史少锋
Apache Kylin PMC
Work email: shaofeng@kyligence.io
Kyligence Inc: https://kyligence.io/





冯广彬 wrote on Fri, Dec 28, 2018 at 4:21 PM:

> Thank you, Shaofeng Shi. If it still doesn't work, I will try your
> suggestion. Thank you again!
>
>
> ShaoFeng Shi wrote on Fri, Dec 28, 2018 at 3:06 PM:
>
> > Hi Cookie,
> >
> > Thanks for sharing; I'm sorry that you spent several days getting a
> > sample cube to run. Kylin depends on Hadoop components; if you install
> > those components separately, there is a high chance of JAR conflicts.
> > We highly recommend using a commercial Hadoop release such as HDP or
> > CDH, because the publisher has already resolved those inconsistencies;
> > with HDP/CDH, you most likely won't need to change anything to run Kylin
> > successfully. Besides, macOS is not tested; Linux (CentOS/RHEL/Ubuntu) is
> > the only supported OS.
> >
> > Here is the tip from the Kylin FAQ:
> >
> > How to quick start with Kylin?
> >
> >    - To get a quick start, you can run Kylin in a Hadoop sandbox VM or in
> >    the cloud, for example, start a small AWS EMR or Azure HDInsight
> >    cluster and then install Kylin on one of the nodes.
> >
> >
> > Best regards,
> >
> > Shaofeng Shi 史少锋
> > Apache Kylin PMC
> > Work email: shaofeng@kyligence.io
> > Kyligence Inc: https://kyligence.io/
> >
> >
> >
> >
> >
> > CookieNats wrote on Fri, Dec 28, 2018 at 2:34 PM:
> >
> > > https://www.jianshu.com/p/e7dc59af58b0
> > >
> > > --
> > > Sent from: http://apache-kylin.74782.x6.nabble.com/
> > >
> >
>


Re: A collection of pitfalls from setting up a Kylin 2.5.2 service; anyone who hits the same problems is welcome to use it

2018-12-27 Thread ShaoFeng Shi
Hi Cookie,

Thanks for sharing; I'm sorry that you spent several days getting a
sample cube to run. Kylin depends on Hadoop components; if you install
those components separately, there is a high chance of JAR conflicts.
We highly recommend using a commercial Hadoop release such as HDP or CDH,
because the publisher has already resolved those inconsistencies; with
HDP/CDH, you most likely won't need to change anything to run Kylin
successfully. Besides, macOS is not tested; Linux (CentOS/RHEL/Ubuntu) is
the only supported OS.

Here is the tip from the Kylin FAQ:

How to quick start with Kylin?

   - To get a quick start, you can run Kylin in a Hadoop sandbox VM or in
   the cloud, for example, start a small AWS EMR or Azure HDInsight cluster
   and then install Kylin on one of the nodes.


Best regards,

Shaofeng Shi 史少锋
Apache Kylin PMC
Work email: shaofeng@kyligence.io
Kyligence Inc: https://kyligence.io/





CookieNats wrote on Fri, Dec 28, 2018 at 2:34 PM:

> https://www.jianshu.com/p/e7dc59af58b0
>
> --
> Sent from: http://apache-kylin.74782.x6.nabble.com/
>


Re: Kylin real-time streaming is ready on realtime-streaming branch

2018-12-25 Thread ShaoFeng Shi
Thanks for the information. I'm trying to catch up.

Best regards,

Shaofeng Shi 史少锋
Apache Kylin PMC
Work email: shaofeng@kyligence.io
Kyligence Inc: https://kyligence.io/





Xiaoxiang Yu wrote on Sun, Dec 23, 2018 at 9:06 PM:

> Hi everyone. I have been reading the source code of real-time streaming
> and found an approach that may be helpful to others who are interested in
> this feature. If you are interested in eBay's new real-time streaming
> solution but don't know how it may help you, the following link will
> help you run or debug it on your laptop.
>
>
>
>
> https://github.com/hit-lacus/hit-lacus.github.io/issues/13#issuecomment-448449318
>
>
>
>
>
> 
>
> Best wishes,
>
> Xiaoxiang Yu
>
>
>
>
>
> *From**: *Ma Gang 
> *Reply-To**: *"u...@kylin.apache.org" 
> *Date**: *Sunday, December 23, 2018 13:33
> *To**: *kylin_dev , kylin_user <
> u...@kylin.apache.org>
> *Subject**: *Kylin real-time streaming is ready on realtime-streaming branch
>
>
>
> Hi all,
>
>
>
> The Kylin real-time streaming feature has been staged in the Kylin code
> repository for public review and evaluation. You can check out the
> "realtime-streaming" branch to read the code, and make a binary build to
> run an example. The detailed design doc and usage doc can be found in the
> attachments of the JIRA: https://issues.apache.org/jira/browse/KYLIN-3654.
>
>
>
> This is just the first version; any comments and pull requests are welcome!
>
>
>
> Thanks,
>
> Ma,Gang
>
>
>
>
>
>

