from:"Xiaoxiang Yu"

Re: NPE In QueryMetricsFacade

2018-10-22 Thread Xiaoxiang Yu


I cannot reproduce it on my Kylin cluster. Could anyone show us how to 
reproduce that error?
 
I found that other users have hit the same error. Could anyone share the 
details? They may be useful.

Please attach your details to https://issues.apache.org/jira/browse/KYLIN-3609. 

Thanks.



俞霄翔/Xiaoxiang Yu 
软件工程师/Software Engineer
电话/Mobile:18516298930
http://kyligence.io <http://kyligence.io/>

 

On [DATE], "[NAME]" <[ADDRESS]> wrote:

Hi Team,

 Query metrics are not getting updated on "host:7070/kylin/dashboard".
I found the stack trace below, related to the query metrics facade, in the Kylin logs.

  My Kylin version is v2.5 with HBase v1.x. Am I missing something in the
configuration that is causing the error below?

2018-10-21 18:34:41,625 WARN  [Query 7b16f84c-3053-f64c-2cd2-4ae5348c295c-121] service.QueryService:421 : Write metric error.
java.lang.NullPointerException
at org.apache.kylin.rest.metrics.QueryMetricsFacade.updateMetricsToReservoir(QueryMetricsFacade.java:148)
at org.apache.kylin.rest.metrics.QueryMetricsFacade.updateMetrics(QueryMetricsFacade.java:73)
at org.apache.kylin.rest.service.QueryService.recordMetric(QueryService.java:503)
at org.apache.kylin.rest.service.QueryService.doQueryWithCache(QueryService.java:419)
at org.apache.kylin.rest.service.QueryService.doQueryWithCache(QueryService.java:351)
at org.apache.kylin.rest.controller.QueryController.query(QueryController.java:86)
at sun.reflect.GeneratedMethodAccessor253.invoke(Unknown Source)
at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
at java.lang.reflect.Method.invoke(Method.java:498)
at org.springframework.web.method.support.InvocableHandlerMethod.doInvoke(InvocableHandlerMethod.java:205)
at org.springframework.web.method.support.InvocableHandlerMethod.invokeForRequest(InvocableHandlerMethod.java:133)
at org.springframework.web.servlet.mvc.method.annotation.ServletInvocableHandlerMethod.invokeAndHandle(ServletInvocableHandlerMethod.java:97)
at org.springframework.web.servlet.mvc.method.annotation.RequestMappingHandlerAdapter.invokeHandlerMethod(RequestMappingHandlerAdapter.java:827)
at org.springframework.web.servlet.mvc.method.annotation.RequestMappingHandlerAdapter.handleInternal(RequestMappingHandlerAdapter.java:738)
at org.springframework.web.servlet.mvc.method.AbstractHandlerMethodAdapter.handle(AbstractHandlerMethodAdapter.java:85)
at org.springframework.web.servlet.DispatcherServlet.doDispatch(DispatcherServlet.java:967)
at org.springframework.web.servlet.DispatcherServlet.doService(DispatcherServlet.java:901)
at org.springframework.web.servlet.FrameworkServlet.processRequest(FrameworkServlet.java:970)
at org.springframework.web.servlet.FrameworkServlet.doPost(FrameworkServlet.java:872)
at javax.servlet.http.HttpServlet.service(HttpServlet.java:650)
at org.springframework.web.servlet.FrameworkServlet.service(FrameworkServlet.java:846)


Thank You,
Shrikant Bang.

Re: [DISCUSS] New Kylin Streaming Solution From eBay

2018-10-31 Thread Xiaoxiang Yu

Hi Gang, I am glad to know that eBay has a solution for real-time OLAP on 
Kylin. I have some small questions:


1.  Is it possible to use YARN as the cluster manager for indexing tasks? 
The coordinator process would start them at a specified interval, and YARN 
would manage:

a)   retrying tasks that fail

b)   resource allocation

c)   log collection

2.  As far as I know, eBay's new Kylin streaming solution uses a replica set 
to ensure that incoming messages are not lost if some processes are lost. I 
think a replica set is a set of Kafka consumer processes that are responsible 
for ingesting messages and building the base cuboid in memory. Could you 
please share some details about how a replica set provides the HA guarantee, 
and how to configure it? A link or paper is OK. I found one, but I don't know 
whether it means the same thing as your replica set:

a)   [MongoDB replication](https://docs.mongodb.com/manual/replication/).

3.  How do you add or remove a node of a replica set in a production 
environment? How do you monitor the health and load of a replica set cluster?

4.  Are all measures supported in eBay's new Kylin streaming solution? 
What about count distinct (bitmap)?

5.  It seems eBay's new Kylin streaming solution uses a custom columnar 
storage. Why not use a mature open-source columnar storage solution? Have 
you ever compared the performance of your custom columnar storage with an 
open-source columnar storage solution?




Best wishes,
Xiaoxiang Yu


From: Ma Gang 
Reply-To: "dev@kylin.apache.org" 
Date: Tuesday, October 30, 2018 at 15:24
To: "dev@kylin.apache.org" 
Subject: [DISCUSS] New Kylin Streaming Solution From eBay

Hi all,

The eBay Kylin team has developed a new Kylin streaming solution. The basic 
idea is to build a streaming cluster that ingests data from a streaming source 
(Kafka) and serves queries on real-time data. The data preparation latency is 
in milliseconds, which means the data is queryable almost as soon as it is 
ingested. Attached is the architecture design doc.
We would like to contribute the feature to the community; please let us know 
if you have any concerns.

Thanks,
Gang(Allen) Ma

Re: [DISCUSS] New Kylin Streaming Solution From eBay

2018-11-01 Thread Xiaoxiang Yu

Thank you for your reply. Maybe I can help to improve your Kylin Streaming 
Solution in the future.

Best wishes,
Xiaoxiang Yu

On [DATE], "[NAME]" <[ADDRESS]> wrote:

Thanks Xiaoxiang,

Very good questions! Please see my comments starting with [Gang]:

1.  Is it possible to use YARN as the cluster manager for indexing tasks? 
The coordinator process would start them at a specified interval.

[Gang] I think it is possible, but in the current design the indexing task is 
a long-running task that also provides query service. This keeps the whole 
system simple and efficient, so I don't think we need to stop and start 
indexing tasks from time to time. Using YARN to manage the resources is 
possible, but we would need to redesign the existing coordinator to make it 
easy to deploy to YARN, Kubernetes, etc. I hope this can be done after the 
contribution to the community.

2.  As far as I know, eBay's new Kylin streaming solution uses a replica set 
to ensure that incoming messages are not lost if some processes are lost. I 
think a replica set is a set of Kafka consumer processes that are responsible 
for ingesting messages and building the base cuboid in memory. Could you 
please share some details about how a replica set provides the HA guarantee, 
and how to configure it? A link or paper is OK. I found one, but I don't know 
whether it means the same thing as your replica set.

[Gang] Yes, it is similar to MongoDB replication, but currently we don't 
replicate data from a primary node. We just assign the same Kafka 
topic/partitions to all receivers in a ReplicaSet, so every receiver in the 
ReplicaSet consumes the same data from Kafka. If one receiver is down, the 
other receivers in the ReplicaSet are still consuming the same Kafka data, so 
consumption and queries are not impacted. We don't guarantee that the 
receivers in a ReplicaSet have the same consuming rate, but we can guarantee 
that the user sees a consistent view of the data by sticking the queries for 
one cube to one receiver.

The HA implementation is a little naive, but it is simple and it works. Maybe 
in the future we can do HA by replication, to support other streaming sources 
that don't support multiple consumers and don't have a persistent store.
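
To make the idea above concrete, here is a minimal Python sketch of receivers that all read the same data at different speeds, with queries for one cube pinned to one live receiver. The class names, the `route` method and the hash-based choice are assumptions made for illustration only; they are not taken from the eBay implementation.

```python
import hashlib


class Receiver:
    """Hypothetical receiver: consumes the partitions assigned to its replica set."""

    def __init__(self, name):
        self.name = name
        self.alive = True
        self.offset = 0  # number of messages consumed so far

    def consume(self, messages, rate):
        # Every receiver reads the same Kafka data, each at its own pace.
        if self.alive:
            self.offset = min(len(messages), self.offset + rate)


class ReplicaSet:
    """Hypothetical replica set: no primary, no data copied between receivers."""

    def __init__(self, receivers):
        self.receivers = receivers

    def route(self, cube):
        """Pick one live receiver per cube deterministically (sticky routing)."""
        live = [r for r in self.receivers if r.alive]
        if not live:
            raise RuntimeError("no live receiver in the replica set")
        digest = hashlib.md5(cube.encode("utf-8")).digest()
        return live[digest[0] % len(live)]


messages = list(range(10))
replica_set = ReplicaSet([Receiver("r1"), Receiver("r2")])
replica_set.receivers[0].consume(messages, rate=3)
replica_set.receivers[1].consume(messages, rate=5)

first = replica_set.route("sales_cube")   # the same receiver on every call
first.alive = False                       # simulate a receiver failure
fallback = replica_set.route("sales_cube")  # queries move to a surviving receiver
```

Because the two receivers may be at different offsets, always asking the same one is what keeps the view consistent for a given cube; after a failure the surviving receiver may briefly show a different amount of data.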

3.  How do you add or remove a node of a replica set in a production 
environment? How do you monitor the health and load of a replica set cluster?

[Gang] Currently we have a UI and a RESTful API that let an admin add a node 
to or remove a node from a ReplicaSet, and a simple UI that lets an admin 
monitor the health and consuming rate of each receiver/cube. All metrics are 
also collected with the Yammer Metrics framework, so they are easy to expose 
to other monitoring systems.

4.  Are all measures supported in eBay's new Kylin streaming solution? 
What about count distinct (bitmap)?

[Gang] Most measures are supported, but precise count distinct (bitmap) is 
not supported when the distinct column is not of int type. As you know, 
supporting precise count distinct on a non-int column requires building a 
global dictionary, which is not possible in the streaming environment.
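
A small Python sketch may help show why int columns are the easy case. A plain Python integer stands in for a real compressed bitmap, and `GlobalDictionary` is a hypothetical stand-in for a global dictionary; neither is Kylin code.

```python
def count_distinct_int(values):
    """Precise count distinct for non-negative ints using a bitmap."""
    bitmap = 0
    for v in values:
        bitmap |= 1 << v  # the int value itself is the bit position
    return bin(bitmap).count("1")


class GlobalDictionary:
    """Hypothetical global dictionary: maps each string to a dense int id."""

    def __init__(self):
        self._ids = {}

    def encode(self, value):
        # A new value gets the next unused id; a known value keeps its id.
        return self._ids.setdefault(value, len(self._ids))


def count_distinct_str(values, dictionary):
    """Strings must first be encoded to ints before they fit in a bitmap."""
    return count_distinct_int(dictionary.encode(v) for v in values)


int_result = count_distinct_int([1, 5, 5, 9, 1])
str_result = count_distinct_str(["alice", "bob", "alice"], GlobalDictionary())
```

The string case only works if every segment encodes values with the same dictionary, otherwise bitmaps from different segments cannot be merged. Building and sharing such a dictionary while data is still arriving is the hard part in a streaming setting.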

5.  It seems eBay's new Kylin streaming solution uses a custom columnar 
storage. Why not use a mature open-source columnar storage solution? Have 
you ever compared the performance of your custom columnar storage with an 
open-source columnar storage solution?

[Gang] Most open-source columnar formats, like Parquet and ORC, are designed 
for use in a Hadoop environment, while the streaming data is on local disk, so 
I didn't consider them at the beginning. It is not very hard to define a 
columnar format to store Kylin-specific data. With a customized columnar 
storage you can use mmap'ed files to scan data and add a row-level inverted 
index for all dimensions, so I think the performance will be better than with 
a common columnar format. I didn't compare the performance, but the storage 
engine is pluggable; you may contribute a Parquet storage if you are 
interested.
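
As a rough illustration of the row-level inverted index mentioned above, the following Python sketch maps each dimension value to the set of row ids that contain it, so a filter becomes a set intersection instead of a full scan. The function names and the sample rows are invented for this sketch, and the mmap part of the design is not modelled.

```python
from collections import defaultdict


def build_inverted_index(rows, dimensions):
    """For each dimension, map every value to the set of row ids holding it."""
    index = {d: defaultdict(set) for d in dimensions}
    for row_id, row in enumerate(rows):
        for d in dimensions:
            index[d][row[d]].add(row_id)
    return index


def matching_rows(index, filters, row_count):
    """Intersect the row-id sets of all equality filters."""
    result = set(range(row_count))
    for dim, value in filters.items():
        result &= index[dim].get(value, set())
    return sorted(result)


rows = [
    {"site": "US", "category": "toys", "gmv": 10},
    {"site": "DE", "category": "toys", "gmv": 7},
    {"site": "US", "category": "toys", "gmv": 4},
    {"site": "US", "category": "books", "gmv": 2},
]
index = build_inverted_index(rows, ["site", "category"])
hits = matching_rows(index, {"site": "US", "category": "toys"}, len(rows))
total_gmv = sum(rows[i]["gmv"] for i in hits)
```

Only the rows that survive the intersection need to be read to compute the measure, which is where the expected speed-up over scanning every row would come from.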

At 2018-11-01 12:42:25, "Xiaoxiang Yu"  wrote:

>Hi Gang, I am glad to know that eBay has a solution for real-time OLAP 
>on Kylin. I have some small questions:
>
>1.  Is it possible to use YARN as the cluster manager for indexing tasks? 
>The coordinator process would start them at a specified interval, and YARN 
>would manage:
>
>a)   retrying tasks that fail
>
>b)   resource allocation
>
>c)   log collection
>
>2.  As far as I know, eBay's new Kylin streaming solution uses a replica set 
>to ensure that incoming messages are not lost if some processes are lost. I 
>think a replica set is a set of Kafka consumer processes that are responsible 
>for ingesting messages and building the base cuboid in memory. Could you 
>please share some details about how a replica set provides the HA guarantee, 
>and how to configure it? A link or paper is OK. I found one, but I don't know 
>whether it means the same thing as your replica set.
>

Re: [VOTE] Release apache-kylin-2.5.1 (RC1)

2018-11-04 Thread Xiaoxiang Yu

+1
Mvn test passed


Best wishes,
Xiaoxiang Yu 
 

On [DATE], "[NAME]" <[ADDRESS]> wrote:

Hi all,

I have created a build for Apache Kylin 2.5.1, release candidate 1.

Changes highlights:

[KYLIN-3531] - Login failed with case-insensitive username
[KYLIN-3604] - Can't build cube with spark in HBase standalone mode
[KYLIN-3613] - Kylin with Standalone HBase Cluster could not find the main
cluster namespace at "Create HTable" step
[KYLIN-3634] - When the filter column has null value may cause incorrect
query result
[KYLIN-3635] - Percentile calculation on Spark engine is wrong
[KYLIN-3644] - NumberFormatExcetion on null values when building cube with
Spark
[KYLIN-3599] - Bulk Add Measures

Thanks to everyone who has contributed to this release.
Here are the release notes:

https://issues.apache.org/jira/secure/ReleaseNote.jspa?projectId=12316121&version=12344108

The commit to be voted upon:


https://github.com/apache/kylin/commit/24e2452309a450ec4ef62339b003343eabe23016

Its hash is 24e2452309a450ec4ef62339b003343eabe23016.

The artifacts to be voted on are located here:
https://dist.apache.org/repos/dist/dev/kylin/apache-kylin-2.5.1-rc1/

The hash of the artifact is as follows:
apache-kylin-2.5.1-source-release.zip.sha256
21db5dab4d3900a49237b9083b5d270c8471d1882a5427cddf1cc74873df42f2
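
For voters who want to check the published hash themselves, a minimal Python sketch is shown below. The function names and the local file path are assumptions; only the standard `hashlib` module is used.

```python
import hashlib


def sha256_of_file(path, chunk_size=1 << 20):
    """Compute the SHA-256 hex digest of a file without loading it all at once."""
    digest = hashlib.sha256()
    with open(path, "rb") as f:
        for chunk in iter(lambda: f.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def matches_published_hash(path, published_hex):
    """Compare the local file's digest with the hash from the vote email."""
    return sha256_of_file(path) == published_hex.strip().lower()
```

For example, `matches_published_hash("apache-kylin-2.5.1-source-release.zip", "21db5d...")` with the full hash from this email should return True for an intact download, assuming the zip is in the current directory.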

A staged Maven repository is available for review at:
https://repository.apache.org/content/repositories/orgapachekylin-1056/

Release artifacts are signed with the following key:
https://people.apache.org/keys/committer/shaofengshi.asc

Please vote on releasing this package as Apache Kylin 2.5.1.

The vote is open for the next 72 hours and passes if a majority of
at least three +1 PPMC votes are cast.

[ ] +1 Release this package as Apache Kylin 2.5.1
[ ]  0 I don't feel strongly about it, but I'm okay with the release
[ ] -1 Do not release this package because...


Here is my vote:

+1 (binding)

-- 
Best regards,

Shaofeng Shi 史少锋

Re: [DISCUSS] Not sending Github PR notifications to dev@kylin

2018-10-04 Thread Xiaoxiang Yu

+1

Xiaoxiang Yu 
xiaoxiang...@kyligence.io
 

On [DATE], "[NAME]" <[ADDRESS]> wrote:

Hello, Kylin dev subscribers,

Recently I received several complaints that many emails are sent to
"dev@kylin.apache.org" from github.com pull requests since we enabled the
Gitbox service for Kylin.

Today most patches and code reviews are performed on GitHub. Each pull
request action (even adding a comment) emits an email to dev@kylin
instead of to the individual contributor or reviewer. This generates a lot
of spam and causes emails from people to get buried.

Now I plan to change the Gitbox email notification rule: remove dev@kylin
and use the author and reviewer instead, as follows:


For GitHub issues, please notify iss...@kylin.apache.org;
For GitHub PRs, please notify the author, reviewer and iss...@kylin.apache.org

The related JIRA to Apache Infra is
https://issues.apache.org/jira/browse/INFRA-17073

Please +1 if you agree with the new rule, or -1 if you want to keep as
today. If no objection, we will move on with the new rule.

-- 
Best regards,

Shaofeng Shi 史少锋

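Re: I am using Kylin 2.5.2 and often find that configuration files such as kylin.properties and kylin_hive_conf.xml cannot be found. Below are some of my screenshots. Why is the value of this variable random even though I have set it? This causes me to hit this error frequently. I now suspect that this variable is related to the file loading order, but the stack trace for loading the configuration files is too long for me to trace any further.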

2019-01-19 Thread Xiaoxiang Yu

Hi,
I cannot see your screenshots because they were not uploaded.
Please add the screenshots as email attachments and try again, or include a 
link in your email that other people can click to view them.



Best wishes,
Xiaoxiang Yu


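From: 冯广彬 
Reply-To: "dev@kylin.apache.org" 
Date: Saturday, January 19, 2019 at 18:09
To: "dev@kylin.apache.org" 
Subject: I am using Kylin 2.5.2 and often find that configuration files such as kylin.properties and kylin_hive_conf.xml cannot be found. Below are some of my screenshots. Why is the value of this variable random even though I have set it? This causes me to hit this error frequently. I now suspect that this variable is related to the file loading order, but the stack trace for loading the configuration files is too long for me to trace any further.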

[fail_to_locate_kylin.properties.jpg]

[很多文件都找不到.jpg]
[没设置.jpg]
[设置了KYLIN_CONF.jpg]
[又换了位置.jpg]
[又换了位置.png]

Re: Kylin real-time streaming is ready on realtime-streaming branch

2018-12-23 Thread Xiaoxiang Yu

Hi everyone. I have been reading the source code of real-time streaming and 
found an approach that may be helpful to others who are interested in this 
feature. If you are interested in eBay's new real-time streaming solution but 
don't know how it may help you, the following link will help you run or debug 
it on your laptop.


https://github.com/hit-lacus/hit-lacus.github.io/issues/13#issuecomment-448449318



Best wishes,
Xiaoxiang Yu


From: Ma Gang 
Reply-To: "u...@kylin.apache.org" 
Date: Sunday, December 23, 2018 at 13:33
To: kylin_dev , kylin_user 
Subject: Kylin real-time streaming is ready on realtime-streaming branch

Hi all,

The Kylin real-time streaming feature has been staged in the Kylin code 
repository for public review and evaluation. You can check out the 
"realtime-streaming" branch to read the code, and make a binary build to run 
an example. The detailed design doc and usage doc can be found in the 
attachments of this JIRA: 
https://issues.apache.org/jira/browse/KYLIN-3654.

This is just the first version; any comments and pull requests are welcome!

Thanks,
Ma,Gang

Re: [Question:]The version of HDP sandbox for set up kylin dev environment

2018-12-24 Thread Xiaoxiang Yu

Hi, I am using HDP 2.4 (2.4.0.0-169) to develop Kylin (starting DebugTomcat in 
the IDE), and I think it works well. It is also necessary to use Java 8, 
because the source uses some APIs/features introduced in Java 8.


Best wishes,
Xiaoxiang Yu 
 

On [DATE], "[NAME]" <[ADDRESS]> wrote:

Hi engineers, could you please tell me which version of the HDP sandbox you 
use for developing Kylin 2.6.0? The doc shows that the HDP sandbox version 
used to set up the Kylin dev environment is 2.4, and its JDK needs to be 
updated to 1.8. But I find that the latest Kylin 2.6.0 can support 
Hadoop 3.0, so this confuses me. Please help.

Re: Fwd: issue: the same cube in different project response different result.

2018-12-15 Thread Xiaoxiang Yu

Hi Liang Gang,
Using your sample data, SQL query and Hive DDL, Kylin gave the same result as 
Hive in my sandbox environment. So I think you should double check your 
codebase or environment. If you confirm in the future that the result is 
different from the result returned by Hive, please open a JIRA and attach the 
required files.


Best wishes,
Xiaoxiang Yu


From: "liangg...@qutoutiao.net" 
Date: Saturday, December 15, 2018 at 18:43
To: Xiaoxiang Yu 
Cc: "dev@kylin.apache.org" 
Subject: Fwd: issue: the same cube in different project response different result.

Hi  xiaoxiang,

Thanks for your response. I am attaching the data files so that you can check 
my issue.
“GRAPH_DETAIL_CUBE_ VIEW” is the fact table and “DIM_METRIC_INFO” is the 
dimension table; they are joined with an inner join in the Kylin model.
When I query with 1.sql, it returns one row, as in the snapshot below:
[cid:image005.jpg@01D4949A.40AAB230]

When I add ‘7464’ to the “exp_id” filter, as in 2.sql, it returns no 
data, as in the snapshot below:
[cid:image008.jpg@01D4949A.40AAB230]

The columns below are measure columns. All the others are dimension columns.
SAMPLE_SIZE
SUM_X
SUM_SQUARE_X
ACT_DEV
[cid:image001.jpg@01D494A1.F2122D30]

I have checked this in Kylin 2.3.2 and 2.5.0; both have the same issue.

So I think it is a bug in Kylin. Please help me check, thanks!

From: Xiaoxiang Yu [mailto:xiaoxiang...@kyligence.io]
Sent: December 13, 2018 11:13
To: u...@kylin.apache.org
Cc: liangg...@qutoutiao.net
Subject: Re: issue: the same cube in different project response different result.

Hi, lianggang

I haven't seen such a problem. Could you please provide the following details 
for deeper research:

1.   Sample data that can reproduce the inconsistent result (maybe a 
csv/json file containing the data).

2.   Your model and cube metadata (some json files).

3.   The SQL query whose result is not correct (in a sql file).

4.   Your Kylin version and Spark/HBase/MapReduce versions.

If you confirm that the inconsistent result can be reproduced on the sample 
data, please open a JIRA ticket on 
https://issues.apache.org/jira/projects/KYLIN and attach your files.

--------
Best wishes,
Xiaoxiang Yu


From: "liangg...@qutoutiao.net" 
Reply-To: "u...@kylin.apache.org" 
Date: Wednesday, December 12, 2018 at 17:43
To: "u...@kylin.apache.org" 
Cc: "liangg...@qutoutiao.net" 
Subject: issue: the same cube in different project response different result.

Hi All,

I have encountered a strange problem with my cube data. Please help me check 
what is wrong with Kylin.

I created two identical cubes, one in the “ABtest” project and one in the 
“ABtest_prod” project. The cube structure is the same, both were built for 
the same time range, and the data size is the same. But with the same SQL 
query, the returned data is different. The snapshots are below:

ABtest project’s snapshot:
[cid:image001.png@01D4923F.A7D44B90]

ABtest_prod project’s snapshot:
[cid:image002.png@01D4923F.A7D44B90]

One other strange thing: when I use “trim(a.os)” in the query filter in the 
ABtest_prod project, 20 rows are returned and the data is correct. The 
snapshot is below:
[cid:image003.png@01D49241.0DC91B00]



I have been checking this for a long time. I confirm that the cube structure 
is the same, and the time range and data size are also the same. I use Kylin 
version 2.3.2.
Currently I don't know the reason. Please help me. Thank you very much!

Re: [VOTE] Release apache-kylin-2.5.2 (RC2)

2018-11-30 Thread Xiaoxiang Yu

+1

Local CI passed


Best wishes,
Xiaoxiang Yu 
 

On [DATE], "[NAME]" <[ADDRESS]> wrote:

Hi all,

I have created a build for Apache Kylin 2.5.2, release candidate 2.

Changes:
[KYLIN-3187] - JDK APIs using the default locale, time zone or character
set should be avoided
[KYLIN-3636] - Wrong "storage_type" in CubeDesc causing cube building error
[KYLIN-3666] - Mege cube step 2: Update dictionary throws
IllegalStateException
[KYLIN-3672] - Performance is poor when multiple queries occur in a short
period
[KYLIN-3676] - Update to custom calcite and remove the "atopcalcite"
[KYLIN-3678] - CacheStateChecker may remove a cache file that under a
building
[KYLIN-3683] - Package org.apache.commons.lang3 not exists
[KYLIN-3689] - When the startTime is equal to the endTime in build request,
the segment will build all data.
[KYLIN-3693] - TopN, Count distinct incorrect in Spark engine
[KYLIN-3705] - Segment Pruner mis-functions when the source data has
Chinese characters

Thanks to everyone who has contributed to this release.

Here are the release notes:

https://issues.apache.org/jira/secure/ReleaseNote.jspa?projectId=12316121&version=12344466

The commit to be voted upon:


https://github.com/apache/kylin/commit/0e519d859e217fbfadd534313376e532d2c647fa

Its hash is 0e519d859e217fbfadd534313376e532d2c647fa.

The artifacts to be voted on are located here:
https://dist.apache.org/repos/dist/dev/kylin/apache-kylin-2.5.2-rc2/

The hash of the artifact is as follows:
apache-kylin-2.5.2-source-release.zip.sha256
fca5688cf64442ea595e07c2a4a4b2b549836d268ce8f10f3d559f05c22b61d0

A staged Maven repository is available for review at:
https://repository.apache.org/content/repositories/orgapachekylin-1058/

Release artifacts are signed with the following key:
https://people.apache.org/keys/committer/shaofengshi.asc

Please vote on releasing this package as Apache Kylin 2.5.2.

The vote is open for the next 72 hours and passes if a majority of
at least three +1 PPMC votes are cast.

[ ] +1 Release this package as Apache Kylin 2.5.2
[ ]  0 I don't feel strongly about it, but I'm okay with the release
[ ] -1 Do not release this package because...


Here is my vote:

+1 (binding)

Best regards,

Shaofeng Shi 史少锋
Apache Kylin PMC
Work email: shaofeng@kyligence.io
Kyligence Inc: https://kyligence.io/

Apache Kylin FAQ: https://kylin.apache.org/docs/gettingstarted/faq.html
Join Kylin user mail group: user-subscr...@kylin.apache.org
Join Kylin dev mail group: dev-subscr...@kylin.apache.org

Re: [DISCUSS] Stop inserting git diffs to JIRA ticket

2018-12-02 Thread Xiaoxiang Yu

+1
Good idea.


Best wishes,
Xiaoxiang Yu 
 

On [DATE], "[NAME]" <[ADDRESS]> wrote:

Hello Kylin developers,

After we enabled Gitbox for the Kylin code repository, whenever a PR is
merged, the "ASF GitHub Bot" inserts the git diff into the associated
JIRA. We noticed that this makes the JIRA very long when the code
change is big. Besides, when the change is cherry-picked to another branch,
the diff is appended again. This makes the JIRA too hard for a human to
read, and important messages may be overlooked.

A typical sample is this:
https://issues.apache.org/jira/browse/KYLIN-3187

My proposal is to stop syncing code changes from GitHub to JIRA and only
keep necessary notifications like "A PR is created/closed" etc. For the
code change, people should go to the GitHub code history, not JIRA.

Please share your ideas. If there is no objection in the next couple of
days, we will raise a change request to the infrastructure team.

Thanks for your input!

Best regards,

Shaofeng Shi 史少锋

Re: dont complete a job in apache kylin.

2018-11-18 Thread Xiaoxiang Yu

In your picture, I can see that 'Memory Total' is 0B.
So it seems that your YARN is not configured correctly. The jobs submitted by 
Kylin are accepted by YARN, but YARN has no resources to allocate.
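
The same check can be scripted against the JSON returned by the ResourceManager metrics endpoint (`/ws/v1/cluster/metrics` on the ResourceManager web port). The sketch below only parses such a payload; the function name and the sample payload are assumptions made for illustration, and fetching the JSON from your own cluster is left out.

```python
import json


def yarn_memory_problem(metrics_json):
    """Return a short diagnosis if the cluster has no usable memory, else None."""
    metrics = json.loads(metrics_json)["clusterMetrics"]
    if metrics.get("totalMB", 0) == 0:
        return "YARN reports 0 MB total memory: no NodeManager resources are registered"
    if metrics.get("appsPending", 0) > 0 and metrics.get("availableMB", 0) == 0:
        return "applications are pending and no memory is available"
    return None


# Hypothetical payload resembling a cluster where 'Memory Total' shows 0B.
sample = json.dumps(
    {"clusterMetrics": {"totalMB": 0, "availableMB": 0, "appsPending": 3}}
)
diagnosis = yarn_memory_problem(sample)
```

If the first message appears, the usual next step is to check that the NodeManagers are running and that their memory settings are non-zero.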




Best wishes,
Xiaoxiang Yu 
 

On [DATE], "[NAME]" <[ADDRESS]> wrote:

Hi. I built a job and can see it in the Kylin monitor.
The status is running.
I see a new table in Hive and all tables in
YARN (Hadoop: http://localhost:8088/cluster/apps/ACCEPTED), but they don't
make full progress (final status=UNDEFINED).
I checked kylin.log but did not find any error.

---

<http://apache-kylin.74782.x6.nabble.com/file/t799/Screenshot_from_2018-11-18_07-48-55.png>
 

<http://apache-kylin.74782.x6.nabble.com/file/t799/Screenshot_from_2018-11-18_07-49-20.png>
 

<http://apache-kylin.74782.x6.nabble.com/file/t799/Screenshot_from_2018-11-18_07-51-32.png>
 
..
 thank you.

--
Sent from: http://apache-kylin.74782.x6.nabble.com/

Re: [VOTE] Release apache-kylin-2.6.0 (RC1)

2019-01-10 Thread Xiaoxiang Yu

+1 
mvn test passed


Best wishes,
Xiaoxiang Yu 
 

On [DATE], "[NAME]" <[ADDRESS]> wrote:

+1 binding

mvn test passed on my env:

Apache Maven 3.5.4 (1edded0938998edf8bf061f1ceb3cfdeccf443fe;
2018-06-18T02:33:14+08:00)
Maven home: /usr/local/Cellar/maven/3.5.4
Java version: 1.8.0_91, vendor: Oracle Corporation, runtime:
/Library/Java/JavaVirtualMachines/jdk1.8.0_91.jdk/Contents/Home/jre
Default locale: en_US, platform encoding: UTF-8
OS name: "mac os x", version: "10.14.2", arch: "x86_64", family: "mac"

With Warm regards

Billy Liu

李 栋 wrote on Thursday, January 10, 2019 at 6:30 PM:
>
> +1 (binding)  mvn test passed
>
> Dong Li
>
> -Original Message-
> From: Ma Gang 
> Sent: Thursday, January 10, 2019 1:16 PM
> To: dev@kylin.apache.org
> Subject: Re:[VOTE] Release apache-kylin-2.6.0 (RC1)
>
> +1,  mvn test passed
>
> At 2019-01-10 09:59:55, "Yichen Zhou"  wrote:
> >+1
> >mvn test passed
> >
> >Regards,
> >Yichen
> >
> >On Wed, Jan 9, 2019 at 5:58 PM Rongchuan Jin
> >
> >wrote:
> >
> >> +1
> >>
> >> 
> >> 金荣钏/Rongchuan.Jin
> >>
> >>
> >> On 2019/1/10 at 9:49 AM, "ShaoFeng Shi" wrote:
> >>
> >> Checked the source package, the signature, and the sha256 hash;
> >>
> >> Mvn package and test are all successful with jdk 1.8.0_111 on
> >> Mac;
> >>
> >> +1 (binding)
> >>
> >> Best regards,
> >>
> >> Shaofeng Shi 史少锋
> >>
> >>
> >>
> >>
> >> JiaTao Tao wrote on Thursday, January 10, 2019 at 9:42 AM:
> >>
> >> > +1
> >> > mvn test passed
> >> >
> >> > Yanghong Zhong wrote on Wednesday, January 9, 2019 at 2:46 AM:
> >> >
> >> > > Hi all,
> >> > >
> >> > > I have created a build for Apache Kylin 2.6.0, release candidate 1.
> >> > >
> >> > > Changes highlights:
> >> > > [KYLIN-2895] - Refine query cache by changing the query cache expiration strategy by signature checking and introducing memcached as distributed cache
> >> > > [KYLIN-2932] - Simplify the thread model for in-memory cubing
> >> > > [KYLIN-3021] - Check MapReduce job failed reason and include the diagnostics into email notification
> >> > > [KYLIN-3272] - Upgrade Spark dependency to 2.3.2
> >> > > [KYLIN-3540] - Improve Mandatory Cuboid Recommendation Algorithm
> >> > > [KYLIN-3552] - Data Source SDK to ingest data from different JDBC sources
> >> > > [KYLIN-3611] - Upgrade Tomcat to 7.0.91, 8.5.34 or later
> >> > > [KYLIN-3656] - Improve HLLCounter performance
> >> > > [KYLIN-3700] - Quote sql identities when creating flat table
> >> > > [KYLIN-3729] - CLUSTER BY CAST(field AS STRING) will accelerate base cuboid build with UHC global dict
> >> > >
> >> > > Thanks to everyone who has contributed to this release.
> >> > > Here are the release notes:
> >> > >
> >> > > https://issues.apache.org/jira/secure/ReleaseNote.jspa?projectId=12316121&version=12344003
> >> > >
> >> > > The commit to be voted upon:
> >> > >
> >> > > https://github.com/apache/kylin/commit/8737bc1f555a2789a67462c8f8420b6ab3be9

Re: Unsubscribe

2019-01-04 Thread Xiaoxiang Yu

Please send an email to dev-unsubscr...@kylin.apache.org to unsubscribe from 
the dev mailing list.
For more information, please visit http://kylin.apache.org/community/


Best wishes,
Xiaoxiang Yu 
 

On [DATE], "[NAME]" <[ADDRESS]> wrote:

Please unsubscribe me, thanks.

Re: [Discuss] Moving toward Apache Kylin 3.0

2019-01-24 Thread Xiaoxiang Yu

+1, 
I am looking forward to the real-time streaming feature. I hope more 
developers and users will participate.


Best wishes,
Xiaoxiang Yu 
 

On [DATE], "[NAME]" <[ADDRESS]> wrote:

Hi Kylin developers,

Last week, Kylin released v2.6.0, with the enhanced & distributed query
cache and JDBC data source SDK. After this release, the next batch
candidate features include real-time streaming, parquet storage, and druid
storage. These features were developed in the past 1-2 years by different
Kylin players and were open sourced in the past 6 months. They have already
been staged in separate branches and are under evaluation by the community.
We have received much feedback from the community.

These candidate features are big supplements to as-is Kylin functions; For
example, the real-time streaming feature will bring Kylin from batch &
historical analytics into real-time analytics. The parquet storage will
make the deployment more flexible and more cloud-friendly. Of course,
stabilizing and improving these features need additional time and effort.

So, when we merge and release them, we'd better give it a new version
number so that users can clearly tell the difference from the current 2.x
versions. I discussed this with several developers offline, we think it is
time to move toward Kylin 3.0. So, if one of the above features is merged,
the version will be 3.0. The current 2.6 will be maintained until 3.x is
ready for production use.

Your comments, ideas, and suggestions are welcomed!

Best regards,

Shaofeng Shi 史少锋

Re: Constellation schema question -- multiple fact tables all need to be aggregated; is there a solution?

2019-04-04 Thread Xiaoxiang Yu

Hi, 
I have some ideas which may be helpful. 
1. Try to join these fact tables into another Hive table (a new fact table) 
that contains all columns, but be careful with some measures (such as SUM), 
because rows will be duplicated after the join.
2. Create a separate cube for each fact table, and merge the results on your 
side.  
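
The two options above can be compared with a small Python sketch. The table names, columns and numbers are invented for illustration; the point is only that a join can repeat rows of one fact table and inflate its SUM, while aggregating each fact table separately and merging on the shared key does not.

```python
orders = [  # fact table 1: one row per order
    {"order_id": 1, "amount": 100},
    {"order_id": 2, "amount": 50},
]
shipments = [  # fact table 2: an order may ship in several parcels
    {"order_id": 1, "weight": 3},
    {"order_id": 1, "weight": 4},
    {"order_id": 2, "weight": 1},
]


def inner_join(left, right, key):
    """Naive inner join: one output row per matching pair."""
    return [{**l, **r} for l in left for r in right if l[key] == r[key]]


def sum_by(rows, key, measure):
    """Aggregate one fact table on its own: SUM(measure) GROUP BY key."""
    totals = {}
    for row in rows:
        totals[row[key]] = totals.get(row[key], 0) + row[measure]
    return totals


# Option 1: join first. Order 1 appears twice, so its amount is counted twice.
joined = inner_join(orders, shipments, "order_id")
naive_sum = sum(row["amount"] for row in joined)
true_sum = sum(row["amount"] for row in orders)

# Option 2: aggregate each fact table separately, then merge on the shared key.
amount_by_order = sum_by(orders, "order_id", "amount")
weight_by_order = sum_by(shipments, "order_id", "weight")
merged = {
    k: {"amount": amount_by_order.get(k, 0), "weight": weight_by_order.get(k, 0)}
    for k in sorted(set(amount_by_order) | set(weight_by_order))
}
```

Here `naive_sum` is larger than `true_sum` because of the duplicated order row, while `merged` keeps both measures correct per order.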

If you find any mistake, please let me know.


Best wishes,
Xiaoxiang Yu 
 

On 2019/3/27, 20:09, "奥威软件" <3513797...@qq.com> wrote:

Hi,
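Constellation schema question:
We have a case where multiple fact tables are joined and the measures of 
several fact tables need to be aggregated. Is there a solution?


If a fact table is treated as a dimension table, then its measures can no 
longer be aggregated, right?




Please help take a look at how to solve this. Thanks!


Best regards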

Re: SQL Server JDBC data source issue: Kylin cannot read tables when the table prefix is dbo

2019-03-03 Thread Xiaoxiang Yu

Hi,
We cannot see your screenshots, but here is my solution; I hope it helps. 
(The following steps have been tested in my dev environment.)



  1.  Copy mssql-jdbc-6.4.0.jre8.jar to your $KYLIN_HOME/lib.
  2.  Add the following configuration in kylin.properties or at the project level.
  3.  Start the Kylin process and import the table.



kylin.source.default=16
kylin.source.jdbc.adaptor=org.apache.kylin.sdk.datasource.adaptor.DefaultAdaptor
kylin.source.jdbc.connection-url=jdbc:sqlserver://XXX.com:1433;database=sample
kylin.source.jdbc.dialect=mssql
kylin.source.jdbc.driver=com.microsoft.sqlserver.jdbc.SQLServerDriver
kylin.source.jdbc.pass=XXX
kylin.source.jdbc.sqoop-home=/opt/cloudera/parcels/CDH-5.15.1-1.cdh5.15.1.p0.4
kylin.source.jdbc.user=root



For screenshots, refer to: 
https://github.com/hit-lacus/hit-lacus.github.io/issues/32#issuecomment-469134060
If you find any mistakes, please let me know. Thank you.


Best wishes,
Xiaoxiang Yu


From: 奥威软件 <3513797...@qq.com>
Reply-To: "dev@kylin.apache.org" 
Date: Wednesday, February 27, 2019 at 21:54
To: dev 
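Subject: SQL Server JDBC data source issue: Kylin cannot read tables when the table prefix is dbo

 Hi,
SQL Server JDBC data source issue: when the table prefix is dbo, Kylin cannot read the table information
[cid:2AFCFD0C@78E74533.9E96765C]
When queried in the data source, the tables all have the dbo prefix

[cid:31FEFC0B@7518E300.9E96765C]



When the table prefix is not dbo,
Kylin can read the table data
[cid:2D00F70A@DB1CFE19.9E96765C]

 Please help take a look at how to solve this. Thanks!

 Best regards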
--
Regards!
Aron Tao

Re: Setting up the development environment

2019-03-04 Thread Xiaoxiang Yu

Hi, can you provide the error log and stack trace from kylin.log or kylin.out?  
Please check that your HBase cluster is in a healthy state (this can be checked 
in the Ambari web UI). 
I think kylin.properties should be provided as well.


Best wishes,
Xiaoxiang Yu 
 

On 2019/3/4, 19:38, "294936039"  wrote:

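Hello everyone! I am following the setup tutorial on the Kylin website to set up a development environment for version 2.6. When starting DebugTomcat, it reports that the HBase region server cannot be found. 
Could someone help, or has anyone put together a clearer setup guide? Thanks! I am a complete beginner, so please bear with me.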

Re: [VOTE] Release apache-kylin-2.6.1 (RC1)

2019-03-04 Thread Xiaoxiang Yu

+1

mvn test passed


Best wishes,
Xiaoxiang Yu 
 

On 2019/3/4, 18:35, "ShaoFeng Shi"  wrote:

Hi all,

I have created a build for Apache Kylin 2.6.1, release candidate 1.

Changes highlights:
[KYLIN-3494] - Build cube with spark reports ArrayIndexOutOfBoundsException
[KYLIN-3537] - Use Spark to build Cube on Yarn failed at Setp8 on HDP3.
[KYLIN-3815] - Unexpected behavior when joining the streaming table and
hive table
[KYLIN-3828] - ArrayIndexOutOfBoundsException thrown when building a
streaming cube with empty data in its first dimension
[KYLIN-3833] - Potential OOM in Spark Extract Fact Table Distinct Columns
step
[KYLIN-3826] - MergeCuboidJob only uploads necessary segment's dictionary

Thanks to everyone who has contributed to this release.
Here’s the release notes:

https://issues.apache.org/jira/secure/ReleaseNote.jspa?projectId=12316121=12344845

The commit to being voted upon:


https://github.com/apache/kylin/commit/270cfe68ecc94c66141b29e2ccf20b9ec25e23dd

Its hash is 270cfe68ecc94c66141b29e2ccf20b9ec25e23dd.

The artifacts to be voted on are located here:
https://dist.apache.org/repos/dist/dev/kylin/apache-kylin-2.6.1-rc1/

The hash of the artifact is as follows:
apache-kylin-2.6.1-source-release.zip.sha256
961b8c8d0e781fe7936efb7f33cebb9661b4fbf83082669769a41b47cea19001

A staged Maven repository is available for review at:
https://repository.apache.org/content/repositories/orgapachekylin-1060/

Release artifacts are signed with the following key:
https://people.apache.org/keys/committer/shaofengshi.asc

Please vote on releasing this package as Apache Kylin 2.6.1.

The vote is open for the next 72 hours and passes if a majority of
at least three +1 PMC votes are cast.

[ ] +1 Release this package as Apache Kylin 2.6.1
[ ]  0 I don't feel strongly about it, but I'm okay with the release
[ ] -1 Do not release this package because...


Here is my vote:

+1 (binding)

Best regards,

Shaofeng Shi 史少锋
Apache Kylin PMC
Work email: shaofeng@kyligence.io
Kyligence Inc: https://kyligence.io/

Apache Kylin FAQ: https://kylin.apache.org/docs/gettingstarted/faq.html
Join Kylin user mail group: user-subscr...@kylin.apache.org
Join Kylin dev mail group: dev-subscr...@kylin.apache.org

[Discussion] Enable shrunken dictionary by default

2019-03-17 Thread Xiaoxiang Yu

Dear all,
I suggest enabling "kylin.dictionary.shrunken-from-global-enabled" by default
(it is currently disabled by default), because I found that enabling it speeds
up the cube build process when a cube has a count distinct (bitmap) measure on
a high-cardinality column. This feature was contributed in KYLIN-3491.

When a count distinct (bitmap) measure is used on a high-cardinality column
(which requires a global dictionary), the build base cuboid step needs frequent
cache swaps, so it cannot finish within a reasonable period. KYLIN-3491 adds a
new step that builds a separate dictionary for each InputSplit before the
BuildBaseCuboid step. Each mapper of the BuildBaseCuboid step then only has to
fetch a smaller dictionary for itself (without unused values), instead of the
larger global dictionary. This reduces cache swapping and makes the
BuildBaseCuboid step run much faster.

In my test env, the Hadoop cluster is a CDH cluster with 56 vcores and 110 GB
of memory. I created a model with a fact table (153326740 rows) and three
dimension tables. There are three count distinct (bitmap) measures, and the
largest cardinality of a single column is 55200325. With ShrunkenDict disabled,
BuildBaseCuboid could not complete in 22 hours. By comparison, with
ShrunkenDict enabled, the build process completed in a reasonable duration
(Extra Dictionary costs 5 minutes, Build Base Cuboid costs 5 minutes).

https://user-images.githubusercontent.com/14030549/54363305-ad25e200-46a5-11e9-8bc7-fe2c385c0278.png

If you want to know more, please check
https://issues.apache.org/jira/browse/KYLIN-3491. If you have any suggestions,
please let me know.

Best wishes,
Xiaoxiang Yu

Re: Reply: Hive table data source: build cube fails when a column name in the Hive table contains Chinese characters

2019-03-07 Thread Xiaoxiang Yu

Hi, 
Thanks for reporting this; the issue does exist. If you have fixed it, please 
submit a PR.
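
The difference the reporter describes below (column names wrapped in backticks) can be sketched as follows. This is only an illustration and not Kylin's actual code: the class and method names are hypothetical, and doubling an embedded backtick follows Hive's documented quoted-identifier rule.

```java
// Hypothetical helper, not Kylin's actual implementation: wraps Hive
// identifiers in backticks so that non-ASCII column names parse correctly.
final class HiveIdentifierQuoter {

    // Quotes one identifier; an embedded backtick is escaped by doubling it.
    static String quote(String identifier) {
        return "`" + identifier.replace("`", "``") + "`";
    }

    // Builds one "name type" entry of a CREATE TABLE column list.
    static String columnDefinition(String name, String type) {
        return quote(name) + " " + type;
    }

    public static void main(String[] args) {
        // Unicode escapes spell the two Chinese characters of the first
        // column name in the report, to keep this source file ASCII-only.
        System.out.println(columnDefinition("ICSTOCKBILL_1W_C_\u95E8\u5E97ID", "int"));
    }
}
```

Quoting every column unconditionally is the simplest choice here, since a quoted plain-ASCII name is still valid Hive DDL.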


Best wishes,
Xiaoxiang Yu 
 

On 2019/3/6, 14:18, "奥威软件" <3513797...@qq.com> wrote:

The datasource is hive as described in the title




-- Original Message --
From: "PENG Zhengshuai";
Sent: Wednesday, March 6, 2019, 2:07 PM
To: "dev@kylin.apache.org";

Subject: Re: Hive table data source: build cube fails when a column name in the Hive table contains Chinese characters



Hi,

Let's make sure of the following:
1. Is the datasource Hive or an RDBMS?
2. If an RDBMS is used as the datasource, which one: MySQL or MSSQL?
3. Have you changed any configuration, such as disabling quoting in SQL?

BR
PENG Zhengshuai

> On Mar 6, 2019, at 1:28 PM, 奥威软件 <3513797...@qq.com> wrote:
> 
> SQL statement generated by Kylin (in Hive):
> CREATE EXTERNAL TABLE IF NOT EXISTS 
kylin_intermediate_ch_cube_5cc1_a6f3_f9f8_b90a_3e52abe0760b
> (
> ICSTOCKBILL_1W_C_门店ID int
> ,ICSTOCKBILL_1W_C_客户ID int
> ,ICSTOCKBILL_1W_C_时间 timestamp
> ,ICSTOCKBILL_1W_C_商品ID int
> ,GOODS_C_商品ID int
> ,GOODS_C_品类ID int
> ,DEPARTMENT_C_门店ID int
> ,DEPARTMENT_C_区域ID int
> ,GOODSCLASS_C_品类ID int
> ,DEPARTMENTCLASS_C_区域ID int
> ,ICSTOCKBILL_1W_C_数量 int
> ,ICSTOCKBILL_1W_C_进货价 decimal(20,3)
> ,ICSTOCKBILL_1W_C_总售价 decimal(20,3)
> ,ICSTOCKBILL_1W_C_售价 decimal(20,3)
> ,ICSTOCKBILL_1W_C_总成本 decimal(20,3)
> )
> STORED AS SEQUENCEFILE
> 
> LOCATION 
'hdfs://kylincluster/kylin/kylin_metadata/kylin-6613a735-0452-1bd5-aa22-e63013366c2a/kylin_intermediate_ch_cube_5cc1_a6f3_f9f8_b90a_3e52abe0760b';
> 
> 
> 
> The error message is: FAILED: ParseException line 3:19 cannot recognize input near 'ID' 'int' ',' in column type
> 
> 
> 
> 
> Hive SQL statement that works (the difference is that every column name is wrapped in backticks ` ):
> CREATE EXTERNAL TABLE IF NOT EXISTS 
kylin_intermediate_ch_cube_5cc1_a6f3_f9f8_b90a_3e52abe0760b
> (
> `ICSTOCKBILL_1W_C_门店ID` int
> ,`ICSTOCKBILL_1W_C_客户ID` int
> ,`ICSTOCKBILL_1W_C_时间` timestamp
> ,`ICSTOCKBILL_1W_C_商品ID` int
> ,`GOODS_C_商品ID` int
> ,`GOODS_C_品类ID` int
> ,`DEPARTMENT_C_门店ID` int
> ,`DEPARTMENT_C_区域ID` int
> ,`GOODSCLASS_C_品类ID` int
> ,`DEPARTMENTCLASS_C_区域ID` int
> ,`ICSTOCKBILL_1W_C_数量` int
> ,`ICSTOCKBILL_1W_C_进货价` decimal(20,3)
> ,`ICSTOCKBILL_1W_C_总售价` decimal(20,3)
> ,`ICSTOCKBILL_1W_C_售价` decimal(20,3)
> ,`ICSTOCKBILL_1W_C_总成本` decimal(20,3)
> )
> STORED AS SEQUENCEFILE
> 
> LOCATION 
'hdfs://kylincluster/kylin/kylin_metadata/kylin-6613a735-0452-1bd5-aa22-e63013366c2a/kylin_intermediate_ch_cube_5cc1_a6f3_f9f8_b90a_3e52abe0760b';
> 
> 
> -- Original Message --
> From: "PENG Zhengshuai";
> Sent: Wednesday, March 6, 2019, 1:09 PM
> To: "dev@kylin.apache.org";
> 
> Subject: Re: Hive table data source: build cube fails when a column name in the Hive table contains Chinese characters
> 
> 
> 
> Hi,
> 
> Can you show the Hive SQL in kylin.log from the cube build?
> 
> BR
> PENG Zhengshuai
> 
>> On Mar 5, 2019, at 3:25 PM, 奥威软件 <3513797...@qq.com> wrote:
>> 
>> Hi,
>> Hive table data source: build cube fails when a column name in the Hive table contains Chinese characters
>> kylin2.6.0 hadoop3
>> The error is as follows:
>> 
>> 
>> NoViableAltException(24@[])
>>     at org.apache.hadoop.hive.ql.parse.HiveParser.type(HiveParser.java:36813)
>>     at org.apache.hadoop.hive.ql.parse.HiveParser.colType(HiveParser.java:36595)
>>     at org.apache.hadoop.hive.ql.parse.HiveParser.columnNameTypeConstraint(HiveParser.java:34322)
>>     at org.apache.hadoop.hive.ql.parse.HiveParser.columnNameTypeOrConstraint(HiveParser.java:34075)
>>     at org.apache.hadoop.hive.ql.parse.HiveParser.columnNameTypeOrConstraintList(HiveParser.java:29819)
>>     at org.apache.hadoop.hive.ql.parse.HiveParser.createTableStatement(HiveParser.java:6662)
>>     at org.apache.hadoop.hive.ql.parse.HiveParser.ddlStatement(HiveParser.java:4295)
>>     at org.apache.hadoop.hive.ql.parse.HiveParser.execStatement(HiveParser.java:2494)
>>     at org.apache.hadoop.hive.ql.parse.HiveParser.statement(HiveParser.java:1420)
>>     at org.apache.hadoop.hive.ql.parse.ParseDriver.parse(ParseDriver.java:220)
>>     at org.apache.hadoop.hive.ql.parse.ParseUtils.parse(ParseUtils.java:74)
>>     at org.apache.hadoop.hive.ql.parse.ParseUtils.parse(ParseUtils.java:67)
>>     at org.apache.hadoop.hive.ql.Driver.compile(Driver.java:616)
>>     at org.apache.hadoop.hive.ql.Driver.compileInternal(Driver.java:1826)
>>     at org.apache.hadoop.hive.ql.Driver.compile

Re: Unexpected behavior when joining streaming table and hive table

2019-02-13 Thread Xiaoxiang Yu

Hi, lifei

After checking your model.json, I found that you use "HOUR_START" as your 
partition_date_column, which is not correct. 
I think you should change it to "timestamp" and try again. 

Source code at 
https://github.com/apache/kylin/blob/master/source-kafka/src/main/java/org/apache/kylin/source/kafka/TimedJsonStreamParser.java#L111

If you find any mistake, please let me know.

----
Best wishes,
Xiaoxiang Yu 
 

On [DATE], "[NAME]" <[ADDRESS]> wrote:

Hello, I am evaluating Kylin and tried to join streaming table and hive
table, but now got unexpected behavior.

All the scripts can be found in
https://gist.github.com/OstCollector/a4ac396e3169aa42a416d96db3021195
(may need to modify some script to match the environments)

Environment: 
Centos 7
Hadoop on CDH-5.8
dedicated Kafka-2.1 (not included in CDH)

How to reproduce this problem:

1. run gen_station.pl to generate dim table data
2. run import-data.sh to build dim table in Hive
3. run factdata.pl and pipe its output into kafka
4. create tables TEST_WEATHER.STATION_INFO(hive)
TEST_WEATHER.WEATHER(streaming) in Kylin
5. create model and cube in Kylin, join WEATHER.SATION_ID = STATION.ID
6. build the cube

Expected behavior:
The cube is built correctly and I can get data when search.

Actual behavior:
On apache-kylin-2.6.0-bin-cdh57: build failed at step #2 (Create
Intermediate Flat Hive Table)
On apache-kylin-2.5.2-bin-cdh57: got empty cube

I also tried with this case without streaming, with the format of timestamp
column changed to "%Y-%m-%d %H:%M:%S", and an additional table to store the
mapping of timestamp and {hour,day,month,year}_start.
In this case, the cube is built as expected. 


In both failed cases, the intermediate fact table on Hive built in step #2
seems to have wrong column order.
e.g. on version 2.5.2-cdh57, the schema and content of temp table are shown
below:

CREATE EXTERNAL TABLE IF NOT EXISTS
kylin_intermediate_weather_f32241e6_53c6_2949_b737_d9a88a4618df_fact
(
DAY_START date
,YEAR_START date
,STATION_ID string
,QUARTER_START date
,MONTH_START date
,TEMPERATURE bigint
,HOUR_START timestamp
)
STORED AS SEQUENCEFILE
LOCATION

'hdfs://hz-dev-hdfs-service/user/admin/kylin-2/kylin_metadata/kylin-5dbe40eb-55ba-2245-c0b5-1e9efcb67937/kylin_intermediate_weather_f32241e6_53c6_2949_b737_d9a88a4618df_fact';
ALTER TABLE
kylin_intermediate_weather_f32241e6_53c6_2949_b737_d9a88a4618df_fact SET
TBLPROPERTIES('auto.purge'='true');

hive> select * from
kylin_intermediate_weather_f32241e6_53c6_2949_b737_d9a88a4618df_fact limit
10;
OK
NULL    2010-01-01  2010-01-01  2010-01-01  2010-01-01  NULL    NULL
NULL    2009-01-01  2009-10-01  2009-12-01  2009-12-31  NULL    NULL
NULL    2009-01-01  2009-10-01  2009-12-01  2009-12-31  NULL    NULL
NULL    2009-01-01  2009-10-01  2009-12-01  2009-12-31  NULL    NULL
NULL    2009-01-01  2009-10-01  2009-12-01  2009-12-31  NULL    NULL
NULL    2010-01-01  2010-01-01  2010-01-01  2010-01-01  NULL    NULL
NULL    2010-01-01  2010-01-01  2010-01-01  2010-01-01  NULL    NULL
NULL    2009-01-01  2009-10-01  2009-12-01  2009-12-31  NULL    NULL
NULL    2009-01-01  2009-10-01  2009-12-01  2009-12-31  NULL    NULL
NULL    2010-01-01  2010-01-01  2010-01-01  2010-01-01  NULL    NULL
Time taken: 0.421 seconds, Fetched: 10 row(s)

While the content of the temp file is:
# hdfs dfs -text

hdfs://hz-dev-hdfs-service/user/admin/kylin-2/kylin_metadata/kylin-5dbe40eb-55ba-2245-c0b5-1e9efcb67937/kylin_intermediate_weather_f32241e6_53c6_2949_b737_d9a88a4618df_fact/part-m-1
| head -n 10
19/02/13 11:44:12 INFO zlib.ZlibFactory: Successfully loaded & initialized
native-zlib library
19/02/13 11:44:12 INFO compress.CodecPool: Got brand-new decompressor
[.deflate]
19/02/13 11:44:12 INFO compress.CodecPool: Got brand-new decompressor
[.deflate]
19/02/13 11:44:12 INFO compress.CodecPool: Got brand-new decompressor
[.deflate]
19/02/13 11:44:12 INFO compress.CodecPool: Got brand-new decompressor
[.deflate]
0030322010-01-012010-01-012010-01-012010-01-012010-01-01
07:00:001706
0075762010-01-012010-01-012010-01-012010-01-012010-01-01
07:00:002605
0113882010-01-012010-01-012010-01-012010-01-012010-01-01
07:00:002963
0214922010-01-012010-01-012010-01-012010-01-012010-01-01
07:00:001769
0303062010-01-012010-01-012010-01-012010-01-012010-01-01 0

Re: Hdfs Working directory usage

2019-02-10 Thread Xiaoxiang Yu

Hi, Ketan.

This is what I found:

- cuboid
  - This dir contains the cuboid data; each row contains a dimensions array 
    and a MeasureAggregator array. 
  - Its size depends on the cardinality of each column, and it is often 
    very large. 
  - When a merge job completes, the cuboid files of all segments that were 
    merged successfully are deleted automatically.
- fact_distinct_columns
  - This dir contains the distinct values of each column.
  - It should be deleted after the current segment's build job succeeds.
- hfile
  - This dir contains the data files that are bulk loaded into HBase.
  - It should be deleted after the current segment's build job succeeds.
- rowkey_stats
  - Files under this dir are often very small, so you may not need to delete 
    them yourself.
  - These files are used to partition the hfile.

I think you should update your auto-merge settings so that auto-merge runs more 
often. If you find any mistakes, please let me know. Thank you!



Best wishes,
Xiaoxiang Yu 
 

On [DATE], "[NAME]" <[ADDRESS]> wrote:

Hi team,
Any updates on the same ?  

Thanks,
Ketan

> On 01-Feb-2019, at 11:39 AM, ketan dikshit  wrote:
> 
> Hi Team,
> 
> We have a lot of data accumulated in our hdfs-working-directory, so we 
want to understand the usage of the following job data, once the job has been 
completed and segment is successfully created. 
> 
> cuboid
> fact_distinct_columns
> hfile
> rowkey_stats
> 
> Basically I need to understand the purpose of: 
cuboid,fact_distinct_columns,hfile,rowkey_stats after the job has built the 
cube segment (assuming we don’t use and merging/automerging of segments on the 
cube later).
> 
> The space taken up by these data in hdfs-working-dir is quite 
huge (affecting our costs), and is not getting cleaned up by the cleanup 
job (org.apache.kylin.tool.StorageCleanupJob). So we need to understand whether, 
if we clean this up manually, we will run into any issues later.
> 
> Thanks,
> Ketan@Exponential

Re: SQL Server JDBC Datasource doesnt list tables.

2019-01-29 Thread Xiaoxiang Yu

Hi Gladson,
I reproduced the same situation in my dev env. The problem is caused by 
org.apache.kylin.source.jdbc.metadata.SQLServerJdbcMetadata.java (line 53): 
the driver reports null in the "TABLE_CATALOG" column, and the code does not 
account for that.


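This can be verified with the following code:

import java.sql.Connection;
import java.sql.DatabaseMetaData;
import java.sql.DriverManager;
import java.sql.ResultSet;
import java.sql.SQLException;

public class SqlServerMetadataCheck {
    public static void main(String[] args) throws SQLException {
        String sqlServerUrl = "jdbc:sqlserver://cdh1.cloudera.com:1433;user=SA;password=Kyligence2019;DatabaseName=xiaoxiang_test";

        // try-with-resources closes the connection and the result sets
        try (Connection con = DriverManager.getConnection(sqlServerUrl)) {
            DatabaseMetaData meta = con.getMetaData();

            System.out.println("");
            // Column 1 of getCatalogs() is TABLE_CAT
            try (ResultSet rs = meta.getCatalogs()) {
                while (rs.next()) {
                    System.out.println(String.format("Catalog \t%s", rs.getString(1)));
                }
            }

            System.out.println("");
            // Columns of getSchemas(): 1 = TABLE_SCHEM, 2 = TABLE_CATALOG
            try (ResultSet rs = meta.getSchemas()) {
                while (rs.next()) {
                    System.out.println(String.format("Schemas \t%s\t%s", rs.getString(1), rs.getString(2)));
                }
            }
        }
    }
}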

Output is as follows:


Catalog  master
Catalog  model
Catalog  msdb
Catalog  tempdb
Catalog  xiaoxiang_test

Schemas  db_accessadmin      null
Schemas  db_backupoperator   null
Schemas  db_datareader       null
Schemas  db_datawriter       null
Schemas  db_ddladmin         null
Schemas  db_denydatareader   null
Schemas  db_denydatawriter   null
Schemas  db_owner            null
Schemas  db_securityadmin    null
Schemas  dbo                 null
Schemas  guest               null
Schemas  INFORMATION_SCHEMA  null
Schemas  sys                 null


By the way, which version of SQL Server do you use, and which version of the 
JDBC driver? Maybe we can fix this by changing the way the metadata (catalog) 
is fetched in SQLServerJdbcMetadata.

If you find any mistake, please let me know.


Best wishes,
Xiaoxiang Yu 
 

On [DATE], "[NAME]" <[ADDRESS]> wrote:

Hi,
I followed all the steps in this url 
http://kylin.apache.org/docs/tutorial/setup_jdbc_datasource.html , but when i 
click on Load table button or Load table from tree i don't seem to have any 
tables loaded from the SQL Server data source.There are no errors/exceptions in 
the logs too.
kylin.properties:

kylin.source.default=8
kylin.source.jdbc.connection-url=jdbc:sqlserver://hostname:1433;database=sample
kylin.source.jdbc.driver=com.microsoft.sqlserver.jdbc.SQLServerDriver
kylin.source.jdbc.dialect=mssql
kylin.source.jdbc.user=user
kylin.source.jdbc.pass=pass
kylin.source.jdbc.sqoop-home=sqoophome
kylin.source.jdbc.filed-delimiter=|
kylin.source.jdbc.sqoop-mapper-num=4

kylin.log
2019-01-28 22:52:27,948 DEBUG [http-bio-7070-exec-1] common.KylinConfig:328 : KYLIN_CONF property was not set, will seek KYLIN_HOME env variable
2019-01-28 22:52:28,017 INFO  [FetcherRunner 992042775-44] threadpool.DefaultFetcherRunner:94 : Job Fetcher: 0 should running, 0 actual running, 0 stopped, 0 ready, 0 already succeed, 0 error, 0 discarded, 0 others
2019-01-28 22:52:28,123 INFO  [http-bio-7070-exec-4] common.KylinConfig:455 : Creating new manager instance of class org.apache.kylin.metadata.project.ProjectManager
2019-01-28 22:52:28,125 INFO  [http-bio-7070-exec-4] project.ProjectManager:81 : Initializing ProjectManager with metadata url kylin_metadata@hbase
2019-01-28 22:52:28,129 DEBUG [http-bio-7070-exec-4] cachesync.CachedCrudAssist:118 : Reloading ProjectInstance from kylin_metadata(key='/project')@kylin_metadata@hbase
2019-01-28 22:52:28,188 DEBUG [http-bio-7070-exec-4] cachesync.CachedCrudAssist:127 : Loaded 1 ProjectInstance(s) out of 1 resource
2019-01-28 22:52:28,304 DEBUG [http-bio-7070-exec-5] project.ProjectL2Cache:198 : Loading L2 project cache for Testing
2019-01-28 22:52:28,325 INFO  [http-bio-7070-exec-5] common.KylinConfig:455 : Creating new manager instance of class org.apache.kylin.metadata.TableMetadataManager
2019-01-28 22:52:28,326 DEBUG [http-bio-7070-exec-5] cachesync.CachedCrudAssist:118 : Reloading TableDesc from kylin_metadata(key='/table')@kylin_metadata@hbase
2019-01-28 22:52:28,333 INFO  [http-bio-7070-exec-1] common.KylinConfig:455 : Creating new manager instance of class org.apache.kylin.metadata.model.DataModelManager
2019-01-28 22:52:28,360 DEBUG [http-bio-7070-exec-5] cachesync.CachedCrudAssist:127 : Loaded 0 TableDesc(s) out of 0 resource
2019-01-28 22:52:28,361 DEBUG [http-bio-7070-exec-5] cachesync.CachedCrudAssist:118 : Reloading TableExtDesc from kylin_metadata(key='/table_exd')@kylin_metadata@hbase
2019-01-28 22:52:28,391 DEBUG [http-bio-7070-exec-5] cachesync.CachedCrudAssist:127 : Loaded 0 TableExtDesc(s) out of 0 resource
2019-01-28 22:52:28,392 DEBUG [http-bio-7070-exec-5] cachesync.CachedCrudAssist:118 : Reloading ExternalFilterDesc from kylin_metadata(key='/ext_filter')@kylin_metadata@hbase
2019-01-28 22:52:28,421 DEBUG [http-bio-7070-exec-5] cachesync.CachedCrudAssist:127 : Loaded 0 ExternalFilte

Re: Hive ORC tables and empty lookup tables in the cube

2019-06-02 Thread Xiaoxiang Yu

Hi, Vadym Antsut
1. "The cube builded successfully, but all lookup tables is empty." How 
do you know your lookup tables are empty? Are they empty in Hive or in Kylin?
2. " I build the cube on the hive parquet tables in model " Does that 
mean you dropped your previous table and created a new one with " STORED AS 
PARQUET"? Have you checked your lookup table in the Hive CLI?

----
Best wishes,
Xiaoxiang Yu 
 

On 2019/6/2 16:03, "Вадим Анцут" wrote:

Hello!

Can anyone help me with a problem:
I have in Hive one fact table and few lookup tables in the model for kylin
cube. The cube builded successfully, but all lookup tables is empty.
If I build the cube on the hive parquet tables in model with the same
values, then the lookup tables are not empty.


-- 
WBR,
Vadym Antsut

Re: [ANNOUNCE] Gang Ma joins the Apache Kylin PMC

2019-06-04 Thread Xiaoxiang Yu

Thanks Gang Ma for his selfless help and encouragement in reviewing my code. I 
hope I can get more guidance from you in the future and make Kylin better. 
Congratulations!


Best wishes,
Xiaoxiang Yu 
 

On 2019/6/3 13:32, "ShaoFeng Shi" wrote:

On behalf of the Apache Kylin PMC, I am pleased to announce that Gang Ma
(马刚) has accepted our invitation to become a PMC member on the Apache Kylin
project. We appreciate Gang stepping up to take more responsibility in the
Kylin project.

Please join me in welcoming Gang to the Kylin PMC!

Best Regards,

Shaofeng Shi 史少锋
Apache Kylin PMC
Email: shaofeng...@apache.org

Apache Kylin FAQ: https://kylin.apache.org/docs/gettingstarted/faq.html
Join Kylin user mail group: user-subscr...@kylin.apache.org
Join Kylin dev mail group: dev-subscr...@kylin.apache.org

Re: Hive ORC tables and empty lookup tables in the cube

2019-06-04 Thread Xiaoxiang Yu

Hi Antsut,
  Sorry for my late reply. First, let's make things clear: you have two cubes 
(let us call them cube1 and cube2); cube1's lookup table is in ORC format and 
cube2's lookup table is in Parquet format. Now you find that, with the same SQL 
query, the result from cube1 is correct and the result from cube2 is wrong. 
  To rule out causes in the data source (external to Kylin), please run the 
same SQL query in the Hive CLI. If the result from Kylin is different from 
what you see in the Hive CLI, I guess it is a bug in Kylin. In that case, 
please report a JIRA at https://issues.apache.org/jira/projects/KYLIN, with 
some details that help us reproduce the bug (the version of Kylin you are 
using, the versions of Hive/Hadoop, the DDL of the lookup table, and 
screenshots of the query results from the Hive CLI and from Kylin).


Best wishes,
Xiaoxiang Yu 
 

On 2019/6/3 15:40, "Vadajer" wrote:

Hello Xiaoxiang Yu!

Thanks for reply!

1. I'm selecting in kylin insight gui from lookup table or from fact table
with join to lookup table. All values from lookup table are empty. 

<http://apache-kylin.74782.x6.nabble.com/file/t967/empty.png> 

2. I'm create in hive the same tables (name with prefix) with "STORED AS
PARQUET" and create another cube. 
(see attached pics)

<http://apache-kylin.74782.x6.nabble.com/file/t967/parquet.png> 

WBR,
Vadym Antsut


--
Sent from: http://apache-kylin.74782.x6.nabble.com/

Re: Error when building cube

2019-06-13 Thread Xiaoxiang Yu

Hi,
  I have a simple workaround which may help you. Could you please use HDP 3.0 
(in which the Hive version is 3.1, the HBase version is 2.0, and MySQL is used 
as the Hive metastore database)? I am sure Kylin works well in that env because 
I have verified it.
  If you have to use Oracle as the metadata store of Hive, and you are sure 
that the error can be reproduced, please report a JIRA
to https://issues.apache.org/jira/browse/KYLIN (with your Kylin version and 
environment details). Thank you very much.

Best wishes,
Xiaoxiang Yu


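From: 平 
Reply-To: "u...@kylin.apache.org" 
Date: Thursday, June 13, 2019 at 17:35
To: "u...@kylin.apache.org" 
Cc: "dev@kylin.apache.org" 
Subject: Error when building cube


Hello, I am a beginner and really do not know how to solve this kind of problem.

When building a cube, I get ORA-00904: "B0"."CATALOG_NAME": invalid identifier. My Kylin version is 2.6.2 for hadoop3, 
the Hadoop version is 3.1, the Hive version is 3.1.0, the HBase version is 2.0, and the operating system is CentOS 7.

I also checked the DBS table, and it indeed has no CATALOG_NAME column. I looked through the initialization scripts of several Hive versions and could not find a CATALOG_NAME column either, so I do not know what is wrong. I would be very grateful if someone could give me a hint!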



java.lang.RuntimeException: java.io.IOException: MetaException(message:Exception thrown when executing query : SELECT DISTINCT 'org.apache.hadoop.hive.metastore.model.MTable' AS NUCLEUS_TYPE,A0.CREATE_TIME,A0.LAST_ACCESS_TIME,A0.OWNER,A0.OWNER_TYPE,A0.RETENTION,A0.REWRITE_ENABLED,A0.TBL_NAME,A0.TBL_TYPE,A0.TBL_ID FROM TBLS A0 LEFT OUTER JOIN DBS B0 ON A0.DB_ID = B0.DB_ID WHERE A0.TBL_NAME = ? AND B0."NAME" = ? AND B0."CATALOG_NAME" = ?)
    at org.apache.kylin.source.hive.HiveMRInput$HiveTableInputFormat.configureJob(HiveMRInput.java:83)
    at org.apache.kylin.engine.mr.steps.FactDistinctColumnsJob.setupMapper(FactDistinctColumnsJob.java:126)
    at org.apache.kylin.engine.mr.steps.FactDistinctColumnsJob.run(FactDistinctColumnsJob.java:104)
    at org.apache.kylin.engine.mr.common.MapReduceExecutable.doWork(MapReduceExecutable.java:131)
    at org.apache.kylin.job.execution.AbstractExecutable.execute(AbstractExecutable.java:167)
    at org.apache.kylin.job.execution.DefaultChainedExecutable.doWork(DefaultChainedExecutable.java:71)
    at org.apache.kylin.job.execution.AbstractExecutable.execute(AbstractExecutable.java:167)
    at org.apache.kylin.job.impl.threadpool.DistributedScheduler$JobRunner.run(DistributedScheduler.java:110)
    at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1149)
    at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:624)
    at java.lang.Thread.run(Thread.java:748)
Caused by: java.io.IOException: MetaException(message:Exception thrown when executing query : SELECT DISTINCT 'org.apache.hadoop.hive.metastore.model.MTable' AS NUCLEUS_TYPE,A0.CREATE_TIME,A0.LAST_ACCESS_TIME,A0.OWNER,A0.OWNER_TYPE,A0.RETENTION,A0.REWRITE_ENABLED,A0.TBL_NAME,A0.TBL_TYPE,A0.TBL_ID FROM TBLS A0 LEFT OUTER JOIN DBS B0 ON A0.DB_ID = B0.DB_ID WHERE A0.TBL_NAME = ? AND B0."NAME" = ? AND B0."CATALOG_NAME" = ?)
    at org.apache.hive.hcatalog.mapreduce.HCatInputFormat.setInput(HCatInputFormat.java:97)
    at org.apache.hive.hcatalog.mapreduce.HCatInputFormat.setInput(HCatInputFormat.java:51)
    at org.apache.kylin.source.hive.HiveMRInput$HiveTableInputFormat.configureJob(HiveMRInput.java:80)
    ... 10 more
Caused by: MetaException(message:Exception thrown when executing query : SELECT DISTINCT 'org.apache.hadoop.hive.metastore.model.MTable' AS NUCLEUS_TYPE,A0.CREATE_TIME,A0.LAST_ACCESS_TIME,A0.OWNER,A0.OWNER_TYPE,A0.RETENTION,A0.REWRITE_ENABLED,A0.TBL_NAME,A0.TBL_TYPE,A0.TBL_ID FROM TBLS A0 LEFT OUTER JOIN DBS B0 ON A0.DB_ID = B0.DB_ID WHERE A0.TBL_NAME = ? AND B0."NAME" = ? AND B0."CATALOG_NAME" = ?)
    at org.apache.hadoop.hive.metastore.RetryingHMSHandler.invokeInternal(RetryingHMSHandler.java:208)
    at org.apache.hadoop.hive.metastore.RetryingHMSHandler.invoke(RetryingHMSHandler.java:108)
    at com.sun.proxy.$Proxy70.get_table_req(Unknown Source)
    at org.apache.hadoop.hive.metastore.HiveMetaStoreClient.getTable(HiveMetaStoreClient.java:1578)
    at org.apache.hadoop.hive.metastore.HiveMetaStoreClient.getTable(HiveMetaStoreClient.java:1570)
    at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
    at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62)
    at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
    at java.lang.reflect.Method.invoke(Method.java:498)
    at org.apache.hadoop.hive.metastore.RetryingMetaStoreClient.invoke(RetryingMetaStoreClient.java:208)
    at com.sun.proxy.$Proxy71.getTable(Unknown Source)
    at org.apache.hive.hcatalog.common.HCatUtil.getTable(HCatUtil.java:191)
    at org.apache.hive.hcatalog.mapreduce.InitializeInput.getInputJobInfo(InitializeInput.java:105)
    at org.apache.hive.hcatalog.mapreduce.InitializeInput.setInput(InitializeInput.java:88)
    at org.apache.hive.hcatalog.mapreduce.HCatInputFormat.setInput(HCatInputFormat.java:95)
    ... 12 more
Caused by: javax.jdo.JDOException: Exception thrown when executing query : 
SELECT DISTINCT 'org.apache.ha

Re: Question about cube rowkey ordering

2019-06-10 Thread Xiaoxiang Yu

Hi, wangfx

Kylin converts a SQL query into the two parameters (start_key and end_key) of 
a range Scan operation in HBase.
A well-designed Rowkey completes query filtering and data positioning more 
effectively, reduces the number of IO operations, and improves query speed. The 
order of the dimensions in the Rowkey has a significant impact on query 
performance.

The following two principles need to be combined when adjusting the order of 
the Rowkey:
1. Dimensions that are used as filter conditions in queries are placed in front 
of dimensions that are not used as filters.
2. Dimensions with a higher cardinality are placed before dimensions with a 
lower cardinality.

So, in your situation, I suggest the order should be: a, b, c, d (if you have 
only four dimensions).   
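
The two principles above can be sketched as a sort: filter dimensions first, then higher cardinality first within each group. This is only an illustration, not Kylin code; the class name and the concrete cardinality numbers are assumptions chosen to satisfy c > d > a > b from the question.

```java
import java.util.ArrayList;
import java.util.Comparator;
import java.util.List;

// Hypothetical helper, not part of Kylin: orders rowkey dimensions so that
// filter dimensions come first and, within each group, higher cardinality
// comes first.
final class RowkeyOrder {

    static final class Dimension {
        final String name;
        final boolean usedInFilter;
        final long cardinality;

        Dimension(String name, boolean usedInFilter, long cardinality) {
            this.name = name;
            this.usedInFilter = usedInFilter;
            this.cardinality = cardinality;
        }
    }

    static List<String> order(List<Dimension> dimensions) {
        List<Dimension> sorted = new ArrayList<>(dimensions);
        sorted.sort(Comparator
                // The key is false for filter dimensions, and false sorts before true.
                .comparing((Dimension d) -> !d.usedInFilter)
                // Within each group, the larger cardinality sorts first.
                .thenComparing(Comparator.comparingLong((Dimension d) -> d.cardinality).reversed()));
        List<String> names = new ArrayList<>();
        for (Dimension d : sorted) {
            names.add(d.name);
        }
        return names;
    }

    public static void main(String[] args) {
        // Cardinalities are made-up numbers that respect c > d > a > b.
        List<Dimension> dims = new ArrayList<>();
        dims.add(new Dimension("d", false, 100_000L));
        dims.add(new Dimension("b", true, 3L));
        dims.add(new Dimension("c", false, 1_000_000L));
        dims.add(new Dimension("a", true, 1_000L));
        System.out.println(order(dims));
    }
}
```

With these assumed values the sort yields a, b, c, d, matching the suggested order.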

And this link may help, 
https://kyligence.io/zh/blog/apache-kylin-optimizer-kybot-rowkey/.



Best wishes,
Xiaoxiang Yu 
 

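On 2019/6/11 09:56, "wangfx" <945517...@qq.com> wrote:


The cube has several dimensions. a and b are mandatory dimensions that always appear in the WHERE clause, and b has a very low cardinality (only 3 distinct values). c and d never appear in WHERE, only in SELECT and GROUP BY. 
The cardinalities are c > d > a > b. The remaining dimensions are regular dimensions used in WHERE. How should a, b, c, d and the other dimensions be ordered in the rowkey?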

Re: ORC lookup tables missed in cube calculation.

2019-06-10 Thread Xiaoxiang Yu

Thanks for reporting, we will examine it closely.

Best wishes,
Xiaoxiang Yu 
 

On 2019/6/11 01:22, "Александр Сидорчук" wrote:

Hello,

Please help with finding a solution. ORC is the best way to store data in
Hive, but we have a blocking issue with it.
Please look into it:
https://issues.apache.org/jira/browse/KYLIN-4038

I have already checked the source code, and it looks like the problem is in
HCatalog, which is used by Kylin. But I have no idea what I can do next. I
will be glad of any suggestions.

Thanks!

Reply: How does Kylin handle NULL values when summing

2019-06-25 Thread Xiaoxiang Yu

Hi,
 I verified this in my test env. What you reported is indeed correct. 

SELECT SUM(INTEREST_SCORE) AS A, AAA as B, AAA + SUM(INTEREST_SCORE) as C
FROM USERACTION
LEFT JOIN (
SELECT SUM(INTEREST_SCORE) AS AAA
FROM USERACTION
WHERE DT = '2012-01-02'
) A on 1 = 1 
WHERE DT = '2012-01-01'
GROUP BY AAA


What Kylin returns looks like the following:
  |   A   | B | C |
  | 5.85  |   |   |


Using the same SQL in the Hive CLI (1.2.X), the result is the same:


OK
5.851   NULL    NULL


So Kylin returns the same result as Hive, and Kylin does not appear to be 
making a mistake. I have also checked other NULL-related queries in Kylin and 
found no mistakes. 
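
The behaviour discussed here is standard SQL NULL propagation: adding NULL to any value yields NULL, and converting NULL to 0 first (as COALESCE(x, 0) does in standard SQL) avoids it. The sketch below only models those semantics with Java's null; the class and method names are hypothetical, and this thread does not confirm which NULL-handling functions Kylin supports for this query.

```java
// Hypothetical model of SQL NULL arithmetic using Java's null; not Kylin code.
final class NullArithmetic {

    // SQL-style addition: if either operand is NULL (null here), the result is NULL.
    static Double sqlAdd(Double a, Double b) {
        return (a == null || b == null) ? null : Double.valueOf(a + b);
    }

    // Mirrors COALESCE(x, 0): replaces NULL with 0 before arithmetic.
    static double coalesceZero(Double x) {
        return x == null ? 0.0 : x;
    }

    public static void main(String[] args) {
        Double a = 5.85;   // SUM for the date that has rows
        Double b = null;   // SUM over no rows is NULL

        System.out.println("A + B                = " + sqlAdd(a, b));
        System.out.println("A + COALESCE(B, 0)   = " + sqlAdd(a, coalesceZero(b)));
    }
}
```

The first line prints null because one operand is NULL; the second keeps the value of A because B has been replaced with 0 before the addition.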






-
-
Best wishes to you ! 
From ：Xiaoxiang Yu



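On 2019-06-25 10:58:22, "肖孟华" wrote:
>Executed: select sum( TRAAMT ) AAA from TM_SR_SKY_T31293716 where  AREA_CODE ='370799'
>The query result set is empty; TRAAMT is of type number.
>
>
>Executed: select sum( TRAAMT ) AAA from TM_SR_SKY_T31293716 where  AREA_CODE ='370783'
>The query result set is not empty and has a value; TRAAMT is of type number.
>
>
>Executed: select sum( TRAAMT ) A , AAA AS B,  sum( TRAAMT ) + AAA AS C from 
>TM_SR_SKY_T31293716
>left join 
>(select sum( TRAAMT ) AAA from TM_SR_SKY_T31293716 where  AREA_CODE ='370799') 
>A
>on 1=1
>where AREA_CODE ='370783' group by aaa
>After successful execution, result A is not empty, result B is empty, and the result C of adding A and B is empty.
>
>
>Question: when a NULL value and a non-NULL value are combined in any of the four arithmetic operations (addition, subtraction, multiplication, division), the result is always NULL, and the sum() function has the same problem in use. Is it possible to handle NULL values specially in SQL before the calculation, converting NULL to 0 first? If so, how can this be done, and what is required?
>
>
>
>
>From: 肖孟华
>Phone: (+86)17616716362
>Address: Weifang Software Park, Jiankang Street, High-tech Zone, Weifang, Shandong Province, China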
>
>
>
>
>

Re: [VOTE] Release apache-kylin-2.6.2 (RC1)

2019-05-13 Thread Xiaoxiang Yu

+1 
mvn test passed


Best wishes,
Xiaoxiang Yu 
 

On 2019/5/14 09:10, "ShaoFeng Shi" wrote:

Hi all,

I have created a build for Apache Kylin 2.6.2, release candidate 1.

Changes highlights:
[KYLIN-3892] - Set cubing job priority
[KYLIN-3839] - Storage clean up after refreshing or deleting a segment
[KYLIN-3873] - Fix inappropriate use of memory in SparkFactDistinct.java
[KYLIN-3905] - Enable shrunken dictionary default
[KYLIN-3922] - Fail to update coprocessor when run DeployCoprocessorCLI
[KYLIN-3936] - MR/Spark task will still run after the job is stopped.


Thanks to everyone who has contributed to this release.
Here’s release notes:

https://issues.apache.org/jira/secure/ReleaseNote.jspa?projectId=12316121=12345051

The commit to being voted upon:


https://github.com/apache/kylin/commit/c507ae29fa64bc7234efd6a002dcfe990969ad35

Its hash is c507ae29fa64bc7234efd6a002dcfe990969ad35.

The artifacts to be voted on are located here:
https://dist.apache.org/repos/dist/dev/kylin/apache-kylin-2.6.2-rc1/

The hash of the artifact is as follows:
apache-kylin-2.6.2-source-release.zip.sha256
db2ab59d3e66d635462e9c9ef49fd7ca29342f07ff4eea0730e52777287e2ebf

A staged Maven repository is available for review at:
https://repository.apache.org/content/repositories/orgapachekylin-1062/

Release artifacts are signed with the following key:
https://people.apache.org/keys/committer/shaofengshi.asc

Please vote on releasing this package as Apache Kylin 2.6.2.

The vote is open for the next 72 hours and passes if a majority of
at least three +1 PMC votes are cast.

[ ] +1 Release this package as Apache Kylin 2.6.2
[ ]  0 I don't feel strongly about it, but I'm okay with the release
[ ] -1 Do not release this package because...


Here is my vote:

+1 (binding)

Best regards,

Shaofeng Shi 史少锋
Apache Kylin PMC
Email: shaofeng...@apache.org

Apache Kylin FAQ: https://kylin.apache.org/docs/gettingstarted/faq.html
Join Kylin user mail group: user-subscr...@kylin.apache.org
Join Kylin dev mail group: dev-subscr...@kylin.apache.org

Re : Could not open client transport for any of the Server URI's in ZooKeeper: Unable to read HiveServer2 configs from ZooKeeper

2019-05-21 Thread Xiaoxiang Yu

Hi, 
  From the error stack trace, I guess the cause is either that the ZooKeeper 
server is not in a healthy state or that your Hive client is not configured 
correctly. I think you should contact your Hadoop admin for help. 


Best wishes,
Xiaoxiang Yu 
 

On 2019/5/20 11:19, "王廉鑫" wrote:

Hi,






 I run Kylin 2.5.2 on FusionInsight HD. When I click "load table metadata from 
tree" under the data source in the web UI, I get this error:




2019-05-20 11:15:58,350 INFO  [http-bio-7070-exec-5-EventThread] 
zookeeper.ClientCnxn:614 : EventThread shut down for session: 0xfb0ab634ba4f
2019-05-20 11:15:58,350 INFO  [http-bio-7070-exec-5] 
zookeeper.ZooKeeper:1325 : Session: 0xfb0ab634ba4f closed
2019-05-20 11:15:58,351 ERROR [http-bio-7070-exec-5] 
controller.TableController:190 : java.sql.SQLException: Could not open client 
transport for any of the Server URI's in ZooKeeper: Unable to read HiveServer2 
configs from ZooKeeper
java.lang.RuntimeException: java.sql.SQLException: Could not open client 
transport for any of the Server URI's in ZooKeeper: Unable to read HiveServer2 
configs from ZooKeeper
at 
org.apache.kylin.source.hive.BeelineHiveClient.init(BeelineHiveClient.java:76)
at 
org.apache.kylin.source.hive.BeelineHiveClient.<init>(BeelineHiveClient.java:66)
at 
org.apache.kylin.source.hive.HiveClientFactory.getHiveClient(HiveClientFactory.java:29)
at 
org.apache.kylin.source.hive.HiveMetadataExplorer.<init>(HiveMetadataExplorer.java:43)
at 
org.apache.kylin.source.hive.HiveSource.getSourceMetadataExplorer(HiveSource.java:41)
at 
org.apache.kylin.rest.service.TableService.getSourceDbNames(TableService.java:276)
at 
org.apache.kylin.rest.controller.TableController.showHiveDatabases(TableController.java:188)
at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
at 
sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62)
at 
sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
at java.lang.reflect.Method.invoke(Method.java:498)
at 
org.springframework.web.method.support.InvocableHandlerMethod.doInvoke(InvocableHandlerMethod.java:205)
at 
org.springframework.web.method.support.InvocableHandlerMethod.invokeForRequest(InvocableHandlerMethod.java:133)
at 
org.springframework.web.servlet.mvc.method.annotation.ServletInvocableHandlerMethod.invokeAndHandle(ServletInvocableHandlerMethod.java:97)
at 
org.springframework.web.servlet.mvc.method.annotation.RequestMappingHandlerAdapter.invokeHandlerMethod(RequestMappingHandlerAdapter.java:827)
at 
org.springframework.web.servlet.mvc.method.annotation.RequestMappingHandlerAdapter.handleInternal(RequestMappingHandlerAdapter.java:738)
at 
org.springframework.web.servlet.mvc.method.AbstractHandlerMethodAdapter.handle(AbstractHandlerMethodAdapter.java:85)
at 
org.springframework.web.servlet.DispatcherServlet.doDispatch(DispatcherServlet.java:967)
at 
org.springframework.web.servlet.DispatcherServlet.doService(DispatcherServlet.java:901)
at 
org.springframework.web.servlet.FrameworkServlet.processRequest(FrameworkServlet.java:970)
at 
org.springframework.web.servlet.FrameworkServlet.doGet(FrameworkServlet.java:861)
at javax.servlet.http.HttpServlet.service(HttpServlet.java:624)
at 
org.springframework.web.servlet.FrameworkServlet.service(FrameworkServlet.java:846)
at javax.servlet.http.HttpServlet.service(HttpServlet.java:731)
at 
org.apache.catalina.core.ApplicationFilterChain.internalDoFilter(ApplicationFilterChain.java:303)
at 
org.apache.catalina.core.ApplicationFilterChain.doFilter(ApplicationFilterChain.java:208)
at 
org.apache.tomcat.websocket.server.WsFilter.doFilter(WsFilter.java:52)
at 
org.apache.catalina.core.ApplicationFilterChain.internalDoFilter(ApplicationFilterChain.java:241)
at 
org.apache.catalina.core.ApplicationFilterChain.doFilter(ApplicationFilterChain.java:208)
at 
org.springframework.security.web.FilterChainProxy$VirtualFilterChain.doFilter(FilterChainProxy.java:317)
at 
org.springframework.security.web.access.intercept.FilterSecurityInterceptor.invoke(FilterSecurityInterceptor.java:127)
at 
org.springframework.security.web.access.intercept.FilterSecurityInterceptor.doFilter(FilterSecurityInterceptor.java:91)
at 
org.springframework.security.web.FilterChainProxy$VirtualFilterChain.doFilter(FilterChainProxy.java:331)
at 
org.springframework.security.web.access.ExceptionTranslationFilter.doFilter(ExceptionTranslationFilter.java:114)
at 
org.springframework.security.web.FilterChainProxy$VirtualFilterChain.doFilter(FilterChainProxy.java:331

Re:the mysql datasource and the hive datasource dose not valueable at the same time

2019-07-13 Thread Xiaoxiang Yu

Hi, 
   Currently Kylin does not support multiple source types, such as RDBMS and 
Hive, in a single project. I think managing each source type in its own 
project won't be much trouble. Maybe you can share your use case with us.

-
-
Best wishes to you ! 
From ：Xiaoxiang Yu

At 2019-07-11 18:06:16, "ALERT MAIL"  wrote:
>
>
> hi,
>
>  Sorry to trouble you. I need to load tables from a MySQL datasource and a Hive 
> datasource in the same Kylin project at the same time, but if I configure 
> “kylin.source.default=8”, it can’t load Hive tables; if I configure 
> “kylin.source.default=0”, it can’t load the MySQL tables. Are there other 
> solutions to solve this problem?
>
>   thanks!
>

Re:Kylin interacting with AWS EMR

2019-07-09 Thread Xiaoxiang Yu

Dear friend,
   I am sorry to hear that you have run into such trouble. I have deployed Kylin 
on a CDH Hadoop cluster, but I know less about AWS EMR; still, I can share 
what I know.
   First question: how to deploy Kylin outside the Hadoop cluster? As far as I 
can see, you should deploy Kylin on a router/client node of the Hadoop 
cluster. A router node is a node that has the Hadoop binaries (such as 
Hive/HDFS) and conf files deployed, but runs no DataNode/NodeManager (so it has 
no heavy workload). The router/client node gives you full access to the Hive 
CLI/HBase CLI/HDFS CLI, which makes it suitable for deploying Kylin. 
   On the other hand, I think deploying Kylin outside the Hadoop cluster is not 
suitable, because Kylin needs to upload/download large amounts of data to/from 
the Hadoop cluster. Deploying Kylin outside the cluster makes the network a 
bottleneck, which hurts Kylin's performance.
   Another question: the entry "kylin.job.use-remote-cli=true" is meant for 
Kylin developers, not for Kylin users. If you are interested in it, please 
check http://kylin.apache.org/development/dev_env.html for details.
   Besides, I have invited you to a Slack 
channel (https://apache-kylin.slack.com). Some Kylin users have deployed Kylin 
successfully on EMR, so you can ask them further questions.

-
-
Best wishes to you ! 
From ：Xiaoxiang Yu

At 2019-07-09 00:34:01, "Fábio Teixeira"  wrote:
>Dear all,
>
>First of all, thank you very much for building and maintaining Apache Kylin, 
>it is a really awesome, the work you are doing. 
>
>I had to try it out, so I first configured Apache Kylin into an AWS EMR 
>cluster which worked pretty well and then I wanted to really go crazy and have 
>it outside the AWS EMR cluster.
>
>I’ve already setup a Kylin cluster using MySQL as metastore but I am 
>struggling on making it interacting with the EMR cluster.
>
>My issue:
>On the first build step of a cube, It is fetching data using sqoop and should 
>add it to the Hive table, but there it is timing out because it tries to 
>connect to 127.0.0.1:50010 which obviously is not the AWS EMR cluster. I was 
>trying to find where I could change the ip for the datanode without success.
>
>Considering my issue, I was checking the code and I saw that there is the 
>possibility of running the jobs using remote cli and I was wondering if this 
>should be the way to go on a Production environment.
>
>Would you be so kind and provide me some guidance on the following topics?:
>Setting up kylin.job.use-remote-cli=true is the configuration that one should 
>use when Apache Kylin is not inside the Hadoop cluster.
>If not then could you provide me any kind of guidance where I can find 
>documentation for doing that kind of configuration (Kylin and Hadoop 
>separated)?
>I was already investigating the 
>https://github.com/apache/kylin/tree/master/examples/test_case_data/sandbox 
><https://github.com/apache/kylin/tree/master/examples/test_case_data/sandbox> 
>Do you have more updated documentation for having Kylin outside the Hadoop 
>cluster?
>Is it recommended to use Kylin outside the Hadoop cluster on a production 
>environment?
>
>Thank you in advance.
>
>I look forward to hearing from you.
>
>Kind regards,
>Fábio Teixeira
>

Re: [DISCUSS] Support multiple pushdown query engines

2019-07-07 Thread Xiaoxiang Yu

Hi,
   Thanks for your enthusiasm for the Kylin community. I guess it will be a 
great feature in the next release.
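
To make the ordered pushdown idea from this thread concrete (for example 
presto -> SparkSQL -> Hive), here is a minimal sketch in Python. It is not 
Kylin's implementation or API: the `Engine` type, the `push_down` function, 
and the fake engines are all hypothetical names introduced for illustration. 
It only shows the fallback logic, where each engine is tried in order and the 
first one that answers wins.

```python
from typing import Callable, List, Sequence, Tuple

# Hypothetical engine type: a name plus a function that runs SQL and either
# returns rows or raises an exception. This is not a Kylin API.
Engine = Tuple[str, Callable[[str], List[tuple]]]


def push_down(sql: str, engines: Sequence[Engine]) -> Tuple[str, List[tuple]]:
    """Try each engine in order; return (engine_name, rows) of the first success."""
    errors = []
    for name, run in engines:
        try:
            return name, run(sql)
        except Exception as exc:  # any failure means "try the next engine"
            errors.append(f"{name}: {exc}")
    raise RuntimeError("all pushdown engines failed: " + "; ".join(errors))


# Fake engines for the demo; real ones would talk to Presto, SparkSQL, or Hive.
def failing_presto(sql: str) -> List[tuple]:
    raise ConnectionError("presto unavailable")


def fake_spark_sql(sql: str) -> List[tuple]:
    return [(1,)]


def fake_hive(sql: str) -> List[tuple]:
    return [(1,)]


chain = [
    ("presto", failing_presto),
    ("sparksql", fake_spark_sql),
    ("hive", fake_hive),
]
engine_used, rows = push_down("select 1", chain)
print(engine_used, rows)
```

A real implementation would probably also need to decide which failures are 
worth retrying on the next engine (a syntax error is likely to fail 
everywhere), but that is outside this sketch.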




-
-
Best wishes to you ! 
From ：Xiaoxiang Yu



At 2019-07-07 15:05:49, "codingfor...@126.com"  wrote:
>Thanks nichunen and Xiaoxiang Yu for replying. I will create a JIRA with the 
>label `new-feature` and implement it.
>
>> On Jul 7, 2019, at 14:49, Xiaoxiang Yu  wrote:
>> 
>> Hi,
>>  +1. 
>>  I agree with this proposal. Kylin should support multi-level pushdown: a 
>> query that cannot be matched by any cube should be pushed down to several 
>> engines in order, such as presto -> SparkSQL -> Hive, which is more reasonable 
>> and lets as many queries as possible be answered. It may be worth opening a JIRA.
>> 
>> 
>> 
>> 
>> -
>> -
>> Best wishes to you ! 
>> From ：Xiaoxiang Yu
>> 
>> 
>> 
>> At 2019-07-05 15:37:41, "nichunen" wrote:
>>> +1
>>> 
>>> 
>>> Sounds useful and not difficult to develop.
>>> 
>>> 
>>> 
>>> Best regards,
>>> 
>>> 
>>> 
>>> Ni Chunen / George
>>> 
>>> 
>>> 
>>> On 07/5/2019 15:20, codingfor...@126.com wrote:
>>> Hi, all:
>>> Currently (version 3.0.0-SNAPSHOT), Kylin supports only one kind of pushdown 
>>> query engine. In some users' scenarios, queries need to be pushed down to 
>>> MySQL, Spark SQL, Hive, etc.
>>> I think Kylin needs to support multiple pushdown engines. I would like to 
>>> discuss with you whether this is needed.
>>> Any suggestion is welcome. Thanks.
>

Re:regarding glue metastore

2019-07-16 Thread Xiaoxiang Yu

Hi krishna,
   I guess that you set EMR to use the AWS Glue catalog as the Hive metastore, 
and that the AWS lib com.amazonaws.glue is missing from Kylin's classpath. Maybe 
it is 
/usr/lib/hive/auxlib/aws-glue-datacatalog-hive2-client.jar (https://github.com/awslabs/aws-glue-data-catalog-client-for-apache-hive-metastore/blob/master/aws-glue-datacatalog-hive2-client/src/main/java/com/amazonaws/glue/catalog/metastore/AWSGlueDataCatalogHiveClientFactory.java)?
   You should find the lib in the EMR cluster and add it to your classpath 
(maybe $KYLIN_HOME/lib). 
   If you cannot find the right jar, you may package it manually; the repo 
should be this one: 
https://github.com/awslabs/aws-glue-data-catalog-client-for-apache-hive-metastore.
   You may also consider asking EMR customer service for help.







-
-
Best wishes to you ! 
From ：Xiaoxiang Yu

At 2019-07-16 10:40:10, "Krishna Bandaru"  wrote:

hi I created Kylin cluster with HA(3 masters and 2 cores)
java.lang.RuntimeException: java.io.IOException: MetaException(message:Unable 
to instantiate a metastore client factory 
com.amazonaws.glue.catalog.metastore.AWSGlueDataCatalogHiveClientFactory due 
to: java.lang.ClassNotFoundException: Class 
com.amazonaws.glue.catalog.metastore.AWSGlueDataCatalogHiveClientFactory not 
found)
at 
org.apache.kylin.source.hive.HiveMRInput$HiveTableInputFormat.configureJob(HiveMRInput.java:97)
at 
org.apache.kylin.engine.mr.steps.FactDistinctColumnsJob.setupMapper(FactDistinctColumnsJob.java:122)
at 
org.apache.kylin.engine.mr.steps.FactDistinctColumnsJob.run(FactDistinctColumnsJob.java:100)
at 
org.apache.kylin.engine.mr.common.MapReduceExecutable.doWork(MapReduceExecutable.java:131)
at 
org.apache.kylin.job.execution.AbstractExecutable.execute(AbstractExecutable.java:163)
at 
org.apache.kylin.job.execution.DefaultChainedExecutable.doWork(DefaultChainedExecutable.java:69)
at 
org.apache.kylin.job.execution.AbstractExecutable.execute(AbstractExecutable.java:163)
at 
org.apache.kylin.job.impl.threadpool.DistributedScheduler$JobRunner.run(DistributedScheduler.java:111)
at 
java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1149)
at 
java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:624)
at java.lang.Thread.run(Thread.java:748)
Caused by: java.io.IOException: MetaException(message:Unable to instantiate a 
metastore client factory 
com.amazonaws.glue.catalog.metastore.AWSGlueDataCatalogHiveClientFactory due 
to: java.lang.ClassNotFoundException: Class 
com.amazonaws.glue.catalog.metastore.AWSGlueDataCatalogHiveClientFactory not 
found)
at 
org.apache.hive.hcatalog.mapreduce.HCatInputFormat.setInput(HCatInputFormat.java:97)
at 
org.apache.hive.hcatalog.mapreduce.HCatInputFormat.setInput(HCatInputFormat.java:51)
at 
org.apache.kylin.source.hive.HiveMRInput$HiveTableInputFormat.configureJob(HiveMRInput.java:94)
... 10 more
Caused by: MetaException(message:Unable to instantiate a metastore client 
factory 
com.amazonaws.glue.catalog.metastore.AWSGlueDataCatalogHiveClientFactory due 
to: java.lang.ClassNotFoundException: Class 
com.amazonaws.glue.catalog.metastore.AWSGlueDataCatalogHiveClientFactory not 
found)
at 
org.apache.hadoop.hive.ql.metadata.HiveUtils.createMetaStoreClientFactory(HiveUtils.java:525)
at 
org.apache.hadoop.hive.ql.metadata.HiveUtils.createMetaStoreClient(HiveUtils.java:506)
at 
org.apache.hive.hcatalog.common.HiveClientCache.getNonCachedHiveMetastoreClient(HiveClientCache.java:99)
at 
org.apache.hive.hcatalog.common.HiveClientCache$5.call(HiveClientCache.java:318)
at 
org.apache.hive.hcatalog.common.HiveClientCache$5.call(HiveClientCache.java:315)
at 
com.google.common.cache.LocalCache$LocalManualCache$1.load(LocalCache.java:4791)
at 
com.google.common.cache.LocalCache$LoadingValueReference.loadFuture(LocalCache.java:3584)
at 
com.google.common.cache.LocalCache$Segment.loadSync(LocalCache.java:2372)
at 
com.google.common.cache.LocalCache$Segment.lockedGetOrLoad(LocalCache.java:2335)
at com.google.common.cache.LocalCache$Segment.get(LocalCache.java:2250)
at com.google.common.cache.LocalCache.get(LocalCache.java:3985)
at 
com.google.common.cache.LocalCache$LocalManualCache.get(LocalCache.java:4788)
at 
org.apache.hive.hcatalog.common.HiveClientCache.getOrCreate(HiveClientCache.java:315)
at 
org.apache.hive.hcatalog.common.HiveClientCache.get(HiveClientCache.java:277)
at 
org.apache.hive.hcatalog.common.HCatUtil.getHiveMetastoreClient(HCatUtil.java:558)
at 
org.apache.hive.hcatalog.mapreduce.InitializeInput.getInputJobInfo(InitializeInput.java:104)
at 
org.apache.hive.hcatalog.mapreduce.InitializeInput.setInput(InitializeInput.java:88)
at 
org.apache.hive.hcatalog.mapreduce.HCatInputForma

Re: regarding glue metastore

2019-07-16 Thread Xiaoxiang Yu

Hi krishna,
I guess that you set EMR to use the AWS Glue catalog as the Hive metastore, and 
that the AWS lib com.amazonaws.glue is missing from Kylin's classpath. You 
should find the lib in the EMR cluster and add it to your classpath (maybe 
$KYLIN_HOME/lib). 

If you cannot find the right jar, you may package it manually; the repo should 
be this one: 
https://github.com/awslabs/aws-glue-data-catalog-client-for-apache-hive-metastore.
 You may also consider asking EMR customer service for help.
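
To help with the "find the lib in the EMR cluster" step, here is a small 
sketch that scans a directory tree for jars containing the class named in the 
ClassNotFoundException. It relies only on the Python standard library; the 
function name `find_jars_with_class` is my own, and which directory to scan on 
your EMR nodes is an assumption you will need to check yourself.

```python
import zipfile
from pathlib import Path
from typing import List

# Class name taken from the ClassNotFoundException in the stack trace.
GLUE_FACTORY = (
    "com.amazonaws.glue.catalog.metastore.AWSGlueDataCatalogHiveClientFactory"
)


def find_jars_with_class(root: str, class_name: str) -> List[Path]:
    """Return the jars under `root` that contain the compiled class `class_name`."""
    entry = class_name.replace(".", "/") + ".class"
    hits = []
    for jar in sorted(Path(root).rglob("*.jar")):
        try:
            with zipfile.ZipFile(jar) as zf:
                if entry in zf.namelist():
                    hits.append(jar)
        except (zipfile.BadZipFile, OSError):
            continue  # skip unreadable or corrupt archives
    return hits
```

You would call it as find_jars_with_class("/some/dir/on/the/emr/node", 
GLUE_FACTORY) and copy any jar it reports to the place where Kylin picks up 
extra libraries (maybe $KYLIN_HOME/lib, as said above).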

==
Xiaoxiang Yu
Best wishes to you!



> On July 16, 2019, at 10:40, Krishna Bandaru wrote:
> 
> hi I created Kylin cluster with HA(3 masters and 2 cores)
> 
> java.lang.RuntimeException: java.io.IOException: MetaException(message:Unable 
> to instantiate a metastore client factory 
> com.amazonaws.glue.catalog.metastore.AWSGlueDataCatalogHiveClientFactory due 
> to: java.lang.ClassNotFoundException: Class 
> com.amazonaws.glue.catalog.metastore.AWSGlueDataCatalogHiveClientFactory not 
> found)
>   at 
> org.apache.kylin.source.hive.HiveMRInput$HiveTableInputFormat.configureJob(HiveMRInput.java:97)
>   at 
> org.apache.kylin.engine.mr.steps.FactDistinctColumnsJob.setupMapper(FactDistinctColumnsJob.java:122)
>   at 
> org.apache.kylin.engine.mr.steps.FactDistinctColumnsJob.run(FactDistinctColumnsJob.java:100)
>   at 
> org.apache.kylin.engine.mr.common.MapReduceExecutable.doWork(MapReduceExecutable.java:131)
>   at 
> org.apache.kylin.job.execution.AbstractExecutable.execute(AbstractExecutable.java:163)
>   at 
> org.apache.kylin.job.execution.DefaultChainedExecutable.doWork(DefaultChainedExecutable.java:69)
>   at 
> org.apache.kylin.job.execution.AbstractExecutable.execute(AbstractExecutable.java:163)
>   at 
> org.apache.kylin.job.impl.threadpool.DistributedScheduler$JobRunner.run(DistributedScheduler.java:111)
>   at 
> java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1149)
>   at 
> java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:624)
>   at java.lang.Thread.run(Thread.java:748)
> Caused by: java.io.IOException: MetaException(message:Unable to instantiate a 
> metastore client factory 
> com.amazonaws.glue.catalog.metastore.AWSGlueDataCatalogHiveClientFactory due 
> to: java.lang.ClassNotFoundException: Class 
> com.amazonaws.glue.catalog.metastore.AWSGlueDataCatalogHiveClientFactory not 
> found)
>   at 
> org.apache.hive.hcatalog.mapreduce.HCatInputFormat.setInput(HCatInputFormat.java:97)
>   at 
> org.apache.hive.hcatalog.mapreduce.HCatInputFormat.setInput(HCatInputFormat.java:51)
>   at 
> org.apache.kylin.source.hive.HiveMRInput$HiveTableInputFormat.configureJob(HiveMRInput.java:94)
>   ... 10 more
> Caused by: MetaException(message:Unable to instantiate a metastore client 
> factory 
> com.amazonaws.glue.catalog.metastore.AWSGlueDataCatalogHiveClientFactory due 
> to: java.lang.ClassNotFoundException: Class 
> com.amazonaws.glue.catalog.metastore.AWSGlueDataCatalogHiveClientFactory not 
> found)
>   at 
> org.apache.hadoop.hive.ql.metadata.HiveUtils.createMetaStoreClientFactory(HiveUtils.java:525)
>   at 
> org.apache.hadoop.hive.ql.metadata.HiveUtils.createMetaStoreClient(HiveUtils.java:506)
>   at 
> org.apache.hive.hcatalog.common.HiveClientCache.getNonCachedHiveMetastoreClient(HiveClientCache.java:99)
>   at 
> org.apache.hive.hcatalog.common.HiveClientCache$5.call(HiveClientCache.java:318)
>   at 
> org.apache.hive.hcatalog.common.HiveClientCache$5.call(HiveClientCache.java:315)
>   at 
> com.google.common.cache.LocalCache$LocalManualCache$1.load(LocalCache.java:4791)
>   at 
> com.google.common.cache.LocalCache$LoadingValueReference.loadFuture(LocalCache.java:3584)
>   at 
> com.google.common.cache.LocalCache$Segment.loadSync(LocalCache.java:2372)
>   at 
> com.google.common.cache.LocalCache$Segment.lockedGetOrLoad(LocalCache.java:2335)
>   at com.google.common.cache.LocalCache$Segment.get(LocalCache.java:2250)
>   at com.google.common.cache.LocalCache.get(LocalCache.java:3985)
>   at 
> com.google.common.cache.LocalCache$LocalManualCache.get(LocalCache.java:4788)
>   at 
> org.apache.hive.hcatalog.common.HiveClientCache.getOrCreate(HiveClientCache.java:315)
>   at 
> org.apache.hive.hcatalog.common.HiveClientCache.get(HiveClientCache.java:277)
>   at 
> org.apache.hive.hcatalog.common.HCatUtil.getHiveMetastoreClient(HCatUtil.java:558)
>   at 
> org.apache.hive.hcatalog.mapreduce.InitializeInput.getInputJobInfo(InitializeInput.java:104)
>   at 
> org.apache.hive.hcatalog.mapreduce.InitializeInput.setInput(InitializeInput.java:88)

Re: [VOTE] Release apache-kylin-2.6.3 (RC1)

2019-07-02 Thread Xiaoxiang Yu

+1
mvn test passed
-
-
Best wishes to you ! 
From ：Xiaoxiang Yu



At 2019-07-01 15:51:58, "Wang rupeng"  wrote:
>+1
>mvn test passed
>
>On 2019/7/1 2:16 PM, "Chao Long" wrote:
>
>+1
>mvn test passed
>
>On Mon, Jul 1, 2019 at 2:09 PM Cheng wang  wrote:
>
>> +1 (binding)
>>
>> Best regards,
>> Cheng Wang
>>
>>
>> On 7/1/19, 9:27 AM, "ShaoFeng Shi"  wrote:
>>
>> Hi all,
>>
>> I have created a build for Apache Kylin 2.6.3, release candidate 1.
>>
>> Changes highlights:
>> - [KYLIN-4024] - Support pushdown to Presto
>> - [KYLIN-3977] - Avoid mistaken deleting dicts by storage cleanup while
>> building jobs are running
>> - [KYLIN-4015] - Fix build cube error at the "Build UHC Dictionary"
>> step
>> - [KYLIN-4022] - Error with message "Unrecognized column type:
>> DECIMAL(xx,xx)" happens when doing query pushdown
>>
>> Thanks to everyone who has contributed to this release.
>> Here’s release notes:
>>
>> 
> https://issues.apache.org/jira/secure/ReleaseNote.jspa?projectId=12316121=12345582
>>
>> The commit to being voted upon:
>>
>>
>> 
> https://github.com/apache/kylin/commit/0d5f85b0a40c301134122de927204a0d17ad65fa
>>
>> Its hash is 0d5f85b0a40c301134122de927204a0d17ad65fa.
>>
>> The artifacts to be voted on are located here:
>> https://dist.apache.org/repos/dist/dev/kylin/apache-kylin-2.6.3-rc1/
>>
>> The hash of the artifact is as follows:
>> apache-kylin-2.6.3-source-release.zip.sha256
>> 50d1cad423f1a15a5e25f1c3c68748c7ce10e0116fd67fa9e38c1470a11d389c
>>
>> A staged Maven repository is available for review at:
>>
>> https://repository.apache.org/content/repositories/orgapachekylin-1063/
>>
>> Release artifacts are signed with the following key:
>> https://people.apache.org/keys/committer/shaofengshi.asc
>>
>> Please vote on releasing this package as Apache Kylin 2.6.3.
>>
>> The vote is open for the next 72 hours and passes if a majority of
>> at least three +1 PMC votes are cast.
>>
>> [ ] +1 Release this package as Apache Kylin 2.6.3
>> [ ]  0 I don't feel strongly about it, but I'm okay with the release
>> [ ] -1 Do not release this package because...
>>
>>
>> Here is my vote:
>>
>> +1 (binding)
>>
>> Best regards,
>>
>> Shaofeng Shi 史少锋
>> Apache Kylin PMC
>> Email: shaofeng...@apache.org
>>
>> Apache Kylin FAQ:
>> https://kylin.apache.org/docs/gettingstarted/faq.html
>> Join Kylin user mail group: user-subscr...@kylin.apache.org
>> Join Kylin dev mail group: dev-subscr...@kylin.apache.org
>>
>>
>>
>
>

Re: 取消订阅 (Unsubscribe)

2019-08-22 Thread Xiaoxiang Yu

Hi. 
  We are sad to hear that you are leaving. 
  If you have to unsubscribe, please send an email to 
dev-unsubscr...@kylin.apache.org. For more details, you may check 
http://www.apache.org/foundation/mailinglists.html and 
http://kylin.apache.org/community/ .



Best wishes,
Xiaoxiang Yu 
 

On 2019/8/22 15:28, "徐时永" wrote:

Please help me unsubscribe. Thank you.

Re: Failed to read big resource /dict/xxxx

2019-08-30 Thread Xiaoxiang Yu

e that marker. Since this is 
a broken metadata entry, deleting it will do no harm. After the deletion, 
subsequent rebuild jobs will succeed.

Here are some related reports from the mailing list:
1. 
http://apache-kylin.74782.x6.nabble.com/How-to-repair-the-cube-that-it-lost-someone-dictionary-td12989.html
2. 
http://mail-archives.apache.org/mod_mbox/kylin-user/201908.mbox/%3c4bcca64e.4af8.16cdb473a62.coremail.itzhangqi...@163.com%3e

I think we should fix this in the next release by deleting a broken metadata 
entry when one is found. I also want to thank the reporters of this issue for 
their patience and assistance.

If anyone finds any mistake, please let me know. Thank you.

----
Best wishes,
Xiaoxiang Yu


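From: Johnson 
Reply-To: "u...@kylin.apache.org" 
Date: Thursday, August 29, 2019 21:44
To: Wang rupeng 
Cc: "u...@kylin.apache.org" 
Subject: Re: Failed to read big resource /dict/

1. The Kylin version is 2.6.2. Looking through the mailing list, another user hit this problem before, also on version 2.6.2.
2. The dict named in the error does not exist. This cube has been running stably for several months, and the permissions of each component are fine.
3. I checked the Kylin log and found nothing pointing to the problem.
4. I then repeatedly tested building this cube while monitoring the Kylin metadata directory on HDFS, and found that the dict file for the failing dimension was never generated. The error message was the same each time, so I suspect a metadata problem.
5. I used the Kylin CLI tool to clean up storage (including part of the metadata) and built again; the error was the same.
6. For the Hive table used by the failing cube, I created a Hive view (under a different name), imported it into Kylin, and created an identical cube. Building this new cube did not fail. I suspect a problem with the metadata Kylin holds for the original Hive table.
[Image removed by sender.]
itzhangqiang
Email: itzhangqi...@163.com

Signature customized by NetEase Mail Master<https://mail.163.com/dashi/dlpro.html?from=mail88>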
On 08/29/2019 19:38, Wang rupeng<mailto:wangrup...@live.cn> wrote:
Hi,
When this error occurs, you can use "$KYLIN_HOME/bin/metastore.sh fetch 
/dict" to download the dictionaries to a local directory and check whether the 
dictionary file exists. You can also check your HDFS permissions. Otherwise, 
please give us more information about your situation, such as your use case.

---
Best wishes,
Rupeng Wang


From: Johnson <itzhangqi...@163.com>
Reply-To: "u...@kylin.apache.org" <u...@kylin.apache.org>
Date: Thursday, August 29, 2019 10:49
To: "u...@kylin.apache.org" <u...@kylin.apache.org>
Subject: Failed to read big resource /dict/

Hi everyone,
Today I found a failed job; it failed at #4 Step Name: Build Dimension Dictionary.
The error message is below. I then dropped the job and rebuilt, but it keeps reporting the same error. Does anyone know how to solve this?

org.apache.kylin.engine.mr.exception.HadoopShellException: 
java.lang.RuntimeException: java.io.IOException: Failed to read big resource 
/dict/KYLIN_VIEW./COUNTRY/66292068-e8eb-975a-3e44-b56c933c14cc.dict

  at 
org.apache.kylin.dict.DictionaryManager.getDictionaryInfo(DictionaryManager.java:108)

  at 
org.apache.kylin.dict.DictionaryManager.checkDupByContent(DictionaryManager.java:173)

  at 
org.apache.kylin.dict.DictionaryManager.trySaveNewDict(DictionaryManager.java:151)

  at 
org.apache.kylin.dict.DictionaryManager.saveDictionary(DictionaryManager.java:320)

  at 
org.apache.kylin.cube.CubeManager$DictionaryAssist.saveDictionary(CubeManager.java:1127)

  at org.apache.kylin.cube.CubeManager.saveDictionary(CubeManager.java:1089)

  at 
org.apache.kylin.cube.cli.DictionaryGeneratorCLI.processSegment(DictionaryGeneratorCLI.java:74)

  at 
org.apache.kylin.cube.cli.DictionaryGeneratorCLI.processSegment(DictionaryGeneratorCLI.java:55)

  at 
org.apache.kylin.engine.mr.steps.CreateDictionaryJob.run(CreateDictionaryJob.java:73)

  at org.apache.kylin.engine.mr.MRUtil.runMRJob(MRUtil.java:93)

Re: Unable to create cube in Spark Mode -Apache Kylin on Cloudera

2019-08-30 Thread Xiaoxiang Yu

Hi ,
As far as I can see, Kylin should be installed on a separate gateway node in 
the CDH cluster. On a gateway node you should not install any long-running 
process/component such as a DataNode/RegionServer/NodeManager. Instead, give it 
the gateway roles that let you access HDFS/HBase/Yarn/Hive etc.
The following screenshot shows a gateway node in my test env.
[cid:image001.png@01D55F66.D07C07F0]


Best wishes,
Xiaoxiang Yu


From: Wang rupeng 
Reply-To: "u...@kylin.apache.org" 
Date: Friday, August 30, 2019 15:31
To: Gourav Gupta , "u...@kylin.apache.org" 

Subject: Re: Unable to create cube in Spark Mode -Apache Kylin on Cloudera

Hi Gupta,
You can change the Kylin port with the following command; the new port is 
7070 plus the number you set:
./$KYLIN_HOME/bin/kylin-port-replace-util.sh set 
If the Kylin web UI cannot be opened, check the Kylin log at 
$KYLIN_HOME/logs/kylin.log for more details.
Here are some suggestions for your questions:
1. You need to set the environment variable SPARK_HOME=/local/path/to/spark 
so that Kylin can start successfully, even if you don't use Spark to build 
cubes. You had better use the suggested version of Spark (spark-2.3.2); you 
can download it with ./$KYLIN_HOME/bin/down-spark.sh .
2. The CDH versions Kylin supports are cdh5.7+, cdh6.0, and cdh6.1, and you 
don't have to care about the HBase version if you are using CDH. Since you are 
using cdh5.16, you can download apache-kylin--bin-cdh57.tar.gz from 
http://kylin.apache.org/download/
3. You don't have to install Kylin on the master node; any other node in the 
cluster would be OK.

---
Best wishes,
Rupeng Wang


From: Gourav Gupta 
Date: Friday, August 30, 2019 02:03
To: Wang rupeng 
Cc: "dev@kylin.apache.org" 
Subject: Re: Unable to create cube in Spark Mode -Apache Kylin on Cloudera

Thanks a lot Wang for the prompt, helpful reply. Today I removed 
the old version of Kylin and successfully installed Apache Kylin 2.6 for CDH, 
but now we are unable to open the Kylin web UI. I changed port number 7070 to 
another number in server.xml (Tomcat directory), but I still face the same 
issue.

I have some doubts about configuring Kylin, listed below:

1. Should I set the path of the Spark master node, or the path of the Spark 
that comes with Kylin?
2. Which tar file is suitable for Cloudera 5.16? What is the Kylin-HBase 
version needed for?
3. Do I need to install and configure Kylin on the master node? Will 
installation on an edge node work?

Actually, we are trying to switch the visualization layer from SQL(OLAP) - 
PowerBI pipeline to KYLIN-Mean Stack (Open Source/Enterprise version ). So your 
help is much appreciated on the same.

I am waiting for your positive response.


Regards,
Gourav Gupta

On Thu, Aug 29, 2019 at 5:43 PM Wang rupeng <wangrup...@live.cn> wrote:
Hi,
The problem seems to be the following:
"60505 [dispatcher-event-loop-6] ERROR  
org.apache.spark.scheduler.cluster.YarnScheduler  - Lost executor 1 on 
*: Container marked as failed:"
This usually happens when YARN does not have enough memory, so the YARN 
container is killed for lack of memory. You can open the YARN resource 
manager web page and check the YARN log for more details.
If it is a memory issue, you can try to allocate more memory to the Spark 
YARN executor by changing the following configuration item in 
"$KYLIN_HOME/conf/kylin.properties":
kylin.engine.spark-conf.spark.yarn.executor.memoryOverhead=384
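
If you prefer to script that change instead of editing kylin.properties by 
hand, a minimal sketch could look like the following. It is plain Python and 
knows nothing about Kylin itself; `set_property` is a helper name I made up, 
the sample lines are illustrative, and the value 1024 is only an example, not 
a recommendation.

```python
from typing import List


def set_property(lines: List[str], key: str, value: str) -> List[str]:
    """Return a copy of `lines` with `key=value` set.

    An existing non-commented entry for `key` is replaced (duplicates are
    dropped); if there is none, the entry is appended at the end.
    """
    out, replaced = [], False
    for line in lines:
        stripped = line.strip()
        is_entry = not stripped.startswith("#")
        if is_entry and stripped.split("=", 1)[0].strip() == key:
            if not replaced:
                out.append(f"{key}={value}")
                replaced = True
            continue  # skip the old entry and any duplicates
        out.append(line)
    if not replaced:
        out.append(f"{key}={value}")
    return out


KEY = "kylin.engine.spark-conf.spark.yarn.executor.memoryOverhead"

# Illustrative file content, not a real kylin.properties.
sample = [
    "# spark settings",
    f"{KEY}=384",
    "kylin.engine.spark-conf.spark.executor.memory=4G",
]
updated = set_property(sample, KEY, "1024")  # 1024 is an example value only
print("\n".join(updated))
```

To apply it to a real file you would read the lines, call set_property, and 
write them back, keeping a backup of the original file first.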


---
Best wishes,
Rupeng Wang


On 2019/8/29 14:57, "Gourav Gupta" <techgouravgu...@gmail.com> wrote:

Hi Sir,

I have installed and configured Apache Kylin 2.4 on Cloudera Platform for
creating the Cube.

I have been able to create a cube in MapReduce mode, but I get the
error below when it executes in Spark mode. I have followed
all the steps and tried many remedies for debugging the problem.



Please let me know how to resolve this bug. Thanks in Advance.





1091 [main] ERROR org.apache.spark.SparkContext  - Error adding jar
(java.lang.IllegalArgumentException: requirement failed: JAR
kylin-job-2.4.0.jar already registered.), was the --addJars option used?

[Stage 0:>  (0 + 0) / 2]
[Stage 0:>  (0 + 2) / 2]


60505 [dispatcher-event-loop-6] ERROR
org.apache.spark.scheduler.cluster.YarnScheduler  - Lost executor 1 on **
***: Container marked as failed:
container_e62_1566915974858_6628_01_03 on host: ***. Exit status:
50. Diagnostics: Exception from container-launch.
Container id: container_e62_1566915974858_6628_01_03
Exit code: 50
Stack trace: ExitCodeException exitCode=50:
at org.apache.hadoop.util.Shell.runCommand(Sh

Re: Unable to create cube in Spark Mode -Apache Kylin on Cloudera

2019-09-03 Thread Xiaoxiang Yu

Dear Gourav,
  Thank you for your update.


Best wishes,
Xiaoxiang Yu


From: Gourav Gupta 
Date: Wednesday, September 4, 2019 00:09
To: Xiaoxiang Yu , Wang rupeng 
Cc: "dev@kylin.apache.org" 
Subject: Re: Unable to create cube in Spark Mode -Apache Kylin on Cloudera

Dear Xiaoxiang,

Thanks for the helpful reply. Please be advised that I have resolved all the 
issues and am now able to create a cube in MapReduce mode. The last error, i.e. 
"FAILED: Execution Error, return code 3 from 
org.apache.hadoop.hive.ql.exec.mr.MapredLocalTask", was resolved when I 
configured "hive.auto.convert.join = false" in kylin-hive-site.xml.

Thanks for the support; I appreciate the quick response from you and the Kylin 
team. I will ask for your help again in the future if I face any other issue 
when building a cube in Spark mode.

Best Regards,
Gourav Gupta

On Sun, Sep 1, 2019 at 10:54 AM Xiaoxiang Yu 
<xiaoxiang...@kyligence.io> wrote:
Hi friend,
  I am glad to hear that you have resolved some problems after a lot of 
effort, and it is very kind of you to share what you found about 
kylin-port-replace-util.sh with us.
  It seems that you have hit another problem in the first step of your cube 
build, which uses Hive to create a flat table. As far as I can see, the message 
you provided, "FAILED: Execution Error, return code 3 from 
org.apache.hadoop.hive.ql.exec.mr.MapredLocalTask", indicates that your Hive is 
NOT configured correctly: your Hive command runs in local mode rather than 
YARN mode. That is strange. Is the node on which you chose to deploy Kylin 
configured correctly? Maybe you should ask your Hadoop administrator for 
help. Or could you please provide more detail about how you deployed Kylin?
  If you are using Kylin for the first time and you are familiar with Docker, 
you can run a Docker container to get a technical preview. Please refer to 
http://kylin.apache.org/docs/install/kylin_docker.html.

--------
Best wishes,
Xiaoxiang Yu


From: Gourav Gupta <techgouravgu...@gmail.com>
Date: Sunday, September 1, 2019 01:24
To: Wang rupeng <wangrup...@live.cn>, Xiaoxiang Yu 
<xiaoxiang...@kyligence.io>, "dev@kylin.apache.org" <dev@kylin.apache.org>
Subject: Re: Unable to create cube in Spark Mode -Apache Kylin on Cloudera

Dear Wang and Xiaoxiang,
Thanks for providing suggestions and solutions for all the questions that 
I raised in my previous mail. Truly appreciated!

Following your answers, I changed the port number with 
"./$KYLIN_HOME/bin/kylin-port-replace-util.sh set", but I was still 
facing the same issue. After hours of investigation, I was able to 
resolve it (not being able to access the Kylin UI). A Java application was 
already running on port 9009, and Kylin uses three ports: 7070, 9009 and 
7443. I was able to access the Kylin web UI once I stopped the process that 
was already running on 9009.
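
A quick way to spot this kind of port conflict before starting Kylin is to 
probe each port locally. The Python sketch below is only an illustration: the 
port list 7070, 9009 and 7443 comes from the message above, while the helper 
name port_is_free and the choice of probing 127.0.0.1 are assumptions.

```python
import socket

# Ports named in this thread as used by Kylin.
KYLIN_PORTS = (7070, 9009, 7443)


def port_is_free(port, host="127.0.0.1"):
    """Return True if nothing accepts connections on host:port (hypothetical helper)."""
    with socket.socket(socket.AF_INET, socket.SOCK_STREAM) as probe:
        probe.settimeout(1.0)
        # connect_ex returns 0 when a listener accepted the connection.
        return probe.connect_ex((host, port)) != 0


if __name__ == "__main__":
    for port in KYLIN_PORTS:
        state = "free" if port_is_free(port) else "already in use"
        print(f"port {port}: {state}")
```

A port reported as "already in use" would have to be released, or Kylin moved 
to other ports, before the web UI can come up on it.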

I am now facing another error, "FAILED: Execution Error, return 
code 3 from org.apache.hadoop.hive.ql.exec.mr.MapredLocalTask", when I try 
to create a cube in MapReduce mode. I searched for it and changed the 
Kylin and Hive properties as per the solution at this 
link (https://stackoverflow.com/questions/22977790/hive-query-execution-error-return-code-3-from-mapredlocaltask),
but I am still not able to resolve it.

Please let me know whether there is any way of resolving this issue. I am 
attaching a screenshot of the error.

Thanks in advance.

Best Regards,
Gourav Gupta


Re: Unable to create cube in Spark Mode -Apache Kylin on Cloudera

2019-08-31 Thread Xiaoxiang Yu

Hi friend,
  I am glad to hear that you have resolved some problems after a lot of 
effort, and it is very kind of you to share what you found about 
kylin-port-replace-util.sh with us.
  It seems that you have hit another problem in the first step of your cube 
build, which uses Hive to create a flat table. As far as I can see, the message 
you provided, "FAILED: Execution Error, return code 3 from 
org.apache.hadoop.hive.ql.exec.mr.MapredLocalTask", indicates that your Hive is 
NOT configured correctly: your Hive command runs in local mode rather than 
YARN mode. That is strange. Is the node on which you chose to deploy Kylin 
configured correctly? Maybe you should ask your Hadoop administrator for 
help. Or could you please provide more detail about how you deployed Kylin?
  If you are using Kylin for the first time and you are familiar with Docker, 
you can run a Docker container to get a technical preview. Please refer to 
http://kylin.apache.org/docs/install/kylin_docker.html.


Best wishes,
Xiaoxiang Yu


From: Gourav Gupta 
Date: Sunday, September 1, 2019 01:24
To: Wang rupeng , Xiaoxiang Yu , "dev@kylin.apache.org" 
Subject: Re: Unable to create cube in Spark Mode -Apache Kylin on Cloudera

Dear Wang and Xiaoxiang,
Thanks for providing suggestions and solutions for all the questions that 
I raised in my previous mail. Truly appreciated!

Following your answers, I changed the port number with 
"./$KYLIN_HOME/bin/kylin-port-replace-util.sh set", but I was still 
facing the same issue. After hours of investigation, I was able to 
resolve it (not being able to access the Kylin UI). A Java application was 
already running on port 9009, and Kylin uses three ports: 7070, 9009 and 
7443. I was able to access the Kylin web UI once I stopped the process that 
was already running on 9009.

I am now facing another error, "FAILED: Execution Error, return 
code 3 from org.apache.hadoop.hive.ql.exec.mr.MapredLocalTask", when I try 
to create a cube in MapReduce mode. I searched for it and changed the 
Kylin and Hive properties as per the solution at this 
link (https://stackoverflow.com/questions/22977790/hive-query-execution-error-return-code-3-from-mapredlocaltask),
but I am still not able to resolve it.

Please let me know whether there is any way of resolving this issue. I am 
attaching a screenshot of the error.

Thanks in advance.

Best Regards,
Gourav Gupta

On Fri, Aug 30, 2019 at 1:00 PM Wang rupeng 
<wangrup...@live.cn> wrote:
Hi Gupta,
You can change the Kylin port with the following command; the new port is 
7070 plus the number you set:
./$KYLIN_HOME/bin/kylin-port-replace-util.sh set 
If the Kylin web UI cannot be opened, you can check the Kylin log at 
$KYLIN_HOME/logs/kylin.log for more details.
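
The port rule stated above (new port = 7070 plus the number you set) can be 
checked with a few lines of Python. This is only a sketch of the arithmetic, 
not of what the script does internally; the helper name 
kylin_port_after_offset and the range check are assumptions added for 
illustration.

```python
KYLIN_DEFAULT_PORT = 7070  # base port named in the message above


def kylin_port_after_offset(offset, base=KYLIN_DEFAULT_PORT):
    """Return base + offset, rejecting results outside the valid TCP port range."""
    port = base + offset
    if not 1 <= port <= 65535:
        raise ValueError(f"offset {offset} gives invalid port {port}")
    return port


for offset in (0, 1, 10):
    print(offset, "->", kylin_port_after_offset(offset))
```

Computing the expected port first makes it easier to confirm that it does not 
collide with another service on the same host.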
Here are some suggestions regarding your questions:
1. You need to set the environment variable SPARK_HOME=/local/path/to/spark so 
that Kylin can start successfully, even if you don't use Spark to build 
cubes. You had better use the suggested version of Spark (spark-2.3.2); you can 
download it with ./$KYLIN_HOME/bin/download-spark.sh .
2. The CDH versions supported by Kylin are cdh5.7+, cdh6.0 and cdh6.1, and you 
don't have to care about the HBase version if you are using CDH. If you are 
using cdh5.16, you can download apache-kylin--bin-cdh57.tar.gz from 
http://kylin.apache.org/download/
3. You don't have to install kylin on master node, an

Re: How to migrate model/cube metadat across cluster

2019-09-18 Thread Xiaoxiang Yu

Hi Lionel,
  Sorry for my misunderstanding; you are right. I think that in your 
situation, using metastore.sh backup is the better way. 


Best wishes,
Xiaoxiang Yu 
 

On 2019/9/19 10:42, "lionel@oocl.com" wrote:

Hi

"Note that the different Kylin environments should share the same Hadoop 
cluster, including HDFS, HBase and HIVE."
The doc says that the environments should share the same Hadoop cluster, 
including HDFS, HBase and HIVE, right?

Thanks & Regards
Lionel

-----Original Message-----
From: Xiaoxiang Yu 
Sent: Thursday, September 19, 2019 10:32 AM
To: LIONEL TAO (OPS-IRIS-ISD-OOCLL/ZHA) ; 
dev@kylin.apache.org
Subject: Re: How to migrate model/cube metadat across cluster

Hi,
  As the doc says, it can migrate a cube between different clusters.
  If you have a Kylin instance in a QA cluster (qa_node:7070) and you want 
to migrate to a PROD cluster (prod_node:7070), you can use qa_node:7070 as 
srcKylinConfigUri and prod_node:7070 as dstKylinConfigUri, and thus migrate 
across clusters.


Best wishes,
Xiaoxiang Yu


On 2019/9/19 10:21, "lionel@oocl.com" wrote:

Hi Xiaoxiang,

CubeMigrationCLI.java can migrate a cube from a Kylin environment to 
another, for example, promote a well tested cube from the testing env to 
production env. Note that the different Kylin environments should share the 
same Hadoop cluster, including HDFS, HBase and HIVE.

Per the document, does that mean a cube cannot be migrated across clusters?
We are looking for a solution to do this, since our QA and PROD are in 
different clusters.
Can you give more advice?


Thanks & Regards
Lionel



-----Original Message-----
From: Xiaoxiang Yu 
Sent: Thursday, September 19, 2019 10:07 AM
To: dev@kylin.apache.org; LIONEL TAO (OPS-IRIS-ISD-OOCLL/ZHA) 
Subject: Re: How to migrate model/cube metadat across cluster

Hi Lionel,
  It is good practice to do a full test in a test env and then migrate to 
another env. I think you can use CubeMigrationCLI to meet your needs.
  Please check http://kylin.apache.org/docs/howto/howto_use_cli.html for more 
detail.


Best wishes,
    Xiaoxiang Yu


On 2019/9/19 01:12, "lionel@oocl.com" wrote:

Hi Kylin dev,

We are now trying Kylin for our OLAP system, and have some questions that 
we need to clarify with you:


  1.  We see that the CLI cannot be used to migrate data from the QA 
environment to PROD if they are in different clusters. What is the limitation 
when migrating across clusters, and how can we do the migration?
  2.  We have tried a metastore.sh backup followed by a restore in 
another cluster. Will this be a solution for migration? Is there any impact if 
we use this solution?





Thanks & Regards
Lionel

Disclaimer : This email and all contents are subject to the 
following disclaimer: http://emaildisclaimer.oocl.com/default.html



Re: [jira] [Created] (KYLIN-4125) Kylin upgraded from springmvc architecture to spring boot architecture

2019-08-06 Thread Xiaoxiang Yu

Hi zjt,
  Glad to hear your suggestion.


Best wishes,
Xiaoxiang Yu 
 

On 2019/8/6 14:36, "zjt" wrote:

Hi Team:


Next, I will do the following to upgrade Kylin from the springmvc architecture 
to the spring boot architecture:
1. Modify the project's pom.xml to import the spring boot dependency.
2. Upgrade kylin-server from spring mvc to spring boot 1.5.5, and modify the 
kylin-server code to support spring boot.
3. Modify the build script to package Kylin into a war package and 
run it in external Tomcat mode, the same as the current deployment.


I have implemented part of the features and conducted a simple test. I 
expect to submit my code to the "KYLIN-4125" branch this week.












-------- Forwarding messages --------
From: "zhao jintao (JIRA)" 
Date: 2019-08-06 11:53:00
To:  dev@kylin.apache.org
Subject: [jira] [Created] (KYLIN-4125) Kylin upgraded from springmvc 
architecture to spring boot architecture
zhao jintao created KYLIN-4125:
--

 Summary: Kylin upgraded from springmvc architecture to spring 
boot architecture
 Key: KYLIN-4125
 URL: https://issues.apache.org/jira/browse/KYLIN-4125
 Project: Kylin
  Issue Type: Improvement
  Components: REST Service
Reporter: zhao jintao
Assignee: zhao jintao


Hi Team:

Kylin is based on the spring mvc architecture, but spring mvc configuration 
is rather complicated, and it is cumbersome to integrate new components.

Nowadays the mainstream of the industry is based on the spring boot 
architecture. Spring boot can be configured automatically, which reduces the 
complexity of project integration and promotes the expansion and 
implementation of microservice architectures. More and more project 
architectures have been upgraded from springmvc to spring boot.

Kylin can also be upgraded from the springmvc architecture to the spring 
boot architecture.


Do you have any suggestions?



--
This message was sent by Atlassian JIRA
(v7.6.14#76016)

Re:Using kylin in simplified way

2019-07-25 Thread Xiaoxiang Yu

Hi Asim,
I have heard that a lot of Kylin users have such a wish (removing the Hadoop 
components). Currently the Kylin community has already implemented RDBMS as a 
metadata store (the default is HBase); the next plan is to add Parquet/Druid 
as the storage layer (to replace HBase) and to use Spark instead of MapReduce. 
Maybe a newer version of Kylin deployed in the cloud will depend only on 
MySQL (metadata), Spark/Parquet (as compute engine and storage layer) and S3 
as distributed storage.
Currently there are the branches 
https://github.com/apache/kylin/tree/kylin-on-parquet and 
https://github.com/apache/kylin/tree/kylin-on-druid, but these features have 
some limitations and may not be mature enough. We may need to wait for further 
development work.
-
-
Best wishes to you ! 
From ：Xiaoxiang Yu



At 2019-07-25 11:29:22, "Asim Ali"  wrote:
>Hi All,
>I tried using Kylin on a Hadoop environment, but the overhead of Hadoop is too
>much for our medium-scale needs.
>Is there any way we can use the Kylin OLAP engine with minimal requirements on
>the underlying storage layer?
>What are the best practices and architecture to support this, so that we
>can possibly use it without Hadoop components?
>Thanks
>
>Asim Ali
>*Software Developer*
>
>Email: a...@easyemployer.com 
>
>Phone: 1300 855 642 <1300855642>
>Website: www.easyemployer.com
>
>
>
>On Fri, 21 Jun 2019 at 15:22, 敏丞  wrote:
>
>> Hi,
>> After checking the cube.json you provided, I can reproduce this error in
>> my development env. It looks like this kind of exception occurs when you have a
>> cube which *has both a Raw measure and a CountDistinct(Bitmap) measure on the
>> same column*. I think the reason is that the Raw measure chooses the wrong
>> dictionary (AppendTrieDictionary cannot be used to decode). Maybe you should
>> try using two cubes in this situation.
>> And if you don't mind, I have a question: have you ever used this type
>> of query, "select * from FACT_TABLE", in an old version of Kylin on such a
>> cube (Raw measure and count_distinct both on the same column) and got a
>> correct result?
>>
>>
>> If you find anything wrong or have other information, please let me
>> know. Thank you.
>>
>>
>>
>>
>> *-*
>> *-*
>> *Best wishes to you ! *
>> *From ：**Xiaoxiang Yu*
>>
>>
>> At 2019-06-20 16:00:10, "greatelvisw...@gmail.com" 
>>  wrote:
>> >hi,all:
>> >
>> > I got an error like "AppendTrieDictionary can't retrieve value from id" 
>> > while querying the cube data. The cube info and exception info 
>> > follow.
>> >
>> >I found the same error in this 
>> >thread (https://lists.apache.org/thread.html/63981bc08ef7b97c41921ed692de79ef9a744f6329192e5199074ba3@%3Cdev.kylin.apache.org%3E),
>> > but I only use the bitmap (count distinct) as a measure, and never use it 
>> >as a dimension.
>> >
>> >So please help me to resolve it.
>> >
>> >cube data:
>> >{
>> >  "uuid": "eb0b4a32-fbc0-b197-b3f0-4c9cd5fb3916",
>> >  "last_modified": 1561014544528,
>> >  "version": "2.6.2.0",
>> >  "name": "dev_cube_user_currency",
>> >  "is_draft": false,
>> >  "model_name": "user_currency",
>> >  "description": "",
>> >  "null_string": null,
>> >  "dimensions": [
>> >    {
>> >      "name": "TYPE",
>> >      "table": "DEV_DWD_USER_CURRENCY",
>> >      "column": "TYPE",
>> >      "derived": null
>> >    },
>> >    {
>> >      "name": "SUB_TYPE",
>> >      "table": "DEV_DWD_USER_CURRENCY",
>> >      "column": "SUB_TYPE",
>> >      "derived": null
>> >    },
>> >    {
>> >      "name": "SOURCE_TYPE",
>> >      "table": "DEV_DWD_USER_CURRENCY",
>> >      "column": "SOURCE_TYPE",
>> >      "derived": null
>> >    },
>> >    {
>> >      "name": "SOURCE",
>> >      "table": "DEV_DWD_USER_CURRENCY",
>> >      "column": "SOURCE",
>> >      "derived": null
>> >    },
>> >    {
>&g

Re: How to migrate model/cube metadat across cluster

2019-09-18 Thread Xiaoxiang Yu

Hi Lionel,
  It is good practice to do a full test in a test env and then migrate to 
another env. I think you can use CubeMigrationCLI to meet your needs. 
  Please check http://kylin.apache.org/docs/howto/howto_use_cli.html for more 
detail.


Best wishes,
Xiaoxiang Yu 
 

On 2019/9/19 01:12, "lionel@oocl.com" wrote:

Hi Kylin dev,

We are now trying Kylin for our OLAP system, and have some questions that 
we need to clarify with you:


  1.  We see that the CLI cannot be used to migrate data from the QA 
environment to PROD if they are in different clusters. What is the limitation 
when migrating across clusters, and how can we do the migration?
  2.  We have tried a metastore.sh backup followed by a restore in another 
cluster. Will this be a solution for migration? Is there any impact if we use 
this solution?





Thanks & Regards
Lionel

Disclaimer : This email and all contents are subject to the following 
disclaimer: http://emaildisclaimer.oocl.com/default.html

Re: [VOTE] Release apache-kylin-3.0.0-beta (RC1)

2019-09-26 Thread Xiaoxiang Yu

+1
mvn test passed


Best wishes,
Xiaoxiang Yu 
 

On 2019/9/27 10:03, "Wang rupeng" wrote:

+1
---
Best wishes,
Rupeng Wang
 

On 2019/9/27 09:45, "Chao Long" wrote:

+1
mvn test passed

On Thu, Sep 26, 2019 at 8:44 PM Yaqian Zhang  
wrote:

> +1
> mvn test passed
>
> > On September 26, 2019, at 20:10, nichunen wrote:
> >
> > +1
> >
> >
> >
> > Best regards,
> >
> >
> >
> > Ni Chunen / George
> >
> >
> >
> > On 09/26/2019 16:41, ShaoFeng Shi wrote:
> > Hi all,
> >
> > I have created a build for Apache Kylin 3.0.0-beta, release candidate 1.
> >
> > Change highlights:
> > [KYLIN-4122] - Add Kylin user and group management modules
> > [KYLIN-4167] - Refactor streaming coordinator
> > [KYLIN-4114] - Provided a self-contained docker image for Kylin
> > [KYLIN-4137] - Accelerate metadata reloading
> >
> > Thanks to everyone who has contributed to this release.
> > Here are the release notes:
> >
> > https://issues.apache.org/jira/secure/ReleaseNote.jspa?projectId=12316121&version=12345686
> >
> > The commit to be voted upon:
> >
> > https://github.com/apache/kylin/commit/721be80866223fecad9a6231fa2427a847bc8f48
> >
> > Its hash is 721be80866223fecad9a6231fa2427a847bc8f48.
> >
> > The artifacts to be voted on, including the source package and two
> > pre-compiled binary packages, are located here:
> >
> > https://dist.apache.org/repos/dist/dev/kylin/apache-kylin-3.0.0-beta-rc1/
> >
> > The hash of the artifacts are as follows:
> > apache-kylin-3.0.0-beta-source-release.zip.sha256
> > 53547e8a94eb74cdcd329777ff03f1c79209020016c2f9a62351e8c73ac8e0bd
> > apache-kylin-3.0.0-beta-bin-hbase1x.tar.gz.sha256
> > 1d50348660899baa9005b78cf45243e0eb2495fa0403d6250b3439ff50bf1731
> > apache-kylin-3.0.0-beta-bin-cdh57.tar.gz.sha256
> > bc9e303154901d4061dbac3876157cb4be25f23307f4c709d083da70aa18524b
> > apache-kylin-3.0.0-beta-bin-hadoop3.tar.gz.sha256
> > 681452450248f56ebe107d278e3ccb1478e42137875a2dded953db8c03488f9a
> > apache-kylin-3.0.0-beta-bin-cdh60.tar.gz.sha256
> > 2f66497ed39d7d78ea5a634a8796ab408586dce369edc97ed9374ba90a88b03d
> >
> > A staged Maven repository is available for review at:
> > https://repository.apache.org/content/repositories/orgapachekylin-1066/
> >
> > Release artifacts are signed with the following key:
> > https://people.apache.org/keys/committer/shaofengshi.asc
> >
> > Please vote on releasing this package as Apache Kylin 3.0.0-beta.
> >
> > The vote is open for the next 72 hours and passes if a majority of
> > at least three +1 PMC votes are cast.
> >
> > [ ] +1 Release this package as Apache Kylin 3.0.0-beta
> > [ ]  0 I don't feel strongly about it, but I'm okay with the release
> > [ ] -1 Do not release this package because...
> >
> > Best regards,
> >
> > Shaofeng Shi 史少锋
> > Apache Kylin PMC
> > Email: shaofeng...@apache.org
> >
> > Apache Kylin FAQ: https://kylin.apache.org/docs/gettingstarted/faq.html
> > Join Kylin user mail group: user-subscr...@kylin.apache.org
> > Join Kylin dev mail group: dev-subscr...@kylin.apache.org
>
>
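
For anyone checking the release candidate, the published SHA-256 values above 
can be compared against a downloaded artifact with a few lines of Python. This 
is a minimal sketch using only the standard library; the helper names 
sha256_of_file and matches_published_hash are assumptions, and the local file 
path is whatever you saved the artifact as.

```python
import hashlib


def sha256_of_file(path, chunk_size=1 << 20):
    """Compute the hex SHA-256 digest of a file, reading it in chunks."""
    digest = hashlib.sha256()
    with open(path, "rb") as handle:
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def matches_published_hash(path, published_hex):
    """Return True if the file's digest equals the published hex string."""
    return sha256_of_file(path) == published_hex.strip().lower()
```

For example, matches_published_hash would be called with the local copy of 
apache-kylin-3.0.0-beta-source-release.zip and the value 
53547e8a94eb74cdcd329777ff03f1c79209020016c2f9a62351e8c73ac8e0bd listed in the 
vote mail. Checking the signature against the release key is a separate step 
that this sketch does not cover.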

Re: Install Apache Kylin in custom environment

2019-09-26 Thread Xiaoxiang Yu

Dear sir,
   Hadoop has a long history and is complex, so we recommend that you use a 
well-tested Hadoop distribution such as CDH or HDP rather than a custom Hadoop 
environment.
   If you are doing a PoC and want to learn Kylin quickly, please use the 
Docker image https://hub.docker.com/r/apachekylin/apache-kylin-standalone. If 
you want to use Kylin in a more formal Hadoop environment, could you please 
use a CDH 5.x or HDP 2.x Hadoop distribution?


Best wishes,
Xiaoxiang Yu 
 

On 2019/9/27 11:24, "Ngọc Thiên Nguyễn" wrote:

Dear Sir,

I'm trying to install Apache Kylin in a custom environment.

But I ran into a problem:

https://stackoverflow.com/questions/58126981/install-apache-kylin-in-custom-environment
Can you help me?

Looking forward to hearing from you.

Thanks & Regards

Re: Build cube by JDBC

2019-06-14 Thread Xiaoxiang Yu

Hi, 
  Sorry for my late reply. I have reproduced that error, using Oracle 11g as 
the data source. 
  First, I met the error "java.sql.SQLSyntaxErrorException: ORA-00911: invalid 
character". This can be fixed by adding a configuration to kylin.properties, 
namely "kylin.source.hive.quote-enabled=false".
  After restarting the Kylin process, I met another error, 
"java.sql.SQLSyntaxErrorException: ORA-00933: SQL command not properly ended", 
and I found that this exception is caused by the "AS" in the FROM clause. I 
will create a JIRA and fix it later; you may wait for the next release.

 Following is my configuration:
 kylin.source.default=8
 kylin.source.jdbc.connection-url=jdbc:oracle:thin:@hdp30-qa:49161/XE
 kylin.source.jdbc.driver=oracle.jdbc.driver.OracleDriver
 kylin.source.jdbc.dialect=oracle 
 kylin.source.jdbc.user=system
 kylin.source.jdbc.pass=oracle
 kylin.source.jdbc.sqoop-home=/opt/cloudera/parcels/CDH/lib/sqoop
 kylin.source.jdbc.filed-delimiter=|
 kylin.source.hive.quote-enabled=false

 If you have any suggestion or find any mistake, please let me know, thank you 
very much.
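
The ORA-00911 error is consistent with the backquoted identifiers in the 
generated SQL, since Oracle does not accept the backquote as an identifier 
quote. The Python sketch below is not Kylin's implementation; 
strip_backquotes is a hypothetical helper that only illustrates what the 
statement looks like without that quoting, using a shortened form of the query 
from the log.

```python
def strip_backquotes(sql):
    """Remove backquotes around identifiers (illustration only).

    This naive replace would also alter backquotes inside string literals.
    """
    return sql.replace("`", "")


quoted = (
    "SELECT `WFPROCESSINST`.`PROCESSINSTID` as `WFPROCESSINST_PROCESSINSTID` "
    "FROM `SIE_EMS`.`WFPROCESSINST` as `WFPROCESSINST`"
)
print(strip_backquotes(quoted))
```

The printed statement still has "as" before the table alias in the FROM 
clause, which matches the second error (ORA-00933) described above.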

----
Best wishes,
Xiaoxiang Yu 
 

On 2019/6/10 17:03, "高铭潮" wrote:

Hi, all

When I build the cube through Oracle JDBC, an error occurs, with this 
error message:

java.io.IOException: OS command error exit with return code: 1, error 
message: Warning: 
/Users/gmc/Technology/sqoop/sqoop-1.4.7.bin__hadoop-2.6.0/../hcatalog does not 
exist! HCatalog jobs will fail.
Please set $HCAT_HOME to the root of your HCatalog installation.
Warning: 
/Users/gmc/Technology/sqoop/sqoop-1.4.7.bin__hadoop-2.6.0/../accumulo does not 
exist! Accumulo imports will fail.
Please set $ACCUMULO_HOME to the root of your Accumulo installation.
Warning: 
/Users/gmc/Technology/sqoop/sqoop-1.4.7.bin__hadoop-2.6.0/../zookeeper does not 
exist! Accumulo imports will fail.
Please set $ZOOKEEPER_HOME to the root of your Zookeeper installation.
SLF4J: Class path contains multiple SLF4J bindings.
SLF4J: Found binding in 
[jar:file:/Users/gmc/Technology/hadoop/hadoop-3.1.2/share/hadoop/common/lib/slf4j-log4j12-1.7.25.jar!/org/slf4j/impl/StaticLoggerBinder.class]
SLF4J: Found binding in 
[jar:file:/Users/gmc/Technology/hbase/hbase-2.0.5/lib/slf4j-log4j12-1.7.25.jar!/org/slf4j/impl/StaticLoggerBinder.class]
SLF4J: See http://www.slf4j.org/codes.html#multiple_bindings for an 
explanation.
SLF4J: Actual binding is of type [org.slf4j.impl.Log4jLoggerFactory]
2019-06-10 15:57:06,965 INFO sqoop.Sqoop: Running Sqoop version: 1.4.7
2019-06-10 15:57:06,996 WARN tool.BaseSqoopTool: Setting your password on 
the command-line is insecure. Consider using -P instead.
2019-06-10 15:57:07,088 WARN sqoop.ConnFactory: Parameter --driver is set 
to an explicit driver however appropriate connection manager is not being set 
(via --connection-manager). Sqoop is going to fall back to 
org.apache.sqoop.manager.GenericJdbcManager. Please specify explicitly which 
connection manager should be used next time.
2019-06-10 15:57:07,106 INFO manager.SqlManager: Using default fetchSize of 
1000
2019-06-10 15:57:07,106 INFO tool.CodeGenTool: Beginning code generation
2019-06-10 15:57:07,587 INFO manager.SqlManager: Executing SQL statement: 
SELECT `WFPROCESSINST`.`PROCESSINSTID` as `WFPROCESSINST_PROCESSINSTID` 
,`WFPROCESSINST`.`PROCESSINSTNAME` as `WFPROCESSINST_PROCESSINSTNAME` 
,`WFPROCESSINST`.`CREATOR` as `WFPROCESSINST_CREATOR` ,`WFPROCESSINST`.`OWNER` 
as `WFPROCESSINST_OWNER` ,`WFPROCESSINST`.`RELATEDATA` as 
`WFPROCESSINST_RELATEDATA` ,`WFPROCESSINST`.`STARTTIME` as 
`WFPROCESSINST_STARTTIME` ,`WFPROCESSINST`.`ENDTIME` as `WFPROCESSINST_ENDTIME` 
,`WFPROCESSINST`.`FINALTIME` as `WFPROCESSINST_FINALTIME` 
,`WFPROCESSINST`.`REMINDTIME` as `WFPROCESSINST_REMINDTIME` 
,`WFPROCESSINST`.`CURRENTSTATE` as `WFPROCESSINST_CURRENTSTATE` 
,`WFPROCESSINST`.`PARENTACTID` as `WFPROCESSINST_PARENTACTID`  FROM 
`SIE_EMS`.`WFPROCESSINST` as `WFPROCESSINST` INNER JOIN 
`SIE_EMS`.`SMBP_PROCESSINSTBIZRELA` as `SMBP_PROCESSINSTBIZRELA` ON 
`WFPROCESSINST`.`PROCESSINSTID` = `SMBP_PROCESSINSTBIZRELA`.`PROCESSINSTID` 
INNER JOIN `SIE_EMS`.`WFWORKITEM` as `WFWORKITEM` ON 
`WFPROCESSINST`.`PROCESSINSTID` = `WFWORKITEM`.`PROCESSINSTID` WHERE 1=1 AND 
(`WFPROCESSINST`.CREATETIME >= '2016-01-01 00:00:00' AND 
`WFPROCESSINST`.CREATETIME < '2019-06-10 00:00:00')  AND  (1 = 0) 
2019-06-10 15:57:07,623 ERROR manager.SqlManager: Error executing 
statement: java.sql.SQLSyntaxErrorException: ORA-00911: invalid character

java.sql.SQLSyntaxErrorException: ORA-00911: invalid character

at oracle.jdbc.driver.T4CTTIoer.processError(T4CTTIoer.java:447)
at oracle.jdbc.driver.T4CTTIoer.processError(T4CTTIoer.java:396)
at oracle.jdbc.driver.T4C8Oall.processError(T4C8Oall.java:951)
at oracle.jdbc.driver.T4CTTIfun.receive(T4CTTIfun.java:513)
at oracle.

Re:[DISCUSS] Support multiple pushdown query engines

2019-07-07 Thread Xiaoxiang Yu

Hi,
  +1. 
  I agree with this proposal. Kylin should support multi-level pushdown: a 
query that cannot be matched by any cube should be pushed down to several 
engines in order, such as Presto -> SparkSQL -> Hive, which is more reasonable 
and lets as many queries as possible be answered. Maybe it is worth opening a 
JIRA.
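
The ordered fallback described above could look roughly like the Python sketch 
below. It is only an illustration of the idea, not a proposal for Kylin's 
actual code: run_with_pushdown, PushdownError and the three stub engines are 
hypothetical names, and each stub stands in for a real engine client.

```python
class PushdownError(Exception):
    """Raised when no engine in the chain can answer the query."""


def run_with_pushdown(sql, engines):
    """Try each (name, runner) pair in order and return the first success."""
    failures = []
    for name, runner in engines:
        try:
            return name, runner(sql)
        except Exception as exc:  # an engine rejecting the query is expected here
            failures.append(f"{name}: {exc}")
    raise PushdownError("; ".join(failures) or "no pushdown engine configured")


def presto_stub(sql):
    # Simulates an engine that cannot answer this query.
    raise RuntimeError("unsupported function")


def sparksql_stub(sql):
    return [("rows from SparkSQL",)]


def hive_stub(sql):
    return [("rows from Hive",)]


engine, rows = run_with_pushdown(
    "select 1",
    [("presto", presto_stub), ("sparksql", sparksql_stub), ("hive", hive_stub)],
)
print(engine, rows)
```

Keeping the chain as an ordered list means the engine order could come from 
configuration, with the fastest engine first and the most permissive one last.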




-
-
Best wishes to you ! 
From ：Xiaoxiang Yu



At 2019-07-05 15:37:41, "nichunen" wrote:
>+1
>
>
>Sounds useful and not difficult to develop.
>
>
>
>Best regards,
>
> 
>
>Ni Chunen / George
>
>
>
>On 07/5/2019 15:20, codingfor...@126.com wrote:
>Hi, all:
>Currently (version 3.0.0-SNAPSHOT), Kylin supports only one kind of pushdown 
>query engine. In some users' scenarios, queries need to be pushed down to 
>MySQL, Spark SQL, Hive, etc.
>I think Kylin needs to support multiple pushdown engines. I want to discuss 
>with you whether this is needed.
>Any suggestion is welcome. Thanks.

Re:When building a cube, because the backquote is reported incorrectly

2019-07-07 Thread Xiaoxiang Yu

Hi friend,
   I think it is a good question. When it comes to RDBMS sources, making the 
right decision is not so easy; I think it depends on your requirements and 
your understanding of these features.
   Firstly, source type = 8 is the older one. It uses Sqoop to ingest data 
from the RDBMS into Hive, which makes the "create flat table" step possible. 
It has been tested with MySQL and SQL Server. After the cube is built 
successfully, your cube can answer all measure types provided by Kylin, and it 
is quicker than querying the underlying RDBMS source directly.
   Secondly, source type = 16 (which we call the DataSource SDK) is newer. Not 
only Sqoop but also Apache Calcite has been introduced into Kylin to make it 
even stronger. In addition to ingesting data from the RDBMS, the DataSource 
SDK can also rewrite your query so that more queries can be answered by Kylin. 
In some scenarios, when your query cannot be matched with your cube, Kylin 
will try pushdown to let the real source answer the query. But pushdown may 
fail because of differences between RDBMS dialects. The DataSource SDK can 
make a difference by rewriting your query based on a mapping file (an XML 
file) provided by you. It also provides some SPIs (service provider 
interfaces) that let you implement your own rewrite logic, which is more 
powerful but also more difficult. You should check the source code carefully 
to know how it should be used. 
   So, as far as I can see, source type = 8 is OK for most scenarios; it works 
out of the box and should work well. On the other hand, if you want pushdown 
queries to run more smoothly, you should try source type = 16. 
   If you have more questions, or if I have made any mistake, please let me 
know. Thank you very much.





-
-
Best wishes to you ! 
From ：Xiaoxiang Yu

At 2019-07-04 20:11:10, "紫电_恶魔"  wrote:


The following content was translated from Chinese using translation software, 
so errors are inevitable; please bear with them.

Kylin.properties





For kylin.source.default, the official documentation on using a JDBC source 
has the 8 and 16 configurations, and I do not know the difference.

 

http://kylin.apache.org/cn/development/datasource_sdk.html

Based on this document, create a new configuration file, postgresql.xml, as 
follows




The SQL executed when building cube is as follows

 

Please tell me what other configuration needs to be done. Thank you

Re:Detailed Structure Reference

2019-07-07 Thread Xiaoxiang Yu

Hi,
   I am not sure what exactly you mean by "Kylin detailed structure". Could you 
please provide a more specific description? By the way, if you want to learn about the 
implementation of Kylin, you may visit Kylin's documentation and read some books. I 
think this book should suit most Chinese readers: 
https://book.douban.com/subject/26975003/.

-
-
Best wishes to you ! 
From ：Xiaoxiang Yu

At 2019-07-05 09:38:01, "shicheng31...@gmail.com"  
wrote:
>Hi all:
>Is there any reference on the detailed structure of Kylin? I have some questions 
> about it, but I could not find any answers in the official introduction. 
> Thanks. 
>
>
>
>shicheng31...@gmail.com

Re: using mySql as metadata storage

2019-11-13 Thread Xiaoxiang Yu

Dear Sir,
I want to share my opinion, and I will be glad if it helps you make a 
better decision.
First, the RDBMS metadata store was contributed by an experienced dev team that 
provides professional solutions based on Apache Kylin; they verified its 
stability in their customer's prod env before contributing it. Since then, I 
have learned of some users who deployed Kylin on AWS EMR and chose MySQL as the 
metadata store; their Kylin has been running smoothly for several months.
I think you could try deploying a smaller Kylin cluster that uses 
MySQL as the metadata store and monitor its stability. If you face any issue in 
the future, please let us know your problem. Thank you.

Best wishes,
Xiaoxiang Yu 
 

On 2019/11/14 13:57, “听风看雨” wrote:

Hey,
May I ask whether support for using MySQL as the metadata store is stable now? I've 
been looking through the official Kylin website for some time, but found no more 
hints in the release notes after version 2.5.0. Are there any details I have missed?
Best wishes!

Re: Kylin 2.6.4 error when building cubes with hadoop 3.1.3

2019-11-17 Thread Xiaoxiang Yu

Hi,
   Are you using Kylin on CDH 6.3? I have heard of one user who deployed Kylin on CDH 6.3 
and met the same problem.


Best wishes,
Xiaoxiang Yu


From: "zx张笑(深圳)" 
Reply-To: "dev@kylin.apache.org" 
Date: Saturday, November 16, 2019 01:52
To: "dev@kylin.apache.org" 
Subject: Kylin 2.6.4 error when building cubes with hadoop 3.1.3

Hi, developers,

   I’m a user of Kylin.

   I’m facing the following errors at the third step of building a cube 
(Kylin_Fact_Distinct_Columns):

   [cid:image001.png@01D59BCF.A2D5BDF0]

   The versions in my environment are as follows:
[cid:image002.png@01D59BCF.A2D5BDF0]

   I think it may be a problem caused by Guava versions. Hadoop 3.1.3 uses 
guava-27.0, but Kylin calls functions from Guava versions earlier than guava-16.

   So, how can I fix this problem?

   Thank you!

Best Regards,
Sunshine

Re: New committer: Temple Zhou

2019-11-17 Thread Xiaoxiang Yu

Temple Zhou, congratulations!

Best wishes,
Xiaoxiang Yu 

On 2019/11/17 22:57, “Temple Zhou” wrote:

Sorry for late reply, thank you everyone.

Kylin community is a very open and friendly community. I'm very honored to
become a Kylin committer.

I will make Kylin more reliable and excellent in every way I can. The more
people join us, the better the community will be.

On Sun, Nov 17, 2019, 09:15 Yaqian Zhang  wrote:

> Congratulations!
>
> > On Nov 16, 2019, at 14:27, codingfor...@126.com wrote:
> >
> > Congratulations!
> >
> >
> >> On Nov 16, 2019, at 13:57, nichunen wrote:
> >>
> >> Congratulations!
> >
>
>

Re: Kylin to PostgreSQL Error in Cube build Step 1

2019-11-21 Thread Xiaoxiang Yu

Hi Andrey,
First, thank you for testing our build. I have some questions:

  1.  When you set kylin.source.default=16, you said you got “Oops… Failed to 
take action.” Did you see which exception Kylin threw? Could you please 
show us the error message in kylin.log? Our patch works when 
kylin.source.default=16, so the error message Kylin throws when you set it to 
8 is not what we care about in this issue/PR. The important thing is what 
happened when you saw “Oops… Failed to take action.”
  2.  If you could provide more detail about your related config, maybe I can 
find something useful.
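
For reference, the PostgreSQL error in the log quoted below ("syntax error at or 
near "."", Position: 19) can be located with a few lines of Python. This is only an 
illustrative sketch: the function names are hypothetical, and the backtick 
replacement is a plain text substitution, not what Kylin or the patch does.

```python
import re

# The start of the statement exactly as it appears in the Sqoop log quoted below.
stmt = "SELECT `FILM_PLAY`.`AUDIENCE_ID` as `FILM_PLAY_AUDIENCE_ID`"


def first_dot_position(sql):
    """Return the 1-based index of the first '.', as PostgreSQL counts positions."""
    return sql.index(".") + 1


def backticks_to_double_quotes(sql):
    """Illustrative only: replace `name` with "name" (standard SQL quoting)."""
    return re.sub(r"`([^`]*)`", r'"\1"', sql)


print(first_dot_position(stmt))          # 19, the position PostgreSQL reported
print(backticks_to_double_quotes(stmt))
```

The position matches the dot right after the first backtick-quoted name, which is 
consistent with PostgreSQL not accepting backticks as identifier quotes. Note that 
double-quoted identifiers are case-sensitive in PostgreSQL, so a blind substitution 
like this may still fail if the real table names are lower case.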

Best wishes,
Xiaoxiang Yu


From: Andrey Molotov 
Date: Thursday, November 21, 2019 16:19
To: Xiaoxiang Yu 
Cc: "dev@kylin.apache.org" 
Subject: Re: Kylin to PostgreSQL Error in Cube build Step 1

Hello, Sir.
I’ve installed the Kylin binary you provided. I’ve also prepared the data tables 
that you used to test your build https://github.com/apache/kylin/pull/902 .
If I set the property kylin.source.default=16 and click on Load Table Metadata 
From Tree, I get an error: “Oops… Failed to take action.”
So I was forced to use kylin.source.default=8. I prepared the model and cube just 
like you did, but still got the error on the first step.
My env:

• PostgreSQL 9.5.20

• cdh 5.16.2

• Kylin build from master branch
Here is log:
java.io.IOException: OS command error exit with return code: 1, error message: 
Warning: /home/hadoop/sqoop/../hcatalog does not exist! HCatalog jobs will fail.
Please set $HCAT_HOME to the root of your HCatalog installation.
Warning: /home/hadoop/sqoop/../accumulo does not exist! Accumulo imports will 
fail.
Please set $ACCUMULO_HOME to the root of your Accumulo installation.
Warning: /home/hadoop/sqoop/../zookeeper does not exist! Accumulo imports will 
fail.
Please set $ZOOKEEPER_HOME to the root of your Zookeeper installation.
WARNING: HADOOP_PREFIX has been replaced by HADOOP_HOME. Using value of 
HADOOP_PREFIX.
SLF4J: Class path contains multiple SLF4J bindings.
SLF4J: Found binding in 
[jar:file:/home/hadoop/hadoop/share/hadoop/common/lib/slf4j-log4j12-1.7.25.jar!/org/slf4j/impl/StaticLoggerBinder.class]
SLF4J: Found binding in 
[jar:file:/home/hadoop/hbase/lib/slf4j-log4j12-1.7.25.jar!/org/slf4j/impl/StaticLoggerBinder.class]
SLF4J: See http://www.slf4j.org/codes.html#multiple_bindings for an explanation.
SLF4J: Actual binding is of type [org.slf4j.impl.Log4jLoggerFactory]
2019-11-21 10:51:09,835 INFO sqoop.Sqoop: Running Sqoop version: 1.4.7
2019-11-21 10:51:09,872 WARN tool.BaseSqoopTool: Setting your password on the 
command-line is insecure. Consider using -P instead.
2019-11-21 10:51:09,982 WARN sqoop.ConnFactory: Parameter --driver is set to an 
explicit driver however appropriate connection manager is not being set (via 
--connection-manager). Sqoop is going to fall back to 
org.apache.sqoop.manager.GenericJdbcManager. Please specify explicitly which 
connection manager should be used next time.
2019-11-21 10:51:09,997 INFO manager.SqlManager: Using default fetchSize of 1000
2019-11-21 10:51:09,997 INFO tool.CodeGenTool: Beginning code generation
2019-11-21 10:51:10,443 INFO manager.SqlManager: Executing SQL statement: 
SELECT `FILM_PLAY`.`AUDIENCE_ID` as `FILM_PLAY_AUDIENCE_ID` 
,`FILM_PLAY`.`FILM_ID` as `FILM_PLAY_FILM_ID` ,`FILM_PLAY`.`WATCH_TIME` 
,`FILM_PLAY`.`PAYMENT` as `FILM_PLAY_PAYMENT`  FROM `SC1`.`FILM_PLAY` as 
`FILM_PLAY` INNER JOIN `PUBLIC`.`FILM` as `FILM` ON `FILM_PLAY`.`FILM_ID` = 
`FILM`.`FILM_ID` INNER JOIN `SC2`.`AUDIENCE` as `AUDIENCE` ON 
`FILM_PLAY`.`AUDIENCE_ID` = `AUDIENCE`.`AUDIENCE_ID` WHERE 1=1 AND 
(`FILM_PLAY`.`WATCH_TIME` >= '2017-01-01 00:00:00' AND `FILM_PLAY`.`WATCH_TIME` 
< '2017-12-01 00:00:00')  AND  (1 = 0)
2019-11-21 10:51:10,454 ERROR manager.SqlManager: Error executing statement: 
org.postgresql.util.PSQLException: ERROR: syntax error at or near "."
  Position: 19
org.postgresql.util.PSQLException: ERROR: syntax error at or near "."
  Position: 19
 at 
org.postgresql.core.v3.QueryExecutorImpl.receiveErrorResponse(QueryExecutorImpl.java:2284)
 at 
org.postgresql.core.v3.QueryExecutorImpl.processResults(QueryExecutorImpl.java:2003)
 at 
org.postgresql.core.v3.QueryExecutorImpl.execute(QueryExecutorImpl.java:200)
 at org.postgresql.jdbc.PgStatement.execute(PgStatement.java:424)
 at 
org.postgresql.jdbc.PgPreparedStatement.executeWithFlags(PgPreparedStatement.java:161)
 at 
org.postgresql.jdbc.PgPreparedStatement.executeQuery(PgPreparedStatement.java:114)
 at org.apache.sqoop.manager.SqlManager.execute(SqlManager.java:777)
 at org.apache.sqoop.manager.SqlManager.execute(SqlManager.java:786)
 at 
org.apache.sqoop.manager.SqlManager.getColumnInfoForRawQuery(SqlManager.java:289)
 at 
org.apache.sqoop.manager.SqlManager.getColumnTypesForRawQuery(SqlManager.java:260)
 at 
org.apache.sqoop.manag

Re: Kylin to PostgreSQL Error in Cube build Step 1

2019-11-21 Thread Xiaoxiang Yu

Yes, there is indeed an NPE thrown by JdbcExplorer, but it is not the first 
exception related to JdbcExplorer, and that earlier exception should be 
the root cause. You may search for "jdbc.extensible" or "JdbcSource" in kylin.log 
to find more information. My guess is that your JDBC information, such as 
user/password/url, is not configured correctly.



Could you please find it and share it with us? It is also OK to send me the 
whole log file, and I will check it.

----
Best wishes,
Xiaoxiang Yu


From: Andrey Molotov 
Date: Thursday, November 21, 2019 18:47
To: Xiaoxiang Yu 
Cc: "dev@kylin.apache.org" 
Subject: Re: Kylin to PostgreSQL Error in Cube build Step 1

Dear Sir,
Thank you for your reply.

1.   Here is kylin.log from the moment when “Oops… Failed to take action.” 
was thrown:

2019-11-21 13:04:32,365 INFO  [FetcherRunner 308979117-47] 
threadpool.DefaultFetcherRunner:94 : Job Fetcher: 0 should running, 0 actual 
running, 0 stopped, 0 ready, 0 already succeed, 6 error, 2 discarded, 0 others
2019-11-21 13:05:02,365 INFO  [FetcherRunner 308979117-47] 
threadpool.DefaultFetcherRunner:94 : Job Fetcher: 0 should running, 0 actual 
running, 0 stopped, 0 ready, 0 already succeed, 6 error, 2 discarded, 0 others
2019-11-21 13:05:08,277 INFO  [BadQueryDetector] service.BadQueryDetector:147 : 
Detect bad query.
2019-11-21 13:05:32,365 INFO  [FetcherRunner 308979117-47] 
threadpool.DefaultFetcherRunner:94 : Job Fetcher: 0 should running, 0 actual 
running, 0 stopped, 0 ready, 0 already succeed, 6 error, 2 discarded, 0 others
2019-11-21 13:06:02,365 INFO  [FetcherRunner 308979117-47] 
threadpool.DefaultFetcherRunner:94 : Job Fetcher: 0 should running, 0 actual 
running, 0 stopped, 0 ready, 0 already succeed, 6 error, 2 discarded, 0 others
2019-11-21 13:06:08,278 INFO  [BadQueryDetector] service.BadQueryDetector:147 : 
Detect bad query.
2019-11-21 13:06:32,365 INFO  [FetcherRunner 308979117-47] 
threadpool.DefaultFetcherRunner:94 : Job Fetcher: 0 should running, 0 actual 
running, 0 stopped, 0 ready, 0 already succeed, 6 error, 2 discarded, 0 others
2019-11-21 13:07:02,365 INFO  [FetcherRunner 308979117-47] 
threadpool.DefaultFetcherRunner:94 : Job Fetcher: 0 should running, 0 actual 
running, 0 stopped, 0 ready, 0 already succeed, 6 error, 2 discarded, 0 others
2019-11-21 13:07:08,278 INFO  [BadQueryDetector] service.BadQueryDetector:147 : 
Detect bad query.
2019-11-21 13:07:32,365 INFO  [FetcherRunner 308979117-47] 
threadpool.DefaultFetcherRunner:94 : Job Fetcher: 0 should running, 0 actual 
running, 0 stopped, 0 ready, 0 already succeed, 6 error, 2 discarded, 0 others
2019-11-21 13:07:54,609 DEBUG [http-nio-7070-exec-3] 
security.KylinAuthenticationProvider:114 : User ADMIN authorities : 
[ROLE_ADMIN, ROLE_ANALYST, ROLE_MODELER]
2019-11-21 13:07:54,609 DEBUG [http-nio-7070-exec-3] 
security.KylinAuthenticationProvider:57 : User cache [-108, 112, -63, -32, 41, 
-87, -81, 81, -32, 61, -35, -111, 7, 56, -29, -59] is removed due to EXPIRED
2019-11-21 13:07:54,609 DEBUG [http-nio-7070-exec-3] 
security.KylinAuthenticationProvider:128 : Authenticated user 
org.springframework.security.authentication.UsernamePasswordAuthenticationToken@3704d9a0:
 Principal: org.springframework.security.core.userdetails.User@3b40b2f: 
Username: ADMIN; Password: [PROTECTED]; Enabled: true; AccountNonExpired: true; 
credentialsNonExpired: true; AccountNonLocked: true; Granted Authorities: 
ROLE_ADMIN,ROLE_ANALYST,ROLE_MODELER; Credentials: [PROTECTED]; Authenticated: 
true; Details: 
org.springframework.security.web.authentication.WebAuthenticationDetails@e21a:
 RemoteIpAddress: 172.0.0.66; SessionId: null; Granted Authorities: ROLE_ADMIN, 
ROLE_ANALYST, ROLE_MODELER
2019-11-21 13:07:54,610 DEBUG [http-nio-7070-exec-3] 
controller.UserController:52 : User login: 
org.springframework.security.core.userdetails.User@3b40b2f: Username: ADMIN; 
Password: [PROTECTED]; Enabled: true; AccountNonExpired: true; 
credentialsNonExpired: true; AccountNonLocked: true; Granted Authorities: 
ROLE_ADMIN,ROLE_ANALYST,ROLE_MODELER
2019-11-21 13:08:02,365 INFO  [FetcherRunner 308979117-47] 
threadpool.DefaultFetcherRunner:94 : Job Fetcher: 0 should running, 0 actual 
running, 0 stopped, 0 ready, 0 already succeed, 6 error, 2 discarded, 0 others
2019-11-21 13:08:08,279 INFO  [BadQueryDetector] service.BadQueryDetector:147 : 
Detect bad query.
2019-11-21 13:08:32,365 INFO  [FetcherRunner 308979117-47] 
threadpool.DefaultFetcherRunner:94 : Job Fetcher: 0 should running, 0 actual 
running, 0 stopped, 0 ready, 0 already succeed, 6 error, 2 discarded, 0 others
2019-11-21 13:09:02,365 INFO  [FetcherRunner 308979117-47] 
threadpool.DefaultFetcherRunner:94 : Job Fetcher: 0 should running, 0 actual 
running, 0 stopped, 0 ready, 0 already succeed, 6 error, 2 discarded, 0 others
2019-11-21 13:09:08,279 INFO  [BadQueryDetector] service.BadQueryDetector:147 : 
Detect bad query.
2019-11-21 13:09:32,365 INF

Re: Error on EMR

2019-12-03 Thread Xiaoxiang Yu

Hi, 
   I have successfully deployed the latest version of Kylin (3.0.beta) on AWS EMR 
5.27 and built a few cubes; maybe you can give it a try? 
   The cluster was created with a CLI command like the one below, and I deployed Kylin on the 
MASTER node:

aws emr create-cluster \
--applications Name=Hadoop Name=Hive Name=Pig Name=Spark Name=Sqoop Name=Tez Name=Zeppelin Name=ZooKeeper Name=Ganglia \
--release-label emr-5.27.0 \
--instance-groups '[{"InstanceCount":4,"EbsConfiguration":{"EbsBlockDeviceConfigs":[{"VolumeSpecification":{"SizeInGB":200,"VolumeType":"gp2"},"VolumesPerInstance":1}]},"InstanceGroupType":"CORE","InstanceType":"m4.2xlarge","Name":"Worker Cluster"},{"InstanceCount":1,"EbsConfiguration":{"EbsBlockDeviceConfigs":[{"VolumeSpecification":{"SizeInGB":100,"VolumeType":"gp2"},"VolumesPerInstance":1}]},"InstanceGroupType":"MASTER","InstanceType":"c4.4xlarge","Name":"MasterQuery"}]' \
--configurations '[{"Classification":"hdfs-site","Properties":{"dfs.replication":"2"}}]' \
--ebs-root-volume-size 100 \
--enable-debugging \
--name 'BenchmarkCluster' \
--scale-down-behavior TERMINATE_AT_TASK_COMPLETION \
--region cn-northwest-1


Best wishes,
Xiaoxiang Yu 
 

On 2019/12/2 20:38, “Tanmay Movva” wrote:

Hello,

We have installed kylin on our EMR master along with hbase, hadoop and
hive. Using download-spark.sh from KYLIN_HOME/bin I have installed spark.
As mentioned in "Install KYLIN on AWS EMR" guide we have followed the steps
to configure Kylin working dir and hbase storage as S3 and also made the
necessary zkquorum changes.

When we run the sample.sh or check-env.sh we don't get any errors. But when
we run the cube build job from UI, the job fails at stage-2 "Redistribute
Flat Hive Tables". As the job "Create Intermediate Hive tables" has been
completed successfully I don't think there has been any error with Hive.

Can anyone help us with this? Thank You.


java.lang.NoClassDefFoundError: org/apache/hadoop/hive/conf/HiveConf
at 
org.apache.kylin.source.hive.CLIHiveClient.<init>(CLIHiveClient.java:47)
at 
org.apache.kylin.source.hive.HiveClientFactory.getHiveClient(HiveClientFactory.java:27)
at 
org.apache.kylin.source.hive.RedistributeFlatHiveTableStep.computeRowCount(RedistributeFlatHiveTableStep.java:40)
at 
org.apache.kylin.source.hive.RedistributeFlatHiveTableStep.doWork(RedistributeFlatHiveTableStep.java:91)
at 
org.apache.kylin.job.execution.AbstractExecutable.execute(AbstractExecutable.java:167)
at 
org.apache.kylin.job.execution.DefaultChainedExecutable.doWork(DefaultChainedExecutable.java:71)
at 
org.apache.kylin.job.execution.AbstractExecutable.execute(AbstractExecutable.java:167)
at 
org.apache.kylin.job.impl.threadpool.DefaultScheduler$JobRunner.run(DefaultScheduler.java:114)
at 
java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1149)
at 
java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:624)
at java.lang.Thread.run(Thread.java:748)
Caused by: java.lang.ClassNotFoundException:
org.apache.hadoop.hive.conf.HiveConf
at 
org.apache.catalina.loader.WebappClassLoaderBase.loadClass(WebappClassLoaderBase.java:1928)
at 
org.apache.catalina.loader.WebappClassLoaderBase.loadClass(WebappClassLoaderBase.java:1771)
... 11 more

-- 
Regards,
Tanmay Krishna Movva
Razorpay

Re: Kylin v2.6.4 support Spark 2.4 or above

2019-12-11 Thread Xiaoxiang Yu

Hi friend,
   In my view, the latest version of Kylin should support Spark 2.4, but it does not 
support Spark SQL as a data source. I have seen some patches that try to 
achieve that goal, though. This is one of them: 
https://github.com/apache/kylin/pull/927; you may have a look.


Best wishes,
Xiaoxiang Yu 
 

On 2019/12/9 21:38, “Madhusudhan Maankar” wrote:

Hi,

I would appreciate it if you could let me know the details on the points below.

1. Does Apache Kylin v2.6.4 support Spark 2.4 and above?

2. Can Kylin work without Hive, with Spark only?

Thanks,
Madhusudhan Maankar.

--
Sent from: http://apache-kylin.74782.x6.nabble.com/

Re: Releasing Apache Kylin v3.0-GA

2019-12-05 Thread Xiaoxiang Yu

Good news, I cannot wait for the next generation of Kylin.


Best wishes,
Xiaoxiang Yu


From: George Ni 
Reply-To: "u...@kylin.apache.org" 
Date: Friday, December 6, 2019 10:25
To: "u...@kylin.apache.org" , "dev@kylin.apache.org" 

Subject: Releasing Apache Kylin v3.0-GA

Hi Community,

As we have released v3.0-alpha, v3.0-alpha2, and v3.0-beta, we have enough 
confidence to
release the GA version of v3.0 next week, and I’m planning to create a branch 
for its release.

Detailed features, improvements and bug fixes will come later; the main features 
are:
1. Realtime OLAP
2. Job scheduler with Apache Curator
3. User and user group management

Please feel free to leave your comments here.


-
Best regards,

Ni Chunen / George

Re: Kylin 2.6.4 streaming build cannot be used with HDP 3.1.4.0 (issue 3970)

2019-10-23 Thread Xiaoxiang Yu

Hi Sir,
   I cannot see your screenshot. Could you please paste the stack trace (from 
kylin.log) or provide more description?
  And have you tried the workaround provided in KYLIN-3970 to 
fix your problem?

Best wishes,
Xiaoxiang Yu


From: 田家铮 
Reply-To: "dev@kylin.apache.org" 
Date: Thursday, October 24, 2019 11:44
To: dev 
Subject: Kylin 2.6.4 streaming build cannot be used with HDP 3.1.4.0 (issue 3970)

Environment:
 kylin-2.6.4
 HDP-3.1.4.0

issue:
 3970
 
https://issues.apache.org/jira/browse/KYLIN-3970?jql=text%20~%20%22partition.assignment.strategy%22<https://issues.apache.org/jira/browse/KYLIN-3970?jql=text%20%7E%20%22partition.assignment.strategy%22>


Problem:
The streaming build fails when it reads Kafka data, but building from Hive works fine.

[cid:E5BB65D6@3C8BF016.500DB15D]
[cid:1F6BECD4@A04E8373.500DB15D]

Re: Kylin 2.6.4 streaming build cannot be used with HDP 3.1.4.0 (issue 3970)

2019-10-24 Thread Xiaoxiang Yu

Hi Sir,
   If I am not mistaken, the logs you provided make it clear that the 
workaround/steps I provided in my earlier mail have FIXED the previous error. 
   The previous error occurred at Step 1 (collect data from Kafka), and it is 
a class conflict problem, because we can see text like "Error: 
org.apache.kafka.clients.consumer.ConsumerConfig.configNames()Ljava/util/Set;". 
   The later error occurred at Step 4 (Fact Distinct Columns), because we 
can see text like 
"org.apache.kylin.engine.mr.steps.FactDistinctColumnsMapper". I think it is a 
totally NEW problem (unrelated to KYLIN-3970, because this step does not use the Kafka 
client), and it may be caused by the MapReduce framework. You should try again with 
less data, or modify the MapReduce configuration (see 
http://kylin.apache.org/docs/install/configuration.html#mr-config-override).
If I have made any mistake or you have more findings, please let me know. Thank 
you.
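
The reasoning above (which marker string points to which step) can be written as a 
small Python sketch. It is a hypothetical triage helper, not a Kylin tool: the 
marker strings are copied from the two logs, and the step labels follow the 
wording of this mail.

```python
# Hypothetical triage helper. Marker strings come from the two logs discussed
# above; the labels repeat the explanation given in this mail.
MARKERS = [
    (
        "org.apache.kafka.clients.consumer.ConsumerConfig.configNames",
        "Step 1 (collect data from Kafka): class conflict, see KYLIN-3970",
    ),
    (
        "org.apache.kylin.engine.mr.steps.FactDistinctColumnsMapper",
        "Step 4 (Fact Distinct Columns): MapReduce task failure",
    ),
]


def classify(log_text):
    """Return the label of the first marker found in the log, or 'unknown'."""
    for marker, label in MARKERS:
        if marker in log_text:
            return label
    return "unknown"


new_log = (
    "log4j:WARN No appenders could be found for logger "
    "(org.apache.kylin.engine.mr.steps.FactDistinctColumnsMapper)."
)
print(classify(new_log))
```

Markers are checked in order, so a log that contains both strings is reported as 
the Step 1 problem first.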

----
Best wishes,
Xiaoxiang Yu 
 

On 2019/10/24 19:10, “tianjiazheng” wrote:

Yes, I think this is the same problem. Can you tell me the ways to solve it? 




--
Sent from 邮洽

_
From: dev@kylin.apache.org
Sent: 2019-10-24 18:58:53
To: dev@kylin.apache.org
Subject: Re: Kylin 2.6.4 streaming build cannot be used with HDP 3.1.4.0 (issue 3970)


Does your error log look like the one in KYLIN-3970? If yes, there are some ways to 
solve this problem that may be helpful.


On 2019/10/24 17:30, “田家铮” wrote:

I followed your steps and found the error report again. Can you help me fix 
this problem?



2019-10-24 05:25:11,171 ERROR [pool-11-thread-2] 
threadpool.DefaultScheduler:116 : ExecuteException 
job:9722333e-789b-1f30-50e6-38e9905802c0
org.apache.kylin.job.exception.ExecuteException: 
org.apache.kylin.job.exception.ExecuteException: 
org.apache.kylin.engine.mr.exception.MapReduceException: no counters for job 
job_1571906689528_0004Job Diagnostics:Task failed 
task_1571906689528_0004_m_00
Job failed as tasks failed. failedMaps:1 failedReduces:0 killedMaps:0 
killedReduces: 0
Failure task Diagnostics:
[2019-10-24 05:25:00.454]Exception from container-launch.
Container id: container_e01_1571906689528_0004_01_05
Exit code: 255


[2019-10-24 05:25:00.456]Container exited with a non-zero exit code 255. 
Error file: prelaunch.err.
Last 4096 bytes of prelaunch.err :
Last 4096 bytes of stderr :
log4j:WARN No appenders could be found for logger 
(org.apache.kylin.engine.mr.steps.FactDistinctColumnsMapper).
log4j:WARN Please initialize the log4j system properly.
log4j:WARN See http://logging.apache.org/log4j/1.2/faq.html#noconfig for 
more info.




[2019-10-24 05:25:00.457]Container exited with a non-zero exit code 255. 
Error file: prelaunch.err.
Last 4096 bytes of prelaunch.err :
Last 4096 bytes of stderr :
log4j:WARN No appenders could be found for logger 
(org.apache.kylin.engine.mr.steps.FactDistinctColumnsMapper).
log4j:WARN Please initialize the log4j system properly.
log4j:WARN See http://logging.apache.org/log4j/1.2/faq.html#noconfig for 
more info.








at 
org.apache.kylin.job.execution.AbstractExecutable.execute(AbstractExecutable.java:182)
at 
org.apache.kylin.job.impl.threadpool.DefaultScheduler$JobRunner.run(DefaultScheduler.java:114)
at 
java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1149)
at 
java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:624)
at java.lang.Thread.run(Thread.java:748)
Caused by: org.apache.kylin.job.exception.ExecuteException: 
org.apache.kylin.engine.mr.exception.MapReduceException: no counters for job 
job_1571906689528_0004Job Diagnostics:Task failed 
task_1571906689528_0004_m_00
Job failed as tasks failed. failedMaps:1 failedReduces:0 killedMaps:0 
killedReduces: 0
Failure task Diagnostics:
[2019-10-24 05:25:00.454]Exception from container-launch.
Container id: container_e01_1571906689528_0004_01_05
Exit code: 255


[2019-10-24 05:25:00.456]Container exited with a non-zero exit code 255. 
Error file: prelaunch.err.
Last 4096 bytes of prelaunch.err :
Last 4096 bytes of stderr :
log4j:WARN No appenders could be found for logger 
(org.apache.kylin.engine.mr.steps.FactDistinctColumnsMapper).
log4j:WARN Please initialize the log4j system properly.
log4j:WARN See http://logging.apache.org/log4j/1.2/faq.html#noconfig for 
more info.




[2019-10-24 05:25:00.457]Container exited with a non-zero exit code 255. 
Error file: prelaunch.err.
Last 4096 bytes of prelaunch.err :
Last 4096 bytes of stderr :
log4j:WARN No appenders could

Re: Kylin 2.6.4 streaming build cannot be used with HDP 3.1.4.0 (issue 3970)

2019-10-24 Thread Xiaoxiang Yu

Hello, my WeChat ID is hit-lacus. You can add me on WeChat and we can see whether we can solve it together.


Best wishes,
Xiaoxiang Yu 
 

On 2019/10/25 10:37, “田家铮” wrote:

I have replaced all the guava versions in mapreduce.tar.gz with 
guava-11.0.2.jar and uploaded them to HDFS again, but the error is still the 
same!

--
java.lang.NoSuchMethodError: 
com.google.common.hash.Hasher.putString(Ljava/lang/CharSequence;)Lcom/google/common/hash/Hasher;

--
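
The NoSuchMethodError text above uses the JVM's internal method descriptor 
notation. A minimal Python sketch that decodes it into a readable signature is 
shown here; the function names are hypothetical and the decoder only covers the 
descriptor grammar (primitives, L...; object types and [ arrays).

```python
import re

PRIMITIVES = {
    "B": "byte", "C": "char", "D": "double", "F": "float", "I": "int",
    "J": "long", "S": "short", "Z": "boolean", "V": "void",
}


def _parse_type(desc, i):
    """Decode one JVM type starting at desc[i]; return (name, next_index)."""
    dims = 0
    while desc[i] == "[":          # each '[' adds one array dimension
        dims += 1
        i += 1
    if desc[i] == "L":             # object type: Lpkg/Class;
        end = desc.index(";", i)
        name = desc[i + 1:end].replace("/", ".")
        i = end + 1
    else:                          # single-letter primitive
        name = PRIMITIVES[desc[i]]
        i += 1
    return name + "[]" * dims, i


def decode(symbol):
    """Turn 'pkg.Class.method(ARGS)RET' into 'RET pkg.Class.method(args)'."""
    match = re.fullmatch(r"([\w.$]+)\((.*)\)(.+)", symbol)
    if match is None:
        raise ValueError("not a method symbol: " + symbol)
    name, args_desc, ret_desc = match.groups()
    args, i = [], 0
    while i < len(args_desc):
        arg, i = _parse_type(args_desc, i)
        args.append(arg)
    ret, _ = _parse_type(ret_desc, 0)
    return f"{ret} {name}({', '.join(args)})"


print(decode(
    "com.google.common.hash.Hasher.putString"
    "(Ljava/lang/CharSequence;)Lcom/google/common/hash/Hasher;"
))
```

Read this way, the JVM is looking for a putString method that takes a single 
CharSequence and returns a Hasher. The Guava jar that is actually loaded at 
runtime does not provide it, which fits the Guava version mismatch described 
earlier in this thread.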

Re: Questions about using Oracle as a data source for Kylin

2019-11-21 Thread Xiaoxiang Yu

Hi friend,
I am happy to hear that you are interested in Kylin. In fact, Kylin does NOT 
promise to support Oracle as a JDBC source at the moment, so you cannot find 
any documentation about an Oracle source because none exists. But I think you can still 
give it a try.
By the way, we cannot see your screenshot. It would be great if you could 
provide your Kylin config.


Best wishes,
Xiaoxiang Yu


From: 莪哭ㄋ誰疼 <13205288...@qq.com>
Reply-To: "dev@kylin.apache.org" 
Date: Friday, November 22, 2019 15:13
To: dev 
Subject: Questions about using Oracle as a data source for Kylin

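To the Kylin development team:

Greetings. I am a Kylin enthusiast and want to learn to use Kylin in depth. I have run into a problem.


The problem is as follows: I use Oracle as the original data source. I followed the JDBC data source configuration steps, everything was configured successfully, and all the tables can be loaded into Kylin. But building my cube fails; I have put the error in the attachment. From the SQL statement printed by the backend, I judge that the dialect is wrong, but I could not find the relevant configuration instructions in the official documentation, so I would appreciate your guidance. I have also pasted my configuration in this email. Thank you for your advice.

Sender: an unknown nobody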

[cid:F38F3FE0@A814B056.8A52D65D.jpg]

Re: metastore clean OutOfMemoryError

2019-11-21 Thread Xiaoxiang Yu

You should have a log file like hs_err_pid19438.log; could you please show us its 
content?
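
As a rough sanity check, the numbers in the report quoted below can be compared 
with a short Python sketch. The function name is hypothetical, and the size of a 
heap dump is only an approximate indicator of how much heap the process had.

```python
import re

UNITS = {"k": 1024, "m": 1024 ** 2, "g": 1024 ** 3}


def xmx_bytes(jvm_settings):
    """Return the -Xmx value in bytes, or None if the flag is absent."""
    match = re.search(r"-Xmx(\d+)([kKmMgG]?)", jvm_settings)
    if match is None:
        return None
    number, unit = int(match.group(1)), match.group(2).lower()
    return number * UNITS.get(unit, 1)


# Beginning of the KYLIN_JVM_SETTINGS value quoted below.
settings = "-Xms16g -Xmx16g -XX:MaxPermSize=512m"
# From the line "Heap dump file created [317991670bytes in 2.120 secs]".
heap_dump_bytes = 317991670

limit = xmx_bytes(settings)
print(f"configured -Xmx : {limit / 1024 ** 2:.0f} MiB")
print(f"heap dump size  : {heap_dump_bytes / 1024 ** 2:.0f} MiB")
print(f"dump / limit    : {heap_dump_bytes / limit:.1%}")
```

The dump is a small fraction of the configured 16g. One possible reading, which 
is an assumption to verify and not a diagnosis, is that the JVM started by 
metastore.sh did not run with these settings.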


Best wishes,
Xiaoxiang Yu 
 

On 2019/11/22 15:00, “MrWell” wrote:

Hi, Kylin Team.

When I execute "bin/metastore.sh clean --delete true" , I get a 
"OutOfMemoryError" like this



java.lang.OutOfMemoryError: Java heap space
Dumping heap to java_pid4839.hprof ...
Heap dump file created [317991670bytes in 2.120 secs]
#
# java.lang.OutOfMemoryError: Java heap space
# -XX:OnOutOfMemoryError="kill -9 %p"
# Executing /bin/sh -c "kill -9 4839"...
bin/metastore.sh: line 109: 4839 Killed
 ${KYLIN_HOME}/bin/kylin.sh 
org.apache.kylin.tool.MetadataCleanupJob "${@:2}"





I have set 'setenv.sh' file, like this


export KYLIN_JVM_SETTINGS="-Xms16g -Xmx16g -XX:MaxPermSize=512m 
-XX:NewSize=3g -XX:MaxNewSize=3g -XX:SurvivorRatio=4 
-XX:+CMSClassUnloadingEnabled -XX:+CMSParallelRemarkEnabled 
-XX:+UseConcMarkSweepGC -XX:+CMSIncrementalMode 
-XX:CMSInitiatingOccupancyFraction=70 -XX:+UseCMSInitiatingOccupancyOnly 
-XX:+DisableExplicitGC -XX:+HeapDumpOnOutOfMemoryError -verbose:gc 
-XX:+PrintGCDetails -XX:+PrintGCDateStamps -Xloggc:$KYLIN_HOME/logs/kylin.gc.$$ 
-XX:+UseGCLogFileRotation -XX:NumberOfGCLogFiles=10 -XX:GCLogFileSize=64M"


Does it mean the heap memory is still too small?

Re: Cube data tables for cube-wizard tutorial

2019-11-28 Thread Xiaoxiang Yu

Unfortunately, the image you sent to the mailing list is NOT shown correctly. Could 
you please re-send your image, the content of your kylin.log, and the version 
information of your Hadoop env to the mailing list?  Thank you.

You can store your image somewhere on the Internet and paste the URL 
of the image in your mail.


Best wishes,
Xiaoxiang Yu


From: Ben Lee 
Reply-To: "dev@kylin.apache.org" 
Date: Thursday, November 28, 2019 08:03
To: "dev@kylin.apache.org" 
Subject: Cube data tables for cube-wizard tutorial

Hi team,

I'm trying to follow the following tutorial
http://kylin.apache.org/docs/tutorial/create_cube.html
to create cube


But I noticed that I missed the following DB and tables. ( tried to run 
samples.sh)

[cid:ii_k3hm10iq0]

Can anyone share where to get the data or share any other tutorial as a 
bundle together?

Thanks,
Ming

Re: Cube data tables for cube-wizard tutorial

2019-11-28 Thread Xiaoxiang Yu

Hi ming,
I cannot understand what happened in your env. Could you please describe your 
problem again in detail?
It looks like recharge_detail was loaded successfully into Kylin. Do you mean that 
you loaded a table into Kylin but cannot find it when you want to create a 
model? Have you checked your kylin.log? You may find some clues there.


Best wishes,
Xiaoxiang Yu


From: Ben Lee 
Date: Friday, November 29, 2019 00:27
To: Xiaoxiang Yu 
Subject: Re: Cube data tables for cube-wizard tutorial

Hi Xiaoxiang,

Thanks for your response. Here is the picture again,
[cid:image001.png@01D5A6A8.757A6610]
Thanks,
Ming.



On Thu, 28 Nov 2019 at 00:51, Xiaoxiang Yu 
<xiaoxiang...@kyligence.io> wrote:
Unfortunately, the image you sent to the mailing list is NOT shown correctly. Could 
you please re-send your image, the content of your kylin.log, and the version 
information of your Hadoop env to the mailing list?  Thank you.

You can store your image somewhere on the Internet and paste the URL 
of the image in your mail.


Best wishes,
Xiaoxiang Yu


From: Ben Lee <benmin...@gmail.com>
Reply-To: "dev@kylin.apache.org" <dev@kylin.apache.org>
Date: Thursday, November 28, 2019 08:03
To: "dev@kylin.apache.org" <dev@kylin.apache.org>
Subject: Cube data tables for cube-wizard tutorial

Hi team,

I'm trying to follow the following tutorial
http://kylin.apache.org/docs/tutorial/create_cube.html
to create cube


But I noticed that I missed the following DB and tables. ( tried to run 
samples.sh)

Error! Filename not specified.

Can anyone share where to get the data or share any other tutorial as a 
bundle together?

Thanks,
Ming

Re: Kylin to PostgreSQL Error in Cube build Step 1

2019-10-29 Thread Xiaoxiang Yu

Hi Molotov,
   The PR is under review and testing, and on my side it works. You can check the 
tests with screenshots on the PR page (https://github.com/apache/kylin/pull/902) to see 
whether it is tested well. If you want to test it in your env, please let me know, 
and I will send the binary to you.  


Best wishes,
Xiaoxiang Yu 
 

On 2019/10/28 15:08, “Andrey Molotov” wrote:

Hello, thank you for your answer.
I pulled the commit you provided and compiled the jar file (two jar files, 
actually: kylin-source-jdbc-3.0.0-SNAPSHOT.jar and 
kylin-jdbc-3.0.0-SNAPSHOT.jar). Then for each of these files I did the following: 
renamed it and put it in place of the existing kylin-jdbc-2.6.4.jar file in the 
kylin/lib directory. 
But unfortunately this did not help me resolve my problem with the backtick in the 
SQL query.
Is there any other way to get a proper query line for PostgreSQL, or maybe I 
did something wrong? Thanks in advance.

> On Oct 16, 2019, at 02:51, "codingfor...@126.com" wrote:
> 
> Hi, Molotov, because PostgreSQL's syntax and metadata have certain 
peculiarities, some development work is needed. PR 
https://github.com/apache/kylin/pull/747 
<https://github.com/apache/kylin/pull/747> is doing this kind of thing; it is 
in review now.
> 
>>> On Oct 15, 2019, at 20:54, Andrey Molotov wrote:
>> Hello, everyone.
>> I’ve set up Kylin to access a PostgreSQL Database using JDBC as 
described in http://kylin.apache.org/docs/tutorial/setup_jdbc_datasource.html .
>> I’ve also set kylin.source.default=16 and 
kylin.source.hive.enable.quote=false in kylin.properties.
>> But when I try to build a cube I get an error on #1 Step Name: Sqoop To 
Flat Hive Table.
>> My Kylin Version is 2.6.4.
>> Here is log:
>>  java.io.IOException: OS command error exit with return 
code: 1, error message: Error: Could not find or load main class 
org.apache.hadoop.hbase.util.GetJavaProperty
>> SLF4J: Class path contains multiple SLF4J bindings.
>> SLF4J: Found binding in 
[jar:file:/opt/hadoop/share/hadoop/common/lib/slf4j-log4j12-1.7.25.jar!/org/slf4j/impl/StaticLoggerBinder.class]
>> SLF4J: Found binding in 
[jar:file:/opt/hive/lib/log4j-slf4j-impl-2.6.2.jar!/org/slf4j/impl/StaticLoggerBinder.class]
>> SLF4J: Found binding in 
[jar:file:/opt/hbase/lib/client-facing-thirdparty/slf4j-log4j12-1.7.25.jar!/org/slf4j/impl/StaticLoggerBinder.class]
>> SLF4J: See http://www.slf4j.org/codes.html#multiple_bindings for an 
explanation.
>> SLF4J: Actual binding is of type [org.slf4j.impl.Log4jLoggerFactory]
>> 2019-10-15 08:40:23,908 INFO sqoop.Sqoop: Running Sqoop version: 1.4.7
>> 2019-10-15 08:40:23,936 WARN tool.BaseSqoopTool: Setting your password 
on the command-line is insecure. Consider using -P instead.
>> 2019-10-15 08:40:24,004 WARN sqoop.ConnFactory: Parameter --driver is 
set to an explicit driver however appropriate connection manager is not being 
set (via --connection-manager). Sqoop is going to fall back to 
org.apache.sqoop.manager.GenericJdbcManager. Please specify explicitly which 
connection manager should be used next time.
>> 2019-10-15 08:40:24,017 INFO manager.SqlManager: Using default fetchSize 
of 1000
>> 2019-10-15 08:40:24,017 INFO tool.CodeGenTool: Beginning code generation
>> 2019-10-15 08:40:24,164 INFO manager.SqlManager: Executing SQL 
statement: SELECT "installations"."city" AS "INSTALLATIONS_CITY", 
"installations"."device_type" AS "INSTALLATIONS_DEVICE_TYPE", 
"installations"."install_datetime"
>> FROM "data"."installations" AS "installations"
>> WHERE 1 = 1 AND ("installations"."install_datetime" >= '2019-01-01' AND 
"installations"."install_datetime" < '2019-01-03') AND  (1 = 0)
>> 2019-10-15 08:40:24,176 INFO manager.SqlManager: Executing SQL 
statement: SELECT "installations"."city" AS "INSTALLATIONS_CITY", 
"installations"."device_type" AS "INSTALLATIONS_DEVICE_TYPE", 
"installations"."install_datetime"
>> FROM "data"."installations" AS "installations"
>> WHERE 1 = 1 AND ("installations"."install_datetime" >= '2019-01-01' AND 
"installations"."install_datetime" < '2019-01-03') AND  (1 = 0)
>> 2019-10-15 08:40:24,200 INFO orm.CompilationManager: HADOOP_MAPRED_HOME 
is /opt/hadoop
>> Note: 
/tmp/sqoop-hadoop/compile/33bbb7f633bb5f8338ed0a8e1e7ce3cc/QueryResult.java 
uses or overrides a deprecated API.
>> Note: Recompile with -Xlint:deprecation for details.
>>

Re: New committer: Chao Long

2019-10-08 Thread Xiaoxiang Yu

Chao, Congratulations!!


Best wishes,
Xiaoxiang Yu 
 

On 2019/10/7 10:01, "zjsy...@163.com on behalf of nichunen" wrote:

Congratulations!




Best regards,

 

Ni Chunen / George



On 10/7/2019 09:22, Yichen Zhou wrote:
Congratulations, Chao!!!

Best,
Yichen

On Sun, Oct 6, 2019 at 6:19 PM ShaoFeng Shi  wrote:

The Project Management Committee (PMC) for Apache Kylin
has invited Chao Long to become a committer and we are pleased to announce
that he has accepted.

Chao Long (龙超，email: wayn...@qq.com) has started to contribute to Kylin
since last year. Till today, he has made 81 commits on the master branch,
resolved 71 JIRAs. His contribution includes: making fact distinct job in
Spark, merging dictionary on Yarn, improving cube planner,  parquet storage
PoC, and many bug fixes. Besides, he also answered many questions on the
mailing lists.

Congratulations, Chao!

Best regards,

Shaofeng Shi 史少锋
Apache Kylin PMC
Email: shaofeng...@apache.org

Apache Kylin FAQ: https://kylin.apache.org/docs/gettingstarted/faq.html
Join Kylin user mail group: user-subscr...@kylin.apache.org
Join Kylin dev mail group: dev-subscr...@kylin.apache.org

Re: [Announce] Apache Kylin 3.0.0 released

2019-12-24 Thread Xiaoxiang Yu

May the new version of Kylin be even more powerful and stable!


Best wishes,
Xiaoxiang Yu 
 

On 2019/12/24 13:53, "Wang rupeng" wrote:

Congratulations! 

---
Best wishes,
Rupeng Wang
 


On 2019/12/24 12:08, "Xiaoyuan Gu" wrote:

Big congrats! Looking forward to seeing Kylin embrace more 
state-of-the-art features. Kudos to all contributors!


Bests,
Xiaoyuan



At 2019-12-20 20:45:16, "ShaoFeng Shi"  wrote:
>The Apache Kylin team is pleased to announce the immediate 
availability of
>the 3.0.0 release.
>
>This is the GA release of Kylin’s next generation after 2.x, with the 
new
>real-time OLAP feature, Kylin can query streaming data with sub-second
>latency. All of the
> changes in this release can be found in:
>https://kylin.apache.org/docs/release_notes.html
>
>
>You can download the source release and binary packages from Apache 
Kylin's
>download page:https://kylin.apache.org/download/
>
>
>Apache Kylin is an open-source Distributed Analytics Engine designed to
>provide SQL interface and multi-dimensional analysis (OLAP) on Apache
>Hadoop, supporting extremely
> large datasets.
>
>
>Apache Kylin lets you query massive dataset at sub-second latency in 3
>steps:
>1. Identify a star schema or snowflake schema data set on Hadoop.
>2. Build Cube on Hadoop.
>3. Query data with ANSI-SQL and get results in sub-second, via ODBC, 
JDBC
>or RESTful API.
>
>
>Thanks to everyone who has contributed to this release.
>
>
>We welcome your help and feedback. For more information on how to 
report
>problems, and to get involved, visit the project website at
>https://kylin.apache.org/
>
>Best regards,
>
>Shaofeng Shi 史少锋
>Apache Kylin PMC
>Email: shaofeng...@apache.org
>
>Apache Kylin FAQ: https://kylin.apache.org/docs/gettingstarted/faq.html
>Join Kylin user mail group: user-subscr...@kylin.apache.org
>Join Kylin dev mail group: dev-subscr...@kylin.apache.org

Re:Kylin start up failing on EMR 5.28

2020-02-09 Thread Xiaoxiang Yu

Dear friend,
If I am not mistaken, you are facing two problems. 
The first is how to set up Kylin in a single EMR cluster. 
I tested this last month, and I think 
http://kylin.apache.org/docs31/install/kylin_aws_emr.html and  
https://github.com/hit-lacus/hit-lacus.github.io/issues/81 may help you solve 
it.  The second is about R/W separated deployment. In this case, 
could you please show us how you configured your kylin.properties? I will 
test it when I have some free time.

--

Best wishes to you ! 
From ：Xiaoxiang Yu

At 2020-02-09 11:55:44, "Raghu Ram Reddy Medapati"  
wrote:
>I've setup kylin on EMR edge node, when i try to start the service, i get the 
>below error.
>Kylin -->Latest (3.0)
>EMR --> 5.28
>ERROR [localhost-startStop-1] org.springframework.web.context.ContextLoader - 
>Context initialization failed
>org.springframework.beans.factory.BeanCreationException: Error creating bean 
>with name 
>'org.springframework.web.servlet.mvc.method.annotation.RequestMappingHandlerAdapter':
> Instantiation of bean failed; nested exception is 
>org.springframework.beans.BeanInstantiationException: Failed to instantiate 
>[org.springframework.web.servlet.mvc.method.annotation.RequestMappingHandlerAdapter]:
> Constructor threw exception; nested exception is 
>java.lang.ClassCastException: com.fasterxml.jackson.datatype.joda.JodaModule 
>cannot be cast to com.fasterxml.jackson.databind.Module
>   at 
> org.springframework.beans.factory.support.AbstractAutowireCapableBeanFactory.instantiateBean(AbstractAutowireCapableBeanFactory.java:1155)
>  ~[spring-beans-4

Re:Need support

2020-02-16 Thread Xiaoxiang Yu

Hi friend,
Have you checked this doc, 
http://kylin.apache.org/docs/install/index.html ? What concrete problem 
did you meet? As far as I know, the latest Kylin binary has passed testing on 
Hortonworks HDP 2.4-2.6 and 3.0-3.1.

--

Best wishes to you ! 
From ：Xiaoxiang Yu

At 2020-02-16 12:11:17, "Dinesh Dhanasekaran"  
wrote:
>Hi Team,
>
>I need to install Kylin , could you please help me for installation on Ambari.
>
>Regards,
>Dinesh Dhanasekaran

Re: [VOTE] Release apache-kylin-2.6.5 (RC2)

2020-02-16 Thread Xiaoxiang Yu

+1
mvn test passed in my dev env
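
For anyone who also wants to check the artifact hashes listed in the vote 
email quoted below, here is a minimal Python sketch. It is not part of the 
official release procedure; the function names sha256_of and verify are 
made up for illustration, and it assumes the artifact has already been 
downloaded to a local path.

```python
import hashlib


def sha256_of(path, chunk_size=1 << 20):
    """Return the hex SHA-256 digest of a file, reading it in chunks."""
    digest = hashlib.sha256()
    with open(path, "rb") as fh:
        # iter() with a sentinel stops when read() returns empty bytes (EOF).
        for chunk in iter(lambda: fh.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def verify(path, expected_hex):
    """True if the file's digest equals the published one (case-insensitive)."""
    return sha256_of(path) == expected_hex.strip().lower()
```

A call such as verify("apache-kylin-2.6.5-source-release.zip", "<hash from 
the vote email>") would then return True only when the local file matches 
the published value.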







--

Best wishes to you ! 
From ：Xiaoxiang Yu



At 2020-02-16 13:45:19, "Yaqian Zhang"  wrote:
>+1
>mvn test passed
>
>> On Feb 16, 2020, at 10:45, Xiaoyuan Gu wrote:
>> 
>> +1
>> mvn test passed
>> 
>> 
>> 
>> 
>> Bests,
>> Xiaoyuan Gu
>> 
>> 
>> 
>> 
>> 
>> At 2020-02-14 21:21:30, "George Ni"  wrote:
>>> Hi all,
>>> 
>>> 
>>> 
>>> I have created a build for Apache Kylin 2.6.5, release candidate 2.
>>> 
>>> 
>>> 
>>> Changes highlights:
>>> 
>>> [KYLIN-4374] - Fix security issues reported by code analysis platform LGTM
>>> 
>>> [KYLIN-4291] - Parallel segment building may causes WriteConflictException
>>> 
>>> [KYLIN-4263] - Inappropriate exception handling causes job stuck on running
>>> status
>>> 
>>> [KYLIN-4169] - Too many logs during DataModelManager’s initiation, cause
>>> the first RESTful API to hang for a long time
>>> 
>>> 
>>> 
>>> Thanks to everyone who has contributed to this release.
>>> 
>>> Here are the release notes:
>>> 
>>> https://issues.apache.org/jira/secure/ReleaseNote.jspa?version=12346281==12316121
>>> 
>>> 
>>> 
>>> The commit to being voted upon:
>>> 
>>> https://github.com/apache/kylin/commit/73d42edec5f6492b3d3ffc222c26dce4bdfe7263
>>> 
>>> 
>>> Its hash is 73d42edec5f6492b3d3ffc222c26dce4bdfe7263.
>>> 
>>> 
>>> 
>>> The artifacts to be voted on, including the source package and four
>>> 
>>> pre-compiled binary packages are located here:
>>> 
>>> https://dist.apache.org/repos/dist/dev/kylin/apache-kylin-2.6.5-rc2/
>>> 
>>> 
>>> 
>>> The hash of the artifacts are as follows:
>>> 
>>> apache-kylin-2.6.5-source-release.zip.sha256
>>> 
>>> 32fb722a58ed49318dc5f72da429506238ee2089170236c742f22240d93080a3
>>> 
>>> apache-kylin-2.6.5-bin-hbase1x.tar.gz.sha256
>>> 
>>> f8e04fafa7e63cdeb28aa4e8c35bc2f54456888bef37d4cca060c424be722e1e
>>> 
>>> apache-kylin-2.6.5-bin-cdh57.tar.gz.sha256
>>> 
>>> 71deb9bc84e5b75320e2c8ca358449f808ef2decbee11679e7f4339abeeb3d3f
>>> 
>>> apache-kylin-2.6.5-bin-hadoop3.tar.gz.sha256
>>> 
>>> e77218204800a7b6be42e90e0fcb03e6e8458d600125e44bc112a8a359291fc4
>>> 
>>> apache-kylin-2.6.5-bin-cdh60.tar.gz.sha256
>>> 
>>> 2f7003eba8b528198aa7f886be04ecb75e86ece9e2e67dba4c15a89fb8817f7a
>>> 
>>> 
>>> 
>>> A staged Maven repository is available for review at:
>>> 
>>> https://repository.apache.org/content/repositories/orgapachekylin-1075/
>>> 
>>> 
>>> 
>>> Release artifacts are signed with the following key:
>>> 
>>> https://people.apache.org/keys/committer/nic.asc
>>> 
>>> 
>>> 
>>> Please vote on releasing this package as Apache Kylin 2.6.5
>>> 
>>> 
>>> 
>>> The vote is open for the next 72 hours and passes if a majority of
>>> 
>>> at least three +1 PMC votes are cast.
>>> 
>>> 
>>> 
>>> [ ] +1 Release this package as Apache Kylin 2.6.5
>>> 
>>> [ ]  0 I don't feel strongly about it, but I'm okay with the release
>>> 
>>> [ ] -1 Do not release this package because...
>>> 
>>> 
>>> 
>>> Here is my vote:
>>> 
>>> 
>>> 
>>> +1 (binding)
>>> 
>>> -- 
>>> 
>>> -
>>> 
>>> Best regards,
>>> 
>>> 
>>> 
>>> Ni Chunen / George

Re:[DISCUSS] Collect Kylin best practices with Apache Wiki

2020-02-18 Thread Xiaoxiang Yu

+1
Great, a wiki is a good place to share knowledge, and it is easy to use. I will 
share what I have found when I have some free time. 








--

Best wishes to you ! 
From ：Xiaoxiang Yu



At 2020-02-18 20:15:52, "ShaoFeng Shi"  wrote:

Hello Kylin users,


I'm proposing to collect Kylin best practices on the Apache Wiki. I have 
created an entry page and started to compose some there. If you want to share or 
contribute, please email the group, and we will review and add it. A 
practice should be brief and easy to understand; if it needs to dive into 
detail, a reference link can be provided with it. Let's try, thank you!


Here is the wiki link:
https://cwiki.apache.org/confluence/display/KYLIN/Best+practices



Best regards,


Shaofeng Shi 史少锋
Apache Kylin PMC
Email: shaofeng...@apache.org


Apache Kylin FAQ: https://kylin.apache.org/docs/gettingstarted/faq.html
Join Kylin user mail group: user-subscr...@kylin.apache.org
Join Kylin dev mail group: dev-subscr...@kylin.apache.org

Re:Kylin Building Engine With SparkSql & Parquet

2020-01-19 Thread Xiaoxiang Yu

Great news! 
I can foresee Kylin becoming more cloud-native once Parquet storage 
matures. I hope the development team will share more details about its 
design.




--

Best wishes to you ! 
From ：Xiaoxiang Yu



At 2020-01-19 22:22:30, "George Ni"  wrote:
>Hi Kylin users & developers,
>
>By-layer Spark Cubing has been introduced into Apache Kylin since v2.0 to
>achieve better performance and it does run much faster compared to MR
>engine. Also Hbase has been Kylin’s trustful storage engine since Kylin was
>born and it has been proved to be a success for providing the ability to
>handle high concurrency queries in extremely large data scale with low
>latency. But there are also limitations for HBase, such as filtering is not
>flexible as we could only filter by RowKey, measures are usually combined
>together which causes more data to be scanned than requested.
>
>
>
>So in order to optimize Kylin in both building strategy and storage engine,
>development team of Kyligence is introducing a new cube building engine
>which uses Spark Sql to construct cuboids with a new strategy and stores
>cube results in Parquet files. The building strategy allows Kylin to build
>cuboids in a smarter way by choosing and building on the optimal cuboid
>source. And Parquet, a columnar storage format available to any project in
>the Hadoop ecosystem, will power the filtering ability with the page-level
>column index and reduce I/O by saving measures in different columns. Also
>with Storing cuboid in Parquet instead of Hbase, we can utilize Kylin in
>Cloud Native way. More information on design and technique details will
>come soon.
>
>
>
>Below is the comparison in building duration and size of results between
>By-layer Spark Cubing and the new cubing strategy.
>
>
>
>Environment
>
>4-nodes Hadoop cluster
>
>YARN has 400GB RAM and 128 cores in total;
>
>CDH 5.1, Apache Kylin 3.0.
>
>
>
>Spark
>
>Spark 2.4.1-kylin-r17
>
>
>
>Test Data
>
>SSB data
>
>Cube: 15 dimensions, 3 measures (SUM)
>
>
>
>Test Scenarios
>
>Build the cube at different source size level: 30 million, 60 million
>source rows; Compare the build time with Spark (by layer) + Hbase and
>SparkSql + Parquet.
>
>
>Besides, we attempt to resolve many drawbacks in current query engine,
>which relies heavily on Apache Calcite, such as the performance bottleneck
>in aggregating large query results which currently can only be operated by
>a single worker. By embracing SparkSql, this kind of expensive computing
>can be done distributedly. Also combined with Parquet format, plenty of
>filtering optimizations could be applied, which will boost Kylin’s query
>performance significantly. The features will be open source along with
>technique details in the near future.
>
>
>
>   - https://issues.apache.org/jira/browse/KYLIN-4188
>
>
>-- 
>
>-
>
>Best regards,
>
>
>
>Ni Chunen / George

Re: New committer: Xiaoxiang Yu

2019-12-30 Thread Xiaoxiang Yu

Thank you. I am very grateful to everyone who gave me guidance and support during 
this period, especially Shaofeng Shi and Gang Ma. 
I hope Kylin will be even more powerful and user-friendly in 2020!


--
Best wishes to you ! 
From ：Xiaoxiang Yu



At 2019-12-30 16:46:29, "JiaTao Tao"  wrote:
>Congratulations Xiaoxiang!
>
>-- 
>
>
>Regards!
>
>Aron Tao
>
>> On Dec 29, 2019, at 17:17, ShaoFeng Shi wrote:
>> 
>> Hi folks,
>> 
>> The Project Management Committee (PMC) for Apache Kylin
>> has invited Xiaoxiang Yu to become a committer and we are pleased to
>> announce that he has accepted.
>> 
>> Xiaoxiang Yu (俞霄翔, email hit_la...@126.com) is one of the big data
>> engineers from Kyligence; He started to work on the Kylin project since the
>> middle of 2018. In the past time, he fixed many issues, investigated and
>> verified many new features (especially the v3.0 real-time streaming),
>> enhancements and bug fixes. Thank you and congratulations, Xiaoxiang!
>> 
>> Let's warmly welcome Xiaoxiang as the Kylin committer!
>> 
>> Best regards,
>> 
>> Shaofeng Shi 史少锋
>> Apache Kylin PMC
>> Email: shaofeng...@apache.org
>> 
>> Apache Kylin FAQ: https://kylin.apache.org/docs/gettingstarted/faq.html
>> Join Kylin user mail group: user-subscr...@kylin.apache.org
>> Join Kylin dev mail group: dev-subscr...@kylin.apache.org

Re: kafka-kylin real time streaming

2019-12-25 Thread Xiaoxiang Yu

Hello Sir:
I am very glad to hear your comment. Here are my replies:
1. I see the subject of this email is "real time streaming", which seems to
indicate that you are using the "Realtime OLAP" feature (see
http://kylin.apache.org/docs/tutorial/realtime_olap.html ), which was introduced in
Kylin 3.0, not the NRT streaming feature (see
http://kylin.apache.org/docs/tutorial/cube_streaming.html ), which was introduced in
Kylin v1.6. Am I right? But I see you are using the REST API "
http://localhost:7070/kylin/api/cubes/{your_cube_name}/init_start_offsets ",
which is only for NRT Streaming, not for Realtime OLAP.
2. For a real-time OLAP cube, please check this if you need to update a
segment (http://kylin.apache.org/docs/tutorial/lambda_mode_and_timezone_realtime_olap.html).
3. What is a real-time OLAP cube? It is a cube whose fact table is
loaded via the "Add streaming table V2" icon.
4. I have submitted my update via a
PR (https://github.com/apache/kylin/pull/1018 ); once it is merged, you can
check the new documentation.

Thank you again for your comments. If you have more suggestions, please let
us know.

----
Best wishes,
Xiaoxiang Yu

On 2019/12/25 16:35, "newUser" wrote:

Hello, I am using Kylin version 3.0 I could manage to create streaming cube
But I can't refresh the cube. I am getting this error.

The new refreshing segment kafka_kylin_cube[20191225093200_20191225093200]
does not match any exisiting segment in cube CUBE[name=kylin_cube]

But when I run this code I get a success message, yet I still can't refresh the
cube.

curl -X PUT --user ADMIN:KYLIN -H "Content-Type:
application/json;charset=utf-8" -d '{ "sourceOffsetStart": 0,
"sourceOffsetEnd": 9223372036854775807, "buildType": "REFRESH"}'
http://localhost:7070/kylin/api/cubes/{your_cube_name}/init_start_offsets

--
Sent from: http://apache-kylin.74782.x6.nabble.com/

Re:Kylin GUI error "Cannot get HiveTableMeta" on AWS EMR with Glue as hive metastore

2020-04-09 Thread Xiaoxiang Yu

For 3.0.2, we are trying to fix a few security issues, and the code change is under 
review. 
I think we may start to plan a release candidate for 3.0.2 in the next two weeks, 
depending on the review result.

--

Best wishes to you ! 
From ：Xiaoxiang Yu

At 2020-04-10 09:21:59, "mvishnubhatta"  
wrote:
>Ah, that makes sense. I didn't look closely at the "31" in the link. I will
>try out your suggestions, but is there any planned timeline for 3.0.2 or
>3.1?
>
>Appreciate your quick response.
>
>--
>Sent from: http://apache-kylin.74782.x6.nabble.com/

Re:Kylin GUI error "Cannot get HiveTableMeta" on AWS EMR with Glue as hive metastore

2020-04-09 Thread Xiaoxiang Yu

Hi, 
Glad to hear that you have fixed the previous issue, and thank you for your 
update. AWS Glue support for Kylin will 
be introduced in a future release (should be 3.0.2 and 3.1.0); please check 
https://issues.apache.org/jira/browse/KYLIN-4206. The doc 
https://kylin.apache.org/docs31/install/kylin_aws_emr.html is for Kylin 3.1, 
which is not released yet.


For now, you can build a binary package from the master branch and test it; 
here is my note: https://github.com/hit-lacus/hit-lacus.github.io/issues/81 .
Besides, another committer, Kaige, has another suggestion (which I have not 
tested); you can check it:  https://issues.apache.org/jira/browse/KYLIN-3685 .
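
When checking whether entries such as those mentioned in the quoted message 
below are actually active in kylin.properties (that is, present and not 
commented out), a small script can help. The following Python sketch is only 
an illustration: load_properties and missing_or_different are hypothetical 
helper names, the parser handles plain key=value lines only (no line 
continuations or escapes), and it says nothing about whether a given Kylin 
version honours a property.

```python
def load_properties(text):
    """Parse simple Java-style .properties text into a dict.

    Simplified on purpose: blank lines and lines starting with '#' or '!'
    are skipped; everything else is split on the first '='.
    """
    props = {}
    for raw in text.splitlines():
        line = raw.strip()
        if not line or line.startswith(("#", "!")):
            continue
        key, sep, value = line.partition("=")
        if sep:
            props[key.strip()] = value.strip()
    return props


def missing_or_different(props, expected):
    """Return {key: actual value or None} for keys that do not match expected."""
    return {k: props.get(k) for k, v in expected.items() if props.get(k) != v}
```

For example, if the text contains "#kylin.source.hive.client=cli" (still 
commented out) and "kylin.source.hive.metadata-type=gluecatalog", then 
comparing against both expected values reports only 
kylin.source.hive.client, with None as its actual value.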







--

Best wishes to you ! 
From ：Xiaoxiang Yu





At 2020-04-10 05:29:04, "mvishnubhatta"  
wrote:
>Hi,
>
>I am trying to set up Kylin on an EMR cluster (trying both on a master node
>and on an edge node). While the GUI shows up, I get a "Cannot get
>HiveTableMeta" error when trying to load a hive table.
>
>The hive metastore is AWS Glue.
>
>Curiously, the  most recent version of the document
><https://kylin.apache.org/docs/install/kylin_aws_emr.html>   does not say
>anything about Glue)
>
>Looking at an older version of the document
><https://kylin.apache.org/docs31/install/kylin_aws_emr.html>  , I tried the
>following:
>
>cp /usr/lib/hive/auxlib/aws-glue-datacatalog-hive2-client.jar
>$KYLIN_HOME/lib/
>cp
>/usr/share/aws/hmclient/lib/aws-glue-datacatalog-client-common-1.11.0-SNAPSHOT.jar
>$KYLIN_HOME/lib/
>#Modify kylin.properties to uncomment the entry
>kylin.source.hive.client=cli
>#Modify kylin.properties to add an entry that says 
>kylin.source.hive.metadata-type=gluecatalog
>
>But I still get the same error on the GUI.
>
>I feel this Kylin server is not even aware that Glue is the hive catalog
>since I cannot find any reference to Glue in any of the logs or error
>messages.
>
>Any ideas on how to set up the hive metastore correctly?
>
>Hadoop version is Hadoop 2.8.5-amzn-4
>Kylin version is 3.0.1
>Hive 2.3.5-amzn-0
>
>I exported the following variables:
>export KYLIN_HOME=/usr/local/kylin/apache-kylin-3.0.1-bin-hbase1x
>#export HADOOP_HOME=/usr/lib/hadoop/etc/hadoop
>export HADOOP_HOME=/usr/lib/hadoop
>export HBASE_HOME=/usr/lib/hbase/
>export HIVE_HOME=/usr/lib/hive/
>export HADOOP_CONF_DIR=/etc/hadoop/conf
>export HIVE_LIB=/usr/lib/hive/lib
>export HIVE_CONF=/etc/hive/conf
>export HCAT_HOME=/usr/lib/hive-hcatalog 
>
>The kylin.log file throws this error:
>2020-04-09 21:17:21,243 ERROR [http-bio-7070-exec-1]
>controller.TableController:130 : Failed to load Hive Table
>java.lang.RuntimeException: cannot get HiveTableMeta
>at
>org.apache.kylin.source.hive.HiveMetadataExplorer.loadTableMetadata(HiveMetadataExplorer.java:68)
>at
>org.apache.kylin.rest.service.TableService.extractHiveTableMeta(TableService.java:211)
>at
>org.apache.kylin.rest.service.TableService.loadHiveTablesToProject(TableService.java:137)
>at
>org.apache.kylin.rest.controller.TableController.loadHiveTables(TableController.java:114)
>at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
>at
>sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62)
>at
>sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
>at java.lang.reflect.Method.invoke(Method.java:498)
>at
>org.springframework.web.method.support.InvocableHandlerMethod.doInvoke(InvocableHandlerMethod.java:205)
>at
>org.springframework.web.method.support.InvocableHandlerMethod.invokeForRequest(InvocableHandlerMethod.java:133)
>at
>org.springframework.web.servlet.mvc.method.annotation.ServletInvocableHandlerMethod.invokeAndHandle(ServletInvocableHandlerMethod.java:97)
>at
>org.springframework.web.servlet.mvc.method.annotation.RequestMappingHandlerAdapter.invokeHandlerMethod(RequestMappingHandlerAdapter.java:827)
>at
>org.springframework.web.servlet.mvc.method.annotation.RequestMappingHandlerAdapter.handleInternal(RequestMappingHandlerAdapter.java:738)
>at
>org.springframework.web.servlet.mvc.method.AbstractHandlerMethodAdapter.handle(AbstractHandlerMethodAdapter.java:85)
>at
>org.springframework.web.servlet.DispatcherServlet.doDispatch(DispatcherServlet.java:967)
>at
>org.springframework.web.servlet.DispatcherServlet.doService(DispatcherServlet.java:901)
>at
>org.springframework.web.servlet.FrameworkServlet.processRequest(FrameworkServlet.java:970)
>at
>org.springframework.web.servlet.FrameworkServlet.doPost(FrameworkServlet.java:872)
>at javax.servlet.http.HttpServlet.service(HttpS

Re:Kylin Start fails with NoClassDefFoundError on AWS EMR Edge Node

2020-04-09 Thread Xiaoxiang Yu

Hi friend, 
I have a suggestion, though I am not sure whether it is the best practice or 
whether it works. You may want to try it if you are willing to.
I guess you have installed Kylin successfully on the master node of the EMR 
cluster, right? After a Kylin instance has started successfully (if you are using 
Kylin 3.0 or above), you will see some files named cached-*-dependency.sh 
under $KYLIN_HOME/bin. In these files, you can find the locations of the jars 
that Kylin needs. You may try copying the jars from the master node to the 
same folders on the edge node, then restarting Kylin on the edge node.
On the other hand, you don't need to modify the property 
"kylin.job.mr.lib.dir" when starting a Kylin instance.

The files are as follows:

  cached-hadoop-conf-dir.sh
  cached-hbase-dependency.sh
  cached-hive-dependency.sh
  cached-kafka-dependency.sh
  cached-spark-dependency.sh

   [root@cdh-master all]# cat bin/cached-hive-dependency.sh
export 
hive_dependency=/opt/cloudera/parcels/CDH-5.7.6-1.cdh5.7.6.p0.6/bin/../lib/hive/conf:/opt/cloudera/parcels/CDH-5.7.6-1.cdh5.7.6.p0.6/bin/../lib/hive/lib/hive-jdbc-standalone.jar:/opt/cloudera/parcels/CDH-5.7.6-1.cdh5.7.6.p0.6/bin/../lib/hive/lib/junit-4.11.jar:/opt/cloudera/parcels/CDH-5.7.6-1.cdh5.7.6.p0.6/bin/../lib/hive/lib/commons-collections-3.2.2.jar:/opt/cloudera/parcels/CDH-5.7.6-1.cdh5.7.6.p0.6/bin/../lib/hive/lib/pentaho-aggdesigner-algorithm-5.1.5-jhyde.jar:/opt/cloudera/parcels/CDH-5.7.6-1.cdh5.7.6.p0.6/bin/../lib/hive/lib/hive-service.jar:/opt/cloudera/parcels/CDH-5.7.6-1.cdh5.7.6.p0.6/bin/../lib/hive/lib/metrics-jvm-3.0.2.jar:/opt/cloudera/parcels/CDH-5.7.6-1.cdh5.7.6.p0.6/bin/../lib/hive/lib/ant-launcher-1.9.1.jar:/opt/cloudera/parcels/CDH-5.7.6-1.cdh5.7.6.p0.6/bin/../lib/hive/lib/log4j-1.2.16.jar:/opt/cloudera/parcels/CDH-5.7.6-1.cdh5.7.6.p0.6/bin/../lib/hive/lib/datanucleus-api-jdo-3.2.6.jar:/opt/cloudera/parcels/CDH-5.7.6-1.cdh5.7.6.p0.6/bin/../lib/hive/lib/hive-cli-1.1.0-cdh5.7.6.jar:/opt/cloudera/parcels/CDH-5.7.6-1.cdh5.7.6.p0.6/bin/../lib/hive/lib/hive-ant.jar:/opt/cloudera/parcels/CDH-5.7.6-1.cdh5.7.6.p0.6/bin/../lib/hive/lib
 .
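
To get a list of the jars named in such a file, the export line can be split 
on ':'. Below is a minimal Python sketch, assuming the file holds a single 
"export name=path1:path2:..." entry as in the sample above; the helper names 
parse_dependency_export and jar_paths are made up for illustration.

```python
import posixpath


def parse_dependency_export(line):
    """Split an 'export name=path1:path2:...' entry into (name, [paths])."""
    tokens = line.split(None, 1)
    # Drop a leading 'export' keyword if present (any whitespace may follow it).
    body = tokens[1] if len(tokens) == 2 and tokens[0] == "export" else line.strip()
    name, _, value = body.partition("=")
    paths = [p for p in value.strip().split(":") if p]
    return name.strip(), paths


def jar_paths(paths):
    """Keep only jar entries, collapsing '/bin/../lib' style segments."""
    return [posixpath.normpath(p) for p in paths if p.endswith(".jar")]
```

The resulting list could then be fed to whatever copy tool you use between 
the master node and the edge node. Entries that are directories (such as 
the hive conf folder) are filtered out by jar_paths and would need to be 
handled separately.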

--

Best wishes to you ! 
From ：Xiaoxiang Yu

At 2020-04-09 09:22:59, "mvishnubhatta"  
wrote:
>Hi,
>
>I am trying to set up a Kylin server on an AWS EMR edge node. When doing
>that the web URL throws the error that "The origin server did not find a
>current representation for the target resource or is not willing to disclose
>that one exists".
>
>The tomcat localhost log file shows a NoClassDefFoundError for the class
>org/apache/hadoop/hive/metastore/api/NoSuchObjectException 
>
>But I see this class in the hive library
>/usr/lib/hive/lib/hive-metastore-2.3.5-amzn-0.jar and the path is exported
>as HIVE_LIB and as HIVE_HOME (used HIVE_LIB based on suggestion here
>https://issues.apache.org/jira/browse/KYLIN-2511).
>
>This appears to be a case where some CLASSPATH somewhere is not correctly
>including this path.
>
>So, I tried modifying the kylin.properties file to add a property called
>kylin.job.mr.lib.dir pointing to the /usr/lib/hive/lib/ but that didnt help
>either. (This too was based on another suggested workaround here).
>
>Can someone point out what I am missing?
>
>Here are the full details:
>
>Hadoop version is Hadoop 2.8.5-amzn-4
>Kylin version is 3.0.1
>Hive 2.3.5-amzn-0
>
>I have an AWS EMR Cluster and a separate EC2 machine that is set up as an
>edge node. From this edge node I am able to run hive queries to connect to
>the EMR Master node. Hbase also works on this node.
>
>On this edge node, I are trying to start a Kylin server. I followed the
>instructions given here: 
>https://kylin.apache.org/docs/install/kylin_aws_emr.html
><https://kylin.apache.org/docs/install/kylin_aws_emr.html>  
>
>I exported the following variables:
>export KYLIN_HOME=/usr/local/kylin/apache-kylin-3.0.1-bin-hbase1x
>#export HADOOP_HOME=/usr/lib/hadoop/etc/hadoop
>export HADOOP_HOME=/usr/lib/hadoop
>export HBASE_HOME=/usr/lib/hbase/
>export HIVE_HOME=/usr/lib/hive/lib/
>export HADOOP_CONF_DIR=/etc/hadoop/conf
>export HIVE_LIB=/usr/lib/hive/lib
>export HIVE_CONF=/etc/hive/conf
>export HCAT_HOME=/usr/lib/hive-hcatalog
>
>
>The command I run is: $KYLIN_HOME/bin/kylin.sh start
>
>After starting it, I get this message:
>
>/A new Kylin instance is started by hadoop. To stop it, run 'kylin.sh stop'
>Check the log at
>/usr/local/kylin/apache-kylin-3.0.1-bin-hbase1x/logs/kylin.log
>Web UI is at http://ip-x.compute.internal:7070/kylin/
>
>When I navigate to the URL, I get the following error.
>
>"The origin server did not find a current representation for the target
>resource or is not willing to d

RE: Kylin 3.0 - Need some help - Can't open the Kylin UI

2020-04-16 Thread Xiaoxiang Yu

Hi friend,
I can felt your pain, but please be patient. 
In my opinion, to make good use of Apache Kylin, or any other complex 
system, some prerequisites should be met. Firstly, I think understanding 
Kylin's overall architecture, some knowledge of analysing 
logs(kylin.log/kylin.out) of Kylin are very important, only in this way you can 
narrow down the problem you faced.  After problem was identified, if you have 
some knowledge of how to manipulate Hadoop components: such as Apache HBase, 
Apache HDFS and Apache Hive, I think you can fix problem yourself or ask for 
assistance from community. Hortonworks have provided some documentation which 
looks good to 
me(https://docs.cloudera.com/HDPDocuments/HDP3/HDP-3.1.0/starting-hive/content/hive_start_a_command_line_query_locally.html),
 maybe you can have a try.
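
As one way to narrow down a problem from kylin.log or kylin.out, it can help 
to pull out just the exception chain from a long stack trace. The Python 
sketch below is a line-based illustration only (root_cause_chain is a 
made-up name, and messages that wrap across several lines are cut at the 
first line): it collects the 'Exception in thread' line and every 
'Caused by:' line in order, so the last entry is the innermost cause.

```python
def root_cause_chain(log_text):
    """Return the top-level exception followed by each 'Caused by:' entry."""
    chain = []
    for raw in log_text.splitlines():
        line = raw.strip()
        if line.startswith("Caused by:"):
            chain.append(line[len("Caused by:"):].strip())
        elif line.startswith("Exception in thread"):
            # e.g. 'Exception in thread "main" java.lang.Foo: message'
            chain.append(line.split('"', 2)[-1].strip())
    return chain
```

Applied to the trace quoted below, the last entry would be the HBase 
RetriesExhaustedException, whose detail line says "Connection refused"; that 
suggests checking whether HBase is reachable before looking at Kylin itself.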


Besides, have you tried using the Docker image to start Kylin with one 
command? It gives you a learning environment without any Hadoop cluster.  
It works well on my laptop (MacBook Pro, 15-inch, with the latest Docker Desktop 
installed). Here is the link: 
https://hub.docker.com/r/apachekylin/apache-kylin-standalone













--

Best wishes to you ! 
From ：Xiaoxiang Yu





At 2020-04-15 21:46:49, "Phillip Poirier"  wrote:
>Not that misery always loves company, but I had the previous Kylin version
>and was trying to use it with the Hortonworks 2.0 distro, and even though I
>could eventually get the service to start, the UI would never launch.  The
>few people that responded to me told me to try things I already tried (like
>try a different URL or try using different browsers) and it never worked.
>As someone who's spent decades working in BI and OLAP (especially in SSAS,
>Essbase and TM1) I too was very interested in trying Kylin as a big data
>OLAP option and even took the Udemy course on it.  Unfortunately, I just had
>to sit and watch as the course instructor performed cube-building tasks as I
>was never able to get the UI to launch.
>
> 
>
>If you get a real answer to this issue, I'd like to try again.  I have the
>Hortonworks 3.0 distro now, and would like to try the new version of Kylin,
>if possible.  If not, I'll take a look at Druid.
>
> 
>
>Good luck.
>
> 
>
>-Phil
>
> 
>
>From: Rubio Piqueras, David [mailto:david.ru...@gft.com] 
>Sent: Tuesday, April 14, 2020 11:58 AM
>To: dev@kylin.apache.org
>Subject: Kylin 3.0 - Need some help - Can't open the Kylin UI
>
> 
>
>Hi guys,
>
> 
>
>Just starting using Kylin too which looks so interesting.
>
>We downloaded your Kylin 3.0 image version:
>
> 
>
>
>
>However, I can't open the Kylin UI anytime I try to open it. Checking the
>logs, this is what we are seeing:
>
> 
>
>2020-04-14 13:58:23,853 INFO  [main-SendThread(localhost:2181)]
>zookeeper.ClientCnxn:1235 : Session establishment complete on server
>localhost/127.0.0.1:2181, sessionid = 0x17178f6fbd6000a, negotiated timeout
>= 4
>
>Exception in thread "main" java.lang.IllegalArgumentException: Failed to
>find metadata store by url: kylin_metadata@hbase
>
>at
>org.apache.kylin.common.persistence.ResourceStore.createResourceStore(Resour
>ceStore.java:101)
>
>at
>org.apache.kylin.common.persistence.ResourceStore.getStore(ResourceStore.jav
>a:113)
>
>at
>org.apache.kylin.rest.service.AclTableMigrationTool.checkIfNeedMigrate(AclTa
>bleMigrationTool.java:99)
>
>at
>org.apache.kylin.tool.AclTableMigrationCLI.main(AclTableMigrationCLI.java:43
>)
>
>Caused by: java.lang.reflect.InvocationTargetException
>
>at sun.reflect.NativeConstructorAccessorImpl.newInstance0(Native
>Method)
>
>at
>sun.reflect.NativeConstructorAccessorImpl.newInstance(NativeConstructorAcces
>sorImpl.java:62)
>
>at
>sun.reflect.DelegatingConstructorAccessorImpl.newInstance(DelegatingConstruc
>torAccessorImpl.java:45)
>
>at java.lang.reflect.Constructor.newInstance(Constructor.java:423)
>
>at
>org.apache.kylin.common.persistence.ResourceStore.createResourceStore(Resour
>ceStore.java:94)
>
>... 3 more
>
>Caused by: org.apache.hadoop.hbase.client.RetriesExhaustedException: Failed
>after attempts=1, exceptions:
>
>Tue Apr 14 13:58:24 UTC 2020,
>RpcRetryingCaller{globalStartTime=1586872703980, pause=100, retries=1},
>java.net.ConnectException: Connection refused
>
> 
>
>at
>org.apache.hadoop.hbase.client.RpcRetryingCaller.callWithRetries(RpcRetrying
>Caller.java:147)
>
>at
>org.apache.hadoop.hbase.client.ResultBoundedCompletionService$QueueingFuture
>.run(ResultBoundedCompletionService.java:64)
>
>at
>java.util.concurrent.ThreadPoolExecutor.runWorker(Thre

Re:Kylin GUI error "Cannot get HiveTableMeta" on AWS EMR with Glue as hive metastore

2020-04-11 Thread Xiaoxiang Yu

Thank you for the update on this, and I appreciate Kaige's contribution.










--

Best wishes to you ! 
From ：Xiaoxiang Yu





At 2020-04-10 22:38:00, "mvishnubhatta"  
wrote:
>Great. Thank you. I tried out Kaige's suggestion on
>https://issues.apache.org/jira/browse/KYLIN-3685 that you pointed me to and
>that worked for me.
>
>--
>Sent from: http://apache-kylin.74782.x6.nabble.com/

Re:[VOTE] Release apache-kylin-2.6.6 (RC1)

2020-05-11 Thread Xiaoxiang Yu

Hi Kylin team,
I am sad to say that I found one patch that was not included in either of the 
two release candidates, which causes some serious Hive integration errors, for 
example in Glue support. Here is the patch link: 
https://github.com/apache/kylin/pull/1115/ . 
So my vote is: -1.










--

Best wishes to you ! 
From ：Xiaoxiang Yu





At 2020-05-11 20:51:14, "George Ni"  wrote:
>Hi all,
>
>
>
>I have created a build for Apache Kylin 2.6.6, release candidate 1.
>
>
>
>Changes highlights:
>
>[KYLIN-4390] - Update tomcat to 7.0.100
>
>[KYLIN-4426] - Refine CliCommandExecutor
>
>[KYLIN-4206] - Support Glue as Hive Metadata
>
>
>
>Thanks to everyone who has contributed to this release.
>
>Here are the release notes:
>
>https://issues.apache.org/jira/secure/ReleaseNote.jspa?version=12346976==12316121
>
>
>
>The commit to be voted upon:
>https://github.com/apache/kylin/commit/b2f6fea3368f6b75892cf38294c2d18696758fa7
>
>
>Its hash is b2f6fea3368f6b75892cf38294c2d18696758fa7.
>
>
>
>The artifacts to be voted on, including the source package and four
>
>pre-compiled binary packages are located here:
>
>https://dist.apache.org/repos/dist/dev/kylin/apache-kylin-2.6.6-rc1/
>
>
>
>The hashes of the artifacts are as follows:
>
>apache-kylin-2.6.6-source-release.zip.sha256
>1a796080df7bc3e7ed75a2f07a8795de11275686cca8b851f53a3df21f0cb824
>
>apache-kylin-2.6.6-bin-hbase1x.tar.gz.sha256
>39dee0f749cb8d83505ec8d24bbfc1c98681c2cfcc6ca7aec2ec058e20a68e21
>
>apache-kylin-2.6.6-bin-cdh57.tar.gz.sha256
>48169ddbe9ebba1977a490333e6a767c9eb5e172134d1c53cf581cff7064cc98
>
>apache-kylin-2.6.6-bin-hadoop3.tar.gz.sha256
>589dd14afaf5751ee1277d2920748a7000e44ecfef12d21574c3adf83825149b
>
>apache-kylin-2.6.6-bin-cdh60.tar.gz.sha256
>1768b8dc916f35d437c5c6080fa3c9ce4debeb2d3a7a442eace9bbf7a8840c26
>
>
>
>A staged Maven repository is available for review at:
>
>https://repository.apache.org/content/repositories/orgapachekylin-1076/
>
>
>
>Release artifacts are signed with the following key:
>
>https://people.apache.org/keys/committer/nic.asc
>
>
>
>Please vote on releasing this package as Apache Kylin 2.6.6.
>
>
>
>The vote is open for the next 72 hours and passes if a majority of
>
>at least three +1 PMC votes are cast.
>
>
>
>[ ] +1 Release this package as Apache Kylin 2.6.6
>
>[ ]  0 I don't feel strongly about it, but I'm okay with the release
>
>[ ] -1 Do not release this package because...
>
>
>
>
>
>Here is my vote:
>
>
>
>+1 (binding)
>
>--
>
>-
>
>Best regards,
>
>
>
>Ni Chunen / George

Re:[VOTE] Release apache-kylin-3.0.2 (RC2)

2020-05-14 Thread Xiaoxiang Yu

+1.
Maven tests passed, and the happy path passed on CDH 5.7.













--

Best wishes to you ! 
From ：Xiaoxiang Yu





At 2020-05-14 17:11:28, "George Ni"  wrote:
>Hi all,
>
>
>
>I have created a build for Apache Kylin 3.0.2, release candidate 2.
>
>
>
>Changes highlights:
>
>[KYLIN-4390] - Update tomcat to 7.0.100
>
>[KYLIN-4426] - Refine CliCommandExecutor
>
>[KYLIN-4206] - Support Glue as Hive Metadata
>
>
>
>Thanks to everyone who has contributed to this release.
>
>Here are the release notes:
>
>https://issues.apache.org/jira/secure/ReleaseNote.jspa?version=12346975==12316121
>
>
>
>The commit to be voted upon:
>
>https://github.com/apache/kylin/commit/57090efe4bdc079ccfde4f9c8729d69ba3a90624
>
>
>Its hash is 57090efe4bdc079ccfde4f9c8729d69ba3a90624.
>
>
>
>The artifacts to be voted on, including the source package and four
>
>pre-compiled binary packages are located here:
>
>https://dist.apache.org/repos/dist/dev/kylin/apache-kylin-3.0.2-rc2/
>
>
>
>The hashes of the artifacts are as follows:
>
>apache-kylin-3.0.2-source-release.zip.sha256
>1add5892bd1d5994e0e467846e9a844758420f14819ceef63370c07a6aa0b8af
>
>apache-kylin-3.0.2-bin-hbase1x.tar.gz.sha256
>086397d9ecbccf80517977a4b65b660b8e1496ad097d890226bd78a34a9fe190
>
>apache-kylin-3.0.2-bin-cdh57.tar.gz.sha256
>181929fcd35a63a81b6dc097137a3dd1e129fd1f81400e09f64019dcb7ac8a21
>
>apache-kylin-3.0.2-bin-hadoop3.tar.gz.sha256
>c2250734fed971f32d242036a55ba955bcf8de91e0e73704e07cfb09124d9899
>
>apache-kylin-3.0.2-bin-cdh60.tar.gz.sha256
>83a68d2aec32e634475c490434981ebc91e8680dbb6388edc4ed919687ad1dac
>
>
>
>A staged Maven repository is available for review at:
>
>https://repository.apache.org/content/repositories/orgapachekylin-1078/
>
>
>
>Release artifacts are signed with the following key:
>
>https://people.apache.org/keys/committer/nic.asc
>
>
>
>Please vote on releasing this package as Apache Kylin 3.0.2.
>
>
>
>The vote is open for the next 72 hours and passes if a majority of
>
>at least three +1 PMC votes are cast.
>
>
>
>[ ] +1 Release this package as Apache Kylin 3.0.2
>
>[ ]  0 I don't feel strongly about it, but I'm okay with the release
>
>[ ] -1 Do not release this package because...
>
>
>
>
>
>Here is my vote:
>
>
>
>+1 (binding)
>
>--
>
>-
>
>Best regards,
>
>
>
>Ni Chunen / George

Re:[VOTE] Release apache-kylin-2.6.6 (RC2)

2020-05-14 Thread Xiaoxiang Yu

+1.
Maven tests passed, and the happy path passed on CDH 5.7.










--

Best wishes to you ! 
From ：Xiaoxiang Yu





At 2020-05-14 17:06:55, "George Ni"  wrote:
>Hi all,
>
>
>
>I have created a build for Apache Kylin 2.6.6, release candidate 2.
>
>
>
>Changes highlights:
>
>[KYLIN-4390] - Update tomcat to 7.0.100
>
>[KYLIN-4426] - Refine CliCommandExecutor
>
>[KYLIN-4206] - Support Glue as Hive Metadata
>
>
>
>Thanks to everyone who has contributed to this release.
>
>Here are the release notes:
>
>https://issues.apache.org/jira/secure/ReleaseNote.jspa?version=12346976==12316121
>
>
>
>The commit to be voted upon:
>
>https://github.com/apache/kylin/commit/ddd5f8ecd4157b8f889b047e421dd9cfae7e1142
>
>
>Its hash is ddd5f8ecd4157b8f889b047e421dd9cfae7e1142.
>
>
>
>The artifacts to be voted on, including the source package and four
>
>pre-compiled binary packages are located here:
>
>https://dist.apache.org/repos/dist/dev/kylin/apache-kylin-2.6.6-rc2/
>
>
>
>The hashes of the artifacts are as follows:
>
>apache-kylin-2.6.6-source-release.zip.sha256
>6d38671f494e3d5f2bb26dfb94d996a5ceb0c00c2a17b9c181ad853639198d3a
>
>apache-kylin-2.6.6-bin-hbase1x.tar.gz.sha256
>6a42962efbce5a51e2ce4bf8db0b8fa7341ef0b30e4f02e876a5c2fb0500944b
>
>apache-kylin-2.6.6-bin-cdh57.tar.gz.sha256
>85cb22e7d6d9adad214854f9ab285b7d47e874eb9f9df1c5bd01882877171762
>
>apache-kylin-2.6.6-bin-hadoop3.tar.gz.sha256
>f060f8e16f909ae74d9e3c188bb071fcfa87e0a21fd7581fc968e1bcf00e5121
>
>apache-kylin-2.6.6-bin-cdh60.tar.gz.sha256
>8d85a3036d312b47030e3b309af526afe4484720be156f7e3f05e626c02bf531
>
>
>
>A staged Maven repository is available for review at:
>
>https://repository.apache.org/content/repositories/orgapachekylin-1077/
>
>
>
>Release artifacts are signed with the following key:
>
>https://people.apache.org/keys/committer/nic.asc
>
>
>
>Please vote on releasing this package as Apache Kylin 2.6.6.
>
>
>
>The vote is open for the next 72 hours and passes if a majority of
>
>at least three +1 PMC votes are cast.
>
>
>
>[ ] +1 Release this package as Apache Kylin 2.6.6
>
>[ ]  0 I don't feel strongly about it, but I'm okay with the release
>
>[ ] -1 Do not release this package because...
>
>
>
>
>
>Here is my vote:
>
>
>
>+1 (binding)
>
>--
>
>-
>
>Best regards,
>
>
>
>Ni Chunen / George

Re:[VOTE] Release apache-kylin-3.0.2 (RC1)

2020-05-11 Thread Xiaoxiang Yu

Hi Kylin team,
I am sad to say that I found one patch that was not included in either of the 
two release candidates, which causes some serious Hive integration errors, for 
example in Glue support. Here is the patch link: 
https://github.com/apache/kylin/pull/1115/ . 
So my vote is: -1.













--

Best wishes to you ! 
From ：Xiaoxiang Yu





At 2020-05-11 20:36:42, "George Ni"  wrote:
>Hi all,
>
>
>
>I have created a build for Apache Kylin 3.0.2, release candidate 1.
>
>
>
>Changes highlights:
>
>[KYLIN-4390] - Update tomcat to 7.0.100
>
>[KYLIN-4426] - Refine CliCommandExecutor
>
>[KYLIN-4206] - Support Glue as Hive Metadata
>
>
>
>Thanks to everyone who has contributed to this release.
>
>Here are the release notes:
>
>https://issues.apache.org/jira/secure/ReleaseNote.jspa?version=12346975==12316121
>
>
>
>The commit to be voted upon:
>
>https://github.com/apache/kylin/commit/d0d3e124372991331d96f881b2361b865bf4f2d9
>
>
>
>Its hash is d0d3e124372991331d96f881b2361b865bf4f2d9.
>
>
>
>The artifacts to be voted on, including the source package and four
>
>pre-compiled binary packages are located here:
>
>https://dist.apache.org/repos/dist/dev/kylin/apache-kylin-3.0.2-rc1/
>
>
>
>The hashes of the artifacts are as follows:
>
>apache-kylin-3.0.2-source-release.zip.sha256
>7f68ba3d9ffd73cb405cc1bea6b14f988274e851e1be0e93c75682120308c994
>
>apache-kylin-3.0.2-bin-hbase1x.tar.gz.sha256
>3119db9f3fdcf530a031d31f605148da092dc41be8c69194b609766091915ea0
>
>apache-kylin-3.0.2-bin-cdh57.tar.gz.sha256
>7a4f2f1aeb66d68012b42e03f0b92ad7a8abd3d5b6afcfbc9a3c27c9c3e7a219
>
>apache-kylin-3.0.2-bin-hadoop3.tar.gz.sha256
>7678390303e03c98fcf2d1233d21d95ac04d3e36424ba7f2b23155103d089c7b
>
>apache-kylin-3.0.2-bin-cdh60.tar.gz.sha256
>db5998d22679b19cf038b707faf6ad70178742b788eec2dcb215ccfb443c3433
>
>
>
>A staged Maven repository is available for review at:
>
>https://repository.apache.org/content/repositories/orgapachekylin-1076/
>
>
>
>Release artifacts are signed with the following key:
>
>https://people.apache.org/keys/committer/nic.asc
>
>
>
>Please vote on releasing this package as Apache Kylin 3.0.2.
>
>
>
>The vote is open for the next 72 hours and passes if a majority of
>
>at least three +1 PMC votes are cast.
>
>
>
>[ ] +1 Release this package as Apache Kylin 3.0.2
>
>[ ]  0 I don't feel strongly about it, but I'm okay with the release
>
>[ ] -1 Do not release this package because...
>
>
>
>
>
>Here is my vote:
>
>
>
>+1 (binding)
>
>--
>
>-
>
>Best regards,
>
>
>
>Ni Chunen / George

Re: Re: [VOTE] Release apache-kylin-4.0.0-alpha (RC1)

2020-09-08 Thread Xiaoxiang Yu




+1 


mvn test passed on my machine; 
happy path passed on CDH 5.7.












--

Best wishes to you ! 
From ：Xiaoxiang Yu





At 2020-09-09 08:21:57, "恩爸" <441586...@qq.com> wrote:
>+1 from my side. (non-binding)
>
>
>
>
>
>
>
>Best regards,
>Zhichao Zhang
>
>
>
>
>
>
>
>
>
>--Original Message--
>From: "dev"
>Sent: Tuesday, September 8, 2020, 11:49 PM
>To: "dev"
>Subject: [VOTE] Release apache-kylin-4.0.0-alpha (RC1)
>
>
>
>Hi all,
>
>I have created a build for Apache Kylin 4.0.0-alpha, release candidate
>1. Please note, this release is built on kylin-on-parquet-v2 branch.
>
>
>Changes highlights:
>[KYLIN-4213] - The new build engine with Spark-SQL
>[KYLIN-4450] - Add the feature that adjusting spark driver memory adaptively
>[KYLIN-4458] - FilePruner prune shards
>[KYLIN-4475] - Support intersect count for Kylin on Parquet
>[KYLIN-4462] - Support Count Distinct,TopN and Percentile by kylin on Parquet
>[KYLIN-4713] - Support use diff spark schedule pool for diff query
>[KYLIN-4468] - Support Percentile by kylin on Parquet
>[KYLIN-4662] - Migrate from third-party Spark to official Apache Spark
>[KYLIN-4701] - Upgrade front-end from HBase Storage to Parquet Storage
>[KYLIN-4644] - New tool to clean up intermediate files for Kylin 4.0
>[KYLIN-4744] - Add tracking URL for build spark job on yarn
>
>Thanks to everyone who has contributed to this release.
>
>Here are the release notes:
>https://issues.apache.org/jira/secure/ReleaseNote.jspa?projectId=12316121&version=12348093
>
>
>The commit to be voted upon:
>https://github.com/apache/kylin/commit/a285f9a5b84affc36c5466ce5a1b2fcdb4348b37
>
>Its hash is a285f9a5b84affc36c5466ce5a1b2fcdb4348b37.
>
>The artifacts to be voted on, including the source package and two
>pre-compiled binary packages, are located here:
>https://dist.apache.org/repos/dist/dev/kylin/apache-kylin-4.0.0-alpha-rc1/
>
>The hashes of the artifacts are as follows:
>apache-kylin-4.0.0-alpha-source-release.zip.sha256
>f98da070a9839251c8cf3806c274c1191bb11d0b251288b4d2586e034f6ac291
>apache-kylin-4.0.0-alpha-bin-hadoop2.tar.gz.sha256
>8075af2608b62177f04bc5a528194c555959775ed69ae80e5ccaf9a37ec1bf74
>apache-kylin-4.0.0-alpha-bin-cdh57.tar.gz.sha256
>f98da070a9839251c8cf3806c274c1191bb11d0b251288b4d2586e034f6ac291
>
>
>A staged Maven repository is available for review at:
>https://repository.apache.org/content/repositories/orgapachekylin-1081/
>
>Release artifacts are signed with the following key:
>https://people.apache.org/keys/committer/nic.asc
>
>
>Please vote on releasing this package as Apache Kylin 4.0.0-alpha.
>
>
>The vote is open for the next 72 hours and passes if a majority of at
>least three +1 binding votes are cast.
>
>
>[ ] +1 Release this package as Apache Kylin 4.0.0-alpha
>[ ] 0 I don't feel strongly about it, but I'm okay with the release
>[ ] -1 Do not release this package because...
>
>
>Here is my vote:
>
>+1 (binding)
>
>-
>
>Best regards,
>Ni Chunen / George

Re:Kylin v4 query engine tuning

2020-10-07 Thread Xiaoxiang Yu

Hi,
Could you please check this doc: 
https://cwiki.apache.org/confluence/display/KYLIN/Improve+query+performance+by+setting+shard+by+column







--

Best wishes to you ! 
From ：Xiaoxiang Yu





At 2020-10-05 20:11:42, "hubert stefani" wrote:
>Hi,
>We are currently using Kylin v4 on AWS EMR for tests and benchmarks. 
>
>We have successfully optimized the building parameters to speed-up cube 
>building.
>
>We are now searching for tuning tips regarding the Spark query engine 
>(Sparder): changing parameters (memory, CPU) doesn't seem to have any effect 
>on performance (and nothing seems to change in the monitoring views). 
>
>Do you have any recommendations?
>
>Regards, 
>Hubert.

Re: new committer: Rupeng Wang

2020-10-14 Thread Xiaoxiang Yu

Congrats to Rupeng.


敏丞
Email: hit_la...@126.com

Signature customized by NetEase Mail Master

On 10/14/2020 22:25, ShaoFeng Shi wrote:
The Project Management Committee (PMC) for Apache Kylin has invited Rupeng
Wang (王汝鹏, wangrup...@apache.org) to become a committer and we are pleased
to announce that he has accepted.

Being a committer enables easier contribution to the project since there is
no need to go via the patch submission process. This should enable better
productivity.

Congratulations, Rupeng!

Best regards,

Shaofeng Shi 史少锋
Apache Kylin PMC
Email: shaofeng...@apache.org

Apache Kylin FAQ: https://kylin.apache.org/docs/gettingstarted/faq.html
Join Kylin user mail group: user-subscr...@kylin.apache.org
Join Kylin dev mail group: dev-subscr...@kylin.apache.org

[VOTE] Release apache-kylin-3.1.1 (RC1)

2020-10-14 Thread Xiaoxiang Yu

Hi all,

I have created a build for Apache Kylin 3.1.1, release candidate 1.

Changes highlights:

[KYLIN-4612] - Support job status write to kafka
[KYLIN-4712] - Optimize CubeMetaIngester.java CLI
[KYLIN-4657] - dead-loop in org.apache.kylin.engine.mr.common.MapReduceExecutable.doWork
[KYLIN-4688] - Too many tmp files in HDFS tmp directory
[KYLIN-4619] - Make shrunken dict able to coexist with mr-hive global dict

Thanks to everyone who has contributed to this release.

Here are the release notes:
https://issues.apache.org/jira/secure/ReleaseNote.jspa?projectId=12316121=12348354

The commit to be voted upon:
https://github.com/apache/kylin/commit/d8f5b1b40da42401df90f6205e5f650be05c81c4

Its hash is d8f5b1b40da42401df90f6205e5f650be05c81c4.

The artifacts to be voted on, including the source package and four
pre-compiled binary packages, are located here:
https://dist.apache.org/repos/dist/dev/kylin/apache-kylin-3.1.1-rc1/

The hashes of the artifacts are as follows:

apache-kylin-3.1.1-source-release.zip.sha256
1f4e28dd53e2ef72faf40c3313f6a53d61205000250a57658d45800ad243594a

apache-kylin-3.1.1-bin-hbase1x.tar.gz.sha256
23dcc21c3aa3d496afe39749a2e6832e3aeb4cabc83819a283a1468d70248302

apache-kylin-3.1.1-bin-cdh57.tar.gz.sha256
a0d50fb19f11918a9849ab93bd7a6033ae0e8a7fa5ffcfd7c4e8b5889e4b4829

apache-kylin-3.1.1-bin-cdh60.tar.gz.sha256
856cb8e3fbb1a3593121e3ba9c9f5b528ff96d156fd0648fa3ee71804d946283

apache-kylin-3.1.1-bin-hadoop3.tar.gz.sha256
4a0090acaa627e3c2611a1827ab49b822c33a43fc316b26e9efb0a0117031ddf
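
For anyone checking the artifacts before voting, here is a minimal sketch of how a downloaded file could be compared against one of the SHA-256 digests listed above. It is only an illustration: the helper names `sha256_of_file` and `matches_published_digest` and the local file path are assumptions, not part of the release tooling.

```python
import hashlib
from pathlib import Path


def sha256_of_file(path, chunk_size=1 << 20):
    """Return the hex SHA-256 digest of a file, reading it in chunks."""
    digest = hashlib.sha256()
    with open(path, "rb") as fh:
        for chunk in iter(lambda: fh.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def matches_published_digest(path, published_hex):
    """Compare a file's digest with a published one, ignoring case and surrounding whitespace."""
    return sha256_of_file(path) == published_hex.strip().lower()


if __name__ == "__main__":
    # Hypothetical local path; the digest is the one listed for the source release.
    artifact = Path("apache-kylin-3.1.1-source-release.zip")
    expected = "1f4e28dd53e2ef72faf40c3313f6a53d61205000250a57658d45800ad243594a"
    if artifact.exists():
        print("OK" if matches_published_digest(artifact, expected) else "MISMATCH")
    else:
        print(f"{artifact} not found; download it first")
```

A digest match only shows that the download is intact; checking the signature against the release key is a separate step.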

 

 

A staged Maven repository is available for review at:
https://repository.apache.org/content/repositories/orgapachekylin-1083/

Release artifacts are signed with the following key:
https://people.apache.org/keys/committer/xxyu.asc

Please vote on releasing this package as Apache Kylin 3.1.1.

The vote is open for the next 72 hours and passes if a majority of
at least three +1 binding votes are cast.

[ ] +1 Release this package as Apache Kylin 3.1.1
[ ]  0 I don't feel strongly about it, but I'm okay with the release
[ ] -1 Do not release this package because...

Here is my vote:

+1 (binding)




--

Best wishes to you ! 
From ：Xiaoxiang Yu

[RESULT][VOTE] Release apache-kylin-3.1.1 (RC1)

2020-10-18 Thread Xiaoxiang Yu

Thanks to everyone who has tested the release candidate and given their 
comments and votes.

The tally is as follows.

3 binding +1s:
Chunen Ni
Shaofeng Shi
Xiaoxiang Yu

6 non-binding +1s:
Yaqian Zhang
Rupeng Wang
Chuxiao
Zhichao Zhang
Chao Long
Johnson

No 0s or -1s.

Therefore I am delighted to announce that the proposal to release
Apache-Kylin-3.1.1 has passed.
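
As a small illustration of how this tally maps onto the rule stated in the vote mail ("a majority of at least three +1 binding votes"), here is a sketch. The function `release_vote_passes` is hypothetical, and its reading of the rule (at least three binding +1 votes, and more binding +1 than binding -1) is an assumption rather than an official definition.

```python
def release_vote_passes(binding_plus, binding_minus):
    """Assumed reading of the rule: >= 3 binding +1 votes and more binding +1 than binding -1."""
    return binding_plus >= 3 and binding_plus > binding_minus


# Names taken from the tally in this mail.
binding_plus_ones = ["Chunen Ni", "Shaofeng Shi", "Xiaoxiang Yu"]
non_binding_plus_ones = [
    "Yaqian Zhang", "Rupeng Wang", "Chuxiao",
    "Zhichao Zhang", "Chao Long", "Johnson",
]
binding_minus_ones = []  # "No 0s or -1s."

passed = release_vote_passes(len(binding_plus_ones), len(binding_minus_ones))
print(
    f"binding +1: {len(binding_plus_ones)}, "
    f"non-binding +1: {len(non_binding_plus_ones)}, passed: {passed}"
)
```

Non-binding votes are counted for information only in this sketch; they do not affect the result.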
















--

Best wishes to you ! 
From ：Xiaoxiang Yu

Re:Query over Rest API results in bad query

2020-10-18 Thread Xiaoxiang Yu

Hello, 
What kind of exception/error did you meet? Could you please share some more 
of the log with us? 
   Actually, "org.apache.kylin.rest.service.BadQueryDetector - Detect bad 
query." does not provide enough information for cause analysis.
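
One thing that may be worth checking in the request quoted below: the "sql" string does not seem to have a closing quote after "asc", so the body may not be valid JSON, and "project" is sent as null. A minimal sketch of building the body with a JSON library instead of by hand follows. The project name `my_project` is a placeholder, and writing the aliases without backticks is an assumption about what the SQL parser accepts, not something confirmed in this thread.

```python
import json

# The statement from the quoted request, with aliases written without backticks
# (an assumption; adjust if your setup accepts another quoting style).
sql = (
    "select process, avg(w) as avg_w, sum(vol) as sum_vol "
    "from WHY_XYZ "
    "where id in ('BLA') "
    "and balance_date between '2020-03-06' and '2020-09-06' "
    "group by process order by process asc"
)

payload = {
    "sql": sql,
    "offset": 0,
    "limit": 1000,
    "acceptPartial": False,
    "project": "my_project",  # hypothetical project name; the quoted request sends null
}

# json.dumps takes care of quoting, so the "sql" value is always a closed string.
body = json.dumps(payload)
print(body)
```

If the query still fails with a well-formed body, the full exception from kylin.log would help to narrow it down.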













--

Best wishes to you ! 
From ：Xiaoxiang Yu





At 2020-10-17 22:45:34, "Aditya Rohilla"  wrote:
>Hey,
>
>
>
>I am trying to send this query :
>
>
>
>{ "sql": "select process, avg(w) as `avg_w`, sum(vol) as `sum_vol` from
>WHY_XYZ where id in ('BLA') and balance_date between '2020-03-06' and
>'2020-09-06' group by process order by process asc, "offset": 0, "limit":
>1000, "acceptPartial": false, "project": null }
>
>
>
>And it results in bad query as shown in the logs:
>
>
>
>2020-10-15T22:29:37,092 INFO [BadQueryDetector]
>org.apache.kylin.rest.service.BadQueryDetector - Detect bad query.
>
>
>Any idea how I should send the sql query?
>
>I am using mysql as sqldialect.
>
>
>
>Thanks and Regards,
>
>Aditya Rohilla

[SECURITY][CVE-2020-13937] Unauthenticated Configuration Disclosure

2020-10-19 Thread Xiaoxiang Yu

Versions Affected:

Kylin 2.0.0, 2.1.0, 2.2.0, 2.3.0, 2.3.1, 2.3.2, 2.4.0, 2.4.1, 2.5.0, 2.5.1, 
2.5.2, 2.6.0, 2.6.1, 2.6.2, 2.6.3, 2.6.4, 2.6.5, 2.6.6, 3.0.0-alpha, 
3.0.0-alpha2, 3.0.0-beta, 3.0.0, 3.0.1, 3.0.2, 3.1.0, 4.0.0-alpha.




Description:

Kylin has a RESTful API that exposes Kylin's configuration information 
without any authentication. This is dangerous because confidential 
configuration entries are disclosed to everyone.




Mitigation:

Users can edit "$KYLIN_HOME/WEB-INF/classes/kylinSecurity.xml" and remove 
this line: "". After that, restart all Kylin instances for the change to 
take effect.

Alternatively, you can upgrade Kylin to 3.1.1.
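
To check quickly whether a deployment is in scope, the affected-version list above can be turned into a lookup. This is only a sketch: the helper `is_affected` is hypothetical, and it does an exact string match against the versions named in this advisory, nothing more.

```python
# Versions copied from the "Versions Affected" list in this advisory.
AFFECTED_VERSIONS = {
    "2.0.0", "2.1.0", "2.2.0", "2.3.0", "2.3.1", "2.3.2", "2.4.0", "2.4.1",
    "2.5.0", "2.5.1", "2.5.2", "2.6.0", "2.6.1", "2.6.2", "2.6.3", "2.6.4",
    "2.6.5", "2.6.6", "3.0.0-alpha", "3.0.0-alpha2", "3.0.0-beta", "3.0.0",
    "3.0.1", "3.0.2", "3.1.0", "4.0.0-alpha",
}


def is_affected(version):
    """Return True if the exact version string appears in the advisory's list."""
    return version.strip() in AFFECTED_VERSIONS


for candidate in ("3.1.0", "3.1.1"):
    print(candidate, "affected" if is_affected(candidate) else "not listed")
```

A version that is not listed is simply not named in this advisory; the sketch makes no claim about versions outside the list.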




Credit:

This issue was discovered by Ngo Wei Lin (@Creastery) of STAR Labs 
(@starlabs_sg).

--

Best wishes to you ! 
From ：Xiaoxiang Yu

Re: Failed to enable Real-time OLAP cube（SSL configuration not work）

2020-10-20 Thread Xiaoxiang Yu

After some research, I suggest adding 
"kylin.source.kafka.config-override.xxx" entries for the consumer-related 
properties in the cube-level configuration and trying again.
If you have some time, you can check the source code at 
https://github.com/apache/kylin/blob/fb4cdb32828a6508dcb8fd2cd953762dbd8a7e02/stream-source-kafka/src/main/java/org/apache/kylin/stream/source/kafka/KafkaSource.java#L75
to confirm.
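
As a rough sketch of what such entries could look like, the snippet below prefixes a few standard Kafka consumer TLS settings with "kylin.source.kafka.config-override." and prints them as key=value lines. The property selection, the truststore path, and the password are placeholders and assumptions; the helper `to_cube_overrides` is hypothetical and only illustrates the naming scheme.

```python
OVERRIDE_PREFIX = "kylin.source.kafka.config-override."

# Standard Kafka consumer settings for a TLS listener; the values are placeholders.
consumer_properties = {
    "security.protocol": "SSL",
    "ssl.truststore.location": "/path/to/client.truststore.jks",
    "ssl.truststore.password": "changeit",
}


def to_cube_overrides(properties, prefix=OVERRIDE_PREFIX):
    """Prefix each Kafka consumer property so it can be entered as a cube-level override."""
    return {prefix + key: value for key, value in properties.items()}


for key, value in to_cube_overrides(consumer_properties).items():
    print(f"{key}={value}")
```

The printed lines are meant to be entered in the cube's configuration overrides; which properties are actually needed depends on how the brokers on port 9094 are set up.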

--

Best wishes to you ! 
From ：Xiaoxiang Yu





At 2020-10-20 15:10:06, "Xiaoxiang Yu" wrote:
>Hi,
>1. From "Caused by: org.apache.kafka.common.errors.TimeoutException: 
> Timeout expired while fetching topic metadata" in your log files, I guess 
> the Kafka broker list is not configured correctly. Please
>double-check this doc: 
>http://kylin.apache.org/docs/tutorial/realtime_olap.html .
>2. kylin-kafka-consumer.xml won't take effect for Real-time OLAP because it 
> is used for NRT ( http://kylin.apache.org/docs/tutorial/cube_streaming.html ).
>
>--
>
>Best wishes to you ! 
>From ：Xiaoxiang Yu
>
>
>At 2020-10-20 14:18:58, "张敏" wrote:
>
>Hi,
>Sorry about the screenshots.
>I attached kylin-kafka-consumer.xml and the Kylin instance log kylin.out. 
> The receiver instances actually have no logs.
>Thanks a lot.
>
>zhang_...@startimes.com.cn
>Signature customized by NetEase Mail Master
>On 10/20/2020 12:06, Xiaoxiang Yu wrote:
>Hi,
>I cannot see your screenshots. Maybe you can share your error 
> messages/exceptions in plain text format. I would prefer that you upload 
> logs from both the Kylin instance and the receiver instances.
>
>--
>
>Best wishes to you ! 
>From ：Xiaoxiang Yu
>
>
>At 2020-10-20 10:57:18, "张敏" wrote:
>
>Encryption between clients and brokers in my Kafka cluster:
>- port 9092: Plaintext
>- port 9094: TLS
>
>Testing with kafka-console-producer.sh and kafka-console-consumer.sh works 
>fine on the machine where Kylin is located.
>
>It also works when I specify port 9092 in Kylin, as in the cube 
>kylin_test_cube, with no other configuration.
>
>But enabling kylin_ssl_test_cube, which uses SSL encryption (port 9094), 
>fails, even though I did update conf/kylin-kafka-consumer.xml.
>
>Did I make any wrong configuration? Thank you.
>Signature customized by NetEase Mail Master

Re: Failed to enable Real-time OLAP cube（SSL configuration not work）

2020-10-20 Thread Xiaoxiang Yu

Hi,
1. From "Caused by: org.apache.kafka.common.errors.TimeoutException: 
Timeout expired while fetching topic metadata" in your log files, I guess 
the Kafka broker list is not configured correctly. Please
double-check this doc: 
http://kylin.apache.org/docs/tutorial/realtime_olap.html .
2. kylin-kafka-consumer.xml won't take effect for Real-time OLAP because it 
is used for NRT ( http://kylin.apache.org/docs/tutorial/cube_streaming.html ).

--

Best wishes to you ! 
From ：Xiaoxiang Yu




At 2020-10-20 14:18:58, "张敏" wrote:

Hi,
Sorry about the screenshots.
I attached kylin-kafka-consumer.xml and the Kylin instance log kylin.out. 
The receiver instances actually have no logs.
Thanks a lot.

zhang_...@startimes.com.cn
Signature customized by NetEase Mail Master
On 10/20/2020 12:06, Xiaoxiang Yu wrote:
Hi,
I cannot see your screenshots. Maybe you can share your error 
messages/exceptions in plain text format. I would prefer that you upload 
logs from both the Kylin instance and the receiver instances.

--

Best wishes to you ! 
From ：Xiaoxiang Yu


At 2020-10-20 10:57:18, "张敏" wrote:

Encryption between clients and brokers in my Kafka cluster:
- port 9092: Plaintext
- port 9094: TLS

Testing with kafka-console-producer.sh and kafka-console-consumer.sh works 
fine on the machine where Kylin is located.

It also works when I specify port 9092 in Kylin, as in the cube 
kylin_test_cube, with no other configuration.

But enabling kylin_ssl_test_cube, which uses SSL encryption (port 9094), 
fails, even though I did update conf/kylin-kafka-consumer.xml.

Did I make any wrong configuration? Thank you.
Signature customized by NetEase Mail Master

Re: Failed to enable Real-time OLAP cube（SSL configuration not work）

2020-10-20 Thread Xiaoxiang Yu

Glad to hear that. Thanks for the update.

--

Best wishes to you ! 
From ：Xiaoxiang Yu




At 2020-10-20 17:30:00, "张敏" wrote:

It works! Thank you : )

张敏
zhang_...@startimes.com.cn
Signature customized by NetEase Mail Master
On 10/20/2020 16:37, Xiaoxiang Yu wrote:
After some research, I suggest adding 
"kylin.source.kafka.config-override.xxx" entries for the consumer-related 
properties in the cube-level configuration and trying again.
If you have some time, you can check the source code at 
https://github.com/apache/kylin/blob/fb4cdb32828a6508dcb8fd2cd953762dbd8a7e02/stream-source-kafka/src/main/java/org/apache/kylin/stream/source/kafka/KafkaSource.java#L75
to confirm.

--

Best wishes to you ! 
From ：Xiaoxiang Yu





At 2020-10-20 15:10:06, "Xiaoxiang Yu" wrote:
>Hi,
>1. From "Caused by: org.apache.kafka.common.errors.TimeoutException: 
> Timeout expired while fetching topic metadata" in your log files, I guess 
> the Kafka broker list is not configured correctly. Please
>double-check this doc: 
>http://kylin.apache.org/docs/tutorial/realtime_olap.html .
>2. kylin-kafka-consumer.xml won't take effect for Real-time OLAP because it 
> is used for NRT ( http://kylin.apache.org/docs/tutorial/cube_streaming.html ).
>
>--
>
>Best wishes to you ! 
>From ：Xiaoxiang Yu
>
>
>At 2020-10-20 14:18:58, "张敏" wrote:
>
>Hi,
>Sorry about the screenshots.
>I attached kylin-kafka-consumer.xml and the Kylin instance log kylin.out. 
> The receiver instances actually have no logs.
>Thanks a lot.
>
>zhang_...@startimes.com.cn
>Signature customized by NetEase Mail Master
>On 10/20/2020 12:06, Xiaoxiang Yu wrote:
>Hi,
>I cannot see your screenshots. Maybe you can share your error 
> messages/exceptions in plain text format. I would prefer that you upload 
> logs from both the Kylin instance and the receiver instances.
>
>--
>
>Best wishes to you ! 
>From ：Xiaoxiang Yu
>
>
>At 2020-10-20 10:57:18, "张敏" wrote:
>
>Encryption between clients and brokers in my Kafka cluster:
>- port 9092: Plaintext
>- port 9094: TLS
>
>Testing with kafka-console-producer.sh and kafka-console-consumer.sh works 
>fine on the machine where Kylin is located.
>
>It also works when I specify port 9092 in Kylin, as in the cube 
>kylin_test_cube, with no other configuration.
>
>But enabling kylin_ssl_test_cube, which uses SSL encryption (port 9094), 
>fails, even though I did update conf/kylin-kafka-consumer.xml.
>
>Did I make any wrong configuration? Thank you.
>Signature customized by NetEase Mail Master

Re: Hello, how do I build the v3.1.0-cdh6.0/cdh6.1 version?

2020-10-14 Thread Xiaoxiang Yu

Hi,
Please check out the "master-hadoop3" branch and use 
"build/script/package.sh -P cdh60" to build the Kylin package.













--

Best wishes to you ! 
From ：Xiaoxiang Yu





At 2020-10-14 11:18:48, "李 小元" wrote:
>Which branch should I switch to?
>
>Can I use the IDEA Maven package command, or should I use the commands below?
>
>
>
>cd kylin
>
>build/script/package.sh -P cdh5.7
>
>cd kylin
>
>build/script/package.sh -P cdh6.0
>
>
>Thank you!!!
>
>Sent from Mail for Windows 10 <https://go.microsoft.com/fwlink/?LinkId=550986>
>

[Announce] Apache Kylin 3.1.1 released

2020-10-18 Thread Xiaoxiang Yu

The Apache Kylin team is pleased to announce the immediate availability of
the 3.1.1 release.

This is a bugfix release after 3.1.0, with 21 bug fixes and 37 enhancements.
All of the changes in this release can be found in:
https://kylin.apache.org/docs/release_notes.html

You can download the source release and binary packages from Apache Kylin's
download page: https://kylin.apache.org/download/

Apache Kylin is an open-source Distributed Analytical Data Warehouse for
Big Data; it was designed to provide OLAP (Online Analytical Processing)
capability in the big data era. By renovating the multi-dimensional cube
and precalculation technology on Hadoop and Spark, Kylin is able to achieve
near-constant query speed regardless of the ever-growing data volume.
Reducing query latency from minutes to sub-second, Kylin brings online
analytics back to big data.

Apache Kylin lets you query billions of rows at sub-second latency in 3
steps:
1. Identify a Star/Snowflake Schema on Hadoop.
2. Build Cube from the identified tables.
3. Query using ANSI-SQL and get results in sub-second, via ODBC, JDBC or
RESTful API.

Thanks to everyone who has contributed to this release.

We welcome your help and feedback. For more information on how to report
problems, and to get involved, visit the project website at
https://kylin.apache.org/

--

Best wishes to you ! 
From ：Xiaoxiang Yu
