[jira] [Created] (KYLIN-3886) Missing argument for options for yarn command

2019-03-17 Thread Liu Shaohui (JIRA)
Liu Shaohui created KYLIN-3886:
--

 Summary:  Missing argument for options for yarn command
 Key: KYLIN-3886
 URL: https://issues.apache.org/jira/browse/KYLIN-3886
 Project: Kylin
  Issue Type: Bug
Reporter: Liu Shaohui


2019-03-13 11:48:08,604 INFO  [Scheduler 542945608 Job f918877a-deb0-704c-ec6f-82f33f5e39a5-323] spark.SparkExecutable:38 : Missing argument for options
2019-03-13 11:48:08,606 INFO  [Scheduler 542945608 Job f918877a-deb0-704c-ec6f-82f33f5e39a5-323] spark.SparkExecutable:38 : usage: application
2019-03-13 11:48:08,606 INFO  [Scheduler 542945608 Job f918877a-deb0-704c-ec6f-82f33f5e39a5-323] spark.SparkExecutable:38 :  -appStates  Works with -list to filter applications
2019-03-13 11:48:08,606 INFO  [Scheduler 542945608 Job f918877a-deb0-704c-ec6f-82f33f5e39a5-323] spark.SparkExecutable:38 :   based on input comma-separated list of
2019-03-13 11:48:08,606 INFO  [Scheduler 542945608 Job f918877a-deb0-704c-ec6f-82f33f5e39a5-323] spark.SparkExecutable:38 :   application states. The valid application
2019-03-13 11:48:08,606 INFO  [Scheduler 542945608 Job f918877a-deb0-704c-ec6f-82f33f5e39a5-323] spark.SparkExecutable:38 :   state can be one of the following:
2019-03-13 11:48:08,606 INFO  [Scheduler 542945608 Job f918877a-deb0-704c-ec6f-82f33f5e39a5-323] spark.SparkExecutable:38 :   ALL,NEW,NEW_SAVING,SUBMITTED,ACCEPTED,RUN
2019-03-13 11:48:08,606 INFO  [Scheduler 542945608 Job f918877a-deb0-704c-ec6f-82f33f5e39a5-323] spark.SparkExecutable:38 :   NING,FINISHED,FAILED,KILLED
2019-03-13 11:48:08,606 INFO  [Scheduler 542945608 Job f918877a-deb0-704c-ec6f-82f33f5e39a5-323] spark.SparkExecutable:38 :  -appTypes    Works with -list to filter applications
2019-03-13 11:48:08,606 INFO  [Scheduler 542945608 Job f918877a-deb0-704c-ec6f-82f33f5e39a5-323] spark.SparkExecutable:38 :   based on input comma-separated list of
2019-03-13 11:48:08,607 INFO  [Scheduler 542945608 Job f918877a-deb0-704c-ec6f-82f33f5e39a5-323] spark.SparkExecutable:38 :   application types.
2019-03-13 11:48:08,607 INFO  [Scheduler 542945608 Job f918877a-deb0-704c-ec6f-82f33f5e39a5-323] spark.SparkExecutable:38 :  -help   Displays help for all commands.
2019-03-13 11:48:08,607 INFO  [Scheduler 542945608 Job f918877a-deb0-704c-ec6f-82f33f5e39a5-323] spark.SparkExecutable:38 :  -kill   Kills the application.
2019-03-13 11:48:08,607 INFO  [Scheduler 542945608 Job f918877a-deb0-704c-ec6f-82f33f5e39a5-323] spark.SparkExecutable:38 :  -list   List applications. Supports optional use
2019-03-13 11:48:08,607 INFO  [Scheduler 542945608 Job f918877a-deb0-704c-ec6f-82f33f5e39a5-323] spark.SparkExecutable:38 :   of -appTypes to filter applications based



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Created] (KYLIN-3885) Build dimension dictionary job costs too long when using Spark fact distinct

2019-03-17 Thread Liu Shaohui (JIRA)
Liu Shaohui created KYLIN-3885:
--

 Summary: Build dimension dictionary job costs too long when using 
Spark fact distinct
 Key: KYLIN-3885
 URL: https://issues.apache.org/jira/browse/KYLIN-3885
 Project: Kylin
  Issue Type: Bug
Reporter: Liu Shaohui


The build dimension dictionary job takes less than 20 minutes when using MapReduce 
fact distinct, but more than 3 hours when using Spark fact distinct.
{code:java}
"Scheduler 542945608 Job 05c62aca-853f-396e-9653-f20c9ebd8ebc-329" #329 prio=5 os_prio=0 tid=0x7f312109c800 nid=0x2dc0b in Object.wait() [0x7f30d8d24000]
   java.lang.Thread.State: WAITING (on object monitor)
    at java.lang.Object.wait(Native Method)
    at java.lang.Object.wait(Object.java:502)
    at org.apache.hadoop.ipc.Client.call(Client.java:1482)
    - locked <0x0005c3110fc0> (a org.apache.hadoop.ipc.Client$Call)
    at org.apache.hadoop.ipc.Client.call(Client.java:1427)
    at org.apache.hadoop.ipc.ProtobufRpcEngine$Invoker.invoke(ProtobufRpcEngine.java:232)
    at com.sun.proxy.$Proxy33.delete(Unknown Source)
    at org.apache.hadoop.hdfs.protocolPB.ClientNamenodeProtocolTranslatorPB.delete(ClientNamenodeProtocolTranslatorPB.java:573)
    at sun.reflect.GeneratedMethodAccessor193.invoke(Unknown Source)
    at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
    at java.lang.reflect.Method.invoke(Method.java:498)
    at org.apache.hadoop.io.retry.RetryInvocationHandler.invokeMethod(RetryInvocationHandler.java:249)
    at org.apache.hadoop.io.retry.RetryInvocationHandler.invoke(RetryInvocationHandler.java:107)
    at com.sun.proxy.$Proxy34.delete(Unknown Source)
    at org.apache.hadoop.hdfs.DFSClient.delete(DFSClient.java:2057)
    at org.apache.hadoop.hdfs.DistributedFileSystem$13.doCall(DistributedFileSystem.java:682)
    at org.apache.hadoop.hdfs.DistributedFileSystem$13.doCall(DistributedFileSystem.java:675)
    at org.apache.hadoop.fs.FileSystemLinkResolver.resolve(FileSystemLinkResolver.java:81)
    at org.apache.hadoop.hdfs.DistributedFileSystem.delete(DistributedFileSystem.java:696)
    at org.apache.hadoop.fs.FilterFileSystem.delete(FilterFileSystem.java:232)
    at org.apache.hadoop.fs.viewfs.ChRootedFileSystem.delete(ChRootedFileSystem.java:198)
    at org.apache.hadoop.fs.viewfs.ViewFileSystem.delete(ViewFileSystem.java:334)
    at org.apache.hadoop.hdfs.FederatedDFSFileSystem.delete(FederatedDFSFileSystem.java:232)
    at org.apache.kylin.dict.global.GlobalDictHDFSStore.deleteSlice(GlobalDictHDFSStore.java:211)
    at org.apache.kylin.dict.global.AppendTrieDictionaryBuilder.flushCurrentNode(AppendTrieDictionaryBuilder.java:137)
    at org.apache.kylin.dict.global.AppendTrieDictionaryBuilder.addValue(AppendTrieDictionaryBuilder.java:97)
    at org.apache.kylin.dict.GlobalDictionaryBuilder.addValue(GlobalDictionaryBuilder.java:85)
    at org.apache.kylin.dict.DictionaryGenerator.buildDictionary(DictionaryGenerator.java:82)
    at org.apache.kylin.dict.DictionaryManager.buildDictFromReadableTable(DictionaryManager.java:303)
    at org.apache.kylin.dict.DictionaryManager.buildDictionary(DictionaryManager.java:290)
    at org.apache.kylin.cube.CubeManager$DictionaryAssist.buildDictionary(CubeManager.java:1043)
    at org.apache.kylin.cube.CubeManager.buildDictionary(CubeManager.java:1012)
    at org.apache.kylin.cube.cli.DictionaryGeneratorCLI.processSegment(DictionaryGeneratorCLI.java:72)
    at org.apache.kylin.cube.cli.DictionaryGeneratorCLI.processSegment(DictionaryGeneratorCLI.java:50)
    at org.apache.kylin.engine.mr.steps.CreateDictionaryJob.run(CreateDictionaryJob.java:73)
    at org.apache.kylin.engine.mr.MRUtil.runMRJob(MRUtil.java:92)
    at org.apache.kylin.engine.mr.common.HadoopShellExecutable.doWork(HadoopShellExecutable.java:63)
    at org.apache.kylin.job.execution.AbstractExecutable.execute(AbstractExecutable.java:178)
    at org.apache.kylin.job.execution.DefaultChainedExecutable.doWork(DefaultChainedExecutable.java:71)
    at org.apache.kylin.job.execution.AbstractExecutable.execute(AbstractExecutable.java:178)
    at org.apache.kylin.job.impl.threadpool.DefaultScheduler$JobRunner.run(DefaultScheduler.java:114)
    at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1149)
    at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:624)
    at java.lang.Thread.run(Thread.java:748){code}





[jira] [Created] (KYLIN-3883) Kylin supports column count aggregation

2019-03-17 Thread xiaodongzhang (JIRA)
xiaodongzhang created KYLIN-3883:


 Summary: Kylin supports column count aggregation
 Key: KYLIN-3883
 URL: https://issues.apache.org/jira/browse/KYLIN-3883
 Project: Kylin
  Issue Type: New Feature
  Components: Job Engine
Affects Versions: all
Reporter: xiaodongzhang
Assignee: xiaodongzhang
 Fix For: v3.0.0


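Kylin currently supports count pre-aggregation only over the constant 1, i.e. count(1). Queries on count(col_1) are all rewritten internally into queries on count(1). This causes a problem: when the col_1 column contains null values, the result of count(col_1) is not accurate, so Kylin's query results are inconsistent with those of Hive, Spark, and others. This patch adds support for count(col_1).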
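
To make the discrepancy concrete, here is a minimal, self-contained Java sketch of the two counting semantics. It is not Kylin code; the class and method names are hypothetical and only mirror standard SQL behaviour, where count(1) counts every row and count(col_1) skips rows in which col_1 is null.

```java
import java.util.Arrays;
import java.util.List;
import java.util.Objects;

// Hypothetical illustration only; not part of Kylin.
final class CountSemantics {
    // COUNT(1): every row is counted, whatever the column holds.
    static long countAll(List<String> column) {
        return column.size();
    }

    // COUNT(col): only rows where the column is not NULL are counted.
    static long countColumn(List<String> column) {
        return column.stream().filter(Objects::nonNull).count();
    }

    public static void main(String[] args) {
        // col_1 with two null values out of five rows
        List<String> col1 = Arrays.asList("a", null, "b", null, "c");
        System.out.println("count(1)     = " + countAll(col1));
        System.out.println("count(col_1) = " + countColumn(col1));
    }
}
```

With any null in the column the two results differ, which is the inconsistency the issue describes when count(col_1) is answered from count(1).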





[jira] [Created] (KYLIN-3884) loading hfile to HBase failed for temporary dir in output path

2019-03-17 Thread Liu Shaohui (JIRA)
Liu Shaohui created KYLIN-3884:
--

 Summary: loading hfile  to HBase failed for temporary dir in 
output path
 Key: KYLIN-3884
 URL: https://issues.apache.org/jira/browse/KYLIN-3884
 Project: Kylin
  Issue Type: Bug
Reporter: Liu Shaohui


{code:java}
2019-03-14 20:18:46,591 DEBUG [Scheduler 2084224398 Job 
e48de76a-6e16-309f-a3a5-191c04071072-131] steps.BulkLoadJob:77 : Start to run 
LoadIncrementalHFiles
2019-03-14 20:18:46,642 WARN  [Scheduler 2084224398 Job 
e48de76a-6e16-309f-a3a5-191c04071072-131] mapreduce.LoadIncrementalHFiles:197 : 
Skipping non-directory 
hdfs://zjyprc-xiaomi/user/s_kylin/kylin_zjyprc_bigdata_staging/kylin_zjyprc_bigdata_staging-kylin_metadata/kylin-e48de76a-6e16-309f-a3a5-191c04071072/total_user_cube/hfile/_SUCCESS
2019-03-14 20:18:46,650 ERROR [Scheduler 2084224398 Job 
e48de76a-6e16-309f-a3a5-191c04071072-131] mapreduce.LoadIncrementalHFiles:352 : 
-
  
hdfs://zjyprc-xiaomi/user/s_kylin/kylin_zjyprc_bigdata_staging/kylin_zjyprc_bigdata_staging-kylin_metadata/kylin-e48de76a-6e16-309f-a3a5-191c04071072/total_user_cube/hfile/F1/4170d772384144848c1c10cba66152c3
  
hdfs://zjyprc-xiaomi/user/s_kylin/kylin_zjyprc_bigdata_staging/kylin_zjyprc_bigdata_staging-kylin_metadata/kylin-e48de76a-6e16-309f-a3a5-191c04071072/total_user_cube/hfile/F1/50ec331ff3c648e3b6e4f54a7b1fe7e9
  
hdfs://zjyprc-xiaomi/user/s_kylin/kylin_zjyprc_bigdata_staging/kylin_zjyprc_bigdata_staging-kylin_metadata/kylin-e48de76a-6e16-309f-a3a5-191c04071072/total_user_cube/hfile/F1/703ade3b535b4fedab39ee183e22aa7c
  
hdfs://zjyprc-xiaomi/user/s_kylin/kylin_zjyprc_bigdata_staging/kylin_zjyprc_bigdata_staging-kylin_metadata/kylin-e48de76a-6e16-309f-a3a5-191c04071072/total_user_cube/hfile/F1/82019f8ca00a4f16b9d2b45356a55a3a
  
hdfs://zjyprc-xiaomi/user/s_kylin/kylin_zjyprc_bigdata_staging/kylin_zjyprc_bigdata_staging-kylin_metadata/kylin-e48de76a-6e16-309f-a3a5-191c04071072/total_user_cube/hfile/F1/8cc8844bced24cb88fda52fecc7224d5
  
hdfs://zjyprc-xiaomi/user/s_kylin/kylin_zjyprc_bigdata_staging/kylin_zjyprc_bigdata_staging-kylin_metadata/kylin-e48de76a-6e16-309f-a3a5-191c04071072/total_user_cube/hfile/F1/cbac78e0c6d74b5c96a7b64f99e0d0b3
  
hdfs://zjyprc-xiaomi/user/s_kylin/kylin_zjyprc_bigdata_staging/kylin_zjyprc_bigdata_staging-kylin_metadata/kylin-e48de76a-6e16-309f-a3a5-191c04071072/total_user_cube/hfile/F1/e3844766a4d0486d89f287450034f378
  
hdfs://zjyprc-xiaomi/user/s_kylin/kylin_zjyprc_bigdata_staging/kylin_zjyprc_bigdata_staging-kylin_metadata/kylin-e48de76a-6e16-309f-a3a5-191c04071072/total_user_cube/hfile/_temporary/0
2019-03-14 20:18:46,651 ERROR [Scheduler 2084224398 Job 
e48de76a-6e16-309f-a3a5-191c04071072-131] common.HadoopShellExecutable:65 : 
error execute HadoopShellExecutable{id=e48de76a-6e16-309f-a3a5-191c04071072-08, 
name=Load HFile to HBase Table, state=RUNNING}
java.io.FileNotFoundException: Path is not a file: 
/user/s_kylin/kylin_zjyprc_bigdata_staging/kylin_zjyprc_bigdata_staging-kylin_metadata/kylin-e48de76a-6e16-309f-a3a5-191c04071072/total_user_cube/hfile/_temporary/0
Caused by: 
org.apache.hadoop.ipc.RemoteException(java.io.FileNotFoundException): Path is 
not a file: 
/user/s_kylin/kylin_zjyprc_bigdata_staging/kylin_zjyprc_bigdata_staging-kylin_metadata/kylin-e48de76a-6e16-309f-a3a5-191c04071072/total_user_cube/hfile/_temporary/0{code}
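
The log above shows the bulk load failing on the hfile/_temporary/0 entry left in the output path. One possible guard, shown here only as a sketch and not as the fix adopted for this issue, is to keep just the column-family directories (such as F1) and ignore job by-products whose names start with an underscore or a dot. The class and method names are hypothetical, and the sketch works on plain names rather than on the Hadoop FileSystem API.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

// Hypothetical helper; it only illustrates the filtering rule.
final class HFileDirFilter {
    // Keep column-family directories, drop entries such as "_SUCCESS" and "_temporary".
    static List<String> familyDirs(List<String> childNames) {
        List<String> kept = new ArrayList<>();
        for (String name : childNames) {
            if (!name.startsWith("_") && !name.startsWith(".")) {
                kept.add(name);
            }
        }
        return kept;
    }

    public static void main(String[] args) {
        // Child names taken from the hfile directory in the log above
        List<String> children = Arrays.asList("_SUCCESS", "F1", "_temporary");
        System.out.println(familyDirs(children));
    }
}
```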





Re: [VOTE] Release apache-kylin-2.6.1 binary packages

2019-03-17 Thread Li Yang
+1
UT/IT passed

Yang

On Sun, Mar 17, 2019 at 12:15 PM Billy Liu  wrote:

> +1 binding
>
> sigs and hashes are OK
> NOTICE and LICENSE are OK
>
>
> With Warm regards
>
> Billy Liu
>
> ShaoFeng Shi  wrote on Fri, Mar 15, 2019 at 8:58 AM:
> >
> > Hi all,
> >
> > The source code of apache-kylin-2.6.1 was released last week, on 3/8.
> > We have now prepared the binary packages of v2.6.1 for users'
> > convenience.
> > Please review the binary packages and cast your vote.
> >
> > The packages are at:
> > https://dist.apache.org/repos/dist/dev/kylin/apache-kylin-2.6.1-rc1/
> >
> > The hash of the artifact is as follows:
> > apache-kylin-2.6.1-bin-hbase1x.tar.gz -
> > f91f3ff0d6426f84e752cc1178fd704895842e9464ce5cd31c099b1f31eb6b68
> > apache-kylin-2.6.1-bin-hadoop3.tar.gz -
> > 6f06e94055d7639729f7879508669375a80eddd76c2a4880da38a0f7f223de44
> > apache-kylin-2.6.1-bin-cdh57.tar.gz  -
> > b5038da13bfbf7fbba9a46b4675b587c882a8e152d244b063c4a610d6000bd55
> > apache-kylin-2.6.1-bin-cdh60.tar.gz  -
> > d1ba39a6e288131a89e3c8e4d0959fd3c05c4ed42df1164df4d2ec9ddf55f92f
> >
> > The checking content should include:
> >
> >    - sigs and hashes must be OK
> >    - the package must contain the correct NOTICE and LICENSE files for
> >      the included content
> >    - the package must not contain any content not derived from the
> >      source.
> >    - in the case of bundled binaries, reviewers must check that all
> >      contents are represented in the LICENSE (and NOTICE file if required).
> >      The bundle must not contain any files that are prohibited from
> >      distribution (category X).
> >
> >
> > Here is my vote:
> > +1 (binding)
> >
> > Thank you!
> >
> > Best regards,
> >
> > Shaofeng Shi 史少锋
> > Apache Kylin PMC
> > Email: shaofeng...@apache.org
> >
> > Apache Kylin FAQ: https://kylin.apache.org/docs/gettingstarted/faq.html
> > Join Kylin user mail group: user-subscr...@kylin.apache.org
> > Join Kylin dev mail group: dev-subscr...@kylin.apache.org
>
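
The hash check requested in the quoted vote mail can be sketched as follows. This is only an illustration: it assumes the 64-character values listed in the mail are SHA-256 digests, which the mail does not state, and the class name and command-line usage are hypothetical. Signature verification is out of scope for this sketch.

```java
import java.io.IOException;
import java.nio.file.Files;
import java.nio.file.Paths;
import java.security.MessageDigest;
import java.security.NoSuchAlgorithmException;

// Hypothetical helper for comparing a download against a published hash.
final class Sha256Check {
    // Returns the lowercase hex SHA-256 digest of the given bytes.
    static String sha256Hex(byte[] data) {
        try {
            MessageDigest md = MessageDigest.getInstance("SHA-256");
            StringBuilder hex = new StringBuilder();
            for (byte b : md.digest(data)) {
                hex.append(String.format("%02x", b));
            }
            return hex.toString();
        } catch (NoSuchAlgorithmException e) {
            // SHA-256 is required to be available on every Java platform
            throw new IllegalStateException(e);
        }
    }

    // True when the data hashes to the expected hex string (case-insensitive).
    static boolean matches(byte[] data, String expectedHex) {
        return sha256Hex(data).equalsIgnoreCase(expectedHex.trim());
    }

    public static void main(String[] args) throws IOException {
        if (args.length != 2) {
            System.err.println("usage: Sha256Check <file> <expected-hex>");
            return;
        }
        // Reads the whole file into memory; fine for a sketch, streaming is kinder for large archives
        byte[] data = Files.readAllBytes(Paths.get(args[0]));
        System.out.println(matches(data, args[1]) ? "hash OK" : "hash MISMATCH");
    }
}
```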


[jira] [Created] (KYLIN-3882) kylin master build failed for pom issues

2019-03-17 Thread Liu Shaohui (JIRA)
Liu Shaohui created KYLIN-3882:
--

 Summary: kylin master build failed for pom issues
 Key: KYLIN-3882
 URL: https://issues.apache.org/jira/browse/KYLIN-3882
 Project: Kylin
  Issue Type: Bug
Reporter: Liu Shaohui


As title.

1. The Kyligence repo id, nexus, conflicts with the local maven settings.xml
{code:java}
[ERROR] Failed to execute goal on project kylin-core-metadata: Could not 
resolve dependencies for project 
org.apache.kylin:kylin-core-metadata:jar:3.0.0-SNAPSHOT: Failure to find 
org.apache.calcite:calcite-core:jar:1.16.0-kylin-r2 in 
http://nexus.x./nexus/content/groups/public was cached in the local 
repository, resolution will not be reattempted until the update interval of 
nexus has elapsed or updates are forced -> [Help 1]
{code}
 

2. maven.compiler.source/target is not set
{code:java}
[INFO] Compiling 2 Scala sources and 18 Java sources to 
/ssd/liushaohui/workspace/computing/kylin/engine-spark/target/classes ...
[WARNING] [Warn] : bootstrap class path not set in conjunction with -source 1.6
[ERROR] [Error] 
/ssd/liushaohui/workspace/computing/kylin/engine-spark/src/main/java/org/apache/kylin/engine/spark/SparkBatchCubingJobBuilder2.java:148:
 diamond operator is not supported in -source 1.6
  (use -source 7 or higher to enable diamond operator)
[ERROR] [Error] 
/ssd/liushaohui/workspace/computing/kylin/engine-spark/src/main/java/org/apache/kylin/engine/spark/SparkCubingByLayer.java:239:
 try-with-resources is not supported in -source 1.6
  (use -source 7 or higher to enable try-with-resources)
[ERROR] [Error] 
/ssd/liushaohui/workspace/computing/kylin/engine-spark/src/main/java/org/apache/kylin/engine/spark/SparkCubingByLayer.java:251:
 diamond operator is not supported in -source 1.6
  (use -source 7 or higher to enable diamond operator){code}
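
For reference, the two constructs that javac rejects in the log above look like this in isolation. The class below is a hypothetical, self-contained sketch and not taken from the Kylin sources; it compiles only at source level 7 or higher, as the compiler message says, which is why the missing maven.compiler.source/target setting breaks the build.

```java
import java.io.BufferedReader;
import java.io.IOException;
import java.io.StringReader;
import java.io.UncheckedIOException;
import java.util.ArrayList;
import java.util.List;

// Hypothetical example; rejected by javac under -source 1.6.
final class SourceLevelDemo {
    static List<String> readLines(String text) {
        // Diamond operator: the type argument is inferred (Java 7+)
        List<String> lines = new ArrayList<>();
        // try-with-resources: the reader is closed automatically (Java 7+)
        try (BufferedReader reader = new BufferedReader(new StringReader(text))) {
            String line;
            while ((line = reader.readLine()) != null) {
                lines.add(line);
            }
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
        return lines;
    }

    public static void main(String[] args) {
        System.out.println(readLines("first\nsecond"));
    }
}
```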





Re: [Discussion] Enable shrunken dictionary by default

2019-03-17 Thread Na Zhai
+1



Sent from Mail for Windows 10

From: Billy Liu 
Sent: Monday, March 18, 2019 11:50:49 AM
To: dev
Cc: Xiaoxiang Yu
Subject: Re: [Discussion] Enable shrunken dictionary by default

22 hours to 5 minutes, incredible progress.
+1

With Warm regards

Billy Liu

ShaoFeng Shi  wrote on Mon, Mar 18, 2019 at 2:59 AM:
>
> +1.
>
> Thanks to Xiaoxiang for raising this; Kylin has some advanced but hidden
> features. As these functions become stable, we should enable them by default
> to benefit all users.
>
> Please also raise similar discussions if you wish to enable other good
> features.
>
> Best regards,
>
> Shaofeng Shi 史少锋
> Apache Kylin PMC
> Email: shaofeng...@apache.org
>
> Apache Kylin FAQ: https://kylin.apache.org/docs/gettingstarted/faq.html
> Join Kylin user mail group: user-subscr...@kylin.apache.org
> Join Kylin dev mail group: dev-subscr...@kylin.apache.org
>
>
>
>
> Zhong, Yanghong  wrote on Mon, Mar 18, 2019 at 10:39 AM:
>
> > +1.
> >
> > Best regards,
> > Yanghong Zhong
> >
> > On 2019/3/18, 10:27 AM, "Xiaoxiang Yu"  wrote:
> >
> > Dear all,
> > I suggest enabling "kylin.dictionary.shrunken-from-global-enabled" by
> > default (it is currently disabled by default), because I found that
> > enabling it speeds up the cube build process when a cube has a count
> > distinct (bitmap) measure on a large cardinality column. This feature
> > was contributed in KYLIN-3491.
> >
> > A count distinct (bitmap) measure on a large cardinality column requires
> > a global dictionary. In that case the build base cuboid step needs
> > frequent cache swaps, so it cannot finish within a reasonable period.
> > KYLIN-3491 adds a new step that builds a separate dictionary for each
> > InputSplit before the BuildBaseCuboid step, so each mapper of the
> > BuildBaseCuboid step only has to fetch a smaller dictionary for itself
> > (without unused values) instead of the larger global dictionary. This
> > reduces cache swapping and makes the BuildBaseCuboid step run as quickly
> > as possible.
> >
> > My test environment is a CDH Hadoop cluster with 56 vcores and 110 GB of
> > memory. I created a model with a fact table (153326740 rows) and three
> > dimension tables. It has three count distinct (bitmap) measures, and the
> > largest cardinality of a single column is 55200325. With ShrunkenDict
> > disabled, BuildBaseCuboid could not complete in 22 hours. By comparison,
> > with ShrunkenDict enabled, the build completed in a reasonable time
> > (the extra dictionary step took 5 minutes, Build Base Cuboid took 5
> > minutes).
> >
> >
> > https://user-images.githubusercontent.com/14030549/54363305-ad25e200-46a5-11e9-8bc7-fe2c385c0278.png
> >
> > If you want to know more, please check
> > https://issues.apache.org/jira/browse/KYLIN-3491.
> > If you have any suggestions, please let me know.
> >
> > 
> > Best wishes,
> > Xiaoxiang Yu
> >
> >
> >
> >
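
The per-split dictionary idea in the proposal quoted above can be pictured with a small sketch. This is a simplified, hypothetical model in plain Java and not the KYLIN-3491 implementation: the global dictionary is treated as a value-to-id map, and the shrunken dictionary for one split keeps only the entries whose values actually occur in that split, with the same ids.

```java
import java.util.Arrays;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

// Hypothetical model of the idea; class and method names are illustrative.
final class ShrunkenDictSketch {
    // Keep only the global-dictionary entries whose values occur in this split.
    static Map<String, Integer> shrink(Map<String, Integer> globalDict, List<String> splitValues) {
        Map<String, Integer> shrunken = new LinkedHashMap<>();
        for (String value : splitValues) {
            Integer id = globalDict.get(value);
            if (id != null) {
                shrunken.put(value, id); // the id stays the same as in the global dictionary
            }
        }
        return shrunken;
    }

    public static void main(String[] args) {
        Map<String, Integer> global = new LinkedHashMap<>();
        global.put("u1", 0);
        global.put("u2", 1);
        global.put("u3", 2);
        global.put("u4", 3);
        // One input split that only contains two of the four distinct values
        List<String> split = Arrays.asList("u2", "u4", "u2");
        System.out.println(shrink(global, split));
    }
}
```

In this picture a mapper loads the small map for its own split instead of the full global one, which is the source of the reduced cache swapping described in the proposal.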


Re: [Discussion] Enable shrunken dictionary by default

2019-03-17 Thread Billy Liu
22 hours to 5 minutes, incredible progress.
+1

With Warm regards

Billy Liu



[jira] [Created] (KYLIN-3881) Kylin may return incorrect results when there's a CompareTupleFilter, like colName = (1 = 1)

2019-03-17 Thread Zhong Yanghong (JIRA)
Zhong Yanghong created KYLIN-3881:
-

 Summary: Kylin may return incorrect results when there's a 
CompareTupleFilter, like colName = (1 = 1) 
 Key: KYLIN-3881
 URL: https://issues.apache.org/jira/browse/KYLIN-3881
 Project: Kylin
  Issue Type: Bug
Reporter: Zhong Yanghong
Assignee: Zhong Yanghong








[jira] [Created] (KYLIN-3880) DataType is incompatible in Kylin HBase coprocessor

2019-03-17 Thread Liu Shaohui (JIRA)
Liu Shaohui created KYLIN-3880:
--

 Summary: DataType is incompatible in Kylin HBase coprocessor
 Key: KYLIN-3880
 URL: https://issues.apache.org/jira/browse/KYLIN-3880
 Project: Kylin
  Issue Type: Bug
Reporter: Liu Shaohui


While upgrading Kylin from 2.4.1 to 2.5.2, queries fail because of an 
incompatible class in the Kylin HBase coprocessor.
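
The InvalidClassException in the log that follows is what Java serialization raises when the serialVersionUID recorded in the stream differs from the one of the locally loaded class. When a class does not declare the field, the JVM derives a default value from the class structure, so two builds of the same class can disagree. The sketch below shows how to inspect both cases; the classes in it are hypothetical stand-ins, and whether DataType declares an explicit serialVersionUID in either Kylin version is not stated in this report.

```java
import java.io.ObjectStreamClass;
import java.io.Serializable;

// Hypothetical stand-ins; not the Kylin DataType class.
final class SerialVersionDemo {
    // Explicit UID: stays stable as long as the declared value is not changed.
    static final class PinnedType implements Serializable {
        private static final long serialVersionUID = 1L;
        String name;
        int precision;
    }

    // No explicit UID: the JVM computes one from the class structure.
    static final class UnpinnedType implements Serializable {
        String name;
    }

    // Returns the serialVersionUID that serialization would use for the class.
    static long uidOf(Class<?> type) {
        return ObjectStreamClass.lookup(type).getSerialVersionUID();
    }

    public static void main(String[] args) {
        System.out.println("pinned   = " + uidOf(PinnedType.class));
        System.out.println("unpinned = " + uidOf(UnpinnedType.class));
    }
}
```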
{code:java}
2019-03-12,17:48:11,530 INFO 
[FifoRWQ.default.readRpcServer.handler=197,queue=13,port=24600] 
org.apache.hadoop.hdfs.DFSClient: Access token was invalid when connecting to 
/10.152.33.45:22402 : 
org.apache.hadoop.hdfs.security.token.block.InvalidBlockTokenException: Got 
access token error for OP_READ_BLOCK, self=/10.152.33.44:55387, 
remote=/10.152.33.45:22402, for file 
/hbase/zjyprc-xiaomi/data/miui_sec/data/4b88a72f5bd37daca00efb842e676ca8/C/6593503eb213431998db117cf3dab3a6,
 for pool BP-792581576-10.152.48.22-1510572454905 block 1899006034_825272806
2019-03-12,17:48:12,135 INFO 
[FifoRWQ.default.readRpcServer.handler=231,queue=15,port=24600] 
org.apache.kylin.storage.hbase.cube.v2.coprocessor.endpoint.CubeVisitService: 
start query dc0fadcf-3689-5508-9a45-559aaebfd4e0 in thread 
FifoRWQ.default.readRpcServer.handler=231,queue=15,port=24600
2019-03-12,17:48:12,135 ERROR 
[FifoRWQ.default.readRpcServer.handler=231,queue=15,port=24600] 
org.apache.hadoop.ipc.RpcServer: Unexpected throwable object 
java.lang.RuntimeException: java.io.InvalidClassException: 
org.apache.kylin.metadata.datatype.DataType; local class incompatible: stream 
classdesc serialVersionUID = -8891652700267537109, local class serialVersionUID 
= -406124487097947
    at org.apache.kylin.cube.gridtable.TrimmedCubeCodeSystem.readDimensionEncoding(TrimmedCubeCodeSystem.java:87)
    at org.apache.kylin.cube.gridtable.TrimmedCubeCodeSystem$1.deserialize(TrimmedCubeCodeSystem.java:122)
    at org.apache.kylin.cube.gridtable.TrimmedCubeCodeSystem$1.deserialize(TrimmedCubeCodeSystem.java:91)
    at org.apache.kylin.gridtable.GTInfo$1.deserialize(GTInfo.java:346)
    at org.apache.kylin.gridtable.GTInfo$1.deserialize(GTInfo.java:307)
    at org.apache.kylin.gridtable.GTScanRequest$2.deserialize(GTScanRequest.java:466)
    at org.apache.kylin.gridtable.GTScanRequest$2.deserialize(GTScanRequest.java:412)
    at org.apache.kylin.storage.hbase.cube.v2.coprocessor.endpoint.CubeVisitService.visitCube(CubeVisitService.java:259)
    at org.apache.kylin.storage.hbase.cube.v2.coprocessor.endpoint.generated.CubeVisitProtos$CubeVisitService.callMethod(CubeVisitProtos.java:)
    at org.apache.hadoop.hbase.regionserver.HRegion.execService(HRegion.java:6625)
    at org.apache.hadoop.hbase.regionserver.HRegionServer.execServiceOnRegion(HRegionServer.java:4336)
    at org.apache.hadoop.hbase.regionserver.HRegionServer.execService(HRegionServer.java:4318)
    at org.apache.hadoop.hbase.protobuf.generated.ClientProtos$ClientService$2.callBlockingMethod(ClientProtos.java:34964)
    at org.apache.hadoop.hbase.ipc.RpcServer.call(RpcServer.java:2059)
    at org.apache.hadoop.hbase.ipc.CallRunner.run(CallRunner.java:126)
    at org.apache.hadoop.hbase.ipc.RpcExecutor.consumerLoop(RpcExecutor.java:152)
    at org.apache.hadoop.hbase.ipc.RpcExecutor$1.run(RpcExecutor.java:128)
    at java.lang.Thread.run(Thread.java:748)
Caused by: java.io.InvalidClassException: org.apache.kylin.metadata.datatype.DataType; local class incompatible: stream classdesc serialVersionUID = -8891652700267537109, local class serialVersionUID = -406124487097947
    at java.io.ObjectStreamClass.initNonProxy(ObjectStreamClass.java:699)
    at java.io.ObjectInputStream.readNonProxyDesc(ObjectInputStream.java:1885)
    at java.io.ObjectInputStream.readClassDesc(ObjectInputStream.java:1751)
    at java.io.ObjectInputStream.readOrdinaryObject(ObjectInputStream.java:2042)
    at java.io.ObjectInputStream.readObject0(ObjectInputStream.java:1573)
    at java.io.ObjectInputStream.defaultReadFields(ObjectInputStream.java:2287)
    at java.io.ObjectInputStream.readSerialData(ObjectInputStream.java:2211)
    at java.io.ObjectInputStream.readOrdinaryObject(ObjectInputStream.java:2069)
    at java.io.ObjectInputStream.readObject0(ObjectInputStream.java:1573)
    at java.io.ObjectInputStream.readObject(ObjectInputStream.java:431)
    at org.apache.kylin.dimension.AbstractDateDimEnc.readExternal(AbstractDateDimEnc.java:137)
    at java.io.ObjectInputStream.readExternalData(ObjectInputStream.java:2118)
    at java.io.ObjectInputStream.readOrdinaryObject(ObjectInputStream.java:2067)
    at java.io.ObjectInputStream.readObject0(ObjectInputStream.java:1573)
    at java.io.ObjectInputStream.readObject(ObjectInputStream.java:431)
    at org.apache.kylin.cube.gridtable.TrimmedCubeCodeSystem.

Re: [Discussion] Enable shrunken dictionary by default

2019-03-17 Thread ShaoFeng Shi
+1.

Thanks to Xiaoxiang for raising this; Kylin has some advanced but hidden
features. As these functions become stable, we should enable them by default
to benefit all users.

Please also raise similar discussions if you wish to enable other good
features.

Best regards,

Shaofeng Shi 史少锋
Apache Kylin PMC
Email: shaofeng...@apache.org

Apache Kylin FAQ: https://kylin.apache.org/docs/gettingstarted/faq.html
Join Kylin user mail group: user-subscr...@kylin.apache.org
Join Kylin dev mail group: dev-subscr...@kylin.apache.org




Zhong, Yanghong  wrote on Mon, Mar 18, 2019 at 10:39 AM:

> +1.
>
> Best regards,
> Yanghong Zhong
>


Re: [Discussion] Enable shrunken dictionary by default

2019-03-17 Thread JiaTao Tao
+1, seems improved a lot.


-- 


Regards!

Aron Tao



Re: [Discussion] Enable shrunken dictionary by default

2019-03-17 Thread Chao Long
+1
--
Best Regards,
Chao Long


Re: [Discussion] Enable shrunken dictionary by default

2019-03-17 Thread Zhong, Yanghong
+1.

Best regards,
Yanghong Zhong



[Discussion] Enable shrunken dictionary by default

2019-03-17 Thread Xiaoxiang Yu
Dear all,
I suggest enabling "kylin.dictionary.shrunken-from-global-enabled" by default
(it is currently disabled by default), because I found that enabling it speeds
up the cube build process when a cube has a count distinct (bitmap) measure on
a high-cardinality column. This feature was contributed in KYLIN-3491.
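
For readers who want to try the setting before any default changes, here is a minimal sketch of switching the property on in a properties file. It assumes the instance reads its settings from $KYLIN_HOME/conf/kylin.properties; the helper name set_kylin_property is made up for illustration and is not part of Kylin.

```shell
#!/bin/sh
# Hypothetical helper (not part of Kylin): set key=value in a properties
# file, replacing any existing "key=..." line for the same key.
set_kylin_property() {
    file=$1 key=$2 value=$3
    tmp="${file}.tmp.$$"
    # Create the file if it does not exist yet.
    [ -f "$file" ] || : > "$file"
    # Keep every line whose text before the first "=" is not exactly this key.
    awk -F= -v k="$key" '$1 != k' "$file" > "$tmp"
    printf '%s=%s\n' "$key" "$value" >> "$tmp"
    mv "$tmp" "$file"
}

# Example call; the path is an assumption about where the config lives:
# set_kylin_property "$KYLIN_HOME/conf/kylin.properties" \
#     kylin.dictionary.shrunken-from-global-enabled true
```

The helper only matches lines written as key=value with no spaces around the equals sign, so a line such as "key = value" would be left in place and a second line appended.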

When a count distinct (bitmap) measure is used on a high-cardinality column
(which requires a global dictionary), the Build Base Cuboid step needs frequent
cache swaps, so it cannot finish within a reasonable period. KYLIN-3491 adds a
new step that builds a separate dictionary for each InputSplit before the
BuildBaseCuboid step. Each mapper of the BuildBaseCuboid step then only has to
fetch a smaller dictionary for its own split (without unused values) instead of
the larger global dictionary. This reduces cache swapping and lets the
BuildBaseCuboid step run as quickly as possible.
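
The idea in the paragraph above can be illustrated with a toy sketch. It is not Kylin's implementation: the file layout (one value<TAB>id pair per line for the global dictionary, one value per line for a split) and the function name build_shrunken_dict are assumptions made for the example. It only shows that a per-split dictionary keeps the global IDs while dropping entries the split never uses.

```shell
#!/bin/sh
# Toy illustration only, not Kylin code.
# $1: global dictionary file, lines of "value<TAB>id"
# $2: file with the column values seen in one input split, one per line
# Prints the subset of the global dictionary needed by that split,
# in the order of the global dictionary.
build_shrunken_dict() {
    awk -F'\t' '
        NR == FNR { needed[$1] = 1; next }   # first file: values in the split
        $1 in needed                         # second file: keep matching entries
    ' "$2" "$1"
}

# Example call with hypothetical file names:
# build_shrunken_dict global_dict.tsv split_0001_values.txt
```

Because the IDs are copied unchanged from the global dictionary, values encoded with the smaller dictionary stay consistent across splits.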

In my test environment, the Hadoop cluster is a CDH cluster with 56 vcores and
110 GB of memory. I created a model with a fact table (153326740 rows) and
three dimension tables. There are three count distinct (bitmap) measures, and
the largest single-column cardinality is 55200325. With ShrunkenDict disabled,
the BuildBaseCuboid step could not complete in 22 hours. By comparison, with
ShrunkenDict enabled, the build completed in a reasonable time (the extra
dictionary step took 5 minutes, and the Build Base Cuboid step took 5 minutes).

https://user-images.githubusercontent.com/14030549/54363305-ad25e200-46a5-11e9-8bc7-fe2c385c0278.png

If you want to know more, please check
https://issues.apache.org/jira/browse/KYLIN-3491. If you have any suggestions,
please let me know.


Best wishes,
Xiaoxiang Yu



Re: no mapreduce.tar.gz available for kylin to build cube

2019-03-17 Thread 叩 龙
Hi Na Zhai,

The issue was resolved by uploading that file. Thanks.

Thanks,
Sent from phone



Re: /app/apache-kylin-2.6.0-bin-hadoop3/bin/find-hive-dependency.sh: line 40: [: too many arguments

2019-03-17 Thread 叩 龙
Hi Na Zhai,

Yes, I can start it and query the cube. Thanks for your reply.

Thanks,
Sent from phone



Re: How kylin store data in Hbase ?

2019-03-17 Thread Na Zhai
Hi, rsanadhya.



Can you provide more information? Have you set "Mandatory Dimensions",
"Hierarchy Dimensions" or "Joint Dimensions"?



Sent from Mail for Windows 10

From: rsanad...@gmail.com
Sent: Thursday, March 14, 2019 7:29:48 PM
To: dev@kylin.apache.org
Subject: How kylin store data in Hbase ?

Hi all, I just wanted to understand how Kylin stores data in HBase.

E.g. I have 1 fact table with 3 columns (Dim1.A, Dim2.B, C cal) and 2
dimension tables, Dim1 (A, Desc) and Dim2 (B, Desc). I have 4 distinct records
in the fact table, and the same 4 records exist in Dim1 and Dim2. Kindly help
me understand how many combinations this will create in HBase. Any leads would
be a great help!

Thanks, Rahul S

--
Sent from: http://apache-kylin.74782.x6.nabble.com/


Re: no mapreduce.tar.gz available for kylin to build cube

2019-03-17 Thread Na Zhai
Hi, 叩龙.

You should find mapreduce.tar.gz in your HDP environment and put it in the
hdfs://s1.hdp.com:8020/app/hw-base/hdp/apps/3.0.1.0-187/mapreduce/ directory.
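
A sketch of what that upload might look like with the standard `hdfs dfs` commands. The local path of the tarball is an assumption (on many HDP installs it sits under /usr/hdp/<version>/hadoop/, but check your own nodes); the target directory is the one from the error message. The wrapper function upload_mr_tarball and the HDFS_CMD variable are made up for this example so the commands can be previewed before running them.

```shell
#!/bin/sh
# Hypothetical wrapper, not part of Kylin or HDP.
# Set HDFS_CMD=echo to print the commands instead of running them.
upload_mr_tarball() {
    local_tarball=$1
    target_dir=${2%/}          # drop one trailing slash, if any
    # Create the target directory, then copy the tarball into it
    # (-f overwrites an existing file of the same name).
    ${HDFS_CMD:-hdfs} dfs -mkdir -p "$target_dir" &&
    ${HDFS_CMD:-hdfs} dfs -put -f "$local_tarball" "$target_dir/"
}

# Example call; the local path is an assumption, and the user running it
# needs write permission on the target directory:
# upload_mr_tarball /usr/hdp/3.0.1.0-187/hadoop/mapreduce.tar.gz \
#     hdfs://s1.hdp.com:8020/app/hw-base/hdp/apps/3.0.1.0-187/mapreduce
```

Running it once with HDFS_CMD=echo is a cheap way to confirm the paths before touching HDFS.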

Sent from Mail for Windows 10


From: 叩 龙
Sent: Friday, March 15, 2019 12:14:58 PM
To: dev@kylin.apache.org
Subject: no mapreduce.tar.gz available for kylin to build cube

Hi team,

I am using the following:
HDP 3.0.1
Apache-Kylin-2.6.0-bin-hadoop3

While trying to build a cube from Kylin, I got the error below. I checked that
path; we do not have mapreduce there, only tez. Could you please advise how to
move forward? Thanks.


Error while building cube:

java.io.FileNotFoundException: File does not exist: hdfs://s1.hdp.com:8020/app/hw-base/hdp/apps/3.0.1.0-187/mapreduce/mapreduce.tar.gz
    at org.apache.hadoop.fs.Hdfs.getFileStatus(Hdfs.java:145)
    at org.apache.hadoop.fs.AbstractFileSystem.resolvePath(AbstractFileSystem.java:488)
    at org.apache.hadoop.fs.FileContext$25.next(FileContext.java:2292)
    at org.apache.hadoop.fs.FileContext$25.next(FileContext.java:2288)
    at org.apache.hadoop.fs.FSLinkResolver.resolve(FSLinkResolver.java:90)

Sent from Mail for Windows



Re: /app/apache-kylin-2.6.0-bin-hadoop3/bin/find-hive-dependency.sh: line 40: [: too many arguments

2019-03-17 Thread Na Zhai
Hi, 叩 龙.



Can you start Kylin and query the sample cube successfully? If so, I think it
does not matter.



Sent from Mail for Windows 10

From: 叩 龙
Sent: Friday, March 15, 2019 1:36:10 PM
To: dev@kylin.apache.org
Subject: /app/apache-kylin-2.6.0-bin-hadoop3/bin/find-hive-dependency.sh: line 40: [: too many arguments

Hi team,

Does the error in the subject matter when starting up Kylin?

Kylin 2.6.0 & HDP 3.0

[root@s4 bin]# ./kylin.sh start
Retrieving hadoop conf dir...
KYLIN_HOME is set to /app/apache-kylin-2.6.0-bin-hadoop3
Retrieving hive dependency...
/app/apache-kylin-2.6.0-bin-hadoop3/bin/find-hive-dependency.sh: line 40: [: too many arguments
Retrieving hbase dependency...
Retrieving hadoop conf dir...
Retrieving kafka dependency...
Retrieving Spark dependency...
Start to check whether we need to migrate acl tables
Retrieving hive dependency...
/app/apache-kylin-2.6.0-bin-hadoop3/bin/find-hive-dependency.sh: line 40: [: too many arguments
Retrieving hbase dependency...
Retrieving hadoop conf dir...
Retrieving kafka dependency...
Retrieving Spark dependency...
SLF4J: Class path contains multiple SLF4J bindings.
SLF4J: Found binding in [jar:file:/app/apache-kylin-2.6.0-bin-hadoop3/tool/kylin-tool-2.6.0.jar!/org/slf4j/impl/StaticLoggerBinder.class]
SLF4J: Found binding in [jar:file:/usr/hdp/3.0.1.0-187/hadoop/lib/slf4j-log4j12-1.7.25.jar!/org/slf4j/impl/StaticLoggerBinder.class]
SLF4J: See http://www.slf4j.org/codes.html#multiple_bindings for an explanation.
SLF4J: Actual binding is of type [org.slf4j.impl.Log4jLoggerFactory]
2019-03-15 13:35:03,549 INFO  [main] common.KylinConfig:101 : Loading kylin-defaults.properties from file:/app/apache-kylin-2.6.0-bin-hadoop3/tool/kylin-tool-2.6.0.jar!/kylin-defaults.properties
2019-03-15 13:35:03,572 DEBUG [main] common.KylinConfig:328 : KYLIN_CONF property was not set, will seek KYLIN_HOME env variable
2019-03-15 13:35:03,576 INFO  [main] common.KylinConfig:136 : Initialized a new KylinConfig from getInstanceFromEnv : 957465255
2019-03-15 13:35:03,644 INFO  [main] persistence.ResourceStore:88 : Using metadata url kylin_metadata@hbase for resource store
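
For context on the message itself: in bash, "[: too many arguments" is what the `[` builtin prints when it receives more words than it can parse, and a frequent cause is an unquoted variable that expands to several words. Whether that is what happens on line 40 of find-hive-dependency.sh is an assumption here, since the script is not quoted in this thread. A minimal reproduction, with made-up function names:

```shell
#!/bin/sh
# Illustration only; these functions are not taken from find-hive-dependency.sh.

# Unquoted expansion: $1 is split into words before `[` sees it,
# so a multi-word value turns into several arguments.
unquoted_status() {
    status=0
    [ -z $1 ] 2>/dev/null || status=$?
    echo "$status"
}

# Quoted expansion: `[` always receives exactly one operand after -z.
quoted_status() {
    status=0
    [ -z "$1" ] || status=$?
    echo "$status"
}

# With a value of several words, the unquoted test is a usage error
# (exit status above 1), while the quoted test simply reports
# "not empty" (exit status 1):
# unquoted_status "a b c"
# quoted_status "a b c"
```

If the script does have this pattern, quoting the variable in the test would be the usual fix; the startup log above suggests Kylin carries on regardless.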


Sent from Mail for Windows