[jira] [Created] (KYLIN-3559) Use Splitter for splitting String

2018-09-12 Thread Ted Yu (JIRA)
Ted Yu created KYLIN-3559:
-

 Summary: Use Splitter for splitting String
 Key: KYLIN-3559
 URL: https://issues.apache.org/jira/browse/KYLIN-3559
 Project: Kylin
  Issue Type: Task
Reporter: Ted Yu


See http://errorprone.info/bugpattern/StringSplitter for why Splitter is 
preferred.
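
As a rough illustration of the behavior behind that recommendation, the sketch below uses only the JDK to show what String.split does with trailing empty fields, an empty input, and a regex metacharacter. The class and method names are hypothetical. Guava's Splitter (for example Splitter.on(',')) is the replacement the linked page suggests; it is not used here so that the example stays free of external dependencies.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

// Illustration only; class and method names are hypothetical.
final class SplitPitfalls {

    // String.split treats the separator as a regex and drops trailing empty strings.
    static List<String> jdkSplit(String input, String separatorRegex) {
        return Arrays.asList(input.split(separatorRegex));
    }

    // Literal, non-regex split that keeps every field, including empty ones.
    static List<String> literalSplit(String input, char separator) {
        List<String> parts = new ArrayList<>();
        int start = 0;
        for (int i = 0; i < input.length(); i++) {
            if (input.charAt(i) == separator) {
                parts.add(input.substring(start, i));
                start = i + 1;
            }
        }
        parts.add(input.substring(start));
        return parts;
    }

    public static void main(String[] args) {
        System.out.println(jdkSplit("a,b,,", ","));      // [a, b]
        System.out.println(jdkSplit("", ",").size());    // 1
        System.out.println(jdkSplit("a.b", ".").size()); // 0
        System.out.println(literalSplit("a,b,,", ','));  // [a, b, , ]
    }
}
```

The literal splitter is only there to contrast with the JDK behavior; in a code base that already depends on Guava, Splitter would normally be the simpler choice.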



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Updated] (KYLIN-3457) Distribute by multiple columns if not set shard-by column

2018-09-12 Thread Shaofeng SHI (JIRA)


 [ 
https://issues.apache.org/jira/browse/KYLIN-3457?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Shaofeng SHI updated KYLIN-3457:

Summary: Distribute by multiple columns if not set shard-by column  (was: 
Distribute by multi column if not set distribute column during the redistribute 
step)

> Distribute by multiple columns if not set shard-by column
> -
>
> Key: KYLIN-3457
> URL: https://issues.apache.org/jira/browse/KYLIN-3457
> Project: Kylin
>  Issue Type: Improvement
>  Components: Job Engine
>Reporter: Chao Long
>Assignee: Chao Long
>Priority: Major
> Fix For: v2.5.0
>
>
> Removing the redistribute step (KYLIN-3388) may cause a data skew problem.
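
To make the skew concern concrete, here is a minimal sketch of choosing a shard from one column versus several columns. It is not Kylin's implementation: the class, method names, and the row representation as an Object[] are assumptions made for illustration.

```java
import java.util.Arrays;
import java.util.Objects;

// Illustration only; not taken from the Kylin code base.
final class DistributeSketch {

    // Single-column key: every row that shares a hot value lands in the same shard.
    static int shardBySingleColumn(Object[] row, int column, int shards) {
        return Math.floorMod(Objects.hashCode(row[column]), shards);
    }

    // Multi-column key: the hash mixes several columns, so rows that share
    // one hot value can still be spread over different shards.
    static int shardByColumns(Object[] row, int[] columns, int shards) {
        Object[] key = new Object[columns.length];
        for (int i = 0; i < columns.length; i++) {
            key[i] = row[columns[i]];
        }
        return Math.floorMod(Arrays.hashCode(key), shards);
    }

    public static void main(String[] args) {
        Object[] first = {"hot", 1};
        Object[] second = {"hot", 2};
        System.out.println(shardBySingleColumn(first, 0, 8) == shardBySingleColumn(second, 0, 8));
        System.out.println(shardByColumns(first, new int[] {0, 1}, 8));
        System.out.println(shardByColumns(second, new int[] {0, 1}, 8));
    }
}
```

Whether two particular rows end up in different shards depends on the hash values; the point is only that the key has more variety when more columns contribute to it.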





[jira] [Comment Edited] (KYLIN-3310) Use lint for maven-compiler-plugin

2018-09-12 Thread Ted Yu (JIRA)


[ 
https://issues.apache.org/jira/browse/KYLIN-3310?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16560940#comment-16560940
 ] 

Ted Yu edited comment on KYLIN-3310 at 9/13/18 3:16 AM:


Thanks, Jiatao .


was (Author: yuzhih...@gmail.com):
Thanks, Jiatao.

> Use lint for maven-compiler-plugin
> --
>
> Key: KYLIN-3310
> URL: https://issues.apache.org/jira/browse/KYLIN-3310
> Project: Kylin
>  Issue Type: Improvement
>  Components: Tools, Build and Test
>Reporter: Ted Yu
>Assignee: jiatao.tao
>Priority: Major
>
> lint helps identify structural problems.
> We should enable lint for maven-compiler-plugin
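> {code}
> <plugin>
>   <artifactId>maven-compiler-plugin</artifactId>
>   <version>${maven-compiler-plugin.version}</version>
>   <configuration>
>     <source>1.8</source>
>     <target>1.8</target>
>     <compilerArgs>
>       <arg>-Xlint:all</arg>
>       <arg>${compiler.error.flag}</arg>
>       <arg>-Xlint:-options</arg>
>       <arg>-Xlint:-cast</arg>
>       <arg>-Xlint:-deprecation</arg>
>       <arg>-Xlint:-processing</arg>
>       <arg>-Xlint:-rawtypes</arg>
>       <arg>-Xlint:-serial</arg>
>       <arg>-Xlint:-try</arg>
>       <arg>-Xlint:-unchecked</arg>
>       <arg>-Xlint:-varargs</arg>
>     </compilerArgs>
>     <!-- Two further settings with the values "true" and "false" follow here; their element names are not preserved. -->
>   </configuration>
> </plugin>
> {code}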





[jira] [Commented] (KYLIN-3513) Release 2.5.0

2018-09-12 Thread ASF subversion and git services (JIRA)


[ 
https://issues.apache.org/jira/browse/KYLIN-3513?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16612952#comment-16612952
 ] 

ASF subversion and git services commented on KYLIN-3513:


Commit ae0d824f60a67760822e4ce7399337e39f0f1abc in kylin's branch 
refs/heads/2.5.0-release from shaofengshi
[ https://gitbox.apache.org/repos/asf?p=kylin.git;h=ae0d824 ]

KYLIN-3513 add plugin version to pom.xml
KYLIN-3513 disable javadoc check
KYLIN-3513 use SHA-256 algorithm


> Release 2.5.0
> -
>
> Key: KYLIN-3513
> URL: https://issues.apache.org/jira/browse/KYLIN-3513
> Project: Kylin
>  Issue Type: Task
>Reporter: Shaofeng SHI
>Assignee: Shaofeng SHI
>Priority: Major
>






[jira] [Commented] (KYLIN-3513) Release 2.5.0

2018-09-12 Thread ASF subversion and git services (JIRA)


[ 
https://issues.apache.org/jira/browse/KYLIN-3513?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16612945#comment-16612945
 ] 

ASF subversion and git services commented on KYLIN-3513:


Commit bde5a525826d058858ce35db1935dd110a25215f in kylin's branch 
refs/heads/2.5.0-release from shaofengshi
[ https://gitbox.apache.org/repos/asf?p=kylin.git;h=bde5a52 ]

KYLIN-3513 change scala-maven-plugin to 3.4.1


> Release 2.5.0
> -
>
> Key: KYLIN-3513
> URL: https://issues.apache.org/jira/browse/KYLIN-3513
> Project: Kylin
>  Issue Type: Task
>Reporter: Shaofeng SHI
>Assignee: Shaofeng SHI
>Priority: Major
>






[jira] [Commented] (KYLIN-3520) Deal with NULL values of measures for inmem cubing

2018-09-12 Thread Zhong Yanghong (JIRA)


[ 
https://issues.apache.org/jira/browse/KYLIN-3520?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16612940#comment-16612940
 ] 

Zhong Yanghong commented on KYLIN-3520:
---

Hi [~Shaofengshi], the previous patch was generated with *git show*. It has now been refined.

> Deal with NULL values of measures for inmem cubing
> --
>
> Key: KYLIN-3520
> URL: https://issues.apache.org/jira/browse/KYLIN-3520
> Project: Kylin
>  Issue Type: Improvement
>  Components: Job Engine
>Reporter: Zhong Yanghong
>Assignee: Zhong Yanghong
>Priority: Major
> Fix For: v2.6.0
>
> Attachments: APACHE-KYLIN-3520.patch
>
>
> So far, NULL values have been handled for dimensions during both layered 
> cubing and inmem cubing, but NULL values of measures have been handled only 
> for layered cubing. NULL values of measures should also be handled for inmem 
> cubing.





[jira] [Updated] (KYLIN-3520) Deal with NULL values of measures for inmem cubing

2018-09-12 Thread Zhong Yanghong (JIRA)


 [ 
https://issues.apache.org/jira/browse/KYLIN-3520?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Zhong Yanghong updated KYLIN-3520:
--
Attachment: APACHE-KYLIN-3520.patch

> Deal with NULL values of measures for inmem cubing
> --
>
> Key: KYLIN-3520
> URL: https://issues.apache.org/jira/browse/KYLIN-3520
> Project: Kylin
>  Issue Type: Improvement
>  Components: Job Engine
>Reporter: Zhong Yanghong
>Assignee: Zhong Yanghong
>Priority: Major
> Fix For: v2.6.0
>
> Attachments: APACHE-KYLIN-3520.patch
>
>
> So far, NULL values have been handled for dimensions during both layered 
> cubing and inmem cubing, but NULL values of measures have been handled only 
> for layered cubing. NULL values of measures should also be handled for inmem 
> cubing.





[jira] [Updated] (KYLIN-3520) Deal with NULL values of measures for inmem cubing

2018-09-12 Thread Zhong Yanghong (JIRA)


 [ 
https://issues.apache.org/jira/browse/KYLIN-3520?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Zhong Yanghong updated KYLIN-3520:
--
Attachment: (was: APACHE-KYLIN-3520.patch)

> Deal with NULL values of measures for inmem cubing
> --
>
> Key: KYLIN-3520
> URL: https://issues.apache.org/jira/browse/KYLIN-3520
> Project: Kylin
>  Issue Type: Improvement
>  Components: Job Engine
>Reporter: Zhong Yanghong
>Assignee: Zhong Yanghong
>Priority: Major
> Fix For: v2.6.0
>
>
> So far, NULL values have been handled for dimensions during both layered 
> cubing and inmem cubing, but NULL values of measures have been handled only 
> for layered cubing. NULL values of measures should also be handled for inmem 
> cubing.





[jira] [Commented] (KYLIN-3555) Garbage collection on HBase step fails with S3 selected as storage

2018-09-12 Thread JIRA


[ 
https://issues.apache.org/jira/browse/KYLIN-3555?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16612285#comment-16612285
 ] 

Iñigo Martinez commented on KYLIN-3555:
---

Hi Shaofeng.

This is our config.

kylin.env.hdfs-working-dir=s3://XXX-emr-kylin/kylin/
kylin.storage.hbase.cluster-fs=s3://XXX-emr-kylin/hbase/

In 2.4.0 the config is exactly the same and this problem is not present.

We have compared 2.4.1 and 2.4.0, and it seems that some changes have been made 
to the garbage collection method.

https://github.com/apache/kylin/commit/3177d79ca5cd8533164319acda8676684a6d307e#diff-784d6aaca261296ea18c7dd2de78

 

> Garbage collection on HBase step fails with S3 selected as storage
> --
>
> Key: KYLIN-3555
> URL: https://issues.apache.org/jira/browse/KYLIN-3555
> Project: Kylin
>  Issue Type: Bug
>  Components: Job Engine
>Affects Versions: v2.4.1
>Reporter: Iñigo Martinez
>Priority: Major
>  Labels: build
> Attachments: Screenshot from 2018-09-11 12-31-25.png
>
>
> When building a cube with S3 selected as storage, the build process fails at 
> the last step.
> Although S3 has been defined as storage, the cleanup task tries to delete from 
> HDFS and, of course, there is no such file on HDFS.
>  
> {code:java}
> 2018-09-11 12:27:56,311 DEBUG [Scheduler 1407846257 Job 
> f8416975-eea6-4500-9cb7-4374f28451dc-237] 
> steps.HDFSPathGarbageCollectionStep:78 : Drop HDFS path on FileSystem: 
> s3://XXX-emr-kylin
> 2018-09-11 12:27:57,364 DEBUG [Scheduler 1407846257 Job 
> f8416975-eea6-4500-9cb7-4374f28451dc-237] 
> steps.HDFSPathGarbageCollectionStep:87 : HDFS path 
> /kylin/kylin_metadata/kylin-f8416975-eea6-4500-9cb7-4374f28451dc/plataforma_transacciones_cubo_v1/fact_distinct_columns
>  is dropped.
> 2018-09-11 12:27:58,104 DEBUG [Scheduler 1407846257 Job 
> f8416975-eea6-4500-9cb7-4374f28451dc-237] 
> steps.HDFSPathGarbageCollectionStep:87 : HDFS path 
> /kylin/kylin_metadata/kylin-f8416975-eea6-4500-9cb7-4374f28451dc/plataforma_transacciones_cubo_v1/hfile
>  is dropped.
> 2018-09-11 12:27:58,140 DEBUG [Scheduler 1407846257 Job 
> f8416975-eea6-4500-9cb7-4374f28451dc-237] 
> steps.HDFSPathGarbageCollectionStep:78 : Drop HDFS path on FileSystem: 
> hdfs://ip-10-0-1-63.eu-west-1.compute.internal:8020
> 2018-09-11 12:27:58,142 DEBUG [Scheduler 1407846257 Job 
> f8416975-eea6-4500-9cb7-4374f28451dc-237] 
> steps.HDFSPathGarbageCollectionStep:90 : HDFS path 
> /kylin/kylin_metadata/kylin-f8416975-eea6-4500-9cb7-4374f28451dc/plataforma_transacciones_cubo_v1/fact_distinct_columns
>  not exists.
> 2018-09-11 12:27:58,147 ERROR [Scheduler 1407846257 Job 
> f8416975-eea6-4500-9cb7-4374f28451dc-237] 
> steps.HDFSPathGarbageCollectionStep:68 : 
> job:f8416975-eea6-4500-9cb7-4374f28451dc-15 execute finished with exception
> java.io.FileNotFoundException: File 
> /kylin/kylin_metadata/kylin-f8416975-eea6-4500-9cb7-4374f28451dc/plataforma_transacciones_cubo_v1
>  does not exist.
> at 
> org.apache.hadoop.hdfs.DistributedFileSystem.listStatusInternal(DistributedFileSystem.java:904)
> at 
> org.apache.hadoop.hdfs.DistributedFileSystem.access$600(DistributedFileSystem.java:114)
> at 
> org.apache.hadoop.hdfs.DistributedFileSystem$22.doCall(DistributedFileSystem.java:964)
> at 
> org.apache.hadoop.hdfs.DistributedFileSystem$22.doCall(DistributedFileSystem.java:961)
> at 
> org.apache.hadoop.fs.FileSystemLinkResolver.resolve(FileSystemLinkResolver.java:81)
> at 
> org.apache.hadoop.hdfs.DistributedFileSystem.listStatus(DistributedFileSystem.java:971)
> at 
> org.apache.kylin.storage.hbase.steps.HDFSPathGarbageCollectionStep.dropHdfsPathOnCluster(HDFSPathGarbageCollectionStep.java:95)
> at 
> org.apache.kylin.storage.hbase.steps.HDFSPathGarbageCollectionStep.doWork(HDFSPathGarbageCollectionStep.java:65)
> at 
> org.apache.kylin.job.execution.AbstractExecutable.execute(AbstractExecutable.java:162)
> at 
> org.apache.kylin.job.execution.DefaultChainedExecutable.doWork(DefaultChainedExecutable.java:69)
> at 
> org.apache.kylin.job.execution.AbstractExecutable.execute(AbstractExecutable.java:162)
> at 
> org.apache.kylin.job.impl.threadpool.DefaultScheduler$JobRunner.run(DefaultScheduler.java:113)
> at 
> java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1149)
> at 
> java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:624)
> at java.lang.Thread.run(Thread.java:748){code}
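
The trace above ends in a directory listing on a path that does not exist on that file system. As a sketch of the defensive pattern only, the example below checks for the directory before listing it. It uses java.nio.file as a stand-in for the Hadoop FileSystem API so that it runs without extra dependencies; the class and method names are hypothetical and this is not the change made in Kylin.

```java
import java.io.IOException;
import java.io.UncheckedIOException;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.Paths;
import java.util.ArrayList;
import java.util.List;
import java.util.stream.Stream;

// Illustration only; java.nio.file stands in for the Hadoop FileSystem API.
final class GuardedCleanup {

    // Returns the entry names under dir, or an empty list when dir is absent,
    // instead of letting the listing call fail on a missing path.
    static List<String> listIfPresent(Path dir) {
        List<String> names = new ArrayList<>();
        if (!Files.isDirectory(dir)) {
            return names;
        }
        try (Stream<Path> entries = Files.list(dir)) {
            entries.forEach(entry -> names.add(entry.getFileName().toString()));
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
        return names;
    }

    public static void main(String[] args) {
        System.out.println(listIfPresent(Paths.get("no-such-working-dir")));
    }
}
```

A check like this avoids the exception but does not answer the underlying question raised in the report, namely why the cleanup step visits HDFS at all when S3 is configured.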
>  





[jira] [Updated] (KYLIN-3471) Merge dictionary and statistics on Yarn

2018-09-12 Thread Shaofeng SHI (JIRA)


 [ 
https://issues.apache.org/jira/browse/KYLIN-3471?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Shaofeng SHI updated KYLIN-3471:

Summary: Merge dictionary and statistics on Yarn  (was: Merge dictionary 
and statistics on yarn)

> Merge dictionary and statistics on Yarn
> ---
>
> Key: KYLIN-3471
> URL: https://issues.apache.org/jira/browse/KYLIN-3471
> Project: Kylin
>  Issue Type: Improvement
>  Components: Job Engine, Spark Engine
>Reporter: Chao Long
>Assignee: Chao Long
>Priority: Major
> Fix For: v2.5.0
>
>
> Currently, the merge dictionary and statistics step runs in Kylin's JVM, which 
> puts a heavy burden on Kylin.
> We should move this step to Yarn.





[jira] [Updated] (KYLIN-3477) Spark job size not available when deployMode is cluster

2018-09-12 Thread Shaofeng SHI (JIRA)


 [ 
https://issues.apache.org/jira/browse/KYLIN-3477?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Shaofeng SHI updated KYLIN-3477:

Summary: Spark job size not available when deployMode is cluster  (was: 
Spark job size wasn't displayed when submit deployMode is cluster)

> Spark job size not available when deployMode is cluster
> ---
>
> Key: KYLIN-3477
> URL: https://issues.apache.org/jira/browse/KYLIN-3477
> Project: Kylin
>  Issue Type: Bug
>  Components: Spark Engine
>Reporter: Na Zhai
>Assignee: Chao Long
>Priority: Major
> Fix For: v2.5.0
>
>
> When "spark.submit.deployMode=cluster", the Yarn job size is not printed to 
> the console, so Kylin cannot obtain it.





[jira] [Updated] (KYLIN-3490) For single column queries, only dictionaries are enough

2018-09-12 Thread Shaofeng SHI (JIRA)


 [ 
https://issues.apache.org/jira/browse/KYLIN-3490?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Shaofeng SHI updated KYLIN-3490:

Summary: For single column queries, only dictionaries are enough  (was: For 
some single column related queries, don't need to query cuboids, only 
dictionaries are enough)

> For single column queries, only dictionaries are enough
> ---
>
> Key: KYLIN-3490
> URL: https://issues.apache.org/jira/browse/KYLIN-3490
> Project: Kylin
>  Issue Type: Improvement
>Reporter: Zhong Yanghong
>Assignee: Zhong Yanghong
>Priority: Major
> Fix For: v2.5.0
>
> Attachments: APACHE-KYLIN-3490.patch
>
>
> A common use case for BI tools is as follows:
> # First, extract all of the values of a dimension column.
> # Then, select some of the values as a filter condition.
> Previously, the query for the first step had to hit the cuboid data of all 
> segments, which may be inefficient, especially when the segments occupy 
> many regions. 
> Using the dictionary rather than cuboid data to answer this kind of query 
> avoids the cost of many RPCs to HBase.
> Sample queries are as follows:
> {code}
> select A
> from T
> group by A
> {code}
> {code}
> select distinct A
> from T
> {code}
> {code}
> select max(A)
> from T
> {code}
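
A minimal sketch of how such queries could be answered from dictionaries alone is shown below. It assumes each segment's dictionary holds exactly the distinct values of column A in that segment and models it as a plain list of strings; the class and method names are hypothetical and not taken from the attached patch.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;
import java.util.TreeSet;

// Illustration only; per-segment dictionaries are modeled as lists of strings.
final class DictionaryAnswers {

    // Merges the per-segment dictionaries into one sorted set of values.
    private static TreeSet<String> merge(List<List<String>> segmentDictionaries) {
        TreeSet<String> merged = new TreeSet<>();
        for (List<String> dictionary : segmentDictionaries) {
            merged.addAll(dictionary);
        }
        return merged;
    }

    // Corresponds to "select distinct A from T" and "select A from T group by A".
    static List<String> distinctValues(List<List<String>> segmentDictionaries) {
        return new ArrayList<>(merge(segmentDictionaries));
    }

    // Corresponds to "select max(A) from T"; returns null when there are no values.
    static String maxValue(List<List<String>> segmentDictionaries) {
        TreeSet<String> merged = merge(segmentDictionaries);
        return merged.isEmpty() ? null : merged.last();
    }

    public static void main(String[] args) {
        List<List<String>> segments = Arrays.asList(
                Arrays.asList("a", "c"),
                Arrays.asList("b", "c"));
        System.out.println(distinctValues(segments)); // [a, b, c]
        System.out.println(maxValue(segments));       // c
    }
}
```

The shortcut only holds when the query has no filter on other columns, since a dictionary says nothing about which rows a value appears in.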





[jira] [Updated] (KYLIN-3489) Improve the efficiency of enumerating dictionary values

2018-09-12 Thread Shaofeng SHI (JIRA)


 [ 
https://issues.apache.org/jira/browse/KYLIN-3489?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Shaofeng SHI updated KYLIN-3489:

Summary: Improve the efficiency of enumerating dictionary values  (was: 
Improve the efficiency of enumerating dictionary values by pre-order visiting)

> Improve the efficiency of enumerating dictionary values
> ---
>
> Key: KYLIN-3489
> URL: https://issues.apache.org/jira/browse/KYLIN-3489
> Project: Kylin
>  Issue Type: Improvement
>  Components: Metadata
>Reporter: Zhong Yanghong
>Assignee: Zhong Yanghong
>Priority: Major
> Fix For: v2.5.0
>
> Attachments: APACHE-KYLIN-3489.patch
>
>
> Currently, to enumerate all of the values of a dictionary, we enumerate the 
> ids first and then look up the related values. The time complexity is 
> O(n log n). We can achieve O(n) by visiting the trie in pre-order.
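
The pre-order idea can be sketched with a simple in-memory trie. This is not Kylin's dictionary format: the node layout, class name, and method names are assumptions for illustration. A single depth-first walk emits every stored value in sorted order and touches each trie node once, rather than walking down from the root once per id.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Map;
import java.util.TreeMap;

// Illustration only; a plain in-memory trie, not Kylin's serialized dictionary.
final class TrieEnumeration {

    private static final class Node {
        final TreeMap<Character, Node> children = new TreeMap<>();
        boolean terminal; // true when the path from the root spells a stored value
    }

    private final Node root = new Node();

    void add(String value) {
        Node node = root;
        for (char c : value.toCharArray()) {
            node = node.children.computeIfAbsent(c, key -> new Node());
        }
        node.terminal = true;
    }

    // Pre-order traversal: visit the node first, then its children in key order.
    List<String> enumerate() {
        List<String> values = new ArrayList<>();
        visit(root, new StringBuilder(), values);
        return values;
    }

    private static void visit(Node node, StringBuilder prefix, List<String> values) {
        if (node.terminal) {
            values.add(prefix.toString());
        }
        for (Map.Entry<Character, Node> child : node.children.entrySet()) {
            prefix.append(child.getKey());
            visit(child.getValue(), prefix, values);
            prefix.setLength(prefix.length() - 1); // undo the append before the next sibling
        }
    }

    public static void main(String[] args) {
        TrieEnumeration trie = new TrieEnumeration();
        trie.add("kylin");
        trie.add("key");
        trie.add("ke");
        trie.add("cube");
        System.out.println(trie.enumerate()); // [cube, ke, key, kylin]
    }
}
```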





[jira] [Updated] (KYLIN-3535) "kylin-port-replace-util.sh" changed port but not uncomment it

2018-09-12 Thread Shaofeng SHI (JIRA)


 [ 
https://issues.apache.org/jira/browse/KYLIN-3535?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Shaofeng SHI updated KYLIN-3535:

Summary: "kylin-port-replace-util.sh" changed port but not uncomment it  
(was: kylin-port-replace-util.sh could not take effect )

> "kylin-port-replace-util.sh" changed port but not uncomment it
> --
>
> Key: KYLIN-3535
> URL: https://issues.apache.org/jira/browse/KYLIN-3535
> Project: Kylin
>  Issue Type: Bug
>  Components: Others
>Reporter: Yichen Zhou
>Assignee: Yichen Zhou
>Priority: Major
> Fix For: v2.5.0
>
>
> $KYLIN_HOME/bin/kylin-port-replace-util.sh replaces the web server port in 
> kylin.properties:
> {quote}sed -i "s/kylin.server.cluster-servers=\(.*\).*:\(.*\)/kylin.server.cluster-servers=\1:${new_kylin_port}/g" ${KYLIN_CONFIG_FILE}
> {quote}
> However, all configurations in kylin.properties are commented out by default. 
> New port numbers do not take effect unless the lines are uncommented manually.
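
The effect can be reproduced with a Java regex that mirrors the sed expression quoted above. The sample line "#kylin.server.cluster-servers=localhost:7070" is an assumption about what a commented-out default looks like, and the second method is only one hypothetical way to uncomment the line while replacing the port; it is not the fix that was committed.

```java
// Illustration only; mirrors the quoted sed expression with java.util.regex.
final class PortReplaceSketch {

    private static final String KEY = "kylin.server.cluster-servers";

    // Same substitution as the sed command: the port changes, a leading '#' stays.
    static String replacePortOnly(String line, int newPort) {
        return line.replaceAll(
                "kylin.server.cluster-servers=(.*).*:(.*)",
                KEY + "=$1:" + newPort);
    }

    // Hypothetical variant: also consumes an optional leading '#', so the
    // rewritten line becomes an active setting.
    static String replacePortAndUncomment(String line, int newPort) {
        return line.replaceAll(
                "^#?\\s*kylin\\.server\\.cluster-servers=(.*):(.*)$",
                KEY + "=$1:" + newPort);
    }

    public static void main(String[] args) {
        String line = "#kylin.server.cluster-servers=localhost:7070";
        System.out.println(replacePortOnly(line, 7071));         // #kylin.server.cluster-servers=localhost:7071
        System.out.println(replacePortAndUncomment(line, 7071)); // kylin.server.cluster-servers=localhost:7071
    }
}
```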





[jira] [Updated] (KYLIN-3553) Upgrade Tomcat to 7.0.90.

2018-09-12 Thread Shaofeng SHI (JIRA)


 [ 
https://issues.apache.org/jira/browse/KYLIN-3553?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Shaofeng SHI updated KYLIN-3553:

Summary: Upgrade Tomcat to 7.0.90.  (was: Tomcat Security Vulnerability 
Alert. The version of the tomcat for kylin should upgrade to 7.0.90.)

> Upgrade Tomcat to 7.0.90.
> -
>
> Key: KYLIN-3553
> URL: https://issues.apache.org/jira/browse/KYLIN-3553
> Project: Kylin
>  Issue Type: Bug
>  Components: Security
>Affects Versions: v2.4.0
>Reporter: Peng Xing
>Assignee: Peng Xing
>Priority: Major
> Fix For: v2.5.0, v2.4.2
>
>
> [SECURITY] CVE-2018-1336
> Severity: High 
> Versions Affected: Apache Tomcat 9.0.0.M9 to 9.0.7, 8.5.0 to 8.5.30, 
> 8.0.0.RC1 to 8.0.51, and 7.0.28 to 7.0.86.
> Description: An improper handling of overflow in the UTF-8 decoder with 
> supplementary characters can lead to an infinite loop in the decoder causing 
> a Denial of Service.
> CVE-2018-8014
> Description: The default settings for the CORS filter provided in Apache 
> Tomcat 9.0.0.M1 to 9.0.8, 8.5.0 to 8.5.31, 8.0.0.RC1 to 8.0.52, 7.0.41 to 
> 7.0.88 are insecure and enable 'supportsCredentials' for all origins. It is 
> expected that users of the CORS filter will have configured it appropriately 
> for their environment rather than using it in the default configuration. 
> Therefore, it is expected that most users will not be impacted by this issue.
> CVE-2018-8034
> Description: The host name verification when using TLS with the WebSocket 
> client was missing. It is now enabled by default. 
> Versions Affected: Apache Tomcat 9.0.0.M1 to 9.0.9, 8.5.0 to 8.5.31, 
> 8.0.0.RC1 to 8.0.52, and 7.0.35 to 7.0.88.





[jira] [Updated] (KYLIN-3550) "kylin.source.hive.flat-table-field-delimiter" has extra "\"

2018-09-12 Thread Shaofeng SHI (JIRA)


 [ 
https://issues.apache.org/jira/browse/KYLIN-3550?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Shaofeng SHI updated KYLIN-3550:

Summary: "kylin.source.hive.flat-table-field-delimiter" has extra "\"   
(was: "kylin.source.hive.flat-table-field-delimiter" has extra "\" when create 
intermediate flat table)

> "kylin.source.hive.flat-table-field-delimiter" has extra "\" 
> -
>
> Key: KYLIN-3550
> URL: https://issues.apache.org/jira/browse/KYLIN-3550
> Project: Kylin
>  Issue Type: Bug
>  Components: Job Engine
>Affects Versions: v2.4.0, v2.4.1
>Reporter: Shaofeng SHI
>Assignee: Shaofeng SHI
>Priority: Minor
> Fix For: v2.5.0
>
>
> Because of the extra "\", users have to enter "t" when they want to use "\t" 
> as the delimiter.
>  
>  





[jira] [Updated] (KYLIN-3033) Support HBase 2.0

2018-09-12 Thread Shaofeng SHI (JIRA)


 [ 
https://issues.apache.org/jira/browse/KYLIN-3033?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Shaofeng SHI updated KYLIN-3033:

Summary: Support HBase 2.0  (was: Provide API compatibility for hbase 2.0 
release)

> Support HBase 2.0
> -
>
> Key: KYLIN-3033
> URL: https://issues.apache.org/jira/browse/KYLIN-3033
> Project: Kylin
>  Issue Type: Improvement
>  Components: Storage - HBase
>Reporter: Ted Yu
>Priority: Major
>  Labels: compatibility
> Fix For: v2.5.0
>
>
> Compiling against hbase 2.0.0-alpha4 release, I got the following compilation 
> errors:
> https://pastebin.com/yfejnTBE
> We should start preparing migration to hbase 2.0 compatible APIs.





[jira] [Updated] (KYLIN-2565) Support Hadoop 3.0

2018-09-12 Thread Shaofeng SHI (JIRA)


 [ 
https://issues.apache.org/jira/browse/KYLIN-2565?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Shaofeng SHI updated KYLIN-2565:

Summary: Support Hadoop 3.0  (was: Support Hadoop3.0)

> Support Hadoop 3.0
> --
>
> Key: KYLIN-2565
> URL: https://issues.apache.org/jira/browse/KYLIN-2565
> Project: Kylin
>  Issue Type: New Feature
>  Components: Job Engine
>Reporter: Wang Cheng
>Assignee: Shaofeng SHI
>Priority: Major
> Fix For: v2.5.0
>
>
> Hadoop 3.0-alpha has been released, and Kylin should stay compatible with it. 
> The Hadoop 3.0 component requirements are listed below:
> https://cwiki.apache.org/confluence/display/HADOOP/Hadoop+3.0.0+release 





[jira] [Updated] (KYLIN-2565) Support Hadoop3.0

2018-09-12 Thread Shaofeng SHI (JIRA)


 [ 
https://issues.apache.org/jira/browse/KYLIN-2565?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Shaofeng SHI updated KYLIN-2565:

Summary: Support Hadoop3.0  (was: Upgrade Kylin to Hadoop3.0)

> Support Hadoop3.0
> -
>
> Key: KYLIN-2565
> URL: https://issues.apache.org/jira/browse/KYLIN-2565
> Project: Kylin
>  Issue Type: New Feature
>  Components: Job Engine
>Reporter: Wang Cheng
>Assignee: Shaofeng SHI
>Priority: Major
> Fix For: v2.5.0
>
>
> Hadoop 3.0-alpha has been released, and Kylin should stay compatible with it. 
> The Hadoop 3.0 component requirements are listed below:
> https://cwiki.apache.org/confluence/display/HADOOP/Hadoop+3.0.0+release 





[jira] [Commented] (KYLIN-2565) Upgrade Kylin to Hadoop3.0

2018-09-12 Thread Shaofeng SHI (JIRA)


[ 
https://issues.apache.org/jira/browse/KYLIN-2565?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16612069#comment-16612069
 ] 

Shaofeng SHI commented on KYLIN-2565:
-

We cut two branches for the new version:
 
 * master-hadoop3.1: Kylin master code base with Hadoop 3.1 and HBase 2.0 API
 * 2.5.x-hadoop3.1: Kylin 2.5.x code base with Hadoop 3.1 and HBase 2.0 API

> Upgrade Kylin to Hadoop3.0
> --
>
> Key: KYLIN-2565
> URL: https://issues.apache.org/jira/browse/KYLIN-2565
> Project: Kylin
>  Issue Type: New Feature
>  Components: Job Engine
>Reporter: Wang Cheng
>Assignee: Shaofeng SHI
>Priority: Major
> Fix For: v2.5.0
>
>
> Hadoop 3.0-alpha has been released, and Kylin should stay compatible with it. 
> The Hadoop 3.0 component requirements are listed below:
> https://cwiki.apache.org/confluence/display/HADOOP/Hadoop+3.0.0+release 





[jira] [Resolved] (KYLIN-3517) Couldn't update coprocessor on HBase 2.0

2018-09-12 Thread Shaofeng SHI (JIRA)


 [ 
https://issues.apache.org/jira/browse/KYLIN-3517?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Shaofeng SHI resolved KYLIN-3517.
-
Resolution: Fixed

> Couldn't update coprocessor on HBase 2.0
> 
>
> Key: KYLIN-3517
> URL: https://issues.apache.org/jira/browse/KYLIN-3517
> Project: Kylin
>  Issue Type: Bug
>  Components: Storage - HBase
>Reporter: Shaofeng SHI
>Assignee: Lijun Cao
>Priority: Major
> Fix For: v2.5.0
>
>
> On HDP 3.0, running the coprocessor update produced this error:
>  
> {code:java}
> 2018-08-28 00:24:26,683 ERROR [pool-7-thread-1] util.DeployCoprocessorCLI:383 
> : Error processing KYLIN_O9JRT8XOQ9
> java.lang.UnsupportedOperationException: HTableDescriptor is read-only
> at 
> org.apache.hadoop.hbase.client.ImmutableHTableDescriptor.getDelegateeForModification(ImmutableHTableDescriptor.java:59)
> at 
> org.apache.hadoop.hbase.HTableDescriptor.removeCoprocessor(HTableDescriptor.java:768)
> at 
> org.apache.kylin.storage.hbase.util.DeployCoprocessorCLI.resetCoprocessor(DeployCoprocessorCLI.java:300)
> at 
> org.apache.kylin.storage.hbase.util.DeployCoprocessorCLI$ResetCoprocessorWorker.run(DeployCoprocessorCLI.java:375)
> at 
> java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1142)
> at 
> java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:617)
> at java.lang.Thread.run(Thread.java:745)
> {code}





[jira] [Commented] (KYLIN-3517) Couldn't update coprocessor on HBase 2.0

2018-09-12 Thread Shaofeng SHI (JIRA)


[ 
https://issues.apache.org/jira/browse/KYLIN-3517?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16612075#comment-16612075
 ] 

Shaofeng SHI commented on KYLIN-3517:
-

We cut two branches for the new version:
 
 * master-hadoop3.1: Kylin master code base with Hadoop 3.1 and HBase 2.0 API
 * 2.5.x-hadoop3.1: Kylin 2.5.x code base with Hadoop 3.1 and HBase 2.0 API

Tested on HDP 3.0

> Couldn't update coprocessor on HBase 2.0
> 
>
> Key: KYLIN-3517
> URL: https://issues.apache.org/jira/browse/KYLIN-3517
> Project: Kylin
>  Issue Type: Bug
>  Components: Storage - HBase
>Reporter: Shaofeng SHI
>Assignee: Lijun Cao
>Priority: Major
> Fix For: v2.5.0
>
>
> On HDP 3.0, running the coprocessor update produced this error:
>  
> {code:java}
> 2018-08-28 00:24:26,683 ERROR [pool-7-thread-1] util.DeployCoprocessorCLI:383 
> : Error processing KYLIN_O9JRT8XOQ9
> java.lang.UnsupportedOperationException: HTableDescriptor is read-only
> at 
> org.apache.hadoop.hbase.client.ImmutableHTableDescriptor.getDelegateeForModification(ImmutableHTableDescriptor.java:59)
> at 
> org.apache.hadoop.hbase.HTableDescriptor.removeCoprocessor(HTableDescriptor.java:768)
> at 
> org.apache.kylin.storage.hbase.util.DeployCoprocessorCLI.resetCoprocessor(DeployCoprocessorCLI.java:300)
> at 
> org.apache.kylin.storage.hbase.util.DeployCoprocessorCLI$ResetCoprocessorWorker.run(DeployCoprocessorCLI.java:375)
> at 
> java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1142)
> at 
> java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:617)
> at java.lang.Thread.run(Thread.java:745)
> {code}





[jira] [Resolved] (KYLIN-3033) Provide API compatibility for hbase 2.0 release

2018-09-12 Thread Shaofeng SHI (JIRA)


 [ 
https://issues.apache.org/jira/browse/KYLIN-3033?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Shaofeng SHI resolved KYLIN-3033.
-
Resolution: Fixed

> Provide API compatibility for hbase 2.0 release
> ---
>
> Key: KYLIN-3033
> URL: https://issues.apache.org/jira/browse/KYLIN-3033
> Project: Kylin
>  Issue Type: Improvement
>  Components: Storage - HBase
>Reporter: Ted Yu
>Priority: Major
>  Labels: compatibility
> Fix For: v2.5.0
>
>
> Compiling against hbase 2.0.0-alpha4 release, I got the following compilation 
> errors:
> https://pastebin.com/yfejnTBE
> We should start preparing migration to hbase 2.0 compatible APIs.





[jira] [Commented] (KYLIN-3033) Provide API compatibility for hbase 2.0 release

2018-09-12 Thread Shaofeng SHI (JIRA)


[ 
https://issues.apache.org/jira/browse/KYLIN-3033?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16612072#comment-16612072
 ] 

Shaofeng SHI commented on KYLIN-3033:
-

We cut two branches for the new version:
 
 * master-hadoop3.1: Kylin master code base with Hadoop 3.1 and HBase 2.0 API
 * 2.5.x-hadoop3.1: Kylin 2.5.x code base with Hadoop 3.1 and HBase 2.0 API

Tested on HDP 3.0

> Provide API compatibility for hbase 2.0 release
> ---
>
> Key: KYLIN-3033
> URL: https://issues.apache.org/jira/browse/KYLIN-3033
> Project: Kylin
>  Issue Type: Improvement
>  Components: Storage - HBase
>Reporter: Ted Yu
>Priority: Major
>  Labels: compatibility
> Fix For: v2.5.0
>
>
> Compiling against hbase 2.0.0-alpha4 release, I got the following compilation 
> errors:
> https://pastebin.com/yfejnTBE
> We should start preparing migration to hbase 2.0 compatible APIs.





[jira] [Updated] (KYLIN-3517) Couldn't update coprocessor on HBase 2.0

2018-09-12 Thread Shaofeng SHI (JIRA)


 [ 
https://issues.apache.org/jira/browse/KYLIN-3517?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Shaofeng SHI updated KYLIN-3517:

Fix Version/s: v2.5.0

> Couldn't update coprocessor on HBase 2.0
> 
>
> Key: KYLIN-3517
> URL: https://issues.apache.org/jira/browse/KYLIN-3517
> Project: Kylin
>  Issue Type: Bug
>  Components: Storage - HBase
>Reporter: Shaofeng SHI
>Assignee: Lijun Cao
>Priority: Major
> Fix For: v2.5.0
>
>
> On HDP 3.0, running the coprocessor update produced this error:
>  
> {code:java}
> 2018-08-28 00:24:26,683 ERROR [pool-7-thread-1] util.DeployCoprocessorCLI:383 
> : Error processing KYLIN_O9JRT8XOQ9
> java.lang.UnsupportedOperationException: HTableDescriptor is read-only
> at 
> org.apache.hadoop.hbase.client.ImmutableHTableDescriptor.getDelegateeForModification(ImmutableHTableDescriptor.java:59)
> at 
> org.apache.hadoop.hbase.HTableDescriptor.removeCoprocessor(HTableDescriptor.java:768)
> at 
> org.apache.kylin.storage.hbase.util.DeployCoprocessorCLI.resetCoprocessor(DeployCoprocessorCLI.java:300)
> at 
> org.apache.kylin.storage.hbase.util.DeployCoprocessorCLI$ResetCoprocessorWorker.run(DeployCoprocessorCLI.java:375)
> at 
> java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1142)
> at 
> java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:617)
> at java.lang.Thread.run(Thread.java:745)
> {code}





[jira] [Updated] (KYLIN-3033) Provide API compatibility for hbase 2.0 release

2018-09-12 Thread Shaofeng SHI (JIRA)


 [ 
https://issues.apache.org/jira/browse/KYLIN-3033?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Shaofeng SHI updated KYLIN-3033:

Fix Version/s: v2.5.0
  Component/s: Storage - HBase

> Provide API compatibility for hbase 2.0 release
> ---
>
> Key: KYLIN-3033
> URL: https://issues.apache.org/jira/browse/KYLIN-3033
> Project: Kylin
>  Issue Type: Improvement
>  Components: Storage - HBase
>Reporter: Ted Yu
>Priority: Major
>  Labels: compatibility
> Fix For: v2.5.0
>
>
> Compiling against hbase 2.0.0-alpha4 release, I got the following compilation 
> errors:
> https://pastebin.com/yfejnTBE
> We should start preparing migration to hbase 2.0 compatible APIs.





[jira] [Resolved] (KYLIN-2565) Upgrade Kylin to Hadoop3.0

2018-09-12 Thread Shaofeng SHI (JIRA)


 [ 
https://issues.apache.org/jira/browse/KYLIN-2565?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Shaofeng SHI resolved KYLIN-2565.
-
Resolution: Fixed

> Upgrade Kylin to Hadoop3.0
> --
>
> Key: KYLIN-2565
> URL: https://issues.apache.org/jira/browse/KYLIN-2565
> Project: Kylin
>  Issue Type: New Feature
>  Components: Job Engine
>Reporter: Wang Cheng
>Assignee: Shaofeng SHI
>Priority: Major
> Fix For: v2.5.0
>
>
> Hadoop 3.0-alpha has been released, and Kylin should stay compatible with it.
> The Hadoop 3.0 component requirements are listed here:
> https://cwiki.apache.org/confluence/display/HADOOP/Hadoop+3.0.0+release 





[jira] [Updated] (KYLIN-2565) Upgrade Kylin to Hadoop3.0

2018-09-12 Thread Shaofeng SHI (JIRA)


 [ 
https://issues.apache.org/jira/browse/KYLIN-2565?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Shaofeng SHI updated KYLIN-2565:

Fix Version/s: v2.5.0
  Component/s: Job Engine

> Upgrade Kylin to Hadoop3.0
> --
>
> Key: KYLIN-2565
> URL: https://issues.apache.org/jira/browse/KYLIN-2565
> Project: Kylin
>  Issue Type: New Feature
>  Components: Job Engine
>Reporter: Wang Cheng
>Assignee: Shaofeng SHI
>Priority: Major
> Fix For: v2.5.0
>
>
> Hadoop 3.0-alpha has been released, and Kylin should stay compatible with it.
> The Hadoop 3.0 component requirements are listed here:
> https://cwiki.apache.org/confluence/display/HADOOP/Hadoop+3.0.0+release 





[jira] [Resolved] (KYLIN-3441) Merge cube segments in Spark

2018-09-12 Thread Shaofeng SHI (JIRA)


 [ 
https://issues.apache.org/jira/browse/KYLIN-3441?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Shaofeng SHI resolved KYLIN-3441.
-
Resolution: Fixed

> Merge cube segments in Spark
> 
>
> Key: KYLIN-3441
> URL: https://issues.apache.org/jira/browse/KYLIN-3441
> Project: Kylin
>  Issue Type: Improvement
>  Components: Spark Engine
>Reporter: Shaofeng SHI
>Assignee: Shaofeng SHI
>Priority: Major
> Fix For: v2.5.0
>
>






[jira] [Updated] (KYLIN-3497) Make JDBC Module more testable

2018-09-12 Thread Shaofeng SHI (JIRA)


 [ 
https://issues.apache.org/jira/browse/KYLIN-3497?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Shaofeng SHI updated KYLIN-3497:

Fix Version/s: (was: v2.5.0)
   v2.6.0

> Make JDBC Module more testable
> --
>
> Key: KYLIN-3497
> URL: https://issues.apache.org/jira/browse/KYLIN-3497
> Project: Kylin
>  Issue Type: Improvement
>  Components: Driver - JDBC
>Reporter: Ian Hu
>Assignee: Ian Hu
>Priority: Minor
> Fix For: v2.6.0
>
>
> While working on KYLIN-3496, I found the JDBC module difficult to test.
> I would like to contribute a change that makes it more testable.





[jira] [Updated] (KYLIN-2861) For dictionary building of lookup table columns, reduce the table scan chance

2018-09-12 Thread Shaofeng SHI (JIRA)


 [ 
https://issues.apache.org/jira/browse/KYLIN-2861?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Shaofeng SHI updated KYLIN-2861:

Fix Version/s: (was: v2.5.0)
   v2.6.0

> For dictionary building of lookup table columns, reduce the table scan chance
> -
>
> Key: KYLIN-2861
> URL: https://issues.apache.org/jira/browse/KYLIN-2861
> Project: Kylin
>  Issue Type: Improvement
>  Components: Job Engine
>Reporter: Zhong Yanghong
>Assignee: Zhong Yanghong
>Priority: Major
> Fix For: v2.6.0
>
>
> In some cases, several columns from the same lookup table are defined as
> normal dimensions, and a dictionary is built for each of them. With the
> current implementation, the whole lookup table is scanned once per column,
> so N columns cost N scans of the lookup table.
> For this JIRA, we'll scan the table only once to build the dictionaries of
> all related columns.
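
The single-scan idea described above can be sketched as follows. This is only an illustration, not Kylin's implementation: the class `SingleScanDictionarySketch`, the helper `collectDistinctValues`, and the `String[]`-per-row table representation are all hypothetical names chosen for the example. The point is that the distinct values of every requested column are gathered during one pass over the rows.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;
import java.util.SortedSet;
import java.util.TreeSet;

public class SingleScanDictionarySketch {

    // Hypothetical helper: reads the lookup table once and collects the
    // distinct values of every requested column during that single pass.
    static List<SortedSet<String>> collectDistinctValues(Iterable<String[]> rows, int[] columnIndexes) {
        List<SortedSet<String>> distinct = new ArrayList<>();
        for (int i = 0; i < columnIndexes.length; i++) {
            distinct.add(new TreeSet<>());
        }
        for (String[] row : rows) { // the only scan of the table
            for (int i = 0; i < columnIndexes.length; i++) {
                String value = row[columnIndexes[i]];
                if (value != null) {
                    distinct.get(i).add(value);
                }
            }
        }
        return distinct;
    }

    public static void main(String[] args) {
        // Made-up lookup table with three columns: country, region, currency.
        List<String[]> lookupTable = Arrays.asList(
                new String[] {"US", "California", "USD"},
                new String[] {"US", "Texas", "USD"},
                new String[] {"CN", "Shanghai", "CNY"});

        // Collect values for columns 0 and 2 in one pass.
        System.out.println(collectDistinctValues(lookupTable, new int[] {0, 2}));
        // prints [[CN, US], [CNY, USD]]
    }
}
```

Each sorted set could then feed one dictionary builder, so the cost stays at one scan regardless of how many columns are involved.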





[jira] [Updated] (KYLIN-3525) kylin.source.hive.keep-flat-table=true will delete data

2018-09-12 Thread Shaofeng SHI (JIRA)


 [ 
https://issues.apache.org/jira/browse/KYLIN-3525?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Shaofeng SHI updated KYLIN-3525:

Fix Version/s: (was: v2.6.0)
   v2.5.0

> kylin.source.hive.keep-flat-table=true will delete data
> ---
>
> Key: KYLIN-3525
> URL: https://issues.apache.org/jira/browse/KYLIN-3525
> Project: Kylin
>  Issue Type: Bug
>  Components: Streaming
>Affects Versions: v2.4.0
>Reporter: wlxie
>Assignee: XiaoXiang Yu
>Priority: Minor
> Fix For: v2.5.0
>
> Attachments: 1535534470(1).png, HiveMRInput.java
>
>
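> After kylin.source.hive.keep-flat-table is set to true, only the Hive table
> structure is kept; the data is still deleted. I checked the source code, and
> this parameter indeed only controls the table structure. If I want the data
> to be kept as well, with the data of all jobs under one cube saved into a
> single table (currently each job produces its own table), is there a good
> solution?
> The attachment is the v2.4.0 source code.
> Thanks.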





[jira] [Resolved] (KYLIN-3525) kylin.source.hive.keep-flat-table=true will delete data

2018-09-12 Thread Shaofeng SHI (JIRA)


 [ 
https://issues.apache.org/jira/browse/KYLIN-3525?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Shaofeng SHI resolved KYLIN-3525.
-
Resolution: Fixed

> kylin.source.hive.keep-flat-table=true will delete data
> ---
>
> Key: KYLIN-3525
> URL: https://issues.apache.org/jira/browse/KYLIN-3525
> Project: Kylin
>  Issue Type: Bug
>  Components: Streaming
>Affects Versions: v2.4.0
>Reporter: wlxie
>Assignee: XiaoXiang Yu
>Priority: Minor
> Fix For: v2.5.0
>
> Attachments: 1535534470(1).png, HiveMRInput.java
>
>
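> After kylin.source.hive.keep-flat-table is set to true, only the Hive table
> structure is kept; the data is still deleted. I checked the source code, and
> this parameter indeed only controls the table structure. If I want the data
> to be kept as well, with the data of all jobs under one cube saved into a
> single table (currently each job produces its own table), is there a good
> solution?
> The attachment is the v2.4.0 source code.
> Thanks.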





[jira] [Commented] (KYLIN-3525) kylin.source.hive.keep-flat-table=true will delete data

2018-09-12 Thread ASF subversion and git services (JIRA)


[ 
https://issues.apache.org/jira/browse/KYLIN-3525?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16611923#comment-16611923
 ] 

ASF subversion and git services commented on KYLIN-3525:


Commit 63b435d6d00799f558a2dc8e8cac712f3dc1eb33 in kylin's branch 
refs/heads/2.5.0-release from hit-lacus
[ https://gitbox.apache.org/repos/asf?p=kylin.git;h=63b435d ]

KYLIN-3525 Reserve intermediate flat table data if 
kylin.source.hive.keep-flat-table set to true


> kylin.source.hive.keep-flat-table=true will delete data
> ---
>
> Key: KYLIN-3525
> URL: https://issues.apache.org/jira/browse/KYLIN-3525
> Project: Kylin
>  Issue Type: Bug
>  Components: Streaming
>Affects Versions: v2.4.0
>Reporter: wlxie
>Assignee: XiaoXiang Yu
>Priority: Minor
> Fix For: v2.6.0
>
> Attachments: 1535534470(1).png, HiveMRInput.java
>
>
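> After kylin.source.hive.keep-flat-table is set to true, only the Hive table
> structure is kept; the data is still deleted. I checked the source code, and
> this parameter indeed only controls the table structure. If I want the data
> to be kept as well, with the data of all jobs under one cube saved into a
> single table (currently each job produces its own table), is there a good
> solution?
> The attachment is the v2.4.0 source code.
> Thanks.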





[jira] [Commented] (KYLIN-3513) Release 2.5.0

2018-09-12 Thread ASF subversion and git services (JIRA)


[ 
https://issues.apache.org/jira/browse/KYLIN-3513?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16611925#comment-16611925
 ] 

ASF subversion and git services commented on KYLIN-3513:


Commit 3d20c8b05db88899ef00b8f780cb0d9ea4f4c516 in kylin's branch 
refs/heads/2.5.0-release from shaofengshi
[ https://gitbox.apache.org/repos/asf?p=kylin.git;h=3d20c8b ]

KYLIN-3513 add plugin version to pom.xml
KYLIN-3513 disable javadoc check
KYLIN-3513 use SHA-256 algorithm


> Release 2.5.0
> -
>
> Key: KYLIN-3513
> URL: https://issues.apache.org/jira/browse/KYLIN-3513
> Project: Kylin
>  Issue Type: Task
>Reporter: Shaofeng SHI
>Assignee: Shaofeng SHI
>Priority: Major
>






[jira] [Commented] (KYLIN-3513) Release 2.5.0

2018-09-12 Thread ASF subversion and git services (JIRA)


[ 
https://issues.apache.org/jira/browse/KYLIN-3513?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16611924#comment-16611924
 ] 

ASF subversion and git services commented on KYLIN-3513:


Commit 3d20c8b05db88899ef00b8f780cb0d9ea4f4c516 in kylin's branch 
refs/heads/2.5.0-release from shaofengshi
[ https://gitbox.apache.org/repos/asf?p=kylin.git;h=3d20c8b ]

KYLIN-3513 add plugin version to pom.xml
KYLIN-3513 disable javadoc check
KYLIN-3513 use SHA-256 algorithm


> Release 2.5.0
> -
>
> Key: KYLIN-3513
> URL: https://issues.apache.org/jira/browse/KYLIN-3513
> Project: Kylin
>  Issue Type: Task
>Reporter: Shaofeng SHI
>Assignee: Shaofeng SHI
>Priority: Major
>






[jira] [Commented] (KYLIN-3513) Release 2.5.0

2018-09-12 Thread ASF subversion and git services (JIRA)


[ 
https://issues.apache.org/jira/browse/KYLIN-3513?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16611926#comment-16611926
 ] 

ASF subversion and git services commented on KYLIN-3513:


Commit 3d20c8b05db88899ef00b8f780cb0d9ea4f4c516 in kylin's branch 
refs/heads/2.5.0-release from shaofengshi
[ https://gitbox.apache.org/repos/asf?p=kylin.git;h=3d20c8b ]

KYLIN-3513 add plugin version to pom.xml
KYLIN-3513 disable javadoc check
KYLIN-3513 use SHA-256 algorithm


> Release 2.5.0
> -
>
> Key: KYLIN-3513
> URL: https://issues.apache.org/jira/browse/KYLIN-3513
> Project: Kylin
>  Issue Type: Task
>Reporter: Shaofeng SHI
>Assignee: Shaofeng SHI
>Priority: Major
>






[jira] [Commented] (KYLIN-3525) kylin.source.hive.keep-flat-table=true will delete data

2018-09-12 Thread ASF subversion and git services (JIRA)


[ 
https://issues.apache.org/jira/browse/KYLIN-3525?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16611917#comment-16611917
 ] 

ASF subversion and git services commented on KYLIN-3525:


Commit 63b435d6d00799f558a2dc8e8cac712f3dc1eb33 in kylin's branch 
refs/heads/2.5.x from hit-lacus
[ https://gitbox.apache.org/repos/asf?p=kylin.git;h=63b435d ]

KYLIN-3525 Reserve intermediate flat table data if 
kylin.source.hive.keep-flat-table set to true


> kylin.source.hive.keep-flat-table=true will delete data
> ---
>
> Key: KYLIN-3525
> URL: https://issues.apache.org/jira/browse/KYLIN-3525
> Project: Kylin
>  Issue Type: Bug
>  Components: Streaming
>Affects Versions: v2.4.0
>Reporter: wlxie
>Assignee: XiaoXiang Yu
>Priority: Minor
> Fix For: v2.6.0
>
> Attachments: 1535534470(1).png, HiveMRInput.java
>
>
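> After kylin.source.hive.keep-flat-table is set to true, only the Hive table
> structure is kept; the data is still deleted. I checked the source code, and
> this parameter indeed only controls the table structure. If I want the data
> to be kept as well, with the data of all jobs under one cube saved into a
> single table (currently each job produces its own table), is there a good
> solution?
> The attachment is the v2.4.0 source code.
> Thanks.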





[jira] [Commented] (KYLIN-3525) kylin.source.hive.keep-flat-table=true will delete data

2018-09-12 Thread ASF GitHub Bot (JIRA)


[ 
https://issues.apache.org/jira/browse/KYLIN-3525?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16611895#comment-16611895
 ] 

ASF GitHub Bot commented on KYLIN-3525:
---

shaofengshi closed pull request #230: KYLIN-3525 Reserve intermediate flat 
table data if kylin.source.hive.keep-flat-table set to true
URL: https://github.com/apache/kylin/pull/230
 
 
   

This is a PR merged from a forked repository.
As GitHub hides the original diff on merge, it is displayed below for
the sake of provenance:

As this is a foreign pull request (from a fork), the diff is supplied
below (as it won't show otherwise due to GitHub magic):

diff --git a/source-hive/src/main/java/org/apache/kylin/source/hive/GarbageCollectionStep.java b/source-hive/src/main/java/org/apache/kylin/source/hive/GarbageCollectionStep.java
index ac25d07389..7dc8260980 100644
--- a/source-hive/src/main/java/org/apache/kylin/source/hive/GarbageCollectionStep.java
+++ b/source-hive/src/main/java/org/apache/kylin/source/hive/GarbageCollectionStep.java
@@ -60,16 +60,17 @@ private String cleanUpIntermediateFlatTable(KylinConfig config) throws IOExcepti
         StringBuffer output = new StringBuffer();
         final HiveCmdBuilder hiveCmdBuilder = new HiveCmdBuilder();
         final List<String> hiveTables = this.getIntermediateTables();
-        for (String hiveTable : hiveTables) {
-            if (config.isHiveKeepFlatTable() == false && StringUtils.isNotEmpty(hiveTable)) {
-                hiveCmdBuilder.addStatement("USE " + config.getHiveDatabaseForIntermediateTable() + ";");
-                hiveCmdBuilder.addStatement("DROP TABLE IF EXISTS  " + hiveTable + ";");
-
-                output.append("Hive table " + hiveTable + " is dropped. \n");
+        if (!config.isHiveKeepFlatTable()){
+            for (String hiveTable : hiveTables) {
+                if (StringUtils.isNotEmpty(hiveTable)) {
+                    hiveCmdBuilder.addStatement("USE " + config.getHiveDatabaseForIntermediateTable() + ";");
+                    hiveCmdBuilder.addStatement("DROP TABLE IF EXISTS  " + hiveTable + ";");
+                    output.append("Hive table " + hiveTable + " is dropped. \n");
+                }
             }
+            rmdirOnHDFS(getExternalDataPaths());
         }
         config.getCliCommandExecutor().execute(hiveCmdBuilder.build());
-        rmdirOnHDFS(getExternalDataPaths());
         output.append("Path " + getExternalDataPaths() + " is deleted. \n");
 
         return output.toString();


 


This is an automated message from the Apache Git Service.
To respond to the message, please log on GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


> kylin.source.hive.keep-flat-table=true will delete data
> ---
>
> Key: KYLIN-3525
> URL: https://issues.apache.org/jira/browse/KYLIN-3525
> Project: Kylin
>  Issue Type: Bug
>  Components: Streaming
>Affects Versions: v2.4.0
>Reporter: wlxie
>Assignee: XiaoXiang Yu
>Priority: Minor
> Fix For: v2.6.0
>
> Attachments: 1535534470(1).png, HiveMRInput.java
>
>
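> After kylin.source.hive.keep-flat-table is set to true, only the Hive table
> structure is kept; the data is still deleted. I checked the source code, and
> this parameter indeed only controls the table structure. If I want the data
> to be kept as well, with the data of all jobs under one cube saved into a
> single table (currently each job produces its own table), is there a good
> solution?
> The attachment is the v2.4.0 source code.
> Thanks.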





[jira] [Commented] (KYLIN-3525) kylin.source.hive.keep-flat-table=true will delete data

2018-09-12 Thread ASF subversion and git services (JIRA)


[ 
https://issues.apache.org/jira/browse/KYLIN-3525?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16611896#comment-16611896
 ] 

ASF subversion and git services commented on KYLIN-3525:


Commit 375acdedf88a298bfcd2d2dddbf6f8551f42e427 in kylin's branch 
refs/heads/master from hit-lacus
[ https://gitbox.apache.org/repos/asf?p=kylin.git;h=375acde ]

KYLIN-3525 Reserve intermediate flat table data if 
kylin.source.hive.keep-flat-table set to true


> kylin.source.hive.keep-flat-table=true will delete data
> ---
>
> Key: KYLIN-3525
> URL: https://issues.apache.org/jira/browse/KYLIN-3525
> Project: Kylin
>  Issue Type: Bug
>  Components: Streaming
>Affects Versions: v2.4.0
>Reporter: wlxie
>Assignee: XiaoXiang Yu
>Priority: Minor
> Fix For: v2.6.0
>
> Attachments: 1535534470(1).png, HiveMRInput.java
>
>
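> After kylin.source.hive.keep-flat-table is set to true, only the Hive table
> structure is kept; the data is still deleted. I checked the source code, and
> this parameter indeed only controls the table structure. If I want the data
> to be kept as well, with the data of all jobs under one cube saved into a
> single table (currently each job produces its own table), is there a good
> solution?
> The attachment is the v2.4.0 source code.
> Thanks.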





[jira] [Assigned] (KYLIN-3558) Kylin for CDH6.0.0

2018-09-12 Thread Lijun Cao (JIRA)


 [ 
https://issues.apache.org/jira/browse/KYLIN-3558?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Lijun Cao reassigned KYLIN-3558:


Assignee: Lijun Cao

> Kylin for CDH6.0.0
> --
>
> Key: KYLIN-3558
> URL: https://issues.apache.org/jira/browse/KYLIN-3558
> Project: Kylin
>  Issue Type: Wish
>Reporter: Lijun Cao
>Assignee: Lijun Cao
>Priority: Major
>






[jira] [Created] (KYLIN-3558) Kylin for CDH6.0.0

2018-09-12 Thread Lijun Cao (JIRA)
Lijun Cao created KYLIN-3558:


 Summary: Kylin for CDH6.0.0
 Key: KYLIN-3558
 URL: https://issues.apache.org/jira/browse/KYLIN-3558
 Project: Kylin
  Issue Type: Wish
Reporter: Lijun Cao








[jira] [Commented] (KYLIN-3513) Release 2.5.0

2018-09-12 Thread ASF subversion and git services (JIRA)


[ 
https://issues.apache.org/jira/browse/KYLIN-3513?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16611870#comment-16611870
 ] 

ASF subversion and git services commented on KYLIN-3513:


Commit 8d576e19e50ac135c580b914c8bbe735e21ffc56 in kylin's branch 
refs/heads/2.5.0-release from shaofengshi
[ https://gitbox.apache.org/repos/asf?p=kylin.git;h=8d576e1 ]

KYLIN-3513 add plugin version to pom.xml


> Release 2.5.0
> -
>
> Key: KYLIN-3513
> URL: https://issues.apache.org/jira/browse/KYLIN-3513
> Project: Kylin
>  Issue Type: Task
>Reporter: Shaofeng SHI
>Assignee: Shaofeng SHI
>Priority: Major
>






[jira] [Commented] (KYLIN-3513) Release 2.5.0

2018-09-12 Thread ASF subversion and git services (JIRA)


[ 
https://issues.apache.org/jira/browse/KYLIN-3513?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16611871#comment-16611871
 ] 

ASF subversion and git services commented on KYLIN-3513:


Commit 07af02ff8037cececb8e188934d30530f4e4da90 in kylin's branch 
refs/heads/2.5.0-release from shaofengshi
[ https://gitbox.apache.org/repos/asf?p=kylin.git;h=07af02f ]

KYLIN-3513 disable javadoc check


> Release 2.5.0
> -
>
> Key: KYLIN-3513
> URL: https://issues.apache.org/jira/browse/KYLIN-3513
> Project: Kylin
>  Issue Type: Task
>Reporter: Shaofeng SHI
>Assignee: Shaofeng SHI
>Priority: Major
>






[jira] [Assigned] (KYLIN-3513) Release 2.5.0

2018-09-12 Thread Shaofeng SHI (JIRA)


 [ 
https://issues.apache.org/jira/browse/KYLIN-3513?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Shaofeng SHI reassigned KYLIN-3513:
---

Assignee: Shaofeng SHI

> Release 2.5.0
> -
>
> Key: KYLIN-3513
> URL: https://issues.apache.org/jira/browse/KYLIN-3513
> Project: Kylin
>  Issue Type: Task
>Reporter: Shaofeng SHI
>Assignee: Shaofeng SHI
>Priority: Major
>






[jira] [Resolved] (KYLIN-3551) Spark job failed with "FileNotFoundException"

2018-09-12 Thread Shaofeng SHI (JIRA)


 [ 
https://issues.apache.org/jira/browse/KYLIN-3551?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Shaofeng SHI resolved KYLIN-3551.
-
Resolution: Fixed

> Spark job failed with "FileNotFoundException" 
> --
>
> Key: KYLIN-3551
> URL: https://issues.apache.org/jira/browse/KYLIN-3551
> Project: Kylin
>  Issue Type: Bug
>  Components: Spark Engine
>Reporter: Shaofeng SHI
>Assignee: Chao Long
>Priority: Minor
> Fix For: v2.5.0
>
>
> java.io.FileNotFoundException: File does not exist: hdfs://sandbox.hortonworks.com:8020/kylin/kylin_default_instance/kylin-a3e39298-8dc3-21f2-cf16-0aa5e451c777/kylin_sales_cube_clone_clone/counter
>   at org.apache.hadoop.hdfs.DistributedFileSystem$22.doCall(DistributedFileSystem.java:1319)
>   at org.apache.hadoop.hdfs.DistributedFileSystem$22.doCall(DistributedFileSystem.java:1311)
>   at org.apache.hadoop.fs.FileSystemLinkResolver.resolve(FileSystemLinkResolver.java:81)
>   at org.apache.hadoop.hdfs.DistributedFileSystem.getFileStatus(DistributedFileSystem.java:1311)
>   at org.apache.hadoop.io.SequenceFile$Reader.<init>(SequenceFile.java:1752)
>   at org.apache.hadoop.io.SequenceFile$Reader.<init>(SequenceFile.java:1776)
>   at org.apache.kylin.common.util.HadoopUtil.readFromSequenceFile(HadoopUtil.java:218)
>   at org.apache.kylin.common.util.HadoopUtil.readFromSequenceFile(HadoopUtil.java:233)
>   at org.apache.kylin.engine.spark.SparkExecutable.doWork(SparkExecutable.java:319)
>   at org.apache.kylin.job.execution.AbstractExecutable.execute(AbstractExecutable.java:163)
>   at org.apache.kylin.job.execution.DefaultChainedExecutable.doWork(DefaultChainedExecutable.java:69)
>   at org.apache.kylin.job.execution.AbstractExecutable.execute(AbstractExecutable.java:163)
>   at org.apache.kylin.job.impl.threadpool.DefaultScheduler$JobRunner.run(DefaultScheduler.java:113)
>   at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1149)
>   at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:624)
>   at java.lang.Thread.run(Thread.java:748)





[jira] [Resolved] (KYLIN-3557) PreparedStatement should be closed in JDBCResourceDAO#checkTableExists

2018-09-12 Thread Shaofeng SHI (JIRA)


 [ 
https://issues.apache.org/jira/browse/KYLIN-3557?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Shaofeng SHI resolved KYLIN-3557.
-
Resolution: Fixed

> PreparedStatement should be closed in JDBCResourceDAO#checkTableExists
> --
>
> Key: KYLIN-3557
> URL: https://issues.apache.org/jira/browse/KYLIN-3557
> Project: Kylin
>  Issue Type: Bug
>  Components: Metadata
>Reporter: Ted Yu
>Assignee: Shaofeng SHI
>Priority: Minor
> Fix For: v2.5.0
>
>
> {code}
> final PreparedStatement ps = 
> connection.prepareStatement(getCheckTableExistsSql(tableName));
> final ResultSet rs = ps.executeQuery();
> {code}
> {{ps}} should be closed upon return.
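
One way to guarantee the close is try-with-resources. The sketch below is a minimal illustration under stated assumptions, not the actual Kylin patch: `CheckTableExistsSketch` and `fakeConnection` are hypothetical, the SQL text is passed in because `getCheckTableExistsSql(tableName)` is not shown in the issue, and treating a non-empty result set as "table exists" is an assumption. The fake connection exists only so the example runs without a database.

```java
import java.lang.reflect.Proxy;
import java.sql.Connection;
import java.sql.PreparedStatement;
import java.sql.ResultSet;
import java.sql.SQLException;
import java.util.ArrayList;
import java.util.List;

public class CheckTableExistsSketch {

    // Hypothetical stand-in for JDBCResourceDAO#checkTableExists.
    static boolean checkTableExists(Connection connection, String checkSql) throws SQLException {
        // try-with-resources closes rs first, then ps, on every exit path,
        // including when executeQuery() or next() throws.
        try (PreparedStatement ps = connection.prepareStatement(checkSql);
             ResultSet rs = ps.executeQuery()) {
            return rs.next(); // assumption: any returned row means the table exists
        }
    }

    // Demo scaffolding only: an in-memory fake that records close() calls.
    static Connection fakeConnection(List<String> closed) {
        ClassLoader cl = CheckTableExistsSketch.class.getClassLoader();
        ResultSet rs = (ResultSet) Proxy.newProxyInstance(cl, new Class<?>[] {ResultSet.class},
                (proxy, method, methodArgs) -> {
                    if (method.getName().equals("close")) {
                        closed.add("ResultSet");
                        return null;
                    }
                    return method.getName().equals("next") ? Boolean.TRUE : null;
                });
        PreparedStatement ps = (PreparedStatement) Proxy.newProxyInstance(cl,
                new Class<?>[] {PreparedStatement.class},
                (proxy, method, methodArgs) -> {
                    if (method.getName().equals("close")) {
                        closed.add("PreparedStatement");
                        return null;
                    }
                    return method.getName().equals("executeQuery") ? rs : null;
                });
        return (Connection) Proxy.newProxyInstance(cl, new Class<?>[] {Connection.class},
                (proxy, method, methodArgs) -> method.getName().equals("prepareStatement") ? ps : null);
    }

    public static void main(String[] args) throws SQLException {
        List<String> closed = new ArrayList<>();
        boolean exists = checkTableExists(fakeConnection(closed), "SELECT 1");
        System.out.println(exists + " " + closed);
        // prints true [ResultSet, PreparedStatement]
    }
}
```

Resources declared in the same try header are closed in reverse order of declaration, so the result set is released before the statement that produced it.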





[jira] [Resolved] (KYLIN-3547) DimensionRangeInfo: Unsupported data type boolean

2018-09-12 Thread Shaofeng SHI (JIRA)


 [ 
https://issues.apache.org/jira/browse/KYLIN-3547?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Shaofeng SHI resolved KYLIN-3547.
-
Resolution: Fixed

> DimensionRangeInfo: Unsupported data type boolean
> -
>
> Key: KYLIN-3547
> URL: https://issues.apache.org/jira/browse/KYLIN-3547
> Project: Kylin
>  Issue Type: Bug
>  Components: Job Engine, Metadata
>Affects Versions: v2.5.0
>Reporter: Yichen Zhou
>Assignee: Yichen Zhou
>Priority: Blocker
> Fix For: v2.5.0
>
>
> In Extract Fact Table Distinct Columns Step, Kylin can not get dimension 
> range information from data of boolean type because DataType.getOrder() does 
> not support boolean.
> {quote}
> java.lang.RuntimeException: error execute org.apache.kylin.engine.spark.SparkFactDistinct
>   at org.apache.kylin.common.util.AbstractApplication.execute(AbstractApplication.java:42)
>   at org.apache.kylin.common.util.SparkEntry.main(SparkEntry.java:44)
>   at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
>   at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62)
>   at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
>   at java.lang.reflect.Method.invoke(Method.java:498)
>   at org.apache.spark.deploy.yarn.ApplicationMaster$$anon$2.run(ApplicationMaster.scala:636)
> Caused by: org.apache.spark.SparkException: Job aborted due to stage failure: Task 5 in stage 1.0 failed 4 times, most recent failure: Lost task 5.3 in stage 1.0 (TID 17, slave1.kcluster, executor 5): java.lang.IllegalArgumentException: Unsupported data type boolean
>   at org.apache.kylin.metadata.datatype.DataTypeOrder.getInstance(DataTypeOrder.java:53)
>   at org.apache.kylin.metadata.datatype.DataType.getOrder(DataType.java:230)
>   at org.apache.kylin.metadata.datatype.DataType.compare(DataType.java:236)
>   at org.apache.kylin.engine.spark.SparkFactDistinct$MultiOutputFunction.call(SparkFactDistinct.java:739)
>   at org.apache.kylin.engine.spark.SparkFactDistinct$MultiOutputFunction.call(SparkFactDistinct.java:612)
>   at org.apache.spark.api.java.JavaRDDLike$$anonfun$fn$7$1.apply(JavaRDDLike.scala:186)
>   at org.apache.spark.api.java.JavaRDDLike$$anonfun$fn$7$1.apply(JavaRDDLike.scala:186)
>   at org.apache.spark.rdd.RDD$$anonfun$mapPartitions$1$$anonfun$apply$23.apply(RDD.scala:797)
>   at org.apache.spark.rdd.RDD$$anonfun$mapPartitions$1$$anonfun$apply$23.apply(RDD.scala:797)
>   at org.apache.spark.rdd.MapPartitionsRDD.compute(MapPartitionsRDD.scala:38)
>   at org.apache.spark.rdd.RDD.computeOrReadCheckpoint(RDD.scala:323)
>   at org.apache.spark.rdd.RDD.iterator(RDD.scala:287)
>   at org.apache.spark.scheduler.ResultTask.runTask(ResultTask.scala:87)
>   at org.apache.spark.scheduler.Task.run(Task.scala:99)
>   at org.apache.spark.executor.Executor$TaskRunner.run(Executor.scala:325)
>   at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1149)
>   at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:624)
>   at java.lang.Thread.run(Thread.java:748)
> {quote}
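
To illustrate the failure mode, here is a small sketch of a type-name-to-ordering lookup that includes a boolean case. It is a hypothetical simplification: `BooleanOrderSketch` and `orderFor` are invented for this example, and Kylin's real `DataTypeOrder` is not shown in the issue and may be structured differently. The sketch only shows that once an ordering exists for boolean values, a minimum and maximum (the dimension range) can be computed.

```java
import java.util.Arrays;
import java.util.Collections;
import java.util.Comparator;
import java.util.List;

public class BooleanOrderSketch {

    // Hypothetical lookup from a type name to an ordering over string-encoded values.
    static Comparator<String> orderFor(String typeName) {
        switch (typeName) {
        case "bigint":
            return (a, b) -> Long.compare(Long.parseLong(a), Long.parseLong(b));
        case "varchar":
            return Comparator.naturalOrder();
        case "boolean":
            // false sorts before true, as in Boolean.compare
            return (a, b) -> Boolean.compare(Boolean.parseBoolean(a), Boolean.parseBoolean(b));
        default:
            // Without the "boolean" case above, boolean columns would end up here.
            throw new IllegalArgumentException("Unsupported data type " + typeName);
        }
    }

    public static void main(String[] args) {
        List<String> values = Arrays.asList("true", "false", "true");
        Comparator<String> order = orderFor("boolean");

        // The dimension range is the minimum and maximum under that ordering.
        System.out.println(Collections.min(values, order) + " " + Collections.max(values, order));
        // prints false true
    }
}
```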





[jira] [Commented] (KYLIN-3555) Garbage collection on HBase step fails with S3 selected as storage

2018-09-12 Thread Shaofeng SHI (JIRA)


[ 
https://issues.apache.org/jira/browse/KYLIN-3555?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16611665#comment-16611665
 ] 

Shaofeng SHI commented on KYLIN-3555:
-

How did you configure the filesystem in kylin.properties? In particular, what 
are the values of the following parameters?

kylin.storage.hbase.cluster-fs=

kylin.env.hdfs-working-dir=

> Garbage collection on HBase step fails with S3 selected as storage
> --
>
> Key: KYLIN-3555
> URL: https://issues.apache.org/jira/browse/KYLIN-3555
> Project: Kylin
>  Issue Type: Bug
>  Components: Job Engine
>Affects Versions: v2.4.1
>Reporter: Iñigo Martinez
>Priority: Major
>  Labels: build
> Attachments: Screenshot from 2018-09-11 12-31-25.png
>
>
> When building a cube with S3 selected as storage, the build process fails at
> the last step.
> Although S3 has been defined as storage, the cleanup task also tries to
> delete from HDFS and, of course, the files do not exist on HDFS.
>  
> {code:java}
> 2018-09-11 12:27:56,311 DEBUG [Scheduler 1407846257 Job f8416975-eea6-4500-9cb7-4374f28451dc-237] steps.HDFSPathGarbageCollectionStep:78 : Drop HDFS path on FileSystem: s3://XXX-emr-kylin
> 2018-09-11 12:27:57,364 DEBUG [Scheduler 1407846257 Job f8416975-eea6-4500-9cb7-4374f28451dc-237] steps.HDFSPathGarbageCollectionStep:87 : HDFS path /kylin/kylin_metadata/kylin-f8416975-eea6-4500-9cb7-4374f28451dc/plataforma_transacciones_cubo_v1/fact_distinct_columns is dropped.
> 2018-09-11 12:27:58,104 DEBUG [Scheduler 1407846257 Job f8416975-eea6-4500-9cb7-4374f28451dc-237] steps.HDFSPathGarbageCollectionStep:87 : HDFS path /kylin/kylin_metadata/kylin-f8416975-eea6-4500-9cb7-4374f28451dc/plataforma_transacciones_cubo_v1/hfile is dropped.
> 2018-09-11 12:27:58,140 DEBUG [Scheduler 1407846257 Job f8416975-eea6-4500-9cb7-4374f28451dc-237] steps.HDFSPathGarbageCollectionStep:78 : Drop HDFS path on FileSystem: hdfs://ip-10-0-1-63.eu-west-1.compute.internal:8020
> 2018-09-11 12:27:58,142 DEBUG [Scheduler 1407846257 Job f8416975-eea6-4500-9cb7-4374f28451dc-237] steps.HDFSPathGarbageCollectionStep:90 : HDFS path /kylin/kylin_metadata/kylin-f8416975-eea6-4500-9cb7-4374f28451dc/plataforma_transacciones_cubo_v1/fact_distinct_columns not exists.
> 2018-09-11 12:27:58,147 ERROR [Scheduler 1407846257 Job f8416975-eea6-4500-9cb7-4374f28451dc-237] steps.HDFSPathGarbageCollectionStep:68 : job:f8416975-eea6-4500-9cb7-4374f28451dc-15 execute finished with exception
> java.io.FileNotFoundException: File /kylin/kylin_metadata/kylin-f8416975-eea6-4500-9cb7-4374f28451dc/plataforma_transacciones_cubo_v1 does not exist.
>   at org.apache.hadoop.hdfs.DistributedFileSystem.listStatusInternal(DistributedFileSystem.java:904)
>   at org.apache.hadoop.hdfs.DistributedFileSystem.access$600(DistributedFileSystem.java:114)
>   at org.apache.hadoop.hdfs.DistributedFileSystem$22.doCall(DistributedFileSystem.java:964)
>   at org.apache.hadoop.hdfs.DistributedFileSystem$22.doCall(DistributedFileSystem.java:961)
>   at org.apache.hadoop.fs.FileSystemLinkResolver.resolve(FileSystemLinkResolver.java:81)
>   at org.apache.hadoop.hdfs.DistributedFileSystem.listStatus(DistributedFileSystem.java:971)
>   at org.apache.kylin.storage.hbase.steps.HDFSPathGarbageCollectionStep.dropHdfsPathOnCluster(HDFSPathGarbageCollectionStep.java:95)
>   at org.apache.kylin.storage.hbase.steps.HDFSPathGarbageCollectionStep.doWork(HDFSPathGarbageCollectionStep.java:65)
>   at org.apache.kylin.job.execution.AbstractExecutable.execute(AbstractExecutable.java:162)
>   at org.apache.kylin.job.execution.DefaultChainedExecutable.doWork(DefaultChainedExecutable.java:69)
>   at org.apache.kylin.job.execution.AbstractExecutable.execute(AbstractExecutable.java:162)
>   at org.apache.kylin.job.impl.threadpool.DefaultScheduler$JobRunner.run(DefaultScheduler.java:113)
>   at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1149)
>   at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:624)
>   at java.lang.Thread.run(Thread.java:748)
> {code}
>  
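
The log above shows the step reporting that the path does not exist on HDFS and then failing while listing its parent directory. One defensive pattern for that situation is to check that a directory exists before listing it. The sketch below illustrates the pattern with `java.nio.file` as a stand-in for the Hadoop `FileSystem` API so that it runs without a cluster; `GuardedCleanupSketch` and `dropPath` are hypothetical names, and nothing here claims to be how Kylin resolved the issue.

```java
import java.io.IOException;
import java.nio.file.Files;
import java.nio.file.Path;
import java.util.stream.Stream;

public class GuardedCleanupSketch {

    // Hypothetical helper: drops `path` if it exists, then removes the parent
    // directory only when that directory exists and is empty.
    // Returns whether `path` itself was deleted.
    static boolean dropPath(Path path) throws IOException {
        boolean dropped = Files.deleteIfExists(path); // handles files and empty directories
        Path parent = path.getParent();
        // Guard: listing a directory that does not exist would throw, which is
        // the failure mode visible in the stack trace above.
        if (parent == null || !Files.isDirectory(parent)) {
            return dropped;
        }
        boolean parentIsEmpty;
        try (Stream<Path> children = Files.list(parent)) {
            parentIsEmpty = !children.findAny().isPresent();
        }
        if (parentIsEmpty) {
            Files.delete(parent);
        }
        return dropped;
    }

    public static void main(String[] args) throws IOException {
        Path root = Files.createTempDirectory("gc-sketch");
        Path jobDir = Files.createDirectory(root.resolve("job"));
        Path hfile = Files.createFile(jobDir.resolve("hfile"));

        System.out.println(dropPath(hfile));                                      // prints true
        System.out.println(dropPath(root.resolve("gone").resolve("hfile")));      // prints false
        Files.deleteIfExists(root);
    }
}
```

The second call targets a path whose parent never existed and returns normally instead of throwing, which is the behaviour a cleanup step usually wants.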


