[jira] [Created] (KYLIN-2846) Add a config of hbase namespace for cube storage
Liu Shaohui created KYLIN-2846: -- Summary: Add a config of hbase namespace for cube storage Key: KYLIN-2846 URL: https://issues.apache.org/jira/browse/KYLIN-2846 Project: Kylin Issue Type: New Feature Components: Storage - HBase Affects Versions: v2.1.0 Reporter: Liu Shaohui Assignee: liyang Priority: Minor Fix For: Future In a multi-tenant HBase cluster, namespaces are important for quota management and permission control, so we add a global configuration of the HBase namespace used for cube storage. -- This message was sent by Atlassian JIRA (v6.4.14#64029)
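The intended setup could look like the following kylin.properties fragment. This is a sketch of the proposal only: the property name shown here is an assumption, not necessarily the name the patch finally used.

```properties
# Hypothetical property sketching this issue's proposal: create all cube
# HTables under the given HBase namespace instead of "default", so HBase
# quota and permission rules can be applied per tenant.
kylin.storage.hbase.namespace=kylin_prod

# Cube storage tables would then be namespace-qualified, e.g.
#   kylin_prod:KYLIN_XXXXXXXXXX
```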
[jira] [Created] (KYLIN-3156) Failed to delete meta path in SparkCubingByLayer
Liu Shaohui created KYLIN-3156: -- Summary: Failed to delete meta path in SparkCubingByLayer Key: KYLIN-3156 URL: https://issues.apache.org/jira/browse/KYLIN-3156 Project: Kylin Issue Type: Bug Reporter: Liu Shaohui After KYLIN-2945, the meta url in SparkCubingByLayer is the string form of a StorageURL, not a string in the path@hdfs format. This makes the deleteHDFSMeta method fail in SparkCubingByLayer. {quote} 2018-01-08,11:51:50,903 ERROR org.apache.spark.deploy.yarn.ApplicationMaster: User class threw exception: java.lang.RuntimeException: error execute org.apache.kylin.engine.spark.SparkCubingByLayer java.lang.RuntimeException: error execute org.apache.kylin.engine.spark.SparkCubingByLayer at org.apache.kylin.common.util.AbstractApplication.execute(AbstractApplication.java:42) at org.apache.kylin.common.util.SparkEntry.main(SparkEntry.java:44) at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method) at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62) at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43) at java.lang.reflect.Method.invoke(Method.java:498) at org.apache.spark.deploy.yarn.ApplicationMaster$$anon$2.run(ApplicationMaster.scala:653) Caused by: java.lang.IllegalArgumentException: Cannot create FileSystem from URI: kylin_tst:kylin_metadata at org.apache.kylin.common.util.HadoopUtil.makeURI(HadoopUtil.java:98) at org.apache.kylin.common.util.HadoopUtil.getFileSystem(HadoopUtil.java:78) at org.apache.kylin.engine.spark.SparkCubingByLayer.deleteHDFSMeta(SparkCubingByLayer.java:484) at org.apache.kylin.engine.spark.SparkCubingByLayer.execute(SparkCubingByLayer.java:207) at org.apache.kylin.common.util.AbstractApplication.execute(AbstractApplication.java:37) ... 
6 more Caused by: java.net.URISyntaxException: Illegal character in scheme name at index 5: kylin_tst:kylin_metadata at java.net.URI$Parser.fail(URI.java:2848) at java.net.URI$Parser.checkChars(URI.java:3021) at java.net.URI$Parser.parse(URI.java:3048) at java.net.URI.<init>(URI.java:588) at org.apache.kylin.common.util.HadoopUtil.makeURI(HadoopUtil.java:96) ... 10 more {quote} -- This message was sent by Atlassian JIRA (v6.4.14#64029)
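The root cause is visible in a tiny standalone reproduction, independent of Kylin: an underscore is not a legal character in a URI scheme (RFC 3986 allows only letters, digits, `+`, `-`, `.`), so java.net.URI rejects a metadata URL such as `kylin_tst:kylin_metadata` when it is handed to HadoopUtil.makeURI as-is.

```java
import java.net.URI;
import java.net.URISyntaxException;

public class UriSchemeDemo {
    public static void main(String[] args) {
        // A StorageURL rendered as a plain string looks like a URI whose
        // scheme is "kylin_tst". The '_' at index 5 is illegal in a scheme
        // name, so parsing fails exactly as in the stack trace above.
        try {
            new URI("kylin_tst:kylin_metadata");
            System.out.println("parsed (unexpected)");
        } catch (URISyntaxException e) {
            // "Illegal character in scheme name at index 5: ..."
            System.out.println("rejected: " + e.getMessage());
        }
    }
}
```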
[jira] [Created] (KYLIN-3357) Sum of small int measure may be negative after KYLIN-2982
Liu Shaohui created KYLIN-3357: -- Summary: Sum of small int measure may be negative after KYLIN-2982 Key: KYLIN-3357 URL: https://issues.apache.org/jira/browse/KYLIN-3357 Project: Kylin Issue Type: Bug Components: Query Engine Affects Versions: v2.3.0 Reporter: Liu Shaohui After KYLIN-2982, the sum of a small int measure may be negative. The same problem was reported on the Kylin user mailing list under the title "negative result in kylin 2.3.0". -- This message was sent by Atlassian JIRA (v7.6.3#76005)
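The underlying arithmetic is easy to reproduce outside Kylin: if the sum over a tinyint/smallint column is accumulated in the same narrow type instead of a widened one, the accumulator wraps around to a negative value. A standalone sketch of the effect, not Kylin's actual aggregation code:

```java
public class NarrowSumDemo {
    public static void main(String[] args) {
        byte[] values = {100, 100};   // all-positive tinyint values

        byte narrowSum = 0;           // wrong: accumulator as narrow as the input
        long wideSum = 0;             // right: widened accumulator
        for (byte v : values) {
            narrowSum += v;           // wraps modulo 256: 200 becomes -56
            wideSum += v;
        }
        System.out.println(narrowSum);  // -56
        System.out.println(wideSum);    // 200
    }
}
```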
[jira] [Created] (KYLIN-3495) Wrong data type when using the max function on an empty column
Liu Shaohui created KYLIN-3495: -- Summary: Wrong data type when using the max function on an empty column Key: KYLIN-3495 URL: https://issues.apache.org/jira/browse/KYLIN-3495 Project: Kylin Issue Type: Bug Reporter: Liu Shaohui SQL: {code:java} select count(*),sum(PER_BYTES_TIME_COST)/count(PER_BYTES_TIME_COST),max(PER_BYTES_TIME_COST),min(PER_BYTES_TIME_COST) from KYLIN_ONEBOX.HIVE_METRICS_JOB_DEV where KDAY_DATE >= '2018-07-01' and KDAY_DATE <= '2018-07-31' and PROJECT ='LEARN_KYLIN'{code} Exception: {code:java} NoSuchMethodException: SqlFunctions.greater(java.math.BigDecimal, double) while resolving method 'greater[class java.math.BigDecimal, double]' in class class org.apache.calcite.runtime.SqlFunctions at org.apache.calcite.avatica.Helper.createException(Helper.java:56) at org.apache.calcite.avatica.Helper.createException(Helper.java:41) at org.apache.calcite.avatica.AvaticaStatement.executeInternal(AvaticaStatement.java:156) at org.apache.calcite.avatica.AvaticaStatement.executeQuery(AvaticaStatement.java:218) at org.apache.kylin.rest.service.QueryService.execute(QueryService.java:940) at org.apache.kylin.rest.service.QueryService.queryWithSqlMassage(QueryService.java:670) at org.apache.kylin.rest.service.QueryService.query(QueryService.java:188) at org.apache.kylin.rest.service.QueryService.queryAndUpdateCache(QueryService.java:505) at org.apache.kylin.rest.service.QueryService.doQueryWithCache(QueryService.java:464) at org.apache.kylin.rest.service.QueryService.doQueryWithCache(QueryService.java:390) at org.apache.kylin.rest.controller.QueryController.query(QueryController.java:86) at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method) at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62) at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43) at java.lang.reflect.Method.invoke(Method.java:483) at 
org.springframework.web.method.support.InvocableHandlerMethod.doInvoke(InvocableHandlerMethod.java:205) at org.springframework.web.method.support.InvocableHandlerMethod.invokeForRequest(InvocableHandlerMethod.java:133) at org.springframework.web.servlet.mvc.method.annotation.ServletInvocableHandlerMethod.invokeAndHandle(ServletInvocableHandlerMethod.java:97) at org.springframework.web.servlet.mvc.method.annotation.RequestMappingHandlerAdapter.invokeHandlerMethod(RequestMappingHandlerAdapter.java:827) at org.springframework.web.servlet.mvc.method.annotation.RequestMappingHandlerAdapter.handleInternal(RequestMappingHandlerAdapter.java:738) at org.springframework.web.servlet.mvc.method.AbstractHandlerMethodAdapter.handle(AbstractHandlerMethodAdapter.java:85) at org.springframework.web.servlet.DispatcherServlet.doDispatch(DispatcherServlet.java:967) at org.springframework.web.servlet.DispatcherServlet.doService(DispatcherServlet.java:901) at org.springframework.web.servlet.FrameworkServlet.processRequest(FrameworkServlet.java:970) at org.springframework.web.servlet.FrameworkServlet.doPost(FrameworkServlet.java:872) at javax.servlet.http.HttpServlet.service(HttpServlet.java:650) at org.springframework.web.servlet.FrameworkServlet.service(FrameworkServlet.java:846) at javax.servlet.http.HttpServlet.service(HttpServlet.java:731) at org.apache.catalina.core.ApplicationFilterChain.internalDoFilter(ApplicationFilterChain.java:303) at org.apache.catalina.core.ApplicationFilterChain.doFilter(ApplicationFilterChain.java:208) at org.apache.tomcat.websocket.server.WsFilter.doFilter(WsFilter.java:52) at org.apache.catalina.core.ApplicationFilterChain.internalDoFilter(ApplicationFilterChain.java:241){code} -- This message was sent by Atlassian JIRA (v7.6.3#76005)
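The failure is a type-mix problem: the generated comparison receives a java.math.BigDecimal (the decimal measure) on one side and a primitive double (the type inferred for the empty/max case) on the other, and no SqlFunctions.greater(BigDecimal, double) overload exists. The general fix pattern is to promote both operands to one type before comparing; a sketch of that pattern only, not Kylin's actual fix:

```java
import java.math.BigDecimal;

public class MixedMaxDemo {
    // Promote the double operand to BigDecimal so a single-type max/compare
    // can be used instead of a nonexistent mixed-type overload.
    static BigDecimal maxOf(BigDecimal a, double b) {
        return a.max(BigDecimal.valueOf(b));
    }

    public static void main(String[] args) {
        System.out.println(maxOf(new BigDecimal("2.50"), 1.75));  // 2.50
    }
}
```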
[jira] [Created] (KYLIN-3726) KylinSession should load spark properties from spark-defaults.conf
Liu Shaohui created KYLIN-3726: -- Summary: KylinSession should load spark properties from spark-defaults.conf Key: KYLIN-3726 URL: https://issues.apache.org/jira/browse/KYLIN-3726 Project: Kylin Issue Type: Sub-task Components: Storage - Parquet Reporter: Liu Shaohui When testing Parquet storage, the Spark session job failed to be submitted because JAVA_HOME was missing from the executor env. This config is set in the Spark default property file, spark-defaults.conf. {code} 2018-12-18,15:13:15,466 ERROR org.apache.spark.deploy.yarn.YarnAllocator: Failed to launch executor 6 on container container_e823_1541646991414_1025309_01_07 java.util.NoSuchElementException: key not found: JAVA_HOME at scala.collection.MapLike$class.default(MapLike.scala:228) at scala.collection.AbstractMap.default(Map.scala:59) at scala.collection.mutable.HashMap.apply(HashMap.scala:65) at org.apache.spark.deploy.yarn.ExecutorRunnable$$anonfun$prepareEnvironment$3$$anonfun$apply$3.apply(ExecutorRunnable.scala:286) at org.apache.spark.deploy.yarn.ExecutorRunnable$$anonfun$prepareEnvironment$3$$anonfun$apply$3.apply(ExecutorRunnable.scala:275) at scala.Option.foreach(Option.scala:257) at org.apache.spark.deploy.yarn.ExecutorRunnable$$anonfun$prepareEnvironment$3.apply(ExecutorRunnable.scala:275) at org.apache.spark.deploy.yarn.ExecutorRunnable$$anonfun$prepareEnvironment$3.apply(ExecutorRunnable.scala:274) at scala.Option.foreach(Option.scala:257) at org.apache.spark.deploy.yarn.ExecutorRunnable.prepareEnvironment(ExecutorRunnable.scala:274) at org.apache.spark.deploy.yarn.ExecutorRunnable.startContainer(ExecutorRunnable.scala:92) at org.apache.spark.deploy.yarn.ExecutorRunnable.run(ExecutorRunnable.scala:69) at org.apache.spark.deploy.yarn.YarnAllocator$$anonfun$runAllocatedContainers$1$$anon$1.run(YarnAllocator.scala:556) at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1142) at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:617) at 
java.lang.Thread.run(Thread.java:745) {code} -- This message was sent by Atlassian JIRA (v7.6.3#76005)
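spark-defaults.conf uses the standard Java properties syntax with whitespace separating key and value, so java.util.Properties can read it directly. A minimal sketch of "load the defaults before building the session" under that assumption; the property names below are just the usual Spark ones, and the loading helper is illustrative, not the merged Kylin code:

```java
import java.io.StringReader;
import java.util.Properties;

public class SparkDefaultsDemo {
    public static void main(String[] args) throws Exception {
        // Stand-in for the contents of $SPARK_HOME/conf/spark-defaults.conf.
        String conf =
              "# system-wide Spark defaults\n"
            + "spark.yarn.appMasterEnv.JAVA_HOME  /usr/java/latest\n"
            + "spark.executorEnv.JAVA_HOME        /usr/java/latest\n";

        // java.util.Properties accepts whitespace as the key/value separator,
        // which matches the spark-defaults.conf format.
        Properties defaults = new Properties();
        defaults.load(new StringReader(conf));

        // These defaults would be applied to the KylinSession's SparkConf
        // before submit, so executors inherit JAVA_HOME.
        System.out.println(defaults.getProperty("spark.executorEnv.JAVA_HOME"));
        // -> /usr/java/latest
    }
}
```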
[jira] [Created] (KYLIN-3780) Add built instance in Job info
Liu Shaohui created KYLIN-3780: -- Summary: Add built instance in Job info Key: KYLIN-3780 URL: https://issues.apache.org/jira/browse/KYLIN-3780 Project: Kylin Issue Type: New Feature Reporter: Liu Shaohui Assignee: Liu Shaohui In DistributedScheduler, it's hard to know which machine a Kylin job is running on, yet this info is helpful for debugging failed jobs. So we add the built instance to the job info and the Kylin web UI. -- This message was sent by Atlassian JIRA (v7.6.3#76005)
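A minimal way to capture the "built instance" is to record the scheduler host when the job starts. A sketch only; the field name the patch persists into the job info is an assumption:

```java
import java.net.InetAddress;
import java.net.UnknownHostException;

public class BuildInstanceDemo {
    // The hostname of the server executing the job: the kind of value this
    // issue proposes to store in the job info for debugging.
    static String currentInstance() {
        try {
            return InetAddress.getLocalHost().getHostName();
        } catch (UnknownHostException e) {
            return "unknown";
        }
    }

    public static void main(String[] args) {
        // Hypothetical key; the real job-info attribute name may differ.
        System.out.println("build_instance=" + currentInstance());
    }
}
```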
[jira] [Created] (KYLIN-3809) Support Zookeeper based rest server discovery
Liu Shaohui created KYLIN-3809: -- Summary: Support Zookeeper based rest server discovery Key: KYLIN-3809 URL: https://issues.apache.org/jira/browse/KYLIN-3809 Project: Kylin Issue Type: New Feature Reporter: Liu Shaohui Currently, to broadcast config or meta changes, all Kylin servers must be listed in kylin.properties. This is inconvenient when adding or removing a Kylin server, especially in a k8s env. Instead, we can register each endpoint in ZooKeeper and make rest server discovery automatic. -- This message was sent by Atlassian JIRA (v7.6.3#76005)
[jira] [Created] (KYLIN-3880) DataType is incompatible in Kylin HBase coprocessor
Liu Shaohui created KYLIN-3880: -- Summary: DataType is incompatible in Kylin HBase coprocessor Key: KYLIN-3880 URL: https://issues.apache.org/jira/browse/KYLIN-3880 Project: Kylin Issue Type: Bug Reporter: Liu Shaohui During an upgrade of Kylin from 2.4.1 to 2.5.2, queries fail due to an incompatible class in the Kylin HBase coprocessor {code:java} 2019-03-12,17:48:11,530 INFO [FifoRWQ.default.readRpcServer.handler=197,queue=13,port=24600] org.apache.hadoop.hdfs.DFSClient: Access token was invalid when connecting to /10.152.33.45:22402 : org.apache.hadoop.hdfs.security.token.block.InvalidBlockTokenException: Got access token error for OP_READ_BLOCK, self=/10.152.33.44:55387, remote=/10.152.33.45:22402, for file /hbase/zjyprc-xiaomi/data/miui_sec/data/4b88a72f5bd37daca00efb842e676ca8/C/6593503eb213431998db117cf3dab3a6, for pool BP-792581576-10.152.48.22-1510572454905 block 1899006034_825272806 2019-03-12,17:48:12,135 INFO [FifoRWQ.default.readRpcServer.handler=231,queue=15,port=24600] org.apache.kylin.storage.hbase.cube.v2.coprocessor.endpoint.CubeVisitService: start query dc0fadcf-3689-5508-9a45-559aaebfd4e0 in thread FifoRWQ.default.readRpcServer.handler=231,queue=15,port=24600 2019-03-12,17:48:12,135 ERROR [FifoRWQ.default.readRpcServer.handler=231,queue=15,port=24600] org.apache.hadoop.ipc.RpcServer: Unexpected throwable object java.lang.RuntimeException: java.io.InvalidClassException: org.apache.kylin.metadata.datatype.DataType; local class incompatible: stream classdesc serialVersionUID = -8891652700267537109, local class serialVersionUID = -406124487097947 at org.apache.kylin.cube.gridtable.TrimmedCubeCodeSystem.readDimensionEncoding(TrimmedCubeCodeSystem.java:87) at org.apache.kylin.cube.gridtable.TrimmedCubeCodeSystem$1.deserialize(TrimmedCubeCodeSystem.java:122) at org.apache.kylin.cube.gridtable.TrimmedCubeCodeSystem$1.deserialize(TrimmedCubeCodeSystem.java:91) at org.apache.kylin.gridtable.GTInfo$1.deserialize(GTInfo.java:346) at 
org.apache.kylin.gridtable.GTInfo$1.deserialize(GTInfo.java:307) at org.apache.kylin.gridtable.GTScanRequest$2.deserialize(GTScanRequest.java:466) at org.apache.kylin.gridtable.GTScanRequest$2.deserialize(GTScanRequest.java:412) at org.apache.kylin.storage.hbase.cube.v2.coprocessor.endpoint.CubeVisitService.visitCube(CubeVisitService.java:259) at org.apache.kylin.storage.hbase.cube.v2.coprocessor.endpoint.generated.CubeVisitProtos$CubeVisitService.callMethod(CubeVisitProtos.java:) at org.apache.hadoop.hbase.regionserver.HRegion.execService(HRegion.java:6625) at org.apache.hadoop.hbase.regionserver.HRegionServer.execServiceOnRegion(HRegionServer.java:4336) at org.apache.hadoop.hbase.regionserver.HRegionServer.execService(HRegionServer.java:4318) at org.apache.hadoop.hbase.protobuf.generated.ClientProtos$ClientService$2.callBlockingMethod(ClientProtos.java:34964) at org.apache.hadoop.hbase.ipc.RpcServer.call(RpcServer.java:2059) at org.apache.hadoop.hbase.ipc.CallRunner.run(CallRunner.java:126) at org.apache.hadoop.hbase.ipc.RpcExecutor.consumerLoop(RpcExecutor.java:152) at org.apache.hadoop.hbase.ipc.RpcExecutor$1.run(RpcExecutor.java:128) at java.lang.Thread.run(Thread.java:748) Caused by: java.io.InvalidClassException: org.apache.kylin.metadata.datatype.DataType; local class incompatible: stream classdesc serialVersionUID = -8891652700267537109, local class serialVersionUID = -406124487097947 at java.io.ObjectStreamClass.initNonProxy(ObjectStreamClass.java:699) at java.io.ObjectInputStream.readNonProxyDesc(ObjectInputStream.java:1885) at java.io.ObjectInputStream.readClassDesc(ObjectInputStream.java:1751) at java.io.ObjectInputStream.readOrdinaryObject(ObjectInputStream.java:2042) at java.io.ObjectInputStream.readObject0(ObjectInputStream.java:1573) at java.io.ObjectInputStream.defaultReadFields(ObjectInputStream.java:2287) at java.io.ObjectInputStream.readSerialData(ObjectInputStream.java:2211) at 
java.io.ObjectInputStream.readOrdinaryObject(ObjectInputStream.java:2069) at java.io.ObjectInputStream.readObject0(ObjectInputStream.java:1573) at java.io.ObjectInputStream.readObject(ObjectInputStream.java:431) at org.apache.kylin.dimension.AbstractDateDimEnc.readExternal(AbstractDateDimEnc.java:137) at java.io.ObjectInputStream.readExternalData(ObjectInputStream.java:2118) at java.io.ObjectInputStream.readOrdinaryObject(ObjectInputStream.java:2067) at java.io.ObjectInputStream.readObject0(ObjectInputStream.java:1573) at java.io.ObjectInputStream.readObject(ObjectInputStream.java:431) at org.apache.kylin.cube.gridtable.TrimmedCubeCodeSystem.
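The InvalidClassException comes from Java serialization's default behavior: when a Serializable class does not declare serialVersionUID, the JVM derives one from the class's shape, so any change between Kylin versions breaks deserialization inside the coprocessor. The standard guard is to pin serialVersionUID explicitly on classes that cross the wire; a generic illustration of the mechanism, not the Kylin patch itself:

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.ObjectInputStream;
import java.io.ObjectOutputStream;
import java.io.Serializable;

public class SerialVersionDemo {
    // Pinning serialVersionUID keeps serialized streams readable even when
    // the class later gains fields or methods; without it, the derived id
    // changes and deserialization fails with InvalidClassException.
    static class DataType implements Serializable {
        private static final long serialVersionUID = 1L;
        final String name;
        DataType(String name) { this.name = name; }
    }

    public static void main(String[] args) throws Exception {
        ByteArrayOutputStream bytes = new ByteArrayOutputStream();
        try (ObjectOutputStream out = new ObjectOutputStream(bytes)) {
            out.writeObject(new DataType("decimal"));
        }
        try (ObjectInputStream in =
                 new ObjectInputStream(new ByteArrayInputStream(bytes.toByteArray()))) {
            DataType back = (DataType) in.readObject();
            System.out.println(back.name);  // decimal
        }
    }
}
```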
[jira] [Created] (KYLIN-3882) kylin master build failed for pom issues
Liu Shaohui created KYLIN-3882: -- Summary: kylin master build failed for pom issues Key: KYLIN-3882 URL: https://issues.apache.org/jira/browse/KYLIN-3882 Project: Kylin Issue Type: Bug Reporter: Liu Shaohui As the title says. 1. The Kyligence repo id "nexus" conflicts with the local maven settings.xml {code:java} [ERROR] Failed to execute goal on project kylin-core-metadata: Could not resolve dependencies for project org.apache.kylin:kylin-core-metadata:jar:3.0.0-SNAPSHOT: Failure to find org.apache.calcite:calcite-core:jar:1.16.0-kylin-r2 in http://nexus.x./nexus/content/groups/public was cached in the local repository, resolution will not be reattempted until the update interval of nexus has elapsed or updates are forced -> [Help 1] {code} 2. maven.compiler.source/target is not set {code:java} [INFO] Compiling 2 Scala sources and 18 Java sources to /ssd/liushaohui/workspace/computing/kylin/engine-spark/target/classes ... [WARNING] [Warn] : bootstrap class path not set in conjunction with -source 1.6 [ERROR] [Error] /ssd/liushaohui/workspace/computing/kylin/engine-spark/src/main/java/org/apache/kylin/engine/spark/SparkBatchCubingJobBuilder2.java:148: diamond operator is not supported in -source 1.6 (use -source 7 or higher to enable diamond operator) [ERROR] [Error] /ssd/liushaohui/workspace/computing/kylin/engine-spark/src/main/java/org/apache/kylin/engine/spark/SparkCubingByLayer.java:239: try-with-resources is not supported in -source 1.6 (use -source 7 or higher to enable try-with-resources) [ERROR] [Error] /ssd/liushaohui/workspace/computing/kylin/engine-spark/src/main/java/org/apache/kylin/engine/spark/SparkCubingByLayer.java:251: diamond operator is not supported in -source 1.6 (use -source 7 or higher to enable diamond operator){code} -- This message was sent by Atlassian JIRA (v7.6.3#76005)
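For the second problem, the usual fix is to declare the compiler level explicitly in the parent pom so modules stop falling back to -source 1.6. This is the standard Maven mechanism; the exact Java level Kylin settled on is an assumption here:

```xml
<!-- in the parent pom.xml -->
<properties>
  <maven.compiler.source>1.8</maven.compiler.source>
  <maven.compiler.target>1.8</maven.compiler.target>
</properties>
```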
[jira] [Created] (KYLIN-3884) loading hfile to HBase failed for temporary dir in output path
Liu Shaohui created KYLIN-3884: -- Summary: loading hfile to HBase failed for temporary dir in output path Key: KYLIN-3884 URL: https://issues.apache.org/jira/browse/KYLIN-3884 Project: Kylin Issue Type: Bug Reporter: Liu Shaohui {code:java} 2019-03-14 20:18:46,591 DEBUG [Scheduler 2084224398 Job e48de76a-6e16-309f-a3a5-191c04071072-131] steps.BulkLoadJob:77 : Start to run LoadIncrementalHFiles 2019-03-14 20:18:46,642 WARN [Scheduler 2084224398 Job e48de76a-6e16-309f-a3a5-191c04071072-131] mapreduce.LoadIncrementalHFiles:197 : Skipping non-directory hdfs://zjyprc-xiaomi/user/s_kylin/kylin_zjyprc_bigdata_staging/kylin_zjyprc_bigdata_staging-kylin_metadata/kylin-e48de76a-6e16-309f-a3a5-191c04071072/total_user_cube/hfile/_SUCCESS 2019-03-14 20:18:46,650 ERROR [Scheduler 2084224398 Job e48de76a-6e16-309f-a3a5-191c04071072-131] mapreduce.LoadIncrementalHFiles:352 : - hdfs://zjyprc-xiaomi/user/s_kylin/kylin_zjyprc_bigdata_staging/kylin_zjyprc_bigdata_staging-kylin_metadata/kylin-e48de76a-6e16-309f-a3a5-191c04071072/total_user_cube/hfile/F1/4170d772384144848c1c10cba66152c3 hdfs://zjyprc-xiaomi/user/s_kylin/kylin_zjyprc_bigdata_staging/kylin_zjyprc_bigdata_staging-kylin_metadata/kylin-e48de76a-6e16-309f-a3a5-191c04071072/total_user_cube/hfile/F1/50ec331ff3c648e3b6e4f54a7b1fe7e9 hdfs://zjyprc-xiaomi/user/s_kylin/kylin_zjyprc_bigdata_staging/kylin_zjyprc_bigdata_staging-kylin_metadata/kylin-e48de76a-6e16-309f-a3a5-191c04071072/total_user_cube/hfile/F1/703ade3b535b4fedab39ee183e22aa7c hdfs://zjyprc-xiaomi/user/s_kylin/kylin_zjyprc_bigdata_staging/kylin_zjyprc_bigdata_staging-kylin_metadata/kylin-e48de76a-6e16-309f-a3a5-191c04071072/total_user_cube/hfile/F1/82019f8ca00a4f16b9d2b45356a55a3a hdfs://zjyprc-xiaomi/user/s_kylin/kylin_zjyprc_bigdata_staging/kylin_zjyprc_bigdata_staging-kylin_metadata/kylin-e48de76a-6e16-309f-a3a5-191c04071072/total_user_cube/hfile/F1/8cc8844bced24cb88fda52fecc7224d5 
hdfs://zjyprc-xiaomi/user/s_kylin/kylin_zjyprc_bigdata_staging/kylin_zjyprc_bigdata_staging-kylin_metadata/kylin-e48de76a-6e16-309f-a3a5-191c04071072/total_user_cube/hfile/F1/cbac78e0c6d74b5c96a7b64f99e0d0b3 hdfs://zjyprc-xiaomi/user/s_kylin/kylin_zjyprc_bigdata_staging/kylin_zjyprc_bigdata_staging-kylin_metadata/kylin-e48de76a-6e16-309f-a3a5-191c04071072/total_user_cube/hfile/F1/e3844766a4d0486d89f287450034f378 hdfs://zjyprc-xiaomi/user/s_kylin/kylin_zjyprc_bigdata_staging/kylin_zjyprc_bigdata_staging-kylin_metadata/kylin-e48de76a-6e16-309f-a3a5-191c04071072/total_user_cube/hfile/_temporary/0 2019-03-14 20:18:46,651 ERROR [Scheduler 2084224398 Job e48de76a-6e16-309f-a3a5-191c04071072-131] common.HadoopShellExecutable:65 : error execute HadoopShellExecutable{id=e48de76a-6e16-309f-a3a5-191c04071072-08, name=Load HFile to HBase Table, state=RUNNING} java.io.FileNotFoundException: Path is not a file: /user/s_kylin/kylin_zjyprc_bigdata_staging/kylin_zjyprc_bigdata_staging-kylin_metadata/kylin-e48de76a-6e16-309f-a3a5-191c04071072/total_user_cube/hfile/_temporary/0 Caused by: org.apache.hadoop.ipc.RemoteException(java.io.FileNotFoundException): Path is not a file: /user/s_kylin/kylin_zjyprc_bigdata_staging/kylin_zjyprc_bigdata_staging-kylin_metadata/kylin-e48de76a-6e16-309f-a3a5-191c04071072/total_user_cube/hfile/_temporary/0{code} -- This message was sent by Atlassian JIRA (v7.6.3#76005)
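LoadIncrementalHFiles treats every subdirectory of the output path as a column family, so MapReduce bookkeeping directories like _temporary and markers like _SUCCESS must be removed or skipped before bulk load. The filtering rule itself is trivial; a standalone sketch of the name check only (the real fix would delete the matching paths via the Hadoop FileSystem API):

```java
import java.util.Arrays;
import java.util.List;
import java.util.stream.Collectors;

public class HFileDirFilterDemo {
    // Hadoop job-control artifacts all begin with '_' and are never
    // column-family directories, so they are safe to drop before bulk load.
    static boolean isJobArtifact(String name) {
        return name.startsWith("_");   // _temporary, _SUCCESS, _logs, ...
    }

    public static void main(String[] args) {
        List<String> children = Arrays.asList("F1", "_SUCCESS", "_temporary");
        List<String> families = children.stream()
                .filter(n -> !isJobArtifact(n))
                .collect(Collectors.toList());
        System.out.println(families);   // [F1]
    }
}
```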
[jira] [Created] (KYLIN-3885) Build dimension dictionary job costs too long when using Spark fact distinct
Liu Shaohui created KYLIN-3885: -- Summary: Build dimension dictionary job costs too long when using Spark fact distinct Key: KYLIN-3885 URL: https://issues.apache.org/jira/browse/KYLIN-3885 Project: Kylin Issue Type: Bug Reporter: Liu Shaohui The build dimension dictionary job costs less than 20 minutes when using MapReduce fact distinct, but more than 3 hours when using Spark fact distinct. {code:java} "Scheduler 542945608 Job 05c62aca-853f-396e-9653-f20c9ebd8ebc-329" #329 prio=5 os_prio=0 tid=0x7f312109c800 nid=0x2dc0b in Object.wait() [0x7f30d8d24000] java.lang.Thread.State: WAITING (on object monitor) at java.lang.Object.wait(Native Method) at java.lang.Object.wait(Object.java:502) at org.apache.hadoop.ipc.Client.call(Client.java:1482) - locked <0x0005c3110fc0> (a org.apache.hadoop.ipc.Client$Call) at org.apache.hadoop.ipc.Client.call(Client.java:1427) at org.apache.hadoop.ipc.ProtobufRpcEngine$Invoker.invoke(ProtobufRpcEngine.java:232) at com.sun.proxy.$Proxy33.delete(Unknown Source) at org.apache.hadoop.hdfs.protocolPB.ClientNamenodeProtocolTranslatorPB.delete(ClientNamenodeProtocolTranslatorPB.java:573) at sun.reflect.GeneratedMethodAccessor193.invoke(Unknown Source) at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43) at java.lang.reflect.Method.invoke(Method.java:498) at org.apache.hadoop.io.retry.RetryInvocationHandler.invokeMethod(RetryInvocationHandler.java:249) at org.apache.hadoop.io.retry.RetryInvocationHandler.invoke(RetryInvocationHandler.java:107) at com.sun.proxy.$Proxy34.delete(Unknown Source) at org.apache.hadoop.hdfs.DFSClient.delete(DFSClient.java:2057) at org.apache.hadoop.hdfs.DistributedFileSystem$13.doCall(DistributedFileSystem.java:682) at org.apache.hadoop.hdfs.DistributedFileSystem$13.doCall(DistributedFileSystem.java:675) at org.apache.hadoop.fs.FileSystemLinkResolver.resolve(FileSystemLinkResolver.java:81) at 
org.apache.hadoop.hdfs.DistributedFileSystem.delete(DistributedFileSystem.java:696) at org.apache.hadoop.fs.FilterFileSystem.delete(FilterFileSystem.java:232) at org.apache.hadoop.fs.viewfs.ChRootedFileSystem.delete(ChRootedFileSystem.java:198) at org.apache.hadoop.fs.viewfs.ViewFileSystem.delete(ViewFileSystem.java:334) at org.apache.hadoop.hdfs.FederatedDFSFileSystem.delete(FederatedDFSFileSystem.java:232) at org.apache.kylin.dict.global.GlobalDictHDFSStore.deleteSlice(GlobalDictHDFSStore.java:211) at org.apache.kylin.dict.global.AppendTrieDictionaryBuilder.flushCurrentNode(AppendTrieDictionaryBuilder.java:137) at org.apache.kylin.dict.global.AppendTrieDictionaryBuilder.addValue(AppendTrieDictionaryBuilder.java:97) at org.apache.kylin.dict.GlobalDictionaryBuilder.addValue(GlobalDictionaryBuilder.java:85) at org.apache.kylin.dict.DictionaryGenerator.buildDictionary(DictionaryGenerator.java:82) at org.apache.kylin.dict.DictionaryManager.buildDictFromReadableTable(DictionaryManager.java:303) at org.apache.kylin.dict.DictionaryManager.buildDictionary(DictionaryManager.java:290) at org.apache.kylin.cube.CubeManager$DictionaryAssist.buildDictionary(CubeManager.java:1043) at org.apache.kylin.cube.CubeManager.buildDictionary(CubeManager.java:1012) at org.apache.kylin.cube.cli.DictionaryGeneratorCLI.processSegment(DictionaryGeneratorCLI.java:72) at org.apache.kylin.cube.cli.DictionaryGeneratorCLI.processSegment(DictionaryGeneratorCLI.java:50) at org.apache.kylin.engine.mr.steps.CreateDictionaryJob.run(CreateDictionaryJob.java:73) at org.apache.kylin.engine.mr.MRUtil.runMRJob(MRUtil.java:92) at org.apache.kylin.engine.mr.common.HadoopShellExecutable.doWork(HadoopShellExecutable.java:63) at org.apache.kylin.job.execution.AbstractExecutable.execute(AbstractExecutable.java:178) at org.apache.kylin.job.execution.DefaultChainedExecutable.doWork(DefaultChainedExecutable.java:71) at org.apache.kylin.job.execution.AbstractExecutable.execute(AbstractExecutable.java:178) at 
org.apache.kylin.job.impl.threadpool.DefaultScheduler$JobRunner.run(DefaultScheduler.java:114) at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1149) at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:624) at java.lang.Thread.run(Thread.java:748){code} -- This message was sent by Atlassian JIRA (v7.6.3#76005)
[jira] [Created] (KYLIN-3886) Missing argument for options for yarn command
Liu Shaohui created KYLIN-3886: -- Summary: Missing argument for options for yarn command Key: KYLIN-3886 URL: https://issues.apache.org/jira/browse/KYLIN-3886 Project: Kylin Issue Type: Bug Reporter: Liu Shaohui 2019-03-13 11:48:08,604 INFO [Scheduler 542945608 Job f918877a-deb0-704c-ec6f-82f33f5e39a5-323] spark.SparkExecutable:38 : Missing argument for options 2019-03-13 11:48:08,606 INFO [Scheduler 542945608 Job f918877a-deb0-704c-ec6f-82f33f5e39a5-323] spark.SparkExecutable:38 : usage: application 2019-03-13 11:48:08,606 INFO [Scheduler 542945608 Job f918877a-deb0-704c-ec6f-82f33f5e39a5-323] spark.SparkExecutable:38 : -appStates Works with -list to filter applications 2019-03-13 11:48:08,606 INFO [Scheduler 542945608 Job f918877a-deb0-704c-ec6f-82f33f5e39a5-323] spark.SparkExecutable:38 : based on input comma-separated list of 2019-03-13 11:48:08,606 INFO [Scheduler 542945608 Job f918877a-deb0-704c-ec6f-82f33f5e39a5-323] spark.SparkExecutable:38 : application states. The valid application 2019-03-13 11:48:08,606 INFO [Scheduler 542945608 Job f918877a-deb0-704c-ec6f-82f33f5e39a5-323] spark.SparkExecutable:38 : state can be one of the following: 2019-03-13 11:48:08,606 INFO [Scheduler 542945608 Job f918877a-deb0-704c-ec6f-82f33f5e39a5-323] spark.SparkExecutable:38 : ALL,NEW,NEW_SAVING,SUBMITTED,ACCEPTED,RUN 2019-03-13 11:48:08,606 INFO [Scheduler 542945608 Job f918877a-deb0-704c-ec6f-82f33f5e39a5-323] spark.SparkExecutable:38 : NING,FINISHED,FAILED,KILLED 2019-03-13 11:48:08,606 INFO [Scheduler 542945608 Job f918877a-deb0-704c-ec6f-82f33f5e39a5-323] spark.SparkExecutable:38 : -appTypes Works with -list to filter applications 2019-03-13 11:48:08,606 INFO [Scheduler 542945608 Job f918877a-deb0-704c-ec6f-82f33f5e39a5-323] spark.SparkExecutable:38 : based on input comma-separated list of 2019-03-13 11:48:08,607 INFO [Scheduler 542945608 Job f918877a-deb0-704c-ec6f-82f33f5e39a5-323] spark.SparkExecutable:38 : application types. 
2019-03-13 11:48:08,607 INFO [Scheduler 542945608 Job f918877a-deb0-704c-ec6f-82f33f5e39a5-323] spark.SparkExecutable:38 : -help Displays help for all commands. 2019-03-13 11:48:08,607 INFO [Scheduler 542945608 Job f918877a-deb0-704c-ec6f-82f33f5e39a5-323] spark.SparkExecutable:38 : -kill Kills the application. 2019-03-13 11:48:08,607 INFO [Scheduler 542945608 Job f918877a-deb0-704c-ec6f-82f33f5e39a5-323] spark.SparkExecutable:38 : -list List applications. Supports optional use 2019-03-13 11:48:08,607 INFO [Scheduler 542945608 Job f918877a-deb0-704c-ec6f-82f33f5e39a5-323] spark.SparkExecutable:38 : of -appTypes to filter applications based -- This message was sent by Atlassian JIRA (v7.6.3#76005)
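The usage dump above indicates yarn application -kill was invoked with an empty argument, e.g. when the job never obtained an application id. A defensive guard before shelling out avoids it; a sketch of the guard only, not the actual SparkExecutable code:

```java
public class YarnKillGuardDemo {
    // Build the kill command only when we actually have an application id;
    // otherwise "yarn application -kill" runs with a missing argument and
    // yarn just prints its usage text, as in the log above.
    static String killCommand(String appId) {
        if (appId == null || appId.trim().isEmpty()) {
            return null;   // nothing to kill yet
        }
        return "yarn application -kill " + appId.trim();
    }

    public static void main(String[] args) {
        System.out.println(killCommand("application_1552448000000_0001"));
        System.out.println(killCommand(""));   // null: skip the shell-out
    }
}
```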
[jira] [Created] (KYLIN-3887) Query with decimal sum measure on a double column fails to compile after KYLIN-3703
Liu Shaohui created KYLIN-3887: -- Summary: Query with decimal sum measure on a double column fails to compile after KYLIN-3703 Key: KYLIN-3887 URL: https://issues.apache.org/jira/browse/KYLIN-3887 Project: Kylin Issue Type: Bug Reporter: Liu Shaohui After KYLIN-3703, a query with a decimal sum measure on a double column fails to compile. {code:java} Caused by: org.codehaus.commons.compiler.CompileException: Line 112, Column 42: Cannot cast "java.math.BigDecimal" to "java.lang.Double"{code} -- This message was sent by Atlassian JIRA (v7.6.3#76005)
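The compile error comes from generated code casting a BigDecimal value directly to Double, which Java forbids since the two types are unrelated; converting via doubleValue() is the legal route. A standalone illustration of the two behaviors (the CompileException above is the compile-time analogue of the runtime ClassCastException shown here), not the actual generated code:

```java
import java.math.BigDecimal;

public class DecimalCastDemo {
    public static void main(String[] args) {
        Object sum = new BigDecimal("12.5");  // decimal SUM measure at runtime

        // Illegal: a BigDecimal is not a Double, so the cast fails.
        try {
            Double bad = (Double) sum;
            System.out.println(bad);
        } catch (ClassCastException e) {
            System.out.println("cast failed: " + e);
        }

        // Legal: explicit numeric conversion.
        double ok = ((BigDecimal) sum).doubleValue();
        System.out.println(ok);               // 12.5
    }
}
```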
[jira] [Created] (KYLIN-3893) Cube build failed for wrong row key column description
Liu Shaohui created KYLIN-3893: -- Summary: Cube build failed for wrong row key column description Key: KYLIN-3893 URL: https://issues.apache.org/jira/browse/KYLIN-3893 Project: Kylin Issue Type: Bug Reporter: Liu Shaohui A user created a wrong RowKeyColDesc, e.g. RowKeyColDesc\{column=MYSQL_FEEDBACK_USER_AUDIT.DATE, encoding=integer:undefined}, which causes the cube build to keep failing. {code:java} org.apache.kylin.engine.mr.exception.HadoopShellException: java.lang.NumberFormatException: For input string: "undefined" at java.lang.NumberFormatException.forInputString(NumberFormatException.java:65) at java.lang.Integer.parseInt(Integer.java:580) at java.lang.Integer.parseInt(Integer.java:615) at org.apache.kylin.dimension.IntegerDimEnc$Factory.createDimensionEncoding(IntegerDimEnc.java:65) at org.apache.kylin.dimension.DimensionEncodingFactory.create(DimensionEncodingFactory.java:65) at org.apache.kylin.cube.kv.CubeDimEncMap.get(CubeDimEncMap.java:74) at org.apache.kylin.engine.mr.common.CubeStatsReader.getCuboidSizeMapFromRowCount(CubeStatsReader.java:206) at org.apache.kylin.engine.mr.common.CubeStatsReader.getCuboidSizeMap(CubeStatsReader.java:170) at org.apache.kylin.storage.hbase.steps.CreateHTableJob.run(CreateHTableJob.java:102) at org.apache.kylin.engine.mr.MRUtil.runMRJob(MRUtil.java:92) at org.apache.kylin.engine.mr.common.HadoopShellExecutable.doWork(HadoopShellExecutable.java:63) at org.apache.kylin.job.execution.AbstractExecutable.execute(AbstractExecutable.java:178) at org.apache.kylin.job.execution.DefaultChainedExecutable.doWork(DefaultChainedExecutable.java:71) at org.apache.kylin.job.execution.AbstractExecutable.execute(AbstractExecutable.java:178) at org.apache.kylin.job.impl.threadpool.DefaultScheduler$JobRunner.run(DefaultScheduler.java:114) at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1149) at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:624) at java.lang.Thread.run(Thread.java:748)result code:2 
at org.apache.kylin.engine.mr.common.HadoopShellExecutable.doWork(HadoopShellExecutable.java:73) at org.apache.kylin.job.execution.AbstractExecutable.execute(AbstractExecutable.java:178) at org.apache.kylin.job.execution.DefaultChainedExecutable.doWork(DefaultChainedExecutable.java:71) at org.apache.kylin.job.execution.AbstractExecutable.execute(AbstractExecutable.java:178) at org.apache.kylin.job.impl.threadpool.DefaultScheduler$JobRunner.run(DefaultScheduler.java:114) at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1149) at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:624) at java.lang.Thread.run(Thread.java:748){code} -- This message was sent by Atlassian JIRA (v7.6.3#76005)
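The encoding string integer:undefined reaches IntegerDimEnc unvalidated, and Integer.parseInt("undefined") blows up in the middle of the build job. Validating the encoding argument when the cube description is saved would surface the mistake immediately; a sketch of such a check, with the method name, message texts, and length bounds all hypothetical:

```java
public class EncodingArgCheckDemo {
    // Validate "integer:<length>" style row-key encodings up front instead of
    // failing deep inside the build with a NumberFormatException.
    static void checkIntegerEncoding(String encoding) {
        String[] parts = encoding.split(":", 2);
        if (parts.length != 2) {
            throw new IllegalArgumentException("missing length arg: " + encoding);
        }
        try {
            int len = Integer.parseInt(parts[1]);
            if (len < 1 || len > 8) {
                throw new IllegalArgumentException("length out of range: " + encoding);
            }
        } catch (NumberFormatException e) {
            throw new IllegalArgumentException("length is not a number: " + encoding);
        }
    }

    public static void main(String[] args) {
        checkIntegerEncoding("integer:4");               // fine
        try {
            checkIntegerEncoding("integer:undefined");   // rejected at save time
        } catch (IllegalArgumentException e) {
            System.out.println("rejected: " + e.getMessage());
        }
    }
}
```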
[jira] [Created] (KYLIN-3900) Discard all expired ERROR or STOPPED jobs to cleanup kylin metadata
Liu Shaohui created KYLIN-3900: -- Summary: Discard all expired ERROR or STOPPED jobs to cleanup kylin metadata Key: KYLIN-3900 URL: https://issues.apache.org/jira/browse/KYLIN-3900 Project: Kylin Issue Type: Improvement Reporter: Liu Shaohui Currently the metadata cleanup job only deletes expired DISCARDED and SUCCEED jobs; ERROR or STOPPED jobs are left behind, which may accumulate too much metadata in HBase over the long term. -- This message was sent by Atlassian JIRA (v7.6.3#76005)
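The proposed rule amounts to widening the set of terminal job states that age out. A sketch of that selection logic, with the state names mirroring Kylin's job states and the cutoff an assumed parameter, not the actual cleanup tool code:

```java
import java.util.Arrays;
import java.util.EnumSet;
import java.util.List;
import java.util.stream.Collectors;

public class JobCleanupDemo {
    enum JobState { RUNNING, SUCCEED, DISCARDED, ERROR, STOPPED }

    static class Job {
        final JobState state;
        final long lastModifiedMillis;
        Job(JobState s, long t) { state = s; lastModifiedMillis = t; }
    }

    // Proposed rule: ERROR and STOPPED jobs expire just like
    // DISCARDED/SUCCEED ones, instead of lingering in the metadata store.
    static final EnumSet<JobState> EXPIRABLE = EnumSet.of(
        JobState.SUCCEED, JobState.DISCARDED, JobState.ERROR, JobState.STOPPED);

    static List<Job> expired(List<Job> jobs, long cutoffMillis) {
        return jobs.stream()
                   .filter(j -> EXPIRABLE.contains(j.state))
                   .filter(j -> j.lastModifiedMillis < cutoffMillis)
                   .collect(Collectors.toList());
    }

    public static void main(String[] args) {
        List<Job> jobs = Arrays.asList(
            new Job(JobState.ERROR,   1_000L),   // old: now cleaned up
            new Job(JobState.RUNNING, 1_000L),   // never cleaned up
            new Job(JobState.STOPPED, 9_000L));  // too recent
        System.out.println(expired(jobs, 5_000L).size());  // 1
    }
}
```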
[jira] [Created] (KYLIN-3901) Use multi threads to speed up the storage cleanup job
Liu Shaohui created KYLIN-3901: -- Summary: Use multi threads to speed up the storage cleanup job Key: KYLIN-3901 URL: https://issues.apache.org/jira/browse/KYLIN-3901 Project: Kylin Issue Type: Improvement Reporter: Liu Shaohui -- This message was sent by Atlassian JIRA (v7.6.3#76005)
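A straightforward shape for this improvement is to fan the per-path deletions out over a fixed thread pool. A sketch with simulated deletions; the pool size and task granularity are assumptions, and the real job would call fs.delete on each path:

```java
import java.util.Arrays;
import java.util.List;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.TimeUnit;
import java.util.concurrent.atomic.AtomicInteger;

public class ParallelCleanupDemo {
    public static void main(String[] args) throws InterruptedException {
        List<String> garbagePaths = Arrays.asList(
            "/kylin/job-1/hfile", "/kylin/job-2/hfile", "/kylin/job-3/hfile");

        AtomicInteger deleted = new AtomicInteger();
        ExecutorService pool = Executors.newFixedThreadPool(4);
        for (String path : garbagePaths) {
            // In the real cleanup this would be fs.delete(new Path(path), true);
            // HDFS deletes are dominated by RPC latency, so they parallelize well.
            pool.submit(() -> deleted.incrementAndGet());
        }
        pool.shutdown();
        pool.awaitTermination(1, TimeUnit.MINUTES);
        System.out.println("deleted " + deleted.get() + " paths");
    }
}
```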
[jira] [Created] (KYLIN-3909) kylin job failed for MappeableRunContainer is not registered
Liu Shaohui created KYLIN-3909: -- Summary: kylin job failed for MappeableRunContainer is not registered Key: KYLIN-3909 URL: https://issues.apache.org/jira/browse/KYLIN-3909 Project: Kylin Issue Type: Bug Reporter: Liu Shaohui {code:java}Job aborted due to stage failure: Task 2 in stage 1.0 failed 4 times, most recent failure: Lost task 2.3 in stage 1.0 (TID 2621, zjy-hadoop-prc-st2587.bj, executor 53): com.esotericsoftware.kryo.KryoException: java.lang.IllegalArgumentException: Class is not registered: org.apache.kylin.job.shaded.org.roaringbitmap.buffer.MappeableRunContainer{code} -- This message was sent by Atlassian JIRA (v7.6.3#76005)
[jira] [Created] (KYLIN-3911) Check if HBase table is enabled before disabling table in DeployCoprocessorCLI
Liu Shaohui created KYLIN-3911: -- Summary: Check if HBase table is enabled before disabling table in DeployCoprocessorCLI Key: KYLIN-3911 URL: https://issues.apache.org/jira/browse/KYLIN-3911 Project: Kylin Issue Type: Bug Reporter: Liu Shaohui HBase tables may already be disabled because of operational issues or a previously interrupted DeployCoprocessorCLI run, which causes a new DeployCoprocessorCLI run to fail. {code:java} 2018-06-08 10:40:23,489 ERROR [pool-5-thread-6] util.DeployCoprocessorCLI:383 : Error processing kylin_bigdata_prod:KYLIN_A9520J93GU org.apache.hadoop.hbase.TableNotEnabledException: org.apache.hadoop.hbase.TableNotEnabledException: kylin_bigdata_prod:KYLIN_A9520J93GU at org.apache.hadoop.hbase.master.handler.DisableTableHandler.prepare(DisableTableHandler.java:102) at org.apache.hadoop.hbase.master.HMaster.disableTable(HMaster.java:2609) at org.apache.hadoop.hbase.master.HMaster.disableTable(HMaster.java:2619) at org.apache.hadoop.hbase.protobuf.generated.MasterProtos$MasterService$2.callBlockingMethod(MasterProtos.java:44586) at org.apache.hadoop.hbase.ipc.RpcServer.call(RpcServer.java:2061) at org.apache.hadoop.hbase.ipc.CallRunner.run(CallRunner.java:125) at org.apache.hadoop.hbase.ipc.FifoRpcScheduler$1.run(FifoRpcScheduler.java:83) at java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:511) at java.util.concurrent.FutureTask.run(FutureTask.java:266) at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1149) at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:624) at java.lang.Thread.run(Thread.java:748) @c3-hadoop-prc-ct36.bj/10.136.14.13:33500 at sun.reflect.NativeConstructorAccessorImpl.newInstance0(Native Method) at sun.reflect.NativeConstructorAccessorImpl.newInstance(NativeConstructorAccessorImpl.java:62) at sun.reflect.DelegatingConstructorAccessorImpl.newInstance(DelegatingConstructorAccessorImpl.java:45) at java.lang.reflect.Constructor.newInstance(Constructor.java:423) at 
org.apache.hadoop.ipc.RemoteException.instantiateException(RemoteException.java:106) at org.apache.hadoop.ipc.RemoteException.unwrapRemoteException(RemoteException.java:95) at org.apache.hadoop.hbase.client.RpcRetryingCaller.translateException(RpcRetryingCaller.java:220) ingCaller.java:86) at org.apache.hadoop.hbase.client.HBaseAdmin.executeCallable(HBaseAdmin.java:3239) at org.apache.hadoop.hbase.client.HBaseAdmin.disableTableAsync(HBaseAdmin.java:919) at org.apache.hadoop.hbase.client.HBaseAdmin.disableTable(HBaseAdmin.java:948) at org.apache.kylin.storage.hbase.util.DeployCoprocessorCLI.resetCoprocessor(DeployCoprocessorCLI.java:294) at org.apache.kylin.storage.hbase.util.DeployCoprocessorCLI$ResetCoprocessorWorker.run(DeployCoprocessorCLI.java:375) {code} -- This message was sent by Atlassian JIRA (v7.6.3#76005)
[jira] [Created] (KYLIN-3912) Support cube level mapreduce queue config for BeelineHiveClient
Liu Shaohui created KYLIN-3912: -- Summary: Support cube level mapreduce queue config for BeelineHiveClient Key: KYLIN-3912 URL: https://issues.apache.org/jira/browse/KYLIN-3912 Project: Kylin Issue Type: Improvement Reporter: Liu Shaohui To support multiple tenants, we set different MapReduce queue configs for different projects and cubes, but BeelineHiveClient does not use those configs. So the getHiveTableRows API always runs on the same queue configured in kylin_hive_conf or the JDBC URL, which causes competition for computing resources. -- This message was sent by Atlassian JIRA (v7.6.3#76005)
[jira] [Created] (KYLIN-3913) Remove getAllOutputs api in ExecutableManager to avoid OOM for large metadata
Liu Shaohui created KYLIN-3913: -- Summary: Remove getAllOutputs api in ExecutableManager to avoid OOM for large metadata Key: KYLIN-3913 URL: https://issues.apache.org/jira/browse/KYLIN-3913 Project: Kylin Issue Type: Improvement Reporter: Liu Shaohui In a big cluster, there will be a lot of job info left in the metadata, and the Kylin server will hit OOM when searching jobs over a long time range. The reason is that ExecutableManager loads all job output info into memory when searching for a job. -- This message was sent by Atlassian JIRA (v7.6.3#76005)
[jira] [Created] (KYLIN-3917) Add max segment merge span to cleanup intermediate data of cube building
Liu Shaohui created KYLIN-3917: -- Summary: Add max segment merge span to cleanup intermediate data of cube building Key: KYLIN-3917 URL: https://issues.apache.org/jira/browse/KYLIN-3917 Project: Kylin Issue Type: Improvement Reporter: Liu Shaohui Currently the intermediate data of cube building can not be deleted because it may be used by later cube merging, but keeping it doubles the space used in HDFS. In our actual scenario, we only need a month-level segment span at maximum. So if a segment's span is larger than a month, we consider that it will not need to be merged again and its intermediate data can be deleted. We can add a config kylin.cube.max-segment-merge.span for this, defaulting to -1. -- This message was sent by Atlassian JIRA (v7.6.3#76005)
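The rule the issue proposes can be stated in a few lines: intermediate data stays only while the segment might still be merged, and a segment whose span already reaches the configured cap never will be. A sketch, assuming the -1 default means "feature disabled, keep everything" as the description implies:

```java
// Sketch of the proposed kylin.cube.max-segment-merge.span check.
// Returns true when the segment may still participate in a future merge,
// i.e. its intermediate build data must be kept.
public class MergeSpanCheck {
    static boolean mayBeMergedLater(long segStartMs, long segEndMs, long maxMergeSpanMs) {
        if (maxMergeSpanMs < 0) {
            return true; // -1: feature disabled, keep today's conservative behavior
        }
        // still below the cap: a later merge could absorb this segment
        return (segEndMs - segStartMs) < maxMergeSpanMs;
    }
}
```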
[jira] [Created] (KYLIN-3918) Add project name in cube and job pages
Liu Shaohui created KYLIN-3918: -- Summary: Add project name in cube and job pages Key: KYLIN-3918 URL: https://issues.apache.org/jira/browse/KYLIN-3918 Project: Kylin Issue Type: Improvement Reporter: Liu Shaohui In a production cluster, there will be many projects and each project has many cubes. It's useful to show the project name in the cube and job pages, so the admin can quickly know which project an abnormal cube or failed job belongs to and get in contact with its users. -- This message was sent by Atlassian JIRA (v7.6.3#76005)
[jira] [Created] (KYLIN-3962) Support streaming cubing using Spark Streaming
Liu Shaohui created KYLIN-3962: -- Summary: Support streaming cubing using Spark Streaming Key: KYLIN-3962 URL: https://issues.apache.org/jira/browse/KYLIN-3962 Project: Kylin Issue Type: Improvement Reporter: Liu Shaohui KYLIN-3654 introduced Real-time Streaming, but in my opinion the architecture is a little too complicated to operate. Streaming frameworks like Spark Streaming and Flink are widely used in many companies. Can we use such a streaming framework to support real-time cubing in Kylin? This is just a proposal. More discussion and suggestions are welcome. More details of this proposal will be added later. -- This message was sent by Atlassian JIRA (v7.6.3#76005)
[jira] [Created] (KYLIN-3994) StorageCleanupJob may delete cube id data of new built segment because of cube cache in CubeManager
Liu Shaohui created KYLIN-3994: -- Summary: StorageCleanupJob may delete cube id data of new built segment because of cube cache in CubeManager Key: KYLIN-3994 URL: https://issues.apache.org/jira/browse/KYLIN-3994 Project: Kylin Issue Type: Bug Affects Versions: v2.5.2 Reporter: Liu Shaohui In our production cluster, we found that the cube id data of a newly built segment was deleted by the StorageCleanupJob. After checking the code of cleanUnusedHdfsFiles in StorageCleanupJob, we found a bug: CubeManager reads all cube metadata at initialization and caches it for later listAllCubes operations, so the metadata may be out of date by the time the HDFS working dir is listed. As a result, the working directory of a freshly finished job may be deleted unexpectedly. -- This message was sent by Atlassian JIRA (v7.6.3#76005)
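The race described above suggests an ordering fix: take the HDFS listing first, then re-read the cube metadata fresh (not from the stale cache), and only delete what the fresh metadata does not reference. A minimal sketch of that final set computation, with illustrative names rather than Kylin's actual API:

```java
import java.util.HashSet;
import java.util.Set;

// Sketch of the safe-cleanup direction for cleanUnusedHdfsFiles:
// deletable = (HDFS working dirs observed) minus (dirs referenced by FRESH metadata).
// The caller must re-read metadata AFTER listing HDFS, so a segment finished
// in between is seen as referenced and survives.
public class SafeCleanup {
    static Set<String> deletableDirs(Set<String> hdfsWorkingDirs,
                                     Set<String> dirsReferencedByFreshMeta) {
        Set<String> deletable = new HashSet<>(hdfsWorkingDirs); // copy, don't mutate input
        deletable.removeAll(dirsReferencedByFreshMeta);
        return deletable;
    }
}
```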
[jira] [Created] (KYLIN-3997) Add a health check job of Kylin
Liu Shaohui created KYLIN-3997: -- Summary: Add a health check job of Kylin Key: KYLIN-3997 URL: https://issues.apache.org/jira/browse/KYLIN-3997 Project: Kylin Issue Type: Improvement Reporter: Liu Shaohui Assignee: Liu Shaohui Kylin has a lot of internal metadata and outer dependencies, and inconsistencies may appear because of bugs or failures. It's better to have a health check job to find these inconsistency issues in advance. The inconsistency issues we found in our clusters include the following: * {color:#808080}the cubeid data does not exist for cube merging{color} * {color:#808080}the HBase table of a segment does not exist or is not online{color} * {color:#808080}there are holes in cube segments (the build of some days failed, but the user did not notice){color} * {color:#808080}too many segments (HBase tables){color} * {color:#808080}metadata of stale segments left in the cube{color} * {color:#808080}some cubes have not been updated/built for a long time{color} * {color:#808080}some important parameters are not set in the cube desc{color} * {color:#808080}...{color} -- This message was sent by Atlassian JIRA (v7.6.3#76005)
[jira] [Created] (KYLIN-4005) Saving cube with an aggregation group (40 dimensions, max dimension combination: 5) may cause kylin server OOM
Liu Shaohui created KYLIN-4005: -- Summary: Saving cube with an aggregation group (40 dimensions, max dimension combination: 5) may cause kylin server OOM Key: KYLIN-4005 URL: https://issues.apache.org/jira/browse/KYLIN-4005 Project: Kylin Issue Type: Bug Components: REST Service Affects Versions: v2.5.2 Reporter: Liu Shaohui Fix For: Future A user tried to save a cube with an aggregation group (40 dimensions, max dimension combination: 5), which caused the Kylin server to OOM. The reason is that DefaultCuboidScheduler costs a lot of memory when calculating all cuboid ids. The stack is as follows: {code} http-bio-7070-exec-35 at java.lang.OutOfMemoryError.<init>()V (OutOfMemoryError.java:48) at java.util.HashMap.resize()[Ljava/util/HashMap$Node; (HashMap.java:704) at java.util.HashMap.putVal(ILjava/lang/Object;Ljava/lang/Object;ZZ)Ljava/lang/Object; (HashMap.java:663) at java.util.HashMap.put(Ljava/lang/Object;Ljava/lang/Object;)Ljava/lang/Object; (HashMap.java:612) at java.util.HashSet.add(Ljava/lang/Object;)Z (HashSet.java:220) at java.util.AbstractCollection.addAll(Ljava/util/Collection;)Z (AbstractCollection.java:344) at org.apache.kylin.cube.cuboid.DefaultCuboidScheduler.getOnTreeParentsByLayer(Ljava/util/Collection;)Ljava/util/Set; (DefaultCuboidScheduler.java:240) at org.apache.kylin.cube.cuboid.DefaultCuboidScheduler.buildTreeBottomUp()Lorg/apache/kylin/common/util/Pair; (DefaultCuboidScheduler.java:183) at org.apache.kylin.cube.cuboid.DefaultCuboidScheduler.<init>(Lorg/apache/kylin/cube/model/CubeDesc;)V (DefaultCuboidScheduler.java:58) at sun.reflect.GeneratedConstructorAccessor140.newInstance([Ljava/lang/Object;)Ljava/lang/Object; (Unknown Source) at sun.reflect.DelegatingConstructorAccessorImpl.newInstance([Ljava/lang/Object;)Ljava/lang/Object; (DelegatingConstructorAccessorImpl.java:45) at java.lang.reflect.Constructor.newInstance([Ljava/lang/Object;)Ljava/lang/Object; (Constructor.java:423) at 
org.apache.kylin.cube.cuboid.CuboidScheduler.getInstance(Lorg/apache/kylin/cube/model/CubeDesc;)Lorg/apache/kylin/cube/cuboid/CuboidScheduler; (CuboidScheduler.java:41) at org.apache.kylin.cube.model.CubeDesc.getInitialCuboidScheduler()Lorg/apache/kylin/cube/cuboid/CuboidScheduler; (CubeDesc.java:750) at org.apache.kylin.cube.cuboid.CuboidCLI.simulateCuboidGeneration(Lorg/apache/kylin/cube/model/CubeDesc;Z)I (CuboidCLI.java:47) at org.apache.kylin.rest.service.CubeService.updateCubeAndDesc(Lorg/apache/kylin/cube/CubeInstance;Lorg/apache/kylin/cube/model/CubeDesc;Ljava/lang/String;Z)Lorg/apache/kylin/cube/model/CubeDesc; (CubeService.java:287) at org.apache.kylin.rest.service.CubeService$$FastClassBySpringCGLIB$$17a07c0e.invoke(ILjava/lang/Object;[Ljava/lang/Object;)Ljava/lang/Object; (Unknown Source) at org.springframework.cglib.proxy.MethodProxy.invoke(Ljava/lang/Object;[Ljava/lang/Object;)Ljava/lang/Object; (MethodProxy.java:204) at org.springframework.aop.framework.CglibAopProxy$DynamicAdvisedInterceptor.intercept(Ljava/lang/Object;Ljava/lang/reflect/Method;[Ljava/lang/Object;Lorg/springframework/cglib/proxy/MethodProxy;)Ljava/lang/Object; (CglibAopProxy.java:669) at org.apache.kylin.rest.service.CubeService$$EnhancerBySpringCGLIB$$34de75c4.updateCubeAndDesc(Lorg/apache/kylin/cube/CubeInstance;Lorg/apache/kylin/cube/model/CubeDesc;Ljava/lang/String;Z)Lorg/apache/kylin/cube/model/CubeDesc; (Unknown Source) at org.apache.kylin.rest.controller.CubeController.updateCubeDesc(Lorg/apache/kylin/rest/request/CubeRequest;)Lorg/apache/kylin/rest/request/CubeReq {code} -- This message was sent by Atlassian JIRA (v7.6.3#76005)
[jira] [Created] (KYLIN-4021) Async Broadcast of project schema may cause creating cube failed
Liu Shaohui created KYLIN-4021: -- Summary: Async Broadcast of project schema may cause creating cube failed Key: KYLIN-4021 URL: https://issues.apache.org/jira/browse/KYLIN-4021 Project: Kylin Issue Type: Bug Components: Metadata Reporter: Liu Shaohui In our prod cluster, we found some create-cube requests failed because the model was not found. The problem is that users create the cube right after the model is created successfully, but the two requests may be routed to two different servers. When the other server receives the create-cube request, the project schema may not have been updated yet because of the async broadcast, so that server can not find the model related to the cube. The log at query server 1: {code:java} kylin.log.11:2019-05-27 10:26:44,143 INFO [http-bio-7070-exec-962] model.DataModelManager:248 : Saving Model model_k1_bb_83_uyyy3636 to Project BigBI_Hive with bigbi_kylin as owner kylin.log.11:2019-05-27 10:26:44,144 INFO [http-bio-7070-exec-962] model.DataModelManager:185 : Model model_k1_bb_83_uyyy3636 is missing or unloaded yet kylin.log.11:2019-05-27 10:26:44,145 INFO [http-bio-7070-exec-962] persistence.ResourceStore:309 : Update resource: /model_desc/model_k1_bb_83_uyyy3636.json with content:{code} and the log at query server 2: {code:java} 2019-05-27 10:26:44,296 WARN [http-bio-7070-exec-132] cube.CubeDescManager:195 : Broken cube desc CubeDesc [name=cube_b_bb_83_uyyy3636] java.lang.NullPointerException: DateModelDesc(model_k1_bb_83_uyyy3636) not found at com.google.common.base.Preconditions.checkNotNull(Preconditions.java:235) at org.apache.kylin.cube.model.CubeDesc.init(CubeDesc.java:664) at org.apache.kylin.cube.CubeDescManager.createCubeDesc(CubeDescManager.java:193) at org.apache.kylin.rest.service.CubeService.createCubeAndDesc(CubeService.java:216) at org.apache.kylin.rest.service.CubeService$$FastClassBySpringCGLIB$$17a07c0e.invoke() at org.springframework.cglib.proxy.MethodProxy.invoke(MethodProxy.java:204) at 
org.springframework.aop.framework.CglibAopProxy$CglibMethodInvocation.invokeJoinpoint(CglibAopProxy.java:738) at org.springframework.aop.framework.ReflectiveMethodInvocation.proceed(ReflectiveMethodInvocation.java:157) at org.springframework.security.access.intercept.aopalliance.MethodSecurityInterceptor.invoke(MethodSecurityInterceptor.java:69) at org.springframework.aop.framework.ReflectiveMethodInvocation.proceed(ReflectiveMethodInvocation.java:179) at org.springframework.aop.framework.CglibAopProxy$DynamicAdvisedInterceptor.intercept(CglibAopProxy.java:673) at org.apache.kylin.rest.service.CubeService$$EnhancerBySpringCGLIB$$20946622.createCubeAndDesc() at org.apache.kylin.rest.controller.CubeController.saveCubeDesc(CubeController.java:735) at sun.reflect.GeneratedMethodAccessor341.invoke(Unknown Source) at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43) at java.lang.reflect.Method.invoke(Method.java:498){code} -- This message was sent by Atlassian JIRA (v7.6.3#76005)
[jira] [Created] (KYLIN-4025) Add detail exception in kylin http response
Liu Shaohui created KYLIN-4025: -- Summary: Add detail exception in kylin http response Key: KYLIN-4025 URL: https://issues.apache.org/jira/browse/KYLIN-4025 Project: Kylin Issue Type: New Feature Components: REST Service Affects Versions: v2.5.2 Reporter: Liu Shaohui Assignee: Liu Shaohui Currently there is no detailed error information in the HTTP response when a request goes wrong, because InternalErrorException in the controller wraps the exception and removes the exception stack. It's better to add the detailed exception to the Kylin HTTP response. -- This message was sent by Atlassian JIRA (v7.6.3#76005)
[jira] [Created] (KYLIN-4026) Avoid too many file append operation in HiveProducer of hive metrics reporter
Liu Shaohui created KYLIN-4026: -- Summary: Avoid too many file append operation in HiveProducer of hive metrics reporter Key: KYLIN-4026 URL: https://issues.apache.org/jira/browse/KYLIN-4026 Project: Kylin Issue Type: Improvement Reporter: Liu Shaohui Currently for each write in HiveProducer, there is an HDFS append operation, which is heavy for HDFS. An improvement is to keep an FSDataOutputStream open in HiveProducer and write data to it continuously. -- This message was sent by Atlassian JIRA (v7.6.3#76005)
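The shape of the improvement is a writer that opens its output stream once and reuses it for every record. A self-contained sketch, where a local buffered writer stands in for HDFS's FSDataOutputStream (the class name and API here are illustrative, not HiveProducer's actual code):

```java
import java.io.BufferedWriter;
import java.io.Closeable;
import java.io.IOException;
import java.io.UncheckedIOException;
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.StandardOpenOption;

// Sketch: open the output stream once and keep appending records to it,
// instead of issuing one open-append-close cycle per metric record.
public class BufferedMetricsWriter implements Closeable {
    private final BufferedWriter out;

    BufferedMetricsWriter(Path file) {
        try {
            this.out = Files.newBufferedWriter(file, StandardCharsets.UTF_8,
                    StandardOpenOption.CREATE, StandardOpenOption.APPEND);
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    void write(String record) {
        try {
            out.write(record);
            out.newLine(); // no reopen / no extra append RPC per record
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    @Override
    public void close() {
        try {
            out.close();
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }
}
```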
[jira] [Created] (KYLIN-4029) Overwriting conflict when creating a new data model
Liu Shaohui created KYLIN-4029: -- Summary: Overwriting conflict when creating a new data model Key: KYLIN-4029 URL: https://issues.apache.org/jira/browse/KYLIN-4029 Project: Kylin Issue Type: Improvement Reporter: Liu Shaohui A create-model request failed with an overwriting conflict when saving the project metadata. It left stale state in the metadata, and the user could neither delete it nor create a new model with the same name. {code:java} 2019-05-31 16:35:11,668 ERROR [http-bio-7070-exec-57] controller.BasicController:63 : org.apache.kylin.common.persistence.WriteConflictException: Overwriting conflict /project/BigBI_Hive.json, expect old TS 1559291698212, but it is 1559291711327 at org.apache.kylin.storage.hbase.HBaseResourceStore.checkAndPutResourceImpl(HBaseResourceStore.java:326) at org.apache.kylin.common.persistence.ResourceStore.checkAndPutResourceCheckpoint(ResourceStore.java:327) at org.apache.kylin.common.persistence.ResourceStore.putResource(ResourceStore.java:309) at org.apache.kylin.common.persistence.ResourceStore.putResource(ResourceStore.java:288) at org.apache.kylin.metadata.cachesync.CachedCrudAssist.save(CachedCrudAssist.java:192) at org.apache.kylin.metadata.project.ProjectManager.save(ProjectManager.java:373) at org.apache.kylin.metadata.project.ProjectManager.addModelToProject(ProjectManager.java:251) at org.apache.kylin.metadata.model.DataModelManager.createDataModelDesc(DataModelManager.java:256) at org.apache.kylin.rest.service.ModelService.createModelDesc(ModelService.java:148) at org.apache.kylin.rest.controller.ModelController.saveModelDesc(ModelController.java:128){code} -- This message was sent by Atlassian JIRA (v7.6.3#76005)
[jira] [Created] (KYLIN-4048) Too long spark cube building time for too many eviction and loading for dict slices
Liu Shaohui created KYLIN-4048: -- Summary: Too long spark cube building time for too many eviction and loading for dict slices Key: KYLIN-4048 URL: https://issues.apache.org/jira/browse/KYLIN-4048 Project: Kylin Issue Type: Improvement Affects Versions: v2.5.2 Reporter: Liu Shaohui In our cluster, a cube build took too long. In the Spark log, we found there were too many evictions and loads of dict slices in AppendTrieDictionary. {code:java} $ grep "read slice from" spark.log | wc -l 119721 $ grep "Evict slice with key" spark.log| wc -l 119634 {code} The reason is that the memory of the Spark executor (4G) is not enough to hold all the slices of the dict (3.3G in HDFS) in memory, which causes the bad cube building performance. -- This message was sent by Atlassian JIRA (v7.6.3#76005)
[jira] [Created] (KYLIN-4092) Support setting separate jvm params for kylin background tools
Liu Shaohui created KYLIN-4092: -- Summary: Support setting separate jvm params for kylin background tools Key: KYLIN-4092 URL: https://issues.apache.org/jira/browse/KYLIN-4092 Project: Kylin Issue Type: Improvement Reporter: Liu Shaohui Usually, the memory set in setenv.sh for the query server is larger than 8G, which is not suitable for the Kylin background tools (metadata cleanup, storage cleanup, health check). So it's better to have a separate env setting for the Kylin tools. -- This message was sent by Atlassian JIRA (v7.6.14#76016)
[jira] [Created] (KYLIN-4093) Slow query pages should be open to all users of the project
Liu Shaohui created KYLIN-4093: -- Summary: Slow query pages should be open to all users of the project Key: KYLIN-4093 URL: https://issues.apache.org/jira/browse/KYLIN-4093 Project: Kylin Issue Type: Improvement Reporter: Liu Shaohui Currently the slow query page can only be seen by Kylin admins. It would be very useful to the modelers and analysts of the project as well. -- This message was sent by Atlassian JIRA (v7.6.14#76016)
[jira] [Created] (KYLIN-4094) Add script to create system tables and cubes automatically
Liu Shaohui created KYLIN-4094: -- Summary: Add script to create system tables and cubes automatically Key: KYLIN-4094 URL: https://issues.apache.org/jira/browse/KYLIN-4094 Project: Kylin Issue Type: Improvement Reporter: Liu Shaohui Assignee: Liu Shaohui See: [http://kylin.apache.org/docs/tutorial/setup_systemcube.html] It's a little complex to set up the system cubes. We can add a script to make it easier. -- This message was sent by Atlassian JIRA (v7.6.14#76016)
[jira] [Created] (KYLIN-4095) Add RESOURCE_PATH_PREFIX option in ResourceTool
Liu Shaohui created KYLIN-4095: -- Summary: Add RESOURCE_PATH_PREFIX option in ResourceTool Key: KYLIN-4095 URL: https://issues.apache.org/jira/browse/KYLIN-4095 Project: Kylin Issue Type: Improvement Reporter: Liu Shaohui ResourceTool is very useful for fixing metadata with overlapping segments, but downloading and uploading the entire metadata is too heavy. It's better to have a RESOURCE_PATH_PREFIX option for the download and upload commands. -- This message was sent by Atlassian JIRA (v7.6.14#76016)
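The option boils down to a prefix filter over resource paths, so only the entries under, say, one cube's subtree are copied instead of the whole store. A minimal sketch (names illustrative, not ResourceTool's actual code):

```java
import java.util.List;
import java.util.stream.Collectors;

// Sketch of a RESOURCE_PATH_PREFIX filter: keep only metadata entries whose
// path starts with the given prefix, so download/upload touches a subtree
// instead of the entire metadata store.
public class ResourcePrefixFilter {
    static List<String> filterByPrefix(List<String> allPaths, String prefix) {
        return allPaths.stream()
                .filter(p -> p.startsWith(prefix))
                .collect(Collectors.toList());
    }
}
```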
[jira] [Created] (KYLIN-4096) Make cube metadata validator rules configurable
Liu Shaohui created KYLIN-4096: -- Summary: Make cube metadata validator rules configurable Key: KYLIN-4096 URL: https://issues.apache.org/jira/browse/KYLIN-4096 Project: Kylin Issue Type: Improvement Reporter: Liu Shaohui Assignee: Liu Shaohui CubeMetadataValidator is very useful for normalizing cube creation. In Xiaomi, we implement multiple rules to reduce the operation cost, e.g., ConfOverrideRule, which makes users set the computing queue in the cube configuration and forbids setting some configurations like kylin.query.max-scan-bytes. So it's better to make the rules configurable. -- This message was sent by Atlassian JIRA (v7.6.14#76016)
[jira] [Created] (KYLIN-4097) Throw exception when too many dict slice eviction in AppendTrieDictionary
Liu Shaohui created KYLIN-4097: -- Summary: Throw exception when too many dict slice eviction in AppendTrieDictionary Key: KYLIN-4097 URL: https://issues.apache.org/jira/browse/KYLIN-4097 Project: Kylin Issue Type: Improvement Reporter: Liu Shaohui Assignee: Liu Shaohui When the global dict is much larger than the Spark executor memory, there will be too many dict slice evictions and loads in AppendTrieDictionary, and the build job will be very slow. It's better to throw an exception in advance in this case. -- This message was sent by Atlassian JIRA (v7.6.14#76016)
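The proposed guard is a simple counter around the slice cache's eviction path that fails fast once a configurable threshold is crossed, turning a multi-hour crawl (as in KYLIN-4048) into an immediate, explainable error. A sketch with illustrative names:

```java
// Sketch of the proposed guard: count dict slice evictions and throw once a
// configurable threshold is exceeded, instead of letting the job thrash.
public class EvictionGuard {
    private final long maxEvictions;
    private long evictions = 0;

    EvictionGuard(long maxEvictions) {
        this.maxEvictions = maxEvictions;
    }

    // Called from the slice cache's eviction path.
    void onEviction() {
        if (++evictions > maxEvictions) {
            throw new IllegalStateException("Too many dict slice evictions (" + evictions
                    + "); the global dict likely does not fit in executor memory");
        }
    }
}
```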
[jira] [Created] (KYLIN-4098) Add cube auto merge api
Liu Shaohui created KYLIN-4098: -- Summary: Add cube auto merge api Key: KYLIN-4098 URL: https://issues.apache.org/jira/browse/KYLIN-4098 Project: Kylin Issue Type: Improvement Reporter: Liu Shaohui Assignee: Liu Shaohui Currently the auto merging of a cube is triggered automatically by the event that a new segment is ready. When the cluster restarts, there may be too many merging jobs at once. It's better to have a REST API to trigger the merging and make it more controllable. -- This message was sent by Atlassian JIRA (v7.6.14#76016)
[jira] [Created] (KYLIN-4099) Use non-blocking unpersist in spark cubing job
Liu Shaohui created KYLIN-4099: -- Summary: Use non-blocking unpersist in spark cubing job Key: KYLIN-4099 URL: https://issues.apache.org/jira/browse/KYLIN-4099 Project: Kylin Issue Type: Improvement Reporter: Liu Shaohui Assignee: Liu Shaohui By default, the unpersist operation on an RDD in Spark is blocking, which may cost a lot of time, and sometimes it fails because some Spark executors were lost. We can set blocking to false to improve it. {code:java} sun.misc.Unsafe.park(Native Method) java.util.concurrent.locks.LockSupport.parkNanos(LockSupport.java:215) java.util.concurrent.locks.AbstractQueuedSynchronizer.doAcquireSharedNanos(AbstractQueuedSynchronizer.java:1037) java.util.concurrent.locks.AbstractQueuedSynchronizer.tryAcquireSharedNanos(AbstractQueuedSynchronizer.java:1328) scala.concurrent.impl.Promise$DefaultPromise.tryAwait(Promise.scala:208) scala.concurrent.impl.Promise$DefaultPromise.ready(Promise.scala:218) scala.concurrent.impl.Promise$DefaultPromise.result(Promise.scala:223) scala.concurrent.Await$$anonfun$result$1.apply(package.scala:190) scala.concurrent.BlockContext$DefaultBlockContext$.blockOn(BlockContext.scala:53) scala.concurrent.Await$.result(package.scala:190) org.apache.spark.rpc.RpcTimeout.awaitResult(RpcTimeout.scala:81) org.apache.spark.storage.BlockManagerMaster.removeRdd(BlockManagerMaster.scala:127) org.apache.spark.SparkContext.unpersistRDD(SparkContext.scala:1709) org.apache.spark.rdd.RDD.unpersist(RDD.scala:216) org.apache.spark.api.java.JavaPairRDD.unpersist(JavaPairRDD.scala:73) org.apache.kylin.engine.spark.SparkCubingByLayer.execute(SparkCubingByLayer.java:204) org.apache.kylin.common.util.AbstractApplication.execute(AbstractApplication.java:37) org.apache.kylin.common.util.SparkEntry.main(SparkEntry.java:44) sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method) sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62) sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43) 
java.lang.reflect.Method.invoke(Method.java:498) org.apache.spark.deploy.yarn.ApplicationMaster$$anon$2.run(ApplicationMaster.scala:653){code} -- This message was sent by Atlassian JIRA (v7.6.14#76016)
[jira] [Created] (KYLIN-4100) Add overall job number statistics in monitor page
Liu Shaohui created KYLIN-4100: -- Summary: Add overall job number statistics in monitor page Key: KYLIN-4100 URL: https://issues.apache.org/jira/browse/KYLIN-4100 Project: Kylin Issue Type: Improvement Reporter: Liu Shaohui Assignee: Liu Shaohui Currently it's hard to get the pending and running job numbers on the monitor page; we can only keep clicking "more" until the end. It's better to have overall job number statistics on the monitor page. -- This message was sent by Atlassian JIRA (v7.6.14#76016)
[jira] [Created] (KYLIN-4101) set hive and spark job name when building cube
Liu Shaohui created KYLIN-4101: -- Summary: set hive and spark job name when building cube Key: KYLIN-4101 URL: https://issues.apache.org/jira/browse/KYLIN-4101 Project: Kylin Issue Type: Improvement Reporter: Liu Shaohui Assignee: Liu Shaohui Currently the job name of Spark is org.apache.kylin.common.util.SparkEntry, which is the main class name of the Spark entry. The MapReduce job name of a Hive SQL is a substring of the query, which is difficult to read. It's better to set a more readable name for the Hive and Spark jobs. -- This message was sent by Atlassian JIRA (v7.6.14#76016)
[jira] [Created] (KYLIN-4103) Make the user string in the project granting operation case-insensitive
Liu Shaohui created KYLIN-4103: -- Summary: Make the user string in the project granting operation case-insensitive Key: KYLIN-4103 URL: https://issues.apache.org/jira/browse/KYLIN-4103 Project: Kylin Issue Type: Bug Reporter: Liu Shaohui Assignee: Liu Shaohui Currently the user name in the login operation is case-insensitive: a user can log in to Kylin with a lower-case or upper-case string. But the granting operation is not. If we use the lower-case user name string in a project granting operation, there is no exception, but the user can not access the project. The reason is that the sid in AccessService/AclService is not case-insensitive. -- This message was sent by Atlassian JIRA (v7.6.14#76016)
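The fix direction implied above is to normalize the sid to one case before it is stored in or compared against the ACL, so a grant matches a login regardless of how the user typed the name. A minimal sketch (names illustrative, not AccessService's actual code):

```java
import java.util.Locale;

// Sketch: normalize the user sid to a single case before storing or comparing
// ACL entries, so "Admin", "ADMIN" and "admin" all refer to the same principal.
public class SidNormalizer {
    static String normalize(String sid) {
        return sid == null ? null : sid.toLowerCase(Locale.ROOT);
    }

    static boolean sameUser(String a, String b) {
        return normalize(a).equals(normalize(b));
    }
}
```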
[jira] [Created] (KYLIN-4109) CubeHFileMapperTest failed after commit: f4d2405f6aa978bbc3153c9ca9fa339b9d7e6c30
Liu Shaohui created KYLIN-4109: -- Summary: CubeHFileMapperTest failed after commit: f4d2405f6aa978bbc3153c9ca9fa339b9d7e6c30 Key: KYLIN-4109 URL: https://issues.apache.org/jira/browse/KYLIN-4109 Project: Kylin Issue Type: Bug Reporter: Liu Shaohui CubeHFileMapperTest failed in 2.5.x-hadoop3.1 for following changes {code:java} - assertEquals("cf1", new String(p1.getSecond().getFamily(), StandardCharsets.UTF_8)); - assertEquals("usd_amt", new String(p1.getSecond().getQualifier(), StandardCharsets.UTF_8)); - assertEquals("35.43", new String(p1.getSecond().getValue(), StandardCharsets.UTF_8)); + assertEquals("cf1", new String(copy(p1.getSecond(; + assertEquals("usd_amt", new String(copy(p1.getSecond(; + assertEquals("35.43", new String(copy(p1.getSecond(; assertEquals(key, p2.getFirst()); - assertEquals("cf1", new String(p2.getSecond().getFamily(), StandardCharsets.UTF_8)); - assertEquals("item_count", new String(p2.getSecond().getQualifier(), StandardCharsets.UTF_8)); - assertEquals("2", new String(p2.getSecond().getValue(), StandardCharsets.UTF_8)); + assertEquals("cf1", new String(copy(p2.getSecond(; + assertEquals("item_count", new String(copy(p2.getSecond(; + assertEquals("2", new String(copy(p2.getSecond(; {code} -- This message was sent by Atlassian JIRA (v7.6.14#76016)
[jira] [Created] (KYLIN-4111) drop table failed with no valid privileges after KYLIN-3857
Liu Shaohui created KYLIN-4111: -- Summary: drop table failed with no valid privileges after KYLIN-3857 Key: KYLIN-4111 URL: https://issues.apache.org/jira/browse/KYLIN-4111 Project: Kylin Issue Type: Bug Reporter: Liu Shaohui Assignee: Liu Shaohui After KYLIN-3857, there are back-quotes (`) around the database and table names. The drop table SQL will be: {code:java} DROP TABLE IF EXISTS `kylin_onebox.kylin_intermediate_kylin_sales_cube_7be84be1_a153_07c4_3ce6_270e8d99ff85`;{code} Hive (1.2) with Sentry will throw an exception: {code:java} Error: Error while compiling statement: FAILED: HiveAccessControlException No valid privileges Required privileges for this query: Server=server1->Db=`kylin_onebox->Table=kylin_intermediate_kylin_sales_cube_7be84be1_a153_07c4_3ce6_270e8d99ff85`->action=drop; Query log: http://zjy-hadoop-prc-ct14.bj:18201/log?qid=898c7878-a961-443d-b120-cca0e2667d15_f486bd16-4bbd-4014-a0a7-c2ebfdbe6668 (state=42000,code=4) {code} The reason is that Hive identifies the database as `kylin_onebox and the table as kylin_intermediate_kylin_sales_cube_7be84be1_a153_07c4_3ce6_270e8d99ff85`. Maybe we can fix it in Hive and Sentry; this JIRA is created to record the problem. -- This message was sent by Atlassian JIRA (v7.6.14#76016)
[jira] [Created] (KYLIN-4112) Add hdfs kerberos token delegation in Spark to support HBase and MR using different HDFS clusters
Liu Shaohui created KYLIN-4112: -- Summary: Add hdfs kerberos token delegation in Spark to support HBase and MR using different HDFS clusters Key: KYLIN-4112 URL: https://issues.apache.org/jira/browse/KYLIN-4112 Project: Kylin Issue Type: Improvement Reporter: Liu Shaohui Assignee: Liu Shaohui Currently SparkExecutable only delegates the token for the YARN HDFS cluster, not for the HDFS cluster used by the HBase cluster. The Spark job of Convert Cuboid Data to HFile will fail with a Kerberos issue. -- This message was sent by Atlassian JIRA (v7.6.14#76016)
[jira] [Created] (KYLIN-4175) Support secondary hbase storage config for hbase cluster migration
Liu Shaohui created KYLIN-4175: -- Summary: Support secondary hbase storage config for hbase cluster migration Key: KYLIN-4175 URL: https://issues.apache.org/jira/browse/KYLIN-4175 Project: Kylin Issue Type: Improvement Reporter: Liu Shaohui Assignee: Liu Shaohui Currently, when migrating a Kylin cluster from one data center to another, or when the HBase cluster that Kylin depends on is changed from one cluster to another, there is either a long downtime to migrate the history data between the clusters, or we must rebuild all the historical cube data in the other cluster. In Xiaomi, we added support for a secondary HBase storage so that the Kylin cluster can query cube data from the old HBase cluster during the migration. As a result, the migration is very smooth with minimum downtime. -- This message was sent by Atlassian Jira (v8.3.4#803005)