Re: How kylin can log in without password

2019-12-11 Thread yuzhang
kylinSecurity.xml configures the beans for Spring Security. Simplifying this
file and removing the login page from the frontend webapp may help you.


yuzhang
Email: shifengdefan...@163.com
Signature customized by NetEase Mail Master
On 12/11/2019 22:26,yuzhang wrote:
Hi Wang, Kylin uses Spring Security as its authorization framework. Modifying
the frontend webapp may satisfy your requirement.


On 12/11/2019 10:18,wangdongd...@bidcc.cn wrote:
Dear developers, because of a requirement on our side, we need password-free
login for the Kylin system. We have tried to modify it many times without
success. Therefore, I'd like to ask how Kylin can achieve password-free login
and what configuration is needed.

This is my modified file, kylinSecurity.xml. I don't know if this is the right
file. If you have time to help me, I will be very grateful. Thank you.



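Wang Dongdong (王栋栋), Java R&D Engineer

Computer Network Information Center, Chinese Academy of Sciences
Beijing Beilong Yunhai Network Data Technology Co., Ltd.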


Re: How kylin can log in without password

2019-12-11 Thread yuzhang
Hi Wang, Kylin uses Spring Security as its authorization framework. Modifying
the frontend webapp may satisfy your requirement.


On 12/11/2019 10:18,wangdongd...@bidcc.cn wrote:
Dear developers, because of a requirement on our side, we need password-free
login for the Kylin system. We have tried to modify it many times without
success. Therefore, I'd like to ask how Kylin can achieve password-free login
and what configuration is needed.

This is my modified file, kylinSecurity.xml. I don't know if this is the right
file. If you have time to help me, I will be very grateful. Thank you.



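Wang Dongdong (王栋栋), Java R&D Engineer

Computer Network Information Center, Chinese Academy of Sciences
Beijing Beilong Yunhai Network Data Technology Co., Ltd.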


Re: [VOTE] Release apache-kylin-3.0.0 (RC1)

2019-12-09 Thread yuzhang
expect it  +1



On 12/10/2019 14:07, ShaoFeng Shi wrote:
Hi all,

I have created a build for Apache Kylin 3.0.0, release candidate 1.

Changes highlights:
[KYLIN-4258] - Real-time OLAP may return an incorrect result for some case
[KYLIN-4167] - Refactor streaming coordinator
[KYLIN-4273] - Make cube planner works for real-time streaming job
[KYLIN-4187] - Building dimension dictionary using spark
[KYLIN-4098] - Add cube auto-merge API

Thanks to everyone who has contributed to this release.
Here are the release notes:
https://issues.apache.org/jira/secure/ReleaseNote.jspa?version=12345005==12316121

The commit being voted upon:
https://github.com/apache/kylin/commit/c75242a9b55fd57a3a58d92a2dfa9f21cfe4eebc

Its hash is c75242a9b55fd57a3a58d92a2dfa9f21cfe4eebc.

The artifacts to be voted on, including the source package and two
pre-compiled binary packages are located here:
https://dist.apache.org/repos/dist/dev/kylin/apache-kylin-3.0.0-rc1/

The hashes of the artifacts are as follows:
apache-kylin-3.0.0-source-release.zip.sha256
9224742a87750b8d127c5031c03f3716e3af732c9805a6d0c64871605704f6c0
apache-kylin-3.0.0-bin-hbase1x.tar.gz.sha256
bdeddee3eb453c139eabaa2ce7ebd5d14f72d5ac48e5a64636aba2ed7357dda9
apache-kylin-3.0.0-bin-cdh57.tar.gz.sha256
c2ae9498f61edbacb6dae5fc32e2c4ea14539ef6d906d53194492e042c80185f
apache-kylin-3.0.0-bin-hadoop3.tar.gz.sha256
116ba002d794058bd34bd05989da2c3a7ff87cf67d3647d2f1cc5b5717d445f6
apache-kylin-3.0.0-bin-cdh60.tar.gz.sha256
22a0701b5a03a8d40c8b1be4fe4acb1ff2550a18c52d509b592d59ef5a094f7e
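
A reviewer casting a vote would typically check each downloaded artifact against its published SHA-256 digest. The following is a minimal Python sketch of that check, assuming the artifact has already been downloaded to a local path. The function names `sha256_of` and `digest_matches` are illustrative and are not part of any Kylin or Apache tooling.

```python
import hashlib


def sha256_of(path, chunk_size=1 << 20):
    """Return the hex SHA-256 digest of the file at `path`, read in chunks."""
    digest = hashlib.sha256()
    with open(path, "rb") as f:
        for chunk in iter(lambda: f.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def digest_matches(path, expected_hex):
    """Compare the file digest with a published value, ignoring case and whitespace."""
    return sha256_of(path) == expected_hex.strip().lower()


# Example call (the path is wherever the artifact was saved locally):
# digest_matches("apache-kylin-3.0.0-source-release.zip",
#                "9224742a87750b8d127c5031c03f3716e3af732c9805a6d0c64871605704f6c0")
```

Reading in chunks keeps memory use flat for the larger binary packages. This only checks integrity; the signature still has to be verified separately against the release key.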

A staged Maven repository is available for review at:
https://repository.apache.org/content/repositories/orgapachekylin-1070/

Release artifacts are signed with the following key:
https://people.apache.org/keys/committer/shaofengshi.asc

Please vote on releasing this package as Apache Kylin 3.0.0.

The vote is open for the next 72 hours and passes if a majority of
at least three +1 PMC votes are cast.

[ ] +1 Release this package as Apache Kylin 3.0.0
[ ]  0 I don't feel strongly about it, but I'm okay with the release
[ ] -1 Do not release this package because...


Here is my vote:

+1 (binding)

Best regards,

Shaofeng Shi 史少锋
Apache Kylin PMC
Email: shaofeng...@apache.org

Apache Kylin FAQ: https://kylin.apache.org/docs/gettingstarted/faq.html
Join Kylin user mail group: user-subscr...@kylin.apache.org
Join Kylin dev mail group: dev-subscr...@kylin.apache.org


[jira] [Created] (KYLIN-4142) Upgrade ehcache version from 2 to 3

2019-08-19 Thread Yuzhang QIU (Jira)
Yuzhang QIU created KYLIN-4142:
--

 Summary: Upgrade ehcache version from 2 to 3
 Key: KYLIN-4142
 URL: https://issues.apache.org/jira/browse/KYLIN-4142
 Project: Kylin
  Issue Type: Improvement
  Components: Query Engine
Affects Versions: v2.5.2
Reporter: Yuzhang QIU
Assignee: Yuzhang QIU






--
This message was sent by Atlassian Jira
(v8.3.2#803003)


[KYLIN-3392]Support NULL value in sum, max, min aggregation

2019-08-12 Thread yuzhang
Hi team:
  What do you think about KYLIN-3392?



[jira] [Created] (KYLIN-4080) Project schema update event casues error reload NEW DataModelDesc

2019-07-13 Thread Yuzhang QIU (JIRA)
Yuzhang QIU created KYLIN-4080:
--

 Summary: Project schema update event casues error reload NEW 
DataModelDesc
 Key: KYLIN-4080
 URL: https://issues.apache.org/jira/browse/KYLIN-4080
 Project: Kylin
  Issue Type: Bug
  Components: Metadata
Affects Versions: v2.5.2
Reporter: Yuzhang QIU


Hi, dear Kylin dev team:
   When creating a new DataModelDesc, DataModelManager.createDataModelDesc:246
temporarily adds the new model name to the cache of the selected project
(project1), but does not persist it. This TEMPORARY ADD operation lets the
model reload succeed rather than throw a "No project found for model ..."
exception (at ProjectManager:391).
   However, if another thread is processing "Broadcasting update
project_schema, project1", it will clear the cache of project1 and reload it,
which undoes the "TEMPORARY ADD" operation. Meanwhile, the model-saving thread
has persisted the DataModelDesc and starts to reload it, but finds that there
is "No project for this model".
   The new model can't be created again because of the timestamp conflict, and
it can't be reloaded into the cache because of the problem above.
   What do you think about this?
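
The interleaving described in this report can be replayed deterministically. The following Python sketch is a toy model and not Kylin code: `ProjectCache`, `temporary_add`, `reload_from_store`, and `create_model` are hypothetical stand-ins for the project cache, the temporary add, the cache reload triggered by the broadcast, and the model-saving path.

```python
class ProjectCache:
    """Toy per-project model cache; `persisted` plays the role of the metadata store."""

    def __init__(self, persisted_models):
        self.persisted = set(persisted_models)
        self.cached = set(persisted_models)

    def temporary_add(self, model):
        # Visible in memory only; the store is not updated.
        self.cached.add(model)

    def reload_from_store(self):
        # What a "project_schema" broadcast handler would do: rebuild from the store.
        self.cached = set(self.persisted)

    def find_project_for(self, model):
        if model not in self.cached:
            raise LookupError("No project found for model " + model)
        return "project1"


def create_model(cache, model, broadcast_in_between=False):
    """Replay the model-saving steps, optionally with a broadcast landing mid-way."""
    cache.temporary_add(model)
    if broadcast_in_between:
        cache.reload_from_store()  # the other thread wipes the temporary entry
    return cache.find_project_for(model)  # the reload step of the saving thread


print(create_model(ProjectCache([]), "new_model"))
try:
    create_model(ProjectCache([]), "new_model", broadcast_in_between=True)
except LookupError as err:
    print(err)
```

In this toy model the failure disappears if the temporary entry survives the reload, or if the project entry is persisted before the model is reloaded. Which of those fits Kylin's real code paths is left open here.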


 Best regards
    
 yuzhang





[jira] [Created] (KYLIN-4032) Add tools to show kylin instance which schedule the running job

2019-06-04 Thread Yuzhang QIU (JIRA)
Yuzhang QIU created KYLIN-4032:
--

 Summary: Add tools to show kylin instance which schedule the 
running job
 Key: KYLIN-4032
 URL: https://issues.apache.org/jira/browse/KYLIN-4032
 Project: Kylin
  Issue Type: Improvement
  Components: Job Engine
Affects Versions: v2.5.2
Reporter: Yuzhang QIU


Hi team:
 Sometimes an operator needs to know which instance owns a running or failed
job in order to trace the log file across the Kylin cluster. A simple tool to
show this may be helpful.
  

  Best regards

yuzhang






[jira] [Created] (KYLIN-4031) RestClient will throw exception with message contains clear-text password

2019-06-03 Thread Yuzhang QIU (JIRA)
Yuzhang QIU created KYLIN-4031:
--

 Summary: RestClient will throw exception with message contains 
clear-text password
 Key: KYLIN-4031
 URL: https://issues.apache.org/jira/browse/KYLIN-4031
 Project: Kylin
  Issue Type: Improvement
  Components: REST Service
Affects Versions: v2.5.2
Reporter: Yuzhang QIU


Hi dear Kylin team:
  I found that RestClient:97 throws an IllegalArgumentException whose message
contains the clear-text password when an invalid URI with user:pwd is set. I
think this may cause a security problem.
  What do you think about this?
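
One way to avoid the leak is to mask the password before the URI is placed in any error message. The sketch below is written in Python for brevity and is not the RestClient code. It assumes a URI of the form `user:pwd@host:port`; the names `mask_password` and `parse_uri` and the regular expressions are illustrative only.

```python
import re

# Assumed shape of the credentials prefix: "user:pwd@".
_CREDENTIALS = re.compile(r"^([^:@/\s]+):([^@/\s]*)@")
# Assumed full shape: "user:pwd@host:port" with a numeric port.
_FULL_URI = re.compile(r"^([^:@\s]+):([^@\s]+)@([^:@\s]+):(\d+)$")


def mask_password(uri):
    """Replace the password part of 'user:pwd@...' with asterisks."""
    return _CREDENTIALS.sub(lambda m: m.group(1) + ":*****@", uri, count=1)


def parse_uri(uri):
    """Parse the URI, raising an error that never echoes the password."""
    match = _FULL_URI.match(uri)
    if match is None:
        raise ValueError("URI " + mask_password(uri) + " does not match user:pwd@host:port")
    user, password, host, port = match.groups()
    return user, password, host, int(port)


try:
    parse_uri("ADMIN:secret@localhost:70x70")
except ValueError as err:
    print(err)
```

Masking at the point where the message is built keeps the password out of logs and stack traces, while the caller still sees which user and host were involved.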


   Best Regards

 yuzhang






[jira] [Created] (KYLIN-4020) fix_length rowkey encode without sepecified length can be saved but cause CreateHTable step failed

2019-05-28 Thread Yuzhang QIU (JIRA)
Yuzhang QIU created KYLIN-4020:
--

 Summary: fix_length rowkey encode without sepecified length can be 
saved but cause CreateHTable step failed
 Key: KYLIN-4020
 URL: https://issues.apache.org/jira/browse/KYLIN-4020
 Project: Kylin
  Issue Type: Improvement
  Components: Metadata
Affects Versions: v2.5.2
Reporter: Yuzhang QIU


Hi team:

As the title says.
I think there should be stricter checks on the advanced settings.

What do you think about this?

If the same JIRA already exists, please let me know and close this one.
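
A stricter check of this kind could reject the cube description at save time, before any build job is submitted. The following Python sketch shows the idea only. It assumes encodings are written as `name` or `name:length` (for example `dict` or `fixed_length:12`); the function name and the set of encodings that require a length are hypothetical, not Kylin's actual validation code.

```python
# Assumption: these encodings are meaningless without a positive length argument.
ENCODINGS_REQUIRING_LENGTH = {"fixed_length"}


def validate_rowkey_encoding(column, encoding):
    """Raise ValueError if an encoding that needs a length is saved without one."""
    name, _, arg = encoding.partition(":")
    name, arg = name.strip(), arg.strip()
    if name in ENCODINGS_REQUIRING_LENGTH:
        if not arg.isdecimal() or int(arg) <= 0:
            raise ValueError(
                "Rowkey column " + column + ": encoding '" + name
                + "' requires a positive length, e.g. '" + name + ":12'"
            )


validate_rowkey_encoding("SELLER_ID", "fixed_length:12")  # accepted
try:
    validate_rowkey_encoding("SELLER_ID", "fixed_length")  # rejected at save time
except ValueError as err:
    print(err)
```

Failing early at save time gives the user an actionable message instead of a failed CreateHTable step later in the job.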


   Best regards

   yuzhang





[jira] [Created] (KYLIN-4013) Only show the cubes under one model

2019-05-25 Thread Yuzhang QIU (JIRA)
Yuzhang QIU created KYLIN-4013:
--

 Summary: Only show the cubes under one model
 Key: KYLIN-4013
 URL: https://issues.apache.org/jira/browse/KYLIN-4013
 Project: Kylin
  Issue Type: Improvement
  Components: Web 
Affects Versions: v2.5.2
Reporter: Yuzhang QIU
Assignee: Yuzhang QIU


Some improvements for the UI.
Users may want to see the cubes under a specified model. Add an extra action
'Cubes' to the model's drop-down list to filter the cube list.





Re: Altlas Error when run IT on sandbox2.4

2019-05-05 Thread yuzhang
Well, stopping the Atlas process and removing
`org.apache.atlas.hive.hook.HiveHook` from the Hive configuration in Ambari
solves this problem. The Atlas process is not necessary for running the
integration tests.
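
For reference, the Atlas hook is usually registered as one entry in a comma-separated hook list such as the `hive.exec.post.hooks` property; whether that is the exact property on this sandbox is an assumption. The small Python helper below (the name `remove_hook` is illustrative) shows how to compute the property value to paste back into Ambari while keeping any other hooks.

```python
ATLAS_HOOK = "org.apache.atlas.hive.hook.HiveHook"


def remove_hook(hooks_value, hook=ATLAS_HOOK):
    """Drop one hook class from a comma-separated hook list, keeping the rest in order."""
    kept = [h.strip() for h in hooks_value.split(",") if h.strip() and h.strip() != hook]
    return ",".join(kept)


# Example value; the second class is only a placeholder for "some other hook".
current = "org.apache.atlas.hive.hook.HiveHook, org.apache.hadoop.hive.ql.hooks.ATSHook"
print(remove_hook(current))
```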


On 5/5/2019 09:01,yuzhang wrote:


I also found the following exception message in the Atlas log file.


2019-05-05 00:40:06,346 DEBUG - [qtp1798286609-13 - 
1a863767-d092-4b8d-a45a-d8cb82d8e6ae:] ~ submitting entity {
"jsonClass":"org.apache.atlas.typesystem.json.InstanceSerialization$_Reference",
"id":{
"jsonClass":"org.apache.atlas.typesystem.json.InstanceSerialization$_Id",
"id":"-39494778537956",
"version":0,
"typeName":"hive_table"
},
"typeName":"hive_table",
"values":{
"tableType":"MANAGED_TABLE",
"name":"default.kylin_intermediate_ci_inner_join_cube_325139ef_5dd0_01b4_ae61_bc4dcc99c2bd__group_by@Sandbox",
"createTime":"1557016193",
"temporary":false,
"db":{
"jsonClass":"org.apache.atlas.typesystem.json.InstanceSerialization$_Reference",
"id":{
"jsonClass":"org.apache.atlas.typesystem.json.InstanceSerialization$_Id",
"id":"409729d5-c11f-482e-b211-3f50bd097b8e",
"version":0,
"typeName":"hive_db"
},
"typeName":"hive_db",
"values":{

},
"traitNames":[

],
"traits":{

}
},
"retention":0,
"tableName":"kylin_intermediate_ci_inner_join_cube_325139ef_5dd0_01b4_ae61_bc4dcc99c2bd__group_by",
"columns":[
{
"jsonClass":"org.apache.atlas.typesystem.json.InstanceSerialization$_Reference",
"id":{
"jsonClass":"org.apache.atlas.typesystem.json.InstanceSerialization$_Id",
"id":"bb04e660-6be2-42a0-8ad2-1b36487e24b0",
"version":0,
"typeName":"hive_column"
},
"typeName":"hive_column",
"values":{

},
"traitNames":[

],
"traits":{

}
}
],
"comment":"",
"lastAccessTime":0,
"owner":"root",
"sd":{
"jsonClass":"org.apache.atlas.typesystem.json.InstanceSerialization$_Reference",
"id":{
"jsonClass":"org.apache.atlas.typesystem.json.InstanceSerialization$_Id",
"id":"b71fce9e-4e47-4af3-8c51-d8f93a45ebe4",
"version":0,
"typeName":"hive_storagedesc"
},
"typeName":"hive_storagedesc",
"values":{

},
"traitNames":[

],
"traits":{

}
},
"parameters":{
"comment":"",
"transient_lastDdlTime":"1557016193"
},
"partitionKeys":[
{
"jsonClass":"org.apache.atlas.typesystem.json.InstanceSerialization$_Reference",
"id":{
"jsonClass":"org.apache.atlas.typesystem.json.InstanceSerialization$_Id",
"id":"81cd90bf-7f7f-4951-9d68-3273337573d3",
"version":0,
"typeName":"hive_column"
},
"typeName":"hive_column",
"values":{

},
"traitNames":[

],
"traits":{

}
}
]
},
"traitNames":[

],
"traits":{

}
}  (EntityResource:94)
2019-05-05 00:40:06,349 ERROR - [qtp1798286609-13 - 
1a863767-d092-4b8d-a45a-d8cb82d8e6ae:] ~ Unable to persist entity instance due 
to a desrialization error  (EntityResource:109)
org.apache.atlas.typesystem.types.ValueConversionException: Cannot convert 
value 'org.apache.atlas.typesystem.Referenceable@2f651f62' to datatype 
hive_table
at org.apache.atlas.typesystem.types.ClassType.convert(ClassType.java:143)
at 
org.apache.atlas.services.DefaultMetadataService.deserializeClassInstance(DefaultMetadataService.java:252)
at 
org.apache.atlas.services.DefaultMetadataService.createEntity(DefaultMetadataService.java:230)
at org.apache.atlas.web.resources.EntityResource.submit(EntityResource.java:96)
at sun.reflect.GeneratedMethodAccessor70.invoke(Unknown Source)
at 
sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
at java.lang.reflect.Method.invoke(Method.java:498)
at 
com.sun.jersey.spi.container.JavaMethodInvokerFactory$1.invoke(JavaMethodInvokerFactory.java:60)
at 
com.sun.jersey.server.impl.model.method.dispatch.AbstractResourceMethodDispatchProvider$ResponseOutInvoker._dispatch(AbstractResourceMethodDispatchProvider.java:205)
at 
com.sun.jersey.server.impl.model.method.dispatch.ResourceJavaMethodDispatcher.dispatch(ResourceJavaMethodDispatcher.java:75)
at 
com.sun.jersey.server.impl.uri.rules.HttpMethodRule.accept(HttpMethodRule.java:288)
at 
com.sun.jersey.server.impl.uri.rules.

Re: Altlas Error when run IT on sandbox2.4

2019-05-04 Thread yuzhang
sun.jersey.server.impl.model.method.dispatch.AbstractResourceMethodDispatchProvider$ResponseOutInvoker._dispatch(AbstractResourceMethodDispatchProvider.java:205)
at 
com.sun.jersey.server.impl.model.method.dispatch.ResourceJavaMethodDispatcher.dispatch(ResourceJavaMethodDispatcher.java:75)
at 
com.sun.jersey.server.impl.uri.rules.HttpMethodRule.accept(HttpMethodRule.java:288)
at 
com.sun.jersey.server.impl.uri.rules.ResourceClassRule.accept(ResourceClassRule.java:108)
at 
com.sun.jersey.server.impl.uri.rules.RightHandPathRule.accept(RightHandPathRule.java:147)
at 
com.sun.jersey.server.impl.uri.rules.RootResourceClassesRule.accept(RootResourceClassesRule.java:84)
at 
com.sun.jersey.server.impl.application.WebApplicationImpl._handleRequest(WebApplicationImpl.java:1469)
at 
com.sun.jersey.server.impl.application.WebApplicationImpl._handleRequest(WebApplicationImpl.java:1400)
at 
com.sun.jersey.server.impl.application.WebApplicationImpl.handleRequest(WebApplicationImpl.java:1349)
at 
com.sun.jersey.server.impl.application.WebApplicationImpl.handleRequest(WebApplicationImpl.java:1339)
at 
com.sun.jersey.spi.container.servlet.WebComponent.service(WebComponent.java:409)
at 
com.sun.jersey.spi.container.servlet.ServletContainer.service(ServletContainer.java:558)
at 
com.sun.jersey.spi.container.servlet.ServletContainer.service(ServletContainer.java:733)
at javax.servlet.http.HttpServlet.service(HttpServlet.java:790)
at 
com.google.inject.servlet.ServletDefinition.doServiceImpl(ServletDefinition.java:287)
at 
com.google.inject.servlet.ServletDefinition.doService(ServletDefinition.java:277)
at 
com.google.inject.servlet.ServletDefinition.service(ServletDefinition.java:182)
at 
com.google.inject.servlet.ManagedServletPipeline.service(ManagedServletPipeline.java:91)
at 
com.google.inject.servlet.FilterChainInvocation.doFilter(FilterChainInvocation.java:85)
at org.apache.atlas.web.filters.AuditFilter.doFilter(AuditFilter.java:67)
at 
com.google.inject.servlet.FilterChainInvocation.doFilter(FilterChainInvocation.java:82)
at 
com.google.inject.servlet.ManagedFilterPipeline.dispatch(ManagedFilterPipeline.java:119)
at com.google.inject.servlet.GuiceFilter$1.call(GuiceFilter.java:133)
at com.google.inject.servlet.GuiceFilter$1.call(GuiceFilter.java:130)
at com.google.inject.servlet.GuiceFilter$Context.call(GuiceFilter.java:203)
at com.google.inject.servlet.GuiceFilter.doFilter(GuiceFilter.java:130)
at 
org.eclipse.jetty.servlet.ServletHandler$CachedChain.doFilter(ServletHandler.java:1652)
at org.eclipse.jetty.servlet.ServletHandler.doHandle(ServletHandler.java:585)
at org.eclipse.jetty.server.handler.ScopedHandler.handle(ScopedHandler.java:143)
at org.eclipse.jetty.security.SecurityHandler.handle(SecurityHandler.java:577)
at 
org.eclipse.jetty.server.session.SessionHandler.doHandle(SessionHandler.java:223)
at 
org.eclipse.jetty.server.handler.ContextHandler.doHandle(ContextHandler.java:1127)
at org.eclipse.jetty.servlet.ServletHandler.doScope(ServletHandler.java:515)
at 
org.eclipse.jetty.server.session.SessionHandler.doScope(SessionHandler.java:185)
at 
org.eclipse.jetty.server.handler.ContextHandler.doScope(ContextHandler.java:1061)
at org.eclipse.jetty.server.handler.ScopedHandler.handle(ScopedHandler.java:141)
at 
org.eclipse.jetty.server.handler.HandlerWrapper.handle(HandlerWrapper.java:97)
at org.eclipse.jetty.server.Server.handle(Server.java:499)
at org.eclipse.jetty.server.HttpChannel.handle(HttpChannel.java:310)
at org.eclipse.jetty.server.HttpConnection.onFillable(HttpConnection.java:257)
at org.eclipse.jetty.io.AbstractConnection$2.run(AbstractConnection.java:540)
at 
org.eclipse.jetty.util.thread.QueuedThreadPool.runJob(QueuedThreadPool.java:635)
at 
org.eclipse.jetty.util.thread.QueuedThreadPool$3.run(QueuedThreadPool.java:555)
at java.lang.Thread.run(Thread.java:748)
Caused by: 
org.apache.atlas.typesystem.types.ValueConversionException$NullConversionException:
 Null value not allowed for multiplicty Multiplicity{lower=1, upper=1, 
isUnique=false}
at 
org.apache.atlas.typesystem.types.DataTypes$PrimitiveType.convertNull(DataTypes.java:93)
at 
org.apache.atlas.typesystem.types.DataTypes$StringType.convert(DataTypes.java:469)
at 
org.apache.atlas.typesystem.types.DataTypes$StringType.convert(DataTypes.java:452)
at 
org.apache.atlas.typesystem.types.DataTypes$MapType.convert(DataTypes.java:606)
at 
org.apache.atlas.typesystem.types.DataTypes$MapType.convert(DataTypes.java:562)
at 
org.apache.atlas.typesystem.persistence.StructInstance.set(StructInstance.java:118)
at org.apache.atlas.typesystem.types.ClassType.convert(ClassType.java:141)
... 51 more
On 5/5/2019 08:47,yuzhang wrote:
Hi all:
I hit this exception when I run the integration tests on sandbox 2.4 or run
the Hive command in a shell. Has anyone met the same problem?
Here is a discussion about this problem, but I don't think it is helpful.
https://community.hortonworks.com/questions/224847/i-am-getting-errornull-value-not

Altlas Error when run IT on sandbox2.4

2019-05-04 Thread yuzhang
Hi all:
I hit this exception when I run the integration tests on sandbox 2.4 or run
the Hive command in a shell. Has anyone met the same problem?
Here is a discussion about this problem, but I don't think it is helpful.
https://community.hortonworks.com/questions/224847/i-am-getting-errornull-value-not-allowed-for-multi.html
Best regards
yuzhang




==Hive cmd:==
hive -e "USE default;

CREATE TABLE IF NOT EXISTS default.ci_inner_join_cube_global_dict
( dict_key STRING COMMENT '',
dict_val INT COMMENT ''
)
COMMENT ''
PARTITIONED BY (dict_column string)
STORED AS TEXTFILE;
DROP TABLE IF EXISTS kylin_intermediate_ci_inner_join_cube_325139ef_5dd0_01b4_ae61_bc4dcc99c2bd__group_by;
CREATE TABLE IF NOT EXISTS kylin_intermediate_ci_inner_join_cube_325139ef_5dd0_01b4_ae61_bc4dcc99c2bd__group_by
(
 dict_key STRING COMMENT ''
)
COMMENT ''
PARTITIONED BY (dict_column string)
STORED AS SEQUENCEFILE
;
INSERT OVERWRITE TABLE kylin_intermediate_ci_inner_join_cube_325139ef_5dd0_01b4_ae61_bc4dcc99c2bd__group_by
PARTITION (dict_column = 'TEST_KYLIN_FACT_TEST_COUNT_DISTINCT_BITMAP')
SELECT
TEST_KYLIN_FACT_TEST_COUNT_DISTINCT_BITMAP
FROM kylin_intermediate_ci_inner_join_cube_325139ef_5dd0_01b4_ae61_bc4dcc99c2bd
GROUP BY TEST_KYLIN_FACT_TEST_COUNT_DISTINCT_BITMAP
;

" --hiveconf mapreduce.map.output.compress.codec=org.apache.hadoop.io.compress.SnappyCodec --hiveconf dfs.replication=2 --hiveconf hive.exec.compress.output=true


  


===Error
org.apache.atlas.AtlasServiceException: Metadata service API CREATE_ENTITY 
failed with status 400(Bad Request) Response Body ({"error":"Null value not 
allowed for multiplicty Multiplicity{lower=1, upper=1, 
isUnique=false}","stackTrace":"org.apache.atlas.typesystem.types.ValueConversionException$NullConversionException:
 Null value not allowed for multiplicty Multiplicity{lower=1, upper=1, 
isUnique=false}\n\tat 
org.apache.atlas.typesystem.types.DataTypes$PrimitiveType.convertNull(DataTypes.java:93)\n\tat
 
org.apache.atlas.typesystem.types.DataTypes$StringType.convert(DataTypes.java:469)\n\tat
 
org.apache.atlas.typesystem.types.DataTypes$StringType.convert(DataTypes.java:452)\n\tat
 
org.apache.atlas.typesystem.types.DataTypes$MapType.convert(DataTypes.java:606)\n\tat
 
org.apache.atlas.typesystem.types.DataTypes$MapType.convert(DataTypes.java:562)\n\tat
 
org.apache.atlas.typesystem.persistence.StructInstance.set(StructInstance.java:118)\n\tat
 org.apache.atlas.typesystem.types.ClassType.convert(ClassType.java:141)\n\tat 
org.apache.atlas.services.DefaultMetadataService.deserializeClassInstance(DefaultMetadataService.java:252)\n\tat
 
org.apache.atlas.services.DefaultMetadataService.createEntity(DefaultMetadataService.java:230)\n\tat
 
org.apache.atlas.web.resources.EntityResource.submit(EntityResource.java:96)\n\tat
 sun.reflect.GeneratedMethodAccessor70.invoke(Unknown Source)\n\tat 
sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)\n\tat
 java.lang.reflect.Method.invoke(Method.java:498)\n\tat 
com.sun.jersey.spi.container.JavaMethodInvokerFactory$1.invoke(JavaMethodInvokerFactory.java:60)\n\tat
 
com.sun.jersey.server.impl.model.method.dispatch.AbstractResourceMethodDispatchProvider$ResponseOutInvoker._dispatch(AbstractResourceMethodDispatchProvider.java:205)\n\tat
 
com.sun.jersey.server.impl.model.method.dispatch.ResourceJavaMethodDispatcher.dispatch(ResourceJavaMethodDispatcher.java:75)\n\tat
 
com.sun.jersey.server.impl.uri.rules.HttpMethodRule.accept(HttpMethodRule.java:288)\n\tat
 
com.sun.jersey.server.impl.uri.rules.ResourceClassRule.accept(ResourceClassRule.java:108)\n\tat
 
com.sun.jersey.server.impl.uri.rules.RightHandPathRule.accept(RightHandPathRule.java:147)\n\tat
 
com.sun.jersey.server.impl.uri.rules.RootResourceClassesRule.accept(RootResourceClassesRule.java:84)\n\tat
 
com.sun.jersey.server.impl.application.WebApplicationImpl._handleRequest(WebApplicationImpl.java:1469)\n\tat
 
com.sun.jersey.server.impl.application.WebApplicationImpl._handleRequest(WebApplicationImpl.java:1400)\n\tat
 
com.sun.jersey.server.impl.application.WebApplicationImpl.handleRequest(WebApplicationImpl.java:1349)\n\tat
 
com.sun.jersey.server.impl.application.WebApplicationImpl.handleRequest(WebApplicationImpl.java:1339)\n\tat
 
com.sun.jersey.spi.container.servlet.WebComponent.service(WebComponent.java:409)\n\tat
 
com.sun.jersey.spi.container.servlet.ServletContainer.service(ServletContainer.java:558)\n\tat
 
com.sun.jersey.spi.container.servlet.ServletContainer.service(ServletContainer.java:733)\n\tat
 javax.servlet.http.HttpServlet.service(HttpServlet.java:790)\n\tat 
com.google.inject.servlet.ServletDefinition.doServiceImpl(ServletDefinition.java:287)\n\tat
 
com.google.inject.servlet.ServletDefinition.doService(ServletDefinition.java:277)\n\tat
 
com.google.inject.servlet.ServletDefinition.service(ServletDefinition.java:182)\n\tat
 
com.google.

Re: Why does the log always say “No Data Available” when the cube is built?

2019-04-28 Thread yuzhang
Hi shiqi:
"No Data Available" means that this step of the job hasn't completed yet. Once
the step completes, whether successfully or not, some log output will be shown.


For your problem, could you provide more details about your build job? The
server log, which step is running, your deployment environment, etc. would be
helpful.




Best regards


        yuzhang


On 4/28/2019 21:05,shiqi wrote:
In the sample case on the Kylin official website, when I was building cube,
in the first step of the Create Intermediate Flat Hive Table, the log is
always No Data Available, the status is always running.

The cube build has been executed for more than three hours.

I checked the hive database table kylin_sales and there is data in the
table.

And I found that the intermediate flat hive table
kylin_intermediate_kylin_sales_cube_402e3eaa_dfb2_7e3e_04f3_07248c04c10c
has been created successfully in Hive, but there is no data in it.

```
hive> show tables;
OK
...
kylin_intermediate_kylin_sales_cube_402e3eaa_dfb2_7e3e_04f3_07248c04c10c
kylin_sales
...
Time taken: 9.816 seconds, Fetched: 1 row(s)

hive> select * from kylin_sales;
OK
...
89922012-04-17  ABIN15687   0   13  95.5336 17  19751507   
ADMIN   Shanghai
89932013-02-02  FP-non GTC  67698   0   13  85.7528 6   1856   
10004882MODELER Hongkong
...
Time taken: 3.759 seconds, Fetched: 1 row(s)
```

--
Sent from: http://apache-kylin.74782.x6.nabble.com/


Re: [DISCUSSION] Don't need to purge existing segment of cube to add new measures in Kylin

2019-04-26 Thread yuzhang
Thanks for your replies. Here is the JIRA for this feature. The PR is being
prepared. I hope to get more advice from you.


https://issues.apache.org/jira/browse/KYLIN-3982


Regards


yuzhang

On 4/26/2019 17:05,Iñigo Martínez wrote:
+1 Post Jira issue in order to subscribe, please. :-)


Our BI area modifies cubes very frequently by adding or removing dimensions.
In many cases they want to include the full historical series, so they have to
rebuild cubes from scratch, which sometimes takes days to finish. From a
resource-usage point of view this wastes computational and memory power,
because other tasks are affected by the lack of resources.
I think this proposal is really interesting for us.


On Fri, Apr 26, 2019 at 10:57, ShaoFeng Shi wrote:

Hi Yuzhang,


Please open a JIRA for this enhancement; If it can be implemented in an elegant 
way, that will be great!


Best regards,


Shaofeng Shi 史少锋
Apache Kylin PMC
Email: shaofeng...@apache.org



On Tue, Apr 23, 2019 at 8:56 AM, yuzhang wrote:

Hi Shaofeng:
We also ran some experiments on adding a measure after the cube had been built
and hit a byte error at the very start. The default mapping strategy between
the HBase store and the measure definitions is "multiple measures are stored in
one column of a column family", which may cause a byte error when a measure is
added and inserted into the original measure sequence. Adding a separate column
for the new measure may be better, I think.
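
The byte error can be reproduced in miniature. The Python sketch below is not Kylin's serializer: it assumes fixed-width measures packed back to back into one cell, and the measure names and `struct` format codes are made up for illustration. It decodes a cell written with the old measure sequence using a new sequence that has one measure inserted in the middle, then shows the per-column alternative.

```python
import struct

# Hypothetical fixed-width layouts: (measure name, struct format code).
OLD_MEASURES = [("COUNT", "q"), ("SUM_PRICE", "d")]
NEW_MEASURES = [("COUNT", "q"), ("MAX_QTY", "i"), ("SUM_PRICE", "d")]  # MAX_QTY inserted


def encode_packed(measures, values):
    """Pack all measures back to back into one cell (the single-column layout)."""
    return b"".join(struct.pack(">" + fmt, values[name]) for name, fmt in measures)


def decode_packed(measures, cell):
    """Decode a packed cell by walking the measure sequence in order."""
    decoded, offset = {}, 0
    for name, fmt in measures:
        (decoded[name],) = struct.unpack_from(">" + fmt, cell, offset)
        offset += struct.calcsize(">" + fmt)
    return decoded


def decode_per_column(measures, row):
    """Decode a row that stores each measure under its own qualifier."""
    return {
        name: struct.unpack(">" + fmt, row[name])[0] if name in row else None
        for name, fmt in measures
    }


old_values = {"COUNT": 3, "SUM_PRICE": 95.5}
old_cell = encode_packed(OLD_MEASURES, old_values)

try:
    decode_packed(NEW_MEASURES, old_cell)  # offsets no longer line up
    packed_result = "decoded"
except struct.error:
    packed_result = "byte error"

old_row = {name: struct.pack(">" + fmt, old_values[name]) for name, fmt in OLD_MEASURES}
per_column_result = decode_per_column(NEW_MEASURES, old_row)
print(packed_result, per_column_result)
```

With one qualifier per measure, an old segment simply lacks the new qualifier, so the reader can report the measure as absent instead of misreading neighbouring bytes.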

I have only a preliminary idea, perhaps impractical for now, about the
measure management design.
Dimensions and metrics are defined once the model is designed. A measure
aggregates the metrics over different dimensions to observe the data entities
represented by the model. All of this belongs to the 'logical view', I think.
The cube is a materialized view of this logical model, the bridge between the
logical view and the physical storage (and the highway is set up).
The life cycle of a measure may depend on the model rather than on the cube.


Based on this design, measure management can be set up once the model design
is complete. We can define measures on the model. Cubes under the model can
reuse those measures and build their segment data. When a SQL query arrives,
the Kylin query server needs to find a suitable model with suitable measures,
then find an available cube.


Of course, such a design change would have a very large impact on the
existing Kylin architecture, and the query and metadata layers would change
substantially. So it seems that it is still on paper.
A more realistic, transitional design is to extend the metadata of the
measure. Just as CubeDesc defines the schema and a related CubeInstance
manages the built segments, MeasureDesc can also have a MeasureInstance to
manage the segments that contain it.
I observed that Kylin's query service generates a GridTable for mapping between
logical views and HBase physical storage: Cuboid + Measure -> Grid Table <-
HBase store. This Grid Table is generated from CubeDesc, and the mapping is
done for each segment. Therefore, at the mapping stage, the metadata can tell
which columns of the Grid Table are not available in the current segment, so
the measure data can be read selectively at the RS backend.
But its life cycle is the same as that of MeasureDesc, managed by CubeDesc.
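
The selective read described above amounts to a set lookup per segment. The Python sketch below is only an illustration of that idea: `SegmentMeasures` and `split_requested` are hypothetical names standing in for the proposed per-segment measure metadata, not existing Kylin classes.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class SegmentMeasures:
    """Hypothetical per-segment record of which measures were actually built."""
    segment: str
    measures: frozenset


def split_requested(requested, seg):
    """Return (readable, absent) measure names for one segment, in request order."""
    readable = [m for m in requested if m in seg.measures]
    absent = [m for m in requested if m not in seg.measures]
    return readable, absent


old_segment = SegmentMeasures("20190101_20190201", frozenset({"COUNT", "SUM_PRICE"}))
new_segment = SegmentMeasures("20190201_20190301", frozenset({"COUNT", "SUM_PRICE", "MAX_QTY"}))

for seg in (old_segment, new_segment):
    print(seg.segment, split_requested(["SUM_PRICE", "MAX_QTY"], seg))
```

The `absent` list is also the information a query would need in order to tell the user which measures were missing from some of the scanned segments.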


Regarding adding dimensions to the same cube, we also need to consider
aggregation groups and rowkey order. I am curious about and interested in how
you implemented it.




Best regards


        yuzhang


On 4/22/2019 09:05,ShaoFeng Shi wrote:
Hi Yuzhang,

Glad to see such a discussion; How to support "schema change" in a friendly
way is what we should do in the next phase, as we see this requirement is
stronger than before.

Last week I also tried 1) adding a dimension after the cube was built,
and 2) adding a measure after the cube was built.

For 1) I have an idea; the first try was successful, and I want to
discuss it with the community some day.

2) failed: after a new measure was added, the query failed and there was
a byte parsing error on the HBase RS side. I didn't continue after that.

Could you elaborate your idea on "the measures of the analysis system can
be 

[jira] [Created] (KYLIN-3986) Add hint about the absent measures after a success query

2019-04-26 Thread Yuzhang QIU (JIRA)
Yuzhang QIU created KYLIN-3986:
--

 Summary: Add hint about the absent measures after a success query
 Key: KYLIN-3986
 URL: https://issues.apache.org/jira/browse/KYLIN-3986
 Project: Kylin
  Issue Type: Sub-task
  Components: Query Engine
Affects Versions: v2.5.2
Reporter: Yuzhang QIU
Assignee: Yuzhang QIU








[jira] [Created] (KYLIN-3985) [Web UI] Support map measures to muti-qualifier in column family

2019-04-26 Thread Yuzhang QIU (JIRA)
Yuzhang QIU created KYLIN-3985:
--

 Summary: [Web UI] Support map measures to muti-qualifier in column 
family
 Key: KYLIN-3985
 URL: https://issues.apache.org/jira/browse/KYLIN-3985
 Project: Kylin
  Issue Type: Sub-task
  Components: Web 
Affects Versions: v2.5.2
Reporter: Yuzhang QIU
Assignee: Yuzhang QIU








[jira] [Created] (KYLIN-3984) Update measure metadata after job finished

2019-04-26 Thread Yuzhang QIU (JIRA)
Yuzhang QIU created KYLIN-3984:
--

 Summary: Update measure metadata after job finished
 Key: KYLIN-3984
 URL: https://issues.apache.org/jira/browse/KYLIN-3984
 Project: Kylin
  Issue Type: Sub-task
Affects Versions: v2.5.2
Reporter: Yuzhang QIU
Assignee: Yuzhang QIU


Merging, building, and refreshing a cube will update the measure metadata.





[jira] [Created] (KYLIN-3983) Add extra metadata for measure

2019-04-26 Thread Yuzhang QIU (JIRA)
Yuzhang QIU created KYLIN-3983:
--

 Summary: Add extra metadata for measure
 Key: KYLIN-3983
 URL: https://issues.apache.org/jira/browse/KYLIN-3983
 Project: Kylin
  Issue Type: Sub-task
  Components: Metadata
Affects Versions: v2.5.2
Reporter: Yuzhang QIU
Assignee: Yuzhang QIU


Just like CubeDesc and CubeInstance, we need to add extra metadata for measure 
to persist some runtime data.





[jira] [Created] (KYLIN-3982) Add measures without purging segments

2019-04-26 Thread Yuzhang QIU (JIRA)
Yuzhang QIU created KYLIN-3982:
--

 Summary: Add measures without purging segments
 Key: KYLIN-3982
 URL: https://issues.apache.org/jira/browse/KYLIN-3982
 Project: Kylin
  Issue Type: New Feature
  Components: Metadata, Query Engine, Tools, Build and Test
Affects Versions: v2.5.2
Reporter: Yuzhang QIU
Assignee: Yuzhang QIU


Here is the discussion

https://lists.apache.org/thread.html/44bf088f278d0ca3087bb8bdffda158534994d4c41be5405eb4699d8@%3Cdev.kylin.apache.org%3E



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


Re: [DISCUSSION] Don't need to purge existing segment of cube to add new measures in Kylin

2019-04-22 Thread yuzhang
Hi Shaofeng:
We also ran some experiments on adding a measure after the cube had been built,
and hit a byte error at the very start. The default mapping strategy between
the HBase store and the measure definitions is "multiple measures are stored in
one column of a column family", which can cause a byte error once a measure is
added and inserted into the original measure sequence. Adding a separate column
for the new measure may be better, I think.

I only have a preliminary idea about the measure management design, and it may
be impractical for now.
Dimensions and metrics are defined once the model is designed. Measures
aggregate the metrics over different dimensions to observe the data entities
represented by the model. All of this belongs to the 'logical view', I think.
The cube is a materialized view of this logical model, and it is the bridge
between the logical view and the physical storage (and the highway is set up).
The life cycle of a measure may depend on the model rather than on the cube.


Based on this design, measure management can be set up after the model design
is completed. We can define measures on the model. Cubes under the model
can reuse those measures and build their segment data. When a SQL query
arrives, the Kylin query server needs to find a suitable model with a suitable
measure, and then find an available cube.


Of course, such a design change would have a very large impact on the
existing Kylin architecture, and the query and metadata layers would change
substantially. So for now it is still only on paper.
A more realistic, transitional design is to extend the metadata of the
measure. Just as CubeDesc defines the schema and a related CubeInstance
manages the built segments, MeasureDesc could also have a MeasureInstance to
manage the segments that contain it.
I observed that Kylin's query service generates a GridTable for mapping between
logical views and HBase physical storage: Cuboid + Measure -> Grid Table <-
HBase store. This Grid Table is generated from CubeDesc, and the mapping is
performed for each segment. Therefore, at the mapping stage, the metadata can
tell which columns of the Grid Table are unavailable in the current segment,
so the measure data can be read selectively at the RS backend.
The life cycle of MeasureInstance would still be the same as that of
MeasureDesc, managed by CubeDesc.
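
To make the MeasureInstance idea a bit more concrete, here is a minimal sketch.
It is only an illustration under stated assumptions: the class name, its
methods, and the segment names are hypothetical and are not existing Kylin
classes or APIs. It just records which segments already contain a measure, so
that the mapping stage could skip the measure for segments that lack it.

```java
import java.util.LinkedHashSet;
import java.util.Set;

/** Hypothetical runtime metadata for one measure; not an existing Kylin class. */
final class MeasureInstanceSketch {
    private final String measureName;
    // Names of the segments whose storage already holds this measure.
    private final Set<String> segmentsWithMeasure = new LinkedHashSet<>();

    MeasureInstanceSketch(String measureName) {
        this.measureName = measureName;
    }

    /** Record that a build, merge, or refresh wrote this measure into the segment. */
    void markBuilt(String segmentName) {
        segmentsWithMeasure.add(segmentName);
    }

    /** True when the query layer may read this measure from the given segment. */
    boolean isAvailableIn(String segmentName) {
        return segmentsWithMeasure.contains(segmentName);
    }

    String getMeasureName() {
        return measureName;
    }

    public static void main(String[] args) {
        MeasureInstanceSketch m2 = new MeasureInstanceSketch("m2");
        m2.markBuilt("seg_tr2"); // m2 was added before this segment was built
        System.out.println(m2.getMeasureName() + " in seg_tr1: " + m2.isAvailableIn("seg_tr1"));
        System.out.println(m2.getMeasureName() + " in seg_tr2: " + m2.isAvailableIn("seg_tr2"));
    }
}
```

Refreshing an old segment would then simply call markBuilt for it.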


Regarding adding dimensions to the same cube, we also need to consider 
aggregation groups and Rowkey order. I am curious about, and interested in, how
you implemented it.




Best regards


        yuzhang


On 4/22/2019 09:05,ShaoFeng Shi wrote:
Hi Yuzhang,

Glad to see such a discussion; How to support "schema change" in a friendly
way is what we should do in the next phase, as we see this requirement is
stronger than before.

Last week I also tried 1) adding a dimension after the cube was built,
and 2) adding a measure after the cube was built.

For 1) I have an idea, the first try was successful, and I want to
discuss it with the community some day.

2) failed: after a new measure was added, the query failed and
on the HBase RS side there was a byte parsing error. I didn't continue after
that.

Could you elaborate your idea on "the measures of the analysis system can
be decoupled from the materialized view(cube) and have their own management
system"? Have you got a rough design on it? Thank you!

Best regards,

Shaofeng Shi 史少锋
Apache Kylin PMC
Email: shaofeng...@apache.org

Apache Kylin FAQ: https://kylin.apache.org/docs/gettingstarted/faq.html
Join Kylin user mail group: user-subscr...@kylin.apache.org
Join Kylin dev mail group: dev-subscr...@kylin.apache.org




yuzhang wrote on Sunday, April 21, 2019 at 8:08 PM:

Hi JiaTao:
Maybe an optional auto-complete mechanism across the different measure
views is necessary, isn't it?


yuzhang


On 4/20/2019 11:38,JiaTao Tao wrote:
Hi

The idea that supports Kylin adding measures dynamically is impressive.

But in my opinion, once you add a measure, the existing segments should
also calculate the new measure (just add a new measure column). Users can
have many cubes, and a cube can have many segments; if the measure view is
different in each segment, it will increase the burden on the user.

--


Regards!

Aron Tao

yuzhang wrote on Saturday, April 20, 2019 at 1:43 AM:

Hi dear Kylin users and development team:
There are some things I would like to discuss with the community.
As a representative of MOLAP engine, kylin uses pre-aggregation strategies
to provide high-c

Re: [DISCUSSION] Don't need to purge existing segment of cube to add new measures in Kylin

2019-04-21 Thread yuzhang
Hi JiaTao:
Maybe an optional auto-complete mechanism across the different measure
views is necessary, isn't it?


yuzhang 


On 4/20/2019 11:38,JiaTao Tao wrote:
Hi

The idea that supports Kylin adding measures dynamically is impressive.

But in my opinion, once you add a measure, the existing segments should
also calculate the new measure (just add a new measure column). Users can
have many cubes, and a cube can have many segments; if the measure view is
different in each segment, it will increase the burden on the user.

--


Regards!

Aron Tao

yuzhang wrote on Saturday, April 20, 2019 at 1:43 AM:

Hi dear Kylin users and development team:
There are some things I would like to discuss with the community.
As a representative MOLAP engine, Kylin uses pre-aggregation strategies
to provide high-concurrency analysis with second-level response times,
but it also loses some flexibility.
The limitation that existing segments must be purged before an additional
measure can be added causes a lot of duplicated calculation and unnecessary
disk I/O. Such waste should be avoided, especially in a MOLAP engine.
For example, there is a cubeA with one measure m1 and segments over time
range 1 (tr1). Now the user adds one measure m2, but doesn't want to clear
the segments over tr1. The value of m2 will exist in tr2, the segments built
subsequently. Of course, tr1 doesn't contain values of m2, which will be
understood by users who know a little about MOLAP. Querying over tr1 and tr2
is valid for both m1 and m2, but the result of m2 over tr1 will be null.
It would be better to remind the user that the measure is missing. Moreover,
refreshing will supply m2 to the segments over tr1.
Currently, Kylin's storage engine uses HBase. Measures are aggregated
values based on combinations of various dimension members and are stored in a
column of a column family in HBase. For the same cube, adding a new measure
will add a column to the HBase table (mapping) and will take effect in the
next build. For the existing HTables (segments), the new column is allowed
to be missing. Refreshing old segments will add a new column to
their HTables to store the new measure. The value of the new measure is
aggregated according to the combination of dimension members in the rowkey,
without recalculating existing measures.
Now, for additional measures and even additional dimensions, Kylin's
current solution is Hybrid, but we found the following shortcomings during
use:
1. Management costs: similar cubes, most of which share many dimensions and
indicators, must be maintained repeatedly. If you want to perform
optimization operations such as pruning, you need to configure all
of these cubes.
2. A large number of cubes: the initial analysis of the business is not
stable, and analysts often need to add measures. Cubes are added
continuously to the Hybrid group, which produces a lot of cubes.
3. Repeated calculation: if you want to drop the old cube in the Hybrid
group, you need to build the latest cube by computing the historical data to
cover the old cube.
These result in a lot of waste.
In addition, I felt that the metadata about measures was not complete
while using Kylin.
1. Measures are one of the most important concerns of analysts. If the
measures of the analysis system could be decoupled from the materialized
view (cube) and have their own management system, it might be more flexible.
2. Once the dimensions have been chosen in cube design, its cuboids
are determined regardless of the number of measures. It may be confusing to
maintain cubes with different measures but the same cuboids. Only cubes with
different cuboids should be considered different cubes, which is the
definition of a cube, isn't it?
These are just some thoughts about MOLAP from my use of Kylin. What do you
think about this? I sincerely look forward to your reply.
There may be some mistakes or misunderstandings here; please feel free to
correct me or discuss further if you find any.
Best regards
yuzhang





[DISCUSSION] Don't need to purge existing segment of cube to add new measures in Kylin

2019-04-19 Thread yuzhang
Hi dear Kylin users and development team:
There are some things I would like to discuss with the community.
As a representative MOLAP engine, Kylin uses pre-aggregation strategies to 
provide high-concurrency analysis with second-level response times, but it 
also loses some flexibility.
The limitation that existing segments must be purged before an additional 
measure can be added causes a lot of duplicated calculation and unnecessary 
disk I/O. Such waste should be avoided, especially in a MOLAP engine.
For example, there is a cubeA with one measure m1 and segments over time 
range 1 (tr1). Now the user adds one measure m2, but doesn't want to clear the 
segments over tr1. The value of m2 will exist in tr2, the segments built 
subsequently. Of course, tr1 doesn't contain values of m2, which will be 
understood by users who know a little about MOLAP. Querying over tr1 and tr2 
is valid for both m1 and m2, but the result of m2 over tr1 will be null. It 
would be better to remind the user that the measure is missing. Moreover, 
refreshing will supply m2 to the segments over tr1.
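
The cubeA example can be sketched roughly as follows. This is only an
illustration under stated assumptions, not Kylin code: segments are modelled
as plain maps from measure name to an already aggregated value, and the class
and method names are hypothetical. A segment that lacks the measure contributes
nothing to the result and is reported, so the user can be reminded that the
measure is missing there.

```java
import java.util.ArrayList;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

final class MissingMeasureSketch {
    /**
     * Sums one measure over all segments. Segments that do not hold the measure
     * are skipped and their names are appended to `missing`. Returns null when
     * no segment holds the measure at all.
     */
    static Long sum(Map<String, Map<String, Long>> segments, String measure, List<String> missing) {
        Long total = null;
        for (Map.Entry<String, Map<String, Long>> seg : segments.entrySet()) {
            Long value = seg.getValue().get(measure);
            if (value == null) {
                missing.add(seg.getKey());
            } else {
                total = (total == null) ? value : total + value;
            }
        }
        return total;
    }

    public static void main(String[] args) {
        Map<String, Map<String, Long>> segments = new LinkedHashMap<>();
        segments.put("tr1", Map.of("m1", 10L));          // built before m2 existed
        segments.put("tr2", Map.of("m1", 5L, "m2", 7L)); // built after m2 was added

        List<String> missing = new ArrayList<>();
        System.out.println("m2 = " + sum(segments, "m2", missing) + ", missing in " + missing);
    }
}
```

Refreshing tr1 would add an m2 entry to its map, after which nothing is
reported as missing.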
Currently, Kylin's storage engine uses HBase. Measures are aggregated values 
based on combinations of various dimension members and are stored in a column 
of a column family in HBase. For the same cube, adding a new measure will add a 
column to the HBase table (mapping) and will take effect in the next build. For 
the existing HTables (segments), the new column is allowed to be missing. 
Refreshing old segments will add a new column to their HTables to store the 
new measure. The value of the new measure is aggregated according to the 
combination of dimension members in the rowkey, without recalculating existing 
measures.
Now, for additional measures and even additional dimensions, Kylin's current 
solution is Hybrid, but we found the following shortcomings during use:
1. Management costs: similar cubes, most of which share many dimensions and 
indicators, must be maintained repeatedly. If you want to perform 
optimization operations such as pruning, you need to configure all of these 
cubes.
2. A large number of cubes: the initial analysis of the business is not stable, 
and analysts often need to add measures. Cubes are added continuously to the 
Hybrid group, which produces a lot of cubes.
3. Repeated calculation: if you want to drop the old cube in the Hybrid group, 
you need to build the latest cube by computing the historical data to cover 
the old cube.
These result in a lot of waste.
In addition, I felt that the metadata about measures was not complete while 
using Kylin.
1. Measures are one of the most important concerns of analysts. If the measures 
of the analysis system could be decoupled from the materialized view (cube) and 
have their own management system, it might be more flexible.
2. Once the dimensions have been chosen in cube design, its cuboids are 
determined regardless of the number of measures. It may be confusing to 
maintain cubes with different measures but the same cuboids. Only cubes with 
different cuboids should be considered different cubes, which is the definition 
of a cube, isn't it?
These are just some thoughts about MOLAP from my use of Kylin. What do you 
think about this? I sincerely look forward to your reply.
There may be some mistakes or misunderstandings here; please feel free to 
correct me or discuss further if you find any.
Best regards
yuzhang
 



[jira] [Created] (KYLIN-3956) Segments of not only streaming cube but also batch cube need to show their status

2019-04-14 Thread Yuzhang QIU (JIRA)
Yuzhang QIU created KYLIN-3956:
--

 Summary: Segments of not only streaming cube but also batch cube 
need to show their status
 Key: KYLIN-3956
 URL: https://issues.apache.org/jira/browse/KYLIN-3956
 Project: Kylin
  Issue Type: Improvement
  Components: Web 
Affects Versions: v2.6.1
Reporter: Yuzhang QIU


Hi team:
   In the file 'cube_detail.html' (around line 112), only segments of streaming 
cubes show their segment status. When an old segment of a batch cube is 
refreshed, there are two segments with the same time range, which may confuse 
users. So showing their status may be necessary.



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


Re: Re: Should I remove this check about compare the last and fetched

2019-04-10 Thread yuzhang
We use UTF-8, but the content includes some emoji such as ⚔


On April 8, 2019 at 19:06, Na Zhai wrote:
Hi, yuzhang.

What’s the encoding of the column that you query? The original intention of the 
code that you mentioned is to find out the columns with inconsistent sequence 
before and after encoding.
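
One way such an ordering mismatch can arise is sketched below. This is only an
illustration, not Kylin code, and whether it is the actual cause of the "Not
sorted!" exception is an assumption. Java's String.compareTo compares UTF-16
code units, while a byte-wise comparison of UTF-8 follows code point order, and
the two disagree for characters outside the Basic Multilingual Plane. The
sample characters (U+FF01 and U+1F600) and the helper compareUtf8 are chosen
for this sketch only.

```java
import java.nio.charset.StandardCharsets;

final class EncodingOrderSketch {
    /** Compares two strings by their UTF-8 bytes, treating each byte as unsigned. */
    static int compareUtf8(String a, String b) {
        byte[] x = a.getBytes(StandardCharsets.UTF_8);
        byte[] y = b.getBytes(StandardCharsets.UTF_8);
        int n = Math.min(x.length, y.length);
        for (int i = 0; i < n; i++) {
            int diff = (x[i] & 0xFF) - (y[i] & 0xFF);
            if (diff != 0) {
                return diff;
            }
        }
        return x.length - y.length;
    }

    public static void main(String[] args) {
        String fullwidth = "\uFF01";   // U+FF01, a single UTF-16 code unit
        String emoji = "\uD83D\uDE00"; // U+1F600, a surrogate pair in UTF-16

        // UTF-16 order: 0xFF01 > 0xD83D, so fullwidth sorts after the emoji.
        System.out.println(Integer.signum(fullwidth.compareTo(emoji)));
        // UTF-8 order: 0xEF < 0xF0, so fullwidth sorts before the emoji.
        System.out.println(Integer.signum(compareUtf8(fullwidth, emoji)));
    }
}
```

If rows are sorted by one ordering when stored and checked with the other when
merged, a check like the one quoted in this thread would report them as unsorted.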



From: yuzhang 
Sent: Tuesday, March 26, 2019 11:27:14 PM
To: dev@kylin.apache.org
Cc: dev@kylin.apache.org
Subject: Re: Should I remove this check about compare the last and fetched

Sure, here is the code at SortedIteratorMergerWithLimit.java:130. The "Not 
sorted! last: XX fetched: XXX" exception may happen when querying a 
table that contains Chinese values (or garbled text).
```

@Override
public E next() {
    if (!nextFetched) {
        throw new IllegalStateException("Should hasNext() before next()");
    }

    //TODO: remove this check when validated
    if (last != null) {
        if (comparator.compare(last, fetched) > 0)
            throw new IllegalStateException("Not sorted! last: " + last + " fetched: " + fetched);
    }

    last = fetched;
    nextFetched = false;

    return fetched;
}

```
On 3/26/2019 23:08, elkan1788 wrote:
I'm not sure I understand your question clearly. Can you give more
information about it, for example a good sample? You can also forward
the code you found and what you think happened.

--
Sent from: http://apache-kylin.74782.x6.nabble.com/


[jira] [Created] (KYLIN-3920) Don't merge same dictionaries when merge dictionary

2019-03-28 Thread Yuzhang QIU (JIRA)
Yuzhang QIU created KYLIN-3920:
--

 Summary: Don't merge same dictionaries when merge dictionary
 Key: KYLIN-3920
 URL: https://issues.apache.org/jira/browse/KYLIN-3920
 Project: Kylin
  Issue Type: Improvement
  Components: Others
Affects Versions: v2.5.2
Reporter: Yuzhang QIU


Hi team:
   I found that DictionaryManager passes the dictionaries to DictionaryGenerator 
to merge them whenever any of them differs from the others. But if there are 3 
dictionaries {Dic1, Dic1, Dic2} in 3 segments, Kylin may not need to merge 
Dic1 with Dic1, so that the same values are not added to the new dictionary 
twice.
  If I misunderstand the merge job logic, please feel free to correct me!
  Here is the code snapshot at DictionaryManager.java:251

```
boolean identicalSourceDicts = true;
for (int i = 1; i < dicts.size(); ++i) {
    if (!dicts.get(0).getDictionaryObject().equals(dicts.get(i).getDictionaryObject())) {
        identicalSourceDicts = false;
        break;
    }
}

if (identicalSourceDicts) {
    logger.info("Use one of the merging dictionaries directly");
    return dicts.get(0);
} else {
    Dictionary newDict = DictionaryGenerator.mergeDictionaries(
            DataType.getType(newDictInfo.getDataType()), dicts);
    return trySaveNewDict(newDict, newDictInfo);
}
```
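
A minimal sketch of the proposed change, under stated assumptions: it is not a
patch against DictionaryManager, the helper name distinct is hypothetical, and
it assumes the dictionary objects implement equals and hashCode consistently
(the snippet above only shows equals being used). Strings stand in for
dictionaries. The idea is to drop repeated dictionaries before handing the list
to the merge step.

```java
import java.util.ArrayList;
import java.util.LinkedHashSet;
import java.util.List;

final class DistinctDictsSketch {
    /** Returns the input without repeated elements, keeping first-seen order. */
    static <T> List<T> distinct(List<T> dicts) {
        return new ArrayList<>(new LinkedHashSet<>(dicts));
    }

    public static void main(String[] args) {
        // Strings stand in for the dictionaries of three segments.
        List<String> dicts = List.of("Dic1", "Dic1", "Dic2");
        List<String> toMerge = distinct(dicts);
        System.out.println(toMerge); // only Dic1 and Dic2 remain to be merged
    }
}
```

If only one dictionary remains after this step, it could be reused directly,
as the existing identicalSourceDicts branch already does.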

 



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


Re: Should I remove this check about compare the last and fetched

2019-03-26 Thread yuzhang
Sure, here is the code at SortedIteratorMergerWithLimit.java:130. The "Not 
sorted! last: XX fetched: XXX" exception may happen when querying a 
table that contains Chinese values (or garbled text).
```
@Override
public E next() {
    if (!nextFetched) {
        throw new IllegalStateException("Should hasNext() before next()");
    }

    //TODO: remove this check when validated
    if (last != null) {
        if (comparator.compare(last, fetched) > 0)
            throw new IllegalStateException("Not sorted! last: " + last + " fetched: " + fetched);
    }

    last = fetched;
    nextFetched = false;

    return fetched;
}
```
On 3/26/2019 23:08,elkan1788 wrote:
I'm not sure I understand your question clearly. Can you give more
information about it, for example a good sample? You can also forward
the code you found and what you think happened.

--
Sent from: http://apache-kylin.74782.x6.nabble.com/


Should I remove this check about compare the last and fetched

2019-03-26 Thread yuzhang
Hi team:
   When we use Kylin, some queries over tables that contain Chinese values 
throw a "Not sorted! last: XX fetched: XXX" exception. 
Then I found that there is a check comparing the last ITuple with the fetched 
ITuple at SortedIteratorMergerWithLimit:130. But there is also a comment that 
says "TODO: remove this check when validated".
So, what was the original purpose of this check? Should I remove it?
I would appreciate it if some developers could explain the logic behind this 
code.


Best regards


        yuzhang



[jira] [Created] (KYLIN-3907) Sort the cube list by create time in descending order.

2019-03-25 Thread Yuzhang QIU (JIRA)
Yuzhang QIU created KYLIN-3907:
--

 Summary: Sort the cube list by create time in descending order.
 Key: KYLIN-3907
 URL: https://issues.apache.org/jira/browse/KYLIN-3907
 Project: Kylin
  Issue Type: Improvement
  Components: REST Service
Affects Versions: v2.5.2
Reporter: Yuzhang QIU


Hi team:
Maybe there is a usability problem in the cube list of the Web UI. We 
create many cubes over time and need to click "MORE" to show the latest cube 
once the number of cubes grows beyond 15.
   In most cases, I think, the older cubes should be stable and the new cubes 
may need to be debugged. So sorting the cube list by create time in descending 
order may be better.
What do you think about this?
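
A minimal sketch of the ordering suggested above. It is an illustration only:
CubeEntry and its fields are hypothetical stand-ins rather than Kylin's REST
model, and the sort could equally be done on the server or in the web page.

```java
import java.util.ArrayList;
import java.util.Comparator;
import java.util.List;

final class CubeListSortSketch {
    /** Hypothetical stand-in for a cube list entry; field names are not taken from Kylin. */
    static final class CubeEntry {
        final String name;
        final long createTimeMillis;

        CubeEntry(String name, long createTimeMillis) {
            this.name = name;
            this.createTimeMillis = createTimeMillis;
        }
    }

    /** Returns a copy of the list ordered by create time, newest first. */
    static List<CubeEntry> newestFirst(List<CubeEntry> cubes) {
        List<CubeEntry> sorted = new ArrayList<>(cubes);
        sorted.sort(Comparator.comparingLong((CubeEntry c) -> c.createTimeMillis).reversed());
        return sorted;
    }

    public static void main(String[] args) {
        List<CubeEntry> cubes = List.of(
                new CubeEntry("old_cube", 1_000L),
                new CubeEntry("new_cube", 3_000L),
                new CubeEntry("mid_cube", 2_000L));
        for (CubeEntry c : newestFirst(cubes)) {
            System.out.println(c.name);
        }
    }
}
```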

Best regards
yuzhang



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


Re: spark task error occurs when run IT in sanbox

2019-03-25 Thread yuzhang
Hi elkan:
 Thank you for taking the time to reply.
 Just as you said, the cause was a mismatched JDK version. I had only set 
root's JAVA_HOME to point to jdk1.8, but every server in the sandbox runs under 
its own user. So I needed to re-link the original JAVA_HOME to the new one.


Best regards 
yuzhang


On 3/25/2019 13:32,elkan1788 wrote:
It seems your Java runtime environment was not clean. Please check
the JAVA_HOME and PATH environment variables; use the echo command to see
what they contain.

By the way, Kylin can also run on Hadoop clusters that use JDK 1.7, with just
a simple modification. The steps are:

1. Modify the HBase conf file hbase-env.sh and add: export
JAVA_HOME=/path/of/jdk1.8

2. Append the configuration below to the kylin_job_conf.xml and
kylin_job_conf_inmem.xml files.


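<property>
    <name>mapred.child.env</name>
    <value>JAVA_HOME=/usr/lib/java/jdk1.8.0_201</value>
</property>

<property>
    <name>yarn.app.mapreduce.am.env</name>
    <value>JAVA_HOME=/usr/lib/java/jdk1.8.0_201</value>
</property>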


Hope this helps!

--
Sent from: http://apache-kylin.74782.x6.nabble.com/


Re: Re: [Discussion]Does 'UNION ALL' support query on two fact table ?

2019-03-21 Thread yuzhang


Hi Na Zhai:
 Thanks for taking the time to reply. Yes, I tested it and the query can hit 
both cubes.


yuzhang
On March 21, 2019 at 22:44, Na Zhai wrote:
Hi, yuzhang.



There is one fact table in Cube A and one fact table in Cube B. I think “union 
all” supports query on these two fact tables.









From: yuzhang 
Sent: Tuesday, March 19, 2019 8:29:26 PM
To: dev@kylin.apache.org; u...@kylin.apache.org
Subject: [Discussion]Does 'UNION ALL' support query on two fact table ?

Hi all:
A simple question, as described in the mail title.


Best regards
yuzhang




Re: question related to the aggregation groups configuration

2019-03-20 Thread yuzhang
Hi Kang-sen:
 I read your email carefully and think I can share some information with you.
1. You can use the cube planner to view the generated cuboids and their 
dimension combinations. Here is the doc: 
http://kylin.apache.org/docs/tutorial/use_cube_planner.html
2. The number of all combinations of D1-to-D10 is 2^10, not factorial(10), I 
think, according to the blog I sent you before. Did I misunderstand you?
3. I think we can apply those three rules independently, because I found the 
relevant code in AggregationGroup.java. If mandatory, hierarchy, or joint is 
not defined, the code for that rule just returns, which doesn't influence the 
other defined rules. I tested it just now, and it works.
4. According to DefaultCuboidScheduler.java:340, if we set 
'kylin.cube.aggrgroup.is-mandatory-only-valid', Kylin will generate cuboids 
that contain only the mandatory dims. But Kylin's configuration document says 
(kylin.cube.aggrgroup.is-mandatory-only-valid: whether to allow Cube 
contains only Base Cuboid. The default value is FALSE, set to TRUE when using 
Spark Cubing), so the doc seems misleading.
Please kindly correct me if something is wrong.
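
To illustrate point 2, here is a small worked calculation. The class and method
names are made up for this sketch. Each dimension is either present in or
absent from a combination, which gives 2^n subsets (including the empty one),
whereas factorial(n) counts orderings of the dimensions, which is a different
quantity.

```java
import java.math.BigInteger;

final class CuboidCountSketch {
    /** Number of subsets of n dimensions, including the empty subset: 2^n. */
    static BigInteger subsets(int n) {
        return BigInteger.ONE.shiftLeft(n);
    }

    /** n!, the number of orderings of n dimensions. */
    static BigInteger factorial(int n) {
        BigInteger result = BigInteger.ONE;
        for (int i = 2; i <= n; i++) {
            result = result.multiply(BigInteger.valueOf(i));
        }
        return result;
    }

    public static void main(String[] args) {
        System.out.println("2^10 = " + subsets(10));   // 1024
        System.out.println("10!  = " + factorial(10)); // 3628800
    }
}
```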


Best regards 
yuzhang
On 3/20/2019 23:31,Lu, Kang-Sen wrote:

Hi, Yuzhang:

 

I found out the reason why if includes = {D1, … , D10}, and mandatory = {D1, …, 
D9}, then we only get one cuboid as {D1, … D10}, and {D1, …, D9} is NOT 
generated.

 

It is caused by this code in 
“core-common/src/main/java/org/apache/kylin/common/KylinConfigBase.java”:

 

public boolean getCubeAggrGroupIsMandatoryOnlyValid() {
    return Boolean.parseBoolean(getOptional("kylin.cube.aggrgroup.is-mandatory-only-valid", "false"));
}

 

In kylin.properties, we did not config this parameter 
“kylin.cube.aggrgroup.is-mandatory-only-valid" as “true”, and by default, it is 
set to “false”. So {D1, …, D9} is so-called “mandatory-only”, and treated as 
not valid.
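
The behaviour described here can be reproduced with a simplified sketch. It is
not Kylin's DefaultCuboidScheduler: it only models the mandatory rule and the
flag, it ignores hierarchy and joint rules, and the class and method names are
hypothetical. Combinations are bitmasks in which bit i stands for D(i+1).

```java
import java.util.ArrayList;
import java.util.List;

final class MandatoryPruningSketch {
    /**
     * Lists the dimension combinations that contain every mandatory dimension.
     * The combination made of the mandatory dimensions alone is dropped unless
     * mandatoryOnlyValid is true or it is the base cuboid (all dimensions).
     */
    static List<Integer> cuboids(int dimCount, int mandatoryMask, boolean mandatoryOnlyValid) {
        int base = (1 << dimCount) - 1;
        List<Integer> result = new ArrayList<>();
        for (int mask = 1; mask <= base; mask++) {
            if ((mask & mandatoryMask) != mandatoryMask) {
                continue; // a mandatory dimension is missing
            }
            if (mask == mandatoryMask && mask != base && !mandatoryOnlyValid) {
                continue; // "mandatory-only" cuboid, treated as not valid
            }
            result.add(mask);
        }
        return result;
    }

    public static void main(String[] args) {
        int d1ToD9 = (1 << 9) - 1;
        int d1ToD8 = (1 << 8) - 1;
        System.out.println(cuboids(10, d1ToD9, false).size()); // 1: only {D1..D10}
        System.out.println(cuboids(10, d1ToD9, true).size());  // 2: {D1..D9} and {D1..D10}
        System.out.println(cuboids(10, d1ToD8, false).size()); // 3: {D1..D8} is dropped
    }
}
```

With the flag left at its default, the counts agree with the one and three
cuboids reported for these two configurations elsewhere in this thread.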

 

Kang-sen

 

 

From: Lu, Kang-Sen 
Sent: Wednesday, March 20, 2019 8:32 AM
To:u...@kylin.apache.org; dev@kylin.apache.org
Subject: RE: question related to the aggregation groups configuration

 

NOTICE: This email was received from an EXTERNAL sender

 

Hi, Yuzhang:

 

Thank you for taking the time to respond. I did read this requirement for 
“mandatory dimension”: ("if a dimension is specified as “mandatory”, then all 
of the combinations without such dimension can be pruned"). That is the key 
point information.

BTW: I am curious if there is an easy way to find out how many cuboids are 
generated by kylin and every cuboid’s dimension set from kylin’s metadata. Your 
finding is what I suspected. But I am not able to verify it as you did.

 

We can live with this fact as is, if it is documented. But it would be better 
to fix the bug and allow the original description of mandatory stand as correct.

 

About Q2, I read the link you mentioned, it seems if hierarchy and joint are 
both specified, then the joint is being treated as tag-alone restriction, say 
D2 is in hierarchy and became “mandatory” in cuboids, if joint says D2 and D3 
must be together, then D2 will pull D3 into the “mandatory” list. That is 
elegant.

 

I am wondering why these three selection-rules can NOT be applied 
independently. If we have D1-to-D10 in the includes set. Then the number of all 
combination of D1-to-D10 is factorial(10). Now, we can apply “mandatory” to 
“prune” some of the combination out. After that, we may further prune by 
applying the hierarchy and joint rules. Isn’t it possible?

 

Thanks.

 

Kang-sen

 

From: yuzhang 
Sent: Wednesday, March 20, 2019 1:00 AM
To:u...@kylin.apache.org; dev@kylin.apache.org
Subject: Re: question related to the aggregation groups configuration

 

NOTICE: This email was received from an EXTERNAL sender

 

Hi Kang-sen:

I ran some tests on Q1. {D1 to D10} were included in an aggregation group and 
{D1 to D9} were added as mandatory dimensions. Kylin then generated only Cuboid 
{D1 to D10} (the base cuboid), whereas I expected {D1 to D10} and {D1 to D9}. 
When I set {D1 to D8} as the mandatory dimensions, Kylin generated Cuboids 
{D1 to D10}, {D1 to D8, D9} and {D1 to D8, D10}, whereas I expected 
{D1 to D10}, {D1 to D8, D9}, {D1 to D8, D10} and {D1 to D8}. So for your Q1, I 
think the answer is that ONLY ONE cuboid, {D1 to D10}, is generated. But 
according to the blog ("if a dimension is specified as “mandatory”, then all of 
the combinations without such dimension can be pruned"), the Cuboid {D1 to D9} 
shouldn't have been pruned. Maybe someone else can give more detail.

Q2 is similar to this email 
https://lists.apache.org/thread.html/3ccc8d7f98748d7c590c01c7da6ce666a16c4fe2b34be070940cae8f@%3Cuser.kylin.apache.org%3E
 and JIRA https://issues.apache.org/jira/browse/KYLIN-2149 . Currently Kylin 
prevents configuring overlapping hierarchy, mandatory and joint dimensions. 
Although the ideas behind the three aggregat

Re: question related to the aggregation groups configuration

2019-03-19 Thread yuzhang
Hi Kang-sen:
I ran some tests on Q1. {D1 to D10} were included in an aggregation group and 
{D1 to D9} were added as mandatory dimensions. Kylin then generated only Cuboid 
{D1 to D10} (the base cuboid), whereas I expected {D1 to D10} and {D1 to D9}. 
When I set {D1 to D8} as the mandatory dimensions, Kylin generated Cuboids 
{D1 to D10}, {D1 to D8, D9} and {D1 to D8, D10}, whereas I expected 
{D1 to D10}, {D1 to D8, D9}, {D1 to D8, D10} and {D1 to D8}. So for your Q1, I 
think the answer is that ONLY ONE cuboid, {D1 to D10}, is generated. But 
according to the blog ("if a dimension is specified as “mandatory”, then all of 
the combinations without such dimension can be pruned"), the Cuboid {D1 to D9} 
shouldn't have been pruned. Maybe someone else can give more detail.
Q2 is similar to this email 
https://lists.apache.org/thread.html/3ccc8d7f98748d7c590c01c7da6ce666a16c4fe2b34be070940cae8f@%3Cuser.kylin.apache.org%3E
 and JIRA https://issues.apache.org/jira/browse/KYLIN-2149 . Currently Kylin 
prevents configuring overlapping hierarchy, mandatory and joint dimensions. 
Although the ideas behind the three aggregation rules are different and even 
contradictory, automatically merging those rules into cuboids is feasible. For 
now, the restrictions on aggregation groups mean that your requirement, which I 
think is common, can't be met. Maybe JIRA KYLIN-2149 will be resolved in the 
future.


   Best regards
        yuzhang




On 3/19/2019 23:09,Lu, Kang-Sen wrote:

Hi, Yuzhang:

 

I would appreciate if you can provide answer to my 2 questions.

 

Thanks.

 

Kang-sen

 

From: Lu, Kang-Sen 
Sent: Friday, March 15, 2019 8:15 AM
To:u...@kylin.apache.org
Subject: RE: question related to the aggregation groups configuration

 

NOTICE: This email was received from an EXTERNAL sender

 

Hi, Yuzhang:

 

Thanks for taking time to reply.

 

I actually have read that article several times earlier before.

 

However, maybe I missed some details; I am not clear about how those rules 
actually work and how they interact with each other.

 

From the article you pointed out, the hierarchy rule does have an example, so 
it is less likely to be confused.

 

I did not find any discussion about the “mandatory rule”. It is supposed to be 
very simple, but I am stuck by the details. Let’s say, “includes” is a set of 
dim: { d1, d2, … d10}, and the “mandatory” is a set of dim: {D1, …, D9}.

So it is obvious that each cuboid generated from this agg group should all 
include set of dim {D1, …, D9}.

Now, D10 could be either selected or not. So the natural guess is that this agg 
group will generate two cuboids, i.e {D1,…,D9} and {D1,… D10}. Is this what 
kylin will do?

 

Another detail I am not clear is the interaction of “joint rule” and the 
“mandatory rule”. It seems that there is an interaction between these two 
rules. I am not clear why, and it is not discussed in the article you mentioned.

 

That was my two original questions.

 

Thanks again.

 

Kang-sen

 

From: yuzhang 
Sent: Friday, March 15, 2019 7:46 AM
To:u...@kylin.apache.org
Subject: Re: question related to the aggregation groups configuration

 

NOTICE: This email was received from an EXTERNAL sender

 

Hi kang-sen:

  Here is a blog post about the idea behind aggregation groups. I hope it helps you.

https://kylin.apache.org/blog/2016/02/18/new-aggregation-group/

 

Best regards

 yuzhang

 


On 3/14/2019 21:21,Lu, Kang-Sen wrote:

I am running kylin 2.5.1

 

I have two questions related to the aggregation group configuration. In the 
kylin GUI, select “Model”, then try to edit a cube design, under “Grid”, select 
“Advanced Setting”, we can enter multiple “Aggregation Groups”. Each 
“Aggregation Group” can specify zero, one, or many cuboids, with the 
combination of dimensions.

 

Q1: If I want one and only one cuboid to be created with dimensions set = {D1, 
D2, … , D10}, then is it correct to enter D1-to-D10 in the “includes” list, and 
“D1-to-D9 in the “Mandatory Dimensions” list? The key question is “will kylin 
generate two cuboids, i.e. {D1, …, D9} and {D1, … , D10} or just one cuboid”?

 

Q2: If I entered D1-to-D10 into the “includes” list, and entered {D1, D2} in 
the “Joint Dimensions” list, then I can’t enter either D1 or D2 into the 
“Mandatory Dimensions” list? I was thinking if I entered {D1, D3, … , D9} in 
the “Mandatory Dimensions”, and with {D1, D2} in the “Joint Dimensions”, then 
there should only one cuboid generated for {D1, D2, …, D10}. Why is it not 
allowed?

 

Maybe the doc describes this. But it is not clear to me exactly how Kylin 
processes the info entered in the “includes”, “Mandatory Dimensions”, and 
“Joint Dimensions”. Can someone either point me to some document or answer the 
questions I mentioned above?

[Discussion]Does 'UNION ALL' support query on two fact table ?

2019-03-19 Thread yuzhang
Hi all:
A simple question, as described in the mail title.


Best regards
yuzhang



[jira] [Created] (KYLIN-3890) Add doc about usage of ./bin/metadata.sh

2019-03-18 Thread Yuzhang QIU (JIRA)
Yuzhang QIU created KYLIN-3890:
--

 Summary: Add doc about usage of ./bin/metadata.sh
 Key: KYLIN-3890
 URL: https://issues.apache.org/jira/browse/KYLIN-3890
 Project: Kylin
  Issue Type: Improvement
  Components: Documentation
Affects Versions: v2.5.2
Reporter: Yuzhang QIU


The JIRA title describes the request.



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Created] (KYLIN-3875) During cube or model design, from 1th-step jump to 4th-step doesn't check validity of step between 1th-step and 4th-step, which click next button does.

2019-03-15 Thread Yuzhang QIU (JIRA)
Yuzhang QIU created KYLIN-3875:
--

 Summary: During cube or model design, from 1th-step jump to 
4th-step doesn't check validity of step between 1th-step and 4th-step, which 
click next button does.
 Key: KYLIN-3875
 URL: https://issues.apache.org/jira/browse/KYLIN-3875
 Project: Kylin
  Issue Type: Bug
  Components: Web 
Affects Versions: v2.5.2
Reporter: Yuzhang QIU


Hi team,
I found a minor problem in the web app.
While designing a model, I cleared all dimensions and clicked Next. An alert
window then showed a warning about the empty dimension list. But when I went
back to the previous step and clicked the "Measure" step directly, the check
was skipped and the model could be saved successfully.
The same problem may occur when designing a cube.
What do you think?

Best regards
yuzhang
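
The behavior the report seems to expect can be pictured with the short sketch below. It is written in Python for brevity, not in the web app's own JavaScript, and everything in it is hypothetical: the step names, the `validators` list, and the `first_invalid_step` helper are illustrative assumptions, not taken from the Kylin code base. The idea is only that a jump from one step to a later one runs the same checks that clicking Next would have run along the way.

```python
# Hypothetical wizard steps; the names are assumptions for illustration only.
STEPS = ["Model Info", "Data Model", "Dimensions", "Measures", "Settings"]


def first_invalid_step(validators, current, target):
    """Return the index of the first step in [current, target) that fails
    its check, or None if every step on the way is valid."""
    for step in range(current, target):
        if not validators[step]():
            return step
    return None


def make_validators(model):
    # One zero-argument check per step; only "Dimensions" has a real rule here.
    return [
        lambda: True,                       # Model Info
        lambda: True,                       # Data Model
        lambda: bool(model["dimensions"]),  # Dimensions must not be empty
        lambda: True,                       # Measures
        lambda: True,                       # Settings
    ]


empty_model = {"dimensions": []}
blocked_at = first_invalid_step(make_validators(empty_model), 1, 3)
print(None if blocked_at is None else STEPS[blocked_at])
```

With an empty dimension list, jumping from "Data Model" to "Measures" is stopped at "Dimensions", which is the same outcome as clicking Next step by step.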






Re: [jira] [Commented] (KYLIN-3830) return wrong result when 'SELECT SUM(dim1)' without set a relative metric of dim1.

2019-03-11 Thread yuzhang
Sure, I will try my best.


On 3/11/2019 14:35,Shaofeng SHI (JIRA) wrote:

[ 
https://issues.apache.org/jira/browse/KYLIN-3830?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16789202#comment-16789202
 ]

Shaofeng SHI commented on KYLIN-3830:
-

Yuzhang, if you have found the fix, you are welcome to raise a PR. Thank you!

return wrong result when 'SELECT SUM(dim1)' without set a relative metric of 
dim1.
--

Key: KYLIN-3830
URL: https://issues.apache.org/jira/browse/KYLIN-3830
Project: Kylin
Issue Type: Bug
Affects Versions: v2.5.2
Reporter: Yuzhang QIU
Priority: Major

Hi team,
I designed cube1 on table table1 with dimensions dim1, dim2, dim3 and a single
measure, count(1), then ran 'SELECT SUM(dim1) FROM table1 group by dim2'. Kylin
processed this SQL and returned a result (result1). That seems fine, but as we
know, Kylin doesn't store the detail data; the dimension values are encoded and
stored in HBase as the rowkey (because I didn't define any measure on a
column). So is result1 correct?
Then I cloned cube1 to cube2 and added a measure SUM(dim1). I ran the same SQL
against Kylin and got result2, which differs from result1 in the aggregated
column. I also ran the same SQL in Hive and got result3, which is the same as
result2.
Yes, I turned off query pushdown.
I think something is wrong here.
I can't upload screenshots of the results because of our confidentiality
policy; sorry about that.





Re: Kylin out-of-memory problem

2019-03-08 Thread yuzhang
Hi wanmin,
I am interested in this problem and am looking into it.
When a Kylin query instance goes down, are the "job" and "all" instances still
running normally? Are most of the concurrent requests query requests? Did you
ever redeploy or shut down the Kylin web app by hand before the exception
occurred? Has any extra configuration been set in Tomcat?
As Shaofeng said, the log information is limited. More logs, or a way to
reproduce the error, would be helpful.



Best regards

yuzhang


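On March 7, 2019 at 11:08, wanmin_ws wrote:
Hello, could you please help answer this?
--
From: wanmin_ws
Sent: Wednesday, March 6, 2019, 10:55
To: wanmin_ws; dev
Cc: dev
Subject: Re: Kylin out-of-memory problem

Kylin 2.5.0 on the HDP big data platform, with 5 Kylin nodes in total: one
"all" node, one "job" node, and three "query" nodes.
The nodes that go down are all query nodes, but this error is reported on all
5 nodes.
--
From: yuzhang
Sent: Tuesday, March 5, 2019, 17:37
To: wanmin_ws
Cc: dev
Subject: Re: Kylin out-of-memory problem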

Hi, could you describe your deployment environment, your Kylin version, and the
number of concurrent requests?



在2019年03月05日 17:27,wanmin_ws 写道:
只有查询高峰期的时候会出现这个问题,这个问题和slow query 
有没有关系?这个错是kylin.out报出来的,显示是tomcat不能开启更多的session
--
发件人:ShaoFeng Shi 
发送时间:2019年3月5日(星期二) 17:23
收件人:dev ; wanmin_ws 
抄 送:user 
主 题:Re: kylin内存溢出问题

Hi Min,

The log information is too limited for us to know what may have caused this.
I highly recommend analyzing the problem from the following angles:
1) check the log files in the "logs/" and "tomcat/logs" folders;
2) use jmap and jhat to analyze the heap usage;
3) use jstack to analyze the thread information;
4) check your cube definition to see whether there is a UHC dimension
and whether dictionary encoding is used for it.

Best regards,

Shaofeng Shi 史少锋
Apache Kylin PMC
Work email: shaofeng@kyligence.io
Kyligence Inc: https://kyligence.io/

Apache Kylin FAQ: https://kylin.apache.org/docs/gettingstarted/faq.html
Join Kylin user mail group: user-subscr...@kylin.apache.org
Join Kylin dev mail group: dev-subscr...@kylin.apache.org




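On Tue, Mar 5, 2019 at 5:03 PM, wanmin_ws wrote:

Error description: during peak access periods, Kylin goes down. The log is shown below, but I don't know what to do about it. Could you please take a look? This problem has been bothering us for a long time.
Log: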
严重: The web application [/kylin] created a ThreadLocal with key of type
[java.lang.ThreadLocal] (value [java.lang.ThreadLocal@6cc63032]) and a
value of type [org.apache.kylin.rest.msg.Message] (value
[org.apache.kylin.rest.msg.Message@3ad18e2f]) but failed to remove it
when the web application was stopped. Threads are going to be renewed over
time to try and avoid a probable memory leak.
三月 01, 2019 9:50:52 上午 org.apache.catalina.loader.WebappClassLoaderBase
checkThreadLocalMapForLeaks
严重: The web application [/kylin] created a ThreadLocal with key of type
[java.lang.ThreadLocal] (value [java.lang.ThreadLocal@76397470]) and a
value of type [org.apache.kylin.common.util.ImplementationSwitch] (value
[org.apache.kylin.common.util.ImplementationSwitch@26f33af5]) but failed
to remove it when the web application was stopped. Threads are going to be
renewed over time to try and avoid a probable memory leak.
三月 01, 2019 9:50:52 上午 org.apache.coyote.AbstractProtocol stop
信息: Stopping ProtocolHandler ["http-bio-7070"]
三月 01, 2019 9:50:52 上午 org.apache.coyote.AbstractProtocol stop
信息: Stopping ProtocolHandler ["ajp-bio-9009"]





Re: [Discuss] Won't ship Spark binary in Kylin binary anymore

2019-03-07 Thread yuzhang
Agreed! Downloading the Spark binary while packaging Kylin used to confuse me.


On 3/8/2019 10:42,ShaoFeng Shi wrote:
Hello,


As we know, Kylin ships a Spark distribution in its binary package. The package
keeps growing with each version; the latest version (v2.6.1) is bigger than
350MB and was rejected by the Apache SVN server when we tried to upload the new
package. Of those 350MB, more than 200MB is Spark, and Spark is not mandatory
for Kylin.


So I propose excluding Spark from Kylin's binary package, starting from the
current v2.6.1; the user just needs to point SPARK_HOME to a folder containing
the expected Spark version, or manually download Spark and put it in
KYLIN_HOME/spark. All other behaviors are not affected.


Please share your comments, if any.


Best regards,


Shaofeng Shi 史少锋
Apache Kylin PMC
Email: shaofeng...@apache.org


Apache Kylin FAQ: https://kylin.apache.org/docs/gettingstarted/faq.html
Join Kylin user mail group: user-subscr...@kylin.apache.org
Join Kylin dev mail group: dev-subscr...@kylin.apache.org






[jira] [Created] (KYLIN-3860) Add doc about configuration of kylin.web.hide-measures

2019-03-07 Thread Yuzhang QIU (JIRA)
Yuzhang QIU created KYLIN-3860:
--

 Summary: Add doc about configuration of kylin.web.hide-measures
 Key: KYLIN-3860
 URL: https://issues.apache.org/jira/browse/KYLIN-3860
 Project: Kylin
  Issue Type: Improvement
  Components: Documentation
Affects Versions: v2.5.2
Reporter: Yuzhang QIU


Well, another JIRA about documentation.
*kylin.web.hide-measures* can be used to hide some measures (such as TOP_N,
Percentile) in some business scenarios. Could the configuration document add
some instructions about this config, even though it's easy to understand and
use?





What's the purpose of 'isDraft' in CubeDesc

2019-03-05 Thread yuzhang
Hi team,
The frontend doesn't pass the 'is_draft' field to the Kylin server when
updating or saving a CubeDesc, and doesn't use its value for any Web UI change.
The backend then calls setDraft(false) before updating or saving a CubeDesc,
and reads isDraft (which must be false) in CubeDescManager before persisting to
the metastore. I don't understand the point of this logic. What's the purpose
of 'isDraft' in CubeDesc?
I sincerely look forward to your reply.

   


Best regards,
yuzhang

Re: Kylin out-of-memory problem

2019-03-05 Thread yuzhang
Hi, could you describe your deployment environment, your Kylin version, and the
number of concurrent requests?





[jira] [Created] (KYLIN-3844) some instruction about config 'kylin.metadata.hbasemapping-adapter'

2019-03-05 Thread Yuzhang QIU (JIRA)
Yuzhang QIU created KYLIN-3844:
--

 Summary: some instruction about config 
'kylin.metadata.hbasemapping-adapter'
 Key: KYLIN-3844
 URL: https://issues.apache.org/jira/browse/KYLIN-3844
 Project: Kylin
  Issue Type: Improvement
  Components: Documentation
Affects Versions: v2.5.2
Reporter: Yuzhang QIU


Hi team,
When someone wants to define a custom HBase column family mapping, they may
need to know how to configure 'kylin.metadata.hbasemapping-adapter'.
Although tracing the code in CubeDesc (around line 678) shows how this
configuration is used, official instructions in the documentation would be
better.
Just a small suggestion. :)


Best regards,
yuzhang





[jira] [Created] (KYLIN-3842) [Defective kylinProperties.js]Unable to get the public configuration of the first line in the front end

2019-03-04 Thread Yuzhang QIU (JIRA)
Yuzhang QIU created KYLIN-3842:
--

 Summary: [Defective kylinProperties.js]Unable to get the public 
configuration of the first line in the front end
 Key: KYLIN-3842
 URL: https://issues.apache.org/jira/browse/KYLIN-3842
 Project: Kylin
  Issue Type: Bug
  Components: Web 
Affects Versions: v2.5.2
Reporter: Yuzhang QIU


Hi team,
  I'm developing an OLAP platform based on Kylin 2.5.2. During this work I
found that kylinProperties.js:37 (getProperty(name)) can't get the property on
the first line of '_config', which is initialized through /admin/public_config.
  For example, the public config is
'kylin.restclient.connection.default-max-per-route=20\nkylin.restclient.connection.max-total=200\nkylin.engine.default=2\nkylin.storage.default=2\n
kylin.web.hive-limit=20\nkylin.web.help.length=4\n'. I expected to get 20 but
got '' when looking up the key
'kylin.restclient.connection.default-max-per-route'. The cause is that
'var keyIndex = _config.indexOf('\n' + name + '=');' (at kylinProperties.js:37)
returns -1 for a name that has no '\n' before it, i.e. the one on the first
line.
  I then debugged AdminService.java and KylinConfig.java and found that
KylinConfig.java:517 (around this line, in the method
exportToString(Collection propertyKeys)) builds the public config string with a
'\n' after each property, so the first property has no '\n' before it.
  That is what I found; it can cause problems for developers.
  What do you think?

Best regards,
yuzhang
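
The lookup problem can be reproduced outside the browser. The sketch below is a Python re-creation of the search described in the report, not the actual kylinProperties.js code. The function names are made up for illustration, and the "fixed" variant (prepending a '\n' before searching) is only one possible workaround, not necessarily the change the project adopted.

```python
# Public config string as quoted in the report: each property is followed
# by '\n', so the first property has no '\n' in front of it.
PUBLIC_CONFIG = (
    "kylin.restclient.connection.default-max-per-route=20\n"
    "kylin.restclient.connection.max-total=200\n"
    "kylin.engine.default=2\n"
    "kylin.storage.default=2\n"
    "kylin.web.hive-limit=20\n"
    "kylin.web.help.length=4\n"
)


def get_property_buggy(config, name):
    # Same idea as indexOf('\n' + name + '='): needs a newline before the key.
    marker = "\n" + name + "="
    key_index = config.find(marker)
    if key_index == -1:
        return ""
    start = key_index + len(marker)
    end = config.find("\n", start)
    return config[start:] if end == -1 else config[start:end]


def get_property_fixed(config, name):
    # Hypothetical workaround: make sure every key, including the first,
    # is preceded by a newline before running the same search.
    return get_property_buggy("\n" + config, name)


first_key = "kylin.restclient.connection.default-max-per-route"
print(repr(get_property_buggy(PUBLIC_CONFIG, first_key)))
print(repr(get_property_fixed(PUBLIC_CONFIG, first_key)))
```

The first call prints an empty string and the second prints '20'; keys on later lines return the same value from both functions.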





[jira] [Created] (KYLIN-3835) [Defective TableSchemaUpdateChecker] Don't check used models when reload table

2019-02-27 Thread Yuzhang QIU (JIRA)
Yuzhang QIU created KYLIN-3835:
--

 Summary: [Defective TableSchemaUpdateChecker] Don't check used 
models when reload table
 Key: KYLIN-3835
 URL: https://issues.apache.org/jira/browse/KYLIN-3835
 Project: Kylin
  Issue Type: Bug
  Components: REST Service
Affects Versions: v2.5.2
Reporter: Yuzhang QIU


1. Load table1 from Hive.
2. Create model1 based on table1 and use table1.column1 as dimension1.
3. In Hive, rename table1.column1 to table1.column11.
4. Reload table1; it succeeds. (This is the bug.)
5. Switch to the Model page; model1 still exists. I can create cube1 based on
model1 and launch a build job; of course, the job fails after a while (can't
find table1.column1, etc.).
6. Reload metadata on the System page; model1 disappears from the Web UI, and
cube1 changes to DESCBROKEN and can't be deleted due to "null" (tracing the
log, I found this is caused by a null DataModelDesc in CubeInstance).
7. I wanted to recreate model1, but Kylin told me model1 already exists in the
current project. Indeed, after running 'sh bin/metadata.sh backup', I found
that model1's metadata is still stored in HBase.
8. I looked into the code: the reload-table validation is done in
TableSchemaUpdateChecker.allowLoad(), but it only checks the cubes that use the
table. If a model uses the changed table and no cube is based on that model,
the table can be reloaded successfully!
I think it shouldn't work like this.

Best regards
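
The gap described in step 8 can be illustrated with a small Python sketch. It is a simplified model, not Kylin's TableSchemaUpdateChecker: the `Model` and `Cube` classes and both check functions are hypothetical, and the only assumption is that a reload should be refused when something still references a column that no longer exists.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Model:
    name: str
    table: str
    columns: frozenset  # columns of `table` that the model references


@dataclass(frozen=True)
class Cube:
    name: str
    model: str  # name of the model the cube is built on


def missing_columns(model, new_columns):
    """Columns the model references that the reloaded table no longer has."""
    return sorted(model.columns - set(new_columns))


def check_cubes_only(table, new_columns, models, cubes):
    """Mimics the reported behavior: only models that have a cube are checked."""
    by_name = {m.name: m for m in models}
    problems = []
    for cube in cubes:
        model = by_name.get(cube.model)
        if model and model.table == table and missing_columns(model, new_columns):
            problems.append("cube " + cube.name)
    return problems


def check_models_too(table, new_columns, models, cubes):
    """Also checks every model on the table, with or without a cube."""
    problems = check_cubes_only(table, new_columns, models, cubes)
    for model in models:
        if model.table == table and missing_columns(model, new_columns):
            problems.append("model " + model.name)
    return problems


models = [Model("model1", "table1", frozenset({"column1"}))]
cubes = []  # no cube has been created on model1 yet
print(check_cubes_only("table1", ["column11"], models, cubes))
print(check_models_too("table1", ["column11"], models, cubes))
```

With no cube on model1, the cube-only check reports nothing and the reload would go through, while the model-aware check flags model1.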





[jira] [Created] (KYLIN-3830) return wrong result when 'SELECT SUM(dim1)' without set a relative metric of dim1.

2019-02-25 Thread Yuzhang QIU (JIRA)
Yuzhang QIU created KYLIN-3830:
--

 Summary: return wrong result when 'SELECT SUM(dim1)' without set a 
relative metric of dim1.
 Key: KYLIN-3830
 URL: https://issues.apache.org/jira/browse/KYLIN-3830
 Project: Kylin
  Issue Type: Bug
Affects Versions: v2.5.2
Reporter: Yuzhang QIU


Hi team,
  I designed cube1 on table table1 with dimensions dim1, dim2, dim3 and a
single measure, count(1), then ran 'SELECT SUM(dim1) FROM table1 group by
dim2'. Kylin processed this SQL and returned a result (result1). That seems
fine, but as we know, Kylin doesn't store the detail data; the dimension values
are encoded and stored in HBase as the rowkey (because I didn't define any
measure on a column). So is result1 correct?
  Then I cloned cube1 to cube2 and added a measure SUM(dim1). I ran the same
SQL against Kylin and got result2, which differs from result1 in the aggregated
column. I also ran the same SQL in Hive and got result3, which is the same as
result2.
  Yes, I turned off query pushdown.
  I think something is wrong here.
  I can't upload screenshots of the results because of our confidentiality
policy; sorry about that.





Re: development environment

2019-02-21 Thread yuzhang
Maybe you can log in to the Sandbox and type "hive" or "beeline" to start the
Hive command line by hand, to test the Hive dependency.



On 02/22/2019 11:01, XiaoHui Zhang wrote:
Hi team, I am a Kylin beginner and am setting up a Kylin development
environment with the HDP Sandbox. But when I run $KYLIN_HOME/bin/kylin.sh
start, the following errors occur:

[root@sandbox-hdp apache-ktlin-2.6.0-bin-hadoop3]#./bin/kylin.sh start
Retrieving hadoop conf dir...
KYLIN_HOME is set to /usr/local/apache-kylin-2.6.0-bin-hadoop3
Retrieving hive dependency...
Something wrong with Hive CLI or Beline,please execute Hive CLI or Beeline CLI 
in termina




Do I need to make any changes to Kylin's configuration file? If not, why did
this happen?


I look forward to any reply.





I wonder about the qualifier assignment within an column family for measures

2019-02-21 Thread yuzhang
Hi team, while reading the source code that creates the HTable and
CubeDescCreator, I found that there is only one column ("M") per column family
(F1, F2, F3), and that column "M" contains all the measures assigned to that
column family. HBaseReadonlyStore then loads the whole column (or Cell) data
into a buffer in the region server and returns only the selected measures to
the query server. I wonder why Kylin doesn't assign a separate column qualifier
(like M1, M2) to each measure?

If my understanding of this code is incorrect, please let me know.
I look forward to any reply.


[jira] [Created] (KYLIN-3783) The mapreduce.map.java.opts config in kylin_job_conf_inmem.xml overrides the krb5.conf config in Cluster

2019-01-23 Thread yuzhang qiu (JIRA)
yuzhang qiu created KYLIN-3783:
--

 Summary: The mapreduce.map.java.opts config in 
kylin_job_conf_inmem.xml overrides the krb5.conf config in Cluster
 Key: KYLIN-3783
 URL: https://issues.apache.org/jira/browse/KYLIN-3783
 Project: Kylin
  Issue Type: Bug
Affects Versions: v2.5.2
 Environment: hadoop 2.7
Reporter: yuzhang qiu


In our cluster, we use Kerberos for authentication and configure
`-Djava.security.krb5.conf` in `mapreduce.map.java.opts`. But the default
configuration in kylin_job_conf_inmem.xml is `-Xmx2700m
-XX:OnOutOfMemoryError='kill -9 %p'`, which overrides the krb5 setting when
the cubing algorithm is **inmem**. We then got `Caused by: KrbException:
Cannot locate default realm`
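
Why the Kerberos flag disappears can be shown with a few lines of Python. This is only a sketch, under the assumption (consistent with the report) that a job-level value for a property replaces the cluster-level value as a whole instead of being appended to it. The krb5.conf path is a placeholder, both helper functions are hypothetical, and carrying the flag into the job-level value is one possible workaround, not a confirmed fix.

```python
# Cluster-level value; the path is a placeholder, not from the report.
CLUSTER_OPTS = "-Djava.security.krb5.conf=/path/to/krb5.conf"
# Default job-level value quoted in the report for kylin_job_conf_inmem.xml.
JOB_OPTS = "-Xmx2700m -XX:OnOutOfMemoryError='kill -9 %p'"


def effective_opts_override(cluster_value, job_value):
    # Assumed behavior: the job-level value wins outright when it is set.
    return job_value if job_value is not None else cluster_value


def effective_opts_merged(cluster_value, job_value):
    # Possible workaround: keep the cluster-level flags in the job-level value.
    return " ".join(v for v in (cluster_value, job_value) if v)


print(effective_opts_override(CLUSTER_OPTS, JOB_OPTS))
print(effective_opts_merged(CLUSTER_OPTS, JOB_OPTS))
```

The first line printed no longer contains `-Djava.security.krb5.conf`, which matches the "Cannot locate default realm" symptom; the second keeps both the Kerberos flag and the memory settings.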





[jira] [Created] (KYLIN-3723) Can't find bad query configuration in kylin config doc

2018-12-16 Thread yuzhang qiu (JIRA)
yuzhang qiu created KYLIN-3723:
--

 Summary: Can't find bad query configuration in kylin config doc
 Key: KYLIN-3723
 URL: https://issues.apache.org/jira/browse/KYLIN-3723
 Project: Kylin
  Issue Type: Improvement
  Components: Documentation
Affects Versions: v2.5.2
Reporter: yuzhang qiu


Well, I want to customize the threshold for bad (slow) queries in Kylin, but I
can't find the relevant configuration in the Kylin configuration document.
However, I found the config properties by tracing the code
(KylinConfigBase.java). So I wonder why the document doesn't cover the bad
query config?


