Re: How kylin can log in without password
kylinSecurity.xml configures the beans for Spring Security. Simplifying this file and removing the login page from the frontend webapp may help you.

yuzhang | Email:shifengdefan...@163.com | Signature is customized by Netease Mail Master

On 12/11/2019 22:26, yuzhang wrote:
Hi Wang, Kylin uses Spring Security as its authorization framework. Modifying the frontend webapp may satisfy your requirement.

yuzhang | Email:shifengdefan...@163.com | Signature is customized by Netease Mail Master

On 12/11/2019 10:18, wangdongd...@bidcc.cn wrote:
Dear developers, because of a business requirement we need password-free login to the Kylin system. We have tried to modify it many times without success. How can Kylin achieve password-free login, and what configuration is needed? This is my modified file, kylinSecurity.xml; I am not sure whether it is the right file. If you have time to help me, I will be very grateful. Thank you.

王栋栋, Java R&D engineer
Computer Network Information Center, Chinese Academy of Sciences
北京北龙云海网络数据科技有限责任公司
Re: How kylin can log in without password
Hi Wang, Kylin uses Spring Security as its authorization framework. Modifying the frontend webapp may satisfy your requirement.

yuzhang | Email:shifengdefan...@163.com | Signature is customized by Netease Mail Master

On 12/11/2019 10:18, wangdongd...@bidcc.cn wrote:
Dear developers, because of a business requirement we need password-free login to the Kylin system. We have tried to modify it many times without success. How can Kylin achieve password-free login, and what configuration is needed? This is my modified file, kylinSecurity.xml; I am not sure whether it is the right file. If you have time to help me, I will be very grateful. Thank you.

王栋栋, Java R&D engineer
Computer Network Information Center, Chinese Academy of Sciences
北京北龙云海网络数据科技有限责任公司
Re: [VOTE] Release apache-kylin-3.0.0 (RC1)
Looking forward to it. +1

yuzhang | Email:shifengdefan...@163.com | Signature is customized by Netease Mail Master

On 12/10/2019 14:07, ShaoFeng Shi wrote:
Hi all,

I have created a build for Apache Kylin 3.0.0, release candidate 1.

Changes highlights:
[KYLIN-4258] - Real-time OLAP may return an incorrect result for some case
[KYLIN-4167] - Refactor streaming coordinator
[KYLIN-4273] - Make cube planner works for real-time streaming job
[KYLIN-4187] - Building dimension dictionary using spark
[KYLIN-4098] - Add cube auto-merge API

Thanks to everyone who has contributed to this release. Here are the release notes:
https://issues.apache.org/jira/secure/ReleaseNote.jspa?version=12345005==12316121

The commit to be voted upon:
https://github.com/apache/kylin/commit/c75242a9b55fd57a3a58d92a2dfa9f21cfe4eebc
Its hash is c75242a9b55fd57a3a58d92a2dfa9f21cfe4eebc.

The artifacts to be voted on, including the source package and two pre-compiled binary packages, are located here:
https://dist.apache.org/repos/dist/dev/kylin/apache-kylin-3.0.0-rc1/

The hashes of the artifacts are as follows:
apache-kylin-3.0.0-source-release.zip.sha256 9224742a87750b8d127c5031c03f3716e3af732c9805a6d0c64871605704f6c0
apache-kylin-3.0.0-bin-hbase1x.tar.gz.sha256 bdeddee3eb453c139eabaa2ce7ebd5d14f72d5ac48e5a64636aba2ed7357dda9
apache-kylin-3.0.0-bin-cdh57.tar.gz.sha256 c2ae9498f61edbacb6dae5fc32e2c4ea14539ef6d906d53194492e042c80185f
apache-kylin-3.0.0-bin-hadoop3.tar.gz.sha256 116ba002d794058bd34bd05989da2c3a7ff87cf67d3647d2f1cc5b5717d445f6
apache-kylin-3.0.0-bin-cdh60.tar.gz.sha256 22a0701b5a03a8d40c8b1be4fe4acb1ff2550a18c52d509b592d59ef5a094f7e

A staged Maven repository is available for review at:
https://repository.apache.org/content/repositories/orgapachekylin-1070/

Release artifacts are signed with the following key:
https://people.apache.org/keys/committer/shaofengshi.asc

Please vote on releasing this package as Apache Kylin 3.0.0.

The vote is open for the next 72 hours and passes if a majority of at least three +1 PMC votes are cast.

[ ] +1 Release this package as Apache Kylin 3.0.0
[ ] 0 I don't feel strongly about it, but I'm okay with the release
[ ] -1 Do not release this package because...

Here is my vote:
+1 (binding)

Best regards,
Shaofeng Shi 史少锋
Apache Kylin PMC
Email: shaofeng...@apache.org
Apache Kylin FAQ: https://kylin.apache.org/docs/gettingstarted/faq.html
Join Kylin user mail group: user-subscr...@kylin.apache.org
Join Kylin dev mail group: dev-subscr...@kylin.apache.org
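For anyone checking the artifacts before voting, here is a minimal sketch of comparing a downloaded file against a published SHA-256 value. It assumes the artifact has already been downloaded to a local path; the helper names `sha256_of` and `matches_published_hash` are made up for this illustration and are not part of any Kylin or Apache tooling.

```python
import hashlib


def sha256_of(path, chunk_size=1 << 20):
    """Return the hex SHA-256 digest of the file at `path`, read in chunks."""
    digest = hashlib.sha256()
    with open(path, "rb") as fh:
        for chunk in iter(lambda: fh.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def matches_published_hash(path, published):
    """Compare a local file with a published value.

    `published` may be just the hex digest or "<hex>  <filename>";
    only the first whitespace-separated token is used.
    """
    expected = published.split()[0].lower()
    return sha256_of(path) == expected
```

A reviewer would call `matches_published_hash("apache-kylin-3.0.0-source-release.zip", "<value from the mail>")` and treat `False` as a reason to investigate before casting a vote.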
[jira] [Created] (KYLIN-4142) Upgrade ehcache version from 2 to 3
Yuzhang QIU created KYLIN-4142:

Summary: Upgrade ehcache version from 2 to 3
Key: KYLIN-4142
URL: https://issues.apache.org/jira/browse/KYLIN-4142
Project: Kylin
Issue Type: Improvement
Components: Query Engine
Affects Versions: v2.5.2
Reporter: Yuzhang QIU
Assignee: Yuzhang QIU

--
This message was sent by Atlassian Jira (v8.3.2#803003)
[KYLIN-3392]Support NULL value in sum, max, min aggregation
Hi dear team, what do you think about KYLIN-3392?

yuzhang | Email:shifengdefan...@163.com | Signature is customized by Netease Mail Master
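As background for the question, standard SQL aggregates skip NULL inputs, and SUM, MAX and MIN return NULL when every input is NULL. The sketch below models that behaviour with `None` standing in for NULL. It only illustrates the SQL semantics; it is not taken from the KYLIN-3392 patch, and the function names are invented for the example.

```python
def sql_aggregate(values, combine):
    """Fold non-NULL values with `combine`; return None if there are none."""
    result = None
    for value in values:
        if value is None:
            continue  # NULL inputs are ignored by the aggregate
        result = value if result is None else combine(result, value)
    return result


def sql_sum(values):
    return sql_aggregate(values, lambda a, b: a + b)


def sql_max(values):
    return sql_aggregate(values, max)


def sql_min(values):
    return sql_aggregate(values, min)
```

Note that a NULL-only group yields `None` rather than 0, which is the case a pre-aggregating engine has to preserve when it merges partial results.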
[jira] [Created] (KYLIN-4080) Project schema update event causes error reload NEW DataModelDesc
Yuzhang QIU created KYLIN-4080:

Summary: Project schema update event causes error reload NEW DataModelDesc
Key: KYLIN-4080
URL: https://issues.apache.org/jira/browse/KYLIN-4080
Project: Kylin
Issue Type: Bug
Components: Metadata
Affects Versions: v2.5.2
Reporter: Yuzhang QIU

Hi, dear Kylin dev team,

When a new DataModelDesc is created, DataModelManager.createDataModelDesc:246 temporarily adds the new model name to the cache of the selected project (project1), but does not persist it. This TEMPORARY ADD operation lets the model reload succeed instead of throwing the "No project found for model ..." exception (at ProjectManager:391).

However, if another thread is processing "Broadcasting update project_schema, project1", it cleans up the cache of project1 and reloads it, which undoes the TEMPORARY ADD operation. Meanwhile, the model-saving thread has persisted the DataModelDesc and starts to reload it, but finds that there is "No project for this model".

The new model cannot be created again because of the conflicting timestamp, and it cannot be reloaded into the cache because of the problem above.

What do you think about this?

Best regards
yuzhang

--
This message was sent by Atlassian JIRA (v7.6.14#76016)
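The interleaving described in this report can be replayed deterministically with a small model. This is a sketch under simplifying assumptions: `ProjectCache`, `reload_model` and `create_model` are hypothetical stand-ins written for this example, not Kylin classes, and the second thread is simulated by a flag rather than by real concurrency.

```python
class ProjectCache:
    """Toy model of one project's model list: a persisted set plus a cache."""

    def __init__(self):
        self.persisted_models = set()
        self.cached_models = set()

    def temporary_add(self, model):
        # Cache-only change; the persisted set is left untouched.
        self.cached_models.add(model)

    def reload_from_store(self):
        # What a project_schema broadcast does here: drop the cache, reload it.
        self.cached_models = set(self.persisted_models)


def reload_model(project, model):
    if model not in project.cached_models:
        raise LookupError("No project found for model " + model)
    return model


def create_model(project, model, broadcast_interleaves):
    project.temporary_add(model)
    if broadcast_interleaves:
        # Another thread handles the broadcast between the two steps.
        project.reload_from_store()
    return reload_model(project, model)
```

With `broadcast_interleaves=False` the call returns the model name; with `True` it raises `LookupError`, which mirrors the failure in the report: the temporary entry is gone by the time the saved model is reloaded.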
[jira] [Created] (KYLIN-4032) Add tools to show kylin instance which schedule the running job
Yuzhang QIU created KYLIN-4032:

Summary: Add tools to show kylin instance which schedule the running job
Key: KYLIN-4032
URL: https://issues.apache.org/jira/browse/KYLIN-4032
Project: Kylin
Issue Type: Improvement
Components: Job Engine
Affects Versions: v2.5.2
Reporter: Yuzhang QIU

Hi team,

Sometimes an operator needs to know which Kylin instance owns a running or failed job in order to find the right log file in the Kylin cluster. A simple tool that shows this would be helpful.

Best regards
yuzhang

--
This message was sent by Atlassian JIRA (v7.6.3#76005)
[jira] [Created] (KYLIN-4031) RestClient will throw exception with message contains clear-text password
Yuzhang QIU created KYLIN-4031:

Summary: RestClient will throw exception with message contains clear-text password
Key: KYLIN-4031
URL: https://issues.apache.org/jira/browse/KYLIN-4031
Project: Kylin
Issue Type: Improvement
Components: REST Service
Affects Versions: v2.5.2
Reporter: Yuzhang QIU

Hi dear Kylin team,

I found that RestClient:97 throws an IllegalArgumentException whose message contains the clear-text password when an invalid URI containing user:pwd is supplied. I think this may cause a security problem. What do you think about this?

Best regards
yuzhang

--
This message was sent by Atlassian JIRA (v7.6.3#76005)
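One way to avoid leaking the password is to mask it before the URI is put into an error message. The sketch below shows the idea in Python; it is not the RestClient code and not the fix that was applied. The regular expression and the names `redact_password` and `invalid_uri_error` are assumptions made for the example, and passwords that themselves contain `:`, `/` or `@` are not handled.

```python
import re

# A ":" followed by characters other than ":", "@", "/" or whitespace,
# then "@": the password part of "user:password@host".
_USERINFO_PASSWORD = re.compile(r":[^:@/\s]+@")


def redact_password(uri):
    """Return `uri` with any user-info password replaced by ***."""
    return _USERINFO_PASSWORD.sub(":***@", uri)


def invalid_uri_error(uri):
    """Build the error for a bad URI without echoing the password."""
    return ValueError(
        "URI: " + redact_password(uri) + " does not match the expected pattern"
    )
```

Masking with a pattern rather than a URI parser is deliberate here: the exception is raised precisely when the input is malformed, so a strict parser may fail on the same input.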
[jira] [Created] (KYLIN-4020) fix_length rowkey encode without specified length can be saved but cause CreateHTable step failed
Yuzhang QIU created KYLIN-4020:

Summary: fix_length rowkey encode without specified length can be saved but cause CreateHTable step failed
Key: KYLIN-4020
URL: https://issues.apache.org/jira/browse/KYLIN-4020
Project: Kylin
Issue Type: Improvement
Components: Metadata
Affects Versions: v2.5.2
Reporter: Yuzhang QIU

Hi dear team,

Just as the title says. I think the advanced settings should be checked more strictly. What do you think about this? If there is already a JIRA for this, please let me know and close this one.

Best regards
yuzhang

--
This message was sent by Atlassian JIRA (v7.6.3#76005)
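A stricter check of this kind could run when the cube design is saved, so the error surfaces before the build job starts. The sketch below assumes the encoding is written as `name:length` (for example `fixed_length:12`); that format and the function name `validate_rowkey_encoding` are assumptions for illustration, not Kylin's validation code.

```python
def validate_rowkey_encoding(encoding):
    """Reject a fixed_length encoding that has no positive length.

    Other encoding names are passed through unchanged in this sketch.
    """
    name, _, length = encoding.partition(":")
    if name == "fixed_length":
        if not length.isdecimal() or int(length) <= 0:
            raise ValueError(
                "fixed_length encoding needs a positive length, got %r" % encoding
            )
    return encoding
```

Failing at save time keeps an unusable cube description out of the metadata store, instead of letting it fail later at the CreateHTable step.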
[jira] [Created] (KYLIN-4013) Only show the cubes under one model
Yuzhang QIU created KYLIN-4013:

Summary: Only show the cubes under one model
Key: KYLIN-4013
URL: https://issues.apache.org/jira/browse/KYLIN-4013
Project: Kylin
Issue Type: Improvement
Components: Web
Affects Versions: v2.5.2
Reporter: Yuzhang QIU
Assignee: Yuzhang QIU

An improvement for the UI. Users may want to see only the cubes under a specified model. Add an extra action 'Cubes' to the drop-down list of the model to filter the cube list.

--
This message was sent by Atlassian JIRA (v7.6.3#76005)
Re: Atlas Error when run IT on sandbox2.4
Well, stopping the Atlas process and removing `org.apache.atlas.hive.hook.HiveHook` from the Hive configuration in Ambari solves this problem. The Atlas process is not necessary for running the integration tests.

yuzhang | shifengdefan...@163.com | Signature is customized by Netease Mail Master

On 5/5/2019 09:01, yuzhang wrote:
And I found the following exception message in the Atlas log file.

2019-05-05 00:40:06,346 DEBUG - [qtp1798286609-13 - 1a863767-d092-4b8d-a45a-d8cb82d8e6ae:] ~ submitting entity { "jsonClass":"org.apache.atlas.typesystem.json.InstanceSerialization$_Reference", "id":{ "jsonClass":"org.apache.atlas.typesystem.json.InstanceSerialization$_Id", "id":"-39494778537956", "version":0, "typeName":"hive_table" }, "typeName":"hive_table", "values":{ "tableType":"MANAGED_TABLE", "name":"default.kylin_intermediate_ci_inner_join_cube_325139ef_5dd0_01b4_ae61_bc4dcc99c2bd__group_by@Sandbox", "createTime":"1557016193", "temporary":false, "db":{ "jsonClass":"org.apache.atlas.typesystem.json.InstanceSerialization$_Reference", "id":{ "jsonClass":"org.apache.atlas.typesystem.json.InstanceSerialization$_Id", "id":"409729d5-c11f-482e-b211-3f50bd097b8e", "version":0, "typeName":"hive_db" }, "typeName":"hive_db", "values":{ }, "traitNames":[ ], "traits":{ } }, "retention":0, "tableName":"kylin_intermediate_ci_inner_join_cube_325139ef_5dd0_01b4_ae61_bc4dcc99c2bd__group_by", "columns":[ { "jsonClass":"org.apache.atlas.typesystem.json.InstanceSerialization$_Reference", "id":{ "jsonClass":"org.apache.atlas.typesystem.json.InstanceSerialization$_Id", "id":"bb04e660-6be2-42a0-8ad2-1b36487e24b0", "version":0, "typeName":"hive_column" }, "typeName":"hive_column", "values":{ }, "traitNames":[ ], "traits":{ } } ], "comment":"", "lastAccessTime":0, "owner":"root", "sd":{ "jsonClass":"org.apache.atlas.typesystem.json.InstanceSerialization$_Reference", "id":{ "jsonClass":"org.apache.atlas.typesystem.json.InstanceSerialization$_Id", "id":"b71fce9e-4e47-4af3-8c51-d8f93a45ebe4", "version":0, "typeName":"hive_storagedesc" },
"typeName":"hive_storagedesc", "values":{ }, "traitNames":[ ], "traits":{ } }, "parameters":{ "comment":"", "transient_lastDdlTime":"1557016193" }, "partitionKeys":[ { "jsonClass":"org.apache.atlas.typesystem.json.InstanceSerialization$_Reference", "id":{ "jsonClass":"org.apache.atlas.typesystem.json.InstanceSerialization$_Id", "id":"81cd90bf-7f7f-4951-9d68-3273337573d3", "version":0, "typeName":"hive_column" }, "typeName":"hive_column", "values":{ }, "traitNames":[ ], "traits":{ } } ] }, "traitNames":[ ], "traits":{ } } (EntityResource:94) 2019-05-05 00:40:06,349 ERROR - [qtp1798286609-13 - 1a863767-d092-4b8d-a45a-d8cb82d8e6ae:] ~ Unable to persist entity instance due to a desrialization error (EntityResource:109) org.apache.atlas.typesystem.types.ValueConversionException: Cannot convert value 'org.apache.atlas.typesystem.Referenceable@2f651f62' to datatype hive_table at org.apache.atlas.typesystem.types.ClassType.convert(ClassType.java:143) at org.apache.atlas.services.DefaultMetadataService.deserializeClassInstance(DefaultMetadataService.java:252) at org.apache.atlas.services.DefaultMetadataService.createEntity(DefaultMetadataService.java:230) at org.apache.atlas.web.resources.EntityResource.submit(EntityResource.java:96) at sun.reflect.GeneratedMethodAccessor70.invoke(Unknown Source) at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43) at java.lang.reflect.Method.invoke(Method.java:498) at com.sun.jersey.spi.container.JavaMethodInvokerFactory$1.invoke(JavaMethodInvokerFactory.java:60) at com.sun.jersey.server.impl.model.method.dispatch.AbstractResourceMethodDispatchProvider$ResponseOutInvoker._dispatch(AbstractResourceMethodDispatchProvider.java:205) at com.sun.jersey.server.impl.model.method.dispatch.ResourceJavaMethodDispatcher.dispatch(ResourceJavaMethodDispatcher.java:75) at com.sun.jersey.server.impl.uri.rules.HttpMethodRule.accept(HttpMethodRule.java:288) at com.sun.jersey.server.impl.uri.rules.
Re: Atlas Error when run IT on sandbox2.4
sun.jersey.server.impl.model.method.dispatch.AbstractResourceMethodDispatchProvider$ResponseOutInvoker._dispatch(AbstractResourceMethodDispatchProvider.java:205) at com.sun.jersey.server.impl.model.method.dispatch.ResourceJavaMethodDispatcher.dispatch(ResourceJavaMethodDispatcher.java:75) at com.sun.jersey.server.impl.uri.rules.HttpMethodRule.accept(HttpMethodRule.java:288) at com.sun.jersey.server.impl.uri.rules.ResourceClassRule.accept(ResourceClassRule.java:108) at com.sun.jersey.server.impl.uri.rules.RightHandPathRule.accept(RightHandPathRule.java:147) at com.sun.jersey.server.impl.uri.rules.RootResourceClassesRule.accept(RootResourceClassesRule.java:84) at com.sun.jersey.server.impl.application.WebApplicationImpl._handleRequest(WebApplicationImpl.java:1469) at com.sun.jersey.server.impl.application.WebApplicationImpl._handleRequest(WebApplicationImpl.java:1400) at com.sun.jersey.server.impl.application.WebApplicationImpl.handleRequest(WebApplicationImpl.java:1349) at com.sun.jersey.server.impl.application.WebApplicationImpl.handleRequest(WebApplicationImpl.java:1339) at com.sun.jersey.spi.container.servlet.WebComponent.service(WebComponent.java:409) at com.sun.jersey.spi.container.servlet.ServletContainer.service(ServletContainer.java:558) at com.sun.jersey.spi.container.servlet.ServletContainer.service(ServletContainer.java:733) at javax.servlet.http.HttpServlet.service(HttpServlet.java:790) at com.google.inject.servlet.ServletDefinition.doServiceImpl(ServletDefinition.java:287) at com.google.inject.servlet.ServletDefinition.doService(ServletDefinition.java:277) at com.google.inject.servlet.ServletDefinition.service(ServletDefinition.java:182) at com.google.inject.servlet.ManagedServletPipeline.service(ManagedServletPipeline.java:91) at com.google.inject.servlet.FilterChainInvocation.doFilter(FilterChainInvocation.java:85) at org.apache.atlas.web.filters.AuditFilter.doFilter(AuditFilter.java:67) at 
com.google.inject.servlet.FilterChainInvocation.doFilter(FilterChainInvocation.java:82) at com.google.inject.servlet.ManagedFilterPipeline.dispatch(ManagedFilterPipeline.java:119) at com.google.inject.servlet.GuiceFilter$1.call(GuiceFilter.java:133) at com.google.inject.servlet.GuiceFilter$1.call(GuiceFilter.java:130) at com.google.inject.servlet.GuiceFilter$Context.call(GuiceFilter.java:203) at com.google.inject.servlet.GuiceFilter.doFilter(GuiceFilter.java:130) at org.eclipse.jetty.servlet.ServletHandler$CachedChain.doFilter(ServletHandler.java:1652) at org.eclipse.jetty.servlet.ServletHandler.doHandle(ServletHandler.java:585) at org.eclipse.jetty.server.handler.ScopedHandler.handle(ScopedHandler.java:143) at org.eclipse.jetty.security.SecurityHandler.handle(SecurityHandler.java:577) at org.eclipse.jetty.server.session.SessionHandler.doHandle(SessionHandler.java:223) at org.eclipse.jetty.server.handler.ContextHandler.doHandle(ContextHandler.java:1127) at org.eclipse.jetty.servlet.ServletHandler.doScope(ServletHandler.java:515) at org.eclipse.jetty.server.session.SessionHandler.doScope(SessionHandler.java:185) at org.eclipse.jetty.server.handler.ContextHandler.doScope(ContextHandler.java:1061) at org.eclipse.jetty.server.handler.ScopedHandler.handle(ScopedHandler.java:141) at org.eclipse.jetty.server.handler.HandlerWrapper.handle(HandlerWrapper.java:97) at org.eclipse.jetty.server.Server.handle(Server.java:499) at org.eclipse.jetty.server.HttpChannel.handle(HttpChannel.java:310) at org.eclipse.jetty.server.HttpConnection.onFillable(HttpConnection.java:257) at org.eclipse.jetty.io.AbstractConnection$2.run(AbstractConnection.java:540) at org.eclipse.jetty.util.thread.QueuedThreadPool.runJob(QueuedThreadPool.java:635) at org.eclipse.jetty.util.thread.QueuedThreadPool$3.run(QueuedThreadPool.java:555) at java.lang.Thread.run(Thread.java:748) Caused by: org.apache.atlas.typesystem.types.ValueConversionException$NullConversionException: Null value not allowed for 
multiplicty Multiplicity{lower=1, upper=1, isUnique=false} at org.apache.atlas.typesystem.types.DataTypes$PrimitiveType.convertNull(DataTypes.java:93) at org.apache.atlas.typesystem.types.DataTypes$StringType.convert(DataTypes.java:469) at org.apache.atlas.typesystem.types.DataTypes$StringType.convert(DataTypes.java:452) at org.apache.atlas.typesystem.types.DataTypes$MapType.convert(DataTypes.java:606) at org.apache.atlas.typesystem.types.DataTypes$MapType.convert(DataTypes.java:562) at org.apache.atlas.typesystem.persistence.StructInstance.set(StructInstance.java:118) at org.apache.atlas.typesystem.types.ClassType.convert(ClassType.java:141) ... 51 more

yuzhang | shifengdefan...@163.com | Signature is customized by Netease Mail Master

On 5/5/2019 08:47, yuzhang wrote:
Hi all,

I hit this exception when I run IT on sandbox2.4 or run the Hive command from the shell. Has anyone met the same problem? Here is a discussion about this problem, but I don't think it is helpful.
https://community.hortonworks.com/questions/224847/i-am-getting-errornull-value-not
Atlas Error when run IT on sandbox2.4
Hi all,

I hit this exception when I run IT on sandbox2.4 or run the Hive command below from the shell. Has anyone met the same problem? Here is a discussion about this problem, but I don't think it is helpful.
https://community.hortonworks.com/questions/224847/i-am-getting-errornull-value-not-allowed-for-multi.html

Best regards
yuzhang

==Hive cmd:==
hive -e "USE default;
CREATE TABLE IF NOT EXISTS default.ci_inner_join_cube_global_dict ( dict_key STRING COMMENT '', dict_val INT COMMENT '' ) COMMENT '' PARTITIONED BY (dict_column string) STORED AS TEXTFILE;
DROP TABLE IF EXISTS kylin_intermediate_ci_inner_join_cube_325139ef_5dd0_01b4_ae61_bc4dcc99c2bd__group_by;
CREATE TABLE IF NOT EXISTS kylin_intermediate_ci_inner_join_cube_325139ef_5dd0_01b4_ae61_bc4dcc99c2bd__group_by ( dict_key STRING COMMENT '' ) COMMENT '' PARTITIONED BY (dict_column string) STORED AS SEQUENCEFILE ;
INSERT OVERWRITE TABLE kylin_intermediate_ci_inner_join_cube_325139ef_5dd0_01b4_ae61_bc4dcc99c2bd__group_by PARTITION (dict_column = 'TEST_KYLIN_FACT_TEST_COUNT_DISTINCT_BITMAP') SELECT TEST_KYLIN_FACT_TEST_COUNT_DISTINCT_BITMAP FROM kylin_intermediate_ci_inner_join_cube_325139ef_5dd0_01b4_ae61_bc4dcc99c2bd GROUP BY TEST_KYLIN_FACT_TEST_COUNT_DISTINCT_BITMAP ;
" \
  --hiveconf mapreduce.map.output.compress.codec=org.apache.hadoop.io.compress.SnappyCodec \
  --hiveconf dfs.replication=2 \
  --hiveconf hive.exec.compress.output=true

===Error
org.apache.atlas.AtlasServiceException: Metadata service API CREATE_ENTITY failed with status 400(Bad Request) Response Body ({"error":"Null value not allowed for multiplicty Multiplicity{lower=1, upper=1, isUnique=false}","stackTrace":"org.apache.atlas.typesystem.types.ValueConversionException$NullConversionException: Null value not allowed for multiplicty Multiplicity{lower=1, upper=1, isUnique=false}\n\tat org.apache.atlas.typesystem.types.DataTypes$PrimitiveType.convertNull(DataTypes.java:93)\n\tat
org.apache.atlas.typesystem.types.DataTypes$StringType.convert(DataTypes.java:469)\n\tat org.apache.atlas.typesystem.types.DataTypes$StringType.convert(DataTypes.java:452)\n\tat org.apache.atlas.typesystem.types.DataTypes$MapType.convert(DataTypes.java:606)\n\tat org.apache.atlas.typesystem.types.DataTypes$MapType.convert(DataTypes.java:562)\n\tat org.apache.atlas.typesystem.persistence.StructInstance.set(StructInstance.java:118)\n\tat org.apache.atlas.typesystem.types.ClassType.convert(ClassType.java:141)\n\tat org.apache.atlas.services.DefaultMetadataService.deserializeClassInstance(DefaultMetadataService.java:252)\n\tat org.apache.atlas.services.DefaultMetadataService.createEntity(DefaultMetadataService.java:230)\n\tat org.apache.atlas.web.resources.EntityResource.submit(EntityResource.java:96)\n\tat sun.reflect.GeneratedMethodAccessor70.invoke(Unknown Source)\n\tat sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)\n\tat java.lang.reflect.Method.invoke(Method.java:498)\n\tat com.sun.jersey.spi.container.JavaMethodInvokerFactory$1.invoke(JavaMethodInvokerFactory.java:60)\n\tat com.sun.jersey.server.impl.model.method.dispatch.AbstractResourceMethodDispatchProvider$ResponseOutInvoker._dispatch(AbstractResourceMethodDispatchProvider.java:205)\n\tat com.sun.jersey.server.impl.model.method.dispatch.ResourceJavaMethodDispatcher.dispatch(ResourceJavaMethodDispatcher.java:75)\n\tat com.sun.jersey.server.impl.uri.rules.HttpMethodRule.accept(HttpMethodRule.java:288)\n\tat com.sun.jersey.server.impl.uri.rules.ResourceClassRule.accept(ResourceClassRule.java:108)\n\tat com.sun.jersey.server.impl.uri.rules.RightHandPathRule.accept(RightHandPathRule.java:147)\n\tat com.sun.jersey.server.impl.uri.rules.RootResourceClassesRule.accept(RootResourceClassesRule.java:84)\n\tat com.sun.jersey.server.impl.application.WebApplicationImpl._handleRequest(WebApplicationImpl.java:1469)\n\tat 
com.sun.jersey.server.impl.application.WebApplicationImpl._handleRequest(WebApplicationImpl.java:1400)\n\tat com.sun.jersey.server.impl.application.WebApplicationImpl.handleRequest(WebApplicationImpl.java:1349)\n\tat com.sun.jersey.server.impl.application.WebApplicationImpl.handleRequest(WebApplicationImpl.java:1339)\n\tat com.sun.jersey.spi.container.servlet.WebComponent.service(WebComponent.java:409)\n\tat com.sun.jersey.spi.container.servlet.ServletContainer.service(ServletContainer.java:558)\n\tat com.sun.jersey.spi.container.servlet.ServletContainer.service(ServletContainer.java:733)\n\tat javax.servlet.http.HttpServlet.service(HttpServlet.java:790)\n\tat com.google.inject.servlet.ServletDefinition.doServiceImpl(ServletDefinition.java:287)\n\tat com.google.inject.servlet.ServletDefinition.doService(ServletDefinition.java:277)\n\tat com.google.inject.servlet.ServletDefinition.service(ServletDefinition.java:182)\n\tat com.google.
Re: Why does the log always say “No Data Available” when the cube is built?
Hi shiqi,

"No Data Available" means that the step of the job has not completed yet. Once the step has completed, whether it succeeded or not, some log output will be shown. For your problem, could you provide more detail about your build job? For example, the server log, which step is running, and your deployment environment would be helpful.

Best regards
yuzhang

yuzhang | shifengdefan...@163.com | Signature is customized by Netease Mail Master

On 4/28/2019 21:05, shiqi wrote:
In the sample case on the Kylin official website, when I build the cube, the log of the first step, Create Intermediate Flat Hive Table, always shows "No Data Available" and the status is always running. The cube build has been running for more than three hours. I checked the Hive table kylin_sales and it contains data. I also found that the intermediate flat Hive table kylin_intermediate_kylin_sales_cube_402e3eaa_dfb2_7e3e_04f3_07248c04c10c has been created successfully in Hive, but it contains no data.

```
hive> show tables;
OK
...
kylin_intermediate_kylin_sales_cube_402e3eaa_dfb2_7e3e_04f3_07248c04c10c
kylin_sales
...
Time taken: 9.816 seconds, Fetched: 1 row(s)
hive> select * from kylin_sales;
OK
...
89922012-04-17 ABIN15687 0 13 95.5336 17 19751507 ADMIN Shanghai
89932013-02-02 FP-non GTC 67698 0 13 85.7528 6 1856 10004882MODELER Hongkong
...
Time taken: 3.759 seconds, Fetched: 1 row(s)
```

--
Sent from: http://apache-kylin.74782.x6.nabble.com/
Re: [DISCUSSION] Don't need to purge existing segment of cube to add new measures in Kylin
Thanks for your replies. Here is the JIRA for this feature. The PR is being prepared. I hope to get more advice from you.
https://issues.apache.org/jira/browse/KYLIN-3982

Regards
yuzhang

yuzhang | shifengdefan...@163.com | Signature is customized by Netease Mail Master

On 4/26/2019 17:05, Iñigo Martínez wrote:
+1
Please post the Jira issue so that we can subscribe. :-)
Our BI area modifies cubes very frequently by adding or removing dimensions. In many cases they want to include the whole historical series, so they have to rebuild cubes from scratch, which sometimes takes days to finish. From the point of view of resource usage this is a waste of computational and memory power, because other tasks are affected by the lack of resources. I think this proposal is really interesting for us.

On Fri, Apr 26, 2019 at 10:57, ShaoFeng Shi wrote:
Hi Yuzhang,

Please open a JIRA for this enhancement. If it can be implemented in an elegant way, that will be great!

Best regards,
Shaofeng Shi 史少锋
Apache Kylin PMC
Email: shaofeng...@apache.org
Apache Kylin FAQ: https://kylin.apache.org/docs/gettingstarted/faq.html
Join Kylin user mail group: user-subscr...@kylin.apache.org
Join Kylin dev mail group: dev-subscr...@kylin.apache.org

On Tue, Apr 23, 2019 at 8:56 AM, yuzhang wrote:
Hi Shaofeng,

We also experimented with adding a measure after the cube was built, and hit the byte error at the very start. The default mapping strategy between the HBase store and the measure definitions is "multiple measures are stored in one column of a column family", which may cause the byte error when a measure is added and inserted into the original measure sequence. Adding a separate column for the new measure may be better, I think.

I have only a preliminary idea about the measure management design, which may be impractical for now. Dimensions and metrics are defined once the model is designed. Measures aggregate the metrics over different dimensions to observe the data entities represented by the model. All of this belongs to the design of the 'logical view', I think. The cube is a materialized view of this logical model, and it is the bridge between the logical view and the physical storage (and the highway is set up). The life cycle of a measure may therefore depend on the model rather than on the cube. Based on this design, measure management can be set up after the model design is completed: we can define measures on the model, and cubes under the model can reuse those measures and build their segment data. When a SQL query arrives, the Kylin query server needs to find a suitable model with suitable measures, and then find an available cube.

Of course, such a design change would have a very large impact on the existing Kylin architecture, and the query and metadata layers would change substantially, so for now it remains on paper.

A more realistic or transitional design is to extend the metadata of the measure. Just as CubeDesc defines the schema and a corresponding CubeInstance manages the built segments, a MeasureDesc could have a MeasureInstance to manage the segments that contain it. I observed that Kylin's query service generates a GridTable to map between the logical view and the HBase physical storage: Cuboid + Measure -> Grid Table <- HBase store. This Grid Table is generated from the CubeDesc, and the mapping is performed for each segment. Therefore, at the mapping stage, the metadata can tell which columns of the Grid Table cannot be obtained from the current segment, so the measure data can be read selectively at the RS backend. But its life cycle is the same as that of MeasureDesc, managed by CubeDesc.

Regarding adding dimensions to the same cube, we also need to consider aggregation groups and rowkey order. I am curious about, and interested in, how you implemented it.

Best regards
yuzhang

yuzhang | shifengdefan...@163.com | Signature is customized by Netease Mail Master

On 4/22/2019 09:05, ShaoFeng Shi wrote:
Hi Yuzhang,

Glad to see such a discussion. How to support "schema change" in a friendly way is what we should work on in the next phase, as we see this requirement is stronger than before.

Last week I also tried 1) adding a dimension after the cube was built, and 2) adding a measure after the cube was built. For 1) I have an idea; the first try was successful, and I want to discuss it with the community some day. 2) failed: after a new measure was added, the query failed, and on the HBase RS side there was a byte parsing error. I didn't continue after that.

Could you elaborate your idea on "the measures of the analysis system can be
[jira] [Created] (KYLIN-3986) Add hint about the absent measures after a success query
Yuzhang QIU created KYLIN-3986:

Summary: Add hint about the absent measures after a success query
Key: KYLIN-3986
URL: https://issues.apache.org/jira/browse/KYLIN-3986
Project: Kylin
Issue Type: Sub-task
Components: Query Engine
Affects Versions: v2.5.2
Reporter: Yuzhang QIU
Assignee: Yuzhang QIU

--
This message was sent by Atlassian JIRA (v7.6.3#76005)
[jira] [Created] (KYLIN-3985) [Web UI] Support map measures to multi-qualifier in column family
Yuzhang QIU created KYLIN-3985:

Summary: [Web UI] Support map measures to multi-qualifier in column family
Key: KYLIN-3985
URL: https://issues.apache.org/jira/browse/KYLIN-3985
Project: Kylin
Issue Type: Sub-task
Components: Web
Affects Versions: v2.5.2
Reporter: Yuzhang QIU
Assignee: Yuzhang QIU

--
This message was sent by Atlassian JIRA (v7.6.3#76005)
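The discussion thread behind this sub-task mentions that storing multiple measures in one column can cause a byte error once a measure is inserted into the sequence. The toy sketch below shows why one qualifier per measure tolerates that change better. It packs every measure as a fixed 8-byte integer, which is a simplification chosen for the example and not Kylin's actual measure codec; the function and measure names are hypothetical.

```python
import struct


def encode_single_column(values):
    """Pack all measures, in declaration order, into one cell."""
    return struct.pack(">%dq" % len(values), *values)


def decode_single_column(blob, measure_names):
    """Decode one cell; fails if the measure list no longer matches the bytes."""
    values = struct.unpack(">%dq" % len(measure_names), blob)
    return dict(zip(measure_names, values))


def encode_per_qualifier(row):
    """Store each measure in its own cell, keyed by measure name."""
    return {name: struct.pack(">q", value) for name, value in row.items()}


def decode_per_qualifier(cells, measure_names):
    """Decode cell by cell; a measure the old row never stored becomes None."""
    return {
        name: struct.unpack(">q", cells[name])[0] if name in cells else None
        for name in measure_names
    }
```

Decoding an old two-measure cell with a three-measure list raises `struct.error` in the single-column layout, while the per-qualifier layout simply reports the new measure as absent for that row.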
[jira] [Created] (KYLIN-3984) Update measure metadata after job finished
Yuzhang QIU created KYLIN-3984:

Summary: Update measure metadata after job finished
Key: KYLIN-3984
URL: https://issues.apache.org/jira/browse/KYLIN-3984
Project: Kylin
Issue Type: Sub-task
Affects Versions: v2.5.2
Reporter: Yuzhang QIU
Assignee: Yuzhang QIU

Merging, building and refreshing a cube will update the measure metadata.

--
This message was sent by Atlassian JIRA (v7.6.3#76005)
[jira] [Created] (KYLIN-3983) Add extra metadata for measure
Yuzhang QIU created KYLIN-3983:

Summary: Add extra metadata for measure
Key: KYLIN-3983
URL: https://issues.apache.org/jira/browse/KYLIN-3983
Project: Kylin
Issue Type: Sub-task
Components: Metadata
Affects Versions: v2.5.2
Reporter: Yuzhang QIU
Assignee: Yuzhang QIU

Just as CubeDesc has CubeInstance, we need extra metadata for each measure to persist some runtime data.

--
This message was sent by Atlassian JIRA (v7.6.3#76005)
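The runtime data in question can be pictured as a record of which segments were built with each measure. The sketch below borrows the name MeasureInstance from the discussion thread, but its fields and the helper `absent_measures` are hypothetical, written only to show the shape of the idea rather than the metadata that was actually implemented.

```python
from dataclasses import dataclass, field


@dataclass
class MeasureInstance:
    """Runtime record for one measure: the segments that contain its data."""

    name: str
    segments: set = field(default_factory=set)

    def mark_built(self, segment):
        self.segments.add(segment)


def absent_measures(instances, segment, requested):
    """Return the requested measures that `segment` was not built with."""
    return [m for m in requested if segment not in instances[m].segments]
```

A query layer could use such a lookup per segment to skip reading measures that an older segment never stored, and to report them back as a hint after the query succeeds.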
[jira] [Created] (KYLIN-3982) Add measures without purging segments
Yuzhang QIU created KYLIN-3982:

Summary: Add measures without purging segments
Key: KYLIN-3982
URL: https://issues.apache.org/jira/browse/KYLIN-3982
Project: Kylin
Issue Type: New Feature
Components: Metadata, Query Engine, Tools, Build and Test
Affects Versions: v2.5.2
Reporter: Yuzhang QIU
Assignee: Yuzhang QIU

Here is the discussion:
https://lists.apache.org/thread.html/44bf088f278d0ca3087bb8bdffda158534994d4c41be5405eb4699d8@%3Cdev.kylin.apache.org%3E

--
This message was sent by Atlassian JIRA (v7.6.3#76005)
Re: [DISCUSSION] Don't need to purge existing segment of cube to add new measures in Kylin
Hi Shaofeng:

We also experimented with adding a measure after the cube had been built, and hit the byte error at the very start. The default mapping strategy between the HBase store and the measure definitions is "multiple measures are stored in one column of a column family", which may cause the byte error once a measure is added and inserted into the original measure sequence. Adding a separate column for the new measure may be better, I think.

I only have a preliminary idea about the measure management design, and it may be impractical for now. Dimensions and metrics are defined once the model is designed. A measure aggregates the metrics over different dimensions to observe the data entities represented by the model. All of this is the design of the 'logical view', I think. The cube is the materialized view of this logical model, the bridge between the logical view and the physical storage (and the highway is set up). The life cycle of a measure may therefore depend on the model rather than on the cube.

Based on this design, measure management can be set up after the model design is completed. We can define measures on the model. Cubes under the model can reuse those measures and build their segment data. When a SQL query arrives, the Kylin query server needs to find a suitable model with suitable measures, and then find an available cube. Of course, such a design change would have a very large impact on the existing Kylin architecture, and the query and metadata layers would change considerably. So for now it is still on paper.

A more realistic or transitional design is to extend the metadata of the measure. Just as CubeDesc defines the schema and a related CubeInstance manages the built segments, MeasureDesc could also have a MeasureInstance to manage the segments containing it. I observed that Kylin's query service generates a GridTable for the mapping between logical views and HBase physical storage: Cuboid + Measure -> Grid Table <- HBase store. This Grid Table is generated from CubeDesc, and the mapping process runs for each segment. Therefore, at the mapping stage, the metadata can tell which columns of the Grid Table cannot be obtained from the current segment, so the measure data can be read selectively at the RS backend. But its life cycle is the same as MeasureDesc, managed by CubeDesc.

Regarding adding dimensions to the same cube, we also need to consider aggregation groups and rowkey order. I am curious about how you implemented it.

Best regards
yuzhang

On 4/22/2019 09:05, ShaoFeng Shi wrote:

Hi Yuzhang,

Glad to see such a discussion. How to support "schema change" in a friendly way is what we should work on in the next phase, as this requirement is stronger than before. Last week I also tried 1) adding a dimension after the cube was built, and 2) adding a measure after the cube was built. For 1) I have an idea, the first try was successful, and I want to discuss it with the community some day. 2) failed: after a new measure was added, the query failed, and there was a byte parsing error on the HBase RS side. I didn't continue after that.

Could you elaborate on your idea that "the measures of the analysis system can be decoupled from the materialized view(cube) and have their own management system"? Do you have a rough design for it? Thank you!

Best regards,
Shaofeng Shi 史少锋
Apache Kylin PMC
Email: shaofeng...@apache.org
Apache Kylin FAQ: https://kylin.apache.org/docs/gettingstarted/faq.html
Join Kylin user mail group: user-subscr...@kylin.apache.org
Join Kylin dev mail group: dev-subscr...@kylin.apache.org
Re: [DISCUSSION] Don't need to purge existing segment of cube to add new measures in Kylin
Hi JiaTao:

Maybe an optional auto-complete mechanism among the different measure views is necessary, isn't it?

yuzhang

On 4/20/2019 11:38, JiaTao Tao wrote:

Hi,

The idea of supporting dynamically added measures in Kylin is impressive. But in my opinion, once you add a measure, the existing segments should also calculate the new measure (just add a new measure column). Users can have many cubes, and a cube can have many segments; if the measure view differs from segment to segment, it will increase the burden on the user.

--
Regards!
Aron Tao
[DISCUSSION] Don't need to purge existing segment of cube to add new measures in Kylin
Hi dear Kylin users and development team:

There is something I want to discuss with the community. As a representative MOLAP engine, Kylin uses pre-aggregation to provide high-concurrency analysis with second-level response times, but it also loses some flexibility. The limitation that existing segments must be purged before an additional measure can be added causes a lot of duplicate computation and unnecessary disk IO. Such waste should be avoided, especially in a MOLAP engine.

For example, there is a cube cubeA with one measure m1 and segments over time range 1 (tr1). Now the user adds a measure m2, but doesn't want to clear the segments over tr1. The values of m2 will exist in tr2, the segments built subsequently. Of course tr1 doesn't contain values of m2, which will be understood by users who know a little about MOLAP. Querying over tr1 and tr2 is valid for both m1 and m2, but the result of m2 over tr1 will be null. It would be better to remind the user that the measure is missing. Moreover, refreshing will supply m2 to the segments over tr1.

Currently, Kylin's storage engine is HBase. Measures are aggregated values over combinations of dimension members and are stored in a column of a column family in HBase. For the same cube, adding a new measure would add a column to the HBase table (mapping) and take effect in the next build. For the existing HTables (segments), the new column is allowed to be missing. Refreshing old segments would add a new column to their HTable to store the new measure. The value of the new measure is aggregated according to the combination of dimension members in the rowkey, without recalculating the existing measures.

For additional measures and even additional dimensions, Kylin's current solution is Hybrid, but we found the following shortcomings during use:

1. Management costs: similar cubes must be maintained repeatedly, and most of them share many dimensions and indicators. If you want to perform optimization operations such as pruning, you need to configure all of these cubes.
2. A large number of cubes: the initial analysis of the business is not stable, and analysts often need to add some measures. Cubes are added continuously to the Hybrid group, which produces a lot of cubes.
3. Repeated calculation: if you want to drop the old cube in the Hybrid group, you need to build the latest cube by computing over the historical data to cover the old cube. This results in a lot of waste.

In addition, while applying Kylin I felt that the metadata about measures was not perfect:

1. Measures are one of the most important concerns of analysts. If the measures of the analysis system can be decoupled from the materialized view(cube) and have their own management system, it may be more flexible.
2. Once the dimensions have been chosen in cube design, the cuboids are fixed no matter how many measures there are. It may be confusing to maintain cubes with different measures but the same cuboids. Only cubes with different cuboids should be considered different cubes, which is the definition of a cube, isn't it?

These are just some thoughts about MOLAP from my use of Kylin. What do you think about this? Looking forward to your reply. There may be some mistakes or misunderstandings here; please feel free to correct me or discuss further if you find any.

Best regards
yuzhang
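The cubeA example above (m2 readable over tr2, null over tr1 until a refresh) can be sketched as follows. This is only an illustration of the proposed behavior, not Kylin code: `SegmentMeasureSketch`, `Segment`, `readMeasure` and `segmentsMissing` are hypothetical names, and the per-segment storage is modeled as a plain map.

```java
import java.util.ArrayList;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

/** Hypothetical model of segments that may lack a measure added later. */
final class SegmentMeasureSketch {

    /** One built segment: a time range label plus the measure values it stores. */
    static final class Segment {
        final String timeRange;
        final Map<String, Long> measureValues;

        Segment(String timeRange, Map<String, Long> measureValues) {
            this.timeRange = timeRange;
            this.measureValues = measureValues;
        }
    }

    /**
     * Reads one measure from every segment. A segment built before the
     * measure existed yields null instead of failing the query.
     */
    static Map<String, Long> readMeasure(List<Segment> segments, String measure) {
        Map<String, Long> result = new LinkedHashMap<>();
        for (Segment s : segments) {
            result.put(s.timeRange, s.measureValues.get(measure)); // null when missing
        }
        return result;
    }

    /** Lists the time ranges that would need a refresh to supply the measure. */
    static List<String> segmentsMissing(List<Segment> segments, String measure) {
        List<String> missing = new ArrayList<>();
        for (Segment s : segments) {
            if (!s.measureValues.containsKey(measure)) {
                missing.add(s.timeRange);
            }
        }
        return missing;
    }

    public static void main(String[] args) {
        Map<String, Long> tr1 = new LinkedHashMap<>();
        tr1.put("m1", 10L);               // built before m2 was added
        Map<String, Long> tr2 = new LinkedHashMap<>();
        tr2.put("m1", 7L);
        tr2.put("m2", 3L);                // built after m2 was added

        List<Segment> segments = new ArrayList<>();
        segments.add(new Segment("tr1", tr1));
        segments.add(new Segment("tr2", tr2));

        System.out.println(readMeasure(segments, "m2"));
        System.out.println(segmentsMissing(segments, "m2"));
    }
}
```

The second method is what a "measure missing" reminder could be built on: it names the segments that a refresh would have to cover.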
[jira] [Created] (KYLIN-3956) Segments of not only streaming cube but also batch cube need to show their status
Yuzhang QIU created KYLIN-3956:

Summary: Segments of not only streaming cube but also batch cube need to show their status
Key: KYLIN-3956
URL: https://issues.apache.org/jira/browse/KYLIN-3956
Project: Kylin
Issue Type: Improvement
Components: Web
Affects Versions: v2.6.1
Reporter: Yuzhang QIU

Hi team:

In the file 'cube_detail.html' (around line 112), only segments of streaming cubes show their segment status. When an old segment of a batch cube is refreshed, there are two segments with the same time range, which may confuse the user. So showing their status may be necessary.
Re: Re: Should I remove this check about compare the last and fetched
We use UTF-8, but the content includes some emoji, such as ⚔

yuzhang

On April 8, 2019 at 19:06, Na Zhai wrote:

Hi, yuzhang. What’s the encoding of the column that you query? The original intention of the code that you mentioned is to find columns whose sort order is inconsistent before and after encoding.
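One way in which a sort order can be "inconsistent before and after encoding" is sketched below. It is an assumption for illustration only, since the thread does not establish what actually caused the reported exception: rows sorted by their UTF-8 bytes and then checked with `String.compareTo` (which compares UTF-16 code units) can disagree when a supplementary character such as an emoji meets a high BMP character. `SortOrderMismatchSketch` and `compareUtf8Bytes` are hypothetical names.

```java
import java.nio.charset.StandardCharsets;

/** Shows that UTF-8 byte order and String.compareTo order can disagree. */
final class SortOrderMismatchSketch {

    /** Compares two strings by their UTF-8 bytes, treating each byte as unsigned. */
    static int compareUtf8Bytes(String a, String b) {
        byte[] x = a.getBytes(StandardCharsets.UTF_8);
        byte[] y = b.getBytes(StandardCharsets.UTF_8);
        int n = Math.min(x.length, y.length);
        for (int i = 0; i < n; i++) {
            int diff = (x[i] & 0xFF) - (y[i] & 0xFF);
            if (diff != 0) {
                return diff;
            }
        }
        return x.length - y.length;
    }

    public static void main(String[] args) {
        String fullwidthTilde = "\uFF5E";   // U+FF5E, a single UTF-16 code unit
        String emoji = "\uD83D\uDE00";      // U+1F600, a surrogate pair in UTF-16

        // Byte order: the tilde (EF BD 9E) sorts before the emoji (F0 9F 98 80).
        System.out.println(compareUtf8Bytes(fullwidthTilde, emoji) < 0);

        // Code-unit order: the emoji (D83D ...) sorts before the tilde (FF5E).
        System.out.println(fullwidthTilde.compareTo(emoji) > 0);
    }
}
```

If the stored order and the comparator disagree like this, a check of the form `comparator.compare(last, fetched) > 0` would fire on data that is correctly sorted in the other order.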
[jira] [Created] (KYLIN-3920) Don't merge same dictionaries when merge dictionary
Yuzhang QIU created KYLIN-3920:

Summary: Don't merge same dictionaries when merge dictionary
Key: KYLIN-3920
URL: https://issues.apache.org/jira/browse/KYLIN-3920
Project: Kylin
Issue Type: Improvement
Components: Others
Affects Versions: v2.5.2
Reporter: Yuzhang QIU

Hi team:

I found that DictionaryManager passes all dictionaries to DictionaryGenerator for merging as soon as one of them differs from the others. But if there are 3 dictionaries {Dic1, Dic1, Dic2} in 3 segments, Kylin may not need to merge Dic1 with Dic1, so that the same values are not added to the new dictionary twice. If I misunderstand the merge job logic, please feel free to correct me!

Here is the code snapshot at DictionaryManager.java:251

```
boolean identicalSourceDicts = true;
for (int i = 1; i < dicts.size(); ++i) {
    if (!dicts.get(0).getDictionaryObject().equals(dicts.get(i).getDictionaryObject())) {
        identicalSourceDicts = false;
        break;
    }
}

if (identicalSourceDicts) {
    logger.info("Use one of the merging dictionaries directly");
    return dicts.get(0);
} else {
    Dictionary newDict = DictionaryGenerator.mergeDictionaries(DataType.getType(newDictInfo.getDataType()), dicts);
    return trySaveNewDict(newDict, newDictInfo);
}
```
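The suggestion in this issue, dropping repeated dictionaries before merging, might look roughly like the sketch below. It does not use Kylin's classes: dictionaries are stood in for by `Set<String>`, and `DictMergeSketch`, `distinctDicts` and `merge` are hypothetical names. It also assumes that `equals` on the dictionary objects is a reliable identity test, as the quoted snapshot already does.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.LinkedHashSet;
import java.util.List;
import java.util.Set;
import java.util.TreeSet;

/** Hypothetical stand-in for de-duplicating source dictionaries before a merge. */
final class DictMergeSketch {

    /** Drops repeated dictionaries (by equals) while keeping first-seen order. */
    static <T> List<T> distinctDicts(List<T> dicts) {
        return new ArrayList<>(new LinkedHashSet<>(dicts));
    }

    /** Stand-in merge: the union of the values of each distinct dictionary. */
    static Set<String> merge(List<Set<String>> dicts) {
        List<Set<String>> distinct = distinctDicts(dicts);
        if (distinct.size() == 1) {
            return distinct.get(0); // all sources identical: reuse one directly
        }
        Set<String> merged = new TreeSet<>();
        for (Set<String> d : distinct) {
            merged.addAll(d);
        }
        return merged;
    }

    public static void main(String[] args) {
        Set<String> dic1 = new TreeSet<>(Arrays.asList("a", "b"));
        Set<String> dic1Again = new TreeSet<>(Arrays.asList("a", "b"));
        Set<String> dic2 = new TreeSet<>(Arrays.asList("b", "c"));

        List<Set<String>> sources = Arrays.asList(dic1, dic1Again, dic2);
        System.out.println(distinctDicts(sources).size()); // distinct inputs to merge
        System.out.println(merge(sources));
    }
}
```

With {Dic1, Dic1, Dic2} only two dictionaries reach the merge step, and the existing "all identical" shortcut falls out as the case where one distinct dictionary remains.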
Re: Should I remove this check about compare the last and fetched
Sure, here is the code at SortedIteratorMergerWithLimit.java:130. The "Not sorted! last: XX fetched: XXX" exception may happen when querying a table that contains Chinese values (or garbled text).

```
@Override
public E next() {
    if (!nextFetched) {
        throw new IllegalStateException("Should hasNext() before next()");
    }

    //TODO: remove this check when validated
    if (last != null) {
        if (comparator.compare(last, fetched) > 0)
            throw new IllegalStateException("Not sorted! last: " + last + " fetched: " + fetched);
    }

    last = fetched;
    nextFetched = false;
    return fetched;
}
```

yuzhang

On 3/26/2019 23:08, elkan1788 wrote:

I am not sure I understand your question clearly. Can you give more information about it, for example a good sample? You can also forward the code you found and what you think happened!

--
Sent from: http://apache-kylin.74782.x6.nabble.com/
Should I remove this check about compare the last and fetched
Hi team:

When we use Kylin, some queries over tables that contain Chinese values throw a "Not sorted! last: XX fetched: XXX" exception. I found that there is a check comparing the last ITuple with the fetched ITuple at SortedIteratorMergerWithLimit:130. But there is also a comment that says "TODO: remove this check when validated". So, what was the original purpose of this check? Should I remove it? I would appreciate it if some developers could explain the logic behind this code.

Best regards
yuzhang
[jira] [Created] (KYLIN-3907) Sort the cube list by create time in descending order.
Yuzhang QIU created KYLIN-3907:

Summary: Sort the cube list by create time in descending order.
Key: KYLIN-3907
URL: https://issues.apache.org/jira/browse/KYLIN-3907
Project: Kylin
Issue Type: Improvement
Components: REST Service
Affects Versions: v2.5.2
Reporter: Yuzhang QIU

Hi team:

There may be a usability problem in the cube list of the Web UI. We create many cubes over time and need to click "MORE" to show the latest cube once the number of cubes grows beyond 15. In most cases, I think, the older cubes are stable while the new cubes may need to be debugged. So sorting the cube list by create time in descending order may be better. What do you think about this?

Best regards
yuzhang
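A minimal sketch of the requested ordering is shown below. `CubeListSortSketch`, `CubeInfo` and its `createTimeUtc` field are hypothetical stand-ins, not Kylin's REST model; the point is only the newest-first comparator.

```java
import java.util.ArrayList;
import java.util.Comparator;
import java.util.List;

/** Hypothetical newest-first ordering for a cube list. */
final class CubeListSortSketch {

    /** Minimal stand-in for a cube entry: a name and a creation timestamp in millis. */
    static final class CubeInfo {
        final String name;
        final long createTimeUtc;

        CubeInfo(String name, long createTimeUtc) {
            this.name = name;
            this.createTimeUtc = createTimeUtc;
        }
    }

    /** Returns a copy of the list sorted by create time, newest first. */
    static List<CubeInfo> newestFirst(List<CubeInfo> cubes) {
        List<CubeInfo> sorted = new ArrayList<>(cubes);
        sorted.sort(Comparator.comparingLong((CubeInfo c) -> c.createTimeUtc).reversed());
        return sorted;
    }

    public static void main(String[] args) {
        List<CubeInfo> cubes = new ArrayList<>();
        cubes.add(new CubeInfo("old_cube", 1_000L));
        cubes.add(new CubeInfo("new_cube", 3_000L));
        cubes.add(new CubeInfo("mid_cube", 2_000L));

        for (CubeInfo c : newestFirst(cubes)) {
            System.out.println(c.name);
        }
    }
}
```

Sorting on the server side, before the list is cut into pages of 15, is what would put the newest cube on the first page.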
Re: spark task error occurs when run IT in sanbox
Hi elkan:

Thank you for taking the time to reply. Just as you said, the reason was a mismatched JDK version. I had only set root's JAVA_HOME to point to jdk1.8, but every server in the sandbox runs under its own user. So I should re-link the original JAVA_HOME to the new one.

Best regards
yuzhang

On 3/25/2019 13:32, elkan1788 wrote:

It seems your Java runtime environment was not clean. Please check the JAVA_HOME and PATH system variables; use the echo command to see what they contain.

By the way, Kylin can also run in Hadoop clusters that use JDK 1.7, with just a simple modification. The steps are:

1. Modify the HBase conf file named hbase-env.sh and add: export JAVA_HOME=/path/of/jdk1.8
2. Append the configuration below to the kylin_job_conf.xml and kylin_job_conf_inmem.xml files:

   name:  mapred.child.env
   value: JAVA_HOME=/usr/lib/java/jdk1.8.0_201

   name:  yarn.app.mapreduce.am.env
   value: JAVA_HOME=/usr/lib/java/jdk1.8.0_201

Hope this helps!

--
Sent from: http://apache-kylin.74782.x6.nabble.com/
Re: Re: [Discussion]Does 'UNION ALL' support query on two fact table ?
Hi Na Zhai:

Thanks for taking the time to reply. Yes, I tested it and the query can hit two cubes.

yuzhang

On March 21, 2019 at 22:44, Na Zhai wrote:

Hi, yuzhang. There is one fact table in Cube A and one fact table in Cube B. I think “union all” supports queries on these two fact tables.

From: yuzhang
Sent: Tuesday, March 19, 2019 8:29:26 PM
To: dev@kylin.apache.org; u...@kylin.apache.org
Subject: [Discussion]Does 'UNION ALL' support query on two fact table ?

Hi dear all:

A simple question, as described in the mail title.

Best regards
yuzhang
Re: question related to the aggregation groups configuration
Hi Kang-sen:

I read your email carefully and think I can share some information with you.

1. You can use the cube planner to view the generated cuboids and their dimension combinations. Here is the doc: http://kylin.apache.org/docs/tutorial/use_cube_planner.html
2. The number of all combinations of D1-to-D10 is 2^10, not factorial(10), I think, according to the blog I sent you before. Did I misunderstand you?
3. I think the three rules can be applied independently, because I found the relevant code in AggregationGroup.java. If mandatory, hierarchy or joint is not defined, the code just returns, which doesn't influence the other defined rules. I tested it just now, and it works.
4. According to DefaultCuboidScheduler.java:340, if we set 'kylin.cube.aggrgroup.is-mandatory-only-valid', Kylin also generates the cuboid that contains only the mandatory dimensions. But Kylin's configuration document describes it as "kylin.cube.aggrgroup.is-mandatory-only-valid: whether to allow Cube contains only Base Cuboid. The default value is FALSE, set to TRUE when using Spark Cubing", so the doc seems misleading.

Please correct me if something is wrong.

Best regards
yuzhang

On 3/20/2019 23:31, Lu, Kang-Sen wrote:

Hi, Yuzhang:

I found out why, with includes = {D1, … , D10} and mandatory = {D1, …, D9}, we only get one cuboid, {D1, … D10}, and {D1, …, D9} is NOT generated. It is caused by this code in “core-common/src/main/java/org/apache/kylin/common/KylinConfigBase.java”:

    public boolean getCubeAggrGroupIsMandatoryOnlyValid() {
        return Boolean.parseBoolean(getOptional("kylin.cube.aggrgroup.is-mandatory-only-valid", "false"));
    }

In kylin.properties, we did not configure the parameter “kylin.cube.aggrgroup.is-mandatory-only-valid" as “true”, and by default it is set to “false”. So {D1, …, D9} is a so-called “mandatory-only” cuboid and is treated as not valid.

Kang-sen

From: Lu, Kang-Sen
Sent: Wednesday, March 20, 2019 8:32 AM
To: u...@kylin.apache.org; dev@kylin.apache.org
Subject: RE: question related to the aggregation groups configuration

Hi, Yuzhang:

Thank you for taking the time to respond. I did read this requirement for “mandatory dimension”: ("if a dimension is specified as “mandatory”, then all of the combinations without such dimension can be pruned"). That is the key piece of information.

BTW: I am curious whether there is an easy way to find out from Kylin’s metadata how many cuboids Kylin generates and what every cuboid’s dimension set is.

Your finding is what I suspected, but I was not able to verify it as you did. We can live with this behavior as is, if it is documented. But it would be better to fix the bug and let the original description of mandatory stand as correct.

About Q2, I read the link you mentioned. It seems that if hierarchy and joint are both specified, then the joint is treated as a tag-along restriction: say D2 is in a hierarchy and becomes “mandatory” in cuboids; if joint says D2 and D3 must be together, then D2 will pull D3 into the “mandatory” list. That is elegant.

I am wondering why these three selection rules can NOT be applied independently. Say we have D1-to-D10 in the includes set. Then the number of all combinations of D1-to-D10 is factorial(10). Now we can apply “mandatory” to “prune” some of the combinations. After that, we may prune further by applying the hierarchy and joint rules. Isn’t that possible?

Thanks.

Kang-sen
Re: question related to the aggregation groups configuration
Hi Kang-sen:

I did some tests for Q1. {D1 to D10} were included in an aggregation group and {D1 to D9} were set as mandatory dimensions. Kylin then generated only the cuboid {D1 to D10} (the base cuboid), whereas I expected {D1 to D10} and {D1 to D9}. When I set {D1 to D8} as mandatory dimensions, Kylin generated the cuboids {D1 to D10}, {D1 to D8, D9} and {D1 to D8, D10}, whereas I expected {D1 to D10}, {D1 to D8, D9}, {D1 to D8, D10} and {D1 to D8}.

So for your Q1, I think the answer is that ONLY ONE cuboid, {D1 to D10}, is generated. But according to the blog ("if a dimension is specified as “mandatory”, then all of the combinations without such dimension can be pruned"), the cuboid {D1 to D9} shouldn't be pruned. Maybe someone else can give more detail.

Q2 is similar to this email https://lists.apache.org/thread.html/3ccc8d7f98748d7c590c01c7da6ce666a16c4fe2b34be070940cae8f@%3Cuser.kylin.apache.org%3E and the JIRA https://issues.apache.org/jira/browse/KYLIN-2149 . Currently Kylin prevents configuring overlapping hierarchy, mandatory and joint dimensions. Although the ideas behind the three aggregation rules are different and even contradictory, automatically merging those rules into cuboids is feasible. For now, the restrictions of the aggregation group can't satisfy your requirement, which I think is a common one. Maybe the JIRA KYLIN-2149 will be resolved in the future.

Best regards
yuzhang

On 3/19/2019 23:09, Lu, Kang-Sen wrote:

Hi, Yuzhang:

I would appreciate it if you could answer my 2 questions. Thanks.

Kang-sen

From: Lu, Kang-Sen
Sent: Friday, March 15, 2019 8:15 AM
To: u...@kylin.apache.org
Subject: RE: question related to the aggregation groups configuration

Hi, Yuzhang:

Thanks for taking the time to reply. I had actually read that article several times before. However, maybe I missed some details; I am not clear about how those rules actually work and how they interfere with each other.

In the article you pointed out, the hierarchy rule does have an example, so it is less likely to be confusing. I did not find any discussion of the “mandatory rule”. It is supposed to be very simple, but I am stuck on the details.

Let’s say “includes” is the set of dimensions {D1, D2, … D10}, and “mandatory” is the set of dimensions {D1, …, D9}. It is obvious that every cuboid generated from this aggregation group should include the set {D1, …, D9}. Now, D10 could be either selected or not. So the natural guess is that this aggregation group will generate two cuboids, i.e. {D1,…,D9} and {D1,… D10}. Is this what Kylin will do?

Another detail I am not clear about is the interaction of the “joint rule” and the “mandatory rule”. It seems that there is an interaction between these two rules. I am not clear why, and it is not discussed in the article you mentioned.

Those were my two original questions. Thanks again.

Kang-sen

From: yuzhang
Sent: Friday, March 15, 2019 7:46 AM
To: u...@kylin.apache.org
Subject: Re: question related to the aggregation groups configuration

Hi Kang-sen:

Here is a blog about the ideas behind aggregation groups. I hope it will help you. https://kylin.apache.org/blog/2016/02/18/new-aggregation-group/

Best regards
yuzhang

On 3/14/2019 21:21, Lu, Kang-Sen wrote:

I am running Kylin 2.5.1.

I have two questions related to the aggregation group configuration. In the Kylin GUI, select “Model”, then edit a cube design; under “Grid”, select “Advanced Setting”, where we can enter multiple “Aggregation Groups”. Each “Aggregation Group” can specify zero, one, or many cuboids through the combination of dimensions.

Q1: If I want one and only one cuboid to be created with dimension set = {D1, D2, … , D10}, is it correct to enter D1-to-D10 in the “includes” list and D1-to-D9 in the “Mandatory Dimensions” list? The key question is: will Kylin generate two cuboids, i.e. {D1, …, D9} and {D1, … , D10}, or just one cuboid?

Q2: If I enter D1-to-D10 in the “includes” list and {D1, D2} in the “Joint Dimensions” list, then I can’t enter either D1 or D2 in the “Mandatory Dimensions” list. I was thinking that if I entered {D1, D3, … , D9} in the “Mandatory Dimensions” and {D1, D2} in the “Joint Dimensions”, then only one cuboid should be generated, for {D1, D2, …, D10}. Why is this not allowed?

Maybe the doc describes this, but it is not clear to me exactly how Kylin processes the information entered in “includes”, “Mandatory Dimensions” and “Joint Dimensions”. Can someone either point me to some documentation or answer the questions above?
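The cuboid counts reported in this thread can be reproduced with a small bitmask model, sketched below under stated assumptions. It is not Kylin's DefaultCuboidScheduler: it handles only "includes" and "mandatory", ignores hierarchy and joint rules, and models the `kylin.cube.aggrgroup.is-mandatory-only-valid` switch as a plain boolean. `MandatoryCuboidSketch` and `cuboids` are hypothetical names.

```java
import java.util.ArrayList;
import java.util.List;

/** Simplified model of cuboid enumeration with mandatory dimensions. */
final class MandatoryCuboidSketch {

    /**
     * Enumerates cuboids as bitmasks over n dimensions (bit i stands for D(i+1)).
     * Every cuboid must contain all bits of mandatoryMask. The cuboid that
     * consists of exactly the mandatory dimensions is kept only when
     * mandatoryOnlyValid is true; the base cuboid is always kept.
     */
    static List<Integer> cuboids(int n, int mandatoryMask, boolean mandatoryOnlyValid) {
        List<Integer> result = new ArrayList<>();
        int full = (1 << n) - 1;
        for (int mask = 1; mask <= full; mask++) {
            if ((mask & mandatoryMask) != mandatoryMask) {
                continue; // a mandatory dimension is missing: pruned
            }
            if (mask == mandatoryMask && mask != full && !mandatoryOnlyValid) {
                continue; // "mandatory-only" cuboid, treated as not valid
            }
            result.add(mask);
        }
        return result;
    }

    public static void main(String[] args) {
        int d1ToD9 = (1 << 9) - 1;
        int d1ToD8 = (1 << 8) - 1;

        System.out.println(1 << 10);                             // subsets of 10 dims, empty set included
        System.out.println(cuboids(10, d1ToD9, false).size());   // mandatory D1..D9, default switch
        System.out.println(cuboids(10, d1ToD8, false).size());   // mandatory D1..D8, default switch
        System.out.println(cuboids(10, d1ToD8, true).size());    // mandatory D1..D8, switch on
    }
}
```

Under this model, mandatory {D1 to D9} with the default switch leaves only the base cuboid, and mandatory {D1 to D8} leaves three cuboids, or four once the mandatory-only cuboid is allowed, which is the pattern described in the messages above.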
[Discussion]Does 'UNION ALL' support query on two fact table ?
Hi dear all:

A simple question, as described in the mail title.

Best regards
yuzhang
[jira] [Created] (KYLIN-3890) Add doc about usage of ./bin/metadata.sh
Yuzhang QIU created KYLIN-3890:

Summary: Add doc about usage of ./bin/metadata.sh
Key: KYLIN-3890
URL: https://issues.apache.org/jira/browse/KYLIN-3890
Project: Kylin
Issue Type: Improvement
Components: Documentation
Affects Versions: v2.5.2
Reporter: Yuzhang QIU

The JIRA title describes the issue.
[jira] [Created] (KYLIN-3875) During cube or model design, from 1th-step jump to 4th-step doesn't check validity of step between 1th-step and 4th-step, which click next button does.
Yuzhang QIU created KYLIN-3875:

Summary: During cube or model design, from 1th-step jump to 4th-step doesn't check validity of step between 1th-step and 4th-step, which click next button does.
Key: KYLIN-3875
URL: https://issues.apache.org/jira/browse/KYLIN-3875
Project: Kylin
Issue Type: Bug
Components: Web
Affects Versions: v2.5.2
Reporter: Yuzhang QIU

Hi dear team:

I found a minor problem in the webapp. When designing a model, I cleared all dimensions and clicked next. An alert window then showed a warning about the empty dimensions. But when I returned to the previous step and clicked the "Measure" step directly, the check was skipped and the model could be saved successfully. The same problem may happen when designing a cube. What do you think about this?

Best regards
yuzhang
Re: [jira] [Commented] (KYLIN-3830) return wrong result when 'SELECT SUM(dim1)' without set a relative metric of dim1.
Sure, I will try my best.

yuzhang

On 3/11/2019 14:35, Shaofeng SHI (JIRA) wrote:

[ https://issues.apache.org/jira/browse/KYLIN-3830?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16789202#comment-16789202 ]

Shaofeng SHI commented on KYLIN-3830:

Yuzhang, if you have found the resolution, you are welcome to raise a PR. Thank you!

return wrong result when 'SELECT SUM(dim1)' without set a relative metric of dim1.

Key: KYLIN-3830
URL: https://issues.apache.org/jira/browse/KYLIN-3830
Project: Kylin
Issue Type: Bug
Affects Versions: v2.5.2
Reporter: Yuzhang QIU
Priority: Major

Hi, dear team:

I designed a cube, cube1, based on table table1 with dim1, dim2, dim3 and only one metric, count(1), and ran 'SELECT SUM(dim1) FROM table1 group by dim2'. Kylin processed this SQL and returned a result, result1. It seems OK. But as we know, Kylin doesn't store the detail data; the dimension members are encoded and stored in HBase as the rowkey (because I didn't define any metric on a column). So, is result1 right?

Then I cloned cube1 to cube2 and added a metric SUM(dim1). The same SQL was sent to Kylin and returned result2, which differs from result1 in the aggregation field. I also ran the same SQL in Hive and got result3, which is the same as result2. Yes, I turned off pushdown. I think there is a problem.

I can't upload pictures of the results because of our confidentiality policy, sorry for that.
回复:kylin内存溢出问题
Hi, wanmin I am interested in this problem and making some research on it. When the query kylin instance down, are the job and all instances running normally? The most number of concurrents are query request? Have you ever redeploy or shutdown kylin web app by hand before the exception occured? Any extra configurations have been set in tomcat? As shishaofeng said, the log information is limited. More log or the way to reproduce the error will be helpful. Best regards yuzhang | | yuzhang | | shifengdefan...@163.com | 签名由网易邮箱大师定制 在2019年3月7日 11:08,wanmin_ws 写道: 你好,请问能解答一下吗。 -- 发件人:wanmin_ws 发送时间:2019年3月6日(星期三) 10:55 收件人:wanmin_ws ; dev 抄 送:dev 主 题:回复:kylin内存溢出问题 kylin 2.5.0 大数据平台HDP,一共5台kylin节点,一台all,一台job,三台query节点。 挂掉的都是query节点,但是这个错误在5个节点上都有报 -- 发件人:yuzhang 发送时间:2019年3月5日(星期二) 17:37 收件人:wanmin_ws 抄 送:dev 主 题:回复:kylin内存溢出问题 Hi, Could you describe your deploy environment and Kylie version and Number of concurrent | | yuzhang | | Email:shifengdefan...@163.com | Signature is customized by Netease Mail Master 在2019年03月05日 17:27,wanmin_ws 写道: 只有查询高峰期的时候会出现这个问题,这个问题和slow query 有没有关系?这个错是kylin.out报出来的,显示是tomcat不能开启更多的session -- 发件人:ShaoFeng Shi 发送时间:2019年3月5日(星期二) 17:23 收件人:dev ; wanmin_ws 抄 送:user 主 题:Re: kylin内存溢出问题 Hi Min, The log information is so limited that we don't know what may caused that. I highly recommend you to do some analysis from the following perspective: 1) check the log files in "logs/" and "tomcat/logs" folder; 2) using jmap and jhat to analysis the heap usage; 3) using jstack to analysis the thread information; 4) check your cube definition to see whether there is some UHC dimension and the dictionary encoding was used for that. 
Best regards,
Shaofeng Shi 史少锋
Apache Kylin PMC
Work email: shaofeng@kyligence.io
Kyligence Inc: https://kyligence.io/
Apache Kylin FAQ: https://kylin.apache.org/docs/gettingstarted/faq.html
Join Kylin user mail group: user-subscr...@kylin.apache.org
Join Kylin dev mail group: dev-subscr...@kylin.apache.org
wanmin_ws wrote on Tuesday, March 5, 2019 at 5:03 PM:
Error description: during peak access periods, Kylin goes down. The log is shown below, but I don't know what to do about it. Could you please take a look? This problem has been troubling us for a long time.
Log:
SEVERE: The web application [/kylin] created a ThreadLocal with key of type [java.lang.ThreadLocal] (value [java.lang.ThreadLocal@6cc63032]) and a value of type [org.apache.kylin.rest.msg.Message] (value [org.apache.kylin.rest.msg.Message@3ad18e2f]) but failed to remove it when the web application was stopped. Threads are going to be renewed over time to try and avoid a probable memory leak.
Mar 01, 2019 9:50:52 AM org.apache.catalina.loader.WebappClassLoaderBase checkThreadLocalMapForLeaks
SEVERE: The web application [/kylin] created a ThreadLocal with key of type [java.lang.ThreadLocal] (value [java.lang.ThreadLocal@76397470]) and a value of type [org.apache.kylin.common.util.ImplementationSwitch] (value [org.apache.kylin.common.util.ImplementationSwitch@26f33af5]) but failed to remove it when the web application was stopped. Threads are going to be renewed over time to try and avoid a probable memory leak.
Mar 01, 2019 9:50:52 AM org.apache.coyote.AbstractProtocol stop
INFO: Stopping ProtocolHandler ["http-bio-7070"]
Mar 01, 2019 9:50:52 AM org.apache.coyote.AbstractProtocol stop
INFO: Stopping ProtocolHandler ["ajp-bio-9009"]
Re: [Discuss] Won't ship Spark binary in Kylin binary anymore
Agree! Downloading the Spark binary when packaging Kylin once confused me.
yuzhang (shifengdefan...@163.com)
On 3/8/2019 10:42, ShaoFeng Shi wrote:
Hello,
As we know, Kylin ships Spark in its binary package. The total package becomes bigger and bigger as the version grows; the latest version (v2.6.1) is bigger than 350MB and was rejected by the Apache SVN server when we tried to upload the new package. Of the 350MB, more than 200MB is Spark, while Spark is not mandatory for Kylin.
So I propose to exclude Spark from Kylin's binary package, starting from the current v2.6.1; the user just needs to point SPARK_HOME to a folder containing the expected Spark version, or manually download Spark and put it in KYLIN_HOME/spark. All other behaviors are not impacted.
Please share your comments, if any.
Best regards,
Shaofeng Shi 史少锋
Apache Kylin PMC
Email: shaofeng...@apache.org
[jira] [Created] (KYLIN-3860) Add doc about configuration of kylin.web.hide-measures
Yuzhang QIU created KYLIN-3860:
Summary: Add doc about configuration of kylin.web.hide-measures
Key: KYLIN-3860
URL: https://issues.apache.org/jira/browse/KYLIN-3860
Project: Kylin
Issue Type: Improvement
Components: Documentation
Affects Versions: v2.5.2
Reporter: Yuzhang QIU
Well, another JIRA about documentation. *kylin.web.hide-measures* can be used to hide some measures (such as TOP_N, Percentile) in some business scenarios. Could the configuration document add some instructions about this config, even though it's easy to understand and use?
-- This message was sent by Atlassian JIRA (v7.6.3#76005)
What's the purpose of 'isDraft' in CubeDesc
Hi team: The frontend web UI doesn't pass the 'is_draft' field to the Kylin server when updating or saving a CubeDesc, and doesn't use its value for any change in the Web UI. The backend then calls setDraft(false) before updating or saving a CubeDesc, and reads isDraft (which must be false at that point) in CubeDescManager before persisting it into the metastore. I don't understand the meaning of this logic. What's the purpose of 'isDraft' in CubeDesc? I sincerely look forward to your reply. Best regards, yuzhang
Re: Kylin out-of-memory problem
Hi, could you describe your deployment environment, your Kylin version and the number of concurrent requests?
yuzhang (shifengdefan...@163.com)
On March 5, 2019 17:27, wanmin_ws wrote:
This problem only occurs during peak query periods. Is it related to slow queries? The error is reported in kylin.out and indicates that Tomcat cannot open more sessions.
--
From: ShaoFeng Shi
Sent: Tuesday, March 5, 2019 17:23
To: dev; wanmin_ws
Cc: user
Subject: Re: Kylin out-of-memory problem
Hi Min, the log information is so limited that we don't know what may have caused that. I highly recommend that you do some analysis from the following perspectives:
1) check the log files in the "logs/" and "tomcat/logs" folders;
2) use jmap and jhat to analyze the heap usage;
3) use jstack to analyze the thread information;
4) check your cube definition to see whether there is a UHC dimension for which dictionary encoding was used.
Best regards,
Shaofeng Shi 史少锋
Apache Kylin PMC
wanmin_ws wrote on Tuesday, March 5, 2019 at 5:03 PM:
> Error description: during peak access periods, Kylin goes down. The log is shown below, but I don't know what to do about it. Could you please take a look? This problem has been troubling us for a long time.
> Log:
> SEVERE: The web application [/kylin] created a ThreadLocal with key of type [java.lang.ThreadLocal] (value [java.lang.ThreadLocal@6cc63032]) and a value of type [org.apache.kylin.rest.msg.Message] (value [org.apache.kylin.rest.msg.Message@3ad18e2f]) but failed to remove it when the web application was stopped. Threads are going to be renewed over time to try and avoid a probable memory leak.
> Mar 01, 2019 9:50:52 AM org.apache.catalina.loader.WebappClassLoaderBase checkThreadLocalMapForLeaks
> SEVERE: The web application [/kylin] created a ThreadLocal with key of type [java.lang.ThreadLocal] (value [java.lang.ThreadLocal@76397470]) and a value of type [org.apache.kylin.common.util.ImplementationSwitch] (value [org.apache.kylin.common.util.ImplementationSwitch@26f33af5]) but failed to remove it when the web application was stopped. Threads are going to be renewed over time to try and avoid a probable memory leak.
> Mar 01, 2019 9:50:52 AM org.apache.coyote.AbstractProtocol stop
> INFO: Stopping ProtocolHandler ["http-bio-7070"]
> Mar 01, 2019 9:50:52 AM org.apache.coyote.AbstractProtocol stop
> INFO: Stopping ProtocolHandler ["ajp-bio-9009"]
[jira] [Created] (KYLIN-3844) some instruction about config 'kylin.metadata.hbasemapping-adapter'
Yuzhang QIU created KYLIN-3844:
Summary: some instruction about config 'kylin.metadata.hbasemapping-adapter'
Key: KYLIN-3844
URL: https://issues.apache.org/jira/browse/KYLIN-3844
Project: Kylin
Issue Type: Improvement
Components: Documentation
Affects Versions: v2.5.2
Reporter: Yuzhang QIU
Hi team: When someone wants to define a custom HBase column family mapping, they need to know how to configure 'kylin.metadata.hbasemapping-adapter'. Although tracing the code in CubeDesc (around line 678) shows how this configuration is used, some official instructions in the documentation would be better. Just a small suggestion. :) Best regards, yuzhang
-- This message was sent by Atlassian JIRA (v7.6.3#76005)
[jira] [Created] (KYLIN-3842) [Defective kylinProperties.js]Unable to get the public configuration of the first line in the front end
Yuzhang QIU created KYLIN-3842:
Summary: [Defective kylinProperties.js]Unable to get the public configuration of the first line in the front end
Key: KYLIN-3842
URL: https://issues.apache.org/jira/browse/KYLIN-3842
Project: Kylin
Issue Type: Bug
Components: Web
Affects Versions: v2.5.2
Reporter: Yuzhang QIU
Hi dear team: I'm developing an OLAP platform based on Kylin 2.5.2. During my work, I found that getProperty(name) at kylinProperties.js:37 can't get the property on the first line of '_config', which is initialized through /admin/public_config. For example, the public config is 'kylin.restclient.connection.default-max-per-route=20\nkylin.restclient.connection.max-total=200\nkylin.engine.default=2\nkylin.storage.default=2\n kylin.web.hive-limit=20\nkylin.web.help.length=4\n'. I expected to get 20 but got '' when looking up the key 'kylin.restclient.connection.default-max-per-route'.
The problem is that 'var keyIndex = _config.indexOf('\n' + name + '=');' (at kylinProperties.js:37) returns -1 for a name on the first line, because no '\n' precedes it. I then debugged AdminService.java and KylinConfig.java and found that exportToString(Collection propertyKeys) (around KylinConfig.java:517) builds the public config string with a '\n' after each property, so the first property has no '\n' before it.
This is what I found, and it will cause problems for developers. What do you think? Best regards, yuzhang
-- This message was sent by Atlassian JIRA (v7.6.3#76005)
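The lookup described in KYLIN-3842 can be reproduced outside the browser. What follows is only a minimal sketch in Python, not the actual kylinProperties.js code: the function names are hypothetical, and the "fix" (searching in a copy of the string with a leading newline) is just one possible approach, not necessarily what the project adopted.

```python
def get_property_as_reported(config: str, name: str) -> str:
    # Mirrors the reported search: look for newline + name + '='.
    # A key on the very first line has no newline in front of it,
    # so find() returns -1 and the lookup yields an empty string.
    key_index = config.find("\n" + name + "=")
    if key_index == -1:
        return ""
    start = key_index + len(name) + 2  # skip the newline, the key and '='
    end = config.find("\n", start)
    return config[start:end if end != -1 else len(config)].strip()


def get_property_with_leading_newline(config: str, name: str) -> str:
    # Hypothetical fix: prepend a newline so every key, including the
    # first one, is preceded by a newline character.
    return get_property_as_reported("\n" + config, name)


# Shortened version of the public config string quoted in the report.
PUBLIC_CONFIG = (
    "kylin.restclient.connection.default-max-per-route=20\n"
    "kylin.restclient.connection.max-total=200\n"
    "kylin.engine.default=2\n"
)

FIRST_KEY = "kylin.restclient.connection.default-max-per-route"
print(repr(get_property_as_reported(PUBLIC_CONFIG, FIRST_KEY)))
print(repr(get_property_with_leading_newline(PUBLIC_CONFIG, FIRST_KEY)))
```

An equivalent change on the server side would be for the exported string to start with a newline, but that would alter the payload of /admin/public_config, so a client-side adjustment looks like the less invasive option.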
[jira] [Created] (KYLIN-3835) [Defective TableSchemaUpdateChecker] Don't check used models when reload table
Yuzhang QIU created KYLIN-3835:
Summary: [Defective TableSchemaUpdateChecker] Don't check used models when reload table
Key: KYLIN-3835
URL: https://issues.apache.org/jira/browse/KYLIN-3835
Project: Kylin
Issue Type: Bug
Components: REST Service
Affects Versions: v2.5.2
Reporter: Yuzhang QIU
1. Load table1 from Hive.
2. Create model1 based on table1 and use table1.column1 as dimension1.
3. Rename table1.column1 to table1.column11 in Hive.
4. Reload table1: it succeeds (this is the bug).
5. Switch to the Model page; model1 still exists. I can create cube1 based on model1 and launch a build job; of course, the job fails after a period of time (can't find table1.column1, etc.).
6. Reload metadata on the System page: model1 disappears from the Web UI, cube1 changes to DESCBROKEN and can't be deleted due to "null" (tracing the log, I found it's caused by a null DataModelDesc in CubeInstance).
7. I wanted to recreate model1, but Kylin told me model1 already exists in the current project. Using 'sh bin/metadata.sh backup', I found that model1's metadata is still stored in HBase.
8. I read through the code: the reload-table validation is done in TableSchemaUpdateChecker.allowLoad(), but it only checks the cubes that use the table. If a model uses the changed table but no cube is based on that model, the table can be reloaded successfully! I think it shouldn't behave like this.
Best regards
-- This message was sent by Atlassian JIRA (v7.6.3#76005)
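The gap described in step 8 can be illustrated with a small model of the check. This is a hedged sketch in Python, not Kylin's TableSchemaUpdateChecker: the ModelRef and CubeRef types and both functions are hypothetical, and the only behavior taken from the report is that the current check looks at cubes and ignores models that have no cube.

```python
from dataclasses import dataclass
from typing import FrozenSet, List, Sequence


@dataclass(frozen=True)
class ModelRef:
    # Hypothetical stand-in for a data model: its name and the
    # fully qualified columns it uses, e.g. "table1.column1".
    name: str
    columns: FrozenSet[str]


@dataclass(frozen=True)
class CubeRef:
    # Hypothetical stand-in for a cube, which always sits on a model.
    name: str
    model: ModelRef


def missing_columns(table: str, new_columns: FrozenSet[str], model: ModelRef) -> List[str]:
    # Columns of `table` that the model uses but the reloaded schema lacks.
    available = {f"{table}.{column}" for column in new_columns}
    return sorted(c for c in model.columns
                  if c.startswith(table + ".") and c not in available)


def allow_reload_cubes_only(table: str, new_columns: FrozenSet[str],
                            cubes: Sequence[CubeRef],
                            models: Sequence[ModelRef]) -> bool:
    # Behavior as described in the report: only models reachable
    # through a cube are validated; `models` is ignored.
    return all(not missing_columns(table, new_columns, cube.model) for cube in cubes)


def allow_reload_with_models(table: str, new_columns: FrozenSet[str],
                             cubes: Sequence[CubeRef],
                             models: Sequence[ModelRef]) -> bool:
    # Suggested behavior: validate every model, with or without a cube.
    to_check = list(models) + [cube.model for cube in cubes]
    return all(not missing_columns(table, new_columns, model) for model in to_check)


# Steps 1-4 of the report: model1 uses table1.column1, no cube exists yet,
# and the column has been renamed to column11 in Hive.
model1 = ModelRef("model1", frozenset({"table1.column1"}))
reloaded_schema = frozenset({"column11"})

print(allow_reload_cubes_only("table1", reloaded_schema, [], [model1]))
print(allow_reload_with_models("table1", reloaded_schema, [], [model1]))
```

With no cube defined, the cubes-only check has nothing to validate and lets the reload through, which matches step 4; checking the models as well rejects it before model1 is left pointing at a column that no longer exists.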
[jira] [Created] (KYLIN-3830) return wrong result when 'SELECT SUM(dim1)' without set a relative metric of dim1.
Yuzhang QIU created KYLIN-3830:
Summary: return wrong result when 'SELECT SUM(dim1)' without set a relative metric of dim1.
Key: KYLIN-3830
URL: https://issues.apache.org/jira/browse/KYLIN-3830
Project: Kylin
Issue Type: Bug
Affects Versions: v2.5.2
Reporter: Yuzhang QIU
Hi, dear team: I designed cube1 based on table table1 with dimensions dim1, dim2, dim3 and only one metric, count(1), then ran 'SELECT SUM(dim1) FROM table1 group by dim2'. Kylin processed this SQL and returned a result, result1. It seems OK. But as we know, Kylin doesn't store the detail data; the dimension members are encoded and stored in HBase as the rowkey (because I didn't define any metric on a column). So, is result1 right?
Then I cloned cube1 to cube2 and added a metric SUM(dim1). The same SQL was sent to Kylin and returned result2, which differs from result1 in the aggregated field. I also ran the same SQL in Hive and got result3, which is the same as result2. Yes, I turned off pushdown. I think there is a problem here. I can't upload screenshots of the results because of our confidentiality policy, sorry for that.
-- This message was sent by Atlassian JIRA (v7.6.3#76005)
Re: development environment
Maybe you can log in to the Sandbox and run "hive" or "beeline" by hand to start the Hive command line and test the Hive dependency.
shifengdefannao (shifengdefan...@163.com)
On 02/22/2019 11:01, XiaoHui Zhang wrote:
Hi, dear team, I am a beginner with Kylin and I am setting up a Kylin development environment with the HDP Sandbox. But when I run $KYLIN_HOME/bin/kylin.sh start, the following errors occur:
[root@sandbox-hdp apache-ktlin-2.6.0-bin-hadoop3]#./bin/kylin.sh start
Retrieving hadoop conf dir...
KYLIN_HOME is set to /usr/local/apache-kylin-2.6.0-bin-hadoop3
Retrieving hive dependency...
Something wrong with Hive CLI or Beline,please execute Hive CLI or Beeline CLI in termina
Do I need to make any changes to Kylin's configuration file? If not, why did this happen? I look forward to any reply.
I wonder about the qualifier assignment within an column family for measures
Hi, dear team, when I read the source code for HTable creation and CubeDescCreator, I found there is only one column ("M") per column family (F1, F2, F3), and the column "M" contains all the measures assigned to that column family. HBaseReadonlyStore then loads the whole column (or Cell) into a buffer on the region server and returns only the selected measures to the query server. I wonder why Kylin doesn't assign one column qualifier (like M1, M2) per measure? If my understanding of this code is incorrect, please let me know. I look forward to any reply.
shifengdefannao (shifengdefan...@163.com)
[jira] [Created] (KYLIN-3783) The mapreduce.map.java.opts config in kylin_job_conf_inmem.xml overrides the krb5.conf config in Cluster
yuzhang qiu created KYLIN-3783:
Summary: The mapreduce.map.java.opts config in kylin_job_conf_inmem.xml overrides the krb5.conf config in Cluster
Key: KYLIN-3783
URL: https://issues.apache.org/jira/browse/KYLIN-3783
Project: Kylin
Issue Type: Bug
Affects Versions: v2.5.2
Environment: hadoop 2.7
Reporter: yuzhang qiu
In our cluster, we use Kerberos for authentication and set `-Djava.security.krb5.conf` in `mapreduce.map.java.opts`. But the default value in kylin_job_conf_inmem.xml is `-Xmx2700m -XX:OnOutOfMemoryError='kill -9 %p'`, which overrides the krb5 setting when the cubing algorithm is **inmem**. As a result, we got `Caused by: KrbException: Cannot locate default realm `
-- This message was sent by Atlassian JIRA (v7.6.3#76005)
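The override reported in KYLIN-3783 comes down to one property value replacing another wholesale rather than being merged with it. The sketch below models that in Python under stated assumptions: it is not Hadoop or Kylin code, the function names are hypothetical, and the cluster-level value (including the /etc/krb5.conf path) is made up for illustration. Only the job-level value is taken from the report.

```python
KRB5_FLAG = "-Djava.security.krb5.conf="


def effective_opts(cluster_opts: str, job_opts: str) -> str:
    # As the report describes it: when the job-level file sets the
    # property, its value replaces the cluster-level value entirely.
    return job_opts or cluster_opts


def job_opts_with_krb5(cluster_opts: str, job_opts: str) -> str:
    # Hypothetical workaround: carry the cluster's krb5 flag over into
    # the job-level value unless the job already sets one.
    krb5 = [token for token in cluster_opts.split() if token.startswith(KRB5_FLAG)]
    if not krb5 or KRB5_FLAG in job_opts:
        return job_opts
    return f"{krb5[0]} {job_opts}"


# Assumed cluster-level value; the path is illustrative only.
CLUSTER_OPTS = "-Xmx1024m -Djava.security.krb5.conf=/etc/krb5.conf"
# Default value of kylin_job_conf_inmem.xml as quoted in the report.
INMEM_OPTS = "-Xmx2700m -XX:OnOutOfMemoryError='kill -9 %p'"

print(effective_opts(CLUSTER_OPTS, INMEM_OPTS))
print(job_opts_with_krb5(CLUSTER_OPTS, INMEM_OPTS))
```

In practice the same effect would presumably be achieved by editing the `mapreduce.map.java.opts` value in kylin_job_conf_inmem.xml by hand so that it also contains the site's `-Djava.security.krb5.conf` flag.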
[jira] [Created] (KYLIN-3723) Can't find bad query configuration in kylin config doc
yuzhang qiu created KYLIN-3723:
Summary: Can't find bad query configuration in kylin config doc
Key: KYLIN-3723
URL: https://issues.apache.org/jira/browse/KYLIN-3723
Project: Kylin
Issue Type: Improvement
Components: Documentation
Affects Versions: v2.5.2
Reporter: yuzhang qiu
Well, I want to define my own threshold for bad (slow) queries in Kylin, but can't find the related configuration in the Kylin config document. However, I found the config properties by tracing the code (KylinConfigBase.java). So I wonder why the document doesn't cover the bad query config?
-- This message was sent by Atlassian JIRA (v7.6.3#76005)