[jira] [Updated] (KYLIN-2776) Using dropwizard as default metric framework
[ https://issues.apache.org/jira/browse/KYLIN-2776?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

yiming.xu updated KYLIN-2776:
    Attachment: KYLIN-2776.patch

add patch

> Using dropwizard as default metric framework
>
> Key: KYLIN-2776
> URL: https://issues.apache.org/jira/browse/KYLIN-2776
> Project: Kylin
> Issue Type: New Feature
> Affects Versions: v2.0.0
> Reporter: yiming.xu
> Assignee: yiming.xu
> Attachments: active_calls.png, calls.png, KYLIN-2776.patch, metric_structure.png, query_count.png, query_duration.png, query_result_rowcount.png, report.json
>
> Following https://issues.apache.org/jira/browse/KYLIN-2721, we plan to release a new metric framework.
> The new framework differs from Hadoop metrics and is based on Dropwizard, which has the following advantages:
> * Well-defined metric model for frequently needed metrics (e.g. JVM metrics)
> * Well-defined measurements for all metrics (e.g. max, mean, stddev, mean_rate)
> * Built-in pluggable reporting frameworks such as JMX, Console, Log, and JSON
> We refactored QueryMetric to use the new metrics; note that the exposed JMX MBeans have changed slightly.
> A new tool called perflog is also introduced. Perflog traces call duration and the number of currently active calls by recording them to the metric system.
> Some snapshots of the new JMX MBeans can be seen in the attachments.

--
This message was sent by Atlassian JIRA
(v6.4.14#64029)
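The perflog idea described in the issue (record call duration and the number of currently active calls) can be illustrated with a small standard-library sketch. This is not the code from KYLIN-2776.patch: the class name `PerfLogSketch` and all of its methods are hypothetical, and plain JDK counters stand in for the Dropwizard timer and counter metrics that the real framework would presumably use.

```java
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.atomic.AtomicLong;
import java.util.concurrent.atomic.LongAdder;
import java.util.function.Supplier;

/** Hypothetical perflog-style tracer; an illustration only, not Kylin's class. */
final class PerfLogSketch {
    /** Per-call-name statistics. */
    private static final class Stats {
        final LongAdder calls = new LongAdder();      // completed calls
        final LongAdder totalNanos = new LongAdder(); // summed duration of completed calls
        final AtomicLong active = new AtomicLong();   // calls currently in flight
    }

    private final Map<String, Stats> stats = new ConcurrentHashMap<>();

    /** Runs the body, recording its duration and tracking it as an active call. */
    <T> T trace(String name, Supplier<T> body) {
        Stats s = stats.computeIfAbsent(name, k -> new Stats());
        s.active.incrementAndGet();
        long start = System.nanoTime();
        try {
            return body.get();
        } finally {
            // Recorded even when the body throws, so active calls never leak.
            s.totalNanos.add(System.nanoTime() - start);
            s.calls.increment();
            s.active.decrementAndGet();
        }
    }

    long callCount(String name) {
        Stats s = stats.get(name);
        return s == null ? 0 : s.calls.sum();
    }

    long activeCalls(String name) {
        Stats s = stats.get(name);
        return s == null ? 0 : s.active.get();
    }

    /** Mean duration in milliseconds, or 0.0 if nothing was recorded. */
    double meanMillis(String name) {
        Stats s = stats.get(name);
        long n = s == null ? 0 : s.calls.sum();
        return n == 0 ? 0.0 : s.totalNanos.sum() / 1e6 / n;
    }

    public static void main(String[] args) {
        PerfLogSketch log = new PerfLogSketch();
        String result = log.trace("query", () -> "42 rows");
        System.out.println(result + ", calls=" + log.callCount("query")
                + ", active=" + log.activeCalls("query"));
    }
}
```

A metrics library would add what this sketch lacks, such as min/max/stddev, rates, and reporters like JMX or JSON, which are the advantages the issue description lists.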
[jira] [Updated] (KYLIN-2776) Using dropwizard as default metric framework
[ https://issues.apache.org/jira/browse/KYLIN-2776?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] hongbin ma updated KYLIN-2776: -- Description: With https://issues.apache.org/jira/browse/KYLIN-2721.We are plan to release a new metric framework. New metric is different hadoop metric and based on dropwizard . which has the following advantage: * Well-defined metric model for frequently-needed metrics (ie JVM metrics) * Well-defined measurements for all metrics (ie max, mean, stddev, mean_rate, etc), * Built-in pluggable reporting frameworks like JMX, Console, Log, JSON We refactored QueryMetric with new metrics, notice the exposed JMX MBeans have changed a little bit. A new tool called perflog is also introduced. Perflog traces call duration time and current active calls by recording them to metric system. Some snapshots of the new JMX MBeans can be seen in attachments was: With https://issues.apache.org/jira/browse/KYLIN-2721.We are plan to release a new metric framework. New metric is different hadoop metric and based on dropwizard . which has the following advantage: * Well-defined metric model for frequently-needed metrics (ie JVM metrics) * Well-defined measurements for all metrics (ie max, mean, stddev, mean_rate, etc), * Built-in pluggable reporting frameworks like JMX, Console, Log, JSON We refactor QueryMetric with new metris. New metric add perflog. Perflog trace calls duration time and current active calls by recording them to metric system. Attachment is the difference between the two metric system . 
[jira] [Updated] (KYLIN-2776) Using dropwizard as default metric framework
[ https://issues.apache.org/jira/browse/KYLIN-2776?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] hongbin ma updated KYLIN-2776: -- Summary: Using dropwizard as default metric framework (was: New metric framework with kylin) > Using dropwizard as default metric framework > > > Key: KYLIN-2776 > URL: https://issues.apache.org/jira/browse/KYLIN-2776 > Project: Kylin > Issue Type: New Feature >Affects Versions: v2.0.0 >Reporter: yiming.xu >Assignee: yiming.xu > Attachments: active_calls.png, calls.png, metric_structure.png, > query_count.png, query_duration.png, query_result_rowcount.png, report.json > > > With https://issues.apache.org/jira/browse/KYLIN-2721.We are plan to release > a new metric framework. > New metric is different hadoop metric and based on dropwizard . which has > the following advantage: > * Well-defined metric model for frequently-needed metrics (ie JVM metrics) > * Well-defined measurements for all metrics (ie max, mean, stddev, > mean_rate, etc), > * Built-in pluggable reporting frameworks like JMX, Console, Log, JSON > We refactor QueryMetric with new metris. > New metric add perflog. Perflog trace calls duration time and current > active calls by recording them to metric system. > Attachment is the difference between the two metric system . -- This message was sent by Atlassian JIRA (v6.4.14#64029)
[jira] [Updated] (KYLIN-2776) New metric framework with kylin
[ https://issues.apache.org/jira/browse/KYLIN-2776?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] hongbin ma updated KYLIN-2776: -- Description: With https://issues.apache.org/jira/browse/KYLIN-2721.We are plan to release a new metric framework. New metric is different hadoop metric and based on dropwizard . which has the following advantage: * Well-defined metric model for frequently-needed metrics (ie JVM metrics) * Well-defined measurements for all metrics (ie max, mean, stddev, mean_rate, etc), * Built-in pluggable reporting frameworks like JMX, Console, Log, JSON We refactor QueryMetric with new metris. New metric add perflog. Perflog trace calls duration time and current active calls by recording them to metric system. Attachment is the difference between the two metric system . was: With https://issues.apache.org/jira/browse/KYLIN-2721.We are plan to release a new metric framework. New metric is different hadoop metric and based on dropwizard . which has the following advantage: * Well-defined metric model for frequently-needed metrics (ie JVM metrics) * Well-defined measurements for all metrics (ie max, mean, stddev, mean_rate, etc), * Built-in pluggable reporting frameworks like JMX, Console, Log, JSON We refactor QueryMetric with new metris. New metric add perflog. Perflog trace calls duration time and current active calls record to metric system. Attachment is the difference between the two metric system . > New metric framework with kylin > --- > > Key: KYLIN-2776 > URL: https://issues.apache.org/jira/browse/KYLIN-2776 > Project: Kylin > Issue Type: New Feature >Affects Versions: v2.0.0 >Reporter: yiming.xu >Assignee: yiming.xu > Attachments: active_calls.png, calls.png, metric_structure.png, > query_count.png, query_duration.png, query_result_rowcount.png, report.json > > > With https://issues.apache.org/jira/browse/KYLIN-2721.We are plan to release > a new metric framework. 
> New metric is different hadoop metric and based on dropwizard . which has > the following advantage: > * Well-defined metric model for frequently-needed metrics (ie JVM metrics) > * Well-defined measurements for all metrics (ie max, mean, stddev, > mean_rate, etc), > * Built-in pluggable reporting frameworks like JMX, Console, Log, JSON > We refactor QueryMetric with new metris. > New metric add perflog. Perflog trace calls duration time and current > active calls by recording them to metric system. > Attachment is the difference between the two metric system . -- This message was sent by Atlassian JIRA (v6.4.14#64029)
[jira] [Created] (KYLIN-2776) new metric framework with kylin
yiming.xu created KYLIN-2776:

Summary: new metric framework with kylin
Key: KYLIN-2776
URL: https://issues.apache.org/jira/browse/KYLIN-2776
Project: Kylin
Issue Type: New Feature
Affects Versions: v2.0.0
Reporter: yiming.xu
Assignee: yiming.xu
Attachments: active_calls.png, calls.png, metric_structure.png, query_count.png, query_duration.png, query_result_rowcount.png, report.json

Following https://issues.apache.org/jira/browse/KYLIN-2721, we plan to release a new metric framework.
The new framework differs from Hadoop metrics and is based on Dropwizard, which has the following advantages:
* Well-defined metric model for frequently needed metrics (e.g. JVM metrics)
* Well-defined measurements for all metrics (e.g. max, mean, stddev, mean_rate)
* Built-in pluggable reporting frameworks such as JMX, Console, Log, and JSON
We refactored QueryMetric to use the new metrics.
The new framework adds perflog. Perflog traces call duration and the number of currently active calls and records them to the metric system.
The attachments show the difference between the two metric systems.
[jira] [Updated] (KYLIN-2776) New metric framework with kylin
[ https://issues.apache.org/jira/browse/KYLIN-2776?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] yiming.xu updated KYLIN-2776: - Summary: New metric framework with kylin (was: new metric framework with kylin) > New metric framework with kylin > --- > > Key: KYLIN-2776 > URL: https://issues.apache.org/jira/browse/KYLIN-2776 > Project: Kylin > Issue Type: New Feature >Affects Versions: v2.0.0 >Reporter: yiming.xu >Assignee: yiming.xu > Attachments: active_calls.png, calls.png, metric_structure.png, > query_count.png, query_duration.png, query_result_rowcount.png, report.json > > > With https://issues.apache.org/jira/browse/KYLIN-2721.We are plan to release > a new metric framework. > New metric is different hadoop metric and based on dropwizard . which has > the following advantage: > * Well-defined metric model for frequently-needed metrics (ie JVM metrics) > * Well-defined measurements for all metrics (ie max, mean, stddev, > mean_rate, etc), > * Built-in pluggable reporting frameworks like JMX, Console, Log, JSON > We refactor QueryMetric with new metris. > New metric add perflog. Perflog trace calls duration time and current > active calls record to metric system. > Attachment is the difference between the two metric system . -- This message was sent by Atlassian JIRA (v6.4.14#64029)
[jira] [Commented] (KYLIN-2603) Try push 'having' filter down to storage
[ https://issues.apache.org/jira/browse/KYLIN-2603?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16111870#comment-16111870 ]

albertoramon commented on KYLIN-2603:

Has this been committed to 2.1? Can it be closed?

> Try push 'having' filter down to storage
>
> Key: KYLIN-2603
> URL: https://issues.apache.org/jira/browse/KYLIN-2603
> Project: Kylin
> Issue Type: New Feature
> Reporter: liyang
>
> We know pushing filters down to storage is good and have done that for the 'where' filter. Is it possible to push the 'having' filter down to storage as well?
[jira] [Created] (KYLIN-2775) Streaming Cube Sample
Billy Liu created KYLIN-2775:

Summary: Streaming Cube Sample
Key: KYLIN-2775
URL: https://issues.apache.org/jira/browse/KYLIN-2775
Project: Kylin
Issue Type: New Feature
Components: General
Reporter: Billy Liu
Assignee: Billy Liu

sample.sh generates a sample table, model, and cube for a Hive-based data source. There is no equally easy way to generate them for a Kafka-based streaming cube. This issue will provide an easy-to-use streaming sample. It assumes the user already has Kafka installed.
[jira] [Commented] (KYLIN-2774) ACL inherit only works when creating model/cube
[ https://issues.apache.org/jira/browse/KYLIN-2774?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16110773#comment-16110773 ]

jiatao.tao commented on KYLIN-2774:

If you create a model/cube first and then grant an ACL to a user, that ACL will not be inherited.

> ACL inherit only works when creating model/cube
>
> Key: KYLIN-2774
> URL: https://issues.apache.org/jira/browse/KYLIN-2774
> Project: Kylin
> Issue Type: Bug
> Reporter: jiatao.tao
> Assignee: jiatao.tao
[jira] [Created] (KYLIN-2774) ACL inherit only works when creating model/cube
jiatao.tao created KYLIN-2774: - Summary: ACL inherit only works when creating model/cube Key: KYLIN-2774 URL: https://issues.apache.org/jira/browse/KYLIN-2774 Project: Kylin Issue Type: Bug Reporter: jiatao.tao Assignee: jiatao.tao -- This message was sent by Atlassian JIRA (v6.4.14#64029)
[jira] [Commented] (KYLIN-2720) Should not allow user to access to all tables' metadata of a project
[ https://issues.apache.org/jira/browse/KYLIN-2720?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16110729#comment-16110729 ]

qiumingming commented on KYLIN-2720:

OK, I will follow this issue.

> Should not allow user to access to all tables' metadata of a project
>
> Key: KYLIN-2720
> URL: https://issues.apache.org/jira/browse/KYLIN-2720
> Project: Kylin
> Issue Type: Improvement
> Reporter: qiumingming
> Assignee: qiumingming
> Fix For: v2.1.0
> Attachments: KYLIN-2720.patch
>
> Currently, a user can access the metadata of all tables and columns in a project as long as he can access that project, which is not reasonable. A user should only be allowed to access the tables that his own cubes depend on. However, in the current version the user can also see other tables in the web UI.
[jira] [Commented] (KYLIN-2646) Project level query authorization
[ https://issues.apache.org/jira/browse/KYLIN-2646?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16110728#comment-16110728 ]

qiumingming commented on KYLIN-2646:

OK, I will follow this issue.

> Project level query authorization
>
> Key: KYLIN-2646
> URL: https://issues.apache.org/jira/browse/KYLIN-2646
> Project: Kylin
> Issue Type: Improvement
> Reporter: hongbin ma
> Assignee: hongbin ma
> Fix For: v2.1.0
>
> As we introduced ad-hoc queries in https://issues.apache.org/jira/browse/KYLIN-2515, we'll need to adjust query authorization as follows:
> Query authorization is encouraged to be set at the project level. If someone is assigned READ permission on a project, he can query all tables in the project, whether through ad-hoc queries or cubes.
> If a user has READ permission on cubes but no READ permission on the project, he can issue a query only if it can be satisfied by the cubes he has READ permission on.
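The two authorization rules in this issue can be written as a small decision function. The sketch below is only a reading of the description, not Kylin's implementation: `QueryAuthSketch`, `mayQuery`, and the way candidate cubes are passed in are all hypothetical.

```java
import java.util.Arrays;
import java.util.Collections;
import java.util.HashSet;
import java.util.Set;

/** Hypothetical model of the project-level query authorization rules. */
final class QueryAuthSketch {
    /**
     * @param hasProjectRead whether the user holds READ on the project
     * @param readableCubes  cubes the user holds READ on
     * @param candidateCubes cubes able to satisfy the query (empty when only
     *                       an ad-hoc query could answer it)
     */
    static boolean mayQuery(boolean hasProjectRead,
                            Set<String> readableCubes,
                            Set<String> candidateCubes) {
        // Rule 1: project-level READ covers all tables, via ad-hoc or cubes.
        if (hasProjectRead) {
            return true;
        }
        // Rule 2: otherwise the query must be answerable by a readable cube.
        for (String cube : candidateCubes) {
            if (readableCubes.contains(cube)) {
                return true;
            }
        }
        return false;
    }

    public static void main(String[] args) {
        Set<String> readable = new HashSet<>(Arrays.asList("sales_cube"));
        System.out.println(mayQuery(false, readable, Collections.singleton("sales_cube")));
        System.out.println(mayQuery(false, readable, Collections.<String>emptySet()));
    }
}
```

Under this reading, a user with only cube-level READ can never run an ad-hoc query, because an ad-hoc query has no candidate cube.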
[jira] [Issue Comment Deleted] (KYLIN-2723) Introduce metrics collector for query & job metrics
[ https://issues.apache.org/jira/browse/KYLIN-2723?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Zhong Yanghong updated KYLIN-2723:
    Comment: was deleted

(was: Hi guys,

This is for voting on which metrics framework Kylin metrics should be based on.

Best regards,
Yanghong Zhong

On 7/21/17, 4:09 PM, "liyang (JIRA)" wrote:

[ https://issues.apache.org/jira/browse/KYLIN-2723?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16095934#comment-16095934 ]

liyang commented on KYLIN-2723:

Can we start by listing the options and their pros and cons? What I have heard of so far are:
- Yammer (Coda Hale)
- Hadoop Metrics2
How different are they? Or are they simply interchangeable?

> Introduce metrics collector for query & job metrics
>
> Key: KYLIN-2723
> URL: https://issues.apache.org/jira/browse/KYLIN-2723
> Project: Kylin
> Issue Type: Sub-task
> Affects Versions: v2.0.0
> Reporter: Zhong Yanghong
> Assignee: Zhong Yanghong
> Attachments: APACHE-KYLIN-2723+APACHE-KYLIN-2722.patch
)

> Introduce metrics collector for query & job metrics
>
> Key: KYLIN-2723
> URL: https://issues.apache.org/jira/browse/KYLIN-2723
> Project: Kylin
> Issue Type: Sub-task
> Affects Versions: v2.0.0
> Reporter: Zhong Yanghong
> Assignee: Zhong Yanghong
> Attachments: APACHE-KYLIN-2723+APACHE-KYLIN-2722.patch
[jira] [Commented] (KYLIN-2722) Introduce a new measure, called active reservoir, for actively pushing metrics to reporters
[ https://issues.apache.org/jira/browse/KYLIN-2722?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16110664#comment-16110664 ]

Zhong Yanghong commented on KYLIN-2722:

Hi [~liyang.g...@gmail.com], the package here is for Kylin metrics rather than for a new measure such as hllc.

> Introduce a new measure, called active reservoir, for actively pushing metrics to reporters
>
> Key: KYLIN-2722
> URL: https://issues.apache.org/jira/browse/KYLIN-2722
> Project: Kylin
> Issue Type: Sub-task
> Affects Versions: v2.0.0
> Reporter: Zhong Yanghong
> Assignee: Zhong Yanghong
> Attachments: APACHE-KYLIN-2722.patch
>
> Many existing metrics frameworks focus on maintaining metrics in memory independently for each instance. However, a Kylin server may consist of multiple instances. We therefore extend the existing metrics framework by introducing an *active reservoir*, which actively pushes metrics to reporters; each reporter then reports the metrics of its instance to a unified storage.
> We introduce two *active reservoirs*. One is called {{BlockingReservoir}}, which buffers the metrics. The other is called {{InstantReservoir}}, which has no buffer and pushes metrics directly to reporters.
> In general, one *active reservoir* can push its metrics to multiple reporters, while one reporter can listen to only one *active reservoir*.
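The push relationship between an active reservoir and its reporters can be sketched as follows. Only the names `BlockingReservoir` and `InstantReservoir` come from the issue; the `...Sketch` classes, their methods, the use of plain strings as metric records, and the size-triggered flush of the buffered variant are assumptions made for illustration, and the attached patch may buffer and flush quite differently.

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.List;
import java.util.function.Consumer;

/** Hypothetical base: a reservoir that pushes records to registered reporters. */
abstract class ActiveReservoirSketch {
    // One reservoir may feed several reporters.
    private final List<Consumer<List<String>>> reporters = new ArrayList<>();

    void addListener(Consumer<List<String>> reporter) {
        reporters.add(reporter);
    }

    /** Pushes a batch of records to every registered reporter. */
    protected void push(List<String> records) {
        for (Consumer<List<String>> reporter : reporters) {
            reporter.accept(records);
        }
    }

    abstract void update(String record);
}

/** No buffer: every record is pushed as soon as it arrives. */
final class InstantReservoirSketch extends ActiveReservoirSketch {
    @Override
    void update(String record) {
        push(Collections.singletonList(record));
    }
}

/** Buffers records and pushes them in batches (assumed trigger: batch size). */
final class BlockingReservoirSketch extends ActiveReservoirSketch {
    private final int batchSize;
    private final List<String> buffer = new ArrayList<>();

    BlockingReservoirSketch(int batchSize) {
        this.batchSize = batchSize;
    }

    @Override
    synchronized void update(String record) {
        buffer.add(record);
        if (buffer.size() >= batchSize) {
            flush();
        }
    }

    synchronized void flush() {
        if (buffer.isEmpty()) {
            return;
        }
        push(new ArrayList<>(buffer)); // hand reporters a copy, then reset
        buffer.clear();
    }
}

final class ReservoirDemo {
    public static void main(String[] args) {
        BlockingReservoirSketch reservoir = new BlockingReservoirSketch(2);
        reservoir.addListener(batch -> System.out.println("report " + batch));
        reservoir.update("query-1");
        reservoir.update("query-2"); // second record fills the batch and triggers a push
    }
}
```

In the issue's design the reporter, not the reservoir, writes to the unified storage, so the lambda in `ReservoirDemo` stands in for that reporter.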
[jira] [Commented] (KYLIN-2773) Should not push down join condition related columns are compatible while not consistent
[ https://issues.apache.org/jira/browse/KYLIN-2773?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16110647#comment-16110647 ]

Zhong Yanghong commented on KYLIN-2773:

Hi [~liyang.g...@gmail.com] and [~mahongbin], could you help take a look? Thanks very much.

> Should not push down join condition related columns are compatible while not consistent
>
> Key: KYLIN-2773
> URL: https://issues.apache.org/jira/browse/KYLIN-2773
> Project: Kylin
> Issue Type: Bug
> Components: Query Engine
> Reporter: Zhong Yanghong
> Assignee: Zhong Yanghong
> Fix For: v2.1.0
> Attachments: APACHE-KYLIN-2773.patch
>
> For the SQL
> {code}
> select PART_DT, META_CATEG_NAME, sum(price)
> from KYLIN_SALES
> INNER JOIN KYLIN_CATEGORY_GROUPINGS ON KYLIN_SALES.LEAF_CATEG_ID = KYLIN_CATEGORY_GROUPINGS.LEAF_CATEG_ID
> AND KYLIN_SALES.LSTG_SITE_ID = KYLIN_CATEGORY_GROUPINGS.SITE_ID
> INNER JOIN KYLIN_CAL_DT ON KYLIN_SALES.PART_DT = KYLIN_CAL_DT.CAL_DT
> group by PART_DT, META_CATEG_NAME
> order by PART_DT, META_CATEG_NAME
> {code}
> the datatype of KYLIN_SALES.LEAF_CATEG_ID is bigint, while that of KYLIN_CATEGORY_GROUPINGS.LEAF_CATEG_ID is integer.
> The plan produced by Kylin is then as follows:
> {code}
> OLAPToEnumerableConverter
>   OLAPLimitRel(fetch=[5])
>     OLAPSortRel(sort0=[$0], dir0=[ASC])
>       OLAPAggregateRel(group=[{0}], EXPR$1=[SUM($1)])
>         OLAPProjectRel(expr#0..22=[{inputs}], META_CATEG_NAME=[$t16], PRICE=[$t5])
>           OLAPJoinRel(condition=[=($0, $22)], joinType=[inner])
> {code}
> {color:#f79232}
> {code}
>             OLAPProjectRel(expr#0..19=[{inputs}], PART_DT=[$t0], LSTG_FORMAT_NAME=[$t1], SLR_SEGMENT_CD=[$t2], LEAF_CATEG_ID=[$t3], LSTG_SITE_ID=[$t4], PRICE=[$t5], SELLER_ID=[$t6], COUNT__=[$t7], MIN_PRICE_=[$t8], COUNT_DISTINCT_SELLER_ID_=[$t9], USER_DEFINED_FIELD1=[$t10], USER_DEFINED_FIELD3=[$t11], UPD_DATE=[$t12], UPD_USER=[$t13], LEAF_CATEG_ID0=[$t14], SITE_ID=[$t15], META_CATEG_NAME=[$t16], CATEG_LVL2_NAME=[$t17], CATEG_LVL3_NAME=[$t18])
>               OLAPJoinRel(condition=[AND(=($3, $19), =($4, $15))], joinType=[inner])
>                 OLAPTableScan(table=[[DEFAULT, KYLIN_SALES]], fields=[[0, 1, 2, 3, 4, 5, 6, 7, 8, 9]])
>                 OLAPProjectRel(expr#0..8=[{inputs}], expr#9=[CAST($t4):BIGINT], USER_DEFINED_FIELD1=[$t0], USER_DEFINED_FIELD3=[$t1], UPD_DATE=[$t2], UPD_USER=[$t3], LEAF_CATEG_ID=[$t4], SITE_ID=[$t5], META_CATEG_NAME=[$t6], CATEG_LVL2_NAME=[$t7], CATEG_LVL3_NAME=[$t8], LEAF_CATEG_ID9=[$t9])
> {code}
> {color}
> {code}
>                   OLAPTableScan(table=[[DEFAULT, KYLIN_CATEGORY_GROUPINGS]], fields=[[0, 1, 2, 3, 4, 5, 6, 7, 8]])
>             OLAPTableScan(table=[[DEFAULT, KYLIN_CAL_DT]], fields=[[0, 1, 2, 3]])
> {code}
> However, what we expect is as follows:
> {code}
> OLAPToEnumerableConverter
>   OLAPLimitRel(fetch=[5])
>     OLAPSortRel(sort0=[$0], sort1=[$1], dir0=[ASC], dir1=[ASC])
>       OLAPAggregateRel(group=[{0, 1}], EXPR$2=[SUM($2)])
>         OLAPProjectRel(expr#0..21=[{inputs}], PART_DT=[$t0], META_CATEG_NAME=[$t17], PRICE=[$t4])
> {code}
> {color:#f79232}
> {code}
>           OLAPJoinRel(condition=[=($0, $21)], joinType=[inner])
>             OLAPJoinRel(condition=[AND(=($1, $15), =($2, $16))], joinType=[inner])
>               OLAPTableScan(table=[[DEFAULT, KYLIN_SALES]], fields=[[0, 1, 2, 3, 4, 5, 6, 7, 8, 9, 10]])
> {code}
> {color}
> {code}
>               OLAPTableScan(table=[[DEFAULT, KYLIN_CATEGORY_GROUPINGS]], fields=[[0, 1, 2, 3, 4, 5, 6, 7, 8]])
>             OLAPTableScan(table=[[DEFAULT, KYLIN_CAL_DT]], fields=[[0, 1]])
> {code}
> The reason for this difference is as follows:
> * Although we removed the {{JoinPushExpressionsRule}} in {{OLAPTableScan}}, the method {{RelOptUtil.pushDownJoinConditions()}} is still invoked when a join is created in {{SqlToRelConverter}}.
> * In {{RelOptUtil.pushDownJoinConditions()}}, since the datatypes of the join columns are not the same, a *cast* is automatically applied to KYLIN_CATEGORY_GROUPINGS.LEAF_CATEG_ID, which introduces an {{OLAPProjectRel}}.
> In Kylin, we don't need {{RelOptUtil.pushDownJoinConditions()}}. Therefore, the solution is simply to remove that logic.
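The effect described above, where a type mismatch on a join key makes the planner wrap one input in a projection containing a cast, can be shown with a toy model. Nothing below uses the Calcite API: `JoinCastSketch`, `Column`, and `rightInputPlan` are hypothetical names, and the boolean flag only imitates whether the push-down logic runs.

```java
import java.util.ArrayList;
import java.util.List;

/** Toy model of cast push-down on a join key; not Calcite or Kylin code. */
final class JoinCastSketch {
    enum SqlType { INTEGER, BIGINT }

    static final class Column {
        final String table;
        final String name;
        final SqlType type;

        Column(String table, String name, SqlType type) {
            this.table = table;
            this.name = name;
            this.type = type;
        }
    }

    /**
     * Returns the plan lines for the right input of an equi-join. When
     * push-down is enabled and the key types differ, a projection with a
     * cast is placed above the scan, mirroring the extra OLAPProjectRel.
     */
    static List<String> rightInputPlan(Column leftKey, Column rightKey, boolean pushDownCasts) {
        List<String> plan = new ArrayList<>();
        if (pushDownCasts && leftKey.type != rightKey.type) {
            plan.add("Project(CAST(" + rightKey.name + "):" + leftKey.type + ")");
            plan.add("  TableScan(" + rightKey.table + ")");
        } else {
            plan.add("TableScan(" + rightKey.table + ")");
        }
        return plan;
    }

    public static void main(String[] args) {
        Column sales = new Column("KYLIN_SALES", "LEAF_CATEG_ID", SqlType.BIGINT);
        Column categ = new Column("KYLIN_CATEGORY_GROUPINGS", "LEAF_CATEG_ID", SqlType.INTEGER);
        rightInputPlan(sales, categ, true).forEach(System.out::println);
        rightInputPlan(sales, categ, false).forEach(System.out::println);
    }
}
```

With the flag off, the right input stays a bare table scan, which is the shape of the expected plan in the issue.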
[jira] [Updated] (KYLIN-2703) kylin supports managing access rights for project and cube through apache ranger.
[ https://issues.apache.org/jira/browse/KYLIN-2703?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

peng.jianhua updated KYLIN-2703:
    Attachment: 0001-KYLIN-2703-kylin-supports-managing-access-rights-for.patch

> kylin supports managing access rights for project and cube through apache ranger.
>
> Key: KYLIN-2703
> URL: https://issues.apache.org/jira/browse/KYLIN-2703
> Project: Kylin
> Issue Type: New Feature
> Components: General
> Reporter: peng.jianhua
> Assignee: peng.jianhua
> Labels: newbie, patch
> Attachments: 0001-KYLIN-2703-kylin-supports-managing-access-rights-for.patch, KylinAuditLog.jpg, KylinPlugins.jpg, KylinPolicies.jpg, KylinServiceEntry.jpg, NewKylinPolicy.jpg, NewKylinService.jpg, Ranger-PMS-hope.png
>
> Ranger is a framework to enable, monitor and manage comprehensive data security across the Hadoop platform. Apache Ranger has the following goals:
> 1. Centralized security administration to manage all security-related tasks in a central UI or using REST APIs.
> 2. Fine-grained authorization to do a specific action and/or operation with a Hadoop component/tool, managed through a central administration tool.
> 3. Standardize the authorization method across all Hadoop components.
> 4. Enhanced support for different authorization methods: role-based access control, attribute-based access control, etc.
> 5. Centralize auditing of user access and administrative actions (security related) within all the components of Hadoop.
> Ranger already supports enabling, monitoring, and managing the following components:
> 1. HDFS
> 2. HIVE
> 3. HBASE
> 4. KNOX
> 5. YARN
> 6. STORM
> 7. SOLR
> 8. KAFKA
> 9. ATLAS
> To improve the flexibility of Kylin privilege control and enhance the value of Kylin in the Apache Hadoop ecosystem, Kylin should, like HDFS, YARN, Hive, and HBase, also support using Ranger to control access rights for projects and cubes.
> The specific implementation plan is as follows:
> On the Ranger website, administrators can configure policies to control user access permissions for projects and cubes.
> Kylin provides an abstract class and authorization interfaces for use by the Ranger plugin. Kylin instantiates the Ranger plugin's implementation class at startup (this class extends the abstract class provided by Kylin).
> The Ranger plugin periodically polls Ranger admin, updates its local copy of the policies, and updates project and cube access rights based on the policy information.
> On the Kylin side:
> 1. Kylin provides an abstract class for the Ranger plugin's implementation class to extend.
> 2. Add configuration items: 1) a Ranger authorization switch, 2) the name of the Ranger plugin implementation class.
> 3. Instantiate the Ranger plugin implementation class when starting Kylin.
> 4. Kylin provides authorization interfaces for the Ranger plugin to call.
> 5. Depending on the Ranger authorization configuration item, hide Kylin's authorization management page.
> 6. Using Ranger to manage Kylin access rights does not affect Kylin's existing permission functions and logic.
> On the Ranger side:
> 1. The Ranger plugin periodically polls Ranger admin and updates its local copy of the policies.
> 2. The Ranger plugin invokes the authorization interfaces provided by Kylin to update the project and cube access rights based on the policy information.
> Reference link: https://issues.apache.org/jira/browse/RANGER-1672
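Steps 1 to 3 of the Kylin-side plan (an abstract class, two configuration items, and instantiation of the configured class at startup) follow a common plugin-loading pattern, sketched below. Every name here is hypothetical: the issue does not name the abstract class, its methods, or the configuration keys, so `ExternalAuthorizerSketch`, `checkPermission`, and the loader's parameters are placeholders, not Kylin's or Ranger's API.

```java
/** Hypothetical abstract class that an external authorization plugin extends. */
abstract class ExternalAuthorizerSketch {
    /** Returns true if the user holds the permission on the project or cube. */
    abstract boolean checkPermission(String user, String entity, String permission);
}

/** Stand-in for a plugin implementation; grants READ to everyone, nothing else. */
final class ReadOnlyAuthorizerSketch extends ExternalAuthorizerSketch {
    @Override
    boolean checkPermission(String user, String entity, String permission) {
        return "READ".equals(permission);
    }
}

/** Loads the configured implementation class by name at startup. */
final class AuthorizerLoaderSketch {
    /**
     * @param enabled   the authorization switch (configuration item 1)
     * @param className the implementation class name (configuration item 2)
     * @return the plugin instance, or null when the switch is off
     */
    static ExternalAuthorizerSketch load(boolean enabled, String className) {
        if (!enabled) {
            return null; // built-in permission logic stays in charge
        }
        try {
            return Class.forName(className)
                    .asSubclass(ExternalAuthorizerSketch.class)
                    .getDeclaredConstructor()
                    .newInstance();
        } catch (ReflectiveOperationException e) {
            throw new IllegalStateException("Cannot load authorizer " + className, e);
        }
    }

    public static void main(String[] args) {
        ExternalAuthorizerSketch authorizer =
                load(true, ReadOnlyAuthorizerSketch.class.getName());
        System.out.println(authorizer.checkPermission("alice", "project:demo", "READ"));
    }
}
```

Loading by class name keeps the host free of a compile-time dependency on the plugin, which is presumably why the plan lists the implementation class name as a configuration item.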
[jira] [Updated] (KYLIN-2703) kylin supports managing access rights for project and cube through apache ranger.
[ https://issues.apache.org/jira/browse/KYLIN-2703?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] peng.jianhua updated KYLIN-2703: Attachment: (was: 0001-KYLIN-2703-kylin-supports-managing-access-rights-for.patch) > kylin supports managing access rights for project and cube through apache > ranger. > - > > Key: KYLIN-2703 > URL: https://issues.apache.org/jira/browse/KYLIN-2703 > Project: Kylin > Issue Type: New Feature > Components: General >Reporter: peng.jianhua >Assignee: peng.jianhua > Labels: newbie, patch > Attachments: KylinAuditLog.jpg, KylinPlugins.jpg, KylinPolicies.jpg, > KylinServiceEntry.jpg, NewKylinPolicy.jpg, NewKylinService.jpg, > Ranger-PMS-hope.png > > > Ranger is a framework to enable, monitor and manage comprehensive data > security across the Hadoop platform. Apache Ranger has the following goals: > 1. Centralized security administration to manage all security related tasks > in a central UI or using REST APIs. > 2. Fine grained authorization to do a specific action and/or operation with > Hadoop component/tool and managed through a central administration tool > 3. Standardize authorization method across all Hadoop components. > 4. Enhanced support for different authorization methods - Role based access > control, attribute based access control etc. > 5. Centralize auditing of user access and administrative actions (security > related) within all the components of Hadoop. > Ranger has supported enable, monitor and manage following components: > 1. HDFS > 2. HIVE > 3. HBASE > 4. KNOX > 5. YARN > 6. STORM > 7. SOLR > 8. KAFKA > 9. ATLAS > In order to improve the flexibility of kylin privilege control and enhance > value of kylin in the Apache Hadoop ecosystem, like hdfs, yarn, hive, hbase, > Kylin should also support that using Ranger to control access rights for > project and cube. 
> Specific implementation plan is as following: > On the ranger website, administrators can configure policies to control user > access to projects and cube permissions. > Kylin provides an abstract class and authorization interfaces for use by the > ranger plugin. kylin instantiates ranger plugin’s implementation class when > starting(this class extends the abstract class provided by kylin). > Ranger plugin periodically polls ranger admin, updates the policy to the > local, and updates project and cube access rights based on policy information. > In the Kylin side: > 1. Kylin provides an abstract class that enables the ranger plugin's > implementation class to extend. > 2. Add configuration item. 1) ranger authorization switch, 2) ranger plugin > implementation class's name. > 3. Instantiate the ranger plugin implementation class when starting kylin. > 4. kylin provides authorization interfaces for ranger plugin calls. > 5. According to the ranger authorization configuration item, hide kylin's > authorization management page. > 6. Using ranger manager access rights of the kylin does not affect kylin's > existing permissions functions and logic. > In the Ranger side: > 1. Ranger plugin will periodically polls ranger admin, updates the policy to > the local. > 2. The ranger plugin invoking the authorization interfaces provided by kylin > to updates the project and cube access rights based on the policy information. > reference link:https://issues.apache.org/jira/browse/RANGER-1672 -- This message was sent by Atlassian JIRA (v6.4.14#64029)
[jira] [Updated] (KYLIN-2773) Should not push down join condition related columns are compatible while not consistent
[ https://issues.apache.org/jira/browse/KYLIN-2773?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Zhong Yanghong updated KYLIN-2773: -- Description: For sql, {code} select PART_DT, META_CATEG_NAME, sum(price) from KYLIN_SALES INNER JOIN KYLIN_CATEGORY_GROUPINGS ON KYLIN_SALES.LEAF_CATEG_ID = KYLIN_CATEGORY_GROUPINGS.LEAF_CATEG_ID AND KYLIN_SALES.LSTG_SITE_ID = KYLIN_CATEGORY_GROUPINGS.SITE_ID INNER JOIN KYLIN_CAL_DT ON KYLIN_SALES.PART_DT = KYLIN_CAL_DT.CAL_DT group by PART_DT, META_CATEG_NAME order by PART_DT, META_CATEG_NAME {code} the datatype of KYLIN_SALES.LEAF_CATEG_ID is bigint, while the one of KYLIN_CATEGORY_GROUPINGS.LEAF_CATEG_ID is integer. Then the plan transformed by kylin is as follows: {code} OLAPToEnumerableConverter OLAPLimitRel(fetch=[5]) OLAPSortRel(sort0=[$0], dir0=[ASC]) OLAPAggregateRel(group=[{0}], EXPR$1=[SUM($1)]) OLAPProjectRel(expr#0..22=[{inputs}], META_CATEG_NAME=[$t16], PRICE=[$t5]) OLAPJoinRel(condition=[=($0, $22)], joinType=[inner]) {code} {color:#f79232} {code} OLAPProjectRel(expr#0..19=[{inputs}], PART_DT=[$t0], LSTG_FORMAT_NAME=[$t1], SLR_SEGMENT_CD=[$t2], LEAF_CATEG_ID=[$t3], LSTG_SITE_ID=[$t4], PRICE=[$t5], SELLER_ID=[$t6], COUNT__=[$t7], MIN_PRICE_=[$t8], COUNT_DISTINCT_SELLER_ID_=[$t9], USER_DEFINED_FIELD1=[$t10], USER_DEFINED_FIELD3=[$t11], UPD_DATE=[$t12], UPD_USER=[$t13], LEAF_CATEG_ID0=[$t14], SITE_ID=[$t15], META_CATEG_NAME=[$t16], CATEG_LVL2_NAME=[$t17], CATEG_LVL3_NAME=[$t18]) OLAPJoinRel(condition=[AND(=($3, $19), =($4, $15))], joinType=[inner]) OLAPTableScan(table=[[DEFAULT, KYLIN_SALES]], fields=[[0, 1, 2, 3, 4, 5, 6, 7, 8, 9]]) OLAPProjectRel(expr#0..8=[{inputs}], expr#9=[CAST($t4):BIGINT], USER_DEFINED_FIELD1=[$t0], USER_DEFINED_FIELD3=[$t1], UPD_DATE=[$t2], UPD_USER=[$t3], LEAF_CATEG_ID=[$t4], SITE_ID=[$t5], META_CATEG_NAME=[$t6], CATEG_LVL2_NAME=[$t7], CATEG_LVL3_NAME=[$t8], LEAF_CATEG_ID9=[$t9]) {code} {color} {code} OLAPTableScan(table=[[DEFAULT, KYLIN_CATEGORY_GROUPINGS]], fields=[[0, 1, 
2, 3, 4, 5, 6, 7, 8]]) OLAPTableScan(table=[[DEFAULT, KYLIN_CAL_DT]], fields=[[0, 1, 2, 3]]) {code} However, what we expect is as follows: {code} OLAPToEnumerableConverter OLAPLimitRel(fetch=[5]) OLAPSortRel(sort0=[$0], sort1=[$1], dir0=[ASC], dir1=[ASC]) OLAPAggregateRel(group=[{0, 1}], EXPR$2=[SUM($2)]) OLAPProjectRel(expr#0..21=[{inputs}], PART_DT=[$t0], META_CATEG_NAME=[$t17], PRICE=[$t4]) {code} {color:#f79232} {code} OLAPJoinRel(condition=[=($0, $21)], joinType=[inner]) OLAPJoinRel(condition=[AND(=($1, $15), =($2, $16))], joinType=[inner]) OLAPTableScan(table=[[DEFAULT, KYLIN_SALES]], fields=[[0, 1, 2, 3, 4, 5, 6, 7, 8, 9, 10]]) {code} {color} {code} OLAPTableScan(table=[[DEFAULT, KYLIN_CATEGORY_GROUPINGS]], fields=[[0, 1, 2, 3, 4, 5, 6, 7, 8]]) OLAPTableScan(table=[[DEFAULT, KYLIN_CAL_DT]], fields=[[0, 1]]) {code} The reasons for this difference are as follows: * Although we remove the {{JoinPushExpressionsRule}} in {{OLAPTableScan}}, the method {{RelOptUtil.pushDownJoinConditions()}} is still invoked when a join is created in {{SqlToRelConverter}}. * In {{RelOptUtil.pushDownJoinConditions()}}, since the datatypes of the join columns are not the same, a *cast* function is automatically applied to KYLIN_CATEGORY_GROUPINGS.LEAF_CATEG_ID, which introduces an extra {{OLAPProjectRel}}. In Kylin, we don't need {{RelOptUtil.pushDownJoinConditions()}}. Therefore, the solution is simply to remove that logic. was: For sql, {code} select PART_DT, META_CATEG_NAME, sum(price) from KYLIN_SALES INNER JOIN KYLIN_CATEGORY_GROUPINGS ON KYLIN_SALES.LEAF_CATEG_ID = KYLIN_CATEGORY_GROUPINGS.LEAF_CATEG_ID AND KYLIN_SALES.LSTG_SITE_ID = KYLIN_CATEGORY_GROUPINGS.SITE_ID INNER JOIN KYLIN_CAL_DT ON KYLIN_SALES.PART_DT = KYLIN_CAL_DT.CAL_DT group by PART_DT, META_CATEG_NAME order by PART_DT, META_CATEG_NAME {code} the datatype of KYLIN_SALES.LEAF_CATEG_ID is bigint, while the one of KYLIN_CATEGORY_GROUPINGS.LEAF_CATEG_ID is integer. 
Then the plan transformed by kylin is as follows: {code} OLAPToEnumerableConverter OLAPLimitRel(fetch=[5]) OLAPSortRel(sort0=[$0], dir0=[ASC]) OLAPAggregateRel(group=[{0}], EXPR$1=[SUM($1)]) OLAPProjectRel(expr#0..22=[{inputs}], META_CATEG_NAME=[$t16], PRICE=[$t5]) OLAPJoinRel(condition=[=($0, $22)], joinType=[inner]) {code} {color:#f79232} {code} OLAPProjectRel(expr#0..19=[{inputs}], PART_DT=[$t0], LSTG_FORMAT_NAME=[$t1], SLR_SEGMENT_CD=[$t2], LEAF_CATEG_ID=[$t3], LSTG_SITE_ID=[$t4], PRICE=[$t5], SELLER_ID=[$t6], COUNT__=[$t7], MIN_PRICE_=[$t8], COUNT_DISTINCT_SELLER_ID_=[$t9], USER_DEFINED_FIELD1=[$t10], USER_DEFINED_FIELD3=[$t11], UPD_DATE=[$t12], UPD_USER=[$t13], LEA
[jira] [Updated] (KYLIN-2773) Should not push down join condition related columns are compatible while not consistent
[ https://issues.apache.org/jira/browse/KYLIN-2773?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Zhong Yanghong updated KYLIN-2773: -- Description: For sql, {code} select PART_DT, META_CATEG_NAME, sum(price) from KYLIN_SALES INNER JOIN KYLIN_CATEGORY_GROUPINGS ON KYLIN_SALES.LEAF_CATEG_ID = KYLIN_CATEGORY_GROUPINGS.LEAF_CATEG_ID AND KYLIN_SALES.LSTG_SITE_ID = KYLIN_CATEGORY_GROUPINGS.SITE_ID INNER JOIN KYLIN_CAL_DT ON KYLIN_SALES.PART_DT = KYLIN_CAL_DT.CAL_DT group by PART_DT, META_CATEG_NAME order by PART_DT, META_CATEG_NAME {code} the datatype of KYLIN_SALES.LEAF_CATEG_ID is bigint, while the one of KYLIN_CATEGORY_GROUPINGS.LEAF_CATEG_ID is integer. Then the plan transformed by kylin is as follows: {code} OLAPToEnumerableConverter OLAPLimitRel(fetch=[5]) OLAPSortRel(sort0=[$0], dir0=[ASC]) OLAPAggregateRel(group=[{0}], EXPR$1=[SUM($1)]) OLAPProjectRel(expr#0..22=[{inputs}], META_CATEG_NAME=[$t16], PRICE=[$t5]) OLAPJoinRel(condition=[=($0, $22)], joinType=[inner]) {code} {color:#f79232} {code} OLAPProjectRel(expr#0..19=[{inputs}], PART_DT=[$t0], LSTG_FORMAT_NAME=[$t1], SLR_SEGMENT_CD=[$t2], LEAF_CATEG_ID=[$t3], LSTG_SITE_ID=[$t4], PRICE=[$t5], SELLER_ID=[$t6], COUNT__=[$t7], MIN_PRICE_=[$t8], COUNT_DISTINCT_SELLER_ID_=[$t9], USER_DEFINED_FIELD1=[$t10], USER_DEFINED_FIELD3=[$t11], UPD_DATE=[$t12], UPD_USER=[$t13], LEAF_CATEG_ID0=[$t14], SITE_ID=[$t15], META_CATEG_NAME=[$t16], CATEG_LVL2_NAME=[$t17], CATEG_LVL3_NAME=[$t18]) OLAPJoinRel(condition=[AND(=($3, $19), =($4, $15))], joinType=[inner]) OLAPTableScan(table=[[DEFAULT, KYLIN_SALES]], fields=[[0, 1, 2, 3, 4, 5, 6, 7, 8, 9]]) OLAPProjectRel(expr#0..8=[{inputs}], expr#9=[CAST($t4):BIGINT], USER_DEFINED_FIELD1=[$t0], USER_DEFINED_FIELD3=[$t1], UPD_DATE=[$t2], UPD_USER=[$t3], LEAF_CATEG_ID=[$t4], SITE_ID=[$t5], META_CATEG_NAME=[$t6], CATEG_LVL2_NAME=[$t7], CATEG_LVL3_NAME=[$t8], LEAF_CATEG_ID9=[$t9]) {code} {color} {code} OLAPTableScan(table=[[DEFAULT, KYLIN_CATEGORY_GROUPINGS]], fields=[[0, 1, 
2, 3, 4, 5, 6, 7, 8]]) OLAPTableScan(table=[[DEFAULT, KYLIN_CAL_DT]], fields=[[0, 1, 2, 3]]) {code} However, what we expect is as follows: {code} OLAPToEnumerableConverter OLAPLimitRel(fetch=[5]) OLAPSortRel(sort0=[$0], sort1=[$1], dir0=[ASC], dir1=[ASC]) OLAPAggregateRel(group=[{0, 1}], EXPR$2=[SUM($2)]) OLAPProjectRel(expr#0..21=[{inputs}], PART_DT=[$t0], META_CATEG_NAME=[$t17], PRICE=[$t4]) {code} {color:#f79232} {code} OLAPJoinRel(condition=[=($0, $21)], joinType=[inner]) OLAPJoinRel(condition=[AND(=($1, $15), =($2, $16))], joinType=[inner]) OLAPTableScan(table=[[DEFAULT, KYLIN_SALES]], fields=[[0, 1, 2, 3, 4, 5, 6, 7, 8, 9, 10]]) {code} {color} OLAPTableScan(table=[[DEFAULT, KYLIN_CATEGORY_GROUPINGS]], fields=[[0, 1, 2, 3, 4, 5, 6, 7, 8]]) {code} OLAPTableScan(table=[[DEFAULT, KYLIN_CAL_DT]], fields=[[0, 1]]) {code} The reasons for this difference are as follows: * Although we remove the {{JoinPushExpressionsRule}} in {{OLAPTableScan}}, the method {{RelOptUtil.pushDownJoinConditions()}} is still invoked when a join is created in {{SqlToRelConverter}}. * In {{RelOptUtil.pushDownJoinConditions()}}, since the datatypes of the join columns are not the same, a *cast* function is automatically applied to KYLIN_CATEGORY_GROUPINGS.LEAF_CATEG_ID, which introduces an extra {{OLAPProjectRel}}. In Kylin, we don't need {{RelOptUtil.pushDownJoinConditions()}}. Therefore, the solution is simply to remove that logic. was: For sql, {code} select PART_DT, META_CATEG_NAME, sum(price) from KYLIN_SALES INNER JOIN KYLIN_CATEGORY_GROUPINGS ON KYLIN_SALES.LEAF_CATEG_ID = KYLIN_CATEGORY_GROUPINGS.LEAF_CATEG_ID AND KYLIN_SALES.LSTG_SITE_ID = KYLIN_CATEGORY_GROUPINGS.SITE_ID INNER JOIN KYLIN_CAL_DT ON KYLIN_SALES.PART_DT = KYLIN_CAL_DT.CAL_DT group by PART_DT, META_CATEG_NAME order by PART_DT, META_CATEG_NAME {code} the datatype of KYLIN_SALES.LEAF_CATEG_ID is bigint, while the one of KYLIN_CATEGORY_GROUPINGS.LEAF_CATEG_ID is integer. 
Then the plan transformed by kylin is as follows: {code} OLAPToEnumerableConverter OLAPLimitRel(fetch=[5]) OLAPSortRel(sort0=[$0], dir0=[ASC]) OLAPAggregateRel(group=[{0}], EXPR$1=[SUM($1)]) OLAPProjectRel(expr#0..22=[{inputs}], META_CATEG_NAME=[$t16], PRICE=[$t5]) OLAPJoinRel(condition=[=($0, $22)], joinType=[inner]) {code} {color:#f79232} {code} OLAPProjectRel(expr#0..19=[{inputs}], PART_DT=[$t0], LSTG_FORMAT_NAME=[$t1], SLR_SEGMENT_CD=[$t2], LEAF_CATEG_ID=[$t3], LSTG_SITE_ID=[$t4], PRICE=[$t5], SELLER_ID=[$t6], COUNT__=[$t7], MIN_PRICE_=[$t8], COUNT_DISTINCT_SELLER_ID_=[$t9], USER_DEFINED_FIELD1=[$t10], USER_DEFINED_FIELD3=[$t11], UPD_DATE=[$t12], UPD_USER=[$t13], LEA
[jira] [Updated] (KYLIN-2773) Should not push down join condition related columns are compatible while not consistent
[ https://issues.apache.org/jira/browse/KYLIN-2773?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Zhong Yanghong updated KYLIN-2773: -- Description: For sql, {code} select PART_DT, META_CATEG_NAME, sum(price) from KYLIN_SALES INNER JOIN KYLIN_CATEGORY_GROUPINGS ON KYLIN_SALES.LEAF_CATEG_ID = KYLIN_CATEGORY_GROUPINGS.LEAF_CATEG_ID AND KYLIN_SALES.LSTG_SITE_ID = KYLIN_CATEGORY_GROUPINGS.SITE_ID INNER JOIN KYLIN_CAL_DT ON KYLIN_SALES.PART_DT = KYLIN_CAL_DT.CAL_DT group by PART_DT, META_CATEG_NAME order by PART_DT, META_CATEG_NAME {code} the datatype of KYLIN_SALES.LEAF_CATEG_ID is bigint, while the one of KYLIN_CATEGORY_GROUPINGS.LEAF_CATEG_ID is integer. Then the plan transformed by kylin is as follows: {code} OLAPToEnumerableConverter OLAPLimitRel(fetch=[5]) OLAPSortRel(sort0=[$0], dir0=[ASC]) OLAPAggregateRel(group=[{0}], EXPR$1=[SUM($1)]) OLAPProjectRel(expr#0..22=[{inputs}], META_CATEG_NAME=[$t16], PRICE=[$t5]) OLAPJoinRel(condition=[=($0, $22)], joinType=[inner]) {code} {color:#f79232} {code} OLAPProjectRel(expr#0..19=[{inputs}], PART_DT=[$t0], LSTG_FORMAT_NAME=[$t1], SLR_SEGMENT_CD=[$t2], LEAF_CATEG_ID=[$t3], LSTG_SITE_ID=[$t4], PRICE=[$t5], SELLER_ID=[$t6], COUNT__=[$t7], MIN_PRICE_=[$t8], COUNT_DISTINCT_SELLER_ID_=[$t9], USER_DEFINED_FIELD1=[$t10], USER_DEFINED_FIELD3=[$t11], UPD_DATE=[$t12], UPD_USER=[$t13], LEAF_CATEG_ID0=[$t14], SITE_ID=[$t15], META_CATEG_NAME=[$t16], CATEG_LVL2_NAME=[$t17], CATEG_LVL3_NAME=[$t18]) OLAPJoinRel(condition=[AND(=($3, $19), =($4, $15))], joinType=[inner]) OLAPTableScan(table=[[DEFAULT, KYLIN_SALES]], fields=[[0, 1, 2, 3, 4, 5, 6, 7, 8, 9]]) OLAPProjectRel(expr#0..8=[{inputs}], expr#9=[CAST($t4):BIGINT], USER_DEFINED_FIELD1=[$t0], USER_DEFINED_FIELD3=[$t1], UPD_DATE=[$t2], UPD_USER=[$t3], LEAF_CATEG_ID=[$t4], SITE_ID=[$t5], META_CATEG_NAME=[$t6], CATEG_LVL2_NAME=[$t7], CATEG_LVL3_NAME=[$t8], LEAF_CATEG_ID9=[$t9]) {code} {color} {code} OLAPTableScan(table=[[DEFAULT, KYLIN_CATEGORY_GROUPINGS]], fields=[[0, 1, 
2, 3, 4, 5, 6, 7, 8]]) OLAPTableScan(table=[[DEFAULT, KYLIN_CAL_DT]], fields=[[0, 1, 2, 3]]) {code} However, what we expect is as follows: {code} OLAPToEnumerableConverter OLAPLimitRel(fetch=[5]) OLAPSortRel(sort0=[$0], sort1=[$1], dir0=[ASC], dir1=[ASC]) OLAPAggregateRel(group=[{0, 1}], EXPR$2=[SUM($2)]) OLAPProjectRel(expr#0..21=[{inputs}], PART_DT=[$t0], META_CATEG_NAME=[$t17], PRICE=[$t4]) {code} {color:#f79232} {code} OLAPJoinRel(condition=[=($0, $21)], joinType=[inner]) OLAPJoinRel(condition=[AND(=($1, $15), =($2, $16))], joinType=[inner]) OLAPTableScan(table=[[DEFAULT, KYLIN_SALES]], fields=[[0, 1, 2, 3, 4, 5, 6, 7, 8, 9, 10]]) OLAPTableScan(table=[[DEFAULT, KYLIN_CATEGORY_GROUPINGS]], fields=[[0, 1, 2, 3, 4, 5, 6, 7, 8]]) {code} {color} {code} OLAPTableScan(table=[[DEFAULT, KYLIN_CAL_DT]], fields=[[0, 1]]) {code} The reasons for this difference are as follows: * Although we remove the {{JoinPushExpressionsRule}} in {{OLAPTableScan}}, the method {{RelOptUtil.pushDownJoinConditions()}} is still invoked when a join is created in {{SqlToRelConverter}}. * In {{RelOptUtil.pushDownJoinConditions()}}, since the datatypes of the join columns are not the same, a *cast* function is automatically applied to KYLIN_CATEGORY_GROUPINGS.LEAF_CATEG_ID, which introduces an extra {{OLAPProjectRel}}. In Kylin, we don't need {{RelOptUtil.pushDownJoinConditions()}}. Therefore, the solution is simply to remove that logic. 
> Should not push down join condition related columns are compatible while not > consistent > --- > > Key: KYLIN-2773 > URL: https://issues.apache.org/jira/browse/KYLIN-2773 > Project: Kylin > Issue Type: Bug > Components: Query Engine >Reporter: Zhong Yanghong >Assignee: Zhong Yanghong > Fix For: v2.1.0 > > Attachments: APACHE-KYLIN-2773.patch > > > For sql, > {code} > select PART_DT, META_CATEG_NAME, sum(price) > from KYLIN_SALES > INNER JOIN KYLIN_CATEGORY_GROUPINGS ON KYLIN_SALES.LEAF_CATEG_ID = > KYLIN_CATEGORY_GROUPINGS.LEAF_CATEG_ID > AND KYLIN_SALES.LSTG_SITE_ID = KYLIN_CATEGORY_GROUPINGS.SITE_ID > INNER JOIN KYLIN_CAL_DT ON KYLIN_SALES.PART_DT = KYLIN_CAL_DT.CAL_DT > group by PART_DT, META_CATEG_NAME > order by PART_DT, META_CATEG_NAME > {code} > the datatype of KYLIN_SALES.LEAF_CATEG_ID is bigint, while the one of > KYLIN_CATEGORY_GROUPINGS.LEAF_CATEG_ID is integer. > Then the plan transformed by kylin is as follows: > {code} > OLAPToEnumerableConverter > OLAPLimitRel(fetch=[5]) > OLAPSortRel(sort0=[$0], dir0=[ASC]) >
[jira] [Updated] (KYLIN-2773) Should not push down join condition related columns are compatible while not consistent
[ https://issues.apache.org/jira/browse/KYLIN-2773?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Zhong Yanghong updated KYLIN-2773: -- Attachment: APACHE-KYLIN-2773.patch > Should not push down join condition related columns are compatible while not > consistent > --- > > Key: KYLIN-2773 > URL: https://issues.apache.org/jira/browse/KYLIN-2773 > Project: Kylin > Issue Type: Bug > Components: Query Engine >Reporter: Zhong Yanghong >Assignee: Zhong Yanghong > Fix For: v2.1.0 > > Attachments: APACHE-KYLIN-2773.patch > > -- This message was sent by Atlassian JIRA (v6.4.14#64029)
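A note for readers following the field references in the KYLIN-2773 plans above: the output row of a join is the left input's fields followed by the right input's fields, so a right-side column shows up at (left field count + local index). The small sketch below only reproduces that arithmetic for the two plans in the description. The class and method names are made up for illustration and are not part of Kylin or Calcite.

```java
public class JoinFieldIndex {

    /** Index of a right-input field in the joined row (left fields come first). */
    static int rightFieldIndex(int leftFieldCount, int rightLocalIndex) {
        return leftFieldCount + rightLocalIndex;
    }

    public static void main(String[] args) {
        // Actual plan: KYLIN_SALES exposes 10 fields (0..9). The extra project on
        // KYLIN_CATEGORY_GROUPINGS appends the CAST column LEAF_CATEG_ID9 at local index 9.
        System.out.println("cast column: $" + rightFieldIndex(10, 9)); // $19 in =($3, $19)
        System.out.println("SITE_ID:     $" + rightFieldIndex(10, 5)); // $15 in =($4, $15)

        // Expected plan: KYLIN_SALES exposes 11 fields (0..10) and the table scan is
        // joined directly, so LEAF_CATEG_ID keeps local index 4 and SITE_ID local index 5.
        System.out.println("LEAF_CATEG_ID: $" + rightFieldIndex(11, 4)); // $15 in =($1, $15)
        System.out.println("SITE_ID:       $" + rightFieldIndex(11, 5)); // $16 in =($2, $16)
    }
}
```

The point of the comparison is that the join key in the actual plan refers to a computed column ($19) rather than to a raw table column, which is the symptom described in the issue.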
[jira] [Created] (KYLIN-2773) Should not push down join condition related columns are compatible while not consistent
Zhong Yanghong created KYLIN-2773: - Summary: Should not push down join condition related columns are compatible while not consistent Key: KYLIN-2773 URL: https://issues.apache.org/jira/browse/KYLIN-2773 Project: Kylin Issue Type: Bug Components: Query Engine Reporter: Zhong Yanghong Assignee: Zhong Yanghong Fix For: v2.1.0 -- This message was sent by Atlassian JIRA (v6.4.14#64029)
[jira] [Commented] (KYLIN-2600) Incorrectly set the range start when filtering by the minimum value
[ https://issues.apache.org/jira/browse/KYLIN-2600?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16110530#comment-16110530 ] Zhong Yanghong commented on KYLIN-2600: --- Hi [~liyang.g...@gmail.com], could you help take a look? Thanks very much. > Incorrectly set the range start when filtering by the minimum value > --- > > Key: KYLIN-2600 > URL: https://issues.apache.org/jira/browse/KYLIN-2600 > Project: Kylin > Issue Type: Bug > Components: Query Engine >Reporter: Zhong Yanghong >Assignee: Zhong Yanghong > Attachments: APACHE-KYLIN-2600-master.patch > > > When defining the range of a scan, the range start may be incorrect in the > following case: > {code} > OR [ > AND [ > a <= 3, > b = 2 > ], > AND [ > a = 0, (note that 0 is the minimum of a) > b = 1 > ] > ] > {code} > In this case, Kylin will generate two ranges: > {code} > [null,2] - [3, 2] > [0,1] - [0,1] > {code} > There is a rule for merging ranges: a range whose start is null is > ordered before ranges whose start is not null. The merged range of > these two ranges will therefore be > \[null,2\] - \[3, 2\]. > After replacing null with 0, the range sent to the coprocessor will be > \[0,2\] - \[3, 2\]. > However, it should be > \[0,1\] - \[3, 2\] > and the rule should be refined. -- This message was sent by Atlassian JIRA (v6.4.14#64029)
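To make the KYLIN-2600 case above concrete, here is a minimal sketch of the two orderings. It is not Kylin code and not necessarily what the attached patch does: the class name, the method names, and the assumption that the minimum of b is 0 are all hypothetical. It only shows that picking the smaller start with "null first" and then filling in the minimum gives [0,2], while filling in the minimum before comparing gives the expected [0,1].

```java
import java.util.Arrays;

public class RangeStartMerge {

    /** Lexicographic comparison of range starts; a null component sorts before any value. */
    static int compareNullFirst(Integer[] x, Integer[] y) {
        for (int i = 0; i < x.length; i++) {
            if (x[i] == null && y[i] == null) continue;
            if (x[i] == null) return -1;
            if (y[i] == null) return 1;
            int c = Integer.compare(x[i], y[i]);
            if (c != 0) return c;
        }
        return 0;
    }

    /** Returns a copy of the start with each null replaced by that column's minimum. */
    static Integer[] fillNulls(Integer[] start, int[] columnMin) {
        Integer[] filled = start.clone();
        for (int i = 0; i < filled.length; i++) {
            if (filled[i] == null) filled[i] = columnMin[i];
        }
        return filled;
    }

    /** Behaviour described in the issue: order with nulls first, then replace null. */
    static Integer[] mergeThenFill(Integer[] s1, Integer[] s2, int[] columnMin) {
        Integer[] first = compareNullFirst(s1, s2) <= 0 ? s1 : s2;
        return fillNulls(first, columnMin);
    }

    /** One possible refinement (an assumption): replace null first, then order. */
    static Integer[] fillThenMerge(Integer[] s1, Integer[] s2, int[] columnMin) {
        Integer[] f1 = fillNulls(s1, columnMin);
        Integer[] f2 = fillNulls(s2, columnMin);
        return compareNullFirst(f1, f2) <= 0 ? f1 : f2;
    }

    public static void main(String[] args) {
        Integer[] start1 = {null, 2};   // from AND [a <= 3, b = 2]
        Integer[] start2 = {0, 1};      // from AND [a = 0, b = 1]
        int[] columnMin = {0, 0};       // minimum of a is 0 (from the issue); minimum of b is assumed

        System.out.println(Arrays.toString(mergeThenFill(start1, start2, columnMin)));
        System.out.println(Arrays.toString(fillThenMerge(start1, start2, columnMin)));
    }
}
```

The first line printed corresponds to the range start the issue reports as wrong, the second to the start it says should be sent to the coprocessor.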
[jira] [Commented] (KYLIN-2716) Using non-thread-safe WeakHashMap leading to server high cpu
[ https://issues.apache.org/jira/browse/KYLIN-2716?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16110528#comment-16110528 ] Zhong Yanghong commented on KYLIN-2716: --- Thanks for [~liyang.g...@gmail.com]'s help. The patch has been refined according to your suggestion. > Using non-thread-safe WeakHashMap leading to server high cpu > > > Key: KYLIN-2716 > URL: https://issues.apache.org/jira/browse/KYLIN-2716 > Project: Kylin > Issue Type: Bug > Components: Query Engine >Affects Versions: all >Reporter: Zhong Yanghong >Assignee: Zhong Yanghong > Fix For: v2.1.0 > > Attachments: APACHE-KYLIN-2716.patch > > > Multiple threads invoking WeakHashMap.get() simultaneously may lead to an > infinite loop in *WeakHashMap.get() -> getTable() -> expungeStaleEntries()*, > which finally results in high server CPU usage. There are two places that use > WeakHashMap. > > One is in the method *ClassUtil.forName()*. > We ran an internal test invoking the method *ClassUtil.forName()* 1M times; > the results are as follows: > * With cache: 20ms; > * Without cache: less than 2s. > Invoking the method *ClassUtil.forName()* with *newInstance()* 1M times gives > the following results: > * With cache: around 2s > * Without cache: around 3s. > Considering that *ClassUtil.forName()* is always invoked with *newInstance()*, > there is not much degradation without the cache. Thus the fix is just to remove the > cache. > > The other is in the method *CubeService.getHTableInfo()*. We changed the > WeakHashMap to a Guava Cache with size and time limits. -- This message was sent by Atlassian JIRA (v6.4.14#64029)
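For illustration only, here is a JDK-only stand-in for the kind of bounded, thread-safe cache described for KYLIN-2716 above. It is not the patch: the patch uses Guava Cache with size and time limits, whereas this sketch swaps in a synchronized LinkedHashMap in access order and covers only the size limit (no time-based expiry). The class name BoundedCache and the factory method are hypothetical.

```java
import java.util.Collections;
import java.util.LinkedHashMap;
import java.util.Map;

public class BoundedCache {

    /**
     * Creates a thread-safe map that evicts the least recently used entry
     * once it holds more than maxEntries entries.
     */
    static <K, V> Map<K, V> create(final int maxEntries) {
        // accessOrder = true makes get() reorder entries, which is a structural
        // change, so every access must go through the synchronized wrapper.
        return Collections.synchronizedMap(new LinkedHashMap<K, V>(16, 0.75f, true) {
            @Override
            protected boolean removeEldestEntry(Map.Entry<K, V> eldest) {
                return size() > maxEntries;
            }
        });
    }

    public static void main(String[] args) throws InterruptedException {
        final Map<Integer, String> cache = create(100);

        // Several threads read and write concurrently; unlike an unsynchronized
        // WeakHashMap, the wrapper serializes access to the underlying map.
        Thread[] workers = new Thread[4];
        for (int t = 0; t < workers.length; t++) {
            workers[t] = new Thread(() -> {
                for (int i = 0; i < 10_000; i++) {
                    cache.put(i % 500, "value-" + i);
                    cache.get(i % 250);
                }
            });
            workers[t].start();
        }
        for (Thread w : workers) {
            w.join();
        }
        System.out.println("entries kept: " + cache.size());
    }
}
```

A single lock is the simplest correct choice here. Under heavy contention a purpose-built concurrent cache, such as the Guava Cache mentioned in the issue, would likely scale better.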
[jira] [Updated] (KYLIN-2716) Using non-thread-safe WeakHashMap leading to server high cpu
[ https://issues.apache.org/jira/browse/KYLIN-2716?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Zhong Yanghong updated KYLIN-2716: -- Attachment: APACHE-KYLIN-2716.patch > Using non-thread-safe WeakHashMap leading to server high cpu > > > Key: KYLIN-2716 > URL: https://issues.apache.org/jira/browse/KYLIN-2716 > Project: Kylin > Issue Type: Bug > Components: Query Engine >Affects Versions: all >Reporter: Zhong Yanghong >Assignee: Zhong Yanghong > Fix For: v2.1.0 > > Attachments: APACHE-KYLIN-2716.patch > > > Multiple threads invoking WeakHashMap.get() simultaneously may lead to an > infinite loop in *WeakHashMap.get() -> getTable() -> expungeStaleEntries()*, > which finally results in high server CPU usage. There are two places that use > WeakHashMap. > > One is in the method *ClassUtil.forName()*. > We ran an internal test invoking the method *ClassUtil.forName()* 1M times; > the results are as follows: > * With cache: 20ms; > * Without cache: less than 2s. > Invoking the method *ClassUtil.forName()* with *newInstance()* 1M times gives > the following results: > * With cache: around 2s > * Without cache: around 3s. > Considering that *ClassUtil.forName()* is always invoked with *newInstance()*, > there is not much degradation without the cache. Thus the fix is just to remove the > cache. > > The other is in the method *CubeService.getHTableInfo()*. We changed the > WeakHashMap to a Guava Cache with size and time limits. -- This message was sent by Atlassian JIRA (v6.4.14#64029)
[jira] [Updated] (KYLIN-2716) Using non-thread-safe WeakHashMap leading to server high cpu
[ https://issues.apache.org/jira/browse/KYLIN-2716?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Zhong Yanghong updated KYLIN-2716: -- Attachment: (was: APACHE-KYLIN-2716.patch) > Using non-thread-safe WeakHashMap leading to server high cpu > > > Key: KYLIN-2716 > URL: https://issues.apache.org/jira/browse/KYLIN-2716 > Project: Kylin > Issue Type: Bug > Components: Query Engine >Affects Versions: all >Reporter: Zhong Yanghong >Assignee: Zhong Yanghong > Fix For: v2.1.0 > > > Multiple threads invoking WeakHashMap.get() simultaneously may lead to an > infinite loop in *WeakHashMap.get() -> getTable() -> expungeStaleEntries()*, > which finally results in high server CPU usage. There are two places that use > WeakHashMap. > > One is in the method *ClassUtil.forName()*. > We ran an internal test invoking the method *ClassUtil.forName()* 1M times; > the results are as follows: > * With cache: 20ms; > * Without cache: less than 2s. > Invoking the method *ClassUtil.forName()* with *newInstance()* 1M times gives > the following results: > * With cache: around 2s > * Without cache: around 3s. > Considering that *ClassUtil.forName()* is always invoked with *newInstance()*, > there is not much degradation without the cache. Thus the fix is just to remove the > cache. > > The other is in the method *CubeService.getHTableInfo()*. We changed the > WeakHashMap to a Guava Cache with size and time limits. -- This message was sent by Atlassian JIRA (v6.4.14#64029)
[jira] [Updated] (KYLIN-2716) Using non-thread-safe WeakHashMap leading to server high cpu
[ https://issues.apache.org/jira/browse/KYLIN-2716?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Zhong Yanghong updated KYLIN-2716: -- Fix Version/s: (was: v2.0.0) v2.1.0 > Using non-thread-safe WeakHashMap leading to server high cpu > > > Key: KYLIN-2716 > URL: https://issues.apache.org/jira/browse/KYLIN-2716 > Project: Kylin > Issue Type: Bug > Components: Query Engine >Affects Versions: all >Reporter: Zhong Yanghong >Assignee: Zhong Yanghong > Fix For: v2.1.0 > > > Multiple threads invoking WeakHashMap.get() simultaneously may lead to an > infinite loop in *WeakHashMap.get() -> getTable() -> expungeStaleEntries()*, > which finally results in high server CPU usage. There are two places that use > WeakHashMap. > > One is in the method *ClassUtil.forName()*. > We ran an internal test invoking the method *ClassUtil.forName()* 1M times; > the results are as follows: > * With cache: 20ms; > * Without cache: less than 2s. > Invoking the method *ClassUtil.forName()* with *newInstance()* 1M times gives > the following results: > * With cache: around 2s > * Without cache: around 3s. > Considering that *ClassUtil.forName()* is always invoked with *newInstance()*, > there is not much degradation without the cache. Thus the fix is just to remove the > cache. > > The other is in the method *CubeService.getHTableInfo()*. We changed the > WeakHashMap to a Guava Cache with size and time limits. -- This message was sent by Atlassian JIRA (v6.4.14#64029)
[jira] [Updated] (KYLIN-2718) overflow when calculating combination amount based on static rules
[ https://issues.apache.org/jira/browse/KYLIN-2718?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Zhong Yanghong updated KYLIN-2718: -- Attachment: (was: APACHE-KYLIN-2718.patch) > overflow when calculating combination amount based on static rules > -- > > Key: KYLIN-2718 > URL: https://issues.apache.org/jira/browse/KYLIN-2718 > Project: Kylin > Issue Type: Bug > Components: Metadata >Affects Versions: all >Reporter: Zhong Yanghong >Assignee: Zhong Yanghong > Fix For: v2.1.0 > > Attachments: APACHE-KYLIN-2718-Guava.patch > > > In extreme cases, the value of *combination* will exceed Long.MAX_VALUE, making > the validation ineffective. -- This message was sent by Atlassian JIRA (v6.4.14#64029)
[jira] [Commented] (KYLIN-2718) overflow when calculating combination amount based on static rules
[ https://issues.apache.org/jira/browse/KYLIN-2718?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16110458#comment-16110458 ] Zhong Yanghong commented on KYLIN-2718: --- Thanks for [~liyang.g...@gmail.com]'s comments. Overflow checking is now done with the Guava library: [^APACHE-KYLIN-2718-Guava.patch] > overflow when calculating combination amount based on static rules > -- > > Key: KYLIN-2718 > URL: https://issues.apache.org/jira/browse/KYLIN-2718 > Project: Kylin > Issue Type: Bug > Components: Metadata >Affects Versions: all >Reporter: Zhong Yanghong >Assignee: Zhong Yanghong > Fix For: v2.1.0 > > Attachments: APACHE-KYLIN-2718-Guava.patch, APACHE-KYLIN-2718.patch > > > In extreme cases, the value of *combination* will exceed Long.MAX_VALUE, making > the validation ineffective. -- This message was sent by Atlassian JIRA (v6.4.14#64029)
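As a rough illustration of the overflow problem in KYLIN-2718 above, the sketch below compares a plain long multiplication with a checked one. It is not the attached patch: it assumes, purely for illustration, that the combination amount is a product of per-group counts, and it uses the JDK's Math.multiplyExact rather than Guava (Guava's counterpart is LongMath.checkedMultiply). The class and method names are hypothetical.

```java
public class CombinationCheck {

    /** Naive check: the product silently wraps around on overflow. */
    static boolean exceedsLimitNaive(long[] groupCombinations, long limit) {
        long total = 1;
        for (long c : groupCombinations) {
            total *= c;
        }
        return total > limit;
    }

    /** Checked variant: an overflow is treated as "definitely above the limit". */
    static boolean exceedsLimitChecked(long[] groupCombinations, long limit) {
        long total = 1;
        try {
            for (long c : groupCombinations) {
                // Throws ArithmeticException instead of wrapping around.
                total = Math.multiplyExact(total, c);
            }
        } catch (ArithmeticException overflow) {
            return true;
        }
        return total > limit;
    }

    public static void main(String[] args) {
        // 2^40 * 2^40 = 2^80 does not fit in a long; the wrapped product is 0.
        long[] extreme = {1L << 40, 1L << 40};
        long limit = 4096;
        System.out.println("naive:   " + exceedsLimitNaive(extreme, limit));
        System.out.println("checked: " + exceedsLimitChecked(extreme, limit));
    }
}
```

With the wrapped product the naive validation lets the extreme case through, which is the "validation becomes ineffective" symptom from the issue; the checked variant rejects it.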
[jira] [Updated] (KYLIN-2718) overflow when calculating combination amount based on static rules
[ https://issues.apache.org/jira/browse/KYLIN-2718?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Zhong Yanghong updated KYLIN-2718: -- Attachment: APACHE-KYLIN-2718-Guava.patch > overflow when calculating combination amount based on static rules > -- > > Key: KYLIN-2718 > URL: https://issues.apache.org/jira/browse/KYLIN-2718 > Project: Kylin > Issue Type: Bug > Components: Metadata >Affects Versions: all >Reporter: Zhong Yanghong >Assignee: Zhong Yanghong > Fix For: v2.1.0 > > Attachments: APACHE-KYLIN-2718-Guava.patch, APACHE-KYLIN-2718.patch > > > In extreme cases, the value of *combination* will exceed Long.MAX_VALUE, making > the validation ineffective. -- This message was sent by Atlassian JIRA (v6.4.14#64029)
[jira] [Updated] (KYLIN-2718) overflow when calculating combination amount based on static rules
[ https://issues.apache.org/jira/browse/KYLIN-2718?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Zhong Yanghong updated KYLIN-2718: -- Attachment: (was: APACHE-KYLIN-2718.patch) > overflow when calculating combination amount based on static rules > -- > > Key: KYLIN-2718 > URL: https://issues.apache.org/jira/browse/KYLIN-2718 > Project: Kylin > Issue Type: Bug > Components: Metadata >Affects Versions: all >Reporter: Zhong Yanghong >Assignee: Zhong Yanghong > Fix For: v2.1.0 > > Attachments: APACHE-KYLIN-2718.patch > > > In extreme cases, the value of *combination* will exceed Long.MAX_VALUE, making > the validation ineffective. -- This message was sent by Atlassian JIRA (v6.4.14#64029)
[jira] [Updated] (KYLIN-2718) overflow when calculating combination amount based on static rules
[ https://issues.apache.org/jira/browse/KYLIN-2718?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Zhong Yanghong updated KYLIN-2718: -- Attachment: APACHE-KYLIN-2718.patch > overflow when calculating combination amount based on static rules > -- > > Key: KYLIN-2718 > URL: https://issues.apache.org/jira/browse/KYLIN-2718 > Project: Kylin > Issue Type: Bug > Components: Metadata >Affects Versions: all >Reporter: Zhong Yanghong >Assignee: Zhong Yanghong > Fix For: v2.1.0 > > Attachments: APACHE-KYLIN-2718.patch, APACHE-KYLIN-2718.patch > > > In extreme cases, the value of *combination* will exceed Long.MAX_VALUE, making > the validation ineffective. -- This message was sent by Atlassian JIRA (v6.4.14#64029)
[jira] [Updated] (KYLIN-2718) overflow when calculating combination amount based on static rules
[ https://issues.apache.org/jira/browse/KYLIN-2718?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Zhong Yanghong updated KYLIN-2718: -- Fix Version/s: (was: v2.0.0) v2.1.0 > overflow when calculating combination amount based on static rules > -- > > Key: KYLIN-2718 > URL: https://issues.apache.org/jira/browse/KYLIN-2718 > Project: Kylin > Issue Type: Bug > Components: Metadata >Affects Versions: all >Reporter: Zhong Yanghong >Assignee: Zhong Yanghong > Fix For: v2.1.0 > > Attachments: APACHE-KYLIN-2718.patch > > > In extreme cases, the value of *combination* will exceed Long.MAX_VALUE, making > the validation ineffective. -- This message was sent by Atlassian JIRA (v6.4.14#64029)