[jira] [Created] (KYLIN-2021) Cognos Issues
hongbin ma created KYLIN-2021:
------------------------------

Summary: Cognos Issues
Key: KYLIN-2021
URL: https://issues.apache.org/jira/browse/KYLIN-2021
Project: Kylin
Issue Type: Improvement
Reporter: hongbin ma
Assignee: hongbin ma

Cognos will generate some queries that Kylin does not support yet.

--
This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (KYLIN-1945) Cuboid.translateToValidCuboid method throw exception while cube building or query execute
[ https://issues.apache.org/jira/browse/KYLIN-1945?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15499098#comment-15499098 ]

hongbin ma commented on KYLIN-1945:
------------------------------

not yet

> Cuboid.translateToValidCuboid method throw exception while cube building or query execute
> ------------------------------
>
> Key: KYLIN-1945
> URL: https://issues.apache.org/jira/browse/KYLIN-1945
> Project: Kylin
> Issue Type: Bug
> Affects Versions: v1.5.3
> Environment: centos 6.4, hadoop 2.6.4, kylin 1.5.3, hbase 1.2.1
> Reporter: logicigam
> Assignee: hongbin ma
>
> I managed to reproduce the exception on the sample cube just by adding all of the defined dimensions as Mandatory Dimensions to one Aggregation Group, while another group contains only part of the dimensions.
> ===
> Here is the error stack from query execution:
> Caused by: java.util.NoSuchElementException
>     at java.util.ArrayList$Itr.next(ArrayList.java:834)
>     at java.util.Collections.min(Collections.java:665)
>     at org.apache.kylin.cube.cuboid.Cuboid.translateToValidCuboid(Cuboid.java:217)
>     at org.apache.kylin.cube.cuboid.Cuboid.translateToValidCuboid(Cuboid.java:141)
>     at org.apache.kylin.cube.cuboid.Cuboid.findById(Cuboid.java:83)
>     at org.apache.kylin.cube.cuboid.Cuboid.identifyCuboid(Cuboid.java:68)
>     at org.apache.kylin.storage.gtrecord.GTCubeStorageQueryBase.search(GTCubeStorageQueryBase.java:97)
>     at org.apache.kylin.query.enumerator.OLAPEnumerator.queryStorage(OLAPEnumerator.java:120)
>     at org.apache.kylin.query.enumerator.OLAPEnumerator.moveNext(OLAPEnumerator.java:69)
>     at Baz$1$1.moveNext(Unknown Source)
>     at org.apache.calcite.linq4j.EnumerableDefaults.groupBy_(EnumerableDefaults.java:819)
>     at org.apache.calcite.linq4j.EnumerableDefaults.groupBy(EnumerableDefaults.java:754)
>     at org.apache.calcite.linq4j.DefaultEnumerable.groupBy(DefaultEnumerable.java:302)
>     at Baz.bind(Unknown Source)
>     at org.apache.calcite.jdbc.CalcitePrepare$CalciteSignature.enumerable(CalcitePrepare.java:326)
>     at org.apache.calcite.jdbc.CalciteConnectionImpl.enumerable(CalciteConnectionImpl.java:281)
>     at org.apache.calcite.jdbc.CalciteMetaImpl._createIterable(CalciteMetaImpl.java:545)
> The same thing happens during cube building; the exception log can be found on the hadoop job monitor page.
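The top two frames of the KYLIN-1945 trace show `Collections.min` calling `next()` on an empty iterator, which is how the JDK reports an empty input collection. The following is only a minimal sketch of that failure mode and of a guarded alternative. The class `EmptyMinSketch`, the helper `minOrFallback`, and the fallback value are hypothetical; they are not taken from Kylin's `Cuboid` code.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Collection;
import java.util.Collections;
import java.util.List;
import java.util.NoSuchElementException;

class EmptyMinSketch {
    // Hypothetical helper: smallest candidate, or the fallback when there are no candidates.
    static long minOrFallback(Collection<Long> candidates, long fallback) {
        return candidates.isEmpty() ? fallback : Collections.min(candidates);
    }

    // Returns true when Collections.min rejects the collection with NoSuchElementException.
    static boolean minThrows(Collection<Long> candidates) {
        try {
            Collections.min(candidates);
            return false;
        } catch (NoSuchElementException e) {
            return true;
        }
    }

    public static void main(String[] args) {
        List<Long> none = new ArrayList<>();
        System.out.println("empty input throws: " + minThrows(none));
        System.out.println("guarded, empty input: " + minOrFallback(none, -1L));
        System.out.println("guarded, three candidates: " + minOrFallback(Arrays.asList(7L, 3L, 5L), -1L));
    }
}
```

Whether a fallback value or an explicit validation error is the right reaction to an empty candidate list depends on the caller; the sketch only shows where the exception originates.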
[jira] [Commented] (KYLIN-1573) aggregator expression like sum(a+b) is not working
[ https://issues.apache.org/jira/browse/KYLIN-1573?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15499096#comment-15499096 ]

hongbin ma commented on KYLIN-1573:
------------------------------

Maybe we can borrow the idea of UDAF frameworks from Hive to allow customized aggregate functions. With a UDAF, users can define their own aggregator (in a separate jar) that is used in both cube building and cube querying. The drawback is that they have to create and use a new aggregator name, which will be a problem if the queries are generated automatically by BI tools. Hive UDAF is already the de facto standard, and Spark borrows it too.

Creating a new column with a Hive view is the suggested option for now.

> aggregator expression like sum(a+b) is not working
> ------------------------------
>
> Key: KYLIN-1573
> URL: https://issues.apache.org/jira/browse/KYLIN-1573
> Project: Kylin
> Issue Type: Bug
> Components: Query Engine
> Affects Versions: 1.x-HBase1.1.3
> Reporter: Babulal
> Assignee: liyang
> Priority: Minor
> Labels: newbie, test
>
> In Kylin, expressions like +, /, *, etc. inside an aggregator are not working.
> For example:
> select country,
>        city,
>        sum(male_population+Female_population)
> from population
> group by country,city
> Error
> Caused by: java.lang.NullPointerException
>     at org.apache.kylin.cube.CubeCapabilityChecker.tryDimensionAsMeasures(CubeCapabilityChecker.java:170)
>     at org.apache.kylin.cube.CubeCapabilityChecker.check(CubeCapabilityChecker.java:73)
>     at org.apache.kylin.cube.CubeInstance.isCapable(CubeInstance.java:330)
>     at org.apache.kylin.query.routing.rules.RemoveUncapableRealizationsRule.apply(RemoveUncapableRealizationsRule.java:36)
>     at org.apache.kylin.query.routing.RoutingRule.applyRules(RoutingRule.java:47)
>     at org.apache.kylin.query.routing.QueryRouter.selectRealization(QueryRouter.java:63)
>     at org.apache.kylin.query.relnode.OLAPToEnumerableConverter.implement(OLAPToEnumerableConverter.java:80)
>     at org.apache.calcite.adapter.enumerable.EnumerableRelImplementor.implementRoot(EnumerableRelImplementor.java:102)
>     at org.apache.calcite.adapter.enumerable.EnumerableInterpretable.toBindable(EnumerableInterpretable.java:92)
>     at org.apache.calcite.prepare.CalcitePrepareImpl$CalcitePreparingStmt.implement(CalcitePrepareImpl.java:1171)
>     at org.apache.calcite.prepare.Prepare.prepareSql(Prepare.java:297)
>     at org.apache.calcite.prepare.Prepare.prepareSql(Prepare.java:196)
> When the query is changed as below, it works fine:
> select country,
>        city,
>        sum(male_population)+sum(Female_population)
> from population
> group by country,city
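The two queries in KYLIN-1573 return the same totals only when neither column contains NULL, because in SQL `a+b` is NULL as soon as one operand is NULL and `sum` skips NULL inputs. The sketch below illustrates that difference in plain Java under stated assumptions: the class `SumRewriteSketch`, its `Row` type, and the sample numbers are hypothetical, and an all-NULL sum is simplified to 0 rather than SQL NULL.

```java
import java.util.Arrays;
import java.util.List;

class SumRewriteSketch {
    // One row of a hypothetical population table; null models SQL NULL.
    static final class Row {
        final Integer male;
        final Integer female;
        Row(Integer male, Integer female) { this.male = male; this.female = female; }
    }

    // sum(male + female): a row contributes only when both operands are non-null.
    static long sumOfExpression(List<Row> rows) {
        long total = 0;
        for (Row r : rows) {
            if (r.male != null && r.female != null) total += r.male + r.female;
        }
        return total;
    }

    // sum(male) + sum(female): each column skips its own nulls independently.
    static long sumOfSums(List<Row> rows) {
        long males = 0, females = 0;
        for (Row r : rows) {
            if (r.male != null) males += r.male;
            if (r.female != null) females += r.female;
        }
        return males + females;
    }

    public static void main(String[] args) {
        List<Row> noNulls = Arrays.asList(new Row(10, 12), new Row(20, 18));
        List<Row> withNull = Arrays.asList(new Row(10, 12), new Row(20, null));
        System.out.println(sumOfExpression(noNulls) + " vs " + sumOfSums(noNulls));
        System.out.println(sumOfExpression(withNull) + " vs " + sumOfSums(withNull));
    }
}
```

So the rewritten form is a safe workaround when the summed columns are known to be non-null; otherwise the results can diverge.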
[jira] [Created] (KYLIN-2015) replace h2 with alternatives like sqllite or mysql
hongbin ma created KYLIN-2015:
------------------------------

Summary: replace h2 with alternatives like sqllite or mysql
Key: KYLIN-2015
URL: https://issues.apache.org/jira/browse/KYLIN-2015
Project: Kylin
Issue Type: Improvement
Reporter: hongbin ma
Assignee: hongbin ma

In IT we compare Kylin's results with H2's results to ensure query correctness. However, H2 supports only part of the SQL syntax. For example, it does not support functions like timestampadd, or expressions like (DATE'2013-01-02' + interval '3' day). What's more, subqueries are observed to be very slow on H2.
[jira] [Created] (KYLIN-2011) Query without order by should not leverage topn measure
hongbin ma created KYLIN-2011:
------------------------------

Summary: Query without order by should not leverage topn measure
Key: KYLIN-2011
URL: https://issues.apache.org/jira/browse/KYLIN-2011
Project: Kylin
Issue Type: Bug
Reporter: hongbin ma
Assignee: hongbin ma
Priority: Minor

sql_topn/query45.sql:

select seller_id, sum(price) as s from test_kylin_fact where lstg_format_name='FP-GTC' group by seller_id

This query does not have an order by, yet it still leverages the topn measure. Since topn uses double encoding instead of decimal encoding, there is precision loss.
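The precision loss mentioned in KYLIN-2011 is the usual gap between binary floating point and exact decimal arithmetic. A minimal sketch of that gap follows; it does not use Kylin's topn encoding, and the class `DoubleVsDecimalSketch` and the sample price are hypothetical.

```java
import java.math.BigDecimal;

class DoubleVsDecimalSketch {
    // Sums the same price `count` times using binary floating point.
    static double sumAsDouble(String price, int count) {
        double total = 0.0;
        double p = Double.parseDouble(price);
        for (int i = 0; i < count; i++) total += p;
        return total;
    }

    // Sums the same price `count` times using exact decimal arithmetic.
    static BigDecimal sumAsDecimal(String price, int count) {
        BigDecimal total = BigDecimal.ZERO;
        BigDecimal p = new BigDecimal(price);
        for (int i = 0; i < count; i++) total = total.add(p);
        return total;
    }

    public static void main(String[] args) {
        System.out.println("double : " + sumAsDouble("0.1", 10));
        System.out.println("decimal: " + sumAsDecimal("0.1", 10));
    }
}
```

Comparing such a double-based sum against a decimal-based reference result is enough to make an exact-match test fail, even though the difference is tiny.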
[jira] [Resolved] (KYLIN-2005) Move all storage side behavior hints to GTScanRequest
[ https://issues.apache.org/jira/browse/KYLIN-2005?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

hongbin ma resolved KYLIN-2005.
------------------------------
Resolution: Fixed
Fix Version/s: v1.5.4

> Move all storage side behavior hints to GTScanRequest
> ------------------------------
>
> Key: KYLIN-2005
> URL: https://issues.apache.org/jira/browse/KYLIN-2005
> Project: Kylin
> Issue Type: Bug
> Reporter: hongbin ma
> Assignee: hongbin ma
> Fix For: v1.5.4
[jira] [Commented] (KYLIN-1922) Improve the logic to decide whether to pre aggregate on Region server
[ https://issues.apache.org/jira/browse/KYLIN-1922?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15483045#comment-15483045 ]

hongbin ma commented on KYLIN-1922:
------------------------------

Also improved the coprocessor self-termination logic in the commit, so that very time-consuming queries abort themselves in the coprocessor before draining all region server resources.

> Improve the logic to decide whether to pre aggregate on Region server
> ------------------------------
>
> Key: KYLIN-1922
> URL: https://issues.apache.org/jira/browse/KYLIN-1922
> Project: Kylin
> Issue Type: Improvement
> Reporter: hongbin ma
> Assignee: hongbin ma
> Fix For: v1.5.4
[jira] [Resolved] (KYLIN-1922) Improve the logic to decide whether to pre aggregate on Region server
[ https://issues.apache.org/jira/browse/KYLIN-1922?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

hongbin ma resolved KYLIN-1922.
------------------------------
Resolution: Fixed
Fix Version/s: v1.5.4

> Improve the logic to decide whether to pre aggregate on Region server
> ------------------------------
>
> Key: KYLIN-1922
> URL: https://issues.apache.org/jira/browse/KYLIN-1922
> Project: Kylin
> Issue Type: Improvement
> Reporter: hongbin ma
> Assignee: hongbin ma
> Fix For: v1.5.4
[jira] [Created] (KYLIN-2005) Move all storage side behavior hints to GTScanRequest
hongbin ma created KYLIN-2005:
------------------------------

Summary: Move all storage side behavior hints to GTScanRequest
Key: KYLIN-2005
URL: https://issues.apache.org/jira/browse/KYLIN-2005
Project: Kylin
Issue Type: Bug
Reporter: hongbin ma
Assignee: hongbin ma
[jira] [Updated] (KYLIN-1999) Use some compression at UT/IT
[ https://issues.apache.org/jira/browse/KYLIN-1999?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

hongbin ma updated KYLIN-1999:
------------------------------
Description: KYLIN-1984 disabled compression in the packaging configurations to maximize ease of use. Still, we need to make sure everything works when compression is enabled, especially for snappy. Currently in IT, HBase is compressed with gzip, and MapReduce does not seem to enable any compression. (was: KYLIN-1984 disabled compression in the packaging configurations to maximize ease of use. Still, we need to make sure everything works when compression is enabled.)

> Use some compression at UT/IT
> ------------------------------
>
> Key: KYLIN-1999
> URL: https://issues.apache.org/jira/browse/KYLIN-1999
> Project: Kylin
> Issue Type: Improvement
> Reporter: hongbin ma
> Assignee: Yiming Liu
> Priority: Minor
>
> KYLIN-1984 disabled compression in the packaging configurations to maximize ease of use. Still, we need to make sure everything works when compression is enabled, especially for snappy. Currently in IT, HBase is compressed with gzip, and MapReduce does not seem to enable any compression.
[jira] [Updated] (KYLIN-1999) Use some compression at UT/IT
[ https://issues.apache.org/jira/browse/KYLIN-1999?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

hongbin ma updated KYLIN-1999:
------------------------------
Assignee: Yiming Liu (was: hongbin ma)

> Use some compression at UT/IT
> ------------------------------
>
> Key: KYLIN-1999
> URL: https://issues.apache.org/jira/browse/KYLIN-1999
> Project: Kylin
> Issue Type: Improvement
> Reporter: hongbin ma
> Assignee: Yiming Liu
>
> KYLIN-1984 disabled compression in the packaging configurations to maximize ease of use. Still, we need to make sure everything works when compression is enabled.
[jira] [Created] (KYLIN-1999) Use some compression at UT/IT
hongbin ma created KYLIN-1999:
------------------------------

Summary: Use some compression at UT/IT
Key: KYLIN-1999
URL: https://issues.apache.org/jira/browse/KYLIN-1999
Project: Kylin
Issue Type: Improvement
Reporter: hongbin ma
Assignee: hongbin ma

KYLIN-1984 disabled compression in the packaging configurations to maximize ease of use. Still, we need to make sure everything works when compression is enabled.
[jira] [Updated] (KYLIN-1999) Use some compression at UT/IT
[ https://issues.apache.org/jira/browse/KYLIN-1999?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

hongbin ma updated KYLIN-1999:
------------------------------
Priority: Minor (was: Major)

> Use some compression at UT/IT
> ------------------------------
>
> Key: KYLIN-1999
> URL: https://issues.apache.org/jira/browse/KYLIN-1999
> Project: Kylin
> Issue Type: Improvement
> Reporter: hongbin ma
> Assignee: Yiming Liu
> Priority: Minor
>
> KYLIN-1984 disabled compression in the packaging configurations to maximize ease of use. Still, we need to make sure everything works when compression is enabled.
[jira] [Updated] (KYLIN-2000) update document for KYLIN-1984
[ https://issues.apache.org/jira/browse/KYLIN-2000?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

hongbin ma updated KYLIN-2000:
------------------------------
Assignee: Yiming Liu (was: hongbin ma)

> update document for KYLIN-1984
> ------------------------------
>
> Key: KYLIN-2000
> URL: https://issues.apache.org/jira/browse/KYLIN-2000
> Project: Kylin
> Issue Type: Improvement
> Reporter: hongbin ma
> Assignee: Yiming Liu
>
> http://kylin.apache.org/docs15/install/advance_settings.html
> the doc needs updating
[jira] [Created] (KYLIN-2000) update document for KYLIN-1984
hongbin ma created KYLIN-2000:
------------------------------

Summary: update document for KYLIN-1984
Key: KYLIN-2000
URL: https://issues.apache.org/jira/browse/KYLIN-2000
Project: Kylin
Issue Type: Improvement
Reporter: hongbin ma
Assignee: hongbin ma

http://kylin.apache.org/docs15/install/advance_settings.html
the doc needs updating
[jira] [Updated] (KYLIN-1984) Don't use compression in packaging configuration
[ https://issues.apache.org/jira/browse/KYLIN-1984?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

hongbin ma updated KYLIN-1984:
------------------------------
Summary: Don't use compression in packaging configuration (was: Don't use compression in default configuration)

> Don't use compression in packaging configuration
> ------------------------------
>
> Key: KYLIN-1984
> URL: https://issues.apache.org/jira/browse/KYLIN-1984
> Project: Kylin
> Issue Type: Improvement
> Components: Environment
> Reporter: Shaofeng SHI
> Assignee: Yiming Liu
> Fix For: v1.5.4
> Attachments: 0001-KYLIN-1984-Disable-compress-in-default-configuration.patch
>
> Today Kylin's default configuration uses snappy compression for Hadoop and HBase. Many new users don't have the hadoop native packages installed, so they might be blocked there.
> To improve the experience for new users, Kylin's default configuration should disable compression, and a document should be added that introduces how to enable compression.
[jira] [Commented] (KYLIN-1981) PK/FK derived will break for left join in some cases
[ https://issues.apache.org/jira/browse/KYLIN-1981?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15470600#comment-15470600 ]

hongbin ma commented on KYLIN-1981:
------------------------------

Another kind of related issue is (sql_raw/query26.sql):

for queries like

select LSTG_FORMAT_NAME, LSTG_SITE_ID, SLR_SEGMENT_CD, test_kylin_fact.CAL_DT, test_category_groupings.LEAF_CATEG_ID, PRICE
from test_kylin_fact
left JOIN edw.test_cal_dt as test_cal_dt
  ON test_kylin_fact.cal_dt = test_cal_dt.cal_dt
left JOIN test_category_groupings
  ON test_kylin_fact.leaf_categ_id = test_category_groupings.leaf_categ_id
  AND test_kylin_fact.lstg_site_id = test_category_groupings.site_id
left JOIN edw.test_sites as test_sites
  ON test_kylin_fact.lstg_site_id = test_sites.site_id

the expected result looks like this (watch the null values; they exist because of the left join):

Others_B3 5 2013-07-25 75708 719.1005
Others_B3 5 2013-07-30 null 745.4067
Others_B3 5 2013-08-31 null 671.7349
Others_B3 5 2013-11-30 null 781.5007

however the actual output looks like:

Others_B3 5 2013-07-25 75708 719.1005
Others_B3 5 2013-07-30 67698 745.4067
Others_B3 5 2013-08-31 106246 671.7349
Others_B3 5 2013-11-30 164261 781.5007

> PK/FK derived will break for left join in some cases
> ------------------------------
>
> Key: KYLIN-1981
> URL: https://issues.apache.org/jira/browse/KYLIN-1981
> Project: Kylin
> Issue Type: Bug
> Reporter: hongbin ma
> Assignee: liyang
> Priority: Minor
>
> For left join cubes, suppose A is the FK in the fact table and B is the PK in the lookup table. A query like the one below will end up with "more" results:
> select B, count(*) from fact left join lookup group by B
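The expected and actual outputs in the KYLIN-1981 comment differ exactly where the lookup table has no matching row: a left join should return NULL for the lookup PK there, while answering the PK from the fact table's FK returns the FK value. The sketch below illustrates only that symptom; it is not Kylin's derivation code. The class `LeftJoinPkSketch` and both method names are hypothetical, and the two key values are taken from the sample rows in the comment.

```java
import java.util.Arrays;
import java.util.HashSet;
import java.util.Set;

class LeftJoinPkSketch {
    // Left-join semantics: the lookup PK is NULL (null here) when no lookup row matches the FK.
    static Integer lookupPk(int factFk, Set<Integer> lookupKeys) {
        return lookupKeys.contains(factFk) ? Integer.valueOf(factFk) : null;
    }

    // Shortcut that answers the PK column from the fact table's FK; it can never yield null.
    static Integer pkDerivedFromFk(int factFk) {
        return factFk;
    }

    public static void main(String[] args) {
        // Assumption for illustration: only 75708 exists in the lookup table.
        Set<Integer> lookupKeys = new HashSet<>(Arrays.asList(75708));
        for (int fk : new int[] {75708, 67698}) {
            System.out.println(fk + " -> joined=" + lookupPk(fk, lookupKeys)
                    + ", derived=" + pkDerivedFromFk(fk));
        }
    }
}
```

For an inner join the two methods agree on every surviving row, which is why the shortcut is only a problem for left joins.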
[jira] [Commented] (KYLIN-1989) Sum issue: result not consistent with hive
[ https://issues.apache.org/jira/browse/KYLIN-1989?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15464080#comment-15464080 ]

hongbin ma commented on KYLIN-1989:
------------------------------

I can reproduce your issue now. Kylin does not support "case when" inside aggregators (sum, min, max) yet. Only "case when" on a dimension is supported, for example:

SELECT (CASE WHEN ("TEST_KYLIN_FACT"."LSTG_FORMAT_NAME" = 'Auction') THEN 'Auction2' ELSE "TEST_KYLIN_FACT"."LSTG_FORMAT_NAME" END) AS "LSTG_FORMAT_NAME__group_",
  SUM("TEST_KYLIN_FACT"."PRICE") AS "sum_PRICE_ok"
FROM "TEST_KYLIN_FACT" "TEST_KYLIN_FACT"
GROUP BY (CASE WHEN ("TEST_KYLIN_FACT"."LSTG_FORMAT_NAME" = 'Auction') THEN 'Auction2' ELSE "TEST_KYLIN_FACT"."LSTG_FORMAT_NAME" END)

As a workaround, for now you can change the "case when" into a filter.

> Sum issue: result not consistent with hive
> ------------------------------
>
> Key: KYLIN-1989
> URL: https://issues.apache.org/jira/browse/KYLIN-1989
> Project: Kylin
> Issue Type: Bug
> Affects Versions: v1.5.2
> Reporter: Le Van Ha
> Assignee: hongbin ma
> Attachments: hive_result.png, kylin_result.png
>
> When running the following query,
> SELECT channel.name, sum(case when product.product_vendor <> '' then fact_product.quantity else 0 end)
> FROM fact_product_sales as fact_product
> join dim_product as product on fact_product.product_id = product.id
> join dim_channel as channel on fact_product.channel_id = channel.id
> GROUP BY channel.name
> ---
> The result by kylin:
> Buy Button 0
> Online Store 0
> ---
> The result by hive is shown in the attached figure.
> Why is that?
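The workaround suggested in the KYLIN-1989 comment, turning `sum(case when cond then x else 0 end)` into `sum(x) ... where cond`, gives the same totals for every group that has at least one matching row, but groups with no matching row drop out of the filtered result instead of showing 0. A small sketch of both shapes in plain Java follows. The class `CaseWhenAsFilterSketch`, the `Sale` type, the vendor name, and the quantities are hypothetical; only the two channel names come from the issue.

```java
import java.util.Arrays;
import java.util.List;
import java.util.Map;
import java.util.TreeMap;

class CaseWhenAsFilterSketch {
    // One hypothetical fact row already joined with its channel and vendor.
    static final class Sale {
        final String channel;
        final String vendor;
        final long quantity;
        Sale(String channel, String vendor, long quantity) {
            this.channel = channel; this.vendor = vendor; this.quantity = quantity;
        }
    }

    // sum(case when vendor <> '' then quantity else 0 end) ... group by channel
    static Map<String, Long> sumWithCaseWhen(List<Sale> sales) {
        Map<String, Long> out = new TreeMap<>();
        for (Sale s : sales) {
            long q = s.vendor.isEmpty() ? 0L : s.quantity;
            out.merge(s.channel, q, Long::sum);
        }
        return out;
    }

    // sum(quantity) ... where vendor <> '' group by channel
    static Map<String, Long> sumWithFilter(List<Sale> sales) {
        Map<String, Long> out = new TreeMap<>();
        for (Sale s : sales) {
            if (!s.vendor.isEmpty()) out.merge(s.channel, s.quantity, Long::sum);
        }
        return out;
    }

    public static void main(String[] args) {
        List<Sale> sales = Arrays.asList(
                new Sale("Online Store", "acme", 5),
                new Sale("Online Store", "", 2),
                new Sale("Buy Button", "", 4));
        System.out.println("case when: " + sumWithCaseWhen(sales));
        System.out.println("filter   : " + sumWithFilter(sales));
    }
}
```

If the zero rows matter to the report, they have to be added back on the client side after applying the filter workaround.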
[jira] [Commented] (KYLIN-1987) kylin can't support date/time functions like TIMESTAMPADD、TIMESTAMPDIFF
[ https://issues.apache.org/jira/browse/KYLIN-1987?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15462941#comment-15462941 ]

hongbin ma commented on KYLIN-1987:
------------------------------

My mistake, I did not grep the doc thoroughly. In Calcite, the function belongs to the "JDBC function escape", which is a JDBC standard syntax for such scalar function calls. I'm not sure about the difference between {fn TIMESTAMPADD()} and TIMESTAMPADD(); however, I guess it is trivial to support the former in Kylin. If BI tool-generated queries can go with the former, it's all set.

> kylin can't support date/time functions like TIMESTAMPADD、TIMESTAMPDIFF
> ------------------------------
>
> Key: KYLIN-1987
> URL: https://issues.apache.org/jira/browse/KYLIN-1987
> Project: Kylin
> Issue Type: Bug
> Components: Query Engine
> Affects Versions: v1.5.3
> Environment: kylin-1.5.3-cdh5.7
> Reporter: SF Wang
> Assignee: hongbin ma
> Labels: newbie
>
> Kylin can't support date/time functions like TIMESTAMPADD、TIMESTAMPDIFF.
> When I execute SQL on the sample table kylin_sales in the Kylin web CLI, it reports an error.
> My sql:
> select timestampadd(DAY, 1, cast(part_dt as timestamp)) from kylin_sales
> Error message:
> Encountered "(DAY" at line 1, column 20. Was expecting one of :"."... "("... ...
[jira] [Assigned] (KYLIN-1987) kylin can't support date/time functions like TIMESTAMPADD、TIMESTAMPDIFF
[ https://issues.apache.org/jira/browse/KYLIN-1987?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

hongbin ma reassigned KYLIN-1987:
------------------------------
Assignee: hongbin ma (was: liyang)

> kylin can't support date/time functions like TIMESTAMPADD、TIMESTAMPDIFF
> ------------------------------
>
> Key: KYLIN-1987
> URL: https://issues.apache.org/jira/browse/KYLIN-1987
> Project: Kylin
> Issue Type: Bug
> Components: Query Engine
> Affects Versions: v1.5.3
> Environment: kylin-1.5.3-cdh5.7
> Reporter: SF Wang
> Assignee: hongbin ma
> Labels: newbie
>
> Kylin can't support date/time functions like TIMESTAMPADD、TIMESTAMPDIFF.
> When I execute SQL on the sample table kylin_sales in the Kylin web CLI, it reports an error.
> My sql:
> select timestampadd(DAY, 1, cast(part_dt as timestamp)) from kylin_sales
> Error message:
> Encountered "(DAY" at line 1, column 20. Was expecting one of :"."... "("... ...
[jira] [Commented] (KYLIN-1995) Upgrade MapReduce properties which are deprecated
[ https://issues.apache.org/jira/browse/KYLIN-1995?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15462926#comment-15462926 ]

hongbin ma commented on KYLIN-1995:
------------------------------

Nice findings! Should we also take care of all the kylin_job_conf.xml and kylin_job_conf_inmem.xml files in the examples/ folder? They are used in UT/IT.

> Upgrade MapReduce properties which are deprecated
> ------------------------------
>
> Key: KYLIN-1995
> URL: https://issues.apache.org/jira/browse/KYLIN-1995
> Project: Kylin
> Issue Type: Task
> Components: Job Engine
> Affects Versions: v1.5.2
> Reporter: Billy(Yiming) Liu
> Assignee: Dong Li
> Priority: Minor
> Attachments: 0001-KYLIN-1995-Upgrade-deprecated-properties-for-Hadoop-.patch
>
> Currently, Kylin uses the Hadoop 2.6 API. Many properties are deprecated in Hadoop 2.6 but still used in Kylin's kylin_hive_conf.xml and kylin_job_conf.xml, such as mapred.compress.map.output, mapred.map.output.compression.codec and more. The full list is
> https://hadoop.apache.org/docs/r2.6.0/hadoop-project-dist/hadoop-common/DeprecatedProperties.html
>
> In this JIRA task, the deprecated properties will be upgraded to their effective Hadoop 2.6 names and values.
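For the two properties named in KYLIN-1995, the Hadoop deprecated-properties list gives `mapreduce.map.output.compress` and `mapreduce.map.output.compress.codec` as the current names. The sketch below shows how such a rename could be applied to a set of key/value settings. It is an illustration only: the class `DeprecatedKeySketch` and its `upgrade` method are hypothetical, it covers just these two keys, and it does not reflect what the attached patch actually changes.

```java
import java.util.LinkedHashMap;
import java.util.Map;

class DeprecatedKeySketch {
    // Old name -> new name, limited to the two properties mentioned in the issue.
    static final Map<String, String> RENAMES = new LinkedHashMap<>();
    static {
        RENAMES.put("mapred.compress.map.output", "mapreduce.map.output.compress");
        RENAMES.put("mapred.map.output.compression.codec", "mapreduce.map.output.compress.codec");
    }

    // Returns a copy of the settings with deprecated keys replaced; values and order are kept.
    static Map<String, String> upgrade(Map<String, String> conf) {
        Map<String, String> out = new LinkedHashMap<>();
        for (Map.Entry<String, String> e : conf.entrySet()) {
            out.put(RENAMES.getOrDefault(e.getKey(), e.getKey()), e.getValue());
        }
        return out;
    }

    public static void main(String[] args) {
        Map<String, String> conf = new LinkedHashMap<>();
        conf.put("mapred.compress.map.output", "true");
        conf.put("mapred.map.output.compression.codec", "org.apache.hadoop.io.compress.SnappyCodec");
        System.out.println(upgrade(conf));
    }
}
```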
[jira] [Resolved] (KYLIN-1992) Clear ThreadLocal Contexts when query failed before scaning HBase
[ https://issues.apache.org/jira/browse/KYLIN-1992?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

hongbin ma resolved KYLIN-1992.
------------------------------
Resolution: Fixed

> Clear ThreadLocal Contexts when query failed before scaning HBase
> ------------------------------
>
> Key: KYLIN-1992
> URL: https://issues.apache.org/jira/browse/KYLIN-1992
> Project: Kylin
> Issue Type: Bug
> Reporter: kangkaisen
> Assignee: kangkaisen
> Priority: Minor
> Fix For: v1.5.4
> Attachments: KYLIN-1992.patch
>
> Currently, we call the `OLAPContext.clearThreadLocalContexts()` function before scanning HBase.
> If a query fails before scanning HBase, we may get the wrong `realization` for the query in the log, because Tomcat's thread pool reuses the thread and does not clear the ThreadLocal variable.
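The behaviour described in KYLIN-1992 is a general property of pooled threads: a `ThreadLocal` value set by one task is still visible to the next task on the same thread unless it is removed. The sketch below reproduces that with a single-thread executor and shows a `finally`-based cleanup. It is an assumption-laden illustration: `ThreadLocalCleanupSketch`, `REALIZATION`, and the value `cube_A` are hypothetical stand-ins and do not mirror `OLAPContext` or the attached patch.

```java
import java.util.concurrent.Callable;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;

class ThreadLocalCleanupSketch {
    // Hypothetical stand-in for per-query context kept in a ThreadLocal.
    static final ThreadLocal<String> REALIZATION = new ThreadLocal<>();

    // Simulates a query that fails before it reaches its normal cleanup point.
    static void failingQuery(String realization, boolean clearInFinally) {
        try {
            REALIZATION.set(realization);
            throw new IllegalStateException("query failed before the storage scan");
        } catch (IllegalStateException e) {
            // The failure is handled here so the pooled thread survives and is reused.
        } finally {
            if (clearInFinally) REALIZATION.remove();
        }
    }

    // Returns what the next task on the same pooled thread sees in the ThreadLocal.
    static String observedByNextTask(boolean clearInFinally) {
        ExecutorService pool = Executors.newSingleThreadExecutor();
        try {
            Runnable first = () -> failingQuery("cube_A", clearInFinally);
            Callable<String> second = REALIZATION::get;
            pool.submit(first).get();
            return pool.submit(second).get();
        } catch (Exception e) {
            throw new RuntimeException(e);
        } finally {
            pool.shutdown();
        }
    }

    public static void main(String[] args) {
        System.out.println("without cleanup: " + observedByNextTask(false));
        System.out.println("with cleanup   : " + observedByNextTask(true));
    }
}
```

Clearing in a `finally` block (or at the very start of each request) makes the cleanup independent of where the query fails.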
[jira] [Updated] (KYLIN-1992) Clear ThreadLocal Contexts when query failed before scaning HBase
[ https://issues.apache.org/jira/browse/KYLIN-1992?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

hongbin ma updated KYLIN-1992:
------------------------------
Fix Version/s: v1.5.4

> Clear ThreadLocal Contexts when query failed before scaning HBase
> ------------------------------
>
> Key: KYLIN-1992
> URL: https://issues.apache.org/jira/browse/KYLIN-1992
> Project: Kylin
> Issue Type: Bug
> Reporter: kangkaisen
> Assignee: kangkaisen
> Priority: Minor
> Fix For: v1.5.4
> Attachments: KYLIN-1992.patch
>
> Currently, we call the `OLAPContext.clearThreadLocalContexts()` function before scanning HBase.
> If a query fails before scanning HBase, we may get the wrong `realization` for the query in the log, because Tomcat's thread pool reuses the thread and does not clear the ThreadLocal variable.
[jira] [Commented] (KYLIN-1984) Don't use compression in default configuration
[ https://issues.apache.org/jira/browse/KYLIN-1984?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15462919#comment-15462919 ]

hongbin ma commented on KYLIN-1984:
------------------------------

merged and pushed

> Don't use compression in default configuration
> ------------------------------
>
> Key: KYLIN-1984
> URL: https://issues.apache.org/jira/browse/KYLIN-1984
> Project: Kylin
> Issue Type: Improvement
> Components: Environment
> Reporter: Shaofeng SHI
> Assignee: Yiming Liu
> Fix For: v1.5.4
> Attachments: 0001-KYLIN-1984-Disable-compress-in-default-configuration.patch
>
> Today Kylin's default configuration uses snappy compression for Hadoop and HBase. Many new users don't have the hadoop native packages installed, so they might be blocked there.
> To improve the experience for new users, Kylin's default configuration should disable compression, and a document should be added that introduces how to enable compression.
[jira] [Updated] (KYLIN-1984) Don't use compression in default configuration
[ https://issues.apache.org/jira/browse/KYLIN-1984?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

hongbin ma updated KYLIN-1984:
------------------------------
Assignee: Yiming Liu (was: hongbin ma)

> Don't use compression in default configuration
> ------------------------------
>
> Key: KYLIN-1984
> URL: https://issues.apache.org/jira/browse/KYLIN-1984
> Project: Kylin
> Issue Type: Improvement
> Components: Environment
> Reporter: Shaofeng SHI
> Assignee: Yiming Liu
>
> Today Kylin's default configuration uses snappy compression for Hadoop and HBase. Many new users don't have the hadoop native packages installed, so they might be blocked there.
> To improve the experience for new users, Kylin's default configuration should disable compression, and a document should be added that introduces how to enable compression.
[jira] [Updated] (KYLIN-1978) kylin.sh compatible issue on Ubuntu
[ https://issues.apache.org/jira/browse/KYLIN-1978?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

hongbin ma updated KYLIN-1978:
------------------------------
Assignee: Yiming Liu (was: hongbin ma)

> kylin.sh compatible issue on Ubuntu
> ------------------------------
>
> Key: KYLIN-1978
> URL: https://issues.apache.org/jira/browse/KYLIN-1978
> Project: Kylin
> Issue Type: Bug
> Components: Environment
> Affects Versions: v1.5.3
> Reporter: Shaofeng SHI
> Assignee: Yiming Liu
> Fix For: v1.5.4
>
> Reported by Marcelo(marcelo.n...@quantium.com.au) in the mailing list:
> DISTRIB_ID=Ubuntu
> DISTRIB_RELEASE=14.04
> DISTRIB_CODENAME=trusty
> DISTRIB_DESCRIPTION="Ubuntu 14.04.4 LTS"
> NAME="Ubuntu"
> VERSION="14.04.4 LTS, Trusty Tahr"
> ID=ubuntu
> ID_LIKE=debian
> PRETTY_NAME="Ubuntu 14.04.4 LTS"
> VERSION_ID="14.04"
> HOME_URL="http://www.ubuntu.com/"
> SUPPORT_URL="http://help.ubuntu.com/"
> BUG_REPORT_URL="http://bugs.launchpad.net/ubuntu/"
> mapr@qtausc-vpcsdev04:~/kylin/apache-kylin-1.5.3-HBase1.x-bin/bin$ ./kylin.sh start
> *KYLIN_HOME is set to /home/mapr/kylin/apache-kylin-1.5.3-HBase1.x-bin*
> cat: invalid option -- '1'
> Try 'cat --help' for more information.
> -mkdir: Not enough arguments: expected 1 but got 0
> Usage: hadoop fs [generic options] -mkdir [-p] ...
> failed to create , Please make sure the user has right to access
> That is what is happening when I try to start kylin.
> I traced the error and the first one comes from get-properties.sh at this line:
> for i in `cat ${KYLIN_HOME}/conf/kylin.properties | grep -w "^$1" | grep -v '^#' | awk -F= '{ n = index($0,"="); print substr($0,n+1)}' | cut -c 1-`
> and as you can see kylin home is set.
[jira] [Commented] (KYLIN-1932) Query did not filter unrelated cuboid shards when cuboid is shard on specific column
[ https://issues.apache.org/jira/browse/KYLIN-1932?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15460778#comment-15460778 ]

hongbin ma commented on KYLIN-1932:
------------------------------

Hi [~gangma], do you have a filter on the sharded column? If it is not skipping shards as expected, then it's a bug. I'll try to reproduce and fix it. Any test cases from your side are welcome.

> Query did not filter unrelated cuboid shards when cuboid is shard on specific column
> ------------------------------
>
> Key: KYLIN-1932
> URL: https://issues.apache.org/jira/browse/KYLIN-1932
> Project: Kylin
> Issue Type: Bug
> Components: Query Engine
> Reporter: Ma Gang
> Assignee: hongbin ma
>
> This is related to KYLIN-1453. I just checked the code, and it looks like it doesn't work as expected. [~mahongbin], could you help confirm whether it is a bug or not?
[jira] [Assigned] (KYLIN-1932) Query did not filter unrelated cuboid shards when cuboid is shard on specific column
[ https://issues.apache.org/jira/browse/KYLIN-1932?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

hongbin ma reassigned KYLIN-1932:
------------------------------
Assignee: hongbin ma (was: liyang)

> Query did not filter unrelated cuboid shards when cuboid is shard on specific column
> ------------------------------
>
> Key: KYLIN-1932
> URL: https://issues.apache.org/jira/browse/KYLIN-1932
> Project: Kylin
> Issue Type: Bug
> Components: Query Engine
> Reporter: Ma Gang
> Assignee: hongbin ma
>
> This is related to KYLIN-1453. I just checked the code, and it looks like it doesn't work as expected. [~mahongbin], could you help confirm whether it is a bug or not?
[jira] [Resolved] (KYLIN-1936) Improve enable limit logic (exactAggregation is too strict)
[ https://issues.apache.org/jira/browse/KYLIN-1936?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

hongbin ma resolved KYLIN-1936.
------------------------------
Resolution: Fixed

> Improve enable limit logic (exactAggregation is too strict)
> ------------------------------
>
> Key: KYLIN-1936
> URL: https://issues.apache.org/jira/browse/KYLIN-1936
> Project: Kylin
> Issue Type: Improvement
> Affects Versions: v1.5.3
> Reporter: hongbin ma
> Assignee: hongbin ma
> Fix For: v1.5.4
>
> from zhaotians...@meizu.com:
> Recently I got the following error while executing a query on a cube that is not that big (about 400 MB, 20 million records):
> ==
> Error while executing SQL "select FCRASHTIME,count(1) from UXIP.EDL_FDT_OUC_UPLOAD_FILES group by FCRASH_ANALYSIS_ID,FCRASHTIME limit 1": Scan row count exceeded threshold: 1000, please add filter condition to narrow down backend scan range, like where clause.
> I guess what it scanned was the intermediate result. But the query doesn't have any order by, and the result count is limited to just 1, so it could simply scan until it finds any record with those two dimensions and voilà.
[jira] [Updated] (KYLIN-1963) Delegate the loading of certain package (like slf4j) to tomcat's parent classloader
[ https://issues.apache.org/jira/browse/KYLIN-1963?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

hongbin ma updated KYLIN-1963:
------------------------------
Fix Version/s: v1.5.4

> Delegate the loading of certain package (like slf4j) to tomcat's parent classloader
> ------------------------------
>
> Key: KYLIN-1963
> URL: https://issues.apache.org/jira/browse/KYLIN-1963
> Project: Kylin
> Issue Type: Improvement
> Reporter: hongbin ma
> Assignee: hongbin ma
> Fix For: v1.5.4
>
> Currently we use the hbase command to start Tomcat, which then starts Kylin as a web application. The default classloader that Tomcat assigns to the Kylin application is WebappClassLoader, which searches local repositories before the parent classloader.
> This design leads to two separate log4j logging instances, one in the "HBase space" and one in the "kylin space". The two loggers will attempt to write to the same file, which is problematic according to the official documents.
[jira] [Resolved] (KYLIN-1963) Delegate the loading of certain package (like slf4j) to tomcat's parent classloader
[ https://issues.apache.org/jira/browse/KYLIN-1963?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

hongbin ma resolved KYLIN-1963.
------------------------------
Resolution: Fixed

> Delegate the loading of certain package (like slf4j) to tomcat's parent classloader
> ------------------------------
>
> Key: KYLIN-1963
> URL: https://issues.apache.org/jira/browse/KYLIN-1963
> Project: Kylin
> Issue Type: Improvement
> Reporter: hongbin ma
> Assignee: hongbin ma
>
> Currently we use the hbase command to start Tomcat, which then starts Kylin as a web application. The default classloader that Tomcat assigns to the Kylin application is WebappClassLoader, which searches local repositories before the parent classloader.
> This design leads to two separate log4j logging instances, one in the "HBase space" and one in the "kylin space". The two loggers will attempt to write to the same file, which is problematic according to the official documents.
[jira] [Resolved] (KYLIN-1964) Add a companion tool of CubeMetaExtractor for cube importing
[ https://issues.apache.org/jira/browse/KYLIN-1964?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] hongbin ma resolved KYLIN-1964. --- Resolution: Fixed Fix Version/s: v1.5.4 Use CubeMetaIngester. > Add a companion tool of CubeMetaExtractor for cube importing > > > Key: KYLIN-1964 > URL: https://issues.apache.org/jira/browse/KYLIN-1964 > Project: Kylin > Issue Type: Wish >Reporter: hongbin ma >Assignee: hongbin ma > Fix For: v1.5.4 > > > Now that we have CubeMetaExtractor for cube exporting, we additionally need an > importer to import the exported cube. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Resolved] (KYLIN-1979) Move hackNoGroupByAggregation to cube-based storage implementations
[ https://issues.apache.org/jira/browse/KYLIN-1979?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] hongbin ma resolved KYLIN-1979. --- Resolution: Fixed Fix Version/s: v1.5.4 > Move hackNoGroupByAggregation to cube-based storage implementations > --- > > Key: KYLIN-1979 > URL: https://issues.apache.org/jira/browse/KYLIN-1979 > Project: Kylin > Issue Type: Improvement >Reporter: hongbin ma >Assignee: hongbin ma > Fix For: v1.5.4 > > > as it only makes sense for cube-based realizations -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Resolved] (KYLIN-1982) CubeMigrationCLI: associate model with project
[ https://issues.apache.org/jira/browse/KYLIN-1982?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] hongbin ma resolved KYLIN-1982. --- Resolution: Fixed > CubeMigrationCLI: associate model with project > --- > > Key: KYLIN-1982 > URL: https://issues.apache.org/jira/browse/KYLIN-1982 > Project: Kylin > Issue Type: Bug > Components: Tools, Build and Test >Affects Versions: v1.5.3 >Reporter: kangkaisen >Assignee: kangkaisen > Fix For: v1.5.4 > > Attachments: KYLIN-1982.patch > > > In the current `CubeMigrationCLI`, when we migrate a cube, the model > metadata is migrated, but the model is not associated with the > project. > So if we fetch the model via `getModels` in `ModelController` with "modelName" and > "projectName", we get null. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (KYLIN-1883) Consensus Problem when running the tool, MetadataCleanupJob
[ https://issues.apache.org/jira/browse/KYLIN-1883?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15460671#comment-15460671 ] hongbin ma commented on KYLIN-1883: --- It's the next release. Since the last official release is v1.5.3, you should use v1.5.4. > Consensus Problem when running the tool, MetadataCleanupJob > --- > > Key: KYLIN-1883 > URL: https://issues.apache.org/jira/browse/KYLIN-1883 > Project: Kylin > Issue Type: Bug >Reporter: Zhong Yanghong >Assignee: Zhong Yanghong > Attachments: > better_solution_for_consensus_issue_of_MetadataCleanupJob.patch > > > When doing the cleanup, the current strategy is as follows: > 1. first create a referenceSet > 2. then add items not belonging to the referenceSet to the toDeleteSet > 3. finally delete the items in the toDeleteSet > A consensus issue can occur because we cannot guarantee that none of the items in > toDeleteSet are referenced if the referenceSet changes during the > process. > For example, before the cleanup, SEGMENT_A is deleted and leaves behind a DICT_A > created at the building step. The referenceSet will then not include DICT_A. > After the reference set is created, SEGMENT_B starts to build. Since > DICT_A still exists, it can still be referenced by SEGMENT_B. DICT_A > will nevertheless be included in the toDeleteSet and deleted later. In the end, > SEGMENT_B holds a reference with no data behind it. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
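The three-step strategy and the DICT_A example in KYLIN-1883 can be sketched as follows. This is a minimal illustration only, not the attached patch and not Kylin's MetadataCleanupJob: the class `CleanupRaceSketch`, its methods, and the idea of re-reading the live references just before deletion are all assumptions made for the example. The re-check narrows the window in which a newly referenced item can be deleted, but it does not close it completely.

```java
import java.util.Arrays;
import java.util.HashSet;
import java.util.Set;
import java.util.function.Supplier;

// Hypothetical sketch; none of these names exist in Kylin.
class CleanupRaceSketch {
    // Steps 1 and 2 of the description: everything that was not referenced
    // when the reference set was captured becomes a deletion candidate.
    static Set<String> candidates(Set<String> allResources, Set<String> referenceSnapshot) {
        Set<String> toDelete = new HashSet<>(allResources);
        toDelete.removeAll(referenceSnapshot);
        return toDelete;
    }

    // Assumed mitigation: read the references again right before step 3 and
    // drop any candidate that has been picked up in the meantime.
    static Set<String> recheck(Set<String> toDelete, Supplier<Set<String>> liveReferences) {
        Set<String> stillUnreferenced = new HashSet<>(toDelete);
        stillUnreferenced.removeAll(liveReferences.get());
        return stillUnreferenced;
    }

    public static void main(String[] args) {
        Set<String> all = new HashSet<>(Arrays.asList("DICT_A", "DICT_B"));
        // SEGMENT_A is already gone, so only DICT_B is referenced at snapshot time.
        Set<String> snapshot = new HashSet<>(Arrays.asList("DICT_B"));
        Set<String> toDelete = candidates(all, snapshot);
        System.out.println("candidates: " + toDelete);

        // SEGMENT_B starts building after the snapshot and references DICT_A.
        Set<String> live = new HashSet<>(Arrays.asList("DICT_A", "DICT_B"));
        System.out.println("after re-check: " + recheck(toDelete, () -> live));
    }
}
```

Without the re-check, DICT_A would be deleted even though SEGMENT_B now points at it, which is the situation the report describes.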
[jira] [Commented] (KYLIN-1630) incorrect sum(a)/sum(b) (how to get the rate value)
[ https://issues.apache.org/jira/browse/KYLIN-1630?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15460661#comment-15460661 ] hongbin ma commented on KYLIN-1630: --- for future reference, even if the data type is integer, we can workaround the issue by using 1.0* sum(a)/sum(b) > incorrect sum(a)/sum(b) (how to get the rate value) > --- > > Key: KYLIN-1630 > URL: https://issues.apache.org/jira/browse/KYLIN-1630 > Project: Kylin > Issue Type: Bug > Components: General >Affects Versions: v1.5.1 > Environment: Kylin version:1.5.1 >Reporter: 陈雷雷 > > I want to get a value which is defined as sum(a)/sum(b), how can I > do this kind of anlysis. > Kylin version:1.5.1 >Now I build a cube which have sum(a) and sum(b), when I execute > “select sum(a)/sum(b) from table1 group by c” ,the result is wrong. > sum(a)/sum(b) the result is all 0 and sum(b)/sum(a) result is all 1. > MMENE_NAMESUCC ATTSUCC/ATT > CSMME15BZX 336981 368366 1 > CSMME32BZX 338754 366842 1 > CSMME07BZX 687965 747694 1 > CSMME03BHW 703269 747623 1 > CSMME12BZX 705856 764656 1 > CSMME16BHW 1962293142173 1 > MMENE_NAME SUCC ATT ATT/SUCC > CSMME15BZX 336981 368366 0 > CSMME32BZX 338754 366842 0 > CSMME07BZX 687965 747694 0 > CSMME03BHW 703269 747623 0 > CSMME12BZX 705856 764656 0 > CSMME16BHW 1962293142173 0 -- This message was sent by Atlassian JIRA (v6.3.4#6332)
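The `1.0 * sum(a)/sum(b)` workaround in the KYLIN-1630 comment relies on ordinary integer versus floating-point division. A minimal Java sketch of the same effect, assuming both sums come back as integer types; the class and method names are illustrative, and the two figures are taken from the CSMME15BZX row of the report:

```java
// Illustrative only; this is not Kylin or Calcite code.
class RatioExample {
    // Integer division truncates toward zero, so any ratio below 1 becomes 0.
    static long integerRatio(long numerator, long denominator) {
        return numerator / denominator;
    }

    // Multiplying by 1.0 first promotes the expression to double,
    // mirroring the SQL workaround 1.0 * sum(a) / sum(b).
    static double decimalRatio(long numerator, long denominator) {
        return 1.0 * numerator / denominator;
    }

    public static void main(String[] args) {
        long succ = 336981L; // SUCC for CSMME15BZX
        long att = 368366L;  // ATT for CSMME15BZX
        System.out.println(integerRatio(succ, att));
        System.out.println(integerRatio(att, succ));
        System.out.println(decimalRatio(succ, att));
    }
}
```

The first two calls show the all-0 and all-1 pattern from the report; the third shows the fractional rate the reporter was after.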
[jira] [Commented] (KYLIN-1987) kylin can't support date/time functions like TIMESTAMPADD、TIMESTAMPDIFF
[ https://issues.apache.org/jira/browse/KYLIN-1987?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15460657#comment-15460657 ] hongbin ma commented on KYLIN-1987: --- We need to support such functions badly as BI tools like Tableau and Cognos will generate such functions a lot. the bad news is that such functions are not even supported by Calcite: https://calcite.apache.org/docs/reference.html > kylin can't support date/time functions like TIMESTAMPADD、TIMESTAMPDIFF > --- > > Key: KYLIN-1987 > URL: https://issues.apache.org/jira/browse/KYLIN-1987 > Project: Kylin > Issue Type: Bug > Components: Query Engine >Affects Versions: v1.5.3 > Environment: kylin-1.5.3-cdh5.7 >Reporter: SF Wang >Assignee: liyang > Labels: newbie > > kylin can't support date/time functions like TIMESTAMPADD、TIMESTAMPDIFF > when I execute sql on sample table kylin_sales in kylin web cli, It reports > error. > My sql : > select timestampadd(DAY, 1, cast(part_dt as timestamp)) from kylin_sales > Error message: > Encountered "(DAY" at line 1, column 20. Was expecting one of :"."... > "("... ... -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Assigned] (KYLIN-1988) kylin not support aggregation of metric columns arithmetics , especially like price*count
[ https://issues.apache.org/jira/browse/KYLIN-1988?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] hongbin ma reassigned KYLIN-1988: - Assignee: hongbin ma > kylin not support aggregation of metric columns arithmetics , especially like > price*count > - > > Key: KYLIN-1988 > URL: https://issues.apache.org/jira/browse/KYLIN-1988 > Project: Kylin > Issue Type: Bug >Affects Versions: v1.5.2 >Reporter: Le Van Ha >Assignee: hongbin ma > > Hi all, > When I run query: > SELECT product_id, sum(quantity * price) > FROM fact_product_sales > group by product_id > Error while executing SQL "SELECT product_id, sum(quantity * price) FROM > fact_product_sales group by product_id LIMIT 5": null -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (KYLIN-1989) Sum issue: result not consistent with hive
[ https://issues.apache.org/jira/browse/KYLIN-1989?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15460640#comment-15460640 ] hongbin ma commented on KYLIN-1989: --- It seems no result is returned on the Kylin side. A possible reason is that the cube does not contain data. What is the result of select count(*) from fact? And can you please also try: SELECT channel.name, sum(fact_product.quantity) FROM fact_product_sales as fact_product join dim_product as product on fact_product.product_id = product.id join dim_channel as channel on fact_product.channel_id = channel.id GROUP BY channel.name > Sum issue: result not consistent with hive > -- > > Key: KYLIN-1989 > URL: https://issues.apache.org/jira/browse/KYLIN-1989 > Project: Kylin > Issue Type: Bug >Affects Versions: v1.5.2 >Reporter: Le Van Ha > Attachments: hive_result.png, kylin_result.png > > > When running the following query, > SELECT channel.name, sum(case when product.product_vendor <> '' then > fact_product.quantity else 0 end) > FROM fact_product_sales as fact_product > join dim_product as product on fact_product.product_id = product.id > join dim_channel as channel on fact_product.channel_id = channel.id > GROUP BY channel.name > --- > The result by kylin: > Buy Button 0 > Online Store 0 > --- > The result by hive is shown in the attached figure. > Why is that? -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (KYLIN-1992) Clear ThreadLocal Contexts when query failed before scaning HBase
[ https://issues.apache.org/jira/browse/KYLIN-1992?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15460596#comment-15460596 ] hongbin ma commented on KYLIN-1992: --- The current location of the method invocation is already at a very early stage (well before the HBase scan runs). Do you have a test case to reproduce your scenario? What is causing the query to fail? Also, putting the method invocation in the REST service would break integration query tests that do not go through REST. > Clear ThreadLocal Contexts when query failed before scaning HBase > - > > Key: KYLIN-1992 > URL: https://issues.apache.org/jira/browse/KYLIN-1992 > Project: Kylin > Issue Type: Bug >Reporter: kangkaisen >Assignee: kangkaisen >Priority: Minor > Attachments: KYLIN-1992.patch > > > Currently, we call `OLAPContext.clearThreadLocalContexts()` before > scanning HBase. > If a query fails before scanning HBase, we may get the wrong `realization` of the > query in the log, > because Tomcat's thread pool reuses the thread and does not clear the > ThreadLocal variable. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
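The thread-reuse effect described in KYLIN-1992 can be reproduced outside Kylin with a plain `ThreadLocal` and a single-thread pool. The sketch below is an assumption-laden illustration: `ThreadLocalReuseSketch`, `CONTEXT`, and `secondQuerySees` are hypothetical names, the pool stands in for Tomcat's worker threads, and the `finally` cleanup is one generic way to avoid stale state rather than a statement about where Kylin should place the call.

```java
import java.util.concurrent.Callable;
import java.util.concurrent.ExecutionException;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;

// Hypothetical sketch; CONTEXT stands in for a per-query context and is not OLAPContext.
class ThreadLocalReuseSketch {
    static final ThreadLocal<String> CONTEXT = new ThreadLocal<>();

    // Runs a failing "query" and then a second one on the same pooled thread,
    // and returns what the second query finds in the thread-local slot.
    static String secondQuerySees(boolean clearInFinally) {
        ExecutorService pool = Executors.newSingleThreadExecutor();
        try {
            Callable<String> failingQuery = () -> {
                try {
                    CONTEXT.set("realization-of-query-1");
                    throw new IllegalStateException("query failed before the scan");
                } finally {
                    if (clearInFinally) {
                        CONTEXT.remove(); // cleanup runs even though the query failed
                    }
                }
            };
            try {
                pool.submit(failingQuery).get();
            } catch (ExecutionException expected) {
                // The first query failed; its thread goes back to the pool.
            }
            Callable<String> nextQuery = () -> CONTEXT.get();
            return pool.submit(nextQuery).get();
        } catch (InterruptedException | ExecutionException e) {
            throw new RuntimeException(e);
        } finally {
            pool.shutdown();
        }
    }

    public static void main(String[] args) {
        System.out.println("without cleanup: " + secondQuerySees(false));
        System.out.println("with cleanup:    " + secondQuerySees(true));
    }
}
```

Without the cleanup the second task sees the value left behind by the failed one, which matches the wrong `realization` in the log that the reporter describes.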
[jira] [Commented] (KYLIN-1973) java.lang.NegativeArraySizeException when Build Dimension Dictionary
[ https://issues.apache.org/jira/browse/KYLIN-1973?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15460558#comment-15460558 ] hongbin ma commented on KYLIN-1973: --- test -- Regards, *Bin Mahone | 马洪宾* > java.lang.NegativeArraySizeException when Build Dimension Dictionary > > > Key: KYLIN-1973 > URL: https://issues.apache.org/jira/browse/KYLIN-1973 > Project: Kylin > Issue Type: Bug >Affects Versions: v1.5.3 >Reporter: zhengdong >Assignee: liyang > Fix For: v1.5.4 > > > exception when Build Dimension Dictionary: > java.lang.NegativeArraySizeException > at > org.apache.kylin.dict.TrieDictionary.getValueFromIdImpl(TrieDictionary.java:274) > at > org.apache.kylin.common.util.Dictionary.getValueFromId(Dictionary.java:130) > at > org.apache.kylin.dict.lookup.SnapshotTable$1.getRow(SnapshotTable.java:138) > at org.apache.kylin.dict.lookup.LookupTable.init(LookupTable.java:67) > at > org.apache.kylin.dict.lookup.LookupStringTable.init(LookupStringTable.java:79) > at org.apache.kylin.dict.lookup.LookupTable.(LookupTable.java:55) > at > org.apache.kylin.dict.lookup.LookupStringTable.(LookupStringTable.java:65) > at > org.apache.kylin.cube.CubeManager.getLookupTable(CubeManager.java:619) > at > org.apache.kylin.cube.cli.DictionaryGeneratorCLI.processSegment(DictionaryGeneratorCLI.java:61) > at > org.apache.kylin.cube.cli.DictionaryGeneratorCLI.processSegment(DictionaryGeneratorCLI.java:42) > at > org.apache.kylin.engine.mr.steps.CreateDictionaryJob.run(CreateDictionaryJob.java:56) > at org.apache.hadoop.util.ToolRunner.run(ToolRunner.java:70) > at org.apache.hadoop.util.ToolRunner.run(ToolRunner.java:84) > at > org.apache.kylin.engine.mr.common.HadoopShellExecutable.doWork(HadoopShellExecutable.java:63) > at > org.apache.kylin.job.execution.AbstractExecutable.execute(AbstractExecutable.java:112) > at > org.apache.kylin.job.execution.DefaultChainedExecutable.doWork(DefaultChainedExecutable.java:57) > at > 
org.apache.kylin.job.execution.AbstractExecutable.execute(AbstractExecutable.java:112) > at > org.apache.kylin.job.impl.threadpool.DefaultScheduler$JobRunner.run(DefaultScheduler.java:127) > at > java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1145) > at > java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:615) > at java.lang.Thread.run(Thread.java:745) > result code:2 -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (KYLIN-1973) java.lang.NegativeArraySizeException when Build Dimension Dictionary
[ https://issues.apache.org/jira/browse/KYLIN-1973?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15460523#comment-15460523 ] hongbin ma commented on KYLIN-1973: --- also, for better JIRA management we can link two JIRAs if they're highly related > java.lang.NegativeArraySizeException when Build Dimension Dictionary > > > Key: KYLIN-1973 > URL: https://issues.apache.org/jira/browse/KYLIN-1973 > Project: Kylin > Issue Type: Bug >Affects Versions: v1.5.3 >Reporter: zhengdong >Assignee: liyang > Fix For: v1.5.4 > > > exception when Build Dimension Dictionary: > java.lang.NegativeArraySizeException > at > org.apache.kylin.dict.TrieDictionary.getValueFromIdImpl(TrieDictionary.java:274) > at > org.apache.kylin.common.util.Dictionary.getValueFromId(Dictionary.java:130) > at > org.apache.kylin.dict.lookup.SnapshotTable$1.getRow(SnapshotTable.java:138) > at org.apache.kylin.dict.lookup.LookupTable.init(LookupTable.java:67) > at > org.apache.kylin.dict.lookup.LookupStringTable.init(LookupStringTable.java:79) > at org.apache.kylin.dict.lookup.LookupTable.(LookupTable.java:55) > at > org.apache.kylin.dict.lookup.LookupStringTable.(LookupStringTable.java:65) > at > org.apache.kylin.cube.CubeManager.getLookupTable(CubeManager.java:619) > at > org.apache.kylin.cube.cli.DictionaryGeneratorCLI.processSegment(DictionaryGeneratorCLI.java:61) > at > org.apache.kylin.cube.cli.DictionaryGeneratorCLI.processSegment(DictionaryGeneratorCLI.java:42) > at > org.apache.kylin.engine.mr.steps.CreateDictionaryJob.run(CreateDictionaryJob.java:56) > at org.apache.hadoop.util.ToolRunner.run(ToolRunner.java:70) > at org.apache.hadoop.util.ToolRunner.run(ToolRunner.java:84) > at > org.apache.kylin.engine.mr.common.HadoopShellExecutable.doWork(HadoopShellExecutable.java:63) > at > org.apache.kylin.job.execution.AbstractExecutable.execute(AbstractExecutable.java:112) > at > org.apache.kylin.job.execution.DefaultChainedExecutable.doWork(DefaultChainedExecutable.java:57) > at > 
org.apache.kylin.job.execution.AbstractExecutable.execute(AbstractExecutable.java:112) > at > org.apache.kylin.job.impl.threadpool.DefaultScheduler$JobRunner.run(DefaultScheduler.java:127) > at > java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1145) > at > java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:615) > at java.lang.Thread.run(Thread.java:745) > result code:2 -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (KYLIN-1973) java.lang.NegativeArraySizeException when Build Dimension Dictionary
[ https://issues.apache.org/jira/browse/KYLIN-1973?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15460521#comment-15460521 ] hongbin ma commented on KYLIN-1973: --- what is the root cause? same as [~zhengd]'s guess? > java.lang.NegativeArraySizeException when Build Dimension Dictionary > > > Key: KYLIN-1973 > URL: https://issues.apache.org/jira/browse/KYLIN-1973 > Project: Kylin > Issue Type: Bug >Affects Versions: v1.5.3 >Reporter: zhengdong >Assignee: liyang > Fix For: v1.5.4 > > > exception when Build Dimension Dictionary: > java.lang.NegativeArraySizeException > at > org.apache.kylin.dict.TrieDictionary.getValueFromIdImpl(TrieDictionary.java:274) > at > org.apache.kylin.common.util.Dictionary.getValueFromId(Dictionary.java:130) > at > org.apache.kylin.dict.lookup.SnapshotTable$1.getRow(SnapshotTable.java:138) > at org.apache.kylin.dict.lookup.LookupTable.init(LookupTable.java:67) > at > org.apache.kylin.dict.lookup.LookupStringTable.init(LookupStringTable.java:79) > at org.apache.kylin.dict.lookup.LookupTable.(LookupTable.java:55) > at > org.apache.kylin.dict.lookup.LookupStringTable.(LookupStringTable.java:65) > at > org.apache.kylin.cube.CubeManager.getLookupTable(CubeManager.java:619) > at > org.apache.kylin.cube.cli.DictionaryGeneratorCLI.processSegment(DictionaryGeneratorCLI.java:61) > at > org.apache.kylin.cube.cli.DictionaryGeneratorCLI.processSegment(DictionaryGeneratorCLI.java:42) > at > org.apache.kylin.engine.mr.steps.CreateDictionaryJob.run(CreateDictionaryJob.java:56) > at org.apache.hadoop.util.ToolRunner.run(ToolRunner.java:70) > at org.apache.hadoop.util.ToolRunner.run(ToolRunner.java:84) > at > org.apache.kylin.engine.mr.common.HadoopShellExecutable.doWork(HadoopShellExecutable.java:63) > at > org.apache.kylin.job.execution.AbstractExecutable.execute(AbstractExecutable.java:112) > at > org.apache.kylin.job.execution.DefaultChainedExecutable.doWork(DefaultChainedExecutable.java:57) > at > 
org.apache.kylin.job.execution.AbstractExecutable.execute(AbstractExecutable.java:112) > at > org.apache.kylin.job.impl.threadpool.DefaultScheduler$JobRunner.run(DefaultScheduler.java:127) > at > java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1145) > at > java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:615) > at java.lang.Thread.run(Thread.java:745) > result code:2 -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Resolved] (KYLIN-1954) BuildInFunctionTransformer should be executed per CubeSegmentScanner
[ https://issues.apache.org/jira/browse/KYLIN-1954?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] hongbin ma resolved KYLIN-1954. --- Resolution: Fixed Fix Version/s: v1.5.4 > BuildInFunctionTransformer should be executed per CubeSegmentScanner > > > Key: KYLIN-1954 > URL: https://issues.apache.org/jira/browse/KYLIN-1954 > Project: Kylin > Issue Type: Improvement >Affects Versions: v1.5.3 >Reporter: hongbin ma >Assignee: hongbin ma > Fix For: v1.5.4 > > > reported from dev mail list "Question abount BuildInFunctionTransformer" > Sorry for the wrong description and thanks for the explaination. > I have another question on this. > Case1 > select merchant_name,dt_day,count(*) > from session_view_shop_0 > where merchant_name like '%深海新创手机%' > and dt_year='2016' > and dt_month='07' > and dt_day >='25' > and dt_day <='28' > group by merchant_name,dt_day > 2016-08-05 09:25:06,263 INFO [http-bio-7070-exec-10] > dict.BuildInFunctionTransformer:66 : Translated {LIKE(KYLIN_REPORT_DB.SESSION_ > VIEW_SHOP_0.MERCHANT_NAME,%深海新创手机%)} to IN clause: > {KYLIN_REPORT_DB.SESSION_VIEW_SHOP_0.MERCHANT_NAME IN []} > Result1 > 深海新创手机专营店80002972 28 6360 > 深海新创手机专营店80002972 27 5501 > 深海新创手机专营店80002972 26 4830 > Case 2 > select merchant_name,dt_day,count(*) > from session_view_shop_0 > where merchant_name like '%深海新创%' > and dt_year='2016' > and dt_month='07' > and dt_day >='25' > and dt_day <='28' > group by merchant_name,dt_day > 2016-08-05 09:37:55,469 INFO [http-bio-7070-exec-15] > dict.BuildInFunctionTransformer:66 : Translated {LIKE(KYLIN_REPORT_DB.SESSION_ > VIEW_SHOP_0.MERCHANT_NAME,%深海新创%)} to IN clause: > {KYLIN_REPORT_DB.SESSION_VIEW_SHOP_0.MERCHANT_NAME IN [深海新创专营店80002972]} > Result2 > 深海新创专营店80002972 25 5283 > ’深海新创手机专营店80002972’ is expected in result2 , as it exists which case1 shows. 
> CubeStorageQuery.search/ CubeSegmentScanner > when filter is translated for the first segment, filter is changed to > CompareTupleFilter(IN clause) > translate will not triger for the next segments. > this is not right because dictionary is not same for every segments. > assume data like this: > merchant_name cube segment > 深海新创专营 20160725 > 深海新创手机 20160726 > when search with like '%深海新创%' > CubeSegmentScanner scan segment '20160725' , and filter is changed to in > clause( IN '深海新创专营') > result is right for this segment ,but not for the next segments because > filter now has been changed. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
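The per-segment problem in KYLIN-1954 can be sketched with plain lists standing in for segment dictionaries. Everything here is hypothetical: `PerSegmentFilterSketch` and its methods are not Kylin classes, and the ASCII values `shop-a-store` and `shop-a-phone` stand in for the two merchant names in the report. The sketch only shows why an IN list derived from the first segment's dictionary misses values that exist only in later segments.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

// Hypothetical sketch; lists of strings stand in for per-segment dictionaries.
class PerSegmentFilterSketch {
    // Translate LIKE '%infix%' into an IN list using one segment's dictionary.
    static List<String> translateLikeToIn(String infix, List<String> segmentDictionary) {
        List<String> inList = new ArrayList<>();
        for (String value : segmentDictionary) {
            if (value.contains(infix)) {
                inList.add(value);
            }
        }
        return inList;
    }

    // Reported behaviour: translate against the first segment, reuse everywhere.
    static List<String> scanTranslatingOnce(String infix, List<List<String>> segments) {
        List<String> inList = translateLikeToIn(infix, segments.get(0));
        List<String> hits = new ArrayList<>();
        for (List<String> segment : segments) {
            for (String value : segment) {
                if (inList.contains(value)) {
                    hits.add(value);
                }
            }
        }
        return hits;
    }

    // Intended behaviour: translate again for every segment's own dictionary.
    static List<String> scanTranslatingPerSegment(String infix, List<List<String>> segments) {
        List<String> hits = new ArrayList<>();
        for (List<String> segment : segments) {
            List<String> inList = translateLikeToIn(infix, segment);
            for (String value : segment) {
                if (inList.contains(value)) {
                    hits.add(value);
                }
            }
        }
        return hits;
    }

    public static void main(String[] args) {
        List<List<String>> segments = Arrays.asList(
                Arrays.asList("shop-a-store"),  // stands in for segment 20160725
                Arrays.asList("shop-a-phone")); // stands in for segment 20160726
        System.out.println(scanTranslatingOnce("shop-a", segments));
        System.out.println(scanTranslatingPerSegment("shop-a", segments));
    }
}
```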
[jira] [Updated] (KYLIN-1981) PK/FK derived will break for left join in some cases
[ https://issues.apache.org/jira/browse/KYLIN-1981?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] hongbin ma updated KYLIN-1981: -- Priority: Minor (was: Major) > PK/FK derived will break for left join in some cases > > > Key: KYLIN-1981 > URL: https://issues.apache.org/jira/browse/KYLIN-1981 > Project: Kylin > Issue Type: Bug >Reporter: hongbin ma >Assignee: hongbin ma >Priority: Minor > > for left join cubes. suppose A is the FK in fact table, and B is the PK in > lookup table. query like below will end up with "more" results > select B, count(*) from fact left join lookup group by B -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Updated] (KYLIN-1981) PK/FK derived will break for left join in some cases
[ https://issues.apache.org/jira/browse/KYLIN-1981?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] hongbin ma updated KYLIN-1981: -- Assignee: liyang (was: hongbin ma) > PK/FK derived will break for left join in some cases > > > Key: KYLIN-1981 > URL: https://issues.apache.org/jira/browse/KYLIN-1981 > Project: Kylin > Issue Type: Bug >Reporter: hongbin ma >Assignee: liyang >Priority: Minor > > for left join cubes. suppose A is the FK in fact table, and B is the PK in > lookup table. query like below will end up with "more" results > select B, count(*) from fact left join lookup group by B -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Created] (KYLIN-1981) PK/FK derived will break for left join in some cases
hongbin ma created KYLIN-1981: - Summary: PK/FK derived will break for left join in some cases Key: KYLIN-1981 URL: https://issues.apache.org/jira/browse/KYLIN-1981 Project: Kylin Issue Type: Bug Reporter: hongbin ma Assignee: hongbin ma for left join cubes. suppose A is the FK in fact table, and B is the PK in lookup table. query like below will end up with "more" results select B, count(*) from fact left join lookup group by B -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (KYLIN-1954) BuildInFunctionTransformer should be executed per CubeSegmentScanner
[ https://issues.apache.org/jira/browse/KYLIN-1954?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15445454#comment-15445454 ] hongbin ma commented on KYLIN-1954: --- CubeStorageQuery.search/ CubeSegmentScanner when filter is translated for the first segment, filter is changed to CompareTupleFilter(IN clause) translate will not triger for the next segments. this is not right because dictionary is not same for every segments. assume data like this: merchant_name cube segment 深海新创专营 20160725 深海新创手机 20160726 when search with like '%深海新创%' CubeSegmentScanner scan segment '20160725' , and filter is changed to in clause( IN '深海新创专营') result is right for this segment ,but not for the next segments because filter now has been changed. > BuildInFunctionTransformer should be executed per CubeSegmentScanner > > > Key: KYLIN-1954 > URL: https://issues.apache.org/jira/browse/KYLIN-1954 > Project: Kylin > Issue Type: Improvement >Affects Versions: v1.5.3 >Reporter: hongbin ma >Assignee: hongbin ma > > reported from dev mail list "Question abount BuildInFunctionTransformer" > Sorry for the wrong description and thanks for the explaination. > I have another question on this. 
> Case1 > select merchant_name,dt_day,count(*) > from session_view_shop_0 > where merchant_name like '%深海新创手机%' > and dt_year='2016' > and dt_month='07' > and dt_day >='25' > and dt_day <='28' > group by merchant_name,dt_day > 2016-08-05 09:25:06,263 INFO [http-bio-7070-exec-10] > dict.BuildInFunctionTransformer:66 : Translated {LIKE(KYLIN_REPORT_DB.SESSION_ > VIEW_SHOP_0.MERCHANT_NAME,%深海新创手机%)} to IN clause: > {KYLIN_REPORT_DB.SESSION_VIEW_SHOP_0.MERCHANT_NAME IN []} > Result1 > 深海新创手机专营店80002972 28 6360 > 深海新创手机专营店80002972 27 5501 > 深海新创手机专营店80002972 26 4830 > Case 2 > select merchant_name,dt_day,count(*) > from session_view_shop_0 > where merchant_name like '%深海新创%' > and dt_year='2016' > and dt_month='07' > and dt_day >='25' > and dt_day <='28' > group by merchant_name,dt_day > 2016-08-05 09:37:55,469 INFO [http-bio-7070-exec-15] > dict.BuildInFunctionTransformer:66 : Translated {LIKE(KYLIN_REPORT_DB.SESSION_ > VIEW_SHOP_0.MERCHANT_NAME,%深海新创%)} to IN clause: > {KYLIN_REPORT_DB.SESSION_VIEW_SHOP_0.MERCHANT_NAME IN [深海新创专营店80002972]} > Result2 > 深海新创专营店80002972 25 5283 > ’深海新创手机专营店80002972’ is expected in result2 , as it exists which case1 shows. > CubeStorageQuery.search/ CubeSegmentScanner > when filter is translated for the first segment, filter is changed to > CompareTupleFilter(IN clause) > translate will not triger for the next segments. > this is not right because dictionary is not same for every segments. > assume data like this: > merchant_name cube segment > 深海新创专营 20160725 > 深海新创手机 20160726 > when search with like '%深海新创%' > CubeSegmentScanner scan segment '20160725' , and filter is changed to in > clause( IN '深海新创专营') > result is right for this segment ,but not for the next segments because > filter now has been changed. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (KYLIN-1319) Find a better way to check hadoop job status
[ https://issues.apache.org/jira/browse/KYLIN-1319?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15443369#comment-15443369 ] hongbin ma commented on KYLIN-1319: --- KYLIN-1319 makes KYLIN-1014 deprecated > Find a better way to check hadoop job status > > > Key: KYLIN-1319 > URL: https://issues.apache.org/jira/browse/KYLIN-1319 > Project: Kylin > Issue Type: Improvement > Components: Job Engine >Reporter: liyang >Assignee: Zhong Yanghong > Labels: newbie > Fix For: v1.5.3 > > Attachments: > Find_better_way_of_checking_hadoop_job_status_by_YarnClient_master.patch, > Find_better_way_of_checking_hadoop_job_status_via_job_API_master.patch > > > Currently Kylin retrieves job status via a resource manager web service like > {code}https://:/ws/v1/cluster/apps/${job_id}?anonymous=true{code} > This is not very robust. Some users do not have > "yarn.resourcemanager.webapp.address" set in yarn-site.xml, so getting the status > fails out of the box. They have to set a Kylin property > "kylin.job.yarn.app.rest.check.status.url" to work around this, which is not user > friendly. > Kerberos authentication might cause problems too if security is enabled. > Is there a more robust way to check job status? Via Job API? -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Resolved] (KYLIN-1965) Check duplicated measure name
[ https://issues.apache.org/jira/browse/KYLIN-1965?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] hongbin ma resolved KYLIN-1965. --- Resolution: Fixed Fix Version/s: v1.5.4 > Check duplicated measure name > - > > Key: KYLIN-1965 > URL: https://issues.apache.org/jira/browse/KYLIN-1965 > Project: Kylin > Issue Type: Improvement > Components: Metadata >Affects Versions: v1.5.2, v1.5.3 >Reporter: kangkaisen >Assignee: kangkaisen > Fix For: v1.5.4 > > Attachments: KYLIN-1965.patch > > > A duplicated measure name will cause queries to fail, so we should check for > duplicated measure names. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (KYLIN-1965) Check duplicated measure name
[ https://issues.apache.org/jira/browse/KYLIN-1965?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15442890#comment-15442890 ] hongbin ma commented on KYLIN-1965: --- Patch merged. Thanks [~kangkaisen] > Check duplicated measure name > - > > Key: KYLIN-1965 > URL: https://issues.apache.org/jira/browse/KYLIN-1965 > Project: Kylin > Issue Type: Improvement > Components: Metadata >Affects Versions: v1.5.2, v1.5.3 >Reporter: kangkaisen >Assignee: kangkaisen > Fix For: v1.5.4 > > Attachments: KYLIN-1965.patch > > > A duplicated measure name will cause queries to fail, so we should check for > duplicated measure names. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (KYLIN-1702) The Key of the Snapshot to the related lookup table may be not informative
[ https://issues.apache.org/jira/browse/KYLIN-1702?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15442835#comment-15442835 ] hongbin ma commented on KYLIN-1702: --- Hi yanghong, consider using org.apache.kylin.metadata.model.TableDesc#getIdentity; it contains the db name as well. This could avoid another round of "duplication". Also, before the patch is merged, you have to update org.apache.kylin.engine.mr.steps.MetadataCleanupJob to reflect your changes here. > The Key of the Snapshot to the related lookup table may be not informative > -- > > Key: KYLIN-1702 > URL: https://issues.apache.org/jira/browse/KYLIN-1702 > Project: Kylin > Issue Type: Improvement >Reporter: Zhong Yanghong >Assignee: Zhong Yanghong > Attachments: change_layout_path_of_table_snapshot_with_tableName.patch > > > Currently the key for the snapshot stored in the hbase metadata file is as > follows: > ResourceStore.SNAPSHOT_RESOURCE_ROOT + "/" + new > File(signature.getPath()).getName() + "/" + uuid + ".snapshot" > However, some tables stored in Hive may be organized like > dirName/tableName/00, dirName/tableName/01. > With the current setting, the key will be > ResourceStore.SNAPSHOT_RESOURCE_ROOT + "/" + 00 + "/" + uuid + ".snapshot", > which lacks the table name information. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
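The key expression quoted in KYLIN-1702 loses the table name because `java.io.File#getName` returns only the last path component. A minimal sketch of that behaviour follows; the class `SnapshotKeySketch`, the root value `/snapshot-root`, the identity string `DEFAULT.TABLENAME`, and the uuid are placeholders invented for the example, and `keyWithTableIdentity` only illustrates the suggestion in the comment rather than the attached patch.

```java
import java.io.File;

// Hypothetical sketch; root, identity and uuid values are placeholders.
class SnapshotKeySketch {
    // Mirrors the expression quoted in the issue: only the last path component survives.
    static String currentKey(String root, String signaturePath, String uuid) {
        return root + "/" + new File(signaturePath).getName() + "/" + uuid + ".snapshot";
    }

    // Assumed alternative: build the key from a table identity (db name plus table name).
    static String keyWithTableIdentity(String root, String tableIdentity, String uuid) {
        return root + "/" + tableIdentity + "/" + uuid + ".snapshot";
    }

    public static void main(String[] args) {
        String root = "/snapshot-root";
        System.out.println(currentKey(root, "dirName/tableName/00", "uuid-1"));
        System.out.println(keyWithTableIdentity(root, "DEFAULT.TABLENAME", "uuid-1"));
    }
}
```

With the directory layout from the report, the first key contains `00` and no table name, while the second keeps the table visible in the path.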
[jira] [Commented] (KYLIN-1828) java.lang.StringIndexOutOfBoundsException in org.apache.kylin.storage.hbase.util.StorageCleanupJob
[ https://issues.apache.org/jira/browse/KYLIN-1828?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15436159#comment-15436159 ] hongbin ma commented on KYLIN-1828: --- The patch might need revising as we recently changed segment name convention in KYLIN-1859. Also we need to figure out why so many intermediate tables survived after the GC step (org.apache.kylin.source.hive.HiveMRInput.BatchCubingInputSide#addStepPhase4_Cleanup) > java.lang.StringIndexOutOfBoundsException in > org.apache.kylin.storage.hbase.util.StorageCleanupJob > -- > > Key: KYLIN-1828 > URL: https://issues.apache.org/jira/browse/KYLIN-1828 > Project: Kylin > Issue Type: Bug > Components: Job Engine >Affects Versions: v1.5.2.1 >Reporter: Richard Calaba >Assignee: hongbin ma > Fix For: v1.5.4 > > Attachments: 0001-KYLIN-1828-fix-OutOfBounds-StorageCleanupJob.patch > > > While running storage cleanup job: > ./bin/kylin.sh org.apache.kylin.storage.hbase.util.StorageCleanupJob --delete > true > I see Hive tables in form > kylin_intermediate__1970010100_20160701031500 > in the defaul schema. 
> While running the above storage cleaner (v.1.5.2.1 - all previously built > Cubes Disabled & Dropped) I am getting an error: > 2016-06-27 22:28:08,480 INFO [main StorageCleanupJob:262]: Remove > intermediate hive table with job id fc44da88-cffc-4710-8726-ff910cf83451 with > job status ERROR > usage: StorageCleanupJob > -deleteDelete the unused storage > Exception in thread "main" java.lang.StringIndexOutOfBoundsException: String > index out of range: -2 > at java.lang.String.substring(String.java:1904) > at > org.apache.kylin.storage.hbase.util.StorageCleanupJob.cleanUnusedIntermediateHiveTable(StorageCleanupJob.java:269) > at > org.apache.kylin.storage.hbase.util.StorageCleanupJob.run(StorageCleanupJob.java:91) > at org.apache.hadoop.util.ToolRunner.run(ToolRunner.java:70) > at org.apache.hadoop.util.ToolRunner.run(ToolRunner.java:84) > at > org.apache.kylin.storage.hbase.util.StorageCleanupJob.main(StorageCleanupJob.java:308) > 2016-06-27 22:28:08,486 INFO [Thread-0 > HConnectionManager$HConnectionImplementation:1907]: Closing zookeeper > sessionid=0x154c97461586119 > 2016-06-27 22:28:08,491 INFO [Thread-0 ZooKeeper:684]: Session: > 0x154c97461586119 closed > 2016-06-27 22:28:08,491 INFO [main-EventThread ClientCnxn:509]: EventThread > shut down -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Updated] (KYLIN-1828) java.lang.StringIndexOutOfBoundsException in org.apache.kylin.storage.hbase.util.StorageCleanupJob
[ https://issues.apache.org/jira/browse/KYLIN-1828?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] hongbin ma updated KYLIN-1828: -- Assignee: Wang Cheng (was: hongbin ma) > java.lang.StringIndexOutOfBoundsException in > org.apache.kylin.storage.hbase.util.StorageCleanupJob > -- > > Key: KYLIN-1828 > URL: https://issues.apache.org/jira/browse/KYLIN-1828 > Project: Kylin > Issue Type: Bug > Components: Job Engine >Affects Versions: v1.5.2.1 >Reporter: Richard Calaba >Assignee: Wang Cheng > Fix For: v1.5.4 > > Attachments: 0001-KYLIN-1828-fix-OutOfBounds-StorageCleanupJob.patch > > > While running storage cleanup job: > ./bin/kylin.sh org.apache.kylin.storage.hbase.util.StorageCleanupJob --delete > true > I see Hive tables in form > kylin_intermediate__1970010100_20160701031500 > in the defaul schema. > While running the above storage cleaner (v.1.5.2.1 - all previously built > Cubes Disabled & Dropped) I am getting an error: > 2016-06-27 22:28:08,480 INFO [main StorageCleanupJob:262]: Remove > intermediate hive table with job id fc44da88-cffc-4710-8726-ff910cf83451 with > job status ERROR > usage: StorageCleanupJob > -deleteDelete the unused storage > Exception in thread "main" java.lang.StringIndexOutOfBoundsException: String > index out of range: -2 > at java.lang.String.substring(String.java:1904) > at > org.apache.kylin.storage.hbase.util.StorageCleanupJob.cleanUnusedIntermediateHiveTable(StorageCleanupJob.java:269) > at > org.apache.kylin.storage.hbase.util.StorageCleanupJob.run(StorageCleanupJob.java:91) > at org.apache.hadoop.util.ToolRunner.run(ToolRunner.java:70) > at org.apache.hadoop.util.ToolRunner.run(ToolRunner.java:84) > at > org.apache.kylin.storage.hbase.util.StorageCleanupJob.main(StorageCleanupJob.java:308) > 2016-06-27 22:28:08,486 INFO [Thread-0 > HConnectionManager$HConnectionImplementation:1907]: Closing zookeeper > sessionid=0x154c97461586119 > 2016-06-27 22:28:08,491 INFO [Thread-0 ZooKeeper:684]: Session: > 
0x154c97461586119 closed > 2016-06-27 22:28:08,491 INFO [main-EventThread ClientCnxn:509]: EventThread > shut down -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Updated] (KYLIN-1828) java.lang.StringIndexOutOfBoundsException in org.apache.kylin.storage.hbase.util.StorageCleanupJob
[ https://issues.apache.org/jira/browse/KYLIN-1828?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] hongbin ma updated KYLIN-1828: -- Fix Version/s: v1.5.4 > java.lang.StringIndexOutOfBoundsException in > org.apache.kylin.storage.hbase.util.StorageCleanupJob > -- > > Key: KYLIN-1828 > URL: https://issues.apache.org/jira/browse/KYLIN-1828 > Project: Kylin > Issue Type: Bug > Components: Job Engine >Affects Versions: v1.5.2.1 >Reporter: Richard Calaba >Assignee: hongbin ma > Fix For: v1.5.4 > > Attachments: 0001-KYLIN-1828-fix-OutOfBounds-StorageCleanupJob.patch > > > While running storage cleanup job: > ./bin/kylin.sh org.apache.kylin.storage.hbase.util.StorageCleanupJob --delete > true > I see Hive tables in form > kylin_intermediate__1970010100_20160701031500 > in the defaul schema. > While running the above storage cleaner (v.1.5.2.1 - all previously built > Cubes Disabled & Dropped) I am getting an error: > 2016-06-27 22:28:08,480 INFO [main StorageCleanupJob:262]: Remove > intermediate hive table with job id fc44da88-cffc-4710-8726-ff910cf83451 with > job status ERROR > usage: StorageCleanupJob > -deleteDelete the unused storage > Exception in thread "main" java.lang.StringIndexOutOfBoundsException: String > index out of range: -2 > at java.lang.String.substring(String.java:1904) > at > org.apache.kylin.storage.hbase.util.StorageCleanupJob.cleanUnusedIntermediateHiveTable(StorageCleanupJob.java:269) > at > org.apache.kylin.storage.hbase.util.StorageCleanupJob.run(StorageCleanupJob.java:91) > at org.apache.hadoop.util.ToolRunner.run(ToolRunner.java:70) > at org.apache.hadoop.util.ToolRunner.run(ToolRunner.java:84) > at > org.apache.kylin.storage.hbase.util.StorageCleanupJob.main(StorageCleanupJob.java:308) > 2016-06-27 22:28:08,486 INFO [Thread-0 > HConnectionManager$HConnectionImplementation:1907]: Closing zookeeper > sessionid=0x154c97461586119 > 2016-06-27 22:28:08,491 INFO [Thread-0 ZooKeeper:684]: Session: > 0x154c97461586119 closed > 
2016-06-27 22:28:08,491 INFO [main-EventThread ClientCnxn:509]: EventThread > shut down -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Created] (KYLIN-1971) Cannot support columns with same name under different table
hongbin ma created KYLIN-1971: - Summary: Cannot support columns with same name under different table Key: KYLIN-1971 URL: https://issues.apache.org/jira/browse/KYLIN-1971 Project: Kylin Issue Type: Improvement Reporter: hongbin ma Assignee: hongbin ma Priority: Minor Currently we implicitly assume that all columns in the model have unique names. For example, in the row key and aggregation group definitions we identify each column by its column name alone (without the table name), so two columns with the same name in different tables cannot be told apart. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Created] (KYLIN-1964) Add a companion tool of CubeMetaExtractor for cube importing
hongbin ma created KYLIN-1964: - Summary: Add a companion tool of CubeMetaExtractor for cube importing Key: KYLIN-1964 URL: https://issues.apache.org/jira/browse/KYLIN-1964 Project: Kylin Issue Type: Wish Reporter: hongbin ma Assignee: hongbin ma Now that we have CubeMetaExtractor for exporting cubes, we also need an importer to import an exported cube. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Created] (KYLIN-1963) Delegate the loading of certain package (like slf4j) to tomcat's parent classloader
hongbin ma created KYLIN-1963: - Summary: Delegate the loading of certain package (like slf4j) to tomcat's parent classloader Key: KYLIN-1963 URL: https://issues.apache.org/jira/browse/KYLIN-1963 Project: Kylin Issue Type: Improvement Reporter: hongbin ma Assignee: hongbin ma Currently we use the hbase command to start Tomcat, which then starts Kylin as a web application. The default classloader that Tomcat assigns to the Kylin application is WebappClassLoader, which searches local repositories before delegating to the parent classloader. This design leads to two separate log4j logging instances, one in the "HBase space" and one in the "Kylin space". The two loggers attempt to write to the same file, which is problematic according to the official documentation. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
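The delegation idea in KYLIN-1963 can be illustrated with a minimal sketch. This is not Tomcat's WebappClassLoader and not Kylin code; the class name `SelectiveDelegationClassLoader` and the prefix list are hypothetical. It only shows a loader that searches its local repositories first, except for configured package prefixes (for example logging packages), which are resolved by the parent first.

```java
import java.net.URL;
import java.net.URLClassLoader;
import java.util.Arrays;
import java.util.List;

/** Hypothetical sketch: child-first loading, except for selected package prefixes. */
final class SelectiveDelegationClassLoader extends URLClassLoader {
    private final List<String> parentFirstPrefixes;

    SelectiveDelegationClassLoader(URL[] localRepositories, ClassLoader parent,
                                   List<String> parentFirstPrefixes) {
        super(localRepositories, parent);
        this.parentFirstPrefixes = parentFirstPrefixes;
    }

    /** True if the class belongs to a package that must come from the parent loader. */
    boolean isParentFirst(String className) {
        for (String prefix : parentFirstPrefixes) {
            if (className.startsWith(prefix)) {
                return true;
            }
        }
        return false;
    }

    @Override
    protected Class<?> loadClass(String name, boolean resolve) throws ClassNotFoundException {
        synchronized (getClassLoadingLock(name)) {
            Class<?> loaded = findLoadedClass(name);
            if (loaded == null) {
                if (name.startsWith("java.") || isParentFirst(name)) {
                    // Standard delegation: ask the parent before the local repositories.
                    loaded = super.loadClass(name, false);
                } else {
                    try {
                        // Web-application style: local repositories first.
                        loaded = findClass(name);
                    } catch (ClassNotFoundException e) {
                        loaded = super.loadClass(name, false);
                    }
                }
            }
            if (resolve) {
                resolveClass(loaded);
            }
            return loaded;
        }
    }

    public static void main(String[] args) throws Exception {
        // The prefixes are assumptions chosen for illustration only.
        try (SelectiveDelegationClassLoader loader = new SelectiveDelegationClassLoader(
                new URL[0], ClassLoader.getSystemClassLoader(),
                Arrays.asList("org.slf4j.", "org.apache.log4j."))) {
            System.out.println(loader.isParentFirst("org.slf4j.Logger"));
            System.out.println(loader.loadClass("java.lang.String") == String.class);
        }
    }
}
```

With such a rule, the logging classes would be shared with whatever the parent loader already initialized, so only one logging instance would exist per JVM.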
[jira] [Commented] (KYLIN-1943) Error while executing SQL "select * from kylin_sales LIMIT 50000": Error in coprocessor
[ https://issues.apache.org/jira/browse/KYLIN-1943?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15416457#comment-15416457 ] hongbin ma commented on KYLIN-1943: --- We had a couple of reports that Kylin does not work well with hbase-1.0.x Suggest upgrade hbase to 1.1.3+ or downgrade to 0.98 > Error while executing SQL "select * from kylin_sales LIMIT 5": Error in > coprocessor > --- > > Key: KYLIN-1943 > URL: https://issues.apache.org/jira/browse/KYLIN-1943 > Project: Kylin > Issue Type: Bug > Components: Query Engine >Affects Versions: 1.x-HBase1.1.3 >Reporter: Ranga >Assignee: liyang >Priority: Blocker > > when I installed the kylin-1.5.3-1.xHbase and run sample.sh, everything looks > good. However , when I queried the result cube in > "Insight", tables "KYLIN_CAL_DT" and "KYLIN_CATEGORY_GROUPINGS" did not show > any error with SQL, as for table > "KYLIN_SALES" could not run and show log: "Error while executing SQL > "select * from KYLIN_SALES LIMIT 5": Error in coprocessor" > At first, I obey the operation which was provided by kylin official website > as :$KYLIN_HOME/bin/kylin.sh > org.apache.kylin.storage.hbase.util.DeployCoprocessorCLI > $KYLIN_HOME/lib/kylin-coprocessor-*.jar all,But it did not work. > Indeed, after building cube based on my source data, I can also not query > cube,and give the exception :"Error while executing SQL "*": Error > in coprocessor" > so ,I need your help. thanks > ps. 
> kylin-1.5.3-1.xHbase > hadoop-2.6.0 > hbase-1.0.3 > hive-1.2.1 > 2016-08-07 05:26:08,578 INFO [http-bio-7070-exec-2] service.QueryService:253 > : > ==[QUERY]=== > SQL: select * from kylin_sales; > User: ADMIN > Success: false > Duration: 0.0 > Project: learn_kylin > Realization Names: [kylin_sales_cube] > Cuboid Ids: [99] > Total scan count: 0 > Result row count: 0 > Accept Partial: true > Is Partial Result: false > Hit Exception Cache: false > Storage cache used: false > Message: Error while executing SQL "select * from kylin_sales LIMIT 5": > Error in coprocessor > ==[QUERY]=== > 2016-08-07 05:26:08,578 ERROR [http-bio-7070-exec-2] > controller.BasicController:44 : > org.apache.kylin.rest.exception.InternalErrorException: Error while executing > SQL "select * from kylin_sales LIMIT 5": Error in coprocessor > at > org.apache.kylin.rest.controller.QueryController.doQueryWithCache(QueryController.java:224) > at > org.apache.kylin.rest.controller.QueryController.query(QueryController.java:94) > at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method) > at > sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62) > at > sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43) > at java.lang.reflect.Method.invoke(Method.java:498) > at > org.springframework.web.method.support.InvocableHandlerMethod.invoke(InvocableHandlerMethod.java:213) > at > org.springframework.web.method.support.InvocableHandlerMethod.invokeForRequest(InvocableHandlerMethod.java:126) > at > org.springframework.web.servlet.mvc.method.annotation.ServletInvocableHandlerMethod.invokeAndHandle(ServletInvocableHandlerMethod.java:96) > at > org.springframework.web.servlet.mvc.method.annotation.RequestMappingHandlerAdapter.invokeHandlerMethod(RequestMappingHandlerAdapter.java:617) > at > org.springframework.web.servlet.mvc.method.annotation.RequestMappingHandlerAdapter.handleInternal(RequestMappingHandlerAdapter.java:578) > at > 
org.springframework.web.servlet.mvc.method.AbstractHandlerMethodAdapter.handle(AbstractHandlerMethodAdapter.java:80) > at > org.springframework.web.servlet.DispatcherServlet.doDispatch(DispatcherServlet.java:923) > at > org.springframework.web.servlet.DispatcherServlet.doService(DispatcherServlet.java:852) > at > org.springframework.web.servlet.FrameworkServlet.processRequest(FrameworkServlet.java:882) > at > org.springframework.web.servlet.FrameworkServlet.doPost(FrameworkServlet.java:789) > at javax.servlet.http.HttpServlet.service(HttpServlet.java:650) > at javax.servlet.http.HttpServlet.service(HttpServlet.java:731) > at > org.apache.catalina.core.ApplicationFilterChain.internalDoFilter(ApplicationFilterChain.java:303) > at > org.apache.catalina.core.ApplicationFilterChain.doFilter(ApplicationFilterChain.java:208) > at > org.apache.tomcat.websocket.server.WsFilter.doFilter(WsFilter.java:52) > at > org.apache.catalina.core.ApplicationFilterChain.internalDoFilter(ApplicationFilterChain.java:241) > at >
[jira] [Closed] (KYLIN-1943) Error while executing SQL "select * from kylin_sales LIMIT 50000": Error in coprocessor
[ https://issues.apache.org/jira/browse/KYLIN-1943?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] hongbin ma closed KYLIN-1943. - > Error while executing SQL "select * from kylin_sales LIMIT 5": Error in > coprocessor > --- > > Key: KYLIN-1943 > URL: https://issues.apache.org/jira/browse/KYLIN-1943 > Project: Kylin > Issue Type: Bug > Components: Query Engine >Affects Versions: 1.x-HBase1.1.3 >Reporter: Ranga >Assignee: liyang >Priority: Blocker > > when I installed the kylin-1.5.3-1.xHbase and run sample.sh, everything looks > good. However , when I queried the result cube in > "Insight", tables "KYLIN_CAL_DT" and "KYLIN_CATEGORY_GROUPINGS" did not show > any error with SQL, as for table > "KYLIN_SALES" could not run and show log: "Error while executing SQL > "select * from KYLIN_SALES LIMIT 5": Error in coprocessor" > At first, I obey the operation which was provided by kylin official website > as :$KYLIN_HOME/bin/kylin.sh > org.apache.kylin.storage.hbase.util.DeployCoprocessorCLI > $KYLIN_HOME/lib/kylin-coprocessor-*.jar all,But it did not work. > Indeed, after building cube based on my source data, I can also not query > cube,and give the exception :"Error while executing SQL "*": Error > in coprocessor" > so ,I need your help. thanks > ps. 
> kylin-1.5.3-1.xHbase > hadoop-2.6.0 > hbase-1.0.3 > hive-1.2.1 > 2016-08-07 05:26:08,578 INFO [http-bio-7070-exec-2] service.QueryService:253 > : > ==[QUERY]=== > SQL: select * from kylin_sales; > User: ADMIN > Success: false > Duration: 0.0 > Project: learn_kylin > Realization Names: [kylin_sales_cube] > Cuboid Ids: [99] > Total scan count: 0 > Result row count: 0 > Accept Partial: true > Is Partial Result: false > Hit Exception Cache: false > Storage cache used: false > Message: Error while executing SQL "select * from kylin_sales LIMIT 5": > Error in coprocessor > ==[QUERY]=== > 2016-08-07 05:26:08,578 ERROR [http-bio-7070-exec-2] > controller.BasicController:44 : > org.apache.kylin.rest.exception.InternalErrorException: Error while executing > SQL "select * from kylin_sales LIMIT 5": Error in coprocessor > at > org.apache.kylin.rest.controller.QueryController.doQueryWithCache(QueryController.java:224) > at > org.apache.kylin.rest.controller.QueryController.query(QueryController.java:94) > at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method) > at > sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62) > at > sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43) > at java.lang.reflect.Method.invoke(Method.java:498) > at > org.springframework.web.method.support.InvocableHandlerMethod.invoke(InvocableHandlerMethod.java:213) > at > org.springframework.web.method.support.InvocableHandlerMethod.invokeForRequest(InvocableHandlerMethod.java:126) > at > org.springframework.web.servlet.mvc.method.annotation.ServletInvocableHandlerMethod.invokeAndHandle(ServletInvocableHandlerMethod.java:96) > at > org.springframework.web.servlet.mvc.method.annotation.RequestMappingHandlerAdapter.invokeHandlerMethod(RequestMappingHandlerAdapter.java:617) > at > org.springframework.web.servlet.mvc.method.annotation.RequestMappingHandlerAdapter.handleInternal(RequestMappingHandlerAdapter.java:578) > at > 
org.springframework.web.servlet.mvc.method.AbstractHandlerMethodAdapter.handle(AbstractHandlerMethodAdapter.java:80) > at > org.springframework.web.servlet.DispatcherServlet.doDispatch(DispatcherServlet.java:923) > at > org.springframework.web.servlet.DispatcherServlet.doService(DispatcherServlet.java:852) > at > org.springframework.web.servlet.FrameworkServlet.processRequest(FrameworkServlet.java:882) > at > org.springframework.web.servlet.FrameworkServlet.doPost(FrameworkServlet.java:789) > at javax.servlet.http.HttpServlet.service(HttpServlet.java:650) > at javax.servlet.http.HttpServlet.service(HttpServlet.java:731) > at > org.apache.catalina.core.ApplicationFilterChain.internalDoFilter(ApplicationFilterChain.java:303) > at > org.apache.catalina.core.ApplicationFilterChain.doFilter(ApplicationFilterChain.java:208) > at > org.apache.tomcat.websocket.server.WsFilter.doFilter(WsFilter.java:52) > at > org.apache.catalina.core.ApplicationFilterChain.internalDoFilter(ApplicationFilterChain.java:241) > at > org.apache.catalina.core.ApplicationFilterChain.doFilter(ApplicationFilterChain.java:208) > at >
[jira] [Updated] (KYLIN-1951) cube sergement start time don't match partition start time
[ https://issues.apache.org/jira/browse/KYLIN-1951?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] hongbin ma updated KYLIN-1951: -- Assignee: Zhong,Jason > cube sergement start time don't match partition start time > -- > > Key: KYLIN-1951 > URL: https://issues.apache.org/jira/browse/KYLIN-1951 > Project: Kylin > Issue Type: Bug >Reporter: 一岁时很拽 >Assignee: Zhong,Jason > > for streaming cube build, > i set kylin.rest.timezone=GMT+8,but cube sergement start time don't match > partition start time, A difference of 8 hours,why? > http://apache-kylin.74782.x6.nabble.com/cube-sergement-start-time-don-t-match-partition-start-time-td5538.html -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Created] (KYLIN-1954) BuildInFunctionTransformer should be executed per CubeSegmentScanner
hongbin ma created KYLIN-1954: - Summary: BuildInFunctionTransformer should be executed per CubeSegmentScanner Key: KYLIN-1954 URL: https://issues.apache.org/jira/browse/KYLIN-1954 Project: Kylin Issue Type: Improvement Affects Versions: v1.5.3 Reporter: hongbin ma Assignee: hongbin ma
Reported on the dev mailing list, thread "Question abount BuildInFunctionTransformer":
Sorry for the wrong description and thanks for the explanation. I have another question on this.
Case 1
select merchant_name,dt_day,count(*) from session_view_shop_0 where merchant_name like '%深海新创手机%' and dt_year='2016' and dt_month='07' and dt_day >='25' and dt_day <='28' group by merchant_name,dt_day
2016-08-05 09:25:06,263 INFO [http-bio-7070-exec-10] dict.BuildInFunctionTransformer:66 : Translated {LIKE(KYLIN_REPORT_DB.SESSION_VIEW_SHOP_0.MERCHANT_NAME,%深海新创手机%)} to IN clause: {KYLIN_REPORT_DB.SESSION_VIEW_SHOP_0.MERCHANT_NAME IN []}
Result 1
深海新创手机专营店80002972 28 6360
深海新创手机专营店80002972 27 5501
深海新创手机专营店80002972 26 4830
Case 2
select merchant_name,dt_day,count(*) from session_view_shop_0 where merchant_name like '%深海新创%' and dt_year='2016' and dt_month='07' and dt_day >='25' and dt_day <='28' group by merchant_name,dt_day
2016-08-05 09:37:55,469 INFO [http-bio-7070-exec-15] dict.BuildInFunctionTransformer:66 : Translated {LIKE(KYLIN_REPORT_DB.SESSION_VIEW_SHOP_0.MERCHANT_NAME,%深海新创%)} to IN clause: {KYLIN_REPORT_DB.SESSION_VIEW_SHOP_0.MERCHANT_NAME IN [深海新创专营店80002972]}
Result 2
深海新创专营店80002972 25 5283
'深海新创手机专营店80002972' is expected in result 2, since case 1 shows that it exists.
In CubeStorageQuery.search / CubeSegmentScanner, when the filter is translated for the first segment, it is changed to a CompareTupleFilter (IN clause), and the translation is not triggered again for the following segments. This is not right, because the dictionary is not the same for every segment. Assume data like this:
merchant_name | cube segment
深海新创专营 | 20160725
深海新创手机 | 20160726
When searching with like '%深海新创%', CubeSegmentScanner scans segment '20160725' and the filter is changed to an IN clause (IN '深海新创专营'). The result is right for this segment, but not for the following segments, because the filter has already been changed. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
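The fix requested in KYLIN-1954 amounts to translating the LIKE filter separately against each segment's dictionary instead of reusing the IN clause produced for the first segment. The following is a minimal sketch of that idea, not Kylin's actual BuildInFunctionTransformer; the class and method names are hypothetical, dictionaries are modelled as plain string lists, and the ASCII values are stand-ins for the merchant names in the report.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

/** Hypothetical sketch: rebuild the IN list for every segment from the original LIKE filter. */
final class PerSegmentLikeTranslationSketch {

    /** Translates LIKE '%needle%' into the IN values found in one segment's dictionary. */
    static List<String> likeToIn(String needle, List<String> segmentDictionary) {
        List<String> inValues = new ArrayList<>();
        for (String value : segmentDictionary) {
            if (value.contains(needle)) {
                inValues.add(value);
            }
        }
        return inValues;
    }

    /** Runs the translation once per segment, always starting from the untouched LIKE needle. */
    static List<List<String>> translatePerSegment(String needle,
                                                  List<List<String>> segmentDictionaries) {
        List<List<String>> filters = new ArrayList<>();
        for (List<String> dictionary : segmentDictionaries) {
            filters.add(likeToIn(needle, dictionary));
        }
        return filters;
    }

    public static void main(String[] args) {
        // Stand-ins for the dictionaries of segments 20160725 and 20160726.
        List<List<String>> dictionaries = Arrays.asList(
                Arrays.asList("deepsea-store"),
                Arrays.asList("deepsea-phone"));
        System.out.println(translatePerSegment("deepsea", dictionaries));
    }
}
```

Because the original LIKE pattern is never overwritten, the second segment gets its own IN list rather than inheriting the values that happened to exist in the first segment.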
[jira] [Commented] (KYLIN-1949) selected meta_categ_name as dimension by mistake in subquery/query11.sql
[ https://issues.apache.org/jira/browse/KYLIN-1949?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15413268#comment-15413268 ] hongbin ma commented on KYLIN-1949: --- In subquery/query11.sql, meta_categ_name appears in the group by clause; however, it should not be one of the dimensions pushed down to the cuboid. > selected meta_categ_name as dimension by mistake in subquery/query11.sql > > > Key: KYLIN-1949 > URL: https://issues.apache.org/jira/browse/KYLIN-1949 > Project: Kylin > Issue Type: Bug >Reporter: hongbin ma >Assignee: hongbin ma > -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Created] (KYLIN-1949) selected meta_categ_name as dimension by mistake in subquery/query11.sql
hongbin ma created KYLIN-1949: - Summary: selected meta_categ_name as dimension by mistake in subquery/query11.sql Key: KYLIN-1949 URL: https://issues.apache.org/jira/browse/KYLIN-1949 Project: Kylin Issue Type: Bug Reporter: hongbin ma Assignee: hongbin ma -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (KYLIN-1934) 'Value not exist' During Cube Merging Caused by Empty Dict
[ https://issues.apache.org/jira/browse/KYLIN-1934?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15407921#comment-15407921 ] hongbin ma commented on KYLIN-1934: --- Nice finding! The patch is good, please merge it to master. > 'Value not exist' During Cube Merging Caused by Empty Dict > -- > > Key: KYLIN-1934 > URL: https://issues.apache.org/jira/browse/KYLIN-1934 > Project: Kylin > Issue Type: Bug > Components: Job Engine >Affects Versions: v1.5.4 >Reporter: Yerui Sun >Assignee: Yerui Sun >Priority: Critical > Fix For: v1.5.4 > > Attachments: KYLIN-1934.patch > > > When cubes are merged, a new dictionary is created that contains all values > from the old dictionaries. > The values in the old dictionaries are enumerated by MultipleDictionaryValueEnumerator. > However, if the first dictionary is empty, Enumerator.moveNext() returns > false immediately and ignores all values in the other dictionaries, so the new dictionary is > also empty. > The cube merge then fails because the new dictionary contains no values. > Not sure whether this issue is related to KYLIN-1834. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
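The behavior described in KYLIN-1934 can be sketched as an enumerator that keeps advancing past empty dictionaries instead of stopping at the first one. This is an illustrative stand-in, not the attached patch or Kylin's MultipleDictionaryValueEnumerator; the class name is hypothetical and dictionaries are modelled as plain string lists.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Collections;
import java.util.List;

/** Hypothetical sketch: enumerate the values of several dictionaries, skipping empty ones. */
final class MultiDictionaryEnumeratorSketch {
    private final List<List<String>> dictionaries;
    private int dictionaryIndex = 0;
    private int valueIndex = -1;
    private String current;

    MultiDictionaryEnumeratorSketch(List<List<String>> dictionaries) {
        this.dictionaries = dictionaries;
    }

    /** Moves to the next value; an exhausted or empty dictionary just moves on to the next one. */
    boolean moveNext() {
        while (dictionaryIndex < dictionaries.size()) {
            List<String> dictionary = dictionaries.get(dictionaryIndex);
            if (valueIndex + 1 < dictionary.size()) {
                valueIndex++;
                current = dictionary.get(valueIndex);
                return true;
            }
            // Nothing (left) in this dictionary: continue with the next instead of returning false.
            dictionaryIndex++;
            valueIndex = -1;
        }
        return false;
    }

    String current() {
        return current;
    }

    /** Collects every value the enumerator yields, in order. */
    static List<String> collectAll(List<List<String>> dictionaries) {
        MultiDictionaryEnumeratorSketch enumerator = new MultiDictionaryEnumeratorSketch(dictionaries);
        List<String> values = new ArrayList<>();
        while (enumerator.moveNext()) {
            values.add(enumerator.current());
        }
        return values;
    }

    public static void main(String[] args) {
        List<List<String>> dictionaries = Arrays.asList(
                Collections.<String>emptyList(),   // an empty first dictionary, as in the report
                Arrays.asList("a", "b"));
        System.out.println(collectAll(dictionaries));
    }
}
```

The key point is the loop in `moveNext()`: an empty first dictionary no longer hides the values of the dictionaries that follow it.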
[jira] [Updated] (KYLIN-1936) Improve enable limit logic (exactAggregation is too strict)
[ https://issues.apache.org/jira/browse/KYLIN-1936?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] hongbin ma updated KYLIN-1936: -- Affects Version/s: v1.5.3 > Improve enable limit logic (exactAggregation is too strict) > --- > > Key: KYLIN-1936 > URL: https://issues.apache.org/jira/browse/KYLIN-1936 > Project: Kylin > Issue Type: Improvement >Affects Versions: v1.5.3 >Reporter: hongbin ma >Assignee: hongbin ma > > from zhaotians...@meizu.com: > recently I got the following error while execute query on a cube which is not > that big( about 400mb, 20milion record) > == > Error while executing SQL "select FCRASHTIME,count(1) from > UXIP.EDL_FDT_OUC_UPLOAD_FILES group by FCRASH_ANALYSIS_ID,FCRASHTIME limit > 1": Scan row count exceeded threshold: 1000, please add filter condition > to narrow down backend scan range, like where clause. > I guess what it scan were the intermediate result, but It doesn't any order > by,also the result count is limit to just 1.so it could scan to find any > record with those two dimension and wala. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (KYLIN-1936) Improve enable limit logic (exactAggregation is too strict)
[ https://issues.apache.org/jira/browse/KYLIN-1936?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15405298#comment-15405298 ] hongbin ma commented on KYLIN-1936: --- Kylin basically limits the footprint of each storage visit: the number of rows allowed to be read is inversely proportional to the size of each row. The cuboid row size grows when there is a distinct count measure, which is why you observe a threshold of 49121; this is normal. The abnormal part is Kylin's behavior when there is a limit clause, especially in cases like Tianshuo's, where the query is: select FCRASHTIME,count(1) from UXIP.EDL_FDT_OUC_UPLOAD_FILES group by FCRASH_ANALYSIS_ID,FCRASHTIME limit N The query does not have any filters, so we should be able to read the first N rows from cuboid (FCRASH_ANALYSIS_ID,FCRASHTIME) and return the result to users. Yang Li tried to fix the issue in https://issues.apache.org/jira/browse/KYLIN-1787, however the approach is still a little too conservative in my view. The patch in KYLIN-1787 does not enable the storage read limit as long as the cube has a partition time column and the query does not group by that column, because rows from different segments then need further aggregation. This is why 1.5.3 does not behave as Tianshuo expects. However, there is still room for improvement even when further aggregation is required across multiple segments. For Tianshuo's case, we can ask for N cuboid rows from each segment and merge them on the query server side. Since the cuboid rows are sorted within each segment, the result is guaranteed to be correct. However, it is a different story if the query contains filters, as in Tiansheng's case. Filters on dimensions may prevent the limit clause from being pushed down, especially when the filtered dimension is not the first dimension in the row key.
Below is Tiansheng's case: Error while executing SQL "select "DATE",ADGROUPID,CAMPAIGNID,COMPANYID,APPID,SUM(IMPS) as imps,SUM(CLKS) as clks,SUM(CONS) as cons, (SUM(IMP_CLOSINGPRICE)+SUM(CLK_CLOSINGPRICE)) as cost,COUNT(DISTINCT CLK_DEVICEID) as clk_uv from EXT_MID_EVENT_JOIN where COMPANYID='296' and "DATE">='2016-01-01' and "DATE"<'2016-01-05' group by "DATE",ADGROUPID,CAMPAIGNID,COMPANYID,APPID order by imps desc limit 10 offset 0": Scan row count exceeded threshold: 49121, please add filter condition to narrow down backend scan range, like where clause. > Improve enable limit logic (exactAggregation is too strict) > --- > > Key: KYLIN-1936 > URL: https://issues.apache.org/jira/browse/KYLIN-1936 > Project: Kylin > Issue Type: Improvement >Reporter: hongbin ma >Assignee: hongbin ma > > from zhaotians...@meizu.com: > recently I got the following error while execute query on a cube which is not > that big( about 400mb, 20milion record) > == > Error while executing SQL "select FCRASHTIME,count(1) from > UXIP.EDL_FDT_OUC_UPLOAD_FILES group by FCRASH_ANALYSIS_ID,FCRASHTIME limit > 1": Scan row count exceeded threshold: 1000, please add filter condition > to narrow down backend scan range, like where clause. > I guess what it scan were the intermediate result, but It doesn't any order > by,also the result count is limit to just 1.so it could scan to find any > record with those two dimension and wala. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
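The per-segment limit idea from the comment on KYLIN-1936 (ask each segment for its first N cuboid rows, then merge on the query server) can be sketched as follows. This is an illustration under simplifying assumptions, not Kylin's storage code: each segment is modelled as a sorted map from a group-by key to a count, there are no filters and no order by, and the class and method names are hypothetical.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Map;
import java.util.SortedMap;
import java.util.TreeMap;

/** Hypothetical sketch: push "limit N" to every segment, then merge and re-apply the limit. */
final class SegmentLimitMergeSketch {

    static SortedMap<String, Long> mergeWithLimit(List<SortedMap<String, Long>> segments, int limit) {
        TreeMap<String, Long> merged = new TreeMap<>();
        for (SortedMap<String, Long> segment : segments) {
            int taken = 0;
            // Read only the first `limit` rows of this segment (rows are sorted by key).
            for (Map.Entry<String, Long> row : segment.entrySet()) {
                if (taken++ >= limit) {
                    break;
                }
                // Rows with the same key from different segments are aggregated.
                merged.merge(row.getKey(), row.getValue(), Long::sum);
            }
        }
        // Keep the first `limit` merged keys. Fewer than `limit` keys precede any of them,
        // so each one was inside the first `limit` rows of every segment that contains it.
        TreeMap<String, Long> result = new TreeMap<>();
        for (Map.Entry<String, Long> row : merged.entrySet()) {
            if (result.size() >= limit) {
                break;
            }
            result.put(row.getKey(), row.getValue());
        }
        return result;
    }

    public static void main(String[] args) {
        TreeMap<String, Long> segment1 = new TreeMap<>();
        segment1.put("a", 1L);
        segment1.put("b", 2L);
        segment1.put("c", 3L);
        TreeMap<String, Long> segment2 = new TreeMap<>();
        segment2.put("b", 10L);
        segment2.put("d", 4L);

        List<SortedMap<String, Long>> segments = new ArrayList<>();
        segments.add(segment1);
        segments.add(segment2);
        System.out.println(mergeWithLimit(segments, 2));
    }
}
```

At most N rows per segment are read, yet the N returned groups carry the same aggregated values a full scan would produce, which is the correctness argument made in the comment. With filters or an order by on a measure this reasoning no longer holds, as the comment notes for Tiansheng's case.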
[jira] [Created] (KYLIN-1936) Improve enable limit logic (exactAggregation is too strict)
hongbin ma created KYLIN-1936: - Summary: Improve enable limit logic (exactAggregation is too strict) Key: KYLIN-1936 URL: https://issues.apache.org/jira/browse/KYLIN-1936 Project: Kylin Issue Type: Improvement Reporter: hongbin ma Assignee: hongbin ma from zhaotians...@meizu.com: recently I got the following error while execute query on a cube which is not that big( about 400mb, 20milion record) == Error while executing SQL "select FCRASHTIME,count(1) from UXIP.EDL_FDT_OUC_UPLOAD_FILES group by FCRASH_ANALYSIS_ID,FCRASHTIME limit 1": Scan row count exceeded threshold: 1000, please add filter condition to narrow down backend scan range, like where clause. I guess what it scan were the intermediate result, but It doesn't any order by,also the result count is limit to just 1.so it could scan to find any record with those two dimension and wala. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Resolved] (KYLIN-1924) Region server metrics: replace int type for long type for scanned row count
[ https://issues.apache.org/jira/browse/KYLIN-1924?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] hongbin ma resolved KYLIN-1924. --- Resolution: Fixed Fix Version/s: v1.5.4 > Region server metrics: replace int type for long type for scanned row count > --- > > Key: KYLIN-1924 > URL: https://issues.apache.org/jira/browse/KYLIN-1924 > Project: Kylin > Issue Type: Improvement >Reporter: hongbin ma >Assignee: hongbin ma > Fix For: v1.5.4 > > -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Created] (KYLIN-1924) Region server metrics: replace int type for long type for scanned row count
hongbin ma created KYLIN-1924: - Summary: Region server metrics: replace int type for long type for scanned row count Key: KYLIN-1924 URL: https://issues.apache.org/jira/browse/KYLIN-1924 Project: Kylin Issue Type: Improvement Reporter: hongbin ma Assignee: hongbin ma -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Created] (KYLIN-1922) Improve the logic to decide whether to pre aggregate on Region server
hongbin ma created KYLIN-1922: - Summary: Improve the logic to decide whether to pre aggregate on Region server Key: KYLIN-1922 URL: https://issues.apache.org/jira/browse/KYLIN-1922 Project: Kylin Issue Type: Improvement Reporter: hongbin ma Assignee: hongbin ma -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (KYLIN-1796) query return 0 results when join with sub sql with composite keys
[ https://issues.apache.org/jira/browse/KYLIN-1796?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15391084#comment-15391084 ] hongbin ma commented on KYLIN-1796: --- hi [~wormholer] does this issue relate with your recent fix on query engine? > query return 0 results when join with sub sql with composite keys > - > > Key: KYLIN-1796 > URL: https://issues.apache.org/jira/browse/KYLIN-1796 > Project: Kylin > Issue Type: Bug > Components: Query Engine >Affects Versions: v1.5.2.1 >Reporter: Zhong,Jason >Assignee: liyang > > from kylin sample cube 'kylin_sales_cube' > after build successfully. > if we want to query sql like this. > " > SELECT COUNT(*) FROM KYLIN_SALES as a INNER JOIN > KYLIN_CATEGORY_GROUPINGS as b > on a.LEAF_CATEG_ID = b.LEAF_CATEG_ID > and a.LSTG_SITE_ID = b.SITE_ID > group by b.META_CATEG_NAME > " > it works!! > but if we query same sql like this > " > select count(*) as _count from kylin_sales as a > inner join > (select leaf_categ_id as _leaf_categ_id, site_id as _site_id, > meta_categ_name as _meta_categ_name from kylin_category_groupings) > as b > on (a.leaf_categ_id = b._leaf_categ_id and a.lstg_site_id = b._site_id) > group by b._meta_categ_name order by count(*) desc > " > it return 0 results. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (KYLIN-1743) Preview all the valid cuboids at web wizard
[ https://issues.apache.org/jira/browse/KYLIN-1743?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15391077#comment-15391077 ] hongbin ma commented on KYLIN-1743: --- Hi [~cal...@gmail.com], we both agree that solution sizing is very helpful to modellers; however, honestly, it is complex and has many influencing factors. We welcome all kinds of suggestions and ideas to overcome this. For simplicity we'll narrow the current JIRA's scope to "Preview cuboids" only. > Preview all the valid cuboids at web wizard > --- > > Key: KYLIN-1743 > URL: https://issues.apache.org/jira/browse/KYLIN-1743 > Project: Kylin > Issue Type: Improvement >Reporter: hongbin ma >Assignee: hongbin ma > > It would be nice if the modeller could preview all the valid cuboids before cube > building, rather than having to wait until the cube is built. Even though we don't have > size estimation before cubing, a cuboid tree will suffice to give the modeller > an impression. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Comment Edited] (KYLIN-1388) Different realization under one model could share some cubing steps
[ https://issues.apache.org/jira/browse/KYLIN-1388?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15391074#comment-15391074 ] hongbin ma edited comment on KYLIN-1388 at 7/24/16 2:37 PM: Dictionaries are already globally shared was (Author: mahongbin): Dictionaries are already globally > Different realization under one model could share some cubing steps > --- > > Key: KYLIN-1388 > URL: https://issues.apache.org/jira/browse/KYLIN-1388 > Project: Kylin > Issue Type: Improvement >Reporter: hongbin ma >Assignee: hongbin ma > > The data model behind each realizations(cubes) has shared resources, most > significantly being the flattened hive table and the dictionaries. The > realizations can check if other realization (with the same model) has already > created shared resources. If yes, it can directly skip these steps to save > time/resource -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (KYLIN-1388) Different realization under one model could share some cubing steps
[ https://issues.apache.org/jira/browse/KYLIN-1388?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15391074#comment-15391074 ] hongbin ma commented on KYLIN-1388: --- Dictionaries are already globally > Different realization under one model could share some cubing steps > --- > > Key: KYLIN-1388 > URL: https://issues.apache.org/jira/browse/KYLIN-1388 > Project: Kylin > Issue Type: Improvement >Reporter: hongbin ma >Assignee: hongbin ma > > The data model behind each realizations(cubes) has shared resources, most > significantly being the flattened hive table and the dictionaries. The > realizations can check if other realization (with the same model) has already > created shared resources. If yes, it can directly skip these steps to save > time/resource -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (KYLIN-1847) Cleanup of Intermediate tables not working well
[ https://issues.apache.org/jira/browse/KYLIN-1847?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15391073#comment-15391073 ] hongbin ma commented on KYLIN-1847: --- Kylin currently does not guarantee garbage cleanup for discarded jobs. Instead, admins are expected to run StorageCleanupJob periodically to clean up > Cleanup of Intermediate tables not working well > --- > > Key: KYLIN-1847 > URL: https://issues.apache.org/jira/browse/KYLIN-1847 > Project: Kylin > Issue Type: Bug >Affects Versions: v1.5.2, v1.5.2.1 >Reporter: Richard Calaba > > I have realized that the Hive tables > kylin_intermediate__ are not cleaned up properly after cancelling all pending > build jobs and dropping the cube. > It could be that I didn't execute Purge before dropping the cube ... just a > theory, not 100% sure. > I also suspect that on HDFS, in the /kylin/kylin_metadata/ directory, I have > too much uncleaned data ... considering that I currently have only 1 cube > with a pending build job, I see too many subdirectories there ... > There might be some relation to the JIRA I already reported as well ... > https://issues.apache.org/jira/browse/KYLIN-1828 - but again not 100% sure > this is the sole reason. > My impression is that the cleanup logic after dropping a cube needs to be > re-checked. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (KYLIN-1828) java.lang.StringIndexOutOfBoundsException in org.apache.kylin.storage.hbase.util.StorageCleanupJob
[ https://issues.apache.org/jira/browse/KYLIN-1828?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15391071#comment-15391071 ] hongbin ma commented on KYLIN-1828: --- thank you [~cal...@gmail.com] for raising the issue. It seems to be a bug here, I'll follow > java.lang.StringIndexOutOfBoundsException in > org.apache.kylin.storage.hbase.util.StorageCleanupJob > -- > > Key: KYLIN-1828 > URL: https://issues.apache.org/jira/browse/KYLIN-1828 > Project: Kylin > Issue Type: Bug > Components: Job Engine >Affects Versions: v1.5.2.1 >Reporter: Richard Calaba > > While running storage cleanup job: > ./bin/kylin.sh org.apache.kylin.storage.hbase.util.StorageCleanupJob --delete > true > I see Hive tables in form > kylin_intermediate__1970010100_20160701031500 > in the defaul schema. > While running the above storage cleaner (v.1.5.2.1 - all previously built > Cubes Disabled & Dropped) I am getting an error: > 2016-06-27 22:28:08,480 INFO [main StorageCleanupJob:262]: Remove > intermediate hive table with job id fc44da88-cffc-4710-8726-ff910cf83451 with > job status ERROR > usage: StorageCleanupJob > -deleteDelete the unused storage > Exception in thread "main" java.lang.StringIndexOutOfBoundsException: String > index out of range: -2 > at java.lang.String.substring(String.java:1904) > at > org.apache.kylin.storage.hbase.util.StorageCleanupJob.cleanUnusedIntermediateHiveTable(StorageCleanupJob.java:269) > at > org.apache.kylin.storage.hbase.util.StorageCleanupJob.run(StorageCleanupJob.java:91) > at org.apache.hadoop.util.ToolRunner.run(ToolRunner.java:70) > at org.apache.hadoop.util.ToolRunner.run(ToolRunner.java:84) > at > org.apache.kylin.storage.hbase.util.StorageCleanupJob.main(StorageCleanupJob.java:308) > 2016-06-27 22:28:08,486 INFO [Thread-0 > HConnectionManager$HConnectionImplementation:1907]: Closing zookeeper > sessionid=0x154c97461586119 > 2016-06-27 22:28:08,491 INFO [Thread-0 ZooKeeper:684]: Session: > 0x154c97461586119 closed > 2016-06-27 
22:28:08,491 INFO [main-EventThread ClientCnxn:509]: EventThread > shut down -- This message was sent by Atlassian JIRA (v6.3.4#6332)
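A note on the trace above: a negative value in "String index out of range" typically means substring() was called with an index derived from a failed search or a length assumption that did not hold for this table name. The following is only a minimal sketch of a defensive pattern; suffixAfter and the marker string are hypothetical and are not taken from StorageCleanupJob.

```java
import java.util.Optional;

class IntermediateTableNames {
    // Hypothetical helper, not Kylin's actual code: returns the part of the
    // table name that follows the last occurrence of `marker`, or empty if
    // the marker is absent.
    static Optional<String> suffixAfter(String tableName, String marker) {
        int idx = tableName.lastIndexOf(marker);
        if (idx < 0) {
            // Without this check, index arithmetic on -1 can produce a
            // negative argument and substring() throws.
            return Optional.empty();
        }
        return Optional.of(tableName.substring(idx + marker.length()));
    }

    public static void main(String[] args) {
        System.out.println(suffixAfter("kylin_intermediate_cube_a1b2", "_cube_"));
        System.out.println(suffixAfter("unrelated_table", "_cube_"));
    }
}
```

Returning an empty Optional lets the caller skip a table whose name does not match the expected pattern instead of aborting the whole cleanup run.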
[jira] [Assigned] (KYLIN-1828) java.lang.StringIndexOutOfBoundsException in org.apache.kylin.storage.hbase.util.StorageCleanupJob
[ https://issues.apache.org/jira/browse/KYLIN-1828?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] hongbin ma reassigned KYLIN-1828: - Assignee: hongbin ma > java.lang.StringIndexOutOfBoundsException in > org.apache.kylin.storage.hbase.util.StorageCleanupJob > -- > > Key: KYLIN-1828 > URL: https://issues.apache.org/jira/browse/KYLIN-1828 > Project: Kylin > Issue Type: Bug > Components: Job Engine >Affects Versions: v1.5.2.1 >Reporter: Richard Calaba >Assignee: hongbin ma > > While running storage cleanup job: > ./bin/kylin.sh org.apache.kylin.storage.hbase.util.StorageCleanupJob --delete > true > I see Hive tables in form > kylin_intermediate__1970010100_20160701031500 > in the defaul schema. > While running the above storage cleaner (v.1.5.2.1 - all previously built > Cubes Disabled & Dropped) I am getting an error: > 2016-06-27 22:28:08,480 INFO [main StorageCleanupJob:262]: Remove > intermediate hive table with job id fc44da88-cffc-4710-8726-ff910cf83451 with > job status ERROR > usage: StorageCleanupJob > -deleteDelete the unused storage > Exception in thread "main" java.lang.StringIndexOutOfBoundsException: String > index out of range: -2 > at java.lang.String.substring(String.java:1904) > at > org.apache.kylin.storage.hbase.util.StorageCleanupJob.cleanUnusedIntermediateHiveTable(StorageCleanupJob.java:269) > at > org.apache.kylin.storage.hbase.util.StorageCleanupJob.run(StorageCleanupJob.java:91) > at org.apache.hadoop.util.ToolRunner.run(ToolRunner.java:70) > at org.apache.hadoop.util.ToolRunner.run(ToolRunner.java:84) > at > org.apache.kylin.storage.hbase.util.StorageCleanupJob.main(StorageCleanupJob.java:308) > 2016-06-27 22:28:08,486 INFO [Thread-0 > HConnectionManager$HConnectionImplementation:1907]: Closing zookeeper > sessionid=0x154c97461586119 > 2016-06-27 22:28:08,491 INFO [Thread-0 ZooKeeper:684]: Session: > 0x154c97461586119 closed > 2016-06-27 22:28:08,491 INFO [main-EventThread ClientCnxn:509]: EventThread > shut down -- This 
message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Comment Edited] (KYLIN-1855) Should exclude those joins in whose related lookup tables no dimensions are used in cube
[ https://issues.apache.org/jira/browse/KYLIN-1855?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15391063#comment-15391063 ] hongbin ma edited comment on KYLIN-1855 at 7/24/16 1:41 PM: It seems you moved maven-checkstyle-plugin from build -> plugins to build -> pluginManagement -> plugins in https://github.com/apache/kylin/commit/780d002fbac3b20a5d8194957ed9008d84d71ad1. Why? After I move it back, the checks come back. was (Author: mahongbin): seems you moved maven-checkstyle-plugin from build-> plugins to build -> pluginManagement -> plugins. why? After I move back the checks come back. > Should exclude those joins in whose related lookup tables no dimensions are > used in cube > > > Key: KYLIN-1855 > URL: https://issues.apache.org/jira/browse/KYLIN-1855 > Project: Kylin > Issue Type: Improvement >Reporter: Zhong Yanghong >Assignee: Zhong Yanghong > Fix For: v1.5.4 > > Attachments: exclude_unused_joins.patch > > > A cube is based on a model in which a star schema is defined. In some cases, > the cube uses only a few of the lookup tables rather than all of them. In this case, > when creating the SQL for the flat table, the unused lookup tables should not be > included. Otherwise, it will confuse users when they query. If users query > according to the definition of the flat table, a "no realization" error will > occur because the related join is missing. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Updated] (KYLIN-1913) query log printed abnormally if the query contains "\r" (not "\r\n")
[ https://issues.apache.org/jira/browse/KYLIN-1913?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] hongbin ma updated KYLIN-1913: -- Fix Version/s: (was: 1.5.3) v1.5.3 > query log printed abnormally if the query contains "\r" (not "\r\n") > - > > Key: KYLIN-1913 > URL: https://issues.apache.org/jira/browse/KYLIN-1913 > Project: Kylin > Issue Type: Bug >Affects Versions: v1.5.3 >Reporter: hongbin ma >Assignee: hongbin ma > Fix For: v1.6.0, v1.5.3 > > > when the sql request is: > {"sql":"select sum(lo_revenue) as lo_revenue, d_year, p_brand\rfrom > v_lineorder\rleft join dates on lo_orderdate = d_datekey\rleft join part on > lo_partkey = p_partkey\rleft join supplier on lo_suppkey = s_suppkey\rwhere > p_category = 'MFGR#0206' and s_region = 'AMERICA'\rgroup by d_year, > p_brand\rorder by d_year, > p_brand\r","offset":0,"limit":5,"acceptPartial":true,"project":"ssb"} > the log output will be: > ==[QUERY]=== > order by d_year, p_brand#0206' and s_region = 'AMERICA'and > User: ADMIN > Success: true > Duration: 28.288 > Project: ssb > Realization Names: [ssb] > Cuboid Ids: [1064] > Total scan count: 2086851 > Result row count: 6553 > Accept Partial: true > Is Partial Result: false > Hit Exception Cache: false > Storage cache used: false > Message: null > ==[QUERY]=== -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Updated] (KYLIN-1913) query log printed abnormally if the query contains "\r" (not "\r\n")
[ https://issues.apache.org/jira/browse/KYLIN-1913?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] hongbin ma updated KYLIN-1913: -- Fix Version/s: 1.5.3 > query log printed abnormally if the query contains "\r" (not "\r\n") > - > > Key: KYLIN-1913 > URL: https://issues.apache.org/jira/browse/KYLIN-1913 > Project: Kylin > Issue Type: Bug >Affects Versions: v1.5.3 >Reporter: hongbin ma >Assignee: hongbin ma > Fix For: v1.6.0, 1.5.3 > > > when the sql request is: > {"sql":"select sum(lo_revenue) as lo_revenue, d_year, p_brand\rfrom > v_lineorder\rleft join dates on lo_orderdate = d_datekey\rleft join part on > lo_partkey = p_partkey\rleft join supplier on lo_suppkey = s_suppkey\rwhere > p_category = 'MFGR#0206' and s_region = 'AMERICA'\rgroup by d_year, > p_brand\rorder by d_year, > p_brand\r","offset":0,"limit":5,"acceptPartial":true,"project":"ssb"} > the log output will be: > ==[QUERY]=== > order by d_year, p_brand#0206' and s_region = 'AMERICA'and > User: ADMIN > Success: true > Duration: 28.288 > Project: ssb > Realization Names: [ssb] > Cuboid Ids: [1064] > Total scan count: 2086851 > Result row count: 6553 > Accept Partial: true > Is Partial Result: false > Hit Exception Cache: false > Storage cache used: false > Message: null > ==[QUERY]=== -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Resolved] (KYLIN-1874) Make roaring bitmap version determined
[ https://issues.apache.org/jira/browse/KYLIN-1874?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] hongbin ma resolved KYLIN-1874. --- Resolution: Fixed Fix Version/s: v1.6.0 > Make roaring bitmap version determined > --- > > Key: KYLIN-1874 > URL: https://issues.apache.org/jira/browse/KYLIN-1874 > Project: Kylin > Issue Type: Improvement >Reporter: hongbin ma >Assignee: hongbin ma > Fix For: v1.6.0 > > > Currently the roaring bitmap version is undetermined: > (0.5.4,] > Although this makes sure we always get the state-of-the-art roaring, it can > break Kylin version compatibility quietly. > I'll change it to use the latest roaring version, 0.6.18, instead -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Updated] (KYLIN-1913) query log printed abnormally if the query contains "\r" (not "\r\n")
[ https://issues.apache.org/jira/browse/KYLIN-1913?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] hongbin ma updated KYLIN-1913: -- Fix Version/s: (was: v1.5.3) v1.6.0 > query log printed abnormally if the query contains "\r" (not "\r\n") > - > > Key: KYLIN-1913 > URL: https://issues.apache.org/jira/browse/KYLIN-1913 > Project: Kylin > Issue Type: Bug >Affects Versions: v1.5.3 >Reporter: hongbin ma >Assignee: hongbin ma > Fix For: v1.6.0 > > > when the sql request is: > {"sql":"select sum(lo_revenue) as lo_revenue, d_year, p_brand\rfrom > v_lineorder\rleft join dates on lo_orderdate = d_datekey\rleft join part on > lo_partkey = p_partkey\rleft join supplier on lo_suppkey = s_suppkey\rwhere > p_category = 'MFGR#0206' and s_region = 'AMERICA'\rgroup by d_year, > p_brand\rorder by d_year, > p_brand\r","offset":0,"limit":5,"acceptPartial":true,"project":"ssb"} > the log output will be: > ==[QUERY]=== > order by d_year, p_brand#0206' and s_region = 'AMERICA'and > User: ADMIN > Success: true > Duration: 28.288 > Project: ssb > Realization Names: [ssb] > Cuboid Ids: [1064] > Total scan count: 2086851 > Result row count: 6553 > Accept Partial: true > Is Partial Result: false > Hit Exception Cache: false > Storage cache used: false > Message: null > ==[QUERY]=== -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Resolved] (KYLIN-1687) Error to select cuboid
[ https://issues.apache.org/jira/browse/KYLIN-1687?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] hongbin ma resolved KYLIN-1687. --- Resolution: Cannot Reproduce > Error to select cuboid > -- > > Key: KYLIN-1687 > URL: https://issues.apache.org/jira/browse/KYLIN-1687 > Project: Kylin > Issue Type: Bug > Components: Query Engine >Reporter: Shaofeng SHI >Assignee: hongbin ma > > Reported by user lancelot chenfrom mailing list: > {code} > Caused by: java.util.NoSuchElementException > at java.util.ArrayList$Itr.next(ArrayList.java:794) > at java.util.Collections.min(Collections.java:665) > at > org.apache.kylin.cube.cuboid.Cuboid.translateToValidCuboid(Cuboid.java:201) > at > org.apache.kylin.cube.cuboid.Cuboid.translateToValidCuboid(Cuboid.java:125) > at org.apache.kylin.cube.cuboid.Cuboid.findById(Cuboid.java:67) > at > org.apache.kylin.storage.hbase.cube.v2.CubeStorageQuery.identifyCuboid(CubeStorageQuery.java:183) > at > org.apache.kylin.storage.hbase.cube.v2.CubeStorageQuery.search(CubeStorageQuery.java:96) > at > org.apache.kylin.query.enumerator.OLAPEnumerator.queryStorage(OLAPEnumerator.java:125) > at > org.apache.kylin.query.enumerator.OLAPEnumerator.moveNext(OLAPEnumerator.java:71) > at Baz$1$1.moveNext(Unknown Source) > at > org.apache.calcite.linq4j.EnumerableDefaults.groupBy_(EnumerableDefaults.java:819) > at > org.apache.calcite.linq4j.EnumerableDefaults.groupBy(EnumerableDefaults.java:754) > at > org.apache.calcite.linq4j.DefaultEnumerable.groupBy(DefaultEnumerable.java:302) > at Baz.bind(Unknown Source) > at > org.apache.calcite.jdbc.CalcitePrepare$CalciteSignature.enumerable(CalcitePrepare.java:326) > at > org.apache.calcite.jdbc.CalciteConnectionImpl.enumerable(CalciteConnectionImpl.java:281) > at > org.apache.calcite.jdbc.CalciteMetaImpl._createIterable(CalciteMetaImpl.java:545) > at > org.apache.calcite.jdbc.CalciteMetaImpl.createIterable(CalciteMetaImpl.java:536) > at > 
org.apache.calcite.avatica.AvaticaResultSet.execute(AvaticaResultSet.java:187) > at > org.apache.calcite.jdbc.CalciteResultSet.execute(CalciteResultSet.java:65) > at > org.apache.calcite.jdbc.CalciteResultSet.execute(CalciteResultSet.java:44) > at > org.apache.calcite.avatica.AvaticaConnection$1.execute(AvaticaConnection.java:566) > at > org.apache.calcite.jdbc.CalciteMetaImpl.prepareAndExecute(CalciteMetaImpl.java:578) > at > org.apache.calcite.avatica.AvaticaConnection.prepareAndExecuteInternal(AvaticaConnection.java:571) > at > org.apache.calcite.avatica.AvaticaStatement.executeInternal(AvaticaStatement.java:135) > ... 80 more > {code} -- This message was sent by Atlassian JIRA (v6.3.4#6332)
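For readers following the top frames of this trace: java.util.Collections.min throws NoSuchElementException when given an empty collection, so the exception suggests that the list of candidate cuboids was empty at that point. The sketch below only illustrates that behaviour and one way to guard it; smallest and the sample IDs are hypothetical and are not Kylin's Cuboid code.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Collections;
import java.util.List;
import java.util.NoSuchElementException;
import java.util.OptionalLong;

class EmptyMinDemo {
    // Hypothetical guard: report "no candidate" instead of throwing.
    static OptionalLong smallest(List<Long> candidates) {
        if (candidates.isEmpty()) {
            return OptionalLong.empty();
        }
        return OptionalLong.of(Collections.min(candidates));
    }

    // Shows the behaviour seen in the stack trace.
    static boolean minThrowsOnEmpty() {
        try {
            Collections.min(new ArrayList<Long>());
            return false;
        } catch (NoSuchElementException e) {
            return true;
        }
    }

    public static void main(String[] args) {
        System.out.println(minThrowsOnEmpty());
        System.out.println(smallest(Arrays.asList(7L, 3L, 15L)));
        System.out.println(smallest(new ArrayList<Long>()));
    }
}
```

With a guard like this, the caller can raise a descriptive error naming the cuboid ID that could not be translated, which is easier to diagnose than a bare NoSuchElementException.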
[jira] [Resolved] (KYLIN-1913) query log printed abnormally if the query contains "\r" (not "\r\n")
[ https://issues.apache.org/jira/browse/KYLIN-1913?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] hongbin ma resolved KYLIN-1913. --- Resolution: Fixed > query log printed abnormally if the query contains "\r" (not "\r\n") > - > > Key: KYLIN-1913 > URL: https://issues.apache.org/jira/browse/KYLIN-1913 > Project: Kylin > Issue Type: Bug >Affects Versions: v1.5.3 >Reporter: hongbin ma >Assignee: hongbin ma > Fix For: v1.6.0 > > > when the sql request is: > {"sql":"select sum(lo_revenue) as lo_revenue, d_year, p_brand\rfrom > v_lineorder\rleft join dates on lo_orderdate = d_datekey\rleft join part on > lo_partkey = p_partkey\rleft join supplier on lo_suppkey = s_suppkey\rwhere > p_category = 'MFGR#0206' and s_region = 'AMERICA'\rgroup by d_year, > p_brand\rorder by d_year, > p_brand\r","offset":0,"limit":5,"acceptPartial":true,"project":"ssb"} > the log output will be: > ==[QUERY]=== > order by d_year, p_brand#0206' and s_region = 'AMERICA'and > User: ADMIN > Success: true > Duration: 28.288 > Project: ssb > Realization Names: [ssb] > Cuboid Ids: [1064] > Total scan count: 2086851 > Result row count: 6553 > Accept Partial: true > Is Partial Result: false > Hit Exception Cache: false > Storage cache used: false > Message: null > ==[QUERY]=== -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Resolved] (KYLIN-1911) NPE when extended column has NULL value
[ https://issues.apache.org/jira/browse/KYLIN-1911?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] hongbin ma resolved KYLIN-1911. --- Resolution: Fixed Fix Version/s: v1.6.0 > NPE when extended column has NULL value > --- > > Key: KYLIN-1911 > URL: https://issues.apache.org/jira/browse/KYLIN-1911 > Project: Kylin > Issue Type: Bug > Components: Query Engine >Affects Versions: v1.5.2, v1.5.2.1 >Reporter: Shaofeng SHI >Assignee: hongbin ma > Fix For: v1.6.0, v1.5.3 > > > {code} > Caused by: java.lang.NullPointerException > at java.lang.String.<init>(String.java:505) > at > org.apache.kylin.measure.extendedcolumn.ExtendedColumnMeasureType$2.reload(ExtendedColumnMeasureType.java:152) > at > org.apache.kylin.storage.hbase.cube.v2.CubeTupleConverter.translateResult(CubeTupleConverter.java:175) > at > org.apache.kylin.storage.hbase.cube.v2.SequentialCubeTupleIterator.hasNext(SequentialCubeTupleIterator.java:116) > at > org.apache.kylin.query.enumerator.OLAPEnumerator.moveNext(OLAPEnumerator.java:74) > at Baz$1$1.moveNext(Unknown Source) > at > org.apache.calcite.linq4j.EnumerableDefaults.groupBy_(EnumerableDefaults.java:819) > at > org.apache.calcite.linq4j.EnumerableDefaults.groupBy(EnumerableDefaults.java:754) > at > org.apache.calcite.linq4j.DefaultEnumerable.groupBy(DefaultEnumerable.java:302) > at Baz.bind(Unknown Source) > {code} -- This message was sent by Atlassian JIRA (v6.3.4#6332)
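One plausible reading of the trace above is that a null value reached a String constructor, which throws NullPointerException when handed a null byte array. The following is a minimal sketch of a null-tolerant decode under that assumption; decodeOrNull is a hypothetical name, and the actual fix in ExtendedColumnMeasureType may look different.

```java
import java.nio.charset.StandardCharsets;

class NullSafeDecode {
    // Hypothetical helper: pass a NULL column value through as null
    // instead of handing it to the String constructor.
    static String decodeOrNull(byte[] raw) {
        return raw == null ? null : new String(raw, StandardCharsets.UTF_8);
    }

    public static void main(String[] args) {
        System.out.println(decodeOrNull("abc".getBytes(StandardCharsets.UTF_8)));
        System.out.println(decodeOrNull(null));
    }
}
```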
[jira] [Commented] (KYLIN-1890) support hbase table prefix configurable
[ https://issues.apache.org/jira/browse/KYLIN-1890?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15390999#comment-15390999 ] hongbin ma commented on KYLIN-1890: --- I'm okay with it, too. However, the patch has to properly address the concerns above. > support hbase table prefix configurable > --- > > Key: KYLIN-1890 > URL: https://issues.apache.org/jira/browse/KYLIN-1890 > Project: Kylin > Issue Type: Improvement > Components: General >Affects Versions: v1.5.2 >Reporter: fengYu >Assignee: fengYu > Attachments: > 0001-KYLIN-1890-support-hbase-table-prefix-configurable.patch > > > Sometimes we need to deploy two Kylin environments on the same HBase. I want to make the > HBase table name prefix configurable for two reasons: > 1. Different Kylin environments will generate the same table names. > 2. Cleaning invalid HTables for one environment will delete all tables belonging > to the other environment. > Having different Kylin environments use different namespaces would also be acceptable. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Created] (KYLIN-1914) Insight page result grid view will eat whitechars
hongbin ma created KYLIN-1914: - Summary: Insight page result grid view will eat whitechars Key: KYLIN-1914 URL: https://issues.apache.org/jira/browse/KYLIN-1914 Project: Kylin Issue Type: Bug Reporter: hongbin ma Assignee: Zhong,Jason Priority: Minor For the query "select c_city, s_city, d_year, sum(lo_revenue) as lo_revenue from v_lineorder left join dates on lo_orderdate = d_datekey left join customer on lo_custkey = c_custkey left join supplier on lo_suppkey = s_suppkey where (c_city='CHINA065' or c_city='CHINA123') and (s_city='JAPAN198' or s_city='JAPAN123') and d_yearmonth like '%1997' group by c_city, s_city, d_year order by d_year asc, lo_revenue desc; " the c_city value has four whitespace characters in the middle; however, the result grid simply displays "CHINA 065", trimming three of them. As analysts will copy values from the grid view to use as the filter for the next query, this might be an issue. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Created] (KYLIN-1913) query log printed abnormally if the query contains "\r" (not "\r\n")
hongbin ma created KYLIN-1913: - Summary: query log printed abnormally if the query contains "\r" (not "\r\n") Key: KYLIN-1913 URL: https://issues.apache.org/jira/browse/KYLIN-1913 Project: Kylin Issue Type: Bug Affects Versions: v1.5.3 Reporter: hongbin ma Assignee: hongbin ma when the sql request is: {"sql":"select sum(lo_revenue) as lo_revenue, d_year, p_brand\rfrom v_lineorder\rleft join dates on lo_orderdate = d_datekey\rleft join part on lo_partkey = p_partkey\rleft join supplier on lo_suppkey = s_suppkey\rwhere p_category = 'MFGR#0206' and s_region = 'AMERICA'\rgroup by d_year, p_brand\rorder by d_year, p_brand\r","offset":0,"limit":5,"acceptPartial":true,"project":"ssb"} the log output will be: ==[QUERY]=== order by d_year, p_brand#0206' and s_region = 'AMERICA'and User: ADMIN Success: true Duration: 28.288 Project: ssb Realization Names: [ssb] Cuboid Ids: [1064] Total scan count: 2086851 Result row count: 6553 Accept Partial: true Is Partial Result: false Hit Exception Cache: false Storage cache used: false Message: null ==[QUERY]=== -- This message was sent by Atlassian JIRA (v6.3.4#6332)
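The garbled log in this report is consistent with how terminals treat a bare carriage return: the cursor goes back to the start of the line, so each later fragment of the SQL overwrites the earlier one. One way to avoid that, sketched here under the assumption that the query text is normalized before it is written to the log, is to convert both "\r\n" and a lone "\r" to "\n". QueryLogSanitizer and normalizeLineBreaks are hypothetical names and do not describe the fix that was actually committed.

```java
class QueryLogSanitizer {
    // Hypothetical helper: turn "\r\n" and lone "\r" into "\n" so that each
    // clause of the SQL is printed on its own line.
    static String normalizeLineBreaks(String sql) {
        return sql.replaceAll("\\r\\n?", "\n");
    }

    public static void main(String[] args) {
        String sql = "select sum(lo_revenue), d_year\rfrom v_lineorder\rorder by d_year\r";
        System.out.println(normalizeLineBreaks(sql));
    }
}
```

Existing "\n" line breaks are left untouched, so queries that already use Unix line endings are logged exactly as before.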
[jira] [Commented] (KYLIN-1826) kylin support more than one hive based on different hadoop claster
[ https://issues.apache.org/jira/browse/KYLIN-1826?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15387615#comment-15387615 ] hongbin ma commented on KYLIN-1826: --- What's the status now? Is anyone reviewing the patch? > kylin support more than one hive based on different hadoop claster > -- > > Key: KYLIN-1826 > URL: https://issues.apache.org/jira/browse/KYLIN-1826 > Project: Kylin > Issue Type: Improvement > Components: Environment >Affects Versions: v1.5.2 >Reporter: fengYu >Assignee: fengYu > Attachments: > 0001-KYLIN-1826-support-more-hive-based-on-different-hado.patch > > > Currently, Kylin supports only one Hive, which is run via the 'hive' command. > However, when source data is located in more than one Hive, we have to deploy more > Kylin instances and more than one metastore, which is difficult to manage and > may cause conflicts. > I have been working on this recently. In our cluster there are several Hive > clients (with different metastores) based on different Hadoop clusters. I added a > new Hive source type called 'external hive' in Kylin 1.5.x. > Kylin's plug-in architecture in 2.x makes this work easier. > The main modifications are: > 1. Add a Hive root directory to the Hive config file; external Hive clients live in > this directory, and each Hive is named by its directory name. > 2. Add the hive-site.xml file while loading Hive tables. > 3. Store the Hive name in the project; one project can take only one Hive as its source. > 4. Change and add some jobs to support job building. > I will upload my patch once I finish all my tests. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (KYLIN-1857) Show available memory on UI - in System Tab (and other runtime statistics)
[ https://issues.apache.org/jira/browse/KYLIN-1857?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15387605#comment-15387605 ] hongbin ma commented on KYLIN-1857: --- Can this issue be solved within the framework of https://issues.apache.org/jira/browse/KYLIN-1908? > Show available memory on UI - in System Tab (and other runtime statistics) > -- > > Key: KYLIN-1857 > URL: https://issues.apache.org/jira/browse/KYLIN-1857 > Project: Kylin > Issue Type: Improvement >Affects Versions: v1.5.2, v1.5.2.1 >Reporter: Richard Calaba >Priority: Minor > > I have run into a situation where Kylin dies (the exception in the log says heap out of > memory) if I try to build 3 cubes with high-cardinality dimensions in parallel. It > is a reproducible scenario. I have set the max snapshot size to 2GB and -Xmx to > 16GB. > If I run the cube builds one by one, Kylin doesn't die. > As we have no idea about memory requirements before we start building > the cube(s), for now it would be beneficial at least to monitor basic > Kylin VM statistics, i.e.: > -- current memory occupied by snapshots > -- total memory allocation & total free memory > -- how many (and which) temporary (intermediate) objects (in > hive/hbase/filesystem) are created ... -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Resolved] (KYLIN-1858) Remove all InvertedIndex(Streaming purpose) related codes and tests
[ https://issues.apache.org/jira/browse/KYLIN-1858?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] hongbin ma resolved KYLIN-1858. --- Resolution: Fixed Fix Version/s: v1.6.0 > Remove all InvertedIndex(Streaming purpose) related codes and tests > --- > > Key: KYLIN-1858 > URL: https://issues.apache.org/jira/browse/KYLIN-1858 > Project: Kylin > Issue Type: Improvement >Reporter: hongbin ma >Assignee: hongbin ma > Fix For: v1.6.0 > > > we used to try II for streaming purpose. However it's now not supported any > longer. Removing it from code base to reduce maintenance efforts -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (KYLIN-1863) Discard the Jobs while Droping/Purging the Cube
[ https://issues.apache.org/jira/browse/KYLIN-1863?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15387589#comment-15387589 ] hongbin ma commented on KYLIN-1863: --- Will it help if we prohibit dropping when there are running/error/pending jobs? > Discard the Jobs while Droping/Purging the Cube > --- > > Key: KYLIN-1863 > URL: https://issues.apache.org/jira/browse/KYLIN-1863 > Project: Kylin > Issue Type: Bug >Reporter: Richard Calaba > Fix For: all, v1.5.2, v1.5.2.1 > > > I have observed that the following scenario on the UI leaves uncleaned metadata in > Kylin: > 1) I have an error-status job in Monitor for my cube. I drop the cube from the > UI. I still see the error-status job in Monitor after dropping the cube. If > I try to Discard the job, I get an NPE. I didn't test the same with Purge > instead of Drop, but this needs to be checked as well. > 2) Not 100% sure, but I have a feeling that if I drop a cube from the UI without > purging it first, some job execution metadata (finished build jobs) stays in > the system ... (intermediate tables/HDFS folders/...). It is hard to find > proof now that my system is polluted with old job executions. This could be > checked while working on 1) above. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (KYLIN-1856) Kylin shows old error in job step output after resume - specifically in #4 Step Name: Build Dimension Dictionary
[ https://issues.apache.org/jira/browse/KYLIN-1856?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15387582#comment-15387582 ] hongbin ma commented on KYLIN-1856: --- Any patches for the change? > Kylin shows old error in job step output after resume - specifically in #4 > Step Name: Build Dimension Dictionary > > > Key: KYLIN-1856 > URL: https://issues.apache.org/jira/browse/KYLIN-1856 > Project: Kylin > Issue Type: Bug >Affects Versions: v1.5.2, v1.5.2.1 >Reporter: Richard Calaba >Priority: Minor > > I have realized that if my job stops with an error and I fix the > error and resume the job, the latest step starts again from scratch. > This is fine, but in my opinion the log of the step should be cleared as well; currently > it shows the error from my previous attempt. > Specifically observed in #4 Step Name: Build Dimension Dictionary, but it is > probably a generic issue. > To correct this: clear the log of the build step when the job step is > resumed, that is, as soon as the step is restarted, not after it has completed. > (If Kylin fails, e.g. because it runs out of memory, it dies silently, and the > step log then shows the wrong error (from the previous run). If the log were empty, I > would know that the most probable cause was that Kylin died.) -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (KYLIN-1908) Collect Metrics to JMX
[ https://issues.apache.org/jira/browse/KYLIN-1908?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15387428#comment-15387428 ] hongbin ma commented on KYLIN-1908: --- I think it's doable. [~kangkaisen], do you have time to submit a patch to establish the framework? > Collect Metrics to JMX > -- > > Key: KYLIN-1908 > URL: https://issues.apache.org/jira/browse/KYLIN-1908 > Project: Kylin > Issue Type: New Feature > Components: Tools, Build and Test >Affects Versions: v1.5.2 >Reporter: kangkaisen >Assignee: kangkaisen > Attachments: QueryMetrics.java > > > As we all know, performance metrics are important for enterprise > applications, so we should support collecting metrics to JMX in Kylin. > The approach I have taken is as follows: > 1. Use `org.apache.hadoop.metrics2` as the metrics collection framework. > 2. Define an MBean class for the metrics that we need to collect. > 3. Update the metrics in the right places. > The questions I have: > 1. Can I depend on `org.apache.hadoop.metrics2` directly? > 2. What do you think about my approach? -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (KYLIN-1907) Missing items in copyUnChangedProperties during upgrading
[ https://issues.apache.org/jira/browse/KYLIN-1907?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15387319#comment-15387319 ] hongbin ma commented on KYLIN-1907: --- the patch looks good, please go ahead to merge. the setter is private because the only valid entry for setting cube-level properties is through metadata changing (from GUI or REST) > Missing items in copyUnChangedProperties during upgrading > - > > Key: KYLIN-1907 > URL: https://issues.apache.org/jira/browse/KYLIN-1907 > Project: Kylin > Issue Type: Bug > Components: Tools, Build and Test >Reporter: Zhong Yanghong >Assignee: Zhong Yanghong > Attachments: > add_missing_items_in_copyUnChangedProperties_during_upgrading.patch > > > Missing items: statusNeedNotify, partitionDateStart, partitionDateEnd, > autoMergeTimeRanges, retentionRange -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Resolved] (KYLIN-1788) Allow arbitrary number of mandatory dimensions in one aggregation group
[ https://issues.apache.org/jira/browse/KYLIN-1788?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] hongbin ma resolved KYLIN-1788. --- Resolution: Fixed Fix Version/s: v1.6.0 > Allow arbitrary number of mandatory dimensions in one aggregation group > > > Key: KYLIN-1788 > URL: https://issues.apache.org/jira/browse/KYLIN-1788 > Project: Kylin > Issue Type: Bug >Affects Versions: v1.5.2 >Reporter: hongbin ma >Assignee: hongbin ma > Fix For: v1.6.0 > > Attachments: 0001-kylin-1788-enable-arbitrary-mandatory-size.patch > > > To prevent one aggregation group from containing too many combinations we apply a > check > {code:java} > if (mandatoryDims.size() + normalDimSize + hierarchySize + jointSize > > maxSize) { > context.addResult(ResultLevel.ERROR, "Aggregation group " + > index + " has too many dimensions"); > continue; > } > {code} > However, the formula fails to take into account the case where there are many > mandatory dimensions. For example, if we have 50 dimensions in a cube and we > only need the base cuboid, then what we want is a single aggregation group > containing all the dimensions, each of them being mandatory. > Since mandatory dimensions are more "encouraged", I suggest removing > the count of mandatory dimensions from the formula. The revised code will be: > {code:java} > if (normalDimSize + hierarchySize + jointSize > maxSize) { > context.addResult(ResultLevel.ERROR, "Aggregation group " + > index + " has too many dimensions"); > continue; > } > {code} -- This message was sent by Atlassian JIRA (v6.3.4#6332)
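The difference between the two checks quoted above can be seen with the 50-dimension example from the description. The sketch below restates both formulas as standalone methods so they can be run side by side; the class name, the method names, and the limit of 12 are assumptions made for illustration, since the message does not state the value of maxSize.

```java
class AggGroupSizeCheck {
    // Original formula from the issue: mandatory dimensions count toward the limit.
    static boolean tooManyOriginal(int mandatory, int normal, int hierarchy, int joint, int maxSize) {
        return mandatory + normal + hierarchy + joint > maxSize;
    }

    // Revised formula from the issue: mandatory dimensions are no longer counted.
    static boolean tooManyRevised(int normal, int hierarchy, int joint, int maxSize) {
        return normal + hierarchy + joint > maxSize;
    }

    public static void main(String[] args) {
        int maxSize = 12; // assumed limit, for illustration only
        // One aggregation group with 50 mandatory dimensions and nothing else.
        System.out.println(tooManyOriginal(50, 0, 0, 0, maxSize));
        System.out.println(tooManyRevised(0, 0, 0, maxSize));
    }
}
```

Under the original formula the all-mandatory group is rejected even though, per the description, it is only meant to produce the base cuboid; under the revised formula it passes.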
[jira] [Commented] (KYLIN-1788) Allow arbitrary number of mandatory dimensions in one aggregation group
[ https://issues.apache.org/jira/browse/KYLIN-1788?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15387310#comment-15387310 ]

hongbin ma commented on KYLIN-1788:
---

Merged in master.

> Allow arbitrary number of mandatory dimensions in one aggregation group
> ---
>
> Key: KYLIN-1788
> URL: https://issues.apache.org/jira/browse/KYLIN-1788
> Project: Kylin
> Issue Type: Bug
> Affects Versions: v1.5.2
> Reporter: hongbin ma
> Assignee: hongbin ma
> Attachments: 0001-kylin-1788-enable-arbitrary-mandatory-size.patch
>
> To prevent one aggregation group from containing too many combinations, we apply a check:
> {code:java}
> if (mandatoryDims.size() + normalDimSize + hierarchySize + jointSize > maxSize) {
>     context.addResult(ResultLevel.ERROR, "Aggregation group " + index + " has too many dimensions");
>     continue;
> }
> {code}
> However, the formula fails to take into account the case where there are many mandatory dimensions. For example, if we have 50 dimensions in a cube and we only need the base cuboid, then what we want is a single aggregation group containing all the dimensions, each of them mandatory.
> Since mandatory dimensions are more "encouraged", I suggest we stop counting mandatory dimensions in the formula. The revised code will be:
> {code:java}
> if (normalDimSize + hierarchySize + jointSize > maxSize) {
>     context.addResult(ResultLevel.ERROR, "Aggregation group " + index + " has too many dimensions");
>     continue;
> }
> {code}
[jira] [Commented] (KYLIN-1908) Collect Metrics to JMX
[ https://issues.apache.org/jira/browse/KYLIN-1908?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15387283#comment-15387283 ]

hongbin ma commented on KYLIN-1908:
---

It depends on what kind of metrics we're looking for. [~kangkaisen] can you give us a list of examples?

> Collect Metrics to JMX
> ---
>
> Key: KYLIN-1908
> URL: https://issues.apache.org/jira/browse/KYLIN-1908
> Project: Kylin
> Issue Type: New Feature
> Components: Tools, Build and Test
> Affects Versions: v1.5.2
> Reporter: kangkaisen
> Assignee: kangkaisen
>
> As we all know, performance metrics are important for enterprise applications, so we should support collecting metrics to JMX in Kylin.
> My approach is as follows:
> 1. Use `org.apache.hadoop.metrics2` as the metrics collection framework.
> 2. Define an MBean class for each metric that we need to collect.
> 3. Update the metrics in the right places.
> The questions I have:
> 1. Can I depend on `org.apache.hadoop.metrics2` directly?
> 2. What do you think of my approach?
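
Steps 2 and 3 of the approach in KYLIN-1908 (define an MBean, update it where the event happens) can be sketched with the JDK's own `javax.management` API. Note the swap: this sketch does not use `org.apache.hadoop.metrics2`, which the reporter proposes, and it is not the reporter's patch. The class `JmxMetricsSketch`, the `QueryMetrics` bean, the `QueryCount` attribute, and the `org.example.sketch` object name are all hypothetical.

```java
import java.lang.management.ManagementFactory;
import java.util.concurrent.atomic.AtomicLong;
import javax.management.MBeanServer;
import javax.management.ObjectName;

/** Hypothetical sketch using plain JMX standard MBeans; not Kylin code and not hadoop metrics2. */
final class JmxMetricsSketch {

    /** Standard MBean interface: the name must be the implementing class name plus "MBean". */
    public interface QueryMetricsMBean {
        long getQueryCount();
    }

    /** Step 2: the bean holding the metric (a made-up query counter). */
    public static final class QueryMetrics implements QueryMetricsMBean {
        private final AtomicLong queryCount = new AtomicLong();

        /** Step 3: called from wherever the measured event occurs. */
        public void recordQuery() {
            queryCount.incrementAndGet();
        }

        @Override
        public long getQueryCount() {
            return queryCount.get();
        }
    }

    /** Registers a fresh bean on the platform MBean server under the given object name. */
    static QueryMetrics register(String objectName) {
        try {
            MBeanServer server = ManagementFactory.getPlatformMBeanServer();
            QueryMetrics metrics = new QueryMetrics();
            server.registerMBean(metrics, new ObjectName(objectName));
            return metrics;
        } catch (Exception e) {
            throw new IllegalStateException("could not register " + objectName, e);
        }
    }

    /** Reads the QueryCount attribute back through JMX, as a monitoring client would. */
    static long readQueryCount(String objectName) {
        try {
            MBeanServer server = ManagementFactory.getPlatformMBeanServer();
            return (Long) server.getAttribute(new ObjectName(objectName), "QueryCount");
        } catch (Exception e) {
            throw new IllegalStateException("could not read " + objectName, e);
        }
    }

    public static void main(String[] args) {
        String name = "org.example.sketch:type=QueryMetrics"; // hypothetical object name
        QueryMetrics metrics = register(name);
        metrics.recordQuery();
        metrics.recordQuery();
        System.out.println(readQueryCount(name));
    }
}
```

Which metrics are worth exposing this way is exactly the open question in the comment above; the counter here is only a placeholder.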
[jira] [Commented] (KYLIN-1876) Star Schema also supports more than one fact table
[ https://issues.apache.org/jira/browse/KYLIN-1876?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15385624#comment-15385624 ]

hongbin ma commented on KYLIN-1876:
---

Hi, how many fact tables are supposed to be in a star schema? Can you provide more reading material?

> Star Schema also supports more than one fact table
> ---
>
> Key: KYLIN-1876
> URL: https://issues.apache.org/jira/browse/KYLIN-1876
> Project: Kylin
> Issue Type: Bug
> Reporter: Rahul Choubey
>
> According to the Apache Kylin documentation, Kylin supports star schemas, but in Kylin we can select only one fact table. Is there any plan to support more than one fact table?