[jira] [Updated] (KYLIN-5404) Prefer to use common standard dependencies rather than self-maintained ones
[ https://issues.apache.org/jira/browse/KYLIN-5404?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Zhong Yanghong updated KYLIN-5404: -- Description: Better to use common standard dependencies rather than self-maintained ones: - Spark - Calcite - Google guava > Prefer to use common standard dependencies rather than self-maintained ones > --- > > Key: KYLIN-5404 > URL: https://issues.apache.org/jira/browse/KYLIN-5404 > Project: Kylin > Issue Type: Task >Reporter: Zhong Yanghong >Priority: Major > > Better to use common standard dependencies rather than self-maintained ones: > - Spark > - Calcite > - Google guava -- This message was sent by Atlassian Jira (v8.20.10#820010)
[jira] [Updated] (KYLIN-5404) Prefer to use common standard dependencies rather than self-maintained ones
[ https://issues.apache.org/jira/browse/KYLIN-5404?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Zhong Yanghong updated KYLIN-5404: -- Description: Better to use common standard dependencies rather than self-maintained ones: - Spark - Calcite - Google guava - Spring session was: Better to use common standard dependencies rather than self-maintained ones: - Spark - Calcite - Google guava > Prefer to use common standard dependencies rather than self-maintained ones > --- > > Key: KYLIN-5404 > URL: https://issues.apache.org/jira/browse/KYLIN-5404 > Project: Kylin > Issue Type: Task >Reporter: Zhong Yanghong >Priority: Major > > Better to use common standard dependencies rather than self-maintained ones: > - Spark > - Calcite > - Google guava > - Spring session -- This message was sent by Atlassian Jira (v8.20.10#820010)
[jira] [Created] (KYLIN-5404) Prefer to use common standard dependencies rather than self-maintained ones
Zhong Yanghong created KYLIN-5404: - Summary: Prefer to use common standard dependencies rather than self-maintained ones Key: KYLIN-5404 URL: https://issues.apache.org/jira/browse/KYLIN-5404 Project: Kylin Issue Type: Task Reporter: Zhong Yanghong -- This message was sent by Atlassian Jira (v8.20.10#820010)
[jira] [Updated] (KYLIN-5321) Don't make the unprecomputed inner join defined in the model fail the match of the query without that join
[ https://issues.apache.org/jira/browse/KYLIN-5321?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Zhong Yanghong updated KYLIN-5321: -- Labels: 5.0.0-alpha (was: ) > Don't make the unprecomputed inner join defined in the model fail the match > of the query without that join > --- > > Key: KYLIN-5321 > URL: https://issues.apache.org/jira/browse/KYLIN-5321 > Project: Kylin > Issue Type: Improvement > Components: Query Engine >Reporter: Zhong Yanghong >Priority: Major > Labels: 5.0.0-alpha > > Given a model, *A inner join B && B inner join C*, in which *B inner join C* > is unprecomputed. > This model should be able to match the queries *A inner join B* no matter if > *inner-partial-match-join* is true or false. When *inner-partial-match-join* > is false, then this model should not match *from A* only. -- This message was sent by Atlassian Jira (v8.20.10#820010)
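One possible reading of the rule described in KYLIN-5321 is sketched below. This is only an illustration under stated assumptions, not Kylin's query engine code: the class `JoinMatchSketch`, the method `matches`, and the encoding of a join as a string such as "A-B" are hypothetical. The sketch assumes that every join in the query must be defined in the model and that, when *inner-partial-match-join* is false, the query must contain every precomputed inner join, while unprecomputed joins may be absent.

```java
import java.util.LinkedHashMap;
import java.util.Map;
import java.util.Set;

// Hypothetical illustration of the matching rule; not Kylin code.
final class JoinMatchSketch {

    /**
     * @param modelJoins            join name (e.g. "A-B") mapped to whether it is precomputed
     * @param queryJoins            joins that appear in the query
     * @param innerPartialMatchJoin value of the inner-partial-match-join switch
     */
    static boolean matches(Map<String, Boolean> modelJoins, Set<String> queryJoins,
            boolean innerPartialMatchJoin) {
        // Assumption: a query join that the model does not define can never match.
        if (!modelJoins.keySet().containsAll(queryJoins)) {
            return false;
        }
        // With partial matching enabled, any subset of the model joins is accepted.
        if (innerPartialMatchJoin) {
            return true;
        }
        // Strict mode: only precomputed joins are mandatory in the query.
        for (Map.Entry<String, Boolean> join : modelJoins.entrySet()) {
            boolean precomputed = join.getValue();
            if (precomputed && !queryJoins.contains(join.getKey())) {
                return false;
            }
        }
        return true;
    }

    public static void main(String[] args) {
        Map<String, Boolean> model = new LinkedHashMap<>();
        model.put("A-B", true);  // precomputed
        model.put("B-C", false); // unprecomputed

        System.out.println(matches(model, Set.of("A-B"), false)); // query: A inner join B
        System.out.println(matches(model, Set.of("A-B"), true));
        System.out.println(matches(model, Set.of(), false));      // query: from A only
    }
}
```

With the model from the issue, the query *A inner join B* is accepted for both settings of the switch, and *from A* alone is rejected when the switch is false. How an unprecomputed join is evaluated at query time is outside the scope of this sketch.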
[jira] [Updated] (KYLIN-5321) Don't make the unprecomputed inner join defined in the model fail the match of the query without that join
[ https://issues.apache.org/jira/browse/KYLIN-5321?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Zhong Yanghong updated KYLIN-5321: -- Description: Given a model, *A inner join B && B inner join C*, in which *B inner join C* is unprecomputed. This model should be able to match the queries *A inner join B* no matter if *inner-partial-match-join* is true or false. When *inner-partial-match-join* is false, then this model should not match *from A* only. was: Given a model, *A inner join B && B inner join C*, in which *B inner join C* is unprecomputed. This model should be able to match the queries *A inner join B* no matter if *inner-partial-match-join* is true or false. > Don't make the unprecomputed inner join defined in the model fail the match > of the query without that join > --- > > Key: KYLIN-5321 > URL: https://issues.apache.org/jira/browse/KYLIN-5321 > Project: Kylin > Issue Type: Improvement > Components: Query Engine >Reporter: Zhong Yanghong >Priority: Major > > Given a model, *A inner join B && B inner join C*, in which *B inner join C* > is unprecomputed. > This model should be able to match the queries *A inner join B* no matter if > *inner-partial-match-join* is true or false. When *inner-partial-match-join* > is false, then this model should not match *from A* only. -- This message was sent by Atlassian Jira (v8.20.10#820010)
[jira] [Updated] (KYLIN-5321) Don't make the unprecomputed inner join defined in the model fail the match of the query without that join
[ https://issues.apache.org/jira/browse/KYLIN-5321?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Zhong Yanghong updated KYLIN-5321: -- Description: Given a model, *A inner join B && B inner join C*, in which *B inner join C* is unprecomputed. This model should be able to match the queries *A inner join B* no matter if *inner-partial-match-join* is true or false. was: Given a model, *A inner join B && B inner join C*, in which *B inner join C* is unprecomputed. This model should be able to match the queries *A inner join B*. > Don't make the unprecomputed inner join defined in the model fail the match > of the query without that join > --- > > Key: KYLIN-5321 > URL: https://issues.apache.org/jira/browse/KYLIN-5321 > Project: Kylin > Issue Type: Improvement > Components: Query Engine >Reporter: Zhong Yanghong >Priority: Major > > Given a model, *A inner join B && B inner join C*, in which *B inner join C* > is unprecomputed. > This model should be able to match the queries *A inner join B* no matter if > *inner-partial-match-join* is true or false. -- This message was sent by Atlassian Jira (v8.20.10#820010)
[jira] [Updated] (KYLIN-5321) Don't make the unprecomputed inner join defined in the model fail the match of the query without that join
[ https://issues.apache.org/jira/browse/KYLIN-5321?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Zhong Yanghong updated KYLIN-5321: -- Description: Given a model, *A inner join B && B inner join C*, in which *B inner join C* is unprecomputed. This model should be able to match the queries *A inner join B*. > Don't make the unprecomputed inner join defined in the model fail the match > of the query without that join > --- > > Key: KYLIN-5321 > URL: https://issues.apache.org/jira/browse/KYLIN-5321 > Project: Kylin > Issue Type: Improvement > Components: Query Engine >Reporter: Zhong Yanghong >Priority: Major > > Given a model, *A inner join B && B inner join C*, in which *B inner join C* > is unprecomputed. > This model should be able to match the queries *A inner join B*. -- This message was sent by Atlassian Jira (v8.20.10#820010)
[jira] [Created] (KYLIN-5321) Don't make the unprecomputed inner join defined in the model fail the match of the query without that join
Zhong Yanghong created KYLIN-5321: - Summary: Don't make the unprecomputed inner join defined in the model fail the match of the query without that join Key: KYLIN-5321 URL: https://issues.apache.org/jira/browse/KYLIN-5321 Project: Kylin Issue Type: Improvement Components: Query Engine Reporter: Zhong Yanghong -- This message was sent by Atlassian Jira (v8.20.10#820010)
[jira] [Updated] (KYLIN-5307) [kylin5] Distinguish sum(1) and count(*)
[ https://issues.apache.org/jira/browse/KYLIN-5307?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Zhong Yanghong updated KYLIN-5307:
--
    Labels: 5.0.0-alpha (was: )

> [kylin5] Distinguish sum(1) and count(*)
>
> Key: KYLIN-5307
> URL: https://issues.apache.org/jira/browse/KYLIN-5307
> Project: Kylin
> Issue Type: Improvement
> Reporter: Zhong Yanghong
> Priority: Major
> Labels: 5.0.0-alpha
>
> Assume column LO_ORDERDATE contains no null values.
> The following SQL should return null:
> {code}
> select sum(1)
> from LINEORDER
> INNER JOIN CUSTOMER
> ON LINEORDER.LO_CUSTKEY=CUSTOMER.C_CUSTKEY
> INNER JOIN SUPPLIER
> ON LINEORDER.LO_SUPPKEY=SUPPLIER.S_SUPPKEY
> INNER JOIN PART
> ON LINEORDER.LO_PARTKEY=PART.P_PARTKEY
> INNER JOIN DATES
> ON LINEORDER.LO_ORDERDATE=DATES.D_DATEKEY
> where LO_ORDERDATE is null
> {code}
> while the following SQL should return 0:
> {code}
> select count(*)
> from LINEORDER
> INNER JOIN CUSTOMER
> ON LINEORDER.LO_CUSTKEY=CUSTOMER.C_CUSTKEY
> INNER JOIN SUPPLIER
> ON LINEORDER.LO_SUPPKEY=SUPPLIER.S_SUPPKEY
> INNER JOIN PART
> ON LINEORDER.LO_PARTKEY=PART.P_PARTKEY
> INNER JOIN DATES
> ON LINEORDER.LO_ORDERDATE=DATES.D_DATEKEY
> where LO_ORDERDATE is null
> {code}

--
This message was sent by Atlassian Jira
(v8.20.10#820010)
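The difference KYLIN-5307 relies on is standard SQL aggregate behaviour: SUM over zero input rows yields NULL, whereas COUNT(*) yields 0. The sketch below models only that distinction in plain Java; the class and method names are hypothetical and it does not execute SQL or call any Kylin API.

```java
// Hypothetical model of SQL aggregate results over a filtered row set; not Kylin code.
final class SumVsCountSketch {

    /** SQL-style sum(1): NULL (modelled as Java null) when no row matches the filter. */
    static Long sumOfOnes(long matchedRows) {
        return matchedRows == 0 ? null : Long.valueOf(matchedRows);
    }

    /** SQL-style count(*): always a number, 0 when no row matches the filter. */
    static long countStar(long matchedRows) {
        return matchedRows;
    }

    public static void main(String[] args) {
        // "where LO_ORDERDATE is null" matches no rows when the column has no nulls.
        long matchedRows = 0;
        System.out.println("sum(1)   = " + sumOfOnes(matchedRows));
        System.out.println("count(*) = " + countStar(matchedRows));
    }
}
```

The two aggregates agree whenever at least one row matches, which is why rewriting `sum(1)` as `count(*)` is only wrong in the empty-input case that the issue describes.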
[jira] [Updated] (KYLIN-5306) [kylin5] Allow more inner join keys in sql than the model
[ https://issues.apache.org/jira/browse/KYLIN-5306?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Zhong Yanghong updated KYLIN-5306:
--
    Labels: 5.0.0-alpha (was: )

> [kylin5] Allow more inner join keys in sql than the model
> -
>
> Key: KYLIN-5306
> URL: https://issues.apache.org/jira/browse/KYLIN-5306
> Project: Kylin
> Issue Type: Improvement
> Reporter: Zhong Yanghong
> Priority: Major
> Labels: 5.0.0-alpha
>
> The join in the model is defined as follows:
> {code}
> from LINEORDER
> INNER JOIN CUSTOMER
> ON LINEORDER.LO_CUSTKEY=CUSTOMER.C_CUSTKEY
> INNER JOIN SUPPLIER
> ON LINEORDER.LO_SUPPKEY=SUPPLIER.S_SUPPKEY
> INNER JOIN PART
> ON LINEORDER.LO_PARTKEY=PART.P_PARTKEY
> INNER JOIN DATES
> ON LINEORDER.LO_ORDERDATE=DATES.D_DATEKEY
> {code}
> The join in the SQL is as follows:
> {code}
> from LINEORDER
> INNER JOIN CUSTOMER
> ON LINEORDER.LO_CUSTKEY=CUSTOMER.C_CUSTKEY and LO_SHIPMODE = C_NATION
> INNER JOIN SUPPLIER
> ON LINEORDER.LO_SUPPKEY=SUPPLIER.S_SUPPKEY
> INNER JOIN PART
> ON LINEORDER.LO_PARTKEY=PART.P_PARTKEY
> INNER JOIN DATES
> ON LINEORDER.LO_ORDERDATE=DATES.D_DATEKEY
> {code}
> Ideally, the SQL can be transformed into
> {code}
> from LINEORDER
> INNER JOIN CUSTOMER
> ON LINEORDER.LO_CUSTKEY=CUSTOMER.C_CUSTKEY
> INNER JOIN SUPPLIER
> ON LINEORDER.LO_SUPPKEY=SUPPLIER.S_SUPPKEY
> INNER JOIN PART
> ON LINEORDER.LO_PARTKEY=PART.P_PARTKEY
> INNER JOIN DATES
> ON LINEORDER.LO_ORDERDATE=DATES.D_DATEKEY
> where LO_SHIPMODE = C_NATION
> {code}
> so that the model will be able to match the SQL.

--
This message was sent by Atlassian Jira
(v8.20.10#820010)
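The rewrite proposed in KYLIN-5306 works because, for an inner join, a predicate in the ON clause is equivalent to the same predicate in the WHERE clause. A minimal sketch of the split follows. It treats conditions as plain strings, and the names `JoinKeySplitSketch`, `Split` and `split` are hypothetical; Kylin's actual implementation would operate on its relational plan rather than on text.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Set;

// Hypothetical illustration of moving extra inner-join conditions to WHERE; not Kylin code.
final class JoinKeySplitSketch {

    /** Result of the split: conditions kept in ON and conditions moved to WHERE. */
    static final class Split {
        final List<String> onConditions = new ArrayList<>();
        final List<String> whereConditions = new ArrayList<>();
    }

    /**
     * Keeps the conditions that the model defines as join keys in the ON clause
     * and moves every other condition to the WHERE clause.
     * This is only valid for INNER joins.
     */
    static Split split(Set<String> modelJoinKeys, List<String> queryOnConditions) {
        Split result = new Split();
        for (String condition : queryOnConditions) {
            if (modelJoinKeys.contains(condition)) {
                result.onConditions.add(condition);
            } else {
                result.whereConditions.add(condition);
            }
        }
        return result;
    }

    public static void main(String[] args) {
        Set<String> modelKeys = Set.of("LINEORDER.LO_CUSTKEY=CUSTOMER.C_CUSTKEY");
        List<String> queryOn = List.of(
                "LINEORDER.LO_CUSTKEY=CUSTOMER.C_CUSTKEY",
                "LO_SHIPMODE = C_NATION");

        Split split = split(modelKeys, queryOn);
        System.out.println("ON    " + String.join(" and ", split.onConditions));
        System.out.println("where " + String.join(" and ", split.whereConditions));
    }
}
```

After the split, the ON clause contains exactly the model's join key, so the join graph of the query equals the join graph of the model and the remaining condition is an ordinary filter.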
[jira] [Updated] (KYLIN-5307) [kylin5] Distinguish sum(1) and count(*)
[ https://issues.apache.org/jira/browse/KYLIN-5307?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Zhong Yanghong updated KYLIN-5307: -- Description: In case there's no null values for column LO_ORDERDATE. The following SQL should return null {code} select sum(1) from LINEORDER INNER JOIN CUSTOMER ON LINEORDER.LO_CUSTKEY=CUSTOMER.C_CUSTKEY INNER JOIN SUPPLIER ON LINEORDER.LO_SUPPKEY=SUPPLIER.S_SUPPKEY INNER JOIN PART ON LINEORDER.LO_PARTKEY=PART.P_PARTKEY INNER JOIN DATES ON LINEORDER.LO_ORDERDATE=DATES.D_DATEKEY where LO_ORDERDATE is null {code} while the following SQL should return 0 {code} select count(*) from LINEORDER INNER JOIN CUSTOMER ON LINEORDER.LO_CUSTKEY=CUSTOMER.C_CUSTKEY INNER JOIN SUPPLIER ON LINEORDER.LO_SUPPKEY=SUPPLIER.S_SUPPKEY INNER JOIN PART ON LINEORDER.LO_PARTKEY=PART.P_PARTKEY INNER JOIN DATES ON LINEORDER.LO_ORDERDATE=DATES.D_DATEKEY where LO_ORDERDATE is null {code} > [kylin5] Distinguish sum(1) and count(*) > > > Key: KYLIN-5307 > URL: https://issues.apache.org/jira/browse/KYLIN-5307 > Project: Kylin > Issue Type: Improvement >Reporter: Zhong Yanghong >Priority: Major > > In case there's no null values for column LO_ORDERDATE. 
> The following SQL should return null > {code} > select sum(1) > from LINEORDER > INNER JOIN CUSTOMER > ON LINEORDER.LO_CUSTKEY=CUSTOMER.C_CUSTKEY > INNER JOIN SUPPLIER > ON LINEORDER.LO_SUPPKEY=SUPPLIER.S_SUPPKEY > INNER JOIN PART > ON LINEORDER.LO_PARTKEY=PART.P_PARTKEY > INNER JOIN DATES > ON LINEORDER.LO_ORDERDATE=DATES.D_DATEKEY > where LO_ORDERDATE is null > {code} > while the following SQL should return 0 > {code} > select count(*) > from LINEORDER > INNER JOIN CUSTOMER > ON LINEORDER.LO_CUSTKEY=CUSTOMER.C_CUSTKEY > INNER JOIN SUPPLIER > ON LINEORDER.LO_SUPPKEY=SUPPLIER.S_SUPPKEY > INNER JOIN PART > ON LINEORDER.LO_PARTKEY=PART.P_PARTKEY > INNER JOIN DATES > ON LINEORDER.LO_ORDERDATE=DATES.D_DATEKEY > where LO_ORDERDATE is null > {code} -- This message was sent by Atlassian Jira (v8.20.10#820010)
[jira] [Created] (KYLIN-5307) [kylin5] Distinguish sum(1) and count(*)
Zhong Yanghong created KYLIN-5307: - Summary: [kylin5] Distinguish sum(1) and count(*) Key: KYLIN-5307 URL: https://issues.apache.org/jira/browse/KYLIN-5307 Project: Kylin Issue Type: Improvement Reporter: Zhong Yanghong -- This message was sent by Atlassian Jira (v8.20.10#820010)
[jira] [Updated] (KYLIN-5306) [kylin5] Allow more inner join keys in sql than the model
[ https://issues.apache.org/jira/browse/KYLIN-5306?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Zhong Yanghong updated KYLIN-5306: -- Description: The join in model defined as follows: {code} from LINEORDER INNER JOIN CUSTOMER ON LINEORDER.LO_CUSTKEY=CUSTOMER.C_CUSTKEY INNER JOIN SUPPLIER ON LINEORDER.LO_SUPPKEY=SUPPLIER.S_SUPPKEY INNER JOIN PART ON LINEORDER.LO_PARTKEY=PART.P_PARTKEY INNER JOIN DATES ON LINEORDER.LO_ORDERDATE=DATES.D_DATEKEY {code} The join in SQL is as follows: {code} from LINEORDER INNER JOIN CUSTOMER ON LINEORDER.LO_CUSTKEY=CUSTOMER.C_CUSTKEY and LO_SHIPMODE = C_NATION INNER JOIN SUPPLIER ON LINEORDER.LO_SUPPKEY=SUPPLIER.S_SUPPKEY INNER JOIN PART ON LINEORDER.LO_PARTKEY=PART.P_PARTKEY INNER JOIN DATES ON LINEORDER.LO_ORDERDATE=DATES.D_DATEKEY {code} Ideally, the SQL can be transferred as {code} from LINEORDER INNER JOIN CUSTOMER ON LINEORDER.LO_CUSTKEY=CUSTOMER.C_CUSTKEY INNER JOIN SUPPLIER ON LINEORDER.LO_SUPPKEY=SUPPLIER.S_SUPPKEY INNER JOIN PART ON LINEORDER.LO_PARTKEY=PART.P_PARTKEY INNER JOIN DATES ON LINEORDER.LO_ORDERDATE=DATES.D_DATEKEY where LO_SHIPMODE = C_NATION {code} so that the model will be able to match the SQL. 
> [kylin5] Allow more inner join keys in sql than the model > - > > Key: KYLIN-5306 > URL: https://issues.apache.org/jira/browse/KYLIN-5306 > Project: Kylin > Issue Type: Improvement >Reporter: Zhong Yanghong >Priority: Major > > The join in model defined as follows: > {code} > from LINEORDER > INNER JOIN CUSTOMER > ON LINEORDER.LO_CUSTKEY=CUSTOMER.C_CUSTKEY > INNER JOIN SUPPLIER > ON LINEORDER.LO_SUPPKEY=SUPPLIER.S_SUPPKEY > INNER JOIN PART > ON LINEORDER.LO_PARTKEY=PART.P_PARTKEY > INNER JOIN DATES > ON LINEORDER.LO_ORDERDATE=DATES.D_DATEKEY > {code} > The join in SQL is as follows: > {code} > from LINEORDER > INNER JOIN CUSTOMER > ON LINEORDER.LO_CUSTKEY=CUSTOMER.C_CUSTKEY and LO_SHIPMODE = C_NATION > INNER JOIN SUPPLIER > ON LINEORDER.LO_SUPPKEY=SUPPLIER.S_SUPPKEY > INNER JOIN PART > ON LINEORDER.LO_PARTKEY=PART.P_PARTKEY > INNER JOIN DATES > ON LINEORDER.LO_ORDERDATE=DATES.D_DATEKEY > {code} > Ideally, the SQL can be transferred as > {code} > from LINEORDER > INNER JOIN CUSTOMER > ON LINEORDER.LO_CUSTKEY=CUSTOMER.C_CUSTKEY > INNER JOIN SUPPLIER > ON LINEORDER.LO_SUPPKEY=SUPPLIER.S_SUPPKEY > INNER JOIN PART > ON LINEORDER.LO_PARTKEY=PART.P_PARTKEY > INNER JOIN DATES > ON LINEORDER.LO_ORDERDATE=DATES.D_DATEKEY > where LO_SHIPMODE = C_NATION > {code} > so that the model will be able to match the SQL. -- This message was sent by Atlassian Jira (v8.20.10#820010)
[jira] [Created] (KYLIN-5306) [kylin5] Allow more inner join keys in sql than the model
Zhong Yanghong created KYLIN-5306: - Summary: [kylin5] Allow more inner join keys in sql than the model Key: KYLIN-5306 URL: https://issues.apache.org/jira/browse/KYLIN-5306 Project: Kylin Issue Type: Improvement Reporter: Zhong Yanghong -- This message was sent by Atlassian Jira (v8.20.10#820010)
[jira] [Commented] (KYLIN-5274) Improve performance of getSubstitutor
[ https://issues.apache.org/jira/browse/KYLIN-5274?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17614454#comment-17614454 ]

Zhong Yanghong commented on KYLIN-5274:
---
Thanks [~xxyu] for providing the micro-benchmark code (y)

> Improve performance of getSubstitutor
> -
>
> Key: KYLIN-5274
> URL: https://issues.apache.org/jira/browse/KYLIN-5274
> Project: Kylin
> Issue Type: Improvement
> Reporter: Xiaoxiang Yu
> Assignee: Zhong Yanghong
> Priority: Major
> Fix For: 5.0-alpha
> Attachments: image-2022-10-08-17-35-34-500.png, image-2022-10-08-17-50-07-929.png
>
> h3. Background
> The following code is executed on every call of *_KylinConfig#getOptional_* and needs to be optimized. In some cases, optimizing it will improve query performance.
>
> {code:java}
> protected StrSubstitutor getSubstitutor() {
>     // env > properties (environment values are put last, so they take precedence)
>     final Map all = Maps.newHashMap(); // create a new map every time
>     all.putAll((Map) properties);
>     all.putAll(STATIC_SYSTEM_ENV);
>     return new StrSubstitutor(all);
> }
> {code}
>
> h3. How to fix
> 1. Do not create a new map on each call.
> 2. Do not use Properties, because it extends _Hashtable_, whose methods are synchronized.
>
> h3. Micro Benchmark
> Use JMH to measure performance (average time):
>
> {code:java}
> import org.apache.kylin.common.KylinConfig;
> import org.apache.kylin.common.util.NLocalFileMetadataTestCase;
> import org.openjdk.jmh.annotations.Benchmark;
> import org.openjdk.jmh.annotations.BenchmarkMode;
> import org.openjdk.jmh.annotations.Fork;
> import org.openjdk.jmh.annotations.Measurement;
> import org.openjdk.jmh.annotations.Mode;
> import org.openjdk.jmh.annotations.OutputTimeUnit;
> import org.openjdk.jmh.annotations.Scope;
> import org.openjdk.jmh.annotations.Setup;
> import org.openjdk.jmh.annotations.State;
> import org.openjdk.jmh.annotations.Threads;
> import org.openjdk.jmh.annotations.Warmup;
> import java.util.concurrent.TimeUnit;
>
> @BenchmarkMode(Mode.AverageTime)
> @OutputTimeUnit(TimeUnit.MILLISECONDS)
> @Warmup(iterations = 1)
> @Measurement(iterations = 10, time = 10, timeUnit = TimeUnit.MILLISECONDS)
> @Threads(1)
> @Fork(value = 1, jvmArgs = {"-Xms2G", "-Xmx2G"})
> @State(Scope.Benchmark)
> public class KylinConfigBenchmark {
>
>     // Prepare local test metadata once before the benchmark runs.
>     @Setup
>     public void setUp() throws Exception {
>         NLocalFileMetadataTestCase case1 = new NLocalFileMetadataTestCase();
>         case1.createTestMetadata();
>     }
>
>     // Each invocation reads one config value repeatedly, which goes through getOptional.
>     @Benchmark
>     public void getProperty() {
>         KylinConfig config = KylinConfig.getInstanceFromEnv();
>         for (int i = 0; i <= 1000_000; i++) {
>             config.getJdbcDriverClass();
>         }
>     }
>
>     public static void main(String[] args) throws Exception {
>         org.openjdk.jmh.Main.main(args);
>     }
> }
> {code}
>
> h4. Before Applied
> !image-2022-10-08-17-35-34-500.png!
>
> h4. After Applied
> !image-2022-10-08-17-50-07-929.png!

--
This message was sent by Atlassian Jira
(v8.20.10#820010)
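The two fixes listed in KYLIN-5274 (no new map per call, no Hashtable-based Properties) could look roughly like the sketch below. It is a minimal illustration under stated assumptions and not the patch that was merged: `CachedSubstitutionSketch`, `setProperty` and `substitutionMap` are hypothetical names, the environment map is passed in through the constructor, and the merged map would still have to be handed to whatever substitutor the real code uses.

```java
import java.util.Collections;
import java.util.HashMap;
import java.util.Map;

// Hypothetical illustration of caching the merged "properties + env" map; not Kylin code.
final class CachedSubstitutionSketch {

    private final Map<String, String> staticEnv;
    private final Map<String, String> properties = new HashMap<>(); // guarded by "this"
    private volatile Map<String, String> merged;                    // null means "rebuild needed"

    CachedSubstitutionSketch(Map<String, String> staticEnv) {
        this.staticEnv = new HashMap<>(staticEnv);
    }

    /** Changing a property invalidates the cached merged map. */
    synchronized void setProperty(String key, String value) {
        properties.put(key, value);
        merged = null;
    }

    /** Returns the merged map, rebuilding it only after a property has changed. */
    Map<String, String> substitutionMap() {
        Map<String, String> local = merged;
        if (local != null) {
            return local; // fast path: no allocation, no lock
        }
        synchronized (this) {
            if (merged == null) {
                Map<String, String> all = new HashMap<>(properties);
                all.putAll(staticEnv); // env > properties, as in the original method
                merged = Collections.unmodifiableMap(all);
            }
            return merged;
        }
    }

    public static void main(String[] args) {
        CachedSubstitutionSketch config = new CachedSubstitutionSketch(Map.of("KYLIN_HOME", "/opt/kylin"));
        config.setProperty("kylin.example.key", "value-1");
        // The same instance is returned until a property changes.
        System.out.println(config.substitutionMap() == config.substitutionMap());
    }
}
```

Reads that hit the cache touch only one volatile field, so repeated lookups no longer pay for a map copy. The trade-off is that every write invalidates the cache, which suits configuration that is read far more often than it is changed.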
[jira] [Commented] (KYLIN-3430) Global Dictionary Cleanup
[ https://issues.apache.org/jira/browse/KYLIN-3430?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17533639#comment-17533639 ]

Zhong Yanghong commented on KYLIN-3430:
---
The following code:
{code}
Set activeResources = Sets.newHashSet();
for (CubeInstance cube : cubeManager.reloadAndListAllCubes()) {
    activeResources.addAll(cube.getSnapshots().values());
    for (CubeSegment segment : cube.getSegments()) {
        activeResources.addAll(segment.getSnapshotPaths());
        activeResources.addAll(segment.getDictionaryPaths());
        activeResources.add(segment.getStatisticsResourcePath());
        for (String dictPath : segment.getDictionaryPaths()) {
            DictionaryInfo dictInfo = store.getResource(dictPath, DictionaryInfoSerializer.FULL_SERIALIZER);
            if ("org.apache.kylin.dict.AppendTrieDictionary"
                    .equals(dictInfo != null ? dictInfo.getDictionaryClass() : null)) {
{code}
makes the cleanup very slow, because every dictionary has to be loaded.

> Global Dictionary Cleanup
> -
>
> Key: KYLIN-3430
> URL: https://issues.apache.org/jira/browse/KYLIN-3430
> Project: Kylin
> Issue Type: Improvement
> Components: Tools, Build and Test
> Affects Versions: v2.1.0, v2.2.0, v2.3.0, v2.3.1, v2.4.0
> Reporter: Temple Zhou
> Assignee: Temple Zhou
> Priority: Major
> Fix For: v2.6.0
> Attachments: KYLIN-3430.master.001.patch
>
> I ran {{./bin/metastore.sh clean --delete true}} to clean up my Kylin metadata, but afterwards the Global Dictionary still exists in my HDFS and the size of the directory "/kylin_metadata/resources/GlobalDict/dict" hasn't shrunk.
>
> BTW: I'm very sure that there are redundant Global Dictionaries.

--
This message was sent by Atlassian Jira
(v8.20.7#820007)
[jira] [Created] (KYLIN-5119) kylin-native branch for next generation development
Zhong Yanghong created KYLIN-5119: - Summary: kylin-native branch for next generation development Key: KYLIN-5119 URL: https://issues.apache.org/jira/browse/KYLIN-5119 Project: Kylin Issue Type: Task Reporter: Zhong Yanghong -- This message was sent by Atlassian Jira (v8.20.1#820001)
[jira] [Comment Edited] (KYLIN-4985) optimize kylin planner by delete unnecessary cuboids
[ https://issues.apache.org/jira/browse/KYLIN-4985?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17370190#comment-17370190 ]

Zhong Yanghong edited comment on KYLIN-4985 at 6/27/21, 10:37 AM:
--
Hi [~tianhui5], the cuboids recommended by the cube planner algorithms are redundant. The redundancy is controlled by *kylin.cube.cubeplanner.expansion-threshold*.

One more thing about the recommended cuboids that were never hit by your history queries: if cuboid A is the parent of cuboid B and their row counts are similar, then even when your history queries always hit cuboid B, Kylin should prefer to build cuboid A.

For the weighting change, could you explain more about the mathematical theory? At first glance, it does not follow the monotonicity of the probability.

was (Author: yaho):
Hi [~tianhui5], the cuboids recommended by the cube planner algorithms are redundant. The redundancy is controlled by *kylin.cube.cubeplanner.expansion-threshold*.

One more thing about the recommended cuboids that were never hit by your history queries: if cuboid A is the parent of cuboid B and their row counts are similar, then even when your history queries always hit cuboid B, Kylin should prefer to build cuboid A.

For the weighting change, could you explain more about the mathematical theory? At first glance, it does not follow monotonicity.

> optimize kylin planner by delete unnecessary cuboids
>
> Key: KYLIN-4985
> URL: https://issues.apache.org/jira/browse/KYLIN-4985
> Project: Kylin
> Issue Type: New Feature
> Reporter: tianhui
> Priority: Major
>
> When I use the Kylin planner, the recommended result contains many cuboids that were never hit by my history queries. I think they may be unnecessary, so I delete the unhit cuboids.
> In addition, I weight the row count by 1/sqrt(hit probability) before executing the planning algorithm.

--
This message was sent by Atlassian Jira
(v8.3.4#803005)
[jira] [Commented] (KYLIN-4985) optimize kylin planner by delete unnecessary cuboids
[ https://issues.apache.org/jira/browse/KYLIN-4985?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17370190#comment-17370190 ]

Zhong Yanghong commented on KYLIN-4985:
---
Hi [~tianhui5], the cuboids recommended by the cube planner algorithms are redundant. The redundancy is controlled by *kylin.cube.cubeplanner.expansion-threshold*.

One more thing about the recommended cuboids that were never hit by your history queries: if cuboid A is the parent of cuboid B and their row counts are similar, then even when your history queries always hit cuboid B, Kylin should prefer to build cuboid A.

For the weighting change, could you explain more about the mathematical theory? At first glance, it does not follow monotonicity.

> optimize kylin planner by delete unnecessary cuboids
>
> Key: KYLIN-4985
> URL: https://issues.apache.org/jira/browse/KYLIN-4985
> Project: Kylin
> Issue Type: New Feature
> Reporter: tianhui
> Priority: Major
>
> When I use the Kylin planner, the recommended result contains many cuboids that were never hit by my history queries. I think they may be unnecessary, so I delete the unhit cuboids.
> In addition, I weight the row count by 1/sqrt(hit probability) before executing the planning algorithm.

--
This message was sent by Atlassian Jira
(v8.3.4#803005)
[jira] [Comment Edited] (KYLIN-4165) RT OLAP building job on "Save Cube Dictionaries" step concurrency error
[ https://issues.apache.org/jira/browse/KYLIN-4165?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17368010#comment-17368010 ]

Zhong Yanghong edited comment on KYLIN-4165 at 6/23/21, 10:17 AM:
--
If the first step fails, why should the job still keep the lock?

was (Author: yaho):
Why do we need a distributed lock spanning two stages, which may introduce other issues? For example, when the first step fails because the cube is disabled, the lock should be released. Currently the lock is released only when the job is discarded.

How about fixing it just in *SaveDictStep*?

> RT OLAP building job on "Save Cube Dictionaries" step concurrency error
> ---
>
> Key: KYLIN-4165
> URL: https://issues.apache.org/jira/browse/KYLIN-4165
> Project: Kylin
> Issue Type: Bug
> Components: Real-time Streaming
> Affects Versions: v3.0.0-alpha
> Reporter: wangxiaojing
> Priority: Major
> Fix For: v3.0.0
>
> There is a dictionary version conflict in the "Save Cube Dictionaries" step when building the real-time segment from remote persisted to ready. This is very serious: it causes dictionary updates to fail when multiple jobs run concurrently. It may occur when a cube has many concurrent building jobs on the same step, "Save Cube Dictionaries".
> Perhaps a globally distributed lock is needed to prevent concurrent runs of this step for one cube.
> Save Cube Dictionaries log messages:
> {code:java}
> // code placeholder
> org.apache.kylin.common.persistence.WriteConflictException: Overwriting conflict /dict/DEFAULT.TASK_SNAPSHOT/GROUPVALUE/5387e747-9649-0b17-5a72-ee17f5baea0a.dict, expect old TS 1568012509090, but it is 1568012509245
>     at org.apache.kylin.storage.hbase.HBaseResourceStore.updateTimestampImpl(HBaseResourceStore.java:372)
>     at org.apache.kylin.common.persistence.ResourceStore$7.call(ResourceStore.java:465)
>     at org.apache.kylin.common.persistence.ExponentialBackoffRetry.doWithRetry(ExponentialBackoffRetry.java:52)
>     at org.apache.kylin.common.persistence.ResourceStore.updateTimestampWithRetry(ResourceStore.java:462)
>     at org.apache.kylin.common.persistence.ResourceStore.updateTimestampCheckPoint(ResourceStore.java:457)
>     at org.apache.kylin.common.persistence.ResourceStore.updateTimestamp(ResourceStore.java:452)
>     at org.apache.kylin.dict.DictionaryManager.updateExistingDictLastModifiedTime(DictionaryManager.java:197)
>     at org.apache.kylin.dict.DictionaryManager.trySaveNewDict(DictionaryManager.java:157)
>     at org.apache.kylin.engine.mr.streaming.SaveDictStep.doWork(SaveDictStep.java:122)
>     at org.apache.kylin.job.execution.AbstractExecutable.execute(AbstractExecutable.java:179)
>     at org.apache.kylin.job.execution.DefaultChainedExecutable.doWork(DefaultChainedExecutable.java:71)
>     at org.apache.kylin.job.execution.AbstractExecutable.execute(AbstractExecutable.java:179)
>     at org.apache.kylin.job.impl.threadpool.DistributedScheduler$JobRunner.run(DistributedScheduler.java:110)
>     at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1142)
>     at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:617)
>     at java.lang.Thread.run(Thread.java:745)
> {code}

--
This message was sent by Atlassian Jira
(v8.3.4#803005)
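The concern raised in the comment on KYLIN-4165, that a failed step should not leave the lock held, is commonly addressed by releasing the lock in a `finally` block. The sketch below shows that pattern only. It is not Kylin's job engine: `StepLockSketch` and `runStep` are hypothetical names, and a local `ReentrantLock` stands in for the distributed lock discussed in the issue.

```java
import java.util.concurrent.locks.ReentrantLock;

// Hypothetical illustration of releasing a lock when a step fails; not Kylin code.
final class StepLockSketch {

    // Stand-in for a distributed lock; a real one would live outside the JVM.
    private final ReentrantLock lock = new ReentrantLock();

    /** Runs the step under the lock and always releases the lock afterwards. */
    boolean runStep(Runnable step) {
        lock.lock();
        try {
            step.run();
            return true;
        } catch (RuntimeException e) {
            // The step failed, for example because the cube is disabled.
            return false;
        } finally {
            lock.unlock(); // released on success and on failure alike
        }
    }

    boolean isLocked() {
        return lock.isLocked();
    }

    public static void main(String[] args) {
        StepLockSketch job = new StepLockSketch();
        boolean succeeded = job.runStep(() -> {
            throw new IllegalStateException("cube is disabled");
        });
        System.out.println("succeeded=" + succeeded + ", stillLocked=" + job.isLocked());
    }
}
```

A lock that spans two separate job steps cannot use a single `try`/`finally`, which is part of why the comment suggests confining the fix to *SaveDictStep*.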
[jira] [Issue Comment Deleted] (KYLIN-4165) RT OLAP building job on "Save Cube Dictionaries" step concurrency error
[ https://issues.apache.org/jira/browse/KYLIN-4165?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Zhong Yanghong updated KYLIN-4165:
--
    Comment: was deleted

(was: Why do we need a cube-level lock for this, since the dictionary is segment-level?)

> RT OLAP building job on "Save Cube Dictionaries" step concurrency error
> ---
>
> Key: KYLIN-4165
> URL: https://issues.apache.org/jira/browse/KYLIN-4165
> Project: Kylin
> Issue Type: Bug
> Components: Real-time Streaming
> Affects Versions: v3.0.0-alpha
> Reporter: wangxiaojing
> Priority: Major
> Fix For: v3.0.0
>
> There is a dictionary version conflict in the "Save Cube Dictionaries" step when building the real-time segment from remote persisted to ready. This is very serious: it causes dictionary updates to fail when multiple jobs run concurrently. It may occur when a cube has many concurrent building jobs on the same step, "Save Cube Dictionaries".
> Perhaps a globally distributed lock is needed to prevent concurrent runs of this step for one cube.
[jira] [Comment Edited] (KYLIN-4165) RT OLAP building job on "Save Cube Dictionaries" step concurrency error
[ https://issues.apache.org/jira/browse/KYLIN-4165?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17368010#comment-17368010 ]

Zhong Yanghong edited comment on KYLIN-4165 at 6/23/21, 10:06 AM:
--
Why do we need a distributed lock spanning two stages, which may introduce other issues? For example, when the first step fails because the cube is disabled, the lock should be released. Currently the lock is released only when the job is discarded.

How about fixing it just in *SaveDictStep*?

was (Author: yaho):
Why do we need a distributed lock spanning two stages, which may introduce other issues? How about fixing it just in *SaveDictStep*?

> RT OLAP building job on "Save Cube Dictionaries" step concurrency error
> ---
>
> Key: KYLIN-4165
> URL: https://issues.apache.org/jira/browse/KYLIN-4165
> Project: Kylin
> Issue Type: Bug
> Components: Real-time Streaming
> Affects Versions: v3.0.0-alpha
> Reporter: wangxiaojing
> Priority: Major
> Fix For: v3.0.0
>
> There is a dictionary version conflict in the "Save Cube Dictionaries" step when building the real-time segment from remote persisted to ready. This is very serious: it causes dictionary updates to fail when multiple jobs run concurrently. It may occur when a cube has many concurrent building jobs on the same step, "Save Cube Dictionaries".
> Perhaps a globally distributed lock is needed to prevent concurrent runs of this step for one cube.
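The `WriteConflictException` in this thread is the typical symptom of an optimistic timestamp check: a job reads the old timestamp, another job updates the same dictionary first, and the first job's expectation no longer matches. The following is a minimal sketch of handling that inside a single step by re-reading the timestamp and retrying. `TimestampStore` and all of its methods are hypothetical stand-ins for illustration, not Kylin's `ResourceStore` API, and this is not necessarily the fix that was applied.

```java
import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.ConcurrentMap;

// Hypothetical in-memory store with optimistic timestamp checks.
// This is an assumption for illustration, NOT Kylin's ResourceStore.
final class TimestampStore {
    private final ConcurrentMap<String, Long> timestamps = new ConcurrentHashMap<>();

    void put(String path, long ts) {
        timestamps.put(path, ts);
    }

    long get(String path) {
        Long ts = timestamps.get(path);
        if (ts == null) {
            throw new IllegalStateException("No such resource: " + path);
        }
        return ts;
    }

    // Succeeds only if the stored timestamp still equals expectedOldTs.
    boolean compareAndSetTimestamp(String path, long expectedOldTs, long newTs) {
        return timestamps.replace(path, expectedOldTs, newTs);
    }

    // Re-reads the current timestamp before each attempt, so a concurrent
    // update by another job causes a retry instead of a write conflict.
    boolean touchWithRetry(String path, long newTs, int maxAttempts) {
        for (int attempt = 0; attempt < maxAttempts; attempt++) {
            long current = get(path);
            if (current >= newTs) {
                return true; // another writer already stored a newer timestamp
            }
            if (compareAndSetTimestamp(path, current, newTs)) {
                return true;
            }
        }
        return false;
    }
}
```

The design choice here is to treat a conflict on the same dictionary as benign when another job has already refreshed it, which avoids holding a lock across steps.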
[jira] [Commented] (KYLIN-4165) RT OLAP building job on "Save Cube Dictionaries" step concurrency error
[ https://issues.apache.org/jira/browse/KYLIN-4165?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17368010#comment-17368010 ]

Zhong Yanghong commented on KYLIN-4165:
---

Why do we need a distributed lock spanning two stages, which may introduce other issues? How about fixing it just in *SaveDictStep*?
[jira] [Comment Edited] (KYLIN-4165) RT OLAP building job on "Save Cube Dictionaries" step concurrency error
[ https://issues.apache.org/jira/browse/KYLIN-4165?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17368007#comment-17368007 ]

Zhong Yanghong edited comment on KYLIN-4165 at 6/23/21, 10:00 AM:
---

Why do we need a cube-level lock for this, since the dictionary is segment-level?

was (Author: yaho):
Why do we need a cube-level lock for this, since the dictionary is segment-level.
[jira] [Commented] (KYLIN-4165) RT OLAP building job on "Save Cube Dictionaries" step concurrency error
[ https://issues.apache.org/jira/browse/KYLIN-4165?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17368007#comment-17368007 ]

Zhong Yanghong commented on KYLIN-4165:
---

Why do we need a cube-level lock for this, since the dictionary is segment-level.
[jira] [Reopened] (KYLIN-4165) RT OLAP building job on "Save Cube Dictionaries" step concurrency error
[ https://issues.apache.org/jira/browse/KYLIN-4165?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Zhong Yanghong reopened KYLIN-4165:
---
[jira] [Updated] (KYLIN-4992) Source row count statistics calculated in a wrong way in MergeDictionaryMapper
[ https://issues.apache.org/jira/browse/KYLIN-4992?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Zhong Yanghong updated KYLIN-4992:
---
Affects Version/s: v3.0.0, v2.6.2, v2.6.3, v3.1.0, v3.0.1, v3.0.2, v3.1.1, v3.1.2

> Source row count statistics calculated in a wrong way in MergeDictionaryMapper
> ---
>
> Key: KYLIN-4992
> URL: https://issues.apache.org/jira/browse/KYLIN-4992
> Project: Kylin
> Issue Type: Bug
> Affects Versions: v3.0.0, v2.6.2, v2.6.3, v3.1.0, v3.0.1, v3.0.2, v3.1.1, v3.1.2
> Reporter: Zhong Yanghong
> Priority: Critical
>
> With this bug, the source row count is smaller than the correct one, which results in an underestimated cuboid size and a smaller region number. Ultimately, it impacts job and query performance.
[jira] [Updated] (KYLIN-4992) Source row count statistics calculated in a wrong way in MergeDictionaryMapper
[ https://issues.apache.org/jira/browse/KYLIN-4992?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Zhong Yanghong updated KYLIN-4992:
---
Description: With this bug, the source row count is smaller than the correct one, which results in an underestimated cuboid size and a smaller region number. Ultimately, it impacts job and query performance.
(was: With this bug, the source row count is smaller than the correct one, which results in an underestimated cuboid size and a smaller region number.)
[jira] [Updated] (KYLIN-4992) Source row count statistics calculated in a wrong way in MergeDictionaryMapper
[ https://issues.apache.org/jira/browse/KYLIN-4992?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Zhong Yanghong updated KYLIN-4992:
---
Priority: Critical (was: Major)
[jira] [Updated] (KYLIN-4992) Source row count statistics calculated in a wrong way in MergeDictionaryMapper
[ https://issues.apache.org/jira/browse/KYLIN-4992?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Zhong Yanghong updated KYLIN-4992:
---
Description: With this bug, the source row count is smaller than the correct one, which results in an underestimated cuboid size and a smaller region number.
[jira] [Created] (KYLIN-4992) Source row count statistics calculated in a wrong way in MergeDictionaryMapper
Zhong Yanghong created KYLIN-4992:
---
Summary: Source row count statistics calculated in a wrong way in MergeDictionaryMapper
Key: KYLIN-4992
URL: https://issues.apache.org/jira/browse/KYLIN-4992
Project: Kylin
Issue Type: Bug
Reporter: Zhong Yanghong
[jira] [Assigned] (KYLIN-4861) Wrong way to get CubeManager instance in CubeInstance.latestCopyForWrite()
[ https://issues.apache.org/jira/browse/KYLIN-4861?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Zhong Yanghong reassigned KYLIN-4861:
---
Assignee: (was: Zhong Yanghong)

> Wrong way to get CubeManager instance in CubeInstance.latestCopyForWrite()
> ---
>
> Key: KYLIN-4861
> URL: https://issues.apache.org/jira/browse/KYLIN-4861
> Project: Kylin
> Issue Type: Bug
> Reporter: Zhong Yanghong
> Priority: Major
>
> Each cube can have its own KylinConfig. Consider the following code:
> {code}
> public CubeInstance latestCopyForWrite() {
>     CubeManager mgr = CubeManager.getInstance(config);
>     CubeInstance latest = mgr.getCube(name); // in case this object is out-of-date
>     return mgr.copyForWrite(latest);
> }
> {code}
> Because the cube's own config is passed in, each cube can end up with a different CubeManager instance, which may easily cause map consistency issues.
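To make the concern above concrete, here is a small sketch of how a manager that is cached per config object ends up with separate in-memory maps when a cube-level config is passed in instead of the global one. `Config` and `Manager` are simplified, hypothetical stand-ins, not Kylin's real `KylinConfig` or `CubeManager`, and the caching scheme is an assumption made for illustration.

```java
import java.util.HashMap;
import java.util.Map;

// Hypothetical, simplified stand-ins; not Kylin's real KylinConfig/CubeManager.
final class ManagerPerConfigDemo {
    static final class Config { }

    static final class Manager {
        private static final Map<Config, Manager> INSTANCES = new HashMap<>();
        final Map<String, String> cubes = new HashMap<>();

        // One manager per config object (identity-based, Config does not override equals).
        static synchronized Manager getInstance(Config config) {
            return INSTANCES.computeIfAbsent(config, c -> new Manager());
        }
    }

    // Writes through the manager bound to cubeLevel, then reads through the
    // manager bound to global. Returns true only if both share the same map.
    static boolean updateVisibleGlobally(Config global, Config cubeLevel) {
        Manager.getInstance(cubeLevel).cubes.put("cube_a", "updated");
        return "updated".equals(Manager.getInstance(global).cubes.get("cube_a"));
    }
}
```

With two distinct config objects the update is invisible to the global manager; passing the same config on both sides keeps a single map.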
[jira] [Updated] (KYLIN-4861) Wrong way to get CubeManager instance in CubeInstance.latestCopyForWrite()
[ https://issues.apache.org/jira/browse/KYLIN-4861?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Zhong Yanghong updated KYLIN-4861:
---
Description:
Each cube can have its own KylinConfig. Consider the following code:
{code}
public CubeInstance latestCopyForWrite() {
    CubeManager mgr = CubeManager.getInstance(config);
    CubeInstance latest = mgr.getCube(name); // in case this object is out-of-date
    return mgr.copyForWrite(latest);
}
{code}
Because the cube's own config is passed in, each cube can end up with a different CubeManager instance, which may easily cause map consistency issues.
[jira] [Created] (KYLIN-4861) Wrong way to get CubeManager instance in CubeInstance.latestCopyForWrite()
Zhong Yanghong created KYLIN-4861:
---
Summary: Wrong way to get CubeManager instance in CubeInstance.latestCopyForWrite()
Key: KYLIN-4861
URL: https://issues.apache.org/jira/browse/KYLIN-4861
Project: Kylin
Issue Type: Bug
Reporter: Zhong Yanghong
Assignee: Zhong Yanghong
[jira] [Resolved] (KYLIN-4658) Union all issue with regarding to windows function & aggregation on
[ https://issues.apache.org/jira/browse/KYLIN-4658?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Zhong Yanghong resolved KYLIN-4658.
---
Fix Version/s: v3.1.2
Resolution: Fixed

> Union all issue with regarding to windows function & aggregation on
> ---
>
> Key: KYLIN-4658
> URL: https://issues.apache.org/jira/browse/KYLIN-4658
> Project: Kylin
> Issue Type: Improvement
> Reporter: Zhong Yanghong
> Assignee: Zhong Yanghong
> Priority: Major
> Fix For: v3.1.2
>
> Test SQL:
> {code}
> select CNT, GMV, sum(GMV) over(partition by SLR_SEGMENT_CD) TOTAL_GMV, SLR_SEGMENT_CD, LSTG_FORMAT_NAME
> from
>   (select sum(PRICE) GMV, sum(ITEM_COUNT) CNT, SLR_SEGMENT_CD, LSTG_FORMAT_NAME
>    from TEST_KYLIN_FACT group by SLR_SEGMENT_CD, LSTG_FORMAT_NAME
>    UNION ALL
>    select sum(PRICE) GMV, sum(ITEM_COUNT) CNT, SLR_SEGMENT_CD, LSTG_FORMAT_NAME
>    from TEST_KYLIN_FACT group by SLR_SEGMENT_CD, LSTG_FORMAT_NAME)
> order by TOTAL_GMV
> {code}
>
> Exception:
> {code}
> Index: 2, Size: 2 while executing SQL: "select * from (select CNT, GMV, sum(GMV) over(partition by SLR_SEGMENT_CD) TOTAL_GMV, SLR_SEGMENT_CD, LSTG_FORMAT_NAME from (select sum(PRICE) GMV, sum(ITEM_COUNT) CNT, SLR_SEGMENT_CD, LSTG_FORMAT_NAME from TEST_KYLIN_FACT group by SLR_SEGMENT_CD, LSTG_FORMAT_NAME UNION ALL select sum(PRICE) GMV, sum(ITEM_COUNT) CNT, SLR_SEGMENT_CD, LSTG_FORMAT_NAME from TEST_KYLIN_FACT group by SLR_SEGMENT_CD, LSTG_FORMAT_NAME) order by TOTAL_GMV) limit 5"
> {code}
> Similar issue for the following SQL:
> {code}
> select LSTG_FORMAT_NAME,
>        SLR_SEGMENT_CD,
>        CAL_DT,
>        sum(CNT) as CNT
> from
>   (select LSTG_FORMAT_NAME,
>           SLR_SEGMENT_CD,
>           CAL_DT,
>           sum(ITEM_COUNT) CNT
>    from TEST_KYLIN_FACT
>    where LSTG_FORMAT_NAME = 'ABIN'
>    group by LSTG_FORMAT_NAME,
>             SLR_SEGMENT_CD,
>             CAL_DT
>    UNION ALL select 'NON-ABIN' as LSTG_FORMAT_NAME,
>                     SLR_SEGMENT_CD,
>                     CAL_DT,
>                     case
>                         when SLR_SEGMENT_CD > 1000 then CNT * 2
>                         else CNT * 3
>                     end as CNT
>    from
>      (select SLR_SEGMENT_CD,
>              CAL_DT,
>              sum(ITEM_COUNT) CNT
>       from TEST_KYLIN_FACT
>       where LSTG_FORMAT_NAME <> 'ABIN'
>       group by SLR_SEGMENT_CD, CAL_DT))
> group by LSTG_FORMAT_NAME,
>          SLR_SEGMENT_CD,
>          CAL_DT
> order by CNT
> {code}
[jira] [Created] (KYLIN-4851) Better to throw exception when lazy query waiting timeout
Zhong Yanghong created KYLIN-4851:
---
Summary: Better to throw exception when lazy query waiting timeout
Key: KYLIN-4851
URL: https://issues.apache.org/jira/browse/KYLIN-4851
Project: Kylin
Issue Type: Improvement
Reporter: Zhong Yanghong
Assignee: Zhong Yanghong
[jira] [Issue Comment Deleted] (KYLIN-4682) java.lang.IndexOutOfBoundsException due to not setting havingFilter correctly
[ https://issues.apache.org/jira/browse/KYLIN-4682?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Zhong Yanghong updated KYLIN-4682:
---
Comment: was deleted

(was: It seems rule *FilterAggregateTransposeRule* is not effective. It's better to set a lower number for computing the cost of *OLAPFilterRel* to make filter push down as much as possible.)

> java.lang.IndexOutOfBoundsException due to not setting havingFilter correctly
> ---
>
> Key: KYLIN-4682
> URL: https://issues.apache.org/jira/browse/KYLIN-4682
> Project: Kylin
> Issue Type: Improvement
> Reporter: Zhong Yanghong
> Assignee: Zhong Yanghong
> Priority: Major
>
> SQL:
> {code}
> select LSTG_FORMAT_NAME, LEAF_CATEG_ID, sum(price) as gmv
> from TEST_KYLIN_FACT
> group by LSTG_FORMAT_NAME, LEAF_CATEG_ID
> having LSTG_FORMAT_NAME = 'Auction'
> {code}
> Error stack trace:
> {code}
> Caused by: java.lang.IndexOutOfBoundsException: Index: 3, Size: 1
>     at java.util.ArrayList.rangeCheck(ArrayList.java:657)
>     at java.util.ArrayList.get(ArrayList.java:433)
>     at org.apache.kylin.storage.gtrecord.GTCubeStorageQueryBase.checkHavingCanPushDown(GTCubeStorageQueryBase.java:553)
>     at org.apache.kylin.storage.gtrecord.GTCubeStorageQueryBase.getStorageQueryRequest(GTCubeStorageQueryBase.java:196)
>     at org.apache.kylin.storage.gtrecord.GTCubeStorageQueryBase.searchInner(GTCubeStorageQueryBase.java:98)
>     at org.apache.kylin.storage.gtrecord.GTCubeStorageQueryBase.search(GTCubeStorageQueryBase.java:90)
>     at org.apache.kylin.storage.hybrid.HybridStorageQuery.search(HybridStorageQuery.java:53)
>     at org.apache.kylin.query.enumerator.OLAPEnumerator.queryStorage(OLAPEnumerator.java:117)
>     at org.apache.kylin.query.enumerator.OLAPEnumerator.moveNext(OLAPEnumerator.java:60)
>     at Baz$1$1.moveNext(Unknown Source)
>     at org.apache.calcite.linq4j.EnumerableDefaults.groupBy_(EnumerableDefaults.java:825)
>     at org.apache.calcite.linq4j.EnumerableDefaults.groupBy(EnumerableDefaults.java:761)
>     at org.apache.calcite.linq4j.DefaultEnumerable.groupBy(DefaultEnumerable.java:302)
>     at Baz.bind(Unknown Source)
>     at org.apache.calcite.jdbc.CalcitePrepare$CalciteSignature.enumerable(CalcitePrepare.java:365)
>     at org.apache.calcite.jdbc.CalciteConnectionImpl.enumerable(CalciteConnectionImpl.java:301)
>     at org.apache.calcite.jdbc.CalciteMetaImpl._createIterable(CalciteMetaImpl.java:559)
>     at org.apache.calcite.jdbc.CalciteMetaImpl.createIterable(CalciteMetaImpl.java:550)
>     at org.apache.calcite.avatica.AvaticaResultSet.execute(AvaticaResultSet.java:182)
>     at org.apache.calcite.jdbc.CalciteResultSet.execute(CalciteResultSet.java:67)
>     at org.apache.calcite.jdbc.CalciteResultSet.execute(CalciteResultSet.java:44)
>     at org.apache.calcite.avatica.AvaticaConnection$1.execute(AvaticaConnection.java:667)
>     at org.apache.calcite.jdbc.CalciteMetaImpl.prepareAndExecute(CalciteMetaImpl.java:619)
>     at org.apache.calcite.avatica.AvaticaConnection.prepareAndExecuteInternal(AvaticaConnection.java:675)
>     at org.apache.calcite.avatica.AvaticaStatement.executeInternal(AvaticaStatement.java:156)
>     ... 81 more
> {code}
[jira] [Assigned] (KYLIN-4658) Union all issue with regarding to windows function & aggregation on
[ https://issues.apache.org/jira/browse/KYLIN-4658?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Zhong Yanghong reassigned KYLIN-4658:
---
Assignee: Zhong Yanghong (was: JiangYang)
[jira] [Commented] (KYLIN-3392) Support NULL value in Sum, Max, Min Aggregation
[ https://issues.apache.org/jira/browse/KYLIN-3392?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=1726#comment-1726 ]

Zhong Yanghong commented on KYLIN-3392:
---

Hi [~wangrupeng], I applied the patch to 3.1.0 and it works well. Could you try it?

> Support NULL value in Sum, Max, Min Aggregation
> ---
>
> Key: KYLIN-3392
> URL: https://issues.apache.org/jira/browse/KYLIN-3392
> Project: Kylin
> Issue Type: Bug
> Reporter: Yifei Wu
> Assignee: Yifei Wu
> Priority: Major
> Fix For: Future
>
> Attachments: KYLIN-3392-2.png, KYLIN-3392.png, kylin-3.0.0-alpha2.png
>
> Kylin's basic aggregate measures (such as sum, max, min) treat a NULL value as 0. However, it is necessary to distinguish NULL from 0.
> The behavior should be:
> *sum(null, null) = null*
> *sum(null, 1) = 1*
> *max(null, null) = null*
> *max(null, -1) = -1*
> *min(null, -1) = -1*
> in accordance with Hive and SparkSQL.
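The NULL semantics requested in this issue can be sketched with boxed values, where `null` means "no value seen yet" rather than 0. The class `NullAwareAgg` below is a hypothetical illustration of those rules only; it is not the patch discussed in the thread and does not use Kylin's measure aggregator classes.

```java
// Hypothetical helper illustrating NULL-aware aggregation; not Kylin's aggregators.
final class NullAwareAgg {
    private NullAwareAgg() { }

    // NULL is ignored unless both inputs are NULL, in which case the result is NULL.
    static Long sum(Long a, Long b) {
        if (a == null) {
            return b;
        }
        if (b == null) {
            return a;
        }
        return a + b;
    }

    static Long max(Long a, Long b) {
        if (a == null) {
            return b;
        }
        if (b == null) {
            return a;
        }
        return Math.max(a, b);
    }

    static Long min(Long a, Long b) {
        if (a == null) {
            return b;
        }
        if (b == null) {
            return a;
        }
        return Math.min(a, b);
    }
}
```

Treating NULL as 0 instead would make `max(null, -1)` return 0, which is exactly the kind of wrong answer the issue describes.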
[jira] [Commented] (KYLIN-3482) Unclosed SetAndUnsetThreadLocalConfig in SparkCubingByLayer
[ https://issues.apache.org/jira/browse/KYLIN-3482?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17235822#comment-17235822 ]

Zhong Yanghong commented on KYLIN-3482:
---

[~shaofengshi], it seems the patch that closes the thread-local config has a bad effect in *SparkCubingByLayer*. For example, after applying the patch:
{code}
public void init() {
    KylinConfig kConfig = AbstractHadoopJob.loadKylinConfigFromHdfs(conf, metaUrl);
    try (KylinConfig.SetAndUnsetThreadLocalConfig autoUnset = KylinConfig
            .setAndUnsetThreadLocalConfig(kConfig)) {
        CubeInstance cubeInstance = CubeManager.getInstance(kConfig).getCube(cubeName);
        cubeDesc = cubeInstance.getDescriptor();
        aggregators = new MeasureAggregators(cubeDesc.getMeasures());
        measureNum = cubeDesc.getMeasures().size();
    }
}
{code}
After init() returns, the thread-local KylinConfig has been removed, so a later call to KylinConfig.getInstanceFromEnv() will fail.

> Unclosed SetAndUnsetThreadLocalConfig in SparkCubingByLayer
> ---
>
> Key: KYLIN-3482
> URL: https://issues.apache.org/jira/browse/KYLIN-3482
> Project: Kylin
> Issue Type: Bug
> Reporter: Ted Yu
> Assignee: Jiatao Tao
> Priority: Minor
> Fix For: v2.5.0
>
> Here is the related code:
> {code}
> KylinConfig kylinConfig = AbstractHadoopJob.loadKylinConfigFromHdfs(sConf, metaUrl);
> KylinConfig.setAndUnsetThreadLocalConfig(kylinConfig);
> {code}
> The return value from setAndUnsetThreadLocalConfig should be closed.
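The effect described in the comment above can be reproduced in miniature: a try-with-resources block binds a value to the thread and unbinds it on close, so any lookup after the block fails. `ThreadLocalConfigDemo` is a hypothetical reduction that uses a plain `String` as the config; it only mimics the shape of `KylinConfig.setAndUnsetThreadLocalConfig` and `getInstanceFromEnv`, and the failure mode shown (an exception when nothing is bound) is an assumption.

```java
// Hypothetical miniature of a thread-local config holder; not Kylin's KylinConfig.
final class ThreadLocalConfigDemo {
    private static final ThreadLocal<String> THREAD_CONFIG = new ThreadLocal<>();

    // Binds the config on construction and unbinds it on close().
    static final class SetAndUnset implements AutoCloseable {
        SetAndUnset(String config) {
            THREAD_CONFIG.set(config);
        }

        @Override
        public void close() {
            THREAD_CONFIG.remove();
        }
    }

    // Assumed behavior: fails when no config is bound to the current thread.
    static String getInstanceFromEnv() {
        String config = THREAD_CONFIG.get();
        if (config == null) {
            throw new IllegalStateException("No config bound to this thread");
        }
        return config;
    }

    // Mirrors the shape of init(): the config is only available inside the try block.
    static String init(String config) {
        try (SetAndUnset ignored = new SetAndUnset(config)) {
            return getInstanceFromEnv();
        }
    }
}
```

If code that runs after `init()` on the same thread still needs the config, the scope of the try block has to cover that code as well, or the config has to be passed explicitly.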
[jira] [Commented] (KYLIN-3271) Optimize sub-path check of ResourceTool
[ https://issues.apache.org/jira/browse/KYLIN-3271?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17214421#comment-17214421 ] Zhong Yanghong commented on KYLIN-3271: --- This change blocks the following command: {code} ${KYLIN_HOME}/bin/metastore.sh fetch /execute {code} > Optimize sub-path check of ResourceTool > --- > > Key: KYLIN-3271 > URL: https://issues.apache.org/jira/browse/KYLIN-3271 > Project: Kylin > Issue Type: Improvement > Components: Metadata >Affects Versions: v2.2.0 >Reporter: nichunen >Assignee: nichunen >Priority: Minor > Fix For: v2.4.0 > > > kylin uses class org.apache.kylin.common.persistence.ResourceTool to do > metadata download, upload, remove, etc. The algorithm for resource > transversal is not very effective. For instance, for an "execute_output" with > key "/execute_output/\{uuid}", the algorithm will try to check whether it's a > folder with sub-resources, this makes un-necessary time cost, and in cases of > metadata with lots of jobs, it may last for a long time before the finish. -- This message was sent by Atlassian Jira (v8.3.4#803005)
[jira] [Reopened] (KYLIN-3271) Optimize sub-path check of ResourceTool
[ https://issues.apache.org/jira/browse/KYLIN-3271?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Zhong Yanghong reopened KYLIN-3271: --- > Optimize sub-path check of ResourceTool > --- > > Key: KYLIN-3271 > URL: https://issues.apache.org/jira/browse/KYLIN-3271 > Project: Kylin > Issue Type: Improvement > Components: Metadata >Affects Versions: v2.2.0 >Reporter: nichunen >Assignee: nichunen >Priority: Minor > Fix For: v2.4.0 > > > kylin uses class org.apache.kylin.common.persistence.ResourceTool to do > metadata download, upload, remove, etc. The algorithm for resource > transversal is not very effective. For instance, for an "execute_output" with > key "/execute_output/\{uuid}", the algorithm will try to check whether it's a > folder with sub-resources, this makes un-necessary time cost, and in cases of > metadata with lots of jobs, it may last for a long time before the finish. -- This message was sent by Atlassian Jira (v8.3.4#803005)
[jira] [Commented] (KYLIN-4421) Allow to update table & database name
[ https://issues.apache.org/jira/browse/KYLIN-4421?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17211583#comment-17211583 ] Zhong Yanghong commented on KYLIN-4421: --- The usage is as follows: {code:java} curl -kXPOST 'http://localhost:7070/kylin/api/tables/default/update' \ -H 'Authorization: Basic XX' \ -H 'Content-Type: application/json' \ -d '{ "mapping":{ "DEFAULT.KYLIN_SALES": { "database": "TEST", "tableName": "KYLIN_FACT" }, "DEFAULT.KYLIN_CAL_DT": { "tableName": "CAL_DT" }, "DEFAULT.KYLIN_CATEGORY_GROUPINGS": { "database": "TEST" } }, "isUseExisting":true }' {code} > Allow to update table & database name > -- > > Key: KYLIN-4421 > URL: https://issues.apache.org/jira/browse/KYLIN-4421 > Project: Kylin > Issue Type: Sub-task >Reporter: Zhong Yanghong >Assignee: Zhong Yanghong >Priority: Minor > Fix For: v3.1.0 > > -- This message was sent by Atlassian Jira (v8.3.4#803005)
[jira] [Updated] (KYLIN-4666) Improve TopNCounter's merge performance
[ https://issues.apache.org/jira/browse/KYLIN-4666?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Zhong Yanghong updated KYLIN-4666: -- Description: Currently, we need to sort on every merge operation, which costs a lot of time over thousands of merges. It's better to use a slightly larger buffer to reduce how often a sort is needed. (was: It's better to use PriorityQueue rather than Collections.sort() to sort elements and find minimum value.) > Improve TopNCounter's merge performance > --- > > Key: KYLIN-4666 > URL: https://issues.apache.org/jira/browse/KYLIN-4666 > Project: Kylin > Issue Type: Improvement >Reporter: Zhong Yanghong >Assignee: Zhong Yanghong >Priority: Major > > Currently, we need to sort on every merge operation, which costs a lot of time > over thousands of merges. It's better to use a slightly larger buffer to > reduce how often a sort is needed. -- This message was sent by Atlassian Jira (v8.3.4#803005)
[jira] [Comment Edited] (KYLIN-4080) Project schema update event causes error reload NEW DataModelDesc
[ https://issues.apache.org/jira/browse/KYLIN-4080?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17203848#comment-17203848 ] Zhong Yanghong edited comment on KYLIN-4080 at 9/29/20, 10:33 AM: -- This fix will break cube migration when the destination project is different from the source one because of the additional attribute *projectName* in DataModelDesc was (Author: yaho): This fix will break cube migration when the destination project is different from the source one. > Project schema update event causes error reload NEW DataModelDesc > - > > Key: KYLIN-4080 > URL: https://issues.apache.org/jira/browse/KYLIN-4080 > Project: Kylin > Issue Type: Bug > Components: Metadata >Affects Versions: v2.5.2 >Reporter: Yuzhang QIU >Assignee: Yuzhang QIU >Priority: Blocker > Fix For: v2.6.5, v3.1.0, v3.0.1 > > > Hi, dear Kylin dev team: >When create new DataModelDesc, DataModelManager.createDataModelDese:246 > will temporarily add the new model name into selected project(project1) > cache, but won't persist it. The TEMPORARY ADD operation will make the model > reloading successful, rather than throw "No project found for model ..." > exception(at ProjectManager:391). >However, If there have another threads are processing "Broadcasting > update project_schema, project1", it will clean up cache of project1 and > reload it, which will reset the "TEMPORARY ADD" operation. Meanwhile, the > model saving thread has persisted the DataModelDesc and start to reload it, > but will find there have "No project for this model". > The new model can't be created again because the conflict timestamp and > can't be reloaded into cache because the abrove problem. >How do you think about this?? > >Best regards > >yuzhang -- This message was sent by Atlassian Jira (v8.3.4#803005)
[jira] [Commented] (KYLIN-4080) Project schema update event causes error reload NEW DataModelDesc
[ https://issues.apache.org/jira/browse/KYLIN-4080?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17203848#comment-17203848 ] Zhong Yanghong commented on KYLIN-4080: --- This fix will break cube migration when the destination project is different from the source one. > Project schema update event causes error reload NEW DataModelDesc > - > > Key: KYLIN-4080 > URL: https://issues.apache.org/jira/browse/KYLIN-4080 > Project: Kylin > Issue Type: Bug > Components: Metadata >Affects Versions: v2.5.2 >Reporter: Yuzhang QIU >Assignee: Yuzhang QIU >Priority: Blocker > Fix For: v2.6.5, v3.1.0, v3.0.1 > > > Hi, dear Kylin dev team: >When create new DataModelDesc, DataModelManager.createDataModelDese:246 > will temporarily add the new model name into selected project(project1) > cache, but won't persist it. The TEMPORARY ADD operation will make the model > reloading successful, rather than throw "No project found for model ..." > exception(at ProjectManager:391). >However, If there have another threads are processing "Broadcasting > update project_schema, project1", it will clean up cache of project1 and > reload it, which will reset the "TEMPORARY ADD" operation. Meanwhile, the > model saving thread has persisted the DataModelDesc and start to reload it, > but will find there have "No project for this model". > The new model can't be created again because the conflict timestamp and > can't be reloaded into cache because the abrove problem. >How do you think about this?? > >Best regards > >yuzhang -- This message was sent by Atlassian Jira (v8.3.4#803005)
[jira] [Reopened] (KYLIN-4080) Project schema update event causes error reload NEW DataModelDesc
[ https://issues.apache.org/jira/browse/KYLIN-4080?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Zhong Yanghong reopened KYLIN-4080: --- > Project schema update event causes error reload NEW DataModelDesc > - > > Key: KYLIN-4080 > URL: https://issues.apache.org/jira/browse/KYLIN-4080 > Project: Kylin > Issue Type: Bug > Components: Metadata >Affects Versions: v2.5.2 >Reporter: Yuzhang QIU >Assignee: Yuzhang QIU >Priority: Blocker > Fix For: v2.6.5, v3.1.0, v3.0.1 > > > Hi, dear Kylin dev team: >When create new DataModelDesc, DataModelManager.createDataModelDese:246 > will temporarily add the new model name into selected project(project1) > cache, but won't persist it. The TEMPORARY ADD operation will make the model > reloading successful, rather than throw "No project found for model ..." > exception(at ProjectManager:391). >However, If there have another threads are processing "Broadcasting > update project_schema, project1", it will clean up cache of project1 and > reload it, which will reset the "TEMPORARY ADD" operation. Meanwhile, the > model saving thread has persisted the DataModelDesc and start to reload it, > but will find there have "No project for this model". > The new model can't be created again because the conflict timestamp and > can't be reloaded into cache because the abrove problem. >How do you think about this?? > >Best regards > >yuzhang -- This message was sent by Atlassian Jira (v8.3.4#803005)
[jira] [Commented] (KYLIN-4282) support case when in count (distinct)
[ https://issues.apache.org/jira/browse/KYLIN-4282?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17203784#comment-17203784 ] Zhong Yanghong commented on KYLIN-4282: --- This feature mainly addresses the limitation of count (distinct) combined with a filter. Without this feature, a user who wants to query two count (distinct) measures with different filters has to run two subqueries and then join them to combine the results, as follows: {code} select T1.col_a, T1.cm1, T2.cm2 from (select col_a, count(distinct m1) as cm1 from T where f1... group by 1) T1 inner join (select col_a, count(distinct m2) as cm2 from T where f2... group by 1) T2 on T1.col_a = T2.col_a {code} With this feature, a single query is enough: {code} select col_a, count(distinct case when f1 then m1 end) as cm1, count(distinct case when f2 then m2 end) as cm2 from T group by 1 {code} > support case when in count (distinct) > - > > Key: KYLIN-4282 > URL: https://issues.apache.org/jira/browse/KYLIN-4282 > Project: Kylin > Issue Type: Improvement >Reporter: Zhong Yanghong >Assignee: Zhong Yanghong >Priority: Major > Fix For: v3.1.0 > > -- This message was sent by Atlassian Jira (v8.3.4#803005)
[jira] [Updated] (KYLIN-4779) Use TreeCuboidScheduler even when cube planner is not enabled for query
[ https://issues.apache.org/jira/browse/KYLIN-4779?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Zhong Yanghong updated KYLIN-4779: -- Description: Using InitialCuboidScheduler instead of the stats-based TreeCuboidScheduler is not good for finding the best parent cuboid for query performance. For example, a query with source cuboid 0X0200 has two parent cuboids: * 0X0302 with row count 560K * 0X0304 with row count 40 If we use InitialCuboidScheduler, it chooses 0X0302 as the target cuboid for this query, although 0X0304 is obviously the better choice. ||0X0302||0X0304|| |!0X0302.png|width=400,height=400!|!0X0304.png|width=400,height=400!| was:Use InitialCuboidScheduler instead of stats-based > Use TreeCuboidScheduler even when cube planner is not enabled for query > --- > > Key: KYLIN-4779 > URL: https://issues.apache.org/jira/browse/KYLIN-4779 > Project: Kylin > Issue Type: Improvement >Reporter: Zhong Yanghong >Assignee: Zhong Yanghong >Priority: Major > Attachments: 0X0302.png, 0X0304.png > > > Using InitialCuboidScheduler instead of the stats-based TreeCuboidScheduler is not > good for finding the best parent cuboid for query performance. For example, > a query with source cuboid 0X0200 has two parent cuboids: > * 0X0302 with row count 560K > * 0X0304 with row count 40 > If we use InitialCuboidScheduler, it chooses 0X0302 as the target cuboid > for this query, although 0X0304 is obviously the better choice. > ||0X0302||0X0304|| > |!0X0302.png|width=400,height=400!|!0X0304.png|width=400,height=400!| -- This message was sent by Atlassian Jira (v8.3.4#803005)
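The idea behind preferring a stats-based choice in KYLIN-4779 can be sketched as follows. This is only an illustration under stated assumptions, not the TreeCuboidScheduler implementation: it assumes a cuboid ID is a bitmask of dimensions, that a candidate can serve a query when it contains all of the query's bits, and that per-cuboid row counts are available. The class and method names are hypothetical.

```java
import java.util.LinkedHashMap;
import java.util.Map;

/** Hypothetical sketch of row-count-based parent selection; not Kylin's scheduler. */
final class ParentCuboidChoice {

    // Assumption: a candidate covers the query cuboid if it has every dimension bit set.
    static boolean covers(long candidate, long queryCuboid) {
        return (candidate & queryCuboid) == queryCuboid;
    }

    // Returns the covering cuboid with the smallest row count, or -1 if none covers the query.
    static long cheapestParent(long queryCuboid, Map<Long, Long> rowCounts) {
        long best = -1L;
        long bestRows = Long.MAX_VALUE;
        for (Map.Entry<Long, Long> entry : rowCounts.entrySet()) {
            long candidate = entry.getKey();
            long rows = entry.getValue();
            if (covers(candidate, queryCuboid) && rows < bestRows) {
                best = candidate;
                bestRows = rows;
            }
        }
        return best;
    }

    public static void main(String[] args) {
        // Row counts taken from the example in the issue description.
        Map<Long, Long> rowCounts = new LinkedHashMap<>();
        rowCounts.put(0x0302L, 560_000L);
        rowCounts.put(0x0304L, 40L);
        long chosen = cheapestParent(0x0200L, rowCounts);
        System.out.println("chosen cuboid: 0x" + Long.toHexString(chosen));
    }
}
```

A scheduler without row-count statistics has no basis for preferring 0X0304 over 0X0302, which is the motivation given in the description.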
[jira] [Updated] (KYLIN-4779) Use TreeCuboidScheduler even when cube planner is not enabled for query
[ https://issues.apache.org/jira/browse/KYLIN-4779?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Zhong Yanghong updated KYLIN-4779: -- Attachment: 0X0302.png 0X0304.png > Use TreeCuboidScheduler even when cube planner is not enabled for query > --- > > Key: KYLIN-4779 > URL: https://issues.apache.org/jira/browse/KYLIN-4779 > Project: Kylin > Issue Type: Improvement >Reporter: Zhong Yanghong >Assignee: Zhong Yanghong >Priority: Major > Attachments: 0X0302.png, 0X0304.png > > > Use InitialCuboidScheduler instead of stats-based -- This message was sent by Atlassian Jira (v8.3.4#803005)
[jira] [Updated] (KYLIN-4779) Use TreeCuboidScheduler even when cube planner is not enabled for query
[ https://issues.apache.org/jira/browse/KYLIN-4779?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Zhong Yanghong updated KYLIN-4779: -- Description: Use InitialCuboidScheduler instead of stats-based > Use TreeCuboidScheduler even when cube planner is not enabled for query > --- > > Key: KYLIN-4779 > URL: https://issues.apache.org/jira/browse/KYLIN-4779 > Project: Kylin > Issue Type: Improvement >Reporter: Zhong Yanghong >Assignee: Zhong Yanghong >Priority: Major > > Use InitialCuboidScheduler instead of stats-based -- This message was sent by Atlassian Jira (v8.3.4#803005)
[jira] [Created] (KYLIN-4779) Use TreeCuboidScheduler even when cube planner is not enabled for query
Zhong Yanghong created KYLIN-4779: - Summary: Use TreeCuboidScheduler even when cube planner is not enabled for query Key: KYLIN-4779 URL: https://issues.apache.org/jira/browse/KYLIN-4779 Project: Kylin Issue Type: Improvement Reporter: Zhong Yanghong Assignee: Zhong Yanghong -- This message was sent by Atlassian Jira (v8.3.4#803005)
[jira] [Updated] (KYLIN-4769) Make supportAppend to be true for hdfs federation in HiveProducer
[ https://issues.apache.org/jira/browse/KYLIN-4769?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Zhong Yanghong updated KYLIN-4769: -- Summary: Make supportAppend to be true for hdfs federation in HiveProducer (was: Make supportAppend to be true for hdfs federation) > Make supportAppend to be true for hdfs federation in HiveProducer > - > > Key: KYLIN-4769 > URL: https://issues.apache.org/jira/browse/KYLIN-4769 > Project: Kylin > Issue Type: Improvement >Reporter: Zhong Yanghong >Assignee: Zhong Yanghong >Priority: Major > -- This message was sent by Atlassian Jira (v8.3.4#803005)
[jira] [Created] (KYLIN-4769) Make supportAppend to be true for hdfs federation
Zhong Yanghong created KYLIN-4769: - Summary: Make supportAppend to be true for hdfs federation Key: KYLIN-4769 URL: https://issues.apache.org/jira/browse/KYLIN-4769 Project: Kylin Issue Type: Improvement Reporter: Zhong Yanghong Assignee: Zhong Yanghong -- This message was sent by Atlassian Jira (v8.3.4#803005)
[jira] [Created] (KYLIN-4758) Introduce a new measure to allow input to be negative for topn
Zhong Yanghong created KYLIN-4758: - Summary: Introduce a new measure to allow input to be negative for topn Key: KYLIN-4758 URL: https://issues.apache.org/jira/browse/KYLIN-4758 Project: Kylin Issue Type: Improvement Reporter: Zhong Yanghong Assignee: Zhong Yanghong -- This message was sent by Atlassian Jira (v8.3.4#803005)
[jira] [Updated] (KYLIN-4666) Improve TopNCounter's merge performance
[ https://issues.apache.org/jira/browse/KYLIN-4666?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Zhong Yanghong updated KYLIN-4666: -- Description: It's better to use PriorityQueue rather than Collections.sort() to sort elements and find minimum value. > Improve TopNCounter's merge performance > --- > > Key: KYLIN-4666 > URL: https://issues.apache.org/jira/browse/KYLIN-4666 > Project: Kylin > Issue Type: Improvement >Reporter: Zhong Yanghong >Assignee: Zhong Yanghong >Priority: Major > > It's better to use PriorityQueue rather than Collections.sort() to sort > elements and find minimum value. -- This message was sent by Atlassian Jira (v8.3.4#803005)
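The original suggestion in KYLIN-4666, finding the minimum with a PriorityQueue instead of a full sort, can be sketched as below. The class is hypothetical and works on a plain list of counts rather than on TopNCounter's internal structures; inputs are assumed to be non-empty.

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.List;
import java.util.PriorityQueue;

/** Hypothetical comparison of two ways to find the smallest count; not TopNCounter code. */
final class MinCounterLookup {

    // Full sort: O(n log n) work just to read the first element.
    static double minBySort(List<Double> counts) {
        List<Double> copy = new ArrayList<>(counts);
        Collections.sort(copy);
        return copy.get(0);
    }

    // Min-heap: built from the collection in one pass, smallest element at the head.
    static double minByHeap(List<Double> counts) {
        PriorityQueue<Double> heap = new PriorityQueue<>(counts);
        return heap.peek();
    }

    public static void main(String[] args) {
        List<Double> counts = new ArrayList<>();
        Collections.addAll(counts, 5.0, 2.0, 9.0, 7.5);
        System.out.println("min by sort: " + minBySort(counts));
        System.out.println("min by heap: " + minByHeap(counts));
    }
}
```

Both methods return the same value; the difference is only in how much ordering work is done to get it.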
[jira] [Assigned] (KYLIN-4658) Union all issue with regarding to windows function & aggregation on
[ https://issues.apache.org/jira/browse/KYLIN-4658?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Zhong Yanghong reassigned KYLIN-4658: - Assignee: JiangYang (was: Zhong Yanghong) > Union all issue with regarding to windows function & aggregation on > > > Key: KYLIN-4658 > URL: https://issues.apache.org/jira/browse/KYLIN-4658 > Project: Kylin > Issue Type: Improvement >Reporter: Zhong Yanghong >Assignee: JiangYang >Priority: Major > > Test SQL: > {code} > select CNT, GMV, sum(GMV) over(partition by SLR_SEGMENT_CD) TOTAL_GMV, > SLR_SEGMENT_CD, LSTG_FORMAT_NAME > from > (select sum(PRICE) GMV, sum(ITEM_COUNT) CNT, SLR_SEGMENT_CD, LSTG_FORMAT_NAME > from TEST_KYLIN_FACT group by SLR_SEGMENT_CD, LSTG_FORMAT_NAME > UNION ALL > select sum(PRICE) GMV, sum(ITEM_COUNT) CNT, SLR_SEGMENT_CD, LSTG_FORMAT_NAME > from TEST_KYLIN_FACT group by SLR_SEGMENT_CD, LSTG_FORMAT_NAME) > order by TOTAL_GMV > {code} > > Exception: > {code} > Index: 2, Size: 2 while executing SQL: "select * from (select CNT, GMV, > sum(GMV) over(partition by SLR_SEGMENT_CD) TOTAL_GMV, SLR_SEGMENT_CD, > LSTG_FORMAT_NAME from (select sum(PRICE) GMV, sum(ITEM_COUNT) CNT, > SLR_SEGMENT_CD, LSTG_FORMAT_NAME from TEST_KYLIN_FACT group by > SLR_SEGMENT_CD, LSTG_FORMAT_NAME UNION ALL select sum(PRICE) GMV, > sum(ITEM_COUNT) CNT, SLR_SEGMENT_CD, LSTG_FORMAT_NAME from TEST_KYLIN_FACT > group by SLR_SEGMENT_CD, LSTG_FORMAT_NAME) order by TOTAL_GMV) limit 5" > {code} > Similar issue for the following sql: > {code} > select LSTG_FORMAT_NAME, >SLR_SEGMENT_CD, >CAL_DT, >sum(CNT) as CNT > from > (select LSTG_FORMAT_NAME, > SLR_SEGMENT_CD, > CAL_DT, > sum(ITEM_COUNT) CNT >from TEST_KYLIN_FACT >where LSTG_FORMAT_NAME = 'ABIN' >group by LSTG_FORMAT_NAME, > SLR_SEGMENT_CD, > CAL_DT >UNION ALL select 'NON-ABIN' as LSTG_FORMAT_NAME, > SLR_SEGMENT_CD, > CAL_DT, > case > when SLR_SEGMENT_CD > 1000 then CNT * 2 > else CNT * 3 > end as CNT >from > (select SLR_SEGMENT_CD, > CAL_DT, > sum(ITEM_COUNT) CNT > from 
TEST_KYLIN_FACT > where LSTG_FORMAT_NAME <> 'ABIN' > group by SLR_SEGMENT_CD,CAL_DT)) > group by LSTG_FORMAT_NAME, > SLR_SEGMENT_CD, > CAL_DT > order by CNT > {code} -- This message was sent by Atlassian Jira (v8.3.4#803005)
[jira] [Assigned] (KYLIN-4651) TopN does not work when force hit cube enabled
[ https://issues.apache.org/jira/browse/KYLIN-4651?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Zhong Yanghong reassigned KYLIN-4651: - Assignee: JiangYang (was: Zhong Yanghong) > TopN does not work when force hit cube enabled > -- > > Key: KYLIN-4651 > URL: https://issues.apache.org/jira/browse/KYLIN-4651 > Project: Kylin > Issue Type: Improvement >Reporter: Zhong Yanghong >Assignee: JiangYang >Priority: Major > -- This message was sent by Atlassian Jira (v8.3.4#803005)
[jira] [Assigned] (KYLIN-4637) Fix sum(null) issue for decimal
[ https://issues.apache.org/jira/browse/KYLIN-4637?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Zhong Yanghong reassigned KYLIN-4637: - Assignee: JiangYang (was: Zhong Yanghong) > Fix sum(null) issue for decimal > --- > > Key: KYLIN-4637 > URL: https://issues.apache.org/jira/browse/KYLIN-4637 > Project: Kylin > Issue Type: Improvement >Reporter: Zhong Yanghong >Assignee: JiangYang >Priority: Major > -- This message was sent by Atlassian Jira (v8.3.4#803005)
[jira] [Assigned] (KYLIN-4667) Automatically set kylin.query.cache-signature-enabled to be true when memcached is enabled
[ https://issues.apache.org/jira/browse/KYLIN-4667?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Zhong Yanghong reassigned KYLIN-4667: - Assignee: JiangYang (was: Zhong Yanghong) > Automatically set kylin.query.cache-signature-enabled to be true when > memcached is enabled > -- > > Key: KYLIN-4667 > URL: https://issues.apache.org/jira/browse/KYLIN-4667 > Project: Kylin > Issue Type: Improvement >Reporter: Zhong Yanghong >Assignee: JiangYang >Priority: Major > -- This message was sent by Atlassian Jira (v8.3.4#803005)
[jira] [Assigned] (KYLIN-4702) Missing cube-level lookup table snapshot when doing cube migration
[ https://issues.apache.org/jira/browse/KYLIN-4702?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Zhong Yanghong reassigned KYLIN-4702: - Assignee: JiangYang (was: Zhong Yanghong) > Missing cube-level lookup table snapshot when doing cube migration > -- > > Key: KYLIN-4702 > URL: https://issues.apache.org/jira/browse/KYLIN-4702 > Project: Kylin > Issue Type: Improvement >Reporter: Zhong Yanghong >Assignee: JiangYang >Priority: Major > -- This message was sent by Atlassian Jira (v8.3.4#803005)
[jira] [Assigned] (KYLIN-4639) Make batch account without any authorities to be able to see web pages
[ https://issues.apache.org/jira/browse/KYLIN-4639?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Zhong Yanghong reassigned KYLIN-4639: - Assignee: JiangYang (was: Zhong Yanghong) > Make batch account without any authorities to be able to see web pages > -- > > Key: KYLIN-4639 > URL: https://issues.apache.org/jira/browse/KYLIN-4639 > Project: Kylin > Issue Type: Improvement >Reporter: Zhong Yanghong >Assignee: JiangYang >Priority: Major > -- This message was sent by Atlassian Jira (v8.3.4#803005)
[jira] [Assigned] (KYLIN-4697) User info update logic is not correct
[ https://issues.apache.org/jira/browse/KYLIN-4697?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Zhong Yanghong reassigned KYLIN-4697: - Assignee: JiangYang (was: Zhong Yanghong) > User info update logic is not correct > - > > Key: KYLIN-4697 > URL: https://issues.apache.org/jira/browse/KYLIN-4697 > Project: Kylin > Issue Type: Bug >Reporter: Zhong Yanghong >Assignee: JiangYang >Priority: Major > > There are mainly two issues: > * The logic for KylinAuthenticationProvider.needUpdateUser() is not correct > due to not considering ALL_USERS > * The logic of updateUser in some places is not correct due to not following > copy on write. -- This message was sent by Atlassian Jira (v8.3.4#803005)
[jira] [Assigned] (KYLIN-4636) Make /api/admin/public_config callable for profile saml
[ https://issues.apache.org/jira/browse/KYLIN-4636?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Zhong Yanghong reassigned KYLIN-4636: - Assignee: JiangYang (was: Zhong Yanghong) > Make /api/admin/public_config callable for profile saml > --- > > Key: KYLIN-4636 > URL: https://issues.apache.org/jira/browse/KYLIN-4636 > Project: Kylin > Issue Type: Improvement >Reporter: Zhong Yanghong >Assignee: JiangYang >Priority: Major > -- This message was sent by Atlassian Jira (v8.3.4#803005)
[jira] [Assigned] (KYLIN-4752) Refine server mode checking
[ https://issues.apache.org/jira/browse/KYLIN-4752?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Zhong Yanghong reassigned KYLIN-4752: - Assignee: JiangYang (was: Zhong Yanghong) > Refine server mode checking > --- > > Key: KYLIN-4752 > URL: https://issues.apache.org/jira/browse/KYLIN-4752 > Project: Kylin > Issue Type: Improvement >Reporter: Zhong Yanghong >Assignee: JiangYang >Priority: Major > > It's better to use *org.apache.kylin.common.util.ServerMode* for server mode > checking. -- This message was sent by Atlassian Jira (v8.3.4#803005)
[jira] [Updated] (KYLIN-4752) Refine server mode checking
[ https://issues.apache.org/jira/browse/KYLIN-4752?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Zhong Yanghong updated KYLIN-4752: -- Description: It's better to use *org.apache.kylin.common.util.ServerMode* for server mode checking. > Refine server mode checking > --- > > Key: KYLIN-4752 > URL: https://issues.apache.org/jira/browse/KYLIN-4752 > Project: Kylin > Issue Type: Improvement >Reporter: Zhong Yanghong >Assignee: Zhong Yanghong >Priority: Major > > It's better to use *org.apache.kylin.common.util.ServerMode* for server mode > checking. -- This message was sent by Atlassian Jira (v8.3.4#803005)
[jira] [Created] (KYLIN-4752) Refine server mode checking
Zhong Yanghong created KYLIN-4752: - Summary: Refine server mode checking Key: KYLIN-4752 URL: https://issues.apache.org/jira/browse/KYLIN-4752 Project: Kylin Issue Type: Improvement Reporter: Zhong Yanghong Assignee: Zhong Yanghong -- This message was sent by Atlassian Jira (v8.3.4#803005)
[jira] [Commented] (KYLIN-3359) Support sum(expression) if possible
[ https://issues.apache.org/jira/browse/KYLIN-3359?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17192525#comment-17192525 ] Zhong Yanghong commented on KYLIN-3359: --- Hi [~mzz_q], did you set *kylin.query.enable-dynamic-column* to true at the project level? > Support sum(expression) if possible > --- > > Key: KYLIN-3359 > URL: https://issues.apache.org/jira/browse/KYLIN-3359 > Project: Kylin > Issue Type: Sub-task > Components: Query Engine >Reporter: Zhong Yanghong >Assignee: Zhong Yanghong >Priority: Major > Fix For: v2.4.0 > > Attachments: KYLIN-3359-Hive-query.png, KYLIN-3359-Kylin-query.png > > > The expression can be as follows: > # a ~1~*col ~1~ + a ~2~*col ~2~ + ... + a ~n~*col ~n~ + b, if sum(col > ~1~),sum(col ~2~),...sum(col ~n~) are defined > # case when {{filter}} ~1~ then expr ~1~ > when {{filter}} ~2~ then expr ~2~ > ... > else expr ~N~ > end, if {{filter}} ~1~,{{filter}} ~2~, ... {{filter}} ~N-1~, and expr > ~1~,expr ~2~,...expr ~N~ are supported > There is a constraint on the filters: it must be possible to push down the > related filters in the case when. -- This message was sent by Atlassian Jira (v8.3.4#803005)
[jira] [Updated] (KYLIN-4620) sum(expression) support should be limited since it's not conform the associative law of addition in standard sql
[ https://issues.apache.org/jira/browse/KYLIN-4620?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Zhong Yanghong updated KYLIN-4620: -- Description: In standard SQL, there's an edge case when evaluating an expression for a single row. For example, {code:java} ${col1} + ${col2} {code} if ${col1} or ${col2} is null, the result of this expression should be null. Therefore, the sum aggregation function does not conform to the associative law of addition. That is, {code:java} sum(col1) + sum(col2) != sum(col1 + col2) {code} To support sum(col1 + col2), we have to predefine it if null values may exist in the related columns. If you want to enable sum(expression) with null treated as 0, you need to set *kylin.query.is-null-as-zero-in-expression* to true at the project level. was: In standard SQL, there's an edge case when evaluating an expression for a single row. For example, {code:java} ${col1} + ${col2} {code} if ${col1} or ${col2} is null, the result of this expression should be null. Therefore, the sum aggregation function does not conform to the associative law of addition. That is, {code:java} sum(col1) + sum(col2) != sum(col1 + col2) {code} To support sum(col1 + col2), we have to predefine it if null values may exist in the related columns. If you want to enable sum(expression) with null treated as 0, you need to set kylin.query.is-null-as-zero-in-expression to true at the project level. > sum(expression) support should be limited since it's not conform the > associative law of addition in standard sql > > > Key: KYLIN-4620 > URL: https://issues.apache.org/jira/browse/KYLIN-4620 > Project: Kylin > Issue Type: Improvement >Reporter: Zhong Yanghong >Assignee: Zhong Yanghong >Priority: Major > > In standard SQL, there's an edge case when evaluating an expression for a > single row. For example, > {code:java} > ${col1} + ${col2} > {code} > if ${col1} or ${col2} is null, the result of this expression should be null. 
> Therefore, the sum aggregation function does not conform to the associative law > of addition. That is, > {code:java} > sum(col1) + sum(col2) != sum(col1 + col2) > {code} > > To support sum(col1 + col2), we have to predefine it if null values may > exist in the related columns. > If you want to enable sum(expression) with null treated as 0, you need to > set *kylin.query.is-null-as-zero-in-expression* to true at the project level. -- This message was sent by Atlassian Jira (v8.3.4#803005)
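The inequality in KYLIN-4620 can be checked with a small worked example. The sketch below is hypothetical and independent of Kylin: it models SQL's rule that a row expression is NULL when either operand is NULL, and that SUM skips NULL inputs. The data (two rows, each with one NULL column) is made up for illustration.

```java
/** Hypothetical worked example of SQL NULL semantics for sum(col1 + col2); not Kylin code. */
final class SumOfExpressionDemo {

    // SQL-style row expression: the result is NULL if either operand is NULL.
    static Long add(Long a, Long b) {
        return (a == null || b == null) ? null : Long.valueOf(a + b);
    }

    // Alternative row expression that treats NULL as 0, as the config option describes.
    static Long addNullAsZero(Long a, Long b) {
        long left = (a == null) ? 0L : a;
        long right = (b == null) ? 0L : b;
        return Long.valueOf(left + right);
    }

    // SQL-style SUM: skips NULL inputs and returns NULL when every input is NULL.
    static Long sum(Long[] values) {
        Long total = null;
        for (Long value : values) {
            if (value != null) {
                total = (total == null) ? value : Long.valueOf(total + value);
            }
        }
        return total;
    }

    // sum(col1 + col2), evaluated row by row with the chosen NULL handling.
    static Long sumOfRowExpression(Long[] col1, Long[] col2, boolean nullAsZero) {
        Long[] rows = new Long[col1.length];
        for (int i = 0; i < col1.length; i++) {
            rows[i] = nullAsZero ? addNullAsZero(col1[i], col2[i]) : add(col1[i], col2[i]);
        }
        return sum(rows);
    }

    public static void main(String[] args) {
        Long[] col1 = {1L, null};
        Long[] col2 = {null, 2L};
        System.out.println("sum(col1) + sum(col2)      = " + add(sum(col1), sum(col2)));
        System.out.println("sum(col1 + col2)           = " + sumOfRowExpression(col1, col2, false));
        System.out.println("sum(col1 + col2), null = 0 = " + sumOfRowExpression(col1, col2, true));
    }
}
```

Here sum(col1) + sum(col2) is 1 + 2 = 3, while every row of col1 + col2 is NULL, so sum(col1 + col2) is NULL. Treating NULL as 0 in the row expression brings the two results back into agreement.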
[jira] [Updated] (KYLIN-4620) sum(expression) support should be limited since it's not conform the associative law of addition in standard sql
[ https://issues.apache.org/jira/browse/KYLIN-4620?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Zhong Yanghong updated KYLIN-4620: -- Description: In standard SQL, there's an edge case when evaluating an expression for a single row. For example, {code:java} ${col1} + ${col2} {code} if ${col1} or ${col2} is null, the result of this expression should be null. Therefore, the sum aggregation function does not conform to the associative law of addition. That is, {code:java} sum(col1) + sum(col2) != sum(col1 + col2) {code} To support sum(col1 + col2), we have to predefine it if null values may exist in the related columns. If you want to enable sum(expression) with null treated as 0, you need to set kylin.query.is-null-as-zero-in-expression to true at the project level. was: In standard SQL, there's an edge case when evaluating an expression for a single row. For example, {code:java} ${col1} + ${col2} {code} if ${col1} or ${col2} is null, the result of this expression should be null. Therefore, the sum aggregation function does not conform to the associative law of addition. That is, {code:java} sum(col1) + sum(col2) != sum(col1 + col2) {code} To support sum(col1 + col2), we have to predefine it if null values may exist in the related columns. > sum(expression) support should be limited since it's not conform the > associative law of addition in standard sql > > > Key: KYLIN-4620 > URL: https://issues.apache.org/jira/browse/KYLIN-4620 > Project: Kylin > Issue Type: Improvement >Reporter: Zhong Yanghong >Assignee: Zhong Yanghong >Priority: Major > > In standard SQL, there's an edge case when evaluating an expression for a > single row. For example, > {code:java} > ${col1} + ${col2} > {code} > if ${col1} or ${col2} is null, the result of this expression should be null. > Therefore, the sum aggregation function does not conform to the associative law > of addition. 
That is, > {code:java} > sum(col1) + sum(col2) != sum(col1 + col2) > {code} > > To support sum(col1 + col2), we have to predefine it if null values may > exist in the related columns. > If you want to enable sum(expression) with null treated as 0, you need to > set kylin.query.is-null-as-zero-in-expression to true at the project level. -- This message was sent by Atlassian Jira (v8.3.4#803005)
[jira] [Commented] (KYLIN-3359) Support sum(expression) if possible
[ https://issues.apache.org/jira/browse/KYLIN-3359?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17192167#comment-17192167 ] Zhong Yanghong commented on KYLIN-3359: --- Hi [~mzz_q], please see [KYLIN-4620]. If you want to enable sum(expression), you need to set *kylin.query.is-null-as-zero-in-expression* to true at the project level. > Support sum(expression) if possible > --- > > Key: KYLIN-3359 > URL: https://issues.apache.org/jira/browse/KYLIN-3359 > Project: Kylin > Issue Type: Sub-task > Components: Query Engine >Reporter: Zhong Yanghong >Assignee: Zhong Yanghong >Priority: Major > Fix For: v2.4.0 > > Attachments: KYLIN-3359-Hive-query.png, KYLIN-3359-Kylin-query.png > > > The expression can be as follows: > # a ~1~*col ~1~ + a ~2~*col ~2~ + ... + a ~n~*col ~n~ + b, if sum(col > ~1~),sum(col ~2~),...sum(col ~n~) are defined > # case when {{filter}} ~1~ then expr ~1~ > when {{filter}} ~2~ then expr ~2~ > ... > else expr ~N~ > end, if {{filter}} ~1~,{{filter}} ~2~, ... {{filter}} ~N-1~, and expr > ~1~,expr ~2~,...expr ~N~ are supported > There is a constraint on the filters: it must be possible to push down the > related filters in the case when. -- This message was sent by Atlassian Jira (v8.3.4#803005)
[jira] [Commented] (KYLIN-4633) IT failed for can't detect the default value of config kylin.source.hive.databasedir
[ https://issues.apache.org/jira/browse/KYLIN-4633?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17191717#comment-17191717 ] Zhong Yanghong commented on KYLIN-4633: --- Thanks for fixing this. > IT failed for can't detect the default value of config > kylin.source.hive.databasedir > > > Key: KYLIN-4633 > URL: https://issues.apache.org/jira/browse/KYLIN-4633 > Project: Kylin > Issue Type: Improvement > Components: Tools, Build and Test >Reporter: Yaqian Zhang >Assignee: Yaqian Zhang >Priority: Minor > > KYLIN-4616 introduced automatic detection of the default value of config > kylin.source.hive.databasedir in find-hive-dependency, but IT doesn't execute > this script, so `kylin.source.hive.databasedir` is null, which causes > BuildCubeWithEngine to fail. -- This message was sent by Atlassian Jira (v8.3.4#803005)
[jira] [Created] (KYLIN-4749) Add isNeedMaterialize() for TableDesc
Zhong Yanghong created KYLIN-4749: - Summary: Add isNeedMaterialize() for TableDesc Key: KYLIN-4749 URL: https://issues.apache.org/jira/browse/KYLIN-4749 Project: Kylin Issue Type: Improvement Reporter: Zhong Yanghong Assignee: Zhong Yanghong -- This message was sent by Atlassian Jira (v8.3.4#803005)
[jira] [Created] (KYLIN-4745) Fix BeelineHiveClient parseResultEntry
Zhong Yanghong created KYLIN-4745: - Summary: Fix BeelineHiveClient parseResultEntry Key: KYLIN-4745 URL: https://issues.apache.org/jira/browse/KYLIN-4745 Project: Kylin Issue Type: Improvement Reporter: Zhong Yanghong Assignee: Zhong Yanghong -- This message was sent by Atlassian Jira (v8.3.4#803005)
[jira] [Created] (KYLIN-4718) Trim memory hungry measures in region server if no need to do post aggregation at server side
Zhong Yanghong created KYLIN-4718: - Summary: Trim memory hungry measures in region server if no need to do post aggregation at server side Key: KYLIN-4718 URL: https://issues.apache.org/jira/browse/KYLIN-4718 Project: Kylin Issue Type: Improvement Reporter: Zhong Yanghong Assignee: Zhong Yanghong -- This message was sent by Atlassian Jira (v8.3.4#803005)
[jira] [Updated] (KYLIN-4707) Add one config to disable mapper side combiner especially when there's topn measure
[ https://issues.apache.org/jira/browse/KYLIN-4707?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Zhong Yanghong updated KYLIN-4707: -- Summary: Add one config to disable mapper side combiner especially when there's topn measure (was: Add one config to disable mapper side combiner especially there's topn measure) > Add one config to disable mapper side combiner especially when there's topn > measure > --- > > Key: KYLIN-4707 > URL: https://issues.apache.org/jira/browse/KYLIN-4707 > Project: Kylin > Issue Type: Improvement >Reporter: Zhong Yanghong >Assignee: Zhong Yanghong >Priority: Major > > The mapper side combiner uses only a single thread to merge measures, so if > a topn measure is defined in the cube, a mapper task becomes very slow to > finish. It's better to provide an option to disable it. -- This message was sent by Atlassian Jira (v8.3.4#803005)
[jira] [Created] (KYLIN-4707) Add one config to disable mapper side combiner especially there's topn measure
Zhong Yanghong created KYLIN-4707: - Summary: Add one config to disable mapper side combiner especially there's topn measure Key: KYLIN-4707 URL: https://issues.apache.org/jira/browse/KYLIN-4707 Project: Kylin Issue Type: Improvement Reporter: Zhong Yanghong Assignee: Zhong Yanghong The mapper side combiner uses only a single thread to merge measures, so if a topn measure is defined in the cube, a mapper task becomes very slow to finish. It's better to provide an option to disable it. -- This message was sent by Atlassian Jira (v8.3.4#803005)
[jira] [Created] (KYLIN-4702) Missing cube-level lookup table snapshot when doing cube migration
Zhong Yanghong created KYLIN-4702: - Summary: Missing cube-level lookup table snapshot when doing cube migration Key: KYLIN-4702 URL: https://issues.apache.org/jira/browse/KYLIN-4702 Project: Kylin Issue Type: Improvement Reporter: Zhong Yanghong Assignee: Zhong Yanghong -- This message was sent by Atlassian Jira (v8.3.4#803005)
[jira] [Commented] (KYLIN-4697) User info update logic is not correct
[ https://issues.apache.org/jira/browse/KYLIN-4697?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17178763#comment-17178763 ] Zhong Yanghong commented on KYLIN-4697: --- There's one more thing we need to take care of. In HBase, the rowkey is case-sensitive, while in Kylin the user name is case-insensitive. We need to avoid the case where multiple HBase records exist for the same user. Otherwise, WriteConflictException may occur when updating user info. > User info update logic is not correct > - > > Key: KYLIN-4697 > URL: https://issues.apache.org/jira/browse/KYLIN-4697 > Project: Kylin > Issue Type: Bug >Reporter: Zhong Yanghong >Assignee: Zhong Yanghong >Priority: Major > > There are mainly two issues: > * The logic for KylinAuthenticationProvider.needUpdateUser() is not correct > due to not considering ALL_USERS > * The logic of updateUser in some places is not correct due to not following > copy on write. -- This message was sent by Atlassian Jira (v8.3.4#803005)
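One way to picture the rowkey concern in the comment above is to normalize the user name before it is used as a key. This is a hypothetical sketch, not Kylin's code: the dictionary stands in for a case-sensitive store such as HBase, and row_key, UserStore and the choice of upper-casing are all assumptions made for illustration.

```python
def row_key(user_name):
    # Assumption: upper-casing is the normalization; the point is only that
    # every spelling of the same user maps to a single key.
    return user_name.upper()


class UserStore:
    """Toy stand-in for a store whose keys are case-sensitive."""

    def __init__(self):
        self._rows = {}

    def save(self, user_name, info):
        # Always write under the normalized key so "Admin" and "ADMIN" collide.
        self._rows[row_key(user_name)] = dict(info)

    def load(self, user_name):
        return self._rows.get(row_key(user_name))

    def record_count(self):
        return len(self._rows)


store = UserStore()
store.save("Admin", {"authorities": ["ROLE_ADMIN"]})
store.save("ADMIN", {"authorities": ["ROLE_ADMIN", "ROLE_MODELER"]})
print(store.record_count(), store.load("admin"))
```

With the normalized key there is exactly one record per user, whichever spelling the caller uses, so two differently cased records can never compete for the same logical user.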
[jira] [Updated] (KYLIN-4697) User info update logic is not correct
[ https://issues.apache.org/jira/browse/KYLIN-4697?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Zhong Yanghong updated KYLIN-4697: -- Description: There are mainly two issues: * The logic for KylinAuthenticationProvider.needUpdateUser() is not correct due to not considering ALL_USERS * The logic of updateUser in some places is not correct due to not following copy on write. was: There are mainly two issues: * The logic for KylinAuthenticationProvider.needUpdateUser() is not correct due to not considering ALL_USERS * The logic of updateUser in KylinUserService & KylinUserManager is not correct due to not following copy on write. > User info update logic is not correct > - > > Key: KYLIN-4697 > URL: https://issues.apache.org/jira/browse/KYLIN-4697 > Project: Kylin > Issue Type: Bug >Reporter: Zhong Yanghong >Assignee: Zhong Yanghong >Priority: Major > > There are mainly two issues: > * The logic for KylinAuthenticationProvider.needUpdateUser() is not correct > due to not considering ALL_USERS > * The logic of updateUser in some places is not correct due to not following > copy on write. -- This message was sent by Atlassian Jira (v8.3.4#803005)
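The "copy on write" point in the description above can be sketched as follows. This is a generic illustration under assumptions, not the KylinUserService or KylinUserManager code: UserCache, load, update_user and the persist callback are hypothetical names. The idea shown is that the cached object is never modified in place; a copy is changed, written to the backing store, and only then swapped into the cache.

```python
import copy


class UserCache:
    """Toy cache in front of a backing store (both hypothetical)."""

    def __init__(self, persist):
        self._persist = persist  # callable(name, user); may raise on a failed write
        self._users = {}

    def load(self, name, user):
        # Populate the cache as if the record had been read from the store.
        self._users[name] = user

    def get(self, name):
        return self._users.get(name)

    def update_user(self, name, **changes):
        updated = copy.deepcopy(self._users[name])  # never touch the cached object
        updated.update(changes)
        self._persist(name, updated)                # if this raises, the cache is unchanged
        self._users[name] = updated                 # swap in only after a successful write
        return updated


def failing_persist(name, user):
    raise RuntimeError("simulated write conflict")


cache = UserCache(failing_persist)
cache.load("ADMIN", {"disabled": False})
try:
    cache.update_user("ADMIN", disabled=True)
except RuntimeError:
    pass
print(cache.get("ADMIN"))
```

Because the modification happens on a copy, a failed write leaves the cached user exactly as it was, which is the property an in-place update would lose.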
[jira] [Updated] (KYLIN-4697) User info update logic is not correct
[ https://issues.apache.org/jira/browse/KYLIN-4697?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Zhong Yanghong updated KYLIN-4697: -- Description: There are mainly two issues: * The logic for KylinAuthenticationProvider.needUpdateUser() is not correct due to not considering ALL_USERS * The logic of updateUser in KylinUserService & KylinUserManager is not correct due to not following copy on write. was: There are currently 3 main issues: * In KylinUserGroupService.init(), the following code is not correct: {code} store.checkAndPutResource(PATH, userGroup, USER_GROUP_SERIALIZER); {code} * The logic for KylinAuthenticationProvider.needUpdateUser() is not correct due to not considering ALL_USERS * The logic of updateUser in KylinUserService & KylinUserManager is not correct due to not following copy on write. > User info update logic is not correct > - > > Key: KYLIN-4697 > URL: https://issues.apache.org/jira/browse/KYLIN-4697 > Project: Kylin > Issue Type: Bug >Reporter: Zhong Yanghong >Assignee: Zhong Yanghong >Priority: Major > > There are mainly two issues: > * The logic for KylinAuthenticationProvider.needUpdateUser() is not correct > due to not considering ALL_USERS > * The logic of updateUser in KylinUserService & KylinUserManager is not > correct due to not following copy on write. -- This message was sent by Atlassian Jira (v8.3.4#803005)
[jira] [Created] (KYLIN-4697) User info update logic is not correct
Zhong Yanghong created KYLIN-4697: - Summary: User info update logic is not correct Key: KYLIN-4697 URL: https://issues.apache.org/jira/browse/KYLIN-4697 Project: Kylin Issue Type: Bug Reporter: Zhong Yanghong Assignee: Zhong Yanghong There are currently 3 main issues: * In KylinUserGroupService.init(), the following code is not correct: {code} store.checkAndPutResource(PATH, userGroup, USER_GROUP_SERIALIZER); {code} * The logic for KylinAuthenticationProvider.needUpdateUser() is not correct due to not considering ALL_USERS * The logic of updateUser in KylinUserService & KylinUserManager is not correct due to not following copy on write. -- This message was sent by Atlassian Jira (v8.3.4#803005)
[jira] [Commented] (KYLIN-4682) java.lang.IndexOutOfBoundsException due to not setting havingFilter correctly
[ https://issues.apache.org/jira/browse/KYLIN-4682?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17172037#comment-17172037 ] Zhong Yanghong commented on KYLIN-4682: --- It seems rule *FilterAggregateTransposeRule* is not effective. It's better to set a lower number for computing the cost of *OLAPFilterRel* > java.lang.IndexOutOfBoundsException due to not setting havingFilter correctly > - > > Key: KYLIN-4682 > URL: https://issues.apache.org/jira/browse/KYLIN-4682 > Project: Kylin > Issue Type: Improvement >Reporter: Zhong Yanghong >Assignee: Zhong Yanghong >Priority: Major > > SQL: > {code} > select LSTG_FORMAT_NAME, LEAF_CATEG_ID, sum(price) as gmv > from TEST_KYLIN_FACT > group by LSTG_FORMAT_NAME, LEAF_CATEG_ID > having LSTG_FORMAT_NAME = 'Auction' > {code} > Error stack trace: > {code} > Caused by: java.lang.IndexOutOfBoundsException: Index: 3, Size: 1 > at java.util.ArrayList.rangeCheck(ArrayList.java:657) > at java.util.ArrayList.get(ArrayList.java:433) > at > org.apache.kylin.storage.gtrecord.GTCubeStorageQueryBase.checkHavingCanPushDown(GTCubeStorageQueryBase.java:553) > at > org.apache.kylin.storage.gtrecord.GTCubeStorageQueryBase.getStorageQueryRequest(GTCubeStorageQueryBase.java:196) > at > org.apache.kylin.storage.gtrecord.GTCubeStorageQueryBase.searchInner(GTCubeStorageQueryBase.java:98) > at > org.apache.kylin.storage.gtrecord.GTCubeStorageQueryBase.search(GTCubeStorageQueryBase.java:90) > at > org.apache.kylin.storage.hybrid.HybridStorageQuery.search(HybridStorageQuery.java:53) > at > org.apache.kylin.query.enumerator.OLAPEnumerator.queryStorage(OLAPEnumerator.java:117) > at > org.apache.kylin.query.enumerator.OLAPEnumerator.moveNext(OLAPEnumerator.java:60) > at Baz$1$1.moveNext(Unknown Source) > at > org.apache.calcite.linq4j.EnumerableDefaults.groupBy_(EnumerableDefaults.java:825) > at > org.apache.calcite.linq4j.EnumerableDefaults.groupBy(EnumerableDefaults.java:761) > at > 
org.apache.calcite.linq4j.DefaultEnumerable.groupBy(DefaultEnumerable.java:302) > at Baz.bind(Unknown Source) > at > org.apache.calcite.jdbc.CalcitePrepare$CalciteSignature.enumerable(CalcitePrepare.java:365) > at > org.apache.calcite.jdbc.CalciteConnectionImpl.enumerable(CalciteConnectionImpl.java:301) > at > org.apache.calcite.jdbc.CalciteMetaImpl._createIterable(CalciteMetaImpl.java:559) > at > org.apache.calcite.jdbc.CalciteMetaImpl.createIterable(CalciteMetaImpl.java:550) > at > org.apache.calcite.avatica.AvaticaResultSet.execute(AvaticaResultSet.java:182) > at > org.apache.calcite.jdbc.CalciteResultSet.execute(CalciteResultSet.java:67) > at > org.apache.calcite.jdbc.CalciteResultSet.execute(CalciteResultSet.java:44) > at > org.apache.calcite.avatica.AvaticaConnection$1.execute(AvaticaConnection.java:667) > at > org.apache.calcite.jdbc.CalciteMetaImpl.prepareAndExecute(CalciteMetaImpl.java:619) > at > org.apache.calcite.avatica.AvaticaConnection.prepareAndExecuteInternal(AvaticaConnection.java:675) > at > org.apache.calcite.avatica.AvaticaStatement.executeInternal(AvaticaStatement.java:156) > ... 81 more > {code} -- This message was sent by Atlassian Jira (v8.3.4#803005)
[jira] [Comment Edited] (KYLIN-4682) java.lang.IndexOutOfBoundsException due to not setting havingFilter correctly
[ https://issues.apache.org/jira/browse/KYLIN-4682?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17172037#comment-17172037 ] Zhong Yanghong edited comment on KYLIN-4682 at 8/6/20, 5:11 AM: It seems rule *FilterAggregateTransposeRule* is not effective. It's better to set a lower number for computing the cost of *OLAPFilterRel* to make filter push down as much as possible. was (Author: yaho): It seems rule *FilterAggregateTransposeRule* is not effective. It's better to set a lower number for computing the cost of *OLAPFilterRel* > java.lang.IndexOutOfBoundsException due to not setting havingFilter correctly > - > > Key: KYLIN-4682 > URL: https://issues.apache.org/jira/browse/KYLIN-4682 > Project: Kylin > Issue Type: Improvement >Reporter: Zhong Yanghong >Assignee: Zhong Yanghong >Priority: Major > > SQL: > {code} > select LSTG_FORMAT_NAME, LEAF_CATEG_ID, sum(price) as gmv > from TEST_KYLIN_FACT > group by LSTG_FORMAT_NAME, LEAF_CATEG_ID > having LSTG_FORMAT_NAME = 'Auction' > {code} > Error stack trace: > {code} > Caused by: java.lang.IndexOutOfBoundsException: Index: 3, Size: 1 > at java.util.ArrayList.rangeCheck(ArrayList.java:657) > at java.util.ArrayList.get(ArrayList.java:433) > at > org.apache.kylin.storage.gtrecord.GTCubeStorageQueryBase.checkHavingCanPushDown(GTCubeStorageQueryBase.java:553) > at > org.apache.kylin.storage.gtrecord.GTCubeStorageQueryBase.getStorageQueryRequest(GTCubeStorageQueryBase.java:196) > at > org.apache.kylin.storage.gtrecord.GTCubeStorageQueryBase.searchInner(GTCubeStorageQueryBase.java:98) > at > org.apache.kylin.storage.gtrecord.GTCubeStorageQueryBase.search(GTCubeStorageQueryBase.java:90) > at > org.apache.kylin.storage.hybrid.HybridStorageQuery.search(HybridStorageQuery.java:53) > at > org.apache.kylin.query.enumerator.OLAPEnumerator.queryStorage(OLAPEnumerator.java:117) > at > org.apache.kylin.query.enumerator.OLAPEnumerator.moveNext(OLAPEnumerator.java:60) > at Baz$1$1.moveNext(Unknown 
Source) > at > org.apache.calcite.linq4j.EnumerableDefaults.groupBy_(EnumerableDefaults.java:825) > at > org.apache.calcite.linq4j.EnumerableDefaults.groupBy(EnumerableDefaults.java:761) > at > org.apache.calcite.linq4j.DefaultEnumerable.groupBy(DefaultEnumerable.java:302) > at Baz.bind(Unknown Source) > at > org.apache.calcite.jdbc.CalcitePrepare$CalciteSignature.enumerable(CalcitePrepare.java:365) > at > org.apache.calcite.jdbc.CalciteConnectionImpl.enumerable(CalciteConnectionImpl.java:301) > at > org.apache.calcite.jdbc.CalciteMetaImpl._createIterable(CalciteMetaImpl.java:559) > at > org.apache.calcite.jdbc.CalciteMetaImpl.createIterable(CalciteMetaImpl.java:550) > at > org.apache.calcite.avatica.AvaticaResultSet.execute(AvaticaResultSet.java:182) > at > org.apache.calcite.jdbc.CalciteResultSet.execute(CalciteResultSet.java:67) > at > org.apache.calcite.jdbc.CalciteResultSet.execute(CalciteResultSet.java:44) > at > org.apache.calcite.avatica.AvaticaConnection$1.execute(AvaticaConnection.java:667) > at > org.apache.calcite.jdbc.CalciteMetaImpl.prepareAndExecute(CalciteMetaImpl.java:619) > at > org.apache.calcite.avatica.AvaticaConnection.prepareAndExecuteInternal(AvaticaConnection.java:675) > at > org.apache.calcite.avatica.AvaticaStatement.executeInternal(AvaticaStatement.java:156) > ... 81 more > {code} -- This message was sent by Atlassian Jira (v8.3.4#803005)
[jira] [Commented] (KYLIN-4682) java.lang.IndexOutOfBoundsException due to not setting havingFilter correctly
[ https://issues.apache.org/jira/browse/KYLIN-4682?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17171306#comment-17171306 ] Zhong Yanghong commented on KYLIN-4682: --- For this kind of sql, we can extract filters on group by related columns and then push down these filters like {code} select LSTG_FORMAT_NAME, LEAF_CATEG_ID, sum(price) as gmv from TEST_KYLIN_FACT where LSTG_FORMAT_NAME = 'Auction' group by LSTG_FORMAT_NAME, LEAF_CATEG_ID {code} > java.lang.IndexOutOfBoundsException due to not setting havingFilter correctly > - > > Key: KYLIN-4682 > URL: https://issues.apache.org/jira/browse/KYLIN-4682 > Project: Kylin > Issue Type: Improvement >Reporter: Zhong Yanghong >Assignee: Zhong Yanghong >Priority: Major > > SQL: > {code} > select LSTG_FORMAT_NAME, LEAF_CATEG_ID, sum(price) as gmv > from TEST_KYLIN_FACT > group by LSTG_FORMAT_NAME, LEAF_CATEG_ID > having LSTG_FORMAT_NAME = 'Auction' > {code} > Error stack trace: > {code} > Caused by: java.lang.IndexOutOfBoundsException: Index: 3, Size: 1 > at java.util.ArrayList.rangeCheck(ArrayList.java:657) > at java.util.ArrayList.get(ArrayList.java:433) > at > org.apache.kylin.storage.gtrecord.GTCubeStorageQueryBase.checkHavingCanPushDown(GTCubeStorageQueryBase.java:553) > at > org.apache.kylin.storage.gtrecord.GTCubeStorageQueryBase.getStorageQueryRequest(GTCubeStorageQueryBase.java:196) > at > org.apache.kylin.storage.gtrecord.GTCubeStorageQueryBase.searchInner(GTCubeStorageQueryBase.java:98) > at > org.apache.kylin.storage.gtrecord.GTCubeStorageQueryBase.search(GTCubeStorageQueryBase.java:90) > at > org.apache.kylin.storage.hybrid.HybridStorageQuery.search(HybridStorageQuery.java:53) > at > org.apache.kylin.query.enumerator.OLAPEnumerator.queryStorage(OLAPEnumerator.java:117) > at > org.apache.kylin.query.enumerator.OLAPEnumerator.moveNext(OLAPEnumerator.java:60) > at Baz$1$1.moveNext(Unknown Source) > at > 
org.apache.calcite.linq4j.EnumerableDefaults.groupBy_(EnumerableDefaults.java:825) > at > org.apache.calcite.linq4j.EnumerableDefaults.groupBy(EnumerableDefaults.java:761) > at > org.apache.calcite.linq4j.DefaultEnumerable.groupBy(DefaultEnumerable.java:302) > at Baz.bind(Unknown Source) > at > org.apache.calcite.jdbc.CalcitePrepare$CalciteSignature.enumerable(CalcitePrepare.java:365) > at > org.apache.calcite.jdbc.CalciteConnectionImpl.enumerable(CalciteConnectionImpl.java:301) > at > org.apache.calcite.jdbc.CalciteMetaImpl._createIterable(CalciteMetaImpl.java:559) > at > org.apache.calcite.jdbc.CalciteMetaImpl.createIterable(CalciteMetaImpl.java:550) > at > org.apache.calcite.avatica.AvaticaResultSet.execute(AvaticaResultSet.java:182) > at > org.apache.calcite.jdbc.CalciteResultSet.execute(CalciteResultSet.java:67) > at > org.apache.calcite.jdbc.CalciteResultSet.execute(CalciteResultSet.java:44) > at > org.apache.calcite.avatica.AvaticaConnection$1.execute(AvaticaConnection.java:667) > at > org.apache.calcite.jdbc.CalciteMetaImpl.prepareAndExecute(CalciteMetaImpl.java:619) > at > org.apache.calcite.avatica.AvaticaConnection.prepareAndExecuteInternal(AvaticaConnection.java:675) > at > org.apache.calcite.avatica.AvaticaStatement.executeInternal(AvaticaStatement.java:156) > ... 81 more > {code} -- This message was sent by Atlassian Jira (v8.3.4#803005)
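The rewrite suggested in the comment above, moving a HAVING filter on a group-by column into WHERE, can be checked on a small data set. The sketch below assumes an in-memory SQLite table with only the three columns used in the query and a few made-up rows; the ORDER BY clauses are added only to make the comparison deterministic. It demonstrates the SQL equivalence, not Kylin's planner behaviour.

```python
import sqlite3

HAVING_SQL = """
select LSTG_FORMAT_NAME, LEAF_CATEG_ID, sum(price) as gmv
from TEST_KYLIN_FACT
group by LSTG_FORMAT_NAME, LEAF_CATEG_ID
having LSTG_FORMAT_NAME = 'Auction'
order by LSTG_FORMAT_NAME, LEAF_CATEG_ID
"""

WHERE_SQL = """
select LSTG_FORMAT_NAME, LEAF_CATEG_ID, sum(price) as gmv
from TEST_KYLIN_FACT
where LSTG_FORMAT_NAME = 'Auction'
group by LSTG_FORMAT_NAME, LEAF_CATEG_ID
order by LSTG_FORMAT_NAME, LEAF_CATEG_ID
"""


def run_both(rows):
    """Run the HAVING form and the pushed-down WHERE form on the same rows."""
    conn = sqlite3.connect(":memory:")
    conn.execute(
        "CREATE TABLE TEST_KYLIN_FACT "
        "(LSTG_FORMAT_NAME TEXT, LEAF_CATEG_ID INTEGER, price REAL)"
    )
    conn.executemany("INSERT INTO TEST_KYLIN_FACT VALUES (?, ?, ?)", rows)
    having_result = conn.execute(HAVING_SQL).fetchall()
    where_result = conn.execute(WHERE_SQL).fetchall()
    conn.close()
    return having_result, where_result


# Made-up sample rows for illustration only.
sample = [
    ("Auction", 1, 10.0),
    ("Auction", 1, 5.0),
    ("Auction", 2, 7.0),
    ("FP-GTC", 1, 3.0),
]
having_result, where_result = run_both(sample)
print(having_result == where_result, where_result)
```

Both forms return the same groups because the filter only references a grouping column, so it selects whole groups either before or after aggregation; the WHERE form simply lets the filter be applied before the aggregation runs.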
[jira] [Updated] (KYLIN-4682) java.lang.IndexOutOfBoundsException due to not setting havingFilter correctly
[ https://issues.apache.org/jira/browse/KYLIN-4682?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Zhong Yanghong updated KYLIN-4682: -- Description: SQL: {code} select LSTG_FORMAT_NAME, LEAF_CATEG_ID, sum(price) as gmv from TEST_KYLIN_FACT group by LSTG_FORMAT_NAME, LEAF_CATEG_ID having LSTG_FORMAT_NAME = 'Auction' {code} Error stack trace: {code} Caused by: java.lang.IndexOutOfBoundsException: Index: 3, Size: 1 at java.util.ArrayList.rangeCheck(ArrayList.java:657) at java.util.ArrayList.get(ArrayList.java:433) at org.apache.kylin.storage.gtrecord.GTCubeStorageQueryBase.checkHavingCanPushDown(GTCubeStorageQueryBase.java:553) at org.apache.kylin.storage.gtrecord.GTCubeStorageQueryBase.getStorageQueryRequest(GTCubeStorageQueryBase.java:196) at org.apache.kylin.storage.gtrecord.GTCubeStorageQueryBase.searchInner(GTCubeStorageQueryBase.java:98) at org.apache.kylin.storage.gtrecord.GTCubeStorageQueryBase.search(GTCubeStorageQueryBase.java:90) at org.apache.kylin.storage.hybrid.HybridStorageQuery.search(HybridStorageQuery.java:53) at org.apache.kylin.query.enumerator.OLAPEnumerator.queryStorage(OLAPEnumerator.java:117) at org.apache.kylin.query.enumerator.OLAPEnumerator.moveNext(OLAPEnumerator.java:60) at Baz$1$1.moveNext(Unknown Source) at org.apache.calcite.linq4j.EnumerableDefaults.groupBy_(EnumerableDefaults.java:825) at org.apache.calcite.linq4j.EnumerableDefaults.groupBy(EnumerableDefaults.java:761) at org.apache.calcite.linq4j.DefaultEnumerable.groupBy(DefaultEnumerable.java:302) at Baz.bind(Unknown Source) at org.apache.calcite.jdbc.CalcitePrepare$CalciteSignature.enumerable(CalcitePrepare.java:365) at org.apache.calcite.jdbc.CalciteConnectionImpl.enumerable(CalciteConnectionImpl.java:301) at org.apache.calcite.jdbc.CalciteMetaImpl._createIterable(CalciteMetaImpl.java:559) at org.apache.calcite.jdbc.CalciteMetaImpl.createIterable(CalciteMetaImpl.java:550) at org.apache.calcite.avatica.AvaticaResultSet.execute(AvaticaResultSet.java:182) at 
org.apache.calcite.jdbc.CalciteResultSet.execute(CalciteResultSet.java:67) at org.apache.calcite.jdbc.CalciteResultSet.execute(CalciteResultSet.java:44) at org.apache.calcite.avatica.AvaticaConnection$1.execute(AvaticaConnection.java:667) at org.apache.calcite.jdbc.CalciteMetaImpl.prepareAndExecute(CalciteMetaImpl.java:619) at org.apache.calcite.avatica.AvaticaConnection.prepareAndExecuteInternal(AvaticaConnection.java:675) at org.apache.calcite.avatica.AvaticaStatement.executeInternal(AvaticaStatement.java:156) ... 81 more {code} was: SQL: {code} select LSTG_FORMAT_NAME, sum(gmv) from ( select LSTG_FORMAT_NAME, LEAF_CATEG_ID, sum(price) as gmv from TEST_KYLIN_FACT group by LSTG_FORMAT_NAME, LEAF_CATEG_ID ) where LSTG_FORMAT_NAME = 'Auction' group by LSTG_FORMAT_NAME {code} Error stack trace: {code} Caused by: java.lang.IndexOutOfBoundsException: Index: 3, Size: 1 at java.util.ArrayList.rangeCheck(ArrayList.java:657) at java.util.ArrayList.get(ArrayList.java:433) at org.apache.kylin.storage.gtrecord.GTCubeStorageQueryBase.checkHavingCanPushDown(GTCubeStorageQueryBase.java:553) at org.apache.kylin.storage.gtrecord.GTCubeStorageQueryBase.getStorageQueryRequest(GTCubeStorageQueryBase.java:196) at org.apache.kylin.storage.gtrecord.GTCubeStorageQueryBase.searchInner(GTCubeStorageQueryBase.java:98) at org.apache.kylin.storage.gtrecord.GTCubeStorageQueryBase.search(GTCubeStorageQueryBase.java:90) at org.apache.kylin.storage.hybrid.HybridStorageQuery.search(HybridStorageQuery.java:53) at org.apache.kylin.query.enumerator.OLAPEnumerator.queryStorage(OLAPEnumerator.java:117) at org.apache.kylin.query.enumerator.OLAPEnumerator.moveNext(OLAPEnumerator.java:60) at Baz$1$1.moveNext(Unknown Source) at org.apache.calcite.linq4j.EnumerableDefaults.groupBy_(EnumerableDefaults.java:825) at org.apache.calcite.linq4j.EnumerableDefaults.groupBy(EnumerableDefaults.java:761) at org.apache.calcite.linq4j.DefaultEnumerable.groupBy(DefaultEnumerable.java:302) at Baz.bind(Unknown Source) at 
org.apache.calcite.jdbc.CalcitePrepare$CalciteSignature.enumerable(CalcitePrepare.java:365) at org.apache.calcite.jdbc.CalciteConnectionImpl.enumerable(CalciteConnectionImpl.java:301) at org.apache.calcite.jdbc.CalciteMetaImpl._createIterable(CalciteMetaImpl.java:559) at org.apache.calcite.jdbc.CalciteMetaImpl.createIterable(CalciteMetaImpl.java:550) at org.apache.calcite.avatica.AvaticaResultSet.execute(AvaticaResultSet.java:182) at org.apache.calcite.jdbc.CalciteResultSet.execute(CalciteResultSet.java:67) at org.apache.calcite.jd
[jira] [Updated] (KYLIN-4682) java.lang.IndexOutOfBoundsException due to not setting havingFilter correctly
[ https://issues.apache.org/jira/browse/KYLIN-4682?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Zhong Yanghong updated KYLIN-4682: -- Description: SQL: {code} select LSTG_FORMAT_NAME, sum(gmv) from ( select LSTG_FORMAT_NAME, LEAF_CATEG_ID, sum(price) as gmv from TEST_KYLIN_FACT group by LSTG_FORMAT_NAME, LEAF_CATEG_ID ) where LSTG_FORMAT_NAME = 'Auction' group by LSTG_FORMAT_NAME {code} Error stack trace: {code} Caused by: java.lang.IndexOutOfBoundsException: Index: 3, Size: 1 at java.util.ArrayList.rangeCheck(ArrayList.java:657) at java.util.ArrayList.get(ArrayList.java:433) at org.apache.kylin.storage.gtrecord.GTCubeStorageQueryBase.checkHavingCanPushDown(GTCubeStorageQueryBase.java:553) at org.apache.kylin.storage.gtrecord.GTCubeStorageQueryBase.getStorageQueryRequest(GTCubeStorageQueryBase.java:196) at org.apache.kylin.storage.gtrecord.GTCubeStorageQueryBase.searchInner(GTCubeStorageQueryBase.java:98) at org.apache.kylin.storage.gtrecord.GTCubeStorageQueryBase.search(GTCubeStorageQueryBase.java:90) at org.apache.kylin.storage.hybrid.HybridStorageQuery.search(HybridStorageQuery.java:53) at org.apache.kylin.query.enumerator.OLAPEnumerator.queryStorage(OLAPEnumerator.java:117) at org.apache.kylin.query.enumerator.OLAPEnumerator.moveNext(OLAPEnumerator.java:60) at Baz$1$1.moveNext(Unknown Source) at org.apache.calcite.linq4j.EnumerableDefaults.groupBy_(EnumerableDefaults.java:825) at org.apache.calcite.linq4j.EnumerableDefaults.groupBy(EnumerableDefaults.java:761) at org.apache.calcite.linq4j.DefaultEnumerable.groupBy(DefaultEnumerable.java:302) at Baz.bind(Unknown Source) at org.apache.calcite.jdbc.CalcitePrepare$CalciteSignature.enumerable(CalcitePrepare.java:365) at org.apache.calcite.jdbc.CalciteConnectionImpl.enumerable(CalciteConnectionImpl.java:301) at org.apache.calcite.jdbc.CalciteMetaImpl._createIterable(CalciteMetaImpl.java:559) at org.apache.calcite.jdbc.CalciteMetaImpl.createIterable(CalciteMetaImpl.java:550) at 
org.apache.calcite.avatica.AvaticaResultSet.execute(AvaticaResultSet.java:182) at org.apache.calcite.jdbc.CalciteResultSet.execute(CalciteResultSet.java:67) at org.apache.calcite.jdbc.CalciteResultSet.execute(CalciteResultSet.java:44) at org.apache.calcite.avatica.AvaticaConnection$1.execute(AvaticaConnection.java:667) at org.apache.calcite.jdbc.CalciteMetaImpl.prepareAndExecute(CalciteMetaImpl.java:619) at org.apache.calcite.avatica.AvaticaConnection.prepareAndExecuteInternal(AvaticaConnection.java:675) at org.apache.calcite.avatica.AvaticaStatement.executeInternal(AvaticaStatement.java:156) ... 81 more {code} > java.lang.IndexOutOfBoundsException due to not setting havingFilter correctly > - > > Key: KYLIN-4682 > URL: https://issues.apache.org/jira/browse/KYLIN-4682 > Project: Kylin > Issue Type: Improvement >Reporter: Zhong Yanghong >Assignee: Zhong Yanghong >Priority: Major > > SQL: > {code} > select LSTG_FORMAT_NAME, sum(gmv) > from > ( > select LSTG_FORMAT_NAME, LEAF_CATEG_ID, sum(price) as gmv > from TEST_KYLIN_FACT > group by LSTG_FORMAT_NAME, LEAF_CATEG_ID > ) > where LSTG_FORMAT_NAME = 'Auction' > group by LSTG_FORMAT_NAME > {code} > Error stack trace: > {code} > Caused by: java.lang.IndexOutOfBoundsException: Index: 3, Size: 1 > at java.util.ArrayList.rangeCheck(ArrayList.java:657) > at java.util.ArrayList.get(ArrayList.java:433) > at > org.apache.kylin.storage.gtrecord.GTCubeStorageQueryBase.checkHavingCanPushDown(GTCubeStorageQueryBase.java:553) > at > org.apache.kylin.storage.gtrecord.GTCubeStorageQueryBase.getStorageQueryRequest(GTCubeStorageQueryBase.java:196) > at > org.apache.kylin.storage.gtrecord.GTCubeStorageQueryBase.searchInner(GTCubeStorageQueryBase.java:98) > at > org.apache.kylin.storage.gtrecord.GTCubeStorageQueryBase.search(GTCubeStorageQueryBase.java:90) > at > org.apache.kylin.storage.hybrid.HybridStorageQuery.search(HybridStorageQuery.java:53) > at > 
org.apache.kylin.query.enumerator.OLAPEnumerator.queryStorage(OLAPEnumerator.java:117) > at > org.apache.kylin.query.enumerator.OLAPEnumerator.moveNext(OLAPEnumerator.java:60) > at Baz$1$1.moveNext(Unknown Source) > at > org.apache.calcite.linq4j.EnumerableDefaults.groupBy_(EnumerableDefaults.java:825) > at > org.apache.calcite.linq4j.EnumerableDefaults.groupBy(EnumerableDefaults.java:761) > at > org.apache.calcite.linq4j.DefaultEnumerable.groupBy(DefaultEnumerable.java:302) > at Baz.bind(Unknown Source) > at > org.apache.calcite.jdbc.CalcitePrepare$C
[jira] [Created] (KYLIN-4682) java.lang.IndexOutOfBoundsException due to not setting havingFilter correctly
Zhong Yanghong created KYLIN-4682: - Summary: java.lang.IndexOutOfBoundsException due to not setting havingFilter correctly Key: KYLIN-4682 URL: https://issues.apache.org/jira/browse/KYLIN-4682 Project: Kylin Issue Type: Improvement Reporter: Zhong Yanghong Assignee: Zhong Yanghong -- This message was sent by Atlassian Jira (v8.3.4#803005)
[jira] [Updated] (KYLIN-4658) Union all issue with regarding to windows function & aggregation
[ https://issues.apache.org/jira/browse/KYLIN-4658?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Zhong Yanghong updated KYLIN-4658: -- Summary: Union all issue with regarding to windows function & aggregation (was: Windows function does not work for union all) > Union all issue with regarding to windows function & aggregation > - > > Key: KYLIN-4658 > URL: https://issues.apache.org/jira/browse/KYLIN-4658 > Project: Kylin > Issue Type: Improvement >Reporter: Zhong Yanghong >Assignee: Zhong Yanghong >Priority: Major > > Test SQL: > {code} > select CNT, GMV, sum(GMV) over(partition by SLR_SEGMENT_CD) TOTAL_GMV, > SLR_SEGMENT_CD, LSTG_FORMAT_NAME > from > (select sum(PRICE) GMV, sum(ITEM_COUNT) CNT, SLR_SEGMENT_CD, LSTG_FORMAT_NAME > from TEST_KYLIN_FACT group by SLR_SEGMENT_CD, LSTG_FORMAT_NAME > UNION ALL > select sum(PRICE) GMV, sum(ITEM_COUNT) CNT, SLR_SEGMENT_CD, LSTG_FORMAT_NAME > from TEST_KYLIN_FACT group by SLR_SEGMENT_CD, LSTG_FORMAT_NAME) > order by TOTAL_GMV > {code} > > Exception: > {code} > Index: 2, Size: 2 while executing SQL: "select * from (select CNT, GMV, > sum(GMV) over(partition by SLR_SEGMENT_CD) TOTAL_GMV, SLR_SEGMENT_CD, > LSTG_FORMAT_NAME from (select sum(PRICE) GMV, sum(ITEM_COUNT) CNT, > SLR_SEGMENT_CD, LSTG_FORMAT_NAME from TEST_KYLIN_FACT group by > SLR_SEGMENT_CD, LSTG_FORMAT_NAME UNION ALL select sum(PRICE) GMV, > sum(ITEM_COUNT) CNT, SLR_SEGMENT_CD, LSTG_FORMAT_NAME from TEST_KYLIN_FACT > group by SLR_SEGMENT_CD, LSTG_FORMAT_NAME) order by TOTAL_GMV) limit 5" > {code} > Similar issue for the following sql: > {code} > select LSTG_FORMAT_NAME, >SLR_SEGMENT_CD, >CAL_DT, >sum(CNT) as CNT > from > (select LSTG_FORMAT_NAME, > SLR_SEGMENT_CD, > CAL_DT, > sum(ITEM_COUNT) CNT >from TEST_KYLIN_FACT >where LSTG_FORMAT_NAME = 'ABIN' >group by LSTG_FORMAT_NAME, > SLR_SEGMENT_CD, > CAL_DT >UNION ALL select 'NON-ABIN' as LSTG_FORMAT_NAME, > SLR_SEGMENT_CD, > CAL_DT, > case > when SLR_SEGMENT_CD > 1000 then CNT * 2 > else CNT * 3 
> end as CNT >from > (select SLR_SEGMENT_CD, > CAL_DT, > sum(ITEM_COUNT) CNT > from TEST_KYLIN_FACT > where LSTG_FORMAT_NAME <> 'ABIN' > group by SLR_SEGMENT_CD,CAL_DT)) > group by LSTG_FORMAT_NAME, > SLR_SEGMENT_CD, > CAL_DT > order by CNT > {code} -- This message was sent by Atlassian Jira (v8.3.4#803005)
[jira] [Updated] (KYLIN-4658) Union all issue with regarding to windows function & aggregation on
[ https://issues.apache.org/jira/browse/KYLIN-4658?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Zhong Yanghong updated KYLIN-4658: -- Summary: Union all issue with regarding to windows function & aggregation on (was: Union all issue with regarding to windows function & aggregation) > Union all issue with regarding to windows function & aggregation on > > > Key: KYLIN-4658 > URL: https://issues.apache.org/jira/browse/KYLIN-4658 > Project: Kylin > Issue Type: Improvement >Reporter: Zhong Yanghong >Assignee: Zhong Yanghong >Priority: Major > > Test SQL: > {code} > select CNT, GMV, sum(GMV) over(partition by SLR_SEGMENT_CD) TOTAL_GMV, > SLR_SEGMENT_CD, LSTG_FORMAT_NAME > from > (select sum(PRICE) GMV, sum(ITEM_COUNT) CNT, SLR_SEGMENT_CD, LSTG_FORMAT_NAME > from TEST_KYLIN_FACT group by SLR_SEGMENT_CD, LSTG_FORMAT_NAME > UNION ALL > select sum(PRICE) GMV, sum(ITEM_COUNT) CNT, SLR_SEGMENT_CD, LSTG_FORMAT_NAME > from TEST_KYLIN_FACT group by SLR_SEGMENT_CD, LSTG_FORMAT_NAME) > order by TOTAL_GMV > {code} > > Exception: > {code} > Index: 2, Size: 2 while executing SQL: "select * from (select CNT, GMV, > sum(GMV) over(partition by SLR_SEGMENT_CD) TOTAL_GMV, SLR_SEGMENT_CD, > LSTG_FORMAT_NAME from (select sum(PRICE) GMV, sum(ITEM_COUNT) CNT, > SLR_SEGMENT_CD, LSTG_FORMAT_NAME from TEST_KYLIN_FACT group by > SLR_SEGMENT_CD, LSTG_FORMAT_NAME UNION ALL select sum(PRICE) GMV, > sum(ITEM_COUNT) CNT, SLR_SEGMENT_CD, LSTG_FORMAT_NAME from TEST_KYLIN_FACT > group by SLR_SEGMENT_CD, LSTG_FORMAT_NAME) order by TOTAL_GMV) limit 5" > {code} > Similar issue for the following sql: > {code} > select LSTG_FORMAT_NAME, >SLR_SEGMENT_CD, >CAL_DT, >sum(CNT) as CNT > from > (select LSTG_FORMAT_NAME, > SLR_SEGMENT_CD, > CAL_DT, > sum(ITEM_COUNT) CNT >from TEST_KYLIN_FACT >where LSTG_FORMAT_NAME = 'ABIN' >group by LSTG_FORMAT_NAME, > SLR_SEGMENT_CD, > CAL_DT >UNION ALL select 'NON-ABIN' as LSTG_FORMAT_NAME, > SLR_SEGMENT_CD, > CAL_DT, > case > when SLR_SEGMENT_CD > 1000 then 
CNT * 2 > else CNT * 3 > end as CNT >from > (select SLR_SEGMENT_CD, > CAL_DT, > sum(ITEM_COUNT) CNT > from TEST_KYLIN_FACT > where LSTG_FORMAT_NAME <> 'ABIN' > group by SLR_SEGMENT_CD,CAL_DT)) > group by LSTG_FORMAT_NAME, > SLR_SEGMENT_CD, > CAL_DT > order by CNT > {code} -- This message was sent by Atlassian Jira (v8.3.4#803005)
[jira] [Updated] (KYLIN-4658) Windows function does not work for union all
[ https://issues.apache.org/jira/browse/KYLIN-4658?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Zhong Yanghong updated KYLIN-4658: -- Description: Test SQL: {code} select CNT, GMV, sum(GMV) over(partition by SLR_SEGMENT_CD) TOTAL_GMV, SLR_SEGMENT_CD, LSTG_FORMAT_NAME from (select sum(PRICE) GMV, sum(ITEM_COUNT) CNT, SLR_SEGMENT_CD, LSTG_FORMAT_NAME from TEST_KYLIN_FACT group by SLR_SEGMENT_CD, LSTG_FORMAT_NAME UNION ALL select sum(PRICE) GMV, sum(ITEM_COUNT) CNT, SLR_SEGMENT_CD, LSTG_FORMAT_NAME from TEST_KYLIN_FACT group by SLR_SEGMENT_CD, LSTG_FORMAT_NAME) order by TOTAL_GMV {code} Exception: {code} Index: 2, Size: 2 while executing SQL: "select * from (select CNT, GMV, sum(GMV) over(partition by SLR_SEGMENT_CD) TOTAL_GMV, SLR_SEGMENT_CD, LSTG_FORMAT_NAME from (select sum(PRICE) GMV, sum(ITEM_COUNT) CNT, SLR_SEGMENT_CD, LSTG_FORMAT_NAME from TEST_KYLIN_FACT group by SLR_SEGMENT_CD, LSTG_FORMAT_NAME UNION ALL select sum(PRICE) GMV, sum(ITEM_COUNT) CNT, SLR_SEGMENT_CD, LSTG_FORMAT_NAME from TEST_KYLIN_FACT group by SLR_SEGMENT_CD, LSTG_FORMAT_NAME) order by TOTAL_GMV) limit 5" {code} Similar issue for the following sql: {code} select LSTG_FORMAT_NAME, SLR_SEGMENT_CD, CAL_DT, sum(CNT) as CNT from (select LSTG_FORMAT_NAME, SLR_SEGMENT_CD, CAL_DT, sum(ITEM_COUNT) CNT from TEST_KYLIN_FACT where LSTG_FORMAT_NAME = 'ABIN' group by LSTG_FORMAT_NAME, SLR_SEGMENT_CD, CAL_DT UNION ALL select 'NON-ABIN' as LSTG_FORMAT_NAME, SLR_SEGMENT_CD, CAL_DT, case when SLR_SEGMENT_CD > 1000 then CNT * 2 else CNT * 3 end as CNT from (select SLR_SEGMENT_CD, CAL_DT, sum(ITEM_COUNT) CNT from TEST_KYLIN_FACT where LSTG_FORMAT_NAME <> 'ABIN' group by SLR_SEGMENT_CD,CAL_DT)) group by LSTG_FORMAT_NAME, SLR_SEGMENT_CD, CAL_DT order by CNT {code} was: Test SQL: {code} select CNT, GMV, sum(GMV) over(partition by SLR_SEGMENT_CD) TOTAL_GMV, SLR_SEGMENT_CD, LSTG_FORMAT_NAME from (select sum(PRICE) GMV, sum(ITEM_COUNT) CNT, SLR_SEGMENT_CD, LSTG_FORMAT_NAME from TEST_KYLIN_FACT 
group by SLR_SEGMENT_CD, LSTG_FORMAT_NAME UNION ALL select sum(PRICE) GMV, sum(ITEM_COUNT) CNT, SLR_SEGMENT_CD, LSTG_FORMAT_NAME from TEST_KYLIN_FACT group by SLR_SEGMENT_CD, LSTG_FORMAT_NAME) order by TOTAL_GMV {code} Exception: {code} Index: 2, Size: 2 while executing SQL: "select * from (select CNT, GMV, sum(GMV) over(partition by SLR_SEGMENT_CD) TOTAL_GMV, SLR_SEGMENT_CD, LSTG_FORMAT_NAME from (select sum(PRICE) GMV, sum(ITEM_COUNT) CNT, SLR_SEGMENT_CD, LSTG_FORMAT_NAME from TEST_KYLIN_FACT group by SLR_SEGMENT_CD, LSTG_FORMAT_NAME UNION ALL select sum(PRICE) GMV, sum(ITEM_COUNT) CNT, SLR_SEGMENT_CD, LSTG_FORMAT_NAME from TEST_KYLIN_FACT group by SLR_SEGMENT_CD, LSTG_FORMAT_NAME) order by TOTAL_GMV) limit 5" {code} > Windows function does not work for union all > > > Key: KYLIN-4658 > URL: https://issues.apache.org/jira/browse/KYLIN-4658 > Project: Kylin > Issue Type: Improvement >Reporter: Zhong Yanghong >Assignee: Zhong Yanghong >Priority: Major > > Test SQL: > {code} > select CNT, GMV, sum(GMV) over(partition by SLR_SEGMENT_CD) TOTAL_GMV, > SLR_SEGMENT_CD, LSTG_FORMAT_NAME > from > (select sum(PRICE) GMV, sum(ITEM_COUNT) CNT, SLR_SEGMENT_CD, LSTG_FORMAT_NAME > from TEST_KYLIN_FACT group by SLR_SEGMENT_CD, LSTG_FORMAT_NAME > UNION ALL > select sum(PRICE) GMV, sum(ITEM_COUNT) CNT, SLR_SEGMENT_CD, LSTG_FORMAT_NAME > from TEST_KYLIN_FACT group by SLR_SEGMENT_CD, LSTG_FORMAT_NAME) > order by TOTAL_GMV > {code} > > Exception: > {code} > Index: 2, Size: 2 while executing SQL: "select * from (select CNT, GMV, > sum(GMV) over(partition by SLR_SEGMENT_CD) TOTAL_GMV, SLR_SEGMENT_CD, > LSTG_FORMAT_NAME from (select sum(PRICE) GMV, sum(ITEM_COUNT) CNT, > SLR_SEGMENT_CD, LSTG_FORMAT_NAME from TEST_KYLIN_FACT group by > SLR_SEGMENT_CD, LSTG_FORMAT_NAME UNION ALL select sum(PRICE) GMV, > sum(ITEM_COUNT) CNT, SLR_SEGMENT_CD, LSTG_FORMAT_NAME from TEST_KYLIN_FACT > group by SLR_SEGMENT_CD, LSTG_FORMAT_NAME) order by TOTAL_GMV) limit 5" > {code} > Similar issue for the 
following sql: > {code} > select LSTG_FORMAT_NAME, >SLR_SEGMENT_CD, >CAL_DT, >sum(CNT) as CNT > from > (select LSTG_FORMAT_NAME, > SLR_SEGMENT_CD, > CAL_DT, > sum(ITEM_COUNT) CNT >from TEST_KYLIN_FACT >where LSTG_FORMAT_NAME = 'ABIN' >group by LSTG_FORMAT_NAME, > SLR_SEGMENT_CD, > CAL_DT >UNION ALL select 'NON-ABIN' as LSTG_FORMAT_NAME, > SLR_SEGMENT_C
[jira] [Updated] (KYLIN-4674) support cast in sum() expression
[ https://issues.apache.org/jira/browse/KYLIN-4674?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Zhong Yanghong updated KYLIN-4674: -- Description: Make it possible to support the following sql: {code} select LSTG_FORMAT_NAME, sum(cast(ITEM_COUNT as decimal(18, 6))), sum(case when LSTG_FORMAT_NAME = 'ABIN' then cast(ITEM_COUNT as decimal(18, 6)) when LSTG_FORMAT_NAME = 'Auction' then 2 end), sum(cast(case when LSTG_FORMAT_NAME = 'ABIN' then ITEM_COUNT when LSTG_FORMAT_NAME = 'Auction' then 2 end as decimal(18, 6))) from TEST_KYLIN_FACT group by LSTG_FORMAT_NAME {code} > support cast in sum() expression > > > Key: KYLIN-4674 > URL: https://issues.apache.org/jira/browse/KYLIN-4674 > Project: Kylin > Issue Type: Improvement >Reporter: Zhong Yanghong >Assignee: Zhong Yanghong >Priority: Major > > Make it possible to support the following sql: > {code} > select LSTG_FORMAT_NAME, > sum(cast(ITEM_COUNT as decimal(18, 6))), > sum(case > when LSTG_FORMAT_NAME = 'ABIN' then cast(ITEM_COUNT as decimal(18, 6)) > when LSTG_FORMAT_NAME = 'Auction' then 2 > end), > sum(cast(case > when LSTG_FORMAT_NAME = 'ABIN' then ITEM_COUNT > when LSTG_FORMAT_NAME = 'Auction' then 2 > end as decimal(18, 6))) > from TEST_KYLIN_FACT > group by LSTG_FORMAT_NAME > {code} -- This message was sent by Atlassian Jira (v8.3.4#803005)
[jira] [Created] (KYLIN-4674) support cast in sum() expression
Zhong Yanghong created KYLIN-4674: - Summary: support cast in sum() expression Key: KYLIN-4674 URL: https://issues.apache.org/jira/browse/KYLIN-4674 Project: Kylin Issue Type: Improvement Reporter: Zhong Yanghong Assignee: Zhong Yanghong -- This message was sent by Atlassian Jira (v8.3.4#803005)
[jira] [Updated] (KYLIN-4670) Improve query performance by reusing LookupStringTable and using multi-threads
[ https://issues.apache.org/jira/browse/KYLIN-4670?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Zhong Yanghong updated KYLIN-4670: -- Attachment: (was: Single-Thread-First-Time.png) > Improve query performance by reusing LookupStringTable and using multi-threads > -- > > Key: KYLIN-4670 > URL: https://issues.apache.org/jira/browse/KYLIN-4670 > Project: Kylin > Issue Type: Improvement >Reporter: Zhong Yanghong >Assignee: Zhong Yanghong >Priority: Major > Attachments: Five-Threads-First-Time.png, > Five-Threads-Second-Time.png, Single-Thread-First-Time.png, > Single-Thread-Second-Time.png > > > Consider a cube with 37 segments whose related snapshots all differ from one > another. Run a query on a lookup-table derived column. Here is the performance > comparison between a single thread and five threads. > * For the first query, without a snapshot table cache: > ** Single Thread > !Single-Thread-First-Time.png|width=600,height=200! > ** Five Threads > !Five-Threads-First-Time.png|width=600,height=200! > * For the second query, with the snapshot table cache: > ** Single Thread > !Single-Thread-Second-Time.png|width=600,height=200! > ** Five Threads > !Five-Threads-Second-Time.png|width=600,height=200! > -- This message was sent by Atlassian Jira (v8.3.4#803005)
[jira] [Updated] (KYLIN-4670) Improve query performance by reusing LookupStringTable and using multi-threads
[ https://issues.apache.org/jira/browse/KYLIN-4670?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Zhong Yanghong updated KYLIN-4670: -- Attachment: Single-Thread-First-Time.png Five-Threads-First-Time.png Five-Threads-Second-Time.png Single-Thread-Second-Time.png > Improve query performance by reusing LookupStringTable and using multi-threads > -- > > Key: KYLIN-4670 > URL: https://issues.apache.org/jira/browse/KYLIN-4670 > Project: Kylin > Issue Type: Improvement >Reporter: Zhong Yanghong >Assignee: Zhong Yanghong >Priority: Major > Attachments: Five-Threads-First-Time.png, > Five-Threads-Second-Time.png, Single-Thread-First-Time.png, > Single-Thread-Second-Time.png > > > Consider a cube with 37 segments whose related snapshots all differ from one > another. Run a query on a lookup-table derived column. Here is the performance > comparison between a single thread and five threads. > * For the first query, without a snapshot table cache: > ** Single Thread > !Single-Thread-First-Time.png|width=600,height=200! > ** Five Threads > !Five-Threads-First-Time.png|width=600,height=200! > * For the second query, with the snapshot table cache: > ** Single Thread > !Single-Thread-Second-Time.png|width=600,height=200! > ** Five Threads > !Five-Threads-Second-Time.png|width=600,height=200! > -- This message was sent by Atlassian Jira (v8.3.4#803005)
[jira] [Updated] (KYLIN-4670) Improve query performance by reusing LookupStringTable and using multi-threads
[ https://issues.apache.org/jira/browse/KYLIN-4670?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Zhong Yanghong updated KYLIN-4670: -- Attachment: (was: Single-Thread-Second-Time.png) > Improve query performance by reusing LookupStringTable and using multi-threads > -- > > Key: KYLIN-4670 > URL: https://issues.apache.org/jira/browse/KYLIN-4670 > Project: Kylin > Issue Type: Improvement >Reporter: Zhong Yanghong >Assignee: Zhong Yanghong >Priority: Major > Attachments: Five-Threads-First-Time.png, > Five-Threads-Second-Time.png, Single-Thread-First-Time.png, > Single-Thread-Second-Time.png > > > Consider a cube with 37 segments whose related snapshots all differ from one > another. Run a query on a lookup-table derived column. Here is the performance > comparison between a single thread and five threads. > * For the first query, without a snapshot table cache: > ** Single Thread > !Single-Thread-First-Time.png|width=600,height=200! > ** Five Threads > !Five-Threads-First-Time.png|width=600,height=200! > * For the second query, with the snapshot table cache: > ** Single Thread > !Single-Thread-Second-Time.png|width=600,height=200! > ** Five Threads > !Five-Threads-Second-Time.png|width=600,height=200! > -- This message was sent by Atlassian Jira (v8.3.4#803005)
[jira] [Updated] (KYLIN-4670) Improve query performance by reusing LookupStringTable and using multi-threads
[ https://issues.apache.org/jira/browse/KYLIN-4670?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Zhong Yanghong updated KYLIN-4670: -- Attachment: (was: Five-Threads-Second-Time.png) > Improve query performance by reusing LookupStringTable and using multi-threads > -- > > Key: KYLIN-4670 > URL: https://issues.apache.org/jira/browse/KYLIN-4670 > Project: Kylin > Issue Type: Improvement >Reporter: Zhong Yanghong >Assignee: Zhong Yanghong >Priority: Major > Attachments: Single-Thread-First-Time.png, > Single-Thread-Second-Time.png > > > Consider a cube with 37 segments whose related snapshots all differ from one > another. Run a query on a lookup-table derived column. Here is the performance > comparison between a single thread and five threads. > * For the first query, without a snapshot table cache: > ** Single Thread > !Single-Thread-First-Time.png|width=600,height=200! > ** Five Threads > !Five-Threads-First-Time.png|width=600,height=200! > * For the second query, with the snapshot table cache: > ** Single Thread > !Single-Thread-Second-Time.png|width=600,height=200! > ** Five Threads > !Five-Threads-Second-Time.png|width=600,height=200! > -- This message was sent by Atlassian Jira (v8.3.4#803005)
[jira] [Updated] (KYLIN-4670) Improve query performance by reusing LookupStringTable and using multi-threads
[ https://issues.apache.org/jira/browse/KYLIN-4670?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Zhong Yanghong updated KYLIN-4670: -- Attachment: (was: Five-Threads-First-Time.png) > Improve query performance by reusing LookupStringTable and using multi-threads > -- > > Key: KYLIN-4670 > URL: https://issues.apache.org/jira/browse/KYLIN-4670 > Project: Kylin > Issue Type: Improvement >Reporter: Zhong Yanghong >Assignee: Zhong Yanghong >Priority: Major > Attachments: Single-Thread-First-Time.png, > Single-Thread-Second-Time.png > > > Consider a cube with 37 segments whose related snapshots all differ from one > another. Run a query on a lookup-table derived column. Here is the performance > comparison between a single thread and five threads. > * For the first query, without a snapshot table cache: > ** Single Thread > !Single-Thread-First-Time.png|width=600,height=200! > ** Five Threads > !Five-Threads-First-Time.png|width=600,height=200! > * For the second query, with the snapshot table cache: > ** Single Thread > !Single-Thread-Second-Time.png|width=600,height=200! > ** Five Threads > !Five-Threads-Second-Time.png|width=600,height=200! > -- This message was sent by Atlassian Jira (v8.3.4#803005)
[jira] [Updated] (KYLIN-4670) Improve query performance by reusing LookupStringTable and using multi-threads
[ https://issues.apache.org/jira/browse/KYLIN-4670?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Zhong Yanghong updated KYLIN-4670: -- Description: Consider a cube with 37 segments whose related snapshots all differ from one another. Run a query on a lookup-table derived column. Here is the performance comparison between a single thread and five threads. * For the first query, without a snapshot table cache: ** Single Thread !Single-Thread-First-Time.png|width=600,height=200! ** Five Threads !Five-Threads-First-Time.png|width=600,height=200! * For the second query, with the snapshot table cache: ** Single Thread !Single-Thread-Second-Time.png|width=600,height=200! ** Five Threads !Five-Threads-Second-Time.png|width=600,height=200! was: Consider a cube with 37 segments whose related snapshots all differ from one another. > Improve query performance by reusing LookupStringTable and using multi-threads > -- > > Key: KYLIN-4670 > URL: https://issues.apache.org/jira/browse/KYLIN-4670 > Project: Kylin > Issue Type: Improvement >Reporter: Zhong Yanghong >Assignee: Zhong Yanghong >Priority: Major > Attachments: Five-Threads-First-Time.png, > Five-Threads-Second-Time.png, Single-Thread-First-Time.png, > Single-Thread-Second-Time.png > > > Consider a cube with 37 segments whose related snapshots all differ from one > another. Run a query on a lookup-table derived column. Here is the performance > comparison between a single thread and five threads. > * For the first query, without a snapshot table cache: > ** Single Thread > !Single-Thread-First-Time.png|width=600,height=200! > ** Five Threads > !Five-Threads-First-Time.png|width=600,height=200! > * For the second query, with the snapshot table cache: > ** Single Thread > !Single-Thread-Second-Time.png|width=600,height=200! > ** Five Threads > !Five-Threads-Second-Time.png|width=600,height=200! > -- This message was sent by Atlassian Jira (v8.3.4#803005)
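The KYLIN-4670 description compares a single thread with five threads, and a first query without the snapshot table cache with a second query that hits it. A minimal Java sketch of that idea follows, assuming each segment's snapshot can be loaded independently and keyed by a path string. The class `SnapshotLookupSketch`, its `loader` function, and the `Map<String, String>` stand-in for a lookup table are hypothetical names for illustration only; they are not Kylin's actual `LookupStringTable` API.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Map;
import java.util.concurrent.Callable;
import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.ExecutionException;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.Future;
import java.util.function.Function;

/** Hypothetical sketch: loads one lookup table per snapshot in parallel and reuses loaded tables. */
final class SnapshotLookupSketch {
    // Cache keyed by snapshot path, so a table loaded by one query is reused by later ones.
    private final Map<String, Map<String, String>> cache = new ConcurrentHashMap<>();
    // Assumed loader: turns a snapshot path into an in-memory lookup table.
    private final Function<String, Map<String, String>> loader;
    private final int threads;

    SnapshotLookupSketch(Function<String, Map<String, String>> loader, int threads) {
        this.loader = loader;
        this.threads = threads;
    }

    /** Returns one table per path, in input order; uncached tables are loaded on a fixed-size pool. */
    List<Map<String, String>> loadAll(List<String> snapshotPaths) {
        ExecutorService pool = Executors.newFixedThreadPool(threads);
        try {
            List<Future<Map<String, String>>> futures = new ArrayList<>();
            for (String path : snapshotPaths) {
                // computeIfAbsent runs the loader at most once per path, even under contention.
                Callable<Map<String, String>> task = () -> cache.computeIfAbsent(path, loader);
                futures.add(pool.submit(task));
            }
            List<Map<String, String>> tables = new ArrayList<>();
            for (Future<Map<String, String>> future : futures) {
                tables.add(future.get());
            }
            return tables;
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            throw new IllegalStateException("Interrupted while loading snapshots", e);
        } catch (ExecutionException e) {
            throw new IllegalStateException("Snapshot load failed", e.getCause());
        } finally {
            pool.shutdown();
        }
    }
}
```

With `threads` set to 1 or 5 and a list of 37 distinct snapshot paths, the first `loadAll` call corresponds to the uncached case and a second call on the same instance to the cached case. The sketch says nothing about the timings shown in the attached screenshots.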
[jira] [Updated] (KYLIN-4670) Improve query performance by reusing LookupStringTable and using multi-threads
[ https://issues.apache.org/jira/browse/KYLIN-4670?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Zhong Yanghong updated KYLIN-4670: -- Attachment: Single-Thread-First-Time.png Single-Thread-Second-Time.png Five-Threads-Second-Time.png Five-Threads-First-Time.png > Improve query performance by reusing LookupStringTable and using multi-threads > -- > > Key: KYLIN-4670 > URL: https://issues.apache.org/jira/browse/KYLIN-4670 > Project: Kylin > Issue Type: Improvement >Reporter: Zhong Yanghong >Assignee: Zhong Yanghong >Priority: Major > Attachments: Five-Threads-First-Time.png, > Five-Threads-Second-Time.png, Single-Thread-First-Time.png, > Single-Thread-Second-Time.png > > > Consider a cube with 37 segments whose related snapshots all differ from one > another. -- This message was sent by Atlassian Jira (v8.3.4#803005)