[jira] [Updated] (KYLIN-5404) Prefer to use common standard dependencies rather than self-maintained ones

2023-01-29 Thread Zhong Yanghong (Jira)


 [ 
https://issues.apache.org/jira/browse/KYLIN-5404?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Zhong Yanghong updated KYLIN-5404:
--
Description: 
Better to use common standard dependencies rather than self-maintained ones:
- Spark
- Calcite
- Google guava

> Prefer to use common standard dependencies rather than self-maintained ones
> ---
>
> Key: KYLIN-5404
> URL: https://issues.apache.org/jira/browse/KYLIN-5404
> Project: Kylin
>  Issue Type: Task
>Reporter: Zhong Yanghong
>Priority: Major
>
> Better to use common standard dependencies rather than self-maintained ones:
> - Spark
> - Calcite
> - Google guava



--
This message was sent by Atlassian Jira
(v8.20.10#820010)


[jira] [Updated] (KYLIN-5404) Prefer to use common standard dependencies rather than self-maintained ones

2023-01-29 Thread Zhong Yanghong (Jira)


 [ 
https://issues.apache.org/jira/browse/KYLIN-5404?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Zhong Yanghong updated KYLIN-5404:
--
Description: 
Better to use common standard dependencies rather than self-maintained ones:
- Spark
- Calcite
- Google guava
- Spring session

  was:
Better to use common standard dependencies rather than self-maintained ones:
- Spark
- Calcite
- Google guava


> Prefer to use common standard dependencies rather than self-maintained ones
> ---
>
> Key: KYLIN-5404
> URL: https://issues.apache.org/jira/browse/KYLIN-5404
> Project: Kylin
>  Issue Type: Task
>Reporter: Zhong Yanghong
>Priority: Major
>
> Better to use common standard dependencies rather than self-maintained ones:
> - Spark
> - Calcite
> - Google guava
> - Spring session





[jira] [Created] (KYLIN-5404) Prefer to use common standard dependencies rather than self-maintained ones

2023-01-29 Thread Zhong Yanghong (Jira)
Zhong Yanghong created KYLIN-5404:
-

 Summary: Prefer to use common standard dependencies rather than 
self-maintained ones
 Key: KYLIN-5404
 URL: https://issues.apache.org/jira/browse/KYLIN-5404
 Project: Kylin
  Issue Type: Task
Reporter: Zhong Yanghong








[jira] [Updated] (KYLIN-5321) Don't make the unprecomputed inner join defined in the model fail the match of the query without that join

2022-12-01 Thread Zhong Yanghong (Jira)


 [ 
https://issues.apache.org/jira/browse/KYLIN-5321?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Zhong Yanghong updated KYLIN-5321:
--
Labels: 5.0.0-alpha  (was: )

> Don't make the unprecomputed inner join defined in the model fail the match 
> of the query without that join 
> ---
>
> Key: KYLIN-5321
> URL: https://issues.apache.org/jira/browse/KYLIN-5321
> Project: Kylin
>  Issue Type: Improvement
>  Components: Query Engine
>Reporter: Zhong Yanghong
>Priority: Major
>  Labels: 5.0.0-alpha
>
> Given a model, *A inner join B && B inner join C*, in which *B inner join C* 
> is unprecomputed.
> This model should be able to match the queries *A inner join B* no matter if 
> *inner-partial-match-join* is true or false. When *inner-partial-match-join* 
> is false, then this model should not match *from A* only.
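
The expected behavior above can be read as a small matching rule. The following Java sketch is one interpretation and not Kylin's actual matcher: the class `JoinMatchSketch`, the method `matches`, and the encoding of joins as strings such as "A-B" are hypothetical. It assumes that unprecomputed joins are ignored when the model is compared with the query, and that with *inner-partial-match-join* enabled the query may use any subset of the remaining joins. Queries that use the unprecomputed join itself are out of scope here.

```java
import java.util.HashSet;
import java.util.Set;

final class JoinMatchSketch {
    /**
     * modelJoins: all inner joins defined in the model.
     * unprecomputed: the subset of modelJoins that is not precomputed.
     * queryJoins: the inner joins used by the query.
     */
    static boolean matches(Set<String> modelJoins, Set<String> unprecomputed,
                           Set<String> queryJoins, boolean innerPartialMatchJoin) {
        // Assumption: an unprecomputed join never blocks a match, so drop it first.
        Set<String> required = new HashSet<>(modelJoins);
        required.removeAll(unprecomputed);
        if (innerPartialMatchJoin) {
            // Partial match: the query may use any subset of the precomputed joins.
            return required.containsAll(queryJoins);
        }
        // Strict match: the query must use exactly the precomputed joins.
        return required.equals(queryJoins);
    }

    public static void main(String[] args) {
        Set<String> model = Set.of("A-B", "B-C");
        Set<String> unprecomputed = Set.of("B-C");
        System.out.println(matches(model, unprecomputed, Set.of("A-B"), true));
        System.out.println(matches(model, unprecomputed, Set.of("A-B"), false));
        System.out.println(matches(model, unprecomputed, Set.<String>of(), false));
    }
}
```

Under these assumptions the query *A inner join B* matches in both modes, while *from A* alone is rejected when *inner-partial-match-join* is false.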





[jira] [Updated] (KYLIN-5321) Don't make the unprecomputed inner join defined in the model fail the match of the query without that join

2022-12-01 Thread Zhong Yanghong (Jira)


 [ 
https://issues.apache.org/jira/browse/KYLIN-5321?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Zhong Yanghong updated KYLIN-5321:
--
Description: 
Given a model, *A inner join B && B inner join C*, in which *B inner join C* is 
unprecomputed.

This model should be able to match the queries *A inner join B* no matter if 
*inner-partial-match-join* is true or false. When *inner-partial-match-join* is 
false, then this model should not match *from A* only.

  was:
Given a model, *A inner join B && B inner join C*, in which *B inner join C* is 
unprecomputed.

This model should be able to match the queries *A inner join B* no matter if 
*inner-partial-match-join* is true or false.


> Don't make the unprecomputed inner join defined in the model fail the match 
> of the query without that join 
> ---
>
> Key: KYLIN-5321
> URL: https://issues.apache.org/jira/browse/KYLIN-5321
> Project: Kylin
>  Issue Type: Improvement
>  Components: Query Engine
>Reporter: Zhong Yanghong
>Priority: Major
>
> Given a model, *A inner join B && B inner join C*, in which *B inner join C* 
> is unprecomputed.
> This model should be able to match the queries *A inner join B* no matter if 
> *inner-partial-match-join* is true or false. When *inner-partial-match-join* 
> is false, then this model should not match *from A* only.





[jira] [Updated] (KYLIN-5321) Don't make the unprecomputed inner join defined in the model fail the match of the query without that join

2022-12-01 Thread Zhong Yanghong (Jira)


 [ 
https://issues.apache.org/jira/browse/KYLIN-5321?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Zhong Yanghong updated KYLIN-5321:
--
Description: 
Given a model, *A inner join B && B inner join C*, in which *B inner join C* is 
unprecomputed.

This model should be able to match the queries *A inner join B* no matter if 
*inner-partial-match-join* is true or false.

  was:
Given a model, *A inner join B && B inner join C*, in which *B inner join C* is 
unprecomputed.

This model should be able to match the queries *A inner join B*.


> Don't make the unprecomputed inner join defined in the model fail the match 
> of the query without that join 
> ---
>
> Key: KYLIN-5321
> URL: https://issues.apache.org/jira/browse/KYLIN-5321
> Project: Kylin
>  Issue Type: Improvement
>  Components: Query Engine
>Reporter: Zhong Yanghong
>Priority: Major
>
> Given a model, *A inner join B && B inner join C*, in which *B inner join C* 
> is unprecomputed.
> This model should be able to match the queries *A inner join B* no matter if 
> *inner-partial-match-join* is true or false.





[jira] [Updated] (KYLIN-5321) Don't make the unprecomputed inner join defined in the model fail the match of the query without that join

2022-12-01 Thread Zhong Yanghong (Jira)


 [ 
https://issues.apache.org/jira/browse/KYLIN-5321?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Zhong Yanghong updated KYLIN-5321:
--
Description: 
Given a model, *A inner join B && B inner join C*, in which *B inner join C* is 
unprecomputed.

This model should be able to match the queries *A inner join B*.

> Don't make the unprecomputed inner join defined in the model fail the match 
> of the query without that join 
> ---
>
> Key: KYLIN-5321
> URL: https://issues.apache.org/jira/browse/KYLIN-5321
> Project: Kylin
>  Issue Type: Improvement
>  Components: Query Engine
>Reporter: Zhong Yanghong
>Priority: Major
>
> Given a model, *A inner join B && B inner join C*, in which *B inner join C* 
> is unprecomputed.
> This model should be able to match the queries *A inner join B*.





[jira] [Created] (KYLIN-5321) Don't make the unprecomputed inner join defined in the model fail the match of the query without that join

2022-12-01 Thread Zhong Yanghong (Jira)
Zhong Yanghong created KYLIN-5321:
-

 Summary: Don't make the unprecomputed inner join defined in the 
model fail the match of the query without that join 
 Key: KYLIN-5321
 URL: https://issues.apache.org/jira/browse/KYLIN-5321
 Project: Kylin
  Issue Type: Improvement
  Components: Query Engine
Reporter: Zhong Yanghong








[jira] [Updated] (KYLIN-5307) [kylin5] Distinguish sum(1) and count(*)

2022-11-28 Thread Zhong Yanghong (Jira)


 [ 
https://issues.apache.org/jira/browse/KYLIN-5307?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Zhong Yanghong updated KYLIN-5307:
--
Labels: 5.0.0-alpha  (was: )

> [kylin5] Distinguish sum(1) and count(*)
> 
>
> Key: KYLIN-5307
> URL: https://issues.apache.org/jira/browse/KYLIN-5307
> Project: Kylin
>  Issue Type: Improvement
>Reporter: Zhong Yanghong
>Priority: Major
>  Labels: 5.0.0-alpha
>
> Suppose column LO_ORDERDATE contains no null values. 
> The following SQL should return null
> {code}
> select sum(1)
> from LINEORDER
> INNER JOIN CUSTOMER
> ON LINEORDER.LO_CUSTKEY=CUSTOMER.C_CUSTKEY
> INNER JOIN SUPPLIER
> ON LINEORDER.LO_SUPPKEY=SUPPLIER.S_SUPPKEY
> INNER JOIN PART
> ON LINEORDER.LO_PARTKEY=PART.P_PARTKEY
> INNER JOIN DATES
> ON LINEORDER.LO_ORDERDATE=DATES.D_DATEKEY
> where LO_ORDERDATE is null
> {code}
> while the following SQL should return 0
> {code}
> select count(*)
> from LINEORDER
> INNER JOIN CUSTOMER
> ON LINEORDER.LO_CUSTKEY=CUSTOMER.C_CUSTKEY
> INNER JOIN SUPPLIER
> ON LINEORDER.LO_SUPPKEY=SUPPLIER.S_SUPPKEY
> INNER JOIN PART
> ON LINEORDER.LO_PARTKEY=PART.P_PARTKEY
> INNER JOIN DATES
> ON LINEORDER.LO_ORDERDATE=DATES.D_DATEKEY
> where LO_ORDERDATE is null
> {code}
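
The difference between the two queries follows from standard SQL semantics: SUM over zero rows is NULL, while COUNT(*) over zero rows is 0. The Java sketch below models this on an in-memory list of already filtered rows. The class and method names are hypothetical and unrelated to Kylin's query engine; SQL NULL is modeled as a Java `null`.

```java
import java.util.List;

final class EmptyAggregateSketch {
    /** Models sum(1): NULL (here: null) when no row survives the filter. */
    static Long sumOfOnes(List<?> filteredRows) {
        if (filteredRows.isEmpty()) {
            return null; // SUM over an empty input is NULL, not 0
        }
        return (long) filteredRows.size(); // each row contributes the constant 1
    }

    /** Models count(*): always a number, 0 when no row survives the filter. */
    static long countStar(List<?> filteredRows) {
        return filteredRows.size();
    }

    public static void main(String[] args) {
        List<String> noRows = List.of(); // "where LO_ORDERDATE is null" keeps nothing
        System.out.println("sum(1)   = " + sumOfOnes(noRows));
        System.out.println("count(*) = " + countStar(noRows));
    }
}
```

An engine that answers sum(1) from a precomputed count measure would therefore need to map a count of 0 back to NULL.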





[jira] [Updated] (KYLIN-5306) [kylin5] Allow more inner join keys in sql than the model

2022-11-28 Thread Zhong Yanghong (Jira)


 [ 
https://issues.apache.org/jira/browse/KYLIN-5306?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Zhong Yanghong updated KYLIN-5306:
--
Labels: 5.0.0-alpha  (was: )

> [kylin5] Allow more inner join keys in sql than the model
> -
>
> Key: KYLIN-5306
> URL: https://issues.apache.org/jira/browse/KYLIN-5306
> Project: Kylin
>  Issue Type: Improvement
>Reporter: Zhong Yanghong
>Priority: Major
>  Labels: 5.0.0-alpha
>
> The join in the model is defined as follows:
> {code}
> from LINEORDER
> INNER JOIN CUSTOMER
> ON LINEORDER.LO_CUSTKEY=CUSTOMER.C_CUSTKEY
> INNER JOIN SUPPLIER
> ON LINEORDER.LO_SUPPKEY=SUPPLIER.S_SUPPKEY
> INNER JOIN PART
> ON LINEORDER.LO_PARTKEY=PART.P_PARTKEY
> INNER JOIN DATES
> ON LINEORDER.LO_ORDERDATE=DATES.D_DATEKEY
> {code}
> The join in SQL is as follows:
> {code}
> from LINEORDER
> INNER JOIN CUSTOMER
> ON LINEORDER.LO_CUSTKEY=CUSTOMER.C_CUSTKEY and LO_SHIPMODE = C_NATION 
> INNER JOIN SUPPLIER
> ON LINEORDER.LO_SUPPKEY=SUPPLIER.S_SUPPKEY
> INNER JOIN PART
> ON LINEORDER.LO_PARTKEY=PART.P_PARTKEY
> INNER JOIN DATES
> ON LINEORDER.LO_ORDERDATE=DATES.D_DATEKEY
> {code}
> Ideally, the SQL can be transformed into 
> {code}
> from LINEORDER
> INNER JOIN CUSTOMER
> ON LINEORDER.LO_CUSTKEY=CUSTOMER.C_CUSTKEY 
> INNER JOIN SUPPLIER
> ON LINEORDER.LO_SUPPKEY=SUPPLIER.S_SUPPKEY
> INNER JOIN PART
> ON LINEORDER.LO_PARTKEY=PART.P_PARTKEY
> INNER JOIN DATES
> ON LINEORDER.LO_ORDERDATE=DATES.D_DATEKEY
> where LO_SHIPMODE = C_NATION 
> {code}
> so that the model will be able to match the SQL.
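
The rewrite described above can be sketched as splitting each ON clause into the keys the model knows and a residual filter that moves to WHERE. This is a minimal Java illustration, not Kylin's implementation: `JoinConditionSplitSketch`, `Split`, and the comparison of conditions as plain strings are hypothetical simplifications. Moving a condition from ON to WHERE preserves the result only for inner joins, which is the case discussed here.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Set;

final class JoinConditionSplitSketch {
    /** Holds the two parts of a split ON clause. */
    static final class Split {
        final List<String> joinKeys = new ArrayList<>();
        final List<String> whereFilters = new ArrayList<>();
    }

    /**
     * Keeps conditions that the model defines as join keys and moves every
     * other condition into the WHERE filter list (valid for inner joins only).
     */
    static Split split(Set<String> modelJoinKeys, List<String> queryOnConditions) {
        Split result = new Split();
        for (String condition : queryOnConditions) {
            if (modelJoinKeys.contains(condition)) {
                result.joinKeys.add(condition);
            } else {
                result.whereFilters.add(condition);
            }
        }
        return result;
    }

    public static void main(String[] args) {
        Set<String> modelKeys = Set.of("LINEORDER.LO_CUSTKEY=CUSTOMER.C_CUSTKEY");
        List<String> onClause = List.of(
                "LINEORDER.LO_CUSTKEY=CUSTOMER.C_CUSTKEY",
                "LO_SHIPMODE = C_NATION");
        Split split = split(modelKeys, onClause);
        System.out.println("ON    " + split.joinKeys);
        System.out.println("WHERE " + split.whereFilters);
    }
}
```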





[jira] [Updated] (KYLIN-5307) [kylin5] Distinguish sum(1) and count(*)

2022-11-27 Thread Zhong Yanghong (Jira)


 [ 
https://issues.apache.org/jira/browse/KYLIN-5307?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Zhong Yanghong updated KYLIN-5307:
--
Description: 
Suppose column LO_ORDERDATE contains no null values. 

The following SQL should return null
{code}
select sum(1)
from LINEORDER
INNER JOIN CUSTOMER
ON LINEORDER.LO_CUSTKEY=CUSTOMER.C_CUSTKEY
INNER JOIN SUPPLIER
ON LINEORDER.LO_SUPPKEY=SUPPLIER.S_SUPPKEY
INNER JOIN PART
ON LINEORDER.LO_PARTKEY=PART.P_PARTKEY
INNER JOIN DATES
ON LINEORDER.LO_ORDERDATE=DATES.D_DATEKEY
where LO_ORDERDATE is null
{code}

while the following SQL should return 0
{code}
select count(*)
from LINEORDER
INNER JOIN CUSTOMER
ON LINEORDER.LO_CUSTKEY=CUSTOMER.C_CUSTKEY
INNER JOIN SUPPLIER
ON LINEORDER.LO_SUPPKEY=SUPPLIER.S_SUPPKEY
INNER JOIN PART
ON LINEORDER.LO_PARTKEY=PART.P_PARTKEY
INNER JOIN DATES
ON LINEORDER.LO_ORDERDATE=DATES.D_DATEKEY
where LO_ORDERDATE is null
{code}

> [kylin5] Distinguish sum(1) and count(*)
> 
>
> Key: KYLIN-5307
> URL: https://issues.apache.org/jira/browse/KYLIN-5307
> Project: Kylin
>  Issue Type: Improvement
>Reporter: Zhong Yanghong
>Priority: Major
>
> Suppose column LO_ORDERDATE contains no null values. 
> The following SQL should return null
> {code}
> select sum(1)
> from LINEORDER
> INNER JOIN CUSTOMER
> ON LINEORDER.LO_CUSTKEY=CUSTOMER.C_CUSTKEY
> INNER JOIN SUPPLIER
> ON LINEORDER.LO_SUPPKEY=SUPPLIER.S_SUPPKEY
> INNER JOIN PART
> ON LINEORDER.LO_PARTKEY=PART.P_PARTKEY
> INNER JOIN DATES
> ON LINEORDER.LO_ORDERDATE=DATES.D_DATEKEY
> where LO_ORDERDATE is null
> {code}
> while the following SQL should return 0
> {code}
> select count(*)
> from LINEORDER
> INNER JOIN CUSTOMER
> ON LINEORDER.LO_CUSTKEY=CUSTOMER.C_CUSTKEY
> INNER JOIN SUPPLIER
> ON LINEORDER.LO_SUPPKEY=SUPPLIER.S_SUPPKEY
> INNER JOIN PART
> ON LINEORDER.LO_PARTKEY=PART.P_PARTKEY
> INNER JOIN DATES
> ON LINEORDER.LO_ORDERDATE=DATES.D_DATEKEY
> where LO_ORDERDATE is null
> {code}





[jira] [Created] (KYLIN-5307) [kylin5] Distinguish sum(1) and count(*)

2022-11-27 Thread Zhong Yanghong (Jira)
Zhong Yanghong created KYLIN-5307:
-

 Summary: [kylin5] Distinguish sum(1) and count(*)
 Key: KYLIN-5307
 URL: https://issues.apache.org/jira/browse/KYLIN-5307
 Project: Kylin
  Issue Type: Improvement
Reporter: Zhong Yanghong








[jira] [Updated] (KYLIN-5306) [kylin5] Allow more inner join keys in sql than the model

2022-11-27 Thread Zhong Yanghong (Jira)


 [ 
https://issues.apache.org/jira/browse/KYLIN-5306?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Zhong Yanghong updated KYLIN-5306:
--
Description: 
The join in the model is defined as follows:
{code}
from LINEORDER
INNER JOIN CUSTOMER
ON LINEORDER.LO_CUSTKEY=CUSTOMER.C_CUSTKEY
INNER JOIN SUPPLIER
ON LINEORDER.LO_SUPPKEY=SUPPLIER.S_SUPPKEY
INNER JOIN PART
ON LINEORDER.LO_PARTKEY=PART.P_PARTKEY
INNER JOIN DATES
ON LINEORDER.LO_ORDERDATE=DATES.D_DATEKEY
{code}

The join in SQL is as follows:
{code}
from LINEORDER
INNER JOIN CUSTOMER
ON LINEORDER.LO_CUSTKEY=CUSTOMER.C_CUSTKEY and LO_SHIPMODE = C_NATION 
INNER JOIN SUPPLIER
ON LINEORDER.LO_SUPPKEY=SUPPLIER.S_SUPPKEY
INNER JOIN PART
ON LINEORDER.LO_PARTKEY=PART.P_PARTKEY
INNER JOIN DATES
ON LINEORDER.LO_ORDERDATE=DATES.D_DATEKEY
{code}

Ideally, the SQL can be transformed into 
{code}
from LINEORDER
INNER JOIN CUSTOMER
ON LINEORDER.LO_CUSTKEY=CUSTOMER.C_CUSTKEY 
INNER JOIN SUPPLIER
ON LINEORDER.LO_SUPPKEY=SUPPLIER.S_SUPPKEY
INNER JOIN PART
ON LINEORDER.LO_PARTKEY=PART.P_PARTKEY
INNER JOIN DATES
ON LINEORDER.LO_ORDERDATE=DATES.D_DATEKEY
where LO_SHIPMODE = C_NATION 
{code}
so that the model will be able to match the SQL.

> [kylin5] Allow more inner join keys in sql than the model
> -
>
> Key: KYLIN-5306
> URL: https://issues.apache.org/jira/browse/KYLIN-5306
> Project: Kylin
>  Issue Type: Improvement
>Reporter: Zhong Yanghong
>Priority: Major
>
> The join in the model is defined as follows:
> {code}
> from LINEORDER
> INNER JOIN CUSTOMER
> ON LINEORDER.LO_CUSTKEY=CUSTOMER.C_CUSTKEY
> INNER JOIN SUPPLIER
> ON LINEORDER.LO_SUPPKEY=SUPPLIER.S_SUPPKEY
> INNER JOIN PART
> ON LINEORDER.LO_PARTKEY=PART.P_PARTKEY
> INNER JOIN DATES
> ON LINEORDER.LO_ORDERDATE=DATES.D_DATEKEY
> {code}
> The join in SQL is as follows:
> {code}
> from LINEORDER
> INNER JOIN CUSTOMER
> ON LINEORDER.LO_CUSTKEY=CUSTOMER.C_CUSTKEY and LO_SHIPMODE = C_NATION 
> INNER JOIN SUPPLIER
> ON LINEORDER.LO_SUPPKEY=SUPPLIER.S_SUPPKEY
> INNER JOIN PART
> ON LINEORDER.LO_PARTKEY=PART.P_PARTKEY
> INNER JOIN DATES
> ON LINEORDER.LO_ORDERDATE=DATES.D_DATEKEY
> {code}
> Ideally, the SQL can be transformed into 
> {code}
> from LINEORDER
> INNER JOIN CUSTOMER
> ON LINEORDER.LO_CUSTKEY=CUSTOMER.C_CUSTKEY 
> INNER JOIN SUPPLIER
> ON LINEORDER.LO_SUPPKEY=SUPPLIER.S_SUPPKEY
> INNER JOIN PART
> ON LINEORDER.LO_PARTKEY=PART.P_PARTKEY
> INNER JOIN DATES
> ON LINEORDER.LO_ORDERDATE=DATES.D_DATEKEY
> where LO_SHIPMODE = C_NATION 
> {code}
> so that the model will be able to match the SQL.





[jira] [Created] (KYLIN-5306) [kylin5] Allow more inner join keys in sql than the model

2022-11-27 Thread Zhong Yanghong (Jira)
Zhong Yanghong created KYLIN-5306:
-

 Summary: [kylin5] Allow more inner join keys in sql than the model
 Key: KYLIN-5306
 URL: https://issues.apache.org/jira/browse/KYLIN-5306
 Project: Kylin
  Issue Type: Improvement
Reporter: Zhong Yanghong








[jira] [Commented] (KYLIN-5274) Improve performance of getSubstitutor

2022-10-08 Thread Zhong Yanghong (Jira)


[ 
https://issues.apache.org/jira/browse/KYLIN-5274?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17614454#comment-17614454
 ] 

Zhong Yanghong commented on KYLIN-5274:
---

Thanks [~xxyu] for providing the code of micro-benchmark (y)

> Improve performance of getSubstitutor
> -
>
> Key: KYLIN-5274
> URL: https://issues.apache.org/jira/browse/KYLIN-5274
> Project: Kylin
>  Issue Type: Improvement
>Reporter: Xiaoxiang Yu
>Assignee: Zhong Yanghong
>Priority: Major
> Fix For: 5.0-alpha
>
> Attachments: image-2022-10-08-17-35-34-500.png, 
> image-2022-10-08-17-50-07-929.png
>
>
> h3. Background
> The following code is called on every call of *_KylinConfig#getOptional_* and 
> needs to be optimized. In some cases, optimizing it will improve query performance.
>  
> {code:java}
> protected StrSubstitutor getSubstitutor() {
>     // env > properties
>     final Map all = Maps.newHashMap(); // create a new map every time
>     all.putAll((Map) properties);
>     all.putAll(STATIC_SYSTEM_ENV);
>     return new StrSubstitutor(all);
> }
> {code}
>  
>  
> h3. How to fix
> 1. Do not create a new map on each call.
> 2. Do not use Properties, because it extends _Hashtable_, whose methods are synchronized.
> h3. Micro Benchmark
> Use JMH to measure performance (average time):
>  
> {code:java}
> import java.util.concurrent.TimeUnit;
>
> import org.apache.kylin.common.KylinConfig;
> import org.apache.kylin.common.util.NLocalFileMetadataTestCase;
> import org.openjdk.jmh.annotations.Benchmark;
> import org.openjdk.jmh.annotations.BenchmarkMode;
> import org.openjdk.jmh.annotations.Fork;
> import org.openjdk.jmh.annotations.Measurement;
> import org.openjdk.jmh.annotations.Mode;
> import org.openjdk.jmh.annotations.OutputTimeUnit;
> import org.openjdk.jmh.annotations.Scope;
> import org.openjdk.jmh.annotations.Setup;
> import org.openjdk.jmh.annotations.State;
> import org.openjdk.jmh.annotations.Threads;
> import org.openjdk.jmh.annotations.Warmup;
>
> @BenchmarkMode(Mode.AverageTime)
> @OutputTimeUnit(TimeUnit.MILLISECONDS)
> @Warmup(iterations = 1)
> @Measurement(iterations = 10, time = 10, timeUnit = TimeUnit.MILLISECONDS)
> @Threads(1)
> @Fork(value = 1, jvmArgs = {"-Xms2G", "-Xmx2G"})
> @State(Scope.Benchmark)
> public class KylinConfigBenchmark {
>
>     // Prepare the local test metadata once before the benchmark runs.
>     @Setup
>     public void setUp() throws Exception {
>         NLocalFileMetadataTestCase case1 = new NLocalFileMetadataTestCase();
>         case1.createTestMetadata();
>     }
>
>     // Read one config property repeatedly to measure the lookup cost.
>     @Benchmark
>     public void getProperty() {
>         KylinConfig config = KylinConfig.getInstanceFromEnv();
>         for (int i = 0; i <= 1000_000; i++) {
>             config.getJdbcDriverClass();
>         }
>     }
>
>     public static void main(String[] args) throws Exception {
>         org.openjdk.jmh.Main.main(args);
>     }
> }
> {code}
>  
>  
> h4. Before Applied
> !image-2022-10-08-17-35-34-500.png!
>  
> h4. After Applied
> !image-2022-10-08-17-50-07-929.png!
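
The first fix item in the quoted description (do not create a new map on each call) could look roughly like the following. This is a minimal sketch under stated assumptions, not the change that was actually committed: the class `CachedSubstitutionMapSketch` and its methods are hypothetical, plain `String` maps stand in for the real property store, and the merged map is rebuilt only when a property changes, so reads take no lock.

```java
import java.util.Collections;
import java.util.HashMap;
import java.util.Map;

final class CachedSubstitutionMapSketch {
    private final Map<String, String> properties = new HashMap<>(); // guarded by "this"
    private final Map<String, String> systemEnv;
    // Published through a volatile field so readers need no lock.
    private volatile Map<String, String> merged;

    CachedSubstitutionMapSketch(Map<String, String> systemEnv) {
        this.systemEnv = new HashMap<>(systemEnv);
        this.merged = buildMerged();
    }

    /** Writers rebuild the merged view once per change. */
    synchronized void setProperty(String key, String value) {
        properties.put(key, value);
        merged = buildMerged();
    }

    /** Readers reuse the cached view instead of merging on every call. */
    Map<String, String> substitutionMap() {
        return merged;
    }

    private Map<String, String> buildMerged() {
        Map<String, String> all = new HashMap<>(properties);
        all.putAll(systemEnv); // env > properties, as in the original method
        return Collections.unmodifiableMap(all);
    }

    public static void main(String[] args) {
        CachedSubstitutionMapSketch config =
                new CachedSubstitutionMapSketch(Map.of("KYLIN_HOME", "/opt/kylin"));
        config.setProperty("kylin.env", "DEV");
        System.out.println(config.substitutionMap());
        System.out.println(config.substitutionMap() == config.substitutionMap());
    }
}
```

The trade-off is a more expensive write path in exchange for a read path that allocates nothing, which fits a configuration object that is read far more often than it is modified.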





[jira] [Commented] (KYLIN-3430) Global Dictionary Cleanup

2022-05-09 Thread Zhong Yanghong (Jira)


[ 
https://issues.apache.org/jira/browse/KYLIN-3430?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17533639#comment-17533639
 ] 

Zhong Yanghong commented on KYLIN-3430:
---

The following code:
{code}
Set activeResources = Sets.newHashSet();
for (CubeInstance cube : cubeManager.reloadAndListAllCubes()) {
    activeResources.addAll(cube.getSnapshots().values());
    for (CubeSegment segment : cube.getSegments()) {
        activeResources.addAll(segment.getSnapshotPaths());
        activeResources.addAll(segment.getDictionaryPaths());
        activeResources.add(segment.getStatisticsResourcePath());
        for (String dictPath : segment.getDictionaryPaths()) {
            DictionaryInfo dictInfo = store.getResource(dictPath, DictionaryInfoSerializer.FULL_SERIALIZER);
            if ("org.apache.kylin.dict.AppendTrieDictionary"
                    .equals(dictInfo != null ? dictInfo.getDictionaryClass() : null)) {
{code}
makes the cleanup very slow, since every dictionary has to be loaded.

> Global Dictionary Cleanup
> -
>
> Key: KYLIN-3430
> URL: https://issues.apache.org/jira/browse/KYLIN-3430
> Project: Kylin
>  Issue Type: Improvement
>  Components: Tools, Build and Test
>Affects Versions: v2.1.0, v2.2.0, v2.3.0, v2.3.1, v2.4.0
>Reporter: Temple Zhou
>Assignee: Temple Zhou
>Priority: Major
> Fix For: v2.6.0
>
> Attachments: KYLIN-3430.master.001.patch
>
>
> I ran {{./bin/metastore.sh clean --delete true}} to clean up my Kylin 
> metadata, but afterwards the Global Dictionary still exists in my HDFS and 
> the size of directory "/kylin_metadata/resources/GlobalDict/dict" hasn't 
> shrunk.
>  
> BTW: I'm very sure that there are redundant Global Dictionaries.



--
This message was sent by Atlassian Jira
(v8.20.7#820007)


[jira] [Created] (KYLIN-5119) kylin-native branch for next generation development

2021-11-08 Thread Zhong Yanghong (Jira)
Zhong Yanghong created KYLIN-5119:
-

 Summary: kylin-native branch for next generation development
 Key: KYLIN-5119
 URL: https://issues.apache.org/jira/browse/KYLIN-5119
 Project: Kylin
  Issue Type: Task
Reporter: Zhong Yanghong






--
This message was sent by Atlassian Jira
(v8.20.1#820001)


[jira] [Comment Edited] (KYLIN-4985) optimize kylin planner by delete unnecessary cuboids

2021-06-27 Thread Zhong Yanghong (Jira)


[ 
https://issues.apache.org/jira/browse/KYLIN-4985?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17370190#comment-17370190
 ] 

Zhong Yanghong edited comment on KYLIN-4985 at 6/27/21, 10:37 AM:
--

Hi [~tianhui5], the cuboids recommended by the cube planner algorithms are 
redundant. The redundancy is controlled by 
*kylin.cube.cubeplanner.expansion-threshold*.

One more thing about "get many cuboids in recommand result that never hitted by 
my history queries": if cuboid A is the parent of cuboid B and their row 
counts are similar, then even when your history queries always hit cuboid B, 
Kylin should prefer to build cuboid A.

For the weighting change, could you explain more about the mathematical theory? 
At first glance, it does not follow the monotonicity of the probability.


was (Author: yaho):
Hi [~tianhui5], the cuboids recommended by the cube planner algorithms are 
redundant. The redundancy is controlled by 
*kylin.cube.cubeplanner.expansion-threshold*.

One more thing about "get many cuboids in recommand result that never hitted by 
my history queries": if cuboid A is the parent of cuboid B and their row 
counts are similar, then even when your history queries always hit cuboid B, 
Kylin should prefer to build cuboid A.

For the weighting change, could you explain more about the mathematical theory? 
At first glance, it does not follow monotonicity.

> optimize kylin planner by delete unnecessary cuboids
> 
>
> Key: KYLIN-4985
> URL: https://issues.apache.org/jira/browse/KYLIN-4985
> Project: Kylin
>  Issue Type: New Feature
>Reporter: tianhui
>Priority: Major
>
> When I use Kylin Planner, I can get many cuboids in recommand result that 
> never hitted by my history queries. I think it maybe unnecessary, so I delete 
> the unhitted cuboids.
> In addition, I change row count by weighting of 1/sqrt(hit probability) 
> before execute plan algorithm.



--
This message was sent by Atlassian Jira
(v8.3.4#803005)


[jira] [Commented] (KYLIN-4985) optimize kylin planner by delete unnecessary cuboids

2021-06-27 Thread Zhong Yanghong (Jira)


[ 
https://issues.apache.org/jira/browse/KYLIN-4985?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17370190#comment-17370190
 ] 

Zhong Yanghong commented on KYLIN-4985:
---

Hi [~tianhui5], the cuboids recommended by the cube planner algorithms are 
redundant. The redundancy is controlled by 
*kylin.cube.cubeplanner.expansion-threshold*.

One more thing about "get many cuboids in recommand result that never hitted by 
my history queries": if cuboid A is the parent of cuboid B and their row 
counts are similar, then even when your history queries always hit cuboid B, 
Kylin should prefer to build cuboid A.

For the weighting change, could you explain more about the mathematical theory? 
At first glance, it does not follow monotonicity.

> optimize kylin planner by delete unnecessary cuboids
> 
>
> Key: KYLIN-4985
> URL: https://issues.apache.org/jira/browse/KYLIN-4985
> Project: Kylin
>  Issue Type: New Feature
>Reporter: tianhui
>Priority: Major
>
> When I use Kylin Planner, I can get many cuboids in recommand result that 
> never hitted by my history queries. I think it maybe unnecessary, so I delete 
> the unhitted cuboids.
> In addition, I change row count by weighting of 1/sqrt(hit probability) 
> before execute plan algorithm.





[jira] [Comment Edited] (KYLIN-4165) RT OLAP building job on "Save Cube Dictionaries" step concurrency error

2021-06-23 Thread Zhong Yanghong (Jira)


[ 
https://issues.apache.org/jira/browse/KYLIN-4165?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17368010#comment-17368010
 ] 

Zhong Yanghong edited comment on KYLIN-4165 at 6/23/21, 10:17 AM:
--

If the first step errors, why should the job still keep the lock?
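
The behavior asked for here, releasing the lock whenever a step fails, is commonly expressed with try/finally. The Java sketch below only illustrates that pattern: a local `ReentrantLock` stands in for the distributed lock, and `ReleaseLockOnFailureSketch`, `runJob`, and the two `Runnable` steps are hypothetical names, not Kylin classes.

```java
import java.util.concurrent.locks.ReentrantLock;

final class ReleaseLockOnFailureSketch {
    // Stand-in for the distributed lock; a real one would live outside the JVM.
    static final ReentrantLock CUBE_LOCK = new ReentrantLock();

    /** Runs both steps under the lock and always releases it, even on failure. */
    static boolean runJob(Runnable firstStep, Runnable saveDictStep) {
        CUBE_LOCK.lock();
        try {
            firstStep.run();
            saveDictStep.run();
            return true;
        } catch (RuntimeException e) {
            return false; // the job failed, but the lock is still released below
        } finally {
            CUBE_LOCK.unlock();
        }
    }

    public static void main(String[] args) {
        boolean ok = runJob(
                () -> { throw new IllegalStateException("cube is disabled"); },
                () -> { });
        System.out.println("job succeeded: " + ok);
        System.out.println("lock still held: " + CUBE_LOCK.isLocked());
    }
}
```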


was (Author: yaho):
Why do we need a distributed lock for two stages, which may introduce other issues?

For example, when the first step fails because the cube is disabled, the lock 
should be released. Currently the lock is released only when the job is 
discarded.

How about fixing it just in *SaveDictStep*?

> RT OLAP building job on "Save Cube Dictionaries" step concurrency error
> ---
>
> Key: KYLIN-4165
> URL: https://issues.apache.org/jira/browse/KYLIN-4165
> Project: Kylin
>  Issue Type: Bug
>  Components: Real-time Streaming
>Affects Versions: v3.0.0-alpha
>Reporter: wangxiaojing
>Priority: Major
> Fix For: v3.0.0
>
>
> There is a dictionary version conflict in the "Save Cube Dictionaries" step when 
> building the real-time segment from remote persisted to ready. This is very 
> serious: it causes dictionary updates by multiple concurrent jobs to fail. 
> It may occur when a cube has many concurrent building jobs on the same 
> step, "Save Cube Dictionaries". 
> Perhaps a globally distributed lock is needed to prevent this step from running 
> concurrently for one cube.
> Save Cube Dictionaries log messages:
> {code:java}
> // code placeholder
> org.apache.kylin.common.persistence.WriteConflictException: Overwriting conflict /dict/DEFAULT.TASK_SNAPSHOT/GROUPVALUE/5387e747-9649-0b17-5a72-ee17f5baea0a.dict, expect old TS 1568012509090, but it is 1568012509245
>     at org.apache.kylin.storage.hbase.HBaseResourceStore.updateTimestampImpl(HBaseResourceStore.java:372)
>     at org.apache.kylin.common.persistence.ResourceStore$7.call(ResourceStore.java:465)
>     at org.apache.kylin.common.persistence.ExponentialBackoffRetry.doWithRetry(ExponentialBackoffRetry.java:52)
>     at org.apache.kylin.common.persistence.ResourceStore.updateTimestampWithRetry(ResourceStore.java:462)
>     at org.apache.kylin.common.persistence.ResourceStore.updateTimestampCheckPoint(ResourceStore.java:457)
>     at org.apache.kylin.common.persistence.ResourceStore.updateTimestamp(ResourceStore.java:452)
>     at org.apache.kylin.dict.DictionaryManager.updateExistingDictLastModifiedTime(DictionaryManager.java:197)
>     at org.apache.kylin.dict.DictionaryManager.trySaveNewDict(DictionaryManager.java:157)
>     at org.apache.kylin.engine.mr.streaming.SaveDictStep.doWork(SaveDictStep.java:122)
>     at org.apache.kylin.job.execution.AbstractExecutable.execute(AbstractExecutable.java:179)
>     at org.apache.kylin.job.execution.DefaultChainedExecutable.doWork(DefaultChainedExecutable.java:71)
>     at org.apache.kylin.job.execution.AbstractExecutable.execute(AbstractExecutable.java:179)
>     at org.apache.kylin.job.impl.threadpool.DistributedScheduler$JobRunner.run(DistributedScheduler.java:110)
>     at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1142)
>     at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:617)
>     at java.lang.Thread.run(Thread.java:745)
> {code}





[jira] [Issue Comment Deleted] (KYLIN-4165) RT OLAP building job on "Save Cube Dictionaries" step concurrency error

2021-06-23 Thread Zhong Yanghong (Jira)


 [ 
https://issues.apache.org/jira/browse/KYLIN-4165?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Zhong Yanghong updated KYLIN-4165:
--
Comment: was deleted

(was: Why do we need a cube-level lock for this, since the dictionary is 
segment-level?)

> RT OLAP building job on "Save Cube Dictionaries" step concurrency error
> ---
>
> Key: KYLIN-4165
> URL: https://issues.apache.org/jira/browse/KYLIN-4165
> Project: Kylin
>  Issue Type: Bug
>  Components: Real-time Streaming
>Affects Versions: v3.0.0-alpha
>Reporter: wangxiaojing
>Priority: Major
> Fix For: v3.0.0
>
>
> There is a dictionary version conflict in the "Save Cube Dictionaries" step when 
> building the real-time segment from remote persisted to ready. This is very 
> serious: it causes dictionary updates by multiple concurrent jobs to fail. 
> It may occur when a cube has many concurrent building jobs on the same 
> step, "Save Cube Dictionaries". 
> Perhaps a globally distributed lock is needed to prevent this step from running 
> concurrently for one cube.
> Save Cube Dictionaries log messages:
> {code:java}
> // code placeholder
> org.apache.kylin.common.persistence.WriteConflictException: Overwriting conflict /dict/DEFAULT.TASK_SNAPSHOT/GROUPVALUE/5387e747-9649-0b17-5a72-ee17f5baea0a.dict, expect old TS 1568012509090, but it is 1568012509245
>     at org.apache.kylin.storage.hbase.HBaseResourceStore.updateTimestampImpl(HBaseResourceStore.java:372)
>     at org.apache.kylin.common.persistence.ResourceStore$7.call(ResourceStore.java:465)
>     at org.apache.kylin.common.persistence.ExponentialBackoffRetry.doWithRetry(ExponentialBackoffRetry.java:52)
>     at org.apache.kylin.common.persistence.ResourceStore.updateTimestampWithRetry(ResourceStore.java:462)
>     at org.apache.kylin.common.persistence.ResourceStore.updateTimestampCheckPoint(ResourceStore.java:457)
>     at org.apache.kylin.common.persistence.ResourceStore.updateTimestamp(ResourceStore.java:452)
>     at org.apache.kylin.dict.DictionaryManager.updateExistingDictLastModifiedTime(DictionaryManager.java:197)
>     at org.apache.kylin.dict.DictionaryManager.trySaveNewDict(DictionaryManager.java:157)
>     at org.apache.kylin.engine.mr.streaming.SaveDictStep.doWork(SaveDictStep.java:122)
>     at org.apache.kylin.job.execution.AbstractExecutable.execute(AbstractExecutable.java:179)
>     at org.apache.kylin.job.execution.DefaultChainedExecutable.doWork(DefaultChainedExecutable.java:71)
>     at org.apache.kylin.job.execution.AbstractExecutable.execute(AbstractExecutable.java:179)
>     at org.apache.kylin.job.impl.threadpool.DistributedScheduler$JobRunner.run(DistributedScheduler.java:110)
>     at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1142)
>     at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:617)
>     at java.lang.Thread.run(Thread.java:745)
> {code}



--
This message was sent by Atlassian Jira
(v8.3.4#803005)


[jira] [Comment Edited] (KYLIN-4165) RT OLAP building job on "Save Cube Dictionaries" step concurrency error

2021-06-23 Thread Zhong Yanghong (Jira)


[ 
https://issues.apache.org/jira/browse/KYLIN-4165?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17368010#comment-17368010
 ] 

Zhong Yanghong edited comment on KYLIN-4165 at 6/23/21, 10:06 AM:
--

Why do we need a distributed lock spanning two stages? It may introduce other 
issues.

For example, when the first step fails because the cube is disabled, the lock 
should be released. Currently the lock is released only when the job is 
discarded.

How about fixing it just in *SaveDictStep*?


was (Author: yaho):
Why do we need a distributed lock spanning two stages? It may introduce other issues.

How about fixing it just in *SaveDictStep*?

> RT OLAP building job on "Save Cube Dictionaries" step concurrency error
> ---
>
> Key: KYLIN-4165
> URL: https://issues.apache.org/jira/browse/KYLIN-4165
> Project: Kylin
>  Issue Type: Bug
>  Components: Real-time Streaming
>Affects Versions: v3.0.0-alpha
>Reporter: wangxiaojing
>Priority: Major
> Fix For: v3.0.0
>
>
> There is a dictionary version conflict in the "Save Cube Dictionaries" step 
> when building a realtime segment from remote persisted to ready. This is very 
> serious: it will lead to unsuccessful updating of dictionaries by multiple 
> jobs concurrently. This may occur when a cube has many concurrent building 
> jobs on the same step, "Save Cube Dictionaries". 
> Perhaps a globally distributed lock is needed to prevent concurrent runs of 
> this step for one cube.
> Save Cube Dictionaries log messages:
> {code:java}
> // code placeholder
> org.apache.kylin.common.persistence.WriteConflictException: Overwriting conflict /dict/DEFAULT.TASK_SNAPSHOT/GROUPVALUE/5387e747-9649-0b17-5a72-ee17f5baea0a.dict, expect old TS 1568012509090, but it is 1568012509245
>     at org.apache.kylin.storage.hbase.HBaseResourceStore.updateTimestampImpl(HBaseResourceStore.java:372)
>     at org.apache.kylin.common.persistence.ResourceStore$7.call(ResourceStore.java:465)
>     at org.apache.kylin.common.persistence.ExponentialBackoffRetry.doWithRetry(ExponentialBackoffRetry.java:52)
>     at org.apache.kylin.common.persistence.ResourceStore.updateTimestampWithRetry(ResourceStore.java:462)
>     at org.apache.kylin.common.persistence.ResourceStore.updateTimestampCheckPoint(ResourceStore.java:457)
>     at org.apache.kylin.common.persistence.ResourceStore.updateTimestamp(ResourceStore.java:452)
>     at org.apache.kylin.dict.DictionaryManager.updateExistingDictLastModifiedTime(DictionaryManager.java:197)
>     at org.apache.kylin.dict.DictionaryManager.trySaveNewDict(DictionaryManager.java:157)
>     at org.apache.kylin.engine.mr.streaming.SaveDictStep.doWork(SaveDictStep.java:122)
>     at org.apache.kylin.job.execution.AbstractExecutable.execute(AbstractExecutable.java:179)
>     at org.apache.kylin.job.execution.DefaultChainedExecutable.doWork(DefaultChainedExecutable.java:71)
>     at org.apache.kylin.job.execution.AbstractExecutable.execute(AbstractExecutable.java:179)
>     at org.apache.kylin.job.impl.threadpool.DistributedScheduler$JobRunner.run(DistributedScheduler.java:110)
>     at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1142)
>     at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:617)
>     at java.lang.Thread.run(Thread.java:745)
> {code}





[jira] [Commented] (KYLIN-4165) RT OLAP building job on "Save Cube Dictionaries" step concurrency error

2021-06-23 Thread Zhong Yanghong (Jira)


[ 
https://issues.apache.org/jira/browse/KYLIN-4165?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17368010#comment-17368010
 ] 

Zhong Yanghong commented on KYLIN-4165:
---

Why do we need a distributed lock spanning two stages? It may introduce other issues.

How about fixing it just in *SaveDictStep*?
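
One way to read this suggestion is to handle the timestamp conflict inside the step itself: after a conflict, re-read the stored timestamp and retry, rather than holding a lock across stages. The sketch below only illustrates that idea under this assumption and is not the actual Kylin change. TimestampStoreSketch, checkAndPut and touchWithRetry are hypothetical names, and an in-memory map stands in for the resource store.

```java
import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.ConcurrentMap;

// Hypothetical in-memory stand-in for a resource store that tracks a
// last-modified timestamp per resource path. Not a Kylin class.
final class TimestampStoreSketch {
    private final ConcurrentMap<String, Long> timestamps = new ConcurrentHashMap<>();

    void put(String path, long ts) {
        timestamps.put(path, ts);
    }

    Long get(String path) {
        return timestamps.get(path);
    }

    // Compare-and-set: succeeds only if the stored timestamp still equals expectedOld.
    boolean checkAndPut(String path, long expectedOld, long newTs) {
        return timestamps.replace(path, expectedOld, newTs);
    }

    // On a conflict, re-read the current timestamp and try again instead of failing the step.
    boolean touchWithRetry(String path, long newTs, int maxAttempts) {
        for (int attempt = 0; attempt < maxAttempts; attempt++) {
            Long current = timestamps.get(path);
            if (current == null) {
                return false; // the resource does not exist
            }
            if (current >= newTs) {
                return true; // a concurrent job already wrote an equal or newer timestamp
            }
            if (checkAndPut(path, current, newTs)) {
                return true;
            }
        }
        return false; // still conflicting after maxAttempts
    }
}
```

With this shape, a job that read a stale timestamp does not fail outright; it either observes that a concurrent job already refreshed the dictionary or retries against the value it just re-read.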

> RT OLAP building job on "Save Cube Dictionaries" step concurrency error
> ---
>
> Key: KYLIN-4165
> URL: https://issues.apache.org/jira/browse/KYLIN-4165
> Project: Kylin
>  Issue Type: Bug
>  Components: Real-time Streaming
>Affects Versions: v3.0.0-alpha
>Reporter: wangxiaojing
>Priority: Major
> Fix For: v3.0.0
>
>
> There is a dictionary version conflict in the "Save Cube Dictionaries" step 
> when building a realtime segment from remote persisted to ready. This is very 
> serious: it will lead to unsuccessful updating of dictionaries by multiple 
> jobs concurrently. This may occur when a cube has many concurrent building 
> jobs on the same step, "Save Cube Dictionaries". 
> Perhaps a globally distributed lock is needed to prevent concurrent runs of 
> this step for one cube.
> Save Cube Dictionaries log messages:
> {code:java}
> // code placeholder
> org.apache.kylin.common.persistence.WriteConflictException: Overwriting conflict /dict/DEFAULT.TASK_SNAPSHOT/GROUPVALUE/5387e747-9649-0b17-5a72-ee17f5baea0a.dict, expect old TS 1568012509090, but it is 1568012509245
>     at org.apache.kylin.storage.hbase.HBaseResourceStore.updateTimestampImpl(HBaseResourceStore.java:372)
>     at org.apache.kylin.common.persistence.ResourceStore$7.call(ResourceStore.java:465)
>     at org.apache.kylin.common.persistence.ExponentialBackoffRetry.doWithRetry(ExponentialBackoffRetry.java:52)
>     at org.apache.kylin.common.persistence.ResourceStore.updateTimestampWithRetry(ResourceStore.java:462)
>     at org.apache.kylin.common.persistence.ResourceStore.updateTimestampCheckPoint(ResourceStore.java:457)
>     at org.apache.kylin.common.persistence.ResourceStore.updateTimestamp(ResourceStore.java:452)
>     at org.apache.kylin.dict.DictionaryManager.updateExistingDictLastModifiedTime(DictionaryManager.java:197)
>     at org.apache.kylin.dict.DictionaryManager.trySaveNewDict(DictionaryManager.java:157)
>     at org.apache.kylin.engine.mr.streaming.SaveDictStep.doWork(SaveDictStep.java:122)
>     at org.apache.kylin.job.execution.AbstractExecutable.execute(AbstractExecutable.java:179)
>     at org.apache.kylin.job.execution.DefaultChainedExecutable.doWork(DefaultChainedExecutable.java:71)
>     at org.apache.kylin.job.execution.AbstractExecutable.execute(AbstractExecutable.java:179)
>     at org.apache.kylin.job.impl.threadpool.DistributedScheduler$JobRunner.run(DistributedScheduler.java:110)
>     at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1142)
>     at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:617)
>     at java.lang.Thread.run(Thread.java:745)
> {code}





[jira] [Comment Edited] (KYLIN-4165) RT OLAP building job on "Save Cube Dictionaries" step concurrency error

2021-06-23 Thread Zhong Yanghong (Jira)


[ 
https://issues.apache.org/jira/browse/KYLIN-4165?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17368007#comment-17368007
 ] 

Zhong Yanghong edited comment on KYLIN-4165 at 6/23/21, 10:00 AM:
--

Why do we need a cube-level lock for this, since the dictionary is segment-level?


was (Author: yaho):
Why do we need a cube-level lock for this, since the dictionary is segment-level.

> RT OLAP building job on "Save Cube Dictionaries" step concurrency error
> ---
>
> Key: KYLIN-4165
> URL: https://issues.apache.org/jira/browse/KYLIN-4165
> Project: Kylin
>  Issue Type: Bug
>  Components: Real-time Streaming
>Affects Versions: v3.0.0-alpha
>Reporter: wangxiaojing
>Priority: Major
> Fix For: v3.0.0
>
>
> There is a dictionary version conflict in the "Save Cube Dictionaries" step 
> when building a realtime segment from remote persisted to ready. This is very 
> serious: it will lead to unsuccessful updating of dictionaries by multiple 
> jobs concurrently. This may occur when a cube has many concurrent building 
> jobs on the same step, "Save Cube Dictionaries". 
> Perhaps a globally distributed lock is needed to prevent concurrent runs of 
> this step for one cube.
> Save Cube Dictionaries log messages:
> {code:java}
> // code placeholder
> org.apache.kylin.common.persistence.WriteConflictException: Overwriting conflict /dict/DEFAULT.TASK_SNAPSHOT/GROUPVALUE/5387e747-9649-0b17-5a72-ee17f5baea0a.dict, expect old TS 1568012509090, but it is 1568012509245
>     at org.apache.kylin.storage.hbase.HBaseResourceStore.updateTimestampImpl(HBaseResourceStore.java:372)
>     at org.apache.kylin.common.persistence.ResourceStore$7.call(ResourceStore.java:465)
>     at org.apache.kylin.common.persistence.ExponentialBackoffRetry.doWithRetry(ExponentialBackoffRetry.java:52)
>     at org.apache.kylin.common.persistence.ResourceStore.updateTimestampWithRetry(ResourceStore.java:462)
>     at org.apache.kylin.common.persistence.ResourceStore.updateTimestampCheckPoint(ResourceStore.java:457)
>     at org.apache.kylin.common.persistence.ResourceStore.updateTimestamp(ResourceStore.java:452)
>     at org.apache.kylin.dict.DictionaryManager.updateExistingDictLastModifiedTime(DictionaryManager.java:197)
>     at org.apache.kylin.dict.DictionaryManager.trySaveNewDict(DictionaryManager.java:157)
>     at org.apache.kylin.engine.mr.streaming.SaveDictStep.doWork(SaveDictStep.java:122)
>     at org.apache.kylin.job.execution.AbstractExecutable.execute(AbstractExecutable.java:179)
>     at org.apache.kylin.job.execution.DefaultChainedExecutable.doWork(DefaultChainedExecutable.java:71)
>     at org.apache.kylin.job.execution.AbstractExecutable.execute(AbstractExecutable.java:179)
>     at org.apache.kylin.job.impl.threadpool.DistributedScheduler$JobRunner.run(DistributedScheduler.java:110)
>     at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1142)
>     at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:617)
>     at java.lang.Thread.run(Thread.java:745)
> {code}





[jira] [Commented] (KYLIN-4165) RT OLAP building job on "Save Cube Dictionaries" step concurrency error

2021-06-23 Thread Zhong Yanghong (Jira)


[ 
https://issues.apache.org/jira/browse/KYLIN-4165?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17368007#comment-17368007
 ] 

Zhong Yanghong commented on KYLIN-4165:
---

Why do we need a cube-level lock for this, since the dictionary is segment-level?

> RT OLAP building job on "Save Cube Dictionaries" step concurrency error
> ---
>
> Key: KYLIN-4165
> URL: https://issues.apache.org/jira/browse/KYLIN-4165
> Project: Kylin
>  Issue Type: Bug
>  Components: Real-time Streaming
>Affects Versions: v3.0.0-alpha
>Reporter: wangxiaojing
>Priority: Major
> Fix For: v3.0.0
>
>
> There is a dictionary version conflict in the "Save Cube Dictionaries" step 
> when building a realtime segment from remote persisted to ready. This is very 
> serious: it will lead to unsuccessful updating of dictionaries by multiple 
> jobs concurrently. This may occur when a cube has many concurrent building 
> jobs on the same step, "Save Cube Dictionaries". 
> Perhaps a globally distributed lock is needed to prevent concurrent runs of 
> this step for one cube.
> Save Cube Dictionaries log messages:
> {code:java}
> // code placeholder
> org.apache.kylin.common.persistence.WriteConflictException: Overwriting conflict /dict/DEFAULT.TASK_SNAPSHOT/GROUPVALUE/5387e747-9649-0b17-5a72-ee17f5baea0a.dict, expect old TS 1568012509090, but it is 1568012509245
>     at org.apache.kylin.storage.hbase.HBaseResourceStore.updateTimestampImpl(HBaseResourceStore.java:372)
>     at org.apache.kylin.common.persistence.ResourceStore$7.call(ResourceStore.java:465)
>     at org.apache.kylin.common.persistence.ExponentialBackoffRetry.doWithRetry(ExponentialBackoffRetry.java:52)
>     at org.apache.kylin.common.persistence.ResourceStore.updateTimestampWithRetry(ResourceStore.java:462)
>     at org.apache.kylin.common.persistence.ResourceStore.updateTimestampCheckPoint(ResourceStore.java:457)
>     at org.apache.kylin.common.persistence.ResourceStore.updateTimestamp(ResourceStore.java:452)
>     at org.apache.kylin.dict.DictionaryManager.updateExistingDictLastModifiedTime(DictionaryManager.java:197)
>     at org.apache.kylin.dict.DictionaryManager.trySaveNewDict(DictionaryManager.java:157)
>     at org.apache.kylin.engine.mr.streaming.SaveDictStep.doWork(SaveDictStep.java:122)
>     at org.apache.kylin.job.execution.AbstractExecutable.execute(AbstractExecutable.java:179)
>     at org.apache.kylin.job.execution.DefaultChainedExecutable.doWork(DefaultChainedExecutable.java:71)
>     at org.apache.kylin.job.execution.AbstractExecutable.execute(AbstractExecutable.java:179)
>     at org.apache.kylin.job.impl.threadpool.DistributedScheduler$JobRunner.run(DistributedScheduler.java:110)
>     at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1142)
>     at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:617)
>     at java.lang.Thread.run(Thread.java:745)
> {code}





[jira] [Reopened] (KYLIN-4165) RT OLAP building job on "Save Cube Dictionaries" step concurrency error

2021-06-23 Thread Zhong Yanghong (Jira)


 [ 
https://issues.apache.org/jira/browse/KYLIN-4165?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Zhong Yanghong reopened KYLIN-4165:
---

> RT OLAP building job on "Save Cube Dictionaries" step concurrency error
> ---
>
> Key: KYLIN-4165
> URL: https://issues.apache.org/jira/browse/KYLIN-4165
> Project: Kylin
>  Issue Type: Bug
>  Components: Real-time Streaming
>Affects Versions: v3.0.0-alpha
>Reporter: wangxiaojing
>Priority: Major
> Fix For: v3.0.0
>
>
> There is a dictionary version conflict in the "Save Cube Dictionaries" step 
> when building a realtime segment from remote persisted to ready. This is very 
> serious: it will lead to unsuccessful updating of dictionaries by multiple 
> jobs concurrently. This may occur when a cube has many concurrent building 
> jobs on the same step, "Save Cube Dictionaries". 
> Perhaps a globally distributed lock is needed to prevent concurrent runs of 
> this step for one cube.
> Save Cube Dictionaries log messages:
> {code:java}
> // code placeholder
> org.apache.kylin.common.persistence.WriteConflictException: Overwriting conflict /dict/DEFAULT.TASK_SNAPSHOT/GROUPVALUE/5387e747-9649-0b17-5a72-ee17f5baea0a.dict, expect old TS 1568012509090, but it is 1568012509245
>     at org.apache.kylin.storage.hbase.HBaseResourceStore.updateTimestampImpl(HBaseResourceStore.java:372)
>     at org.apache.kylin.common.persistence.ResourceStore$7.call(ResourceStore.java:465)
>     at org.apache.kylin.common.persistence.ExponentialBackoffRetry.doWithRetry(ExponentialBackoffRetry.java:52)
>     at org.apache.kylin.common.persistence.ResourceStore.updateTimestampWithRetry(ResourceStore.java:462)
>     at org.apache.kylin.common.persistence.ResourceStore.updateTimestampCheckPoint(ResourceStore.java:457)
>     at org.apache.kylin.common.persistence.ResourceStore.updateTimestamp(ResourceStore.java:452)
>     at org.apache.kylin.dict.DictionaryManager.updateExistingDictLastModifiedTime(DictionaryManager.java:197)
>     at org.apache.kylin.dict.DictionaryManager.trySaveNewDict(DictionaryManager.java:157)
>     at org.apache.kylin.engine.mr.streaming.SaveDictStep.doWork(SaveDictStep.java:122)
>     at org.apache.kylin.job.execution.AbstractExecutable.execute(AbstractExecutable.java:179)
>     at org.apache.kylin.job.execution.DefaultChainedExecutable.doWork(DefaultChainedExecutable.java:71)
>     at org.apache.kylin.job.execution.AbstractExecutable.execute(AbstractExecutable.java:179)
>     at org.apache.kylin.job.impl.threadpool.DistributedScheduler$JobRunner.run(DistributedScheduler.java:110)
>     at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1142)
>     at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:617)
>     at java.lang.Thread.run(Thread.java:745)
> {code}





[jira] [Updated] (KYLIN-4992) Source row count statistics calculated in a wrong way in MergeDictionaryMapper

2021-05-07 Thread Zhong Yanghong (Jira)


 [ 
https://issues.apache.org/jira/browse/KYLIN-4992?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Zhong Yanghong updated KYLIN-4992:
--
Affects Version/s: v3.0.0
   v2.6.2
   v2.6.3
   v3.1.0
   v3.0.1
   v3.0.2
   v3.1.1
   v3.1.2

> Source row count statistics calculated in a wrong way in MergeDictionaryMapper
> --
>
> Key: KYLIN-4992
> URL: https://issues.apache.org/jira/browse/KYLIN-4992
> Project: Kylin
>  Issue Type: Bug
>Affects Versions: v3.0.0, v2.6.2, v2.6.3, v3.1.0, v3.0.1, v3.0.2, v3.1.1, 
> v3.1.2
>Reporter: Zhong Yanghong
>Priority: Critical
>
> With this bug, the source row count is smaller than the correct value, which 
> results in an underestimated cuboid size and a smaller number of regions. 
> Ultimately this hurts job and query performance.





[jira] [Updated] (KYLIN-4992) Source row count statistics calculated in a wrong way in MergeDictionaryMapper

2021-05-07 Thread Zhong Yanghong (Jira)


 [ 
https://issues.apache.org/jira/browse/KYLIN-4992?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Zhong Yanghong updated KYLIN-4992:
--
Description: With this bug, the source row count is smaller than the correct 
value, which results in an underestimated cuboid size and a smaller number of 
regions. Ultimately this hurts job and query performance.  (was: With this bug, 
the source row count is smaller than the correct value, which results in an 
underestimated cuboid size and a smaller number of regions.)

> Source row count statistics calculated in a wrong way in MergeDictionaryMapper
> --
>
> Key: KYLIN-4992
> URL: https://issues.apache.org/jira/browse/KYLIN-4992
> Project: Kylin
>  Issue Type: Bug
>Reporter: Zhong Yanghong
>Priority: Major
>
> With this bug, the source row count is smaller than the correct value, which 
> results in an underestimated cuboid size and a smaller number of regions. 
> Ultimately this hurts job and query performance.





[jira] [Updated] (KYLIN-4992) Source row count statistics calculated in a wrong way in MergeDictionaryMapper

2021-05-07 Thread Zhong Yanghong (Jira)


 [ 
https://issues.apache.org/jira/browse/KYLIN-4992?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Zhong Yanghong updated KYLIN-4992:
--
Priority: Critical  (was: Major)

> Source row count statistics calculated in a wrong way in MergeDictionaryMapper
> --
>
> Key: KYLIN-4992
> URL: https://issues.apache.org/jira/browse/KYLIN-4992
> Project: Kylin
>  Issue Type: Bug
>Reporter: Zhong Yanghong
>Priority: Critical
>
> With this bug, the source row count is smaller than the correct value, which 
> results in an underestimated cuboid size and a smaller number of regions. 
> Ultimately this hurts job and query performance.





[jira] [Updated] (KYLIN-4992) Source row count statistics calculated in a wrong way in MergeDictionaryMapper

2021-05-07 Thread Zhong Yanghong (Jira)


 [ 
https://issues.apache.org/jira/browse/KYLIN-4992?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Zhong Yanghong updated KYLIN-4992:
--
Description: With this bug, the source row count is smaller than the correct 
value, which results in an underestimated cuboid size and a smaller number of 
regions.

> Source row count statistics calculated in a wrong way in MergeDictionaryMapper
> --
>
> Key: KYLIN-4992
> URL: https://issues.apache.org/jira/browse/KYLIN-4992
> Project: Kylin
>  Issue Type: Bug
>Reporter: Zhong Yanghong
>Priority: Major
>
> With this bug, the source row count is smaller than the correct value, which 
> results in an underestimated cuboid size and a smaller number of regions.





[jira] [Created] (KYLIN-4992) Source row count statistics calculated in a wrong way in MergeDictionaryMapper

2021-05-07 Thread Zhong Yanghong (Jira)
Zhong Yanghong created KYLIN-4992:
-

 Summary: Source row count statistics calculated in a wrong way in 
MergeDictionaryMapper
 Key: KYLIN-4992
 URL: https://issues.apache.org/jira/browse/KYLIN-4992
 Project: Kylin
  Issue Type: Bug
Reporter: Zhong Yanghong








[jira] [Assigned] (KYLIN-4861) Wrong way to get CubeManager instance in CubeInstance.latestCopyForWrite()

2021-01-04 Thread Zhong Yanghong (Jira)


 [ 
https://issues.apache.org/jira/browse/KYLIN-4861?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Zhong Yanghong reassigned KYLIN-4861:
-

Assignee: (was: Zhong Yanghong)

> Wrong way to get CubeManager instance in CubeInstance.latestCopyForWrite()
> --
>
> Key: KYLIN-4861
> URL: https://issues.apache.org/jira/browse/KYLIN-4861
> Project: Kylin
>  Issue Type: Bug
>Reporter: Zhong Yanghong
>Priority: Major
>
> Each cube can have its own KylinConfig. Consider the following code:
> {code}
> public CubeInstance latestCopyForWrite() {
>     CubeManager mgr = CubeManager.getInstance(config);
>     CubeInstance latest = mgr.getCube(name); // in case this object is out-of-date
>     return mgr.copyForWrite(latest);
> }
> {code}
> Each cube can have a different CubeManager instance, which may easily cause 
> map consistency issues.





[jira] [Updated] (KYLIN-4861) Wrong way to get CubeManager instance in CubeInstance.latestCopyForWrite()

2021-01-04 Thread Zhong Yanghong (Jira)


 [ 
https://issues.apache.org/jira/browse/KYLIN-4861?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Zhong Yanghong updated KYLIN-4861:
--
Description: 
Each cube can have its own KylinConfig. Consider the following code:
{code}
public CubeInstance latestCopyForWrite() {
    CubeManager mgr = CubeManager.getInstance(config);
    CubeInstance latest = mgr.getCube(name); // in case this object is out-of-date
    return mgr.copyForWrite(latest);
}
{code}
Each cube can have a different CubeManager instance, which may easily cause map 
consistency issues.
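
A minimal sketch of the concern, assuming managers are cached per config object, as the description implies. ConfigSketch and ManagerSketch are hypothetical stand-ins and not Kylin classes; the point is only that two different config objects yield two managers, each with its own map.

```java
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;

// Hypothetical stand-in for a configuration object (not KylinConfig).
final class ConfigSketch {
    final String label;

    ConfigSketch(String label) {
        this.label = label;
    }
}

// Hypothetical stand-in for a manager that is cached per config object.
final class ManagerSketch {
    private static final Map<ConfigSketch, ManagerSketch> CACHE = new ConcurrentHashMap<>();

    // Each manager keeps its own map of cube names to cube state.
    final Map<String, String> cubes = new ConcurrentHashMap<>();

    private ManagerSketch() {
    }

    // Returns the manager bound to this exact config object, creating it on first use.
    static ManagerSketch getInstance(ConfigSketch config) {
        return CACHE.computeIfAbsent(config, c -> new ManagerSketch());
    }
}
```

In this sketch, a cube state written through the manager obtained with a cube-level config is not visible through the manager obtained with the global config, which is the kind of map inconsistency the description warns about.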

> Wrong way to get CubeManager instance in CubeInstance.latestCopyForWrite()
> --
>
> Key: KYLIN-4861
> URL: https://issues.apache.org/jira/browse/KYLIN-4861
> Project: Kylin
>  Issue Type: Bug
>Reporter: Zhong Yanghong
>Assignee: Zhong Yanghong
>Priority: Major
>
> Each cube can have its own KylinConfig. Consider the following code:
> {code}
> public CubeInstance latestCopyForWrite() {
>     CubeManager mgr = CubeManager.getInstance(config);
>     CubeInstance latest = mgr.getCube(name); // in case this object is out-of-date
>     return mgr.copyForWrite(latest);
> }
> {code}
> Each cube can have a different CubeManager instance, which may easily cause 
> map consistency issues.





[jira] [Created] (KYLIN-4861) Wrong way to get CubeManager instance in CubeInstance.latestCopyForWrite()

2021-01-04 Thread Zhong Yanghong (Jira)
Zhong Yanghong created KYLIN-4861:
-

 Summary: Wrong way to get CubeManager instance in 
CubeInstance.latestCopyForWrite()
 Key: KYLIN-4861
 URL: https://issues.apache.org/jira/browse/KYLIN-4861
 Project: Kylin
  Issue Type: Bug
Reporter: Zhong Yanghong
Assignee: Zhong Yanghong








[jira] [Resolved] (KYLIN-4658) Union all issue with regarding to windows function & aggregation on

2021-01-04 Thread Zhong Yanghong (Jira)


 [ 
https://issues.apache.org/jira/browse/KYLIN-4658?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Zhong Yanghong resolved KYLIN-4658.
---
Fix Version/s: v3.1.2
   Resolution: Fixed

>  Union all issue with regarding to windows function & aggregation on
> 
>
> Key: KYLIN-4658
> URL: https://issues.apache.org/jira/browse/KYLIN-4658
> Project: Kylin
>  Issue Type: Improvement
>Reporter: Zhong Yanghong
>Assignee: Zhong Yanghong
>Priority: Major
> Fix For: v3.1.2
>
>
> Test SQL:
> {code}
> select CNT, GMV, sum(GMV) over(partition by SLR_SEGMENT_CD) TOTAL_GMV, 
> SLR_SEGMENT_CD, LSTG_FORMAT_NAME
> from 
> (select sum(PRICE) GMV, sum(ITEM_COUNT) CNT, SLR_SEGMENT_CD, LSTG_FORMAT_NAME 
> from TEST_KYLIN_FACT group by SLR_SEGMENT_CD, LSTG_FORMAT_NAME 
> UNION ALL
> select sum(PRICE) GMV, sum(ITEM_COUNT) CNT, SLR_SEGMENT_CD, LSTG_FORMAT_NAME 
> from TEST_KYLIN_FACT group by SLR_SEGMENT_CD, LSTG_FORMAT_NAME) 
> order by TOTAL_GMV
> {code}
>  
> Exception:
> {code}
> Index: 2, Size: 2 while executing SQL: "select * from (select CNT, GMV, 
> sum(GMV) over(partition by SLR_SEGMENT_CD) TOTAL_GMV, SLR_SEGMENT_CD, 
> LSTG_FORMAT_NAME from (select sum(PRICE) GMV, sum(ITEM_COUNT) CNT, 
> SLR_SEGMENT_CD, LSTG_FORMAT_NAME from TEST_KYLIN_FACT group by 
> SLR_SEGMENT_CD, LSTG_FORMAT_NAME UNION ALL select sum(PRICE) GMV, 
> sum(ITEM_COUNT) CNT, SLR_SEGMENT_CD, LSTG_FORMAT_NAME from TEST_KYLIN_FACT 
> group by SLR_SEGMENT_CD, LSTG_FORMAT_NAME) order by TOTAL_GMV) limit 5"
> {code}
> Similar issue for the following sql:
> {code}
> select LSTG_FORMAT_NAME,
>SLR_SEGMENT_CD,
>CAL_DT,
>sum(CNT) as CNT
> from
>   (select LSTG_FORMAT_NAME,
>   SLR_SEGMENT_CD,
>   CAL_DT,
>   sum(ITEM_COUNT) CNT
>from TEST_KYLIN_FACT
>where LSTG_FORMAT_NAME = 'ABIN'
>group by LSTG_FORMAT_NAME,
> SLR_SEGMENT_CD,
> CAL_DT
>UNION ALL select 'NON-ABIN' as LSTG_FORMAT_NAME,
> SLR_SEGMENT_CD,
> CAL_DT,
> case
> when SLR_SEGMENT_CD > 1000 then CNT * 2
> else CNT * 3
> end as CNT
>from
>  (select SLR_SEGMENT_CD,
>  CAL_DT,
>  sum(ITEM_COUNT) CNT
>   from TEST_KYLIN_FACT
>   where LSTG_FORMAT_NAME <> 'ABIN'
>   group by SLR_SEGMENT_CD,CAL_DT))
> group by LSTG_FORMAT_NAME,
>  SLR_SEGMENT_CD,
>  CAL_DT
> order by CNT
> {code}





[jira] [Created] (KYLIN-4851) Better to throw exception when lazy query waiting timeout

2020-12-30 Thread Zhong Yanghong (Jira)
Zhong Yanghong created KYLIN-4851:
-

 Summary: Better to throw exception when lazy query waiting timeout
 Key: KYLIN-4851
 URL: https://issues.apache.org/jira/browse/KYLIN-4851
 Project: Kylin
  Issue Type: Improvement
Reporter: Zhong Yanghong
Assignee: Zhong Yanghong








[jira] [Issue Comment Deleted] (KYLIN-4682) java.lang.IndexOutOfBoundsException due to not setting havingFilter correctly

2020-12-30 Thread Zhong Yanghong (Jira)


 [ 
https://issues.apache.org/jira/browse/KYLIN-4682?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Zhong Yanghong updated KYLIN-4682:
--
Comment: was deleted

(was: It seems rule *FilterAggregateTransposeRule* is not effective. It's 
better to set a lower number for computing the cost of *OLAPFilterRel* to make 
filter push down as much as possible.)

> java.lang.IndexOutOfBoundsException due to not setting havingFilter correctly
> -
>
> Key: KYLIN-4682
> URL: https://issues.apache.org/jira/browse/KYLIN-4682
> Project: Kylin
>  Issue Type: Improvement
>Reporter: Zhong Yanghong
>Assignee: Zhong Yanghong
>Priority: Major
>
> SQL:
> {code}
> select LSTG_FORMAT_NAME, LEAF_CATEG_ID, sum(price) as gmv
> from TEST_KYLIN_FACT 
> group by LSTG_FORMAT_NAME, LEAF_CATEG_ID
> having LSTG_FORMAT_NAME = 'Auction'
> {code}
> Error stack trace:
> {code}
> Caused by: java.lang.IndexOutOfBoundsException: Index: 3, Size: 1
>   at java.util.ArrayList.rangeCheck(ArrayList.java:657)
>   at java.util.ArrayList.get(ArrayList.java:433)
>   at org.apache.kylin.storage.gtrecord.GTCubeStorageQueryBase.checkHavingCanPushDown(GTCubeStorageQueryBase.java:553)
>   at org.apache.kylin.storage.gtrecord.GTCubeStorageQueryBase.getStorageQueryRequest(GTCubeStorageQueryBase.java:196)
>   at org.apache.kylin.storage.gtrecord.GTCubeStorageQueryBase.searchInner(GTCubeStorageQueryBase.java:98)
>   at org.apache.kylin.storage.gtrecord.GTCubeStorageQueryBase.search(GTCubeStorageQueryBase.java:90)
>   at org.apache.kylin.storage.hybrid.HybridStorageQuery.search(HybridStorageQuery.java:53)
>   at org.apache.kylin.query.enumerator.OLAPEnumerator.queryStorage(OLAPEnumerator.java:117)
>   at org.apache.kylin.query.enumerator.OLAPEnumerator.moveNext(OLAPEnumerator.java:60)
>   at Baz$1$1.moveNext(Unknown Source)
>   at org.apache.calcite.linq4j.EnumerableDefaults.groupBy_(EnumerableDefaults.java:825)
>   at org.apache.calcite.linq4j.EnumerableDefaults.groupBy(EnumerableDefaults.java:761)
>   at org.apache.calcite.linq4j.DefaultEnumerable.groupBy(DefaultEnumerable.java:302)
>   at Baz.bind(Unknown Source)
>   at org.apache.calcite.jdbc.CalcitePrepare$CalciteSignature.enumerable(CalcitePrepare.java:365)
>   at org.apache.calcite.jdbc.CalciteConnectionImpl.enumerable(CalciteConnectionImpl.java:301)
>   at org.apache.calcite.jdbc.CalciteMetaImpl._createIterable(CalciteMetaImpl.java:559)
>   at org.apache.calcite.jdbc.CalciteMetaImpl.createIterable(CalciteMetaImpl.java:550)
>   at org.apache.calcite.avatica.AvaticaResultSet.execute(AvaticaResultSet.java:182)
>   at org.apache.calcite.jdbc.CalciteResultSet.execute(CalciteResultSet.java:67)
>   at org.apache.calcite.jdbc.CalciteResultSet.execute(CalciteResultSet.java:44)
>   at org.apache.calcite.avatica.AvaticaConnection$1.execute(AvaticaConnection.java:667)
>   at org.apache.calcite.jdbc.CalciteMetaImpl.prepareAndExecute(CalciteMetaImpl.java:619)
>   at org.apache.calcite.avatica.AvaticaConnection.prepareAndExecuteInternal(AvaticaConnection.java:675)
>   at org.apache.calcite.avatica.AvaticaStatement.executeInternal(AvaticaStatement.java:156)
>   ... 81 more
> {code}





[jira] [Assigned] (KYLIN-4658) Union all issue with regarding to windows function & aggregation on

2020-12-30 Thread Zhong Yanghong (Jira)


 [ 
https://issues.apache.org/jira/browse/KYLIN-4658?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Zhong Yanghong reassigned KYLIN-4658:
-

Assignee: Zhong Yanghong  (was: JiangYang)

>  Union all issue with regarding to windows function & aggregation on
> 
>
> Key: KYLIN-4658
> URL: https://issues.apache.org/jira/browse/KYLIN-4658
> Project: Kylin
>  Issue Type: Improvement
>Reporter: Zhong Yanghong
>Assignee: Zhong Yanghong
>Priority: Major
>
> Test SQL:
> {code}
> select CNT, GMV, sum(GMV) over(partition by SLR_SEGMENT_CD) TOTAL_GMV, 
> SLR_SEGMENT_CD, LSTG_FORMAT_NAME
> from 
> (select sum(PRICE) GMV, sum(ITEM_COUNT) CNT, SLR_SEGMENT_CD, LSTG_FORMAT_NAME 
> from TEST_KYLIN_FACT group by SLR_SEGMENT_CD, LSTG_FORMAT_NAME 
> UNION ALL
> select sum(PRICE) GMV, sum(ITEM_COUNT) CNT, SLR_SEGMENT_CD, LSTG_FORMAT_NAME 
> from TEST_KYLIN_FACT group by SLR_SEGMENT_CD, LSTG_FORMAT_NAME) 
> order by TOTAL_GMV
> {code}
>  
> Exception:
> {code}
> Index: 2, Size: 2 while executing SQL: "select * from (select CNT, GMV, 
> sum(GMV) over(partition by SLR_SEGMENT_CD) TOTAL_GMV, SLR_SEGMENT_CD, 
> LSTG_FORMAT_NAME from (select sum(PRICE) GMV, sum(ITEM_COUNT) CNT, 
> SLR_SEGMENT_CD, LSTG_FORMAT_NAME from TEST_KYLIN_FACT group by 
> SLR_SEGMENT_CD, LSTG_FORMAT_NAME UNION ALL select sum(PRICE) GMV, 
> sum(ITEM_COUNT) CNT, SLR_SEGMENT_CD, LSTG_FORMAT_NAME from TEST_KYLIN_FACT 
> group by SLR_SEGMENT_CD, LSTG_FORMAT_NAME) order by TOTAL_GMV) limit 5"
> {code}
> Similar issue for the following sql:
> {code}
> select LSTG_FORMAT_NAME,
>        SLR_SEGMENT_CD,
>        CAL_DT,
>        sum(CNT) as CNT
> from
>   (select LSTG_FORMAT_NAME,
>           SLR_SEGMENT_CD,
>           CAL_DT,
>           sum(ITEM_COUNT) CNT
>    from TEST_KYLIN_FACT
>    where LSTG_FORMAT_NAME = 'ABIN'
>    group by LSTG_FORMAT_NAME,
>             SLR_SEGMENT_CD,
>             CAL_DT
>    UNION ALL select 'NON-ABIN' as LSTG_FORMAT_NAME,
>                     SLR_SEGMENT_CD,
>                     CAL_DT,
>                     case
>                         when SLR_SEGMENT_CD > 1000 then CNT * 2
>                         else CNT * 3
>                     end as CNT
>    from
>      (select SLR_SEGMENT_CD,
>              CAL_DT,
>              sum(ITEM_COUNT) CNT
>       from TEST_KYLIN_FACT
>       where LSTG_FORMAT_NAME <> 'ABIN'
>       group by SLR_SEGMENT_CD, CAL_DT))
> group by LSTG_FORMAT_NAME,
>          SLR_SEGMENT_CD,
>          CAL_DT
> order by CNT
> {code}





[jira] [Commented] (KYLIN-3392) Support NULL value in Sum, Max, Min Aggregation

2020-12-28 Thread Zhong Yanghong (Jira)


[ 
https://issues.apache.org/jira/browse/KYLIN-3392?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=1726#comment-1726
 ] 

Zhong Yanghong commented on KYLIN-3392:
---

Hi [~wangrupeng], I applied the patch to 3.1.0 and it works well. Could you try 
it?

> Support NULL value in Sum, Max, Min Aggregation
> ---
>
> Key: KYLIN-3392
> URL: https://issues.apache.org/jira/browse/KYLIN-3392
> Project: Kylin
>  Issue Type: Bug
>Reporter: Yifei Wu
>Assignee: Yifei Wu
>Priority: Major
> Fix For: Future
>
> Attachments: KYLIN-3392-2.png, KYLIN-3392.png, kylin-3.0.0-alpha2.png
>
>
> Kylin's basic aggregate measures (such as sum, max and min) treat a NULL 
> value as 0. However, it is necessary to distinguish NULL from 0.
> The expected behavior is:
> *sum(null, null) = null*
> *sum(null, 1) = 1*
> *max(null, null) = null*
> *max(null, -1) = -1*
> *min(null, -1) = -1*
>  in accordance with Hive and SparkSQL.
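
The expected semantics listed in the description can be sketched in plain Java. This is only an illustration, not Kylin's measure code: the class name NullAwareAggregates and its methods are hypothetical, and Long stands in for whatever numeric type the measure uses.

```java
import java.util.Arrays;
import java.util.List;
import java.util.Objects;
import java.util.function.BinaryOperator;

// Hypothetical helper, not part of Kylin: null-aware sum/max/min.
final class NullAwareAggregates {
    // Folds the non-null inputs with op; returns null when every input is null.
    static Long aggregate(List<Long> values, BinaryOperator<Long> op) {
        return values.stream()
                .filter(Objects::nonNull)
                .reduce(op)
                .orElse(null);
    }

    static Long sum(Long... values) {
        return aggregate(Arrays.asList(values), Long::sum);
    }

    static Long max(Long... values) {
        return aggregate(Arrays.asList(values), Long::max);
    }

    static Long min(Long... values) {
        return aggregate(Arrays.asList(values), Long::min);
    }
}
```

Here a null input is skipped rather than counted as 0, so an all-null input yields null and a mixed input yields the aggregate of the non-null values.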





[jira] [Commented] (KYLIN-3482) Unclosed SetAndUnsetThreadLocalConfig in SparkCubingByLayer

2020-11-19 Thread Zhong Yanghong (Jira)


[ 
https://issues.apache.org/jira/browse/KYLIN-3482?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17235822#comment-17235822
 ] 

Zhong Yanghong commented on KYLIN-3482:
---

[~shaofengshi], it seems the patch that closes SetAndUnsetThreadLocalConfig has 
a bad effect in *SparkCubingByLayer*. 

For example, after applying the patch:
{code}
public void init() {
    KylinConfig kConfig = AbstractHadoopJob.loadKylinConfigFromHdfs(conf, metaUrl);
    try (KylinConfig.SetAndUnsetThreadLocalConfig autoUnset = KylinConfig
            .setAndUnsetThreadLocalConfig(kConfig)) {
        CubeInstance cubeInstance = CubeManager.getInstance(kConfig).getCube(cubeName);
        cubeDesc = cubeInstance.getDescriptor();
        aggregators = new MeasureAggregators(cubeDesc.getMeasures());
        measureNum = cubeDesc.getMeasures().size();
    }
}
{code}

After init() returns, the thread-local KylinConfig has been removed, so a 
subsequent call to KylinConfig.getInstanceFromEnv() will fail. 
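
The effect can be reproduced outside Kylin with a small self-contained sketch. Everything here is a stand-in: ThreadLocalConfigDemo, Handle, setAndUnset and getFromEnv are hypothetical names, a String replaces KylinConfig, and the sketch assumes there is no other source of configuration once the thread-local value is gone.

```java
// Hypothetical stand-in for a thread-local config with an auto-unset handle.
final class ThreadLocalConfigDemo {
    // Slot holding the configuration bound to the current thread.
    private static final ThreadLocal<String> CURRENT = new ThreadLocal<>();

    // AutoCloseable narrowed so that close() throws no checked exception.
    interface Handle extends AutoCloseable {
        @Override
        void close();
    }

    // Binds the config to the current thread; closing the handle removes it.
    static Handle setAndUnset(String config) {
        CURRENT.set(config);
        return CURRENT::remove;
    }

    // Fails when no config is bound to the current thread.
    static String getFromEnv() {
        String config = CURRENT.get();
        if (config == null) {
            throw new IllegalStateException("no config bound to this thread");
        }
        return config;
    }

    // Same shape as init() above: the config is visible only inside the try block.
    static String init(String config) {
        try (Handle ignored = setAndUnset(config)) {
            return getFromEnv();
        }
    }
}
```

Calling init("cfg") succeeds, but a later getFromEnv() on the same thread throws, because try-with-resources has already closed the handle and cleared the slot.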

> Unclosed SetAndUnsetThreadLocalConfig in SparkCubingByLayer
> ---
>
> Key: KYLIN-3482
> URL: https://issues.apache.org/jira/browse/KYLIN-3482
> Project: Kylin
>  Issue Type: Bug
>Reporter: Ted Yu
>Assignee: Jiatao Tao
>Priority: Minor
> Fix For: v2.5.0
>
>
> Here is related code:
> {code}
> KylinConfig kylinConfig = AbstractHadoopJob.loadKylinConfigFromHdfs(sConf, metaUrl);
> KylinConfig.setAndUnsetThreadLocalConfig(kylinConfig);
> {code}
> The return value from setAndUnsetThreadLocalConfig should be closed.





[jira] [Commented] (KYLIN-3271) Optimize sub-path check of ResourceTool

2020-10-14 Thread Zhong Yanghong (Jira)


[ 
https://issues.apache.org/jira/browse/KYLIN-3271?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17214421#comment-17214421
 ] 

Zhong Yanghong commented on KYLIN-3271:
---

This change blocks the following command:
{code}
${KYLIN_HOME}/bin/metastore.sh fetch /execute
{code}

> Optimize sub-path check of ResourceTool
> ---
>
> Key: KYLIN-3271
> URL: https://issues.apache.org/jira/browse/KYLIN-3271
> Project: Kylin
>  Issue Type: Improvement
>  Components: Metadata
>Affects Versions: v2.2.0
>Reporter: nichunen
>Assignee: nichunen
>Priority: Minor
> Fix For: v2.4.0
>
>
> Kylin uses the class org.apache.kylin.common.persistence.ResourceTool to 
> download, upload, and remove metadata, among other operations. The algorithm 
> for resource traversal is not very efficient. For instance, for an 
> "execute_output" with key "/execute_output/\{uuid}", the algorithm checks 
> whether it is a folder with sub-resources. This adds unnecessary time cost, 
> and for metadata with many jobs the operation may take a long time to finish.





[jira] [Reopened] (KYLIN-3271) Optimize sub-path check of ResourceTool

2020-10-14 Thread Zhong Yanghong (Jira)


 [ 
https://issues.apache.org/jira/browse/KYLIN-3271?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Zhong Yanghong reopened KYLIN-3271:
---

> Optimize sub-path check of ResourceTool
> ---
>
> Key: KYLIN-3271
> URL: https://issues.apache.org/jira/browse/KYLIN-3271
> Project: Kylin
>  Issue Type: Improvement
>  Components: Metadata
>Affects Versions: v2.2.0
>Reporter: nichunen
>Assignee: nichunen
>Priority: Minor
> Fix For: v2.4.0
>
>
> Kylin uses the class org.apache.kylin.common.persistence.ResourceTool to 
> download, upload, and remove metadata, among other operations. The algorithm 
> for resource traversal is not very efficient. For instance, for an 
> "execute_output" with key "/execute_output/\{uuid}", the algorithm checks 
> whether it is a folder with sub-resources. This adds unnecessary time cost, 
> and for metadata with many jobs the operation may take a long time to finish.





[jira] [Commented] (KYLIN-4421) Allow to update table & database name

2020-10-10 Thread Zhong Yanghong (Jira)


[ 
https://issues.apache.org/jira/browse/KYLIN-4421?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17211583#comment-17211583
 ] 

Zhong Yanghong commented on KYLIN-4421:
---

The usage is as follows:
{code:java}
curl -kXPOST 'http://localhost:7070/kylin/api/tables/default/update' \
-H 'Authorization: Basic XX' \
-H 'Content-Type: application/json' \
-d '{
  "mapping": {
    "DEFAULT.KYLIN_SALES": {
      "database": "TEST",
      "tableName": "KYLIN_FACT"
    },
    "DEFAULT.KYLIN_CAL_DT": {
      "tableName": "CAL_DT"
    },
    "DEFAULT.KYLIN_CATEGORY_GROUPINGS": {
      "database": "TEST"
    }
  },
  "isUseExisting": true
}'
{code}

> Allow to update table & database name 
> --
>
> Key: KYLIN-4421
> URL: https://issues.apache.org/jira/browse/KYLIN-4421
> Project: Kylin
>  Issue Type: Sub-task
>Reporter: Zhong Yanghong
>Assignee: Zhong Yanghong
>Priority: Minor
> Fix For: v3.1.0
>
>






[jira] [Updated] (KYLIN-4666) Improve TopNCounter's merge performance

2020-09-30 Thread Zhong Yanghong (Jira)


 [ 
https://issues.apache.org/jira/browse/KYLIN-4666?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Zhong Yanghong updated KYLIN-4666:
--
Description: Currently, we need to sort for every merge operation, which 
costs a lot of time over thousands of merges. It's better to use a slightly 
larger buffer to reduce how often sorting is needed.  (was: It's better to use 
PriorityQueue rather than Collections.sort() to sort elements and find minimum 
value.)

> Improve TopNCounter's merge performance
> ---
>
> Key: KYLIN-4666
> URL: https://issues.apache.org/jira/browse/KYLIN-4666
> Project: Kylin
>  Issue Type: Improvement
>Reporter: Zhong Yanghong
>Assignee: Zhong Yanghong
>Priority: Major
>
> Currently, we need to sort for every merge operation, which costs a lot of 
> time over thousands of merges. It's better to use a slightly larger buffer 
> to reduce how often sorting is needed.





[jira] [Comment Edited] (KYLIN-4080) Project schema update event causes error reload NEW DataModelDesc

2020-09-29 Thread Zhong Yanghong (Jira)


[ 
https://issues.apache.org/jira/browse/KYLIN-4080?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17203848#comment-17203848
 ] 

Zhong Yanghong edited comment on KYLIN-4080 at 9/29/20, 10:33 AM:
--

This fix will break cube migration when the destination project is different 
from the source one, because of the additional attribute *projectName* in 
DataModelDesc.


was (Author: yaho):
This fix will break cube migration when the destination project is different 
from the source one.

> Project schema update event causes error reload NEW DataModelDesc
> -
>
> Key: KYLIN-4080
> URL: https://issues.apache.org/jira/browse/KYLIN-4080
> Project: Kylin
>  Issue Type: Bug
>  Components: Metadata
>Affects Versions: v2.5.2
>Reporter: Yuzhang QIU
>Assignee: Yuzhang QIU
>Priority: Blocker
> Fix For: v2.6.5, v3.1.0, v3.0.1
>
>
> Hi, dear Kylin dev team:
>    When creating a new DataModelDesc, DataModelManager.createDataModelDesc:246 
> temporarily adds the new model name to the cache of the selected project 
> (project1), but doesn't persist it. This TEMPORARY ADD operation lets the 
> model reload succeed rather than throw the "No project found for model ..." 
> exception (at ProjectManager:391).
>    However, if another thread is processing "Broadcasting update 
> project_schema, project1", it will clean up the cache of project1 and reload 
> it, which undoes the "TEMPORARY ADD" operation. Meanwhile, the model saving 
> thread has persisted the DataModelDesc and starts to reload it, but finds 
> that there is "No project for this model".
>    The new model can't be created again because of the conflicting timestamp, 
> and can't be reloaded into the cache because of the above problem. 
>    What do you think about this?
>   
>Best regards
>   
>yuzhang





[jira] [Commented] (KYLIN-4080) Project schema update event causes error reload NEW DataModelDesc

2020-09-29 Thread Zhong Yanghong (Jira)


[ 
https://issues.apache.org/jira/browse/KYLIN-4080?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17203848#comment-17203848
 ] 

Zhong Yanghong commented on KYLIN-4080:
---

This fix will break cube migration when the destination project is different 
from the source one.

> Project schema update event causes error reload NEW DataModelDesc
> -
>
> Key: KYLIN-4080
> URL: https://issues.apache.org/jira/browse/KYLIN-4080
> Project: Kylin
>  Issue Type: Bug
>  Components: Metadata
>Affects Versions: v2.5.2
>Reporter: Yuzhang QIU
>Assignee: Yuzhang QIU
>Priority: Blocker
> Fix For: v2.6.5, v3.1.0, v3.0.1
>
>
> Hi, dear Kylin dev team:
>    When creating a new DataModelDesc, DataModelManager.createDataModelDesc:246 
> temporarily adds the new model name to the cache of the selected project 
> (project1), but doesn't persist it. This TEMPORARY ADD operation lets the 
> model reload succeed rather than throw the "No project found for model ..." 
> exception (at ProjectManager:391).
>    However, if another thread is processing "Broadcasting update 
> project_schema, project1", it will clean up the cache of project1 and reload 
> it, which undoes the "TEMPORARY ADD" operation. Meanwhile, the model saving 
> thread has persisted the DataModelDesc and starts to reload it, but finds 
> that there is "No project for this model".
>    The new model can't be created again because of the conflicting timestamp, 
> and can't be reloaded into the cache because of the above problem. 
>    What do you think about this?
>   
>Best regards
>   
>yuzhang





[jira] [Reopened] (KYLIN-4080) Project schema update event causes error reload NEW DataModelDesc

2020-09-29 Thread Zhong Yanghong (Jira)


 [ 
https://issues.apache.org/jira/browse/KYLIN-4080?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Zhong Yanghong reopened KYLIN-4080:
---

> Project schema update event causes error reload NEW DataModelDesc
> -
>
> Key: KYLIN-4080
> URL: https://issues.apache.org/jira/browse/KYLIN-4080
> Project: Kylin
>  Issue Type: Bug
>  Components: Metadata
>Affects Versions: v2.5.2
>Reporter: Yuzhang QIU
>Assignee: Yuzhang QIU
>Priority: Blocker
> Fix For: v2.6.5, v3.1.0, v3.0.1
>
>
> Hi, dear Kylin dev team:
>    When creating a new DataModelDesc, DataModelManager.createDataModelDesc:246 
> temporarily adds the new model name to the cache of the selected project 
> (project1), but doesn't persist it. This TEMPORARY ADD operation lets the 
> model reload succeed rather than throw the "No project found for model ..." 
> exception (at ProjectManager:391).
>    However, if another thread is processing "Broadcasting update 
> project_schema, project1", it will clean up the cache of project1 and reload 
> it, which undoes the "TEMPORARY ADD" operation. Meanwhile, the model saving 
> thread has persisted the DataModelDesc and starts to reload it, but finds 
> that there is "No project for this model".
>    The new model can't be created again because of the conflicting timestamp, 
> and can't be reloaded into the cache because of the above problem. 
>    What do you think about this?
>   
>Best regards
>   
>yuzhang





[jira] [Commented] (KYLIN-4282) support case when in count (distinct)

2020-09-29 Thread Zhong Yanghong (Jira)


[ 
https://issues.apache.org/jira/browse/KYLIN-4282?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17203784#comment-17203784
 ] 

Zhong Yanghong commented on KYLIN-4282:
---

This feature mainly addresses a limitation of count (distinct) combined with 
filters. Without it, a user who wants to query two count (distinct) measures 
with different filters has to run two subqueries and then join them to combine 
the results, as follows:
{code}
select T1.col_a, T1.cm1, T2.cm2 
from
(select col_a, count(distinct m1) as cm1 from T where f1... group by 1) T1
inner join
(select col_a, count(distinct m2) as cm2 from T where f2... group by 1) T2
on T1.col_a = T2.col_a
{code}

With this feature, a single query is enough:
{code}
select col_a,
       count(distinct case when f1 then m1 end) as cm1,
       count(distinct case when f2 then m2 end) as cm2
from T
group by 1
{code}
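
The reason the single query gives the same counts can be sketched in Java: a case-when without an else branch yields null for rows that fail the filter, and count (distinct) ignores nulls. The class and method names below are hypothetical, and the sketch ignores the group-by on col_a.

```java
import java.util.List;
import java.util.Objects;
import java.util.function.Function;
import java.util.function.Predicate;

// Hypothetical illustration of count(distinct case when <filter> then <measure> end).
final class CountDistinctCaseWhen {
    static <R, V> long countDistinctWhen(List<R> rows, Predicate<R> filter, Function<R, V> measure) {
        return rows.stream()
                // case when filter then measure end: rows failing the filter become null
                .map(r -> filter.test(r) ? measure.apply(r) : null)
                // count(distinct ...) skips nulls
                .filter(Objects::nonNull)
                .distinct()
                .count();
    }
}
```

Because each measure carries its own filter, several such counts can be evaluated over the same scan of T instead of one subquery per filter.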

> support case when in count (distinct)
> -
>
> Key: KYLIN-4282
> URL: https://issues.apache.org/jira/browse/KYLIN-4282
> Project: Kylin
>  Issue Type: Improvement
>Reporter: Zhong Yanghong
>Assignee: Zhong Yanghong
>Priority: Major
> Fix For: v3.1.0
>
>






[jira] [Updated] (KYLIN-4779) Use TreeCuboidScheduler even when cube planner is not enabled for query

2020-09-28 Thread Zhong Yanghong (Jira)


 [ 
https://issues.apache.org/jira/browse/KYLIN-4779?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Zhong Yanghong updated KYLIN-4779:
--
Description: 
Using InitialCuboidScheduler instead of the stats-based TreeCuboidScheduler is 
not good for finding the best parent cuboid for query performance. For example, 
a query with source cuboid 0X0200 has two parent cuboids:
 * 0X0302 with row count 560K
 * 0X0304 with row count 40

If we use InitialCuboidScheduler, it will choose 0X0302 as the target cuboid 
for this query. It's obviously better to choose 0X0304.
||Heading 1||Heading 2||
|!0X0302.png|width=400,height=400!|!0X0304.png|width=400,height=400!|

  was:Use InitialCuboidScheduler instead of stats-based 


> Use TreeCuboidScheduler even when cube planner is not enabled for query
> ---
>
> Key: KYLIN-4779
> URL: https://issues.apache.org/jira/browse/KYLIN-4779
> Project: Kylin
>  Issue Type: Improvement
>Reporter: Zhong Yanghong
>Assignee: Zhong Yanghong
>Priority: Major
> Attachments: 0X0302.png, 0X0304.png
>
>
> Using InitialCuboidScheduler instead of the stats-based TreeCuboidScheduler 
> is not good for finding the best parent cuboid for query performance. For 
> example, a query with source cuboid 0X0200 has two parent cuboids:
>  * 0X0302 with row count 560K
>  * 0X0304 with row count 40
> If we use InitialCuboidScheduler, it will choose 0X0302 as the target cuboid 
> for this query. It's obviously better to choose 0X0304.
> ||Heading 1||Heading 2||
> |!0X0302.png|width=400,height=400!|!0X0304.png|width=400,height=400!|
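
The stats-based choice described above can be sketched roughly as follows. This is a simplification and not the TreeCuboidScheduler code: ParentCuboidPicker and its methods are hypothetical, cuboids are modelled as dimension bitmasks, and the row counts are assumed to come from cube statistics.

```java
import java.util.Map;
import java.util.Optional;

// Hypothetical sketch: pick the covering cuboid with the fewest rows.
final class ParentCuboidPicker {
    // A candidate can serve the query if its bitmask contains every queried dimension.
    static boolean covers(long candidate, long queryCuboid) {
        return (candidate & queryCuboid) == queryCuboid;
    }

    // Among the covering cuboids, return the one with the smallest row count.
    static Optional<Long> cheapestParent(long queryCuboid, Map<Long, Long> rowCounts) {
        return rowCounts.entrySet().stream()
                .filter(e -> covers(e.getKey(), queryCuboid))
                .min(Map.Entry.comparingByValue())
                .map(Map.Entry::getKey);
    }
}
```

With row counts of 560,000 for 0x0302 and 40 for 0x0304, a query on 0x0200 resolves to 0x0304 under this rule.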





[jira] [Updated] (KYLIN-4779) Use TreeCuboidScheduler even when cube planner is not enabled for query

2020-09-28 Thread Zhong Yanghong (Jira)


 [ 
https://issues.apache.org/jira/browse/KYLIN-4779?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Zhong Yanghong updated KYLIN-4779:
--
Attachment: 0X0302.png
0X0304.png

> Use TreeCuboidScheduler even when cube planner is not enabled for query
> ---
>
> Key: KYLIN-4779
> URL: https://issues.apache.org/jira/browse/KYLIN-4779
> Project: Kylin
>  Issue Type: Improvement
>Reporter: Zhong Yanghong
>Assignee: Zhong Yanghong
>Priority: Major
> Attachments: 0X0302.png, 0X0304.png
>
>
> Use InitialCuboidScheduler instead of stats-based 





[jira] [Updated] (KYLIN-4779) Use TreeCuboidScheduler even when cube planner is not enabled for query

2020-09-28 Thread Zhong Yanghong (Jira)


 [ 
https://issues.apache.org/jira/browse/KYLIN-4779?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Zhong Yanghong updated KYLIN-4779:
--
Description: Use InitialCuboidScheduler instead of stats-based 

> Use TreeCuboidScheduler even when cube planner is not enabled for query
> ---
>
> Key: KYLIN-4779
> URL: https://issues.apache.org/jira/browse/KYLIN-4779
> Project: Kylin
>  Issue Type: Improvement
>Reporter: Zhong Yanghong
>Assignee: Zhong Yanghong
>Priority: Major
>
> Use InitialCuboidScheduler instead of stats-based 





[jira] [Created] (KYLIN-4779) Use TreeCuboidScheduler even when cube planner is not enabled for query

2020-09-28 Thread Zhong Yanghong (Jira)
Zhong Yanghong created KYLIN-4779:
-

 Summary: Use TreeCuboidScheduler even when cube planner is not 
enabled for query
 Key: KYLIN-4779
 URL: https://issues.apache.org/jira/browse/KYLIN-4779
 Project: Kylin
  Issue Type: Improvement
Reporter: Zhong Yanghong
Assignee: Zhong Yanghong








[jira] [Updated] (KYLIN-4769) Make supportAppend to be true for hdfs federation in HiveProducer

2020-09-21 Thread Zhong Yanghong (Jira)


 [ 
https://issues.apache.org/jira/browse/KYLIN-4769?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Zhong Yanghong updated KYLIN-4769:
--
Summary: Make supportAppend to be true for hdfs federation in HiveProducer  
(was: Make supportAppend to be true for hdfs federation)

> Make supportAppend to be true for hdfs federation in HiveProducer
> -
>
> Key: KYLIN-4769
> URL: https://issues.apache.org/jira/browse/KYLIN-4769
> Project: Kylin
>  Issue Type: Improvement
>Reporter: Zhong Yanghong
>Assignee: Zhong Yanghong
>Priority: Major
>






[jira] [Created] (KYLIN-4769) Make supportAppend to be true for hdfs federation

2020-09-21 Thread Zhong Yanghong (Jira)
Zhong Yanghong created KYLIN-4769:
-

 Summary: Make supportAppend to be true for hdfs federation
 Key: KYLIN-4769
 URL: https://issues.apache.org/jira/browse/KYLIN-4769
 Project: Kylin
  Issue Type: Improvement
Reporter: Zhong Yanghong
Assignee: Zhong Yanghong








[jira] [Created] (KYLIN-4758) Introduce a new measure to allow input to be negative for topn

2020-09-14 Thread Zhong Yanghong (Jira)
Zhong Yanghong created KYLIN-4758:
-

 Summary: Introduce a new measure to allow input to be negative for 
topn 
 Key: KYLIN-4758
 URL: https://issues.apache.org/jira/browse/KYLIN-4758
 Project: Kylin
  Issue Type: Improvement
Reporter: Zhong Yanghong
Assignee: Zhong Yanghong








[jira] [Updated] (KYLIN-4666) Improve TopNCounter's merge performance

2020-09-14 Thread Zhong Yanghong (Jira)


 [ 
https://issues.apache.org/jira/browse/KYLIN-4666?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Zhong Yanghong updated KYLIN-4666:
--
Description: It's better to use PriorityQueue rather than 
Collections.sort() to sort elements and find minimum value.

> Improve TopNCounter's merge performance
> ---
>
> Key: KYLIN-4666
> URL: https://issues.apache.org/jira/browse/KYLIN-4666
> Project: Kylin
>  Issue Type: Improvement
>Reporter: Zhong Yanghong
>Assignee: Zhong Yanghong
>Priority: Major
>
> It's better to use PriorityQueue rather than Collections.sort() to sort 
> elements and find minimum value.





[jira] [Assigned] (KYLIN-4658) Union all issue with regarding to windows function & aggregation on

2020-09-10 Thread Zhong Yanghong (Jira)


 [ 
https://issues.apache.org/jira/browse/KYLIN-4658?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Zhong Yanghong reassigned KYLIN-4658:
-

Assignee: JiangYang  (was: Zhong Yanghong)

>  Union all issue with regarding to windows function & aggregation on
> 
>
> Key: KYLIN-4658
> URL: https://issues.apache.org/jira/browse/KYLIN-4658
> Project: Kylin
>  Issue Type: Improvement
>Reporter: Zhong Yanghong
>Assignee: JiangYang
>Priority: Major
>
> Test SQL:
> {code}
> select CNT, GMV, sum(GMV) over(partition by SLR_SEGMENT_CD) TOTAL_GMV, 
> SLR_SEGMENT_CD, LSTG_FORMAT_NAME
> from 
> (select sum(PRICE) GMV, sum(ITEM_COUNT) CNT, SLR_SEGMENT_CD, LSTG_FORMAT_NAME 
> from TEST_KYLIN_FACT group by SLR_SEGMENT_CD, LSTG_FORMAT_NAME 
> UNION ALL
> select sum(PRICE) GMV, sum(ITEM_COUNT) CNT, SLR_SEGMENT_CD, LSTG_FORMAT_NAME 
> from TEST_KYLIN_FACT group by SLR_SEGMENT_CD, LSTG_FORMAT_NAME) 
> order by TOTAL_GMV
> {code}
>  
> Exception:
> {code}
> Index: 2, Size: 2 while executing SQL: "select * from (select CNT, GMV, 
> sum(GMV) over(partition by SLR_SEGMENT_CD) TOTAL_GMV, SLR_SEGMENT_CD, 
> LSTG_FORMAT_NAME from (select sum(PRICE) GMV, sum(ITEM_COUNT) CNT, 
> SLR_SEGMENT_CD, LSTG_FORMAT_NAME from TEST_KYLIN_FACT group by 
> SLR_SEGMENT_CD, LSTG_FORMAT_NAME UNION ALL select sum(PRICE) GMV, 
> sum(ITEM_COUNT) CNT, SLR_SEGMENT_CD, LSTG_FORMAT_NAME from TEST_KYLIN_FACT 
> group by SLR_SEGMENT_CD, LSTG_FORMAT_NAME) order by TOTAL_GMV) limit 5"
> {code}
> Similar issue for the following sql:
> {code}
> select LSTG_FORMAT_NAME,
>        SLR_SEGMENT_CD,
>        CAL_DT,
>        sum(CNT) as CNT
> from
>   (select LSTG_FORMAT_NAME,
>           SLR_SEGMENT_CD,
>           CAL_DT,
>           sum(ITEM_COUNT) CNT
>    from TEST_KYLIN_FACT
>    where LSTG_FORMAT_NAME = 'ABIN'
>    group by LSTG_FORMAT_NAME,
>             SLR_SEGMENT_CD,
>             CAL_DT
>    UNION ALL select 'NON-ABIN' as LSTG_FORMAT_NAME,
>                     SLR_SEGMENT_CD,
>                     CAL_DT,
>                     case
>                         when SLR_SEGMENT_CD > 1000 then CNT * 2
>                         else CNT * 3
>                     end as CNT
>    from
>      (select SLR_SEGMENT_CD,
>              CAL_DT,
>              sum(ITEM_COUNT) CNT
>       from TEST_KYLIN_FACT
>       where LSTG_FORMAT_NAME <> 'ABIN'
>       group by SLR_SEGMENT_CD, CAL_DT))
> group by LSTG_FORMAT_NAME,
>          SLR_SEGMENT_CD,
>          CAL_DT
> order by CNT
> {code}





[jira] [Assigned] (KYLIN-4651) TopN does not work when force hit cube enabled

2020-09-10 Thread Zhong Yanghong (Jira)


 [ 
https://issues.apache.org/jira/browse/KYLIN-4651?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Zhong Yanghong reassigned KYLIN-4651:
-

Assignee: JiangYang  (was: Zhong Yanghong)

> TopN does not work when force hit cube enabled
> --
>
> Key: KYLIN-4651
> URL: https://issues.apache.org/jira/browse/KYLIN-4651
> Project: Kylin
>  Issue Type: Improvement
>Reporter: Zhong Yanghong
>Assignee: JiangYang
>Priority: Major
>






[jira] [Assigned] (KYLIN-4637) Fix sum(null) issue for decimal

2020-09-10 Thread Zhong Yanghong (Jira)


 [ 
https://issues.apache.org/jira/browse/KYLIN-4637?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Zhong Yanghong reassigned KYLIN-4637:
-

Assignee: JiangYang  (was: Zhong Yanghong)

> Fix sum(null) issue for decimal
> ---
>
> Key: KYLIN-4637
> URL: https://issues.apache.org/jira/browse/KYLIN-4637
> Project: Kylin
>  Issue Type: Improvement
>Reporter: Zhong Yanghong
>Assignee: JiangYang
>Priority: Major
>






[jira] [Assigned] (KYLIN-4667) Automatically set kylin.query.cache-signature-enabled to be true when memcached is enabled

2020-09-10 Thread Zhong Yanghong (Jira)


 [ 
https://issues.apache.org/jira/browse/KYLIN-4667?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Zhong Yanghong reassigned KYLIN-4667:
-

Assignee: JiangYang  (was: Zhong Yanghong)

> Automatically set kylin.query.cache-signature-enabled to be true when 
> memcached is enabled
> --
>
> Key: KYLIN-4667
> URL: https://issues.apache.org/jira/browse/KYLIN-4667
> Project: Kylin
>  Issue Type: Improvement
>Reporter: Zhong Yanghong
>Assignee: JiangYang
>Priority: Major
>






[jira] [Assigned] (KYLIN-4702) Missing cube-level lookup table snapshot when doing cube migration

2020-09-10 Thread Zhong Yanghong (Jira)


 [ 
https://issues.apache.org/jira/browse/KYLIN-4702?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Zhong Yanghong reassigned KYLIN-4702:
-

Assignee: JiangYang  (was: Zhong Yanghong)

> Missing cube-level lookup table snapshot when doing cube migration
> --
>
> Key: KYLIN-4702
> URL: https://issues.apache.org/jira/browse/KYLIN-4702
> Project: Kylin
>  Issue Type: Improvement
>Reporter: Zhong Yanghong
>Assignee: JiangYang
>Priority: Major
>






[jira] [Assigned] (KYLIN-4639) Make batch account without any authorities to be able to see web pages

2020-09-10 Thread Zhong Yanghong (Jira)


 [ 
https://issues.apache.org/jira/browse/KYLIN-4639?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Zhong Yanghong reassigned KYLIN-4639:
-

Assignee: JiangYang  (was: Zhong Yanghong)

> Make batch account without any authorities to be able to see web pages
> --
>
> Key: KYLIN-4639
> URL: https://issues.apache.org/jira/browse/KYLIN-4639
> Project: Kylin
>  Issue Type: Improvement
>Reporter: Zhong Yanghong
>Assignee: JiangYang
>Priority: Major
>






[jira] [Assigned] (KYLIN-4697) User info update logic is not correct

2020-09-10 Thread Zhong Yanghong (Jira)


 [ 
https://issues.apache.org/jira/browse/KYLIN-4697?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Zhong Yanghong reassigned KYLIN-4697:
-

Assignee: JiangYang  (was: Zhong Yanghong)

> User info update logic is not correct
> -
>
> Key: KYLIN-4697
> URL: https://issues.apache.org/jira/browse/KYLIN-4697
> Project: Kylin
>  Issue Type: Bug
>Reporter: Zhong Yanghong
>Assignee: JiangYang
>Priority: Major
>
> There are mainly two issues:
> * The logic for KylinAuthenticationProvider.needUpdateUser() is not correct 
> due to not considering ALL_USERS
> * The logic of updateUser in some places is not correct due to not following 
> copy on write. 



--
This message was sent by Atlassian Jira
(v8.3.4#803005)


[jira] [Assigned] (KYLIN-4636) Make /api/admin/public_config callable for profile saml

2020-09-10 Thread Zhong Yanghong (Jira)


 [ 
https://issues.apache.org/jira/browse/KYLIN-4636?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Zhong Yanghong reassigned KYLIN-4636:
-

Assignee: JiangYang  (was: Zhong Yanghong)

> Make /api/admin/public_config callable for profile saml
> ---
>
> Key: KYLIN-4636
> URL: https://issues.apache.org/jira/browse/KYLIN-4636
> Project: Kylin
>  Issue Type: Improvement
>Reporter: Zhong Yanghong
>Assignee: JiangYang
>Priority: Major
>






[jira] [Assigned] (KYLIN-4752) Refine server mode checking

2020-09-10 Thread Zhong Yanghong (Jira)


 [ 
https://issues.apache.org/jira/browse/KYLIN-4752?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Zhong Yanghong reassigned KYLIN-4752:
-

Assignee: JiangYang  (was: Zhong Yanghong)

> Refine server mode checking
> ---
>
> Key: KYLIN-4752
> URL: https://issues.apache.org/jira/browse/KYLIN-4752
> Project: Kylin
>  Issue Type: Improvement
>Reporter: Zhong Yanghong
>Assignee: JiangYang
>Priority: Major
>
> It's better to use *org.apache.kylin.common.util.ServerMode* for server mode 
> checking.





[jira] [Updated] (KYLIN-4752) Refine server mode checking

2020-09-09 Thread Zhong Yanghong (Jira)


 [ 
https://issues.apache.org/jira/browse/KYLIN-4752?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Zhong Yanghong updated KYLIN-4752:
--
Description: It's better to use *org.apache.kylin.common.util.ServerMode* 
for server mode checking.

> Refine server mode checking
> ---
>
> Key: KYLIN-4752
> URL: https://issues.apache.org/jira/browse/KYLIN-4752
> Project: Kylin
>  Issue Type: Improvement
>Reporter: Zhong Yanghong
>Assignee: Zhong Yanghong
>Priority: Major
>
> It's better to use *org.apache.kylin.common.util.ServerMode* for server mode 
> checking.





[jira] [Created] (KYLIN-4752) Refine server mode checking

2020-09-09 Thread Zhong Yanghong (Jira)
Zhong Yanghong created KYLIN-4752:
-

 Summary: Refine server mode checking
 Key: KYLIN-4752
 URL: https://issues.apache.org/jira/browse/KYLIN-4752
 Project: Kylin
  Issue Type: Improvement
Reporter: Zhong Yanghong
Assignee: Zhong Yanghong








[jira] [Commented] (KYLIN-3359) Support sum(expression) if possible

2020-09-08 Thread Zhong Yanghong (Jira)


[ 
https://issues.apache.org/jira/browse/KYLIN-3359?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17192525#comment-17192525
 ] 

Zhong Yanghong commented on KYLIN-3359:
---

Hi [~mzz_q], did you set *kylin.query.enable-dynamic-column* to true at the 
project level?

> Support sum(expression) if possible
> ---
>
> Key: KYLIN-3359
> URL: https://issues.apache.org/jira/browse/KYLIN-3359
> Project: Kylin
>  Issue Type: Sub-task
>  Components: Query Engine
>Reporter: Zhong Yanghong
>Assignee: Zhong Yanghong
>Priority: Major
> Fix For: v2.4.0
>
> Attachments: KYLIN-3359-Hive-query.png, KYLIN-3359-Kylin-query.png
>
>
> The expression can be as follows:
>  # a ~1~*col ~1~ + a ~2~*col ~2~ + ... + a ~n~*col ~n~ + b, if sum(col 
> ~1~),sum(col ~2~),...sum(col ~n~) are defined
>  # case when {{filter}} ~1~ then expr ~1~
>  when {{filter}} ~2~ then expr ~2~
>  ...
>  else expr ~N~
>  end, if {{filter}} ~1~,{{filter}} ~2~, ... {{filter}} ~N-1~, and expr 
> ~1~,expr ~2~,...expr ~N~ are supported 
> There's a constraint on the filters: it must be possible to push down the 
> filters used in the case-when expression.



--
This message was sent by Atlassian Jira
(v8.3.4#803005)


[jira] [Updated] (KYLIN-4620) sum(expression) support should be limited since it's not conform the associative law of addition in standard sql

2020-09-08 Thread Zhong Yanghong (Jira)


 [ 
https://issues.apache.org/jira/browse/KYLIN-4620?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Zhong Yanghong updated KYLIN-4620:
--
Description: 
In standard SQL, there's an edge case when evaluating an expression over a 
single row. For example, 
{code:java}
${col1} + ${col2}

{code}
if ${col1} or ${col2} is null, the result of this expression should be null. 
Therefore, the sum aggregation function does not conform to the associative law 
of addition. That is
{code:java}
sum(col1) + sum(col2) != sum(col1 + col2) 

{code}
 

To support sum(col1 + col2), we have to predefine it if null values may exist 
in the related columns.

If you want to enable sum(expression) while treating null as 0, you need to set 
*kylin.query.is-null-as-zero-in-expression* to true at the project level.

  was:
In standard sql, there's an edge case for the calculation of expression for a 
single row. For example, 
{code:java}
${col1} + ${col2}

{code}
if ${col1} or ${col2} is null, the result of this expression is null. 
Therefore, when null values are present, the sum aggregation function does not 
conform to the associative law of addition. That is
{code:java}
sum(col1) + sum(col2) != sum(col1 + col2) 

{code}
 

To support sum(col1 + col2), we have to predefine it if null values may 
exist in the related columns.

To enable sum(expression) while treating null as 0, set 
kylin.query.is-null-as-zero-in-expression to true at the project level.


> sum(expression) support should be limited since it's not conform the 
> associative law of addition in standard sql
> 
>
> Key: KYLIN-4620
> URL: https://issues.apache.org/jira/browse/KYLIN-4620
> Project: Kylin
>  Issue Type: Improvement
>Reporter: Zhong Yanghong
>Assignee: Zhong Yanghong
>Priority: Major
>
> In standard sql, there's an edge case for the calculation of expression for a 
> single row. For example, 
> {code:java}
> ${col1} + ${col2}
> {code}
> if ${col1} or ${col2} is null, the result of this expression is null. 
> Therefore, when null values are present, the sum aggregation function does 
> not conform to the associative law of addition. That is
> {code:java}
> sum(col1) + sum(col2) != sum(col1 + col2) 
> {code}
>  
> To support sum(col1 + col2), we have to predefine it if null values may 
> exist in the related columns.
> To enable sum(expression) while treating null as 0, set 
> *kylin.query.is-null-as-zero-in-expression* to true at the project level.
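
The edge case described above can be reproduced with a small Java sketch that models SQL null handling with boxed Long values. The class, the method names and the nullAsZero flag are hypothetical; the flag only mimics the idea of treating null as 0 and is not Kylin's implementation of the config.

```java
import java.util.Arrays;
import java.util.List;

public class NullSumDemo {
    /** SQL-style sum(col): nulls are skipped; the result is null if every value is null. */
    static Long sum(List<Long> col) {
        Long total = null;
        for (Long v : col) {
            if (v != null) {
                total = (total == null) ? v : Long.valueOf(total + v);
            }
        }
        return total;
    }

    /** SQL-style row-level addition: null if either operand is null. */
    static Long add(Long a, Long b) {
        return (a == null || b == null) ? null : Long.valueOf(a + b);
    }

    /** sum(col1 + col2), optionally replacing nulls with 0 before adding (hypothetical flag). */
    static Long sumOfRowExpression(List<Long> col1, List<Long> col2, boolean nullAsZero) {
        Long total = null;
        for (int i = 0; i < col1.size(); i++) {
            Long a = col1.get(i);
            Long b = col2.get(i);
            if (nullAsZero) {
                a = (a == null) ? Long.valueOf(0L) : a;
                b = (b == null) ? Long.valueOf(0L) : b;
            }
            Long rowValue = add(a, b);
            if (rowValue != null) {
                total = (total == null) ? rowValue : Long.valueOf(total + rowValue);
            }
        }
        return total;
    }

    public static void main(String[] args) {
        List<Long> col1 = Arrays.asList(1L, 2L, null);
        List<Long> col2 = Arrays.asList(10L, null, 30L);
        System.out.println(add(sum(col1), sum(col2)));            // prints 43
        System.out.println(sumOfRowExpression(col1, col2, false)); // prints 11
        System.out.println(sumOfRowExpression(col1, col2, true));  // prints 43
    }
}
```

Only the first row has both operands present, so the strict row-level sum is 11, while the sum of sums (and the null-as-zero variant) is 43.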





[jira] [Updated] (KYLIN-4620) sum(expression) support should be limited since it's not conform the associative law of addition in standard sql

2020-09-08 Thread Zhong Yanghong (Jira)


 [ 
https://issues.apache.org/jira/browse/KYLIN-4620?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Zhong Yanghong updated KYLIN-4620:
--
Description: 
In standard sql, there's an edge case for the calculation of expression for a 
single row. For example, 
{code:java}
${col1} + ${col2}

{code}
if ${col1} or ${col2} is null, the result of this expression is null. 
Therefore, when null values are present, the sum aggregation function does not 
conform to the associative law of addition. That is
{code:java}
sum(col1) + sum(col2) != sum(col1 + col2) 

{code}
 

To support sum(col1 + col2), we have to predefine it if null values may 
exist in the related columns.

To enable sum(expression) while treating null as 0, set 
kylin.query.is-null-as-zero-in-expression to true at the project level.

  was:
In standard sql, there's an edge case for the calculation of expression for a 
single row. For example, 
{code:java}
${col1} + ${col2}

{code}
if ${col1} or ${col2} is null, the result of this expression is null. 
Therefore, when null values are present, the sum aggregation function does not 
conform to the associative law of addition. That is
{code:java}
sum(col1) + sum(col2) != sum(col1 + col2) 

{code}
 

To support sum(col1 + col2), we have to predefine it if null values may 
exist in the related columns.


> sum(expression) support should be limited since it's not conform the 
> associative law of addition in standard sql
> 
>
> Key: KYLIN-4620
> URL: https://issues.apache.org/jira/browse/KYLIN-4620
> Project: Kylin
>  Issue Type: Improvement
>Reporter: Zhong Yanghong
>Assignee: Zhong Yanghong
>Priority: Major
>
> In standard sql, there's an edge case for the calculation of expression for a 
> single row. For example, 
> {code:java}
> ${col1} + ${col2}
> {code}
> if ${col1} or ${col2} is null, the result of this expression is null. 
> Therefore, when null values are present, the sum aggregation function does 
> not conform to the associative law of addition. That is
> {code:java}
> sum(col1) + sum(col2) != sum(col1 + col2) 
> {code}
>  
> To support sum(col1 + col2), we have to predefine it if null values may 
> exist in the related columns.
> To enable sum(expression) while treating null as 0, set 
> kylin.query.is-null-as-zero-in-expression to true at the project level.





[jira] [Commented] (KYLIN-3359) Support sum(expression) if possible

2020-09-08 Thread Zhong Yanghong (Jira)


[ 
https://issues.apache.org/jira/browse/KYLIN-3359?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17192167#comment-17192167
 ] 

Zhong Yanghong commented on KYLIN-3359:
---

Hi [~mzz_q], please see [KYLIN-4620]. If you want to enable sum(expression), 
you need to set *kylin.query.is-null-as-zero-in-expression* to true at the 
project level.

> Support sum(expression) if possible
> ---
>
> Key: KYLIN-3359
> URL: https://issues.apache.org/jira/browse/KYLIN-3359
> Project: Kylin
>  Issue Type: Sub-task
>  Components: Query Engine
>Reporter: Zhong Yanghong
>Assignee: Zhong Yanghong
>Priority: Major
> Fix For: v2.4.0
>
> Attachments: KYLIN-3359-Hive-query.png, KYLIN-3359-Kylin-query.png
>
>
> The expression can be as follows:
>  # a ~1~*col ~1~ + a ~2~*col ~2~ + ... + a ~n~*col ~n~ + b, if sum(col 
> ~1~),sum(col ~2~),...sum(col ~n~) are defined
>  # case when {{filter}} ~1~ then expr ~1~
>  when {{filter}} ~2~ then expr ~2~
>  ...
>  else expr ~N~
>  end, if {{filter}} ~1~,{{filter}} ~2~, ... {{filter}} ~N-1~, and expr 
> ~1~,expr ~2~,...expr ~N~ are supported 
> There's a constraint on the filters: it must be possible to push down the 
> related filters in the case when.
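
The second form above can be sketched in Java as follows: a sum over a case when expression equals one filtered sum per branch, which is why each branch filter needs to be pushable. The Row class, the method names and the sample data are hypothetical; the column names are borrowed from the queries elsewhere in this thread.

```java
import java.util.Arrays;
import java.util.List;

public class CaseWhenSumDemo {
    /** Minimal fact row with the three columns used below. */
    static final class Row {
        final String lstgFormatName;
        final long price;
        final long itemCount;

        Row(String lstgFormatName, long price, long itemCount) {
            this.lstgFormatName = lstgFormatName;
            this.price = price;
            this.itemCount = itemCount;
        }
    }

    /** sum(case when LSTG_FORMAT_NAME = format then PRICE else ITEM_COUNT end), row by row. */
    static long sumCaseWhen(List<Row> rows, String format) {
        long total = 0;
        for (Row r : rows) {
            total += format.equals(r.lstgFormatName) ? r.price : r.itemCount;
        }
        return total;
    }

    /** Equivalent form: one filtered sum per branch, added together afterwards. */
    static long sumByBranches(List<Row> rows, String format) {
        long whenBranch = rows.stream()
                .filter(r -> format.equals(r.lstgFormatName))
                .mapToLong(r -> r.price)
                .sum();
        long elseBranch = rows.stream()
                .filter(r -> !format.equals(r.lstgFormatName))
                .mapToLong(r -> r.itemCount)
                .sum();
        return whenBranch + elseBranch;
    }

    static List<Row> sample() {
        return Arrays.asList(new Row("Auction", 10, 1), new Row("FP-GTC", 20, 2), new Row("Auction", 5, 3));
    }

    public static void main(String[] args) {
        System.out.println(sumCaseWhen(sample(), "Auction"));   // prints 17
        System.out.println(sumByBranches(sample(), "Auction")); // prints 17
    }
}
```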





[jira] [Commented] (KYLIN-4633) IT failed for can't detect the default value of config kylin.source.hive.databasedir

2020-09-07 Thread Zhong Yanghong (Jira)


[ 
https://issues.apache.org/jira/browse/KYLIN-4633?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17191717#comment-17191717
 ] 

Zhong Yanghong commented on KYLIN-4633:
---

Thanks for fixing this.

> IT failed for can't detect the default value of config 
> kylin.source.hive.databasedir
> 
>
> Key: KYLIN-4633
> URL: https://issues.apache.org/jira/browse/KYLIN-4633
> Project: Kylin
>  Issue Type: Improvement
>  Components: Tools, Build and Test
>Reporter: Yaqian Zhang
>Assignee: Yaqian Zhang
>Priority: Minor
>
> KYLIN-4616 introduced automatic detection of the default value of config 
> kylin.source.hive.databasedir in find-hive-dependency, but the IT doesn't 
> execute this script, so `kylin.source.hive.databasedir` is null, which causes 
> BuildCubeWithEngine to fail.





[jira] [Created] (KYLIN-4749) Add isNeedMaterialize() for TableDesc

2020-09-04 Thread Zhong Yanghong (Jira)
Zhong Yanghong created KYLIN-4749:
-

 Summary: Add isNeedMaterialize() for TableDesc
 Key: KYLIN-4749
 URL: https://issues.apache.org/jira/browse/KYLIN-4749
 Project: Kylin
  Issue Type: Improvement
Reporter: Zhong Yanghong
Assignee: Zhong Yanghong








[jira] [Created] (KYLIN-4745) Fix BeelineHiveClient parseResultEntry

2020-09-03 Thread Zhong Yanghong (Jira)
Zhong Yanghong created KYLIN-4745:
-

 Summary: Fix BeelineHiveClient parseResultEntry
 Key: KYLIN-4745
 URL: https://issues.apache.org/jira/browse/KYLIN-4745
 Project: Kylin
  Issue Type: Improvement
Reporter: Zhong Yanghong
Assignee: Zhong Yanghong








[jira] [Created] (KYLIN-4718) Trim memory hungry measures in region server if no need to do post aggregation at server side

2020-08-25 Thread Zhong Yanghong (Jira)
Zhong Yanghong created KYLIN-4718:
-

 Summary: Trim memory hungry measures in region server if no need 
to do post aggregation at server side
 Key: KYLIN-4718
 URL: https://issues.apache.org/jira/browse/KYLIN-4718
 Project: Kylin
  Issue Type: Improvement
Reporter: Zhong Yanghong
Assignee: Zhong Yanghong








[jira] [Updated] (KYLIN-4707) Add one config to disable mapper side combiner especially when there's topn measure

2020-08-18 Thread Zhong Yanghong (Jira)


 [ 
https://issues.apache.org/jira/browse/KYLIN-4707?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Zhong Yanghong updated KYLIN-4707:
--
Summary: Add one config to disable mapper side combiner especially when 
there's topn measure  (was: Add one config to disable mapper side combiner 
especially there's topn measure)

> Add one config to disable mapper side combiner especially when there's topn 
> measure
> ---
>
> Key: KYLIN-4707
> URL: https://issues.apache.org/jira/browse/KYLIN-4707
> Project: Kylin
>  Issue Type: Improvement
>Reporter: Zhong Yanghong
>Assignee: Zhong Yanghong
>Priority: Major
>
> The mapper-side combiner uses only a single thread to merge measures. If a 
> topn measure is defined in the cube, finishing a mapper task becomes very 
> slow. It's better to provide an option to disable the combiner.
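
A toy Java sketch of what such an option changes, under stated assumptions: the combinerEnabled flag and all names are hypothetical, no Hadoop API is used, and a simple sum stands in for the much more expensive topn merge. With the combiner off, the mapper emits raw records and all merging happens on the reduce side; the final result is the same.

```java
import java.util.AbstractMap.SimpleEntry;
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;
import java.util.Map;
import java.util.TreeMap;

public class CombinerToggleDemo {
    /** Merges records per key; used both as the combiner and as the reducer in this sketch. */
    static Map<String, Long> merge(List<Map.Entry<String, Long>> records) {
        Map<String, Long> merged = new TreeMap<>();
        for (Map.Entry<String, Long> e : records) {
            merged.merge(e.getKey(), e.getValue(), Long::sum);
        }
        return merged;
    }

    /** Mapper output: raw records, or records pre-merged per key when the combiner is enabled. */
    static List<Map.Entry<String, Long>> mapperOutput(List<Map.Entry<String, Long>> records,
                                                      boolean combinerEnabled) {
        if (!combinerEnabled) {
            return new ArrayList<>(records);
        }
        return new ArrayList<>(merge(records).entrySet());
    }

    static List<Map.Entry<String, Long>> sample() {
        return Arrays.<Map.Entry<String, Long>>asList(
                new SimpleEntry<String, Long>("a", 1L),
                new SimpleEntry<String, Long>("a", 2L),
                new SimpleEntry<String, Long>("b", 3L));
    }

    public static void main(String[] args) {
        List<Map.Entry<String, Long>> withCombiner = mapperOutput(sample(), true);
        List<Map.Entry<String, Long>> withoutCombiner = mapperOutput(sample(), false);
        System.out.println(withCombiner.size() + " vs " + withoutCombiner.size()); // prints "2 vs 3"
        System.out.println(merge(withCombiner).equals(merge(withoutCombiner)));   // prints true
    }
}
```

The trade-off is more shuffled records in exchange for skipping the single-threaded merge inside each mapper.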





[jira] [Created] (KYLIN-4707) Add one config to disable mapper side combiner especially there's topn measure

2020-08-18 Thread Zhong Yanghong (Jira)
Zhong Yanghong created KYLIN-4707:
-

 Summary: Add one config to disable mapper side combiner especially 
there's topn measure
 Key: KYLIN-4707
 URL: https://issues.apache.org/jira/browse/KYLIN-4707
 Project: Kylin
  Issue Type: Improvement
Reporter: Zhong Yanghong
Assignee: Zhong Yanghong


The mapper-side combiner uses only a single thread to merge measures. If a 
topn measure is defined in the cube, finishing a mapper task becomes very 
slow. It's better to provide an option to disable the combiner.





[jira] [Created] (KYLIN-4702) Missing cube-level lookup table snapshot when doing cube migration

2020-08-17 Thread Zhong Yanghong (Jira)
Zhong Yanghong created KYLIN-4702:
-

 Summary: Missing cube-level lookup table snapshot when doing cube 
migration
 Key: KYLIN-4702
 URL: https://issues.apache.org/jira/browse/KYLIN-4702
 Project: Kylin
  Issue Type: Improvement
Reporter: Zhong Yanghong
Assignee: Zhong Yanghong








[jira] [Commented] (KYLIN-4697) User info update logic is not correct

2020-08-17 Thread Zhong Yanghong (Jira)


[ 
https://issues.apache.org/jira/browse/KYLIN-4697?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17178763#comment-17178763
 ] 

Zhong Yanghong commented on KYLIN-4697:
---

There's one more thing we need to take care of. In HBase, the rowkey is 
case-sensitive, while in Kylin, the user name is case-insensitive. We need to 
avoid the case where multiple HBase records exist for the same user. 
Otherwise, WriteConflictException may occur when updating user info.
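
One way to picture the problem, as a sketch only: a HashMap stands in for the case-sensitive key space, and a hypothetical toRowKey helper maps every spelling of a user name to one canonical key. Upper-casing is an assumption made for illustration; this comment does not say which normalization Kylin uses.

```java
import java.util.HashMap;
import java.util.Locale;
import java.util.Map;

public class UserRowKeyDemo {
    /** Hypothetical normalization: every spelling of a user name maps to one canonical key. */
    static String toRowKey(String userName) {
        return userName.toUpperCase(Locale.ROOT);
    }

    public static void main(String[] args) {
        // Case-sensitive keys: the same logical user ends up with two records.
        Map<String, String> rawKeys = new HashMap<>();
        rawKeys.put("Admin", "record written by node 1");
        rawKeys.put("ADMIN", "record written by node 2");
        System.out.println(rawKeys.size()); // prints 2

        // Normalized keys: both writes hit the same record.
        Map<String, String> normalizedKeys = new HashMap<>();
        normalizedKeys.put(toRowKey("Admin"), "record written by node 1");
        normalizedKeys.put(toRowKey("ADMIN"), "record written by node 2");
        System.out.println(normalizedKeys.size()); // prints 1
    }
}
```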

> User info update logic is not correct
> -
>
> Key: KYLIN-4697
> URL: https://issues.apache.org/jira/browse/KYLIN-4697
> Project: Kylin
>  Issue Type: Bug
>Reporter: Zhong Yanghong
>Assignee: Zhong Yanghong
>Priority: Major
>
> There are mainly two issues:
> * The logic for KylinAuthenticationProvider.needUpdateUser() is not correct 
> due to not considering ALL_USERS
> * The logic of updateUser in some places is not correct due to not following 
> copy on write. 





[jira] [Updated] (KYLIN-4697) User info update logic is not correct

2020-08-16 Thread Zhong Yanghong (Jira)


 [ 
https://issues.apache.org/jira/browse/KYLIN-4697?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Zhong Yanghong updated KYLIN-4697:
--
Description: 
There are mainly two issues:
* The logic for KylinAuthenticationProvider.needUpdateUser() is not correct due 
to not considering ALL_USERS
* The logic of updateUser in some places is not correct due to not following 
copy on write. 

  was:
There are mainly two issues:
* The logic for KylinAuthenticationProvider.needUpdateUser() is not correct due 
to not considering ALL_USERS
* The logic of updateUser in KylinUserService & KylinUserManager is not correct 
due to not following copy on write. 


> User info update logic is not correct
> -
>
> Key: KYLIN-4697
> URL: https://issues.apache.org/jira/browse/KYLIN-4697
> Project: Kylin
>  Issue Type: Bug
>Reporter: Zhong Yanghong
>Assignee: Zhong Yanghong
>Priority: Major
>
> There are mainly two issues:
> * The logic for KylinAuthenticationProvider.needUpdateUser() is not correct 
> due to not considering ALL_USERS
> * The logic of updateUser in some places is not correct due to not following 
> copy on write. 
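
For readers unfamiliar with the term, a minimal copy-on-write update might look like the following Java sketch. The User class, the cache and the role strings are hypothetical stand-ins and do not reproduce KylinUserService or KylinUserManager; the point is only that an update builds a modified copy and swaps it in, instead of mutating the instance that other readers may already hold.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Collections;
import java.util.List;
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;

public class CopyOnWriteUserDemo {
    /** Immutable stand-in for a cached user record. */
    static final class User {
        final String name;
        final List<String> authorities;

        User(String name, List<String> authorities) {
            this.name = name;
            this.authorities = Collections.unmodifiableList(new ArrayList<>(authorities));
        }
    }

    private final Map<String, User> cache = new ConcurrentHashMap<>();

    void put(User user) {
        cache.put(user.name, user);
    }

    User get(String name) {
        return cache.get(name);
    }

    /** Copy-on-write update: the cached instance is never modified in place. */
    User addAuthority(String name, String authority) {
        User current = cache.get(name);
        if (current == null) {
            throw new IllegalArgumentException("Unknown user: " + name);
        }
        List<String> updated = new ArrayList<>(current.authorities);
        updated.add(authority);
        User copy = new User(current.name, updated);
        cache.put(name, copy);
        return copy;
    }

    public static void main(String[] args) {
        CopyOnWriteUserDemo demo = new CopyOnWriteUserDemo();
        demo.put(new User("ADMIN", Arrays.asList("ROLE_ADMIN")));
        User before = demo.get("ADMIN");
        demo.addAuthority("ADMIN", "ROLE_MODELER");
        System.out.println(before.authorities.size());            // prints 1
        System.out.println(demo.get("ADMIN").authorities.size()); // prints 2
    }
}
```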





[jira] [Updated] (KYLIN-4697) User info update logic is not correct

2020-08-14 Thread Zhong Yanghong (Jira)


 [ 
https://issues.apache.org/jira/browse/KYLIN-4697?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Zhong Yanghong updated KYLIN-4697:
--
Description: 
There are mainly two issues:
* The logic for KylinAuthenticationProvider.needUpdateUser() is not correct due 
to not considering ALL_USERS
* The logic of updateUser in KylinUserService & KylinUserManager is not correct 
due to not following copy on write. 

  was:
There are currently 3 main issues:
* In KylinUserGroupService.init(), the following code is not correct:
{code}
store.checkAndPutResource(PATH, userGroup, USER_GROUP_SERIALIZER);
{code}
* The logic for KylinAuthenticationProvider.needUpdateUser() is not correct due 
to not considering ALL_USERS
* The logic of updateUser in KylinUserService & KylinUserManager is not correct 
due to not following copy on write. 


> User info update logic is not correct
> -
>
> Key: KYLIN-4697
> URL: https://issues.apache.org/jira/browse/KYLIN-4697
> Project: Kylin
>  Issue Type: Bug
>Reporter: Zhong Yanghong
>Assignee: Zhong Yanghong
>Priority: Major
>
> There are mainly two issues:
> * The logic for KylinAuthenticationProvider.needUpdateUser() is not correct 
> due to not considering ALL_USERS
> * The logic of updateUser in KylinUserService & KylinUserManager is not 
> correct due to not following copy on write. 





[jira] [Created] (KYLIN-4697) User info update logic is not correct

2020-08-14 Thread Zhong Yanghong (Jira)
Zhong Yanghong created KYLIN-4697:
-

 Summary: User info update logic is not correct
 Key: KYLIN-4697
 URL: https://issues.apache.org/jira/browse/KYLIN-4697
 Project: Kylin
  Issue Type: Bug
Reporter: Zhong Yanghong
Assignee: Zhong Yanghong


There are currently 3 main issues:
* In KylinUserGroupService.init(), the following code is not correct:
{code}
store.checkAndPutResource(PATH, userGroup, USER_GROUP_SERIALIZER);
{code}
* The logic for KylinAuthenticationProvider.needUpdateUser() is not correct due 
to not considering ALL_USERS
* The logic of updateUser in KylinUserService & KylinUserManager is not correct 
due to not following copy on write. 





[jira] [Commented] (KYLIN-4682) java.lang.IndexOutOfBoundsException due to not setting havingFilter correctly

2020-08-05 Thread Zhong Yanghong (Jira)


[ 
https://issues.apache.org/jira/browse/KYLIN-4682?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17172037#comment-17172037
 ] 

Zhong Yanghong commented on KYLIN-4682:
---

It seems the rule *FilterAggregateTransposeRule* is not effective. It's better 
to set a lower number for computing the cost of *OLAPFilterRel*.

> java.lang.IndexOutOfBoundsException due to not setting havingFilter correctly
> -
>
> Key: KYLIN-4682
> URL: https://issues.apache.org/jira/browse/KYLIN-4682
> Project: Kylin
>  Issue Type: Improvement
>Reporter: Zhong Yanghong
>Assignee: Zhong Yanghong
>Priority: Major
>
> SQL:
> {code}
> select LSTG_FORMAT_NAME, LEAF_CATEG_ID, sum(price) as gmv
> from TEST_KYLIN_FACT 
> group by LSTG_FORMAT_NAME, LEAF_CATEG_ID
> having LSTG_FORMAT_NAME = 'Auction'
> {code}
> Error stack trace:
> {code}
> Caused by: java.lang.IndexOutOfBoundsException: Index: 3, Size: 1
>   at java.util.ArrayList.rangeCheck(ArrayList.java:657)
>   at java.util.ArrayList.get(ArrayList.java:433)
>   at org.apache.kylin.storage.gtrecord.GTCubeStorageQueryBase.checkHavingCanPushDown(GTCubeStorageQueryBase.java:553)
>   at org.apache.kylin.storage.gtrecord.GTCubeStorageQueryBase.getStorageQueryRequest(GTCubeStorageQueryBase.java:196)
>   at org.apache.kylin.storage.gtrecord.GTCubeStorageQueryBase.searchInner(GTCubeStorageQueryBase.java:98)
>   at org.apache.kylin.storage.gtrecord.GTCubeStorageQueryBase.search(GTCubeStorageQueryBase.java:90)
>   at org.apache.kylin.storage.hybrid.HybridStorageQuery.search(HybridStorageQuery.java:53)
>   at org.apache.kylin.query.enumerator.OLAPEnumerator.queryStorage(OLAPEnumerator.java:117)
>   at org.apache.kylin.query.enumerator.OLAPEnumerator.moveNext(OLAPEnumerator.java:60)
>   at Baz$1$1.moveNext(Unknown Source)
>   at org.apache.calcite.linq4j.EnumerableDefaults.groupBy_(EnumerableDefaults.java:825)
>   at org.apache.calcite.linq4j.EnumerableDefaults.groupBy(EnumerableDefaults.java:761)
>   at org.apache.calcite.linq4j.DefaultEnumerable.groupBy(DefaultEnumerable.java:302)
>   at Baz.bind(Unknown Source)
>   at org.apache.calcite.jdbc.CalcitePrepare$CalciteSignature.enumerable(CalcitePrepare.java:365)
>   at org.apache.calcite.jdbc.CalciteConnectionImpl.enumerable(CalciteConnectionImpl.java:301)
>   at org.apache.calcite.jdbc.CalciteMetaImpl._createIterable(CalciteMetaImpl.java:559)
>   at org.apache.calcite.jdbc.CalciteMetaImpl.createIterable(CalciteMetaImpl.java:550)
>   at org.apache.calcite.avatica.AvaticaResultSet.execute(AvaticaResultSet.java:182)
>   at org.apache.calcite.jdbc.CalciteResultSet.execute(CalciteResultSet.java:67)
>   at org.apache.calcite.jdbc.CalciteResultSet.execute(CalciteResultSet.java:44)
>   at org.apache.calcite.avatica.AvaticaConnection$1.execute(AvaticaConnection.java:667)
>   at org.apache.calcite.jdbc.CalciteMetaImpl.prepareAndExecute(CalciteMetaImpl.java:619)
>   at org.apache.calcite.avatica.AvaticaConnection.prepareAndExecuteInternal(AvaticaConnection.java:675)
>   at org.apache.calcite.avatica.AvaticaStatement.executeInternal(AvaticaStatement.java:156)
>   ... 81 more
> {code}





[jira] [Commented] (KYLIN-4682) java.lang.IndexOutOfBoundsException due to not setting havingFilter correctly

2020-08-05 Thread Zhong Yanghong (Jira)


[ 
https://issues.apache.org/jira/browse/KYLIN-4682?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17171306#comment-17171306
 ] 

Zhong Yanghong commented on KYLIN-4682:
---

For this kind of SQL, we can extract the filters on group-by columns and 
push them down, like:
{code}
select LSTG_FORMAT_NAME, LEAF_CATEG_ID, sum(price) as gmv
from TEST_KYLIN_FACT 
where LSTG_FORMAT_NAME = 'Auction'
group by LSTG_FORMAT_NAME, LEAF_CATEG_ID
{code}
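
The rewrite is safe because a predicate on a group-by column has the same value for every row of a group. A small Java sketch of that equivalence follows; the Row class, the method names and the sample data are hypothetical, and in-memory maps stand in for the query engine.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

public class HavingPushDownDemo {
    static final class Row {
        final String lstgFormatName;
        final long leafCategId;
        final long price;

        Row(String lstgFormatName, long leafCategId, long price) {
            this.lstgFormatName = lstgFormatName;
            this.leafCategId = leafCategId;
            this.price = price;
        }
    }

    /** sum(price) grouped by (LSTG_FORMAT_NAME, LEAF_CATEG_ID). */
    static Map<List<Object>, Long> groupBySum(List<Row> rows) {
        Map<List<Object>, Long> gmv = new HashMap<>();
        for (Row r : rows) {
            gmv.merge(Arrays.<Object>asList(r.lstgFormatName, r.leafCategId), r.price, Long::sum);
        }
        return gmv;
    }

    /** Original shape: aggregate everything, then keep groups matching the having filter. */
    static Map<List<Object>, Long> havingOnGroupKey(List<Row> rows, String format) {
        Map<List<Object>, Long> kept = new HashMap<>();
        for (Map.Entry<List<Object>, Long> e : groupBySum(rows).entrySet()) {
            if (format.equals(e.getKey().get(0))) {
                kept.put(e.getKey(), e.getValue());
            }
        }
        return kept;
    }

    /** Rewritten shape: filter rows first (where), then aggregate only what is left. */
    static Map<List<Object>, Long> whereBeforeGroupBy(List<Row> rows, String format) {
        List<Row> filtered = new ArrayList<>();
        for (Row r : rows) {
            if (format.equals(r.lstgFormatName)) {
                filtered.add(r);
            }
        }
        return groupBySum(filtered);
    }

    static List<Row> sample() {
        return Arrays.asList(new Row("Auction", 1, 10), new Row("Auction", 1, 5),
                new Row("Auction", 2, 7), new Row("FP-GTC", 1, 100));
    }

    public static void main(String[] args) {
        Map<List<Object>, Long> having = havingOnGroupKey(sample(), "Auction");
        Map<List<Object>, Long> where = whereBeforeGroupBy(sample(), "Auction");
        System.out.println(having.equals(where)); // prints true
    }
}
```

The equivalence does not hold for having filters on aggregated values such as sum(price), which must still be evaluated after grouping.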

> java.lang.IndexOutOfBoundsException due to not setting havingFilter correctly
> -
>
> Key: KYLIN-4682
> URL: https://issues.apache.org/jira/browse/KYLIN-4682
> Project: Kylin
>  Issue Type: Improvement
>Reporter: Zhong Yanghong
>Assignee: Zhong Yanghong
>Priority: Major
>
> SQL:
> {code}
> select LSTG_FORMAT_NAME, LEAF_CATEG_ID, sum(price) as gmv
> from TEST_KYLIN_FACT 
> group by LSTG_FORMAT_NAME, LEAF_CATEG_ID
> having LSTG_FORMAT_NAME = 'Auction'
> {code}
> Error stack trace:
> {code}
> Caused by: java.lang.IndexOutOfBoundsException: Index: 3, Size: 1
>   at java.util.ArrayList.rangeCheck(ArrayList.java:657)
>   at java.util.ArrayList.get(ArrayList.java:433)
>   at org.apache.kylin.storage.gtrecord.GTCubeStorageQueryBase.checkHavingCanPushDown(GTCubeStorageQueryBase.java:553)
>   at org.apache.kylin.storage.gtrecord.GTCubeStorageQueryBase.getStorageQueryRequest(GTCubeStorageQueryBase.java:196)
>   at org.apache.kylin.storage.gtrecord.GTCubeStorageQueryBase.searchInner(GTCubeStorageQueryBase.java:98)
>   at org.apache.kylin.storage.gtrecord.GTCubeStorageQueryBase.search(GTCubeStorageQueryBase.java:90)
>   at org.apache.kylin.storage.hybrid.HybridStorageQuery.search(HybridStorageQuery.java:53)
>   at org.apache.kylin.query.enumerator.OLAPEnumerator.queryStorage(OLAPEnumerator.java:117)
>   at org.apache.kylin.query.enumerator.OLAPEnumerator.moveNext(OLAPEnumerator.java:60)
>   at Baz$1$1.moveNext(Unknown Source)
>   at org.apache.calcite.linq4j.EnumerableDefaults.groupBy_(EnumerableDefaults.java:825)
>   at org.apache.calcite.linq4j.EnumerableDefaults.groupBy(EnumerableDefaults.java:761)
>   at org.apache.calcite.linq4j.DefaultEnumerable.groupBy(DefaultEnumerable.java:302)
>   at Baz.bind(Unknown Source)
>   at org.apache.calcite.jdbc.CalcitePrepare$CalciteSignature.enumerable(CalcitePrepare.java:365)
>   at org.apache.calcite.jdbc.CalciteConnectionImpl.enumerable(CalciteConnectionImpl.java:301)
>   at org.apache.calcite.jdbc.CalciteMetaImpl._createIterable(CalciteMetaImpl.java:559)
>   at org.apache.calcite.jdbc.CalciteMetaImpl.createIterable(CalciteMetaImpl.java:550)
>   at org.apache.calcite.avatica.AvaticaResultSet.execute(AvaticaResultSet.java:182)
>   at org.apache.calcite.jdbc.CalciteResultSet.execute(CalciteResultSet.java:67)
>   at org.apache.calcite.jdbc.CalciteResultSet.execute(CalciteResultSet.java:44)
>   at org.apache.calcite.avatica.AvaticaConnection$1.execute(AvaticaConnection.java:667)
>   at org.apache.calcite.jdbc.CalciteMetaImpl.prepareAndExecute(CalciteMetaImpl.java:619)
>   at org.apache.calcite.avatica.AvaticaConnection.prepareAndExecuteInternal(AvaticaConnection.java:675)
>   at org.apache.calcite.avatica.AvaticaStatement.executeInternal(AvaticaStatement.java:156)
>   ... 81 more
> {code}





[jira] [Updated] (KYLIN-4682) java.lang.IndexOutOfBoundsException due to not setting havingFilter correctly

2020-08-05 Thread Zhong Yanghong (Jira)


 [ 
https://issues.apache.org/jira/browse/KYLIN-4682?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Zhong Yanghong updated KYLIN-4682:
--
Description: 
SQL:
{code}
select LSTG_FORMAT_NAME, LEAF_CATEG_ID, sum(price) as gmv
from TEST_KYLIN_FACT 
group by LSTG_FORMAT_NAME, LEAF_CATEG_ID
having LSTG_FORMAT_NAME = 'Auction'
{code}

Error stack trace:
{code}
Caused by: java.lang.IndexOutOfBoundsException: Index: 3, Size: 1
at java.util.ArrayList.rangeCheck(ArrayList.java:657)
at java.util.ArrayList.get(ArrayList.java:433)
at org.apache.kylin.storage.gtrecord.GTCubeStorageQueryBase.checkHavingCanPushDown(GTCubeStorageQueryBase.java:553)
at org.apache.kylin.storage.gtrecord.GTCubeStorageQueryBase.getStorageQueryRequest(GTCubeStorageQueryBase.java:196)
at org.apache.kylin.storage.gtrecord.GTCubeStorageQueryBase.searchInner(GTCubeStorageQueryBase.java:98)
at org.apache.kylin.storage.gtrecord.GTCubeStorageQueryBase.search(GTCubeStorageQueryBase.java:90)
at org.apache.kylin.storage.hybrid.HybridStorageQuery.search(HybridStorageQuery.java:53)
at org.apache.kylin.query.enumerator.OLAPEnumerator.queryStorage(OLAPEnumerator.java:117)
at org.apache.kylin.query.enumerator.OLAPEnumerator.moveNext(OLAPEnumerator.java:60)
at Baz$1$1.moveNext(Unknown Source)
at org.apache.calcite.linq4j.EnumerableDefaults.groupBy_(EnumerableDefaults.java:825)
at org.apache.calcite.linq4j.EnumerableDefaults.groupBy(EnumerableDefaults.java:761)
at org.apache.calcite.linq4j.DefaultEnumerable.groupBy(DefaultEnumerable.java:302)
at Baz.bind(Unknown Source)
at org.apache.calcite.jdbc.CalcitePrepare$CalciteSignature.enumerable(CalcitePrepare.java:365)
at org.apache.calcite.jdbc.CalciteConnectionImpl.enumerable(CalciteConnectionImpl.java:301)
at org.apache.calcite.jdbc.CalciteMetaImpl._createIterable(CalciteMetaImpl.java:559)
at org.apache.calcite.jdbc.CalciteMetaImpl.createIterable(CalciteMetaImpl.java:550)
at org.apache.calcite.avatica.AvaticaResultSet.execute(AvaticaResultSet.java:182)
at org.apache.calcite.jdbc.CalciteResultSet.execute(CalciteResultSet.java:67)
at org.apache.calcite.jdbc.CalciteResultSet.execute(CalciteResultSet.java:44)
at org.apache.calcite.avatica.AvaticaConnection$1.execute(AvaticaConnection.java:667)
at org.apache.calcite.jdbc.CalciteMetaImpl.prepareAndExecute(CalciteMetaImpl.java:619)
at org.apache.calcite.avatica.AvaticaConnection.prepareAndExecuteInternal(AvaticaConnection.java:675)
at org.apache.calcite.avatica.AvaticaStatement.executeInternal(AvaticaStatement.java:156)
... 81 more
{code}

  was:
SQL:
{code}
select LSTG_FORMAT_NAME, sum(gmv)
from
(
select LSTG_FORMAT_NAME, LEAF_CATEG_ID, sum(price) as gmv
from TEST_KYLIN_FACT 
group by LSTG_FORMAT_NAME, LEAF_CATEG_ID
)
where LSTG_FORMAT_NAME = 'Auction'
group by LSTG_FORMAT_NAME
{code}

Error stack trace:
{code}
Caused by: java.lang.IndexOutOfBoundsException: Index: 3, Size: 1
at java.util.ArrayList.rangeCheck(ArrayList.java:657)
at java.util.ArrayList.get(ArrayList.java:433)
at org.apache.kylin.storage.gtrecord.GTCubeStorageQueryBase.checkHavingCanPushDown(GTCubeStorageQueryBase.java:553)
at org.apache.kylin.storage.gtrecord.GTCubeStorageQueryBase.getStorageQueryRequest(GTCubeStorageQueryBase.java:196)
at org.apache.kylin.storage.gtrecord.GTCubeStorageQueryBase.searchInner(GTCubeStorageQueryBase.java:98)
at org.apache.kylin.storage.gtrecord.GTCubeStorageQueryBase.search(GTCubeStorageQueryBase.java:90)
at org.apache.kylin.storage.hybrid.HybridStorageQuery.search(HybridStorageQuery.java:53)
at org.apache.kylin.query.enumerator.OLAPEnumerator.queryStorage(OLAPEnumerator.java:117)
at org.apache.kylin.query.enumerator.OLAPEnumerator.moveNext(OLAPEnumerator.java:60)
at Baz$1$1.moveNext(Unknown Source)
at org.apache.calcite.linq4j.EnumerableDefaults.groupBy_(EnumerableDefaults.java:825)
at org.apache.calcite.linq4j.EnumerableDefaults.groupBy(EnumerableDefaults.java:761)
at org.apache.calcite.linq4j.DefaultEnumerable.groupBy(DefaultEnumerable.java:302)
at Baz.bind(Unknown Source)
at org.apache.calcite.jdbc.CalcitePrepare$CalciteSignature.enumerable(CalcitePrepare.java:365)
at org.apache.calcite.jdbc.CalciteConnectionImpl.enumerable(CalciteConnectionImpl.java:301)
at org.apache.calcite.jdbc.CalciteMetaImpl._createIterable(CalciteMetaImpl.java:559)
at org.apache.calcite.jdbc.CalciteMetaImpl.createIterable(CalciteMetaImpl.java:550)
at org.apache.calcite.avatica.AvaticaResultSet.execute(AvaticaResultSet.java:182)
at org.apache.calcite.jdbc.CalciteResultSet.execute(CalciteResultSet.java:67)
at 

[jira] [Created] (KYLIN-4682) java.lang.IndexOutOfBoundsException due to not setting havingFilter correctly

2020-08-04 Thread Zhong Yanghong (Jira)
Zhong Yanghong created KYLIN-4682:
-

 Summary: java.lang.IndexOutOfBoundsException due to not setting 
havingFilter correctly
 Key: KYLIN-4682
 URL: https://issues.apache.org/jira/browse/KYLIN-4682
 Project: Kylin
  Issue Type: Improvement
Reporter: Zhong Yanghong
Assignee: Zhong Yanghong








[jira] [Updated] (KYLIN-4658) Union all issue with regarding to windows function & aggregation

2020-08-03 Thread Zhong Yanghong (Jira)


 [ 
https://issues.apache.org/jira/browse/KYLIN-4658?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Zhong Yanghong updated KYLIN-4658:
--
Summary:  Union all issue with regarding to windows function & aggregation  
(was: Windows function does not work for union all)

>  Union all issue with regarding to windows function & aggregation
> -
>
> Key: KYLIN-4658
> URL: https://issues.apache.org/jira/browse/KYLIN-4658
> Project: Kylin
>  Issue Type: Improvement
>Reporter: Zhong Yanghong
>Assignee: Zhong Yanghong
>Priority: Major
>
> Test SQL:
> {code}
> select CNT, GMV, sum(GMV) over(partition by SLR_SEGMENT_CD) TOTAL_GMV, 
> SLR_SEGMENT_CD, LSTG_FORMAT_NAME
> from 
> (select sum(PRICE) GMV, sum(ITEM_COUNT) CNT, SLR_SEGMENT_CD, LSTG_FORMAT_NAME 
> from TEST_KYLIN_FACT group by SLR_SEGMENT_CD, LSTG_FORMAT_NAME 
> UNION ALL
> select sum(PRICE) GMV, sum(ITEM_COUNT) CNT, SLR_SEGMENT_CD, LSTG_FORMAT_NAME 
> from TEST_KYLIN_FACT group by SLR_SEGMENT_CD, LSTG_FORMAT_NAME) 
> order by TOTAL_GMV
> {code}
>  
> Exception:
> {code}
> Index: 2, Size: 2 while executing SQL: "select * from (select CNT, GMV, 
> sum(GMV) over(partition by SLR_SEGMENT_CD) TOTAL_GMV, SLR_SEGMENT_CD, 
> LSTG_FORMAT_NAME from (select sum(PRICE) GMV, sum(ITEM_COUNT) CNT, 
> SLR_SEGMENT_CD, LSTG_FORMAT_NAME from TEST_KYLIN_FACT group by 
> SLR_SEGMENT_CD, LSTG_FORMAT_NAME UNION ALL select sum(PRICE) GMV, 
> sum(ITEM_COUNT) CNT, SLR_SEGMENT_CD, LSTG_FORMAT_NAME from TEST_KYLIN_FACT 
> group by SLR_SEGMENT_CD, LSTG_FORMAT_NAME) order by TOTAL_GMV) limit 5"
> {code}
> Similar issue for the following sql:
> {code}
> select LSTG_FORMAT_NAME,
>SLR_SEGMENT_CD,
>CAL_DT,
>sum(CNT) as CNT
> from
>   (select LSTG_FORMAT_NAME,
>   SLR_SEGMENT_CD,
>   CAL_DT,
>   sum(ITEM_COUNT) CNT
>from TEST_KYLIN_FACT
>where LSTG_FORMAT_NAME = 'ABIN'
>group by LSTG_FORMAT_NAME,
> SLR_SEGMENT_CD,
> CAL_DT
>UNION ALL select 'NON-ABIN' as LSTG_FORMAT_NAME,
> SLR_SEGMENT_CD,
> CAL_DT,
> case
> when SLR_SEGMENT_CD > 1000 then CNT * 2
> else CNT * 3
> end as CNT
>from
>  (select SLR_SEGMENT_CD,
>  CAL_DT,
>  sum(ITEM_COUNT) CNT
>   from TEST_KYLIN_FACT
>   where LSTG_FORMAT_NAME <> 'ABIN'
>   group by SLR_SEGMENT_CD,CAL_DT))
> group by LSTG_FORMAT_NAME,
>  SLR_SEGMENT_CD,
>  CAL_DT
> order by CNT
> {code}
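
For reference, the result the first test query is expected to produce under standard SQL semantics can be sketched in plain Java. The AggRow class, the method names and the sample values are hypothetical; the inner group-by results are given directly as input, UNION ALL is modelled as list concatenation, and the window sum assigns each row the total GMV of its SLR_SEGMENT_CD partition.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

public class WindowOverUnionDemo {
    /** One row of the inner group-by result. */
    static final class AggRow {
        final int slrSegmentCd;
        final String lstgFormatName;
        final long gmv;

        AggRow(int slrSegmentCd, String lstgFormatName, long gmv) {
            this.slrSegmentCd = slrSegmentCd;
            this.lstgFormatName = lstgFormatName;
            this.gmv = gmv;
        }
    }

    /** UNION ALL keeps duplicates, so it is a plain concatenation. */
    static List<AggRow> unionAll(List<AggRow> left, List<AggRow> right) {
        List<AggRow> out = new ArrayList<>(left);
        out.addAll(right);
        return out;
    }

    /** sum(GMV) over(partition by SLR_SEGMENT_CD): each row gets its partition's total. */
    static List<Long> totalGmvPerRow(List<AggRow> rows) {
        Map<Integer, Long> totals = new HashMap<>();
        for (AggRow r : rows) {
            totals.merge(r.slrSegmentCd, r.gmv, Long::sum);
        }
        List<Long> out = new ArrayList<>();
        for (AggRow r : rows) {
            out.add(totals.get(r.slrSegmentCd));
        }
        return out;
    }

    static List<AggRow> sample() {
        return Arrays.asList(new AggRow(1, "Auction", 10), new AggRow(1, "FP-GTC", 20),
                new AggRow(2, "Auction", 5));
    }

    public static void main(String[] args) {
        List<AggRow> unioned = unionAll(sample(), sample());
        System.out.println(totalGmvPerRow(unioned)); // prints [60, 60, 10, 60, 60, 10]
    }
}
```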





[jira] [Updated] (KYLIN-4658) Union all issue with regarding to windows function & aggregation on

2020-08-03 Thread Zhong Yanghong (Jira)


 [ 
https://issues.apache.org/jira/browse/KYLIN-4658?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Zhong Yanghong updated KYLIN-4658:
--
Summary:  Union all issue with regarding to windows function & aggregation 
on  (was:  Union all issue with regarding to windows function & aggregation)

>  Union all issue with regarding to windows function & aggregation on
> 
>
> Key: KYLIN-4658
> URL: https://issues.apache.org/jira/browse/KYLIN-4658
> Project: Kylin
>  Issue Type: Improvement
>Reporter: Zhong Yanghong
>Assignee: Zhong Yanghong
>Priority: Major
>
> Test SQL:
> {code}
> select CNT, GMV, sum(GMV) over(partition by SLR_SEGMENT_CD) TOTAL_GMV, 
> SLR_SEGMENT_CD, LSTG_FORMAT_NAME
> from 
> (select sum(PRICE) GMV, sum(ITEM_COUNT) CNT, SLR_SEGMENT_CD, LSTG_FORMAT_NAME 
> from TEST_KYLIN_FACT group by SLR_SEGMENT_CD, LSTG_FORMAT_NAME 
> UNION ALL
> select sum(PRICE) GMV, sum(ITEM_COUNT) CNT, SLR_SEGMENT_CD, LSTG_FORMAT_NAME 
> from TEST_KYLIN_FACT group by SLR_SEGMENT_CD, LSTG_FORMAT_NAME) 
> order by TOTAL_GMV
> {code}
>  
> Exception:
> {code}
> Index: 2, Size: 2 while executing SQL: "select * from (select CNT, GMV, 
> sum(GMV) over(partition by SLR_SEGMENT_CD) TOTAL_GMV, SLR_SEGMENT_CD, 
> LSTG_FORMAT_NAME from (select sum(PRICE) GMV, sum(ITEM_COUNT) CNT, 
> SLR_SEGMENT_CD, LSTG_FORMAT_NAME from TEST_KYLIN_FACT group by 
> SLR_SEGMENT_CD, LSTG_FORMAT_NAME UNION ALL select sum(PRICE) GMV, 
> sum(ITEM_COUNT) CNT, SLR_SEGMENT_CD, LSTG_FORMAT_NAME from TEST_KYLIN_FACT 
> group by SLR_SEGMENT_CD, LSTG_FORMAT_NAME) order by TOTAL_GMV) limit 5"
> {code}
> A similar issue occurs with the following SQL:
> {code}
> select LSTG_FORMAT_NAME,
>        SLR_SEGMENT_CD,
>        CAL_DT,
>        sum(CNT) as CNT
> from
>   (select LSTG_FORMAT_NAME,
>           SLR_SEGMENT_CD,
>           CAL_DT,
>           sum(ITEM_COUNT) CNT
>    from TEST_KYLIN_FACT
>    where LSTG_FORMAT_NAME = 'ABIN'
>    group by LSTG_FORMAT_NAME,
>             SLR_SEGMENT_CD,
>             CAL_DT
>    UNION ALL select 'NON-ABIN' as LSTG_FORMAT_NAME,
>                     SLR_SEGMENT_CD,
>                     CAL_DT,
>                     case
>                         when SLR_SEGMENT_CD > 1000 then CNT * 2
>                         else CNT * 3
>                     end as CNT
>    from
>      (select SLR_SEGMENT_CD,
>              CAL_DT,
>              sum(ITEM_COUNT) CNT
>       from TEST_KYLIN_FACT
>       where LSTG_FORMAT_NAME <> 'ABIN'
>       group by SLR_SEGMENT_CD,CAL_DT))
> group by LSTG_FORMAT_NAME,
>          SLR_SEGMENT_CD,
>          CAL_DT
> order by CNT
> {code}
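
For reference, the result the first test query should produce can be checked outside Kylin. The sketch below is an illustration, not part of the issue: it runs the same SQL with Python's built-in sqlite3 module against a tiny made-up TEST_KYLIN_FACT table. The sample rows and the reduced column set are hypothetical, and SQLite 3.25 or newer is assumed for window function support.

```python
import sqlite3

# Hypothetical sample rows: (LSTG_FORMAT_NAME, SLR_SEGMENT_CD, PRICE, ITEM_COUNT).
# The real TEST_KYLIN_FACT table has more columns and far more rows.
rows = [
    ("ABIN", 1, 10.0, 2),
    ("ABIN", 1, 5.0, 1),
    ("Auction", 2, 7.0, 3),
]

conn = sqlite3.connect(":memory:")
conn.execute(
    "create table TEST_KYLIN_FACT ("
    "LSTG_FORMAT_NAME text, SLR_SEGMENT_CD integer, PRICE real, ITEM_COUNT integer)"
)
conn.executemany("insert into TEST_KYLIN_FACT values (?, ?, ?, ?)", rows)

# The test SQL from the issue, unchanged apart from whitespace.
QUERY = """
select CNT, GMV, sum(GMV) over(partition by SLR_SEGMENT_CD) TOTAL_GMV,
       SLR_SEGMENT_CD, LSTG_FORMAT_NAME
from
  (select sum(PRICE) GMV, sum(ITEM_COUNT) CNT, SLR_SEGMENT_CD, LSTG_FORMAT_NAME
   from TEST_KYLIN_FACT group by SLR_SEGMENT_CD, LSTG_FORMAT_NAME
   UNION ALL
   select sum(PRICE) GMV, sum(ITEM_COUNT) CNT, SLR_SEGMENT_CD, LSTG_FORMAT_NAME
   from TEST_KYLIN_FACT group by SLR_SEGMENT_CD, LSTG_FORMAT_NAME)
order by TOTAL_GMV
"""

# Each aggregated row appears twice because both UNION ALL branches are identical,
# so the window sum per SLR_SEGMENT_CD is twice the group's GMV.
result = conn.execute(QUERY).fetchall()
for row in result:
    print(row)
```

With this sample data the window sum is simply double each group's GMV, which makes a wrong or failing result easy to spot.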



--
This message was sent by Atlassian Jira
(v8.3.4#803005)


[jira] [Updated] (KYLIN-4658) Windows function does not work for union all

2020-08-03 Thread Zhong Yanghong (Jira)


 [ 
https://issues.apache.org/jira/browse/KYLIN-4658?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Zhong Yanghong updated KYLIN-4658:
--
Description: 
Test SQL:

{code}

select CNT, GMV, sum(GMV) over(partition by SLR_SEGMENT_CD) TOTAL_GMV, 
SLR_SEGMENT_CD, LSTG_FORMAT_NAME
from 
(select sum(PRICE) GMV, sum(ITEM_COUNT) CNT, SLR_SEGMENT_CD, LSTG_FORMAT_NAME 
from TEST_KYLIN_FACT group by SLR_SEGMENT_CD, LSTG_FORMAT_NAME 
UNION ALL
select sum(PRICE) GMV, sum(ITEM_COUNT) CNT, SLR_SEGMENT_CD, LSTG_FORMAT_NAME 
from TEST_KYLIN_FACT group by SLR_SEGMENT_CD, LSTG_FORMAT_NAME) 
order by TOTAL_GMV

{code}

 

Exception:

{code}
Index: 2, Size: 2 while executing SQL: "select * from (select CNT, GMV, 
sum(GMV) over(partition by SLR_SEGMENT_CD) TOTAL_GMV, SLR_SEGMENT_CD, 
LSTG_FORMAT_NAME from (select sum(PRICE) GMV, sum(ITEM_COUNT) CNT, 
SLR_SEGMENT_CD, LSTG_FORMAT_NAME from TEST_KYLIN_FACT group by SLR_SEGMENT_CD, 
LSTG_FORMAT_NAME UNION ALL select sum(PRICE) GMV, sum(ITEM_COUNT) CNT, 
SLR_SEGMENT_CD, LSTG_FORMAT_NAME from TEST_KYLIN_FACT group by SLR_SEGMENT_CD, 
LSTG_FORMAT_NAME) order by TOTAL_GMV) limit 5"
{code}

A similar issue occurs with the following SQL:
{code}
select LSTG_FORMAT_NAME,
       SLR_SEGMENT_CD,
       CAL_DT,
       sum(CNT) as CNT
from
  (select LSTG_FORMAT_NAME,
          SLR_SEGMENT_CD,
          CAL_DT,
          sum(ITEM_COUNT) CNT
   from TEST_KYLIN_FACT
   where LSTG_FORMAT_NAME = 'ABIN'
   group by LSTG_FORMAT_NAME,
            SLR_SEGMENT_CD,
            CAL_DT
   UNION ALL select 'NON-ABIN' as LSTG_FORMAT_NAME,
                    SLR_SEGMENT_CD,
                    CAL_DT,
                    case
                        when SLR_SEGMENT_CD > 1000 then CNT * 2
                        else CNT * 3
                    end as CNT
   from
     (select SLR_SEGMENT_CD,
             CAL_DT,
             sum(ITEM_COUNT) CNT
      from TEST_KYLIN_FACT
      where LSTG_FORMAT_NAME <> 'ABIN'
      group by SLR_SEGMENT_CD,CAL_DT))
group by LSTG_FORMAT_NAME,
         SLR_SEGMENT_CD,
         CAL_DT
order by CNT
{code}

  was:
Test SQL:

{code}

select CNT, GMV, sum(GMV) over(partition by SLR_SEGMENT_CD) TOTAL_GMV, 
SLR_SEGMENT_CD, LSTG_FORMAT_NAME
from 
(select sum(PRICE) GMV, sum(ITEM_COUNT) CNT, SLR_SEGMENT_CD, LSTG_FORMAT_NAME 
from TEST_KYLIN_FACT group by SLR_SEGMENT_CD, LSTG_FORMAT_NAME 
UNION ALL
select sum(PRICE) GMV, sum(ITEM_COUNT) CNT, SLR_SEGMENT_CD, LSTG_FORMAT_NAME 
from TEST_KYLIN_FACT group by SLR_SEGMENT_CD, LSTG_FORMAT_NAME) 
order by TOTAL_GMV

{code}

 

Exception:

{code}
Index: 2, Size: 2 while executing SQL: "select * from (select CNT, GMV, 
sum(GMV) over(partition by SLR_SEGMENT_CD) TOTAL_GMV, SLR_SEGMENT_CD, 
LSTG_FORMAT_NAME from (select sum(PRICE) GMV, sum(ITEM_COUNT) CNT, 
SLR_SEGMENT_CD, LSTG_FORMAT_NAME from TEST_KYLIN_FACT group by SLR_SEGMENT_CD, 
LSTG_FORMAT_NAME UNION ALL select sum(PRICE) GMV, sum(ITEM_COUNT) CNT, 
SLR_SEGMENT_CD, LSTG_FORMAT_NAME from TEST_KYLIN_FACT group by SLR_SEGMENT_CD, 
LSTG_FORMAT_NAME) order by TOTAL_GMV) limit 5"
{code}


> Windows function does not work for union all
> 
>
> Key: KYLIN-4658
> URL: https://issues.apache.org/jira/browse/KYLIN-4658
> Project: Kylin
>  Issue Type: Improvement
>Reporter: Zhong Yanghong
>Assignee: Zhong Yanghong
>Priority: Major
>
> Test SQL:
> {code}
> select CNT, GMV, sum(GMV) over(partition by SLR_SEGMENT_CD) TOTAL_GMV, 
> SLR_SEGMENT_CD, LSTG_FORMAT_NAME
> from 
> (select sum(PRICE) GMV, sum(ITEM_COUNT) CNT, SLR_SEGMENT_CD, LSTG_FORMAT_NAME 
> from TEST_KYLIN_FACT group by SLR_SEGMENT_CD, LSTG_FORMAT_NAME 
> UNION ALL
> select sum(PRICE) GMV, sum(ITEM_COUNT) CNT, SLR_SEGMENT_CD, LSTG_FORMAT_NAME 
> from TEST_KYLIN_FACT group by SLR_SEGMENT_CD, LSTG_FORMAT_NAME) 
> order by TOTAL_GMV
> {code}
>  
> Exception:
> {code}
> Index: 2, Size: 2 while executing SQL: "select * from (select CNT, GMV, 
> sum(GMV) over(partition by SLR_SEGMENT_CD) TOTAL_GMV, SLR_SEGMENT_CD, 
> LSTG_FORMAT_NAME from (select sum(PRICE) GMV, sum(ITEM_COUNT) CNT, 
> SLR_SEGMENT_CD, LSTG_FORMAT_NAME from TEST_KYLIN_FACT group by 
> SLR_SEGMENT_CD, LSTG_FORMAT_NAME UNION ALL select sum(PRICE) GMV, 
> sum(ITEM_COUNT) CNT, SLR_SEGMENT_CD, LSTG_FORMAT_NAME from TEST_KYLIN_FACT 
> group by SLR_SEGMENT_CD, LSTG_FORMAT_NAME) order by TOTAL_GMV) limit 5"
> {code}
> A similar issue occurs with the following SQL:
> {code}
> select LSTG_FORMAT_NAME,
>        SLR_SEGMENT_CD,
>        CAL_DT,
>        sum(CNT) as CNT
> from
>   (select LSTG_FORMAT_NAME,
>           SLR_SEGMENT_CD,
>           CAL_DT,
>           sum(ITEM_COUNT) CNT
>    from TEST_KYLIN_FACT
>    where LSTG_FORMAT_NAME = 'ABIN'
>    group by LSTG_FORMAT_NAME,
>             SLR_SEGMENT_CD,
>             CAL_DT
>    UNION ALL select 'NON-ABIN' as LSTG_FORMAT_NAME,
> 

[jira] [Updated] (KYLIN-4674) support cast in sum() expression

2020-08-02 Thread Zhong Yanghong (Jira)


 [ 
https://issues.apache.org/jira/browse/KYLIN-4674?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Zhong Yanghong updated KYLIN-4674:
--
Description: 
Make it possible to support the following SQL:

{code}

select LSTG_FORMAT_NAME,
       sum(cast(ITEM_COUNT as decimal(18, 6))),
       sum(case
               when LSTG_FORMAT_NAME = 'ABIN' then cast(ITEM_COUNT as decimal(18, 6))
               when LSTG_FORMAT_NAME = 'Auction' then 2
           end),
       sum(cast(case
                    when LSTG_FORMAT_NAME = 'ABIN' then ITEM_COUNT
                    when LSTG_FORMAT_NAME = 'Auction' then 2
                end as decimal(18, 6)))
from TEST_KYLIN_FACT
group by LSTG_FORMAT_NAME

{code}

> support cast in sum() expression
> 
>
> Key: KYLIN-4674
> URL: https://issues.apache.org/jira/browse/KYLIN-4674
> Project: Kylin
>  Issue Type: Improvement
>Reporter: Zhong Yanghong
>Assignee: Zhong Yanghong
>Priority: Major
>
> Make it possible to support the following SQL:
> {code}
> select LSTG_FORMAT_NAME,
>        sum(cast(ITEM_COUNT as decimal(18, 6))),
>        sum(case
>                when LSTG_FORMAT_NAME = 'ABIN' then cast(ITEM_COUNT as decimal(18, 6))
>                when LSTG_FORMAT_NAME = 'Auction' then 2
>            end),
>        sum(cast(case
>                     when LSTG_FORMAT_NAME = 'ABIN' then ITEM_COUNT
>                     when LSTG_FORMAT_NAME = 'Auction' then 2
>                 end as decimal(18, 6)))
> from TEST_KYLIN_FACT
> group by LSTG_FORMAT_NAME
> {code}
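
To make the intended semantics concrete, here is a small sketch that evaluates the same query with Python's built-in sqlite3 module. It is only an illustration under assumptions: the sample rows and the two-column table are hypothetical, and SQLite treats decimal(18, 6) as plain NUMERIC affinity, so it shows which rows each sum() covers rather than Kylin's decimal precision handling.

```python
import sqlite3

# Hypothetical sample rows: (LSTG_FORMAT_NAME, ITEM_COUNT).
rows = [
    ("ABIN", 2),
    ("ABIN", 1),
    ("Auction", 3),
    ("Auction", 4),
    ("FP-GTC", 5),
]

conn = sqlite3.connect(":memory:")
conn.execute("create table TEST_KYLIN_FACT (LSTG_FORMAT_NAME text, ITEM_COUNT integer)")
conn.executemany("insert into TEST_KYLIN_FACT values (?, ?)", rows)

# The query from the issue, unchanged apart from whitespace.
QUERY = """
select LSTG_FORMAT_NAME,
       sum(cast(ITEM_COUNT as decimal(18, 6))),
       sum(case
               when LSTG_FORMAT_NAME = 'ABIN' then cast(ITEM_COUNT as decimal(18, 6))
               when LSTG_FORMAT_NAME = 'Auction' then 2
           end),
       sum(cast(case
                    when LSTG_FORMAT_NAME = 'ABIN' then ITEM_COUNT
                    when LSTG_FORMAT_NAME = 'Auction' then 2
                end as decimal(18, 6)))
from TEST_KYLIN_FACT
group by LSTG_FORMAT_NAME
"""

# Map each format name to its three aggregated values.
by_format = {row[0]: row[1:] for row in conn.execute(QUERY)}
for name, sums in sorted(by_format.items()):
    print(name, sums)
```

The second and third expressions differ only in whether the cast sits inside or outside the case, so they should agree for every group. A format matched by neither case branch yields NULL for both.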





[jira] [Created] (KYLIN-4674) support cast in sum() expression

2020-08-02 Thread Zhong Yanghong (Jira)
Zhong Yanghong created KYLIN-4674:
-

 Summary: support cast in sum() expression
 Key: KYLIN-4674
 URL: https://issues.apache.org/jira/browse/KYLIN-4674
 Project: Kylin
  Issue Type: Improvement
Reporter: Zhong Yanghong
Assignee: Zhong Yanghong








[jira] [Updated] (KYLIN-4670) Improve query performance by reusing LookupStringTable and using multi-threads

2020-07-30 Thread Zhong Yanghong (Jira)


 [ 
https://issues.apache.org/jira/browse/KYLIN-4670?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Zhong Yanghong updated KYLIN-4670:
--
Attachment: (was: Single-Thread-First-Time.png)

> Improve query performance by reusing LookupStringTable and using multi-threads
> --
>
> Key: KYLIN-4670
> URL: https://issues.apache.org/jira/browse/KYLIN-4670
> Project: Kylin
>  Issue Type: Improvement
>Reporter: Zhong Yanghong
>Assignee: Zhong Yanghong
>Priority: Major
> Attachments: Five-Threads-First-Time.png, 
> Five-Threads-Second-Time.png, Single-Thread-First-Time.png, 
> Single-Thread-Second-Time.png
>
>
> For a cube with 37 segments whose related snapshots all differ from each 
> other, run a query with a lookup-table derived column. Here is the performance 
> comparison between a single thread and five threads.
>  * First query, without the snapshot table cache:
>  ** Single Thread
> !Single-Thread-First-Time.png|width=600,height=200!
>  ** Five Threads
> !Five-Threads-First-Time.png|width=600,height=200!
>  * Second query, with the snapshot table cache:
>  ** Single Thread
> !Single-Thread-Second-Time.png|width=600,height=200!
>  ** Five Threads
> !Five-Threads-Second-Time.png|width=600,height=200!
>  





[jira] [Updated] (KYLIN-4670) Improve query performance by reusing LookupStringTable and using multi-threads

2020-07-30 Thread Zhong Yanghong (Jira)


 [ 
https://issues.apache.org/jira/browse/KYLIN-4670?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Zhong Yanghong updated KYLIN-4670:
--
Attachment: Single-Thread-First-Time.png
Five-Threads-First-Time.png
Five-Threads-Second-Time.png
Single-Thread-Second-Time.png

> Improve query performance by reusing LookupStringTable and using multi-threads
> --
>
> Key: KYLIN-4670
> URL: https://issues.apache.org/jira/browse/KYLIN-4670
> Project: Kylin
>  Issue Type: Improvement
>Reporter: Zhong Yanghong
>Assignee: Zhong Yanghong
>Priority: Major
> Attachments: Five-Threads-First-Time.png, 
> Five-Threads-Second-Time.png, Single-Thread-First-Time.png, 
> Single-Thread-Second-Time.png
>
>
> For a cube with 37 segments whose related snapshots all differ from each 
> other, run a query with a lookup-table derived column. Here is the performance 
> comparison between a single thread and five threads.
>  * First query, without the snapshot table cache:
>  ** Single Thread
> !Single-Thread-First-Time.png|width=600,height=200!
>  ** Five Threads
> !Five-Threads-First-Time.png|width=600,height=200!
>  * Second query, with the snapshot table cache:
>  ** Single Thread
> !Single-Thread-Second-Time.png|width=600,height=200!
>  ** Five Threads
> !Five-Threads-Second-Time.png|width=600,height=200!
>  





[jira] [Updated] (KYLIN-4670) Improve query performance by reusing LookupStringTable and using multi-threads

2020-07-30 Thread Zhong Yanghong (Jira)


 [ 
https://issues.apache.org/jira/browse/KYLIN-4670?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Zhong Yanghong updated KYLIN-4670:
--
Attachment: (was: Single-Thread-Second-Time.png)

> Improve query performance by reusing LookupStringTable and using multi-threads
> --
>
> Key: KYLIN-4670
> URL: https://issues.apache.org/jira/browse/KYLIN-4670
> Project: Kylin
>  Issue Type: Improvement
>Reporter: Zhong Yanghong
>Assignee: Zhong Yanghong
>Priority: Major
> Attachments: Five-Threads-First-Time.png, 
> Five-Threads-Second-Time.png, Single-Thread-First-Time.png, 
> Single-Thread-Second-Time.png
>
>
> For a cube with 37 segments whose related snapshots all differ from each 
> other, run a query with a lookup-table derived column. Here is the performance 
> comparison between a single thread and five threads.
>  * First query, without the snapshot table cache:
>  ** Single Thread
> !Single-Thread-First-Time.png|width=600,height=200!
>  ** Five Threads
> !Five-Threads-First-Time.png|width=600,height=200!
>  * Second query, with the snapshot table cache:
>  ** Single Thread
> !Single-Thread-Second-Time.png|width=600,height=200!
>  ** Five Threads
> !Five-Threads-Second-Time.png|width=600,height=200!
>  





[jira] [Updated] (KYLIN-4670) Improve query performance by reusing LookupStringTable and using multi-threads

2020-07-30 Thread Zhong Yanghong (Jira)


 [ 
https://issues.apache.org/jira/browse/KYLIN-4670?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Zhong Yanghong updated KYLIN-4670:
--
Attachment: (was: Five-Threads-Second-Time.png)

> Improve query performance by reusing LookupStringTable and using multi-threads
> --
>
> Key: KYLIN-4670
> URL: https://issues.apache.org/jira/browse/KYLIN-4670
> Project: Kylin
>  Issue Type: Improvement
>Reporter: Zhong Yanghong
>Assignee: Zhong Yanghong
>Priority: Major
> Attachments: Single-Thread-First-Time.png, 
> Single-Thread-Second-Time.png
>
>
> For a cube with 37 segments whose related snapshots all differ from each 
> other, run a query with a lookup-table derived column. Here is the performance 
> comparison between a single thread and five threads.
>  * First query, without the snapshot table cache:
>  ** Single Thread
> !Single-Thread-First-Time.png|width=600,height=200!
>  ** Five Threads
> !Five-Threads-First-Time.png|width=600,height=200!
>  * Second query, with the snapshot table cache:
>  ** Single Thread
> !Single-Thread-Second-Time.png|width=600,height=200!
>  ** Five Threads
> !Five-Threads-Second-Time.png|width=600,height=200!
>  





[jira] [Updated] (KYLIN-4670) Improve query performance by reusing LookupStringTable and using multi-threads

2020-07-30 Thread Zhong Yanghong (Jira)


 [ 
https://issues.apache.org/jira/browse/KYLIN-4670?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Zhong Yanghong updated KYLIN-4670:
--
Attachment: (was: Five-Threads-First-Time.png)

> Improve query performance by reusing LookupStringTable and using multi-threads
> --
>
> Key: KYLIN-4670
> URL: https://issues.apache.org/jira/browse/KYLIN-4670
> Project: Kylin
>  Issue Type: Improvement
>Reporter: Zhong Yanghong
>Assignee: Zhong Yanghong
>Priority: Major
> Attachments: Single-Thread-First-Time.png, 
> Single-Thread-Second-Time.png
>
>
> For a cube with 37 segments whose related snapshots all differ from each 
> other, run a query with a lookup-table derived column. Here is the performance 
> comparison between a single thread and five threads.
>  * First query, without the snapshot table cache:
>  ** Single Thread
> !Single-Thread-First-Time.png|width=600,height=200!
>  ** Five Threads
> !Five-Threads-First-Time.png|width=600,height=200!
>  * Second query, with the snapshot table cache:
>  ** Single Thread
> !Single-Thread-Second-Time.png|width=600,height=200!
>  ** Five Threads
> !Five-Threads-Second-Time.png|width=600,height=200!
>  





[jira] [Updated] (KYLIN-4670) Improve query performance by reusing LookupStringTable and using multi-threads

2020-07-30 Thread Zhong Yanghong (Jira)


 [ 
https://issues.apache.org/jira/browse/KYLIN-4670?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Zhong Yanghong updated KYLIN-4670:
--
Description: 
For a cube with 37 segments whose related snapshots all differ from each 
other, run a query with a lookup-table derived column. Here is the performance 
comparison between a single thread and five threads.
 * First query, without the snapshot table cache:
 ** Single Thread
!Single-Thread-First-Time.png|width=600,height=200!
 ** Five Threads
!Five-Threads-First-Time.png|width=600,height=200!

 * Second query, with the snapshot table cache:
 ** Single Thread
!Single-Thread-Second-Time.png|width=600,height=200!
 ** Five Threads
!Five-Threads-Second-Time.png|width=600,height=200!
 

  was: For a cube with 37 segments whose related snapshots all differ from 
each other.


> Improve query performance by reusing LookupStringTable and using multi-threads
> --
>
> Key: KYLIN-4670
> URL: https://issues.apache.org/jira/browse/KYLIN-4670
> Project: Kylin
>  Issue Type: Improvement
>Reporter: Zhong Yanghong
>Assignee: Zhong Yanghong
>Priority: Major
> Attachments: Five-Threads-First-Time.png, 
> Five-Threads-Second-Time.png, Single-Thread-First-Time.png, 
> Single-Thread-Second-Time.png
>
>
> For a cube with 37 segments whose related snapshots all differ from each 
> other, run a query with a lookup-table derived column. Here is the performance 
> comparison between a single thread and five threads.
>  * First query, without the snapshot table cache:
>  ** Single Thread
> !Single-Thread-First-Time.png|width=600,height=200!
>  ** Five Threads
> !Five-Threads-First-Time.png|width=600,height=200!
>  * Second query, with the snapshot table cache:
>  ** Single Thread
> !Single-Thread-Second-Time.png|width=600,height=200!
>  ** Five Threads
> !Five-Threads-Second-Time.png|width=600,height=200!
>  
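
The two ideas in the issue title, reusing already-loaded lookup tables and loading the missing ones with several threads, can be sketched as follows. This is a minimal Python illustration under assumptions, not Kylin's Java implementation: SnapshotCache, fake_loader and the snapshot names are hypothetical, and the pool size of five only mirrors the comparison above.

```python
import threading
from concurrent.futures import ThreadPoolExecutor


class SnapshotCache:
    """Hypothetical cache of lookup tables keyed by snapshot path."""

    def __init__(self, loader, threads=5):
        self._loader = loader      # function: snapshot path -> loaded table
        self._threads = threads    # number of worker threads used for loading
        self._tables = {}          # snapshot path -> table already loaded
        self._lock = threading.Lock()

    def get_all(self, snapshot_paths):
        # Find the distinct snapshots that are not cached yet.
        with self._lock:
            missing = [p for p in dict.fromkeys(snapshot_paths) if p not in self._tables]
        if missing:
            # Load the missing snapshots in parallel instead of one by one.
            with ThreadPoolExecutor(max_workers=self._threads) as pool:
                loaded = list(pool.map(self._loader, missing))
            with self._lock:
                self._tables.update(zip(missing, loaded))
        # Later queries over the same segments reuse the cached tables.
        return [self._tables[p] for p in snapshot_paths]


load_calls = []


def fake_loader(path):
    # Stand-in for reading a snapshot; records each load to show cache reuse.
    load_calls.append(path)
    return {"path": path}


cache = SnapshotCache(fake_loader, threads=5)
paths = [f"snapshot-{i}" for i in range(37)]  # one distinct snapshot per segment
first = cache.get_all(paths)    # first query: all 37 snapshots are loaded
second = cache.get_all(paths)   # second query: served from the cache
print(len(load_calls))
```

In this sketch the second call triggers no loads at all, which corresponds to the "second query, with the snapshot table cache" case in the comparison.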





[jira] [Updated] (KYLIN-4670) Improve query performance by reusing LookupStringTable and using multi-threads

2020-07-30 Thread Zhong Yanghong (Jira)


 [ 
https://issues.apache.org/jira/browse/KYLIN-4670?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Zhong Yanghong updated KYLIN-4670:
--
Attachment: Single-Thread-First-Time.png
Single-Thread-Second-Time.png
Five-Threads-Second-Time.png
Five-Threads-First-Time.png

> Improve query performance by reusing LookupStringTable and using multi-threads
> --
>
> Key: KYLIN-4670
> URL: https://issues.apache.org/jira/browse/KYLIN-4670
> Project: Kylin
>  Issue Type: Improvement
>Reporter: Zhong Yanghong
>Assignee: Zhong Yanghong
>Priority: Major
> Attachments: Five-Threads-First-Time.png, 
> Five-Threads-Second-Time.png, Single-Thread-First-Time.png, 
> Single-Thread-Second-Time.png
>
>
> For a cube with 37 segments whose related snapshots all differ from each 
> other.





[jira] [Updated] (KYLIN-4670) Improve query performance by reusing LookupStringTable and using multi-threads

2020-07-30 Thread Zhong Yanghong (Jira)


 [ 
https://issues.apache.org/jira/browse/KYLIN-4670?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Zhong Yanghong updated KYLIN-4670:
--
Description: For a cube with 37 segments whose related snapshots all 
differ from each other.

> Improve query performance by reusing LookupStringTable and using multi-threads
> --
>
> Key: KYLIN-4670
> URL: https://issues.apache.org/jira/browse/KYLIN-4670
> Project: Kylin
>  Issue Type: Improvement
>Reporter: Zhong Yanghong
>Assignee: Zhong Yanghong
>Priority: Major
>
> For a cube with 37 segments whose related snapshots all differ from each 
> other.





[jira] [Updated] (KYLIN-4670) Improve query performance by reusing LookupStringTable and using multi-threads

2020-07-30 Thread Zhong Yanghong (Jira)


 [ 
https://issues.apache.org/jira/browse/KYLIN-4670?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Zhong Yanghong updated KYLIN-4670:
--
Summary: Improve query performance by reusing LookupStringTable and using 
multi-threads  (was: Improve query performance by reusing LookupStringTable)

> Improve query performance by reusing LookupStringTable and using multi-threads
> --
>
> Key: KYLIN-4670
> URL: https://issues.apache.org/jira/browse/KYLIN-4670
> Project: Kylin
>  Issue Type: Improvement
>Reporter: Zhong Yanghong
>Assignee: Zhong Yanghong
>Priority: Major
>





