[jira] [Commented] (KYLIN-3635) Percentile calculation on Spark engine is wrong
[ https://issues.apache.org/jira/browse/KYLIN-3635?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=1785#comment-1785 ]

ASF subversion and git services commented on KYLIN-3635:

Commit 26e71a762b08445ad173fe42fc84c204ff33fe49 in kylin's branch refs/heads/master from tttMelody
[ https://gitbox.apache.org/repos/asf?p=kylin.git;h=26e71a7 ]

KYLIN-3635, make a stronger constraint about implementing method reset().

> Percentile calculation on Spark engine is wrong
> ---
>
> Key: KYLIN-3635
> URL: https://issues.apache.org/jira/browse/KYLIN-3635
> Project: Kylin
> Issue Type: Bug
> Components: Spark Engine
> Affects Versions: v2.3.0, v2.3.1, v2.4.0, v2.3.2, v2.4.1, v2.5.0
> Reporter: Shaofeng SHI
> Assignee: Shaofeng SHI
> Priority: Major
> Fix For: v2.4.2, v2.5.1
>
> As titled. A user reported that the percentile result is wrong when using the
> Spark engine. Checking the code showed that the object was reused and not reset.
> The problem does not occur on the normal MR engine, because the object is
> serialized before the next call. It does occur on Spark, because the object is
> cached in memory and serialized later.

--
This message was sent by Atlassian JIRA (v7.6.3#76005)
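The failure mode described in the issue (a reused object that is serialized immediately on one engine but cached and serialized later on another) can be sketched in a few lines of Java. This is only a simplified model under stated assumptions: `ReuseDemo`, `Counter`, `Ingester`, `serializeImmediately` and `cacheThenSerialize` are hypothetical names invented for illustration, and copying a list stands in for serialization. It is not Kylin's or Spark's actual code.

```java
import java.util.ArrayList;
import java.util.List;

// Simplified model of the reported bug; none of these classes are Kylin's real ones.
class ReuseDemo {
    // Hypothetical stand-in for a mutable measure object such as a percentile counter.
    static final class Counter {
        final List<Double> values = new ArrayList<>();
    }

    // Hypothetical ingester that hands out the same Counter instance on every call.
    static final class Ingester {
        private Counter current = new Counter();

        Counter valueOf(double v) {
            current.values.clear();   // overwrite the shared instance in place
            current.values.add(v);
            return current;
        }

        void reset() {
            current = new Counter();  // the next valueOf() fills a fresh instance
        }
    }

    // "MR-like" consumer: each result is copied (serialized) before the next call.
    static List<List<Double>> serializeImmediately(double[] rows) {
        Ingester ingester = new Ingester();
        List<List<Double>> out = new ArrayList<>();
        for (double row : rows) {
            out.add(new ArrayList<>(ingester.valueOf(row).values));
        }
        return out;
    }

    // "Spark-like" consumer: references are cached first and only read afterwards.
    static List<List<Double>> cacheThenSerialize(double[] rows, boolean callReset) {
        Ingester ingester = new Ingester();
        List<Counter> cached = new ArrayList<>();
        for (double row : rows) {
            cached.add(ingester.valueOf(row));
            if (callReset) {
                ingester.reset();
            }
        }
        List<List<Double>> out = new ArrayList<>();
        for (Counter c : cached) {
            out.add(new ArrayList<>(c.values));
        }
        return out;
    }

    public static void main(String[] args) {
        double[] rows = {1.0, 2.0, 3.0};
        System.out.println(serializeImmediately(rows));       // [[1.0], [2.0], [3.0]]
        System.out.println(cacheThenSerialize(rows, false));  // [[3.0], [3.0], [3.0]]
        System.out.println(cacheThenSerialize(rows, true));   // [[1.0], [2.0], [3.0]]
    }
}
```

In this model the cached references all alias one object when reset() is never called, so every row ends up carrying the last row's value. Calling reset() after each row gives every cached reference its own object.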
[jira] [Commented] (KYLIN-3635) Percentile calculation on Spark engine is wrong
[ https://issues.apache.org/jira/browse/KYLIN-3635?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=1784#comment-1784 ]

ASF GitHub Bot commented on KYLIN-3635:

shaofengshi closed pull request #316: KYLIN-3635, make a stronger constraint about implementing method reset().
URL: https://github.com/apache/kylin/pull/316

This is a PR merged from a forked repository. As GitHub hides the original diff on merge, it is displayed below for the sake of provenance:

As this is a foreign pull request (from a fork), the diff is supplied below (as it won't show otherwise due to GitHub magic):

diff --git a/core-metadata/src/main/java/org/apache/kylin/measure/MeasureIngester.java b/core-metadata/src/main/java/org/apache/kylin/measure/MeasureIngester.java
index ed2cb02e59..b48acf0862 100644
--- a/core-metadata/src/main/java/org/apache/kylin/measure/MeasureIngester.java
+++ b/core-metadata/src/main/java/org/apache/kylin/measure/MeasureIngester.java
@@ -43,9 +43,9 @@
     abstract public V valueOf(String[] values, MeasureDesc measureDesc, Map> dictionaryMap);
-    public void reset() {
-
-    }
+    // Be attention with this, do remember resetting objects if you init in your implementation.
+    // See more details in KYLIN-3635.
+    abstract public void reset();
     public V reEncodeDictionary(V value, MeasureDesc measureDesc, Map> oldDicts, Map> newDicts) {
         throw new UnsupportedOperationException();
diff --git a/core-metadata/src/main/java/org/apache/kylin/measure/basic/BigDecimalIngester.java b/core-metadata/src/main/java/org/apache/kylin/measure/basic/BigDecimalIngester.java
index c7541abb04..5194606b11 100644
--- a/core-metadata/src/main/java/org/apache/kylin/measure/basic/BigDecimalIngester.java
+++ b/core-metadata/src/main/java/org/apache/kylin/measure/basic/BigDecimalIngester.java
@@ -38,4 +38,9 @@ public BigDecimal valueOf(String[] values, MeasureDesc measureDesc, Map reEncodeDictionary(List value, MeasureDesc measureDesc, Map> oldDicts, Map> newDicts) {
diff --git a/core-metadata/src/main/java/org/apache/kylin/measure/topn/TopNMeasureType.java b/core-metadata/src/main/java/org/apache/kylin/measure/topn/TopNMeasureType.java
index 9b6ff0ac20..d7b1bd7382 100644
--- a/core-metadata/src/main/java/org/apache/kylin/measure/topn/TopNMeasureType.java
+++ b/core-metadata/src/main/java/org/apache/kylin/measure/topn/TopNMeasureType.java
@@ -162,6 +162,11 @@ public boolean isMemoryHungry() {
     return topNCounter;
 }
+@Override
+public void reset() {
+
+}
+
 @Override
 public TopNCounter reEncodeDictionary(TopNCounter value, MeasureDesc measureDesc, Map> oldDicts, Map> newDicts) {

This is an automated message from the Apache Git Service. To respond to the message, please log on GitHub and use the URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at: us...@infra.apache.org
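The effect of declaring reset() abstract, as the diff above does, can be sketched as follows. This is a minimal illustration under stated assumptions: `SketchIngester`, `LongSumIngester` and `BufferIngester` are hypothetical classes made up for this example and are much smaller than the real MeasureIngester hierarchy. The point is only that an abstract method makes every subclass state explicitly whether it holds reusable state; it cannot check that an empty body is actually correct.

```java
// Hypothetical, trimmed-down analogue of an ingester base class; not Kylin's MeasureIngester.
abstract class SketchIngester<V> {
    abstract V valueOf(String[] values);

    // Abstract on purpose: every subclass must say how (or whether) it resets.
    abstract void reset();
}

// Stateless case: a new value is produced per call, so an empty reset() is a deliberate choice.
class LongSumIngester extends SketchIngester<Long> {
    @Override
    Long valueOf(String[] values) {
        long sum = 0;
        for (String v : values) {
            if (v != null) {
                sum += Long.parseLong(v);
            }
        }
        return sum;
    }

    @Override
    void reset() {
        // nothing is cached between calls
    }
}

// Stateful case: a buffer is reused across calls, so reset() must replace it.
class BufferIngester extends SketchIngester<StringBuilder> {
    private StringBuilder current = new StringBuilder();

    @Override
    StringBuilder valueOf(String[] values) {
        current.setLength(0);
        for (String v : values) {
            if (v != null) {
                current.append(v);
            }
        }
        return current;
    }

    @Override
    void reset() {
        current = new StringBuilder();
    }
}
```

A subclass that omits reset() no longer compiles, which is the "stronger constraint" in the pull request title. Whether an empty implementation is safe still has to be judged per class.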
[jira] [Commented] (KYLIN-3635) Percentile calculation on Spark engine is wrong
[ https://issues.apache.org/jira/browse/KYLIN-3635?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=1699#comment-1699 ]

ASF GitHub Bot commented on KYLIN-3635:

Aaron commented on a change in pull request #316: KYLIN-3635, make a stronger constraint about implementing method reset().
URL: https://github.com/apache/kylin/pull/316#discussion_r228788582

## File path: core-metadata/src/main/java/org/apache/kylin/measure/MeasureIngester.java ##

@@ -43,9 +43,7 @@
     abstract public V valueOf(String[] values, MeasureDesc measureDesc, Map> dictionaryMap);
-    public void reset() {
-
-    }
+    abstract public void reset();

Review comment:
Thanks, Shaofeng, Pretty good advice!
[jira] [Commented] (KYLIN-3635) Percentile calculation on Spark engine is wrong
[ https://issues.apache.org/jira/browse/KYLIN-3635?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=1696#comment-1696 ]

ASF GitHub Bot commented on KYLIN-3635:

shaofengshi commented on a change in pull request #316: KYLIN-3635, make a stronger constraint about implementing method reset().
URL: https://github.com/apache/kylin/pull/316#discussion_r228788317

## File path: core-metadata/src/main/java/org/apache/kylin/measure/MeasureIngester.java ##

@@ -43,9 +43,7 @@
     abstract public V valueOf(String[] values, MeasureDesc measureDesc, Map> dictionaryMap);
-    public void reset() {
-
-    }
+    abstract public void reset();

Review comment:
Jiatao, it is good to make this method abstract. If you can add a comment on this method to tell developers why it is so important, that would be great.
[jira] [Commented] (KYLIN-3635) Percentile calculation on Spark engine is wrong
[ https://issues.apache.org/jira/browse/KYLIN-3635?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16665152#comment-16665152 ]

ASF GitHub Bot commented on KYLIN-3635:

coveralls commented on issue #316: KYLIN-3635, make a stronger constraint about implementing method reset().
URL: https://github.com/apache/kylin/pull/316#issuecomment-433402739

## Pull Request Test Coverage Report for [Build 3814](https://coveralls.io/builds/19741079)

* **0** of **4** **(0.0%)** changed or added relevant lines in **4** files are covered.
* **5** unchanged lines in **1** file lost coverage.
* Overall coverage decreased (**-0.004%**) to **23.299%**

---

| Changes Missing Coverage | Covered Lines | Changed/Added Lines | % |
| :-----|--------------|--------|---: |
| [core-metadata/src/main/java/org/apache/kylin/measure/basic/BigDecimalIngester.java](https://coveralls.io/builds/19741079/source?filename=core-metadata%2Fsrc%2Fmain%2Fjava%2Forg%2Fapache%2Fkylin%2Fmeasure%2Fbasic%2FBigDecimalIngester.java#L45) | 0 | 1 | 0.0% |
| [core-metadata/src/main/java/org/apache/kylin/measure/extendedcolumn/ExtendedColumnMeasureType.java](https://coveralls.io/builds/19741079/source?filename=core-metadata%2Fsrc%2Fmain%2Fjava%2Forg%2Fapache%2Fkylin%2Fmeasure%2Fextendedcolumn%2FExtendedColumnMeasureType.java#L245) | 0 | 1 | 0.0% |
| [core-metadata/src/main/java/org/apache/kylin/measure/raw/RawMeasureType.java](https://coveralls.io/builds/19741079/source?filename=core-metadata%2Fsrc%2Fmain%2Fjava%2Forg%2Fapache%2Fkylin%2Fmeasure%2Fraw%2FRawMeasureType.java#L129) | 0 | 1 | 0.0% |
| [core-metadata/src/main/java/org/apache/kylin/measure/topn/TopNMeasureType.java](https://coveralls.io/builds/19741079/source?filename=core-metadata%2Fsrc%2Fmain%2Fjava%2Forg%2Fapache%2Fkylin%2Fmeasure%2Ftopn%2FTopNMeasureType.java#L168) | 0 | 1 | 0.0% |

| Files with Coverage Reduction | New Missed Lines | % |
| :-----|--------------|--: |
| [core-cube/src/main/java/org/apache/kylin/cube/inmemcubing/MemDiskStore.java](https://coveralls.io/builds/19741079/source?filename=core-cube%2Fsrc%2Fmain%2Fjava%2Forg%2Fapache%2Fkylin%2Fcube%2Finmemcubing%2FMemDiskStore.java#L439) | 5 | 77.81% |

| Totals | [![Coverage Status](https://coveralls.io/builds/19741079/badge)](https://coveralls.io/builds/19741079) |
| :-- | --: |
| Change from base [Build 3810](https://coveralls.io/builds/19711495): | -0.004% |
| Covered Lines: | 16314 |
| Relevant Lines: | 70019 |

---
[Coveralls](https://coveralls.io)
[jira] [Commented] (KYLIN-3635) Percentile calculation on Spark engine is wrong
[ https://issues.apache.org/jira/browse/KYLIN-3635?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16665133#comment-16665133 ]

ASF GitHub Bot commented on KYLIN-3635:

tttMelody opened a new pull request #316: KYLIN-3635, make a stronger constraint about implementing method rese…
URL: https://github.com/apache/kylin/pull/316

…t().
[jira] [Commented] (KYLIN-3635) Percentile calculation on Spark engine is wrong
[ https://issues.apache.org/jira/browse/KYLIN-3635?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16665134#comment-16665134 ]

ASF GitHub Bot commented on KYLIN-3635:

asfgit commented on issue #316: KYLIN-3635, make a stronger constraint about implementing method rese…
URL: https://github.com/apache/kylin/pull/316#issuecomment-433396821

Can one of the admins verify this patch?
[jira] [Commented] (KYLIN-3635) Percentile calculation on Spark engine is wrong
[ https://issues.apache.org/jira/browse/KYLIN-3635?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16665130#comment-16665130 ]

jiatao.tao commented on KYLIN-3635:

Hi Shaofeng, do you think we should provide a stronger constraint here? I tried making the method reset() abstract so that subclasses have to implement it, but I found that most of the implementations are still empty. What is your opinion?
[jira] [Commented] (KYLIN-3635) Percentile calculation on Spark engine is wrong
[ https://issues.apache.org/jira/browse/KYLIN-3635?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16655300#comment-16655300 ]

ASF subversion and git services commented on KYLIN-3635:

Commit 0b172436d90012c6573720d94134001e728e5e03 in kylin's branch refs/heads/2.2.x from shaofengshi
[ https://gitbox.apache.org/repos/asf?p=kylin.git;h=0b17243 ]

KYLIN-3635 Percentile calculation on Spark engine is wrong
[jira] [Commented] (KYLIN-3635) Percentile calculation on Spark engine is wrong
[ https://issues.apache.org/jira/browse/KYLIN-3635?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16655298#comment-16655298 ]

ASF subversion and git services commented on KYLIN-3635:

Commit ce4c9518926584e8c588fdd2271c39eeb80aa172 in kylin's branch refs/heads/2.4.x from shaofengshi
[ https://gitbox.apache.org/repos/asf?p=kylin.git;h=ce4c951 ]

KYLIN-3635 Percentile calculation on Spark engine is wrong
[jira] [Commented] (KYLIN-3635) Percentile calculation on Spark engine is wrong
[ https://issues.apache.org/jira/browse/KYLIN-3635?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16655299#comment-16655299 ]

ASF subversion and git services commented on KYLIN-3635:

Commit 126ceac301e8acb135a0d2e4f7d8693d78a7ec5a in kylin's branch refs/heads/2.3.x from shaofengshi
[ https://gitbox.apache.org/repos/asf?p=kylin.git;h=126ceac ]

KYLIN-3635 Percentile calculation on Spark engine is wrong
[jira] [Commented] (KYLIN-3635) Percentile calculation on Spark engine is wrong
[ https://issues.apache.org/jira/browse/KYLIN-3635?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16655296#comment-16655296 ]

ASF subversion and git services commented on KYLIN-3635:

Commit 0cd2a4c5be6d007177265533ee3523d048cd3d55 in kylin's branch refs/heads/2.5.x from shaofengshi
[ https://gitbox.apache.org/repos/asf?p=kylin.git;h=0cd2a4c ]

KYLIN-3635 Percentile calculation on Spark engine is wrong
[jira] [Commented] (KYLIN-3635) Percentile calculation on Spark engine is wrong
[ https://issues.apache.org/jira/browse/KYLIN-3635?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16655292#comment-16655292 ]

ASF subversion and git services commented on KYLIN-3635:

Commit 5a6ff1d54f004e2e8c1bcaa09e64bebc036df8ef in kylin's branch refs/heads/master from shaofengshi
[ https://gitbox.apache.org/repos/asf?p=kylin.git;h=5a6ff1d ]

KYLIN-3635 Percentile calculation on Spark engine is wrong
[jira] [Commented] (KYLIN-3635) Percentile calculation on Spark engine is wrong
[ https://issues.apache.org/jira/browse/KYLIN-3635?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16655291#comment-16655291 ]

ASF GitHub Bot commented on KYLIN-3635:

shaofengshi closed pull request #294: KYLIN-3635 Percentile calculation on Spark engine is wrong
URL: https://github.com/apache/kylin/pull/294

This is a PR merged from a forked repository. As GitHub hides the original diff on merge, it is displayed below for the sake of provenance:

As this is a foreign pull request (from a fork), the diff is supplied below (as it won't show otherwise due to GitHub magic):

diff --git a/core-metadata/src/main/java/org/apache/kylin/measure/percentile/PercentileMeasureType.java b/core-metadata/src/main/java/org/apache/kylin/measure/percentile/PercentileMeasureType.java
index 44bd2133b2..60b3282c71 100644
--- a/core-metadata/src/main/java/org/apache/kylin/measure/percentile/PercentileMeasureType.java
+++ b/core-metadata/src/main/java/org/apache/kylin/measure/percentile/PercentileMeasureType.java
@@ -82,6 +82,11 @@ public PercentileCounter valueOf(String[] values, MeasureDesc measureDesc,
                 }
                 return counter;
             }
+
+            @Override
+            public void reset() {
+                current = new PercentileCounter(dataType.getPrecision());
+            }
         };
     }
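The patch above resets by assigning a new PercentileCounter to `current` rather than emptying the existing one. A small sketch of why that distinction can matter when earlier results are still referenced elsewhere is given below. It is an illustration only: `ResetStrategies`, `ingest`, `resetByReplacing` and `resetByClearing` are hypothetical names, and a `List<Double>` stands in for the counter.

```java
import java.util.ArrayList;
import java.util.List;

// Illustration only: a List<Double> stands in for a PercentileCounter.
class ResetStrategies {
    private List<Double> current = new ArrayList<>();

    // Adds a value to the current object and returns a reference to it.
    List<Double> ingest(double v) {
        current.add(v);
        return current;
    }

    // Same shape as the patch: point the field at a brand-new object.
    // References handed out earlier keep their content.
    void resetByReplacing() {
        current = new ArrayList<>();
    }

    // Alternative that empties the same object in place.
    // References handed out earlier see the object being emptied and refilled.
    void resetByClearing() {
        current.clear();
    }
}
```

With resetByReplacing(), a caller that still holds the list returned before the reset continues to see the old value. With resetByClearing(), that same reference would show whatever is ingested next.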
[jira] [Commented] (KYLIN-3635) Percentile calculation on Spark engine is wrong
[ https://issues.apache.org/jira/browse/KYLIN-3635?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16655087#comment-16655087 ]

ASF GitHub Bot commented on KYLIN-3635:

codecov-io commented on issue #294: KYLIN-3635 Percentile calculation on Spark engine is wrong
URL: https://github.com/apache/kylin/pull/294#issuecomment-430976873

# [Codecov](https://codecov.io/gh/apache/kylin/pull/294?src=pr=h1) Report

> :exclamation: No coverage uploaded for pull request base (`master@2ab720d`). [Click here to learn what that means](https://docs.codecov.io/docs/error-reference#section-missing-base-commit).
> The diff coverage is `0%`.

[![Impacted file tree graph](https://codecov.io/gh/apache/kylin/pull/294/graphs/tree.svg?width=650=JawVgbgsVo=150=pr)](https://codecov.io/gh/apache/kylin/pull/294?src=pr=tree)

```diff
@@            Coverage Diff            @@
##             master     #294   +/-   ##
  Coverage          ?    21.3%
  Complexity        ?     4442
  Files             ?     1087
  Lines             ?    69969
  Branches          ?    10108
  Hits              ?    14909
  Misses            ?    53658
  Partials          ?     1402
```

| [Impacted Files](https://codecov.io/gh/apache/kylin/pull/294?src=pr=tree) | Coverage Δ | Complexity Δ | |
|---|---|---|---|
| [...ylin/measure/percentile/PercentileMeasureType.java](https://codecov.io/gh/apache/kylin/pull/294/diff?src=pr=tree#diff-Y29yZS1tZXRhZGF0YS9zcmMvbWFpbi9qYXZhL29yZy9hcGFjaGUva3lsaW4vbWVhc3VyZS9wZXJjZW50aWxlL1BlcmNlbnRpbGVNZWFzdXJlVHlwZS5qYXZh) | `50% <0%> (ø)` | `4 <0> (?)` | |

---

[Continue to review full report at Codecov](https://codecov.io/gh/apache/kylin/pull/294?src=pr=continue).
> **Legend** - [Click here to learn more](https://docs.codecov.io/docs/codecov-delta)
> `Δ = absolute (impact)`, `ø = not affected`, `? = missing data`
> Powered by [Codecov](https://codecov.io/gh/apache/kylin/pull/294?src=pr=footer). Last update [2ab720d...6b2ae48](https://codecov.io/gh/apache/kylin/pull/294?src=pr=lastupdated). Read the [comment docs](https://docs.codecov.io/docs/pull-request-comments).
[jira] [Commented] (KYLIN-3635) Percentile calculation on Spark engine is wrong
[ https://issues.apache.org/jira/browse/KYLIN-3635?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16655070#comment-16655070 ]

ASF GitHub Bot commented on KYLIN-3635:

asfgit commented on issue #294: KYLIN-3635 Percentile calculation on Spark engine is wrong
URL: https://github.com/apache/kylin/pull/294#issuecomment-430971630

Can one of the admins verify this patch?
[jira] [Commented] (KYLIN-3635) Percentile calculation on Spark engine is wrong
[ https://issues.apache.org/jira/browse/KYLIN-3635?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16655069#comment-16655069 ]

ASF GitHub Bot commented on KYLIN-3635:

shaofengshi opened a new pull request #294: KYLIN-3635 Percentile calculation on Spark engine is wrong
URL: https://github.com/apache/kylin/pull/294