[jira] [Commented] (KYLIN-3316) Reported NPE after cube build
[ https://issues.apache.org/jira/browse/KYLIN-3316?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16415064#comment-16415064 ] TianZhiwei commented on KYLIN-3316: --- Thanks > Reported NPE after cube build > - > > Key: KYLIN-3316 > URL: https://issues.apache.org/jira/browse/KYLIN-3316 > Project: Kylin > Issue Type: Bug > Components: Job Engine >Affects Versions: v2.3.0 >Reporter: TianZhiwei >Assignee: TianZhiwei >Priority: Major > Labels: build > Fix For: v2.4.0 > > Attachments: 0001-KYLIN-3316-modify-CubingJob.updateMetrics.patch > > > Does not affect the completion of the build task and any build task can be > reproduced -- This message was sent by Atlassian JIRA (v7.6.3#76005)
[jira] [Commented] (KYLIN-3320) CubeStatsReader cannot print stats properly for some cube
[ https://issues.apache.org/jira/browse/KYLIN-3320?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16415063#comment-16415063 ]

Ma Gang commented on KYLIN-3320:

I have attached the patch. The problem is caused by this code:

{code:java}
// CubeStatsReader.printCuboidInfoTree() method
List<Long> children = scheduler.getSpanningCuboid(cuboidID);
Collections.sort(children);
{code}

For TreeCuboidScheduler the returned cuboid list is immutable, so Collections.sort() throws an exception. The fix is to return a copy of the cuboid list from TreeCuboidScheduler.

> CubeStatsReader cannot print stats properly for some cube
> ---------------------------------------------------------
>
> Key: KYLIN-3320
> URL: https://issues.apache.org/jira/browse/KYLIN-3320
> Project: Kylin
> Issue Type: Improvement
> Components: Tools, Build and Test
> Reporter: Ma Gang
> Assignee: Ma Gang
> Priority: Minor
> Attachments: fix_KYLIN-3320.patch
>
> For the cubes that have cuboid_bytes set in the CubeInstance, the cuboid stats cannot print properly using tool CubeStatsReader
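The failure mode described in the KYLIN-3320 comment can be reproduced in isolation. The following is only a minimal sketch, not the attached patch: `SpanningCuboidCopyDemo` and its two methods are hypothetical stand-ins, and it assumes the scheduler exposes a read-only list (modelled here with `Collections.unmodifiableList`; how TreeCuboidScheduler actually builds its list is not shown in this thread).

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Collections;
import java.util.List;

final class SpanningCuboidCopyDemo {

    // Hypothetical stand-in for a scheduler that returns a read-only child list.
    static List<Long> getSpanningCuboidReadOnly() {
        return Collections.unmodifiableList(Arrays.asList(7L, 3L, 5L));
    }

    // Defensive copy: the caller gets its own list and may sort it freely.
    static List<Long> getSpanningCuboidCopy() {
        return new ArrayList<>(getSpanningCuboidReadOnly());
    }

    public static void main(String[] args) {
        try {
            // In-place sorting needs write access, which the read-only list refuses.
            Collections.sort(getSpanningCuboidReadOnly());
        } catch (UnsupportedOperationException e) {
            System.out.println("sorting the read-only list failed");
        }

        List<Long> children = getSpanningCuboidCopy();
        Collections.sort(children);
        System.out.println(children);
    }
}
```

Returning a copy costs one allocation per call; sorting a copy on the caller side would work as well, but fixing it in the scheduler protects every caller at once.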
[jira] [Updated] (KYLIN-3320) CubeStatsReader cannot print stats properly for some cube
[ https://issues.apache.org/jira/browse/KYLIN-3320?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Ma Gang updated KYLIN-3320: --- Attachment: fix_KYLIN-3320.patch > CubeStatsReader cannot print stats properly for some cube > -- > > Key: KYLIN-3320 > URL: https://issues.apache.org/jira/browse/KYLIN-3320 > Project: Kylin > Issue Type: Improvement > Components: Tools, Build and Test >Reporter: Ma Gang >Assignee: Ma Gang >Priority: Minor > Attachments: fix_KYLIN-3320.patch > > > For the cubes that have cuboid_bytes set in the CubeInstance, the cuboid > stats cannot print properly using tool CubeStatsReader -- This message was sent by Atlassian JIRA (v7.6.3#76005)
[jira] [Created] (KYLIN-3320) CubeStatsReader cannot print stats properly for some cube
Ma Gang created KYLIN-3320: -- Summary: CubeStatsReader cannot print stats properly for some cube Key: KYLIN-3320 URL: https://issues.apache.org/jira/browse/KYLIN-3320 Project: Kylin Issue Type: Improvement Components: Tools, Build and Test Reporter: Ma Gang Assignee: Ma Gang For the cubes that have cuboid_bytes set in the CubeInstance, the cuboid stats cannot print properly using tool CubeStatsReader -- This message was sent by Atlassian JIRA (v7.6.3#76005)
[jira] [Updated] (KYLIN-3319) exceeds threshold 5000000 while executing SQL
[ https://issues.apache.org/jira/browse/KYLIN-3319?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

cc updated KYLIN-3319:

Attachment: image.png

> exceeds threshold 5000000 while executing SQL
> ---------------------------------------------
>
> Key: KYLIN-3319
> URL: https://issues.apache.org/jira/browse/KYLIN-3319
> Project: Kylin
> Issue Type: Bug
> Components: Query Engine
> Affects Versions: v2.3.0
> Reporter: cc
> Priority: Blocker
> Attachments: image.png
>
> {color:#d04437}Hello, what is the reason for this error?{color}
> {color:#d04437}Query returned 5008662 rows exceeds threshold 5000000 while executing SQL: "select * from USER_FREQ_DAY_TEST LIMIT 5"{color}
[jira] [Commented] (KYLIN-3316) Reported NPE after cube build
[ https://issues.apache.org/jira/browse/KYLIN-3316?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16414956#comment-16414956 ] peng.jianhua commented on KYLIN-3316: - Hi [~TinChiWay], this issue has been fixed in https://issues.apache.org/jira/browse/KYLIN-3219. So I will close this jira. > Reported NPE after cube build > - > > Key: KYLIN-3316 > URL: https://issues.apache.org/jira/browse/KYLIN-3316 > Project: Kylin > Issue Type: Bug > Components: Job Engine >Affects Versions: v2.3.0 >Reporter: TianZhiwei >Assignee: TianZhiwei >Priority: Major > Labels: build > Fix For: v2.4.0 > > Attachments: 0001-KYLIN-3316-modify-CubingJob.updateMetrics.patch > > > Does not affect the completion of the build task and any build task can be > reproduced -- This message was sent by Atlassian JIRA (v7.6.3#76005)
[jira] [Commented] (KYLIN-3311) Segments overlap error (refactor write conflict exception)
[ https://issues.apache.org/jira/browse/KYLIN-3311?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16414928#comment-16414928 ]

xujing commented on KYLIN-3311:

Thanks for your answer. I will keep following this issue, because once the "segments overlap" error occurs it leads to the following problems:
1. The new segment's status is "NEW", but the corresponding HTable cannot be found in HBase.
2. The new segment can be neither refreshed nor deleted.

When this error happens, I can only clone a new cube and rebuild all the historical data, or restore the metadata. Neither approach seems good to me. Do you have any better suggestions? Thank you!

> Segments overlap error (refactor write conflict exception)
> ----------------------------------------------------------
>
> Key: KYLIN-3311
> URL: https://issues.apache.org/jira/browse/KYLIN-3311
> Project: Kylin
> Issue Type: Bug
> Components: Metadata
> Affects Versions: v2.3.0
> Reporter: xujing
> Priority: Major
> Labels: build
> Attachments: Segments_Overlap_ErrorLog.txt
>
> When the "updateCubeWithRetry" method is called for the first time, the line newSegs.validate(); passes.
> Then cube = crud.save(cube); appears to fail and throws the exception:
> write conflict to update cube at try 0, will retry...
> When the retry of "updateCubeWithRetry" starts, the line newSegs.validate(); does not pass and throws the exception:
> Segments overlap: [2018031800_2018031900] and sales_order_channel[2018031800_2018031900]
[jira] [Created] (KYLIN-3318) Kylin 2.3 UI top n group by only show dimension columns
Le Anh Vu created KYLIN-3318:

Summary: Kylin 2.3 UI top n group by only show dimension columns
Key: KYLIN-3318
URL: https://issues.apache.org/jira/browse/KYLIN-3318
Project: Kylin
Issue Type: Bug
Reporter: Le Anh Vu

In the Kylin 2.3.0 Web UI, when I use a TopN measure, the group-by column drop-down only shows me dimension columns. Is this the expected behavior or a bug?
[jira] [Commented] (KYLIN-3305) Fix typos 'kylin.job.run.as.remote.cmd' to 'kylin.job.use-remote-cli' for documents
[ https://issues.apache.org/jira/browse/KYLIN-3305?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16414842#comment-16414842 ] ASF GitHub Bot commented on KYLIN-3305: --- yiming187 closed pull request #121: KYLIN-3305 Fix typos `kylin.job.run.as.remote.cmd` URL: https://github.com/apache/kylin/pull/121 This is a PR merged from a forked repository. As GitHub hides the original diff on merge, it is displayed below for the sake of provenance: As this is a foreign pull request (from a fork), the diff is supplied below (as it won't show otherwise due to GitHub magic): diff --git a/website/_dev/dev_env.md b/website/_dev/dev_env.md index c16330a364..388582f058 100644 --- a/website/_dev/dev_env.md +++ b/website/_dev/dev_env.md @@ -87,7 +87,7 @@ Local configuration must be modified to point to your hadoop sandbox (or CLI) ma * In **examples/test_case_data/sandbox/kylin.properties** * Find `sandbox` and replace with your hadoop hosts (if you're using HDP sandbox, this can be skipped) - * Find `kylin.job.run.as.remote.cmd` and change it to "true" (in code repository the default is false, which assume running it on hadoop CLI) + * Find `kylin.job.use-remote-cli` and change it to "true" (in code repository the default is false, which assume running it on hadoop CLI) * Find `kylin.job.remote.cli.username` and `kylin.job.remote.cli.password`, fill in the user name and password used to login hadoop cluster for hadoop command execution; If you're using HDP sandbox, the default username is `root` and password is `hadoop`. * In **examples/test_case_data/sandbox** diff --git a/website/_dev/howto_test.md b/website/_dev/howto_test.md index 88b1649b3d..1d52ef2cf6 100644 --- a/website/_dev/howto_test.md +++ b/website/_dev/howto_test.md @@ -52,7 +52,7 @@ If your sandbox is already provisioned and your code change will not affect the ### Cube Provision -Environment cube provision is indeed running kylin cubing jobs to prepare example cubes in the sandbox. 
These prepared cubes will be used by the ITs. Currently provision step is bound with the maven pre-integration-test phase, and it contains running BuildCubeWithEngine (HBase required), BuildCubeWithStream(Kafka required) and BuildIIWithStream(Kafka Required). You can run the mvn commands on you sandbox or your develop computer. For the latter case you need to set kylin.job.run.as.remote.cmd=true in __$KYLIN_HOME/examples/test_case_data/sandbox/kylin.properties__. +Environment cube provision is indeed running kylin cubing jobs to prepare example cubes in the sandbox. These prepared cubes will be used by the ITs. Currently provision step is bound with the maven pre-integration-test phase, and it contains running BuildCubeWithEngine (HBase required), BuildCubeWithStream(Kafka required) and BuildIIWithStream(Kafka Required). You can run the mvn commands on you sandbox or your develop computer. For the latter case you need to set kylin.job.use-remote-cli=true in __$KYLIN_HOME/examples/test_case_data/sandbox/kylin.properties__. Try appending `-DfastBuildMode=true` to mvn verify command to speed up provision by skipping incremental cubing. ## More on v1.3 Mini Cluster This is an automated message from the Apache Git Service. To respond to the message, please log on GitHub and use the URL above to go to the specific comment. For queries about this service, please contact Infrastructure at: us...@infra.apache.org > Fix typos 'kylin.job.run.as.remote.cmd' to 'kylin.job.use-remote-cli' for > documents > --- > > Key: KYLIN-3305 > URL: https://issues.apache.org/jira/browse/KYLIN-3305 > Project: Kylin > Issue Type: Improvement >Reporter: yongjie zhao >Priority: Minor > > Fix typos 'kylin.job.run.as.remote.cmd' to 'kylin.job.use-remote-cli' -- This message was sent by Atlassian JIRA (v7.6.3#76005)
[jira] [Commented] (KYLIN-3305) Fix typos 'kylin.job.run.as.remote.cmd' to 'kylin.job.use-remote-cli' for documents
[ https://issues.apache.org/jira/browse/KYLIN-3305?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16414843#comment-16414843 ] ASF subversion and git services commented on KYLIN-3305: Commit 337f2d07599b90063ce0ecb5e45b2107dd78e63c in kylin's branch refs/heads/document from [~zog] [ https://gitbox.apache.org/repos/asf?p=kylin.git;h=337f2d0 ] KYLIN-3305 Fix typos `kylin.job.run.as.remote.cmd` to `kylin.job.use-remote-cli` > Fix typos 'kylin.job.run.as.remote.cmd' to 'kylin.job.use-remote-cli' for > documents > --- > > Key: KYLIN-3305 > URL: https://issues.apache.org/jira/browse/KYLIN-3305 > Project: Kylin > Issue Type: Improvement >Reporter: yongjie zhao >Priority: Minor > > Fix typos 'kylin.job.run.as.remote.cmd' to 'kylin.job.use-remote-cli' -- This message was sent by Atlassian JIRA (v7.6.3#76005)
[jira] [Commented] (KYLIN-3305) Fix typos 'kylin.job.run.as.remote.cmd' to 'kylin.job.use-remote-cli' for documents
[ https://issues.apache.org/jira/browse/KYLIN-3305?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16414840#comment-16414840 ] ASF GitHub Bot commented on KYLIN-3305: --- yiming187 commented on issue #121: KYLIN-3305 Fix typos `kylin.job.run.as.remote.cmd` URL: https://github.com/apache/kylin/pull/121#issuecomment-376361033 +1 This is an automated message from the Apache Git Service. To respond to the message, please log on GitHub and use the URL above to go to the specific comment. For queries about this service, please contact Infrastructure at: us...@infra.apache.org > Fix typos 'kylin.job.run.as.remote.cmd' to 'kylin.job.use-remote-cli' for > documents > --- > > Key: KYLIN-3305 > URL: https://issues.apache.org/jira/browse/KYLIN-3305 > Project: Kylin > Issue Type: Improvement >Reporter: yongjie zhao >Priority: Minor > > Fix typos 'kylin.job.run.as.remote.cmd' to 'kylin.job.use-remote-cli' -- This message was sent by Atlassian JIRA (v7.6.3#76005)
[jira] [Created] (KYLIN-3317) Replace UUID.randomUUID with deterministic PRNG
Ted Yu created KYLIN-3317: - Summary: Replace UUID.randomUUID with deterministic PRNG Key: KYLIN-3317 URL: https://issues.apache.org/jira/browse/KYLIN-3317 Project: Kylin Issue Type: Task Reporter: Ted Yu Currently UUID.randomUUID is called in various places in the code base. * It is non-deterministic. * It uses a single secure random for UUID generation. This uses a single JVM wide lock, and this can lead to lock contention and other performance problems. We should move to something that is deterministic by using seeded PRNGs -- This message was sent by Atlassian JIRA (v7.6.3#76005)
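The proposal in KYLIN-3317 can be sketched as follows. This is only an illustration under stated assumptions, not code from the Kylin code base: `SeededUuidGenerator` is a hypothetical class, and it assumes that plain `java.util.Random` seeded by the caller is an acceptable PRNG. IDs produced this way are predictable, so the approach only fits places where unguessability does not matter.

```java
import java.util.Random;
import java.util.UUID;

final class SeededUuidGenerator {

    // One PRNG per generator instance; the same seed replays the same UUID sequence.
    private final Random random;

    SeededUuidGenerator(long seed) {
        this.random = new Random(seed);
    }

    UUID next() {
        long msb = random.nextLong();
        long lsb = random.nextLong();
        // Stamp the version (4) and the IETF variant bits so the value is a well-formed UUID.
        msb = (msb & 0xFFFFFFFFFFFF0FFFL) | 0x0000000000004000L;
        lsb = (lsb & 0x3FFFFFFFFFFFFFFFL) | 0x8000000000000000L;
        return new UUID(msb, lsb);
    }

    public static void main(String[] args) {
        SeededUuidGenerator a = new SeededUuidGenerator(42L);
        SeededUuidGenerator b = new SeededUuidGenerator(42L);
        // Both generators were seeded identically, so this prints "true".
        System.out.println(a.next().equals(b.next()));
    }
}
```

Giving each component or thread its own generator instance avoids sharing one JVM-wide random source, which is the contention concern raised in the issue.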
[jira] [Assigned] (KYLIN-3316) Reported NPE after cube build
[ https://issues.apache.org/jira/browse/KYLIN-3316?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Billy Liu reassigned KYLIN-3316: Assignee: TianZhiwei > Reported NPE after cube build > - > > Key: KYLIN-3316 > URL: https://issues.apache.org/jira/browse/KYLIN-3316 > Project: Kylin > Issue Type: Bug > Components: Job Engine >Affects Versions: v2.3.0 >Reporter: TianZhiwei >Assignee: TianZhiwei >Priority: Major > Labels: build > Fix For: v2.4.0 > > Attachments: 0001-KYLIN-3316-modify-CubingJob.updateMetrics.patch > > > Does not affect the completion of the build task and any build task can be > reproduced -- This message was sent by Atlassian JIRA (v7.6.3#76005)
[jira] [Assigned] (KYLIN-3277) Kylin should override hiveconf settings when connecting to hive using jdbc
[ https://issues.apache.org/jira/browse/KYLIN-3277?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Billy Liu reassigned KYLIN-3277:

Assignee: Chuqian Yu

> Kylin should override hiveconf settings when connecting to hive using jdbc
> --------------------------------------------------------------------------
>
> Key: KYLIN-3277
> URL: https://issues.apache.org/jira/browse/KYLIN-3277
> Project: Kylin
> Issue Type: Bug
> Reporter: Chuqian Yu
> Assignee: Chuqian Yu
> Priority: Major
> Labels: patch
> Fix For: v2.4.0
> Attachments: 0001-KYLIN-3277.patch
>
> Hi, Kylin developers. My cube build fails at Step 2 "Redistribute Flat Hive Table" because Kylin always tries to submit an MR job to the default YARN queue.
>
> I have overridden the mapred.job.queue.name property in both kylin_hive_conf.xml and kylin.properties, but it doesn't work.
>
> kylin.properties
> ```
> kylin.source.hive.beeline-params=-n hive -p hive --hiveconf mapred.job.queue.name=myQueue -u "jdbc:hive2://myZk:2181/;serviceDiscoveryMode=zooKeeper;"
> ```
>
> kylin_hive_conf.xml
> ```
> <property>
>   <name>mapred.job.queue.name</name>
>   <value>myQueue</value>
> </property>
> ```
>
> After digging into the source code, I found that Kylin tries to get the row count of the Hive table before redistributing it, but it does not override the Hive configuration when it uses JDBC to connect to the Hive server.
[jira] [Updated] (KYLIN-3277) Kylin should override hiveconf settings when connecting to hive using jdbc
[ https://issues.apache.org/jira/browse/KYLIN-3277?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Billy Liu updated KYLIN-3277:

Fix Version/s: v2.4.0

> Kylin should override hiveconf settings when connecting to hive using jdbc
> --------------------------------------------------------------------------
>
> Key: KYLIN-3277
> URL: https://issues.apache.org/jira/browse/KYLIN-3277
> Project: Kylin
> Issue Type: Bug
> Reporter: Chuqian Yu
> Assignee: Chuqian Yu
> Priority: Major
> Labels: patch
> Fix For: v2.4.0
> Attachments: 0001-KYLIN-3277.patch
>
> Hi, Kylin developers. My cube build fails at Step 2 "Redistribute Flat Hive Table" because Kylin always tries to submit an MR job to the default YARN queue.
>
> I have overridden the mapred.job.queue.name property in both kylin_hive_conf.xml and kylin.properties, but it doesn't work.
>
> kylin.properties
> ```
> kylin.source.hive.beeline-params=-n hive -p hive --hiveconf mapred.job.queue.name=myQueue -u "jdbc:hive2://myZk:2181/;serviceDiscoveryMode=zooKeeper;"
> ```
>
> kylin_hive_conf.xml
> ```
> <property>
>   <name>mapred.job.queue.name</name>
>   <value>myQueue</value>
> </property>
> ```
>
> After digging into the source code, I found that Kylin tries to get the row count of the Hive table before redistributing it, but it does not override the Hive configuration when it uses JDBC to connect to the Hive server.
[jira] [Commented] (KYLIN-3314) refactor code for cube planner algorithm
[ https://issues.apache.org/jira/browse/KYLIN-3314?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16413998#comment-16413998 ] ASF GitHub Bot commented on KYLIN-3314: --- lidongsjtu commented on issue #124: KYLIN-3314 refactor code for cube planner algorithm URL: https://github.com/apache/kylin/pull/124#issuecomment-376206373 LGTM This is an automated message from the Apache Git Service. To respond to the message, please log on GitHub and use the URL above to go to the specific comment. For queries about this service, please contact Infrastructure at: us...@infra.apache.org > refactor code for cube planner algorithm > > > Key: KYLIN-3314 > URL: https://issues.apache.org/jira/browse/KYLIN-3314 > Project: Kylin > Issue Type: Improvement > Components: Metadata >Reporter: Zhong Yanghong >Assignee: Wang Ken >Priority: Major > Attachments: APACHE-KYLIN-3314.patch > > -- This message was sent by Atlassian JIRA (v7.6.3#76005)
[jira] [Assigned] (KYLIN-3154) Create a document for cube planner
[ https://issues.apache.org/jira/browse/KYLIN-3154?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Zhong Yanghong reassigned KYLIN-3154: - Assignee: qianqiaoneng (was: Zhong Yanghong) > Create a document for cube planner > -- > > Key: KYLIN-3154 > URL: https://issues.apache.org/jira/browse/KYLIN-3154 > Project: Kylin > Issue Type: Sub-task >Reporter: Zhong Yanghong >Assignee: qianqiaoneng >Priority: Major > Fix For: v2.3.0 > > Attachments: APACHE-KYLIN-3154.patch > > -- This message was sent by Atlassian JIRA (v7.6.3#76005)
[jira] [Assigned] (KYLIN-3153) Create a document for system cube creation
[ https://issues.apache.org/jira/browse/KYLIN-3153?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Zhong Yanghong reassigned KYLIN-3153: - Assignee: qianqiaoneng (was: Zhong Yanghong) > Create a document for system cube creation > -- > > Key: KYLIN-3153 > URL: https://issues.apache.org/jira/browse/KYLIN-3153 > Project: Kylin > Issue Type: Sub-task >Reporter: Zhong Yanghong >Assignee: qianqiaoneng >Priority: Major > Fix For: v2.3.0 > > Attachments: APACHE-KYLIN-3153.patch > > -- This message was sent by Atlassian JIRA (v7.6.3#76005)
[jira] [Commented] (KYLIN-3314) refactor code for cube planner algorithm
[ https://issues.apache.org/jira/browse/KYLIN-3314?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16413930#comment-16413930 ] ASF GitHub Bot commented on KYLIN-3314: --- kyotoYaho closed pull request #124: KYLIN-3314 refactor code for cube planner algorithm URL: https://github.com/apache/kylin/pull/124 This is a PR merged from a forked repository. As GitHub hides the original diff on merge, it is displayed below for the sake of provenance: As this is a foreign pull request (from a fork), the diff is supplied below (as it won't show otherwise due to GitHub magic): diff --git a/core-cube/src/main/java/org/apache/kylin/cube/cuboid/algorithm/AbstractRecommendAlgorithm.java b/core-cube/src/main/java/org/apache/kylin/cube/cuboid/algorithm/AbstractRecommendAlgorithm.java index b35c738645..094b960005 100755 --- a/core-cube/src/main/java/org/apache/kylin/cube/cuboid/algorithm/AbstractRecommendAlgorithm.java +++ b/core-cube/src/main/java/org/apache/kylin/cube/cuboid/algorithm/AbstractRecommendAlgorithm.java @@ -18,16 +18,17 @@ package org.apache.kylin.cube.cuboid.algorithm; -import java.util.concurrent.atomic.AtomicBoolean; - import org.slf4j.Logger; import org.slf4j.LoggerFactory; +import java.util.List; +import java.util.concurrent.atomic.AtomicBoolean; + public abstract class AbstractRecommendAlgorithm implements CuboidRecommendAlgorithm { private static final Logger logger = LoggerFactory.getLogger(AbstractRecommendAlgorithm.class); -private CuboidStats cuboidStats; -private BenefitPolicy benefitPolicy; +protected final CuboidStats cuboidStats; +protected final BenefitPolicy benefitPolicy; private AtomicBoolean cancelRequested = new AtomicBoolean(false); private AtomicBoolean canceled = new AtomicBoolean(false); @@ -44,6 +45,12 @@ public AbstractRecommendAlgorithm(final long timeout, BenefitPolicy benefitPolic this.benefitPolicy = benefitPolicy; } +@Override +public List recommend(double expansionRate) { +double spaceLimit = 
cuboidStats.getBaseCuboidSize() * expansionRate; +return start(spaceLimit); +} + @Override public void cancel() { cancelRequested.set(true); @@ -51,7 +58,6 @@ public void cancel() { /** * Checks whether the algorithm has been canceled or timed out. - * */ protected boolean shouldCancel() { if (canceled.get()) { @@ -71,12 +77,4 @@ protected boolean shouldCancel() { } return false; } - -public CuboidStats getCuboidStats() { -return cuboidStats; -} - -public BenefitPolicy getBenefitPolicy() { -return benefitPolicy; -} } diff --git a/core-cube/src/main/java/org/apache/kylin/cube/cuboid/algorithm/BPUSCalculator.java b/core-cube/src/main/java/org/apache/kylin/cube/cuboid/algorithm/BPUSCalculator.java index 6d0b654f75..e29332585f 100755 --- a/core-cube/src/main/java/org/apache/kylin/cube/cuboid/algorithm/BPUSCalculator.java +++ b/core-cube/src/main/java/org/apache/kylin/cube/cuboid/algorithm/BPUSCalculator.java @@ -18,15 +18,14 @@ package org.apache.kylin.cube.cuboid.algorithm; -import java.util.List; -import java.util.Map; -import java.util.Set; - +import com.google.common.collect.ImmutableMap; +import com.google.common.collect.Maps; +import com.google.common.collect.Sets; import org.slf4j.Logger; import org.slf4j.LoggerFactory; -import com.google.common.collect.Maps; -import com.google.common.collect.Sets; +import java.util.Map; +import java.util.Set; /** * Calculate the benefit based on Benefit Per Unit Space. 
@@ -35,17 +34,24 @@ private static Logger logger = LoggerFactory.getLogger(BPUSCalculator.class); -protected CuboidStats cuboidStats; -protected MapcuboidAggCostMap; +protected final CuboidStats cuboidStats; +protected final ImmutableMap initCuboidAggCostMap; +protected final Map processCuboidAggCostMap; public BPUSCalculator(CuboidStats cuboidStats) { this.cuboidStats = cuboidStats; -this.cuboidAggCostMap = Maps.newHashMap(); +this.initCuboidAggCostMap = ImmutableMap.copyOf(initCuboidAggCostMap()); +this.processCuboidAggCostMap = Maps.newHashMap(initCuboidAggCostMap); } -@Override -public void initBeforeStart() { -cuboidAggCostMap.clear(); +protected BPUSCalculator(CuboidStats cuboidStats, ImmutableMap initCuboidAggCostMap) { +this.cuboidStats = cuboidStats; +this.initCuboidAggCostMap = initCuboidAggCostMap; +this.processCuboidAggCostMap = Maps.newHashMap(initCuboidAggCostMap); +} + +private Map initCuboidAggCostMap() { +Map cuboidAggCostMap = Maps.newHashMap(); //Initialize stats for mandatory cuboids for (Long cuboid :
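The BPUSCalculator change in the diff above replaces a single mutable cost map with a read-only initial map plus a mutable working copy. The sketch below illustrates that pattern only; it is not the Kylin class. `CostMapSnapshotSketch` and its method names are hypothetical, and it uses `Collections.unmodifiableMap` from the JDK in place of Guava's `ImmutableMap` so that it runs without extra dependencies.

```java
import java.util.Collections;
import java.util.HashMap;
import java.util.Map;

final class CostMapSnapshotSketch {

    // Baseline costs computed once; never modified afterwards.
    private final Map<Long, Long> initCostMap;
    // Working copy that the algorithm updates while it runs.
    private final Map<Long, Long> processCostMap;

    CostMapSnapshotSketch(Map<Long, Long> initialCosts) {
        this.initCostMap = Collections.unmodifiableMap(new HashMap<>(initialCosts));
        this.processCostMap = new HashMap<>(this.initCostMap);
    }

    void recordCost(long cuboid, long cost) {
        processCostMap.put(cuboid, cost);
    }

    Long initialCost(long cuboid) {
        return initCostMap.get(cuboid);
    }

    Long currentCost(long cuboid) {
        return processCostMap.get(cuboid);
    }

    // A fresh instance starts again from the baseline without recomputing it.
    CostMapSnapshotSketch freshCopy() {
        return new CostMapSnapshotSketch(initCostMap);
    }

    public static void main(String[] args) {
        Map<Long, Long> base = new HashMap<>();
        base.put(255L, 1000L);
        CostMapSnapshotSketch calc = new CostMapSnapshotSketch(base);
        calc.recordCost(255L, 400L);
        System.out.println(calc.initialCost(255L) + " -> " + calc.currentCost(255L));
    }
}
```

Keeping the baseline separate removes the need for an explicit reset step before each run, which appears to be what the removed `initBeforeStart()` method did.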
[jira] [Resolved] (KYLIN-3314) refactor code for cube planner algorithm
[ https://issues.apache.org/jira/browse/KYLIN-3314?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Zhong Yanghong resolved KYLIN-3314. --- Resolution: Resolved > refactor code for cube planner algorithm > > > Key: KYLIN-3314 > URL: https://issues.apache.org/jira/browse/KYLIN-3314 > Project: Kylin > Issue Type: Improvement > Components: Metadata >Reporter: Zhong Yanghong >Assignee: Wang Ken >Priority: Major > Attachments: APACHE-KYLIN-3314.patch > > -- This message was sent by Atlassian JIRA (v7.6.3#76005)
[jira] [Commented] (KYLIN-3314) refactor code for cube planner algorithm
[ https://issues.apache.org/jira/browse/KYLIN-3314?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16413931#comment-16413931 ]

ASF subversion and git services commented on KYLIN-3314:

Commit 03316e2ebf5efed203ef16e2cd27b7541d3ff09d in kylin's branch refs/heads/master from Wang Ken
[ https://gitbox.apache.org/repos/asf?p=kylin.git;h=03316e2 ]

KYLIN-3314 refactor code for cube planner algorithm

Signed-off-by: Zhong

> refactor code for cube planner algorithm
> ----------------------------------------
>
> Key: KYLIN-3314
> URL: https://issues.apache.org/jira/browse/KYLIN-3314
> Project: Kylin
> Issue Type: Improvement
> Components: Metadata
> Reporter: Zhong Yanghong
> Assignee: Wang Ken
> Priority: Major
> Attachments: APACHE-KYLIN-3314.patch
[jira] [Updated] (KYLIN-3316) Reported NPE after build in KAP
[ https://issues.apache.org/jira/browse/KYLIN-3316?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] TianZhiwei updated KYLIN-3316: -- Attachment: 0001-KYLIN-3316-modify-CubingJob.updateMetrics.patch > Reported NPE after build in KAP > --- > > Key: KYLIN-3316 > URL: https://issues.apache.org/jira/browse/KYLIN-3316 > Project: Kylin > Issue Type: Bug > Components: Job Engine >Affects Versions: v2.3.0 >Reporter: TianZhiwei >Priority: Major > Labels: build > Fix For: v2.4.0 > > Attachments: 0001-KYLIN-3316-modify-CubingJob.updateMetrics.patch > > > Does not affect the completion of the build task and any build task can be > reproduced -- This message was sent by Atlassian JIRA (v7.6.3#76005)
[jira] [Created] (KYLIN-3316) Reported NPE after build in KAP
TianZhiwei created KYLIN-3316: - Summary: Reported NPE after build in KAP Key: KYLIN-3316 URL: https://issues.apache.org/jira/browse/KYLIN-3316 Project: Kylin Issue Type: Bug Components: Job Engine Affects Versions: v2.3.0 Reporter: TianZhiwei Fix For: v2.4.0 Does not affect the completion of the build task and any build task can be reproduced -- This message was sent by Atlassian JIRA (v7.6.3#76005)
[jira] [Commented] (KYLIN-3277) Kylin should override hiveconf settings when connecting to hive using jdbc
[ https://issues.apache.org/jira/browse/KYLIN-3277?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16413585#comment-16413585 ]

nichunen commented on KYLIN-3277:

+1

> Kylin should override hiveconf settings when connecting to hive using jdbc
> --------------------------------------------------------------------------
>
> Key: KYLIN-3277
> URL: https://issues.apache.org/jira/browse/KYLIN-3277
> Project: Kylin
> Issue Type: Bug
> Reporter: Chuqian Yu
> Priority: Major
> Labels: patch
> Attachments: 0001-KYLIN-3277.patch
>
> Hi, Kylin developers. My cube build fails at Step 2 "Redistribute Flat Hive Table" because Kylin always tries to submit an MR job to the default YARN queue.
>
> I have overridden the mapred.job.queue.name property in both kylin_hive_conf.xml and kylin.properties, but it doesn't work.
>
> kylin.properties
> ```
> kylin.source.hive.beeline-params=-n hive -p hive --hiveconf mapred.job.queue.name=myQueue -u "jdbc:hive2://myZk:2181/;serviceDiscoveryMode=zooKeeper;"
> ```
>
> kylin_hive_conf.xml
> ```
> <property>
>   <name>mapred.job.queue.name</name>
>   <value>myQueue</value>
> </property>
> ```
>
> After digging into the source code, I found that Kylin tries to get the row count of the Hive table before redistributing it, but it does not override the Hive configuration when it uses JDBC to connect to the Hive server.
[jira] [Created] (KYLIN-3315) allow each project to set its own source in project level override configuration
Dong Li created KYLIN-3315:

Summary: allow each project to set its own source in project level override configuration
Key: KYLIN-3315
URL: https://issues.apache.org/jira/browse/KYLIN-3315
Project: Kylin
Issue Type: Improvement
Components: Metadata
Reporter: Dong Li

Currently, all projects connect to the same source, which is set in kylin.properties with the kylin.source.default property. It would be better to allow each project to set its own source in its project-level override configuration. As a result, project A could connect to JDBC while project B connects to Hive.
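The lookup order proposed in KYLIN-3315 can be sketched as a project-level map consulted before the global settings. Everything here is an assumption for illustration: `ProjectOverrideSketch` is a hypothetical class, not Kylin's configuration API, and the values "hive" and "jdbc" are placeholders rather than the actual values that kylin.source.default accepts.

```java
import java.util.HashMap;
import java.util.Map;

final class ProjectOverrideSketch {

    // Settings that apply to every project (the role kylin.properties plays).
    private final Map<String, String> globalConfig = new HashMap<>();
    // Per-project overrides, keyed by project name.
    private final Map<String, Map<String, String>> projectOverrides = new HashMap<>();

    void setGlobal(String key, String value) {
        globalConfig.put(key, value);
    }

    void setProjectOverride(String project, String key, String value) {
        projectOverrides.computeIfAbsent(project, p -> new HashMap<>()).put(key, value);
    }

    // A project-level value wins; otherwise fall back to the global setting.
    String resolve(String project, String key) {
        Map<String, String> overrides = projectOverrides.get(project);
        if (overrides != null && overrides.containsKey(key)) {
            return overrides.get(key);
        }
        return globalConfig.get(key);
    }

    public static void main(String[] args) {
        ProjectOverrideSketch config = new ProjectOverrideSketch();
        config.setGlobal("kylin.source.default", "hive");                  // placeholder value
        config.setProjectOverride("A", "kylin.source.default", "jdbc");    // placeholder value
        System.out.println("A: " + config.resolve("A", "kylin.source.default"));
        System.out.println("B: " + config.resolve("B", "kylin.source.default"));
    }
}
```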
[jira] [Commented] (KYLIN-3314) refactor code for cube planner algorithm
[ https://issues.apache.org/jira/browse/KYLIN-3314?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16413551#comment-16413551 ]

ASF GitHub Bot commented on KYLIN-3314:

coveralls commented on issue #124: KYLIN-3314 refactor code for cube planner algorithm
URL: https://github.com/apache/kylin/pull/124#issuecomment-376083546

## Pull Request Test Coverage Report for [Build 3087](https://coveralls.io/builds/16173500)

* **0** of **144** **(0.0%)** changed or added relevant lines in **13** files are covered.
* **18** unchanged lines in **6** files lost coverage.
* Overall coverage increased (+**0.09%**) to **23.965%**

---

| Changes Missing Coverage | Covered Lines | Changed/Added Lines | % |
| :--- | ---: | ---: | ---: |
| [core-cube/src/main/java/org/apache/kylin/cube/cuboid/algorithm/generic/RouletteWheelSelection.java](https://coveralls.io/builds/16173500/source?filename=core-cube%2Fsrc%2Fmain%2Fjava%2Forg%2Fapache%2Fkylin%2Fcube%2Fcuboid%2Falgorithm%2Fgeneric%2FRouletteWheelSelection.java#L51) | 0 | 1 | 0.0% |
| [core-cube/src/main/java/org/apache/kylin/cube/cuboid/algorithm/AbstractRecommendAlgorithm.java](https://coveralls.io/builds/16173500/source?filename=core-cube%2Fsrc%2Fmain%2Fjava%2Forg%2Fapache%2Fkylin%2Fcube%2Fcuboid%2Falgorithm%2FAbstractRecommendAlgorithm.java#L50) | 0 | 2 | 0.0% |
| [core-cube/src/main/java/org/apache/kylin/cube/cuboid/algorithm/CuboidBenefitModel.java](https://coveralls.io/builds/16173500/source?filename=core-cube%2Fsrc%2Fmain%2Fjava%2Forg%2Fapache%2Fkylin%2Fcube%2Fcuboid%2Falgorithm%2FCuboidBenefitModel.java#L36) | 0 | 2 | 0.0% |
| [core-cube/src/main/java/org/apache/kylin/cube/cuboid/algorithm/SPBPUSCalculator.java](https://coveralls.io/builds/16173500/source?filename=core-cube%2Fsrc%2Fmain%2Fjava%2Forg%2Fapache%2Fkylin%2Fcube%2Fcuboid%2Falgorithm%2FSPBPUSCalculator.java#L33) | 0 | 3 | 0.0% |
| [core-cube/src/main/java/org/apache/kylin/cube/cuboid/algorithm/generic/BitsMutation.java](https://coveralls.io/builds/16173500/source?filename=core-cube%2Fsrc%2Fmain%2Fjava%2Forg%2Fapache%2Fkylin%2Fcube%2Fcuboid%2Falgorithm%2Fgeneric%2FBitsMutation.java#L45) | 0 | 4 | 0.0% |
| [core-cube/src/main/java/org/apache/kylin/cube/cuboid/algorithm/PBPUSCalculator.java](https://coveralls.io/builds/16173500/source?filename=core-cube%2Fsrc%2Fmain%2Fjava%2Forg%2Fapache%2Fkylin%2Fcube%2Fcuboid%2Falgorithm%2FPBPUSCalculator.java#L34) | 0 | 6 | 0.0% |
| [core-cube/src/main/java/org/apache/kylin/cube/cuboid/algorithm/generic/BitsOnePointCrossover.java](https://coveralls.io/builds/16173500/source?filename=core-cube%2Fsrc%2Fmain%2Fjava%2Forg%2Fapache%2Fkylin%2Fcube%2Fcuboid%2Falgorithm%2Fgeneric%2FBitsOnePointCrossover.java#L53) | 0 | 6 | 0.0% |
| [core-cube/src/main/java/org/apache/kylin/cube/cuboid/algorithm/CuboidStats.java](https://coveralls.io/builds/16173500/source?filename=core-cube%2Fsrc%2Fmain%2Fjava%2Forg%2Fapache%2Fkylin%2Fcube%2Fcuboid%2Falgorithm%2FCuboidStats.java#L127) | 0 | 8 | 0.0% |
| [core-cube/src/main/java/org/apache/kylin/cube/cuboid/algorithm/greedy/GreedyAlgorithm.java](https://coveralls.io/builds/16173500/source?filename=core-cube%2Fsrc%2Fmain%2Fjava%2Forg%2Fapache%2Fkylin%2Fcube%2Fcuboid%2Falgorithm%2Fgreedy%2FGreedyAlgorithm.java#L65) | 0 | 13 | 0.0% |
| [core-cube/src/main/java/org/apache/kylin/cube/cuboid/algorithm/generic/GeneticAlgorithm.java](https://coveralls.io/builds/16173500/source?filename=core-cube%2Fsrc%2Fmain%2Fjava%2Forg%2Fapache%2Fkylin%2Fcube%2Fcuboid%2Falgorithm%2Fgeneric%2FGeneticAlgorithm.java#L65) | 0 | 17 | 0.0% |

| Files with Coverage Reduction | New Missed Lines | % |
| :--- | ---: | ---: |
| [core-cube/src/main/java/org/apache/kylin/cube/cuboid/TreeCuboidScheduler.java](https://coveralls.io/builds/16173500/source?filename=core-cube%2Fsrc%2Fmain%2Fjava%2Forg%2Fapache%2Fkylin%2Fcube%2Fcuboid%2FTreeCuboidScheduler.java#L129) | 1 | 68.5% |
| [core-cube/src/main/java/org/apache/kylin/cube/cuboid/algorithm/greedy/GreedyAlgorithm.java](https://coveralls.io/builds/16173500/source?filename=core-cube%2Fsrc%2Fmain%2Fjava%2Forg%2Fapache%2Fkylin%2Fcube%2Fcuboid%2Falgorithm%2Fgreedy%2FGreedyAlgorithm.java#L63) | 1 | 0.0% |
| [core-cube/src/main/java/org/apache/kylin/cube/cuboid/algorithm/generic/BitsChromosome.java](https://coveralls.io/builds/16173500/source?filename=core-cube%2Fsrc%2Fmain%2Fjava%2Forg%2Fapache%2Fkylin%2Fcube%2Fcuboid%2Falgorithm%2Fgeneric%2FBitsChromosome.java#L55) | 2 | 0.0% |
| [core-cube/src/main/java/org/apache/kylin/cube/cuboid/algorithm/BPUSCalculator.java](https://coveralls.io/builds/16173500/source?filename=core-cube%2Fsrc%2Fmain%2Fjava%2Forg%2Fapache%2Fkylin%2Fcube%2Fcuboid%2Falgorithm%2FBPUSCalculator.java#L45) | 3 | 0.0% |
[jira] [Commented] (KYLIN-3314) refactor code for cube planner algorithm
[ https://issues.apache.org/jira/browse/KYLIN-3314?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16413534#comment-16413534 ] ASF GitHub Bot commented on KYLIN-3314: --- kyotoYaho opened a new pull request #124: KYLIN-3314 refactor code for cube planner algorithm URL: https://github.com/apache/kylin/pull/124 Signed-off-by: Zhong -- This is an automated message from the Apache Git Service. To respond to the message, please log on GitHub and use the URL above to go to the specific comment. For queries about this service, please contact Infrastructure at: us...@infra.apache.org > refactor code for cube planner algorithm > > > Key: KYLIN-3314 > URL: https://issues.apache.org/jira/browse/KYLIN-3314 > Project: Kylin > Issue Type: Improvement > Components: Metadata >Reporter: Zhong Yanghong >Assignee: Wang Ken >Priority: Major > Attachments: APACHE-KYLIN-3314.patch > > -- This message was sent by Atlassian JIRA (v7.6.3#76005)
[jira] [Commented] (KYLIN-3314) refactor code for cube planner algorithm
[ https://issues.apache.org/jira/browse/KYLIN-3314?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16413487#comment-16413487 ] Zhong Yanghong commented on KYLIN-3314: --- The previous genetic algorithm implementation duplicates a lot of code from https://github.com/apache/commons-math. This refactor removes the duplication and focuses on the implementation of the bits-related components, such as {{BitsChromosome}}, {{BitsOnePointCrossover}}, and {{BitsMutation}}. > refactor code for cube planner algorithm > > > Key: KYLIN-3314 > URL: https://issues.apache.org/jira/browse/KYLIN-3314 > Project: Kylin > Issue Type: Improvement > Components: Metadata >Reporter: Zhong Yanghong >Assignee: Wang Ken >Priority: Major > Attachments: APACHE-KYLIN-3314.patch > > -- This message was sent by Atlassian JIRA (v7.6.3#76005)
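For readers unfamiliar with the bits-related operators named in the comment on KYLIN-3314, the following is a minimal sketch of one-point crossover and single-bit mutation on a bit-string chromosome. It uses only java.util.BitSet; the class name BitsSketch and both method signatures are hypothetical, and it is not the Kylin or commons-math code. It assumes that each bit marks whether one candidate cuboid is selected.

```java
import java.util.BitSet;
import java.util.Random;

/** Hypothetical illustration only; not the Kylin or commons-math classes. */
public class BitsSketch {

    /**
     * One-point crossover: the child takes bits [0, point) from the first
     * parent and bits [point, length) from the second parent.
     */
    static BitSet onePointCrossover(BitSet first, BitSet second, int length, int point) {
        BitSet child = new BitSet(length);
        for (int i = 0; i < length; i++) {
            BitSet source = i < point ? first : second;
            child.set(i, source.get(i));
        }
        return child;
    }

    /** Mutation: returns a copy of the chromosome with one random bit flipped. */
    static BitSet mutate(BitSet original, int length, Random random) {
        BitSet mutated = (BitSet) original.clone();
        mutated.flip(random.nextInt(length));
        return mutated;
    }

    public static void main(String[] args) {
        BitSet a = new BitSet(4);
        a.set(0);
        a.set(1);
        BitSet b = new BitSet(4);
        b.set(2);
        b.set(3);
        System.out.println(onePointCrossover(a, b, 4, 1));
        System.out.println(mutate(a, 4, new Random()));
    }
}
```

Keeping the chromosome as a plain bit set is what lets the selection and population-handling logic come from a general-purpose library while only the bit-level operators stay project-specific.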
[jira] [Updated] (KYLIN-3314) refactor code for cube planner algorithm
[ https://issues.apache.org/jira/browse/KYLIN-3314?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Zhong Yanghong updated KYLIN-3314: -- Attachment: APACHE-KYLIN-3314.patch > refactor code for cube planner algorithm > > > Key: KYLIN-3314 > URL: https://issues.apache.org/jira/browse/KYLIN-3314 > Project: Kylin > Issue Type: Improvement > Components: Metadata >Reporter: Zhong Yanghong >Assignee: Wang Ken >Priority: Major > Attachments: APACHE-KYLIN-3314.patch > > -- This message was sent by Atlassian JIRA (v7.6.3#76005)
[jira] [Created] (KYLIN-3314) refactor code for cube planner algorithm
Zhong Yanghong created KYLIN-3314: - Summary: refactor code for cube planner algorithm Key: KYLIN-3314 URL: https://issues.apache.org/jira/browse/KYLIN-3314 Project: Kylin Issue Type: Improvement Components: Metadata Reporter: Zhong Yanghong Assignee: Wang Ken -- This message was sent by Atlassian JIRA (v7.6.3#76005)
[jira] [Assigned] (KYLIN-2866) Enlarge the reducer number for hyperloglog statistics calculation at step FactDistinctColumnsJob
[ https://issues.apache.org/jira/browse/KYLIN-2866?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Zhong Yanghong reassigned KYLIN-2866: - Assignee: Wang Ken (was: Zhong Yanghong) > Enlarge the reducer number for hyperloglog statistics calculation at step > FactDistinctColumnsJob > > > Key: KYLIN-2866 > URL: https://issues.apache.org/jira/browse/KYLIN-2866 > Project: Kylin > Issue Type: Improvement > Components: Job Engine >Reporter: Zhong Yanghong >Assignee: Wang Ken >Priority: Major > Fix For: v2.3.0 > > Attachments: APACHE-KYLIN-2866-refined.patch, APACHE-KYLIN-2866.patch > > > Currently only one reducer is assigned to the HLL stats calculation, which may > become a bottleneck and slow down this step. Since the stats for different > cuboids do not influence each other, it is better to divide the cuboid set > into several subsets and assign a reducer to each subset. > The strategy of this patch is to assign 100 cuboids to each subset, with an > upper limit on the number of reducers for the HLL stats calculation. Currently > the limit is 50. -- This message was sent by Atlassian JIRA (v7.6.3#76005)
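The reducer-count rule described in KYLIN-2866 (100 cuboids per subset, capped at 50 reducers) can be sketched as below. The class name, constant names, and the minimum of one reducer for an empty cuboid set are assumptions made for illustration; the attached patch may compute the number differently.

```java
/** Hypothetical illustration of the rule in KYLIN-2866; not the patch itself. */
public class HllReducerSketch {

    // Values taken from the issue description.
    static final int CUBOIDS_PER_REDUCER = 100;
    static final int MAX_HLL_REDUCERS = 50;

    /** Number of reducers for the HLL stats calculation of cuboidCount cuboids. */
    static int hllReducerCount(int cuboidCount) {
        if (cuboidCount <= 0) {
            return 1; // assumption: always keep at least one reducer
        }
        // Ceiling division: one reducer per started group of 100 cuboids.
        int needed = (cuboidCount + CUBOIDS_PER_REDUCER - 1) / CUBOIDS_PER_REDUCER;
        return Math.min(needed, MAX_HLL_REDUCERS);
    }

    public static void main(String[] args) {
        for (int n : new int[] {1, 100, 101, 250, 10000}) {
            System.out.println(n + " cuboids -> " + hllReducerCount(n) + " reducer(s)");
        }
    }
}
```

With these values the cap is reached once a cube has more than 4,900 cuboids; beyond that point each reducer handles more than 100 cuboids.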
[jira] [Commented] (KYLIN-2987) Add 'auto.purge=true' when creating intermediate hive table or redistribute a hive table
[ https://issues.apache.org/jira/browse/KYLIN-2987?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16413405#comment-16413405 ] Zhong Yanghong commented on KYLIN-2987: --- Hi [~liukaige], I think this issue relates to Hive: why does auto.purge=true not work for EXTERNAL tables? Hive may not support this currently, but in the long term I don't see why it should not be supported. > Add 'auto.purge=true' when creating intermediate hive table or redistribute a > hive table > > > Key: KYLIN-2987 > URL: https://issues.apache.org/jira/browse/KYLIN-2987 > Project: Kylin > Issue Type: Improvement >Reporter: Zhong Yanghong >Assignee: Zhong Yanghong >Priority: Trivial > Attachments: APACHE-KYLIN-2987.patch > > > On the Kylin side, we can add auto.purge=true when creating the intermediate table. > However, to make ‘auto.purge’ effective for “insert overwrite table”, we > still need a patch for Hive: > https://issues.apache.org/jira/browse/HIVE-15880 -- This message was sent by Atlassian JIRA (v7.6.3#76005)
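As a rough illustration of what KYLIN-2987 proposes, the sketch below builds a Hive CREATE TABLE statement that sets the 'auto.purge' table property. The class name, method name, and the table and column names are hypothetical, and the statement creates a managed (non-EXTERNAL) table purely for illustration; it is not the DDL that Kylin generates.

```java
/** Hypothetical illustration only; not the DDL builder used by Kylin. */
public class AutoPurgeDdlSketch {

    /**
     * Builds a CREATE TABLE statement for a managed Hive table whose data
     * should skip the trash when dropped, via the 'auto.purge' property.
     */
    static String createIntermediateTableDdl(String tableName, String columnsDdl) {
        return "CREATE TABLE IF NOT EXISTS " + tableName + " (" + columnsDdl + ")\n"
                + "STORED AS SEQUENCEFILE\n"
                + "TBLPROPERTIES ('auto.purge'='true')";
    }

    public static void main(String[] args) {
        // Table and column names below are made up for the example.
        System.out.println(createIntermediateTableDdl(
                "kylin_intermediate_example", "seller_id BIGINT, price DECIMAL(19,4)"));
    }
}
```

As the comment above notes, whether the property takes effect for EXTERNAL tables, and for "insert overwrite table", depends on the Hive version and on HIVE-15880.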