[
https://issues.apache.org/jira/browse/KYLIN-4660?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17166985#comment-17166985
]
ASF GitHub Bot commented on KYLIN-4660:
---------------------------------------
hit-lacus commented on a change in pull request #1332:
URL: https://github.com/apache/kylin/pull/1332#discussion_r462028446
##########
File path:
kylin-spark-project/kylin-spark-engine/src/main/scala/org/apache/kylin/engine/spark/builder/CubeDictionaryBuilder.scala
##########
@@ -70,7 +70,7 @@ class DFDictionaryBuilder(val dataset: Dataset[Row],
val columnName = ref.identity
logInfo(s"Start building global dictionaries V2 for column $columnName.")
Review comment:
Does "V2" need to be removed?
##########
File path:
kylin-spark-project/kylin-spark-engine/src/main/scala/org/apache/kylin/engine/spark/job/ParentSourceChooser.scala
##########
@@ -163,7 +162,6 @@ class ParentSourceChooser(
val flatTable = new CreateFlatTable(seg, toBuildTree, ss, sourceInfo)
val afterJoin: Dataset[Row] = flatTable.generateDataset(needEncoding, true)
sourceInfo.setFlattableDS(afterJoin)
- sourceInfo.setCount(afterJoin.count())
Review comment:
Good catch!
##########
File path:
kylin-spark-project/kylin-spark-engine/src/main/scala/org/apache/kylin/engine/spark/job/CubeBuildJob.java
##########
@@ -120,7 +120,7 @@ protected void doExecute() throws Exception {
infos.recordSpanningTree(segId, spanningTree);
logger.info("Updating segment info");
- updateSegmentInfo(getParam(MetadataConstants.P_CUBE_ID), seg,
buildFromFlatTable.getCount());
+ updateSegmentInfo(getParam(MetadataConstants.P_CUBE_ID), seg,
buildFromFlatTable.getParentDS().count());
Review comment:
What's the reason for using `getParentDS`?
> Remove duplicated/misleading code or comment
> --------------------------------------------
>
> Key: KYLIN-4660
> URL: https://issues.apache.org/jira/browse/KYLIN-4660
> Project: Kylin
> Issue Type: Sub-task
> Components: Storage - Parquet
> Reporter: Xiaoxiang Yu
> Assignee: wangrupeng
> Priority: Major
> Fix For: v4.0.0-beta
>
> Original Estimate: 672h
> Remaining Estimate: 672h
>
> Source code is located at
> [https://github.com/apache/kylin/tree/kylin-on-parquet-v2]; we are still
> testing and verifying it.
>
>
> ||Feature to be removed||commit id||Jira ID for the feature||
> |Advanced Snapshot|todo|todo|
> |JDBC Source|todo| |
> |Kafka Source (NRT)|todo| |
> |Kafka Source (Real-time OLAP)|todo| |
> |Build Engine - MR|todo| |
> |Build Engine - Spark|todo| |
> |Build Engine - Flink|todo| |
> |Storage Engine - HBase|todo| |