[GitHub] [carbondata] akashrn5 commented on a change in pull request #3584: [CARBONDATA-3718] Support SegmentLevel MinMax for better Pruning and less driver memory usage for cache

GitBox Wed, 18 Mar 2020 06:17:42 -0700

akashrn5 commented on a change in pull request #3584: [CARBONDATA-3718] Support SegmentLevel MinMax for better Pruning and less driver memory usage for cache
URL: https://github.com/apache/carbondata/pull/3584#discussion_r394334509


 ##########
 File path: integration/spark/src/main/scala/org/apache/carbondata/spark/rdd/CarbonTableCompactor.scala
 ##########
 @@ -248,21 +258,31 @@ class CarbonTableCompactor(carbonLoadModel: CarbonLoadModel,
       } else {
         // Get the segment files each updated segment in case of IUD compaction
         if (compactionType == CompactionType.IUD_UPDDEL_DELTA) {
-          val segmentFilesList = loadsToMerge.asScala.map{seg =>
+          val segmentFilesList = loadsToMerge.asScala.map { seg =>
+            val segmentMetaDataInfo = new SegmentFileStore(carbonLoadModel.getTablePath,
+              seg.getSegmentFile).getSegmentFile.getSegmentMetaDataInfo
             val file = SegmentFileStore.writeSegmentFile(
               carbonTable,
               seg.getLoadName,
-              carbonLoadModel.getFactTimeStamp.toString)
+              carbonLoadModel.getFactTimeStamp.toString,
+              segmentMetaDataInfo)
             new Segment(seg.getLoadName, file)
           }.filter(_.getSegmentFileName != null).asJava
           segmentFilesForIUDCompact = new util.ArrayList[Segment](segmentFilesList)
         } else {
+          // get segmentMetadata info from accumulator
+          val segmentMetaDataInfo = CarbonDataRDDFactory.getSegmentMetaDataInfoFromAccumulator(
 
 Review comment:
   Move `getSegmentMetaDataInfoFromAccumulator` to `CarbonLoaderUtil`, as it's used in all load flows.

