[GitHub] carbondata pull request #2936: [CARBONDATA-3118] Parallelize block pruning o...

ravipesala Wed, 21 Nov 2018 22:18:02 -0800

Github user ravipesala commented on a diff in the pull request:

    https://github.com/apache/carbondata/pull/2936#discussion_r235611496
  
    --- Diff: core/src/main/java/org/apache/carbondata/core/datamap/TableDataMap.java ---
    @@ -120,37 +132,166 @@ public BlockletDetailsFetcher getBlockletDetailsFetcher() {
        * @param filterExp
        * @return
        */
    -  public List<ExtendedBlocklet> prune(List<Segment> segments, FilterResolverIntf filterExp,
    -      List<PartitionSpec> partitions) throws IOException {
    -    List<ExtendedBlocklet> blocklets = new ArrayList<>();
    -    SegmentProperties segmentProperties;
    -    Map<Segment, List<DataMap>> dataMaps = dataMapFactory.getDataMaps(segments);
    +  public List<ExtendedBlocklet> prune(List<Segment> segments, final FilterResolverIntf filterExp,
    +      final List<PartitionSpec> partitions) throws IOException {
    +    final List<ExtendedBlocklet> blocklets = new ArrayList<>();
    +    final Map<Segment, List<DataMap>> dataMaps = dataMapFactory.getDataMaps(segments);
    +    // for non-filter queries
    +    if (filterExp == null) {
    +      // if filter is not passed, then return all the blocklets.
    +      return pruneWithoutFilter(segments, partitions, blocklets);
    --- End diff --
    
    Please check how long it takes to get all the blocks when there are millions of files. If it takes too long, we may need to parallelize this path as well.
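
    One way to check this is sketched below: time a serial walk over all segments against a version that fans out one task per segment on a fixed thread pool. This is only an illustration. The class PruneWithoutFilterSketch, the helper listBlocks, and the segment and block counts are hypothetical stand-ins, not CarbonData APIs, and the lambda syntax assumes Java 8 or later.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.concurrent.ExecutionException;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.Future;

public class PruneWithoutFilterSketch {

  // Hypothetical stand-in for fetching every block of one segment.
  // Real code would read the segment's data map or index files instead.
  static List<String> listBlocks(int segmentId, int blocksPerSegment) {
    List<String> blocks = new ArrayList<>(blocksPerSegment);
    for (int i = 0; i < blocksPerSegment; i++) {
      blocks.add("segment_" + segmentId + "/block_" + i);
    }
    return blocks;
  }

  // Baseline: visit the segments one after another on the calling thread.
  static List<String> listAllSerial(int segments, int blocksPerSegment) {
    List<String> all = new ArrayList<>();
    for (int s = 0; s < segments; s++) {
      all.addAll(listBlocks(s, blocksPerSegment));
    }
    return all;
  }

  // Parallel variant: one task per segment on a fixed-size pool.
  // Results are collected in submission order, so the output order
  // matches the serial version.
  static List<String> listAllParallel(int segments, int blocksPerSegment, int threads)
      throws InterruptedException, ExecutionException {
    ExecutorService pool = Executors.newFixedThreadPool(threads);
    try {
      List<Future<List<String>>> futures = new ArrayList<>();
      for (int s = 0; s < segments; s++) {
        final int segmentId = s;
        futures.add(pool.submit(() -> listBlocks(segmentId, blocksPerSegment)));
      }
      List<String> all = new ArrayList<>();
      for (Future<List<String>> future : futures) {
        all.addAll(future.get());
      }
      return all;
    } finally {
      pool.shutdown();
    }
  }

  public static void main(String[] args) throws Exception {
    int segments = 100;            // assumed workload size, adjust as needed
    int blocksPerSegment = 10_000; // 100 * 10,000 = 1,000,000 blocks in total

    long t0 = System.nanoTime();
    List<String> serial = listAllSerial(segments, blocksPerSegment);
    long t1 = System.nanoTime();
    List<String> parallel = listAllParallel(segments, blocksPerSegment, 4);
    long t2 = System.nanoTime();

    System.out.println("serial blocks:   " + serial.size()
        + ", ms: " + (t1 - t0) / 1_000_000);
    System.out.println("parallel blocks: " + parallel.size()
        + ", ms: " + (t2 - t1) / 1_000_000);
  }
}
```

    The in-memory stand-in does no I/O, so the numbers it prints say little about real tables. The comparison only becomes meaningful once listBlocks is replaced with the actual per-segment lookup.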

---
