Github user kevinjmh commented on a diff in the pull request:

    https://github.com/apache/carbondata/pull/2713#discussion_r242454233
  
    --- Diff: datamap/bloom/src/main/java/org/apache/carbondata/datamap/bloom/BloomCoarseGrainDataMapFactory.java ---
    @@ -218,56 +218,46 @@ public DataMapBuilder createBuilder(Segment segment, String shardName,
             this.bloomFilterSize, this.bloomFilterFpp, bloomCompress);
       }
     
    -  /**
    -   * returns all shard directories of bloom index files for query
    -   * if bloom index files are merged we should get only one shard path
    -   */
    -  private Set<String> getAllShardPaths(String tablePath, String segmentId) {
    -    String dataMapStorePath = CarbonTablePath.getDataMapStorePath(
    -        tablePath, segmentId, dataMapName);
    -    CarbonFile[] carbonFiles = FileFactory.getCarbonFile(dataMapStorePath).listFiles();
    -    Set<String> shardPaths = new HashSet<>();
    +
    +  private boolean isAllShardsMerged(String dmSegmentPath) {
    +    boolean mergeShardExist = false;
         boolean mergeShardInprogress = false;
    -    CarbonFile mergeShardFile = null;
    +    CarbonFile[] carbonFiles = FileFactory.getCarbonFile(dmSegmentPath).listFiles();
         for (CarbonFile carbonFile : carbonFiles) {
    -      if (carbonFile.getName().equals(BloomIndexFileStore.MERGE_BLOOM_INDEX_SHARD_NAME)) {
    -        mergeShardFile = carbonFile;
    -      } else if (carbonFile.getName().equals(BloomIndexFileStore.MERGE_INPROGRESS_FILE)) {
    +      String fileName = carbonFile.getName();
    +      if (fileName.equals(BloomIndexFileStore.MERGE_BLOOM_INDEX_SHARD_NAME)) {
    +        mergeShardExist = true;
    +      } else if (fileName.equals(BloomIndexFileStore.MERGE_INPROGRESS_FILE)) {
             mergeShardInprogress = true;
    --- End diff --
    
    Yes, you are right. We need to fix this. If we allow the bloom filter to be used while the index files are being merged, an IOException may occur in the following steps once the merge is done.
    
    Some simple ideas for this:
    1. The datamap chooser does not choose the bloom datamap while merging is in progress.
    2. Change the pruning logic to be segment-independent: any datamap except the default datamap can reject or fail the pruning of a segment (by returning null or ?), and its result blocklets are then no longer intersected for that segment, so that this does not affect the final result.
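
The two ideas above could be sketched roughly as follows. This is only an illustration, not the CarbonData implementation: the class `BloomPruneSketch`, its methods, and the two marker-file values are hypothetical stand-ins (the real constants live in `BloomIndexFileStore`), and blocklets are represented as plain strings.

```java
import java.util.Arrays;
import java.util.LinkedHashSet;
import java.util.List;
import java.util.Set;

// Hypothetical sketch; none of these names are CarbonData APIs.
class BloomPruneSketch {

  // Placeholder values standing in for MERGE_BLOOM_INDEX_SHARD_NAME and
  // MERGE_INPROGRESS_FILE; the real values are not taken from the source.
  static final String MERGE_SHARD = "mergeShard";
  static final String MERGE_INPROGRESS = "mergeShard.inprogress";

  /** Idea 1: the bloom datamap is usable only if no merge is in progress. */
  static boolean canUseBloom(Set<String> fileNamesInSegmentDir) {
    return !fileNamesInSegmentDir.contains(MERGE_INPROGRESS);
  }

  /** Bloom pruning for one segment; null means "this datamap rejects the segment". */
  static Set<String> pruneWithBloom(Set<String> fileNamesInSegmentDir,
      Set<String> bloomHits) {
    return canUseBloom(fileNamesInSegmentDir) ? bloomHits : null;
  }

  /**
   * Idea 2: intersect the default datamap result with the other datamaps,
   * skipping any datamap that rejected the segment (null result), so a
   * rejection never changes the final result.
   */
  static Set<String> intersect(Set<String> defaultResult,
      List<Set<String>> otherResults) {
    Set<String> result = new LinkedHashSet<>(defaultResult);
    for (Set<String> other : otherResults) {
      if (other == null) {
        continue; // rejected: keep what the remaining datamaps produced
      }
      result.retainAll(other);
    }
    return result;
  }

  public static void main(String[] args) {
    Set<String> defaultResult = new LinkedHashSet<>(Arrays.asList("b1", "b2", "b3"));
    Set<String> bloomHits = new LinkedHashSet<>(Arrays.asList("b2"));

    // Segment directory while a merge is still running.
    Set<String> merging = new LinkedHashSet<>(Arrays.asList("shard0", MERGE_INPROGRESS));
    // Segment directory after the merge has finished.
    Set<String> merged = new LinkedHashSet<>(Arrays.asList(MERGE_SHARD));

    System.out.println(intersect(defaultResult,
        Arrays.asList(pruneWithBloom(merging, bloomHits))));
    System.out.println(intersect(defaultResult,
        Arrays.asList(pruneWithBloom(merged, bloomHits))));
  }
}
```

In this sketch a segment that is still merging falls back to the default datamap's blocklets, while a fully merged segment is narrowed by the bloom result. Whether a rejection should be signalled by null or by some explicit result type is left open, as in the comment above.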

