ShreelekhyaG commented on a change in pull request #3988: URL: https://github.com/apache/carbondata/pull/3988#discussion_r534763831
##########
File path: core/src/main/java/org/apache/carbondata/core/metadata/SegmentFileStore.java
##########
@@ -439,6 +430,73 @@ public boolean accept(CarbonFile file) {
     return null;
   }
 
+  /**
+   * Get old and invalid files which have already been merged to a mergeindex file.In segment folder
+   * we may have both .index files and .mergeindex files, as we are not deleting index files
+   * immediately for old tables, this method reads mergeindex file and adds mapped index files to a
+   * list and returns.If more than one mergeindex file is present, considers the latest one as valid
+   * Ex: We have 3 files in segment. Segment0/ 1.index , 1.mergeindex file, 1.carbondata.
+   * 1.index is merged to 1.mergeindex. Here it returns merged index file - 1.index.
+   */
+  public static Set<String> getInvalidAndMergedIndexFiles(List<String> indexFiles)
+      throws IOException {
+    SegmentIndexFileStore indexFileStore = new SegmentIndexFileStore();

Review comment:
This method is called during read, when the segment file or table status file is not present or has been deleted, and when writing the segment data size to the tablestatus file. So it is called for both old and new tables, and both cases are handled.

For new tables, it is also used to filter out invalid files when stale data is present. Example: an SI load when the MT has stale index files. During SI load the MT segment file / status file name is not updated, so we read the files directly from the segment directory.
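To make the filtering described above concrete, here is a minimal, self-contained sketch. It is not the CarbonData implementation: `StaleIndexFilterSketch`, `MergeIndexInfo`, and the `timestamp` field are hypothetical names introduced only for illustration, and it assumes the list of index files mapped by each mergeindex file has already been read. It also assumes that older mergeindex files count as invalid once a newer one exists, which is one reading of "considers the latest one as valid".

```java
import java.util.HashSet;
import java.util.List;
import java.util.Set;

/** Illustrative sketch only; none of these types exist in CarbonData. */
final class StaleIndexFilterSketch {

  /** Hypothetical view of one .mergeindex file and the .index files it maps. */
  static final class MergeIndexInfo {
    final String fileName;
    final long timestamp;
    final List<String> mergedIndexFiles;

    MergeIndexInfo(String fileName, long timestamp, List<String> mergedIndexFiles) {
      this.fileName = fileName;
      this.timestamp = timestamp;
      this.mergedIndexFiles = mergedIndexFiles;
    }
  }

  /**
   * Returns the file names a reader should skip: the .index files already
   * merged into the latest .mergeindex file, plus (by assumption) any
   * older .mergeindex files.
   */
  static Set<String> invalidAndMergedIndexFiles(List<MergeIndexInfo> mergeIndexFiles) {
    Set<String> result = new HashSet<>();

    // Pick the mergeindex file with the highest timestamp as the valid one.
    MergeIndexInfo latest = null;
    for (MergeIndexInfo info : mergeIndexFiles) {
      if (latest == null || info.timestamp > latest.timestamp) {
        latest = info;
      }
    }
    if (latest == null) {
      return result; // no mergeindex file, so nothing has been merged
    }

    // Index files covered by the valid mergeindex file are redundant.
    result.addAll(latest.mergedIndexFiles);

    // Assumption: every other mergeindex file is stale.
    for (MergeIndexInfo info : mergeIndexFiles) {
      if (info != latest) {
        result.add(info.fileName);
      }
    }
    return result;
  }
}
```

For the example in the doc comment (Segment0 containing 1.index, 1.mergeindex and 1.carbondata), passing a single `MergeIndexInfo` for 1.mergeindex that maps 1.index yields a set containing only 1.index.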