akashrn5 commented on a change in pull request #3677: [wip]Fix segment cache issue with parallel spark applications on same store
URL: https://github.com/apache/carbondata/pull/3677#discussion_r397305548

##########
File path: core/src/main/java/org/apache/carbondata/core/indexstore/blockletindex/BlockletDataMapFactory.java
##########
@@ -369,6 +379,33 @@ private void modifyColumnSchemaForSortColumn(ColumnSchema columnSchema, boolean
     return tableBlockIndexUniqueIdentifiers;
   }
 
+  /**
+   * This case is added for a case where, there are two applications running, and in one application
+   * operations happened like SI rebuild, update or delete case, then the cache should be updated as
+   * well. The cache updation happens for same application, but other application may fail to query
+   * or may give wrong result. Since we overwrite the segment file in these scenarios, check the
+   * timestamp, and if modified, clear from the cache.
+   */
+  private void clearSegmentMapIfSegmentUpdated(String latestSegmentFilePath, Segment segment) {
+    SegmentBlockIndexInfo segmentBlockIndexInfo = segmentMap.get(segment.getSegmentNo());

Review comment:
But in that case we would have to list the files, get all the files inside the segment, and compare their timestamps. I don't think that is feasible here. Instead, I can check only the segment file: comparing the timestamp of a single file is feasible, and any future change that rewrites the segment file will be covered as well.
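
To make the trade-off concrete, here is a minimal sketch of the single-file timestamp check, under stated assumptions. It is not the code in this PR: `SegmentCacheSketch`, `CachedSegment`, `put`, `clearIfSegmentUpdated`, `isCached` and `localFileTimestamp` are hypothetical names, and plain `java.nio` stands in for whatever file abstraction the real cache uses. The timestamp lookup is passed in as a function so that the store access can be swapped out.

```java
import java.io.IOException;
import java.io.UncheckedIOException;
import java.nio.file.Files;
import java.nio.file.Paths;
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;
import java.util.function.ToLongFunction;

/** Hypothetical stand-in for the segment cache; not CarbonData's actual classes. */
class SegmentCacheSketch {

  /** Cached state for one segment, with the segment file timestamp seen at load time. */
  static final class CachedSegment {
    final String segmentFilePath;
    final long segmentFileTimestamp;

    CachedSegment(String segmentFilePath, long segmentFileTimestamp) {
      this.segmentFilePath = segmentFilePath;
      this.segmentFileTimestamp = segmentFileTimestamp;
    }
  }

  private final Map<String, CachedSegment> segmentMap = new ConcurrentHashMap<>();

  // Assumption: maps a segment file path to its last-modified time in milliseconds.
  private final ToLongFunction<String> lastModified;

  SegmentCacheSketch(ToLongFunction<String> lastModified) {
    this.lastModified = lastModified;
  }

  /** Reads the last-modified time of a local file; a possible default lookup. */
  static long localFileTimestamp(String path) {
    try {
      return Files.getLastModifiedTime(Paths.get(path)).toMillis();
    } catch (IOException e) {
      throw new UncheckedIOException(e);
    }
  }

  /** Caches the segment and remembers the timestamp of its segment file. */
  void put(String segmentNo, String segmentFilePath) {
    long timestamp = lastModified.applyAsLong(segmentFilePath);
    segmentMap.put(segmentNo, new CachedSegment(segmentFilePath, timestamp));
  }

  /**
   * Evicts the cached entry if the segment file changed since it was cached.
   * This costs one timestamp lookup per segment instead of listing and
   * comparing every file inside the segment. Returns true if an entry was evicted.
   */
  boolean clearIfSegmentUpdated(String segmentNo, String latestSegmentFilePath) {
    CachedSegment cached = segmentMap.get(segmentNo);
    if (cached == null) {
      return false; // nothing cached for this segment
    }
    long latest = lastModified.applyAsLong(latestSegmentFilePath);
    if (latest != cached.segmentFileTimestamp) {
      segmentMap.remove(segmentNo); // stale: another application rewrote the segment file
      return true;
    }
    return false;
  }

  boolean isCached(String segmentNo) {
    return segmentMap.containsKey(segmentNo);
  }
}
```

With a local store, the cache could be built as `new SegmentCacheSketch(SegmentCacheSketch::localFileTimestamp)`. The sketch relies on the premise in the Javadoc under review: SI rebuild, update and delete overwrite the segment file, so one changed timestamp is enough to signal that the cached entry is stale.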