vinothchandar commented on pull request #1848:
URL: https://github.com/apache/hudi/pull/1848#issuecomment-669787801


   @garyli1019 I am afraid this has something to do with the changes we made for 
`InMemoryFileIndex` (or something similar) in this PR.
   
   ```
   >>>> TestBootstrap : 
         
files:[file:/tmp/junit3878890598586882351/data/datestr=2020%252F04%252F03/part-00000-5b566025-a845-494b-830e-e203c2ab142f.c000.snappy.parquet,
 
file:/tmp/junit3878890598586882351/data/datestr=2020%252F04%252F03/part-00001-5b566025-a845-494b-830e-e203c2ab142f.c000.snappy.parquet,
 
file:/tmp/junit3878890598586882351/data/datestr=2020%252F04%252F01/part-00000-5b566025-a845-494b-830e-e203c2ab142f.c000.snappy.parquet,
 
file:/tmp/junit3878890598586882351/data/datestr=2020%252F04%252F01/part-00001-5b566025-a845-494b-830e-e203c2ab142f.c000.snappy.parquet,
 
file:/tmp/junit3878890598586882351/data/datestr=2020%252F04%252F02/part-00000-5b566025-a845-494b-830e-e203c2ab142f.c000.snappy.parquet,
 
file:/tmp/junit3878890598586882351/data/datestr=2020%252F04%252F02/part-00001-5b566025-a845-494b-830e-e203c2ab142f.c000.snappy.parquet]
         numVersions:2
         numFiles:6
         bootstrapBasePath:file:/tmp/junit3878890598586882351/data/_SUCCESS
                
file:/tmp/junit3878890598586882351/data/datestr=2020%252F04%252F03/part-00000-5b566025-a845-494b-830e-e203c2ab142f.c000.snappy.parquet
                
file:/tmp/junit3878890598586882351/data/datestr=2020%252F04%252F03/part-00001-5b566025-a845-494b-830e-e203c2ab142f.c000.snappy.parquet
                
file:/tmp/junit3878890598586882351/data/datestr=2020%252F04%252F01/part-00000-5b566025-a845-494b-830e-e203c2ab142f.c000.snappy.parquet
                
file:/tmp/junit3878890598586882351/data/datestr=2020%252F04%252F01/part-00001-5b566025-a845-494b-830e-e203c2ab142f.c000.snappy.parquet
                
file:/tmp/junit3878890598586882351/data/datestr=2020%252F04%252F02/part-00000-5b566025-a845-494b-830e-e203c2ab142f.c000.snappy.parquet
                
file:/tmp/junit3878890598586882351/data/datestr=2020%252F04%252F02/part-00001-5b566025-a845-494b-830e-e203c2ab142f.c000.snappy.parquet
         
basePath:file:/tmp/junit3878890598586882351/dataset/datestr=2020%252F04%252F03/.44f3a4f0-ec41-4a43-889e-b133bccaaf40_00000000000001.log.1_0-937-6957
                
file:/tmp/junit3878890598586882351/dataset/datestr=2020%252F04%252F03/830616aa-e406-4155-aa63-c9c80d15212d_0-895-6505_00000000000001.parquet
                
file:/tmp/junit3878890598586882351/dataset/datestr=2020%252F04%252F03/.hoodie_partition_metadata
                
file:/tmp/junit3878890598586882351/dataset/datestr=2020%252F04%252F03/44f3a4f0-ec41-4a43-889e-b133bccaaf40_0-943-6973_20200806081528.parquet
                
file:/tmp/junit3878890598586882351/dataset/datestr=2020%252F04%252F03/44f3a4f0-ec41-4a43-889e-b133bccaaf40_0-895-6505_00000000000001.parquet
                
file:/tmp/junit3878890598586882351/dataset/datestr=2020%252F04%252F03/.830616aa-e406-4155-aa63-c9c80d15212d_00000000000001.log.1_1-937-6958
                
file:/tmp/junit3878890598586882351/dataset/datestr=2020%252F04%252F03/830616aa-e406-4155-aa63-c9c80d15212d_5-943-6978_20200806081528.parquet
                
file:/tmp/junit3878890598586882351/dataset/.hoodie/00000000000001.deltacommit.requested
                
file:/tmp/junit3878890598586882351/dataset/.hoodie/20200806081528.compaction.inflight
                
file:/tmp/junit3878890598586882351/dataset/.hoodie/20200806081528.compaction.requested
                
file:/tmp/junit3878890598586882351/dataset/.hoodie/20200806081524.deltacommit.inflight
                
file:/tmp/junit3878890598586882351/dataset/.hoodie/00000000000001.deltacommit.inflight
                
file:/tmp/junit3878890598586882351/dataset/.hoodie/20200806081524.deltacommit
                
file:/tmp/junit3878890598586882351/dataset/.hoodie/.aux/20200806081528.compaction.requested
                
file:/tmp/junit3878890598586882351/dataset/.hoodie/.aux/.bootstrap/.partitions/00000000-0000-0000-0000-000000000000-0_1-0-1_00000000000001.hfile
                
file:/tmp/junit3878890598586882351/dataset/.hoodie/.aux/.bootstrap/.fileids/00000000-0000-0000-0000-000000000000-0_1-0-1_00000000000001.hfile
                
file:/tmp/junit3878890598586882351/dataset/.hoodie/20200806081517.restore.inflight
                
file:/tmp/junit3878890598586882351/dataset/.hoodie/20200806081524.deltacommit.requested
                
file:/tmp/junit3878890598586882351/dataset/.hoodie/.temp/20200806081528/datestr=2020%252F04%252F03/830616aa-e406-4155-aa63-c9c80d15212d_5-943-6978_20200806081528.parquet.marker.MERGE
                
file:/tmp/junit3878890598586882351/dataset/.hoodie/.temp/20200806081528/datestr=2020%252F04%252F03/44f3a4f0-ec41-4a43-889e-b133bccaaf40_0-943-6973_20200806081528.parquet.marker.MERGE
                
file:/tmp/junit3878890598586882351/dataset/.hoodie/.temp/20200806081528/datestr=2020%252F04%252F01/cc465d37-07a6-419b-86a0-7756d1dd8ca4_3-943-6976_20200806081528.parquet.marker.MERGE
                
file:/tmp/junit3878890598586882351/dataset/.hoodie/.temp/20200806081528/datestr=2020%252F04%252F01/562f9e3c-29a4-4f3b-9405-0796eaea0717_1-943-6974_20200806081528.parquet.marker.MERGE
                
file:/tmp/junit3878890598586882351/dataset/.hoodie/.temp/20200806081528/datestr=2020%252F04%252F02/6b045efb-f008-44e9-9e44-25f7ed7c7e36_4-943-6977_20200806081528.parquet.marker.MERGE
                
file:/tmp/junit3878890598586882351/dataset/.hoodie/.temp/20200806081528/datestr=2020%252F04%252F02/ae7740f7-b986-48a5-9d3c-2a784f0ee262_2-943-6975_20200806081528.parquet.marker.MERGE
                
file:/tmp/junit3878890598586882351/dataset/.hoodie/.temp/00000000000001/datestr=2020%252F04%252F03/830616aa-e406-4155-aa63-c9c80d15212d_0-895-6505_00000000000001.parquet.marker.CREATE
                
file:/tmp/junit3878890598586882351/dataset/.hoodie/.temp/00000000000001/datestr=2020%252F04%252F03/44f3a4f0-ec41-4a43-889e-b133bccaaf40_0-895-6505_00000000000001.parquet.marker.CREATE
                
file:/tmp/junit3878890598586882351/dataset/.hoodie/.temp/00000000000001/datestr=2020%252F04%252F01/cc465d37-07a6-419b-86a0-7756d1dd8ca4_1-895-6506_00000000000001.parquet.marker.CREATE
                
file:/tmp/junit3878890598586882351/dataset/.hoodie/.temp/00000000000001/datestr=2020%252F04%252F01/562f9e3c-29a4-4f3b-9405-0796eaea0717_1-895-6506_00000000000001.parquet.marker.CREATE
                
file:/tmp/junit3878890598586882351/dataset/.hoodie/.temp/00000000000001/datestr=2020%252F04%252F02/ae7740f7-b986-48a5-9d3c-2a784f0ee262_2-895-6507_00000000000001.parquet.marker.CREATE
                
file:/tmp/junit3878890598586882351/dataset/.hoodie/.temp/00000000000001/datestr=2020%252F04%252F02/6b045efb-f008-44e9-9e44-25f7ed7c7e36_2-895-6507_00000000000001.parquet.marker.CREATE
                
file:/tmp/junit3878890598586882351/dataset/.hoodie/hoodie.properties
                
file:/tmp/junit3878890598586882351/dataset/.hoodie/20200806081528.commit
                
file:/tmp/junit3878890598586882351/dataset/.hoodie/00000000000001.deltacommit
                
file:/tmp/junit3878890598586882351/dataset/.hoodie/20200806081517.restore
                
file:/tmp/junit3878890598586882351/dataset/datestr=2020%252F04%252F01/cc465d37-07a6-419b-86a0-7756d1dd8ca4_1-895-6506_00000000000001.parquet
                
file:/tmp/junit3878890598586882351/dataset/datestr=2020%252F04%252F01/.562f9e3c-29a4-4f3b-9405-0796eaea0717_00000000000001.log.1_4-937-6961
                
file:/tmp/junit3878890598586882351/dataset/datestr=2020%252F04%252F01/562f9e3c-29a4-4f3b-9405-0796eaea0717_1-895-6506_00000000000001.parquet
                
file:/tmp/junit3878890598586882351/dataset/datestr=2020%252F04%252F01/562f9e3c-29a4-4f3b-9405-0796eaea0717_1-943-6974_20200806081528.parquet
                
file:/tmp/junit3878890598586882351/dataset/datestr=2020%252F04%252F01/.hoodie_partition_metadata
                
file:/tmp/junit3878890598586882351/dataset/datestr=2020%252F04%252F01/.cc465d37-07a6-419b-86a0-7756d1dd8ca4_00000000000001.log.1_5-937-6962
                
file:/tmp/junit3878890598586882351/dataset/datestr=2020%252F04%252F01/cc465d37-07a6-419b-86a0-7756d1dd8ca4_3-943-6976_20200806081528.parquet
                
file:/tmp/junit3878890598586882351/dataset/datestr=2020%252F04%252F02/ae7740f7-b986-48a5-9d3c-2a784f0ee262_2-895-6507_00000000000001.parquet
                
file:/tmp/junit3878890598586882351/dataset/datestr=2020%252F04%252F02/.6b045efb-f008-44e9-9e44-25f7ed7c7e36_00000000000001.log.1_3-937-6960
                
file:/tmp/junit3878890598586882351/dataset/datestr=2020%252F04%252F02/.ae7740f7-b986-48a5-9d3c-2a784f0ee262_00000000000001.log.1_2-937-6959
                
file:/tmp/junit3878890598586882351/dataset/datestr=2020%252F04%252F02/.hoodie_partition_metadata
                
file:/tmp/junit3878890598586882351/dataset/datestr=2020%252F04%252F02/6b045efb-f008-44e9-9e44-25f7ed7c7e36_2-895-6507_00000000000001.parquet
                
file:/tmp/junit3878890598586882351/dataset/datestr=2020%252F04%252F02/6b045efb-f008-44e9-9e44-25f7ed7c7e36_4-943-6977_20200806081528.parquet
                
file:/tmp/junit3878890598586882351/dataset/datestr=2020%252F04%252F02/ae7740f7-b986-48a5-9d3c-2a784f0ee262_2-943-6975_20200806081528.parquet
   ```
   
   Maybe the path filter is getting set but not removed, or something like that? I 
can now see all 12 files present, as expected, but `spark.read.parquet()` picks 
up just 6: the latest parquet files.
   
   ```
   +----------------------------------------------------------------------+
   |_hoodie_file_name                                                     |
   +----------------------------------------------------------------------+
   |44f3a4f0-ec41-4a43-889e-b133bccaaf40_0-943-6973_20200806081528.parquet|
   |ae7740f7-b986-48a5-9d3c-2a784f0ee262_2-943-6975_20200806081528.parquet|
   |6b045efb-f008-44e9-9e44-25f7ed7c7e36_4-943-6977_20200806081528.parquet|
   |cc465d37-07a6-419b-86a0-7756d1dd8ca4_3-943-6976_20200806081528.parquet|
   |830616aa-e406-4155-aa63-c9c80d15212d_5-943-6978_20200806081528.parquet|
   |562f9e3c-29a4-4f3b-9405-0796eaea0717_1-943-6974_20200806081528.parquet|
   +----------------------------------------------------------------------+
   ```
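
To make the symptom concrete, here is a minimal sketch of what a "latest version only" filter would do to the 12 base files listed above. It is not Hudi's path filter code; the helper names `parse_base_file` and `latest_base_files` are hypothetical, and the `<fileId>_<writeToken>_<instantTime>.parquet` layout is only inferred from the listing. If such a filter were still active, a plain parquet read would see 6 files instead of 12, which matches the output above.

```python
# Hypothetical sketch: keep only the newest base file per file id.
# Assumes names look like <fileId>_<writeToken>_<instantTime>.parquet,
# as inferred from the directory listing in this comment.

def parse_base_file(name):
    """Split a base file name into (file_id, write_token, instant_time)."""
    stem = name[: -len(".parquet")]
    file_id, write_token, instant_time = stem.split("_")
    return file_id, write_token, instant_time


def latest_base_files(names):
    """Return one file per file id: the one with the greatest instant time."""
    latest = {}
    for name in names:
        file_id, _, instant = parse_base_file(name)
        current = latest.get(file_id)
        # Instant times are fixed-width digit strings, so string comparison works.
        if current is None or instant > parse_base_file(current)[2]:
            latest[file_id] = name
    return sorted(latest.values())


# The 12 base files from the dataset listing (partition directories omitted).
all_files = [
    "830616aa-e406-4155-aa63-c9c80d15212d_0-895-6505_00000000000001.parquet",
    "830616aa-e406-4155-aa63-c9c80d15212d_5-943-6978_20200806081528.parquet",
    "44f3a4f0-ec41-4a43-889e-b133bccaaf40_0-895-6505_00000000000001.parquet",
    "44f3a4f0-ec41-4a43-889e-b133bccaaf40_0-943-6973_20200806081528.parquet",
    "cc465d37-07a6-419b-86a0-7756d1dd8ca4_1-895-6506_00000000000001.parquet",
    "cc465d37-07a6-419b-86a0-7756d1dd8ca4_3-943-6976_20200806081528.parquet",
    "562f9e3c-29a4-4f3b-9405-0796eaea0717_1-895-6506_00000000000001.parquet",
    "562f9e3c-29a4-4f3b-9405-0796eaea0717_1-943-6974_20200806081528.parquet",
    "ae7740f7-b986-48a5-9d3c-2a784f0ee262_2-895-6507_00000000000001.parquet",
    "ae7740f7-b986-48a5-9d3c-2a784f0ee262_2-943-6975_20200806081528.parquet",
    "6b045efb-f008-44e9-9e44-25f7ed7c7e36_2-895-6507_00000000000001.parquet",
    "6b045efb-f008-44e9-9e44-25f7ed7c7e36_4-943-6977_20200806081528.parquet",
]

picked = latest_base_files(all_files)
print(len(all_files), "files on disk,", len(picked), "after filtering")
for name in picked:
    print(name)
```

Every surviving name carries the `20200806081528` instant, the same set that `_hoodie_file_name` shows, so a leftover filter is at least a plausible explanation worth ruling out.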



