Prashant Wason created HUDI-2923:
------------------------------------

             Summary: Unable to read from metadata table when a compaction is 
in progress or has failed
                 Key: HUDI-2923
                 URL: https://issues.apache.org/jira/browse/HUDI-2923
             Project: Apache Hudi
          Issue Type: Bug
            Reporter: Prashant Wason
            Assignee: Prashant Wason
             Fix For: 0.10.0


When reading from the metadata table, the [readers are opened with the latest file 
slices|https://github.com/apache/hudi/blob/master/hudi-common/src/main/java/org/apache/hudi/metadata/HoodieBackedTableMetadata.java#L248].
 

When a compaction is in progress, the latest file slice is the one created for 
the pending compaction instant, and it does not have a base file or any log 
files (yet). Hence the readers open nothing and we are unable to read data from 
the metadata table.
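
To make the failure mode concrete, here is a minimal, self-contained sketch. It is a toy model, not Hudi's actual classes: `Slice`, `latest`, and `latestMerged` are hypothetical names introduced only for illustration, and the file names are placeholders. It shows why picking the newest slice yields nothing to read while a compaction is pending, and how merging the pending slice with the one before it would still expose the earlier base file and log files.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Collections;
import java.util.List;

public class MetadataSliceSketch {

    /** Toy stand-in for a file slice (hypothetical, not Hudi's FileSlice). */
    public static final class Slice {
        public final String baseInstant;
        public final String baseFile;       // null when no base file exists yet
        public final List<String> logFiles;

        public Slice(String baseInstant, String baseFile, List<String> logFiles) {
            this.baseInstant = baseInstant;
            this.baseFile = baseFile;
            this.logFiles = logFiles;
        }

        /** True when there is neither a base file nor any log file to read. */
        public boolean isEmpty() {
            return baseFile == null && logFiles.isEmpty();
        }
    }

    /** Naive choice: always read the newest slice, even if it is still empty. */
    public static Slice latest(List<Slice> slicesOldestFirst) {
        return slicesOldestFirst.get(slicesOldestFirst.size() - 1);
    }

    /**
     * Alternative choice: when the newest slice belongs to the pending
     * compaction, fold it into the previous slice so that the previous base
     * file and the log files of both slices remain visible to the reader.
     */
    public static Slice latestMerged(List<Slice> slicesOldestFirst,
                                     String pendingCompactionInstant) {
        Slice newest = latest(slicesOldestFirst);
        boolean pending = newest.baseInstant.equals(pendingCompactionInstant);
        if (!pending || slicesOldestFirst.size() < 2) {
            return newest;
        }
        Slice previous = slicesOldestFirst.get(slicesOldestFirst.size() - 2);
        List<String> logs = new ArrayList<>(previous.logFiles);
        logs.addAll(newest.logFiles);
        return new Slice(previous.baseInstant, previous.baseFile, logs);
    }

    public static void main(String[] args) {
        // Instants follow the shape seen in the unit test: data written at
        // 001, compaction scheduled at 002001 but not yet completed.
        Slice beforeCompaction =
            new Slice("001", "base_001.hfile", Arrays.asList("log_001.1"));
        Slice pendingCompaction =
            new Slice("002001", null, Collections.<String>emptyList());
        List<Slice> slices = Arrays.asList(beforeCompaction, pendingCompaction);

        System.out.println(latest(slices).isEmpty());                 // true
        System.out.println(latestMerged(slices, "002001").isEmpty()); // false
    }
}
```

The merged view is only one possible way to address the problem; whether the reader should use it, or skip pending-compaction slices in some other way, is a design choice for the actual fix.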

 

There are two cases here:
 # Compaction eventually completes: We will be able to read data from the 
metadata table again.
 # Compaction fails: We will not be able to read data until the next time 
compaction runs. This can be a fatal issue if the next writer tries to perform 
an update which requires listing partitions from the metadata table.

 

Relevant logs from a unit test:

 

13084 [main] INFO  
org.apache.hudi.common.table.view.AbstractTableFileSystemView  - *Pending 
Compaction instant for* (FileSlice 
{fileGroupId=HoodieFileGroupId{partitionPath='files', 
fileId='2733a6ef-4bfd-444d-91bd-b42c3b66a84e-0'}, baseCommitTime=002001, 
baseFile='null', logFiles='[]'}) is 
:Option{val=(002001,CompactionOperation{baseInstantTime='001', 
dataFileCommitTime=Option{val=001}, 
deltaFileNames=[.2733a6ef-4bfd-444d-91bd-b42c3b66a84e-0_001.log.1_0-34-36], 
dataFileName=Option{val=2733a6ef-4bfd-444d-91bd-b42c3b66a84e-0_0-16-20_001.hfile},
 id='HoodieFileGroupId{partitionPath='files', 
fileId='2733a6ef-4bfd-444d-91bd-b42c3b66a84e-0'}', metrics={}, 
bootstrapFilePath=Optional.empty})}

13084 [main] INFO  
org.apache.hudi.common.table.view.AbstractTableFileSystemView  - File Slice 
(FileSlice {fileGroupId=HoodieFileGroupId{partitionPath='files', 
fileId='2733a6ef-4bfd-444d-91bd-b42c3b66a84e-0'}, *baseCommitTime=002001, 
baseFile='null', logFiles='[]'}) is in pending compaction*

 

13089 [main] INFO  
org.apache.hudi.common.table.log.HoodieMergedLogRecordScanner  - *Number of log 
files scanned => 0*
13089 [main] INFO  
org.apache.hudi.common.table.log.HoodieMergedLogRecordScanner  - 
MaxMemoryInBytes allowed for compaction => 0
13089 [main] INFO  
org.apache.hudi.common.table.log.HoodieMergedLogRecordScanner  - Number of 
entries in MemoryBasedMap in ExternalSpillableMap => 0
13089 [main] INFO  
org.apache.hudi.common.table.log.HoodieMergedLogRecordScanner  - Total size in 
bytes of MemoryBasedMap in ExternalSpillableMap => 0
13089 [main] INFO  
org.apache.hudi.common.table.log.HoodieMergedLogRecordScanner  - Number of 
entries in DiskBasedMap in ExternalSpillableMap => 0
13089 [main] INFO  
org.apache.hudi.common.table.log.HoodieMergedLogRecordScanner  - Size of file 
spilled to disk => 0
13089 [main] INFO  org.apache.hudi.metadata.HoodieBackedTableMetadata  - 
*Opened metadata log files from []* at instant (dataset instant=002, metadata 
instant=002) in 2 ms
13089 [main] INFO  org.apache.hudi.metadata.HoodieBackedTableMetadata  - 
Metadata read for key __all_partitions__ took [baseFileRead, logMerge] [0, 0] ms
13090 [main] INFO  org.apache.hudi.metadata.BaseTableMetadata  - *Listed 
partitions from metadata: #partitions=0*



--
This message was sent by Atlassian Jira
(v8.20.1#820001)
