parisni commented on issue #6373:
URL: https://github.com/apache/hudi/issues/6373#issuecomment-1222605685
@nsivabalan when cleaning runs with the metadata table enabled, the Spark
executor emits thousands of log lines like the ones below (it is likely
reading the metadata table again and again):
```
27324870 [Executor task launch worker for task 165.0 in stage 173.0 (TID
83451)] INFO org.apache.hudi.common.table.timeline.HoodieActiveTimeline -
Loaded instants upto : Option{val=[20220822121450048__deltacommit__COMPLETED]}
27324906 [Executor task launch worker for task 165.0 in stage 173.0 (TID
83451)] INFO org.apache.hudi.common.table.log.AbstractHoodieLogRecordReader -
Scanning log file
HoodieLogFile{pathStr='s3a://hudi_path/.hoodie/metadata/files/.files-0000_20220822121450048.log.1_0-0-0',
fileLen=-1}
27324906 [Executor task launch worker for task 165.0 in stage 173.0 (TID
83451)] INFO org.apache.hadoop.fs.s3a.S3AInputStream - Switching to Random IO
seek policy
27324924 [Executor task launch worker for task 165.0 in stage 173.0 (TID
83451)] INFO org.apache.hudi.common.table.log.AbstractHoodieLogRecordReader -
Reading a delete block from file
s3a://hudi_path/.hoodie/metadata/files/.files-0000_20220822121450048.log.1_0-0-0
27324939 [Executor task launch worker for task 165.0 in stage 173.0 (TID
83451)] INFO org.apache.hudi.common.table.log.HoodieLogFormatReader - Moving
to the next reader for logfile
HoodieLogFile{pathStr='s3a://hudi_path/.hoodie/metadata/files/.files-0000_20220822121450048.log.1_0-111-75754',
fileLen=-1}
27324961 [Executor task launch worker for task 165.0 in stage 173.0 (TID
83451)] INFO org.apache.hudi.common.table.log.AbstractHoodieLogRecordReader -
Scanning log file
HoodieLogFile{pathStr='s3a://hudi_path/.hoodie/metadata/files/.files-0000_20220822121450048.log.1_0-111-75754',
fileLen=-1}
27324981 [Executor task launch worker for task 165.0 in stage 173.0 (TID
83451)] INFO org.apache.hadoop.fs.s3a.S3AInputStream - Switching to Random IO
seek policy
27325059 [Executor task launch worker for task 165.0 in stage 173.0 (TID
83451)] INFO org.apache.hudi.common.table.log.AbstractHoodieLogRecordReader -
Reading a data block from file
s3a://hudi_path/.hoodie/metadata/files/.files-0000_20220822121450048.log.1_0-111-75754
at instant 20220822121450048
27325059 [Executor task launch worker for task 165.0 in stage 173.0 (TID
83451)] INFO org.apache.hudi.common.table.log.AbstractHoodieLogRecordReader -
Scanning log file
HoodieLogFile{pathStr='s3a://hudi_path/.hoodie/metadata/files/.files-0000_20220822121450048.log.1_0-111-75754',
fileLen=-1}
27325083 [Executor task launch worker for task 165.0 in stage 173.0 (TID
83451)] INFO org.apache.hudi.common.table.log.AbstractHoodieLogRecordReader -
Reading a data block from file
s3a://hudi_path/.hoodie/metadata/files/.files-0000_20220822121450048.log.1_0-111-75754
at instant 20220822121450048
27325083 [Executor task launch worker for task 165.0 in stage 173.0 (TID
83451)] INFO org.apache.hudi.common.table.log.AbstractHoodieLogRecordReader -
Scanning log file
HoodieLogFile{pathStr='s3a://hudi_path/.hoodie/metadata/files/.files-0000_20220822121450048.log.1_0-111-75754',
fileLen=-1}
27325128 [Executor task launch worker for task 165.0 in stage 173.0 (TID
83451)] INFO org.apache.hudi.common.table.log.AbstractHoodieLogRecordReader -
Reading a data block from file
s3a://hudi_path/.hoodie/metadata/files/.files-0000_20220822121450048.log.1_0-111-75754
at instant 20220822121450048
27325128 [Executor task launch worker for task 165.0 in stage 173.0 (TID
83451)] INFO org.apache.hudi.common.table.log.AbstractHoodieLogRecordReader -
Scanning log file
HoodieLogFile{pathStr='s3a://hudi_path/.hoodie/metadata/files/.files-0000_20220822121450048.log.1_0-111-75754',
fileLen=-1}
27325194 [Executor task launch worker for task 165.0 in stage 173.0 (TID
83451)] INFO org.apache.hudi.common.table.log.AbstractHoodieLogRecordReader -
Reading a data block from file
s3a://hudi_path/.hoodie/metadata/files/.files-0000_20220822121450048.log.1_0-111-75754
at instant 20220822121450048
27325194 [Executor task launch worker for task 165.0 in stage 173.0 (TID
83451)] INFO org.apache.hudi.common.table.log.AbstractHoodieLogRecordReader -
Scanning log file
HoodieLogFile{pathStr='s3a://hudi_path/.hoodie/metadata/files/.files-0000_20220822121450048.log.1_0-111-75754',
fileLen=-1}
27325265 [Executor task launch worker for task 165.0 in stage 173.0 (TID
83451)] INFO org.apache.hudi.common.table.log.AbstractHoodieLogRecordReader -
Reading a data block from file
s3a://hudi_path/.hoodie/metadata/files/.files-0000_20220822121450048.log.1_0-111-75754
at instant 20220822121450048
27325265 [Executor task launch worker for task 165.0 in stage 173.0 (TID
83451)] INFO org.apache.hudi.common.table.log.AbstractHoodieLogRecordReader -
Scanning log file
HoodieLogFile{pathStr='s3a://hudi_path/.hoodie/metadata/files/.files-0000_20220822121450048.log.1_0-111-75754',
fileLen=-1}
27325336 [Executor task launch worker for task 165.0 in stage 173.0 (TID
83451)] INFO org.apache.hudi.common.table.log.AbstractHoodieLogRecordReader -
Reading a data block from file
s3a://hudi_path/.hoodie/metadata/files/.files-0000_20220822121450048.log.1_0-111-75754
at instant 20220822121450048
27325336 [Executor task launch worker for task 165.0 in stage 173.0 (TID
83451)] INFO org.apache.hudi.common.table.log.AbstractHoodieLogRecordReader -
Scanning log file
HoodieLogFile{pathStr='s3a://hudi_path/.hoodie/metadata/files/.files-0000_20220822121450048.log.1_0-111-75754',
fileLen=-1}
27325409 [Executor task launch worker for task 165.0 in stage 173.0 (TID
83451)] INFO org.apache.hudi.common.table.log.AbstractHoodieLogRecordReader -
Reading a data block from file
s3a://hudi_path/.hoodie/metadata/files/.files-0000_20220822121450048.log.1_0-111-75754
at instant 20220822121450048
27325409 [Executor task launch worker for task 165.0 in stage 173.0 (TID
83451)] INFO org.apache.hudi.common.table.log.AbstractHoodieLogRecordReader -
Scanning log file
HoodieLogFile{pathStr='s3a://hudi_path/.hoodie/metadata/files/.files-0000_20220822121450048.log.1_0-111-75754',
fileLen=-1}
27325479 [Executor task launch worker for task 165.0 in stage 173.0 (TID
83451)] INFO org.apache.hudi.common.table.log.AbstractHoodieLogRecordReader -
Reading a data block from file
s3a://hudi_path/.hoodie/metadata/files/.files-0000_20220822121450048.log.1_0-111-75754
at instant 20220822121450048
27325505 [Executor task launch worker for task 165.0 in stage 173.0 (TID
83451)] INFO org.apache.hudi.common.table.log.HoodieLogFormatReader - Moving
to the next reader for logfile
HoodieLogFile{pathStr='s3a://hudi_path/.hoodie/metadata/files/.files-0000_20220822121450048.log.2_0-162-81065',
fileLen=-1}
27325523 [Executor task launch worker for task 165.0 in stage 173.0 (TID
83451)] INFO org.apache.hudi.common.table.log.AbstractHoodieLogRecordReader -
Scanning log file
HoodieLogFile{pathStr='s3a://hudi_path/.hoodie/metadata/files/.files-0000_20220822121450048.log.2_0-162-81065',
fileLen=-1}
27325523 [Executor task launch worker for task 165.0 in stage 173.0 (TID
83451)] INFO org.apache.hadoop.fs.s3a.S3AInputStream - Switching to Random IO
seek policy
27325542 [Executor task launch worker for task 165.0 in stage 173.0 (TID
83451)] INFO org.apache.hudi.common.table.log.AbstractHoodieLogRecordReader -
Reading a data block from file
s3a://hudi_path/.hoodie/metadata/files/.files-0000_20220822121450048.log.2_0-162-81065
at instant 20220822085816830
27325542 [Executor task launch worker for task 165.0 in stage 173.0 (TID
83451)] INFO org.apache.hudi.common.table.log.AbstractHoodieLogRecordReader -
Merging the final data blocks
27325542 [Executor task launch worker for task 165.0 in stage 173.0 (TID
83451)] INFO org.apache.hudi.common.table.log.AbstractHoodieLogRecordReader -
Number of remaining logblocks to merge 10
27325559 [Executor task launch worker for task 165.0 in stage 173.0 (TID
83451)] INFO org.apache.hudi.common.table.log.AbstractHoodieLogRecordReader -
Number of remaining logblocks to merge 9
27325626 [Executor task launch worker for task 165.0 in stage 173.0 (TID
83451)] INFO org.apache.hudi.common.util.collection.ExternalSpillableMap -
Estimated Payload size => 3409672
27325626 [Executor task launch worker for task 165.0 in stage 173.0 (TID
83451)] INFO org.apache.hudi.common.table.log.AbstractHoodieLogRecordReader -
Number of remaining logblocks to merge 8
27325683 [Executor task launch worker for task 165.0 in stage 173.0 (TID
83451)] INFO org.apache.hudi.common.util.collection.ExternalSpillableMap -
New Estimated Payload size => 35441
27325683 [Executor task launch worker for task 165.0 in stage 173.0 (TID
83451)] INFO org.apache.hudi.common.table.log.AbstractHoodieLogRecordReader -
Number of remaining logblocks to merge 7
27325728 [Executor task launch worker for task 165.0 in stage 173.0 (TID
83451)] INFO org.apache.hudi.common.table.log.AbstractHoodieLogRecordReader -
Number of remaining logblocks to merge 6
27325778 [Executor task launch worker for task 165.0 in stage 173.0 (TID
83451)] INFO org.apache.hudi.common.table.log.AbstractHoodieLogRecordReader -
Number of remaining logblocks to merge 5
27325835 [Executor task launch worker for task 165.0 in stage 173.0 (TID
83451)] INFO org.apache.hudi.common.table.log.AbstractHoodieLogRecordReader -
Number of remaining logblocks to merge 4
27325899 [Executor task launch worker for task 165.0 in stage 173.0 (TID
83451)] INFO org.apache.hudi.common.table.log.AbstractHoodieLogRecordReader -
Number of remaining logblocks to merge 3
27325987 [Executor task launch worker for task 165.0 in stage 173.0 (TID
83451)] INFO org.apache.hudi.common.table.log.AbstractHoodieLogRecordReader -
Number of remaining logblocks to merge 2
27326086 [Executor task launch worker for task 165.0 in stage 173.0 (TID
83451)] INFO org.apache.hudi.common.table.log.AbstractHoodieLogRecordReader -
Number of remaining logblocks to merge 1
27326134 [Executor task launch worker for task 165.0 in stage 173.0 (TID
83451)] INFO org.apache.hudi.common.table.log.HoodieMergedLogRecordScanner -
Number of log files scanned => 3
27326134 [Executor task launch worker for task 165.0 in stage 173.0 (TID
83451)] INFO org.apache.hudi.common.table.log.HoodieMergedLogRecordScanner -
MaxMemoryInBytes allowed for compaction => 1073741824
27326134 [Executor task launch worker for task 165.0 in stage 173.0 (TID
83451)] INFO org.apache.hudi.common.table.log.HoodieMergedLogRecordScanner -
Number of entries in MemoryBasedMap in ExternalSpillableMap => 17075
27326134 [Executor task launch worker for task 165.0 in stage 173.0 (TID
83451)] INFO org.apache.hudi.common.table.log.HoodieMergedLogRecordScanner -
Total size in bytes of MemoryBasedMap in ExternalSpillableMap => 605155167
27326134 [Executor task launch worker for task 165.0 in stage 173.0 (TID
83451)] INFO org.apache.hudi.common.table.log.HoodieMergedLogRecordScanner -
Number of entries in BitCaskDiskMap in ExternalSpillableMap => 0
27326134 [Executor task launch worker for task 165.0 in stage 173.0 (TID
83451)] INFO org.apache.hudi.common.table.log.HoodieMergedLogRecordScanner -
Size of file spilled to disk => 0
27326134 [Executor task launch worker for task 165.0 in stage 173.0 (TID
83451)] INFO org.apache.hudi.metadata.HoodieBackedTableMetadata - Opened 3
metadata log files (dataset instant=20220822121450048, metadata
instant=20220822121450048) in 1379 ms
27326134 [Executor task launch worker for task 165.0 in stage 173.0 (TID
83451)] INFO org.apache.hudi.metadata.BaseTableMetadata - Listed file in
partition from metadata:
partition=version=1/event_date=2022-04-21/event_hour=09, #files=3
27326135 [Executor task launch worker for task 165.0 in stage 173.0 (TID
83451)] INFO org.apache.hudi.common.table.view.AbstractTableFileSystemView -
addFilesToView: NumFiles=3, NumFileGroups=3, FileGroupsCreationTime=1,
StoreTimeTaken=0
27326135 [Executor task launch worker for task 165.0 in stage 173.0 (TID
83451)] INFO org.apache.hudi.table.action.clean.CleanPlanner - 0 patterns
used to delete in partition path:version=1/event_date=2022-04-21/event_hour=09
27326135 [Executor task launch worker for task 165.0 in stage 173.0 (TID
83451)] INFO org.apache.hudi.table.action.clean.CleanPlanner - Cleaning
version=1/event_date=2022-04-21/event_hour=10, retaining latest 24 commits.
27326135 [Executor task launch worker for task 165.0 in stage 173.0 (TID
83451)] INFO org.apache.hudi.common.table.view.AbstractTableFileSystemView -
Building file system view for partition
(version=1/event_date=2022-04-21/event_hour=10)
27326135 [Executor task launch worker for task 165.0 in stage 173.0 (TID
83451)] INFO org.apache.hudi.metadata.HoodieTableMetadataUtil - Loading
latest merged file slices for metadata table partition files
27326135 [Executor task launch worker for task 165.0 in stage 173.0 (TID
83451)] INFO org.apache.hudi.common.table.view.AbstractTableFileSystemView -
Took 0 ms to read 0 instants, 0 replaced file groups
27326170 [Executor task launch worker for task 165.0 in stage 173.0 (TID
83451)] INFO org.apache.hudi.common.util.ClusteringUtils - Found 0 files in
pending clustering operations
27326170 [Executor task launch worker for task 165.0 in stage 173.0 (TID
83451)] INFO org.apache.hudi.common.table.view.AbstractTableFileSystemView -
Building file system view for partition (files)
27326195 [Executor task launch worker for task 165.0 in stage 173.0 (TID
83451)] INFO org.apache.hudi.common.table.view.AbstractTableFileSystemView -
addFilesToView: NumFiles=4, NumFileGroups=1, FileGroupsCreationTime=1,
StoreTimeTaken=0
27326195 [Executor task launch worker for task 165.0 in stage 173.0 (TID
83451)] INFO org.apache.hudi.common.table.HoodieTableMetaClient - Loading
HoodieTableMetaClient from s3a://hudi_path//.hoodie/metadata
27326232 [Executor task launch worker for task 165.0 in stage 173.0 (TID
83451)] INFO org.apache.hudi.common.table.HoodieTableConfig - Loading table
properties from s3a://hudi_path/.hoodie/metadata/.hoodie/hoodie.properties
27326266 [Executor task launch worker for task 165.0 in stage 173.0 (TID
83451)] INFO org.apache.hudi.common.table.HoodieTableMetaClient - Finished
Loading Table of type MERGE_ON_READ(version=1, baseFileFormat=HFILE) from
s3a://hudi_path//.hoodie/metadata
```
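
To check the "reading the meta table again and again" guess, one option is to count how often each task logs the `HoodieBackedTableMetadata - Opened N metadata log files ... in X ms` line. The snippet below is only a rough sketch: it assumes the executor log has exactly the message format shown above, and the function name `summarize_metadata_opens` is made up for illustration, not part of Hudi.

```python
import re
from collections import defaultdict

# Matches the "Opened N metadata log files ... in X ms" message as it appears
# in the log excerpt above (assumed format; adjust if your log layout differs).
OPENED = re.compile(
    r"\(TID (\d+)\)\] INFO org\.apache\.hudi\.metadata\.HoodieBackedTableMetadata - "
    r"Opened (\d+) metadata log files \(dataset instant=\d+, metadata instant=\d+\) "
    r"in (\d+) ms"
)


def summarize_metadata_opens(log_text):
    """Return {task_id: (number_of_opens, total_ms_spent_opening)}.

    Hypothetical helper: whitespace is collapsed first so that log records
    wrapped across several lines (as in this email) still match.
    """
    flat = " ".join(log_text.split())
    stats = defaultdict(lambda: [0, 0])
    for tid, _n_files, elapsed_ms in OPENED.findall(flat):
        stats[tid][0] += 1
        stats[tid][1] += int(elapsed_ms)
    return {tid: tuple(values) for tid, values in stats.items()}
```

If a single TID shows roughly one open per cleaned partition, that would be consistent with the metadata log files being re-scanned for every partition rather than once per task.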