[GitHub] [iceberg] kingeasternsun opened a new pull request #3876: ⚡ Speed up TableMigration By collect the DafaFile In parallel

GitBox Mon, 10 Jan 2022 23:18:55 -0800


kingeasternsun opened a new pull request #3876:
URL: https://github.com/apache/iceberg/pull/3876



   ⚡ In TableMigration, the main performance bottleneck is collecting the
data files, because the Hadoop input files are read one by one, so I think we
can do this in parallel. In this PR I use the
`parquet.metadata.read.parallelism` configuration to control the parallelism.
   If `parquet.metadata.read.parallelism` is not set or equals 1, the original
non-parallel API is used.
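
The idea can be illustrated with a minimal sketch. This is not the code from the PR, which is not shown in this message: the class `ParallelCollectSketch`, the method `collect`, and the `readOne` callback are hypothetical names, and the parallelism is assumed to be passed in as a plain `int` that the caller has already parsed from `parquet.metadata.read.parallelism`. With a parallelism of 1 or less the files are read sequentially; otherwise a fixed thread pool reads them concurrently and results are returned in input order.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.concurrent.Callable;
import java.util.concurrent.ExecutionException;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.Future;
import java.util.function.Function;

// Hypothetical sketch; names are illustrative and not taken from Iceberg.
final class ParallelCollectSketch {
  private ParallelCollectSketch() {}

  // Applies readOne to every path. parallelism <= 1 keeps the sequential path;
  // larger values read the files on a fixed-size thread pool.
  static <T> List<T> collect(List<String> paths, int parallelism, Function<String, T> readOne) {
    List<T> results = new ArrayList<>(paths.size());
    if (parallelism <= 1) {
      for (String path : paths) {
        results.add(readOne.apply(path));
      }
      return results;
    }

    ExecutorService pool = Executors.newFixedThreadPool(parallelism);
    try {
      // Submit one task per file and remember the futures in input order.
      List<Future<T>> futures = new ArrayList<>(paths.size());
      for (String path : paths) {
        Callable<T> task = () -> readOne.apply(path);
        futures.add(pool.submit(task));
      }
      // Waiting on the futures in submission order preserves the input order.
      for (Future<T> future : futures) {
        results.add(future.get());
      }
      return results;
    } catch (InterruptedException e) {
      Thread.currentThread().interrupt();
      throw new RuntimeException("Interrupted while reading file metadata", e);
    } catch (ExecutionException e) {
      throw new RuntimeException("Failed to read file metadata", e.getCause());
    } finally {
      pool.shutdownNow();
    }
  }
}
```

In a real migration, `readOne` would stand in for whatever reads a single file's footer or metrics; a call such as `ParallelCollectSketch.collect(paths, 8, reader)` would then read up to eight files at a time.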




