geserdugarov opened a new pull request, #12516:
URL: https://github.com/apache/hudi/pull/12516

   ### Change Logs
   
   Currently, `hoodie.populate.meta.fields` is not supported in Flink. This
config is relevant for append mode, so this MR adds support for it in that case.
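
For illustration only, here is a minimal sketch of how a Flink SQL table definition might carry this option. It assumes that append mode is selected with `write.operation = insert` on a copy-on-write table. The table name, column subset, path, and the helper `hudi_append_ddl` are hypothetical and not part of this MR. Depending on the Hudi version, a record key may also be required.

```python
def hudi_append_ddl(table, columns, path, extra_options=None):
    """Build a Flink SQL CREATE TABLE statement for a Hudi append-mode sink.

    Hypothetical helper for illustration; it only formats a string.
    """
    options = {
        "connector": "hudi",
        "path": path,
        "table.type": "COPY_ON_WRITE",
        # Assumption: 'insert' is the operation that puts the writer in append mode.
        "write.operation": "insert",
        # The option this MR adds Flink support for.
        "hoodie.populate.meta.fields": "false",
    }
    options.update(extra_options or {})
    cols = ",\n  ".join(f"{name} {sql_type}" for name, sql_type in columns)
    opts = ",\n  ".join(f"'{key}' = '{value}'" for key, value in options.items())
    return f"CREATE TABLE {table} (\n  {cols}\n) WITH (\n  {opts}\n)"


# Hypothetical subset of the TPC-H lineitem schema and a placeholder path.
ddl = hudi_append_ddl(
    table="lineitem_hudi",
    columns=[
        ("l_orderkey", "BIGINT"),
        ("l_quantity", "DECIMAL(15, 2)"),
        ("l_shipdate", "DATE"),
    ],
    path="file:///tmp/hudi/lineitem",
)
print(ddl)
```

The resulting string could be passed to a Flink SQL client or a table environment. The sketch itself does not depend on Flink or Hudi.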
   
   **Before**, master, 3cb874fd53390b4465a8a8b9acd615b1b7f362cf:
   
![1-append-master-3cb874fd](https://github.com/user-attachments/assets/b609dcc4-8c4c-474d-a16a-3026f4703ecf)
   
   **After**:
   
![2-append-populate-false](https://github.com/user-attachments/assets/e5b5d037-c13b-461e-b72e-fc1066ef0cf2)
   
   
   |  | Before | After |
   | -- | -- | -- |
   | CPU samples | 79.8K | 67.9K |
   | Total time of 60M rows writing | 108 s | 93 s |
   
   TPC-H lineitem table is used for profiling.
   
   Writes in append mode for Flink are about 14% faster when
`hoodie.populate.meta.fields = false`.
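
As a quick sanity check, the 14% figure can be reproduced from the table above. The sketch below uses only the numbers reported there. The helper name `relative_reduction` is made up for this example, and the throughput line is a derived figure, not one measured in this MR.

```python
# Figures taken from the Before/After table in this description.
before_seconds, after_seconds = 108, 93
before_samples, after_samples = 79.8e3, 67.9e3


def relative_reduction(before, after):
    """Fraction by which `after` is smaller than `before`."""
    return (before - after) / before


time_reduction = relative_reduction(before_seconds, after_seconds)
sample_reduction = relative_reduction(before_samples, after_samples)
# The same result expressed as rows written per second.
throughput_gain = before_seconds / after_seconds - 1

print(f"write time reduced by {time_reduction:.1%}")     # 13.9%
print(f"CPU samples reduced by {sample_reduction:.1%}")  # 14.9%
print(f"throughput increased by {throughput_gain:.1%}")  # 16.1%
```

Both the wall-clock time and the CPU sample count drop by roughly 14 to 15%, which is consistent with the headline number.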
   
   A possible next optimization for this scenario is to reconsider whether
Bloom filters are really needed in this use case, because they account for 16%
of CPU after the optimization in this MR:
   
![3-append-bloom-filters](https://github.com/user-attachments/assets/5a59cd92-1f87-41fd-8747-fa47b4f191d3)
   
   
   ### Impact
   
   No
   
   ### Risk level (write none, low medium or high below)
   
   Low
   
   ### Documentation Update
   
   Not needed. The "All Configurations" page does not mention that
`hoodie.populate.meta.fields` is unsupported in Flink.
   
   ### Contributor's checklist
   
   - [x] Read through [contributor's 
guide](https://hudi.apache.org/contribute/how-to-contribute)
   - [x] Change Logs and Impact were stated clearly
   - [x] Adequate tests were added if applicable
   - [ ] CI passed
   


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
