sreejasahithi opened a new pull request, #8254: URL: https://github.com/apache/ozone/pull/8254
## What changes were proposed in this pull request?

This update introduces concurrent processing of container log files to improve performance. Logs are processed in parallel using a fixed thread pool, with each thread handling a separate log file. As log entries are read, they are parsed and grouped by a unique key (container ID and datanode ID). Key-level locks ensure thread safety when multiple threads insert data into the shared map concurrently. Once a batch accumulates a predefined number of log entries, it is processed and inserted into the database. A synchronization mechanism ensures that only one thread at a time processes a batch and inserts it into the DB, preventing race conditions.

This PR also adds an ozone debug CLI command for the same:

`./ozone debug container container_log_parse --parse=<path to logs folder> --thread-count=<num of threads>`

## What is the link to the Apache JIRA

https://issues.apache.org/jira/browse/HDDS-12581

## How was this patch tested?

The following workflow run passed successfully: https://github.com/sreejasahithi/ozone/actions/runs/14351650273
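
For readers who want a concrete picture of the concurrency scheme described under "What changes were proposed", here is a minimal sketch. It is not the code from this PR. The class name `ContainerLogParseSketch`, the `containerId|datanodeId|state` line format, and the in-memory list standing in for the database are all assumptions made for illustration. Key-level locking is approximated with `ConcurrentHashMap.compute`, which runs atomically per key, and a single monitor serializes batch flushes.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.Future;
import java.util.concurrent.atomic.AtomicInteger;

// Hypothetical sketch; names and log format are assumptions, not the PR's code.
class ContainerLogParseSketch {
  // Entries waiting to be flushed, grouped by "containerId#datanodeId".
  private final ConcurrentHashMap<String, List<String>> pendingByKey = new ConcurrentHashMap<>();
  private final AtomicInteger pendingCount = new AtomicInteger();
  // Stand-in for the real database; only touched while holding dbLock.
  private final List<String> db = new ArrayList<>();
  private final Object dbLock = new Object();
  private final int batchSize;

  ContainerLogParseSketch(int batchSize) {
    this.batchSize = batchSize;
  }

  // Parses one line in the assumed "containerId|datanodeId|state" format.
  void processLine(String line) {
    String[] fields = line.split("\\|");
    if (fields.length != 3) {
      return; // skip malformed lines
    }
    String key = fields[0] + "#" + fields[1];
    // compute() is atomic per key, so concurrent appends to one key are safe.
    pendingByKey.compute(key, (k, entries) -> {
      if (entries == null) {
        entries = new ArrayList<>();
      }
      entries.add(fields[2]);
      return entries;
    });
    if (pendingCount.incrementAndGet() >= batchSize) {
      flush();
    }
  }

  // Only one thread at a time drains the map into the "database".
  void flush() {
    synchronized (dbLock) {
      for (String key : pendingByKey.keySet()) {
        // remove() hands this thread exclusive ownership of the list.
        List<String> entries = pendingByKey.remove(key);
        if (entries == null) {
          continue;
        }
        for (String state : entries) {
          db.add(key + "=" + state);
        }
        pendingCount.addAndGet(-entries.size());
      }
    }
  }

  // Processes each "file" (a list of lines) on its own pool thread.
  static List<String> parseAll(List<List<String>> files, int threadCount, int batchSize) {
    ContainerLogParseSketch sketch = new ContainerLogParseSketch(batchSize);
    ExecutorService pool = Executors.newFixedThreadPool(threadCount);
    try {
      List<Future<?>> futures = new ArrayList<>();
      for (List<String> file : files) {
        futures.add(pool.submit(() -> file.forEach(sketch::processLine)));
      }
      for (Future<?> future : futures) {
        future.get(); // wait for completion and surface task failures
      }
    } catch (Exception e) {
      throw new RuntimeException(e);
    } finally {
      pool.shutdown();
    }
    sketch.flush(); // write whatever is left after the last full batch
    synchronized (sketch.dbLock) {
      return new ArrayList<>(sketch.db);
    }
  }
}
```

Calling `ContainerLogParseSketch.parseAll(files, threadCount, batchSize)` with one inner list per log file returns the rows that would have been written. The order of rows coming from different files is not deterministic, since the files are read concurrently.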
