sreejasahithi opened a new pull request, #8254:
URL: https://github.com/apache/ozone/pull/8254

   ## What changes were proposed in this pull request?
   
   This update introduces concurrent processing of container log files to 
improve performance. Logs are processed in parallel using a fixed thread pool, 
with each thread handling a separate log file. As log entries are read, they 
are parsed and grouped by a unique key (container ID and datanode ID).
   Key-level locks ensure thread safety when multiple threads insert data 
into the shared map concurrently.
   Once a batch accumulates a predefined number of log entries, it is 
inserted into the database.
   A synchronization mechanism ensures that only one thread at a time 
processes a batch and writes it to the DB, preventing race conditions.
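   
   The flow described above can be sketched roughly as follows. This is a 
minimal illustration and not the code in this PR: the class name 
`ContainerLogSketch`, the `containerId,datanodeId,message` line format, the 
batch size of 3, and the in-memory list standing in for the database are all 
assumptions made for the example. Per-key locking is approximated here with 
`ConcurrentHashMap.compute`, which runs its update atomically for a given key.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.TimeUnit;
import java.util.concurrent.atomic.AtomicInteger;

// Hypothetical sketch; names and formats are assumptions, not the PR's code.
class ContainerLogSketch {
  static final int BATCH_SIZE = 3; // assumed threshold for illustration

  // Key is "containerId#datanodeId"; value is the entries waiting to be stored.
  private final Map<String, List<String>> pending = new ConcurrentHashMap<>();
  private final AtomicInteger pendingCount = new AtomicInteger();
  private final Object flushLock = new Object();
  private final List<String> database = new ArrayList<>(); // stand-in for the DB

  // Each inner list stands in for one log file; one pool task handles one file.
  void processAll(List<List<String>> files, int threadCount) {
    ExecutorService pool = Executors.newFixedThreadPool(threadCount);
    for (List<String> file : files) {
      pool.execute(() -> file.forEach(this::addLine));
    }
    pool.shutdown();
    try {
      if (!pool.awaitTermination(1, TimeUnit.MINUTES)) {
        throw new IllegalStateException("log processing timed out");
      }
    } catch (InterruptedException e) {
      Thread.currentThread().interrupt();
      throw new IllegalStateException(e);
    }
    flush(); // store whatever is left below the batch threshold
  }

  // Parses an assumed "containerId,datanodeId,message" line and queues it.
  void addLine(String line) {
    String[] parts = line.split(",", 3);
    String key = parts[0] + "#" + parts[1];
    // compute() is atomic per key, so two threads never mutate one list at once.
    pending.compute(key, (k, list) -> {
      List<String> target = (list == null) ? new ArrayList<>() : list;
      target.add(parts[2]);
      return target;
    });
    if (pendingCount.incrementAndGet() >= BATCH_SIZE) {
      flush();
    }
  }

  // Only one thread at a time drains the map and writes to the "database".
  void flush() {
    synchronized (flushLock) {
      for (String key : pending.keySet()) {
        List<String> drained = pending.remove(key);
        if (drained != null) {
          database.addAll(drained);
          pendingCount.addAndGet(-drained.size());
        }
      }
    }
  }

  int storedCount() {
    synchronized (flushLock) {
      return database.size();
    }
  }
}
```

   In this sketch, calling `new ContainerLogSketch().processAll(files, 4)` 
would process the given files with four worker threads. Removing a key before 
writing its entries means later entries for that key start a fresh list, so a 
batch that is being written is never modified by another thread.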
   
   This change also adds an `ozone debug` CLI command for the parser:
   `./ozone debug container container_log_parse --parse=<path to logs folder> --thread-count=<num of threads>`
   
   ## What is the link to the Apache JIRA
   https://issues.apache.org/jira/browse/HDDS-12581
   
   ## How was this patch tested?
   Below is the workflow link that passed successfully:
   https://github.com/sreejasahithi/ozone/actions/runs/14351650273
   


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: [email protected]

For queries about this service, please contact Infrastructure at:
[email protected]

