[ https://issues.apache.org/jira/browse/KAFKA-7283?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16773683#comment-16773683 ]
ASF GitHub Bot commented on KAFKA-7283:
---------------------------------------

junrao commented on pull request #5498: KAFKA-7283: Enable lazy mmap on index files and skip sanity check for segments below recovery point
URL: https://github.com/apache/kafka/pull/5498

> mmap indexes lazily and skip sanity check for segments below recovery point
> ---------------------------------------------------------------------------
>
>                 Key: KAFKA-7283
>                 URL: https://issues.apache.org/jira/browse/KAFKA-7283
>             Project: Kafka
>          Issue Type: New Feature
>            Reporter: Zhanxiang (Patrick) Huang
>            Assignee: Zhanxiang (Patrick) Huang
>            Priority: Major
>
> This is a follow-up ticket for KIP-263.
> Currently, after startup the broker mmaps the index files, reads the length
> and the last entry of each file, and sanity checks the index files of all log
> segments in the log directory. These operations can be slow because the
> broker needs to open each index file and read data into the page cache. As a
> result, the time to restart a broker increases in proportion to the number of
> segments in the log directory.
> Per the KIP discussion, we think we can skip the sanity check for segments
> below the recovery point, since Kafka does not provide any guarantee for
> segments already flushed to disk, and sanity checking only the index file is
> of little benefit when the segment itself is also corrupted by a disk
> failure. Therefore, we can make the following changes to improve broker
> startup time:
> # Mmap the index file and populate the fields of the index object on demand,
> rather than performing costly disk operations when creating the index object
> on broker startup.
> # Skip sanity checks on indexes of segments below the recovery point.
> With these changes, after a clean shutdown the broker startup time will
> increase only in proportion to the number of partitions in the log directory,
> because only active segments are mmapped and sanity checked.
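
The two proposed changes can be illustrated with a minimal Java sketch. This is not the code from the pull request: `LazyIndex`, `StartupLoader`, the 8-byte entry size, and the size-only sanity check are hypothetical names and simplifications chosen for illustration, and "below the recovery point" is assumed here to mean a segment base offset smaller than the recovery point.

```java
import java.io.IOException;
import java.io.UncheckedIOException;
import java.nio.MappedByteBuffer;
import java.nio.channels.FileChannel;
import java.nio.file.Path;
import java.nio.file.StandardOpenOption;
import java.util.List;

/** Hypothetical index wrapper: the file is mmapped on first access, not at construction. */
final class LazyIndex {
    // Assumed entry layout for this sketch: 4-byte relative offset + 4-byte file position.
    static final int ENTRY_SIZE = 8;

    final long baseOffset;
    private final Path file;
    private MappedByteBuffer mmap; // stays null until something needs the index contents

    LazyIndex(Path file, long baseOffset) {
        // No disk access here, so creating many index objects at startup stays cheap.
        this.file = file;
        this.baseOffset = baseOffset;
    }

    synchronized boolean isLoaded() {
        return mmap != null;
    }

    /** Maps the file on the first call and reuses the mapping afterwards. */
    private synchronized MappedByteBuffer buffer() {
        if (mmap == null) {
            try (FileChannel ch = FileChannel.open(file, StandardOpenOption.READ)) {
                mmap = ch.map(FileChannel.MapMode.READ_ONLY, 0, ch.size());
            } catch (IOException e) {
                throw new UncheckedIOException(e);
            }
        }
        return mmap;
    }

    /** Number of entries, derived from the mapped length on demand. */
    int entries() {
        return buffer().limit() / ENTRY_SIZE;
    }

    /** Simplified check: the file size must be a whole number of entries. */
    void sanityCheck() {
        if (buffer().limit() % ENTRY_SIZE != 0) {
            throw new IllegalStateException("Corrupt index file: " + file);
        }
    }
}

final class StartupLoader {
    /**
     * Sanity checks only the indexes of segments at or above the recovery point
     * and returns how many were checked. Segments below it are left unmapped.
     */
    static int load(List<LazyIndex> indexes, long recoveryPoint) {
        int checked = 0;
        for (LazyIndex index : indexes) {
            if (index.baseOffset >= recoveryPoint) {
                index.sanityCheck();
                checked++;
            }
        }
        return checked;
    }
}
```

In this sketch, an index whose segment lies below the recovery point is never opened during `StartupLoader.load`; its file is mapped only if a later call such as `entries()` actually needs it.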