kowshik commented on pull request #11345:
URL: https://github.com/apache/kafka/pull/11345#issuecomment-934133766


   @ccding During a call to `log.flush()`, we record
[here](https://github.com/apache/kafka/blob/1d3b96389b325520648d29b6363941f50e5b6d35/core/src/main/scala/kafka/log/UnifiedLog.scala#L1522)
the offset up to which the log was flushed. A subsequent `flush()` is therefore
a no-op, [due to this check](https://github.com/apache/kafka/blob/1d3b96389b325520648d29b6363941f50e5b6d35/core/src/main/scala/kafka/log/UnifiedLog.scala#L1517),
unless the `logEndOffset` has advanced. This means that empty active segments
shouldn't add any burden to the `log.flush()` operation on each call, unless
new empty active segments are created between two calls, which is quite
uncommon (see [relevant comment](https://github.com/apache/kafka/pull/11345#discussion_r719132048)).
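
As a rough illustration of the no-op behaviour described above, here is a minimal, self-contained model. It is not the actual `UnifiedLog` code: `FlushModel`, `append`, and `diskFlushes` are hypothetical names chosen for this sketch, and the `offset > recoveryPoint` guard is only my simplified reading of the linked check.

```scala
// Simplified model of the flush bookkeeping; NOT the real UnifiedLog.
final class FlushModel {
  private var logEndOffset: Long = 0L   // next offset to be written
  private var recoveryPoint: Long = 0L  // offset up to which data is known to be flushed
  var diskFlushes: Int = 0              // counts flushes that did real work (hypothetical counter)

  def append(numMessages: Int): Unit = logEndOffset += numMessages

  // Flush everything written so far.
  def flush(): Unit = flush(logEndOffset)

  // Flush up to `offset`; a no-op if that range was already flushed.
  def flush(offset: Long): Unit =
    if (offset > recoveryPoint) {
      diskFlushes += 1        // stand-in for syncing the segments to disk
      recoveryPoint = offset  // remember how far we have flushed
    }
}

object FlushModelDemo {
  def main(args: Array[String]): Unit = {
    val log = new FlushModel
    log.append(10)
    log.flush()   // does real work
    log.flush()   // no-op: logEndOffset has not advanced
    log.append(1)
    log.flush()   // does real work again
    println(log.diskFlushes) // prints 2
  }
}
```

In this model, calling `flush()` repeatedly costs nothing once the remembered offset has caught up with `logEndOffset`.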
   
   Furthermore, we typically don't configure the log to be flushed periodically
or during appends. Even when we do, a call to `log.flush()` without an offset
parameter was always intended to flush all data written to the log. It is
just a corner case that "all data written to the log" must also include an
empty active segment for correctness, since the `logEndOffset` is derived from
it [during recovery](https://github.com/apache/kafka/blob/db1f581da7f3440cfd5be93800b4a9a2d7327a35/core/src/main/scala/kafka/log/LogLoader.scala#L401).
Some related details from the documentation are below:
   
   (1) Periodic flush: 
https://github.com/apache/kafka/blob/0fe4e24c095f57606a9c33a3135f2f81a1fb0285/core/src/main/scala/kafka/log/LogManager.scala#L1244-L1246
   
   Here, `log.flush()` executes only when `timeSinceLastFlush >=
log.config.flushMs`. The `flush.ms` configuration is documented
[here](https://github.com/apache/kafka/blob/a46b82bea9abbd08e550d985f87e79a6d912a9ab/clients/src/main/java/org/apache/kafka/common/config/TopicConfig.java#L58-L63)
and has a default value of `Long.MaxValue` (very high!), defined
[here](https://github.com/apache/kafka/blob/85548acafbc799ee371b531802580fcde170ddc1/core/src/main/scala/kafka/server/KafkaConfig.scala#L135).
The doc recommends not overriding this config unless needed.
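
The periodic check can be modelled as follows. This is a sketch, not the `LogManager` code: `PeriodicFlushCheck`, `shouldFlush`, `nowMs`, and `lastFlushTimeMs` are hypothetical names, and `DefaultFlushMs` simply mirrors the `Long.MaxValue` default mentioned above.

```scala
// Simplified model of the periodic flush condition; NOT the real LogManager.
object PeriodicFlushCheck {
  // Mirrors the default discussed above (assumption: used here as a plain constant).
  val DefaultFlushMs: Long = Long.MaxValue

  // True when enough time has passed since the last flush.
  def shouldFlush(nowMs: Long, lastFlushTimeMs: Long, flushMs: Long = DefaultFlushMs): Boolean = {
    val timeSinceLastFlush = nowMs - lastFlushTimeMs
    timeSinceLastFlush >= flushMs
  }

  def main(args: Array[String]): Unit = {
    val oneDayMs = 24L * 60 * 60 * 1000
    // With the default, even a full day without flushing does not trigger a flush.
    println(shouldFlush(nowMs = oneDayMs, lastFlushTimeMs = 0L))                  // false
    // With an explicit 1-second override, the same gap does trigger one.
    println(shouldFlush(nowMs = oneDayMs, lastFlushTimeMs = 0L, flushMs = 1000L)) // true
  }
}
```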
   
   (2) Flush during appends:
   
   
https://github.com/apache/kafka/blob/db42afd6e24ef4291390b4d1c1f10758beedefed/core/src/main/scala/kafka/log/UnifiedLog.scala#L933
   
   Here, `flush()` executes only when `unflushedMessages >=
config.flushInterval`. The explanation is similar to the one above. The
`flush.messages` configuration is documented
[here](https://github.com/apache/kafka/blob/a46b82bea9abbd08e550d985f87e79a6d912a9ab/clients/src/main/java/org/apache/kafka/common/config/TopicConfig.java#L50-L56)
and has a default value of `Long.MaxValue` (very high!), defined
[here](https://github.com/apache/kafka/blob/85548acafbc799ee371b531802580fcde170ddc1/core/src/main/scala/kafka/server/KafkaConfig.scala#L133).
The doc recommends not overriding this config unless needed.
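
A similar sketch for the append path, again under stated assumptions: `AppendFlushModel`, `append`, and `flushes` are hypothetical names, and resetting the counter to zero is a simplification of however the real log tracks unflushed messages.

```scala
// Simplified model of "flush during append"; NOT the real UnifiedLog.
final class AppendFlushModel(flushInterval: Long = Long.MaxValue) {
  private var unflushedMessages: Long = 0L
  var flushes: Int = 0 // counts flushes triggered by appends (hypothetical counter)

  def append(numMessages: Int): Unit = {
    unflushedMessages += numMessages
    if (unflushedMessages >= flushInterval) {
      flushes += 1            // stand-in for the real flush
      unflushedMessages = 0L  // everything appended so far is now flushed
    }
  }
}

object AppendFlushDemo {
  def main(args: Array[String]): Unit = {
    val withDefault = new AppendFlushModel()                // never flushes in practice
    val eager = new AppendFlushModel(flushInterval = 100L)  // flushes every 100 messages
    (1 to 5).foreach { _ => withDefault.append(50); eager.append(50) }
    println(withDefault.flushes) // prints 0
    println(eager.flushes)       // prints 2
  }
}
```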
   
   
   


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.


