cmccabe commented on pull request #9008: URL: https://github.com/apache/kafka/pull/9008#issuecomment-662745982
Thanks for this, @mumrah. I took a look at the overall approach with the `RecordsWriter` and it looks reasonable. Do we need `RecordsReader`? Seems like we could just add the `readRecords` method to `ByteBufferAccessor`. I do think `RecordsWriter` needs to be a separate class from `ByteBufferAccessor` -- it is doing something quite different, after all. But I'm not sure that the generated code needs to know about `RecordsWriter`. If we add a `writeRecords` method to `Writable` and a simple implementation to `ByteBufferAccessor`, we can avoid the downcast in the generated code. That also suggests that maybe `RecordsWriter` could be in `org.apache.kafka.common.record`? Maybe. It seems like `Writable` should have a `Writable#close` method, in case there's something the writable needs to do when we're done adding things. Actually it should just extend AutoCloseable so that the compiler will complain if we don't close it appropriately. Then that can be a no-op for `ByteBufferAccessor` but call flush when using `RecordsWriter`. Using `ByteBufferOutputStream` is wasteful when you have to do a lot of doublings. When you do a doubling, you end up copying a lot of data that has already been flushed (and hence sent to the Sender). You're making a new buffer to contain it, but why? Nobody will ever read that part of the new buffer. What the Sender will read is the part of the old (pre-doubling) buffer that contained that data. What you really want is to get rid of the ByteBufferOutputStream and just manage the buffer yourself here. Then, when you need to enlarge, you can just copy the data that's live and not the old, already flushed data. The above could be done in a follow-on if you want. I don't think it should block the merge ---------------------------------------------------------------- This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. For queries about this service, please contact Infrastructure at: us...@infra.apache.org