[ https://issues.apache.org/jira/browse/KAFKA-405?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13417913#comment-13417913 ]
Jay Kreps commented on KAFKA-405:
---------------------------------

Some of these comments are not directly related to this change, but this is just the first time I have looked at this code in detail:

- I think Log.recoverUptoLastCheckpointedHighWatermark(lastKnownHW: Long) is misnamed. I don't think the log should know anything about high watermarks, and it also isn't doing recovery (e.g. checking the validity of the log); I think it is just blindly truncating the log. Can we change it to Log.truncateTo?
- Is it possible that we need to truncate more than one segment? I.e., couldn't the segment to be truncated be something other than the last segment (unlikely with 1 GB segments, but still a problem)?
- Can we change the API: MessageSet.truncateUpto should be truncateUpTo or truncateTo.
- Can you make a wrapper for the RandomAccessFile called HighwaterMarkCheckpoint and put the logic related to that there? Intuitively, the logic for serializing a checkpoint shouldn't be mixed into ReplicaManager. Can you also document the file format? Is there a reason this can't be plain text? Also, I think a better approach to the updates would be to create a new file, write to it, and then move it over the old one; this would make the update atomic. Not sure if that is needed...
- It would be good to have a test that covers the HighwaterMarkCheckpoint file read and write. Just basic reading/writing, nothing fancy.
- Do we need to version the hw mark checkpoint file format? I.e., maybe the first line of the file is version=x or something... Not sure if that is needed, but I am paranoid about persistent formats after blowing that once and being stuck. This would let format changes be handled automatically.
- Can we fix the setters/getters in Replica.scala and make changes to the leo also update the leoUpdateTime? Currently I think it is incumbent on the caller to do these two things together, which is odd...
- I think the use of KafkaScheduler is not quite right. This is a thread pool meant to execute many tasks.
  There should really only be one for all of Kafka, not one per background thread. You should probably pass in a central instance as an argument rather than creating two new ones.
- Also, I notice that ReplicaManager.getReplica returns an Option, but then every caller needs to check the option and throw an exception if the replica is not found. Can we just have getReplica either return the Replica or throw the exception?
- I think abbreviations should be in the form updateLeo, not updateLEO, and updateIsr, not updateISR. Let's standardize that.

> Improve the high water mark maintenance to store high watermarks for all
> partitions in a single file on disk
> ------------------------------------------------------------------------------------------------------------
>
>                 Key: KAFKA-405
>                 URL: https://issues.apache.org/jira/browse/KAFKA-405
>             Project: Kafka
>          Issue Type: Bug
>    Affects Versions: 0.8
>            Reporter: Neha Narkhede
>            Assignee: Neha Narkhede
>         Attachments: kafka-405-v1.patch
>
> KAFKA-46 introduced per partition leader high watermarks. But it stores those
> in one file per partition. A more performant solution would be to store all
> high watermarks in a single file on disk
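[Editor's sketch] The HighwaterMarkCheckpoint wrapper suggested in the review — a plain-text, versioned file, updated by writing a temp file and atomically renaming it over the old one — could look roughly like the following. The class name comes from the review itself, but the exact file format (a version line, an entry count, then one "topic partition highwatermark" entry per line) and all method signatures are illustrative assumptions, not Kafka's actual implementation:

```scala
import java.io._
import java.nio.file.{Files, StandardCopyOption}

// Hypothetical sketch of the checkpoint wrapper discussed above.
// Assumed plain-text format:
//   line 1: format version (0)
//   line 2: number of entries
//   then one "topic partition highwatermark" entry per line
//   (assumes topic names contain no spaces).
class HighwaterMarkCheckpoint(path: String) {
  private val file = new File(path)

  def write(highwaterMarks: Map[(String, Int), Long]): Unit = {
    // Write to a temp file first, then atomically rename it over the old
    // file, so a crash mid-write never leaves a truncated checkpoint behind.
    val temp = new File(path + ".tmp")
    val writer = new BufferedWriter(new FileWriter(temp))
    try {
      writer.write("0"); writer.newLine()                    // format version
      writer.write(highwaterMarks.size.toString); writer.newLine()
      for (((topic, partition), hw) <- highwaterMarks) {
        writer.write(topic + " " + partition + " " + hw); writer.newLine()
      }
    } finally writer.close()
    Files.move(temp.toPath, file.toPath,
      StandardCopyOption.REPLACE_EXISTING, StandardCopyOption.ATOMIC_MOVE)
  }

  def read(): Map[(String, Int), Long] = {
    if (!file.exists()) return Map.empty
    val source = scala.io.Source.fromFile(file)
    try {
      val lines = source.getLines().toList
      // Dispatch on the version line, so future format changes can be
      // handled automatically as the review suggests.
      lines.head.trim.toInt match {
        case 0 =>
          val expected = lines(1).trim.toInt
          val entries = lines.drop(2).map { line =>
            val parts = line.split(" ")
            (parts(0), parts(1).toInt) -> parts(2).toLong
          }.toMap
          require(entries.size == expected, "corrupt checkpoint file")
          entries
        case v => throw new IOException("Unrecognized checkpoint version " + v)
      }
    } finally source.close()
  }
}
```

The atomic-rename step is what makes the update crash-safe: readers only ever see either the complete old file or the complete new one, never a partial write.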
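[Editor's sketch] Two of the smaller cleanups in the review — making every leo update also stamp leoUpdateTime, and having getReplica throw instead of returning an Option — could look roughly like this. The class shapes, field names, and the exception type are illustrative assumptions, not the actual Replica.scala or ReplicaManager code:

```scala
// Hypothetical sketch: a custom setter couples the two updates, so callers
// can no longer update the leo and forget the timestamp.
class Replica(val topic: String, val partitionId: Int) {
  private var _logEndOffset: Long = 0L
  private var _leoUpdateTimeMs: Long = -1L

  def logEndOffset: Long = _logEndOffset

  // Assigning to logEndOffset also stamps leoUpdateTimeMs.
  def logEndOffset_=(offset: Long): Unit = {
    _logEndOffset = offset
    _leoUpdateTimeMs = System.currentTimeMillis
  }

  def leoUpdateTimeMs: Long = _leoUpdateTimeMs
}

class ReplicaNotFoundException(msg: String) extends RuntimeException(msg)

class ReplicaManager {
  private val replicas =
    scala.collection.mutable.Map[(String, Int), Replica]()

  def addReplica(r: Replica): Unit =
    replicas((r.topic, r.partitionId)) = r

  // Throws instead of returning Option, so every call site no longer has
  // to repeat the same "None => throw" boilerplate.
  def getReplica(topic: String, partition: Int): Replica =
    replicas.getOrElse((topic, partition),
      throw new ReplicaNotFoundException(
        "No replica for " + topic + "-" + partition))
}
```

This is the standard trade-off the review raises: an Option return pushes error handling to every caller, while throwing centralizes it when "replica missing" is always an error at the call sites.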