[ 
https://issues.apache.org/jira/browse/KAFKA-615?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13729010#comment-13729010
 ] 

Jun Rao commented on KAFKA-615:
-------------------------------

Thanks for patch v5. Some comments:

50. Log:
50.1 recoveryLog(): It seems that recoveryPoint can be > lastOffset due to 
truncation on unclean shutdown. See the comment in 52.2.

50.2 The comment in the following code is no longer correct since it's not just 
recovering the active segment. Also, it seems that if we hit the exception, we 
should delete the rest of the segments after resetting the current segment to 
startOffset (see the sketch after the snippet).
        } catch {
          case e: InvalidOffsetException =>
            val startOffset = curr.baseOffset
            warn("Found invalid offset during recovery of the active segment for topic partition " +
                 dir.getName + ". Deleting the segment and " +
                 "creating an empty one with starting offset " + startOffset)
            // truncate the active segment to its starting offset
            curr.truncateTo(startOffset)
        }
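
As a rough sketch of that suggestion (not the actual patch; unflushed and 
deleteSegment here are assumed names for the iterator over the segments past 
the recovery point and for the existing segment-deletion helper), the handler 
would reset the current segment and then drop everything after it:

        } catch {
          case e: InvalidOffsetException =>
            val startOffset = curr.baseOffset
            warn("Found invalid offset during recovery for topic partition " + dir.getName +
                 ". Deleting this segment and every segment after it, and creating an " +
                 "empty one with starting offset " + startOffset)
            // reset the corrupt segment to its starting offset ...
            curr.truncateTo(startOffset)
            // ... and discard all remaining unflushed segments, since nothing
            // after a corrupt offset can be trusted (assumed helper names)
            unflushed.foreach(deleteSegment)
        }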

50.3 The log flusher scheduler is multi-threaded. I am wondering whether that 
guarantees that flushes on the same log complete in recovery point order, which 
is important.
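
One way to sidestep the ordering question (just a sketch of the idea, not 
necessarily what the patch does; lock, recoveryPoint and logSegments are names 
I am assuming the log already has) is to make flush() a no-op for anything at 
or below the current recovery point and to only ever advance the recovery 
point under the log's lock, so a late-completing flush for an older offset can 
never move the checkpoint backwards:

    def flush(offset: Long) {
      // nothing to do if we have already flushed up to (or past) this offset
      if (offset <= this.recoveryPoint)
        return
      // flush the segments in (recoveryPoint, offset]
      for (segment <- logSegments(this.recoveryPoint, offset))
        segment.flush()
      lock synchronized {
        // only ever move the recovery point forward
        if (offset > this.recoveryPoint)
          this.recoveryPoint = offset
      }
    }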

51. LogSegment.recover(): the comment on the return value is incorrect. We 
return the number of truncated bytes, not the number of messages.
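
For example, the scaladoc could read something like this (wording only a 
suggestion, assuming the method returns an Int count of bytes):

    /**
     * Run recovery on this segment, rebuilding the index and truncating any
     * invalid bytes from the end of the log.
     *
     * @return The number of bytes truncated from the log (bytes, not messages)
     */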

52. ReplicaManager:
52.1 The checkpointing of recovery point can be done once per LeaderAndIsr 
request, not per partition.
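
In other words (a sketch only; makeLeaderOrFollower and 
checkpointRecoveryPointOffsets are names I am assuming, not taken from the 
patch), the checkpoint call would move out of the per-partition loop and run 
once after all partitions in the request have been handled:

    // inside ReplicaManager.becomeLeaderOrFollower
    for ((topicAndPartition, partitionStateInfo) <- leaderAndIsrRequest.partitionStateInfos) {
      // apply the leader/follower state change for this partition only;
      // no recovery point checkpointing inside the loop
      makeLeaderOrFollower(topicAndPartition, partitionStateInfo)
    }
    // checkpoint the recovery points once for the whole LeaderAndIsr request
    logManager.checkpointRecoveryPointOffsets()
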
52.2 There is a corner case that I am not sure how to handle. Suppose that we 
truncate a log and crash immediately, before flushing the recovery points. 
During recovery, a recovery point may then be larger than logEndOffset. 
However, the log may still need recovery since we don't know whether the flush 
of the truncated data succeeded or not. So, perhaps in recoveryLog(), if 
(lastOffset <= this.recoveryPoint), we should force recover the last segment?
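
Concretely, something along these lines in recoveryLog() (a sketch of the 
proposal, with activeSegment and config.maxMessageSize assumed to be what the 
log already exposes):

    // we may have truncated and then crashed before the recovery point
    // checkpoint was rewritten, so the checkpoint can lie beyond the actual
    // end of the log; in that case recover the last segment unconditionally
    if (lastOffset <= this.recoveryPoint) {
      warn("Recovery point " + this.recoveryPoint + " is beyond the last offset " + lastOffset +
           " for log " + dir.getName + "; force recovering the last segment")
      activeSegment.recover(config.maxMessageSize)
    }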

53. Could you verify that the basic system test works?

                
> Avoid fsync on log segment roll
> -------------------------------
>
>                 Key: KAFKA-615
>                 URL: https://issues.apache.org/jira/browse/KAFKA-615
>             Project: Kafka
>          Issue Type: Bug
>            Reporter: Jay Kreps
>            Assignee: Neha Narkhede
>         Attachments: KAFKA-615-v1.patch, KAFKA-615-v2.patch, 
> KAFKA-615-v3.patch, KAFKA-615-v4.patch, KAFKA-615-v5.patch, KAFKA-615-v6.patch
>
>
> It still isn't feasible to run without an application level fsync policy. 
> This is a problem: fsync locks the file, and tuning such a policy so that 
> flushes aren't so frequent that seeks reduce throughput, yet not so 
> infrequent that each fsync writes so much data that there is a noticeable 
> jump in latency, is very challenging.
> The remaining problem is the way that log recovery works. Our current policy 
> is that if a clean shutdown occurs we do no recovery. If an unclean shutdown 
> occurs we recover the last segment of all logs. To make this correct we need 
> to ensure that each segment is fsync'd before we create a new segment. Hence 
> the fsync during roll.
> Obviously, if the fsync during roll is the only time fsync occurs, then it 
> will potentially write out the entire segment, which for a 1GB segment at 
> 50MB/sec might take on the order of 20 seconds. The goal of this JIRA is to 
> eliminate this and make it possible to run with no application-level fsyncs 
> at all, depending entirely on replication and background writeback for 
> durability.

