Fixed the issue Sriram mentioned. The code review, JIRA and KIP are updated.

Below is a detailed description of the scenarios:
1. On a clean shutdown, the last log file is truncated to its real size,
since the close() function of FileMessageSet calls trim().
2. After a crash, the restart goes through recover(), which truncates the
last log file to its real size (and moves the position to the end of the
file).
3. When the service starts and opens an existing file:
a. The LogSegment constructor that has NO "preallocate" parameter is used.
b. In FileMessageSet, "end" is then Int.MaxValue, so
"channel.position(math.min(channel.size().toInt, end))" sets the position to
the end of the file.
c. If recovery is needed, the recover function truncates the file to the end
of the valid data and also moves the position there.

4. When the service is running and needs to create a new log segment and a new
FileMessageSet:

a. If preallocate = true
i. The "end" in FileMessageSet is 0 and the file size is "initFileSize", so
"channel.position(math.min(channel.size().toInt, end))" sets the position
to 0.

b. Else if preallocate = false
i. Backward compatible: the "end" in FileMessageSet is Int.MaxValue and the
file size is "0", so "channel.position(math.min(channel.size().toInt, end))"
sets the position to 0.
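
For illustration, scenarios 1, 3 and 4 can be sketched in Scala as below. This
is only a sketch and not the code in the patch: the object name, the helpers
openChannel and trimAndClose, and the fileAlreadyExists flag are made up here,
and only the channel.position(...) expression is taken from the description
above.

```scala
import java.io.{File, RandomAccessFile}
import java.nio.ByteBuffer
import java.nio.channels.FileChannel

object PreallocateSketch {
  // Hypothetical helper (not Kafka's API): open a segment file and place the
  // channel position with the expression quoted in the scenarios.
  def openChannel(file: File,
                  fileAlreadyExists: Boolean,
                  initFileSize: Int,
                  preallocate: Boolean): FileChannel = {
    val raf = new RandomAccessFile(file, "rw")
    val newPreallocated = !fileAlreadyExists && preallocate
    // Only a brand-new file is grown to initFileSize up front (scenario 4a).
    if (newPreallocated) raf.setLength(initFileSize.toLong)
    // New preallocated file: end = 0. Existing file or no preallocation:
    // end = Int.MaxValue (scenarios 3b and 4b).
    val end = if (newPreallocated) 0 else Int.MaxValue
    val channel = raf.getChannel
    // min(size, end) is 0 for any new file and the file size for an existing one.
    channel.position(math.min(channel.size().toInt, end))
    channel
  }

  // Scenario 1: on a clean shutdown, cut the file back to the bytes actually
  // written, so the next open sees size == end of valid data.
  def trimAndClose(channel: FileChannel, validBytes: Long): Unit = {
    channel.truncate(validBytes)
    channel.close()
  }
}
```

With this sketch, a new preallocated file starts at position 0 with size
initFileSize, and after trimAndClose a reopen as an existing file starts at
the end of the written data.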

https://cwiki.apache.org/confluence/display/KAFKA/KIP-20+-+Enable+log+preallocate+to+improve+consume+performance+under+windows+and+some+old+Linux+file+system
https://issues.apache.org/jira/browse/KAFKA-1646 
https://reviews.apache.org/r/33204/diff/2/ 

Thanks, Honghai Chen
http://aka.ms/kafka 
http://aka.ms/manifold 

-----Original Message-----
From: Honghai Chen 
Sent: Wednesday, April 22, 2015 11:12 AM
To: dev@kafka.apache.org
Subject: RE: [DISCUSS] KIP 20 Enable log preallocate to improve consume 
performance under windows and some old Linux file system

Hi Sriram,
        One line of code was missing; I will update the code review board and
the KIP soon.
        LogSegment and FileMessageSet must use different constructors for an
existing file and a new file; then the code
"channel.position(math.min(channel.size().toInt, end))" makes sure the
position is at the end of an existing file.

Thanks, Honghai Chen 

-----Original Message-----
From: Jay Kreps [mailto:jay.kr...@gmail.com]
Sent: Wednesday, April 22, 2015 5:22 AM
To: dev@kafka.apache.org
Subject: Re: [DISCUSS] KIP 20 Enable log preallocate to improve consume 
performance under windows and some old Linux file system

My understanding of the patch is that clean shutdown truncates the file back to 
its true size (and reallocates it on startup). Hard crash is handled by the 
normal recovery which should truncate off the empty portion of the file.

On Tue, Apr 21, 2015 at 10:52 AM, Sriram Subramanian < 
srsubraman...@linkedin.com.invalid> wrote:

> Could you describe how recovery works in this mode? Say, we had a 250 
> MB preallocated segment and we wrote till 50MB and crashed. Till what 
> point do we recover? Also, on startup, how is the append end pointer 
> set even on a clean shutdown? How does the FileChannel end position 
> get set to 50 MB instead of 250 MB? The existing code might just work 
> for it but explaining that would be useful.
>
> On 4/21/15 9:40 AM, "Neha Narkhede" <n...@confluent.io> wrote:
>
> >+1. I've tried this on Linux and it helps reduce the spikes in append
> >(and hence producer) latency for high throughput writes. I am not entirely
> >sure why but my suspicion is that in the absence of preallocation,
> >you see spikes when writes need to happen faster than the time it takes
> >Linux to allocate the next block to the file.
> >
> >It will be great to see some performance test results too.
> >
> >On Tue, Apr 21, 2015 at 9:23 AM, Jay Kreps <jay.kr...@gmail.com> wrote:
> >
> >> I'm also +1 on this. The change is quite small and may actually 
> >>help perf  on Linux as well (we've never tried this).
> >>
> >> I have a lot of concerns on testing the various failure conditions 
> >> but I think since it will be off by default the risk is not too high.
> >>
> >> -Jay
> >>
> >> On Mon, Apr 20, 2015 at 6:58 PM, Honghai Chen 
> >><honghai.c...@microsoft.com>
> >> wrote:
> >>
> >> > I wrote a KIP for this after some discussion on KAFKA-1646.
> >> > https://issues.apache.org/jira/browse/KAFKA-1646
> >> >
> >> > https://cwiki.apache.org/confluence/display/KAFKA/KIP-20+-+Enable+log+preallocate+to+improve+consume+performance+under+windows+and+some+old+Linux+file+system
> >> > The RB is here: https://reviews.apache.org/r/33204/diff/
> >> >
> >> > Thanks, Honghai
> >> >
> >> >
> >>
> >
> >
> >
> >--
> >Thanks,
> >Neha
>
>
