Yeah if we understand the optimal policy for a config we always want to set it automatically. In this case I don't think we do yet, but down the road that could be nice. I think for now we should consider this option experimental to give people a chance to try it out.
-Jay

On Wed, Apr 22, 2015 at 7:32 PM, Honghai Chen <honghai.c...@microsoft.com> wrote:

> Hi Roshan,
> Use the 'auto' value maybe will break the rule and mess up the
> configuration. @Jay, any thoughts?
>
> Thanks, Honghai Chen
>
> -----Original Message-----
> From: Sriharsha Chintalapani [mailto:harsh...@fastmail.fm]
> Sent: Thursday, April 23, 2015 6:27 AM
> To: dev@kafka.apache.org; Roshan Naik
> Subject: Re: [DISCUSS] KIP 20 Enable log preallocate to improve consume
> performance under windows and some old Linux file system
>
> +1 (non-binding).
>
> --
> Harsha
>
> On April 22, 2015 at 2:52:12 PM, Roshan Naik (ros...@hortonworks.com)
> wrote:
>
> I see that it is safe to keep it this off by default due to some concerns.
> Eventually, for settings such as this whose 'preferred' value is platform
> specific (or based on other criteria), it might be worth considering
> having a default value that is not a constant but an 'auto' value .. When
> kafka boots up it can automatically use the preferred value. Ofcourse it
> would have to documented as to what auto means for a given platform.
>
> -roshan
>
> On 4/22/15 1:21 PM, "Jakob Homan" <jgho...@gmail.com> wrote:
>
> >+1. This is an important performance fix for Windows-based clusters.
> >
> >-Jakob
> >
> >On 22 April 2015 at 03:25, Honghai Chen <honghai.c...@microsoft.com> wrote:
> >> Fix the issue Sriram mentioned. Code review and jira/KIP updated.
> >>
> >> Below are detail description for the scenarios:
> >> 1.If do clear shutdown, the last log file will be truncated to its
> >>   real size since the close() function of FileMessageSet will call trim(),
> >> 2.If crash, then when restart, will go through the process of
> >>   recover() and the last log file will be truncate to its real size, (and
> >>   the position will be moved to end of the file)
> >> 3.When service start and open existing file
> >>   a.Will run the LogSegment constructor which has NO parameter
> >>     "preallocate",
> >>   b.Then in FileMessageSet, the "end" in FileMessageSet will be
> >>     Int.MaxValue, and then
> >>     "channel.position(math.min(channel.size().toInt, end))" will make the
> >>     position be end of the file,
> >>   c.If recover needed, the recover function will truncate file to end of
> >>     valid data, and also move the position to it,
> >> 4.When service running and need create new log segment and new
> >>   FileMessageSet
> >>   a.If preallocate = true
> >>     i.the "end" in FileMessageSet will be 0, the
> >>       file size will be "initFileSize", and then
> >>       "channel.position(math.min(channel.size().toInt, end))" will make the
> >>       position be 0,
> >>   b.Else if preallocate = false
> >>     i.backward compatible, the "end" in
> >>       FileMessageSet will be Int.MaxValue, the file size will be "0", and
> >>       then "channel.position(math.min(channel.size().toInt, end))" will make
> >>       the position be 0,
> >>
> >> https://cwiki.apache.org/confluence/display/KAFKA/KIP-20+-+Enable+log+preallocate+to+improve+consume+performance+under+windows+and+some+old+Linux+file+system
> >> https://issues.apache.org/jira/browse/KAFKA-1646
> >> https://reviews.apache.org/r/33204/diff/2/
> >>
> >> Thanks, Honghai Chen
> >> http://aka.ms/kafka
> >> http://aka.ms/manifold
> >>
> >> -----Original Message-----
> >> From: Honghai Chen
> >> Sent: Wednesday, April 22, 2015 11:12 AM
> >> To: dev@kafka.apache.org
> >> Subject: RE: [DISCUSS] KIP 20 Enable log preallocate to improve consume
> >> performance under windows and some old Linux file system
> >>
> >> Hi Sriram,
> >> One sentence of code missed, will update code review board and
> >> KIP soon.
> >> For LogSegment and FileMessageSet, must use different
> >> constructor function for existing file and new file, then the code
> >> "channel.position(math.min(channel.size().toInt, end))" will make sure
> >> the position at end of existing file.
> >>
> >> Thanks, Honghai Chen
> >>
> >> -----Original Message-----
> >> From: Jay Kreps [mailto:jay.kr...@gmail.com]
> >> Sent: Wednesday, April 22, 2015 5:22 AM
> >> To: dev@kafka.apache.org
> >> Subject: Re: [DISCUSS] KIP 20 Enable log preallocate to improve consume
> >> performance under windows and some old Linux file system
> >>
> >> My understanding of the patch is that clean shutdown truncates the file
> >> back to it's true size (and reallocates it on startup). Hard crash is
> >> handled by the normal recovery which should truncate off the empty
> >> portion of the file.
> >>
> >> On Tue, Apr 21, 2015 at 10:52 AM, Sriram Subramanian
> >> <srsubraman...@linkedin.com.invalid> wrote:
> >>
> >>> Could you describe how recovery works in this mode? Say, we had a 250
> >>> MB preallocated segment and we wrote till 50MB and crashed. Till what
> >>> point do we recover? Also, on startup, how is the append end pointer
> >>> set even on a clean shutdown? How does the FileChannel end position
> >>> get set to 50 MB instead of 250 MB? The existing code might just work
> >>> for it but explaining that would be useful.
> >>>
> >>> On 4/21/15 9:40 AM, "Neha Narkhede" <n...@confluent.io> wrote:
> >>>
> >>> >+1. I've tried this on Linux and it helps reduce the spikes in append
> >>> >(and hence producer) latency for high throughput writes. I am not entirely
> >>> >sure why but my suspicion is that in the absence of preallocation,
> >>> >you see spikes writes need to happen faster than the time it takes
> >>> >Linux to allocate the next block to the file.
> >>> >
> >>> >It will be great to see some performance test results too.
> >>> >
> >>> >On Tue, Apr 21, 2015 at 9:23 AM, Jay Kreps <jay.kr...@gmail.com> wrote:
> >>> >
> >>> >> I'm also +1 on this. The change is quite small and may actually
> >>> >> help perf on Linux as well (we've never tried this).
> >>> >>
> >>> >> I have a lot of concerns on testing the various failure conditions
> >>> >> but I think since it will be off by default the risk is not too high.
> >>> >>
> >>> >> -Jay
> >>> >>
> >>> >> On Mon, Apr 20, 2015 at 6:58 PM, Honghai Chen
> >>> >> <honghai.c...@microsoft.com> wrote:
> >>> >>
> >>> >> > I wrote a KIP for this after some discussion on KAFKA-1646.
> >>> >> > https://issues.apache.org/jira/browse/KAFKA-1646
> >>> >> >
> >>> >> > https://cwiki.apache.org/confluence/display/KAFKA/KIP-20+-+Enable+log+preallocate+to+improve+consume+performance+under+windows+and+some+old+Linux+file+system
> >>> >> > The RB is here: https://reviews.apache.org/r/33204/diff/
> >>> >> >
> >>> >> > Thanks, Honghai
> >>> >
> >>> >--
> >>> >Thanks,
> >>> >Neha
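
For anyone reading the thread later, the position handling in Honghai's scenarios 1, 3 and 4 can be sketched roughly as follows. This is only an illustration under assumptions, not the code from the patch: the object name "PreallocateSketch" and the helpers "openSegment" and "trim" are hypothetical, and trimming to the channel position stands in for however FileMessageSet actually tracks its real size. Only the java.io/java.nio calls and the "channel.position(math.min(channel.size().toInt, end))" expression come from real APIs or from the thread.

```scala
import java.io.{File, RandomAccessFile}
import java.nio.ByteBuffer
import java.nio.channels.FileChannel

// Illustrative sketch only; names here are hypothetical, not Kafka's.
object PreallocateSketch {

  // Opens a segment file following scenarios 3 and 4 from the thread.
  def openSegment(file: File, isNew: Boolean, preallocate: Boolean,
                  initFileSize: Int): FileChannel = {
    val raf = new RandomAccessFile(file, "rw")
    val preallocated = isNew && preallocate
    // Scenario 4a: a new preallocated segment is grown up front.
    if (preallocated) raf.setLength(initFileSize.toLong)
    // "end" is 0 for a new preallocated file, Int.MaxValue otherwise.
    val end = if (preallocated) 0 else Int.MaxValue
    val channel = raf.getChannel
    // Expression quoted in the thread: position 0 for a fresh file,
    // end of file for an existing one.
    channel.position(math.min(channel.size().toInt, end).toLong)
    channel
  }

  // Assumed stand-in for trim() on clean shutdown (scenario 1):
  // cut the file back to the bytes actually written, then close.
  def trim(channel: FileChannel): Unit = {
    channel.truncate(channel.position())
    channel.close()
  }

  def main(args: Array[String]): Unit = {
    val file = File.createTempFile("segment", ".log")
    file.deleteOnExit()

    val fresh = openSegment(file, isNew = true, preallocate = true, initFileSize = 1024)
    println(s"new segment: size=${fresh.size()} position=${fresh.position()}")

    fresh.write(ByteBuffer.wrap(new Array[Byte](50))) // append 50 bytes
    trim(fresh)                                       // clean shutdown

    val reopened = openSegment(file, isNew = false, preallocate = false, initFileSize = 1024)
    println(s"reopened: size=${reopened.size()} position=${reopened.position()}")
    reopened.close()
  }
}
```

The point of the sketch is that the same position expression covers both cases: with "end" at 0 a preallocated file starts appending at offset 0 even though its size is already "initFileSize", while an existing, trimmed file resumes at its true end. The crash case (scenario 2) is not modelled here, since it depends on the recover() logic described in the thread.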