On Mon, Mar 19, 2012 at 11:32 AM, Emmanuel Lécharny <[email protected]> wrote:
> On 3/19/12 6:59 PM, Selcuk AYA wrote:
>
>> On Mon, Mar 19, 2012 at 10:41 AM, Emmanuel Lécharny <[email protected]> wrote:
>>>
>>> On 3/19/12 6:26 PM, Selcuk AYA wrote:
>>>
>>>> On Mon, Mar 19, 2012 at 9:24 AM, Emmanuel Lécharny <[email protected]> wrote:
>>>>>
>>>>> Hi,
>>>>>
>>>>> I have a few questions about the handling of the log buffer.
>>>>>
>>>>> When we can't write any more data into the buffer because it's
>>>>> full, we try to flush the buffer to disk. What happens then is:
>>>>> - if there is enough room remaining in the buffer, we write a skip
>>>>>   record (with a -1 length): is it necessary? (we then rewind the
>>>>>   buffer)
>>>>> - otherwise, we rewind the buffer
>>>>>
>>>>> In any case, we increment the writeAheadRewindCount: what for?
>>>>>
>>>>> Then we call the flush() method, which will be executed only if no
>>>>> other thread is already flushing the buffer (just in case the
>>>>> sync() method is called by another thread). I guess this is
>>>>> intended to allow a thread to add new data to the buffer while
>>>>> another thread writes the buffer to disk?
>>>>>
>>>>> So AFAIU, only one thread will be allowed to write data into the
>>>>> buffer, up to the point it reaches a record being held by the flush
>>>>> thread, and only one thread can flush the data, up to the point it
>>>>> reaches the last record it can write (which is computed before the
>>>>> flush() method is called).
>>>>>
>>>>> I'm wondering if we couldn't use a simpler algorithm, where we have
>>>>> a flush thread used to flush the data in any case. If the buffer is
>>>>> full, we stop writing until we are signaled that there is some room
>>>>> left (and it is the flush thread's role to signal the writer that
>>>>> it can start again). That means we write as much as we can,
>>>>> signaling each record to the flush thread, and the flush thread
>>>>> will consume the records as they arrive. If both are colliding
>>>>> (i.e., no more room remains in the buffer), the reader will have to
>>>>> wait for the writer to wake it up. We won't need to use a buffer at
>>>>> all, we just pass the records (plus their headers and trailers) in
>>>>> a queue, avoiding a copy into temporary memory.
>>>>>
>>>>> This is basically doing the same thing, but we don't wait until the
>>>>> buffer is full to wake up the writer. This is the way the network
>>>>> layer works in NIO, with a selector signaling the writer thread
>>>>> when it's ready to accept some more data to be written.
>>>>
>>>> I am confused about the buffering (or no buffering) you suggest. Are
>>>> you suggesting that a flush thread will write directly from the
>>>> user's buffer without any in-memory copy?
>>>
>>> Yes. In fact, I suggest we buffer the records without copying them.
>>> When the flush thread is woken up (or kicked), it will write the
>>> header, the buffer, the footer. We can use ByteBuffer gathering for
>>> that (see http://tutorials.jenkov.com/java-nio/scatter-gather.html)
>>
>> I see. But this is effectively what we are doing, right? Instead of
>> putting the buffers in a queue and doing scatter/gather through byte
>> buffers (which will eventually do a memcpy to do a single batched
>> write, I think), we copy into an in-memory buffer and let the flushing
>> thread do the single batched write.
>
> Yes, but you copy the user records into a temporary ByteBuffer, which
> will be read and flushed. If you put the user records in a queue, you
> don't need this extra copy, plus you don't need to allocate a 4 MB
> buffer at all. That does not mean you won't consume those 4 MB if the
> queue is not emptied fast enough by the flush thread, but in the
> general case you just end up using less memory if the flush thread is
> awakened when some data is present in the queue.
>

So we want to do a batched write to the end of the log using a "single"
IO. What I am asking is: won't the Java ByteBuffer implementation have
to internally copy the buffers into a single buffer and do a single
batched write from that buffer?

>
> --
> Regards,
> Cordialement,
> Emmanuel Lécharny
> www.iktek.com
>
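
For reference, the queue-plus-gathering-write idea discussed above could be sketched roughly as follows. This is not the existing log code: the class name QueuedLogSketch, the 4-byte length header and footer, the queue capacity of 1024 and the POISON shutdown marker are all assumptions made for illustration. User records are queued without being copied, and a single flush thread drains the queue and hands header, record and footer to FileChannel.write(ByteBuffer[]). Whether the JDK copies those buffers internally before issuing the IO is implementation-dependent, and the sketch does not settle that question.

```java
import java.io.IOException;
import java.nio.ByteBuffer;
import java.nio.channels.FileChannel;
import java.nio.file.Path;
import java.nio.file.StandardOpenOption;
import java.util.ArrayList;
import java.util.List;
import java.util.concurrent.BlockingQueue;
import java.util.concurrent.LinkedBlockingQueue;

public class QueuedLogSketch {
    // Hypothetical shutdown marker, compared by identity only.
    private static final ByteBuffer POISON = ByteBuffer.allocate(0);

    // Bounded queue (capacity is an assumption): put() blocks the writer when full.
    private final BlockingQueue<ByteBuffer> queue = new LinkedBlockingQueue<>(1024);
    private final Thread flusher;
    private volatile IOException failure;

    public QueuedLogSketch(Path file) throws IOException {
        FileChannel channel = FileChannel.open(file, StandardOpenOption.CREATE,
                StandardOpenOption.WRITE, StandardOpenOption.APPEND);
        flusher = new Thread(() -> flushLoop(channel), "log-flusher");
        flusher.start();
    }

    // Queues the record without copying it; the caller must not modify
    // the array until it has been flushed.
    public void append(byte[] userRecord) throws InterruptedException {
        queue.put(ByteBuffer.wrap(userRecord));
    }

    // Asks the flush thread to drain what is queued, then waits for it.
    public void close() throws InterruptedException, IOException {
        queue.put(POISON);
        flusher.join();
        if (failure != null) {
            throw failure;
        }
    }

    private void flushLoop(FileChannel channel) {
        List<ByteBuffer> batch = new ArrayList<>();
        try (FileChannel ch = channel) {
            boolean running = true;
            while (running) {
                batch.clear();
                batch.add(queue.take());   // sleep until at least one record arrives
                queue.drainTo(batch);      // then grab everything else already queued
                List<ByteBuffer> parts = new ArrayList<>();
                for (ByteBuffer record : batch) {
                    if (record == POISON) {
                        running = false;
                        break;
                    }
                    // Assumed framing: length header, record bytes, length footer.
                    parts.add(lengthField(record.remaining()));
                    parts.add(record);
                    parts.add(lengthField(record.remaining()));
                }
                ByteBuffer[] srcs = parts.toArray(new ByteBuffer[0]);
                while (hasRemaining(srcs)) {
                    ch.write(srcs);        // gathering write over the whole batch
                }
                ch.force(false);
            }
        } catch (IOException e) {
            failure = e;
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
        }
    }

    private static ByteBuffer lengthField(int length) {
        ByteBuffer b = ByteBuffer.allocate(4);
        b.putInt(length);
        b.flip();
        return b;
    }

    private static boolean hasRemaining(ByteBuffer[] buffers) {
        for (ByteBuffer b : buffers) {
            if (b.hasRemaining()) {
                return true;
            }
        }
        return false;
    }
}
```

A caller would create the log with a file path, call append() for each record and close() at the end. Note that the sketch does nothing to unblock a writer stuck in put() if the flush thread dies on an IOException; real code would need to handle that.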
