On Mon, Mar 19, 2012 at 10:26 AM, Selcuk AYA <[email protected]> wrote:
> On Mon, Mar 19, 2012 at 9:24 AM, Emmanuel Lécharny <[email protected]> wrote:
>> Hi,
>>
>> I have a few questions about the handling of the log buffer.
>>
>> When we can't write anymore data in the buffer, because it's full, we try to
>> flush the buffer on disk. What happens then is :
>> - if there is enough room remaining in the buffer, we write a skip record
>> (with a -1 length) : is it necessary ? (we then rewind the buffer)
>> - otherwise, we rewind the buffer
>>
>> In any case, we increment the writeAheadRewindCount : what for ?
As far as I can remember, writeAheadRewindCount was there to avoid overwriting
non-flushed log records when the in-memory circular buffer wraps. If this
answer is not good enough, I can take a closer look later.
>>
>> then we call the flush() method, which will be executed only if there is no
>> other thread flushing the buffer already (just in case the sync() method is
>> called by another thread). I guess this is intended to allow a thread to add
>> new data in the buffer while another thread writes the buffer on disk?
>>
>> So AFAIU, only one thread will be allowed to write data into the buffer, up
>> to the point it reaches a record being held by the flush thread, and only
>> one thread can flush the data, up to the point it reaches the last record it
>> can write (which is computed before the flush() method is called).
>>
>> I'm wondering if we couldn't use a simpler algorithm, where we have a flush
>> thread used to flush the data in any case. If the buffer is full, we stop
>> writing until we are signaled that there is some room left (and this is the
>> flush thread role to signal the writer that it can start again). That means
>> we write as much as we can, signaling each record to the flush thread, and
>> the flush thread will consume the records when they arrive. If both are
>> colliding (i.e., no more room remains in the buffer, the reader will have to
>> wait for the writer to wake it up). We won't need to use a buffer at all, we
>> just pass the records (plus their headers and trailers) in a queue, avoiding
>> a copy in a temporary memory.
>>
>> This is basically doing the same thing, but we don't wait until the buffer
>> is full to wake up the writer. This is the way the network layer works in
>> NIO, with a selector signaling the writer thread when it's ready to accept
>> some more data to be written.
>
> I am confused about the buffering (or no buffering) you suggest.
> Are you suggesting a flush thread will write directly from the user's
> buffer without any in-memory copy?
>
> Currently things work like this on the common code path:
>
> * for user threads:
>     prepare record
>     get log latch
>     copy into the in-memory buffer and get an LSN (logical sequence number)
>     release log latch
>     return LSN
>
> * for the background flushing thread:
>     wake up periodically, reap the in-memory log and write
>
> So the background thread does not necessarily wait for the buffer to be
> full to wake up and write. In the hopefully less common case, if the
> buffer is full, a user thread will take one for the team and write the
> buffer (we could signal the flush thread as an alternative here).
>
> In the common case, this lets user threads avoid waiting for the write and
> get an LSN quickly (the LSN is important to order log records), and it
> allows batching of writes. Similar algorithms are used in all the database
> WAL code I looked at (including Apache Derby).
>
>>
>> thoughts ?
>>
>> --
>> Regards,
>> Cordialement,
>> Emmanuel Lécharny
>> www.iktek.com
>>
>
> thanks
> Selcuk
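
For reference, the common code path described above (user threads copy under a latch and get an LSN, a background thread flushes periodically, and a user thread flushes itself when the buffer is full) could be sketched roughly as below. This is only a minimal illustration under stated assumptions: the class and method names (LogBufferSketch, append, flush, bytesOnDisk) are hypothetical and are not the actual project code, the "disk" is an in-memory stream, and the flush is done while holding the latch, whereas the circular buffer discussed in this thread is meant to let appends continue during a flush.

```java
import java.io.ByteArrayOutputStream;
import java.nio.ByteBuffer;
import java.util.concurrent.Executors;
import java.util.concurrent.ScheduledExecutorService;
import java.util.concurrent.TimeUnit;
import java.util.concurrent.locks.ReentrantLock;

/** Hypothetical sketch of the latch + LSN + background flush pattern. */
class LogBufferSketch {
    private final ReentrantLock logLatch = new ReentrantLock();
    private final ByteBuffer buffer;
    // Stand-in for the log file (assumption made to keep the sketch self-contained).
    private final ByteArrayOutputStream disk = new ByteArrayOutputStream();
    private long nextLsn = 0;

    LogBufferSketch(int capacity) {
        buffer = ByteBuffer.allocate(capacity);
    }

    /** User thread path: get latch, copy record, get LSN, release latch. */
    long append(byte[] record) {
        if (record.length > buffer.capacity()) {
            throw new IllegalArgumentException("record larger than log buffer");
        }
        logLatch.lock();
        try {
            if (buffer.remaining() < record.length) {
                // Less common case: the buffer is full, so this user
                // thread writes the buffer out itself.
                flushLocked();
            }
            buffer.put(record);
            return nextLsn++;
        } finally {
            logLatch.unlock();
        }
    }

    /** Background path: called periodically, does not wait for a full buffer. */
    void flush() {
        logLatch.lock();
        try {
            flushLocked();
        } finally {
            logLatch.unlock();
        }
    }

    // Caller must hold logLatch.
    private void flushLocked() {
        buffer.flip();
        disk.write(buffer.array(), 0, buffer.limit());
        buffer.clear();
    }

    int bytesOnDisk() {
        return disk.size();
    }

    public static void main(String[] args) throws Exception {
        LogBufferSketch log = new LogBufferSketch(64);
        ScheduledExecutorService flusher = Executors.newSingleThreadScheduledExecutor();
        // Periodic background flush; the 10 ms period is an arbitrary choice.
        flusher.scheduleWithFixedDelay(log::flush, 10, 10, TimeUnit.MILLISECONDS);
        long lsn = log.append("record".getBytes());
        System.out.println("LSN = " + lsn);
        flusher.shutdown();
        flusher.awaitTermination(1, TimeUnit.SECONDS);
        log.flush();
        System.out.println("bytes written = " + log.bytesOnDisk());
    }
}
```

In this sketch the user thread only pays for an in-memory copy in the common case, and writes are batched because each flush drains everything appended since the previous one.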
