Great, say hi to Masood!

-Flavio

On 21 Jul 2014, at 21:24, Jaln <[email protected]> wrote:

> Hi Flavio,
> I'm doing some research on scalable/durable transactional messaging systems 
> with Masood at Huawei Innovation Center. I'm currently using BookKeeper as a 
> case study.
> Thanks for the help.
> 
> Best,
> Jialin
> 
> 
> On Mon, Jul 21, 2014 at 5:43 AM, Flavio Junqueira 
> <[email protected]> wrote:
> Jialin,
> 
> I'm curious to know why you're asking all these questions. Are you working on 
> some research project that involves BookKeeper? Otherwise, what's your use 
> case if you don't mind sharing?
> 
> 
> -Flavio
> 
> 
> 
> On Monday, July 21, 2014 1:34 PM, Ivan Kelly <[email protected]> wrote:
> 
> 
> >
> >
> >We have considered something like this in the past. However, it would
> >mean that reads affect the latency of writes, as they will move
> >the disk head.
> >
> >It's also the case that the interleaved entrylog performs really badly
> >on reads. Work has been done recently to improve this, by buffering
> >entries and sorting them by ledger id before flushing to the
> >entrylog. This means that reads for a specific ledger will be
> >sequential as opposed to jumping all over the place as it has to do
> >now. If we used the journal for this, then we wouldn't be able to do
> >this processing, as the point of the journal is to ensure that the
> >entry is on persistent storage before replying to the client. If we
> >buffered enough to get benefit from sorting, write latency would be
> >enormous.
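
[A minimal sketch of the sorting idea described above, not BookKeeper's actual code. Entries are modeled as hypothetical (ledger_id, entry_id) tuples and all function names are made up for illustration. It counts how many non-contiguous jumps a reader needs to fetch one ledger's entries from an interleaved log versus a log whose buffer was sorted by ledger id before flushing.]

```python
def flush_interleaved(buffer):
    """Write buffered entries to the log in arrival order."""
    return list(buffer)


def flush_sorted(buffer):
    """Sort buffered entries by (ledger_id, entry_id) before writing."""
    return sorted(buffer, key=lambda entry: (entry[0], entry[1]))


def jumps_for_ledger(log, ledger_id):
    """Count non-contiguous jumps needed to read every entry of one ledger."""
    positions = [i for i, (lid, _) in enumerate(log) if lid == ledger_id]
    return sum(1 for a, b in zip(positions, positions[1:]) if b != a + 1)


if __name__ == "__main__":
    # Two ledgers writing concurrently, so their entries arrive interleaved.
    buffer = [(1, 0), (2, 0), (1, 1), (2, 1), (1, 2)]
    print(jumps_for_ledger(flush_interleaved(buffer), 1))
    print(jumps_for_ledger(flush_sorted(buffer), 1))
```

[In this toy buffer, reading ledger 1 takes two jumps in the interleaved layout and none in the sorted one. The sort only helps when many entries are buffered, which is why it fits the lazily written entry log and not the journal.]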
> >
> >-Ivan
> >
> >
> >On Sat, Jul 19, 2014 at 01:55:16PM -0700, Jaln wrote:
> >> Thank you so much, Rakesh,
> >> Setting performance aside, could we just maintain one file? For
> >> example, the journal file plus an index for each entry.
> >>
> >> Best,
> >> Jaln
> >>
> >>
> >> On Fri, Jul 18, 2014 at 11:23 PM, Rakesh Radhakrishnan <
> >> [email protected]> wrote:
> >>
> >> > Hi Jaln,
> >> >
> >> > >>>>>>for the data in the journal file(*.txn) and the entry log
> >> > >>>>>>file(*.log), are they similar?
> >> > >>>>>>for example, when I add an entry, this operation and the entry
> >> > >>>>>>data will be logged in the journal file,
> >> > >>>>>>and the entry data will be logged in the entry log file (*.log),
> >> > >>>>>>right?
> >> >
> >> > As I mentioned earlier, when an entry is added, the Bookie server adds
> >> > only this entry to the journal file and sends a response back to the
> >> > client after a successful flush to disk. Later, at checkpointing time,
> >> > the server reads the journal entries and adds them to the entry logger
> >> > files. It also generates index files for each ledger for faster access.
> >> > The old journal file is then garbage collected, because all of its
> >> > entries are now mapped to the entry logger.
> >> >
> >> > >>>>>what's the purpose of the two files?
> >> > AFAIK, adding to the entry log and generating the index is a costly
> >> > I/O operation that would hurt performance. That's why the server first
> >> > adds transactions only to the journal file and sends a response
> >> > quickly. Later it adds them to the entry log file and index files
> >> > offline.
> >> >
> >> > Total bookie stored data = entry logger data + journal data (most
> >> > recent data)
> >> >
> >> > *For example:* I'm calling a write operation a transaction. Assume the
> >> > client has performed 20 transactions. All of these exist only in the
> >> > journal file. Say checkpointing is now triggered. It will add these 20
> >> > transactions to the entry logger file and generate indexes. Again,
> >> > assume the user performs 10 more transactions. Now we have 30
> >> > transactions in total.
> >> >
> >> > Bookie data (30 transactions) = 20 (entry logger) + 10 (journal).
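
[A toy model of the flow described in this message, offered only as a sketch. The class `ToyBookie` and its methods are hypothetical and are not BookKeeper's API. Writes are acknowledged once appended to the journal, a checkpoint moves them into the entry log and builds a per-ledger index, and the journal is then discarded.]

```python
class ToyBookie:
    """Hypothetical in-memory model of journal, entry log, and index."""

    def __init__(self):
        self.journal = []    # recent writes, not yet checkpointed
        self.entry_log = []  # checkpointed entries
        self.index = {}      # ledger_id -> offsets into entry_log

    def add_entry(self, ledger_id, data):
        # Append to the journal only, then acknowledge the client.
        self.journal.append((ledger_id, data))
        return True

    def checkpoint(self):
        # Move journal entries to the entry log and index them by ledger.
        for ledger_id, data in self.journal:
            self.index.setdefault(ledger_id, []).append(len(self.entry_log))
            self.entry_log.append((ledger_id, data))
        # Stand-in for garbage collecting the old journal file.
        self.journal.clear()

    def total_entries(self):
        # Total stored data = entry logger data + journal data.
        return len(self.entry_log) + len(self.journal)


if __name__ == "__main__":
    bookie = ToyBookie()
    for i in range(20):
        bookie.add_entry(1, f"txn-{i}")
    bookie.checkpoint()
    for i in range(20, 30):
        bookie.add_entry(1, f"txn-{i}")
    print(len(bookie.entry_log), len(bookie.journal), bookie.total_entries())
```

[Running the 20 + 10 example from this message leaves 20 entries in the entry log, 10 in the journal, and 30 in total.]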
> >> >
> >> > Regards,
> >> > Rakesh
> >> >
> >> >
> >> >
> >> > On Sat, Jul 19, 2014 at 9:52 AM, Jaln <[email protected]> wrote:
> >> >
> >> > > Thanks Rakesh,
> >> > > for the data in the journal file(*.txn) and the entry log file(*.log),
> >> > > are they similar?
> >> > > for example, when I add an entry, this operation and the entry data
> >> > > will be logged in the journal file,
> >> > > and the entry data will be logged in the entry log file (*.log), right?
> >> > > what's the purpose of the two files?
> >> > >
> >> > > Thanks,
> >> > > Jaln
> >> > >
> >> > > On Fri, Jul 18, 2014 at 8:16 PM, Rakesh Radhakrishnan <
> >> > > [email protected]> wrote:
> >> > >
> >> > > > Hi Jaln,
> >> > > >
> >> > > > No, the two are different. I assume you are asking about the
> >> > > > 'entry log' files and the 'journal' files.
> >> > > >
> >> > > > *Journal : *When a client performs a write operation (such as
> >> > > > adding an entry), it is first recorded in the journal file. The
> >> > > > journal is flushed and synced after every write operation before a
> >> > > > success code is returned to the client. This ensures that no
> >> > > > operation is lost due to machine failure.
> >> > > >
> >> > > > *Entry Log : *It is not updated on every write operation; the
> >> > > > bookie server updates it lazily, because writing out the ledger
> >> > > > involves updating the ledger index files (for faster lookup) and
> >> > > > adding the entry to the logger file. This is a costly operation
> >> > > > and would affect performance.
> >> > > >
> >> > > > In Bookie, there is a dedicated thread that replays journal
> >> > > > transactions and adds them to the logger lazily; this is called
> >> > > > the checkpointing operation. It is performed periodically, at
> >> > > > which point the data is persisted to the ledger index files and
> >> > > > the entry logger. By default the 'flushInterval' is 100
> >> > > > milliseconds. You could probably configure a bigger value to see
> >> > > > the difference.
> >> > > >
> >> > > > *"SyncThread"* is a background thread which helps with
> >> > > > checkpointing. After a ledger storage is checkpointed, the journal
> >> > > > files added before the checkpoint will be garbage collected.
> >> > > >
> >> > > > Cheers,
> >> > > > Rakesh
> >> > > >
> >> > > >
> >> > > > On Sat, Jul 19, 2014 at 1:41 AM, Jaln <[email protected]> wrote:
> >> > > >
> >> > > > > Hi,
> >> > > > > are the ledger file and the journal file the same?
> >> > > > > I ran BookKeeper and generated a bookie;
> >> > > > > inside the bookie, I found that the journal file and the ledger
> >> > > > > file are almost the same.
> >> > > > >
> >> > > > > Best,
> >> > > > > Jialin
> >> > > > >
> >> > > >
> >> > >
> >> >
> >
> >
> >
> 
> 
> 
> -- 
> Genius only means hard-working all one's life
