All, chatted with Hiram about how syncing works on the replicated LevelDB store... didn't mean for it to be private :) I'm forwarding the email thread...

See the discussion below and add any comments/thoughts as desired. Thanks, Christian

---------- Forwarded message ----------
From: Hiram Chirino <[email protected]>
Date: Thu, May 9, 2013 at 6:31 AM
Subject: Re: does master sync to disk on successful replication?
To: Christian Posta <[email protected]>

Yeah, I think you're right.. might be better off with something like
syncTo="<type>", where <type> can be a space separated list of:

 * disk - sync to the local disk
 * replica - sync to the remote replica's memory
 * replicaDisk - sync to the remote replica's disk

And we just default that to replica.
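For concreteness, here is a minimal sketch (in Scala, like the store itself) of how such a space-separated syncTo value could be parsed. SyncPolicy and parseSyncTo are made-up names for illustration, not the actual store API:

    case class SyncPolicy(localDisk: Boolean, replica: Boolean, replicaDisk: Boolean)

    // Hypothetical parser for a syncTo="disk replica" style attribute.
    def parseSyncTo(value: String): SyncPolicy = {
      val types = value.trim.split("\\s+").map(_.toLowerCase).toSet
      SyncPolicy(
        localDisk   = types("disk"),        // sync to the local disk
        replica     = types("replica"),     // sync to the remote replica's memory
        replicaDisk = types("replicadisk")) // sync to the remote replica's disk
    }

    val defaultPolicy = parseSyncTo("replica") // the proposed default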
On Thu, May 9, 2013 at 9:16 AM, Christian Posta <[email protected]> wrote:
> But I think we need sync to be true for the replication as it stands right
> now? If the sync option is true then we hit this line in the client's store
> method, which is the hook into the replication:
>
>     if( syncNeeded && sync ) {
>       appender.force
>     }
>
> If we change it to false, then replication won't be kicked off. We could
> remove the && sync, but then persistent messages would be sync'd even if
> sync==false... prob don't want that.
>
> We *might* need another setting, "forceReplicationSyncToDisk" or
> something... or move the replication out of the appender.force method... in
> ActiveMQ 5.x you have the following in DataFileAppender, which delegates to
> a replicator:
>
>     ReplicationTarget replicationTarget = journal.getReplicationTarget();
>     if( replicationTarget!=null ) {
>       replicationTarget.replicate(wb.writes.getHead().location, sequence, forceToDisk);
>     }
>
> On Thu, May 9, 2013 at 6:02 AM, Hiram Chirino <[email protected]> wrote:
>>
>> Yeah... perhaps we keep using the sync config option, just change the
>> default to false in the replicated scenario.
>>
>> It's very hard to verify proper operation of fsync.
>>
>> The best way I've found is comparing the performance of writes followed
>> by fsync against writes not followed by fsync, then looking at the
>> numbers, comparing them to the hardware being used, and seeing if they
>> make sense. On a spinning disk without a battery-backed write cache, you
>> should not get more than 100-300 writes per second with fsync. But once
>> you start looking at SSDs or battery-backed write cache hardware, that
>> assumption goes out the window.
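That comparison is easy to script. A quick sanity-check sketch of it (not part of the store; the path, record size, and iteration counts are arbitrary) that times small appends with and without FileChannel.force():

    import java.nio.ByteBuffer
    import java.nio.channels.FileChannel
    import java.nio.file.{Paths, StandardOpenOption => O}

    def writesPerSec(path: String, n: Int, fsync: Boolean): Double = {
      val ch = FileChannel.open(Paths.get(path), O.CREATE, O.WRITE, O.TRUNCATE_EXISTING)
      val buf = ByteBuffer.allocate(128) // one small "journal record" per write
      val start = System.nanoTime()
      for (_ <- 1 to n) {
        buf.clear()
        ch.write(buf)
        if (fsync) ch.force(false) // force file data (not metadata) to the device
      }
      ch.close()
      n / ((System.nanoTime() - start) / 1e9)
    }

    println(f"without fsync: ${writesPerSec("/tmp/bench-nosync", 2000, fsync = false)}%.0f writes/s")
    println(f"with fsync:    ${writesPerSec("/tmp/bench-fsync", 500, fsync = true)}%.0f writes/s")

On a spinning disk without a write cache, the fsync'd number staying in the 100-300 range is a good sign the sync is actually reaching the platter; a much higher number suggests it is not.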
>> On Thu, May 9, 2013 at 8:48 AM, Christian Posta <[email protected]> wrote:
>> > Your thoughts above make sense. Maybe we can add the option and leave
>> > it disabled for now? I can write a test for it and do it. As fsync vs
>> > fflush behavior is quite OS dependent, do you know of a good way to
>> > write tests to verify fsync? Just read the contents from the file?
>> >
>> > On Wed, May 8, 2013 at 7:02 PM, Hiram Chirino <[email protected]> wrote:
>> >>
>> >> Nope, you're not missing anything. Instead of disk syncing, we are
>> >> doing replica syncing. If the master dies and loses some of his recent
>> >> log entries, it's not a big deal since we can recover from the log
>> >> file of the slave.
>> >>
>> >> The only time you could possibly lose data is in the small likelihood
>> >> that the master and the slave machines die at the same time. But if
>> >> that is likely to happen, you really don't have a very HA deployment.
>> >>
>> >> But if folks do think that's a possibility, then perhaps we should add
>> >> an option to really disk sync.
>> >>
>> >> On Wed, May 8, 2013 at 6:06 PM, Christian Posta <[email protected]> wrote:
>> >> > Hey,
>> >> >
>> >> > Might be some trickery that I'm missing... but in the replication
>> >> > sequence, when the master writes to its log, it also tries to tell
>> >> > its slaves about the write. In the overridden log appender in
>> >> > MasterLevelDBClient, the overridden methods are force and flush... it
>> >> > looks like we tell the slaves about our updates in flush by calling
>> >> > store.replicate_wal, and then we wait for acks in force by calling
>> >> > store.wal_sync_to(position)... What I can't find is the master doing
>> >> > the file sync when one is required. The force method in the original
>> >> > LogAppender does the call to channel.force()... but it might be
>> >> > missing in the overridden log appender. Do you see the same? Maybe
>> >> > I'm missing something...
>> >> >
>> >> > --
>> >> > Christian Posta
>> >> > http://www.christianposta.com/blog
>> >> > twitter: @christianposta

--
Regards,
Hiram

Blog: http://hiramchirino.com

Open Source SOA
http://fusesource.com/

--
*Christian Posta*
http://www.christianposta.com/blog
twitter: @christianposta
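To make the shape of that last question concrete: a rough sketch of an appender whose flush/force hooks drive replication, with the local channel.force() call that may be missing, gated behind a flag along the lines of the "forceReplicationSyncToDisk" idea above. The ReplicatedStore trait and every signature here are assumptions for illustration; this is not the actual MasterLevelDBClient code:

    import java.nio.channels.FileChannel

    // Stand-in for the store's replication hooks named in the thread;
    // these signatures are assumptions, not the real API.
    trait ReplicatedStore {
      def replicate_wal(position: Long): Unit // push new log entries to the slaves
      def wal_sync_to(position: Long): Unit   // block until replicas ack up to position
    }

    class ReplicatingAppender(channel: FileChannel, store: ReplicatedStore,
                              syncToLocalDisk: Boolean) {
      @volatile var appendPosition: Long = 0L // highest position written to the log

      def flush(): Unit =
        store.replicate_wal(appendPosition) // tell the slaves about our updates

      def force(): Unit = {
        flush()
        if (syncToLocalDisk)
          channel.force(false) // the channel.force() the base LogAppender does
        store.wal_sync_to(appendPosition) // wait for replica acks
      }
    }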
