> On Oct. 13, 2018, 2:49 a.m., James DeFelice wrote: > > src/slave/state.hpp > > Line 192 (original), 196 (patched) > > <https://reviews.apache.org/r/69010/diff/1/?file=2096858#file2096858line196> > > > > Why is the default `false` here? If someone is calling the `checkpoint` > > func because they want atomic semantics ... wouldn't they *normally* want > > `sync==true` (and shouldn't the *exception* to the rule be `sync==false`)? > > Chun-Hung Hsiao wrote: > This is to maintain the same semantics for original uses of this > function. IIRC most of the components are able to handle stale or empty > checkpoints. Also defaulting the flag to true will bring in some performance > impact.
Some numbers for fsync found on the Internet: https://gist.github.com/prashanthpai/e246be62656f25d7e31b For now I'll just leave a todo for enabling syncing by default. Does this sound okay to you? - Chun-Hung ----------------------------------------------------------- This is an automatically generated e-mail. To reply, visit: https://reviews.apache.org/r/69010/#review209512 ----------------------------------------------------------- On Oct. 16, 2018, 2:48 a.m., Chun-Hung Hsiao wrote: > > ----------------------------------------------------------- > This is an automatically generated e-mail. To reply, visit: > https://reviews.apache.org/r/69010/ > ----------------------------------------------------------- > > (Updated Oct. 16, 2018, 2:48 a.m.) > > > Review request for mesos, Benjamin Bannier, Jie Yu, and Jan Schlicht. > > > Bugs: MESOS-9281 > https://issues.apache.org/jira/browse/MESOS-9281 > > > Repository: mesos > > > Description > ------- > > Currently if a system crashes, SLRP checkpoints might not be synced to > the filesystem, so it is possible that an old or empty checkpoint will > be read upon recovery. Moreover, if a CSI call has been issued right > before the crash, the recovered state may be inconsistent with the > actual state reported by the plugin. For example, the plugin might have > created a volume but the checkpointed state does not know about it. > > To avoid this inconsistency, we always call fsync() when checkpointing > SLRP states. > > > Diffs > ----- > > src/resource_provider/storage/provider.cpp > db783b53558811081fb2671e005e8bbbd9edbede > src/slave/state.hpp 003211e4670c1092acb1634220d76bafd39e3a20 > > > Diff: https://reviews.apache.org/r/69010/diff/3/ > > > Testing > ------- > > make check > > > Thanks, > > Chun-Hung Hsiao > >