I'm supportive of this plan in general, but I agree with Bill that the proposal is lacking in detail. Specifically, I think it would be good to document more concretely what the proposed changes to Mesos and Aurora actually entail. Right now it's a little bit hand-wavy ;).

On Wed, Jul 26, 2017 at 2:03 PM, Bill Farner <wfar...@apache.org> wrote:

> Some (hopefully) constructive criticism:
>
> - the doc is very high-level on the problem statement and the proposal,
> making it difficult to agree with prioritization over cheaper snapshots or
> the oft-discussed support of an external DBMS.
>
> - the supporting data is a single data point of the
> scheduler_log_recover_nanos_total metric. More data points and more detail
> on this data (how many entries/bytes did this represent?) would help
> normalize the metric, and possibly indicate whether recover time is linear
> or non-linear. Finer-grained information would also help (where was time
> spent within the replay - GC? reading log entries? inflating snapshots?).
>
> - the doc calls out parts (1) mesos log support and (2) scheduler support.
> Is the planned approach to gain value from (1) before (2), or are both
> needed?
>
> - for (2) scheduler support, can you add detail on the implementation?
> Much of the scheduler code assumes it is the leader
> (CallOrderEnforcingStorage is currently a gatekeeper to avoid mistakes of
> this type), so i would caution against replaying directly into the main
> Storage.
>
>
> On Wed, Jul 26, 2017 at 1:56 PM, Santhosh Kumar Shanmugham
> <sshanmug...@twitter.com.invalid> wrote:
>
> > +1
> >
> > This sets up the stage for more potential benefits by offloading work from
> > the leading scheduler that consumes stable data (that is not affected by
> > minor inconsistencies).
> >
> > On Wed, Jul 26, 2017 at 10:31 AM, David McLaughlin
> > <dmclaugh...@apache.org> wrote:
> >
> > > I'm +1 to this approach over my proposal. With the enforced daily
> > > failover, it's a much bigger win to make failovers "cheap" than making
> > > snapshots cheap, and this is going to be backwards compatible too.
> > >
> > > On Wed, Jul 26, 2017 at 9:51 AM, Jordan Ly <jordan....@gmail.com> wrote:
> > >
> > > > Hello everyone!
> > > >
> > > > I've created a document with an initial proposal to reduce leader
> > > > failover time by eagerly reading and replaying the replicated log in
> > > > followers:
> > > >
> > > > https://docs.google.com/document/d/10SYOq0ehLMFKQ9rX2TGC_xpM--GBnstzMFP-tXGQaVI/edit?usp=sharing
> > > >
> > > > We wanted to open up this topic for discussion with the community and
> > > > see if anyone had any alternate opinions or recommendations before
> > > > starting the work.
> > > >
> > > > If this solution seems reasonable, we will write and release a design
> > > > document for a more formal discussion and review.
> > > >
> > > > Please feel free to comment on the doc, or let me know if you have any
> > > > concerns.
> > > >
> > > > -Jordan