Just to set expectations straight on my side, I won't be able to spend time on this until next week. I am planning to do a live cluster restore to better understand and document all findings.
To reiterate what I mentioned above, I don't think this should be a blocker for the release. Restore-from-backup is a very environment-sensitive procedure, and our instructions should be treated as general guidance rather than a precise set of steps to follow.

On Fri, Feb 5, 2016 at 8:10 AM, John Sirois <jsir...@apache.org> wrote:

> On Wed, Feb 3, 2016 at 10:58 AM, John Sirois <jsir...@apache.org> wrote:
>
> > On Tue, Feb 2, 2016 at 10:22 AM, Maxim Khutornenko <ma...@apache.org>
> > wrote:
> >
> >> +1 to having 1603 and 1601 as blockers. I am planning to work on 1603
> >> today.
> >>
> >> As for 1605, I don't believe it's a blocker given that all findings are
> >> already documented in the ticket.
> >
> > I went through a recovery using the guide and hit issues that don't square
> > with the description of corrections described in AURORA-1605 nor the new
> > `--bypass-leader-redirect` capability introduced to aurora_admin in
> > AURORA-1601.
> > I suspect this can be explained by me not knowing what I'm doing! That
> > said, unless I'm being especially dumb here, neither will the 1st time
> > restorer.
> >
> > I'll wait for you to close out AURORA-1603 to signal an OK on the
> > technical issue that necessitated the restore in the 1st place and I'd like
> > to block on some feedback on my experience restoring documented in
> > AURORA-1605 before making up my mind on AURORA-1605 being a release
> > blocker. It does seem to me we should have useable restore docs as a high
> > priority, but if they've been broken in large ways for some time, I might
> > be convinced that AURORA-1605 is a valid 0.13.0 release blocker but not
> > 0.12.0.
>
> Alright - Maxim has closed out AURORA-1603 and only AURORA-1605 remains.
> I'd still like to block on that if someone can devote some time in the next
> 2 business days to running through the docs and correcting / reviewing the
> issues I had with the docs as noted in the issue.
> If I have no feedback on the status of AURORA-1605 by the morning (MST) of
> Monday February 8th, I'll take that as a silent disapproval of the block and
> proceed to cut 0.12.0-rc3.
>
> >> On Tue, Feb 2, 2016 at 7:03 AM, Joshua Cohen <jco...@apache.org> wrote:
> >>
> >> > I'd only consider item 1 to be a blocker to 0.12.0, but 2 and 3 should be
> >> > relatively quick so in general this sounds like a reasonable plan of
> >> > action to me.
> >> >
> >> > On Tue, Feb 2, 2016 at 8:52 AM, John Sirois <jsir...@apache.org> wrote:
> >> >
> >> > > Although the last blocker raised for the 0.12.0 RC series has been
> >> > > resolved [1], it looks like resolution of several issues related to
> >> > > rolling back to 0.11.0 are required to cut the next RC:
> >> > > 1. "Scheduler fails to start after rollback":
> >> > > https://issues.apache.org/jira/browse/AURORA-1603
> >> > > 2. "Add a flag to disable the HTTP redirect to the leader":
> >> > > https://issues.apache.org/jira/browse/AURORA-1601
> >> > > 3. "Update recovery docs to reflect changes":
> >> > > https://issues.apache.org/jira/browse/AURORA-1605
> >> > >
> >> > > These issues fall into 2 classes:
> >> > > Item 1 above needs to fix the immediate problem of rolling back to
> >> > > 0.11.0; although there may be more changes to process, tooling and
> >> > > code to support the problem better going forward.
> >> > > Items 2 & 3 address tooling & procedure that support rollback.
> >> > >
> >> > > It looks like Maxim has claimed item 1/AURORA-1603 and Joshua is
> >> > > working item 2/AURORA-1601. I assume one of Maxim, Joshua or Zameer
> >> > > will tackle item 3/AURORA-1605 to update rollback docs with what they
> >> > > learned rolling back.
> >> > >
> >> > > If I have any of this wrong, please speak up; otherwise I'll be
> >> > > cutting the next 0.12.0 RC3 when the above 3 issues are resolved.
> >> > >
> >> > > [1] "Identity.role is still used in the UI leading to duplicate
> >> > > instances on job page":
> >> > > https://issues.apache.org/jira/browse/AURORA-1604