----------------------------------------------------------- This is an automatically generated e-mail. To reply, visit: https://reviews.apache.org/r/54774/#review159305 -----------------------------------------------------------
I think there's some relevant context in the original review for this feature: https://reviews.apache.org/r/44471/. There was concern in that review with the removal of the native fields from snapshots for the reasons you mentioned in the associated ticket. I think that's still a concern, and there's the possibility that tooling exists outside of Aurora that depends on these fields existing in the snapshot. I'm generally in favor of the changes you've outlined, as they should dramatically improve snapshot performance, however perhaps it makes sense to put this behind a command line flag? Maybe a single flag that allows for control over which snapshot fields are emitted in addition to the dbScript (this would require adding something like `getName` to `SnapshotField`). I'd also be fine with providing some tooling that converts the dbScript in a snapshot to a fully hydrated thrift snapshot. Either way, I don't think we should ship this without some accommodation for that use case. Maybe start with the flag (as it's much easier to add), and follow up with the additional tooling at a later date (at which point we could kill the flag and all of the `SnapshotField`s that are already encompassed by the `dbScript`.) src/main/java/org/apache/aurora/scheduler/storage/log/SnapshotStoreImpl.java (lines 315 - 317) <https://reviews.apache.org/r/54774/#comment230279> Calling `hasDbSnapshot` has nothing to do with whether or not there's a memory store. It has to do with whether or not there's a `dbScript` in the snapshot. That's a relatively recent addition to snapshots. Prior to that we fetched everything from H2 and dumped it to the snapshot as individual thrift entries. - Joshua Cohen On Dec. 15, 2016, 7:57 a.m., David McLaughlin wrote: > > ----------------------------------------------------------- > This is an automatically generated e-mail. To reply, visit: > https://reviews.apache.org/r/54774/ > ----------------------------------------------------------- > > (Updated Dec. 15, 2016, 7:57 a.m.) > > > Review request for Aurora, Joshua Cohen, Santhosh Kumar Shanmugham, and > Zameer Manji. > > > Bugs: AURORA-1861 > https://issues.apache.org/jira/browse/AURORA-1861 > > > Repository: aurora > > > Description > ------- > > Motivation: Thanks to the mybatis query metrics we added, we found that the > redundant query I'm removing here currently adds 20s to the snapshot creation > time. This is an extra 20s unavailability every time we snapshot on our prod > clusters. > > === > > Avoid double writing job updates to the Scheduler Snapshot. This should be > low-risk because: > > 1) Job Updates have *never* had an in-memory store implementation. > 2) They are also relatively new, so the chances of some old code processing > job updates from backups is fairly low? > > > Diffs > ----- > > > src/main/java/org/apache/aurora/scheduler/storage/log/SnapshotStoreImpl.java > d2c859055f905ac76ee8eb387dca103b9857ddbe > src/test/java/org/apache/aurora/scheduler/storage/backup/RecoveryTest.java > 7a11850e217dcb0148e4a4d33542c95b2e53a726 > > src/test/java/org/apache/aurora/scheduler/storage/log/SnapshotStoreImplIT.java > cf0a8f3ea11e9c48d1f16441af54dc781b33bdfc > > Diff: https://reviews.apache.org/r/54774/diff/ > > > Testing > ------- > > ./gradlew test > > I'll apply this to one of our test clusters before merging to master too. > > > Thanks, > > David McLaughlin > >
