> On Dec. 27, 2016, 10:50 p.m., Joshua Cohen wrote: > > Overall this looks good to me. I think some work would be required to > > productionize it (e.g. make it optional to start). Obviously want to vet > > this in a test cluster and we would need some doc changes to go with this > > and RELEASE-NOTES.md, etc. but that's easy enough to take care of. > > > > I also want to be sure I understand the behavior in the event of a problem > > with one, or both replicated logs. So... scenarios: > > > > 1) One log or the other is missing/corrupt: our state becomes the sum of > > whatever's found in the log that's present/valid. > > 2) There's no chance of a snapshot being persisted that references a log > > position that hasn't been persisted to the operation log, right? E.g. > > there's no realistic way that appending the noop transaction could take > > longer than creating/persisting the snapshot (in which case the snapshot > > could be successfully persisted referencing that position, yet that > > position would not exist in the operation log). > > David McLaughlin wrote: > 1) It's exactly the same as today if the log is corrupt - you need to > restore from a backup. > 2) No, that is not possible. Log appends are synchronous. > > I'm -1 to making this optional, purely because that doesn't reduce the > risk in any way - you'd just be putting off adding flags to your start-up > script. But yeah, it means I'll have to vet this change in a production-like > environment before this gets committed.
I'd like to hear what others running Aurora in production think about putting this behind a flag. This feels like a big change to force on people and I think if this came from someone outside of our org we'd like the flexibility to enable this on our own timeframe rather than being blocked from picking up all updates until we were comfortable/ready. - Joshua ----------------------------------------------------------- This is an automatically generated e-mail. To reply, visit: https://reviews.apache.org/r/54883/#review160188 ----------------------------------------------------------- On Dec. 22, 2016, 9:26 a.m., David McLaughlin wrote: > > ----------------------------------------------------------- > This is an automatically generated e-mail. To reply, visit: > https://reviews.apache.org/r/54883/ > ----------------------------------------------------------- > > (Updated Dec. 22, 2016, 9:26 a.m.) > > > Review request for Aurora, Joshua Cohen, John Sirois, Santhosh Kumar > Shanmugham, and Zameer Manji. > > > Repository: aurora > > > Description > ------- > > See here for more details: > https://docs.google.com/document/d/1QVSEfeoCyt2D6cCmTCxy8-epufcuzIfnqRUkyT1betY/edit?usp=sharing > > The motivation for this patch is the realisation that if we store Snapshots > outside of the main transaction log, we only ever to need to hold the storage > lock during snapshot creation. This would reduce the time writes are blocked > during snapshots by 80% for a large production cluster. I'm also doing this > with a view to future work, which is described in the document above. > > As long as everything works fine, this change is as simple as adding two new > args to the scheduler JVM. But it's worth noting that once this change is > deployed and the first snapshot is written, you will no longer be able to > rollback without first restoring from a backup. > > > Diffs > ----- > > api/src/main/thrift/org/apache/aurora/gen/storage.thrift > 9e4213f13255a182df938bea44ca87fa03a25318 > examples/vagrant/aurorabuild.sh dbec54d3e677db8cb0f02849e5373e619629fc7c > examples/vagrant/upstart/aurora-scheduler.conf > e68a801017ae02e0ed581129e12a645ccc1e35d4 > src/main/java/org/apache/aurora/scheduler/app/SchedulerMain.java > 43cc5b4645bc26b0fc6b23726ad3292699048ded > src/main/java/org/apache/aurora/scheduler/log/Log.java > dc77eb435e5f8fdce56727a2f679e8e1907e977c > src/main/java/org/apache/aurora/scheduler/log/mesos/DualLogModule.java > PRE-CREATION > src/main/java/org/apache/aurora/scheduler/log/mesos/MesosLog.java > 21855e184fe20dc339713978b32344b6950701ec > > src/main/java/org/apache/aurora/scheduler/log/mesos/MesosLogStreamModule.java > 6704a328a4023a178ed8f86ae4772cb04eb2fa8e > src/main/java/org/apache/aurora/scheduler/storage/log/LogManager.java > cfa9c56c909bac2e885e33a9fe772cb41cbd9fa9 > src/main/java/org/apache/aurora/scheduler/storage/log/LogStorage.java > 387350c7667a5fb8ee674ad0d3dd17529232b25b > > src/main/java/org/apache/aurora/scheduler/storage/log/SnapshotDeduplicator.java > 9733ffe74b107f336858657550156ddb1f1dd215 > src/main/java/org/apache/aurora/scheduler/storage/log/StreamManager.java > ea147c0ba6aaa6d113144be0a8330bd2f73d2453 > > src/main/java/org/apache/aurora/scheduler/storage/log/StreamManagerImpl.java > baf2647c54f1f9918139584e5151873a3853e83c > src/test/java/org/apache/aurora/scheduler/app/SchedulerIT.java > 29a3b4a57925f31e59a49d4bfa630e724cadcb04 > src/test/java/org/apache/aurora/scheduler/log/mesos/MesosLogTest.java > f142f545799d64f9352b0ac6c51942eedf5e9ced > src/test/java/org/apache/aurora/scheduler/storage/log/LogManagerTest.java > 3f445595a81a5655c6c486791a9b55d8dc7f2f76 > src/test/java/org/apache/aurora/scheduler/storage/log/LogStorageTest.java > 34c24aac9339e59c435925bd2357ce86998f5f02 > > Diff: https://reviews.apache.org/r/54883/diff/ > > > Testing > ------- > > I currently have only tested this on one node (wrote this code on a flight). > Will have to verify this in our test clusters before shipping. > > > Example log output on my vagrant image: > > I1222 07:18:36.471 [qtp1117266705-37, LoggingInterceptor] snapshot() > I1222 07:18:36.479 [qtp1117266705-37, LogStorage] Creating snapshot at > position: Position 6 > I1222 07:18:36.485 [qtp1117266705-37, SnapshotStoreImpl] Saving dbsnapshot > I1222 07:18:36.591 [qtp1117266705-37, LogStorage] Write lock released. > I1222 07:18:36.591 [pool-12-thread-1, LogStorage] Persisting snapshot. > I1222 07:18:36.592 [pool-12-thread-1, > SnapshotDeduplicator$SnapshotDeduplicatorImpl] Starting deduplication of a > snapshot with 0 tasks. > I1222 07:18:36.595 [pool-12-thread-1, > SnapshotDeduplicator$SnapshotDeduplicatorImpl] Finished deduplicating > snapshot. Deduplication ratio: 0/0 = NaN%. > I1222 07:18:36.599218 4155 log.cpp:577] Attempting to append 6261 bytes to > the log > I1222 07:18:36.599321 4155 coordinator.cpp:348] Coordinator attempting to > write APPEND action at position 5 > I1222 07:18:36.599573 4160 replica.cpp:537] Replica received write request > for position 5 from (11)@192.168.33.7:8083 > I1222 07:18:36.600693 4160 leveldb.cpp:341] Persisting action (6280 bytes) > to leveldb took 1.061504ms > I1222 07:18:36.600733 4160 replica.cpp:712] Persisted action at 5 > I1222 07:18:36.600844 4160 replica.cpp:691] Replica received learned notice > for position 5 from @0.0.0.0:0 > I1222 07:18:36.601541 4160 leveldb.cpp:341] Persisting action (6282 bytes) > to leveldb took 619657ns > I1222 07:18:36.601569 4160 replica.cpp:712] Persisted action at 5 > I1222 07:18:36.601582 4160 replica.cpp:697] Replica learned APPEND action at > position 5 > I1222 07:18:36.603044 4160 log.cpp:596] Attempting to truncate the log to 5 > I1222 07:18:36.603173 4160 coordinator.cpp:348] Coordinator attempting to > write TRUNCATE action at position 6 > I1222 07:18:36.603330 4160 replica.cpp:537] Replica received write request > for position 6 from (12)@192.168.33.7:8083 > I1222 07:18:36.603890 4160 leveldb.cpp:341] Persisting action (16 bytes) to > leveldb took 502063ns > I1222 07:18:36.603919 4160 replica.cpp:712] Persisted action at 6 > I1222 07:18:36.603991 4160 replica.cpp:691] Replica received learned notice > for position 6 from @0.0.0.0:0 > I1222 07:18:36.604279 4160 leveldb.cpp:341] Persisting action (18 bytes) to > leveldb took 217834ns > I1222 07:18:36.604357 4160 leveldb.cpp:399] Deleting ~3 keys from leveldb > took 13629ns > I1222 07:18:36.604372 4160 replica.cpp:712] Persisted action at 6 > I1222 07:18:36.604383 4160 replica.cpp:697] Replica learned TRUNCATE action > at position 6 > I1222 07:18:36.604581 4160 log.cpp:596] Attempting to truncate the log to 6 > I1222 07:18:36.604682 4160 coordinator.cpp:348] Coordinator attempting to > write TRUNCATE action at position 7 > I1222 07:18:36.604778 4160 replica.cpp:537] Replica received write request > for position 7 from (13)@192.168.33.7:8083 > I1222 07:18:36.605317 4160 leveldb.cpp:341] Persisting action (16 bytes) to > leveldb took 485544ns > I1222 07:18:36.605356 4160 replica.cpp:712] Persisted action at 7 > I1222 07:18:36.605439 4160 replica.cpp:691] Replica received learned notice > for position 7 from @0.0.0.0:0 > I1222 07:18:36.605736 4160 leveldb.cpp:341] Persisting action (18 bytes) to > leveldb took 215503ns > I1222 07:18:36.605773 4160 leveldb.cpp:399] Deleting ~3 keys from leveldb > took 9620ns > I1222 07:18:36.605787 4160 replica.cpp:712] Persisted action at 7 > I1222 07:18:36.605798 4160 replica.cpp:697] Replica learned TRUNCATE action > at position 7 > I1222 07:18:36.605 [pool-12-thread-1, LogStorage] Snapshot complete. host > attrs: 1, cron jobs: 0, locks: 0, quota confs: 0, tasks: 0, updates: 0 > > > Thanks, > > David McLaughlin > >
