> On Oct. 12, 2017, 2:05 p.m., Stephan Erb wrote:
> > src/main/java/org/apache/aurora/scheduler/storage/mem/MemAttributeStore.java
> > Lines 43-52 (patched)
> > <https://reviews.apache.org/r/62869/diff/4/?file=1852507#file1852507line43>
> >
> >     With multiple accesses of `hostAttributes` this class is subject to 
> > race conditions. 
> >     
> >     Should we add `synchronized` to all public methods?

Historically, all stores have assumed the presence of a centralized storage 
write lock.  This is true (`LogStorage#writeLock`), but distant and apparently 
no longer documented.  Since this type of concurrency concern is sure to exist 
in all stores, i propose restoring the documentation in `Storage` for now.


> On Oct. 12, 2017, 2:05 p.m., Stephan Erb wrote:
> > src/main/java/org/apache/aurora/scheduler/storage/mem/MemJobUpdateStore.java
> > Lines 146-161 (patched)
> > <https://reviews.apache.org/r/62869/diff/4/?file=1852508#file1852508line146>
> >
> >     As indicated in the other comment, this will miss the case when 
> > something is in the LockStore but not reflected in the corresponding update.
> >     
> >     I think we should handle that case for additonal robustness.

I agree with you; the only thing holding me back is practical longevity of 
`LockStore`.  I'd like to remove it ASAP (its only purpose is to indirectly 
answer "Is this job updating?"); but tight coupling to `DbJobUpdateStore` 
severly complicated that.  Since i'm pulling that whole thread now, i'd like to 
avoid any kind of cross-store synchronization unless correctness is otherwise 
at risk.

With that said, `refreshLocks()` is used locally in what appear to be read 
paths, but ultimately part of write paths.  How would you feel about moving 
`fetchAllJobUpdateDetails()` and `getLockToken()` to the 
`JobUpdateStore.Mutable` interface, making them subject to the now-documented 
write lock guarantee?


> On Oct. 12, 2017, 2:05 p.m., Stephan Erb wrote:
> > src/main/java/org/apache/aurora/scheduler/storage/mem/MemJobUpdateStore.java
> > Lines 235-240 (patched)
> > <https://reviews.apache.org/r/62869/diff/4/?file=1852508#file1852508line235>
> >
> >     What happens for the case when `lockToken` is not present but have have 
> > a token in the LockStore? 
> >     
> >     It seems like `refreshLocks` does not handle that case either.

This is precisely where the coupling between `LockStore` and `JobUpdateStore` 
gets ugly - they both assume calling code is coordinating use of the stores in 
a specific way.  The implementation attempts to keep the stores in sync to 
prevent this case, but it's technically possible.  The same is true of the DB 
implementations.

However, this is tempting me to go back and take another stab at removing 
`LockStore` since carrying over the store coupling feels error-prone.


> On Oct. 12, 2017, 2:05 p.m., Stephan Erb wrote:
> > src/main/java/org/apache/aurora/scheduler/storage/mem/MemJobUpdateStore.java
> > Lines 345-360 (patched)
> > <https://reviews.apache.org/r/62869/diff/4/?file=1852508#file1852508line345>
> >
> >     This relies on the assumption that all `UpdateEvent`s and 
> > `InstanceUpdateEvent`s are sorted chronologically. 
> >     
> >     I think we should enforce this invariant in `saveJobUpdateEvent` and 
> > `saveJobInstanceUpdateEvent` to prevent surprises.

Good call, done.


- Bill


-----------------------------------------------------------
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/62869/#review187861
-----------------------------------------------------------


On Oct. 10, 2017, 12:35 p.m., Bill Farner wrote:
> 
> -----------------------------------------------------------
> This is an automatically generated e-mail. To reply, visit:
> https://reviews.apache.org/r/62869/
> -----------------------------------------------------------
> 
> (Updated Oct. 10, 2017, 12:35 p.m.)
> 
> 
> Review request for Aurora, Jordan Ly and Stephan Erb.
> 
> 
> Repository: aurora
> 
> 
> Description
> -------
> 
> This patch introduces map-based volatile stores, most of which were revived 
> from git history with minimal changes.  The DB storage system is now only 
> used in a temporary storage when replaying a snapshot containing the 
> `dbScript` field.
> 
> Please take special care to double-check my work in `SnapshotStoreImpl`, as 
> it must function for backwards compatibility with `Snapshot`s in the wild.  
> Another area of interest is the implementation of `MemJobUpdateStore`, which 
> must have nuanced behavior for compatibility with `DbJobUpdateStore`.  Most 
> of this stems from behind-the-scenes interaction with `LockStore` and use of 
> cascading deletes.
> 
> Note that this change removes the transactional nature of in-memory storage 
> operations as well as the `READ COMMITTED` transaction isolation previously 
> available to some stores (proven in necessary changes to 
> `StorageTransactionTest`).  This means some stores will permit dirty reads 
> when they previously did not.  `TaskStore` has always had this 
> non-transactional behavior by default, as the DB task store was never deemed 
> suitable for production.  Nonetheless, this non-transactional behavior should 
> be considered safe as the scheduler fails over on a storage operation 
> failure, and relies on the persistent log storage for transaction atomicity.
> 
> See the [mailing list 
> discussion](https://lists.apache.org/thread.html/0bb74c78da7fa12954e2438e8011b3bff73fe3bfdafeaaa2ff00a98e@%3Cdev.aurora.apache.org%3E)
>  for context.
> 
> 
> Diffs
> -----
> 
>   RELEASE-NOTES.md c58e68080beb1740d92639601ef7cc29c63be37e 
>   api/src/main/thrift/org/apache/aurora/gen/storage.thrift 
> 9ee9eff6c3d657decd93458dc3e6792a30614a60 
>   docs/reference/scheduler-configuration.md 
> 4e3f90713c307e3b9e9f84c29343af7f014f0165 
>   examples/vagrant/upstart/aurora-scheduler-kerberos.conf 
> 04e3d311c6845689cd3bd813aa8dcf9df55f1199 
>   examples/vagrant/upstart/aurora-scheduler.conf 
> 63fcc87be653835cb3c3f25dae4164f1d7c8d4da 
>   src/jmh/java/org/apache/aurora/benchmark/SchedulingBenchmarks.java 
> 7d3766829161457b1b3ba50bce128047d10b2c58 
>   src/jmh/java/org/apache/aurora/benchmark/SnapshotBenchmarks.java 
> ae59f3def75d0e1d3866c8d2f85063456643e9a6 
>   src/jmh/java/org/apache/aurora/benchmark/StateManagerBenchmarks.java 
> c293a9fd7c435e17dfc09f9ccc180154be8c7b2d 
>   src/jmh/java/org/apache/aurora/benchmark/StatusUpdateBenchmark.java 
> 45c2ab96d8fa65ab12ba710954128da987f0544b 
>   src/jmh/java/org/apache/aurora/benchmark/TaskStoreBenchmarks.java 
> f5e44dfd6d2c81e90859eaf24cf904a9662c1c72 
>   src/jmh/java/org/apache/aurora/benchmark/ThriftApiBenchmarks.java 
> 7b4050637433ab656fc09958bac1a050fec02bed 
>   src/jmh/java/org/apache/aurora/benchmark/UpdateStoreBenchmarks.java 
> cac02a55d26b2934099a2b03457c703600296292 
>   src/main/java/org/apache/aurora/scheduler/app/SchedulerMain.java 
> dd0e480c628f6bb44de0ce758c8af2bb54db1bc6 
>   src/main/java/org/apache/aurora/scheduler/config/CliOptions.java 
> 64733e5b38f238e64424127a53591e6546a3e9a8 
>   src/main/java/org/apache/aurora/scheduler/http/JettyServerModule.java 
> 93cd20adc28ed700719e472bb2331137a93d1d9d 
>   src/main/java/org/apache/aurora/scheduler/storage/JobUpdateStore.java 
> 52c3c6618a3cf1009435ca8a9cece36365913e55 
>   src/main/java/org/apache/aurora/scheduler/storage/Storage.java 
> 6c676693e531541ef9aec7f7a130c119ebf35df3 
>   src/main/java/org/apache/aurora/scheduler/storage/Util.java PRE-CREATION 
>   
> src/main/java/org/apache/aurora/scheduler/storage/backup/TemporaryStorage.java
>  bad05f5bf1976bb590e8dd7af9b43d5d30e892a9 
>   src/main/java/org/apache/aurora/scheduler/storage/db/DbJobUpdateStore.java 
> cbe5a0deff1ebc38a9618e7d89ab073dfaf78d36 
>   src/main/java/org/apache/aurora/scheduler/storage/db/DbModule.java 
> fb3dadb4b4c7036112e423b2bc1b2540d793b4d6 
>   src/main/java/org/apache/aurora/scheduler/storage/db/DbStorage.java 
> 7904e3880d214aac1013ce18265fed924ef7897e 
>   src/main/java/org/apache/aurora/scheduler/storage/db/DbUtil.java 
> 942d180fe1a8b7a583e812f9106ee0e8db1bea55 
>   src/main/java/org/apache/aurora/scheduler/storage/log/LogStorageModule.java 
> 6fdf31d6ed430f3dc0a0117442fc8b532aa40230 
>   
> src/main/java/org/apache/aurora/scheduler/storage/log/SnapshotStoreImpl.java 
> fdf9c33f80be3eff03f50732b8c4735cb9ff3b91 
>   
> src/main/java/org/apache/aurora/scheduler/storage/log/WriteAheadStorage.java 
> d145259653dd4df90e3877fc3e744e24c7a15d13 
>   
> src/main/java/org/apache/aurora/scheduler/storage/mem/InMemStoresModule.java 
> 24296b6c61e62a09c30f01dfb84448f73a5f1e44 
>   
> src/main/java/org/apache/aurora/scheduler/storage/mem/MemAttributeStore.java 
> PRE-CREATION 
>   
> src/main/java/org/apache/aurora/scheduler/storage/mem/MemJobUpdateStore.java 
> PRE-CREATION 
>   src/main/java/org/apache/aurora/scheduler/storage/mem/MemLockStore.java 
> PRE-CREATION 
>   src/main/java/org/apache/aurora/scheduler/storage/mem/MemQuotaStore.java 
> PRE-CREATION 
>   
> src/main/java/org/apache/aurora/scheduler/storage/mem/MemSchedulerStore.java 
> PRE-CREATION 
>   src/main/java/org/apache/aurora/scheduler/storage/mem/MemStorage.java 
> PRE-CREATION 
>   src/main/java/org/apache/aurora/scheduler/storage/mem/MemStorageModule.java 
> PRE-CREATION 
>   src/main/java/org/apache/aurora/scheduler/storage/mem/Util.java 
> c28fb65010af5e3db925487929d4e0e12b4101a4 
>   
> src/main/java/org/apache/aurora/scheduler/updater/JobUpdateControllerImpl.java
>  54202359382bfe39e7cbaec0bf4c7d65d10ca13b 
>   src/test/java/org/apache/aurora/scheduler/app/SchedulerIT.java 
> a363e70176881518653d0774e9d0c4be0f7f6d78 
>   src/test/java/org/apache/aurora/scheduler/app/local/LocalSchedulerMain.java 
> 3dee55a38752faea52c0c6ace90db8352ef9b2ea 
>   src/test/java/org/apache/aurora/scheduler/config/CommandLineTest.java 
> 9b4f2ad15ab5b61d4cccfad38ba48f17e7853425 
>   
> src/test/java/org/apache/aurora/scheduler/cron/quartz/AuroraCronJobTest.java 
> 2b817076d6beb09586d4711bc10bceb31f3b74e1 
>   src/test/java/org/apache/aurora/scheduler/cron/quartz/CronIT.java 
> 459d6bebcd7d6341dac2aead7e3dd8ce87bc9ed6 
>   
> src/test/java/org/apache/aurora/scheduler/cron/quartz/CronJobManagerImplTest.java
>  81440f5689f9538a4c7a9e6700bf03ca89c4ba85 
>   src/test/java/org/apache/aurora/scheduler/http/MaintenanceTest.java 
> f94b58b77b7c6ce824914af7e1147e73ad5a7eed 
>   
> src/test/java/org/apache/aurora/scheduler/scheduling/TaskSchedulerImplTest.java
>  09f443bca90f154b547d28ca5fd5be08177fdf99 
>   src/test/java/org/apache/aurora/scheduler/state/LockManagerImplTest.java 
> 19f9de312596395eff81bbfd073a5f617d2ef84c 
>   src/test/java/org/apache/aurora/scheduler/state/StateManagerImplTest.java 
> 24176fe089703386e8246f6add1965c111c136ed 
>   src/test/java/org/apache/aurora/scheduler/stats/ResourceCounterTest.java 
> d73656dc8fb8764c7a66eed3ea789554ce0ecc52 
>   src/test/java/org/apache/aurora/scheduler/storage/backup/RecoveryTest.java 
> 04ec7715ba9a9d47c99d4ccb92939f3e561e1a60 
>   
> src/test/java/org/apache/aurora/scheduler/storage/db/AttributeStoreTest.java 
> PRE-CREATION 
>   
> src/test/java/org/apache/aurora/scheduler/storage/db/DbAttributeStoreTest.java
>  f47f4a8a492fb43bacd909dc520256ed028531dd 
>   
> src/test/java/org/apache/aurora/scheduler/storage/db/DbCronJobStoreTest.java 
> b68298ab50a3621da8596947619f2d1c34d2d194 
>   
> src/test/java/org/apache/aurora/scheduler/storage/db/DbJobUpdateStoreTest.java
>  8fca54becde34a0d10d60e05d3809fc1c41233bf 
>   src/test/java/org/apache/aurora/scheduler/storage/db/DbLockStoreTest.java 
> 8ed58e01eea09ab9f1aa4d269a5e59ce9c9c2191 
>   src/test/java/org/apache/aurora/scheduler/storage/db/DbQuotaStoreTest.java 
> 275d0fd93819169550ce39c8ddfa6c4fa0e4d920 
>   
> src/test/java/org/apache/aurora/scheduler/storage/db/DbSchedulerStoreTest.java
>  a6320bf687aa41c698bdc6a2c0b5208624cff4a3 
>   src/test/java/org/apache/aurora/scheduler/storage/db/DbTaskStoreTest.java 
> bcf5be43ab69ef565bb68f9235b4c43d385898aa 
>   
> src/test/java/org/apache/aurora/scheduler/storage/db/JobUpdateStoreTest.java 
> PRE-CREATION 
>   src/test/java/org/apache/aurora/scheduler/storage/db/LockStoreTest.java 
> PRE-CREATION 
>   src/test/java/org/apache/aurora/scheduler/storage/db/QuotaStoreTest.java 
> PRE-CREATION 
>   
> src/test/java/org/apache/aurora/scheduler/storage/db/SchedulerStoreTest.java 
> PRE-CREATION 
>   
> src/test/java/org/apache/aurora/scheduler/storage/log/SnapshotStoreImplIT.java
>  b2c333eabbd3e3ae8f119170266ce6054b53fcae 
>   
> src/test/java/org/apache/aurora/scheduler/storage/mem/InMemTaskStoreTest.java 
> 84fa3ff48a0957760f2002ed026e72b882171109 
>   
> src/test/java/org/apache/aurora/scheduler/storage/mem/MemAttributeStoreTest.java
>  PRE-CREATION 
>   
> src/test/java/org/apache/aurora/scheduler/storage/mem/MemCronJobStoreTest.java
>  91e591f3c84489386a7c20d33278ba691b3ae7ba 
>   
> src/test/java/org/apache/aurora/scheduler/storage/mem/MemJobUpdateStoreTest.java
>  PRE-CREATION 
>   src/test/java/org/apache/aurora/scheduler/storage/mem/MemLockStoreTest.java 
> PRE-CREATION 
>   
> src/test/java/org/apache/aurora/scheduler/storage/mem/MemQuotaStoreTest.java 
> PRE-CREATION 
>   src/test/java/org/apache/aurora/scheduler/storage/mem/MemStorageTest.java 
> PRE-CREATION 
>   
> src/test/java/org/apache/aurora/scheduler/storage/mem/StorageTransactionTest.java
>  25f34e2bc26c6d4754c1591fad7f2165dd465d32 
>   src/test/java/org/apache/aurora/scheduler/thrift/ThriftIT.java 
> 2cd19d572d2945e28630b9bce5ba58eb9753630a 
>   src/test/java/org/apache/aurora/scheduler/updater/JobUpdaterIT.java 
> 453366ef4e4b6db45643b1927887771dbd795bf9 
> 
> 
> Diff: https://reviews.apache.org/r/62869/diff/4/
> 
> 
> Testing
> -------
> 
> Still need to complete (but should not block review):
> 
> * Run `./src/test/sh/org/apache/aurora/e2e/test_end_to_end.sh` from scratch 
> at this patch.
> * Run `./src/test/sh/org/apache/aurora/e2e/test_end_to_end.sh` on master, 
> then this patch, then on master.  This verifies some level of replay 
> compatibility between versions.
> 
> *Most relevant benchmark*
> 
> On master:
> ```
> Benchmark                                             (instanceOverrides)  
> (instances)  (metadata)   Mode  Cnt    Score    Error  Units
> UpdateStoreBenchmarks.JobDetailsBenchmark.run                         N/A     
>     1000         N/A  thrpt    5  696.186 ± 67.296  ops/s
> UpdateStoreBenchmarks.JobDetailsBenchmark.run                         N/A     
>     5000         N/A  thrpt    5  160.567 ± 22.365  ops/s
> UpdateStoreBenchmarks.JobDetailsBenchmark.run                         N/A     
>    10000         N/A  thrpt    5   82.897 ±  5.384  ops/s
> UpdateStoreBenchmarks.JobInstructionsBenchmark.run                      1     
>      N/A         N/A  thrpt    5  166.567 ± 20.929  ops/s
> UpdateStoreBenchmarks.JobInstructionsBenchmark.run                     10     
>      N/A         N/A  thrpt    5  161.149 ±  6.016  ops/s
> UpdateStoreBenchmarks.JobInstructionsBenchmark.run                    100     
>      N/A         N/A  thrpt    5  141.993 ±  7.119  ops/s
> UpdateStoreBenchmarks.JobInstructionsBenchmark.run                   1000     
>      N/A         N/A  thrpt    5   65.481 ±  0.861  ops/s
> UpdateStoreBenchmarks.JobUpdateMetadataBenchmark.run                  N/A     
>      N/A          10  thrpt    5  167.387 ± 10.009  ops/s
> UpdateStoreBenchmarks.JobUpdateMetadataBenchmark.run                  N/A     
>      N/A         100  thrpt    5  153.032 ± 19.067  ops/s
> UpdateStoreBenchmarks.JobUpdateMetadataBenchmark.run                  N/A     
>      N/A        1000  thrpt    5  140.656 ± 16.893  ops/s
> UpdateStoreBenchmarks.JobUpdateMetadataBenchmark.run                  N/A     
>      N/A       10000  thrpt    5   45.039 ±  4.840  ops/s
> ```
> 
> With this patch:
> ```
> Benchmark                                             (instanceOverrides)  
> (instances)  (metadata)   Mode  Cnt         Score          Error  Units
> UpdateStoreBenchmarks.JobDetailsBenchmark.run                         N/A     
>     1000         N/A  thrpt    5  43700548.771 ±  4491654.631  ops/s
> UpdateStoreBenchmarks.JobDetailsBenchmark.run                         N/A     
>     5000         N/A  thrpt    5  44234835.787 ±  6078260.236  ops/s
> UpdateStoreBenchmarks.JobDetailsBenchmark.run                         N/A     
>    10000         N/A  thrpt    5  43362266.895 ±  7546660.236  ops/s
> UpdateStoreBenchmarks.JobInstructionsBenchmark.run                      1     
>      N/A         N/A  thrpt    5  45796640.959 ±  3698698.216  ops/s
> UpdateStoreBenchmarks.JobInstructionsBenchmark.run                     10     
>      N/A         N/A  thrpt    5  46402021.197 ±  4800377.641  ops/s
> UpdateStoreBenchmarks.JobInstructionsBenchmark.run                    100     
>      N/A         N/A  thrpt    5  44416822.563 ±  9591398.027  ops/s
> UpdateStoreBenchmarks.JobInstructionsBenchmark.run                   1000     
>      N/A         N/A  thrpt    5  44004609.998 ± 20332063.977  ops/s
> UpdateStoreBenchmarks.JobUpdateMetadataBenchmark.run                  N/A     
>      N/A          10  thrpt    5  47403264.761 ±  4787575.678  ops/s
> UpdateStoreBenchmarks.JobUpdateMetadataBenchmark.run                  N/A     
>      N/A         100  thrpt    5  45629630.921 ±  7654875.770  ops/s
> UpdateStoreBenchmarks.JobUpdateMetadataBenchmark.run                  N/A     
>      N/A        1000  thrpt    5  44150253.103 ±  9376206.202  ops/s
> UpdateStoreBenchmarks.JobUpdateMetadataBenchmark.run                  N/A     
>      N/A       10000  thrpt    5  44215602.085 ±  8326237.269  ops/s
> ```
> 
> 
> Thanks,
> 
> Bill Farner
> 
>

Reply via email to