I wrote a response inline.

On Mon, Apr 24, 2017 at 3:12 PM, Jeff Ortel <jor...@redhat.com> wrote:
> On 04/21/2017 01:16 PM, Austin Macdonald wrote:
> > I think we have arrived at a consensus around these points:
>
> +1
>
> >   * Django's ModelForm validations that are used during `full_clean` are an anti-pattern in DjangoRestFramework and should not be used
> >   * Validation, by our definition, will be done by the serializers in the API layer
> >   * The Task layer should run business logic checks at the time of use
> >
> > We have not arrived at a consensus on whether Updates/Deletes for Importers/Publishers will be synchronous or asynchronous. I've tried to think through each approach and what it would need in order to work.
> >
> > *Synchronous Update and Delete*
> >
> > _Validation Heuristic_
> > The serializers validate in the API layer only, and it is desirable to have that validation occur at the time of the write.
> >
> > _Concurrent Writes:_
> > Let's assume synchronous Updates and Deletes. There is a possibility of concurrent writes because sync/publish tasks write to the importer/publisher tables respectively, and these tables can also be written by the Update API at any time. If there is a collision, the last write wins.
> >
> > Workaround:
> > Concurrent writes may not be a big deal because they should not affect each other. In platform, sync and publish only write to the last_sync/last_publish fields, which should never be written by the update API. If the writes are restricted to only the affected fields, then there would be no clobbering. As long as plugin writers' implementations of sync and publish do not write to the same fields allowed in an API update, this concurrency is safe. This would obviously need to be clearly documented for plugin writers.
> >
> > _Dealing With Broken Configurations_
> > For example, while a sync task is WAITING, the user updates an importer in a way that breaks sync. The importer configuration should be read only once, at the start of a sync. If this configuration has been deleted or is not syncable, business logic checks catch this and fail immediately. Even if the configuration is changed during the sync task, the sync task will continue with the original configuration. The sync task will update last_sync, but will not write to other fields. I don't see a problem here, as long as this is documented for the plugin writer.
> >
> > _Summary_:
> > I don't see any concurrency problems here. The downside of this design is that sync and publish (implemented by plugin writers) should not write to the same fields as API updates.
> >
> > *Asynchronous Update and Delete*
> >
> > _Validation Heuristic_
> > Assuming async updates, the data comes into the API layer where it is validated by the serializers. The update reserves a task (reservation is based on repository) and waits to be performed. It is usually undesirable to separate validation from use. There could be concern that data validated in the API layer could become unacceptable by the time the Task layer actually performs the write.
> >
> > Practically, this separation may not be an issue. Valid data becoming invalid involves validations that are dependent on state. We can split these validations into two categories: uniqueness validations and multi-field validations. Fortunately, the Django ORM will enforce uniqueness validation, so the database is protected, but we might have to handle write failures.
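
To make "we might have to handle write failures" a little more concrete: since uniqueness is ultimately enforced by the database, the task code would need to catch the resulting error and fail the task rather than clobber anything. A rough sketch of what that could look like (the update_importer function and the import path in the comment are hypothetical, not actual Pulp 3 code):

    from django.db import IntegrityError, transaction

    # Hypothetical sketch. Assumes an Importer model with a unique field
    # lives somewhere importable, e.g.:
    # from pulp.app.models import Importer   # <- assumed path, adjust as needed

    def update_importer(importer, validated_data):
        """Apply serializer-validated data inside the task."""
        for field, value in validated_data.items():
            setattr(importer, field, value)
        try:
            with transaction.atomic():
                importer.save()
        except IntegrityError as exc:
            # The data passed serializer validation in the API layer, but by
            # the time the task ran, another write claimed the unique value.
            # Fail the task loudly instead of silently losing the update.
            raise RuntimeError("importer update failed at write time") from exc
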
> > Multi-field validations are probably not validations by our definition, but are business logic, which could be checked in the Task layer.
> >
> > _Concurrent write prevention_
> > Leveraging the reserved resource system, concurrent writes would not be a problem because writes could never be concurrent at all. Syncs, Updates, Deletes, and Publishes all make reservations based on the repository (this is an assumption based on what we do in Pulp 2), so each would run in the order it was called.
> >
> > _User Controlled Order_
> > A user can issue a [sync][update][sync] and be confident that the first sync will use the original importer configuration, and the second sync will use the updated configuration.
>
> This is not deterministic (from the user's point of view) even with task reservation. You still have a race condition queuing tasks. In the following example, the update made by user B will win and the update made by user A will not be used for the 2nd sync. By making this async, it seems all we have done is move the race condition from the DB to the tasking queue.
>
> queue
> -------
> 1. user-A [sync]
> 2. user-A [update]
> 3. user-B [update] (winner)
> 4. user-A [sync]

I'm not concerned about ^ because eventually Pulp3 will get authZ features which will allow administrators to prevent multi-user write-write races. Two users competing to write shared data is a normal multi-user-system problem. Also, in the user A / user B update example, both the synchronous and asynchronous approaches would produce the same outcome, so I don't think this point motivates either one as being better.

> > _Uncertain State_
> > The configuration of an importer is unpredictable if there is a waiting update task from another user.
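
Circling back to the synchronous "Workaround" section above: restricting the sync/publish write to its own fields maps pretty directly onto Django's update_fields argument to save(). A rough sketch (the function and field names come from the discussion above and are only illustrative, not actual Pulp 3 code):

    from django.utils import timezone

    # Illustrative only: if the sync task saves with update_fields, a
    # concurrent API update that touched other columns is never clobbered,
    # even if our in-memory copy of the importer is stale.
    def record_sync_finished(importer):
        importer.last_sync = timezone.now()
        # Write ONLY last_sync; every other column keeps whatever value the
        # API update gave it in the meantime.
        importer.save(update_fields=['last_sync'])

The API update side would do the mirror image (never treat last_sync/last_publish as writable), which is essentially the documentation requirement for plugin writers mentioned above.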
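
And on "Dealing With Broken Configurations" plus the consensus that the Task layer runs business logic checks at the time of use: the fail-fast check could live at the top of the sync task, something like the following (again just a sketch; the feed_url attribute and the exception type are placeholders, not real Pulp 3 names):

    # Sketch only: read the configuration exactly once at the start of the
    # task and run the business-logic check there, so a later API update or
    # delete cannot break a sync that is already running.
    def sync(importer):
        if importer is None:
            raise RuntimeError("importer was deleted before the sync task ran")
        if not importer.feed_url:  # placeholder for "is this config syncable?"
            raise RuntimeError("importer configuration is not syncable")

        # Snapshot the configuration once; the plugin's sync logic should use
        # only this copy from here on.
        config = {field.name: getattr(importer, field.name)
                  for field in importer._meta.fields}
        # ... plugin sync logic goes here, reading from `config` ...
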
_______________________________________________
Pulp-dev mailing list
Pulp-dev@redhat.com
https://www.redhat.com/mailman/listinfo/pulp-dev