On Sun, Jul 21, 2019 at 6:23 AM Tatiana Tereshchenko <ttere...@redhat.com> wrote:
> +1 to the idea of a repo_key. > > Should we also add the ability to apply custom validation of the content > being added? > Similar to a repo_key, Content model can optionally provide an additional > validator. > Use cases: > - for pulp_file to avoid relative path overlap - e.g. 'a/b' and 'a' > In thinking this over more, I'm unsure that pulp_file has the use case. Two different Artifacts having relative paths 'a' and 'a/b' in one repo version doesn't seem problematic. This problem statement is similar to the Distribution.base_path overlap problem statement where it's unavoidably ambiguous which Distribution should be matched when base_paths are allowed to overlap. In this case for pulp_file, it's not ambiguous in the same way, the relative_path I expect to match to exactly 1 content unit either 'a', or 'a/b', but not both. What do you think about this? - for pulp_rpm to filter by signature/signing key > Can we expand on this use case a bit? Is it that the repo version should only contains units signed or unsigned rpms? Or is it that we are ok with a mixture as long as each NEVRA is unique? I suspect the former, but I want to be sure. > Plugins can solve it by defining their own stage but it seems like almost > any plugin needs to ensure absence of collisions specific to it, even the > simple pulp_file. > It means that our default pipeline becomes less useful and will be hardly > ever used by any [currently known] plugins. > > Any thoughts? > > Tanya > > > On Mon, Jul 8, 2019 at 9:09 PM Brian Bouterse <bbout...@redhat.com> wrote: > >> I want to retell Simon's proposal to have "Content defines a 'repo_key' >> similar to a unit_key. This key must be unique within a repo version (and >> not globally like the unit_key." >> >> We could adopt his proposal to have the repo_key tuple defined on Content >> in pulpcore. If we left the add/remove APIs in core and adopt for both sync >> and add/remove a "keep newest to associate" functionality described earlier >> in the thread. This "keep newest to associate" code would be used by sync >> in the form of a core stage that is a generalized version of the >> RemoveDuplicates stage. This would become part of the default pipeline for >> all users of Stages API. I think this would be better than plugin writers >> implementing it over and over and also less effort for plugin wrtiers. This >> design would meet the current needs of pulp_cookbook, pulp_file, and >> pulp_rpm which are the only 3 places I know we have this problem so far, >> but I believe more content types are susceptible to this. >> >> What do you think we should do? >> >> Thanks! >> Brian >> >> >> >> >> >> On Thu, Jun 27, 2019 at 4:03 AM Tatiana Tereshchenko <ttere...@redhat.com> >> wrote: >> >>> Sure, the code can be de-duplicated. >>> My main worry is that it's a responsibility of a plugin writer not to >>> forget to ensure uniqueness constraints within a repo version for every >>> workflow (sync, copy, anything else) where a repo version is created. >>> Every time before RepositoryVersion.create() is called, there should be >>> a check that there is no colliding content in a repo version. >>> It would be much more reliable and friendly, in my opinion, if plugin >>> writer could define rules/callbacks/whatever for each content and it would >>> be applied to any repository creation. >>> At the same time this eliminates the flexibility to define different >>> logic for content collision for different workflows, however I'm not sure >>> if such a use case exists or is desired. >>> >>> Tanya >>> >>> >>> On Wed, Jun 26, 2019 at 6:49 PM Austin Macdonald <amacd...@redhat.com> >>> wrote: >>> >>>> @Tanya Tereshchenko <ttere...@redhat.com> >>>> >>>>> Do I understand correctly that it doesn't cover the sync case and it's >>>>> only about explicit repo version creation? >>>>> >>>> >>>> I don't mean that add/remove could not share code with remove duplicate >>>> stage. I wanted to point out that we have a problem here (how to remove >>>> duplicates) that has similar patterns to other problems with add remove >>>> (recursive, copy, deciding which content to keep with a collision, etc.) I >>>> don't doubt that pulpcore could help solve these problems, but I think that >>>> as we approach our GA, we should consider solving this problem (for now) by >>>> getting out of the way of plugin writers rather than by implementing code >>>> that is supposed to work for all plugins. I suspect that plenty of the >>>> plugins will be implementing their own add/remove anyway. >>>> >>>> On Tue, Jun 25, 2019 at 12:56 PM David Davis <davidda...@redhat.com> >>>> wrote: >>>> >>>>> I don't think this solution would work in the case of creating a new >>>>> repository version. Suppose for example you had two content units that >>>>> collide, one in a repo version and one older unit that a user explicitly >>>>> wants to add to the repo version. If the latter one is older, then what >>>>> would happen? >>>>> >>>>> David >>>>> >>>>> >>>>> On Tue, Jun 25, 2019 at 12:48 PM Brian Bouterse <bbout...@redhat.com> >>>>> wrote: >>>>> >>>>>> Having a way for units to express their uniqueness per repo sounds >>>>>> good because then more areas of Pulp's code could answer the question: >>>>>> "will I have a duplicate if I add content X to repo_version Y". >>>>>> >>>>>> Let's assume we know that situation is about to occur during sync for >>>>>> example, what do we do about it? In the errata case we know the "new" one >>>>>> should replace the existing one. Maybe we start to 'order' the units with >>>>>> colliding repo keys and keep the newest one always? Would this work for >>>>>> pulp_cookbook and pulp_rpm? Would it generalize? Is this what you >>>>>> imagined? >>>>>> >>>>>> On Tue, Jun 25, 2019 at 5:30 AM Tatiana Tereshchenko < >>>>>> ttere...@redhat.com> wrote: >>>>>> >>>>>>> Do I understand correctly that it doesn't cover the sync case and >>>>>>> it's only about explicit repo version creation? >>>>>>> So the suggestion is to implement the same logic twice: for sync >>>>>>> case - RemoveDuplicates stage and/or maybe some custom stage (e.g. to >>>>>>> disallow overlapping paths), and for direct repo version creation - your >>>>>>> proposal. >>>>>>> >>>>>>> >>>>>>> >>>>>>> On Mon, Jun 24, 2019 at 3:13 PM Austin Macdonald < >>>>>>> amacd...@redhat.com> wrote: >>>>>>> >>>>>>>> I have a design in mind for solving this problem: >>>>>>>> >>>>>>>> 1. Remove POST to RepositoryVersion (no general add/remove >>>>>>>> endpoint). >>>>>>>> 2. Add an endpoint to kick off an add/remove task, namespaced by >>>>>>>> plugin. ie `POST pulp/api/v3/docker/add-remove/` >>>>>>>> This view can be provided to all plugins by the plugin template, >>>>>>>> and will be based on the current RepositoryVersionCreate: >>>>>>>> >>>>>>>> https://github.com/pulp/pulpcore/blob/master/pulpcore/app/viewsets/repository.py#L221-L258 >>>>>>>> Note: the main purpose of this view is to kick off the general >>>>>>>> add/remove task, which will be unchanged: >>>>>>>> >>>>>>>> https://github.com/pulp/pulpcore/blob/master/pulpcore/app/tasks/repository.py#L70 >>>>>>>> 3. Add an add/remove serializer to the plugin API. >>>>>>>> 3. Plugins needing further customization can provide their own task >>>>>>>> and subclassed serializer. >>>>>>>> >>>>>>>> This gives the plugin writer full control over the endpoint >>>>>>>> (customizable arguments and validation), and full control over the flow >>>>>>>> (extra logic, depsolving, enforced uniqueness). It only uses the >>>>>>>> existing >>>>>>>> patterns (and existing required knowledge), but requires no work (other >>>>>>>> than using the template) for the simple case. >>>>>>>> >>>>>>>> On Mon, Jun 3, 2019 at 2:56 PM Simon Baatz <gmbno...@gmail.com> >>>>>>>> wrote: >>>>>>>> >>>>>>>>> On Mon, Jun 03, 2019 at 09:11:07AM -0400, David Davis wrote: >>>>>>>>> > @Simon I like the idea behind the repo_key solution you came >>>>>>>>> up with. >>>>>>>>> > Can you be more specific around cases you think that it >>>>>>>>> couldn't >>>>>>>>> > handle? I imagine that plugin writers could use properties or >>>>>>>>> > denormailzation (ie additional database columns) to solve >>>>>>>>> cases where >>>>>>>>> > they need uniqueness across data that isn't in the database. >>>>>>>>> In a worst >>>>>>>>> > case scenario, they can't use the pulpcore solution and just >>>>>>>>> have to >>>>>>>>> > roll their own. >>>>>>>>> >>>>>>>>> >>>>>>>>> What I wrote probably sounded too pessimistic. You are right, in >>>>>>>>> most cases that should be doable. >>>>>>>>> >>>>>>>>> I agree that we could have a simple default solution that just >>>>>>>>> requires to specify a couple of field names in the easiest case. >>>>>>>>> As you >>>>>>>>> say, it should be possible use custom logic in a plugin if >>>>>>>>> required. >>>>>>>>> >>>>>>>>> Here is the case I was thinking of that it can't handle: >>>>>>>>> >>>>>>>>> In pulp_file, a uniqueness constraint on "relative_path" would >>>>>>>>> allow >>>>>>>>> content units "a" and "a/b" to be in a repo version. >>>>>>>>> >>>>>>>>> However, we may want file repos to be representable on an actual >>>>>>>>> file >>>>>>>>> system (e.g. when exporting them as tar files). For the repo >>>>>>>>> above, >>>>>>>>> this does not work, as "a" can't be a file and a directory at the >>>>>>>>> same time on a standard Unix file system. >>>>>>>>> >>>>>>>>> >>>>>>>>> _______________________________________________ >>>>>>>>> Pulp-dev mailing list >>>>>>>>> Pulp-dev@redhat.com >>>>>>>>> https://www.redhat.com/mailman/listinfo/pulp-dev >>>>>>>>> >>>>>>>> _______________________________________________ >>>>>>>> Pulp-dev mailing list >>>>>>>> Pulp-dev@redhat.com >>>>>>>> https://www.redhat.com/mailman/listinfo/pulp-dev >>>>>>>> >>>>>>> _______________________________________________ >>>>>>> Pulp-dev mailing list >>>>>>> Pulp-dev@redhat.com >>>>>>> https://www.redhat.com/mailman/listinfo/pulp-dev >>>>>>> >>>>>> _______________________________________________ >>>>>> Pulp-dev mailing list >>>>>> Pulp-dev@redhat.com >>>>>> https://www.redhat.com/mailman/listinfo/pulp-dev >>>>>> >>>>> _______________________________________________ >>>>> Pulp-dev mailing list >>>>> Pulp-dev@redhat.com >>>>> https://www.redhat.com/mailman/listinfo/pulp-dev >>>>> >>>> _______________________________________________ >>> Pulp-dev mailing list >>> Pulp-dev@redhat.com >>> https://www.redhat.com/mailman/listinfo/pulp-dev >>> >> _______________________________________________ > Pulp-dev mailing list > Pulp-dev@redhat.com > https://www.redhat.com/mailman/listinfo/pulp-dev >
_______________________________________________ Pulp-dev mailing list Pulp-dev@redhat.com https://www.redhat.com/mailman/listinfo/pulp-dev