After some offline discussion with several Pulp devs, we decided to
dedicate this thread to one problem - duplicates (and move the other
problem - filtering/validation - to a different thread).
The current proposal is to have a repo_key on a content model (thanks,
Simon) and ensure its uniqueness
On Mon, Jul 22, 2019 at 4:47 AM Tatiana Tereshchenko
wrote:
>
>
> On Sun, Jul 21, 2019 at 3:00 PM Brian Bouterse
> wrote:
>
>>
>>
>> On Sun, Jul 21, 2019 at 6:23 AM Tatiana Tereshchenko
>> wrote:
>>
>>> +1 to the idea of a repo_key.
>>>
>>> Should we also add the ability to apply custom
On Sun, Jul 21, 2019 at 6:23 AM Tatiana Tereshchenko
wrote:
> +1 to the idea of a repo_key.
>
> Should we also add the ability to apply custom validation of the content
> being added?
> Similar to a repo_key, Content model can optionally provide an additional
> validator.
> Use cases:
> - for
+1 to the idea of a repo_key.
Should we also add the ability to apply custom validation of the content
being added?
Similar to a repo_key, Content model can optionally provide an additional
validator.
Use cases:
- for pulp_file to avoid relative path overlap - e.g. 'a/b' and 'a'
- for pulp_rpm
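
To illustrate the pulp_file use case above, here is a minimal sketch of what such a validator could check. The function names `paths_overlap` and `find_overlaps` are hypothetical and not part of pulpcore or pulp_file; the sketch only assumes that relative paths are POSIX-style strings.

```python
from pathlib import PurePosixPath


def paths_overlap(a: str, b: str) -> bool:
    """Return True if the paths are equal or one is a parent directory of the other.

    Hypothetical helper, not an existing Pulp API.
    """
    parts_a = PurePosixPath(a).parts
    parts_b = PurePosixPath(b).parts
    shorter = min(len(parts_a), len(parts_b))
    # Compare whole path components, so 'a' overlaps 'a/b' but not 'ab'.
    return parts_a[:shorter] == parts_b[:shorter]


def find_overlaps(existing, incoming):
    """List (existing_path, incoming_path) pairs that cannot coexist in one repo version."""
    return [
        (old, new)
        for new in incoming
        for old in existing
        if paths_overlap(old, new)
    ]
```

A validator along these lines could run before a repo version is created and reject, or report, any pairs it finds.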
I want to restate Simon's proposal: "Content defines a 'repo_key'
similar to a unit_key. This key must be unique within a repo version (and
not globally like the unit_key)."
We could adopt his proposal to have the repo_key tuple defined on Content
in pulpcore. If we left the add/remove APIs
Sure, the code can be de-duplicated.
My main worry is that it becomes the plugin writer's responsibility to
remember to enforce uniqueness constraints within a repo version in every
workflow (sync, copy, anything else) where a repo version is created.
Every time before RepositoryVersion.create() is
@Tanya Tereshchenko
> Do I understand correctly that it doesn't cover the sync case and it's
> only about explicit repo version creation?
>
I don't mean that add/remove could not share code with the RemoveDuplicates
stage. I wanted to point out that we have a problem here (how to remove
duplicates)
I think I misread your email. If you are saying "newest to associate" and
not "newest content unit", I think that would work.
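
A rough sketch of the "newest to associate" rule, assuming a repo_key as proposed in this thread. The `Content` class, its fields, and `associate` are hypothetical stand-ins, not pulpcore models or APIs.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Content:
    """Hypothetical stand-in for a content unit."""

    name: str    # assumed to be the only repo_key field
    digest: str  # distinguishes units globally, like a unit_key would

    def repo_key(self):
        return (self.name,)


def associate(current, incoming):
    """Return the content of a new repo version.

    On a repo_key collision, the unit being associated now replaces the
    one already present, regardless of which unit is older.
    """
    by_key = {unit.repo_key(): unit for unit in current}
    for unit in incoming:
        by_key[unit.repo_key()] = unit  # the most recently associated unit wins
    return set(by_key.values())
```

With this rule, a user who explicitly adds an older unit that collides with one already in the repo version ends up with the older unit, since it was the most recently associated.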
@ttereshc, couldn't we de-duplicate the logic by creating a class in the
plugin API that RemoveDuplicates uses as well as the add/remove content
endpoints in the plugins?
I don't think this solution would work in the case of creating a new
repository version. Suppose for example you had two content units that
collide, one in a repo version and one older unit that a user explicitly
wants to add to the repo version. If the latter one is older, then what
would happen?
Having a way for units to express their uniqueness per repo sounds good
because then more areas of Pulp's code could answer the question: "will I
have a duplicate if I add content X to repo_version Y".
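
As a minimal sketch of how that question could be answered, assuming the repo_key is a tuple of field names declared on the content type: `FileContent`, `repo_key_fields`, and `find_duplicate` below are hypothetical names used for illustration only, not pulpcore code.

```python
from dataclasses import dataclass
from typing import Iterable, Optional, Tuple


@dataclass(frozen=True)
class FileContent:
    """Hypothetical content type used only for this sketch."""

    relative_path: str
    digest: str

    # Un-annotated on purpose, so it is a class attribute and not a dataclass field.
    repo_key_fields = ("relative_path",)

    def repo_key(self) -> Tuple[str, ...]:
        return tuple(getattr(self, field) for field in self.repo_key_fields)


def find_duplicate(
    content: FileContent, repo_version_content: Iterable[FileContent]
) -> Optional[FileContent]:
    """Return the unit in the repo version that collides with `content`, if any."""
    for existing in repo_version_content:
        # The same unit being re-added is not a collision.
        if existing != content and existing.repo_key() == content.repo_key():
            return existing
    return None
```

A real implementation would presumably do this lookup as a database query rather than a loop over units held in memory.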
Let's assume we know that situation is about to occur during sync for
example, what do we do
Do I understand correctly that it doesn't cover the sync case and it's only
about explicit repo version creation?
So the suggestion is to implement the same logic twice: for sync case -
RemoveDuplicates stage and/or maybe some custom stage (e.g. to disallow
overlapping paths), and for direct repo
I have a design in mind for solving this problem:
1. Remove POST to RepositoryVersion (no general add/remove endpoint).
2. Add an endpoint to kick off an add/remove task, namespaced by plugin, e.g.
`POST pulp/api/v3/docker/add-remove/`
This view can be provided to all plugins by the plugin
On Mon, Jun 03, 2019 at 09:11:07AM -0400, David Davis wrote:
>@Simon I like the idea behind the repo_key solution you came up with.
>Can you be more specific around cases you think that it couldn't
>handle? I imagine that plugin writers could use properties or
>denormalization (ie
Thanks for raising this issue. pulp_file also suffers from this problem
in that files with duplicate names can be added to repo versions but they
probably shouldn't be:
https://pulp.plan.io/issues/4028
@Simon I like the idea behind the repo_key solution you came up with. Can
you be more
On Fri, May 31, 2019 at 01:12:58PM +0200, Tatiana Tereshchenko wrote:
>A while ago RemoveDuplicates stage [0] was introduced to solve the
>problem of enforcing uniqueness constraints within a repository version
>at sync time.
>The same problem ought to be solved when content which
A while ago RemoveDuplicates stage [0] was introduced to solve the problem
of enforcing uniqueness constraints within a repository version at sync
time.
The same problem ought to be solved when content which already exists in
Pulp is added to a repository. E.g. Content was uploaded, or content was