On 05/15/2018 09:29 AM, Milan Kovacik wrote:
Hi,

On Tue, May 15, 2018 at 3:22 PM, Dennis Kliban <dkli...@redhat.com> wrote:
On Mon, May 14, 2018 at 3:44 PM, Jeff Ortel <jor...@redhat.com> wrote:
Let's brainstorm on something.

Pulp needs to deal with remote repositories that are composed of multiple
content types which may span the domain of a single plugin.  Here are a few
examples.  Some Red Hat RPM repositories are composed of RPMs, DRPMs,
ISOs, and Kickstart Trees.  Some OSTree repositories are composed of OSTrees
& Kickstart Trees.  This raises a question:

How can pulp3 best support syncing with remote repositories that are
composed of multiple (unrelated) content types in a way that doesn't result
in plugins duplicating support for content types?

A few approaches come to mind:

1. Multiple plugins (Remotes) participate in the sync flow to produce a
new repository version.
2. Multiple plugins (Remotes) are sync'd successively, each producing a new
version of a repository.  Only the last version contains the fully sync'd
composition.
3. Plugins share code.
4. Other?


Option #1: Sync would be orchestrated by core or the user so that multiple
plugins (Remotes) participate in populating a new repository version.  For
example: the RPM plugin (Remote) and the Kickstart Tree plugin (Remote)
would both be sync'd against the same remote repository that is composed of
both types.  The new repository version would be composed of the result of
both plugin (Remote) syncs.  To support this, we'd need to provide a way for
each plugin to operate seamlessly on the same (new) repository version.
Perhaps something internal to the RepositoryVersion.  The repository version
would not be marked "complete" until the last plugin (Remote) sync has
succeeded.  This is more complicated than #2, but it results in creating only
truly complete versions, or nothing at all.  No idea how this would work with
the current REST API, whereby plugins provide sync endpoints.
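
A rough sketch of how core might orchestrate that (none of these names are
real Pulp API; they only illustrate the flow):

    def sync_composed(repository, remotes):
        """Create ONE new repository version populated by several plugin (Remote) syncs."""
        version = repository.new_version()        # created with complete=False
        try:
            for remote in remotes:
                # each plugin adds/removes only content types in its own domain
                remote.sync(version)
            version.complete = True               # marked complete only after the last sync
            version.save()
        except Exception:
            version.delete()                      # all-or-nothing: discard the partial version
            raise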

I like this approach because it allows the user to perform a single call to
the REST API and specify multiple "sync methods" to use to create a single
new repository version.
Same here, especially if the goal is all-or-nothing behavior with respect to
the mixed-in remotes, i.e. an atomic sync.
This also has the benefit of a clear start and end of the sync procedure
that the user might want to refer to.
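
E.g. such a single call could look roughly like this (the endpoint and the
payload are completely made up, just to illustrate the idea):

    import requests

    # Hypothetical single REST call: core builds one new version from several remotes.
    requests.post(
        "http://localhost:8000/pulp/api/v3/repositories/1/sync-composed/",
        json={
            "remotes": [
                "/pulp/api/v3/remotes/rpm/rpm/1/",        # RPMs, DRPMs
                "/pulp/api/v3/remotes/rpm/kickstart/2/",  # Kickstart Trees
            ]
        },
        auth=("admin", "admin"),
    )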

Option #2: Sync would be orchestrated by core or the user so that multiple
plugins (Remotes) create successive repository versions.  For example: the
RPM plugin (Remote) and the Kickstart Tree plugin (Remote) would both be
sync'd against the same remote repository that is a composition including
both types.  The intermediate versions would be incomplete.  Only the last
version contains the fully sync'd composition.  This approach can be
supported by core today :) but will produce incomplete repository versions
that are marked complete=True.  This /seems/ undesirable, right?  This may
not be a problem for distribution, since I would imagine that only the last
(fully composed) version would be published.  But what about other usages of
the repository's "latest" version?
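
In today's terms that would just be two successive sync calls, roughly like
this (endpoints and hrefs are illustrative only):

    import requests

    auth = ("admin", "admin")
    repo = "/pulp/api/v3/repositories/1/"

    # version N+1: RPM-domain content only (partial, yet marked complete=True)
    requests.post("http://localhost:8000/pulp/api/v3/remotes/rpm/rpm/1/sync/",
                  json={"repository": repo}, auth=auth)

    # version N+2: RPM-domain content plus Kickstart Trees (the fully composed one)
    requests.post("http://localhost:8000/pulp/api/v3/remotes/rpm/kickstart/2/sync/",
                  json={"repository": repo}, auth=auth)
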
I'm afraid I don't see the use of an intermediate version, especially in
case of failures; e.g. OSTree failed to sync while RPM and Kickstart both
succeeded: is the sync OK as a whole?  What should be done with the versions
created?  Should I merge the successes into one and retry the failure?
How many versions would this introduce?

(option 2) Partial versions would be created in both the normal and the failure scenarios.  In the normal scenario they arise because each plugin (Remote) creates a new version and only the last one contains the full composition; the intermediate versions are always partial.


Option #3: requires a plugin to be aware of specific repository
composition(s) and of other plugins, and creates a code dependency between
plugins.  For example, the RPM plugin could delegate ISOs to the File plugin
and Kickstart Trees to the Kickstart Tree plugin.
Do you mean that the RPM plug-in would directly call into the File plug-in?
If that's the case then I don't like it much; it would be a pain every
time a new plug-in was introduced (O(len(plugins)^2) updates)
or a plug-in's API changed (O(len(plugins)) updates).
Keeping each plugin's code aware of other plugins' updates would be especially ugly.

Agreed.  The plugins could install libs into site-packages, which would at least mitigate the complexity of calling into each other through the pulp plugin framework, but I don't think it helps much.  Even the rpm dependency is undesirable.
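
To make the concern concrete, option #3 tends to end up looking something
like this inside the RPM plugin (the module path and helper names are made
up):

    # The import below is hypothetical; the point is the coupling, not the names.
    from pulp_file.app import tasks as file_tasks

    def sync(remote, new_version):
        sync_rpm_domain(remote, new_version)   # placeholder for the RPM plugin's own sync
        # Delegating ISOs to the File plugin means every File plugin refactor or
        # API change is now a potential breaking change for the RPM plugin.
        file_tasks.synchronize(remote, new_version)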


For all options, plugins (Remotes) need to limit their sync to affect only
those content types within their domain.  For example, the RPM (Remote) sync
cannot add/remove ISOs or Kickstart Trees.
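
Roughly, the removal side of each sync would have to be filtered by type,
e.g. (type names and accessors are illustrative only):

    RPM_DOMAIN = {"rpm", "srpm", "drpm", "erratum"}

    def content_to_remove(current_content, remote_content):
        """Yield units an RPM sync may drop: in its domain AND gone from the remote."""
        for unit in current_content:
            if unit.type not in RPM_DOMAIN:
                continue                       # out of domain: never added or removed here
            if unit not in remote_content:
                yield unit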

I am an advocate of some form of option #1 or #2.  Combining plugins
(Remotes) as needed to deal with arbitrary combinations within remote
repositories seems very powerful; it does not impose complexity on plugin
writers and does not introduce code dependencies between plugins.

Thoughts?

Cheers,
milan

_______________________________________________
Pulp-dev mailing list
Pulp-dev@redhat.com
https://www.redhat.com/mailman/listinfo/pulp-dev
