[Pulp-dev] Importers/Exporters

David Davis Fri, 14 Feb 2020 10:11:18 -0800

Grant and I met today to discuss importers and exporters[0] and we'd like
some feedback before we proceed with the design. To sum up this feature
briefly: users can export a repository version from one Pulp instance and
import it to another.


# Master/Detail vs Core

One fundamental question is whether we should use a Master/Detail
approach or have core control the flow and call out to plugins for
export formats.

To give some background: we currently define Exporters (e.g.
FileSystemExporter) in core as Master models. Plugins extend this model,
which allows them to configure or customize the Exporter. This was
necessary because some plugins need to export Publications (along with
repository metadata), while other plugins that don't have Publications or
metadata export RepositoryVersions.

The other option is to have core handle the workflow. The user would call a
core endpoint and provide a RepositoryVersion. This would work because
import/export never needs Publications: their metadata isn't used when
importing back into Pulp. If needed, core could provide a way for plugin
writers to write custom handlers/exporters for content types.
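
To make the second option concrete, here is a rough sketch of core driving
the export while plugins register per-content-type handlers. It is not
Pulp code: the registry, the decorator, the handler names and the dict
shape of a content unit are all made up for illustration.

```python
import json
from typing import Callable, Dict, Iterable

# Hypothetical registry owned by core: content type -> serializer function.
EXPORT_HANDLERS: Dict[str, Callable[[dict], dict]] = {}


def register_export_handler(content_type: str):
    """Decorator a plugin could use to register a custom handler."""
    def decorator(func: Callable[[dict], dict]) -> Callable[[dict], dict]:
        EXPORT_HANDLERS[content_type] = func
        return func
    return decorator


def default_handler(unit: dict) -> dict:
    """Fallback used by core when a plugin registers nothing."""
    return dict(unit)


# A plugin-side handler (hypothetical) that normalizes a related field.
@register_export_handler("rpm.updaterecord")
def export_update_record(unit: dict) -> dict:
    data = dict(unit)
    data["collections"] = sorted(unit.get("collections", []))
    return data


def export_repository_version(units: Iterable[dict]) -> str:
    """Core controls the flow; plugins only shape individual records."""
    records = []
    for unit in units:
        handler = EXPORT_HANDLERS.get(unit["type"], default_handler)
        records.append(handler(unit))
    return json.dumps(records, sort_keys=True)
```

In this shape the user only ever talks to core, and a plugin with no
special needs does not have to ship any export code at all.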

If we go with the second option, the question then becomes whether we
should separate the concept of Exporters from import/export. Or do we also
switch Exporters from Master/Detail to core only?

# Foreign Keys

Content can be distributed across multiple tables (e.g. UpdateRecord has
UpdateCollection, etc). In our export, we could either use primary keys
(UUIDs) or natural keys to relate records. The former assumes that UUIDs
are unique across Pulp instances. The safer but more complex alternative is
to use natural keys. This would involve storing, on each exported record,
the set of fields that identifies its related record.
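
As a rough illustration of the natural key approach, the sketch below
exports a child record with its parent's natural key instead of the
parent's primary key, then resolves it again on the importing side. The
field names ("errata_id", "version", "update_record_pk") and the dict-based
records are assumptions for the example, not the real model fields.

```python
import uuid

# Hypothetical natural key for the parent record.
UPDATE_RECORD_KEY = ("errata_id", "version")


def natural_key(record: dict, fields=UPDATE_RECORD_KEY) -> list:
    """Return the values that identify a record independent of its pk."""
    return [record[f] for f in fields]


def export_collection(collection: dict, records_by_pk: dict) -> dict:
    """Replace the foreign key (a pk) with the parent's natural key."""
    parent = records_by_pk[collection["update_record_pk"]]
    return {"name": collection["name"], "update_record": natural_key(parent)}


def import_collection(exported: dict, records_by_natural_key: dict) -> dict:
    """Look the parent up by natural key and point at the local pk."""
    parent = records_by_natural_key[tuple(exported["update_record"])]
    return {
        "pk": str(uuid.uuid4()),  # new pk generated on the importing side
        "name": exported["name"],
        "update_record_pk": parent["pk"],
    }
```

The exported data never carries a pk, so it does not matter whether the
two instances agree on UUIDs. The cost is that every related model needs
an agreed set of identifying fields.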

# Incremental Exports

There are two big pieces of data contained in an export: the dataset of
Content from the database and the artifact files. An incremental export
cuts down on the size of an export by only exporting the differences.
However, when performing an incremental export, we could still export the
complete dataset instead of just a set of differences
(additions/removals/updates). This approach would be simpler, and it would
allow us to ensure that the new repo version matches the exported repo
version exactly. It would, however, increase the export size, though not by
much I think: probably by some number of megabytes at most.
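
A minimal sketch of that idea, assuming each repo version can be reduced
to a mapping of content name to artifact digest (a simplification made up
for this example): the dataset is always exported in full, and only the
artifact files are incremental.

```python
def build_incremental_export(previous_version: dict, current_version: dict) -> dict:
    """Export the full dataset but only artifacts the target lacks.

    Both arguments map a content name to its artifact digest
    (a hypothetical, simplified view of a repository version).
    """
    # Complete dataset: the importer can rebuild the version exactly,
    # including removals, without having to apply a diff.
    dataset = sorted(current_version.items())

    # Incremental part: skip artifact files shipped by the previous export.
    already_shipped = set(previous_version.values())
    new_artifacts = sorted(set(current_version.values()) - already_shipped)

    return {"dataset": dataset, "artifacts": new_artifacts}
```

Note that content removed since the previous version simply does not
appear in the dataset, so no explicit "removals" list is needed.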

[0] https://pulp.plan.io/issues/6134

David
