I also want to make it easy for the user. I don't think we should support repositories now though. We're strapped for time, Katello doesn't need it at the moment, and I think we can add it later.
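To make the per-resource parameters concrete, here is a rough sketch of what an export call could look like with repository_versions and publications broken out. The endpoint path, field names, and hrefs are placeholders for discussion, not a settled API:

    # Illustrative only: one possible shape for an export request with the
    # parameters broken out per resource type. The endpoint path, field
    # names, and hrefs are placeholders, not an agreed-on API.
    import requests

    export_body = {
        "repository_versions": [
            "/pulp/api/v3/repositories/file/file/<repo-uuid>/versions/3/",
        ],
        "publications": [
            "/pulp/api/v3/publications/rpm/rpm/<pub-uuid>/",
        ],
    }

    response = requests.post(
        "https://pulp.example.com/pulp/api/v3/exporters/<exporter-uuid>/export/",
        json=export_body,
        auth=("admin", "password"),
    )
    # Like other Pulp 3 write operations, this would presumably return a
    # task href to poll rather than the finished archive itself.
    print(response.json())

An incremental variant could turn repository_versions into a mapping from each version href to a base version href, which is the open question in the next paragraph.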
Another argument for breaking up parameters is that we need to support incremental exports. I think the repository_versions parameter will either need to be a mapping of repo versions to base repo versions, or we'll need a separate base repo versions parameter that Pulp can check when exporting a repo version.

David

On Fri, Feb 21, 2020 at 8:30 AM Dennis Kliban <dkli...@redhat.com> wrote:

> We can't provide any data from the Katello database, but we can provide enough data for the archive to contain all the published metadata and distributions needed, so the user doesn't have to take any extra steps to make the content available after import.

> We could definitely limit which resources are allowed to be specified for this API. The user would never have to specify pulp_href for an artifact. Content would only be exported using repositories, repository versions, or publications. If a user chooses to export a repository, all the repository versions for that repository would be exported along with the content and artifacts that belong to those repo versions. When individual repository versions are specified, only those repository versions are exported. Publications would work the same way.

> My main goal is to make the import process as simple as possible for the user.

> On Fri, Feb 21, 2020 at 7:53 AM David Davis <davidda...@redhat.com> wrote:

>> A couple of comments. I'm not sure how Pulp will be able to export all the extra metadata that comes from Katello, as some of it relates to content views. Also, I'm hesitant to have the user export a generic list of pulp hrefs. I think this could be confusing (do users have to supply artifact hrefs to get artifacts?). I'd rather have a list of params users can specifically export (eg repository_versions, publications, etc). I think Pulp will have to decide what you get when you, for example, export a repository version (likely Content, Artifacts, ContentArtifacts).

>> David

>> On Thu, Feb 20, 2020 at 7:13 PM Dennis Kliban <dkli...@redhat.com> wrote:

>>> Thanks for all the details. I would like to provide Pulp 3 users with a similar feature. In order to do that, the archive produced by Pulp will need to include all that extra metadata that comes from Katello right now. Pulp should support 2 use cases:

>>> - As a user, I can generate an archive by specifying a list of pulp_hrefs.
>>> - As a user, I can import an archive that was generated on another pulp.

>>> The archive would contain database migrations needed to restore all the resources. It would also have all the files needed to back the artifacts.

>>> Users could then provide a list of repository versions, publications, and distributions when creating an archive. Once the archive is imported, Pulp is serving the content without having to republish.

>>> On Thu, Feb 20, 2020 at 9:53 AM Justin Sherrill <jsher...@redhat.com> wrote:

>>>> There are two different forms of export today in katello:

>>>> Legacy version:

>>>> * Uses pulp2's export functionality
>>>> * Takes the tarball as is

>>>> "New" version:

>>>> * Just copies the published repository as is (following symlinks)
>>>> * Adds its own 'katello' metadata to the existing tarball

>>>> I would imagine that with pulp3 we would somewhat combine these two approaches: take the pulp3-generated export file and add in a metadata file of some sort.

>>>> Justin
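(As an aside on the "add a metadata file to the existing tarball" step Justin describes: for an uncompressed tar, appending one extra file is enough. The file names and metadata fields below are made up, just to illustrate the mechanism.)

    # Sketch of "take the pulp3 export and add in a metadata file": append
    # a single JSON file to the archive. Assumes an uncompressed tar; the
    # file names and metadata fields are made up.
    import json
    import tarfile

    metadata = {"content_view": "foobar", "version": "1.0"}

    with open("katello-metadata.json", "w") as f:
        json.dump(metadata, f)

    # "a" opens an existing uncompressed tar archive for appending.
    with tarfile.open("pulp_export.tar", "a") as tar:
        tar.add("katello-metadata.json")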
>>>> On 2/19/20 2:28 PM, Dennis Kliban wrote:

>>>> Thank you for the details. More questions inline.

>>>> On Wed, Feb 19, 2020 at 2:04 PM Justin Sherrill <jsher...@redhat.com> wrote:

>>>>> The goal from our side is to have a very similar experience for the user. Today the user would:

>>>>> * run a command (for example, something similar to: hammer content-view version export --content-view-name=foobar --version=1.0)
>>>>> * this creates a tarball on disk

>>>> What all is in the tarball? Is this just a repository export created by Pulp, or is there extra information from the Katello db?

>>>>> * they copy the tarball to external media
>>>>> * they move the external media to the disconnected katello
>>>>> * they run 'hammer content-view version import --export-tar=/path/to/tarball'

>>>> Does katello untar this archive, create a repository in pulp, sync from the directory containing the unarchived content, and then publish?

>>>>> I don't see this changing much for the user; anything additional that needs to be done in pulp can be done behind the cli/api in katello. Thanks!

>>>>> Justin

>>>>> On 2/19/20 12:52 PM, Dennis Kliban wrote:

>>>>> In Katello that uses Pulp 2, what steps does the user need to take when importing an export into an air-gapped environment? I am concerned about making the process more complicated than what the user is already used to.

>>>>> On Wed, Feb 19, 2020 at 11:20 AM David Davis <davidda...@redhat.com> wrote:

>>>>>> Thanks for the responses so far. I think we could export publications along with the repo version by exporting any publication that points to a repo version.

>>>>>> My concern with exporting repositories is that users will probably get a bunch of content they don't care about if they want to export a single repo version. That said, if users do want to export entire repos, we could add this feature later, I think?

>>>>>> David

>>>>>> On Wed, Feb 19, 2020 at 10:30 AM Justin Sherrill <jsher...@redhat.com> wrote:

>>>>>>> On 2/14/20 1:09 PM, David Davis wrote:

>>>>>>> Grant and I met today to discuss importers and exporters [0] and we'd like some feedback before we proceed with the design. To sum up this feature briefly: users can export a repository version from one Pulp instance and import it into another.

>>>>>>> # Master/Detail vs Core

>>>>>>> So one fundamental question is whether we should use a Master/Detail approach or just have core control the flow but call out to plugins to get export formats.

>>>>>>> To give some background: we currently define Exporters (ie FileSystemExporter) in core as Master models. Plugins extend this model, which allows them to configure or customize the Exporter. This was necessary because some plugins need to export Publications (along with repository metadata) while other plugins that don't have Publications or metadata export RepositoryVersions.

>>>>>>> The other option is to have core handle the workflow. The user would call a core endpoint and provide a RepositoryVersion.
>>>>>>> This would work because, for importing/exporting, you wouldn't ever use Publications, since metadata won't be used for importing back into Pulp. If needed, core could provide a way for plugin writers to write custom handlers/exporters for content types.

>>>>>>> If we go with the second option, the question then becomes whether we should divorce the concept of Exporters from import/export. Or do we also switch Exporters from Master/Detail to core only?

>>>>>>> # Foreign Keys

>>>>>>> Content can be distributed across multiple tables (eg UpdateRecord has UpdateCollection, etc). In our export, we could either use primary keys (UUIDs) or natural keys to relate records. The former assumes that UUIDs are unique across Pulp instances. The safer but more complex alternative is to use natural keys. This would involve storing a set of fields on a record that would be used to identify a related record.

>>>>>>> # Incremental Exports

>>>>>>> There are two big pieces of data contained in an export: the dataset of Content from the database and the artifact files. An incremental export cuts down on the size of an export by only exporting the differences. However, when performing an incremental export, we could still export the complete dataset instead of just a set of differences (additions/removals/updates). This approach would be simpler, and it would allow us to ensure that the new repo version matches the exported repo version exactly. It would, however, increase the export size, but not by much I think; probably some number of megabytes at most.

>>>>>>> If it's simpler, I would go with that. Saving even ~100-200 MB isn't that big of a deal IMO. The biggest savings is in the RPM content.

>>>>>>> [0] https://pulp.plan.io/issues/6134

>>>>>>> David
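On the "# Foreign Keys" question above, here is a minimal Django-style sketch of the natural-key approach. The model and field names are illustrative, not the real pulp_rpm schema:

    # A minimal Django-style sketch of the natural-key idea for nested
    # content types, loosely modeled on UpdateRecord/UpdateCollection.
    # Field names are illustrative, not the real pulp_rpm schema.
    from django.db import models


    class UpdateRecordManager(models.Manager):
        def get_by_natural_key(self, errata_id, updated_date):
            return self.get(errata_id=errata_id, updated_date=updated_date)


    class UpdateRecord(models.Model):
        errata_id = models.TextField()
        updated_date = models.TextField()

        objects = UpdateRecordManager()

        def natural_key(self):
            # Serialized in place of the UUID primary key so two Pulp
            # instances can match "the same" record without sharing PKs.
            return (self.errata_id, self.updated_date)

        class Meta:
            unique_together = ("errata_id", "updated_date")


    class UpdateCollection(models.Model):
        # With natural-key serialization, this FK is exported as the
        # related record's natural_key() tuple instead of its UUID.
        update_record = models.ForeignKey(UpdateRecord, on_delete=models.CASCADE)
        name = models.TextField()

Django's dumpdata supports this via --natural-foreign/--natural-primary, so exported UpdateCollection rows would point at (errata_id, updated_date) rather than at a UUID that only exists on the source instance.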
_______________________________________________
Pulp-dev mailing list
Pulp-dev@redhat.com
https://www.redhat.com/mailman/listinfo/pulp-dev