On 06/28/2017 02:27 PM, Jeff Ortel wrote:
I have been doing some thinking about pulp3 publishing with the following goals 
in mind:

- Eliminate symlinks.
- Eliminate need for each plugin to have its own Apache conf.
- Prevent orphaned content that is still published from being deleted.

The main concept is to store the relationship between an artifact and a URL in
the DB instead of using the filesystem.  A `Publication` is created (and owned)
by a publisher.  Each `Publication` is composed of (linked to) many `artifacts`.
The linkage contains the path component of the URL which is used to locate the
artifact referenced by a URL.
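
To make the relationship concrete, here is a minimal in-memory sketch. It is not the proposed Pulp 3 schema: the field names (`storage_path`, `links`), the methods `link` and `locate`, and the example paths are all made up for illustration, and a real implementation would live in the DB as described above.

```python
from dataclasses import dataclass, field
from typing import Dict, Optional


@dataclass(frozen=True)
class Artifact:
    """A stored file. `storage_path` is a hypothetical field name."""
    storage_path: str


@dataclass
class Publication:
    """Created and owned by a publisher; linked to many artifacts."""
    publisher: str
    # Stands in for the linkage: URL path component -> artifact.
    links: Dict[str, Artifact] = field(default_factory=dict)

    def link(self, relative_path: str, artifact: Artifact) -> None:
        """Record that `relative_path` in this publication serves `artifact`."""
        self.links[relative_path] = artifact

    def locate(self, relative_path: str) -> Optional[Artifact]:
        """Return the artifact for a URL path component, or None if unknown."""
        return self.links.get(relative_path)


# Example with made-up paths.
pub = Publication(publisher="rpm-publisher")
pub.link("Packages/z/zoo-1.0.rpm", Artifact("/example/storage/ab/cdef"))
```

The point of the sketch is only that the URL path lives on the link, not on the artifact, so one artifact can be served under different paths by different publications.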

This covers artifacts as we know them today.  But what about files generated
during publishing, a.k.a. metadata?  I propose that these files be stored as
artifacts as well.  This requires `Artifact` to be redefined slightly.  The
definition would read more like:

   "A file associated with either stored or published content".

Or, it would be even more generic, like:

   "A file contained within the pulp inventory that may be associated with a
   content (unit) or publication."

In any case, the relationship to a content (unit) becomes optional.

Publications are not user facing.  I think we can keep this as an internal core 
concept.  At least for the MVP.

The /var/lib/pulp/published directory goes away.

General Flows:

Publishing: "The publisher will compose a publication"

1. Publisher creates a publication using the plugin API.
2. Publisher adds content artifacts to the publication.
3. Publisher generates some metadata files in the working dir.
4. Publisher adds the metadata files to the publication using the plugin API.
   The artifacts can likely be created behind the scenes by the plugin API.
5. Publisher commits (publishes) the publication.  The plugin API ensures this
   is atomic.
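
One way the atomic commit in step 5 could work is to write the publication and all of its links inside a single DB transaction. The sketch below uses the standard-library `sqlite3` module purely as a stand-in; the table names, column names, and the `publish` function are assumptions for illustration, not the plugin API.

```python
import sqlite3

# Hypothetical schema: one row per publication, one row per linked artifact.
SCHEMA = """
CREATE TABLE publication (
    id INTEGER PRIMARY KEY,
    publisher TEXT NOT NULL
);
CREATE TABLE linked_artifact (
    publication_id INTEGER NOT NULL REFERENCES publication(id),
    relative_path TEXT NOT NULL,
    storage_path TEXT NOT NULL,
    UNIQUE (publication_id, relative_path)
);
"""


def publish(db, publisher, links):
    """Create a publication and all of its links in one transaction.

    `links` is a list of (relative_path, storage_path) pairs covering both
    content artifacts and generated metadata files.
    """
    # Using the connection as a context manager commits on success and
    # rolls back everything if any statement raises.
    with db:
        cur = db.execute(
            "INSERT INTO publication (publisher) VALUES (?)", (publisher,)
        )
        publication_id = cur.lastrowid
        db.executemany(
            "INSERT INTO linked_artifact "
            "(publication_id, relative_path, storage_path) VALUES (?, ?, ?)",
            [(publication_id, path, storage) for path, storage in links],
        )
    return publication_id


db = sqlite3.connect(":memory:")
db.executescript(SCHEMA)
```

If any link fails to insert (for example, a duplicate path within the same publication), the publication row is rolled back as well, so clients never see a half-built publication.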

Client makes a GET request for content (or metadata):

1. Request is routed to the content (WSGI) application (just like in pulp2 for
   RPM).
2. Query the `LinkedArtifact` table by URL path component to get the artifact.
3. Forward the artifact storage path to:
    <not stored locally>
        streamer
    <stored locally>
        x-send
4. Done.
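
The request flow above can be sketched as a small routing function. This is only an illustration: a plain dict stands in for the `LinkedArtifact` table, and the function name `route`, the `exists` parameter, and the returned labels are assumptions, not part of the content application.

```python
import os
from typing import Callable, Dict, Optional, Tuple


def route(
    relative_path: str,
    linked_artifacts: Dict[str, str],
    exists: Callable[[str], bool] = os.path.exists,
) -> Tuple[str, Optional[str]]:
    """Decide how to serve a GET request for `relative_path`.

    `linked_artifacts` maps URL path component -> artifact storage path.
    Returns (handler, storage_path), where handler is one of
    "x-send", "streamer" or "not-found".
    """
    storage_path = linked_artifacts.get(relative_path)
    if storage_path is None:
        # No link for this path in any committed publication.
        return ("not-found", None)
    if exists(storage_path):
        # Stored locally: hand the file to the web server.
        return ("x-send", storage_path)
    # Not stored locally: let the streamer fetch it.
    return ("streamer", storage_path)
```

Passing `exists` as a parameter keeps the decision logic testable without touching the filesystem.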

How would this scale? Assume 10k machines are doing a yum update. How would you handle the thundering herd issue?

Have you checked out how koji handles packages? They use files on disk, but all the package metadata are URLs back to a single location on disk. This may be too rpm specific, however. See http://koji.katello.org/kojifiles/repos/foreman-nightly-fedora24-build/latest/x86_64/.

-- bk

_______________________________________________
Pulp-dev mailing list
[email protected]
https://www.redhat.com/mailman/listinfo/pulp-dev
