On 2/22/19 12:07 PM, Brian Bouterse wrote:
On Fri, Feb 22, 2019 at 9:36 AM Justin Sherrill <jsher...@redhat.com> wrote:
On 2/18/19 2:41 PM, Austin Macdonald wrote:
Originally, our upload story was as follows:
The user will upload a new file to Pulp via POST to /artifacts/
(provided by core)
The user will create new plugin-specific Content via POST to
/path/to/plugin/content/, referencing whatever artifacts it
contains and supplying whatever fields the new content expects.
The user will add the new content to a repository via POST to
/repositories/1/versions/
However, this is somewhat cumbersome for the user: it takes 3 API calls
to accomplish something that took only one call in Pulp 2.
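For reference, the three calls above could be laid out roughly as follows. This is only a sketch: it builds the requests without sending them, and the payload field names (file, artifact, filename, add_content_units) and the example hrefs are assumptions made for illustration, not confirmed API.

```python
# Sketch of the three-call upload workflow described above.
# Field names and hrefs are illustrative assumptions; nothing is sent over the network.

def three_call_plan(filename, artifact_href, content_href):
    """Return the (method, path, payload) triples a client would issue, in order."""
    return [
        # 1. Upload the file to core; the server would answer with an artifact href.
        ("POST", "/artifacts/", {"file": filename}),
        # 2. Create plugin-specific content that references the artifact.
        ("POST", "/path/to/plugin/content/",
         {"artifact": artifact_href, "filename": filename}),
        # 3. Create a new repository version that includes the content.
        ("POST", "/repositories/1/versions/",
         {"add_content_units": [content_href]}),
    ]


plan = three_call_plan("foo.whl", "/artifacts/1/", "/path/to/plugin/content/1/")
for method, path, payload in plan:
    print(method, path, payload)
```

Each call depends on an href returned by the previous one, which is where the client-side bookkeeping comes from.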
How would you do this with one call in pulp2?
https://docs.pulpproject.org/dev-guide/integration/rest-api/content/upload.html
seems to suggest 3-4 calls.
Some plugins implemented the pulp2 equivalent of a one-shot uploader.
Those docs are for pulp2's core, which doesn't include the plugins' docs.
There are a couple of different paths plugins have taken to
improve the user experience:
The Python plugin follows the above workflow, but reads the
Artifact file to determine the values for the fields. The RPM
plugin has gone even further and created a new endpoint for "one
shot" upload that performs all of this in a single call. I think
it is likely that the Python plugin will move more in the "one
shot" direction, and other plugins will probably follow.
How does the RPM one shot api work? Will it be compatible with
whatever solution https://pulp.plan.io/issues/4196 arrives at?
You would upload the Artifact as binary data, along with its content
type and its relative path, and Pulp creates the
Artifact, Content unit, and ContentArtifact. It should be compatible with
issue 4196 because django's binary form data should allow for parallel
uploading before calling the view handler. It may take 2 calls though.
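To make the shape of such a request concrete, here is a minimal sketch of a single multipart/form-data body that carries the binary data plus the two extra values. The form field names (file, type, relative_path) and the boundary string are hypothetical, and this is not the RPM plugin's actual endpoint contract.

```python
# Sketch of a one-shot upload request body (multipart/form-data), built by hand
# with the standard library only. Field names are hypothetical.

def build_one_shot_body(data, content_type, relative_path,
                        boundary="pulp-sketch-boundary"):
    """Encode the file bytes plus two text fields as multipart/form-data."""
    def text_part(name, value):
        return (
            f"--{boundary}\r\n"
            f'Content-Disposition: form-data; name="{name}"\r\n\r\n'
            f"{value}\r\n"
        ).encode("utf-8")

    # The file part carries raw bytes, so it is assembled as bytes.
    file_part = (
        f"--{boundary}\r\n"
        f'Content-Disposition: form-data; name="file"; filename="{relative_path}"\r\n'
        "Content-Type: application/octet-stream\r\n\r\n"
    ).encode("utf-8") + data + b"\r\n"

    body = (
        text_part("type", content_type)
        + text_part("relative_path", relative_path)
        + file_part
        + f"--{boundary}--\r\n".encode("utf-8")  # closing delimiter
    )
    headers = {
        "Content-Type": f"multipart/form-data; boundary={boundary}",
        "Content-Length": str(len(body)),
    }
    return headers, body


headers, body = build_one_shot_body(b"\x00\x01binary", "rpm", "foo.rpm")
```

The client sends one request; creating the Artifact, Content unit, and ContentArtifact is left to the server.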
The issue to me isn't so much the number of calls as it is the client
data payload complexity.
If I'm having to chunk up data, I already have quite a bit of client
data payload complexity. In pulp 2 this was most of the complexity!
I would hate for all our plugins to move to One shot methods which
users can't even rely on.
I don't think we're taking the "generic" uploading away. You can
always rely on that. The issue w/ one-shot is that it's not possible
(literally) for many content types, e.g. Artifact-less content. It's
also hard for multi-artifact Content so that would probably still be
something plugin writers would provide as a custom thing for their
content type. Regardless it's just not possible to have consistency in
this area.
Why is it not possible to create a one-shot upload for artifact-less
content? (Maybe we're defining a one-shot upload differently;
I'm reading it as something that combines multiple steps
into one.)
Why is consistency not possible? I guess I don't see a huge variation of
upload scenarios beyond:
1. Upload zero to many files as artifacts.
2. Provide some metadata about the zero or more artifacts, or let the
plugin parse it out itself (or maybe even a combination of the two).
3. Import that unit into a repository.
I can see it being difficult as a user to go through all of those steps
(even if 2 & 3 were combined into one), and the desire is to simplify
the process, but uploading arbitrary files is not simple. Why do I
need to give up the plugin's ability to parse the unit's details because
I'm using the consistent api?
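The "combination of the two" in step 2 could look something like the sketch below, where the plugin parses what it can and explicitly supplied fields override the parsed ones. Both function names and the filename-based parser are hypothetical stand-ins, not anything a real plugin is known to do.

```python
# Sketch: combine fields parsed from the artifact with fields the user supplied.
# parse_fields is a hypothetical stand-in for whatever a plugin would really do.

def parse_fields(filename):
    """Pretend parser: derive name and version from 'name-version.ext'."""
    stem = filename.rsplit(".", 1)[0]          # drop the extension
    name, _, version = stem.rpartition("-")    # split on the last dash
    return {"name": name, "version": version}


def resolve_fields(filename, supplied=None):
    """Start from parsed values and let explicitly supplied values override them."""
    fields = parse_fields(filename)
    fields.update(supplied or {})
    return fields


parsed_only = resolve_fields("foo-1.2.rpm")
overridden = resolve_fields("foo-1.2.rpm", {"version": "9"})
```

With this shape, a client that knows nothing about the content type can supply no fields at all, while a client that does can still override individual values.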
Keep in mind all my questions are coming from a very ignorant
perspective with respect to pulp3 internals, and more from a user
perspective.
My problem with single api calls to upload files is that we cannot
reliably use them due to limitations on request size. We have to
be prepared to use multiple calls to upload files regardless.
Maybe if a user is using some plugin that never has super large
files (ansible?) you could be confident you would never hit a
request size limitation. But file, docker, and yum all would
require multiple calls to get the physical data to the server.
I believe arbitrarily large files can be uploaded either through
multi-part form data or through the django-chunked interface. We'll
see what happens with 4196, but I expect arbitrary payload size to be
a requirement for Pulp users.
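As an illustration of what "multiple calls" means for a client, the sketch below splits a payload into fixed-size chunks and labels each one with a Content-Range header value. It is a generic example, not the django-chunked interface or whatever issue 4196 ends up specifying; the function name and chunk size are assumptions.

```python
# Sketch: split a payload into fixed-size chunks, each with a Content-Range value.
# This is a generic illustration, not the API of any particular upload library.

def iter_chunks(data, chunk_size):
    """Yield (content_range_header, chunk_bytes) pairs covering all of data."""
    if chunk_size <= 0:
        raise ValueError("chunk_size must be positive")
    total = len(data)
    for start in range(0, total, chunk_size):
        chunk = data[start:start + chunk_size]
        end = start + len(chunk) - 1  # Content-Range end offsets are inclusive
        yield f"bytes {start}-{end}/{total}", chunk


chunks = list(iter_chunks(b"abcdefghij", 4))
for content_range, chunk in chunks:
    print(content_range, chunk)
```

Each pair would become one request, so the number of calls grows with the file size no matter how the final content-creation step is exposed.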
I care more about having a consistent method for uploading files
than having fewer api calls. If we need some content-specific
api, that's fine, but please make it a consistent part of the
process.
It sounds like the 4-call interface is the only choice then if
consistency is a must. There isn't a way to offer consistency for
one-shot uploaders. Is it ok that Katello will have to fill out all of
the field data when you post the content type? What could be better?
I'll reserve my comments here based on the discussion above.
Thanks!
Justin
I feel like we may be chasing the wrong goal here (fewer calls vs
a more consistent experience).
That said, I think we should discuss this as a community to
encourage plugins to behave similarly, and because there may also
be a possibility for sharing some code. It is my hope that a
"one shot upload" could do 2 things: 1) Upload and create
Content. 2) Optionally add that content to repositories.
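The two responsibilities could be modelled as below. This is a sketch using plain in-memory structures rather than Pulp's models; the function name, the store layout, and the use of a SHA-256 digest as the artifact key are all assumptions for illustration.

```python
# Sketch of the two responsibilities proposed for a "one shot upload",
# modelled with plain in-memory structures. All names are hypothetical.
import hashlib


def one_shot_upload(store, data, relative_path, repositories=()):
    """1) Create an artifact and content from data. 2) Optionally add it to repositories."""
    digest = hashlib.sha256(data).hexdigest()
    store["artifacts"].setdefault(digest, data)  # identical bytes are stored once
    content = {"artifact": digest, "relative_path": relative_path}
    store["content"].append(content)
    for name in repositories:  # step 2 is skipped when no repository is given
        store["repositories"].setdefault(name, []).append(content)
    return content


store = {"artifacts": {}, "content": [], "repositories": {}}
one_shot_upload(store, b"payload", "a.txt")
one_shot_upload(store, b"payload", "b.txt", repositories=["repo1"])
```

Keeping the repository step optional means the same entry point serves both "just upload" and "upload and publish into a repository".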
_______________________________________________
Pulp-dev mailing list
Pulp-dev@redhat.com
https://www.redhat.com/mailman/listinfo/pulp-dev