Could you re-summarize what problem would be solved by not having a FileUpload model, and giving the Artifact model the ability to have partial data and no Content foreign key?
I understand the concern about where on the filesystem the data gets written and how many times, but I'm not seeing how that's related to whether we have a FileUpload model or not. Are we discussing two separate issues? 1) filesystem locations and copy efficiency, and 2) API design? Or is this discussion trying to connect them in a way I'm not seeing? On Tue, Jun 27, 2017 at 3:20 PM, Dennis Kliban <[email protected]> wrote: > On Tue, Jun 27, 2017 at 2:56 PM, Brian Bouterse <[email protected]> > wrote: > >> Picking up from @jortel's observations... >> >> +1 to allowing Artifacts to have an optional FK. >> >> If we have an Artifacts endpoint then we can allow for the deleting of a >> single artifact if it has no FK. I think we want to disallow the removal of >> an Artifact that has a foreign key. Also filtering should allow a single >> operation to clean up all unassociated artifacts by searching for FK=None >> or similar. >> >> Yes, we will need to allow the single call delivering a file to also >> specify the relative path, size, checksums etc. Since the POST body >> contains binary data we either need to accept this data as GET style params >> or use a multi-part MIME upload [0]. Note that this creation of an Artifact >> does not change the repository contents and therefore can be handled >> synchronously outside of the tasking system. >> >> +1 to the saving of an Artifact to perform validation >> >> [0]: https://www.w3.org/Protocols/rfc1341/7_2_Multipart.html >> >> > >> -Brian >> > > I also support this optional FK for Artifacts and validation on save. We > should probably stick with accepting GET parameters for the MVP. Though > multi-part MIME support would be good to consider for 3.1+. > > >> >> On Tue, Jun 27, 2017 at 2:44 PM, Dennis Kliban <[email protected]> >> wrote: >> >>> On Tue, Jun 27, 2017 at 1:24 PM, Michael Hrivnak <[email protected]> >>> wrote: >>> >>>> >>>> On Tue, Jun 27, 2017 at 11:27 AM, Jeff Ortel <[email protected]> wrote: >>>> >>>>> >>>>> - The artifact FK to a content unit would need to become optional. >>>>> >>>>> - Need to add use cases for cleaning up artifacts not associated with >>>>> a content unit. >>>>> >>>>> - The upload API would need additional information needed to create an >>>>> artifact. Like relative path, size, >>>>> checksums etc. >>>>> >>>>> - Since (I assume) you are proposing uploading/writing directly to >>>>> artifact storage (not staging in a working >>>>> dir), the flow would need to involve (optional) validation. If >>>>> validation fails, the artifact must not be >>>>> inserted into the DB. >>>> >>>> >>>> Perhaps a decent middle ground would be to stick with the plan of >>>> keeping uploaded (or partially uploaded) files as a separate model until >>>> they are ready to be turned into a Content instance plus artifacts, and >>>> save their file data directly to somewhere within /var/lib/pulp/. It would >>>> be some path distinct from where Artifacts are stored. That's what I had >>>> imagined we would do anyway. Then as Dennis pointed out, turning that into >>>> an Artifact would only require a move operation on the same filesystem, >>>> which is super-cheap. >>>> >>>> >>> Would that address all the concerns? We'd write the data just once, and >>>> then move it once on the same filesystem. I haven't looked at django's >>>> support for this recently, but it seems like it should be doable. >>>> >>>> I was just looking at the dropbox API and noticed that they provide two >>> separate API endpoints for regular file uploads[0] (< 150mb) and large file >>> uploads[1]. It is the latter that supports chunking and requires using an >>> upload id. For the most common case they support uploading a file with one >>> API call. Our original proposal requires 2 for the same use case. Pulp API >>> users would appreciate having to only make one API call to upload a file. >>> >>> [0] https://www.dropbox.com/developers-v1/core/docs#files_put >>> [1] https://www.dropbox.com/developers-v1/core/docs#chunked-upload >>> >>> >>> >>>> -- >>>> >>>> Michael Hrivnak >>>> >>>> Principal Software Engineer, RHCE >>>> >>>> Red Hat >>>> >>>> _______________________________________________ >>>> Pulp-dev mailing list >>>> [email protected] >>>> https://www.redhat.com/mailman/listinfo/pulp-dev >>>> >>>> >>> >>> _______________________________________________ >>> Pulp-dev mailing list >>> [email protected] >>> https://www.redhat.com/mailman/listinfo/pulp-dev >>> >>> >> > -- Michael Hrivnak Principal Software Engineer, RHCE Red Hat
_______________________________________________ Pulp-dev mailing list [email protected] https://www.redhat.com/mailman/listinfo/pulp-dev
