If the uploaded artifact is a 2.5 GB ISO, how does this impact the choice?

-- bk

On 06/27/2017 09:36 AM, Brian Bouterse wrote:
I thought we had pulled chunked uploads out of the MVP. IIRC, @jortel and I thought that, since that use case is for high-performance (parallel) uploads, it should be on the 3.1+ page.

+1 to just sending data without having a file handle. If the entire file is delivered in one request, then having a file ID to upload to in a second request is just cumbersome. +1 to having the handler that receives the file just make it an Artifact() right away. This will work better with how Django handles file uploads.

I also think we can skip making one Artifact from another; I don't think that will be a common use case. With that use case and chunking removed, the use cases would be:

  * As an authenticated user, I can upload a file which becomes an
    Artifact. At the end of the upload, the server returns the JSON
    representation of the created Artifact.
  * As an authenticated user, I can create a content unit by providing
    the content type, its Artifacts using IDs for each Artifact, and the
    metadata supplied in the POST body. This call is atomic: the content
    unit is created in the database and on the filesystem, or not at all.

The biggest reason to make this adjustment, I think, is that it aligns with users' desire to have uploads take fewer calls. This removes at least two calls from the workflow. It also avoids having to save the data multiple times, which I don't think we can do practically.
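
To illustrate the single-write idea, here is a minimal sketch using only the Python standard library. It is not Pulp's or Django's implementation: the function name store_artifact, the choice of SHA-256, and the layout of naming the file after its digest are all assumptions made for the example. The incoming stream is written once to a staging file inside the artifact directory and then renamed into place, so the data is never copied a second time.

```python
import hashlib
import io
import os
import tempfile


def store_artifact(stream, artifact_dir, chunk_size=1024 * 1024):
    """Write an upload stream to disk once; return (path, sha256, size).

    Hypothetical helper: the name, the SHA-256 digest, and the
    digest-named file layout are assumptions for illustration only.
    """
    os.makedirs(artifact_dir, exist_ok=True)
    digest = hashlib.sha256()
    size = 0
    # Stage the file inside artifact_dir so the final rename stays on one
    # filesystem and does not copy the data again.
    fd, tmp_path = tempfile.mkstemp(dir=artifact_dir, suffix=".part")
    try:
        with os.fdopen(fd, "wb") as out:
            while True:
                chunk = stream.read(chunk_size)
                if not chunk:
                    break
                digest.update(chunk)  # hash while writing, one pass over the data
                out.write(chunk)
                size += len(chunk)
        final_path = os.path.join(artifact_dir, digest.hexdigest())
        os.replace(tmp_path, final_path)  # rename, not a second write
    except BaseException:
        # Leave no partial file behind if the upload fails midway.
        if os.path.exists(tmp_path):
            os.unlink(tmp_path)
        raise
    return final_path, digest.hexdigest(), size


if __name__ == "__main__":
    with tempfile.TemporaryDirectory() as root:
        path, sha256, size = store_artifact(io.BytesIO(b"example payload"), root)
        print(path, sha256, size)
```

Reading in fixed-size chunks keeps memory use bounded regardless of file size, which matters for large uploads such as ISOs.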

Thoughts or ideas?

-Brian

On Tue, Jun 27, 2017 at 8:55 AM, Dennis Kliban <[email protected]> wrote:

    My motivations for writing this email include: recent discussion
    about the Pulp 2 upload API in #pulp and Django's documentation on
    file uploads.

    Files uploaded to Django are initially stored in memory (if under
    2.5 MB) or written to the /tmp directory using Python's tempfile
    module. The file created in /tmp is deleted when and if the last
    file handle is closed.
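
The size cutoff described above can be sketched as follows. The constant mirrors Django's documented default for the FILE_UPLOAD_MAX_MEMORY_SIZE setting (2621440 bytes, i.e. 2.5 MB); the function upload_backing is a hypothetical helper written for this example, not a Django API, and its treatment of a file exactly at the threshold is an assumption.

```python
# Django's documented default for FILE_UPLOAD_MAX_MEMORY_SIZE (2.5 MB).
FILE_UPLOAD_MAX_MEMORY_SIZE = 2621440


def upload_backing(size_in_bytes, threshold=FILE_UPLOAD_MAX_MEMORY_SIZE):
    """Return where an upload of the given size would initially be held.

    Hypothetical helper for illustration; behaviour exactly at the
    threshold is an assumption.
    """
    return "memory" if size_in_bytes <= threshold else "tempfile"


if __name__ == "__main__":
    print(upload_backing(1024))                  # a 1 KB file
    print(upload_backing(int(2.5 * 1024 ** 3)))  # a 2.5 GB ISO
```

Under these defaults, anything the size of an ISO is far past the cutoff and would be written to a temporary file before any further handling.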

    If we implement the upload API as described in the MVP doc[0], then
    according to the Django docs[1] we will write each upload to disk
    two or three times. When a file is bigger than 2.5 MB, it will
    first be written to /tmp. The same file will then be written to
    /var/lib/pulp/uploads (or a similar location) when the FileUpload
    model is saved. A third write will occur when an artifact is
    created using the FileUpload. This third write will likely be a
    move, though.

    I propose that we eliminate writing the uploaded file to
    /var/lib/pulp/uploads and go directly to creating an artifact. The
    use cases can then be rewritten as follows:

      * As an authenticated user, I can upload a file with an optional
        chunk size and an optional offset. At the end of the upload,
        the server returns the JSON representation of the artifact.

      * As an authenticated user, I can create a new artifact by
        specifying an existing artifact ID.

      * As an authenticated user, I can create a content unit by
        providing the content type, its Artifacts using IDs for each
        Artifact, and the metadata supplied in the POST body. This call
        is atomic: the content unit is created in the database and on
        the filesystem, or not at all.




    [0] https://pulp.plan.io/projects/pulp/wiki/Pulp_3_Minimum_Viable_Product#Upload-amp-Copy
    [1] https://docs.djangoproject.com/en/1.9/topics/http/file-uploads/#handling-uploaded-files-with-a-model

    _______________________________________________
    Pulp-dev mailing list
    [email protected] <mailto:[email protected]>
    https://www.redhat.com/mailman/listinfo/pulp-dev
    <https://www.redhat.com/mailman/listinfo/pulp-dev>



