Re: [Pulp-dev] pulp 3 upload API validation

Jeff Ortel Tue, 11 Jul 2017 11:24:07 -0700


On 07/11/2017 12:27 PM, Dennis Kliban wrote:
> On Tue, Jul 11, 2017 at 1:20 PM, Brian Bouterse <[email protected] 
> <mailto:[email protected]>> wrote:
> 
>     We should not raise a validation exception because due to semver we can't 
> stop raising that exception
>     later. Specifically if we want to ever allow double checksum being 
> specified in the future.
> 
>     For the MVP, I think the choices are: only respect sha256 and ignore the 
> rest OR do as many standard
>     validators as they specify. I'm ok with either of these with a slight 
> preference for validating as many of
>     them as are specified would be the best. Since we're already handling all 
> of the data at save time,
>     pushing it through additional digest validators will cost a bit of cpu 
> but not much additional I/O. Also
>     if the user asked us to do that validation then having it demand some 
> additional resources is OK. Also
>     having feature parity with the changesets is good. The changesets can 
> validate all standard digest
>     validators so having uploads do the same would be consistent.
> 
> 
> I am reversing what I had previously said, and I agree that we should not 
> raise an exception if a user
> provides more than one checksum at upload time. My current implementation 
> checks every checksum specified by
> the user. 
> 
>


I don't think I understand the purpose of an upload user specifying more than 1 
algorithm/digest for the
platform to validate the integrity of an uploaded file.  Why would the user 
specify an algorithm/digest they
don't trust?

>  
> 
>     On Mon, Jul 10, 2017 at 5:11 PM, Jeff Ortel <[email protected] 
> <mailto:[email protected]>> wrote:
> 
> 
> 
>         On 07/10/2017 02:36 PM, Dennis Kliban wrote:
>         > On Mon, Jul 10, 2017 at 3:26 PM, Michael Hrivnak 
> <[email protected] <mailto:[email protected]>
>         <mailto:[email protected] <mailto:[email protected]>>> wrote:
>         >
>         >
>         >
>         >     On Mon, Jul 10, 2017 at 3:06 PM, Dennis Kliban 
> <[email protected] <mailto:[email protected]> <mailto:[email protected]
>         <mailto:[email protected]>>> wrote:
>         >
>         >         The upload API for Artifacts is going to allow users to 
> specify the artifact size and a digest. The
>         >         Artifact model currently supports  'md5', 'sha1', 'sha224', 
> 'sha256', 'sha384', and 'sha512' digests.
>         >
>         >         Do we want to let users specify more than one digest per 
> upload? e.g. md5 and sha256?
>         >
>         >
>         >     There may be no harm in this, but it would add complexity to 
> the verification and not add much value. I'd
>         >     stick with just one unless there's a compelling reason for 
> multiple.
>         >
>         >
>         > I agree. The API is going to raise a validation exception when more 
> than 1 digest is provided.
> 
>         +1
> 
>         >
>         >
>         >
>         >
>         >
>         >         Do we want to store all 6 digests for each Artifact?
>         >
>         >
>         >     The expensive part of calculating the digests is reading the 
> file. As long as you're already reading the
>         >     entire file, which we will during verification, you may as well 
> stuff the bits through multiple hashers
>         >     (digesters?) and get all the digests. Pulp 2 has a function 
> that does this:
>         >
>         >     
> https://github.com/pulp/pulp/blob/2.13-release/server/pulp/server/util.py#L327-L353
>         
> <https://github.com/pulp/pulp/blob/2.13-release/server/pulp/server/util.py#L327-L353>
>         >     
> <https://github.com/pulp/pulp/blob/2.13-release/server/pulp/server/util.py#L327-L353
>         
> <https://github.com/pulp/pulp/blob/2.13-release/server/pulp/server/util.py#L327-L353>>
>         >
>         >     But we can't always guarantee that we'll have all the checksums 
> available, for at least two reasons. 1) If
>         >     in the future if we want to use yet another algorithm, we 
> probably won't want to run a migration that
>         >     re-reads every file and calculates the additional digest. 2) 
> For on-demand content, we don't have it
>         >     locally, so we can't calculate any additional checksums until 
> it gets fetched.
>         >
>         >     So this may be one of those times where we use a 
> good-ole-fashioned getter method that returns the
>         >     requested digest if it's on the artifact, calculates it if not, 
> or raises an exception if the value isn't
>         >     available and can't be calculated.
>         >
>         >
>         > For uploaded Artifacts, all of the digests will be calculated as 
> the file is being processed during the
>         > upload. So I don't think calculating all of them should incur 
> significantly more cost than just one. The code
>         > snippet from Pulp 2 looks similar to what I am doing.
> 
>         This functionality should be a method on the Artifact and not a util 
> function somewhere.
> 
>         >
>         > I haven't given much thought to the getter, but your idea sounds 
> fine to me.
>         > Thanks,
>         > Dennis
>         >
>         >
>         >
>         >
>         >     --
>         >
>         >     Michael Hrivnak
>         >
>         >     Principal Software Engineer, RHCE
>         >
>         >     Red Hat
>         >
>         >
>         >
>         >
>         > _______________________________________________
>         > Pulp-dev mailing list
>         > [email protected] <mailto:[email protected]>
>         > https://www.redhat.com/mailman/listinfo/pulp-dev 
> <https://www.redhat.com/mailman/listinfo/pulp-dev>
>         >
> 
> 
>         _______________________________________________
>         Pulp-dev mailing list
>         [email protected] <mailto:[email protected]>
>         https://www.redhat.com/mailman/listinfo/pulp-dev 
> <https://www.redhat.com/mailman/listinfo/pulp-dev>
> 
> 
> 
>     _______________________________________________
>     Pulp-dev mailing list
>     [email protected] <mailto:[email protected]>
>     https://www.redhat.com/mailman/listinfo/pulp-dev 
> <https://www.redhat.com/mailman/listinfo/pulp-dev>
> 
>

signature.asc
Description: OpenPGP digital signature

_______________________________________________
Pulp-dev mailing list
[email protected]
https://www.redhat.com/mailman/listinfo/pulp-dev

Re: [Pulp-dev] pulp 3 upload API validation

Reply via email to