On Tue, Jul 11, 2017 at 1:20 PM, Brian Bouterse <[email protected]> wrote:
> We should not raise a validation exception because due to semver we can't > stop raising that exception later. Specifically if we want to ever allow > double checksum being specified in the future. > > For the MVP, I think the choices are: only respect sha256 and ignore the > rest OR do as many standard validators as they specify. I'm ok with either > of these with a slight preference for validating as many of them as are > specified would be the best. Since we're already handling all of the data > at save time, pushing it through additional digest validators will cost a > bit of cpu but not much additional I/O. Also if the user asked us to do > that validation then having it demand some additional resources is OK. Also > having feature parity with the changesets is good. The changesets can > validate all standard digest validators so having uploads do the same would > be consistent. > I am reversing what I had previously said, and I agree that we should not raise an exception if a user provides more than one checksum at upload time. My current implementation checks every checksum specified by the user. > On Mon, Jul 10, 2017 at 5:11 PM, Jeff Ortel <[email protected]> wrote: > >> >> >> On 07/10/2017 02:36 PM, Dennis Kliban wrote: >> > On Mon, Jul 10, 2017 at 3:26 PM, Michael Hrivnak <[email protected] >> <mailto:[email protected]>> wrote: >> > >> > >> > >> > On Mon, Jul 10, 2017 at 3:06 PM, Dennis Kliban <[email protected] >> <mailto:[email protected]>> wrote: >> > >> > The upload API for Artifacts is going to allow users to specify >> the artifact size and a digest. The >> > Artifact model currently supports 'md5', 'sha1', 'sha224', >> 'sha256', 'sha384', and 'sha512' digests. >> > >> > Do we want to let users specify more than one digest per >> upload? e.g. md5 and sha256? >> > >> > >> > There may be no harm in this, but it would add complexity to the >> verification and not add much value. I'd >> > stick with just one unless there's a compelling reason for multiple. >> > >> > >> > I agree. The API is going to raise a validation exception when more >> than 1 digest is provided. >> >> +1 >> >> > >> > >> > >> > >> > >> > Do we want to store all 6 digests for each Artifact? >> > >> > >> > The expensive part of calculating the digests is reading the file. >> As long as you're already reading the >> > entire file, which we will during verification, you may as well >> stuff the bits through multiple hashers >> > (digesters?) and get all the digests. Pulp 2 has a function that >> does this: >> > >> > https://github.com/pulp/pulp/blob/2.13-release/server/pulp/ >> server/util.py#L327-L353 >> > <https://github.com/pulp/pulp/blob/2.13-release/server/ >> pulp/server/util.py#L327-L353> >> > >> > But we can't always guarantee that we'll have all the checksums >> available, for at least two reasons. 1) If >> > in the future if we want to use yet another algorithm, we probably >> won't want to run a migration that >> > re-reads every file and calculates the additional digest. 2) For >> on-demand content, we don't have it >> > locally, so we can't calculate any additional checksums until it >> gets fetched. >> > >> > So this may be one of those times where we use a good-ole-fashioned >> getter method that returns the >> > requested digest if it's on the artifact, calculates it if not, or >> raises an exception if the value isn't >> > available and can't be calculated. >> > >> > >> > For uploaded Artifacts, all of the digests will be calculated as the >> file is being processed during the >> > upload. So I don't think calculating all of them should incur >> significantly more cost than just one. The code >> > snippet from Pulp 2 looks similar to what I am doing. >> >> This functionality should be a method on the Artifact and not a util >> function somewhere. >> >> > >> > I haven't given much thought to the getter, but your idea sounds fine >> to me. >> > Thanks, >> > Dennis >> > >> > >> > >> > >> > -- >> > >> > Michael Hrivnak >> > >> > Principal Software Engineer, RHCE >> > >> > Red Hat >> > >> > >> > >> > >> > _______________________________________________ >> > Pulp-dev mailing list >> > [email protected] >> > https://www.redhat.com/mailman/listinfo/pulp-dev >> > >> >> >> _______________________________________________ >> Pulp-dev mailing list >> [email protected] >> https://www.redhat.com/mailman/listinfo/pulp-dev >> >> > > _______________________________________________ > Pulp-dev mailing list > [email protected] > https://www.redhat.com/mailman/listinfo/pulp-dev > >
_______________________________________________ Pulp-dev mailing list [email protected] https://www.redhat.com/mailman/listinfo/pulp-dev
