----- Original Message -----
> From: "John Rose" <john.r.r...@oracle.com>
> To: "Remi Forax" <fo...@univ-mlv.fr>
> Cc: "Paul Sandoz" <paul.san...@oracle.com>, "nio-dev" <nio-...@openjdk.java.net>, "core-libs-dev" <core-libs-dev@openjdk.java.net>
> Sent: Wednesday, May 2, 2018 07:35:38
> Subject: Re: Hashing files/bytes <was> Re: RFR(JDK11/NIO) 8202285: (fs) Add a method to Files for comparing file contents

> Here's another potential stacking:
>
> Define an interface ByteSequence, similar to CharSequence,
> as a zero-copy reference to some stored bytes somewhere.
> (Give it a long length.) Define bulk methods on it like hash
> and mismatch and transferTo. Then make File and ByteBuffer
> implement it. Deal with the cross-product of source and
> destination types underneath the interface.
>
> (Also I want ByteSequence as a way to encapsulate resource
> data for class files and condy, using zero-copy methods.
> The types byte[] and String don't scale and require copies.)

Your ByteSequence is ByteBuffer! A ByteBuffer can be a mapped file or wrap a byte array; mismatch is compareTo, transferTo is put(ByteBuffer), and hash should be messageDigest.digest(ByteBuffer), which doesn't exist but should.

>
> -- John

Rémi

> On May 1, 2018, at 3:04 PM, fo...@univ-mlv.fr wrote:
>>
>> ----- Original Message -----
>>> From: "Paul Sandoz" <paul.san...@oracle.com>
>>> To: "Remi Forax" <fo...@univ-mlv.fr>
>>> Cc: "Alan Bateman" <alan.bate...@oracle.com>, "nio-dev" <nio-...@openjdk.java.net>, "core-libs-dev" <core-libs-dev@openjdk.java.net>
>>> Sent: Tuesday, May 1, 2018 00:37:57
>>> Subject: Hashing files/bytes <was> Re: RFR(JDK11/NIO) 8202285: (fs) Add a method to Files for comparing file contents
>>
>>> Thanks, better than I expected with the transferTo method we recently added, but
>>> I think we could do even better for the ease-of-use case of "give me the hash
>>> of this file's contents or these bytes or this byte buffer".
>>
>> Yes, it can be a nice addition to java.nio.file.Files, and in that case the
>> documentation of the method that compares content can refer to this new
>> method.
>>
>>>
>>> Paul.
>>
>> Rémi
>>
>>>
>>>> On Apr 30, 2018, at 3:23 PM, Remi Forax <fo...@univ-mlv.fr> wrote:
>>>>
>>>>>
>>>>> To Remi's point this might dissuade/guide developers from using this method when
>>>>> there are other more efficient techniques available when operating at larger
>>>>> scales. However, it is unfortunately harder than it should be in Java to hash
>>>>> the contents of a file, a byte[] or ByteBuffer, according to some chosen
>>>>> algorithm (or a good default).
>>>>
>>>> It's 6 lines of code:
>>>>
>>>> var digest = MessageDigest.getInstance("SHA1");
>>>> try(var input = Files.newInputStream(Path.of("myfile.txt"));
>>>>     var output = new DigestOutputStream(OutputStream.nullOutputStream(), digest)) {
>>>>   input.transferTo(output);
>>>> }
>>>> var hash = digest.digest();
>>>>
>>>> or 3 lines if you don't mind loading the whole file into memory:
>>>>
>>>> var digest = MessageDigest.getInstance("SHA1");
>>>> digest.update(Files.readAllBytes(Path.of("myfile.txt")));
>>>> var hash = digest.digest();
>>>>
>>>>>
>>>>> Paul.
>>>>
>>> Rémi
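
For illustration, here is a minimal sketch of the ByteBuffer mapping suggested in the reply at the top of this thread, assuming JDK 11 or later. The class name ByteBufferOps and the helper hash are hypothetical, not an existing or proposed JDK API. The helper stands in for the missing digest(ByteBuffer) by calling MessageDigest.update(ByteBuffer), which does exist. For the mismatch case it uses ByteBuffer.mismatch (added in JDK 11) rather than compareTo, because mismatch returns the index of the first differing byte.

```java
import java.nio.ByteBuffer;
import java.nio.charset.StandardCharsets;
import java.security.MessageDigest;
import java.security.NoSuchAlgorithmException;

// Hypothetical helper class, for illustration only.
final class ByteBufferOps {

    // Hashes the remaining bytes of the buffer without moving its position.
    static byte[] hash(String algorithm, ByteBuffer bytes) {
        try {
            MessageDigest digest = MessageDigest.getInstance(algorithm);
            // update(ByteBuffer) consumes the buffer it is given, so pass a
            // duplicate to leave the caller's position and limit untouched.
            digest.update(bytes.duplicate());
            return digest.digest();
        } catch (NoSuchAlgorithmException e) {
            throw new IllegalArgumentException("unknown algorithm: " + algorithm, e);
        }
    }

    public static void main(String[] args) {
        ByteBuffer a = ByteBuffer.wrap("hello world".getBytes(StandardCharsets.UTF_8));
        ByteBuffer b = ByteBuffer.wrap("hello there".getBytes(StandardCharsets.UTF_8));

        // "mismatch": relative index of the first differing byte, or -1 if none.
        System.out.println("mismatch at " + a.mismatch(b));

        // "transferTo": bulk copy of the remaining bytes into another buffer.
        ByteBuffer copy = ByteBuffer.allocate(a.remaining());
        copy.put(a.duplicate());
        copy.flip();
        System.out.println("copy equals source: " + copy.equals(a));

        // "hash": digest of the remaining bytes.
        System.out.println("SHA-256 length: " + hash("SHA-256", a).length);
    }
}
```

The same helper should work unchanged for a buffer obtained from FileChannel.map, which is what would make a mapped file and a wrapped byte[] interchangeable here. A ByteBuffer is limited to int-sized capacity, though, so this does not cover the "long length" part of the ByteSequence idea.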