Re: RFR: 8276743: Make openjdk build Zip Archive generation "reproducible"

Erik Joelsson Tue, 09 Nov 2021 09:35:23 -0800

On Tue, 9 Nov 2021 17:26:05 GMT, Erik Joelsson <er...@openjdk.org> wrote:


>> @erikj79 I had a bit of a think, and part of the point of unzipping and 
>> then regenerating is not having to load all the entries into memory. You 
>> can't guarantee the order in which "zip" created them, so realistically I'd 
>> have to read all the ZipEntry objects into memory and then rewrite them, 
>> which we can do. src.zip is only 55MB or so, so memory requirements won't be 
>> huge given that src.zip is currently the only target.
>
> You are already keeping all the filenames in memory for sorting, so reading 
> in the ZipEntry objects isn't that much more data, just some extra metadata 
> for each entry. The actual file contents are not part of the ZipEntry object. 
> When actually copying the files, you can use the ZipFile class to access 
> entries in arbitrary order and read each one's contents as an InputStream.

Actually, you don't even need to keep the ZipEntry objects in memory. You can 
just extract the filenames from them on the first pass, sort them, then look 
up the entries in the ZipFile again on the second pass. :) I don't think 
that's necessary though.
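
For illustration, the two-pass idea could look roughly like the sketch below. 
This is not the code from the PR: the class name SortedZipRewriter and the 
method rewriteSorted are made up for this example, and it only fixes the entry 
order. Timestamps and other per-entry metadata are left at whatever 
ZipOutputStream defaults to, so a reproducible build would have to normalize 
those separately.

```java
import java.io.IOException;
import java.io.InputStream;
import java.io.OutputStream;
import java.nio.file.Files;
import java.nio.file.Path;
import java.util.ArrayList;
import java.util.Collections;
import java.util.Enumeration;
import java.util.List;
import java.util.zip.ZipEntry;
import java.util.zip.ZipFile;
import java.util.zip.ZipOutputStream;

// Hypothetical helper for illustration only; not part of the JDK build.
public class SortedZipRewriter {

    // Copies every entry of 'in' to 'out' in sorted name order and
    // returns the sorted list of names that were written.
    public static List<String> rewriteSorted(Path in, Path out) throws IOException {
        List<String> names = new ArrayList<>();
        try (ZipFile zip = new ZipFile(in.toFile())) {
            // First pass: keep only the entry names, not the ZipEntry objects.
            Enumeration<? extends ZipEntry> entries = zip.entries();
            while (entries.hasMoreElements()) {
                names.add(entries.nextElement().getName());
            }
            Collections.sort(names);

            // Second pass: look each entry up by name and copy its stream.
            try (OutputStream fileOut = Files.newOutputStream(out);
                 ZipOutputStream zos = new ZipOutputStream(fileOut)) {
                for (String name : names) {
                    ZipEntry source = zip.getEntry(name);
                    // A fresh entry carries only the name; timestamps are
                    // NOT normalized in this sketch.
                    zos.putNextEntry(new ZipEntry(name));
                    if (!source.isDirectory()) {
                        try (InputStream is = zip.getInputStream(source)) {
                            is.transferTo(zos); // requires Java 9 or later
                        }
                    }
                    zos.closeEntry();
                }
            }
        }
        return names;
    }
}
```

Only the list of names is held in memory between the two passes. The file 
contents are streamed one entry at a time.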

-------------

PR: https://git.openjdk.java.net/jdk/pull/6311
