Hi Lin
On Apr 10, 2021, at 11:16 PM, Lin Zang 
<lz...@openjdk.java.net<mailto:lz...@openjdk.java.net>> wrote:


Dear @AlanBateman  and @LanceAndersen,

Thanks a lot for your review and comments!


We should look to see if it makes sense to use some of the more recent java 
features such as Record.

If we are adding a specific class to hold the header fields, perhaps using the 
builder pattern should be considered vs constructors.

I agree, we should finalize the API before moving forward. I am not quite 
fimilar with Record, I will do some investigation, trying to use it in this PR 
and discuss with you later.

Please look at:

The Jep https://openjdk.java.net/jeps/395

A simple tutorial

https://oracle.github.io/learning-library/oci-library/oci-hol/oci-java-app/workshops/freetier/?customTrackingParam=:ow:lp:pt:::RC_WWMK200728P00009:OCIJava_HOL&elqTrackId=7e168170255b485a917a5e6602097868&elqaid=100601&elqat=2&source=:ow:lp:cpo:::RC_WWMK200728P00012:LPD400088742+:ow:lp:cpo::&lab=lab-6-records#STEP1:AsimpleRecord

Chris and Julia’s video is also very good:

https://www.youtube.com/watch?v=pxiHwrEMEtI

Between the use of Record and/or the Builder pattern, assembling the header 
values can be done quite nicely without constructor overload.

If we are writing out these additional fields, what should we do with them when 
reading a gzip file back?

IMO, generally there should be check of header crc, and some other checks like 
the header format, magic number etc. May be we could implement it like the gnu 
gzip tool, which ingore contents of most header fields but verified the general 
header info.

Sorry, that what not my point  Your current proposal provides no wayto access 
these fields similar to for example ZipEntry.  If we are going to set these 
additional fields then we should we should provided a means to access the fields

Have you looked at other gzip api implementations to see what they provide in 
this area?

Please look at api’s such as commons-compress.  They provide access to these 
fields via the API in addition to being able to set the fields.  This is what I 
was referring to above.


I have looked at the gzip implementation at 
https://github.com/linzang/gzip/blob/distrotech-gzip/gzip.c#L1281, this method 
is use to generate header crc value for check. And there is some legal header 
check in this method.  And `file name` is used to save original file name in 
gzip when it is truncated. refer to 
https://github.com/linzang/gzip/blob/distrotech-gzip/zip.c#L37

In JDK, there is a usage of `file comment` in the gzip heap dump file generated 
by jcmd `jmap -dump` command.  at 
https://github.com/openjdk/jdk/blob/ff52f2989fd60ec8251eaf76f4c4b78f10d3e048/src/hotspot/share/services/heapDumperCompression.cpp#L127
 , and this info is used in testing like an hint for uncompression, please 
refer 
https://github.com/openjdk/jdk/blob/ff52f2989fd60ec8251eaf76f4c4b78f10d3e048/test/lib/jdk/test/lib/hprof/parser/GzipRandomAccess.java#L280.

And IMHO the behavior of adding information in `file comments` seems like  a 
general way to transfer extra information between compressor and uncompressor.


Is there anyone who relies on this additional header info? As I mentioned in an 
earlier comment, we should validate the changes against another implementation 
to ensure that we can read the data back correctly and that the additional 
header data that we write can be read by other tools.

May be we could have similar behavior? which just do some general header info 
check, calculate CRC of the header and ignore the real contents of most header 
fields. Do you think it is reasonable?

Adding additional support for these fields is fine, we just need to agree on 
the API changes to move forward prior to updating the PR

Thanks!
Lin

-------------

PR: https://git.openjdk.java.net/jdk/pull/3072

[cid:E1C4E2F0-ECD0-4C9D-ADB4-B16CA7BCB7FC@home]



Lance Andersen| Principal Member of Technical Staff | +1.781.442.2037
Oracle Java Engineering
1 Network Drive
Burlington, MA 01803
lance.ander...@oracle.com<mailto:lance.ander...@oracle.com>



Reply via email to