Re: Which release artifact should we expect to be reproducible?

2023-10-19 Thread Mark Thomas

On 19/10/2023 03:17, Christopher Schultz wrote:



But Mark, if you missed my message from the 13th, you'll see that the 
problem is I'm running a slightly different version of Java than you 
are, and the exact spelling of the version string is causing the problem 
-- mostly in MANIFEST.MF files because the whole JRE's version string is 
present in there and not just the version number.


I did see that but filed it under the known issue that JARs that don't 
get passed through BND end up with the Ant and Java version numbers in 
the manifest. Fixing that is on my TODO list.


A recent commit of mine adds the release version number (only) to the 
build.properties.release file so it can be checked for a match in 
verify-release. I wonder if we should check the full version string to 
ensure the verifier and releaser are using the exact same versions. 
That's really the only way to prevent someone from attempting to verify 
a release and claiming it's not reproducible for not-relevant reasons.


With the current build process, I agree with you that we need to check 
the exact Java version used. I'm hopeful that with the manifest fix, we 
could create repeatable builds with different Java/Ant versions. How 
different I'm not sure. Hopefully within a major Java version. If we are 
lucky, across major Java versions.
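
A minimal sketch of the "check the full version string" idea discussed above, in Python rather than Ant. The property keys (`java.version.full`, `ant.version`) and the Ant version shown are hypothetical placeholders, not the keys or values actually written to build.properties.release; the two Java version strings are the ones reported elsewhere in this thread.

```python
def parse_properties(text):
    """Parse simple key=value lines, skipping blanks and # comments."""
    props = {}
    for line in text.splitlines():
        line = line.strip()
        if not line or line.startswith("#") or "=" not in line:
            continue
        key, value = line.split("=", 1)
        props[key.strip()] = value.strip()
    return props


def toolchain_mismatches(recorded, local, keys):
    """Return {key: (recorded, local)} for every key whose values differ."""
    return {
        key: (recorded.get(key), local.get(key))
        for key in keys
        if recorded.get(key) != local.get(key)
    }


# Hypothetical content recorded by the release manager's build.
recorded = parse_properties("""
# illustrative keys only
java.version.full=21+35-2513
ant.version=1.10.14
""")

# Hypothetical values detected on the verifier's machine.
local = {"java.version.full": "21+35-LTS", "ant.version": "1.10.14"}

mismatches = toolchain_mismatches(
    recorded, local, ["java.version.full", "ant.version"]
)
```

With a check like this, a verifier could be told up front that the toolchain differs, instead of seeing an unexplained hash mismatch later.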


And I'd very much like to make it next-to-trivial for anyone to verify a 
release build.


+1

Mark

-
To unsubscribe, e-mail: dev-unsubscr...@tomcat.apache.org
For additional commands, e-mail: dev-h...@tomcat.apache.org



Re: Which release artifact should we expect to be reproducible?

2023-10-19 Thread Emmanuel Bourg

On 19/10/2023 at 04:17, Christopher Schultz wrote:

But Mark, if you missed my message from the 13th, you'll see that the 
problem is I'm running a slightly different version of Java than you 
are, and the exact spelling of the version string is causing the problem 
-- mostly in MANIFEST.MF files because the whole JRE's version string is 
present in there and not just the version number.


I think the Created-By field should be removed. I've had a quick look at 
the 11.0.0-M13 release and the manifests in tomcat-*.jar don't have it. 
I've found it only in bootstrap.jar and in the external dependencies.
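
To illustrate why dropping Created-By helps, here is a rough Python sketch that removes toolchain-specific headers from manifest text before comparing. It is only an assumption about how such a comparison could be done, not how the Tomcat build or BND handles it. The header list and the sample manifests are illustrative; the two Created-By values are the ones reported elsewhere in this thread.

```python
# Headers assumed to vary with the toolchain; adjust as needed.
VOLATILE_HEADERS = ("Created-By", "Ant-Version")


def strip_volatile(manifest_text, headers=VOLATILE_HEADERS):
    """Drop the named headers (and their continuation lines) from a manifest."""
    kept = []
    skipping = False
    for line in manifest_text.splitlines():
        if line.startswith(" "):
            # A leading space marks a continuation of the previous header.
            if not skipping:
                kept.append(line)
            continue
        name = line.split(":", 1)[0]
        skipping = name in headers
        if not skipping:
            kept.append(line)
    return "\n".join(kept)


release_manifest = (
    "Manifest-Version: 1.0\n"
    "Created-By: 21+35-2513 (Oracle Corporation)\n"
    "Main-Class: org.apache.catalina.startup.Bootstrap\n"
)
local_manifest = (
    "Manifest-Version: 1.0\n"
    "Created-By: 21+35-LTS (Eclipse Adoptium)\n"
    "Main-Class: org.apache.catalina.startup.Bootstrap\n"
)

same_after_strip = strip_volatile(release_manifest) == strip_volatile(local_manifest)
```

This only shows that the remaining headers match; a reproducible build still needs the header to be absent (or fixed) in the JAR itself.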


Emmanuel Bourg





Re: Which release artifact should we expect to be reproducible?

2023-10-18 Thread Christopher Schultz

Mark,

On 10/18/23 11:43, Mark Thomas wrote:

On 18/10/2023 15:06, Konstantin Kolinko wrote:

Wed, 18 Oct 2023 at 14:55, Mark Thomas wrote:


On 17/10/2023 16:36, Mark Thomas wrote:


It looks like Javadoc generation is different between Linux and Windows
with Java 21. That is still causing issues for the full-docs package for
Tomcat 11. I'm still looking into options for fixing that. Other than
that, I'm not seeing any reproducibility issues for those files.


I've got as far as figuring out what is causing the problem.

This commit

https://github.com/openjdk/jdk/commit/e9f3e325c274f19b0f6eceea2367708e3be689e9

causes the files from $JAVA_HOME/legal/jdk.javadoc to be added to the
legal directory in the created javadoc. In Linux, some of those files
are symlinks so the entire file gets copied whereas in Windows some of
those files are text files that reference the symlink target.

I am currently leaning towards writing an Ant task that will replace
those "link" files on Windows with the target of the link. It will need
to run after the Javadoc.


Maybe this will be fixed in JDK itself?


It looks like it should be.


Essentially their fix for "8259530" (the commit that you referenced)
is incomplete on Windows,
and that is a legal issue.


+1


BTW, Reviewing that commit, I see that there exists a command-line
option, "--legal-notices" that can be set to "none".

BTW, the files can be seen in apache-tomcat-11.0.0-M13-fulldocs.tar.gz
e.g. \tomcat-11.0-doc\api\legal\LICENSE is the following one nonsense 
line:


 Please see ..\java.base\LICENSE


So, do we try and fix this to get back to completely reproducible builds 
or do we accept that the full-docs package isn't reproducible until this 
bug gets fixed?


Given this is just the full-docs, I'm leaning towards raising an OpenJDK 
bug and accepting that the full-docs package won't be 100% reproducible 
at the moment.


+1

In the "verify-release" ant target, I'm already ignoring the fulldocs 
artifact, though I am /checking/ it before ignoring the result.


But Mark, if you missed my message from the 13th, you'll see that the 
problem is I'm running a slightly different version of Java than you 
are, and the exact spelling of the version string is causing the problem 
-- mostly in MANIFEST.MF files because the whole JRE's version string is 
present in there and not just the version number.


A recent commit of mine adds the release version number (only) to the 
build.properties.release file so it can be checked for a match in 
verify-release. I wonder if we should check the full version string to 
ensure the verifier and releaser are using the exact same versions. 
That's really the only way to prevent someone from attempting to verify 
a release and claiming it's not reproducible for not-relevant reasons.


And I'd very much like to make it next-to-trivial for anyone to verify a 
release build.


-chris




Re: Which release artifact should we expect to be reproducible?

2023-10-18 Thread Mark Thomas

On 18/10/2023 16:43, Mark Thomas wrote:

On 18/10/2023 15:06, Konstantin Kolinko wrote:

Wed, 18 Oct 2023 at 14:55, Mark Thomas wrote:


On 17/10/2023 16:36, Mark Thomas wrote:


It looks like Javadoc generation is different between Linux and Windows
with Java 21. That is still causing issues for the full-docs package for
Tomcat 11. I'm still looking into options for fixing that. Other than
that, I'm not seeing any reproducibility issues for those files.


I've got as far as figuring out what is causing the problem.

This commit

https://github.com/openjdk/jdk/commit/e9f3e325c274f19b0f6eceea2367708e3be689e9

causes the files from $JAVA_HOME/legal/jdk.javadoc to be added to the
legal directory in the created javadoc. In Linux, some of those files
are symlinks so the entire file gets copied whereas in Windows some of
those files are text files that reference the symlink target.

I am currently leaning towards writing an Ant task that will replace
those "link" files on Windows with the target of the link. It will need
to run after the Javadoc.


Maybe this will be fixed in JDK itself?


It looks like it should be.


Essentially their fix for "8259530" (the commit that you referenced)
is incomplete on Windows,
and that is a legal issue.


+1


BTW, Reviewing that commit, I see that there exists a command-line
option, "--legal-notices" that can be set to "none".

BTW, the files can be seen in apache-tomcat-11.0.0-M13-fulldocs.tar.gz
e.g. \tomcat-11.0-doc\api\legal\LICENSE is the following one nonsense 
line:


 Please see ..\java.base\LICENSE


So, do we try and fix this to get back to completely reproducible builds 
or do we accept that the full-docs package isn't reproducible until this 
bug gets fixed?


Given this is just the full-docs, I'm leaning towards raising an OpenJDK 
bug and accepting that the full-docs package won't be 100% reproducible 
at the moment.


https://bugs.openjdk.org/browse/JDK-8318469

I'm not currently planning to fix this for Tomcat. I think it will only 
affect Tomcat 11 at the moment although it looks like the partial fix is 
going to be back-ported to Java 11 so we'll likely see this issue 
for all versions eventually.


Mark




Re: Which release artifact should we expect to be reproducible?

2023-10-18 Thread Mark Thomas

On 18/10/2023 15:06, Konstantin Kolinko wrote:

Wed, 18 Oct 2023 at 14:55, Mark Thomas wrote:


On 17/10/2023 16:36, Mark Thomas wrote:


It looks like Javadoc generation is different between Linux and Windows
with Java 21. That is still causing issues for the full-docs package for
Tomcat 11. I'm still looking into options for fixing that. Other than
that, I'm not seeing any reproducibility issues for those files.


I've got as far as figuring out what is causing the problem.

This commit

https://github.com/openjdk/jdk/commit/e9f3e325c274f19b0f6eceea2367708e3be689e9

causes the files from $JAVA_HOME/legal/jdk.javadoc to be added to the
legal directory in the created javadoc. In Linux, some of those files
are symlinks so the entire file gets copied whereas in Windows some of
those files are text files that reference the symlink target.

I am currently leaning towards writing an Ant task that will replace
those "link" files on Windows with the target of the link. It will need
to run after the Javadoc.


Maybe this will be fixed in JDK itself?


It looks like it should be.


Essentially their fix for "8259530" (the commit that you referenced)
is incomplete on Windows,
and that is a legal issue.


+1


BTW, Reviewing that commit, I see that there exists a command-line
option, "--legal-notices" that can be set to "none".

BTW, the files can be seen in apache-tomcat-11.0.0-M13-fulldocs.tar.gz
e.g. \tomcat-11.0-doc\api\legal\LICENSE is the following one nonsense line:

 Please see ..\java.base\LICENSE


So, do we try and fix this to get back to completely reproducible builds 
or do we accept that the full-docs package isn't reproducible until this 
bug gets fixed?


Given this is just the full-docs, I'm leaning towards raising an OpenJDK 
bug and accepting that the full-docs package won't be 100% reproducible 
at the moment.


Mark




Re: Which release artifact should we expect to be reproducible?

2023-10-18 Thread Konstantin Kolinko
Wed, 18 Oct 2023 at 14:55, Mark Thomas wrote:
>
> On 17/10/2023 16:36, Mark Thomas wrote:
>
> > It looks like Javadoc generation is different between Linux and Windows
> > with Java 21. That is still causing issues for the full-docs package for
> > Tomcat 11. I'm still looking into options for fixing that. Other than
> > that, I'm not seeing any reproducibility issues for those files.
>
> I've got as far as figuring out what is causing the problem.
>
> This commit
>
> https://github.com/openjdk/jdk/commit/e9f3e325c274f19b0f6eceea2367708e3be689e9
>
> causes the files from $JAVA_HOME/legal/jdk.javadoc to be added to the
> legal directory in the created javadoc. In Linux, some of those files
> are symlinks so the entire file gets copied whereas in Windows some of
> those files are text files that reference the symlink target.
>
> I am currently leaning towards writing an Ant task that will replace
> those "link" files on Windows with the target of the link. It will need
> to run after the Javadoc.

Maybe this will be fixed in JDK itself?

Essentially their fix for "8259530" (the commit that you referenced)
is incomplete on Windows,
and that is a legal issue.


BTW, Reviewing that commit, I see that there exists a command-line
option, "--legal-notices" that can be set to "none".

BTW, the files can be seen in apache-tomcat-11.0.0-M13-fulldocs.tar.gz
e.g. \tomcat-11.0-doc\api\legal\LICENSE is the following one nonsense line:

Please see ..\java.base\LICENSE

Best regards,
Konstantin Kolinko




Re: Which release artifact should we expect to be reproducible?

2023-10-18 Thread Mark Thomas

On 17/10/2023 16:36, Mark Thomas wrote:

It looks like Javadoc generation is different between Linux and Windows 
with Java 21. That is still causing issues for the full-docs package for 
Tomcat 11. I'm still looking into options for fixing that. Other than 
that, I'm not seeing any reproducibility issues for those files.


I've got as far as figuring out what is causing the problem.

This commit

https://github.com/openjdk/jdk/commit/e9f3e325c274f19b0f6eceea2367708e3be689e9

causes the files from $JAVA_HOME/legal/jdk.javadoc to be added to the 
legal directory in the created javadoc. In Linux, some of those files 
are symlinks so the entire file gets copied whereas in Windows some of 
those files are text files that reference the symlink target.


I am currently leaning towards writing an Ant task that will replace 
those "link" files on Windows with the target of the link. It will need 
to run after the Javadoc.
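
The post-processing step described above could look roughly like the following Python sketch (the proposal itself is for an Ant task, which is not shown here). It assumes the one-line stub format quoted elsewhere in this thread, "Please see ..\java.base\LICENSE", and that the relative path is resolved against the JDK's legal/jdk.javadoc directory. The function name and the directory layout in the demo are hypothetical.

```python
import re
import tempfile
from pathlib import Path, PureWindowsPath

# Assumed stub format: a single line pointing at the real file.
STUB = re.compile(r"Please see (\S+)")


def resolve_link_stubs(javadoc_legal_dir, jdk_module_legal_dir):
    """Replace one-line stub files with the content of the file they name.

    Returns the names of the files that were replaced.
    """
    replaced = []
    for path in sorted(Path(javadoc_legal_dir).iterdir()):
        if not path.is_file():
            continue
        text = path.read_text(encoding="utf-8", errors="replace").strip()
        match = STUB.fullmatch(text)
        if not match:
            continue
        # Stub paths use Windows separators; split them portably.
        parts = PureWindowsPath(match.group(1)).parts
        target = Path(jdk_module_legal_dir).joinpath(*parts).resolve()
        if target.is_file():
            path.write_bytes(target.read_bytes())
            replaced.append(path.name)
    return replaced


# Demo on a throwaway directory tree (layout is illustrative only).
with tempfile.TemporaryDirectory() as tmp:
    root = Path(tmp)
    jdk_legal = root / "jdk" / "legal"
    (jdk_legal / "java.base").mkdir(parents=True)
    (jdk_legal / "jdk.javadoc").mkdir()
    (jdk_legal / "java.base" / "LICENSE").write_bytes(b"full licence text\n")

    out = root / "api" / "legal"
    out.mkdir(parents=True)
    (out / "LICENSE").write_bytes(b"Please see ..\\java.base\\LICENSE\n")

    replaced = resolve_link_stubs(out, jdk_legal / "jdk.javadoc")
    resolved_bytes = (out / "LICENSE").read_bytes()
```

Files that are not stubs, or whose target cannot be found, are left untouched, so running this on a Linux-built tree should be a no-op.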


Mark




Re: Which release artifact should we expect to be reproducible?

2023-10-17 Thread Mark Thomas
I've fixed a couple of issues that were breaking reproducibility for 
some files. I'll back-port those as soon as I have sent this mail.


It looks like Javadoc generation is different between Linux and Windows 
with Java 21. That is still causing issues for the full-docs package for 
Tomcat 11. I'm still looking into options for fixing that. Other than 
that, I'm not seeing any reproducibility issues for those files.


Mark





Re: Which release artifact should we expect to be reproducible?

2023-10-13 Thread Christopher Schultz

Emmanuel,

On 10/12/23 18:13, Emmanuel Bourg wrote:

On 12/10/2023 at 23:27, Christopher Schultz wrote:

I installed the ZIP version of Temurin Java 21 to match your release 
toolchain and I get every file being different. But the versions are 
not exactly the same, so that may be the reason:


Release Java: 21+35-2513
Local Java:   21+35-LTS

I'm also using Cp1252 instead of UTF-8 (ew). I'll try to change that 
and see if it changes anything.


Did you try comparing the files with diffoscope [1]? That would allow 
you to quickly see what varies and prevents the build from being 
reproducible.


It looks like it does come down to the exact JDK being used. The summary 
of differences for, for example, apache-tomcat-11.0.0-M13.tar.gz is:


│ │ │ -Created-By: 21+35-2513 (Oracle Corporation)
│ │ │ +Created-By: 21+35-LTS (Eclipse Adoptium)

Over and over again in MANIFEST.MF files.

So it does look like version string changes can be an issue.
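
For readers without diffoscope to hand, a much cruder check can be sketched in Python: hash every regular file in two tar archives and list the names whose digests differ. Unlike diffoscope, this sketch does not recurse into nested archives such as JARs, so it would only point at the differing JAR, not at the MANIFEST.MF inside it. The helper names and the in-memory demo archives are illustrative.

```python
import hashlib
import io
import tarfile


def member_digests(fileobj):
    """Map each regular file in a tar archive to its SHA-512 hex digest."""
    digests = {}
    with tarfile.open(fileobj=fileobj, mode="r:*") as tar:
        for member in tar:
            if member.isfile():
                data = tar.extractfile(member).read()
                digests[member.name] = hashlib.sha512(data).hexdigest()
    return digests


def differing_members(archive_a, archive_b):
    """Names that are missing from one archive or whose content differs."""
    a, b = member_digests(archive_a), member_digests(archive_b)
    return sorted(name for name in a.keys() | b.keys() if a.get(name) != b.get(name))


def make_tar(files):
    """Build a small in-memory .tar.gz for the demo below."""
    buf = io.BytesIO()
    with tarfile.open(fileobj=buf, mode="w:gz") as tar:
        for name, data in files.items():
            info = tarfile.TarInfo(name)
            info.size = len(data)
            tar.addfile(info, io.BytesIO(data))
    buf.seek(0)
    return buf


release = make_tar({
    "META-INF/MANIFEST.MF": b"Created-By: 21+35-2513 (Oracle Corporation)\n",
    "LICENSE": b"same in both\n",
})
local = make_tar({
    "META-INF/MANIFEST.MF": b"Created-By: 21+35-LTS (Eclipse Adoptium)\n",
    "LICENSE": b"same in both\n",
})

diff_names = differing_members(release, local)
```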

-chris




Re: Which release artifact should we expect to be reproducible?

2023-10-12 Thread Emmanuel Bourg

On 12/10/2023 at 23:27, Christopher Schultz wrote:

I installed the ZIP version of Temurin Java 21 to match your release 
toolchain and I get every file being different. But the versions are not 
exactly the same, so that may be the reason:


Release Java: 21+35-2513
Local Java:   21+35-LTS

I'm also using Cp1252 instead of UTF-8 (ew). I'll try to change that and 
see if it changes anything.


Did you try comparing the files with diffoscope [1]? That would allow 
you to quickly see what varies and prevents the build from being 
reproducible.


Emmanuel Bourg

[1] https://diffoscope.org




Re: Which release artifact should we expect to be reproducible?

2023-10-12 Thread Christopher Schultz

Mark,

On 10/12/23 15:50, Christopher Schultz wrote:

Mark,

On 10/12/23 13:15, Mark Thomas wrote:

12 Oct 2023 10:29:02 Christopher Schultz :


All,

I've been working on an "ant verify-release" target and I'm finding 
that in the 9.0 release -- the one I'm using as a guinea pig -- the 
SHA-512 hashes do not match for these artifacts:


  apache-tomcat-9.0.82-fulldocs.tar.gz
  apache-tomcat-9.0.82-src.tar.gz
  apache-tomcat-9.0.82-src.zip

They have different file sizes. The *-src artifacts seem to be off 
only by a few bytes (of file size, I haven't compared the contents 
yet) but the fulldocs are quite different.


I'm thinking that maybe these artifacts aren't expected to match 100% 
but I'm not entirely sure. If it's possible to get these to be 
reproducible, I think it would be good.


I did notice that the build contains  in many places and in 
some places we are converting to CRLF and LF in others. Sometimes we 
are using UTF-8 and ISO-8859-1 in others. These are always specified, 
so I wouldn't expect there to be a problem in these areas with 
reproducibility (because they are consistently inconsistent).


Building the fulldocs tar looks like we do not perform a fixcrlf on 
all files that will go into the archive, so if Rémy built on Linux 
(he did) and I verified on Windows (I did) I think maybe the 
line-endings are the problem.


Do we want these artifacts to be 100% reproducible? If so, we have a 
little bit of work to do.


With the exact same version of Ant and the exact same JVM version and 
vendor the builds should be repeatable.


I'm using the exact same versions of the JDK and ant as Rémy, though it 
is on a different platform. Should we expect cross-platform 
repeatability? I should hope so. The other release artifacts I didn't 
mention are all identical (e.g. binary tarballs, .zips, and .exes).


I have checked repeatability across Linux / Windows for some versions 
and it was OK.


Might need to diff the build.xml files to see if they have diverged.


I have committed my verify-release ant target to main. Please have a 
look and see if you spot any errors in the implementation. I definitely 
got different sha512 sums for the above 3 files when I performed the 
build locally. NOTE: The verify-release target currently *ignores* the 
checks for the above files on the off-chance it was intentional. But the 
build will perform the checks and issue a notification... before telling 
you that the build was perfect when it wasn't.


Since the tarball and .exes were identical, I reported the build as 
"repeatable" for the vote.


I'm not yet able to test for repeatability for 11.0.x because I haven't 
yet installed Java 21 on my Windows VM. Chocolatey doesn't yet have that 
package and I'd prefer to use that to the standard packages from 
Eclipse/Temurin/Adoptium/whatever because they are far easier to update.


I installed the ZIP version of Temurin Java 21 to match your release 
toolchain and I get every file being different. But the versions are not 
exactly the same, so that may be the reason:


Release Java: 21+35-2513
Local Java:   21+35-LTS

I'm also using Cp1252 instead of UTF-8 (ew). I'll try to change that and 
see if it changes anything.


-chris




Re: Which release artifact should we expect to be reproducible?

2023-10-12 Thread Christopher Schultz

Mark,

On 10/12/23 13:15, Mark Thomas wrote:

12 Oct 2023 10:29:02 Christopher Schultz :


All,

I've been working on an "ant verify-release" target and I'm finding 
that in the 9.0 release -- the one I'm using as a guinea pig -- the 
SHA-512 hashes do not match for these artifacts:


  apache-tomcat-9.0.82-fulldocs.tar.gz
  apache-tomcat-9.0.82-src.tar.gz
  apache-tomcat-9.0.82-src.zip

They have different file sizes. The *-src artifacts seem to be off 
only by a few bytes (of file size, I haven't compared the contents 
yet) but the fulldocs are quite different.


I'm thinking that maybe these artifacts aren't expected to match 100% 
but I'm not entirely sure. If it's possible to get these to be 
reproducible, I think it would be good.


I did notice that the build contains  in many places and in 
some places we are converting to CRLF and LF in others. Sometimes we 
are using UTF-8 and ISO-8859-1 in others. These are always specified, 
so I wouldn't expect there to be a problem in these areas with 
reproducibility (because they are consistently inconsistent).


Building the fulldocs tar looks like we do not perform a fixcrlf on 
all files that will go into the archive, so if Rémy built on Linux (he 
did) and I verified on Windows (I did) I think maybe the line-endings 
are the problem.
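
One quick way to test the line-ending theory is to normalise CRLF to LF in both copies of a file and see whether they then match. The Python sketch below is only an illustration of that check; the function names are made up and it is not part of the build.

```python
def normalize_eol(data: bytes) -> bytes:
    """Convert Windows (CRLF) line endings to Unix (LF)."""
    return data.replace(b"\r\n", b"\n")


def differs_only_by_line_endings(a: bytes, b: bytes) -> bool:
    """True when the inputs differ, but only in CRLF versus LF."""
    return a != b and normalize_eol(a) == normalize_eol(b)


linux_copy = b"<html>\n<body>docs</body>\n</html>\n"
windows_copy = b"<html>\r\n<body>docs</body>\r\n</html>\r\n"

eol_only = differs_only_by_line_endings(linux_copy, windows_copy)
```

If this returns True for every differing file in the fulldocs archive, a missing fixcrlf step is the likely cause; if not, something else is varying as well.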


Do we want these artifacts to be 100% reproducible? If so, we have a 
little bit of work to do.


With the exact same version of Ant and the exact same JVM version and 
vendor the builds should be repeatable.


I'm using the exact same versions of the JDK and ant as Rémy, though it 
is on a different platform. Should we expect cross-platform 
repeatability? I should hope so. The other release artifacts I didn't 
mention are all identical (e.g. binary tarballs, .zips, and .exes).


I have checked repeatability across Linux / Windows for some versions 
and it was OK.


Might need to diff the build.xml files to see if they have diverged.


I have committed my verify-release ant target to main. Please have a 
look and see if you spot any errors in the implementation. I definitely 
got different sha512 sums for the above 3 files when I performed the 
build locally. NOTE: The verify-release target currently *ignores* the 
checks for the above files on the off-chance it was intentional. But the 
build will perform the checks and issue a notification... before telling 
you that the build was perfect when it wasn't.
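
The behaviour described above (check everything, but treat some artifacts as "ignored") can be sketched in Python as follows. This is not the actual verify-release target; the function name, the artifact contents, and the choice to return failures and warnings separately are assumptions made for illustration.

```python
import hashlib


def verify(artifacts, expected, ignored=frozenset()):
    """Compare SHA-512 digests of artifacts against the expected values.

    artifacts: name -> bytes built locally
    expected:  name -> hex digest published with the release
    ignored:   names whose mismatch is reported but does not fail the check

    Returns (failures, warnings), both sorted lists of artifact names.
    """
    failures, warnings = [], []
    for name in sorted(expected):
        actual = hashlib.sha512(artifacts[name]).hexdigest()
        if actual == expected[name]:
            continue
        (warnings if name in ignored else failures).append(name)
    return failures, warnings


# Illustrative data: the binary tarball matches, the fulldocs do not.
published = {
    "apache-tomcat-9.0.82.tar.gz": hashlib.sha512(b"binary").hexdigest(),
    "apache-tomcat-9.0.82-fulldocs.tar.gz": hashlib.sha512(b"docs-linux").hexdigest(),
}
built = {
    "apache-tomcat-9.0.82.tar.gz": b"binary",
    "apache-tomcat-9.0.82-fulldocs.tar.gz": b"docs-windows",
}

failures, warnings = verify(
    built, published, ignored={"apache-tomcat-9.0.82-fulldocs.tar.gz"}
)
```

Keeping the warnings in the result makes it possible for the final summary to say "reproducible except for the ignored artifacts" rather than reporting a perfect build.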


Since the tarball and .exes were identical, I reported the build as 
"repeatable" for the vote.


I'm not yet able to test for repeatability for 11.0.x because I haven't 
yet installed Java 21 on my Windows VM. Chocolatey doesn't yet have that 
package and I'd prefer to use that to the standard packages from 
Eclipse/Temurin/Adoptium/whatever because they are far easier to update.


-chris




Re: Which release artifact should we expect to be reproducible?

2023-10-12 Thread Mark Thomas

12 Oct 2023 10:29:02 Christopher Schultz :


All,

I've been working on an "ant verify-release" target and I'm finding 
that in the 9.0 release -- the one I'm using as a guinea pig -- the 
SHA-512 hashes do not match for these artifacts:


  apache-tomcat-9.0.82-fulldocs.tar.gz
  apache-tomcat-9.0.82-src.tar.gz
  apache-tomcat-9.0.82-src.zip

They have different file sizes. The *-src artifacts seem to be off only 
by a few bytes (of file size, I haven't compared the contents yet) but 
the fulldocs are quite different.


I'm thinking that maybe these artifacts aren't expected to match 100% 
but I'm not entirely sure. If it's possible to get these to be 
reproducible, I think it would be good.


I did notice that the build contains  in many places and in 
some places we are converting to CRLF and LF in others. Sometimes we 
are using UTF-8 and ISO-8859-1 in others. These are always specified, 
so I wouldn't expect there to be a problem in these areas with 
reproducibility (because they are consistently inconsistent).


Building the fulldocs tar looks like we do not perform a fixcrlf on all 
files that will go into the archive, so if Rémy built on Linux (he did) 
and I verified on Windows (I did) I think maybe the line-endings are 
the problem.


Do we want these artifacts to be 100% reproducible? If so, we have a 
little bit of work to do.


With the exact same version of Ant and the exact same JVM version and 
vendor the builds should be repeatable.


I have checked repeatability across Linux / Windows for some versions and 
it was OK.


Might need to diff the build.xml files to see if they have diverged.

Mark
