Hello Stuart,
On 09/09/21 12:35 am, Stuart Marks wrote:
Unless there's an overriding reason, it might be nice to have the
output format match the format used in the Debian patch that adds
SOURCE_DATE_EPOCH:
https://salsa.debian.org/openjdk-team/openjdk/-/blob/master/debian/patches/reproducible-properties-timestamp.diff
So the current patch implementation uses the format "d MMM yyyy
HH:mm:ss 'GMT'", with a Locale.ROOT (for locale neutral formatting).
I chose this format since that was the one that the (deprecated)
java.util.Date#toGMTString() was using.
Roger's suggestion is to use the DateTimeFormatter#RFC_1123_DATE_TIME
date format, which is "dow, d MMM yyyy HH:mm:ss GMT" (where dow == day
of week).
IMO, either of these formats is "well known", since they are/were
used within the JDK, especially the
DateTimeFormatter#RFC_1123_DATE_TIME which Roger suggested, since
that's even a public spec.
The one in the Debian patch is "yyyy-MM-dd HH:mm:ss z" which, although
fine to use, feels a bit "less known".
I was leaning towards Roger's suggestion to use the
RFC_1123_DATE_TIME in my upcoming patch update. Is there a reason why
the one in Debian's patch is preferable to a spec-backed
format?
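For comparison, here is a minimal sketch that renders the same instant in the three candidate formats discussed above. The class and method names (CandidateFormats, gmtString, rfc1123, debian) are made up for illustration and are not taken from the PR or from the Debian patch; only the pattern strings come from this thread.

```java
import java.time.Instant;
import java.time.ZoneOffset;
import java.time.ZonedDateTime;
import java.time.format.DateTimeFormatter;
import java.util.Locale;

// Hypothetical demo class; not code from the PR or the Debian patch.
public class CandidateFormats {
    // Layout that the (deprecated) java.util.Date#toGMTString() produced.
    static final DateTimeFormatter GMT_STRING =
            DateTimeFormatter.ofPattern("d MMM yyyy HH:mm:ss 'GMT'", Locale.ROOT);

    // Pattern quoted from the Debian patch.
    static final DateTimeFormatter DEBIAN =
            DateTimeFormatter.ofPattern("yyyy-MM-dd HH:mm:ss z", Locale.ROOT);

    // Interpret epoch seconds as a date-time fixed to UTC.
    static ZonedDateTime utc(long epochSeconds) {
        return Instant.ofEpochSecond(epochSeconds).atZone(ZoneOffset.UTC);
    }

    static String gmtString(long epochSeconds) {
        return GMT_STRING.format(utc(epochSeconds));
    }

    static String rfc1123(long epochSeconds) {
        // Predefined formatter; its day and month names are fixed English text.
        return DateTimeFormatter.RFC_1123_DATE_TIME.format(utc(epochSeconds));
    }

    static String debian(long epochSeconds) {
        return DEBIAN.format(utc(epochSeconds));
    }

    public static void main(String[] args) {
        long epoch = 0L; // stand-in for a SOURCE_DATE_EPOCH value
        System.out.println(gmtString(epoch));
        System.out.println(rfc1123(epoch));
        System.out.println(debian(epoch));
    }
}
```

All three render the same instant; they differ only in layout, which is what matters to anything that parses the comment line.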
My point in bringing this up is to consider interoperability. I
don't have a strong preference over the particular date format. As far
as I can see, there are currently two behaviors "in the wild":
1) Baseline OpenJDK 17 behavior:
dow mon dd hh:mm:ss zzz yyyy
This is the behavior provided by "new Date().toString()" and has
likely not changed in many years. Of course, the actual values reflect
the current time and locale, which hurts reproducibility, but the
format itself hasn't changed.
2) Debian's OpenJDK with SOURCE_DATE_EPOCH set:
yyyy-MM-dd HH:mm:ss z
The question is, what format should the fix for JDK-8231640 use?
I had said earlier that it might be a good idea to match the Debian
format. But thinking about this further, I think sticking with the
original JDK format would be preferable. The Debian change is after
all an outlier.
So the more specific question is, should we try to continue with the
original JDK format or choose a format that's "better" in some sense?
It seems to me that if there's something out there that parses the
date from a properties file, we'd want to avoid breaking this code if
the environment variable is set. So maybe stick with the original
format in all cases. But of course for reproducibility use the epoch
value from the environment and set the locale and zone offset to known
values.
All this makes sense. So I've updated the PR to continue to use the same
date format that java.util.Date.toString() was using, in both cases:
when writing out the current date (in the absence of SOURCE_DATE_EPOCH)
and when writing out the SOURCE_DATE_EPOCH value. When writing out the
SOURCE_DATE_EPOCH value, we will however fix the timezone to UTC and the
locale to ROOT for reproducibility.
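
To illustrate the idea, here is a rough sketch of that behavior. It is not the code in the PR: the class name, the dateComment helper and the exact pattern string are assumptions made for this example, and error handling for a malformed SOURCE_DATE_EPOCH value is left out.

```java
import java.time.Instant;
import java.time.ZoneId;
import java.time.format.DateTimeFormatter;
import java.util.Date;
import java.util.Locale;

// Hypothetical sketch; names and pattern are assumptions, not the PR's code.
public class ReproducibleDateComment {
    // Pattern assumed to mirror the "dow mon dd hh:mm:ss zzz yyyy" layout
    // of java.util.Date#toString().
    static final String DATE_PATTERN = "EEE MMM dd HH:mm:ss zzz yyyy";

    // Returns the "#<date>" comment line for a properties file.
    // sourceDateEpoch is the raw SOURCE_DATE_EPOCH value (seconds), or null.
    static String dateComment(String sourceDateEpoch) {
        if (sourceDateEpoch == null) {
            // No override: current time, default zone and locale, as before.
            return "#" + new Date();
        }
        long epochSeconds = Long.parseLong(sourceDateEpoch.trim());
        // Zone pinned to UTC and locale to ROOT so the output does not
        // depend on the environment the build runs in.
        DateTimeFormatter formatter = DateTimeFormatter
                .ofPattern(DATE_PATTERN, Locale.ROOT)
                .withZone(ZoneId.of("UTC"));
        return "#" + formatter.format(Instant.ofEpochSecond(epochSeconds));
    }

    public static void main(String[] args) {
        System.out.println(dateComment(System.getenv("SOURCE_DATE_EPOCH")));
    }
}
```

With SOURCE_DATE_EPOCH set, repeated runs should produce the same comment line regardless of the machine's default time zone or locale; without it, the output is whatever new Date().toString() gives today.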
-Jaikiran