Hi Jaikiran,
Thanks for taking this on and getting it started.
One use case of canonical storage is repeatable builds.
It would be useful to identify the uses in the JDK that would need to be
changed to use the new function.
On 8/24/21 10:07 AM, Jaikiran Pai wrote:
The java.util.Properties class allows the properties to be written out
to a stream or through a writer. In its current form, the
specification of these APIs state that a comment comprising of the
current date is always written out. The spec doesn't make any
guarantees about the order of the properties when they are written out.
There have been requests asking to make these APIs more deterministic.
These requests, from what I have seen, mainly ask for:
- A way to disable writing out the date comment
- A way to write out the properties in a deterministic and
reproducible way
There have been discussions in the mailing list in the past which have
been captured in JDK-8231640[1]. In these discussions, there has been
an inclination to not touch the current existing API implementations
and instead introduce new API(s) to achieve the proposed use cases.
Before starting off with an implementation, I wanted to try and get
some inputs on what the new API(s) would look like and what the scope
of such a work should be.
Right now, the Properties class has 2 "store" APIs:
public void store(Writer writer, String comments) throws IOException
public void store(OutputStream out, String comments) throws
IOException
I don't think two methods are needed, its easy enough for the caller to
adapt an OutputStream to a Writer
(OutputStreamWriter) and take control of the encoding, so the
OutputStream version is not essential.
For the sake of this discussion we won't be talking about the "save"
API, in that same class, which has been deprecated.
As part of this enhancement, one of the proposed API name is
"storeCanonical". As a start, I plan to use this name for the new API.
Is there any different suggestion for the name?
As for the API signature, I plan to introduce 2 new APIs:
public void storeCanonical(Writer writer, String comments) throws
IOException
public void storeCanonical(OutputStream out, String comments)
throws IOException
to match what the "store" variants currently provide. The first
difference between these "storeCanonical" and the "store" will be the
trivial part where these new "storeCanonical" implementations will no
longer write out the date comment. Users are allowed to pass optional
comments to these new APIs. If any comment is passed to these APIs,
these will continue to be written out first, before the actual
properties are written out.
'storeCanonical' captures the semantics.
Speaking of optional comments, should the APIs accept an instance of
java.util.Optional for the comments parameter. Perhaps:
public void storeCanonical(Writer writer, Optional<String>
comments) throws IOException
public void storeCanonical(OutputStream out, Optional<String>
comments) throws IOException
Optional is overkill here, using null for no comment is conventional and
matches the current usage
in the store(..) methods.
Coming to the part where we write out the properties, these APIs will
write out the properties in the lexicographical order of the property
keys. An additional enhancement perhaps could be to allow users to
pass in an optional java.util.Comparator instance to provide for
application specific ordering of the property keys while being written
out by these APIs. I am not too sure if we should introduce that. Any
inputs? If we do introduce it, we would end up with 4 new APIs:
public void storeCanonical(Writer writer, Optional<String>
comments) throws IOException
public void storeCanonical(OutputStream out, Optional<String>
comments) throws IOException
public void storeCanonical(Writer writer, Optional<String>
comments, Comparator<String> keyOrderer) throws IOException
public void storeCanonical(OutputStream out, Optional<String>
comments, Comparator<String> keyOrderer) throws IOException
Canonical usually already means a non-variable encoding, that seems the
inconsistent with
providing a Comparator.
However, it should be a goal that properties stored with storeCanonical
can be
loaded with load().
Is that worth it?
Finally, the other semantics, like the property key value separators,
how/where newlines are inserted, what character encoding is used
etc... will continue to match with the current semantics of the
"store" APIs.
If a client has the need for a custom format, its quite easy to iterate
over the contents,
sorting if desires and writing the format itself.
A custom format would not be usable with Properties.load.
Simpler is better,
I think a single new method should suffice.
Thanks, Roger
[1] https://bugs.openjdk.java.net/browse/JDK-8231640
-Jaikiran