Hi all,

It's been a while since the 1st round of reviews. Thank you all for the valuable comments and suggestions! Note that Roger's last response went to nio-dev, but not core-libs-dev, I've therefore attached it below.

CSR [2]: the CSR is now approved. Note the write method has been changed to writeString.

Impl [3]: for performance reason and the different requirement from byte-string conversion for malformed/unmappable character handing, the implementation uses a specific method separate from the existing ones. Both Sherman and Alan preferred specific method than adding parameters to the APIs that might be error prone. Thanks Sherman for the code!

JMH benchmark tests showed the new read method performs better than an operation that reads the bytes and convert to string. The gains start to be significant even at quite reasonable sizes (< 10K).


[1] JBS: https://bugs.openjdk.java.net/browse/JDK-8201276
[2] CSR: https://bugs.openjdk.java.net/browse/JDK-8202055
[3] webrevs: http://cr.openjdk.java.net/~joehw/jdk11/8201276/webrev/

Please review.

Thanks,
Joe

On 5/15/18, 7:51 AM, Roger Riggs wrote:
Hi Joe, et. al.

My $.02 on line separators:

- We should avoid clever tricks trying to solve problems that are infrequent such as cross OS file line operations. Most files will be read/written on a single system so the line separators will be as expected.

- Have/use APIs that split lines consistently accepting both line separators so developer code can be agnostic to line separators. aka BufferedReader.readLine for developers that are processing the contents *as lines*. Those other methods already exist; if there are any gaps in line oriented processing that's a separate task.

- These new file methods are defined to handle Charset encoding/decoding and buffering. Since there are other methods to deal with files as lines these methods should not look for or break to lines.

- Performance: adding code to look for line characters will slow it down and in most cases would have no impact
    since the line endings will match the system.

- The strings passed to writeString (CharSequence) should have been constructed using the proper line separators. There are any number of methods that insert the os specific line terminator and its easy to write File.separator as needed.

As for the method naming:

I'd prefer "writeString" as a familiar term that isn't stretched too far by making the argument be a CharSequence.
its fine to call a CharSequence a string (with a lower case s).

$.02, Roger


On 4/27/18 6:33 PM, Joe Wang wrote:


On 4/27/2018 12:50 PM, fo...@univ-mlv.fr wrote:
----- Mail original -----
De: "Joe Wang" <huizhe.w...@oracle.com>
À: "Remi Forax" <fo...@univ-mlv.fr>, "Alan Bateman" <alan.bate...@oracle.com> Cc: "nio-dev" <nio-...@openjdk.java.net>, "core-libs-dev" <core-libs-dev@openjdk.java.net>
Envoyé: Vendredi 27 Avril 2018 21:21:01
Objet: Re: RFR (JDK11/NIO) 8201276: (fs) Add methods to Files for reading/writing a string from/to a file
Hi Rémi, Alan,
Hi Joe,

I'm not sure we'd want to replace system dependent line separators with
'\n'. There are cases as you described where the replacement makes
sense. However, there are other cases too. For example, the purpose is to read, edit a text file and then write it back. If this is on Windows,
and if line separators are replaced with '\n', that would cause the
resulting file not formatted properly in for example Notepad. There are cases where users write text files on Windows using Java, and only found
the lines were not separated in Notepad.
I agree that why the counterpart of readString() that write the string should inserts the platform's line separator. BTW, i just found that those methods are not named writeString, or writeCharSequence, i think they should.

While readString() does not modify the original content (e.g. by replacing the platform's line separator with '\n'), write(String) won't either, by adding extra characters such as the line separator.

I would think interfaces shall read along with the parameters.
readString(Path) == read as a String from the specified Path (one could argue for readToString, readAsString, but we generally omit the preps) write(Path, CharSequence) == write the CharSequence to the file, since CharSequence is already in the method signature as a parameter, we probably don't want to add that to the name, otherwise it would read like repeating the word CharSequence.

It is in a similar situation as write(Path, Iterable) where it was defined as writeLines(Path, Iterable).


Files.write(Path, Iterable) is also specified to terminate each line
with the platform's line separator. If readString does the replacement,
it would be inconsistent.

Anyway, if we look for consistency the methods writeCharSequence should transform the \n in the CharSequence to the platform's line separator.

Files.write(Path, Iterable) is is not a counterpart of readString(), it's consistent with Files.lines() or Files.readLines() (or BufferedReader.readLine()) that all suppress the line separators. Anyway, Files.write(path, readString(path)::line) will be consistent if you replace the line separators or not because String.line() suppresses the line separators.

readString pairs with write(String), therefore it's more like Files.write(path, readString(path)) than readString(path)::line. The use case:

    String s = Files.read(path);
    Files.write(path, s.replace("/config/location", "/new/location"));

would then work as expected.

These two methods are one-off (open/read/write/close file) operation. write(String) therefore is not intended for adding/appending multiple Strings from other operations such as String.line(). If an app needs to put the result of String.line() or any other processes into a file using this method, it needs to take care of adding the line separator itself when necessary. "when necessary" would be a judgement call by the developer.

That said, if there's a consensus on the idea of terminating the string with a line separator, then readString method would need to strip it off to go with the write method.


I would think readString behaves like readAllBytes, that would simply read all content faithfully.
readString does an interpolation due to the Charset so it's not the real content, again, the idea is that developers should not have to care about the platform's line separators (they have more important things to think).

The other solution is to just not introduce those new methods, after all Files.lines().collect(Collectors.joining("\n")) already does the job, no ?

While there are many ways to do it, none is as straight-forward. "Read (entire) file to String"/"Write String to file" are popular requests from users. Read to string -> do some String manipulation -> write it back is such a simple use case, it really needs to be very easy to do as illustrated in the above code snippet.

Cheers,
Joe


Best,
Joe
regards,
Rémi

On 4/27/2018 4:43 AM, fo...@univ-mlv.fr wrote:
Hi Alan,

People do not read the documentation.
So adding something in the documentation about when a method should be used or
not is never a solution.

Here the user want a String and provides a charset so you have no way but to
decode the content to substitute the line separator.

cheers,
Rémi

----- Mail original -----
De: "Alan Bateman" <alan.bate...@oracle.com>
À: "Remi Forax" <fo...@univ-mlv.fr>, "Joe Wang" <huizhe.w...@oracle.com>
Cc: "nio-dev" <nio-...@openjdk.java.net>, "core-libs-dev"
<core-libs-dev@openjdk.java.net>
Envoyé: Vendredi 27 Avril 2018 13:34:12
Objet: Re: RFR (JDK11/NIO) 8201276: (fs) Add methods to Files for
reading/writing a string from/to a file
On 27/04/2018 12:29, Remi Forax wrote:
I think that having a readString that includes OS dependent line separators is a
mistake,
Java does a great job to try to shield the developer from the kind of things
that makes programs behave differently on different platforms.

readString should subtitute (\r)?\n to \n otherwise either people will do a call replace() which is less efficient or will learn the lesson the hard way.

raw string literal does the same substitution for the same reason.

Yes, there are several discussion points around this and somewhat timely
with multi-string support.

One thing that I think Joe will need in this API is some note to make it clearer what the intended usage is (as I think the intention is simple
cases with mostly single lines of text).

-Alan.


Reply via email to