Perhaps we can "split" this discussion on splitting into a separate thread.  What's happened here is what always happens, which is:

 - Jim spent a lot of time and effort writing a comprehensive and clear proposal;
 - Someone made a tangential comment on one aspect of it;
 - Flood of deep-dive responses on that aspect;
 - Everyone chimes in designing their favorite method not proposed;
 - No one ever comes back to the substance of the proposal.

Hitting the reset button...


On 3/14/2018 9:11 AM, Peter Levart wrote:
I think that:

String delim = ...;
String r = s.splits(Pattern.quote(delim)).collect(Collectors.joining(delim));

... should always produce a result such that r.equals(s);
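
As a rough illustration of this invariant, the sketch below uses the existing String.split(regex, limit) rather than the proposed splits(), which does not exist yet. The class name RoundTrip and the helper roundTrip are made up for this example. It suggests the invariant only holds when trailing empty strings are kept (limit -1):

```java
import java.util.regex.Pattern;

public class RoundTrip {
    // Hypothetical helper: split on a literal delimiter, then join with the same delimiter.
    public static String roundTrip(String s, String delim, int limit) {
        return String.join(delim, s.split(Pattern.quote(delim), limit));
    }

    public static void main(String[] args) {
        String delim = ":";
        for (String s : new String[] {"a:b", "a:b:", ":", ""}) {
            // limit 0 drops trailing empty strings; limit -1 keeps them.
            System.out.println("\"" + s + "\""
                    + "  limit 0 round-trips: " + roundTrip(s, delim, 0).equals(s)
                    + "  limit -1 round-trips: " + roundTrip(s, delim, -1).equals(s));
        }
    }
}
```

With limit 0 the inputs "a:b:" and ":" lose their trailing delimiter on the way back; with limit -1 all four inputs come back unchanged.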


Otherwise, is it wise to add methods that take a regex as a String? It is rarely needed for a regex parameter to be dynamic. Usually a constant is specified. Are there any plans for Java to support Pattern constants? With constant dynamic support they would be trivial to implement in bytecode. If there are any such plans, then the methods should perhaps take a Pattern instead.
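
For comparison, this is roughly what "a constant regex" looks like in today's Java, without any new language support: the Pattern is compiled once into a static final field and reused. The class name CsvFields, the field COMMA and the method fields are invented for this sketch.

```java
import java.util.regex.Pattern;

public class CsvFields {
    // Compiled once when the class is initialized, then shared by every call.
    private static final Pattern COMMA = Pattern.compile(",");

    // Hypothetical helper: split a line on commas using the precompiled Pattern.
    public static String[] fields(String line) {
        return COMMA.split(line);
    }

    public static void main(String[] args) {
        System.out.println(fields("a,b,c").length); // prints 3
    }
}
```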

syntax suggestion:

'~' is a unary operator for bit-wise negation of integer values. It could be overloaded for String(s) such that the following two were equivalent:

~ string
Pattern.compile(string)

Now if 'string' above is a constant, '~ string' could be a constant too. Combined with raw string literals, Pattern constants could be very compact.


What do you think?

Regards, Peter

On 03/14/2018 02:35 AM, Xueming Shen wrote:
On 3/13/18, 5:12 PM, Jonathan Bluett-Duncan wrote:
Paul,

AFAICT, one sort of behaviour which String.split() allows which
Pattern.splitAsStream() and the proposed String.splits() don't is allowing
a negative limit, e.g. String.split(string, -1).

Over at http://errorprone.info/bugpattern/StringSplitter, they argue that a limit of -1 has less surprising behaviour than the default of 0, because
e.g. "".split(":") produces [""] (array with an empty string), whereas ":".split(":")
produces [] (empty array), which IMO is not consistent.

This compares with ":".split(":", -1) and "".split(":", -1), which produce ["", ""] (array with two empty strings, one for each side of the `:`) and
[""] (array with an empty string) respectively - more consistent IMO.

Should String.splits(`\n|\r\n?`) follow the behaviour of String.split(...,
0) or String.split(..., -1)?  I'd personally argue for the latter.
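
The four cases can be checked against the existing String.split with a small program along these lines. The class name SplitLimits and the helper describe are made up for this sketch; each element is printed in quotes so that an empty array and an array holding one empty string can be told apart.

```java
import java.util.Arrays;
import java.util.stream.Collectors;

public class SplitLimits {
    // Hypothetical helper: split on ":" with the given limit and render each element in quotes.
    public static String describe(String s, int limit) {
        return Arrays.stream(s.split(":", limit))
                .map(part -> "\"" + part + "\"")
                .collect(Collectors.joining(", ", "[", "]"));
    }

    public static void main(String[] args) {
        System.out.println(describe("", 0));   // [""]     no delimiter matched, original string returned
        System.out.println(describe(":", 0));  // []       trailing empty strings removed
        System.out.println(describe("", -1));  // [""]     no delimiter matched, original string returned
        System.out.println(describe(":", -1)); // ["", ""] trailing empty strings kept
    }
}
```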

These look really confusing, but ":".split(":", n) and "".split(":", n) are really two different scenarios. One is a split with a matched delimiter; the other is a split with no matched delimiter, for which the spec clearly specifies that the original string is returned, in this case the empty string "". Arguably these two don't have to be "consistent".

Personally I think the list/array returned from string.split(regex, -1) might be kind of "surprising" for the end user, in that it has a "trailing" empty string, which appears to be useless in most usage scenarios and which you probably have to handle specially.

-Sherman




Cheers,
Jonathan

On 13 March 2018 at 23:22, Paul Sandoz<paul.san...@oracle.com>  wrote:


On Mar 13, 2018, at 3:49 PM, John Rose<john.r.r...@oracle.com>  wrote:

On Mar 13, 2018, at 6:47 AM, Jim Laskey<james.las...@oracle.com>  wrote:
…
A. Line support.

public Stream<String>  lines()

Suggest factoring this as:

public Stream<String>  splits(String regex) { }
+1

This is a natural companion to the existing array-returning method (as was the case on Pattern when we added splitAsStream), where one can use a limit() operation to achieve the same effect as the limit parameter on the
array-returning method.


public Stream<String>  lines() { return splits(`\n|\r\n?`); }

See also Files/BufferedReader.lines. (Without going into details
Files.lines has some interesting optimizations.)
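
Until something like splits() exists, the suggested factoring can be approximated with Pattern.splitAsStream. This is only a sketch: the class LinesSketch and its static methods are hypothetical stand-ins for the proposed instance methods on String, and the regex is the one quoted above, written as an ordinary escaped string literal.

```java
import java.util.List;
import java.util.regex.Pattern;
import java.util.stream.Collectors;
import java.util.stream.Stream;

public class LinesSketch {
    // Line terminators from the proposal: \n, \r\n, or a lone \r.
    private static final Pattern LINE_BREAK = Pattern.compile("\\n|\\r\\n?");

    // Hypothetical stand-in for the proposed String.splits(regex).
    public static Stream<String> splits(String s, String regex) {
        return Pattern.compile(regex).splitAsStream(s);
    }

    // Hypothetical stand-in for the proposed String.lines().
    public static Stream<String> lines(String s) {
        return LINE_BREAK.splitAsStream(s);
    }

    // Convenience wrapper that collects the lines into a list.
    public static List<String> linesOf(String s) {
        return lines(s).collect(Collectors.toList());
    }

    public static void main(String[] args) {
        System.out.println(linesOf("a\nb\r\nc\rd")); // [a, b, c, d]
    }
}
```

Note that splitAsStream discards trailing empty strings, so this sketch follows the limit-0 behaviour of String.split rather than the -1 behaviour.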

Paul.


