On Jan 20, 2015, at 5:35 PM, Xueming Shen <xueming.s...@oracle.com> wrote:

> On 1/20/15 8:17 AM, Paul Sandoz wrote:
>> Hi,
>>
>> http://cr.openjdk.java.net/~psandoz/jdk9/JDK-8069325-Pattern-splitAsStream-emptyInput/webrev/
>>
>> This patch fixes an edge case in Pattern.splitAsStream for matching against
>> an empty input string, which deviated from the behaviour of Pattern.split.
>> When there are no matches, a stream containing the input string should be
>> returned rather than an empty stream.
>>
>> --
>>
>> I have kept compatibility with Pattern.split(String) but I noticed another
>> edge case.
>>
>> What should the following return:
>>
>>   Pattern.compile("").split("")
>>
>> [] or [""]?
>>
>> There is a zero-width match at the beginning and an empty remaining segment,
>> both of which should be discarded, so I would expect the result to be
>> [] rather than [""], which is what is currently produced.
>
> It may depend on how the "trailing empty string" gets interpreted. Is it
> possible to interpret it this way: the empty string is the result of the
> "substring between the beginning 0-width match and the end of the input
> sequence", and anything after that is "trailing"?
>

Seems a stretch to me. Consider the following, which returns []:

  Pattern.compile("x").split("x");

Replace "x" with "" and intuitively I would expect the same behaviour.

> It would be clear if the spec explicitly said that the result of splitting
> an empty input is an empty string.
>
> I would assume someone, most likely a user of String.split(), will get hit
> by this "incompatible" change.
>

Yeah, there is some risk in that.

Paul.
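
For anyone who wants to reproduce the cases discussed above, here is a minimal sketch. The class name SplitEdgeCases and the helper split(regex, input) are made up for illustration; only Pattern.compile and Pattern.split come from the JDK. It prints array lengths rather than the arrays themselves, because Arrays.toString renders both [] and [""] as "[]".

```java
import java.util.regex.Pattern;

class SplitEdgeCases {
    // Hypothetical helper: compile the regex and split the input with the
    // default limit of 0, which discards trailing empty strings.
    public static String[] split(String regex, String input) {
        return Pattern.compile(regex).split(input);
    }

    public static void main(String[] args) {
        // One match covering the whole input: the empty segments on both
        // sides are dropped, leaving an empty array.
        System.out.println(split("x", "x").length);  // 0

        // The case in question: zero-width match on empty input. The thread
        // reports [""] as the currently produced result, i.e. length 1.
        System.out.println(split("", "").length);    // 1

        // No match at all: the input itself is returned as the only element.
        System.out.println(split("x", "").length);   // 1
    }
}
```

The second call is the one under discussion: the change floated above would make it return [] like the first call, which is where the compatibility risk for existing String.split() users comes from.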