Re: StringJoiner: detect or fix delimiter collision?

2014-02-06 Thread Philip Hodges
String is probably the most commonly used class in the whole of Java.
Please please please do not pollute it with this dangerous new method.

Your Java-is-cool example is ridiculously trivial, of no practical use 
whatsoever, and absolutely *not* cool.
How about a proper test where you have to double up the delimiter if it occurs 
in an element:
assertEquals('Java','isn''t','perfect', '+String.join(',', 
javaIsNotPerfect)+');

I now have a working proof of concept, where newJoiner returns an 
UnsplittableJoiner which requires a delimiter handler to convert it back into a 
Joiner. Actually my joiner is an ObjectJoiner, unlike your StringJoiner that is 
just a CharSequenceJoiner, unlike AbstractCollections.toString and 
Arrays.toString which can join any elements that have a toString method.

Please have the guts to pull String.join from the release now, before it 
becomes an antipattern, and you have to deprecate it.
And tell everybody exactly why.
Seriously.


On 27 Jan 2014, at 21:31, Philip Hodges philip.hod...@bluewin.ch wrote:

 I did not predict that it would be a sprintf. It's not going to be 
 consistently misused anything like so frequently.
 I compared it to the unescaped XML generation antipattern.
 
 I have not seen any technical justifications whatsoever so far, just 
 inexplicable enthusiasm.
 
 It is like giving young drivers a faster car with no seat belts.
 
 The trouble is, unlike in Google guava, this is the JDK, you can't just drop 
 flawed beta interfaces two versions later. All you can do is add them to 
 static review tool blacklists and deprecate them to cause a warning at 
 compile time, and hope that people will give them a wide berth. Even if you 
 hide them in an unoffical sun.com package. By the way, I'm looking forward to 
 the proprietary Base64 being changed to simply call the new official one. You 
 can't even modify an equals method in an undocumented reflection 
 implementation in 1.7.0.51 without breaking production applications. Wow.
 
 I am raising doubt and you are ignoring it.
 Don't you have the guts to say whoa, stop the bandwagon, there could just be 
 a real problem with this, it really will make it easier to create bugs 
 without any warning from the compiler and without making anyone's code better 
 *in any way at all*?
 
 I am picking on String.join precisely because I have seen way too many 
 delimiter collision bugs, not just in Java but in several other languages, 
 and because this interface perpetuates a real world problem by preventing 
 future interface changes to detect them.
 
 [I am not always a doomsayer. For example, I am not picking on JSR310 because 
 Date and Calendar and SimpleDateFormat are well known disaster areas and the 
 people working on their replacement have a deep understanding of the issues, 
 have really taken their time, and nothing whatsoever that I have read about 
 it jumped out at me and said this is *wrong*. It might not even be completely 
 perfect. But I am confident it will be so much better than what we have now, 
 and it's a shame that I won't have time to migrate existing apps to it.]
 
 The counterproposal, no, proposal refinement, wafting around inside my head, 
 is to somehow compel programmers to make just one more method call before 
 that final toString(). It will be difficult to find good names, especially 
 ones that will be understood by programmers for whom English is not a first 
 language. Something like:
 
 String.join(delimiter, joinables).assertNoUnescapedDelimiters().toString();
 String.join(delimiter, joinables).neverMindDelimiterCollisions().toString();
 String.join(delimiter, joinables).promiseNoUnescapedDelimiters().toString();
 String.join(delimiter, joinables).escapeDelimiters(escapeChar).toString();
 String.join(delimiter, joinables).quoteElements(quoteChar).toString();
 
 
 The vital thing is that String.join has to return an unjoinable, that needs 
 an adapter method to make it safely joinable. If you get that right, then we 
 can forgive this first shot being a little slow, and enjoy the triumph of 
 CharSequence over immutability.
 
 Yes, ultimately the goal should be to add full support for at least the most 
 popular csv generation recipes.
 
 I'm really sorry I couldn't carry on arguing the case before August. As a 
 minority, I only have one person's quota of energy. I will try to get some 
 more people to look at it.
 
 
 On 27 Jan 2014, at 18:44, Mike Duigou mike.dui...@oracle.com wrote:
 
 
 On Jan 26 2014, at 17:12 , Philip Hodges philip.hod...@bluewin.ch wrote:
 
 Please please please drop StringJoiner from Java 1.8 before it is too late.
 
 It is well past too late. The API has been frozen since August for all but 
 the most exceptional cases.
 
 At first I thought they were cool. Then I tried to use them in anger.
 And was forced to roll my own.
 
 I would encourage you to share your implementation or the javadocs as 
 grist for discussion.  An actual alternative is the *only* thing

Re: StringJoiner: detect or fix delimiter collision?

2014-01-30 Thread Philip Hodges
I did not predict that it would be a sprintf. It's not going to be 
consistently misused anything like so frequently.
I compared it to the unescaped XML generation antipattern.

I have not seen any technical justifications whatsoever so far, just 
inexplicable enthusiasm.

It is like giving young drivers a faster car with no seat belts.

The trouble is, unlike in Google guava, this is the JDK, you can't just drop 
flawed beta interfaces two versions later. All you can do is add them to static 
review tool blacklists and deprecate them to cause a warning at compile time, 
and hope that people will give them a wide berth. Even if you hide them in an 
unoffical sun.com package. By the way, I'm looking forward to the proprietary 
Base64 being changed to simply call the new official one. You can't even modify 
an equals method in an undocumented reflection implementation in 1.7.0.51 
without breaking production applications. Wow.

I am raising doubt and you are ignoring it.
Don't you have the guts to say whoa, stop the bandwagon, there could just be a 
real problem with this, it really will make it easier to create bugs without 
any warning from the compiler and without making anyone's code better *in any 
way at all*?

I am picking on String.join precisely because I have seen way too many 
delimiter collision bugs, not just in Java but in several other languages, and 
because this interface perpetuates a real world problem by preventing future 
interface changes to detect them.

[I am not always a doomsayer. For example, I am not picking on JSR310 because 
Date and Calendar and SimpleDateFormat are well known disaster areas and the 
people working on their replacement have a deep understanding of the issues, 
have really taken their time, and nothing whatsoever that I have read about it 
jumped out at me and said this is *wrong*. It might not even be completely 
perfect. But I am confident it will be so much better than what we have now, 
and it's a shame that I won't have time to migrate existing apps to it.]

The counterproposal, no, proposal refinement, wafting around inside my head, is 
to somehow compel programmers to make just one more method call before that 
final toString(). It will be difficult to find good names, especially ones that 
will be understood by programmers for whom English is not a first language. 
Something like:

String.join(delimiter, joinables).assertNoUnescapedDelimiters().toString();
String.join(delimiter, joinables).neverMindDelimiterCollisions().toString();
String.join(delimiter, joinables).promiseNoUnescapedDelimiters().toString();
String.join(delimiter, joinables).escapeDelimiters(escapeChar).toString();
String.join(delimiter, joinables).quoteElements(quoteChar).toString();


The vital thing is that String.join has to return an unjoinable, that needs an 
adapter method to make it safely joinable. If you get that right, then we can 
forgive this first shot being a little slow, and enjoy the triumph of 
CharSequence over immutability.

Yes, ultimately the goal should be to add full support for at least the most 
popular csv generation recipes.

I'm really sorry I couldn't carry on arguing the case before August. As a 
minority, I only have one person's quota of energy. I will try to get some more 
people to look at it.


On 27 Jan 2014, at 18:44, Mike Duigou mike.dui...@oracle.com wrote:

 
 On Jan 26 2014, at 17:12 , Philip Hodges philip.hod...@bluewin.ch wrote:
 
 Please please please drop StringJoiner from Java 1.8 before it is too late.
 
 It is well past too late. The API has been frozen since August for all but 
 the most exceptional cases.
 
 At first I thought they were cool. Then I tried to use them in anger.
 And was forced to roll my own.
 
 I would encourage you to share your implementation or the javadocs as grist 
 for discussion.  An actual alternative is the *only* thing that is likely 
 to be persuasive.
 
 You seem to be very much in the minority on this issue by the silence from 
 other responders on this list and elsewhere. The concerns you've highlighted 
 with regard to delimiter management, while certainly valid, have not been of 
 sufficient concern to suggest a change in the proposal. The prediction that 
 using StringJoiner will become Java's sprintf seems unlikely to be be true. 
 The reception we've seen thus far for StringJoiner has been otherwise 
 exclusively enthusiastic and positive. 
 
 Alternatives, refinements and counterproposals are always welcome in the 
 discussion. Doomsaying unfortunately adds little.
 
 Mike



Re: StringJoiner: detect or fix delimiter collision?

2014-01-30 Thread Philip Hodges
Please please please drop StringJoiner from Java 1.8 before it is too late.

It is not needed. It does not add value. It is an embarrassment.
We did without it for years. It is not long long overdue. We do not need it now.
It does not need to be in the very first Java 1.8 release.
Try leaving it out, and see if people even complain.
Most people won't even notice it, let alone be ready to misuse it straight away.

String.join is slower than Arrays.toString() or Collections.toString(). Test 
failed.
String.join(delimiter, iterable) needs extra code to call toString() if 
elements are not already CharSequence. Convenience? Test failed.
StringJoiner does not force or even remind the developer to consider delimiter 
collisions. Test failed.

It is a perfect complement for split: the hard to use interface that can't 
split the joined input reliably because it might have too many delimiters.

It will be the newest star on java antipattern web sites. A good place would be 
just after the item about not Assembling XML with String operations because 
the text parts might contain special characters.

Even if it ever did appear that StringJoiner and String.join did what we wanted,
I would still much rather roll my own than use the new StringJoiner in the 
library.
Then I could set the initial capacity of the StringBuilder, skip the defensive 
copies and null checks, and above all, introduce some extra mandatory method 
call to force the developer to tell the compiler what is being done about 
delimiter collisions, even if it is to add an assert that there are no 
delimiters in the added CharSequences, or to waive that check because numbers 
never never contain comma delimiters (hold on ... better check the locale to 
see if comma is used as a decimal point or grouping character after all).

I came across this non-justification [with my additions] on a blog:

StringJoiner and String.join(...) are long, long overdue. They are so long 
overdue that the vast majority of Java developers likely have already written 
or have found [or have had to mend] utilities for joining strings [to make an 
unsplittable delimiter collision], but it is nice [or at least well-meant] for 
the JDK to finally provide this itself. Everyone has encountered situations 
where joining strings is required [and they forgot all about escaping, or 
should have used a PreparedStatement instead], and it is a Good Thing™ that we 
can now express that [bug opportunity] through a standard API that every [no, 
just some] Java developer (eventually) will know [of, but not actually use, and 
even if they do they will need an IDE to tell them that the prefix comes 
*after* the delimiter].



 To quote Joshua Bloch on good API design: 'When in doubt leave it out' (or 
 'You can always add, but you can never remove')



 
 
 On 31 Aug 2013, at 23:13, Mike Duigou mike.dui...@oracle.com wrote:
 
 
 On Aug 30 2013, at 23:40 , Philip Hodges wrote:
 ...
 
 Why is there such a bandwagon rolling for this convenience feature?
 
 Perhaps others just don't agree with you. The choice of functionality offered 
 in the JDK 8 StringJoiner was not arbitrary or haphazardly chosen. If it 
 doesn't meet your particular needs, I am sorry to hear that. Our belief as 
 the proposers is that it offers a reasonable mix of functionality and 
 simplicity to cover useful number of requirements. Our intent was certainly 
 not to create the kitchen magician [1] of string joiners that attempted to 
 incorporate every conceivable option. In particular your concerns about 
 escaping are out of scope for the joiner in part because
 
 CollectionString strings;
 
 String concat = 
 strings.stream().map(MyUtils::escape).collect(Collections.joining(,));
 
 seems like an adequate and flexible solution to most people.
 
 We already have joiners in apache commons and guava.
 
 This is the reason that StringJoiner was considered for JDK--that many others 
 were rolling their own.
 
 At first I thought they were cool. Then I tried to use them in anger.
 And was forced to roll my own.
 
 I would encourage you to share your implementation or the javadocs as grist 
 for discussion.  An actual alternative is the *only* thing that is likely to 
 be persuasive.
 
 That can't be right.
 
 Sometimes it is. The JDK doesn't make any attempt to satisfy everyone (or 
 anyone). We all end up writing lots of non-business logic utility functions 
 to bridge gaps in what JDK offers. This is normal. And, if it turns out to be 
 something lots of people need then it's entirely possible that it could be 
 added to the JDK.
 
 Mike
 
 [1] https://www.youtube.com/watch?v=cGVG9xLHb84
 
 
 A more elaborate offically blessed feature that
 only does half the job is worse than useless.
 Without the extra complex ability to detect or avoid collisions
 it is neither nice, nor a Good Thing.
 
 Phil
 
 http://www.techempower.com/blog/2013/03/26/everything-about-java-8/
 
 StringJoiner and String.join(...) are long, long overdue

Re: StringJoiner: detect or fix delimiter collision?

2013-09-03 Thread Philip Hodges
I don't think my needs are in any way particular or unusual. This is the third 
attempt at a StringJoiner that has not stopped me having to roll my own. 
Arrays.toString and AbstractCollection.toString already meet logging needs and 
serve as simple useful examples for where people don't care about delimiter 
collision.

Your strings/stream/map/escape/collect/joining idiom is impressive and might 
well be adequate and flexible. It is also quite beyond incomprehensible to 
many. Even I don't immediately see what the stream and collect methods 
contribute to the processing, I will look it up later.

A lot of those will simply not think about delimiter collision until it bites 
them in production, because your joiner does not by default insist on either a 
waiver or an escaper or a checker for separators in the collection where you 
wrote map(escape). Maybe they will join some numbers first, then copy that code 
to join some identifiers, and then copy that code to join some names, and it 
will work fine. Until one of the names contains an apostrophe or comma. Or 
they will quickly join a few lines for a web page, separated by a br tag, and 
forget to convert the lines to HTML, or verify that each line is a valid and 
safe HTML subset fragment.

This feature as it stands adds precious little convenience value, while making 
it easier for people to forget to provide any escaping at all. You are missing 
a golden opportunity to ensure that the code only compiles if it is complete 
with a waiver, detector or escaper. I've seen the consequences of escaping 
being left out all too often.

My own rolled code consists of a looper, an appender, and a replacer. I found 
it best if the appender method that chooses and injects the separator also 
inserts the quotes and chooses and calls the escaper (replacer). The code is 
clear apart from the replacer implementation, which has to use the charAt idiom 
instead of foreach to iterate the string without the defensive copy, and has 
inline code to escape the escape character itself as well as the quote 
character. If I was doing it over again, I would try to estimate and pass an 
initial capacity for the StringBuilder, and see if I really have to break the 
append chain in order to pass that same StringBuilder to the replacer instead 
of letting it create its own. I would also change the TODO use guava joiner 
comment to just say *see* guava joiner and see StringJoiner since java 1.8 :( 
It's more than persuasive enough to drive me to keep trying to pass on my 
doubts in this forum, but I would rather spare you the javadocs and 
implementation.

Phil

To quote Joshua Bloch on good API design: 'When in doubt leave it out' (or 'You 
can always add, but you can never remove').



On 31 Aug 2013, at 23:13, Mike Duigou mike.dui...@oracle.com wrote:


On Aug 30 2013, at 23:40 , Philip Hodges wrote:
 ...
 
 Why is there such a bandwagon rolling for this convenience feature?

Perhaps others just don't agree with you. The choice of functionality offered 
in the JDK 8 StringJoiner was not arbitrary or haphazardly chosen. If it 
doesn't meet your particular needs, I am sorry to hear that. Our belief as the 
proposers is that it offers a reasonable mix of functionality and simplicity to 
cover useful number of requirements. Our intent was certainly not to create the 
kitchen magician [1] of string joiners that attempted to incorporate every 
conceivable option. In particular your concerns about escaping are out of scope 
for the joiner in part because

CollectionString strings;

String concat = 
strings.stream().map(MyUtils::escape).collect(Collections.joining(,));

seems like an adequate and flexible solution to most people.

 We already have joiners in apache commons and guava.

This is the reason that StringJoiner was considered for JDK--that many others 
were rolling their own.

 At first I thought they were cool. Then I tried to use them in anger.
 And was forced to roll my own.

I would encourage you to share your implementation or the javadocs as grist for 
discussion.  An actual alternative is the *only* thing that is likely to be 
persuasive.

 That can't be right.

Sometimes it is. The JDK doesn't make any attempt to satisfy everyone (or 
anyone). We all end up writing lots of non-business logic utility functions to 
bridge gaps in what JDK offers. This is normal. And, if it turns out to be 
something lots of people need then it's entirely possible that it could be 
added to the JDK.

Mike

[1] https://www.youtube.com/watch?v=cGVG9xLHb84

 
 A more elaborate offically blessed feature that
 only does half the job is worse than useless.
 Without the extra complex ability to detect or avoid collisions
 it is neither nice, nor a Good Thing.
 
 Phil
 
 http://www.techempower.com/blog/2013/03/26/everything-about-java-8/
 
 StringJoiner and String.join(...) are long, long overdue. They are so long 
 overdue that the vast majority of Java developers likely have already written

Re: StringJoiner: detect or fix delimiter collision?

2013-09-03 Thread Philip Hodges

Are we really going ahead with an implementation that:
- checks even string literal parameters for null
- does it again when String.join calls StringJoiner
- makes defensive copies of just some of the arguments
- creates a StringBuilder with only the default capacity

People would be better off taking a look at
AbstractCollection.toString or Arrays.toString and then,
if they haven't already, roll their own, enhanced with a
StringBuilder capacity, and without all the defensive baggage.

To come back to my original point about why it is a toy interface:

The only worthwhile value that StringJoiner can possibly add is a way
to force programmers to make a conscious decision to waive checking
to see if the separator (or prefix or suffix) can or does occur
in any of the elements, and to apply an escaper/quoter if needed.

Please add some failing round trip unit tests with split to
help raise awareness of the issue of delimiter collision.

The original String elements cannot be extracted from the result
if there are no elements, one empty element, any null elements,
or if the separator occurs in any of the elements.

Please add a test of String.join with delimiter and no varargs elements,
and another test with one null varargs element. I don't feel comfortable
with a signature that does not clearly separate the delimiter from the elements.

Please put an example with escaping in the javadoc.

You might want to change that ridiculous Java-is-cool example.
Of course there are a lot of cool things about Java, and even
String.join cannot compete with acknowledged uncool features
such as Calendar, mutable Dates and unthreadsafe Formatters.
Nevertheless, the example text risks becoming ironic.

Please take a proper critical look at whether there is any
genuine need to dump this half-baked toy into everyone's jdk.
There are some good ideas in Java 8. This is not one of them.

It will soon be starring behind the scenes in CERT advisories.
Once people start using it, even if you do accept and try to fix the
flaws, you will then have two versions in circulation for many years.
And then I will be able to say I told you so, instead of
thank goodness we woke up and stopped it so we could get it right first time.

Why is there such a bandwagon rolling for this convenience feature?

It is not long overdue.
Arrays.toString and AbstractCollection.toString have been there for years.
We already have joiners in apache commons and guava.
At first I thought they were cool. Then I tried to use them in anger.
And was forced to roll my own. That can't be right.

A more elaborate offically blessed feature that
only does half the job is worse than useless.
Without the extra complex ability to detect or avoid collisions
it is neither nice, nor a Good Thing.

Phil

http://www.techempower.com/blog/2013/03/26/everything-about-java-8/

StringJoiner and String.join(...) are long, long overdue. They are so long overdue that the vast majority of Java 
developers likely have already written or have found utilities for joining strings, but it is nice for the JDK to 
finally provide this itself. Everyone has encountered situations where joining strings is required, and it is a Good 
Thing™ that we can now express that through a standard API that every Java developer (eventually) will know. 



On 2013-07-23 22:09, Mike Duigou wrote:


On Jul 23 2013, at 12:43 , ph wrote:


didn't see delimiter collision mentioned anywhere - are you really offering
an interface that only joins a list of entries without escaping or quoting
or even checking for separators (or escape or quote sequences) that might
occur in the entries?


Correct. StringJoiner makes no effort to address escaping/quoting.

 Doing so would add a lot of complexity that would not
 useful interesting to most users and probably still wouldn't satisfy everyone.


If you wish some form of escaping or quoting you can pre-processed entries 
however you like before joining them.

Mike