Re: StringJoiner: detect or fix delimiter collision?
String is probably the most commonly used class in the whole of Java. Please please please do not pollute it with this dangerous new method. Your Java-is-cool example is ridiculously trivial, of no practical use whatsoever, and absolutely *not* cool. How about a proper test where you have to double up the delimiter if it occurs in an element: assertEquals('Java','isn''t','perfect', '+String.join(',', javaIsNotPerfect)+'); I now have a working proof of concept, where newJoiner returns an UnsplittableJoiner which requires a delimiter handler to convert it back into a Joiner. Actually my joiner is an ObjectJoiner, unlike your StringJoiner that is just a CharSequenceJoiner, unlike AbstractCollections.toString and Arrays.toString which can join any elements that have a toString method. Please have the guts to pull String.join from the release now, before it becomes an antipattern, and you have to deprecate it. And tell everybody exactly why. Seriously. On 27 Jan 2014, at 21:31, Philip Hodges philip.hod...@bluewin.ch wrote: I did not predict that it would be a sprintf. It's not going to be consistently misused anything like so frequently. I compared it to the unescaped XML generation antipattern. I have not seen any technical justifications whatsoever so far, just inexplicable enthusiasm. It is like giving young drivers a faster car with no seat belts. The trouble is, unlike in Google guava, this is the JDK, you can't just drop flawed beta interfaces two versions later. All you can do is add them to static review tool blacklists and deprecate them to cause a warning at compile time, and hope that people will give them a wide berth. Even if you hide them in an unoffical sun.com package. By the way, I'm looking forward to the proprietary Base64 being changed to simply call the new official one. You can't even modify an equals method in an undocumented reflection implementation in 1.7.0.51 without breaking production applications. Wow. I am raising doubt and you are ignoring it. Don't you have the guts to say whoa, stop the bandwagon, there could just be a real problem with this, it really will make it easier to create bugs without any warning from the compiler and without making anyone's code better *in any way at all*? I am picking on String.join precisely because I have seen way too many delimiter collision bugs, not just in Java but in several other languages, and because this interface perpetuates a real world problem by preventing future interface changes to detect them. [I am not always a doomsayer. For example, I am not picking on JSR310 because Date and Calendar and SimpleDateFormat are well known disaster areas and the people working on their replacement have a deep understanding of the issues, have really taken their time, and nothing whatsoever that I have read about it jumped out at me and said this is *wrong*. It might not even be completely perfect. But I am confident it will be so much better than what we have now, and it's a shame that I won't have time to migrate existing apps to it.] The counterproposal, no, proposal refinement, wafting around inside my head, is to somehow compel programmers to make just one more method call before that final toString(). It will be difficult to find good names, especially ones that will be understood by programmers for whom English is not a first language. Something like: String.join(delimiter, joinables).assertNoUnescapedDelimiters().toString(); String.join(delimiter, joinables).neverMindDelimiterCollisions().toString(); String.join(delimiter, joinables).promiseNoUnescapedDelimiters().toString(); String.join(delimiter, joinables).escapeDelimiters(escapeChar).toString(); String.join(delimiter, joinables).quoteElements(quoteChar).toString(); The vital thing is that String.join has to return an unjoinable, that needs an adapter method to make it safely joinable. If you get that right, then we can forgive this first shot being a little slow, and enjoy the triumph of CharSequence over immutability. Yes, ultimately the goal should be to add full support for at least the most popular csv generation recipes. I'm really sorry I couldn't carry on arguing the case before August. As a minority, I only have one person's quota of energy. I will try to get some more people to look at it. On 27 Jan 2014, at 18:44, Mike Duigou mike.dui...@oracle.com wrote: On Jan 26 2014, at 17:12 , Philip Hodges philip.hod...@bluewin.ch wrote: Please please please drop StringJoiner from Java 1.8 before it is too late. It is well past too late. The API has been frozen since August for all but the most exceptional cases. At first I thought they were cool. Then I tried to use them in anger. And was forced to roll my own. I would encourage you to share your implementation or the javadocs as grist for discussion. An actual alternative is the *only* thing
Re: StringJoiner: detect or fix delimiter collision?
I did not predict that it would be a sprintf. It's not going to be consistently misused anything like so frequently. I compared it to the unescaped XML generation antipattern. I have not seen any technical justifications whatsoever so far, just inexplicable enthusiasm. It is like giving young drivers a faster car with no seat belts. The trouble is, unlike in Google guava, this is the JDK, you can't just drop flawed beta interfaces two versions later. All you can do is add them to static review tool blacklists and deprecate them to cause a warning at compile time, and hope that people will give them a wide berth. Even if you hide them in an unoffical sun.com package. By the way, I'm looking forward to the proprietary Base64 being changed to simply call the new official one. You can't even modify an equals method in an undocumented reflection implementation in 1.7.0.51 without breaking production applications. Wow. I am raising doubt and you are ignoring it. Don't you have the guts to say whoa, stop the bandwagon, there could just be a real problem with this, it really will make it easier to create bugs without any warning from the compiler and without making anyone's code better *in any way at all*? I am picking on String.join precisely because I have seen way too many delimiter collision bugs, not just in Java but in several other languages, and because this interface perpetuates a real world problem by preventing future interface changes to detect them. [I am not always a doomsayer. For example, I am not picking on JSR310 because Date and Calendar and SimpleDateFormat are well known disaster areas and the people working on their replacement have a deep understanding of the issues, have really taken their time, and nothing whatsoever that I have read about it jumped out at me and said this is *wrong*. It might not even be completely perfect. But I am confident it will be so much better than what we have now, and it's a shame that I won't have time to migrate existing apps to it.] The counterproposal, no, proposal refinement, wafting around inside my head, is to somehow compel programmers to make just one more method call before that final toString(). It will be difficult to find good names, especially ones that will be understood by programmers for whom English is not a first language. Something like: String.join(delimiter, joinables).assertNoUnescapedDelimiters().toString(); String.join(delimiter, joinables).neverMindDelimiterCollisions().toString(); String.join(delimiter, joinables).promiseNoUnescapedDelimiters().toString(); String.join(delimiter, joinables).escapeDelimiters(escapeChar).toString(); String.join(delimiter, joinables).quoteElements(quoteChar).toString(); The vital thing is that String.join has to return an unjoinable, that needs an adapter method to make it safely joinable. If you get that right, then we can forgive this first shot being a little slow, and enjoy the triumph of CharSequence over immutability. Yes, ultimately the goal should be to add full support for at least the most popular csv generation recipes. I'm really sorry I couldn't carry on arguing the case before August. As a minority, I only have one person's quota of energy. I will try to get some more people to look at it. On 27 Jan 2014, at 18:44, Mike Duigou mike.dui...@oracle.com wrote: On Jan 26 2014, at 17:12 , Philip Hodges philip.hod...@bluewin.ch wrote: Please please please drop StringJoiner from Java 1.8 before it is too late. It is well past too late. The API has been frozen since August for all but the most exceptional cases. At first I thought they were cool. Then I tried to use them in anger. And was forced to roll my own. I would encourage you to share your implementation or the javadocs as grist for discussion. An actual alternative is the *only* thing that is likely to be persuasive. You seem to be very much in the minority on this issue by the silence from other responders on this list and elsewhere. The concerns you've highlighted with regard to delimiter management, while certainly valid, have not been of sufficient concern to suggest a change in the proposal. The prediction that using StringJoiner will become Java's sprintf seems unlikely to be be true. The reception we've seen thus far for StringJoiner has been otherwise exclusively enthusiastic and positive. Alternatives, refinements and counterproposals are always welcome in the discussion. Doomsaying unfortunately adds little. Mike
Re: StringJoiner: detect or fix delimiter collision?
Please please please drop StringJoiner from Java 1.8 before it is too late. It is not needed. It does not add value. It is an embarrassment. We did without it for years. It is not long long overdue. We do not need it now. It does not need to be in the very first Java 1.8 release. Try leaving it out, and see if people even complain. Most people won't even notice it, let alone be ready to misuse it straight away. String.join is slower than Arrays.toString() or Collections.toString(). Test failed. String.join(delimiter, iterable) needs extra code to call toString() if elements are not already CharSequence. Convenience? Test failed. StringJoiner does not force or even remind the developer to consider delimiter collisions. Test failed. It is a perfect complement for split: the hard to use interface that can't split the joined input reliably because it might have too many delimiters. It will be the newest star on java antipattern web sites. A good place would be just after the item about not Assembling XML with String operations because the text parts might contain special characters. Even if it ever did appear that StringJoiner and String.join did what we wanted, I would still much rather roll my own than use the new StringJoiner in the library. Then I could set the initial capacity of the StringBuilder, skip the defensive copies and null checks, and above all, introduce some extra mandatory method call to force the developer to tell the compiler what is being done about delimiter collisions, even if it is to add an assert that there are no delimiters in the added CharSequences, or to waive that check because numbers never never contain comma delimiters (hold on ... better check the locale to see if comma is used as a decimal point or grouping character after all). I came across this non-justification [with my additions] on a blog: StringJoiner and String.join(...) are long, long overdue. They are so long overdue that the vast majority of Java developers likely have already written or have found [or have had to mend] utilities for joining strings [to make an unsplittable delimiter collision], but it is nice [or at least well-meant] for the JDK to finally provide this itself. Everyone has encountered situations where joining strings is required [and they forgot all about escaping, or should have used a PreparedStatement instead], and it is a Good Thing™ that we can now express that [bug opportunity] through a standard API that every [no, just some] Java developer (eventually) will know [of, but not actually use, and even if they do they will need an IDE to tell them that the prefix comes *after* the delimiter]. To quote Joshua Bloch on good API design: 'When in doubt leave it out' (or 'You can always add, but you can never remove') On 31 Aug 2013, at 23:13, Mike Duigou mike.dui...@oracle.com wrote: On Aug 30 2013, at 23:40 , Philip Hodges wrote: ... Why is there such a bandwagon rolling for this convenience feature? Perhaps others just don't agree with you. The choice of functionality offered in the JDK 8 StringJoiner was not arbitrary or haphazardly chosen. If it doesn't meet your particular needs, I am sorry to hear that. Our belief as the proposers is that it offers a reasonable mix of functionality and simplicity to cover useful number of requirements. Our intent was certainly not to create the kitchen magician [1] of string joiners that attempted to incorporate every conceivable option. In particular your concerns about escaping are out of scope for the joiner in part because CollectionString strings; String concat = strings.stream().map(MyUtils::escape).collect(Collections.joining(,)); seems like an adequate and flexible solution to most people. We already have joiners in apache commons and guava. This is the reason that StringJoiner was considered for JDK--that many others were rolling their own. At first I thought they were cool. Then I tried to use them in anger. And was forced to roll my own. I would encourage you to share your implementation or the javadocs as grist for discussion. An actual alternative is the *only* thing that is likely to be persuasive. That can't be right. Sometimes it is. The JDK doesn't make any attempt to satisfy everyone (or anyone). We all end up writing lots of non-business logic utility functions to bridge gaps in what JDK offers. This is normal. And, if it turns out to be something lots of people need then it's entirely possible that it could be added to the JDK. Mike [1] https://www.youtube.com/watch?v=cGVG9xLHb84 A more elaborate offically blessed feature that only does half the job is worse than useless. Without the extra complex ability to detect or avoid collisions it is neither nice, nor a Good Thing. Phil http://www.techempower.com/blog/2013/03/26/everything-about-java-8/ StringJoiner and String.join(...) are long, long overdue
Re: StringJoiner: detect or fix delimiter collision?
I don't think my needs are in any way particular or unusual. This is the third attempt at a StringJoiner that has not stopped me having to roll my own. Arrays.toString and AbstractCollection.toString already meet logging needs and serve as simple useful examples for where people don't care about delimiter collision. Your strings/stream/map/escape/collect/joining idiom is impressive and might well be adequate and flexible. It is also quite beyond incomprehensible to many. Even I don't immediately see what the stream and collect methods contribute to the processing, I will look it up later. A lot of those will simply not think about delimiter collision until it bites them in production, because your joiner does not by default insist on either a waiver or an escaper or a checker for separators in the collection where you wrote map(escape). Maybe they will join some numbers first, then copy that code to join some identifiers, and then copy that code to join some names, and it will work fine. Until one of the names contains an apostrophe or comma. Or they will quickly join a few lines for a web page, separated by a br tag, and forget to convert the lines to HTML, or verify that each line is a valid and safe HTML subset fragment. This feature as it stands adds precious little convenience value, while making it easier for people to forget to provide any escaping at all. You are missing a golden opportunity to ensure that the code only compiles if it is complete with a waiver, detector or escaper. I've seen the consequences of escaping being left out all too often. My own rolled code consists of a looper, an appender, and a replacer. I found it best if the appender method that chooses and injects the separator also inserts the quotes and chooses and calls the escaper (replacer). The code is clear apart from the replacer implementation, which has to use the charAt idiom instead of foreach to iterate the string without the defensive copy, and has inline code to escape the escape character itself as well as the quote character. If I was doing it over again, I would try to estimate and pass an initial capacity for the StringBuilder, and see if I really have to break the append chain in order to pass that same StringBuilder to the replacer instead of letting it create its own. I would also change the TODO use guava joiner comment to just say *see* guava joiner and see StringJoiner since java 1.8 :( It's more than persuasive enough to drive me to keep trying to pass on my doubts in this forum, but I would rather spare you the javadocs and implementation. Phil To quote Joshua Bloch on good API design: 'When in doubt leave it out' (or 'You can always add, but you can never remove'). On 31 Aug 2013, at 23:13, Mike Duigou mike.dui...@oracle.com wrote: On Aug 30 2013, at 23:40 , Philip Hodges wrote: ... Why is there such a bandwagon rolling for this convenience feature? Perhaps others just don't agree with you. The choice of functionality offered in the JDK 8 StringJoiner was not arbitrary or haphazardly chosen. If it doesn't meet your particular needs, I am sorry to hear that. Our belief as the proposers is that it offers a reasonable mix of functionality and simplicity to cover useful number of requirements. Our intent was certainly not to create the kitchen magician [1] of string joiners that attempted to incorporate every conceivable option. In particular your concerns about escaping are out of scope for the joiner in part because CollectionString strings; String concat = strings.stream().map(MyUtils::escape).collect(Collections.joining(,)); seems like an adequate and flexible solution to most people. We already have joiners in apache commons and guava. This is the reason that StringJoiner was considered for JDK--that many others were rolling their own. At first I thought they were cool. Then I tried to use them in anger. And was forced to roll my own. I would encourage you to share your implementation or the javadocs as grist for discussion. An actual alternative is the *only* thing that is likely to be persuasive. That can't be right. Sometimes it is. The JDK doesn't make any attempt to satisfy everyone (or anyone). We all end up writing lots of non-business logic utility functions to bridge gaps in what JDK offers. This is normal. And, if it turns out to be something lots of people need then it's entirely possible that it could be added to the JDK. Mike [1] https://www.youtube.com/watch?v=cGVG9xLHb84 A more elaborate offically blessed feature that only does half the job is worse than useless. Without the extra complex ability to detect or avoid collisions it is neither nice, nor a Good Thing. Phil http://www.techempower.com/blog/2013/03/26/everything-about-java-8/ StringJoiner and String.join(...) are long, long overdue. They are so long overdue that the vast majority of Java developers likely have already written
Re: StringJoiner: detect or fix delimiter collision?
Are we really going ahead with an implementation that: - checks even string literal parameters for null - does it again when String.join calls StringJoiner - makes defensive copies of just some of the arguments - creates a StringBuilder with only the default capacity People would be better off taking a look at AbstractCollection.toString or Arrays.toString and then, if they haven't already, roll their own, enhanced with a StringBuilder capacity, and without all the defensive baggage. To come back to my original point about why it is a toy interface: The only worthwhile value that StringJoiner can possibly add is a way to force programmers to make a conscious decision to waive checking to see if the separator (or prefix or suffix) can or does occur in any of the elements, and to apply an escaper/quoter if needed. Please add some failing round trip unit tests with split to help raise awareness of the issue of delimiter collision. The original String elements cannot be extracted from the result if there are no elements, one empty element, any null elements, or if the separator occurs in any of the elements. Please add a test of String.join with delimiter and no varargs elements, and another test with one null varargs element. I don't feel comfortable with a signature that does not clearly separate the delimiter from the elements. Please put an example with escaping in the javadoc. You might want to change that ridiculous Java-is-cool example. Of course there are a lot of cool things about Java, and even String.join cannot compete with acknowledged uncool features such as Calendar, mutable Dates and unthreadsafe Formatters. Nevertheless, the example text risks becoming ironic. Please take a proper critical look at whether there is any genuine need to dump this half-baked toy into everyone's jdk. There are some good ideas in Java 8. This is not one of them. It will soon be starring behind the scenes in CERT advisories. Once people start using it, even if you do accept and try to fix the flaws, you will then have two versions in circulation for many years. And then I will be able to say I told you so, instead of thank goodness we woke up and stopped it so we could get it right first time. Why is there such a bandwagon rolling for this convenience feature? It is not long overdue. Arrays.toString and AbstractCollection.toString have been there for years. We already have joiners in apache commons and guava. At first I thought they were cool. Then I tried to use them in anger. And was forced to roll my own. That can't be right. A more elaborate offically blessed feature that only does half the job is worse than useless. Without the extra complex ability to detect or avoid collisions it is neither nice, nor a Good Thing. Phil http://www.techempower.com/blog/2013/03/26/everything-about-java-8/ StringJoiner and String.join(...) are long, long overdue. They are so long overdue that the vast majority of Java developers likely have already written or have found utilities for joining strings, but it is nice for the JDK to finally provide this itself. Everyone has encountered situations where joining strings is required, and it is a Good Thing™ that we can now express that through a standard API that every Java developer (eventually) will know. On 2013-07-23 22:09, Mike Duigou wrote: On Jul 23 2013, at 12:43 , ph wrote: didn't see delimiter collision mentioned anywhere - are you really offering an interface that only joins a list of entries without escaping or quoting or even checking for separators (or escape or quote sequences) that might occur in the entries? Correct. StringJoiner makes no effort to address escaping/quoting. Doing so would add a lot of complexity that would not useful interesting to most users and probably still wouldn't satisfy everyone. If you wish some form of escaping or quoting you can pre-processed entries however you like before joining them. Mike