[ 
https://issues.apache.org/jira/browse/LANG-882?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13638816#comment-13638816
 ] 

Henri Yandell commented on LANG-882:
------------------------------------

The test is easy: take the current LookupTranslator test and make a
StringBuffer version.

Solutions: naively throwing in a TreeMap doesn't work. A ClassCastException
occurs between StringBuffer and String. This is because calling subSequence on
StringBuffer returns a String (boo!), and for some reason the call to compareTo
in getEntry of TreeMap doesn't like the different types. Presumably this could
be solved with a custom comparator.
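
To illustrate the custom-comparator idea, here is a minimal sketch. It is not
the LookupTranslator code; the class name CharSequenceComparatorSketch, the
BY_CONTENT comparator, and the newLookupMap helper are all hypothetical names
made up for this example. It assumes that ordering keys by character content
alone is acceptable:

```java
import java.util.Comparator;
import java.util.TreeMap;

// Hypothetical sketch, not part of Commons Lang.
final class CharSequenceComparatorSketch {

    // Orders any two CharSequences by their characters, then by length, so the
    // TreeMap never has to cast a StringBuffer to a String (or vice versa).
    static final Comparator<CharSequence> BY_CONTENT = (a, b) -> {
        int n = Math.min(a.length(), b.length());
        for (int i = 0; i < n; i++) {
            int diff = Character.compare(a.charAt(i), b.charAt(i));
            if (diff != 0) {
                return diff;
            }
        }
        return Integer.compare(a.length(), b.length());
    };

    // Assumed helper: a lookup map whose keys are compared by content only.
    static TreeMap<CharSequence, CharSequence> newLookupMap() {
        return new TreeMap<>(BY_CONTENT);
    }

    public static void main(String[] args) {
        TreeMap<CharSequence, CharSequence> map = newLookupMap();
        map.put(new StringBuffer("<"), "&lt;");
        // A String key finds the entry stored under a StringBuffer key.
        System.out.println(map.get("<"));
    }
}
```

With a comparator like this, the key's concrete type no longer matters for
lookups, at the cost of a character-by-character comparison per tree node.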

Changing the key of the HashMap to be a String resolves the issue. It feels
weird for the key to be typed; i.e., if it were StringBuffer("foo"), I'd expect
it to match the String "foo" as well. Only matching the type of the input seems
odd. I can see value in keeping the translate-to part of the system as
CharSequence; you could have large items of text that won't be read until they
are actually needed.
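
A rough sketch of the String-key approach follows. It is only an illustration
under assumptions: StringKeyLookupSketch and its lookup method are hypothetical
names, not the actual LookupTranslator API. Keys are normalised with
toString() both when the map is built and when it is queried, while the
replacement values stay as CharSequence:

```java
import java.util.HashMap;
import java.util.Map;

// Hypothetical sketch, not part of Commons Lang.
final class StringKeyLookupSketch {

    // Keys are Strings, so equals/hashCode are well defined;
    // values remain CharSequence and are never converted here.
    private final Map<String, CharSequence> lookupMap = new HashMap<>();

    StringKeyLookupSketch(CharSequence[][] pairs) {
        for (CharSequence[] pair : pairs) {
            lookupMap.put(pair[0].toString(), pair[1]);
        }
    }

    // Looks up the slice [start, end) of the input, whatever its concrete type.
    CharSequence lookup(CharSequence input, int start, int end) {
        return lookupMap.get(input.subSequence(start, end).toString());
    }
}
```

Only the short candidate slice is converted to a String on each lookup, so the
input as a whole does not need to be copied.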


                
> LookupTranslator accepts CharSequence as input, but fails to work with 
> implementations other than String
> --------------------------------------------------------------------------------------------------------
>
>                 Key: LANG-882
>                 URL: https://issues.apache.org/jira/browse/LANG-882
>             Project: Commons Lang
>          Issue Type: Bug
>          Components: lang.text.translate.*
>    Affects Versions: 3.1
>            Reporter: Mark A. Ziesemer
>             Fix For: 3.2
>
>
> The core of {{org.apache.commons.lang3.text.translate}} is a 
> {{HashMap<CharSequence, CharSequence> lookupMap}}.
> From the Javadoc of {{CharSequence}} (emphasis mine):
> {quote}
> This interface does not refine the general contracts of the equals and 
> hashCode methods. The result of comparing two objects that implement 
> CharSequence is therefore, in general, undefined. Each object may be 
> implemented by a different class, and there is no guarantee that each class 
> will be capable of testing its instances for equality with those of the 
> other. *It is therefore inappropriate to use arbitrary CharSequence instances 
> as elements in a set or as keys in a map.*
> {quote}
> The current implementation causes code such as the following to not work as 
> expected:
> {code}
> CharSequence cs1 = "1 < 2";
> CharSequence cs2 = CharBuffer.wrap("1 < 2".toCharArray());
> System.out.println(StringEscapeUtils.ESCAPE_HTML4.translate(cs1));
> System.out.println(StringEscapeUtils.ESCAPE_HTML4.translate(cs2));
> {code}
> ... which gives the following results (but should be identical):
> {noformat}
> 1 &lt; 2
> 1 < 2
> {noformat}
> The problem, at a minimum, is that the Javadoc for {{CharBuffer.equals}} 
> explicitly states:
> {quote}
> A char buffer is not equal to any other type of object.
> {quote}
> ... so a lookup on a CharBuffer in the Map will always fail when compared 
> against the String implementations that it contains.
> An obvious work-around is to instead use something along the lines of either 
> of the following:
> {code}
> System.out.println(StringEscapeUtils.ESCAPE_HTML4.translate(cs2.toString()));
> System.out.println(StringEscapeUtils.escapeHtml4(cs2.toString()));
> {code}
> ... which forces everything back to a {{String}}.  However, this is not 
> practical when working with large sets of data, where it would incur 
> significant heap allocations and garbage collection overhead.  (As such, I 
> was actually trying to use the {{translate}} method that outputs to a 
> {{Writer}} - but simplified the above examples to omit this.)
> Another option that I'm considering is to use a custom {{CharSequence}} 
> wrapper around a {{char[]}} that implements {{hashCode()}} and {{equals()}} 
> to be compatible with those implemented on {{String}}.  (However, this will 
> be interesting due to the symmetry requirement of {{equals}}, made more 
> interesting by the fact that {{String.equals}} is currently implemented using 
> {{instanceof}}, even though {{String}} is {{final}}...)
