[
https://issues.apache.org/jira/browse/LANG-882?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
]
Henri Yandell updated LANG-882:
-------------------------------
Fix Version/s: 3.2
> LookupTranslator accepts CharSequence as input, but fails to work with
> implementations other than String
> --------------------------------------------------------------------------------------------------------
>
> Key: LANG-882
> URL: https://issues.apache.org/jira/browse/LANG-882
> Project: Commons Lang
> Issue Type: Bug
> Components: lang.text.translate.*
> Affects Versions: 3.1
> Reporter: Mark A. Ziesemer
> Fix For: 3.2
>
>
> The core of {{LookupTranslator}} (in the
> {{org.apache.commons.lang3.text.translate}} package) is a
> {{HashMap<CharSequence, CharSequence> lookupMap}}.
> From the Javadoc of {{CharSequence}} (emphasis mine):
> {quote}
> This interface does not refine the general contracts of the equals and
> hashCode methods. The result of comparing two objects that implement
> CharSequence is therefore, in general, undefined. Each object may be
> implemented by a different class, and there is no guarantee that each class
> will be capable of testing its instances for equality with those of the
> other. *It is therefore inappropriate to use arbitrary CharSequence instances
> as elements in a set or as keys in a map.*
> {quote}
> The current implementation causes code such as the following to not work as
> expected:
> {code}
> CharSequence cs1 = "1 < 2";
> CharSequence cs2 = CharBuffer.wrap("1 < 2".toCharArray());
> System.out.println(StringEscapeUtils.ESCAPE_HTML4.translate(cs1));
> System.out.println(StringEscapeUtils.ESCAPE_HTML4.translate(cs2));
> {code}
> ... which gives the following results (but should be identical):
> {noformat}
> 1 &lt; 2
> 1 < 2
> {noformat}
> The problem, at a minimum, is that the Javadoc for {{CharBuffer.equals}}
> explicitly states:
> {quote}
> A char buffer is not equal to any other type of object.
> {quote}
> ... so a lookup on a {{CharBuffer}} in the Map will always fail, because it
> is compared against the {{String}} keys that the Map contains.
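The lookup failure described above can be reproduced with the JDK alone. The following is a minimal sketch, assuming a plain HashMap keyed by String instances as a stand-in for {{lookupMap}}; the class name CharSequenceKeyDemo and the helper buildLookup are hypothetical and not part of Commons Lang.

```java
import java.nio.CharBuffer;
import java.util.HashMap;
import java.util.Map;

public class CharSequenceKeyDemo {

    // Hypothetical stand-in for the translator's lookup table:
    // the keys are String instances, as in the reported scenario.
    static Map<CharSequence, CharSequence> buildLookup() {
        Map<CharSequence, CharSequence> lookup = new HashMap<>();
        lookup.put("<", "&lt;");
        return lookup;
    }

    public static void main(String[] args) {
        Map<CharSequence, CharSequence> lookup = buildLookup();

        CharSequence asString = "<";
        CharSequence asBuffer = CharBuffer.wrap("<".toCharArray());

        // Same characters, different CharSequence implementations.
        System.out.println(lookup.get(asString));            // &lt;
        System.out.println(lookup.get(asBuffer));            // null
        // Converting to String restores the match, at the cost of an allocation.
        System.out.println(lookup.get(asBuffer.toString())); // &lt;
    }
}
```

The second lookup returns null because CharBuffer.equals never reports equality with a String, regardless of content.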
> An obvious work-around is to instead use something along the lines of either
> of the following:
> {code}
> System.out.println(StringEscapeUtils.ESCAPE_HTML4.translate(cs2.toString()));
> System.out.println(StringEscapeUtils.escapeHtml4(cs2.toString()));
> {code}
> ... which forces everything back to a {{String}}. However, this is not
> practical when working with large sets of data, as it would require
> significant heap allocations and add garbage-collection pressure. (For this
> reason, I was actually trying to use the {{translate}} method that outputs
> to a {{Writer}}, but simplified the above examples to omit this.)
> Another option that I'm considering is a custom {{CharSequence}} wrapper
> around a {{char[]}} whose {{hashCode()}} and {{equals()}} are compatible
> with those of {{String}}. (However, this is complicated by the symmetry
> requirement of {{equals()}}: {{String.equals}} is currently implemented
> using {{instanceof}}, so it can never return true for such a wrapper. It is
> also curious that {{instanceof}} is used at all, given that {{String}} is
> {{final}}.)
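As a rough illustration of the wrapper idea and of the symmetry problem, here is a minimal sketch. The class CharArraySequence is hypothetical and not part of Commons Lang or the JDK. It assumes that the hash must follow the formula documented for String.hashCode, and that the map calls equals on the lookup argument, as the Map.get contract describes.

```java
import java.util.HashMap;
import java.util.Map;

public class CharArraySequenceDemo {

    /** Hypothetical read-only view over part of a char[]; no copy is made. */
    static final class CharArraySequence implements CharSequence {
        private final char[] chars;
        private final int offset;
        private final int length;

        CharArraySequence(char[] chars, int offset, int length) {
            if (offset < 0 || length < 0 || offset + length > chars.length) {
                throw new IndexOutOfBoundsException();
            }
            this.chars = chars;
            this.offset = offset;
            this.length = length;
        }

        @Override public int length() { return length; }

        @Override public char charAt(int index) {
            if (index < 0 || index >= length) {
                throw new IndexOutOfBoundsException();
            }
            return chars[offset + index];
        }

        @Override public CharSequence subSequence(int start, int end) {
            if (start < 0 || end > length || start > end) {
                throw new IndexOutOfBoundsException();
            }
            return new CharArraySequence(chars, offset + start, end - start);
        }

        @Override public String toString() {
            return new String(chars, offset, length);
        }

        // Same formula as documented for String.hashCode().
        @Override public int hashCode() {
            int h = 0;
            for (int i = 0; i < length; i++) {
                h = 31 * h + chars[offset + i];
            }
            return h;
        }

        // Content-based comparison against any CharSequence, including String.
        @Override public boolean equals(Object o) {
            if (this == o) return true;
            if (!(o instanceof CharSequence)) return false;
            CharSequence other = (CharSequence) o;
            if (other.length() != length) return false;
            for (int i = 0; i < length; i++) {
                if (other.charAt(i) != chars[offset + i]) return false;
            }
            return true;
        }
    }

    public static void main(String[] args) {
        Map<CharSequence, CharSequence> lookup = new HashMap<>();
        lookup.put("<", "&lt;");

        // View of the single '<' character inside "1 < 2".
        CharSequence key = new CharArraySequence("1 < 2".toCharArray(), 2, 1);

        System.out.println(lookup.get(key));   // &lt;
        System.out.println(key.equals("<"));   // true
        System.out.println("<".equals(key));   // false: equals() is not symmetric
    }
}
```

The lookup succeeds only because the wrapper is on the argument side of the comparison. A map that stored such wrappers as keys and was queried with a String would still miss, which is the asymmetry noted above.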