Thank you very much for this detailed explanation. Since this enhancement needs only a simple modification, and we had an issue with it, please re-open the bug.
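For reference, here is a minimal Java sketch of the encoding chain from the quoted evaluation (U+3070 to \e3\81\b0 to %5Ce3%5C81%5Cb0) and of what URLDecoder returns for it. The class and helper names are hypothetical, and hex-escaping every byte is a simplification for illustration; this is not the UrlUtil code.

```java
import java.io.UnsupportedEncodingException;
import java.net.URLDecoder;
import java.nio.charset.StandardCharsets;

// Hypothetical helper class, for illustration only.
final class LdapUrlEncodingSketch {

    // Hex-escape each UTF-8 byte of the input as a backslash plus two hex digits.
    // Simplification: a real DN encoder would leave printable ASCII unescaped.
    static String dnHexEscape(String s) {
        StringBuilder sb = new StringBuilder();
        for (byte b : s.getBytes(StandardCharsets.UTF_8)) {
            sb.append('\\').append(String.format("%02x", b & 0xff));
        }
        return sb.toString();
    }

    // Percent-encode the backslashes so that only legal URL characters remain.
    static String percentEncodeBackslashes(String s) {
        return s.replace("\\", "%5C");
    }

    // Wrap URLDecoder.decode so callers need not handle the checked exception.
    static String decode(String s) {
        try {
            return URLDecoder.decode(s, "UTF-8");
        } catch (UnsupportedEncodingException e) {
            throw new IllegalStateException(e); // UTF-8 is always supported
        }
    }

    public static void main(String[] args) {
        String escaped = dnHexEscape("\u3070");
        String encoded = percentEncodeBackslashes(escaped);
        System.out.println(escaped);                          // \e3\81\b0
        System.out.println(encoded);                          // %5Ce3%5C81%5Cb0
        System.out.println(decode(encoded).equals(escaped));  // true
        // Percent-encoding the UTF-8 bytes directly also round-trips.
        System.out.println(decode("%E3%81%B0").equals("\u3070")); // true
    }
}
```

Note that decoding the percent-encoded form gives back the backslash-escaped DN string, not the character itself; turning \e3\81\b0 into U+3070 is a separate DN-level step.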

On Fri, Mar 2, 2012 at 4:09 PM, Weijun Wang <weijun.w...@oracle.com> wrote:

> LDAP URL (RFC 4516 2.1) specifies that only <reserved>, <unreserved>, and
> <pct-encoded> chars can be used, which do not include general non-ASCII
> unicode. UrlUtil deals with these chars correctly.
>
> The javadoc of URLDecoder [1] also only allows these characters, and it
> says --
>
>   There are two possible ways in which this decoder could deal with
>   illegal strings. It could either leave illegal characters alone or
>   it could throw an IllegalArgumentException. Which approach the
>   decoder takes is left to the implementation.
>
> Now the Oracle implementation of the class "leave illegal characters
> alone" and a Unicode char is still Unicode and you get the correct result.
>
> In this sense, UrlUtil is not as good as URLDecoder. It neither leave them
> alone nor throw an exception. Therefore, maybe it's better to use
> URLDecoder here, but before any spec officially supports "other" characters
> (a category defined in the URI class, including non-ASCII non-control
> non-space Unicode chars), it's better to use 100% legal chars in an LDAP
> URI.
>
> If you have a strong request, I can re-open the bug.
>
> Thanks
> Max
>
> [1] http://docs.oracle.com/javase/7/docs/api/java/net/URLDecoder.html
>
> On 03/02/2012 02:15 PM, Sean Chou wrote:
>
>> But UrlUtil.decode(DN, "UTF8") and URLDecoder.decode(DN, "UTF8") are
>> returning different strings, if DN has invalid encoding, why
>> URLDecoder.decode(DN, "UTF8") can decode it ?
>>
>> On Thu, Mar 1, 2012 at 4:21 PM, Weijun Wang <weijun.w...@oracle.com> wrote:
>>
>> Added some evaluation. Copied here:
>>
>> The URL in the testcase has an invalid encoding. Its Unicode characters
>> must be encoded in UTF-8. For example,
>>
>> \u3070 -> \e3\81\b0 -> %5Ce3%5C81%5Cb0
>>
>> -Weijun
>>
>> On 03/01/2012 03:39 PM, Sean Chou wrote:
>>
>> Hi all,
>>
>> I just encountered this bug:
>>
>> http://bugs.sun.com/bugdatabase/view_bug.do?bug_id=6961765 .
>> But it is closed as "NOT A BUG" without any comments.
>>
>> Would anyone take a look and give it a comment ? Thanks.
>>
>> --
>> Best Regards,
>> Sean Chou

--
Best Regards,
Sean Chou