I think the Jakarta Commons Lang package has what you are looking for:

public static java.lang.String unescapeHtml(java.lang.String str)

    Unescapes a string containing entity escapes to a string containing the
actual Unicode characters corresponding to the escapes. Supports HTML 4.0
entities.

    For example, the string "&lt;Fran&ccedil;ais&gt;" will become
"<Français>"

    If an entity is unrecognized, it is left alone, and inserted verbatim
into the result string. e.g. "&gt;&zzzz;x" will become ">&zzzz;x".

    Parameters:
        str - the String to unescape, may be null 
    Returns:
        a new unescaped String, null if null string input

http://commons.apache.org/lang/api-release/org/apache/commons/lang/StringEscapeUtils.html#unescapeHtml(java.lang.String)
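
With Commons Lang on the classpath, the call itself is just StringEscapeUtils.unescapeHtml(str). To make the documented contract concrete without needing the jar, here is a minimal, self-contained sketch. It is not the Commons Lang implementation: the class name EntitySketch, the method unescape and the five-entry entity table are assumptions made up for illustration, and numeric references such as &#233; are not handled.

```java
import java.util.HashMap;
import java.util.Map;

/** Hypothetical stand-in for illustration; NOT the Commons Lang code. */
final class EntitySketch {
    // Small illustrative subset of named entities; HTML 4.0 defines many more.
    private static final Map<String, String> ENTITIES = new HashMap<>();
    static {
        ENTITIES.put("amp", "&");
        ENTITIES.put("lt", "<");
        ENTITIES.put("gt", ">");
        ENTITIES.put("quot", "\"");
        ENTITIES.put("ccedil", "\u00E7");
    }

    static String unescape(String str) {
        if (str == null) {
            return null; // mirrors the documented contract: null in, null out
        }
        StringBuilder out = new StringBuilder(str.length());
        int i = 0;
        while (i < str.length()) {
            char c = str.charAt(i);
            // Only look for a terminating ';' when we are sitting on an '&'.
            int semi = (c == '&') ? str.indexOf(';', i + 1) : -1;
            String replacement =
                (semi > i + 1) ? ENTITIES.get(str.substring(i + 1, semi)) : null;
            if (replacement != null) {
                out.append(replacement); // known entity: emit the real character
                i = semi + 1;
            } else {
                out.append(c); // unknown entity or plain text: copy verbatim
                i++;
            }
        }
        return out.toString();
    }

    public static void main(String[] args) {
        System.out.println(unescape("&lt;Fran&ccedil;ais&gt;"));
        System.out.println(unescape("&gt;&zzzz;x")); // prints ">&zzzz;x"
    }
}
```

The unknown entity &zzzz; passes through untouched, which matches the behaviour described in the Javadoc above. For real use the library method is the better choice, since it covers the full HTML 4.0 entity set.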

Hope that helps,
Rainer



Dirk Stöcker-3 wrote:
> 
> On Wed, 20 Aug 2008, Bodo Meissner wrote:
> 
>>> Is there a Java function that can convert HTML entities such as &amp;,
>>> &apos; and so on into UTF-8? I did not find anything.
>>
> 
> 

-- 
View this message in context: 
http://www.nabble.com/HTML-entities-tp19073422p19078371.html
Sent from the OpenStreetMap - JOSM Dev mailing list archive at Nabble.com.

