Author: bayard
Date: Tue Mar  1 07:31:23 2011
New Revision: 1075691

URL: http://svn.apache.org/viewvc?rev=1075691&view=rev
Log:
Adding information on the text.translate package

Modified:
    commons/proper/lang/trunk/src/site/xdoc/article3_0.xml

Modified: commons/proper/lang/trunk/src/site/xdoc/article3_0.xml
URL: http://svn.apache.org/viewvc/commons/proper/lang/trunk/src/site/xdoc/article3_0.xml?rev=1075691&r1=1075690&r2=1075691&view=diff
==============================================================================
--- commons/proper/lang/trunk/src/site/xdoc/article3_0.xml (original)
+++ commons/proper/lang/trunk/src/site/xdoc/article3_0.xml Tue Mar  1 07:31:23 2011
@@ -79,7 +79,31 @@ we will remove the related methods in La
 <section name="New packages">
 <p>Two new packages have shown up. org.apache.commons.lang3.concurrent, which unsurprisingly provides support classes for 
 multi-threaded programming, and org.apache.commons.lang3.text.translate, which provides a pluggable API for text transformation. </p>
-<!-- TODO: Add examples -->
+<!-- TODO: <h3>concurrent.*</h3> -->
+
+<h3>text.translate.*</h3>
+<p>A common complaint about StringEscapeUtils was that its escapeXml and escapeHtml methods should not escape non-ASCII characters. We agreed and made the change, and in the process created a modular approach that lets users define their own escaping constructs. </p>
+<p>The simplest way to show this is to look at the code that implements escapeXml:</p>
+<pre>
+    return ESCAPE_XML.translate(input);
+</pre>
+<p>Very simple. Perhaps a little too simple, so let's look a bit deeper. </p>
+<pre>
+    public static final CharSequenceTranslator ESCAPE_XML =
+        new AggregateTranslator(
+            new LookupTranslator(EntityArrays.BASIC_ESCAPE()),
+            new LookupTranslator(EntityArrays.APOS_ESCAPE())
+        );
+</pre>
+<p>Here we see that <code>ESCAPE_XML</code> is a <code>CharSequenceTranslator</code>, which in turn is made up of two lookup translators: one based on the basic XML escapes and one that escapes apostrophes. This shows one way to combine translators. Another is shown by the following example, which restores the old XML escaping behaviour (escaping non-ASCII characters): </p>
+<pre>
+          StringEscapeUtils.ESCAPE_XML.with( UnicodeEscaper.above(0x7f) );
+</pre>
+<p>That takes the standard escaping provided by Commons Lang and adds another translation layer on top. Another option requested in JIRA was to also escape non-printable ASCII characters; this is now achievable with a modification of the above: </p>
+<pre>
+          StringEscapeUtils.ESCAPE_XML.with( UnicodeEscaper.outsideOf(32, 0x7f) );
+</pre>
+<p>You can also implement your own translators (be they for escaping, unescaping or some purpose of your own). See <code>CharSequenceTranslator</code> and its <code>CodePointTranslator</code> helper subclass for details; it is primarily a matter of implementing the <code>int translate(CharSequence, int, Writer)</code> method. </p>
 </section>
 <section name="New classes + methods">
 <p>There are many new classes and methods in Lang 3.0 - the most complete way to see the changes is via this <a href="lang2-lang3-clirr-report.html">Lang2 to Lang3 Clirr report</a>. </p>
@@ -110,6 +134,7 @@ multi-threaded programming, and org.apac
 <ul>
 <li>StringUtils.isAlpha, isNumeric and isAlphanumeric now all return false when passed an empty String. Previously they returned true. </li>
 <li>SystemUtils.isJavaVersionAtLeast now relies on the <code>java.specification.version</code> and not the <code>java.version</code> System property. </li>
+<li>StringEscapeUtils.escapeXml and escapeHtml no longer escape high-value Unicode characters by default. The text.translate package is available to recreate the old behaviour. </li>
 </ul>
 </section>
 
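
The way the text.translate section above combines translators can be sketched without the library. What follows is a minimal, self-contained Java illustration under stated assumptions, not the Commons Lang source: TranslateSketch, Translator, lookup, aggregate and escapeAbove are hypothetical names, the lookup tries keys in insertion order, and the escaper writes decimal numeric character references, which is an assumption about the XML output one would want. Only the shape of the per-index method (consume some characters, write a replacement, return how many were consumed) is taken from the commit text.

```java
import java.io.IOException;
import java.io.StringWriter;
import java.io.Writer;
import java.util.LinkedHashMap;
import java.util.Map;

/** Hypothetical stand-ins for the text.translate classes; illustration only. */
final class TranslateSketch {

    abstract static class Translator {
        /** Translates at index; returns the number of chars consumed, or 0 if nothing matched. */
        abstract int translate(CharSequence input, int index, Writer out) throws IOException;

        /** Drives the per-index method across the whole input. */
        final String translate(CharSequence input) {
            StringWriter out = new StringWriter(input.length() * 2);
            try {
                int i = 0;
                while (i < input.length()) {
                    int consumed = translate(input, i, out);
                    if (consumed == 0) {
                        // No translator matched: copy one full code point unchanged.
                        int cp = Character.codePointAt(input, i);
                        out.write(Character.toChars(cp));
                        consumed = Character.charCount(cp);
                    }
                    i += consumed;
                }
            } catch (IOException e) {
                throw new IllegalStateException(e); // not expected from a StringWriter
            }
            return out.toString();
        }

        /** Layers another translator behind this one. */
        final Translator with(Translator extra) {
            return aggregate(this, extra);
        }
    }

    /** Tries each part in order; the first one that consumes input wins. */
    static Translator aggregate(Translator... parts) {
        return new Translator() {
            @Override
            int translate(CharSequence input, int index, Writer out) throws IOException {
                for (Translator part : parts) {
                    int consumed = part.translate(input, index, out);
                    if (consumed != 0) {
                        return consumed;
                    }
                }
                return 0;
            }
        };
    }

    /** Replaces any key found at the current index with its mapped value. */
    static Translator lookup(Map<String, String> entries) {
        return new Translator() {
            @Override
            int translate(CharSequence input, int index, Writer out) throws IOException {
                for (Map.Entry<String, String> e : entries.entrySet()) {
                    int end = index + e.getKey().length();
                    if (end <= input.length()
                            && input.subSequence(index, end).toString().equals(e.getKey())) {
                        out.write(e.getValue());
                        return e.getKey().length();
                    }
                }
                return 0;
            }
        };
    }

    /** Escapes every code point above the threshold as a decimal character reference. */
    static Translator escapeAbove(int threshold) {
        return new Translator() {
            @Override
            int translate(CharSequence input, int index, Writer out) throws IOException {
                int cp = Character.codePointAt(input, index);
                if (cp <= threshold) {
                    return 0;
                }
                out.write("&#" + cp + ";");
                return Character.charCount(cp);
            }
        };
    }

    private static Map<String, String> table(String... pairs) {
        Map<String, String> m = new LinkedHashMap<>();
        for (int i = 0; i < pairs.length; i += 2) {
            m.put(pairs[i], pairs[i + 1]);
        }
        return m;
    }

    static final Map<String, String> BASIC =
            table("\"", "&quot;", "&", "&amp;", "<", "&lt;", ">", "&gt;");
    static final Map<String, String> APOS = table("'", "&apos;");

    /** Same shape as the ESCAPE_XML definition quoted in the diff. */
    static final Translator ESCAPE_XML = aggregate(lookup(BASIC), lookup(APOS));
}
```

In this sketch, TranslateSketch.ESCAPE_XML.translate(s) leaves non-ASCII characters alone, while TranslateSketch.ESCAPE_XML.with(TranslateSketch.escapeAbove(0x7f)).translate(s) also escapes them, which mirrors the layering the article describes.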

