Hi,

On Nov 26, 2007 5:44 PM, Brian Thompson <[EMAIL PROTECTED]> wrote:
> In my application, I implemented a custom search/replace method to filter
> out illegal characters. It's pretty simple to write, so I didn't spend much
> time looking for a library method to handle it. AFAIK, the Jackrabbit API
> doesn't address this issue. I could be wrong, though (correct me if I'm
> wrong, please, Jackrabbit devs!).

There are two classes for this purpose in the jackrabbit-jcr-commons
component:

org.apache.jackrabbit.util.ISO9075 [1]

    This class implements the ISO9075 escaping mechanism that the JCR
    spec uses in the document view serialization format. All invalid
    name characters are converted to _xNNNN_ sequences, where NNNN is
    the hexadecimal representation of the UTF-16 code unit of the
    character in question. This escaping format can look a bit
    surprising if you use the document view export feature, as the _x
    prefix ends up doubly escaped when exported to XML.

org.apache.jackrabbit.util.Text [2]

    This class implements (among other things) a few variations of the
    URI escaping mechanism defined in RFC 2396. All invalid (as defined
    by the escaping method you choose) characters are converted to %NN
    sequences, where NN is the hexadecimal value of one byte of the
    character's UTF-8 encoding, so a non-ASCII character produces
    several %NN sequences. This escaping format can look a bit
    surprising if you map node names or paths to URIs, as the % prefix
    ends up doubly escaped.

[1] http://jackrabbit.apache.org/api/1.3/org/apache/jackrabbit/util/ISO9075.html
[2] http://jackrabbit.apache.org/api/1.3/org/apache/jackrabbit/util/Text.html

BR,

Jukka Zitting
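
To make the two escape formats described above concrete, here is a rough,
standard-library-only sketch. It is not the Jackrabbit code: the class
EscapeSketch, its method names, the isPlain() rule for which characters pass
through, and the hex digit case are all assumptions made up for illustration.
For real use, the ISO9075 and Text classes linked above are the ones to call.

```java
import java.nio.charset.StandardCharsets;

// Illustration only; hypothetical class, not part of Jackrabbit.
final class EscapeSketch {

    // Assumed rule for this sketch: ASCII letters, digits, '-', '.', '_' pass through.
    static boolean isPlain(char c) {
        return (c >= 'a' && c <= 'z') || (c >= 'A' && c <= 'Z')
                || (c >= '0' && c <= '9') || c == '-' || c == '.' || c == '_';
    }

    static boolean isHex(char c) {
        return (c >= '0' && c <= '9') || (c >= 'a' && c <= 'f') || (c >= 'A' && c <= 'F');
    }

    // True if the text starting at index i already looks like "_xNNNN_".
    static boolean looksLikeEscape(String s, int i) {
        if (i + 7 > s.length() || s.charAt(i) != '_' || s.charAt(i + 1) != 'x'
                || s.charAt(i + 6) != '_') {
            return false;
        }
        for (int k = i + 2; k <= i + 5; k++) {
            if (!isHex(s.charAt(k))) return false;
        }
        return true;
    }

    // _xNNNN_ form: one escape per UTF-16 code unit, four hex digits.
    static String underscoreEscape(String s) {
        StringBuilder out = new StringBuilder();
        for (int i = 0; i < s.length(); i++) {
            char c = s.charAt(i);
            if (c == '_' && looksLikeEscape(s, i)) {
                out.append("_x005f_"); // escape the '_' so the literal text stays unambiguous
            } else if (isPlain(c)) {
                out.append(c);
            } else {
                out.append(String.format("_x%04x_", (int) c));
            }
        }
        return out.toString();
    }

    // %NN form: one escape per byte of the UTF-8 encoding, two hex digits.
    static String percentEscape(String s) {
        StringBuilder out = new StringBuilder();
        for (byte b : s.getBytes(StandardCharsets.UTF_8)) {
            if (b >= 0 && isPlain((char) b)) {
                out.append((char) b);
            } else {
                out.append(String.format("%%%02X", b & 0xFF));
            }
        }
        return out.toString();
    }

    public static void main(String[] args) {
        System.out.println(underscoreEscape("my file"));              // my_x0020_file
        System.out.println(underscoreEscape("a_x0020_b"));            // a_x005f_x0020_b
        System.out.println(percentEscape("my file"));                 // my%20file
        System.out.println(percentEscape(percentEscape("my file")));  // my%2520file
    }
}
```

The last two calls in main() show the double escaping mentioned above: escaping
an already escaped name turns the leading "_" into _x005f_ and the "%" into %25.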
