Why reinvent the wheel? What is wrong with ISO9075?

On Tuesday 27 November 2007 11:41:08, Marcel Reutegger wrote:
> the public review of JSR 283 also contains a third approach how to deal
> with those illegal characters. IMO this should be the preferred one because
> it will ensure interoperability.
>
> <jsr-283-public-review>
> 3.6.3 Exposing non-JCR Names
> An implementation that exposes a non-JCR data store through the JCR API may
> encounter names with characters not allowed within JCR names. To allow for
> this, a JCR repository should expose non-JCR characters as private use
> Unicode code point characters according to the following mapping:
>
> Non-JCR character (Unicode code point)   Private use Unicode code point
> * (U+002A)                               U+F02A
> / (U+002F)                               U+F02F
> : (U+003A)                               U+F03A
> [ (U+005B)                               U+F05B
> ] (U+005D)                               U+F05D
> | (U+007C)                               U+F07C
>
> This mapping should be used when a JCR method returns a name containing a
> non-JCR character. The mapping should also be used (in reverse) when a JCR
> method is called with a path or name containing one of the six private use
> code points above.
> </jsr-283-public-review>
>
> jackrabbit does not yet have a utility, which implements this escaping.
> contributions are welcome! ;)
>
> regards
> marcel
>
> Jukka Zitting wrote:
> > Hi,
> >
> > On Nov 26, 2007 5:44 PM, Brian Thompson <[EMAIL PROTECTED]> wrote:
> >> In my application, I implemented a custom search/replace method to
> >> filter out illegal characters. It's pretty simple to write, so I didn't
> >> spend much time looking for a library method to handle it. AFAIK, the
> >> Jackrabbit API doesn't address this issue. I could be wrong, though
> >> (correct me if I'm wrong, please, Jackrabbit devs!).
> >
> > There are two classes for this purpose in the jackrabbit-jcr-commons
> > component:
> >
> > org.apache.jackrabbit.util.ISO9075 [1]
> >
> > This class implements the ISO9075 escaping mechanism that the JCR spec
> > uses in the document view serialization format. All invalid name
> > characters are converted to _xNNNN_ sequences, where NNNN is the
> > hexadecimal representation of the Unicode code unit (UTF-16) of the
> > character in question.
> >
> > This escaping format can look a bit surprising if you use the document
> > view export feature, as the _x prefix ends up doubly escaped when
> > exported to XML.
> >
> > org.apache.jackrabbit.util.Text [2]
> >
> > This class implements (among other things) a few variations of the URI
> > escaping mechanism defined in RFC 2396. All invalid (as defined by the
> > escaping method you choose) characters are converted to %NN sequences
> > where NN is the hexadecimal representation of the Unicode code unit
> > (UTF-8) of the character in question.
> >
> > This escaping format can look a bit surprising if you map node names
> > or paths to URIs, as the % prefix ends up doubly escaped.
> >
> > [1] http://jackrabbit.apache.org/api/1.3/org/apache/jackrabbit/util/ISO9075.html
> > [2] http://jackrabbit.apache.org/api/1.3/org/apache/jackrabbit/util/Text.html
> >
> > BR,
> >
> > Jukka Zitting
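
For comparison, the private-use mapping from the JSR 283 public review quoted above could be sketched in plain Java roughly as follows. This is only a minimal sketch under stated assumptions, not a Jackrabbit utility: the class name PrivateUseNameMapper and its escape/unescape methods are hypothetical, and the code relies on the observation that every private-use code point in the quoted table is the original code point plus 0xF000.

```java
// Hypothetical helper, not part of Jackrabbit or the JCR API.
public final class PrivateUseNameMapper {

    // The six non-JCR characters listed in the quoted table.
    private static final String NON_JCR_CHARS = "*/:[]|";

    // In the quoted table each private-use code point equals the
    // original code point plus 0xF000 (U+002A maps to U+F02A, etc.).
    private static final int OFFSET = 0xF000;

    private PrivateUseNameMapper() {
    }

    // Replaces each of the six non-JCR characters with its private-use
    // counterpart; all other characters are copied unchanged.
    public static String escape(String name) {
        StringBuilder sb = new StringBuilder(name.length());
        for (int i = 0; i < name.length(); i++) {
            char c = name.charAt(i);
            sb.append(NON_JCR_CHARS.indexOf(c) >= 0 ? (char) (c + OFFSET) : c);
        }
        return sb.toString();
    }

    // Reverse mapping: turns the six private-use code points back into
    // the original characters; all other characters are copied unchanged.
    public static String unescape(String name) {
        StringBuilder sb = new StringBuilder(name.length());
        for (int i = 0; i < name.length(); i++) {
            char c = name.charAt(i);
            boolean mapped = c >= OFFSET && NON_JCR_CHARS.indexOf(c - OFFSET) >= 0;
            sb.append(mapped ? (char) (c - OFFSET) : c);
        }
        return sb.toString();
    }

    public static void main(String[] args) {
        String external = "reports/2007:final";
        String jcrName = escape(external);
        // The round trip should give back the original external name.
        System.out.println(unescape(jcrName).equals(external));
    }
}
```

Unlike the _xNNNN_ and %NN formats, this mapping replaces one character with exactly one character, so the length of the name does not change and there is no prefix that could end up escaped a second time.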
--
SY, Alex Lukin
RIPE NIC HDL: LEXA1-RIPE
