[ https://issues.apache.org/jira/browse/OAK-4857?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15532331#comment-15532331 ]
Julian Reschke commented on OAK-4857:
-------------------------------------
Citing
<https://docs.adobe.com/content/docs/en/spec/jcr/2.0/3_Repository_Model.html#3.2.4%20Naming%20Restrictions>:
bq. This definition of JCR name represents the least restrictive set of
constraints permitted for the naming of items and other entities. A repository
may further restrict the names of entities to a subset of JCR names and in most
cases is encouraged to do so.
bq. ...
bq. The characters declared invalid within a local name (“/”, “:”, “[”, “]”,
“|”, “*”) represent only those characters which are used as metacharacters in
JCR names, paths and name-matching patterns (see §5.2.2 Iterating Over Child
Items). These restrictions are not necessarily sufficient to enforce best
practices in the creation of JCR names. In particular, the minimal grammar
defined here permits JCR names with leading and trailing whitespace as well as
characters which may appear superficially identical while representing
different code points, creating a potential security issue.
bq. Though this specification does not attempt to define good naming practice,
implementers are discouraged from permitting names with these and other
problematic characteristics when possible. However, there may be cases where
the latitude provided by the minimal grammar is useful, for example, when a JCR
implementation is built on top of an existing data store with an unconventional
naming scheme.
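
To make the quoted restrictions concrete, here is a minimal sketch of a check an application could run before creating a node. It is not part of Oak or Jackrabbit; the class and method names ({{JcrNameCheck}}, {{hasInvalidChar}}, {{hasEdgeWhitespace}}) are hypothetical, and treating "whitespace" as anything matched by {{Character.isWhitespace}} or {{Character.isSpaceChar}} is an assumption, since the spec text does not define the term.

```java
final class JcrNameCheck {
    // Metacharacters the quoted spec text declares invalid within a local name.
    private static final String INVALID = "/:[]|*";

    /** Returns true if the name contains one of the listed metacharacters. */
    static boolean hasInvalidChar(String name) {
        for (int i = 0; i < name.length(); i++) {
            if (INVALID.indexOf(name.charAt(i)) >= 0) {
                return true;
            }
        }
        return false;
    }

    // Assumption: a "space" is anything Java treats as whitespace or as a
    // Unicode space separator. isSpaceChar is needed to catch U+00A0, which
    // isWhitespace deliberately excludes.
    private static boolean isSpace(int codePoint) {
        return Character.isWhitespace(codePoint) || Character.isSpaceChar(codePoint);
    }

    /** Returns true if the name starts or ends with a space character. */
    static boolean hasEdgeWhitespace(String name) {
        if (name.isEmpty()) {
            return false;
        }
        int first = name.codePointAt(0);
        int last = name.codePointBefore(name.length());
        return isSpace(first) || isSpace(last);
    }
}
```

This sketch covers only the metacharacters and the leading/trailing whitespace case; it does nothing about the visually identical code points the spec also warns about.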
> Support space chars common in CJK inside node names
> ---------------------------------------------------
>
> Key: OAK-4857
> URL: https://issues.apache.org/jira/browse/OAK-4857
> Project: Jackrabbit Oak
> Issue Type: Improvement
> Components: core
> Affects Versions: 1.4.7, 1.5.10
> Reporter: Alexander Klimetschek
> Attachments: OAK-4857-tests.patch
>
>
> Oak (like Jackrabbit) does not allow spaces commonly used in CJK like
> {{\u3000}} (ideographic space) or {{\u00A0}} (no-break space) _inside_ a node
> name, while allowing some of them (the non-breaking spaces) at the _beginning
> or end_.
> They should be supported for better globalization readiness, and filesystems
> allow them, making common filesystem-to-JCR mappings unnecessarily hard.
> Escaping would be an option for applications, but there is currently no
> utility method for it
> ([Text.escapeIllegalJcrChars|https://jackrabbit.apache.org/api/2.8/org/apache/jackrabbit/util/Text.html#escapeIllegalJcrChars(java.lang.String)]
> will not escape these spaces), nor is it documented how applications should
> do so.
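
As an illustration of the application-side escaping the description mentions, the following is a rough sketch only. The class {{SpaceEscaper}}, the method {{escapeSpaces}}, and the {{_xHHHH_}} escape format are all assumptions made for this example; none of them is an existing Oak or Jackrabbit utility, and the format is not a documented convention for this purpose.

```java
final class SpaceEscaper {
    /**
     * Replaces every space-like character other than the plain ASCII space
     * (U+0020) with a hypothetical _xHHHH_ escape, where HHHH is the
     * character's UTF-16 code unit in upper-case hex.
     */
    static String escapeSpaces(String name) {
        StringBuilder sb = new StringBuilder(name.length());
        for (int i = 0; i < name.length(); i++) {
            char c = name.charAt(i);
            // isSpaceChar covers Unicode space separators such as U+00A0 and
            // U+3000; isWhitespace adds tab, line feed and similar controls.
            boolean spaceLike = Character.isSpaceChar(c) || Character.isWhitespace(c);
            if (spaceLike && c != ' ') {
                sb.append(String.format("_x%04X_", (int) c));
            } else {
                sb.append(c);
            }
        }
        return sb.toString();
    }
}
```

With this sketch, {{escapeSpaces("a\u3000b")}} yields {{a_x3000_b}}, while a name containing only ordinary spaces is returned unchanged. The mapping is not reversible as written: a name that already contains a literal {{_x3000_}} sequence would be indistinguishable from an escaped one, so a real utility would also need to escape that pattern.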