[ https://issues.apache.org/jira/browse/SLING-2609?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]
Felix Meschberger updated SLING-2609:
-------------------------------------
Attachment: NodeNameFilter.java
Proposed changes to the NodeNameFilter.
This change also enforces the maximum length of the generated name in the
filter itself, so that no work is done on characters beyond that limit.
In the end, I also think the NodeNameFilter class should be integrated into the
DefaultNodeNameGenerator instead of being a separate single-method class.
> Support non-ASCII based languages for node name generation
> ----------------------------------------------------------
>
> Key: SLING-2609
> URL: https://issues.apache.org/jira/browse/SLING-2609
> Project: Sling
> Issue Type: Improvement
> Components: Servlets
> Affects Versions: Servlets Post 2.1.2
> Reporter: Felix Meschberger
> Assignee: Felix Meschberger
> Attachments: NodeNameFilter.java
>
>
> The Sling POST Servlet has built-in support to automatically generate names
> for newly generated resources based on some name hint or the value of some
> select properties.
> Such name hints are filtered in a very crude way, though:
> * the string is converted to lower case
> * only ASCII letters and digits are supported
> * non-accepted characters are replaced by an underscore (_)
> This leads to the following problems:
> * Non-BMP (surrogate) Unicode characters are converted to just an underscore
> * Words separated by whitespace (e.g. the title "My Brand new Page") are now
> separated by an underscore instead of a dash (-), which may lead to indexing
> problems (see http://www.youtube.com/watch?v=AQcSFsQyct8)
> This all happens in the NodeNameFilter class.
> I suggest we change this as follows:
> * Operate on code points (int type) instead of just characters (char type)
> * Accept all characters valid for JCR names. This is all Unicode characters
> except { ., /, :, [, ], *, ', ", | }. These characters are replaced by
> underscore
> * Replace all whitespace characters (Character.isWhitespace(int)) with a dash
> * Convert all other characters to lower case (Character.toLowerCase(int))
> * Fold consecutive dash and underscore characters into just one
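
For illustration only, the rules proposed in the description could be sketched
roughly as follows. This is not the attached NodeNameFilter.java. The class
name NodeNameFilterSketch, the filter(String, int) signature, counting the
maximum length in code points, and folding a mixed run of dashes and
underscores into the first character of the run are all assumptions made for
this sketch.

```java
// Hypothetical sketch of the proposed rules; not the attached NodeNameFilter.java.
final class NodeNameFilterSketch {

    // Characters the issue lists as replaced by underscore.
    private static final String REPLACED = "./:[]*'\"|";

    // maxLength is assumed to be counted in code points, not chars.
    static String filter(String hint, int maxLength) {
        StringBuilder out = new StringBuilder();
        int written = 0; // code points appended so far
        int last = -1;   // last code point appended, -1 if none

        // Stop as soon as the limit is reached so no further work is done.
        for (int i = 0; i < hint.length() && written < maxLength; ) {
            int cp = hint.codePointAt(i);   // reads a full surrogate pair
            i += Character.charCount(cp);

            int mapped;
            if (Character.isWhitespace(cp)) {
                mapped = '-';
            } else if (REPLACED.indexOf(cp) >= 0) {
                mapped = '_';
            } else {
                mapped = Character.toLowerCase(cp);
            }

            // Assumption: any run of '-' and '_' collapses to its first character.
            boolean separator = mapped == '-' || mapped == '_';
            if (separator && (last == '-' || last == '_')) {
                continue;
            }

            out.appendCodePoint(mapped);
            last = mapped;
            written++;
        }
        return out.toString();
    }
}
```

With this sketch, NodeNameFilterSketch.filter("My Brand new Page", 20) yields
"my-brand-new-page", and a non-BMP character in the hint is kept as a single
code point instead of being turned into underscores.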