I don't know of a library or built-in that would handle that. But you could
write one. If you do, try to release the source.
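As a rough starting point, a hand-rolled version might look like the sketch below. The function name local:asciify is hypothetical, and the character mappings are only illustrative choices (ø could just as well become "oe"). It leans on fn:replace for one-to-many expansions, fn:translate for one-to-one substitutions, and xdmp:diacritic-less for ordinary accents:

```xquery
xquery version "1.0-ml";

(: Hypothetical helper, not a MarkLogic built-in. The mappings are
   illustrative only and far from complete. :)
declare function local:asciify($name as xs:string) as xs:string
{
  (: Multi-letter expansions first, since fn:translate maps 1:1 only. :)
  let $s := fn:replace($name, "Æ", "AE")
  let $s := fn:replace($s, "æ", "ae")
  let $s := fn:replace($s, "ß", "ss")
  (: Single-letter substitutions for letters that are not plain
     accented variants. :)
  let $s := fn:translate($s, "ØøÐðŁł", "OoDdLl")
  (: Strip ordinary accents such as ñ and é. :)
  return xdmp:diacritic-less($s)
};

local:asciify("Ærøskøbing")
```

The result should still be checked against the user-name pattern, since anything outside the mapped characters passes through unchanged.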
Alternatively, one fairly cheap solution might be to check the user name
before creating the user, or to wrap the library call in try-catch. If the name
looks iffy or the call fails, apologize to the user and ask them to asciify it
in their own preferred way. That way there are no surprises when the user sees
møøse automatically (and irrevocably?) translated to moeoese or moose. People
are sometimes very sensitive about these things, so either variant might annoy
someone.
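The try-catch variant might look roughly like this. It is a sketch only: it assumes the query runs against the security database, and the password, description and "app-user" role are placeholders I made up for illustration.

```xquery
xquery version "1.0-ml";

import module namespace sec = "http://marklogic.com/xdmp/security"
  at "/MarkLogic/security.xqy";

(: Must be evaluated against the security database.
   The description, password and "app-user" role are placeholders. :)
let $name := "møøse"
return
  try {
    sec:create-user($name, "self-registered user", "change-me",
                    "app-user", (), ())
  } catch ($e) {
    (: This catches any failure, not just a rejected name, so a real
       version would inspect $e before blaming the spelling. :)
    fn:concat("Sorry, we could not create '", $name,
              "'. Please choose an ASCII spelling.")
  }
```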
A pre-flight check could use fn:matches with the pattern from security.xsd:
  <xs:simpleType name="user-name">
    <xs:annotation>
      <xs:documentation>
      </xs:documentation>
      <xs:appinfo>
      </xs:appinfo>
    </xs:annotation>
    <xs:restriction base="xs:token">
      <xs:pattern value="[a-zA-Z0-9._@-]+"/>
      <xs:minLength value="1"/>
    </xs:restriction>
  </xs:simpleType>
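A minimal sketch of that check, with a hypothetical function name. The pattern is copied from the schema above; the ^ and $ anchors are my addition, because fn:matches succeeds on any matching substring while xs:pattern must match the whole value.

```xquery
xquery version "1.0-ml";

(: Hypothetical helper. Pattern copied from security.xsd; anchors
   added so the whole string must match, as xs:pattern requires. :)
declare function local:valid-user-name($name as xs:string) as xs:boolean
{
  fn:matches($name, "^[a-zA-Z0-9._@-]+$")
};

(: Sample candidates: one plain ASCII name, one with ø, one with a colon. :)
for $candidate in ("dsewell", "møøse", "a:b")
return fn:concat($candidate, ": ", local:valid-user-name($candidate))
```

Hard-coding the pattern means it has to be updated by hand if the schema ever changes.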
Longer term you could ask MarkLogic to expand that pattern to cover more
languages. It's only a matter of time before more users start wanting to use
non-ASCII scripts for usernames. I'm not sure if there's any technical reason
for the restriction. Using HTTP auth means user-id can't contain a colon ':',
but otherwise I believe anything goes. Of course browsers might not support
everything, and I'm not sure about LDAP, NTLM, etc.
-- Mike
On 1 Aug 2014, at 07:05 , David Sewell <[email protected]> wrote:
> We have a user-facing function that creates login names based on their real
> names, using initial characters from their first name and last name to create a
> MarkLogic user name. For the first time we recorded a server error when
> someone registered with a name beginning with a non-ASCII character
> (Norwegian Ø), because MarkLogic usernames currently cannot contain non-ASCII
> characters.
>
> So I thought the easy solution would be to use xdmp:diacritic-less(). But no,
> that only changes characters like ñ and é that are accented variants of a
> single letter. It does not touch combined characters like Ø or Æ.
>
> Of course I could use fn:translate to catch all of the likely cases, but is
> there a more general-purpose standard or extension function to perform
> normalization to ASCII for accented/combined Latin characters in a MarkLogic
> environment?
>
> David S.
>
> --
> David Sewell, Editorial and Technical Manager
> ROTUNDA, The University of Virginia Press
> PO Box 400314, Charlottesville, VA 22904-4314 USA
> Email: [email protected] Tel: +1 434 924 9973
> Web: http://rotunda.upress.virginia.edu/
> _______________________________________________
> General mailing list
> [email protected]
> http://developer.marklogic.com/mailman/listinfo/general