I didn't realize non-ASCII characters weren't allowed in usernames. This
is a serious shortcoming in MarkLogic. Unless it can be fixed soon, this may
force developers to use a mapping approach, with something unique like a
UUID as the MarkLogic user name.
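
To make the mapping idea concrete, here is a minimal sketch in Python, used only as a neutral illustration and not as MarkLogic code. It assumes the pattern from the security.xsd excerpt quoted below; the names is_valid_user_name and UserNameMap are hypothetical, and a real application would persist the mapping rather than hold it in memory. The check uses a full match because XSD patterns are implicitly anchored; an fn:matches version would need explicit ^ and $ anchors.

```python
import re
import uuid

# Pattern copied from the security.xsd excerpt quoted in this thread.
USER_NAME_PATTERN = re.compile(r"[a-zA-Z0-9._@-]+")


def is_valid_user_name(name: str) -> bool:
    """Pre-flight check: True only if the whole string matches the pattern."""
    return USER_NAME_PATTERN.fullmatch(name) is not None


class UserNameMap:
    """Hypothetical in-memory map between display names and server-side names."""

    def __init__(self) -> None:
        self._by_display = {}
        self._by_server = {}

    def register(self, display_name: str) -> str:
        """Return the server-side name for display_name, creating it if needed."""
        if display_name in self._by_display:
            return self._by_display[display_name]
        # uuid4().hex is 32 lowercase hex characters, so it passes the pattern.
        server_name = uuid.uuid4().hex
        self._by_display[display_name] = server_name
        self._by_server[server_name] = display_name
        return server_name

    def display_name(self, server_name: str) -> str:
        """Look up the original, unmodified display name."""
        return self._by_server[server_name]
```

With this approach the user keeps the name exactly as typed (møøse stays møøse), and only the generated identifier is ever passed to the server as the user name.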





On Fri, Aug 1, 2014 at 2:07 PM, Michael Blakeley <[email protected]> wrote:

> I don't know of a library or built-in that would handle that. But you
> could write one. If you do, try to release the source.
>
> In another direction, one fairly cheap solution might be to check the
> user-name before creating the user, or try-catch the library call. If it
> looks iffy or fails, apologize to the user and ask them to asciify it in
> their own preferred way. That way there are no surprises when the user sees
> møøse automatically (and irrevocably?) translated to moeoese or moose.
> People are sometimes very sensitive about these things, so either variant
> might annoy someone.
>
> A pre-flight check could use fn:matches with the pattern from security.xsd:
>
>   <xs:simpleType name="user-name">
>     <xs:annotation>
>       <xs:documentation>
>       </xs:documentation>
>       <xs:appinfo>
>       </xs:appinfo>
>     </xs:annotation>
>     <xs:restriction base="xs:token">
>       <xs:pattern value="[a-zA-Z0-9._@-]+"/>
>       <xs:minLength value="1"/>
>     </xs:restriction>
>   </xs:simpleType>
>
> Longer term you could ask MarkLogic to expand that pattern to cover more
> languages. It's only a matter of time before more users start wanting to
> use non-ASCII scripts for usernames. I'm not sure if there's any technical
> reason for the restriction. Using HTTP auth means user-id can't contain a
> colon ':', but otherwise I believe anything goes. Of course browsers might
> not support everything, and I'm not sure about LDAP, NTLM, etc.
>
> -- Mike
>
> On 1 Aug 2014, at 07:05 , David Sewell <[email protected]> wrote:
>
> > We have a user-facing function that creates login names from users'
> > real names, using initial characters from their first name and last name
> > to create a MarkLogic user name. For the first time we recorded a server
> > error when someone registered with a name beginning with a non-ASCII
> > character (Norwegian Ø), because currently a MarkLogic username cannot
> > contain non-ASCII characters.
> >
> > So I thought the easy solution would be to use xdmp:diacritic-less().
> > But no, that only changes characters like ñ and é that are accented
> > variants of a single letter. It does not touch combined characters like
> > Ø or Æ.
> >
> > Of course I could use fn:translate to catch all of the likely cases, but
> > is there a more general-purpose standard or extension function to perform
> > normalization to ASCII for accented/combined Latin characters in a
> > MarkLogic environment?
> >
> > David S.
> >
> > --
> > David Sewell, Editorial and Technical Manager
> > ROTUNDA, The University of Virginia Press
> > PO Box 400314, Charlottesville, VA 22904-4314 USA
> > Email: [email protected]   Tel: +1 434 924 9973
> > Web: http://rotunda.upress.virginia.edu/
> > _______________________________________________
> > General mailing list
> > [email protected]
> > http://developer.marklogic.com/mailman/listinfo/general
>



-- 

============================================
Timothy Cook
LinkedIn Profile:http://www.linkedin.com/in/timothywaynecook
MLHIM http://www.mlhim.org