I didn't realize non-ASCII characters weren't allowed in usernames. This is a serious shortcoming in MarkLogic. Unless it can be fixed soon, it may force developers to adopt a mapping approach, using something unique like a UUID as the MarkLogic user name.
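
A minimal sketch of that mapping idea, combined with the pre-flight check suggested in the quoted message below. It is written in Python only to keep it self-contained and does not call any MarkLogic API. The helper names `is_valid_user_name` and `assign_user_name` and the in-memory `mapping` dict are hypothetical. Only the character class comes from the security.xsd excerpt. Note that an XSD pattern implicitly matches the whole value, while fn:matches looks for a match anywhere in the string, so an XQuery version would presumably need explicit `^` and `$` anchors. Here `fullmatch` plays that role.

```python
import re
import uuid

# Character class copied from the security.xsd excerpt quoted in this thread.
USER_NAME_PATTERN = re.compile(r"[a-zA-Z0-9._@-]+")


def is_valid_user_name(name: str) -> bool:
    """Return True if the whole name matches the user-name pattern."""
    # fullmatch mirrors XSD semantics, where a pattern must cover the
    # entire value rather than any substring of it.
    return USER_NAME_PATTERN.fullmatch(name) is not None


def assign_user_name(display_name: str, mapping: dict) -> str:
    """Hypothetical helper: choose a server-side user name.

    A display name that already satisfies the pattern is used as-is.
    Anything else gets a generated UUID, and `mapping` records which
    display name that UUID stands for.
    """
    if is_valid_user_name(display_name):
        user_name = display_name
    else:
        # A UUID string contains only hex digits and hyphens,
        # so it always satisfies the pattern.
        user_name = str(uuid.uuid4())
    mapping[user_name] = display_name
    return user_name
```

In a real application the mapping would have to live somewhere persistent, and the login form would need to translate the name the user types into the stored user name before authenticating.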

On Fri, Aug 1, 2014 at 2:07 PM, Michael Blakeley <[email protected]> wrote:

> I don't know of a library or built-in that would handle that. But you
> could write one. If you do, try to release the source.
>
> In another direction, one fairly cheap solution might be to check the
> user-name before creating the user, or try-catch the library call. If it
> looks iffy or fails, apologize to the user and ask them to asciify it in
> their own preferred way. That way there are no surprises when the user
> sees møøse automatically (and irrevocably?) translated to moeoese or
> moose. People are sometimes very sensitive about these things, so either
> variant might annoy someone.
>
> A pre-flight check could use fn:matches with the pattern from
> security.xsd:
>
>   <xs:simpleType name="user-name">
>     <xs:annotation>
>       <xs:documentation>
>       </xs:documentation>
>       <xs:appinfo>
>       </xs:appinfo>
>     </xs:annotation>
>     <xs:restriction base="xs:token">
>       <xs:pattern value="[a-zA-Z0-9._@-]+"/>
>       <xs:minLength value="1"/>
>     </xs:restriction>
>   </xs:simpleType>
>
> Longer term you could ask MarkLogic to expand that pattern to cover more
> languages. It's only a matter of time before more users start wanting to
> use non-ASCII scripts for usernames. I'm not sure if there's any technical
> reason for the restriction. Using HTTP auth means user-id can't contain a
> colon ':', but otherwise I believe anything goes. Of course browsers might
> not support everything, and I'm not sure about LDAP, NTLM, etc.
>
> -- Mike
>
> On 1 Aug 2014, at 07:05, David Sewell <[email protected]> wrote:
>
> > We have a user-facing function that creates login names based on their
> > real names, using initial characters from their surname and last name to
> > create a MarkLogic user name. For the first time we recorded a server
> > error when someone registered with a name beginning with a non-ASCII
> > character (Norwegian Ø), because currently MarkLogic username cannot
> > have non-ASCII characters.
> >
> > So I thought the easy solution would be to use xdmp:diacritic-less().
> > But no, that only changes characters like ñ and é that are accented
> > variants of a single letter. It does not touch combined characters like
> > Ø or Æ.
> >
> > Of course I could use fn:translate to catch all of the likely cases, but
> > is there a more general-purpose standard or extension function to
> > perform normalization to ASCII for accented/combined Latin characters in
> > a MarkLogic environment?
> >
> > David S.
> >
> > --
> > David Sewell, Editorial and Technical Manager
> > ROTUNDA, The University of Virginia Press
> > PO Box 400314, Charlottesville, VA 22904-4314 USA
> > Email: [email protected] Tel: +1 434 924 9973
> > Web: http://rotunda.upress.virginia.edu/
> > _______________________________________________
> > General mailing list
> > [email protected]
> > http://developer.marklogic.com/mailman/listinfo/general

--
============================================
Timothy Cook
LinkedIn Profile: http://www.linkedin.com/in/timothywaynecook
MLHIM http://www.mlhim.org
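
On David's original question about folding letters such as Ø and Æ to ASCII: the sketch below, again in Python rather than XQuery, shows the usual two-step shape of a hand-written solution. Unicode decomposition handles accented variants like ñ and é, and an explicit table covers letters that have no decomposition, much like the fn:translate route he mentions. The function name `asciify` and the contents of `SPECIAL_CASES` are illustrative assumptions, and the table is deliberately incomplete. Spelling choices such as "o" versus "oe" for ø are exactly the ones Mike suggests leaving to the user.

```python
import unicodedata

# Illustrative, incomplete table for letters that Unicode does not
# decompose into a base letter plus a combining mark. The ASCII spellings
# chosen here are assumptions, not a standard.
SPECIAL_CASES = {
    "Ø": "O", "ø": "o",
    "Æ": "AE", "æ": "ae",
    "Œ": "OE", "œ": "oe",
    "Ł": "L", "ł": "l",
    "Đ": "D", "đ": "d",
    "ß": "ss",
}


def asciify(text: str) -> str:
    """Fold common Latin letters to ASCII where a mapping is known."""
    # Step 1: replace letters that have no Unicode decomposition.
    replaced = "".join(SPECIAL_CASES.get(ch, ch) for ch in text)
    # Step 2: split accented letters into base letter + combining mark,
    # then drop the combining marks.
    decomposed = unicodedata.normalize("NFKD", replaced)
    return "".join(ch for ch in decomposed if not unicodedata.combining(ch))


print(asciify("møøse"))  # moose
print(asciify("ñé"))     # ne
```

Characters the function does not know about are left untouched rather than silently dropped, so a pattern check against the user-name rule can still reject the result and prompt the user.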
