[precis] More on mappings (iSCSI perspective)

Black, David Fri, 03 May 2013 12:20:36 -0700

I've taken a look at the mappings draft from an iSCSI perspective.
The iSCSI entities that use Unicode are iSCSI names, so I'll start
with some background on them.


iSCSI names are somewhat restrictive in allowed characters.  For
ASCII characters, only the following are allowed:
   -  ASCII dash character ('-' = U+002d)
   -  ASCII dot character ('.' = U+002e)
   -  ASCII colon character (':' = U+003a)
   -  ASCII lower-case characters ('a'..'z' = U+0061..U+007a)
   -  ASCII digit characters ('0'..'9' = U+0030..U+0039)
All other ASCII characters are prohibited, including U+0020; SPACE.
The Unicode support generalizes from this.
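
To make the ASCII rule concrete, here is a minimal Python sketch that
reports the ASCII characters in a candidate name that fall outside the
list above.  The function name and the sample name are illustrative
assumptions, and the check deliberately ignores non-ASCII characters,
which are governed by the stringprep profile rather than by this list.

```python
import string
from typing import List

# Allowed ASCII characters from the list above: dash, dot, colon,
# lower-case letters and digits.
ALLOWED_ASCII = frozenset("-.:" + string.ascii_lowercase + string.digits)


def disallowed_ascii(name: str) -> List[str]:
    """Return the ASCII characters in `name` that are not allowed.

    Hypothetical helper for illustration only; non-ASCII characters
    are skipped because this list says nothing about them.
    """
    return [ch for ch in name if ord(ch) < 0x80 and ch not in ALLOWED_ASCII]
```

For example, disallowed_ascii("iqn Example") reports the space and the
upper-case "E", while an all-lower-case name with only dashes, dots,
colons and digits yields an empty list.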

The iSCSI protocol design kept Unicode functionality outside the
protocol; stringprep is used on iSCSI names *before* the protocol
sees them so that the protocol can do binary comparison for equality
of iSCSI names that are communicated "on the wire".  Hence whatever
we do here should not affect the specification of the "on the wire"
iSCSI protocol.

For context, the iSCSI stringprep profile stuck fairly close to
stringprep (RFC 3454) with the following items of note:

        - whitespace is prohibited in output.
                Prohibition in input is probably the best approach here.
        - U+3002; ideographic full stop is prohibited in output.
                Mapping that to U+002E; FULL STOP would be user-friendly.
        - NFKC was used as it was the only reasonable option in
                RFC 3454.
        - Everything for which RFC 3454 suggested possible prohibition
                was prohibited, i.e., all of Appendix C of RFC 3454.
        - Bidirectional support followed RFC 3454.

        - Case mapping is used.  The design rationale was to allow
                mixed case human input, mapping to lower case to
                obtain binary-comparable identifiers.
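
The prepare-then-compare flow described above can be sketched roughly
as follows.  This is only a partial stand-in under stated assumptions,
not the iSCSI profile itself: it uses Python's standard stringprep
module, assumes RFC 3454 table B.2 for the case mapping, and checks
only the whitespace and U+3002 prohibitions mentioned in this message.
The rest of Appendix C, bidirectional checks and unassigned code points
are omitted, and the function name is hypothetical.

```python
import stringprep
import unicodedata


def prepare(name: str) -> bytes:
    """Prepare a name outside the protocol; return bytes for comparison."""
    # Case mapping: fold mixed-case human input (table B.2 is an
    # assumption made for this sketch).
    mapped = "".join(stringprep.map_table_b2(ch) for ch in name)
    # Normalization: NFKC, as noted in the list above.
    normalized = unicodedata.normalize("NFKC", mapped)
    # Prohibited output: whitespace (tables C.1.1 and C.1.2) and
    # U+3002 IDEOGRAPHIC FULL STOP.
    for ch in normalized:
        if stringprep.in_table_c11_c12(ch) or ch == "\u3002":
            raise ValueError("prohibited character U+%04X" % ord(ch))
    # The protocol itself only ever sees and compares these bytes.
    return normalized.encode("utf-8")
```

Two names are then equal on the wire exactly when their prepared bytes
are equal, e.g. prepare(a) == prepare(b), with no Unicode handling
needed inside the protocol.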

I believe that both case mapping and local case mapping will be
appropriate for iSCSI, as at least the Turkish language dependency
on the mapping of upper case I appears relevant.  I would prefer
to see both mapping mechanisms specified in one document to limit
the possibility for confusion and mistakes (e.g., if local case
mapping is not always implemented along with case mapping), so
I'd prefer to move local case mapping into the framework draft to
resolve the open issues listed in Section 6 of the mappings draft.
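
As a small illustration of why the Turkish dependency matters for
binary comparison, the Python sketch below lower-cases the same input
with the default mapping and with a Turkish-style mapping.  The helper
names and the translation table are assumptions made for this example;
they are not taken from the mappings draft.

```python
# Turkish-style pre-mapping: upper case I maps to dotless i (U+0131)
# and dotted capital I (U+0130) maps to plain i.
TURKISH_PRE_MAP = {ord("I"): "\u0131", ord("\u0130"): "i"}


def lower_default(s: str) -> str:
    """Language-independent lower-casing."""
    return s.lower()


def lower_turkish(s: str) -> str:
    """Hypothetical local case mapping for Turkish input."""
    return s.translate(TURKISH_PRE_MAP).lower()


default_bytes = lower_default("DISK").encode("utf-8")
turkish_bytes = lower_turkish("DISK").encode("utf-8")
# The two results are not binary-equal, so both ends must apply the
# same mapping before names are compared.
names_match = default_bytes == turkish_bytes
```

Here names_match is False: the same typed input yields two different
identifiers depending on whether local case mapping was applied, which
is the confusion I'd like a single specification to head off.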

I also observe that the framework draft would benefit from an appendix
that summarizes potential differences in behavior from RFC 3454 - when
the precis profile for iSCSI is written, I'd prefer to point to that
text and describe how it applies (the alternative is to write it from
scratch in the iSCSI precis profile).  iSCSI will not have any
on-the-wire version change to pick up precis support for Unicode, so
there will probably need to be some text that describes what happens
when stringprep and precis are mixed in the same protocol session
(which may well include a blunt "SHOULD NOT" about doing that).

Thanks,
--David
----------------------------------------------------
David L. Black, Distinguished Engineer
EMC Corporation, 176 South St., Hopkinton, MA  01748
+1 (508) 293-7953             FAX: +1 (508) 293-7786
[email protected]        Mobile: +1 (978) 394-7754
----------------------------------------------------

_______________________________________________
precis mailing list
[email protected]
https://www.ietf.org/mailman/listinfo/precis
