Elwyn,

Thanks for your careful and thorough review.  I have included my comments below.

Note, if I have not commented below, please assume I agree with your comment 
and will update the draft to include your feedback.

Discussion inline below...

Phil

[email protected]

> On Apr 17, 2015, at 2:26 PM, Elwyn Davies <[email protected]> wrote:
> 
> I am the assigned Gen-ART reviewer for this draft. For background on
> Gen-ART, please see the FAQ at
> 
> <http://wiki.tools.ietf.org/area/gen/trac/wiki/GenArtfaq>.
> 
> Please resolve these comments along with any other Last Call comments
> you may receive.
> 
> Document: draft-ietf-scim-core-schema-17.txt
> Reviewer: Elwyn Davies
> Review Date: 2015/04/09
> IETF LC End Date: 2015/04/20
> IESG Telechat date: (if known) -
> 
> Summary: Not ready.  The 'major' issue identified is really political rather 
> than strictly technical although the proposed syntax does limit the 
> applicability (or at least the easy applicability) of the scheme. Making the 
> schemas more aware of practice outside the basic English speaking world 
> should be an aim of IETF work, IMO.  The minor issues are mostly  only just 
> more than editorial nits - and there are quite a few of these also.
> 
> Major issues:
> ===========
> s4.1.1, "name" attribute:  The definition of this attribute is culturally 
> insensitive.  The
> collection of name sub-attribute terms are North American/UK/Aussie/NZ 
> English -speaking biased.  The authors might wish to consider 
> http://www.w3.org/International/questions/qa-personal-names.  

[PH] I am not sure I agree with your conclusion. SCIM is a provisioning 
protocol, not a rendering protocol. SCIM is concerned with conveying 
information between systems based on commonly used fields. Further, for the 
formatted name, there is no restriction on how the name may be structured. 
Essentially, SCIM already follows the above reference. Note that we don’t use a 
regional identifier in name, but rather have it as a separate attribute, 
“locale”. This enables a UI to render the name using the appropriate regional 
convention.

The WG did have several international participants and no issues were raised.

I believe that what is being reflected is that while database structures and 
schemas tend to be “western” oriented, there are well-trodden industry 
practices for mapping those fields to render user-facing material in the 
proper localized forms.

In deference to your concern, I re-raised the issue with Oracle developers 
over the weekend. I have been informed that there has been an 
internationalization review and that the specification does not present any 
problems for us as international implementers.

> To a lesser extent this also applies to the definition of the addresses 
> attribute in s4.1.2.  The issue of the representation of postal addresses 
> incorporated in I-Ds and RFCs in the xml2rfc schema has been debated at 
> length on the rfc-interest mailing list.  The new (v3) vocabulary replaces 
> the specific sub-attributes with an ordered  list of "postalLine" elements 
> (see https://tools.ietf.org/html/draft-hoffman-xml2rfc-16#section-2.39). 
> Further, the use of country codes in RFCs has been dropped some time ago.  It 
> might be better to represent the address in a less specific way and leave 
> display up to user interfaces that can consider the relevant locale.  My 
> suggestion, FWIW, would be to have a country, possibly a code field plus an 
> ordered array of postalLines that can contain any of the additional 
> components and cater for any locale specific format.

[PH] The suggested workaround of following the XML2RFC format, using an array 
of “postalLine” values, causes more problems in the current SCIM model:
1.  SCIM does not support array references (e.g. postalLine/1, postalLine/2). 
In addition, some implementers (e.g. those building on top of LDAP) may not be 
able to guarantee the order of elements.
2.  SCIM’s objective is to provision, so the meaning associated with a field 
must be well known in order to map it to the receiver’s data system. A system 
like postalLine1, postalLine2 would require attribute parsing, which is 
substantially more complex and unreliable given the wide variance.  
3.  In practice, international data is collected through a user interface, 
which then categorizes the input and maps it to the appropriate attributes.  If 
we introduce agnostic attribute naming in the protocol (postalAddress1, 2, 
etc.), we actually lose knowledge of what the intended content is.  It then 
becomes impractical to say whether postalAddress2 maps to the city, county, 
district, or mail stop of a target system.

IOW, in XML2RFC we depend on human editors to validate and align the data. 
As a protocol, we do not have this luxury.
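
To illustrate points 2 and 3, here is a rough Python sketch (illustration 
only, not proposed draft text). The left-hand names are the address 
sub-attributes from the core schema; the right-hand target field names and 
both function names are made up for the example.

```python
# SCIM address sub-attribute names (left) mapped to the fields of a
# hypothetical target system (right). The target names are invented.
SCIM_TO_TARGET = {
    "streetAddress": "street",
    "locality": "city",
    "region": "state_or_province",
    "postalCode": "postal_code",
    "country": "country_code",
}


def map_address(scim_address):
    """Map a SCIM address object to the hypothetical target record by name."""
    return {
        target: scim_address[source]
        for source, target in SCIM_TO_TARGET.items()
        if source in scim_address
    }


def map_postal_lines(lines):
    """With anonymous ordered lines there is no name to key on, so every
    line ends up unclassified and the receiver has to guess its meaning."""
    return {"unclassified": list(lines)}
```

With named sub-attributes the mapping is a simple lookup. With ordered lines 
the receiver has to parse free text to decide which line is the city.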


> 
> ======================================================
> 
> Minor issues:
> ===========
> Reference to SCIM Protocol document:  At a bare minimum a normative reference 
> to the SCIM protocol document (currently draft-ietf-scim-api-16) is needed in 
> s1.2 where the protocol is referred to in the first two definitions.  In my 
> opinion, this document would be improved by the addition of a brief overview 
> of the operation of the SCIM protocol and the implications for the design of 
> the schema.   For example, s2 talks about 'replacement of a resource':  
> Knowing in advance that one of the operations anticipated in the protocol is 
> replacement makes this clearer.

No objection. There has been a practical problem that XML2RFC won’t let you 
reference a draft version that does not exist. In practice we’ve had to publish 
API 2 days after core-schema in order to allow the validation to succeed.  I 
will amend the document and leave notes for the RFC editor to co-publish the 
specs since they reference each other.

> 
> s1.1, Use of OPTIONAL and REQUIRED:  These terms are overloaded in this 
> document.  The majority of uses are not specifying features of the protocol 
> as per RFC 2119 but indicating the necessity or otherwise of the presence of 
> particular attributes in resource types.  AFAICS the only RFC 2119 usages are 
> one place in  s2.2.7 for OPTIONAL  and two adjacent places in s10.3.1 for 
> REQUIRED .   To avoid the overloading it would be easy to omit OPTIONAL and 
> REQUIRED from the RFC 2119 list, use the alternative RFC 2119 terminology 
> (MAY in s2.2.7 and MUST in s10.3.1) and provide a separate note on the usage 
> of OPTIONAL and REQUIRED in s1.1.

[PH] In the context of a schema document, the use of OPTIONAL and REQUIRED is 
equivalent to protocol normative language and was intended.

Nevertheless, if you feel strongly, I can re-word to avoid RFC 2119 language 
entirely.
> 
> s2.1, Syntax of attribute names:  I am confused by the constraints suggested 
> here.
> (1)   "Attribute names SHOULD be camel-cased":  AFAICS this has no impact on 
> the specification or protocol.  My guess is that the specification has 
> adopted the convention normally used in JavaScript.  This is merely a 
> representation of the convention used in SCIM schemas and RFC 2119 language 
> is inappropriate.  I suggest replacing this with
> "This document uses the camel-casing convention for attribute names (e.g., 
> "camelCase").
> (2) "nameChar   = "-" / "_" / DIGIT / ALPHA": Given the close association 
> with JavaScript, it seems inappropriate to allow hyphen (-) as a character in 
> attribute names as this is illegal in JavaScript.
> (3) The definition should say whether attribute names are case sensitive.

[PH] I will clarify.  Attribute names should be case-insensitive.
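
As a rough sketch of what case-insensitive matching would mean for a receiver 
(Python, illustration only; the function name is hypothetical):

```python
def get_attribute(resource, name):
    """Return the value whose attribute name matches `name`, ignoring case.

    Attribute names are limited to ASCII by the ABNF, so str.lower() is
    sufficient here. Returns None when no attribute matches.
    """
    wanted = name.lower()
    for key, value in resource.items():
        if key.lower() == wanted:
            return value
    return None
```

For example, a lookup for "USERNAME" would find a "userName" member.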


> (4) Even though there is ABNF, it would be useful to note explicitly that 
> names are limited to a subset of ASCII rather than the much wider JSON string 
> or JavaScript variable character sets.

[PH] The document references the core rules in RFC 5234.  Are you suggesting 
we should restrict to A-Z / a-z rather than ALPHA?
> 
> s2.2.7, $ref:  In s2.2.7, $ref is defined as a sub-attribute name but does 
> not match the attribute name syntax discussed in the previous comment for 
> s2.1.  Does the attribute name syntax apply to sub -attributes?  Or are they 
> just JSON member names?

The ABNF should be 

ATTRNAME   = ALPHA *(nameChar)
nameChar   = "$" / "-" / "_" / DIGIT / ALPHA

Note:  there is no requirement that JavaScript attribute names must be exactly 
the same as JSON payload attribute names. So while "-" may be problematic in 
JavaScript (it needs to be escaped), I see no protocol impacts and suggest we 
not eliminate "-" (hyphen) from attribute names at this time.
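
For implementers, a minimal Python sketch of a check against the rule above 
(the names ATTRNAME_RE and is_valid_attrname are mine, not from the draft):

```python
import re

# ATTRNAME = ALPHA *(nameChar)
# nameChar = "$" / "-" / "_" / DIGIT / ALPHA
# ALPHA and DIGIT are the RFC 5234 core rules, i.e. ASCII only.
# Read literally, the first character must be ALPHA, so "$" is accepted
# only after the first character.
ATTRNAME_RE = re.compile(r"[A-Za-z][A-Za-z0-9$_-]*")


def is_valid_attrname(name):
    """True if `name` matches the ATTRNAME rule above in full."""
    return ATTRNAME_RE.fullmatch(name) is not None
```

Under this check "userName" and "x-500" are accepted, while "9lives" and the 
empty string are rejected.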

> 
> s2.3, next to last para:  To ensure that the service provider knows what it 
> ought to do to canonicalize a given value, the schema specification needs to 
> specify what canonicalization means for each type of attribute.  Having read 
> further on, I see that this is done in most cases for relevant attributes 
> defined in this draft.   A note that this should be done generally when 
> defining new schemas is needed here.  This is particularly important for 
> strings that might have internationalization issues (c.f., the discussion of 
> string comparison in filtering in section 5 of draft-ietf-scim-api-16.)
> 
> s7, canonicalValues:  The wording here
>>         When
>>         applicable service providers MUST specify the canonical types
>>         specified in the core schema specification; e.g., "work",
>>         "home".
> seems to imply that the possible canonicalValues mentioned in the definitions 
> of User, Group etc.  schemas earlier in the draft are actually normative 
> minimum requirements that could, at least in some cases, be extended.  The 
> wording used in the earlier sections is rather less definitive and appears to 
> indicate that the suggested values are examples that a service provider might 
> possible want to replace if they considered alternative values better suited 
> to their application, e.g.
>>   userType
>>      Used to identify the organization to user relationship. Typical
>>      values used might be "Contractor", "Employee", "Intern", "Temp",
>>      "External", and "Unknown" but any value may be used.
> and
>>   phoneNumbers
>>      Phone numbers for the user.   ...  The "display" sub-attribute
>>      MAY be used to return the canonicalized representation of the
>>      phone number value.  The sub-attribute "type" often has typical
>>      values of "work", "home", "mobile", "fax", "pager", and "other",
>>      and MAY allow more types to be defined by the SCIM clients.
> The wording used in the earlier sections seems to need 'tightening up' to 
> make it clear what minimum set of canonicalValues is required for 
> conformance, if indeed that is what is wanted.

[PH] Agreed. Suggestion:

A collection of suggested values for an attribute. For example, it is often 
used with the “type” attribute to categorize a value, such as “home” or 
“work”.  The service provider MAY choose to ignore values it does not support.
> 
> s7, caseExact:  I think you may need to clarify what case insensitivity means 
> for languages other than unaccented English.  It may be sufficient to provide 
> a note and a pointer to the discussion of filtering and normalization in the 
> protocol draft.
> 
> s10.3:  The registration procedure seems overly complex.  If, as stated, an 
> RFC is required in all cases, then the standard (RFC 7035) IETF Review 
> registration policy would seem to fill the bill and there is no need for a 
> designated expert.  Alternatively, Specification Required (with a designated 
> expert as is standard for this case) could be used if other types of 
> specification could be countenanced.  I suspect the requirement for a 
> standards track RFC as a way of modifying an existing value is going to come 
> back to bite us if the original specification was not standards track.  I am 
> not sure this attempt to provide a higher hurdle for modifications is the 
> best way to go about this - In general, IETF Review would, I think, give 
> enough pushback against inappropriate updates without requiring standards 
> track in all cases.  Overall, I recommend that the authors consult your AD 
> and IANA to determine how best to structure the registration procedure.

[PH] The current document is probably following older IANA practices which I 
understand have recently been updated. Leif Johansson has suggested we can 
simplify by updating the document to reflect the new recommendations. I’ll let 
Leif comment more on this issue.
> 
> ===========================================================
> 
> Nits/editorial comments:
> =====================
> Global: s/e.g. /e.g., /
> 
> The term 'endpoint': The term '(network) endpoint' has a particular technical 
> meaning in W3C/HTTP jargon although it the usage in (e.g.) 
> http://www.w3.org/TR/wsdl.html seems rather self-referential.  It would be 
> useful to provide a definition.  Perhaps something like:
> (Network) endpoint:  Also known as a 'port' (see 
> http://www.w3.org/TR/wsdl.html).  A  port has a 'port type' that identifies a 
> set of operations invoked by HTTP methods.  Each port is identified by a URI  
> typically constructed from the base URI identifying the server implementing 
> the operation and a relative URI bound to the port type.  The methods are 
> associated with abstract data types, such as the schema specified in this 
> document.  HTTP messages carry data structured according to the abstract data 
> types.
> 
> Canonicalized URLs:   Presumably URLs should be canonicalized in line with 
> Section 6 of RFC 3986.  An appropriate global place to say this would be s2.3 
> I believe.   However RFC 6986 offers a 'ladder' of canonicalizations and it 
> would be desirable to say what rung on this ladder should be used.  
> Presumably either 6.2.3 or 6.2.4.
> 
> s1, para 1, last sentence:  The phrase 'redundantly integrated' is not 
> felicitous.  Suggestion:
> 
> OLD:
>   Similarly, cloud services
>   providers seeking to inter-operate with multiple application
>   marketplaces or cloud identity providers must be redundantly
>   integrated.
> NEW:
>   Similarly, cloud services
>   providers seeking to inter-operate with multiple application
>   marketplaces or cloud identity providers would require pairwise
>   integration.
> END
> 
> s1, para 2: Worth adding a reference to [PortableContacts] since you have it 
> already and its not a 'well-known' item.
> I fear that LDAP is not a well-known abbreviation within the meaning of the 
> act, and needs expanding.
> Maybe add a ref to RFC 6350 for vCards.
> 
> s2, para 1, 1st sentence: s/contents of which/allowable contents of which/
> s2, para 1, 4th sentence: s/alidation/Validation/
> 
> s2, para 1, last sentence: s/the attributed defined schema/its 
> characteristics as defined in the relevant schema/
> 
> s2, para 2: s/extend schema/ extend a schema/ [or "extend schemas"]
> 
> s2.1, para 1: s/For each attribute, SCIM schema/For each attribute, a SCIM 
> schema/
> 
> s2.2:  The list of characteristics and their default values is not associated 
> with the data type of the attribute but is another set of attributes of each 
> attribute defined.  This would be clearer if the list of defaults and 
> examples was separated out into a new section (probably after s2.2).  It 
> would be helpful to point out explicitly that these defaults apply to all the 
> attributes defined in the draft - I found the tacit assumption of default 
> characteristics in later definitions of attributes had me asking myself 
> whether certain characteristics ought to have been defined whereas they were 
> actually covered by the defaults.
> 
> s2.2, 1st bullet: For consistency, s/required/REQUIRED/
> 
> s2.2, bullet 5:
> OLD:
> o  have no canonical values (e.g. type is "home" or "work"),
> NEW:
> o  have no canonical values (for example, the "type" sub-attribute in Section 
> 2.3),
> END
> 
> s2.2.6, Base 64 URL encoding:  Presumably the trailing padding characters can 
> be omitted here - this should be mentioned whether or not they are needed.
> 
> s2.2.8: Presumably, in line with s2.3 and the JSON specification, the order 
> of component attributes is not significant.  If this is so, it should be 
> mentioned here:  Perhaps add:
>    The order of the component attributes is not significant. Servers and
>    clients MUST NOT require or expect attributes to be in
>    any specific order when an object is either generated or analyzed.
> 
> s2.3, 1st para:  I found this difficult to parse.  Suggest:
> OLD:
>   Multi-valued attributes contain a list of value or may contain sub-
>   attributes and MAY also be considered complex attributes.  The order
>   of values returned by the server SHOULD NOT be guaranteed.  The sub-
>   attributes below are considered normative and when specified SHOULD
>   be used as defined.
> NEW:
>    Multi-valued attributes contain a list of elements, using the JSON array 
> format
>    defined in Section 5 of [RFC7159].  Elements can be either
>    o   primitive values, or
>    o   objects with a set of sub-attributes and values, using the JSON object 
> format
>         defined in Section 4 of [RFC7159], in which case they MAY also be 
> considered
>         to be complex attributes.  As with complex attributes, the order of 
> sub-attributes
>         is not significant.  The pre-defined sub-attributes listed in this 
> section can be
>        used with multi-valued attribute objects but these sub-attributes 
> should only be used
>       with the meanings as defined here.
> 
> s2.3: Question: Can sub-attributes have sub-sub-attributes?  I don't think I 
> see any examples and maybe the definition in s1.2 effectively excludes them.  
> Might be worth being explicit.

[PH] Agreed. I will add text that complex attributes may not have complex 
sub-attributes (sub-sub-attributes).
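
A schema author could check this mechanically. Here is a minimal Python 
sketch, assuming attribute definitions are dicts with "type" and 
"subAttributes" members as in the schema definitions of Section 7 (the 
function name is hypothetical):

```python
def has_nested_complex(attribute):
    """True if a complex attribute declares a complex sub-attribute."""
    if attribute.get("type") != "complex":
        return False
    return any(
        sub.get("type") == "complex"
        for sub in attribute.get("subAttributes", [])
    )
```

A definition for which this returns True would violate the restriction.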

> 
> s2.3, "primary" sub-attribute: Should this be specified as assumed to be 
> "false" if not present in a relevant object?  I don't think this is covered 
> by the defaults anywhere.

Agreed.
> 
> s2.3, $ref: I guess this ought always to be canonicalized  - this can be 
> noted in the following paragraph where canonicalization is discussed.  This 
> would be a good place to specify a reference for URL canonicalization as 
> mentioned above.
> 
> s2.3, last para: Suggest being a little more explicit about the scope of this 
> paragraph.  I suggest:
> OLD:
>   Service providers MAY return the same value more than once with
>   different types (e.g. the same e-mail address may used for work and
>   home), but SHOULD NOT return the same (type, value) combination more
>   than once per Attribute, as this complicates processing by the
>   Consumer.
> NEW:
>   Service providers MAY return element objects with the same "value" 
> sub-attribute
>   more than once with a  different "type" sub-attribute (e.g., the same 
> e-mail address
>   may used for work and home), but SHOULD NOT return the same (type, value)
>   combination more than once per Attribute, as this complicates processing by 
> the
>   consumer.
> END
> Note "Consumer" replaced by "consumer" - there is no definition of a specific 
> meaning for this term.
> 
> s3, Resource Type:  s/("meta.resourceType")/("meta.resourceType", see Section 
> 3.1)/
> 
> s3, Schemas Attribute:  I think s/the namespace of SCIM schema that 
> defines/the namespaces of the SCIM schemas that define/; s/All 
> representations of SCIM schema MUST include a non-zero value array/All 
> representations of SCIM schemas MUST include a non-empty array/
> 
> s3, name used in example:  I don't know if the RFC Editor has a policy on 
> suitable fictitious names equivalent to example.com for domains.  Apparently 
> Jane Roe and Mary Major have been used in US legal practice  as female 
> alternatives to the ubiquitous Mr John Doe.  Probably good to check with the 
> RFC Editor.
> 
> s3.1, id, externalId, meta.version, meta.resourceType:  I suspect these ought 
> to be caseExact?
> 
> s3.1, externalId:  The concepts of "provisioning domain" and a "client's 
> tenant" need to be defined.  The externalId attribute is not explicitly 
> defined as REQUIRED or OPTIONAL.
> 
> s3.1.1, meta.resource:  I got the impression from s3 that meta.resourceType 
> was REQUIRED rather than being optional as noted in the first para of s3.1.1.
> 
> s3.1.1, meta.location: Should the value of this sub-attribute be the same as 
> Content-Location rather than Location?  Is it intended that the request 
> should be redirected (or that the resource was newly created?  If not it 
> seems Content-Location would be more appropriate.  A normative reference to 
> the relevant HTTP RFC (probably RFC 7231) ought to be included.
> 
> s3.1.1, meta.version:  Would one expect a weak or strong ETag?  A normative 
> reference to the relevant HTTP RFC (probably RFC 7232) ought to be included.
> 
> s3.2, last sentence: s/Section 6and/Section 6 and/ (missing space).
> 
> s3.3, 1st para:    s/used in LDAP/are used in LDAP/;
>                            s/Each "schemas" value indicates additive 
> schema/Each value in the "schemas" attribute  indicates an additive schema/;
>                            s/See Figure 5 for an example JSON 
> representation/See Figure 5 for an example of the JSON representation/
> 
> s3.3, para 2: s/"schemas" URI value/URI value in the "schemas" attribute/
> 
> s4.1.1, userName:  Having said that each User MUST have  include a non-empty 
> userName value, why is this attribute RECOMMENDED rather than REQUIRED?  I 
> guess it ought to be caseExact also.
> 
> s4.1.1, profileUrl: Needs a canonicalization mechanism specified.
> 
> s4.1.1, preferredLanguage:  There is potentially more than one preferred 
> language (as per Accept-Languages) so this presumably this ought to be a 
> Multi-valued attribute.  The Accept-Language header syntax also has an 
> optional, per language, weight to assist with selection.  Should this be 
> catered for here as well?  This would presumably mean that it should have 
> sub-attributes (e.g.) using "value" for the name and "weight" or some such.   
> Also s/localized User interface/localized user interface/
> 
> s4.1.1, password:  I *hope* there is a discussion of the security 
> implications of this field later.  A pointer to this discussion would be 
> highly desirable.
> 
> s4.1.2, photos: A reference to the canonicalization mechanism is needed (see 
> previous comment).
> 
> s4.1.2, entitlements, roles: There doesn't seem to be any good reason for 
> capitalizing 'NO' here: s/NO/no/, 2 places.
> 
> s4.2, para 2: s/by the service provider are considered/by the service 
> provider, and are considered/
> 
> s4.3, employeeNumber:  Maybe this might be better called an 
> "employeeIdentifier" since it can be alphanumeric.  Is there any reason why 
> this can't just be any old string?
> 
> s5, patch: A pointer to the SCIM protocol draft PATCH operation would be 
> helpful.
> 
> s5, bulk: A pointer to the SCIM protocol draft Bulk operations section would 
> be helpful.  I note that the capitalized form is not used in the protocol 
> draft: suggest s/BULK/Bulk/ (total of 2 places)
> 
> s5, filter: A pointer to some appropriate part of the SCIM protocol draft  
> (maybe s3.4.2.2) would be helpful.
> 
> s6, endpoint:  (1)The endpoint is defined to be a relative URI.   It is 
> therefore inappropriate that the example here is "/Users".  I guess it ought 
> to be "Users".  There are a number of example of relative URIs starting with 
> / in the examples in Section 8 that also ought to be corrected.
> 
> s6, endpoint: (2) Please bear with me, this is a bit long winded... I 
> initially thought that the 'endpoint' mechanism was a possible contravention 
> of BCP 190/RFC 7320:  Quoting s2.3 of RFC 7320:
>>    Scheme definitions define the presence, format, and semantics of a
>>    path component in URIs; all other specifications MUST NOT constrain,
>>    or define the structure or the semantics for any path component.
>> 
>>    ....
>> 
>>    For example, an application ought not specify a fixed URI path
>>    "/myapp", since this usurps the host's control of that space.
>> 
>>    Specifying a fixed path relative to another (e.g., {whatever}/myapp)
>>    is also bad practice (even if "whatever" is discovered as suggested
>>    in Section 3); while doing so might prevent collisions, it does not
>>    avoid the potential for operational difficulties (for example, an
>>    implementation that prefers to use query processing instead, because
>>    of implementation constraints).
> In Section 6, the definition of the endpoint attribute specifies that each 
> schema has to declare a relative URI or path component that gives access to 
> schema instances.  My initial thinking was that  the endpoint value was 
> standardized for Users and Groups in the draft.  My interpretation of s2.3 of 
> RFC 7320 was that this technique is deprecated as bad practice.  After 
> sleeping on it, I think I understand that the endpoint value is *not* 
> standardized and potentially each service provider can use a different 
> endpoint name if they really have to (although I guess in this case it would 
> be good to go with the defaults.)  So I am happy that this isn't flagrantly 
> contravening BCP 190, although I am not sure about the query processing bit 
> at the end of the quoted section.  Conclusion: I think it would be useful to 
> add a note to the definition of endpoint to indicate that it is at the choice 
> of the service delivering the resources and is not a fixed value, maybe 
> saying that this is intended to avoid infringing BCP 190.
> 
> s7, mutability:
> OLD:
> mutability  A single keyword indicating what types of
>                    modifications an attribute MAY accept as follows:
> This 'MAY' is not about the 'protocol'.. Suggest:
> NEW:
> mutability  A single keyword indicating the circumstances under
>                    which the value of the attribute can be (re)defined:
> END
> 
> s9: s/personally identifiable information/personally identifying information/g
> 
> s9, 1st bullet: s/mulitple/multiple/
> 
> s9: para 1:  It would be sensible to also forbid the carrying of passwords in 
> requests that are not encrypted.
> 
> s9:  It would be worth emphasizing that privacy issues should be considered 
> whenever resource extensions are defined.
> 
> s10.1:  This is a request for a new entry in the  'URN Sub-namespace for 
> Registered Protocol Parameter Identifiers' ...
> OLD:
>   IANA has created a registry for new IETF URN sub-namespaces,
>   "urn:ietf:params:scim:", per [RFC3553].  The registration request is
>   as follows:
> 
>   Per [RFC3553], IANA has registered a new URN sub-namespace,
>   "urn:ietf:params:scim".
> NEW:
>   IANA is requested to add an entry to the 'IETF URN Sub-namespace for 
> Registered Protocol Parameter Identifiers'
>   registry and create a sub-namespace for the Registered Parameter Identifier 
> as per [RFC3553]:
>   "urn:ietf:params:scim:".
>   The registration request is as follows:
> END
> 
> s10.2:   This section is lacking a specification of exactly what is recorded 
> in the new SCIM registry - the template tells how to apply and considerations 
> to be used in granting the request.  See Section 8.4 of RFC 7035, for 
> example, to see what is needed here.
> 
> 
> s11.1: Needs a reference to the SCIM protocol document.
> 
> s11.2, [Olson-TZ] is  incomplete - I suspect it needs a reference to the IANA 
> TZ database http://www.iana.org/time-zones
> 

_______________________________________________
Gen-art mailing list
[email protected]
https://www.ietf.org/mailman/listinfo/gen-art
