Re: [Syslog] An early last call comment on protocol-19

2007-02-06 Thread Sam Hartman
The description of non-ascii characters in the registry refers to
non-ascii characters in the description field, etc.  The subtags are
ascii.


___
Syslog mailing list
Syslog@lists.ietf.org
https://www1.ietf.org/mailman/listinfo/syslog


Re: [Syslog] An early last call comment on protocol-19

2007-02-06 Thread tom.petch
- Original Message -
From: "Sam Hartman" <[EMAIL PROTECTED]>
To: "tom.petch" <[EMAIL PROTECTED]>
Cc: "David Harrington" <[EMAIL PROTECTED]>; <[EMAIL PROTECTED]>
Sent: Monday, February 05, 2007 10:44 PM
Subject: Re: [Syslog] An early last call comment on protocol-19


> What part of 4646 allows non-ASCII characters?  How is encoding an
> issue?

Sam

In section 3.1, "Format of the IANA Language Subtag Registry", it says
"  Characters from outside the US-ASCII [ISO646] repertoire, as well as
   the AMPERSAND character ("&", %x26) when it occurs in a field-body,
   are represented by a "Numeric Character Reference" using hexadecimal
   notation in the style used by [XML10]"
which suggests to me that characters outside the US-ASCII repertoire may occur
in a language subtag.

This section does define the encoding within the IANA Language Subtag Registry
but I do not see that as necessarily defining encodings to be used elsewhere and
I see benefits in using UTF-8 in -protocol should encoding be needed.
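
To make the two layers concrete, here is a minimal Python sketch: first undo the
registry's hexadecimal Numeric Character References, then pick UTF-8 as the
encoding on the wire.  The helper name decode_ncr and the sample field body are
assumptions made for illustration; they are not taken from RFC4646 or from
-protocol.

```python
import re

# Matches a hexadecimal Numeric Character Reference such as "&#xE5;".
_NCR = re.compile(r"&#x([0-9A-Fa-f]+);")


def decode_ncr(field_body: str) -> str:
    """Replace each hexadecimal NCR in a registry field body with its character."""
    return _NCR.sub(lambda m: chr(int(m.group(1), 16)), field_body)


# Illustrative field body (an assumption, not quoted from the registry).
description = decode_ncr("Norwegian Bokm&#xE5;l")

# The decoded value is a sequence of characters; the encoding is a separate choice.
wire_bytes = description.encode("utf-8")
print(description, wire_bytes)
```

The point of the sketch is only that the registry notation and the encoding
used by a protocol are independent decisions.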

I am conscious that section 2.1 of RFC4646 says
"Note that although [RFC4234] refers to octets, the language tags
   described in this document are sequences of characters from the
   US-ASCII [ISO646] repertoire.  Language tags MAY be used in documents
   and applications that use other encodings, so long as these encompass
   the US-ASCII repertoire."
which supports my view that language tags are characters, not an encoding
thereof.  I cannot reconcile the reference in 2.1 to US-ASCII repertoire with
3.1 and its reference to encoding when outside the US-ASCII repertoire.

I note that section 4.4.  "Canonicalization of Language Tags" refers to
"Case folding of ASCII letters in certain locales, unless carefully handled,
sometimes produces non-ASCII character values."
with the delightful example of
"the letter 'i' (U+0069) in Turkish and Azerbaijani is uppercased to U+0130"
so on balance, I think that characters outside the US-ASCII repertoire may
occur.
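
One way to read the "carefully handled" caveat is to fold only the 26 ASCII
letters and pass every other character through untouched.  A minimal Python
sketch, in which the function name ascii_lower is an assumption made for
illustration:

```python
import string

# Translation table covering only A-Z -> a-z; no other code point is mapped.
_ASCII_LOWER = str.maketrans(string.ascii_uppercase, string.ascii_lowercase)


def ascii_lower(tag: str) -> str:
    """Lowercase the ASCII letters of a tag and leave all other characters alone."""
    return tag.translate(_ASCII_LOWER)


# ASCII letters are folded; U+0130 (dotted capital I) is passed through as-is,
# so the result stays comparable and a later ASCII check can still reject it.
print(ascii_lower("EN-US"))
print(ascii_lower("TR-\u0130") == "tr-\u0130")

# For contrast, full Unicode lowercasing of U+0130 yields two code points.
print(len("\u0130".lower()))
```

With folding restricted like this, the case conversion itself cannot introduce
characters outside the US-ASCII repertoire.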

It may be that this is considered too low a probability to consider and that we
limit the language subtags to ASCII, in which case, encoding is not an issue.

I have checked draft-ietf-ltru-4646bis and the wording is unchanged there.

As I said to start with, I do find RFC4646 magnificently powerful, perhaps
too much so, in its entirety, for some use cases.

Tom Petch




Re: [Syslog] An early last call comment on protocol-19

2007-02-05 Thread Sam Hartman
What part of 4646 allows non-ASCII characters?  How is encoding an
issue?



Re: [Syslog] An early last call comment on protocol-19

2007-02-05 Thread tom.petch

Tom Petch

- Original Message -
From: "Sam Hartman" <[EMAIL PROTECTED]>
To: "tom.petch" <[EMAIL PROTECTED]>
Cc: "David Harrington" <[EMAIL PROTECTED]>; <[EMAIL PROTECTED]>
Sent: Monday, February 05, 2007 7:18 PM
Subject: Re: [Syslog] An early last call comment on protocol-19


> >>>>> "tom" == tom petch <[EMAIL PROTECTED]> writes:
>
> tom> Pity, I had hoped that David's compromise would be
> tom> acceptable.  RFC4646 (the current BCP0047) is a magnificent
> tom> piece of work and does enable the generator of text to
> tom> specify quite precisely how it should be interpreted.  I love
> tom> the differentiation between the dotted letter I of Azerbaijan
> tom> and Turkey, in fact all the comments about Azerbaijani,
> tom> Mongolian and Icelandic.
>
> tom> What concerns me is conformance, what does it mean that a
> tom> parameter MUST conform to this BCP or any other, an issue
> tom> that has surfaced on this list before.  If we just changed
> tom> the reference so that the I-D were to read "it MUST contain a
> tom> two letter language identifier as defined in BCP0047 [13]"
> tom> then I have no problem but this does rather negate the intent
> tom> of the BCP.
>
> First you need to remove the two-letter restriction; language tags can be
> longer than two letters, but besides that, this is exactly what I think you
> should do.
>
> tom> The BCP defines two levels of conformance (s.2.2.9) and I
> tom> suspect that even the lower level requires online access to
> tom> the IANA website so what does a receiver of a syslog message
> tom> do?  Take it as an opaque character string?  Check the ABNF?
> tom> Do as RFC4646 specifies, for well-formed or validating
> tom> conformance?
>
> The lower level (well-formed) does not require online access to the registry.
>
> Imho for syslog, receivers MAY|SHOULD  check that a tag is well-formed.
> That's probably best as a MAY.

Specifying in -protocol 'MAY check that a tag is well-formed' would allay my
concerns.
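
As a rough idea of what such a check costs a receiver, here is a minimal Python
sketch that tests a tag against the langtag and privateuse productions of the
RFC4646 ABNF as I read them, with no access to the registry.  The name
is_well_formed is an assumption made for illustration; grandfathered tags
(for example "en-GB-oed") and the repeated-singleton rule are deliberately
left out, so this is not a complete conformance check.

```python
import re

# Character classes are written out in full so that only ASCII can match.
_LANGTAG = re.compile(r"""
    (?: [A-Za-z]{2,3} (?: -[A-Za-z]{3} ){0,3}              # language, optional extlang
      | [A-Za-z]{4}                                        # reserved for future use
      | [A-Za-z]{5,8} )                                    # registered language subtag
    (?: -[A-Za-z]{4} )?                                    # script
    (?: -(?: [A-Za-z]{2} | [0-9]{3} ) )?                   # region
    (?: -(?: [A-Za-z0-9]{5,8} | [0-9][A-Za-z0-9]{3} ) )*   # variants
    (?: -[0-9A-WY-Za-wy-z] (?: -[A-Za-z0-9]{2,8} )+ )*     # extensions
    (?: -[xX] (?: -[A-Za-z0-9]{1,8} )+ )?                  # trailing private use
""", re.VERBOSE)

# A tag may also consist of a private use sequence only.
_PRIVATEUSE = re.compile(r"[xX](?:-[A-Za-z0-9]{1,8})+")


def is_well_formed(tag: str) -> bool:
    """Return True if the whole tag matches the langtag or privateuse syntax."""
    return bool(_LANGTAG.fullmatch(tag) or _PRIVATEUSE.fullmatch(tag))


for candidate in ("en", "sl-IT-nedis", "de-CH-1901", "en-", "tr-\u0130"):
    print(repr(candidate), is_well_formed(candidate))
```

Because the pattern is purely syntactic, it accepts tags whose subtags are not
registered; telling those apart is what the validating level adds.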

We still need to think about the encoding (which RFC4646 pointedly excludes).
At first sight, the existing text and ABNF about parameter values is unaffected
(UTF-8 plus escaping for PARAM-VALUE)
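
A minimal Python sketch of that combination, assuming the escaping rule is the
one in which the double quote, the backslash and the closing square bracket are
each preceded by a backslash inside PARAM-VALUE (this should be checked against
the current draft text).  The function name escape_param_value and the
parameter name "language" are assumptions made for illustration.

```python
def escape_param_value(value: str) -> bytes:
    """Backslash-escape the three special characters, then encode as UTF-8."""
    escaped = (
        value.replace("\\", "\\\\")   # backslash first, so later escapes are not doubled
             .replace('"', '\\"')
             .replace("]", "\\]")
    )
    return escaped.encode("utf-8")


# A language tag contains none of the three special characters,
# so it passes through unchanged apart from the encoding step.
print(b'language="' + escape_param_value("en-US") + b'"')
```

For a tag restricted to the US-ASCII repertoire the UTF-8 bytes are identical
to the ASCII bytes, which is why the existing text looks unaffected.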

Tom Petch




Re: [Syslog] An early last call comment on protocol-19

2007-02-05 Thread Sam Hartman
>>>>> "tom" == tom petch <[EMAIL PROTECTED]> writes:

tom> Pity, I had hoped that David's compromise would be
tom> acceptable.  RFC4646 (the current BCP0047) is a magnificent
tom> piece of work and does enable the generator of text to
tom> specify quite precisely how it should be interpreted.  I love
tom> the differentiation between the dotted letter I of Azerbaijan
tom> and Turkey, in fact all the comments about Azerbaijani,
tom> Mongolian and Icelandic.

tom> What concerns me is conformance, what does it mean that a
tom> parameter MUST conform to this BCP or any other, an issue
tom> that has surfaced on this list before.  If we just changed
tom> the reference so that the I-D were to read "it MUST contain a
tom> two letter language identifier as defined in BCP0047 [13]"
tom> then I have no problem but this does rather negate the intent
tom> of the BCP.

First you need to remove the two-letter restriction; language tags can be
longer than two letters, but besides that, this is exactly what I think you
should do.

tom> The BCP defines two levels of conformance (s.2.2.9) and I
tom> suspect that even the lower level requires online access to
tom> the IANA website so what does a receiver of a syslog message
tom> do?  Take it as an opaque character string?  Check the ABNF?
tom> Do as RFC4646 specifies, for well-formed or validating
tom> conformance?

The lower level (well-formed) does not require online access to the registry.

Imho for syslog, receivers MAY|SHOULD  check that a tag is well-formed.
That's probably best as a MAY.



Re: [Syslog] An early last call comment on protocol-19

2007-02-05 Thread tom.petch


Tom Petch


- Original Message -
From: "Sam Hartman" <[EMAIL PROTECTED]>
To: "David Harrington" <[EMAIL PROTECTED]>
Cc: "'tom.petch'" <[EMAIL PROTECTED]>; <[EMAIL PROTECTED]>
Sent: Thursday, February 01, 2007 7:43 PM
Subject: Re: [Syslog] An early last call comment on protocol-19


> >>>>> "David" == David Harrington <[EMAIL PROTECTED]> writes:
>
> David> Hi WG, If ISO is a subset of what is covered by BCP047,
> David> then would it be acceptable to REQUIRE the ISO subset
> David> mandatory-to-implement-for-compliance for interoperability
> David> purposes, and implementations MAY support other languages
> David> in BCP047 with no assurance of interoperability with
> David> standard-compliant implementations?
>
> No, I'd really need a fairly strong justification that went through
> the languages you were not supporting and explained why that was
> appropriate for syslog.
>
> BCP 47 is by definition the IETF's best current practice for language
> tagging.  Absent a compelling reason to do something else, you should
> identify languages that way.  Tom has not (so far) presented a
> compelling reason.
>

Pity, I had hoped that David's compromise would be acceptable.  RFC4646 (the
current BCP0047) is a magnificent piece of work and does enable the generator of
text to specify quite precisely how it should be interpreted.  I love the
differentiation between the dotted letter I of Azerbaijan and Turkey, in fact
all the comments about Azerbaijani, Mongolian and Icelandic.

What concerns me is conformance, what does it mean that a parameter MUST conform
to this BCP or any other, an issue that has surfaced on this list before.  If we
just changed the reference so that the I-D were to read
"it MUST contain a two letter language identifier as defined in BCP0047 [13]"
then I have no problem but this does rather negate the intent of the BCP.

The BCP defines two levels of conformance (s.2.2.9) and I suspect that even the
lower level requires online access to the IANA website so what does a receiver
of a syslog message do?  Take it as an opaque character string?  Check the ABNF?
Do as RFC4646 specifies, for well-formed or validating conformance?

I suggest anyone considering this question look at the current online registry
as well as RFC4646, and note such comments as
'"en-a-bbb-a-ccc" is invalid'
whereas
 'the tag "en-a-bbb-x-a-ccc"' is valid; or
"sl-IT-nedis" is suitable but "it-IT-nedis" is not.  It is beautiful.
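
The first pair of examples can be checked mechanically without the registry.
A minimal Python sketch, assuming the tag has already passed a syntax check;
the function name has_repeated_singleton is an assumption made for
illustration.  The second pair (the prefix allowed for "nedis") cannot be
checked this way, since it depends on registry data.

```python
def has_repeated_singleton(tag: str) -> bool:
    """Return True if an extension singleton occurs more than once in the tag."""
    subtags = tag.lower().split("-")
    if subtags[0] == "x":
        return False              # the whole tag is private use
    seen = set()
    for subtag in subtags[1:]:    # the first subtag is the primary language
        if len(subtag) != 1:
            continue
        if subtag == "x":
            break                 # everything after "x" is private use
        if subtag in seen:
            return True
        seen.add(subtag)
    return False


print(has_repeated_singleton("en-a-bbb-a-ccc"))    # 'a' introduces two extensions
print(has_repeated_singleton("en-a-bbb-x-a-ccc"))  # second 'a' is inside private use
```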

The issue I see is conformance.  What can we expect the recipient of a syslog
message to do without placing a significant burden thereon?

Note too that RFC4646 defines a character string so we also have to specify an
encoding thereof, another issue that has surfaced on this list before.

Tom Petch





Re: [Syslog] An early last call comment on protocol-19

2007-02-01 Thread Sam Hartman
>>>>> "David" == David Harrington <[EMAIL PROTECTED]> writes:

David> Hi WG, If ISO is a subset of what is covered by BCP047,
David> then would it be acceptable to REQUIRE the ISO subset
David> mandatory-to-implement-for-compliance for interoperability
David> purposes, and implementations MAY support other languages
David> in BCP047 with no assurance of interoperability with
David> standard-compliant implementations?

No, I'd really need a fairly strong justification that went through
the languages you were not supporting and explained why that was
appropriate for syslog.

BCP 47 is by definition the IETF's best current practice for language
tagging.  Absent a compelling reason to do something else, you should
identify languages that way.  Tom has not (so far) presented a
compelling reason.




RE: [Syslog] An early last call comment on protocol-19

2007-02-01 Thread David Harrington
Hi WG,

If ISO is a subset of what is covered by BCP047, then would it be
acceptable to REQUIRE the ISO subset
mandatory-to-implement-for-compliance for interoperability purposes,
and implementations MAY support other languages in BCP047 with no
assurance of interoperability with standard-compliant implementations?

Would that satisfy the needs of both Sam and Tom and others in the WG?

Are there technical reasons why implementations MUST NOT support
BCP047, but only ISO?

David Harrington
[EMAIL PROTECTED] 
[EMAIL PROTECTED]
[EMAIL PROTECTED]


> -Original Message-
> From: Sam Hartman [mailto:[EMAIL PROTECTED] 
> Sent: Thursday, February 01, 2007 11:25 AM
> To: tom.petch
> Cc: [EMAIL PROTECTED]
> Subject: Re: [Syslog] An early last call comment on protocol-19
> 
> As I said before if you are not going to use BCP 47, you need to
> clearly explain for each class of languages BCP 47 supports and your
> application does not support why it is OK to be unable to label those
> applications.
> 





Re: [Syslog] An early last call comment on protocol-19

2007-02-01 Thread Sam Hartman
As I said before if you are not going to use BCP 47, you need to
clearly explain for each class of languages BCP 47 supports and your
application does not support why it is OK to be unable to label those
applications.



Re: [Syslog] An early last call comment on protocol-19

2007-02-01 Thread tom.petch
Because our chair proposed it and copied Sam on the e-mail? (21Nov2005)

BCP0047 is horrendously complex, way over the top for most use cases and fails
to define a simple subset when that is all the application needs.  ISO meets the
need, why make it more complex than it need be?  I would not like to see a
reference change to the BCP as it stands without a strict limit on which bits
of the BCP we are agreeing to the use of, like what ISO specifies and no more!

Tom Petch


- Original Message -
From: "Rainer Gerhards" <[EMAIL PROTECTED]>
To: "Sam Hartman" <[EMAIL PROTECTED]>
Cc: <[EMAIL PROTECTED]>
Sent: Wednesday, January 31, 2007 7:45 PM
Subject: RE: [Syslog] An early last call comment on protocol-19


Sam,

I need to check the mailing list archives and my notes, but I think
there was no technical reason to use ISO instead of BCP 47. If I do not
find anything, I'll simply change the reference. In any case, I'll post
what I find out.

Rainer

> -Original Message-
> From: Sam Hartman [mailto:[EMAIL PROTECTED]
> Sent: Wednesday, January 31, 2007 10:39 AM
> To: [EMAIL PROTECTED]
> Subject: [Syslog] An early last call comment on protocol-19
>
>
>
> I failed to write this up yesterday.
>
> Your protocol document uses ISO language identifiers rather than BCP
> 47.  Please either use BCP 47 or explain for all the language sets
> that BCP 47 can identify but your choice cannot why syslog
> implementations will not care.
>
>



RE: [Syslog] An early last call comment on protocol-19

2007-01-31 Thread Rainer Gerhards
Sam,

I need to check the mailing list archives and my notes, but I think
there was no technical reason to use ISO instead of BCP 47. If I do not
find anything, I'll simply change the reference. In any case, I'll post
what I find out.

Rainer

> -Original Message-
> From: Sam Hartman [mailto:[EMAIL PROTECTED]
> Sent: Wednesday, January 31, 2007 10:39 AM
> To: [EMAIL PROTECTED]
> Subject: [Syslog] An early last call comment on protocol-19
> 
> 
> 
> I failed to write this up yesterday.
> 
> Your protocol document uses ISO language identifiers rather than BCP
> 47.  Please either use BCP 47 or explain for all the language sets
> that BCP 47 can identify but your choice cannot why syslog
> implementations will not care.
> 
> 
