On Thursday, August 21, 2003, at 04:44 pm, Mark Davis wrote:
There is one open issue I'd like to draw people's attention to: whether to have
a narrow or broader approach to the whitespace in a pattern environment. The
narrower definition would be:
0009..000D ; Pattern_White_Space # <CHARACTER TABULATION>..<CARRIAGE RETURN (CR)>
0020 ; Pattern_White_Space # SPACE
0085 ; Pattern_White_Space # <NEXT LINE (NEL)>
200E..200F ; Pattern_White_Space # LEFT-TO-RIGHT MARK..RIGHT-TO-LEFT MARK
2028 ; Pattern_White_Space # LINE SEPARATOR
2029 ; Pattern_White_Space # PARAGRAPH SEPARATOR
while the broader one would add:
00A0 ; Pattern_White_Space # NO-BREAK SPACE
2000..200A ; Pattern_White_Space # EN QUAD..HAIR SPACE
202F ; Pattern_White_Space # NARROW NO-BREAK SPACE
205F ; Pattern_White_Space # MEDIUM MATHEMATICAL SPACE
3000 ; Pattern_White_Space # IDEOGRAPHIC SPACE
My judgement is that in a pattern environment the narrower definition would be
better. One might go so far as recommending that the others be quoted, to reduce
possible confusion when reading regular expressions, queries, or other patterns.
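
To make the difference between the two candidate sets concrete, here is a minimal Python sketch. The code point ranges are copied from the lists above; the function names and the `broad` flag are hypothetical and for illustration only, not part of the draft or of any library.

```python
# Narrow candidate set, copied from the first list in this message.
NARROW_RANGES = [
    (0x0009, 0x000D),  # CHARACTER TABULATION..CARRIAGE RETURN (CR)
    (0x0020, 0x0020),  # SPACE
    (0x0085, 0x0085),  # NEXT LINE (NEL)
    (0x200E, 0x200F),  # LEFT-TO-RIGHT MARK..RIGHT-TO-LEFT MARK
    (0x2028, 0x2028),  # LINE SEPARATOR
    (0x2029, 0x2029),  # PARAGRAPH SEPARATOR
]

# Extra code points that only the broader candidate set would add.
BROAD_EXTRA_RANGES = [
    (0x00A0, 0x00A0),  # NO-BREAK SPACE
    (0x2000, 0x200A),  # EN QUAD..HAIR SPACE
    (0x202F, 0x202F),  # NARROW NO-BREAK SPACE
    (0x205F, 0x205F),  # MEDIUM MATHEMATICAL SPACE
    (0x3000, 0x3000),  # IDEOGRAPHIC SPACE
]


def is_pattern_white_space(ch, broad=False):
    """Return True if ch is in the narrow set, or in the broad set when broad=True."""
    ranges = NARROW_RANGES + (BROAD_EXTRA_RANGES if broad else [])
    cp = ord(ch)
    return any(lo <= cp <= hi for lo, hi in ranges)


def strip_pattern_white_space(pattern, broad=False):
    """Drop every character that the chosen definition treats as pattern whitespace."""
    return "".join(c for c in pattern if not is_pattern_white_space(c, broad))
```

Under the narrow definition, a NO-BREAK SPACE inside a pattern would be kept as a literal character, while under the broad definition it would be skipped like an ordinary space. That is the case where quoting the character would make the intent visible to a reader.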
Mark
__________________________________
http://www.macchiato.com
► “Eppur si muove” ◄
----- Original Message -----
From: <[EMAIL PROTECTED]>
To: <[EMAIL PROTECTED]>
Sent: Thursday, August 21, 2003 02:44
Subject: RE: Proposed Draft UTR #31 - Syntax Characters
This notice is relevant to anyone dealing with programming languages, query
specifications, regular expressions, scripting languages, and similar
domains.
That's me.
I read the draft, and actually I was very happy with it. No complaints at
all. I am particularly happy that the mathematical letters and numbers
(1D400-1D7FF) will be permitted in identifiers. This is important because it
allows mathematical expressions and programming-language expressions to use
the same symbols (for the first time!). I also noted the comment about how
specific programming languages could, if they wished, ignore <font>
equivalences (and hence ignore the mathematical letters and numbers) - so I
guess that keeps everyone happy.
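
As a rough illustration of the <font> equivalences mentioned here, the Python sketch below uses the standard `unicodedata` module to detect characters whose compatibility decomposition is tagged `<font>` and, optionally, to fold them to their plain counterparts. The helper names are hypothetical, and folding is only one possible policy a language might choose; a language could equally leave such characters distinct.

```python
import unicodedata


def has_font_decomposition(ch):
    """True if ch has a compatibility decomposition tagged <font>."""
    return unicodedata.decomposition(ch).startswith("<font>")


def fold_font_variants(text):
    """Replace characters that have a <font> decomposition with their NFKC form.

    Characters without a <font> decomposition are left untouched.
    """
    return "".join(
        unicodedata.normalize("NFKC", c) if has_font_decomposition(c) else c
        for c in text
    )
```

For example, U+1D400 MATHEMATICAL BOLD CAPITAL A carries a <font> decomposition to U+0041, so a language that folds these variants would treat it as the same identifier character as a plain "A", while a language that does not fold would keep the two apart.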
I would have used the feedback form, but I didn't see much point as I had no
complaints.
Jill
-----Original Message-----
From: Rick McGowan [mailto:[EMAIL PROTECTED]
Sent: Wednesday, August 20, 2003 7:23 PM
To: [EMAIL PROTECTED]
Subject: Proposed Draft UTR #31 - Syntax Characters
This notice is relevant to anyone dealing with programming languages, query
specifications, regular expressions, scripting languages, and similar
domains.
The Proposed Draft UTR #31: Identifier and Pattern Syntax will be discussed
at the UTC meeting next week. Part of that document (Section 4) is a proposal
for two new immutable properties, Pattern_White_Space and Pattern_Syntax. As
immutable properties, these would not ever change once they are introduced
into the standard, so it is important to get feedback on their contents
beforehand.
The UTC will not be making a final determination on these properties at this
meeting, but it is important that any feedback on them is supplied as early
in the process as possible so that it can be considered thoroughly. The draft
is found at http://www.unicode.org/reports/tr31/ and feedback can be submitted
as described there.
Regards,
Rick McGowan
Unicode, Inc.

