Peter Kirk wrote: I wonder. Hyphen-minus is used as an operator in ranges, where it does not have the meaning minus, as in the example in UTR31 "[[:gc=s:] | [:gc=p:] | [\u2190-\u2BFF]]". If hyphen is an operator here, then maqaf probably should be too. And even if the sense is minus, I wonder if Hebrew users sometimes use maqaf for minus, at least in error.
Similarly, Hebrew geresh and gershayim look like quotation marks and are used interchangeably with them in legacy encodings.
The same goes for maqaf and hyphen: maqaf is very much the cultural equivalent of hyphen, and I have seen recent discussion about whether the hyphen key on a
Hebrew keyboard ought actually to generate a maqaf.
No, wait. The fact that maqaf is the cultural (and visual) equivalent of a hyphen is a good reason to *exclude* it from class <Pattern_Syntax>, i.e. *allow* it in identifiers, so that composite words can be used as identifiers.
As an ordinary Latin hyphen is already in the list, by your
argument there is no reason to exclude other things that
look like it and function like it.
I guess that the only reason why the ASCII '-' is included in <Pattern_Syntax> is that it is also used as "minus". If it only had the meaning "hyphen", it would not be in <Pattern_Syntax>.
_ Marco
Anyway, U+2010 HYPHEN is listed although this is explicitly not a minus sign.
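For anyone who wants to look at the characters under discussion, here is a minimal Python sketch. It only reports each character's name and General_Category and shows hyphen-minus acting as a range operator inside a character class. It does not test Pattern_Syntax membership, which the standard unicodedata module does not expose. The helper names describe and arrows_in are made up for this illustration, and U+2212 MINUS SIGN is included only for comparison.

```python
import re
import unicodedata

# Characters mentioned in the thread, plus U+2212 MINUS SIGN for comparison.
CANDIDATES = ["\u002D", "\u05BE", "\u2010", "\u2212"]


def describe(ch):
    """Return (code point, character name, General_Category) for one character."""
    return (f"U+{ord(ch):04X}", unicodedata.name(ch), unicodedata.category(ch))


def arrows_in(text):
    """Find characters in U+2190..U+2BFF.

    Inside the character class, the hyphen-minus between the two
    endpoints denotes a range, not subtraction.
    """
    return re.findall("[\u2190-\u2BFF]", text)


if __name__ == "__main__":
    for ch in CANDIDATES:
        print(*describe(ch))
    # The literal '-' in the input is not matched; only the arrow is.
    print(arrows_in("a\u2192b-c"))
```

The three hyphen-like characters share the category Pd (dash punctuation), while the minus sign is Sm (math symbol), so General_Category alone does not separate hyphen-minus from maqaf or U+2010.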
-- Peter Kirk [EMAIL PROTECTED] (personal) [EMAIL PROTECTED] (work) http://www.qaya.org/

