On Tue Mar 26 2002 - 10:01:43 EST, Mark Davis ☕️  wrote:

http://www.unicode.org/mail-arch/unicode-ml/y2002-m03/0598.html

> Apostrophe, hyphen, and various other punctuation by default continue
> a word, but this behavior may be overridden on a per-language basis.
> Heuristics or more sophisticated engines may be needed when the
> apostrophe is at the end of a word, as in “the peoples' choice”, since
> it is ambiguous. The modifier letter apostrophe, on the other hand, is
> always treated as a letter.

 

[I replaced '<' '>' with '“' '”' to prevent confusion with a tag by the user 
agent.]
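
To make the last sentence of that quote concrete, here is a minimal Python
sketch. It is only an illustration under stated assumptions: Python's regex
class \w is used as a crude stand-in for "treated as a letter"; it is not the
default word-boundary behavior Mark describes, where an apostrophe between
letters continues the word. The helper name words() is hypothetical.

```python
import re

MODIFIER_LETTER_APOSTROPHE = "\u02bc"   # general category Lm (a letter)
RIGHT_SINGLE_QUOTATION_MARK = "\u2019"  # general category Pf (punctuation)


def words(text):
    """Split text into runs of \\w characters (a naive word finder)."""
    return re.findall(r"\w+", text)


# The trailing apostrophe stays attached only when it is encoded as U+02BC.
print(words("the peoples" + MODIFIER_LETTER_APOSTROPHE + " choice"))
print(words("the peoples" + RIGHT_SINGLE_QUOTATION_MARK + " choice"))
```

With U+02BC the apostrophe is part of the second word; with U+2019 it is
dropped as punctuation, which is exactly where a heuristic would be needed.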

 

On Tue Mar 26 2002 - 11:44:28 EST, Marco Cimarosti  wrote:

http://www.unicode.org/mail-arch/unicode-ml/y2002-m03/0604.html


 

> Mark Davis wrote: 
>> Apostrophe, hyphen, and various other punctuation by default continue 
>> a word, but this behavior may be overridden on a per-language basis. 

> This may work for things such as finding word boundaries, but not for 
> identifiers. 

> According to the ID_Start and ID_Continue properties in 
> , neither 
> U+0027 (APOSTROPHE) nor U+2019 (RIGHT SINGLE QUOTATION MARK) are allowed in 
> an identifier. And this is not surprising, since they are primarily 
> quotation marks. 

> On the other hand, U+02BC (MODIFIER LETTER APOSTROPHE) is allowed in any 
> position within an identifier. Using U+02BC as the apostrophe, would allow 
> to use words such as: ,  or <'em> in identifiers. 

> But this hits against the fact that Unicode's own suggestion is to use 
> U+2019 for the apostrophe.
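
Marco's point about identifiers can be checked with a short Python sketch.
One assumption to keep in mind: Python's str.isidentifier() follows the
XID_Start and XID_Continue properties, which are close variants of the
ID_Start and ID_Continue properties he cites, so this is an approximation
and not a test against the property file itself. The names CANDIDATES and
usable_in_identifier() are made up for this illustration.

```python
import unicodedata

CANDIDATES = {
    "U+0027 APOSTROPHE": "\u0027",
    "U+2019 RIGHT SINGLE QUOTATION MARK": "\u2019",
    "U+02BC MODIFIER LETTER APOSTROPHE": "\u02bc",
}


def usable_in_identifier(ch):
    """True if ch is accepted both inside and at the start of an identifier."""
    return ("don" + ch + "t").isidentifier() and (ch + "em").isidentifier()


for label, ch in CANDIDATES.items():
    # Print the general category next to the identifier check.
    print(label, unicodedata.category(ch), usable_in_identifier(ch))
```

Only U+02BC passes, because it is the only one of the three whose general
category is a letter (Lm); the other two are punctuation (Po and Pf).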

 


On Tue Mar 26 2002 - 12:08:41 EST, Marco Cimarosti  wrote:

http://www.unicode.org/mail-arch/unicode-ml/y2002-m03/0608.html



> But, as you say, the apostrophe is legitimate and sometimes mandatory in the 
> orthography of English and many other languages. So, it seems to me that its 
> preferred encoding should make it possible to use it in identifiers, 
> filenames, URI(')s, and so on.

 

 

Don't we then fall back to the days of all-0x27, and keep facing the same 
ongoing confusion, when the English apostrophe is conflated with the 
closing quote? 


As you told us, having both U+02BC and U+2019 in use will require some 
supplemental algorithms.

But as you said in 2002, this is also true when both are merged into a 
single character.


 

I suspect that the cost of using MODIFIER LETTER APOSTROPHE for the English 
apostrophe (and for the apostrophe in general) today would mainly be the 
cost of updating implementations and text files. 


If this cost is too high, we would have to accept that text must neither be 
quoted nor converted between British and US English. I hope people will 
keep communicating and exchanging.


 

Marcel Schneider


 

 

 

 










