RE: [WSG] UTF-8 (was: Quirks mode vs Standards mode)

2005-04-19 Thread Richard Ishida
 From: [EMAIL PROTECTED] 
 [mailto:[EMAIL PROTECTED] On Behalf Of Dean Jackson
 Sent: 19 April 2005 17:12
...

  I try to avoid entities with exception for '
 
 You're right. If you're using UTF-8 you only need to encode 
 the characters that are special in HTML/XHTML/XML (<, > and &).
 Using numeric entities (or even named entities) in a UTF-8 
 file for characters that are outside the range of ASCII is 
 usually a waste of space.
 
 The only time I use them is when I'm on a keyboard/system 
 where I don't know how to enter the character, such as å. 
 I'd type &aring; in this case.
 
 PS. Hopefully the W3C i18n guru Richard is listening and will 
 tell everyone if I'm wrong.
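
To illustrate the point quoted above about escaping only the markup-significant characters, here is a minimal Python sketch. The function names and the sample string are made up for illustration; html.escape is from the Python standard library. It compares the size of the same text written with literal UTF-8 characters and with numeric character references (NCRs).

```python
import html

def escape_markup_only(text: str) -> str:
    """Escape only <, > and &; every other character stays literal."""
    # quote=False is enough for element content; attribute values
    # would also need their quote characters escaped.
    return html.escape(text, quote=False)

def escape_all_non_ascii(text: str) -> str:
    """Escape markup characters and turn every non-ASCII character into an NCR."""
    escaped = html.escape(text, quote=False)
    return escaped.encode("ascii", "xmlcharrefreplace").decode("ascii")

# Hypothetical sample text, not taken from the thread.
sample = "Smörgåsbord & <em>kaffe</em>"
literal = escape_markup_only(sample)
ncr = escape_all_non_ascii(sample)

print(literal)
print(ncr)
print(len(literal.encode("utf-8")), "bytes with literal UTF-8 characters")
print(len(ncr.encode("utf-8")), "bytes with NCRs")
```

Each accented Latin letter takes two bytes as literal UTF-8 but six or more as a decimal NCR, which is the "waste of space" mentioned in the quote.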

Hi Dean. I'd hesitate to say anyone was right or wrong here, but I'm of the
same opinion, albeit with one small exception.  I think in UTF-8
NCRs/entities beyond the ASCII range can be useful for invisible characters
(such as LRM in Arabic/Hebrew) or ambiguous characters (such as non-breaking
space - which looks like an ordinary space).
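
A rough sketch of that exception in Python, in case it helps: it leaves ordinary text alone and writes only invisible or ambiguous characters as NCRs so they can be seen in the source. Treating Unicode category Cf (format characters such as LRM and RLM) plus NO-BREAK SPACE as the set to escape is an assumption made for this example, and escape_invisible is a hypothetical name.

```python
import unicodedata

NO_BREAK_SPACE = "\u00a0"

def escape_invisible(text: str) -> str:
    """Write format characters (category Cf) and NO-BREAK SPACE as hex NCRs.

    Everything else, including non-ASCII letters, is left as literal text.
    The choice of which characters count as "invisible" is an assumption.
    """
    out = []
    for ch in text:
        if unicodedata.category(ch) == "Cf" or ch == NO_BREAK_SPACE:
            out.append(f"&#x{ord(ch):X};")
        else:
            out.append(ch)
    return "".join(out)

# U+200E is LEFT-TO-RIGHT MARK; U+00A0 is NO-BREAK SPACE.
print(escape_invisible("abc\u200e def\u00a0ghi"))
```

The Hebrew or Arabic letters around an LRM would pass through unchanged; only the mark itself becomes visible as &#x200E;.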

Tee mentioned some issues with Chinese characters on IE Mac that I haven't
got to the bottom of yet, but I don't recall encountering any other problems
that could be solved by using escapes instead.

For a fuller version of my opinion see the slides starting at
http://www.w3.org/International/tutorials/tutorial-char-enc/en/all.html#Slide0440

RI

**
The discussion list for  http://webstandardsgroup.org/

 See http://webstandardsgroup.org/mail/guidelines.cfm
 for some hints on posting to the list & getting help
**



Re: [WSG] UTF-8 (was: Quirks mode vs Standards mode)

2005-04-19 Thread Gene Falck
Hi Dean,

You wrote:

 ... Norwenglish lines of text into numeric entities
 (UTF-8) where needed.

 What characters need encoding into numeric entities when using UTF-8?
 I try to avoid entities with exception for '

It is a small nuisance, of course. I do use them
when I type (US English qwertyuiop keyboard) as I
usually don't have a place to copy and paste. It
does work quite well, though, when I copy and
paste something I used entities or the numeric
codes for into Outlook at work. Mostly at work I
use a degree sign or a plus/minus sign but there
is a lot to cover for foreign place and personal
names that is not on my keyboard.

 You're right. If you're using UTF-8 you only need to encode
 the characters that are special in HTML/XHTML/XML (<, > and &).
 Using numeric entities (or even named entities) in a UTF-8 file
 for characters that are outside the range of ASCII is usually
 a waste of space.

Does anyone have a good quick reference as to which
characters are good on UTF-8? How about a faster or
easier way to type them in? I wasn't aware (until
this thread) that there was enough space for place
name and personal name non-English characters in the
UTF-8 standard.

Regards,
Gene Falck
[EMAIL PROTECTED]


RE: [WSG] UTF-8 (was: Quirks mode vs Standards mode)

2005-04-19 Thread Richard Ishida
 From: [EMAIL PROTECTED] 
 [mailto:[EMAIL PROTECTED] On Behalf Of Gene Falck
 Sent: 19 April 2005 18:49
...
 Does anyone have a good quick reference as to which 
 characters are good on UTF-8? How about a faster or easier 
 way to type them in? 

FWIW you may find this useful for Latin characters:
http://people.w3.org/rishida/scripts/pickers/latin/

See http://people.w3.org/rishida/scripts/pickers/ for explanations and other
scripts.
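
On the question of whether there is "enough space" in UTF-8: it can encode every Unicode character, using between one and four bytes each. A small Python sketch, with place names picked only as examples:

```python
def utf8_length(ch: str) -> int:
    """Number of bytes a single character occupies when encoded as UTF-8."""
    return len(ch.encode("utf-8"))

# Example names chosen for illustration; none come from the thread.
names = ["Ålesund", "Łódź", "Москва", "東京"]

for name in names:
    encoded = name.encode("utf-8")
    # Round trip: decoding the bytes gives back exactly the same text.
    assert encoded.decode("utf-8") == name
    print(name, "->", len(name), "characters,", len(encoded), "bytes")

for ch in ["A", "å", "東", "😀"]:
    print(repr(ch), "takes", utf8_length(ch), "byte(s)")
```

ASCII stays at one byte per character, so English text is no larger in UTF-8 than in ASCII.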

RI




[WSG] UTF-8 (was: Quirks mode vs Standards mode)

2005-04-18 Thread Anders Nawroth

 HTMLTidy is the only useful piece of software I've found for web page
 development, and I use it to clean up my pages and get proper encoding
 of my Norwenglish lines of text into numeric entities (UTF-8) where
 needed.

What characters need encoding into numeric entities when using UTF-8?
I try to avoid entities with exception for '

/anders (Sweden)


Re: [WSG] UTF-8 (was: Quirks mode vs Standards mode)

2005-04-18 Thread Paul Menard
Just curious what tidy parameters you are using. I have some European
(Polish, Czech, Russian) language sites I'm working on and would prefer
to convert the UTF-8 to some numeric equivalent for certain high-range
letters.
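
For what it's worth, this kind of conversion can be sketched without Tidy. The Python below is a stand-in, not Tidy's behaviour: to_ncrs and its above parameter are hypothetical names, and the threshold for "high-range" is an assumption you would pick yourself.

```python
def to_ncrs(text: str, above: int = 0x7F) -> str:
    """Replace every character whose code point is greater than `above`
    with a decimal numeric character reference; keep the rest as-is."""
    return "".join(
        f"&#{ord(ch)};" if ord(ch) > above else ch
        for ch in text
    )

# Everything outside ASCII becomes an NCR.
print(to_ncrs("Łódź"))

# Only characters beyond Latin-1 (above U+00FF) become NCRs.
print(to_ncrs("Łódź", above=0xFF))
```

With the default threshold the output is pure ASCII, so it survives being served under almost any declared encoding, at the cost of larger and less readable source.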

Paul
--- Anders Nawroth [EMAIL PROTECTED] wrote:
 
  HTMLTidy is the only useful piece of software I've found for web page
  development, and I use it to clean up my pages and get proper encoding
  of my Norwenglish lines of text into numeric entities (UTF-8) where 
  needed.
 
 What characters need encoding into numeric entities when using UTF-8?
 
 I try to avoid entities with exception for '
 
 /anders (Sweden)



Re: [WSG] UTF-8 (was: Quirks mode vs Standards mode)

2005-04-18 Thread Kornel Lesinski
On Mon, 18 Apr 2005 18:10:44 +0100, Paul Menard [EMAIL PROTECTED]  
wrote:

 Just curious what tidy parameters you are using. I have some European
 (Polish, Czech, Russian) language sites I'm working on and would prefer
 to convert the UTF-8 to some numeric equal for certain high-range
 letters.

I couldn't get Tidy to properly transcode non-Latin-1 encodings, so I use it
with the -raw option, which at least prevents it from ruining documents.

Conversion is as easy as copy and paste: get the text displayed properly, copy
it and paste it into a UTF-8-capable editor.
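
If copy and paste is impractical for many files, the same conversion can be scripted. This is a minimal Python sketch that assumes you already know the source encoding (ISO-8859-2 here, as an example for Polish or Czech text); transcode_to_utf8 is a hypothetical helper name.

```python
def transcode_to_utf8(data: bytes, source_encoding: str) -> bytes:
    """Decode bytes from a known legacy encoding and re-encode them as UTF-8.

    Raises UnicodeDecodeError if the bytes are not valid in source_encoding,
    which is safer than silently producing garbage.
    """
    return data.decode(source_encoding).encode("utf-8")

# Simulate a legacy file: Polish sample text stored as ISO-8859-2 bytes.
legacy_bytes = "Zażółć gęślą jaźń".encode("iso-8859-2")

utf8_bytes = transcode_to_utf8(legacy_bytes, "iso-8859-2")
print(utf8_bytes.decode("utf-8"))
```

After converting, the charset declared in the page and in the HTTP headers has to say UTF-8 as well, otherwise browsers will misread the new bytes.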
--
regards, Kornel Lesiński