RE: [WSG] UTF-8 (was: Quirks mode vs Standards mode)
From: [EMAIL PROTECTED] [mailto:[EMAIL PROTECTED] On Behalf Of Dean Jackson
Sent: 19 April 2005 17:12

...
I try to avoid entities with exception for '

You're right. If you're using UTF-8 you only need to encode the characters that are special in HTML/XHTML/XML (<, > and &). Using numeric entities (or even named entities) in a UTF-8 file for characters that are outside the range of ASCII is usually a waste of space. The only time I use them is when I'm on a keyboard/system where I don't know how to enter the character, such as å. I'd type &aring; in this case.

PS. Hopefully the W3C i18n guru Richard is listening and will tell everyone if I'm wrong.

Hi Dean.

I'd hesitate to say anyone was right or wrong here, but I'm of the same opinion, albeit with one small exception. I think in UTF-8, NCRs/entities beyond the ASCII range can be useful for invisible characters (such as LRM in Arabic/Hebrew) or ambiguous characters (such as the non-breaking space, which looks like an ordinary space).

Tee mentioned some issues with Chinese characters on IE Mac that I haven't got to the bottom of yet, but I don't recall encountering any other problems that could be solved by using escapes instead.

For a fuller version of my opinion see the slides starting at http://www.w3.org/International/tutorials/tutorial-char-enc/en/all.html#Slide0440

RI

** The discussion list for http://webstandardsgroup.org/ See http://webstandardsgroup.org/mail/guidelines.cfm for some hints on posting to the list & getting help **
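The advice in this message can be sketched in Python: escape only the markup-significant characters, and optionally write invisible or ambiguous characters as numeric character references so they stay visible in the source. The function name `escape_for_utf8_html` and the exact set of characters treated as "invisible" are assumptions made for illustration; neither poster specified them.

```python
import html

# Characters that are hard to spot in source text.
# This particular set is an illustrative assumption.
INVISIBLE = {
    "\u200e",  # LEFT-TO-RIGHT MARK (LRM)
    "\u200f",  # RIGHT-TO-LEFT MARK (RLM)
    "\u00a0",  # NO-BREAK SPACE
}

def escape_for_utf8_html(text: str) -> str:
    """Escape <, > and &, and write invisible characters as hex NCRs."""
    # Escape markup characters first so the "&" of the NCRs added
    # below is not escaped a second time.
    escaped = html.escape(text, quote=False)
    return "".join(
        f"&#x{ord(ch):X};" if ch in INVISIBLE else ch
        for ch in escaped
    )

print(escape_for_utf8_html("Blåbær & <b>10\u00a0km</b>"))
```

Letters such as å pass through untouched, which only works if the page really is saved and served as UTF-8.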
Re: [WSG] UTF-8 (was: Quirks mode vs Standards mode)
Hi Dean,

You wrote:

... Norwenglish lines of text into numeric entities (UTF-8) where needed. What characters need encoding into numeric entities when using UTF-8? I try to avoid entities with exception for '

It is a small nuisance, of course. I do use them when I type (US English qwertyuiop keyboard) as I usually don't have a place to copy and paste. It does work quite well, though, when I copy and paste something I used entities or the numeric codes for into Outlook at work. Mostly at work I use a degree sign or a plus/minus sign, but there is a lot to cover for foreign place and personal names that is not on my keyboard.

You're right. If you're using UTF-8 you only need to encode the characters that are special in HTML/XHTML/XML (<, > and &). Using numeric entities (or even named entities) in a UTF-8 file for characters that are outside the range of ASCII is usually a waste of space.

Does anyone have a good quick reference as to which characters are good on UTF-8? How about a faster or easier way to type them in? I wasn't aware (until this thread) that there was enough space for place name and personal name non-English characters in the UTF-8 standard.

Regards,
Gene Falck
[EMAIL PROTECTED]
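A short Python sketch may help with both points raised here: how much space an escape costs compared with the literal character, and whether UTF-8 has room for non-English names. The sample place names are chosen only for illustration.

```python
# Compare the size of a literal character in UTF-8 with its escaped forms.
char = "å"  # U+00E5
literal_bytes = len(char.encode("utf-8"))  # bytes used by the literal
numeric_ref = f"&#{ord(char)};"            # decimal numeric character reference
named_ref = "&aring;"                      # named entity for the same character

print(literal_bytes, len(numeric_ref), len(named_ref))  # prints: 2 6 7

# UTF-8 can represent every Unicode character, so names written in
# other alphabets survive a round trip through it unchanged.
for name in ["Łódź", "Tromsø", "Москва"]:
    assert name.encode("utf-8").decode("utf-8") == name
```

The literal å takes two bytes in UTF-8, while either escaped form takes six or seven, which is the "waste of space" referred to above.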
RE: [WSG] UTF-8 (was: Quirks mode vs Standards mode)
From: [EMAIL PROTECTED] [mailto:[EMAIL PROTECTED] On Behalf Of Gene Falck
Sent: 19 April 2005 18:49

...
Does anyone have a good quick reference as to which characters are good on UTF-8? How about a faster or easier way to type them in?

FWIW you may find this useful for Latin characters: http://people.w3.org/rishida/scripts/pickers/latin/

See http://people.w3.org/rishida/scripts/pickers/ for explanations and other scripts.

RI
[WSG] UTF-8 (was: Quirks mode vs Standards mode)
HTMLTidy is the only useful piece of software I've found for web page development, and I use it to clean up my pages and get proper encoding of my Norwenglish lines of text into numeric entities (UTF-8) where needed.

What characters need encoding into numeric entities when using UTF-8? I try to avoid entities with exception for '

/anders (Sweden)
Re: [WSG] UTF-8 (was: Quirks mode vs Standards mode)
Just curious what tidy parameters you are using. I have some European (Polish, Czech, Russian) language sites I'm working on and would prefer to convert the UTF-8 to some numeric equivalent for certain high-range letters.

Paul

--- Anders Nawroth [EMAIL PROTECTED] wrote:

HTMLTidy is the only useful piece of software I've found for web page development, and I use it to clean up my pages and get proper encoding of my Norwenglish lines of text into numeric entities (UTF-8) where needed.

What characters need encoding into numeric entities when using UTF-8? I try to avoid entities with exception for '

/anders (Sweden)
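For anyone who does want numeric equivalents for the non-ASCII letters, here is a minimal Python sketch that does not rely on Tidy. It uses the standard `xmlcharrefreplace` error handler; the function name `to_ascii_with_ncrs` is an assumption for illustration. It converts every non-ASCII character rather than a chosen subset, and it leaves existing markup characters such as `<` and `&` alone.

```python
def to_ascii_with_ncrs(text: str) -> str:
    """Replace every non-ASCII character with a decimal numeric character reference."""
    # Encoding to ASCII with this error handler swaps each character
    # that ASCII cannot represent for "&#NNN;".
    return text.encode("ascii", errors="xmlcharrefreplace").decode("ascii")

print(to_ascii_with_ncrs("Łódź"))  # prints: &#321;&#243;d&#378;
```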
Re: [WSG] UTF-8 (was: Quirks mode vs Standards mode)
On Mon, 18 Apr 2005 18:10:44 +0100, Paul Menard [EMAIL PROTECTED] wrote:

Just curious what tidy parameters you are using. I have some European (Polish, Czech, Russian) language sites I'm working on and would prefer to convert the UTF-8 to some numeric equivalent for certain high-range letters.

I couldn't get Tidy to properly transcode non-latin1 encodings, so I use it with the -raw option, which at least prevents it from ruining documents.

Conversion is as easy as copy and paste: get the text displayed properly, copy it and paste it into a UTF-capable editor.

--
regards, Kornel Lesiński
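The copy-and-paste conversion described above amounts to decoding the text from its old encoding and re-encoding it as UTF-8. A minimal Python sketch of that step follows, assuming the original file is in ISO-8859-2 (a common legacy encoding for Polish and Czech; the thread does not say which encoding was actually in use). The function name `transcode_to_utf8` is hypothetical.

```python
def transcode_to_utf8(data: bytes, source_encoding: str = "iso-8859-2") -> bytes:
    """Decode bytes from a legacy encoding and re-encode them as UTF-8."""
    # source_encoding is an assumption; it must match the encoding the
    # file was really saved in, or the letters will come out wrong.
    return data.decode(source_encoding).encode("utf-8")

legacy = "Łódź".encode("iso-8859-2")  # stand-in for bytes read from an old file
print(transcode_to_utf8(legacy).decode("utf-8"))
```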