Re: [WSG] HTML special characters coding
I guess the SVG and MathML doctypes are stricter about characters, so I'll choose NCRs and UTF-8. -- Regards, Dani Iswara http://daniiswara.net/ *** List Guidelines: http://webstandardsgroup.org/mail/guidelines.cfm Unsubscribe: http://webstandardsgroup.org/join/unsubscribe.cfm Help: [EMAIL PROTECTED] ***
Re: [WSG] HTML special characters coding
I still use encoded characters in attributes sometimes, for example in alt text that needs a quote mark. I can't think of an example offhand, but I assume entities are still needed for that? -Alastair
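A minimal sketch of the alt-text case, using Python's standard `html.escape`. The helper name `img_tag`, the file name, and the alt text are hypothetical and only serve the illustration; the point is that a double quote inside a double-quoted attribute value has to become `&quot;`.

```python
from html import escape


def img_tag(src: str, alt: str) -> str:
    """Build an <img> tag whose attribute values are safe inside double quotes."""
    # quote=True also converts " and ', so the value cannot close the attribute early.
    return '<img src="{}" alt="{}">'.format(escape(src, quote=True), escape(alt, quote=True))


tag = img_tag("sign.png", 'Sign reading "Keep off the grass"')
print(tag)
```

With UTF-8 declared, only the markup-significant characters need this treatment; other characters in the alt text can stay as typed.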
Re: [WSG] HTML special characters coding
On 17 Jun 2008, at 23:46, Patrick H. Lauke wrote: "Beyond the inbuilt entities I tend to just use the characters directly in the markup and specify UTF-8 encoding. Has been working reasonably well in all modern browsers." On 18 Jun 2008, at 00:19, Andrew Cunningham wrote: "Use &amp; &nbsp; &lt; and &gt; All other characters should be actual characters." So, that would seem to be the consensus. Well, how fascinating; you learn something new every day on this list, and in this case it's making me feel really stupid because I've been encoding every non-standard character. Admittedly I'm using Coda to write my markup and that app has a very handy 'Encode entities' function that, when combined with a keyboard shortcut, simplifies it enormously. But it seems that maybe I'm just making unnecessary work for myself. I've been doing it that way thus far because I learned (during my 'teach yourself hand-written HTML/CSS' stage) that it was the 'correct' way to do it. Is this a case where the correct way is actually unnecessary? So let me see if I have this right: as long as my page declares an encoding (I use UTF-8) I don't need to encode the entities, I can just type them straight into the markup. Is that correct? Will it validate? (I normally use an XHTML 1.0 Strict doctype.) -- Rick Lecoat www.sharkattack.co.uk
RE: [WSG] HTML special characters coding
Rick Lecoat wrote: "So let me see if I have this right: as long as my page declares an encoding (I use UTF-8) I don't need to encode the entities, I can just type them straight into the markup. Is that correct?" Make sure that your whole environment is UTF-8 (your code editor, any database input forms / admin page you may have, etc). Then yes, it should all work fine. "Will it validate? (I normally use an xhtml 1.0 strict doctype)." Yes. P Patrick H. Lauke Web Editor Enterprise Development University of Salford Room 113, Faraday House Salford, Greater Manchester M5 4WT UK T +44 (0) 161 295 4779 [EMAIL PROTECTED] www.salford.ac.uk A GREATER MANCHESTER UNIVERSITY
RE: [WSG] HTML special characters coding
Can others with experience with this please confirm (or not) what Patrick has said? Thanks. Kevin
RE: [WSG] HTML special characters coding
Yes, Patrick is correct. But that is the same with any character encoding: everything needs to match up. Different scripting modules throw their own quirks into the mix. How easy or how complex it is depends on how many languages and how many writing scripts you need to support. The more diverse the linguistic content, the more important it becomes to get the internationalization architecture right. Creating a monolingual environment in Unicode is fairly routine; you just need to make sure everything is right at each step. A useful resource on migrating to Unicode is available at http://www.w3.org/International/articles/unicode-migration/ Andrew -- Andrew Cunningham Research and Development Coordinator Vicnet State Library of Victoria Australia [EMAIL PROTECTED]
Re: [WSG] HTML special characters coding
On 18/06/2008, [EMAIL PROTECTED] wrote: "Can others with experience with this please confirm (or not) what Patrick has said? Thanks." Yes, Patrick is correct. I would add one caveat. If you use UTF-8 (personally, I see no reason to use anything else), you should not use ASCII characters (hex) 81-9F / (dec) 129-159, which includes stuff like &#151; for an em dash and &#150; for an en dash. Instead, either use the character directly or use &#8212; and &#8211; for the em dash and en dash respectively. -- T. R. Valentine Your friends will argue with you. Your enemies don't care.
Re: [WSG] HTML special characters coding
On Thu, June 19, 2008 12:40 am, T. R. Valentine wrote: "Yes, Patrick is correct. I would add one caveat. If you use UTF-8 (personally, I see no reason to use anything else), you should not use ASCII characters (hex) 81-9F / (dec) 129-159, which includes stuff like &#151; for an em dash and &#150; for an en dash. Instead, either use the character directly or use &#8212; and &#8211; for the em dash and en dash respectively." My understanding is that since HTML 4.0 all numeric character references are defined in terms of the document character set. From HTML4 onwards the document character set is always Unicode, regardless of the character encoding of the document. So from HTML4 onwards en dash and em dash are &#8211; and &#8212;. You'd have to go back to HTML 3.2 for &#150; and &#151; to be considered en dash and em dash characters. And even then HTML 3.2 used ISO-8859-1 specifically, so &#150; and &#151; would be technically undefined. -- Andrew Cunningham Research and Development Coordinator Vicnet State Library of Victoria Australia [EMAIL PROTECTED]
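The numbers in this exchange can be checked with a short Python sketch. It assumes, as is commonly the case, that the 150/151 habit comes from Windows-1252 byte values; the variable names are only for illustration.

```python
import unicodedata

# Numeric character references name Unicode code points, not bytes of the page encoding.
EN_DASH = "\u2013"
EM_DASH = "\u2014"
dash_code_points = (ord(EN_DASH), ord(EM_DASH))  # the numbers to write after &#

# In Unicode, code points 150 and 151 are C1 control characters, not dashes.
c1_categories = [unicodedata.category(chr(n)) for n in (150, 151)]

# The dash meaning of 150/151 belongs to Windows-1252 *byte* values.
from_cp1252 = bytes([150, 151]).decode("cp1252")

print(dash_code_points, c1_categories, from_cp1252 == EN_DASH + EM_DASH)
```

So `&#8211;` and `&#8212;` are the references that actually name the dashes; 150 and 151 only look like dashes when treated as Windows-1252 bytes.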
Re: [WSG] HTML special characters coding
I have always used the & for ampersand. The only time I use the code is when there isn't an actual character on the keyboard, e.g. the copyright sign. I don't think it matters which one you use. ~Calvin Calvin Chan www.calvinchan.net On Tue, Jun 17, 2008 at 1:55 PM, kevin_erickson [EMAIL PROTECTED] wrote: "Hello, I am looking for advice on if the best way to code for special characters is to use the actual character or the attribute value or the alt code? i.e. for the ampersand should one use & or &amp;? Does it matter? I know that Dreamweaver automates some of this but what is the best practice? Thank you kevin"
RE: [WSG] HTML special characters coding
Hi Kevin, I use the &amp; code purely because not all browsers can read & on its own; the entity tells the browser to expect a special character, which in turn leads to a more user-friendly experience.
Re: [WSG] HTML special characters coding
On 17/06/2008, kevin_erickson [EMAIL PROTECTED] wrote: "Hello, I am looking for advice on if the best way to code for special characters is to use the actual character or the attribute value or the alt code? i.e. for the ampersand should one use & or &amp;? Does it matter? I know that Dreamweaver automates some of this but what is the best practice?" For the ampersand I always use &amp; because that was how I was taught (I even use it in URLs) and I use &nbsp; &lt; &gt; -- -- but I do not use the HTML character entity (ampersand+text+semicolon) for typing other characters, e.g. I would never use &zeta;&omega;&eta; -- I'd just type ζωη -- not only is it easier to read the markup, it takes a /lot/ less space. -- T. R. Valentine Your friends will argue with you. Your enemies don't care.
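The space claim is easy to check. A small sketch, assuming the page is saved as UTF-8, that compares the entity form with the typed characters using Python's standard `html.unescape`:

```python
from html import unescape

entity_form = "&zeta;&omega;&eta;"
literal_form = unescape(entity_form)  # named references resolve to the characters themselves

# Size of each form once the file is saved as UTF-8.
entity_bytes = len(entity_form.encode("utf-8"))
literal_bytes = len(literal_form.encode("utf-8"))

print(literal_form, entity_bytes, literal_bytes)
```

Each Greek letter takes two bytes in UTF-8, so the typed word is a third of the size of its entity spelling here.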
Re: [WSG] HTML special characters coding
kevin_erickson wrote: "Hello, I am looking for advice on if the best way to code for special characters is to use the actual character or the attribute value or the alt code? i.e. for the ampersand should one use & or &amp;? Does it matter? I know that Dreamweaver automates some of this but what is the best practice?" You're always supposed to encode & as &amp; (even in hrefs) and that's what standards compliance requires. (I use XHTML and I also want to be parseable as XML, so aside from XML's inbuilt entities of &lt; &gt; &amp; &quot; and &apos; I tend to use NCRs...). -- .Matthew Holloway http://holloway.co.nz/
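A minimal sketch of the "even in hrefs" rule, using only the Python standard library. The `/search` path and the query parameters are hypothetical; the sketch shows that the ampersand separating query parameters is written as `&amp;` in the markup and that a parser turns it back into a plain `&`.

```python
from html import escape, unescape
from urllib.parse import urlencode

# Build the real URL first: urlencode percent-encodes the & inside a value.
query = urlencode({"q": "fish & chips", "lang": "en"})
url = "/search?" + query

# Then escape the whole URL for use inside an HTML attribute.
href = escape(url, quote=True)
link = '<a href="{}">results</a>'.format(href)

print(url)
print(link)
```

Note the two layers: percent-encoding belongs to the URL, while `&amp;` belongs to the HTML that carries it.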
Re: [WSG] HTML special characters coding
Matthew Holloway wrote: "(I use XHTML and I also want to be parseable as XML, so aside from XML's inbuilt entities of &lt; &gt; &amp; &quot; and &apos; I tend to use NCRs...)." Beyond the inbuilt entities I tend to just use the characters directly in the markup and specify UTF-8 encoding. Has been working reasonably well in all modern browsers. P -- Patrick H. Lauke __ re·dux (adj.): brought back; returned. used postpositively [latin : re-, re- + dux, leader; see duke.] www.splintered.co.uk | www.photographia.co.uk http://redux.deviantart.com __ Co-lead, Web Standards Project (WaSP) Accessibility Task Force http://webstandards.org/ __
Re: [WSG] HTML special characters coding
kevin_erickson provided the following information on 18/06/2008 6:55 AM: "Hello, I am looking for advice on if the best way to code for special characters is to use the actual character or the attribute value or the alt code? i.e. for the ampersand should one use & or &amp;? Does it matter? I know that Dreamweaver automates some of this but what is the best practice?" I prefer to use the character entity reference. A great reference can be found here: http://www.digitalmediaminute.com/reference/entity/index.php Using & over &amp; will get picked up when checking your markup with the validator: http://validator.w3.org/ (although I'm not sure if this is the case with every doctype; I should check this one day). Andrew
Re: [WSG] HTML special characters coding
Use &amp; &nbsp; &lt; and &gt;. All other characters should be actual characters. Use a character encoding that contains all the characters you require. Use of NCRs and other entities should be rare occurrences for language challenged environments. Andrew
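As a rough illustration of that last point, Python's built-in `xmlcharrefreplace` error handler produces NCRs only for characters the chosen encoding cannot hold. The sample text is made up for the example.

```python
# Sample text: one Latin-1 character (é), one em dash, three Greek letters.
text = "café \u2014 ζωη"

# An encoding that contains every character needs no references at all.
utf8_bytes = text.encode("utf-8")

# A narrower encoding keeps what it can and falls back to NCRs for the rest.
latin1_bytes = text.encode("latin-1", errors="xmlcharrefreplace")

print(latin1_bytes)
```

Under ISO-8859-1 the é stays a single byte while the dash and the Greek letters become numeric references; under UTF-8 nothing needs escaping.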
Re: [WSG] HTML special characters coding
Patrick H. Lauke wrote: "Beyond the inbuilt entities I tend to just use the characters directly in the markup and specify UTF-8 encoding. Has been working reasonably well in all modern browsers." LOL, I enjoyed the wording. Considering the document character set of HTML4 is Unicode, if it can't be displayed in UTF-8 in a browser, then it can't be displayed using entities or NCRs either ;)
Re: [WSG] HTML special characters coding
Andrew Cunningham wrote: "LOL, I enjoyed the wording. Considering the document character set of HTML4 is Unicode, if it can't be displayed in UTF-8 in a browser, then it can't be displayed using entities or NCRs either ;)" Generally I agree, although one good thing about entities (including NCRs of course) is that it'll typically come up as a ? when it's unknown rather than mangled as â€™. So it'll break more gracefully. Also there can be other things involved other than the browser when writing HTML, such as bad proxies. I can't remember the name of the software but a few years ago an adblocker proxy that I installed on my parents' machine would break UTF-8 horribly... of course that's the proxy's fault but entities would work around their bug. (I don't really have strong opinions either way though) -- .Matthew Holloway http://holloway.co.nz/
Re: [WSG] HTML special characters coding
"up as a ? when it's unknown rather than mangled as â€™" has caused me trauma in the past. Now I use UTF-8, aiming to entify & and quotes as well as £ and such. Dealing with large amounts of content that's been created in a WYSIWYG editor can be quite an issue: erroneous classes, &nbsp; and so on. Also, some handle special chars better than others.
Re: [WSG] HTML special characters coding
Thank you for the good responses. Very helpful. Kevin
Re: [WSG] HTML special characters coding
Matthew Holloway wrote: "Generally I agree, although one good thing about entities (including NCRs of course) is that it'll typically come up as a ? when it's unknown rather than mangled as â€™. So it'll break more gracefully." A slight correction: NCRs by definition are always known. The question mark could indicate a number of different problems, including but not limited to a lack of appropriate fonts (although that's more likely to show a missing/.notdef glyph rather than a question mark), or the character having been mangled by a script or module on a web site's back end, etc. Seeing something like â€™ instead is a completely different story: either the HTTP header or the meta element in the web page is indicating the wrong encoding, or in some cases no encoding is declared. NCRs are defined in terms of the Document Character Set for HTML, and are thus independent of the character encoding used to display individual pages. But using the most appropriate character encoding for the document is the best approach. Each is an example of very different problems or issues with a web page, and they shouldn't be lumped in together. But as I indicated in a previous email: "Use of NCRs and other entities should be rare occurrences for language challenged environments." The reality is that some tools are very poor at handling Unicode, and NCRs are at times a necessary evil.
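The two failure modes distinguished above can be reproduced in a few lines of Python. This is only a sketch: Windows-1252 stands in for "the wrong declared encoding" and a forced ASCII conversion stands in for a back-end tool that cannot handle Unicode.

```python
original = "It\u2019s"  # contains U+2019 RIGHT SINGLE QUOTATION MARK
utf8_bytes = original.encode("utf-8")

# Wrong or missing encoding declaration: UTF-8 bytes read as Windows-1252.
mojibake = utf8_bytes.decode("cp1252")

# Lossy conversion somewhere on the back end: the character itself is destroyed.
lossy = original.encode("ascii", errors="replace").decode("ascii")

print(mojibake)
print(lossy)
```

In the first case the bytes are intact and correcting the declared encoding restores the text; in the second the character is gone for good.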
Re: [WSG] HTML special characters coding
Andrew Cunningham wrote: "A slight correction: NCRs by definition are always known." Ah, we seem to actually agree, but we're talking about what's known to different things. "Unknown" when I used it was in terms of the ability to render it successfully (known to the browser as a whole), not just in terms of expressing characters accurately (which seems to be what yours is known to). And as said, NCRs for my use are for HTML *and* XML, not just HTML. Regarding missing glyph characters like boxes, or boxes with codepages/codepoints, or ? ...different platforms and browsers display different fallbacks. Or as Wikipedia says, "Systems that do not offer a fallback font typically display black or white rectangles, question marks, or nothing at all in place of missing characters. Symbols in a fallback font can contain annotations such as the relevant Unicode block and the script system used." Entity errors vs encoding errors like â€™ are completely different errors; that was the point -- to contrast two completely different ways of encoding characters and the errors that result (â€™ vs ? vs missing glyph boxes). I have a slight preference for entities because they don't tend to get mangled by stupid non-Unicode-aware tools, but that's about it. Cheers :) -- .Matthew Holloway http://holloway.co.nz/
Re: [WSG] HTML special characters coding
I don't think this is right. It depends what language and character set you have specified the document to be in. If the character is included in the character set, there is no need to use the special code... provided the browser can read that character set... Jason On Wed, Jun 18, 2008 at 8:25 AM, Matthew Holloway [EMAIL PROTECTED] wrote: "You're always supposed to encode & as &amp; (even in hrefs) and that's what standards compliance requires. (I use XHTML and I also want to be parseable as XML, so aside from XML's inbuilt entities of &lt; &gt; &amp; &quot; and &apos; I tend to use NCRs...)."