Re: [WSG] Regarding foreign languages
> Vaska, you're still mixing those:
>
> > I think you are mixing two things which should be separated. The first problem is the language of the page (defined in the header). The second problem is how to create a non-ASCII character.
>
> He is right.

I've already identified that I will be using UTF-8. And I've accepted the use of xml:lang/lang in both the header and on the individual form elements (as necessary) - what am I still mixing on this issue? Am I missing something more obvious?

> No, the browser will. It will send the characters in the encoding (charset, not language!) of the page.

Thanks, I understand what's going on with this now. I was really just curious how it was dealt with - I don't believe it changes anything on the server end (and didn't think it would).

You mention the use of Unicode... perhaps I'm way out there on this point, but am I not allowed to assume that the user will be using Unicode to input their data? I know it's a web browser, but is there some way I can restrict their input to Unicode (the page xml:lang, that is)? If they enter something else, it likely won't work. Perhaps this is where I'm still 'mixing' things up?

v

**
The discussion list for http://webstandardsgroup.org/
See http://webstandardsgroup.org/mail/guidelines.cfm
for some hints on posting to the list & getting help
**
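On the question of restricting input to Unicode: a form cannot really enforce this, but the receiving end can check what arrived. The following is only a rough sketch in Python, not something from the thread; the function name `decode_utf8_or_none` is made up for illustration, and it assumes the server sees the raw bytes of the submitted field. It accepts bytes that are well-formed UTF-8 and rejects bytes that came in under another charset.

```python
def decode_utf8_or_none(raw: bytes):
    """Return the decoded text if raw is well-formed UTF-8, else None.

    Hypothetical helper for illustration; not part of any framework.
    """
    try:
        return raw.decode("utf-8", errors="strict")
    except UnicodeDecodeError:
        return None


# The same Chinese text, sent under two different charsets.
as_utf8 = "中文".encode("utf-8")
as_gb2312 = "中文".encode("gb2312")

print(decode_utf8_or_none(as_utf8))    # accepted: the text comes back intact
print(decode_utf8_or_none(as_gb2312))  # rejected: None

# A French é sent as ISO-8859-1 is a single byte that is not valid UTF-8.
print(decode_utf8_or_none("é".encode("iso-8859-1")))  # rejected: None
```

Note that nothing here looks at lang or xml:lang; the check is purely about the charset of the bytes.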
Re: [WSG] Regarding foreign languages
Vaska, you're still mixing those:

> I think you are mixing two things which should be separated. The first problem is the language of the page (defined in the header). The second problem is how to create a non-ASCII character.

He is right.

> It is a tricky business because for a French typist I can use entities and change an é into &eacute;

It's wise to use a codepage that contains this character, or better, UTF.

> but with Chinese everything comes up unreadable (as you've mentioned)

Even when using Unicode?

> There will be a situation where one page will have the header encoding in ZH and an input/text field as EN-US. I'm pretty sure that the field itself won't establish the language parameters that go into the field - the operating system will.

No, the browser will. It will send the characters in the encoding (charset, not language!) of the page.

> One thing I don't understand though, is at what point does the computer actually use the xml:lang attribute? At the input (client-side)? When it gets to the server/table (server-side)? I can type any language I want into the textarea, but what comes out can vary...

The 'lang' attribute is mostly for screen readers, CSS language tools and some processing applications. It doesn't determine how characters are input, printed or transferred. That's the job of the charset.

> What, where, which formats do I use and stick with if the idea is to support just about any language that's out there (theoretically)?

Some Unicode - I don't know how it works with Asian/Arabic/Hebrew - whether UTF-8, 16 or 32, what about the endianness etc. ...

-- 
Jan Brasna aka JohnyB :: www.alphanumeric.cz | www.janbrasna.com
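On the open question about UTF-8, 16 or 32 and byte order: a small Python sketch (an illustration added here, not from the thread; the sample words and the names `SAMPLES`, `ENCODINGS` and `byte_lengths` are assumptions) suggests that all of these encoding forms carry Chinese, Arabic and Hebrew text equally well. They differ in how many bytes they use and in what order the bytes are written.

```python
SAMPLES = {
    "French": "é",
    "Chinese": "中文",
    "Arabic": "سلام",
    "Hebrew": "שלום",
}
ENCODINGS = ("utf-8", "utf-16-le", "utf-16-be", "utf-32-le", "utf-32-be")


def byte_lengths(text):
    """Return the encoded length in bytes of text for each encoding form."""
    return {name: len(text.encode(name)) for name in ENCODINGS}


for language, text in SAMPLES.items():
    # Every encoding form round-trips the same characters without loss.
    assert all(text.encode(name).decode(name) == text for name in ENCODINGS)
    print(language, byte_lengths(text))

# Byte order (endianness) only changes the order of bytes inside a code unit.
print("中".encode("utf-16-le").hex())  # 2d4e
print("中".encode("utf-16-be").hex())  # 4e2d
# UTF-8 is a plain byte sequence, so it has no byte-order variants.
print("中".encode("utf-8").hex())      # e4b8ad
```

For a web form, picking UTF-8 for the page sidesteps the byte-order question entirely.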
Re: [WSG] Regarding foreign languages
Believe me, I'm listening to what y'all are telling me.

It is a tricky business because for a French typist I can use entities and change an é into &eacute; but with Chinese everything comes up unreadable (as you've mentioned). I think this is going to end up being a case-by-case scenario - is this how it's done? Unfortunately, even though I worked in a translation office for some time, I don't have much experience at this end of things (I only did design back then, no programming).

While I can smoothly transition the insertion of xml:lang and lang attributes into my form elements as needed, it just doesn't feel like it's the right thing to do (of course, this feeling has no basis in reality).

There will be a situation where one page will have the header encoding in ZH and an input/text field as EN-US. I'm pretty sure that the field itself won't establish the language parameters that go into the field - the operating system will. Aside from hoping that the user will understand what needs to go into the field, I'm confused about how this will work. Or perhaps this is purely a design/usability issue.

One thing I don't understand though, is at what point does the computer actually use the xml:lang attribute? At the input (client-side)? When it gets to the server/table (server-side)? I can type any language I want into the textarea, but what comes out can vary...

And one more thing, my language declaration (in the header)... I've seen so many different kinds and read a few articles on the subject but I don't know exactly where to go on this:

en
en-us
en-gb
zh
zh-hans
etc.

What, where, which formats do I use and stick with if the idea is to support just about any language that's out there (theoretically)?

Thanks for the help...v

On Jun 2, 2005, at 10:46 PM, Juergen Auer wrote:

> On 2 Jun 2005 at 16:49, Vaska.WSG wrote:
>
> > It's for a multilanguage site and base language will be English. Everything on the form will be English except the actual input (textarea).
>
> Hello Vaska, I think you are mixing two things which should be separated. The first problem is the language of the page (defined in the header) or the language of a block (defined like http://www.sql-und-xml.de/
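On the remark that entities work for French but not for Chinese: named entities such as &eacute; only exist for a limited set of characters, but numeric character references exist for every Unicode character. The sketch below is a Python illustration added to this thread, not something a poster wrote; the function name `to_ascii_with_references` is hypothetical. It turns any text into pure ASCII with numeric references and then decodes it again.

```python
import html


def to_ascii_with_references(text: str) -> str:
    """Replace every non-ASCII character with a numeric character reference."""
    return text.encode("ascii", errors="xmlcharrefreplace").decode("ascii")


print(to_ascii_with_references("é"))     # &#233;
print(to_ascii_with_references("中文"))  # &#20013;&#25991;

# Named and numeric references decode back to the same characters.
print(html.unescape("&eacute; &#20013;&#25991;"))  # é 中文
```

This is mainly useful for markup that must stay ASCII-only. With a UTF-8 page the characters can simply be written as they are.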
Re: [WSG] Regarding foreign languages
Vaska

> - will utf-8 suffice?

Yes - however, as Peter has pointed out, you may need to consider server-side aspects.

> - do I need to specify <html xmlns="http://www.w3.org/1999/xhtml"
> xml:lang='en' lang='en'> as ZH? is it necessary? Isn't utf-8 good
> enough?

You should specify the lang in the html element/s containing the text. For example, this page is in English but contains divs with other languages (FF will get XHTML 1.1 and MSIE XHTML 1.0):

http://www.orbital.co.nz/anx/index.cfm/1,10,96,html

Johan

> Original Message
> From: Vaska.WSG <[EMAIL PROTECTED]>
> To: wsg@webstandardsgroup.org
> Date: Fri, Jun-3-2005 2:05 AM
> Subject: [WSG] Regarding foreign languages
>
> Am I allowed to ask about non-CSS things here?
>
> In particular, I'm trying to deal with how to handle inputs of Chinese
> characters via some forms. What I'm wondering is...
>
> - will utf-8 suffice?
> - do I need to specify <html xmlns="http://www.w3.org/1999/xhtml"
> xml:lang='en' lang='en'> as ZH? is it necessary? Isn't utf-8 good
> enough?
>
> And further, I'm not sure how to handle Chinese text on the validation
> end of things, but this might be a subject for a different list
> altogether.
>
> I'll eventually have to deal with some other languages but Chinese will
> likely be one of the more difficult ones.
>
> ???
>
> I'll see what happens when I send this...v
RE: [WSG] Regarding foreign languages
Hi Vaska,

> Am I allowed to ask about non-CSS things here?

WSG is not just a CSS list. Your question is entirely appropriate as it is dealing with firm Web Standards.

> In particular, I'm trying to deal with how to handle inputs
> of Chinese
> characters via some forms. What I'm wondering is...

One thing you need to watch is what the application or web server is expecting from the form. In ColdFusion MX there are times when you have to tell the ColdFusion server to expect a certain encoding from form posts. E.g.

setencoding("form", "UTF-8");

I don't know whether PHP and other server-side languages have this need or whether they work it out themselves. Not that I want to start that discussion here (as server-side technology is off topic), but as a concept it may well be part of the problem, and I just wanted to add it to your debug process.

Peter
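To illustrate the same concept outside ColdFusion, here is a minimal Python sketch using only the standard library (an addition for illustration, not from the thread; the field name `comment` is an assumption). A form post carries percent-encoded bytes, and the server only gets the original characters back if it decodes them with the charset the page used.

```python
from urllib.parse import parse_qs, urlencode

# Roughly what a browser sends for a UTF-8 page: the field value is
# percent-encoded from its UTF-8 bytes.
body = urlencode({"comment": "中文"}, encoding="utf-8")
print(body)  # comment=%E4%B8%AD%E6%96%87

# Server side: decoding with the matching charset recovers the text.
right = parse_qs(body, encoding="utf-8")["comment"][0]
print(right)  # 中文

# Decoding the same bytes as ISO-8859-1 does not fail, it just
# produces six unrelated characters instead of two Chinese ones.
wrong = parse_qs(body, encoding="iso-8859-1")["comment"][0]
print(len(right), len(wrong))  # 2 6
```

The wrong-charset case is the usual source of "everything comes up unreadable": nothing errors out, the bytes are simply interpreted under the wrong charset.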
Re: [WSG] Regarding foreign languages
On 2 Jun 2005 at 16:49, Vaska.WSG wrote:

> It's for a multilanguage site and base language will be English.
> Everything on the form will be English except the actual input
> (textarea).

Hello Vaska,

I think you are mixing two things which should be separated. The first problem is the language of the page (defined in the header) or the language of a block (defined like http://www.sql-und-xml.de/
Re: [WSG] Regarding foreign languages
You need both because the language of the page and the encoding of the characters in the document are different things. UTF-8 does not tell you which language you're using, and the language attributes do not exist for the purpose of rendering characters correctly.

A page in UTF-8 could be in any language; the encoding doesn't tell you which. But the language attribute(s) are used for other things. Since you can select them with CSS, a web browser can apply regionalised quotation marks to blocks of a document if you've declared the language. A screen reader will use different libraries to read different languages, too. There are a variety of 'beyond the browser' uses for the attribute.

For further reference:

You also use both "lang" and "xml:lang" in XHTML transitional for backward compatibility with HTML4, whilst in strict mode "xml:lang" is all you need.

In your case, since the page is mostly in English I would have lang="en" in the html element, and as you suggest, put lang="zh" in the relevant form elements or parent containers as necessary.

Ben

On 6/2/05, Vaska. WSG <[EMAIL PROTECTED]> wrote:
> It's for a multilanguage site and base language will be English.
> Everything on the form will be English except the actual input
> (textarea). Would it hurt anything if I just kept the lang declaration
> as EN in the header? Or, since the input will be Chinese should it be
> ZH? Or, do I need to be more specific and declare lang=ZH on the
> textarea itself?
>
> I was wondering though...since it's ALL utf-8 it might not be necessary
> to declare lang=whatever at all?
>
> Out of curiosity, I'm not sure why we need to declare lang and
> xml:lang since utf-8 (I believe) is all we really need?
>
> On Jun 2, 2005, at 4:21 PM, Ben Ward wrote:
>
> > The language in your html element should be the language of the page.
> > If you have a section of the page (be that a paragraph, form,
> > anything) which uses a different language then you can add a lang and
> > xml:lang attribute to that as well. HTML is generally rather good at
> > doing multi-lingual documents.
> >
> > I could do this on a page (this is condensed down and is missing some
> > attributes, but I just want to show the xml:lang/lang behaviour):
> >
> > The language declaration doesn't restrict the characters you can use
> > in forms, regardless. So you don't need to add a language attribute to
> > your sub-elements unless you are explicitly requiring Chinese input.
> > Obviously if it's an all Chinese site then it would make sense to
> > change the language value in the html element itself.
> >
> > Ben

-- 
http://www.ben-ward.co.uk
Re: [WSG] Regarding foreign languages
It's for a multilanguage site and base language will be English. Everything on the form will be English except the actual input (textarea). Would it hurt anything if I just kept the lang declaration as EN in the header? Or, since the input will be Chinese should it be ZH? Or, do I need to be more specific and declare lang=ZH on the textarea itself?

I was wondering though... since it's ALL utf-8 it might not be necessary to declare lang=whatever at all?

Out of curiosity, I'm not sure why we need to declare lang and xml:lang since utf-8 (I believe) is all we really need?

On Jun 2, 2005, at 4:21 PM, Ben Ward wrote:

> The language in your html element should be the language of the page. If you have a section of the page (be that a paragraph, form, anything) which uses a different language then you can add a lang and xml:lang attribute to that as well. HTML is generally rather good at doing multi-lingual documents.
>
> I could do this on a page (this is condensed down and is missing some attributes, but I just want to show the xml:lang/lang behaviour):
>
> The language declaration doesn't restrict the characters you can use in forms, regardless. So you don't need to add a language attribute to your sub-elements unless you are explicitly requiring Chinese input. Obviously if it's an all Chinese site then it would make sense to change the language value in the html element itself.
>
> Ben
Re: [WSG] Regarding foreign languages
The language in your html element should be the language of the page. If you have a section of the page (be that a paragraph, form, anything) which uses a different language then you can add a lang and xml:lang attribute to that as well. HTML is generally rather good at doing multi-lingual documents.

I could do this on a page (this is condensed down and is missing some attributes, but I just want to show the xml:lang/lang behaviour):

The language declaration doesn't restrict the characters you can use in forms, regardless. So you don't need to add a language attribute to your sub-elements unless you are explicitly requiring Chinese input. Obviously if it's an all Chinese site then it would make sense to change the language value in the html element itself.

Ben
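The markup from this message did not survive in the archive. The following is not a reconstruction of it, only a hedged Python sketch of the xml:lang/lang behaviour it describes. The XHTML fragment, its ids and the helper `effective_languages` are all assumptions made for illustration. The page declares English on the html element, one textarea overrides it with Chinese, and everything else inherits from its parent.

```python
import xml.etree.ElementTree as ET

# ElementTree exposes xml:lang under the predefined XML namespace.
XML_LANG = "{http://www.w3.org/XML/1998/namespace}lang"

# Hypothetical page: English overall, one Chinese input field.
PAGE = """<html xmlns="http://www.w3.org/1999/xhtml" xml:lang="en" lang="en">
  <body>
    <p id="intro">Please leave a comment.</p>
    <form action="/comment" method="post">
      <p><textarea id="comment" name="comment" xml:lang="zh" lang="zh"
                   rows="4" cols="40"></textarea></p>
    </form>
  </body>
</html>"""


def effective_languages(root):
    """Map each element id to the language it declares or inherits."""
    found = {}

    def walk(element, inherited):
        # An element's own xml:lang wins; otherwise the parent's applies.
        lang = element.get(XML_LANG, inherited)
        if element.get("id"):
            found[element.get("id")] = lang
        for child in element:
            walk(child, lang)

    walk(root, None)
    return found


langs = effective_languages(ET.fromstring(PAGE))
print(langs)  # {'intro': 'en', 'comment': 'zh'}
```

The declared language is metadata for tools such as screen readers and CSS selectors. It places no limit on which characters a user can type into the textarea.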