Re: [WSG] Regarding foreign languages
Vaska, you're still mixing those:

> I think you are mixing two things which should be separated. The first problem is the language of the page (defined in the header). The second problem is how to create a non-ASCII character.

He is right.

> It is a tricky business because for a French typist I can use entities and change an é into &eacute;

It's wise to use a code page that contains this character, or better, UTF.

> but with Chinese everything comes up unreadable (as you've mentioned)

Even when using Unicode?

> There will be a situation where one page will have the header encoding in ZH and an input/text field as EN-US. I'm pretty sure that the field itself won't establish the language parameters that go into the field - the operating system will.

No, the browser will. It will send the characters in the encoding (charset, not language!) of the page.

> One thing I don't understand though, is at what point does the computer actually use the xml:lang attribute? At the input (client-side)? When it gets to the server/table (server-side)? I can type any language I want into the textarea, but what comes out can vary...

The 'lang' attribute is mostly for screen readers, CSS language tools and some processing applications. It doesn't determine how characters are input, printed or transferred. That is the job of the charset.

> What, where, which formats do I use and stick with if the idea is to support just about any language that's out there (theoretically)?

Some Unicode encoding - I don't know how it works with Asian/Arabic/Hebrew - whether UTF-8, 16 or 32, what about endianness, etc. ...

--
Jan Brasna aka JohnyB :: www.alphanumeric.cz | www.janbrasna.com

** The discussion list for http://webstandardsgroup.org/ See http://webstandardsgroup.org/mail/guidelines.cfm for some hints on posting to the list & getting help **
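To make the charset-versus-entity distinction in this message concrete, here is a minimal Python sketch. It is not from the thread: the sample character and the variable names are assumptions chosen for illustration. It shows the same character written as UTF-8 bytes, as ISO-8859-1 bytes, and as a numeric character reference, and what happens when bytes are decoded with the wrong charset.

```python
# Assumption: one accented character stands in for any non-ASCII text.
text = "é"

utf8_bytes = text.encode("utf-8")        # two bytes in UTF-8
latin1_bytes = text.encode("latin-1")    # one byte in ISO-8859-1

# A numeric character reference works regardless of the page charset.
entity = text.encode("ascii", "xmlcharrefreplace").decode("ascii")

# Decoding UTF-8 bytes with the wrong charset gives the familiar
# "unreadable" output; the lang attribute plays no part in this.
mojibake = utf8_bytes.decode("latin-1")

# A single-byte Western code page simply has no slot for Chinese.
try:
    "中".encode("latin-1")
    chinese_fits_latin1 = True
except UnicodeEncodeError:
    chinese_fits_latin1 = False

print(utf8_bytes, latin1_bytes, entity, mojibake, chinese_fits_latin1)
```

Nothing here depends on the language declared for the page, which is the point being made above: the charset governs the bytes, and lang only describes the content.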
Re: [WSG] Regarding foreign languages
> Vaska, you're still mixing those:
>
> > I think you are mixing two things which should be separated. The first problem is the language of the page (defined in the header). The second problem is how to create a non-ASCII character.
>
> He is right.

I've already identified that I will be using UTF-8. And I've accepted use of xml:lang/lang, both in the header and on the individual form elements (as necessary) - what am I still mixing on this issue? Am I missing something more obvious?

> No, the browser will. It will send the characters in the encoding (charset, not language!) of the page.

Thanks, I understand what's going on with this now. I was really just curious how it was dealt with - I don't believe it changes anything on the server end (and didn't think it would).

You mention the use of Unicode... perhaps I'm way out there on this point, but am I not allowed to assume that the user will be using Unicode to input their data? I know it's a web browser, but is there some way I can restrict their input to Unicode (the page xml:lang that is)? If they enter something else, it likely won't work. Perhaps this is where I'm still 'mixing' things up?

v
Re[4]: [WSG] Regarding foreign languages
Patrick!

On Thursday, 2 June 2005 at 18:11:30, you wrote:

> > I agree with you in all points but this one. Even in XHTML 1.0 the lang attribute is needed.
>
> At the risk of splitting very fine hairs even further: *needed* or *allowed*? I'd tend to think the latter...

You are right! *needed* was not meant to say mandatory. Maybe we should say *allowed* and *recommended*?

Martin.
[WSG] Making PDF and Word files accessible
Hello all,

I have the task of adding a bunch of PDF and Word files to a web site I work on, which currently conforms to WAI Priority 1 guidelines.

My first question: if I convert the PDF files to HTML to make them more accessible, am I right in thinking that this is only half my job done? If the original file wasn't marked up correctly in the first place before being saved as PDF (with headings, etc.), does this mean that it's still not really accessible?

Secondly, with the Word documents, is there an easier way to convert them to HTML? At the moment I am saving as HTML from Word, taking them into Dreamweaver and using 'Clean up Word HTML'. After that I use 'Find and replace' to strip out all font and span tags, and attributes from p such as class and style. At which point I still have to mark up the document with proper headings, bulleted lists, etc. A little time-consuming and fiddly to say the least!

Am I doing this right or is there another way to make these files accessible? (and make my life easier, after all it is Friday :-) )

Angela

Angela Galvin
Worth Media
15-17 Middle Street
Brighton BN1 1AL
T: 01273 201149
F: 01273 710004
www.worthmedia.net
Re: [WSG] Making PDF and Word files accessible
At 05:36 AM 6/3/2005, you wrote:

> [snip]
>
> Secondly, with the Word documents, is there an easier way to convert them to HTML? At the moment I am saving as HTML from Word, taking them into Dreamweaver and using 'Clean up Word HTML'. After that I use 'Find and replace' to strip out all font and span tags, and attributes from p such as class and style. At which point I still have to mark up the document with proper headings, bulleted lists, etc. A little time-consuming and fiddly to say the least!
>
> Am I doing this right or is there another way to make these files accessible? (and make my life easier, after all it is Friday :-) )
>
> Angela
>
> Angela Galvin Worth Media 15-17 Middle Street Brighton BN1 1AL T: 01273 201149 F: 01273 710004 - www.worthmedia.net

I would skip the part where you save from Word into HTML. Why give yourself the grief?

If you copy and paste the text into the 'content' part of your standard page, the line breaks will show you where the paragraphs and headings are. I'm using Homesite, so I just select and repeat the similar code (first p, then h1, h2, etc.) from one end of the document to the other.

Generally the only thing missing then is the use of bold and italic within the text (not part of the heading structure) and any tables or lists within the text.

Validate to catch any stray weirdness and on to the next.

Perhaps not the most interesting type of web coding, but listening to music of your taste, you can work up a good rhythm and code a whack of stuff relatively cleanly. Not a bad way to spend a Friday.

Mary Krieger
Winnipeg Manitoba Canada
http://www.mts.net/~mkrieger
Re: [WSG] Making PDF and Word files accessible
Angela Galvin wrote:

> Hello all,
>
> I have the task of adding a bunch of PDF and Word files to a web site I work on, which currently conforms to WAI Priority 1 guidelines.
>
> My first question: if I convert the PDF files to HTML to make them more accessible, am I right in thinking that this is only half my job done? If the original file wasn't marked up correctly in the first place before being saved as PDF (with headings, etc.), does this mean that it's still not really accessible?
>
> Secondly, with the Word documents, is there an easier way to convert them to HTML? At the moment I am saving as HTML from Word, taking them into Dreamweaver and using 'Clean up Word HTML'. After that I use 'Find and replace' to strip out all font and span tags, and attributes from p such as class and style. At which point I still have to mark up the document with proper headings, bulleted lists, etc. A little time-consuming and fiddly to say the least!
>
> Am I doing this right or is there another way to make these files accessible? (and make my life easier, after all it is Friday :-) )
>
> Angela

Hi Angela,

No easy way, but the most reliable is to cut and paste from Word into the design view of Dreamweaver. Using the design view ensures that all the spacing is preserved and that all the quotes etc. are presented as the correct codes. I didn't know this myself until recently, when someone on this list told me about it.

Hope this helps,

--
Bob McClelland
Cornwall (U.K.)
www.gwelanmor-internet.co.uk
Re: [WSG] Making PDF and Word files accessible
On Fri, 2005-06-03 at 06:36, Angela Galvin wrote:

> Secondly, with the Word documents, is there an easier way to convert them to HTML?

I use an open source program, antiword, to convert the Word docs to text and then just add the necessary markup. (And, of course, edit out the Word weirdness!) I've found this to be about 5 times faster than cut and paste.

This is on a Linux box, but a Windows version of antiword seems to be available at http://www.informatik.uni-frankfurt.de/~markus/antiword/

George
RE: [WSG] Making PDF and Word files accessible
Mary Krieger wrote:

> If you copy and paste the text into the 'content' part of your standard page, the line breaks will show you where the paragraphs and headings are. I'm using Homesite, so I just select and repeat the similar code (first p, then h1, h2, etc.) from one end of the document to the other.

Depending on your version of MS Office, copying from displayed text may bring in a bunch of inline styles. Yes, even pasting into a text document! Ack!

So, I usually save Word files as plain text (no line breaks) first. Next I use a good text editor with regular expression searching (I use TextPad, there are many others) to wrap text chunks in paragraph tags (e.g. ^ is the beginning of a line, $ is the end, \n is a newline, etc.). And last, I do a search and replace for weird apostrophes, quotes, dashes, etc.

> Generally the only thing missing then is the use of bold and italic within the text (not part of the heading structure) and any tables or lists within the text.

If you save as text, you'll still have tabs and funky characters for lists, which can also be regular expression searched and replaced with the right tags.

I actually create a batch action for each contributor role that regularly sends me Word documents, which does most of the standard searches one after another (and in the right order, which I can screw up if it's been a while) with the press of a hotkey. This allows me to include foreign characters for certain contributors, em dashes for others, different list designators for Macs vs. PCs, etc.

The newest Acrobat (7 Pro) also exports to plain text quite effectively... not just RTF. It ostensibly offers an HTML with CSS option, but uses inline styles extensively, so the plain text route is more efficient.

Jona Decker
Madison, WI USA
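The regular expression workflow described above can be sketched in a few lines of Python. This is an illustration only, not the TextPad batch action from the message: the function name, the replacement table and the choice to wrap every non-empty line in a paragraph are all assumptions.

```python
import html
import re

# Hypothetical replacement table; extend it per contributor as needed.
PUNCTUATION = {
    "\u2018": "'", "\u2019": "'",    # curly apostrophes / single quotes
    "\u201c": '"', "\u201d": '"',    # curly double quotes
    "\u2013": "-", "\u2014": "--",   # en dash and em dash
}

def text_to_paragraphs(text):
    """Wrap each non-empty line of plain text in <p> tags."""
    for fancy, plain in PUNCTUATION.items():
        text = text.replace(fancy, plain)
    # Escape &, < and > so the text is safe inside markup.
    text = html.escape(text, quote=False)
    # With re.MULTILINE, ^ and $ match at the start and end of each line.
    return re.sub(r"^[ \t]*(\S.*?)[ \t]*$", r"<p>\1</p>", text,
                  flags=re.MULTILINE)

sample = "First \u2018line\u2019\n\nSecond & last\n"
print(text_to_paragraphs(sample))
```

Headings and lists would still need their own patterns, applied in a fixed order, which is the reason given above for keeping the searches in a batch.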
[WSG] Re: Ten questions for Russ
Ha! The shoe's on the other foot, eh Russ?

Good show Maxine,

~d

--
Douglas Clifton
[EMAIL PROTECTED]
http://loadaveragezero.com/
Re: [WSG] Ten questions for Russ
Douglas Clifton wrote:

> Ha! The shoe's on the other foot, eh Russ?

I can't believe the WCAG 1.0 Guidelines and Checkpoints for Flash link in section 6 goes to a .swf file. o_O

--
Love does not demand its own way. 1 Corinthians 13:5
Team OS/2 ** Reg. Linux User #211409
Felix Miata *** http://members.ij.net/mrmazda/auth/
Re: [WSG] Ten questions for Russ
Completely true - the irony!

The original post is here:
http://www.markme.com/accessibility/archives/007344.cfm

Unfortunately, it too goes off to the same Flash file.

Russ

> I can't believe the WCAG 1.0 Guidelines and Checkpoints for Flash link in section 6 goes to a .swf file. o_O
Re: [WSG] Making PDF and Word files accessible
Hi there,

> My first question: if I convert the PDF files to HTML to make them more accessible, am I right in thinking that this is only half my job done? If the original file wasn't marked up correctly in the first place before being saved as PDF (with headings, etc.), does this mean that it's still not really accessible?

As an extremely broad generalisation, yes - bad source gets bad output. However, every case is different, so you'll have to check your resulting (X)HTML to make sure it's standards compliant and accessible.

> Secondly, with the Word documents, is there an easier way to convert them to HTML? At the moment I am saving as HTML from Word, taking them into Dreamweaver and using 'Clean up Word HTML'.

Try http://textism.com/wordcleaner/

I've found it's pretty good, especially in conjunction with the DW tricks you mention.

If you have a large amount of this sort of work, you might like to invest in http://cita.disability.uiuc.edu/software/office/

cheers
h

--
--- http://www.200ok.com.au/
--- The future has arrived; it's just not
--- evenly distributed. - William Gibson
Re: [WSG] Ten questions for Russ
Russ,

One of the topics you discuss is your stance on the XHTML vs HTML debate. Your links support your stance -- I've read these before, and find them interesting and insightful; however, they are trying to convince the reader of their point and I prefer a balanced argument.

In looking for articles on the other side of the argument, I quickly found myself swamped in a mountain of words, some rational, some rabid. It would take days to wade through. Do you know of a place or article that is a good roundup of the arguments, presented neutrally or balanced so the reader can assess his/her position and decide accordingly?

Right now I'm serving HTML (text/html Content-type headers), but as a coding practice we code as XHTML 1.0 Strict (for the cleanliness of the code, not for the XML properties). For the sake of validation as a coding tool, we need to put in the XHTML DOCTYPE, but I'd like to serve the HTML DOCTYPE in order to match our Content-type headers. Perhaps some automated scripty thing.

But that is neither here nor there. What I really want to do is weigh what I regard as our shop's personal coding standards against a roundup of these arguments to see where we stand.

Any pointers?

--
Ben Curtis : webwright
bivia : a personal web studio
http://www.bivia.com
v: (818) 507-6613
Re: [WSG] Ten questions for Russ
Russ wrote:

[quote]
At the risk of being burned at the stake, I think that unless you are willing to serve your pages as application/xhtml+xml with content negotiation, then you are probably better off staying with HTML 4.01 at this time.
[/quote]

Let me be the first to gather the kindling :-)

The whole MIME debate started with Ian Hickson. Let me summarize his argument: if you author bad XHTML and serve it up as HTML, you won't know that you have invalid XHTML, and you will blame XHTML when you find out. Sorry, this is not a valid argument. This is fear mongering.

For more advocacy along the same lines from Ian, have a read of this:
http://www.hixie.ch/advocacy/xslt

This article advocates the use of Python, Perl, JavaScript, C++ and a DOM parser to do transformations, in preference to XSLT. This clearly shows that Ian's knowledge of the subject is academic. Anyone familiar with the benefits of XSLT will get a good laugh from this short article.

Regards,
-Vlad
http://xstandard.com
Standard-compliant XHTML WYSIWYG Editor
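For readers unsure what "content negotiation" means in the quoted remark, here is a rough Python sketch of the idea. It is an assumption-laden illustration, not code from anyone in the thread: the function name is hypothetical, and the logic is deliberately simplified to "send application/xhtml+xml only when the browser's Accept header lists it with a non-zero q value, otherwise fall back to text/html". It does not rank competing q values.

```python
def choose_content_type(accept_header):
    """Pick a Content-Type for an XHTML page from an HTTP Accept header."""
    for item in accept_header.split(","):
        parts = [p.strip() for p in item.split(";")]
        media_type = parts[0].lower()
        q = 1.0  # a missing q parameter means full preference
        for param in parts[1:]:
            name, _, value = param.partition("=")
            if name.strip().lower() == "q":
                try:
                    q = float(value)
                except ValueError:
                    q = 0.0  # treat a malformed q value as "not accepted"
        if media_type == "application/xhtml+xml" and q > 0:
            return "application/xhtml+xml"
    # Browsers that do not advertise XHTML support get plain HTML.
    return "text/html"

print(choose_content_type("text/html,application/xhtml+xml,*/*;q=0.8"))
print(choose_content_type("text/html, */*"))
```

A wildcard such as */* is intentionally not treated as support, since a browser that only sends */* may not be able to render application/xhtml+xml at all.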
Re: [WSG] Ten questions for Russ
> Any pointers?

Hi Ben,

When interviewed, I was reluctant to express an opinion on this topic for the very reasons you describe - the XHTML vs HTML argument quickly turns from facts to opinion - similar to the font size and liquid vs fixed width debates.

I completely agree with Vlad that Hixie's article is not the best on this subject. It is poorly written but does have historic value.

Probably the best articles to read would be those where information is delivered from the W3C itself, like:

HTML Versus XHTML
http://webstandards.org/learn/askw3c/oct2003.html

Serving XHTML with the Right MIME Type
http://webstandards.org/learn/askw3c/sep2003.html

Some links for opinion can be found here:
http://www.d.umn.edu/itss/support/Training/Online/webdesign/xml.html#xhtml

Good luck
Russ
[WSG] alt tags and image captions
Having never seen/heard a screen reader in action, I am uncertain about how to make some aspects of coding user-friendly for those using screen readers. Specifically, I find my alt tags are almost always the same as my captions. For example, if I insert an image of Joe Smith, my code might look something like this:

    <p><img src="images/joe_smith.jpg" alt="Joe Smith" /></p>
    <p class="photocaption">Joe Smith</p>

Does the screen reader read "Joe Smith Joe Smith"? If so, I would have thought that this repetition would get very annoying, especially if there are a lot of images on the page.

Could someone please enlighten me?

Hope Stewart
[WSG] Character encoding
I've always thought that characters should be marked up with appropriate entity codes (for example, accented letters, etc.) in (X)HTML, rather than simply pasted in and left for character encoding and the user agent to take care of.

I've written a plugin for the WordPress weblog software that does this for most characters ( http://www.joahua.com/blog/2005/06/04/curlyenc-03 - any discussion regarding this email to me offlist or post as comments, please, because it's software-related ), but I'm still not sure if it's required. It's just always felt dirty seeing certain characters not written in their appropriate entity codes.

Could someone shed any light on this? Are entity codes redundant, or should we be using them where possible?

Kind Regards,
Joshua Street

base10solutions
Multimedia Development Agency
Website: http://www.base10solutions.com.au/
Phone: (02) 9898-0060
Fax: (02) 8572-6021
Mobile: 0425 808 469
Re: [WSG] Character encoding
On Fri, 2005-06-03 at 23:42 -0400, Vlad Alexander wrote:

> Hi Joshua,
>
> If you are serving your content as Unicode (UTF-16 or UTF-8), then there is no need to use entities. If you do need to escape characters and you are using XHTML, then it's best to use their decimal values rather than entities. This makes your markup more easily parsable by XML technologies in your CMS (on the back-end). For example, instead of &nbsp; use &#160;

Ah, okay. The plugin is using decimal values, but WordPress also uses UTF-8 by default -- so perhaps it is redundant.

> > It's just always felt dirty seeing certain characters not written in their appropriate entity codes.
>
> Hmmm... that's a very English-centric view of the Web ;-)

Yeah, I thought that too, but couldn't think of another way to say it! *blushes whilst wishing he were bilingual!*

Thanks :)

--
Joshua Street
[EMAIL PROTECTED]
base10solutions
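The advice to prefer decimal values over named entities can be sketched in Python using the standard library's table of HTML 4 entity names. This is an illustrative assumption, not how the WordPress plugin mentioned in this thread works; the function name is hypothetical.

```python
import re
from html.entities import name2codepoint

# The five entities predefined by XML are safe to leave as they are.
XML_PREDEFINED = {"amp", "lt", "gt", "quot", "apos"}

def named_to_decimal(markup):
    """Replace named entities such as &nbsp; with decimal references."""
    def replace(match):
        name = match.group(1)
        if name in XML_PREDEFINED or name not in name2codepoint:
            return match.group(0)  # leave unknown or XML-safe names alone
        return "&#%d;" % name2codepoint[name]

    return re.sub(r"&([A-Za-z][A-Za-z0-9]*);", replace, markup)

print(named_to_decimal("caf&eacute;&nbsp;bar &amp; grill"))
```

Decimal references need no DTD lookup, which is why a generic XML parser on the back end can read them where it might reject a name like nbsp.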