[WSG] PDF Conversion
Hi all,

We need a tool to help us convert our many existing PDF documents into Word and/or HTML to improve the accessibility of our web and intranet content. While there are tools available (both freeware and licensed), I would like to hear other organisations' recommendations and experience in selecting and using such conversion tools.

Your help is greatly appreciated.

Thanks
Neeraj

*** List Guidelines: http://webstandardsgroup.org/mail/guidelines.cfm Unsubscribe: http://webstandardsgroup.org/join/unsubscribe.cfm Help: memberh...@webstandardsgroup.org ***
RE: [WSG] PDF Conversion
I would be very interested in other people's experiences as well. Thanks for asking the question, Neeraj.

From: li...@webstandardsgroup.org [mailto:li...@webstandardsgroup.org] On Behalf Of Neeraj Challana
Sent: Wednesday, 9 February 2011 1:19 PM
To: wsg@webstandardsgroup.org
Subject: [WSG] PDF Conversion

> We need a tool to help us convert our many existing PDF documents into Word and/or HTML to improve the accessibility of our web and intranet content.
[WSG] Auto-reply - Not present: WSG Digest
I am not at the college at the moment, but I will get back to you as soon as possible.

Best wishes
Peter Larsen
Center for Medie og Kommunikation
Roskilde Tekniske Skole
[WSG] Out of Office AutoReply: WSG Digest
I am currently out of office. I will be back in the office on Thursday, 10 February 2011.

For enquiries please contact: Papinder Hamid (x77756) p: +61 2 8237 7756 e: papinder.ha...@macquarie.com
[WSG] On Holiday Re: WSG Digest
Hello, I am on holiday until 21 February! For anything urgent, please contact Abigail Norambuena at abig...@mente.cl or by phone on 7146470.

Many thanks

Regards
Eduardo V
MENTE ENAXXION
Re: [WSG] PDF Conversion
Hi Neeraj,

Some questions:

1. Are you also aiming to make the PDFs accessible (i.e. tagged PDFs)?
2. Why PDF to Word? I have found there is little benefit in this type of conversion.

I just checked with a blind user, asking whether there is any advantage in Word over PDF. His answer: if the PDF is well structured, converting it to Word could remove some of the assistive structure. If the PDF is not well structured, there is no advantage either way.

One place to try as a conversion service/tool is River Docs: http://riverdocs.com/

Good luck!
Russ

On 09/02/2011, at 1:18 PM, Neeraj Challana wrote:
> We need a tool to help us convert our many existing PDF documents into Word and/or HTML to improve the accessibility of our web and intranet content.
Re: [WSG] PDF Conversion
On 09/02/11 16:00, Russ Weakley wrote:
> Some questions:
> 1. Are you also aiming to make the PDFs accessible (i.e. tagged PDFs)?
> 2. Why PDF to Word? I have found there is little benefit in this type of conversion.
>
> I just checked with a blind user, asking whether there is any advantage in Word over PDF. His answer: if the PDF is well structured, converting it to Word could remove some of the assistive structure. If the PDF is not well structured, there is no advantage either way.

Thanks for asking those questions, Russ, and for checking with users of assistive technologies. I also wondered how moving from an open standard to a proprietary one would help anyone with anything...

Sadly, most people creating documents know far less about structured data, consistent formatting, and open standards than people on this list...

Dave

--
Dave Lane, Egressive Ltd d...@egressive.com m +64212298147 p +6439633733 http://egressive.com
Free/OpenSourceSoftware: because to share is human
Only use Open Standards - w3.org, Drupal powers communities - drupal.org
Effusion Group http://effusiongroup.com
Software Patents kill innovation
RE: [WSG] PDF Conversion
Dave Lane:
> Thanks for asking those questions, Russ, and checking with users of assistive technologies. I also wondered how moving from an open standard to a proprietary one would help anyone with anything...

Perhaps because not everyone would agree with Russ' blind user, and they might have a setup that can handle Word better than PDF.

For those who might not be aware of it, current Australian government requirements mandate that PDFs should not be published on their own, but should be accompanied by an accessible equivalent.

Kerry

--
Kerry Webb
Manager Policy Office | InTACT Shared Services | ACT Government

---
This email, and any attachments, may be confidential and also privileged. If you are not the intended recipient, please notify the sender and delete all copies of this transmission along with any attachments immediately. You should not copy or use it for any purpose, nor disclose its contents to any other person.
---
Re: [WSG] PDF Conversion
Hi Kerry.

Neither the blind user nor I was suggesting that alternatives were not a good idea, or even a requirement. I'd always recommend providing an HTML alternative if possible, along with an accessible (tagged) PDF.

The question was about Word as a viable alternative to PDF. I am not sure it is. Though others may disagree!

Thanks
Russ

-
Russ Weakley
Max Design
Phone: (02) 9410 2521
Mobile: 0403 433 980
Email: r...@maxdesign.com.au
Skype: russ-maxdesign
MSN: r...@maxdesign.com.au
Website: http://www.maxdesign.com.au/
Twitter: http://twitter.com/russmaxdesign
Linkedin: http://www.linkedin.com/in/russweakley
Slideshare: http://www.slideshare.net/maxdesign/
--

On 09/02/2011, at 2:41 PM, Webb, KerryA kerrya.w...@act.gov.au wrote:
> Perhaps because not everyone would agree with Russ' blind user, and they might have a setup that can handle Word better than PDF.
Re: [WSG] PDF Conversion
On 09/02/11 16:55, Russ Weakley wrote:
> I'd always recommend providing an HTML alternative if possible, along with an accessible (tagged) PDF.
>
> The question was about Word as a viable alternative to PDF. I am not sure it is. Though others may disagree!

I'm not an accessibility expert, but if the PDF isn't well structured (good structure would presumably make it more accessible), I can't imagine that converting it to an MS Word document will add any sensible structure that wasn't there before.

Using standards-compliant HTML as the accessible alternative makes much more sense (again, assuming the source document wasn't generated from your typical poorly structured MS Word document).

Regards,
Dave
RE: [WSG] PDF Conversion
Dave wrote:
> I'm not an accessibility expert, but if the PDF isn't well structured (good structure would presumably make it more accessible), I can't imagine that converting it to an MS Word document will add any sensible structure that wasn't there before.

Neither am I an accessibility expert, but of necessity I'm taking more interest in it these days. There are a number of reasons, not just structure, why a blind user might have trouble with a PDF. An MS Word (or RTF) document may be a more accessible alternative to a PDF.

> Using standards-compliant HTML as the accessible alternative makes much more sense (again, assuming the source document wasn't generated from your typical poorly structured MS Word document).

And few Web managers will find the time and resources to create a readable, standards-compliant HTML version of a multi-multi-page PDF, whereas a Word document will in many cases be more doable.

Kerry
RE: [WSG] PDF Conversion
Just to touch on the OP's question: Adobe Acrobat Pro can batch export many PDFs to HTML.

Select File > Export > Multiple Files. Select the files you want batch converted and choose HTML as your output. Proceed to laugh/cry at the lack of formatting/structure retained in the HTML version.

-Original Message-
From: li...@webstandardsgroup.org [mailto:li...@webstandardsgroup.org] On Behalf Of Webb, KerryA
Sent: Wednesday, 9 February 2011 3:33 PM
To: wsg@webstandardsgroup.org
Subject: RE: [WSG] PDF Conversion

> And few Web managers will find the time and resources to create a readable standards compliant HTML version of a multi-multi-page PDF, whereas a Word document will in many cases be more doable.
Re: [WSG] PDF Conversion
Hi all,

This is not a solution to your problem, as these documents have already been created, but I just wanted to add my two cents.

Generally, publications are created/developed in a word-processing file (MS Word or equivalent). Word processors can work with their own internal stylesheets which, aside from providing visual consistency for headings etc., can also be used to give the document a structure. This can be used to automatically generate a table of contents etc., but more importantly in the context of this question it also provides a heading hierarchy (just like that required by accessible HTML).

Preparing a corporate document template for staff to use can take some negotiating and a slight shift in how people work with programs like Word (not just selecting a piece of text and making it 20-point Arial, but instead formatting it as a Heading 2, for example). It provides many advantages, including two very important ones: the ability to export the document as a web page (with a CSS section rather than inline markup), and allowing the document (along with other requirements, such as providing alternative text for images) to be fully accessible to screen readers.

Sam

On 9 February 2011 15:48, Geary, Damien damien.ge...@act.gov.au wrote:
> Just to touch on the OP's question, Adobe Acrobat Pro has the ability to batch export many pdfs to HTML.
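Sam's point about heading hierarchy can be checked mechanically on whatever HTML a Word export or a PDF converter produces. The following is only an illustrative sketch, not something posted in the thread: it uses Python's standard `html.parser` module to collect the heading levels in a page and report places where a level is skipped (for example an h2 followed directly by an h4). The names `HeadingCollector` and `find_skipped_levels` are made up for this example, and a clean result says nothing about whether the headings are meaningful.

```python
from html.parser import HTMLParser


class HeadingCollector(HTMLParser):
    """Collect heading levels (1-6) in document order."""

    def __init__(self):
        super().__init__()
        self.levels = []

    def handle_starttag(self, tag, attrs):
        # HTMLParser lowercases tag names, so "H2" arrives here as "h2".
        if len(tag) == 2 and tag[0] == "h" and tag[1] in "123456":
            self.levels.append(int(tag[1]))


def find_skipped_levels(html_text):
    """Return (previous_level, level) pairs where the outline jumps
    down by more than one level at once."""
    collector = HeadingCollector()
    collector.feed(html_text)
    skipped = []
    previous = 0  # start of document counts as level 0, so the first heading should be h1
    for level in collector.levels:
        if level > previous + 1:
            skipped.append((previous, level))
        previous = level
    return skipped


if __name__ == "__main__":
    sample = "<h1>Report</h1><h2>Summary</h2><h4>Detail</h4><h2>Next</h2>"
    for before, after in find_skipped_levels(sample):
        print(f"Heading level jumps from h{before} to h{after}")
```

Run against converter output, an empty list suggests the heading outline is at least consistent; a long list is a hint that the source document was formatted by font size rather than by heading styles.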
Re: [WSG] PDF Conversion
Hello,

I understand that Nuance make a PDF converter: http://shop.nuance.com/store?Action=DisplayPageEnv=BASELocale=en_AUSiteID=scsoftAPid=ProductDetailsPageproductID=208595700

I have not used it myself; however, it may be an improvement on the Acrobat batch convert that Damien talks about. Perhaps others could offer comments.

Regards,
Grant Bailey

On 9/02/2011 4:33 PM, Samuel Santana wrote:
> Generally publications are created/developed using a word processing file (MS-Word or equivalent).
RE: [WSG] PDF Conversion
linux pdftohtml (you can apt-get it)

It's not perfect (formatting often comes out a bit strange and the HTML is messy) but at least you end up with something you can edit. Unfortunately I haven't seen anything better yet, and absolutely nothing anywhere near good enough to use without manually editing or cleaning up the output.

My recommendation: if it's for public release and needs to be accessible or converted to other formats, don't use PDF to start with!
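For a large backlog, the pdftohtml suggestion above could be wrapped in a small batch script. This is a rough sketch under stated assumptions, not a tested workflow from the thread: it assumes the poppler-utils build of `pdftohtml` is installed and on the PATH, it passes only the `-noframes` option (one HTML file per PDF instead of a frameset), and the function names and the `pdfs`/`html` directory names are hypothetical. As the post warns, the output will still need manual clean-up.

```python
import subprocess
from pathlib import Path


def build_command(pdf_path, out_dir):
    """Build the pdftohtml argument list for one PDF (nothing is executed here)."""
    pdf_path = Path(pdf_path)
    # pdftohtml takes an output base name; it adds the .html extension itself.
    out_base = Path(out_dir) / pdf_path.stem
    # -noframes: write a single HTML document rather than a frameset.
    return ["pdftohtml", "-noframes", str(pdf_path), str(out_base)]


def convert_all(src_dir, out_dir):
    """Convert every *.pdf directly inside src_dir; return the names that failed.

    Raises FileNotFoundError if the pdftohtml executable is not installed.
    """
    out = Path(out_dir)
    out.mkdir(parents=True, exist_ok=True)
    failed = []
    for pdf in sorted(Path(src_dir).glob("*.pdf")):
        result = subprocess.run(build_command(pdf, out), capture_output=True, text=True)
        if result.returncode != 0:
            failed.append(pdf.name)
    return failed
```

Calling `convert_all("pdfs", "html")` would attempt each file in turn and hand back a list of the ones the converter rejected, which is a reasonable starting point for deciding which documents need to be rebuilt from their source files instead.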
Re: [WSG] PDF Conversion
On 09/02/11 20:17, Michael MD wrote:
> My recommendation: if it's for public release and needs to be accessible or converted to other formats, don't use PDF to start with!

I think it's fair to say that if the source document is poorly structured or lacks structure, you're out of luck no matter what you do. People need to be trained to understand the importance of structural conventions and consistency... and now we've come full circle back to open standard formats :)

Dave