Re: [WSG] PDF Conversion
You could put all your PDFs on Google Docs, and they will be available like any other web document, like this one: https://docs.google.com/viewer?a=vpid=explorerchrome=truesrcid=0B1iqp0kGPjWsZjA2MTFmMTQtM2ZmYS00OWU2LWI4NjMtMzEyMjYwMjYzOGI3hl=en

hth

--- On Wed, 9/2/11, Neeraj Challana neeraj.mail...@gmail.com wrote:
From: Neeraj Challana neeraj.mail...@gmail.com
Subject: [WSG] PDF Conversion
To: wsg@webstandardsgroup.org
Date: Wednesday, 9 February, 2011, 2:18

> Hi all, We need a tool to help us convert our many existing PDF documents into Word and/or HTML to improve the accessibility of our web and intranet content. While there are tools (both freeware and licence ware) available, I would like to get some recommendations and experience of other organisations in selecting and using such conversion tools. Your help is greatly appreciated.
> Thanks Neeraj

*** List Guidelines: http://webstandardsgroup.org/mail/guidelines.cfm Unsubscribe: http://webstandardsgroup.org/join/unsubscribe.cfm Help: memberh...@webstandardsgroup.org ***
Re: [WSG] PDF Conversion
> Hi Kerry. Neither the blind user nor I were suggesting that alternatives were not a good idea, or even a requirement. I'd always recommend providing an HTML alternative if possible along with accessible (tagged) PDF. The question was about Word as a viable alternative to PDF. I am not sure it is. Though others may disagree!

I agree that the HTML approach is best. I normally use Google Docs, which can open any document (PDF, Word, Excel, images, etc.) like any other HTML document. See the link: https://docs.google.com/viewer?a=vpid=explorerchrome=truesrcid=0B1iqp0kGPjWsZjA2MTFmMTQtM2ZmYS00OWU2LWI4NjMtMzEyMjYwMjYzOGI3hl=en

It is just a sample page I created for demonstration purposes, inadvertent typos included! It is a simple PDF file, but it is displayed in your browser and requires no plug-ins of any kind.
RE: [WSG] PDF Conversion
> I agree that the HTML approach is best. I normally use Google Docs, which can open any document (PDF, Word, Excel, images, etc.) like any other HTML document. See the link: https://docs.google.com/viewer?a=vpid=explorerchrome=truesrcid=0B1iqp0kGPjWsZjA2MTFmMTQtM2ZmYS00OWU2LWI4NjMtMzEyMjYwMjYzOGI3hl=en
> It is just a sample page I created for demonstration purposes, inadvertent typos included! It is a simple PDF file, but it is displayed in your browser and requires no plug-ins of any kind.

Looks like it just converted it to an image ... not really accessible to anyone using a screen reader or unable to view images! Did the original PDF have any text in it (or did it just contain an image)?
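One way to answer the "text or just an image?" question before publishing is to try extracting text from the PDF. The following is a minimal sketch, not anything a poster in this thread used: it assumes the poppler `pdftotext` command-line tool is installed, and the function names and the 20-character threshold are hypothetical choices made for illustration.

```python
import subprocess


def extract_text(pdf_path: str) -> str:
    """Return the text layer of a PDF using poppler's pdftotext.

    Passing "-" as the output file asks pdftotext to write to stdout.
    """
    result = subprocess.run(
        ["pdftotext", pdf_path, "-"],
        capture_output=True,
        text=True,
        check=True,
    )
    return result.stdout


def looks_image_only(text: str, min_chars: int = 20) -> bool:
    """Guess whether a PDF is a scanned image with no usable text layer.

    The threshold is an arbitrary assumption: almost no non-whitespace
    characters suggests there is nothing for a screen reader to read.
    """
    visible = "".join(text.split())
    return len(visible) < min_chars
```

Calling `looks_image_only(extract_text("sample.pdf"))` (with a hypothetical file name) gives a rough first check only; a PDF can contain text and still be poorly structured for assistive technology.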
[WSG] PDF Conversion
Hi all,

We need a tool to help us convert our many existing PDF documents into Word and/or HTML to improve the accessibility of our web and intranet content. While there are tools (both freeware and licence ware) available, I would like to get some recommendations and experience of other organisations in selecting and using such conversion tools.

Your help is greatly appreciated.

Thanks
Neeraj
RE: [WSG] PDF Conversion
I would be very interested in other people's experiences as well. Thanks for asking the question Neeraj.
Re: [WSG] PDF Conversion
Hi Neeraj,

Some questions:
1. Are you also aiming to make the PDFs accessible (i.e. tagged PDFs)?
2. Why PDF to Word? I have found there is little benefit in this type of conversion.

I just checked with a blind user, asking whether there is any advantage in Word over PDF. His answer:
- If the PDF is well structured, converting it to Word could remove some of the assistive structure.
- If the PDF is not well structured, there is no advantage either way.

One place to try as a conversion service/tool is River Docs: http://riverdocs.com/

Good luck!
Russ
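For anyone wanting to check question 1 across many files, here is a rough sketch of one way to see whether a PDF reports itself as tagged. It assumes the poppler `pdfinfo` tool is installed and that its output includes a `Tagged:` line, which is how recent poppler versions behave as far as I know; the function names are hypothetical.

```python
import subprocess
from typing import Optional


def parse_tagged(pdfinfo_output: str) -> Optional[bool]:
    """Read the "Tagged:" field from pdfinfo output.

    Returns True or False when the field is present, None when it is not
    (for example with a pdfinfo build that does not report it).
    """
    for line in pdfinfo_output.splitlines():
        key, _, value = line.partition(":")
        if key.strip() == "Tagged":
            return value.strip().lower() == "yes"
    return None


def is_tagged(pdf_path: str) -> Optional[bool]:
    """Run pdfinfo on one file and report whether it claims to be tagged."""
    result = subprocess.run(
        ["pdfinfo", pdf_path],
        capture_output=True,
        text=True,
        check=True,
    )
    return parse_tagged(result.stdout)
```

A "yes" only means the file contains structure tags; it says nothing about whether the tags are sensible, so a manual check with a screen reader is still needed.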
Re: [WSG] PDF Conversion
On 09/02/11 16:00, Russ Weakley wrote:
> Some questions: 1. are you also aiming to make the PDFs accessible? (i.e. tagged PDFs) 2. why PDF to Word? I have found there is little benefit in this type of conversion.
> I just checked with a blind user now - asking is there any advantage in Word over PDF? His answer: If the PDF is well structured, converting it to Word could remove some of the assistive structure. If the PDF is not well structured, there is no advantage either way

Thanks for asking those questions, Russ, and checking with users of assistive technologies. I also wondered how moving from an open standard to a proprietary one would help anyone with anything...

Sadly, most people creating documents know far less about structured data, consistent formatting, and open standards than people on this list...

Dave
--
Dave Lane, Egressive Ltd d...@egressive.com m +64212298147 p +6439633733 http://egressive.com
Free/OpenSourceSoftware: because to share is human
Only use Open Standards - w3.org, Drupal powers communities - drupal.org
Effusion Group http://effusiongroup.com
Software Patents kill innovation
RE: [WSG] PDF Conversion
Dave Lane:
> Thanks for asking those questions, Russ, and checking with users of assistive technologies. I also wondered how moving from an open standard to a proprietary one would help anyone with anything...

Perhaps because not everyone would agree with Russ' blind user, and they might have a setup that can handle Word better than PDF.

For those who might not be aware of it, current Australian government requirements mandate that PDFs should not be published on their own, but should be accompanied by an accessible equivalent.

Kerry
--
Kerry Webb Manager Policy Office | InTACT Shared Services | ACT Government
--- This email, and any attachments, may be confidential and also privileged. If you are not the intended recipient, please notify the sender and delete all copies of this transmission along with any attachments immediately. You should not copy or use it for any purpose, nor disclose its contents to any other person. ---
Re: [WSG] PDF Conversion
Hi Kerry.

Neither the blind user nor I were suggesting that alternatives were not a good idea, or even a requirement. I'd always recommend providing an HTML alternative if possible, along with an accessible (tagged) PDF. The question was about Word as a viable alternative to PDF. I am not sure it is. Though others may disagree!

Thanks
Russ
-
Russ Weakley Max Design
Phone: (02) 9410 2521 Mobile: 0403 433 980 Email: r...@maxdesign.com.au Skype: russ-maxdesign MSN: r...@maxdesign.com.au
Website: http://www.maxdesign.com.au/ Twitter: http://twitter.com/russmaxdesign Linkedin: http://www.linkedin.com/in/russweakley Slideshare: http://www.slideshare.net/maxdesign/
Re: [WSG] PDF Conversion
On 09/02/11 16:55, Russ Weakley wrote:
> Hi Kerry. Neither the blind user nor I were suggesting that alternatives were not a good idea, or even a requirement. I'd always recommend providing an HTML alternative if possible along with accessible (tagged) PDF. The question was about Word as a viable alternative to PDF. I am not sure it is. Though others may disagree!

I'm not an accessibility expert, but it seems pretty obvious that if the PDF isn't well structured (good structure would presumably make it more accessible), converting it to an MS Word document won't add any sensible structure that wasn't there before.

Using standards compliant HTML as an alternative accessible standard makes much more sense (again, assuming the source document wasn't generated from your typical poorly structured MS Word document).

Regards,
Dave
RE: [WSG] PDF Conversion
Dave wrote:
> I'm not an accessibility expert, but it seems pretty obvious that if the PDF isn't well structured (which would presumably make it more accessible), I can't imagine that converting it to an MS Word document will add any sensible structure that wasn't there before.

Neither am I an accessibility expert, but I'm of necessity taking more interest in it these days. There are a number of reasons - not just about structure - why a blind user might have trouble with a PDF. An MS Word (or an RTF) document may be a more accessible alternative to a PDF.

> Using standards compliant HTML as an alternative accessible standard makes much more sense (again, assuming the source document wasn't generated from your typical poorly structured MS Word document).

And few Web managers will find the time and resources to create a readable standards compliant HTML version of a multi-multi-page PDF, whereas a Word document will in many cases be more doable.

Kerry
RE: [WSG] PDF Conversion
Just to touch on the OP's question: Adobe Acrobat Pro has the ability to batch export many PDFs to HTML. Select File Export Multiple Files. Select the files you want batch converted, and choose HTML as your output. Proceed to laugh / cry at the lack of formatting / structure retained in the HTML version.
Re: [WSG] PDF Conversion
Hi all,

This is not a solution to your problem, as these documents have already been created, but I just wanted to add my two cents.

Publications are generally developed in a word processor (MS Word or equivalent). Word processors can work with their own internal style sheets, which, aside from providing visual consistency for headings etc., can also be used to give the document a structure. That structure can be used to automatically generate a table of contents, but more importantly in the context of this question, it also provides a heading hierarchy (just like the one required by accessible HTML).

Preparing a corporate document template for staff to use can take some negotiating, and a slight shift in how people work with programs like Word (not just selecting a piece of text and making it 20-point Arial, but instead formatting it as a Heading 2, for example). But it provides many advantages, including two very important ones: the ability to export the document as a web page (with a CSS section rather than inline markup), and allowing the document (along with other requirements, such as providing alternative text for images) to be fully accessible to screen readers.

Sam
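The heading hierarchy point can be checked mechanically once a document has been exported to HTML. Below is a small sketch using only the Python standard library; it is an illustration rather than a tool anyone in the thread mentioned, and the names `HeadingCollector` and `skipped_levels` are made up for this example. It reports places where the heading level jumps by more than one (for example an h2 followed directly by an h4).

```python
from html.parser import HTMLParser
from typing import List, Tuple


class HeadingCollector(HTMLParser):
    """Collect the levels of h1..h6 elements in document order."""

    def __init__(self) -> None:
        super().__init__()
        self.levels: List[int] = []

    def handle_starttag(self, tag, attrs):
        # HTMLParser lowercases tag names, so "H2" arrives as "h2".
        if len(tag) == 2 and tag[0] == "h" and tag[1] in "123456":
            self.levels.append(int(tag[1]))


def skipped_levels(html: str) -> List[Tuple[int, int]]:
    """Return (previous_level, level) pairs where a heading level is skipped.

    The document is treated as starting at level 0, so a first heading
    other than h1 is also reported.
    """
    collector = HeadingCollector()
    collector.feed(html)
    problems = []
    previous = 0
    for level in collector.levels:
        if level > previous + 1:
            problems.append((previous, level))
        previous = level
    return problems
```

An empty result only shows that no levels were skipped; it does not show that the headings are meaningful, or that text styled as 20-point Arial was ever marked up as a heading in the first place.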
Re: [WSG] PDF Conversion
Hello,

I understand that Nuance make a PDF converter: http://shop.nuance.com/store?Action=DisplayPageEnv=BASELocale=en_AUSiteID=scsoftAPid=ProductDetailsPageproductID=208595700

I have not used it myself; however, it may be an improvement on the Acrobat batch convert that Damien talks about. Perhaps others could offer comments.

Regards,
Grant Bailey
RE: [WSG] PDF Conversion
linux pdftohtml (you can apt-get it)

It's not perfect (formatting often comes out a bit strange and the HTML is messy), but at least you end up with something you can edit. Unfortunately I haven't seen anything better yet, and absolutely nothing anywhere near good enough to use without needing to manually edit or clean up the output.

My recommendation: if it's for public release and needs to be accessible or converted to other formats, don't use PDF to start with!
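For a folder full of PDFs, pdftohtml can be driven from a short script. This is a minimal sketch under a few assumptions: the poppler build of `pdftohtml` is on the PATH, the `-noframes` and `-i` options behave as in current poppler-utils, and the directory names in the usage note are hypothetical. As noted above, the output will still need manual clean-up.

```python
import subprocess
from pathlib import Path
from typing import List


def build_command(pdf_path: Path, out_dir: Path) -> List[str]:
    """Build the pdftohtml command line for one PDF.

    -noframes asks for a single HTML file instead of a frameset;
    -i skips image extraction. The last argument is the output base
    name, to which pdftohtml should append ".html".
    """
    out_base = out_dir / pdf_path.stem
    return ["pdftohtml", "-noframes", "-i", str(pdf_path), str(out_base)]


def convert_all(pdf_dir: Path, out_dir: Path) -> List[Path]:
    """Convert every *.pdf in pdf_dir and return the expected HTML paths."""
    out_dir.mkdir(parents=True, exist_ok=True)
    converted = []
    for pdf_path in sorted(pdf_dir.glob("*.pdf")):
        # check=True raises CalledProcessError if pdftohtml fails on a file.
        subprocess.run(build_command(pdf_path, out_dir), check=True)
        converted.append(out_dir / f"{pdf_path.stem}.html")
    return converted
```

Something like `convert_all(Path("pdfs"), Path("html"))` would then process a whole directory in one go. Keeping the command construction in its own function makes it easy to try other options without touching the loop.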
Re: [WSG] PDF Conversion
On 09/02/11 20:17, Michael MD wrote:
> My recommendation: If its for public release and needs to be accessible or converted to other formats, don't use pdf to start with!

I think it's fair to say that if the source document is poorly structured or lacks structure, you're out of luck no matter what you do. People need to be trained to understand the importance of structural conventions and consistency... and now we've come full circle back to open standard formats :)

Dave