Hi Leonard,

I have already written code extending TaggedPdfReaderTool. It reads the alt text as an attribute of a /Span tag within a /Figure tag (the way iText handles alt text; see iText in Action, page 510). But this is not how Acrobat (and Word) handle the alt text attribute. In fact, Acrobat does not read the alt text of a PDF created by iText (following the recommended strategy) and vice versa, although a screen reader (e.g. NVDA) reads the alt text of both types.
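To make the two conventions concrete, here is a minimal sketch of a lookup that accepts either placement: alt text attached to the Figure element itself (what Acrobat shows for Word-generated files) or to a Span nested inside the Figure (the iText-style placement described above). It works on a plain in-memory model of the structure tree. The StructElem class and the method names are hypothetical and are not iText API; a real tool would first have to build such a tree from the PDF, for example in an extended TaggedPdfReaderTool.

```java
import java.util.ArrayList;
import java.util.List;

// Plain-Java model of a tagged-PDF structure tree.
// ASSUMPTION: everything here is illustrative; none of it is iText API.
class AltTextSketch {

    /** One structure element: a type ("Figure", "Span", ...), optional alt text, children. */
    static final class StructElem {
        final String type;
        final String alt; // null when the element carries no alt text
        final List<StructElem> kids = new ArrayList<>();

        StructElem(String type, String alt) {
            this.type = type;
            this.alt = alt;
        }

        StructElem add(StructElem kid) {
            kids.add(kid);
            return this;
        }
    }

    /** Alt text for one Figure: its own attribute first, else the first Span below it. */
    static String altOfFigure(StructElem figure) {
        if (figure.alt != null) {
            return figure.alt;          // alt stored on the Figure tag itself
        }
        return firstSpanAlt(figure);    // alt stored on a nested Span
    }

    // Depth-first search for the first Span descendant that has alt text.
    private static String firstSpanAlt(StructElem elem) {
        for (StructElem kid : elem.kids) {
            if ("Span".equals(kid.type) && kid.alt != null) {
                return kid.alt;
            }
            String nested = firstSpanAlt(kid);
            if (nested != null) {
                return nested;
            }
        }
        return null;
    }

    /** Walks the tree and returns one entry per Figure (null if no alt text was found). */
    static List<String> collectFigureAlts(StructElem root) {
        List<String> out = new ArrayList<>();
        collect(root, out);
        return out;
    }

    private static void collect(StructElem elem, List<String> out) {
        if ("Figure".equals(elem.type)) {
            out.add(altOfFigure(elem));
            return; // a Figure's children are handled by altOfFigure
        }
        for (StructElem kid : elem.kids) {
            collect(kid, out);
        }
    }

    public static void main(String[] args) {
        StructElem doc = new StructElem("Document", null)
            .add(new StructElem("Figure", "A logo"))                 // alt on the Figure
            .add(new StructElem("Figure", null)
                .add(new StructElem("Span", "A chart")));            // alt on a nested Span
        System.out.println(collectFigureAlts(doc)); // prints [A logo, A chart]
    }
}
```

Checking the Figure first and falling back to the Span is only one possible order; a tool that needs to report which convention a file uses could return the source of the alt text along with its value.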

Leonard, could you put me in touch with the iText consultancy? If my software works, my business could work too, and it could generate income for iText.

Thanks,
Walter

On 16/08/2012 18:32, Leonard Rosenthol wrote:
> Alt is a property of the tag - so it would be associated with the tag itself
> (Figure in this case, I guess). So perhaps the TaggedPdfReaderTool needs to
> be extended to support attributes.
>
> Leonard
>
> -----Original Message-----
> From: Walter Cybis [mailto:walter.cy...@polymtl.ca]
> Sent: Thursday, August 16, 2012 5:49 PM
> To: Post all your questions about iText here
> Subject: Re: [iText-questions] How can I extract the alt text of an image with iText?
>
> Thanks Leonard,
>
> I've already read chapter 14 of ISO 32000-1:2008 (in fact very instructive),
> as well as the relevant chapters in the iText in Action book (2nd edition) and
> the PDF Reference book (sixth edition).
>
> I'm coding a Java/iText program to extract the alt text from figures
> in a PDF file, and I'll give you more information about my problem.
>
> I created PDF files with MS Word in which the pictures were described
> by alt text. I opened one of these files with Acrobat and saw the "alt text"
> content in the corresponding figure tag properties.
> I applied the "TaggedPdfReaderTool" class to this document to
> extract its structure to an XML file.
> There is no mention of the alt text in the resulting XML. There is only a
> "figure" tag in it.
>
> On the other hand, I applied the "PdfContentReaderTool" to the original PDF in
> order to get the dictionaries coded into the stream (- Content Stream -), and
> there is no mention of alt text in them either.
>
> I would really appreciate some advice....
>
> Walter
>
> PS- I opened the original PDF file with Notepad++ and I can see the figure
> dictionary, but there is no /Alt key in it.
> I cannot see the dictionaries within the figure stream...
>
> On 14/08/2012 14:47, Leonard Rosenthol wrote:
>> You should read the chapter (14, IIRC) in ISO 32000-1:2008 about Tagged PDF
>> and Structured PDF. Alt (for images or any other element in PDF) is
>> handled via structure.
>>
>> Leonard
>>
>> -----Original Message-----
>> From: Walter Cybis [mailto:walter.cy...@polymtl.ca]
>> Sent: Tuesday, August 14, 2012 2:26 PM
>> To: itext-questions@lists.sourceforge.net
>> Subject: [iText-questions] How can I extract the alt text of an image with iText?
>>
>> Hi
>>
>> Is there a way to get the alt text of an image in a PDF file with iText?
>> I think the alt text is somewhere in the image stream... because there is no
>> dictionary for the alt text in the file.
>>
>> Perhaps by overriding PdfContentStreamProcessor to include a new "\Span"
>> operator?
>>
>> Walter
>>
>> ------------------------------------------------------------------------------
>> Live Security Virtual Conference
>> Exclusive live event will cover all the ways today's security and threat
>> landscape has changed and how IT managers can respond. Discussions will
>> include endpoint security, mobile security and the latest in malware
>> threats. http://www.accelacomm.com/jaw/sfrnl04242012/114/50122263/
>> _______________________________________________
>> iText-questions mailing list
>> iText-questions@lists.sourceforge.net
>> https://lists.sourceforge.net/lists/listinfo/itext-questions
>>
>> iText(R) is a registered trademark of 1T3XT BVBA.
>> Many questions posted to this list can (and will) be answered with a
>> reference to the iText book: http://www.itextpdf.com/book/
>> Please check the keywords list before you ask for examples:
>> http://itextpdf.com/themes/keywords.php