Colette, you are not corrupting the PDF document, but the structure information needed for a tagged PDF is missing.
Maruan Sahyoun

> On 13.06.2014 at 19:41, Colette Joubarne <[email protected]> wrote:
>
> Maruan,
>
> I use the parser to tokenize, and then loop thru the tokens. If a token is a
> TJ or Tj operator, I grab the text, in certain cases I replace some of the
> text (letter by letter, maintaining the existing structure), and add these
> tokens to a new token list. If it is not a TJ or Tj operator I just copy the
> token to the new token list. I then write the token list to the doc and save.
>
> If I am corrupting the structure, how is it that the document displays
> correctly?
>
> Colette
>
> -----Original Message-----
> From: Maruan Sahyoun [mailto:[email protected]]
> Sent: June-13-14 12:54 PM
> To: [email protected]
> Subject: Re: Unable to mark document as tagged
>
> Hi Colette,
>
> the modified version does not contain the structure information needed for
> tagged PDFs. How do you create the modified version from the first one?
>
> BR
> Maruan
>
>> On 13.06.2014 at 17:48, Colette Joubarne <[email protected]> wrote:
>>
>> Maruan,
>>
>> I am copying the entire structure from a tagged document and just replacing
>> some of the text, so I would think that the structure is unchanged. Then
>> again who knows what I might have messed up.
>>
>> James.pdf is the original file:
>> https://dl.dropboxusercontent.com/u/7689859/James.pdf
>>
>> James-mod.pdf is the modified file:
>> https://dl.dropboxusercontent.com/u/7689859/James-mod.pdf
>>
>> Colette
>>
>> -----Original Message-----
>> From: Maruan Sahyoun [mailto:[email protected]]
>> Sent: June-13-14 10:45 AM
>> To: [email protected]
>> Subject: Re: Unable to mark document as tagged
>>
>> Hi Colette,
>>
>> this information alone doesn't make a document a tagged PDF! You might not
>> have the structure information needed within your PDF. Would you have a
>> works / doesn't work sample which you could upload to a public location as
>> attachments are not allowed on the mailing list?
>>
>> BR
>> Maruan
>>
>>> On 13.06.2014 at 15:44, Colette Joubarne <[email protected]> wrote:
>>>
>>> Maruan,
>>>
>>> Yes you are right, however why is it that when I look at the properties in
>>> Adobe Reader it indicates that the document is not tagged?
>>>
>>> 3 0 obj
>>> <<
>>> /Marked true
>>>
>>> Colette
>>> -----Original Message-----
>>> From: Maruan Sahyoun [mailto:[email protected]]
>>> Sent: June-13-14 9:19 AM
>>> To: [email protected]
>>> Subject: Re: Unable to mark document as tagged
>>>
>>> Dear Colette,
>>>
>>> /MarkInfo 3 0 R indicates that the information you are looking for is
>>> referenced and should be available in 3 0 obj. Could you verify that?
>>>
>>> With kind regards
>>>
>>> Maruan
>>>
>>>> On 13.06.2014 at 14:21, Colette Joubarne <[email protected]> wrote:
>>>>
>>>> I have a tagged pdf doc with the following header:
>>>>
>>>> /Type/Catalog/Pages 2 0 R/Lang(en-CA) /StructTreeRoot 10 0 R/MarkInfo<</Marked true
>>>>
>>>> I read in the contents, replace some of the text and create a new doc. I
>>>> copy the document information from the original doc and set marked to true.
>>>>
>>>> newDoc = new PDDocument();
>>>>
>>>> newDoc.setDocumentInformation(PTConstants.pdfDoc.getDocumentInformation());
>>>>
>>>> PDMarkInfo markinfo = new PDMarkInfo();
>>>> markinfo.setMarked(true);
>>>> newDoc.getDocumentCatalog().setMarkInfo(markinfo);
>>>>
>>>> and when I check that it was set, it returns true:
>>>>
>>>> PDMarkInfo markInfo = PTConstants.pdfDoc.getDocumentCatalog().getMarkInfo();
>>>> if ((markInfo != null) && (markInfo.isMarked()))
>>>>     System.out.println("true");
>>>>
>>>> But, while the resulting document displays correctly, the header indicates
>>>> that it is not tagged:
>>>>
>>>> /Type /Catalog
>>>> /Version /1.4
>>>> /Pages 2 0 R
>>>> /MarkInfo 3 0 R
>>>>
>>>> Any idea what is going on?
>>>>
>>>> Colette
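
Note for readers of the archive: the difference between the two catalogs quoted in this thread can be made visible with a rough sketch like the one below. It is only a plain-text check over the two catalog fragments as they were posted, not a PDF parser and not PDFBox API; the class name CatalogTagCheck and its methods are hypothetical and exist only for illustration. It looks for the two entries discussed in the thread, /MarkInfo and /StructTreeRoot, and reports that the rewritten catalog carries only the first one, which is consistent with Maruan's remark that /Marked true alone does not make a document a tagged PDF.

```java
import java.util.regex.Pattern;

// Hypothetical helper: a text-level check, not a real PDF parser.
public class CatalogTagCheck {

    // /StructTreeRoot followed by an indirect reference such as "10 0 R".
    private static final Pattern STRUCT_TREE_ROOT =
            Pattern.compile("/StructTreeRoot\\s+\\d+\\s+\\d+\\s+R");

    // /MarkInfo given inline ("<<") or as an indirect reference ("3 0 R").
    private static final Pattern MARK_INFO =
            Pattern.compile("/MarkInfo\\s*(<<|\\d+\\s+\\d+\\s+R)");

    public static boolean hasStructTreeRoot(String catalogText) {
        return STRUCT_TREE_ROOT.matcher(catalogText).find();
    }

    public static boolean hasMarkInfo(String catalogText) {
        return MARK_INFO.matcher(catalogText).find();
    }

    // True only when both entries discussed in the thread are present.
    public static boolean hasTaggingEntries(String catalogText) {
        return hasMarkInfo(catalogText) && hasStructTreeRoot(catalogText);
    }

    public static void main(String[] args) {
        // Catalog fragments copied from the thread (the first one is
        // truncated in the original post and is left that way here).
        String original = "/Type/Catalog/Pages 2 0 R/Lang(en-CA) "
                + "/StructTreeRoot 10 0 R/MarkInfo<</Marked true";
        String modified = "/Type /Catalog /Version /1.4 "
                + "/Pages 2 0 R /MarkInfo 3 0 R";

        System.out.println("original: MarkInfo=" + hasMarkInfo(original)
                + " StructTreeRoot=" + hasStructTreeRoot(original));
        System.out.println("modified: MarkInfo=" + hasMarkInfo(modified)
                + " StructTreeRoot=" + hasStructTreeRoot(modified));
    }
}
```

Both fragments contain /MarkInfo, but only the original also references a structure tree root, so a new PDDocument that only gets a PDMarkInfo set on its catalog ends up without the structure information the thread is about.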

