Maruan and Duff,

This is my first experience using a help forum like this, and the response was 
great.
I appreciate the help.

I will look into the documentation and hopefully be able to figure out what I 
am doing wrong.

Colette

-----Original Message-----
From: Duff Johnson [mailto:[email protected]] 
Sent: June-13-14 1:59 PM
To: [email protected]
Subject: Re: Unable to mark document as tagged

Colette,

It might be a good idea to take a look at 14.8 of ISO 32000-1, which defines 
tagged PDF.

You can download it for free:

http://wwwimages.adobe.com/content/dam/Adobe/en/devnet/pdf/pdfs/PDF32000_2008.pdf

Duff.
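
As a rough illustration of what that section asks for, the two catalog dictionaries quoted further down in this thread can be compared key by key. A tagged PDF needs both a /MarkInfo dictionary with /Marked true and a /StructTreeRoot entry in the catalog. The sketch below is only a plain-text check on the catalog strings posted in the thread, not a PDF parser and not PDFBox API; the class and method names are hypothetical.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical helper: a text-level check, assuming the catalog dictionary
// has already been extracted from the file as a string.
class TaggedCatalogCheck {
    // Catalog keys a tagged PDF is expected to carry.
    private static final String[] REQUIRED = {"/MarkInfo", "/StructTreeRoot"};

    // Returns the required keys that do not appear in the catalog text.
    static List<String> missingKeys(String catalog) {
        List<String> missing = new ArrayList<>();
        for (String key : REQUIRED) {
            if (!catalog.contains(key)) {
                missing.add(key);
            }
        }
        return missing;
    }

    public static void main(String[] args) {
        String original = "/Type/Catalog/Pages 2 0 R/Lang(en-CA) "
                + "/StructTreeRoot 10 0 R/MarkInfo<</Marked true";
        String modified = "/Type /Catalog /Version /1.4 /Pages 2 0 R /MarkInfo 3 0 R";

        System.out.println(missingKeys(original)); // prints []
        System.out.println(missingKeys(modified)); // prints [/StructTreeRoot]
    }
}
```

The modified catalog still carries /MarkInfo, but the structure tree root is gone, which matches what Adobe Reader reports.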


On Jun 13, 2014, at 1:52 PM, Maruan Sahyoun <[email protected]> wrote:

> Colette,
> 
> You are not corrupting the PDF document, but the structure information needed 
> for tagged PDF is missing. 
> 
> Maruan Sahyoun
> 
>> Am 13.06.2014 um 19:41 schrieb Colette Joubarne 
>> <[email protected]>:
>> 
>> Maruan,
>> 
>> I use the parser to tokenize, and then loop through the tokens. If a token is a 
>> TJ or Tj operator, I grab the text, in some cases replace part of it (letter 
>> by letter, maintaining the existing structure), and add the tokens to a new 
>> token list. If it is not a TJ or Tj operator, I just copy the token to the new 
>> token list. I then write the token list to the doc and save.
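
The loop described above might look roughly like the sketch below. It uses a simplified, hypothetical token model (plain strings, with string operands written in parentheses) rather than the real PDFBox token classes, and it handles only Tj; the array operand of TJ is left out for brevity.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical model of a content stream as a flat list of string tokens.
class TokenRewriteSketch {
    // Copies every token; when a Tj operator is found, the string operand
    // just before it has `from` replaced by `to`.
    static List<String> rewrite(List<String> tokens, String from, String to) {
        List<String> out = new ArrayList<>(tokens);
        for (int i = 1; i < out.size(); i++) {
            String operand = out.get(i - 1);
            if (out.get(i).equals("Tj") && operand.startsWith("(")) {
                out.set(i - 1, operand.replace(from, to));
            }
        }
        return out;
    }

    public static void main(String[] args) {
        // Marked-content operators (BDC ... EMC) pass through untouched.
        List<String> tokens = List.of(
                "/P", "<</MCID 0>>", "BDC", "BT", "(James)", "Tj", "ET", "EMC");
        System.out.println(rewrite(tokens, "James", "Jones"));
    }
}
```

A rewrite like this keeps the marked-content operators inside the page content, which is why the page still displays correctly. The structure tree those MCID values point into lives outside the content stream, so it is not carried along by copying tokens.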
>> 
>> If I am corrupting the structure, how is it that the document displays 
>> correctly?
>> 
>> Colette
>> 
>> -----Original Message-----
>> From: Maruan Sahyoun [mailto:[email protected]] 
>> Sent: June-13-14 12:54 PM
>> To: [email protected]
>> Subject: Re: Unable to mark document as tagged
>> 
>> Hi Colette,
>> 
>> The modified version does not contain the structure information needed for 
>> tagged PDFs. How do you create the modified version from the first one?
>> 
>> BR
>> Maruan
>> 
>>> Am 13.06.2014 um 17:48 schrieb Colette Joubarne 
>>> <[email protected]>:
>>> 
>>> Maruan,
>>> 
>>> I am copying the entire structure from a tagged document and just replacing 
>>> some of the text, so I would think that the structure is unchanged. Then 
>>> again who knows what I might have messed up.
>>> 
>>> James.pdf is the original file:
>>> https://dl.dropboxusercontent.com/u/7689859/James.pdf
>>> 
>>> James-mod.pdf is the modified file:
>>> https://dl.dropboxusercontent.com/u/7689859/James-mod.pdf
>>> 
>>> Colette
>>> 
>>> -----Original Message-----
>>> From: Maruan Sahyoun [mailto:[email protected]] 
>>> Sent: June-13-14 10:45 AM
>>> To: [email protected]
>>> Subject: Re: Unable to mark document as tagged
>>> 
>>> Hi Colette,
>>> 
>>> This information alone doesn't make a document a tagged PDF! You might not 
>>> have the structure information needed within your PDF. Could you upload a 
>>> working sample and a non-working sample to a public location? Attachments 
>>> are not allowed on the mailing list.
>>> 
>>> BR
>>> Maruan
>>> 
>>>> Am 13.06.2014 um 15:44 schrieb Colette Joubarne 
>>>> <[email protected]>:
>>>> 
>>>> Maruan,
>>>> 
>>>> Yes, you are right. However, why do the document properties in Adobe 
>>>> Reader indicate that the document is not tagged?
>>>> 
>>>> 3 0 obj
>>>> <<
>>>> /Marked true
>>>> 
>>>> Colette
>>>> -----Original Message-----
>>>> From: Maruan Sahyoun [mailto:[email protected]] 
>>>> Sent: June-13-14 9:19 AM
>>>> To: [email protected]
>>>> Subject: Re: Unable to mark document as tagged
>>>> 
>>>> Dear Colette,
>>>> 
>>>> /MarkInfo 3 0 R indicates that the information you are looking for is 
>>>> referenced and should be available in 3 0 obj. Could you verify that?
>>>> 
>>>> With kind regards
>>>> 
>>>> Maruan
>>>> 
>>>>> Am 13.06.2014 um 14:21 schrieb Colette Joubarne 
>>>>> <[email protected]>:
>>>>> 
>>>>> I have a tagged pdf doc with the following header:
>>>>> 
>>>>>        /Type/Catalog/Pages 2 0 R/Lang(en-CA) /StructTreeRoot 10 0 R/MarkInfo<</Marked true
>>>>> 
>>>>> I read in the contents, replace some of the text and create a new doc. I 
>>>>> copy the document information from the original doc and set marked to 
>>>>> true.
>>>>> 
>>>>>        newDoc = new PDDocument();
>>>>>        newDoc.setDocumentInformation(PTConstants.pdfDoc.getDocumentInformation());
>>>>> 
>>>>>        PDMarkInfo markinfo = new PDMarkInfo();
>>>>>        markinfo.setMarked(true);
>>>>>        newDoc.getDocumentCatalog().setMarkInfo(markinfo);
>>>>> 
>>>>> and when I check that it was set, it returns true:
>>>>> 
>>>>>        PDMarkInfo markInfo = PTConstants.pdfDoc.getDocumentCatalog().getMarkInfo();
>>>>>        if ((markInfo != null) && (markInfo.isMarked())) {
>>>>>            System.out.println("true");
>>>>>        }
>>>>> 
>>>>> But, while the resulting document displays correctly, the header 
>>>>> indicates that it is not tagged:
>>>>> 
>>>>> /Type /Catalog
>>>>> /Version /1.4
>>>>> /Pages 2 0 R
>>>>> /MarkInfo 3 0 R
>>>>> 
>>>>> Any idea what is going on?
>>>>> 
>>>>> Colette
>> 
