Regarding page labels support issue (PDFBOX-90 vs. PDFBOX-532)

Igor Podolskiy Sun, 24 Jan 2010 07:21:45 -0800

Hello Andreas, hello everybody,

Andreas Lehmkühler commented on PDFBOX-90:
------------------------------------------


There is another implementation for this feature (see PDFBOX-532). Now we have 
to decide which one is the more suitable solution.

Oops, I didn't see this second bug report when I prepared the patch for PDFBOX-90. My bad.

As to the decision: I took a quick look at the PageLabelExtractor by Navendu Garg from PDFBOX-532 and identified the following differences from my implementation:


1. PDModel integration:
   [PDFBOX-90]: integrated in the PDModel
   [PDFBOX-532]: no integration with the current model (though Navendu has at least thought of it, as seen in his JIRA comment)

2. Reading and writing page labels:
   [PDFBOX-90]: read/write
   [PDFBOX-532]: read only

3. Number trees (per the PDF specification, the page label information is a number tree):
   [PDFBOX-90]: can read any number tree, writes flat arrays (degenerate number trees)
   [PDFBOX-532]: can read flat arrays only

4. Numberless pages. It is possible to create a page label that doesn't use any sequential numbering and is just a text string. This is done by omitting the S (style) entry in the page label dictionary and setting the P (prefix) entry (see the note to Table 159 in the ISO 32000 standard).
   [PDFBOX-90]: works
   [PDFBOX-532]: returns incorrect labels (see the last else clause in the getNextLabel() method)

5. Mapping direction:
   [PDFBOX-90]: label -> page index, page index -> label
   [PDFBOX-532]: page index -> label (of course, this can be inverted in user code, so it isn't that big an issue)

6. Roman numerals support:
   [PDFBOX-90]: unbounded (Adobe Reader like)
   [PDFBOX-532]: up to 4000

7. Test cases:
   [PDFBOX-90]: none :(
   [PDFBOX-532]: 1
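
To make points 4 to 6 a bit more concrete, here is a minimal Java sketch of how a label could be derived from a page index and back. It is not code from either patch: the class PageLabelSketch, the Range holder and all method names are hypothetical, the TreeMap stands in for an already flattened number tree, and emitting extra "M"s beyond 3999 is just one assumed way to keep Roman numerals unbounded.

```java
import java.util.Locale;
import java.util.Map;
import java.util.TreeMap;

// Hypothetical illustration only; neither patch is implemented this way.
class PageLabelSketch {

    // One label range: S (style, null if omitted), P (prefix, may be null), St (start value).
    static final class Range {
        final String style;
        final String prefix;
        final int start;

        Range(String style, String prefix, int start) {
            this.style = style;
            this.prefix = prefix;
            this.start = start;
        }
    }

    private static final int[] VALUES = {1000, 900, 500, 400, 100, 90, 50, 40, 10, 9, 5, 4, 1};
    private static final String[] SYMBOLS =
            {"M", "CM", "D", "CD", "C", "XC", "L", "XL", "X", "IX", "V", "IV", "I"};

    // Greedy conversion with no upper bound: values of 4000 and above
    // simply receive additional leading "M"s (an assumption of this sketch).
    static String toRoman(int n) {
        StringBuilder sb = new StringBuilder();
        for (int i = 0; i < VALUES.length; i++) {
            while (n >= VALUES[i]) {
                sb.append(SYMBOLS[i]);
                n -= VALUES[i];
            }
        }
        return sb.toString();
    }

    // 1..26 -> A..Z, 27..52 -> AA..ZZ, and so on.
    static String toLetters(int n) {
        int repeat = (n - 1) / 26 + 1;
        char c = (char) ('A' + (n - 1) % 26);
        StringBuilder sb = new StringBuilder();
        for (int i = 0; i < repeat; i++) {
            sb.append(c);
        }
        return sb.toString();
    }

    // Page index (0-based) -> label. Map keys are the first page index of each range.
    static String label(TreeMap<Integer, Range> ranges, int pageIndex) {
        Map.Entry<Integer, Range> entry = ranges.floorEntry(pageIndex);
        if (entry == null) {
            // No range covers this page; falling back to plain decimals is an assumption.
            return String.valueOf(pageIndex + 1);
        }
        Range r = entry.getValue();
        String prefix = r.prefix == null ? "" : r.prefix;
        if (r.style == null) {
            return prefix; // numberless page: the label is the prefix alone
        }
        int n = r.start + (pageIndex - entry.getKey());
        switch (r.style) {
            case "D": return prefix + n;
            case "R": return prefix + toRoman(n);
            case "r": return prefix + toRoman(n).toLowerCase(Locale.ROOT);
            case "A": return prefix + toLetters(n);
            case "a": return prefix + toLetters(n).toLowerCase(Locale.ROOT);
            default: throw new IllegalArgumentException("Unknown style: " + r.style);
        }
    }

    // Label -> page index by linear scan; returns -1 if no page carries the label.
    static int indexOf(TreeMap<Integer, Range> ranges, int pageCount, String wanted) {
        for (int i = 0; i < pageCount; i++) {
            if (label(ranges, i).equals(wanted)) {
                return i;
            }
        }
        return -1;
    }
}
```

With three ranges (page 0: prefix "Cover" and no style; page 1: style r; page 4: style D, prefix "A-", start 8), pages 0 to 5 are labelled Cover, i, ii, iii, A-8 and A-9. The linear scan in indexOf is the kind of inversion that user code could also do on top of a page index -> label mapping.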

So, aside from the test cases (which I could improve), I'd favor my implementation (patch for PDFBOX-90). Of course, I also might be just biased ;)

I hope this comparison will make the decision a little bit easier. Whatever it turns out to be, thanks for this excellent library and keep up the good work!


--
Best regards,
Igor
