[
https://issues.apache.org/jira/browse/PDFBOX-2105?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
]
Antti Lankila updated PDFBOX-2105:
----------------------------------
Attachment: pdfbox-multipagetiff.diff
Here's the patch. Please let me know if this form is acceptable, or if we
should change the usage a bit.
An alternative design for this feature would be a static method that returns
a List<Integer> of addresses corresponding to the starting points of the TIFF
images in the RandomAccessBuffer. Instead of passing the page number, we would
pass the address. This would be slightly better in that:
1) it avoids the O(N^2) algorithm for extracting TIFF pages
2) it lets the user discover how many pages a TIFF contains
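
As a rough illustration of the alternative design, the sketch below walks the
IFD (image file directory) chain of a classic TIFF once and collects the offset
of each directory, so every page is visited a single time and the size of the
list gives the page count. It reads from a plain byte[] rather than PDFBox's
RandomAccessBuffer, and the names TiffIfdScanner and ifdOffsets are
hypothetical, not part of the attached patch. It assumes the classic
(non-BigTIFF) layout with 32-bit offsets.

```java
import java.nio.ByteBuffer;
import java.nio.ByteOrder;
import java.util.ArrayList;
import java.util.LinkedHashSet;
import java.util.List;
import java.util.Set;

// Hypothetical helper, not part of the patch: lists the IFD offsets of a TIFF.
final class TiffIfdScanner {
    private TiffIfdScanner() {}

    /** Returns the byte offset of every IFD (one per page), in file order of the chain. */
    static List<Integer> ifdOffsets(byte[] tiff) {
        if (tiff.length < 8) {
            throw new IllegalArgumentException("Too short to be a TIFF");
        }
        ByteBuffer buf = ByteBuffer.wrap(tiff);
        // Bytes 0-1 select the byte order: "II" = little-endian, "MM" = big-endian.
        if (tiff[0] == 'I' && tiff[1] == 'I') {
            buf.order(ByteOrder.LITTLE_ENDIAN);
        } else if (tiff[0] == 'M' && tiff[1] == 'M') {
            buf.order(ByteOrder.BIG_ENDIAN);
        } else {
            throw new IllegalArgumentException("Missing TIFF byte-order mark");
        }
        // Bytes 2-3 hold the magic number 42.
        if (buf.getShort(2) != 42) {
            throw new IllegalArgumentException("Missing TIFF magic number");
        }

        // LinkedHashSet keeps insertion order and lets us detect a cyclic chain.
        Set<Integer> seen = new LinkedHashSet<>();
        // Bytes 4-7 hold the offset of the first IFD; 0 terminates the chain.
        long next = buf.getInt(4) & 0xFFFFFFFFL;
        while (next != 0) {
            if (next + 2 > tiff.length) {
                throw new IllegalArgumentException("IFD offset past end of data");
            }
            int pos = (int) next;
            if (!seen.add(pos)) {
                throw new IllegalArgumentException("Cyclic IFD chain");
            }
            // An IFD is: 2-byte entry count, 12 bytes per entry, 4-byte next-IFD offset.
            int entries = buf.getShort(pos) & 0xFFFF;
            long nextField = next + 2 + 12L * entries;
            if (nextField + 4 > tiff.length) {
                throw new IllegalArgumentException("Truncated IFD");
            }
            next = buf.getInt((int) nextField) & 0xFFFFFFFFL;
        }
        return new ArrayList<>(seen);
    }
}
```

A caller could then hand each returned offset to the extraction code in place
of a page number, which avoids re-scanning the chain from the start for every
page.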
> Support for multipage TIFFs in CCITTFactory, makes PDFBox capable of doing
> tiff2pdf
> -----------------------------------------------------------------------------------
>
> Key: PDFBOX-2105
> URL: https://issues.apache.org/jira/browse/PDFBOX-2105
> Project: PDFBox
> Issue Type: Improvement
> Components: PDModel
> Reporter: Antti Lankila
> Priority: Minor
> Labels: features, patch
> Attachments: pdfbox-multipagetiff.diff
>
> Original Estimate: 1h
> Remaining Estimate: 1h
>
> I created a patch based on Sergey Ushakov's work that handles multipage
> TIFFs. This allows fast and efficient conversion from TIFF to PDF.
> The general approach is to provide a new factory method that accepts an image
> (page) number; the appropriate page is then located when the CCITT
> stream is being extracted.
> There's a minor inefficiency in this approach because the seek starts from
> the beginning for each page, which makes extracting every page an O(N^2)
> operation. However, the maximum file size appears to be 2 GB and the cost of
> finding a single page is still low, so I bet this will never come up in practice.
> There is no method that tells how many pages a TIFF file has. I opted to
> simply return null from the factory method that accepts a page number if there
> is no such page, so users can use this as the condition to break out of a
> TIFF-to-PDF conversion loop.
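
The null-terminated loop described above might look roughly like this. It is a
sketch only: the page factory is passed in as a generic function so the example
stays self-contained, the names NullTerminatedPages and readAll are
hypothetical, and zero-based page numbering is an assumption. In real use the
function would wrap the new CCITTFactory method from the patch.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.function.IntFunction;

// Hypothetical helper illustrating the "call until null" conversion loop.
final class NullTerminatedPages {
    private NullTerminatedPages() {}

    /**
     * Calls pageFactory with 0, 1, 2, ... and collects the results until the
     * factory returns null, which signals that there is no such page.
     */
    static <T> List<T> readAll(IntFunction<T> pageFactory) {
        List<T> pages = new ArrayList<>();
        for (int n = 0; ; n++) {
            T page = pageFactory.apply(n);
            if (page == null) {
                break; // no such page: the TIFF is exhausted
            }
            pages.add(page);
        }
        return pages;
    }
}
```

Because each call to the factory seeks from the start of the file, this loop is
where the O(N^2) behaviour mentioned above would show up for a TIFF with many
pages.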