Tilman,

Please accept my sincere apologies for incorrectly calling you Tim! This was a genuine oversight.

Here are my issues:

Problem 1:

We used PDFBox 1.8.7 to merge these two files:

https://www.dropbox.com/s/7lnbdieo9t8k38e/good.pdf?dl=0
https://www.dropbox.com/s/35lafjdqrt7vy3e/michael%20levine.pdf?dl=0

This is the resulting merged file:

https://www.dropbox.com/s/gwmbd053269at0p/Merged%20PDF.pdf?dl=0

The problem: page TL-9 appears black, as shown here:

https://www.dropbox.com/s/09bcw1h87f5hbyy/Screenshot%202014-10-03%20at%203.28.51%20PM.png?dl=0

--------

Problem 2:

We used PDFBox 1.8.7 to merge these two files:

https://www.dropbox.com/s/7lnbdieo9t8k38e/good.pdf?dl=0
https://www.dropbox.com/s/dwlxoj2hpvbnr5i/badform.pdf?dl=0

The merge does not proceed because badform.pdf is password-encrypted. Does PDFBox have a way to handle password-encrypted files? Strangely, the file can be opened normally (without the need to enter a password)!

We had another 8 files that did not merge properly with 1.8.6 but now merge fine with 1.8.7. Only the two issues above are outstanding.

Thanks,
Marc

On Oct 2, 2014, at 3:23 PM, Tilman Hausherr <[email protected]> wrote:

> On 02.10.2014 at 20:28, Marc Davis wrote:
>> Tim, 1.8.7 seems to have fixed all our issues! Thanks so much for
>> recommending this.
>
> I'm "Tilman". "Tim" is a (very nice) committer from Apache TIKA, a project
> that does use PDFBox.
>
>> We do have two images that seem troublesome:
>>
>> https://www.dropbox.com/s/35lafjdqrt7vy3e/michael%20levine.pdf?dl=0 (after
>> merging TL-9 page is black)
>
> Then please post the other file, and the result. In other words - just assume
> we're dumb and lazy, so please provide every file / step that produces an
> error, rather describe more than needed. Even then, solutions may take some
> time:
> https://issues.apache.org/jira/browse/PDFBOX-1511
> took over a year and was a group effort of at least six people.
>
> And there's a contradiction: you're writing "1.8.7 seems to have fixed all
> our issues", but then you're mentioning two new problems...
>
>> https://www.dropbox.com/s/dwlxoj2hpvbnr5i/badform.pdf?dl=0 (file is password
>> protected, does PDFBox have a way around this?)
>
> I was able to display it in the browser. I didn't test it with PDFBox; some
> files are protected with the empty password. If you use the new nonSeq parser
> (loadNonSeq()), just use "" as extra parameter. If you use load(), then it is
> more complex, then use openProtection() (download the source code to see how)
>
> Tilman
>
>>
>> I’d love to hear your thoughts on this...
>>
>> Thanks,
>> Marc
>>
>>
>>
>> On Oct 1, 2014, at 3:18 PM, Tilman Hausherr <[email protected]> wrote:
>>
>>> Then please retry with 1.8.7, because the problem should be fixed there,
>>> hopefully. (A problem related to identically named resources in both PDF
>>> files)
>>>
>>> if it still happens, please open an issue in JIRA, and attach the two PDF
>>> files and the result. If the files are confidential, please try producing
>>> non-confidential files.
>>>
>>> Tilman
>>>
>>> On 01.10.2014 at 21:13, Marc Davis wrote:
>>>> I am using v1.8.6
>>>>
>>>> Thanks,
>>>> Marc
>>>>
>>>>
>>>>
>>>> On Oct 1, 2014, at 3:08 PM, Tilman Hausherr <[email protected]> wrote:
>>>>
>>>>> What version are you using? We recently fixed a bug with merge.
>>>>>
>>>>> Tilman
>>>>>
>>>>> On 01.10.2014 at 15:21, Marc Davis wrote:
>>>>>> I use pdfbox to merge PDF files but we find that many files from
>>>>>> scanners or files generated from AutoCAD do not merge properly (they are
>>>>>> either blank or missing fonts). However, when we open and save the file
>>>>>> in a native reader such as Adobe Reader (Windows) or Preview in Mac, and
>>>>>> then merge again, the merge works fine!
>>>>>>
>>>>>> Is there a workaround for this in PDFBox?
>>>>>>
>>>>>> Thanks,
>>>>>> Marc
>>>>>>
>>>>>>
>>>>>>
>
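
For reference, the empty-password suggestion quoted above could look roughly like this against the PDFBox 1.8.x API. This is only a sketch, not something confirmed in the thread: it assumes badform.pdf really is protected with the empty user password, and the class name, the helper name decryptCopy and the output file names are made up for illustration. It writes an unencrypted copy first and then merges that copy, on the assumption that the merger refuses a source that still carries its encryption dictionary.

```java
import java.io.File;

import org.apache.pdfbox.pdmodel.PDDocument;
import org.apache.pdfbox.util.PDFMergerUtility;

// Hypothetical helper class; targets PDFBox 1.8.x (the 2.x API differs).
public class EmptyPasswordDecrypt {

    // Opens 'in' with the empty password via the non-sequential parser
    // and saves an unencrypted copy to 'out'.
    static void decryptCopy(File in, File out) throws Exception {
        // null scratch file = keep everything in memory; "" = empty password
        PDDocument doc = PDDocument.loadNonSeq(in, null, "");
        try {
            // Ask PDFBox to drop the encryption when the document is saved.
            doc.setAllSecurityToBeRemoved(true);
            doc.save(out.getPath());
        } finally {
            doc.close();
        }
    }

    public static void main(String[] args) throws Exception {
        // File names are illustrative assumptions.
        File decrypted = new File("badform-decrypted.pdf");
        decryptCopy(new File("badform.pdf"), decrypted);

        // Merge the unencrypted copy like any other PDF.
        PDFMergerUtility merger = new PDFMergerUtility();
        merger.addSource("good.pdf");
        merger.addSource(decrypted.getPath());
        merger.setDestinationFileName("merged.pdf");
        merger.mergeDocuments();
    }
}
```

If load() is used instead of loadNonSeq(), the quoted reply points to openProtection(); in 1.8.x that would presumably mean calling openProtection(new StandardDecryptionMaterial("")) on the loaded document before saving, but that path is untested here.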

