[ https://issues.apache.org/jira/browse/PDFBOX-1886?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14239855#comment-14239855 ]
Tilman Hausherr commented on PDFBOX-1886:
-----------------------------------------

Why do you think that the OCR is missing? I can copy & paste text from the "santa-cruz-flats-project-part-2 (1).pdf" file.

> Merge Function strips OCR layer in acrobat
> ------------------------------------------
>
>                 Key: PDFBOX-1886
>                 URL: https://issues.apache.org/jira/browse/PDFBOX-1886
>             Project: PDFBox
>          Issue Type: Bug
>          Components: Utilities
>    Affects Versions: 1.8.4
>            Reporter: adam brin
>             Fix For: 2.1.0
>
>         Attachments: cover_page4818280580458469287.pdf, page1.pdf, santa-cruz-flats-project-part-2 (1).pdf
>
>
> We use the PDFMergerUtility to add cover pages to documents automatically.
> We're finding that when we do so, it strips the OCR data from the source of the merged files.
> {code}
> PDFMergerUtility merger = new PDFMergerUtility();
> File outputFile = File.createTempFile();
> merger.setDestinationStream(new FileOutputStream(outputFile));
> for (File file : files) {
>     merger.addSource(file);
> }
> merger.mergeDocuments();
> return outputFile;
> {code}

--
This message was sent by Atlassian JIRA
(v6.3.4#6332)