Re: PDFBox 2.0
On 02.03.2015 at 22:49, Eric Douglas wrote:

> Hi, I hope this is the right place; it's my first time using this list. I'm working with a 2.0 Maven trunk I found and I have a couple of issues. I can't seem to find this class, which is in some online javadocs referencing PDFBox 1.8:
> - org.apache.pdfbox.pdfviewer.PDFPagePanel

org.apache.pdfbox.tools.gui.PDFPagePanel

> Now, I'm trying to use PDFBox on a server to get PDF pages, then serialize one page at a time to render on a client. What's the best way to do this?

None at all. You can't cut a PDF like a salami. There is the PDFSplit application, but the combined size of all the pages will be larger than the size of the original PDF, because of common objects.

Tilman

> I'm trying to avoid serializing the entire document, since the load time is much faster doing single pages, but PDPage is not serializable. org.apache.pdfbox.rendering.PDFRenderer accepts the document in the constructor but doesn't seem to need it to render a single page, other than to get the page from it in renderPageToGraphics, which could easily gain an overload that just accepts the PDPage object.

- To unsubscribe, e-mail: users-unsubscr...@pdfbox.apache.org
For additional commands, e-mail: users-h...@pdfbox.apache.org
PDFBox 2.0
Hi,

I hope this is the right place; it's my first time using this list. I'm working with a 2.0 Maven trunk I found and I have a couple of issues.

I can't seem to find this class, which is in some online javadocs referencing PDFBox 1.8:
- org.apache.pdfbox.pdfviewer.PDFPagePanel

Now, I'm trying to use PDFBox on a server to get PDF pages, then serialize one page at a time to render on a client. What's the best way to do this? I'm trying to avoid serializing the entire document, since the load time is much faster doing single pages, but PDPage is not serializable. org.apache.pdfbox.rendering.PDFRenderer accepts the document in the constructor but doesn't seem to need it to render a single page, other than to get the page from it in renderPageToGraphics, which could easily gain an overload that just accepts the PDPage object.
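One way around the non-serializable PDPage, which is not something proposed in this thread, is to ship rendered page images instead of page objects. The sketch below assumes the server has already rendered a page into a BufferedImage (for example through PDFRenderer; that call is left out so the example runs with the JDK alone). PageImageCodec and its two methods are hypothetical names chosen for illustration.

```java
import java.awt.image.BufferedImage;
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.UncheckedIOException;
import javax.imageio.ImageIO;

/** Hypothetical helper: turns one rendered page image into bytes and back. */
final class PageImageCodec {

    /** Server side: encode an already rendered page as PNG bytes. */
    static byte[] encode(BufferedImage page) {
        try (ByteArrayOutputStream out = new ByteArrayOutputStream()) {
            if (!ImageIO.write(page, "png", out)) {
                throw new IllegalStateException("no PNG writer available");
            }
            return out.toByteArray();
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    /** Client side: decode the PNG bytes back into an image that can be painted. */
    static BufferedImage decode(byte[] png) {
        try {
            BufferedImage image = ImageIO.read(new ByteArrayInputStream(png));
            if (image == null) {
                throw new IllegalArgumentException("bytes are not a readable image");
            }
            return image;
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    public static void main(String[] args) {
        // Stand-in for a rendered page; a real server would get this from its PDF renderer.
        BufferedImage page = new BufferedImage(200, 300, BufferedImage.TYPE_INT_RGB);
        byte[] wire = encode(page);            // what would travel to the client
        BufferedImage received = decode(wire); // what the client would paint
        System.out.println(received.getWidth() + "x" + received.getHeight());
    }
}
```

The trade-off is that the client receives pixels only, so text selection is lost and zooming requires a new render on the server.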
Re: PDFBox 2.0.0 and UTF8 chars
Hi,

Tilman Hausherr thaush...@t-online.de wrote on 1 March 2015 at 19:54:

> Heh heh, I wanted to make a similar comment, but then I saw the stack trace showing that he did just that...

Oops, you are right. The stack trace doesn't belong to the listed code. So most likely there is an issue with that specific font: either a malformed font or a FontBox issue.

BR
Andreas Lehmkühler

> Tilman

On 01.03.2015 at 18:53, Andreas Lehmkuehler wrote:

Hi,

On 28.02.2015 at 11:52, Ivan Klaric wrote:

Hello good PDFBox people,

I am working on a pet project with PDFBox and I encountered what seems to be an issue with UTF-8 chars. If you take the following standard example:

    public static void main(String[] args) {
        try {
            PDDocument document = new PDDocument();
            PDPage page = new PDPage();
            document.addPage( page );
            PDFont font = PDTrueTypeFont.loadTTF(document, new File("res/Roboto-Regular.ttf"));

Try to load the TTF font as a Type0 font:

            PDFont font = PDType0Font.load(document, new File("res/Roboto-Regular.ttf"));

BR
Andreas Lehmkühler

            PDPageContentStream contentStream = null;
            contentStream = new PDPageContentStream(document, page);
            contentStream.beginText();
            contentStream.setFont( font, 12 );
            contentStream.moveTextPositionByAmount( 100, 700 );
            contentStream.drawString( "Hello World čćžšđČĆŽŠĐ" );
            contentStream.endText();
            contentStream.close();
            document.save( "/tmp/HelloWorld.pdf" );
            document.close();
        } catch (IOException e) {
            e.printStackTrace();
        }
    }

(those weird characters in the drawString method are some pretty common Croatian letters).
This is what I get:

java.io.IOException: Error: Could not find referenced cmap stream Identity-H
    at org.apache.fontbox.cmap.CMapParser.getExternalCMap(CMapParser.java:418)
    at org.apache.fontbox.cmap.CMapParser.parsePredefined(CMapParser.java:84)
    at org.apache.pdfbox.pdmodel.font.CMapManager.getPredefinedCMap(CMapManager.java:54)
    at org.apache.pdfbox.pdmodel.font.PDType0Font.readEncoding(PDType0Font.java:159)
    at org.apache.pdfbox.pdmodel.font.PDType0Font.<init>(PDType0Font.java:119)
    at org.apache.pdfbox.pdmodel.font.PDType0Font.load(PDType0Font.java:59)
    at com.company.Main.main(Main.java:20)
    at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
    at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62)
    at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
    at java.lang.reflect.Method.invoke(Method.java:483)
    at com.intellij.rt.execution.application.AppMain.main(AppMain.java:134)

Am I doing something wrong? I took the Roboto-Regular font from here: http://www.fontsquirrel.com/fonts/roboto

If I remove the weird Croatian characters, the error remains the same. However, if I use PDTrueTypeFont.loadTTF() (which seems to be deprecated), the same code works without the Croatian characters.
If I put the Croatian characters back in (and use PDTrueTypeFont), I get:

Exception in thread "main" java.lang.IllegalArgumentException: U+010D is not available in this font's Encoding
    at org.apache.pdfbox.pdmodel.font.PDTrueTypeFont.encode(PDTrueTypeFont.java:261)
    at org.apache.pdfbox.pdmodel.font.PDFont.encode(PDFont.java:268)
    at org.apache.pdfbox.pdmodel.PDPageContentStream.showText(PDPageContentStream.java:316)
    at org.apache.pdfbox.pdmodel.PDPageContentStream.drawString(PDPageContentStream.java:282)
    at com.company.Main.main(Main.java:25)
    at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
    at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62)
    at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
    at java.lang.reflect.Method.invoke(Method.java:483)
    at com.intellij.rt.execution.application.AppMain.main(AppMain.java:134)

I manually looked into the font file and it seems to contain the U+010D character. What am I doing wrong here?

Thanks,
Ivan
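The second exception is consistent with a simple TrueType font being limited to a single-byte encoding, even when the font file itself contains the glyph. As a rough illustration, assuming that encoding is WinAnsi and approximating it with Java's windows-1252 charset (the two are close but not identical, and neither assumption comes from the thread), the sketch below lists which characters of the test string have no slot in such an encoding. SingleByteEncodingCheck is a hypothetical helper and does not call PDFBox.

```java
import java.nio.charset.Charset;
import java.nio.charset.CharsetEncoder;

/** Hypothetical helper: finds characters a single-byte encoding cannot represent. */
final class SingleByteEncodingCheck {

    /** Returns, in order, the characters of text that windows-1252 cannot encode. */
    static String unencodable(String text) {
        // Assumption: windows-1252 stands in for the font's single-byte encoding.
        CharsetEncoder encoder = Charset.forName("windows-1252").newEncoder();
        StringBuilder missing = new StringBuilder();
        for (int i = 0; i < text.length(); i++) {
            char c = text.charAt(i);
            if (!encoder.canEncode(c)) {
                missing.append(c);
            }
        }
        return missing.toString();
    }

    public static void main(String[] args) {
        // "Hello World" followed by the ten Croatian letters from the example,
        // written as escapes so the source file encoding does not matter.
        String text = "Hello World \u010D\u0107\u017E\u0161\u0111\u010C\u0106\u017D\u0160\u0110";
        for (char c : unencodable(text).toCharArray()) {
            System.out.printf("U+%04X has no single-byte code%n", (int) c);
        }
    }
}
```

For the string from the example this reports U+010D, U+0107, U+0111, U+010C, U+0106 and U+0110, while ž, š, Ž and Š do exist in windows-1252. That matches the exception naming U+010D, the first of the missing characters. A Type0 (composite) font is not bound to a single-byte table, which is why it is the suggested route in the reply.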
Re: PDF without stamps
The fact that they are custom annotations does not mean they are invalid, just non-standard.

On Mon, Mar 2, 2015 at 5:00 PM, Kevin Morin mo...@codelutin.com wrote:

Ok, but when I open the file with any PDF reader (evince, xpdf, ...) the stamps are displayed (and xpdf does not log errors).

On 02/03/2015 16:53, Gilad Denneboom wrote:

My guess is that it's a custom-made annotation type, and therefore not one that PDFBox will recognize.

On Mon, Mar 2, 2015 at 4:23 PM, Kevin Morin mo...@codelutin.com wrote:

Hi all,

I found a PDF file which has some kind of stamps, but these stamps are not displayed... When I open the file with evince, I get the following warnings:

Unimplemented annotation: POPPLER_ANNOT_FREE_TEXT
Unimplemented annotation: POPPLER_ANNOT_STAMP
Unimplemented annotation: POPPLER_ANNOT_INK

but it is displayed correctly. I have no particular logs with PDFBox. I cannot send the file publicly; who can I send it to? Or is there something I am missing?

BR
Kevin
Re: PDF without stamps
My guess is that it's a custom-made annotation type, and therefore not one that PDFBox will recognize.

On Mon, Mar 2, 2015 at 4:23 PM, Kevin Morin mo...@codelutin.com wrote:

Hi all,

I found a PDF file which has some kind of stamps, but these stamps are not displayed... When I open the file with evince, I get the following warnings:

Unimplemented annotation: POPPLER_ANNOT_FREE_TEXT
Unimplemented annotation: POPPLER_ANNOT_STAMP
Unimplemented annotation: POPPLER_ANNOT_INK

but it is displayed correctly. I have no particular logs with PDFBox. I cannot send the file publicly; who can I send it to? Or is there something I am missing?

BR
Kevin
Re: Error on PDDocument.load
Hi,

Andreas, you said in the issue that you have a solution in mind. Did you succeed in fixing it or not? It seems that my users have a lot of files of this kind...

Thanks
BR
Kevin

On 11/02/2015 23:16, Tilman Hausherr wrote:

I wasn't able to create a non-confidential version of the file that works with Adobe Reader. But here's an issue and a proposed patch.

https://issues.apache.org/jira/browse/PDFBOX-2679

Tilman

On 11.02.2015 at 18:54, Tilman Hausherr wrote:

No, his file is confidential. However, we might create a non-confidential file that has the same error.

Tilman

On 11.02.2015 at 18:40, John Hewson wrote:

Can we get a JIRA issue open for this, preferably with the file attached?

-- John

On 11 Feb 2015, at 00:29, Tilman Hausherr thaush...@t-online.de wrote:

Yes, they made hacks. So did we, for many types of malformed files. Please send the file to Andreas as well, unless you already did; he has written many workarounds for malformed files.

Tilman

On 11.02.2015 at 09:05, Kevin Morin wrote:

Ok. Why is other software (like xpdf) able to open it? I guess they made a hack to fix this? Are you going to do something too?

Thanks
BR
Kevin

On 11/02/2015 08:53, Tilman Hausherr wrote:

Hi,

I can reproduce the error. Your file is malformed. Please open it with NOTEPAD++ and go to the end:

xref
1 7
00 65535 f
09 0 n
358745 0 n
358842 0 n
359029 0 n
359087 0 n
359138 0 n
trailer

The first number (1) is the object number of the first entry, so the table claims to start at object 1. The second number (7) is the size of the table. The number 1 is incorrect; it should be 0, because "00 65535 f" is the dummy object 0. Press CTRL-G and enter the offsets (e.g. 9, 45, 358745, ...) and you will see what I mean.

From the PDF spec:

"The free entries in the cross-reference table form a linked list, with each free entry containing the object number of the next. The first entry in the table (object number 0) is always free and has a generation number of 65,535; it is the head of the linked list of free objects"

Tilman

On 11.02.2015 at 08:21, Kevin Morin wrote:

Hi,

I am sorry, it seems that I did not send you the right file... Actually, I had also been testing the wrong file on Linux from the beginning. The file also displays blank on Linux, on Java 7 and 8... Here is the right file. I am sorry to have made you work for nothing...

BR
Kevin

On 10/02/2015 21:32, Tilman Hausherr wrote:

So we e-mailed and the result is:
- you're really working on W2008 with the file that you sent me
- you get the same error on W2008 with the app (and I don't)

I have analysed that file and added some debug traces. If loading that on W2008 is a no-no, you'd have to build from source and I'll tell you the changes.

http://home.snafu.de/tilman/tmp/pdfbox-app-2.0.0-TILMAN.jar

Don't use that version for production. It contains lots of stuff for my own tests. Only use it for this problem.
Here's the output that you should get:

Feb 10, 2015 9:27:18 PM org.apache.pdfbox.pdfparser.COSParser parseXrefStream
INFORMATION: parseXrefStream: objByteOffset = 116
Feb 10, 2015 9:27:18 PM org.apache.pdfbox.pdfparser.PDFXrefStreamParser parse
INFORMATION: PDFXrefStreamParser: 7 0 obj at offset: 16
Feb 10, 2015 9:27:18 PM org.apache.pdfbox.pdfparser.PDFXrefStreamParser parse
INFORMATION: PDFXrefStreamParser: 8 0 obj at offset: 573
Feb 10, 2015 9:27:18 PM org.apache.pdfbox.pdfparser.PDFXrefStreamParser parse
INFORMATION: PDFXrefStreamParser: 9 0 obj at offset: 633
Feb 10, 2015 9:27:18 PM org.apache.pdfbox.pdfparser.PDFXrefStreamParser parse
INFORMATION: PDFXrefStreamParser: 10 0 obj at offset: 817
Feb 10, 2015 9:27:18 PM org.apache.pdfbox.pdfparser.PDFXrefStreamParser parse
INFORMATION: PDFXrefStreamParser: 11 0 obj at offset: 914
Feb 10, 2015 9:27:18 PM org.apache.pdfbox.pdfparser.PDFXrefStreamParser parse
INFORMATION: PDFXrefStreamParser: 12 0 obj at offset: 116
Feb 10, 2015 9:27:18 PM org.apache.pdfbox.pdfparser.PDFXrefStreamParser parse
INFORMATION: PDFXrefStreamParser: 13 0 obj at offset: 436
Feb 10, 2015 9:27:18 PM org.apache.pdfbox.pdfparser.COSParser parseXrefStream
INFORMATION: parseXrefStream: objByteOffset = 363505
Feb 10, 2015 9:27:18 PM org.apache.pdfbox.pdfparser.PDFXrefStreamParser parse
INFORMATION: PDFXrefStreamParser: 1 0 obj at offset: 359638
Feb 10, 2015 9:27:18 PM org.apache.pdfbox.pdfparser.PDFXrefStreamParser parse
INFORMATION: PDFXrefStreamParser: 2 0 obj at offset: 363167
Feb 10, 2015 9:27:18 PM org.apache.pdfbox.pdfparser.PDFXrefStreamParser parse
INFORMATION: PDFXrefStreamParser: 3 0 obj at offset: 363307
Feb 10, 2015 9:27:18 PM org.apache.pdfbox.pdfparser.PDFXrefStreamParser parse
INFORMATION: PDFXrefStreamParser: 4 0 obj at offset: 363505
Feb 10, 2015 9:27:18 PM org.apache.pdfbox.pdfparser.PDFXrefStreamParser parse
INFORMATION: PDFXrefStreamParser: 5 stmnr: 2
Feb 10, 2015 9:27:18 PM org.apache.pdfbox.pdfparser.PDFXrefStreamParser parse
INFORMATION: PDFXrefStreamParser: 6 stmnr:
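
The xref problem Tilman describes earlier in this thread, a subsection header of "1 7" whose first entry is really the free-list head, can be sketched as a small check. XrefSubsectionCheck is a hypothetical helper: it is not PDFBox code and not the patch attached to PDFBOX-2679. It assumes the header line and the first entry line have already been extracted as strings, and it treats "free with generation 65535" as a heuristic for recognising object 0.

```java
/** Hypothetical helper: detects an xref subsection that should start at object 0. */
final class XrefSubsectionCheck {

    /**
     * Returns the object number the subsection should start at.
     *
     * @param header     the subsection header, "firstObjectNumber entryCount", e.g. "1 7"
     * @param firstEntry the first entry line, "offset generation f|n"
     */
    static int correctedStart(String header, String firstEntry) {
        String[] headerParts = header.trim().split("\\s+");
        int declaredStart = Integer.parseInt(headerParts[0]);

        String[] entryParts = firstEntry.trim().split("\\s+");
        // Heuristic: a free entry with generation 65535 is taken to be the
        // head of the free list, which the spec says is always object 0.
        boolean looksLikeFreeHead = entryParts.length == 3
                && entryParts[2].equals("f")
                && Integer.parseInt(entryParts[1]) == 65535;

        return looksLikeFreeHead ? 0 : declaredStart;
    }

    public static void main(String[] args) {
        // Header and first entry as they appear in the malformed file from the thread.
        int start = correctedStart("1 7", "00 65535 f");
        System.out.println("subsection should start at object " + start);
    }
}
```

Splitting on whitespace means the check accepts both the zero-padded form of an entry and the shortened form shown in the message above.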
Re: PDF without stamps
Ok, but when I open the file with any PDF reader (evince, xpdf, ...) the stamps are displayed (and xpdf does not log errors).

On 02/03/2015 16:53, Gilad Denneboom wrote:

My guess is that it's a custom-made annotation type, and therefore not one that PDFBox will recognize.

On Mon, Mar 2, 2015 at 4:23 PM, Kevin Morin mo...@codelutin.com wrote:

Hi all,

I found a PDF file which has some kind of stamps, but these stamps are not displayed... When I open the file with evince, I get the following warnings:

Unimplemented annotation: POPPLER_ANNOT_FREE_TEXT
Unimplemented annotation: POPPLER_ANNOT_STAMP
Unimplemented annotation: POPPLER_ANNOT_INK

but it is displayed correctly. I have no particular logs with PDFBox. I cannot send the file publicly; who can I send it to? Or is there something I am missing?

BR
Kevin
Re: PDF without stamps
Hi Kevin,

you can send it to me and I'll take a look.

BR
Maruan

On 02.03.2015 at 16:23, Kevin Morin mo...@codelutin.com wrote:

Hi all,

I found a PDF file which has some kind of stamps, but these stamps are not displayed... When I open the file with evince, I get the following warnings:

Unimplemented annotation: POPPLER_ANNOT_FREE_TEXT
Unimplemented annotation: POPPLER_ANNOT_STAMP
Unimplemented annotation: POPPLER_ANNOT_INK

but it is displayed correctly. I have no particular logs with PDFBox. I cannot send the file publicly; who can I send it to? Or is there something I am missing?

BR
Kevin