Re: PDFBox 2.0

2015-03-02 Thread Tilman Hausherr

On 02.03.2015 at 22:49, Eric Douglas wrote:

Hi, I hope this is to the right place, my first time using this list.
I'm working with a 2.0 maven trunk I found and I have a couple issues.
I can't seem to find this class which is in some online javadocs
referencing pdfbox 1.8:

- org.apache.pdfbox.pdfviewer.PDFPagePanel


org.apache.pdfbox.tools.gui.PDFPagePanel




Now, I'm trying to use pdfbox on a server to get pdf pages, then serialize
one page at a time to render on a client.  What's the best way to do this?


None at all. You can't cut a PDF like a salami. There is the PDFSplit
application, but the combined size of the split pages will be larger than
the original PDF, because objects shared between pages have to be duplicated.


Tilman


I'm trying to avoid serializing the entire document since the load time is
much faster doing single pages, but PDPage is not serializable.
  org.apache.pdfbox.rendering.PDFRenderer accepts the document on the
constructor but doesn't seem to need it to render a single page, other than
to get the page from it in renderPageToGraphics which could easily add an
overload to just accept the PDPage object.
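One alternative that nobody in this thread proposes, and that is offered here only as an assumption, is to render the page on the server and send the resulting image instead of a PDPage. The sketch below uses only the JDK and shows the transfer half: a BufferedImage (which could come from a renderer such as PDFRenderer; check the 2.0 javadocs for the exact rendering methods) is turned into PNG bytes and decoded again on the receiving side. The class and method names are made up for illustration.

```java
import java.awt.image.BufferedImage;
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.UncheckedIOException;
import javax.imageio.ImageIO;

/** Hypothetical helper: moves an already rendered page image across the wire as PNG bytes. */
class PageImageTransfer {

    static {
        // Keep ImageIO from using a disk cache for these small in-memory streams.
        ImageIO.setUseCache(false);
    }

    /** Server side: encode the rendered page as PNG (lossless). */
    static byte[] toPng(BufferedImage pageImage) {
        try {
            ByteArrayOutputStream out = new ByteArrayOutputStream();
            if (!ImageIO.write(pageImage, "png", out)) {
                throw new IOException("no PNG writer available");
            }
            return out.toByteArray();
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    /** Client side: decode the bytes back into an image that can be painted. */
    static BufferedImage fromPng(byte[] png) {
        try {
            BufferedImage image = ImageIO.read(new ByteArrayInputStream(png));
            if (image == null) {
                throw new IOException("bytes are not a readable image");
            }
            return image;
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    public static void main(String[] args) {
        // Stand-in for a rendered page; a real one would come from the PDF renderer.
        BufferedImage page = new BufferedImage(200, 100, BufferedImage.TYPE_INT_RGB);
        BufferedImage received = fromPng(toPng(page));
        System.out.println(received.getWidth() + "x" + received.getHeight());
    }
}
```

The trade-off is that the client receives pixels only: text selection, zooming without quality loss and anything else that needs the page structure would still require the PDF itself.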




-
To unsubscribe, e-mail: users-unsubscr...@pdfbox.apache.org
For additional commands, e-mail: users-h...@pdfbox.apache.org



PDFBox 2.0

2015-03-02 Thread Eric Douglas
Hi, I hope this is to the right place, my first time using this list.
I'm working with a 2.0 maven trunk I found and I have a couple issues.
I can't seem to find this class which is in some online javadocs
referencing pdfbox 1.8:

   - org.apache.pdfbox.pdfviewer.PDFPagePanel


Now, I'm trying to use pdfbox on a server to get pdf pages, then serialize
one page at a time to render on a client.  What's the best way to do this?
I'm trying to avoid serializing the entire document since the load time is
much faster doing single pages, but PDPage is not serializable.
 org.apache.pdfbox.rendering.PDFRenderer accepts the document on the
constructor but doesn't seem to need it to render a single page, other than
to get the page from it in renderPageToGraphics which could easily add an
overload to just accept the PDPage object.


Re: PDFBox 2.0.0 and UTF8 chars

2015-03-02 Thread Andreas Lehmkühler
Hi

 Tilman Hausherr thaush...@t-online.de wrote on 1 March 2015 at 19:54:
 
 
 Heh heh, I wanted to make a similar comment, but then I saw the stack 
 trace showing that he did just that...
Oops, you are right. The stack trace doesn't belong to the listed code. So, most
likely there is an issue with that specific font: either a malformed font or a
fontbox issue.

BR
Andreas Lehmkühler
 
 Tilman
 
 On 01.03.2015 at 18:53, Andreas Lehmkuehler wrote:
  Hi,
 
  On 28.02.2015 at 11:52, Ivan Klaric wrote:
  Hello good PDFBox people,
 
  I am working on a pet project with PDFBox and I encountered what seems to
  be an issue with UTF8 chars. If you take the following standard example:
 
   public static void main(String[] args) {
   try {
   PDDocument document = new PDDocument();
   PDPage page = new PDPage();
   document.addPage( page );
   PDFont font = PDTrueTypeFont.loadTTF(document, new File("res/Roboto-Regular.ttf"));
 
  Try to load the TTF font as a Type0 font:
 
  PDFont font = PDType0Font.load(document, new File("res/Roboto-Regular.ttf"));
 
  BR
  Andreas Lehmkühler
 
   PDPageContentStream contentStream = null;
   contentStream = new PDPageContentStream(document, page);
   contentStream.beginText();
   contentStream.setFont( font, 12 );
   contentStream.moveTextPositionByAmount( 100, 700 );
   contentStream.drawString( "Hello World čćžšđČĆŽŠĐ" );
   contentStream.endText();
   contentStream.close();
   document.save( "/tmp/HelloWorld.pdf" );
   document.close();
 
   } catch (IOException e) {
   e.printStackTrace();
   }
   }
 
  (Those weird characters in the drawString method are some pretty common
  Croatian letters.) This is what I get:
  java.io.IOException: Error: Could not find referenced cmap stream Identity-H
  at org.apache.fontbox.cmap.CMapParser.getExternalCMap(CMapParser.java:418)
  at org.apache.fontbox.cmap.CMapParser.parsePredefined(CMapParser.java:84)
  at org.apache.pdfbox.pdmodel.font.CMapManager.getPredefinedCMap(CMapManager.java:54)
  at org.apache.pdfbox.pdmodel.font.PDType0Font.readEncoding(PDType0Font.java:159)
  at org.apache.pdfbox.pdmodel.font.PDType0Font.init(PDType0Font.java:119)
  at org.apache.pdfbox.pdmodel.font.PDType0Font.load(PDType0Font.java:59)
  at com.company.Main.main(Main.java:20)
  at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
  at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62)
  at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
  at java.lang.reflect.Method.invoke(Method.java:483)
  at com.intellij.rt.execution.application.AppMain.main(AppMain.java:134)
 
 
  Am I doing something wrong? I took the Roboto-Regular font here:
  http://www.fontsquirrel.com/fonts/roboto
 
  If I remove the weird Croatian characters, the error remains the same.
  However, if I use PDTrueTypeFont.loadTTF() (which seems to be deprecated),
  the same thing works without the Croatian characters. If I put the
  Croatian characters back in (and use PDTrueTypeFont), I get
 
  Exception in thread "main" java.lang.IllegalArgumentException: U+010D is
  not available in this font's Encoding
  at org.apache.pdfbox.pdmodel.font.PDTrueTypeFont.encode(PDTrueTypeFont.java:261)
  at org.apache.pdfbox.pdmodel.font.PDFont.encode(PDFont.java:268)
  at org.apache.pdfbox.pdmodel.PDPageContentStream.showText(PDPageContentStream.java:316)
  at org.apache.pdfbox.pdmodel.PDPageContentStream.drawString(PDPageContentStream.java:282)
  at com.company.Main.main(Main.java:25)
  at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
  at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62)
  at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
  at java.lang.reflect.Method.invoke(Method.java:483)
  at com.intellij.rt.execution.application.AppMain.main(AppMain.java:134)
 
  I manually looked into the font file and it seems to contain the U+010D
  character. What am I doing wrong here?
 
  Thanks,
  Ivan
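The "U+010D is not available in this font's Encoding" message is about the font's encoding, not about which glyphs the font file contains. The following JDK-only sketch illustrates the idea under one assumption: that the simple TrueType font is limited to a single-byte encoding similar to WinAnsiEncoding, which is approximated here by Java's windows-1252 charset. The class and method names are invented for this illustration and are not part of PDFBox.

```java
import java.nio.charset.Charset;
import java.nio.charset.CharsetEncoder;
import java.util.ArrayList;
import java.util.List;

/** Hypothetical helper: lists the code points a given charset cannot encode. */
class SingleByteEncodingCheck {

    static List<String> unencodable(String text, Charset charset) {
        CharsetEncoder encoder = charset.newEncoder();
        List<String> missing = new ArrayList<>();
        text.codePoints().forEach(cp -> {
            String ch = new String(Character.toChars(cp));
            if (!encoder.canEncode(ch)) {
                missing.add(String.format("U+%04X", cp));
            }
        });
        return missing;
    }

    public static void main(String[] args) {
        // Assumption: windows-1252 stands in for the font's single-byte encoding.
        Charset winAnsiLike = Charset.forName("windows-1252");
        // "Hello World" followed by the Croatian letters from the example,
        // written as escapes so the source file encoding does not matter.
        String text = "Hello World "
                + "\u010D\u0107\u017E\u0161\u0111\u010C\u0106\u017D\u0160\u0110";
        System.out.println(unencodable(text, winAnsiLike));
    }
}
```

Under that assumption, the first code point reported is U+010D, the same one named in the exception, even though the font file itself contains the glyph. A composite (Type0) font does not have the 256-character limit, which is why the reply above suggests PDType0Font.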
 
 
 



Re: PDF without stamps

2015-03-02 Thread Gilad Denneboom
The fact they are custom annotations does not mean they are invalid, just
non-standard.

On Mon, Mar 2, 2015 at 5:00 PM, Kevin Morin mo...@codelutin.com wrote:

 Ok, but when I open the file with any pdf reader (evince, xpdf,...) the
 stamps are displayed (and xpdf does not log errors).


 On 02/03/2015 16:53, Gilad Denneboom wrote:

 My guess is that it's a custom-made annotation type, and therefore not one
 that PDFBox will recognize.

 On Mon, Mar 2, 2015 at 4:23 PM, Kevin Morin mo...@codelutin.com wrote:

  Hi all,

 I found a pdf file which has some kind of stamps, but these stamps are
 not displayed... When I open the file with evince, I get the following
 warnings: Unimplemented annotation: POPPLER_ANNOT_FREE_TEXT,
 Unimplemented annotation: POPPLER_ANNOT_STAMP, Unimplemented annotation:
 POPPLER_ANNOT_INK but it is displayed correctly. I have no particular logs
 with pdfbox.

 I cannot send the file publicly; who can I send it to? Or is there
 something I am missing?

 BR

 Kevin





Re: PDF without stamps

2015-03-02 Thread Gilad Denneboom
My guess is that it's a custom-made annotation type, and therefore not one
that PDFBox will recognize.

On Mon, Mar 2, 2015 at 4:23 PM, Kevin Morin mo...@codelutin.com wrote:

 Hi all,

 I found a pdf file which has some kind of stamps, but these stamps are
 not displayed... When I open the file with evince, I get the following
 warnings: Unimplemented annotation: POPPLER_ANNOT_FREE_TEXT,
 Unimplemented annotation: POPPLER_ANNOT_STAMP, Unimplemented annotation:
 POPPLER_ANNOT_INK but it is displayed correctly. I have no particular logs
 with pdfbox.

 I cannot send the file publicly; who can I send it to? Or is there
 something I am missing?

 BR

 Kevin





Re: Error on PDDocument.load

2015-03-02 Thread Kevin Morin

Hi,

Andreas, you said in the issue that you have a solution in mind, did you 
succeed in fixing it or not? It seems that my users have a lot of files 
of this kind...


Thanks
BR

Kevin

On 11/02/2015 23:16, Tilman Hausherr wrote:

I wasn't able to create a non-confidential version of the file that
works with Adobe Reader. But here's an issue and a proposed patch.

https://issues.apache.org/jira/browse/PDFBOX-2679

Tilman

On 11.02.2015 at 18:54, Tilman Hausherr wrote:

No, his file is confidential.

However, we might create a non-confidential file that has the same error.

Tilman

On 11.02.2015 at 18:40, John Hewson wrote:

Can we get a JIRA issue open for this, preferably with the file
attached?

-- John


On 11 Feb 2015, at 00:29, Tilman Hausherr thaush...@t-online.de
wrote:

Yes, they made hacks. So did we, for many types of malformed files.
Please also send the file to Andreas, unless you already did; he has
written many workarounds for malformed files.

Tilman


On 11.02.2015 at 09:05, Kevin Morin wrote:
Ok. Why is other software (like xpdf) able to open it? I guess
they made a hack to fix this? Are you going to do something too?

Thanks
BR

Kevin


On 11/02/2015 08:53, Tilman Hausherr wrote:
Hi,

I can reproduce the error. Your file is malformed. Please open it with
NOTEPAD++ and go to the end:

xref
1 7
00 65535 f
09 0 n
358745 0 n
358842 0 n
359029 0 n
359087 0 n
359138 0 n
trailer

The first number (1) is the object number of the first entry in the table,
so the table claims to start at object 1. The second number (7) is the
number of entries in the table. The number 1 is incorrect; it should be 0,
because 00 65535 f is the dummy object 0. Press CTRL-G and enter the
offsets (e.g. 9, 45, 358745, ...) and you will see what I mean.

 From the pdf spec:

The free entries in the cross-reference table form a linked list, with
each free entry containing the object number of the next. The first
entry in the table (object number 0) is always free and has a generation
number of 65,535; it is the head of the linked list of free objects.

Tilman
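The check described above can be sketched in a few lines of plain Java. This is only an illustration of the rule quoted from the spec, not how PDFBox detects or repairs the problem; the class name, the method and the sample table are made up, and the test is a heuristic: an entry marked free with generation 65535 is treated as object 0, so a subsection header that starts at another object number is flagged.

```java
import java.util.Arrays;
import java.util.List;

/** Hypothetical sketch: sanity-checks the first subsection of a classic xref table. */
class XrefSubsectionCheck {

    /**
     * @param lines the subsection header ("firstObject count") followed by its entries
     * @return a description of the problem, or null if nothing suspicious was found
     */
    static String check(List<String> lines) {
        String[] header = lines.get(0).trim().split("\\s+");
        int firstObject = Integer.parseInt(header[0]);
        int count = Integer.parseInt(header[1]);

        int found = lines.size() - 1;
        if (found != count) {
            return "header announces " + count + " entries, found " + found;
        }

        // Each entry reads "offset generation type"; type is n (in use) or f (free).
        String[] first = lines.get(1).trim().split("\\s+");
        boolean headOfFreeList = first.length == 3
                && "f".equals(first[2])
                && Integer.parseInt(first[1]) == 65535;
        if (headOfFreeList && firstObject != 0) {
            return "first entry looks like object 0, but the header starts at object "
                    + firstObject;
        }
        return null;
    }

    public static void main(String[] args) {
        // Made-up table with the same kind of mistake: the header should read "0 3".
        List<String> broken = Arrays.asList(
                "1 3",
                "0000000000 65535 f",
                "0000000009 00000 n",
                "0000000045 00000 n");
        System.out.println(check(broken));
    }
}
```

A lenient parser could go one step further and treat such a header as if it started at 0, which is the kind of workaround for malformed files discussed in this thread.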



On 11.02.2015 at 08:21, Kevin Morin wrote:
Hi,

I am sorry, it seems that I did not send you the right file...
Actually, I was also testing the wrong file on linux from the beginning.
The file also displays blank on linux, with java 7 or 8...
Here is the right file.

I am sorry to make you work for nothing...

BR

Kevin



On 10/02/2015 21:32, Tilman Hausherr wrote:
So we e-mailed and the result is
- you're really working on W2008 with the file that you sent me
- you get the same error on W2008 with the app (and I don't)

I have analysed that file and did some debug traces. If loading that on
W2008 is a no-no, you'd have to build from source and I'll tell you the
changes.

http://home.snafu.de/tilman/tmp/pdfbox-app-2.0.0-TILMAN.jar

Don't use that version for production. It contains lots of stuff for my
own tests. Only use it for this problem. Here's the output that you
should get:

Feb 10, 2015 9:27:18 PM org.apache.pdfbox.pdfparser.COSParser parseXrefStream
INFORMATION: parseXrefStream: objByteOffset = 116
Feb 10, 2015 9:27:18 PM org.apache.pdfbox.pdfparser.PDFXrefStreamParser parse
INFORMATION: PDFXrefStreamParser: 7 0 obj at offset: 16
Feb 10, 2015 9:27:18 PM org.apache.pdfbox.pdfparser.PDFXrefStreamParser parse
INFORMATION: PDFXrefStreamParser: 8 0 obj at offset: 573
Feb 10, 2015 9:27:18 PM org.apache.pdfbox.pdfparser.PDFXrefStreamParser parse
INFORMATION: PDFXrefStreamParser: 9 0 obj at offset: 633
Feb 10, 2015 9:27:18 PM org.apache.pdfbox.pdfparser.PDFXrefStreamParser parse
INFORMATION: PDFXrefStreamParser: 10 0 obj at offset: 817
Feb 10, 2015 9:27:18 PM org.apache.pdfbox.pdfparser.PDFXrefStreamParser parse
INFORMATION: PDFXrefStreamParser: 11 0 obj at offset: 914
Feb 10, 2015 9:27:18 PM org.apache.pdfbox.pdfparser.PDFXrefStreamParser parse
INFORMATION: PDFXrefStreamParser: 12 0 obj at offset: 116
Feb 10, 2015 9:27:18 PM org.apache.pdfbox.pdfparser.PDFXrefStreamParser parse
INFORMATION: PDFXrefStreamParser: 13 0 obj at offset: 436
Feb 10, 2015 9:27:18 PM org.apache.pdfbox.pdfparser.COSParser parseXrefStream
INFORMATION: parseXrefStream: objByteOffset = 363505
Feb 10, 2015 9:27:18 PM org.apache.pdfbox.pdfparser.PDFXrefStreamParser parse
INFORMATION: PDFXrefStreamParser: 1 0 obj at offset: 359638
Feb 10, 2015 9:27:18 PM org.apache.pdfbox.pdfparser.PDFXrefStreamParser parse
INFORMATION: PDFXrefStreamParser: 2 0 obj at offset: 363167
Feb 10, 2015 9:27:18 PM org.apache.pdfbox.pdfparser.PDFXrefStreamParser parse
INFORMATION: PDFXrefStreamParser: 3 0 obj at offset: 363307
Feb 10, 2015 9:27:18 PM org.apache.pdfbox.pdfparser.PDFXrefStreamParser parse
INFORMATION: PDFXrefStreamParser: 4 0 obj at offset: 363505
Feb 10, 2015 9:27:18 PM org.apache.pdfbox.pdfparser.PDFXrefStreamParser parse
INFORMATION: PDFXrefStreamParser: 5 stmnr: 2
Feb 10, 2015 9:27:18 PM org.apache.pdfbox.pdfparser.PDFXrefStreamParser parse
INFORMATION: PDFXrefStreamParser: 6 stmnr: 

Re: PDF without stamps

2015-03-02 Thread Kevin Morin
Ok, but when I open the file with any pdf reader (evince, xpdf,...) the 
stamps are displayed (and xpdf does not log errors).


On 02/03/2015 16:53, Gilad Denneboom wrote:

My guess is that it's a custom-made annotation type, and therefore not one
that PDFBox will recognize.

On Mon, Mar 2, 2015 at 4:23 PM, Kevin Morin mo...@codelutin.com wrote:


Hi all,

I found a pdf file which has some kind of stamps, but these stamps are
not displayed... When I open the file with evince, I get the following
warnings: Unimplemented annotation: POPPLER_ANNOT_FREE_TEXT,
Unimplemented annotation: POPPLER_ANNOT_STAMP, Unimplemented annotation:
POPPLER_ANNOT_INK but it is displayed correctly. I have no particular logs
with pdfbox.

I cannot send the file publicly; who can I send it to? Or is there
something I am missing?

BR

Kevin




Re: PDF without stamps

2015-03-02 Thread Maruan Sahyoun
Hi Kevin,

you can send it to me and I'll take a look.

BR

Maruan

On 02.03.2015 at 16:23, Kevin Morin mo...@codelutin.com wrote:

 Hi all,
 
 I found a pdf file which has some kind of stamps, but these stamps are not
 displayed... When I open the file with evince, I get the following warnings:
 Unimplemented annotation: POPPLER_ANNOT_FREE_TEXT, Unimplemented
 annotation: POPPLER_ANNOT_STAMP, Unimplemented annotation:
 POPPLER_ANNOT_INK but it is displayed correctly. I have no particular logs
 with pdfbox.
 
 I cannot send the file publicly; who can I send it to? Or is there
 something I am missing?
 
 BR
 
 Kevin
 