date:20140122

[jira] [Updated] (PDFBOX-1861) Line is incorrectly dashed

2014-01-22 Thread Tilman Hausherr (JIRA)


 [ 
https://issues.apache.org/jira/browse/PDFBOX-1861?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Tilman Hausherr updated PDFBOX-1861:


Attachment: asy-gouraud.pdf-1-good.png

I had a look at other operators and noticed that they apply a transformation, 
but the dash pattern isn't transformed in SetLineDashPattern.java. Inserting 
this code
{code}
float[] dashArray = null;
if (!lineDashPattern.isDashPatternEmpty())
{
    dashArray = lineDashPattern.getCOSDashPattern().toFloatArray();
    Matrix ctm = context.getGraphicsState().getCurrentTransformationMatrix();
    if (ctm != null && ctm.getXScale() > 0)
    {
        for (int i = 0; i < dashArray.length; ++i)
        {
            dashArray[i] *= ctm.getXScale();
        }
    }
}
{code}
and using that dashArray result instead of 
lineDashPattern.getCOSDashPattern().toFloatArray() further down produces the 
attached image (which includes my change from PDFBOX-615).
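
As a standalone illustration of the scaling step above, here is a minimal 
sketch that does not depend on PDFBox. The class and method names 
(DashScaleSketch, scaleDashArray) are hypothetical, and the plain float xScale 
stands in for the value that ctm.getXScale() would return. It assumes uniform 
scaling and ignores any rotation or shear in the CTM.

```java
import java.util.Arrays;

// Hypothetical helper, not part of PDFBox: scales a dash pattern by one factor.
class DashScaleSketch {

    // Returns a scaled copy of the dash array. A null array yields null, and a
    // non-positive scale leaves the lengths unchanged (same guard as the snippet above).
    static float[] scaleDashArray(float[] dashArray, float xScale) {
        if (dashArray == null) {
            return null;
        }
        float[] scaled = Arrays.copyOf(dashArray, dashArray.length);
        if (xScale > 0) {
            for (int i = 0; i < scaled.length; ++i) {
                scaled[i] *= xScale;
            }
        }
        return scaled;
    }

    public static void main(String[] args) {
        // A pattern of 3 units on, 2 units off, drawn under a CTM that scales by 2.
        float[] pattern = {3f, 2f};
        System.out.println(Arrays.toString(scaleDashArray(pattern, 2f)));
    }
}
```

Working on a copy keeps the unscaled pattern intact, which may matter if the 
same pattern object is reused after the CTM changes.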

> Line is incorrectly dashed
> --
>
> Key: PDFBOX-1861
> URL: https://issues.apache.org/jira/browse/PDFBOX-1861
> Project: PDFBox
>  Issue Type: Bug
>Affects Versions: 2.0.0
> Environment: W7
>Reporter: Tilman Hausherr
>Priority: Minor
> Attachments: asy-gouraud.pdf, asy-gouraud.pdf-1-good.png, 
> asy-gouraud.pdf-1-trunk.png
>
>
> The line in the attached page should be dashed differently than it is in the 
> rendering.



--
This message was sent by Atlassian JIRA
(v6.1.5#6160)

[jira] [Commented] (PDFBOX-1822) Signature byte range is Invalid

2014-01-22 Thread JIRA


[ 
https://issues.apache.org/jira/browse/PDFBOX-1822?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13879493#comment-13879493
 ] 

Andreas Lehmkühler commented on PDFBOX-1822:


I'm still thinking about the limitation to the non-sequential parser. Is it 
allowed to mix xref tables and xref streams within a PDF?

> Signature byte range is Invalid
> ---
>
> Key: PDFBOX-1822
> URL: https://issues.apache.org/jira/browse/PDFBOX-1822
> Project: PDFBox
>  Issue Type: Bug
>  Components: Parsing, PDModel, PDModel.AcroForm, Signing
>Affects Versions: 1.8.3, 1.8.4, 2.0.0
>Reporter: vakhtang koroghlishvili
>Assignee: Andreas Lehmkühler
> Fix For: 1.8.4, 2.0.0
>
> Attachments: araxis-merge - compare two document.jpg, 
> damaged-sig.jpg, unsigned-signed.pdf, unsigned.pdf, unsigned_signed_fix.pdf
>
>
> A person sent me an unsigned PDF document. He wanted to sign it. When I try 
> to sign it (using PDFBox), I run into a problem.
> After signing, Adobe Reader tells me "The signature byte range is invalid".  
> I will attach the original and the signed document.
> I think it is a PDFBox parser error; other signature libraries sign the 
> document correctly. I'm searching for the problem at the moment, in order to 
> fix it.




[jira] [Updated] (PDFBOX-1861) Line is incorrectly dashed

2014-01-22 Thread Tilman Hausherr (JIRA)


 [ 
https://issues.apache.org/jira/browse/PDFBOX-1861?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Tilman Hausherr updated PDFBOX-1861:


Attachment: asy-gouraud.pdf-1-trunk.png
asy-gouraud.pdf

PDF file taken from http://asymptote.sourceforge.net/gallery/ (clicking an 
image opens the PDF, clicking the text opens the source code)


[jira] [Created] (PDFBOX-1861) Line is incorrectly dashed

2014-01-22 Thread Tilman Hausherr (JIRA)

Tilman Hausherr created PDFBOX-1861:
---

 Summary: Line is incorrectly dashed
 Key: PDFBOX-1861
 URL: https://issues.apache.org/jira/browse/PDFBOX-1861
 Project: PDFBox
  Issue Type: Bug
Affects Versions: 2.0.0
 Environment: W7
Reporter: Tilman Hausherr
Priority: Minor


The line in the attached page should be dashed differently than it is in the 
rendering.




[jira] [Created] (PDFBOX-1860) HTML converter escapes formatting close tags

2014-01-22 Thread Cheng Leong (JIRA)

Cheng Leong created PDFBOX-1860:
---

 Summary: HTML converter escapes formatting close tags
 Key: PDFBOX-1860
 URL: https://issues.apache.org/jira/browse/PDFBOX-1860
 Project: PDFBox
  Issue Type: Bug
  Components: Text extraction
Affects Versions: 1.8.3
Reporter: Cheng Leong
Priority: Minor
 Attachments: pdftest.pdf

Bug introduced by PDFBOX-1213 in 1.8.3 for HTML style information.
Bold style tags are opened correctly, but the close tags are html-escaped.

{noformat}
~/work/pdfbox ((1.8.3))$ java -jar app/target/pdfbox-app-1.8.3.jar ExtractText 
-html -nonSeq -console pdftest.pdf 
http://www.w3.org/TR/html4/loose.dtd";>
1725.PDF



E:\M55\!\1725.fm 2003-01-01 18:15 P Tagg, IPM, 
University of Liverpool

A VERY SMALL PDF FILE

A VERY SMALL PDF FILE

A VERY SMALL PDF FILE

A VERY SMALL PDF FILE

A VERY SMALL PDF FILE

A VERY SMALL PDF FILE



{noformat}
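
The archive has stripped the tags from the output above. As a rough 
illustration of the symptom described (a close tag being escaped together with 
the text), here is a minimal sketch. The names BoldEscapeSketch, escape, bold 
and buggyBold are hypothetical and this is not the PDFBox converter code; 
buggyBold is just one assumed way such output could arise.

```java
// Hypothetical illustration, not PDFBox code.
class BoldEscapeSketch {

    // Escapes the characters that have a special meaning in HTML text.
    static String escape(String text) {
        StringBuilder sb = new StringBuilder();
        for (int i = 0; i < text.length(); i++) {
            char c = text.charAt(i);
            switch (c) {
                case '&': sb.append("&amp;"); break;
                case '<': sb.append("&lt;"); break;
                case '>': sb.append("&gt;"); break;
                case '"': sb.append("&quot;"); break;
                default: sb.append(c);
            }
        }
        return sb.toString();
    }

    // Intended behaviour: only the text content is escaped, the tags are not.
    static String bold(String text) {
        return "<b>" + escape(text) + "</b>";
    }

    // Assumed faulty variant: the close tag is appended before escaping,
    // so it ends up escaped as well.
    static String buggyBold(String text) {
        return "<b>" + escape(text + "</b>");
    }

    public static void main(String[] args) {
        System.out.println(bold("A VERY SMALL PDF FILE"));
        System.out.println(buggyBold("A VERY SMALL PDF FILE"));
    }
}
```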




[jira] [Updated] (PDFBOX-1860) HTML converter escapes formatting close tags

2014-01-22 Thread Cheng Leong (JIRA)


 [ 
https://issues.apache.org/jira/browse/PDFBOX-1860?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Cheng Leong updated PDFBOX-1860:


Attachment: pdftest.pdf


[jira] [Resolved] (PDFBOX-1822) Signature byte range is Invalid

2014-01-22 Thread JIRA


 [ 
https://issues.apache.org/jira/browse/PDFBOX-1822?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Andreas Lehmkühler resolved PDFBOX-1822.


   Resolution: Fixed
Fix Version/s: 2.0.0
   1.8.4
 Assignee: Andreas Lehmkühler

I merged the changes into the 1.8 branch in revision 1560498.

The fix is limited to the non-sequential parser.


[jira] [Comment Edited] (PDFBOX-1822) Signature byte range is Invalid

2014-01-22 Thread JIRA


[ 
https://issues.apache.org/jira/browse/PDFBOX-1822?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13879140#comment-13879140
 ] 

Andreas Lehmkühler edited comment on PDFBOX-1822 at 1/22/14 8:24 PM:
-

I merged the changes into the 1.8 branch in revision 1560498.

The fix is limited to the non-sequential parser.

Thanks for all the input and help!


was (Author: lehmi):
I merged the changes into the 1.8 branch in revision 1560498.

The fix is limited to the non-sequential parser.


Re: DeviceN/Separation JPEGs

2014-01-22 Thread Andreas Lehmkuehler


Hi,

On 21.01.2014 23:29, Andreas Lehmkuehler wrote:
> Hi,
>
> On 21.01.2014 23:08, John Hewson wrote:
>> Does anyone have any PDF files with JPEGs that use DeviceN or Separation color
>> spaces?
>
> I'm pretty sure that I have one, but I can't find it. I'll continue searching
> tomorrow

I wrote a little search tool and finally found some PDFs. Please find attached 
a list of PDFs using images with a Separation color space. Those PDFs should 
all be attached to the referenced JIRA issue.

>> I’m trying to test out some new code...
>>
>> Thanks
>>
>> -- John
>
> BR
> Andreas Lehmkühler

BR
Andreas Lehmkühler

PDFBOX1095-receipt.pdf: Found PDPixelmap using Separation: Im0 on page 1
PDFBOX1116-Flyer_A4_Zubehoer_10_Prozent_300dpi.pdf: Found PDJpeg using Separation: Im25 on page 2
PDFBOX1307-invertedImage.pdf: Found PDJpeg using Separation: Im1 on page 1
PDFBOX1307-invertedImage.pdf: Found PDJpeg using Separation: Im2 on page 1
PDFBOX1610-kitest3_2904_01.6710338.0.pdf: Found PDPixelmap using Separation: Im2 on page 1
PDFBOX1610-kitest3_2904_01.6710338.0.pdf: Found PDPixelmap using Separation: Im3 on page 1
PDFBOX1610-kitest3_2904_01.6710338.0.pdf: Found PDPixelmap using Separation: Im61 on page 1
PDFBOX1522-Traveler20120822.pdf: Found PDPixelmap using Separation: Im1 on page 1
PDFBOX1522-Traveler20120822.pdf: Found PDJpeg using Separation: Im17 on page 3
PDFBOX1522-Traveler20120822.pdf: Found PDJpeg using Separation: Im205 on page 8
PDFBOX1522-Traveler20120822.pdf: Found PDPixelmap using Separation: Im208 on page 8
PDFBOX1522-Traveler20120822.pdf: Found PDJpeg using Separation: Im209 on page 8
PDFBOX1522-Traveler20120822.pdf: Found PDPixelmap using Separation: Im210 on page 8
PDFBOX833-sample.pdf: Found PDPixelmap using Separation: Im6 on page 1
PDFBOX1437-AA.pdf: Found PDJpeg using Separation: Im1 on page 6
PDFBOX1437-AA.pdf: Found PDJpeg using Separation: Im2 on page 13
PDFBOX1437-AA.pdf: Found PDJpeg using Separation: Im3 on page 15
PDFBOX1437-AA.pdf: Found PDJpeg using Separation: Im4 on page 16
PDFBOX1437-AA.pdf: Found PDJpeg using Separation: Im5 on page 17
PDFBOX1437-AA.pdf: Found PDJpeg using Separation: Im6 on page 18
PDFBOX1437-AA.pdf: Found PDJpeg using Separation: Im7 on page 19
PDFBOX1437-AA.pdf: Found PDJpeg using Separation: Im8 on page 20
PDFBOX1437-AA.pdf: Found PDJpeg using Separation: Im9 on page 21
PDFBOX1437-AA.pdf: Found PDJpeg using Separation: Im10 on page 22
PDFBOX1437-AA.pdf: Found PDJpeg using Separation: Im11 on page 23
PDFBOX1437-AA.pdf: Found PDJpeg using Separation: Im12 on page 24
PDFBOX1437-AA.pdf: Found PDJpeg using Separation: Im13 on page 25
PDFBOX1691-FORIS-HV.pdf: Found PDJpeg using Separation: Im5 on page 2
PDFBOX1691-FORIS-HV.pdf: Found PDJpeg using Separation: Im4 on page 2
PDFBOX1691-FORIS-HV.pdf: Found PDJpeg using Separation: Im7 on page 2
PDFBOX1691-FORIS-HV.pdf: Found PDJpeg using Separation: Im6 on page 2
PDFBOX1691-FORIS-HV.pdf: Found PDPixelmap using Separation: Im9 on page 2
PDFBOX1691-FORIS-HV.pdf: Found PDPixelmap using Separation: Im8 on page 2
PDFBOX1691-FORIS-HV.pdf: Found PDJpeg using Separation: Im10 on page 2
PDFBOX1691-FORIS-HV.pdf: Found PDJpeg using Separation: Im27 on page 2
PDFBOX1691-FORIS-HV.pdf: Found PDJpeg using Separation: Im0 on page 2
PDFBOX1691-FORIS-HV.pdf: Found PDPixelmap using Separation: Im13 on page 2
PDFBOX1691-FORIS-HV.pdf: Found PDPixelmap using Separation: Im26 on page 2
PDFBOX1691-FORIS-HV.pdf: Found PDJpeg using Separation: Im1 on page 2
PDFBOX1691-FORIS-HV.pdf: Found PDPixelmap using Separation: Im14 on page 2
PDFBOX1691-FORIS-HV.pdf: Found PDPixelmap using Separation: Im29 on page 2
PDFBOX1691-FORIS-HV.pdf: Found PDJpeg using Separation: Im2 on page 2
PDFBOX1691-FORIS-HV.pdf: Found PDPixelmap using Separation: Im11 on page 2
PDFBOX1691-FORIS-HV.pdf: Found PDJpeg using Separation: Im3 on page 2
PDFBOX1691-FORIS-HV.pdf: Found PDJpeg using Separation: Im28 on page 2
PDFBOX1691-FORIS-HV.pdf: Found PDJpeg using Separation: Im12 on page 2
PDFBOX1691-FORIS-HV.pdf: Found PDJpeg using Separation: Im23 on page 2
PDFBOX1691-FORIS-HV.pdf: Found PDJpeg using Separation: Im17 on page 2
PDFBOX1691-FORIS-HV.pdf: Found PDJpeg using Separation: Im22 on page 2
PDFBOX1691-FORIS-HV.pdf: Found PDJpeg using Separation: Im18 on page 2
PDFBOX1691-FORIS-HV.pdf: Found PDPixelmap using Separation: Im25 on page 2
PDFBOX1691-FORIS-HV.pdf: Found PDPixelmap using Separation: Im15 on page 2
PDFBOX1691-FORIS-HV.pdf: Found PDJpeg using Separation: Im24 on page 2
PDFBOX1691-FORIS-HV.pdf: Found PDJpeg using Separation: Im16 on page 2
PDFBOX1691-FORIS-HV.pdf: Found PDJpeg using Separation: Im21 on page 2
PDFBOX1691-FORIS-HV.pdf: Found PDPixelmap using Separation: Im19 on page 2
PDFBOX1691-FORIS-HV.pdf: Found PDJpeg using Separation: Im20 on page 2
PDFBOX1691-FORIS-HV.pdf: Found PDJpeg using Separation: Im5 on page 3
PDFBOX1691-FORIS-HV.pdf: Found PDJpeg using Separation: Im4 on page 3
PDFBOX1691-FORIS-HV.pdf: Found PDJpeg using Separation: Im7 on page 3
PDFBOX1691-FORIS-HV.pdf: F
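
Since the original question asked specifically for JPEGs, a list like the one 
above can be narrowed down to the PDJpeg entries. The following is a minimal 
sketch, assuming one entry per line in the format shown above; the class name, 
method name and regular expression are illustrative and are not part of the 
search tool mentioned in the message.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Hypothetical filter for lines such as:
// "PDFBOX1437-AA.pdf: Found PDJpeg using Separation: Im1 on page 6"
class SeparationListFilter {

    private static final Pattern LINE = Pattern.compile(
            "^(.+\\.pdf): Found (\\w+) using (\\w+): (\\w+) on page (\\d+)$");

    // Returns "file image page N" for every line that reports a PDJpeg image.
    // Lines that do not match the expected format are skipped.
    static List<String> jpegEntries(List<String> lines) {
        List<String> result = new ArrayList<>();
        for (String line : lines) {
            Matcher m = LINE.matcher(line.trim());
            if (m.matches() && m.group(2).equals("PDJpeg")) {
                result.add(m.group(1) + " " + m.group(4) + " page " + m.group(5));
            }
        }
        return result;
    }

    public static void main(String[] args) {
        List<String> sample = Arrays.asList(
                "PDFBOX1095-receipt.pdf: Found PDPixelmap using Separation: Im0 on page 1",
                "PDFBOX1307-invertedImage.pdf: Found PDJpeg using Separation: Im1 on page 1");
        for (String entry : jpegEntries(sample)) {
            System.out.println(entry);
        }
    }
}
```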

[jira] [Commented] (PDFBOX-1822) Signature byte range is Invalid

2014-01-22 Thread vakhtang koroghlishvili (JIRA)


[ 
https://issues.apache.org/jira/browse/PDFBOX-1822?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13879123#comment-13879123
 ] 

vakhtang koroghlishvili commented on PDFBOX-1822:
-

Álison Fernandes,
I have tested your issue too. As far as I can see, everything works.


[jira] [Comment Edited] (PDFBOX-1822) Signature byte range is Invalid

2014-01-22 Thread vakhtang koroghlishvili (JIRA)


[ 
https://issues.apache.org/jira/browse/PDFBOX-1822?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13879109#comment-13879109
 ] 

vakhtang koroghlishvili edited comment on PDFBOX-1822 at 1/22/14 8:13 PM:
--

I have attached a fixed version of the document.


was (Author: v.koroghlishvili):
I attach fixed version document.


[jira] [Issue Comment Deleted] (PDFBOX-1857) Attachment damages singature

2014-01-22 Thread vakhtang koroghlishvili (JIRA)


 [ 
https://issues.apache.org/jira/browse/PDFBOX-1857?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

vakhtang koroghlishvili updated PDFBOX-1857:


Comment: was deleted

(was: that is result of file. everything is well.)

> Attachment damages singature
> 
>
> Key: PDFBOX-1857
> URL: https://issues.apache.org/jira/browse/PDFBOX-1857
> Project: PDFBox
>  Issue Type: Bug
>  Components: Parsing, PDFReader, Signing, Utilities, Writing
>Affects Versions: 1.8.3, 1.8.4, 2.0.0
>Reporter: jack
>Assignee: Thomas Chojecki
>Priority: Blocker
> Attachments: attach.txt, original.pdf, original[signed].pdf, 
> original[with-attachment].pdf, original[with-attachment][signed].pdf, 
> original[with-attachment]_signed.pdf
>
>
> I have a PDF document. 
> 1) Adobe Reader reads the document well. 
> 2) I sign the document (using pdfbox-examples) and everything works.
> 3) Then I attach a file to the original PDF (using the code from the cookbook 
> on the PDFBox web page).
> 4) Adobe Reader reads the document with the attachment well. Everything works.
> 5) Now I have a document with an attachment. 
> 6) I try to sign that document (the one with the attachment), and I have 
> two problems:
> First:
> when I open the document, Adobe Reader tells me that the signature byte range 
> is invalid.
> Second:
> when I try to close the document (that is, close Adobe Reader), Adobe Reader 
> asks:
> Do you want to save changes to "original[with-attachment][signed]" before 
> closing?




[jira] [Updated] (PDFBOX-1857) Attachment damages singature

2014-01-22 Thread vakhtang koroghlishvili (JIRA)


 [ 
https://issues.apache.org/jira/browse/PDFBOX-1857?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

vakhtang koroghlishvili updated PDFBOX-1857:


Attachment: original[with-attachment]_signed.pdf

This is the resulting file. Everything works.


[jira] [Commented] (PDFBOX-1857) Attachment damages singature

2014-01-22 Thread vakhtang koroghlishvili (JIRA)


[ 
https://issues.apache.org/jira/browse/PDFBOX-1857?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13879110#comment-13879110
 ] 

vakhtang koroghlishvili commented on PDFBOX-1857:
-

The patch for PDFBOX-1822 also solves this issue. I have tested it, and this 
issue is already fixed. 


[jira] [Updated] (PDFBOX-1822) Signature byte range is Invalid

2014-01-22 Thread vakhtang koroghlishvili (JIRA)


 [ 
https://issues.apache.org/jira/browse/PDFBOX-1822?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

vakhtang koroghlishvili updated PDFBOX-1822:


Attachment: unsigned_signed_fix.pdf

I attach fixed version document.


[jira] [Commented] (PDFBOX-1822) Signature byte range is Invalid

2014-01-22 Thread vakhtang koroghlishvili (JIRA)


[ 
https://issues.apache.org/jira/browse/PDFBOX-1822?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13879107#comment-13879107
 ] 

vakhtang koroghlishvili commented on PDFBOX-1822:
-

I have tested it and everything works. 
Andreas, thank you for this patch. :)


Re: Error printing...

2014-01-22 Thread John Hewson

Yep, you’ll need to create a JavaFX bundle in the usual manner 
http://docs.oracle.com/javafx/2/deployment/packaging.htm

-- John

On 22 Jan 2014, at 11:36, Alin Mazilu  wrote:

> Thank you for your quick responses, but the application is a JavaFX self
> contained application packaged with the JRE and is independent of the JRE
> installed on the OS. So I think I need to package the JAI libraries but I
> have no idea how :D Any thoughts?
> 
> Thank you,
> 
> Alin
> 
> 
> On Wed, Jan 22, 2014 at 1:48 PM, John Hewson  wrote:
> 
>> Yes, there is. Simply Google "JBIG2 plugin” and follow the first link, it
>> will be called "jbig2-imageio".
>> 
>> -- John
>> 
>> On 22 Jan 2014, at 09:16, Alin Mazilu  wrote:
>> 
>>> Hello all,
>>> 
>>> I am printing some PDFs and I am getting this:
>>> 
>>> Jan 22, 2014 12:07:47 PM org.apache.pdfbox.filter.JBIG2Filter decode
>>> SEVERE: Can't find an ImageIO plugin to decode the JBIG2 encoded
>> datastream.
>>> Jan 22, 2014 12:07:47 PM
>>> org.apache.pdfbox.pdmodel.graphics.xobject.PDPixelMap getRGBImage
>>> SEVERE: Something went wrong ... the pixelmap doesn't contain any data.
>>> Jan 22, 2014 12:07:47 PM
>> org.apache.pdfbox.util.operator.pagedrawer.Invoke
>>> process
>>> WARNING: getRGBImage returned NULL
>>> Jan 22, 2014 12:07:47 PM org.apache.pdfbox.filter.JBIG2Filter decode
>>> SEVERE: Can't find an ImageIO plugin to decode the JBIG2 encoded
>> datastream.
>>> Jan 22, 2014 12:07:47 PM
>>> org.apache.pdfbox.pdmodel.graphics.xobject.PDPixelMap getRGBImage
>>> SEVERE: Something went wrong ... the pixelmap doesn't contain any data.
>>> Jan 22, 2014 12:07:47 PM
>> org.apache.pdfbox.util.operator.pagedrawer.Invoke
>>> process
>>> WARNING: getRGBImage returned NULL
>>> Jan 22, 2014 12:07:47 PM org.apache.pdfbox.filter.JBIG2Filter decode
>>> SEVERE: Can't find an ImageIO plugin to decode the JBIG2 encoded
>> datastream.
>>> Jan 22, 2014 12:07:47 PM
>>> org.apache.pdfbox.pdmodel.graphics.xobject.PDPixelMap getRGBImage
>>> SEVERE: Something went wrong ... the pixelmap doesn't contain any data.
>>> Jan 22, 2014 12:07:47 PM
>> org.apache.pdfbox.util.operator.pagedrawer.Invoke
>>> process
>>> WARNING: getRGBImage returned NULL
>>> 
>>> Is there a quick way to fix this? Is there a JBIG2 plugin? I really need
>> to
>>> fix it today or I'm in trouble. :)
>>> 
>>> Thank you,
>>> 
>>> Alin
>> 
>>

Re: Error printing...

2014-01-22 Thread Alin Mazilu

Thank you for your quick responses, but the application is a JavaFX self
contained application packaged with the JRE and is independent of the JRE
installed on the OS. So I think I need to package the JAI libraries but I
have no idea how :D Any thoughts?

Thank you,

Alin


On Wed, Jan 22, 2014 at 1:48 PM, John Hewson  wrote:

> Yes, there is. Simply Google "JBIG2 plugin” and follow the first link, it
> will be called "jbig2-imageio".
>
> -- John
>
> On 22 Jan 2014, at 09:16, Alin Mazilu  wrote:
>
> > Hello all,
> >
> > I am printing some PDFs and I am getting this:
> >
> > Jan 22, 2014 12:07:47 PM org.apache.pdfbox.filter.JBIG2Filter decode
> > SEVERE: Can't find an ImageIO plugin to decode the JBIG2 encoded
> datastream.
> > Jan 22, 2014 12:07:47 PM
> > org.apache.pdfbox.pdmodel.graphics.xobject.PDPixelMap getRGBImage
> > SEVERE: Something went wrong ... the pixelmap doesn't contain any data.
> > Jan 22, 2014 12:07:47 PM
> org.apache.pdfbox.util.operator.pagedrawer.Invoke
> > process
> > WARNING: getRGBImage returned NULL
> > Jan 22, 2014 12:07:47 PM org.apache.pdfbox.filter.JBIG2Filter decode
> > SEVERE: Can't find an ImageIO plugin to decode the JBIG2 encoded
> datastream.
> > Jan 22, 2014 12:07:47 PM
> > org.apache.pdfbox.pdmodel.graphics.xobject.PDPixelMap getRGBImage
> > SEVERE: Something went wrong ... the pixelmap doesn't contain any data.
> > Jan 22, 2014 12:07:47 PM
> org.apache.pdfbox.util.operator.pagedrawer.Invoke
> > process
> > WARNING: getRGBImage returned NULL
> > Jan 22, 2014 12:07:47 PM org.apache.pdfbox.filter.JBIG2Filter decode
> > SEVERE: Can't find an ImageIO plugin to decode the JBIG2 encoded
> datastream.
> > Jan 22, 2014 12:07:47 PM
> > org.apache.pdfbox.pdmodel.graphics.xobject.PDPixelMap getRGBImage
> > SEVERE: Something went wrong ... the pixelmap doesn't contain any data.
> > Jan 22, 2014 12:07:47 PM
> org.apache.pdfbox.util.operator.pagedrawer.Invoke
> > process
> > WARNING: getRGBImage returned NULL
> >
> > Is there a quick way to fix this? Is there a JBIG2 plugin? I really need
> to
> > fix it today or I'm in trouble. :)
> >
> > Thank you,
> >
> > Alin
>
>

[jira] [Commented] (PDFBOX-1822) Signature byte range is Invalid

2014-01-22 Thread JIRA


[ 
https://issues.apache.org/jira/browse/PDFBOX-1822?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13879023#comment-13879023
 ] 

Andreas Lehmkühler commented on PDFBOX-1822:


I changed the implementation of COSDocument#isXRefStream in revision 1560474. 
It no longer relies on the trailer dictionary to determine whether the PDF 
uses an xref table or an xref stream.
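
For readers unfamiliar with the distinction: the final startxref entry of a 
PDF gives the byte offset of the most recent cross-reference section, which is 
either a classic table beginning with the keyword xref or an indirect object 
holding an xref stream. The sketch below checks which of the two is found at 
that offset. It is an illustration under those assumptions, not the code from 
revision 1560474; the names XrefKindSketch and detect are hypothetical, and it 
only looks at the last section, so hybrid-reference files and damaged offsets 
are not handled.

```java
import java.nio.charset.StandardCharsets;
import java.util.regex.Pattern;

// Hypothetical illustration, not PDFBox code.
class XrefKindSketch {

    enum XrefKind { TABLE, STREAM, UNKNOWN }

    // Start of an indirect object: "<number> <generation> obj"
    private static final Pattern OBJ_HEADER = Pattern.compile("\\d+\\s+\\d+\\s+obj");

    static XrefKind detect(byte[] pdf) {
        // ISO-8859-1 maps each byte to exactly one char, so indexes equal byte offsets.
        String text = new String(pdf, StandardCharsets.ISO_8859_1);
        int marker = text.lastIndexOf("startxref");
        if (marker < 0) {
            return XrefKind.UNKNOWN;
        }
        // Skip whitespace after the keyword, then read the decimal offset.
        int pos = marker + "startxref".length();
        while (pos < text.length() && Character.isWhitespace(text.charAt(pos))) {
            pos++;
        }
        int start = pos;
        while (pos < text.length() && Character.isDigit(text.charAt(pos))) {
            pos++;
        }
        if (start == pos) {
            return XrefKind.UNKNOWN;
        }
        long offset;
        try {
            offset = Long.parseLong(text.substring(start, pos));
        } catch (NumberFormatException e) {
            return XrefKind.UNKNOWN;
        }
        if (offset >= text.length()) {
            return XrefKind.UNKNOWN;
        }
        int off = (int) offset;
        if (text.startsWith("xref", off)) {
            return XrefKind.TABLE;
        }
        if (OBJ_HEADER.matcher(text).region(off, text.length()).lookingAt()) {
            return XrefKind.STREAM;
        }
        return XrefKind.UNKNOWN;
    }
}
```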

Can someone please check whether signing now works? I'm not familiar with all 
that signing stuff.


Re: Error printing...

2014-01-22 Thread John Hewson

Yes, there is. Simply Google "JBIG2 plugin" and follow the first link; it will 
be called "jbig2-imageio".
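
To check whether a JBIG2 reader is actually visible to ImageIO on the 
classpath the application runs with, a quick test like the following may help. 
It is a sketch using only the standard javax.imageio API; the format name 
"JBIG2" is an assumption about how the plugin registers itself, and the class 
name is made up.

```java
import javax.imageio.ImageIO;

// Hypothetical diagnostic, not part of PDFBox or the plugin.
class ImageIoPluginCheck {

    // True if at least one ImageIO reader claims to decode the given format name.
    static boolean hasReader(String formatName) {
        return ImageIO.getImageReadersByFormatName(formatName).hasNext();
    }

    public static void main(String[] args) {
        // Pick up plugin JARs that were added to the classpath.
        ImageIO.scanForPlugins();
        System.out.println("JBIG2 reader available: " + hasReader("JBIG2"));
    }
}
```

If this prints false inside the packaged application, the plugin JAR is 
probably not on the classpath the bundle uses.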

-- John

On 22 Jan 2014, at 09:16, Alin Mazilu  wrote:

> Hello all,
> 
> I am printing some PDFs and I am getting this:
> 
> Jan 22, 2014 12:07:47 PM org.apache.pdfbox.filter.JBIG2Filter decode
> SEVERE: Can't find an ImageIO plugin to decode the JBIG2 encoded datastream.
> Jan 22, 2014 12:07:47 PM
> org.apache.pdfbox.pdmodel.graphics.xobject.PDPixelMap getRGBImage
> SEVERE: Something went wrong ... the pixelmap doesn't contain any data.
> Jan 22, 2014 12:07:47 PM org.apache.pdfbox.util.operator.pagedrawer.Invoke
> process
> WARNING: getRGBImage returned NULL
> Jan 22, 2014 12:07:47 PM org.apache.pdfbox.filter.JBIG2Filter decode
> SEVERE: Can't find an ImageIO plugin to decode the JBIG2 encoded datastream.
> Jan 22, 2014 12:07:47 PM
> org.apache.pdfbox.pdmodel.graphics.xobject.PDPixelMap getRGBImage
> SEVERE: Something went wrong ... the pixelmap doesn't contain any data.
> Jan 22, 2014 12:07:47 PM org.apache.pdfbox.util.operator.pagedrawer.Invoke
> process
> WARNING: getRGBImage returned NULL
> Jan 22, 2014 12:07:47 PM org.apache.pdfbox.filter.JBIG2Filter decode
> SEVERE: Can't find an ImageIO plugin to decode the JBIG2 encoded datastream.
> Jan 22, 2014 12:07:47 PM
> org.apache.pdfbox.pdmodel.graphics.xobject.PDPixelMap getRGBImage
> SEVERE: Something went wrong ... the pixelmap doesn't contain any data.
> Jan 22, 2014 12:07:47 PM org.apache.pdfbox.util.operator.pagedrawer.Invoke
> process
> WARNING: getRGBImage returned NULL
> 
> Is there a quick way to fix this? Is there a JBIG2 plugin? I really need to
> fix it today or I'm in trouble. :)
> 
> Thank you,
> 
> Alin

RE: Error printing...

2014-01-22 Thread Simon Steiner

Hi,

Install the jar from
https://code.google.com/p/jbig2-imageio/

Thanks

-----Original Message-----
From: Alin Mazilu [mailto:impet...@gmail.com] 
Sent: 22 January 2014 17:16
To: dev@pdfbox.apache.org
Subject: Error printing...

Hello all,

I am printing some PDFs and I am getting this:

Jan 22, 2014 12:07:47 PM org.apache.pdfbox.filter.JBIG2Filter decode
SEVERE: Can't find an ImageIO plugin to decode the JBIG2 encoded datastream.
Jan 22, 2014 12:07:47 PM
org.apache.pdfbox.pdmodel.graphics.xobject.PDPixelMap getRGBImage
SEVERE: Something went wrong ... the pixelmap doesn't contain any data.
Jan 22, 2014 12:07:47 PM org.apache.pdfbox.util.operator.pagedrawer.Invoke
process
WARNING: getRGBImage returned NULL
Jan 22, 2014 12:07:47 PM org.apache.pdfbox.filter.JBIG2Filter decode
SEVERE: Can't find an ImageIO plugin to decode the JBIG2 encoded datastream.
Jan 22, 2014 12:07:47 PM
org.apache.pdfbox.pdmodel.graphics.xobject.PDPixelMap getRGBImage
SEVERE: Something went wrong ... the pixelmap doesn't contain any data.
Jan 22, 2014 12:07:47 PM org.apache.pdfbox.util.operator.pagedrawer.Invoke
process
WARNING: getRGBImage returned NULL
Jan 22, 2014 12:07:47 PM org.apache.pdfbox.filter.JBIG2Filter decode
SEVERE: Can't find an ImageIO plugin to decode the JBIG2 encoded datastream.
Jan 22, 2014 12:07:47 PM
org.apache.pdfbox.pdmodel.graphics.xobject.PDPixelMap getRGBImage
SEVERE: Something went wrong ... the pixelmap doesn't contain any data.
Jan 22, 2014 12:07:47 PM org.apache.pdfbox.util.operator.pagedrawer.Invoke
process
WARNING: getRGBImage returned NULL

Is there a quick way to fix this? Is there a JBIG2 plugin? I really need to
fix it today or I'm in trouble. :)

Thank you,

Alin

Error printing...

2014-01-22 Thread Alin Mazilu

Hello all,

I am printing some PDFs and I am getting this:

Jan 22, 2014 12:07:47 PM org.apache.pdfbox.filter.JBIG2Filter decode
SEVERE: Can't find an ImageIO plugin to decode the JBIG2 encoded datastream.
Jan 22, 2014 12:07:47 PM
org.apache.pdfbox.pdmodel.graphics.xobject.PDPixelMap getRGBImage
SEVERE: Something went wrong ... the pixelmap doesn't contain any data.
Jan 22, 2014 12:07:47 PM org.apache.pdfbox.util.operator.pagedrawer.Invoke
process
WARNING: getRGBImage returned NULL
Jan 22, 2014 12:07:47 PM org.apache.pdfbox.filter.JBIG2Filter decode
SEVERE: Can't find an ImageIO plugin to decode the JBIG2 encoded datastream.
Jan 22, 2014 12:07:47 PM
org.apache.pdfbox.pdmodel.graphics.xobject.PDPixelMap getRGBImage
SEVERE: Something went wrong ... the pixelmap doesn't contain any data.
Jan 22, 2014 12:07:47 PM org.apache.pdfbox.util.operator.pagedrawer.Invoke
process
WARNING: getRGBImage returned NULL
Jan 22, 2014 12:07:47 PM org.apache.pdfbox.filter.JBIG2Filter decode
SEVERE: Can't find an ImageIO plugin to decode the JBIG2 encoded datastream.
Jan 22, 2014 12:07:47 PM
org.apache.pdfbox.pdmodel.graphics.xobject.PDPixelMap getRGBImage
SEVERE: Something went wrong ... the pixelmap doesn't contain any data.
Jan 22, 2014 12:07:47 PM org.apache.pdfbox.util.operator.pagedrawer.Invoke
process
WARNING: getRGBImage returned NULL

Is there a quick way to fix this? Is there a JBIG2 plugin? I really need to
fix it today or I'm in trouble. :)

Thank you,

Alin

[jira] [Commented] (PDFBOX-1822) Signature byte range is Invalid

2014-01-22 Thread Thomas Chojecki (JIRA)


[ 
https://issues.apache.org/jira/browse/PDFBOX-1822?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13878594#comment-13878594
 ] 

Thomas Chojecki commented on PDFBOX-1822:
-

You are right. It also looks like a merged document; see the ID. So maybe the 
application merged a document that contains a classic xref table with a 
document that contains an xref stream.

> Signature byte range is Invalid
> ---
>
> Key: PDFBOX-1822
> URL: https://issues.apache.org/jira/browse/PDFBOX-1822
> Project: PDFBox
>  Issue Type: Bug
>  Components: Parsing, PDModel, PDModel.AcroForm, Signing
>Affects Versions: 1.8.3, 1.8.4, 2.0.0
>Reporter: vakhtang koroghlishvili
> Attachments: araxis-merge - compare two document.jpg, 
> damaged-sig.jpg, unsigned-signed.pdf, unsigned.pdf
>
>
> A person sent me an unsigned PDF document and wanted it signed. When I try 
> to sign it (using PDFBox), I run into a problem.
> After signing, Adobe Reader tells me "The signature byte range is invalid".
> I will attach the original and the signed document.
> I think it is a PDFBox parser error; other signature libraries sign the 
> document without problems. I'm investigating the problem at the moment in 
> order to fix it.




[jira] [Commented] (PDFBOX-1822) Signature byte range is Invalid

2014-01-22 Thread JIRA


[ 
https://issues.apache.org/jira/browse/PDFBOX-1822?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13878556#comment-13878556
 ] 

Andreas Lehmkühler commented on PDFBOX-1822:


I had a quick look at the attached pdf (unsigned.pdf) and IMO it's broken:
{code}
xref
0 20
00 65535 f
15 0 n
000497 0 n
004753 0 n
004810 0 n
004917 0 n
000200 0 n
004960 0 n
005185 0 n
005234 0 n
005275 0 n
005515 0 n
005537 0 n
005741 0 n
005776 0 n
006109 0 n
006161 0 n
006457 0 n
006724 0 n
007034 0 n
trailer
<<
/DecodeParms <<
/Columns 4
/Predictor 12
>>
/Filter /FlateDecode
/ID [ ]
/Info 6 0 R
/Length 58
/Root 1 0 R
/Size 20
/Type /XRef
/W [1 2 1]
/Index [14 13]
>>
startxref
24851
%%EOF
{code}

The pdf obviously uses an xref table. The trailer contains entries which are 
only expected if the pdf uses an xref stream, such as /Type, /W and /Index, but 
the stream itself is missing. Later on, the COSWriter tries to determine whether 
an xref table or an xref stream should be written by calling 
COSDocument#isXRefStream. That method returns "true" because the broken trailer 
dictionary contains a "/Type /XRef" entry. Maybe we should introduce a new 
internal boolean value within the trailer which is set when reading an xref 
stream, so that we can be sure that the pdf really uses an xref stream and does 
not just have a broken trailer dictionary which leads to false information.
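
To illustrate the idea, here is a minimal sketch. It is not PDFBox code: TrailerSketch and all of its methods are hypothetical stand-ins, and the trailer is reduced to a plain string map. It only shows the difference between trusting the dictionary contents and trusting a flag set by the parser.

```java
import java.util.HashMap;
import java.util.Map;

/** Hypothetical stand-in for a parsed trailer dictionary; not a PDFBox class. */
final class TrailerSketch {

    private final Map<String, String> entries = new HashMap<String, String>();

    // Set by the (hypothetical) parser only when it really decoded an xref stream.
    private boolean readFromXRefStream = false;

    void put(String key, String value) {
        entries.put(key, value);
    }

    void markReadFromXRefStream() {
        readFromXRefStream = true;
    }

    /** Heuristic based on the dictionary contents; fooled by a broken trailer. */
    boolean looksLikeXRefStream() {
        return "/XRef".equals(entries.get("/Type"));
    }

    /** Proposed check: rely on what the parser actually saw. */
    boolean isXRefStream() {
        return readFromXRefStream;
    }

    public static void main(String[] args) {
        // Trailer of a file with a classic xref table but a stray /Type /XRef entry.
        TrailerSketch broken = new TrailerSketch();
        broken.put("/Type", "/XRef");
        System.out.println(broken.looksLikeXRefStream()); // true
        System.out.println(broken.isXRefStream());        // false
    }
}
```

With such a flag the writer would choose an xref table for unsigned.pdf, even though the trailer dictionary claims otherwise.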

I hope this makes sense ... ;-)

> Signature byte range is Invalid
> ---
>
> Key: PDFBOX-1822
> URL: https://issues.apache.org/jira/browse/PDFBOX-1822
> Project: PDFBox
>  Issue Type: Bug
>  Components: Parsing, PDModel, PDModel.AcroForm, Signing
>Affects Versions: 1.8.3, 1.8.4, 2.0.0
>Reporter: vakhtang koroghlishvili
> Attachments: araxis-merge - compare two document.jpg, 
> damaged-sig.jpg, unsigned-signed.pdf, unsigned.pdf
>
>
> A person sent me an unsigned PDF document and wanted it signed. When I try 
> to sign it (using PDFBox), I run into a problem.
> After signing, Adobe Reader tells me "The signature byte range is invalid".
> I will attach the original and the signed document.
> I think it is a PDFBox parser error; other signature libraries sign the 
> document without problems. I'm investigating the problem at the moment in 
> order to fix it.




[jira] [Updated] (PDFBOX-1858) Extracted text does not have spaces

2014-01-22 Thread Vitalie Bureanu (JIRA)


 [ 
https://issues.apache.org/jira/browse/PDFBOX-1858?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Vitalie Bureanu updated PDFBOX-1858:


Description: 
Extracted text does not have spaces between some words.

To test, please use the string on line 74a... in the attached test.pdf.

It will be extracted as: "74a Amount of line73youwant refunded toyou . If 
Form isattached , checkhere"

The result does not look right; the words are "glued" together.

I tried using the PDFTextStripper class, but the result remains the same.

For us it is a big problem. Can it be resolved, please?

With respect,
Vitalie

  was:
Extracted text does not have spaces between some words.

Use to test please a string on line 74a... inside of attached test.pdf.

It will be extracted as: "74a Amount of line73youwant refunded toyou . If 
Form isattached , checkhere"

The result is not seems to be good, the words are "glued".

I tried to use a class PDF Text Stripper but the result still remain the same.

Can it be resolved, please?

With respect,
Vitalie


> Extracted text does not have spaces
> ---
>
> Key: PDFBOX-1858
> URL: https://issues.apache.org/jira/browse/PDFBOX-1858
> Project: PDFBox
>  Issue Type: Bug
>  Components: Parsing, Text extraction
>Affects Versions: 1.8.3
> Environment: Linux 64bit, Java
>Reporter: Vitalie Bureanu
> Attachments: Screenshot.jpg, test.pdf
>
>   Original Estimate: 3h
>  Remaining Estimate: 3h
>
> Extracted text does not have spaces between some words.
> To test, please use the string on line 74a... in the attached test.pdf.
> It will be extracted as: "74a Amount of line73youwant refunded toyou . If 
> Form isattached , checkhere"
> The result does not look right; the words are "glued" together.
> I tried using the PDFTextStripper class, but the result remains the same.
> For us it is a big problem. Can it be resolved, please?
> With respect,
> Vitalie




[jira] [Updated] (PDFBOX-1858) Extracted text does not have spaces

2014-01-22 Thread Vitalie Bureanu (JIRA)


 [ 
https://issues.apache.org/jira/browse/PDFBOX-1858?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Vitalie Bureanu updated PDFBOX-1858:


Description: 
Extracted text does not have spaces between some words.

Use to test please a string on line 74a... inside of attached test.pdf.

It will be extracted as: "74a Amount of line73youwant refunded toyou . If 
Form isattached , checkhere"

The result is not seems to be good, the words are "glued".

I tried to use a class PDF Text Stripper but the result still remain the same.

Can it be resolved, please?

With respect,
Vitalie

  was:
Extracted text does not have spaces between some words.

Use to test please a string on line 74a... inside of attached test.pdf.

It will be extracted as: "74a Amount of line73youwant refunded toyou . If 
Form isattached , checkhere"

The result is not seems to be good, the words are "glued".

I tried to use a class PDF Text Stripper but the result still remain the same.

Can it be solved, please?

With respect,
Vitalie


> Extracted text does not have spaces
> ---
>
> Key: PDFBOX-1858
> URL: https://issues.apache.org/jira/browse/PDFBOX-1858
> Project: PDFBox
>  Issue Type: Bug
>  Components: Parsing, Text extraction
>Affects Versions: 1.8.3
> Environment: Linux 64bit, Java
>Reporter: Vitalie Bureanu
> Attachments: Screenshot.jpg, test.pdf
>
>   Original Estimate: 3h
>  Remaining Estimate: 3h
>
> Extracted text does not have spaces between some words.
> Use to test please a string on line 74a... inside of attached test.pdf.
> It will be extracted as: "74a Amount of line73youwant refunded toyou . If 
> Form isattached , checkhere"
> The result is not seems to be good, the words are "glued".
> I tried to use a class PDF Text Stripper but the result still remain the same.
> Can it be resolved, please?
> With respect,
> Vitalie




[jira] [Updated] (PDFBOX-1859) ClassCastException for unknown destination type

2014-01-22 Thread Hendrik Lescak (JIRA)


 [ 
https://issues.apache.org/jira/browse/PDFBOX-1859?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Hendrik Lescak updated PDFBOX-1859:
---

Attachment: Speisepläne.pdf

> ClassCastException for unknown destination type
> ---
>
> Key: PDFBOX-1859
> URL: https://issues.apache.org/jira/browse/PDFBOX-1859
> Project: PDFBox
>  Issue Type: Bug
>  Components: PDModel
>Affects Versions: 1.8.3, 2.0.0
>Reporter: Hendrik Lescak
> Attachments: Speisepläne.pdf
>
>
> Trying to read the outlines failed for the attached document.
> {code:java}
> import java.io.IOException;
> import org.apache.pdfbox.pdmodel.PDDocument;
> import 
> org.apache.pdfbox.pdmodel.interactive.documentnavigation.destination.PDDestination;
> import 
> org.apache.pdfbox.pdmodel.interactive.documentnavigation.outline.PDOutlineItem;
> import 
> org.apache.pdfbox.pdmodel.interactive.documentnavigation.outline.PDOutlineNode;
> /**
>  * @author <a href="mailto:andre.kisch...@interface-projects.de">André Kischkel</a>
>  * @since 22.01.2014
>  * @version $Revision$
>  */
> public class TestPDDestination {
>   public static void main(String[] args) throws IOException {
>   PDDocument doc = PDDocument.load("Speisepläne.pdf");
>   traverse(doc.getDocumentCatalog().getDocumentOutline());
>   doc.close();
>   }
>   
>   static void traverse(PDOutlineNode node) throws IOException {
>   if (node instanceof PDOutlineItem) {
>   PDDestination dst = ((PDOutlineItem) 
> node).getDestination();
>   /**
>* throws java.lang.ClassCastException: 
> org.apache.pdfbox.cos.COSFloat cannot be cast to 
> org.apache.pdfbox.cos.COSName,
>* but should be something like a PDPageXYZDestination!
>*/
>   System.out.println(dst);
>   }
>   for (PDOutlineItem child = node.getFirstChild(); child != null; 
> child = child.getNextSibling()) {
>   traverse(child);
>   }
>   }
> }
> {code}




[jira] [Updated] (PDFBOX-1858) Extracted text does not have spaces

2014-01-22 Thread Vitalie Bureanu (JIRA)


 [ 
https://issues.apache.org/jira/browse/PDFBOX-1858?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Vitalie Bureanu updated PDFBOX-1858:


Description: 
Extracted text does not have spaces between some words.

Use to test please a string on line 74a... inside of attached test.pdf.

It will be extracted as: "74a Amount of line73youwant refunded toyou . If 
Form isattached , checkhere"

The result is not seems to be good, the words are "glued".

I tried to use a class PDF Text Stripper but the result still remain the same.

Can it be solved, please?

With respect,
Vitalie

  was:
Extracted text does not have spaces between some words.

Use to test please a string on line 74a... inside of attached test.pdf.

It will be extracted as: "74a Amount of line73youwant refunded toyou . If 
Form isattached , checkhere"

The result is not seems to be good, the words are "glued".

I tried to use a class PDF Text Stripper but the resultstill remain the same.

Can it be solved, please?

With respect,
Vitalie


> Extracted text does not have spaces
> ---
>
> Key: PDFBOX-1858
> URL: https://issues.apache.org/jira/browse/PDFBOX-1858
> Project: PDFBox
>  Issue Type: Bug
>  Components: Parsing, Text extraction
>Affects Versions: 1.8.3
> Environment: Linux 64bit, Java
>Reporter: Vitalie Bureanu
> Attachments: Screenshot.jpg, test.pdf
>
>   Original Estimate: 3h
>  Remaining Estimate: 3h
>
> Extracted text does not have spaces between some words.
> Use to test please a string on line 74a... inside of attached test.pdf.
> It will be extracted as: "74a Amount of line73youwant refunded toyou . If 
> Form isattached , checkhere"
> The result is not seems to be good, the words are "glued".
> I tried to use a class PDF Text Stripper but the result still remain the same.
> Can it be solved, please?
> With respect,
> Vitalie




[jira] [Updated] (PDFBOX-1859) ClassCastException for unknown destination type

2014-01-22 Thread Hendrik Lescak (JIRA)


 [ 
https://issues.apache.org/jira/browse/PDFBOX-1859?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Hendrik Lescak updated PDFBOX-1859:
---

Description: 
Trying to read the outlines failed for the attached document.

{code}
import java.io.IOException;

import org.apache.pdfbox.pdmodel.PDDocument;
import 
org.apache.pdfbox.pdmodel.interactive.documentnavigation.destination.PDDestination;
import 
org.apache.pdfbox.pdmodel.interactive.documentnavigation.outline.PDOutlineItem;
import 
org.apache.pdfbox.pdmodel.interactive.documentnavigation.outline.PDOutlineNode;

/**
 * @author <a href="mailto:andre.kisch...@interface-projects.de">André Kischkel</a>
 * @since 22.01.2014
 * @version $Revision$
 */
public class TestPDDestination {

public static void main(String[] args) throws IOException {
PDDocument doc = PDDocument.load("Speisepläne.pdf");
traverse(doc.getDocumentCatalog().getDocumentOutline());
doc.close();
}

static void traverse(PDOutlineNode node) throws IOException {
if (node instanceof PDOutlineItem) {
PDDestination dst = ((PDOutlineItem) 
node).getDestination();
/**
 * throws java.lang.ClassCastException: 
org.apache.pdfbox.cos.COSFloat cannot be cast to org.apache.pdfbox.cos.COSName,
 * but should be something like a PDPageXYZDestination!
 */
System.out.println(dst);
}
for (PDOutlineItem child = node.getFirstChild(); child != null; 
child = child.getNextSibling()) {
traverse(child);
}
}
}
{code}

  was:
Trying to read the outlines failed for the attached document.

{code:java}
import java.io.IOException;

import org.apache.pdfbox.pdmodel.PDDocument;
import 
org.apache.pdfbox.pdmodel.interactive.documentnavigation.destination.PDDestination;
import 
org.apache.pdfbox.pdmodel.interactive.documentnavigation.outline.PDOutlineItem;
import 
org.apache.pdfbox.pdmodel.interactive.documentnavigation.outline.PDOutlineNode;

/**
 * @author <a href="mailto:andre.kisch...@interface-projects.de">André Kischkel</a>
 * @since 22.01.2014
 * @version $Revision$
 */
public class TestPDDestination {

public static void main(String[] args) throws IOException {
PDDocument doc = PDDocument.load("Speisepläne.pdf");
traverse(doc.getDocumentCatalog().getDocumentOutline());
doc.close();
}

static void traverse(PDOutlineNode node) throws IOException {
if (node instanceof PDOutlineItem) {
PDDestination dst = ((PDOutlineItem) 
node).getDestination();
/**
 * throws java.lang.ClassCastException: 
org.apache.pdfbox.cos.COSFloat cannot be cast to org.apache.pdfbox.cos.COSName,
 * but should be something like a PDPageXYZDestination!
 */
System.out.println(dst);
}
for (PDOutlineItem child = node.getFirstChild(); child != null; 
child = child.getNextSibling()) {
traverse(child);
}
}
}
{code}


> ClassCastException for unknown destination type
> ---
>
> Key: PDFBOX-1859
> URL: https://issues.apache.org/jira/browse/PDFBOX-1859
> Project: PDFBox
>  Issue Type: Bug
>  Components: PDModel
>Affects Versions: 1.8.3, 2.0.0
>Reporter: Hendrik Lescak
> Attachments: Speisepläne.pdf
>
>
> Trying to read the outlines failed for the attached document.
> {code}
> import java.io.IOException;
> import org.apache.pdfbox.pdmodel.PDDocument;
> import 
> org.apache.pdfbox.pdmodel.interactive.documentnavigation.destination.PDDestination;
> import 
> org.apache.pdfbox.pdmodel.interactive.documentnavigation.outline.PDOutlineItem;
> import 
> org.apache.pdfbox.pdmodel.interactive.documentnavigation.outline.PDOutlineNode;
> /**
>  * @author <a href="mailto:andre.kisch...@interface-projects.de">André Kischkel</a>
>  * @since 22.01.2014
>  * @version $Revision$
>  */
> public class TestPDDestination {
>   public static void main(String[] args) throws IOException {
>   PDDocument doc = PDDocument.load("Speisepläne.pdf");
>   traverse(doc.getDocumentCatalog().getDocumentOutline());
>   doc.close();
>   }
>   
>   static void traverse(PDOutlineNode node) throws IOException {
>   if (node instanceof PDOutlineItem) {
>   PDDestination dst = ((PDOutlineItem) 
> node).getDestination();
>   /**
>* throws java.lang.ClassCastException: 
> org.apache.pdfbox.cos.C

[jira] [Updated] (PDFBOX-1859) ClassCastException for unknown destination type

2014-01-22 Thread Hendrik Lescak (JIRA)


 [ 
https://issues.apache.org/jira/browse/PDFBOX-1859?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Hendrik Lescak updated PDFBOX-1859:
---

Description: 
Trying to read the outlines failed for the attached document.

{code}
import java.io.IOException;

import org.apache.pdfbox.pdmodel.PDDocument;
import org.apache.pdfbox.pdmodel.interactive.documentnavigation.destination.PDDestination;
import org.apache.pdfbox.pdmodel.interactive.documentnavigation.outline.PDOutlineItem;
import org.apache.pdfbox.pdmodel.interactive.documentnavigation.outline.PDOutlineNode;

public class TestPDDestination {

    public static void main(String[] args) throws IOException {
        PDDocument doc = PDDocument.load("Speisepläne.pdf");
        traverse(doc.getDocumentCatalog().getDocumentOutline());
        doc.close();
    }

    static void traverse(PDOutlineNode node) throws IOException {
        if (node instanceof PDOutlineItem) {
            PDDestination dst = ((PDOutlineItem) node).getDestination();
            /**
             * throws java.lang.ClassCastException:
             * org.apache.pdfbox.cos.COSFloat cannot be cast to org.apache.pdfbox.cos.COSName,
             * but should be something like a PDPageXYZDestination!
             */
            System.out.println(dst);
        }
        for (PDOutlineItem child = node.getFirstChild(); child != null; child = child.getNextSibling()) {
            traverse(child);
        }
    }
}
{code}

  was:
Trying to read the outlines failed for the attached document.

{code}
import java.io.IOException;

import org.apache.pdfbox.pdmodel.PDDocument;
import 
org.apache.pdfbox.pdmodel.interactive.documentnavigation.destination.PDDestination;
import 
org.apache.pdfbox.pdmodel.interactive.documentnavigation.outline.PDOutlineItem;
import 
org.apache.pdfbox.pdmodel.interactive.documentnavigation.outline.PDOutlineNode;

/**
 * @author <a href="mailto:andre.kisch...@interface-projects.de">André Kischkel</a>
 * @since 22.01.2014
 * @version $Revision$
 */
public class TestPDDestination {

public static void main(String[] args) throws IOException {
PDDocument doc = PDDocument.load("Speisepläne.pdf");
traverse(doc.getDocumentCatalog().getDocumentOutline());
doc.close();
}

static void traverse(PDOutlineNode node) throws IOException {
if (node instanceof PDOutlineItem) {
PDDestination dst = ((PDOutlineItem) 
node).getDestination();
/**
 * throws java.lang.ClassCastException: 
org.apache.pdfbox.cos.COSFloat cannot be cast to org.apache.pdfbox.cos.COSName,
 * but should be something like a PDPageXYZDestination!
 */
System.out.println(dst);
}
for (PDOutlineItem child = node.getFirstChild(); child != null; 
child = child.getNextSibling()) {
traverse(child);
}
}
}
{code}


> ClassCastException for unknown destination type
> ---
>
> Key: PDFBOX-1859
> URL: https://issues.apache.org/jira/browse/PDFBOX-1859
> Project: PDFBox
>  Issue Type: Bug
>  Components: PDModel
>Affects Versions: 1.8.3, 2.0.0
>Reporter: Hendrik Lescak
> Attachments: Speisepläne.pdf
>
>
> Trying to read the outlines failed for the attached document.
> {code}
> import java.io.IOException;
> import org.apache.pdfbox.pdmodel.PDDocument;
> import 
> org.apache.pdfbox.pdmodel.interactive.documentnavigation.destination.PDDestination;
> import 
> org.apache.pdfbox.pdmodel.interactive.documentnavigation.outline.PDOutlineItem;
> import 
> org.apache.pdfbox.pdmodel.interactive.documentnavigation.outline.PDOutlineNode;
> public class TestPDDestination {
>   public static void main(String[] args) throws IOException {
>   PDDocument doc = PDDocument.load("Speisepläne.pdf");
>   traverse(doc.getDocumentCatalog().getDocumentOutline());
>   doc.close();
>   }
>   
>   static void traverse(PDOutlineNode node) throws IOException {
>   if (node instanceof PDOutlineItem) {
>   PDDestination dst = ((PDOutlineItem) 
> node).getDestination();
>   /**
>* throws java.lang.ClassCastException: 
> org.apache.pdfbox.cos.COSFloat cannot be cast to 
> org.apache.pdfbox.cos.COSName,
>* but should be something like a PDPageXYZDestination!
>*/
>   System.out.println(dst);
>   }
>   for (PDOutlineIt

[jira] [Updated] (PDFBOX-1858) Extracted text does not have spaces

2014-01-22 Thread Vitalie Bureanu (JIRA)


 [ 
https://issues.apache.org/jira/browse/PDFBOX-1858?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Vitalie Bureanu updated PDFBOX-1858:


Attachment: Screenshot.jpg
test.pdf

> Extracted text does not have spaces
> ---
>
> Key: PDFBOX-1858
> URL: https://issues.apache.org/jira/browse/PDFBOX-1858
> Project: PDFBox
>  Issue Type: Bug
>  Components: Parsing, Text extraction
>Affects Versions: 1.8.3
> Environment: Linux 64bit, Java
>Reporter: Vitalie Bureanu
> Attachments: Screenshot.jpg, test.pdf
>
>   Original Estimate: 3h
>  Remaining Estimate: 3h
>
> Extracted text does not have spaces between some words.
> Use to test please a string on line 74a... inside of attached test.pdf.
> It will be extracted as: "74a Amount of line73youwant refunded toyou . If 
> Form isattached , checkhere"
> The result is not seems to be good, the words are "glued".
> I tried to use a class PDF Text Stripper but the resultstill remain the same.
> Can it be solved, please?
> With respect,
> Vitalie




[jira] [Updated] (PDFBOX-1858) Extracted text does not have spaces

2014-01-22 Thread Vitalie Bureanu (JIRA)


 [ 
https://issues.apache.org/jira/browse/PDFBOX-1858?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Vitalie Bureanu updated PDFBOX-1858:


Attachment: (was: Untitled-1.jpg)

> Extracted text does not have spaces
> ---
>
> Key: PDFBOX-1858
> URL: https://issues.apache.org/jira/browse/PDFBOX-1858
> Project: PDFBox
>  Issue Type: Bug
>  Components: Parsing, Text extraction
>Affects Versions: 1.8.3
> Environment: Linux 64bit, Java
>Reporter: Vitalie Bureanu
> Attachments: Screenshot.jpg, test.pdf
>
>   Original Estimate: 3h
>  Remaining Estimate: 3h
>
> Extracted text does not have spaces between some words.
> Use to test please a string on line 74a... inside of attached test.pdf.
> It will be extracted as: "74a Amount of line73youwant refunded toyou . If 
> Form isattached , checkhere"
> The result is not seems to be good, the words are "glued".
> I tried to use a class PDF Text Stripper but the resultstill remain the same.
> Can it be solved, please?
> With respect,
> Vitalie




[jira] [Updated] (PDFBOX-1858) Extracted text does not have spaces

2014-01-22 Thread Vitalie Bureanu (JIRA)


 [ 
https://issues.apache.org/jira/browse/PDFBOX-1858?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Vitalie Bureanu updated PDFBOX-1858:


Attachment: Untitled-1.jpg

> Extracted text does not have spaces
> ---
>
> Key: PDFBOX-1858
> URL: https://issues.apache.org/jira/browse/PDFBOX-1858
> Project: PDFBox
>  Issue Type: Bug
>  Components: Parsing, Text extraction
>Affects Versions: 1.8.3
> Environment: Linux 64bit, Java
>Reporter: Vitalie Bureanu
> Attachments: Screenshot.jpg, test.pdf
>
>   Original Estimate: 3h
>  Remaining Estimate: 3h
>
> Extracted text does not have spaces between some words.
> Use to test please a string on line 74a... inside of attached test.pdf.
> It will be extracted as: "74a Amount of line73youwant refunded toyou . If 
> Form isattached , checkhere"
> The result is not seems to be good, the words are "glued".
> I tried to use a class PDF Text Stripper but the resultstill remain the same.
> Can it be solved, please?
> With respect,
> Vitalie




[jira] [Created] (PDFBOX-1859) ClassCastException for unknown destination type

2014-01-22 Thread Hendrik Lescak (JIRA)

Hendrik Lescak created PDFBOX-1859:
--

 Summary: ClassCastException for unknown destination type
 Key: PDFBOX-1859
 URL: https://issues.apache.org/jira/browse/PDFBOX-1859
 Project: PDFBox
  Issue Type: Bug
  Components: PDModel
Affects Versions: 1.8.3, 2.0.0
Reporter: Hendrik Lescak


Trying to read the outlines failed for the attached document.

{code:java}
import java.io.IOException;

import org.apache.pdfbox.pdmodel.PDDocument;
import org.apache.pdfbox.pdmodel.interactive.documentnavigation.destination.PDDestination;
import org.apache.pdfbox.pdmodel.interactive.documentnavigation.outline.PDOutlineItem;
import org.apache.pdfbox.pdmodel.interactive.documentnavigation.outline.PDOutlineNode;

/**
 * @author <a href="mailto:andre.kisch...@interface-projects.de">André Kischkel</a>
 * @since 22.01.2014
 * @version $Revision$
 */
public class TestPDDestination {

    public static void main(String[] args) throws IOException {
        PDDocument doc = PDDocument.load("Speisepläne.pdf");
        traverse(doc.getDocumentCatalog().getDocumentOutline());
        doc.close();
    }

    static void traverse(PDOutlineNode node) throws IOException {
        if (node instanceof PDOutlineItem) {
            PDDestination dst = ((PDOutlineItem) node).getDestination();
            /**
             * throws java.lang.ClassCastException:
             * org.apache.pdfbox.cos.COSFloat cannot be cast to org.apache.pdfbox.cos.COSName,
             * but should be something like a PDPageXYZDestination!
             */
            System.out.println(dst);
        }
        for (PDOutlineItem child = node.getFirstChild(); child != null; child = child.getNextSibling()) {
            traverse(child);
        }
    }
}
{code}
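
The exception message suggests that a number sits where a name was expected. In a well-formed explicit destination such as [page /XYZ left top zoom], the second array element is the name of the destination type. The following sketch shows a defensive type check before the cast. It is not PDFBox code: DestinationTypeSketch, Name and destinationType are hypothetical, the array is modelled as a plain java.util.List, and the layout of the malformed array is an assumption made only for the example.

```java
import java.util.Arrays;
import java.util.List;

final class DestinationTypeSketch {

    /** Hypothetical stand-in for a name object such as /XYZ. */
    static final class Name {
        final String value;

        Name(String value) {
            this.value = value;
        }
    }

    /**
     * Returns the destination type name stored at index 1, or null if the
     * array is too short or the element at index 1 is not a name.
     */
    static String destinationType(List<?> array) {
        if (array.size() < 2) {
            return null;
        }
        Object second = array.get(1);
        // Check the type first instead of casting blindly.
        return (second instanceof Name) ? ((Name) second).value : null;
    }

    public static void main(String[] args) {
        List<Object> wellFormed = Arrays.<Object>asList("page", new Name("XYZ"), 0f, 792f, 0f);
        // Assumed shape of a malformed destination: a number where the type name belongs.
        List<Object> malformed = Arrays.<Object>asList("page", 612.0f, 792.0f);
        System.out.println(destinationType(wellFormed)); // prints "XYZ"
        System.out.println(destinationType(malformed));  // prints "null"
    }
}
```

A caller could then report or skip an unknown destination instead of failing with a ClassCastException.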




[jira] [Created] (PDFBOX-1858) Extracted text does not have spaces

2014-01-22 Thread Vitalie Bureanu (JIRA)

Vitalie Bureanu created PDFBOX-1858:
---

 Summary: Extracted text does not have spaces
 Key: PDFBOX-1858
 URL: https://issues.apache.org/jira/browse/PDFBOX-1858
 Project: PDFBox
  Issue Type: Bug
  Components: Parsing, Text extraction
Affects Versions: 1.8.3
 Environment: Linux 64bit, Java
Reporter: Vitalie Bureanu


Extracted text does not have spaces between some words.

Use to test please a string on line 74a... inside of attached test.pdf.

It will be extracted as: "74a Amount of line73youwant refunded toyou . If 
Form isattached , checkhere"

The result is not seems to be good, the words are "glued".

I tried to use a class PDF Text Stripper but the resultstill remain the same.

Can it be solved, please?

With respect,
Vitalie




[jira] [Commented] (PDFBOX-1857) Attachment damages singature

2014-01-22 Thread Thomas Chojecki (JIRA)


[ 
https://issues.apache.org/jira/browse/PDFBOX-1857?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13878492#comment-13878492
 ] 

Thomas Chojecki commented on PDFBOX-1857:
-

Sorry, I misinterpreted points 2 and 3 in your comment. I thought you were 
trying to attach a file to a signed document. So this does not duplicate 
PDFBOX-1837; instead, it duplicates PDFBOX-1822.

If you have more information that could be helpful, feel free to add it to 
this issue.

I'm currently working on a similar issue. Maybe that will also solve this 
problem.

> Attachment damages singature
> 
>
> Key: PDFBOX-1857
> URL: https://issues.apache.org/jira/browse/PDFBOX-1857
> Project: PDFBox
>  Issue Type: Bug
>  Components: Parsing, PDFReader, Signing, Utilities, Writing
>Affects Versions: 1.8.3, 1.8.4, 2.0.0
>Reporter: jack
>Assignee: Thomas Chojecki
>Priority: Blocker
> Attachments: attach.txt, original.pdf, original[signed].pdf, 
> original[with-attachment].pdf, original[with-attachment][signed].pdf
>
>
> I have a PDF document.
> 1) Adobe Reader reads the document well.
> 2) I sign the document (using pdfbox-examples) and everything is fine.
> 3) Then I attach a file to the original PDF (the code is from the cookbook on 
> the PDFBox web page).
> 4) Adobe Reader reads the document with the attachment well; everything is fine.
> 5) Now I have a document with an attachment.
> 6) I try to sign that document (I mean the document with the attachment), and 
> I have two problems:
> First:
> When I open the document, Adobe Reader tells me that the signature byte range 
> is invalid.
> Second:
> When I try to close the document (I mean, close Adobe Reader), Adobe Reader 
> asks me:
> Do you want to save changes to "original[with-attachment][signed]" before 
> closing?




[jira] [Commented] (PDFBOX-1857) Attachment damages singature

2014-01-22 Thread jack (JIRA)


[ 
https://issues.apache.org/jira/browse/PDFBOX-1857?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13878483#comment-13878483
 ] 

jack commented on PDFBOX-1857:
--

The main problem is the broken signature, but this problem is not solved yet. I 
have attached the documents too.

1) Attach a file to the original document.
2) Try to sign the document.

Result:
Signature byte range is invalid.





[jira] [Commented] (PDFBOX-1857) Attachment damages singature

2014-01-22 Thread jack (JIRA)


[ 
https://issues.apache.org/jira/browse/PDFBOX-1857?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13878477#comment-13878477
 ] 

jack commented on PDFBOX-1857:
--

No, I attach the file to the document before signing. I attach the file to the 
original document and then try to sign it.





[jira] [Commented] (PDFBOX-1808) PDFTextStripper.getText - hight memory usage

2014-01-22 Thread Timo Boehme (JIRA)


[ 
https://issues.apache.org/jira/browse/PDFBOX-1808?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13878466#comment-13878466
 ] 

Timo Boehme commented on PDFBOX-1808:
-

One addition to my last comment: whether the JVM releases memory back to the 
operating system when a large part of the heap is free depends on the JVM 
implementation. Server VMs typically keep the allocated memory, regardless of 
whether the Java application still needs it.
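
A minimal sketch of how this behaviour could be observed, using only the standard 
`Runtime` API. The class name `HeapRetentionDemo`, the helpers `snapshot` and 
`format`, and the 32 MB allocation size are illustrative assumptions, not part of 
PDFBox or of the reporter's code. It prints the heap the JVM has reserved 
(`totalMemory()`) and the part in use (`totalMemory() - freeMemory()`) at start, 
while a buffer is held, and after it is released. Whether the reserved figure drops 
again depends on the JVM and its settings, and `System.gc()` is only a request.

```java
import java.util.Locale;

final class HeapRetentionDemo {

    /** Returns {reservedBytes, usedBytes} for the current JVM heap. */
    static long[] snapshot() {
        Runtime rt = Runtime.getRuntime();
        long reserved = rt.totalMemory();        // heap currently reserved by the JVM
        long used = reserved - rt.freeMemory();  // part of it occupied by objects
        return new long[] { reserved, used };
    }

    /** Formats a snapshot in megabytes (1 MB = 1024 * 1024 bytes). */
    static String format(String label, long[] s) {
        double mb = 1024.0 * 1024.0;
        return String.format(Locale.ROOT, "%-8s reserved=%.1f MB, used=%.1f MB",
                label, s[0] / mb, s[1] / mb);
    }

    public static void main(String[] args) {
        System.out.println(format("start", snapshot()));

        byte[] block = new byte[32 * 1024 * 1024]; // illustrative 32 MB allocation
        block[0] = 1;                              // touch the array so it is really used
        System.out.println(format("holding", snapshot()));

        block = null;                              // drop the only reference
        System.gc();                               // a request, not a guarantee
        System.out.println(format("released", snapshot()));
    }
}
```

If the "reserved" value stays high in the last line while "used" falls, the JVM is 
keeping the heap it grew into rather than the application holding on to the data.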

> PDFTextStripper.getText - hight memory usage
> 
>
> Key: PDFBOX-1808
> URL: https://issues.apache.org/jira/browse/PDFBOX-1808
> Project: PDFBox
> Issue Type: Bug
> Components: Text extraction
> Affects Versions: 1.8.2, 1.8.3
> Environment: Windows 7, Java jdk 1.7.0_45
> Reporter: Guyenot Jeremy
> Assignee: Andreas Lehmkühler
> Priority: Critical
> Labels: performance
> Attachments: 1808-java char copyof.jpg, 1808-java char copyofrange.jpg, 
>   1808-java usage.jpg, 1808-pdfbox usage.jpg, 1808-snapshot.nps, 
>   DOSSIER DE CANDIDATURE_001.pdf, Screenshot2014-01-21-19-51-24.png, 
>   netbeans_project.jpg, s5-1.png, s5-2.png, s50-1.png, s50-2.png
>
> Original Estimate: 72h
> Remaining Estimate: 72h
>
> Hello,
> I'm trying to extract text from PDFs, but I find that PDFTextStripper uses a 
> lot of memory.
> With a PDF that has 2676 pages (4.6 MB in size) it uses 1.5 GB of memory.
> I also noticed that the memory isn't freed after the getText method is called.
> You can see my code below:
> double virgule = Math.pow(10, 2);
> System.out.println("START - Total memory (Mo): "
>     + Math.round((Runtime.getRuntime().totalMemory()/100) * virgule) / virgule);
> PDDocument cd = PDDocument.load(file);
> System.out.println("PDDocument getNumberOfPages - Nombre de pages: "
>     + cd.getNumberOfPages());
> System.out.println("PDDocument load - Total memory (Mo): "
>     + Math.round((Runtime.getRuntime().totalMemory()/100) * virgule) / virgule);
> String pdfText = "";
> try{
>   PDFTextStripper stripper = new PDFTextStripper();
>   pdfText = stripper.getText(cd);
>   System.out.println("PDFTextStripper getText - Total memory (Mo): "
>       + Math.round((Runtime.getRuntime().totalMemory()/100) * virgule) / virgule);
>   stripper.resetEngine();
>   stripper = null;
>   System.out.println("PDFTextStripper resetEngine - Total memory (Mo): "
>       + Math.round((Runtime.getRuntime().totalMemory()/100) * virgule) / virgule);
> }
> finally{
>   if( cd!=null ){
>     cd.close();
>     cd = null;
>     System.out.println("PDDocument close - Total memory (Mo): "
>         + Math.round((Runtime.getRuntime().totalMemory()/100) * virgule) / virgule);
>   }
> }
> retour = new TextField(fieldName, pdfText, Field.Store.NO);
> System.out.println("TextField - Total memory (Mo): "
>     + Math.round((Runtime.getRuntime().totalMemory()/100) * virgule) / virgule);
> And the result in my output window:
> START - Total memory (Mo): 95.0
> PDDocument getNumberOfPages - Nombre de pages: 2676
> PDDocument load - Total memory (Mo): 121.0
> PDFTextStripper getText - Total memory (Mo): 757.0
> PDFTextStripper resetEngine - Total memory (Mo): 757.0
> PDDocument close - Total memory (Mo): 757.0
> TextField - Total memory (Mo): 757.0
> pdfText - Total memory (Mo): 757.0
> I also tried calling System.gc(), but the memory usage stayed the same.




[jira] [Commented] (PDFBOX-1808) PDFTextStripper.getText - hight memory usage

2014-01-22 Thread Timo Boehme (JIRA)


[ 
https://issues.apache.org/jira/browse/PDFBOX-1808?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13878461#comment-13878461
 ] 

Timo Boehme commented on PDFBOX-1808:
-

[~jguyenot] please read up on the meaning of the memory statistics provided by 
Java. *Total memory* is (as the name says) all the memory the VM has currently 
reserved, whether or not your application is using it. What you want is the 
memory used by your application. This has to be calculated as totalMem - freeMem 
(see e.g. 
http://stackoverflow.com/questions/3571203/what-is-the-exact-meaning-of-runtime-getruntime-totalmemory-and-freememory)
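
The calculation described above can be sketched with the standard `Runtime` 
methods. The class `HeapUsage` and its method names are hypothetical helpers for 
illustration, not part of PDFBox, and the sketch assumes megabytes of 1024 * 1024 
bytes.

```java
final class HeapUsage {

    /** Used heap = memory reserved by the JVM minus the unused part of that reservation. */
    static long usedBytes(long totalBytes, long freeBytes) {
        return totalBytes - freeBytes;
    }

    /** Used heap of the running JVM, in bytes. */
    static long currentUsedBytes() {
        Runtime rt = Runtime.getRuntime();
        return usedBytes(rt.totalMemory(), rt.freeMemory());
    }

    /** Converts bytes to megabytes, rounded to two decimals. */
    static double toMegabytes(long bytes) {
        return Math.round(bytes / (1024.0 * 1024.0) * 100.0) / 100.0;
    }

    public static void main(String[] args) {
        System.out.println("Used memory (MB): " + toMegabytes(currentUsedBytes()));
    }
}
```

Printing a value like this before and after a call such as `stripper.getText(cd)` 
would show what the application holds, rather than what the VM has reserved.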





[jira] [Resolved] (PDFBOX-1857) Attachment damages singature

2014-01-22 Thread Thomas Chojecki (JIRA)


 [ 
https://issues.apache.org/jira/browse/PDFBOX-1857?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Thomas Chojecki resolved PDFBOX-1857.
-

Resolution: Duplicate
  Assignee: Thomas Chojecki

You have two problems here.
1.) Attaching content after signing.
2.) A problem with the ID management.

1.) This is not solvable right now (see PDFBOX-1837). We would need to change the 
object storage engine and rewrite the writer. So if you want to attach content, 
you should do it before the first signature, or remove the signature before 
attaching content and then sign again.

2.) I looked at section 14.4 of the specification, PDF 32000-1:2008, and can 
confirm the solution that was mentioned on Stack Overflow. 

Nonetheless, I am closing this as a duplicate, because the main problem is the 
broken signature.




