[
https://issues.apache.org/jira/browse/PDFBOX-1778?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13823482#comment-13823482
]
Michael Klink edited comment on PDFBOX-1778 at 11/15/13 8:49 AM:
-----------------------------------------------------------------
I don't think this resolution fixes anything.
Consider the attached PDF extreme-values.pdf, which Acrobat Reader XI displays
as two pages with numerous concentric rings, cf. the screenshot
extreme-values.png.
Trying to prepend red color setting tokens before the page content stream
tokens of this PDF using this code:
    // Load the source document and walk over its pages.
    PDDocument doc = PDDocument.load(source);
    List pages = doc.getDocumentCatalog().getAllPages();
    for (int i = 0; i < pages.size(); i++) {
        PDPage page = (PDPage) pages.get(i);
        PDStream contents = page.getContents();
        // Parse the page content stream into a token list.
        PDFStreamParser parser = new PDFStreamParser(contents.getStream());
        parser.parse();
        List tokens = parser.getTokens();
        // Prepend "1 0 0 rg" (set the non-stroking color to red).
        tokens.add(0, COSInteger.get(1));
        tokens.add(1, COSInteger.get(0));
        tokens.add(2, COSInteger.get(0));
        tokens.add(3, PDFOperator.getOperator("rg"));
        // Write all tokens to a new stream and use it as the page content.
        PDStream updatedStream = new PDStream(doc);
        OutputStream out = updatedStream.createOutputStream();
        ContentStreamWriter tokenWriter = new ContentStreamWriter(out);
        tokenWriter.writeTokens(tokens);
        page.setContents(updatedStream);
    }
    doc.save(target);
    doc.close();
with the current 2.0.0-SNAPSHOT results in the (also attached) PDF
extreme-values-red.pdf, which does not at all represent the original in red, cf.
the screenshot extreme-values-red.png.
The reason is simple: I used somewhat extreme values for coordinates and
coordinate transformations in the sample PDF.
The first page uses small numbers in the transformation matrix and large
coordinate values:
    0.000001 0 0 0.000001 0 0 cm
    10000000 0 m
    10000000 5522999.6 5522999.6 10000000 0 10000000 c
    -5522999.6 10000000 -10000000 5522999.6 -10000000 0 c
    -10000000 -5522999.6 -5522999.6 -10000000 0 -10000000 c
    5522999.6 -10000000 10000000 -5522999.6 10000000 0 c
    ...
Here the result PDF sets a null transformation matrix:
    0 0 0 0 0 0 cm
This obviously collapses all the rings.
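To illustrate how a matrix entry like 0.000001 can end up as 0, here is a
minimal sketch. It assumes that reals are written with a fixed maximum of five
digits after the decimal point; that is an assumption about the kind of
formatting involved, not a quote of the PDFBox code. The class and method names
(FixedFractionDemo, fixedFive) are made up for this illustration.

```java
import java.text.DecimalFormat;
import java.text.DecimalFormatSymbols;
import java.util.Locale;

public class FixedFractionDemo {
    // Formats a value with at most five digits after the decimal point
    // (hypothetical helper, assumed to mimic a fixed-fraction writer).
    static String fixedFive(double value) {
        DecimalFormat format =
                new DecimalFormat("0.#####", DecimalFormatSymbols.getInstance(Locale.US));
        return format.format(value);
    }

    public static void main(String[] args) {
        // The scale factor of the first sample page is lost entirely.
        System.out.println(fixedFive(0.000001));  // prints 0
        // Values of ordinary magnitude survive unchanged.
        System.out.println(fixedFive(0.9505));    // prints 0.9505
        System.out.println(fixedFive(5522999.6)); // prints 5522999.6
    }
}
```

Under this assumption, only numbers of very small magnitude are damaged, which
matches what the sample file shows.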
The second page uses large numbers in the transformation matrix and small
coordinate values:
    10000000 0 0 10000000 0 0 cm
    0.000001 0 m
    0.000001 0.0000005523 0.0000005523 0.000001 0 0.000001 c
    -0.0000005523 0.000001 -0.000001 0.0000005523 -0.000001 0 c
    -0.000001 -0.0000005523 -0.0000005523 -0.000001 0 -0.000001 c
    0.0000005523 -0.000001 0.000001 -0.0000005523 0.000001 0 c
    ...
Here the result wreaks havoc among the coordinates, many set to 0, many others
collapsed to the same value...
While the sample certainly is somewhat extreme, I don't think it is invalid.
PS: Yes, the spec (ISO 32000-1, Annex C) mentions 5 as the "Number of
significant decimal digits of precision in fractional part", but it also
mentions ± 1.175 × 10^(-38) as "Nonzero real values closest to 0". Thus, the 5
above obviously has to be interpreted in relation to this normalized number
representation, not the PDF number style. For a PDF number this means that
only the exact value of the 5 digits following the first non-0 digit counts,
not that of the first 5 digits after the decimal point.
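The reading above can be sketched as rounding to six significant digits (the
first non-0 digit plus the five following it) instead of to five digits after
the decimal point. This is only an illustration of that interpretation using
java.math.BigDecimal, not a proposed patch; SignificantDigitsDemo and
sixSignificant are hypothetical names.

```java
import java.math.BigDecimal;
import java.math.MathContext;

public class SignificantDigitsDemo {
    // Rounds a decimal literal to six significant digits and renders it in
    // plain (non-exponential) notation, as PDF numbers require.
    static String sixSignificant(String number) {
        BigDecimal rounded = new BigDecimal(number).round(new MathContext(6));
        return rounded.stripTrailingZeros().toPlainString();
    }

    public static void main(String[] args) {
        // Small values from the second sample page keep their magnitude.
        System.out.println(sixSignificant("0.000001"));     // prints 0.000001
        System.out.println(sixSignificant("0.0000005523")); // prints 0.0000005523
        // Large values lose only digits beyond the sixth significant one.
        System.out.println(sixSignificant("5522999.6"));    // prints 5523000
        // Float widening noise from the original report is rounded away.
        System.out.println(sixSignificant("0.9505000114")); // prints 0.9505
    }
}
```

With such a rule, both the widening noise from the original report and the
extreme values of the sample file would come out in usable form.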
PPS: Using 1.8.2, the result looks as expected.
> Rounding issue in generated PDF file
> ------------------------------------
>
> Key: PDFBOX-1778
> URL: https://issues.apache.org/jira/browse/PDFBOX-1778
> Project: PDFBox
> Issue Type: Bug
> Components: PDModel, Signing
> Affects Versions: 1.8.3
> Reporter: vakhtang koroghlishvili
> Assignee: Andreas Lehmkühler
> Priority: Critical
> Fix For: 1.8.3, 2.0.0
>
> Attachments: extreme-values-red.pdf, extreme-values-red.png,
> extreme-values.pdf, extreme-values.png, original.pdf, saved.pdf
>
>
> We have PDF file which was signed by some other application.
> When we try to sign it with PDFbox, previous revision is damaged.
> We did some investigations and found such problem:
> (question on stackoverflow is here:
> http://stackoverflow.com/questions/19903884/pdf-document-is-modified-by-another-revision/19905271?noredirect=1#19905271
> )
> Some PDF tags are changed in new revisions.
> For example values of following tags:
> /WhitePoint
> /Gamma
> /Matrix
> are changed from values like this: 0.9505
> to values like this: 0.9505000114
> We think this is problem of converting float/double inside COSFloat.
> Following code just opens and saves PDF file and this operation changes
> values of mentioned text:
> public void saveTo(String sourceFile, String destFile) throws Exception{
> PDDocument doc = PDDocument.load(new FileInputStream(sourceFile));
> doc.save(new FileOutputStream(destFile));
> }
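The effect described in the quoted report can be reproduced in plain Java
without PDFBox. The sketch below only assumes that a value parsed into a float
is later printed through a double; whether COSFloat does exactly this is the
reporter's suspicion, not something shown here. FloatWideningDemo and its
methods are hypothetical names.

```java
public class FloatWideningDemo {
    // Prints the float after widening it to double: the extra binary digits
    // of the float approximation become visible as decimal noise.
    static String viaDouble(float value) {
        return Double.toString((double) value);
    }

    // Prints the float directly: Java emits the shortest decimal string
    // that still identifies this float value.
    static String viaFloat(float value) {
        return Float.toString(value);
    }

    public static void main(String[] args) {
        float whitePoint = 0.9505f;
        System.out.println(viaFloat(whitePoint));  // prints 0.9505
        System.out.println(viaDouble(whitePoint)); // longer than 0.9505
    }
}
```

The widened output differs from the literal in the source file, which is enough
to invalidate a signature over the previous revision.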