date:20150810

[jira] [Commented] (PDFBOX-1695) Improve pdfbox tests

2015-08-10 Thread Tilman Hausherr (JIRA)

[
https://issues.apache.org/jira/browse/PDFBOX-1695?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14680353#comment-14680353
]

Tilman Hausherr commented on PDFBOX-1695:
-

The only thing I'm unhappy with is that the timeout is hardcoded in
ParallelParameterized.
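
One way to avoid a hardcoded timeout is to read it from a system property and fall back to a default. The following is only a rough sketch of that idea, not the code in ParallelParameterized: the class name TimeoutConfig, the property name pdfbox.test.timeoutMinutes and the default of 10 minutes are all assumptions made up for illustration.

```java
/** Hypothetical helper; none of these names exist in PDFBox. */
final class TimeoutConfig {
    // Assumed property name, chosen only for this sketch.
    static final String PROPERTY = "pdfbox.test.timeoutMinutes";
    // Assumed default, used when the property is missing or invalid.
    static final long DEFAULT_MINUTES = 10;

    private TimeoutConfig() {
    }

    /** Returns the configured timeout in minutes, or the default. */
    static long timeoutMinutes() {
        String value = System.getProperty(PROPERTY);
        if (value == null) {
            return DEFAULT_MINUTES;
        }
        try {
            long minutes = Long.parseLong(value.trim());
            // Reject zero and negative values rather than waiting "forever".
            return minutes > 0 ? minutes : DEFAULT_MINUTES;
        } catch (NumberFormatException e) {
            return DEFAULT_MINUTES;
        }
    }
}
```

With something like this, a slow build machine could pass -Dpdfbox.test.timeoutMinutes=30 on the command line without a source change.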

Improve pdfbox tests

Key: PDFBOX-1695
URL: https://issues.apache.org/jira/browse/PDFBOX-1695
Project: PDFBox
Issue Type: Improvement
Affects Versions: 1.8.2, 2.0.0
Reporter: Tilman Hausherr
Assignee: Tilman Hausherr
Priority: Minor
Labels: tdd, test-driven, testing
Fix For: 2.0.0

Attachments: ccitt4.tif, jbig2test-01.png, jbig2test.pdf

I'd like to improve the tests for rendering.
org/apache/pdfbox/util/TestPDFToImage.java is disabled in pdfbox\pom.xml.
It has been disabled since 2009?! So I enabled it here.
The subdirectory "rendering" is missing in pdfbox\target\test-output for these
tests.
When a test fails because the rendered image is not identical, no detailed
message appears on the console; it appears only in pdfbox.log.
This is because of the settings in
pdfbox\src\test\resources\logging.properties
If this is on purpose, please change the texts in
pdfbox\src\test\java\org\apache\pdfbox\util\*.java from
One or more failures, see test log for details
to
One or more failures, see test logfile 'pdfbox.log' for details
I wanted to attach a PDF with CCITT G4 compression and its rendering created
with the 1.8.2 version, but that didn't work out; it seems that CIB generates
files that can be rendered properly with 1.8.2. However, I attach the TIFF G4
file and a JBIG2 test file made from it. I don't have access to a Xerox
WorkCentre (enter jbig2 in google news :-) ) so I used a free service, which
is why there's a watermark.
It should be included in
pdfbox\src\test\resources\input\rendering
I created the image myself and I release it into the public domain.
If my suggestion is accepted, it would be nice if people could create files
that fail in current versions or have failed in old versions, and release
these files to the public domain, so that they can be added to the tests.

--
This message was sent by Atlassian JIRA
(v6.3.4#6332)

-
To unsubscribe, e-mail: dev-unsubscr...@pdfbox.apache.org
For additional commands, e-mail: dev-h...@pdfbox.apache.org

[jira] [Commented] (PDFBOX-2917) PDF to Image, faint/dim Images

2015-08-10 Thread Tilman Hausherr (JIRA)


[ 
https://issues.apache.org/jira/browse/PDFBOX-2917?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14680333#comment-14680333
 ] 

Tilman Hausherr commented on PDFBOX-2917:
-

Sadly, this was not very successful either:
- immo-kurier_arsenal_93x62.pdf: the barcode now has the wrong color
- Page 3 of PDFBOX-2348.pdf is too dark
- The Braun file is unchanged. This could be because your patch doesn't change
standard CMYK images, which also use an ICC colorspace.

 PDF to Image, faint/dim Images
 --

 Key: PDFBOX-2917
 URL: https://issues.apache.org/jira/browse/PDFBOX-2917
 Project: PDFBox
  Issue Type: Bug
Affects Versions: 2.0.0
 Environment: Windows 8.1, jdk1.8.0_51, jre1.8.0_51
Reporter: Samuil Goranov
Priority: Trivial
  Labels: images, newbie
 Attachments: PDFBOX-2917-v2.patch, 
 PDFBOX-2917__Use_linear_RGB_for_image_color_conversion_to_workaround_JDK_bug.patch,
  saved0.png, screenshot-1.png, selection.pdf


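 {code:title=pdftoimage.java|borderStyle=solid}
 import java.awt.image.BufferedImage;
 import java.io.File;
 import java.io.IOException;
 import javax.imageio.ImageIO;
 import org.apache.pdfbox.pdmodel.PDDocument;
 import org.apache.pdfbox.rendering.ImageType;
 import org.apache.pdfbox.rendering.PDFRenderer;

 File file = new File("F:\\Projects\\java\\pdfbox\\complete.pdf");
 // try-with-resources closes the document even if rendering fails
 try (PDDocument document = PDDocument.load(file)) {
     // render the first page (index 0) at 150 DPI as RGB
     BufferedImage bi = new PDFRenderer(document).renderImageWithDPI(
             0, 150, ImageType.RGB);
     File outputfile = new File("saved0.png");
     ImageIO.write(bi, "png", outputfile);
 } catch (IOException e) {
     // report the failure instead of silently ignoring it
     e.printStackTrace();
 }
 {code}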




[jira] [Commented] (PDFBOX-2893) Simplify COSStream encoding and decoding

2015-08-10 Thread Tilman Hausherr (JIRA)

[
https://issues.apache.org/jira/browse/PDFBOX-2893?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14680341#comment-14680341
]

Tilman Hausherr commented on PDFBOX-2893:
-

Got it, I agree, and I will do this in PDFBOX-1695.

Simplify COSStream encoding and decoding

Key: PDFBOX-2893
URL: https://issues.apache.org/jira/browse/PDFBOX-2893
Project: PDFBox
Issue Type: Improvement
Affects Versions: 2.0.0
Reporter: John Hewson
Assignee: John Hewson
Priority: Blocker
Fix For: 2.0.0

Attachments: PDFBOX-2893-2.patch

Performance issues and memory usage issues surrounding streams are one of the
few things blocking the release of 2.0 (see PDFBOX-2301, PDFBOX-2882,
PDFBOX-2883).
Though we've managed to reduce some of the memory used by RandomAccessBuffer
and to take advantage of buffering of scratch files, we still have problems
with the amount of memory which COSStream holds onto. Changes introduced in
2.0 have resulted in COSStreams having a very complex relationship with
classes which hold a lot of memory in complex ways (e.g. the fields:
tempBuffer, filteredBuffer, unfilteredBuffer, filteredStream,
unFilteredStream, scratchFile). Access to scratch file pages in particular
does not seem to be well regulated, especially with regard to multithreading
(an avenue we'd at least like to leave open).
Given recent flux, I'm doubtful that we can ship the current API for
COSStream w.r.t. RandomAccess without shipping performance issues or flaws
which will be unfixable without breaking changes.
One of the recent changes to COSStream is that it now exposes a RandomAccess;
this is so that PDFStreamParser can parse content streams (as well as other
subclasses which handle xref and object streams). However, streams are
fundamentally not random access - stream filters are sequential. While the
consumer of a stream may wish to buffer the data (in memory or scratch) for
random access, COSStream itself does not need to expose such an elaborate API
- many pieces of gymnastics are performed inside COSStream to present this
illusion, at significant cost. We should remove that.
But what about providing a RandomAccess for PDFStreamParser,
PDFObjectStreamParser, and PDFXrefStreamParser? It turns out that those
classes don't actually perform random I/O. They perform sequential I/O with a
buffer for peek/unread.
We need to simplify to get 2.0 fast, lean, and maintainable. Here's what I
think we should do:
1. Split the interfaces for sequential and random I/O
- Introduce a new SequentialSource interface for sequential I/O, with thin
wrappers for RandomAccessRead and InputStream.
- BaseParser will use SequentialSource rather than RandomAccessRead (this
will be inherited by PDFStreamParser, PDFObjectStreamParser, and
PDFXrefStreamParser).
- COSParser will use RandomAccessRead and pass a SequentialSource wrapper to
its superclass, BaseParser.
2. Remove RandomAccess APIs from COSStream, expose only InputStream and
OutputStream, as we used to do. We can pass an InputStream to PDFStreamParser
using a wrapper which implements SequentialSource. This will remove
tempBuffer, filteredBuffer, and unfilteredBuffer from COSStream, all of which
hold memory.
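
To make the proposed split concrete, here is a minimal sketch of what a sequential-only interface with a thin InputStream wrapper could look like. It is an illustration under assumptions, not the interface from the attached patch: the method set (read, peek, unread), the class name InputStreamSource and the 16-byte pushback size are all invented for this example. It relies on java.io.PushbackInputStream to provide the peek/unread buffer that the parsers need.

```java
import java.io.Closeable;
import java.io.IOException;
import java.io.InputStream;
import java.io.PushbackInputStream;

/** Hypothetical sequential-only source: forward reads plus a small pushback. */
interface SequentialSource extends Closeable {
    /** Returns the next byte, or -1 at the end of the data. */
    int read() throws IOException;

    /** Returns the next byte without consuming it, or -1 at the end. */
    int peek() throws IOException;

    /** Pushes one byte back so that the next read() returns it. */
    void unread(int b) throws IOException;
}

/** Thin wrapper adapting any InputStream to SequentialSource. */
final class InputStreamSource implements SequentialSource {
    private final PushbackInputStream in;

    InputStreamSource(InputStream input) {
        // The pushback capacity of 16 bytes is an arbitrary choice here.
        this.in = new PushbackInputStream(input, 16);
    }

    @Override
    public int read() throws IOException {
        return in.read();
    }

    @Override
    public int peek() throws IOException {
        int b = in.read();
        if (b != -1) {
            in.unread(b); // put the byte back so it is not consumed
        }
        return b;
    }

    @Override
    public void unread(int b) throws IOException {
        in.unread(b);
    }

    @Override
    public void close() throws IOException {
        in.close();
    }
}
```

In this shape, a stream parser would only ever see SequentialSource, so COSStream could hand it a decoded InputStream without exposing any random-access API. A corresponding wrapper over RandomAccessRead would follow the same pattern.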


[jira] [Commented] (PDFBOX-1695) Improve pdfbox tests

2015-08-10 Thread ASF subversion and git services (JIRA)


[ 
https://issues.apache.org/jira/browse/PDFBOX-1695?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14680345#comment-14680345
 ] 

ASF subversion and git services commented on PDFBOX-1695:
-

Commit 1695134 from [~tilman] in branch 'pdfbox/trunk'
[ https://svn.apache.org/r1695134 ]

PDFBOX-1695: use a parameterized test as suggested by John Hewson, remove 
unused stuff

