date:20140318

[jira] [Reopened] (PDFBOX-1988) PDFBox ExtractText issue of PDF with no embedded fonts

2014-03-18 Thread John Hewson (JIRA)


 [ 
https://issues.apache.org/jira/browse/PDFBOX-1988?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

John Hewson reopened PDFBOX-1988:
-


Reopening because we leave issues open until the version they were fixed in is 
released.

 PDFBox ExtractText issue of PDF with no embedded fonts
 --

 Key: PDFBOX-1988
 URL: https://issues.apache.org/jira/browse/PDFBOX-1988
 Project: PDFBox
  Issue Type: Bug
  Components: Rendering, Text extraction
Affects Versions: 1.8.4
 Environment: Windows 7
 Also, PASE on IBM i
Reporter: Craig Strong
  Labels: patch
 Fix For: 1.8.5, 2.0.0

 Attachments: Test1.pdf

   Original Estimate: 120h
  Remaining Estimate: 120h

 I have been using PDFBox 1.8.4 to extract text from several different PDF 
 files fine.  I use the latest PDFBox app with ExtractText command line.  
 There is one PDF that PDFBox (and iText) fails to extract any text even 
 though I can extract the text with Adobe Reader and also pdftotext.exe part 
 of XPdf.  java -jar pdfbox-app-1.8.4.jar ExtractText Test1.pdf Out.txt.  I 
 don't want to have to rely on using pdftotext.exe from a PC since this is 
 part of an automated application.  I think the error relates to an unknown 
 font type and having to use the few fonts installed in the jar file.  I tried 
 running the API classes and trying to force a font from a certain location 
 but I still got errors.  I thought I loaded the font with the loadTTF method 
 but I don't know if that did anything with the font.  I would really like to 
 have this working straight from the ExtractText class anyway.
 Here are the errors I am getting.  I tried this from both a Windows 7 PC and 
 our IBM i in the PASE environment but I get the same errors.  The section 
 starting processEncodedText and on repeats a few times so I just included the 
 first entries.
  
 Mar 10, 2014 3:50:44 PM org.apache.pdfbox.pdmodel.font.PDFontFactory createFont
 WARNING: Substituting TrueType for unknown font subtype=
 Mar 10, 2014 3:50:45 PM org.apache.pdfbox.util.PDFStreamEngine processOperator
 WARNING: java.lang.NullPointerException
 Throwable occurred: java.lang.NullPointerException
     at org.apache.pdfbox.pdmodel.font.PDTrueTypeFont.loadDescriptorDictionary(PDTrueTypeFont.java:375)
     at org.apache.pdfbox.pdmodel.font.PDTrueTypeFont.ensureFontDescriptor(PDTrueTypeFont.java:221)
     at org.apache.pdfbox.pdmodel.font.PDTrueTypeFont.init(PDTrueTypeFont.java:119)
     at org.apache.pdfbox.pdmodel.font.PDFontFactory.createFont(PDFontFactory.java:121)
     at org.apache.pdfbox.pdmodel.PDResources.getFonts(PDResources.java:204)
     at org.apache.pdfbox.util.PDFStreamEngine.getFonts(PDFStreamEngine.java:604)
     at org.apache.pdfbox.util.operator.SetTextFont.process(SetTextFont.java:54)
     at org.apache.pdfbox.util.PDFStreamEngine.processOperator(PDFStreamEngine.java:554)
     at org.apache.pdfbox.util.PDFStreamEngine.processSubStream(PDFStreamEngine.java:268)
     at org.apache.pdfbox.util.PDFStreamEngine.processSubStream(PDFStreamEngine.java:235)
     at org.apache.pdfbox.util.PDFStreamEngine.processStream(PDFStreamEngine.java:215)
     at org.apache.pdfbox.util.PDFTextStripper.processPage(PDFTextStripper.java:456)
     at org.apache.pdfbox.util.PDFTextStripper.processPages(PDFTextStripper.java:381)
     at org.apache.pdfbox.util.PDFTextStripper.writeText(PDFTextStripper.java:340)
     at org.apache.pdfbox.ExtractText.startExtraction(ExtractText.java:275)
     at org.apache.pdfbox.ExtractText.main(ExtractText.java:85)
     at org.apache.pdfbox.PDFBox.main(PDFBox.java:58)
 Mar 10, 2014 3:50:45 PM org.apache.pdfbox.util.PDFStreamEngine processEncodedText
 WARNING: java.lang.NullPointerException
 Throwable occurred: java.lang.NullPointerException
     at org.apache.pdfbox.util.PDFStreamEngine.processEncodedText(PDFStreamEngine.java:355)
     at org.apache.pdfbox.util.operator.ShowText.process(ShowText.java:45)
     at org.apache.pdfbox.util.PDFStreamEngine.processOperator(PDFStreamEngine.java:554)
     at org.apache.pdfbox.util.PDFStreamEngine.processSubStream(PDFStreamEngine.java:268)
     at

[jira] [Resolved] (PDFBOX-1988) PDFBox ExtractText issue of PDF with no embedded fonts

2014-03-18 Thread John Hewson (JIRA)


 [ 
https://issues.apache.org/jira/browse/PDFBOX-1988?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

John Hewson resolved PDFBOX-1988.
-

Resolution: Fixed


Re: [GSoC 2014]Implement shading with Coons and tensor-product patch meshes

2014-03-18 Thread Thimal Kempitiya

Hi Tilman,
I'll look into the part of the PDF spec related to FunctionType, thanks for that.

Thanks for the tips on the proposal.

I uploaded my proposal to Melange.

Here is the URL:
https://www.google-melange.com/gsoc/proposal/review/student/google/gsoc2014/thimal/5649050225344512

I have suggested a new, simple method to find the patch containing a given
point. According to the PDF spec, type 6 can be treated as a special case of
type 7, so given the 12 points we can calculate the other 4 values and use the
same implementation for type 6.
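
The type 6 / type 7 relation mentioned above can be sketched in Java as follows. This is only an illustration, not PDFBox code: the class name CoonsToTensor, the p[i][j] grid layout and the helper interior() are assumptions made for the example, and the coefficients are the ones the PDF spec gives (as far as I recall) for the implicit internal control points p11, p12, p21, p22 of a Coons patch. Please check them against the spec before relying on them.

```java
/** Sketch: derive the 4 internal control points of a type 7 patch from the 12 boundary points of a type 6 patch. */
final class CoonsToTensor {

    /**
     * p is a 4x4 grid of {x, y} pairs indexed p[i][j] (hypothetical layout).
     * The 12 boundary entries must be set; the 4 internal entries
     * p[1][1], p[1][2], p[2][1], p[2][2] are computed and stored.
     */
    static void fillInterior(double[][][] p) {
        p[1][1] = interior(p[0][0], p[0][1], p[1][0], p[0][3], p[3][0], p[3][1], p[1][3], p[3][3]);
        p[1][2] = interior(p[0][3], p[0][2], p[1][3], p[0][0], p[3][3], p[3][2], p[1][0], p[3][0]);
        p[2][1] = interior(p[3][0], p[3][1], p[2][0], p[3][3], p[0][0], p[0][1], p[2][3], p[0][3]);
        p[2][2] = interior(p[3][3], p[3][2], p[2][3], p[3][0], p[0][3], p[0][2], p[2][0], p[0][0]);
    }

    /** Computes (-4a + 6(b + c) - 2(d + e) + 3(f + g) - h) / 9 for x and y separately. */
    private static double[] interior(double[] a, double[] b, double[] c, double[] d,
                                     double[] e, double[] f, double[] g, double[] h) {
        double[] result = new double[2];
        for (int k = 0; k < 2; k++) {
            result[k] = (-4 * a[k] + 6 * (b[k] + c[k]) - 2 * (d[k] + e[k])
                    + 3 * (f[k] + g[k]) - h[k]) / 9.0;
        }
        return result;
    }
}
```

With the internal points filled in, the same 16-point evaluation could in principle serve both patch types, which is the reuse suggested above.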

I would be glad if you could give feedback on my proposal.

On Wed, Mar 12, 2014 at 11:42 PM, Tilman Hausherr thaush...@t-online.de wrote:

Hello,

The function is something used mostly by shading types 1, 2 and 3. It uses
as input either the coordinates, or the result of a formula based on them.
Search for FunctionType in the PDF spec.

Re: the proposal, no I don't have a sample. I don't even know what the
Google format looks like. What I'd expect to see is your background, what
you are studying, what you are mostly focused on in these studies, what
your skills / experiences are, and why you think you're the one for this
project. And maybe a few lines on how you're going to crack the two core
problems (1. point inside/outside, 2. color). If you don't know, then maybe
a few lines explaining what you will want to learn in order to know it.

Tilman

Am 12.03.2014 14:39, schrieb Thimal Kempitiya:

Hi Tilman,

Thanks for the feedback. By function calculations, do you mean the
function evaluation method? Can you please give more information on it?

About the proposal, what advice can you give? Is there a specific format that
PDFBox expects apart from the GSoC format, and is there a sample proposal
that we could use to get an idea about writing one?

On Sun, Mar 9, 2014 at 8:42 PM, Tilman Hausherr thaush...@t-online.de
wrote:

Hello,

Yes this is an interesting idea. It would save the recalculation of y1y0
* (y + j - coords[1]) every time (unless the Java compiler detects this
already).
But don't expect too much from it - I believe more time is lost in
function calculation (at least for types 1, 2 and 3 where functions are
mandatory).

Tilman

Am 09.03.2014 15:12, schrieb Thimal Kempitiya:

Thanks Tilman

For speed optimization I think we need to focus on methods that are called
again and again, like getRaster.

For the axial shading part, the current implementation of getRaster
calculates the x' value for the raster inside the nested for loops:

for (int j = 0; j < h; j++)
{
    for (int i = 0; i < w; i++)
    {
        useBackground = false;
        double inputValue = x1x0 * (x + i - coords[0]);
        inputValue += y1y0 * (y + j - coords[1]);

But the only things that change are i and j, which vary over 0 <= j < h and
0 <= i < w. So the contributions from i and j can be calculated in two
separate for loops, one running from 0 to h and one from 0 to w, and stored
in two arrays. When we need to evaluate, we just add the two stored values
to get the input value.

This moves the calculations out of the nested loops into two single loops,
which may speed up the axial shading.

What do you think about it?
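
The idea described above can be sketched as follows. This is a standalone illustration, not the actual getRaster code: the class AxialInputValues and its two methods are hypothetical, and only the variable names x1x0, y1y0, coords, x, y, w and h are taken from the quoted snippet. The first method computes the input value per pixel as in the snippet; the second precomputes the per-column and per-row contributions once and only adds them inside the nested loops.

```java
/** Sketch: hoisting the per-row and per-column terms out of the nested raster loops. */
final class AxialInputValues {

    /** Per-pixel computation, as in the quoted loop: 2 multiplications per pixel. */
    static double[] perPixel(int x, int y, int w, int h,
                             double x1x0, double y1y0, double[] coords) {
        double[] values = new double[w * h];
        for (int j = 0; j < h; j++) {
            for (int i = 0; i < w; i++) {
                double inputValue = x1x0 * (x + i - coords[0]);
                inputValue += y1y0 * (y + j - coords[1]);
                values[j * w + i] = inputValue;
            }
        }
        return values;
    }

    /** Precomputed variant: w + h multiplications in total, 1 addition per pixel. */
    static double[] precomputed(int x, int y, int w, int h,
                                double x1x0, double y1y0, double[] coords) {
        double[] columnTerm = new double[w];
        for (int i = 0; i < w; i++) {
            columnTerm[i] = x1x0 * (x + i - coords[0]);
        }
        double[] rowTerm = new double[h];
        for (int j = 0; j < h; j++) {
            rowTerm[j] = y1y0 * (y + j - coords[1]);
        }
        double[] values = new double[w * h];
        for (int j = 0; j < h; j++) {
            for (int i = 0; i < w; i++) {
                values[j * w + i] = columnTerm[i] + rowTerm[j];
            }
        }
        return values;
    }
}
```

Whether this is measurably faster in practice would have to be benchmarked; the function evaluation that follows in the real loop may well dominate the cost.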

On Fri, Mar 7, 2014 at 11:44 PM, Tilman Hausherr thaush...@t-online.de
wrote:

Am 07.03.2014 15:03, schrieb Thimal Kempitiya:

Thanks Tilman for the feedback

http://www.particleincell.com/blog/2012/quad-interpolation/ seems like the
opposite of what we need. I will have to check whether it works for this by
implementing it (which should be easy if we use a library with matrix
manipulations).

This is really up to you :-) Re: the pure math parts, it's rather me who
is learning something.

Re: library, you can use the java standard library, or any library with
Apache license or compatible license.

Can I know more about the optional part in the issue?

Optional:
Review and optimize the complete shading package; implement cubic spline
interpolation for type 0 (sampled) functions.

Where can I get more information about the cubic spline interpolation for
type 0 (sampled) functions, and in what aspects do you expect the
optimization?

Optimization for speed. Especially the axial shading. It gets slow when
the shaded area is very large.

The cubic spline interpolation is mentioned in the PDF spec at the type 0
(sampled) functions; it is the part where order = 3. In the PDF spec,
search for it, or for "Additional entries specific to a type 0 function
dictionary". It's really just a nice-to-have and of low priority. There's
a note from Adobe saying that it is not done for printing.

Tilman

On Tue, Mar 4, 2014 at 10:13 PM, Tilman Hausherr
thaush...@t-online.de

wrote:

Am 04.03.2014 15:19, schrieb Thimal Kempitiya:

Hi,

I checked the code related to the shading and studied the pdf spec

related
to the type 6. As I see it is

[jira] [Commented] (PDFBOX-1987) Provide a PDF Lexer as a base for PDF parsing

2014-03-18 Thread Maruan Sahyoun (JIRA)


[ 
https://issues.apache.org/jira/browse/PDFBOX-1987?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13939008#comment-13939008
 ] 

Maruan Sahyoun commented on PDFBOX-1987:


PDFBOX-276 describes such a file. PDF.js has some files with invalid hex 
strings. There are some files which have missing CR and/or LF at the end of a 
stream ...
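
As an illustration of the kind of leniency a lexer would need for invalid hex strings, here is a minimal standalone sketch. It is not taken from PDFBox or from the attached src.zip: the class LenientHexString and the policy of silently skipping invalid characters are assumptions for the example. The handling of whitespace and of an odd number of digits (the final digit is assumed to be 0) follows the PDF spec's rules for hexadecimal strings.

```java
import java.io.ByteArrayOutputStream;

/** Sketch: lenient decoding of the text between '<' and '>' of a PDF hex string. */
final class LenientHexString {

    static byte[] decode(String body) {
        ByteArrayOutputStream out = new ByteArrayOutputStream();
        int high = -1; // pending high nibble, or -1 if none
        for (int k = 0; k < body.length(); k++) {
            int digit = hexValue(body.charAt(k));
            if (digit < 0) {
                continue; // whitespace, or an invalid character that we choose to skip
            }
            if (high < 0) {
                high = digit;
            } else {
                out.write((high << 4) | digit);
                high = -1;
            }
        }
        if (high >= 0) {
            out.write(high << 4); // odd number of digits: last digit is followed by an implied 0
        }
        return out.toByteArray();
    }

    /** Returns 0..15 for an ASCII hex digit, -1 for anything else. */
    private static int hexValue(char c) {
        if (c >= '0' && c <= '9') return c - '0';
        if (c >= 'a' && c <= 'f') return c - 'a' + 10;
        if (c >= 'A' && c <= 'F') return c - 'A' + 10;
        return -1;
    }
}
```

Other recovery policies are possible, for example ending the string at the first invalid character; which one matches real-world files best is exactly the kind of question such test files help answer.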

 Provide a PDF Lexer as a base for PDF parsing
 -

 Key: PDFBOX-1987
 URL: https://issues.apache.org/jira/browse/PDFBOX-1987
 Project: PDFBox
  Issue Type: Improvement
  Components: Parsing
Reporter: Maruan Sahyoun
Priority: Minor
 Fix For: 2.0.0

 Attachments: src.zip


 In order to enhance the parsing process and as a foundation for a combination 
 of the different parsers a PDF lexer should be provided.



--
This message was sent by Atlassian JIRA
(v6.2#6252)

[jira] [Commented] (PDFBOX-1512) TextPositionComparator is not compatible with Java 7

2014-03-18 Thread Hannes Erven (JIRA)


[ 
https://issues.apache.org/jira/browse/PDFBOX-1512?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13939027#comment-13939027
 ] 

Hannes Erven commented on PDFBOX-1512:
--

The issue is not related to a specific sorting algorithm. At the moment, the 
desired order of the elements is not sufficiently defined.

Some algorithms don't care, which may result in inconsistent ordering across 
multiple calls; other algorithms (such as the ones Java 7 defaults to) detect 
this and throw an exception.

I did try hacking the Comparator, but so far it doesn't pass the TextExtract 
test cases :-(

 TextPositionComparator is not compatible with Java 7
 

 Key: PDFBOX-1512
 URL: https://issues.apache.org/jira/browse/PDFBOX-1512
 Project: PDFBox
  Issue Type: Bug
  Components: Text extraction
Affects Versions: 1.7.1
 Environment: Java 7
Reporter: Benjamin Papez
Assignee: Andreas Lehmkühler
 Attachments: FOP-2252.pdf, TextPositionComparator.java, 
 WFI_PDFParser_TextPostionComparator.txt, immo-kurier_arsenal_93x62.pdf


 The TextPositionComparator causes the following exception running on Java 7: 
 Unexpected RuntimeException from 
 org.apache.tika.parser.ParserDecorator$1@9007fa2 Original cause: Comparison 
 method violates its general contract!
 I think the problem is with this check:
 if ( yDifference < .1 ||
 (pos2YBottom >= pos1YTop && pos2YBottom <= pos1YBottom) ||
 (pos1YBottom >= pos2YTop && pos1YBottom <= pos2YBottom))
 as it violates the contract requirement:
 The implementor must also ensure that the relation is transitive: 
 ((compare(x, y)>0) && (compare(y, z)>0)) implies compare(x, z)>0.
 Finally, the implementor must ensure that compare(x, y)==0 implies that 
 sgn(compare(x, z))==sgn(compare(y, z)) for all z.
 Java 7 now is strict and throws exceptions when the contract is violated.
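
 To make the contract violation concrete, here is a small standalone Java sketch. It does not use the real TextPositionComparator: the class TolerantComparatorDemo, the comparator TOLERANT and the helper violatesContract are hypothetical, and the comparator works on plain numbers with a 0.1 tolerance, similar in spirit to the yDifference check quoted above. With the values 0.0, 0.08 and 0.16, the first two compare as equal, yet they compare differently against the third, which is what the last contract clause forbids.

```java
import java.util.Comparator;

/** Sketch: a tolerance-based comparator that breaks the Comparator contract. */
final class TolerantComparatorDemo {

    /** Treats two values as equal when they differ by less than 0.1. */
    static final Comparator<Double> TOLERANT = new Comparator<Double>() {
        @Override
        public int compare(Double a, Double b) {
            if (Math.abs(a - b) < 0.1) {
                return 0;
            }
            return a < b ? -1 : 1;
        }
    };

    /**
     * Returns true if compare(x, y) == 0 although
     * sgn(compare(x, z)) != sgn(compare(y, z)), i.e. the contract is violated.
     */
    static <T> boolean violatesContract(Comparator<T> c, T x, T y, T z) {
        return c.compare(x, y) == 0
                && Integer.signum(c.compare(x, z)) != Integer.signum(c.compare(y, z));
    }
}
```

 A sort implementation is not guaranteed to notice such a violation for every input, which would explain why the exception only shows up with some documents.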




[jira] [Resolved] (PDFBOX-1969) JPEGFactory bug

2014-03-18 Thread Tilman Hausherr (JIRA)


 [ 
https://issues.apache.org/jira/browse/PDFBOX-1969?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Tilman Hausherr resolved PDFBOX-1969.
-

Resolution: Fixed

 JPEGFactory bug
 ---

 Key: PDFBOX-1969
 URL: https://issues.apache.org/jira/browse/PDFBOX-1969
 Project: PDFBox
  Issue Type: Bug
Affects Versions: 2.0.0
Reporter: Steven Burg
 Fix For: 2.0.0


 Attempted to run the RubberStampWithImage sample and received the following 
 errors:
 Exception in thread "main" java.lang.NullPointerException
     at org.apache.pdfbox.pdmodel.graphics.image.JPEGFactory.createFromStream(JPEGFactory.java:72)
     at org.apache.pdfbox.examples.pdmodel.RubberStampWithImage.doIt(RubberStampWithImage.java:93)
     at org.apache.pdfbox.examples.pdmodel.RubberStampWithImage.main(RubberStampWithImage.java:185)
 This happens with any jpg I tested with.




Re: [GSoC 2014]Implement shading with Coons and tensor-product patch meshes

2014-03-18 Thread Tilman Hausherr


Am 18.03.2014 10:41, schrieb Thimal Kempitiya:

I would be glad if you can give feed back on my proposal.


Hello,
The URL doesn't work except for you, but the text appeared in the mentors 
list and I can also see it in the dashboard. I will give feedback there.

Tilman

[jira] [Updated] (PDFBOX-1466) Rendering of pattern colorspace fails

2014-03-18 Thread Tilman Hausherr (JIRA)


 [ 
https://issues.apache.org/jira/browse/PDFBOX-1466?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Tilman Hausherr updated PDFBOX-1466:


Attachment: pdfbox-1466-01-img10.png
pdfbox-1466-01-img9.png

Here are two blurry images I found within the file. They appear in the 
rendering, but not in Adobe Reader. Very mysterious.

 Rendering of pattern colorspace fails
 -

 Key: PDFBOX-1466
 URL: https://issues.apache.org/jira/browse/PDFBOX-1466
 Project: PDFBox
  Issue Type: Bug
  Components: Rendering
Affects Versions: 1.7.1, 1.8.4, 2.0.0
 Environment: Windows 7, JDK 1.6 / 1.7
Reporter: Maurice Koch
  Labels: tilingpattern
 Fix For: 2.0.0

 Attachments: pdfbox-1466-01-img10.png, pdfbox-1466-01-img9.png, 
 pdfbox-1466.pdf-1.png, report.pdf, report.png


 I was trying to print a pdf which was generated by iText v2.1.5. 
 Unfortunately parts of it were printed in white – the filling color was 
 missing. I could reduce the problem to the attached PDF. When trying to print 
 with e.g. PDDocument.silentPrint I get the following info message:
 [INFO] [org.apache.pdfbox.pdfviewer.PageDrawer] ColorSpace Pattern doesn't 
 provide a non-stroking color, using white instead!




[jira] [Comment Edited] (PDFBOX-1466) Rendering of pattern colorspace fails

2014-03-18 Thread Tilman Hausherr (JIRA)


[ 
https://issues.apache.org/jira/browse/PDFBOX-1466?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13934711#comment-13934711
 ] 

Tilman Hausherr edited comment on PDFBOX-1466 at 3/18/14 1:11 PM:
--

Here's a current rendering, it is almost perfect now. The only problem left is 
a weird shadow. I'm not sure whether the shadow is related to the pattern 
colorspace.


was (Author: tilman):
Here's a current rendering, it is almost perfect now. The only problem left is 
a weird shadow. I'm not sure whether the shadow is related to the pattern 
colorspace. It might be a similar problem as in PDFBOX-1830 and PDFBOX-1954 
(line width).


[jira] [Updated] (PDFBOX-1466) Rendering of pattern colorspace fails

2014-03-18 Thread Maruan Sahyoun (JIRA)


 [ 
https://issues.apache.org/jira/browse/PDFBOX-1466?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Maruan Sahyoun updated PDFBOX-1466:
---

Attachment: report_Seite_1_Bild_0004.png
report_Seite_1_Bild_0003.png
report_Seite_1_Bild_0002.png
report_Seite_1_Bild_0001.png

These are the images in use within the PDF as exported by Adobe Acrobat. 


[jira] [Commented] (PDFBOX-1466) Rendering of pattern colorspace fails

2014-03-18 Thread Maruan Sahyoun (JIRA)


[ 
https://issues.apache.org/jira/browse/PDFBOX-1466?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13939235#comment-13939235
 ] 

Maruan Sahyoun commented on PDFBOX-1466:


I added the images in use as exported by Adobe Acrobat. 
report_Seite_1_Bild_0004.png looks like it’s the same as 
pdfbox-1466-01-img10.png. pdfbox-1466-01-img9.png could be a mask, as masks 
are used within the PDF, judging from an inspection of how it was generated.




[jira] [Commented] (PDFBOX-1990) Support creating PDF from lossless encoded images

2014-03-18 Thread Tilman Hausherr (JIRA)


[ 
https://issues.apache.org/jira/browse/PDFBOX-1990?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13939325#comment-13939325
 ] 

Tilman Hausherr commented on PDFBOX-1990:
-

NullOutputStream optimized for speed in rev 1578940.

 Support creating PDF from lossless encoded images
 -

 Key: PDFBOX-1990
 URL: https://issues.apache.org/jira/browse/PDFBOX-1990
 Project: PDFBox
  Issue Type: Improvement
Reporter: Tilman Hausherr
Priority: Minor

 Currently we support the insertion of TIFF and JPEG into a PDF, but not PNG. 
 We can pass a BufferedImage, but this one will be JPEG compressed which is 
 not a good thing for graphics with sharp edges. I suggest that we support PNG 
 as well. It is possible because the Flate Filter supports both directions.
 My implementation (coming in a few minutes) is just an RGB based start that 
 begs for improvement.
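
 The lossless path described above can be illustrated with the JDK alone. The sketch below is not the proposed PDFBox implementation: the class FlateRgbSketch and its methods are hypothetical. It packs the pixels of a BufferedImage as 8-bit RGB triples, compresses them with java.util.zip.Deflater (zlib format, which is what the Flate filter uses) and inflates them again, so the round trip returns exactly the original bytes.

```java
import java.awt.image.BufferedImage;
import java.io.ByteArrayOutputStream;
import java.util.zip.DataFormatException;
import java.util.zip.Deflater;
import java.util.zip.Inflater;

/** Sketch: lossless Flate round trip of raw RGB samples taken from a BufferedImage. */
final class FlateRgbSketch {

    /** Packs the pixels as 8-bit R, G, B triples, row by row (alpha is ignored). */
    static byte[] toRgbBytes(BufferedImage image) {
        int w = image.getWidth();
        int h = image.getHeight();
        byte[] rgb = new byte[w * h * 3];
        int k = 0;
        for (int y = 0; y < h; y++) {
            for (int x = 0; x < w; x++) {
                int argb = image.getRGB(x, y);
                rgb[k++] = (byte) (argb >> 16);
                rgb[k++] = (byte) (argb >> 8);
                rgb[k++] = (byte) argb;
            }
        }
        return rgb;
    }

    /** Compresses data into a zlib stream. */
    static byte[] deflate(byte[] data) {
        Deflater deflater = new Deflater();
        deflater.setInput(data);
        deflater.finish();
        ByteArrayOutputStream out = new ByteArrayOutputStream();
        byte[] buffer = new byte[4096];
        while (!deflater.finished()) {
            int n = deflater.deflate(buffer);
            out.write(buffer, 0, n);
        }
        deflater.end();
        return out.toByteArray();
    }

    /** Decompresses a zlib stream whose uncompressed length is known. */
    static byte[] inflate(byte[] data, int expectedLength) {
        Inflater inflater = new Inflater();
        inflater.setInput(data);
        byte[] result = new byte[expectedLength];
        int offset = 0;
        try {
            while (offset < expectedLength && !inflater.finished()) {
                int n = inflater.inflate(result, offset, expectedLength - offset);
                if (n == 0 && (inflater.needsInput() || inflater.needsDictionary())) {
                    break; // truncated or unsupported input
                }
                offset += n;
            }
        } catch (DataFormatException e) {
            throw new IllegalStateException("corrupt Flate data", e);
        } finally {
            inflater.end();
        }
        return result;
    }
}
```

 Unlike the JPEG path, nothing is lost here, which is why sharp edges survive; the price is usually a larger stream for photographic content.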




[jira] [Commented] (PDFBOX-1990) Support creating PDF from lossless encoded images

2014-03-18 Thread John Hewson (JIRA)


[ 
https://issues.apache.org/jira/browse/PDFBOX-1990?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13939626#comment-13939626
 ] 

John Hewson commented on PDFBOX-1990:
-

This looks good, a couple of random thoughts:

1) The static factory method doesn't need the word Lossless in it, because 
it's already in the factory name:
{code}
LosslessFactory.createLosslessFromImage(...)
{code}

vs.
{code}
LosslessFactory.createFromImage(...)
{code}

2) When naming variables for parameters in the public API, prefer short words 
over abbreviations, e.g.

{code}
BufferedImage bim   ---   BufferedImage image
{code}

Also, if a local variable name is already a short word, prefer not abbreviating 
it:

{code}
Color co   ---   Color color
{code}

:)


[jira] [Comment Edited] (PDFBOX-1990) Support creating PDF from lossless encoded images

2014-03-18 Thread John Hewson (JIRA)


[ 
https://issues.apache.org/jira/browse/PDFBOX-1990?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13939626#comment-13939626
 ] 

John Hewson edited comment on PDFBOX-1990 at 3/18/14 6:42 PM:
--

This looks good, a couple of random thoughts:

1) The static factory method doesn't need the word Lossless in it, because 
it's already in the factory name:
{code}
LosslessFactory.createLosslessFromImage(...)
{code}

vs.
{code}
LosslessFactory.createFromImage(...)
{code}

2) When naming variables for parameters in the public API, prefer short words 
over abbreviations, e.g.

{code}
BufferedImage bim   ---   BufferedImage image
{code}

3) Also, if a local variable name is already a short word, prefer not 
abbreviating it:

{code}
Color co   ---   Color color
{code}

:)


was (Author: jahewson):
This looks good, a couple of random thoughts:

1) The static factory method doesn't need the word Lossless in it, because 
it's already in the factory name:
{code}
LosslessFactory.createLosslessFromImage(...)
{code}

vs.
{code}
LosslessFactory.createFromImage(...)
{code}

2) When naming variables for parameters in the public API, prefer short words 
over abbreviations, e.g.

{code}
BufferedImage bim   ---   BufferedImage image
{code}

Also, if a local variable name is already a short word, prefer not abbreviating 
it:

{code}
Color co   ---   Color color
{code}

:)


[jira] [Commented] (PDFBOX-1987) Provide a PDF Lexer as a base for PDF parsing

2014-03-18 Thread John Hewson (JIRA)


[ 
https://issues.apache.org/jira/browse/PDFBOX-1987?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13939630#comment-13939630
 ] 

John Hewson commented on PDFBOX-1987:
-

Thanks, it's a tricky problem to solve.


[jira] [Commented] (PDFBOX-1512) TextPositionComparator is not compatible with Java 7

2014-03-18 Thread John Hewson (JIRA)


[ 
https://issues.apache.org/jira/browse/PDFBOX-1512?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13939639#comment-13939639
 ] 

John Hewson commented on PDFBOX-1512:
-

{quote}
Some algorithms don't care, which may result in inconsistent ordering across 
multiple calls
{quote}

I don't see how this would happen unless the algorithm was randomised. For a 
given input the output should always be the same, regardless.

But as you say the ordering is not sufficiently defined, so there may be more 
than one sort which is allowed. Perhaps we need some more sophisticated rule 
for determining reading order? Perhaps [Topological 
sorting|http://en.wikipedia.org/wiki/Topological_sorting] may be of relevance?
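
For readers unfamiliar with the idea, here is a minimal standalone sketch of a topological sort (Kahn's algorithm). It only illustrates the general technique and is not a proposal for the PDFBox API: the class TopologicalOrderSketch, the integer node ids and the meaning of an edge ("from must be read before to") are assumptions for the example. Ties between unconstrained nodes are broken by the smallest id so that the result is deterministic.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.PriorityQueue;

/** Sketch: Kahn's algorithm over "must come before" constraints. */
final class TopologicalOrderSketch {

    /**
     * Nodes are 0..n-1; each edge {from, to} means 'from' must come before 'to'.
     * Returns an order that respects all edges, or null if the constraints are cyclic.
     */
    static List<Integer> sort(int n, int[][] edges) {
        List<List<Integer>> successors = new ArrayList<List<Integer>>();
        int[] inDegree = new int[n];
        for (int v = 0; v < n; v++) {
            successors.add(new ArrayList<Integer>());
        }
        for (int[] e : edges) {
            successors.get(e[0]).add(e[1]);
            inDegree[e[1]]++;
        }
        // Nodes with no unmet predecessors; the smallest id is taken first.
        PriorityQueue<Integer> ready = new PriorityQueue<Integer>();
        for (int v = 0; v < n; v++) {
            if (inDegree[v] == 0) {
                ready.add(v);
            }
        }
        List<Integer> order = new ArrayList<Integer>();
        while (!ready.isEmpty()) {
            int v = ready.poll();
            order.add(v);
            for (int s : successors.get(v)) {
                if (--inDegree[s] == 0) {
                    ready.add(s);
                }
            }
        }
        return order.size() == n ? order : null;
    }
}
```

Unlike a comparator, this works from pairwise constraints that need not be transitive by themselves, but it reports failure when the constraints contradict each other.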


[jira] [Comment Edited] (PDFBOX-1512) TextPositionComparator is not compatible with Java 7

2014-03-18 Thread John Hewson (JIRA)


[ 
https://issues.apache.org/jira/browse/PDFBOX-1512?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13939639#comment-13939639
 ] 

John Hewson edited comment on PDFBOX-1512 at 3/18/14 6:55 PM:
--

{quote}
Some algorithms don't care, which may result in inconsistent ordering across 
multiple calls
{quote}

I don't see how this would happen unless the algorithm was randomised. For a 
given input the output should always be the same, regardless.

But as you say the ordering is not sufficiently defined, so there may be more 
than one solution. Perhaps we need some more sophisticated rule for determining 
reading order? Perhaps [Topological 
sorting|http://en.wikipedia.org/wiki/Topological_sorting] may be of relevance?


was (Author: jahewson):
{quote}
Some algorithms don't care, which may result in inconsistent ordering across 
multiple calls
{quote}

I don't see how this would happen unless the algorithm was randomised. For a 
given input the output should always be the same, regardless.

But as you say the ordering is not sufficiently defined, so there may be more 
than one sort which allowed. Perhaps we need some more sophisticated rule for 
determining reading order? Perhaps [Topological 
sorting|http://en.wikipedia.org/wiki/Topological_sorting] may be of relevance?


[jira] [Comment Edited] (PDFBOX-1512) TextPositionComparator is not compatible with Java 7

2014-03-18 Thread John Hewson (JIRA)


[ 
https://issues.apache.org/jira/browse/PDFBOX-1512?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13939639#comment-13939639
 ] 

John Hewson edited comment on PDFBOX-1512 at 3/18/14 6:55 PM:
--

{quote}
Some algorithms don't care, which may result in inconsistent ordering across 
multiple calls
{quote}

I don't see how this would happen unless the algorithm was randomised. For a 
given input the output should always be the same, regardless.

But as you say the ordering is not sufficiently defined, so there may be more 
than one solution. Perhaps we need some more sophisticated rules for 
determining reading order? Perhaps [Topological 
sorting|http://en.wikipedia.org/wiki/Topological_sorting] may be of relevance?


was (Author: jahewson):
{quote}
Some algorithms don't care, which may result in inconsistent ordering across 
multiple calls
{quote}

I don't see how this would happen unless the algorithm was randomised. For a 
given input the output should always be the same, regardless.

But as you say the ordering is not sufficiently defined, so there may be more 
than one solution. Perhaps we need some more sophisticated rule for determining 
reading order? Perhaps [Topological 
sorting|http://en.wikipedia.org/wiki/Topological_sorting] may be of relevance?


[jira] [Comment Edited] (PDFBOX-1512) TextPositionComparator is not compatible with Java 7

2014-03-18 Thread John Hewson (JIRA)


[ 
https://issues.apache.org/jira/browse/PDFBOX-1512?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13939639#comment-13939639
 ] 

John Hewson edited comment on PDFBOX-1512 at 3/18/14 6:55 PM:
--

{quote}
Some algorithms don't care, which may result in inconsistent ordering across 
multiple calls
{quote}

I don't see how this would happen unless the algorithm was randomised. For a 
given input the output should always be the same, regardless.

But as you say the ordering is not sufficiently defined, so there may be more 
than one solution. Perhaps we need some more sophisticated rules for 
determining reading order? Could [Topological 
sorting|http://en.wikipedia.org/wiki/Topological_sorting] be of relevance?


was (Author: jahewson):
{quote}
Some algorithms don't care, which may result in inconsistent ordering across 
multiple calls
{quote}

I don't see how this would happen unless the algorithm was randomised. For a 
given input the output should always be the same, regardless.

But as you say the ordering is not sufficiently defined, so there may be more 
than one solution. Perhaps we need some more sophisticated rules for 
determining reading order? Perhaps [Topological 
sorting|http://en.wikipedia.org/wiki/Topological_sorting] may be of relevance?


[jira] [Comment Edited] (PDFBOX-1512) TextPositionComparator is not compatible with Java 7

2014-03-18 Thread John Hewson (JIRA)


[ 
https://issues.apache.org/jira/browse/PDFBOX-1512?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13939639#comment-13939639
 ] 

John Hewson edited comment on PDFBOX-1512 at 3/18/14 6:56 PM:
--

{quote}
Some algorithms don't care, which may result in inconsistent ordering across 
multiple calls
{quote}

I don't see how this would happen unless the algorithm was randomised. For a 
given input the output should always be the same, regardless.

But as you say the ordering is not sufficiently defined, so there may be more 
than one solution. Perhaps we need some more sophisticated rules for 
determining reading order? Off the top of my head, it seems like [Topological 
sorting|http://en.wikipedia.org/wiki/Topological_sorting] may be of relevance.


was (Author: jahewson):
{quote}
Some algorithms don't care, which may result in inconsistent ordering across 
multiple calls
{quote}

I don't see how this would happen unless the algorithm was randomised. For a 
given input the output should always be the same, regardless.

But as you say the ordering is not sufficiently defined, so there may be more 
than one solution. Perhaps we need some more sophisticated rules for 
determining reading order? [Topological 
sorting|http://en.wikipedia.org/wiki/Topological_sorting] may be of relevance.


[jira] [Comment Edited] (PDFBOX-1512) TextPositionComparator is not compatible with Java 7

2014-03-18 Thread John Hewson (JIRA)


[ 
https://issues.apache.org/jira/browse/PDFBOX-1512?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13939639#comment-13939639
 ] 

John Hewson edited comment on PDFBOX-1512 at 3/18/14 6:55 PM:
--

{quote}
Some algorithms don't care, which may result in inconsistent ordering across 
multiple calls
{quote}

I don't see how this would happen unless the algorithm was randomised. For a 
given input the output should always be the same, regardless.

But as you say the ordering is not sufficiently defined, so there may be more 
than one solution. Perhaps we need some more sophisticated rules for 
determining reading order? [Topological 
sorting|http://en.wikipedia.org/wiki/Topological_sorting] may be of relevance.


was (Author: jahewson):
{quote}
Some algorithms don't care, which may result in inconsistent ordering across 
multiple calls
{quote}

I don't see how this would happen unless the algorithm was randomised. For a 
given input the output should always be the same, regardless.

But as you say the ordering is not sufficiently defined, so there may be more 
than one solution. Perhaps we need some more sophisticated rules for 
determining reading order? Could [Topological 
sorting|http://en.wikipedia.org/wiki/Topological_sorting] be of relevance?


[jira] [Commented] (PDFBOX-1466) Rendering of pattern colorspace fails

2014-03-18 Thread John Hewson (JIRA)


[ 
https://issues.apache.org/jira/browse/PDFBOX-1466?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13939651#comment-13939651
 ] 

John Hewson commented on PDFBOX-1466:
-

The two blurry images do appear in Adobe Reader, over the green border around 
the star. You have to zoom in to around 2000% to be able to see them though.

 Rendering of pattern colorspace fails
 -

 Key: PDFBOX-1466
 URL: https://issues.apache.org/jira/browse/PDFBOX-1466
 Project: PDFBox
  Issue Type: Bug
  Components: Rendering
Affects Versions: 1.7.1, 1.8.4, 2.0.0
 Environment: Windows 7, JDK 1.6 / 1.7
Reporter: Maurice Koch
  Labels: tilingpattern
 Fix For: 2.0.0

 Attachments: pdfbox-1466-01-img10.png, pdfbox-1466-01-img9.png, 
 pdfbox-1466.pdf-1.png, report.pdf, report.png, report_Seite_1_Bild_0001.png, 
 report_Seite_1_Bild_0002.png, report_Seite_1_Bild_0003.png, 
 report_Seite_1_Bild_0004.png


 I was trying to print a PDF which was generated by iText v2.1.5. 
 Unfortunately parts of it were printed in white – the filling color was 
 missing. I could reduce the problem to the attached PDF. When trying to print 
 with e.g. PDDocument.silentPrint I get the following info message:
 [INFO] [org.apache.pdfbox.pdfviewer.PageDrawer] ColorSpace Pattern doesn't 
 provide a non-stroking color, using white instead!




[jira] [Comment Edited] (PDFBOX-1466) Rendering of pattern colorspace fails

2014-03-18 Thread John Hewson (JIRA)


[ 
https://issues.apache.org/jira/browse/PDFBOX-1466?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13939651#comment-13939651
 ] 

John Hewson edited comment on PDFBOX-1466 at 3/18/14 7:00 PM:
--

Tilman, the two blurry images do appear in Adobe Reader, over the green 
border around the star. You have to zoom in to around 2000% to be able to see 
them though.


was (Author: jahewson):
The two blurry images do appear in Adobe Reader, over the green border around 
the star. You have to zoom in to around 2000% to be order to see them though.


[jira] [Comment Edited] (PDFBOX-1466) Rendering of pattern colorspace fails

2014-03-18 Thread John Hewson (JIRA)


[ 
https://issues.apache.org/jira/browse/PDFBOX-1466?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13939651#comment-13939651
 ] 

John Hewson edited comment on PDFBOX-1466 at 3/18/14 7:01 PM:
--

Tilman, the two blurry images do appear in Adobe Reader, over the green 
border around the star. You have to zoom in to around 2000% to be able to see 
them.


was (Author: jahewson):
Tilman, the two blurry images do appear in Adobe Reader, over the green 
border around the star. You have to zoom in to around 2000% to be able to see 
them though.


[jira] [Comment Edited] (PDFBOX-1466) Rendering of pattern colorspace fails

2014-03-18 Thread John Hewson (JIRA)


[ 
https://issues.apache.org/jira/browse/PDFBOX-1466?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13939651#comment-13939651
 ] 

John Hewson edited comment on PDFBOX-1466 at 3/18/14 7:00 PM:
--

Tilman, the two blurry images do appear in Adobe Reader, over the green 
border around the star. You have to zoom in to around 2000% to be able to see 
them though.


was (Author: jahewson):
Tilman, the two blurry images do appear in Adobe Reader, over the green 
border around the star. You have to zoom in to around 2000% to be order to see 
them though.


Removing processStream and processSubStream

2014-03-18 Thread John Hewson

Hi All

I’m still working on getting Tiling Patterns to render correctly, and need to 
make some changes to core PDFBox functionality in order to proceed. My problem 
is that tiling patterns are defined in their parent stream’s initial coordinate 
space, rather than the coordinate space defined by the CTM. However, in PDFBox 
there is no way to access the parent stream, so I can’t find out what its 
initial matrix is. The manner in which the initial coordinate space is 
determined is different for pages, forms, and patterns.

What this means is that the parent stream’s initial coordinate space needs to 
be passed to processStream and processSubStream in PDFStreamEngine. This will 
necessarily be a breaking change, and it will affect all downstream subclasses 
of PDFStreamEngine.

Because this has to be a breaking change, I propose that we go all the way and 
make the new API bulletproof, 1) so that we won’t have to introduce breaking 
changes in the future if we encounter similar issues, and 2) so that the caller 
of the method can’t pass the wrong data in the parameters. We would remove the 
two generic methods:

public void processStream(PDResources resources, COSStream cosStream, PDRectangle drawingSize, int rotation)
public void processSubStream(PDResources resources, COSStream cosStream)

and replace them with four specific methods:

public void processPage(PDPage page)
public void processForm(PDFormXObject form)
public void processTilingPattern(PDTilingPattern pattern)
public void processType3Font(PDType3Font font)

This would mean that the various “process” methods have access to their parent 
stream, and can read any of its public fields in the future without introducing 
breaking changes by altering the method’s parameters.
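
To make the idea concrete, here is a minimal sketch of how typed entry points could let the engine resolve a pattern against its parent's initial matrix instead of the CTM. Everything in it is an assumption for illustration: PageStub, PatternStub and TypedStreamEngineSketch are hypothetical stand-ins, not the real PDPage, PDTilingPattern or PDFStreamEngine, and the content stream is reduced to a Runnable.

```java
import java.awt.geom.AffineTransform;
import java.util.ArrayDeque;
import java.util.Deque;

/** Hypothetical stand-in for a page; only its initial coordinate space is modelled. */
final class PageStub {
    final AffineTransform initialMatrix;

    PageStub(AffineTransform initialMatrix) {
        this.initialMatrix = initialMatrix;
    }
}

/** Hypothetical stand-in for a tiling pattern; only its pattern matrix is modelled. */
final class PatternStub {
    final AffineTransform matrix;

    PatternStub(AffineTransform matrix) {
        this.matrix = matrix;
    }
}

/** Sketch of an engine whose typed methods keep track of the parent's initial matrix. */
final class TypedStreamEngineSketch {
    private final Deque<AffineTransform> initialMatrices = new ArrayDeque<>();
    private AffineTransform ctm = new AffineTransform();

    /** Records the page's initial matrix for the duration of its content. */
    void processPage(PageStub page, Runnable contents) {
        initialMatrices.push(new AffineTransform(page.initialMatrix));
        ctm = new AffineTransform(page.initialMatrix);
        try {
            contents.run();
        } finally {
            initialMatrices.pop();
        }
    }

    /** Models a "cm" operator: changes the CTM but never the recorded initial matrix. */
    void concatenate(AffineTransform matrix) {
        ctm.concatenate(matrix);
    }

    /** Maps pattern space through the parent's initial matrix, ignoring the current CTM. */
    AffineTransform processTilingPattern(PatternStub pattern) {
        AffineTransform parentInitial = initialMatrices.peek();
        if (parentInitial == null) {
            throw new IllegalStateException("no parent stream is being processed");
        }
        AffineTransform patternToDevice = new AffineTransform(parentInitial);
        patternToDevice.concatenate(pattern.matrix);
        return patternToDevice;
    }
}
```

Because the caller hands over a whole page or pattern rather than loose parameters, the engine decides which matrix applies, and the caller cannot pass the wrong one.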

What do you think?

-- John

[jira] [Commented] (PDFBOX-1466) Rendering of pattern colorspace fails

2014-03-18 Thread Tilman Hausherr (JIRA)


[ 
https://issues.apache.org/jira/browse/PDFBOX-1466?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13939856#comment-13939856
 ] 

Tilman Hausherr commented on PDFBOX-1466:
-

Yeah, looking at 2000%, a blurry effect appears briefly on the green star 
before the red star is painted, and after that a remnant is still visible.


[jira] [Commented] (PDFBOX-1990) Support creating PDF from lossless encoded images

2014-03-18 Thread Tilman Hausherr (JIRA)


[ 
https://issues.apache.org/jira/browse/PDFBOX-1990?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13939884#comment-13939884
 ] 

Tilman Hausherr commented on PDFBOX-1990:
-

Done as suggested in rev 1579073 and 1579074.

 Support creating PDF from lossless encoded images
 -

 Key: PDFBOX-1990
 URL: https://issues.apache.org/jira/browse/PDFBOX-1990
 Project: PDFBox
  Issue Type: Improvement
Reporter: Tilman Hausherr
Priority: Minor

 Currently we support the insertion of TIFF and JPEG into a PDF, but not PNG. 
 We can pass a BufferedImage, but this one will be JPEG compressed which is 
 not a good thing for graphics with sharp edges. I suggest that we support PNG 
 as well. It is possible because the Flate Filter supports both directions.
 My implementation (coming in a few minutes) is just an RGB based start that 
 begs for improvement.
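
 The RGB-based approach can be sketched roughly as follows. This is only an illustration under assumptions: it uses java.util.zip directly rather than PDFBox's Flate filter, the class and method names are hypothetical, and it produces only the compressed sample data, not the surrounding image XObject dictionary.

```java
import java.awt.image.BufferedImage;
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.UncheckedIOException;
import java.util.zip.DeflaterOutputStream;
import java.util.zip.InflaterInputStream;

/** Hypothetical sketch: lossless Flate (zlib) compression of 8-bit RGB samples. */
final class LosslessRgbSketch {

    /** Packs the pixels row by row as R, G, B bytes; any alpha channel is dropped. */
    static byte[] rgbBytes(BufferedImage image) {
        int width = image.getWidth();
        int height = image.getHeight();
        byte[] samples = new byte[width * height * 3];
        int i = 0;
        for (int y = 0; y < height; y++) {
            for (int x = 0; x < width; x++) {
                int argb = image.getRGB(x, y);
                samples[i++] = (byte) (argb >> 16);
                samples[i++] = (byte) (argb >> 8);
                samples[i++] = (byte) argb;
            }
        }
        return samples;
    }

    /** Compresses the samples into a zlib stream, the format FlateDecode expects. */
    static byte[] deflate(byte[] raw) {
        ByteArrayOutputStream buffer = new ByteArrayOutputStream();
        try (DeflaterOutputStream out = new DeflaterOutputStream(buffer)) {
            out.write(raw);
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
        return buffer.toByteArray();
    }

    /** Reverses deflate; used here only to show that the round trip is lossless. */
    static byte[] inflate(byte[] compressed) {
        ByteArrayOutputStream buffer = new ByteArrayOutputStream();
        try (InflaterInputStream in =
                new InflaterInputStream(new ByteArrayInputStream(compressed))) {
            byte[] chunk = new byte[4096];
            int n;
            while ((n = in.read(chunk)) != -1) {
                buffer.write(chunk, 0, n);
            }
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
        return buffer.toByteArray();
    }
}
```

 Since inflating returns exactly the bytes that were deflated, sharp edges survive unchanged, which is the point of preferring this over JPEG for graphics.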




[jira] [Created] (PDFBOX-1991) Shading PaintContexts should not depend on the page height

2014-03-18 Thread John Hewson (JIRA)

John Hewson created PDFBOX-1991:
---

 Summary: Shading PaintContexts should not depend on the page height
 Key: PDFBOX-1991
 URL: https://issues.apache.org/jira/browse/PDFBOX-1991
 Project: PDFBox
  Issue Type: Improvement
  Components: Rendering
Affects Versions: 2.0.0
Reporter: John Hewson
Priority: Minor


I'd like to remove the page height parameter from PDPattern as soon as possible 
because of doubts over its safety (i.e. the current stream being processed may 
be a pattern or a form, not a page). Before I do that we need to remove its 
only use, which is...

The page height is passed to all shading PaintContext subclasses but it is only 
used in GouraudShadingContext. All other drawing in PDFBox is done using the 
native PDF y-axis, which is flipped via a call to Graphics2D#scale(1, -1), but 
the following code in GouraudShadingContext flips the y-axis itself:

{code}
v.point = new Point.Double(v.point.getX(), pageHeight + xform.getTranslateY() - v.point.getY());
{code}

So it seems like this could be removed and the y-axis inversion done elsewhere 
with either a Matrix, AffineTransform or Graphics2D#scale.
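
For illustration, the manual flip can be written as a single AffineTransform. This is a sketch under assumptions: YFlipSketch and its methods are hypothetical names, and pageHeight and translateY stand in for the values the shading context receives.

```java
import java.awt.geom.AffineTransform;
import java.awt.geom.Point2D;

/** Hypothetical sketch comparing the manual y-flip with an equivalent AffineTransform. */
final class YFlipSketch {

    /** The flip as written in the snippet above: y' = pageHeight + translateY - y. */
    static Point2D manualFlip(Point2D p, double pageHeight, double translateY) {
        return new Point2D.Double(p.getX(), pageHeight + translateY - p.getY());
    }

    /** The same mapping as a transform: negate y first, then shift up. */
    static AffineTransform flipTransform(double pageHeight, double translateY) {
        AffineTransform flip =
                AffineTransform.getTranslateInstance(0, pageHeight + translateY);
        flip.scale(1, -1); // concatenated on the right, so it is applied to points first
        return flip;
    }

    public static void main(String[] args) {
        Point2D p = new Point2D.Double(3, 10);
        System.out.println(manualFlip(p, 100, 5));
        System.out.println(flipTransform(100, 5).transform(p, null));
    }
}
```

Both calls map the point to the same location, so the per-vertex arithmetic could in principle be replaced by a transform that is set up once, outside the shading context.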




[jira] [Updated] (PDFBOX-1991) Shading PaintContexts should not depend on the page height

2014-03-18 Thread John Hewson (JIRA)


 [ 
https://issues.apache.org/jira/browse/PDFBOX-1991?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

John Hewson updated PDFBOX-1991:


Description: 
I'd like to remove the page height parameter from PDPattern as soon as possible 
because of doubts over its safety (i.e. the current stream being processed may 
be a pattern or a form, not a page). Before I do that we need to remove its 
only use, which is...

The page height is passed to all shading PaintContext subclasses but it is only 
used in GouraudShadingContext. All other drawing in PDFBox is done using the 
native PDF y-axis, which is flipped via a call to Graphics2D#scale(1, -1), but 
the following code in GouraudShadingContext flips the y-axis itself:

v.point = new Point.Double(v.point.getX(), pageHeight + xform.getTranslateY() - v.point.getY());

So it seems like this could be removed and the y-axis inversion done elsewhere 
with either a Matrix, AffineTransform or Graphics2D#scale.

  was:
I'd like to remove the page height parameter from PDPattern as soon as possible 
because of doubts over its safety (i.e. the current stream being processed may 
be a pattern or a form, not a page). Before I do that we need to remove its 
only use, which is...

The page height is passed to all shading PaintContext subclasses but it is only 
used in GouraudShadingContext. However, all other drawing in PDFBox is done 
using the native PDF y-axis which is flipped via a call to Graphics2D#scale(0, 
-1) but the following code in GouraudShadingContext flips the y-axis:

{code}
v.point = new Point.Double(v.point.getX(), pageHeight + xform.getTranslateY() - 
v.point.getY());
{code}

So it seems like this could be removed and the y-axis inversion done elsewhere 
with either a Matrix, AffineTransform or Grpahics2D#scale.


