[jira] [Commented] (PDFBOX-4649) High CPU load an memory usage, when converting PDF to Image

2019-09-11 Thread Tilman Hausherr (Jira)


[ 
https://issues.apache.org/jira/browse/PDFBOX-4649?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16927948#comment-16927948
 ] 

Tilman Hausherr commented on PDFBOX-4649:
-

Very weird... Make sure you have the latest Java version; I think for 1.8 the 
last release under the old Oracle license was 192 or 202. Or use Amazon Corretto.

I found one weird thing: from the command line, I could render 
{{331577-5_b_19ez1.pdf}} with -Xmx2g in RGB. But for bitonal (which you use) I 
needed -Xmx2400m to make it work. (Java itself is at fault; sometimes it 
converts to RGB internally.)

You could use less memory by passing a scratch file setting to PDDocument.load(), 
but then it will be much slower.
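
To get a feeling for why the image type can matter for the heap, a rough estimate of the raster size of one rendered page may help. The sketch below is not PDFBox code. It assumes a hypothetical A4 page (595 x 842 points, since the page dimensions of the attached files are not stated here), 300 DPI, 4 bytes per pixel for a packed RGB image and 1 bit per pixel for a bitonal one. The class and method names are made up for illustration.

```java
// Rough estimate of the raster size of one rendered page (illustration only, not PDFBox code).
final class RasterMemoryEstimate {

    // PDF user space has 72 points per inch, so pixels = points * dpi / 72 (rounded down, at least 1).
    static long pixels(double points, int dpi) {
        return (long) Math.max(Math.floor(points * dpi / 72.0), 1);
    }

    // Packed RGB: one 4-byte int per pixel (assumption, as in BufferedImage.TYPE_INT_RGB).
    static long rgbBytes(double widthPt, double heightPt, int dpi) {
        return pixels(widthPt, dpi) * pixels(heightPt, dpi) * 4L;
    }

    // Bitonal: 1 bit per pixel, with each row padded to a whole number of bytes.
    static long bitonalBytes(double widthPt, double heightPt, int dpi) {
        long bytesPerRow = (pixels(widthPt, dpi) + 7) / 8;
        return bytesPerRow * pixels(heightPt, dpi);
    }

    public static void main(String[] args) {
        // Hypothetical A4 page at 300 DPI.
        System.out.println("RGB bytes:     " + rgbBytes(595, 842, 300));
        System.out.println("Bitonal bytes: " + bitonalBytes(595, 842, 300));
    }
}
```

Under these assumptions the RGB raster is about 32 times larger than the bitonal one. If a bitonal image is converted to RGB internally, both buffers would have to be held for a while, which would be consistent with the bitonal run needing somewhat more heap. The attached files may have larger pages or large embedded images, so the absolute numbers are only indicative.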

> High CPU load an memory usage, when converting PDF to Image
> ---
>
> Key: PDFBOX-4649
> URL: https://issues.apache.org/jira/browse/PDFBOX-4649
> Project: PDFBox
>  Issue Type: Bug
>  Components: Rendering
>Affects Versions: 2.0.16
>Reporter: Willie Chieukam
>Priority: Critical
> Attachments: 331577-5_b_19ez1.pdf, 332699-5_c_19ez7.pdf, 
> 335520-5_c_19ezb.pdf, 335521-5_c_19ezd.pdf
>
>
> Hello!
> We are running a business web application that uses PDFBox to convert
> PDF files to images using pdfRenderer.renderImageWithDPI(parameters).
> When we try to convert the attached PDF, the CPU load of Tomcat, running in a 
> Docker container on OpenShift, rises and the process seems to hang. The 
> Tomcat process is no longer responsive and we get a memory overflow. The 
> server load is also very high in the meantime.
> We are using
> + org.apache.pdfbox:pdfbox v 2.0.16
>  + org.apache.pdfbox:pdfbox-tools v 2.0.16
>  + org.apache.pdfbox:jbig2-imageio:3.0.2
> Our code looks like this:
> {code:java}
> public void saveImageFromPDF(Path filePath, Path imagePath, Integer IMAGE_DPI, Float IMAGE_QUALITY) {
>     try (PDDocument pddocument = PDDocument.load(Files.newInputStream(filePath, StandardOpenOption.READ))) {
>         PDFRenderer pdfRenderer = new PDFRenderer(pddocument);
>         for (Integer i = 0; i < pddocument.getNumberOfPages(); i++) {
>             try (OutputStream outputStream = documentServiceUtility
>                     .getFileOutputStream(imagePath.resolve(Integer.toString(i) + "." + IMAGE_FILE_EXTENSION))) {
>                 BufferedImage bufferedImage = pdfRenderer.renderImageWithDPI(i, IMAGE_DPI, ImageType.BINARY);
>                 ImageIOUtil.writeImage(bufferedImage, IMAGE_FILE_EXTENSION, outputStream, IMAGE_DPI, IMAGE_QUALITY);
>                 LOG.debug("Image of document {} successfully saved.",
>                         imagePath.resolve(Integer.toString(i) + "." + IMAGE_FILE_EXTENSION));
>             } catch (Throwable ex) {
>                 throw new NiehoffPDDocumentHanderException(filePath, ex);
>             }
>         }
>     } catch (Exception e) {
>         throw new NiehoffPDDocumentHanderException(filePath, e);
>     }
> }
> {code}
> Line throwing the exception:
> *{color:#FF}BufferedImage bufferedImage = 
> pdfRenderer.renderImageWithDPI(i, IMAGE_DPI, ImageType.BINARY);{color}*
>   
> Do you have any idea how to prevent this?
> Thank you very much and best regards,
>  Willie



--
This message was sent by Atlassian Jira
(v8.3.2#803003)

-
To unsubscribe, e-mail: dev-unsubscr...@pdfbox.apache.org
For additional commands, e-mail: dev-h...@pdfbox.apache.org



[jira] [Commented] (PDFBOX-4649) High CPU load an memory usage, when converting PDF to Image

2019-09-11 Thread Willie Chieukam (Jira)


[ 
https://issues.apache.org/jira/browse/PDFBOX-4649?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16927927#comment-16927927
 ] 

Willie Chieukam commented on PDFBOX-4649:
-

We are running the application, as mentioned, in a Docker container with the 
following Java options:

Extract from Dockerfile:
{code}
ENV JAVA_OPTS="-Xmx8g -Xms4g"
{code}
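
Whether Tomcat actually picks up JAVA_OPTS depends on how the container starts it, which is not shown here. A minimal sketch for checking the effective heap limits from inside the running JVM, using only the standard Runtime API (the class name HeapCheck is hypothetical):

```java
// Reports the heap limits the running JVM actually sees (illustration only).
final class HeapCheck {

    private static final long MIB = 1024L * 1024L;

    // Maximum heap the JVM will attempt to use, in MiB (roughly what -Xmx resolves to).
    static long maxHeapMiB() {
        return Runtime.getRuntime().maxMemory() / MIB;
    }

    // Heap currently reserved by the JVM, in MiB (starts near -Xms and may grow).
    static long committedHeapMiB() {
        return Runtime.getRuntime().totalMemory() / MIB;
    }

    public static void main(String[] args) {
        System.out.println("max heap (MiB):       " + maxHeapMiB());
        System.out.println("committed heap (MiB): " + committedHeapMiB());
    }
}
```

Logging these two values once at application startup would show whether the -Xmx8g and -Xms4g settings really reach the Tomcat process.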




[jira] [Comment Edited] (PDFBOX-4649) High CPU load an memory usage, when converting PDF to Image

2019-09-11 Thread Tilman Hausherr (Jira)


[ 
https://issues.apache.org/jira/browse/PDFBOX-4649?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16927909#comment-16927909
 ] 

Tilman Hausherr edited comment on PDFBOX-4649 at 9/11/19 7:08 PM:
--

What -Xmx option are you using?

With PDFDebugger I can display these files at 72dpi in 5 seconds with -Xmx4g. 
If I set the CPU to "ridiculous speed" mode, it goes down to 2 seconds. 
Additional time will be needed to save the files.

It will be slower with higher dpi. It went up to about 3 seconds at 400% which 
is about 288dpi.
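
The 400% to 288dpi figure follows from PDF user space having 72 units per inch, assuming that 100% zoom in the viewer corresponds to 72dpi. A small sketch of the arithmetic (the class and method names are hypothetical):

```java
// Converts a viewer zoom percentage to an effective DPI, assuming 100% corresponds to 72 DPI.
final class ZoomToDpi {

    static final double BASE_DPI = 72.0;

    // Effective resolution for a given zoom level, e.g. 400 (%) gives 72 * 4.
    static double dpiForZoom(double zoomPercent) {
        return BASE_DPI * zoomPercent / 100.0;
    }

    // The number of pixels grows with the square of the scale factor.
    static double pixelFactor(double zoomPercent) {
        double scale = zoomPercent / 100.0;
        return scale * scale;
    }

    public static void main(String[] args) {
        System.out.println("DPI at 400%:          " + dpiForZoom(400));
        System.out.println("Pixel factor at 400%: " + pixelFactor(400));
    }
}
```

At 400% there are 16 times as many pixels to fill as at 100%, which is one reason rendering time and memory use grow quickly with dpi.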

You can increase speed slightly by changing
{code}
PDDocument.load(Files.newInputStream(filePath, StandardOpenOption.READ))
{code}
to
{code}
PDDocument.load(filePath.toFile())
{code}







[jira] [Commented] (PDFBOX-4649) High CPU load an memory usage, when converting PDF to Image

2019-09-11 Thread Tilman Hausherr (Jira)


[ 
https://issues.apache.org/jira/browse/PDFBOX-4649?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16927909#comment-16927909
 ] 

Tilman Hausherr commented on PDFBOX-4649:
-

With PDFDebugger I can display these files at 72dpi in 5 seconds with -Xmx4g. 
If I set the CPU to "ridiculous speed" mode, it goes down to 2 seconds. 
Additional time will be needed to save the files.

It will be slower with higher dpi. It went up to about 3 seconds at 400% which 
is about 288dpi.

You can increase speed slightly by changing
{code}
PDDocument.load(Files.newInputStream(filePath, StandardOpenOption.READ))
{code}
to
{code}
PDDocument.load(filePath.toFile())
{code}





[jira] [Updated] (PDFBOX-4649) High CPU load an memory usage, when converting PDF to Image

2019-09-11 Thread Willie Chieukam (Jira)


 [ 
https://issues.apache.org/jira/browse/PDFBOX-4649?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Willie Chieukam updated PDFBOX-4649:

Description: 
Hello!

We are running a business web application that uses PDFBox to convert
PDF files to images using pdfRenderer.renderImageWithDPI(parameters).

When we try to convert the attached PDF, the CPU load of Tomcat, running in a 
Docker container on OpenShift, rises and the process seems to hang. 
The Tomcat process is no longer responsive and we get a memory overflow. 
The server load is also very high in the meantime.

We are using

+ org.apache.pdfbox:pdfbox v 2.0.16
 + org.apache.pdfbox:pdfbox-tools v 2.0.16
 + org.apache.pdfbox:jbig2-imageio:3.0.2

Our code looks like this:
{code:java}
public void saveImageFromPDF(Path filePath, Path imagePath, Integer IMAGE_DPI, Float IMAGE_QUALITY) {
    try (PDDocument pddocument = PDDocument.load(Files.newInputStream(filePath, StandardOpenOption.READ))) {
        PDFRenderer pdfRenderer = new PDFRenderer(pddocument);

        for (Integer i = 0; i < pddocument.getNumberOfPages(); i++) {
            try (OutputStream outputStream = documentServiceUtility
                    .getFileOutputStream(imagePath.resolve(Integer.toString(i) + "." + IMAGE_FILE_EXTENSION))) {

                BufferedImage bufferedImage = pdfRenderer.renderImageWithDPI(i, IMAGE_DPI, ImageType.BINARY);
                ImageIOUtil.writeImage(bufferedImage, IMAGE_FILE_EXTENSION, outputStream, IMAGE_DPI, IMAGE_QUALITY);
                LOG.debug("Image of document {} successfully saved.",
                        imagePath.resolve(Integer.toString(i) + "." + IMAGE_FILE_EXTENSION));
            } catch (Throwable ex) {
                throw new NiehoffPDDocumentHanderException(filePath, ex);
            }
        }
    } catch (Exception e) {
        throw new NiehoffPDDocumentHanderException(filePath, e);
    }
}
{code}
Line throwing the exception:

*{color:#FF}BufferedImage bufferedImage = pdfRenderer.renderImageWithDPI(i, 
IMAGE_DPI, ImageType.BINARY);{color}*
  
Do you have any idea how to prevent this?

Thank you very much and best regards,
 Willie



[jira] [Created] (PDFBOX-4649) High CPU load an memory usage, when converting PDF to Image

2019-09-11 Thread Willie Chieukam (Jira)
Willie Chieukam created PDFBOX-4649:
---

 Summary: High CPU load an memory usage, when converting PDF to 
Image
 Key: PDFBOX-4649
 URL: https://issues.apache.org/jira/browse/PDFBOX-4649
 Project: PDFBox
  Issue Type: Bug
  Components: Rendering
Affects Versions: 2.0.16
Reporter: Willie Chieukam
 Attachments: 331577-5_b_19ez1.pdf, 332699-5_c_19ez7.pdf, 
335520-5_c_19ezb.pdf, 335521-5_c_19ezd.pdf

Hello!

We are running a business web application that uses PDFBox to convert
PDF files to images using pdfRenderer.renderImageWithDPI(parameters).

When we try to convert the attached PDF, the CPU load of Tomcat rises
and the process seems to hang. The Tomcat process is no longer responsive 
and we get a memory overflow. The server load is also very high in the meantime.

We are using

+ org.apache.pdfbox:pdfbox v 2.0.16
+ org.apache.pdfbox:pdfbox-tools v 2.0.16
+ org.apache.pdfbox:jbig2-imageio:3.0.2

Our code looks like this:

{code}
public void saveImageFromPDF(Path filePath, Path imagePath, Integer IMAGE_DPI, Float IMAGE_QUALITY) {
    try (PDDocument pddocument = PDDocument.load(Files.newInputStream(filePath, StandardOpenOption.READ))) {
        PDFRenderer pdfRenderer = new PDFRenderer(pddocument);

        for (Integer i = 0; i < pddocument.getNumberOfPages(); i++) {
            try (OutputStream outputStream = documentServiceUtility
                    .getFileOutputStream(imagePath.resolve(Integer.toString(i) + "." + IMAGE_FILE_EXTENSION))) {

                BufferedImage bufferedImage = pdfRenderer.renderImageWithDPI(i, IMAGE_DPI, ImageType.BINARY);
                ImageIOUtil.writeImage(bufferedImage, IMAGE_FILE_EXTENSION, outputStream, IMAGE_DPI, IMAGE_QUALITY);
                LOG.debug("Image of document {} successfully saved.",
                        imagePath.resolve(Integer.toString(i) + "." + IMAGE_FILE_EXTENSION));
            } catch (Throwable ex) {
                throw new NiehoffPDDocumentHanderException(filePath, ex);
            }
        }
    } catch (Exception e) {
        throw new NiehoffPDDocumentHanderException(filePath, e);
    }
}
{code}

Line throwing the exception:

*{color:red}BufferedImage bufferedImage = pdfRenderer.renderImageWithDPI(i, 
IMAGE_DPI, ImageType.BINARY);{color}*
 
Do you have any idea how to prevent this?

Thank you very much and best regards,
Willie







[jira] [Commented] (PDFBOX-4648) OpenType Layout tables used in font ABCDEE+Times New Roman,Bold are not implemented in PDFBox and will be ignored

2019-09-11 Thread Tilman Hausherr (Jira)


[ 
https://issues.apache.org/jira/browse/PDFBOX-4648?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16927838#comment-16927838
 ] 

Tilman Hausherr commented on PDFBOX-4648:
-

I assume this is a follow-up of PDFBOX-4647. You could have reopened the issue. 
Anyway, I have attached two text extractions, one by PDFBox and one by Adobe. 
What are you missing?

> OpenType Layout tables used in font ABCDEE+Times New Roman,Bold are not 
> implemented in PDFBox and will be ignored
> -
>
> Key: PDFBOX-4648
> URL: https://issues.apache.org/jira/browse/PDFBOX-4648
> Project: PDFBox
>  Issue Type: Improvement
>  Components: Text extraction
>Affects Versions: 2.0.4
>Reporter: wanling
>Priority: Major
> Attachments: 5e214f828f164322a6600f183191dda5-Adobe.txt, 
> 5e214f828f164322a6600f183191dda5-PDFBox.txt, 
> 5e214f828f164322a6600f183191dda5.pdf
>
>
> No PostScript name information is provided for the font Arial-BoldMT
> OpenType Layout tables used in font ABCDEE+Times New Roman,Bold are not 
> implemented in PDFBox and will be ignored
>  No Unicode mapping for CID+47 (47) in font ABCDEE+Times New Roman,Bold
>  
> Adobe is normal, but PDFBox can see only _parts, not all_. OCI can't see 
> it completely.






[jira] [Updated] (PDFBOX-4648) OpenType Layout tables used in font ABCDEE+Times New Roman,Bold are not implemented in PDFBox and will be ignored

2019-09-11 Thread Tilman Hausherr (Jira)


 [ 
https://issues.apache.org/jira/browse/PDFBOX-4648?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Tilman Hausherr updated PDFBOX-4648:

Attachment: 5e214f828f164322a6600f183191dda5.pdf
5e214f828f164322a6600f183191dda5-PDFBox.txt
5e214f828f164322a6600f183191dda5-Adobe.txt




[jira] [Commented] (PDFBOX-4071) Improve code quality (3)

2019-09-11 Thread ASF subversion and git services (Jira)


[ 
https://issues.apache.org/jira/browse/PDFBOX-4071?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16927825#comment-16927825
 ] 

ASF subversion and git services commented on PDFBOX-4071:
-

Commit 1866803 from Tilman Hausherr in branch 'pdfbox/trunk'
[ https://svn.apache.org/r1866803 ]

PDFBOX-4071: update bc version

> Improve code quality (3)
> 
>
> Key: PDFBOX-4071
> URL: https://issues.apache.org/jira/browse/PDFBOX-4071
> Project: PDFBox
>  Issue Type: Task
>Affects Versions: 2.0.8
>Reporter: Tilman Hausherr
>Priority: Major
> Attachments: pdfbox-screenshot-bad.png, pdfbox-screenshot-good.png
>
>
> This is a long-term issue for the task of improving code quality by using the 
> [SonarQube 
> report|https://analysis.apache.org/dashboard/index/org.apache.pdfbox:pdfbox-reactor],
>  hints in different IDEs, the FindBugs tool and other code quality tools.
> This is a follow-up of PDFBOX-2852, which was getting too long.






[jira] [Closed] (PDFBOX-4616) Orientation changing when splitting a particular PDF document with clipped output

2019-09-11 Thread Tilman Hausherr (Jira)


 [ 
https://issues.apache.org/jira/browse/PDFBOX-4616?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Tilman Hausherr closed PDFBOX-4616.
---
Resolution: Cannot Reproduce

Closing as there was no further feedback and no java test code. You can still 
reopen and/or comment. But I hope you were able to solve the problem with the 
advice given.

> Orientation changing when splitting a particular PDF document with clipped 
> output
> -
>
> Key: PDFBOX-4616
> URL: https://issues.apache.org/jira/browse/PDFBOX-4616
> Project: PDFBox
>  Issue Type: Bug
>  Components: .NET
>Affects Versions: 2.0.12
> Environment: PDF extracted from Crystal report
>Reporter: Jitendra Mahendrapuri Goswami
>Priority: Major
> Attachments: PDF_Split_Issue.zip
>
>
> When I split a PDF (with landscape orientation) that was exported from a 
> Crystal report, the resulting files are converted to portrait orientation and 
> part of the content is removed. We are using getDocumentCatalog().getPages() to 
> get a particular page and save it.
>  
> The same file, when opened in Internet Explorer and sent to print, has its 
> orientation detected correctly. It is working when using 
> Please share if there is any specific reason for such behavior and how to 
> handle it in code.






[jira] [Closed] (PDFBOX-4609) At least one signature is invalid

2019-09-11 Thread Tilman Hausherr (Jira)


 [ 
https://issues.apache.org/jira/browse/PDFBOX-4609?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Tilman Hausherr closed PDFBOX-4609.
---
Resolution: Not A Problem

Closing, nothing has happened for a few weeks here.

> At least  one signature is invalid
> --
>
> Key: PDFBOX-4609
> URL: https://issues.apache.org/jira/browse/PDFBOX-4609
> Project: PDFBox
>  Issue Type: Wish
>  Components: .NET
>Affects Versions: 1.8.15
>Reporter: bal
>Priority: Major
> Attachments: Debug.txt, keystore.p12
>
>
> I am getting a "signature is invalid" error in the signature panel of the PDF 
> after inserting a PKCS#7 signature. I can see the name of the signer under 
> "signed by" in the signature panel. I am not able to find out which disallowed 
> changes PDFBox makes that cause Acrobat Reader to consider the PDF invalid. 
> Is it possible to validate the PDF with PDFBox? Thanks in advance. 






[jira] [Closed] (PDFBOX-4633) Make PDFBox available as an Eclipse plugin

2019-09-11 Thread Tilman Hausherr (Jira)


 [ 
https://issues.apache.org/jira/browse/PDFBOX-4633?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Tilman Hausherr closed PDFBOX-4633.
---
Resolution: Incomplete

Closing as it is unclear what is wished. You can still reopen and/or comment.

> Make PDFBox available as an Eclipse plugin
> --
>
> Key: PDFBOX-4633
> URL: https://issues.apache.org/jira/browse/PDFBOX-4633
> Project: PDFBox
>  Issue Type: Improvement
>Reporter: Henning von Bargen
>Priority: Minor
>
> It would be great if PDFBox supported OSGi metadata and could be used as a 
> regular Eclipse plugin.






[jira] [Closed] (PDFBOX-4640) PDF Annotations missed when merging documents

2019-09-11 Thread Tilman Hausherr (Jira)


 [ 
https://issues.apache.org/jira/browse/PDFBOX-4640?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Tilman Hausherr closed PDFBOX-4640.
---
Resolution: Cannot Reproduce

Closing because we can't reproduce and due to lack of feedback. You can reopen 
and/or comment. I suspect that the problem is more complex and also related to 
a second specific PDF.

> PDF Annotations missed when merging documents
> -
>
> Key: PDFBOX-4640
> URL: https://issues.apache.org/jira/browse/PDFBOX-4640
> Project: PDFBox
>  Issue Type: Bug
>  Components: Utilities
>Affects Versions: 2.0.16
>Reporter: Daniel Martin Garcia
>Priority: Major
>  Labels: merge
> Attachments: highlighted pdf.pdf, result.pdf
>
>
> Hi,
> There is a bug when PDFBox merges documents containing annotations whose 
> opacity is not 100%. If the opacity is less than 100%, the 
> annotation is lost.
>  
> I attach a document with an annotation whose opacity is 60%. If you create a 
> test to merge this PDF with another PDF, the annotation won't be in the merged 
> PDF.






[jira] [Updated] (PDFBOX-4648) OpenType Layout tables used in font ABCDEE+Times New Roman,Bold are not implemented in PDFBox and will be ignored

2019-09-11 Thread wanling (Jira)


 [ 
https://issues.apache.org/jira/browse/PDFBOX-4648?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

wanling updated PDFBOX-4648:

Description: 
No PostScript name information is provided for the font Arial-BoldMT

OpenType Layout tables used in font ABCDEE+Times New Roman,Bold are not 
implemented in PDFBox and will be ignored
 No Unicode mapping for CID+47 (47) in font ABCDEE+Times New Roman,Bold

 

Adobe is normal, but PDFBox can see only _parts, not all_. OCI can't see it 
completely.





[jira] [Created] (PDFBOX-4648) OpenType Layout tables used in font ABCDEE+Times New Roman,Bold are not implemented in PDFBox and will be ignored

2019-09-11 Thread wanling (Jira)
wanling created PDFBOX-4648:
---

 Summary: OpenType Layout tables used in font ABCDEE+Times New 
Roman,Bold are not implemented in PDFBox and will be ignored
 Key: PDFBOX-4648
 URL: https://issues.apache.org/jira/browse/PDFBOX-4648
 Project: PDFBox
  Issue Type: Improvement
  Components: Text extraction
Affects Versions: 2.0.4
Reporter: wanling


OpenType Layout tables used in font ABCDEE+Times New Roman,Bold are not 
implemented in PDFBox and will be ignored
No Unicode mapping for CID+47 (47) in font ABCDEE+Times New Roman,Bold

 

Adobe is normal, but PDFBox can see only _parts, not all_. OCI can't see it 
completely.


