[
https://issues.apache.org/jira/browse/PDFBOX-5771?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17818284#comment-17818284
]
weiteFeng edited comment on PDFBOX-5771 at 2/19/24 6:21 AM:
------------------------------------------------------------
I tried other versions of PDFBox and the JDK, but the problem remains.
I also found that simply opening the PDF file and saving it to an output
stream is enough to reproduce the problem.
Here is my minimal reproduction code:
{noformat}
public void testMemLeak() throws IOException {
    Path path = Paths.get("path to pdf file");
    byte[] bytes = Files.readAllBytes(path);
    CountDownLatch latch = new CountDownLatch(3000);
    ExecutorService executorService = Executors.newFixedThreadPool(4);
    for (int i = 0; i < 3000; i++) {
        executorService.submit(() -> {
            try {
                PDDocument pdf = PDDocument.load(bytes);
                ByteArrayOutputStream output = new ByteArrayOutputStream();
                pdf.save(output);
                pdf.close();
                output.close();
            } catch (Exception e) {
                System.out.println(e);
                throw new RuntimeException(e);
            } finally {
                latch.countDown();
            }
        });
    }
    try {
        latch.await();
    } catch (InterruptedException e) {
        Thread.currentThread().interrupt();
    }
    executorService.shutdown();
}
{noformat}
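As a rough aid for telling heap growth apart from off-heap growth while the loop above runs, something like the following could be called periodically. It is only a sketch: the class name MemorySnapshot is hypothetical, it uses only the JDK's java.lang.management beans, and it does not see memory that native libraries allocate with malloc, so a flat reading here does not rule out a native leak.

```java
import java.lang.management.BufferPoolMXBean;
import java.lang.management.ManagementFactory;
import java.lang.management.MemoryMXBean;

// Hypothetical helper (not part of PDFBox): one sample of JVM-reported memory.
public class MemorySnapshot {
    public final long heapUsed;        // bytes used on the Java heap
    public final long nonHeapUsed;     // bytes used by metaspace, code cache, etc.
    public final long bufferPoolUsed;  // bytes held by direct and mapped NIO buffers

    public MemorySnapshot(long heapUsed, long nonHeapUsed, long bufferPoolUsed) {
        this.heapUsed = heapUsed;
        this.nonHeapUsed = nonHeapUsed;
        this.bufferPoolUsed = bufferPoolUsed;
    }

    // Reads the current values from the platform management beans.
    public static MemorySnapshot capture() {
        MemoryMXBean memory = ManagementFactory.getMemoryMXBean();
        long heap = memory.getHeapMemoryUsage().getUsed();
        long nonHeap = memory.getNonHeapMemoryUsage().getUsed();
        long buffers = 0;
        for (BufferPoolMXBean pool
                : ManagementFactory.getPlatformMXBeans(BufferPoolMXBean.class)) {
            long used = pool.getMemoryUsed(); // -1 when no estimate is available
            if (used > 0) {
                buffers += used;
            }
        }
        return new MemorySnapshot(heap, nonHeap, buffers);
    }

    @Override
    public String toString() {
        return "heap=" + heapUsed
                + " nonHeap=" + nonHeapUsed
                + " bufferPools=" + bufferPoolUsed;
    }

    public static void main(String[] args) {
        System.out.println(capture());
    }
}
```

Comparing these figures with the resident size reported by top may indicate whether the growth lies outside what the JVM itself accounts for.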
> Adding watermarks to PDFs may suffer native memory leaks
> --------------------------------------------------------
>
> Key: PDFBOX-5771
> URL: https://issues.apache.org/jira/browse/PDFBOX-5771
> Project: PDFBox
> Issue Type: Bug
> Environment: java8, java11
> pdfbox2, pdfbox3
> Debian GNU/Linux 8
> Reporter: weiteFeng
> Priority: Critical
> Attachments: App_2024_02_19_103625.jfr, Designing Data Intensive
> Applications.pdf, image-2024-02-19-10-41-19-720.png, pdf_for_test.pdf
>
>
> When the following code is used to add a watermark to a PDF file, the memory
> usage of the Java process gradually increases, eventually exceeding the
> maximum heap size. Once the process uses more memory than the machine has,
> the operating system kills it.
> When I analyzed the memory dump, I found that even while the Java process
> occupied a large amount of memory (as shown by the top command), the heap
> itself did not take up much space, so I infer that this code may have a
> native memory leak. Because I don't have a deep understanding of Linux memory
> analysis, I could not locate the problem in this code.
> I would appreciate any suggestions.
> The following is the code I use to add watermarks to PDFs:
>
> {code:java}
> public ByteBuffer AddWaterMark(AddWaterMarkRequest request) throws Exception {
>     byte[] b = TBaseHelper.copyBinary(request.PDFContent).array();
>     PDDocument pdf = PDDocument.load(b);
>     ByteArrayOutputStream output = new ByteArrayOutputStream();
>     if (Objects.equals(request.WaterMark.Font, "")
>             || !this.isFontExist(request.WaterMark.Font)) {
>         request.WaterMark.Font = pdf_rpcConstants.WenQuanDengKuanZhengHei;
>     }
>     PDFont font = getFontFromByte(pdf, request.WaterMark.Font);
>     addWaterMark(pdf, request.WaterMark, font);
>     pdf.save(output);
>     pdf.close();
>     return ByteBuffer.wrap(output.toByteArray());
> }
>
> private void addWaterMark(PDDocument pdf, WaterMark waterMark, PDFont font)
>         throws Exception {
>     List<String> texts = waterMark.getWaterMarkTexts();
>     for (PDPage page : pdf.getPages()) {
>         PDPageContentStream cs = new PDPageContentStream(pdf, page,
>                 PDPageContentStream.AppendMode.APPEND, true, true);
>         PDExtendedGraphicsState r0 = new PDExtendedGraphicsState();
>         r0.setNonStrokingAlphaConstant((float) waterMark.AlphaConstant);
>         r0.setAlphaSourceFlag(true);
>         cs.setGraphicsStateParameters(r0);
>         cs.setNonStrokingColor(new Color(waterMark.Color.Red,
>                 waterMark.Color.Green, waterMark.Color.Blue));
>         float horizontalSpacing = waterMark.HorizontalSpacing;
>         float verticalSpacing = waterMark.VerticalSpacing;
>         int horizontalNumber = (int) (page.getMediaBox().getWidth()
>                 / horizontalSpacing) + 2;
>         int verticalNumber = (int) (page.getMediaBox().getHeight()
>                 / verticalSpacing) + 2;
>         cs.beginText();
>         cs.setFont(font, waterMark.getFontSize());
>         for (int i = 0; i <= horizontalNumber; i++) {
>             for (int j = 0; j < verticalNumber; j++) {
>                 for (int k = 0; k < texts.size(); k++) {
>                     float tx = waterMark.StartIndexX + (i - 1)
>                             * horizontalSpacing + k * waterMark.getLineSpace();
>                     float ty = waterMark.StartIndexY + j * verticalSpacing;
>                     if (i % 2 == 0) {
>                         cs.setTextMatrix(Matrix.getRotateInstance(
>                                 waterMark.RotateTheta, tx, ty));
>                     } else {
>                         cs.setTextMatrix(Matrix.getRotateInstance(
>                                 waterMark.RotateTheta, tx, ty + waterMark.Offset));
>                     }
>                     cs.showText(texts.get(k));
>                 }
>             }
>         }
>         cs.endText();
>         cs.restoreGraphicsState();
>         cs.close();
>     }
> }
> {code}
>
>
>