[ 
https://issues.apache.org/jira/browse/PDFBOX-5731?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17796643#comment-17796643
 ] 

Axel Howind commented on PDFBOX-5731:
-------------------------------------

 
{quote}I think the logic is a bit strange. Registering for recycling when a 
certain name is empty seems to have no effect.
{quote}
I don't get what you mean here? `name` is the COSName instance. If name is 
null, it means it's not in the cache, and we create a new one. The synchronized 
block makes sure that even when multiple threads try to create identical 
COSName instances at the same time, only one is created. I used double checked 
locking to avoid having to make the complete method synchronized.
{quote}jdk8 does not seem to have Cleaner.create(), register() methods.
{quote}
Yes, exactly. That's why I wrote "This works from Java 9 onwards"  in the 
original comment. As PDFBox 3.x requires Java 11, I see no problem with that.

In Java 8 you had finalize() for this kind of cleanup action, finalize has been 
deprecated for 6 years now.
{quote}Moreover, Cleaner is generally used for automatic recycling of off-heap 
memory. Using Cleaner may affect other functions in the system.
{quote}
Ah, did you get that off ChatGPT? Cleaner has been introduced _exactly_ for 
this kind oh cleanup actions. Freeing off-heap memory is just one (and the most 
common example) for using Cleaner, because the JVM cannot handle that by itself 
without a user supplied cleanup action. Other use cases are all kinds of 
resources acquired using JNI calls etc.

Starting with Java 9, using Cleaner is the recommended way of handling resource 
cleanup (in cases where automatic resource management with try-with-resources 
is not possible), see [JEP 421|https://openjdk.org/jeps/421] for details.
{quote}like this:
{quote}
Your code essentially re-implements what Cleaner does, just that Cleaner is 
maintained and tested by the JDK developers. If in doubt, trust the JDK devs to 
do it right.

Your code also has the issue that the cleanup is only triggered when a new 
COSName instance is created. If you have one million COSName instances in your 
document, then load another document, the map will not be cleaned unless you 
add a new COSName instance that is different from the million ones you already 
have.

 

I have changed your test code like this to better see what is going on:
{code:java}
    public static void main(String[] args) throws InterruptedException {
        for (int i = 0; i < 10_000_000; i++) {
            test("123_"+i);
            if (i%1_000_000==0) {
                System.err.println(i+", size="+NAME_MAP.size());
            }
        }
        System.err.println("after loop: "+NAME_MAP.size());
        System.gc();
        Thread.sleep(10_000);
        System.err.println("System.gc(): "+NAME_MAP.size());
        clearResources();
        System.err.println("after clearResources(): "+NAME_MAP.size());
    }

    private static void test(String name){
        final COSName pdfName = COSName.getPDFName(name);
    }
{code}
And this is the output I get (using Cleaner):
{noformat}
0, size=570
1000000, size=970902
2000000, size=1883450
3000000, size=2820630
4000000, size=3749494
5000000, size=4698729
6000000, size=5646001
7000000, size=6623271
8000000, size=7574087
9000000, size=8574087
after loop: 9520798
System.gc(): 569
after clearResources(): 569
{noformat}
As you can see, the cleaning action runs when the COSName objects are GCed.

I did not run your version using a reference queue because your code is not 
based on trunk and does not compile. I would expect the results to be more or 
less the same.

As for PDFBox 3+ I'd definitely recommend the Cleaner based solution as it is 
much simpler than manipulating ReferenceQueues yourself (that's also the main 
reason Cleaner was introduced, it uses ReferenceQueues internally).

If you absolutely need a patch for PDFBox 2, please prepare a patch using 
ReferenceQueue.

> org.apache.pdfbox.cos.COSName#nameMap There is a memory leak problem.
> ---------------------------------------------------------------------
>
>                 Key: PDFBOX-5731
>                 URL: https://issues.apache.org/jira/browse/PDFBOX-5731
>             Project: PDFBox
>          Issue Type: Bug
>    Affects Versions: 2.0.30, 3.0.1 PDFBox
>            Reporter: liu
>            Priority: Major
>         Attachments: COSName.java, 
> attempted_fix_for_PDFBOX-5731__clear_out_unused_COSName_instances_automatically.patch,
>  
> attempted_fix_for_PDFBOX-5731__clear_out_unused_COSName_instances_automatically_2.patch,
>  
> attempted_fix_for_PDFBOX-5731__clear_out_unused_COSName_instances_automatically_using_cleaner.patch,
>  image-2023-12-08-16-02-12-293.png, image-2023-12-13-17-04-22-073.png, 
> screenshot-1.png, screenshot-2.png, screenshot-3.png, screenshot-4.png, 
> screenshot-5.png
>
>
>  !image-2023-12-08-16-02-12-293.png! 
>  !screenshot-1.png! 



--
This message was sent by Atlassian Jira
(v8.20.10#820010)

---------------------------------------------------------------------
To unsubscribe, e-mail: dev-unsubscr...@pdfbox.apache.org
For additional commands, e-mail: dev-h...@pdfbox.apache.org

Reply via email to