Ok to the fix. But Sergey need to respond as to whether the answer about the test satisfies him.

-phil.

On 11/09/2017 02:18 AM, Prahalad Kumar Narayanan wrote:
Hello Everyone

First, Thanks to Phil & Sergey for their time in review & suggestions.

This is a follow-up on Phil's suggestion to investigate whether the fix 
proposed in webrev.00 could lead to deadlocks.
Link: http://cr.openjdk.java.net/~pnarayanan/8188083/webrev.00/

I inspected the code, and wrote code snippets to check the behavior.
Inferences are as follows-
     1. FilteredImageSource helps in producing the image, applying an image 
filter on the pixels, and sharing the result to the ImageConsumer.
     2. The object holds- An ImageProducer, An ImageFilter, and instantiates a 
HashTable<ImageConsumer, ImageFilter> for its operation.

     3. The methods of FilteredImageSource are not supposed to be invoked 
directly from user-code. So how are these methods invoked?
           . startProduction:
                 . This method is invoked when a user creates an Image like- 
Component.createImage(new FilteredImageSource(imgProducer, imgFilter));
                 . Such creation could occur on any thread in user code and 
this internally invokes FilteredImageSource.startProduction through 
ToolkitImage.
                 . The method updates the hash table as required and invokes 
ImageProducer.startProduction to produce the image.
           . removeConsumer:
                 . This method is invoked once ImageProducer is done with image 
preparation.
                 . The exact thread in which the removeConsumer gets invoked 
depends on how ImageProducer prepares the image.

     4. Is there a possibility of a deadlock with the proposed fix ?
           . In my inspection & observation with code snippets, I found that 
removeConsumer could be invoked in two ways- Multi-threaded & Single threaded.
           . Multi-threaded-
                 . This was observed with FileImageSource as ImageProducer
                 . startProduction on this image producer requests image decode 
operation on a different thread- ImageFetcher thread.
                 . Once decoding is complete, removeConsumer is invoked on the 
ImageFetcher thread.
                 . Though startProduction & removeConsumer are invoked on 
different threads, there is no possibility of a deadlock as the code flow from 
startProduction returns immediately after posting the request to decode image onto 
ImageFetcher 's queue.
           . Single threaded-
                 . This was observed with OffscreenImageSource as ImageProducer
                 . startProduction and removeConsumer are invoked on the same 
thread.
                 . Since re-entrant threads acquire the locks that they already 
hold, this wouldn't result in a deadlock.
           . Thus, in both cases I don't see possibility of a deadlock. Hence, 
we can consider the change in webrev.00 safe to fix the issue.

     5. I have run JTreg test cases before and after the proposed change on 
java/awt/image/ and javax/swing (where ImageFilter is used).
          No new regressions were observed during the test.

Kindly review by assessing the above information.
If you believe, the fix could be approved, kindly review the CSR as well.

Thank you once again for your time
Have a good day

Prahalad

----- Original Message -----
From: Philip Race
Sent: Thursday, October 26, 2017 9:30 PM
To: Prahalad Kumar Narayanan
Cc: Sergey Bylokhov; 2d-dev
Subject: Re: [OpenJDK 2D-Dev] [10] RFR: JDK-8188083- NPE in 
java.awt.image.FilteredImageSource.startProduction

The screenshot shows you directly calling this method in your test which
the documentation says you are not supposed to do.
So I am not able to be 100% sure that the test you have re-creates what
the submitter saw .. in his stack trace you have below it seems to be valid.

But to have a NPE at line 181 is odd.
  181             proxies.put(ic, imgf);

What is null ? proxies was just initialised in this same method so
I don't see how it can be null unless something elsewhere is
resetting it to null.

The only place I see that is removeConsumer
  138     public synchronized void removeConsumer(ImageConsumer ic) {
  139         if (proxies != null) {
  140             ImageFilter imgf =  proxies.get(ic);
  141             if (imgf != null) {
  142                 src.removeConsumer(imgf);
  143                 proxies.remove(ic);
  144                 if (proxies.isEmpty()) {
  145                     proxies = null;
  146                 }
  147             }
  148         }
  149     }

Now that method is synchronized .. OK so I think it might make sense
to mark the additional methods synchronized too but then I automatically
worry about introducing a deadlock.

I can't say for sure that you have done so however and likely there
are no nested locks here so it is probably OK but please run as many
tests as you can find to try to ensure that.

Adding or removing sychronized is binary compatible but since
these are public API methods you should still do a CSR
before pushing if this ends up being the approved fix.

-phil.



On 10/25/17, 11:58 PM, Prahalad Kumar Narayanan wrote:
Hello Sergey

Thank you for the suggestion.

I 've updated JBS with below information.
    1 . Stack trace information as provided by submitter-
         java.lang.NullPointerException
             at 
java.awt.image.FilteredImageSource.startProduction(FilteredImageSource.java:181)
             at 
java.awt.image.FilteredImageSource.startProduction(FilteredImageSource.java:183)
             at 
sun.awt.image.ImageRepresentation.startProduction(ImageRepresentation.java:732)
             at sun.awt.image.ToolkitImage.addWatcher(ToolkitImage.java:221)
     2. Screenshot showing the occurrence of the exception at 
FilteredImageSource.java: 181.
             . The image also shows the test passing with JDK10 containing the 
fix on VM running Linux Ubuntu x64.

Thank you
Have a good day

Prahalad N.

-----Original Message-----
From: Sergey Bylokhov
Sent: Thursday, October 26, 2017 12:05 PM
To: Prahalad Kumar Narayanan; 2d-dev
Subject: Re: [OpenJDK 2D-Dev] [10] RFR: JDK-8188083- NPE in 
java.awt.image.FilteredImageSource.startProduction

Hi, Prahalad.
Can you please add a stack trace which include a line numbers to the bug 
report? Currently it is unclear in what line an exception is occurred.

On 25/10/2017 21:07, Prahalad Kumar Narayanan wrote:
Hello Everyone

Good day to you.

Kindly review a fix for the bug
      Bug ID: JDK-8188083
      Description: Null Pointer Exception in
java.awt.image.FilteredImageSource.startProduction

Root Cause
      . FilteredImageSource implements ImageProducer interface
      . All the methods of FilteredImageSource operate on a common data 
-HashTable, but only a few are synchronized methods.
      . Thus, when synchronized & un-synchronized methods access / modify the 
hash table in a multi-threaded scenario, it renders the class vulnerable to this 
exception.

Exception occurrence
      . The submitter has mentioned that there is no test-case to reproduce to 
this issue.
      . Luckily I was able to observe this issue with a "crude" test code
            . The test triggered 7k threads in an ExecutorService randomly 
adding/ removing/ invoking startProduction on FilteredImageSource object.
            . I didn't feel it robust enough to be added as a part of 
regression tests. I tried optimizing the test but in vain.
            . Hence the test code isn't added to the webrev.

Details on the Fix:
      . The concerned methods have been "synchronized" in the fix.
      . No new regression failures were observed with this change.

Kindly review the change and provide your feedback.
In addition, kindly suggest whether this requires CSR as it adds "synchronized" 
to method signature.

Review Link: http://cr.openjdk.java.net/~pnarayanan/8188083/webrev.00/

Thank you for your time in review
Have a good day

Prahalad N.







--
Best regards, Sergey.

Reply via email to