Patch updated, see
https://issues.apache.org/bugzilla/show_bug.cgi?id=46962#c15

Alex Giotis


On Mar 1, 2012, at 2:20 PM, Alexios Giotis wrote:

> Vincent, 
> 
> No need to send more info. I just reproduced it on Windows Server 2008 R2 
> Enterprise running the latest, 'non-business', but not supported as of Nov 
> 2009, Sun JRE 1.5.0_22. The production servers are all running JDK6 and I 
> could not test it on a Mac.
> 
> The bug is indeed related to the concurrent hash map iterators and is fixed 
> on JDK6, see
> http://bugs.sun.com/bugdatabase/view_bug.do?bug_id=6312056
> 
> I will update the patch and I hope a committer will check it.
> 
> Alex Giotis
> 
> 
> 
> 
> On Mar 1, 2012, at 3:13 AM, Alexios Giotis wrote:
> 
>> Hi Vincent,
>> 
>> The unit tests where added by Mehdi but I could happily update them to Junit 
>> 4 and create the separate methods.
>> 
>> Related to the NullPointerException that you found, I tried many times but 
>> unfortunately I can't reproduce them on my Macbook running MacOS X 10.7.3  
>> with any JRE and have not seen it in any heavily loaded server. What is the 
>> environment that you tested it ? I could test it on different servers/OS or 
>> try to write a test of ConcurrentHashMap that can trigger it.
>> 
>> 
>> Related to what is happening, this is tricky. The ConcurrentHashMap does not 
>> allow putting null values, so one would not expect to get back null values. 
>> But in the cleanReclaimedMapEntries() we are getting the values using an 
>> iterator over the entrySet. As the javadoc says, the view's iterator are 
>> "weakly consistent", never throw ConcurrentModificationException and they 
>> are actually designed to be used by only one thread at a time. So this could 
>> explain why a null value could be returned. 
>> 
>> Depending on how this is reproduced, we could either avoid this by just 
>> checking for a null value or we could put the iteration in a block 
>> synchronized on the map instance. This loop is done once per 10000 items 
>> added, so it should not be slower. But before making this change, I need a 
>> way to reproduce it so that we are sure about it.
>> 
>> 
>> Alex Giotis
>> 
>> 
>> 
>> 
>> On Feb 29, 2012, at 11:42 PM, Vincent Hennebert wrote:
>> 
>>> Sorry for the delay about this. I had a look at the patch some time ago
>>> but didn’t have enough time to apply it.
>>> 
>>> I noticed that a NPE occurs when adding the following piece of code to
>>> PropertyCacheTestCase and running it:
>>> 
>>>  private final PropertyCache<Integer> cache = new PropertyCache<Integer>();
>>> 
>>>  private class CacheFiller implements Runnable {
>>> 
>>>      private final int start;
>>> 
>>>      CacheFiller(int start) {
>>>          this.start = start;
>>>      }
>>> 
>>>      public void run() {
>>>          for (int i = 0; i < 1000000; i++) {
>>>              cache.fetch(new Integer(start + i));
>>>          }
>>>      }
>>>  }
>>> 
>>>  public void testCleanUp() throws InterruptedException {
>>>      Thread t1 = new Thread(new CacheFiller(0));
>>>      Thread t2 = new Thread(new CacheFiller(10000));
>>>      t1.start();
>>>      t2.start();
>>>      t1.join();
>>>      t2.join();
>>>  }
>>> 
>>> Here’s the stacktrace:
>>> Exception in thread "Thread-1" java.lang.NullPointerException
>>>     at
>>> org.apache.fop.fo.properties.PropertyCache.cleanReclaimedMapEntries(PropertyCache.java:153)
>>>     at 
>>> org.apache.fop.fo.properties.PropertyCache.fetch(PropertyCache.java:105)
>>>     at
>>> org.apache.fop.fo.properties.PropertyCacheTestCase$CacheFiller.run(PropertyCacheTestCase.java:68)
>>>     at java.lang.Thread.run(Thread.java:595)
>>> 
>>> It happens when running the test with a Sun 1.5 JRE but apparently not
>>> with OpenJDK 1.6. I’m still unsure about what’s happening.
>>> 
>>> It would be good to update the test cases to JUnit 4. Also, there are
>>> some clean-ups and improvements to bring. Among other things, symmetry
>>> is not properly tested in FObjectTest (symmetric.equals(getSut()) should
>>> also be run); the tests for reflexivity, symmetry and transitivity
>>> should be put in separate methods; testFailureCases is misnamed; etc.
>>> Little things but that take some time to fix.
>>> 
>>> Vincent
>>> 
>>> 
>>> On 28/02/12 17:09, mehdi houshmand wrote:
>>>> Hi Guys,
>>>> 
>>>> My apologies for the lack of transparency on this issue, but I didn't
>>>> actually review the changes you made here, in fact, I barely looked at
>>>> what PropertyCache actually does. I had some free time, and added a
>>>> bunch of unit tests.
>>>> 
>>>> The reason this hasn't been committed yet was because Vincent said he
>>>> had some questions about the patch. That's as far as I know, maybe he
>>>> could give some feedback on the issue.
>>>> 
>>>> Let me reiterate my apologies again on this, it's not fair that this
>>>> has been ignored. I'll endeavour to make the process more transparent
>>>> in future, I hope this doesn't prevent you or any other contributors
>>>> from submitting patches.
>>>> 
>>>> Mehdi
>>>> 
>>>> 
>>>> On 28 February 2012 16:52, Glenn Adams <gl...@skynav.com> wrote:
>>>>> I support committing this patch, however I don't see an ICLA listed at [1]
>>>>> for Alexios. Alexios, if you have not submitted an ICLA [2], please do so.
>>>>> 
>>>>> I would be happy to apply the patch (if Mehdi doesn't have the time).
>>>>> 
>>>>> [1] http://people.apache.org/committer-index.html#unlistedclas
>>>>> [2] http://www.apache.org/licenses/icla.txt
>>>>> 
>>>>> 
>>>>> On Tue, Feb 28, 2012 at 6:27 AM, Alexios Giotis <alex.gio...@gmail.com>
>>>>> wrote:
>>>>>> 
>>>>>> Hi,
>>>>>> 
>>>>>> About 6 months ago, I had a deadlock issue that regularly stopped
>>>>>> production servers. While I was opening a bugzilla ticket, I found that 
>>>>>> this
>>>>>> was already reported back in 2009. This issue is still opened as it was
>>>>>> difficult to reproduce. On that issue, I added:
>>>>>> 
>>>>>> [1] An explanation of why a deadlock is possible.
>>>>>> [1] Stacktraces of deadlocked threads from a production server.
>>>>>> [2] A small unit test that adds a Thread.sleep() to the PropertyCache to
>>>>>> make it always reproducable.
>>>>>> [3] A patch solving this issue.
>>>>>> [4] Explanations of why the patch rewrites the existing PropertyCache
>>>>>> class.
>>>>>> 
>>>>>> This was then reviewed and unit tests were added [5]. On top of this, I
>>>>>> have committed the fix in my private branch and it works well on several 
>>>>>> big
>>>>>> production systems. This is as far as I can go before a FOP committer 
>>>>>> takes
>>>>>> it further. I am writing this because:
>>>>>> 
>>>>>> - Deadlocks should be fixed. When they occur, there is no way around 
>>>>>> them.
>>>>>> - The trunk is moving, the patch is aging and it will be more difficult 
>>>>>> to
>>>>>> apply it over time.
>>>>>> - It is discouraging for submitting more patches.
>>>>>> 
>>>>>> 
>>>>>> Alexios Giotis
>>>>>> 
>>>>>> 
>>>>>> 
>>>>>> 
>>>>>> [1] https://issues.apache.org/bugzilla/show_bug.cgi?id=46962#c3
>>>>>> [2] https://issues.apache.org/bugzilla/attachment.cgi?id=27342
>>>>>> [3] 
>>>>>> https://issues.apache.org/bugzilla/attachment.cgi?id=27477&action=diff
>>>>>> [4] https://issues.apache.org/bugzilla/show_bug.cgi?id=46962#c7
>>>>>> [5] https://issues.apache.org/bugzilla/show_bug.cgi?id=46962#c9
>>>>>> 
>>>>> 
>> 
> 

Reply via email to