Re: RFR for JDK-6772009 Intermittent test failure: java/util/concurrent/locks/ReentrantLock/CancelledLockLoops.java test failed with 'Completed != 2'

2014-01-07 Thread Sandeep Konchady
Hi David/Martin,

If you agree with Kalyan's fix for this issue, could one of you please sponsor 
the push.

Thanks,
Sandeep

On Dec 23, 2013, at 11:17 AM, srikalyan chandrashekar 
srikalyan.chandrashe...@oracle.com wrote:

 Hi David/Martin, could any one of you sponsor this change for me?
 
 ---
 Thanks
 kalyan
 
 On 12/20/2013 10:28 PM, David Holmes wrote:
 On 21/12/2013 4:19 AM, srikalyan wrote:
 Hi David, i retained only the changes to ITERS, ProbleMList.txt and
 upstream changes by Doug Lea(as pointed by Martin), could you please
 review the new change available here
 http://cr.openjdk.java.net/~srikchan/Regression/6772009-CancelledLockLoop-webrev/
  
 .
 
 Ok.
 
 Thanks,
 David
 
 -- 
 Thanks
 kalyan
 Ph: (408)-585-8040
 
 
 On 12/19/13, 8:10 PM, David Holmes wrote:
 Sorry Kalyan but I don't see the need for all the incidental changes
 if the primary change is to just increase the iterations. I also don't
 see why you need to do anything for BrokenBarrierException as it is
 not expected to happen and the test should just fail if it does.
 
 David
 
 On 10/12/2013 6:15 AM, srikalyan wrote:
 Hi David/Martin a gentle reminder for review.
 
 -- 
 Thanks
 kalyan
 Ph: (408)-585-8040
 
 
 On 12/2/13, 11:21 AM, srikalyan wrote:
 Hi David, Thanks for the review, the new webrev is hosted at
 http://cr.openjdk.java.net/~cl/host_for_kal/6772009-CancelledLockLoop/ 
 . Please see inline text.
 
 On 11/20/13, 6:35 PM, David Holmes wrote:
 On 21/11/2013 10:28 AM, Martin Buchholz wrote:
 I again tried and failed to reproduce a failure.  Even if I go whole
 hog
 and multiply TIMEOUT by 100 and divide ITERS by 100, the test
 continues to
 PASS.  Is it just me?!
 
 I think you are going the wrong way Martin - you want the timeout to
 be smaller than the time it takes to execute ITERS.
 
 I don't think there's any reason to make result long.  It's not even
 used
 except to inhibit hotspot optimizations.
 
 +private volatile long result = 17;//Can get int overflow,so
 using long
 
 Further the subsequent use of += is incorrect as it is not an atomic
 operation. Even if we don't care about the value, it looks bad.
 Made the necessary changes for atomic update.
 
 I'm not sure result must be updated if we get a
 BrokenBarrierException either. Probably harmless, but necessary?
 I retained it in the fix for completeness in updating the numbers,
 please let me know if you still think otherwise.
 
 need to fix spelling and spacing below.
 
 +barrier.await();//If a BrokeBarrierException happens
 here(due to
 
 There are a number of style issues with spacing around comments.
 Fixed the spelling error and styling issues.
 
 And I don't think this change is sufficient to claim co-author status
 with Doug either ;-)
 Removed the claim :)
 
 The additional tracing may be useful but seems stylistically
 different from the rest of the code.
 Retained the tracking to understand if it is again the timing issue
 which is the cause in an event of a failure, however i can remove it
 if you think it is not necessary (OR) include an alternate solution as
 you may want to suggest.
 
 Overall I'm suspicious that the changed timeout actually fixes
 anything - normally we need to add longer timeouts not shorter ones.
 Does this fail on a range of machines or only specific ones? Have we
 verified that the clocks/timers are behaving properly on those
 systems?
 Here the time out is not about waiting for threads to complete
 something but to be interrupted before being considered done, so we
 decreased the timeout. However we now chose to increase the number of
 iterations to 500 from 100(thanks to tristan for the
 suggestion) instead of decreasing the timeout as done earlier because
 the increasing iterations ensures the threads are busy for long time
 curtailing the need to touch the timeout.
 
 
 Thanks,
 David
 
 -- 
 Thanks
 kalyan
 Ph: (408)-585-8040
 
 
 
 
 
 On Wed, Nov 20, 2013 at 11:52 AM, srikalyan 
 srikalyan.chandrashe...@oracle.com wrote:
 
  Hi Martin , apologies for the delay , was trying to get help for
 hosting
 my webrev.  .  Please see inline text.
 
 
 On 11/19/13, 10:35 PM, Martin Buchholz wrote:
 
 Hi Kalyan,
 
  None of us can review your changes yet because you haven't given
 us a
 URL of your webrev.
 
 It is located here
 http://cr.openjdk.java.net/~cl/host_for_srikalyan_6772009_CancelledLockLoops/
  
 
 
 
 
  I've tried to make the jsr166 copy of CancelledLockLoops fail by
 adjusting ITERS and TIMEOUT radically up and down, but the test
 just keeps
 on passing for me.  Hints appreciated.
 
 Bump up the timeout to 500ms and you will see a failure (i can
 see it
 consistently on my machine Linux 64bit,8GBRAM,dual cores, with
 JDK 1.8
 latest any promoted build).
 
 
 
 On Tue, Nov 19, 2013 at 6:39 PM, srikalyan chandrashekar 
 srikalyan.chandrashe...@oracle.com wrote:
 
Suggested Fix:
 a) Decrease the timeout from 100 to 50ms which will ensure that
 the test
 will pass even on faster 

Re: RFR for JDK-6772009 Intermittent test failure: java/util/concurrent/locks/ReentrantLock/CancelledLockLoops.java test failed with 'Completed != 2'

2013-12-23 Thread srikalyan chandrashekar

Hi David/Martin, could any one of you sponsor this change for me?

---
Thanks
kalyan

On 12/20/2013 10:28 PM, David Holmes wrote:

On 21/12/2013 4:19 AM, srikalyan wrote:

Hi David, i retained only the changes to ITERS, ProbleMList.txt and
upstream changes by Doug Lea(as pointed by Martin), could you please
review the new change available here
http://cr.openjdk.java.net/~srikchan/Regression/6772009-CancelledLockLoop-webrev/ 


.


Ok.

Thanks,
David


--
Thanks
kalyan
Ph: (408)-585-8040


On 12/19/13, 8:10 PM, David Holmes wrote:

Sorry Kalyan but I don't see the need for all the incidental changes
if the primary change is to just increase the iterations. I also don't
see why you need to do anything for BrokenBarrierException as it is
not expected to happen and the test should just fail if it does.

David

On 10/12/2013 6:15 AM, srikalyan wrote:

Hi David/Martin a gentle reminder for review.

--
Thanks
kalyan
Ph: (408)-585-8040


On 12/2/13, 11:21 AM, srikalyan wrote:

Hi David, Thanks for the review, the new webrev is hosted at
http://cr.openjdk.java.net/~cl/host_for_kal/6772009-CancelledLockLoop/ 


. Please see inline text.

On 11/20/13, 6:35 PM, David Holmes wrote:

On 21/11/2013 10:28 AM, Martin Buchholz wrote:
I again tried and failed to reproduce a failure.  Even if I go 
whole

hog
and multiply TIMEOUT by 100 and divide ITERS by 100, the test
continues to
PASS.  Is it just me?!


I think you are going the wrong way Martin - you want the timeout to
be smaller than the time it takes to execute ITERS.

I don't think there's any reason to make result long.  It's not 
even

used
except to inhibit hotspot optimizations.

+private volatile long result = 17;//Can get int 
overflow,so

using long


Further the subsequent use of += is incorrect as it is not an atomic
operation. Even if we don't care about the value, it looks bad.

Made the necessary changes for atomic update.


I'm not sure result must be updated if we get a
BrokenBarrierException either. Probably harmless, but necessary?

I retained it in the fix for completeness in updating the numbers,
please let me know if you still think otherwise.



need to fix spelling and spacing below.

+barrier.await();//If a BrokeBarrierException 
happens

here(due to


There are a number of style issues with spacing around comments.

Fixed the spelling error and styling issues.


And I don't think this change is sufficient to claim co-author 
status

with Doug either ;-)

Removed the claim :)


The additional tracing may be useful but seems stylistically
different from the rest of the code.

Retained the tracking to understand if it is again the timing issue
which is the cause in an event of a failure, however i can remove it
if you think it is not necessary (OR) include an alternate 
solution as

you may want to suggest.


Overall I'm suspicious that the changed timeout actually fixes
anything - normally we need to add longer timeouts not shorter ones.
Does this fail on a range of machines or only specific ones? Have we
verified that the clocks/timers are behaving properly on those
systems?

Here the time out is not about waiting for threads to complete
something but to be interrupted before being considered done, so we
decreased the timeout. However we now chose to increase the number of
iterations to 500 from 100(thanks to tristan for the
suggestion) instead of decreasing the timeout as done earlier because
the increasing iterations ensures the threads are busy for long time
curtailing the need to touch the timeout.



Thanks,
David


--
Thanks
kalyan
Ph: (408)-585-8040







On Wed, Nov 20, 2013 at 11:52 AM, srikalyan 
srikalyan.chandrashe...@oracle.com wrote:


  Hi Martin , apologies for the delay , was trying to get help for
hosting
my webrev.  .  Please see inline text.


On 11/19/13, 10:35 PM, Martin Buchholz wrote:

Hi Kalyan,

  None of us can review your changes yet because you haven't given
us a
URL of your webrev.

It is located here
http://cr.openjdk.java.net/~cl/host_for_srikalyan_6772009_CancelledLockLoops/ 






  I've tried to make the jsr166 copy of CancelledLockLoops fail by
adjusting ITERS and TIMEOUT radically up and down, but the test
just keeps
on passing for me.  Hints appreciated.

Bump up the timeout to 500ms and you will see a failure (i can
see it
consistently on my machine Linux 64bit,8GBRAM,dual cores, with
JDK 1.8
latest any promoted build).



On Tue, Nov 19, 2013 at 6:39 PM, srikalyan chandrashekar 
srikalyan.chandrashe...@oracle.com wrote:


Suggested Fix:

a) Decrease the timeout from 100 to 50ms which will ensure that
the test
will pass even on faster machines




  This doesn't look like a permanent fix - it just makes the
failing case
rarer.

Thats true , the other way is to make the main thread wait on
TIMEOUT
after firing the interrupts instead of other way round, but that
would be
over-optimization which probably is not desirable as well.  The 50
ms was
arrived at empirically 

Re: RFR for JDK-6772009 Intermittent test failure: java/util/concurrent/locks/ReentrantLock/CancelledLockLoops.java test failed with 'Completed != 2'

2013-12-20 Thread srikalyan
Hi David, i retained only the changes to ITERS, ProbleMList.txt and 
upstream changes by Doug Lea(as pointed by Martin), could you please 
review the new change available here 
http://cr.openjdk.java.net/~srikchan/Regression/6772009-CancelledLockLoop-webrev/ 
.


--
Thanks
kalyan
Ph: (408)-585-8040


On 12/19/13, 8:10 PM, David Holmes wrote:
Sorry Kalyan but I don't see the need for all the incidental changes 
if the primary change is to just increase the iterations. I also don't 
see why you need to do anything for BrokenBarrierException as it is 
not expected to happen and the test should just fail if it does.


David

On 10/12/2013 6:15 AM, srikalyan wrote:

Hi David/Martin a gentle reminder for review.

--
Thanks
kalyan
Ph: (408)-585-8040


On 12/2/13, 11:21 AM, srikalyan wrote:

Hi David, Thanks for the review, the new webrev is hosted at
http://cr.openjdk.java.net/~cl/host_for_kal/6772009-CancelledLockLoop/
. Please see inline text.

On 11/20/13, 6:35 PM, David Holmes wrote:

On 21/11/2013 10:28 AM, Martin Buchholz wrote:

I again tried and failed to reproduce a failure.  Even if I go whole
hog
and multiply TIMEOUT by 100 and divide ITERS by 100, the test
continues to
PASS.  Is it just me?!


I think you are going the wrong way Martin - you want the timeout to
be smaller than the time it takes to execute ITERS.


I don't think there's any reason to make result long.  It's not even
used
except to inhibit hotspot optimizations.

+private volatile long result = 17;//Can get int overflow,so
using long


Further the subsequent use of += is incorrect as it is not an atomic
operation. Even if we don't care about the value, it looks bad.

Made the necessary changes for atomic update.


I'm not sure result must be updated if we get a
BrokenBarrierException either. Probably harmless, but necessary?

I retained it in the fix for completeness in updating the numbers,
please let me know if you still think otherwise.



need to fix spelling and spacing below.

+barrier.await();//If a BrokeBarrierException happens
here(due to


There are a number of style issues with spacing around comments.

Fixed the spelling error and styling issues.


And I don't think this change is sufficient to claim co-author status
with Doug either ;-)

Removed the claim :)


The additional tracing may be useful but seems stylistically
different from the rest of the code.

Retained the tracking to understand if it is again the timing issue
which is the cause in an event of a failure, however i can remove it
if you think it is not necessary (OR) include an alternate solution as
you may want to suggest.


Overall I'm suspicious that the changed timeout actually fixes
anything - normally we need to add longer timeouts not shorter ones.
Does this fail on a range of machines or only specific ones? Have we
verified that the clocks/timers are behaving properly on those 
systems?

Here the time out is not about waiting for threads to complete
something but to be interrupted before being considered done, so we
decreased the timeout. However we now chose to increase the number of
iterations to 500 from 100(thanks to tristan for the
suggestion) instead of decreasing the timeout as done earlier because
the increasing iterations ensures the threads are busy for long time
curtailing the need to touch the timeout.



Thanks,
David


--
Thanks
kalyan
Ph: (408)-585-8040







On Wed, Nov 20, 2013 at 11:52 AM, srikalyan 
srikalyan.chandrashe...@oracle.com wrote:


  Hi Martin , apologies for the delay , was trying to get help for
hosting
my webrev.  .  Please see inline text.


On 11/19/13, 10:35 PM, Martin Buchholz wrote:

Hi Kalyan,

  None of us can review your changes yet because you haven't given
us a
URL of your webrev.

It is located here
http://cr.openjdk.java.net/~cl/host_for_srikalyan_6772009_CancelledLockLoops/ 





  I've tried to make the jsr166 copy of CancelledLockLoops fail by
adjusting ITERS and TIMEOUT radically up and down, but the test
just keeps
on passing for me.  Hints appreciated.

Bump up the timeout to 500ms and you will see a failure (i can 
see it
consistently on my machine Linux 64bit,8GBRAM,dual cores, with 
JDK 1.8

latest any promoted build).



On Tue, Nov 19, 2013 at 6:39 PM, srikalyan chandrashekar 
srikalyan.chandrashe...@oracle.com wrote:


Suggested Fix:

a) Decrease the timeout from 100 to 50ms which will ensure that
the test
will pass even on faster machines




  This doesn't look like a permanent fix - it just makes the
failing case
rarer.

Thats true , the other way is to make the main thread wait on 
TIMEOUT

after firing the interrupts instead of other way round, but that
would be
over-optimization which probably is not desirable as well.  The 50
ms was
arrived at empirically after running several 100 times on multiple
configurations and did not cause failures.

--
Thanks
kalyan
Ph: (408)-585-8040





Re: RFR for JDK-6772009 Intermittent test failure: java/util/concurrent/locks/ReentrantLock/CancelledLockLoops.java test failed with 'Completed != 2'

2013-12-20 Thread David Holmes

On 21/12/2013 4:19 AM, srikalyan wrote:

Hi David, i retained only the changes to ITERS, ProbleMList.txt and
upstream changes by Doug Lea(as pointed by Martin), could you please
review the new change available here
http://cr.openjdk.java.net/~srikchan/Regression/6772009-CancelledLockLoop-webrev/
.


Ok.

Thanks,
David


--
Thanks
kalyan
Ph: (408)-585-8040


On 12/19/13, 8:10 PM, David Holmes wrote:

Sorry Kalyan but I don't see the need for all the incidental changes
if the primary change is to just increase the iterations. I also don't
see why you need to do anything for BrokenBarrierException as it is
not expected to happen and the test should just fail if it does.

David

On 10/12/2013 6:15 AM, srikalyan wrote:

Hi David/Martin a gentle reminder for review.

--
Thanks
kalyan
Ph: (408)-585-8040


On 12/2/13, 11:21 AM, srikalyan wrote:

Hi David, Thanks for the review, the new webrev is hosted at
http://cr.openjdk.java.net/~cl/host_for_kal/6772009-CancelledLockLoop/
. Please see inline text.

On 11/20/13, 6:35 PM, David Holmes wrote:

On 21/11/2013 10:28 AM, Martin Buchholz wrote:

I again tried and failed to reproduce a failure.  Even if I go whole
hog
and multiply TIMEOUT by 100 and divide ITERS by 100, the test
continues to
PASS.  Is it just me?!


I think you are going the wrong way Martin - you want the timeout to
be smaller than the time it takes to execute ITERS.


I don't think there's any reason to make result long.  It's not even
used
except to inhibit hotspot optimizations.

+private volatile long result = 17;//Can get int overflow,so
using long


Further the subsequent use of += is incorrect as it is not an atomic
operation. Even if we don't care about the value, it looks bad.

Made the necessary changes for atomic update.


I'm not sure result must be updated if we get a
BrokenBarrierException either. Probably harmless, but necessary?

I retained it in the fix for completeness in updating the numbers,
please let me know if you still think otherwise.



need to fix spelling and spacing below.

+barrier.await();//If a BrokeBarrierException happens
here(due to


There are a number of style issues with spacing around comments.

Fixed the spelling error and styling issues.


And I don't think this change is sufficient to claim co-author status
with Doug either ;-)

Removed the claim :)


The additional tracing may be useful but seems stylistically
different from the rest of the code.

Retained the tracking to understand if it is again the timing issue
which is the cause in an event of a failure, however i can remove it
if you think it is not necessary (OR) include an alternate solution as
you may want to suggest.


Overall I'm suspicious that the changed timeout actually fixes
anything - normally we need to add longer timeouts not shorter ones.
Does this fail on a range of machines or only specific ones? Have we
verified that the clocks/timers are behaving properly on those
systems?

Here the time out is not about waiting for threads to complete
something but to be interrupted before being considered done, so we
decreased the timeout. However we now chose to increase the number of
iterations to 500 from 100(thanks to tristan for the
suggestion) instead of decreasing the timeout as done earlier because
the increasing iterations ensures the threads are busy for long time
curtailing the need to touch the timeout.



Thanks,
David


--
Thanks
kalyan
Ph: (408)-585-8040







On Wed, Nov 20, 2013 at 11:52 AM, srikalyan 
srikalyan.chandrashe...@oracle.com wrote:


  Hi Martin , apologies for the delay , was trying to get help for
hosting
my webrev.  .  Please see inline text.


On 11/19/13, 10:35 PM, Martin Buchholz wrote:

Hi Kalyan,

  None of us can review your changes yet because you haven't given
us a
URL of your webrev.

It is located here
http://cr.openjdk.java.net/~cl/host_for_srikalyan_6772009_CancelledLockLoops/




  I've tried to make the jsr166 copy of CancelledLockLoops fail by
adjusting ITERS and TIMEOUT radically up and down, but the test
just keeps
on passing for me.  Hints appreciated.

Bump up the timeout to 500ms and you will see a failure (i can
see it
consistently on my machine Linux 64bit,8GBRAM,dual cores, with
JDK 1.8
latest any promoted build).



On Tue, Nov 19, 2013 at 6:39 PM, srikalyan chandrashekar 
srikalyan.chandrashe...@oracle.com wrote:


Suggested Fix:

a) Decrease the timeout from 100 to 50ms which will ensure that
the test
will pass even on faster machines




  This doesn't look like a permanent fix - it just makes the
failing case
rarer.

Thats true , the other way is to make the main thread wait on
TIMEOUT
after firing the interrupts instead of other way round, but that
would be
over-optimization which probably is not desirable as well.  The 50
ms was
arrived at empirically after running several 100 times on multiple
configurations and did not cause failures.

--
Thanks
kalyan
Ph: (408)-585-8040





Re: RFR for JDK-6772009 Intermittent test failure: java/util/concurrent/locks/ReentrantLock/CancelledLockLoops.java test failed with 'Completed != 2'

2013-12-19 Thread David Holmes
Sorry Kalyan but I don't see the need for all the incidental changes if 
the primary change is to just increase the iterations. I also don't see 
why you need to do anything for BrokenBarrierException as it is not 
expected to happen and the test should just fail if it does.


David

On 10/12/2013 6:15 AM, srikalyan wrote:

Hi David/Martin a gentle reminder for review.

--
Thanks
kalyan
Ph: (408)-585-8040


On 12/2/13, 11:21 AM, srikalyan wrote:

Hi David, Thanks for the review, the new webrev is hosted at
http://cr.openjdk.java.net/~cl/host_for_kal/6772009-CancelledLockLoop/
. Please see inline text.

On 11/20/13, 6:35 PM, David Holmes wrote:

On 21/11/2013 10:28 AM, Martin Buchholz wrote:

I again tried and failed to reproduce a failure.  Even if I go whole
hog
and multiply TIMEOUT by 100 and divide ITERS by 100, the test
continues to
PASS.  Is it just me?!


I think you are going the wrong way Martin - you want the timeout to
be smaller than the time it takes to execute ITERS.


I don't think there's any reason to make result long.  It's not even
used
except to inhibit hotspot optimizations.

+private volatile long result = 17;//Can get int overflow,so
using long


Further the subsequent use of += is incorrect as it is not an atomic
operation. Even if we don't care about the value, it looks bad.

Made the necessary changes for atomic update.


I'm not sure result must be updated if we get a
BrokenBarrierException either. Probably harmless, but necessary?

I retained it in the fix for completeness in updating the numbers,
please let me know if you still think otherwise.



need to fix spelling and spacing below.

+barrier.await();//If a BrokeBarrierException happens
here(due to


There are a number of style issues with spacing around comments.

Fixed the spelling error and styling issues.


And I don't think this change is sufficient to claim co-author status
with Doug either ;-)

Removed the claim :)


The additional tracing may be useful but seems stylistically
different from the rest of the code.

Retained the tracking to understand if it is again the timing issue
which is the cause in an event of a failure, however i can remove it
if you think it is not necessary (OR) include an alternate solution as
you may want to suggest.


Overall I'm suspicious that the changed timeout actually fixes
anything - normally we need to add longer timeouts not shorter ones.
Does this fail on a range of machines or only specific ones? Have we
verified that the clocks/timers are behaving properly on those systems?

Here the time out is not about waiting for threads to complete
something but to be interrupted before being considered done, so we
decreased the timeout. However we now chose to increase the number of
iterations to 500 from 100(thanks to tristan for the
suggestion) instead of decreasing the timeout as done earlier because
the increasing iterations ensures the threads are busy for long time
curtailing the need to touch the timeout.



Thanks,
David


--
Thanks
kalyan
Ph: (408)-585-8040







On Wed, Nov 20, 2013 at 11:52 AM, srikalyan 
srikalyan.chandrashe...@oracle.com wrote:


  Hi Martin , apologies for the delay , was trying to get help for
hosting
my webrev.  .  Please see inline text.


On 11/19/13, 10:35 PM, Martin Buchholz wrote:

Hi Kalyan,

  None of us can review your changes yet because you haven't given
us a
URL of your webrev.

It is located here
http://cr.openjdk.java.net/~cl/host_for_srikalyan_6772009_CancelledLockLoops/



  I've tried to make the jsr166 copy of CancelledLockLoops fail by
adjusting ITERS and TIMEOUT radically up and down, but the test
just keeps
on passing for me.  Hints appreciated.

Bump up the timeout to 500ms and you will see a failure (i can see it
consistently on my machine Linux 64bit,8GBRAM,dual cores, with JDK 1.8
latest any promoted build).



On Tue, Nov 19, 2013 at 6:39 PM, srikalyan chandrashekar 
srikalyan.chandrashe...@oracle.com wrote:


Suggested Fix:

a) Decrease the timeout from 100 to 50ms which will ensure that
the test
will pass even on faster machines




  This doesn't look like a permanent fix - it just makes the
failing case
rarer.

Thats true , the other way is to make the main thread wait on TIMEOUT
after firing the interrupts instead of other way round, but that
would be
over-optimization which probably is not desirable as well.  The 50
ms was
arrived at empirically after running several 100 times on multiple
configurations and did not cause failures.

--
Thanks
kalyan
Ph: (408)-585-8040





Re: RFR for JDK-6772009 Intermittent test failure: java/util/concurrent/locks/ReentrantLock/CancelledLockLoops.java test failed with 'Completed != 2'

2013-12-09 Thread srikalyan

Hi David/Martin a gentle reminder for review.

--
Thanks
kalyan
Ph: (408)-585-8040


On 12/2/13, 11:21 AM, srikalyan wrote:
Hi David, Thanks for the review, the new webrev is hosted at 
http://cr.openjdk.java.net/~cl/host_for_kal/6772009-CancelledLockLoop/ 
. Please see inline text.


On 11/20/13, 6:35 PM, David Holmes wrote:

On 21/11/2013 10:28 AM, Martin Buchholz wrote:
I again tried and failed to reproduce a failure.  Even if I go whole 
hog
and multiply TIMEOUT by 100 and divide ITERS by 100, the test 
continues to

PASS.  Is it just me?!


I think you are going the wrong way Martin - you want the timeout to 
be smaller than the time it takes to execute ITERS.


I don't think there's any reason to make result long.  It's not even 
used

except to inhibit hotspot optimizations.

+private volatile long result = 17;//Can get int overflow,so 
using long


Further the subsequent use of += is incorrect as it is not an atomic 
operation. Even if we don't care about the value, it looks bad.

Made the necessary changes for atomic update.


I'm not sure result must be updated if we get a 
BrokenBarrierException either. Probably harmless, but necessary?
I retained it in the fix for completeness in updating the numbers, 
please let me know if you still think otherwise.



need to fix spelling and spacing below.

+barrier.await();//If a BrokeBarrierException happens
here(due to


There are a number of style issues with spacing around comments.

Fixed the spelling error and styling issues.


And I don't think this change is sufficient to claim co-author status 
with Doug either ;-)

Removed the claim :)


The additional tracing may be useful but seems stylistically 
different from the rest of the code.
Retained the tracking to understand if it is again the timing issue 
which is the cause in an event of a failure, however i can remove it 
if you think it is not necessary (OR) include an alternate solution as 
you may want to suggest.


Overall I'm suspicious that the changed timeout actually fixes 
anything - normally we need to add longer timeouts not shorter ones. 
Does this fail on a range of machines or only specific ones? Have we 
verified that the clocks/timers are behaving properly on those systems?
Here the time out is not about waiting for threads to complete 
something but to be interrupted before being considered done, so we 
decreased the timeout. However we now chose to increase the number of 
iterations to 500 from 100(thanks to tristan for the 
suggestion) instead of decreasing the timeout as done earlier because 
the increasing iterations ensures the threads are busy for long time 
curtailing the need to touch the timeout.




Thanks,
David


--
Thanks
kalyan
Ph: (408)-585-8040







On Wed, Nov 20, 2013 at 11:52 AM, srikalyan 
srikalyan.chandrashe...@oracle.com wrote:

  Hi Martin , apologies for the delay , was trying to get help for 
hosting

my webrev.  .  Please see inline text.


On 11/19/13, 10:35 PM, Martin Buchholz wrote:

Hi Kalyan,

  None of us can review your changes yet because you haven't given 
us a

URL of your webrev.

It is located here
http://cr.openjdk.java.net/~cl/host_for_srikalyan_6772009_CancelledLockLoops/ 




  I've tried to make the jsr166 copy of CancelledLockLoops fail by
adjusting ITERS and TIMEOUT radically up and down, but the test 
just keeps

on passing for me.  Hints appreciated.

Bump up the timeout to 500ms and you will see a failure (i can see it
consistently on my machine Linux 64bit,8GBRAM,dual cores, with JDK 1.8
latest any promoted build).



On Tue, Nov 19, 2013 at 6:39 PM, srikalyan chandrashekar 
srikalyan.chandrashe...@oracle.com wrote:


Suggested Fix:
a) Decrease the timeout from 100 to 50ms which will ensure that 
the test

will pass even on faster machines



  This doesn't look like a permanent fix - it just makes the 
failing case

rarer.

Thats true , the other way is to make the main thread wait on TIMEOUT
after firing the interrupts instead of other way round, but that 
would be
over-optimization which probably is not desirable as well.  The 50 
ms was

arrived at empirically after running several 100 times on multiple
configurations and did not cause failures.

--
Thanks
kalyan
Ph: (408)-585-8040





Re: RFR for JDK-6772009 Intermittent test failure: java/util/concurrent/locks/ReentrantLock/CancelledLockLoops.java test failed with 'Completed != 2'

2013-12-02 Thread srikalyan
Hi David, Thanks for the review, the new webrev is hosted at 
http://cr.openjdk.java.net/~cl/host_for_kal/6772009-CancelledLockLoop/ . 
Please see inline text.


On 11/20/13, 6:35 PM, David Holmes wrote:

On 21/11/2013 10:28 AM, Martin Buchholz wrote:

I again tried and failed to reproduce a failure.  Even if I go whole hog
and multiply TIMEOUT by 100 and divide ITERS by 100, the test 
continues to

PASS.  Is it just me?!


I think you are going the wrong way Martin - you want the timeout to 
be smaller than the time it takes to execute ITERS.


I don't think there's any reason to make result long.  It's not even 
used

except to inhibit hotspot optimizations.

+private volatile long result = 17;//Can get int overflow,so 
using long


Further the subsequent use of += is incorrect as it is not an atomic 
operation. Even if we don't care about the value, it looks bad.

Made the necessary changes for atomic update.


I'm not sure result must be updated if we get a BrokenBarrierException 
either. Probably harmless, but necessary?
I retained it in the fix for completeness in updating the numbers, 
please let me know if you still think otherwise.



need to fix spelling and spacing below.

+barrier.await();//If a BrokeBarrierException happens
here(due to


There are a number of style issues with spacing around comments.

Fixed the spelling error and styling issues.


And I don't think this change is sufficient to claim co-author status 
with Doug either ;-)

Removed the claim :)


The additional tracing may be useful but seems stylistically different 
from the rest of the code.
Retained the tracking to understand if it is again the timing issue 
which is the cause in an event of a failure, however i can remove it if 
you think it is not necessary (OR) include an alternate solution as you 
may want to suggest.


Overall I'm suspicious that the changed timeout actually fixes 
anything - normally we need to add longer timeouts not shorter ones. 
Does this fail on a range of machines or only specific ones? Have we 
verified that the clocks/timers are behaving properly on those systems?
Here the time out is not about waiting for threads to complete something 
but to be interrupted before being considered done, so we decreased 
the timeout. However we now chose to increase the number of iterations 
to 500 from 100(thanks to tristan for the suggestion) instead of 
decreasing the timeout as done earlier because the increasing iterations 
ensures the threads are busy for long time curtailing the need to touch 
the timeout.




Thanks,
David


--
Thanks
kalyan
Ph: (408)-585-8040







On Wed, Nov 20, 2013 at 11:52 AM, srikalyan 
srikalyan.chandrashe...@oracle.com wrote:

  Hi Martin , apologies for the delay , was trying to get help for 
hosting

my webrev.  .  Please see inline text.


On 11/19/13, 10:35 PM, Martin Buchholz wrote:

Hi Kalyan,

  None of us can review your changes yet because you haven't given us a
URL of your webrev.

It is located here
http://cr.openjdk.java.net/~cl/host_for_srikalyan_6772009_CancelledLockLoops/ 




  I've tried to make the jsr166 copy of CancelledLockLoops fail by
adjusting ITERS and TIMEOUT radically up and down, but the test just 
keeps

on passing for me.  Hints appreciated.

Bump up the timeout to 500ms and you will see a failure (i can see it
consistently on my machine Linux 64bit,8GBRAM,dual cores, with JDK 1.8
latest any promoted build).



On Tue, Nov 19, 2013 at 6:39 PM, srikalyan chandrashekar 
srikalyan.chandrashe...@oracle.com wrote:


Suggested Fix:
a) Decrease the timeout from 100 to 50ms which will ensure that 
the test

will pass even on faster machines



  This doesn't look like a permanent fix - it just makes the failing 
case

rarer.

Thats true , the other way is to make the main thread wait on TIMEOUT
after firing the interrupts instead of other way round, but that 
would be
over-optimization which probably is not desirable as well.  The 50 
ms was

arrived at empirically after running several 100 times on multiple
configurations and did not cause failures.

--
Thanks
kalyan
Ph: (408)-585-8040





Re: RFR for JDK-6772009 Intermittent test failure: java/util/concurrent/locks/ReentrantLock/CancelledLockLoops.java test failed with 'Completed != 2'

2013-11-20 Thread srikalyan
Hi Martin , apologies for the delay , was trying to get help for hosting 
my webrev.  .  Please see inline text.


On 11/19/13, 10:35 PM, Martin Buchholz wrote:

Hi Kalyan,

None of us can review your changes yet because you haven't given us a 
URL of your webrev.
It is located here 
http://cr.openjdk.java.net/~cl/host_for_srikalyan_6772009_CancelledLockLoops/ 



I've tried to make the jsr166 copy of CancelledLockLoops fail by 
adjusting ITERS and TIMEOUT radically up and down, but the test just 
keeps on passing for me.  Hints appreciated.
Bump up the timeout to 500ms and you will see a failure (i can see it 
consistently on my machine Linux 64bit,8GBRAM,dual cores, with JDK 1.8 
latest any promoted build).



On Tue, Nov 19, 2013 at 6:39 PM, srikalyan chandrashekar 
srikalyan.chandrashe...@oracle.com 
mailto:srikalyan.chandrashe...@oracle.com wrote:



Suggested Fix:
a) Decrease the timeout from 100 to 50ms which will ensure
that the test will pass even on faster machines



This doesn't look like a permanent fix - it just makes the failing 
case rarer.
Thats true , the other way is to make the main thread wait on TIMEOUT 
after firing the interrupts instead of other way round, but that would 
be over-optimization which probably is not desirable as well.  The 50 ms 
was arrived at empirically after running several 100 times on multiple 
configurations and did not cause failures.


--
Thanks
kalyan
Ph: (408)-585-8040




Re: RFR for JDK-6772009 Intermittent test failure: java/util/concurrent/locks/ReentrantLock/CancelledLockLoops.java test failed with 'Completed != 2'

2013-11-20 Thread srikalyan
Hi David , webrev is hosted here 
http://cr.openjdk.java.net/~cl/host_for_srikalyan_6772009_CancelledLockLoops/ 
.


--
Thanks
kalyan
Ph: (408)-585-8040


On 11/19/13, 11:03 AM, David Holmes wrote:

Hi,

Attachments are stripped. Please post on cr.openjdk.java.net (get a 
colleague to host this if you don't have an account yet.)


Thanks,
David

On 19/11/2013 4:12 PM, srikalyan chandrashekar wrote:

Hi all, I am working on bug JDK-6772009
https://bugs.openjdk.java.net/browse/JDK-6772009 .

Root Cause:
The timeout value gives too much grace period on faster machines on
which the TO BE INTERRUPTED threads complete faster before being
interrupted at right time.

Suggested Fix:
a) Decrease the timeout from 100 to 50ms which will ensure that the test
will pass even on faster machines , and ensures the threads will be
canceled when running and anyways there is a Barrier to ensure the test
threads all complete gracefully.
Miscellaneous fixes
b) Convert result from int to long(possible integer overflow otherwise)
c) Catch BrokenBarrierException in more granular fashion in
ReentrantLockLoop to update and print the results (which will be missed
otherwise)
Add more diagnostics
d) Assign names to threads
e) Print which threads update the 'completed' variable.

I have attached the webrev for convenience as the changes are
interleaved and is best viewed as sdiff.
Please let me know if you have any comments or suggestions.



Thank you



Re: RFR for JDK-6772009 Intermittent test failure: java/util/concurrent/locks/ReentrantLock/CancelledLockLoops.java test failed with 'Completed != 2'

2013-11-20 Thread David Holmes

On 21/11/2013 10:28 AM, Martin Buchholz wrote:

I again tried and failed to reproduce a failure.  Even if I go whole hog
and multiply TIMEOUT by 100 and divide ITERS by 100, the test continues to
PASS.  Is it just me?!


I think you are going the wrong way Martin - you want the timeout to be 
smaller than the time it takes to execute ITERS.



I don't think there's any reason to make result long.  It's not even used
except to inhibit hotspot optimizations.

+private volatile long result = 17;//Can get int overflow,so using long


Further the subsequent use of += is incorrect as it is not an atomic 
operation. Even if we don't care about the value, it looks bad.


I'm not sure result must be updated if we get a BrokenBarrierException 
either. Probably harmless, but necessary?



need to fix spelling and spacing below.

+barrier.await();//If a BrokeBarrierException happens
here(due to


There are a number of style issues with spacing around comments.

And I don't think this change is sufficient to claim co-author status 
with Doug either ;-)


The additional tracing may be useful but seems stylistically different 
from the rest of the code.


Overall I'm suspicious that the changed timeout actually fixes anything 
- normally we need to add longer timeouts not shorter ones. Does this 
fail on a range of machines or only specific ones? Have we verified that 
the clocks/timers are behaving properly on those systems?


Thanks,
David





On Wed, Nov 20, 2013 at 11:52 AM, srikalyan 
srikalyan.chandrashe...@oracle.com wrote:


  Hi Martin , apologies for the delay , was trying to get help for hosting
my webrev.  .  Please see inline text.


On 11/19/13, 10:35 PM, Martin Buchholz wrote:

Hi Kalyan,

  None of us can review your changes yet because you haven't given us a
URL of your webrev.

It is located here
http://cr.openjdk.java.net/~cl/host_for_srikalyan_6772009_CancelledLockLoops/


  I've tried to make the jsr166 copy of CancelledLockLoops fail by
adjusting ITERS and TIMEOUT radically up and down, but the test just keeps
on passing for me.  Hints appreciated.

Bump up the timeout to 500ms and you will see a failure (i can see it
consistently on my machine Linux 64bit,8GBRAM,dual cores, with JDK 1.8
latest any promoted build).



On Tue, Nov 19, 2013 at 6:39 PM, srikalyan chandrashekar 
srikalyan.chandrashe...@oracle.com wrote:


Suggested Fix:

a) Decrease the timeout from 100 to 50ms which will ensure that the test
will pass even on faster machines




  This doesn't look like a permanent fix - it just makes the failing case
rarer.

Thats true , the other way is to make the main thread wait on TIMEOUT
after firing the interrupts instead of other way round, but that would be
over-optimization which probably is not desirable as well.  The 50 ms was
arrived at empirically after running several 100 times on multiple
configurations and did not cause failures.

--
Thanks
kalyan
Ph: (408)-585-8040





Re: RFR for JDK-6772009 Intermittent test failure: java/util/concurrent/locks/ReentrantLock/CancelledLockLoops.java test failed with 'Completed != 2'

2013-11-19 Thread David Holmes

Hi,

Attachments are stripped. Please post on cr.openjdk.java.net (get a 
colleague to host this if you don't have an account yet.)


Thanks,
David

On 19/11/2013 4:12 PM, srikalyan chandrashekar wrote:

Hi all, I am working on bug JDK-6772009
https://bugs.openjdk.java.net/browse/JDK-6772009 .

Root Cause:
The timeout value gives too much grace period on faster machines on
which the TO BE INTERRUPTED threads complete faster before being
interrupted at right time.

Suggested Fix:
a) Decrease the timeout from 100 to 50ms which will ensure that the test
will pass even on faster machines , and ensures the threads will be
canceled when running and anyways there is a Barrier to ensure the test
threads all complete gracefully.
Miscellaneous fixes
b) Convert result from int to long(possible integer overflow otherwise)
c) Catch BrokenBarrierException in more granular fashion in
ReentrantLockLoop to update and print the results (which will be missed
otherwise)
Add more diagnostics
d) Assign names to threads
e) Print which threads update the 'completed' variable.

I have attached the webrev for convenience as the changes are
interleaved and is best viewed as sdiff.
Please let me know if you have any comments or suggestions.



Thank you



Re: RFR for JDK-6772009 Intermittent test failure: java/util/concurrent/locks/ReentrantLock/CancelledLockLoops.java test failed with 'Completed != 2'

2013-11-19 Thread srikalyan chandrashekar
Hi Martin, i incorporated the recent changes from the pointer as well. I 
have reproduced the failure, the logs of which are attached to the bug 
JDK-6772009 https://bugs.openjdk.java.net/browse/JDK-6772009 . The 
failed log is especially interesting .


--
Thanks
kalyan

On 11/18/13 10:15 PM, Martin Buchholz wrote:

Thanks for working on this.
There have been some recent upstream changes to this test as well. 
 Please incorporate them.

http://gee.cs.oswego.edu/cgi-bin/viewcvs.cgi/jsr166/src/test/jtreg/util/concurrent/locks/ReentrantLock/CancelledLockLoops.java?view=co
The jsr166 maintainers haven't been able to reproduce any failures in 
this test.  Do you have any hints that might help us?




On Mon, Nov 18, 2013 at 10:12 PM, srikalyan chandrashekar 
srikalyan.chandrashe...@oracle.com 
mailto:srikalyan.chandrashe...@oracle.com wrote:


Hi all, I am working on bug JDK-6772009
https://bugs.openjdk.java.net/browse/JDK-6772009 .

Root Cause:
The timeout value gives too much grace period on faster machines
on which the TO BE INTERRUPTED threads complete faster before
being interrupted at right time.

Suggested Fix:
a) Decrease the timeout from 100 to 50ms which will ensure that
the test will pass even on faster machines , and ensures the
threads will be canceled when running and anyways there is a
Barrier to ensure the test threads all complete gracefully.
Miscellaneous fixes
b) Convert result from int to long(possible integer overflow
otherwise)
c) Catch BrokenBarrierException in more granular fashion in
ReentrantLockLoop to update and print the results (which will be
missed otherwise)
Add more diagnostics
d) Assign names to threads
e) Print which threads update the 'completed' variable.

I have attached the webrev for convenience as the changes are
interleaved and is best viewed as sdiff.
Please let me know if you have any comments or suggestions.



Thank you

-- 


--
Thanks
kalyan