Re: Review Request 41915: Kill flaky TaskObserverTest.

2016-01-05 Thread Bill Farner

---
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/41915/#review112875
---

Ship it!


Ship It!

- Bill Farner


On Jan. 4, 2016, 9:15 p.m., John Sirois wrote:
> 
> ---
> This is an automatically generated e-mail. To reply, visit:
> https://reviews.apache.org/r/41915/
> ---
> 
> (Updated Jan. 4, 2016, 9:15 p.m.)
> 
> 
> Review request for Aurora, Maxim Khutornenko and Bill Farner.
> 
> 
> Bugs: AURORA-1570
> https://issues.apache.org/jira/browse/AURORA-1570
> 
> 
> Repository: aurora
> 
> 
> Description
> ---
> 
> Previously, a mock threading.Event was waited on in one thread
> and the count of waits was read in another thread.  Most thread
> memory models do not guaranty reads are fresh in this scenario
> unless there is a memory barrier of some sort forcing per-cpu
> caches to be flushed.
> 
> Since the test really only verified correct conversion of a poll
> interval to fractional seconds - kill the test as not pulling its
> weight.
> 
>  src/test/python/apache/thermos/observer/BUILD |  1 +
>  src/test/python/apache/thermos/observer/test_task_observer.py | 45 
> -
>  2 files changed, 1 insertion(+), 45 deletions(-)
> 
> 
> Diffs
> -
> 
>   src/test/python/apache/thermos/observer/BUILD 
> f3f697c35ee5473072171926923459a4dde15545 
>   src/test/python/apache/thermos/observer/test_task_observer.py 
> ace15c5305e75fac3a82971f4d71b92bcb37bafc 
> 
> Diff: https://reviews.apache.org/r/41915/diff/
> 
> 
> Testing
> ---
> 
> Before this change I got a failure between 1/5 and 1/10th of the
> time via:
> ```
> while true
> do
>   ./pants test src/test/python/apache/thermos/observer/ -- -kTaskObserverTest
> done
> ```
> 
> After the change I cannot trigger the failure.
> 
> 
> Thanks,
> 
> John Sirois
> 
>



Re: Review Request 41915: Kill flaky TaskObserverTest.

2016-01-04 Thread Maxim Khutornenko


> On Jan. 5, 2016, 3:25 a.m., Maxim Khutornenko wrote:
> > src/main/python/apache/thermos/observer/task_observer.py, line 72
> > 
> >
> > We usually put this default initialization directly into the arg list, 
> > e.g:'stop_event=threading.Event()'.
> 
> John Sirois wrote:
> I'm leery of that for mutable objects like an Event.  Surprising things 
> happen if/when the containing object gets constructed a 2nd time and the 
> single default Event has been mutated!
> 
> John Sirois wrote:
> Maxim - your test originally so your call.  For a refresh - see here: 
> https://reviews.apache.org/r/35527/
> 
> John Sirois wrote:
> Disregard comment immediately above - now moved up under Bill's comment 
> further above where it belongs.
> 
> Maxim Khutornenko wrote:
> Not sure how the current approach helps with anything. If a constructor 
> is called once with `stop_event` and another time without it, wouldn't we end 
> up in the same state with mutated field due to `stop_event` arg being None 
> during the second call?
> 
> John Sirois wrote:
> The issue is 2 or more calls without.  Say the interleaving goes like 
> this:
> ```
> to1 = TaskObserver(createPathDetector())
> to1.stop()
> to2 = TaskObserver(createPathDetector())
> ```
> 
> Since `to2` shares `to1`'s Event, its already `set` from the `stop` of 
> `to1` and so never runs.
> 
> Instead, by using `None` as the sentinel for the default, both `to1` and 
> `to2` each get their own fresh Event.
> 
> John Sirois wrote:
> This might explain things better than words:
> ```
> >>> def surprise(value, mutable=[]):
> ...   mutable.append(value)
> ...   return mutable
> ... 
> >>> surprise(42)
> [42]
> >>> surprise(1137)
> [42, 1137]
> >>> def surprise(value, mutable=None):
> ...   mutable = mutable or []
> ...   mutable.append(value)
> ...   return mutable
> ... 
> >>> surprise(42)
> [42]
> >>> surprise(1137)
> [1137]
> >>>
> ```

Ah, you're right. This python mutable default argument "feature" keeps catching 
me every time.


- Maxim


---
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/41915/#review112750
---


On Jan. 5, 2016, 5:15 a.m., John Sirois wrote:
> 
> ---
> This is an automatically generated e-mail. To reply, visit:
> https://reviews.apache.org/r/41915/
> ---
> 
> (Updated Jan. 5, 2016, 5:15 a.m.)
> 
> 
> Review request for Aurora, Maxim Khutornenko and Bill Farner.
> 
> 
> Bugs: AURORA-1570
> https://issues.apache.org/jira/browse/AURORA-1570
> 
> 
> Repository: aurora
> 
> 
> Description
> ---
> 
> Previously, a mock threading.Event was waited on in one thread
> and the count of waits was read in another thread.  Most thread
> memory models do not guaranty reads are fresh in this scenario
> unless there is a memory barrier of some sort forcing per-cpu
> caches to be flushed.
> 
> Since the test really only verified correct conversion of a poll
> interval to fractional seconds - kill the test as not pulling its
> weight.
> 
>  src/test/python/apache/thermos/observer/BUILD |  1 +
>  src/test/python/apache/thermos/observer/test_task_observer.py | 45 
> -
>  2 files changed, 1 insertion(+), 45 deletions(-)
> 
> 
> Diffs
> -
> 
>   src/test/python/apache/thermos/observer/BUILD 
> f3f697c35ee5473072171926923459a4dde15545 
>   src/test/python/apache/thermos/observer/test_task_observer.py 
> ace15c5305e75fac3a82971f4d71b92bcb37bafc 
> 
> Diff: https://reviews.apache.org/r/41915/diff/
> 
> 
> Testing
> ---
> 
> Before this change I got a failure between 1/5 and 1/10th of the
> time via:
> ```
> while true
> do
>   ./pants test src/test/python/apache/thermos/observer/ -- -kTaskObserverTest
> done
> ```
> 
> After the change I cannot trigger the failure.
> 
> 
> Thanks,
> 
> John Sirois
> 
>



Re: Review Request 41915: Kill flaky TaskObserverTest.

2016-01-04 Thread Aurora ReviewBot

---
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/41915/#review112769
---

Ship it!


Master (227a3a6) is green with this patch.
  ./build-support/jenkins/build.sh

I will refresh this build result if you post a review containing "@ReviewBot 
retry"

- Aurora ReviewBot


On Jan. 5, 2016, 5:15 a.m., John Sirois wrote:
> 
> ---
> This is an automatically generated e-mail. To reply, visit:
> https://reviews.apache.org/r/41915/
> ---
> 
> (Updated Jan. 5, 2016, 5:15 a.m.)
> 
> 
> Review request for Aurora, Maxim Khutornenko and Bill Farner.
> 
> 
> Bugs: AURORA-1570
> https://issues.apache.org/jira/browse/AURORA-1570
> 
> 
> Repository: aurora
> 
> 
> Description
> ---
> 
> Previously, a mock threading.Event was waited on in one thread
> and the count of waits was read in another thread.  Most thread
> memory models do not guaranty reads are fresh in this scenario
> unless there is a memory barrier of some sort forcing per-cpu
> caches to be flushed.
> 
> Since the test really only verified correct conversion of a poll
> interval to fractional seconds - kill the test as not pulling its
> weight.
> 
>  src/test/python/apache/thermos/observer/BUILD |  1 +
>  src/test/python/apache/thermos/observer/test_task_observer.py | 45 
> -
>  2 files changed, 1 insertion(+), 45 deletions(-)
> 
> 
> Diffs
> -
> 
>   src/test/python/apache/thermos/observer/BUILD 
> f3f697c35ee5473072171926923459a4dde15545 
>   src/test/python/apache/thermos/observer/test_task_observer.py 
> ace15c5305e75fac3a82971f4d71b92bcb37bafc 
> 
> Diff: https://reviews.apache.org/r/41915/diff/
> 
> 
> Testing
> ---
> 
> Before this change I got a failure between 1/5 and 1/10th of the
> time via:
> ```
> while true
> do
>   ./pants test src/test/python/apache/thermos/observer/ -- -kTaskObserverTest
> done
> ```
> 
> After the change I cannot trigger the failure.
> 
> 
> Thanks,
> 
> John Sirois
> 
>



Re: Review Request 41915: Kill flaky TaskObserverTest.

2016-01-04 Thread John Sirois


> On Jan. 4, 2016, 8:25 p.m., Maxim Khutornenko wrote:
> > src/main/python/apache/thermos/observer/task_observer.py, line 72
> > 
> >
> > We usually put this default initialization directly into the arg list, 
> > e.g:'stop_event=threading.Event()'.
> 
> John Sirois wrote:
> I'm leery of that for mutable objects like an Event.  Surprising things 
> happen if/when the containing object gets constructed a 2nd time and the 
> single default Event has been mutated!
> 
> John Sirois wrote:
> Maxim - your test originally so your call.  For a refresh - see here: 
> https://reviews.apache.org/r/35527/
> 
> John Sirois wrote:
> Disregard comment immediately above - now moved up under Bill's comment 
> further above where it belongs.
> 
> Maxim Khutornenko wrote:
> Not sure how the current approach helps with anything. If a constructor 
> is called once with `stop_event` and another time without it, wouldn't we end 
> up in the same state with mutated field due to `stop_event` arg being None 
> during the second call?
> 
> John Sirois wrote:
> The issue is 2 or more calls without.  Say the interleaving goes like 
> this:
> ```
> to1 = TaskObserver(createPathDetector())
> to1.stop()
> to2 = TaskObserver(createPathDetector())
> ```
> 
> Since `to2` shares `to1`'s Event, its already `set` from the `stop` of 
> `to1` and so never runs.
> 
> Instead, by using `None` as the sentinel for the default, both `to1` and 
> `to2` each get their own fresh Event.

This might explain things better than words:
```
>>> def surprise(value, mutable=[]):
...   mutable.append(value)
...   return mutable
... 
>>> surprise(42)
[42]
>>> surprise(1137)
[42, 1137]
>>> def surprise(value, mutable=None):
...   mutable = mutable or []
...   mutable.append(value)
...   return mutable
... 
>>> surprise(42)
[42]
>>> surprise(1137)
[1137]
>>>
```


- John


---
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/41915/#review112750
---


On Jan. 4, 2016, 10:15 p.m., John Sirois wrote:
> 
> ---
> This is an automatically generated e-mail. To reply, visit:
> https://reviews.apache.org/r/41915/
> ---
> 
> (Updated Jan. 4, 2016, 10:15 p.m.)
> 
> 
> Review request for Aurora, Maxim Khutornenko and Bill Farner.
> 
> 
> Bugs: AURORA-1570
> https://issues.apache.org/jira/browse/AURORA-1570
> 
> 
> Repository: aurora
> 
> 
> Description
> ---
> 
> Previously, a mock threading.Event was waited on in one thread
> and the count of waits was read in another thread.  Most thread
> memory models do not guaranty reads are fresh in this scenario
> unless there is a memory barrier of some sort forcing per-cpu
> caches to be flushed.
> 
> Since the test really only verified correct conversion of a poll
> interval to fractional seconds - kill the test as not pulling its
> weight.
> 
>  src/test/python/apache/thermos/observer/BUILD |  1 +
>  src/test/python/apache/thermos/observer/test_task_observer.py | 45 
> -
>  2 files changed, 1 insertion(+), 45 deletions(-)
> 
> 
> Diffs
> -
> 
>   src/test/python/apache/thermos/observer/BUILD 
> f3f697c35ee5473072171926923459a4dde15545 
>   src/test/python/apache/thermos/observer/test_task_observer.py 
> ace15c5305e75fac3a82971f4d71b92bcb37bafc 
> 
> Diff: https://reviews.apache.org/r/41915/diff/
> 
> 
> Testing
> ---
> 
> Before this change I got a failure between 1/5 and 1/10th of the
> time via:
> ```
> while true
> do
>   ./pants test src/test/python/apache/thermos/observer/ -- -kTaskObserverTest
> done
> ```
> 
> After the change I cannot trigger the failure.
> 
> 
> Thanks,
> 
> John Sirois
> 
>



Re: Review Request 41915: Kill flaky TaskObserverTest.

2016-01-04 Thread John Sirois

---
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/41915/
---

(Updated Jan. 4, 2016, 10:15 p.m.)


Review request for Aurora, Maxim Khutornenko and Bill Farner.


Changes
---

Kill test_task_observer.py.

Fixup the associated BUILD as well.

 src/main/python/apache/thermos/observer/task_observer.py  |  5 ++---
 src/test/python/apache/thermos/observer/BUILD |  1 +
 src/test/python/apache/thermos/observer/test_task_observer.py | 57 
--
 3 files changed, 3 insertions(+), 60 deletions(-)


Summary (updated)
-

Kill flaky TaskObserverTest.


Bugs: AURORA-1570
https://issues.apache.org/jira/browse/AURORA-1570


Repository: aurora


Description (updated)
---

Previously, a mock threading.Event was waited on in one thread
and the count of waits was read in another thread.  Most thread
memory models do not guaranty reads are fresh in this scenario
unless there is a memory barrier of some sort forcing per-cpu
caches to be flushed.

Since the test really only verified correct conversion of a poll
interval to fractional seconds - kill the test as not pulling its
weight.

 src/test/python/apache/thermos/observer/BUILD |  1 +
 src/test/python/apache/thermos/observer/test_task_observer.py | 45 
-
 2 files changed, 1 insertion(+), 45 deletions(-)


Diffs (updated)
-

  src/test/python/apache/thermos/observer/BUILD 
f3f697c35ee5473072171926923459a4dde15545 
  src/test/python/apache/thermos/observer/test_task_observer.py 
ace15c5305e75fac3a82971f4d71b92bcb37bafc 

Diff: https://reviews.apache.org/r/41915/diff/


Testing
---

Before this change I got a failure between 1/5 and 1/10th of the
time via:
```
while true
do
  ./pants test src/test/python/apache/thermos/observer/ -- -kTaskObserverTest
done
```

After the change I cannot trigger the failure.


Thanks,

John Sirois