[lng-odp] [Bug 1615] odp_timer fails in CI with Segmentation fault

2016-02-18 Thread bugzilla-daemon
https://bugs.linaro.org/show_bug.cgi?id=1615

Maxim Uvarov  changed:

   What|Removed |Added

 Status|CONFIRMED   |RESOLVED
 Resolution|--- |FIXED

--- Comment #21 from Maxim Uvarov  ---
f73b184 linux-generic: timer use SIGEV_THREAD_ID

-- 
You are receiving this mail because:
You are on the CC list for the bug.
___
lng-odp mailing list
lng-odp@lists.linaro.org
https://lists.linaro.org/mailman/listinfo/lng-odp


[lng-odp] [Bug 1615] odp_timer fails in CI with Segmentation fault

2015-11-19 Thread bugzilla-daemon
https://bugs.linaro.org/show_bug.cgi?id=1615

Mike Holmes  changed:

   What|Removed |Added

   Assignee|ola.liljed...@linaro.org|ivan.khoronz...@linaro.org



[lng-odp] [Bug 1615] odp_timer fails in CI with Segmentation fault

2015-10-30 Thread bugzilla-daemon
https://bugs.linaro.org/show_bug.cgi?id=1615

--- Comment #20 from Stuart Haslam  ---
I don't think this bug should be closed. We've never regularly seen it in CI
but it's still fairly easily reproducible using the sequence in comment #10,
and we think we know the root of the problem and how to fix it (comment #18).



[lng-odp] [Bug 1615] odp_timer fails in CI with Segmentation fault

2015-10-29 Thread bugzilla-daemon
https://bugs.linaro.org/show_bug.cgi?id=1615

--- Comment #19 from Mike Holmes  ---
CI is not currently showing this bug. I will close it next week unless there is
feedback.



[lng-odp] [Bug 1615] odp_timer fails in CI with Segmentation fault

2015-09-10 Thread bugzilla-daemon
https://bugs.linaro.org/show_bug.cgi?id=1615

Mike Holmes  changed:

   What|Removed |Added

   Assignee|lng-odp@lists.linaro.org|ola.liljed...@linaro.org



[lng-odp] [Bug 1615] odp_timer fails in CI with Segmentation fault

2015-09-08 Thread bugzilla-daemon
https://bugs.linaro.org/show_bug.cgi?id=1615

--- Comment #13 from Stuart Haslam  ---
In my case the tp in the signal handler is a valid timer pool pointer, or at
least it was a short time ago, so it's stale rather than corrupted.

Commenting out the odp_shm_free() in odp_timer_pool_del() makes the crash go
away (but then so does altering the timing in many other ways so it's not
conclusive).

I suspect the problem is that the call to timer_delete() from
odp_timer_pool_del() disarms and deletes the (POSIX) timer, but a timer may
have just expired without its event having been delivered yet. The
timer_delete man page says:

"The treatment of any pending signal generated by the deleted timer is
unspecified."

We're delivering via threads rather than signals but I assume the same applies.



[lng-odp] [Bug 1615] odp_timer fails in CI with Segmentation fault

2015-09-08 Thread bugzilla-daemon
https://bugs.linaro.org/show_bug.cgi?id=1615

Bill Fischofer  changed:

   What|Removed |Added

 CC||bill.fischo...@linaro.org

--- Comment #14 from Bill Fischofer  ---
If that is the case then the issue is that odp_timer_pool_del() needs to
synchronize to allow events to flush before completing its operation.  This is
a common convention in async processing as teardowns always have to deal with
the fact that you potentially have one or more in-flight operations at the time
you want to clean up.



[lng-odp] [Bug 1615] odp_timer fails in CI with Segmentation fault

2015-09-08 Thread bugzilla-daemon
https://bugs.linaro.org/show_bug.cgi?id=1615

--- Comment #15 from Ola Liljedahl  ---
Synchronize with what? We don't know if there is any timer signal in-flight. If
we could inject our own signal into the kernel signal queues, we could wait for
that signal to be handled (assuming it would be handled after any already
pending timer signal). But we are not using signals. Any other way to wait
until this queue of events (which may cause threads to be spawned and executed)
has drained?



[lng-odp] [Bug 1615] odp_timer fails in CI with Segmentation fault

2015-09-08 Thread bugzilla-daemon
https://bugs.linaro.org/show_bug.cgi?id=1615

--- Comment #16 from Stuart Haslam  ---
I thought about just setting some flag in odp_timer_pool_del and waiting for
the next timer expiry to act on it, but assuming that it's possible for
there to be more than 1 outstanding event (in different threads, which may be
processed out of order), I can't see how you can be sure that you've processed
the last one.



[lng-odp] [Bug 1615] odp_timer fails in CI with Segmentation fault

2015-09-08 Thread bugzilla-daemon
https://bugs.linaro.org/show_bug.cgi?id=1615

--- Comment #17 from Bill Fischofer  ---
To properly clean up the following is required.

1. Be able to quiesce the context so that no new events will be accepted by the
APIs.

2. Modify the context so that it tracks the number of events outstanding
against it. A single pending event counter that gets incremented on each new
event and decremented as the signals are received should suffice.

3. Modify the event handler so that when a signal is received it also checks
"Am I being deleted?" If yes, it completes the delete when it is the last
event to be received for that context.
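
The three steps above can be sketched in C11 roughly as follows. This is only
an illustration under stated assumptions: every name in it (ctx_sketch_t,
ctx_event_begin, ctx_event_end, ctx_begin_delete, CTX_DELETING) is
hypothetical and not part of ODP, and it assumes each event can be counted at
the point where it is issued, which may not hold when the kernel and glibc
generate POSIX timer expirations on the application's behalf. The "being
deleted" flag and the pending count share one atomic word so that exactly one
caller observes the transition to "deleting and idle".

```c
#include <assert.h>
#include <stdatomic.h>
#include <stdbool.h>

/* Hypothetical flag bit; the remaining bits hold the pending-event count. */
#define CTX_DELETING 0x80000000u

/* Hypothetical context, standing in for something like a timer pool. */
typedef struct {
    atomic_uint state;
} ctx_sketch_t;

static void ctx_init(ctx_sketch_t *c)
{
    atomic_init(&c->state, 0);
}

/* Steps 1 and 2: count a new event, unless the context is quiesced. */
static bool ctx_event_begin(ctx_sketch_t *c)
{
    unsigned old = atomic_load(&c->state);

    do {
        if (old & CTX_DELETING)
            return false;       /* quiesced: no new events accepted */
    } while (!atomic_compare_exchange_weak(&c->state, &old, old + 1));
    return true;
}

/* Step 3: an event was handled. Returns true when this was the last
 * pending event of a context that is being deleted, in which case the
 * caller is the one that must complete the delete. */
static bool ctx_event_end(ctx_sketch_t *c)
{
    unsigned prev = atomic_fetch_sub(&c->state, 1);

    assert((prev & ~CTX_DELETING) != 0);   /* count must not underflow */
    return prev == (CTX_DELETING | 1u);
}

/* Request deletion. Returns true when nothing was pending, so the
 * caller may free the context immediately. */
static bool ctx_begin_delete(ctx_sketch_t *c)
{
    unsigned prev = atomic_fetch_or(&c->state, CTX_DELETING);

    return prev == 0;
}
```

Whichever call returns true (ctx_begin_delete or the final ctx_event_end) owns
the actual free. Note that this defers the teardown to an event handler when
events are still in flight.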



[lng-odp] [Bug 1615] odp_timer fails in CI with Segmentation fault

2015-09-08 Thread bugzilla-daemon
https://bugs.linaro.org/show_bug.cgi?id=1615

--- Comment #18 from Ola Liljedahl  ---
We don't want the periodic timer thread to perform the teardown (at some
unspecified time). The teardown needs to be complete (all resources freed) when
odp_timer_pool_destroy (?) returns.

Stuart suggested we create our own thread explicitly for processing these POSIX
timer events. Then we can also delete this thread as part of the teardown. I
think such a thread could be an ODP control plane thread; it shouldn't perform
too many ODP calls (odp_queue_enq once in a while). It is not a worker thread
and it can share a CPU with other control plane threads.

I'll look into this.
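
A dedicated timer thread of this kind could look roughly like the sketch
below. It is Linux-specific and is not the code from any ODP commit: the names
tick_thread_t, tick_thread_run, tick_thread_start and tick_thread_stop are
hypothetical, and the use of SIGALRM, sigtimedwait() and a 10 ms poll interval
are assumptions made for the example. The POSIX timer is created with
SIGEV_THREAD_ID so that expirations are delivered as a signal to one thread
that the application owns. Teardown deletes the timer, asks the thread to
stop, and joins it, so no handler can still be running when the function
returns.

```c
#ifndef _GNU_SOURCE
#define _GNU_SOURCE             /* for syscall() */
#endif
#include <assert.h>
#include <pthread.h>
#include <sched.h>
#include <signal.h>
#include <stdatomic.h>
#include <string.h>
#include <sys/syscall.h>
#include <time.h>
#include <unistd.h>

#ifndef sigev_notify_thread_id  /* some glibc versions lack this accessor */
#define sigev_notify_thread_id _sigev_un._tid
#endif

/* Hypothetical state for one tick source; not an ODP type. */
typedef struct {
    pthread_t thread;
    timer_t timerid;
    atomic_int tid;             /* kernel thread id, 0 until the thread runs */
    atomic_int stop;
    atomic_ulong ticks;
} tick_thread_t;

static void *tick_thread_run(void *arg)
{
    tick_thread_t *tt = arg;
    const struct timespec poll = { 0, 10 * 1000 * 1000 }; /* 10 ms */
    sigset_t set;
    siginfo_t si;

    sigemptyset(&set);
    sigaddset(&set, SIGALRM);
    pthread_sigmask(SIG_BLOCK, &set, NULL); /* consumed by sigtimedwait only */
    atomic_store(&tt->tid, (int)syscall(SYS_gettid));

    /* Wake up at least every 10 ms so that a stop request is noticed. */
    while (!atomic_load(&tt->stop))
        if (sigtimedwait(&set, &si, &poll) == SIGALRM)
            atomic_fetch_add(&tt->ticks, 1);
    return NULL;
}

static int tick_thread_start(tick_thread_t *tt, long period_ns)
{
    struct sigevent sev;
    struct itimerspec its;

    assert(period_ns > 0);      /* a zero value would leave the timer disarmed */
    atomic_init(&tt->tid, 0);
    atomic_init(&tt->stop, 0);
    atomic_init(&tt->ticks, 0);
    if (pthread_create(&tt->thread, NULL, tick_thread_run, tt) != 0)
        return -1;
    while (atomic_load(&tt->tid) == 0)
        sched_yield();          /* wait until SIGALRM is blocked, tid known */

    memset(&sev, 0, sizeof(sev));
    sev.sigev_notify = SIGEV_THREAD_ID;
    sev.sigev_signo = SIGALRM;
    sev.sigev_notify_thread_id = atomic_load(&tt->tid);
    its.it_value.tv_sec = period_ns / 1000000000L;
    its.it_value.tv_nsec = period_ns % 1000000000L;
    its.it_interval = its.it_value;
    if (timer_create(CLOCK_MONOTONIC, &sev, &tt->timerid) == 0) {
        if (timer_settime(tt->timerid, 0, &its, NULL) == 0)
            return 0;
        timer_delete(tt->timerid);
    }
    atomic_store(&tt->stop, 1);
    pthread_join(tt->thread, NULL);
    return -1;
}

static void tick_thread_stop(tick_thread_t *tt)
{
    timer_delete(tt->timerid);      /* no further expirations are generated */
    atomic_store(&tt->stop, 1);
    pthread_join(tt->thread, NULL); /* the handler thread has exited */
}
```

After tick_thread_stop() returns, the caller can free whatever state the
thread was touching. A signal that was still pending for the thread when it
exited is simply discarded with the thread. Older glibc versions may need
-lrt and -lpthread at link time.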



[lng-odp] [Bug 1615] odp_timer fails in CI with Segmentation fault

2015-09-07 Thread bugzilla-daemon
https://bugs.linaro.org/show_bug.cgi?id=1615

--- Comment #12 from Ola Liljedahl  ---
Perhaps we should add an explicit check that the sigval pointer is correct. Add
a field with some magic value to the timer pool and check that tp->magic has
the expected value.
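
A minimal sketch of such a check, with hypothetical names throughout
(timer_pool_sketch_t, TP_MAGIC, tp_init, tp_retire, tp_is_valid are not ODP
identifiers and the magic value is arbitrary). The pool stamps a magic value
at creation and clears it before the memory is released, and the notification
handler tests it before touching anything else.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Arbitrary, hypothetical marker value. */
#define TP_MAGIC 0x54504f4cu

/* Hypothetical stand-in for the timer pool structure. */
typedef struct {
    uint32_t magic;
    uint64_t cur_tick;
} timer_pool_sketch_t;

/* Stamp the pool as live when it is created. */
static void tp_init(timer_pool_sketch_t *tp)
{
    assert(tp != NULL);
    tp->magic = TP_MAGIC;
    tp->cur_tick = 0;
}

/* Clear the stamp before the memory is handed back. */
static void tp_retire(timer_pool_sketch_t *tp)
{
    assert(tp != NULL);
    tp->magic = 0;
}

/* What the notification handler would test first. */
static bool tp_is_valid(const timer_pool_sketch_t *tp)
{
    return tp != NULL && tp->magic == TP_MAGIC;
}
```

The check only helps while the stale memory is still mapped. tp_is_valid()
itself reads through tp, so if the region has already been unmapped the read
of tp->magic would fault in the same way as the original access.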



[lng-odp] [Bug 1615] odp_timer fails in CI with Segmentation fault

2015-09-02 Thread bugzilla-daemon
https://bugs.linaro.org/show_bug.cgi?id=1615

--- Comment #10 from Stuart Haslam  ---
I'm still seeing this occasionally, but I'm able to provoke it reliably like
this:

ulimit -c unlimited
rm core
while true; do ./test/validation/timer/timer_main; [ -f core ] && break; done

(may need to modify according to your /proc/sys/kernel/core_pattern)



[lng-odp] [Bug 1615] odp_timer fails in CI with Segmentation fault

2015-09-02 Thread bugzilla-daemon
https://bugs.linaro.org/show_bug.cgi?id=1615

--- Comment #11 from Bill Fischofer  ---
I seem to be able to generate this fairly reliably.  Here's what gdb shows:

Core was generated by `./timer_main'.
Program terminated with signal SIGSEGV, Segmentation fault.
#0  odp_atomic_fetch_inc_u64 (atom=)
    at ./include/odp/atomic.h:158
158         return __atomic_fetch_add(&atom->v, 1, __ATOMIC_RELAXED);
(gdb) bt
#0  odp_atomic_fetch_inc_u64 (atom=)
    at ./include/odp/atomic.h:158
#1  timer_notify (sigval=sigval@entry=...) at odp_timer.c:646
#2  0x7f9a8b5dbeee in timer_sigev_thread (arg=0x7f9a840008c0)
    at ../sysdeps/unix/sysv/linux/timer_routines.c:62
#3  0x7f9a8adcb6aa in start_thread (arg=0x7f9a889fe700)
    at pthread_create.c:333
#4  0x7f9a8ab00eed in clone ()
    at ../sysdeps/unix/sysv/linux/x86_64/clone.S:109

It appears that timer_notify() can occasionally get called with a bad sigval?
Since timer_notify() uses the returned value as a pointer without validating
it, that can easily cause segfaults.  I did try adding a check, if (!tp)
return;, but I'm still seeing the segfault, so something else may be making
the area it points to non-addressable.  Perhaps a clue?



[lng-odp] [Bug 1615] odp_timer fails in CI with Segmentation fault

2015-08-27 Thread bugzilla-daemon
https://bugs.linaro.org/show_bug.cgi?id=1615

--- Comment #9 from Mike Holmes mike.hol...@linaro.org ---
Has not been seen in CI for some time.



[lng-odp] [Bug 1615] odp_timer fails in CI with Segmentation fault

2015-08-20 Thread bugzilla-daemon
https://bugs.linaro.org/show_bug.cgi?id=1615

Stuart Haslam stuart.has...@linaro.org changed:

   What|Removed |Added

 CC||stuart.has...@linaro.org

--- Comment #8 from Stuart Haslam stuart.has...@linaro.org ---
I'm seeing a semi-reproducible SEGV running timer_main on x86_64 (with 3 worker
CPUs). It's always in the same place:

#0  timer_notify (sigval=sigval@entry=...) at odp_timer.c:643
#1  0x77bd70ff in timer_sigev_thread (arg=0x78c0)
    at ../nptl/sysdeps/unix/sysv/linux/timer_routines.c:63
#2  0x7700b182 in start_thread (arg=0x765fd700)
    at pthread_create.c:312
#3  0x776f647d in clone ()
    at ../sysdeps/unix/sysv/linux/x86_64/clone.S:111

odp_timer.c:643 is:

uint64_t prev_tick = odp_atomic_fetch_inc_u64(&tp->cur_tick);

tp has the value that was assigned during odp_timer_pool_create().

HEAD is at a1e62a98b



[lng-odp] [Bug 1615] odp_timer fails in CI with Segmentation fault

2015-08-13 Thread bugzilla-daemon
https://bugs.linaro.org/show_bug.cgi?id=1615

--- Comment #7 from Mike Holmes mike.hol...@linaro.org ---
Patch committed 
https://git.linaro.org/lng/odp.git/commit/6788a0e25b2f9c244668cf3d6f71d0f2eaefd32f

That may provide the fix, watching CI for re-occurrence.



[lng-odp] [Bug 1615] odp_timer fails in CI with Segmentation fault

2015-08-13 Thread bugzilla-daemon
https://bugs.linaro.org/show_bug.cgi?id=1615

Mike Holmes mike.hol...@linaro.org changed:

   What|Removed |Added

   Assignee|christian.ziet...@linaro.org|lng-odp@lists.linaro.org
 Status|IN_PROGRESS |CONFIRMED



[lng-odp] [Bug 1615] odp_timer fails in CI with Segmentation fault

2015-07-30 Thread bugzilla-daemon
https://bugs.linaro.org/show_bug.cgi?id=1615

Maxim Uvarov maxim.uva...@linaro.org changed:

   What|Removed |Added

 CC||maxim.uva...@linaro.org

--- Comment #6 from Maxim Uvarov maxim.uva...@linaro.org ---
Ivan posted patch:
[lng-odp]  [Patch v3 0/2] example: timer: fix/improve test

Explanation of changes in v2.

Needs review.



[lng-odp] [Bug 1615] odp_timer fails in CI with Segmentation fault

2015-07-23 Thread bugzilla-daemon
https://bugs.linaro.org/show_bug.cgi?id=1615

Mike Holmes mike.hol...@linaro.org changed:

   What|Removed |Added

 Status|CONFIRMED   |IN_PROGRESS

--- Comment #4 from Mike Holmes mike.hol...@linaro.org ---
Commit cb82ae might be the fix; we need to monitor CI for future occurrences.



[lng-odp] [Bug 1615] odp_timer fails in CI with Segmentation fault

2015-07-09 Thread bugzilla-daemon
https://bugs.linaro.org/show_bug.cgi?id=1615

--- Comment #2 from Mike Holmes mike.hol...@linaro.org ---
Ivan posted patches http://patches.opendataplane.org/patch/2097/



[lng-odp] [Bug 1615] odp_timer fails in CI with Segmentation fault

2015-07-09 Thread bugzilla-daemon
https://bugs.linaro.org/show_bug.cgi?id=1615

--- Comment #3 from Mike Holmes mike.hol...@linaro.org ---
Ivan posted patches http://patches.opendataplane.org/patch/2097/



[lng-odp] [Bug 1615] odp_timer fails in CI with Segmentation fault

2015-06-11 Thread bugzilla-daemon
https://bugs.linaro.org/show_bug.cgi?id=1615

Mike Holmes mike.hol...@linaro.org changed:

   What|Removed |Added

 Status|UNCONFIRMED |CONFIRMED
 Ever confirmed|0   |1
   Assignee|ola.liljed...@linaro.org|christian.ziet...@linaro.org



[lng-odp] [Bug 1615] odp_timer fails in CI with Segmentation fault

2015-06-08 Thread bugzilla-daemon
https://bugs.linaro.org/show_bug.cgi?id=1615

--- Comment #1 from Mike Holmes mike.hol...@linaro.org ---
Also happens on ARM Targets

https://ci.linaro.org/job/odp-api-check/ARCH=arm64,GIT_BRANCH=api-next,label=docker-utopic-arm64/309/
