Re: [lttng-dev] [PATCH 1/3] tests/lib/Makefile.am: Remove test_seek_empty_packet and test_seek_big_trace from SCRIPT_LIST
On 2016-04-28 14:28, Michael Jeanson wrote:
> ----- On Apr 28, 2016, at 12:44 PM, Simon Marchi simon.mar...@polymtl.ca wrote:
>> Ok thanks.
>>
>> Just an idea: I find it quite cumbersome to add a new entry in
>> configure.ac for each test script that we want autoconf to process
>> (such as in patch 3/3). I think it could be easier to have a single
>> file (e.g. vars.sh) and have it contain definitions of various
>> directories:
>>
>> srcdir=@srcdir@
>> top_srcdir=@top_srcdir@
>> abs_top_srcdir=@abs_top_srcdir@
>> builddir=@builddir@
>> ...
>>
>> This way, you could have only a single generated file (vars.sh.in ->
>> vars.sh), and each test script would just have to source that file to
>> use the values.
>
> As discussed on #lttng, here is the PR resulting from my updates to your patches:
> https://github.com/efficios/babeltrace/pull/43

Thanks for following up.

> Concerning the include file, I'm not convinced it would be simpler, we
> would end up with static scripts in the source dir including a
> generated file in the builddir.

In the current state, even the static scripts are copied to the build dir, so it wouldn't be a problem. In any case, I think it looks good in your PR.
Re: [lttng-dev] [PATCH 1/3] tests/lib/Makefile.am: Remove test_seek_empty_packet and test_seek_big_trace from SCRIPT_LIST
----- On Apr 28, 2016, at 12:44 PM, Simon Marchi simon.mar...@polymtl.ca wrote:
>
> Ok thanks.
>
> Just an idea: I find it quite cumbersome to add a new entry in
> configure.ac for each test script that we want autoconf to process (such
> as in patch 3/3). I think it could be easier to have a single file
> (e.g. vars.sh) and have it contain definitions of various directories:
>
> srcdir=@srcdir@
> top_srcdir=@top_srcdir@
> abs_top_srcdir=@abs_top_srcdir@
> builddir=@builddir@
> ...
>
> This way, you could have only a single generated file (vars.sh.in ->
> vars.sh), and each test script would just have to source that file to
> use the values.

As discussed on #lttng, here is the PR resulting from my updates to your patches:
https://github.com/efficios/babeltrace/pull/43

Concerning the include file, I'm not convinced it would be simpler, we would end up with static scripts in the source dir including a generated file in the builddir.
Re: [lttng-dev] [PATCH 1/3] tests/lib/Makefile.am: Remove test_seek_empty_packet and test_seek_big_trace from SCRIPT_LIST
On 2016-04-28 12:16, Michael Jeanson wrote:
> ----- On Apr 27, 2016, at 11:04 PM, Simon Marchi simon.mar...@polymtl.ca wrote:
>> On 2016-04-27 22:52, Simon Marchi wrote:
>>> Since these files are generated by autoconf, they shouldn't be included
>>> in SCRIPT_LIST, which is the list of scripts to copy from the source
>>> directory to the build directory. This gets rid of these warnings when
>>> building:
>>>
>>> cp: cannot stat '/home/simark/src/babeltrace/tests/lib/test_seek_big_trace': No such file or directory
>>> cp: cannot stat '/home/simark/src/babeltrace/tests/lib/test_seek_empty_packet': No such file or directory
>>>
>>> Signed-off-by: Simon Marchi
>>> ---
>>>  tests/lib/Makefile.am | 4 +---
>>>  1 file changed, 1 insertion(+), 3 deletions(-)
>>>
>>> diff --git a/tests/lib/Makefile.am b/tests/lib/Makefile.am
>>> index e23bcc1..a08cbf6 100644
>>> --- a/tests/lib/Makefile.am
>>> +++ b/tests/lib/Makefile.am
>>> @@ -41,9 +41,7 @@ test_bt_values_SOURCES = test_bt_values.c
>>>  test_ctf_ir_ref_SOURCES = test_ctf_ir_ref.c
>>>  test_bt_ctf_field_type_validation_SOURCES = test_bt_ctf_field_type_validation.c
>>>
>>> -SCRIPT_LIST = test_seek_big_trace \
>>> -	test_seek_empty_packet \
>>> -	test_ctf_writer_complete
>>> +SCRIPT_LIST = test_ctf_writer_complete
>>>  EXTRA_DIST = test_seek_big_trace.in test_seek_empty_packet.in
>>>  CLEANFILES= test_seek_big_trace test_seek_empty_packet
>>
>> Sorry, forgot to mention that this series is meant for babeltrace.
>
> Good catch, I reviewed the patches and I'm working on an updated version
> to do further cleanup of the Makefile and test scripts generation.

Ok thanks.

Just an idea: I find it quite cumbersome to add a new entry in configure.ac
for each test script that we want autoconf to process (such as in patch 3/3).
I think it could be easier to have a single file (e.g. vars.sh) and have it
contain definitions of various directories:

srcdir=@srcdir@
top_srcdir=@top_srcdir@
abs_top_srcdir=@abs_top_srcdir@
builddir=@builddir@
...

This way, you could have only a single generated file (vars.sh.in -> vars.sh),
and each test script would just have to source that file to use the values.
Re: [lttng-dev] [PATCH 1/3] tests/lib/Makefile.am: Remove test_seek_empty_packet and test_seek_big_trace from SCRIPT_LIST
----- On Apr 27, 2016, at 11:04 PM, Simon Marchi simon.mar...@polymtl.ca wrote:

> On 2016-04-27 22:52, Simon Marchi wrote:
>> Since these files are generated by autoconf, they shouldn't be included
>> in SCRIPT_LIST, which is the list of scripts to copy from the source
>> directory to the build directory. This gets rid of these warnings when
>> building:
>>
>> cp: cannot stat '/home/simark/src/babeltrace/tests/lib/test_seek_big_trace': No such file or directory
>> cp: cannot stat '/home/simark/src/babeltrace/tests/lib/test_seek_empty_packet': No such file or directory
>>
>> Signed-off-by: Simon Marchi
>> ---
>>  tests/lib/Makefile.am | 4 +---
>>  1 file changed, 1 insertion(+), 3 deletions(-)
>>
>> diff --git a/tests/lib/Makefile.am b/tests/lib/Makefile.am
>> index e23bcc1..a08cbf6 100644
>> --- a/tests/lib/Makefile.am
>> +++ b/tests/lib/Makefile.am
>> @@ -41,9 +41,7 @@ test_bt_values_SOURCES = test_bt_values.c
>>  test_ctf_ir_ref_SOURCES = test_ctf_ir_ref.c
>>  test_bt_ctf_field_type_validation_SOURCES = test_bt_ctf_field_type_validation.c
>>
>> -SCRIPT_LIST = test_seek_big_trace \
>> -	test_seek_empty_packet \
>> -	test_ctf_writer_complete
>> +SCRIPT_LIST = test_ctf_writer_complete
>>  EXTRA_DIST = test_seek_big_trace.in test_seek_empty_packet.in
>>  CLEANFILES= test_seek_big_trace test_seek_empty_packet
>
> Sorry, forgot to mention that this series is meant for babeltrace.

Good catch, I reviewed the patches and I'm working on an updated version to do further cleanup of the Makefile and test scripts generation.
Re: [lttng-dev] Question about lock in synchronize_rcu implementation of URCU
----- On Apr 28, 2016, at 9:47 AM, Yuxin Ren r...@gwmail.gwu.edu wrote:

> Hi Boqun and Paul,
>
> Thank you so much for your help.
>
> I found one reason to use that lock.
> In the slow path, a thread will move all waiters to a local queue.
> https://github.com/urcu/userspace-rcu/blob/master/urcu.c#L406
> After this, a following thread can also perform a grace period, as the
> global waiter queue is empty.
> Thus the rcu_gp_lock is used to ensure mutual exclusion.
>
> However, from a real-time aspect, such blocking will cause priority
> inversion: a higher-priority writer can be blocked by a low-priority
> writer.
> Is there a way to modify the code to allow multiple threads to perform
> grace periods concurrently?

By the way, there are other reasons for this lock in the urcu
implementation: it protects the rcu_registry and the rcu_gp.ctr count.

Thanks,

Mathieu

> Thanks again!!
> Yuxin
>
> On Thu, Apr 28, 2016 at 8:44 AM, Boqun Feng wrote:
>> Hi Paul and Yuxin,
>>
>> On Wed, Apr 27, 2016 at 09:23:27PM -0700, Paul E. McKenney wrote:
>>> Try building without it and see what happens when you run the tests.
>>
>> I've run a 'regtest' with the following modification out of curiosity:
>>
>> diff --git a/urcu.c b/urcu.c
>> index a5568bdbd075..9dc3c9feae56 100644
>> --- a/urcu.c
>> +++ b/urcu.c
>> @@ -398,8 +398,6 @@ void synchronize_rcu(void)
>>  	/* We won't need to wake ourself up */
>>  	urcu_wait_set_state(&wait, URCU_WAIT_RUNNING);
>>
>> -	mutex_lock(&rcu_gp_lock);
>> -
>>  	/*
>>  	 * Move all waiters into our local queue.
>>  	 */
>> @@ -480,7 +478,6 @@ void synchronize_rcu(void)
>>  	smp_mb_master();
>>  out:
>>  	mutex_unlock(&rcu_registry_lock);
>> -	mutex_unlock(&rcu_gp_lock);
>>
>>  	/*
>>  	 * Wakeup waiters only after we have completed the grace period
>>
>> And guess what, the result of the test was:
>>
>> Test Summary Report
>> -------------------
>> ./run-urcu-tests.sh 1 (Wstat: 0 Tests: 979 Failed: 18)
>>   Failed tests:  30, 45, 60, 75, 90, 103, 105, 120, 135
>>                  150, 165, 180, 195, 210, 225, 240, 253
>>                  255
>> Files=2, Tests=996, 1039 wallclock secs ( 0.55 usr 0.04 sys + 8981.02 cusr 597.64 csys = 9579.25 CPU)
>> Result: FAIL
>>
>> And test case 30 for example is something like:
>>
>> tests/benchmark/test_urcu_mb 1 -d 0 -b 32768
>>
>> and it failed because:
>>
>> lt-test_urcu_mb: test_urcu.c:183: thr_reader: Assertion `*local_ptr == 8' failed.
>> zsh: abort (core dumped)  ~/userspace-rcu/tests/benchmark/test_urcu_mb 4 4 1 -d 0 -b 32768
>>
>> So I think what was going on here was:
>>
>> CPU 0 (reader):  rcu_read_lock();
>> CPU 2 (writer):  new = malloc(sizeof(int));
>> CPU 0 (reader):  local_ptr = rcu_dereference(test_rcu_pointer); // local_ptr == old
>> CPU 2 (writer):  *new = 8;
>> CPU 2 (writer):  old = rcu_xchg_pointer(&test_rcu_pointer, new);
>> CPU 1 (writer):  synchronize_rcu():
>> CPU 1 (writer):    urcu_wait_add();     // return 0
>> CPU 1 (writer):    urcu_move_waiters(); // @gp_waiters is empty, the next urcu_wait_add() will return 0
>> CPU 2 (writer):  synchronize_rcu():
>> CPU 2 (writer):    urcu_wait_add();     // return 0
>> CPU 1 (writer):    mutex_lock(&rcu_registry_lock);
>> CPU 1 (writer):    wait_for_readers();  // remove registered reader from @registry, release rcu_registry_lock and wait via poll()
>> CPU 2 (writer):    mutex_lock(&rcu_registry_lock);
>> CPU 2 (writer):    wait_for_readers();  // @registry is empty! we are so lucky
>> CPU 2 (writer):    return;
>> CPU 2 (writer):  if (old)
>> CPU 2 (writer):      free(old)
>> CPU 0 (reader):  rcu_read_unlock();
>> CPU 0 (reader):  assert(*local_ptr == 8); // but local_ptr (i.e. old) is already freed.
>>
>> So the point is there could be two writers calling synchronize_rcu() but
>> not returning early (both of them enter the slow path to perform a grace
>> period), so the rcu_gp_lock is necessary in this case.
>>
>> (Cc Mathieu)
>>
>> But this is only my understanding and I'm learning the URCU code too ;-)
>>
>> Regards,
>> Boqun
>>
>>> Might well be that it is unnecessary, but I will defer to Mathieu
>>> on
Re: [lttng-dev] Question about lock in synchronize_rcu implementation of URCU
On Thu, Apr 28, 2016 at 02:38:40PM +, Mathieu Desnoyers wrote:
> ----- On Apr 28, 2016, at 9:47 AM, Yuxin Ren r...@gwmail.gwu.edu wrote:
>
> > Hi Boqun and Paul,
> >
> > Thank you so much for your help.
> >
> > I found one reason to use that lock.
> > In the slow path, a thread will move all waiters to a local queue.
> > https://github.com/urcu/userspace-rcu/blob/master/urcu.c#L406
> > After this, a following thread can also perform a grace period, as the
> > global waiter queue is empty.
> > Thus the rcu_gp_lock is used to ensure mutual exclusion.
> >
> > However, from a real-time aspect, such blocking will cause priority
> > inversion: a higher-priority writer can be blocked by a low-priority
> > writer.
> > Is there a way to modify the code to allow multiple threads to perform
> > grace periods concurrently?
>
> Before we redesign urcu for RT, would it be possible to simply
> use pi-mutexes (priority inheritance) instead, to protect grace periods
> from each other with the current urcu scheme?

Given that priority inversion can happen with low-priority readers
blocking a grace period that a high-priority updater is waiting on,
I stand by my earlier advice: Don't let high-priority updaters block
waiting for grace periods. ;-)

							Thanx, Paul

> Thanks,
>
> Mathieu
>
> > Thanks again!!
> > Yuxin
> >
> > On Thu, Apr 28, 2016 at 8:44 AM, Boqun Feng wrote:
> >> Hi Paul and Yuxin,
> >>
> >> On Wed, Apr 27, 2016 at 09:23:27PM -0700, Paul E. McKenney wrote:
> >>> Try building without it and see what happens when you run the tests.
> >>
> >> I've run a 'regtest' with the following modification out of curiosity:
> >>
> >> diff --git a/urcu.c b/urcu.c
> >> index a5568bdbd075..9dc3c9feae56 100644
> >> --- a/urcu.c
> >> +++ b/urcu.c
> >> @@ -398,8 +398,6 @@ void synchronize_rcu(void)
> >>  	/* We won't need to wake ourself up */
> >>  	urcu_wait_set_state(&wait, URCU_WAIT_RUNNING);
> >>
> >> -	mutex_lock(&rcu_gp_lock);
> >> -
> >>  	/*
> >>  	 * Move all waiters into our local queue.
> >>  	 */
> >> @@ -480,7 +478,6 @@ void synchronize_rcu(void)
> >>  	smp_mb_master();
> >>  out:
> >>  	mutex_unlock(&rcu_registry_lock);
> >> -	mutex_unlock(&rcu_gp_lock);
> >>
> >>  	/*
> >>  	 * Wakeup waiters only after we have completed the grace period
> >>
> >> And guess what, the result of the test was:
> >>
> >> Test Summary Report
> >> -------------------
> >> ./run-urcu-tests.sh 1 (Wstat: 0 Tests: 979 Failed: 18)
> >>   Failed tests:  30, 45, 60, 75, 90, 103, 105, 120, 135
> >>                  150, 165, 180, 195, 210, 225, 240, 253
> >>                  255
> >> Files=2, Tests=996, 1039 wallclock secs ( 0.55 usr 0.04 sys + 8981.02 cusr 597.64 csys = 9579.25 CPU)
> >> Result: FAIL
> >>
> >> And test case 30 for example is something like:
> >>
> >> tests/benchmark/test_urcu_mb 1 -d 0 -b 32768
> >>
> >> and it failed because:
> >>
> >> lt-test_urcu_mb: test_urcu.c:183: thr_reader: Assertion `*local_ptr == 8' failed.
> >> zsh: abort (core dumped)  ~/userspace-rcu/tests/benchmark/test_urcu_mb 4 4 1 -d 0 -b 32768
> >>
> >> So I think what was going on here was:
> >>
> >> CPU 0 (reader):  rcu_read_lock();
> >> CPU 2 (writer):  new = malloc(sizeof(int));
> >> CPU 0 (reader):  local_ptr = rcu_dereference(test_rcu_pointer); // local_ptr == old
> >> CPU 2 (writer):  *new = 8;
> >> CPU 2 (writer):  old = rcu_xchg_pointer(&test_rcu_pointer, new);
> >> CPU 1 (writer):  synchronize_rcu():
> >> CPU 1 (writer):    urcu_wait_add();     // return 0
> >> CPU 1 (writer):    urcu_move_waiters(); // @gp_waiters is empty, the next urcu_wait_add() will return 0
> >> CPU 2 (writer):  synchronize_rcu():
> >> CPU 2 (writer):    urcu_wait_add();     // return 0
> >> CPU 1 (writer):    mutex_lock(&rcu_registry_lock);
> >> CPU 1 (writer):    wait_for_readers();  // remove registered reader from @registry, release rcu_registry_lock and wait via poll()
> >> CPU 2 (writer):    mutex_lock(&rcu_registry_lock);
> >> CPU 2 (writer):    wait_for_readers();  // @registry is empty! we are so lucky
> >> CPU 2 (writer):    return;
Re: [lttng-dev] Question about lock in synchronize_rcu implementation of URCU
----- On Apr 28, 2016, at 9:47 AM, Yuxin Ren r...@gwmail.gwu.edu wrote:

> Hi Boqun and Paul,
>
> Thank you so much for your help.
>
> I found one reason to use that lock.
> In the slow path, a thread will move all waiters to a local queue.
> https://github.com/urcu/userspace-rcu/blob/master/urcu.c#L406
> After this, a following thread can also perform a grace period, as the
> global waiter queue is empty.
> Thus the rcu_gp_lock is used to ensure mutual exclusion.
>
> However, from a real-time aspect, such blocking will cause priority
> inversion: a higher-priority writer can be blocked by a low-priority
> writer.
> Is there a way to modify the code to allow multiple threads to perform
> grace periods concurrently?

Before we redesign urcu for RT, would it be possible to simply
use pi-mutexes (priority inheritance) instead, to protect grace periods
from each other with the current urcu scheme?

Thanks,

Mathieu

> Thanks again!!
> Yuxin
>
> On Thu, Apr 28, 2016 at 8:44 AM, Boqun Feng wrote:
>> Hi Paul and Yuxin,
>>
>> On Wed, Apr 27, 2016 at 09:23:27PM -0700, Paul E. McKenney wrote:
>>> Try building without it and see what happens when you run the tests.
>>
>> I've run a 'regtest' with the following modification out of curiosity:
>>
>> diff --git a/urcu.c b/urcu.c
>> index a5568bdbd075..9dc3c9feae56 100644
>> --- a/urcu.c
>> +++ b/urcu.c
>> @@ -398,8 +398,6 @@ void synchronize_rcu(void)
>>  	/* We won't need to wake ourself up */
>>  	urcu_wait_set_state(&wait, URCU_WAIT_RUNNING);
>>
>> -	mutex_lock(&rcu_gp_lock);
>> -
>>  	/*
>>  	 * Move all waiters into our local queue.
>>  	 */
>> @@ -480,7 +478,6 @@ void synchronize_rcu(void)
>>  	smp_mb_master();
>>  out:
>>  	mutex_unlock(&rcu_registry_lock);
>> -	mutex_unlock(&rcu_gp_lock);
>>
>>  	/*
>>  	 * Wakeup waiters only after we have completed the grace period
>>
>> And guess what, the result of the test was:
>>
>> Test Summary Report
>> -------------------
>> ./run-urcu-tests.sh 1 (Wstat: 0 Tests: 979 Failed: 18)
>>   Failed tests:  30, 45, 60, 75, 90, 103, 105, 120, 135
>>                  150, 165, 180, 195, 210, 225, 240, 253
>>                  255
>> Files=2, Tests=996, 1039 wallclock secs ( 0.55 usr 0.04 sys + 8981.02 cusr 597.64 csys = 9579.25 CPU)
>> Result: FAIL
>>
>> And test case 30 for example is something like:
>>
>> tests/benchmark/test_urcu_mb 1 -d 0 -b 32768
>>
>> and it failed because:
>>
>> lt-test_urcu_mb: test_urcu.c:183: thr_reader: Assertion `*local_ptr == 8' failed.
>> zsh: abort (core dumped)  ~/userspace-rcu/tests/benchmark/test_urcu_mb 4 4 1 -d 0 -b 32768
>>
>> So I think what was going on here was:
>>
>> CPU 0 (reader):  rcu_read_lock();
>> CPU 2 (writer):  new = malloc(sizeof(int));
>> CPU 0 (reader):  local_ptr = rcu_dereference(test_rcu_pointer); // local_ptr == old
>> CPU 2 (writer):  *new = 8;
>> CPU 2 (writer):  old = rcu_xchg_pointer(&test_rcu_pointer, new);
>> CPU 1 (writer):  synchronize_rcu():
>> CPU 1 (writer):    urcu_wait_add();     // return 0
>> CPU 1 (writer):    urcu_move_waiters(); // @gp_waiters is empty, the next urcu_wait_add() will return 0
>> CPU 2 (writer):  synchronize_rcu():
>> CPU 2 (writer):    urcu_wait_add();     // return 0
>> CPU 1 (writer):    mutex_lock(&rcu_registry_lock);
>> CPU 1 (writer):    wait_for_readers();  // remove registered reader from @registry, release rcu_registry_lock and wait via poll()
>> CPU 2 (writer):    mutex_lock(&rcu_registry_lock);
>> CPU 2 (writer):    wait_for_readers();  // @registry is empty! we are so lucky
>> CPU 2 (writer):    return;
>> CPU 2 (writer):  if (old)
>> CPU 2 (writer):      free(old)
>> CPU 0 (reader):  rcu_read_unlock();
>> CPU 0 (reader):  assert(*local_ptr == 8); // but local_ptr (i.e. old) is already freed.
>>
>> So the point is there could be two writers calling synchronize_rcu() but
>> not returning early (both of them enter the slow path to perform a grace
>> period), so the rcu_gp_lock is necessary in this case.
>>
>> (Cc Mathieu)
>>
>> But this is only my understanding and I'm learning the URCU code too ;-)
>>
>> Regards,
>> Boqun
>>
>>> Might well be that it is
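For context, a priority-inheritance mutex of the kind Mathieu suggests above can be requested through the POSIX mutex-attribute API. The sketch below is only a generic illustration, not a proposed urcu change: the gp_lock variable and the init_pi_mutex() helper are invented for the example, and PTHREAD_PRIO_INHERIT support depends on the platform (build with -pthread).

    #include <pthread.h>
    #include <stdio.h>

    static pthread_mutex_t gp_lock;  /* hypothetical stand-in for rcu_gp_lock */

    /* Create a mutex whose low-priority owner is boosted to the priority
     * of the highest-priority waiter (priority inheritance protocol). */
    static int init_pi_mutex(pthread_mutex_t *lock)
    {
            pthread_mutexattr_t attr;
            int ret;

            ret = pthread_mutexattr_init(&attr);
            if (ret)
                    return ret;
            ret = pthread_mutexattr_setprotocol(&attr, PTHREAD_PRIO_INHERIT);
            if (!ret)
                    ret = pthread_mutex_init(lock, &attr);
            pthread_mutexattr_destroy(&attr);
            return ret;
    }

    int main(void)
    {
            if (init_pi_mutex(&gp_lock) != 0) {
                    fprintf(stderr, "PI mutexes not supported\n");
                    return 1;
            }
            pthread_mutex_lock(&gp_lock);
            /* ... grace-period work would be serialized here ... */
            pthread_mutex_unlock(&gp_lock);
            pthread_mutex_destroy(&gp_lock);
            return 0;
    }

With such a mutex, a high-priority writer blocked on the grace-period lock at least boosts the lower-priority writer currently holding it. It does not help with low-priority readers delaying the grace period itself, which is the point Paul makes in his reply.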
Re: [lttng-dev] Question about lock in synchronize_rcu implementation of URCU
On Thu, Apr 28, 2016 at 09:47:23AM -0400, Yuxin Ren wrote:
> Hi Boqun and Paul,
>
> Thank you so much for your help.
>
> I found one reason to use that lock.
> In the slow path, a thread will move all waiters to a local queue.
> https://github.com/urcu/userspace-rcu/blob/master/urcu.c#L406
> After this, a following thread can also perform a grace period, as the
> global waiter queue is empty.
> Thus the rcu_gp_lock is used to ensure mutual exclusion.
>
> However, from a real-time aspect, such blocking will cause priority
> inversion: a higher-priority writer can be blocked by a low-priority
> writer.
> Is there a way to modify the code to allow multiple threads to perform
> grace periods concurrently?

If a thread has real-time requirements, you shouldn't have it do
synchronous grace periods, just as you shouldn't have it do (say)
sleep(10). You should instead either (1) have some other non-realtime
thread do the cleanup activities involving synchronize_rcu() or (2) have
the real-time thread use the asynchronous call_rcu().

							Thanx, Paul

> Thanks again!!
> Yuxin
>
> On Thu, Apr 28, 2016 at 8:44 AM, Boqun Feng wrote:
> > Hi Paul and Yuxin,
> >
> > On Wed, Apr 27, 2016 at 09:23:27PM -0700, Paul E. McKenney wrote:
> >> Try building without it and see what happens when you run the tests.
> >
> > I've run a 'regtest' with the following modification out of curiosity:
> >
> > diff --git a/urcu.c b/urcu.c
> > index a5568bdbd075..9dc3c9feae56 100644
> > --- a/urcu.c
> > +++ b/urcu.c
> > @@ -398,8 +398,6 @@ void synchronize_rcu(void)
> >  	/* We won't need to wake ourself up */
> >  	urcu_wait_set_state(&wait, URCU_WAIT_RUNNING);
> >
> > -	mutex_lock(&rcu_gp_lock);
> > -
> >  	/*
> >  	 * Move all waiters into our local queue.
> >  	 */
> > @@ -480,7 +478,6 @@ void synchronize_rcu(void)
> >  	smp_mb_master();
> >  out:
> >  	mutex_unlock(&rcu_registry_lock);
> > -	mutex_unlock(&rcu_gp_lock);
> >
> >  	/*
> >  	 * Wakeup waiters only after we have completed the grace period
> >
> > And guess what, the result of the test was:
> >
> > Test Summary Report
> > -------------------
> > ./run-urcu-tests.sh 1 (Wstat: 0 Tests: 979 Failed: 18)
> >   Failed tests:  30, 45, 60, 75, 90, 103, 105, 120, 135
> >                  150, 165, 180, 195, 210, 225, 240, 253
> >                  255
> > Files=2, Tests=996, 1039 wallclock secs ( 0.55 usr 0.04 sys + 8981.02 cusr 597.64 csys = 9579.25 CPU)
> > Result: FAIL
> >
> > And test case 30 for example is something like:
> >
> > tests/benchmark/test_urcu_mb 1 -d 0 -b 32768
> >
> > and it failed because:
> >
> > lt-test_urcu_mb: test_urcu.c:183: thr_reader: Assertion `*local_ptr == 8' failed.
> > zsh: abort (core dumped)  ~/userspace-rcu/tests/benchmark/test_urcu_mb 4 4 1 -d 0 -b 32768
> >
> > So I think what was going on here was:
> >
> > CPU 0 (reader):  rcu_read_lock();
> > CPU 2 (writer):  new = malloc(sizeof(int));
> > CPU 0 (reader):  local_ptr = rcu_dereference(test_rcu_pointer); // local_ptr == old
> > CPU 2 (writer):  *new = 8;
> > CPU 2 (writer):  old = rcu_xchg_pointer(&test_rcu_pointer, new);
> > CPU 1 (writer):  synchronize_rcu():
> > CPU 1 (writer):    urcu_wait_add();     // return 0
> > CPU 1 (writer):    urcu_move_waiters(); // @gp_waiters is empty, the next urcu_wait_add() will return 0
> > CPU 2 (writer):  synchronize_rcu():
> > CPU 2 (writer):    urcu_wait_add();     // return 0
> > CPU 1 (writer):    mutex_lock(&rcu_registry_lock);
> > CPU 1 (writer):    wait_for_readers();  // remove registered reader from @registry, release rcu_registry_lock and wait via poll()
> > CPU 2 (writer):    mutex_lock(&rcu_registry_lock);
> > CPU 2 (writer):    wait_for_readers();  // @registry is empty! we are so lucky
> > CPU 2 (writer):    return;
> > CPU 2 (writer):  if (old)
> > CPU 2 (writer):      free(old)
> > CPU 0 (reader):  rcu_read_unlock();
> > CPU 0 (reader):  assert(*local_ptr == 8); // but
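Paul's option (2) corresponds to liburcu's call_rcu() API: instead of blocking in synchronize_rcu(), the updater hands the old element to the call_rcu worker, which invokes the callback only after a grace period has elapsed. A minimal sketch, assuming the default urcu flavour; the blob structure, the global_blob pointer and the free_blob()/update_blob() helpers are invented for illustration, and linking details (-lurcu, flavour selection) depend on how liburcu was built:

    #include <stdlib.h>
    #include <urcu.h>               /* default urcu flavour; also declares call_rcu() */

    struct blob {
            int value;
            struct rcu_head rcu;    /* storage used by call_rcu() */
    };

    static struct blob *global_blob;        /* shared pointer, like test_rcu_pointer */

    /* Invoked from the call_rcu helper thread after a grace period. */
    static void free_blob(struct rcu_head *head)
    {
            struct blob *b = caa_container_of(head, struct blob, rcu);

            free(b);
    }

    /* Update path usable from a real-time thread: no synchronize_rcu() here. */
    static void update_blob(int value)
    {
            struct blob *new, *old;

            new = malloc(sizeof(*new));
            if (!new)
                    return;
            new->value = value;
            old = rcu_xchg_pointer(&global_blob, new);
            if (old)
                    call_rcu(&old->rcu, free_blob); /* asynchronous reclamation */
    }

Paul's option (1) amounts to the same idea done by hand: the real-time updater queues old elements somewhere, and a dedicated non-realtime thread periodically calls synchronize_rcu() and frees them.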
Re: [lttng-dev] Question about lock in synchronize_rcu implementation of URCU
On Thu, Apr 28, 2016 at 08:44:01PM +0800, Boqun Feng wrote:
> Hi Paul and Yuxin,
>
> On Wed, Apr 27, 2016 at 09:23:27PM -0700, Paul E. McKenney wrote:
> > Try building without it and see what happens when you run the tests.
>
> I've run a 'regtest' with the following modification out of curiosity:
>
> diff --git a/urcu.c b/urcu.c
> index a5568bdbd075..9dc3c9feae56 100644
> --- a/urcu.c
> +++ b/urcu.c
> @@ -398,8 +398,6 @@ void synchronize_rcu(void)
>  	/* We won't need to wake ourself up */
>  	urcu_wait_set_state(&wait, URCU_WAIT_RUNNING);
>
> -	mutex_lock(&rcu_gp_lock);
> -
>  	/*
>  	 * Move all waiters into our local queue.
>  	 */
> @@ -480,7 +478,6 @@ void synchronize_rcu(void)
>  	smp_mb_master();
>  out:
>  	mutex_unlock(&rcu_registry_lock);
> -	mutex_unlock(&rcu_gp_lock);
>
>  	/*
>  	 * Wakeup waiters only after we have completed the grace period
>
> And guess what, the result of the test was:
>
> Test Summary Report
> -------------------
> ./run-urcu-tests.sh 1 (Wstat: 0 Tests: 979 Failed: 18)
>   Failed tests:  30, 45, 60, 75, 90, 103, 105, 120, 135
>                  150, 165, 180, 195, 210, 225, 240, 253
>                  255
> Files=2, Tests=996, 1039 wallclock secs ( 0.55 usr 0.04 sys + 8981.02 cusr 597.64 csys = 9579.25 CPU)
> Result: FAIL
>
> And test case 30 for example is something like:
>
> tests/benchmark/test_urcu_mb 1 -d 0 -b 32768
>
> and it failed because:
>
> lt-test_urcu_mb: test_urcu.c:183: thr_reader: Assertion `*local_ptr == 8' failed.
> zsh: abort (core dumped)  ~/userspace-rcu/tests/benchmark/test_urcu_mb 4 4 1 -d 0 -b 32768
>
> So I think what was going on here was:
>
> CPU 0 (reader):  rcu_read_lock();
> CPU 2 (writer):  new = malloc(sizeof(int));
> CPU 0 (reader):  local_ptr = rcu_dereference(test_rcu_pointer); // local_ptr == old
> CPU 2 (writer):  *new = 8;
> CPU 2 (writer):  old = rcu_xchg_pointer(&test_rcu_pointer, new);
> CPU 1 (writer):  synchronize_rcu():
> CPU 1 (writer):    urcu_wait_add();     // return 0
> CPU 1 (writer):    urcu_move_waiters(); // @gp_waiters is empty, the next urcu_wait_add() will return 0
> CPU 2 (writer):  synchronize_rcu():
> CPU 2 (writer):    urcu_wait_add();     // return 0
> CPU 1 (writer):    mutex_lock(&rcu_registry_lock);
> CPU 1 (writer):    wait_for_readers();  // remove registered reader from @registry, release rcu_registry_lock and wait via poll()
> CPU 2 (writer):    mutex_lock(&rcu_registry_lock);
> CPU 2 (writer):    wait_for_readers();  // @registry is empty! we are so lucky
> CPU 2 (writer):    return;
> CPU 2 (writer):  if (old)
> CPU 2 (writer):      free(old)
> CPU 0 (reader):  rcu_read_unlock();
> CPU 0 (reader):  assert(*local_ptr == 8); // but local_ptr (i.e. old) is already freed.
>
> So the point is there could be two writers calling synchronize_rcu() but
> not returning early (both of them enter the slow path to perform a grace
> period), so the rcu_gp_lock is necessary in this case.
>
> (Cc Mathieu)
>
> But this is only my understanding and I'm learning the URCU code too ;-)

Nothing quite like actually trying it and seeing what happens!  One of
the best learning methods that I know of.  Assuming the act of actually
trying it is non-fatal, of course.  ;-)

							Thanx, Paul

> Regards,
> Boqun
>
> > Might well be that it is unnecessary, but I will defer to Mathieu
> > on that point.
> >
> > 				Thanx, Paul
> >
> > On Wed, Apr 27, 2016 at 10:18:04PM -0400, Yuxin Ren wrote:
> > > As they don't concurrently perform grace periods, why do we use the
> > > rcu_gp_lock?
> > >
> > > Thank you.
> > > Yuxin
> > >
> > > On Wed, Apr 27, 2016 at 10:08 PM, Paul E. McKenney wrote:
> > > > On Wed, Apr 27, 2016 at 09:34:16PM -0400, Yuxin Ren wrote:
> > > >> Hi,
> > > >>
> > > >> I am learning the URCU code.
> > > >>
> > > >> Why do we need rcu_gp_lock in synchronize_rcu?
> > > >> https://github.com/urcu/userspace-rcu/blob/master/urcu.c#L401
> > > >>
> > > >> In the
Re: [lttng-dev] Question about lock in synchronize_rcu implementation of URCU
Hi Boqun and Paul,

Thank you so much for your help.

I found one reason to use that lock.
In the slow path, a thread will move all waiters to a local queue.
https://github.com/urcu/userspace-rcu/blob/master/urcu.c#L406
After this, a following thread can also perform a grace period, as the
global waiter queue is empty.
Thus the rcu_gp_lock is used to ensure mutual exclusion.

However, from a real-time aspect, such blocking will cause priority
inversion: a higher-priority writer can be blocked by a low-priority
writer.
Is there a way to modify the code to allow multiple threads to perform
grace periods concurrently?

Thanks again!!
Yuxin

On Thu, Apr 28, 2016 at 8:44 AM, Boqun Feng wrote:
> Hi Paul and Yuxin,
>
> On Wed, Apr 27, 2016 at 09:23:27PM -0700, Paul E. McKenney wrote:
>> Try building without it and see what happens when you run the tests.
>
> I've run a 'regtest' with the following modification out of curiosity:
>
> diff --git a/urcu.c b/urcu.c
> index a5568bdbd075..9dc3c9feae56 100644
> --- a/urcu.c
> +++ b/urcu.c
> @@ -398,8 +398,6 @@ void synchronize_rcu(void)
>  	/* We won't need to wake ourself up */
>  	urcu_wait_set_state(&wait, URCU_WAIT_RUNNING);
>
> -	mutex_lock(&rcu_gp_lock);
> -
>  	/*
>  	 * Move all waiters into our local queue.
>  	 */
> @@ -480,7 +478,6 @@ void synchronize_rcu(void)
>  	smp_mb_master();
>  out:
>  	mutex_unlock(&rcu_registry_lock);
> -	mutex_unlock(&rcu_gp_lock);
>
>  	/*
>  	 * Wakeup waiters only after we have completed the grace period
>
> And guess what, the result of the test was:
>
> Test Summary Report
> -------------------
> ./run-urcu-tests.sh 1 (Wstat: 0 Tests: 979 Failed: 18)
>   Failed tests:  30, 45, 60, 75, 90, 103, 105, 120, 135
>                  150, 165, 180, 195, 210, 225, 240, 253
>                  255
> Files=2, Tests=996, 1039 wallclock secs ( 0.55 usr 0.04 sys + 8981.02 cusr 597.64 csys = 9579.25 CPU)
> Result: FAIL
>
> And test case 30 for example is something like:
>
> tests/benchmark/test_urcu_mb 1 -d 0 -b 32768
>
> and it failed because:
>
> lt-test_urcu_mb: test_urcu.c:183: thr_reader: Assertion `*local_ptr == 8' failed.
> zsh: abort (core dumped)  ~/userspace-rcu/tests/benchmark/test_urcu_mb 4 4 1 -d 0 -b 32768
>
> So I think what was going on here was:
>
> CPU 0 (reader):  rcu_read_lock();
> CPU 2 (writer):  new = malloc(sizeof(int));
> CPU 0 (reader):  local_ptr = rcu_dereference(test_rcu_pointer); // local_ptr == old
> CPU 2 (writer):  *new = 8;
> CPU 2 (writer):  old = rcu_xchg_pointer(&test_rcu_pointer, new);
> CPU 1 (writer):  synchronize_rcu():
> CPU 1 (writer):    urcu_wait_add();     // return 0
> CPU 1 (writer):    urcu_move_waiters(); // @gp_waiters is empty, the next urcu_wait_add() will return 0
> CPU 2 (writer):  synchronize_rcu():
> CPU 2 (writer):    urcu_wait_add();     // return 0
> CPU 1 (writer):    mutex_lock(&rcu_registry_lock);
> CPU 1 (writer):    wait_for_readers();  // remove registered reader from @registry, release rcu_registry_lock and wait via poll()
> CPU 2 (writer):    mutex_lock(&rcu_registry_lock);
> CPU 2 (writer):    wait_for_readers();  // @registry is empty! we are so lucky
> CPU 2 (writer):    return;
> CPU 2 (writer):  if (old)
> CPU 2 (writer):      free(old)
> CPU 0 (reader):  rcu_read_unlock();
> CPU 0 (reader):  assert(*local_ptr == 8); // but local_ptr (i.e. old) is already freed.
>
> So the point is there could be two writers calling synchronize_rcu() but
> not returning early (both of them enter the slow path to perform a grace
> period), so the rcu_gp_lock is necessary in this case.
>
> (Cc Mathieu)
>
> But this is only my understanding and I'm learning the URCU code too ;-)
>
> Regards,
> Boqun
>
>> Might well be that it is unnecessary, but I will defer to Mathieu
>> on that point.
>>
>> 				Thanx, Paul
>>
>> On Wed, Apr 27, 2016 at 10:18:04PM -0400, Yuxin Ren wrote:
>> > As they don't concurrently perform grace periods, why do we use the
>> > rcu_gp_lock?
Re: [lttng-dev] Question about lock in synchronize_rcu implementation of URCU
Hi Paul and Yuxin,

On Wed, Apr 27, 2016 at 09:23:27PM -0700, Paul E. McKenney wrote:
> Try building without it and see what happens when you run the tests.

I've run a 'regtest' with the following modification out of curiosity:

diff --git a/urcu.c b/urcu.c
index a5568bdbd075..9dc3c9feae56 100644
--- a/urcu.c
+++ b/urcu.c
@@ -398,8 +398,6 @@ void synchronize_rcu(void)
 	/* We won't need to wake ourself up */
 	urcu_wait_set_state(&wait, URCU_WAIT_RUNNING);
 
-	mutex_lock(&rcu_gp_lock);
-
 	/*
 	 * Move all waiters into our local queue.
 	 */
@@ -480,7 +478,6 @@ void synchronize_rcu(void)
 	smp_mb_master();
 out:
 	mutex_unlock(&rcu_registry_lock);
-	mutex_unlock(&rcu_gp_lock);
 
 	/*
 	 * Wakeup waiters only after we have completed the grace period

And guess what, the result of the test was:

Test Summary Report
-------------------
./run-urcu-tests.sh 1 (Wstat: 0 Tests: 979 Failed: 18)
  Failed tests:  30, 45, 60, 75, 90, 103, 105, 120, 135
                 150, 165, 180, 195, 210, 225, 240, 253
                 255
Files=2, Tests=996, 1039 wallclock secs ( 0.55 usr 0.04 sys + 8981.02 cusr 597.64 csys = 9579.25 CPU)
Result: FAIL

And test case 30 for example is something like:

tests/benchmark/test_urcu_mb 1 -d 0 -b 32768

and it failed because:

lt-test_urcu_mb: test_urcu.c:183: thr_reader: Assertion `*local_ptr == 8' failed.
zsh: abort (core dumped)  ~/userspace-rcu/tests/benchmark/test_urcu_mb 4 4 1 -d 0 -b 32768

So I think what was going on here was:

CPU 0 (reader):  rcu_read_lock();
CPU 2 (writer):  new = malloc(sizeof(int));
CPU 0 (reader):  local_ptr = rcu_dereference(test_rcu_pointer); // local_ptr == old
CPU 2 (writer):  *new = 8;
CPU 2 (writer):  old = rcu_xchg_pointer(&test_rcu_pointer, new);
CPU 1 (writer):  synchronize_rcu():
CPU 1 (writer):    urcu_wait_add();     // return 0
CPU 1 (writer):    urcu_move_waiters(); // @gp_waiters is empty, the next urcu_wait_add() will return 0
CPU 2 (writer):  synchronize_rcu():
CPU 2 (writer):    urcu_wait_add();     // return 0
CPU 1 (writer):    mutex_lock(&rcu_registry_lock);
CPU 1 (writer):    wait_for_readers();  // remove registered reader from @registry, release rcu_registry_lock and wait via poll()
CPU 2 (writer):    mutex_lock(&rcu_registry_lock);
CPU 2 (writer):    wait_for_readers();  // @registry is empty! we are so lucky
CPU 2 (writer):    return;
CPU 2 (writer):  if (old)
CPU 2 (writer):      free(old)
CPU 0 (reader):  rcu_read_unlock();
CPU 0 (reader):  assert(*local_ptr == 8); // but local_ptr (i.e. old) is already freed.

So the point is there could be two writers calling synchronize_rcu() but
not returning early (both of them enter the slow path to perform a grace
period), so the rcu_gp_lock is necessary in this case.

(Cc Mathieu)

But this is only my understanding and I'm learning the URCU code too ;-)

Regards,
Boqun

> Might well be that it is unnecessary, but I will defer to Mathieu
> on that point.
>
> 				Thanx, Paul
>
> On Wed, Apr 27, 2016 at 10:18:04PM -0400, Yuxin Ren wrote:
> > As they don't concurrently perform grace periods, why do we use the rcu_gp_lock?
> >
> > Thank you.
> > Yuxin
> >
> > On Wed, Apr 27, 2016 at 10:08 PM, Paul E. McKenney wrote:
> > > On Wed, Apr 27, 2016 at 09:34:16PM -0400, Yuxin Ren wrote:
> > >> Hi,
> > >>
> > >> I am learning the URCU code.
> > >>
> > >> Why do we need rcu_gp_lock in synchronize_rcu?
> > >> https://github.com/urcu/userspace-rcu/blob/master/urcu.c#L401
> > >>
> > >> In the comment, it says this lock ensures mutual exclusion between
> > >> threads calling synchronize_rcu().
> > >> But only the first thread added to the waiter queue can proceed to detect
> > >> a grace period.
> > >> How can multiple threads concurrently perform the grace period?
> > >
> > > They don't concurrently perform grace periods, and it would be wasteful
> > > for them to do so. Instead, the first one performs the grace period,
> > > and all that were waiting at the time it started get the benefit of that
> > > same grace