Re: [lttng-dev] RCU consistency guarantees

2019-12-15 Thread Yuxin Ren
On Sun, Dec 15, 2019 at 3:30 PM Paul E. McKenney  wrote:

> On Sat, Dec 14, 2019 at 01:31:31AM -0500, Yuxin Ren wrote:
> > Hi Paul
> >
> > On Sat, Dec 7, 2019 at 5:42 PM Paul E. McKenney 
> wrote:
> >
> > > On Sat, Dec 07, 2019 at 03:04:42PM -0500, Yuxin Ren wrote:
> > > > Thanks a lot for your help. I have some questions below.
> > > >
> > > > On Sat, Dec 7, 2019 at 1:37 AM Paul E. McKenney 
> > > wrote:
> > > >
> > > > > On Fri, Dec 06, 2019 at 07:00:13PM -0500, Yuxin Ren wrote:
> > > > > > Thanks so much for your great help.
> > > > > > I definitely will look at those resources and papers!
> > > > > >
> > > > > > One more thing that I am confused about:
> > > > > > as I mentioned earlier, someone said "One key distinction is that
> > > > > > both MVCC and RLU provide much stronger consistency guarantees to
> > > > > > readers than does RCU ..." (https://lwn.net/Articles/777036/).
> > > > >
> > > > > That someone was in fact me.  ;-)
> > > > >
> > > > > > I am not sure if the above statement is correct or not. But in
> > > > > > general, how can we compare RCU consistency guarantees to other
> > > > > > techniques (such as RLU)?
> > > > > > How to reason about which one has stronger or weaker guarantees?
> > > > >
> > > > > I suggest starting from the use case.  For concreteness, let's assume
> > > > > that we are using a hash table.  At one extreme, imagine a use case in
> > > > > which each event makes exactly one hash-table operation.  No information
> > > > > is carried from one event to the next.  (This might well be the case
> > > > > for a simple web server.)  Such a use case cannot tell the difference
> > > > > between RCU on the one hand and MVCC/RLU on the other.
> > > > >
> > > > > At the other extreme, suppose that each event either accesses or updates
> > > > > multiple entries in the hash table.  In this case, MVCC/RLU will rule
> > > > > out outcomes that RCU would permit.  For example, suppose we had four
> > > > > events accessing two different elements in different buckets of the
> > > > > hash table:
> > > > >
> > > > > E1: Adds 32 to the hash table.
> > > > > E2: Adds 1729 to the hash table.
> > > > > E3: Within a read-side critical section, looks up 32 then 1729.
> > > > > E4: Within a read-side critical section, looks up 1729 then 32.
> > > > >
> > > > > Given either MVCC or RLU, it will not be possible for E3 to find 32 but
> > > > > not 1729 and at the same time for E4 to find 1729 but not 32.  Given
> > > > > RCU, this outcome is possible.
> > > > >
> > > > When you say "Within a read-side section", do you mean within a single
> > > > read-side section? such as
> > > >
> > > > > read_lock()
> > > > > lookup(32)
> > > > > lookup(1729)
> > > > > read_unlock()
> > > >
> > > >
> > > > How about putting the two lookups into two read-side sections? Do we
> > > > still have the problem, then?
> > > >
> > > > > read_lock()
> > > > > lookup(32)
> > > > > read_unlock()
> > > > > read_lock()
> > > > > lookup(1729)
> > > > > read_unlock()
> > >
> > > Without in any way agreeing with your characterization of this as a
> > > problem, because rcu_read_lock() and rcu_read_unlock() provide
> > > absolutely no ordering guarantees in the absence of a grace period,
> > > any non-grace-period-related reordering that can happen with a single
> > > RCU read-side critical section can also happen when that critical
> > > section is split in two as you have done above.
> > >
> > > > Could you kindly give me more clues about why RCU can see such
> > > > reordering, while RLU can prevent it?
> > >
> > > Here are minimal C-language implementations for RCU that can (and are)
> > > actually used:

Re: [lttng-dev] RCU consistency guarantees

2019-12-13 Thread Yuxin Ren
Hi Paul

On Sat, Dec 7, 2019 at 5:42 PM Paul E. McKenney  wrote:

> On Sat, Dec 07, 2019 at 03:04:42PM -0500, Yuxin Ren wrote:
> > Thanks a lot for your help. I have some questions below.
> >
> > On Sat, Dec 7, 2019 at 1:37 AM Paul E. McKenney 
> wrote:
> >
> > > On Fri, Dec 06, 2019 at 07:00:13PM -0500, Yuxin Ren wrote:
> > > > Thanks so much for your great help.
> > > > I definitely will look at those resources and papers!
> > > >
> > > > One more thing that I am confused about:
> > > > as I mentioned earlier, someone said "One key distinction is that both
> > > > MVCC and RLU provide much stronger consistency guarantees to readers
> > > > than does RCU ..." (https://lwn.net/Articles/777036/).
> > >
> > > That someone was in fact me.  ;-)
> > >
> > > > I am not sure if the above statement is correct or not. But in general,
> > > > how can we compare RCU consistency guarantees to other techniques (such
> > > > as RLU)?
> > > > How to reason about which one has stronger or weaker guarantees?
> > >
> > > I suggest starting from the use case.  For concreteness, let's assume
> > > that we are using a hash table.  At one extreme, imagine a use case in
> > > which each event makes exactly one hash-table operation.  No information
> > > is carried from one event to the next.  (This might well be the case
> > > for a simple web server.)  Such a use case cannot tell the difference
> > > between RCU on the one hand and MVCC/RLU on the other.
> > >
> > > At the other extreme, suppose that each event either accesses or updates
> > > multiple entries in the hash table.  In this case, MVCC/RLU will rule
> > > out outcomes that RCU would permit.  For example, suppose we had four
> > > events accessing two different elements in different buckets of the
> > > hash table:
> > >
> > > E1: Adds 32 to the hash table.
> > > E2: Adds 1729 to the hash table.
> > > E3: Within a read-side critical section, looks up 32 then 1729.
> > > E4: Within a read-side critical section, looks up 1729 then 32.
> > >
> > > Given either MVCC or RLU, it will not be possible for E3 to find 32 but
> > > not 1729 and at the same time for E4 to find 1729 but not 32.  Given RCU,
> > > this outcome is possible.
> > >
> > When you say "Within a read-side section", do you mean within a single
> > read-side section? such as
> >
> > > read_lock()
> > > lookup(32)
> > > lookup(1729)
> > > read_unlock()
> >
> >
> > How about putting the two lookups into two read-side sections? Do we still
> > have the problem, then?
> >
> > > read_lock()
> > > lookup(32)
> > > read_unlock()
> > > read_lock()
> > > lookup(1729)
> > > read_unlock()
>
> Without in any way agreeing with your characterization of this as a
> problem, because rcu_read_lock() and rcu_read_unlock() provide
> absolutely no ordering guarantees in the absence of a grace period,
> any non-grace-period-related reordering that can happen with a single
> RCU read-side critical section can also happen when that critical
> section is split in two as you have done above.
>
> > Could you kindly give me more clues about why RCU can see such reordering,
> > while RLU can prevent it?
>
> Here are minimal C-language implementations for RCU that can (and are)
> actually used:
>
Great. We use the same thing in our real-time work [1]


> #define rcu_read_lock()
> #define rcu_read_unlock()
>
> Please compare these to the read-side markers presented in the RLU paper,
> and then tell me your thoughts on the answer to your question.  ;-)
>
I submit my homework here, but I do not think I did it well.
1. I believe the default URCU implementation has memory barriers inside
read_lock / read_unlock.
2. The RLU paper shows the code for its read-side operations:
RLU_READER_LOCK(ctx)
    ctx.is-writer ← false
    ctx.run-cnt ← ctx.run-cnt + 1
    memory fence
    ctx.local-clock ← global-clock

RLU_READER_UNLOCK(ctx)
    ctx.run-cnt ← ctx.run-cnt + 1
    if (ctx.is-writer)
        RLU_COMMIT_WRITE_LOG(ctx)

We can ignore the writer check, as in our case the readers never do updates.
My understanding is that the read-side operations are only used to facilitate
quiescence detection:
the run count is used to decide if a thread is active (if it is currently
inside a read-side section);
the local clock is used to check whether the active thread entered its
section before the writer advanced the global clock.

Re: [lttng-dev] RCU consistency guarantees

2019-12-07 Thread Yuxin Ren
Thanks a lot for your help. I have some questions below.

On Sat, Dec 7, 2019 at 1:37 AM Paul E. McKenney  wrote:

> On Fri, Dec 06, 2019 at 07:00:13PM -0500, Yuxin Ren wrote:
> > Thanks so much for your great help.
> > I definitely will look at those resources and papers!
> >
> > One more thing that I am confused about:
> > as I mentioned earlier, someone said "One key distinction is that both
> > MVCC and RLU provide much stronger consistency guarantees to readers than
> > does RCU ..." (https://lwn.net/Articles/777036/).
>
> That someone was in fact me.  ;-)
>
> > I am not sure if the above statement is correct or not. But in general,
> > how can we compare RCU consistency guarantees to other techniques (such
> > as RLU)?
> > How to reason about which one has stronger or weaker guarantees?
>
> I suggest starting from the use case.  For concreteness, let's assume
> that we are using a hash table.  At one extreme, imagine a use case in
> which each event makes exactly one hash-table operation.  No information
> is carried from one event to the next.  (This might well be the case
> for a simple web server.)  Such a use case cannot tell the difference
> between RCU on the one hand and MVCC/RLU on the other.
>
> At the other extreme, suppose that each event either accesses or updates
> multiple entries in the hash table.  In this case, MVCC/RLU will rule
> out outcomes that RCU would permit.  For example, suppose we had four
> events accessing two different elements in different buckets of the
> hash table:
>
> E1: Adds 32 to the hash table.
> E2: Adds 1729 to the hash table.
> E3: Within a read-side critical section, looks up 32 then 1729.
> E4: Within a read-side critical section, looks up 1729 then 32.
>
> Given either MVCC or RLU, it will not be possible for E3 to find 32 but
> not 1729 and at the same time for E4 to find 1729 but not 32.  Given RCU,
> this outcome is possible.
>
When you say "Within a read-side section", do you mean within a single
read-side section? such as

> read_lock()
> lookup(32)
> lookup(1729)
> read_unlock()


How about putting two lookups into two read-side sections? Do we still have
the problem, then?

> read_lock()
> lookup(32)
> read_unlock()
> read_lock()
> lookup(1729)
> read_unlock()


Could you kindly give me more clues about why RCU can see such reordering,
while RLU can prevent it?


> This is because MVCC and RLU provide readers a consistent view of
> the updates, and RCU does not.  Of course, it is often the case that a
> consistent view is not needed, in which case the MVCC and RLU guarantees
> are incurring read-side overhead for no reason.  But if the use case
> requires consistent readers, RCU is not an option.
>
> The reason a consistent view is not always needed is that speed-of-light
> delays make it impossible to provide a consistent view of the outside
> world.  In the common case where the use case interacts with the
> outside world, the algorithms absolutely must be designed to tolerate
> inconsistency, which opens the door to things like RCU.
>

I am confused here. I think speed-of-light delays apply everywhere, not
only to RCU, but also to any other synchronization approach (such as RLU).
If so, how do others (e.g., RLU) provide consistent views?

Thanks for your education.
Yuxin

>
>         Thanx, Paul
>
> > Thanks
> > Yuxin
> >
> > On Fri, Dec 6, 2019 at 11:30 AM Paul E. McKenney 
> wrote:
> >
> > > On Fri, Dec 06, 2019 at 10:59:05AM -0500, Mathieu Desnoyers wrote:
> > > > ----- On Dec 6, 2019, at 3:51 PM, Yuxin Ren 
> wrote:
> > > >
> > > > > On Fri, Dec 6, 2019 at 5:49 AM Mathieu Desnoyers
> > > > > <mathieu.desnoy...@efficios.com> wrote:
> > > >
> > > > >> - On Dec 5, 2019, at 8:17 PM, Yuxin Ren <r...@gwmail.gwu.edu> wrote:
> > > >
> > > > >>> Hi,
> > > > >>> I am a student, and learning RCU now, but still know very little
> > > about it.
> > > > >>> Are there any documents/papers/materials which (in)formally define
> > > > >>> and explain RCU consistency guarantees?
> > > >
> > > > >> You may want to have a look at
> > > >
> > > > >> User-Level Implementations of Read-Copy Update
> > > > >> Article in IEEE Transactions on Parallel and Distributed Systems
> > > > >> 23(2):375 - 382 · March 2012

Re: [lttng-dev] RCU consistency guarantees

2019-12-06 Thread Yuxin Ren
Thanks so much for your great help.
I definitely will look at those resources and papers!

One more thing that I am confused about:
as I mentioned earlier, someone said "One key distinction is that both MVCC
and RLU provide much stronger consistency guarantees to readers than does
RCU ..." (https://lwn.net/Articles/777036/).
I am not sure if the above statement is correct or not. But in general,
how can we compare RCU consistency guarantees to other techniques (such as
RLU)?
How to reason about which one has stronger or weaker guarantees?

Thanks
Yuxin

On Fri, Dec 6, 2019 at 11:30 AM Paul E. McKenney  wrote:

> On Fri, Dec 06, 2019 at 10:59:05AM -0500, Mathieu Desnoyers wrote:
> > - On Dec 6, 2019, at 3:51 PM, Yuxin Ren  wrote:
> >
> > > On Fri, Dec 6, 2019 at 5:49 AM Mathieu Desnoyers
> > > <mathieu.desnoy...@efficios.com> wrote:
> >
> > >> - On Dec 5, 2019, at 8:17 PM, Yuxin Ren <r...@gwmail.gwu.edu> wrote:
> >
> > >>> Hi,
> > >>> I am a student, and learning RCU now, but still know very little
> about it.
> > >>> Are there any documents/papers/materials which (in)formally define
> > >>> and explain RCU consistency guarantees?
> >
> > >> You may want to have a look at
> >
> > >> User-Level Implementations of Read-Copy Update
> > >> Article in IEEE Transactions on Parallel and Distributed Systems
> 23(2):375 - 382
> > >> · March 2012
> >
> > > Thanks for your info.
> > > However, I do not think URCU talks about any consistency model formally.
> >
> > > From previous communication with Paul, he said RCU is not designed for
> > > linearizability, and it is totally acceptable that RCU is not
> > > linearizable.
> > > However, I am curious how to accurately/formally characterize the RCU
> > > consistency model/guarantees.
> >
> > Adding Paul E. McKenney in CC.
> >
> > I am referring to the section "Overview of RCU semantics" in the paper.
> > Not sure it has the level of formality you are looking for, though.
> > Paul, do you have pointers to additional material?
>
> Indeed I do!  The Linux kernel memory model (LKMM) includes RCU.  It is
> in tools/memory-model in recent kernel source trees, which includes
> documentation.  This is an executable model, which means that you
> can create litmus tests and have the model formally and automatically
> evaluate them.
>
> There are also a number of publications covering LKMM:
>
> o   A formal kernel memory-ordering model
> https://lwn.net/Articles/718628/
> https://lwn.net/Articles/720550/
>
> These cover the release stores and dependency ordering that
> provide RCU's publish-subscribe guarantees.
>
> Backup material here:
>
>
> https://mirrors.edge.kernel.org/pub/linux/kernel/people/paulmck/LWNLinuxMM/
>
> With these two likely being of particular interest:
>
>
> https://mirrors.edge.kernel.org/pub/linux/kernel/people/paulmck/LWNLinuxMM/RCUguarantees.html
>
> https://mirrors.edge.kernel.org/pub/linux/kernel/people/paulmck/LWNLinuxMM/srcu.html
>
> o   Frightening Small Children and Disconcerting Grown-ups:
> Concurrency in the Linux Kernel
> https://dl.acm.org/citation.cfm?id=3177156
> http://www0.cs.ucl.ac.uk/staff/j.alglave/papers/asplos18.pdf
>
> Backup material:
>
> http://diy.inria.fr/linux/
>
> o   Who's afraid of a big bad optimizing compiler?
> https://lwn.net/Articles/793253/
>
> o   Calibrating your fear of big bad optimizing compilers
> https://lwn.net/Articles/799218/
>
> These last two justify use of normal C-language assignment
> statements to initialize and access data referenced by
> RCU-protected pointers.
>
> There is a large body of litmus tests (thousands of them) here:
>
> https://github.com/paulmckrcu/litmus
>
> Many of these litmus tests involve RCU, and these can be located by
> search for files containing rcu_read_lock(), rcu_read_unlock(),
> synchronize_rcu(), and so on.
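As a concrete example of the kind of litmus test described above, here is a sketch of the classic grace-period guarantee in LKMM's litmus format (model input for the herd7 tool run against the kernel memory model, not code to be compiled):

```
C RCU-gp-guarantee

{}

P0(int *x, int *y)
{
	WRITE_ONCE(*x, 1);
	synchronize_rcu();
	WRITE_ONCE(*y, 1);
}

P1(int *x, int *y)
{
	int r0;
	int r1;

	rcu_read_lock();
	r0 = READ_ONCE(*y);
	r1 = READ_ONCE(*x);
	rcu_read_unlock();
}

exists (1:r0=1 /\ 1:r1=0)
```

The model reports this exists clause as forbidden: a reader that observes the store following synchronize_rcu() must also observe the store preceding it.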
>
> Or were you looking for something else?
>
> Thanx, Paul
>
> > Thanks,
> >
> > Mathieu
> >
> > >> as a starting point.
> >
> > >> Thanks,
> >
> > >> Mathieu
> >
> > >>> I know there are some consistency models in the database area (such
> > >>> as PRAM, Read Uncommitted, etc) from https://jepsen.io/consistency and [1].

[lttng-dev] RCU consistency guarantees

2019-12-06 Thread Yuxin Ren
Hi,

I am a student, and learning RCU now, but still know very little about it.
Are there any documents/papers/materials which (in)formally define and
explain RCU consistency guarantees?

I know there are some consistency models in the database area (such as
PRAM, Read Uncommitted, etc) from https://jepsen.io/consistency and [1].
How does RCU relate to those consistency models?

I also found some comments online (One key distinction is that both MVCC
and RLU provide much stronger consistency guarantees to readers than does
RCU ...) (https://lwn.net/Articles/777036/).
I do not understand how we reason about/describe/compare these
consistency guarantees. (I even do not know what consistency guarantees
are formally provided by RCU.)
Could someone explain this to me?

[1] Bailis, P., Davidson, A., Fekete, A., Ghodsi, A., Hellerstein, J. M., &
Stoica, I. (2013). Highly available transactions: Virtues and limitations.
Proceedings of the VLDB Endowment, 7(3), 181-192.

Thanks
Yuxin
___
lttng-dev mailing list
lttng-dev@lists.lttng.org
https://lists.lttng.org/cgi-bin/mailman/listinfo/lttng-dev


Re: [lttng-dev] RCU consistency guarantees

2019-12-06 Thread Yuxin Ren
On Fri, Dec 6, 2019 at 5:49 AM Mathieu Desnoyers <
mathieu.desnoy...@efficios.com> wrote:

>
> - On Dec 5, 2019, at 8:17 PM, Yuxin Ren  wrote:
>
> Hi,
> I am a student, and learning RCU now, but still know very little about it.
> Are there any documents/papers/materials which (in)formally define and
> explain RCU consistency guarantees?
>
>
> You may want to have a look at
>
> User-Level Implementations of Read-Copy Update
> Article in IEEE Transactions on Parallel and Distributed Systems 23(2):375
> - 382 · March 2012
>

Thanks for your info.
However, I do not think URCU talks about any consistency model formally.

From previous communication with Paul, he said RCU is not
designed for linearizability, and it is totally acceptable that RCU is
not linearizable.
However, I am curious how to accurately/formally characterize the RCU
consistency model/guarantees.

>
> as a starting point.
>
> Thanks,
>
> Mathieu
>
>
> I know there are some consistency models in the database area (such as
> PRAM, Read Uncommitted, etc) from https://jepsen.io/consistency and [1].
> How does RCU relate to those consistency models?
>
> I also found some comments online (One key distinction is that both MVCC
> and RLU provide much stronger consistency guarantees to readers than does
> RCU ...) (https://lwn.net/Articles/777036/).
> I do not understand how we reason about/describe/compare these
> consistency guarantees. (I even do not know what consistency guarantees
> are formally provided by RCU.)
> Could someone explain this to me?
>
>
>
> [1] Bailis, P., Davidson, A., Fekete, A., Ghodsi, A., Hellerstein, J. M.,
> & Stoica, I. (2013). Highly available transactions: Virtues and
> limitations. Proceedings of the VLDB Endowment, 7(3), 181-192.
>
> Thanks
> Yuxin
>
>
>
> --
> Mathieu Desnoyers
> EfficiOS Inc.
> http://www.efficios.com
>


[lttng-dev] RCU consistency guarantees

2019-12-05 Thread Yuxin Ren
Hi,

I am a student, and learning RCU now, but still know very little about it.
Are there any documents/papers/materials which (in)formally define and
explain RCU consistency guarantees?

I know there are some consistency models in the database area (such as
PRAM, Read Uncommitted, etc) from https://jepsen.io/consistency and [1].
How does RCU relate to those consistency models?

I also found some comments online (One key distinction is that both MVCC
and RLU provide much stronger consistency guarantees to readers than does
RCU ...) (https://lwn.net/Articles/777036/).
I do not understand how we reason about/describe/compare these
consistency guarantees. (I even do not know what consistency guarantees
are formally provided by RCU.)
Could someone explain this to me?

[1] Bailis, P., Davidson, A., Fekete, A., Ghodsi, A., Hellerstein, J. M., &
Stoica, I. (2013). Highly available transactions: Virtues and limitations.
Proceedings of the VLDB Endowment, 7(3), 181-192.

Thanks
Yuxin


Re: [lttng-dev] question about the RCU variant in CITRUS tree paper

2017-05-12 Thread Yuxin Ren
Hi Paul,

Thank you for your reply.

If I understand your reply correctly, the update-side lock you
mentioned is the lock used in the tree-deletion algorithm.
But their urcu_synchronize contains no lock.
So I think the lock is a problem caused by their usage of RCU,
not by their urcu_synchronize implementation.

I want to compare their RCU implementation with the U-RCU
implementation, because the authors argued their implementation
performs better than U-RCU.
Is it possible to use their new RCU implementation as a drop-in
replacement for U-RCU?

I am relatively new to RCU, so my question could be stupid.
Many thanks for your time
Yuxin

On Thu, May 11, 2017 at 4:23 PM, Paul E. McKenney
<paul...@linux.vnet.ibm.com> wrote:
> On Thu, May 11, 2017 at 04:05:45PM -0400, Yuxin Ren wrote:
>> Hi,
>>
>> I am learning U-RCU now.
>> And I read paper Concurrent Updates with RCU: Search Tree as an Example
>> ( 
>> https://pdfs.semanticscholar.org/73e4/cd29273cf9d98d35bc184330e694ba798987.pdf
>> )
>>
>> In this paper, the authors present a variant RCU implementation, and
>> argued their new RCU has better performance than default U-RCU.
>>
> >> Do you think their argument and implementation are correct in all cases?
> >> If they are right, would you want to integrate their improvement into the
> >> U-RCU implementation?
>>
>> For your convenience, I paste the related text from the paper here.
>> "In our implementation, each thread has a counter and flag, the
>> counter counts the number of critical sections executed by the thread
>> and a flag indicates if the thread is currently inside its read-side
>> critical section. The rcu_read_lock operation increments the counter
>> and sets the flag to true, while the rcu_read_unlock operation sets
>> the flag to false. When a thread executes a synchronize_rcu operation,
>> it waits for every other thread, until one of two things occurs:
>> either the thread has increased its counter or the thread’s flag is
>> set to false. "
>>
>> One its implementation can be found from synchrobench
>> https://github.com/gramoli/synchrobench/blob/master/c-cpp/src/trees/tree-lock/new_urcu.c
>
> I covered this one here:  https://lwn.net/Articles/667593/
>
> The short version is that they are working around what I consider to
> be a design bug in their algorithm, namely that they are holding the
> update-side lock across RCU grace periods.  They do this to achieve
> linearizability, which is prized by many conference referees/reviewers,
> but not as useful in practice as is commonly supposed.
>
> But it does have a broken URL to the paper, so I will send your working
> version to the LWN editors CCing you.  Thank you for that!
>
> Thanx, Paul
>


[lttng-dev] question about the RCU variant in CITRUS tree paper

2017-05-11 Thread Yuxin Ren
Hi,

I am learning U-RCU now.
And I read paper Concurrent Updates with RCU: Search Tree as an Example
( https://pdfs.semanticscholar.org/73e4/cd29273cf9d98d35bc184330e694ba798987.pdf
)

In this paper, the authors present a variant RCU implementation, and
argued their new RCU has better performance than default U-RCU.

Do you think their argument and implementation are correct in all cases?
If they are right, would you want to integrate their improvement into the
U-RCU implementation?

For your convenience, I paste the related text from the paper here.
"In our implementation, each thread has a counter and flag, the
counter counts the number of critical sections executed by the thread
and a flag indicates if the thread is currently inside its read-side
critical section. The rcu_read_lock operation increments the counter
and sets the flag to true, while the rcu_read_unlock operation sets
the flag to false. When a thread executes a synchronize_rcu operation,
it waits for every other thread, until one of two things occurs:
either the thread has increased its counter or the thread’s flag is
set to false. "

One its implementation can be found from synchrobench
https://github.com/gramoli/synchrobench/blob/master/c-cpp/src/trees/tree-lock/new_urcu.c


Thanks a lot!
Yuxin


[lttng-dev] RCU on non-cache-coherent memory

2016-08-01 Thread Yuxin Ren
Hi all,

Is there any research or publications about RCU on top of
non-cache-coherent multi-core architecture?
Not only RCU, any other synchronization technique on top of
non-cache-coherent multi-core
is also helpful.

Thanks a lot!
Yuxin


Re: [lttng-dev] Question about lock in synchronize_rcu implementation of URCU

2016-04-28 Thread Yuxin Ren
Hi Boqun and Paul,

Thank you so much for your help.

I found one reason to use that lock.
In the slow path, a thread will move all waiters to a local queue.
https://github.com/urcu/userspace-rcu/blob/master/urcu.c#L406
After this, a following thread can also perform a grace period, as the
global waiter queue is empty.
Thus the rcu_gp_lock is used to ensure mutual exclusion.

However, from a real-time perspective, such blocking can cause priority
inversion: a higher-priority writer can be blocked by a lower-priority
writer.
Is there a way to modify the code to allow multiple threads to perform
grace periods concurrently?

Thanks again!!
Yuxin

On Thu, Apr 28, 2016 at 8:44 AM, Boqun Feng <boqun.f...@gmail.com> wrote:
> Hi Paul and Yuxin,
>
> On Wed, Apr 27, 2016 at 09:23:27PM -0700, Paul E. McKenney wrote:
>> Try building without it and see what happens when you run the tests.
>>
>
> I've run a 'regtest' with the following modification, out of curiosity:
>
> diff --git a/urcu.c b/urcu.c
> index a5568bdbd075..9dc3c9feae56 100644
> --- a/urcu.c
> +++ b/urcu.c
> @@ -398,8 +398,6 @@ void synchronize_rcu(void)
> /* We won't need to wake ourself up */
> urcu_wait_set_state(&wait, URCU_WAIT_RUNNING);
>
> -   mutex_lock(&rcu_gp_lock);
> -
> /*
>  * Move all waiters into our local queue.
>  */
> @@ -480,7 +478,6 @@ void synchronize_rcu(void)
> smp_mb_master();
>  out:
> mutex_unlock(&rcu_registry_lock);
> -   mutex_unlock(&rcu_gp_lock);
>
> /*
>  * Wakeup waiters only after we have completed the grace period
>
>
> And guess what, the result of the test was:
>
> Test Summary Report
> ---
> ./run-urcu-tests.sh 1 (Wstat: 0 Tests: 979 Failed: 18)
>   Failed tests:  30, 45, 60, 75, 90, 103, 105, 120, 135
>   150, 165, 180, 195, 210, 225, 240, 253
>   255
> Files=2, Tests=996, 1039 wallclock secs ( 0.55 usr  0.04 sys + 8981.02 cusr 
> 597.64 csys = 9579.25 CPU)
> Result: FAIL
>
> And test case 30 for example is something like:
>
> tests/benchmark/test_urcu_mb   1 -d 0 -b 32768
>
> and it failed because:
>
> lt-test_urcu_mb: test_urcu.c:183: thr_reader: Assertion `*local_ptr == 8' 
> failed.
> zsh: abort (core dumped)  ~/userspace-rcu/tests/benchmark/test_urcu_mb 4 4 1 
> -d 0 -b 32768
>
> So I think what was going on here was:
>
> Time flows downward:
>
> CPU 0 (reader):   rcu_read_lock();
> CPU 2 (writer):   new = malloc(sizeof(int));
> CPU 0 (reader):   local_ptr = rcu_dereference(test_rcu_pointer); // local_ptr == old
> CPU 2 (writer):   *new = 8;
> CPU 2 (writer):   old = rcu_xchg_pointer(&test_rcu_pointer, new);
> CPU 1 (writer):   synchronize_rcu():
> CPU 1 (writer):     urcu_wait_add(); // return 0
> CPU 1 (writer):     urcu_move_waiters(); // @gp_waiters is empty, so the
> CPU 1 (writer):                          // next urcu_wait_add() will return 0
> CPU 2 (writer):   synchronize_rcu():
> CPU 2 (writer):     urcu_wait_add(); // return 0
> CPU 1 (writer):   mutex_lock(&rcu_registry_lock);
> CPU 1 (writer):   wait_for_readers(); // remove registered reader from @registry,
> CPU 1 (writer):                       // release rcu_registry_lock and wait via poll()
> CPU 2 (writer):   mutex_lock(&rcu_registry_lock);
> CPU 2 (writer):   wait_for_readers(); // @registry is empty! we are so lucky
> CPU 2 (writer):   return;
> CPU 2 (writer):   if (old)
> CPU 2 (writer):           free(old);
> CPU 0 (reader):   rcu_read_unlock();
> CPU 0 (reader):   assert(*local_ptr == 8); // but local_ptr (i.e., old) is already freed
>
>
> So the point is that there could be two writers calling synchronize_rcu()
> without either returning early (both of them enter the slow path to perform
> a grace period), so the rcu_gp_lock is necessary in this case.
>
> (Cc  Mathieu)
>
> But this is only my understanding and I'm learning the URCU code too ;-)
>

Re: [lttng-dev] Question about lock in synchronize_rcu implementation of URCU

2016-04-27 Thread Yuxin Ren
As they don't concurrently perform grace periods, why do we use the rcu_gp_lock?

Thank you.
Yuxin

On Wed, Apr 27, 2016 at 10:08 PM, Paul E. McKenney
<paul...@linux.vnet.ibm.com> wrote:
> On Wed, Apr 27, 2016 at 09:34:16PM -0400, Yuxin Ren wrote:
>> Hi,
>>
>> I am learning the URCU code.
>>
>> Why do we need rcu_gp_lock in synchronize_rcu?
>> https://github.com/urcu/userspace-rcu/blob/master/urcu.c#L401
>>
>> In the comment, it says this lock ensures mutual exclusion between
>> threads calling synchronize_rcu().
> >> But only the first thread added to the waiter queue can proceed to
> >> detect a grace period.
> >> How can multiple threads concurrently perform the grace period?
>
> They don't concurrently perform grace periods, and it would be wasteful
> for them to do so.  Instead, the first one performs the grace period,
> and all that were waiting at the time it started get the benefit of that
> same grace period.
>
> Any that arrived after the first grace period performs the first
> grace period are served by whichever of them performs the second
> grace period.
>
> Thanx, Paul
>


[lttng-dev] Question about lock in synchronize_rcu implementation of URCU

2016-04-27 Thread Yuxin Ren
Hi,

I am learning the URCU code.

Why do we need rcu_gp_lock in synchronize_rcu?
https://github.com/urcu/userspace-rcu/blob/master/urcu.c#L401

In the comment, it says this lock ensures mutual exclusion between
threads calling synchronize_rcu().
But only the first thread added to the waiter queue can proceed to detect
a grace period.
How can multiple threads concurrently perform the grace period?

Thanks a lot!
Yuxin


Re: [lttng-dev] real time Userspace RCU

2016-04-15 Thread Yuxin Ren
I really appreciate your reply!!

I have one more question.
From a real-time perspective, preemption of URCU read-side critical sections
can cause priority-inversion issues.
Does the current URCU implementation deal with this problem?
And if so, how does it solve it?

Thanks
Yuxin

On Thu, Mar 31, 2016 at 7:22 AM, Paul E. McKenney
<paul...@linux.vnet.ibm.com> wrote:
> On Thu, Mar 31, 2016 at 09:20:07AM +0800, Yuxin Ren wrote:
>> Thank you all!!
>>
>> I agree URCU does timing quite well.
>> But are there any formal response-time analyses for URCU/RCU (both
>> read and update paths)?
>
> Not that I know of.  You could be the first!
>
>> Or could anyone guide me how to do RTA for URCU/RCU?
>
> Compile something with a simple RCU read-side critical section, and then
> count the instructions.  QSBR will of course work best, but MB will also
> have good bounds.  Signal-based will be a bit more complicated.
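To see why the QSBR read side is so easy to bound, here is a toy model of its read-side primitives. This is illustrative only, not liburcu's code: the lock/unlock pair compiles to essentially nothing, and announcing a quiescent state is one load plus one store.

```c
#include <stdatomic.h>

static atomic_ulong global_gp = 1;            /* global grace-period counter */
static _Thread_local atomic_ulong reader_gp;  /* this thread's last announcement */

static inline void toy_read_lock(void)   { }  /* QSBR: compiles to nothing */
static inline void toy_read_unlock(void) { }  /* QSBR: compiles to nothing */

/* Announcing a quiescent state: copy the global counter into this
 * thread's slot. One load and one store -- a fixed, tiny instruction
 * sequence that is straightforward to bound for response-time analysis. */
static inline void toy_quiescent_state(void)
{
	atomic_store_explicit(&reader_gp,
			      atomic_load_explicit(&global_gp,
						   memory_order_acquire),
			      memory_order_release);
}
```

A grace-period detector (not shown) would simply wait until every registered thread's reader_gp has caught up with global_gp, which is why the read path stays so cheap.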
>
> Much depends on what type of RTA you want to do.  Brandenburg's
> dissertation has quite a bit of info and many citations:
>
> http://www.cs.unc.edu/~bbb/diss/
>
> You can also take an experimental approach, though a great many
> runs are required.  OSADL (https://www.osadl.org/) does quite a
> bit of this work on -rt Linux.
>
> Thanx, Paul
>
>> Thanks again.
>>
>> On Fri, Mar 11, 2016 at 10:00 PM, Mathieu Desnoyers
>> <mathieu.desnoy...@efficios.com> wrote:
>> > - On Mar 11, 2016, at 6:45 AM, Paul E. McKenney 
>> > paul...@linux.vnet.ibm.com wrote:
>> >
>> >> On Thu, Mar 10, 2016 at 08:53:05PM +, Mathieu Desnoyers wrote:
>> >>> - On Mar 10, 2016, at 3:33 PM, Yuxin Ren r...@gwmail.gwu.edu wrote:
>> >>>
>> >>> > Thank you for your reply.
>> >>> >
>> >>> > I want to understand, generally, how to apply URCU to real-time systems.
>> >>> > I know real-time systems focus on predictability in both timing and
>> >>> > memory consumption.
>> >>> > So how does real-time URCU support predictability?
>> >>> > Could you provide me with papers, documents, or other materials about
>> >>> > any aspect of real-time URCU?
>> >>>
>> >>> Adding Paul E. McKenney in CC, who may have some thoughts on this
>> >>> topic.
>> >>
>> >> URCU does timing quite well, given that the read-side primitives each
>> >> execute a fixed sequence of instructions.  Updates using call_rcu()
>> >> can be used to minimize update-side latency, but if you need to bound
>> >> memory overhead, the best way to do that is to make sure that updates
>> >> are not on the critical path, and then use synchronize_rcu() instead
>> >> of call_rcu().  In that case, the total amount of memory waiting for
>> >> reclamation is bounded by the maximum size of an RCU-protected memory
>> >> block times the number of threads.
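The bounded-memory pattern Paul describes might look like the following sketch, with a stubbed-out fake_synchronize_rcu() standing in for the real grace-period wait (the struct and function names are illustrative):

```c
#include <assert.h>
#include <stdatomic.h>
#include <stdlib.h>

struct node { int value; };

static _Atomic(struct node *) gp_ptr;  /* the RCU-protected pointer */
static int retired;                    /* blocks awaiting reclamation */

/* Stand-in: the real synchronize_rcu() blocks until all pre-existing
 * readers have left their read-side critical sections. */
static void fake_synchronize_rcu(void) { }

static void update(int value)
{
	struct node *newp = malloc(sizeof *newp);
	struct node *oldp;

	newp->value = value;
	oldp = atomic_exchange_explicit(&gp_ptr, newp, memory_order_release);
	if (oldp) {
		retired++;               /* old version awaits reclamation */
		fake_synchronize_rcu();  /* wait in place, not call_rcu() */
		free(oldp);
		retired--;
	}
	/* At most one retired block per updater thread is outstanding,
	 * which is the bound described above. */
	assert(retired <= 1);
}
```

With call_rcu(), by contrast, the callback queue can grow without bound if updates outpace grace periods; waiting in place trades update latency for that memory bound.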
>> >
>> > An intermediate solution if both update throughput and bounded-memory
>> > are required (but the application would not have real-time constraints
>> > on updates) would be to use the defer_rcu() API in liburcu. It amortizes
>> > the cost of synchronize_rcu() over many defer_rcu() calls with a worker
>> > thread, but only up to an upper bound. When the upper bound is reached,
>> > the defer_rcu() call empties the defer queue itself.
>> >
>> > Thanks,
>> >
>> > Mathieu
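The amortization idea behind defer_rcu() can be sketched as follows. The names, the queue bound, and the caller-flushes behavior here are illustrative stand-ins, not liburcu's internals:

```c
#include <stdlib.h>

#define DEFER_QUEUE_MAX 8      /* illustrative bound, not liburcu's value */

struct deferred { void (*fn)(void *); void *arg; };

static struct deferred queue[DEFER_QUEUE_MAX];
static int queue_len;
static int gp_waits;           /* how many grace-period waits were paid */

/* Stand-in for the real grace-period wait. */
static void fake_synchronize_rcu(void) { gp_waits++; }

/* One grace period covers the whole batch of queued callbacks. */
static void flush_defer_queue(void)
{
	fake_synchronize_rcu();
	for (int i = 0; i < queue_len; i++)
		queue[i].fn(queue[i].arg);
	queue_len = 0;
}

/* Amortized defer: the synchronize_rcu() cost is paid once per
 * DEFER_QUEUE_MAX calls, so pending memory stays bounded; when the
 * bound is reached, the caller itself empties the queue. */
static void toy_defer_rcu(void (*fn)(void *), void *arg)
{
	if (queue_len == DEFER_QUEUE_MAX)
		flush_defer_queue();
	queue[queue_len++] = (struct deferred){ fn, arg };
}
```

Twenty deferred frees against a bound of eight cost only two grace-period waits in-line, which is the amortization being described; a worker thread (omitted here) would drain the residue in the background.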
>> >
>> >>
>> >> So can you design your application so that updates are off the critical
>> >> path?  If so, you can get both bounded read-side accesses and bounded
>> >> memory footprint.
>> >>
>> >> This of course assumes that your data structures are simple enough
>> >> that readers don't need to use retry techniques.
>> >>
>> >> The following info might be helpful:
>> >>
>> >> http://www2.rdrop.com/users/paulmck/realtime/paper/DetSyncRCU.2009.08.18a.pdf
>> >> http://www2.rdrop.com/users/paulmck/realtime/paper/DetSyncRCU.2009.09.29a.pdf
>> >>
>> >> http://www2.rdrop.com/users/paulmck/realtime/paper/RTLWS2012occcRT.2012.10.19e.pdf
>> >>
>> >> It also depends on your timeframe.  Microseconds?  Life is hard.
>> >> Milliseconds?  Care is required, but you have a fair amount of freedom.
>> >> Seconds?  Life is not so hard.  Unless you need to do two seconds of
>> >> computation in one second or some such.  ;-)

Re: [lttng-dev] real time Userspace RCU

2016-03-30 Thread Yuxin Ren
Thank you all!!

I agree URCU does timing quite well.
But is there any formal response-time analysis for URCU/RCU (both the
read and update paths)?
Or could anyone guide me how to do RTA for URCU/RCU?

Thanks again.

On Fri, Mar 11, 2016 at 10:00 PM, Mathieu Desnoyers
<mathieu.desnoy...@efficios.com> wrote:
> - On Mar 11, 2016, at 6:45 AM, Paul E. McKenney 
> paul...@linux.vnet.ibm.com wrote:
>
>> On Thu, Mar 10, 2016 at 08:53:05PM +, Mathieu Desnoyers wrote:
>>> - On Mar 10, 2016, at 3:33 PM, Yuxin Ren r...@gwmail.gwu.edu wrote:
>>>
>>> > Thank you for your reply.
>>> >
>>> > I want to understand, generally, how to apply URCU to real-time systems.
>>> > I know real-time systems focus on predictability in both timing and
>>> > memory consumption.
>>> > So how does real-time URCU support predictability?
>>> > Could you provide me with papers, documents, or other materials about
>>> > any aspect of real-time URCU?
>>>
>>> Adding Paul E. McKenney in CC, who may have some thoughts on this
>>> topic.
>>
>> URCU does timing quite well, given that the read-side primitives each
>> execute a fixed sequence of instructions.  Updates using call_rcu()
>> can be used to minimize update-side latency, but if you need to bound
>> memory overhead, the best way to do that is to make sure that updates
>> are not on the critical path, and then use synchronize_rcu() instead
>> of call_rcu().  In that case, the total amount of memory waiting for
>> reclamation is bounded by the maximum size of an RCU-protected memory
>> block times the number of threads.
>
> An intermediate solution if both update throughput and bounded-memory
> are required (but the application would not have real-time constraints
> on updates) would be to use the defer_rcu() API in liburcu. It amortizes
> the cost of synchronize_rcu() over many defer_rcu() calls with a worker
> thread, but only up to an upper bound. When the upper bound is reached,
> the defer_rcu() call empties the defer queue itself.
>
> Thanks,
>
> Mathieu
>
>>
>> So can you design your application so that updates are off the critical
>> path?  If so, you can get both bounded read-side accesses and bounded
>> memory footprint.
>>
>> This of course assumes that your data structures are simple enough
>> that readers don't need to use retry techniques.
>>
>> The following info might be helpful:
>>
>> http://www2.rdrop.com/users/paulmck/realtime/paper/DetSyncRCU.2009.08.18a.pdf
>> http://www2.rdrop.com/users/paulmck/realtime/paper/DetSyncRCU.2009.09.29a.pdf
>>
>> http://www2.rdrop.com/users/paulmck/realtime/paper/RTLWS2012occcRT.2012.10.19e.pdf
>>
>> It also depends on your timeframe.  Microseconds?  Life is hard.
>> Milliseconds?  Care is required, but you have a fair amount of freedom.
>> Seconds?  Life is not so hard.  Unless you need to do two seconds of
>> computation in one second or some such.  ;-)
>>
>>   Thanx, Paul
>>
>>> Thanks,
>>>
>>> Mathieu
>>>
>>> >
>>> > Thanks again!
>>> >
>>> > On Thu, Mar 10, 2016 at 1:52 PM, Michel Dagenais
>>> > <michel.dagen...@polymtl.ca> wrote:
>>> >> Real-time and embedded systems are an important current focus for the
>>> >> LTTng toolchain research. Do you have specific needs for userspace RCU?
>>> >>
>>> >> - Original Message -
>>> >>> Hi,
>>> >>>
>>> >>>  Is there any work or research about Userspace RCU on real time or
>>> >>> embedded systems?
>>> >>> Any information is welcome.
>>> >>>
>>> >>> Thanks a lot!
>>> >>> Yuxin
>>> >>> ___
>>> >>> lttng-dev mailing list
>>> >>> lttng-dev@lists.lttng.org
>>> >>> https://lists.lttng.org/cgi-bin/mailman/listinfo/lttng-dev
>>> >>>
>>> > ___
>>> > lttng-dev mailing list
>>> > lttng-dev@lists.lttng.org
>>> > https://lists.lttng.org/cgi-bin/mailman/listinfo/lttng-dev
>>>
>>> --
>>> Mathieu Desnoyers
>>> EfficiOS Inc.
>>> http://www.efficios.com
>
> --
> Mathieu Desnoyers
> EfficiOS Inc.
> http://www.efficios.com


Re: [lttng-dev] real time Userspace RCU

2016-03-10 Thread Yuxin Ren
Thank you for your reply.

I want to understand, generally, how to apply URCU to real-time systems.
I know real-time systems focus on predictability in both timing and
memory consumption.
So how does real-time URCU support predictability?
Could you provide me with papers, documents, or other materials about
any aspect of real-time URCU?

Thanks again!

On Thu, Mar 10, 2016 at 1:52 PM, Michel Dagenais
 wrote:
> Real-time and embedded systems are an important current focus for the LTTng
> toolchain research. Do you have specific needs for userspace RCU?
>
> - Original Message -
>> Hi,
>>
>>  Is there any work or research about Userspace RCU on real time or
>> embedded systems?
>> Any information is welcome.
>>
>> Thanks a lot!
>> Yuxin
>> ___
>> lttng-dev mailing list
>> lttng-dev@lists.lttng.org
>> https://lists.lttng.org/cgi-bin/mailman/listinfo/lttng-dev
>>


[lttng-dev] real time Userspace RCU

2016-03-10 Thread Yuxin Ren
Hi,

 Is there any work or research about Userspace RCU on real time or
embedded systems?
Any information is welcome.

Thanks a lot!
Yuxin