Re: assign_desc() barriers: Re: [RFC PATCH v4 1/9] printk-rb: add a new printk ringbuffer implementation
On 2019-08-25, John Ogness wrote:
>> --- /dev/null
>> +++ b/kernel/printk/ringbuffer.c
>> +static bool assign_desc(struct prb_reserved_entry *e)
>> +{
[...]
>> +	atomic_long_set_release(&d->id, atomic_long_read(&d->id) +
>> +				DESCS_COUNT(rb));
>
> atomic_long_set_release() might be a bit confusing here.
> There is no related acquire.
>>>
>>> As the comment states, this release is for prb_getdesc() users. The
>>> only prb_getdesc() user is _dataring_pop(). If getdesc() returns NULL
>>> (i.e. the descriptor's ID is not what _dataring_pop() was expecting),
>>> then the tail must have moved and _dataring_pop() needs to see
>>> that. Since there are no data dependencies between descriptor ID and
>>> tail_pos, an explicit memory barrier is used. More on this below.
>
>> + The two related barriers are in different source files and APIs:
>>
>>     + assign_desc() in ringbuffer.c; ringbuffer API
>>     + _dataring_pop() in dataring.c; dataring API
>
> Agreed. This is a consequence of the ID management being within the
> high-level ringbuffer code. I could have added an smp_rmb() to the
> NULL case in prb_getdesc(). Then both barriers would be in the same
> file. However, this would mean smp_rmb() is called many times
> (particularly by readers) when it is not necessary.

What I wrote here is wrong. prb_getdesc() is not called "many times
(particularly by readers)". It is only called once, within the writer
function _dataring_pop().

Looking at this again, I think it would be better to move the smp_rmb()
into the NULL case of prb_getdesc(). Then both barrier pairs are located
(and documented) in the same file. This also simplifies the
documentation by not referring to "the caller's smp_rmb()" everywhere.

I would also change _dataring_pop() so that the smp_rmb() is located
within the handling of the other two failed checks (begin_lpos !=
tail_lpos and !_datablock_valid()). Then the out: at the end is just:

	return atomic_long_read(&dr->tail_lpos);
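[Editor's sketch] The restructuring proposed above might look roughly as
follows. This is C-style pseudocode only: the names tail_lpos,
begin_lpos, _datablock_valid(), and prb_getdesc() come from the quoted
discussion, while the surrounding control flow and the getdesc callback
plumbing are assumptions.

```
/* Pseudocode sketch of the proposed _dataring_pop() error handling. */
static unsigned long _dataring_pop(struct dataring *dr)
{
	unsigned long tail_lpos = atomic_long_read(&dr->tail_lpos);
	...
	d = dr->getdesc(id, dr->getdesc_arg);
	if (!d) {
		/* The smp_rmb() now lives inside prb_getdesc()'s NULL
		 * case, pairing with the set_release() in assign_desc(),
		 * so nothing more is needed here. */
		goto out;
	}
	if (begin_lpos != tail_lpos || !_datablock_valid(...)) {
		/* Tail moved or block invalid: pair with the MB in
		 * dataring_push() before re-reading the tail. */
		smp_rmb();
		goto out;
	}
	/* ... try to advance the tail past this data block ... */
out:
	return atomic_long_read(&dr->tail_lpos);
}
```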
After modifying the code in this way, I think it looks more
straightforward and would have avoided your confusion:

  The RMB in dataring.c:_dataring_pop() matches the MB in
  dataring.c:dataring_push(), and the RMB in ringbuffer.c:prb_getdesc()
  matches the SET_RELEASE in ringbuffer.c:assign_desc().

John Ogness
Re: assign_desc() barriers: Re: [RFC PATCH v4 1/9] printk-rb: add a new printk ringbuffer implementation
On 2019-08-22, Petr Mladek wrote:
> > --- /dev/null
> > +++ b/kernel/printk/ringbuffer.c
> > +/**
> > + * assign_desc() - Assign a descriptor to the caller.
[...]
> > +	/*
> > +	 * jB:
> > +	 *
> > +	 * Modify the descriptor ID so that users of the descriptor see that
> > +	 * it has been recycled. A _release() is used so that prb_getdesc()
> > +	 * callers can see all data ringbuffer updates after issuing a
> > +	 * pairing smb_rmb(). See iA for details.
> > +	 *
> > +	 * Memory barrier involvement:
> > +	 *
> > +	 * If dB->iA reads from jB, then dI reads the same value as
> > +	 * jA->cD->hA.
> > +	 *
> > +	 * Relies on:
> > +	 *
> > +	 * RELEASE from jA->cD->hA to jB
> > +	 *    matching
> > +	 * RMB between dB->iA and dI
> > +	 */
> > +	atomic_long_set_release(&d->id, atomic_long_read(&d->id) +
> > +				DESCS_COUNT(rb));
>
> atomic_long_set_release() might be a bit confusing here.
> There is no related acquire.

>> As the comment states, this release is for prb_getdesc() users. The
>> only prb_getdesc() user is _dataring_pop(). If getdesc() returns NULL
>> (i.e. the descriptor's ID is not what _dataring_pop() was expecting),
>> then the tail must have moved and _dataring_pop() needs to see
>> that. Since there are no data dependencies between descriptor ID and
>> tail_pos, an explicit memory barrier is used. More on this below.

After reading through your post, I think you are pairing the wrong
barriers together. jB pairs with dH (i.e. the set_release() in
assign_desc() pairs with the smp_rmb() in _dataring_pop()).

(The comment for jB wrongly says dI instead of dH! Argh!)

> OK, let me show how complicated and confusing this looks for me:

I want to address all your points here. _Not_ because I want to justify
or defend my insanity, but because it may help to provide some clarity.

> + The two related barriers are in different source files and APIs:
>
>     + assign_desc() in ringbuffer.c; ringbuffer API
>     + _dataring_pop() in dataring.c; dataring API

Agreed. This is a consequence of the ID management being within the
high-level ringbuffer code. I could have added an smp_rmb() to the NULL
case in prb_getdesc(). Then both barriers would be in the same
file. However, this would mean smp_rmb() is called many times
(particularly by readers) when it is not necessary.

> + Both the related barriers are around "id" manipulation.
>   But one is in dataring, other is in descriptors array.
>   One is about an old released "id". One is about a newly
>   assigned "id".

dB is not the pairing barrier of jB. As dB's comment says, it pairs
with gA. (The load_acquire(id) in _dataring_pop()
Re: assign_desc() barriers: Re: [RFC PATCH v4 1/9] printk-rb: add a new printk ringbuffer implementation
On Wed 2019-08-21 07:52:26, John Ogness wrote:
> On 2019-08-20, Petr Mladek wrote:
> >> > --- /dev/null
> >> > +++ b/kernel/printk/ringbuffer.c
> >> > +/**
> >> > + * assign_desc() - Assign a descriptor to the caller.
[...]
> >> > +	atomic_long_set_release(&d->id, atomic_long_read(&d->id) +
> >> > +				DESCS_COUNT(rb));
> >>
> >> atomic_long_set_release() might be a bit confusing here.
> >> There is no related acquire.
>
> As the comment states, this release is for prb_getdesc() users. The only
> prb_getdesc() user is _dataring_pop(). If getdesc() returns NULL
> (i.e. the descriptor's ID is not what _dataring_pop() was expecting),
> then the tail must have moved and _dataring_pop() needs to see
> that. Since there are no data dependencies between descriptor ID and
> tail_pos, an explicit memory barrier is used. More on this below.

OK, let me show how complicated and confusing this looks for me:

+ The two related barriers are in different source files and APIs:

    + assign_desc() in ringbuffer.c; ringbuffer API
    + _dataring_pop() in dataring.c; dataring API

+ Both the related barriers are around "id" manipulation.
  But one is in dataring, other is in descriptors array.
  One is about an old released "id". One is about a newly
  assigned "id".

+ The release() barrier is called once for each assigned descriptor.
  The acquire() barrier is called more times or not at all depending
  on the amount of free space in dataring.

+ prb_getdesc() is mentioned in the comment but the barrier is in
  _dataring_pop().

+ prb_getdesc() is called via dr->
Re: assign_desc() barriers: Re: [RFC PATCH v4 1/9] printk-rb: add a new printk ringbuffer implementation
On 2019-08-20, Petr Mladek wrote:
>> > --- /dev/null
>> > +++ b/kernel/printk/ringbuffer.c
>> > +/**
>> > + * assign_desc() - Assign a descriptor to the caller.
[...]
>> > +	atomic_long_set_release(&d->id, atomic_long_read(&d->id) +
>> > +				DESCS_COUNT(rb));
>>
>> atomic_long_set_release() might be a bit confusing here.
>> There is no related acquire.

As the comment states, this release is for prb_getdesc() users. The only
prb_getdesc() user is _dataring_pop(). If getdesc() returns NULL
(i.e. the descriptor's ID is not what _dataring_pop() was expecting),
then the tail must have moved and _dataring_pop() needs to see
that. Since there are no data dependencies between descriptor ID and
tail_pos, an explicit memory barrier is used. More on this below.

>> In fact, d->id manipulation has barriers from both sides:
>>
>> + smp_rmb() before so that all reads are finished before
>>   the id is updated (release)
>
> Uh, this statement does not make sense. The read barrier is not
> needed here. Instead the readers need it.
>
> Well, we might need a write barrier before d->id manipulation.
> It should be in numlist_pop() after successfully updating nl->tail_id.
> It will allow readers to detect that the descriptor is being reused
> (not in valid tail_id..head_id range) before we start manipulating it.
>
>> + smp_wmb() after so that the new ID is written before other
>>   related values are modified (acquire).
>>
>> The smp_wmb() barrier is in prb_reserve(). I would move it here.
>
> This still makes sense. I would move the write barrier from
> prb_reserve() here.

The issue is that _dataring_pop() needs to see a moved dataring tail if
prb_getdesc() fails. Just because numlist_pop() succeeded, doesn't mean
that this was the task that changed the dataring tail. I.e. another CPU
could observe that this task changed the ID but _not_ yet see that
another task changed the dataring tail.

Issuing an smp_mb() before setting the new ID would also suffice, but
that is a pretty big hammer for something that a set_release can take
care of.

> Sigh, I have to admit that I am not familiar with the _acquire(),
> _release(), and _relaxed() variants of the atomic operations.
Re: assign_desc() barriers: Re: [RFC PATCH v4 1/9] printk-rb: add a new printk ringbuffer implementation
On Tue 2019-08-20 10:22:53, Petr Mladek wrote:
> On Thu 2019-08-08 00:32:26, John Ogness wrote:
> > --- /dev/null
> > +++ b/kernel/printk/ringbuffer.c
> > +/**
> > + * assign_desc() - Assign a descriptor to the caller.
[...]
> > +	atomic_long_set_release(&d->id, atomic_long_read(&d->id) +
> > +				DESCS_COUNT(rb));
>
> atomic_long_set_release() might be a bit confusing here.
> There is no related acquire.
>
> In fact, d->id manipulation has barriers from both sides:
>
> + smp_rmb() before so that all reads are finished before
>   the id is updated (release)

Uh, this statement does not make sense. The read barrier is not
needed here. Instead the readers need it.

Well, we might need a write barrier before d->id manipulation.
It should be in numlist_pop() after successfully updating nl->tail_id.
It will allow readers to detect that the descriptor is being reused
(not in valid tail_id..head_id range) before we start manipulating it.

> + smp_wmb() after so that the new ID is written before other
>   related values are modified (acquire).
>
> The smp_wmb() barrier is in prb_reserve(). I would move it here.

This still makes sense. I would move the write barrier from
prb_reserve() here.

Sigh, I have to admit that I am not familiar with the _acquire(),
_release(), and _relaxed() variants of the atomic operations. They
probably make it easier to implement some locking API. I am not sure
how to use it here. This code implements a complex interlock between
several variables. I mean that several variables lock each other in a
cycle, like a state machine? In each case, it is not a simple locking
where we check state of a single variable.

Best Regards,
Petr
assign_desc() barriers: Re: [RFC PATCH v4 1/9] printk-rb: add a new printk ringbuffer implementation
On Thu 2019-08-08 00:32:26, John Ogness wrote:
> --- /dev/null
> +++ b/kernel/printk/ringbuffer.c
> +/**
> + * assign_desc() - Assign a descriptor to the caller.
> + *
> + * @e: The entry structure to store the assigned descriptor to.
> + *
> + * Find an available descriptor to assign to the caller. First it is checked
> + * if the tail descriptor from the committed list can be recycled. If not,
> + * perhaps a never-used descriptor is available. Otherwise, data blocks will
> + * be invalidated until the tail descriptor from the committed list can be
> + * recycled.
> + *
> + * Assigned descriptors are invalid until data has been reserved for them.
> + *
> + * Return: true if a descriptor was assigned, otherwise false.
> + *
> + * This will only fail if it was not possible to invalidate data blocks in
> + * order to recycle a descriptor. This can happen if a writer has reserved but
> + * not yet committed data and that reserved data is currently the oldest data.
> + */
> +static bool assign_desc(struct prb_reserved_entry *e)
> +{
> +	struct printk_ringbuffer *rb = e->rb;
> +	struct prb_desc *d;
> +	struct nl_node *n;
> +	unsigned long i;
> +
> +	for (;;) {
> +		/*
> +		 * jA:
> +		 *
> +		 * Try to recycle a descriptor on the committed list.
> +		 */
> +		n = numlist_pop(&rb->nl);
> +		if (n) {
> +			d = container_of(n, struct prb_desc, list);
> +			break;
> +		}
> +
> +		/* Fallback to static never-used descriptors. */
> +		if (atomic_read(&rb->desc_next_unused) < DESCS_COUNT(rb)) {
> +			i = atomic_fetch_inc(&rb->desc_next_unused);
> +			if (i < DESCS_COUNT(rb)) {
> +				d = &rb->descs[i];
> +				atomic_long_set(&d->id, i);
> +				break;
> +			}
> +		}
> +
> +		/*
> +		 * No descriptor available. Make one available for recycling
> +		 * by invalidating data (which some descriptor will be
> +		 * referencing).
> +		 */
> +		if (!dataring_pop(&rb->dr))
> +			return false;
> +	}
> +
> +	/*
> +	 * jB:
> +	 *
> +	 * Modify the descriptor ID so that users of the descriptor see that
> +	 * it has been recycled. A _release() is used so that prb_getdesc()
> +	 * callers can see all data ringbuffer updates after issuing a
> +	 * pairing smb_rmb(). See iA for details.
> +	 *
> +	 * Memory barrier involvement:
> +	 *
> +	 * If dB->iA reads from jB, then dI reads the same value as
> +	 * jA->cD->hA.
> +	 *
> +	 * Relies on:
> +	 *
> +	 * RELEASE from jA->cD->hA to jB
> +	 *    matching
> +	 * RMB between dB->iA and dI
> +	 */
> +	atomic_long_set_release(&d->id, atomic_long_read(&d->id) +
> +				DESCS_COUNT(rb));

atomic_long_set_release() might be a bit confusing here.
There is no related acquire.

In fact, d->id manipulation has barriers from both sides:

+ smp_rmb() before so that all reads are finished before
  the id is updated (release)

+ smp_wmb() after so that the new ID is written before other
  related values are modified (acquire).

The smp_wmb() barrier is in prb_reserve(). I would move it here.

Best Regards,
Petr

> +
> +	e->desc = d;
> +	return true;
> +}
> +
> +/**
> + * prb_reserve() - Reserve data in the ringbuffer.
> + *
> + * @e: The entry structure to setup.
> + *
> + * @rb: The ringbuffer to reserve data in.
> + *
> + * @size: The size of the data to reserve.
> + *
> + * This is the public function available to writers to reserve data.
> + *
> + * Context: Any context. Disables local interrupts on success.
> + * Return: A pointer to the reserved data or an ERR_PTR if data could not be
> + * reserved.
> + *
> + * If the provided size is legal, this will only fail if it was not possible
> + * to invalidate the oldest data block. This can happen if a writer has
> + * reserved but not yet committed data and that reserved data is currently
> + * the oldest data.
> + *
> + * The ERR_PTR values and their meaning:
> + *
> + * * -EINVAL: illegal @size value
> + * * -EBUSY:  failed to reserve a descriptor (@fail count incremented)
> + * * -ENOMEM: failed to reserve data (invalid descriptor committed)
> + */
> +char *prb_reserve(struct prb_reserved_entry *e, struct printk_ringbuffer *rb,
> +		  unsigned int size)
> +{
> +	struct prb_desc *d;
> +	unsigned long id;
> +	char *buf;
> +
> +	if (!dataring_checksize(&rb->dr, size))
> +		return ERR_PTR(-EINVAL);
> +
> +	e->rb = rb;
> +
> +	/*
> +	 * Disable interrupts during the reserve/commit window in order to
> +	 * minimize the number of reserved but not yet committed data blocks