On 07/20, Dan Williams wrote:
> On Mon, Jul 17, 2017 at 4:56 PM, Vishal Verma <[email protected]> 
> wrote:
> > Clearing errors or badblocks during a BTT write requires sending an ACPI
> > DSM, which means potentially sleeping. Since a BTT IO happens in atomic
> > context (preemption disabled, spinlocks may be held), we cannot perform
> > error clearing in the course of an IO. Due to this error clearing for
> > BTT IOs has hitherto been disabled.
> >
> > In this patch we move error clearing out of the atomic section, and thus
> > re-enable error clearing with BTTs. When we are about to add a block to
> > the free list, we check if it was previously marked as an error, and if
> > it was, we add it to the freelist, but also set a flag that says error
> > clearing will be required. We then drop the lane (ending the atomic
> > context), and send a zero buffer so that the error can be cleared. The
> > error flag in the free list is protected by the nd 'lane', and is set
> > only be a thread while it holds that lane. When the error is cleared,
> > the flag is cleared, but while holding a mutex for that freelist index.
> >
> > When writing, we check for two things -
> > 1/ If the freelist mutex is held or if the error flag is set. If so,
> > this is an error block that is being (or about to be) cleared.
> > 2/ If the block is a known badblock based on nsio->bb
> >
> > The second check is required because the BTT map error flag for a map
> > entry only gets set when an error LBA is read. If we write to a new
> > location that may not have the map error flag set, but still might be in
> > the region's badblock list, we can trigger an EIO on the write, which is
> > undesirable and completely avoidable.
> >
> > Cc: Jeff Moyer <[email protected]>
> > Cc: Toshi Kani <[email protected]>
> > Cc: Dan Williams <[email protected]>
> > Signed-off-by: Vishal Verma <[email protected]>
> > ---
> [..]
> > diff --git a/drivers/nvdimm/btt.h b/drivers/nvdimm/btt.h
> > index 2bc0d10b..69c1f90 100644
> > --- a/drivers/nvdimm/btt.h
> > +++ b/drivers/nvdimm/btt.h
> > @@ -15,6 +15,7 @@
> >  #ifndef _LINUX_BTT_H
> >  #define _LINUX_BTT_H
> >
> > +#include <linux/badblocks.h>
> >  #include <linux/types.h>
> >
> >  #define BTT_SIG_LEN 16
> > @@ -41,6 +42,7 @@
> >  #define ent_lba(ent) (ent & MAP_LBA_MASK)
> >  #define ent_e_flag(ent) (!!(ent & MAP_ERR_MASK))
> >  #define ent_z_flag(ent) (!!(ent & MAP_TRIM_MASK))
> > +#define set_e_flag(ent) (ent |= MAP_ERR_MASK)
> >
> >  enum btt_init_state {
> >         INIT_UNCHECKED = 0,
> > @@ -82,6 +84,8 @@ struct free_entry {
> >         u32 block;
> >         u8 sub;
> >         u8 seq;
> > +       struct mutex err_lock;
> > +       u8 has_err;
> 
> Do we really need an error lock per lane? Given the rarity of lock
> contention I think we can save this memory and just have one error
> handling lock per-btt.

Yes, I could make it per arena, that makes more sense - but the only
downside might be all writes to that arena get blocked behind the error
clear instead of just the one(s) that pick that free block. That said,
it shouldn't be a big deal as errors should be rare..

_______________________________________________
Linux-nvdimm mailing list
[email protected]
https://lists.01.org/mailman/listinfo/linux-nvdimm

Reply via email to