----- Original Message -----
> On Tue, Dec 22, 2020 at 9:39 PM Bob Peterson <[email protected]> wrote:
> >
> > Hi,
> >
> > Before this patch, journal recovery was done by a workqueue function that
> > operated on a per-journal basis. The problem is that these could run
> > simultaneously, which meant they could all use the same bio, sd_log_bio,
> > for their writes to the various journals. Those writes overwrote one
> > another, eventually causing memory corruption.
> >
> > This patch makes the recovery workqueue operate on a per-superblock basis,
> > which means a mount point using, for example, journal0 can do recovery
> > for all journals that need it. The journals are recovered one at a time,
> > so sd_log_bio is only referenced by one recovery at a time, thus avoiding
> > the chaos.
> >
> > Since journal recovery requests can arrive in any order and at
> > unpredictable times, the new work function loops until there are no more
> > journals to recover.
> >
> > Since multiple processes may request recovery of a journal, and since they
> > all now use the same sdp-based workqueue, it's okay for them to get a
> > failure from queue_work(): it just means the work is already queued, and
> > the pending run will handle the request.
> >
> > Signed-off-by: Bob Peterson <[email protected]>
> 
> Can't this be a simple loop like below?
> 
> repeat:
>         spin_lock(&sdp->sd_jindex_spin);
>         list_for_each_entry(jd, &sdp->sd_jindex_list, jd_list) {
>                 if (test_bit(JDF_RECOVERY, &jd->jd_flags)) {
>                         spin_unlock(&sdp->sd_jindex_spin);
>                         gfs2_recover_one(jd);
>                         goto repeat;
>                 }
>         }
>         spin_unlock(&sdp->sd_jindex_spin);
> 

Yes, that's just as effective. I just hate gotos. Whichever you prefer is fine 
with me.
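
FWIW, the same rescan works without the goto if you hoist a found flag,
something like this (untested sketch; sd_recovery_work is a placeholder
name for the new per-sdp work item):

static void gfs2_recovery_work_func(struct work_struct *work)
{
        struct gfs2_sbd *sdp = container_of(work, struct gfs2_sbd,
                                            sd_recovery_work);
        struct gfs2_jdesc *jd;
        bool found;

        do {
                /* Find one journal still flagged for recovery. */
                found = false;
                spin_lock(&sdp->sd_jindex_spin);
                list_for_each_entry(jd, &sdp->sd_jindex_list, jd_list) {
                        if (test_bit(JDF_RECOVERY, &jd->jd_flags)) {
                                found = true;
                                break;
                        }
                }
                spin_unlock(&sdp->sd_jindex_spin);

                /* Recover outside the spinlock, then rescan in case
                 * more requests arrived in the meantime. */
                if (found)
                        gfs2_recover_one(jd);
        } while (found);
}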
I'm okay with not reporting the count too.
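
And on the request side, nothing clever is needed: set the journal's
JDF_RECOVERY bit and kick the per-sdp work item. A false return from
queue_work() is harmless, since the already-pending work will rescan the
list and see the bit. Another untested sketch, with sd_recovery_wq and
sd_recovery_work as placeholder names:

static void gfs2_request_recovery(struct gfs2_sbd *sdp,
                                  struct gfs2_jdesc *jd)
{
        /* Mark the journal before queueing, so the work function's
         * rescan is guaranteed to see it. */
        set_bit(JDF_RECOVERY, &jd->jd_flags);

        /* queue_work() returns false if this work item is already
         * pending; that's fine, the pending run picks the journal up. */
        queue_work(sdp->sd_recovery_wq, &sdp->sd_recovery_work);
}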

Bob
