----- Original Message -----
> On Tue, Dec 22, 2020 at 9:39 PM Bob Peterson <[email protected]> wrote:
> >
> > Hi,
> >
> > Before this patch, journal recovery was done by a workqueue function that
> > operated on a per-journal basis. The problem is, these could run
> > simultaneously, which meant that they could all use the same bio,
> > sd_log_bio, to do their writing to all the various journals. These
> > operations overwrote one another, eventually causing memory corruption.
> >
> > This patch makes the recovery workqueue operate on a per-superblock basis,
> > which means a mount point using, for example, journal0 could do recovery
> > for all journals that need recovery. This is done consecutively, so
> > sd_log_bio is only referenced by one recovery at a time, thus avoiding
> > the chaos.
> >
> > Since the journal recovery requests can come in any order, and
> > unpredictably, the new work function loops until there are no more
> > journals to be recovered.
> >
> > Since multiple processes may request recovery of a journal, and since
> > they all now use the same sdp-based workqueue, it's okay for them to get
> > an error from queue_work: queueing work while there's already work queued.
> >
> > Signed-off-by: Bob Peterson <[email protected]>
>
> Can't this be a simple loop like below?
>
> repeat:
> 	spin_lock(&sdp->sd_jindex_spin);
> 	list_for_each_entry(jd, &sdp->sd_jindex_list, jd_list) {
> 		if (test_bit(JDF_RECOVERY, &jd->jd_flags)) {
> 			spin_unlock(&sdp->sd_jindex_spin);
> 			gfs2_recover_one(jd);
> 			goto repeat;
> 		}
> 	}
> 	spin_unlock(&sdp->sd_jindex_spin);
Yes, that's just as effective; I just hate gotos. Whichever you prefer is
fine with me. I'm okay with not reporting the count, too.

Bob
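
For what it's worth, the same loop can be written without the goto. Below
is a minimal sketch, not the actual patch: it assumes GFS2's internal types
and the gfs2_recover_one() helper named in the suggestion above, and
gfs2_recover_all is a made-up name for illustration.

/*
 * Sketch of a goto-free equivalent of the loop quoted above.  Each
 * pass scans the journal list under sd_jindex_spin, remembers one
 * journal that needs recovery, drops the lock, recovers it, and then
 * rescans from the start, since the list may change while unlocked.
 */
static void gfs2_recover_all(struct gfs2_sbd *sdp)
{
	struct gfs2_jdesc *jd;
	bool again;

	do {
		again = false;
		spin_lock(&sdp->sd_jindex_spin);
		list_for_each_entry(jd, &sdp->sd_jindex_list, jd_list) {
			if (test_bit(JDF_RECOVERY, &jd->jd_flags)) {
				again = true;
				break;	/* jd still points at this entry */
			}
		}
		spin_unlock(&sdp->sd_jindex_spin);
		if (again)
			gfs2_recover_one(jd);
	} while (again);
}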
