> On Oct. 6, 2014, 11:31 p.m., Bayard Bell wrote: > > I'm a bit uncomfortable with the level of bug reporting and fix analysis > > here. The changes would seem sufficient to preventing panic in this case of > > import failure, but what's not clear is how we expect this to be handled > > thereafter, given that we don't handle it in these code paths, such that we > > know that we are doing the right thing, minimally for the range of errors > > that metaslab_init() can currently be passed by space_map_open(). > > Prakash Surya wrote: > metaslab_init() is only called from vdev_metaslab_init(). > vdev_metaslab_init() is called from two locations, vdev_load() and > vdev_expand(). vdev_load() handles this error (any vdev_metaslab_init() > error) by setting the the vdev_state, and vdev_expand handles this error > (again, any vdev_metaslab_init() error) by panic'ing. > > Do you have any specific concerns? The error seems to be handled clearly > to me. Am I overlooking something? > > Also, I should note, vdev_metaslab_init() can return an error if > dbuf_read() fails; so callers of vdev_metaslab_init() already had to support > the possiblity of this function failing. > > Bayard Bell wrote: > My concerns aren't with the code per se but with the lack of explicit > analysis offered about the immediate problems and concerns about generalising > a fix beyond those facts. It's not that I don't reckon you've done that, but > as a reviewer I'd prefer to have the root cause and fix analysis fleshed out, > which I understand to be what advocates also now require (thus what I wrote > up for 5133 in the issue tracker is quite extensive by comparison, where most > of the detail was provided at RTI). > > That's to say I grant that what I'm asking is a matter of preference, but > I take my preferences to be consistent with the bar for RTI and no less > relevant at this stage. > > Prakash Surya wrote: > Ah, OK. Yes, I agree, I should have given this review a proper > explanation in the description/testing sections. > > Bayard Bell wrote: > Thank you. > > There's one other broader question that I have, which is less error > handling and more error presentation. You've already noted that we have error > handling via vdev_state(), but for the specific problem of import failure, > should we log more specifically for diagnostic and recovery hints (e.g. "just > fix the labels, mate")? > > It seems self-evidently correct that we don't want to panic in case of an > import failure, but ought we not present the error condition more directly > than, say, expecting examination of FMA telemetry, as people are going to > want to direct and expedite resolution of lost data availability and to > understand whether data integrity issues are reasonably implicated? > > George Wilson wrote: > It might be possible to do better error presentation in some cases but > this one is tricky. For example, if the error is a checksum error you can't > tell between a bad config, on-disk corruption, a device that is currently > inaccessible, etc. > > Prakash Surya wrote: > Bayard, did you see George's response in the bug report > https://www.illumos.org/issues/5213 ? How do you feel about deferring the > error reporting investigation for another patch? Or do you think that's work > that needs to happen in this review?
Apologies, prakash, missed the update and your response. I think it's fine to defer the further diagnostic issues for a later change, as this does provide immediate and needed relief. - Bayard ----------------------------------------------------------- This is an automatically generated e-mail. To reply, visit: https://reviews.csiden.org/r/110/#review273 ----------------------------------------------------------- On Oct. 6, 2014, 2:02 p.m., Prakash Surya wrote: > > ----------------------------------------------------------- > This is an automatically generated e-mail. To reply, visit: > https://reviews.csiden.org/r/110/ > ----------------------------------------------------------- > > (Updated Oct. 6, 2014, 2:02 p.m.) > > > Review request for OpenZFS Developer Mailing List and Christopher Siden. > > > Bugs: 5213 > https://www.illumos.org/projects/illumos-gate//issues/5213 > > > Repository: illumos-gate > > > Description > ------- > > 5213 panic in metaslab_init due to space_map_open returning ENXIO > Reviewed by: Matthew Ahrens <[email protected]> > Reviewed by: George Wilson <[email protected]> > > Original author: Prakash Surya > > > Diffs > ----- > > usr/src/uts/common/fs/zfs/vdev.c 7571b21a5f8cd5c0a1d6406556bcc22fcaceb814 > usr/src/uts/common/fs/zfs/sys/metaslab.h > 7fd47c08c9df5a7a3f6f24c6a4deda84502d0b19 > usr/src/uts/common/fs/zfs/metaslab.c > a33ec7f628108f5c3d2206583b060e2d580a7dca > > Diff: https://reviews.csiden.org/r/110/diff/ > > > Testing > ------- > > > Thanks, > > Prakash Surya > >
_______________________________________________ developer mailing list [email protected] http://lists.open-zfs.org/mailman/listinfo/developer
