hi, > On Nov 22, 2011, at 2:17 AM, YAMAMOTO Takashi wrote: > >> hi, >> >>> On Fri, Nov 18, 2011 at 06:31:21AM +0000, YAMAMOTO Takashi wrote: >>>>>> postgresql assumes instant lseek(SEEK_END) to get the size of >>>>>> their heap files. >>>>>> >>>>>> http://rhaas.blogspot.com/2011/11/linux-lseek-scalability.html >>>>>> >>>>>> as fsync etc keeps the vnode lock during i/o, it might cause severe >>>>>> performance regression. >>>>> >>>>> I wonder if it is worth having a separate VOP for that, which would >>>>> retrieve a subset of vattr without lock held. There are potentially >>>>> more uses in the tree. >>>> >>>> while it's good to have VOP_GETATTR take a mask to specify a set of >>>> requested attributes, i don't think it's worth to have a serparete VOP >>>> for this. (PR/30250 is related) >>>> >>>> IMO we should make unlocked VOP calls safe (against revoke and umount -f) >>>> and put VOP_GETATTR locking back into filesystem internal. >>> >>> ad looked into that and concluded it was prohibitively expensive; too >>> much contention on every entry/exit. >> >> what's "it"? >> >> i agree naive reference counting on each VOP calls can be too expensive. > > If "it" looked like the attached sketch, are two atomic adds per call too > expensive? Could even make it cheaper if VOP_LOCK() only takes the pre-op > block, VOP_UNLOCK() only takes the post-op block and VOPs with a locked vnode > take none of them making VOP_LOCK, VOP_XXX ... VOP_UNLOCK sequences atomic > with regard to revoke.
it doesn't seem to be much cheaper than a normal rwlock, which would involve the same number of atomic ops in uncontended cases. YAMAMOTO Takashi > > -- > Juergen Hannken-Illjes - hann...@eis.cs.tu-bs.de - TU Braunschweig (Germany) > > #define VWANTREVOKE 0x40000000 > > revoke() > ... > oc = atomic_add_32_nv(&vp->v_opcnt, VWANTREVOKE); > while (oc != VWANTREVOKE) { > cv_wait(&vp->v_cv, vp->v_interlock); > oc = vp->v_opcnt; > } > ... > atomic_add_32(&vp->v_opcnt, -VWANTREVOKE); > cv_broadcast(&vp->v_cv); > > > vop_xxx() > ... > /* pre-op */ > for (;;) { > oc = atomic_add_32_nv(&vp->v_opcnt, 1); > if (predict_true((oc & VWANTREVOKE) == 0) > break; > oc = atomic_add_32_nv(&vp->v_opcnt, -1); > mutex_enter(&vp->v_interlock); > while (oc & VWANTREVOKE) { > cv_wait(&vp->v_cv, vp->v_interlock); > oc = vp->v_opcnt; > } > mutex_exit(&vp->v_interlock); > } > ... > VCALL(...) > ... > /* post-op */ > oc = atomic_add_32_nv(&vp->v_opcnt, -1); > if (predict_false(oc == VWANTREVOKE)) { > mutex_enter(&vp->v_interlock); > cv_broadcast(&vp->v_cv); > mutex_exit(&vp->v_interlock); > }