On Fri, Feb 08, 2008 at 07:09:07AM -0800, Arjan van de Ven wrote: > David Miller wrote: > >From: Linus Torvalds <[EMAIL PROTECTED]> > >Date: Thu, 7 Feb 2008 09:42:56 -0800 (PST) > > > >>Can we please just stop doing these one-by-one assignments, and just do > >>something like > >> > >> memset(rq, 0, sizeof(*rq)); > >> rq->q = q; > >> rq->ref_count = 1; > >> INIT_HLIST_NODE(&rq->hash); > >> RB_CLEAR_NODE(&rq->rb_node); > >> > >>instead? > >> > >>The memset() is likely faster and smaller than one-by-one assignments > >>anyway, even if the one-by-ones can avoid initializing some field or > >>there ends up being a double initialization.. > > > >The problem is store buffer compression. At least a few years > >ago this made a huge difference in sk_buff initialization in the > >networking. > > > >Maybe cpus these days have so much store bandwith that doing > >things like the above is OK, but I doubt it :-) > > on modern x86 cpus the memset may even be faster if the memory isn't in > cache; > the "explicit" method ends up doing Write Allocate on the cache lines > (so read them from memory) even though they then end up being written > entirely. > With memset the CPU is told that the entire range is set to a new value, and > the WA can be avoided for the whole-cachelines in the range.
Don't you have write combining store buffers? Or is it still speculatively issuing the reads even before the whole cacheline is combined? -- To unsubscribe from this list: send the line "unsubscribe linux-kernel" in the body of a message to [EMAIL PROTECTED] More majordomo info at http://vger.kernel.org/majordomo-info.html Please read the FAQ at http://www.tux.org/lkml/