Chris, any final comments? Everyone else seems happy with this approach now.
On Wed, Nov 28, 2012 at 4:36 PM, Chandler Carruth <[email protected]> wrote:
> Hey Chris!
>
> We chatted briefly about this patch during the devmtg, and I wanted to follow up here about it. What follows is a very brief summary of our discussion as I recall it -- chime in if I missed anything important.
>
> On Fri, Oct 5, 2012 at 5:09 PM, Chris Lattner <[email protected]> wrote:
> > Hi Chandler,
> >
> > Sorry for the delay, I still haven't had a chance to look at your patch. That said, here are some high-level comments based on my understanding of it:
> >
> > 1) I'm not worried about "weird" integer sizes in register variables. LLVM is pretty good at promoting stuff to "native" integers, and if there are bugs handling them, we want to know anyway. If this is a convenient way to model arithmetic on an i56 bitfield, then that's cool with me.
>
> Cool, no disagreement here.
>
> > 2) I don't like the idea of "weird" integer sizes on loads and stores to struct memory, though. One of the major things that the current bitfield design is intended to encourage is to allow CSE and DSE of accesses to struct fields. It currently does this by widening out loads (too much!), but if the widening is the problem, I'd rather see this expanded out into a series of loads (e.g. an i8 load and an i16 load, instead of an i24 load). The reason for this is that the optimizer won't be splitting up i24 loads -- codegen will, and exposing truth to the mid-level optimizer is the right thing to do.
>
> This is, I think, the interesting question. We discussed it quite a bit in person, and I'm not going to try to repeat all of it here...
>
> I think my essential argument is that, given the way the memory model (both C++'s and LLVM's) works, it is specifically useful for the frontend to emit maximally widened and/or merged stores. This exposes the maximal width of store which will not introduce a data race. Without this information, the optimizer can't freely use natural-width operations in many cases, and it becomes harder for the optimizer to combine loads and stores to unrelated parts of a bitfield, because it may appear important to leave them separate to avoid data races.
>
> One possibility which we specifically discussed was pre-splitting the wide load or store operations into "natural" integer widths for the target. This has (in my mind) two significant downsides:
>
> 1) LLVM will hoist some of these operations across basic blocks, which will prevent the target from merging them when profitable (say, the memory is aligned and it can do a 128-bit store).
>
> 2) It duplicates the logic to intelligently decompose a load or store of a wide integer between the frontend and the codegen layer in LLVM. This is particularly frustrating because the codegen layer will in many cases do a *better* job of decomposing things than the frontend will, by taking advantage of the exact instructions available and the final alignment of the memory location.
>
> > 3) This is the sort of change that needs some numbers to back it up. Performance results would be best, but a decent proxy for them in this case would be to look at the code size of functions in 176.gcc or something else that uses a bunch of bitfields.
>
> I think this is crucial. =] I've run the LNT test suite with and without this patch, and I get no regressions (or improvements) in performance. That said, my box is quite noisy.
> If I can submit the patch, the LNT build bots will give us much more stable data, I suspect.
>
> I don't have 176.gcc handy, but I looked at the binary sizes for every program in the LNT test suite. Here are the benchmarks which changed by at least 1%. The first column is the benchmark, then the old size of the '.text' section, then the new size, and finally "(new - old) / old" to show the relative change:
>
> MultiSource/Benchmarks/Fhourstones-3.1/Output/fhourstones3.1.simple, 4872, 4984, 0.022988505747126436
> SingleSource/UnitTests/Output/2009-04-16-BitfieldInitialization.simple, 952, 888, -0.067226890756302518
> SingleSource/UnitTests/Output/ms_struct-bitfield-init-1.simple, 664, 680, 0.024096385542168676
> SingleSource/UnitTests/Output/2006-01-23-UnionInit.simple, 904, 920, 0.017699115044247787
> SingleSource/Regression/C++/Output/2011-03-28-Bitfield.simple, 568, 552, -0.028169014084507043
>
> So, one benchmark grew by 2.2%, but it's a tiny benchmark to start with, so this isn't terribly interesting.
>
> Of the bitfield-heavy single-source tests, we have 1.7%, 2.4%, -2.8% and -6.7%. So the growth here doesn't seem worrisome at all.
>
> Any other benchmarking you'd like me to do to validate this?
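
To make the memory-model point above concrete, here is a minimal C++ sketch. The struct and its members are hypothetical, not taken from the patch, and the layout comments assume the Itanium C++ ABI:

// A minimal sketch, assuming the Itanium C++ ABI layout; 'Packet' and its
// members are hypothetical and not taken from the patch.
struct Packet {
  unsigned kind : 8;    // kind, length and flags are adjacent bit-fields,
  unsigned length : 12; // so together they form a 24-bit storage unit --
  unsigned flags : 4;   // one "memory location" in the C++11 sense.
  char tag;             // adjacent in memory, but a separate memory location
};

// These two functions may run concurrently on the same Packet without a
// data race, because they touch different memory locations.
void set_flags(Packet &p, unsigned f) {
  // A read-modify-write covering all 24 bits of the bit-field unit is safe:
  // it stays within the one memory location the bit-fields share. That is
  // the widest access the frontend can expose without inventing a race.
  p.flags = f;
}

void set_tag(Packet &p, char t) {
  // Widening the bit-field access above to a round 32 bits would also
  // rewrite 'tag' and race with this store -- the "widening out loads
  // (too much!)" problem. Stopping at the 24-bit unit, and letting codegen
  // decide how to split or widen further, avoids that.
  p.tag = t;
}

In IR terms, this is presumably the difference between a single i24 load/store of the storage unit, an over-wide i32 access, and a pre-split i8 plus i16 pair; the argument in the thread is that the single i24 form tells the optimizer the true race-free width and leaves instruction-level decomposition to codegen.
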
_______________________________________________
cfe-commits mailing list
[email protected]
http://lists.cs.uiuc.edu/mailman/listinfo/cfe-commits
