Chris, any final comments? Everyone else seems happy with this approach now.
On Wed, Nov 28, 2012 at 4:36 PM, Chandler Carruth <[email protected]> wrote:
> Hey Chris!
>
> We chatted briefly about this patch during the devmtg, and I wanted to follow up here about it. What follows is a very brief summary of our discussion as I recall it -- chime in if I missed anything important.
>
> On Fri, Oct 5, 2012 at 5:09 PM, Chris Lattner <[email protected]> wrote:
> > Hi Chandler,
> >
> > Sorry for the delay, I still haven't had a chance to look at your patch. That said, here are some high-level comments based on my understanding of it:
> >
> > 1) I'm not worried about "weird" integer sizes in register variables. LLVM is pretty good at promoting stuff to "native" integers, and if there are bugs handling them, we want to know anyway. If this is a convenient way to model arithmetic on an i56 bitfield, then that's cool with me.
>
> Cool, no disagreement here.
>
> > 2) I don't like the idea of "weird" integer sizes on loads and stores to struct memory, though. One of the major things that the current bitfield design is intended to encourage is to allow CSE and DSE of accesses to struct fields. It currently does this by widening out loads (too much!), but if the widening is the problem, I'd rather see this expanded out into a series of loads (e.g. an i8 load and an i16 load, instead of an i24 load). The reason for this is that the optimizer won't be splitting up i24 loads -- codegen will, and exposing truth to the mid-level optimizer is the right thing to do.
>
> This is, I think, the interesting question. We discussed it quite a bit in person, and I'm not going to try to repeat all of it here...
>
> I think my essential argument is that, given the way the memory model (both C++'s and LLVM's) works, it is specifically useful for the frontend to emit maximally widened and/or merged stores. This exposes the maximal width of store which will not introduce a data race. Without this information, the optimizer can't freely use natural-width operations in many cases, and it becomes harder for the optimizer to combine loads and stores to unrelated parts of a bitfield, because it may appear important to leave them separate to avoid data races.
>
> One possibility which we specifically discussed was pre-splitting the wide load or store operations into "natural" integer widths for the target. This has (in my mind) two significant downsides:
>
> 1) LLVM will hoist some of these operations across basic blocks, which will prevent the target from merging them when profitable (say, the memory is aligned and it can do a 128-bit store).
>
> 2) It duplicates the logic to intelligently decompose a load or store of a wide integer between the frontend and the codegen layer in LLVM. This is particularly frustrating because the codegen layer will in many cases do a *better* job of decomposing things than the frontend will, by taking advantage of the exact instructions available and the final alignment of the memory location.
>
> > 3) This is the sort of change that needs some numbers to back it up. Performance results would be best, but a decent proxy for them in this case would be to look at the code size of functions in 176.gcc or something else that uses a bunch of bitfields.
>
> I think this is crucial. =] I've run the LNT test suite with and without this patch, and I get no regressions (or improvements) in performance. That said, my box is quite noisy.
> If I can submit the patch, the LNT build bots will give us much more stable data, I suspect.
>
> I don't have 176.gcc handy, but I looked at the binary sizes for every program in the LNT test suite. Here are the benchmarks which changed by at least 1%. The first column is the benchmark, then the old size of the '.text' section, then the new size, and finally "(new - old) / old" to show the relative change:
>
> MultiSource/Benchmarks/Fhourstones-3.1/Output/fhourstones3.1.simple, 4872, 4984, 0.022988505747126436
> SingleSource/UnitTests/Output/2009-04-16-BitfieldInitialization.simple, 952, 888, -0.067226890756302518
> SingleSource/UnitTests/Output/ms_struct-bitfield-init-1.simple, 664, 680, 0.024096385542168676
> SingleSource/UnitTests/Output/2006-01-23-UnionInit.simple, 904, 920, 0.017699115044247787
> SingleSource/Regression/C++/Output/2011-03-28-Bitfield.simple, 568, 552, -0.028169014084507043
>
> So, one benchmark grew by 2.2%, but it's a tiny benchmark to start with, so this isn't terribly interesting.
>
> Of the bitfield-heavy single-source tests, we have 1.7%, 2.4%, -2.8% and -6.7%. So the growth here doesn't seem worrisome at all.
>
> Any other benchmarking you'd like me to do to validate this?
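
To make the memory-model point above concrete, here is a minimal C++ sketch. The struct and its members are hypothetical, not taken from the patch, and the layout comments assume the Itanium C++ ABI:

// A minimal sketch, assuming the Itanium C++ ABI layout; 'Packet' and its
// members are hypothetical and not taken from the patch.
struct Packet {
  unsigned kind : 8;    // kind, length and flags are adjacent bit-fields,
  unsigned length : 12; // so together they form a 24-bit storage unit --
  unsigned flags : 4;   // one "memory location" in the C++11 sense.
  char tag;             // adjacent in memory, but a separate memory location
};

// These two functions may run concurrently on the same Packet without a
// data race, because they touch different memory locations.
void set_flags(Packet &p, unsigned f) {
  // A read-modify-write covering all 24 bits of the bit-field unit is safe:
  // it stays within the one memory location the bit-fields share. That is
  // the widest access the frontend can expose without inventing a race.
  p.flags = f;
}

void set_tag(Packet &p, char t) {
  // Widening the bit-field access above to a round 32 bits would also
  // rewrite 'tag' and race with this store -- the "widening out loads
  // (too much!)" problem. Stopping at the 24-bit unit, and letting codegen
  // decide how to split or widen further, avoids that.
  p.tag = t;
}

In IR terms, this is presumably the difference between a single i24 load/store of the storage unit, an over-wide i32 access, and a pre-split i8 plus i16 pair; the argument in the thread is that the single i24 form tells the optimizer the true race-free width and leaves instruction-level decomposition to codegen.
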
_______________________________________________
cfe-commits mailing list
[email protected]
http://lists.cs.uiuc.edu/mailman/listinfo/cfe-commits
