Re: [cfe-commits] PATCH: Large re-working of bitfield IR-gen, and a fix for PR13691

Chandler Carruth Wed, 03 Oct 2012 19:38:33 -0700

Chris, pinging again. It's been 2 weeks since my last ping, and there since
Daniel Dunbar handed it off to you. ;]



On Thu, Sep 20, 2012 at 6:35 PM, Chandler Carruth <[email protected]>wrote:

> Ping.
>
> I really want to move forward with this. We are observing real world bugs,
> races, crashes, and other badness surrounding bitfields used across
> threads. We cannot even distinguish between user error and compiler error
> because the compiler is known to generate the wrong thing.
>
> I'd love to hear any feedback or suggestions for further work here.
>
>
> On Fri, Sep 14, 2012 at 2:42 PM, Daniel Dunbar <[email protected]> wrote:
>
>> On Tue, Sep 11, 2012 at 11:02 AM, Chandler Carruth <[email protected]>
>> wrote:
>> > On Fri, Aug 31, 2012 at 11:46 AM, Daniel Dunbar <[email protected]>
>> wrote:
>> >>
>> >> Hi Chandler,
>> >>
>> >> I haven't done an extensive review of the patch yet (by which I would
>> >> mostly mean looking at the codegen, I trust you to verify the
>> >> correctness), so treat this as initial comments.
>> >
>> >
>> > Thanks for these. I'm attaching an updated patch with some fixes and
>> > improvements, see below.
>> >
>> >>
>> >> My one main initial comment is that I would like to treat the PR and
>> >> the rewrite as distinct issues. Obviously fixing the PR is a
>> >> no-brainer, but it would be nice to put in the minimal patch for that
>> >> in the current code. I wouldn't expect this to be hard or to result in
>> >> byte-by-byte accesses, but I haven't looked closely at what the change
>> >> should be. It sounds however like you attempted this and decided it
>> >> was hard enough to prefer a rewrite...
>> >
>> >
>> > Yes. Both Richard Smith and I looked at it, and neither of us had any
>> clear
>> > ideas about how to adapt the existing code to avoid the problem without
>> > creating  worse stores. As I explained in my previous email, the
>> existing
>> > strategy has two properties that make this hard: 1) it assumes it can
>> read
>> > before the bitfield in order to start off with an aligned load, and 2)
>> it
>> > wants to pre-split into strides which requires the frontend to implement
>> > intelligent striding logic.
>> >
>> >>
>> >> The one thing I am worried about is that the simplicity of your
>> >> approach is contingent on us using irregular integer sized types. If
>> >> we don't do that, I feel like the current version of the code is
>> >> pretty good (feel free to disagree though), so really the question
>> >> comes down to do we want to move to irregular sized integer types. At
>> >> the time the code was written, it was a conscious decision not to do
>> >> that. I agree with the general sentiment that SROA is already doing
>> >> this, I am just worried that if we ever needed to break that we would
>> >> be back to doing something like the current code is doing.
>> >
>> >
>> > Yes, we may, some day need to go back and change things. But we'll have
>> the
>> > old code kicking around in history. This at least gets us to correct
>> > behavior today and simplifies the code today.
>> >
>> >>
>> >> I did a reasonable amount of looking at the codegen on bitfield test
>> >> cases when I wrote the old code, but I probably did a poor job of
>> >> turning these into test cases. I'm assuming you probably already did
>> >> the same kind of testing on the new code? Do you have any test cases
>> >> that demonstrate better or worse codegen from the new model?
>> >
>> >
>> > Test cases I have looked at in depth are included. ;] Mostly, I
>> carefully
>> > checked the output of the existing bitfield tests. There are actually
>> quite
>> > a few. I added a test case to cover the correctness issue.
>> >
>> > I'll work on a few stress test cases based on the examples you and eli
>> gave,
>> > as well as some that I've come up with.
>> >
>> >
>> >>
>> >> I don't understand why this is true (the byte-by-byte part). If your
>> >> rewrite is generating a wider load, then there is nothing in the
>> >> structure of the old code that would prevent generating larger
>> >> accesses as well. This seems independent of the structural change. It
>> >> seems like implementing this in the current framework is a small
>> >> amount of code.
>> >
>> >
>> > Well, both Richard and I looked at it, and after I had tried a few
>> times,
>> > neither of us had really good ideas about how to make it a small amount
>> of
>> > code. To be fair, I didn't really *work* to find an elegant way to
>> implement
>> > it because after a few initial stabs and readings, I came to this
>> > conclusion:
>> >
>> >>
>> >> I agree that after we did that it might be very questionable why we
>> >> were doing it in the frontend.
>> >
>> >
>> > And once I was there, I wanted to do the right thing and solve this
>> using
>> > the facilities LLVM provides.
>> >
>> >>
>> >> >> (1) How well-specified and -tested are the paths for non-power-of-2
>> >> >> integer types in LLVM?  Do we actually promise that, say, storing
>> to an
>> >> >> i17
>> >> >> will not touch the fourth byte?  Do we provide reasonably portable
>> >> >> behavior
>> >> >> across targets, or is this an instance of the x86 backend having
>> >> >> received a
>> >> >> lot of attention and everything else being a festering morass of
>> latent
>> >> >> issues?  I know that SROA's behavior has exposed bugs before (IIRC
>> with
>> >> >> the
>> >> >> occasional response of making SROA less aggressive), not least of
>> which
>> >> >> because SROA's behavior doesn't really kick in a lot.
>> >> >
>> >> >
>> >> > Sure, this is a great point. I'm happy to do any testing I can here.
>> One
>> >> > thing that I think I've already done is ensure that we only produce
>> >> > i8-multiples, not i17s and such. I'll double check, but I would not
>> >> > expect
>> >> > to see any of those. My hope was that this would in the worst cased
>> be
>> >> > trivially legalized to a sequence of i8 operations, and in the best
>> >> > case,
>> >> > the optimal pattern.
>> >>
>> >> This is a concern of mine too. Of course, its also something we should
>> >> just probably make sure is nicely defined in the backend given that
>> >> SROA already uses it.
>> >
>> >
>> > So I talked to Duncan specifically about this and this is actually the
>> use
>> > case he feels was intended originally and should be best supported. I've
>> > done some LLVM-side testing and it seems to bear that out. Very
>> reasonable
>> > code was generated for integers in my testing.
>>
>> Ok, I have more or less been convinced this is a better approach.
>>
>> I think you should get explicit sign off from Chris though, because
>> when I discussed it in person with him he had a very strong feeling
>> that we did not want the FE to start generating off-size integer
>> memory accesses (as opposed to off-sized integer values). I don't
>> think I can clearly convey his argument, though, so I'd like to see
>> him chime in.
>>
>>  - Daniel
>>
>> >
>> >>
>> >> I agree with the approach. I'm worried there might be cases where the
>> >> backend does a worse job in codegen *today* on the off size integer
>> >> types, but I could be wrong. And of course the right thing to do here
>> >> is fix up the backend.
>> >>
>> >> Regardless, thanks for working in this area!
>> >
>> >
>> > Well, I've fixed a tiny bug, and done significantly more correctness
>> > testing. The patch attached passes self-host and the full testsuite. The
>> > test suite shows no significant performance regressions.
>> >
>> > I've also added something to try to mitigate the performance concerns.
>> I've
>> > added logic to detect a case when there is guranteed to be padding bytes
>> > after the bitfield and to widen the bitfield into the padding when
>> doing so
>> > produces more natural integer sizes.
>>
>
>

_______________________________________________
cfe-commits mailing list
[email protected]
http://lists.cs.uiuc.edu/mailman/listinfo/cfe-commits

Re: [cfe-commits] PATCH: Large re-working of bitfield IR-gen, and a fix for PR13691

Reply via email to