Chris told me to ping him on IRC, which indicated he did want a chance to review this before commit, so I'm continuing to wait...
And to ping.... On Wed, Oct 3, 2012 at 9:10 PM, Daniel Dunbar <[email protected]>wrote: > Just go ahead and put it in, that should be enough to get a response if > one is forthcoming. :) > > - Daniel > > > On Oct 3, 2012, at 19:37, Chandler Carruth <[email protected]> wrote: > > Chris, pinging again. It's been 2 weeks since my last ping, and there > since Daniel Dunbar handed it off to you. ;] > > > On Thu, Sep 20, 2012 at 6:35 PM, Chandler Carruth <[email protected]>wrote: > >> Ping. >> >> I really want to move forward with this. We are observing real world >> bugs, races, crashes, and other badness surrounding bitfields used across >> threads. We cannot even distinguish between user error and compiler error >> because the compiler is known to generate the wrong thing. >> >> I'd love to hear any feedback or suggestions for further work here. >> >> >> On Fri, Sep 14, 2012 at 2:42 PM, Daniel Dunbar <[email protected]> wrote: >> >>> On Tue, Sep 11, 2012 at 11:02 AM, Chandler Carruth <[email protected]> >>> wrote: >>> > On Fri, Aug 31, 2012 at 11:46 AM, Daniel Dunbar <[email protected]> >>> wrote: >>> >> >>> >> Hi Chandler, >>> >> >>> >> I haven't done an extensive review of the patch yet (by which I would >>> >> mostly mean looking at the codegen, I trust you to verify the >>> >> correctness), so treat this as initial comments. >>> > >>> > >>> > Thanks for these. I'm attaching an updated patch with some fixes and >>> > improvements, see below. >>> > >>> >> >>> >> My one main initial comment is that I would like to treat the PR and >>> >> the rewrite as distinct issues. Obviously fixing the PR is a >>> >> no-brainer, but it would be nice to put in the minimal patch for that >>> >> in the current code. I wouldn't expect this to be hard or to result in >>> >> byte-by-byte accesses, but I haven't looked closely at what the change >>> >> should be. It sounds however like you attempted this and decided it >>> >> was hard enough to prefer a rewrite... >>> > >>> > >>> > Yes. Both Richard Smith and I looked at it, and neither of us had any >>> clear >>> > ideas about how to adapt the existing code to avoid the problem without >>> > creating worse stores. As I explained in my previous email, the >>> existing >>> > strategy has two properties that make this hard: 1) it assumes it can >>> read >>> > before the bitfield in order to start off with an aligned load, and 2) >>> it >>> > wants to pre-split into strides which requires the frontend to >>> implement >>> > intelligent striding logic. >>> > >>> >> >>> >> The one thing I am worried about is that the simplicity of your >>> >> approach is contingent on us using irregular integer sized types. If >>> >> we don't do that, I feel like the current version of the code is >>> >> pretty good (feel free to disagree though), so really the question >>> >> comes down to do we want to move to irregular sized integer types. At >>> >> the time the code was written, it was a conscious decision not to do >>> >> that. I agree with the general sentiment that SROA is already doing >>> >> this, I am just worried that if we ever needed to break that we would >>> >> be back to doing something like the current code is doing. >>> > >>> > >>> > Yes, we may, some day need to go back and change things. But we'll >>> have the >>> > old code kicking around in history. This at least gets us to correct >>> > behavior today and simplifies the code today. >>> > >>> >> >>> >> I did a reasonable amount of looking at the codegen on bitfield test >>> >> cases when I wrote the old code, but I probably did a poor job of >>> >> turning these into test cases. I'm assuming you probably already did >>> >> the same kind of testing on the new code? Do you have any test cases >>> >> that demonstrate better or worse codegen from the new model? >>> > >>> > >>> > Test cases I have looked at in depth are included. ;] Mostly, I >>> carefully >>> > checked the output of the existing bitfield tests. There are actually >>> quite >>> > a few. I added a test case to cover the correctness issue. >>> > >>> > I'll work on a few stress test cases based on the examples you and eli >>> gave, >>> > as well as some that I've come up with. >>> > >>> > >>> >> >>> >> I don't understand why this is true (the byte-by-byte part). If your >>> >> rewrite is generating a wider load, then there is nothing in the >>> >> structure of the old code that would prevent generating larger >>> >> accesses as well. This seems independent of the structural change. It >>> >> seems like implementing this in the current framework is a small >>> >> amount of code. >>> > >>> > >>> > Well, both Richard and I looked at it, and after I had tried a few >>> times, >>> > neither of us had really good ideas about how to make it a small >>> amount of >>> > code. To be fair, I didn't really *work* to find an elegant way to >>> implement >>> > it because after a few initial stabs and readings, I came to this >>> > conclusion: >>> > >>> >> >>> >> I agree that after we did that it might be very questionable why we >>> >> were doing it in the frontend. >>> > >>> > >>> > And once I was there, I wanted to do the right thing and solve this >>> using >>> > the facilities LLVM provides. >>> > >>> >> >>> >> >> (1) How well-specified and -tested are the paths for non-power-of-2 >>> >> >> integer types in LLVM? Do we actually promise that, say, storing >>> to an >>> >> >> i17 >>> >> >> will not touch the fourth byte? Do we provide reasonably portable >>> >> >> behavior >>> >> >> across targets, or is this an instance of the x86 backend having >>> >> >> received a >>> >> >> lot of attention and everything else being a festering morass of >>> latent >>> >> >> issues? I know that SROA's behavior has exposed bugs before (IIRC >>> with >>> >> >> the >>> >> >> occasional response of making SROA less aggressive), not least of >>> which >>> >> >> because SROA's behavior doesn't really kick in a lot. >>> >> > >>> >> > >>> >> > Sure, this is a great point. I'm happy to do any testing I can >>> here. One >>> >> > thing that I think I've already done is ensure that we only produce >>> >> > i8-multiples, not i17s and such. I'll double check, but I would not >>> >> > expect >>> >> > to see any of those. My hope was that this would in the worst cased >>> be >>> >> > trivially legalized to a sequence of i8 operations, and in the best >>> >> > case, >>> >> > the optimal pattern. >>> >> >>> >> This is a concern of mine too. Of course, its also something we should >>> >> just probably make sure is nicely defined in the backend given that >>> >> SROA already uses it. >>> > >>> > >>> > So I talked to Duncan specifically about this and this is actually the >>> use >>> > case he feels was intended originally and should be best supported. >>> I've >>> > done some LLVM-side testing and it seems to bear that out. Very >>> reasonable >>> > code was generated for integers in my testing. >>> >>> Ok, I have more or less been convinced this is a better approach. >>> >>> I think you should get explicit sign off from Chris though, because >>> when I discussed it in person with him he had a very strong feeling >>> that we did not want the FE to start generating off-size integer >>> memory accesses (as opposed to off-sized integer values). I don't >>> think I can clearly convey his argument, though, so I'd like to see >>> him chime in. >>> >>> - Daniel >>> >>> > >>> >> >>> >> I agree with the approach. I'm worried there might be cases where the >>> >> backend does a worse job in codegen *today* on the off size integer >>> >> types, but I could be wrong. And of course the right thing to do here >>> >> is fix up the backend. >>> >> >>> >> Regardless, thanks for working in this area! >>> > >>> > >>> > Well, I've fixed a tiny bug, and done significantly more correctness >>> > testing. The patch attached passes self-host and the full testsuite. >>> The >>> > test suite shows no significant performance regressions. >>> > >>> > I've also added something to try to mitigate the performance concerns. >>> I've >>> > added logic to detect a case when there is guranteed to be padding >>> bytes >>> > after the bitfield and to widen the bitfield into the padding when >>> doing so >>> > produces more natural integer sizes. >>> >> >> >
_______________________________________________ cfe-commits mailing list [email protected] http://lists.cs.uiuc.edu/mailman/listinfo/cfe-commits
