Chris, pinging again. It's been 2 weeks since my last ping, and there since Daniel Dunbar handed it off to you. ;]
On Thu, Sep 20, 2012 at 6:35 PM, Chandler Carruth <[email protected]>wrote: > Ping. > > I really want to move forward with this. We are observing real world bugs, > races, crashes, and other badness surrounding bitfields used across > threads. We cannot even distinguish between user error and compiler error > because the compiler is known to generate the wrong thing. > > I'd love to hear any feedback or suggestions for further work here. > > > On Fri, Sep 14, 2012 at 2:42 PM, Daniel Dunbar <[email protected]> wrote: > >> On Tue, Sep 11, 2012 at 11:02 AM, Chandler Carruth <[email protected]> >> wrote: >> > On Fri, Aug 31, 2012 at 11:46 AM, Daniel Dunbar <[email protected]> >> wrote: >> >> >> >> Hi Chandler, >> >> >> >> I haven't done an extensive review of the patch yet (by which I would >> >> mostly mean looking at the codegen, I trust you to verify the >> >> correctness), so treat this as initial comments. >> > >> > >> > Thanks for these. I'm attaching an updated patch with some fixes and >> > improvements, see below. >> > >> >> >> >> My one main initial comment is that I would like to treat the PR and >> >> the rewrite as distinct issues. Obviously fixing the PR is a >> >> no-brainer, but it would be nice to put in the minimal patch for that >> >> in the current code. I wouldn't expect this to be hard or to result in >> >> byte-by-byte accesses, but I haven't looked closely at what the change >> >> should be. It sounds however like you attempted this and decided it >> >> was hard enough to prefer a rewrite... >> > >> > >> > Yes. Both Richard Smith and I looked at it, and neither of us had any >> clear >> > ideas about how to adapt the existing code to avoid the problem without >> > creating worse stores. As I explained in my previous email, the >> existing >> > strategy has two properties that make this hard: 1) it assumes it can >> read >> > before the bitfield in order to start off with an aligned load, and 2) >> it >> > wants to pre-split into strides which requires the frontend to implement >> > intelligent striding logic. >> > >> >> >> >> The one thing I am worried about is that the simplicity of your >> >> approach is contingent on us using irregular integer sized types. If >> >> we don't do that, I feel like the current version of the code is >> >> pretty good (feel free to disagree though), so really the question >> >> comes down to do we want to move to irregular sized integer types. At >> >> the time the code was written, it was a conscious decision not to do >> >> that. I agree with the general sentiment that SROA is already doing >> >> this, I am just worried that if we ever needed to break that we would >> >> be back to doing something like the current code is doing. >> > >> > >> > Yes, we may, some day need to go back and change things. But we'll have >> the >> > old code kicking around in history. This at least gets us to correct >> > behavior today and simplifies the code today. >> > >> >> >> >> I did a reasonable amount of looking at the codegen on bitfield test >> >> cases when I wrote the old code, but I probably did a poor job of >> >> turning these into test cases. I'm assuming you probably already did >> >> the same kind of testing on the new code? Do you have any test cases >> >> that demonstrate better or worse codegen from the new model? >> > >> > >> > Test cases I have looked at in depth are included. ;] Mostly, I >> carefully >> > checked the output of the existing bitfield tests. There are actually >> quite >> > a few. I added a test case to cover the correctness issue. >> > >> > I'll work on a few stress test cases based on the examples you and eli >> gave, >> > as well as some that I've come up with. >> > >> > >> >> >> >> I don't understand why this is true (the byte-by-byte part). If your >> >> rewrite is generating a wider load, then there is nothing in the >> >> structure of the old code that would prevent generating larger >> >> accesses as well. This seems independent of the structural change. It >> >> seems like implementing this in the current framework is a small >> >> amount of code. >> > >> > >> > Well, both Richard and I looked at it, and after I had tried a few >> times, >> > neither of us had really good ideas about how to make it a small amount >> of >> > code. To be fair, I didn't really *work* to find an elegant way to >> implement >> > it because after a few initial stabs and readings, I came to this >> > conclusion: >> > >> >> >> >> I agree that after we did that it might be very questionable why we >> >> were doing it in the frontend. >> > >> > >> > And once I was there, I wanted to do the right thing and solve this >> using >> > the facilities LLVM provides. >> > >> >> >> >> >> (1) How well-specified and -tested are the paths for non-power-of-2 >> >> >> integer types in LLVM? Do we actually promise that, say, storing >> to an >> >> >> i17 >> >> >> will not touch the fourth byte? Do we provide reasonably portable >> >> >> behavior >> >> >> across targets, or is this an instance of the x86 backend having >> >> >> received a >> >> >> lot of attention and everything else being a festering morass of >> latent >> >> >> issues? I know that SROA's behavior has exposed bugs before (IIRC >> with >> >> >> the >> >> >> occasional response of making SROA less aggressive), not least of >> which >> >> >> because SROA's behavior doesn't really kick in a lot. >> >> > >> >> > >> >> > Sure, this is a great point. I'm happy to do any testing I can here. >> One >> >> > thing that I think I've already done is ensure that we only produce >> >> > i8-multiples, not i17s and such. I'll double check, but I would not >> >> > expect >> >> > to see any of those. My hope was that this would in the worst cased >> be >> >> > trivially legalized to a sequence of i8 operations, and in the best >> >> > case, >> >> > the optimal pattern. >> >> >> >> This is a concern of mine too. Of course, its also something we should >> >> just probably make sure is nicely defined in the backend given that >> >> SROA already uses it. >> > >> > >> > So I talked to Duncan specifically about this and this is actually the >> use >> > case he feels was intended originally and should be best supported. I've >> > done some LLVM-side testing and it seems to bear that out. Very >> reasonable >> > code was generated for integers in my testing. >> >> Ok, I have more or less been convinced this is a better approach. >> >> I think you should get explicit sign off from Chris though, because >> when I discussed it in person with him he had a very strong feeling >> that we did not want the FE to start generating off-size integer >> memory accesses (as opposed to off-sized integer values). I don't >> think I can clearly convey his argument, though, so I'd like to see >> him chime in. >> >> - Daniel >> >> > >> >> >> >> I agree with the approach. I'm worried there might be cases where the >> >> backend does a worse job in codegen *today* on the off size integer >> >> types, but I could be wrong. And of course the right thing to do here >> >> is fix up the backend. >> >> >> >> Regardless, thanks for working in this area! >> > >> > >> > Well, I've fixed a tiny bug, and done significantly more correctness >> > testing. The patch attached passes self-host and the full testsuite. The >> > test suite shows no significant performance regressions. >> > >> > I've also added something to try to mitigate the performance concerns. >> I've >> > added logic to detect a case when there is guranteed to be padding bytes >> > after the bitfield and to widen the bitfield into the padding when >> doing so >> > produces more natural integer sizes. >> > >
_______________________________________________ cfe-commits mailing list [email protected] http://lists.cs.uiuc.edu/mailman/listinfo/cfe-commits
