On Tue, Sep 11, 2012 at 11:02 AM, Chandler Carruth <[email protected]> wrote: > On Fri, Aug 31, 2012 at 11:46 AM, Daniel Dunbar <[email protected]> wrote: >> >> Hi Chandler, >> >> I haven't done an extensive review of the patch yet (by which I would >> mostly mean looking at the codegen, I trust you to verify the >> correctness), so treat this as initial comments. > > > Thanks for these. I'm attaching an updated patch with some fixes and > improvements, see below. > >> >> My one main initial comment is that I would like to treat the PR and >> the rewrite as distinct issues. Obviously fixing the PR is a >> no-brainer, but it would be nice to put in the minimal patch for that >> in the current code. I wouldn't expect this to be hard or to result in >> byte-by-byte accesses, but I haven't looked closely at what the change >> should be. It sounds however like you attempted this and decided it >> was hard enough to prefer a rewrite... > > > Yes. Both Richard Smith and I looked at it, and neither of us had any clear > ideas about how to adapt the existing code to avoid the problem without > creating worse stores. As I explained in my previous email, the existing > strategy has two properties that make this hard: 1) it assumes it can read > before the bitfield in order to start off with an aligned load, and 2) it > wants to pre-split into strides which requires the frontend to implement > intelligent striding logic. > >> >> The one thing I am worried about is that the simplicity of your >> approach is contingent on us using irregular integer sized types. If >> we don't do that, I feel like the current version of the code is >> pretty good (feel free to disagree though), so really the question >> comes down to do we want to move to irregular sized integer types. At >> the time the code was written, it was a conscious decision not to do >> that. I agree with the general sentiment that SROA is already doing >> this, I am just worried that if we ever needed to break that we would >> be back to doing something like the current code is doing. > > > Yes, we may, some day need to go back and change things. But we'll have the > old code kicking around in history. This at least gets us to correct > behavior today and simplifies the code today. > >> >> I did a reasonable amount of looking at the codegen on bitfield test >> cases when I wrote the old code, but I probably did a poor job of >> turning these into test cases. I'm assuming you probably already did >> the same kind of testing on the new code? Do you have any test cases >> that demonstrate better or worse codegen from the new model? > > > Test cases I have looked at in depth are included. ;] Mostly, I carefully > checked the output of the existing bitfield tests. There are actually quite > a few. I added a test case to cover the correctness issue. > > I'll work on a few stress test cases based on the examples you and eli gave, > as well as some that I've come up with. > > >> >> I don't understand why this is true (the byte-by-byte part). If your >> rewrite is generating a wider load, then there is nothing in the >> structure of the old code that would prevent generating larger >> accesses as well. This seems independent of the structural change. It >> seems like implementing this in the current framework is a small >> amount of code. > > > Well, both Richard and I looked at it, and after I had tried a few times, > neither of us had really good ideas about how to make it a small amount of > code. To be fair, I didn't really *work* to find an elegant way to implement > it because after a few initial stabs and readings, I came to this > conclusion: > >> >> I agree that after we did that it might be very questionable why we >> were doing it in the frontend. > > > And once I was there, I wanted to do the right thing and solve this using > the facilities LLVM provides. > >> >> >> (1) How well-specified and -tested are the paths for non-power-of-2 >> >> integer types in LLVM? Do we actually promise that, say, storing to an >> >> i17 >> >> will not touch the fourth byte? Do we provide reasonably portable >> >> behavior >> >> across targets, or is this an instance of the x86 backend having >> >> received a >> >> lot of attention and everything else being a festering morass of latent >> >> issues? I know that SROA's behavior has exposed bugs before (IIRC with >> >> the >> >> occasional response of making SROA less aggressive), not least of which >> >> because SROA's behavior doesn't really kick in a lot. >> > >> > >> > Sure, this is a great point. I'm happy to do any testing I can here. One >> > thing that I think I've already done is ensure that we only produce >> > i8-multiples, not i17s and such. I'll double check, but I would not >> > expect >> > to see any of those. My hope was that this would in the worst cased be >> > trivially legalized to a sequence of i8 operations, and in the best >> > case, >> > the optimal pattern. >> >> This is a concern of mine too. Of course, its also something we should >> just probably make sure is nicely defined in the backend given that >> SROA already uses it. > > > So I talked to Duncan specifically about this and this is actually the use > case he feels was intended originally and should be best supported. I've > done some LLVM-side testing and it seems to bear that out. Very reasonable > code was generated for integers in my testing.
Ok, I have more or less been convinced this is a better approach. I think you should get explicit sign off from Chris though, because when I discussed it in person with him he had a very strong feeling that we did not want the FE to start generating off-size integer memory accesses (as opposed to off-sized integer values). I don't think I can clearly convey his argument, though, so I'd like to see him chime in. - Daniel > >> >> I agree with the approach. I'm worried there might be cases where the >> backend does a worse job in codegen *today* on the off size integer >> types, but I could be wrong. And of course the right thing to do here >> is fix up the backend. >> >> Regardless, thanks for working in this area! > > > Well, I've fixed a tiny bug, and done significantly more correctness > testing. The patch attached passes self-host and the full testsuite. The > test suite shows no significant performance regressions. > > I've also added something to try to mitigate the performance concerns. I've > added logic to detect a case when there is guranteed to be padding bytes > after the bitfield and to widen the bitfield into the padding when doing so > produces more natural integer sizes. _______________________________________________ cfe-commits mailing list [email protected] http://lists.cs.uiuc.edu/mailman/listinfo/cfe-commits
