> > CONCLUSION:
> >
> > - the only correct workaround for TLBError
> >   is the one I suggested earlier: TLBError
> >   handler has to inspect the faulting opcode
> >   and fixup DAR and MD_EPN based on the GPR
> >   values if the faulting instruction is any
> >   of dcbf, dcbi, dcbst or dcbz.
> > Performance of this solution could be
> > improved (eliminate opcode-check in the
> > vast majority of the cases) by storing
> > a 'tag' value in DAR.
>
> Hi again
>
> I have been hacking on dcxx address decoder. Since assembler isn't my cup of
> tea, I used C mixed with asm statements. The resulting assembler isn't too bad
> either IMHO.
>
> To load the instruction into r21 I used:
>     mfspr r20,SRR0
>     andis. r21, r20, 0x8000
>     beq 56f
>
>     tophys(r21, r20)
>     lwz r21,0(r21)
> 56:
> This only works for kernel space addresses. I can't figure out how to get to
> user space as well.
> I can live without user space anyway.
>
> I am still thinking about the 'tag'. Since MD_EPN isn't set as well as DAR I
> thinking about storing a tag in MD_EPN instead. It's less intrusive. Maybe it
> is enough to look at the valid bit in MD_EPN?
>
> What do you think so far?
> Oh, this should go into the DTLB Error handler.
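
For readers following along, the opcode check and address fixup described in the quoted conclusion can be sketched in C roughly as below. This is only an illustration under stated assumptions, not the actual patch: the function names (is_dcbx, dcbx_effective_address, fixup_dar) and the gpr[] array are hypothetical, a real handler would be written in assembler and read the saved registers from the exception context, and the MD_EPN fixup is left out. The tag value 0xdead0000 is the one mentioned in this message. The instruction fields follow the standard PowerPC X-form layout (primary opcode 31, EA = (rA|0) + rB).

```c
#include <stdint.h>
#include <stdbool.h>

/* X-form field extraction; PowerPC numbers bits from the MSB. */
#define PPC_PRIMARY(insn) ((uint32_t)(insn) >> 26)
#define PPC_XO(insn)      (((uint32_t)(insn) >> 1) & 0x3ffu)
#define PPC_RA(insn)      (((uint32_t)(insn) >> 16) & 0x1fu)
#define PPC_RB(insn)      (((uint32_t)(insn) >> 11) & 0x1fu)

/* "Bad address" tag written to DAR; value taken from the mail, not final. */
#define DAR_TAG 0xdead0000u

/* Extended opcodes of the cache instructions named in the conclusion. */
enum { XO_DCBST = 54, XO_DCBF = 86, XO_DCBI = 470, XO_DCBZ = 1014 };

/* True if insn is one of dcbst, dcbf, dcbi or dcbz. */
bool is_dcbx(uint32_t insn)
{
    if (PPC_PRIMARY(insn) != 31)
        return false;
    switch (PPC_XO(insn)) {
    case XO_DCBST: case XO_DCBF: case XO_DCBI: case XO_DCBZ:
        return true;
    default:
        return false;
    }
}

/* EA = (rA|0) + rB: rA == 0 means the literal value 0, not GPR0. */
uint32_t dcbx_effective_address(uint32_t insn, const uint32_t gpr[32])
{
    uint32_t ra = PPC_RA(insn);
    uint32_t base = (ra == 0) ? 0 : gpr[ra];
    return base + gpr[PPC_RB(insn)];
}

/*
 * Hypothetical fixup: only when DAR still holds the tag (i.e. the CPU
 * did not update it) is the opcode inspected and the address rebuilt.
 */
uint32_t fixup_dar(uint32_t dar, uint32_t insn, const uint32_t gpr[32])
{
    if (dar != DAR_TAG || !is_dcbx(insn))
        return dar;
    return dcbx_effective_address(insn, gpr);
}
```

The tag comparison comes first so that the common case, a fault where DAR was set correctly, never pays for the opcode decode.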

Me again :-)

I have completed and tested my workaround for the dcbx instructions.
The workaround handles ALL dcbx instructions and ANY register combination,
and it works on both kernel space and user space addresses.

During my testing I noticed that memory allocated with consistent_alloc() or
kmalloc() causes TLB Errors while memory from vmalloc does not. If this is
true (confirmation wanted), it means that the current implementation is
fragile. Neither dcbi nor dcbz updates DAR when a TLB error happens; instead
the previous setting of DAR is used.

I also did some benchmarking using copy_page (dcbz enabled) and memcpy to
memory allocated with kmalloc and/or vmalloc. copy_page is about 30% faster
than memcpy even with the workaround applied.

There is one concern left. I tag DAR with a "bad address" just before an
exception is finished. The TLB Error handler checks whether DAR contains the
"bad address" and, if it does, executes the workaround. I need to find all
exceptions where DAR is modified. Currently I tag DAR in STD_EXCEPTION(),
DataAccess, Alignment, DataStore and DataTLBError. Have I missed any
exception?

I also need to find a good value for the "bad address". Currently I use
0xdead0000 and that's probably not the best value. Tagging with this value is
a two-instruction operation:

    lis   r20, 0xdead
    mtspr DAR, r20

The test in the TLB Error handler looks like this:

    mfspr r21, DAR
    lis   r20, 0xdead
    cmpw  cr0, r20, r21
    beq-  <workaround address>

I cannot see any reason NOT to add this to the BK tree (after the minor
modifications mentioned above and a little cleanup). It fixes a real problem
with dcbi, and as a bonus you can use dcbz as well, since it has the same
problem that dcbi has and the fix is generic for all dcbx instructions.

Dan's argument,

    "It's interesting to watch these hacks, but I can't justify
    complicating a general purpose function with more bus cycles
    by emulating a functional problem. By not using these
    instructions we have a working system that costs just a few
    more cycles during the memory copy/zero operations. If we had
    _working_ dcbz instructions, it would be a gain to use them,
    but from a system perspective it is going to cost more to
    "fix up" these than the code that already exists",

is not valid. This is not only about optimization, but also about the
correctness of the existing use of dcbi.

A patch against 2.4.20 devel is available on request until I have cleaned it
up a little.

    Jocke

** Sent via the linuxppc-embedded mail list. See http://lists.linuxppc.org/