On 06/20/2017 06:17 AM, Uros Bizjak wrote:
> On Tue, Jun 20, 2017 at 2:13 PM, Florian Weimer <fwei...@redhat.com> wrote:
>> On 06/20/2017 01:10 PM, Uros Bizjak wrote:
>>
>>>   74,99%  a.out  a.out  [.] test_or
>>>   12,50%  a.out  a.out  [.] test_movb
>>>   12,50%  a.out  a.out  [.] test_movl
>>
>> Could you try notl/notb/negl/negb as well, please?
>
> These all have the same (long) runtime as test_or.
That would be my expectation -- they (not/neg) are going to be RMW.

So can we agree that moving away from RMW to a simple W-style instruction for the probe is where we want to go?  Then we can kick around the exact form of that store.

FWIW, we don't have to store zero -- ultimately we care about the side effect of triggering the page fault, not the value written.  So we could just as easily store a register into the probed address to avoid the code-size cost of encoding an immediate.

I did that in my local s390 patches.  It may not be necessary there, but it allowed me to avoid thinking too hard about the ISA and get s390 proof-of-concept code running :-)

Jeff
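
For illustration only, here is a minimal sketch of the RMW-versus-W distinction in portable C, not the assembly a compiler would emit for a stack probe. The names probe_rmw and probe_store, the page_size parameter, and the use of volatile accesses to stand in for the probe instructions are all assumptions made for this example. It is not the probe sequence from any patch in this thread.

```c
#include <stddef.h>

/* Hypothetical RMW-style probe: each touched byte is loaded, OR-ed with
 * zero, and stored back, so the memory contents are left unchanged.
 * Returns the number of probes issued. */
static size_t probe_rmw(volatile unsigned char *base, size_t len,
                        size_t page_size)
{
    size_t probes = 0;

    if (page_size == 0)
        return 0;                 /* avoid an endless loop on bad input */

    for (size_t off = 0; off < len; off += page_size) {
        base[off] |= 0;           /* volatile load followed by volatile store */
        probes++;
    }
    return probes;
}

/* Hypothetical W-style probe: each touched byte is simply overwritten.
 * The value is irrelevant to the probe itself; only the write access
 * (and the fault it may trigger) matters, so the caller can pass
 * whatever happens to be in a register. */
static size_t probe_store(volatile unsigned char *base, size_t len,
                          size_t page_size, unsigned char value)
{
    size_t probes = 0;

    if (page_size == 0)
        return 0;

    for (size_t off = 0; off < len; off += page_size) {
        base[off] = value;        /* store only, no preceding load */
        probes++;
    }
    return probes;
}
```

The W-style variant clobbers the probed byte, which is acceptable only because a real probe targets stack space that is not yet in use. How a compiler lowers either loop is target-dependent, so this shows the access pattern rather than any timing behaviour.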