Re: [PATCH] add self-tuning to x86 hardware fast path in libitm

2015-11-02 Thread Andi Kleen
Nuno Diegues writes:
> Hello everyone,
>
> gently pinging to bring this back to life given the last patch I emailed.

The patch is fine for me, but I cannot approve it.

-Andi

Re: [PATCH] add self-tuning to x86 hardware fast path in libitm

2015-11-02 Thread Nuno Diegues
Hello everyone,

gently pinging to bring this back to life given the last patch I emailed.

Best regards,
-- Nuno Diegues

On Mon, Aug 24, 2015 at 12:51 PM, Nuno Diegues wrote:
> Hello everyone,
>
> after a summer internship and some backlog catching up in the past
> weeks, I

Re: [PATCH] add self-tuning to x86 hardware fast path in libitm

2015-08-24 Thread Nuno Diegues
Hello everyone,

after a summer internship and some backlog catching up in the past weeks, I have finally got around to revising the patch according to the latest discussion. The changes have not caused any regressions, and the latest speedup results are consistent with what we had before. In the

Re: [PATCH] add self-tuning to x86 hardware fast path in libitm

2015-06-14 Thread Nuno Diegues
Hello everyone,

just wanted to ping back to say that I have been overwhelmed with work and will be back on this as soon as possible, most likely during July.

Best regards,
-- Nuno Diegues

On Tue, May 19, 2015 at 3:17 PM, Torvald Riegel trie...@redhat.com wrote:
> On Mon, 2015-05-18 at 23:27

Re: [PATCH] add self-tuning to x86 hardware fast path in libitm

2015-05-19 Thread Torvald Riegel
On Mon, 2015-05-18 at 23:27 -0400, Nuno Diegues wrote:
> On Mon, May 18, 2015 at 5:29 PM, Torvald Riegel trie...@redhat.com wrote:
> > Are there better options for the utility function, or can we tune it to be less affected by varying txn length and likelihood of txnal vs. nontxnal code? What

Re: [PATCH] add self-tuning to x86 hardware fast path in libitm

2015-05-18 Thread Torvald Riegel
On Wed, 2015-04-29 at 23:23 -0400, Nuno Diegues wrote:
> Hello,
>
> I have taken the chance to improve the patch by addressing the comments above in this thread. Namely:
> - to use a simple random generator managed inside the library only
> - removed floating point usage and replaced by fixed

Re: [PATCH] add self-tuning to x86 hardware fast path in libitm

2015-05-18 Thread Andi Kleen
> Are there better options for the utility function, or can we tune it to

There is nothing better that isn't a lot slower.

-Andi

Re: [PATCH] add self-tuning to x86 hardware fast path in libitm

2015-05-18 Thread Nuno Diegues
On Mon, May 18, 2015 at 5:29 PM, Torvald Riegel trie...@redhat.com wrote:
> First of all, sorry for taking so long to review this. Thank you for the contribution.

Hello Torvald, thanks for taking the time to look into this!

> My major concern is about rdtsc being used. The relation to

Re: [PATCH] add self-tuning to x86 hardware fast path in libitm

2015-05-18 Thread Torvald Riegel
On Mon, 2015-05-18 at 23:39 +0200, Andi Kleen wrote:
> > Are there better options for the utility function, or can we tune it to
>
> There is nothing better that isn't a lot slower.

Do you care to elaborate why? As-is, I do not find this statement convincing; at the very least we need to

[PING] Re: [PATCH] add self-tuning to x86 hardware fast path in libitm

2015-05-14 Thread Andi Kleen
Ping! Could someone who can approve please review the patch?

Thanks,
-Andi

Nuno Diegues n...@ist.utl.pt writes:
> Hello,
>
> I have taken the chance to improve the patch by addressing the comments above in this thread. Namely:
> - to use a simple random generator managed inside the library

Re: [PATCH] add self-tuning to x86 hardware fast path in libitm

2015-05-05 Thread Nuno Diegues
Today I received the news that the copyright assignment with the FSF has been completed.

On Thu, Apr 30, 2015 at 8:10 AM, Nuno Diegues n...@ist.utl.pt wrote:
> > Patch looks good to me now. It would be perhaps nice to have an environment variable to turn the adaptive algorithm off for tests,

Re: [PATCH] add self-tuning to x86 hardware fast path in libitm

2015-04-30 Thread Nuno Diegues
> Patch looks good to me now. It would be perhaps nice to have an environment variable to turn the adaptive algorithm off for tests, but that's not critical.

Yes, that makes perfect sense.

> It would be also nice to test it on something else, but I understand it's difficult to find other

Re: [PATCH] add self-tuning to x86 hardware fast path in libitm

2015-04-29 Thread Andi Kleen
Nuno Diegues n...@ist.utl.pt writes:
> Hello,
>
> I have taken the chance to improve the patch by addressing the comments above in this thread. Namely:
> - to use a simple random generator managed inside the library only
> - removed floating point usage and replaced by fixed arithmetic
> - added

Re: [PATCH] add self-tuning to x86 hardware fast path in libitm

2015-04-29 Thread Nuno Diegues
Hello,

I have taken the chance to improve the patch by addressing the comments above in this thread. Namely:
- to use a simple random generator managed inside the library only
- removed floating point usage and replaced by fixed arithmetic
- added some comments where relevant

Re-running the
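To make the first two items in that list concrete, here is a minimal sketch of what a library-private random generator and float-free arithmetic can look like. It is not the code from the patch: the xorshift32 generator, the Q16.16 format, and every name below (fx16, fx_mul, fx_div, xorshift32) are assumptions chosen for illustration, and "fixed arithmetic" is read here as fixed-point arithmetic.

```c
#include <stdint.h>

/* Hypothetical Q16.16 fixed-point type: the upper 16 bits hold the integer
   part, the lower 16 bits hold the fraction.  Not taken from the patch. */
typedef uint32_t fx16;
#define FX_ONE ((fx16)1 << 16)

/* Convert a small non-negative integer to Q16.16. */
static inline fx16 fx_from_int(uint32_t n)
{
  return n << 16;
}

/* Multiply two Q16.16 values; the 64-bit intermediate keeps the full
   product before the fractional bits are shifted back out. */
static inline fx16 fx_mul(fx16 a, fx16 b)
{
  return (fx16)(((uint64_t)a * b) >> 16);
}

/* Divide two Q16.16 values; b must be non-zero. */
static inline fx16 fx_div(fx16 a, fx16 b)
{
  return (fx16)(((uint64_t)a << 16) / b);
}

/* xorshift32 (Marsaglia): a small generator whose whole state is one
   32-bit word owned by the caller.  The state must never be zero. */
static inline uint32_t xorshift32(uint32_t *state)
{
  uint32_t x = *state;
  x ^= x << 13;
  x ^= x >> 17;
  x ^= x << 5;
  *state = x;
  return x;
}
```

Keeping the generator state in a caller-owned word means it can live in per-thread library data and never touches the application's rand() state, and the integer-only arithmetic avoids any dependence on the floating-point unit inside the runtime.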

Re: [PATCH] add self-tuning to x86 hardware fast path in libitm

2015-04-09 Thread Nuno Diegues
On Wed, Apr 8, 2015 at 6:54 PM, Andi Kleen a...@firstfloor.org wrote:
> > On the STAMP suite of benchmarks for transactional memory (described here [1]). I have run an unmodified GCC 5.0.0 against the patched GCC with these modifications and obtained the following speedups in STAMP with 4 threads

Re: [PATCH] add self-tuning to x86 hardware fast path in libitm

2015-04-08 Thread Andi Kleen
Nuno Diegues n...@ist.utl.pt writes:

What workloads did you test this on?

> +static inline float fastLog(float x)
> +{
> +  union { float f; uint32_t i; } vx = { x };
> +  float y = vx.i;
> +  y *= 8.2629582881927490e-8f;
> +  return y - 87.989971088f;
> +}
> +
> +static inline float fastSqrt(float x)
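For readers unfamiliar with the trick in the quoted fastLog helper, the following annotated copy may help. The function body is the one quoted above; the comments are my own reading of the constants (the multiplier is close to ln(2) / 2^23 and the offset is slightly below 127 * ln(2)), not an explanation given in the thread.

```c
#include <stdint.h>

/* Approximate natural logarithm for positive, normal floats.
   The raw IEEE-754 bits of x, read as an integer, equal
   (biased_exponent + mantissa_fraction) * 2^23, which is a piecewise-linear
   approximation of (log2(x) + 127) * 2^23. */
static inline float fastLog(float x)
{
  union { float f; uint32_t i; } vx = { x };  /* reinterpret the bits of x */
  float y = vx.i;                             /* bits as a number, ~ (log2(x) + 127) * 2^23 */
  y *= 8.2629582881927490e-8f;                /* ~ ln(2) / 2^23: rescale to natural log */
  return y - 87.989971088f;                   /* ~ 127 * ln(2), nudged to centre the error */
}
```

Under that reading the result stays within a few hundredths of the true logarithm: fastLog(1.0f) comes out near 0.04 instead of 0, which is acceptable for a heuristic but not for exact arithmetic.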

Re: [PATCH] add self-tuning to x86 hardware fast path in libitm

2015-04-08 Thread Nuno Diegues
Thank you for the feedback. Comments inline.

On Wed, Apr 8, 2015 at 3:05 PM, Andi Kleen a...@firstfloor.org wrote:
> Nuno Diegues n...@ist.utl.pt writes:
>
> What workloads did you test this on?

On the STAMP suite of benchmarks for transactional memory (described here [1]). I have run an

Re: [PATCH] add self-tuning to x86 hardware fast path in libitm

2015-04-08 Thread Andi Kleen
> On the STAMP suite of benchmarks for transactional memory (described here [1]). I have run an unmodified GCC 5.0.0 against the patched GCC with these modifications and obtained the following speedups in STAMP with 4 threads (on a Haswell with 4 cores, average of 10 runs):

I expect you'll need

[PATCH] add self-tuning to x86 hardware fast path in libitm

2015-04-07 Thread Nuno Diegues
Hi, the libitm package contains a fast path for x86 to use Intel Restricted Transactional Memory (RTM) when available. This Hardware Transactional Memory (HTM) requires a software-based fallback to execute the atomic blocks when the hardware fails. This may happen because the transaction is too
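The hardware-first, software-fallback structure described above can be sketched roughly as follows. This is not libitm code: the callback types, run_atomic_block, and the demo stubs are hypothetical names, and the hardware attempt is abstracted behind a function pointer so the sketch runs on machines without RTM. The attempt budget is a fixed parameter here; what exactly the patch tunes is not shown in this excerpt.

```c
#include <stdbool.h>

/* Hypothetical stand-in for one hardware attempt (for example an RTM
   begin/commit pair); returns true if the block committed in hardware. */
typedef bool (*htm_attempt_fn)(void *ctx);

/* Hypothetical software fallback; assumed to always complete. */
typedef void (*sw_fallback_fn)(void *ctx);

/* Try the hardware path up to max_attempts times, then run the software
   fallback.  Returns true if the block committed in hardware. */
static bool run_atomic_block(htm_attempt_fn try_htm, sw_fallback_fn fallback,
                             void *ctx, int max_attempts)
{
  for (int i = 0; i < max_attempts; i++)
    if (try_htm(ctx))
      return true;
  fallback(ctx);
  return false;
}

/* Demo stubs: the simulated hardware aborts aborts_left times, then commits. */
struct demo_ctx { int aborts_left; int hw_tries; int sw_runs; };

static bool demo_try_htm(void *p)
{
  struct demo_ctx *c = p;
  c->hw_tries++;
  if (c->aborts_left > 0) {
    c->aborts_left--;
    return false;
  }
  return true;
}

static void demo_fallback(void *p)
{
  struct demo_ctx *c = p;
  c->sw_runs++;
}
```

With two simulated aborts and a budget of five attempts the block commits in hardware on the third try; with ten aborts and a budget of three it ends up on the software path, which is the situation a retry policy has to trade off.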