I'm quitting for the day, but happened to notice that all our v1.5 MTT
runs are failing with r26133, though tests ran fine as of r26129.
Things run fine on-node, but if you run even just "hostname" on a remote
node, the job fails with
orted: Command not found
I get this problem whether I inc
We made a big step forward today!
The kernel in use has a bug related to the shared L1 instruction cache in AMD
Bulldozer processors:
See
http://git.kernel.org/?p=linux/kernel/git/torvalds/linux-2.6.git;a=commitdiff;h=dfb09f9b7ab03fd367740e541a5caf830ed56726
and
http://developer.amd.com/Assets
On Mar 15, 2012, at 8:06 AM, Matthias Jurenz wrote:
> We made a big step forward today!
>
> The kernel in use has a bug related to the shared L1 instruction cache in AMD
> Bulldozer processors:
> See
> http://git.kernel.org/?p=linux/kernel/git/torvalds/linux-2.6.git;a=commitdiff;h=dfb09f9b7ab03
What: Update ob1 to do the following:
- fall back on send after rdma_put_retries_limit failures of prepare_dst
- fall back on put (single non-pipelined) if the btl returns
OMPI_ERR_NOT_AVAILABLE on a get transaction.
When: Timeout in about one week (Mar 22)
Why: Two reasons:
Nathan,
I did not get any patch.
Regards,
Pavel (Pasha) Shamis
---
Application Performance Tools Group
Computer Science and Math Division
Oak Ridge National Laboratory
On Mar 15, 2012, at 5:07 PM, Nathan Hjelm wrote:
>
>
> What: Update ob1 to do the following:
>- fallback on sen
Let me know what you find - I took a look at the code and it looks correct. All
required changes were included in the patch that was applied to the branch.
On Mar 14, 2012, at 11:27 PM, Eugene Loh wrote:
> I'm quitting for the day, but happened to notice that all our v1.5 MTT runs
> are failin