Yeah - to be clear, I had no problem with anything you did, Gilles. I was only
noting that several of them had positive comments, but they weren’t being
merged. Hate to see the good work lost or forgotten :-)
> On Nov 6, 2014, at 5:29 PM, Jeff Squyres (jsquyres)
> wrote:
Creating nightly hwloc snapshot git tarball was a success.
Snapshot: hwloc dev-266-g88e6e89
Start time: Thu Nov 6 21:01:01 EST 2014
End time: Thu Nov 6 21:02:47 EST 2014
Your friendly daemon,
Cyrador
Actually, I like the PRs; I like the nice github tools for commenting and
discussing.
I'm sorry I haven't followed up on the two you filed for me yet. :-(
On Nov 6, 2014, at 8:23 PM, Gilles Gouaillardet
wrote:
> My bad (mostly)
>
> I made quite a lot of PR
My bad (mostly)
I made quite a lot of PRs to get some review before committing to the master, and
did not follow up in a timely manner.
I closed two obsolete PRs today.
#245 should be ready for prime time.
#227 too unless George has an objection.
I asked Jeff to review #232 and #228 because
On Nov 6, 2014, at 6:21 PM, Ralph Castain wrote:
> I agree - I sent the note because I see people doing things a bit differently
> than expected. I have no issue with PRs for things where people want extra
> eyes on something before committing, or as part of an RFC. Just
I agree - I sent the note because I see people doing things a bit differently
than expected. I have no issue with PRs for things where people want extra eyes
on something before committing, or as part of an RFC. Just want to ensure folks
aren’t letting them languish expecting some kind of
Hi Ralph,
We should discuss this on Tuesday. I thought we'd decided for master to
use a model where developers would directly push to ompi/master.
I'd be willing to pull the requests from Gilles marked as bugs tomorrow.
Howard
2014-11-06 13:16 GMT-07:00 Ralph Castain :
>
Looks like put and get functions should be added if possible. The MTL
layer looks like it is designed for two-sided only with no intention of
supporting one-sided.
-Nathan
On Thu, Nov 06, 2014 at 03:21:32PM -0700, Nathan Hjelm wrote:
>
> Great! We should probably try to figure out how the mtl
Great! We should probably try to figure out how the mtl layer can be
modified to expose those atomics. If possible this should be done before
the 1.9 branch to ensure the feature is available in the next release
series.
-Nathan
On Thu, Nov 06, 2014 at 05:15:30PM -0500, Joshua Ladd wrote:
>
MXM supports atomics.
On Thursday, November 6, 2014, Nathan Hjelm wrote:
> I haven't look at that yet. Would be great to get the new osc component
> working over both btls and mtls. I know portals supports atomics but I
> don't know whether psm does.
>
> -Nathan
>
> On Thu,
> On Nov 6, 2014, at 1:51 PM, Jeff Squyres (jsquyres)
> wrote:
>
> On Nov 6, 2014, at 4:06 PM, Joshua Ladd wrote:
>
>> Once again, many thanks to Alina for discovering and reporting this. Keep up
>> the MTT vigilance!
>
> (this is worthy of its
> On Nov 6, 2014, at 1:39 PM, Nathan Hjelm wrote:
>
> On Thu, Nov 06, 2014 at 04:29:44PM -0500, Joshua Ladd wrote:
>> On Thursday, November 6, 2014, Nathan Hjelm wrote:
>>
>> On Thu, Nov 06, 2014 at 04:06:23PM -0500, Joshua Ladd wrote:
>>> Nathan,
>>>
FWIW: I’m not planning on releasing tomorrow as we aren’t ready. We aren’t
releasing with a bug as bad as threading on by default as we know we can’t
really support that situation.
Nothing sacred about the release date - it’s just a target.
Frankly, I would even listen to the argument of
On Nov 6, 2014, at 4:06 PM, Joshua Ladd wrote:
> Once again, many thanks to Alina for discovering and reporting this. Keep up
> the MTT vigilance!
(this is worthy of its own thread)
+100
MTT vigilance is a tough job; many thanks for submitting good bug reports on
what
On Thu, Nov 06, 2014 at 04:29:44PM -0500, Joshua Ladd wrote:
>On Thursday, November 6, 2014, Nathan Hjelm wrote:
>
> On Thu, Nov 06, 2014 at 04:06:23PM -0500, Joshua Ladd wrote:
> >Nathan,
> >Has this bug always been present in OpenIB or is this a
On Thursday, November 6, 2014, Nathan Hjelm wrote:
> On Thu, Nov 06, 2014 at 04:06:23PM -0500, Joshua Ladd wrote:
> >Nathan,
> >Has this bug always been present in OpenIB or is this a recent
> addition?
> >If this is regression, I would also be inclined to say that
On Thu, Nov 06, 2014 at 04:06:23PM -0500, Joshua Ladd wrote:
>Nathan,
>Has this bug always been present in OpenIB or is this a recent addition?
>If this is regression, I would also be inclined to say that this is a
The bug is as old as the message coalescing feature in the openib
btl.
Nathan,
Has this bug always been present in OpenIB or is this a recent addition? If
this is a regression, I would also be inclined to say that this is a blocker
for 1.8.4. This is a SIGNIFICANT bug. Both Howard and I were quite
surprised that all the while this code has been in use at LANL
in
On Nov 6, 2014, at 3:39 PM, Joshua Ladd wrote:
> Thank you for taking the time to investigate this, Jeff. SC is a hectic and
> stressful time for everyone on this list with many deadlines looming. This
> bug isn't a priority for us, however, it seems to me that your
Thank you for taking the time to investigate this, Jeff. SC is a hectic and
stressful time for everyone on this list with many deadlines looming. This
bug isn't a priority for us; however, it seems to me that your original
revert, the one that simply wants to disable threading by default (and for
I have 2, namely #228 (Fix --with-fortran=... logic) and #232 (RFC/weak symbols
status ignore).
I will look at them eventually, there just haven't been enough hours in the day
yet, especially with SC coming up. :-(
On Nov 6, 2014, at 3:16 PM, Ralph Castain wrote:
>
I suppose it’s too much to ask, but can we turn this thing “off” until you get
it fixed? Maybe you could test it posting to yourself in the meantime?
> Begin forwarded message:
>
> Date: November 6, 2014 at 12:17:48 PM PST
> From: mellanox-github
> Reply-To:
Hey folks
We seem to be creating a bunch of pull requests on the trunk (well, by “we” I
mean mostly Gilles) that are then being left hanging there, going stale. Some
of these are going to start conflicting with changes being made by others, or
even conflict with each other.
Can we do a
Not handling the multi-rail case at this point. Only issue atomics and
rdma operations over a single btl module (which should be a single
HCA).
-Nathan
On Thu, Nov 06, 2014 at 12:15:13PM -0700, Howard Pritchard wrote:
>HI Nathan,
>How would you get things right with atomics and
This thread digressed significantly from the original bug report; I did not
realize that the discussion was revolving around the fact that
MPI_THREAD_MULTIPLE no longer works *at all*.
So here's where we are:
1. MPI_THREAD_MULTIPLE doesn't work, even if you --enable-mpi-thread-multiple
2. It
Hi Nathan,
How would you get things right with atomics and multirail?
Getting the memory consistency right would be really difficult.
You'd have to keep issuing zero-length RDMA reads and hope
that would have the effect of a PCIe flush in the case of
multiple updates to a given target
I haven't looked at that yet. Would be great to get the new osc component
working over both btls and mtls. I know portals supports atomics but I
don't know whether psm does.
-Nathan
On Thu, Nov 06, 2014 at 08:45:15PM +0200, Mike Dubman wrote:
>btw, do you plan to add atomics API to MTL layer
btw, do you plan to add atomics API to MTL layer as well?
On Thu, Nov 6, 2014 at 5:23 PM, Nathan Hjelm wrote:
> At the moment I select the lowest latency BTL that can reach all of the
> ranks in the communicator used to create the window. I can add code to
> round-robin
Yeah, my bad - somehow, it showed up on the github pull request list for
ompi-release. I’ll remove it.
> On Nov 6, 2014, at 9:19 AM, Joshua Ladd wrote:
>
> We filed an RFC for the trunk at Jeff's request. This is a new feature.
>
>
> Josh
>
> On Thu, Nov 6, 2014 at
We filed an RFC for the trunk at Jeff's request. This is a new feature.
Josh
On Thu, Nov 6, 2014 at 12:13 PM, Joshua Ladd wrote:
> Yalla is only in trunk. Unless you want us to push it to 1.8.4 - we won't
> object :)
>
> Josh
>
> On Thu, Nov 6, 2014 at 11:46 AM, Ralph
Yalla is only in trunk. Unless you want us to push it to 1.8.4 - we won't
object :)
Josh
On Thu, Nov 6, 2014 at 11:46 AM, Ralph Castain
wrote:
> Hey folks
>
> Here is the NEWS I have for 1.8.4 so far - please respond with any
> additions/mods you would like to suggest
>
Hey folks
Here is the NEWS I have for 1.8.4 so far - please respond with any
additions/mods you would like to suggest
+1.8.4
+-
+- Removed inadvertent change that set --enable-mpi-thread-multiple "on"
+ by default, thus impacting performance for non-threaded apps
+- Significantly reduced
IIRC, you prefix the core number with a P to indicate physical
I’ll see what I can do about getting the physical notation re-implemented -
just can’t promise when that will happen
> On Nov 6, 2014, at 8:30 AM, Tom Wurgler wrote:
>
> Well, unless we can get LSF to use
Well, unless we can get LSF to use physical numbering, we are dead in the water
without a translator of some sort.
We are trying to figure how we can automate the translation in the meantime,
but we have a mix of clusters and the mapping is different between them.
We use Open MPI 1.6.4 daily
On Nov 6, 2014, at 12:44 AM, George Bosilca wrote:
> PS: Sorry Dave I also pushed a master branch merge ...
It's not the end of the world, just try to keep an eye on it and avoid doing it
in the future. If you need any help avoiding it, feel free to ping me or the
devel@
Ugh… we used to have a switch for that purpose, but it became hard to manage
the code. I could reimplement at some point, but it won’t be in the immediate
future.
I gather the issue is that the system tools report physical numbering, and so
you have to mentally translate to create the
At the moment I select the lowest latency BTL that can reach all of the
ranks in the communicator used to create the window. I can add code to
round-robin windows over the available BTLs on multi-rail systems.
-Nathan
On Wed, Nov 05, 2014 at 06:38:25PM -0800, Paul Hargrove wrote:
>All
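[Editorial aside: the selection and round-robin scheme Nathan describes can be sketched as follows. This is illustrative Python, not Open MPI's actual C code; the module fields `latency` and `reachable` are hypothetical stand-ins for what the BTL interface exposes.]

```python
import itertools

def select_btl(modules, ranks):
    """Pick the lowest-latency module that can reach every rank
    in the window's communicator (None if no module can)."""
    candidates = [m for m in modules
                  if all(r in m["reachable"] for r in ranks)]
    return min(candidates, key=lambda m: m["latency"], default=None)

def assign_windows(modules, ranks, n_windows):
    """Round-robin successive windows over all usable modules,
    spreading traffic across rails on a multi-rail system."""
    candidates = [m for m in modules
                  if all(r in m["reachable"] for r in ranks)]
    rails = itertools.cycle(candidates)
    return [next(rails) for _ in range(n_windows)]
```

Note this sketch sidesteps the memory-consistency problem discussed elsewhere in the thread: round-robin is only applied across whole windows, so any single window still issues all atomics and RDMA over one module.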
So we used lstopo with an arg of "--logical" and the output showed the core
numbering 0,1,2,3...47 instead of
0,4,8,12 etc.
The multiplying by 4 you speak of falls apart when you get to the second socket
as its physical numbers are
1,5,9,13... and its logical numbers are 12,13,14,15
So the
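[Editorial aside: the physical-to-logical translation being asked for can be sketched like this. A toy example in Python; the 8-core interleaved mapping below is hypothetical, not the actual 48-core machine described above.]

```python
def build_logical_map(phys_in_logical_order):
    """Given physical (OS) core ids listed in hwloc logical order
    (the order lstopo --logical walks them), return a dict that
    translates a physical core id to its logical index."""
    return {phys: logical
            for logical, phys in enumerate(phys_in_logical_order)}

# Toy two-socket box: socket 0 owns physical cores 0,4,8,12 and
# socket 1 owns 1,5,9,13, mirroring the interleaved numbering
# reported by the system tools.
to_logical = build_logical_map([0, 4, 8, 12, 1, 5, 9, 13])
```

With a map like this, an LSF-provided physical core list could be rewritten into the logical numbering before it reaches mpirun. hwloc's `hwloc-calc` utility (its `--physical-input` and `--logical-output` options) may do this translation directly, though that is worth verifying against the hwloc version in use.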
Thanks! It fixes the problem with tcp.
Best regards,
Elena
On Thu, Nov 6, 2014 at 10:44 AM, George Bosilca wrote:
> I pushed a slightly better patch for the TCP BTL
> (54ddb0aece0892dcdb1a1293a3bd3902b5f3acdc). The correct scheme would be to
> OBJ_RETAIN the proc once it
I pushed a slightly better patch for the TCP BTL
(54ddb0aece0892dcdb1a1293a3bd3902b5f3acdc). The correct scheme would be to
OBJ_RETAIN the proc once it is attached to the btl_proc and release it upon
destruction of the btl_proc. However, for some obscure reason this doesn't
quite work, as the
Ralph,
I updated the MODEX flag to PMIX_GLOBAL
https://github.com/open-mpi/ompi/commit/d542c9ff2dc57ca5d260d0578fd5c1c556c598c7
Elena,
I was able to reproduce the issue (salloc -N 5 mpirun -np 2 is enough).
I was "lucky" to reproduce the issue: it happened because one of the
nodes was misconfigured