How about if we start on this over e-mail and phone? A face-to-face
meeting is good, but I am already booked Jan 5-9, maybe 12-13, Jan 16-Feb
6th, and Feb 8-11. I would prefer not to tack on something at the end of
the MPI Forum meeting, as I will have been gone from home for most of the
month
I have other meetings so far on Jan 21, and possibly Jan 6-8.
So I would ask that we not have the Jan/Feb ORTE meeting in either
of those weeks.
On Thu, Dec 4, 2008 at 5:50 PM, Ralph Castain wrote:
>
> On Dec 4, 2008, at 3:25 PM, Jeff Squyres wrote:
>
>> I don't know who's interested, so
What specifically do you have in mind?
After talking with Jeff I withdraw my request to change the approach. This
is a good approach when one wants to send warnings to some sort of logging
system, in addition to errors. Sending the data up stream like I suggested
can't rely on the error
I don't know who's interested, so I thought I'd bring it up on the
devel list: let's start the basics for the January ORTE meeting. We
may be able to sketch out an agenda, but frankly, it may depend on how
far we get in the December meeting. So we may not be able to fully
decide that
A physical meeting about this in the near future is
unlikely; I think we're all facing travel restrictions and constraints
with the holidays coming up.
How about a teleconf to discuss the following about the notifier:
- what exactly is there today
- why what is there today
I'm beginning to believe that we need a design meeting specifically
over this question. Too many unknowns exist, with significant
potential problems lurking behind them. Frankly, this issue could have
a major impact on how we operate, performance, and a variety of other
factors going
On 12/4/08 2:28 PM, "Ralph Castain" wrote:
> I guess you lost me on this one. How are the btl's going to push an error "up"
> to a higher layer? The errors could contain an arbitrary amount of information
> in them. Since the btl API's currently only return ints, are you
I guess you lost me on this one. How are the btl's going to push an
error "up" to a higher layer? The errors could contain an arbitrary
amount of information in them. Since the btl API's currently only
return ints, are you proposing that we change all the btl APIs to
include a new error
Not exactly, it depends on what you push up the stack. If you push just an
error code, then you are right, there is very little value. However, if you
push up the error strings (or something like that), and have an upper layer
interact with SLURM or Moab's error reporting system, the btl's don't
That was my thought exactly. And since the point of the notifier
component is to return a *useful* description of what failure the BTL had
(like IB ran out of resource X again), that will be lost if we just push
that up to the next layer.
Just my $0.02, of course.
Brian
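
The trade-off raised in this exchange (a bare int return code versus a useful description reaching an upper layer) can be sketched in C. This is only an illustration under stated assumptions: every name below (demo_error_cb_t, demo_transport_send, demo_record_error, DEMO_ERR_OUT_OF_RESOURCE) is hypothetical and is not part of the BTL, ORTE, or notifier interfaces. The transport call still returns an int, but it first hands a descriptive string to a callback registered by the upper layer, so the detail is not lost.

```c
#include <assert.h>
#include <stdio.h>
#include <string.h>

/* Hypothetical names throughout; none of this is Open MPI's actual API. */
#define DEMO_ERR_OUT_OF_RESOURCE (-2)

/* Signature of a hypothetical upper-layer error sink. */
typedef void (*demo_error_cb_t)(int code, const char *detail, void *ctx);

/* Example sink context: keeps the most recent report. */
struct demo_last_error {
    int code;
    char detail[128];
};

/* Example sink: an upper layer could forward this report to a logging
 * system or a scheduler instead of just storing it. */
static void demo_record_error(int code, const char *detail, void *ctx)
{
    struct demo_last_error *last = ctx;
    assert(detail != NULL);
    last->code = code;
    snprintf(last->detail, sizeof last->detail, "%s", detail);
}

/* Stand-in for a transport call. It always simulates a failure: the
 * return value stays a plain int, and the description travels through
 * the callback when one is registered. */
static int demo_transport_send(demo_error_cb_t on_error, void *ctx)
{
    if (on_error != NULL) {
        on_error(DEMO_ERR_OUT_OF_RESOURCE,
                 "send queue exhausted on device 0", ctx);
    }
    return DEMO_ERR_OUT_OF_RESOURCE;
}
```

A caller would pass demo_record_error and a struct demo_last_error to demo_transport_send, then inspect the stored detail string. Passing NULL as the callback gives only the int code, which is the case described above where the useful description is lost.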
On Thu, 4 Dec 2008,
Hmmm...only problem with that idea is that the entities being
communicated with (e.g., SLURM, Moab) have no concept of MPI nor any way
to communicate via that system. They do, however, have APIs that
notifier can call, and know how to speak TCP via their own agreed-upon
protocols. And many
Here is where I think we should reconsider accessing the notifier component
in the btl. It creates dependencies in the btl that are not needed. The
idea of a notifier component is a good one, but I would defer using it to
upper layers, rather than embedding it in the guts of the communication
On 12/4/08 9:05 AM, "Jeff Squyres" wrote:
> After reflecting on this a bit, there's two more things I should have
> mentioned:
>
> 1. I think that moving the BTL's out into their own layer (or
> whatever) should be a separate effort than re-introducing the RSL (or
>
On Dec 4, 2008, at 7:05 AM, Jeff Squyres wrote:
After reflecting on this a bit, there's two more things I should
have mentioned:
1. I think that moving the BTL's out into their own layer (or
whatever) should be a separate effort than re-introducing the RSL
(or something like it). To
Yes, FTB utilizes the notifier framework. In addition, we have three
other components getting ready to be added to that framework that will
provide interfaces to Moab, SLURM, and a DOE monitoring program. The
first two will require messaging capabilities to tell the schedulers
about
I think you got it right. And I think we're pretty good in terms of
BTL usage of ORTE and OPAL (to include the new "notifier" service that
Ralph put in recently -- what the FTB will likely eventually use, I
think...?); those interfaces and abstraction barriers are
technologically