On Wed, Oct 01, 2008 at 08:08:48AM -0400, Jeff Squyres wrote:
> Per the call yesterday, I'll merge this into the trunk once I get it
> working with Brad on PPC.
>
> Brad and I discovered a missing htonl/ntohl somewhere in the code last
> night right before I had to go offline (i.e., we can see
Below is the patch in question.
Thanks,
Jon
Signed-Off-By: Jon Mason
--- /usr/include/infiniband/cm.h.orig 2008-09-21 15:36:46.0 -0700
+++ /usr/include/infiniband/cm.h 2008-09-21 14:17:43.0 -0700
@@ -38,6 +38,7 @@
#include
#include
+#include
#ifdef __cplusplus
extern "C" {
I know this is slightly premature, but if someone could update the link
below to reflect that iWARP is now supported in OMPI it would be much
appreciated.
http://www.open-mpi.org/faq/?category=openfabrics#iwarp-support
Thanks,
Jon
4.6 (Final)
> ::
> /etc/rocks-release
> ::
> Rocks release 4.2.1 (Cydonia)
> login3%
Sorry, looks like I busted you. Please pull the latest bits and verify my
fix solves your issue.
Thanks,
Jon
>
>
> Jon Mason wrote:
>> On Tue, May 20, 2008 at 0
On Tue, May 20, 2008 at 02:48:49PM -0400, Pak Lui wrote:
> Hi,
>
> I am not familiar with get_iwarp_subnet_id and I am not sure why it is
> causing trunk to barf. I think I am using ofed 1.2.5. See attached for
That is in the 1.3 tree, not 1.2. There was a bug in Solaris that was
fixed recent
On Mon, May 19, 2008 at 10:12:19PM +0300, Gleb Natapov wrote:
> On Mon, May 19, 2008 at 01:52:22PM -0500, Jon Mason wrote:
> > On Mon, May 19, 2008 at 05:17:57PM +0300, Gleb Natapov wrote:
> > > On Mon, May 19, 2008 at 05:08:17PM +0300, Pavel Shamis (Pasha) w
On Mon, May 19, 2008 at 01:38:53PM -0400, Jeff Squyres wrote:
> On May 19, 2008, at 8:25 AM, Gleb Natapov wrote:
>
> > Is it possible to have sane SRQ implementation without HW flow
> > control?
>
> It seems pretty unlikely if the only available HW flow control is to
> terminate the connectio
On Mon, May 19, 2008 at 05:17:57PM +0300, Gleb Natapov wrote:
> On Mon, May 19, 2008 at 05:08:17PM +0300, Pavel Shamis (Pasha) wrote:
> > >> 5. ...?
> > >>
> > > What about moving posting of receive buffers into main thread. With
> > > SRQ it is easy: don't post anything in CPC thread. Main th
On Tuesday 13 May 2008 12:20:57 pm Brian W. Barrett wrote:
> On Tue, 13 May 2008, Don Kerr wrote:
> > I believe there are similar operations being used by other areas of open
> > mpi, place to start looking would be, opal/util/if.c.
>
> Yes, opal/util/if.h and opal/util/net.h provide a portable int
On Monday 12 May 2008 07:37:54 pm Jeff Squyres wrote:
> Short version:
> --
>
> I propose that we should disallow multiple different
> mca_btl_openib_receive_queues values (or receive_queues values from
> the INI file) to be used in a single MPI job for the v1.3 series.
>
> More details
On Tuesday 13 May 2008 09:56:13 am Don Kerr wrote:
> I believe btl_open_iwarp.c is making platform specific calls. I don't
Silly question, but I thought openib ONLY worked in Linux. If this is not the
case, then this whole chunk of code will have to be redesigned (if it can
even be done at all)
On Tuesday 06 May 2008 09:41:53 am Jeff Squyres wrote:
> I actually don't know what the RDMA CM requires for the LMC>0 case --
> does it require a unique IP address for every LID?
It requires a unique IP address for every hca/port in use by rdmacm.
I see the bug in rdmacm (since I don't believe
I am seeing some unusual behavior during the shutdown phase of ompi at the end
of my testcase. While running an IMB pingpong test over the rdmacm on openib, I
get cq flush errors on my iWARP adapters.
This error is happening because the remote node is still polling the endpoint
while the other
s - everywhere I've tried it, it works fine.
I'll double check and do a completely fresh svn pull and install and see where
that gets me.
Thanks for the help,
Jon
> Ralph
>
> On 4/2/08 5:41 PM, "Jon Mason" wrote:
> > On Wednesday 02 April 2008 05:04:47 pm
> 1. in the case where it works, can you verify that the ssh to launch
> > the orteds is still running?
> >
> > 2. in the case where it doesn't work, can you verify that the ssh to
> > launch the orteds has actually died?
> >
> > On Apr 2, 2008, at 4:58 PM,
On Wednesday 02 April 2008 01:21:31 pm Jon Mason wrote:
> On Wednesday 02 April 2008 11:54:50 am Ralph H Castain wrote:
> > I remember that someone had found a bug that caused orte_debug_flag to not
> > get properly set (local var covering over a global one) - could be that
>
't think I am doing anything different.
Did some setting change that perhaps I did not modify?
Thanks,
Jon
> On 4/2/08 10:41 AM, "George Bosilca" wrote:
> > I'm using this feature on the trunk with the version from yesterday.
> > It works without problems ...
>
it happened at r17920.
Thanks,
Jon
> On Apr 2, 2008, at 11:59 AM, Jon Mason wrote:
> > I regressed my tree and it looks like it happened between 17590:17917
> >
> > On Wednesday 02 April 2008 10:22:52 am Jon Mason wrote:
> >> I am noticing that ssh seems to be broken on
I regressed my tree and it looks like it happened between 17590:17917
On Wednesday 02 April 2008 10:22:52 am Jon Mason wrote:
> I am noticing that ssh seems to be broken on trunk (and my cpc branch, as
> it is based on trunk). When I try to use xterm and gdb to debug, I only
> successfu
I am noticing that ssh seems to be broken on trunk (and my cpc branch, as it
is based on trunk). When I try to use xterm and gdb to debug, I only
successfully get 1 xterm. I have tried this on 2 different setups. I can
successfully get the xterm's on the 1.2 svn branch.
I am running the fo
On Mon, Mar 10, 2008 at 10:03:27AM -0500, Jeff Squyres wrote:
> On Mar 10, 2008, at 9:50 AM, Steve Wise wrote:
>
> > (just thinking out loud here): The OMPi code could be designed to
> > _not_
> > assume recv's are posted until the CPC indicates they are ready. IE
> > sort
> > of asynchronous
After discussing this issue with Jeff via private e-mails, I would like
to open the issue to the group for further discussion.
Issue (as described by Steve Wise):
Currently OMPI uses qp 0 for all credit updates (by design). This breaks
when running over the chelsio rnic due to a race condition be
A quick sanity check.
When setting the cq depth in the openib btl, it checks the calculated
depth against the maximum cq depth allowed and sets the minimum of those
two. However, I think it is checking the wrong variable. If I
understand correctly, ib_dev_attr.max_cq represents the maximum numbe
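The message above is cut off, so the following is only a sketch of the kind of check it describes, not the actual openib btl code. In libibverbs, `struct ibv_device_attr` carries both `max_cq` (how many CQs the device supports) and `max_cqe` (how many entries a single CQ can hold). The sketch assumes `max_cqe` is the field the depth should be clamped against; `struct dev_attr_sketch` and `clamp_cq_depth` are hypothetical names used so the example builds without the verbs headers.

```c
#include <assert.h>

/* Hypothetical stand-in for the two relevant fields of libibverbs'
 * struct ibv_device_attr, so the sketch builds without verbs headers. */
struct dev_attr_sketch {
    int max_cq;   /* maximum number of CQs the device supports */
    int max_cqe;  /* maximum number of entries in a single CQ */
};

/* Clamp a requested CQ depth to the per-CQ entry limit (max_cqe).
 * Comparing against max_cq would compare a depth with a count of CQs. */
int clamp_cq_depth(int requested, const struct dev_attr_sketch *attr)
{
    assert(requested > 0 && attr->max_cqe > 0);
    return requested < attr->max_cqe ? requested : attr->max_cqe;
}
```

With a device reporting `max_cq = 64` and `max_cqe = 1024`, a requested depth of 4096 would be reduced to 1024, while clamping against `max_cq` would wrongly reduce it to 64.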
's once
>
> 3. ...?
4. Profit!
>
> I'd say that this optimization is pretty important for v1.3 (but it
> shouldn't be hard to do).
>
>
> On Jan 9, 2008, at 6:37 PM, Jon Mason wrote:
>
> > The new cpc selection framework is now in place. The pa
On Thu, Jan 10, 2008 at 11:17:48AM +0200, Pavel Shamis (Pasha) wrote:
> Jon Mason wrote:
> > The new cpc selection framework is now in place. The patch below allows
> > for dynamic selection of cpc methods based on what is available. It
> > also allows for inclusion/exclu
The new cpc selection framework is now in place. The patch below allows
for dynamic selection of cpc methods based on what is available. It
also allows for inclusion/exclusion of methods. It even further allows
for modifying the priorities of certain cpc methods to better determine
the optimal c
On Wed, Dec 12, 2007 at 01:35:33PM -0500, Jeff Squyres wrote:
> I agree with Gleb's idea. More below.
>
> On Dec 12, 2007, at 12:24 PM, Jon Mason wrote:
>
> > Ok, glad I got this conversation started :)
> >
> > So, we need a slight redesign to determine
Ok, glad I got this conversation started :)
So, we need a slight redesign to determine the cm method (unless forced
via commandline arg). This can be determined by calling all the
individual open routines, and having them return a priority based on
their ability to function. For example, the xoo
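A minimal sketch of the priority idea described above, under stated assumptions: each connection method exposes a query routine that returns a priority, or -1 when it cannot function, and the caller keeps the highest one. The type `struct cm_method`, the function `select_cm`, the query stubs, and the priority values are all hypothetical and are not taken from the Open MPI source.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical per-method descriptor; illustrative only. */
struct cm_method {
    const char *name;
    /* Returns a priority >= 0 if the method can run here, or -1 if not. */
    int (*query)(void);
};

/* Stub query routines with made-up priorities. */
int query_unavailable(void) { return -1; }
int query_low(void)         { return 10; }
int query_high(void)        { return 50; }

const struct cm_method example_methods[] = {
    { "oob",    query_low },
    { "xoob",   query_unavailable },
    { "rdmacm", query_high },
};

/* Ask every method for its priority and return the best usable one,
 * or NULL if none can function. Ties go to the earlier entry. */
const struct cm_method *select_cm(const struct cm_method *methods, size_t n)
{
    const struct cm_method *best = NULL;
    int best_prio = -1;
    for (size_t i = 0; i < n; i++) {
        assert(methods[i].query != NULL);
        int prio = methods[i].query();
        if (prio > best_prio) {
            best_prio = prio;
            best = &methods[i];
        }
    }
    return best;
}
```

A command-line override, as mentioned above, could simply bypass `select_cm` and pick the named method directly.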
Currently, alternate CMs cannot be called because
ompi_btl_openib_connect_base_open forces a choice of either oob or xoob
(and goes into an erroneous error path if you pick something else).
This patch reorganizes ompi_btl_openib_connect_base_open so that new
functions can easily be added. New Open
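One way such a reorganization could look is a lookup table keyed by method name, so that adding a method means adding one row and an unknown name produces an explicit failure. This is a sketch only; `struct cm_entry`, `cm_table`, `find_cm`, and the stub open functions are hypothetical names and do not reflect the contents of the actual patch.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical table entry; not the real Open MPI type. */
struct cm_entry {
    const char *name;
    int (*open_fn)(void);
};

/* Stub open routines standing in for the real ones. */
int open_oob_stub(void)  { return 0; }
int open_xoob_stub(void) { return 0; }

/* Adding a connection method means adding one row here. */
const struct cm_entry cm_table[] = {
    { "oob",  open_oob_stub  },
    { "xoob", open_xoob_stub },
};

/* Look up a method by name. NULL signals an unknown name so the caller
 * can report it explicitly instead of falling into a wrong error path. */
const struct cm_entry *find_cm(const char *name)
{
    assert(name != NULL);
    for (size_t i = 0; i < sizeof(cm_table) / sizeof(cm_table[0]); i++) {
        if (strcmp(cm_table[i].name, name) == 0)
            return &cm_table[i];
    }
    return NULL;
}
```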
ting (especially for trivial fixes like this :-) ).
Sorry, I was just trying to err on the side of caution and openness. Do
you have a rule of thumb for what should be sent on the list versus
simply committed?
Thanks,
Jon
>
>
> On Dec 10, 2007, at 4:05 PM, Jon Mason wrote:
>
> >
Slight word usage and grammar error in the openib btl help text. I
believe the change below is the intended meaning.
Thanks,
Jon
Index: ompi/mca/btl/openib/help-mpi-btl-openib.txt
===
--- ompi/mca/btl/openib/help-mpi-btl-openib.txt
There is a double call to ompi_btl_openib_connect_base_open in
mca_btl_openib_mca_setup_qps(). It looks like someone just forgot to
clean-up the previous call when they added the check for the return
code.
I ran a quick IMB test over IB to verify everything is still working.
Thanks,
Jon
Index:
On Tue, Dec 04, 2007 at 11:40:17AM -0800, Arlin Davis wrote:
> Jon Mason wrote:
>> While working on OMPI udapl btl, I have noticed some "interesting"
>> behavior. OFA udapl wants the evd queues to be a power of 2 and
>> then will subtract 1 for bookkeeping (i.e., so
While working on OMPI udapl btl, I have noticed some "interesting"
behavior. OFA udapl wants the evd queues to be a power of 2 and
then will subtract 1 for bookkeeping (i.e., so that internal head and
tail pointers never touch except when the ring is empty). OFA udapl
will report the queue length
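The sizing behavior described above can be sketched as follows. This assumes the requested length is rounded up to the next power of two before one slot is reserved; the rounding direction is an assumption, since the message only says the queue "wants" to be a power of 2. The function names are hypothetical and this is not the OFA uDAPL code.

```c
#include <assert.h>
#include <stdint.h>

/* Round n up to the next power of two (n itself if it already is one). */
uint32_t round_up_pow2(uint32_t n)
{
    assert(n > 0 && n <= 0x80000000u);
    uint32_t p = 1;
    while (p < n)
        p <<= 1;
    return p;
}

/* One slot stays unused so that head == tail only when the ring is
 * empty; a full ring then has head one slot behind tail. */
uint32_t usable_entries(uint32_t requested)
{
    return round_up_pow2(requested) - 1;
}
```

Under these assumptions, a caller asking for 8 entries gets 7 usable ones, so a request has to be sized with the reserved slot in mind.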
I created a public branch to make available the patch which gets OMPI
uDAPL to kinda work on iWARP. The branch can be found at:
http://svn.open-mpi.org/svn/ompi/tmp-public/iwarp-ompi-v1.2/
The branch contains an updated version of the patch Steve Wise sent out
some time ago. Below is the patch (