I've tried this with RHEL4 U3 x86_64 LionMini SDR, SLES10 x86_64 LionCub DDR,
and RHEL4 U3 x86_64 LionMini DDR so far.
You reported an oops. On which OS/HW/FW did you observe it?
--
MST
___
general mailing list
general@lists.openfabrics.org
Quoting Roland Dreier [EMAIL PROTECTED]:
Subject: Re: [ofa-general] questions about OFED 1.2 IPoIB bonding
Out of curiosity, why does this cause a catastrophic error? I would
have thought a work request with a bogus bus address would generate an
affiliated error, since you know
Quoting Scott Weitzenkamp (sweitzen) [EMAIL PROTECTED]:
Subject: [ofa-general] Changed default bugzilla Priority/Severity from
P1/Blocker to P3/Normal
Now there will be no more accidental P1/Blocker bugs.
Good idea.
--
MST
Are all your ports DDR or do you have a mix ? If all are DDR, you can
configure the default partition to use this rate.
If I get this right, user has to manually configure the rate in a mixed subnet.
Is that correct?
If yes, I'm actually not too happy with this.
Would something like the
Hi,
On Tuesday 10 April 2007 20:25, Al Chu wrote:
On Tue, 2007-04-10 at 12:14 -0500, Steve Wise wrote:
I just built the ofed-1.2-rc1 kit on an IBM P5 PPC with SLES 10 and some
of the apps got built as 32b. Seems like gcc on this distro defaults to
32b:
My opinion is this bug is with Suse.
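One way to confirm which of the built apps ended up 32-bit is to read the ELF class byte (EI_CLASS, the fifth byte of the ELF header) of each binary. The following is only a minimal sketch; the helper names `elf_class` and `elf_class_of_file` are made up for illustration and are not part of the OFED build scripts.

```python
def elf_class(header: bytes) -> str:
    """Classify an ELF binary from the first bytes of its header.

    EI_CLASS is byte 4 of the ELF identification: 1 means 32-bit,
    2 means 64-bit.
    """
    if len(header) < 5 or header[:4] != b"\x7fELF":
        raise ValueError("not an ELF file")
    return {1: "32-bit", 2: "64-bit"}.get(header[4], "unknown")


def elf_class_of_file(path: str) -> str:
    """Hypothetical convenience wrapper: read just enough of the file."""
    with open(path, "rb") as f:
        return elf_class(f.read(5))
```

Running `elf_class_of_file` over the installed binaries would show which ones the compiler built with its 32-bit default.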
This email was generated automatically, please do not reply
Common build parameters: --with-ipoib-mod --with-sdp-mod --with-srp-mod
--with-user_mad-mod --with-user_access-mod --with-mthca-mod --with-core-mod
--with-addr_trans-mod --with-rds-mod --with-cxgb3-mod
Passed:
Passed on i686 with
I did follow most of the discussions between you and MoniS re the
ipoib/bonding integration in OFED 1.2 and elsewhere; however, I don't
see why bonding is basically broken for ipoib. If you don't mind,
please tell me the bottom line from your perspective.
Here's a short summary of issues
Quoting Hal Rosenstock [EMAIL PROTECTED]:
Subject: Re: multicast join failed for...
On Mon, 2007-04-09 at 18:47, Egor Tur wrote:
Hi folk.
ib1: multicast join failed for ff12:601b::::::0001,
status -22
ib0: multicast join failed for
Very good idea
Tziporet
http://lists.openfabrics.org/cgi-bin/mailman/listinfo/general
To unsubscribe, please visit http://openib.org/mailman/listinfo/openib-general
Roland Dreier wrote:
472 Data corruption with Lustre+OFED when using FMR on memfree HCAs
We see it also with iser, basically only on scsi --read-- which from
IB perspective is RDMA write from the target to the initiator.
The env we see it is Sinai (25204) hw_ver=A0 and
Quoting Or Gerlitz [EMAIL PROTECTED]:
Subject: Re: [ofa-general] iser/lustre memfree issues
Roland Dreier wrote:
472 Data corruption with Lustre+OFED when using FMR on memfree HCAs
We see it also with iser, basically only on scsi --read-- which from
IB perspective is RDMA
On Wed, 2007-04-04 at 22:13 -0500, Steve Wise wrote:
On Wed, 2007-04-04 at 10:57 -0500, Steve Wise wrote:
I just built and installed today's daily ofed-1.2 build and mvapich2
doesn't work at all over iwarp. The build is
OFED-1.2-20070404-0600.tgz.
I've opened bug 520 to track this.
On Wed, 2007-04-11 at 03:22, Michael S. Tsirkin wrote:
Are all your ports DDR or do you have a mix ? If all are DDR, you can
configure the default partition to use this rate.
If I get this right, user has to manually configure the rate in a mixed
subnet.
Is that correct?
I'm not sure;
On Wed, 2007-04-11 at 05:49, Michael S. Tsirkin wrote:
Quoting Hal Rosenstock [EMAIL PROTECTED]:
Subject: Re: multicast join failed for...
On Mon, 2007-04-09 at 18:47, Egor Tur wrote:
Hi folk.
ib1: multicast join failed for
ff12:601b::::::0001, status
On Tue, 2007-04-10 at 17:02 -0500, Steve Wise wrote:
Vlad,
Please pull these cxgb3 and iw_cxgb3 changes from
git://git.openfabrics.org/~swise/ofed_1_2 ofed_1_2
Thanks,
Steve.
Divy Le Ray:
Ensure that the TCAM active region size is at least 16.
If yes, I'm actually not too happy with this.
Would something like the following heuristic work better?
- select the max rate between all participants
The issue is that one doesn't know all the participants in a group as
they are joined dynamically.
(I think we've been over this
Anyone know what's going on with gitweb on the OFA server ?
When I try:
http://www.openfabrics.org/gitweb/
I get:
Internal Server Error
The server encountered an internal error or misconfiguration and was
unable to complete your request.
Please contact the server administrator, [EMAIL
On Wed, 2007-04-11 at 14:12, Michael S. Tsirkin wrote:
If yes, I'm actually not too happy with this.
Would something like the following heuristic work better?
- select the max rate between all participants
The issue is that one doesn't know all the participants in a group as
Hi Hal. After a long delay, it seems to work for me. Thanks.
-jeff
On 11 Apr 2007 14:17:25 -0400, Hal Rosenstock [EMAIL PROTECTED] wrote:
Anyone know what's going on with gitweb on the OFA server ?
When I try:
http://www.openfabrics.org/gitweb/
I get:
Internal Server Error
The server
The utils component started as a place for installer bugs. We then
created an Installer component.
I view utils as a place for bugs on tvflash, mstflint, perftest,
anything that does not fit in another component.
We could create components for ibutils, tvflash, mstflint, etc. if that
would be
On Wed, 2007-04-11 at 14:57, Scott Weitzenkamp (sweitzen) wrote:
The utils component started as a place for installer bugs. We then
created an Installer component.
I view utils as a place for bugs on tvflash, mstflint, perftest,
anything that does not fit in another component.
We could
I added ibutils.
-----Original Message-----
From: Hal Rosenstock [mailto:[EMAIL PROTECTED]
Sent: Wednesday, April 11, 2007 11:57 AM
To: Scott Weitzenkamp (sweitzen)
Cc: OpenFabricsEWG; general@lists.openfabrics.org; Eitan Zahavi
Subject: RE: Bugzilla setup for utils component
On Wed,
Documentation/user_mad.txt: Clarify transaction ID usage
Signed-off-by: Hal Rosenstock [EMAIL PROTECTED]
diff --git a/Documentation/infiniband/user_mad.txt
b/Documentation/infiniband/user_mad.txt
index 750fe5e..1d2dbf1 100644
--- a/Documentation/infiniband/user_mad.txt
+++
Quoting Hal Rosenstock [EMAIL PROTECTED]:
Subject: Re: multicast join failed for...
On Wed, 2007-04-11 at 14:12, Michael S. Tsirkin wrote:
If yes, I'm actually not too happy with this.
Would something like the following heuristic work better?
- select the max rate between all
Sean:
A question about the rdmacm library. I use rdma_connect/accept to
wire the IB connection between A and B. Somehow the IB connection is
broken, either because process B dies or because of a bad cable. If process A just
receives messages from process B, can process A get a
RDMA_CM_EVENT_DISCONNECTED
+Transaction IDs
+
+ Clients of the MAD layer can use the lower 32 bits of the
+ transaction ID field to track mad request/response pairs. The
+ upper 32 bits are reserved for use by the kernel ib_mad module.
This is a good addition. But I think it would be worth saying which
half
On Wed, 2007-04-11 at 17:02, Roland Dreier wrote:
+Transaction IDs
+
+ Clients of the MAD layer can use the lower 32 bits of the
+ transaction ID field to track mad request/response pairs. The
+ upper 32 bits are reserved for use by the kernel ib_mad module.
This is a good
This is a good addition. But I think it would be worth saying which
half of the TID is the lower half. I would fix it up myself but I
don't know off the top of my head which byte order the TID is
interpreted with.
Should this be described relative to network (rather than host)
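To make the "which half is lower" question concrete, here is a small sketch of splitting and packing a transaction ID. It assumes the TID is treated as one 64-bit value carried in network (big-endian) byte order, which is exactly the point the thread has not settled; the function names are invented for illustration and are not part of the user_mad interface.

```python
import struct


def make_tid(upper: int, lower: int) -> int:
    """Combine two 32-bit halves into one 64-bit transaction ID."""
    return ((upper & 0xFFFFFFFF) << 32) | (lower & 0xFFFFFFFF)


def split_tid(tid: int) -> tuple:
    """Return (upper 32 bits, lower 32 bits) of a 64-bit TID."""
    if not 0 <= tid < (1 << 64):
        raise ValueError("TID must fit in 64 bits")
    return tid >> 32, tid & 0xFFFFFFFF


def tid_wire_bytes(tid: int) -> bytes:
    """Pack the TID as 8 bytes, ASSUMING network (big-endian) order."""
    return struct.pack(">Q", tid)
```

Under that assumption the client's lower 32 bits are the last four bytes on the wire; if the TID were interpreted in little-endian host order instead, they would be the first four.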
On Wed, 2007-04-11 at 17:13, Roland Dreier wrote:
This is a good addition. But I think it would be worth saying which
half of the TID is the lower half. I would fix it up myself but I
don't know off the top of my head which byte order the TID is
interpreted with.
Should this
On Wed, 2007-04-11 at 15:47, Michael S. Tsirkin wrote:
Quoting Hal Rosenstock [EMAIL PROTECTED]:
Subject: Re: multicast join failed for...
On Wed, 2007-04-11 at 14:12, Michael S. Tsirkin wrote:
If yes, I'm actually not too happy with this.
Would something like the
I haven't tried adding or removing storage, just failover. I guess
leave 91-srp.rules in for now, it seems benign.
Scott
-----Original Message-----
From: Ishai Rabinovitz [mailto:[EMAIL PROTECTED]
Sent: Tuesday, April 10, 2007 9:46 PM
To: Chieng Etta
Cc: Scott Weitzenkamp (sweitzen);
BTW: any idea how this ever got triggered? The only way I can see is
if you're either not using libipathverbs and libibverbs and you just
create the CQ some other way, which seems unlikely. Do you know how
Jason triggered this bug?
Yes, it was because he was using 32-bit userspace and
On 4/11/07, Michael S. Tsirkin [EMAIL PROTECTED] wrote:
I did follow most of the discussions between you and MoniS re the
ipoib/bonding integration in OFED 1.2 and elsewhere; however, I don't
see why bonding is basically broken for ipoib. If you don't mind,
please tell me the bottom line
On 4/12/07, Roland Dreier [EMAIL PROTECTED] wrote:
Could you try commenting out just these 2 lines in mthca_cmd.c:
if (dev->mthca_flags & MTHCA_FLAG_SINAI_OPT)
        MTHCA_PUT(inbox, 0x1, INIT_HCA_FLAGS1_OFFSET);
(reverting your changes, that is keeping
Roland Dreier wrote:
BTW: any idea how this ever got triggered? The only way I can see is
if you're either not using libipathverbs and libibverbs and you just
create the CQ some other way, which seems unlikely. Do you know how
Jason triggered this bug?
Yes, it was because he was using
-----Original Message-----
From: Sean Hefty [mailto:[EMAIL PROTECTED]
Sent: Wednesday, April 11, 2007 4:50 PM
To: Tang, Changqing
Cc: general@lists.openfabrics.org
Subject: Re: How fast to get RDMA_CM_EVENT_DISCONNECTED ?
A question about rdmacm library. I use
Several seconds to detect connection failure is not acceptable for us,
so if I use rdmacm, I want to know if I can detect the connection
failure faster than with a heart-beat message.
In general, use of the rdma or ib cm will not help detect failures on an active
connection any faster. If the remote process
Hi folk.
I see that my small problem has been interesting.
Thanks for your help.
Rate 6 is 20 Gb/sec whereas 3 is 10 Gb/sec. So the port is 4x DDR (rate
6) and the group is 4x SDR. The request is for equal to the rate so it
fails.
Are all your ports DDR or do you have a mix ? If all
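The rate mismatch described above can be illustrated with a small model of the IB rate codes and the rate selector. The code values and selector values follow the standard IB encoding (3 = 10 Gb/sec, 6 = 20 Gb/sec; selector 2 = exactly), but the function `rate_satisfies` is a simplified sketch for illustration, not the SM's actual join logic.

```python
# IB rate codes mapped to Gb/sec. Note the codes are not ordered by speed.
RATE_GBPS = {2: 2.5, 3: 10, 4: 30, 5: 5, 6: 20, 7: 40, 8: 60, 9: 80, 10: 120}

# Rate selector values: greater than, less than, exactly, best available.
SEL_GT, SEL_LT, SEL_EQ, SEL_BEST = 0, 1, 2, 3


def rate_satisfies(selector: int, requested_code: int, group_code: int) -> bool:
    """Simplified check: does the group's rate satisfy the request?"""
    requested = RATE_GBPS[requested_code]
    group = RATE_GBPS[group_code]
    if selector == SEL_GT:
        return group > requested
    if selector == SEL_LT:
        return group < requested
    if selector == SEL_EQ:
        return group == requested
    if selector == SEL_BEST:
        return True
    raise ValueError("unknown rate selector")


# A 4x DDR port (rate 6) asking for exactly its own rate against a
# 4x SDR group (rate 3) does not match, which is the failure reported here.
ddr_join_ok = rate_satisfies(SEL_EQ, 6, 3)
```

Comparing in Gb/sec rather than by raw code matters because, for example, code 5 (5 Gb/sec) is slower than code 3 (10 Gb/sec).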
On 11 Apr 2007 17:45:54 -0400
Hal Rosenstock [EMAIL PROTECTED] wrote:
On Wed, 2007-04-11 at 15:47, Michael S. Tsirkin wrote:
- previously we had some client failing join
which is worse.
Maybe not. Maybe that's what the admin wants (to keep the higher rate
rather than degrade the group
Quoting Hal Rosenstock [EMAIL PROTECTED]:
Subject: Re: multicast join failed for...
On Wed, 2007-04-11 at 15:47, Michael S. Tsirkin wrote:
Quoting Hal Rosenstock [EMAIL PROTECTED]:
Subject: Re: multicast join failed for...
On Wed, 2007-04-11 at 14:12, Michael S. Tsirkin wrote:
Yes, internally in A, if the # of receives exceeds lowwater(4), an ack
will be sent back. I assume the ACK is not triggered at the moment.
When A is trying to receive a message from B, and the message never
shows, A actually sends a heart beat back to B; however, it takes
several seconds for
Yes, get a steady stream of these on sender.
ib0: TX ring full, stopping kernel net queue
Aha. (As a note, it's always useful to set debug level when
you experience problems).
Why am I getting low throughput of IP multicast vs IPoIB UD UDP unicast?
It's something in the hardware - it's
Michael or Roland, could you please try iperf with UDP vs multicast, so
I'm not the middleman here?
Scott
-----Original Message-----
From: Michael S. Tsirkin [mailto:[EMAIL PROTECTED]
Sent: Wednesday, April 11, 2007 9:17 PM
To: [EMAIL PROTECTED]; Roland Dreier; Scott
Weitzenkamp