I asked our IB expert Jack for hints and he told me this:
From Section 11.6.2 (COMPLETION RETURN STATUS) of the IB Spec volume 1,
revision 1.2.1
* Local Length Error - ... Generated for a
Work Request posted to the local Receive Queue when the sum of
the Data Segment lengths is too small to
Hi,
We are testing QoS.
We have defined service level rules in opensm and implemented qos-policy.
Implementation wasn't fully done in perftest tools.
So I've extended the perftest tools by adding a -L option to set service
levels.
I took the OFED 1.3 git version. The latest commit is:
commit
Oren is perftest maintainer
Tziporet
Celine Bourde wrote:
Hi,
We are testing QoS.
We have defined service level rules in opensm and implemented qos-policy.
Implementation wasn't fully done in perftest tools.
So I've extended the perftest tools by adding a -L option to set service
levels.
I
Celine Bourde wrote:
We are testing QoS. We have defined service level rules in opensm and
implemented qos-policy. Implementation wasn't fully done in perftest
tools. So I've extended the perftest tools by adding a -L option to set
service levels.
I found qperf to be very useful for QoS
Amir Vadai wrote:
I asked our IB expert Jack for hints and he told me this:
From Section 11.6.2 (COMPLETION RETURN STATUS) of the IB Spec volume 1,
revision 1.2.1
* Local Length Error - ... Generated for a
Work Request posted to the local Receive Queue when the sum of
the Data Segment
Al Chu wrote:
On Thu, 2008-10-23 at 14:53 +0200, Philippe Gregoire wrote:
Hi Yevgeny,
Is it possible to write this service so it will be able to manage multiple
instances of opensm on the same node, I mean start and stop all instances at
the same time or separately.
This will be very
This email was generated automatically, please do not reply
git_url: git://git.openfabrics.org/ofed_1_4/linux-2.6.git
git_branch: ofed_kernel
Common build parameters:
Passed:
Passed on i686 with linux-2.6.16
Passed on i686 with linux-2.6.18
Passed on i686 with linux-2.6.19
Passed on i686 with
Amir Vadai wrote:
I asked our IB expert Jack for hints and he told me this:
From Section 11.6.2 (COMPLETION RETURN STATUS) of the IB Spec volume 1,
revision 1.2.1
* Local Length Error - ... Generated for a
Work Request posted to the local Receive Queue when the sum of
the Data Segment
On Mon, Oct 27, 2008 at 11:09 AM, Nicolas Morey Chaisemartin
[EMAIL PROTECTED] wrote:
Amir Vadai wrote:
I asked our IB expert Jack for hints and he told me this:
From Section 11.6.2 (COMPLETION RETURN STATUS) of the IB Spec volume 1,
revision 1.2.1
* Local Length Error - ... Generated
Dotan Barak wrote:
On Mon, Oct 27, 2008 at 11:09 AM, Nicolas Morey Chaisemartin
[EMAIL PROTECTED] wrote:
Amir Vadai wrote:
I asked our IB expert Jack for hints and he told me this:
From Section 11.6.2 (COMPLETION RETURN STATUS) of the IB Spec volume 1,
revision 1.2.1
* Local
Philippe Gregoire wrote:
Al Chu wrote:
On Thu, 2008-10-23 at 14:53 +0200, Philippe Gregoire wrote:
Hi Yevgeny,
Is it possible to write this service so it will be able to manage
multiple instances of opensm on the same node, I mean start and stop
all instances at the same time or
I opened a bug in bugzilla with your research:
https://bugs.openfabrics.org/show_bug.cgi?id=1311
Nicolas Morey Chaisemartin wrote:
Amir Vadai wrote:
I asked our IB expert Jack for hints and he told me this:
From Section 11.6.2 (COMPLETION RETURN STATUS) of the IB Spec volume
1, revision
Yevgeny Kliteynik wrote:
Philippe Gregoire wrote:
Al Chu wrote:
On Thu, 2008-10-23 at 14:53 +0200, Philippe Gregoire wrote:
Hi Yevgeny,
Is it possible to write this service so it will be able to manage
multiple instances of opensm on the same node, I mean start and
stop all
---
drivers/infiniband/ulp/sdp/sdp_bcopy.c |4 ++--
drivers/infiniband/ulp/sdp/sdp_cma.c |8
drivers/infiniband/ulp/sdp/sdp_main.c | 11 +--
3 files changed, 11 insertions(+), 12 deletions(-)
diff --git a/drivers/infiniband/ulp/sdp/sdp_bcopy.c
This is the agenda for the OFED meeting today:
1. Review bugs status and decide on their priority
1262cri [EMAIL PROTECTED] congestion hang
with RDS
1298cri [EMAIL PROTECTED] nfsrdma rh5.1 causes
kernel panic
1299cri [EMAIL PROTECTED] nfs module
Looks OK... probably not worth checking
ClassPortInfo:CapabilityMask.PortCountersXmitWaitSupported to make sure
this field is defined, although it is unfortunate that the IB spec says
that PortXmitWait is undefined rather than 0 when it isn't supported.
Anyway, one question:
static
Some architectures support weak ordering in which case better
performance is possible. IB registered memory used for data can be
weakly ordered because the completion queues' buffers are
registered as strongly ordered. This will result in flushing all data
related outstanding DMA
The first two look OK for 2.6.28 I guess, although they don't seem to be
regression fixes and appeared at the tail end of the merge window. I'll
probably sneak them into -rc3. But this patch is just an optimization,
right? So I'll wait for 2.6.29 for this one.
At 12:19 PM 10/27/2008, Roland Dreier wrote:
Some architectures support weak ordering in which case better
performance is possible. IB registered memory used for data can be
weakly ordered because the completion queues' buffers are
registered as strongly ordered. This will result in
On Mon, 27 Oct 2008 10:40:17 +0100
Philippe Gregoire [EMAIL PROTECTED] wrote:
Al Chu wrote:
On Thu, 2008-10-23 at 14:53 +0200, Philippe Gregoire wrote:
Hi Yevgeny,
Is it possible to write this service so it will be able to manage multiple
instances of opensm on the same node,
Roland Dreier wrote:
Looks OK... probably not worth checking
ClassPortInfo:CapabilityMask.PortCountersXmitWaitSupported to make sure
this field is defined, although it is unfortunate that the IB spec says
that PortXmitWait is undefined rather than 0 when it isn't supported.
Anyway, one
On Mon, Oct 27, 2008 at 12:32:06PM -0400, Talpey, Thomas wrote:
Eli - is there some reason you chose mode 0444 to protect against writing the
setting after module loading? It looks like the value is inspected
dynamically.
I think the value of the parameter should be determined at driver
Hey Philippe,
On Mon, 2008-10-27 at 10:40 +0100, Philippe Gregoire wrote:
Al Chu wrote:
On Thu, 2008-10-23 at 14:53 +0200, Philippe Gregoire wrote:
Hi Yevgeny,
Is it possible to write this service so it will be able to manage multiple
instances of opensm on the same node, I
I'm referring to these:
ib0: multicast join failed for ff12:401b::::::, status
-11
The patch in http://lists.openfabrics.org/pipermail/general/2008-May/050551.html
is causing them.
The patch creates a state when there is no sm_ah, so all alloc_mad() calls
return -11
Al Chu wrote:
Hey Philippe,
On Mon, 2008-10-27 at 10:40 +0100, Philippe Gregoire wrote:
Al Chu wrote:
On Thu, 2008-10-23 at 14:53 +0200, Philippe Gregoire wrote:
Hi Yevgeny,
Is it possible to write this service so it will be able to manage multiple
instances of opensm on the same
Yevgeny Petrilin wrote:
Signed-off-by: Yevgeny Petrilin [EMAIL PROTECTED]
---
drivers/net/mlx4/fw.c |2 +-
1 files changed, 1 insertions(+), 1 deletions(-)
diff --git a/drivers/net/mlx4/fw.c b/drivers/net/mlx4/fw.c
index be09fdb..cee199c 100644
--- a/drivers/net/mlx4/fw.c
+++
Huang Weiyi wrote:
Removed duplicated #include <linux/cpumask.h> in
drivers/net/mlx4/en_main.c.
Signed-off-by: Huang Weiyi [EMAIL PROTECTED]
mailto:[EMAIL PROTECTED]
diff --git a/drivers/net/mlx4/en_main.c b/drivers/net/mlx4/en_main.c
index 1b0eebf..4b9794e 100644
---
On Mon, 2008-10-27 at 20:53 +0200, Yevgeny Kliteynik wrote:
Al Chu wrote:
Hey Philippe,
On Mon, 2008-10-27 at 10:40 +0100, Philippe Gregoire wrote:
Al Chu wrote:
On Thu, 2008-10-23 at 14:53 +0200, Philippe Gregoire wrote:
Hi Yevgeny,
Is it possible to write this service
UD packets sent to the local IB port (loopback) have a zero length
reported in the send work request completion entry. This fixes it
by using a copy of the WQE to copy the data.
According to the IB spec (as I read it at least), the bytes transferred
field of a completion entry is only
Hi all,
I am configuring an opteron cluster with connectX Infiniband. I have a
problem that if I run one of the NAS tests, it works the first, and maybe 2nd
time, but after that the jobs instantly fail with messages like this-
[Rank 44][cm.c: line 860]poll CQ failed -2
[Rank 51][cm.c: line
On Mon, 2008-10-27 at 15:30 -0700, Roland Dreier wrote:
UD packets sent to the local IB port (loopback) have a zero length
reported in the send work request completion entry. This fixes it
by using a copy of the WQE to copy the data.
According to the IB spec (as I read it at least), the
On Monday 27 October 2008, Rick Warner wrote:
Hi all,
I am configuring an opteron cluster with connectX Infiniband. I have a
problem that if I run one of the NAS tests, it works the first, and maybe
2nd time, but after that the jobs instantly fail with messages like this-
[Rank 44][cm.c:
Hello
On a several-hundred-node cluster we run here, we have had
several large (512+ core) jobs die with the following left in several of
the nodes' logs. Below is an example from two different nodes. 22 nodes
had this error after the large run died.
What is this error and why would
ib_mthca :02:00.0: Catastrophic error detected: internal error
This means your HCA detected an internal error -- overheating, power
glitch, cosmic ray, firmware bug, something like that.
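A minimal sketch, assuming the log line has the shape quoted above, of how one might scan collected node logs to see which HCAs reported this message. The function name, the pattern, and the sample lines are hypothetical and only for illustration; this is not part of any OFED tool.

```python
import re

# Pattern modelled on the single log line quoted above (assumption:
# other nodes log the same "ib_mthca <device>: Catastrophic error
# detected: <reason>" shape).
CATASTROPHIC = re.compile(
    r"ib_mthca\s+(\S+): Catastrophic error detected: (.+)"
)


def find_catastrophic_errors(lines):
    """Return a (device, reason) pair for every matching log line."""
    hits = []
    for line in lines:
        match = CATASTROPHIC.search(line)
        if match:
            hits.append((match.group(1), match.group(2).strip()))
    return hits


# Hypothetical sample input: one matching line, one unrelated line.
sample = [
    "ib_mthca :02:00.0: Catastrophic error detected: internal error",
    "ib0: multicast join failed",
]
print(find_catastrophic_errors(sample))
```

Running this over the logs of each node gives a quick count of how many HCAs hit the error during a failed job. It does not say anything about the cause.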
___
general mailing list
general@lists.openfabrics.org