[openib-general] [Bug 263] OFED 1.1 rc6: IPoIB Oops during IPoIB failover loop

2006-10-17 Thread bugzilla-daemon
http://openib.org/bugzilla/show_bug.cgi?id=263 --- Comment #10 from [EMAIL PROTECTED] 2006-10-16 23:05 --- I'm trying debug_level=1 now, sorry for the delay, but I wanted to finish other rc7 testing.

[openib-general] ethtool support for ipoib

2006-10-17 Thread Shirley Ma
I am going to add the ethtool ops below to ipoib. Any comments? Once ethtool support is added, GSO can be read and set directly through ethtool, as Michael pointed out earlier. static struct ethtool_ops ipoib_ethtool_ops = { .get_settings = ipoib_get_settings, .set_settings

Re: [openib-general] ethtool support for ipoib

2006-10-17 Thread Michael S. Tsirkin
Quoting r. Shirley Ma [EMAIL PROTECTED]: /* can be added later once ipoib support sg .get_sg = ethtool_op_get_sg, .set_sg = ethtool_op_set_sg, */ The difficulty here is that sg currently requires checksum offloading in netdevice. -- MST ___

[openib-general] sysfs exposure of port counters useless?

2006-10-17 Thread Michael Newton
On Tue May 9 05:06:13 PDT 2006, Leonid Arsh leonida at voltaire.com posted a patch under the Subject [openib-general][RFC][PATCH] core/sysfs.c: ability to reset port counters in which /sys/class/infiniband/*/ports/*/counters/* were made writeable, so that they could be reset by writing zero

Re: [openib-general] [openfabrics-ewg] OFED 1.1 RC7 fork() issue.

2006-10-17 Thread Or Gerlitz
Tang, Changqing wrote: We tested RC7, but fork() does not work: 1. system() causes IB to fail. 2. fork(), child calling exit(0) immediately also causes IB to fail. Anyone has tested fork() related issue ? --CQ Tang, HP-MPI Hi CQ, The fork() support patches were incorporated into

Re: [openib-general] [openfabrics-ewg] OFED 1.1 RC7 fork() issue.

2006-10-17 Thread glebn
On Tue, Oct 17, 2006 at 09:21:12AM +0200, Or Gerlitz wrote: Tang, Changqing wrote: We tested RC7, but fork() does not work: 1. system() causes IB to fail. 2. fork(), child calling exit(0) immediately also causes IB to fail. Anyone has tested fork() related issue ? What kernel are you

Re: [openib-general] [openfabrics-ewg] OFED 1.1 RC7 fork() issue.

2006-10-17 Thread Tziporet Koren
[EMAIL PROTECTED] wrote: On Tue, Oct 17, 2006 at 09:21:12AM +0200, Or Gerlitz wrote: Tang, Changqing wrote: We tested RC7, but fork() does not work: 1. system() causes IB to fail. 2. fork(), child calling exit(0) immediately also causes IB to fail. Anyone has tested fork() related

Re: [openib-general] [openfabrics-ewg] We wish to do the 1.1 release next week

2006-10-17 Thread Moni Levy
Sounds like a great idea. We don't have blocking issues, but would be happy to test the pre-release. Moni On 10/16/06, Tziporet Koren [EMAIL PROTECTED] wrote: This patch is already in. We will publish latest pre-release version tomorrow so everybody can do latest checks. Is this OK?

Re: [openib-general] [openfabrics-ewg] OFED 1.1 RC7 fork() issue.

2006-10-17 Thread glebn
On Tue, Oct 17, 2006 at 09:52:44AM +0200, Tziporet Koren wrote: [EMAIL PROTECTED] wrote: On Tue, Oct 17, 2006 at 09:21:12AM +0200, Or Gerlitz wrote: Tang, Changqing wrote: We tested RC7, but fork() does not work: 1. system() causes IB to fail. 2. fork(), child calling exit(0)

[openib-general] [ucma] executing the ucmatose with local IPoIB IP address of port 2 fails

2006-10-17 Thread Dotan Barak
Hi. I'm using the following configuration: * Host Architecture : x86_64 Linux Distribution: SUSE Linux Enterprise Server 10 (x86_64) VERSION = 10 Kernel Version: 2.6.16.21-0.8-smp GCC Version : gcc (GCC) 4.1.0 (SUSE Linux)

Re: [openib-general] [PATCHv2] osm: fixing OSM_LOG_DIR and OSM_CACHE_DIR treatment

2006-10-17 Thread Hal Rosenstock
On Mon, 2006-10-16 at 18:39, Yevgeny Kliteynik wrote: Hi Hal. [snip] So will you supply another patch with this approach ? -- Hal Here it is. -- Yevgeny Signed-off-by: Yevgeny Kliteynik [EMAIL PROTECTED] Thanks. Applied. -- Hal

[openib-general] some OFED source/build questions

2006-10-17 Thread Or Gerlitz
Hi Vlad, I have a few questions on some issues I have run into while working with OFED-1.1-rc7 over RH4 U3. I did some reading before, so if it's RTFM, please point me to the relevant doc... 1) backports patches Looking in ./SOURCES/openib-1.1/configure there is a case for each system per

Re: [openib-general] OFED 1.1 release schedule

2006-10-17 Thread Tziporet Koren
Arlin Davis wrote: Can someone double check the ib_cm kernel patch (sean_cm_drep_on_not_found.patch) again and verify the build process. I don't see the cm_issue_drep symbol in an RC7 build. From the build logs it appears that the patch is applied but I do not see the symbol in the

Re: [openib-general] [openfabrics-ewg] OFED 1.1 release schedule

2006-10-17 Thread Ishai Rabinovitz
Hi, Let me first explain why the current OFED release does not support SRP-HA on RHEL4. SRP-HA is using Device Mapper multipath. Multipath prerequisites include udev of higher version than 050. RHEL4 distributions includes udev 039. udev is an important part of the distribution and I do

Re: [openib-general] OpenFabrics Developer Summit at SC06, Tampa Nov 16 - 17

2006-10-17 Thread Jeff Squyres
I have copied this information to the wiki -- please make all updates there so that there is a single reference point to find all the information about the meeting. Thanks! https://openib.org/tiki/tiki-index.php?page=Meeting+Minutes On Oct 15, 2006, at 5:02 PM, Bill Boas wrote: To

[openib-general] Tools for development

2006-10-17 Thread Jeff Squyres
Per the teleconference last week, I'd like to survey the developers about the tools that should be installed on the new OFA server (is there a plan to migrate there yet?). As I understand it (please correct me if I get this wrong): - The community has decided to stay with git for kernel

Re: [openib-general] [PATCH] opensm/diags: fix regular expression in dump_lfts.sh

2006-10-17 Thread Hal Rosenstock
On Tue, 2006-10-17 at 07:42, Sasha Khapyorsky wrote: This fixes regular expression in dump_lfts.sh script, which is used for switch's LIDs extraction. Signed-off-by: Sasha Khapyorsky [EMAIL PROTECTED] --- diags/scripts/dump_lfts.sh |2 +- 1 files changed, 1 insertions(+), 1

Re: [openib-general] RHEL5 and OFED ...

2006-10-17 Thread Michael S. Tsirkin
Quoting r. Doug Ledford [EMAIL PROTECTED]: Subject: Re: RHEL5 and OFED ... On Sat, 2006-10-14 at 22:14 +0200, Michael S. Tsirkin wrote: Quoting r. Doug Ledford [EMAIL PROTECTED]: Sorry. RHEL5 Beta1 has been out for a while, but OFED 1.1 still isn't done yet. Obviously, I wasn't able

Re: [openib-general] Tools for development

2006-10-17 Thread Michael S. Tsirkin
Quoting Jeff Squyres [EMAIL PROTECTED]: - Some version of git and svn are installed on the new server, but that's about it The tool versions installed on openib are ancient. Can site admins please install latest svn and git versions from source? -- MST

Re: [openib-general] [RFC] Notice/InformInfo event reporting

2006-10-17 Thread Rimmer, Todd
From: Sean Hefty [mailto:[EMAIL PROTECTED] Sent: Monday, October 16, 2006 6:57 PM To: Rimmer, Todd Cc: Matt Leininger; openib Subject: Re: [openib-general] [RFC] Notice/InformInfo event reporting Rimmer, Todd wrote: In a functioning fabric, events will be rare. However its when you

Re: [openib-general] Tools for development

2006-10-17 Thread Michael S. Tsirkin
Quoting r. Jeff Squyres [EMAIL PROTECTED]: It seems like trac can integrate with both SVN and git and would also provide us with integrated wiki capabilities. One feature that bugzilla has (and that seems to be disabled in openib bugzilla :() is mail integration, where I can Cc bugzilla and

Re: [openib-general] sysfs exposure of port counters useless?

2006-10-17 Thread Michael S. Tsirkin
Quoting Michael Newton [EMAIL PROTECTED]: Also, these counts do not wrap: they peg at all 1s. At infiniband speeds, these counts can peg out very quickly indeed, to the point they can really only be of use if they can be reset each time they're read. I mostly use them to verify network health.

Re: [openib-general] [RFC] Notice/InformInfo event reporting

2006-10-17 Thread Hal Rosenstock
On Tue, 2006-10-17 at 09:43, Rimmer, Todd wrote: From: Sean Hefty [mailto:[EMAIL PROTECTED] Sent: Monday, October 16, 2006 6:57 PM To: Rimmer, Todd Cc: Matt Leininger; openib Subject: Re: [openib-general] [RFC] Notice/InformInfo event reporting Rimmer, Todd wrote: In a

Re: [openib-general] [openfabrics-ewg] OFED 1.1 RC7 fork() issue.

2006-10-17 Thread Michael S. Tsirkin
Quoting r. [EMAIL PROTECTED] [EMAIL PROTECTED]: From the OFED release notes: 3. Fork support from kernel 2.6.12 and above is available provided that applications do not use threads. Only system() or fork() and immediate exec() are supported. system() works because parent calls

Re: [openib-general] [openfabrics-ewg] OFED 1.1 RC7 fork() issue.

2006-10-17 Thread Tang, Changqing
What kernel are you testing? system() should work (in non threaded apps at least) with modern kernel. -- Gleb. ___ From the OFED release notes: 3. Fork support from kernel 2.6.12 and above is available provided

Re: [openib-general] sysfs exposure of port counters useless?

2006-10-17 Thread Hal Rosenstock
On Tue, 2006-10-17 at 09:55, Rimmer, Todd wrote: From: Michael Newton Sent: Tuesday, October 17, 2006 3:02 AM To: openib-general@openib.org Subject: [openib-general] sysfs exposure of port counters useless? These are 32 bit counters. The rcv/xmit_data counters count 32-bit

Re: [openib-general] [openfabrics-ewg] OFED 1.1 RC7 fork() issue.

2006-10-17 Thread Tziporet Koren
system() works because parent calls wait(). fork() and immediate exec() may very well fail. I propose to fix the release notes. Hi Gleb, Can you send me the correct description for the RN. Thanks, Tziporet

Re: [openib-general] [openfabrics-ewg] OFED 1.1 RC7 fork() issue.

2006-10-17 Thread glebn
On Tue, Oct 17, 2006 at 09:02:19AM -0500, Tang, Changqing wrote: What kernel are you testing? system() should work (in non threaded apps at least) with modern kernel. -- Gleb. ___ From the OFED

Re: [openib-general] uDAPL problem

2006-10-17 Thread Steve Wise
On Mon, 2006-10-16 at 18:01 -0400, Steve Smaldone wrote: Hi, Sorry for replying to myself, but I loaded rdma_ucm and the rdma_cm device appears. However, it now fails with the following: $ ./dapltest -T S -D IB1 ... DAT Registry: dat_ia_openv (IB1,1:2,0) called DAT Registry: IA IB1,

Re: [openib-general] [openfabrics-ewg] OFED 1.1 RC7 fork() issue.

2006-10-17 Thread glebn
On Tue, Oct 17, 2006 at 04:03:27PM +0200, Tziporet Koren wrote: system() works because parent calls wait(). fork() and immediate exec() may very well fail. I propose to fix the release notes. Hi Gleb, Can you send me the correct description for the RN. 3. Fork support from kernel

[openib-general] srp trouble on RHEL4 U4

2006-10-17 Thread Mirochnick Natalia
Hello, I'm trying to set up an SRP connection (SRP in OFED 1.0). IB card is Silverstorm 7000. ib_srp module is loaded, but after an attempt to create an SRP device (as described in the manual srp_release_notes.txt) the error appears in /var/log/messages: kernel: REJ reason 0x0 What's wrong?

Re: [openib-general] Tools for development

2006-10-17 Thread Steve Wise
On Tue, 2006-10-17 at 09:17 -0400, Jeff Squyres wrote: Per the teleconference last week, I'd like to survey the developers about the tools that should be installed on the new OFA server (is there a plan to migrate there yet?). As I understand it (please correct me if I get this wrong):

Re: [openib-general] Tools for development

2006-10-17 Thread Michael S. Tsirkin
Quoting r. Roland Dreier [EMAIL PROTECTED]: Subject: Re: [openib-general] Tools for development Michael The tool versions installed on openib are ancient. Can Michael site admins please install latest svn and git versions Michael from source? What distro is on the new

Re: [openib-general] Tools for development

2006-10-17 Thread Roland Dreier
Michael But I think while generally using distro-supplied Michael packages is the thing to do, for svn/git it makes sense Michael to get the latest and greatest and do the updates manually Michael since they are the main services we get from openib.org - Michael so getting more

Re: [openib-general] Tools for development

2006-10-17 Thread Roland Dreier
-- Was there a plan for any consolidation of the various git repositories?) I don't see any reason to consolidate -- the whole point of git is that it makes distributed development easier. Being able to have a private tree that I can screw up and rebuild whenever I need to is kind of

Re: [openib-general] RHEL5 and OFED ...

2006-10-17 Thread Michael S. Tsirkin
Quoting r. Doug Ledford [EMAIL PROTECTED]: Subject: Re: RHEL5 and OFED ... On Tue, 2006-10-17 at 15:35 +0200, Michael S. Tsirkin wrote: Quoting r. Doug Ledford [EMAIL PROTECTED]: Subject: Re: RHEL5 and OFED ... On Sat, 2006-10-14 at 22:14 +0200, Michael S. Tsirkin wrote:

Re: [openib-general] uDAPL problem

2006-10-17 Thread Stephen Smaldone
Steve Wise wrote: On Mon, 2006-10-16 at 18:01 -0400, Steve Smaldone wrote: Hi, Sorry for replying to myself, but I loaded rdma_ucm and the rdma_cm device appears. However, it now fails with the following: $ ./dapltest -T S -D IB1 ... DAT Registry: dat_ia_openv (IB1,1:2,0) called

Re: [openib-general] [PATCH] use mmiowb after doorbell ring

2006-10-17 Thread Roland Dreier
OK, here's what I actually put in my tree. Can you eyeball this and maybe give it a quick test? If it looks good to you, I'll send it on to the stable team for 2.6.18.x. - R commit 1f5c23e2c10d642a23aa3ebb449670a5184b6aab Author: Arthur Kepner [EMAIL PROTECTED] Date: Mon Oct 16 20:22:35

Re: [openib-general] [PATCH] use mmiowb after doorbell ring

2006-10-17 Thread Michael S. Tsirkin
Quoting r. Roland Dreier [EMAIL PROTECTED]: Subject: Re: [PATCH] use mmiowb after doorbell ring OK, here's what I actually put in my tree. Can you eyeball this and maybe give it a quick test? If it looks good to you, I'll send it on to the stable team for 2.6.18.x. BTW, something like

Re: [openib-general] ethtool support for ipoib

2006-10-17 Thread Shirley Ma
Michael S. Tsirkin [EMAIL PROTECTED] wrote on 10/16/2006 11:12:03 PM: Quoting r. Shirley Ma [EMAIL PROTECTED]: /* can be added later once ipoib support sg .get_sg = ethtool_op_get_sg, .set_sg = ethtool_op_set_sg, */ The difficulty here is that sg currently requires checksum

Re: [openib-general] [openfabrics-ewg] OFED 1.1 RC7 fork() issue.

2006-10-17 Thread Tang, Changqing
3. Fork support from kernel 2.6.12 and above is available provided that applications do not use threads. The fork() is supported as long as parent process does not run before child exits or calls exec(). After fork(), in child, before exec(), can we call printf(), putenv(), or even re-direct
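The pattern the release note describes (parent blocked while the child goes straight to exec()) can be sketched in plain C as below. This is only an illustration of the system()-like shape discussed in the thread: `run_child` is a hypothetical helper, it uses no verbs calls, and it does not by itself guarantee anything about registered memory.

```c
#include <assert.h>
#include <errno.h>
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

/*
 * Run argv[0] in a child process, system()-style.
 * The child does nothing between fork() and exec(); the parent does
 * nothing except block in waitpid() until the child is gone.
 * Returns the child's exit status, or -1 on error or abnormal exit.
 */
int run_child(char *const argv[])
{
    assert(argv != NULL && argv[0] != NULL);

    pid_t pid = fork();
    if (pid < 0)
        return -1;

    if (pid == 0) {
        /* Child: exec immediately; no printf(), no putenv(). */
        execvp(argv[0], argv);
        _exit(127);             /* only reached if exec failed */
    }

    /* Parent: wait right away, retrying if a signal interrupts us. */
    int status;
    while (waitpid(pid, &status, 0) < 0) {
        if (errno != EINTR)
            return -1;
    }
    return WIFEXITED(status) ? WEXITSTATUS(status) : -1;
}
```

Calling `run_child` with `{"sh", "-c", "exit 3", NULL}` should return 3. Whether extra work in the child before exec() is safe is exactly the question raised above, and this sketch deliberately avoids it.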

Re: [openib-general] Tools for development

2006-10-17 Thread Matt Leininger
On Tue, 2006-10-17 at 07:49 -0700, Roland Dreier wrote: Michael The tool versions installed on openib are ancient. Can Michael site admins please install latest svn and git versions Michael from source? What distro is on the new openfabrics.org server? Ubuntu. If it's

Re: [openib-general] [openfabrics-ewg] OFED 1.1 RC7 fork() issue.

2006-10-17 Thread glebn
On Tue, Oct 17, 2006 at 10:34:23AM -0500, Tang, Changqing wrote: 3. Fork support from kernel 2.6.12 and above is available provided that applications do not use threads. The fork() is supported as long as parent process does not run before child exits or calls exec(). After fork(), in

[openib-general] [PATCH] osm: reviewing osmtest - osmt_multicast.c

2006-10-17 Thread Yevgeny Kliteynik
Hi Hal Fixing more things in the multicast test flow. Still have things to do in case when multicast group removal fails, and have to add some cleanup (as we've discussed previously). -- Yevgeny Signed-off-by: Yevgeny Kliteynik [EMAIL PROTECTED] Index: osmtest/osmt_multicast.c

Re: [openib-general] Tools for development

2006-10-17 Thread Sasha Khapyorsky
On 09:17 Tue 17 Oct , Jeff Squyres wrote: Per the teleconference last week, I'd like to survey the developers about the tools that should be installed on the new OFA server (is there a plan to migrate there yet?). As I understand it (please correct me if I get this wrong): - The

Re: [openib-general] Tools for development

2006-10-17 Thread Sasha Khapyorsky
On 17:04 Tue 17 Oct , Michael S. Tsirkin wrote: Quoting r. Steve Wise [EMAIL PROTECTED]: At the risk of opening a can of worms, is there any reason we don't move the user stuff into its own git tree? This would get rid of svn altogether... If we do, that should probably be multiple

Re: [openib-general] [openfabrics-ewg] OFED 1.1 RC7 fork() issue.

2006-10-17 Thread Tziporet Koren
Tang, Changqing wrote: Thanks, I still use 2.6.9-34, Or Gerlitz told me that fork() support is only in libibverbs1.1 which is not released yet. Both OFED 1.0 and 1.1 use libibverbs1.0, is it still true ? --CQ You need to make a difference between full fork support that will be

Re: [openib-general] [openfabrics-ewg] OFED 1.1 RC7 fork() issue.

2006-10-17 Thread Tang, Changqing
Thanks for the clarification. --CQ You need to make a difference between full fork support that will be available only in libibverbs1.1 and the system /fork exec fork support that is depend on the kernel only and available from kernel 2.6.12. See also the explanation from Gleb on this

Re: [openib-general] ethtool support for ipoib

2006-10-17 Thread Michael S. Tsirkin
Quoting r. Shirley Ma [EMAIL PROTECTED]: /* can be added later once ipoib support sg .get_sg = ethtool_op_get_sg, .set_sg = ethtool_op_set_sg, */ The difficulty here is that sg currently requires checksum offloading in netdevice. I read the discussion in net-dev. Hmm, any

Re: [openib-general] Tools for development

2006-10-17 Thread Michael S. Tsirkin
Quoting r. Matt Leininger [EMAIL PROTECTED]: Developers had requested git 1.4, but Ubuntu had an older version. We went ahead and installed git from source. I'd prefer to stick to Ubuntu packages if possible. We have much to gain from newer versions - just look at gitweb change log. But my

Re: [openib-general] [openfabrics-ewg] OFED 1.1 RC7 fork() issue.

2006-10-17 Thread Michael S. Tsirkin
Quoting r. [EMAIL PROTECTED] [EMAIL PROTECTED]: Subject: Re: [openfabrics-ewg] OFED 1.1 RC7 fork() issue. On Tue, Oct 17, 2006 at 10:34:23AM -0500, Tang, Changqing wrote: 3. Fork support from kernel 2.6.12 and above is available provided that applications do not use threads. The fork()

Re: [openib-general] [openfabrics-ewg] OFED 1.1 RC7 fork() issue.

2006-10-17 Thread glebn
On Tue, Oct 17, 2006 at 06:48:22PM +0200, Michael S. Tsirkin wrote: Quoting r. [EMAIL PROTECTED] [EMAIL PROTECTED]: Subject: Re: [openfabrics-ewg] OFED 1.1 RC7 fork() issue. On Tue, Oct 17, 2006 at 10:34:23AM -0500, Tang, Changqing wrote: 3. Fork support from kernel 2.6.12 and above

Re: [openib-general] OFED 1.1 release schedule

2006-10-17 Thread Sean Hefty
Tziporet Koren wrote: I checked it and saw that the patch is applied, but since in the patch Sean put the cm_issue_drep as a static, thus nm does not show it. from the patch: +static int cm_issue_drep(struct cm_port *port, cm_issue_rej is also static, but shows up. Do you really need the

Re: [openib-general] [PATCH] rdma_bind_addr() leaks a cma_dev reference count

2006-10-17 Thread Sean Hefty
Krishna Kumar2 wrote: Hmmm, OK, I will re-phrase this patch to reduce nesting. Something similar to: if (cma_any_addr...) { ret = rdma_translate_ip(..); if (ret) goto err1; mutex_lock ret = cma_acquire_dev mutex_unlock if (ret)

Re: [openib-general] OFED 1.1 release schedule

2006-10-17 Thread Michael S. Tsirkin
Quoting r. Sean Hefty [EMAIL PROTECTED]: Subject: Re: OFED 1.1 release schedule Tziporet Koren wrote: I checked it and saw that the patch is applied, but since in the patch Sean put the cm_issue_drep as a static, thus nm does not show it. from the patch: +static int cm_issue_drep(struct

Re: [openib-general] [PATCH] If addr_handler() got error, do not set state as OK

2006-10-17 Thread Sean Hefty
Krishna Kumar wrote: diff -ruNp org/drivers/infiniband/core/cma.c new/drivers/infiniband/core/cma.c --- org/drivers/infiniband/core/cma.c 2006-10-10 15:45:27.0 +0530 +++ new/drivers/infiniband/core/cma.c 2006-10-10 15:59:53.0 +0530 @@ -1515,6 +1515,8 @@ static void

Re: [openib-general] OFED 1.1 release schedule

2006-10-17 Thread Sean Hefty
Michael S. Tsirkin wrote: Could be a compiler thing: maybe cm_issue_rej is used in more than one place? To make sure, you can try removing the static keyword and see if this appears. That could be. cm_issue_rej is called from multiple locations, whereas cm_issue_drep is not. - Sean

Re: [openib-general] uDAPL problem

2006-10-17 Thread Arlin Davis
Stephen Smaldone wrote: Arlin Davis wrote: Steve Smaldone wrote: Hi, Sorry for replying to myself, but I loaded rdma_ucm and the rdma_cm device appears. However, it now fails with the following: $ ./dapltest -T S -D IB1 ... DAT Registry: dat_ia_openv (IB1,1:2,0) called DAT

Re: [openib-general] [PATCH] use mmiowb after doorbell ring

2006-10-17 Thread Roland Dreier
Michael BTW, something like this will be needed for userspace too? Ugh, I forgot about that. I don't think an mmiowb() equivalent is available from userspace. However, the problem only arises if userspace uses the same QP/CQ/SRQ from multiple nodes at the same time -- so maybe we can live

Re: [openib-general] ethtool support for ipoib

2006-10-17 Thread Roland Dreier
Shirley I read the discussion in net-dev. Since IB packet has its Shirley own CRC (ICRC, VCRC). Is it a good idea to enable Shirley checksum unnecessary in a pure IB Fabrics for large MTU Shirley 64K. It requires some negotiation. Does your prototype Shirley implementation for

Re: [openib-general] [PATCH] [RFC] cma_new_id can kfree on error instead of destroy_id

2006-10-17 Thread Sean Hefty
Krishna Kumar wrote: cma_new_id() does not require to do destroy_id(), instead it can kfree(), since nothing is allocated on that id. Posting this as an RFC in case anyone feels that create_id should be cleaned up by destroy_id (even if redundant). I can go either way on this. It's a little

Re: [openib-general] [PATCH] use mmiowb after doorbell ring

2006-10-17 Thread Michael S. Tsirkin
Quoting r. Roland Dreier [EMAIL PROTECTED]: Subject: Re: [PATCH] use mmiowb after doorbell ring Michael BTW, something like this will be needed for userspace too? Ugh, I forgot about that. I don't think an mmiowb() equivalent is available from userspace. Isn't this just an asm()

Re: [openib-general] [PATCH] Use time_after_eq() instead of time_after() in queue_req()

2006-10-17 Thread Sean Hefty
Acked-by: Sean Hefty [EMAIL PROTECTED] Roland, this looks good for 2.6.20. How would you like to handle pulling in patches like these? Once OFA has git up, would it be easier to pull them into my git tree, then request that you pull from there, or does this work okay? In queue_req(), use
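For readers unfamiliar with the distinction the patch title relies on, the following is a userspace sketch of wraparound-safe tick comparisons in the style of the kernel's time_after() and time_after_eq(). The names `ticks_after`, `ticks_after_eq` and `request_expired` are made up for this example; it is not the kernel's jiffies.h nor the queue_req() code from the patch.

```c
#include <assert.h>
#include <limits.h>

/* True if tick value a is strictly later than b, even across wraparound. */
static int ticks_after(unsigned long a, unsigned long b)
{
    return (long)(b - a) < 0;
}

/* True if tick value a is later than or equal to b. */
static int ticks_after_eq(unsigned long a, unsigned long b)
{
    return (long)(a - b) >= 0;
}

/*
 * Hypothetical expiry check: with the inclusive comparison, a request
 * whose timeout equals "now" already counts as expired; with the strict
 * comparison it only expires one tick later.
 */
int request_expired(unsigned long now, unsigned long timeout, int inclusive)
{
    return inclusive ? ticks_after_eq(now, timeout)
                     : ticks_after(now, timeout);
}
```

The only behavioural difference is the equal case: `request_expired(100, 100, 1)` is true while `request_expired(100, 100, 0)` is false. Both forms stay correct when the tick counter wraps, because the comparison is done on the signed difference.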

Re: [openib-general] ethtool support for ipoib

2006-10-17 Thread Michael S. Tsirkin
Quoting r. Roland Dreier [EMAIL PROTECTED]: No, it's never a good idea to turn off TCP or IP checksums. That leads to possibilities of silent data corruption too easily. never is probably too strong a word - hardware checksum offloading turns off checksumming in software, moving that to

Re: [openib-general] [PATCH] use mmiowb after doorbell ring

2006-10-17 Thread akepner
On Tue, 17 Oct 2006, Michael S. Tsirkin wrote: Quoting r. Roland Dreier [EMAIL PROTECTED]: Subject: Re: [PATCH] use mmiowb after doorbell ring Michael BTW, something like this will be needed for userspace too? Ugh, I forgot about that. I don't think an mmiowb() equivalent is available

Re: [openib-general] ethtool support for ipoib

2006-10-17 Thread Shirley Ma
What I suggested here is when it's connected mode with large MTU, set ib interface flag to CHECKSUM_UNNECESSARY. But this only works on packets not being routed off-net at the TCP layer. Thanks Shirley Ma IBM Linux Technology Center 15300 SW Koll Parkway Beaverton, OR 97006-6063 Phone(Fax):

Re: [openib-general] [PATCH] Fix some cancellation problems in process_req().

2006-10-17 Thread Sean Hefty
Krishna Kumar wrote: mutex_lock(&lock); list_for_each_entry_safe(req, temp_req, &req_list, list) { - if (req->status) { + if (req->status && req->status != -ECANCELED) { I think we just need: if (req->status == -ENODATA) {

Re: [openib-general] [PATCH] Rewrite cma_req_handler() to encapsulate common code.

2006-10-17 Thread Sean Hefty
Acked-by: Sean Hefty [EMAIL PROTECTED] Let me see how Roland would like to handle merging the patches going forward, but this one looks fine.

Re: [openib-general] [PATCH] Re-send ARP as prev ARP request could have got dropped.

2006-10-17 Thread Sean Hefty
Krishna Kumar wrote: Re-send ARP, since earlier ARP request could have got dropped/lost. This should be done in addr_resolve_remote() as doing it in rdma_resolve_ip() means sending ARP only once. This was intentional. Users can call rdma_resolve_ip() again to retry a timed out request. In

Re: [openib-general] [PATCH] use mmiowb after doorbell ring

2006-10-17 Thread Roland Dreier
I don't think an mmiowb() equivalent is available from userspace. Isn't this just an asm() command? Nope, look at the kernel source, specifically arch/ia64/sn/kernel/iomv.c BTW, I think we really should implement proper rmb/wmb in arch.h. Last time I looked we only had compiler

Re: [openib-general] [PATCH] use mmiowb after doorbell ring

2006-10-17 Thread Michael S. Tsirkin
Quoting r. Roland Dreier [EMAIL PROTECTED]: Subject: Re: [PATCH] use mmiowb after doorbell ring I don't think an mmiowb() equivalent is available from userspace. Isn't this just an asm() command? Nope, look at the kernel source, specifically arch/ia64/sn/kernel/iomv.c BTW, I

[openib-general] [PATCH] opensm: misc fixes in lft dump file parser

2006-10-17 Thread Sasha Khapyorsky
There are misc small fixes for lft dump parser: - merge ERROR and SYS logging in single osm_log() call - more strict strtoul() results checking - fix potential bugs with invalid dump files - break too long lines Signed-off-by: Sasha Khapyorsky [EMAIL PROTECTED] --- osm/opensm/osm_ucast_file.c |

Re: [openib-general] [ucma] executing the ucmatose with local IPoIB IP address of port 2 fails

2006-10-17 Thread Sean Hefty
scenario 2: fails SM was executed on port 2 i executed ucmatose server and ucmatose client with IPoIB IP address of port 2 here is the output of the client: ucmatose: starting client ucmatose: connecting ucmatose: event: 3, error: 0 receiving data transfers sending replies

Re: [openib-general] [PATCH] osm: reviewing osmtest - osmt_multicast.c

2006-10-17 Thread Hal Rosenstock
On Tue, 2006-10-17 at 12:07, Yevgeny Kliteynik wrote: Hi Hal Fixing more things in the multicast test flow. Still have things to do in case when multicast group removal fails, and have to add some cleanup (as we've discussed previously). -- Yevgeny Signed-off-by: Yevgeny Kliteynik

[openib-general] client-server small message performance issues

2006-10-17 Thread Pete Wyckoff
I'm trying to understand some performance variation in an Openib application, and wrote a small test program to simulate its behavior. Attached are the code and a plot of some results. Each dot in the plot shows the time for a single iteration in the code explained below. One client

[openib-general] OFED-1.1-pre1 is ready

2006-10-17 Thread Tziporet Koren
Hi All, OFED 1.1-pre1 is available: URL: https://openib.org/svn/gen2/branches/1.1/ofed/releases/OFED-1.1-pre1.tgz This follows the 1.1 release schedule I published yesterday, which all partners approved (Qlogic have not answered so I assumed it's OK with them too). Each company has 3 days for

Re: [openib-general] [PATCH] use mmiowb after doorbell ring

2006-10-17 Thread akepner
On Tue, 17 Oct 2006, Roland Dreier wrote: OK, here's what I actually put in my tree. Can you eyeball this and maybe give it a quick test? If it looks good to you, I'll send it on to the stable team for 2.6.18.x. Yep, looks fine, and it works on my Altix. Thanks, Roland. -- Arthur

Re: [openib-general] [PATCH] use mmiowb after doorbell ring

2006-10-17 Thread Roland Dreier
Michael kernel code does rmb rather than mb there. OK, but that's an optimization rather than a correctness issue: mb is stronger than rmb. The reason I did it that way was because I wasn't sure it was worth defining mb, rmb and wmb for userspace. - R.

Re: [openib-general] [PATCH] use mmiowb after doorbell ring

2006-10-17 Thread Michael S. Tsirkin
Quoting r. Roland Dreier [EMAIL PROTECTED]: Subject: Re: [PATCH] use mmiowb after doorbell ring Michael kernel code does rmb rather than mb there. OK, but that's an optimization rather than a correctness issue: mb is stronger than rmb. Very strange. Let's consider amd64: libibverbs

[openib-general] OFED 1.1-RC7 build problem on SLES10

2006-10-17 Thread Chris Dennett
I've been trying to install OFED 1.1 RC7 on an x86 server with a fresh install of SLES10 (32-bit). It errors out when trying to build the kernel modules. I've included what I think are the relevant log messages below. I've tried installing everything (minus iser and tvflash) or just the

Re: [openib-general] ethtool support for ipoib

2006-10-17 Thread Parks Fields
No, it's never a good idea to turn off TCP or IP checksums. That leads to possibilities of silent data corruption too easily. I totally agree...

Re: [openib-general] RHEL5 and OFED ...

2006-10-17 Thread Doug Ledford
On Tue, 2006-10-17 at 17:09 +0200, Michael S. Tsirkin wrote: Yeah, this is the rolling updates thing I was telling you about. The Beta1 kernel was 2.6.17+several git repos and patches. We've since updated to 2.6.18, but that was released as an update to the Beta1 isos and trees via RHN.

Re: [openib-general] [PATCH] osm: reviewing osmtest - osmt_multicast.c

2006-10-17 Thread Yevgeny Kliteynik
Hal Rosenstock wrote: On Tue, 2006-10-17 at 12:07, Yevgeny Kliteynik wrote: Hi Hal Fixing more things in the multicast test flow. Still have things to do in case when multicast group removal fails, and have to add some cleanup (as we've discussed previously). -- Yevgeny

Re: [openib-general] RHEL5 and OFED ...

2006-10-17 Thread Michael S. Tsirkin
Quoting Doug Ledford [EMAIL PROTECTED]: Evidently, I was mistaken and rhn is still populated with the beta1 rpms. So, I've made the latest kernel available on my web page as referenced below (amongst other rpms as well). However, it may still be a while before the rpms are fully populated as

Re: [openib-general] RHEL5 and OFED ...

2006-10-17 Thread Michael S. Tsirkin
On a tangent, is there a way to set up a cross-build environment that will build kernel modules for e.g. RHEL amd64 kernel on a 32 bit machine? I'm doing this now with gcc and kernel.org kernel I built myself from source. I guess I mostly need to get gcc and binutils SRPMs to generate

Re: [openib-general] [PATCH] osm: reviewing osmtest - osmt_multicast.c

2006-10-17 Thread Hal Rosenstock
On Tue, 2006-10-17 at 16:21, Yevgeny Kliteynik wrote: Hal Rosenstock wrote: On Tue, 2006-10-17 at 12:07, Yevgeny Kliteynik wrote: Hi Hal Fixing more things in the multicast test flow. Still have things to do in case when multicast group removal fails, and have to add some cleanup

Re: [openib-general] RHEL5 and OFED ...

2006-10-17 Thread Doug Ledford
On Tue, 2006-10-17 at 22:23 +0200, Michael S. Tsirkin wrote: Quoting Doug Ledford [EMAIL PROTECTED]: Evidently, I was mistaken and rhn is still populated with the beta1 rpms. So, I've made the latest kernel available on my web page as referenced below (amongst other rpms as well).

Re: [openib-general] ethtool support for ipoib

2006-10-17 Thread Shirley Ma
Parks Fields [EMAIL PROTECTED] wrote on 10/17/2006 01:12:48 PM: No, it's never a good idea to turn off TCP or IP checksums. That leads to possibilities of silent data corruption too easily. I totally agree... Have we ever seen silent data corruption in CHECKSUM_HW? Thanks Shirley

Re: [openib-general] ethtool support for ipoib

2006-10-17 Thread Roland Dreier
Shirley> Have we ever seen silent data corruption in CHECKSUM_HW? Well, a quick web search finds stuff like http://my.adsm.org/modules.php?op=modloadname=phpBB_14file=indexaction=viewtopictopic=23620 But what I was really talking about was the risk of sending IP packets without a checksum.

Re: [openib-general] [PATCH] use mmiowb after doorbell ring

2006-10-17 Thread Roland Dreier
Very strange. Let's consider amd64: libibverbs has #elif defined(__x86_64__) #define mb() asm volatile("" ::: "memory") So it's just a compiler barrier there. While linux has asm-x86_64/system.h #define rmb() asm volatile("lfence":::"memory") So rmb seems to be stronger
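To make the distinction in this message concrete, here is a minimal user-space sketch, assuming GCC or Clang inline assembly. The names `compiler_barrier`, `hw_rmb`, and `try_read_payload` are hypothetical and appear in neither libibverbs nor the kernel. The first macro only constrains the compiler; the second also emits an `lfence` instruction on x86_64 and falls back to a full barrier builtin elsewhere.

```c
#include <stdint.h>

/* Compiler-only barrier: emits no instruction, it just stops the
 * compiler from moving memory accesses across this point. */
#define compiler_barrier() __asm__ __volatile__("" ::: "memory")

/* Hardware read barrier (hypothetical name). On x86_64 this emits lfence;
 * on other targets it falls back to a full barrier, which is stronger
 * than needed but safe for illustration. */
#if defined(__x86_64__)
#define hw_rmb() __asm__ __volatile__("lfence" ::: "memory")
#else
#define hw_rmb() __sync_synchronize()
#endif

/* Read a payload only after a "ready" flag has been observed.
 * Returns 1 and stores the payload in *out if the flag was set, else 0. */
static int try_read_payload(const volatile uint32_t *flag,
                            const volatile uint32_t *payload,
                            uint32_t *out)
{
    if (*flag == 0)
        return 0;
    hw_rmb();            /* order the flag read before the payload read */
    *out = *payload;
    return 1;
}
```

Swapping `hw_rmb()` for `compiler_barrier()` in `try_read_payload` would leave the generated loads unfenced, which is the gap the message points at.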

Re: [openib-general] client-server small message performance issues

2006-10-17 Thread Roland Dreier
Basic ping pong is 25 us. That's fine as this is not a particularly optimal way to communicate. Each additional server adds 6 us. That seems like a lot of overhead just to do another pair of posts and polls, but not my major complaint. Look at the jump from 6 to 7 servers, 41 us.

Re: [openib-general] [PATCH] use mmiowb after doorbell ring

2006-10-17 Thread Michael S. Tsirkin
Quoting r. Roland Dreier [EMAIL PROTECTED]: But of course not all x86 processors support lfence/mfence True, but I don't think anyone is still running libibverbs on processors that don't. What happens on an older processor when you call lfence? -- MST

Re: [openib-general] [PATCH] Use time_after_eq() instead of time_after() in queue_req()

2006-10-17 Thread Roland Dreier
Roland, this looks good for 2.6.20. How would you like to handle pulling in patches like these? Once OFA has git up, would it be easier to pull them into my git tree, then request that you pull from there, or does this work okay? Git pulls are definitely the easiest, but I'm fine with
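The subject of this thread contrasts time_after() with time_after_eq(). As a rough illustration of the difference, here is a user-space sketch modeled on the kernel's wraparound-safe jiffies comparisons. The names `my_time_after` and `my_time_after_eq` are hypothetical, and the sketch assumes the usual two's-complement conversion from `unsigned long` to `long`. It does not reproduce the patch itself.

```c
#include <limits.h>

/* True if timestamp a is strictly later than b, even across counter
 * wraparound: the unsigned difference is reinterpreted as signed. */
static int my_time_after(unsigned long a, unsigned long b)
{
    return (long)(b - a) < 0;
}

/* True if timestamp a is later than or equal to b. The only difference
 * from my_time_after() is that equal timestamps count as "reached". */
static int my_time_after_eq(unsigned long a, unsigned long b)
{
    return (long)(a - b) >= 0;
}
```

With equal timestamps, `my_time_after(t, t)` is false while `my_time_after_eq(t, t)` is true, so code that compares against a timeout treats the exact expiry tick differently depending on which form it uses.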

Re: [openib-general] [PATCH] Rewrite cma_req_handler() to encapsulate common code.

2006-10-17 Thread Roland Dreier
OK, queued for 2.6.20

Re: [openib-general] RHEL5 and OFED ...

2006-10-17 Thread Doug Ledford
On Tue, 2006-10-17 at 22:28 +0200, Michael S. Tsirkin wrote: On a tangent, is there a way to set up a cross-build environment that will build kernel modules for e.g. RHEL amd64 kernel on a 32 bit machine? I'm doing this now with gcc and kernel.org kernel I built myself from source. I guess I

Re: [openib-general] [PATCH] use mmiowb after doorbell ring

2006-10-17 Thread Michael S. Tsirkin
Quoting r. Roland Dreier [EMAIL PROTECTED]: Another confusing thing is that asm-i386 defines mb() and rmb() just to be compiler barriers, I see: #define rmb() alternative("lock; addl $0,0(%%esp)", "lfence", X86_FEATURE_XMM2) as for mb() - I don't think our kernel code uses that so I think userspace

Re: [openib-general] [PATCH] use mmiowb after doorbell ring

2006-10-17 Thread Roland Dreier
Michael> True, but I don't think anyone is still running libibverbs on processors that don't. What happens on an older processor when you call lfence? You get an illegal instruction signal and the process dies I guess.
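One way to avoid the illegal-instruction case described here is to check for SSE2 at run time before issuing lfence. The sketch below assumes GCC or Clang, whose `__builtin_cpu_supports` builtin is available on x86 targets. The function names `cpu_has_sse2` and `checked_rmb` are hypothetical, and this is not what libibverbs does.

```c
#if defined(__i386__) || defined(__x86_64__)
/* Returns 1 if the running CPU reports SSE2 (and therefore lfence). */
static int cpu_has_sse2(void)
{
    __builtin_cpu_init();
    return __builtin_cpu_supports("sse2") ? 1 : 0;
}

/* Read barrier that only executes lfence when the CPU supports it;
 * otherwise it falls back to the compiler's full barrier builtin. */
static void checked_rmb(void)
{
    if (cpu_has_sse2())
        __asm__ __volatile__("lfence" ::: "memory");
    else
        __sync_synchronize();
}
#else
/* Non-x86 targets have no lfence; always use the generic barrier. */
static int cpu_has_sse2(void)
{
    return 0;
}

static void checked_rmb(void)
{
    __sync_synchronize();
}
#endif
```

A per-call check like this costs a branch on every barrier; caching the result once at start-up would be the obvious refinement, at the price of a little more state.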

Re: [openib-general] [PATCH] use mmiowb after doorbell ring

2006-10-17 Thread Roland Dreier
Another confusing thing is that asm-i386 defines mb() and rmb() just to be compiler barriers, I see: #define rmb() alternative("lock; addl $0,0(%%esp)", "lfence", X86_FEATURE_XMM2) Oops, you're right. I misread that file. OK, we probably want mb() to be more than a compiler barrier

Re: [openib-general] [PATCH] use mmiowb after doorbell ring

2006-10-17 Thread Michael S. Tsirkin
Quoting r. Roland Dreier [EMAIL PROTECTED]: But of course not all x86 processors support lfence/mfence which leads to some ugly issues of how to handle this lfence seems to be part of SSE2, and I don't think we really need sfence/mfence. We can just require SSE2 support:

[openib-general] [GIT PULL] please pull infiniband.git

2006-10-17 Thread Roland Dreier
Linus, please pull from master.kernel.org:/pub/scm/linux/kernel/git/roland/infiniband.git for-linus This tree is also available from kernel.org mirrors at: git://git.kernel.org/pub/scm/linux/kernel/git/roland/infiniband.git for-linus This includes various fixes found since 2.6.19-rc2:
