[openib-general] umad doc

2004-11-15 Thread Hal Rosenstock
Hi Roland, Should the user-mad.txt doc indicate /udev rather than /dev as follows: /udev files r.t. /dev files /udev/infiniband/mthca0/ports/1/mad r.t. /dev/infiniband/mthca0/ports/1/mad -- Hal ___ openib-general mailing list [EMAIL PROTECTED] http:/

Re: [openib-general] [PATCH] fix warning in mad.c

2004-11-15 Thread Hal Rosenstock
On Mon, 2004-11-15 at 23:18, Roland Dreier wrote: > flags for spin lock should be unsigned long, not int. Thanks. Applied. -- Hal ___ openib-general mailing list [EMAIL PROTECTED] http://openib.org/mailman/listinfo/openib-general To unsubscribe, pleas

[openib-general] warning: ipoibcfg no longer needed

2004-11-15 Thread Roland Dreier
I just committed a change to IPoIB that means ipoibcfg is no longer needed (and will no longer work). See the previous message in this thread, "[PATCH] Get rid of /proc/infiniband/ipoib_vlan", for full details. - Roland

[openib-general] [PATCH] Get rid of /proc/infiniband/ipoib_vlan

2004-11-15 Thread Roland Dreier
This kills off /proc/infiniband/ipoib_vlan in favor of a simpler sysfs interface. To create ib0.8001, you can now just do

    # echo 0x8001 > /sys/class/net/ib0/create_child

and to get rid of the interface,

    # echo 0x8001 > /sys/class/net/ib0/delete_child

(Better names for these files glad
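The two echo commands in this message could also be issued from a script. The following is a minimal sketch, not part of the patch: the helper names (child_control_path, write_pkey, child_name) and the sysfs_root parameter are hypothetical, and the child naming rule is only inferred from the single ib0.8001 example in the message. The sysfs_root parameter exists so the sketch can be tried against a scratch directory without root privileges or IPoIB hardware.

```python
import os
import tempfile


def child_control_path(parent, action, sysfs_root="/sys"):
    """Return the sysfs file used to create or delete a child interface."""
    if action not in ("create", "delete"):
        raise ValueError("action must be 'create' or 'delete'")
    return os.path.join(sysfs_root, "class", "net", parent, action + "_child")


def write_pkey(parent, pkey, action, sysfs_root="/sys"):
    """Write the P_Key in hex, mirroring `echo 0x8001 > .../create_child`."""
    path = child_control_path(parent, action, sysfs_root)
    with open(path, "w") as f:
        f.write("0x%04x\n" % pkey)
    return path


def child_name(parent, pkey):
    """Assumed naming rule, taken from the one example given: ib0 + 0x8001."""
    return "%s.%04x" % (parent, pkey)


if __name__ == "__main__":
    # Dry run against a throwaway directory instead of the real /sys.
    fake_root = tempfile.mkdtemp()
    os.makedirs(os.path.join(fake_root, "class", "net", "ib0"))
    print(write_pkey("ib0", 0x8001, "create", sysfs_root=fake_root))
    print(child_name("ib0", 0x8001))
```

On a real system the default sysfs_root of /sys would be used and the write would need root, exactly as with the shell commands.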

[openib-general] [PATCH] fix warning in mad.c

2004-11-15 Thread Roland Dreier
flags for spin lock should be unsigned long, not int. - R.

Index: infiniband/core/mad.c
===
--- infiniband/core/mad.c (revision 1239)
+++ infiniband/core/mad.c (working copy)
@@ -1353,7 +1353,7 @@
 { struct ib_ma

[openib-general] [PATCH] fix cleanup in MAD code when unloading HCA driver

2004-11-15 Thread Sean Hefty
After looking at the code, I believe that there's a race condition cleaning up in the MAD code when unloading the HCA driver. The MAD layer can be processing a received MAD when the driver unloads, which can result in accessing the receive queue after all MADs on the receive queue have been freed.

Re: [openib-general] Re: OpenIB BuiltIn Support ?

2004-11-15 Thread Tom Duffy
On Mon, 2004-11-15 at 15:22 -0800, Peter Buckingham wrote: > Roland Dreier wrote: > > Tom> I just tried with the latest gen2 openib bits on 2.6.10-rc2, > > Tom> mthca and ipoib builtin and everything builds and boots fine > > Tom> (at least on x86_64). > > > > Cool, thanks for testing.

Re: [openib-general] Re: OpenIB BuiltIn Support ?

2004-11-15 Thread Roland Dreier
Peter> So gen2 works. From what I understand OpenSM is not yet Peter> supported for gen2. What other things are still missing Peter> between gen1 and gen2? (sorry, this is probably a FAQ...) Easier to answer what works now in gen2: only IPoIB. Everything else (userspace verbs, CM, SDP

Re: [openib-general] Re: OpenIB BuiltIn Support ?

2004-11-15 Thread Peter Buckingham
Roland Dreier wrote: Tom> I just tried with the latest gen2 openib bits on 2.6.10-rc2, Tom> mthca and ipoib builtin and everything builds and boots fine Tom> (at least on x86_64). Cool, thanks for testing. For what it's worth, it works here on i386 as well. (Not very convenient for de

Re: [openib-general] Re: OpenIB BuiltIn Support ?

2004-11-15 Thread Roland Dreier
Tom> I just tried with the latest gen2 openib bits on 2.6.10-rc2, Tom> mthca and ipoib builtin and everything builds and boots fine Tom> (at least on x86_64). Cool, thanks for testing. For what it's worth, it works here on i386 as well. (Not very convenient for development though :)

Re: [openib-general] Solicited response with no matching send request

2004-11-15 Thread Hal Rosenstock
On Mon, 2004-11-15 at 17:37, Sean Hefty wrote: > It's just that before supporting this, I'd like > to make sure that routing unmatched responses is really the right solution. > > I.e. Is this something that kernel mode clients would need? I think you mean user mode clients. > Does it > make

Re: [openib-general] Solicited response with no matching send request

2004-11-15 Thread Sean Hefty
Hal Rosenstock wrote: If we want to route the MAD to the corresponding agent, however, we can do that. But doing this only seems useful if a client is duplicating functionality, which only makes sense to me for user-mode clients. If we want to limit this to user mode clients only, we would need

Re: [openib-general] Re: OpenIB BuiltIn Support ?

2004-11-15 Thread Roland Dreier
Peter> I have tried this with gen1 and things don't seem to play Peter> nice.. I've only tried it with mellanox's hca driver, does Peter> mthca work better when built-in? Yes, I'm sure gen1 is completely broken, as is mellanox's driver. mthca should work since it uses the correct PCI d

Re: [openib-general] Re: OpenIB BuiltIn Support ?

2004-11-15 Thread Tom Duffy
On Mon, 2004-11-15 at 14:11 -0800, Peter Buckingham wrote: > I have tried this with gen1 and things don't seem to play nice.. I've > only tried it with mellanox's hca driver, does mthca work better when > built-in? I just tried with the latest gen2 openib bits on 2.6.10-rc2, mthca and ipoib buil

Re: [openib-general] Solicited response with no matching send request

2004-11-15 Thread Hal Rosenstock
On Mon, 2004-11-15 at 17:15, Sean Hefty wrote: > If we want to route the MAD to the corresponding agent, however, we can > do that. But doing this only seems useful if a client is duplicating > functionality, which only makes sense to me for user-mode clients. If we want to limit this to user m

Re: [openib-general] Solicited response with no matching send request

2004-11-15 Thread Sean Hefty
Hal Rosenstock wrote: My personal take would be to avoid adding that complexity. E.g. a client sends a MAD with TID 5, cancels 5, sends 5, cancels 5, sends 5. A response is now received. What should the MAD layer do? I don't see issues with silently dropping any MAD that we're not ready to rec
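The TID 5 scenario above can be modeled in a few lines. This is a toy sketch under stated assumptions, not the MAD layer's code: MadAgent and its methods are hypothetical names, and matching is done on the TID alone. It shows why a response arriving after send/cancel/send/cancel/send cannot be attributed to a particular send, and what silently dropping an unmatched response looks like.

```python
class MadAgent:
    """Toy model of one client's outstanding solicited sends."""

    def __init__(self):
        self.outstanding = set()  # TIDs with a send still awaiting a response
        self.delivered = []       # (tid, payload) pairs handed to the client
        self.dropped = 0          # responses discarded because nothing matched

    def send(self, tid):
        self.outstanding.add(tid)

    def cancel(self, tid):
        # No record of the cancel is kept, so no extra timeout is needed.
        self.outstanding.discard(tid)

    def receive_response(self, tid, payload):
        """Deliver the response if a send with this TID is outstanding."""
        if tid in self.outstanding:
            self.outstanding.remove(tid)
            self.delivered.append((tid, payload))
            return True
        self.dropped += 1  # silently drop the unmatched response
        return False


if __name__ == "__main__":
    agent = MadAgent()
    for step in ("send", "cancel", "send", "cancel", "send"):
        getattr(agent, step)(5)
    # This response may answer any of the three sends; the model cannot tell.
    print(agent.receive_response(5, "reply"))
    # A second response for TID 5 finds nothing outstanding and is dropped.
    print(agent.receive_response(5, "late reply"))
```

In this model a client that reuses a TID after cancelling accepts the ambiguity itself, which keeps the matching logic free of per-cancel state.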

Re: [openib-general] Re: OpenIB BuiltIn Support ?

2004-11-15 Thread Peter Buckingham
Roland Dreier wrote: Hal> Should IB build as either built-in or modules ? (I usually Hal> build everything as modules). If built-in should work, does Hal> everything IB need to be built in rather than as modules ? I haven't actually tried it but I think any combination of 'y' and 'm' fo

[openib-general] Re: OpenIB BuiltIn Support ?

2004-11-15 Thread Roland Dreier
Hal> Should IB build as either built-in or modules ? (I usually Hal> build everything as modules). If built-in should work, does Hal> everything IB need to be built in rather than as modules ? I haven't actually tried it but I think any combination of 'y' and 'm' for config options tha

Re: [openib-general] Solicited response with no matching send request

2004-11-15 Thread Roland Dreier
Hal> So do we just keep the cancel around for some time period to Hal> make sure this doesn't occur ? If so, should cancel also have Hal> its own timeout or should some arbitrary timeout be used to Hal> handle this case ? I don't think we should worry about this. If a consumer sen

Re: [openib-general] Signed-off-by: lines

2004-11-15 Thread Matt Leininger
On Mon, 2004-11-15 at 10:16 -0800, Roland Dreier wrote: > By the way, for our initial submission upstream, I am planning on > submitting all the patches with my own > > Signed-off-by: Roland Dreier <[EMAIL PROTECTED]> > > line, of course preserving any other Signed-off-by: lines that already > ex

[openib-general] OpenIB BuiltIn Support ?

2004-11-15 Thread Hal Rosenstock
Hi Roland, Should IB build as either built-in or modules ? (I usually build everything as modules). If built-in should work, does everything IB need to be built in rather than as modules ? Just wondering what the expectations should be here. Thanks. -- Hal

Re: [openib-general] Solicited response with no matching send request

2004-11-15 Thread Hal Rosenstock
On Mon, 2004-11-15 at 16:36, Sean Hefty wrote: > Hal Rosenstock wrote: > > > After Roland's query this AM, I am looking at this some more: > > > > On Wed, 2004-11-10 at 13:43, Sean Hefty wrote: > > > >>The second case where I can see this happening is if the client canceled > >>the send, and I'

Re: [openib-general] Solicited response with no matching send request

2004-11-15 Thread Hal Rosenstock
After Roland's query this AM, I am looking at this some more: On Wed, 2004-11-10 at 13:43, Sean Hefty wrote: > The second case where I can see this happening is if the client canceled > the send, and I'm not sure that we'd want to give the client an > unmatched response in this case. So do we j

RE: [openib-general] Upstream submission

2004-11-15 Thread Woodruff, Robert J
Paul Baxter wrote, >While I am delighted that the lower layers are sufficiently stable to >warrant being considered for code review/inclusion in the kernel, I am >slightly surprised. >Has the code been used in anger enough? I think that Roland is suggesting we submit it for review now, not

[openib-general] Re: [PATCH] fix sparse warnings in mthca

2004-11-15 Thread Roland Dreier
Thanks, applied. I'm cross-compiling for lots of archs but I only run sparse on i386. It's always something... ;) (Thanks for the Signed-off-by: line too) - R.

Re: [openib-general] Solicited response with no matching send request

2004-11-15 Thread Sean Hefty
Hal Rosenstock wrote: After Roland's query this AM, I am looking at this some more: On Wed, 2004-11-10 at 13:43, Sean Hefty wrote: The second case where I can see this happening is if the client canceled the send, and I'm not sure that we'd want to give the client an unmatched response in this ca

[openib-general] [PATCH] fix sparse warnings in mthca

2004-11-15 Thread Tom Duffy
Was getting warnings like: "warning: Using plain integer as NULL pointer" when sparse checking on x86_64.

Signed-off-by: Tom Duffy <[EMAIL PROTECTED]>

Index: drivers/infiniband/hw/mthca/mthca_doorbell.h
===
--- drivers/infiniband/hw/

RE: [openib-general] Upstream submission

2004-11-15 Thread Woodruff, Robert J
Doug Ledford wrote, >Boo! ;-) Ditto what Roland said, Welcome. woody

[openib-general] Re: [PATCH] collapse MAD function calls

2004-11-15 Thread Hal Rosenstock
On Mon, 2004-11-15 at 15:04, Sean Hefty wrote: > On Mon, 15 Nov 2004 14:50:06 -0500 > Hal Rosenstock <[EMAIL PROTECTED]> wrote: > > > On Mon, 2004-11-15 at 13:29, Sean Hefty wrote: > > > On Fri, 12 Nov 2004 22:08:14 -0500 > > > Hal Rosenstock <[EMAIL PROTECTED]> wrote: > > > > This patch looks lik

Re: [openib-general] Upstream submission

2004-11-15 Thread Roland Dreier
Paul> Has the code been used in anger enough? Paul> There seem to be a lot of bugs still being discovered daily. I think in most scenarios IPoIB is quite stable. I've run many gigabytes of traffic without trouble. There may still be corner cases with module unloading and the like, but I

[openib-general] Re: [PATCH] collapse MAD function calls

2004-11-15 Thread Sean Hefty
On Mon, 15 Nov 2004 14:50:06 -0500 Hal Rosenstock <[EMAIL PROTECTED]> wrote: > On Mon, 2004-11-15 at 13:29, Sean Hefty wrote: > > On Fri, 12 Nov 2004 22:08:14 -0500 > > Hal Rosenstock <[EMAIL PROTECTED]> wrote: > > > This patch looks like it includes the previous patch and due to this 2 > > > larg

Re: [openib-general] Upstream submission

2004-11-15 Thread Paul Baxter
While I am delighted that the lower layers are sufficiently stable to warrant being considered for code review/inclusion in the kernel, I am slightly surprised. Has the code been used in anger enough? There seem to be a lot of bugs still being discovered daily. Wouldn't having at least a prelim

[openib-general] [PATCH] mad: In handle_outgoing_smp, only match response if generated

2004-11-15 Thread Hal Rosenstock
mad: In handle_outgoing_smp, only match response if generated (based on comment from Roland)

Index: mad.c
===
--- mad.c (revision 1230)
+++ mad.c (working copy)
@@ -394,6 +394,10 @@
 goto error1;

[openib-general] Re: [PATCH] collapse MAD function calls

2004-11-15 Thread Sean Hefty
Hal Rosenstock wrote: Patch now applies but I get the following compile errors: drivers/infiniband/core/mad.c: In function `ib_mad_change_qp_state_to_init': drivers/infiniband/core/mad.c:1708: warning: declaration of `qp' shadows a parameter drivers/infiniband/core/mad.c:1716: `i' undeclared (first

[openib-general] Re: [PATCH] collapse MAD function calls

2004-11-15 Thread Hal Rosenstock
On Mon, 2004-11-15 at 13:29, Sean Hefty wrote: > On Fri, 12 Nov 2004 22:08:14 -0500 > Hal Rosenstock <[EMAIL PROTECTED]> wrote: > > This patch looks like it includes the previous patch and due to this 2 > > large hunks are rejected. Can you regenerate this ? > > Updated patch. Patch now applies b

Re: [openib-general] Upstream submission

2004-11-15 Thread Roland Dreier
Doug> Suggestions for items I can read, web sites I should visit Doug> in order to help get me up to speed, etc. welcomed. Doug, First off, welcome! Unfortunately there's not much to read about InfiniBand beyond the current IB spec. However, I think chapter 3 is actually quite a nice in

Re: [openib-general] Upstream submission

2004-11-15 Thread Doug Ledford
On Mon, 2004-11-15 at 08:52 -0800, Roland Dreier wrote: > Just to focus our minds, I would like to propose that we aim to post a > first version of InfiniBand patches for review to linux-kernel next > Monday, November 22. Boo! ;-) I'll echo the sentiment that this is a good idea. While I'm pipin

[openib-general] Re: [PATCH] collapse MAD function calls

2004-11-15 Thread Sean Hefty
On Fri, 12 Nov 2004 22:08:14 -0500 Hal Rosenstock <[EMAIL PROTECTED]> wrote:
> This patch looks like it includes the previous patch and due to this 2
> large hunks are rejected. Can you regenerate this ?

Updated patch. - Sean

Index: core/mad.c
===

[openib-general] Signed-off-by: lines

2004-11-15 Thread Roland Dreier
By the way, for our initial submission upstream, I am planning on submitting all the patches with my own Signed-off-by: Roland Dreier <[EMAIL PROTECTED]> line, of course preserving any other Signed-off-by: lines that already exist. However, for the future, it would be a good idea to make sure th

[openib-general] Re: IPoIB removal issue

2004-11-15 Thread Roland Dreier
Hal> unregister_netdevice: waiting for ib0 to become free. Usage count = 1 Someone is still holding a reference to the ib0 device. I don't see anything in the IPoIB code that could be doing it, so it seems like someone outside the driver must be doing it. - R.

Re: [openib-general] Upstream submission

2004-11-15 Thread Roland Dreier
Tom> Is there a reason to break up into patches code in Tom> drivers/infiniband? I think so: ease of review. A single 15000 line patch is not going to be very readable. Breaking it up into multiple pieces makes the architecture a little clearer and also helps naturally organize the repli

[openib-general] IPoIB removal issue

2004-11-15 Thread Hal Rosenstock
Hi, The ethernet on this machine is DHCP'd. Some network glitch (I think) followed by trying to remove the ipoib modules caused the following to be displayed in the console logs. Any ideas ? Thanks. -- Hal

Nov 15 10:44:28 hpc-1 network: Shutting down interface eth1: succeeded
Nov 15 10:44:28 hpc-

Re: [openib-general] Upstream submission

2004-11-15 Thread Tom Duffy
On Mon, 2004-11-15 at 08:52 -0800, Roland Dreier wrote: > The plan would be to produce a series of patches > that adds the code in our gen2/trunk: the IB core, mad layer, mthca, > IPoIB and user MAD modules. Is there a reason to break up into patches code in drivers/infiniband? There seem to alr

RE: [openib-general] Upstream submission

2004-11-15 Thread Woodruff, Robert J
>Comments? Objections? >Thanks, > Roland Getting code review as early as possible is probably a good idea. woody

[openib-general] [PATCH] umad: pass timeouts to userspace

2004-11-15 Thread Roland Dreier
OK, this adds a status and a timeout_ms field to struct ib_user_mad and passes timeouts up to userspace. Seem OK? - R.

Index: infiniband/include/ib_user_mad.h
===
--- infiniband/include/ib_user_mad.h (revision 1223)
+++ infinib

[openib-general] Upstream submission

2004-11-15 Thread Roland Dreier
Just to focus our minds, I would like to propose that we aim to post a first version of InfiniBand patches for review to linux-kernel next Monday, November 22. The plan would be to produce a series of patches that adds the code in our gen2/trunk: the IB core, mad layer, mthca, IPoIB and user MAD m

Re: [openib-general] MAD handling

2004-11-15 Thread Roland Dreier
Oh yeah, one more slight glitch in the MAD API. It turns out that if a 0-hop DR SMP is passed to ib_post_send_mad(), the client's recv_handler will be called back directly from the same context. This means that the client has to be very careful to avoid deadlocking by taking the same lock in both
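The hazard described here, a receive handler invoked directly from the sending context, can be illustrated with a userspace model. This is a sketch under stated assumptions, not the kernel API: Client, post_send_loopback, send_holding_lock and send_without_lock are hypothetical names, and a non-reentrant threading.Lock stands in for the client's lock. The handler uses a non-blocking acquire so the sketch reports the would-be deadlock instead of hanging.

```python
import threading


class Client:
    """Toy MAD client whose send path and receive handler share one lock."""

    def __init__(self):
        self.lock = threading.Lock()  # non-reentrant stand-in for the client's lock
        self.received = []
        self.would_deadlock = False

    def recv_handler(self, mad):
        # A real handler would block here forever; try-lock lets us report it.
        if not self.lock.acquire(blocking=False):
            self.would_deadlock = True
            return
        try:
            self.received.append(mad)
        finally:
            self.lock.release()


def post_send_loopback(client, mad):
    """Hypothetical stand-in for a send that is answered locally:
    the handler runs synchronously, in the caller's own context."""
    client.recv_handler(mad)


def send_holding_lock(client, mad):
    # Hazardous pattern: the lock is still held when the handler runs.
    with client.lock:
        post_send_loopback(client, mad)


def send_without_lock(client, mad):
    # Safer pattern: finish the locked work, release, then post the send.
    with client.lock:
        pending = mad
    post_send_loopback(client, pending)


if __name__ == "__main__":
    bad, good = Client(), Client()
    send_holding_lock(bad, "smp")
    send_without_lock(good, "smp")
    print(bad.would_deadlock, good.would_deadlock)
```

Dropping the lock before posting is only one way out; deferring the handler to a separate context would be another, at the cost of extra machinery in the sending layer.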

Re: [openib-general] MAD handling

2004-11-15 Thread Hal Rosenstock
On Mon, 2004-11-15 at 10:20, Roland Dreier wrote: > Roland> - Also, if I'm reading the code correctly, it seems that > Roland> in handle_outgoing_smp, mad_priv->mad will be dispatched > Roland> even if no response was generated by the call to > Roland> process_mad (ie we might pass

Re: [openib-general] MAD handling

2004-11-15 Thread Roland Dreier
Hal> Seems to me like the SM would/could/should be using solicited Hal> sends with timeouts. Maybe that's not the way this would be Hal> today just porting what is already there. I guess I'll extend user_mad.c to handle timeouts then. Roland> - Also, if I'm reading the code corre

Re: [openib-general] MAD handling

2004-11-15 Thread Hal Rosenstock
On Mon, 2004-11-15 at 00:25, Roland Dreier wrote: > A few questions about MAD handling: > > - What is supposed to happen to MADs that are received and are > considered "solicited" because they have a method like GetResp, but > which don't match any outstanding sends? Right now it looks as if