Hi Roland,
Should the user-mad.txt doc indicate /udev rather than /dev as follows:
/udev files
rather than
/dev files
/udev/infiniband/mthca0/ports/1/mad
rather than
/dev/infiniband/mthca0/ports/1/mad
-- Hal
___
openib-general mailing list
[EMAIL PROTECTED]
http:/
On Mon, 2004-11-15 at 23:18, Roland Dreier wrote:
> flags for spin lock should be unsigned long, not int.
Thanks. Applied.
-- Hal
I just committed a change to IPoIB that means ipoibcfg is no longer
needed (and will no longer work). See the previous message in this
thread, "[PATCH] Get rid of /proc/infiniband/ipoib_vlan", for full details.
- Roland
This kills off /proc/infiniband/ipoib_vlan in favor of a simpler sysfs
interface. To create ib0.8001, you can now just do
# echo 0x8001 > /sys/class/net/ib0/create_child
and to get rid of the interface,
# echo 0x8001 > /sys/class/net/ib0/delete_child
(Better names for these files glad
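The two echo commands above can also be issued from a program. What follows is only a minimal sketch in C, assuming the create_child and delete_child files accept the P_Key as hex text exactly as in the echo examples, and assuming the child name is the parent name plus a dot and four hex digits (as in ib0.8001). The helper names child_ctl_path, child_ifname and write_pkey are made up for illustration and are not part of IPoIB.

```c
#include <stdio.h>
#include <string.h>

/* Build "/sys/class/net/<parent>/<op>", where op is "create_child" or
 * "delete_child". Returns 0 on success, -1 if the buffer is too small. */
static int child_ctl_path(char *buf, size_t len, const char *parent, const char *op)
{
    int n = snprintf(buf, len, "/sys/class/net/%s/%s", parent, op);
    return (n < 0 || (size_t)n >= len) ? -1 : 0;
}

/* Expected child interface name, e.g. parent "ib0" and 0x8001 -> "ib0.8001".
 * The four-digit hex suffix is an assumption taken from the example above. */
static int child_ifname(char *buf, size_t len, const char *parent, unsigned int pkey)
{
    int n = snprintf(buf, len, "%s.%04x", parent, pkey & 0xffffu);
    return (n < 0 || (size_t)n >= len) ? -1 : 0;
}

/* Write the P_Key as "0x8001\n" to the given file, like the echo command.
 * Returns 0 on success, -1 if the file cannot be opened or written. */
static int write_pkey(const char *path, unsigned int pkey)
{
    FILE *f = fopen(path, "w");
    int ok;

    if (!f)
        return -1;
    ok = fprintf(f, "0x%04x\n", pkey & 0xffffu) > 0;
    if (fclose(f) != 0)
        ok = 0;
    return ok ? 0 : -1;
}
```

A caller would build the path with child_ctl_path(buf, sizeof buf, "ib0", "create_child") and then pass it to write_pkey(buf, 0x8001). Writing to sysfs normally requires root, as the # prompt in the commands above suggests.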
flags for spin lock should be unsigned long, not int.
- R.
Index: infiniband/core/mad.c
===
--- infiniband/core/mad.c (revision 1239)
+++ infiniband/core/mad.c (working copy)
@@ -1353,7 +1353,7 @@
{
struct ib_ma
After looking at the code, I believe that there's a race condition
cleaning up in the MAD code when unloading the HCA driver. The
MAD layer can be processing a received MAD when the driver unloads,
which can result in accessing the receive queue after all MADs
on the receive queue have been freed.
On Mon, 2004-11-15 at 15:22 -0800, Peter Buckingham wrote:
> Roland Dreier wrote:
> > Tom> I just tried with the latest gen2 openib bits on 2.6.10-rc2,
> > Tom> mthca and ipoib builtin and everything builds and boots fine
> > Tom> (at least on x86_64).
> >
> > Cool, thanks for testing.
Peter> So gen2 works. From what I understand OpenSM is not yet
Peter> supported for gen2. What other things are still missing
Peter> between gen1 and gen2? (sorry, this is probably a FAQ...)
Easier to answer what works now in gen2: only IPoIB. Everything else
(userspace verbs, CM, SDP
Roland Dreier wrote:
Tom> I just tried with the latest gen2 openib bits on 2.6.10-rc2,
Tom> mthca and ipoib builtin and everything builds and boots fine
Tom> (at least on x86_64).
Cool, thanks for testing. For what it's worth, it works here on i386
as well. (Not very convenient for de
Tom> I just tried with the latest gen2 openib bits on 2.6.10-rc2,
Tom> mthca and ipoib builtin and everything builds and boots fine
Tom> (at least on x86_64).
Cool, thanks for testing. For what it's worth, it works here on i386
as well. (Not very convenient for development though :)
On Mon, 2004-11-15 at 17:37, Sean Hefty wrote:
> It's just that before supporting this, I'd like
> to make sure that routing unmatched responses is really the right solution.
>
> I.e. Is this something that kernel mode clients would need?
I think you mean user mode clients.
> Does it
> make
Hal Rosenstock wrote:
If we want to route the MAD to the corresponding agent, however, we can
do that. But doing this only seems useful if a client is duplicating
functionality, which only makes sense to me for user-mode clients.
If we want to limit this to user mode clients only, we would need
Peter> I have tried this with gen1 and things don't seem to play
Peter> nice.. I've only tried it with mellanox's hca driver, does
Peter> mthca work better when built-in?
Yes, I'm sure gen1 is completely broken, as is mellanox's driver.
mthca should work since it uses the correct PCI d
On Mon, 2004-11-15 at 14:11 -0800, Peter Buckingham wrote:
> I have tried this with gen1 and things don't seem to play nice.. I've
> only tried it with mellanox's hca driver, does mthca work better when
> built-in?
I just tried with the latest gen2 openib bits on 2.6.10-rc2, mthca and
ipoib buil
On Mon, 2004-11-15 at 17:15, Sean Hefty wrote:
> If we want to route the MAD to the corresponding agent, however, we can
> do that. But doing this only seems useful if a client is duplicating
> functionality, which only makes sense to me for user-mode clients.
If we want to limit this to user m
Hal Rosenstock wrote:
My personal take would be to avoid adding that complexity. E.g. a
client sends a MAD with TID 5, cancels 5, sends 5, cancels 5, sends 5.
A response is now received. What should the MAD layer do?
I don't see issues with silently dropping any MAD that we're not ready
to rec
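The TID 5 scenario above can be made concrete with a toy model. This is a userspace illustration only, not the MAD layer's code: it assumes a small table of outstanding sends keyed by TID, where a cancel removes the entry and a response with no matching entry is silently dropped. All names (struct pending_table, toy_send, toy_cancel, toy_response) are hypothetical.

```c
#include <stdint.h>

#define MAX_PENDING 8

/* Outstanding sends, identified only by their transaction ID. */
struct pending_table {
    uint64_t tid[MAX_PENDING];
    int      used[MAX_PENDING];
};

/* Record an outstanding send. Returns the slot index, or -1 if full. */
static int toy_send(struct pending_table *t, uint64_t tid)
{
    for (int i = 0; i < MAX_PENDING; i++) {
        if (!t->used[i]) {
            t->used[i] = 1;
            t->tid[i] = tid;
            return i;
        }
    }
    return -1;
}

/* Remove one outstanding send with this TID. Returns 1 if one was found. */
static int toy_cancel(struct pending_table *t, uint64_t tid)
{
    for (int i = 0; i < MAX_PENDING; i++) {
        if (t->used[i] && t->tid[i] == tid) {
            t->used[i] = 0;
            return 1;
        }
    }
    return 0;
}

/* Handle a response. Returns 1 if it matched an outstanding send (which is
 * then retired), 0 if nothing matched and the response is dropped. */
static int toy_response(struct pending_table *t, uint64_t tid)
{
    return toy_cancel(t, tid);
}
```

After send 5, cancel 5, send 5, cancel 5, send 5, a response with TID 5 matches the one remaining entry in this model, even though it may really be the answer to one of the two canceled sends. The TID alone cannot tell these cases apart, which is the ambiguity the question is pointing at.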
Roland Dreier wrote:
Hal> Should IB build as either built-in or modules ? (I usually
Hal> build everything as modules). If built-in should work, does
Hal> everything IB need to be built in rather than as modules ?
I haven't actually tried it but I think any combination of 'y' and 'm'
fo
Hal> Should IB build as either built-in or modules ? (I usually
Hal> build everything as modules). If built-in should work, does
Hal> everything IB need to be built in rather than as modules ?
I haven't actually tried it but I think any combination of 'y' and 'm'
for config options tha
Hal> So do we just keep the cancel around for some time period to
Hal> make sure this doesn't occur ? If so, should cancel also have
Hal> its own timeout or should some arbitrary timeout be used to
Hal> handle this case ?
I don't think we should worry about this. If a consumer sen
On Mon, 2004-11-15 at 10:16 -0800, Roland Dreier wrote:
> By the way, for our initial submission upstream, I am planning on
> submitting all the patches with my own
>
> Signed-off-by: Roland Dreier <[EMAIL PROTECTED]>
>
> line, of course preserving any other Signed-off-by: lines that already
> ex
Hi Roland,
Should IB build as either built-in or modules ? (I usually build
everything as modules). If built-in should work, does everything IB need
to be built in rather than as modules ?
Just wondering what the expectations should be here.
Thanks.
-- Hal
On Mon, 2004-11-15 at 16:36, Sean Hefty wrote:
> Hal Rosenstock wrote:
>
> > After Roland's query this AM, I am looking at this some more:
> >
> > On Wed, 2004-11-10 at 13:43, Sean Hefty wrote:
> >
> >>The second case where I can see this happening is if the client canceled
> >>the send, and I'
After Roland's query this AM, I am looking at this some more:
On Wed, 2004-11-10 at 13:43, Sean Hefty wrote:
> The second case where I can see this happening is if the client canceled
> the send, and I'm not sure that we'd want to give the client an
> unmatched response in this case.
So do we j
Paul Baxter wrote,
>While I am delighted that the lower layers are sufficiently stable to
>warrant being considered for code review/inclusion in the kernel, I am
>slightly surprised.
>Has the code been used in anger enough?
I think that Roland is suggesting we submit it for review now,
not
Thanks, applied. I'm cross-compiling for lots of archs but I only run
sparse on i386. It's always something... ;)
(Thanks for the Signed-off-by: line too)
- R.
Hal Rosenstock wrote:
After Roland's query this AM, I am looking at this some more:
On Wed, 2004-11-10 at 13:43, Sean Hefty wrote:
The second case where I can see this happening is if the client canceled
the send, and I'm not sure that we'd want to give the client an
unmatched response in this ca
Was getting warnings like: "warning: Using plain integer as NULL
pointer" when sparse checking on x86_64.
Signed-off-by: Tom Duffy <[EMAIL PROTECTED]>
Index: drivers/infiniband/hw/mthca/mthca_doorbell.h
===
--- drivers/infiniband/hw/
Doug Ledford wrote,
>Boo! ;-)
Ditto what Roland said,
Welcome.
woody
___
openib-general mailing list
[EMAIL PROTECTED]
http://openib.org/mailman/listinfo/openib-general
To unsubscribe, please visit http://openib.org/mailman/listinfo/openib-general
On Mon, 2004-11-15 at 15:04, Sean Hefty wrote:
> On Mon, 15 Nov 2004 14:50:06 -0500
> Hal Rosenstock <[EMAIL PROTECTED]> wrote:
>
> > On Mon, 2004-11-15 at 13:29, Sean Hefty wrote:
> > > On Fri, 12 Nov 2004 22:08:14 -0500
> > > Hal Rosenstock <[EMAIL PROTECTED]> wrote:
> > > > This patch looks lik
Paul> Has the code been used in anger enough?
Paul> There seem to be a lot of bugs still being discovered daily.
I think in most scenarios IPoIB is quite stable. I've run many
gigabytes of traffic without trouble. There may still be corner cases
with module unloading and the like, but I
On Mon, 15 Nov 2004 14:50:06 -0500
Hal Rosenstock <[EMAIL PROTECTED]> wrote:
> On Mon, 2004-11-15 at 13:29, Sean Hefty wrote:
> > On Fri, 12 Nov 2004 22:08:14 -0500
> > Hal Rosenstock <[EMAIL PROTECTED]> wrote:
> > > This patch looks like it includes the previous patch and due to this 2
> > > larg
While I am delighted that the lower layers are sufficiently stable to
warrant being considered for code review/inclusion in the kernel, I am
slightly surprised.
Has the code been used in anger enough?
There seem to be a lot of bugs still being discovered daily.
Wouldn't having at least a prelim
mad: In handle_outgoing_smp, only match response if generated
(based on comment from Roland)
Index: mad.c
===
--- mad.c (revision 1230)
+++ mad.c (working copy)
@@ -394,6 +394,10 @@
goto error1;
Hal Rosenstock wrote:
Patch now applies but I get the following compile errors:
drivers/infiniband/core/mad.c: In function
`ib_mad_change_qp_state_to_init':
drivers/infiniband/core/mad.c:1708: warning: declaration of `qp' shadows
a parameter
drivers/infiniband/core/mad.c:1716: `i' undeclared (first
On Mon, 2004-11-15 at 13:29, Sean Hefty wrote:
> On Fri, 12 Nov 2004 22:08:14 -0500
> Hal Rosenstock <[EMAIL PROTECTED]> wrote:
> > This patch looks like it includes the previous patch and due to this 2
> > large hunks are rejected. Can you regenerate this ?
>
> Updated patch.
Patch now applies b
Doug> Suggestions for items I can read, web sites I should visit
Doug> in order to help get me up to speed, etc. welcomed.
Doug,
First off, welcome!
Unfortunately there's not much to read about InfiniBand beyond the
current IB spec. However, I think chapter 3 is actually quite a nice
in
On Mon, 2004-11-15 at 08:52 -0800, Roland Dreier wrote:
> Just to focus our minds, I would like to propose that we aim to post a
> first version of InfiniBand patches for review to linux-kernel next
> Monday, November 22.
Boo! ;-)
I'll echo the sentiment that this is a good idea.
While I'm pipin
On Fri, 12 Nov 2004 22:08:14 -0500
Hal Rosenstock <[EMAIL PROTECTED]> wrote:
> This patch looks like it includes the previous patch and due to this 2
> large hunks are rejected. Can you regenerate this ?
Updated patch.
- Sean
Index: core/mad.c
===
By the way, for our initial submission upstream, I am planning on
submitting all the patches with my own
Signed-off-by: Roland Dreier <[EMAIL PROTECTED]>
line, of course preserving any other Signed-off-by: lines that already
exist. However, for the future, it would be a good idea to make sure
th
Hal> unregister_netdevice: waiting for ib0 to become free. Usage count = 1
Someone is still holding a reference to the ib0 device. I don't see
anything in the IPoIB code that could be doing it, so it seems like
someone outside the driver must be doing it.
- R.
Tom> Is there a reason to break up into patches code in
Tom> drivers/infiniband?
I think so: ease of review. A single 15000 line patch is not going to
be very readable. Breaking it up into multiple pieces makes the
architecture a little clearer and also helps naturally organize the
repli
Hi,
The ethernet on this machine is DHCP'd. Some network glitch (I think)
followed by trying to remove the ipoib modules caused the following to
be displayed in the console logs. Any ideas? Thanks.
-- Hal
Nov 15 10:44:28 hpc-1 network: Shutting down interface eth1: succeeded
Nov 15 10:44:28 hpc-
On Mon, 2004-11-15 at 08:52 -0800, Roland Dreier wrote:
> The plan would be to produce a series of patches
> that adds the code in our gen2/trunk: the IB core, mad layer, mthca,
> IPoIB and user MAD modules.
Is there a reason to break up into patches code in drivers/infiniband?
There seem to alr
>Comments? Objections?
>Thanks,
> Roland
Getting code review as early as possible is probably a good idea.
woody
OK, this adds a status and a timeout_ms field to struct ib_user_mad
and passes timeouts up to userspace. Seem OK?
- R.
Index: infiniband/include/ib_user_mad.h
===
--- infiniband/include/ib_user_mad.h(revision 1223)
+++ infinib
Just to focus our minds, I would like to propose that we aim to post a
first version of InfiniBand patches for review to linux-kernel next
Monday, November 22. The plan would be to produce a series of patches
that adds the code in our gen2/trunk: the IB core, mad layer, mthca,
IPoIB and user MAD m
Oh yeah, one more slight glitch in the MAD API. It turns out that if
a 0-hop DR SMP is passed to ib_post_send_mad(), the client's
recv_handler will be called back directly from the same context. This
means that the client has to be very careful to avoid deadlocking by
taking the same lock in both
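To illustrate the hazard, here is a toy userspace model, not kernel code. It assumes a non-recursive lock, represented here by a plain flag, and a hypothetical toy_post_send() that, like the 0-hop DR SMP case described above, calls the receive handler synchronously before returning. Every name in it is made up for the example; none of this is the MAD API.

```c
/* Toy client state. lock_held stands in for a non-recursive lock. */
struct toy_client {
    int lock_held;
    int would_deadlock;  /* set when the handler finds the lock already held */
    int received;        /* number of receives the handler processed */
};

/* Receive handler that wants the client's lock. */
static void toy_recv_handler(struct toy_client *c)
{
    if (c->lock_held) {
        /* With a real non-recursive lock, this context would wait on
         * itself forever. The model just records the problem. */
        c->would_deadlock = 1;
        return;
    }
    c->lock_held = 1;
    c->received++;
    c->lock_held = 0;
}

/* zero_hop != 0 models the case where the handler runs directly in the
 * caller's context before the post call returns. */
static void toy_post_send(struct toy_client *c, int zero_hop)
{
    if (zero_hop)
        toy_recv_handler(c);
}

/* Hazardous pattern: post while still holding the lock. */
static void send_holding_lock(struct toy_client *c, int zero_hop)
{
    c->lock_held = 1;
    toy_post_send(c, zero_hop);
    c->lock_held = 0;
}

/* One way around it: finish the locked work, unlock, then post. */
static void send_after_unlock(struct toy_client *c, int zero_hop)
{
    c->lock_held = 1;
    /* ... update client state under the lock ... */
    c->lock_held = 0;
    toy_post_send(c, zero_hop);
}
```

In this model send_holding_lock() trips the would_deadlock flag only in the zero-hop case, which is what makes the problem easy to miss in testing. Dropping the lock before posting is just one option; whether it suits a given client depends on what the lock protects.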
On Mon, 2004-11-15 at 10:20, Roland Dreier wrote:
> Roland> - Also, if I'm reading the code correctly, it seems that
> Roland> in handle_outgoing_smp, mad_priv->mad will be dispatched
> Roland> even if no response was generated by the call to
> Roland> process_mad (ie we might pass
Hal> Seems to me like the SM would/could/should be using solicited
Hal> sends with timeouts. Maybe that's not the way this would be
Hal> today just porting what is already there.
I guess I'll extend user_mad.c to handle timeouts then.
Roland> - Also, if I'm reading the code corre
On Mon, 2004-11-15 at 00:25, Roland Dreier wrote:
> A few questions about MAD handling:
>
> - What is supposed to happen to MADs that are received and are
> considered "solicited" because they have a method like GetResp, but
> which don't match any outstanding sends? Right now it looks as if