Re: [openib-general] [PATCH] Optimize cma_process_remove()

2006-09-11 Thread Sean Hefty
Krishna Kumar wrote: > static void cma_process_remove(struct cma_device *cma_dev) > { > struct list_head remove_list; > - struct rdma_id_private *id_priv; > + struct rdma_id_private *id_priv, *tmp; > int ret; > > INIT_LIST_HEAD(&remove_list); > @@ -2344,22 +2344,20 @@

Re: [openib-general] [PATCH] Modify callers of cma_get_net_info for better error handling.

2006-09-11 Thread Sean Hefty
Krishna Kumar wrote: > Re-organize code relating to cma_get_net_info() and rdma_create_id() to > optimize error case handling (no need to alloc memory/etc as part of > rdma_create_id() if input parameters are wrong). Thanks! Committed with a minor adjustment to rename 'out' label 'err'. - Sean

Re: [openib-general] [PATCH] cma_connect_ib leaks memory in failure cases.

2006-09-11 Thread Sean Hefty
Krishna Kumar wrote: > cma_connect_ib leaks an struct ib_cm_id* in failure cases. Thanks - committed. - Sean ___ openib-general mailing list openib-general@openib.org http://openib.org/mailman/listinfo/openib-general To unsubscribe, please visit http:

Re: [openib-general] [PATCH] cma_connect_ib leaks memory in failure cases.

2006-09-11 Thread Sean Hefty
Michael S. Tsirkin wrote: >>cma_connect_ib leaks an struct ib_cm_id* in failure cases. >> >>Signed-off-by: Krishna Kumar <[EMAIL PROTECTED]> > > > This one looks like it might be good for 2.6.18. Sean? The ib_cm_id will be cleaned up if the rdma_cm_id is destroyed, as long as a second call is n

Re: [openib-general] RDMA CMA and C++

2006-09-11 Thread Sean Hefty
Dotan Barak wrote: >>The user-mode cm header files don't have the C++ stuff to identify all >>the declarations as C. The verbs.h file has it and works fine if you >>wanted to copy it, but all you really need is ... >> > Sean, please add those definitions to the libibcm header as well. I've updated
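
The "C++ stuff" mentioned above is presumably the usual extern "C" guard around a header's declarations. A minimal sketch of that pattern follows; the header name and the function `example_cm_version` are hypothetical and are not part of libibcm.

```c
/* example_cm.h: hypothetical header used only for illustration,
 * not the real libibcm header. */
#ifndef EXAMPLE_CM_H
#define EXAMPLE_CM_H

#ifdef __cplusplus
extern "C" {            /* tell a C++ compiler these are C declarations */
#endif

/* Hypothetical declaration; a real header would list its API here. */
int example_cm_version(void);

#ifdef __cplusplus
}                       /* closes extern "C" */
#endif

#endif /* EXAMPLE_CM_H */

/* Definition, normally found in the library's .c file. */
int example_cm_version(void)
{
    return 1;
}
```

With the guard in place, a C++ translation unit that includes the header links against the unmangled C symbol.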

Re: [openib-general] Wrong byte order in lid of struct ibv_port_attr reported by ibv_query port!?

2006-09-11 Thread Sean Hefty
Bub Thomas wrote: > with the help of your modified cmpost.c example I found out that the > byte order in the lid your query_for_path in cmpost.c is getting into > the ib_sa_path_rec is the opposite to the one reported by ibv_query_port. The path record defines all fields in network-byte order.
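
Since the path record carries its fields in network byte order, a LID taken from ibv_query_port (host order, going by the report above) would need converting before it is stored in, or compared with, a path record field. A minimal sketch, assuming a 16-bit LID; the helper names are hypothetical.

```c
#include <arpa/inet.h>
#include <stdint.h>

/* Convert a host-order LID (as reported for a port) into the
 * network byte order used by path record fields. */
static uint16_t lid_host_to_wire(uint16_t host_lid)
{
    return htons(host_lid);
}

/* Convert a LID read out of a path record back to host order. */
static uint16_t lid_wire_to_host(uint16_t wire_lid)
{
    return ntohs(wire_lid);
}
```

On a big-endian host both helpers are no-ops, which is why a mismatch like the one reported only shows up on little-endian machines.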

Re: [openib-general] [PATCH v3] ib_sa: require SA registration

2006-09-11 Thread Sean Hefty
Roland Dreier wrote: > I haven't really read the later patches but I am planning on merging > at least the registration stuff for 2.6.19. I'd like to commit the SA related patches soon. There have been several e-mails recently about using IB multicast and the IB CM directly. - Sean

Re: [openib-general] [librdmacm] execuation of the the test udaddy is failing

2006-09-06 Thread Sean Hefty
> # udaddy >udaddy: starting server >librdmacm: Kernel ABI does not support requested port space. >udaddy: listen request failed >test complete >return status -93 UD QP and multicast support require kernel ABI version 2. It appears that the running kernel's ABI version is 1. - Sean

Re: [openib-general] libibcm can't connect/talk to libicm on other machine.

2006-09-05 Thread Sean Hefty
Bub Thomas wrote: > Dotan, > the ibv_rc_pingpong example works for me so I can exclude the > architecture. > I never got the libibcm example compiled. > Which is your example and which architecture x86 vs. x86_64 did you > compile it for? > Can you share your libibcm the example code? (if it is not

Re: [openib-general] rdmacm library

2006-09-04 Thread Sean Hefty
>/usr/bin/ld: warning: libibverbs.so.1, needed by >/usr/local/lib/librdmacm.so, may conflict with libibverbs.so.2 > >Does rdmacm use the older version of ibverbs or do I need to install >rdmacm differently? I keep the RDMA CM updated with the latest version of verbs. There may be an issue with th

Re: [openib-general] [PATCH] for-2.6.19 cma: protect against adding device during destruction

2006-09-04 Thread Sean Hefty
>ok, thanks for clarifying that, is cancellation allowed only for address >resolution or also for route resolving and/or CM calls? also how about >documenting this? Cancellation is allowed for any asynchronous operation. I will pull in your patch when I get back in the office. Thanks. - Sean

Re: [openib-general] [PATCH] for-2.6.19 cma: protect against adding device during destruction

2006-09-03 Thread Sean Hefty
>Does this patch protects against the case where an rdma_cm_id is being >destructed while address resolution related to the **same** id attaches >it to a device? > >If yes, why does someone destroys this id? is it legal to do so? Yes - this protects against the user destroying the id while that sa

[openib-general] [PATCH] for-2.6.19 cma: protect against adding device during destruction

2006-09-01 Thread Sean Hefty
This closes a window where address resolution can attach an rdma_cm_id to a device during destruction of the rdma_cm_id. This can result in the rdma_cm_id remaining in the device list after its memory has been freed. Signed-off-by: Sean Hefty <[EMAIL PROTECTED]> --- I generated this pat

Re: [openib-general] [PATCH] cma: protect against adding device during destruction

2006-09-01 Thread Sean Hefty
>I'll test some, but the problem hasn't reappeared since. >The patch looks right, I'd say push it for 2.6.18. We need the following change, which applies on top of the previous patch, as well. Add missing synchronization around acquiring an IB device. Signed-off-by: Sean He

[openib-general] [PATCH] cma: protect against adding device during destruction

2006-08-31 Thread Sean Hefty
Can you see if this patch helps any? This closes a window where address resolution can attach an rdma_cm_id to a device during destruction of the rdma_cm_id. This can result in the rdma_cm_id remaining in the device list after its memory has been freed. Signed-off-by: Sean Hefty <[EM

[openib-general] [PATCH] 2.6.19 cma: fix typo

2006-08-31 Thread Sean Hefty
Comma should be semi-colon Signed-off-by: Sean Hefty <[EMAIL PROTECTED]> --- Please queue for 2.6.19 diff --git a/drivers/infiniband/core/cma.c b/drivers/infiniband/core/cma.c index d6f99d5..bf20410 100644 --- a/drivers/infiniband/core/cma.c +++ b/drivers/infiniband/core/cma.c @@ -265,7

Re: [openib-general] [PATCH v5 2/2] iWARP Core Changes.

2006-08-30 Thread Sean Hefty
Roland Dreier wrote: > While merging this, I uninlined rdma_node_get_transport, since I don't > think there's any reason to make it inline: I've committed the patch to svn to sync as well. - Sean

Re: [openib-general] [PATCH ] RFC IB/cm do not track remote QPN in timewait state

2006-08-30 Thread Sean Hefty
retried. In this situation, the DREQ gets dropped repeatedly. We will want to queue this patch for 2.6.19, if you can point Roland to your git tree. Acked-by: Sean Hefty <[EMAIL PROTECTED]>

Re: [openib-general] CMA oops

2006-08-30 Thread Sean Hefty
Michael S. Tsirkin wrote: >>I'm trying to come up with a fix for this, but I'm not convinced it's the >>problem that you're seeing. > > > Could be what you describe leads to a memory corruption. I believe so. If this were the cause of the crash, I would expect to see an issue with list->prev-

Re: [openib-general] CMA oops

2006-08-30 Thread Sean Hefty
Michael S. Tsirkin wrote: > Apparently, list->prev pointer in CMA id_priv structure is NULL > which causes a crash in list_del. > > I note that rdma_destroy_id tests outside the mutex lock. > Could that be the problem? > The problem is not unfortunately easily reproducible. I think I see one bug,

Re: [openib-general] [PATCH ] RFC IB/cm do not track remote QPN in timewait state

2006-08-30 Thread Sean Hefty
Michael S. Tsirkin wrote: > And so can RTU, in which case again QP will be in RTR. So it seems > lost CM packets aren't protected by timewait. Maybe we just try to deal with this the best that we can and make the HCA driver responsible for not re-allocating QPs for a duration of local_ack_timeou

Re: [openib-general] [PATCH ] RFC IB/cm do not track remote QPN in timewait state

2006-08-29 Thread Sean Hefty
Michael S. Tsirkin wrote: >>If we completely ignore timewait, what conditions are required to have a >>problem >>occur? > > Outstanding packets with PSNs and QP numbers coincide between the 2 > connections. > Look for "Stale packet" in IB spec. From what I can tell, a QP will receive an incom

Re: [openib-general] [PATCH ] RFC IB/cm do not track remote QPN in timewait state

2006-08-29 Thread Sean Hefty
Michael S. Tsirkin wrote: > Hmm. But you need timewait already after you get to RTR, right? The active side looks fine. The passive side can enter timewait without moving through RTS if it gets an RTU timeout. I'm not sure how much going into timewait really helps in this case though. If we c

Re: [openib-general] [PATCH ] RFC IB/cm do not track remote QPN in timewait state

2006-08-29 Thread Sean Hefty
Michael S. Tsirkin wrote: >>Verbs gets local_ack_timeout through qp_attr.timeout when modifying the QP to >>RTS. > > > Isn't that RTR? It's the transition from RTR to RTS. > So it seems we won't need any API changes. This begins to look good. > I wonder what Roland and other low level driver ma

Re: [openib-general] [PATCH ] RFC IB/cm do not track remote QPN in timewait state

2006-08-29 Thread Sean Hefty
Sean Hefty wrote: > How would the driver determine how long the QP should remain in timewait The spec isn't totally clear to me on this, but here's what I can gather: timewait = packet lifetime x 2 + remote ack delay local_ack_timeout (in CM REQ) = packet lifetime x 2 + local ack
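
The first relation above (timewait = packet lifetime x 2 + remote ack delay) can be turned into a small calculation. This is only a sketch under stated assumptions: it takes each input as an IB-style exponent n meaning 4.096 us x 2^n, uses hypothetical helper names, and follows the formula as quoted in the message, which itself notes that the spec is not entirely clear on this.

```c
#include <stdint.h>

/* Decode an IB-style timeout exponent: 4.096 us * 2^n, returned in
 * nanoseconds. This sketch is only meant for exponents up to 31. */
static uint64_t ib_time_ns(uint8_t exponent)
{
    return 4096ULL << exponent;
}

/* timewait = packet lifetime x 2 + remote ack delay, as quoted in the
 * message above. Both arguments are exponents (an assumption). */
static uint64_t cm_timewait_ns(uint8_t packet_lifetime_exp,
                               uint8_t remote_ack_delay_exp)
{
    return 2 * ib_time_ns(packet_lifetime_exp)
           + ib_time_ns(remote_ack_delay_exp);
}
```

For example, with a packet lifetime exponent of 14 and a remote ack delay exponent of 15, the sketch yields 268435456 ns, roughly 0.27 seconds.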

Re: [openib-general] libibcm can't open /dev/infiniband/ucm0

2006-08-29 Thread Sean Hefty
>Looked into the openIB kernel sources and found that the minor number >seems to be wrong in the README file. With a minor number "224" and the >creation like: > > "mknod /dev/infiniband/ucm0 c 231 224" The README file was never updated when the userspace CM added per device handling. I've

Re: [openib-general] [PATCH] libibcm: Need to include stddef.h in cm.c for SLES10 compilations

2006-08-29 Thread Sean Hefty
Jack Morgenstein wrote: > Fix compilation on SLES10: > cm.c uses offsetof, so it must include stddef.h Thanks - committed in 9150. - Sean

Re: [openib-general] [PATCH] libibsa: userspace SA query and multicast support

2006-08-29 Thread Sean Hefty
Hal Rosenstock wrote: > OK. So shouldn't IBV_SA_METHOD_SEND be removed from sa_net.h ? I was just defining the well known methods. I can remove this. > By raw access, do you mean SEND_MAD operation ? > > How do those applications gain this privilege ? The kernel module exports two files to pe

Re: [openib-general] [PATCH ] RFC IB/cm do not track remote QPN in timewait state

2006-08-29 Thread Sean Hefty
Michael S. Tsirkin wrote: >>I've thought about this too, and I think this may end up making the most >>sense. >>How would the driver determine how long the QP should remain in timewait, > > > Need to look into this - likely we can just add a call for that. > Roland? The Intel gen1 code passed t

Re: [openib-general] [PATCH] libibcm: modify API to support multi-threaded event processing

2006-08-29 Thread Sean Hefty
Michael S. Tsirkin wrote: > I think offsetof is defined in stddef.h, so you must include that. Dotan, Can you see if adding this include works for you? I just re-tested the build on my system, and it worked fine without it (gcc 3.3.3). Jack posted a patch for this earlier if you need one. -

Re: [openib-general] [PATCHES] for 2.6.19

2006-08-29 Thread Sean Hefty
>I handled it all myself this time, but in the future it is easier for >me if each patch is inline in a separate email. A couple of other >things that would also make my life easier: That's not a problem. I think in the past I've just referred you to the svn revision numbers. I was just trying

Re: [openib-general] [PATCH ] RFC IB/cm do not track remote QPN in timewait state

2006-08-29 Thread Sean Hefty
>Here's an idea: >how about we move the whole timewait thing to low level driver, >starting timer automatically upon QP destroy? I've thought about this too, and I think this may end up making the most sense. How would the driver determine how long the QP should remain in timewait, and how would y

Re: [openib-general] [PATCH] libibsa: userspace SA query and multicast support

2006-08-29 Thread Sean Hefty
>Why SEND ? In general, couldn't it be used like SET/DELETE (in addition >to being used like the GET method variants) ? Also, the SA doesn't use >the SEND method. The latest version of the patch only allows GET or GET_TABLE for PathRecords, ServiceRecords, and MCMemberRecords, and GET_MULTI for Mul

Re: [openib-general] [PATCH] libibcm: modify API to support multi-threaded event processing

2006-08-29 Thread Sean Hefty
>There are compilation errors with this patch when using gcc 4.1.0: Hmmm... I will look into this. - Sean

Re: [openib-general] [PATCH v3] ib_sa: require SA registration

2006-08-28 Thread Sean Hefty
Roland, Not sure if you've had a chance to review the SA patches, but any comments on any of the SA related patches? (SA registration, generic RMPP query support, or userspace SA) - Sean

Re: [openib-general] [PATCH] libibcm: modify API to support multi-threaded event processing

2006-08-28 Thread Sean Hefty
Sean Hefty wrote: > Modify the libibcm API to provide better support for multi-threaded > event processing. CM devices are no longer tied to verb devices > and hidden from the user. This should allow an application to direct > events to specific threads for processing.

[openib-general] [PATCHES] for 2.6.19

2006-08-28 Thread Sean Hefty
9088 - randomize starting local comm id Let me know if you'd prefer these in another format (such as inline). - Sean From d697059a6f69e19c18a50c87df20894d253d3d8f Mon Sep 17 00:00:00 2001 From: Sean Hefty <[EMAIL PROTECTED]> Date: Mon, 28 Aug 2006 15:15:18 -0700 Subject: [PATCH] Ran

Re: [openib-general] drop mthca from svn?

2006-08-28 Thread Sean Hefty
>Well, what is an "OpenFabrics driver" anyway? I'm interested in >writing Linux drivers to be honest. It's often ignored, but OpenFabrics does include Windows. My understanding is that the requirement for lower level components is that they must be licensed using dual GPL / BSD. This agreement

Re: [openib-general] [PATCH ] RFC IB/cm do not track remote QPN in timewait state

2006-08-28 Thread Sean Hefty
Michael S. Tsirkin wrote: > I believe communication id should be checked to detect duplicates. Right? Can you clarify this? Check the remote comm id of an incoming REQ against a value in timewait? > Remote QPN stale connection rule is only to avoid a case where we keep > connection in establish

Re: [openib-general] [PATCH ] RFC IB/cm do not track remote QPN in timewait state

2006-08-28 Thread Sean Hefty
Michael S. Tsirkin wrote: > Another problem that I see is that CMA currently seems to completely > mask timewait exit. This is correct. > So there's no way to properly handle timewait on top of cma that I can see. I don't think so, which is what brought up the problem with Arlin. (He's using

Re: [openib-general] [PATCH ] RFC IB/cm do not track remote QPN in timewait state

2006-08-28 Thread Sean Hefty
Michael S. Tsirkin wrote: >>The CM tracks the remote QP, not the local. > > > I might not have been clear. > For connection in timewait state, spec explicitly says local QP > must be in reset, error or init. > Only after it goes out of timewait can you destroy the QP. > That's the tracking I thin

Re: [openib-general] drop mthca from svn?

2006-08-28 Thread Sean Hefty
Roland Dreier wrote: > James> If the code is moved, how can the OpenFabrics community be > James> guaranteed that the entire software stack will remain under > James> a dual BSD/GPL license? > > You can't guarantee that someone won't come along and write some IB > driver and get it mer

Re: [openib-general] [PATCH ] RFC IB/cm do not track remote QPN in timewait state

2006-08-28 Thread Sean Hefty
Michael S. Tsirkin wrote: > So, you must somehow detect that the remote QP is in timewait state. > I don't see any way to do this, and this is not what the CM > currently does. > > Our CM tracks local QPs in timewait state, which is obviously not > what the spec intends since remote QP could be re

Re: [openib-general] [PATCH ] RFC IB/cm do not track remote QPN in timewait state

2006-08-28 Thread Sean Hefty
Michael S. Tsirkin wrote: > IB spec, section 12.4, says: > > CMs shall maintain enough connection state information to detect an > attempt > to initiate a connection on a remote QP/EEC that has not been released > from a connection with a local QP/EEC, or that is in the TimeWait

Re: [openib-general] [PATCH ] RFC IB/cm do not track remote QPN in timewait state

2006-08-28 Thread Sean Hefty
Michael S. Tsirkin wrote: > Comments appreciated. I will look at the spec in more details, but I thought that timewait was included as part of the life of a connection. I.e. the connection wasn't released until it returned to idle. Also, isn't the purpose behind timewait to prevent re-connect

Re: [openib-general] CMA oops

2006-08-28 Thread Sean Hefty
Michael S. Tsirkin wrote: > Apparently, list->prev pointer in CMA id_priv structure is NULL > which causes a crash in list_del. > > I note that rdma_destroy_id tests outside the mutex lock. > Could that be the problem? > The problem is not unfortunately easily reproducible. I'll see if I see a pr

Re: [openib-general] basic IB doubt

2006-08-25 Thread Sean Hefty
>Thomas> How does an adapter guarantee that no bridges or other >Thomas> intervening devices reorder their writes, or for that >Thomas> matter flush them to memory at all!? > >That's a good point. The HCA would have to do a read to flush the >posted writes, and I'm sure it's not doing

[openib-general] [PATCH v2] ib_usa: support userspace SA queries and multicast

2006-08-24 Thread Sean Hefty
sent to the SA. An administrator can set control on these files in any appropriate way. Signed-off-by: Sean Hefty <[EMAIL PROTECTED]> --- Index: include/rdma/ib_usa.h === --- include/rdma/ib_usa.h (revision 0) +++ includ

Re: [openib-general] basic IB doubt

2006-08-24 Thread Sean Hefty
>> Polling on a CQ involves a function call, synchronization to the CQ, and >> formatting a structure to return to the user. I don't see this ever being >> faster than polling memory. > >Why don't you measure it, then? Why? Reading a memory location directly will be faster than calling a functio

Re: [openib-general] basic IB doubt

2006-08-24 Thread Sean Hefty
>Actually, if a hardware implementation provided the same performance >(in this case latency) by polling on a CQ as one where polling on >memory was guaranteed to work, the customer may actually prefer the >"standard" implementation. Polling on a CQ involves a function call, synchronization to the

Re: [openib-general] basic IB doubt

2006-08-24 Thread Sean Hefty
>But you're still confusing practicality and theory. I can see why it makes >practical sense for newcomers to implement this new, performance- >reducing feature. But why is it theoretically good? I'm missing the standard you're using to judge what's theoretically good and bad. Applications are written

Re: [openib-general] basic IB doubt

2006-08-24 Thread Sean Hefty
>We're trying to create *inter-operable* hardware and >software in this community. So we follow the IB standard. Atomic operations and RDD are optional, yet still part of the IB "standard". An application that makes use of either of these isn't guaranteed to operate with all IB hardware. I'm not

Re: [openib-general] basic IB doubt

2006-08-24 Thread Sean Hefty
>OK, great. I'm fine with people using things which are supported, but >then we need the big, blinking "Warning! This program is non-standard, and >won't work with many of the devices supported by Open Fabrics!" sign. If an application were written to use Myrinet, would you consider it non-standar

Re: [openib-general] librdmacm ABI issues with OFED 1.1

2006-08-24 Thread Sean Hefty
I committed this change to the librdmacm in svn 9105. It still requires a backport patch for the kernel code. - Sean

Re: [openib-general] [PATCH 0/4] Dispatch communication related events to the IB CM

2006-08-24 Thread Sean Hefty
Michael S. Tsirkin wrote: >>And even with these proposed changes, there's a race condition where the CM >>can timeout a connection after data is received over it, but before this event >>can be processed. > > > Hmm. And what happens then? The connection is aborted by the CM. The CM sends a REJ

Re: [openib-general] librdmacm ABI issues with OFED 1.1

2006-08-24 Thread Sean Hefty
Michael S. Tsirkin wrote: > Maybe the librdmacm part should be merged to svn? > So librdmacm could try to read from misc, then from > /sys/class/infiniband/rdma_cm, and then assume latest. > It's good to have userspace code portable across distros ... I can go with that. - Sean
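
The lookup order suggested above (misc class first, then the infiniband class, then assume the latest ABI) could look roughly like the following. This is a sketch, not the librdmacm code: `read_abi_version` is a hypothetical helper, and the caller supplies the candidate paths and the fallback value.

```c
#include <stddef.h>
#include <stdio.h>

/* Try each candidate file in order and return the first integer that
 * can be parsed from one of them. If none can be read, return the
 * caller-supplied fallback ("assume latest"). */
static int read_abi_version(const char *const *paths, size_t n_paths,
                            int fallback)
{
    for (size_t i = 0; i < n_paths; i++) {
        FILE *f = fopen(paths[i], "r");
        int version;
        int ok;

        if (!f)
            continue;               /* missing file: try the next path */
        ok = (fscanf(f, "%d", &version) == 1);
        fclose(f);
        if (ok)
            return version;
    }
    return fallback;
}
```

A caller might pass "/sys/class/misc/rdma_cm/abi_version" followed by "/sys/class/infiniband/rdma_cm/abi_version"; the file name under the second directory is an assumption based on the first.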

Re: [openib-general] librdmacm ABI issues with OFED 1.1

2006-08-23 Thread Sean Hefty
>I have some rdma_cm test code and when I run with the OFED 1.1 code (running on >2.6.9 U3 based kernel) I got the following error. > >librdmacm: couldn't read ABI version. >librdmacm: assuming: 2 The RDMA CM places the abi_version file in /sys/class/misc/rdma_cm. The misc class didn't exist in 2

Re: [openib-general] basic IB doubt

2006-08-23 Thread Sean Hefty
>Actually, that leads me to a question: does the vendor of that adaptor >say that this is actually safe? I believe so. >most of the time doesn't mean it does it all of the time. So is it >really smart to write non-standard-conforming programs unless the >vendor stands behind that behavior? I'm n

Re: [openib-general] [PATCH] libibsa: userspace SA query and multicast support

2006-08-23 Thread Sean Hefty
>Donnu. I'm just speaking on the general principle that we should deny by >default, not allow by default. Which queries do you want to perform? At a minimum, I would expect the following queries: PathRecord MultiPathRecord MCMemberRecord ServiceRecord Support for ServiceRecord set/delete and In

Re: [openib-general] [PATCH 0/4] Dispatch communication related events to the IB CM

2006-08-23 Thread Sean Hefty
Roland Dreier wrote: > It's unfortunate that we have to add a special-case event hook for the > CM, but I guess the iWARP CM changes are so ugly anyway it doesn't > matter much. So I think committing this is OK. We also have the alternative of pushing the responsibility of notifying the CM of th

Re: [openib-general] [PATCH] libibsa: userspace SA query and multicast support

2006-08-23 Thread Sean Hefty
> Yea I had the same question. Shouldn't interface expose > just the specific queries that we need? I don't know what queries a user will want, and I'd rather not change the kernel ABI with every new query, but that is a possibility. Which queries are of concern? - Sean

Re: [openib-general] [PATCH] libibsa: userspace SA query and multicast support

2006-08-23 Thread Sean Hefty
Roland Dreier wrote: > What's the plan for how this would be used? We can't let unprivileged > userspace processes talk to the SA, because they could cause problems > like deleting someone else's multicast group membership. And I don't > think we want to try to do some elaborate filtering in the

Re: [openib-general] Rollup patch for ipath and OFED

2006-08-23 Thread Sean Hefty
Bryan O'Sullivan wrote: > SVN is not a high priority for me personally. Fixing things so that I > can send meaningful patches upstream in a timely manner is. Why not remove your code from SVN? - Sean

Re: [openib-general] basic IB doubt

2006-08-23 Thread Sean Hefty
Tang, Changqing wrote: > Can you give a few more words on 'immediate', I know A will have > A completion event in its CQ, Does B receive a CQ event on the > Same RDMA operation as well ? He means an RDMA write with immediate data. B will see a completion event for that operation. - Sean

Re: [openib-general] [PATCH 0/4] Dispatch communication related events to the IB CM

2006-08-23 Thread Sean Hefty
Sean Hefty wrote: > The following set of patches forwards communication related events to the IB > CM > for processing. Communication events of interest are communication > established > and path migration, with only the former currently handled by the IB CM. > > This

Re: [openib-general] [PATCH] ib_cm: randomize starting local comm IDs

2006-08-23 Thread Sean Hefty
Or Gerlitz wrote: > I have tested the patch against an iser target based on our Gen1 CM - > it works as expected. This has been committed in 9088. - Sean

Re: [openib-general] [PATCH] ib_cm: randomize starting local comm IDs

2006-08-23 Thread Sean Hefty
Michael S. Tsirkin wrote: > so I am wondering why is it not sufficient to wait for > the window of time as described above to expire? > Is something broken in CM that this patch is papering over? Yes. There are a couple of issues. The CM doesn't time when a REQ was received, and the local comm

Re: [openib-general] InfiniBand merge plans for 2.6.19

2006-08-23 Thread Sean Hefty
Or Gerlitz wrote: > OK. Now, if this (RC, UD, MCAST) turns to be too much for your > schedule before 2.6.19 opens up, does it make sense for you to push a > char device which supports only the CMA RC functionality and the UD > and multicast in the future? Yes - and the fact that I can pull the OF

Re: [openib-general] basic IB doubt

2006-08-23 Thread Sean Hefty
john t wrote: > I have a very basic doubt. Suppose Host A is doing RDMA write (say 8 > MB) to Host B. When data is copied into Host B's local buffer, is it > guaranteed that data will be copied starting from the first location > (first buffer address) to the last location (last buffer address)?

[openib-general] IB CM and the case of the lost RTU: was a bunch of other topics...

2006-08-22 Thread Sean Hefty
Or Gerlitz wrote: > Indeed, lets see if we can get some input from the ULP people working on > passive side / targets (eg NFS/Lustre/iSER/SDP). To recap (since it's been a couple of weeks), we have two general solutions for how to support the passive/server/target side of a connection: 1. One m

[openib-general] [PATCH] libibcm: modify API to support multi-threaded event processing

2006-08-22 Thread Sean Hefty
cy on libsysfs. The changes do not break the kernel ABI, but do break the library's API in such a way that requires (hopefully minor) changes to all existing users. Signed-off-by: Sean Hefty <[EMAIL PROTECTED]> --- Index: include

Re: [openib-general] [libibcm] does the libibcm support multithreaded applications?

2006-08-22 Thread Sean Hefty
Dotan Barak wrote: >>I understand what the problem is, and I think you're right. If >>ib_cm_get_device() returned a new ib_cm_device, you could more easily control >>event processing. I will fix this up when I remove the dependency on >>libsysfs >>from the libibcm. I am probably at least 2 w

[openib-general] [PATCH] ib_cm: randomize starting local comm IDs

2006-08-22 Thread Sean Hefty
Randomize the starting local comm ID to avoid getting a rejected connection due to a stale connection after a system reboot or reloading of the ib_cm. Signed-off-by: Sean Hefty <[EMAIL PROTECTED]> --- Index: cm.c === --

Re: [openib-general] Question about QP's in timewait state and CM stale conn rejects

2006-08-22 Thread Sean Hefty
>Cool, I would go for XOR-ing a random value with the **local id** . > >Sean, my understanding it can be narrowed for doing so in: > >1) cm_alloc_id() after calling idr_get_new_above() >2) cm_free_id() before calling idr_remove() >3) cm_get_id() before calling idr_find() > >and initializing the ran
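
The XOR idea in the quoted suggestion can be sketched as a pair of inverse mappings between the allocator's index and the externally visible local ID. The function names and the way the random operand is passed in are hypothetical; the sketch does not reproduce the actual ib_cm code or its idr calls.

```c
#include <stdint.h>

/* Map an allocator index to the local comm ID that is sent on the
 * wire. random_operand would be chosen once at module load (assumed),
 * so IDs differ across reboots or reloads. */
static uint32_t cm_index_to_local_id(uint32_t index, uint32_t random_operand)
{
    return index ^ random_operand;
}

/* Inverse mapping, applied before looking up or freeing an ID.
 * XOR with the same operand restores the original index. */
static uint32_t cm_local_id_to_index(uint32_t local_id,
                                     uint32_t random_operand)
{
    return local_id ^ random_operand;
}
```

Because XOR is its own inverse, the allocator can keep handing out small, dense indices while peers only ever see the randomized values.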

Re: [openib-general] libibcm can't open /dev/infiniband/ucm0

2006-08-22 Thread Sean Hefty
>https://openib.org/tiki/tiki-index.php?page=Install+OpenIB+for+Chelsio+T >3&highlight=udev The udev information for this link looks correct. >Or is there another way/description? You can run mknod to manually create the file. (See the README file in the libibcm directory.) >Additionally I did

Re: [openib-general] InfiniBand merge plans for 2.6.19

2006-08-22 Thread Sean Hefty
>What about pushing the char device to support user space CMA, i recall >that you have mentioned the API was not mature enough when the 2.6.18 >feature merge window was open. I will look at doing this. I need to verify what functionality (RC, UD, multicast) of the kernel RDMA CM we want merged up

[openib-general] [PATCH] ib_usa: support userspace SA queries and multicast

2006-08-21 Thread Sean Hefty
Add support for userspace SA queries and multicast join operations. This allows a userspace library to issue SA queries and join IB multicast groups. Signed-off-by: Sean Hefty <[EMAIL PROTECTED]> --- This patch depends on the generic RMPP query interface: http://openib.org/pipermail/

[openib-general] [PATCH 2/2] ib_local_sa: use SA iterator routines to walk RMPP response

2006-08-21 Thread Sean Hefty
Convert local SA to use the new SA iterator routines for walking a list of attributes in an RMPP response returned by the SA. This replaces a local SA specific implementation. Signed-off-by: Sean Hefty --- --- infiniband/core/local_sa.c 2006-08-21 16:40:23.760246472 -0700 +++ infiniband.user

[openib-general] [PATCH 1/2] ib_sa: add generic RMPP query interface

2006-08-21 Thread Sean Hefty
of existing SA query routines was layered on top of the generic query interface. Signed-off-by: Sean Hefty --- This patch applies on top of the SA registration patch: http://openib.org/pipermail/openib-general/2006-August/025267.html --- infiniband/include/rdma/ib_sa.h 2006-08-21 16:37

[openib-general] [PATCH v3] ib_sa: require SA registration

2006-08-21 Thread Sean Hefty
Require registration with SA module, to prevent module text from going away while sa query callback is still running, and update all users. Signed-off-by: Michael S. Tsirkin Signed-off-by: Sean Hefty --- Changes from the previous post include: * Move struct ib_sa_client definition external

Re: [openib-general] [openfabrics-ewg] Rollup patch for ipath and OFED

2006-08-21 Thread Sean Hefty
>But *development* is not usually done on stable tree - it is merged there. >See the difference? Let's keep this simple. We submit patches (which are expected to compile and run) against the "latest" code. Today, that is the tip of gen2 branch in SVN. - Sean

Re: [openib-general] [openfabrics-ewg] Rollup patch for ipath and OFED

2006-08-21 Thread Sean Hefty
Michael S. Tsirkin wrote: > Simply put, ideally each component should be developed > separately against upstream versions of the rest of > them. While this sounds good, it implies that the components are somewhat isolated from each other, which often isn't the case. > Maybe Sean can start publis

Re: [openib-general] [PATCH] cmpost: allow cmpost to build with latest RDMA CM

2006-08-21 Thread Sean Hefty
Bub Thomas wrote: > as I understand cmpost.c and simple.c were originally pure libibcm > examples. simple.c was originally a pure libibcm example, but it never actually established any connections. Cmpost has always relied on a separate library to obtain path record information. > Is there an

Re: [openib-general] libibcm can't open /dev/infiniband/ucm0

2006-08-21 Thread Sean Hefty
Bub Thomas wrote: > Here is the list of all loaded ib modules and their dependencies: > > ib_rds 37656 0 > ib_ucm 21512 0 Did you update udev rules to create the device? - Sean

Re: [openib-general] [PATCH v3 0/6] Tranport Neutral Verbs Proposal.

2006-08-21 Thread Sean Hefty
Krishna Kumar2 wrote: > What is your opinion on this patch set ? Anything else needs to be done > for acceptance ? I don't have any issues with it, but Roland would need to commit the changes to verbs as the first step. - Sean

Re: [openib-general] Question about QP's in timewait state and CM stale conn rejects

2006-08-20 Thread Sean Hefty
>> If we get here, this means that the REQ was a new REQ and not a >> duplicate, but the remote_id or remote_qpn is already in use. We need >> to reject the new REQ as containing stale data. > >I don't follow, if we get to the else case its as of cm_get_id() >returning NULL. This holds when idr_fi

Re: [openib-general] Question about QP's in timewait state and CM stale conn rejects

2006-08-20 Thread Sean Hefty
>> Just to emphasize what Sean has pointed out, you are asking how can a CM >> consumer know that a **local** QPN is not in the timewait state >> according to the **remote** CM. Since the issue is with the remote CM, >> it seems to me that pushing down timewait into verbs is not the correct >> dire

Re: [openib-general] Question about QP's in timewait state and CM stale conn rejects

2006-08-20 Thread Sean Hefty
>How about (for the meantime, till this rework is designed && done) going >to projecting the initial random local id into the range of (say) >[0-1022] (i think 1023 is prime, if not choose a prime near it) this way >with very good probability and with very little overhead on memory >consumption a c
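The quoted proposal, folding a random starting value into a small range, can be sketched as below. This is only an illustration of the idea, not code from the CM: the function and macro names are hypothetical. Note that 1023 = 3 x 11 x 31 is not prime; the sketch uses 1021, the largest prime below it, so the resulting range is [0-1020].

```c
#include <stdint.h>

/* Hypothetical constant: largest prime below 1023 (1023 = 3 * 11 * 31). */
#define CM_ID_START_RANGE 1021u

/*
 * Hypothetical helper: fold an arbitrary 32-bit random value into
 * [0, CM_ID_START_RANGE - 1] so the initial local id stays small.
 */
static uint32_t project_initial_id(uint32_t random_value)
{
    return random_value % CM_ID_START_RANGE;
}
```

A caller would pass whatever random source it already has, for example `project_initial_id(seed)`, and use the result as the first id to request.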

Re: [openib-general] [PATCH v2][RDMA CM] IB mcast fix

2006-08-18 Thread Sean Hefty
Thanks! committed in 9008. - Sean

Re: [openib-general] [PATCH] cmpost: allow cmpost to build with latest RDMA CM

2006-08-18 Thread Sean Hefty
Bub Thomas wrote: > Can I still use the LID, GUID and SubnetID for connection establishment > then? The Gen1 counterpart has no IP over IB running. If IPoIB is not running, then you will need to use the IB CM directly. The RDMA CM uses ARP to resolve IP addresses to GIDs. > I'm using OFED-1.0

Re: [openib-general] libibcm can't open /dev/infiniband/ucm0

2006-08-18 Thread Sean Hefty
Bub Thomas wrote: > It seems as if the problem I had there was not in my code but the > libibcm not being able to open the device /dev/infiniband/ucm0. You will need to load ib_ucm, which exports the IB CM to userspace. - Sean
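A quick way to tell whether the device node is there at all, before suspecting libibcm, is to stat it. The following is a minimal sketch using only POSIX calls; the enum and function names are made up for illustration and are not part of libibcm.

```c
#include <errno.h>
#include <sys/stat.h>

/* Hypothetical result codes for this sketch. */
enum node_status {
    NODE_OK,          /* path exists and is a character device */
    NODE_MISSING,     /* path does not exist */
    NODE_NOT_CHARDEV, /* path exists but is not a character device */
    NODE_ERROR        /* stat() failed for some other reason */
};

/* Check whether a path such as /dev/infiniband/ucm0 is a usable device node. */
static enum node_status check_cm_node(const char *path)
{
    struct stat st;

    if (stat(path, &st) != 0)
        return errno == ENOENT ? NODE_MISSING : NODE_ERROR;

    return S_ISCHR(st.st_mode) ? NODE_OK : NODE_NOT_CHARDEV;
}
```

Calling `check_cm_node("/dev/infiniband/ucm0")` and getting `NODE_MISSING` means the node was never created, which points at the module not being loaded or at the udev setup rather than at the application.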

Re: [openib-general] [PATCH][RDMA CM] IB mcast fix

2006-08-17 Thread Sean Hefty
Steve Wise wrote: > You mean for kernel mode right? I did it for user mode. Yes - see cma_sidr_rep_handler() and cma_send_sidr_rep() in kernel cma.c. You'll want to run cmatose, mckey, and udaddy to verify that all changes are correct. - Sean

Re: [openib-general] [PATCH][RDMA CM] IB mcast fix

2006-08-17 Thread Sean Hefty
Steve Wise wrote: > Index: src/linux-kernel/infiniband/core/cma.c > === > --- src/linux-kernel/infiniband/core/cma.c(revision 9004) > +++ src/linux-kernel/infiniband/core/cma.c(working copy) > @@ -2172,7 +2172,7 @@ > ib_a

Re: [openib-general] Question about QP's in timewait state and CM stale conn rejects

2006-08-17 Thread Sean Hefty
Or Gerlitz wrote: > If you don't mind (also related to the patch you have sent Eric of > randomizing the initial local cm id) to get into this deeper, can we do There's an issue trying to randomize the initial local CM ID. The way the IDR works, if you start at a high value, then the IDR size

Re: [openib-general] [PATCH] cmpost: allow cmpost to build with latest RDMA CM

2006-08-17 Thread Sean Hefty
Bub Thomas wrote: > I'm getting a little puzzled. > For me it seems as if we are moving in the wrong direction. > I don't have a RDMA CM on the Gen1 counterpart that my gen2 application > is talking to. The RDMA CM is only used on the local (active or client) side to obtain a path record, which

Re: [openib-general] Question about QP's in timewait state and CM stale conn rejects

2006-08-16 Thread Sean Hefty
Arlin Davis wrote: > How can a consumer know for sure that the new QP will not be in a > timewait state according to the CM? Given that the QP may have been in use by another process, I don't think that there's any way for the new owner to know. > Does it make sense to push the timewait functio

Re: [openib-general] return error when rdma_listen fails

2006-08-16 Thread Sean Hefty
Tom Tucker wrote: > I think this makes sense for IB, however, for TCP based transports, we > should share the port space with TCP. My view is that the iWarp transport needs to provide the mapping from an RDMA_PS_TCP to the actual TCP port space, RDMA_PS_UDP to UDP, etc. This is a function that

Re: [openib-general] return error when rdma_listen fails

2006-08-16 Thread Sean Hefty
Pete Wyckoff wrote: > 1) When a device gets added to the system, is there code that applies > existing INADDR_ANY listens to the new device? Where? Yes - see cma_add_one() where cma_listen_on_dev() is called. > By the way, shouldn't the rdma_bind_addr call that preceded > rdma_listen have faile

Re: [openib-general] [PATCH] cmpost: allow cmpost to build with latest RDMA CM

2006-08-16 Thread Sean Hefty
Bub Thomas wrote: > cmpost.c:65: error: field `path_rec' has incomplete type > cmpost.c: In function `int modify_to_rtr(cmtest_node*)': > cmpost.c:130: error: invalid conversion from `int' to `ibv_qp_attr_mask' > cmpost.c:130: error: initializing argument 3 of `int > ibv_modify_qp(ibv_qp*, ibv_qp
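The "invalid conversion from `int'" message is what a C++ compiler reports when OR-ed enumerators are passed where an enum type is expected: the OR yields an int, and C++ (unlike C) will not convert that back to the enum implicitly. The sketch below shows one way around this class of error with an explicit cast. The enum and functions are hypothetical stand-ins, not the libibverbs definitions, and this is not necessarily the fix applied to cmpost.

```c
/* Hypothetical stand-in for an attribute-mask enum; not the libibverbs type. */
enum demo_attr_mask {
    DEMO_STATE    = 1 << 0,
    DEMO_PATH_MTU = 1 << 1,
    DEMO_DEST_QPN = 1 << 2
};

/* Hypothetical stand-in for a call that takes the mask as an enum argument. */
static int demo_modify(enum demo_attr_mask mask)
{
    return (int)mask;
}

static int demo_to_rtr(void)
{
    /*
     * DEMO_STATE | DEMO_PATH_MTU | DEMO_DEST_QPN has type int.
     * The explicit cast is required when this file is compiled as C++
     * and is harmless when it is compiled as C.
     */
    return demo_modify((enum demo_attr_mask)
                       (DEMO_STATE | DEMO_PATH_MTU | DEMO_DEST_QPN));
}
```

The same source then builds with either a C or a C++ compiler, which matters for example programs that users may compile as part of a C++ application.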

Re: [openib-general] openIB question

2006-08-16 Thread Sean Hefty
john t wrote: > The example code does not seem to support this. It first copies data to > a local buffer (which is not required in my case) and only then it could > send it over other QP. Is there a more efficient way to do this? Maybe I'm not following you here, but the data should go directly
