RE: [openib-general] uAT issues after SM node bounced

2005-08-19 Thread Sean Hefty
Sean and I are seeing some issues with uAT when our dedicated SM node bounces. Sean saw a kernel oops (I will let him send output) and I see the following console message with my failing ib_at_ips_by_gid requests: Here's the bug check that I saw. I haven't spent any time debugging this yet. -

[openib-general] [PATCH] [uDAPL] update to new uCM API

2005-08-19 Thread Sean Hefty
This patch updates uDAPL to the new uCM API. It only fixes the build issues at this point and does not try to optimize for the use of the new API. That will come in a later patch. James, I can commit this when committing the uCM changes if that's okay. - Sean Index:

RE: [openib-general] [PATCH] [uDAPL] update to new uCM API

2005-08-20 Thread Sean Hefty
This patch updates uDAPL to the new uCM API. It only fixes the build issues at this point and does not try to optimize for the use of the new API. That will come in a later patch. FYI: I've sent a patch for the updated uCM three times, but I don't see where it's shown up on the openib mailing

[openib-general] RE: RMPP Message Format Errors

2005-08-21 Thread Sean Hefty
Please let me know if you will have time to dig into these problems or if I should try and resolve them myself and provide patches. I will not be able to look at this until early next week (with IDF running this week), but I will try to do so. Note

RE: [openib-general] RDMA connection and address translation API

2005-08-24 Thread Sean Hefty
However, there's another problem with trying to lump address translation and connection into a single connect call, and this problem looks fundamental and fatal to me. The connect call takes a QP pointer, but to create a QP the consumer needs to know which local device to use. However, the

[openib-general] RE: RMPP Message Format Errors

2005-08-24 Thread Sean Hefty
But the receive side needs to calculate back the correct size of the assembled MAD. If it is done in kernel or user it does not matter. To my best knowledge the only way to calculate how many records are enclosed in an RMPP message is to use the paylen and offset. How can it be done without

RE: [openib-general] RDMA connection and address translation API

2005-08-24 Thread Sean Hefty
Fab Why can't the IPV field be ignored? If a listen wants only Fab IPV4 addresses, it would specify a 16-byte compare buffer Fab with the first 12 bytes zero, the next 4 filled with the IPV4 Fab address, and would set the offset to that of the hello Fab message's destination
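The compare-buffer idea quoted above can be sketched roughly as follows. This is only an illustration under stated assumptions: the helper names are hypothetical, and the layout of the hello message and the offset of its destination address are not shown in the excerpt, so the offset is simply passed in by the caller.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

#define ADDR_LEN 16 /* size of the address field, per the excerpt */

/* Hypothetical helper: build the 16-byte compare buffer for an IPv4-only
 * listen. The first 12 bytes are zero, the last 4 hold the IPv4 address. */
static void fill_ipv4_compare(uint8_t buf[ADDR_LEN], const uint8_t ipv4[4])
{
    memset(buf, 0, 12);
    memcpy(buf + 12, ipv4, 4);
}

/* Hypothetical helper: return 1 if the ADDR_LEN bytes found at `offset`
 * in the private data equal the compare buffer, and 0 otherwise. A buffer
 * too short to hold the address field at that offset never matches. */
static int private_data_matches(const uint8_t *pdata, size_t pdata_len,
                                size_t offset, const uint8_t cmp[ADDR_LEN])
{
    if (offset > pdata_len || pdata_len - offset < ADDR_LEN)
        return 0;
    return memcmp(pdata + offset, cmp, ADDR_LEN) == 0;
}
```

A listen that only wants IPv4 traffic would fill the buffer once and then run the comparison against the private data of each incoming request.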

RE: [openib-general] RDMA connection and address translation API

2005-08-24 Thread Sean Hefty
I think if all ULPs provide their source and destination IP in the private data, you can eliminate the reverse lookup altogether. A simple forward lookup is all that's needed to validate that the source GID in the REQ matches the reported source IP in the private data. The forward lookup

RE: [openib-general] RDMA connection and address translation API

2005-08-24 Thread Sean Hefty
Because it would be better to configure your network properly. Putting IP addresses in private data is fundamentally insecure since any user mode client can spoof the IP address. A simple forward lookup could detect this. - Sean

RE: [openib-general] [PATCH][iWARP] Added provider CM verbs and queryprovider methods

2005-08-24 Thread Sean Hefty
@@ -59,7 +60,8 @@ enum ib_node_type { IB_NODE_CA = 1, IB_NODE_SWITCH, - IB_NODE_ROUTER + IB_NODE_ROUTER, + IB_NODE_IWARP }; I guess I'm not sure what an iWarp node is or how it would be used. +/* Connection events. */ +enum ib_xcm_event_type { +

RE: [openib-general] RDMA connection and address translation API

2005-08-24 Thread Sean Hefty
With this in mind, I believe that the connection API needs to be something more like the following: rdma_resolve_address(): inputs: dest IP address, qos, npaths, done callback, opaque context done callback params: status, local RDMA device, RDMA

RE: [openib-general] [PATCH][iWARP] Added provider CM verbs and queryprovider methods

2005-08-25 Thread Sean Hefty
Why include the connection protocol as part of the verbs layer? Granted I haven't looked at the iWarp specs in a long time, but I don't remember connection establishment being part of the verbs. Connection management is not part of the RDMAC verbs, however, we need some way for transports to

RE: [openib-general] cma header - change some things according to the list feedback

2005-08-25 Thread Sean Hefty
typedef void (*ib_cma_ac_handler)(enum ib_cma_event event, void *context); typedef void (*ib_cma_event_handler)(enum ib_cma_event event, void *context, void *private_data); Would ib_cma_conn_handler be more appropriate? Maybe, but it is actually the active

RE: [openib-general] [PATCH][iWARP] Added provider CM verbs and queryprovider methods

2005-08-25 Thread Sean Hefty
The Ammasso 1100 does do 100% connection setup. That's why we're pushing connection establishment verbs into the device struct. IMO, these functions are analogous to the process_mad function in the ib_device structs, which has no meaning to an iwarp device. So I think we have to admit up front,

RE: [openib-general] RDMA connection and address translation API

2005-08-25 Thread Sean Hefty
Sean Another possibility could be to add a list of receives to Sean rdma_connect(). Guy I added this to both connect and accept calls I don't think this is a good idea. Let's try to streamline the connect call, not add every single possible feature to it. I don't think that we want

RE: [openib-general] Re: RMPP Message Format Errors

2005-08-25 Thread Sean Hefty
if (rmpp_active) { ... rmpp_mad->rmpp_hdr.paylen_newwin = cpu_to_be32(hdr_len - offsetof(struct ib_rmpp_mad, data) + data_len); Then in mad_rmpp.c::send_next_seg, I see: if (mad_send_wr->seg_num == 1) {

RE: [openib-general] [PATCH][iWARP] Added provider CM verbs and queryprovider methods

2005-08-25 Thread Sean Hefty
From NetEffect's perspective, the per device approach is simple to implement and I do not see it as an Ammasso specific approach. As Caitlin described, existing code needs to be reorganized but this aspect of our port is not a major effort. I agree that connection setup code could be duplicated

RE: [openib-general] Re: RMPP Message Format Errors

2005-08-26 Thread Sean Hefty
The 220 byte payload length is for SA. That's mostly right but assumes the last segment will be full (and accounted for by the paylen in the last segment). I believe that the 220 byte payload length is for all RMPP MADs. Only the common and RMPP header lengths are ignored. Doesn't it need to

RE: [openib-general] [PATCH][iWARP] Added provider CM verbs and queryprovider methods

2005-08-26 Thread Sean Hefty
I believe that what you are advocating is having a mini-TCP-stack in Linux. This mini-TCP-stack knows how to establish connections which are then passed down to the adapter. This mini-stack would comprise the iWARP side of a unified connection manager. I am not advocating this, but I believe that

[openib-general] RE: [PATCH][iWARP] IW CM Verbs

2005-08-26 Thread Sean Hefty
Please comment, and if it looks good, I'll commit this to the iWARP branch tonight. Looks fine. See one minor comment below. +/* This is provided in the event generated by a remote + * connect request to a listening endpoint + */ +struct iw_conn_request { + int cr_id;

RE: [openib-general] kernel oops

2005-08-26 Thread Sean Hefty
1. I rebooted both the machines, started opensm, after LID assignment killed opensm. Next started the ucmpost client/server, killing it panics the system This definitely shouldn't crash the systems, so there's a bug that needs to be fixed. But the tests will not work unless an SM is running

RE: [openib-general] Re: RMPP Message Format Errors

2005-08-29 Thread Sean Hefty
In my interpretation, partial data is indicated by the PayloadLength field in the last segment only. It's quite possible that my interpretation is incorrect, in which case the calculation in the RMPP code is off. I agree the text might be missing an example or two for clarification. Anyway, we

Re: [openib-general] [PATCH][iWARP] IW CM Verbs

2005-08-29 Thread Sean Hefty
James Lentini wrote: Why does the ib_device need a cm structure for iWARP but not IB? If you used either Guy or Roland's generic RDMA connection API and did the iWARP implementation, would you still need to add the iw_cm structure? Their connection protocol is implemented in hardware. Even

Re: [openib-general] RDMA Generic Connection Management

2005-08-29 Thread Sean Hefty
Guy German wrote: - ib_cma_get_device (...) /* get device(1) or device+path(2) */ - pd = ib_alloc_pd(...) /* pd allocated in the given device */ - qp = ib_cma_create_qp(...) /* qp returned in init state */ - ib_post_recv(qp, ...); - ib_cma_connect (qp, dst_addr(1)/path(2), ...); To focus on

Re: [openib-general] Re: [PATCH] ipoib: device removal races

2005-08-29 Thread Sean Hefty
Michael S. Tsirkin wrote: It's an SA query, so I'm not sure why you would want to modify a QP there. Further, please note that in the current API the callback is always called even if the query is cancelled. And clearly you can't allow cancel under a spinlock and at the same time ensure callback

[openib-general] Re: RDMA Generic Connection Management

2005-08-29 Thread Sean Hefty
Michael S. Tsirkin wrote: How is this different from what we have with ib_verbs now? With ib_verbs, users receive notification of device addition/removal. This interface doesn't require receiving that notification. I think that reasonable ULPs must register for hotplug events in the ib

Re: [openib-general] kernel oops

2005-08-30 Thread Sean Hefty
Viswanath Krishnamurthy wrote: Call Trace: [c013e410] __alloc_pages+0x166/0x3b6 [c0267637] ib_get_client_data+0x14/0x54 [c027390f] ib_sa_path_rec_get+0x1b/0x13e [c027952f] resolve_path+0x8c/0x15b [c0278ff2] path_req_complete+0x0/0xf7 [c02a9932] rtnetlink_dump_all+0x0/0x9e [c02a9a6d]

Re: [openib-general] kernel oops

2005-08-30 Thread Sean Hefty
Hal Rosenstock wrote: Why would ib_at_paths_by_route be called if no route were obtained (from ib_at_route_by_ip) ? Isn't that a ucmpost issue ? (I also agree it's not good for UAT to crash). The assumption that I made was that the call to ib_at_route_by_ip() would fail if given an invalid

Re: [openib-general] Re: RMPP Message Format Errors

2005-08-30 Thread Sean Hefty
Hal Rosenstock wrote: I already submitted a patch for this. It wasn't clear to me what the answer for the first segment is from Greg's response (so I sent a followup to clarify that). Hal, can you go ahead and commit your two patches for payload length changes for RMPP? - Sean

Re: [openib-general] Re: RMPP Message Format Errors

2005-08-30 Thread Sean Hefty
Hal Rosenstock wrote: Hal, can you go ahead and commit your two patches for payload length changes for RMPP? Do you think this is the correct interpretation ? If so, I will go ahead. I was waiting for confirmation. The interpretation of payload length for the first segment value looks

Re: [openib-general] RDMA Generic Connection Management

2005-08-30 Thread Sean Hefty
Roland Dreier wrote: Sean I should also point out that the kernel CM returns a device Sean pointer when reporting REQ and SIDR_REQ events, so it has Sean similar issues supporting hotplug. Hmm, good point. Perhaps we should make IB CM listens be per-device? Then the consumer is in

[openib-general] ibv_get_async_event

2005-08-30 Thread Sean Hefty
This was brought up before, but to summarize, ibv_get_async_event() can return events for objects (CQ, QP, SRQ) that may have been destroyed. Likewise for ibv_get_cq_event(). Roland, would a patch to fix this that is similar to what was done for uCM be acceptable? (I can describe the method

Re: [openib-general] kernel oops

2005-08-30 Thread Sean Hefty
Hal Rosenstock wrote: Can we just remove this field and use the sgid to locate the correct device structure in the kernel, or fail if it cannot be located? That seems like a good idea. Quickly skimming through the code I couldn't easily locate where AT maintained a device list, or how it

Re: [openib-general] Re: ibv_get_async_event

2005-08-30 Thread Sean Hefty
Fab Tillier wrote: Hmm, I'd rather just sweep through the list of events when we destroy a CQ/QP/SRQ and delete any events that refer to the object we're destroying. It's on my to-do list but I'll definitely take patches if you do it first. Couldn't an event be in flight when the user

Re: [openib-general] RDMA Generic Connection Management

2005-08-31 Thread Sean Hefty
Roland Dreier wrote: Hey, that's a really good point. We should make sure that our API makes it easy to handle device hotplug. One solution is to start reference counting device references, but that inevitably leads to bugs in ULPs -- protocol authors won't get it right unless we make it

Re: [openib-general] RDMA Generic Connection Management

2005-08-31 Thread Sean Hefty
Roland Dreier wrote: This is the hard part: one CPU could start calling into a consumer with a valid device, but get delayed by an interrupt or something. In the meantime, another CPU could remove that device from the consumer, and then when the first notification finally arrives, it's no

Re: [openib-general] Re: RDMA Generic Connection Management

2005-08-31 Thread Sean Hefty
Yaron Haviv wrote: If all the ULPs need to do exactly the same, or the implementation is different for IB/iWarp, then we should probably do it under the API like it's defined in kDAPL. To do this means destroying QPs, CQs, PDs, MRs, etc. under the API. I don't see that you want to do this.

Re: [openib-general] [PATCH] hotplug support: selective removal notification

2005-08-31 Thread Sean Hefty
Michael S. Tsirkin wrote: Hi! As Sean pointed out, in the existing client registration the client gets removal events even from devices which it may not be interested in. I was actually trying to point out that a remove event may occur before a client receives a device pointer from another

Re: [openib-general] List of issues in uverbs

2005-08-31 Thread Sean Hefty
viswanath krishnamurthy wrote: 1. ib_cm_destroy_id(cm_id) hangs (does return to the caller) Is there a particular shutdown sequence that needs to be followed ? Is there a trace/debug I can enable ? There's no significant debug to enable. What app are you running that's calling

Re: [openib-general] List of issues in uverbs

2005-08-31 Thread Sean Hefty
viswanath krishnamurthy wrote: Probably called from a callback. The application is a small application which accepts incoming connections (like a socket server). When is a good time to call the destroy? You need to call ib_cm_event_put() after processing a CM event. You can call

[openib-general] [RFC] change to ib_create_cm_id()

2005-08-31 Thread Sean Hefty
I'm considering changing the function: ib_create_cm_id(cm_handler, context); to ib_create_cm_id(device, cm_handler, context); This will bind all cm_id's to a specific device, including cm_id's associated with listens. This will help prevent the CM from returning a cm_id associated with a

Re: [openib-general] kernel oops

2005-09-01 Thread Sean Hefty
Hal Rosenstock wrote: Here's a patch for this. Let me know if it works. [I tried it out and it works for me.] If it does, the next question is how does the pointer get trashed. I don't think that the pointer is getting trashed. The SA was not running, so I don't think that any route was

Re: [openib-general] RDMA Generic Connection Management

2005-09-01 Thread Sean Hefty
Gleb Natapov wrote: On Wed, Aug 31, 2005 at 11:11:40AM -0700, Sean Hefty wrote: I don't have a good solution yet for calls like ib_cma_get_device(). Yet another possibility is to have it return a device pointer in a callback. Then it can synchronize with device removal internally. What

[openib-general] Re: [RFC] change to ib_create_cm_id()

2005-09-01 Thread Sean Hefty
Michael S. Tsirkin wrote: This will bind all cm_id's to a specific device, including cm_id's associated with listens. This will help prevent the CM from returning a cm_id associated with a device that a consumer may have already seen as removed. Looking at the API, cm_ids are not currently

Re: [openib-general] Re: [PATCH] memory leaks in ipoib, srp

2005-09-01 Thread Sean Hefty
Michael S. Tsirkin wrote: An additional thinking behind this is: ULPs (e.g. SDP, CM) need to keep lists of per-device objects and kill them on device removal. For example with change Sean proposes SDP will need to keep a list of per-device cm_ids in each connection. One idea, then, is in this

[openib-general] add guid to struct ib_device

2005-09-01 Thread Sean Hefty
Is there any objection to adding the node_guid to struct ib_device? The CM queries for this, and it looks like SRP does too. To support per device listens from userspace, I was considering adding the same functionality to uCM as well. I didn't see where RNICs have this concept; although,

[openib-general] Re: [RFC] change to ib_create_cm_id()

2005-09-01 Thread Sean Hefty
Michael S. Tsirkin wrote: The proposal is to change the cm_id's so that they become associated with a specific device. Currently, they are not visibly associated with a device. I understand. But you said this will help prevent the CM from returning a cm_id associated with a device which seems

Re: [openib-general] Re: ibv_get_async_event

2005-09-02 Thread Sean Hefty
Roland Dreier wrote: Arlin Shouldn't there be a new ibv_put_cq_event() to go with the Arlin ibv_get_cq_event() ? No, I think that's dealt with by sweeping the CQ in userspace when destroying a QP. I don't think that sweeping the CQ in userspace eliminates the race. The call to

Re: [openib-general] Re: ibv_get_async_event

2005-09-02 Thread Sean Hefty
Roland Dreier wrote: I take this to mean that it's fine if CQEs _are_ retrieved after a QP is destroyed. Since a CQE does not have a pointer to the QP, but only a QP number and a consumer-defined work request ID, I think this is OK; there's no direct reference to a stale resource. I was

[openib-general] Re: ibv_get_async_event

2005-09-06 Thread Sean Hefty
Michael S. Tsirkin wrote: If that's what we are trying to do, I'd like to propose another idea: when cq is destroyed, and after all cq events are queued to user, put a special cq destroyed event into the event queue. When a user calls destroy_cq, and after releasing kernel/hardware resources,

[openib-general] Re: ibv_get_async_event

2005-09-06 Thread Sean Hefty
Roland Dreier wrote: Sean I think that this will only work if users are using a single Sean thread to poll for events. I don't think that we want to Sean impose such a restriction. But I think Michael has a point. Do we really want to impose the cost of an extra

[openib-general] Re: ibv_get_async_event

2005-09-06 Thread Sean Hefty
Roland Dreier wrote: Sean Does the problem go away if we require users to poll for all Sean CQ events after destroying a QP, but before destroying a CQ? I don't see how an app could do this. It doesn't know how many CQ events it needs to retrieve, and there could be arbitrarily many

[openib-general] ibv_get_device_guid() not byte swapping

2005-09-06 Thread Sean Hefty
Has anyone else seen errors with byte swapping in ibv_get_device_guid()? I'm seeing a condition where the initial 2-bytes of the GUID are not swapped. The actual code in device.c appears to be correct, and if I insert a printf before returning from the call, then the returned GUID is correct.

[openib-general] Re: ibv_get_async_event

2005-09-06 Thread Sean Hefty
Roland Dreier wrote: The API I came up with is the following: /** * ibv_ack_cq_events - Free an async event * @cq: CQ to acknowledge events for * @nevents: Number of events to acknowledge. * * All completion events which are returned by

[openib-general] [PATCH] use union in ibv_get_device_guid()

2005-09-06 Thread Sean Hefty
This patch replaces the uint16_t array with a union to avoid a compiler related optimization issue with SuSE gcc 3.3.3. Signed-off-by: Sean Hefty [EMAIL PROTECTED] Index: libibverbs/src/device.c === --- libibverbs/src/device.c
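The union approach described in this patch might look roughly like the sketch below. It is not the code from the patch: the helper name is hypothetical, and it assumes the GUID arrives as a string of four colon-separated 16-bit hex groups. Each group is byte-swapped with `htons()` and stored through the union, and the 64-bit value is read back through the same union rather than by casting a `uint16_t` array to `uint64_t`.

```c
#include <arpa/inet.h> /* htons (POSIX) */
#include <assert.h>
#include <stdint.h>
#include <stdio.h>
#include <string.h>

/* Hypothetical helper, not the code from the patch: parse a GUID string of
 * the form "xxxx:xxxx:xxxx:xxxx" into a 64-bit value in network byte order.
 * Returns 0 on success, -1 on a malformed string. */
static int parse_guid_be(const char *str, uint64_t *guid_be)
{
    union {
        uint64_t guid;
        uint16_t parts[4];
    } u;
    unsigned int p[4];
    int i;

    if (sscanf(str, "%4x:%4x:%4x:%4x", &p[0], &p[1], &p[2], &p[3]) != 4)
        return -1;

    /* Swap each 16-bit group into network order and store it in place. */
    for (i = 0; i < 4; i++)
        u.parts[i] = htons((uint16_t)p[i]);

    /* Read the assembled value back through the same union object. */
    *guid_be = u.guid;
    return 0;
}
```

Because both the 16-bit stores and the 64-bit load go through members of one union object, the compiler has no grounds to treat them as unrelated accesses, which is one plausible reading of the optimization issue mentioned above.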

[openib-general] [PATCH] [CM] 1/6 core kernel changes to bind cm_id's to a device

2005-09-06 Thread Sean Hefty
The following patch will bind communication identifiers to a specific device. Signed-off-by: Sean Hefty [EMAIL PROTECTED] Index: core/cm.c === --- core/cm.c (revision 3295) +++ core/cm.c (working copy) @@ -365,9 +365,15

RE: [openib-general] [PATCH] [CM] 2/6 SRP updates to bind cm_id's to a device

2005-09-06 Thread Sean Hefty
This patch should update SRP to use the new ib_create_cm_id() API. This patch is untested. Signed-off-by: Sean Hefty [EMAIL PROTECTED] Index: ulp/srp/ib_srp.c === --- ulp/srp/ib_srp.c(revision 3295) +++ ulp/srp/ib_srp.c

RE: [openib-general] [PATCH] [CM] 3/6 SDP updates to bind cm_id's to a device

2005-09-06 Thread Sean Hefty
This patch updates SDP to use the new ib_create_cm_id() API. It also replaces the state driven CM callback processing model with the more reliable event driven processing model. This patch is for review and is untested. Signed-off-by: Sean Hefty [EMAIL PROTECTED] Index: ulp/sdp/sdp_actv.c

RE: [openib-general] [PATCH] [CM] 4/6 userspace CM changes for per device cm_id's

2005-09-06 Thread Sean Hefty
This patch extends binding cm_id's to a device to userspace. Signed-off-by: Sean Hefty [EMAIL PROTECTED] Index: libibcm/include/infiniband/cm_abi.h === --- libibcm/include/infiniband/cm_abi.h (revision 3295) +++ libibcm/include

RE: [openib-general] [PATCH] [CM] 6/6 update cmpost to use per device cm_id's

2005-09-06 Thread Sean Hefty
Patch updates kernel cmpost test utility to updated ib_create_cm_id() API. Signed-off-by: Sean Hefty [EMAIL PROTECTED] Index: cmpost/cmpost.c === --- cmpost/cmpost.c (revision 3327) +++ cmpost/cmpost.c (working copy

RE: [openib-general] [PATCH] [CM] 1/6 core kernel changes to bind cm_id's to a device

2005-09-06 Thread Sean Hefty
Now that cm_id's are per-IB-device, does it make sense to have the userspace CM create a character node for each IB device? It seems that might simplify the interface. That makes sense. I'll work on updating to that model. - Sean

Re: [openib-general] Re: [PATCH] [CM] 3/6 SDP updates to bind cm_id's to a device

2005-09-07 Thread Sean Hefty
Michael S. Tsirkin wrote: Could you please elaborate on why an event driven model is more reliable than the state driven one? It certainly seems to require more code: isn't cm_id->state set by CM to a valid value? It seems that cm needs to track connection state anyway as per 12.9.2 Invalid State

Re: [openib-general] Re: [PATCH] [CM] 3/6 SDP updates to bind cm_id's to a device

2005-09-07 Thread Sean Hefty
Michael S. Tsirkin wrote: The state of the cm_id is controlled by the CM and can change at any time as a result of processing a received MAD. I see. Let's hide this field then. At least, this warrants a comment in the header file. In ib_cm.h: enum ib_cm_state state;

Re: [openib-general] [PATCH] [CM] 1/2 Fix CM redirection

2005-09-07 Thread Sean Hefty
John Kingman wrote: I found that CM handling for SRP is broken when handling a REJ with reason 24 (Port and CM Redirection) with a RedirectLID supplied. As stated in the spec, if RedirectLID is non-zero, it is the DLID a requester _shall_ use to access the class services. I believe that

Re: [openib-general] Re: [PATCH] [CM] 2/2 Fix CM redirection in SRP

2005-09-07 Thread Sean Hefty
Roland Dreier wrote: One question on the CM interface: + cm_id->redirect_qpn = +be32_to_cpu(*(u32 *)(event->param.rej_rcvd.ari + 32)) + 0x00ff; It seems a little awkward that a consumer has to poke a value back into the cm_id structure. Sean, how do you want to

RE: [openib-general] Re: Re: [PATCH] [SDP] change CM event processing

2005-09-07 Thread Sean Hefty
). Signed-off-by: Sean Hefty [EMAIL PROTECTED] Index: sdp_event.c === --- sdp_event.c (revision 3295) +++ sdp_event.c (working copy) @@ -384,45 +384,45 @@ int sdp_cm_event_handler(struct ib_cm_id struct sdp_sock *conn = NULL

Re: [openib-general] Re: Re: [PATCH] [SDP] change CM event processing

2005-09-07 Thread Sean Hefty
Michael S. Tsirkin wrote: Looks good. Is that tested? No. It will take me a while to get to that, but it's on my list to do. - Sean

Re: [openib-general] Re: [PATCH] [CM] 2/2 Fix CM redirection in SRP

2005-09-07 Thread Sean Hefty
Fab Tillier wrote: I'm not sure. The first thought that comes to mind is having a MAD redirection module that the CM could query before sending any message. Since the ARI for a redirect is defined by the IB spec, why not have the CM just update the redirect_qpn (or better yet the forthcoming

Re: [openib-general] Re: ibv_get_async_event

2005-09-07 Thread Sean Hefty
Sean Hefty wrote: I think that this would work well. I will update the uCM put event to match. After looking at the uCM more, changing from its current model of put_event to ack_events would require changes to the get_event, which requires changes to the ib_cm_event structure. I think

[openib-general] [PATCH] [verbs] add node_guid to device structure

2005-09-07 Thread Sean Hefty
in the CM, SRP, and sysfs. Signed-off-by: Sean Hefty [EMAIL PROTECTED] Index: include/rdma/ib_verbs.h === --- include/rdma/ib_verbs.h (revision 3340) +++ include/rdma/ib_verbs.h (working copy) @@ -952,6 +952,7 @@ struct

Re: [openib-general] [PATCH] [verbs] add node_guid to device structure

2005-09-07 Thread Sean Hefty
Roland Dreier wrote: Sean This patch adds the node_guid to struct ib_device to avoid Sean ULPs needing to query for it. Seems reasonable. I can't think of any valid reason why the node_guid would ever change during a device's lifetime. If we're going to put node_guid in the

Re: [openib-general] [PATCH] [verbs] add node_guid to device structure

2005-09-07 Thread Sean Hefty
Roland Dreier wrote: Sean I call ib_query_device() to set the node_guid. I didn't see Sean any other way of getting it reading through the mthca code. I think that it should be the responsibility of the device provider to set the node_guid field before registering the struct ib_device

[openib-general] [PATCH] [DAPL] update to match new event processing APIs

2005-09-08 Thread Sean Hefty
The following patch updates DAPL to match the verbs and CM event processing APIs. Signed-off-by: Sean Hefty [EMAIL PROTECTED] Index: dapl/openib/dapl_ib_util.c === --- dapl/openib/dapl_ib_util.c (revision 3342) +++ dapl/openib

[openib-general] RE: [PATCH] [CM] 1/2 Fix CM redirection

2005-09-09 Thread Sean Hefty
Second attempt; sent this last night. I apologize if it is a duplicate. Here's an updated change. Took the suggestions and it should be cleaner. Not the be-all, end-all answer for redirection, of course, but it works for my situation. Any chance of inclusion as an interim fix? Committed this

Re: [openib-general] [PATCH] [DAPL] update to match new event processing APIs

2005-09-09 Thread Sean Hefty
Roland Dreier wrote: Sean, what's your feeling about merging the new uverbs stale event handling stuff for 2.6.14? I'm inclined to get it in early, so that it gets wider exposure. And I think the ABI is good, so we won't have to break it again. I think that earlier would be better. - Sean

Re: [openib-general] Re: [PATCH] [CM] 2/2 Fix CM redirection in SRP

2005-09-09 Thread Sean Hefty
Roland Dreier wrote: Sean/Hal, does this make sense as a place to stick the ClassPortInfo structure (initially for use with CM redirection)? If so I'll go ahead and commit it and put it in my git tree as well. Looks good to me. - Sean

Re: [openib-general] libibat/libibcm build mess

2005-09-09 Thread Sean Hefty
Roland Dreier wrote: - libibat and libibcm both have an include file named infiniband/at.h. It's actually installed by libibcm, but the version in libibat has some structures not defined in the libibcm version. I think you meant sa.h. I agree that there should be a single file.

Re: [openib-general] Re: different CM panic

2005-09-12 Thread Sean Hefty
Roland Dreier wrote: Well, at least I tracked this down to a use-after-free bug in the CM. I went ahead and committed this trivial fix: If the CM REQ handling function gets to error2, then it frees cm_id_priv->timewait_info. But the next line goes through ib_destroy_cm_id() -> ib_send_cm_rej() ->

Re: [openib-general] [PATCH] ib_sync_cq ( was Re: RFC: ib_set_comp_handler)

2005-09-12 Thread Sean Hefty
James Lentini wrote: The purpose of this function would be more obvious if you included the new comp_handler and cq_context in the function signature. A different name would help as well. I would suggest: void ib_modify_cq(struct ib_cq *cq, void (*event_handler)(struct ib_event *, void

Re: [openib-general] Re: Opensm - casting issues #2

2005-09-13 Thread Sean Hefty
Christoph Hellwig wrote: Why does the Windows port need a separate repository? Please just check all Windows code (not just opensm) into the openib repository. My understanding is that the labs, who control the OpenIB servers, refused to host any Windows related code, forcing it to have a

Re: [openib-general] [PATCH] [CM] 1/6 core kernel changes to bind cm_id's to a device

2005-09-13 Thread Sean Hefty
Roland Dreier wrote: Now that cm_id's are per-IB-device, does it make sense to have the userspace CM create a character node for each IB device? It seems that might simplify the interface. I've modified the patch to have the uCM create a character node for each IB device. uverbs handles up

[openib-general] userspace CM API for per device handling

2005-09-13 Thread Sean Hefty
Sean Hefty wrote: For the userspace portion, I'm still trying to decide what the correct API should be. I'd like to avoid apps from having to call something like ib_cm_get_devices(), which would mirror the verbs call. I was thinking of having ib_cm_create_id() still take a struct ibv_context

Re: [openib-general] userspace CM API for per device handling

2005-09-14 Thread Sean Hefty
Arlin Davis wrote: User events are processed (poll/select) with FD's so can we just use the FD to get events? This would give the user a direct mapping back to the correct device based on the poll or select results. Something like... 5. ib_cm_get_fd( struct ibv_context *device_context) and

Re: [openib-general] userspace CM API for per device handling

2005-09-14 Thread Sean Hefty
Roland Dreier wrote: ibv_get_async_event(int fd, struct ibv_async_event *event); ibv_get_cq_event(int fd, struct ibv_cq **cq, void **cq_context); This seems like mostly pain with little gain to me. A consumer doing a poll or something with multiple file descriptors still needs some mapping to

[openib-general] [PATCH] [uDAPL] 4/5 per device communication identifiers

2005-09-15 Thread Sean Hefty
Convert uDAPL to use per device cm_id's. Untested, but changes appear straightforward. Signed-off-by: Sean Hefty [EMAIL PROTECTED] Index: userspace/dapl/dapl/openib/dapl_ib_cm.c === --- userspace/dapl/dapl/openib/dapl_ib_cm.c

[openib-general] [PATCH] [SRP] 5/5 per device communication identifiers

2005-09-15 Thread Sean Hefty
Patch to update SRP to per device cm_id's. I don't have an SRP target to test this against, but changes appear straightforward. Signed-off-by: Sean Hefty [EMAIL PROTECTED] Index: linux-kernel/infiniband/ulp/srp/ib_srp.c

[openib-general] [PATCH] [SDP] 6/5 per device communication identifiers

2005-09-15 Thread Sean Hefty
I can't count. (Actually I counted SRP and SDP together...) Here's a patch to update SDP to per device cm_id's. Signed-off-by: Sean Hefty [EMAIL PROTECTED] Index: linux-kernel/infiniband/ulp/sdp/sdp_actv.c === --- linux-kernel

[openib-general] [RFC] send side QP redirection

2005-09-16 Thread Sean Hefty
I'd like to get feedback about a possible implementation for requester QP redirection based on the APIs given below. Specifically, I'm referring to GSI redirection (spec 13.5.2) and port and CM redirection (REJ code 24). The basic proposal is to combine QP redirection as an extension to an

Re: [openib-general] netdev reference counting problem with ib_at

2005-09-16 Thread Sean Hefty
Roland Dreier wrote: ib_at needs to be reworked so that it doesn't keep perpetual references to netdevs. I continue to hit this same issue, so I've started looking at the ib_at code. Ib_at accesses struct ipoib_dev_priv to get information about the related port that IPoIB is using. Is there

RE: [openib-general] [RFC] send side QP redirection

2005-09-17 Thread Sean Hefty
struct ib_mad_av { struct ib_ah *ah; u32 remote_qpn; u32 remote_qkey; u16 pkey_index; }; What about SL and the other redirect GRH fields (TC and FL) ? These would have been specified through the ib_ah_attr when the destination was added. I think that these are the four

Re: [openib-general] Re: [PATCH] libibcm/libibat disable-libcheck option

2005-09-19 Thread Sean Hefty
Michael S. Tsirkin wrote: So, any chance of this patch being accepted? I really want an option to first configure all libraries, then build them all. configure checks break this, but they aren't really needed in a monolithic build, so an option to disable ib library checks makes sense IMO. I

Re: [openib-general] [RFC] send side QP redirection

2005-09-19 Thread Sean Hefty
Hal Rosenstock wrote: struct ib_mad_av { struct ib_ah *ah; u32 remote_qpn; u32 remote_qkey; u16 pkey_index; }; What about SL and the other redirect GRH fields (TC and FL) ? These would have been specified through the ib_ah_attr when the destination was added.

[openib-general] Re: [PATCH] madeye: Mainly add more SA decode

2005-09-19 Thread Sean Hefty
Hal Rosenstock wrote: madeye: Mainly add more SA decode Support SA attributes and add support for some missing SA methods Also, display data for received RMPP messages (next step is to do this on the send side) Also, allow filtering of messages by attribute ID Signed-off-by: Hal Rosenstock

Re: [openib-general][PATCH][RFC]: CMA header

2005-09-19 Thread Sean Hefty
Guy German wrote: typedef void (*ib_cma_event_handler)(enum ib_cma_event event, void *context, const void *private_data); typedef void (*ib_cma_listen_handler)(void *cma_id, struct ib_device *device, void *private_data,

Re: [openib-general][PATCH][RFC]: CMA IB implementation

2005-09-19 Thread Sean Hefty
Guy German wrote: #define CMA_TARGET_MAX 4 #define CMA_INITIATOR_DEPTH 4 #define CMA_RC_RETRY_COUNT 7 #define CMA_RNR_RETRY_COUNT 6 #define CMA_CM_RESPONSE_TIMEOUT 20 /* 4 sec */ #define CMA_MAX_CM_RETRIES 0 Are these values hard-coded just for the initial
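The "4 sec" comment next to CMA_CM_RESPONSE_TIMEOUT follows from the InfiniBand timeout encoding, where a value n stands for 4.096 us * 2^n. A minimal sketch of that arithmetic is below; the helper name cma_timeout_to_ns and the IB_TIMEOUT_BASE_NS macro are hypothetical and do not come from the posted patch.

```c
#include <assert.h>
#include <stdint.h>

/* Base unit of the InfiniBand timeout encoding: 4.096 us, kept in ns. */
#define IB_TIMEOUT_BASE_NS 4096ULL

/*
 * Hypothetical helper (not from the posted patch): decode a timeout
 * exponent into nanoseconds, i.e. 4.096 us * 2^exponent.
 */
static uint64_t cma_timeout_to_ns(unsigned int exponent)
{
	assert(exponent < 32);	/* the encoded field is 5 bits wide */
	return IB_TIMEOUT_BASE_NS << exponent;
}
```

For the value 20 used above this gives 4096 * 2^20 ns, about 4.29 seconds, which matches the rounded "4 sec" in the comment.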

[openib-general] Re: [PATCH] libibcm/libibat disable-libcheck option

2005-09-19 Thread Sean Hefty
Michael S. Tsirkin wrote: Add an option to disable configure checks for ib libraries. This makes it possible to first configure all libraries, then make them all. Committed. - Sean

Re: [openib-general][PATCH][RFC]: CMA IB implementation

2005-09-20 Thread Sean Hefty
Guy German wrote: memset(&qp_attr, 0, sizeof qp_attr); qp_attr.qp_state = qp_state; if (cm_id && !qp_attr_mask) Or this check... This check we do need, because: - when we call modify qp state to RTR or RTS cm_id is valid and qp_attr_mask==0, so we need to call ib_cm_init_qp_attr

Re: [openib-general][RFC][CMA]: ib_cma_get_device hot unplug issue

2005-09-20 Thread Sean Hefty
Guy German wrote: I'm sorry for bringing it up again, but I don't understand yet why a cma consumer is different than any other verbs consumer (who needs to synchronize between a device removal cb and device verbs calls). The difference is that there isn't a verbs call that returns a pointer

Re: (SPAM?) Re: [openib-general][PATCH][RFC]: CMA header

2005-09-20 Thread Sean Hefty
Guy German wrote: typedef void (*ib_cma_event_handler)(enum ib_cma_event event, void *context, const void *private_data); typedef void (*ib_cma_listen_handler)(void *cma_id, struct ib_device *device, void *private_data, void *context); I think we can merge these two handlers. We do not want

Re: [openib-general] [PATCH] Allow setting of NodeDescription

2005-09-20 Thread Sean Hefty
Roland Dreier wrote: This patch does a few things: - Adds node_guid and node_desc fields to struct ib_device - Has mthca set these fields on startup - Extends modify_device method to handle setting node_desc - Exposes node_desc in sysfs - Allows userspace to set node_desc by writing into
