[PATCH] IB/ehca: Fix CQE flags reporting
Was reporting CQE flags in the wrong bit positions, causing consumers to miss
incoming immediate data.

Signed-off-by: Joachim Fenkes <fen...@de.ibm.com>
---

Please review and queue for 2.6.32 if you think it's okay. Thanks!
  Joachim

 drivers/infiniband/hw/ehca/ehca_reqs.c |    6 +++++-
 1 files changed, 5 insertions(+), 1 deletions(-)

diff --git a/drivers/infiniband/hw/ehca/ehca_reqs.c b/drivers/infiniband/hw/ehca/ehca_reqs.c
index 5a3d96f..8fd88cd 100644
--- a/drivers/infiniband/hw/ehca/ehca_reqs.c
+++ b/drivers/infiniband/hw/ehca/ehca_reqs.c
@@ -786,7 +786,11 @@ repoll:
 	wc->slid = cqe->rlid;
 	wc->dlid_path_bits = cqe->dlid;
 	wc->src_qp = cqe->remote_qp_number;
-	wc->wc_flags = cqe->w_completion_flags;
+	/*
+	 * HW has Immed data present and GRH present in bits 6 and 5.
+	 * SW defines those in bits 1 and 0, so we can just shift and mask.
+	 */
+	wc->wc_flags = (cqe->w_completion_flags >> 5) & 3;
 	wc->ex.imm_data = cpu_to_be32(cqe->immediate_data);
 	wc->sl = cqe->service_level;
 
-- 
1.6.0.4

___
Linuxppc-dev mailing list
Linuxppc-dev@lists.ozlabs.org
https://lists.ozlabs.org/listinfo/linuxppc-dev
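The shift-and-mask conversion from the patch comment can be tried out in user space. This is only a minimal sketch, not driver code: the HW_CQE_* and WC_* names and the cqe_to_wc_flags() helper are made up for illustration, and the bit positions assume that "bits 6 and 5" in the comment are counted from the least significant bit.

```c
#include <stdint.h>

/* Hypothetical names; the bit positions are taken from the patch comment. */
#define HW_CQE_IMM_PRESENT (1u << 6)	/* hardware: immediate data present */
#define HW_CQE_GRH_PRESENT (1u << 5)	/* hardware: GRH present */
#define WC_GRH             (1u << 0)	/* software: GRH present */
#define WC_WITH_IMM        (1u << 1)	/* software: immediate data present */

/* Move hardware bits 6..5 down to software bits 1..0, drop everything else. */
static unsigned int cqe_to_wc_flags(uint8_t w_completion_flags)
{
	return (w_completion_flags >> 5) & 3;
}
```

Without the conversion, a consumer testing the raw hardware byte against WC_WITH_IMM looks at bit 1 instead of bit 6, which is how incoming immediate data gets missed.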
Re: [ewg] [PATCH] IB/ehca: Construct MAD redirect replies from request MAD
Hal Rosenstock <hal.rosenst...@gmail.com> wrote on 27.08.2009 15:31:40:

> I don't think it should be hard coded. IMO it would be better to default
> to 18 and somehow able to be adjusted (via a (dynamic) module parameter ?).

I don't see how making this a parameter would benefit any end user, while on
the other hand it clutters up our parameter list.

Changing RespTimeValue won't influence the IB performance or user-visible
behavior of our driver in any way, and in fact, all RespTimeValue says is
"Please use a timeout of one second for all future MADs you send me", only
there won't be any more MADs in the future because we just redirected the
client to someone else.

So, the RespTimeValue field is a "don't care" in the redirection scenario.
Setting it to an arbitrary, but legal value isn't much more than a concession
towards any broken clients that may be out there.

Given that you seem to like the rest of the code and Jason hasn't spoken up
yet, I think we can have Roland merge this patch. Roland, what do you think?

Regards,
  Joachim
Re: [ewg] [PATCH] IB/ehca: Construct MAD redirect replies from request MAD
Hal Rosenstock <hal.rosenst...@gmail.com> wrote on 26.08.2009 17:15:03:

> Thanks for doing this. It looks sane to me.
>
> The only issue I recall that appears to be remaining is a better setting of
> ClassPortInfo:RespTimeValue rather than hardcoding. Perhaps using the value
> from PortInfo is the way to go (ideally it would be that value from the port
> to which the requester is being redirected to but that might not be so
> easy to get from this port).

I don't think that effort will be necessary or even legal. The requestor will
react to the redirection with another Get(ClassPortInfo) to the redirection
target, which will reply with its own RespTimeValue, so our driver should
speak for itself.

Since we don't know when our MAD processing and sending of the response is
going to be scheduled (we're not running on real-time constraints here), we
play it safe and return 18, which amounts to roughly a second.

Make sense?

Regards
  Joachim
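For readers wondering how 18 turns into "roughly a second": the sketch below assumes the usual InfiniBand time encoding of 4.096 microseconds times 2^RespTimeValue, with RespTimeValue being a 5-bit field. The helper name is made up for illustration and is not part of the driver.

```c
#include <assert.h>
#include <stdint.h>

/*
 * Hypothetical helper: timeout in nanoseconds for an IB time exponent,
 * assuming timeout = 4.096 us * 2^exp, i.e. 4096 ns shifted left by exp.
 */
static uint64_t ib_time_exponent_to_ns(unsigned int exp)
{
	assert(exp < 32);	/* assumed 5-bit field */
	return (uint64_t)4096 << exp;
}
```

With exp = 18 this gives 1073741824 ns, a little over one second.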
[PATCH] IB/ehca: Construct MAD redirect replies from request MAD
The old code used a lot of hardcoded values, which might not be valid in all
environments (especially routed fabrics or partitioned subnets). Copy as much
information as possible from the incoming request to prevent that.

Signed-off-by: Joachim Fenkes <fen...@de.ibm.com>
---

Hal, Jason -- here's the change I promised. Looks okay to you?
Roland -- if Hal and Jason don't object, please queue this up for the next
kernel. Thanks!

Regards,
  Joachim

 drivers/infiniband/hw/ehca/ehca_sqp.c |   47 +++++++++++++++++++++++++++++++++++++++++------
 1 files changed, 41 insertions(+), 6 deletions(-)

diff --git a/drivers/infiniband/hw/ehca/ehca_sqp.c b/drivers/infiniband/hw/ehca/ehca_sqp.c
index c568b28..8c1213f 100644
--- a/drivers/infiniband/hw/ehca/ehca_sqp.c
+++ b/drivers/infiniband/hw/ehca/ehca_sqp.c
@@ -125,14 +125,30 @@ struct ib_perf {
 	u8 data[192];
 } __attribute__ ((packed));
 
+/* TC/SL/FL packed into 32 bits, as in ClassPortInfo */
+struct tcslfl {
+	u32 tc:8;
+	u32 sl:4;
+	u32 fl:20;
+} __attribute__ ((packed));
+
+/* IP Version/TC/FL packed into 32 bits, as in GRH */
+struct vertcfl {
+	u32 ver:4;
+	u32 tc:8;
+	u32 fl:20;
+} __attribute__ ((packed));
 
 static int ehca_process_perf(struct ib_device *ibdev, u8 port_num,
+			     struct ib_wc *in_wc, struct ib_grh *in_grh,
 			     struct ib_mad *in_mad, struct ib_mad *out_mad)
 {
 	struct ib_perf *in_perf = (struct ib_perf *)in_mad;
 	struct ib_perf *out_perf = (struct ib_perf *)out_mad;
 	struct ib_class_port_info *poi =
 		(struct ib_class_port_info *)out_perf->data;
+	struct tcslfl *tcslfl =
+		(struct tcslfl *)&poi->redirect_tcslfl;
 	struct ehca_shca *shca =
 		container_of(ibdev, struct ehca_shca, ib_device);
 	struct ehca_sport *sport = &shca->sport[port_num - 1];
@@ -158,10 +174,29 @@ static int ehca_process_perf(struct ib_device *ibdev, u8 port_num,
 		poi->base_version = 1;
 		poi->class_version = 1;
 		poi->resp_time_value = 18;
-		poi->redirect_lid = sport->saved_attr.lid;
-		poi->redirect_qp = sport->pma_qp_nr;
+
+		/* copy local routing information from WC where applicable */
+		tcslfl->sl = in_wc->sl;
+		poi->redirect_lid =
+			sport->saved_attr.lid | in_wc->dlid_path_bits;
+		poi->redirect_qp = sport->pma_qp_nr;
 		poi->redirect_qkey = IB_QP1_QKEY;
-		poi->redirect_pkey = IB_DEFAULT_PKEY_FULL;
+
+		ehca_query_pkey(ibdev, port_num, in_wc->pkey_index,
+				&poi->redirect_pkey);
+
+		/* if request was globally routed, copy route info */
+		if (in_grh) {
+			struct vertcfl *vertcfl =
+				(struct vertcfl *)&in_grh->version_tclass_flow;
+			memcpy(poi->redirect_gid, in_grh->dgid.raw,
+			       sizeof(poi->redirect_gid));
+			tcslfl->tc = vertcfl->tc;
+			tcslfl->fl = vertcfl->fl;
+		} else
+			/* else only fill in default GID */
+			ehca_query_gid(ibdev, port_num, 0,
+				       (union ib_gid *)&poi->redirect_gid);
 
 		ehca_dbg(ibdev, "ehca_pma_lid=%x ehca_pma_qp=%x",
 			 sport->saved_attr.lid, sport->pma_qp_nr);
@@ -183,8 +218,7 @@ perf_reply:
 
 int ehca_process_mad(struct ib_device *ibdev, int mad_flags, u8 port_num,
 		     struct ib_wc *in_wc, struct ib_grh *in_grh,
-		     struct ib_mad *in_mad,
-		     struct ib_mad *out_mad)
+		     struct ib_mad *in_mad, struct ib_mad *out_mad)
 {
 	int ret;
 
@@ -196,7 +230,8 @@ int ehca_process_mad(struct ib_device *ibdev, int mad_flags, u8 port_num,
 		return IB_MAD_RESULT_SUCCESS;
 
 	ehca_dbg(ibdev, "port_num=%x src_qp=%x", port_num, in_wc->src_qp);
-	ret = ehca_process_perf(ibdev, port_num, in_mad, out_mad);
+	ret = ehca_process_perf(ibdev, port_num, in_wc, in_grh,
+				in_mad, out_mad);
 
 	return ret;
 }
-- 
1.6.0.4
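The two packed layouts named in the patch comments (TC/SL/FL as in ClassPortInfo, IP Version/TC/FL as in GRH) can also be expressed with explicit shifts. The sketch below is an illustration under stated assumptions, not the driver's approach: bitfield layout is implementation-defined in C, so it works on 32-bit words that are assumed to be already in host byte order, with the first-listed field in the most significant bits. All function names are hypothetical.

```c
#include <stdint.h>

/* GRH word (assumed host order): version[31:28] tclass[27:20] flow[19:0] */
static uint32_t grh_tclass(uint32_t ver_tc_fl)
{
	return (ver_tc_fl >> 20) & 0xff;
}

static uint32_t grh_flow_label(uint32_t ver_tc_fl)
{
	return ver_tc_fl & 0xfffff;
}

/* ClassPortInfo redirect word (assumed host order): tc[31:24] sl[23:20] fl[19:0] */
static uint32_t pack_tcslfl(uint32_t tc, uint32_t sl, uint32_t fl)
{
	return ((tc & 0xff) << 24) | ((sl & 0xf) << 20) | (fl & 0xfffff);
}
```

Copying route information from a globally routed request then amounts to pack_tcslfl(grh_tclass(w), sl, grh_flow_label(w)), where sl comes from the work completion as in the patch.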
[PATCH] IB/ehca: Remove superfluous bitmasks from QP control block
All the fields in the control block are nicely right-aligned, so no masking
is necessary.

Signed-off-by: Joachim Fenkes <fen...@de.ibm.com>
---
 drivers/infiniband/hw/ehca/ehca_classes_pSeries.h |   28 ----------------------------
 drivers/infiniband/hw/ehca/ehca_qp.c              |   18 +++++-------------
 2 files changed, 5 insertions(+), 41 deletions(-)

diff --git a/drivers/infiniband/hw/ehca/ehca_classes_pSeries.h b/drivers/infiniband/hw/ehca/ehca_classes_pSeries.h
index 1798e64..689c357 100644
--- a/drivers/infiniband/hw/ehca/ehca_classes_pSeries.h
+++ b/drivers/infiniband/hw/ehca/ehca_classes_pSeries.h
@@ -165,7 +165,6 @@ struct hcp_modify_qp_control_block {
 #define MQPCB_MASK_ALT_P_KEY_IDX EHCA_BMASK_IBM( 7, 7)
 #define MQPCB_MASK_RDMA_ATOMIC_CTRL EHCA_BMASK_IBM( 8, 8)
 #define MQPCB_MASK_QP_STATE EHCA_BMASK_IBM( 9, 9)
-#define MQPCB_QP_STATE EHCA_BMASK_IBM(24, 31)
 #define MQPCB_MASK_RDMA_NR_ATOMIC_RESP_RES EHCA_BMASK_IBM(11, 11)
 #define MQPCB_MASK_PATH_MIGRATION_STATE EHCA_BMASK_IBM(12, 12)
 #define MQPCB_MASK_RDMA_ATOMIC_OUTST_DEST_QP EHCA_BMASK_IBM(13, 13)
@@ -176,60 +175,33 @@ struct hcp_modify_qp_control_block {
 #define MQPCB_MASK_RETRY_COUNT EHCA_BMASK_IBM(18, 18)
 #define MQPCB_MASK_TIMEOUT EHCA_BMASK_IBM(19, 19)
 #define MQPCB_MASK_PATH_MTU EHCA_BMASK_IBM(20, 20)
-#define MQPCB_PATH_MTU EHCA_BMASK_IBM(24, 31)
 #define MQPCB_MASK_MAX_STATIC_RATE EHCA_BMASK_IBM(21, 21)
-#define MQPCB_MAX_STATIC_RATE EHCA_BMASK_IBM(24, 31)
 #define MQPCB_MASK_DLID EHCA_BMASK_IBM(22, 22)
-#define MQPCB_DLID EHCA_BMASK_IBM(16, 31)
 #define MQPCB_MASK_RNR_RETRY_COUNT EHCA_BMASK_IBM(23, 23)
-#define MQPCB_RNR_RETRY_COUNT EHCA_BMASK_IBM(29, 31)
 #define MQPCB_MASK_SOURCE_PATH_BITS EHCA_BMASK_IBM(24, 24)
-#define MQPCB_SOURCE_PATH_BITS EHCA_BMASK_IBM(25, 31)
 #define MQPCB_MASK_TRAFFIC_CLASS EHCA_BMASK_IBM(25, 25)
-#define MQPCB_TRAFFIC_CLASS EHCA_BMASK_IBM(24, 31)
 #define MQPCB_MASK_HOP_LIMIT EHCA_BMASK_IBM(26, 26)
-#define MQPCB_HOP_LIMIT EHCA_BMASK_IBM(24, 31)
 #define MQPCB_MASK_SOURCE_GID_IDX EHCA_BMASK_IBM(27, 27)
-#define MQPCB_SOURCE_GID_IDX EHCA_BMASK_IBM(24, 31)
 #define MQPCB_MASK_FLOW_LABEL EHCA_BMASK_IBM(28, 28)
-#define MQPCB_FLOW_LABEL EHCA_BMASK_IBM(12, 31)
 #define MQPCB_MASK_DEST_GID EHCA_BMASK_IBM(30, 30)
 #define MQPCB_MASK_SERVICE_LEVEL_AL EHCA_BMASK_IBM(31, 31)
-#define MQPCB_SERVICE_LEVEL_AL EHCA_BMASK_IBM(28, 31)
 #define MQPCB_MASK_SEND_GRH_FLAG_AL EHCA_BMASK_IBM(32, 32)
-#define MQPCB_SEND_GRH_FLAG_AL EHCA_BMASK_IBM(31, 31)
 #define MQPCB_MASK_RETRY_COUNT_AL EHCA_BMASK_IBM(33, 33)
-#define MQPCB_RETRY_COUNT_AL EHCA_BMASK_IBM(29, 31)
 #define MQPCB_MASK_TIMEOUT_AL EHCA_BMASK_IBM(34, 34)
-#define MQPCB_TIMEOUT_AL EHCA_BMASK_IBM(27, 31)
 #define MQPCB_MASK_MAX_STATIC_RATE_AL EHCA_BMASK_IBM(35, 35)
-#define MQPCB_MAX_STATIC_RATE_AL EHCA_BMASK_IBM(24, 31)
 #define MQPCB_MASK_DLID_AL EHCA_BMASK_IBM(36, 36)
-#define MQPCB_DLID_AL EHCA_BMASK_IBM(16, 31)
 #define MQPCB_MASK_RNR_RETRY_COUNT_AL EHCA_BMASK_IBM(37, 37)
-#define MQPCB_RNR_RETRY_COUNT_AL EHCA_BMASK_IBM(29, 31)
 #define MQPCB_MASK_SOURCE_PATH_BITS_AL EHCA_BMASK_IBM(38, 38)
-#define MQPCB_SOURCE_PATH_BITS_AL EHCA_BMASK_IBM(25, 31)
 #define MQPCB_MASK_TRAFFIC_CLASS_AL EHCA_BMASK_IBM(39, 39)
-#define MQPCB_TRAFFIC_CLASS_AL EHCA_BMASK_IBM(24, 31)
 #define MQPCB_MASK_HOP_LIMIT_AL EHCA_BMASK_IBM(40, 40)
-#define MQPCB_HOP_LIMIT_AL EHCA_BMASK_IBM(24, 31)
 #define MQPCB_MASK_SOURCE_GID_IDX_AL EHCA_BMASK_IBM(41, 41)
-#define MQPCB_SOURCE_GID_IDX_AL EHCA_BMASK_IBM(24, 31)
 #define MQPCB_MASK_FLOW_LABEL_AL EHCA_BMASK_IBM(42, 42)
-#define MQPCB_FLOW_LABEL_AL EHCA_BMASK_IBM(12, 31)
 #define MQPCB_MASK_DEST_GID_AL EHCA_BMASK_IBM(44, 44)
 #define MQPCB_MASK_MAX_NR_OUTST_SEND_WR EHCA_BMASK_IBM(45, 45)
-#define MQPCB_MAX_NR_OUTST_SEND_WR EHCA_BMASK_IBM(16, 31)
 #define MQPCB_MASK_MAX_NR_OUTST_RECV_WR EHCA_BMASK_IBM(46, 46)
-#define MQPCB_MAX_NR_OUTST_RECV_WR EHCA_BMASK_IBM(16, 31)
 #define MQPCB_MASK_DISABLE_ETE_CREDIT_CHECK EHCA_BMASK_IBM(47, 47)
-#define MQPCB_DISABLE_ETE_CREDIT_CHECK EHCA_BMASK_IBM(31, 31)
-#define MQPCB_QP_NUMBER
Re: [PATCH] IB/ehca: Change misleading error message
Roland Dreier [EMAIL PROTECTED] wrote on 26.11.2008 00:13:51:

> That's too bad... I applied this patch but out of curiosity, why
> doesn't the hot-remove/hot-add work? I would have thought that
> re-registering all of memory after the hot-add would do the right thing.

That's right, but right now, we simply try to register all of memory from
KERNELBASE to high_memory, which works right until we have memory holes in
the middle; then the hypervisor will reject our page registrations. Same goes
for huge (16GB) pages, by the way. We're working on a solution to this.

Cheers,
  Joachim
[PATCH] IB/ehca: Change misleading error message
The error message printed when the eHCA driver prevents memory hotplug is
misleading -- the user might think that hot-removing the lhca, hotplugging
memory, then hot-adding the lhca again will work, but it doesn't.

Signed-off-by: Joachim Fenkes [EMAIL PROTECTED]
---
 drivers/infiniband/hw/ehca/ehca_main.c |    3 +--
 1 files changed, 1 insertions(+), 2 deletions(-)

diff --git a/drivers/infiniband/hw/ehca/ehca_main.c b/drivers/infiniband/hw/ehca/ehca_main.c
index bb02a86..bec7e02 100644
--- a/drivers/infiniband/hw/ehca/ehca_main.c
+++ b/drivers/infiniband/hw/ehca/ehca_main.c
@@ -994,8 +994,7 @@ static int ehca_mem_notifier(struct notifier_block *nb,
 			if (printk_timed_ratelimit(&ehca_dmem_warn_time,
 						   30 * 1000))
 				ehca_gen_err("DMEM operations are not allowed"
-					     "as long as an ehca adapter is"
-					     "attached to the LPAR");
+					     "in conjunction with eHCA");
 			return NOTIFY_BAD;
 		}
 	}
-- 
1.5.5
[PATCH] IB/ehca: Fix lockdep failures for shca_list_lock
From: Michael Ellerman [EMAIL PROTECTED]

shca_list_lock is taken from softirq context in ehca_poll_eqs, so we need to
lock IRQ safe elsewhere.

Signed-off-by: Michael Ellerman [EMAIL PROTECTED]
Acked-by: Joachim Fenkes [EMAIL PROTECTED]
---
 drivers/infiniband/hw/ehca/ehca_main.c |   17 ++++++++++-------
 1 files changed, 10 insertions(+), 7 deletions(-)

diff --git a/drivers/infiniband/hw/ehca/ehca_main.c b/drivers/infiniband/hw/ehca/ehca_main.c
index bb02a86..021c454 100644
--- a/drivers/infiniband/hw/ehca/ehca_main.c
+++ b/drivers/infiniband/hw/ehca/ehca_main.c
@@ -717,6 +717,7 @@ static int __devinit ehca_probe(struct of_device *dev,
 	const u64 *handle;
 	struct ib_pd *ibpd;
 	int ret, i, eq_size;
+	u64 flags;
 
 	handle = of_get_property(dev->node, "ibm,hca-handle", NULL);
 	if (!handle) {
@@ -830,9 +831,9 @@ static int __devinit ehca_probe(struct of_device *dev,
 		ehca_err(&shca->ib_device,
 			 "Cannot create device attributes ret=%d", ret);
 
-	spin_lock(&shca_list_lock);
+	spin_lock_irqsave(&shca_list_lock, flags);
 	list_add(&shca->shca_list, &shca_list);
-	spin_unlock(&shca_list_lock);
+	spin_unlock_irqrestore(&shca_list_lock, flags);
 
 	return 0;
 
@@ -878,6 +879,7 @@ probe1:
 static int __devexit ehca_remove(struct of_device *dev)
 {
 	struct ehca_shca *shca = dev->dev.driver_data;
+	u64 flags;
 	int ret;
 
 	sysfs_remove_group(&dev->dev.kobj, &ehca_dev_attr_grp);
@@ -915,9 +917,9 @@ static int __devexit ehca_remove(struct of_device *dev)
 
 	ib_dealloc_device(&shca->ib_device);
 
-	spin_lock(&shca_list_lock);
+	spin_lock_irqsave(&shca_list_lock, flags);
 	list_del(&shca->shca_list);
-	spin_unlock(&shca_list_lock);
+	spin_unlock_irqrestore(&shca_list_lock, flags);
 
 	return ret;
 }
@@ -975,6 +977,7 @@ static int ehca_mem_notifier(struct notifier_block *nb,
 			     unsigned long action, void *data)
 {
 	static unsigned long ehca_dmem_warn_time;
+	unsigned long flags;
 
 	switch (action) {
 	case MEM_CANCEL_OFFLINE:
@@ -985,12 +988,12 @@ static int ehca_mem_notifier(struct notifier_block *nb,
 	case MEM_GOING_ONLINE:
 	case MEM_GOING_OFFLINE:
 		/* only ok if no hca is attached to the lpar */
-		spin_lock(&shca_list_lock);
+		spin_lock_irqsave(&shca_list_lock, flags);
 		if (list_empty(&shca_list)) {
-			spin_unlock(&shca_list_lock);
+			spin_unlock_irqrestore(&shca_list_lock, flags);
 			return NOTIFY_OK;
 		} else {
-			spin_unlock(&shca_list_lock);
+			spin_unlock_irqrestore(&shca_list_lock, flags);
 			if (printk_timed_ratelimit(&ehca_dmem_warn_time,
 						   30 * 1000))
 				ehca_gen_err("DMEM operations are not allowed"
-- 
1.5.5
[PATCH] IB/ehca: Fix locking for shca_list_lock
shca_list_lock is taken from softirq context in ehca_poll_eqs, so we need to
lock IRQ safe elsewhere.

Signed-off-by: Michael Ellerman [EMAIL PROTECTED]
Signed-off-by: Joachim Fenkes [EMAIL PROTECTED]
---

On Friday 21 November 2008 17:02, Johannes Berg wrote:
> On Fri, 2008-11-21 at 16:37 +0100, Joachim Fenkes wrote:
> > +	u64 flags;
>
> > -	spin_lock(&shca_list_lock);
> > +	spin_lock_irqsave(&shca_list_lock, flags);
>
> That's wrong and I think will give a warning on all machines where u64
> != unsigned long. Might not particularly matter in this case.

Doesn't matter for a ppc64 only driver, but you're right nonetheless. Thanks.

> Also, generally it seems wrong to say "fix lockdep failure" when the
> patch really fixes a bug that lockdep happened to find.

Whatever -- changed. Here's the updated patch.

Regards,
  Joachim

 drivers/infiniband/hw/ehca/ehca_main.c |   17 ++++++++++-------
 1 files changed, 10 insertions(+), 7 deletions(-)

diff --git a/drivers/infiniband/hw/ehca/ehca_main.c b/drivers/infiniband/hw/ehca/ehca_main.c
index bb02a86..169aa1a 100644
--- a/drivers/infiniband/hw/ehca/ehca_main.c
+++ b/drivers/infiniband/hw/ehca/ehca_main.c
@@ -717,6 +717,7 @@ static int __devinit ehca_probe(struct of_device *dev,
 	const u64 *handle;
 	struct ib_pd *ibpd;
 	int ret, i, eq_size;
+	unsigned long flags;
 
 	handle = of_get_property(dev->node, "ibm,hca-handle", NULL);
 	if (!handle) {
@@ -830,9 +831,9 @@ static int __devinit ehca_probe(struct of_device *dev,
 		ehca_err(&shca->ib_device,
 			 "Cannot create device attributes ret=%d", ret);
 
-	spin_lock(&shca_list_lock);
+	spin_lock_irqsave(&shca_list_lock, flags);
 	list_add(&shca->shca_list, &shca_list);
-	spin_unlock(&shca_list_lock);
+	spin_unlock_irqrestore(&shca_list_lock, flags);
 
 	return 0;
 
@@ -878,6 +879,7 @@ probe1:
 static int __devexit ehca_remove(struct of_device *dev)
 {
 	struct ehca_shca *shca = dev->dev.driver_data;
+	unsigned long flags;
 	int ret;
 
 	sysfs_remove_group(&dev->dev.kobj, &ehca_dev_attr_grp);
@@ -915,9 +917,9 @@ static int __devexit ehca_remove(struct of_device *dev)
 
 	ib_dealloc_device(&shca->ib_device);
 
-	spin_lock(&shca_list_lock);
+	spin_lock_irqsave(&shca_list_lock, flags);
 	list_del(&shca->shca_list);
-	spin_unlock(&shca_list_lock);
+	spin_unlock_irqrestore(&shca_list_lock, flags);
 
 	return ret;
 }
@@ -975,6 +977,7 @@ static int ehca_mem_notifier(struct notifier_block *nb,
 			     unsigned long action, void *data)
 {
 	static unsigned long ehca_dmem_warn_time;
+	unsigned long flags;
 
 	switch (action) {
 	case MEM_CANCEL_OFFLINE:
@@ -985,12 +988,12 @@ static int ehca_mem_notifier(struct notifier_block *nb,
 	case MEM_GOING_ONLINE:
 	case MEM_GOING_OFFLINE:
 		/* only ok if no hca is attached to the lpar */
-		spin_lock(&shca_list_lock);
+		spin_lock_irqsave(&shca_list_lock, flags);
 		if (list_empty(&shca_list)) {
-			spin_unlock(&shca_list_lock);
+			spin_unlock_irqrestore(&shca_list_lock, flags);
 			return NOTIFY_OK;
 		} else {
-			spin_unlock(&shca_list_lock);
+			spin_unlock_irqrestore(&shca_list_lock, flags);
 			if (printk_timed_ratelimit(&ehca_dmem_warn_time,
 						   30 * 1000))
 				ehca_gen_err("DMEM operations are not allowed"
-- 
1.5.5
Re: [PATCH] IB/ehca: Fix suppression of port activation events
Roland Dreier [EMAIL PROTECTED] wrote on 10.11.2008 21:36:23:

> > A previous fix introduced a regression where port activation events were
> > dropped unconditionally if port autodetection was not enabled. Fixed.
>
> Is this a fix to "IB/ehca: Remove reference to special QP in case of port
> activation failure"? Because if so I can roll it into that patch, since
> Linus hasn't pulled it yet.

Yes, that would be splendid, thank you!

Cheers,
  Joachim
[PATCH] IB/ehca: Fix suppression of port activation events
A previous fix introduced a regression where port activation events were
dropped unconditionally if port autodetection was not enabled. Fixed.

Signed-off-by: Joachim Fenkes [EMAIL PROTECTED]
---

Roland -- this patch is made against your for-linus branch. Please review and
apply if you think it's okay. Hope it's not too late for the next kernel.

  Joachim

 drivers/infiniband/hw/ehca/ehca_irq.c |   45 +++++++++++++++++++++++++++------------------
 1 files changed, 27 insertions(+), 18 deletions(-)

diff --git a/drivers/infiniband/hw/ehca/ehca_irq.c b/drivers/infiniband/hw/ehca/ehca_irq.c
index 9e43459..757035e 100644
--- a/drivers/infiniband/hw/ehca/ehca_irq.c
+++ b/drivers/infiniband/hw/ehca/ehca_irq.c
@@ -359,34 +359,43 @@ static void notify_port_conf_change(struct ehca_shca *shca, int port_num)
 	*old_attr = new_attr;
 }
 
+/* replay modify_qp for sqps -- return 0 if all is well, 1 if AQP1 destroyed */
+static int replay_modify_qp(struct ehca_sport *sport)
+{
+	int aqp1_destroyed;
+	unsigned long flags;
+
+	spin_lock_irqsave(&sport->mod_sqp_lock, flags);
+
+	aqp1_destroyed = !sport->ibqp_sqp[IB_QPT_GSI];
+
+	if (sport->ibqp_sqp[IB_QPT_SMI])
+		ehca_recover_sqp(sport->ibqp_sqp[IB_QPT_SMI]);
+	if (!aqp1_destroyed)
+		ehca_recover_sqp(sport->ibqp_sqp[IB_QPT_GSI]);
+
+	spin_unlock_irqrestore(&sport->mod_sqp_lock, flags);
+
+	return aqp1_destroyed;
+}
+
 static void parse_ec(struct ehca_shca *shca, u64 eqe)
 {
 	u8 ec = EHCA_BMASK_GET(NEQE_EVENT_CODE, eqe);
 	u8 port = EHCA_BMASK_GET(NEQE_PORT_NUMBER, eqe);
 	u8 spec_event;
 	struct ehca_sport *sport = &shca->sport[port - 1];
-	unsigned long flags;
 
 	switch (ec) {
 	case 0x30: /* port availability change */
 		if (EHCA_BMASK_GET(NEQE_PORT_AVAILABILITY, eqe)) {
-			/* only for autodetect mode important */
-			if (ehca_nr_ports >= 0)
-				break;
-
-			int suppress_event;
-			/* replay modify_qp for sqps */
-			spin_lock_irqsave(&sport->mod_sqp_lock, flags);
-			suppress_event = !sport->ibqp_sqp[IB_QPT_GSI];
-			if (sport->ibqp_sqp[IB_QPT_SMI])
-				ehca_recover_sqp(sport->ibqp_sqp[IB_QPT_SMI]);
-			if (!suppress_event)
-				ehca_recover_sqp(sport->ibqp_sqp[IB_QPT_GSI]);
-			spin_unlock_irqrestore(&sport->mod_sqp_lock, flags);
-
-			/* AQP1 was destroyed, ignore this event */
-			if (suppress_event)
-				break;
+			/* only replay modify_qp calls in autodetect mode;
+			 * if AQP1 was destroyed, the port is already down
+			 * again and we can drop the event.
+			 */
+			if (ehca_nr_ports < 0)
+				if (replay_modify_qp(sport))
+					break;
 
 			sport->port_state = IB_PORT_ACTIVE;
 			dispatch_port_event(shca, port, IB_EVENT_PORT_ACTIVE,
-- 
1.5.5
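The difference between the regressed and the fixed control flow can be written down as two small predicates. This is a user-space model only, under the assumption that a negative ehca_nr_ports means autodetect mode and that the comparisons lost in this archive read ">= 0" in the removed code and "< 0" in the added code. Both function names are hypothetical.

```c
#include <stdbool.h>

/* Regressed behaviour: anything outside autodetect mode dropped the event. */
static bool port_active_delivered_old(int nr_ports, bool aqp1_exists)
{
	if (nr_ports >= 0)
		return false;		/* event dropped unconditionally */
	return aqp1_exists;		/* autodetect: drop only if AQP1 is gone */
}

/* Fixed behaviour: only autodetect mode replays modify_qp and may drop. */
static bool port_active_delivered_new(int nr_ports, bool aqp1_exists)
{
	if (nr_ports < 0 && !aqp1_exists)
		return false;		/* port is already down again */
	return true;
}
```

For a fixed port count such as 2, the old predicate never delivers the activation event, while the new one always does.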
Re: [PATCH] ibmebus/of_platform: Move name sysfs attribute into generic OF devices
Paul Mackerras [EMAIL PROTECTED] wrote on 19.08.2008 06:14:00:

> > Recent of_platform changes made of_bus_type_init() overwrite the bus type's
> > .dev_attrs list, so move ibmebus' name attribute (which is needed by eHCA
> > userspace support) into generic OF device code. Tested on POWER.
>
> Is this a bugfix that is needed for 2.6.27?

Yes, definitely. The eHCA userspace driver relies on the name attribute to
check for valid adapters (it checks that the name is lhca), so with the name
attribute gone, eHCA userspace will cease to work.

Regards,
  Joachim
[PATCH] ibmebus/of_platform: Move name sysfs attribute into generic OF devices
Recent of_platform changes made of_bus_type_init() overwrite the bus type's
.dev_attrs list, so move ibmebus' name attribute (which is needed by eHCA
userspace support) into generic OF device code. Tested on POWER.

Signed-off-by: Joachim Fenkes [EMAIL PROTECTED]
---
 arch/powerpc/kernel/ibmebus.c |   12 ------------
 drivers/of/device.c           |   10 ++++++++++
 2 files changed, 10 insertions(+), 12 deletions(-)

diff --git a/arch/powerpc/kernel/ibmebus.c b/arch/powerpc/kernel/ibmebus.c
index 9d42eb5..a063622 100644
--- a/arch/powerpc/kernel/ibmebus.c
+++ b/arch/powerpc/kernel/ibmebus.c
@@ -233,17 +233,6 @@ void ibmebus_free_irq(u32 ist, void *dev_id)
 }
 EXPORT_SYMBOL(ibmebus_free_irq);
 
-static ssize_t name_show(struct device *dev,
-			 struct device_attribute *attr, char *buf)
-{
-	return sprintf(buf, "%s\n", to_of_device(dev)->node->name);
-}
-
-static struct device_attribute ibmebus_dev_attrs[] = {
-	__ATTR_RO(name),
-	__ATTR_NULL
-};
-
 static char *ibmebus_chomp(const char *in, size_t count)
 {
 	char *out = kmalloc(count + 1, GFP_KERNEL);
@@ -327,7 +316,6 @@ static struct bus_attribute ibmebus_bus_attrs[] = {
 
 struct bus_type ibmebus_bus_type = {
 	.uevent = of_device_uevent,
-	.dev_attrs = ibmebus_dev_attrs,
 	.bus_attrs = ibmebus_bus_attrs
 };
 EXPORT_SYMBOL(ibmebus_bus_type);
diff --git a/drivers/of/device.c b/drivers/of/device.c
index 8a1d93a..51e5214 100644
--- a/drivers/of/device.c
+++ b/drivers/of/device.c
@@ -57,6 +57,15 @@ static ssize_t devspec_show(struct device *dev,
 	return sprintf(buf, "%s\n", ofdev->node->full_name);
 }
 
+static ssize_t name_show(struct device *dev,
+			 struct device_attribute *attr, char *buf)
+{
+	struct of_device *ofdev;
+
+	ofdev = to_of_device(dev);
+	return sprintf(buf, "%s\n", ofdev->node->name);
+}
+
 static ssize_t modalias_show(struct device *dev,
 			     struct device_attribute *attr, char *buf)
 {
@@ -71,6 +80,7 @@ static ssize_t modalias_show(struct device *dev,
 
 struct device_attribute of_platform_device_attrs[] = {
 	__ATTR_RO(devspec),
+	__ATTR_RO(name),
 	__ATTR_RO(modalias),
 	__ATTR_NULL
 };
-- 
1.5.5
[PATCH 0/2] IB/ehca: Two minor circumventions
[1/2] fixes spurious PATH_MIG events with certain FW versions
[2/2] inserts a default value for Local CA ACK Delay

Please review these patches and queue them for inclusion into the kernel
if you think they're okay. Thanks!
  Joachim
[PATCH 1/2] IB/ehca: Filter PATH_MIG events if QP was never armed
Certain firmware versions sometimes cause spurious PATH_MIG events to occur
during QP creation. Filter these events by making sure PATH_MIG events are
only handed down when they actually make sense (i.e. when the QP has been
armed at least once).

Signed-off-by: Joachim Fenkes [EMAIL PROTECTED]
---
 drivers/infiniband/hw/ehca/ehca_classes.h |    1 +
 drivers/infiniband/hw/ehca/ehca_irq.c     |    4 ++++
 drivers/infiniband/hw/ehca/ehca_qp.c      |    2 ++
 3 files changed, 7 insertions(+), 0 deletions(-)

diff --git a/drivers/infiniband/hw/ehca/ehca_classes.h b/drivers/infiniband/hw/ehca/ehca_classes.h
index 1e9e99a..0b0618e 100644
--- a/drivers/infiniband/hw/ehca/ehca_classes.h
+++ b/drivers/infiniband/hw/ehca/ehca_classes.h
@@ -194,6 +194,7 @@ struct ehca_qp {
 	u32 packet_count;
 	atomic_t nr_events; /* events seen */
 	wait_queue_head_t wait_completion;
+	int mig_armed;
 };
 
 #define IS_SRQ(qp) (qp->ext_type == EQPT_SRQ)
diff --git a/drivers/infiniband/hw/ehca/ehca_irq.c b/drivers/infiniband/hw/ehca/ehca_irq.c
index 0792d93..99642a6 100644
--- a/drivers/infiniband/hw/ehca/ehca_irq.c
+++ b/drivers/infiniband/hw/ehca/ehca_irq.c
@@ -178,6 +178,10 @@ static void dispatch_qp_event(struct ehca_shca *shca, struct ehca_qp *qp,
 {
 	struct ib_event event;
 
+	/* PATH_MIG without the QP ever having been armed is false alarm */
+	if (event_type == IB_EVENT_PATH_MIG && !qp->mig_armed)
+		return;
+
 	event.device = &shca->ib_device;
 	event.event = event_type;
 
diff --git a/drivers/infiniband/hw/ehca/ehca_qp.c b/drivers/infiniband/hw/ehca/ehca_qp.c
index 3f59587..ea13efd 100644
--- a/drivers/infiniband/hw/ehca/ehca_qp.c
+++ b/drivers/infiniband/hw/ehca/ehca_qp.c
@@ -1460,6 +1460,8 @@ static int internal_modify_qp(struct ib_qp *ibqp,
 			goto modify_qp_exit2;
 		}
 		mqpcb->path_migration_state = attr->path_mig_state + 1;
+		if (attr->path_mig_state == IB_MIG_REARM)
+			my_qp->mig_armed = 1;
 		update_mask |=
 			EHCA_BMASK_SET(MQPCB_MASK_PATH_MIGRATION_STATE, 1);
 	}
-- 
1.5.5
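The filter in patch 1/2 boils down to a sticky flag that is set on the first rearm and consulted before a PATH_MIG event is handed on. The following user-space model is a sketch with made-up type and function names; only the logic (set on REARM, filter PATH_MIG while unset) is taken from the patch.

```c
#include <stdbool.h>

/* Hypothetical stand-ins for the kernel's event and migration-state enums. */
enum model_event { EV_PATH_MIG, EV_PATH_MIG_ERR, EV_COMM_EST };
enum model_mig_state { MIG_MIGRATED, MIG_REARM, MIG_ARMED };

struct model_qp {
	bool mig_armed;		/* set once, never cleared, as in the patch */
};

/* Mirrors the modify_qp hunk: remember that migration was armed. */
static void model_set_mig_state(struct model_qp *qp, enum model_mig_state s)
{
	if (s == MIG_REARM)
		qp->mig_armed = true;
}

/* Mirrors the dispatch hunk: PATH_MIG on a never-armed QP is a false alarm. */
static bool model_should_dispatch(const struct model_qp *qp, enum model_event ev)
{
	if (ev == EV_PATH_MIG && !qp->mig_armed)
		return false;
	return true;
}
```

Other event types pass through untouched, so the filter cannot hide genuine errors on a QP that never used path migration.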
[PATCH 2/2] IB/ehca: Default value for Local CA ACK Delay
Some firmware versions report a Local CA ACK Delay of 0. In that case,
return a more sensible default value of 12 (-> 16 msec) instead.

Signed-off-by: Joachim Fenkes [EMAIL PROTECTED]
---
 drivers/infiniband/hw/ehca/ehca_hca.c |    4 +++-
 1 files changed, 3 insertions(+), 1 deletions(-)

diff --git a/drivers/infiniband/hw/ehca/ehca_hca.c b/drivers/infiniband/hw/ehca/ehca_hca.c
index bc3b37d..4628822 100644
--- a/drivers/infiniband/hw/ehca/ehca_hca.c
+++ b/drivers/infiniband/hw/ehca/ehca_hca.c
@@ -114,7 +114,9 @@ int ehca_query_device(struct ib_device *ibdev, struct ib_device_attr *props)
 	}
 
 	props->max_pkeys = 16;
-	props->local_ca_ack_delay = min_t(u8, rblock->local_ca_ack_delay, 255);
+	/* Some FW versions say 0 here; insert sensible value in that case */
+	props->local_ca_ack_delay = rblock->local_ca_ack_delay ?
+		min_t(u8, rblock->local_ca_ack_delay, 255) : 12;
 	props->max_raw_ipv6_qp = limit_uint(rblock->max_raw_ipv6_qp);
 	props->max_raw_ethy_qp = limit_uint(rblock->max_raw_ethy_qp);
 	props->max_mcast_grp = limit_uint(rblock->max_mcast_grp);
-- 
1.5.5
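The fallback itself is a one-liner. The sketch below isolates it in a hypothetical helper so the behaviour for a firmware value of 0 can be checked in user space; the default of 12 is the value from the patch, everything else is illustrative.

```c
#include <stdint.h>

#define DEFAULT_LOCAL_CA_ACK_DELAY 12	/* default chosen in the patch */

/* Hypothetical helper: use the firmware value unless it reports 0. */
static uint8_t local_ca_ack_delay_or_default(uint8_t fw_value)
{
	return fw_value ? fw_value : DEFAULT_LOCAL_CA_ACK_DELAY;
}
```

Any nonzero firmware value, including the maximum of 255, is passed through unchanged.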
[PATCH] IB/ehca: Make device table externally visible
This gives ehca an autogenerated modalias and therefore enables automatic
loading.

Signed-off-by: Joachim Fenkes [EMAIL PROTECTED]
---
 drivers/infiniband/hw/ehca/ehca_main.c |    1 +
 1 files changed, 1 insertions(+), 0 deletions(-)

diff --git a/drivers/infiniband/hw/ehca/ehca_main.c b/drivers/infiniband/hw/ehca/ehca_main.c
index 482103e..598844d 100644
--- a/drivers/infiniband/hw/ehca/ehca_main.c
+++ b/drivers/infiniband/hw/ehca/ehca_main.c
@@ -923,6 +923,7 @@ static struct of_device_id ehca_device_table[] =
 	},
 	{},
 };
+MODULE_DEVICE_TABLE(of, ehca_device_table);
 
 static struct of_platform_driver ehca_driver = {
 	.name = "ehca",
-- 
1.5.5
[PATCH] IB/ehca: Reject recv WRs if QP is in RESET state
Signed-off-by: Joachim Fenkes [EMAIL PROTECTED]
---

On Friday 06 June 2008 22:20, Dotan Barak wrote:
> I checked the code in the ehca driver and noticed that post RR to a QP is
> being accepted in any state (including the RESET state).

You're right, this is only consistent -- thanks for pointing it out!

Regards,
  Joachim

 drivers/infiniband/hw/ehca/ehca_reqs.c |   12 ++++++++++--
 1 files changed, 10 insertions(+), 2 deletions(-)

diff --git a/drivers/infiniband/hw/ehca/ehca_reqs.c b/drivers/infiniband/hw/ehca/ehca_reqs.c
index f093b00..ad197f4 100644
--- a/drivers/infiniband/hw/ehca/ehca_reqs.c
+++ b/drivers/infiniband/hw/ehca/ehca_reqs.c
@@ -544,8 +544,16 @@ int ehca_post_recv(struct ib_qp *qp,
 		   struct ib_recv_wr *recv_wr,
 		   struct ib_recv_wr **bad_recv_wr)
 {
-	return internal_post_recv(container_of(qp, struct ehca_qp, ib_qp),
-				  qp->device, recv_wr, bad_recv_wr);
+	struct ehca_qp *my_qp = container_of(qp, struct ehca_qp, ib_qp);
+
+	/* Reject WR if QP is in RESET state */
+	if (unlikely(my_qp->state == IB_QPS_RESET)) {
+		ehca_err(qp->device, "Invalid QP state qp_state=%d qpn=%x",
+			 my_qp->state, qp->qp_num);
+		return -EINVAL;
+	}
+
+	return internal_post_recv(my_qp, qp->device, recv_wr, bad_recv_wr);
 }
 
 int ehca_post_srq_recv(struct ib_srq *srq,
-- 
1.5.5
IB/ehca: Reject send WRs only for RESET, INIT and RTR state
Signed-off-by: Joachim Fenkes [EMAIL PROTECTED]
---
 drivers/infiniband/hw/ehca/ehca_reqs.c |    6 ++++--
 1 files changed, 4 insertions(+), 2 deletions(-)

diff --git a/drivers/infiniband/hw/ehca/ehca_reqs.c b/drivers/infiniband/hw/ehca/ehca_reqs.c
index bbe0436..f093b00 100644
--- a/drivers/infiniband/hw/ehca/ehca_reqs.c
+++ b/drivers/infiniband/hw/ehca/ehca_reqs.c
@@ -421,8 +421,10 @@ int ehca_post_send(struct ib_qp *qp,
 	int ret = 0;
 	unsigned long flags;
 
-	if (unlikely(my_qp->state != IB_QPS_RTS)) {
-		ehca_err(qp->device, "QP not in RTS state qpn=%x", qp->qp_num);
+	/* Reject WR if QP is in RESET, INIT or RTR state */
+	if (unlikely(my_qp->state < IB_QPS_RTS)) {
+		ehca_err(qp->device, "Invalid QP state qp_state=%d qpn=%x",
+			 my_qp->state, qp->qp_num);
 		return -EINVAL;
 	}
 
-- 
1.5.5
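The state checks for posting send and receive work requests can be summarised as two predicates over the QP state. The sketch below is a user-space model: the enum merely mirrors the ordering of the kernel's enum ib_qp_state (RESET, INIT, RTR, RTS, SQD, SQE, ERR), and it reads the comparison lost in this archive as "state < RTS", in line with the subject and the code comment. The names are hypothetical.

```c
#include <stdbool.h>

/* Same ordering as the kernel's enum ib_qp_state; names are local to this sketch. */
enum model_qp_state {
	QPS_RESET, QPS_INIT, QPS_RTR, QPS_RTS, QPS_SQD, QPS_SQE, QPS_ERR
};

/* Send WRs are rejected only in RESET, INIT and RTR. */
static bool send_wr_allowed(enum model_qp_state s)
{
	return !(s < QPS_RTS);
}

/* Receive WRs are rejected only in RESET. */
static bool recv_wr_allowed(enum model_qp_state s)
{
	return s != QPS_RESET;
}
```

Compared with the earlier "state != RTS" test, the ordered comparison lets SQD, SQE and ERR through for send WRs.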
[PATCH 0/5] IB/ehca: IB compliance fix, tracing verbosity and module parameters
[1/5] makes the driver reject SQ WRs if the QP is not in RTS
[2/5] bumps a lot of tracing into higher debug_levels
[3/5] removes the mr_largepage parameter
[4/5] changes some bool-ish module parms into actual bools, also updates
      some descriptions
[5/5] bumps the version number to 0026

Please review these patches and queue them for inclusion into 2.6.26 if you
think they're okay. Thanks!
  Joachim

-- 
Joachim Fenkes -- eHCA Linux Driver Developer and Hardware Tamer
IBM Deutschland Entwicklung GmbH -- Dept. 3627 (I/O Firmware Dev. 2)
Schoenaicher Strasse 220 -- 71032 Boeblingen -- Germany
eMail: [EMAIL PROTECTED]
[PATCH 1/5] IB/ehca: Prevent posting of SQ WQEs if QP not in RTS
...as required by IB Spec, C10-29.

Signed-off-by: Joachim Fenkes [EMAIL PROTECTED]
---
 drivers/infiniband/hw/ehca/ehca_classes.h |    1 +
 drivers/infiniband/hw/ehca/ehca_qp.c      |    3 +++
 drivers/infiniband/hw/ehca/ehca_reqs.c    |    5 +
 3 files changed, 9 insertions(+), 0 deletions(-)

diff --git a/drivers/infiniband/hw/ehca/ehca_classes.h b/drivers/infiniband/hw/ehca/ehca_classes.h
index 0d13fe0..3d6d946 100644
--- a/drivers/infiniband/hw/ehca/ehca_classes.h
+++ b/drivers/infiniband/hw/ehca/ehca_classes.h
@@ -160,6 +160,7 @@ struct ehca_qp {
 	};
 	u32 qp_type;
 	enum ehca_ext_qp_type ext_type;
+	enum ib_qp_state state;
 	struct ipz_queue ipz_squeue;
 	struct ipz_queue ipz_rqueue;
 	struct h_galpas galpas;
diff --git a/drivers/infiniband/hw/ehca/ehca_qp.c b/drivers/infiniband/hw/ehca/ehca_qp.c
index 3eb14a5..5a653d7 100644
--- a/drivers/infiniband/hw/ehca/ehca_qp.c
+++ b/drivers/infiniband/hw/ehca/ehca_qp.c
@@ -550,6 +550,7 @@ static struct ehca_qp *internal_create_qp(
 	spin_lock_init(&my_qp->spinlock_r);
 	my_qp->qp_type = qp_type;
 	my_qp->ext_type = parms.ext_type;
+	my_qp->state = IB_QPS_RESET;
 
 	if (init_attr->recv_cq)
 		my_qp->recv_cq =
@@ -1508,6 +1509,8 @@ static int internal_modify_qp(struct ib_qp *ibqp,
 	if (attr_mask & IB_QP_QKEY)
 		my_qp->qkey = attr->qkey;
 
+	my_qp->state = qp_new_state;
+
 modify_qp_exit2:
 	if (squeue_locked) { /* this means: sqe -> rts */
 		spin_unlock_irqrestore(&my_qp->spinlock_s, flags);
diff --git a/drivers/infiniband/hw/ehca/ehca_reqs.c b/drivers/infiniband/hw/ehca/ehca_reqs.c
index a20bbf4..0b2359e 100644
--- a/drivers/infiniband/hw/ehca/ehca_reqs.c
+++ b/drivers/infiniband/hw/ehca/ehca_reqs.c
@@ -421,6 +421,11 @@ int ehca_post_send(struct ib_qp *qp,
 	int ret = 0;
 	unsigned long flags;
 
+	if (unlikely(my_qp->state != IB_QPS_RTS)) {
+		ehca_err(qp->device, "QP not in RTS state qpn=%x", qp->qp_num);
+		return -EINVAL;
+	}
+
 	/* LOCK the QUEUE */
 	spin_lock_irqsave(&my_qp->spinlock_s, flags);
 
-- 
1.5.5
[PATCH 2/5] IB/ehca: Move high-volume debug output to higher debug levels
Signed-off-by: Joachim Fenkes [EMAIL PROTECTED]
---
 drivers/infiniband/hw/ehca/ehca_irq.c    |    2 +-
 drivers/infiniband/hw/ehca/ehca_main.c   |   14 ++--
 drivers/infiniband/hw/ehca/ehca_mrmw.c   |   16 ++
 drivers/infiniband/hw/ehca/ehca_qp.c     |   12 
 drivers/infiniband/hw/ehca/ehca_reqs.c   |   46 ++---
 drivers/infiniband/hw/ehca/ehca_uverbs.c |    6 +--
 drivers/infiniband/hw/ehca/hcp_if.c      |   23 ---
 7 files changed, 63 insertions(+), 56 deletions(-)

diff --git a/drivers/infiniband/hw/ehca/ehca_irq.c b/drivers/infiniband/hw/ehca/ehca_irq.c
index b5ca94c..ca5eb0c 100644
--- a/drivers/infiniband/hw/ehca/ehca_irq.c
+++ b/drivers/infiniband/hw/ehca/ehca_irq.c
@@ -633,7 +633,7 @@ static inline int find_next_online_cpu(struct ehca_comp_pool *pool)
 	unsigned long flags;
 
 	WARN_ON_ONCE(!in_interrupt());
-	if (ehca_debug_level)
+	if (ehca_debug_level >= 3)
 		ehca_dmp(&cpu_online_map, sizeof(cpumask_t), "");
 
 	spin_lock_irqsave(&pool->last_cpu_lock, flags);
diff --git a/drivers/infiniband/hw/ehca/ehca_main.c b/drivers/infiniband/hw/ehca/ehca_main.c
index 65b3362..4379bef 100644
--- a/drivers/infiniband/hw/ehca/ehca_main.c
+++ b/drivers/infiniband/hw/ehca/ehca_main.c
@@ -85,8 +85,8 @@ module_param_named(lock_hcalls, ehca_lock_hcalls, bool, S_IRUGO);
 MODULE_PARM_DESC(open_aqp1,
 		 "AQP1 on startup (0: no (default), 1: yes)");
 MODULE_PARM_DESC(debug_level,
-		 "debug level"
-		 " (0: no debug traces (default), 1: with debug traces)");
+		 "Amount of debug output (0: none (default), 1: traces, "
+		 "2: some dumps, 3: lots)");
 MODULE_PARM_DESC(hw_level,
 		 "hardware level"
 		 " (0: autosensing (default), 1: v. 0.20, 2: v. 0.21)");
@@ -275,6 +275,7 @@ static int ehca_sense_attributes(struct ehca_shca *shca)
 	u64 h_ret;
 	struct hipz_query_hca *rblock;
 	struct hipz_query_port *port;
+	const char *loc_code;
 
 	static const u32 pgsize_map[] = {
 		HCA_CAP_MR_PGSIZE_4K, 0x1000,
@@ -283,6 +284,12 @@ static int ehca_sense_attributes(struct ehca_shca *shca)
 		HCA_CAP_MR_PGSIZE_16M, 0x1000000,
 	};
 
+	ehca_gen_dbg("Probing adapter %s...",
+		     shca->ofdev->node->full_name);
+	loc_code = of_get_property(shca->ofdev->node, "ibm,loc-code", NULL);
+	if (loc_code)
+		ehca_gen_dbg(" ... location lode=%s", loc_code);
+
 	rblock = ehca_alloc_fw_ctrlblock(GFP_KERNEL);
 	if (!rblock) {
 		ehca_gen_err("Cannot allocate rblock memory.");
@@ -567,8 +574,7 @@ static int ehca_destroy_aqp1(struct ehca_sport *sport)
 static ssize_t ehca_show_debug_level(struct device_driver *ddp,
 				     char *buf)
 {
-	return snprintf(buf, PAGE_SIZE, "%d\n",
-			ehca_debug_level);
+	return snprintf(buf, PAGE_SIZE, "%d\n", ehca_debug_level);
 }
 
 static ssize_t ehca_store_debug_level(struct device_driver *ddp,
diff --git a/drivers/infiniband/hw/ehca/ehca_mrmw.c b/drivers/infiniband/hw/ehca/ehca_mrmw.c
index f26997f..46ae4eb 100644
--- a/drivers/infiniband/hw/ehca/ehca_mrmw.c
+++ b/drivers/infiniband/hw/ehca/ehca_mrmw.c
@@ -1794,8 +1794,9 @@ static int ehca_check_kpages_per_ate(struct scatterlist *page_list,
 	int t;
 	for (t = start_idx; t <= end_idx; t++) {
 		u64 pgaddr = page_to_pfn(sg_page(&page_list[t])) << PAGE_SHIFT;
-		ehca_gen_dbg("chunk_page=%lx value=%016lx", pgaddr,
-			     *(u64 *)abs_to_virt(phys_to_abs(pgaddr)));
+		if (ehca_debug_level >= 3)
+			ehca_gen_dbg("chunk_page=%lx value=%016lx", pgaddr,
+				     *(u64 *)abs_to_virt(phys_to_abs(pgaddr)));
 		if (pgaddr - PAGE_SIZE != *prev_pgaddr) {
 			ehca_gen_err("uncontiguous page found pgaddr=%lx "
 				     "prev_pgaddr=%lx page_list_i=%x",
@@ -1862,10 +1863,13 @@ static int ehca_set_pagebuf_user2(struct ehca_mr_pginfo *pginfo,
 						pgaddr & ~(pginfo->hwpage_size - 1));
 				}
-				ehca_gen_dbg("kpage=%lx chunk_page=%lx "
-					     "value=%016lx", *kpage, pgaddr,
-					     *(u64 *)abs_to_virt(
-						     phys_to_abs(pgaddr)));
+				if (ehca_debug_level >= 3) {
+					u64 val = *(u64 *)abs_to_virt(
+						phys_to_abs(pgaddr));
+					ehca_gen_dbg("kpage=%lx chunk_page=%lx "
+						     "value=%016lx",
+						     *kpage
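The new debug_level description (0: none, 1: traces, 2: some dumps, 3: lots) boils down to comparing the configured level against a per-message minimum. A rough user-space sketch of that gating follows; debug_level and dbg_at() are hypothetical names made up for this example and are not part of the driver, which uses ehca_debug_level with its own ehca_gen_dbg()/ehca_dmp() helpers.

```c
#include <assert.h>
#include <stddef.h>
#include <stdio.h>

/* Hypothetical stand-in for the driver's ehca_debug_level parameter. */
int debug_level = 0;

/*
 * Emit msg only if the configured level is at least min_level.
 * Returns 1 if the message was printed, 0 if it was suppressed.
 */
int dbg_at(int min_level, const char *msg)
{
	assert(msg != NULL);
	if (debug_level < min_level)
		return 0;
	fprintf(stderr, "dbg[%d]: %s\n", min_level, msg);
	return 1;
}
```

Under this scheme a high-volume page dump would be issued as dbg_at(3, ...), so it stays quiet at debug_level=1 while ordinary traces issued as dbg_at(1, ...) still show up.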
[PATCH 3/5] IB/ehca: Remove mr_largepage parameter
Always enable large page support; didn't seem to cause problems for anyone.

Signed-off-by: Joachim Fenkes [EMAIL PROTECTED]
---
 drivers/infiniband/hw/ehca/ehca_main.c |   22 +++---
 1 files changed, 3 insertions(+), 19 deletions(-)

diff --git a/drivers/infiniband/hw/ehca/ehca_main.c b/drivers/infiniband/hw/ehca/ehca_main.c
index 4379bef..ab02ac8 100644
--- a/drivers/infiniband/hw/ehca/ehca_main.c
+++ b/drivers/infiniband/hw/ehca/ehca_main.c
@@ -60,7 +60,6 @@ MODULE_VERSION(HCAD_VERSION);
 static int ehca_open_aqp1 = 0;
 static int ehca_hw_level = 0;
 static int ehca_poll_all_eqs = 1;
-static int ehca_mr_largepage = 1;
 
 int ehca_debug_level = 0;
 int ehca_nr_ports = 2;
@@ -79,7 +78,6 @@ module_param_named(port_act_time, ehca_port_act_time, int, S_IRUGO);
 module_param_named(poll_all_eqs, ehca_poll_all_eqs, int, S_IRUGO);
 module_param_named(static_rate, ehca_static_rate, int, S_IRUGO);
 module_param_named(scaling_code, ehca_scaling_code, int, S_IRUGO);
-module_param_named(mr_largepage, ehca_mr_largepage, int, S_IRUGO);
 module_param_named(lock_hcalls, ehca_lock_hcalls, bool, S_IRUGO);
 
 MODULE_PARM_DESC(open_aqp1,
@@ -104,9 +102,6 @@ MODULE_PARM_DESC(static_rate,
 		 "set permanent static rate (default: disabled)");
 MODULE_PARM_DESC(scaling_code,
 		 "set scaling code (0: disabled/default, 1: enabled)");
-MODULE_PARM_DESC(mr_largepage,
-		 "use large page for MR (0: use PAGE_SIZE (default), "
-		 "1: use large page depending on MR size");
 MODULE_PARM_DESC(lock_hcalls,
 		 "serialize all hCalls made by the driver "
 		 "(default: autodetect)");
@@ -357,11 +352,9 @@ static int ehca_sense_attributes(struct ehca_shca *shca)
 
 	/* translate supported MR page sizes; always support 4K */
 	shca->hca_cap_mr_pgsize = EHCA_PAGESIZE;
-	if (ehca_mr_largepage) { /* support extra sizes only if enabled */
-		for (i = 0; i < ARRAY_SIZE(pgsize_map); i += 2)
-			if (rblock->memory_page_size_supported & pgsize_map[i])
-				shca->hca_cap_mr_pgsize |= pgsize_map[i + 1];
-	}
+	for (i = 0; i < ARRAY_SIZE(pgsize_map); i += 2)
+		if (rblock->memory_page_size_supported & pgsize_map[i])
+			shca->hca_cap_mr_pgsize |= pgsize_map[i + 1];
 
 	/* query max MTU from first port -- it's the same for all ports */
 	port = (struct hipz_query_port *)rblock;
@@ -663,14 +656,6 @@ static ssize_t ehca_show_adapter_handle(struct device *dev,
 }
 static DEVICE_ATTR(adapter_handle, S_IRUGO, ehca_show_adapter_handle, NULL);
 
-static ssize_t ehca_show_mr_largepage(struct device *dev,
-				      struct device_attribute *attr,
-				      char *buf)
-{
-	return sprintf(buf, "%d\n", ehca_mr_largepage);
-}
-static DEVICE_ATTR(mr_largepage, S_IRUGO, ehca_show_mr_largepage, NULL);
-
 static struct attribute *ehca_dev_attrs[] = {
 	&dev_attr_adapter_handle.attr,
 	&dev_attr_num_ports.attr,
@@ -687,7 +672,6 @@ static struct attribute *ehca_dev_attrs[] = {
 	&dev_attr_cur_mw.attr,
 	&dev_attr_max_pd.attr,
 	&dev_attr_max_ah.attr,
-	&dev_attr_mr_largepage.attr,
 	NULL
 };
 
-- 
1.5.5
[PATCH 4/5] IB/ehca: Make some module parameters bool, update descriptions
Signed-off-by: Joachim Fenkes [EMAIL PROTECTED]
---
 drivers/infiniband/hw/ehca/ehca_main.c |   37 +++
 1 files changed, 18 insertions(+), 19 deletions(-)

diff --git a/drivers/infiniband/hw/ehca/ehca_main.c b/drivers/infiniband/hw/ehca/ehca_main.c
index ab02ac8..45fe35a 100644
--- a/drivers/infiniband/hw/ehca/ehca_main.c
+++ b/drivers/infiniband/hw/ehca/ehca_main.c
@@ -69,41 +69,40 @@ int ehca_static_rate = -1;
 int ehca_scaling_code = 0;
 int ehca_lock_hcalls = -1;
 
-module_param_named(open_aqp1, ehca_open_aqp1, int, S_IRUGO);
-module_param_named(debug_level, ehca_debug_level, int, S_IRUGO);
-module_param_named(hw_level, ehca_hw_level, int, S_IRUGO);
-module_param_named(nr_ports, ehca_nr_ports, int, S_IRUGO);
-module_param_named(use_hp_mr, ehca_use_hp_mr, int, S_IRUGO);
-module_param_named(port_act_time, ehca_port_act_time, int, S_IRUGO);
-module_param_named(poll_all_eqs, ehca_poll_all_eqs, int, S_IRUGO);
-module_param_named(static_rate, ehca_static_rate, int, S_IRUGO);
-module_param_named(scaling_code, ehca_scaling_code, int, S_IRUGO);
+module_param_named(open_aqp1, ehca_open_aqp1, bool, S_IRUGO);
+module_param_named(debug_level, ehca_debug_level, int, S_IRUGO);
+module_param_named(hw_level, ehca_hw_level, int, S_IRUGO);
+module_param_named(nr_ports, ehca_nr_ports, int, S_IRUGO);
+module_param_named(use_hp_mr, ehca_use_hp_mr, bool, S_IRUGO);
+module_param_named(port_act_time, ehca_port_act_time, int, S_IRUGO);
+module_param_named(poll_all_eqs, ehca_poll_all_eqs, bool, S_IRUGO);
+module_param_named(static_rate, ehca_static_rate, int, S_IRUGO);
+module_param_named(scaling_code, ehca_scaling_code, bool, S_IRUGO);
 module_param_named(lock_hcalls, ehca_lock_hcalls, bool, S_IRUGO);
 
 MODULE_PARM_DESC(open_aqp1,
-		 "AQP1 on startup (0: no (default), 1: yes)");
+		 "Open AQP1 on startup (default: no)");
 MODULE_PARM_DESC(debug_level,
 		 "Amount of debug output (0: none (default), 1: traces, "
 		 "2: some dumps, 3: lots)");
 MODULE_PARM_DESC(hw_level,
-		 "hardware level"
-		 " (0: autosensing (default), 1: v. 0.20, 2: v. 0.21)");
+		 "Hardware level (0: autosensing (default), "
+		 "0x10..0x14: eHCA, 0x20..0x23: eHCA2)");
 MODULE_PARM_DESC(nr_ports,
 		 "number of connected ports (-1: autodetect, 1: port one only, "
 		 "2: two ports (default)");
 MODULE_PARM_DESC(use_hp_mr,
-		 "high performance MRs (0: no (default), 1: yes)");
+		 "Use high performance MRs (default: no)");
 MODULE_PARM_DESC(port_act_time,
-		 "time to wait for port activation (default: 30 sec)");
+		 "Time to wait for port activation (default: 30 sec)");
 MODULE_PARM_DESC(poll_all_eqs,
-		 "polls all event queues periodically"
-		 " (0: no, 1: yes (default))");
+		 "Poll all event queues periodically (default: yes)");
 MODULE_PARM_DESC(static_rate,
-		 "set permanent static rate (default: disabled)");
+		 "Set permanent static rate (default: no static rate)");
 MODULE_PARM_DESC(scaling_code,
-		 "set scaling code (0: disabled/default, 1: enabled)");
+		 "Enable scaling code (default: no)");
 MODULE_PARM_DESC(lock_hcalls,
-		 "serialize all hCalls made by the driver "
+		 "Serialize all hCalls made by the driver "
 		 "(default: autodetect)");
 
 DEFINE_RWLOCK(ehca_qp_idr_lock);
-- 
1.5.5
[PATCH 5/5] IB/ehca: Bump version number to 0026
Signed-off-by: Joachim Fenkes [EMAIL PROTECTED]
---
 drivers/infiniband/hw/ehca/ehca_main.c |    2 +-
 1 files changed, 1 insertions(+), 1 deletions(-)

diff --git a/drivers/infiniband/hw/ehca/ehca_main.c b/drivers/infiniband/hw/ehca/ehca_main.c
index 45fe35a..6504897 100644
--- a/drivers/infiniband/hw/ehca/ehca_main.c
+++ b/drivers/infiniband/hw/ehca/ehca_main.c
@@ -50,7 +50,7 @@
 #include "ehca_tools.h"
 #include "hcp_if.h"
 
-#define HCAD_VERSION "0025"
+#define HCAD_VERSION "0026"
 
 MODULE_LICENSE("Dual BSD/GPL");
 MODULE_AUTHOR("Christoph Raisch [EMAIL PROTECTED]");
-- 
1.5.5
Re: [PATCH 1/5] IB/ehca: Prevent posting of SQ WQEs if QP not in RTS
On Monday 21 April 2008 10:04, Joachim Fenkes wrote:
> +	if (unlikely(my_qp->state != IB_QPS_RTS)) {
> +		ehca_err(qp->device, "QP not in RTS state qpn=%x", qp->qp_num);
> +		return -EINVAL;
> +	}

Myself, I'm not very happy with using EINVAL, but I can't think of a more
fitting return code. Also, this is what nes, amso and cxgb3 return in such a
case; ipath posts an error CQE and mthca/mlx4 don't do this check at all
(AFAICS).

Better suggestions, anyone?

Regards,
  Joachim
[PATCH 1/2] IB/ehca: Update sma_attr also in case of disruptive config change
Signed-off-by: Joachim Fenkes [EMAIL PROTECTED]
---
 drivers/infiniband/hw/ehca/ehca_irq.c |    2 ++
 1 files changed, 2 insertions(+), 0 deletions(-)

diff --git a/drivers/infiniband/hw/ehca/ehca_irq.c b/drivers/infiniband/hw/ehca/ehca_irq.c
index 863b34f..b5ca94c 100644
--- a/drivers/infiniband/hw/ehca/ehca_irq.c
+++ b/drivers/infiniband/hw/ehca/ehca_irq.c
@@ -403,6 +403,8 @@ static void parse_ec(struct ehca_shca *shca, u64 eqe)
 			sport->port_state = IB_PORT_ACTIVE;
 			dispatch_port_event(shca, port, IB_EVENT_PORT_ACTIVE,
 					    "is active");
+			ehca_query_sma_attr(shca, port,
+					    &sport->saved_attr);
 		} else
 			notify_port_conf_change(shca, port);
 		break;
-- 
1.5.2
[PATCH 0/2] IB/ehca: PMA support and a minor fix
This patchset will fix a minor issue and then add support for Performance
MADs by redirecting all PMA queries to the actual PMA QP.

[1/2] adds a missing query_sma_attr()
[2/2] adds PMA redirection code

The patches will apply, in order, on top of Roland's for-2.6.25 branch.
Please review them and apply for 2.6.25 if you think they're okay.

Thanks and regards,
  Joachim

-- 
Joachim Fenkes -- eHCA Linux Driver Developer and Hardware Tamer
IBM Deutschland Entwicklung GmbH -- Dept. 3627 (I/O Firmware Dev. 2)
Schoenaicher Strasse 220 -- 71032 Boeblingen -- Germany
eMail: [EMAIL PROTECTED]
[PATCH 2/2] IB/ehca: Add PMA support
From: Hoang-Nam Nguyen [EMAIL PROTECTED]

This patch enables ehca to redirect any PMA queries to the actual PMA QP.

Signed-off-by: Hoang-Nam Nguyen [EMAIL PROTECTED]
Reviewed-by: Joachim Fenkes [EMAIL PROTECTED]
Reviewed-by: Christoph Raisch [EMAIL PROTECTED]
---
 drivers/infiniband/hw/ehca/ehca_classes.h |    1 +
 drivers/infiniband/hw/ehca/ehca_iverbs.h  |    5 ++
 drivers/infiniband/hw/ehca/ehca_main.c    |    2 +-
 drivers/infiniband/hw/ehca/ehca_sqp.c     |   91 +
 4 files changed, 98 insertions(+), 1 deletions(-)

diff --git a/drivers/infiniband/hw/ehca/ehca_classes.h b/drivers/infiniband/hw/ehca/ehca_classes.h
index f281d16..92cce8a 100644
--- a/drivers/infiniband/hw/ehca/ehca_classes.h
+++ b/drivers/infiniband/hw/ehca/ehca_classes.h
@@ -101,6 +101,7 @@ struct ehca_sport {
 	spinlock_t mod_sqp_lock;
 	enum ib_port_state port_state;
 	struct ehca_sma_attr saved_attr;
+	u32 pma_qp_nr;
 };
 
 #define HCA_CAP_MR_PGSIZE_4K 0x80000000
diff --git a/drivers/infiniband/hw/ehca/ehca_iverbs.h b/drivers/infiniband/hw/ehca/ehca_iverbs.h
index c469bfd..a8a2ea5 100644
--- a/drivers/infiniband/hw/ehca/ehca_iverbs.h
+++ b/drivers/infiniband/hw/ehca/ehca_iverbs.h
@@ -187,6 +187,11 @@ int ehca_dealloc_ucontext(struct ib_ucontext *context);
 
 int ehca_mmap(struct ib_ucontext *context, struct vm_area_struct *vma);
 
+int ehca_process_mad(struct ib_device *ibdev, int mad_flags, u8 port_num,
+		     struct ib_wc *in_wc, struct ib_grh *in_grh,
+		     struct ib_mad *in_mad,
+		     struct ib_mad *out_mad);
+
 void ehca_poll_eqs(unsigned long data);
 
 int ehca_calc_ipd(struct ehca_shca *shca, int port,
diff --git a/drivers/infiniband/hw/ehca/ehca_main.c b/drivers/infiniband/hw/ehca/ehca_main.c
index 0fe0c84..33b5bac 100644
--- a/drivers/infiniband/hw/ehca/ehca_main.c
+++ b/drivers/infiniband/hw/ehca/ehca_main.c
@@ -472,7 +472,7 @@ int ehca_init_device(struct ehca_shca *shca)
 	shca->ib_device.dealloc_fmr = ehca_dealloc_fmr;
 	shca->ib_device.attach_mcast = ehca_attach_mcast;
 	shca->ib_device.detach_mcast = ehca_detach_mcast;
-	/* shca->ib_device.process_mad = ehca_process_mad; */
+	shca->ib_device.process_mad = ehca_process_mad;
 	shca->ib_device.mmap = ehca_mmap;
 
 	if (EHCA_BMASK_GET(HCA_CAP_SRQ, shca->hca_cap)) {
diff --git a/drivers/infiniband/hw/ehca/ehca_sqp.c b/drivers/infiniband/hw/ehca/ehca_sqp.c
index 79e72b2..706d97a 100644
--- a/drivers/infiniband/hw/ehca/ehca_sqp.c
+++ b/drivers/infiniband/hw/ehca/ehca_sqp.c
@@ -39,12 +39,18 @@
  * POSSIBILITY OF SUCH DAMAGE.
  */
 
+#include <rdma/ib_mad.h>
 
 #include "ehca_classes.h"
 #include "ehca_tools.h"
 #include "ehca_iverbs.h"
 #include "hcp_if.h"
 
+#define IB_MAD_STATUS_REDIRECT		__constant_htons(0x0002)
+#define IB_MAD_STATUS_UNSUP_VERSION	__constant_htons(0x0004)
+#define IB_MAD_STATUS_UNSUP_METHOD	__constant_htons(0x0008)
+
+#define IB_PMA_CLASS_PORT_INFO		__constant_htons(0x0001)
 
 /**
  * ehca_define_sqp - Defines special queue pair 1 (GSI QP). When special queue
@@ -83,6 +89,9 @@ u64 ehca_define_sqp(struct ehca_shca *shca,
 				 port, ret);
 			return ret;
 		}
+		shca->sport[port - 1].pma_qp_nr = pma_qp_nr;
+		ehca_dbg(&shca->ib_device, "port=%x pma_qp_nr=%x",
+			 port, pma_qp_nr);
 		break;
 	default:
 		ehca_err(&shca->ib_device, "invalid qp_type=%x",
@@ -109,3 +118,85 @@ u64 ehca_define_sqp(struct ehca_shca *shca,
 
 	return H_SUCCESS;
 }
+
+struct ib_perf {
+	struct ib_mad_hdr mad_hdr;
+	u8 reserved[40];
+	u8 data[192];
+} __attribute__ ((packed));
+
+
+static int ehca_process_perf(struct ib_device *ibdev, u8 port_num,
+			     struct ib_mad *in_mad, struct ib_mad *out_mad)
+{
+	struct ib_perf *in_perf = (struct ib_perf *)in_mad;
+	struct ib_perf *out_perf = (struct ib_perf *)out_mad;
+	struct ib_class_port_info *poi =
+		(struct ib_class_port_info *)out_perf->data;
+	struct ehca_shca *shca =
+		container_of(ibdev, struct ehca_shca, ib_device);
+	struct ehca_sport *sport = &shca->sport[port_num - 1];
+
+	ehca_dbg(ibdev, "method=%x", in_perf->mad_hdr.method);
+
+	*out_mad = *in_mad;
+
+	if (in_perf->mad_hdr.class_version != 1) {
+		ehca_warn(ibdev, "Unsupported class_version=%x",
+			  in_perf->mad_hdr.class_version);
+		out_perf->mad_hdr.status = IB_MAD_STATUS_UNSUP_VERSION;
+		goto perf_reply;
+	}
+
+	switch (in_perf->mad_hdr.method) {
+	case IB_MGMT_METHOD_GET:
+	case IB_MGMT_METHOD_SET:
+		/* set class port info for redirection */
+		out_perf->mad_hdr.attr_id = IB_PMA_CLASS_PORT_INFO;
+		out_perf
[PATCH] IB/ehca: Prevent sending UD packets to QP0
IB spec doesn't allow packets to QP0 sent on any other VL than VL15.
Hardware doesn't filter those packets on the send side, so we need to do this
in the driver and firmware.

As eHCA doesn't support QP0, we can just filter out all traffic going to QP0,
regardless of SL or VL.

Signed-off-by: Joachim Fenkes [EMAIL PROTECTED]
---
 drivers/infiniband/hw/ehca/ehca_reqs.c |    4 
 1 files changed, 4 insertions(+), 0 deletions(-)

diff --git a/drivers/infiniband/hw/ehca/ehca_reqs.c b/drivers/infiniband/hw/ehca/ehca_reqs.c
index 3aacc8c..2ce8cff 100644
--- a/drivers/infiniband/hw/ehca/ehca_reqs.c
+++ b/drivers/infiniband/hw/ehca/ehca_reqs.c
@@ -209,6 +209,10 @@ static inline int ehca_write_swqe(struct ehca_qp *qp,
 			ehca_gen_err("wr.ud.ah is NULL. qp=%p", qp);
 			return -EINVAL;
 		}
+		if (unlikely(send_wr->wr.ud.remote_qpn == 0)) {
+			ehca_gen_err("dest QP# is 0. qp=%x", qp->real_qp_num);
+			return -EINVAL;
+		}
 		my_av = container_of(send_wr->wr.ud.ah, struct ehca_av, ib_ah);
 		wqe_p->u.ud_av.ud_av = my_av->av;
 
-- 
1.5.2
[PATCH 1/4] IB/ehca: Remove CQ-QP-link before destroying QP in error path of create_qp()
From: Hoang-Nam Nguyen hnguyen at de.ibm.com

Signed-off-by: Hoang-Nam Nguyen [EMAIL PROTECTED]
---
 drivers/infiniband/hw/ehca/ehca_qp.c |    5 -
 1 files changed, 4 insertions(+), 1 deletions(-)

diff --git a/drivers/infiniband/hw/ehca/ehca_qp.c b/drivers/infiniband/hw/ehca/ehca_qp.c
index f116eb7..26c6a94 100644
--- a/drivers/infiniband/hw/ehca/ehca_qp.c
+++ b/drivers/infiniband/hw/ehca/ehca_qp.c
@@ -769,12 +769,15 @@ static struct ehca_qp *internal_create_qp(
 		if (ib_copy_to_udata(udata, &resp, sizeof resp)) {
 			ehca_err(pd->device, "Copy to udata failed");
 			ret = -EINVAL;
-			goto create_qp_exit4;
+			goto create_qp_exit5;
 		}
 	}
 
 	return my_qp;
 
+create_qp_exit5:
+	ehca_cq_unassign_qp(my_qp->send_cq, my_qp->real_qp_num);
+
 create_qp_exit4:
 	if (HAS_RQ(my_qp))
 		ipz_queue_dtor(my_pd, &my_qp->ipz_rqueue);
-- 
1.5.2
[PATCH 2/4] IB/ehca: Define array to store SMI/GSI QPs
From: Hoang-Nam Nguyen hnguyen at de.ibm.com

Signed-off-by: Hoang-Nam Nguyen [EMAIL PROTECTED]
---
 drivers/infiniband/hw/ehca/ehca_classes.h |    2 +-
 drivers/infiniband/hw/ehca/ehca_main.c    |    6 +++---
 2 files changed, 4 insertions(+), 4 deletions(-)

diff --git a/drivers/infiniband/hw/ehca/ehca_classes.h b/drivers/infiniband/hw/ehca/ehca_classes.h
index 74d2b72..936580d 100644
--- a/drivers/infiniband/hw/ehca/ehca_classes.h
+++ b/drivers/infiniband/hw/ehca/ehca_classes.h
@@ -94,7 +94,7 @@ struct ehca_sma_attr {
 
 struct ehca_sport {
 	struct ib_cq *ibcq_aqp1;
-	struct ib_qp *ibqp_aqp1;
+	struct ib_qp *ibqp_sqp[2];
 	enum ib_port_state port_state;
 	struct ehca_sma_attr saved_attr;
 };
diff --git a/drivers/infiniband/hw/ehca/ehca_main.c b/drivers/infiniband/hw/ehca/ehca_main.c
index 6a56d86..cde486c 100644
--- a/drivers/infiniband/hw/ehca/ehca_main.c
+++ b/drivers/infiniband/hw/ehca/ehca_main.c
@@ -511,7 +511,7 @@ static int ehca_create_aqp1(struct ehca_shca *shca, u32 port)
 	}
 	sport->ibcq_aqp1 = ibcq;
 
-	if (sport->ibqp_aqp1) {
+	if (sport->ibqp_sqp[IB_QPT_GSI]) {
 		ehca_err(&shca->ib_device, "AQP1 QP is already created.");
 		ret = -EPERM;
 		goto create_aqp1;
@@ -537,7 +537,7 @@ static int ehca_create_aqp1(struct ehca_shca *shca, u32 port)
 		ret = PTR_ERR(ibqp);
 		goto create_aqp1;
 	}
-	sport->ibqp_aqp1 = ibqp;
+	sport->ibqp_sqp[IB_QPT_GSI] = ibqp;
 
 	return 0;
 
@@ -550,7 +550,7 @@ static int ehca_destroy_aqp1(struct ehca_sport *sport)
 {
 	int ret;
 
-	ret = ib_destroy_qp(sport->ibqp_aqp1);
+	ret = ib_destroy_qp(sport->ibqp_sqp[IB_QPT_GSI]);
 	if (ret) {
 		ehca_gen_err("Cannot destroy AQP1 QP. ret=%i", ret);
 		return ret;
-- 
1.5.2
[PATCH 4/4] IB/ehca: Prevent RDMA-related connection failures
Some HW revisions of eHCA2 may cause an RC connection to break if they
received RDMA Reads over that connection before. This can be prevented by
assuring that, after the first RDMA Read, the QP receives a new RDMA Read
every few million link packets.

Include code into the driver that inserts an empty (size 0) RDMA Read into
the message stream every now and then if the consumer doesn't post them
frequently enough.

Signed-off-by: Joachim Fenkes [EMAIL PROTECTED]
---
 drivers/infiniband/hw/ehca/ehca_classes.h |    5 ++
 drivers/infiniband/hw/ehca/ehca_qp.c      |   14 +++-
 drivers/infiniband/hw/ehca/ehca_reqs.c    |  112 
 3 files changed, 95 insertions(+), 36 deletions(-)

diff --git a/drivers/infiniband/hw/ehca/ehca_classes.h b/drivers/infiniband/hw/ehca/ehca_classes.h
index 2502366..f281d16 100644
--- a/drivers/infiniband/hw/ehca/ehca_classes.h
+++ b/drivers/infiniband/hw/ehca/ehca_classes.h
@@ -183,6 +183,11 @@ struct ehca_qp {
 	u32 mm_count_squeue;
 	u32 mm_count_rqueue;
 	u32 mm_count_galpa;
+	/* unsolicited ack circumvention */
+	int unsol_ack_circ;
+	int mtu_shift;
+	u32 message_count;
+	u32 packet_count;
 };
 
 #define IS_SRQ(qp) (qp->ext_type == EQPT_SRQ)
diff --git a/drivers/infiniband/hw/ehca/ehca_qp.c b/drivers/infiniband/hw/ehca/ehca_qp.c
index bb7ccef..6c050e0 100644
--- a/drivers/infiniband/hw/ehca/ehca_qp.c
+++ b/drivers/infiniband/hw/ehca/ehca_qp.c
@@ -592,10 +592,8 @@ static struct ehca_qp *internal_create_qp(
 		goto create_qp_exit1;
 	}
 
-	if (init_attr->sq_sig_type == IB_SIGNAL_ALL_WR)
-		parms.sigtype = HCALL_SIGT_EVERY;
-	else
-		parms.sigtype = HCALL_SIGT_BY_WQE;
+	/* Always signal by WQE so we can hide circ. WQEs */
+	parms.sigtype = HCALL_SIGT_BY_WQE;
 
 	/* UD_AV CIRCUMVENTION */
 	max_send_sge = init_attr->cap.max_send_sge;
@@ -618,6 +616,10 @@ static struct ehca_qp *internal_create_qp(
 	parms.squeue.max_sge = max_send_sge;
 	parms.rqueue.max_sge = max_recv_sge;
 
+	/* RC QPs need one more SWQE for unsolicited ack circumvention */
+	if (qp_type == IB_QPT_RC)
+		parms.squeue.max_wr++;
+
 	if (EHCA_BMASK_GET(HCA_CAP_MINI_QP, shca->hca_cap)) {
 		if (HAS_SQ(my_qp))
 			ehca_determine_small_queue(
@@ -650,6 +652,8 @@ static struct ehca_qp *internal_create_qp(
 			parms.squeue.act_nr_sges = 1;
 			parms.rqueue.act_nr_sges = 1;
 		}
+		/* hide the extra WQE */
+		parms.squeue.act_nr_wqes--;
 		break;
 	case IB_QPT_UD:
 	case IB_QPT_GSI:
@@ -1295,6 +1299,8 @@ static int internal_modify_qp(struct ib_qp *ibqp,
 	}
 
 	if (attr_mask & IB_QP_PATH_MTU) {
+		/* store ld(MTU) */
+		my_qp->mtu_shift = attr->path_mtu + 7;
 		mqpcb->path_mtu = attr->path_mtu;
 		update_mask |= EHCA_BMASK_SET(MQPCB_MASK_PATH_MTU, 1);
 	}
diff --git a/drivers/infiniband/hw/ehca/ehca_reqs.c b/drivers/infiniband/hw/ehca/ehca_reqs.c
index ea91360..3aacc8c 100644
--- a/drivers/infiniband/hw/ehca/ehca_reqs.c
+++ b/drivers/infiniband/hw/ehca/ehca_reqs.c
@@ -50,6 +50,9 @@
 #include "hcp_if.h"
 #include "hipz_fns.h"
 
+/* in RC traffic, insert an empty RDMA READ every this many packets */
+#define ACK_CIRC_THRESHOLD 2000000
+
 static inline int ehca_write_rwqe(struct ipz_queue *ipz_rqueue,
 				  struct ehca_wqe *wqe_p,
 				  struct ib_recv_wr *recv_wr)
@@ -81,7 +84,7 @@ static inline int ehca_write_rwqe(struct ipz_queue *ipz_rqueue,
 	if (ehca_debug_level) {
 		ehca_gen_dbg("RECEIVE WQE written into ipz_rqueue=%p",
 			     ipz_rqueue);
-		ehca_dmp( wqe_p, 16*(6 + wqe_p->nr_of_data_seg), "recv wqe");
+		ehca_dmp(wqe_p, 16*(6 + wqe_p->nr_of_data_seg), "recv wqe");
 	}
 
 	return 0;
@@ -135,7 +138,8 @@ static void trace_send_wr_ud(const struct ib_send_wr *send_wr)
 
 static inline int ehca_write_swqe(struct ehca_qp *qp,
 				  struct ehca_wqe *wqe_p,
-				  const struct ib_send_wr *send_wr)
+				  const struct ib_send_wr *send_wr,
+				  int hidden)
 {
 	u32 idx;
 	u64 dma_length;
@@ -176,7 +180,9 @@ static inline int ehca_write_swqe(struct ehca_qp *qp,
 
 	wqe_p->wr_flag = 0;
 
-	if (send_wr->send_flags & IB_SEND_SIGNALED)
+	if ((send_wr->send_flags & IB_SEND_SIGNALED ||
+	    qp->init_attr.sq_sig_type == IB_SIGNAL_ALL_WR)
+	    && !hidden)
 		wqe_p->wr_flag |= WQE_WRFLAG_REQ_SIGNAL_COM;
 
 	if (send_wr->opcode == IB_WR_SEND_WITH_IMM ||
@@ -199,7 +205,7 @@ static inline int ehca_write_swqe(struct ehca_qp *qp,
 		wqe_p
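The counting half of this patch is cut off in the archive, so the following is only a guess at the bookkeeping the description implies, not the driver's code. It assumes that mtu_shift holds ld(MTU) (the IB path MTU enum runs from 1 for 256 bytes to 5 for 4096 bytes, hence the "+ 7"), that a message of dma_length bytes is charged (dma_length >> mtu_shift) + 1 packets, and that the threshold is on the order of the "few million link packets" named above. struct circ_state and both functions are hypothetical names.

```c
#include <assert.h>
#include <stdint.h>

/* Assumed threshold: "every few million link packets". */
#define ACK_CIRC_THRESHOLD 2000000u

/* IB path MTU enum values: 1 = 256 bytes ... 5 = 4096 bytes. */
int mtu_shift_from_enum(int path_mtu_enum)
{
	assert(path_mtu_enum >= 1 && path_mtu_enum <= 5);
	return path_mtu_enum + 7;	/* ld(MTU) */
}

/* Hypothetical per-QP bookkeeping for this sketch. */
struct circ_state {
	int mtu_shift;
	uint32_t packet_count;
};

/*
 * Charge one posted message to the packet counter.
 * Returns 1 when a hidden zero-length RDMA Read would be due, and
 * restarts the count in that case; returns 0 otherwise.
 */
int account_message(struct circ_state *s, uint64_t dma_length)
{
	s->packet_count += (uint32_t)(dma_length >> s->mtu_shift) + 1;
	if (s->packet_count > ACK_CIRC_THRESHOLD) {
		s->packet_count = 0;
		return 1;
	}
	return 0;
}
```

Using a shift instead of a division keeps the per-WR cost low, which is presumably why the patch stores ld(MTU) rather than the MTU itself.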
[PATCH 0/4] IB/ehca: fixes, port connectivity autodetection, problem workaround
This patchset will fix a minor issue, introduce port connectivity
autodetection and work around an RDMA-related problem in eHCA2.

[1/4] fixes an error path in create_qp()
[2/4] stores the SMI/GSI QPs in a per-port array
[3/4] adds port connectivity autodetection
[4/4] adds the aforementioned workaround

The patches will apply, in order, on top of Roland's for-2.6.25 branch.
Please review them and apply for 2.6.25 if you think they're okay.

Thanks and regards,
  Joachim

-- 
Joachim Fenkes -- eHCA Linux Driver Developer and Hardware Tamer
IBM Deutschland Entwicklung GmbH -- Dept. 3627 (I/O Firmware Dev. 2)
Schoenaicher Strasse 220 -- 71032 Boeblingen -- Germany
eMail: [EMAIL PROTECTED]
[PATCH] IB/ehca: Fix lock flag location, bump version number
Signed-off-by: Joachim Fenkes [EMAIL PROTECTED]
---
This addresses a comment of Roland and bumps the version number. If it's not
too late, please apply for 2.6.24. Thanks!

 drivers/infiniband/hw/ehca/ehca_classes.h |    1 +
 drivers/infiniband/hw/ehca/ehca_main.c    |    2 +-
 drivers/infiniband/hw/ehca/hcp_if.c       |    1 -
 3 files changed, 2 insertions(+), 2 deletions(-)

diff --git a/drivers/infiniband/hw/ehca/ehca_classes.h b/drivers/infiniband/hw/ehca/ehca_classes.h
index 87f12d4..74d2b72 100644
--- a/drivers/infiniband/hw/ehca/ehca_classes.h
+++ b/drivers/infiniband/hw/ehca/ehca_classes.h
@@ -322,6 +322,7 @@ extern int ehca_static_rate;
 extern int ehca_port_act_time;
 extern int ehca_use_hp_mr;
 extern int ehca_scaling_code;
+extern int ehca_lock_hcalls;
 
 struct ipzu_queue_resp {
 	u32 qe_size; /* queue entry size */
diff --git a/drivers/infiniband/hw/ehca/ehca_main.c b/drivers/infiniband/hw/ehca/ehca_main.c
index c7bff3e..6a56d86 100644
--- a/drivers/infiniband/hw/ehca/ehca_main.c
+++ b/drivers/infiniband/hw/ehca/ehca_main.c
@@ -50,7 +50,7 @@
 #include "ehca_tools.h"
 #include "hcp_if.h"
 
-#define HCAD_VERSION "0024"
+#define HCAD_VERSION "0025"
 
 MODULE_LICENSE("Dual BSD/GPL");
 MODULE_AUTHOR("Christoph Raisch [EMAIL PROTECTED]");
diff --git a/drivers/infiniband/hw/ehca/hcp_if.c b/drivers/infiniband/hw/ehca/hcp_if.c
index 331b5e8..7029aa6 100644
--- a/drivers/infiniband/hw/ehca/hcp_if.c
+++ b/drivers/infiniband/hw/ehca/hcp_if.c
@@ -89,7 +89,6 @@
 #define HCALL9_REGS_FORMAT HCALL7_REGS_FORMAT " r11=%lx r12=%lx"
 
 static DEFINE_SPINLOCK(hcall_lock);
-extern int ehca_lock_hcalls;
 
 static u32 get_longbusy_msecs(int longbusy_rc)
 {
-- 
1.5.2
Re: [ofa-general] Re: [ewg] Re: [PATCH] IB/ehca: Serialize HCA-related hCalls on POWER5
[EMAIL PROTECTED] wrote on 13.12.2007 20:22:49:

> On Dec 13, 2007 12:30 AM, Or Gerlitz [EMAIL PROTECTED] wrote:
> > The current implementation of the open iscsi initiator makes sure to
> > issue commands in thread (sleepable) context, see iscsi_xmitworker and
> > references to it in drivers/scsi/libiscsi.c, so this keeps ehca users
> > safe for the time being.
>
> I agree, *some* form of FMR support is important for iSER (and probably
> for NFS over RDMA as well). Rather than adding a crippled NO FMR mode
> it would make more sense to add support for FMR Work Requests. I'm not
> certain what, if any, impact that would have on the Power5 problem, but
> that's certainly a cleaner path for iWARP.

Well, FMR WRs wouldn't change the eHCA issue -- the driver would have to make
an hCall in any case, and the architecture says that the hCalls used in this
scenario might return H_LONG_BUSY, causing the driver to sleep. No way around
that. Because of this, eHCA's FMRs are actually standard MRs with a different
API.

If, as Or said, the iSCSI initiator issues commands in sleepable context
anyway, nothing would be lost by using standard MRs as a fallback solution if
FMRs aren't available, would it?

J.
RE: [ofa-general] Re: [ewg] Re: [PATCH] IB/ehca: Serialize HCA-related hCalls on POWER5
Caitlin Bestler [EMAIL PROTECTED] wrote on 13.12.2007 22:08:34:

> To clarify, an FMR Work Request is simply posted to the SendQ like any
> other Work Request (of course the QP has to be privileged, or it will
> complete in error). An SQ Post should never block.

This would require hardware support, wouldn't it? eHCA2 doesn't have this
kind of support, so FMR WRs are not an option here.

J.
[PATCH] IB/ehca: Return correct #SGEs for SRQ
Firmware would round up the number of SGEs to four, because the WQE structure
holds four SGEs. For SRQ, only three are supported, so return a fixed value
instead.

Signed-off-by: Joachim Fenkes [EMAIL PROTECTED]
---
The patch will apply cleanly on top of Roland's git. Please review and apply
for 2.6.24 -- Thanks!

 drivers/infiniband/hw/ehca/ehca_qp.c |    4 ++--
 1 files changed, 2 insertions(+), 2 deletions(-)

diff --git a/drivers/infiniband/hw/ehca/ehca_qp.c b/drivers/infiniband/hw/ehca/ehca_qp.c
index dd12668..eff5fb5 100644
--- a/drivers/infiniband/hw/ehca/ehca_qp.c
+++ b/drivers/infiniband/hw/ehca/ehca_qp.c
@@ -838,7 +838,7 @@ struct ib_srq *ehca_create_srq(struct ib_pd *pd,
 
 	/* copy back return values */
 	srq_init_attr->attr.max_wr = qp_init_attr.cap.max_recv_wr;
-	srq_init_attr->attr.max_sge = qp_init_attr.cap.max_recv_sge;
+	srq_init_attr->attr.max_sge = 3;
 
 	/* drive SRQ into RTR state */
 	mqpcb = ehca_alloc_fw_ctrlblock(GFP_KERNEL);
@@ -1750,7 +1750,7 @@ int ehca_query_srq(struct ib_srq *srq, struct ib_srq_attr *srq_attr)
 	}
 
 	srq_attr->max_wr = qpcb->max_nr_outst_recv_wr - 1;
-	srq_attr->max_sge = qpcb->actual_nr_sges_in_rq_wqe;
+	srq_attr->max_sge = 3;
 	srq_attr->srq_limit = EHCA_BMASK_GET(
 		MQPCB_CURR_SRQ_LIMIT, qpcb->curr_srq_limit);
 
-- 
1.5.2
Re: [PATCH] IB/ehca: Serialize HCA-related hCalls on POWER5
On Monday 10 December 2007 00:22, Roland Dreier wrote:

> Fair enough... according to Documentation/infiniband/core_locking.txt,
> the only driver methods that cannot sleep are:
> [...]
>     map_phys_fmr

In fact, we do use hCalls there. Our hardware doesn't actually support
FMRs, so we translate a map FMR into a reallocate PMR, which doesn't work
without hCalls. What's more, the hCalls involved (e.g. H_FREE_RESOURCE)
might well return H_LONG_BUSY, so the whole operation might sleep; no way
around it.

How should we deal with this?

Thanks,
  Joachim
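To picture why a "long busy" reply forces the caller to sleep, here is a minimal userspace sketch of a retry loop of the kind described above. It is not the driver's code: the status codes, the `sk_` names and the fake firmware are all made up for illustration, and the sleep is only recorded rather than performed.

```c
/* Illustrative status codes for this sketch only, not the platform's values. */
enum {
	SK_SUCCESS        = 0,
	SK_BUSY           = 1,	/* returned when all retries are used up */
	SK_LONG_BUSY_1MS  = 100,
	SK_LONG_BUSY_10MS = 101,
};

/* Test double standing in for firmware: reports "long busy" a few times. */
struct sk_fake_fw {
	int busy_replies_left;
	unsigned int slept_msecs;
};

static long sk_fake_hcall(struct sk_fake_fw *fw)
{
	if (fw->busy_replies_left > 0) {
		fw->busy_replies_left--;
		return SK_LONG_BUSY_10MS;
	}
	return SK_SUCCESS;
}

/* Stand-in for a real sleep: only records how long the caller would sleep. */
static void sk_fake_sleep(struct sk_fake_fw *fw, unsigned int msecs)
{
	fw->slept_msecs += msecs;
}

static int sk_is_long_busy(long rc)
{
	return rc == SK_LONG_BUSY_1MS || rc == SK_LONG_BUSY_10MS;
}

static unsigned int sk_longbusy_msecs(long rc)
{
	return rc == SK_LONG_BUSY_10MS ? 10 : 1;
}

/*
 * Issue the call up to max_tries times. Each "long busy" reply makes the
 * caller sleep before retrying, so this can only run in sleepable context.
 */
static long sk_hcall_retry(struct sk_fake_fw *fw, int max_tries)
{
	int i;

	for (i = 0; i < max_tries; i++) {
		long rc = sk_fake_hcall(fw);

		if (!sk_is_long_busy(rc))
			return rc;
		sk_fake_sleep(fw, sk_longbusy_msecs(rc));
	}
	return SK_BUSY;
}
```

With a fake firmware that answers "long busy" twice, `sk_hcall_retry()` succeeds on the third attempt after asking for two sleeps. A method that may not sleep has no equivalent of that wait, which is the conflict raised in the mail above.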
Re: [PATCH] IB/ehca: Serialize HCA-related hCalls on POWER5
Roland Dreier [EMAIL PROTECTED] wrote on 06.12.2007 19:27:09:

> > > +	ehca_lock_hcalls = !(cur_cpu_spec->cpu_user_features
> > > +			     & PPC_FEATURE_ARCH_2_05);
> >
> > We already talked about this yesterday, but I still feel that checking
> > the instruction set of the CPU should not be used to determine whether
> > a specific device driver implementation is used in the hypervisor.
>
> I had the same reaction... is testing cpu_user_features really the best
> way to detect this issue?

I concur it's not nice, but it was the only feasible method we could find
without adding a "bug fixed" feature flag to the partition-firmware
interface. The firmware version reported in the OFDT is not a reliable
enough source, and even if it were, it would require a lot of string
parsing and matching against tables.

We're taking this to the firmware architects at the moment, but they're
not very fond of the idea of reporting the absence of bugs through
capability flags, as this could quickly lead to the exhaustion of flag
bits. We'll let the discussion stew for a bit, but if we don't get this
flag, we'll have to resort to the CPU features.

> I'll hold off applying this for a few days so you guys can decide the
> best thing to do. We'll definitely get some fix into 2.6.24 but we have
> time to make a good decision.

Right.

> > Regarding the performance problem, have you checked whether converting
> > all your spin_lock_irqsave to spin_lock/spin_lock_irq improves your
> > performance on the older machines? Maybe it's already fast enough that
> > way.
>
> It does seem that the only places that the hcall_lock is taken also use
> msleep, so they must always be in process context. So you can safely
> just use spin_lock(), right?

As Arnd said, there are hCalls that will never return H_LONG_BUSY_*, such
as H_QUERY_PORT and chums, so they will never sleep. The surrounding
functions, though, are not prepared to be called from interrupt context
(GFP_KERNEL comes to mind), so I agree that a simple spin_lock() will
suffice. Thanks, Arnd, for pointing this out.

We'll keep you guys posted on the feature flag discussion. Until then,
have a nice weekend!

  Joachim
[PATCH] IB/ehca: Serialize HCA-related hCalls on POWER5
All firmware versions on POWER5 systems have a locking issue in the HCA-related hCalls that can cause loss of Infiniband connectivity if allocate and free calls happen in parallel. This may for example be caused if two processes are using OpenMPI in parallel. Circumvent this by serializing all HCA-related hCalls on POWER5. Signed-off-by: Joachim Fenkes [EMAIL PROTECTED] --- We tested this patch, especially the autodetection, and it works okay. Please review and apply for 2.6.24-rc5 - thanks! drivers/infiniband/hw/ehca/ehca_main.c | 16 drivers/infiniband/hw/ehca/hcp_if.c| 28 +++- 2 files changed, 27 insertions(+), 17 deletions(-) diff --git a/drivers/infiniband/hw/ehca/ehca_main.c b/drivers/infiniband/hw/ehca/ehca_main.c index 90d4334..8f33d06 100644 --- a/drivers/infiniband/hw/ehca/ehca_main.c +++ b/drivers/infiniband/hw/ehca/ehca_main.c @@ -43,6 +43,9 @@ #ifdef CONFIG_PPC_64K_PAGES #include linux/slab.h #endif + +#include asm/cputable.h + #include ehca_classes.h #include ehca_iverbs.h #include ehca_mrmw.h @@ -66,6 +69,7 @@ int ehca_poll_all_eqs = 1; int ehca_static_rate = -1; int ehca_scaling_code = 0; int ehca_mr_largepage = 1; +int ehca_lock_hcalls = -1; module_param_named(open_aqp1, ehca_open_aqp1, int, S_IRUGO); module_param_named(debug_level, ehca_debug_level, int, S_IRUGO); @@ -77,6 +81,7 @@ module_param_named(poll_all_eqs, ehca_poll_all_eqs, int, S_IRUGO); module_param_named(static_rate, ehca_static_rate, int, S_IRUGO); module_param_named(scaling_code, ehca_scaling_code, int, S_IRUGO); module_param_named(mr_largepage, ehca_mr_largepage, int, S_IRUGO); +module_param_named(lock_hcalls, ehca_lock_hcalls, bool, S_IRUGO); MODULE_PARM_DESC(open_aqp1, AQP1 on startup (0: no (default), 1: yes)); @@ -102,6 +107,9 @@ MODULE_PARM_DESC(scaling_code, MODULE_PARM_DESC(mr_largepage, use large page for MR (0: use PAGE_SIZE (default), 1: use large page depending on MR size); +MODULE_PARM_DESC(lock_hcalls, +serialize all hCalls made by the driver +(default: autodetect)); 
DEFINE_RWLOCK(ehca_qp_idr_lock); DEFINE_RWLOCK(ehca_cq_idr_lock); @@ -924,6 +932,14 @@ int __init ehca_module_init(void) printk(KERN_INFO eHCA Infiniband Device Driver (Version HCAD_VERSION )\n); + /* Autodetect hCall locking -- we can't read the firmware version +* directly, but we know that starting with POWER6, all firmware +* versions are good. +*/ + if (ehca_lock_hcalls == -1) + ehca_lock_hcalls = !(cur_cpu_spec-cpu_user_features + PPC_FEATURE_ARCH_2_05); + ret = ehca_create_comp_pool(); if (ret) { ehca_gen_err(Cannot create comp pool.); diff --git a/drivers/infiniband/hw/ehca/hcp_if.c b/drivers/infiniband/hw/ehca/hcp_if.c index c16a213..331b5e8 100644 --- a/drivers/infiniband/hw/ehca/hcp_if.c +++ b/drivers/infiniband/hw/ehca/hcp_if.c @@ -89,6 +89,7 @@ #define HCALL9_REGS_FORMAT HCALL7_REGS_FORMAT r11=%lx r12=%lx static DEFINE_SPINLOCK(hcall_lock); +extern int ehca_lock_hcalls; static u32 get_longbusy_msecs(int longbusy_rc) { @@ -120,26 +121,21 @@ static long ehca_plpar_hcall_norets(unsigned long opcode, unsigned long arg7) { long ret; - int i, sleep_msecs, do_lock; - unsigned long flags; + int i, sleep_msecs; + unsigned long flags = 0; ehca_gen_dbg(opcode=%lx HCALL7_REGS_FORMAT, opcode, arg1, arg2, arg3, arg4, arg5, arg6, arg7); - /* lock H_FREE_RESOURCE(MR) against itself and H_ALLOC_RESOURCE(MR) */ - if ((opcode == H_FREE_RESOURCE) (arg7 == 5)) { - arg7 = 0; /* better not upset firmware */ - do_lock = 1; - } - for (i = 0; i 5; i++) { - if (do_lock) + /* serialize hCalls to work around firmware issue */ + if (ehca_lock_hcalls) spin_lock_irqsave(hcall_lock, flags); ret = plpar_hcall_norets(opcode, arg1, arg2, arg3, arg4, arg5, arg6, arg7); - if (do_lock) + if (ehca_lock_hcalls) spin_unlock_irqrestore(hcall_lock, flags); if (H_IS_LONG_BUSY(ret)) { @@ -174,24 +170,22 @@ static long ehca_plpar_hcall9(unsigned long opcode, unsigned long arg9) { long ret; - int i, sleep_msecs, do_lock; + int i, sleep_msecs; unsigned long flags = 0; ehca_gen_dbg(INPUT -- opcode=%lx 
HCALL9_REGS_FORMAT, opcode, arg1, arg2, arg3, arg4, arg5, arg6, arg7, arg8, arg9); - /* lock H_ALLOC_RESOURCE(MR) against itself and H_FREE_RESOURCE(MR) */ - do_lock = ((opcode
[PATCH 2/2] IB/ehca: Fix static rate calculation
The IPD formula was a little off and assumed a fixed physical link rate; fix the formula and query the actual physical link rate, now that we can get it. Also, refactor the calculation into a common function ehca_calc_ipd() and use that instead of duplicating code. Signed-off-by: Joachim Fenkes [EMAIL PROTECTED] --- drivers/infiniband/hw/ehca/ehca_av.c | 48 +++- drivers/infiniband/hw/ehca/ehca_classes.h |1 - drivers/infiniband/hw/ehca/ehca_iverbs.h |3 ++ drivers/infiniband/hw/ehca/ehca_main.c|3 -- drivers/infiniband/hw/ehca/ehca_qp.c | 29 +++-- 5 files changed, 54 insertions(+), 30 deletions(-) diff --git a/drivers/infiniband/hw/ehca/ehca_av.c b/drivers/infiniband/hw/ehca/ehca_av.c index 97d1086..453eb99 100644 --- a/drivers/infiniband/hw/ehca/ehca_av.c +++ b/drivers/infiniband/hw/ehca/ehca_av.c @@ -50,6 +50,38 @@ static struct kmem_cache *av_cache; +int ehca_calc_ipd(struct ehca_shca *shca, int port, + enum ib_rate path_rate, u32 *ipd) +{ + int path = ib_rate_to_mult(path_rate); + int link, ret; + struct ib_port_attr pa; + + if (path_rate == IB_RATE_PORT_CURRENT) { + *ipd = 0; + return 0; + } + + if (unlikely(path 0)) { + ehca_err(shca-ib_device, Invalid static rate! 
path_rate=%x, +path_rate); + return -EINVAL; + } + + ret = ehca_query_port(shca-ib_device, port, pa); + if (unlikely(ret 0)) { + ehca_err(shca-ib_device, Failed to query port ret=%i, ret); + return ret; + } + + link = ib_width_enum_to_int(pa.active_width) * pa.active_speed; + + /* IPD = round((link / path) - 1) */ + *ipd = ((link + (path 1)) / path) - 1; + + return 0; +} + struct ib_ah *ehca_create_ah(struct ib_pd *pd, struct ib_ah_attr *ah_attr) { int ret; @@ -69,15 +101,13 @@ struct ib_ah *ehca_create_ah(struct ib_pd *pd, struct ib_ah_attr *ah_attr) av-av.slid_path_bits = ah_attr-src_path_bits; if (ehca_static_rate 0) { - int ah_mult = ib_rate_to_mult(ah_attr-static_rate); - int ehca_mult = - ib_rate_to_mult(shca-sport[ah_attr-port_num].rate ); - - if (ah_mult = ehca_mult) - av-av.ipd = 0; - else - av-av.ipd = (ah_mult 0) ? - ((ehca_mult - 1) / ah_mult) : 0; + u32 ipd; + if (ehca_calc_ipd(shca, ah_attr-port_num, + ah_attr-static_rate, ipd)) { + ret = -EINVAL; + goto create_ah_exit1; + } + av-av.ipd = ipd; } else av-av.ipd = ehca_static_rate; diff --git a/drivers/infiniband/hw/ehca/ehca_classes.h b/drivers/infiniband/hw/ehca/ehca_classes.h index 2d660ae..87f12d4 100644 --- a/drivers/infiniband/hw/ehca/ehca_classes.h +++ b/drivers/infiniband/hw/ehca/ehca_classes.h @@ -95,7 +95,6 @@ struct ehca_sma_attr { struct ehca_sport { struct ib_cq *ibcq_aqp1; struct ib_qp *ibqp_aqp1; - enum ib_rate rate; enum ib_port_state port_state; struct ehca_sma_attr saved_attr; }; diff --git a/drivers/infiniband/hw/ehca/ehca_iverbs.h b/drivers/infiniband/hw/ehca/ehca_iverbs.h index dce503b..5485799 100644 --- a/drivers/infiniband/hw/ehca/ehca_iverbs.h +++ b/drivers/infiniband/hw/ehca/ehca_iverbs.h @@ -189,6 +189,9 @@ int ehca_mmap(struct ib_ucontext *context, struct vm_area_struct *vma); void ehca_poll_eqs(unsigned long data); +int ehca_calc_ipd(struct ehca_shca *shca, int port, + enum ib_rate path_rate, u32 *ipd); + #ifdef CONFIG_PPC_64K_PAGES void *ehca_alloc_fw_ctrlblock(gfp_t 
flags); void ehca_free_fw_ctrlblock(void *ptr); diff --git a/drivers/infiniband/hw/ehca/ehca_main.c b/drivers/infiniband/hw/ehca/ehca_main.c index c6cd38c..90d4334 100644 --- a/drivers/infiniband/hw/ehca/ehca_main.c +++ b/drivers/infiniband/hw/ehca/ehca_main.c @@ -327,9 +327,6 @@ static int ehca_sense_attributes(struct ehca_shca *shca) shca-hw_level = ehca_hw_level; ehca_gen_dbg( ... hardware level=%x, shca-hw_level); - shca-sport[0].rate = IB_RATE_30_GBPS; - shca-sport[1].rate = IB_RATE_30_GBPS; - shca-hca_cap = rblock-hca_cap_indicators; ehca_gen_dbg( ... HCA capabilities:); for (i = 0; i ARRAY_SIZE(hca_cap_descr); i++) diff --git a/drivers/infiniband/hw/ehca/ehca_qp.c b/drivers/infiniband/hw/ehca/ehca_qp.c index de18264..2e3e654 100644 --- a/drivers/infiniband/hw/ehca/ehca_qp.c +++ b/drivers/infiniband/hw/ehca/ehca_qp.c @@ -1196,10 +1196,6 @@ static int internal_modify_qp(struct ib_qp *ibqp, update_mask |= EHCA_BMASK_SET(MQPCB_MASK_QKEY, 1); } if (attr_mask IB_QP_AV) { - int ah_mult
[PATCH 1/2] IB/ehca: Return physical link information in query_port()
Newer firmware versions return physical port information to the partition,
so hand that information to the consumer if it's present.

Signed-off-by: Joachim Fenkes [EMAIL PROTECTED]
---
 drivers/infiniband/hw/ehca/ehca_hca.c |   20 ++++++++++++++------
 drivers/infiniband/hw/ehca/hipz_hw.h  |    6 +++++-
 2 files changed, 19 insertions(+), 7 deletions(-)

diff --git a/drivers/infiniband/hw/ehca/ehca_hca.c b/drivers/infiniband/hw/ehca/ehca_hca.c
index 15806d1..5bd7b59 100644
--- a/drivers/infiniband/hw/ehca/ehca_hca.c
+++ b/drivers/infiniband/hw/ehca/ehca_hca.c
@@ -151,7 +151,6 @@ int ehca_query_port(struct ib_device *ibdev,
 	}
 
 	memset(props, 0, sizeof(struct ib_port_attr));
-	props->state = rblock->state;
 
 	switch (rblock->max_mtu) {
 	case 0x1:
@@ -188,11 +187,20 @@ int ehca_query_port(struct ib_device *ibdev,
 	props->subnet_timeout  = rblock->subnet_timeout;
 	props->init_type_reply = rblock->init_type_reply;
 
-	props->active_width    = IB_WIDTH_12X;
-	props->active_speed    = 0x1;
-
-	/* at the moment (logical) link state is always LINK_UP */
-	props->phys_state      = 0x5;
+	if (rblock->state && rblock->phys_width) {
+		props->phys_state      = rblock->phys_pstate;
+		props->state           = rblock->phys_state;
+		props->active_width    = rblock->phys_width;
+		props->active_speed    = rblock->phys_speed;
+	} else {
+		/* old firmware releases don't report physical
+		 * port info, so use default values
+		 */
+		props->phys_state      = 5;
+		props->state           = rblock->state;
+		props->active_width    = IB_WIDTH_12X;
+		props->active_speed    = 0x1;
+	}
 
 query_port1:
 	ehca_free_fw_ctrlblock(rblock);
diff --git a/drivers/infiniband/hw/ehca/hipz_hw.h b/drivers/infiniband/hw/ehca/hipz_hw.h
index d9739e5..485b840 100644
--- a/drivers/infiniband/hw/ehca/hipz_hw.h
+++ b/drivers/infiniband/hw/ehca/hipz_hw.h
@@ -402,7 +402,11 @@ struct hipz_query_port {
 	u64 max_msg_sz;
 	u32 max_mtu;
 	u32 vl_cap;
-	u8 reserved2[1900];
+	u32 phys_pstate;
+	u32 phys_state;
+	u32 phys_speed;
+	u32 phys_width;
+	u8 reserved2[1884];
 	u64 guid_entries[255];
 } __attribute__ ((packed));
 
-- 
1.5.2
[PATCH 0/2] IB/ehca: Return physical link information, fix static rate calculation
This patchset will fix static rate calculation for the new link speeds
supported by eHCA2. Also, it enables query_port() to return physical link
information instead of constant values, which is needed for the static
rate fix.

[1/2] makes query_port() return actual physical link info where supported
[2/2] fixes static rate calculation based on that info

The patches will apply, in order, on top of Roland's for-2.6.24 branch.
Please review them and apply for 2.6.24-rc2 if you think they're okay.

Thanks and regards,
  Joachim

-- 
Joachim Fenkes -- eHCA Linux Driver Developer and Hardware Tamer
IBM Deutschland Entwicklung GmbH -- Dept. 3627 (I/O Firmware Dev. 2)
Schoenaicher Strasse 220 -- 71032 Boeblingen -- Germany
eMail: [EMAIL PROTECTED]
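The rounding used by the static rate fix in [2/2] can be tried out in isolation. The sketch below is a standalone restatement of the comment "IPD = round((link / path) - 1)" from that patch, where link is the port's width multiplied by its speed and path is the multiplier of the requested rate. The name `calc_ipd()`, the -1 error return and the clamp for a path at least as fast as the link are assumptions of this sketch, not part of the posted hunk.

```c
/*
 * Inter-packet delay for a path slower than the physical link.
 * link: physical link rate as a multiple of the base rate (width * speed)
 * path: requested path rate as a multiple of the base rate
 * Returns -1 if path is not positive (sketch-only error convention).
 */
static int calc_ipd(int link, int path)
{
	if (path <= 0)
		return -1;

	/* Sketch-only clamp: no throttling if the path is not slower. */
	if (path >= link)
		return 0;

	/* Adding path/2 before dividing rounds to nearest instead of down. */
	return ((link + (path >> 1)) / path) - 1;
}
```

For a 12X link at base speed (link = 12) and a 4X path (path = 4), this gives an IPD of 2, i.e. two idle slots for every packet sent.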
Re: [PATCH 4/5] ibmebus: Move to of_device and of_platform_driver, match eHCA and eHEA drivers
On Tuesday 09 October 2007 10:21, Jan-Bernd Themann wrote:
> Roland Dreier [EMAIL PROTECTED] wrote on 03.10.2007 20:05:44:
> > > > Replace struct ibmebus_dev and struct ibmebus_driver with struct
> > > > of_device and struct of_platform_driver, respectively. Match the
> > > > external ibmebus interface and drivers using it.
> > > >
> > > > Signed-off-by: Joachim Fenkes [EMAIL PROTECTED]
> > >
> > > This is somewhat difficult as this patch touches files that are the
> > > responsibility of three different maintainers. Is it possible to
> > > split the patch into three, one for each maintainer (possibly by
> > > keeping both old and new interfaces around for a little while)?
> > >
> > > If not, then you need to get an Acked-by and an agreement that this
> > > change can go via the powerpc.git tree from Roland Dreier and Jeff
> > > Garzik.
> >
> > I don't see anything objectionable in the infiniband parts of the
> > patch -- I don't have any way to test the changes but it all looks
> > like a straightforward conversion to a new platform API. So:
> >
> > Acked-by: Roland Dreier [EMAIL PROTECTED]
> >
> >  - R.
>
> Looks good from eHEA driver perspective.
>
> Acked-by: Jan-Bernd Themann [EMAIL PROTECTED]

Jeff, do you have any objections against this patch going into the kernel
via Paul's powerpc.git tree? It touches only a few lines of ehea which are
specific to the bus interface changes.

You can see the full patch here:
http://patchwork.ozlabs.org/linuxppc/patch?id=13750

If you have no objections, please ack the patch so Paul can include it.

Thanks and regards,
  Joachim
[PATCH 0/5] IB/ehca: SRQ and MR/MW fixes
Here are some more fixes for the eHCA driver, fixing some problems we
found during internal system test.

[1/5] fixes the QP pointer determination for SRQ base QPs
[2/5] fixes a masking error in {,re}reg_phys_mr()
[3/5] fixes a bug in alloc_fmr() and simplifies some code
[4/5] refactors hca_cap_mr_pgsize and fixes a problem with ib_srp
[5/5] enables large page MRs by default

I built the patches on top of Roland's for-2.6.24 git branch. Please
review and queue them for 2.6.24-rc1 if you're okay with them. Thanks!

Cheers,
  Joachim

-- 
Joachim Fenkes -- eHCA Linux Driver Developer and Hardware Tamer
IBM Deutschland Entwicklung GmbH -- Dept. 3627 (I/O Firmware Dev. 2)
Schoenaicher Strasse 220 -- 71032 Boeblingen -- Germany
eMail: [EMAIL PROTECTED]
[PATCH 1/5] IB/ehca: Supply QP token for SRQ base QPs
Because hardware reports the SRQ token in RWQEs of SRQ base QPs, supply
the base QP token as SRQ token, so we can properly find the SRQ base QP.

Signed-off-by: Joachim Fenkes [EMAIL PROTECTED]
---
 drivers/infiniband/hw/ehca/ehca_qp.c |    4 +++-
 1 files changed, 3 insertions(+), 1 deletions(-)

diff --git a/drivers/infiniband/hw/ehca/ehca_qp.c b/drivers/infiniband/hw/ehca/ehca_qp.c
index e2bd62b..de18264 100644
--- a/drivers/infiniband/hw/ehca/ehca_qp.c
+++ b/drivers/infiniband/hw/ehca/ehca_qp.c
@@ -451,7 +451,6 @@ static struct ehca_qp *internal_create_qp(
 		has_srq = 1;
 		parms.ext_type = EQPT_SRQBASE;
 		parms.srq_qpn = my_srq->real_qp_num;
-		parms.srq_token = my_srq->token;
 	}
 
 	if (is_llqp && has_srq) {
@@ -583,6 +582,9 @@ static struct ehca_qp *internal_create_qp(
 		goto create_qp_exit1;
 	}
 
+	if (has_srq)
+		parms.srq_token = my_qp->token;
+
 	parms.servicetype = ibqptype2servicetype(qp_type);
 	if (parms.servicetype < 0) {
 		ret = -EINVAL;
-- 
1.5.2
[PATCH 2/5] IB/ehca: Fix masking error in {,re}reg_phys_mr()
Signed-off-by: Joachim Fenkes [EMAIL PROTECTED]
---
 drivers/infiniband/hw/ehca/ehca_mrmw.c |    4 ++--
 1 files changed, 2 insertions(+), 2 deletions(-)

diff --git a/drivers/infiniband/hw/ehca/ehca_mrmw.c b/drivers/infiniband/hw/ehca/ehca_mrmw.c
index da88738..16c9efd 100644
--- a/drivers/infiniband/hw/ehca/ehca_mrmw.c
+++ b/drivers/infiniband/hw/ehca/ehca_mrmw.c
@@ -259,7 +259,7 @@ struct ib_mr *ehca_reg_phys_mr(struct ib_pd *pd,
 	pginfo.u.phy.num_phys_buf = num_phys_buf;
 	pginfo.u.phy.phys_buf_array = phys_buf_array;
 	pginfo.next_hwpage =
-		((u64)iova_start & ~(hw_pgsize - 1)) / hw_pgsize;
+		((u64)iova_start & ~PAGE_MASK) / hw_pgsize;
 
 	ret = ehca_reg_mr(shca, e_mr, iova_start, size, mr_access_flags,
 			  e_pd, &pginfo, &e_mr->ib.ib_mr.lkey,
@@ -547,7 +547,7 @@ int ehca_rereg_phys_mr(struct ib_mr *mr,
 		pginfo.u.phy.num_phys_buf = num_phys_buf;
 		pginfo.u.phy.phys_buf_array = phys_buf_array;
 		pginfo.next_hwpage =
-			((u64)iova_start & ~(hw_pgsize - 1)) / hw_pgsize;
+			((u64)iova_start & ~PAGE_MASK) / hw_pgsize;
 	}
 	if (mr_rereg_mask & IB_MR_REREG_ACCESS)
 		new_acl = mr_access_flags;
-- 
1.5.2
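The patch carries no description, so the following is only one reading of the masking change, written as a standalone sketch. It assumes a 64K kernel page (`SKETCH_PAGE_SIZE` is a made-up constant) and a mask of the same shape as the kernel's PAGE_MASK. Under those assumptions the old expression yields a hardware page number over the whole address, while the new one yields the index of the hardware page inside the kernel page where the region starts.

```c
#include <stdint.h>

#define SKETCH_PAGE_SIZE 0x10000ULL                 /* assumed 64K kernel page */
#define SKETCH_PAGE_MASK (~(SKETCH_PAGE_SIZE - 1))  /* clears the in-page offset bits */

/* Expression before the patch: keeps the high bits of the address. */
static uint64_t next_hwpage_old(uint64_t iova_start, uint64_t hw_pgsize)
{
	return (iova_start & ~(hw_pgsize - 1)) / hw_pgsize;
}

/* Expression after the patch: offset within the kernel page, in hw pages. */
static uint64_t next_hwpage_new(uint64_t iova_start, uint64_t hw_pgsize)
{
	return (iova_start & ~SKETCH_PAGE_MASK) / hw_pgsize;
}
```

For a start address of 0x12343000 and 4K hardware pages, the old form returns 0x12343 and the new form returns 3, the fourth hardware page of the first kernel page.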
[PATCH 3/5] IB/ehca: Fix ehca_encode_hwpage_size() and alloc_fmr()
Simplify ehca_encode_hwpage_size(), fixing an infinite loop for
pgsize == 0 in the process. Fix the bug in alloc_fmr() that triggered the
loop.

Signed-off-by: Joachim Fenkes [EMAIL PROTECTED]
---
 drivers/infiniband/hw/ehca/ehca_mrmw.c |   15 ++++-----------
 1 files changed, 4 insertions(+), 11 deletions(-)

diff --git a/drivers/infiniband/hw/ehca/ehca_mrmw.c b/drivers/infiniband/hw/ehca/ehca_mrmw.c
index 16c9efd..b9a788c 100644
--- a/drivers/infiniband/hw/ehca/ehca_mrmw.c
+++ b/drivers/infiniband/hw/ehca/ehca_mrmw.c
@@ -72,17 +72,9 @@ enum ehca_mr_pgsize {
 
 static u32 ehca_encode_hwpage_size(u32 pgsize)
 {
-	u32 idx = 0;
-	pgsize >>= 12;
-	/*
-	 * map mr page size into hw code:
-	 * 0, 1, 2, 3 for 4K, 64K, 1M, 64M
-	 */
-	while (!(pgsize & 1)) {
-		idx++;
-		pgsize >>= 4;
-	}
-	return idx;
+	int log = ilog2(pgsize);
+	WARN_ON(log < 12 || log > 24 || log & 3);
+	return (log - 12) / 4;
 }
 
 static u64 ehca_get_max_hwpage_size(struct ehca_shca *shca)
@@ -826,6 +818,7 @@ struct ib_fmr *ehca_alloc_fmr(struct ib_pd *pd,
 
 	/* register MR on HCA */
 	memset(&pginfo, 0, sizeof(pginfo));
+	pginfo.hwpage_size = hw_pgsize;
 	/*
 	 * pginfo.num_hwpages==0, ie register_rpages() will not be called
 	 * but deferred to map_phys_fmr()
-- 
1.5.2
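The new encoding can be checked outside the kernel with a small sketch. `ilog2_u32()` below is a userspace stand-in for the kernel's ilog2(), and `encode_hwpage_size()` is a hypothetical variant that returns -1 where the patched function would only warn. It also rejects sizes that are not a power of two, which is stricter than the patch.

```c
#include <stdint.h>

/* Userspace stand-in for ilog2(): floor(log2(v)); v must be non-zero. */
static int ilog2_u32(uint32_t v)
{
	int log = 0;

	while (v >>= 1)
		log++;
	return log;
}

/*
 * Map an MR page size to a hardware code: 0, 1, 2, 3 for 4K, 64K, 1M, 16M.
 * Returns -1 for zero, for sizes that are not a power of two, and for
 * powers of two other than those four.
 */
static int encode_hwpage_size(uint32_t pgsize)
{
	int log;

	if (pgsize == 0 || (pgsize & (pgsize - 1)))
		return -1;

	log = ilog2_u32(pgsize);
	if (log < 12 || log > 24 || (log & 3))
		return -1;

	return (log - 12) / 4;
}
```

Unlike the removed shift-and-test loop, this version has no loop that depends on finding a set bit, so a page size of zero cannot make it spin.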
[PATCH 4/5] IB/ehca: Change meaning of hca_cap_mr_pgsize
ehca_shca.hca_cap_mr_pgsize now contains all supported page sizes ORed together. This makes some checks easier to code and understand, plus we can return this value verbatim in query_hca(), fixing a problem with SRP (reported by Anton Blanchard -- thanks!). Signed-off-by: Joachim Fenkes [EMAIL PROTECTED] --- drivers/infiniband/hw/ehca/ehca_classes.h |1 - drivers/infiniband/hw/ehca/ehca_hca.c |1 + drivers/infiniband/hw/ehca/ehca_main.c| 18 - drivers/infiniband/hw/ehca/ehca_mrmw.c| 38 ++-- 4 files changed, 36 insertions(+), 22 deletions(-) diff --git a/drivers/infiniband/hw/ehca/ehca_classes.h b/drivers/infiniband/hw/ehca/ehca_classes.h index 0f7a55d..365bc5d 100644 --- a/drivers/infiniband/hw/ehca/ehca_classes.h +++ b/drivers/infiniband/hw/ehca/ehca_classes.h @@ -323,7 +323,6 @@ extern int ehca_static_rate; extern int ehca_port_act_time; extern int ehca_use_hp_mr; extern int ehca_scaling_code; -extern int ehca_mr_largepage; struct ipzu_queue_resp { u32 qe_size; /* queue entry size */ diff --git a/drivers/infiniband/hw/ehca/ehca_hca.c b/drivers/infiniband/hw/ehca/ehca_hca.c index 4aa3ffa..15806d1 100644 --- a/drivers/infiniband/hw/ehca/ehca_hca.c +++ b/drivers/infiniband/hw/ehca/ehca_hca.c @@ -77,6 +77,7 @@ int ehca_query_device(struct ib_device *ibdev, struct ib_device_attr *props) } memset(props, 0, sizeof(struct ib_device_attr)); + props-page_size_cap = shca-hca_cap_mr_pgsize; props-fw_ver = rblock-hw_ver; props-max_mr_size = rblock-max_mr_size; props-vendor_id = rblock-vendor_id 8; diff --git a/drivers/infiniband/hw/ehca/ehca_main.c b/drivers/infiniband/hw/ehca/ehca_main.c index 403467f..d477dc3 100644 --- a/drivers/infiniband/hw/ehca/ehca_main.c +++ b/drivers/infiniband/hw/ehca/ehca_main.c @@ -260,13 +260,20 @@ static struct cap_descr { { HCA_CAP_MINI_QP, HCA_CAP_MINI_QP }, }; -int ehca_sense_attributes(struct ehca_shca *shca) +static int ehca_sense_attributes(struct ehca_shca *shca) { int i, ret = 0; u64 h_ret; struct hipz_query_hca *rblock; struct 
hipz_query_port *port; + static const u32 pgsize_map[] = { + HCA_CAP_MR_PGSIZE_4K, 0x1000, + HCA_CAP_MR_PGSIZE_64K, 0x1, + HCA_CAP_MR_PGSIZE_1M, 0x10, + HCA_CAP_MR_PGSIZE_16M, 0x100, + }; + rblock = ehca_alloc_fw_ctrlblock(GFP_KERNEL); if (!rblock) { ehca_gen_err(Cannot allocate rblock memory.); @@ -329,8 +336,15 @@ int ehca_sense_attributes(struct ehca_shca *shca) if (EHCA_BMASK_GET(hca_cap_descr[i].mask, shca-hca_cap)) ehca_gen_dbg( %s, hca_cap_descr[i].descr); - shca-hca_cap_mr_pgsize = rblock-memory_page_size_supported; + /* translate supported MR page sizes; always support 4K */ + shca-hca_cap_mr_pgsize = EHCA_PAGESIZE; + if (ehca_mr_largepage) { /* support extra sizes only if enabled */ + for (i = 0; i ARRAY_SIZE(pgsize_map); i += 2) + if (rblock-memory_page_size_supported pgsize_map[i]) + shca-hca_cap_mr_pgsize |= pgsize_map[i + 1]; + } + /* query max MTU from first port -- it's the same for all ports */ port = (struct hipz_query_port *)rblock; h_ret = hipz_h_query_port(shca-ipz_hca_handle, 1, port); if (h_ret != H_SUCCESS) { diff --git a/drivers/infiniband/hw/ehca/ehca_mrmw.c b/drivers/infiniband/hw/ehca/ehca_mrmw.c index b9a788c..bb97915 100644 --- a/drivers/infiniband/hw/ehca/ehca_mrmw.c +++ b/drivers/infiniband/hw/ehca/ehca_mrmw.c @@ -79,9 +79,7 @@ static u32 ehca_encode_hwpage_size(u32 pgsize) static u64 ehca_get_max_hwpage_size(struct ehca_shca *shca) { - if (shca-hca_cap_mr_pgsize HCA_CAP_MR_PGSIZE_16M) - return EHCA_MR_PGSIZE16M; - return EHCA_MR_PGSIZE4K; + return 1UL ilog2(shca-hca_cap_mr_pgsize); } static struct ehca_mr *ehca_mr_new(void) @@ -288,7 +286,7 @@ struct ib_mr *ehca_reg_user_mr(struct ib_pd *pd, u64 start, u64 length, container_of(pd-device, struct ehca_shca, ib_device); struct ehca_pd *e_pd = container_of(pd, struct ehca_pd, ib_pd); struct ehca_mr_pginfo pginfo; - int ret; + int ret, page_shift; u32 num_kpages; u32 num_hwpages; u64 hwpage_size; @@ -343,19 +341,20 @@ struct ib_mr *ehca_reg_user_mr(struct ib_pd *pd, u64 start, u64 
length, /* determine number of MR pages */ num_kpages = NUM_CHUNKS((virt % PAGE_SIZE) + length, PAGE_SIZE); /* select proper hw_pgsize */ - if (ehca_mr_largepage - (shca-hca_cap_mr_pgsize HCA_CAP_MR_PGSIZE_16M)) { - int page_shift = PAGE_SHIFT; - if (e_mr-umem-hugetlb) { - /* determine page_shift, clamp between 4K
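The idea of keeping all supported page sizes ORed together can be sketched in standalone C. The capability bit values below (`SKETCH_CAP_*`) are hypothetical, since the real HCA_CAP_MR_PGSIZE_* definitions are not shown in this thread; the four sizes are written out in full as 4K, 64K, 1M and 16M. The helper names are also made up for this sketch.

```c
#include <stddef.h>
#include <stdint.h>

/* Hypothetical capability bits, for illustration only. */
#define SKETCH_CAP_4K   0x8u
#define SKETCH_CAP_64K  0x4u
#define SKETCH_CAP_1M   0x2u
#define SKETCH_CAP_16M  0x1u

/* Pairs of (capability bit, page size in bytes), like pgsize_map[] above. */
static const uint32_t sketch_pgsize_map[] = {
	SKETCH_CAP_4K,  0x1000,
	SKETCH_CAP_64K, 0x10000,
	SKETCH_CAP_1M,  0x100000,
	SKETCH_CAP_16M, 0x1000000,
};

/* OR together every supported page size; 4K is always included. */
static uint32_t translate_pgsizes(uint32_t fw_caps, int largepage)
{
	const size_t n = sizeof(sketch_pgsize_map) / sizeof(sketch_pgsize_map[0]);
	uint32_t sizes = 0x1000;
	size_t i;

	if (largepage)	/* extra sizes only if large pages are enabled */
		for (i = 0; i < n; i += 2)
			if (fw_caps & sketch_pgsize_map[i])
				sizes |= sketch_pgsize_map[i + 1];
	return sizes;
}

/* Largest supported page size: the highest set bit of the ORed mask. */
static uint32_t max_hwpage_size(uint32_t sizes)
{
	uint32_t max = 1;

	while (sizes >>= 1)
		max <<= 1;
	return max;
}
```

Because each page size is itself a single bit, a check such as "is 64K supported" becomes a plain AND against the mask, and the largest size falls out of the highest set bit.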
[PATCH 5/5] IB/ehca: Enable large page MRs by default
Signed-off-by: Joachim Fenkes [EMAIL PROTECTED]
---
 drivers/infiniband/hw/ehca/ehca_main.c |    2 +-
 1 files changed, 1 insertions(+), 1 deletions(-)

diff --git a/drivers/infiniband/hw/ehca/ehca_main.c b/drivers/infiniband/hw/ehca/ehca_main.c
index d477dc3..2f51c13 100644
--- a/drivers/infiniband/hw/ehca/ehca_main.c
+++ b/drivers/infiniband/hw/ehca/ehca_main.c
@@ -65,7 +65,7 @@ int ehca_port_act_time = 30;
 int ehca_poll_all_eqs = 1;
 int ehca_static_rate = -1;
 int ehca_scaling_code = 0;
-int ehca_mr_largepage = 0;
+int ehca_mr_largepage = 1;
 
 module_param_named(open_aqp1, ehca_open_aqp1, int, S_IRUGO);
 module_param_named(debug_level, ehca_debug_level, int, S_IRUGO);
-- 
1.5.2
Re: [PATCH 3/5] ibmebus: Add device creation and bus probing based on of_device
> +/* These devices will automatically be added to the bus during init */
> +static struct of_device_id builtin_matches[] = {
> +	{ .name = "lhca" },
> +	{ .compatible = "IBM,lhca" },
> +	{ .name = "lhea" },
> +	{ .compatible = "IBM,lhea" },
> +	{},
> +};
> +
>
> Hmm, do you have devices that only have the matching name property but
> not the compatible property? If not, I'd suggest only looking for
> compatible, so you have less chance of false positives.

If a device that's not an lhca is called lhca, that's its own fault, I
guess ;) But I concur that looking for the compatible property will
probably suffice.

> +static int ibmebus_create_device(struct device_node *dn)
> [...]
>
> nice!

Thanks.

> -	rc = IS_ERR(dev) ? PTR_ERR(dev) : count;
> +	rc = rc ? rc : count;
>
> the last line looks a bit silly. Maybe instead do
>
>		rc = ibmebus_create_device(dn);
>		of_node_put(dn);
>	}
>
>	kfree(path);
>	if (rc)
>		return rc;
>	return count;
> }

More code lines? ;) But yes, that looks more like the standard kernel
pattern -- I'll change that.

Joachim
Re: [PATCH 2/5] ibmebus: Remove bus match/probe/remove functions
Arnd Bergmann [EMAIL PROTECTED] wrote on 25.09.2007 16:29:51:

> The description makes it sound like a git-bisect would get broken by this
> patch, which should never happen. If the patch indeed ends up with a
> broken kernel, it would be better to merge it with the later patch that
> fixes the code again.

I took extra care to prevent just that from happening. ibmebus will simply
be disabled during the transition (because of {un,}register_driver being
empty dummies), but the kernel builds and boots without problems. So
unless you're trying to find an ibmebus-based problem, git bisect will be
fine. I'll repost 2/5 with an updated description.

I split the ibmebus rework into three patches because the merged patch was
impossible to read. Makes reviewing easier.

Joachim
Re: [PATCH 1/5] PowerPC: Move of_device allocation into of_device.[ch]
Arnd Bergmann [EMAIL PROTECTED] wrote on 25.09.2007 16:27:57:

> The patch looks good to me, especially since you did exactly what I
> suggested ;-)

Yes, our discussions were very productive. Thanks and sorry I forgot to
mention your input.

> Maybe the description should have another sentence in it about what the
> change is good for. You have that in the 0/5 mail, but that does not go
> into the changelog, so the information gets lost in the process.

Can do. New patch coming right up!

Joachim
[PATCH 0/5] [REPOST] PowerPC: ibmebus refactoring and fixes
This patchset will merge the ibmebus and of_platform bus drivers by basing
a lot of ibmebus functionality on of_platform code and adding the features
specific to ibmebus on top of that.

This is a repost of my previous patchset incorporating Arnd's comments.

I split the actual ibmebus rework into three patches (2/5-4/5) for easier
readability. The kernel will compile during the intermediate states, and
ibmebus will not crash, but not work either.

As a side-effect of patch 3/5, a problem with bus_id collisions in case of
two devices sharing the same location code is resolved -- the bus_id is
now determined differently.

[1/5] moves of_device allocation into of_device.[ch]
[2/5] removes the old bus match/probe/remove functions
[3/5] adds device creation and bus probing based on of_device
[4/5] finally moves to of_device and of_platform_driver by changing
      ibmebus.h and matching the eHCA and eHEA drivers
[5/5] just changes a nit in ibmebus_store_probe()

These patches should apply cleanly, in order, against 2.6.23-rc5 and
against Linus' git. Please review and comment them, and queue them up for
2.6.24 if you think they're okay.

Thanks and regards,
  Joachim

 arch/powerpc/kernel/ibmebus.c             |  267 -
 arch/powerpc/kernel/of_device.c           |   80 +
 arch/powerpc/kernel/of_platform.c         |   70 +
 drivers/infiniband/hw/ehca/ehca_classes.h |    2 +-
 drivers/infiniband/hw/ehca/ehca_eq.c      |    6 +-
 drivers/infiniband/hw/ehca/ehca_main.c    |   32 ++--
 drivers/net/ehea/ehea.h                   |    2 +-
 drivers/net/ehea/ehea_main.c              |   72
 include/asm-powerpc/ibmebus.h             |   38 +
 include/asm-powerpc/of_device.h           |    4 +
 include/linux/of_device.h                 |    5 +
 11 files changed, 228 insertions(+), 350 deletions(-)

-- 
Joachim Fenkes -- eHCA Linux Driver Developer and Hardware Tamer
IBM Deutschland Entwicklung GmbH -- Dept. 3627 (I/O Firmware Dev. 2)
Schoenaicher Strasse 220 -- 71032 Boeblingen -- Germany
eMail: [EMAIL PROTECTED]
[PATCH 1/5] PowerPC: Move of_device allocation into of_device.[ch]
Extract generic of_device allocation code from of_platform_device_create()
and move it into of_device.[ch], called of_device_alloc(). Also, there's now
of_device_free() which puts the device node.

This way, bus drivers that build on of_platform (like ibmebus will) can build
upon this code instead of reinventing the wheel.

Signed-off-by: Joachim Fenkes [EMAIL PROTECTED]
---
 include/asm-powerpc/of_device.h   |    4 ++
 include/linux/of_device.h         |    5 ++
 arch/powerpc/kernel/of_device.c   |   80 +
 arch/powerpc/kernel/of_platform.c |   70 +---
 4 files changed, 91 insertions(+), 68 deletions(-)

diff --git a/include/asm-powerpc/of_device.h b/include/asm-powerpc/of_device.h
index ec2a8a2..9ab469d 100644
--- a/include/asm-powerpc/of_device.h
+++ b/include/asm-powerpc/of_device.h
@@ -17,6 +17,10 @@ struct of_device
 	struct device		dev;		/* Generic device interface */
 };
 
+extern struct of_device *of_device_alloc(struct device_node *np,
+					 const char *bus_id,
+					 struct device *parent);
+
 extern ssize_t of_device_get_modalias(struct of_device *ofdev,
 					char *str, ssize_t len);
 extern int of_device_uevent(struct device *dev,
diff --git a/include/linux/of_device.h b/include/linux/of_device.h
index 91bf84b..212bffb 100644
--- a/include/linux/of_device.h
+++ b/include/linux/of_device.h
@@ -22,5 +22,10 @@ extern int of_device_register(struct of_device *ofdev);
 extern void of_device_unregister(struct of_device *ofdev);
 extern void of_release_dev(struct device *dev);
 
+static inline void of_device_free(struct of_device *dev)
+{
+	of_release_dev(&dev->dev);
+}
+
 #endif /* __KERNEL__ */
 #endif /* _LINUX_OF_DEVICE_H */
diff --git a/arch/powerpc/kernel/of_device.c b/arch/powerpc/kernel/of_device.c
index 89b911e..ecb8b0e 100644
--- a/arch/powerpc/kernel/of_device.c
+++ b/arch/powerpc/kernel/of_device.c
@@ -7,8 +7,88 @@
 #include <linux/slab.h>
 
 #include <asm/errno.h>
+#include <asm/dcr.h>
 #include <asm/of_device.h>
 
+static void of_device_make_bus_id(struct of_device *dev)
+{
+	static atomic_t bus_no_reg_magic;
+	struct device_node *node = dev->node;
+	char *name = dev->dev.bus_id;
+	const u32 *reg;
+	u64 addr;
+	int magic;
+
+	/*
+	 * If it's a DCR based device, use 'd' for native DCRs
+	 * and 'D' for MMIO DCRs.
+	 */
+#ifdef CONFIG_PPC_DCR
+	reg = of_get_property(node, "dcr-reg", NULL);
+	if (reg) {
+#ifdef CONFIG_PPC_DCR_NATIVE
+		snprintf(name, BUS_ID_SIZE, "d%x.%s",
+			 *reg, node->name);
+#else /* CONFIG_PPC_DCR_NATIVE */
+		addr = of_translate_dcr_address(node, *reg, NULL);
+		if (addr != OF_BAD_ADDR) {
+			snprintf(name, BUS_ID_SIZE,
+				 "D%llx.%s", (unsigned long long)addr,
+				 node->name);
+			return;
+		}
+#endif /* !CONFIG_PPC_DCR_NATIVE */
+	}
+#endif /* CONFIG_PPC_DCR */
+
+	/*
+	 * For MMIO, get the physical address
+	 */
+	reg = of_get_property(node, "reg", NULL);
+	if (reg) {
+		addr = of_translate_address(node, reg);
+		if (addr != OF_BAD_ADDR) {
+			snprintf(name, BUS_ID_SIZE,
+				 "%llx.%s", (unsigned long long)addr,
+				 node->name);
+			return;
+		}
+	}
+
+	/*
+	 * No BusID, use the node name and add a globally incremented
+	 * counter (and pray...)
+	 */
+	magic = atomic_add_return(1, &bus_no_reg_magic);
+	snprintf(name, BUS_ID_SIZE, "%s.%d", node->name, magic - 1);
+}
+
+struct of_device *of_device_alloc(struct device_node *np,
+				  const char *bus_id,
+				  struct device *parent)
+{
+	struct of_device *dev;
+
+	dev = kzalloc(sizeof(*dev), GFP_KERNEL);
+	if (!dev)
+		return NULL;
+
+	dev->node = of_node_get(np);
+	dev->dev.dma_mask = &dev->dma_mask;
+	dev->dev.parent = parent;
+	dev->dev.release = of_release_dev;
+	dev->dev.archdata.of_node = np;
+	dev->dev.archdata.numa_node = of_node_to_nid(np);
+
+	if (bus_id)
+		strlcpy(dev->dev.bus_id, bus_id, BUS_ID_SIZE);
+	else
+		of_device_make_bus_id(dev);
+
+	return dev;
+}
+EXPORT_SYMBOL(of_device_alloc);
+
 ssize_t of_device_get_modalias(struct of_device *ofdev,
 				char *str, ssize_t len)
 {
diff --git a/arch/powerpc/kernel/of_platform.c b/arch/powerpc/kernel/of_platform.c
index f70e787..1d96b82 100644
--- a/arch/powerpc/kernel/of_platform.c
+++ b/arch/powerpc/kernel/of_platform.c
@@ -21,7 +21,6 @@
 #include <linux/pci.h>
[PATCH 5/5] ibmebus: More speaking error return code in ibmebus_store_probe()
Signed-off-by: Joachim Fenkes [EMAIL PROTECTED]
---
 arch/powerpc/kernel/ibmebus.c |    4 ++--
 1 files changed, 2 insertions(+), 2 deletions(-)

diff --git a/arch/powerpc/kernel/ibmebus.c b/arch/powerpc/kernel/ibmebus.c
index c1e2963..0bd186c 100644
--- a/arch/powerpc/kernel/ibmebus.c
+++ b/arch/powerpc/kernel/ibmebus.c
@@ -268,10 +268,10 @@ static ssize_t ibmebus_store_probe(struct bus_type *bus,
 		return -ENOMEM;
 
 	if (bus_find_device(&ibmebus_bus_type, NULL, path,
-			 ibmebus_match_path)) {
+			    ibmebus_match_path)) {
 		printk(KERN_WARNING "%s: %s has already been probed\n",
 		       __FUNCTION__, path);
-		rc = -EINVAL;
+		rc = -EEXIST;
 		goto out;
 	}
 
-- 
1.5.2
___
Linuxppc-dev mailing list
Linuxppc-dev@ozlabs.org
https://ozlabs.org/mailman/listinfo/linuxppc-dev
[PATCH 0/5] PowerPC: ibmebus refactoring and fixes
This patchset will merge the ibmebus and of_platform bus drivers by basing a
lot of ibmebus functionality on of_platform code and adding the features
specific to ibmebus on top of that.

I split the actual ibmebus rework into three patches (2/5-4/5) for easier
readability. The kernel will compile during the intermediate states, and
ibmebus will not crash, but not work either.

As a side-effect of patch 3/5, a problem with bus_id collisions in case of two
devices sharing the same location code is resolved -- the bus_id is now
determined differently.

[1/5] moves of_device allocation into of_device.[ch]
[2/5] removes the old bus match/probe/remove functions
[3/5] adds device creation and bus probing based on of_device
[4/5] finally moves to of_device and of_platform_driver by changing ibmebus.h
      and matching the eHCA and eHEA drivers
[5/5] just changes a nit in ibmebus_store_probe()

These patches should apply cleanly, in order, against 2.6.23-rc5 and against
Linus' git. Please review and comment them, and queue them up for 2.6.24 if
you think they're okay.

Thanks and regards,
  Joachim

 arch/powerpc/kernel/ibmebus.c             |  263 -
 arch/powerpc/kernel/of_device.c           |   80 +
 arch/powerpc/kernel/of_platform.c         |   70 +
 drivers/infiniband/hw/ehca/ehca_classes.h |    2 +-
 drivers/infiniband/hw/ehca/ehca_eq.c      |    6 +-
 drivers/infiniband/hw/ehca/ehca_main.c    |   32 ++--
 drivers/net/ehea/ehea.h                   |    2 +-
 drivers/net/ehea/ehea_main.c              |   72
 include/asm-powerpc/ibmebus.h             |   38 +
 include/asm-powerpc/of_device.h           |    4 +
 include/linux/of_device.h                 |    5 +
 11 files changed, 226 insertions(+), 348 deletions(-)

-- 
Joachim Fenkes -- eHCA Linux Driver Developer and Hardware Tamer
IBM Deutschland Entwicklung GmbH -- Dept. 3627 (I/O Firmware Dev. 2)
Schoenaicher Strasse 220 -- 71032 Boeblingen -- Germany
eMail: [EMAIL PROTECTED]
[PATCH 1/5] PowerPC: Move of_device allocation into of_device.[ch]
Extract generic of_device allocation code from of_platform_device_create()
and move it into of_device.[ch], called of_device_alloc(). Also, there's now
of_device_free() which puts the device node.

Signed-off-by: Joachim Fenkes [EMAIL PROTECTED]
---
 include/asm-powerpc/of_device.h   |    4 ++
 include/linux/of_device.h         |    5 ++
 arch/powerpc/kernel/of_device.c   |   80 +
 arch/powerpc/kernel/of_platform.c |   70 +---
 4 files changed, 91 insertions(+), 68 deletions(-)

diff --git a/include/asm-powerpc/of_device.h b/include/asm-powerpc/of_device.h
index ec2a8a2..9ab469d 100644
--- a/include/asm-powerpc/of_device.h
+++ b/include/asm-powerpc/of_device.h
@@ -17,6 +17,10 @@ struct of_device
 	struct device		dev;		/* Generic device interface */
 };
 
+extern struct of_device *of_device_alloc(struct device_node *np,
+					 const char *bus_id,
+					 struct device *parent);
+
 extern ssize_t of_device_get_modalias(struct of_device *ofdev,
 					char *str, ssize_t len);
 extern int of_device_uevent(struct device *dev,
diff --git a/include/linux/of_device.h b/include/linux/of_device.h
index 91bf84b..212bffb 100644
--- a/include/linux/of_device.h
+++ b/include/linux/of_device.h
@@ -22,5 +22,10 @@ extern int of_device_register(struct of_device *ofdev);
 extern void of_device_unregister(struct of_device *ofdev);
 extern void of_release_dev(struct device *dev);
 
+static inline void of_device_free(struct of_device *dev)
+{
+	of_release_dev(&dev->dev);
+}
+
 #endif /* __KERNEL__ */
 #endif /* _LINUX_OF_DEVICE_H */
diff --git a/arch/powerpc/kernel/of_device.c b/arch/powerpc/kernel/of_device.c
index 89b911e..ecb8b0e 100644
--- a/arch/powerpc/kernel/of_device.c
+++ b/arch/powerpc/kernel/of_device.c
@@ -7,8 +7,88 @@
 #include <linux/slab.h>
 
 #include <asm/errno.h>
+#include <asm/dcr.h>
 #include <asm/of_device.h>
 
+static void of_device_make_bus_id(struct of_device *dev)
+{
+	static atomic_t bus_no_reg_magic;
+	struct device_node *node = dev->node;
+	char *name = dev->dev.bus_id;
+	const u32 *reg;
+	u64 addr;
+	int magic;
+
+	/*
+	 * If it's a DCR based device, use 'd' for native DCRs
+	 * and 'D' for MMIO DCRs.
+	 */
+#ifdef CONFIG_PPC_DCR
+	reg = of_get_property(node, "dcr-reg", NULL);
+	if (reg) {
+#ifdef CONFIG_PPC_DCR_NATIVE
+		snprintf(name, BUS_ID_SIZE, "d%x.%s",
+			 *reg, node->name);
+#else /* CONFIG_PPC_DCR_NATIVE */
+		addr = of_translate_dcr_address(node, *reg, NULL);
+		if (addr != OF_BAD_ADDR) {
+			snprintf(name, BUS_ID_SIZE,
+				 "D%llx.%s", (unsigned long long)addr,
+				 node->name);
+			return;
+		}
+#endif /* !CONFIG_PPC_DCR_NATIVE */
+	}
+#endif /* CONFIG_PPC_DCR */
+
+	/*
+	 * For MMIO, get the physical address
+	 */
+	reg = of_get_property(node, "reg", NULL);
+	if (reg) {
+		addr = of_translate_address(node, reg);
+		if (addr != OF_BAD_ADDR) {
+			snprintf(name, BUS_ID_SIZE,
+				 "%llx.%s", (unsigned long long)addr,
+				 node->name);
+			return;
+		}
+	}
+
+	/*
+	 * No BusID, use the node name and add a globally incremented
+	 * counter (and pray...)
+	 */
+	magic = atomic_add_return(1, &bus_no_reg_magic);
+	snprintf(name, BUS_ID_SIZE, "%s.%d", node->name, magic - 1);
+}
+
+struct of_device *of_device_alloc(struct device_node *np,
+				  const char *bus_id,
+				  struct device *parent)
+{
+	struct of_device *dev;
+
+	dev = kzalloc(sizeof(*dev), GFP_KERNEL);
+	if (!dev)
+		return NULL;
+
+	dev->node = of_node_get(np);
+	dev->dev.dma_mask = &dev->dma_mask;
+	dev->dev.parent = parent;
+	dev->dev.release = of_release_dev;
+	dev->dev.archdata.of_node = np;
+	dev->dev.archdata.numa_node = of_node_to_nid(np);
+
+	if (bus_id)
+		strlcpy(dev->dev.bus_id, bus_id, BUS_ID_SIZE);
+	else
+		of_device_make_bus_id(dev);
+
+	return dev;
+}
+EXPORT_SYMBOL(of_device_alloc);
+
 ssize_t of_device_get_modalias(struct of_device *ofdev,
 				char *str, ssize_t len)
 {
diff --git a/arch/powerpc/kernel/of_platform.c b/arch/powerpc/kernel/of_platform.c
index f70e787..1d96b82 100644
--- a/arch/powerpc/kernel/of_platform.c
+++ b/arch/powerpc/kernel/of_platform.c
@@ -21,7 +21,6 @@
 #include <linux/pci.h>
 
 #include <asm/errno.h>
-#include <asm/dcr.h>
 #include <asm/of_device.h>
 #include <asm/of_platform.h>
 #include <asm/topology.h>
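The naming order used by of_device_make_bus_id() in the patch above can be sketched in plain userspace C. This is only an illustration under assumptions: `struct node`, `make_bus_id()`, `BAD_ADDR` and the buffer size of 20 are hypothetical stand-ins and not kernel interfaces, the DCR cases are left out, and the counter is a plain int rather than an atomic_t.

```c
#include <stdio.h>
#include <stdint.h>
#include <stddef.h>

#define BUS_ID_SIZE 20            /* assumed buffer size for this sketch */
#define BAD_ADDR    UINT64_MAX    /* hypothetical stand-in for OF_BAD_ADDR */

/* Hypothetical, simplified stand-in for a device tree node. */
struct node {
	const char *name;
	uint64_t addr;            /* translated "reg" address, or BAD_ADDR */
};

/*
 * Prefer "<address>.<name>". If the node has no usable address, fall
 * back to "<name>.<counter>" with a globally incremented counter.
 */
static void make_bus_id(const struct node *n, char *buf, size_t len)
{
	static int counter;       /* the patch uses atomic_add_return() instead */

	if (n->addr != BAD_ADDR) {
		snprintf(buf, len, "%llx.%s",
			 (unsigned long long)n->addr, n->name);
		return;
	}
	snprintf(buf, len, "%s.%d", n->name, counter++);
}
```

With this sketch, a node that has an address gets a name derived from that address, while nodes without one are numbered in creation order, so two such nodes with the same name still get distinct ids.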
[PATCH 2/5] ibmebus: Remove bus match/probe/remove functions
ibmebus_{,un}register_driver() are replaced by dummy functions because ibmebus
is temporarily unusable in this transitional state.

Signed-off-by: Joachim Fenkes [EMAIL PROTECTED]
---
 arch/powerpc/kernel/ibmebus.c |  199 ++---
 1 files changed, 6 insertions(+), 193 deletions(-)

diff --git a/arch/powerpc/kernel/ibmebus.c b/arch/powerpc/kernel/ibmebus.c
index d6a38cd..cc80f84 100644
--- a/arch/powerpc/kernel/ibmebus.c
+++ b/arch/powerpc/kernel/ibmebus.c
@@ -41,6 +41,7 @@
 #include <linux/kobject.h>
 #include <linux/dma-mapping.h>
 #include <linux/interrupt.h>
+#include <linux/of_platform.h>
 #include <asm/ibmebus.h>
 #include <asm/abs_addr.h>
 
@@ -123,183 +124,14 @@ static struct dma_mapping_ops ibmebus_dma_ops = {
 	.dma_supported  = ibmebus_dma_supported,
 };
 
-static int ibmebus_bus_probe(struct device *dev)
-{
-	struct ibmebus_dev *ibmebusdev    = to_ibmebus_dev(dev);
-	struct ibmebus_driver *ibmebusdrv = to_ibmebus_driver(dev->driver);
-	const struct of_device_id *id;
-	int error = -ENODEV;
-
-	if (!ibmebusdrv->probe)
-		return error;
-
-	id = of_match_device(ibmebusdrv->id_table, &ibmebusdev->ofdev);
-	if (id) {
-		error = ibmebusdrv->probe(ibmebusdev, id);
-	}
-
-	return error;
-}
-
-static int ibmebus_bus_remove(struct device *dev)
-{
-	struct ibmebus_dev *ibmebusdev    = to_ibmebus_dev(dev);
-	struct ibmebus_driver *ibmebusdrv = to_ibmebus_driver(dev->driver);
-
-	if (ibmebusdrv->remove) {
-		return ibmebusdrv->remove(ibmebusdev);
-	}
-
-	return 0;
-}
-
-static void __devinit ibmebus_dev_release(struct device *dev)
-{
-	of_node_put(to_ibmebus_dev(dev)->ofdev.node);
-	kfree(to_ibmebus_dev(dev));
-}
-
-static int __devinit ibmebus_register_device_common(
-	struct ibmebus_dev *dev, const char *name)
-{
-	int err = 0;
-
-	dev->ofdev.dev.parent  = &ibmebus_bus_device;
-	dev->ofdev.dev.bus     = &ibmebus_bus_type;
-	dev->ofdev.dev.release = ibmebus_dev_release;
-
-	dev->ofdev.dev.archdata.of_node = dev->ofdev.node;
-	dev->ofdev.dev.archdata.dma_ops = &ibmebus_dma_ops;
-	dev->ofdev.dev.archdata.numa_node = of_node_to_nid(dev->ofdev.node);
-
-	/* An ibmebusdev is based on a of_device. We have to change the
-	 * bus type to use our own DMA mapping operations.
-	 */
-	if ((err = of_device_register(&dev->ofdev)) != 0) {
-		printk(KERN_ERR "%s: failed to register device (%d).\n",
-		       __FUNCTION__, err);
-		return -ENODEV;
-	}
-
-	return 0;
-}
-
-static struct ibmebus_dev* __devinit ibmebus_register_device_node(
-	struct device_node *dn)
-{
-	struct ibmebus_dev *dev;
-	int i, len, bus_len;
-
-	dev = kzalloc(sizeof(struct ibmebus_dev), GFP_KERNEL);
-	if (!dev)
-		return ERR_PTR(-ENOMEM);
-
-	dev->ofdev.node = of_node_get(dn);
-
-	len = strlen(dn->full_name + 1);
-	bus_len = min(len, BUS_ID_SIZE - 1);
-	memcpy(dev->ofdev.dev.bus_id, dn->full_name + 1
-	       + (len - bus_len), bus_len);
-	for (i = 0; i < bus_len; i++)
-		if (dev->ofdev.dev.bus_id[i] == '/')
-			dev->ofdev.dev.bus_id[i] = '_';
-
-	/* Register with generic device framework. */
-	if (ibmebus_register_device_common(dev, dn->name) != 0) {
-		kfree(dev);
-		return ERR_PTR(-ENODEV);
-	}
-
-	return dev;
-}
-
-static void ibmebus_probe_of_nodes(char* name)
-{
-	struct device_node *dn = NULL;
-
-	while ((dn = of_find_node_by_name(dn, name))) {
-		if (IS_ERR(ibmebus_register_device_node(dn))) {
-			of_node_put(dn);
-			return;
-		}
-	}
-
-	of_node_put(dn);
-
-	return;
-}
-
-static void ibmebus_add_devices_by_id(struct of_device_id *idt)
-{
-	while (strlen(idt->name) > 0) {
-		ibmebus_probe_of_nodes(idt->name);
-		idt++;
-	}
-
-	return;
-}
-
-static int ibmebus_match_name(struct device *dev, void *data)
-{
-	const struct ibmebus_dev *ebus_dev = to_ibmebus_dev(dev);
-	const char *name;
-
-	name = of_get_property(ebus_dev->ofdev.node, "name", NULL);
-
-	if (name && (strcmp(data, name) == 0))
-		return 1;
-
-	return 0;
-}
-
-static int ibmebus_unregister_device(struct device *dev)
-{
-	of_device_unregister(to_of_device(dev));
-
-	return 0;
-}
-
-static void ibmebus_remove_devices_by_id(struct of_device_id *idt)
-{
-	struct device *dev;
-
-	while (strlen(idt->name) > 0) {
-		while ((dev = bus_find_device(&ibmebus_bus_type, NULL,
-					      (void*)idt->name,
-					      ibmebus_match_name))) {
-			ibmebus_unregister_device(dev
[PATCH 3/5] ibmebus: Add device creation and bus probing based on of_device
The devtree root is now searched for devices matching a built-in whitelist
during boot, so these devices appear on the bus from the beginning. It is
still possible to manually add/remove devices to/from the bus by using the
probe/remove sysfs interface. Also, when a device driver registers itself, the
devtree is matched against its matchlist.

Signed-off-by: Joachim Fenkes [EMAIL PROTECTED]
---
 arch/powerpc/kernel/ibmebus.c |   97 ++---
 1 files changed, 81 insertions(+), 16 deletions(-)

diff --git a/arch/powerpc/kernel/ibmebus.c b/arch/powerpc/kernel/ibmebus.c
index cc80f84..c506e0d 100644
--- a/arch/powerpc/kernel/ibmebus.c
+++ b/arch/powerpc/kernel/ibmebus.c
@@ -51,6 +51,15 @@ static struct device ibmebus_bus_device = { /* fake "parent" device */
 
 struct bus_type ibmebus_bus_type;
 
+/* These devices will automatically be added to the bus during init */
+static struct of_device_id builtin_matches[] = {
+	{ .name = "lhca" },
+	{ .compatible = "IBM,lhca" },
+	{ .name = "lhea" },
+	{ .compatible = "IBM,lhea" },
+	{},
+};
+
 static void *ibmebus_alloc_coherent(struct device *dev,
 				    size_t size,
 				    dma_addr_t *dma_handle,
@@ -124,6 +133,67 @@ static struct dma_mapping_ops ibmebus_dma_ops = {
 	.dma_supported  = ibmebus_dma_supported,
 };
 
+static int ibmebus_match_path(struct device *dev, void *data)
+{
+	struct device_node *dn = to_of_device(dev)->node;
+	return (dn->full_name &&
+		(strcasecmp((char *)data, dn->full_name) == 0));
+}
+
+static int ibmebus_match_node(struct device *dev, void *data)
+{
+	return to_of_device(dev)->node == data;
+}
+
+static int ibmebus_create_device(struct device_node *dn)
+{
+	struct of_device *dev;
+	int ret;
+
+	dev = of_device_alloc(dn, NULL, &ibmebus_bus_device);
+	if (!dev)
+		return -ENOMEM;
+
+	dev->dev.bus = &ibmebus_bus_type;
+	dev->dev.archdata.dma_ops = &ibmebus_dma_ops;
+
+	ret = of_device_register(dev);
+	if (ret) {
+		of_device_free(dev);
+		return ret;
+	}
+
+	return 0;
+}
+
+static int ibmebus_create_devices(const struct of_device_id *matches)
+{
+	struct device_node *root, *child;
+	int ret = 0;
+
+	root = of_find_node_by_path("/");
+
+	for (child = NULL; (child = of_get_next_child(root, child)); ) {
+		if (!of_match_node(matches, child))
+			continue;
+
+		if (bus_find_device(&ibmebus_bus_type, NULL, child,
+				    ibmebus_match_node))
+			continue;
+
+		ret = ibmebus_create_device(child);
+		if (ret) {
+			printk(KERN_ERR "%s: failed to create device (%i)",
+			       __FUNCTION__, ret);
+			of_node_put(child);
+			break;
+		}
+	}
+
+	of_node_put(root);
+	return ret;
+}
+
 int ibmebus_register_driver(struct ibmebus_driver *drv)
 {
 	return 0;
@@ -172,18 +242,6 @@ static struct device_attribute ibmebus_dev_attrs[] = {
 	__ATTR_NULL
 };
 
-static int ibmebus_match_path(struct device *dev, void *data)
-{
-	int rc;
-	struct device_node *dn =
-		of_node_get(to_ibmebus_dev(dev)->ofdev.node);
-
-	rc = (dn->full_name && (strcasecmp((char*)data, dn->full_name) == 0));
-
-	of_node_put(dn);
-	return rc;
-}
-
 static char *ibmebus_chomp(const char *in, size_t count)
 {
 	char *out = (char*)kmalloc(count + 1, GFP_KERNEL);
@@ -202,7 +260,6 @@ static ssize_t ibmebus_store_probe(struct bus_type *bus,
 				   const char *buf, size_t count)
 {
 	struct device_node *dn = NULL;
-	struct ibmebus_dev *dev;
 	char *path;
 	ssize_t rc;
 
@@ -219,9 +276,9 @@ static ssize_t ibmebus_store_probe(struct bus_type *bus,
 	}
 
 	if ((dn = of_find_node_by_path(path))) {
-/*		dev = ibmebus_register_device_node(dn); */
+		rc = ibmebus_create_device(dn);
 		of_node_put(dn);
-		rc = IS_ERR(dev) ? PTR_ERR(dev) : count;
+		rc = rc ? rc : count;
 	} else {
 		printk(KERN_WARNING "%s: no such device node: %s\n",
 		       __FUNCTION__, path);
@@ -245,7 +302,7 @@ static ssize_t ibmebus_store_remove(struct bus_type *bus,
 
 	if ((dev = bus_find_device(&ibmebus_bus_type, NULL, path,
 				   ibmebus_match_path))) {
-/*		ibmebus_unregister_device(dev); */
+		of_device_unregister(to_of_device(dev));
 
 		kfree(path);
 		return count;
@@ -265,6 +322,7 @@ static struct bus_attribute ibmebus_bus_attrs[] = {
 };
 
 struct bus_type ibmebus_bus_type = {
+	.uevent    = of_device_uevent,
 	.dev_attrs
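How a whitelist such as builtin_matches[] selects nodes can be sketched in userspace C. The code below is an assumption-laden simplification and not the kernel's of_match_node(): `struct match_id`, `struct node` and `match_node()` are hypothetical, each node carries a single compatible string instead of a list, and an entry matches when every field it sets equals the node's field.

```c
#include <string.h>
#include <stddef.h>

/* Hypothetical, simplified match entry: NULL means "don't care". */
struct match_id {
	const char *name;
	const char *compatible;
};

/* Hypothetical, simplified device tree node. */
struct node {
	const char *name;
	const char *compatible;
};

/* Same entries as the whitelist in the patch, ended by an empty entry. */
static const struct match_id builtin_matches[] = {
	{ .name = "lhca" },
	{ .compatible = "IBM,lhca" },
	{ .name = "lhea" },
	{ .compatible = "IBM,lhea" },
	{ NULL, NULL },
};

/* Return the first entry whose set fields all match the node, or NULL. */
static const struct match_id *match_node(const struct match_id *m,
					 const struct node *n)
{
	for (; m->name || m->compatible; m++) {
		if (m->name && (!n->name || strcmp(m->name, n->name) != 0))
			continue;
		if (m->compatible &&
		    (!n->compatible || strcmp(m->compatible, n->compatible) != 0))
			continue;
		return m;
	}
	return NULL;
}
```

In this sketch a node named "lhca" matches the first entry regardless of its compatible string, which is why the table lists name and compatible as separate entries rather than combining them.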
[PATCH 4/5] ibmebus: Move to of_device and of_platform_driver, match eHCA and eHEA drivers
Signed-off-by: Joachim Fenkes [EMAIL PROTECTED]
---
 drivers/infiniband/hw/ehca/ehca_classes.h |    2 +-
 drivers/net/ehea/ehea.h                   |    2 +-
 include/asm-powerpc/ibmebus.h             |   38 +++
 arch/powerpc/kernel/ibmebus.c             |   28 ++-
 drivers/infiniband/hw/ehca/ehca_eq.c      |    6 +-
 drivers/infiniband/hw/ehca/ehca_main.c    |   32 ++--
 drivers/net/ehea/ehea_main.c              |   72 ++--
 7 files changed, 79 insertions(+), 101 deletions(-)

diff --git a/drivers/infiniband/hw/ehca/ehca_classes.h b/drivers/infiniband/hw/ehca/ehca_classes.h
index c2edd4c..8ca4dd4 100644
--- a/drivers/infiniband/hw/ehca/ehca_classes.h
+++ b/drivers/infiniband/hw/ehca/ehca_classes.h
@@ -106,7 +106,7 @@ struct ehca_sport {
 
 struct ehca_shca {
 	struct ib_device ib_device;
-	struct ibmebus_dev *ibmebus_dev;
+	struct of_device *ofdev;
 	u8 num_ports;
 	int hw_level;
 	struct list_head shca_list;
diff --git a/drivers/net/ehea/ehea.h b/drivers/net/ehea/ehea.h
index 8d58be5..830a66a 100644
--- a/drivers/net/ehea/ehea.h
+++ b/drivers/net/ehea/ehea.h
@@ -382,7 +382,7 @@ struct ehea_port_res {
 #define EHEA_MAX_PORTS 16
 struct ehea_adapter {
 	u64 handle;
-	struct ibmebus_dev *ebus_dev;
+	struct of_device *ofdev;
 	struct ehea_port *port[EHEA_MAX_PORTS];
 	struct ehea_eq *neq;       /* notification event queue */
 	struct workqueue_struct *ehea_wq;
diff --git a/include/asm-powerpc/ibmebus.h b/include/asm-powerpc/ibmebus.h
index 87d396e..1a9d9ae 100644
--- a/include/asm-powerpc/ibmebus.h
+++ b/include/asm-powerpc/ibmebus.h
@@ -43,42 +43,18 @@
 #include <linux/device.h>
 #include <linux/interrupt.h>
 #include <linux/mod_devicetable.h>
-#include <asm/of_device.h>
+#include <linux/of_device.h>
+#include <linux/of_platform.h>
 
 extern struct bus_type ibmebus_bus_type;
 
-struct ibmebus_dev {
-	struct of_device ofdev;
-};
+int ibmebus_register_driver(struct of_platform_driver *drv);
+void ibmebus_unregister_driver(struct of_platform_driver *drv);
 
-struct ibmebus_driver {
-	char *name;
-	struct of_device_id *id_table;
-	int (*probe) (struct ibmebus_dev *dev, const struct of_device_id *id);
-	int (*remove) (struct ibmebus_dev *dev);
-	struct device_driver driver;
-};
-
-int ibmebus_register_driver(struct ibmebus_driver *drv);
-void ibmebus_unregister_driver(struct ibmebus_driver *drv);
-
-int ibmebus_request_irq(struct ibmebus_dev *dev,
-			u32 ist,
-			irq_handler_t handler,
-			unsigned long irq_flags, const char * devname,
+int ibmebus_request_irq(u32 ist, irq_handler_t handler,
+			unsigned long irq_flags, const char *devname,
 			void *dev_id);
-void ibmebus_free_irq(struct ibmebus_dev *dev, u32 ist, void *dev_id);
-
-static inline struct ibmebus_driver *to_ibmebus_driver(struct device_driver *drv)
-{
-	return container_of(drv, struct ibmebus_driver, driver);
-}
-
-static inline struct ibmebus_dev *to_ibmebus_dev(struct device *dev)
-{
-	return container_of(dev, struct ibmebus_dev, ofdev.dev);
-}
-
+void ibmebus_free_irq(u32 ist, void *dev_id);
 
 #endif /* __KERNEL__ */
 #endif /* _ASM_IBMEBUS_H */
diff --git a/arch/powerpc/kernel/ibmebus.c b/arch/powerpc/kernel/ibmebus.c
index c506e0d..379472f 100644
--- a/arch/powerpc/kernel/ibmebus.c
+++ b/arch/powerpc/kernel/ibmebus.c
@@ -194,21 +194,26 @@ static int ibmebus_create_devices(const struct of_device_id *matches)
 	return ret;
 }
 
-int ibmebus_register_driver(struct ibmebus_driver *drv)
+int ibmebus_register_driver(struct of_platform_driver *drv)
 {
-	return 0;
+	/* If the driver uses devices that ibmebus doesn't know, add them */
+	ibmebus_create_devices(drv->match_table);
+
+	drv->driver.name   = drv->name;
+	drv->driver.bus    = &ibmebus_bus_type;
+
+	return driver_register(&drv->driver);
 }
 EXPORT_SYMBOL(ibmebus_register_driver);
 
-void ibmebus_unregister_driver(struct ibmebus_driver *drv)
+void ibmebus_unregister_driver(struct of_platform_driver *drv)
 {
+	driver_unregister(&drv->driver);
 }
 EXPORT_SYMBOL(ibmebus_unregister_driver);
 
-int ibmebus_request_irq(struct ibmebus_dev *dev,
-			u32 ist,
-			irq_handler_t handler,
-			unsigned long irq_flags, const char * devname,
+int ibmebus_request_irq(u32 ist, irq_handler_t handler,
+			unsigned long irq_flags, const char *devname,
 			void *dev_id)
 {
 	unsigned int irq = irq_create_mapping(NULL, ist);
@@ -216,12 +221,11 @@ int ibmebus_request_irq(struct ibmebus_dev *dev,
 	if (irq == NO_IRQ)
 		return -EINVAL;
 
-	return request_irq(irq, handler,
-			   irq_flags, devname, dev_id);
+	return request_irq(irq, handler, irq_flags, devname
[PATCH 5/5] ibmebus: More speaking error return code in ibmebus_store_probe()
Signed-off-by: Joachim Fenkes [EMAIL PROTECTED]
---
 arch/powerpc/kernel/ibmebus.c |    4 ++--
 1 files changed, 2 insertions(+), 2 deletions(-)

diff --git a/arch/powerpc/kernel/ibmebus.c b/arch/powerpc/kernel/ibmebus.c
index 379472f..8c08a98 100644
--- a/arch/powerpc/kernel/ibmebus.c
+++ b/arch/powerpc/kernel/ibmebus.c
@@ -270,10 +270,10 @@ static ssize_t ibmebus_store_probe(struct bus_type *bus,
 		return -ENOMEM;
 
 	if (bus_find_device(&ibmebus_bus_type, NULL, path,
-			 ibmebus_match_path)) {
+			    ibmebus_match_path)) {
 		printk(KERN_WARNING "%s: %s has already been probed\n",
 		       __FUNCTION__, path);
-		rc = -EINVAL;
+		rc = -EEXIST;
 		goto out;
 	}
 
-- 
1.5.2
Re: [PATCH] IB/ehca: Make sure user pages are from hugetlb before using MR large pages
Roland Dreier [EMAIL PROTECTED] wrote on 13.09.2007 06:33:45:

> > -#define HCA_CAP_MR_PGSIZE_4K  1
> > -#define HCA_CAP_MR_PGSIZE_64K 2
> > -#define HCA_CAP_MR_PGSIZE_1M  4
> > -#define HCA_CAP_MR_PGSIZE_16M 8
> > +#define HCA_CAP_MR_PGSIZE_4K  0x8000
> > +#define HCA_CAP_MR_PGSIZE_64K 0x4000
> > +#define HCA_CAP_MR_PGSIZE_1M  0x2000
> > +#define HCA_CAP_MR_PGSIZE_16M 0x1000
>
> Not sure I understand what this has to do with things... is this an
> unrelated fix?

Kinda. I can put it into its own patch if you want.

> I would suggest extending ib_umem_get() to check the vmas and adding a
> member to struct ib_umem to say whether the memory is entirely covered
> by hugetlb pages or not.

I like that approach - one patch coming right up! =)

> > +		default: /* out of mem */
> > +			ib_mr = ERR_PTR(-ENOMEM);
> > +			goto reg_user_mr_exit1;
>
> It seems like it would be better to just assume the memory is not from
> a hugetlb if ehca_is_mem_hugetlb() fails its memory allocation and
> fall back to the PAGE_SIZE case rather than failing entirely.

If ehca_is_mem_hugetlb() runs out of memory, ehca_reg_mr() is rather unlikely
to get the memory, but it's worth a try, I'll give you that. I'll make the
umem patch work that way.

Joachim
[PATCH 0/3] IB/ehca: MR/MW fixes
This patchset replaces Nam's previous MR/MW patch (posted by me). I split the
#define fixes into a separate patch and moved the "is the memory from
hugetlbfs?" code into ib_umem_get().

[1/3] fixes the page size HW cap defines
[2/3] adds the hugetlb test to ib_umem_get()
[3/3] finally uses the hugetlb flag in ehca_reg_user_mr()

The patches should apply cleanly, in order, on top of my previous 12-patch
set. Please review the changes and apply the patches for 2.6.24 if they are
okay.

Regards,
  Joachim

-- 
Joachim Fenkes -- eHCA Linux Driver Developer and Hardware Tamer
IBM Deutschland Entwicklung GmbH -- Dept. 3627 (I/O Firmware Dev. 2)
Schoenaicher Strasse 220 -- 71032 Boeblingen -- Germany
eMail: [EMAIL PROTECTED]
[PATCH 1/3] IB/ehca: Fix large page HW cap defines
From: Hoang-Nam Nguyen [EMAIL PROTECTED]

Signed-off-by: Joachim Fenkes [EMAIL PROTECTED]
---
 drivers/infiniband/hw/ehca/ehca_classes.h |    8
 1 files changed, 4 insertions(+), 4 deletions(-)

diff --git a/drivers/infiniband/hw/ehca/ehca_classes.h b/drivers/infiniband/hw/ehca/ehca_classes.h
index 206d4eb..c2edd4c 100644
--- a/drivers/infiniband/hw/ehca/ehca_classes.h
+++ b/drivers/infiniband/hw/ehca/ehca_classes.h
@@ -99,10 +99,10 @@ struct ehca_sport {
 	struct ehca_sma_attr saved_attr;
 };
 
-#define HCA_CAP_MR_PGSIZE_4K  1
-#define HCA_CAP_MR_PGSIZE_64K 2
-#define HCA_CAP_MR_PGSIZE_1M  4
-#define HCA_CAP_MR_PGSIZE_16M 8
+#define HCA_CAP_MR_PGSIZE_4K  0x8000
+#define HCA_CAP_MR_PGSIZE_64K 0x4000
+#define HCA_CAP_MR_PGSIZE_1M  0x2000
+#define HCA_CAP_MR_PGSIZE_16M 0x1000
 
 struct ehca_shca {
 	struct ib_device ib_device;
-- 
1.5.2
[PATCH 2/3] IB/umem: Add hugetlb flag to struct ib_umem
During ib_umem_get(), determine whether all pages from the memory region are
hugetlb pages and report this in the hugetlb field. Low-level drivers can use
this information if they need it.

Signed-off-by: Joachim Fenkes [EMAIL PROTECTED]
---
 drivers/infiniband/core/umem.c |   20 +++-
 include/rdma/ib_umem.h         |    1 +
 2 files changed, 20 insertions(+), 1 deletions(-)

diff --git a/drivers/infiniband/core/umem.c b/drivers/infiniband/core/umem.c
index 664d2fa..2f54e29 100644
--- a/drivers/infiniband/core/umem.c
+++ b/drivers/infiniband/core/umem.c
@@ -37,6 +37,7 @@
 #include <linux/mm.h>
 #include <linux/dma-mapping.h>
 #include <linux/sched.h>
+#include <linux/hugetlb.h>
 
 #include "uverbs.h"
 
@@ -75,6 +76,7 @@ struct ib_umem *ib_umem_get(struct ib_ucontext *context, unsigned long addr,
 {
 	struct ib_umem *umem;
 	struct page **page_list;
+	struct vm_area_struct **vma_list;
 	struct ib_umem_chunk *chunk;
 	unsigned long locked;
 	unsigned long lock_limit;
@@ -104,6 +106,9 @@ struct ib_umem *ib_umem_get(struct ib_ucontext *context, unsigned long addr,
 	 */
 	umem->writable  = !!(access & ~IB_ACCESS_REMOTE_READ);
 
+	/* We assume the memory is from hugetlb until proved otherwise */
+	umem->hugetlb   = 1;
+
 	INIT_LIST_HEAD(&umem->chunk_list);
 
 	page_list = (struct page **) __get_free_page(GFP_KERNEL);
@@ -112,6 +117,14 @@ struct ib_umem *ib_umem_get(struct ib_ucontext *context, unsigned long addr,
 		return ERR_PTR(-ENOMEM);
 	}
 
+	/*
+	 * if we can't alloc the vma_list, it's not so bad;
+	 * just assume the memory is not hugetlb memory
+	 */
+	vma_list = (struct vm_area_struct **) __get_free_page(GFP_KERNEL);
+	if (!vma_list)
+		umem->hugetlb = 0;
+
 	npages = PAGE_ALIGN(size + umem->offset) >> PAGE_SHIFT;
 
 	down_write(&current->mm->mmap_sem);
@@ -131,7 +144,7 @@ struct ib_umem *ib_umem_get(struct ib_ucontext *context, unsigned long addr,
 		ret = get_user_pages(current, current->mm, cur_base,
 				     min_t(int, npages,
 					   PAGE_SIZE / sizeof (struct page *)),
-				     1, !umem->writable, page_list, NULL);
+				     1, !umem->writable, page_list, vma_list);
 
 		if (ret < 0)
 			goto out;
@@ -152,6 +165,9 @@ struct ib_umem *ib_umem_get(struct ib_ucontext *context, unsigned long addr,
 
 			chunk->nents = min_t(int, ret, IB_UMEM_MAX_PAGE_CHUNK);
 			for (i = 0; i < chunk->nents; ++i) {
+				if (vma_list &&
+				    !is_vm_hugetlb_page(vma_list[i + off]))
+					umem->hugetlb = 0;
 				chunk->page_list[i].page   = page_list[i + off];
 				chunk->page_list[i].offset = 0;
 				chunk->page_list[i].length = PAGE_SIZE;
@@ -186,6 +202,8 @@ out:
 		current->mm->locked_vm = locked;
 
 	up_write(&current->mm->mmap_sem);
+	if (vma_list)
+		free_page((unsigned long) vma_list);
 	free_page((unsigned long) page_list);
 
 	return ret < 0 ? ERR_PTR(ret) : umem;
diff --git a/include/rdma/ib_umem.h b/include/rdma/ib_umem.h
index c533d6c..2229842 100644
--- a/include/rdma/ib_umem.h
+++ b/include/rdma/ib_umem.h
@@ -45,6 +45,7 @@ struct ib_umem {
 	int			offset;
 	int			page_size;
 	int                     writable;
+	int                     hugetlb;
 	struct list_head	chunk_list;
 	struct work_struct	work;
 	struct mm_struct       *mm;
-- 
1.5.2
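The flag logic of this patch ("assume hugetlb until proved otherwise, and give the conservative answer when no vma information is available") can be sketched in userspace C. This is a hedged illustration only: `region_is_hugetlb()` and the `is_huge[]` array are hypothetical stand-ins for the per-page is_vm_hugetlb_page() checks, and a NULL array models the case where vma_list could not be allocated.

```c
#include <stdbool.h>
#include <stddef.h>

/*
 * Hypothetical stand-in: is_huge[i] says whether page i sits in a
 * hugetlb mapping. A NULL array models the "could not allocate
 * vma_list" case from the patch.
 */
static bool region_is_hugetlb(const bool *is_huge, size_t npages)
{
	bool hugetlb = true;      /* assume hugetlb until proved otherwise */
	size_t i;

	if (!is_huge)
		return false;     /* no information: take the safe answer */

	for (i = 0; i < npages; i++)
		if (!is_huge[i])
			hugetlb = false;  /* one ordinary page clears the flag */

	return hugetlb;
}
```

The design choice mirrored here is that a failed allocation degrades to "not hugetlb" instead of failing the whole registration, which is the fallback discussed in the review thread.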
[PATCH 3/3] IB/ehca: Make sure user pages are from hugetlb before using MR large pages
...because, on virtualized hardware like System p, we can't be sure that the
physical pages behind them are contiguous.

Signed-off-by: Joachim Fenkes [EMAIL PROTECTED]
---
 drivers/infiniband/hw/ehca/ehca_mrmw.c |   25 +++--
 1 files changed, 15 insertions(+), 10 deletions(-)

diff --git a/drivers/infiniband/hw/ehca/ehca_mrmw.c b/drivers/infiniband/hw/ehca/ehca_mrmw.c
index 4c8f3b3..4ba8b7c 100644
--- a/drivers/infiniband/hw/ehca/ehca_mrmw.c
+++ b/drivers/infiniband/hw/ehca/ehca_mrmw.c
@@ -51,6 +51,7 @@
 #define NUM_CHUNKS(length, chunk_size) \
 	(((length) + (chunk_size - 1)) / (chunk_size))
 
+
 /* max number of rpages (per hcall register_rpages) */
 #define MAX_RPAGES 512
 
@@ -64,6 +65,11 @@ enum ehca_mr_pgsize {
 	EHCA_MR_PGSIZE16M = 0x100L
 };
 
+#define EHCA_MR_PGSHIFT4K  12
+#define EHCA_MR_PGSHIFT64K 16
+#define EHCA_MR_PGSHIFT1M  20
+#define EHCA_MR_PGSHIFT16M 24
+
 static u32 ehca_encode_hwpage_size(u32 pgsize)
 {
 	u32 idx = 0;
@@ -347,17 +353,16 @@ struct ib_mr *ehca_reg_user_mr(struct ib_pd *pd, u64 start, u64 length,
 	/* select proper hw_pgsize */
 	if (ehca_mr_largepage &&
 	    (shca->hca_cap_mr_pgsize & HCA_CAP_MR_PGSIZE_16M)) {
-		if (length <= EHCA_MR_PGSIZE4K
-		    && PAGE_SIZE == EHCA_MR_PGSIZE4K)
-			hwpage_size = EHCA_MR_PGSIZE4K;
-		else if (length <= EHCA_MR_PGSIZE64K)
-			hwpage_size = EHCA_MR_PGSIZE64K;
-		else if (length <= EHCA_MR_PGSIZE1M)
-			hwpage_size = EHCA_MR_PGSIZE1M;
-		else
-			hwpage_size = EHCA_MR_PGSIZE16M;
+		int page_shift = PAGE_SHIFT;
+		if (e_mr->umem->hugetlb) {
+			/* determine page_shift, clamp between 4K and 16M */
+			page_shift = (fls64(length - 1) + 3) & ~3;
+			page_shift = min(max(page_shift, EHCA_MR_PGSHIFT4K),
+					 EHCA_MR_PGSHIFT16M);
+		}
+		hwpage_size = 1UL << page_shift;
 	} else
-		hwpage_size = EHCA_MR_PGSIZE4K;
+		hwpage_size = EHCA_MR_PGSIZE4K; /* ehca1 only supports 4k */
 	ehca_dbg(pd->device, "hwpage_size=%lx", hwpage_size);
 
 reg_user_mr_fallback:
-- 
1.5.2
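The page_shift arithmetic in the hugetlb branch rounds the bit length of (length - 1) up to a multiple of four and then clamps it to the 4K..16M range, which lands on the shifts 12, 16, 20 and 24. A minimal userspace sketch of that computation follows; `fls64_sketch()` is a hypothetical portable stand-in for the kernel's fls64(), and `mr_page_shift()` is an illustrative name that does not appear in the patch.

```c
#include <stdint.h>

#define PGSHIFT_4K  12
#define PGSHIFT_16M 24

/*
 * Portable stand-in for the kernel's fls64(): 1-based index of the
 * highest set bit, and 0 when x is 0.
 */
static int fls64_sketch(uint64_t x)
{
	int n = 0;

	while (x) {
		n++;
		x >>= 1;
	}
	return n;
}

/* Round the needed shift up to a multiple of 4, then clamp to 12..24. */
static int mr_page_shift(uint64_t length)
{
	int shift = (fls64_sketch(length - 1) + 3) & ~3;

	if (shift < PGSHIFT_4K)
		shift = PGSHIFT_4K;
	if (shift > PGSHIFT_16M)
		shift = PGSHIFT_16M;
	return shift;
}
```

For example, a length of exactly 4096 bytes gives a shift of 12 (4K), while 4097 bytes already rounds up to 16 (64K); anything above 1M ends up at 24 (16M) because of the clamp.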
[PATCH] IB/ehca: Make sure user pages are from hugetlb before using MR large pages
From: Hoang-Nam Nguyen [EMAIL PROTECTED] ...because, on virtualized hardware like System p, we can't be sure that the physical pages behind them are contiguous. Signed-off-by: Joachim Fenkes [EMAIL PROTECTED] --- Another patch for 2.6.24 that will apply cleanly on top of my previous patchset. Please review and apply. Thanks! drivers/infiniband/hw/ehca/ehca_classes.h |8 ++-- drivers/infiniband/hw/ehca/ehca_mrmw.c| 82 + 2 files changed, 75 insertions(+), 15 deletions(-) diff --git a/drivers/infiniband/hw/ehca/ehca_classes.h b/drivers/infiniband/hw/ehca/ehca_classes.h index 206d4eb..c2edd4c 100644 --- a/drivers/infiniband/hw/ehca/ehca_classes.h +++ b/drivers/infiniband/hw/ehca/ehca_classes.h @@ -99,10 +99,10 @@ struct ehca_sport { struct ehca_sma_attr saved_attr; }; -#define HCA_CAP_MR_PGSIZE_4K 1 -#define HCA_CAP_MR_PGSIZE_64K 2 -#define HCA_CAP_MR_PGSIZE_1M 4 -#define HCA_CAP_MR_PGSIZE_16M 8 +#define HCA_CAP_MR_PGSIZE_4K 0x8000 +#define HCA_CAP_MR_PGSIZE_64K 0x4000 +#define HCA_CAP_MR_PGSIZE_1M 0x2000 +#define HCA_CAP_MR_PGSIZE_16M 0x1000 struct ehca_shca { struct ib_device ib_device; diff --git a/drivers/infiniband/hw/ehca/ehca_mrmw.c b/drivers/infiniband/hw/ehca/ehca_mrmw.c index 4c8f3b3..1bb9d23 100644 --- a/drivers/infiniband/hw/ehca/ehca_mrmw.c +++ b/drivers/infiniband/hw/ehca/ehca_mrmw.c @@ -41,6 +41,8 @@ */ #include asm/current.h +#include linux/mm.h +#include linux/hugetlb.h #include rdma/ib_umem.h @@ -51,6 +53,7 @@ #define NUM_CHUNKS(length, chunk_size) \ (((length) + (chunk_size - 1)) / (chunk_size)) + /* max number of rpages (per hcall register_rpages) */ #define MAX_RPAGES 512 @@ -279,6 +282,52 @@ reg_phys_mr_exit0: } /* end ehca_reg_phys_mr() */ /*--*/ +static int ehca_is_mem_hugetlb(unsigned long addr, unsigned long size) +{ + struct vm_area_struct **vma_list; + unsigned long cur_base; + unsigned long npages; + int ret, i; + + vma_list = (struct vm_area_struct **) __get_free_page(GFP_KERNEL); + if (!vma_list) { + ehca_gen_err(Can not alloc vma_list); + 
return -ENOMEM; + } + + down_write(current-mm-mmap_sem); + npages = PAGE_ALIGN(size + (addr ~PAGE_MASK)) PAGE_SHIFT; + cur_base = addr PAGE_MASK; + + while (npages) { + ret = get_user_pages(current, current-mm, cur_base, +min_t(int, npages, + PAGE_SIZE / sizeof (*vma_list)), +1, 0, NULL, vma_list); + + if (ret 0) { + ehca_gen_err(get_user_pages() failed +ret=%x cur_base=%lx, ret, cur_base); + goto is_hugetlb_out; + } + + for (i = 0; i ret; ++i) + if (!is_vm_hugetlb_page(vma_list[i])) { + ret = 0; + goto is_hugetlb_out; + } + + cur_base += ret * PAGE_SIZE; + npages -= ret; + } + ret = 1; + +is_hugetlb_out: + up_write(current-mm-mmap_sem); + free_page((unsigned long) vma_list); + + return ret; +} struct ib_mr *ehca_reg_user_mr(struct ib_pd *pd, u64 start, u64 length, u64 virt, int mr_access_flags, @@ -346,18 +395,29 @@ struct ib_mr *ehca_reg_user_mr(struct ib_pd *pd, u64 start, u64 length, num_kpages = NUM_CHUNKS((virt % PAGE_SIZE) + length, PAGE_SIZE); /* select proper hw_pgsize */ if (ehca_mr_largepage - (shca-hca_cap_mr_pgsize HCA_CAP_MR_PGSIZE_16M)) { - if (length = EHCA_MR_PGSIZE4K -PAGE_SIZE == EHCA_MR_PGSIZE4K) - hwpage_size = EHCA_MR_PGSIZE4K; - else if (length = EHCA_MR_PGSIZE64K) - hwpage_size = EHCA_MR_PGSIZE64K; - else if (length = EHCA_MR_PGSIZE1M) - hwpage_size = EHCA_MR_PGSIZE1M; - else - hwpage_size = EHCA_MR_PGSIZE16M; + shca-hca_cap_mr_pgsize HCA_CAP_MR_PGSIZE_16M) { + ret = ehca_is_mem_hugetlb(virt, length); + switch (ret) { + case 0: /* mem is not from hugetlb */ + hwpage_size = PAGE_SIZE; + break; + case 1: + if (length = EHCA_MR_PGSIZE4K +PAGE_SIZE == EHCA_MR_PGSIZE4K) + hwpage_size = EHCA_MR_PGSIZE4K; + else if (length = EHCA_MR_PGSIZE64K) + hwpage_size = EHCA_MR_PGSIZE64K; + else if (length = EHCA_MR_PGSIZE1M
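For readers following the page-count arithmetic in ehca_is_mem_hugetlb() above, here is a minimal user-space sketch of the same calculation. It assumes the flattened expression originally read PAGE_ALIGN(size + (addr & ~PAGE_MASK)) >> PAGE_SHIFT; the function name pages_spanned() and the explicit page_size parameter are made up for illustration and are not part of the driver.

```c
/*
 * Illustrative only: number of pages touched by the byte range
 * [addr, addr + size). page_size must be a power of two.
 * pages_spanned() is a hypothetical name, not an ehca function.
 */
static unsigned long pages_spanned(unsigned long addr, unsigned long size,
				   unsigned long page_size)
{
	/* offset of addr inside its page, i.e. addr & ~PAGE_MASK */
	unsigned long offset = addr & (page_size - 1);
	unsigned long bytes = size + offset;

	/* round up to whole pages, i.e. PAGE_ALIGN(bytes) >> PAGE_SHIFT */
	return (bytes + page_size - 1) / page_size;
}
```

A 32-byte buffer that starts 16 bytes before a 4K page boundary therefore counts as two pages, which is why the loop has to walk the range page by page instead of dividing the length by the page size.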
Re: [PATCH 08/12] IB/ehca: Replace get_paca()->paca_index by the more portable smp_processor_id()
On Tuesday 11 September 2007 16:51, Nathan Lynch wrote:
> > -		get_paca()->paca_index, __FUNCTION__, \
> > +		smp_processor_id(), __FUNCTION__, \
>
> I think I see these macros used in preemptible code (e.g. ehca_probe),
> where smp_processor_id() will print a warning when
> CONFIG_DEBUG_PREEMPT=y. Probably better to use raw_smp_processor_id.

You're right, man. The processor id doesn't need to be preemption-safe in this context, so that would be a bogus warning. Thanks for pointing this out. I'll post a new version of this patch.

Joachim
___ Linuxppc-dev mailing list Linuxppc-dev@ozlabs.org https://ozlabs.org/mailman/listinfo/linuxppc-dev
[PATCH 08/12] IB/ehca: Replace get_paca()->paca_index by the more portable raw_smp_processor_id()
We can use raw_smp_processor_id() here because the processor ID is only used for debug output and may therefore be preemption-unsafe.

Signed-off-by: Joachim Fenkes [EMAIL PROTECTED]
---
This is the same patch, but with smp_processor_id() replaced by raw_smp_processor_id(), as kindly pointed out to me by Nathan. Thanks!

 drivers/infiniband/hw/ehca/ehca_tools.h | 14 +++---
 1 files changed, 7 insertions(+), 7 deletions(-)

diff --git a/drivers/infiniband/hw/ehca/ehca_tools.h b/drivers/infiniband/hw/ehca/ehca_tools.h
index f9b264b..4a8346a 100644
--- a/drivers/infiniband/hw/ehca/ehca_tools.h
+++ b/drivers/infiniband/hw/ehca/ehca_tools.h
@@ -73,37 +73,37 @@ extern int ehca_debug_level;
 		if (unlikely(ehca_debug_level)) \
 			dev_printk(KERN_DEBUG, (ib_dev)->dma_device, \
 				   "PU%04x EHCA_DBG:%s " format "\n", \
-				   get_paca()->paca_index, __FUNCTION__, \
+				   raw_smp_processor_id(), __FUNCTION__, \
 				   ## arg); \
 	} while (0)
 
 #define ehca_info(ib_dev, format, arg...) \
 	dev_info((ib_dev)->dma_device, "PU%04x EHCA_INFO:%s " format "\n", \
-		 get_paca()->paca_index, __FUNCTION__, ## arg)
+		 raw_smp_processor_id(), __FUNCTION__, ## arg)
 
 #define ehca_warn(ib_dev, format, arg...) \
 	dev_warn((ib_dev)->dma_device, "PU%04x EHCA_WARN:%s " format "\n", \
-		 get_paca()->paca_index, __FUNCTION__, ## arg)
+		 raw_smp_processor_id(), __FUNCTION__, ## arg)
 
 #define ehca_err(ib_dev, format, arg...) \
 	dev_err((ib_dev)->dma_device, "PU%04x EHCA_ERR:%s " format "\n", \
-		get_paca()->paca_index, __FUNCTION__, ## arg)
+		raw_smp_processor_id(), __FUNCTION__, ## arg)
 
 /* use this one only if no ib_dev available */
 #define ehca_gen_dbg(format, arg...) \
 	do { \
 		if (unlikely(ehca_debug_level)) \
 			printk(KERN_DEBUG "PU%04x EHCA_DBG:%s " format "\n", \
-			       get_paca()->paca_index, __FUNCTION__, ## arg); \
+			       raw_smp_processor_id(), __FUNCTION__, ## arg); \
 	} while (0)
 
 #define ehca_gen_warn(format, arg...) \
 	printk(KERN_INFO "PU%04x EHCA_WARN:%s " format "\n", \
-	       get_paca()->paca_index, __FUNCTION__, ## arg)
+	       raw_smp_processor_id(), __FUNCTION__, ## arg)
 
 #define ehca_gen_err(format, arg...) \
 	printk(KERN_ERR "PU%04x EHCA_ERR:%s " format "\n", \
-	       get_paca()->paca_index, __FUNCTION__, ## arg)
+	       raw_smp_processor_id(), __FUNCTION__, ## arg)
 
 /**
  * ehca_dmp - printk a memory block, whose length is n*8 bytes.
-- 
1.5.2
[PATCH 00/12] IB/ehca: New features and fixes for 2.6.24
Here are some fresh eHCA driver features and fixes for your reviewing pleasure. They have passed internal testing and checkpatch.pl, so we think they are ready for inclusion.

[01/12] adds userspace support for small QPs
[02/12] changes a nit in firmware communication
[03/12] adds support for more than 4096 QPs/CQs in user space
[04/12] enables mapping firmware contexts into uspace on 64K-page kernels
[05/12] changes hvCall debug trace formatting
[06/12] outputs return codes as signed decimal integers
[07/12] makes warnings also appear in non-debug mode, like they should
[08/12] replaces get_paca()->paca_index by the portable smp_processor_id()
[09/12] checks the allowed max number of SGEs when creating a QP
[10/12] fixes some Path Migration problems
[11/12] works around a firmware race condition
[12/12] bumps the driver's version number

The patches should apply cleanly, in order, against Roland's git. Please review the changes and apply the patches for 2.6.24 if they are okay.

Regards,
  Joachim

-- 
Joachim Fenkes -- eHCA Linux Driver Developer and Hardware Tamer
IBM Deutschland Entwicklung GmbH -- Dept. 3627 (I/O Firmware Dev. 2)
Schoenaicher Strasse 220 -- 71032 Boeblingen -- Germany
eMail: [EMAIL PROTECTED]
[PATCH 02/12] IB/ehca: Add 1 is not longer needed because of firmware interface change
From: Stefan Roscher [EMAIL PROTECTED]

Signed-off-by: Joachim Fenkes [EMAIL PROTECTED]
---
 drivers/infiniband/hw/ehca/hcp_if.c |    4 ++--
 1 files changed, 2 insertions(+), 2 deletions(-)

diff --git a/drivers/infiniband/hw/ehca/hcp_if.c b/drivers/infiniband/hw/ehca/hcp_if.c
index 24f4541..8534061 100644
--- a/drivers/infiniband/hw/ehca/hcp_if.c
+++ b/drivers/infiniband/hw/ehca/hcp_if.c
@@ -317,9 +317,9 @@ u64 hipz_h_alloc_resource_qp(const struct ipz_adapter_handle adapter_handle,
 
 	max_r10_reg =
 		EHCA_BMASK_SET(H_ALL_RES_QP_MAX_OUTST_SEND_WR,
-			       parms->squeue.max_wr + 1)
+			       parms->squeue.max_wr)
 		| EHCA_BMASK_SET(H_ALL_RES_QP_MAX_OUTST_RECV_WR,
-				 parms->rqueue.max_wr + 1)
+				 parms->rqueue.max_wr)
 		| EHCA_BMASK_SET(H_ALL_RES_QP_MAX_SEND_SGE,
 				 parms->squeue.max_sge)
 		| EHCA_BMASK_SET(H_ALL_RES_QP_MAX_RECV_SGE,
-- 
1.5.2
[PATCH 01/12] IB/ehca: Small QP userspace support
From: Stefan Roscher [EMAIL PROTECTED]

Signed-off-by: Joachim Fenkes [EMAIL PROTECTED]
---
 drivers/infiniband/hw/ehca/ehca_qp.c   |    7 +++----
 drivers/infiniband/hw/ehca/ipz_pt_fn.c |    1 +
 2 files changed, 4 insertions(+), 4 deletions(-)

diff --git a/drivers/infiniband/hw/ehca/ehca_qp.c b/drivers/infiniband/hw/ehca/ehca_qp.c
index 84d435a..13b61c3 100644
--- a/drivers/infiniband/hw/ehca/ehca_qp.c
+++ b/drivers/infiniband/hw/ehca/ehca_qp.c
@@ -273,6 +273,7 @@ static inline void queue2resp(struct ipzu_queue_resp *resp,
 	resp->queue_length = queue->queue_length;
 	resp->pagesize = queue->pagesize;
 	resp->toggle_state = queue->toggle_state;
+	resp->offset = queue->offset;
 }
 
 /*
@@ -598,8 +599,7 @@ static struct ehca_qp *internal_create_qp(
 	parms.squeue.max_sge = max_send_sge;
 	parms.rqueue.max_sge = max_recv_sge;
 
-	if (EHCA_BMASK_GET(HCA_CAP_MINI_QP, shca->hca_cap)
-	    && !(context && udata)) { /* no small QP support in userspace ATM */
+	if (EHCA_BMASK_GET(HCA_CAP_MINI_QP, shca->hca_cap)) {
 		if (HAS_SQ(my_qp))
 			ehca_determine_small_queue(
 				&parms.squeue, max_send_sge, is_llqp);
@@ -741,8 +741,7 @@ static struct ehca_qp *internal_create_qp(
 		resp.ext_type = my_qp->ext_type;
 		resp.qkey = my_qp->qkey;
 		resp.real_qp_num = my_qp->real_qp_num;
-		resp.ipz_rqueue.offset = my_qp->ipz_rqueue.offset;
-		resp.ipz_squeue.offset = my_qp->ipz_squeue.offset;
+
 		if (HAS_SQ(my_qp))
 			queue2resp(&resp.ipz_squeue, &my_qp->ipz_squeue);
 		if (HAS_RQ(my_qp))
diff --git a/drivers/infiniband/hw/ehca/ipz_pt_fn.c b/drivers/infiniband/hw/ehca/ipz_pt_fn.c
index 29bd476..661f8db 100644
--- a/drivers/infiniband/hw/ehca/ipz_pt_fn.c
+++ b/drivers/infiniband/hw/ehca/ipz_pt_fn.c
@@ -158,6 +158,7 @@ static int alloc_small_queue_page(struct ipz_queue *queue, struct ehca_pd *pd)
 
 	queue->queue_pages[0] = (void *)(page->page | (bit << (order + 9)));
 	queue->small_page = page;
+	queue->offset = bit << (order + 9);
 	return 1;
 
 out:
-- 
1.5.2
[PATCH 04/12] IB/ehca: Use remap_4k_pfn() to map firmware contexts to user space
From: Hoang-Nam Nguyen [EMAIL PROTECTED] Use Paul's new remap_4k_pfn() function to map our 4K firmware contexts into user space on 64K-page machines without exposing neighboring firmware contexts. Return the context's offset within a 64K page to user space so it can determine the proper virtual address. For details about remap_4k_pfn(), see commit 721151d0 or http://patchwork.ozlabs.org/linuxppc/patch?id=10281 Signed-off-by: Joachim Fenkes [EMAIL PROTECTED] --- drivers/infiniband/hw/ehca/ehca_classes.h |4 +++- drivers/infiniband/hw/ehca/ehca_cq.c |2 ++ drivers/infiniband/hw/ehca/ehca_qp.c |2 ++ drivers/infiniband/hw/ehca/ehca_uverbs.c |6 +++--- 4 files changed, 10 insertions(+), 4 deletions(-) diff --git a/drivers/infiniband/hw/ehca/ehca_classes.h b/drivers/infiniband/hw/ehca/ehca_classes.h index b5e9603..206d4eb 100644 --- a/drivers/infiniband/hw/ehca/ehca_classes.h +++ b/drivers/infiniband/hw/ehca/ehca_classes.h @@ -337,6 +337,8 @@ struct ehca_create_cq_resp { u32 cq_number; u32 token; struct ipzu_queue_resp ipz_queue; + u32 fw_handle_ofs; + u32 dummy; }; struct ehca_create_qp_resp { @@ -347,7 +349,7 @@ struct ehca_create_qp_resp { u32 qkey; /* qp_num assigned by ehca: sqp0/1 may have got different numbers */ u32 real_qp_num; - u32 dummy; /* padding for 8 byte alignment */ + u32 fw_handle_ofs; struct ipzu_queue_resp ipz_squeue; struct ipzu_queue_resp ipz_rqueue; }; diff --git a/drivers/infiniband/hw/ehca/ehca_cq.c b/drivers/infiniband/hw/ehca/ehca_cq.c index a6f17e4..d68603d 100644 --- a/drivers/infiniband/hw/ehca/ehca_cq.c +++ b/drivers/infiniband/hw/ehca/ehca_cq.c @@ -281,6 +281,8 @@ struct ib_cq *ehca_create_cq(struct ib_device *device, int cqe, int comp_vector, resp.ipz_queue.queue_length = ipz_queue-queue_length; resp.ipz_queue.pagesize = ipz_queue-pagesize; resp.ipz_queue.toggle_state = ipz_queue-toggle_state; + resp.fw_handle_ofs = (u32) + (my_cq-galpas.user.fw_handle (PAGE_SIZE - 1)); if (ib_copy_to_udata(udata, resp, sizeof(resp))) { ehca_err(device, 
Copy to udata failed.); goto create_cq_exit4; diff --git a/drivers/infiniband/hw/ehca/ehca_qp.c b/drivers/infiniband/hw/ehca/ehca_qp.c index e886e3b..3a3880f 100644 --- a/drivers/infiniband/hw/ehca/ehca_qp.c +++ b/drivers/infiniband/hw/ehca/ehca_qp.c @@ -751,6 +751,8 @@ static struct ehca_qp *internal_create_qp( queue2resp(resp.ipz_squeue, my_qp-ipz_squeue); if (HAS_RQ(my_qp)) queue2resp(resp.ipz_rqueue, my_qp-ipz_rqueue); + resp.fw_handle_ofs = (u32) + (my_qp-galpas.user.fw_handle (PAGE_SIZE - 1)); if (ib_copy_to_udata(udata, resp, sizeof resp)) { ehca_err(pd-device, Copy to udata failed); diff --git a/drivers/infiniband/hw/ehca/ehca_uverbs.c b/drivers/infiniband/hw/ehca/ehca_uverbs.c index 3340f49..84a16bc 100644 --- a/drivers/infiniband/hw/ehca/ehca_uverbs.c +++ b/drivers/infiniband/hw/ehca/ehca_uverbs.c @@ -109,7 +109,7 @@ static int ehca_mmap_fw(struct vm_area_struct *vma, struct h_galpas *galpas, u64 vsize, physical; vsize = vma-vm_end - vma-vm_start; - if (vsize != EHCA_PAGESIZE) { + if (vsize EHCA_PAGESIZE) { ehca_gen_err(invalid vsize=%lx, vma-vm_end - vma-vm_start); return -EINVAL; } @@ -118,8 +118,8 @@ static int ehca_mmap_fw(struct vm_area_struct *vma, struct h_galpas *galpas, vma-vm_page_prot = pgprot_noncached(vma-vm_page_prot); ehca_gen_dbg(vsize=%lx physical=%lx, vsize, physical); /* VM_IO | VM_RESERVED are set by remap_pfn_range() */ - ret = remap_pfn_range(vma, vma-vm_start, physical PAGE_SHIFT, - vsize, vma-vm_page_prot); + ret = remap_4k_pfn(vma, vma-vm_start, physical EHCA_PAGESHIFT, + vma-vm_page_prot); if (unlikely(ret)) { ehca_gen_err(remap_pfn_range() failed ret=%x, ret); return -ENOMEM; -- 1.5.2 ___ Linuxppc-dev mailing list Linuxppc-dev@ozlabs.org https://ozlabs.org/mailman/listinfo/linuxppc-dev
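The offset handling described in this patch can be sketched in plain C. This is only an illustration under stated assumptions: the helper names fw_handle_offset() and ctx_user_address() are hypothetical, and the flattened expression in the diff is assumed to have been fw_handle & (PAGE_SIZE - 1).

```c
#include <stdint.h>

/*
 * Hypothetical helper mirroring the kernel side: offset of a 4K firmware
 * context inside the (possibly 64K) kernel page that contains it.
 * page_size must be a power of two.
 */
static uint32_t fw_handle_offset(uint64_t fw_handle, uint64_t page_size)
{
	return (uint32_t)(fw_handle & (page_size - 1));
}

/*
 * Hypothetical user-space side: the mapping starts at a kernel page
 * boundary, so the context lives at base + offset.
 */
static uintptr_t ctx_user_address(uintptr_t mapped_base, uint32_t ofs)
{
	return mapped_base + ofs;
}
```

On a 4K-page kernel the offset is always zero, so user space that adds it unconditionally should behave the same on both page sizes.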
[PATCH 05/12] IB/ehca: Refactor hvcall tracing
Change hvcall trace output towards better readability: reg numbers instead of argument numbers, return code as signed decimal instead of unsigned hex. Signed-off-by: Joachim Fenkes [EMAIL PROTECTED] --- drivers/infiniband/hw/ehca/hcp_if.c | 57 ++ 1 files changed, 24 insertions(+), 33 deletions(-) diff --git a/drivers/infiniband/hw/ehca/hcp_if.c b/drivers/infiniband/hw/ehca/hcp_if.c index 8534061..32f465b 100644 --- a/drivers/infiniband/hw/ehca/hcp_if.c +++ b/drivers/infiniband/hw/ehca/hcp_if.c @@ -84,6 +84,10 @@ #define H_MP_SHUTDOWN EHCA_BMASK_IBM(48, 48) #define H_MP_RESET_QKEY_CTR EHCA_BMASK_IBM(49, 49) +#define HCALL4_REGS_FORMAT r4=%lx r5=%lx r6=%lx r7=%lx +#define HCALL7_REGS_FORMAT HCALL4_REGS_FORMAT r8=%lx r9=%lx r10=%lx +#define HCALL9_REGS_FORMAT HCALL7_REGS_FORMAT r11=%lx r12=%lx + static DEFINE_SPINLOCK(hcall_lock); static u32 get_longbusy_msecs(int longbusy_rc) @@ -118,8 +122,7 @@ static long ehca_plpar_hcall_norets(unsigned long opcode, long ret; int i, sleep_msecs; - ehca_gen_dbg(opcode=%lx arg1=%lx arg2=%lx arg3=%lx arg4=%lx -arg5=%lx arg6=%lx arg7=%lx, + ehca_gen_dbg(opcode=%lx HCALL7_REGS_FORMAT, opcode, arg1, arg2, arg3, arg4, arg5, arg6, arg7); for (i = 0; i 5; i++) { @@ -133,16 +136,13 @@ static long ehca_plpar_hcall_norets(unsigned long opcode, } if (ret H_SUCCESS) - ehca_gen_err(opcode=%lx ret=%lx - arg1=%lx arg2=%lx arg3=%lx arg4=%lx - arg5=%lx arg6=%lx arg7=%lx , -opcode, ret, -arg1, arg2, arg3, arg4, arg5, -arg6, arg7); - - ehca_gen_dbg(opcode=%lx ret=%lx, opcode, ret); - return ret; + ehca_gen_err(opcode=%lx ret=%li HCALL7_REGS_FORMAT, +opcode, ret, arg1, arg2, arg3, +arg4, arg5, arg6, arg7); + else + ehca_gen_dbg(opcode=%lx ret=%li, opcode, ret); + return ret; } return H_BUSY; @@ -164,10 +164,8 @@ static long ehca_plpar_hcall9(unsigned long opcode, int i, sleep_msecs, lock_is_set = 0; unsigned long flags = 0; - ehca_gen_dbg(opcode=%lx arg1=%lx arg2=%lx arg3=%lx arg4=%lx -arg5=%lx arg6=%lx arg7=%lx arg8=%lx arg9=%lx, -opcode, arg1, arg2, 
arg3, arg4, arg5, arg6, arg7, -arg8, arg9); + ehca_gen_dbg(INPUT -- opcode=%lx HCALL9_REGS_FORMAT, opcode, +arg1, arg2, arg3, arg4, arg5, arg6, arg7, arg8, arg9); for (i = 0; i 5; i++) { if ((opcode == H_ALLOC_RESOURCE) (arg2 == 5)) { @@ -188,26 +186,19 @@ static long ehca_plpar_hcall9(unsigned long opcode, continue; } - if (ret H_SUCCESS) - ehca_gen_err(opcode=%lx ret=%lx - arg1=%lx arg2=%lx arg3=%lx arg4=%lx - arg5=%lx arg6=%lx arg7=%lx arg8=%lx - arg9=%lx - out1=%lx out2=%lx out3=%lx out4=%lx - out5=%lx out6=%lx out7=%lx out8=%lx - out9=%lx, -opcode, ret, -arg1, arg2, arg3, arg4, arg5, -arg6, arg7, arg8, arg9, -outs[0], outs[1], outs[2], outs[3], + if (ret H_SUCCESS) { + ehca_gen_err(INPUT -- opcode=%lx HCALL9_REGS_FORMAT, +opcode, arg1, arg2, arg3, arg4, arg5, +arg6, arg7, arg8, arg9); + ehca_gen_err(OUTPUT -- ret=%li HCALL9_REGS_FORMAT, +ret, outs[0], outs[1], outs[2], outs[3], +outs[4], outs[5], outs[6], outs[7], +outs[8]); + } else + ehca_gen_dbg(OUTPUT -- ret=%li HCALL9_REGS_FORMAT, +ret, outs[0], outs[1], outs[2], outs[3], outs[4], outs[5], outs[6], outs[7], outs[8]); - - ehca_gen_dbg(opcode=%lx ret=%lx out1=%lx out2=%lx out3=%lx -out4=%lx out5=%lx out6=%lx out7=%lx out8=%lx -out9=%lx, -opcode, ret, outs[0], outs[1], outs[2], outs[3], -outs[4], outs[5], outs[6], outs[7], outs[8
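The *_REGS_FORMAT macros in this patch rely on the compiler joining adjacent string literals. A small user-space sketch of that technique follows; the quoting and the leading space in HCALL7_REGS_FORMAT are a guess at what the archive stripped, and trace_hcall7() is a made-up helper that formats into a buffer instead of calling ehca_gen_dbg().

```c
#include <stdio.h>

/* Assumed reconstruction of the macros; quotes were lost in the archive. */
#define HCALL4_REGS_FORMAT "r4=%lx r5=%lx r6=%lx r7=%lx"
#define HCALL7_REGS_FORMAT HCALL4_REGS_FORMAT " r8=%lx r9=%lx r10=%lx"

/*
 * Hypothetical helper: build one trace line for a seven-argument hcall.
 * The adjacent string literals are merged into one format at compile time.
 */
static int trace_hcall7(char *buf, size_t n, unsigned long opcode,
			const unsigned long arg[7])
{
	return snprintf(buf, n, "opcode=%lx " HCALL7_REGS_FORMAT,
			opcode, arg[0], arg[1], arg[2], arg[3],
			arg[4], arg[5], arg[6]);
}
```

Keeping the register list in one macro means the error path and the debug path cannot drift apart in their field order, which seems to be the point of the refactoring.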
[PATCH 06/12] IB/ehca: Print return codes as signed decimal integers
...because -12 is easier to read than FFF4. Signed-off-by: Joachim Fenkes [EMAIL PROTECTED] --- drivers/infiniband/hw/ehca/ehca_cq.c | 14 +++--- drivers/infiniband/hw/ehca/ehca_hca.c|2 +- drivers/infiniband/hw/ehca/ehca_main.c | 24 +- drivers/infiniband/hw/ehca/ehca_mcast.c |4 +- drivers/infiniband/hw/ehca/ehca_mrmw.c | 75 +++--- drivers/infiniband/hw/ehca/ehca_qp.c | 46 +- drivers/infiniband/hw/ehca/ehca_reqs.c |2 +- drivers/infiniband/hw/ehca/ehca_sqp.c|2 +- drivers/infiniband/hw/ehca/ehca_uverbs.c | 18 drivers/infiniband/hw/ehca/hcp_if.c | 20 10 files changed, 103 insertions(+), 104 deletions(-) diff --git a/drivers/infiniband/hw/ehca/ehca_cq.c b/drivers/infiniband/hw/ehca/ehca_cq.c index d68603d..79c25f5 100644 --- a/drivers/infiniband/hw/ehca/ehca_cq.c +++ b/drivers/infiniband/hw/ehca/ehca_cq.c @@ -190,7 +190,7 @@ struct ib_cq *ehca_create_cq(struct ib_device *device, int cqe, int comp_vector, if (h_ret != H_SUCCESS) { ehca_err(device, hipz_h_alloc_resource_cq() failed -h_ret=%lx device=%p, h_ret, device); +h_ret=%li device=%p, h_ret, device); cq = ERR_PTR(ehca2ib_return_code(h_ret)); goto create_cq_exit2; } @@ -198,7 +198,7 @@ struct ib_cq *ehca_create_cq(struct ib_device *device, int cqe, int comp_vector, ipz_rc = ipz_queue_ctor(NULL, my_cq-ipz_queue, param.act_pages, EHCA_PAGESIZE, sizeof(struct ehca_cqe), 0, 0); if (!ipz_rc) { - ehca_err(device, ipz_queue_ctor() failed ipz_rc=%x device=%p, + ehca_err(device, ipz_queue_ctor() failed ipz_rc=%i device=%p, ipz_rc, device); cq = ERR_PTR(-EINVAL); goto create_cq_exit3; @@ -226,7 +226,7 @@ struct ib_cq *ehca_create_cq(struct ib_device *device, int cqe, int comp_vector, if (h_ret H_SUCCESS) { ehca_err(device, hipz_h_register_rpage_cq() failed -ehca_cq=%p cq_num=%x h_ret=%lx counter=%i +ehca_cq=%p cq_num=%x h_ret=%li counter=%i act_pages=%i, my_cq, my_cq-cq_number, h_ret, counter, param.act_pages); cq = ERR_PTR(-EINVAL); @@ -238,7 +238,7 @@ struct ib_cq *ehca_create_cq(struct ib_device *device, int cqe, int 
comp_vector, if ((h_ret != H_SUCCESS) || vpage) { ehca_err(device, Registration of pages not complete ehca_cq=%p cq_num=%x -h_ret=%lx, my_cq, my_cq-cq_number, +h_ret=%li, my_cq, my_cq-cq_number, h_ret); cq = ERR_PTR(-EAGAIN); goto create_cq_exit4; @@ -246,7 +246,7 @@ struct ib_cq *ehca_create_cq(struct ib_device *device, int cqe, int comp_vector, } else { if (h_ret != H_PAGE_REGISTERED) { ehca_err(device, Registration of page failed -ehca_cq=%p cq_num=%x h_ret=%lx +ehca_cq=%p cq_num=%x h_ret=%li counter=%i act_pages=%i, my_cq, my_cq-cq_number, h_ret, counter, param.act_pages); @@ -298,7 +298,7 @@ create_cq_exit3: h_ret = hipz_h_destroy_cq(adapter_handle, my_cq, 1); if (h_ret != H_SUCCESS) ehca_err(device, hipz_h_destroy_cq() failed ehca_cq=%p -cq_num=%x h_ret=%lx, my_cq, my_cq-cq_number, h_ret); +cq_num=%x h_ret=%li, my_cq, my_cq-cq_number, h_ret); create_cq_exit2: write_lock_irqsave(ehca_cq_idr_lock, flags); @@ -362,7 +362,7 @@ int ehca_destroy_cq(struct ib_cq *cq) cq_num); } if (h_ret != H_SUCCESS) { - ehca_err(device, hipz_h_destroy_cq() failed h_ret=%lx + ehca_err(device, hipz_h_destroy_cq() failed h_ret=%li ehca_cq=%p cq_num=%x, h_ret, my_cq, cq_num); return ehca2ib_return_code(h_ret); } diff --git a/drivers/infiniband/hw/ehca/ehca_hca.c b/drivers/infiniband/hw/ehca/ehca_hca.c index cf22472..3436c49 100644 --- a/drivers/infiniband/hw/ehca/ehca_hca.c +++ b/drivers/infiniband/hw/ehca/ehca_hca.c @@ -352,7 +352,7 @@ int ehca_modify_port(struct ib_device *ibdev, hret = hipz_h_modify_port(shca-ipz_hca_handle, port, cap, props-init_type, port_modify_mask
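To make the motivation for the %lx to %li change concrete, here is a tiny user-space sketch. The helper names are invented for illustration; only the two printf conversions come from the patch.

```c
#include <stdio.h>

/* Old style: the return code printed as unsigned hex. */
static int format_rc_hex(char *buf, size_t n, long rc)
{
	return snprintf(buf, n, "h_ret=%lx", (unsigned long)rc);
}

/* New style: the same return code printed as a signed decimal. */
static int format_rc_dec(char *buf, size_t n, long rc)
{
	return snprintf(buf, n, "h_ret=%li", rc);
}
```

With rc = -12 the decimal variant yields "h_ret=-12", while the hex variant yields a run of f digits ending in f4 whose length depends on the width of long.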
[PATCH 07/12] IB/ehca: ehca_gen_warn() should always print
Signed-off-by: Joachim Fenkes [EMAIL PROTECTED]
---
 drivers/infiniband/hw/ehca/ehca_tools.h |    9 +++------
 1 files changed, 3 insertions(+), 6 deletions(-)

diff --git a/drivers/infiniband/hw/ehca/ehca_tools.h b/drivers/infiniband/hw/ehca/ehca_tools.h
index 57c77a7..f9b264b 100644
--- a/drivers/infiniband/hw/ehca/ehca_tools.h
+++ b/drivers/infiniband/hw/ehca/ehca_tools.h
@@ -98,15 +98,12 @@ extern int ehca_debug_level;
 	} while (0)
 
 #define ehca_gen_warn(format, arg...) \
-	do { \
-		if (unlikely(ehca_debug_level)) \
-			printk(KERN_INFO "PU%04x EHCA_WARN:%s " format "\n", \
-			       get_paca()->paca_index, __FUNCTION__, ## arg); \
-	} while (0)
+	printk(KERN_INFO "PU%04x EHCA_WARN:%s " format "\n", \
+	       get_paca()->paca_index, __FUNCTION__, ## arg)
 
 #define ehca_gen_err(format, arg...) \
 	printk(KERN_ERR "PU%04x EHCA_ERR:%s " format "\n", \
-		get_paca()->paca_index, __FUNCTION__, ## arg)
+	       get_paca()->paca_index, __FUNCTION__, ## arg)
 
 /**
  * ehca_dmp - printk a memory block, whose length is n*8 bytes.
-- 
1.5.2
[PATCH 08/12] IB/ehca: Replace get_paca()->paca_index by the more portable smp_processor_id()
Signed-off-by: Joachim Fenkes [EMAIL PROTECTED] --- drivers/infiniband/hw/ehca/ehca_tools.h | 14 +++--- 1 files changed, 7 insertions(+), 7 deletions(-) diff --git a/drivers/infiniband/hw/ehca/ehca_tools.h b/drivers/infiniband/hw/ehca/ehca_tools.h index f9b264b..863f972 100644 --- a/drivers/infiniband/hw/ehca/ehca_tools.h +++ b/drivers/infiniband/hw/ehca/ehca_tools.h @@ -73,37 +73,37 @@ extern int ehca_debug_level; if (unlikely(ehca_debug_level)) \ dev_printk(KERN_DEBUG, (ib_dev)-dma_device, \ PU%04x EHCA_DBG:%s format \n, \ - get_paca()-paca_index, __FUNCTION__, \ + smp_processor_id(), __FUNCTION__, \ ## arg); \ } while (0) #define ehca_info(ib_dev, format, arg...) \ dev_info((ib_dev)-dma_device, PU%04x EHCA_INFO:%s format \n, \ -get_paca()-paca_index, __FUNCTION__, ## arg) +smp_processor_id(), __FUNCTION__, ## arg) #define ehca_warn(ib_dev, format, arg...) \ dev_warn((ib_dev)-dma_device, PU%04x EHCA_WARN:%s format \n, \ -get_paca()-paca_index, __FUNCTION__, ## arg) +smp_processor_id(), __FUNCTION__, ## arg) #define ehca_err(ib_dev, format, arg...) \ dev_err((ib_dev)-dma_device, PU%04x EHCA_ERR:%s format \n, \ - get_paca()-paca_index, __FUNCTION__, ## arg) + smp_processor_id(), __FUNCTION__, ## arg) /* use this one only if no ib_dev available */ #define ehca_gen_dbg(format, arg...) \ do { \ if (unlikely(ehca_debug_level)) \ printk(KERN_DEBUG PU%04x EHCA_DBG:%s format \n, \ - get_paca()-paca_index, __FUNCTION__, ## arg); \ + smp_processor_id(), __FUNCTION__, ## arg); \ } while (0) #define ehca_gen_warn(format, arg...) \ printk(KERN_INFO PU%04x EHCA_WARN:%s format \n, \ - get_paca()-paca_index, __FUNCTION__, ## arg) + smp_processor_id(), __FUNCTION__, ## arg) #define ehca_gen_err(format, arg...) \ printk(KERN_ERR PU%04x EHCA_ERR:%s format \n, \ - get_paca()-paca_index, __FUNCTION__, ## arg) + smp_processor_id(), __FUNCTION__, ## arg) /** * ehca_dmp - printk a memory block, whose length is n*8 bytes. 
-- 1.5.2 ___ Linuxppc-dev mailing list Linuxppc-dev@ozlabs.org https://ozlabs.org/mailman/listinfo/linuxppc-dev
[PATCH 10/12] IB/ehca: Path migration support
Rectify some modify_qp() issues related to path migration. Signed-off-by: Joachim Fenkes [EMAIL PROTECTED] --- drivers/infiniband/hw/ehca/ehca_irq.c |4 +- drivers/infiniband/hw/ehca/ehca_qp.c | 90 - 2 files changed, 68 insertions(+), 26 deletions(-) diff --git a/drivers/infiniband/hw/ehca/ehca_irq.c b/drivers/infiniband/hw/ehca/ehca_irq.c index a925ea5..7093986 100644 --- a/drivers/infiniband/hw/ehca/ehca_irq.c +++ b/drivers/infiniband/hw/ehca/ehca_irq.c @@ -294,8 +294,8 @@ static void parse_identifier(struct ehca_shca *shca, u64 eqe) case 0x11: /* unaffiliated access error */ ehca_err(shca-ib_device, Unaffiliated access error.); break; - case 0x12: /* path migrating error */ - ehca_err(shca-ib_device, Path migration error.); + case 0x12: /* path migrating */ + ehca_err(shca-ib_device, Path migrating.); break; case 0x13: /* interface trace stopped */ ehca_err(shca-ib_device, Interface trace stopped.); diff --git a/drivers/infiniband/hw/ehca/ehca_qp.c b/drivers/infiniband/hw/ehca/ehca_qp.c index 7154f62..6c70dee 100644 --- a/drivers/infiniband/hw/ehca/ehca_qp.c +++ b/drivers/infiniband/hw/ehca/ehca_qp.c @@ -1167,6 +1167,13 @@ static int internal_modify_qp(struct ib_qp *ibqp, } if (attr_mask IB_QP_PKEY_INDEX) { + if (attr-pkey_index = 16) { + ret = -EINVAL; + ehca_err(ibqp-device, Invalid pkey_index=%x. +ehca_qp=%p qp_num=%x max_pkey_index=f, +attr-pkey_index, my_qp, ibqp-qp_num); + goto modify_qp_exit2; + } mqpcb-prim_p_key_idx = attr-pkey_index; update_mask |= EHCA_BMASK_SET(MQPCB_MASK_PRIM_P_KEY_IDX, 1); } @@ -1275,50 +1282,78 @@ static int internal_modify_qp(struct ib_qp *ibqp, int ehca_mult = ib_rate_to_mult( shca-sport[my_qp-init_attr.port_num].rate); + if (attr-alt_port_num 1 + || attr-alt_port_num shca-num_ports) { + ret = -EINVAL; + ehca_err(ibqp-device, Invalid alt_port=%x. 
+ehca_qp=%p qp_num=%x num_ports=%x, +attr-alt_port_num, my_qp, ibqp-qp_num, +shca-num_ports); + goto modify_qp_exit2; + } + mqpcb-alt_phys_port = attr-alt_port_num; + + if (attr-alt_pkey_index = 16) { + ret = -EINVAL; + ehca_err(ibqp-device, Invalid alt_pkey_index=%x. +ehca_qp=%p qp_num=%x max_pkey_index=f, +attr-pkey_index, my_qp, ibqp-qp_num); + goto modify_qp_exit2; + } + mqpcb-alt_p_key_idx = attr-alt_pkey_index; + + mqpcb-timeout_al = attr-alt_timeout; mqpcb-dlid_al = attr-alt_ah_attr.dlid; - update_mask |= EHCA_BMASK_SET(MQPCB_MASK_DLID_AL, 1); mqpcb-source_path_bits_al = attr-alt_ah_attr.src_path_bits; - update_mask |= - EHCA_BMASK_SET(MQPCB_MASK_SOURCE_PATH_BITS_AL, 1); mqpcb-service_level_al = attr-alt_ah_attr.sl; - update_mask |= EHCA_BMASK_SET(MQPCB_MASK_SERVICE_LEVEL_AL, 1); - if (ah_mult ehca_mult) - mqpcb-max_static_rate = (ah_mult 0) ? - ((ehca_mult - 1) / ah_mult) : 0; + if (ah_mult 0 ah_mult ehca_mult) + mqpcb-max_static_rate_al = (ehca_mult - 1) / ah_mult; else mqpcb-max_static_rate_al = 0; - update_mask |= EHCA_BMASK_SET(MQPCB_MASK_MAX_STATIC_RATE_AL, 1); + /* OpenIB doesn't support alternate retry counts - copy them */ + mqpcb-retry_count_al = mqpcb-retry_count; + mqpcb-rnr_retry_count_al = mqpcb-rnr_retry_count; + + update_mask |= EHCA_BMASK_SET(MQPCB_MASK_ALT_PHYS_PORT, 1) + | EHCA_BMASK_SET(MQPCB_MASK_ALT_P_KEY_IDX, 1) + | EHCA_BMASK_SET(MQPCB_MASK_TIMEOUT_AL, 1) + | EHCA_BMASK_SET(MQPCB_MASK_DLID_AL, 1) + | EHCA_BMASK_SET(MQPCB_MASK_SOURCE_PATH_BITS_AL, 1) + | EHCA_BMASK_SET(MQPCB_MASK_SERVICE_LEVEL_AL, 1) + | EHCA_BMASK_SET(MQPCB_MASK_MAX_STATIC_RATE_AL, 1) + | EHCA_BMASK_SET(MQPCB_MASK_RETRY_COUNT_AL, 1) + | EHCA_BMASK_SET(MQPCB_MASK_RNR_RETRY_COUNT_AL, 1); + + /* +* Always supply the GRH flag, even if it's zero, to give the +* hypervisor
[PATCH 12/12] IB/ehca: Bump version number and change its format
Nobody needed the SVNEHCA_ prefix anyway.

Signed-off-by: Joachim Fenkes [EMAIL PROTECTED]
---
 drivers/infiniband/hw/ehca/ehca_main.c |    6 ++++--
 1 files changed, 4 insertions(+), 2 deletions(-)

diff --git a/drivers/infiniband/hw/ehca/ehca_main.c b/drivers/infiniband/hw/ehca/ehca_main.c
index 799f218..c84e310 100644
--- a/drivers/infiniband/hw/ehca/ehca_main.c
+++ b/drivers/infiniband/hw/ehca/ehca_main.c
@@ -49,10 +49,12 @@
 #include "ehca_tools.h"
 #include "hcp_if.h"
 
+#define HCAD_VERSION "0024"
+
 MODULE_LICENSE("Dual BSD/GPL");
 MODULE_AUTHOR("Christoph Raisch [EMAIL PROTECTED]");
 MODULE_DESCRIPTION("IBM eServer HCA InfiniBand Device Driver");
-MODULE_VERSION("SVNEHCA_0023");
+MODULE_VERSION(HCAD_VERSION);
 
 int ehca_open_aqp1     = 0;
 int ehca_debug_level   = 0;
@@ -909,7 +911,7 @@ int __init ehca_module_init(void)
 	int ret;
 
 	printk(KERN_INFO "eHCA Infiniband Device Driver "
-	       "(Rel.: SVNEHCA_0023)\n");
+	       "(Version " HCAD_VERSION ")\n");
 
 	ret = ehca_create_comp_pool();
 	if (ret) {
-- 
1.5.2
Re: [PATCH 2.6.23] ibmebus: Prevent bus_id collisions
Hi, Arnd,

> The whole logic of dynamically adding and removing device is rather bogus,
> and it prevents autoloading of device drivers.

> of_platform_make_bus_id is the function that is responsible for creating
> unique names over there.

The plaintiff makes a valid point. How about a staging approach: We put the patch as it is now into 2.6.23 so the problem is fixed, and I'll post a nice version with autoloading support and a generic of_make_bus_id function for 2.6.24. Agree?

Regards,
  Joachim
[PATCH 0/2] IB/ehca: Fixes for rc5
These two patches fix some ehca issues that should be fixed in 2.6.23.

[1/2] fixes regressions caused by the recent addition of Small QPs.
[2/2] adds missing SRQ-related functionality that would have broken IPoIB CM.

The patches should apply cleanly, in order, against Roland's git. Please review the changes and apply the patches for 2.6.23-rc5 if they are okay.

Regards,
  Joachim

-- 
Joachim Fenkes -- eHCA Linux Driver Developer and Hardware Tamer
IBM Deutschland Entwicklung GmbH -- Dept. 3627 (I/O Firmware Dev. 2)
Schoenaicher Strasse 220 -- 71032 Boeblingen -- Germany
eMail: [EMAIL PROTECTED]
Re: [PATCH 2.6.23] ibmebus: Prevent bus_id collisions
> > The plaintiff makes a valid point. How about a staging approach: We put
> > the patch as it is now into 2.6.23 so the problem is fixed, and I'll post
> > a nice version with autoloading support and a generic of_make_bus_id
> > function for 2.6.24. Agree?
>
> Ok, sounds fair. Can you make sure that the resulting bus_id is the same
> for the final version then?

No, it would change once more -- the current, minimally invasive variant of the fix uses the full_name - [EMAIL PROTECTED], while of_make_bus_id uses another format: 12345678.ehca. I don't think this makes much of a difference, though, and users shouldn't rely on the bus_id having a certain format anyway, so IMHO we can live with this.

Regards,
  Joachim
Re: [PATCH 2.6.23] ibmebus: Prevent bus_id collisions
Nathan Lynch [EMAIL PROTECTED] wrote on 29.08.2007 20:12:32:

> > Previously, ibmebus derived a device's bus_id from its location code.
> > The location code is not guaranteed to be unique, so we might get
> > bus_id collisions if two devices share the same location code. The OFDT
> > full_name, however, is unique, so we use that instead.
>
> This is a userspace-visible change, but I guess it's unavoidable.
> Will anything break?

Nope. Userspace programs should not depend on ibmebus' way of naming the
devices, especially since some overly long loc_codes tended to be truncated
and thus rendered useless. I have tested IBM's DLPAR tools against the
changed kernel, and they didn't break.

> Also, I dislike this approach of duplicating the firmware device tree
> path in sysfs.

Why? Any specific reasons for your dislike?

> Are GX/ibmebus devices guaranteed to be children of the same node in the
> OF device tree? If so, their unit addresses will be unique, and therefore
> suitable values for bus_id. I believe this is what the powerpc vio bus
> code does.

While there's no such guarantee (as in an officially signed document), yes,
I expect future GX devices to also appear beneath the OFDT root node. For
the existing devices, the unit addresses are already part of the device
name, so I save the need to use sprintf() again. Plus, I rather like using
the full_name since it also contains a descriptive name as opposed to being
just nondescript numbers, helping the layman (i.e. the user) to make sense
of a dev_id.

jschopp [EMAIL PROTECTED] wrote on 29.08.2007 20:33:30:

> > +	len = strlen(dn->full_name + 1);
> > +	bus_len = min(len, BUS_ID_SIZE - 1);
> > +	memcpy(dev->ofdev.dev.bus_id, dn->full_name + 1
> > +	       + (len - bus_len), bus_len);
> > +	for (i = 0; i < bus_len; i++)
> > +		if (dev->ofdev.dev.bus_id[i] == '/')
> > +			dev->ofdev.dev.bus_id[i] = '_';
> >
> >  	/* Register with generic device framework. */
> >  	if (ibmebus_register_device_common(dev, dn->name) != 0) {
>
> What happens when the full name is > 31 characters?
> It looks to me that it will be truncated, which takes away the uniqueness
> guarantee.

There are currently two GX devices, eHCA and eHEA, which both reside beneath
the root node - this is required by architecture for those devices. Unless
they invent a device called supercalifragilisticexpialidocious, devices in
the root node will have a full_name of less than 31 chars. Even in that
case, the truncation occurs at the beginning, so the @xxx part that makes
the nodes unique will stay in place.

If any more GX devices appear on the scene, I expect them to appear in the
root node as well. The substitution of / by _ is a safeguard so possible
weird OFDT setups don't break the kernel.

> There must be an individual property that is guaranteed to be unique and
> less than 32 characters. How about ibm,my-drc-index? That looks like a
> good candidate.

At first glance, it does; however, the attribute might not be present in all
cases. Architecture states it only needs to be present on systems with
dynamic reconfiguration enabled.

All things considered, I still like the idea of using the full_name most.
No offense.

Regards,
  Joachim

___
Linuxppc-dev mailing list
Linuxppc-dev@ozlabs.org
https://ozlabs.org/mailman/listinfo/linuxppc-dev
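[Editor's sketch] The truncation behaviour discussed in this reply can be tried out in user space. The following is only a minimal sketch, not the kernel code: SKETCH_BUS_ID_SIZE and sketch_make_bus_id() are hypothetical names, the buffer size of 20 is an assumed stand-in for the kernel's BUS_ID_SIZE, and the function assumes full_name is non-empty and starts with '/'. It keeps the tail of the name, which is the point made above about the @xxx part surviving truncation.

```c
#include <stddef.h>
#include <string.h>

/* Assumption: stand-in for the kernel's BUS_ID_SIZE; the real value comes
 * from the kernel headers and may differ. */
#define SKETCH_BUS_ID_SIZE 20

/*
 * Derive a bus_id from a device tree full_name (hypothetical helper).
 * Skips the leading '/', keeps at most SKETCH_BUS_ID_SIZE - 1 characters
 * from the END of the name, then replaces any remaining '/' with '_'.
 * full_name must be non-empty and start with '/'; out must provide
 * SKETCH_BUS_ID_SIZE bytes.
 */
static void sketch_make_bus_id(const char *full_name, char *out)
{
	const char *name = full_name + 1;	/* skip leading '/' */
	size_t len = strlen(name);
	size_t max = SKETCH_BUS_ID_SIZE - 1;
	size_t bus_len = len < max ? len : max;
	size_t i;

	/* Copy the tail, so the unit address after '@' is preserved. */
	memcpy(out, name + (len - bus_len), bus_len);
	out[bus_len] = '\0';	/* the kernel relies on kzalloc() instead */

	for (i = 0; i < bus_len; i++)
		if (out[i] == '/')
			out[i] = '_';
}
```

With a made-up node name such as "/foo/bar@1", the result would be "foo_bar@1"; names longer than the buffer lose characters at the front only.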
[PATCH 2.6.23] ibmebus: Prevent bus_id collisions
Previously, ibmebus derived a device's bus_id from its location code. The
location code is not guaranteed to be unique, so we might get bus_id
collisions if two devices share the same location code. The OFDT full_name,
however, is unique, so we use that instead.

Signed-off-by: Joachim Fenkes [EMAIL PROTECTED]
---
The patch has been tested and works fine. If you think it's too much change
for 2.6.23-rc5, please schedule for 2.6.24 instead.

 arch/powerpc/kernel/ibmebus.c |   30 +++++++++---------------------
 1 files changed, 9 insertions(+), 21 deletions(-)

diff --git a/arch/powerpc/kernel/ibmebus.c b/arch/powerpc/kernel/ibmebus.c
index 9a8c9af..d6a38cd 100644
--- a/arch/powerpc/kernel/ibmebus.c
+++ b/arch/powerpc/kernel/ibmebus.c
@@ -188,33 +188,21 @@ static struct ibmebus_dev* __devinit ibmebus_register_device_node(
 	struct device_node *dn)
 {
 	struct ibmebus_dev *dev;
-	const char *loc_code;
-	int length;
-
-	loc_code = of_get_property(dn, "ibm,loc-code", NULL);
-	if (!loc_code) {
-		printk(KERN_WARNING "%s: node %s missing 'ibm,loc-code'\n",
-		       __FUNCTION__, dn->name ? dn->name : "<unknown>");
-		return ERR_PTR(-EINVAL);
-	}
-
-	if (strlen(loc_code) == 0) {
-		printk(KERN_WARNING "%s: 'ibm,loc-code' is invalid\n",
-		       __FUNCTION__);
-		return ERR_PTR(-EINVAL);
-	}
+	int i, len, bus_len;
 
 	dev = kzalloc(sizeof(struct ibmebus_dev), GFP_KERNEL);
-	if (!dev) {
+	if (!dev)
 		return ERR_PTR(-ENOMEM);
-	}
 
 	dev->ofdev.node = of_node_get(dn);
 
-	length = strlen(loc_code);
-	memcpy(dev->ofdev.dev.bus_id, loc_code
-		+ (length - min(length, BUS_ID_SIZE - 1)),
-		min(length, BUS_ID_SIZE - 1));
+	len = strlen(dn->full_name + 1);
+	bus_len = min(len, BUS_ID_SIZE - 1);
+	memcpy(dev->ofdev.dev.bus_id, dn->full_name + 1
+	       + (len - bus_len), bus_len);
+	for (i = 0; i < bus_len; i++)
+		if (dev->ofdev.dev.bus_id[i] == '/')
+			dev->ofdev.dev.bus_id[i] = '_';
 
 	/* Register with generic device framework. */
 	if (ibmebus_register_device_common(dev, dn->name) != 0) {
-- 
1.5.2
[PATCH] IB/ehca: Properly report max #SRQs in query_device()
Signed-off-by: Joachim Fenkes [EMAIL PROTECTED]
---
This patch should apply cleanly on top of Stefan's recent patchset. Please
review and apply for 2.6.23. Thanks.

 drivers/infiniband/hw/ehca/ehca_hca.c |   10 +++++++---
 1 files changed, 7 insertions(+), 3 deletions(-)

diff --git a/drivers/infiniband/hw/ehca/ehca_hca.c b/drivers/infiniband/hw/ehca/ehca_hca.c
index fc19ef9..cf22472 100644
--- a/drivers/infiniband/hw/ehca/ehca_hca.c
+++ b/drivers/infiniband/hw/ehca/ehca_hca.c
@@ -93,9 +93,13 @@ int ehca_query_device(struct ib_device *ibdev, struct ib_device_attr *props)
 	props->max_pd = min_t(int, rblock->max_pd, INT_MAX);
 	props->max_ah = min_t(int, rblock->max_ah, INT_MAX);
 	props->max_fmr = min_t(int, rblock->max_mr, INT_MAX);
-	props->max_srq = 0;
-	props->max_srq_wr = 0;
-	props->max_srq_sge = 0;
+
+	if (EHCA_BMASK_GET(HCA_CAP_SRQ, shca->hca_cap)) {
+		props->max_srq = props->max_qp;
+		props->max_srq_wr = props->max_qp_wr;
+		props->max_srq_sge = 3;
+	}
+
 	props->max_pkeys = 16;
 	props->local_ca_ack_delay
 		= rblock->local_ca_ack_delay;
-- 
1.5.2
Re: [PATCH 10/10] IB/ehca: Support large page MRs
Roland Dreier [EMAIL PROTECTED] wrote on 16.07.2007 19:37:09:

> > If enabled via the mr_largepage module parameter,
>
> Why the module parameter? Is there any reason a user would want to turn
> this off? Or conversely, why is it off by default?

We're pretty confident this new feature works, but as with all new and
possibly experimental features, there are chances it might explode your
machine when activated. So, like with the scaling code, we want the user to
make the conscious decision of using this code instead of activating it by
default.

> >  static ssize_t ehca_show_nr_eqs(struct device *dev,
> >  				struct device_attribute *attr,
> >  				char *buf)
> >  {
> >  	return sprintf(buf, "%d\n", ehca_nr_eqs);
> >  }
> > -
> >  static DEVICE_ATTR(nr_eqs, S_IRUGO, ehca_show_nr_eqs, NULL);
>
> Although trivial, this chunk doesn't really belong in this patch -- just
> fix it up in the multiple EQ patch (which I haven't merged yet).

Sure thing.

Regards,
  Joachim
[PATCH 05/10] IB/ehca: use #define for pages per register_rpage instead of hardcoded value
From: Hoang-Nam Nguyen [EMAIL PROTECTED]

Signed-off-by: Joachim Fenkes [EMAIL PROTECTED]
---
 drivers/infiniband/hw/ehca/ehca_mrmw.c |   19 +++++++++++--------
 1 files changed, 11 insertions(+), 8 deletions(-)

diff --git a/drivers/infiniband/hw/ehca/ehca_mrmw.c b/drivers/infiniband/hw/ehca/ehca_mrmw.c
index 7c1656a..1fe4f72 100644
--- a/drivers/infiniband/hw/ehca/ehca_mrmw.c
+++ b/drivers/infiniband/hw/ehca/ehca_mrmw.c
@@ -48,6 +48,9 @@
 #include "hcp_if.h"
 #include "hipz_hw.h"
 
+/* max number of rpages (per hcall register_rpages) */
+#define MAX_RPAGES 512
+
 static struct kmem_cache *mr_cache;
 static struct kmem_cache *mw_cache;
 
@@ -1027,14 +1030,14 @@ int ehca_reg_mr_rpages(struct ehca_shca *shca,
 	}
 
 	/* max 512 pages per shot */
-	for (i = 0; i < ((pginfo->num_4k + 512 - 1) / 512); i++) {
+	for (i = 0; i < ((pginfo->num_4k + MAX_RPAGES - 1) / MAX_RPAGES); i++) {
 
-		if (i == ((pginfo->num_4k + 512 - 1) / 512) - 1) {
-			rnum = pginfo->num_4k % 512; /* last shot */
+		if (i == ((pginfo->num_4k + MAX_RPAGES - 1) / MAX_RPAGES) - 1) {
+			rnum = pginfo->num_4k % MAX_RPAGES; /* last shot */
 			if (rnum == 0)
-				rnum = 512; /* last shot is full */
+				rnum = MAX_RPAGES; /* last shot is full */
 		} else
-			rnum = 512;
+			rnum = MAX_RPAGES;
 
 		if (rnum > 1) {
 			ret = ehca_set_pagebuf(e_mr, pginfo, rnum, kpage);
@@ -1066,7 +1069,7 @@ int ehca_reg_mr_rpages(struct ehca_shca *shca,
 						 0, /* pagesize 4k */
 						 0, rpage, rnum);
 
-		if (i == ((pginfo->num_4k + 512 - 1) / 512) - 1) {
+		if (i == ((pginfo->num_4k + MAX_RPAGES - 1) / MAX_RPAGES) - 1) {
 			/*
 			 * check for 'registration complete'==H_SUCCESS
 			 * and for 'page registered'==H_PAGE_REGISTERED
@@ -1215,7 +1218,7 @@ int ehca_rereg_mr(struct ehca_shca *shca,
 	int rereg_3_hcall = 0; /* 1: use 3 hipz calls for reregistration */
 
 	/* first determine reregistration hCall(s) */
-	if ((pginfo->num_4k > 512) || (e_mr->num_4k > 512) ||
+	if ((pginfo->num_4k > MAX_RPAGES) || (e_mr->num_4k > MAX_RPAGES) ||
 	    (pginfo->num_4k > e_mr->num_4k)) {
 		ehca_dbg(shca->ib_device, "Rereg3 case, pginfo->num_4k=%lx "
 			 "e_mr->num_4k=%x", pginfo->num_4k, e_mr->num_4k);
@@ -1306,7 +1309,7 @@ int ehca_unmap_one_fmr(struct ehca_shca *shca,
 	struct ehca_mr_hipzout_parms hipzout = {{0},0,0,0,0,0};
 
 	/* first check if reregistration hCall can be used for unmap */
-	if (e_fmr->fmr_max_pages > 512) {
+	if (e_fmr->fmr_max_pages > MAX_RPAGES) {
 		rereg_1_hcall = 0;
 		rereg_3_hcall = 1;
 	}
-- 
1.5.2
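[Editor's sketch] The loop bound and the rnum computation in ehca_reg_mr_rpages() amount to splitting a page count into chunks ("shots") of at most MAX_RPAGES pages. The stand-alone sketch below restates that arithmetic so it can be checked outside the driver. The helper names sketch_num_shots() and sketch_pages_in_shot() are hypothetical and not part of the patch; both assume num_pages > 0 and a valid shot index.

```c
#include <stddef.h>

#define MAX_RPAGES 512	/* value taken from the patch above */

/* Number of register_rpage calls ("shots") needed for num_pages pages
 * (ceiling division, same expression as the loop bound in the patch). */
static size_t sketch_num_shots(size_t num_pages)
{
	return (num_pages + MAX_RPAGES - 1) / MAX_RPAGES;
}

/* Pages carried by shot i (0-based); mirrors the rnum logic of the loop.
 * Assumes num_pages > 0 and i < sketch_num_shots(num_pages). */
static size_t sketch_pages_in_shot(size_t num_pages, size_t i)
{
	size_t rnum;

	if (i == sketch_num_shots(num_pages) - 1) {
		rnum = num_pages % MAX_RPAGES;	/* last shot */
		if (rnum == 0)
			rnum = MAX_RPAGES;	/* last shot is full */
	} else {
		rnum = MAX_RPAGES;
	}
	return rnum;
}
```

For 1300 pages this gives three shots of 512, 512 and 276 pages; for exactly 1024 pages it gives two full shots.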
[PATCH 08/10] IB/ehca: Restructure ehca_set_pagebuf()
From: Hoang-Nam Nguyen [EMAIL PROTECTED] Split ehca_set_pagebuf() into three functions depending on MR type (phys/user/fast) and remove superfluous ehca_set_pagebuf_1(). Signed-off-by: Joachim Fenkes [EMAIL PROTECTED] --- drivers/infiniband/hw/ehca/ehca_mrmw.c | 531 1 files changed, 200 insertions(+), 331 deletions(-) diff --git a/drivers/infiniband/hw/ehca/ehca_mrmw.c b/drivers/infiniband/hw/ehca/ehca_mrmw.c index 53b334b..93c26cc 100644 --- a/drivers/infiniband/hw/ehca/ehca_mrmw.c +++ b/drivers/infiniband/hw/ehca/ehca_mrmw.c @@ -824,6 +824,7 @@ int ehca_map_phys_fmr(struct ib_fmr *fmr, pginfo.u.fmr.page_list = page_list; pginfo.next_hwpage = ((iova (e_fmr-fmr_page_size-1)) / EHCA_PAGESIZE); + pginfo.u.fmr.fmr_pgsize = e_fmr-fmr_page_size; ret = ehca_rereg_mr(shca, e_fmr, (u64*)iova, list_len * e_fmr-fmr_page_size, @@ -1044,15 +1045,15 @@ int ehca_reg_mr_rpages(struct ehca_shca *shca, } else rnum = MAX_RPAGES; - if (rnum 1) { - ret = ehca_set_pagebuf(e_mr, pginfo, rnum, kpage); - if (ret) { - ehca_err(shca-ib_device, ehca_set_pagebuf + ret = ehca_set_pagebuf(pginfo, rnum, kpage); + if (ret) { + ehca_err(shca-ib_device, ehca_set_pagebuf bad rc, ret=%x rnum=%x kpage=%p, ret, rnum, kpage); - ret = -EFAULT; - goto ehca_reg_mr_rpages_exit1; - } + goto ehca_reg_mr_rpages_exit1; + } + + if (rnum 1) { rpage = virt_to_abs(kpage); if (!rpage) { ehca_err(shca-ib_device, kpage=%p i=%x, @@ -1060,15 +1061,8 @@ int ehca_reg_mr_rpages(struct ehca_shca *shca, ret = -EFAULT; goto ehca_reg_mr_rpages_exit1; } - } else { /* rnum==1 */ - ret = ehca_set_pagebuf_1(e_mr, pginfo, rpage); - if (ret) { - ehca_err(shca-ib_device, ehca_set_pagebuf_1 -bad rc, ret=%x i=%x, ret, i); - ret = -EFAULT; - goto ehca_reg_mr_rpages_exit1; - } - } + } else + rpage = *kpage; h_ret = hipz_h_register_rpage_mr(shca-ipz_hca_handle, e_mr, 0, /* pagesize 4k */ @@ -1146,7 +1140,7 @@ inline int ehca_rereg_mr_rereg1(struct ehca_shca *shca, } pginfo_save = *pginfo; - ret = ehca_set_pagebuf(e_mr, pginfo, 
pginfo-num_hwpages, kpage); + ret = ehca_set_pagebuf(pginfo, pginfo-num_hwpages, kpage); if (ret) { ehca_err(shca-ib_device, set pagebuf failed, e_mr=%p pginfo=%p type=%x num_kpages=%lx num_hwpages=%lx @@ -1306,98 +1300,86 @@ int ehca_unmap_one_fmr(struct ehca_shca *shca, { int ret = 0; u64 h_ret; - int rereg_1_hcall = 1; /* 1: use hipz_mr_reregister directly */ - int rereg_3_hcall = 0; /* 1: use 3 hipz calls for unmapping */ struct ehca_pd *e_pd = container_of(e_fmr-ib.ib_fmr.pd, struct ehca_pd, ib_pd); struct ehca_mr save_fmr; u32 tmp_lkey, tmp_rkey; struct ehca_mr_pginfo pginfo; struct ehca_mr_hipzout_parms hipzout = {{0},0,0,0,0,0}; + struct ehca_mr save_mr; - /* first check if reregistration hCall can be used for unmap */ - if (e_fmr-fmr_max_pages MAX_RPAGES) { - rereg_1_hcall = 0; - rereg_3_hcall = 1; - } - - if (rereg_1_hcall) { + if (e_fmr-fmr_max_pages = MAX_RPAGES) { /* * note: after using rereg hcall with len=0, * rereg hcall must be used again for registering pages */ h_ret = hipz_h_reregister_pmr(shca-ipz_hca_handle, e_fmr, 0, 0, 0, e_pd-fw_pd, 0, hipzout); - if (h_ret != H_SUCCESS) { - /* -* should not happen, because length checked above, -* FMRs are not shared and no MW bound to FMRs -*/ - ehca_err(shca-ib_device, hipz_reregister_pmr failed -(Rereg1), h_ret=%lx e_fmr=%p hca_hndl=%lx -mr_hndl=%lx lkey=%x lkey_out=%x, -h_ret, e_fmr, shca
[PATCH 09/10] IB/ehca: Fix warnings issued by checkpatch.pl
From: Hoang-Nam Nguyen [EMAIL PROTECTED] Signed-off-by: Joachim Fenkes [EMAIL PROTECTED] --- drivers/infiniband/hw/ehca/ehca_av.c |2 +- drivers/infiniband/hw/ehca/ehca_classes.h |4 +- drivers/infiniband/hw/ehca/ehca_classes_pSeries.h | 156 ++-- drivers/infiniband/hw/ehca/ehca_cq.c |2 +- drivers/infiniband/hw/ehca/ehca_eq.c |3 +- drivers/infiniband/hw/ehca/ehca_hca.c | 28 +++- drivers/infiniband/hw/ehca/ehca_irq.c | 56 drivers/infiniband/hw/ehca/ehca_iverbs.h |7 +- drivers/infiniband/hw/ehca/ehca_main.c| 21 ++-- drivers/infiniband/hw/ehca/ehca_mrmw.c| 59 drivers/infiniband/hw/ehca/ehca_mrmw.h|7 +- drivers/infiniband/hw/ehca/ehca_qes.h | 22 ++-- drivers/infiniband/hw/ehca/ehca_qp.c | 39 +++--- drivers/infiniband/hw/ehca/ehca_reqs.c| 15 ++- drivers/infiniband/hw/ehca/ehca_tools.h | 28 ++-- drivers/infiniband/hw/ehca/ehca_uverbs.c | 10 +- drivers/infiniband/hw/ehca/hcp_if.c |8 +- drivers/infiniband/hw/ehca/hcp_phyp.c |2 +- drivers/infiniband/hw/ehca/hipz_fns_core.h|4 +- drivers/infiniband/hw/ehca/hipz_hw.h | 24 ++-- drivers/infiniband/hw/ehca/ipz_pt_fn.c|2 +- drivers/infiniband/hw/ehca/ipz_pt_fn.h|4 +- 22 files changed, 261 insertions(+), 242 deletions(-) diff --git a/drivers/infiniband/hw/ehca/ehca_av.c b/drivers/infiniband/hw/ehca/ehca_av.c index 3cd6bf3..e53a97a 100644 --- a/drivers/infiniband/hw/ehca/ehca_av.c +++ b/drivers/infiniband/hw/ehca/ehca_av.c @@ -79,7 +79,7 @@ struct ib_ah *ehca_create_ah(struct ib_pd *pd, struct ib_ah_attr *ah_attr) av-av.ipd = (ah_mult 0) ? 
((ehca_mult - 1) / ah_mult) : 0; } else - av-av.ipd = ehca_static_rate; + av-av.ipd = ehca_static_rate; av-av.lnh = ah_attr-ah_flags; av-av.grh.word_0 = EHCA_BMASK_SET(GRH_IPVERSION_MASK, 6); diff --git a/drivers/infiniband/hw/ehca/ehca_classes.h b/drivers/infiniband/hw/ehca/ehca_classes.h index 92103df..1752821 100644 --- a/drivers/infiniband/hw/ehca/ehca_classes.h +++ b/drivers/infiniband/hw/ehca/ehca_classes.h @@ -215,7 +215,7 @@ struct ehca_mr { u32 num_hwpages;/* number of hw pages to form MR */ int acl;/* ACL (stored here for usage in reregister) */ u64 *start; /* virtual start address (stored here for */ - /* usage in reregister) */ + /* usage in reregister) */ u64 size; /* size (stored here for usage in reregister) */ u32 fmr_page_size; /* page size for FMR */ u32 fmr_max_pages; /* max pages for FMR */ @@ -400,6 +400,6 @@ struct ehca_alloc_qp_parms { int ehca_cq_assign_qp(struct ehca_cq *cq, struct ehca_qp *qp); int ehca_cq_unassign_qp(struct ehca_cq *cq, unsigned int qp_num); -struct ehca_qp* ehca_cq_get_qp(struct ehca_cq *cq, int qp_num); +struct ehca_qp *ehca_cq_get_qp(struct ehca_cq *cq, int qp_num); #endif diff --git a/drivers/infiniband/hw/ehca/ehca_classes_pSeries.h b/drivers/infiniband/hw/ehca/ehca_classes_pSeries.h index fb3df5c..1798e64 100644 --- a/drivers/infiniband/hw/ehca/ehca_classes_pSeries.h +++ b/drivers/infiniband/hw/ehca/ehca_classes_pSeries.h @@ -154,83 +154,83 @@ struct hcp_modify_qp_control_block { u32 reserved_70_127[58]; /* 70 */ }; -#define MQPCB_MASK_QKEY EHCA_BMASK_IBM(0,0) -#define MQPCB_MASK_SEND_PSN EHCA_BMASK_IBM(2,2) -#define MQPCB_MASK_RECEIVE_PSN EHCA_BMASK_IBM(3,3) -#define MQPCB_MASK_PRIM_PHYS_PORT EHCA_BMASK_IBM(4,4) -#define MQPCB_PRIM_PHYS_PORTEHCA_BMASK_IBM(24,31) -#define MQPCB_MASK_ALT_PHYS_PORTEHCA_BMASK_IBM(5,5) -#define MQPCB_MASK_PRIM_P_KEY_IDX EHCA_BMASK_IBM(6,6) -#define MQPCB_PRIM_P_KEY_IDXEHCA_BMASK_IBM(24,31) -#define MQPCB_MASK_ALT_P_KEY_IDXEHCA_BMASK_IBM(7,7) -#define MQPCB_MASK_RDMA_ATOMIC_CTRL 
EHCA_BMASK_IBM(8,8) -#define MQPCB_MASK_QP_STATE EHCA_BMASK_IBM(9,9) -#define MQPCB_QP_STATE EHCA_BMASK_IBM(24,31) -#define MQPCB_MASK_RDMA_NR_ATOMIC_RESP_RES EHCA_BMASK_IBM(11,11) -#define MQPCB_MASK_PATH_MIGRATION_STATE EHCA_BMASK_IBM(12,12) -#define MQPCB_MASK_RDMA_ATOMIC_OUTST_DEST_QPEHCA_BMASK_IBM(13,13) -#define MQPCB_MASK_DEST_QP_NR EHCA_BMASK_IBM(14,14) -#define MQPCB_MASK_MIN_RNR_NAK_TIMER_FIELD EHCA_BMASK_IBM(15,15) -#define MQPCB_MASK_SERVICE_LEVELEHCA_BMASK_IBM(16,16) -#define MQPCB_MASK_SEND_GRH_FLAGEHCA_BMASK_IBM(17,17) -#define
Re: [PATCH 06/13] IB/ehca: Set SEND_GRH flag for all non-LL UD QPs on eHCA2
Roland Dreier [EMAIL PROTECTED] wrote on 09.07.2007 23:35:31:

> Out of curiosity, does this mean that a GRH will be sent on all UD
> messages (for non-LL QPs)?

No - the bit instructs the hardware to fetch the GRH parts of the QP
context. The GRH will only be used if the WQE says so.

Joachim
[PATCH 00/13] IB/ehca: eHCA2 enablement & some fixes
This patch series enables the eHCA device driver to support new functions of
the eHCA2 chip. In addition, there are some bug fixes, code optimizations
and general new features included. Another set of patches will follow.

The patches, in detail, are:

[01/13] fixes a wrong parameter description
[02/13] adds HW capabilities autodetection
[03/13] restructures the QP code, preparing for Shared Receive Queues (SRQ)
[04/13] adds SRQ support
[05/13] adds support for UD low latency QPs
[06/13] sets a flag that needs to be set on eHCA2
[07/13] adds RDMA atomic attributes to the data returned by query_qp()
[08/13] straightens out lock flag naming and adds static initializers
[09/13] refactors synchronization between completions and destroy_cq()
[10/13] changes the global idr spinlocks into rwlocks
[11/13] returns the QP pointer in poll_cq() instead of NULL
[12/13] adds notifications in case the SM LID etc. changes
[13/13] adds a slight latency improvement

The patches should apply cleanly, in order, against Roland's git. Please
review the changes and apply the patches for 2.6.23 if they are okay.

Regards,
  Joachim

-- 
Joachim Fenkes -- eHCA Linux Driver Developer and Hardware Tamer
IBM Deutschland Entwicklung GmbH -- Dept. 3627 (I/O Firmware Dev. 2)
Schoenaicher Strasse 220 -- 71032 Boeblingen -- Germany
eMail: [EMAIL PROTECTED]
[PATCH 01/13] IB/ehca: change scaling_code parameter description to match default value
From: Hoang-Nam Nguyen [EMAIL PROTECTED]

Signed-off-by: Joachim Fenkes [EMAIL PROTECTED]
---
 drivers/infiniband/hw/ehca/ehca_main.c |    2 +-
 1 files changed, 1 insertions(+), 1 deletions(-)

diff --git a/drivers/infiniband/hw/ehca/ehca_main.c b/drivers/infiniband/hw/ehca/ehca_main.c
index c3f99f3..fea199f 100644
--- a/drivers/infiniband/hw/ehca/ehca_main.c
+++ b/drivers/infiniband/hw/ehca/ehca_main.c
@@ -94,7 +94,7 @@ MODULE_PARM_DESC(poll_all_eqs,
 MODULE_PARM_DESC(static_rate,
 		 "set permanent static rate (default: disabled)");
 MODULE_PARM_DESC(scaling_code,
-		 "set scaling code (0: disabled, 1: enabled/default)");
+		 "set scaling code (0: disabled/default, 1: enabled)");
 
 spinlock_t ehca_qp_idr_lock;
 spinlock_t ehca_cq_idr_lock;
-- 
1.5.2
[PATCH 02/13] IB/ehca: HW level, HW caps and MTU autodetection
In preparation for support of new eHCA2 features, change adapter probing: - Hardware level is changed to encode major and minor chip version - Hardware capabilities are queried from the firmware - The maximum MTU is queried from the firmware instead of assuming a fixed value Signed-off-by: Joachim Fenkes [EMAIL PROTECTED] --- drivers/infiniband/hw/ehca/ehca_av.c |6 ++- drivers/infiniband/hw/ehca/ehca_classes.h |2 + drivers/infiniband/hw/ehca/ehca_hca.c | 27 +++- drivers/infiniband/hw/ehca/ehca_main.c| 62 ++--- drivers/infiniband/hw/ehca/hipz_hw.h | 18 5 files changed, 104 insertions(+), 11 deletions(-) diff --git a/drivers/infiniband/hw/ehca/ehca_av.c b/drivers/infiniband/hw/ehca/ehca_av.c index 0d6e2c4..3cd6bf3 100644 --- a/drivers/infiniband/hw/ehca/ehca_av.c +++ b/drivers/infiniband/hw/ehca/ehca_av.c @@ -118,7 +118,7 @@ struct ib_ah *ehca_create_ah(struct ib_pd *pd, struct ib_ah_attr *ah_attr) } memcpy(av-av.grh.word_1, gid, sizeof(gid)); } - av-av.pmtu = EHCA_MAX_MTU; + av-av.pmtu = shca-max_mtu; /* dgid comes in grh.word_3 */ memcpy(av-av.grh.word_3, ah_attr-grh.dgid, @@ -137,6 +137,8 @@ int ehca_modify_ah(struct ib_ah *ah, struct ib_ah_attr *ah_attr) struct ehca_av *av; struct ehca_ud_av new_ehca_av; struct ehca_pd *my_pd = container_of(ah-pd, struct ehca_pd, ib_pd); + struct ehca_shca *shca = container_of(ah-pd-device, struct ehca_shca, + ib_device); u32 cur_pid = current-tgid; if (my_pd-ib_pd.uobject my_pd-ib_pd.uobject-context @@ -192,7 +194,7 @@ int ehca_modify_ah(struct ib_ah *ah, struct ib_ah_attr *ah_attr) memcpy(new_ehca_av.grh.word_1, gid, sizeof(gid)); } - new_ehca_av.pmtu = EHCA_MAX_MTU; + new_ehca_av.pmtu = shca-max_mtu; memcpy(new_ehca_av.grh.word_3, ah_attr-grh.dgid, sizeof(ah_attr-grh.dgid)); diff --git a/drivers/infiniband/hw/ehca/ehca_classes.h b/drivers/infiniband/hw/ehca/ehca_classes.h index 1d286d3..35d948f 100644 --- a/drivers/infiniband/hw/ehca/ehca_classes.h +++ b/drivers/infiniband/hw/ehca/ehca_classes.h @@ -107,6 +107,8 @@ struct 
ehca_shca { struct ehca_pd *pd; struct h_galpas galpas; struct mutex modify_mutex; + u64 hca_cap; + int max_mtu; }; struct ehca_pd { diff --git a/drivers/infiniband/hw/ehca/ehca_hca.c b/drivers/infiniband/hw/ehca/ehca_hca.c index 32b55a4..b310de5 100644 --- a/drivers/infiniband/hw/ehca/ehca_hca.c +++ b/drivers/infiniband/hw/ehca/ehca_hca.c @@ -45,11 +45,25 @@ int ehca_query_device(struct ib_device *ibdev, struct ib_device_attr *props) { - int ret = 0; + int i, ret = 0; struct ehca_shca *shca = container_of(ibdev, struct ehca_shca, ib_device); struct hipz_query_hca *rblock; + static const u32 cap_mapping[] = { + IB_DEVICE_RESIZE_MAX_WR, HCA_CAP_WQE_RESIZE, + IB_DEVICE_BAD_PKEY_CNTR, HCA_CAP_BAD_P_KEY_CTR, + IB_DEVICE_BAD_QKEY_CNTR, HCA_CAP_Q_KEY_VIOL_CTR, + IB_DEVICE_RAW_MULTI, HCA_CAP_RAW_PACKET_MCAST, + IB_DEVICE_AUTO_PATH_MIG, HCA_CAP_AUTO_PATH_MIG, + IB_DEVICE_CHANGE_PHY_PORT,HCA_CAP_SQD_RTS_PORT_CHANGE, + IB_DEVICE_UD_AV_PORT_ENFORCE, HCA_CAP_AH_PORT_NR_CHECK, + IB_DEVICE_CURR_QP_STATE_MOD, HCA_CAP_CUR_QP_STATE_MOD, + IB_DEVICE_SHUTDOWN_PORT, HCA_CAP_SHUTDOWN_PORT, + IB_DEVICE_INIT_TYPE, HCA_CAP_INIT_TYPE, + IB_DEVICE_PORT_ACTIVE_EVENT, HCA_CAP_PORT_ACTIVE_EVENT, + }; + rblock = ehca_alloc_fw_ctrlblock(GFP_KERNEL); if (!rblock) { ehca_err(shca-ib_device, Can't allocate rblock memory.); @@ -96,6 +110,13 @@ int ehca_query_device(struct ib_device *ibdev, struct ib_device_attr *props) props-max_total_mcast_qp_attach = min_t(int, rblock-max_total_mcast_qp_attach, INT_MAX); + /* translate device capabilities */ + props-device_cap_flags = IB_DEVICE_SYS_IMAGE_GUID | + IB_DEVICE_RC_RNR_NAK_GEN | IB_DEVICE_N_NOTIFY_CQ; + for (i = 0; i ARRAY_SIZE(cap_mapping); i += 2) + if (rblock-hca_cap_indicators cap_mapping[i + 1]) + props-device_cap_flags |= cap_mapping[i]; + query_device1: ehca_free_fw_ctrlblock(rblock); @@ -261,7 +282,7 @@ int ehca_modify_port(struct ib_device *ibdev, } if (mutex_lock_interruptible(shca-modify_mutex)) -return -ERESTARTSYS; + return -ERESTARTSYS; 
rblock = ehca_alloc_fw_ctrlblock(GFP_KERNEL); if (!rblock) { @@ -290,7 +311,7 @@ modify_port2: ehca_free_fw_ctrlblock(rblock
[PATCH 03/13] IB/ehca: QP code restructuring in preparation for SRQ
- Replace init_qp_queues() by a shorter init_qp_queue(), eliminating duplicate code. - hipz_h_alloc_resource_qp() doesn't need a pointer to struct ehca_qp any longer. All input and output data is transferred through the parms parameter. - Change the interface to also support SRQ. Signed-off-by: Joachim Fenkes [EMAIL PROTECTED] --- drivers/infiniband/hw/ehca/ehca_classes.h | 46 +- drivers/infiniband/hw/ehca/ehca_qp.c | 254 + drivers/infiniband/hw/ehca/hcp_if.c | 35 ++--- drivers/infiniband/hw/ehca/hcp_if.h |1 - 4 files changed, 166 insertions(+), 170 deletions(-) diff --git a/drivers/infiniband/hw/ehca/ehca_classes.h b/drivers/infiniband/hw/ehca/ehca_classes.h index 35d948f..6e75db6 100644 --- a/drivers/infiniband/hw/ehca/ehca_classes.h +++ b/drivers/infiniband/hw/ehca/ehca_classes.h @@ -322,14 +322,49 @@ struct ehca_alloc_cq_parms { struct ipz_eq_handle eq_handle; }; +enum ehca_service_type { + ST_RC = 0, + ST_UC = 1, + ST_RD = 2, + ST_UD = 3, +}; + +enum ehca_ext_qp_type { + EQPT_NORMAL= 0, + EQPT_LLQP = 1, + EQPT_SRQBASE = 2, + EQPT_SRQ = 3, +}; + +enum ehca_ll_comp_flags { + LLQP_SEND_COMP = 0x20, + LLQP_RECV_COMP = 0x40, + LLQP_COMP_MASK = 0x60, +}; + struct ehca_alloc_qp_parms { - int servicetype; +/* input parameters */ + enum ehca_service_type servicetype; int sigtype; - int daqp_ctrl; - int max_send_sge; - int max_recv_sge; + enum ehca_ext_qp_type ext_type; + enum ehca_ll_comp_flags ll_comp_flags; + + int max_send_wr, max_recv_wr; + int max_send_sge, max_recv_sge; int ud_av_l_key_ctl; + u32 token; + struct ipz_eq_handle eq_handle; + struct ipz_pd pd; + struct ipz_cq_handle send_cq_handle, recv_cq_handle; + + u32 srq_qpn, srq_token, srq_limit; + +/* output parameters */ + u32 real_qp_num; + struct ipz_qp_handle qp_handle; + struct h_galpas galpas; + u16 act_nr_send_wqes; u16 act_nr_recv_wqes; u8 act_nr_recv_sges; @@ -337,9 +372,6 @@ struct ehca_alloc_qp_parms { u32 nr_rq_pages; u32 nr_sq_pages; - - struct ipz_eq_handle ipz_eq_handle; - struct ipz_pd pd; }; 
int ehca_cq_assign_qp(struct ehca_cq *cq, struct ehca_qp *qp); diff --git a/drivers/infiniband/hw/ehca/ehca_qp.c b/drivers/infiniband/hw/ehca/ehca_qp.c index b5bc787..ec1d555 100644 --- a/drivers/infiniband/hw/ehca/ehca_qp.c +++ b/drivers/infiniband/hw/ehca/ehca_qp.c @@ -234,13 +234,6 @@ static inline enum ib_qp_statetrans get_modqp_statetrans(int ib_fromstate, return index; } -enum ehca_service_type { - ST_RC = 0, - ST_UC = 1, - ST_RD = 2, - ST_UD = 3 -}; - /* * ibqptype2servicetype returns hcp service type corresponding to given * ib qp type used by create_qp() @@ -268,15 +261,16 @@ static inline int ibqptype2servicetype(enum ib_qp_type ibqptype) } /* - * init_qp_queues initializes/constructs r/squeue and registers queue pages. + * init_qp_queue initializes/constructs r/squeue and registers queue pages. */ -static inline int init_qp_queues(struct ehca_shca *shca, -struct ehca_qp *my_qp, -int nr_sq_pages, -int nr_rq_pages, -int swqe_size, -int rwqe_size, -int nr_send_sges, int nr_receive_sges) +static inline int init_qp_queue(struct ehca_shca *shca, + struct ehca_qp *my_qp, + struct ipz_queue *queue, + int q_type, + u64 expected_hret, + int nr_q_pages, + int wqe_size, + int nr_sges) { int ret, cnt, ipz_rc; void *vpage; @@ -284,104 +278,63 @@ static inline int init_qp_queues(struct ehca_shca *shca, struct ib_device *ib_dev = shca-ib_device; struct ipz_adapter_handle ipz_hca_handle = shca-ipz_hca_handle; - ipz_rc = ipz_queue_ctor(my_qp-ipz_squeue, - nr_sq_pages, - EHCA_PAGESIZE, swqe_size, nr_send_sges); + if (!nr_q_pages) + return 0; + + ipz_rc = ipz_queue_ctor(queue, nr_q_pages, EHCA_PAGESIZE, + wqe_size, nr_sges); if (!ipz_rc) { - ehca_err(ib_dev,Cannot allocate page for squeue. ipz_rc=%x, + ehca_err(ib_dev,Cannot allocate page for queue. ipz_rc=%x, ipz_rc); return -EBUSY; } - ipz_rc = ipz_queue_ctor(my_qp-ipz_rqueue, - nr_rq_pages, - EHCA_PAGESIZE
[PATCH 05/13] IB/ehca: Support UD low latency QPs
From: Stefan Roscher [EMAIL PROTECTED] Signed-off-by: Joachim Fenkes [EMAIL PROTECTED] --- drivers/infiniband/hw/ehca/ehca_qp.c | 84 +++--- 1 files changed, 57 insertions(+), 27 deletions(-) diff --git a/drivers/infiniband/hw/ehca/ehca_qp.c b/drivers/infiniband/hw/ehca/ehca_qp.c index 9486a44..ffd1ce9 100644 --- a/drivers/infiniband/hw/ehca/ehca_qp.c +++ b/drivers/infiniband/hw/ehca/ehca_qp.c @@ -275,6 +275,11 @@ static inline void queue2resp(struct ipzu_queue_resp *resp, resp-toggle_state = queue-toggle_state; } +static inline int ll_qp_msg_size(int nr_sge) +{ + return 128 nr_sge; +} + /* * init_qp_queue initializes/constructs r/squeue and registers queue pages. */ @@ -363,8 +368,6 @@ struct ehca_qp *internal_create_qp(struct ib_pd *pd, struct ib_srq_init_attr *srq_init_attr, struct ib_udata *udata, int is_srq) { - static int da_rc_msg_size[] = { 128, 256, 512, 1024, 2048, 4096 }; - static int da_ud_sq_msg_size[]={ 128, 384, 896, 1920, 3968 }; struct ehca_qp *my_qp; struct ehca_pd *my_pd = container_of(pd, struct ehca_pd, ib_pd); struct ehca_shca *shca = container_of(pd-device, struct ehca_shca, @@ -396,6 +399,7 @@ struct ehca_qp *internal_create_qp(struct ib_pd *pd, parms.ll_comp_flags = qp_type LLQP_COMP_MASK; } qp_type = 0x1F; + init_attr-qp_type = 0x1F; /* handle SRQ base QPs */ if (init_attr-srq) { @@ -435,23 +439,49 @@ struct ehca_qp *internal_create_qp(struct ib_pd *pd, return ERR_PTR(-EINVAL); } - if (is_llqp (qp_type != IB_QPT_RC qp_type != IB_QPT_UD)) { - ehca_err(pd-device, unsupported LL QP Type=%x, qp_type); - return ERR_PTR(-EINVAL); - } else if (is_llqp qp_type == IB_QPT_RC - (init_attr-cap.max_send_wr 255 || - init_attr-cap.max_recv_wr 255 )) { - ehca_err(pd-device, Invalid Number of max_sq_wr=%x -or max_rq_wr=%x for RC LLQP, -init_attr-cap.max_send_wr, -init_attr-cap.max_recv_wr); - return ERR_PTR(-EINVAL); - } else if (is_llqp qp_type == IB_QPT_UD -init_attr-cap.max_send_wr 255) { - ehca_err(pd-device, -Invalid Number of max_send_wr=%x for UD 
QP_TYPE=%x, -init_attr-cap.max_send_wr, qp_type); - return ERR_PTR(-EINVAL); + if (is_llqp) { + switch (qp_type) { + case IB_QPT_RC: + if ((init_attr-cap.max_send_wr 255) || + (init_attr-cap.max_recv_wr 255)) { + ehca_err(pd-device, +Invalid Number of max_sq_wr=%x +or max_rq_wr=%x for RC LLQP, +init_attr-cap.max_send_wr, +init_attr-cap.max_recv_wr); + return ERR_PTR(-EINVAL); + } + break; + case IB_QPT_UD: + if (!EHCA_BMASK_GET(HCA_CAP_UD_LL_QP, shca-hca_cap)) { + ehca_err(pd-device, UD LLQP not supported +by this adapter); + return ERR_PTR(-ENOSYS); + } + if (!(init_attr-cap.max_send_sge = 5 +init_attr-cap.max_send_sge = 1 +init_attr-cap.max_recv_sge = 5 +init_attr-cap.max_recv_sge = 1)) { + ehca_err(pd-device, +Invalid Number of max_send_sge=%x +or max_recv_sge=%x for UD LLQP, +init_attr-cap.max_send_sge, +init_attr-cap.max_recv_sge); + return ERR_PTR(-EINVAL); + } else if (init_attr-cap.max_send_wr 255) { + ehca_err(pd-device, +Invalid Number of +ax_send_wr=%x for UD QP_TYPE=%x, +init_attr-cap.max_send_wr, qp_type); + return ERR_PTR(-EINVAL); + } + break; + default: + ehca_err(pd-device, unsupported LL QP Type=%x, +qp_type); + return ERR_PTR(-EINVAL); + break; + } } if (pd-uobject udata) @@ -509,7 +539,7 @@ struct
[PATCH 06/13] IB/ehca: Set SEND_GRH flag for all non-LL UD QPs on eHCA2
From: Stefan Roscher [EMAIL PROTECTED]

Signed-off-by: Joachim Fenkes [EMAIL PROTECTED]
---
 drivers/infiniband/hw/ehca/ehca_qp.c |   11 +++++++++++
 1 files changed, 11 insertions(+), 0 deletions(-)

diff --git a/drivers/infiniband/hw/ehca/ehca_qp.c b/drivers/infiniband/hw/ehca/ehca_qp.c
index ffd1ce9..cbb8b5b 100644
--- a/drivers/infiniband/hw/ehca/ehca_qp.c
+++ b/drivers/infiniband/hw/ehca/ehca_qp.c
@@ -1054,6 +1054,17 @@ static int internal_modify_qp(struct ib_qp *ibqp,
 		 "ehca_qp=%p qp_num=%x <VALID STATE CHANGE> qp_state_xsit=%x",
 		 my_qp, ibqp->qp_num, statetrans);
 
+	/* eHCA2 rev2 and higher require the SEND_GRH_FLAG to be set
+	 * in non-LL UD QPs.
+	 */
+	if ((my_qp->qp_type == IB_QPT_UD) &&
+	    (my_qp->ext_type != EQPT_LLQP) &&
+	    (statetrans == IB_QPST_INIT2RTR) &&
+	    (shca->hw_level >= 0x22)) {
+		update_mask |= EHCA_BMASK_SET(MQPCB_MASK_SEND_GRH_FLAG, 1);
+		mqpcb->send_grh_flag = 1;
+	}
+
 	/* sqe -> rts: set purge bit of bad wqe before actual trans */
 	if ((my_qp->qp_type == IB_QPT_UD ||
 	     my_qp->qp_type == IB_QPT_GSI ||
-- 
1.5.2
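[Editor's sketch] The condition added by this patch can be read as a single predicate over four inputs. The snippet below restates it in isolation, purely as an illustration: the enum names and values are hypothetical stand-ins for the driver's IB_QPT_*, EQPT_* and IB_QPST_* constants, and sketch_needs_send_grh() does not exist in the driver. The only value taken from the patch is the 0x22 threshold, which the patch comment describes as "eHCA2 rev2 and higher".

```c
#include <stdbool.h>

/* Hypothetical stand-ins for the driver's enums; values are illustrative. */
enum sketch_qp_type  { SKETCH_QPT_RC, SKETCH_QPT_UD };
enum sketch_ext_type { SKETCH_EQPT_NORMAL, SKETCH_EQPT_LLQP };
enum sketch_trans    { SKETCH_INIT2RTR, SKETCH_OTHER_TRANS };

/* Threshold from the patch: hw_level 0x22 and above. */
#define SKETCH_HW_LEVEL_EHCA2_REV2 0x22

/* True when the SEND_GRH flag would be set: a UD QP that is not a
 * low-latency QP, moving INIT -> RTR, on new enough hardware. */
static bool sketch_needs_send_grh(enum sketch_qp_type qp_type,
				  enum sketch_ext_type ext_type,
				  enum sketch_trans trans,
				  int hw_level)
{
	return qp_type == SKETCH_QPT_UD &&
	       ext_type != SKETCH_EQPT_LLQP &&
	       trans == SKETCH_INIT2RTR &&
	       hw_level >= SKETCH_HW_LEVEL_EHCA2_REV2;
}
```

Keeping the check in one place like this is just a readability choice for the sketch; the patch itself inlines the condition in internal_modify_qp().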
[PATCH 04/13] IB/ehca: add Shared Receive Queue support
Support SRQs on eHCA2. Since an SRQ is a QP for eHCA2, a lot of code
(structures, create, destroy, post_recv) can be shared between QP and SRQ.

Signed-off-by: Joachim Fenkes [EMAIL PROTECTED]
---
 drivers/infiniband/hw/ehca/ehca_classes.h         |   26 +-
 drivers/infiniband/hw/ehca/ehca_classes_pSeries.h |    4 +-
 drivers/infiniband/hw/ehca/ehca_iverbs.h          |   15 +
 drivers/infiniband/hw/ehca/ehca_main.c            |   16 +-
 drivers/infiniband/hw/ehca/ehca_qp.c              |  451 +
 drivers/infiniband/hw/ehca/ehca_reqs.c            |   47 ++-
 drivers/infiniband/hw/ehca/ehca_uverbs.c          |    4 +-
 drivers/infiniband/hw/ehca/hcp_if.c               |   23 +-
 drivers/infiniband/hw/ehca/hipz_hw.h              |    1 +
 9 files changed, 480 insertions(+), 107 deletions(-)

diff --git a/drivers/infiniband/hw/ehca/ehca_classes.h b/drivers/infiniband/hw/ehca/ehca_classes.h
index 6e75db6..9d689ae 100644
--- a/drivers/infiniband/hw/ehca/ehca_classes.h
+++ b/drivers/infiniband/hw/ehca/ehca_classes.h
@@ -5,6 +5,7 @@
  *
  *  Authors: Heiko J Schick [EMAIL PROTECTED]
  *           Christoph Raisch [EMAIL PROTECTED]
+ *           Joachim Fenkes [EMAIL PROTECTED]
  *
  *  Copyright (c) 2005 IBM Corporation
  *
@@ -117,9 +118,20 @@ struct ehca_pd {
 	u32 ownpid;
 };
 
+enum ehca_ext_qp_type {
+	EQPT_NORMAL    = 0,
+	EQPT_LLQP      = 1,
+	EQPT_SRQBASE   = 2,
+	EQPT_SRQ       = 3,
+};
+
 struct ehca_qp {
-	struct ib_qp ib_qp;
+	union {
+		struct ib_qp ib_qp;
+		struct ib_srq ib_srq;
+	};
 	u32 qp_type;
+	enum ehca_ext_qp_type ext_type;
 	struct ipz_queue ipz_squeue;
 	struct ipz_queue ipz_rqueue;
 	struct h_galpas galpas;
@@ -142,6 +154,10 @@ struct ehca_qp {
 	u32 mm_count_galpa;
 };
 
+#define IS_SRQ(qp) (qp->ext_type == EQPT_SRQ)
+#define HAS_SQ(qp) (qp->ext_type != EQPT_SRQ)
+#define HAS_RQ(qp) (qp->ext_type != EQPT_SRQBASE)
+
 /* must be power of 2 */
 #define QP_HASHTAB_LEN 8
 
@@ -307,6 +323,7 @@ struct ehca_create_qp_resp {
 	u32 qp_num;
 	u32 token;
 	u32 qp_type;
+	u32 ext_type;
 	u32 qkey;
 	/* qp_num assigned by ehca: sqp0/1 may have got different numbers */
 	u32 real_qp_num;
@@ -329,13 +346,6 @@ enum ehca_service_type {
 	ST_UD = 3,
 };
 
-enum ehca_ext_qp_type {
-	EQPT_NORMAL    = 0,
-	EQPT_LLQP      = 1,
-	EQPT_SRQBASE   = 2,
-	EQPT_SRQ       = 3,
-};
-
 enum ehca_ll_comp_flags {
 	LLQP_SEND_COMP = 0x20,
 	LLQP_RECV_COMP = 0x40,
diff --git a/drivers/infiniband/hw/ehca/ehca_classes_pSeries.h b/drivers/infiniband/hw/ehca/ehca_classes_pSeries.h
index 5665f21..fb3df5c 100644
--- a/drivers/infiniband/hw/ehca/ehca_classes_pSeries.h
+++ b/drivers/infiniband/hw/ehca/ehca_classes_pSeries.h
@@ -228,8 +228,8 @@ struct hcp_modify_qp_control_block {
 #define MQPCB_QP_NUMBER                     EHCA_BMASK_IBM(8,31)
 #define MQPCB_MASK_QP_ENABLE                EHCA_BMASK_IBM(48,48)
 #define MQPCB_QP_ENABLE                     EHCA_BMASK_IBM(31,31)
-#define MQPCB_MASK_CURR_SQR_LIMIT           EHCA_BMASK_IBM(49,49)
-#define MQPCB_CURR_SQR_LIMIT                EHCA_BMASK_IBM(15,31)
+#define MQPCB_MASK_CURR_SRQ_LIMIT           EHCA_BMASK_IBM(49,49)
+#define MQPCB_CURR_SRQ_LIMIT                EHCA_BMASK_IBM(16,31)
 #define MQPCB_MASK_QP_AFF_ASYN_EV_LOG_REG   EHCA_BMASK_IBM(50,50)
 #define MQPCB_MASK_SHARED_RQ_HNDL           EHCA_BMASK_IBM(51,51)
 
diff --git a/drivers/infiniband/hw/ehca/ehca_iverbs.h b/drivers/infiniband/hw/ehca/ehca_iverbs.h
index 37e7fe0..fd84a80 100644
--- a/drivers/infiniband/hw/ehca/ehca_iverbs.h
+++ b/drivers/infiniband/hw/ehca/ehca_iverbs.h
@@ -154,6 +154,21 @@ int ehca_post_send(struct ib_qp *qp, struct ib_send_wr *send_wr,
 
 int ehca_post_recv(struct ib_qp *qp, struct ib_recv_wr *recv_wr,
 		   struct ib_recv_wr **bad_recv_wr);
 
+int ehca_post_srq_recv(struct ib_srq *srq,
+		       struct ib_recv_wr *recv_wr,
+		       struct ib_recv_wr **bad_recv_wr);
+
+struct ib_srq *ehca_create_srq(struct ib_pd *pd,
+			       struct ib_srq_init_attr *init_attr,
+			       struct ib_udata *udata);
+
+int ehca_modify_srq(struct ib_srq *srq, struct ib_srq_attr *attr,
+		    enum ib_srq_attr_mask attr_mask, struct ib_udata *udata);
+
+int ehca_query_srq(struct ib_srq *srq, struct ib_srq_attr *srq_attr);
+
+int ehca_destroy_srq(struct ib_srq *srq);
+
 u64 ehca_define_sqp(struct ehca_shca *shca, struct ehca_qp *ibqp,
 		    struct ib_qp_init_attr *qp_init_attr);
diff --git a/drivers/infiniband/hw/ehca/ehca_main.c b/drivers/infiniband/hw/ehca/ehca_main.c
index befbb9c..9bd749c 100644
--- a/drivers/infiniband/hw/ehca/ehca_main.c
+++ b/drivers/infiniband/hw/ehca/ehca_main.c
@@ -343,7 +343,7 @@ int ehca_init_device(struct ehca_shca *shca
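For readers following the "an SRQ is a QP" idea outside the kernel tree, here is a minimal user-space sketch of how the anonymous union plus ext_type lets one structure serve both verbs objects. Only the enum, the union layout and the three macros come from the patch. The reduced struct ib_qp / struct ib_srq stand-ins, the local container_of(), and the helpers shared_post_recv(), post_recv_via_qp() and post_recv_via_srq() are hypothetical names made up for illustration, and the macro arguments are parenthesized here for safety.

```c
#include <assert.h>
#include <stddef.h>

/* Mock stand-ins for the kernel's struct ib_qp / struct ib_srq. */
struct ib_qp  { unsigned int qp_num; };
struct ib_srq { unsigned int srq_limit; };

enum ehca_ext_qp_type {
	EQPT_NORMAL  = 0,
	EQPT_LLQP    = 1,
	EQPT_SRQBASE = 2,
	EQPT_SRQ     = 3,
};

/* Reduced struct ehca_qp: only the fields this sketch needs. */
struct ehca_qp {
	union {
		struct ib_qp  ib_qp;
		struct ib_srq ib_srq;
	};
	enum ehca_ext_qp_type ext_type;
};

#define IS_SRQ(qp) ((qp)->ext_type == EQPT_SRQ)
#define HAS_SQ(qp) ((qp)->ext_type != EQPT_SRQ)
#define HAS_RQ(qp) ((qp)->ext_type != EQPT_SRQBASE)

/* User-space equivalent of the kernel's container_of(). */
#define container_of(ptr, type, member) \
	((type *)((char *)(ptr) - offsetof(type, member)))

/* Hypothetical shared helper: both entry points end up here. */
static int shared_post_recv(struct ehca_qp *qp)
{
	assert(qp != NULL);
	/* Per HAS_RQ(), an EQPT_SRQBASE object has no receive queue. */
	return HAS_RQ(qp) ? 0 : -1;
}

/* QP-facing entry point: recover the ehca_qp from the embedded ib_qp. */
static int post_recv_via_qp(struct ib_qp *ibqp)
{
	return shared_post_recv(container_of(ibqp, struct ehca_qp, ib_qp));
}

/* SRQ-facing entry point: same structure, reached through ib_srq. */
static int post_recv_via_srq(struct ib_srq *ibsrq)
{
	return shared_post_recv(container_of(ibsrq, struct ehca_qp, ib_srq));
}
```

Because both union members sit at the same offset, either embedded pointer leads back to the same struct ehca_qp, which is what allows the create, destroy and post_recv paths to be shared.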
[PATCH 11/13] IB/ehca: return QP pointer in poll_cq(), add two unlikely() statements
Signed-off-by: Joachim Fenkes [EMAIL PROTECTED]
---
 drivers/infiniband/hw/ehca/ehca_reqs.c |   11 ++++++++---
 1 files changed, 8 insertions(+), 3 deletions(-)

diff --git a/drivers/infiniband/hw/ehca/ehca_reqs.c b/drivers/infiniband/hw/ehca/ehca_reqs.c
index 73f0c06..fd3ba22 100644
--- a/drivers/infiniband/hw/ehca/ehca_reqs.c
+++ b/drivers/infiniband/hw/ehca/ehca_reqs.c
@@ -517,6 +517,7 @@ static inline int ehca_poll_cq_one(struct ib_cq *cq, struct ib_wc *wc)
 	int ret = 0;
 	struct ehca_cq *my_cq = container_of(cq, struct ehca_cq, ib_cq);
 	struct ehca_cqe *cqe;
+	struct ehca_qp *my_qp;
 	int cqe_count = 0;
 
 poll_cq_one_read_cqe:
@@ -568,7 +569,7 @@ poll_cq_one_read_cqe:
 	}
 
 	/* tracing cqe */
-	if (ehca_debug_level) {
+	if (unlikely(ehca_debug_level)) {
 		ehca_dbg(cq->device,
 			 "Received COMPLETION ehca_cq=%p cq_num=%x -",
 			 my_cq, my_cq->cq_number);
@@ -602,7 +603,11 @@ poll_cq_one_read_cqe:
 	} else
 		wc->status = IB_WC_SUCCESS;
 
-	wc->qp = NULL;
+	read_lock(&ehca_qp_idr_lock);
+	my_qp = idr_find(&ehca_qp_idr, cqe->qp_token);
+	wc->qp = &my_qp->ib_qp;
+	read_unlock(&ehca_qp_idr_lock);
+
 	wc->byte_len = cqe->nr_bytes_transferred;
 	wc->pkey_index = cqe->pkey_index;
 	wc->slid = cqe->rlid;
@@ -612,7 +617,7 @@ poll_cq_one_read_cqe:
 	wc->imm_data = cpu_to_be32(cqe->immediate_data);
 	wc->sl = cqe->service_level;
 
-	if (wc->status != IB_WC_SUCCESS)
+	if (unlikely(wc->status != IB_WC_SUCCESS))
 		ehca_dbg(cq->device,
 			 "ehca_cq=%p cq_num=%x WARNING unsuccessful cqe "
 			 "OPType=%x status=%x qp_num=%x src_qp=%x wr_id=%lx "
-- 
1.5.2
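As a rough illustration of the two ideas in this patch (resolving the CQE's QP token to a QP pointer for wc->qp, and marking rare branches with unlikely()), here is a single-threaded user-space sketch. It is not the driver code: the fixed-size mock_qp_table stands in for ehca_qp_idr, mock_find() and fill_wc_qp() are hypothetical names, the structs are reduced mocks, and the read_lock()/read_unlock() pair that brackets the lookup in the patch is left out. The NULL check is an assumption added for the sketch; the patch as posted stores the result without one.

```c
#include <assert.h>
#include <stddef.h>

/* Mirrors the kernel's usual unlikely() definition (GCC/Clang builtin). */
#define unlikely(x) __builtin_expect(!!(x), 0)

/* Reduced mock types; the real structures carry many more fields. */
struct ib_qp   { unsigned int qp_num; };
struct ehca_qp { struct ib_qp ib_qp; };
struct ib_wc   { struct ib_qp *qp; int status; };

#define MOCK_TABLE_LEN 8

/* Hypothetical stand-in for ehca_qp_idr: token -> QP pointer. */
static struct ehca_qp *mock_qp_table[MOCK_TABLE_LEN];

/* Hypothetical stand-in for idr_find(): NULL when the token is unknown. */
static struct ehca_qp *mock_find(unsigned int token)
{
	return token < MOCK_TABLE_LEN ? mock_qp_table[token] : NULL;
}

/*
 * Fill wc->qp from a QP token. Returns 0 on success, -1 if the token
 * does not resolve (that check is an addition of this sketch).
 */
static int fill_wc_qp(struct ib_wc *wc, unsigned int qp_token)
{
	struct ehca_qp *my_qp;

	assert(wc != NULL);

	/* The patch takes read_lock(&ehca_qp_idr_lock) around this lookup. */
	my_qp = mock_find(qp_token);
	if (unlikely(!my_qp)) {
		wc->qp = NULL;
		return -1;
	}

	wc->qp = &my_qp->ib_qp;
	return 0;
}
```

The unlikely() hint only influences how the compiler lays out the branch; it does not change what the code computes, which is why the patch can wrap the debug-level and error-status tests without altering behavior.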