Re: [PATCH 1/3] IB/ehca: Replace vmalloc with kmalloc

2009-04-28 Thread Stefan Roscher
On Tuesday 28 April 2009 05:12:51 pm Dave Hansen wrote:
 On Tue, 2009-04-21 at 17:16 +0200, Stefan Roscher wrote:
  From: Anton Blanchard antonb at au1.ibm.com
  
  To improve performance of driver resource allocation,
  replace the vmalloc() call with kmalloc().
 
 Just curious, but how big are these allocations?  Why was vmalloc() even
 ever used if we know they'll be small?
 
 -- Dave
 
 

The theoretical maximum size is 512k, but common queue pairs use less
than 128k. Because of the theoretical maximum we implemented vmalloc()
first, but then noticed a huge performance impact.

-- Stefan 
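
To put those numbers in perspective, here is a rough sketch of the arithmetic. It assumes the 512k and 128k figures refer to the page pointer array that the patch allocates as nr_of_pages * sizeof(void *); the helper names are hypothetical and not part of the driver. With 8-byte pointers, 512k would correspond to 65536 page pointers and 128k to 16384.

```c
#include <stddef.h>

/* Hypothetical helper: bytes needed for the array of per-page pointers,
 * computed the same way as in the patch: nr_of_pages * sizeof(void *). */
size_t page_list_bytes(size_t nr_of_pages)
{
	return nr_of_pages * sizeof(void *);
}

/* Hypothetical helper: number of page pointers that fit into an
 * allocation of alloc_bytes bytes. */
size_t page_list_entries(size_t alloc_bytes)
{
	return alloc_bytes / sizeof(void *);
}
```

Calling page_list_entries(512 * 1024) gives the largest pointer count that still fits the theoretical maximum on the machine the sketch is compiled for.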

___
Linuxppc-dev mailing list
Linuxppc-dev@ozlabs.org
https://ozlabs.org/mailman/listinfo/linuxppc-dev


Re: [PATCH 1/3] IB/ehca: Replace vmalloc with kmalloc

2009-04-22 Thread Stefan Roscher
Hi Roland, 
thanks for the quick review. I was hoping you could apply these changes 
for 2.6.30 because this will be the codebase for the next OFED release.
The patch is well tested in an HPC environment and we haven't seen any 
issues.
Regarding Anton's patch, you are right. If a user allocates an 
unrealistically large queue pair, kmalloc() may not be able to allocate 
the memory. In this case we return ENOMEM to the user, so the kernel is 
not affected at all. We plan to add a vmalloc() fallback for the case 
where kmalloc() fails in the next kernel release.
 
Kind regards
 
Stefan Roscher
 
eHCA/eHEA Linux Driver Development
IBM Systems Technology Group, Systems Software Development / FW I/O 
Firmware Development 2
---
IBM Deutschland
Schoenaicher Str. 220
71032 Boeblingen
Phone: +49-7031-16-2015
E-Mail: stefan.rosc...@de.ibm.com
---
IBM Deutschland Research & Development GmbH / Chairman of the 
Supervisory Board: Martin Jetter
Management: Herbert Kircher
Registered office: Böblingen / Registration court: Amtsgericht Stuttgart, 
HRB 243294



From:
Roland Dreier rdre...@cisco.com
To:
Stefan Roscher ossro...@linux.vnet.ibm.com
Cc:
LinuxPPC-Dev linuxppc-dev@ozlabs.org, LKML 
linux-ker...@vger.kernel.org, OF-EWG e...@lists.openfabrics.org, 
Roland Dreier rola...@cisco.com, Joachim Fenkes/Germany/i...@ibmde, 
Christoph Raisch/Germany/i...@ibmde, Alexander Schmidt1/Germany/i...@ibmde, 
Stefan Roscher/Germany/i...@ibmde, Hoang-Nam Nguyen/Germany/i...@ibmde
Date:
21.04.2009 19:34
Subject:
Re: [PATCH 1/3] IB/ehca: Replace vmalloc with kmalloc



  + queue->queue_pages = kmalloc(nr_of_pages * sizeof(void *), GFP_KERNEL);

How big might this buffer be?  Any chance of allocation failure due to
memory fragmentation?

 - R.



Re: [PATCH 1/3] IB/ehca: Replace vmalloc with kmalloc

2009-04-22 Thread Stefan Roscher
In case of large queue pairs there is the possibility of allocation failures
due to memory fragmentation with kmalloc(). To ensure the memory is allocated
even if kmalloc() cannot find chunks that are big enough, we fall back to
allocating the memory with vmalloc().

Signed-off-by: Stefan Roscher stefan.rosc...@de.ibm.com
---

On Tuesday 21 April 2009 07:34:30 pm Roland Dreier wrote:
   +  queue->queue_pages = kmalloc(nr_of_pages * sizeof(void *), GFP_KERNEL);
 
 How big might this buffer be?  Any chance of allocation failure due to
 memory fragmentation?
 
  - R.
Hey Roland, 
yes, you are right; here is the patch to work around the problem you described.
It applies on top of the patchset.
regards Stefan


 
 drivers/infiniband/hw/ehca/ipz_pt_fn.c |   17 +
 1 files changed, 13 insertions(+), 4 deletions(-)

diff --git a/drivers/infiniband/hw/ehca/ipz_pt_fn.c 
b/drivers/infiniband/hw/ehca/ipz_pt_fn.c
index a260559..1227c59 100644
--- a/drivers/infiniband/hw/ehca/ipz_pt_fn.c
+++ b/drivers/infiniband/hw/ehca/ipz_pt_fn.c
@@ -222,8 +222,11 @@ int ipz_queue_ctor(struct ehca_pd *pd, struct ipz_queue *queue,
 	/* allocate queue page pointers */
 	queue->queue_pages = kmalloc(nr_of_pages * sizeof(void *), GFP_KERNEL);
 	if (!queue->queue_pages) {
-		ehca_gen_err("Couldn't allocate queue page list");
-		return 0;
+		queue->queue_pages = vmalloc(nr_of_pages * sizeof(void *));
+		if (!queue->queue_pages) {
+			ehca_gen_err("Couldn't allocate queue page list");
+			return 0;
+		}
 	}
 	memset(queue->queue_pages, 0, nr_of_pages * sizeof(void *));
 
@@ -240,7 +243,10 @@ int ipz_queue_ctor(struct ehca_pd *pd, struct ipz_queue *queue,
 ipz_queue_ctor_exit0:
 	ehca_gen_err("Couldn't alloc pages queue=%p "
 		 "nr_of_pages=%x",  queue, nr_of_pages);
-	kfree(queue->queue_pages);
+	if (is_vmalloc_addr(queue->queue_pages))
+		vfree(queue->queue_pages);
+	else
+		kfree(queue->queue_pages);
 
 	return 0;
 }
@@ -262,7 +268,10 @@ int ipz_queue_dtor(struct ehca_pd *pd, struct ipz_queue *queue)
 			free_page((unsigned long)queue->queue_pages[i]);
 	}
 
-	kfree(queue->queue_pages);
+	if (is_vmalloc_addr(queue->queue_pages))
+		vfree(queue->queue_pages);
+	else
+		kfree(queue->queue_pages);
 
 	return 1;
 }
-- 
1.5.5
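
For readers who want to experiment with the pattern outside the kernel, here is a minimal userspace sketch of the same idea: try the fast allocator first, fall back to the slow one, and free with the matching routine. Everything in it is a hypothetical stand-in. fast_alloc() and slow_alloc() only imitate kmalloc() and vmalloc() on top of malloc(), FAST_ALLOC_LIMIT is an arbitrary threshold, and the sketch stores a tag in the struct because userspace has no equivalent of is_vmalloc_addr().

```c
#include <stdlib.h>
#include <string.h>

/* Arbitrary threshold above which the fast allocator "fails" (assumption). */
#define FAST_ALLOC_LIMIT (128 * 1024)

enum alloc_kind { ALLOC_NONE, ALLOC_FAST, ALLOC_SLOW };

struct page_list {
	void **pages;
	size_t nr_of_pages;
	enum alloc_kind kind;	/* records which allocator served the request */
};

/* Stand-in for kmalloc(): refuses large requests, as a fragmented
 * system might. */
static void *fast_alloc(size_t size)
{
	return size <= FAST_ALLOC_LIMIT ? malloc(size) : NULL;
}

/* Stand-in for vmalloc(): slower in the kernel, but works for large sizes. */
static void *slow_alloc(size_t size)
{
	return malloc(size);
}

/* Returns 1 on success and 0 on failure, like ipz_queue_ctor(). */
int page_list_ctor(struct page_list *pl, size_t nr_of_pages)
{
	size_t size = nr_of_pages * sizeof(void *);

	pl->kind = ALLOC_FAST;
	pl->pages = fast_alloc(size);
	if (!pl->pages) {
		/* fast path failed, fall back to the slow allocator */
		pl->kind = ALLOC_SLOW;
		pl->pages = slow_alloc(size);
	}
	if (!pl->pages) {
		pl->kind = ALLOC_NONE;
		pl->nr_of_pages = 0;
		return 0;
	}
	memset(pl->pages, 0, size);
	pl->nr_of_pages = nr_of_pages;
	return 1;
}

void page_list_dtor(struct page_list *pl)
{
	switch (pl->kind) {
	case ALLOC_SLOW:
		free(pl->pages);	/* the kernel code calls vfree() here */
		break;
	case ALLOC_FAST:
		free(pl->pages);	/* the kernel code calls kfree() here */
		break;
	case ALLOC_NONE:
		break;
	}
	pl->pages = NULL;
	pl->nr_of_pages = 0;
	pl->kind = ALLOC_NONE;
}
```

The point of the sketch is only that the free path has to know which allocator was used; the patch above gets that information from the address itself instead of a stored tag.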






Re: [PATCH 1/3] IB/ehca: Replace vmalloc with kmalloc

2009-04-22 Thread Stefan Roscher
On Wednesday 22 April 2009 04:10:18 pm michael wrote:
 Hi,
 

 I don't get the point: if it is not important, use vmalloc. Why do you try 
 kmalloc first? And why not use kzalloc?

Because kmalloc() is faster than vmalloc(), which is a huge performance win
when someone allocates a large number of queue pairs. We fall back to
vmalloc() only if kmalloc() can't deliver the memory chunk.
We don't need kzalloc because we fill the list right after the alloc.

regards Stefan



[PATCH 0/3] IB/ehca: Performance improvement for creation of queue pairs

2009-04-21 Thread Stefan Roscher
This patchset contains performance improvements for the ehca driver.
It skips code which is not necessary for userspace queue pairs
and replaces vmalloc() calls with kmalloc().
Because of this fundamental code change we also increment the version number.

They should apply cleanly against 2.6.30 git tree.

Thanks
Stefan


[PATCH 1/3] IB/ehca: Replace vmalloc with kmalloc

2009-04-21 Thread Stefan Roscher
From: Anton Blanchard antonb at au1.ibm.com

To improve performance of driver resource allocation,
replace the vmalloc() call with kmalloc().

Signed-off-by: Stefan Roscher stefan.roscher at de.ibm.com
---
 drivers/infiniband/hw/ehca/ipz_pt_fn.c |6 +++---
 1 files changed, 3 insertions(+), 3 deletions(-)

diff --git a/drivers/infiniband/hw/ehca/ipz_pt_fn.c 
b/drivers/infiniband/hw/ehca/ipz_pt_fn.c
index c3a3284..a260559 100644
--- a/drivers/infiniband/hw/ehca/ipz_pt_fn.c
+++ b/drivers/infiniband/hw/ehca/ipz_pt_fn.c
@@ -220,7 +220,7 @@ int ipz_queue_ctor(struct ehca_pd *pd, struct ipz_queue *queue,
 	queue->small_page = NULL;
 
 	/* allocate queue page pointers */
-	queue->queue_pages = vmalloc(nr_of_pages * sizeof(void *));
+	queue->queue_pages = kmalloc(nr_of_pages * sizeof(void *), GFP_KERNEL);
 	if (!queue->queue_pages) {
 		ehca_gen_err("Couldn't allocate queue page list");
 		return 0;
@@ -240,7 +240,7 @@ int ipz_queue_ctor(struct ehca_pd *pd, struct ipz_queue *queue,
 ipz_queue_ctor_exit0:
 	ehca_gen_err("Couldn't alloc pages queue=%p "
 		 "nr_of_pages=%x",  queue, nr_of_pages);
-	vfree(queue->queue_pages);
+	kfree(queue->queue_pages);
 
 	return 0;
 }
@@ -262,7 +262,7 @@ int ipz_queue_dtor(struct ehca_pd *pd, struct ipz_queue *queue)
 			free_page((unsigned long)queue->queue_pages[i]);
 	}
 
-	vfree(queue->queue_pages);
+	kfree(queue->queue_pages);
 
 	return 1;
 }
-- 
1.5.5



[PATCH 2/3] IB/ehca: Remove unnecessary memory operations for userspace queue pairs

2009-04-21 Thread Stefan Roscher
The queue map for flush completion circumvention is only used for
kernel space queue pairs. This patch skips the allocation of the queue maps
in case the QP is created for userspace. In addition, this patch
does not iomap the galpas for kernel usage if the queue pair is only used
in userspace. These changes will improve the performance of creation
of userspace queue pairs.

Signed-off-by: Stefan Roscher stefan.roscher at de.ibm.com
---
 drivers/infiniband/hw/ehca/ehca_qp.c  |   94 ++--
 drivers/infiniband/hw/ehca/hcp_if.c   |6 +-
 drivers/infiniband/hw/ehca/hcp_if.h   |2 +-
 drivers/infiniband/hw/ehca/hcp_phyp.c |   11 +++--
 drivers/infiniband/hw/ehca/hcp_phyp.h |2 +-
 5 files changed, 65 insertions(+), 50 deletions(-)

diff --git a/drivers/infiniband/hw/ehca/ehca_qp.c 
b/drivers/infiniband/hw/ehca/ehca_qp.c
index 00c1081..ead4e71 100644
--- a/drivers/infiniband/hw/ehca/ehca_qp.c
+++ b/drivers/infiniband/hw/ehca/ehca_qp.c
@@ -461,7 +461,7 @@ static struct ehca_qp *internal_create_qp(
  ib_device);
struct ib_ucontext *context = NULL;
u64 h_ret;
-   int is_llqp = 0, has_srq = 0;
+   int is_llqp = 0, has_srq = 0, is_user = 0;
int qp_type, max_send_sge, max_recv_sge, ret;
 
/* h_call's out parameters */
@@ -609,9 +609,6 @@ static struct ehca_qp *internal_create_qp(
}
}
 
-   if (pd-uobject  udata)
-   context = pd-uobject-context;
-
my_qp = kmem_cache_zalloc(qp_cache, GFP_KERNEL);
if (!my_qp) {
ehca_err(pd-device, pd=%p not enough memory to alloc qp, pd);
@@ -619,6 +616,11 @@ static struct ehca_qp *internal_create_qp(
return ERR_PTR(-ENOMEM);
}
 
+   if (pd-uobject  udata) {
+   is_user = 1;
+   context = pd-uobject-context;
+   }
+
atomic_set(my_qp-nr_events, 0);
init_waitqueue_head(my_qp-wait_completion);
spin_lock_init(my_qp-spinlock_s);
@@ -707,7 +709,7 @@ static struct ehca_qp *internal_create_qp(
(parms.squeue.is_small || parms.rqueue.is_small);
}
 
-   h_ret = hipz_h_alloc_resource_qp(shca-ipz_hca_handle, parms);
+   h_ret = hipz_h_alloc_resource_qp(shca-ipz_hca_handle, parms, is_user);
if (h_ret != H_SUCCESS) {
ehca_err(pd-device, h_alloc_resource_qp() failed h_ret=%lli,
 h_ret);
@@ -769,18 +771,20 @@ static struct ehca_qp *internal_create_qp(
goto create_qp_exit2;
}
 
-   my_qp-sq_map.entries = my_qp-ipz_squeue.queue_length /
-my_qp-ipz_squeue.qe_size;
-   my_qp-sq_map.map = vmalloc(my_qp-sq_map.entries *
-   sizeof(struct ehca_qmap_entry));
-   if (!my_qp-sq_map.map) {
-   ehca_err(pd-device, Couldn't allocate squeue 
-map ret=%i, ret);
-   goto create_qp_exit3;
+   if (!is_user) {
+   my_qp-sq_map.entries = my_qp-ipz_squeue.queue_length /
+   my_qp-ipz_squeue.qe_size;
+   my_qp-sq_map.map = vmalloc(my_qp-sq_map.entries *
+   sizeof(struct 
ehca_qmap_entry));
+   if (!my_qp-sq_map.map) {
+   ehca_err(pd-device, Couldn't allocate squeue 
+map ret=%i, ret);
+   goto create_qp_exit3;
+   }
+   INIT_LIST_HEAD(my_qp-sq_err_node);
+   /* to avoid the generation of bogus flush CQEs */
+   reset_queue_map(my_qp-sq_map);
}
-   INIT_LIST_HEAD(my_qp-sq_err_node);
-   /* to avoid the generation of bogus flush CQEs */
-   reset_queue_map(my_qp-sq_map);
}
 
if (HAS_RQ(my_qp)) {
@@ -792,20 +796,21 @@ static struct ehca_qp *internal_create_qp(
 and pages ret=%i, ret);
goto create_qp_exit4;
}
-
-   my_qp-rq_map.entries = my_qp-ipz_rqueue.queue_length /
-   my_qp-ipz_rqueue.qe_size;
-   my_qp-rq_map.map = vmalloc(my_qp-rq_map.entries *
-   sizeof(struct ehca_qmap_entry));
-   if (!my_qp-rq_map.map) {
-   ehca_err(pd-device, Couldn't allocate squeue 
-   map ret=%i, ret);
-   goto create_qp_exit5;
+   if (!is_user) {
+   my_qp-rq_map.entries = my_qp-ipz_rqueue.queue_length /
+   my_qp-ipz_rqueue.qe_size;
+   my_qp-rq_map.map = vmalloc(my_qp-rq_map.entries

[PATCH 3/3] IB/ehca: Increment version number

2009-04-21 Thread Stefan Roscher
Signed-off-by: Stefan Roscher stefan.rosc...@de.ibm.com

---
 drivers/infiniband/hw/ehca/ehca_main.c |2 +-
 1 files changed, 1 insertions(+), 1 deletions(-)

diff --git a/drivers/infiniband/hw/ehca/ehca_main.c 
b/drivers/infiniband/hw/ehca/ehca_main.c
index 368311c..85905ab 100644
--- a/drivers/infiniband/hw/ehca/ehca_main.c
+++ b/drivers/infiniband/hw/ehca/ehca_main.c
@@ -52,7 +52,7 @@
 #include ehca_tools.h
 #include hcp_if.h
 
-#define HCAD_VERSION 0026
+#define HCAD_VERSION 0027
 
 MODULE_LICENSE(Dual BSD/GPL);
 MODULE_AUTHOR(Christoph Raisch rai...@de.ibm.com);
-- 
1.5.5



[PATCH] IB/ehca: replace modulus operations in flush error completion path

2008-12-02 Thread Stefan Roscher
With the latest flush error completion patch we introduced a modulus operation
to calculate the next index within a qmap. Based on comments from other
mailing lists we decided to optimize this operation by using an addition and
an if-statement instead of modulus, even though this is in the error path.

Signed-off-by: Stefan Roscher [EMAIL PROTECTED]
---
 drivers/infiniband/hw/ehca/ehca_classes.h |7 +++
 drivers/infiniband/hw/ehca/ehca_qp.c  |   12 ++--
 drivers/infiniband/hw/ehca/ehca_reqs.c|   13 ++---
 3 files changed, 19 insertions(+), 13 deletions(-)

diff --git a/drivers/infiniband/hw/ehca/ehca_classes.h 
b/drivers/infiniband/hw/ehca/ehca_classes.h
index 7fc35cf..c825142 100644
--- a/drivers/infiniband/hw/ehca/ehca_classes.h
+++ b/drivers/infiniband/hw/ehca/ehca_classes.h
@@ -175,6 +175,13 @@ struct ehca_queue_map {
unsigned int next_wqe_idx;   /* Idx to first wqe to be flushed */
 };
 
+/* function to calculate the next index for the qmap */
+static inline unsigned int next_index(unsigned int cur_index, unsigned int limit)
+{
+   unsigned int temp = cur_index + 1;
+   return (temp == limit) ? 0 : temp;
+}
+
 struct ehca_qp {
union {
struct ib_qp ib_qp;
diff --git a/drivers/infiniband/hw/ehca/ehca_qp.c 
b/drivers/infiniband/hw/ehca/ehca_qp.c
index cadbf0c..f161cf1 100644
--- a/drivers/infiniband/hw/ehca/ehca_qp.c
+++ b/drivers/infiniband/hw/ehca/ehca_qp.c
@@ -1138,14 +1138,14 @@ static int calc_left_cqes(u64 wqe_p, struct ipz_queue 
*ipz_queue,
return -EFAULT;
}
 
-   tail_idx = (qmap-tail + 1) % qmap-entries;
+   tail_idx = next_index(qmap-tail, qmap-entries);
wqe_idx = q_ofs / ipz_queue-qe_size;
 
/* check all processed wqes, whether a cqe is requested or not */
while (tail_idx != wqe_idx) {
if (qmap-map[tail_idx].cqe_req)
qmap-left_to_poll++;
-   tail_idx = (tail_idx + 1) % qmap-entries;
+   tail_idx = next_index(tail_idx, qmap-entries);
}
/* save index in queue, where we have to start flushing */
qmap-next_wqe_idx = wqe_idx;
@@ -1195,14 +1195,14 @@ static int check_for_left_cqes(struct ehca_qp *my_qp, 
struct ehca_shca *shca)
} else {
spin_lock_irqsave(my_qp-send_cq-spinlock, flags);
my_qp-sq_map.left_to_poll = 0;
-   my_qp-sq_map.next_wqe_idx = (my_qp-sq_map.tail + 1) %
-   my_qp-sq_map.entries;
+   my_qp-sq_map.next_wqe_idx = next_index(my_qp-sq_map.tail,
+   my_qp-sq_map.entries);
spin_unlock_irqrestore(my_qp-send_cq-spinlock, flags);
 
spin_lock_irqsave(my_qp-recv_cq-spinlock, flags);
my_qp-rq_map.left_to_poll = 0;
-   my_qp-rq_map.next_wqe_idx = (my_qp-rq_map.tail + 1) %
-   my_qp-rq_map.entries;
+   my_qp-rq_map.next_wqe_idx = next_index(my_qp-rq_map.tail,
+   my_qp-rq_map.entries);
spin_unlock_irqrestore(my_qp-recv_cq-spinlock, flags);
}
 
diff --git a/drivers/infiniband/hw/ehca/ehca_reqs.c 
b/drivers/infiniband/hw/ehca/ehca_reqs.c
index 00a648f..c711268 100644
--- a/drivers/infiniband/hw/ehca/ehca_reqs.c
+++ b/drivers/infiniband/hw/ehca/ehca_reqs.c
@@ -726,13 +726,13 @@ repoll:
 * set left_to_poll to 0 because in error state, we will not
 * get any additional CQEs
 */
-   my_qp-sq_map.next_wqe_idx = (my_qp-sq_map.tail + 1) %
-   my_qp-sq_map.entries;
+   my_qp-sq_map.next_wqe_idx = next_index(my_qp-sq_map.tail,
+   my_qp-sq_map.entries);
my_qp-sq_map.left_to_poll = 0;
ehca_add_to_err_list(my_qp, 1);
 
-   my_qp-rq_map.next_wqe_idx = (my_qp-rq_map.tail + 1) %
-   my_qp-rq_map.entries;
+   my_qp-rq_map.next_wqe_idx = next_index(my_qp-rq_map.tail,
+   my_qp-rq_map.entries);
my_qp-rq_map.left_to_poll = 0;
if (HAS_RQ(my_qp))
ehca_add_to_err_list(my_qp, 0);
@@ -860,9 +860,8 @@ static int generate_flush_cqes(struct ehca_qp *my_qp, 
struct ib_cq *cq,
 
/* mark as reported and advance next_wqe pointer */
qmap_entry-reported = 1;
-   qmap-next_wqe_idx++;
-   if (qmap-next_wqe_idx == qmap-entries)
-   qmap-next_wqe_idx = 0;
+   qmap-next_wqe_idx = next_index(qmap-next_wqe_idx,
+   qmap-entries);
qmap_entry = qmap-map[qmap
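
As a quick sanity check of the replacement, the following userspace sketch compares the add-and-compare form with the modulus form it replaces. next_index() follows the logic of the helper in the patch; next_index_mod() and next_index_forms_agree() are hypothetical names added only for this comparison.

```c
/* Same logic as the next_index() helper introduced by the patch. */
unsigned int next_index(unsigned int cur_index, unsigned int limit)
{
	unsigned int temp = cur_index + 1;
	return (temp == limit) ? 0 : temp;
}

/* The modulus form that the patch replaces. */
unsigned int next_index_mod(unsigned int cur_index, unsigned int limit)
{
	return (cur_index + 1) % limit;
}

/* Hypothetical check helper: returns 1 if both forms agree for every
 * valid index in [0, limit), and 0 otherwise. */
int next_index_forms_agree(unsigned int limit)
{
	unsigned int i;

	for (i = 0; i < limit; i++)
		if (next_index(i, limit) != next_index_mod(i, limit))
			return 0;
	return 1;
}
```

Note that the two forms are only equivalent while cur_index stays below limit, which holds for indices that were themselves produced by next_index(). For an out-of-range index such as next_index(5, 4) the helper returns 6, whereas the modulus form would return 2.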

[PATCH] IB/ehca: remove reference to the QP in case of port activation failure

2008-11-04 Thread Stefan Roscher
If the initialization of a special QP (e.g. AQP1) fails due to a software
timeout, we have to remove the reference to that special QP struct from the
port struct. This prevents the driver from accessing the QP, since it will be
or has already been destroyed by the caller, i.e. in this case ib_mad.
This patch applies cleanly on top of the 2.6.28 git tree.

Signed-off-by: Stefan Roscher [EMAIL PROTECTED]
---
 drivers/infiniband/hw/ehca/ehca_irq.c |7 +--
 drivers/infiniband/hw/ehca/ehca_qp.c  |5 +
 2 files changed, 10 insertions(+), 2 deletions(-)

diff --git a/drivers/infiniband/hw/ehca/ehca_irq.c 
b/drivers/infiniband/hw/ehca/ehca_irq.c
index cb55be0..9e43459 100644
--- a/drivers/infiniband/hw/ehca/ehca_irq.c
+++ b/drivers/infiniband/hw/ehca/ehca_irq.c
@@ -370,6 +370,10 @@ static void parse_ec(struct ehca_shca *shca, u64 eqe)
switch (ec) {
case 0x30: /* port availability change */
if (EHCA_BMASK_GET(NEQE_PORT_AVAILABILITY, eqe)) {
+   /* only for autodetect mode important */
+   if (ehca_nr_ports >= 0)
+   break;
+
int suppress_event;
/* replay modify_qp for sqps */
spin_lock_irqsave(sport-mod_sqp_lock, flags);
@@ -387,8 +391,7 @@ static void parse_ec(struct ehca_shca *shca, u64 eqe)
sport-port_state = IB_PORT_ACTIVE;
dispatch_port_event(shca, port, IB_EVENT_PORT_ACTIVE,
is active);
-   ehca_query_sma_attr(shca, port,
-   sport-saved_attr);
+   ehca_query_sma_attr(shca, port, sport-saved_attr);
} else {
sport-port_state = IB_PORT_DOWN;
dispatch_port_event(shca, port, IB_EVENT_PORT_ERR,
diff --git a/drivers/infiniband/hw/ehca/ehca_qp.c 
b/drivers/infiniband/hw/ehca/ehca_qp.c
index 4d54b9f..9e05ee2 100644
--- a/drivers/infiniband/hw/ehca/ehca_qp.c
+++ b/drivers/infiniband/hw/ehca/ehca_qp.c
@@ -860,6 +860,11 @@ static struct ehca_qp *internal_create_qp(
if (qp_type == IB_QPT_GSI) {
h_ret = ehca_define_sqp(shca, my_qp, init_attr);
if (h_ret != H_SUCCESS) {
+   kfree(my_qp-mod_qp_parm);
+   my_qp-mod_qp_parm = NULL;
+   /* the QP pointer is no longer valid */
+   shca-sport[init_attr-port_num - 1].ibqp_sqp[qp_type] =
+   NULL;
ret = ehca2ib_return_code(h_ret);
goto create_qp_exit6;
}
-- 
1.5.5



[PATCH]IB/ehca:reject dynamic memory add/remove

2008-10-13 Thread Stefan Roscher
Since the ehca device driver does not support dynamic memory add and remove
operations, the driver must explicitly reject such requests in order to prevent
unpredictable behaviors related to memory regions already occupied and being
used by InfiniBand applications.
The solution is to add a memory notifier to the ehca device driver and if a
request for dynamic memory add or remove comes in, ehca will always reject it.

Signed-off-by: Stefan Roscher [EMAIL PROTECTED]
---

diff -Nurp linux-2.6.27-rc6-7/drivers/infiniband/hw/ehca/ehca_main.c 
linux-2.6.27-rc6-7.new/drivers/infiniband/hw/ehca/ehca_main.c
--- linux-2.6.27-rc6-7/drivers/infiniband/hw/ehca/ehca_main.c   2008-09-16 
18:19:27.0 +0200
+++ linux-2.6.27-rc6-7.new/drivers/infiniband/hw/ehca/ehca_main.c   
2008-10-03 13:52:50.0 +0200
@@ -44,6 +44,8 @@
 #include linux/slab.h
 #endif
 
+#include linux/notifier.h
+#include linux/memory.h
 #include ehca_classes.h
 #include ehca_iverbs.h
 #include ehca_mrmw.h
@@ -964,6 +966,41 @@ void ehca_poll_eqs(unsigned long data)
spin_unlock(shca_list_lock);
 }
 
+static int ehca_mem_notifier(struct notifier_block *nb,
+ unsigned long action, void *data)
+{
+   static unsigned long ehca_dmem_warn_time;
+
+   switch (action) {
+   case MEM_CANCEL_OFFLINE:
+   case MEM_CANCEL_ONLINE:
+   case MEM_ONLINE:
+   case MEM_OFFLINE:
+   return NOTIFY_OK;
+   case MEM_GOING_ONLINE:
+   case MEM_GOING_OFFLINE:
+   /* only ok if no hca is attached to the lpar */
+   spin_lock(shca_list_lock);
+   if (list_empty(shca_list)) {
+   spin_unlock(shca_list_lock);
+   return NOTIFY_OK;
+   } else {
+   spin_unlock(shca_list_lock);
+   if (printk_timed_ratelimit(ehca_dmem_warn_time,
+  30 * 1000))
+   ehca_gen_err(DMEM operations are not allowed
+as long as an ehca adapter is
+attached to the LPAR);
+   return NOTIFY_BAD;
+   }
+   }
+   return NOTIFY_OK;
+}
+
+static struct notifier_block ehca_mem_nb = {
+   .notifier_call = ehca_mem_notifier,
+};
+
 static int __init ehca_module_init(void)
 {
int ret;
@@ -991,6 +1028,12 @@ static int __init ehca_module_init(void)
goto module_init2;
}
 
+   ret = register_memory_notifier(ehca_mem_nb);
+   if (ret) {
+   ehca_gen_err(Failed registering memory add/remove notifier);
+   goto module_init3;
+   }
+
if (ehca_poll_all_eqs != 1) {
ehca_gen_err(WARNING!!!);
ehca_gen_err(It is possible to lose interrupts.);
@@ -1003,6 +1046,9 @@ static int __init ehca_module_init(void)
 
return 0;
 
+module_init3:
+   ibmebus_unregister_driver(ehca_driver);
+
 module_init2:
ehca_destroy_slab_caches();
 
@@ -1018,6 +1064,8 @@ static void __exit ehca_module_exit(void
 
ibmebus_unregister_driver(ehca_driver);
 
+   unregister_memory_notifier(ehca_mem_nb);
+
ehca_destroy_slab_caches();
 
ehca_destroy_comp_pool();
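
The decision logic of the notifier can be tried out in isolation with the userspace sketch below. All names are hypothetical stand-ins with an sk_/SK_ prefix; the enum values are arbitrary and do not match the kernel's MEM_* or NOTIFY_* constants, and the adapter count is passed in as a plain argument instead of checking a locked list.

```c
/* Hypothetical stand-ins for the memory hotplug actions. */
enum sk_mem_action {
	SK_MEM_ONLINE,
	SK_MEM_OFFLINE,
	SK_MEM_GOING_ONLINE,
	SK_MEM_GOING_OFFLINE,
	SK_MEM_CANCEL_ONLINE,
	SK_MEM_CANCEL_OFFLINE
};

/* Hypothetical stand-ins for the notifier return codes. */
enum sk_verdict { SK_NOTIFY_OK, SK_NOTIFY_BAD };

/* Reject a pending memory add/remove while at least one adapter is
 * attached; accept every other notification. */
enum sk_verdict sk_mem_notifier(enum sk_mem_action action,
				unsigned int nr_adapters)
{
	switch (action) {
	case SK_MEM_GOING_ONLINE:
	case SK_MEM_GOING_OFFLINE:
		/* only ok if no adapter is attached */
		return nr_adapters == 0 ? SK_NOTIFY_OK : SK_NOTIFY_BAD;
	default:
		return SK_NOTIFY_OK;
	}
}
```

Only the two GOING_* actions are vetoed. The notifications that report a completed or cancelled operation are accepted, as in the patch.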


Re: [PATCH REPOST] IB/ehca: In case of lost interrupts, trigger EOI to reenable interrupts

2008-06-13 Thread Stefan Roscher
Hi Roland,

On Tuesday 10 June 2008 18:18:50 Roland Dreier wrote:
So just to be clear: this is a workaround for a hardware/firmware bug?
 
   Yes it is.
 
 OK, so paulus et al... does it seem like a good approach to call H_EOI
 from driver code (given that this driver makes tons of other hcalls)?
 
 How critical is this?  Since you said corner case testing I suspect we
 can defer this to 2.6.27 and maybe get it into -stable later?

No, it's ok with me if you pick this for 2.6.27.
 
 Also, out of curiosity:
 
   +u64 hipz_h_eoi(int irq)
   +{
   +  int value;
   +  unsigned long xirr;
   +
   +  iosync();
 
 what is the iosync() required for here?

It's the same sequence that the powerpc interrupt handler uses.

 
   +  value = (0xff << 24) | irq;
   +  xirr = value & 0xffffffff;
 
 given that irq and value are ints, is there any possible way value could
 have bits outside of the low 32 set?  If you're worried about sign
 extension isn't it simpler to just make value unsigned?
 
   +  return plpar_hcall_norets(H_EOI, xirr);
   +}
 
 ie why not:
 
 u64 hipz_h_eoi(int irq)
 {
   unsigned xirr;
 
   iosync();
   xirr = (0xff << 24) | irq;
   return plpar_hcall_norets(H_EOI, xirr);
 }
 
Yeah, you are right. I will change that in the final patch,
which I will send soon.

regards Stefan
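
A small userspace sketch of the simplification discussed above: if the value is built in an unsigned 32-bit type, no sign extension can occur when it is widened, so the extra masking step is unnecessary. The function name build_xirr() and the use of the stdint.h types are assumptions for this sketch; the hypervisor call itself is left out.

```c
#include <stdint.h>

/* Builds the value with 0xff in the top byte of a 32-bit word and the
 * irq number below it, then widens it to 64 bits. Because the
 * intermediate type is unsigned, the upper 32 bits are always zero. */
uint64_t build_xirr(uint32_t irq)
{
	uint32_t xirr = (UINT32_C(0xff) << 24) | irq;

	return xirr;
}
```

For example, build_xirr(0x10) yields 0xff000010 with nothing set above bit 31.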




[PATCH REPOST #2] IB/ehca: In case of lost interrupts, trigger EOI to reenable interrupts

2008-06-13 Thread Stefan Roscher
During corner case testing, we noticed that some versions of ehca 
do not properly transition to interrupt done in special load situations.
This can be resolved by periodically triggering EOI through H_EOI, 
if eqes are pending.

Signed-off-by: Stefan Roscher [EMAIL PROTECTED]
---
As the firmware team suggested, I moved the EOI h_call into the handler
function. This ensures that we call EOI only when we find a valid eqe
on the event queue.
Additionally, I changed the calculation of the xirr value as Roland suggested.

 drivers/infiniband/hw/ehca/ehca_irq.c |9 +++--
 drivers/infiniband/hw/ehca/hcp_if.c   |   10 ++
 drivers/infiniband/hw/ehca/hcp_if.h   |1 +
 3 files changed, 18 insertions(+), 2 deletions(-)

diff --git a/drivers/infiniband/hw/ehca/ehca_irq.c 
b/drivers/infiniband/hw/ehca/ehca_irq.c
index ce1ab05..0792d93 100644
--- a/drivers/infiniband/hw/ehca/ehca_irq.c
+++ b/drivers/infiniband/hw/ehca/ehca_irq.c
@@ -531,7 +531,7 @@ void ehca_process_eq(struct ehca_shca *shca, int is_irq)
 {
struct ehca_eq *eq = shca-eq;
struct ehca_eqe_cache_entry *eqe_cache = eq-eqe_cache;
-   u64 eqe_value;
+   u64 eqe_value, ret;
unsigned long flags;
int eqe_cnt, i;
int eq_empty = 0;
@@ -583,8 +583,13 @@ void ehca_process_eq(struct ehca_shca *shca, int is_irq)
ehca_dbg(shca-ib_device,
 No eqe found for irq event);
goto unlock_irq_spinlock;
-   } else if (!is_irq)
+   } else if (!is_irq) {
+   ret = hipz_h_eoi(eq-ist);
+   if (ret != H_SUCCESS)
+   ehca_err(shca-ib_device,
+bad return code EOI -rc = %ld\n, ret);
ehca_dbg(shca-ib_device, deadman found %x eqe, eqe_cnt);
+   }
if (unlikely(eqe_cnt == EHCA_EQE_CACHE_SIZE))
ehca_dbg(shca-ib_device, too many eqes for one irq event);
/* enable irq for new packets */
diff --git a/drivers/infiniband/hw/ehca/hcp_if.c 
b/drivers/infiniband/hw/ehca/hcp_if.c
index 5245e13..415d3a4 100644
--- a/drivers/infiniband/hw/ehca/hcp_if.c
+++ b/drivers/infiniband/hw/ehca/hcp_if.c
@@ -933,3 +933,13 @@ u64 hipz_h_error_data(const struct ipz_adapter_handle 
adapter_handle,
   r_cb,
   0, 0, 0, 0);
 }
+
+u64 hipz_h_eoi(int irq)
+{
+   unsigned long xirr;
+
+   iosync();
+   xirr = (0xffULL << 24) | irq;
+
+   return plpar_hcall_norets(H_EOI, xirr);
+}
diff --git a/drivers/infiniband/hw/ehca/hcp_if.h 
b/drivers/infiniband/hw/ehca/hcp_if.h
index 60ce02b..2c3c6e0 100644
--- a/drivers/infiniband/hw/ehca/hcp_if.h
+++ b/drivers/infiniband/hw/ehca/hcp_if.h
@@ -260,5 +260,6 @@ u64 hipz_h_error_data(const struct ipz_adapter_handle 
adapter_handle,
  const u64 ressource_handle,
  void *rblock,
  unsigned long *byte_count);
+u64 hipz_h_eoi(int irq);
 
 #endif /* __HCP_IF_H__ */
-- 
1.5.5



Re: [PATCH 0/2] Prevent loss of interrupts in IB/ehca

2008-06-10 Thread Stefan Roscher
On Tuesday 10 June 2008 00:28:16 Paul Mackerras wrote:
 Stefan Roscher writes:
 
  This patchset contains two changes for IB/ehca and ibmebus.
  
  The first patch enables ibmebus_request_irq() to optionally return the 
  IRQ number, which is used by the second patch to trigger EOI in case of 
  lost interrupts.
 
 At first sight it seems like a very bad idea for a driver to be poking
 into the internals of the interrupt subsystem like this.  Under what
 circumstances do interrupts get lost, and why does doing an extra EOI
 like this fix the problem?
 
 Paul.
 

Processing events through timer-controlled polling is not the typical
way to handle adapter events.
During corner case testing, we noticed that some versions of ehca 
do not properly transition to interrupt done in special load situations.
This can be resolved by periodically triggering EOI through H_EOI, 
if eqes are pending.
Hope this clarifies the background of the patch.

Is there a better way to initiate this type of EOI in a non-irq case?

regards Stefan R. and Christoph R.


[PATCH REPOST] IB/ehca: In case of lost interrupts, trigger EOI to reenable interrupts

2008-06-10 Thread Stefan Roscher
During corner case testing, we noticed that some versions of ehca 
do not properly transition to interrupt done in special load situations.
This can be resolved by periodically triggering EOI through H_EOI, 
if eqes are pending.

Signed-off-by: Stefan Roscher [EMAIL PROTECTED]

---
This patch replaces my previous patch-set.
As Paul suggested, this version of the patch calls H_EOI directly and doesn't
need any ibmebus changes.
 
 drivers/infiniband/hw/ehca/ehca_main.c |   11 +--
 drivers/infiniband/hw/ehca/hcp_if.c|   11 +++
 drivers/infiniband/hw/ehca/hcp_if.h|1 +
 3 files changed, 21 insertions(+), 2 deletions(-)

diff --git a/drivers/infiniband/hw/ehca/ehca_main.c 
b/drivers/infiniband/hw/ehca/ehca_main.c
index 482103e..add4ff4 100644
--- a/drivers/infiniband/hw/ehca/ehca_main.c
+++ b/drivers/infiniband/hw/ehca/ehca_main.c
@@ -937,6 +937,7 @@ static struct of_platform_driver ehca_driver = {
 void ehca_poll_eqs(unsigned long data)
 {
struct ehca_shca *shca;
+   u64 ret;
 
spin_lock(shca_list_lock);
list_for_each_entry(shca, shca_list, shca_list) {
@@ -955,8 +956,14 @@ void ehca_poll_eqs(unsigned long data)
spin_unlock_irqrestore(eq-spinlock, flags);
max--;
} while (q_ofs == q_ofs2  max  0);
-   if (q_ofs == q_ofs2)
-   ehca_process_eq(shca, 0);
+   if (q_ofs == q_ofs2) {
+   ret = hipz_h_eoi(eq-ist);
+   if (ret != H_SUCCESS)
+   ehca_err(shca-ib_device,
+bad return code EOI -
+rc = %ld\n, ret);
+   tasklet_hi_schedule(shca-eq.interrupt_task);
+   }
}
}
mod_timer(poll_eqs_timer, round_jiffies(jiffies + HZ));
diff --git a/drivers/infiniband/hw/ehca/hcp_if.c 
b/drivers/infiniband/hw/ehca/hcp_if.c
index 5245e13..7084efd 100644
--- a/drivers/infiniband/hw/ehca/hcp_if.c
+++ b/drivers/infiniband/hw/ehca/hcp_if.c
@@ -933,3 +933,14 @@ u64 hipz_h_error_data(const struct ipz_adapter_handle 
adapter_handle,
   r_cb,
   0, 0, 0, 0);
 }
+
+u64 hipz_h_eoi(int irq)
+{
+   int value;
+   unsigned long xirr;
+
+   iosync();
+   value = (0xff << 24) | irq;
+   xirr = value & 0xffffffff;
+   return plpar_hcall_norets(H_EOI, xirr);
+}
diff --git a/drivers/infiniband/hw/ehca/hcp_if.h 
b/drivers/infiniband/hw/ehca/hcp_if.h
index 60ce02b..2c3c6e0 100644
--- a/drivers/infiniband/hw/ehca/hcp_if.h
+++ b/drivers/infiniband/hw/ehca/hcp_if.h
@@ -260,5 +260,6 @@ u64 hipz_h_error_data(const struct ipz_adapter_handle 
adapter_handle,
  const u64 ressource_handle,
  void *rblock,
  unsigned long *byte_count);
+u64 hipz_h_eoi(int irq);
 
 #endif /* __HCP_IF_H__ */
-- 
1.5.5



Re: [PATCH REPOST] IB/ehca: In case of lost interrupts, trigger EOI to reenable interrupts

2008-06-10 Thread Stefan Roscher
On Tuesday 10 June 2008 16:52:57 Roland Dreier wrote:
   During corner case testing, we noticed that some versions of ehca 
   do not properly transition to interrupt done in special load situations.
   This can be resolved by periodically triggering EOI through H_EOI, 
   if eqes are pending.
 
 So just to be clear: this is a workaround for a hardware/firmware bug?
 
  - R.
 

Yes it is.
regards Stefan


[PATCH 0/2] Prevent loss of interrupts in IB/ehca

2008-06-09 Thread Stefan Roscher
This patchset contains two changes for IB/ehca and ibmebus.

The first patch enables ibmebus_request_irq() to optionally return the 
IRQ number, which is used by the second patch to trigger EOI in case of 
lost interrupts.

They should apply cleanly against 2.6.26 git tree.

Thanks
Stefan
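
The "optionally return the IRQ number" part boils down to an optional out-parameter: callers that do not care pass NULL. Below is a minimal userspace sketch of that calling convention. All names are hypothetical; in particular sketch_map_irq() is an invented mapping that only stands in for the real lookup of an interrupt number from the ist value.

```c
#include <stddef.h>

#define SKETCH_NO_IRQ 0

/* Invented mapping from an ist value to an irq number (assumption). */
static int sketch_map_irq(unsigned int ist)
{
	return ist == 0 ? SKETCH_NO_IRQ : (int)(ist + 16);
}

/* Returns 0 on success and -1 if no irq could be mapped. The irq number
 * is stored through irq_number only if the caller passed a non-NULL
 * pointer, and only on success. */
int sketch_request_irq(unsigned int ist, int *irq_number)
{
	int irq = sketch_map_irq(ist);

	if (irq == SKETCH_NO_IRQ)
		return -1;

	if (irq_number)
		*irq_number = irq;

	return 0;
}
```

Existing callers can be converted mechanically by passing NULL, which is what the first patch does for the ehca and ehea call sites.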


[PATCH 1/2] ibmebus: Change ibmebus_request_irq() to optionally return irq number

2008-06-09 Thread Stefan Roscher
Signed-off-by: Stefan Roscher [EMAIL PROTECTED]
---
 arch/powerpc/kernel/ibmebus.c|5 -
 drivers/infiniband/hw/ehca/ehca_eq.c |4 ++--
 drivers/net/ehea/ehea_main.c |6 +++---
 include/asm-powerpc/ibmebus.h|2 +-
 4 files changed, 10 insertions(+), 7 deletions(-)

diff --git a/arch/powerpc/kernel/ibmebus.c b/arch/powerpc/kernel/ibmebus.c
index 9971159..a002fdf 100644
--- a/arch/powerpc/kernel/ibmebus.c
+++ b/arch/powerpc/kernel/ibmebus.c
@@ -208,7 +208,7 @@ void ibmebus_unregister_driver(struct of_platform_driver 
*drv)
 }
 EXPORT_SYMBOL(ibmebus_unregister_driver);
 
-int ibmebus_request_irq(u32 ist, irq_handler_t handler,
+int ibmebus_request_irq(u32 ist, int *irq_number, irq_handler_t handler,
unsigned long irq_flags, const char *devname,
void *dev_id)
 {
@@ -217,6 +217,9 @@ int ibmebus_request_irq(u32 ist, irq_handler_t handler,
if (irq == NO_IRQ)
return -EINVAL;
 
+   if (irq_number)
+   *irq_number = irq;
+
return request_irq(irq, handler, irq_flags, devname, dev_id);
 }
 EXPORT_SYMBOL(ibmebus_request_irq);
diff --git a/drivers/infiniband/hw/ehca/ehca_eq.c b/drivers/infiniband/hw/ehca/ehca_eq.c
index 49660df..5bc494f 100644
--- a/drivers/infiniband/hw/ehca/ehca_eq.c
+++ b/drivers/infiniband/hw/ehca/ehca_eq.c
@@ -122,7 +122,7 @@ int ehca_create_eq(struct ehca_shca *shca,
 
 /* register interrupt handlers and initialize work queues */
 if (type == EHCA_EQ) {
-   ret = ibmebus_request_irq(eq->ist, ehca_interrupt_eq,
+   ret = ibmebus_request_irq(eq->ist, NULL, ehca_interrupt_eq,
   IRQF_DISABLED, "ehca_eq",
   (void *)shca);
 if (ret < 0)
@@ -130,7 +130,7 @@ int ehca_create_eq(struct ehca_shca *shca,
 
 tasklet_init(&eq->interrupt_task, ehca_tasklet_eq, (long)shca);
 } else if (type == EHCA_NEQ) {
-   ret = ibmebus_request_irq(eq->ist, ehca_interrupt_neq,
+   ret = ibmebus_request_irq(eq->ist, NULL, ehca_interrupt_neq,
   IRQF_DISABLED, "ehca_neq",
   (void *)shca);
 if (ret < 0)
diff --git a/drivers/net/ehea/ehea_main.c b/drivers/net/ehea/ehea_main.c
index 287a619..102ffeb 100644
--- a/drivers/net/ehea/ehea_main.c
+++ b/drivers/net/ehea/ehea_main.c
@@ -1216,7 +1216,7 @@ static int ehea_reg_interrupts(struct net_device *dev)
 snprintf(port->int_aff_name, EHEA_IRQ_NAME_SIZE - 1, "%s-aff",
  dev->name);
 
-   ret = ibmebus_request_irq(port->qp_eq->attr.ist1,
+   ret = ibmebus_request_irq(port->qp_eq->attr.ist1, NULL,
   ehea_qp_aff_irq_handler,
   IRQF_DISABLED, port->int_aff_name, port);
 if (ret) {
@@ -1234,7 +1234,7 @@ static int ehea_reg_interrupts(struct net_device *dev)
 pr = &port->port_res[i];
 snprintf(pr->int_send_name, EHEA_IRQ_NAME_SIZE - 1,
  "%s-queue%d", dev->name, i);
-   ret = ibmebus_request_irq(pr->eq->attr.ist1,
+   ret = ibmebus_request_irq(pr->eq->attr.ist1, NULL,
   ehea_recv_irq_handler,
   IRQF_DISABLED, pr->int_send_name,
   pr);
@@ -3414,7 +3414,7 @@ static int __devinit ehea_probe_adapter(struct of_device *dev,
 tasklet_init(&adapter->neq_tasklet, ehea_neq_tasklet,
  (unsigned long)adapter);
 
-   ret = ibmebus_request_irq(adapter->neq->attr.ist1,
+   ret = ibmebus_request_irq(adapter->neq->attr.ist1, NULL,
   ehea_interrupt_neq, IRQF_DISABLED,
   "ehea_neq", adapter);
if (ret) {
diff --git a/include/asm-powerpc/ibmebus.h b/include/asm-powerpc/ibmebus.h
index 1a9d9ae..3a2618a 100644
--- a/include/asm-powerpc/ibmebus.h
+++ b/include/asm-powerpc/ibmebus.h
@@ -51,7 +51,7 @@ extern struct bus_type ibmebus_bus_type;
 int ibmebus_register_driver(struct of_platform_driver *drv);
 void ibmebus_unregister_driver(struct of_platform_driver *drv);
 
-int ibmebus_request_irq(u32 ist, irq_handler_t handler,
+int ibmebus_request_irq(u32 ist, int *irq_number, irq_handler_t handler,
unsigned long irq_flags, const char *devname,
void *dev_id);
 void ibmebus_free_irq(u32 ist, void *dev_id);
-- 
1.5.5



[PATCH 2/2] IB/ehca: In case of lost interrupts, trigger EOI to reenable interrupts

2008-06-09 Thread Stefan Roscher
Signed-off-by: Stefan Roscher [EMAIL PROTECTED]
---
 drivers/infiniband/hw/ehca/ehca_classes.h |    1 +
 drivers/infiniband/hw/ehca/ehca_eq.c      |    6 ++++--
 drivers/infiniband/hw/ehca/ehca_main.c    |   12 ++++++++++--
 3 files changed, 15 insertions(+), 4 deletions(-)

diff --git a/drivers/infiniband/hw/ehca/ehca_classes.h b/drivers/infiniband/hw/ehca/ehca_classes.h
index 1e9e99a..4de363d 100644
--- a/drivers/infiniband/hw/ehca/ehca_classes.h
+++ b/drivers/infiniband/hw/ehca/ehca_classes.h
@@ -86,6 +86,7 @@ struct ehca_eq {
u32 ist;
spinlock_t irq_spinlock;
struct ehca_eqe_cache_entry eqe_cache[EHCA_EQE_CACHE_SIZE];
+   int irq_number;
 };
 
 struct ehca_sma_attr {
diff --git a/drivers/infiniband/hw/ehca/ehca_eq.c b/drivers/infiniband/hw/ehca/ehca_eq.c
index 5bc494f..b70e5e5 100644
--- a/drivers/infiniband/hw/ehca/ehca_eq.c
+++ b/drivers/infiniband/hw/ehca/ehca_eq.c
@@ -122,7 +122,8 @@ int ehca_create_eq(struct ehca_shca *shca,
 
 /* register interrupt handlers and initialize work queues */
 if (type == EHCA_EQ) {
-   ret = ibmebus_request_irq(eq->ist, NULL, ehca_interrupt_eq,
+   ret = ibmebus_request_irq(eq->ist, &eq->irq_number,
+ ehca_interrupt_eq,
   IRQF_DISABLED, "ehca_eq",
   (void *)shca);
 if (ret < 0)
@@ -130,7 +131,8 @@ int ehca_create_eq(struct ehca_shca *shca,
 
 tasklet_init(&eq->interrupt_task, ehca_tasklet_eq, (long)shca);
 } else if (type == EHCA_NEQ) {
-   ret = ibmebus_request_irq(eq->ist, NULL, ehca_interrupt_neq,
+   ret = ibmebus_request_irq(eq->ist, &eq->irq_number,
+ ehca_interrupt_neq,
   IRQF_DISABLED, "ehca_neq",
   (void *)shca);
 if (ret < 0)
diff --git a/drivers/infiniband/hw/ehca/ehca_main.c b/drivers/infiniband/hw/ehca/ehca_main.c
index 482103e..d713317 100644
--- a/drivers/infiniband/hw/ehca/ehca_main.c
+++ b/drivers/infiniband/hw/ehca/ehca_main.c
@@ -44,6 +44,7 @@
 #include <linux/slab.h>
 #endif
 
+#include <linux/irq.h>
 #include "ehca_classes.h"
 #include "ehca_iverbs.h"
 #include "ehca_mrmw.h"
@@ -937,6 +938,8 @@ static struct of_platform_driver ehca_driver = {
 void ehca_poll_eqs(unsigned long data)
 {
 struct ehca_shca *shca;
+   int irq;
+   irq_desc_t *desc;
 
 spin_lock(&shca_list_lock);
 list_for_each_entry(shca, &shca_list, shca_list) {
@@ -955,8 +958,13 @@ void ehca_poll_eqs(unsigned long data)
 spin_unlock_irqrestore(&eq->spinlock, flags);
 max--;
 } while (q_ofs == q_ofs2 && max > 0);
-   if (q_ofs == q_ofs2)
-   ehca_process_eq(shca, 0);
+   if (q_ofs == q_ofs2) {
+   irq = shca->eq.irq_number;
+   desc = get_irq_desc(irq);
+   if (desc->chip && desc->chip->eoi)
+   desc->chip->eoi(irq);
+   tasklet_hi_schedule(&shca->eq.interrupt_task);
+   }
 }
 }
 mod_timer(&poll_eqs_timer, round_jiffies(jiffies + HZ));
-- 
1.5.5



[PATCH] IB/ehca: Protect QP against destroying until all async events for it are handled.

2008-05-07 Thread Stefan Roscher
This is necessary because, in a multicore environment, a race between
uverbs async handler and destroy QP could occur.

Signed-off-by: Stefan Roscher stefan.roscher at de.ibm.com
---

We are not sure if this should be fixed in the driver or in uverbs itself.
Roland, what's your opinion about this?

 drivers/infiniband/hw/ehca/ehca_classes.h |    2 ++
 drivers/infiniband/hw/ehca/ehca_irq.c     |    4 ++++
 drivers/infiniband/hw/ehca/ehca_qp.c      |    5 +++++
 3 files changed, 11 insertions(+), 0 deletions(-)

diff --git a/drivers/infiniband/hw/ehca/ehca_classes.h b/drivers/infiniband/hw/ehca/ehca_classes.h
index 00bab60..1e9e99a 100644
--- a/drivers/infiniband/hw/ehca/ehca_classes.h
+++ b/drivers/infiniband/hw/ehca/ehca_classes.h
@@ -192,6 +192,8 @@ struct ehca_qp {
 int mtu_shift;
 u32 message_count;
 u32 packet_count;
+   atomic_t nr_events; /* events seen */
+   wait_queue_head_t wait_completion;
 };
 
 #define IS_SRQ(qp) (qp->ext_type == EQPT_SRQ)
diff --git a/drivers/infiniband/hw/ehca/ehca_irq.c b/drivers/infiniband/hw/ehca/ehca_irq.c
index ca5eb0c..ce1ab05 100644
--- a/drivers/infiniband/hw/ehca/ehca_irq.c
+++ b/drivers/infiniband/hw/ehca/ehca_irq.c
@@ -204,6 +204,8 @@ static void qp_event_callback(struct ehca_shca *shca, u64 eqe,
 
 read_lock(&ehca_qp_idr_lock);
 qp = idr_find(&ehca_qp_idr, token);
+   if (qp)
+   atomic_inc(&qp->nr_events);
 read_unlock(&ehca_qp_idr_lock);
 
 if (!qp)
@@ -223,6 +225,8 @@ static void qp_event_callback(struct ehca_shca *shca, u64 eqe,
 if (fatal && qp->ext_type == EQPT_SRQBASE)
 dispatch_qp_event(shca, qp, IB_EVENT_QP_LAST_WQE_REACHED);
 
+   if (atomic_dec_and_test(&qp->nr_events))
+   wake_up(&qp->wait_completion);
 return;
 }
 
diff --git a/drivers/infiniband/hw/ehca/ehca_qp.c b/drivers/infiniband/hw/ehca/ehca_qp.c
index 18fba92..d550200 100644
--- a/drivers/infiniband/hw/ehca/ehca_qp.c
+++ b/drivers/infiniband/hw/ehca/ehca_qp.c
@@ -566,6 +566,8 @@ static struct ehca_qp *internal_create_qp(
 return ERR_PTR(-ENOMEM);
 }
 
+   atomic_set(&my_qp->nr_events, 0);
+   init_waitqueue_head(&my_qp->wait_completion);
 spin_lock_init(&my_qp->spinlock_s);
 spin_lock_init(&my_qp->spinlock_r);
 my_qp->qp_type = qp_type;
@@ -1934,6 +1936,9 @@ static int internal_destroy_qp(struct ib_device *dev, struct ehca_qp *my_qp,
 idr_remove(&ehca_qp_idr, my_qp->token);
 write_unlock_irqrestore(&ehca_qp_idr_lock, flags);
 
+/* now wait until all pending events have completed */
+   wait_event(my_qp->wait_completion, !atomic_read(&my_qp->nr_events));
+
 h_ret = hipz_h_destroy_qp(shca->ipz_hca_handle, my_qp);
 if (h_ret != H_SUCCESS) {
 ehca_err(dev, "hipz_h_destroy_qp() failed h_ret=%li "
-- 
1.5.5
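
The ordering this patch relies on can be sketched in user-space C roughly as below. It is a single-threaded illustration under stated assumptions: struct demo_qp and the demo_* helpers are hypothetical, the idr lookup is reduced to a `registered` flag, and the locking and the sleeping wait of the real driver are left out.

```c
#include <stdatomic.h>
#include <stdbool.h>

/* Hypothetical miniature of a QP that tracks in-flight async events. */
struct demo_qp {
	atomic_int nr_events;	/* events currently being handled */
	bool registered;	/* still findable by the event path */
};

static void demo_qp_init(struct demo_qp *qp)
{
	atomic_init(&qp->nr_events, 0);
	qp->registered = true;
}

/* Event path: count the event only while the QP can still be found. */
static bool demo_event_begin(struct demo_qp *qp)
{
	if (!qp->registered)
		return false;
	atomic_fetch_add(&qp->nr_events, 1);
	return true;
}

/*
 * Event path: drop the count again. Returns true when this was the
 * last in-flight event, which is where a waiter would be woken.
 */
static bool demo_event_end(struct demo_qp *qp)
{
	return atomic_fetch_sub(&qp->nr_events, 1) == 1;
}

/* Destroy path, step 1: unregister so that no new event can start. */
static void demo_qp_unregister(struct demo_qp *qp)
{
	qp->registered = false;
}

/* Destroy path, step 2: the condition the destroyer waits for. */
static bool demo_qp_idle(struct demo_qp *qp)
{
	return atomic_load(&qp->nr_events) == 0;
}
```

The point of the two-step destroy is that the counter can only fall once the QP is unregistered, so waiting for zero is enough to know that no handler still holds the pointer.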



Re: [ewg] [PATCH] IB/ehca: Protect QP against destroying until all async events for it are handled.

2008-05-07 Thread Stefan Roscher
On Wednesday 07 May 2008 17:32:03 Roland Dreier wrote:
> > We are not sure if this should be fixed in the driver or in uverbs itself.
> > Roland, what's your opinion about this?
>
> Would be nice to be able to fix it in uverbs but I don't see how.  In
> particular a kernel consumer has to have the same guarantee that no
> async events will come in after destroy QP returns.  And I don't see any
> way generic code can provide a guarantee about what low-level driver
> code may do internally.
 

I agree, that's why I posted the driver fix first.
So, will you apply it next?

Regards Stefan



[PATCH] IB/ehca: Change function return types to correct type.

2008-05-05 Thread Stefan Roscher
Also remove duplicate assignment of local_ca_ack_delay
and change min_t check for local_ca_ack_delay to u8 instead of int.

Signed-off-by: Stefan Roscher stefan.roscher at de.ibm.com
---
 drivers/infiniband/hw/ehca/ehca_hca.c |    7 +++----
 1 files changed, 3 insertions(+), 4 deletions(-)

diff --git a/drivers/infiniband/hw/ehca/ehca_hca.c b/drivers/infiniband/hw/ehca/ehca_hca.c
index 2515cbd..bc3b37d 100644
--- a/drivers/infiniband/hw/ehca/ehca_hca.c
+++ b/drivers/infiniband/hw/ehca/ehca_hca.c
@@ -101,7 +101,6 @@ int ehca_query_device(struct ib_device *ibdev, struct ib_device_attr *props)
 props->max_ee  = limit_uint(rblock->max_rd_ee_context);
 props->max_rdd = limit_uint(rblock->max_rd_domain);
 props->max_fmr = limit_uint(rblock->max_mr);
-   props->local_ca_ack_delay  = limit_uint(rblock->local_ca_ack_delay);
 props->max_qp_rd_atom  = limit_uint(rblock->max_rr_qp);
 props->max_ee_rd_atom  = limit_uint(rblock->max_rr_ee_context);
 props->max_res_rd_atom = limit_uint(rblock->max_rr_hca);
@@ -115,7 +114,7 @@ int ehca_query_device(struct ib_device *ibdev, struct ib_device_attr *props)
 }
 
 props->max_pkeys   = 16;
-   props->local_ca_ack_delay  = limit_uint(rblock->local_ca_ack_delay);
+   props->local_ca_ack_delay  = min_t(u8, rblock->local_ca_ack_delay, 255);
 props->max_raw_ipv6_qp = limit_uint(rblock->max_raw_ipv6_qp);
 props->max_raw_ethy_qp = limit_uint(rblock->max_raw_ethy_qp);
 props->max_mcast_grp   = limit_uint(rblock->max_mcast_grp);
@@ -136,7 +135,7 @@ query_device1:
return ret;
 }
 
-static int map_mtu(struct ehca_shca *shca, u32 fw_mtu)
+static enum ib_mtu map_mtu(struct ehca_shca *shca, u32 fw_mtu)
 {
switch (fw_mtu) {
case 0x1:
@@ -156,7 +155,7 @@ static int map_mtu(struct ehca_shca *shca, u32 fw_mtu)
}
 }
 
-static int map_number_of_vls(struct ehca_shca *shca, u32 vl_cap)
+static u8 map_number_of_vls(struct ehca_shca *shca, u32 vl_cap)
 {
switch (vl_cap) {
case 0x1:
-- 
1.5.5
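
For readers unfamiliar with the two clamping styles used in this patch, here is a small user-space sketch of how they behave. demo_limit_uint() and demo_min_u8() are hypothetical stand-ins written for this illustration; the second one mimics a typed minimum that converts both operands to an 8-bit type before comparing. Whether the firmware field is wider than 8 bits is not visible in the hunk above, so the last case is shown only as a property of the conversion.

```c
#include <limits.h>
#include <stdint.h>

/* Clamp an unsigned value so that it fits into a signed int field. */
static unsigned int demo_limit_uint(unsigned int value)
{
	return value < (unsigned int)INT_MAX ? value : (unsigned int)INT_MAX;
}

/*
 * Typed minimum in the style of min_t(u8, a, b): both operands are
 * converted to uint8_t first, and only then compared.
 */
static uint8_t demo_min_u8(unsigned int a, unsigned int b)
{
	uint8_t x = (uint8_t)a;
	uint8_t y = (uint8_t)b;

	return x < y ? x : y;
}
```

With an input that already fits into 8 bits the typed minimum is a plain pass-through. A wider input is reduced modulo 256 by the conversion before the comparison happens, so 256 would come out as 0 rather than 255.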



[PATCH] IB/ehca: Allocate event queue size depending on max number of CQs and QPs

2008-04-29 Thread Stefan Roscher


Signed-off-by: Stefan Roscher stefan.roscher at de.ibm.com
---
 drivers/infiniband/hw/ehca/ehca_classes.h |5 
 drivers/infiniband/hw/ehca/ehca_cq.c  |   10 
 drivers/infiniband/hw/ehca/ehca_main.c|   36 +++-
 drivers/infiniband/hw/ehca/ehca_qp.c  |   10 
 4 files changed, 59 insertions(+), 2 deletions(-)

diff --git a/drivers/infiniband/hw/ehca/ehca_classes.h b/drivers/infiniband/hw/ehca/ehca_classes.h
index 3d6d946..00bab60 100644
--- a/drivers/infiniband/hw/ehca/ehca_classes.h
+++ b/drivers/infiniband/hw/ehca/ehca_classes.h
@@ -66,6 +66,7 @@ struct ehca_av;
 #include "ehca_irq.h"
 
 #define EHCA_EQE_CACHE_SIZE 20
+#define EHCA_MAX_NUM_QUEUES 0x
 
 struct ehca_eqe_cache_entry {
struct ehca_eqe *eqe;
@@ -127,6 +128,8 @@ struct ehca_shca {
/* MR pgsize: bit 0-3 means 4K, 64K, 1M, 16M respectively */
u32 hca_cap_mr_pgsize;
int max_mtu;
+   atomic_t num_cqs;
+   atomic_t num_qps;
 };
 
 struct ehca_pd {
@@ -344,6 +347,8 @@ extern int ehca_use_hp_mr;
 extern int ehca_scaling_code;
 extern int ehca_lock_hcalls;
 extern int ehca_nr_ports;
+extern int ehca_max_cq;
+extern int ehca_max_qp;
 
 struct ipzu_queue_resp {
u32 qe_size;  /* queue entry size */
diff --git a/drivers/infiniband/hw/ehca/ehca_cq.c 
b/drivers/infiniband/hw/ehca/ehca_cq.c
index ec0cfcf..5b4f9a3 100644
--- a/drivers/infiniband/hw/ehca/ehca_cq.c
+++ b/drivers/infiniband/hw/ehca/ehca_cq.c
@@ -132,6 +132,14 @@ struct ib_cq *ehca_create_cq(struct ib_device *device, int 
cqe, int comp_vector,
if (cqe = 0x - 64 - additional_cqe)
return ERR_PTR(-EINVAL);
 
+   if (atomic_read(&shca->num_cqs) >= ehca_max_cq) {
+   ehca_err(device, "Unable to create CQ, max number of %i "
+   "CQs reached.", ehca_max_cq);
+   ehca_err(device, "To increase the maximum number of CQs "
+   "use the number_of_cqs module parameter.\n");
+   return ERR_PTR(-ENOSPC);
+   }
+
 my_cq = kmem_cache_zalloc(cq_cache, GFP_KERNEL);
 if (!my_cq) {
 ehca_err(device, "Out of memory for ehca_cq struct device=%p",
@@ -286,6 +294,7 @@ struct ib_cq *ehca_create_cq(struct ib_device *device, int cqe, int comp_vector,
 }
 }
 
+   atomic_inc(&shca->num_cqs);
 return cq;
 
 create_cq_exit4:
@@ -359,6 +368,7 @@ int ehca_destroy_cq(struct ib_cq *cq)
 ipz_queue_dtor(NULL, &my_cq->ipz_queue);
 kmem_cache_free(cq_cache, my_cq);
 
+   atomic_dec(&shca->num_cqs);
 return 0;
 }
 
diff --git a/drivers/infiniband/hw/ehca/ehca_main.c 
b/drivers/infiniband/hw/ehca/ehca_main.c
index 6504897..401907f 100644
--- a/drivers/infiniband/hw/ehca/ehca_main.c
+++ b/drivers/infiniband/hw/ehca/ehca_main.c
@@ -68,6 +68,8 @@ int ehca_port_act_time = 30;
 int ehca_static_rate   = -1;
 int ehca_scaling_code  = 0;
 int ehca_lock_hcalls   = -1;
+int ehca_max_cq= -1;
+int ehca_max_qp= -1;
 
 module_param_named(open_aqp1, ehca_open_aqp1, bool, S_IRUGO);
 module_param_named(debug_level,   ehca_debug_level,   int,  S_IRUGO);
@@ -79,6 +81,8 @@ module_param_named(poll_all_eqs,  ehca_poll_all_eqs,  bool, S_IRUGO);
 module_param_named(static_rate,   ehca_static_rate,   int,  S_IRUGO);
 module_param_named(scaling_code,  ehca_scaling_code,  bool, S_IRUGO);
 module_param_named(lock_hcalls,   ehca_lock_hcalls,   bool, S_IRUGO);
+module_param_named(number_of_cqs, ehca_max_cq,int, S_IRUGO);
+module_param_named(number_of_qps, ehca_max_qp,int, S_IRUGO);
 
 MODULE_PARM_DESC(open_aqp1,
 "Open AQP1 on startup (default: no)");
@@ -104,6 +108,12 @@ MODULE_PARM_DESC(scaling_code,
 MODULE_PARM_DESC(lock_hcalls,
 "Serialize all hCalls made by the driver "
 "(default: autodetect)");
+MODULE_PARM_DESC(number_of_cqs,
+   "Max number of CQs which can be allocated "
+   "(default: autodetect)");
+MODULE_PARM_DESC(number_of_qps,
+   "Max number of QPs which can be allocated "
+   "(default: autodetect)");
 
 DEFINE_RWLOCK(ehca_qp_idr_lock);
 DEFINE_RWLOCK(ehca_cq_idr_lock);
@@ -355,6 +365,25 @@ static int ehca_sense_attributes(struct ehca_shca *shca)
 if (rblock->memory_page_size_supported & pgsize_map[i])
 shca->hca_cap_mr_pgsize |= pgsize_map[i + 1];
 
+   /* Set maximum number of CQs and QPs to calculate EQ size */
+   if (ehca_max_qp == -1)
+   ehca_max_qp = min_t(int, rblock->max_qp, EHCA_MAX_NUM_QUEUES);
+   else if (ehca_max_qp < 1 || ehca_max_qp > rblock->max_qp) {
+   ehca_gen_err("Requested number of QPs is out of range (1 - %i) "
+   "specified by HW", rblock->max_qp);
+   ret = -EINVAL;
+   goto sense_attributes1;
+   }
+
+   if (ehca_max_cq == -1)
+   ehca_max_cq = min_t(int, rblock->max_cq

[REPOST][PATCH] IB/ehca: Allocate event queue size depending on max number of CQs and QPs

2008-04-29 Thread Stefan Roscher
If a lot of QPs fall into Error state at once and the EQ of the respective
HCA is too small, it might overrun, causing the eHCA driver to stop
processing completion events and calling application software's completion
handlers, effectively causing traffic to stop.

Fix this by limiting available QPs and CQs to a customizable max count,
and determining EQ size based on these counts and a worst-case assumption.

Signed-off-by: Stefan Roscher stefan.roscher at de.ibm.com
---

Reposted based on Roland's comments:
- use atomic_add_unless instead of atomic_read
- inf% changelog increase ;)
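
As background for the first repost item: a separate read followed by a later increment lets two creators both pass the limit check before either one has counted itself, while an "add unless" performs the check and the increment as one atomic step. A minimal C11 sketch of that idea follows; demo_add_unless() is a hypothetical user-space analogue written for illustration, not the kernel implementation.

```c
#include <stdatomic.h>
#include <stdbool.h>

/*
 * Add `a` to *v unless *v currently equals `u`.
 * Returns true if the addition happened, false if the limit was hit.
 */
static bool demo_add_unless(atomic_int *v, int a, int u)
{
	int cur = atomic_load(v);

	while (cur != u) {
		/* On failure, cur is refreshed with the current value. */
		if (atomic_compare_exchange_weak(v, &cur, cur + a))
			return true;
	}
	return false;
}
```

Because the slot is reserved before any allocation, every later failure path has to give it back, which is why the reposted diff adds a decrement on the out-of-memory path and on the error exit of CQ creation.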

 drivers/infiniband/hw/ehca/ehca_classes.h |5 
 drivers/infiniband/hw/ehca/ehca_cq.c  |   11 +
 drivers/infiniband/hw/ehca/ehca_main.c|   36 +++-
 drivers/infiniband/hw/ehca/ehca_qp.c  |   26 +++-
 4 files changed, 74 insertions(+), 4 deletions(-)

diff --git a/drivers/infiniband/hw/ehca/ehca_classes.h b/drivers/infiniband/hw/ehca/ehca_classes.h
index 3d6d946..00bab60 100644
--- a/drivers/infiniband/hw/ehca/ehca_classes.h
+++ b/drivers/infiniband/hw/ehca/ehca_classes.h
@@ -66,6 +66,7 @@ struct ehca_av;
 #include "ehca_irq.h"
 
 #define EHCA_EQE_CACHE_SIZE 20
+#define EHCA_MAX_NUM_QUEUES 0x
 
 struct ehca_eqe_cache_entry {
struct ehca_eqe *eqe;
@@ -127,6 +128,8 @@ struct ehca_shca {
/* MR pgsize: bit 0-3 means 4K, 64K, 1M, 16M respectively */
u32 hca_cap_mr_pgsize;
int max_mtu;
+   atomic_t num_cqs;
+   atomic_t num_qps;
 };
 
 struct ehca_pd {
@@ -344,6 +347,8 @@ extern int ehca_use_hp_mr;
 extern int ehca_scaling_code;
 extern int ehca_lock_hcalls;
 extern int ehca_nr_ports;
+extern int ehca_max_cq;
+extern int ehca_max_qp;
 
 struct ipzu_queue_resp {
u32 qe_size;  /* queue entry size */
diff --git a/drivers/infiniband/hw/ehca/ehca_cq.c 
b/drivers/infiniband/hw/ehca/ehca_cq.c
index ec0cfcf..5540b27 100644
--- a/drivers/infiniband/hw/ehca/ehca_cq.c
+++ b/drivers/infiniband/hw/ehca/ehca_cq.c
@@ -132,10 +132,19 @@ struct ib_cq *ehca_create_cq(struct ib_device *device, 
int cqe, int comp_vector,
if (cqe = 0x - 64 - additional_cqe)
return ERR_PTR(-EINVAL);
 
+   if (!atomic_add_unless(&shca->num_cqs, 1, ehca_max_cq)) {
+   ehca_err(device, "Unable to create CQ, max number of %i "
+   "CQs reached.", ehca_max_cq);
+   ehca_err(device, "To increase the maximum number of CQs "
+   "use the number_of_cqs module parameter.\n");
+   return ERR_PTR(-ENOSPC);
+   }
+
 my_cq = kmem_cache_zalloc(cq_cache, GFP_KERNEL);
 if (!my_cq) {
 ehca_err(device, "Out of memory for ehca_cq struct device=%p",
  device);
+   atomic_dec(&shca->num_cqs);
 return ERR_PTR(-ENOMEM);
 }
 
@@ -305,6 +314,7 @@ create_cq_exit2:
 create_cq_exit1:
 kmem_cache_free(cq_cache, my_cq);
 
+   atomic_dec(&shca->num_cqs);
 return cq;
 }
 
@@ -359,6 +369,7 @@ int ehca_destroy_cq(struct ib_cq *cq)
 ipz_queue_dtor(NULL, &my_cq->ipz_queue);
 kmem_cache_free(cq_cache, my_cq);
 
+   atomic_dec(&shca->num_cqs);
 return 0;
 }
 
diff --git a/drivers/infiniband/hw/ehca/ehca_main.c 
b/drivers/infiniband/hw/ehca/ehca_main.c
index 6504897..482103e 100644
--- a/drivers/infiniband/hw/ehca/ehca_main.c
+++ b/drivers/infiniband/hw/ehca/ehca_main.c
@@ -68,6 +68,8 @@ int ehca_port_act_time = 30;
 int ehca_static_rate   = -1;
 int ehca_scaling_code  = 0;
 int ehca_lock_hcalls   = -1;
+int ehca_max_cq= -1;
+int ehca_max_qp= -1;
 
 module_param_named(open_aqp1, ehca_open_aqp1, bool, S_IRUGO);
 module_param_named(debug_level,   ehca_debug_level,   int,  S_IRUGO);
@@ -79,6 +81,8 @@ module_param_named(poll_all_eqs,  ehca_poll_all_eqs,  bool, S_IRUGO);
 module_param_named(static_rate,   ehca_static_rate,   int,  S_IRUGO);
 module_param_named(scaling_code,  ehca_scaling_code,  bool, S_IRUGO);
 module_param_named(lock_hcalls,   ehca_lock_hcalls,   bool, S_IRUGO);
+module_param_named(number_of_cqs, ehca_max_cq,int,  S_IRUGO);
+module_param_named(number_of_qps, ehca_max_qp,int,  S_IRUGO);
 
 MODULE_PARM_DESC(open_aqp1,
 "Open AQP1 on startup (default: no)");
@@ -104,6 +108,12 @@ MODULE_PARM_DESC(scaling_code,
 MODULE_PARM_DESC(lock_hcalls,
 "Serialize all hCalls made by the driver "
 "(default: autodetect)");
+MODULE_PARM_DESC(number_of_cqs,
+   "Max number of CQs which can be allocated "
+   "(default: autodetect)");
+MODULE_PARM_DESC(number_of_qps,
+   "Max number of QPs which can be allocated "
+   "(default: autodetect)");
 
 DEFINE_RWLOCK(ehca_qp_idr_lock);
 DEFINE_RWLOCK(ehca_cq_idr_lock);
@@ -355,6 +365,25 @@ static int ehca_sense_attributes(struct ehca_shca *shca)
 if (rblock->memory_page_size_supported & pgsize_map[i

[PATCH] IB/ehca: extend query_device() and query_port() to support all values for ibv_devinfo

2008-04-07 Thread Stefan Roscher
Also, introduce a few inline helper functions to make the code more readable.

Signed-off-by: Stefan Roscher [EMAIL PROTECTED]
---
 drivers/infiniband/hw/ehca/ehca_hca.c |  128 
 1 files changed, 80 insertions(+), 48 deletions(-)

diff --git a/drivers/infiniband/hw/ehca/ehca_hca.c b/drivers/infiniband/hw/ehca/ehca_hca.c
index 8832123..f89c5f8 100644
--- a/drivers/infiniband/hw/ehca/ehca_hca.c
+++ b/drivers/infiniband/hw/ehca/ehca_hca.c
@@ -43,6 +43,11 @@
 #include "ehca_iverbs.h"
 #include "hcp_if.h"
 
+static inline unsigned int limit_uint(unsigned int value)
+{
+   return min_t(unsigned int, value, INT_MAX);
+}
+
 int ehca_query_device(struct ib_device *ibdev, struct ib_device_attr *props)
 {
int i, ret = 0;
@@ -83,37 +88,40 @@ int ehca_query_device(struct ib_device *ibdev, struct ib_device_attr *props)
 props->vendor_id   = rblock->vendor_id >> 8;
 props->vendor_part_id  = rblock->vendor_part_id >> 16;
 props->hw_ver  = rblock->hw_ver;
-   props->max_qp  = min_t(unsigned, rblock->max_qp, INT_MAX);
-   props->max_qp_wr   = min_t(unsigned, rblock->max_wqes_wq, INT_MAX);
-   props->max_sge = min_t(unsigned, rblock->max_sge, INT_MAX);
-   props->max_sge_rd  = min_t(unsigned, rblock->max_sge_rd, INT_MAX);
-   props->max_cq  = min_t(unsigned, rblock->max_cq, INT_MAX);
-   props->max_cqe = min_t(unsigned, rblock->max_cqe, INT_MAX);
-   props->max_mr  = min_t(unsigned, rblock->max_mr, INT_MAX);
-   props->max_mw  = min_t(unsigned, rblock->max_mw, INT_MAX);
-   props->max_pd  = min_t(unsigned, rblock->max_pd, INT_MAX);
-   props->max_ah  = min_t(unsigned, rblock->max_ah, INT_MAX);
-   props->max_fmr = min_t(unsigned, rblock->max_mr, INT_MAX);
+   props->max_qp  = limit_uint(rblock->max_qp);
+   props->max_qp_wr   = limit_uint(rblock->max_wqes_wq);
+   props->max_sge = limit_uint(rblock->max_sge);
+   props->max_sge_rd  = limit_uint(rblock->max_sge_rd);
+   props->max_cq  = limit_uint(rblock->max_cq);
+   props->max_cqe = limit_uint(rblock->max_cqe);
+   props->max_mr  = limit_uint(rblock->max_mr);
+   props->max_mw  = limit_uint(rblock->max_mw);
+   props->max_pd  = limit_uint(rblock->max_pd);
+   props->max_ah  = limit_uint(rblock->max_ah);
+   props->max_ee  = limit_uint(rblock->max_rd_ee_context);
+   props->max_rdd = limit_uint(rblock->max_rd_domain);
+   props->max_fmr = limit_uint(rblock->max_mr);
+   props->local_ca_ack_delay  = limit_uint(rblock->local_ca_ack_delay);
+   props->max_qp_rd_atom  = limit_uint(rblock->max_rr_qp);
+   props->max_ee_rd_atom  = limit_uint(rblock->max_rr_ee_context);
+   props->max_res_rd_atom = limit_uint(rblock->max_rr_hca);
+   props->max_qp_init_rd_atom = limit_uint(rblock->max_act_wqs_qp);
+   props->max_ee_init_rd_atom = limit_uint(rblock->max_act_wqs_ee_context);
 
 if (EHCA_BMASK_GET(HCA_CAP_SRQ, shca->hca_cap)) {
-   props->max_srq = props->max_qp;
-   props->max_srq_wr  = props->max_qp_wr;
+   props->max_srq = limit_uint(props->max_qp);
+   props->max_srq_wr  = limit_uint(props->max_qp_wr);
 props->max_srq_sge = 3;
 }
 
-   props->max_pkeys   = 16;
-   props->local_ca_ack_delay
-   = rblock->local_ca_ack_delay;
-   props->max_raw_ipv6_qp
-   = min_t(unsigned, rblock->max_raw_ipv6_qp, INT_MAX);
-   props->max_raw_ethy_qp
-   = min_t(unsigned, rblock->max_raw_ethy_qp, INT_MAX);
-   props->max_mcast_grp
-   = min_t(unsigned, rblock->max_mcast_grp, INT_MAX);
-   props->max_mcast_qp_attach
-   = min_t(unsigned, rblock->max_mcast_qp_attach, INT_MAX);
+   props->max_pkeys   = 16;
+   props->local_ca_ack_delay  = limit_uint(rblock->local_ca_ack_delay);
+   props->max_raw_ipv6_qp = limit_uint(rblock->max_raw_ipv6_qp);
+   props->max_raw_ethy_qp = limit_uint(rblock->max_raw_ethy_qp);
+   props->max_mcast_grp   = limit_uint(rblock->max_mcast_grp);
+   props->max_mcast_qp_attach = limit_uint(rblock->max_mcast_qp_attach);
 props->max_total_mcast_qp_attach
-   = min_t(unsigned, rblock->max_total_mcast_qp_attach, INT_MAX);
+   = limit_uint(rblock->max_total_mcast_qp_attach);
 
/* translate device capabilities */
props-device_cap_flags = IB_DEVICE_SYS_IMAGE_GUID |
@@ -128,6 +136,46 @@ query_device1:
return ret;
 }
 
+static inline int map_mtu(struct ehca_shca *shca, u32 fw_mtu)
+{
+   switch (fw_mtu) {
+   case 0x1:
+   return IB_MTU_256;
+   case 0x2:
+   return IB_MTU_512;
+   case 0x3:
+   return IB_MTU_1024;
+   case 0x4:
+   return IB_MTU_2048;
+   case 0x5

[PATCH 0/7] IB/ehca: support for user space small queues, support more than 4k queue pairs, generate last WQE reached

2007-08-08 Thread Stefan Roscher
Here is a patch set for ehca against Roland's git, branch for-2.6.23.
It enables userspace support for the small QP feature and fixes a few issues with it.
It also adds the mapping of the 4k firmware context to user space.

In detail:
[1/7] add support for userspace small queues and make some fixes
[2/7] ensure that non-existing queues in case of SRQs are not interpreted as
small queues
[3/7] we no longer have to add 1 to the number of requested wqes, because
firmware does this now
[4/7] make changes to ehca_mmap() to support more than 4k queues
[5/7] map 4k firmware context of cq, qp to user space
[6/7] generate last WQE reached, when base QP for SRQ has entered error state
[7/7] prevent overwriting QP init attributes given by caller

The patches should apply cleanly, in order, against Roland's git. Please
review the changes and apply the patches if they are okay.

Regards,
Stefan




[PATCH 1/7] IB/ehca: Small QP userspace support and fixes

2007-08-08 Thread Stefan Roscher
Signed-off-by: Stefan Roscher [EMAIL PROTECTED]
---
 drivers/infiniband/hw/ehca/ehca_qp.c   |    7 +++----
 drivers/infiniband/hw/ehca/ipz_pt_fn.c |    3 ++-
 2 files changed, 5 insertions(+), 5 deletions(-)

diff --git a/drivers/infiniband/hw/ehca/ehca_qp.c b/drivers/infiniband/hw/ehca/ehca_qp.c
index b178cba..cfa83fa 100644
--- a/drivers/infiniband/hw/ehca/ehca_qp.c
+++ b/drivers/infiniband/hw/ehca/ehca_qp.c
@@ -273,6 +273,7 @@ static inline void queue2resp(struct ipzu_queue_resp *resp,
 resp->queue_length = queue->queue_length;
 resp->pagesize = queue->pagesize;
 resp->toggle_state = queue->toggle_state;
+   resp->offset = queue->offset;
 }
 
 /*
@@ -598,8 +599,7 @@ static struct ehca_qp *internal_create_qp(
 parms.squeue.max_sge = max_send_sge;
 parms.rqueue.max_sge = max_recv_sge;
 
-   if (EHCA_BMASK_GET(HCA_CAP_MINI_QP, shca->hca_cap)
-&& !(context && udata)) { /* no small QP support in userspace ATM */
+   if (EHCA_BMASK_GET(HCA_CAP_MINI_QP, shca->hca_cap)) {
 ehca_determine_small_queue(
 &parms.squeue, max_send_sge, is_llqp);
 ehca_determine_small_queue(
@@ -739,8 +739,7 @@ static struct ehca_qp *internal_create_qp(
 resp.ext_type = my_qp->ext_type;
 resp.qkey = my_qp->qkey;
 resp.real_qp_num = my_qp->real_qp_num;
-   resp.ipz_rqueue.offset = my_qp->ipz_rqueue.offset;
-   resp.ipz_squeue.offset = my_qp->ipz_squeue.offset;
+
 if (HAS_SQ(my_qp))
 queue2resp(&resp.ipz_squeue, &my_qp->ipz_squeue);
 if (HAS_RQ(my_qp))
diff --git a/drivers/infiniband/hw/ehca/ipz_pt_fn.c b/drivers/infiniband/hw/ehca/ipz_pt_fn.c
index a090c67..661f8db 100644
--- a/drivers/infiniband/hw/ehca/ipz_pt_fn.c
+++ b/drivers/infiniband/hw/ehca/ipz_pt_fn.c
@@ -158,6 +158,7 @@ static int alloc_small_queue_page(struct ipz_queue *queue, struct ehca_pd *pd)
 
 queue->queue_pages[0] = (void *)(page->page | (bit << (order + 9)));
 queue->small_page = page;
+   queue->offset = bit << (order + 9);
 return 1;
 
 out:
@@ -172,7 +173,7 @@ static void free_small_queue_page(struct ipz_queue *queue, struct ehca_pd *pd)
 unsigned long bit;
 int free_page = 0;
 
-   bit = ((unsigned long)queue->queue_pages[0] & PAGE_MASK)
+   bit = ((unsigned long)queue->queue_pages[0] & ~PAGE_MASK)
  >> (order + 9);
 
 mutex_lock(&pd->lock);
-- 
1.5.2
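
The ipz_pt_fn.c change above is easiest to see with concrete numbers. The sketch below assumes a 4096-byte page purely for illustration; the DEMO_* macros and demo_* helpers are hypothetical and only mirror the shift and mask arithmetic of the diff.

```c
#define DEMO_PAGE_SIZE 4096UL			/* assumed page size */
#define DEMO_PAGE_MASK (~(DEMO_PAGE_SIZE - 1))	/* keeps the page-aligned part */

/* Slot `bit` of a small-queue page starts at bit << (order + 9). */
static unsigned long demo_slot_offset(unsigned long bit, int order)
{
	return bit << (order + 9);
}

/* Address of a slot: page base combined with the in-page offset. */
static unsigned long demo_slot_addr(unsigned long page, unsigned long bit,
				    int order)
{
	return page | demo_slot_offset(bit, order);
}

/*
 * Recover the slot index from an address. The in-page offset is what
 * is needed here, so the mask has to be inverted.
 */
static unsigned long demo_slot_index(unsigned long addr, int order)
{
	return (addr & ~DEMO_PAGE_MASK) >> (order + 9);
}
```

Masking with the non-inverted page mask would keep the page base instead of the offset, so the shift would yield a large unrelated number rather than the slot index.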





[PATCH 3/7] IB/ehca: Add 1 is not longer needed because of firmware interface change

2007-08-08 Thread Stefan Roscher
Signed-off-by: Stefan Roscher [EMAIL PROTECTED]
---
 drivers/infiniband/hw/ehca/hcp_if.c |4 ++--
 1 files changed, 2 insertions(+), 2 deletions(-)

diff --git a/drivers/infiniband/hw/ehca/hcp_if.c b/drivers/infiniband/hw/ehca/hcp_if.c
index 24f4541..8534061 100644
--- a/drivers/infiniband/hw/ehca/hcp_if.c
+++ b/drivers/infiniband/hw/ehca/hcp_if.c
@@ -317,9 +317,9 @@ u64 hipz_h_alloc_resource_qp(const struct ipz_adapter_handle adapter_handle,
 
 max_r10_reg =
 EHCA_BMASK_SET(H_ALL_RES_QP_MAX_OUTST_SEND_WR,
-  parms->squeue.max_wr + 1)
+  parms->squeue.max_wr)
 | EHCA_BMASK_SET(H_ALL_RES_QP_MAX_OUTST_RECV_WR,
-parms->rqueue.max_wr + 1)
+parms->rqueue.max_wr)
 | EHCA_BMASK_SET(H_ALL_RES_QP_MAX_SEND_SGE,
  parms->squeue.max_sge)
 | EHCA_BMASK_SET(H_ALL_RES_QP_MAX_RECV_SGE,
-- 
1.5.2




[PATCH 5/7] IB/ehca: map 4k firmware context of cq, qp to user space

2007-08-08 Thread Stefan Roscher
From: Hoang-Nam Nguyen [EMAIL PROTECTED]
Date: Wed, 8 Aug 2007 19:33:23 +0200

This patch utilizes remap_4k_pfn() as introduced by Paul M.
(for details see http://patchwork.ozlabs.org/linuxppc/patch?id=10281)
to map the ehca cq and qp firmware context (4k) to user space if the kernel
page size is 64k. Paul's patch also explains why this is required.
In addition, the kernel page offset of the firmware context needs
to be set in the cq and qp response blocks so that user space can assemble
the proper virtual address to use.
An appropriate patch for libehca will follow for ofed-1.3.

Signed-off-by: Stefan Roscher [EMAIL PROTECTED]
---
 drivers/infiniband/hw/ehca/ehca_classes.h |4 +++-
 drivers/infiniband/hw/ehca/ehca_cq.c  |2 ++
 drivers/infiniband/hw/ehca/ehca_qp.c  |2 ++
 drivers/infiniband/hw/ehca/ehca_uverbs.c  |6 +++---
 4 files changed, 10 insertions(+), 4 deletions(-)

diff --git a/drivers/infiniband/hw/ehca/ehca_classes.h b/drivers/infiniband/hw/ehca/ehca_classes.h
index b5e9603..206d4eb 100644
--- a/drivers/infiniband/hw/ehca/ehca_classes.h
+++ b/drivers/infiniband/hw/ehca/ehca_classes.h
@@ -337,6 +337,8 @@ struct ehca_create_cq_resp {
u32 cq_number;
u32 token;
struct ipzu_queue_resp ipz_queue;
+   u32 fw_handle_ofs;
+   u32 dummy;
 };
 
 struct ehca_create_qp_resp {
@@ -347,7 +349,7 @@ struct ehca_create_qp_resp {
u32 qkey;
/* qp_num assigned by ehca: sqp0/1 may have got different numbers */
u32 real_qp_num;
-   u32 dummy; /* padding for 8 byte alignment */
+   u32 fw_handle_ofs;
struct ipzu_queue_resp ipz_squeue;
struct ipzu_queue_resp ipz_rqueue;
 };
diff --git a/drivers/infiniband/hw/ehca/ehca_cq.c b/drivers/infiniband/hw/ehca/ehca_cq.c
index c661939..0ac5a97 100644
--- a/drivers/infiniband/hw/ehca/ehca_cq.c
+++ b/drivers/infiniband/hw/ehca/ehca_cq.c
@@ -281,6 +281,8 @@ struct ib_cq *ehca_create_cq(struct ib_device *device, int cqe, int comp_vector,
resp.ipz_queue.queue_length = ipz_queue->queue_length;
resp.ipz_queue.pagesize = ipz_queue->pagesize;
resp.ipz_queue.toggle_state = ipz_queue->toggle_state;
+   resp.fw_handle_ofs = (u32)
+   (my_cq->galpas.user.fw_handle & (PAGE_SIZE - 1));
if (ib_copy_to_udata(udata, &resp, sizeof(resp))) {
ehca_err(device, "Copy to udata failed.");
goto create_cq_exit4;
diff --git a/drivers/infiniband/hw/ehca/ehca_qp.c b/drivers/infiniband/hw/ehca/ehca_qp.c
index f26801b..d8c1c22 100644
--- a/drivers/infiniband/hw/ehca/ehca_qp.c
+++ b/drivers/infiniband/hw/ehca/ehca_qp.c
@@ -752,6 +752,8 @@ static struct ehca_qp *internal_create_qp(
queue2resp(&resp.ipz_squeue, &my_qp->ipz_squeue);
if (HAS_RQ(my_qp))
queue2resp(&resp.ipz_rqueue, &my_qp->ipz_rqueue);
+   resp.fw_handle_ofs = (u32)
+   (my_qp->galpas.user.fw_handle & (PAGE_SIZE - 1));
 
if (ib_copy_to_udata(udata, &resp, sizeof resp)) {
ehca_err(pd->device, "Copy to udata failed");
diff --git a/drivers/infiniband/hw/ehca/ehca_uverbs.c b/drivers/infiniband/hw/ehca/ehca_uverbs.c
index 3340f49..84a16bc 100644
--- a/drivers/infiniband/hw/ehca/ehca_uverbs.c
+++ b/drivers/infiniband/hw/ehca/ehca_uverbs.c
@@ -109,7 +109,7 @@ static int ehca_mmap_fw(struct vm_area_struct *vma, struct h_galpas *galpas,
u64 vsize, physical;
 
vsize = vma->vm_end - vma->vm_start;
-   if (vsize != EHCA_PAGESIZE) {
+   if (vsize < EHCA_PAGESIZE) {
ehca_gen_err("invalid vsize=%lx", vma->vm_end - vma->vm_start);
return -EINVAL;
}
@@ -118,8 +118,8 @@ static int ehca_mmap_fw(struct vm_area_struct *vma, struct h_galpas *galpas,
vma->vm_page_prot = pgprot_noncached(vma->vm_page_prot);
ehca_gen_dbg("vsize=%lx physical=%lx", vsize, physical);
/* VM_IO | VM_RESERVED are set by remap_pfn_range() */
-   ret = remap_pfn_range(vma, vma->vm_start, physical >> PAGE_SHIFT,
- vsize, vma->vm_page_prot);
+   ret = remap_4k_pfn(vma, vma->vm_start, physical >> EHCA_PAGESHIFT,
+  vma->vm_page_prot);
if (unlikely(ret)) {
ehca_gen_err("remap_pfn_range() failed ret=%x", ret);
return -ENOMEM;
-- 
1.5.2
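
For illustration only, here is a minimal sketch of the address arithmetic the commit message describes, assuming 64k kernel pages and a 4k firmware context. The DEMO_ constants and both helper names are hypothetical; they are not part of this patch or of libehca, and the real user-space code may differ.

```c
#include <assert.h>
#include <stdint.h>

/* Assumed sizes for this sketch: 64k kernel pages, 4k firmware context. */
#define DEMO_KERNEL_PAGE_SIZE 0x10000u
#define DEMO_EHCA_PAGE_SIZE   0x1000u

/*
 * Kernel side: offset of the firmware context within its kernel page,
 * mirroring "fw_handle & (PAGE_SIZE - 1)" from the patch.
 */
static inline uint32_t demo_fw_handle_ofs(uint64_t fw_handle)
{
	return (uint32_t)(fw_handle & (DEMO_KERNEL_PAGE_SIZE - 1));
}

/*
 * User side (hypothetical helper): the mapping is page aligned, so the
 * firmware context is assumed to sit at the mapping base plus the offset
 * reported in the response block.
 */
static inline uintptr_t demo_fw_ctx_addr(uintptr_t mmap_base,
					 uint32_t fw_handle_ofs)
{
	/* The mapping base must be aligned to the kernel page size. */
	assert((mmap_base & (DEMO_KERNEL_PAGE_SIZE - 1)) == 0);
	/* The 4k context must fit inside the 64k page. */
	assert(fw_handle_ofs <= DEMO_KERNEL_PAGE_SIZE - DEMO_EHCA_PAGE_SIZE);
	return mmap_base + fw_handle_ofs;
}
```

With 4k kernel pages the offset is always zero, which matches the old behaviour where the mapping base was the firmware context address.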




[PATCH 7/7] IB/ehca: Prevent overwriting QP init attributes given by caller

2007-08-08 Thread Stefan Roscher
Signed-off-by: Stefan Roscher [EMAIL PROTECTED]
---
 drivers/infiniband/hw/ehca/ehca_qp.c |   14 +++++---------
 1 files changed, 5 insertions(+), 9 deletions(-)

diff --git a/drivers/infiniband/hw/ehca/ehca_qp.c b/drivers/infiniband/hw/ehca/ehca_qp.c
index d8c1c22..6efda3d 100644
--- a/drivers/infiniband/hw/ehca/ehca_qp.c
+++ b/drivers/infiniband/hw/ehca/ehca_qp.c
@@ -709,12 +709,12 @@ static struct ehca_qp *internal_create_qp(
my_qp->ib_qp.event_handler = init_attr->event_handler;
}
 
-   init_attr->cap.max_inline_data = 0; /* not supported yet */
-   init_attr->cap.max_recv_sge = parms.rqueue.act_nr_sges;
-   init_attr->cap.max_recv_wr = parms.rqueue.act_nr_wqes;
-   init_attr->cap.max_send_sge = parms.squeue.act_nr_sges;
-   init_attr->cap.max_send_wr = parms.squeue.act_nr_wqes;
my_qp->init_attr = *init_attr;
+   my_qp->init_attr.cap.max_inline_data = 0; /* not supported yet */
+   my_qp->init_attr.cap.max_recv_sge = parms.rqueue.act_nr_sges;
+   my_qp->init_attr.cap.max_recv_wr = parms.rqueue.act_nr_wqes;
+   my_qp->init_attr.cap.max_send_sge = parms.squeue.act_nr_sges;
+   my_qp->init_attr.cap.max_send_wr = parms.squeue.act_nr_wqes;
 
/* NOTE: define_apq0() not supported yet */
if (qp_type == IB_QPT_GSI) {
@@ -825,10 +825,6 @@ struct ib_srq *ehca_create_srq(struct ib_pd *pd,
if (IS_ERR(my_qp))
return (struct ib_srq *)my_qp;
 
-   /* copy back return values */
-   srq_init_attr->attr.max_wr = qp_init_attr.cap.max_recv_wr;
-   srq_init_attr->attr.max_sge = qp_init_attr.cap.max_recv_sge;
-
/* drive SRQ into RTR state */
mqpcb = ehca_alloc_fw_ctrlblock(GFP_KERNEL);
if (!mqpcb) {
-- 
1.5.2
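
As a rough sketch of the pattern this patch moves to: copy the caller's attributes first, then record the actual values only in the driver's private copy, leaving the caller's structure untouched. The demo_ types and function below are hypothetical stand-ins, not the real ib_qp_init_attr or ehca_qp definitions.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Simplified, hypothetical stand-ins for the verbs and driver structures. */
struct demo_qp_cap {
	uint32_t max_send_wr;
	uint32_t max_recv_wr;
};

struct demo_qp_init_attr {
	struct demo_qp_cap cap;
};

struct demo_qp {
	struct demo_qp_init_attr init_attr; /* driver-private copy */
};

/*
 * Copy the requested attributes into the driver's QP, then overwrite the
 * private copy with the values actually granted. The caller's structure
 * is const here, so it cannot be modified by accident.
 */
static inline void demo_record_actual_caps(struct demo_qp *qp,
					   const struct demo_qp_init_attr *requested,
					   uint32_t act_send_wr,
					   uint32_t act_recv_wr)
{
	assert(qp != NULL && requested != NULL);
	qp->init_attr = *requested;
	qp->init_attr.cap.max_send_wr = act_send_wr;
	qp->init_attr.cap.max_recv_wr = act_recv_wr;
}
```

Taking the request through a const pointer is a design choice of this sketch only; the patch itself keeps the existing function signatures and simply changes which structure is written to.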

