[ewg] [PATCH v1] mlx4_ib: Optimize hugetlb pages support

2009-01-22 Thread Eli Cohen
Since Linux may not merge adjacent pages into a single scatter entry through
calls to dma_map_sg(), we check the special case of hugetlb pages, which are
likely to be mapped to contiguous DMA addresses, and take advantage of this
when they are. This results in a significantly lower number of MTT segments
used for registering hugetlb memory regions.

Signed-off-by: Eli Cohen e...@mellanox.co.il
---

In this version I also took care of the case where the kernel is
compiled without hugetlb support.
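
To put a rough number on the saving described in the commit message, here is a
minimal user-space sketch (not part of the patch). It assumes a 4 KiB base page
and a 2 MiB huge page, which is common on x86-64 but varies by architecture,
and it assumes one MTT entry per registered page. The helper name
pages_spanned is hypothetical.

```c
#include <assert.h>
#include <stdint.h>

/*
 * Hypothetical helper: number of pages of size (1 << shift) touched by the
 * byte range [start, start + len). Returns 0 for an empty range.
 */
static uint64_t pages_spanned(uint64_t start, uint64_t len, unsigned shift)
{
	uint64_t first, last;

	assert(shift < 64);
	if (len == 0)
		return 0;

	first = start >> shift;             /* index of first page touched */
	last = (start + len - 1) >> shift;  /* index of last page touched */
	return last - first + 1;
}
```

For a huge-page-aligned 1 GiB region, pages_spanned(0, 1ULL << 30, 12) gives
262144 entries with 4 KiB pages, while pages_spanned(0, 1ULL << 30, 21) gives
512 entries with 2 MiB pages.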


 drivers/infiniband/hw/mlx4/mr.c |   86 ++-
 1 files changed, 75 insertions(+), 11 deletions(-)

diff --git a/drivers/infiniband/hw/mlx4/mr.c b/drivers/infiniband/hw/mlx4/mr.c
index 8e4d26d..4c7a5bf 100644
--- a/drivers/infiniband/hw/mlx4/mr.c
+++ b/drivers/infiniband/hw/mlx4/mr.c
@@ -119,6 +119,68 @@ out:
 	return err;
 }
 
+static int handle_hugetlb_user_mr(struct ib_pd *pd, struct mlx4_ib_mr *mr,
+				  u64 virt_addr, int access_flags)
+{
+#ifdef CONFIG_HUGETLB_PAGE
+	struct mlx4_ib_dev *dev = to_mdev(pd->device);
+	struct ib_umem_chunk *chunk;
+	unsigned dsize;
+	dma_addr_t daddr;
+	unsigned uninitialized_var(cur_size);
+	dma_addr_t uninitialized_var(cur_addr);
+	int restart;
+	int n;
+	struct ib_umem  *umem = mr->umem;
+	u64 *arr;
+	int err = 0;
+	int i;
+	int j = 0;
+
+	n = PAGE_ALIGN(umem->length + umem->offset) >> HPAGE_SHIFT;
+	arr = kmalloc(n * sizeof *arr, GFP_KERNEL);
+	if (!arr)
+		return -ENOMEM;
+
+	restart = 1;
+	list_for_each_entry(chunk, &umem->chunk_list, list)
+		for (i = 0; i < chunk->nmap; ++i) {
+			daddr = sg_dma_address(&chunk->page_list[i]);
+			dsize = sg_dma_len(&chunk->page_list[i]);
+			if (restart) {
+				cur_addr = daddr;
+				cur_size = dsize;
+				restart = 0;
+			} else if (cur_addr + cur_size != daddr) {
+				err = -EINVAL;
+				goto out;
+			} else
+				cur_size += dsize;
+
+			if (cur_size > HPAGE_SIZE) {
+				err = -EINVAL;
+				goto out;
+			} else if (cur_size == HPAGE_SIZE) {
+				restart = 1;
+				arr[j++] = cur_addr;
+			}
+		}
+
+	err = mlx4_mr_alloc(dev->dev, to_mpd(pd)->pdn, virt_addr, umem->length,
+			    convert_access(access_flags), n, HPAGE_SHIFT, &mr->mmr);
+	if (err)
+		goto out;
+
+	err = mlx4_write_mtt(dev->dev, &mr->mmr.mtt, 0, n, arr);
+
+out:
+	kfree(arr);
+	return err;
+#else
+	return -ENOSYS;
+#endif
+}
+
 struct ib_mr *mlx4_ib_reg_user_mr(struct ib_pd *pd, u64 start, u64 length,
  u64 virt_addr, int access_flags,
  struct ib_udata *udata)
@@ -140,17 +202,19 @@ struct ib_mr *mlx4_ib_reg_user_mr(struct ib_pd *pd, u64 start, u64 length,
 		goto err_free;
 	}
 
-	n = ib_umem_page_count(mr->umem);
-	shift = ilog2(mr->umem->page_size);
-
-	err = mlx4_mr_alloc(dev->dev, to_mpd(pd)->pdn, virt_addr, length,
-			    convert_access(access_flags), n, shift, &mr->mmr);
-	if (err)
-		goto err_umem;
-
-	err = mlx4_ib_umem_write_mtt(dev, &mr->mmr.mtt, mr->umem);
-	if (err)
-		goto err_mr;
+	if (!mr->umem->hugetlb || handle_hugetlb_user_mr(pd, mr, virt_addr, access_flags)) {
+		n = ib_umem_page_count(mr->umem);
+		shift = ilog2(mr->umem->page_size);
+
+		err = mlx4_mr_alloc(dev->dev, to_mpd(pd)->pdn, virt_addr, length,
+				    convert_access(access_flags), n, shift, &mr->mmr);
+		if (err)
+			goto err_umem;
+
+		err = mlx4_ib_umem_write_mtt(dev, &mr->mmr.mtt, mr->umem);
+		if (err)
+			goto err_mr;
+	}
 
 	err = mlx4_mr_enable(dev->dev, &mr->mmr);
 	if (err)
-- 
1.6.0.5

___
ewg mailing list
ewg@lists.openfabrics.org
http://lists.openfabrics.org/cgi-bin/mailman/listinfo/ewg


[ewg] RE: RHEL 5.3 and OFED 1.4.x

2009-01-22 Thread Woodruff, Robert J
In the last EWG meeting, we discussed waiting a month or so and seeing what
kind of bugs were reported against 1.4 to determine if a 1.4.1 release was
needed.
 



From: general-boun...@lists.openfabrics.org 
[mailto:general-boun...@lists.openfabrics.org] On Behalf Of John Russo
Sent: Thursday, January 22, 2009 12:37 PM
To: gene...@lists.openfabrics.org
Subject: [ofa-general] RHEL 5.3 and OFED 1.4.x



Does the release of RHEL 5.3 create any additional justification for a 
maintenance release of OFED (1.4.1) to be generated?  I am already hearing 
requests for an OFED release that will support it.

 

John Russo

QLogic



[ewg] RE: RHEL 5.3 and OFED 1.4.x

2009-01-22 Thread John Russo
I understand but I think that this is another consideration that should be 
factored in.  Even if there are no critical PRs to fix, the introduction of 
RHEL 5.3 (along with less critical PRs) may be enough justification.

I simply want to plant the seed in everyone's mind before our next meeting.

Thanks

-----Original Message-----
From: Woodruff, Robert J [mailto:robert.j.woodr...@intel.com] 
Sent: Thursday, January 22, 2009 3:44 PM
To: John Russo; gene...@lists.openfabrics.org
Cc: ewg@lists.openfabrics.org
Subject: RE: RHEL 5.3 and OFED 1.4.x

In the last EWG meeting, we discussed waiting a month or so and seeing what 
kind of bugs 
were reported against 1.4 to determine if a 1.4.1 release was needed.
 





Re: [ewg] RE: RHEL 5.3 and OFED 1.4.x

2009-01-22 Thread Steve Wise

I think releasing OMPI-1.3 with iWARP support is also good justification.

And there are RDS issues with ofed-1.4 even over IB that I think will 
add to justification.



John Russo wrote:

I understand but I think that this is another consideration that should be factored in.  
Even if there are no critical PRs to fix, the introduction of RHEL 5.3 (along 
with less critical PRs) may be enough justification.

I simply want to plant the seed in everyone's mind before our next meeting.

Thanks



RE: [ewg] RE: RHEL 5.3 and OFED 1.4.x

2009-01-22 Thread Woodruff, Robert J

I think that we need to discuss this in the EWG meeting.
In the past I think that we have agreed to only do bug fixes
in point releases and not add major new features.
If we do want to include the new MPI, then perhaps we should call
it 1.5 and pull in the schedule for 1.5.   Just a thought.

woody


-----Original Message-----
From: Steve Wise [mailto:sw...@opengridcomputing.com]
Sent: Thursday, January 22, 2009 1:46 PM
To: John Russo
Cc: Woodruff, Robert J; gene...@lists.openfabrics.org; ewg@lists.openfabrics.org
Subject: Re: [ewg] RE: RHEL 5.3 and OFED 1.4.x

I think releasing OMPI-1.3 with iWARP support is also good justification.

And there are RDS issues with ofed-1.4 even over IB that I think will
add to justification.




Re: [ewg] RE: RHEL 5.3 and OFED 1.4.x

2009-01-22 Thread Steve Wise


I understand the desire to not release new features in a point release,
but at the same time, these features are ready or near ready now.  And
prior features have definitely been released in point releases
(ConnectX, for example).  Another key point is that these features do not
need the kernel rebase that will happen with ofed-1.5, which will take
months...


Just more thoughts.  :)

Steve.


Woodruff, Robert J wrote:

I think that we need to discuss this in the EWG meeting.
In the past I think that we have agreed to only do bug fixes
in point release and not add major new features.
If we do want to include the new MPI, then perhaps we should call
it 1.5 and pull in the schedule for 1.5.   Just a thought.

woody




Re: [ewg] RE: RHEL 5.3 and OFED 1.4.x

2009-01-22 Thread Jeff Squyres
Also, FWIW, it has been discussed (and agreed, I thought) to include  
OMPI v1.3 in a 1.4.x release.



On Jan 22, 2009, at 5:07 PM, Steve Wise wrote:



I understand the desire to not release new features in a point  
release, but at the same time, these features are ready or near  
ready now.  And prior features have definitely been released in  
point releases.  (connectX for example).  Another key point is that  
these features do not need the kernel rebase that will happen with  
ofed-1.5, which will take months...


Just more thoughts.  :)

Steve.





--
Jeff Squyres
Cisco Systems



RE: [ewg] RE: RHEL 5.3 and OFED 1.4.x

2009-01-22 Thread Woodruff, Robert J
Personally I do not have a problem with including it, since MPI is
an isolated component and does not affect the core stack,
but I thought that we had discussed in Sonoma last year
not including major new features in point releases to
reduce the QA that is needed. And, in general, I think that
is the way that kernel.org works: point releases are just for
bug fixes.

In any case, let's discuss it again in the EWG on Monday.

woody
 

-----Original Message-----
From: Jeff Squyres [mailto:jsquy...@cisco.com] 
Sent: Thursday, January 22, 2009 2:17 PM
To: Steve Wise
Cc: Woodruff, Robert J; gene...@lists.openfabrics.org; ewg@lists.openfabrics.org
Subject: Re: [ewg] RE: RHEL 5.3 and OFED 1.4.x

Also, FWIW, it has been discussed (and agreed, I thought) to include  
OMPI v1.3 in a 1.4.x release.




[ewg] Re: [PATCH v1] mlx4_ib: Optimize hugetlb pages support

2009-01-22 Thread Roland Dreier
OK, looks better.  However the patch had a bunch of whitespace problems
(run checkpatch.pl to see them).  Also:

  +static int handle_hugetlb_user_mr(struct ib_pd *pd, struct mlx4_ib_mr *mr,
  +				  u64 virt_addr, int access_flags)
  +{
  +#ifdef CONFIG_HUGETLB_PAGE
  +	struct mlx4_ib_dev *dev = to_mdev(pd->device);
  +	struct ib_umem_chunk *chunk;
  +	unsigned dsize;
  +	dma_addr_t daddr;
  +	unsigned uninitialized_var(cur_size);
  +	dma_addr_t uninitialized_var(cur_addr);
  +	int restart;
  +	int n;
  +	struct ib_umem  *umem = mr->umem;
  +	u64 *arr;
  +	int err = 0;
  +	int i;
  +	int j = 0;
  +
  +	n = PAGE_ALIGN(umem->length + umem->offset) >> HPAGE_SHIFT;

seems this might underestimate by 1 if the region doesn't start/end on a
huge-page aligned boundary (but we would still want to use big pages to
register it).
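
A small user-space sketch of the undercount mentioned above (not kernel code).
It assumes a 4 KiB base page and a 2 MiB huge page, and it assumes that
umem->offset is the offset of the start address within its first base page.
The SKETCH_* constants and both function names are hypothetical.

```c
#include <assert.h>
#include <stdint.h>

#define SKETCH_PAGE_SIZE   (1ULL << 12)  /* assumed 4 KiB base page */
#define SKETCH_HPAGE_SHIFT 21u           /* assumed 2 MiB huge page */

/* Entry count as the posted patch computes it. */
static uint64_t n_as_posted(uint64_t start, uint64_t length)
{
	/* Assumption: offset within the first base page, like umem->offset. */
	uint64_t offset = start & (SKETCH_PAGE_SIZE - 1);
	/* Same rounding as PAGE_ALIGN(length + offset). */
	uint64_t aligned = (length + offset + SKETCH_PAGE_SIZE - 1) &
			   ~(SKETCH_PAGE_SIZE - 1);

	return aligned >> SKETCH_HPAGE_SHIFT;
}

/* Number of huge pages the range [start, start + length) actually touches. */
static uint64_t n_spanned(uint64_t start, uint64_t length)
{
	uint64_t first, last;

	assert(length > 0);
	first = start >> SKETCH_HPAGE_SHIFT;
	last = (start + length - 1) >> SKETCH_HPAGE_SHIFT;
	return last - first + 1;
}
```

With a 2 MiB region that starts 1 MiB into a huge page, n_as_posted() returns 1
while n_spanned() returns 2. For a huge-page-aligned 4 MiB region both return 2.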

  +	arr = kmalloc(n * sizeof *arr, GFP_KERNEL);
  +	if (!arr)
  +		return -ENOMEM;
  +
  +	restart = 1;
  +	list_for_each_entry(chunk, &umem->chunk_list, list)
  +		for (i = 0; i < chunk->nmap; ++i) {
  +			daddr = sg_dma_address(&chunk->page_list[i]);
  +			dsize = sg_dma_len(&chunk->page_list[i]);
  +			if (restart) {
  +				cur_addr = daddr;
  +				cur_size = dsize;
  +				restart = 0;
  +			} else if (cur_addr + cur_size != daddr) {
  +				err = -EINVAL;
  +				goto out;
  +			} else
  +				cur_size += dsize;
  +
  +			if (cur_size > HPAGE_SIZE) {
  +				err = -EINVAL;
  +				goto out;
  +			} else if (cur_size == HPAGE_SIZE) {
  +				restart = 1;
  +				arr[j++] = cur_addr;
  +			}
  +		}

I think we could avoid the uninitialized_var() stuff and having restart
at all by just doing cur_size = 0 at the start of the loop, and then
instead of if (restart) just test if cur_size is 0.

 - R.
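
The simplification suggested above could look roughly like the following
user-space sketch, which is an illustration and not the code that was
eventually merged. The struct seg type, the function name, and the out_cap
bound are hypothetical stand-ins for the scatterlist entries and the arr
buffer in the patch. Rejecting a trailing partial huge page is an added
assumption; the posted patch does not check for it.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical stand-in for one DMA-mapped scatterlist entry. */
struct seg {
	uint64_t addr;
	uint32_t len;
};

/*
 * Coalesce DMA segments into huge-page start addresses. cur_size == 0 marks
 * "start of a new huge page", so no separate restart flag is needed.
 * Returns the number of addresses written to out, or -1 if the segments are
 * not contiguous, overshoot a huge page, or leave a partial page at the end.
 */
static int coalesce_hugepages(const struct seg *segs, size_t nsegs,
			      uint64_t hpage_size, uint64_t *out,
			      size_t out_cap)
{
	uint64_t cur_addr = 0;
	uint64_t cur_size = 0;
	size_t j = 0;
	size_t i;

	assert(hpage_size > 0);

	for (i = 0; i < nsegs; ++i) {
		if (cur_size == 0)
			cur_addr = segs[i].addr;          /* new huge page */
		else if (cur_addr + cur_size != segs[i].addr)
			return -1;                        /* gap in DMA space */

		cur_size += segs[i].len;

		if (cur_size > hpage_size)
			return -1;                        /* crosses a huge page */
		if (cur_size == hpage_size) {
			if (j == out_cap)
				return -1;                /* output buffer full */
			out[j++] = cur_addr;
			cur_size = 0;                     /* restart on next entry */
		}
	}

	return cur_size == 0 ? (int)j : -1;
}
```

Because cur_addr is only read after cur_size has become non-zero, it is always
assigned before use, which is what makes uninitialized_var() unnecessary.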