[Ocfs2-devel] [RFC] ocfs2: Idea to make ocfs2_search_chain high efficiency

2015-08-24 Thread Norton.Zhu
In ocfs2_search_chain, I found the search for an available gd is inefficient
in some circumstances:
1) The LUN has a great many gds, so the search reads lots of unavailable gds
(free bits is zero);
2) Not too many gds, but the available gds are scattered among the
unavailable ones (fragmentation);

So I have an idea to optimize the search method:
1) Use the reserved member in ocfs2_group_desc to build an available chain
list (every gd in the list is available, i.e. its free bits are more than
zero);
2) At the beginning, the available chain list is the same as the original
chain list;
3) During allocation, search for a gd in the available chain list only;
4) After each allocation, if a gd's free bits drop to zero, remove it from
the available chain list;
5) After each reclaim, if a gd's free bits change from zero to positive,
append it to the head of the available chain list;

Once started with the basics outlined above, no unavailable gd will ever be
read (see the sketch below).
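
A minimal userspace-style sketch of that bookkeeping, purely illustrative:
the struct, the field names (free_bits, avail_next) and the helpers are
hypothetical stand-ins, not the real on-disk ocfs2_group_desc or chain
record layout.

#include <assert.h>
#include <stddef.h>

struct gd {
	unsigned int free_bits;
	struct gd *avail_next;		/* link kept in a reserved field */
};

/* 5) reclaim took a gd from zero to positive free bits: push to head */
static void avail_push(struct gd **head, struct gd *gd)
{
	gd->avail_next = *head;
	*head = gd;
}

/* 4) an allocation emptied a gd: unlink it from the available list */
static void avail_remove(struct gd **head, struct gd *gd)
{
	struct gd **pp = head;

	while (*pp && *pp != gd)
		pp = &(*pp)->avail_next;	/* walk to the predecessor link */
	if (*pp)
		*pp = gd->avail_next;
	gd->avail_next = NULL;
}

/* 3) an allocation only ever looks at the head of the available list */
static struct gd *avail_pick(struct gd *head)
{
	assert(head == NULL || head->free_bits > 0);
	return head;		/* an unavailable gd is never visited */
}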

Does anyone have better ideas or advice?


Re: [Ocfs2-devel] [RFC] ocfs2: Idea to make ocfs2_search_chain high efficiency

2015-08-24 Thread Srinivas Eeda
Hi Norton,

While the localalloc bmp is enabled, the chances of a particular gd
ending up with zero free bits are very minimal. The local alloc bmp will
pick the next gd once the number of free bits falls below the localalloc
bmp size, so the next gd is picked while the current gd still has free
space.

Having said that, I understand the problem you are describing. Since we
only move the gd with more free bits ahead of the previous unusable gd,
we end up rescanning unusable gds whenever the newly picked gd becomes
unusable. This is inefficient and causes performance problems when the
fs gets fragmented.

I am not sure if we will be allowed to use bg_reserved2 ... but if we
are, then we could use it to remember the next gd we should look at.

For example, assume there are 'N' GDs and we are currently seeing that
GD10 has the most free space ... so the order of GDs is

GD10, GD1, GD2, ... GDN.

Now if we end up using most of the space in GD10, according to the current
logic we re-read GD1, GD2 .. GD9, then read GD11 and pick it (assuming it
has free space). But if instead GD10 remembered GD11 as the next one to
look at ... then we could skip reading GD1..GD9 entirely (rough sketch
below).

Thanks,
--Srini


On 08/24/2015 05:30 AM, Norton.Zhu wrote:
 In ocfs2_search_chain, I found the search for an available gd is
 inefficient in some circumstances:
 1) The LUN has a great many gds, so the search reads lots of unavailable
 gds (free bits is zero);
 2) Not too many gds, but the available gds are scattered among the
 unavailable ones (fragmentation);

 So I have an idea to optimize the search method:
 1) Use the reserved member in ocfs2_group_desc to build an available
 chain list (every gd in the list is available, i.e. its free bits are
 more than zero);
 2) At the beginning, the available chain list is the same as the original
 chain list;
 3) During allocation, search for a gd in the available chain list only;
 4) After each allocation, if a gd's free bits drop to zero, remove it
 from the available chain list;
 5) After each reclaim, if a gd's free bits change from zero to positive,
 append it to the head of the available chain list;

 Once started with the basics outlined above, no unavailable gd will ever
 be read.

 Does anyone have better ideas or advice?


[Ocfs2-devel] [PATCH] ocfs2: direct write will call ocfs2_rw_unlock() twice when doing aio+dio

2015-08-24 Thread Ryan Ding
Orabug: 21612107

ocfs2_file_write_iter() uses the wrong return value. This causes
ocfs2_rw_unlock() to be called both in write_iter and end_io, triggering a
BUG_ON.

This issue has existed since commit 7da839c475894ea872ec909a5d2e83dddccff5be.

Signed-off-by: Ryan Ding ryan.d...@oracle.com
---
 fs/ocfs2/file.c |   28 ++--
 1 files changed, 14 insertions(+), 14 deletions(-)

diff --git a/fs/ocfs2/file.c b/fs/ocfs2/file.c
index 719f7f4..33efa33 100644
--- a/fs/ocfs2/file.c
+++ b/fs/ocfs2/file.c
@@ -2372,6 +2372,20 @@ relock:
/* buffered aio wouldn't have proper lock coverage today */
	BUG_ON(written == -EIOCBQUEUED && !(iocb->ki_flags & IOCB_DIRECT));
 
+   /*
+* deep in g_f_a_w_n()->ocfs2_direct_IO we pass in a ocfs2_dio_end_io
+* function pointer which is called when o_direct io completes so that
+* it can unlock our rw lock.
+* Unfortunately there are error cases which call end_io and others
+* that don't.  so we don't have to unlock the rw_lock if either an
+* async dio is going to do it in the future or an end_io after an
+* error has already done it.
+*/
+   if ((written == -EIOCBQUEUED) || (!ocfs2_iocb_is_rw_locked(iocb))) {
+   rw_level = -1;
+   unaligned_dio = 0;
+   }
+
	if (unlikely(written <= 0))
goto no_sync;
 
@@ -2396,20 +2410,6 @@ relock:
}
 
 no_sync:
-   /*
-* deep in g_f_a_w_n()->ocfs2_direct_IO we pass in a ocfs2_dio_end_io
-* function pointer which is called when o_direct io completes so that
-* it can unlock our rw lock.
-* Unfortunately there are error cases which call end_io and others
-* that don't.  so we don't have to unlock the rw_lock if either an
-* async dio is going to do it in the future or an end_io after an
-* error has already done it.
-*/
-   if ((ret == -EIOCBQUEUED) || (!ocfs2_iocb_is_rw_locked(iocb))) {
-   rw_level = -1;
-   unaligned_dio = 0;
-   }
-
if (unaligned_dio) {
ocfs2_iocb_clear_unaligned_aio(iocb);
	mutex_unlock(&OCFS2_I(inode)->ip_unaligned_aio);
-- 
1.7.1


Re: [Ocfs2-devel] [PATCH] ocfs2: direct write will call ocfs2_rw_unlock() twice when doing aio+dio

2015-08-24 Thread Junxiao Bi
On 08/24/2015 03:23 PM, Ryan Ding wrote:
 Orabug: 21612107
 
 ocfs2_file_write_iter() uses the wrong return value. This causes
 ocfs2_rw_unlock() to be called both in write_iter and end_io, triggering
 a BUG_ON.
 
 This issue has existed since commit 7da839c475894ea872ec909a5d2e83dddccff5be.
Better to say:
This issue was introduced by commit 7da839c47589 ("ocfs2: use
__generic_file_write_iter()"), or checkpatch will report a style error.
Everything else looks good.

Reviewed-by: Junxiao Bi junxiao...@oracle.com

Thanks,
Junxiao.
 
 Signed-off-by: Ryan Ding ryan.d...@oracle.com
 ---
  fs/ocfs2/file.c |   28 ++--
  1 files changed, 14 insertions(+), 14 deletions(-)
 
 diff --git a/fs/ocfs2/file.c b/fs/ocfs2/file.c
 index 719f7f4..33efa33 100644
 --- a/fs/ocfs2/file.c
 +++ b/fs/ocfs2/file.c
 @@ -2372,6 +2372,20 @@ relock:
   /* buffered aio wouldn't have proper lock coverage today */
 	BUG_ON(written == -EIOCBQUEUED && !(iocb->ki_flags & IOCB_DIRECT));
  
 + /*
 +  * deep in g_f_a_w_n()->ocfs2_direct_IO we pass in a ocfs2_dio_end_io
 +  * function pointer which is called when o_direct io completes so that
 +  * it can unlock our rw lock.
 +  * Unfortunately there are error cases which call end_io and others
 +  * that don't.  so we don't have to unlock the rw_lock if either an
 +  * async dio is going to do it in the future or an end_io after an
 +  * error has already done it.
 +  */
 + if ((written == -EIOCBQUEUED) || (!ocfs2_iocb_is_rw_locked(iocb))) {
 + rw_level = -1;
 + unaligned_dio = 0;
 + }
 +
 	if (unlikely(written <= 0))
   goto no_sync;
  
 @@ -2396,20 +2410,6 @@ relock:
   }
  
  no_sync:
 - /*
 -  * deep in g_f_a_w_n()->ocfs2_direct_IO we pass in a ocfs2_dio_end_io
 -  * function pointer which is called when o_direct io completes so that
 -  * it can unlock our rw lock.
 -  * Unfortunately there are error cases which call end_io and others
 -  * that don't.  so we don't have to unlock the rw_lock if either an
 -  * async dio is going to do it in the future or an end_io after an
 -  * error has already done it.
 -  */
 - if ((ret == -EIOCBQUEUED) || (!ocfs2_iocb_is_rw_locked(iocb))) {
 - rw_level = -1;
 - unaligned_dio = 0;
 - }
 -
   if (unaligned_dio) {
   ocfs2_iocb_clear_unaligned_aio(iocb);
 	mutex_unlock(&OCFS2_I(inode)->ip_unaligned_aio);
 


Re: [Ocfs2-devel] [PATCH] ocfs2: direct write will call ocfs2_rw_unlock() twice when doing aio+dio

2015-08-24 Thread Andrew Morton
On Mon, 24 Aug 2015 15:39:04 +0800 Junxiao Bi junxiao...@oracle.com wrote:

 On 08/24/2015 03:23 PM, Ryan Ding wrote:
  Orabug: 21612107
  
  ocfs2_file_write_iter() uses the wrong return value. This causes
  ocfs2_rw_unlock() to be called both in write_iter and end_io,
  triggering a BUG_ON.
  
  This issue has existed since commit 7da839c475894ea872ec909a5d2e83dddccff5be.
 Better to say:
 This issue was introduced by commit 7da839c47589 ("ocfs2: use
 __generic_file_write_iter()"), or checkpatch will report a style error.
 Everything else looks good.
 
 Reviewed-by: Junxiao Bi junxiao...@oracle.com
 

Also we've recently adopted the convention of using the Fixes: tag:
Fixes: 7da839c47589 ("ocfs2: use __generic_file_write_iter()")

And as the patch fixes a regression we should add the Cc:
sta...@vger.kernel.org tag.

And we should cc the author of the bad patch for review (hi, Al).

I've made these changes and I'll get this patch into Linus this week,
for 4.2.


From: Ryan Ding ryan.d...@oracle.com
Subject: ocfs2: direct write will call ocfs2_rw_unlock() twice when doing aio+dio

Orabug: 21612107

ocfs2_file_write_iter() is using the wrong return value ('written').  This
will cause ocfs2_rw_unlock() to be called both in write_iter and end_io,
triggering a BUG_ON.

This issue was introduced by commit 7da839c47589 ("ocfs2: use
__generic_file_write_iter()").

Fixes: 7da839c47589 ("ocfs2: use __generic_file_write_iter()")
Signed-off-by: Ryan Ding ryan.d...@oracle.com
Reviewed-by: Junxiao Bi junxiao...@oracle.com
Cc: Al Viro v...@zeniv.linux.org.uk
Cc: Mark Fasheh mfas...@suse.com
Cc: Joel Becker jl...@evilplan.org
Cc: sta...@vger.kernel.org
Signed-off-by: Andrew Morton a...@linux-foundation.org
---

 fs/ocfs2/file.c |   28 ++--
 1 file changed, 14 insertions(+), 14 deletions(-)

diff -puN fs/ocfs2/file.c~ocfs2-direct-write-will-call-ocfs2_rw_unlock-twice-when-doing-aiodio fs/ocfs2/file.c
--- a/fs/ocfs2/file.c~ocfs2-direct-write-will-call-ocfs2_rw_unlock-twice-when-doing-aiodio
+++ a/fs/ocfs2/file.c
@@ -2366,6 +2366,20 @@ relock:
/* buffered aio wouldn't have proper lock coverage today */
	BUG_ON(written == -EIOCBQUEUED && !(iocb->ki_flags & IOCB_DIRECT));
 
+   /*
+* deep in g_f_a_w_n()->ocfs2_direct_IO we pass in a ocfs2_dio_end_io
+* function pointer which is called when o_direct io completes so that
+* it can unlock our rw lock.
+* Unfortunately there are error cases which call end_io and others
+* that don't.  so we don't have to unlock the rw_lock if either an
+* async dio is going to do it in the future or an end_io after an
+* error has already done it.
+*/
+   if ((written == -EIOCBQUEUED) || (!ocfs2_iocb_is_rw_locked(iocb))) {
+   rw_level = -1;
+   unaligned_dio = 0;
+   }
+
	if (unlikely(written <= 0))
goto no_sync;
 
@@ -2390,20 +2404,6 @@ relock:
}
 
 no_sync:
-   /*
-* deep in g_f_a_w_n()->ocfs2_direct_IO we pass in a ocfs2_dio_end_io
-* function pointer which is called when o_direct io completes so that
-* it can unlock our rw lock.
-* Unfortunately there are error cases which call end_io and others
-* that don't.  so we don't have to unlock the rw_lock if either an
-* async dio is going to do it in the future or an end_io after an
-* error has already done it.
-*/
-   if ((ret == -EIOCBQUEUED) || (!ocfs2_iocb_is_rw_locked(iocb))) {
-   rw_level = -1;
-   unaligned_dio = 0;
-   }
-
	if (unaligned_dio && ocfs2_iocb_is_unaligned_aio(iocb)) {
ocfs2_iocb_clear_unaligned_aio(iocb);
	mutex_unlock(&OCFS2_I(inode)->ip_unaligned_aio);
_


Re: [Ocfs2-devel] [RFC] ocfs2: Idea to make ocfs2_search_chain high efficiency

2015-08-24 Thread Joel Becker
Srini has summed it up, mostly.  We shouldn't be hitting these
suboptimal chains of gds unless the localalloc is empty.  Hopefully
that's not terribly frequent, but it can happen.

Your solution does increase the complexity of the commit.  Pushing a
newly-available gd back onto the available list is cheap, as you're
already modifying the chain_rec and the gd.  But deleting a full gd from
the available list requires you to find the previous entry in the list
and edit it as well (see the sketch below).  The lock ordering is
hopefully not terrible.
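
To make the deletion cost concrete: with a singly linked list you must walk
from the head just to find (and dirty) the predecessor, as in the
avail_remove() sketch earlier in the thread. A purely hypothetical variant,
assuming a second reserved field could carry a back link, makes the unlink
O(1); the struct and names below are illustrative, not real ocfs2 layout:

#include <stddef.h>

struct gd {
	unsigned int free_bits;
	struct gd *avail_next;	/* forward link (one reserved field) */
	struct gd *avail_prev;	/* back link (a second reserved field) */
};

/*
 * O(1) unlink: only the victim and its two neighbours are touched, so at
 * most three gd blocks get dirtied instead of every gd on the walk from
 * the head to the victim.
 */
static void avail_remove_o1(struct gd **head, struct gd *victim)
{
	if (victim->avail_prev)
		victim->avail_prev->avail_next = victim->avail_next;
	else
		*head = victim->avail_next;	/* victim was the list head */
	if (victim->avail_next)
		victim->avail_next->avail_prev = victim->avail_prev;
	victim->avail_next = victim->avail_prev = NULL;
}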

Joel

On Mon, Aug 24, 2015 at 09:57:26AM -0700, Srinivas Eeda wrote:
 Hi Norton,
 
 While the localalloc bmp is enabled, the chances of a particular gd
 ending up with zero free bits are very minimal. The local alloc bmp will
 pick the next gd once the number of free bits falls below the localalloc
 bmp size, so the next gd is picked while the current gd still has free
 space.
 
 Having said that, I understand the problem you are describing. Since we
 only move the gd with more free bits ahead of the previous unusable gd,
 we end up rescanning unusable gds whenever the newly picked gd becomes
 unusable. This is inefficient and causes performance problems when the
 fs gets fragmented.
 
 I am not sure if we will be allowed to use bg_reserved2 ... but if we
 are, then we could use it to remember the next gd we should look at.
 
 For example, assume there are 'N' GDs and we are currently seeing that
 GD10 has the most free space ... so the order of GDs is
 
 GD10, GD1, GD2, ... GDN.
 
 Now if we end up using most of the space in GD10, according to the
 current logic we re-read GD1, GD2 .. GD9, then read GD11 and pick it
 (assuming it has free space). But if instead GD10 remembered GD11 as the
 next one to look at ... then we could skip reading GD1..GD9 entirely.
 
 Thanks,
 --Srini
 
 
 On 08/24/2015 05:30 AM, Norton.Zhu wrote:
  In ocfs2_search_chain, I found the search for an available gd is
  inefficient in some circumstances:
  1) The LUN has a great many gds, so the search reads lots of unavailable
  gds (free bits is zero);
  2) Not too many gds, but the available gds are scattered among the
  unavailable ones (fragmentation);
 
  So I have an idea to optimize the search method:
  1) Use the reserved member in ocfs2_group_desc to build an available
  chain list (every gd in the list is available, i.e. its free bits are
  more than zero);
  2) At the beginning, the available chain list is the same as the
  original chain list;
  3) During allocation, search for a gd in the available chain list only;
  4) After each allocation, if a gd's free bits drop to zero, remove it
  from the available chain list;
  5) After each reclaim, if a gd's free bits change from zero to positive,
  append it to the head of the available chain list;
 
  Once started with the basics outlined above, no unavailable gd will ever
  be read.
 
  Does anyone have better ideas or advice?
 
 