Re: [PATCH] mm/swap: add independed bio pool for swap

2012-12-20 Thread Konstantin Khlebnikov

Andrew Morton wrote:
> On Mon, 17 Dec 2012 20:45:13 -0800 (PST)
> Hugh Dickins <hu...@google.com> wrote:
>
> > On Fri, 14 Dec 2012, Konstantin Khlebnikov wrote:
> >
> > > This bio pool guarantees reclaim progress for anonymous pages.
> > > All available bios in fs_bio_set may be borrowed by writeback which may
> > > never end, because disk too slow or broken. I have seen this situation in
> > > real life in a system where there were a lot of bio requests to a loop device
> > > which was lying on top of a special fuse-based filesystem.
> >
> > Hmm, perhaps, I'm not at all sure.
>
> It probably maybe perhaps makes sense.  Of course, you're screwed if
> one of your swap devices is "disk too slow or broken".
>
> My crystal ball tells me that the 2015 kernel will have a bioset per
> physical device...

I predict that they also will be per-cpu. Well... probably maybe we should
just wait for this shining future. I leave this to your discretion...
--
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/


Re: [PATCH] mm/swap: add independed bio pool for swap

2012-12-19 Thread Andrew Morton
On Mon, 17 Dec 2012 20:45:13 -0800 (PST)
Hugh Dickins <hu...@google.com> wrote:

> On Fri, 14 Dec 2012, Konstantin Khlebnikov wrote:
> 
> > This bio pool guarantees reclaim progress for anonymous pages.
> > All available bios in fs_bio_set may be borrowed by writeback which may
> > never end, because disk too slow or broken. I have seen this situation in
> > real life in a system where there were a lot of bio requests to a loop device
> > which was lying on top of a special fuse-based filesystem.
> 
> Hmm, perhaps, I'm not at all sure.

It probably maybe perhaps makes sense.  Of course, you're screwed if
one of your swap devices is "disk too slow or broken".

My crystal ball tells me that the 2015 kernel will have a bioset per
physical device...


Re: [PATCH] mm/swap: add independed bio pool for swap

2012-12-17 Thread Hugh Dickins
On Fri, 14 Dec 2012, Konstantin Khlebnikov wrote:

> This bio pool guarantees reclaim progress for anonymous pages.
> All available bios in fs_bio_set may be borrowed by writeback which may
> never end, because disk too slow or broken. I have seen this situation in
> real life in a system where there were a lot of bio requests to a loop device
> which was lying on top of a special fuse-based filesystem.

Hmm, perhaps, I'm not at all sure.

I don't particularly want to fragment off yet another pool if it's
not the right approach.  Or maybe it's loop or fuse which should
have the pool, rather than swap.

If the disk is slow, I'd expect us to be okay; but if it's not
responding at all, then yes, those mempools will remain exhausted.
You're imagining swap going to a more reliable disk, but it's being
starved by the unresponding disk, so deserves a separate pool?

Let's Cc Mel and Jens, who will each have plenty of experience of
running out of bios/mempools, and the proper way to avoid or accept it.

(Note that BIO_POOL_SIZE is only 2 nowadays: when mempools were
first introduced, indeed they were sized larger; but once we found so
much memory disappearing into them, they got cut down to the minimum
needed for forward progress - I forget why that's 2 not 1).

Hugh
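
The forward-progress role of a small reserve, as in the BIO_POOL_SIZE note above, can be sketched in userspace C. This is only a toy model under stated assumptions: `toy_mempool` and its functions are hypothetical names, not the kernel mempool API, the `under_pressure` flag stands in for the page allocator failing, and a NULL return stands in for the point where a real mempool allocation would sleep until someone frees an element.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdlib.h>

/* Hypothetical userspace stand-in for a kernel mempool; not the kernel API. */
struct toy_mempool {
	int min_nr;        /* guaranteed reserve size, e.g. 2 */
	int curr_nr;       /* elements currently sitting in the reserve */
	size_t elem_size;
	void **elements;
};

/* Preallocate the reserve up front so later allocations can fall back on it. */
bool toy_mempool_init(struct toy_mempool *pool, int min_nr, size_t elem_size)
{
	assert(min_nr > 0 && elem_size > 0);
	pool->min_nr = min_nr;
	pool->curr_nr = 0;
	pool->elem_size = elem_size;
	pool->elements = calloc((size_t)min_nr, sizeof(*pool->elements));
	if (!pool->elements)
		return false;
	while (pool->curr_nr < min_nr) {
		void *elem = malloc(elem_size);
		if (!elem)
			return false;	/* caller cleans up with toy_mempool_destroy() */
		pool->elements[pool->curr_nr++] = elem;
	}
	return true;
}

/*
 * Try the normal allocator first; dip into the reserve only when it fails.
 * NULL means "reserve exhausted, would have to wait for a free".
 */
void *toy_mempool_alloc(struct toy_mempool *pool, bool under_pressure)
{
	if (!under_pressure) {
		void *elem = malloc(pool->elem_size);
		if (elem)
			return elem;
	}
	if (pool->curr_nr > 0)
		return pool->elements[--pool->curr_nr];
	return NULL;
}

/* Refill the reserve before handing memory back to the allocator. */
void toy_mempool_free(struct toy_mempool *pool, void *elem)
{
	assert(elem != NULL);
	if (pool->curr_nr < pool->min_nr)
		pool->elements[pool->curr_nr++] = elem;
	else
		free(elem);
}

/* Release whatever is left in the reserve. */
void toy_mempool_destroy(struct toy_mempool *pool)
{
	while (pool->curr_nr > 0)
		free(pool->elements[--pool->curr_nr]);
	free(pool->elements);
	pool->elements = NULL;
}
```

In this model a reserve of 2 lets two allocations succeed under pressure; a third has to wait, and succeeds again as soon as one element is freed. If the holders never free (the unresponsive disk case), the reserve stays exhausted.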

> 
> Signed-off-by: Konstantin Khlebnikov <khlebni...@openvz.org>
> Cc: Andrew Morton <a...@linux-foundation.org>
> Cc: Hugh Dickins <hu...@google.com>
> ---
>  mm/page_io.c |   13 ++++++++++++-
>  1 file changed, 12 insertions(+), 1 deletion(-)
> 
> diff --git a/mm/page_io.c b/mm/page_io.c
> index 78eee32..699f85e 100644
> --- a/mm/page_io.c
> +++ b/mm/page_io.c
> @@ -22,12 +22,14 @@
>  #include <linux/frontswap.h>
>  #include <asm/pgtable.h>
>  
> +static struct bio_set *swap_bio_set;
> +
>  static struct bio *get_swap_bio(gfp_t gfp_flags,
>  				struct page *page, bio_end_io_t end_io)
>  {
>  	struct bio *bio;
>  
> -	bio = bio_alloc(gfp_flags, 1);
> +	bio = bio_alloc_bioset(gfp_flags, 1, swap_bio_set);
>  	if (bio) {
>  		bio->bi_sector = map_swap_page(page, &bio->bi_bdev);
>  		bio->bi_sector <<= PAGE_SHIFT - 9;
> @@ -290,3 +292,12 @@ int swap_set_page_dirty(struct page *page)
>  		return __set_page_dirty_no_writeback(page);
>  	}
>  }
> +
> +static int __init swap_bio_init(void)
> +{
> +	swap_bio_set = bioset_create(SWAP_CLUSTER_MAX, 0);
> +	if (!swap_bio_set)
> +		panic("can't allocate swap_bio_set\n");
> +	return 0;
> +}
> +late_initcall(swap_bio_init);


[PATCH] mm/swap: add independed bio pool for swap

2012-12-14 Thread Konstantin Khlebnikov
This bio pool guarantees reclaim progress for anonymous pages.
All available bios in fs_bio_set may be borrowed by writeback which may
never end, because disk too slow or broken. I have seen this situation in
real life in a system where there were a lot of bio requests to a loop device
which was lying on top of a special fuse-based filesystem.

Signed-off-by: Konstantin Khlebnikov <khlebni...@openvz.org>
Cc: Andrew Morton <a...@linux-foundation.org>
Cc: Hugh Dickins <hu...@google.com>
---
 mm/page_io.c |   13 ++++++++++++-
 1 file changed, 12 insertions(+), 1 deletion(-)

diff --git a/mm/page_io.c b/mm/page_io.c
index 78eee32..699f85e 100644
--- a/mm/page_io.c
+++ b/mm/page_io.c
@@ -22,12 +22,14 @@
 #include <linux/frontswap.h>
 #include <asm/pgtable.h>
 
+static struct bio_set *swap_bio_set;
+
 static struct bio *get_swap_bio(gfp_t gfp_flags,
 				struct page *page, bio_end_io_t end_io)
 {
 	struct bio *bio;
 
-	bio = bio_alloc(gfp_flags, 1);
+	bio = bio_alloc_bioset(gfp_flags, 1, swap_bio_set);
 	if (bio) {
 		bio->bi_sector = map_swap_page(page, &bio->bi_bdev);
 		bio->bi_sector <<= PAGE_SHIFT - 9;
@@ -290,3 +292,12 @@ int swap_set_page_dirty(struct page *page)
 		return __set_page_dirty_no_writeback(page);
 	}
 }
+
+static int __init swap_bio_init(void)
+{
+	swap_bio_set = bioset_create(SWAP_CLUSTER_MAX, 0);
+	if (!swap_bio_set)
+		panic("can't allocate swap_bio_set\n");
+	return 0;
+}
+late_initcall(swap_bio_init);
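
The starvation scenario this patch targets can be illustrated with a small userspace model. It is a sketch under assumptions, not kernel code: `toy_reserve` and `swap_can_progress` are hypothetical names, the reserves are plain counters, and the page allocator is assumed to be failing already (memory pressure), so every bio has to come out of a reserve.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical counter-only model of a bio reserve; not kernel code. */
struct toy_reserve {
	int free_bios;
};

/* Take one bio from the reserve, or report that it is exhausted. */
static bool toy_take(struct toy_reserve *r)
{
	if (r->free_bios == 0)
		return false;
	r->free_bios--;
	return true;
}

/*
 * Returns true if one swap write can still get a bio after `stuck` writeback
 * bios have been taken from the shared reserve and never completed.
 * swap_reserve == 0 models the code before the patch (swap shares the pool).
 */
bool swap_can_progress(int shared_reserve, int swap_reserve, int stuck)
{
	assert(shared_reserve >= 0 && swap_reserve >= 0 && stuck >= 0);

	struct toy_reserve shared = { .free_bios = shared_reserve };
	struct toy_reserve swap = { .free_bios = swap_reserve };

	/* Writeback to the dead device drains the shared reserve. */
	for (int i = 0; i < stuck; i++) {
		if (!toy_take(&shared))
			break;
	}

	/* With a dedicated reserve, swap never competes with writeback. */
	if (swap_reserve > 0)
		return toy_take(&swap);
	return toy_take(&shared);
}
```

With a shared reserve of 2 and two stuck writeback bios, `swap_can_progress(2, 0, 2)` is false, while `swap_can_progress(2, 1, 2)` is true. The model does not cover the case where the swap device itself is the one that stopped responding; a separate reserve does not help there.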
