subject:"\[PATCH\] mm\/swap\: abort swapoff after disk error"

Re: [PATCH] mm/swap: abort swapoff after disk error

2012-12-18 Thread Konstantin Khlebnikov


Hugh Dickins wrote:

On Fri, 14 Dec 2012, Konstantin Khlebnikov wrote:


Content of non-uptodate pages completely random, we cannot expose them into
userspace. This leads to information leak and will crash userspace for sure.


Good find, yes, it's very wrong as is.  But, sorry, I don't like your fix
- better than ignoring the issue as at present, but not the right answer.


Probably we can reuse hwpoison entries here, but tmpfs already too complex.


HWpoison entries?  They're for when that page of RAM is bad, but this is
quite a different case: the page is fine and can perfectly well be freed
and reused - what's bad is the data currently in it.



Signed-off-by: Konstantin Khlebnikov
Original-patch-by: Alexey Kuznetsov
Cc: Andrew Morton
Cc: Hugh Dickins
Cc: Andi Kleen
---
  mm/swapfile.c |   16 
  1 file changed, 16 insertions(+)

diff --git a/mm/swapfile.c b/mm/swapfile.c
index e97a0e5..98fc2fd 100644
--- a/mm/swapfile.c
+++ b/mm/swapfile.c
@@ -1127,6 +1127,22 @@ int try_to_unuse(unsigned int type, bool frontswap,
wait_on_page_writeback(page);

/*
+* If read failed we cannot map not-uptodate page to
+* user space. Actually, we are in serious troubles,
+* we do not even know what process to kill. So, the only


try_to_unuse() is all about locating exactly where this page belongs;
and if the user is lucky, the page in question won't even be needed again
before the process exits, so nothing should be killed at this point.



+* variant remains: to stop swapoff() and allow someone
+* to kill processes to zap invalid pages.


No, we should not abort swapoff: there's every reason to continue,
to make sure that this unreliable area can be taken out of service.


+*
+* TODO replace page with hwpoison entry in pte and shmem.


Instead of blindly going ahead and inserting ptes pointing to the
!PageUptodate page, unuse_pte() and shmem_unuse_inode() should insert
a substitute bad swapentry, to generate SIGBUS if it's accessed.

swp_entry(1, 0) might serve, but there's probably a few mods needed
here and there; and getting the details right (e.g. memcg charges)
will need care.

Not as straightforward as your block below, I admit.  I wonder if you
posted that just to stir me to do better: or can you take it further?


I found this patch in our kernel tree. For some reason it wasn't sent
to mainline. So I decided to send it as is to not lose it for a few more
years. Using here hwpoison was just a guess. Your bad-swap-entry is much
more accurate solution. Seems like here is no rush, this bug was here from
the beginning, so I'll handle it. Thanks for your advice.



Thanks,
Hugh


+*/
+   if (unlikely(!PageUptodate(page))) {
+   unlock_page(page);
+   page_cache_release(page);
+   retval = -EIO;
+   break;
+   }
+
+   /*
 * Remove all references to entry.
 */
swcount = *swap_map;

--
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/

Re: [PATCH] mm/swap: abort swapoff after disk error

2012-12-18 Thread Konstantin Khlebnikov


Hugh Dickins wrote:

On Fri, 14 Dec 2012, Konstantin Khlebnikov wrote:


Content of non-uptodate pages completely random, we cannot expose them into
userspace. This leads to information leak and will crash userspace for sure.


Good find, yes, it's very wrong as is.  But, sorry, I don't like your fix
- better than ignoring the issue as at present, but not the right answer.


Probably we can reuse hwpoison entries here, but tmpfs already too complex.


HWpoison entries?  They're for when that page of RAM is bad, but this is
quite a different case: the page is fine and can perfectly well be freed
and reused - what's bad is the data currently in it.



Signed-off-by: Konstantin Khlebnikovkhlebni...@openvz.org
Original-patch-by: Alexey Kuznetsovkuz...@ms2.inr.ac.ru
Cc: Andrew Mortona...@linux-foundation.org
Cc: Hugh Dickinshu...@google.com
Cc: Andi Kleena...@firstfloor.org
---
  mm/swapfile.c |   16 
  1 file changed, 16 insertions(+)

diff --git a/mm/swapfile.c b/mm/swapfile.c
index e97a0e5..98fc2fd 100644
--- a/mm/swapfile.c
+++ b/mm/swapfile.c
@@ -1127,6 +1127,22 @@ int try_to_unuse(unsigned int type, bool frontswap,
wait_on_page_writeback(page);

/*
+* If read failed we cannot map not-uptodate page to
+* user space. Actually, we are in serious troubles,
+* we do not even know what process to kill. So, the only


try_to_unuse() is all about locating exactly where this page belongs;
and if the user is lucky, the page in question won't even be needed again
before the process exits, so nothing should be killed at this point.



+* variant remains: to stop swapoff() and allow someone
+* to kill processes to zap invalid pages.


No, we should not abort swapoff: there's every reason to continue,
to make sure that this unreliable area can be taken out of service.


+*
+* TODO replace page with hwpoison entry in pte and shmem.


Instead of blindly going ahead and inserting ptes pointing to the
!PageUptodate page, unuse_pte() and shmem_unuse_inode() should insert
a substitute bad swapentry, to generate SIGBUS if it's accessed.

swp_entry(1, 0) might serve, but there's probably a few mods needed
here and there; and getting the details right (e.g. memcg charges)
will need care.

Not as straightforward as your block below, I admit.  I wonder if you
posted that just to stir me to do better: or can you take it further?


I found this patch in our kernel tree. For some reason it wasn't sent
to mainline. So I decided to send it as is to not lose it for a few more
years. Using here hwpoison was just a guess. Your bad-swap-entry is much
more accurate solution. Seems like here is no rush, this bug was here from
the beginning, so I'll handle it. Thanks for your advice.



Thanks,
Hugh


+*/
+   if (unlikely(!PageUptodate(page))) {
+   unlock_page(page);
+   page_cache_release(page);
+   retval = -EIO;
+   break;
+   }
+
+   /*
 * Remove all references to entry.
 */
swcount = *swap_map;

--
To unsubscribe from this list: send the line unsubscribe linux-kernel in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/

Re: [PATCH] mm/swap: abort swapoff after disk error

2012-12-17 Thread Hugh Dickins

On Fri, 14 Dec 2012, Konstantin Khlebnikov wrote:

> Content of non-uptodate pages completely random, we cannot expose them into
> userspace. This leads to information leak and will crash userspace for sure.

Good find, yes, it's very wrong as is.  But, sorry, I don't like your fix
- better than ignoring the issue as at present, but not the right answer.

> Probably we can reuse hwpoison entries here, but tmpfs already too complex.

HWpoison entries?  They're for when that page of RAM is bad, but this is
quite a different case: the page is fine and can perfectly well be freed
and reused - what's bad is the data currently in it.

> 
> Signed-off-by: Konstantin Khlebnikov 
> Original-patch-by: Alexey Kuznetsov 
> Cc: Andrew Morton 
> Cc: Hugh Dickins 
> Cc: Andi Kleen 
> ---
>  mm/swapfile.c |   16 
>  1 file changed, 16 insertions(+)
> 
> diff --git a/mm/swapfile.c b/mm/swapfile.c
> index e97a0e5..98fc2fd 100644
> --- a/mm/swapfile.c
> +++ b/mm/swapfile.c
> @@ -1127,6 +1127,22 @@ int try_to_unuse(unsigned int type, bool frontswap,
>   wait_on_page_writeback(page);
>  
>   /*
> +  * If read failed we cannot map not-uptodate page to
> +  * user space. Actually, we are in serious troubles,
> +  * we do not even know what process to kill. So, the only

try_to_unuse() is all about locating exactly where this page belongs;
and if the user is lucky, the page in question won't even be needed again
before the process exits, so nothing should be killed at this point.

> +  * variant remains: to stop swapoff() and allow someone
> +  * to kill processes to zap invalid pages.

No, we should not abort swapoff: there's every reason to continue,
to make sure that this unreliable area can be taken out of service.

> +  *
> +  * TODO replace page with hwpoison entry in pte and shmem.

Instead of blindly going ahead and inserting ptes pointing to the
!PageUptodate page, unuse_pte() and shmem_unuse_inode() should insert
a substitute bad swapentry, to generate SIGBUS if it's accessed.

swp_entry(1, 0) might serve, but there's probably a few mods needed
here and there; and getting the details right (e.g. memcg charges)
will need care.

Not as straightforward as your block below, I admit.  I wonder if you
posted that just to stir me to do better: or can you take it further?

Thanks,
Hugh

> +  */
> + if (unlikely(!PageUptodate(page))) {
> + unlock_page(page);
> + page_cache_release(page);
> + retval = -EIO;
> + break;
> + }
> +
> + /*
>* Remove all references to entry.
>*/
>   swcount = *swap_map;
--
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/

Re: [PATCH] mm/swap: abort swapoff after disk error

2012-12-17 Thread Hugh Dickins

On Fri, 14 Dec 2012, Konstantin Khlebnikov wrote:

 Content of non-uptodate pages completely random, we cannot expose them into
 userspace. This leads to information leak and will crash userspace for sure.

Good find, yes, it's very wrong as is.  But, sorry, I don't like your fix
- better than ignoring the issue as at present, but not the right answer.

 Probably we can reuse hwpoison entries here, but tmpfs already too complex.

HWpoison entries?  They're for when that page of RAM is bad, but this is
quite a different case: the page is fine and can perfectly well be freed
and reused - what's bad is the data currently in it.

 
 Signed-off-by: Konstantin Khlebnikov khlebni...@openvz.org
 Original-patch-by: Alexey Kuznetsov kuz...@ms2.inr.ac.ru
 Cc: Andrew Morton a...@linux-foundation.org
 Cc: Hugh Dickins hu...@google.com
 Cc: Andi Kleen a...@firstfloor.org
 ---
  mm/swapfile.c |   16 
  1 file changed, 16 insertions(+)
 
 diff --git a/mm/swapfile.c b/mm/swapfile.c
 index e97a0e5..98fc2fd 100644
 --- a/mm/swapfile.c
 +++ b/mm/swapfile.c
 @@ -1127,6 +1127,22 @@ int try_to_unuse(unsigned int type, bool frontswap,
   wait_on_page_writeback(page);
  
   /*
 +  * If read failed we cannot map not-uptodate page to
 +  * user space. Actually, we are in serious troubles,
 +  * we do not even know what process to kill. So, the only

try_to_unuse() is all about locating exactly where this page belongs;
and if the user is lucky, the page in question won't even be needed again
before the process exits, so nothing should be killed at this point.


 +  * variant remains: to stop swapoff() and allow someone
 +  * to kill processes to zap invalid pages.

No, we should not abort swapoff: there's every reason to continue,
to make sure that this unreliable area can be taken out of service.

 +  *
 +  * TODO replace page with hwpoison entry in pte and shmem.

Instead of blindly going ahead and inserting ptes pointing to the
!PageUptodate page, unuse_pte() and shmem_unuse_inode() should insert
a substitute bad swapentry, to generate SIGBUS if it's accessed.

swp_entry(1, 0) might serve, but there's probably a few mods needed
here and there; and getting the details right (e.g. memcg charges)
will need care.

Not as straightforward as your block below, I admit.  I wonder if you
posted that just to stir me to do better: or can you take it further?

Thanks,
Hugh

 +  */
 + if (unlikely(!PageUptodate(page))) {
 + unlock_page(page);
 + page_cache_release(page);
 + retval = -EIO;
 + break;
 + }
 +
 + /*
* Remove all references to entry.
*/
   swcount = *swap_map;
--
To unsubscribe from this list: send the line unsubscribe linux-kernel in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/

[PATCH] mm/swap: abort swapoff after disk error

2012-12-14 Thread Konstantin Khlebnikov

Content of non-uptodate pages completely random, we cannot expose them into
userspace. This leads to information leak and will crash userspace for sure.
Probably we can reuse hwpoison entries here, but tmpfs already too complex.

Signed-off-by: Konstantin Khlebnikov 
Original-patch-by: Alexey Kuznetsov 
Cc: Andrew Morton 
Cc: Hugh Dickins 
Cc: Andi Kleen 
---
 mm/swapfile.c |   16 
 1 file changed, 16 insertions(+)

diff --git a/mm/swapfile.c b/mm/swapfile.c
index e97a0e5..98fc2fd 100644
--- a/mm/swapfile.c
+++ b/mm/swapfile.c
@@ -1127,6 +1127,22 @@ int try_to_unuse(unsigned int type, bool frontswap,
wait_on_page_writeback(page);
 
/*
+* If read failed we cannot map not-uptodate page to
+* user space. Actually, we are in serious troubles,
+* we do not even know what process to kill. So, the only
+* variant remains: to stop swapoff() and allow someone
+* to kill processes to zap invalid pages.
+*
+* TODO replace page with hwpoison entry in pte and shmem.
+*/
+   if (unlikely(!PageUptodate(page))) {
+   unlock_page(page);
+   page_cache_release(page);
+   retval = -EIO;
+   break;
+   }
+
+   /*
 * Remove all references to entry.
 */
swcount = *swap_map;

--
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/

[PATCH] mm/swap: abort swapoff after disk error

2012-12-14 Thread Konstantin Khlebnikov

Content of non-uptodate pages completely random, we cannot expose them into
userspace. This leads to information leak and will crash userspace for sure.
Probably we can reuse hwpoison entries here, but tmpfs already too complex.

Signed-off-by: Konstantin Khlebnikov khlebni...@openvz.org
Original-patch-by: Alexey Kuznetsov kuz...@ms2.inr.ac.ru
Cc: Andrew Morton a...@linux-foundation.org
Cc: Hugh Dickins hu...@google.com
Cc: Andi Kleen a...@firstfloor.org
---
 mm/swapfile.c |   16 
 1 file changed, 16 insertions(+)

diff --git a/mm/swapfile.c b/mm/swapfile.c
index e97a0e5..98fc2fd 100644
--- a/mm/swapfile.c
+++ b/mm/swapfile.c
@@ -1127,6 +1127,22 @@ int try_to_unuse(unsigned int type, bool frontswap,
wait_on_page_writeback(page);
 
/*
+* If read failed we cannot map not-uptodate page to
+* user space. Actually, we are in serious troubles,
+* we do not even know what process to kill. So, the only
+* variant remains: to stop swapoff() and allow someone
+* to kill processes to zap invalid pages.
+*
+* TODO replace page with hwpoison entry in pte and shmem.
+*/
+   if (unlikely(!PageUptodate(page))) {
+   unlock_page(page);
+   page_cache_release(page);
+   retval = -EIO;
+   break;
+   }
+
+   /*
 * Remove all references to entry.
 */
swcount = *swap_map;

--
To unsubscribe from this list: send the line unsubscribe linux-kernel in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/

Re: [PATCH] mm/swap: abort swapoff after disk error

Re: [PATCH] mm/swap: abort swapoff after disk error

Re: [PATCH] mm/swap: abort swapoff after disk error

Re: [PATCH] mm/swap: abort swapoff after disk error

[PATCH] mm/swap: abort swapoff after disk error

[PATCH] mm/swap: abort swapoff after disk error

6 matches

Site Navigation

Mail list logo

Footer information