Re: [PATCH, RFC] Ext4: Mount partition as read only if during orphan cleanup truncate fails to obtain journal handle.

2012-12-07 Thread Ashish Sangwan
On Thu, Dec 6, 2012 at 10:39 PM, Theodore Ts'o  wrote:
> On Thu, Dec 06, 2012 at 04:59:43PM +0530, Ashish Sangwan wrote:
>>
>> Did you get any time to look into this patch?
>> This problem is with ext4 only as ext4_truncate does not clean the
>> orphan list unlike that of ext3_truncate.
>> Instead, in case of failure to obtain handle, orphan list cleanup is
>> done in ext4_setattr.
>> But during mount, ext4_truncate is not called via ext4_setattr and
>> hence the problem.
>> What do you think?
>
> In the patch description, you mentioned that this occurs when the
> there is a failure to obtain a journal handle.  Is this a hypothetical
> thing that you exposed via some kind of tester which checks to see
> what happens if kmalloc() randomly fails some number of allocation
> requests?  Or was it happening in real life?  And if it is happening
> in real life, do we understand why it's happening, and is there
> something we should be doing to mitigate against the root cause of the
> failure?
>
Thanks for looking into the patch.
Yes, you are right. The situation is hypothetical.
We were checking about Ext4's behavior when it returns error at
different points in truncate/hole punch path and that's when we
stumbled on to this.
In real, we cannot re-produce this scenario without modifying Ext4's
code and hence an extra RFC added with the patch.
If you think that obtaining journal handle could never fail at mount
time, than please ignore this patch.

> The alternative to your patch is to do something similar to what ext3
> does.  That is, if there are any inodes left on the orphan list, to
> iterate through them all and then clean up the orphan list.  Perhaps
> we should then also call ext4_error() since technically the file
> system may very well be inconsistent (there may be allocated inodes
> holding blocks which are no longer connected the directory hierarchy,
> which e2fsck would be able to clean up).  But that could potentially
> cause the system to panic or remount the file system read-only,
> depending on what the errors= behavior is set to.  Which is why I go
> back to the original question; do we understand why ext4_truncate()
> was failing during orphan cleanup in the first place?
>
>- Ted

Right, using ext4_error() would have been a better choice than
explicitly setting RO flag.
I wonder why ext4_truncate does not clear the orphan_list in case of
failure to obtain journal handle unlike that of ext3_truncate?
We would not have to bother about explicitly clearing the orphan list
in the first place if we could just add a lable like out_notrans (as
in ext3)=>
 if (IS_ERR(handle))
goto out_notrans;

 out_notrans:
/*
 * Delete the inode from orphan list so that it doesn't stay there
 * forever and trigger assertion on umount.
 */
if (inode->i_nlink)
ext3_orphan_del(NULL, inode);

Regards,
Ashish
--
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/


Re: [PATCH, RFC] Ext4: Mount partition as read only if during orphan cleanup truncate fails to obtain journal handle.

2012-12-07 Thread Ashish Sangwan
On Thu, Dec 6, 2012 at 10:39 PM, Theodore Ts'o ty...@mit.edu wrote:
 On Thu, Dec 06, 2012 at 04:59:43PM +0530, Ashish Sangwan wrote:

 Did you get any time to look into this patch?
 This problem is with ext4 only as ext4_truncate does not clean the
 orphan list unlike that of ext3_truncate.
 Instead, in case of failure to obtain handle, orphan list cleanup is
 done in ext4_setattr.
 But during mount, ext4_truncate is not called via ext4_setattr and
 hence the problem.
 What do you think?

 In the patch description, you mentioned that this occurs when the
 there is a failure to obtain a journal handle.  Is this a hypothetical
 thing that you exposed via some kind of tester which checks to see
 what happens if kmalloc() randomly fails some number of allocation
 requests?  Or was it happening in real life?  And if it is happening
 in real life, do we understand why it's happening, and is there
 something we should be doing to mitigate against the root cause of the
 failure?

Thanks for looking into the patch.
Yes, you are right. The situation is hypothetical.
We were checking about Ext4's behavior when it returns error at
different points in truncate/hole punch path and that's when we
stumbled on to this.
In real, we cannot re-produce this scenario without modifying Ext4's
code and hence an extra RFC added with the patch.
If you think that obtaining journal handle could never fail at mount
time, than please ignore this patch.

 The alternative to your patch is to do something similar to what ext3
 does.  That is, if there are any inodes left on the orphan list, to
 iterate through them all and then clean up the orphan list.  Perhaps
 we should then also call ext4_error() since technically the file
 system may very well be inconsistent (there may be allocated inodes
 holding blocks which are no longer connected the directory hierarchy,
 which e2fsck would be able to clean up).  But that could potentially
 cause the system to panic or remount the file system read-only,
 depending on what the errors= behavior is set to.  Which is why I go
 back to the original question; do we understand why ext4_truncate()
 was failing during orphan cleanup in the first place?

- Ted

Right, using ext4_error() would have been a better choice than
explicitly setting RO flag.
I wonder why ext4_truncate does not clear the orphan_list in case of
failure to obtain journal handle unlike that of ext3_truncate?
We would not have to bother about explicitly clearing the orphan list
in the first place if we could just add a lable like out_notrans (as
in ext3)=
 if (IS_ERR(handle))
goto out_notrans;

 out_notrans:
/*
 * Delete the inode from orphan list so that it doesn't stay there
 * forever and trigger assertion on umount.
 */
if (inode-i_nlink)
ext3_orphan_del(NULL, inode);

Regards,
Ashish
--
To unsubscribe from this list: send the line unsubscribe linux-kernel in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/


Re: [PATCH, RFC] Ext4: Mount partition as read only if during orphan cleanup truncate fails to obtain journal handle.

2012-12-06 Thread Theodore Ts'o
On Thu, Dec 06, 2012 at 04:59:43PM +0530, Ashish Sangwan wrote:
> 
> Did you get any time to look into this patch?
> This problem is with ext4 only as ext4_truncate does not clean the
> orphan list unlike that of ext3_truncate.
> Instead, in case of failure to obtain handle, orphan list cleanup is
> done in ext4_setattr.
> But during mount, ext4_truncate is not called via ext4_setattr and
> hence the problem.
> What do you think?

In the patch description, you mentioned that this occurs when the
there is a failure to obtain a journal handle.  Is this a hypothetical
thing that you exposed via some kind of tester which checks to see
what happens if kmalloc() randomly fails some number of allocation
requests?  Or was it happening in real life?  And if it is happening
in real life, do we understand why it's happening, and is there
something we should be doing to mitigate against the root cause of the
failure?

The alternative to your patch is to do something similar to what ext3
does.  That is, if there are any inodes left on the orphan list, to
iterate through them all and then clean up the orphan list.  Perhaps
we should then also call ext4_error() since technically the file
system may very well be inconsistent (there may be allocated inodes
holding blocks which are no longer connected the directory hierarchy,
which e2fsck would be able to clean up).  But that could potentially
cause the system to panic or remount the file system read-only,
depending on what the errors= behavior is set to.  Which is why I go
back to the original question; do we understand why ext4_truncate()
was failing during orphan cleanup in the first place?

   - Ted
--
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/


Re: [PATCH, RFC] Ext4: Mount partition as read only if during orphan cleanup truncate fails to obtain journal handle.

2012-12-06 Thread Ashish Sangwan
Hi Ted,

Did you get any time to look into this patch?
This problem is with ext4 only as ext4_truncate does not clean the
orphan list unlike that of ext3_truncate.
Instead, in case of failure to obtain handle, orphan list cleanup is
done in ext4_setattr.
But during mount, ext4_truncate is not called via ext4_setattr and
hence the problem.
What do you think?

On Fri, Oct 26, 2012 at 5:11 PM, Namjae Jeon  wrote:
> Add Cc in mail loop.
>
> Thanks.
>
> 2012/10/24, Ashish Sangwan :
>> During orphan cleanup while doing truncate, if we fail to obtain journal
>> handle, the inode for which truncate was called would not be removed from
>> both the on-disk and in-memory orphan lists as the call to ext4_orphan_del
>> would not be executed.
>>
>> This would have following consequences:
>> a) As the inode is not removed from the on-disk list, truncate would be
>> called again for the same inode. Each call would add the inode to the
>> in-memory list. This operation would continue endlessly or until truncate
>> is succeed.
>>
>> b) If somehow, after some iterations, truncate is succeed, ext4_orphan_del
>> will only remove the inode from in-memory list just 1 time. This will
>> trigger
>> j_assert during put super.
>>
>> This patch handles both the problems. If truncate fails, first in-memory
>> list is cleared and than the partition is mounted as read only.
>> Failure to obtain journal handle during mount may suggest that journal
>> device is corrupted.
>>
>> Signed-off-by: Ashish Sangwan 
>> Signed-off-by: Namjae Jeon 
>> ---
>>  fs/ext4/super.c | 16 +++-
>>  1 file changed, 15 insertions(+), 1 deletion(-)
>>
>> diff --git a/fs/ext4/super.c b/fs/ext4/super.c
>> index 4e2aacb..859eccb 100644
>> --- a/fs/ext4/super.c
>> +++ b/fs/ext4/super.c
>> @@ -2201,7 +2201,21 @@ static void ext4_orphan_cleanup(struct super_block
>> *sb,
>>   jbd_debug(2, "truncating inode %lu to %lld bytes\n",
>> inode->i_ino, inode->i_size);
>>   ext4_truncate(inode);
>> - nr_truncates++;
>> + if (list_empty(_I(inode)->i_orphan)) {
>> + nr_truncates++;
>> + } else {
>> + /* Remove inode from in-memory orphan list */
>> + list_del_init(_I(inode)->i_orphan);
>> + ext4_msg(sb, KERN_ERR, "Truncate failed for "
>> +  "orphan  inode = %lu. Running e2fsck"
>> +  " is recommended", inode->i_ino);
>> + if (!(s_flags & MS_RDONLY)) {
>> + ext4_msg(sb, KERN_INFO, "FS would be"
>> + " mounted as readonly");
>> + s_flags |= MS_RDONLY;
>> + }
>> + break;
>> + }
>>   } else {
>>   ext4_msg(sb, KERN_DEBUG,
>>   "%s: deleting unreferenced inode %lu",
>> --
>> 1.7.11.4
>>
>>
--
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/


Re: [PATCH, RFC] Ext4: Mount partition as read only if during orphan cleanup truncate fails to obtain journal handle.

2012-12-06 Thread Ashish Sangwan
Hi Ted,

Did you get any time to look into this patch?
This problem is with ext4 only as ext4_truncate does not clean the
orphan list unlike that of ext3_truncate.
Instead, in case of failure to obtain handle, orphan list cleanup is
done in ext4_setattr.
But during mount, ext4_truncate is not called via ext4_setattr and
hence the problem.
What do you think?

On Fri, Oct 26, 2012 at 5:11 PM, Namjae Jeon linkinj...@gmail.com wrote:
 Add Cc in mail loop.

 Thanks.

 2012/10/24, Ashish Sangwan ashishsangw...@gmail.com:
 During orphan cleanup while doing truncate, if we fail to obtain journal
 handle, the inode for which truncate was called would not be removed from
 both the on-disk and in-memory orphan lists as the call to ext4_orphan_del
 would not be executed.

 This would have following consequences:
 a) As the inode is not removed from the on-disk list, truncate would be
 called again for the same inode. Each call would add the inode to the
 in-memory list. This operation would continue endlessly or until truncate
 is succeed.

 b) If somehow, after some iterations, truncate is succeed, ext4_orphan_del
 will only remove the inode from in-memory list just 1 time. This will
 trigger
 j_assert during put super.

 This patch handles both the problems. If truncate fails, first in-memory
 list is cleared and than the partition is mounted as read only.
 Failure to obtain journal handle during mount may suggest that journal
 device is corrupted.

 Signed-off-by: Ashish Sangwan ashish.sangw...@gmail.com
 Signed-off-by: Namjae Jeon linkinj...@gmail.com
 ---
  fs/ext4/super.c | 16 +++-
  1 file changed, 15 insertions(+), 1 deletion(-)

 diff --git a/fs/ext4/super.c b/fs/ext4/super.c
 index 4e2aacb..859eccb 100644
 --- a/fs/ext4/super.c
 +++ b/fs/ext4/super.c
 @@ -2201,7 +2201,21 @@ static void ext4_orphan_cleanup(struct super_block
 *sb,
   jbd_debug(2, truncating inode %lu to %lld bytes\n,
 inode-i_ino, inode-i_size);
   ext4_truncate(inode);
 - nr_truncates++;
 + if (list_empty(EXT4_I(inode)-i_orphan)) {
 + nr_truncates++;
 + } else {
 + /* Remove inode from in-memory orphan list */
 + list_del_init(EXT4_I(inode)-i_orphan);
 + ext4_msg(sb, KERN_ERR, Truncate failed for 
 +  orphan  inode = %lu. Running e2fsck
 +   is recommended, inode-i_ino);
 + if (!(s_flags  MS_RDONLY)) {
 + ext4_msg(sb, KERN_INFO, FS would be
 +  mounted as readonly);
 + s_flags |= MS_RDONLY;
 + }
 + break;
 + }
   } else {
   ext4_msg(sb, KERN_DEBUG,
   %s: deleting unreferenced inode %lu,
 --
 1.7.11.4


--
To unsubscribe from this list: send the line unsubscribe linux-kernel in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/


Re: [PATCH, RFC] Ext4: Mount partition as read only if during orphan cleanup truncate fails to obtain journal handle.

2012-12-06 Thread Theodore Ts'o
On Thu, Dec 06, 2012 at 04:59:43PM +0530, Ashish Sangwan wrote:
 
 Did you get any time to look into this patch?
 This problem is with ext4 only as ext4_truncate does not clean the
 orphan list unlike that of ext3_truncate.
 Instead, in case of failure to obtain handle, orphan list cleanup is
 done in ext4_setattr.
 But during mount, ext4_truncate is not called via ext4_setattr and
 hence the problem.
 What do you think?

In the patch description, you mentioned that this occurs when the
there is a failure to obtain a journal handle.  Is this a hypothetical
thing that you exposed via some kind of tester which checks to see
what happens if kmalloc() randomly fails some number of allocation
requests?  Or was it happening in real life?  And if it is happening
in real life, do we understand why it's happening, and is there
something we should be doing to mitigate against the root cause of the
failure?

The alternative to your patch is to do something similar to what ext3
does.  That is, if there are any inodes left on the orphan list, to
iterate through them all and then clean up the orphan list.  Perhaps
we should then also call ext4_error() since technically the file
system may very well be inconsistent (there may be allocated inodes
holding blocks which are no longer connected the directory hierarchy,
which e2fsck would be able to clean up).  But that could potentially
cause the system to panic or remount the file system read-only,
depending on what the errors= behavior is set to.  Which is why I go
back to the original question; do we understand why ext4_truncate()
was failing during orphan cleanup in the first place?

   - Ted
--
To unsubscribe from this list: send the line unsubscribe linux-kernel in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/


Re: [PATCH, RFC] Ext4: Mount partition as read only if during orphan cleanup truncate fails to obtain journal handle.

2012-10-26 Thread Namjae Jeon
Add Cc in mail loop.

Thanks.

2012/10/24, Ashish Sangwan :
> During orphan cleanup while doing truncate, if we fail to obtain journal
> handle, the inode for which truncate was called would not be removed from
> both the on-disk and in-memory orphan lists as the call to ext4_orphan_del
> would not be executed.
>
> This would have following consequences:
> a) As the inode is not removed from the on-disk list, truncate would be
> called again for the same inode. Each call would add the inode to the
> in-memory list. This operation would continue endlessly or until truncate
> is succeed.
>
> b) If somehow, after some iterations, truncate is succeed, ext4_orphan_del
> will only remove the inode from in-memory list just 1 time. This will
> trigger
> j_assert during put super.
>
> This patch handles both the problems. If truncate fails, first in-memory
> list is cleared and than the partition is mounted as read only.
> Failure to obtain journal handle during mount may suggest that journal
> device is corrupted.
>
> Signed-off-by: Ashish Sangwan 
> Signed-off-by: Namjae Jeon 
> ---
>  fs/ext4/super.c | 16 +++-
>  1 file changed, 15 insertions(+), 1 deletion(-)
>
> diff --git a/fs/ext4/super.c b/fs/ext4/super.c
> index 4e2aacb..859eccb 100644
> --- a/fs/ext4/super.c
> +++ b/fs/ext4/super.c
> @@ -2201,7 +2201,21 @@ static void ext4_orphan_cleanup(struct super_block
> *sb,
>   jbd_debug(2, "truncating inode %lu to %lld bytes\n",
> inode->i_ino, inode->i_size);
>   ext4_truncate(inode);
> - nr_truncates++;
> + if (list_empty(_I(inode)->i_orphan)) {
> + nr_truncates++;
> + } else {
> + /* Remove inode from in-memory orphan list */
> + list_del_init(_I(inode)->i_orphan);
> + ext4_msg(sb, KERN_ERR, "Truncate failed for "
> +  "orphan  inode = %lu. Running e2fsck"
> +  " is recommended", inode->i_ino);
> + if (!(s_flags & MS_RDONLY)) {
> + ext4_msg(sb, KERN_INFO, "FS would be"
> + " mounted as readonly");
> + s_flags |= MS_RDONLY;
> + }
> + break;
> + }
>   } else {
>   ext4_msg(sb, KERN_DEBUG,
>   "%s: deleting unreferenced inode %lu",
> --
> 1.7.11.4
>
>
--
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/


Re: [PATCH, RFC] Ext4: Mount partition as read only if during orphan cleanup truncate fails to obtain journal handle.

2012-10-26 Thread Namjae Jeon
Add Cc in mail loop.

Thanks.

2012/10/24, Ashish Sangwan ashishsangw...@gmail.com:
 During orphan cleanup while doing truncate, if we fail to obtain journal
 handle, the inode for which truncate was called would not be removed from
 both the on-disk and in-memory orphan lists as the call to ext4_orphan_del
 would not be executed.

 This would have following consequences:
 a) As the inode is not removed from the on-disk list, truncate would be
 called again for the same inode. Each call would add the inode to the
 in-memory list. This operation would continue endlessly or until truncate
 is succeed.

 b) If somehow, after some iterations, truncate is succeed, ext4_orphan_del
 will only remove the inode from in-memory list just 1 time. This will
 trigger
 j_assert during put super.

 This patch handles both the problems. If truncate fails, first in-memory
 list is cleared and than the partition is mounted as read only.
 Failure to obtain journal handle during mount may suggest that journal
 device is corrupted.

 Signed-off-by: Ashish Sangwan ashish.sangw...@gmail.com
 Signed-off-by: Namjae Jeon linkinj...@gmail.com
 ---
  fs/ext4/super.c | 16 +++-
  1 file changed, 15 insertions(+), 1 deletion(-)

 diff --git a/fs/ext4/super.c b/fs/ext4/super.c
 index 4e2aacb..859eccb 100644
 --- a/fs/ext4/super.c
 +++ b/fs/ext4/super.c
 @@ -2201,7 +2201,21 @@ static void ext4_orphan_cleanup(struct super_block
 *sb,
   jbd_debug(2, truncating inode %lu to %lld bytes\n,
 inode-i_ino, inode-i_size);
   ext4_truncate(inode);
 - nr_truncates++;
 + if (list_empty(EXT4_I(inode)-i_orphan)) {
 + nr_truncates++;
 + } else {
 + /* Remove inode from in-memory orphan list */
 + list_del_init(EXT4_I(inode)-i_orphan);
 + ext4_msg(sb, KERN_ERR, Truncate failed for 
 +  orphan  inode = %lu. Running e2fsck
 +   is recommended, inode-i_ino);
 + if (!(s_flags  MS_RDONLY)) {
 + ext4_msg(sb, KERN_INFO, FS would be
 +  mounted as readonly);
 + s_flags |= MS_RDONLY;
 + }
 + break;
 + }
   } else {
   ext4_msg(sb, KERN_DEBUG,
   %s: deleting unreferenced inode %lu,
 --
 1.7.11.4


--
To unsubscribe from this list: send the line unsubscribe linux-kernel in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/