Re: [PATCH v2] staging: erofs: fix an error handling in erofs_readdir()

2019-08-18 Thread Chao Yu
Hi Xiang,

On 2019-8-18 18:52, Gao Xiang wrote:
> Hi Chao,
> 
> On Sun, Aug 18, 2019 at 06:39:52PM +0800, Chao Yu wrote:
>> On 2019-8-18 10:53, Matthew Wilcox wrote:
>>> On Sun, Aug 18, 2019 at 10:32:45AM +0800, Gao Xiang wrote:
 On Sat, Aug 17, 2019 at 07:20:55PM -0700, Matthew Wilcox wrote:
> On Sun, Aug 18, 2019 at 09:56:31AM +0800, Gao Xiang wrote:
>> @@ -82,8 +82,12 @@ static int erofs_readdir(struct file *f, struct 
>> dir_context *ctx)
>>  unsigned int nameoff, maxsize;
>>  
>>  dentry_page = read_mapping_page(mapping, i, NULL);
>> -if (IS_ERR(dentry_page))
>> -continue;
>> +if (IS_ERR(dentry_page)) {
>> +errln("fail to readdir of logical block %u of 
>> nid %llu",
>> +  i, EROFS_V(dir)->nid);
>> +err = PTR_ERR(dentry_page);
>> +break;
>
> I don't think you want to use the errno that came back from
> read_mapping_page() (which is, I think, always going to be -EIO).
> Rather you want -EFSCORRUPTED, at least if I understand the recent
> patches to ext2/ext4/f2fs/xfs/...

 Thanks for your reply and noticing this. :)

 Yes, as I talked with you about read_mapping_page() in a xfs related
 topic earlier, I think I fully understand what returns here.

 I actually had some concern about that before sending out this patch.
 You know the status is
PG_uptodate is not set and PG_error is set here.

 But we cannot know it is actually a disk read error or due to
 corrupted images (due to lack of page flags or some status, and
 I think it could be a waste of page structure space for such
 corrupted image or disk error)...

 And some people also like propagate errors from insiders...
 (and they could argue about err = -EFSCORRUPTED as well..)

 I'd like hear your suggestion about this after my words above?
 still return -EFSCORRUPTED?
>>>
>>> I don't think it matters whether it's due to a disk error or a corrupted
>>> image.  We can't read the directory entry, so we should probably return
>>> -EFSCORRUPTED.  Thinking about it some more, read_mapping_page() can
>>> also return -ENOMEM, so it should probably look something like this:
>>>
>>> err = 0;
>>> if (dentry_page == ERR_PTR(-ENOMEM))
>>> err = -ENOMEM;
>>> else if (IS_ERR(dentry_page)) {
>>> errln("fail to readdir of logical block %u of nid %llu",
>>>   i, EROFS_V(dir)->nid);
>>> err = -EFSCORRUPTED;
>>
>> Well, if there is real IO error happen under filesystem, we should return 
>> -EIO
>> instead of EFSCORRUPTED?
>>
>> The right fix may be that doing sanity check on on-disk blkaddr, and return
>> -EFSCORRUPTED if the blkaddr is invalid and propagate the error to its caller
>> erofs_readdir(), IIUC below error info.
> 
> In my thought, I actually don't care what is actually returned
> (In other words, I have no tendency about EFSCORRUPTED / EIO)
> as long as it behaves normal for corrupted image.

I doubt that user can and be willing to handle different error code in there
error path.

> 
> A little concern is that I have no idea whether all user problems
> can handle EUCLEAN properly.

Yes, I can see it's widely used in fileystem, I guess it would be better to
update manual of common fs interfaces to let user be aware of the meaning of it.

> 
> I don't want to limit blkaddr as what ->blocks recorded in
> erofs_super_block since it's already used for our hotpatching
> approach and that field is only used for statfs() for users
> to know its visible size, and block layer will block such
> invalid block access.
> 
> All in all, that is minor. I think we can fix as what Matthew said.

Yeah, as we discuss offline, we need keep flexibility on current version for
android, and maybe we can add a feature to check blkaddr validation later.

Thanks,

> 
> Thanks,
> Gao Xiang
> 
>>
>>> [36297.354090] attempt to access beyond end of device
>>> [36297.354098] loop17: rw=0, want=29887428984, limit=1953128
>>> [36297.354107] attempt to access beyond end of device
>>> [36297.354109] loop17: rw=0, want=29887428480, limit=1953128
>>> [36301.827234] attempt to access beyond end of device
>>> [36301.827243] loop17: rw=0, want=29887428480, limit=1953128
>>
>> Thanks,
>>
>>> }
>>>
>>> if (err)
>>> break;
>>>
___
devel mailing list
de...@linuxdriverproject.org
http://driverdev.linuxdriverproject.org/mailman/listinfo/driverdev-devel


Re: [PATCH v2] staging: erofs: fix an error handling in erofs_readdir()

2019-08-18 Thread Gao Xiang
Hi Chao,

On Sun, Aug 18, 2019 at 06:39:52PM +0800, Chao Yu wrote:
> On 2019-8-18 10:53, Matthew Wilcox wrote:
> > On Sun, Aug 18, 2019 at 10:32:45AM +0800, Gao Xiang wrote:
> >> On Sat, Aug 17, 2019 at 07:20:55PM -0700, Matthew Wilcox wrote:
> >>> On Sun, Aug 18, 2019 at 09:56:31AM +0800, Gao Xiang wrote:
>  @@ -82,8 +82,12 @@ static int erofs_readdir(struct file *f, struct 
>  dir_context *ctx)
>   unsigned int nameoff, maxsize;
>   
>   dentry_page = read_mapping_page(mapping, i, NULL);
>  -if (IS_ERR(dentry_page))
>  -continue;
>  +if (IS_ERR(dentry_page)) {
>  +errln("fail to readdir of logical block %u of 
>  nid %llu",
>  +  i, EROFS_V(dir)->nid);
>  +err = PTR_ERR(dentry_page);
>  +break;
> >>>
> >>> I don't think you want to use the errno that came back from
> >>> read_mapping_page() (which is, I think, always going to be -EIO).
> >>> Rather you want -EFSCORRUPTED, at least if I understand the recent
> >>> patches to ext2/ext4/f2fs/xfs/...
> >>
> >> Thanks for your reply and noticing this. :)
> >>
> >> Yes, as I talked with you about read_mapping_page() in a xfs related
> >> topic earlier, I think I fully understand what returns here.
> >>
> >> I actually had some concern about that before sending out this patch.
> >> You know the status is
> >>PG_uptodate is not set and PG_error is set here.
> >>
> >> But we cannot know it is actually a disk read error or due to
> >> corrupted images (due to lack of page flags or some status, and
> >> I think it could be a waste of page structure space for such
> >> corrupted image or disk error)...
> >>
> >> And some people also like propagate errors from insiders...
> >> (and they could argue about err = -EFSCORRUPTED as well..)
> >>
> >> I'd like hear your suggestion about this after my words above?
> >> still return -EFSCORRUPTED?
> > 
> > I don't think it matters whether it's due to a disk error or a corrupted
> > image.  We can't read the directory entry, so we should probably return
> > -EFSCORRUPTED.  Thinking about it some more, read_mapping_page() can
> > also return -ENOMEM, so it should probably look something like this:
> > 
> > err = 0;
> > if (dentry_page == ERR_PTR(-ENOMEM))
> > err = -ENOMEM;
> > else if (IS_ERR(dentry_page)) {
> > errln("fail to readdir of logical block %u of nid %llu",
> >   i, EROFS_V(dir)->nid);
> > err = -EFSCORRUPTED;
> 
> Well, if there is real IO error happen under filesystem, we should return -EIO
> instead of EFSCORRUPTED?
> 
> The right fix may be that doing sanity check on on-disk blkaddr, and return
> -EFSCORRUPTED if the blkaddr is invalid and propagate the error to its caller
> erofs_readdir(), IIUC below error info.

In my thought, I actually don't care what is actually returned
(In other words, I have no tendency about EFSCORRUPTED / EIO)
as long as it behaves normal for corrupted image.

A little concern is that I have no idea whether all user problems
can handle EUCLEAN properly.

I don't want to limit blkaddr as what ->blocks recorded in
erofs_super_block since it's already used for our hotpatching
approach and that field is only used for statfs() for users
to know its visible size, and block layer will block such
invalid block access.

All in all, that is minor. I think we can fix as what Matthew said.

Thanks,
Gao Xiang

> 
> > [36297.354090] attempt to access beyond end of device
> > [36297.354098] loop17: rw=0, want=29887428984, limit=1953128
> > [36297.354107] attempt to access beyond end of device
> > [36297.354109] loop17: rw=0, want=29887428480, limit=1953128
> > [36301.827234] attempt to access beyond end of device
> > [36301.827243] loop17: rw=0, want=29887428480, limit=1953128
> 
> Thanks,
> 
> > }
> > 
> > if (err)
> > break;
> > 
___
devel mailing list
de...@linuxdriverproject.org
http://driverdev.linuxdriverproject.org/mailman/listinfo/driverdev-devel


Re: [PATCH v2] staging: erofs: fix an error handling in erofs_readdir()

2019-08-18 Thread Chao Yu
On 2019-8-18 10:53, Matthew Wilcox wrote:
> On Sun, Aug 18, 2019 at 10:32:45AM +0800, Gao Xiang wrote:
>> On Sat, Aug 17, 2019 at 07:20:55PM -0700, Matthew Wilcox wrote:
>>> On Sun, Aug 18, 2019 at 09:56:31AM +0800, Gao Xiang wrote:
 @@ -82,8 +82,12 @@ static int erofs_readdir(struct file *f, struct 
 dir_context *ctx)
unsigned int nameoff, maxsize;
  
dentry_page = read_mapping_page(mapping, i, NULL);
 -  if (IS_ERR(dentry_page))
 -  continue;
 +  if (IS_ERR(dentry_page)) {
 +  errln("fail to readdir of logical block %u of nid %llu",
 +i, EROFS_V(dir)->nid);
 +  err = PTR_ERR(dentry_page);
 +  break;
>>>
>>> I don't think you want to use the errno that came back from
>>> read_mapping_page() (which is, I think, always going to be -EIO).
>>> Rather you want -EFSCORRUPTED, at least if I understand the recent
>>> patches to ext2/ext4/f2fs/xfs/...
>>
>> Thanks for your reply and noticing this. :)
>>
>> Yes, as I talked with you about read_mapping_page() in a xfs related
>> topic earlier, I think I fully understand what returns here.
>>
>> I actually had some concern about that before sending out this patch.
>> You know the status is
>>PG_uptodate is not set and PG_error is set here.
>>
>> But we cannot know it is actually a disk read error or due to
>> corrupted images (due to lack of page flags or some status, and
>> I think it could be a waste of page structure space for such
>> corrupted image or disk error)...
>>
>> And some people also like propagate errors from insiders...
>> (and they could argue about err = -EFSCORRUPTED as well..)
>>
>> I'd like hear your suggestion about this after my words above?
>> still return -EFSCORRUPTED?
> 
> I don't think it matters whether it's due to a disk error or a corrupted
> image.  We can't read the directory entry, so we should probably return
> -EFSCORRUPTED.  Thinking about it some more, read_mapping_page() can
> also return -ENOMEM, so it should probably look something like this:
> 
>   err = 0;
>   if (dentry_page == ERR_PTR(-ENOMEM))
>   err = -ENOMEM;
>   else if (IS_ERR(dentry_page)) {
>   errln("fail to readdir of logical block %u of nid %llu",
> i, EROFS_V(dir)->nid);
>   err = -EFSCORRUPTED;

Well, if there is real IO error happen under filesystem, we should return -EIO
instead of EFSCORRUPTED?

The right fix may be that doing sanity check on on-disk blkaddr, and return
-EFSCORRUPTED if the blkaddr is invalid and propagate the error to its caller
erofs_readdir(), IIUC below error info.

> [36297.354090] attempt to access beyond end of device
> [36297.354098] loop17: rw=0, want=29887428984, limit=1953128
> [36297.354107] attempt to access beyond end of device
> [36297.354109] loop17: rw=0, want=29887428480, limit=1953128
> [36301.827234] attempt to access beyond end of device
> [36301.827243] loop17: rw=0, want=29887428480, limit=1953128

Thanks,

>   }
> 
>   if (err)
>   break;
> 
___
devel mailing list
de...@linuxdriverproject.org
http://driverdev.linuxdriverproject.org/mailman/listinfo/driverdev-devel


Re: [PATCH v2] staging: erofs: fix an error handling in erofs_readdir()

2019-08-17 Thread Gao Xiang
On Sat, Aug 17, 2019 at 07:53:39PM -0700, Matthew Wilcox wrote:
> On Sun, Aug 18, 2019 at 10:32:45AM +0800, Gao Xiang wrote:
> > On Sat, Aug 17, 2019 at 07:20:55PM -0700, Matthew Wilcox wrote:
> > > On Sun, Aug 18, 2019 at 09:56:31AM +0800, Gao Xiang wrote:
> > > > @@ -82,8 +82,12 @@ static int erofs_readdir(struct file *f, struct 
> > > > dir_context *ctx)
> > > > unsigned int nameoff, maxsize;
> > > >  
> > > > dentry_page = read_mapping_page(mapping, i, NULL);
> > > > -   if (IS_ERR(dentry_page))
> > > > -   continue;
> > > > +   if (IS_ERR(dentry_page)) {
> > > > +   errln("fail to readdir of logical block %u of 
> > > > nid %llu",
> > > > + i, EROFS_V(dir)->nid);
> > > > +   err = PTR_ERR(dentry_page);
> > > > +   break;
> > > 
> > > I don't think you want to use the errno that came back from
> > > read_mapping_page() (which is, I think, always going to be -EIO).
> > > Rather you want -EFSCORRUPTED, at least if I understand the recent
> > > patches to ext2/ext4/f2fs/xfs/...
> > 
> > Thanks for your reply and noticing this. :)
> > 
> > Yes, as I talked with you about read_mapping_page() in a xfs related
> > topic earlier, I think I fully understand what returns here.
> > 
> > I actually had some concern about that before sending out this patch.
> > You know the status is
> >PG_uptodate is not set and PG_error is set here.
> > 
> > But we cannot know it is actually a disk read error or due to
> > corrupted images (due to lack of page flags or some status, and
> > I think it could be a waste of page structure space for such
> > corrupted image or disk error)...
> > 
> > And some people also like propagate errors from insiders...
> > (and they could argue about err = -EFSCORRUPTED as well..)
> > 
> > I'd like hear your suggestion about this after my words above?
> > still return -EFSCORRUPTED?
> 
> I don't think it matters whether it's due to a disk error or a corrupted
> image.  We can't read the directory entry, so we should probably return
> -EFSCORRUPTED.  Thinking about it some more, read_mapping_page() can
> also return -ENOMEM, so it should probably look something like this:

OK, I will send the next version like what you described below.
I realized that at first but I have no tendency to return
EFSCORRUPTED or EIO here.

Thanks,
Gao Xiang

> 
>   err = 0;
>   if (dentry_page == ERR_PTR(-ENOMEM))
>   err = -ENOMEM;
>   else if (IS_ERR(dentry_page)) {
>   errln("fail to readdir of logical block %u of nid %llu",
> i, EROFS_V(dir)->nid);
>   err = -EFSCORRUPTED;
>   }
> 
>   if (err)
>   break;
___
devel mailing list
de...@linuxdriverproject.org
http://driverdev.linuxdriverproject.org/mailman/listinfo/driverdev-devel


Re: [PATCH v2] staging: erofs: fix an error handling in erofs_readdir()

2019-08-17 Thread Matthew Wilcox
On Sun, Aug 18, 2019 at 10:32:45AM +0800, Gao Xiang wrote:
> On Sat, Aug 17, 2019 at 07:20:55PM -0700, Matthew Wilcox wrote:
> > On Sun, Aug 18, 2019 at 09:56:31AM +0800, Gao Xiang wrote:
> > > @@ -82,8 +82,12 @@ static int erofs_readdir(struct file *f, struct 
> > > dir_context *ctx)
> > >   unsigned int nameoff, maxsize;
> > >  
> > >   dentry_page = read_mapping_page(mapping, i, NULL);
> > > - if (IS_ERR(dentry_page))
> > > - continue;
> > > + if (IS_ERR(dentry_page)) {
> > > + errln("fail to readdir of logical block %u of nid %llu",
> > > +   i, EROFS_V(dir)->nid);
> > > + err = PTR_ERR(dentry_page);
> > > + break;
> > 
> > I don't think you want to use the errno that came back from
> > read_mapping_page() (which is, I think, always going to be -EIO).
> > Rather you want -EFSCORRUPTED, at least if I understand the recent
> > patches to ext2/ext4/f2fs/xfs/...
> 
> Thanks for your reply and noticing this. :)
> 
> Yes, as I talked with you about read_mapping_page() in a xfs related
> topic earlier, I think I fully understand what returns here.
> 
> I actually had some concern about that before sending out this patch.
> You know the status is
>PG_uptodate is not set and PG_error is set here.
> 
> But we cannot know it is actually a disk read error or due to
> corrupted images (due to lack of page flags or some status, and
> I think it could be a waste of page structure space for such
> corrupted image or disk error)...
> 
> And some people also like propagate errors from insiders...
> (and they could argue about err = -EFSCORRUPTED as well..)
> 
> I'd like hear your suggestion about this after my words above?
> still return -EFSCORRUPTED?

I don't think it matters whether it's due to a disk error or a corrupted
image.  We can't read the directory entry, so we should probably return
-EFSCORRUPTED.  Thinking about it some more, read_mapping_page() can
also return -ENOMEM, so it should probably look something like this:

err = 0;
if (dentry_page == ERR_PTR(-ENOMEM))
err = -ENOMEM;
else if (IS_ERR(dentry_page)) {
errln("fail to readdir of logical block %u of nid %llu",
  i, EROFS_V(dir)->nid);
err = -EFSCORRUPTED;
}

if (err)
break;
___
devel mailing list
de...@linuxdriverproject.org
http://driverdev.linuxdriverproject.org/mailman/listinfo/driverdev-devel


Re: [PATCH v2] staging: erofs: fix an error handling in erofs_readdir()

2019-08-17 Thread Gao Xiang
Hi Willy,

On Sat, Aug 17, 2019 at 07:20:55PM -0700, Matthew Wilcox wrote:
> On Sun, Aug 18, 2019 at 09:56:31AM +0800, Gao Xiang wrote:
> > @@ -82,8 +82,12 @@ static int erofs_readdir(struct file *f, struct 
> > dir_context *ctx)
> > unsigned int nameoff, maxsize;
> >  
> > dentry_page = read_mapping_page(mapping, i, NULL);
> > -   if (IS_ERR(dentry_page))
> > -   continue;
> > +   if (IS_ERR(dentry_page)) {
> > +   errln("fail to readdir of logical block %u of nid %llu",
> > + i, EROFS_V(dir)->nid);
> > +   err = PTR_ERR(dentry_page);
> > +   break;
> 
> I don't think you want to use the errno that came back from
> read_mapping_page() (which is, I think, always going to be -EIO).
> Rather you want -EFSCORRUPTED, at least if I understand the recent
> patches to ext2/ext4/f2fs/xfs/...

Thanks for your reply and noticing this. :)

Yes, as I talked with you about read_mapping_page() in a xfs related
topic earlier, I think I fully understand what returns here.

I actually had some concern about that before sending out this patch.
You know the status is
   PG_uptodate is not set and PG_error is set here.

But we cannot know it is actually a disk read error or due to
corrupted images (due to lack of page flags or some status, and
I think it could be a waste of page structure space for such
corrupted image or disk error)...

And some people also like propagate errors from insiders...
(and they could argue about err = -EFSCORRUPTED as well..)

I'd like hear your suggestion about this after my words above?
still return -EFSCORRUPTED?

Thanks,
Gao Xiang

___
devel mailing list
de...@linuxdriverproject.org
http://driverdev.linuxdriverproject.org/mailman/listinfo/driverdev-devel


Re: [PATCH v2] staging: erofs: fix an error handling in erofs_readdir()

2019-08-17 Thread Matthew Wilcox
On Sun, Aug 18, 2019 at 09:56:31AM +0800, Gao Xiang wrote:
> @@ -82,8 +82,12 @@ static int erofs_readdir(struct file *f, struct 
> dir_context *ctx)
>   unsigned int nameoff, maxsize;
>  
>   dentry_page = read_mapping_page(mapping, i, NULL);
> - if (IS_ERR(dentry_page))
> - continue;
> + if (IS_ERR(dentry_page)) {
> + errln("fail to readdir of logical block %u of nid %llu",
> +   i, EROFS_V(dir)->nid);
> + err = PTR_ERR(dentry_page);
> + break;

I don't think you want to use the errno that came back from
read_mapping_page() (which is, I think, always going to be -EIO).
Rather you want -EFSCORRUPTED, at least if I understand the recent
patches to ext2/ext4/f2fs/xfs/...
___
devel mailing list
de...@linuxdriverproject.org
http://driverdev.linuxdriverproject.org/mailman/listinfo/driverdev-devel


[PATCH v2] staging: erofs: fix an error handling in erofs_readdir()

2019-08-17 Thread Gao Xiang
From: Gao Xiang 

Richard observed a forever loop of erofs_read_raw_page() [1]
which can be generated by forcely setting ->u.i_blkaddr
to 0xdeadbeef (as my understanding block layer can
handle access beyond end of device correctly).

After digging into that, it seems the problem is highly
related with directories and then I found the root cause
is an improper error handling in erofs_readdir().

Let's fix it now.

[1] 
https://lore.kernel.org/r/1163995781.68824.1566084358245.javamail.zim...@nod.at/

Reported-by: Richard Weinberger 
Fixes: 3aa8ec716e52 ("staging: erofs: add directory operations")
Cc:  # 4.19+
Signed-off-by: Gao Xiang 
---

changelog from v1:
 - fix the incorrect external link in commit message.

This patch is based on the following patch as well
https://lore.kernel.org/r/20190816071142.8633-1-gaoxian...@huawei.com/

and
https://lore.kernel.org/r/20190817082313.21040-1-hsiang...@aol.com/
can still be properly applied after this patch.

 drivers/staging/erofs/dir.c | 8 ++--
 1 file changed, 6 insertions(+), 2 deletions(-)

diff --git a/drivers/staging/erofs/dir.c b/drivers/staging/erofs/dir.c
index 5f38382637e6..f2d7539589e4 100644
--- a/drivers/staging/erofs/dir.c
+++ b/drivers/staging/erofs/dir.c
@@ -82,8 +82,12 @@ static int erofs_readdir(struct file *f, struct dir_context 
*ctx)
unsigned int nameoff, maxsize;
 
dentry_page = read_mapping_page(mapping, i, NULL);
-   if (IS_ERR(dentry_page))
-   continue;
+   if (IS_ERR(dentry_page)) {
+   errln("fail to readdir of logical block %u of nid %llu",
+ i, EROFS_V(dir)->nid);
+   err = PTR_ERR(dentry_page);
+   break;
+   }
 
de = (struct erofs_dirent *)kmap(dentry_page);
 
-- 
2.17.1

___
devel mailing list
de...@linuxdriverproject.org
http://driverdev.linuxdriverproject.org/mailman/listinfo/driverdev-devel