Re: [PATCH] fix inconsistent device between /proc/pid/maps and stat

2017-03-24 Thread Hirotaka Yamamoto
Hi,

2017-03-24 20:58 GMT+09:00 David Sterba :
> On Tue, Mar 21, 2017 at 10:53:09AM +0900, Satoru Takeuchi wrote:
>> There have been some discussions about inconsistent device between 
>> /proc/pid/maps and stat(2).
>>
>> http://thr3ads.net/btrfs-devel/2011/05/2346176-RFC-PATCH-0-2-btrfs-vfs-Return-same-device-in-stat-2-and-proc-pid-maps
>> https://www.spinics.net/lists/linux-btrfs/msg09044.html
>> https://patchwork.kernel.org/patch/2825842/
>> https://patchwork.kernel.org/patch/2831525/
>>
>> It's important since it breaks user programs like lsof(8). There was a 
>> patche by Mark to fix this problem.
>> However, unfortunately, that patch is not merged so far.
>
> And no variant of the get_map_dev will ever be merged. Reworking this
> requires extensions to the superblock and subvolume structures, making
> it more generic and suitable for VFS.

Aside from how to fix, yet I think fixing this is quite important.

I have found the problem when I wanted to list processes using vulnerable
shared libraries by, say, "lsof /lib/x86_64-linux-gnu/libcrypto.so.1.0.0".

As lsof did not report any process, I almost failed to restart processes
that were using the library in reality after installing upgrades.

Leaving this problem would open a big security concern as btrfs is
getting more popular.

Satoru, thank you for the proposal, and good luck to further
improvements!

@ymmt2005
--
To unsubscribe from this list: send the line "unsubscribe linux-btrfs" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


Re: [PATCH] fix inconsistent device between /proc/pid/maps and stat

2017-03-24 Thread Satoru Takeuchi
2017-03-24 20:58 GMT+09:00 David Sterba :
> On Tue, Mar 21, 2017 at 10:53:09AM +0900, Satoru Takeuchi wrote:
>> There have been some discussions about inconsistent device between 
>> /proc/pid/maps and stat(2).
>>
>> http://thr3ads.net/btrfs-devel/2011/05/2346176-RFC-PATCH-0-2-btrfs-vfs-Return-same-device-in-stat-2-and-proc-pid-maps
>> https://www.spinics.net/lists/linux-btrfs/msg09044.html
>> https://patchwork.kernel.org/patch/2825842/
>> https://patchwork.kernel.org/patch/2831525/
>>
>> It's important since it breaks user programs like lsof(8). There was a 
>> patche by Mark to fix this problem.
>> However, unfortunately, that patch is not merged so far.
>
> And no variant of the get_map_dev will ever be merged. Reworking this
> requires extensions to the superblock and subvolume structures, making
> it more generic and suitable for VFS.

Thank you for your comment. I'm going to reconsider how to fix this problem.

Thanks,
Satoru
--
To unsubscribe from this list: send the line "unsubscribe linux-btrfs" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


Re: [PATCH] fix inconsistent device between /proc/pid/maps and stat

2017-03-24 Thread David Sterba
On Tue, Mar 21, 2017 at 10:53:09AM +0900, Satoru Takeuchi wrote:
> There have been some discussions about inconsistent device between 
> /proc/pid/maps and stat(2).
> 
> http://thr3ads.net/btrfs-devel/2011/05/2346176-RFC-PATCH-0-2-btrfs-vfs-Return-same-device-in-stat-2-and-proc-pid-maps
> https://www.spinics.net/lists/linux-btrfs/msg09044.html
> https://patchwork.kernel.org/patch/2825842/
> https://patchwork.kernel.org/patch/2831525/
> 
> It's important since it breaks user programs like lsof(8). There was a patche 
> by Mark to fix this problem.
> However, unfortunately, that patch is not merged so far.

And no variant of the get_map_dev will ever be merged. Reworking this
requires extensions to the superblock and subvolume structures, making
it more generic and suitable for VFS.
--
To unsubscribe from this list: send the line "unsubscribe linux-btrfs" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


[PATCH] fix inconsistent device between /proc/pid/maps and stat

2017-03-20 Thread Satoru Takeuchi
There have been some discussions about inconsistent device between 
/proc/pid/maps and stat(2).

http://thr3ads.net/btrfs-devel/2011/05/2346176-RFC-PATCH-0-2-btrfs-vfs-Return-same-device-in-stat-2-and-proc-pid-maps
https://www.spinics.net/lists/linux-btrfs/msg09044.html
https://patchwork.kernel.org/patch/2825842/
https://patchwork.kernel.org/patch/2831525/

It's important since it breaks user programs like lsof(8). There was a patche 
by Mark to fix this problem.
However, unfortunately, that patch is not merged so far.

  NOTE:
  1) This patch is inspired by the Mark's one and I tweak it as follows.
 - move a flag for this fix from sb->s_flags to kernel internal sb->s_iflags
 - change that flag's name from MS_STAT_FOR_DEV to SB_I_LOGICALVOL, to make 
its meaning clearer
  2) This patch is for Chris's for-linus-4.11 (commit 
e1699d2d7bf6e6cce3e1baff19f9dd4595a58664 ("")). It should modify to apply to 
4.11-rcX because of statx patches changes
 struct inode_operations->getattr() interface.

For more information about this problem, please refer to the patchat the end of 
this mail.

---
Subject: [PATCH] fix inconsistent devie between /proc/pid/maps and stat

/proc/pid/maps returns each device as follows.

===
dev = inode->i_sb->s_dev;
===

However, stat(2) returns it as follows.

===
stat->dev = BTRFS_I(inode)->root->anon_dev;
===

It results in device mismatch and this inconsistency breaks users programs like 
lsof(8) as follows.

Without this patch:
===
$ mount | grep /home/vagrant/mnt
/dev/vda5 on /home/vagrant/mnt type btrfs 
(rw,relatime,noacl,space_cache,subvolid=5,subvol=/)
$ /home/vagrant/mnt/lsof | grep /home/vagrant/mnt
lsof  19292  root  txt   REG   0,44   
163136257 /home/sat/mnt/lsof
lsof  19292  root  mem   REG   0,42 
257 /home/sat/mnt/lsof (path dev=0,44)
lsof  19294  root  txt   REG   0,44   
163136257 /home/sat/mnt/lsof
lsof  19294  root  mem   REG   0,42 
257 /home/sat/mnt/lsof (path dev=0,44)
===

With this patch:
===
$ mount | grep /home/vagrant/mnt
/dev/vda5 on /home/vagrant/mnt type btrfs 
(rw,relatime,noacl,space_cache,subvolid=5,subvol=/)
$ /home/vagrant/mnt/lsof | grep /home/vagrant/mnt
lsof  752 root  txt   REG   0,44   163224   
 263 /home/vagrant/mnt/lsof
lsof  754 root  txt   REG   0,44   163224   
 263 /home/vagrant/mnt/lsof
===

Signed-off-by: Satoru Takeuchi 
CC: Signed-off-by: Mark Fasheh 
---
 fs/btrfs/super.c |  1 +
 fs/proc/generic.c| 13 +
 fs/proc/internal.h   |  1 +
 fs/proc/nommu.c  |  2 +-
 fs/proc/task_mmu.c   |  2 +-
 fs/proc/task_nommu.c |  2 +-
 include/linux/fs.h   |  1 +
 7 files changed, 19 insertions(+), 3 deletions(-)

diff --git a/fs/btrfs/super.c b/fs/btrfs/super.c
index da687dc79cce..2ccfdb107ba0 100644
--- a/fs/btrfs/super.c
+++ b/fs/btrfs/super.c
@@ -1133,6 +1133,7 @@ static int btrfs_fill_super(struct super_block *sb,
 #endif
sb->s_flags |= MS_I_VERSION;
sb->s_iflags |= SB_I_CGROUPWB;
+   sb->s_iflags |= SB_I_LOGICALVOL;
err = open_ctree(sb, fs_devices, (char *)data);
if (err) {
btrfs_err(fs_info, "open_ctree failed");
diff --git a/fs/proc/generic.c b/fs/proc/generic.c
index f6a01f09f79d..d38cd77b297c 100644
--- a/fs/proc/generic.c
+++ b/fs/proc/generic.c
@@ -646,3 +646,16 @@ void *PDE_DATA(const struct inode *inode)
return __PDE_DATA(inode);
 }
 EXPORT_SYMBOL(PDE_DATA);
+
+dev_t proc_get_map_dev(struct dentry *dentry)
+{
+   struct inode *inode = dentry->d_inode;
+   struct kstat kstat;
+
+   if (inode->i_sb->s_iflags & SB_I_LOGICALVOL &&
+   inode->i_op->getattr &&
+   inode->i_op->getattr(NULL, dentry, ) == 0)
+   return kstat.dev;
+
+   return inode->i_sb->s_dev;
+}
diff --git a/fs/proc/internal.h b/fs/proc/internal.h
index 2de5194ba378..bf0f11fc209b 100644
--- a/fs/proc/internal.h
+++ b/fs/proc/internal.h
@@ -190,6 +190,7 @@ static inline struct proc_dir_entry *pde_get(struct 
proc_dir_entry *pde)
return pde;
 }
 extern void pde_put(struct proc_dir_entry *);
+dev_t proc_get_map_dev(struct dentry *);
 
 static inline bool is_empty_pde(const struct proc_dir_entry *pde)
 {
diff --git a/fs/proc/nommu.c b/fs/proc/nommu.c
index 75634379f82e..e9c29864a50e 100644
--- a/fs/proc/nommu.c
+++ b/fs/proc/nommu.c
@@ -46,7 +46,7 @@ static int nommu_region_show(struct seq_file *m, struct 
vm_region *region)
 
if (file) {
struct inode *inode = file_inode(region->vm_file);
-   dev = inode->i_sb->s_dev;
+   dev = proc_get_map_dev(vma->vm_file->f_path.dentry);
ino = inode->i_ino;
}
 
diff --git a/fs/proc/task_mmu.c b/fs/proc/task_mmu.c
index