Re: [PATCH v8 07/24] erofs: add directory operations

2019-08-15 Thread Gao Xiang
Hi Linus,

On Thu, Aug 15, 2019 at 09:13:19AM -0700, Linus Torvalds wrote:
> On Wed, Aug 14, 2019 at 9:42 PM Gao Xiang  wrote:
> >
> > +
> > +static const unsigned char erofs_filetype_table[EROFS_FT_MAX] = {
> > +   [EROFS_FT_UNKNOWN]  = DT_UNKNOWN,
> > +   [EROFS_FT_REG_FILE] = DT_REG,
> > +   [EROFS_FT_DIR]  = DT_DIR,
> > +   [EROFS_FT_CHRDEV]   = DT_CHR,
> > +   [EROFS_FT_BLKDEV]   = DT_BLK,
> > +   [EROFS_FT_FIFO] = DT_FIFO,
> > +   [EROFS_FT_SOCK] = DT_SOCK,
> > +   [EROFS_FT_SYMLINK]  = DT_LNK,
> > +};
> 
> Hmm.
> 
> The EROFS_FT_XYZ values seem to match the normal FT_XYZ values, and
> we've lately tried to just have filesystems use the standard ones
> instead of having a (pointless) duplicate conversion between the two.
> 
> And then you can use the common "fs_ftype_to_dtype()" to convert from
> FT_XYZ to DT_XYZ.
> 
> Maybe I'm missing something, and the EROFS_FT_x list actually differs
> from the normal FT_x list some way, but it would be good to not
> introduce another case of this in normal filesystems, just as we've
> been getting rid of them.
> 
> See for example commit e10892189428 ("ext2: use common file type conversion").

Yes, you're right. There is nothing different with DT_XYZ since
I followed what f2fs did when I wrote this place.

Actually, I noticed that patchset once in mailing list months ago
https://lore.kernel.org/r/20181023201952.GA15676@pathfinder/
but I didn't keep eyes on it (whether this patchset is merged or not...)

OK, let me fix that like other fses. Thanks for pointing out.

Thanks,
Gao Xiang

> 
>Linus


Re: [PATCH v8 07/24] erofs: add directory operations

2019-08-15 Thread Linus Torvalds
On Wed, Aug 14, 2019 at 9:42 PM Gao Xiang  wrote:
>
> +
> +static const unsigned char erofs_filetype_table[EROFS_FT_MAX] = {
> +   [EROFS_FT_UNKNOWN]  = DT_UNKNOWN,
> +   [EROFS_FT_REG_FILE] = DT_REG,
> +   [EROFS_FT_DIR]  = DT_DIR,
> +   [EROFS_FT_CHRDEV]   = DT_CHR,
> +   [EROFS_FT_BLKDEV]   = DT_BLK,
> +   [EROFS_FT_FIFO] = DT_FIFO,
> +   [EROFS_FT_SOCK] = DT_SOCK,
> +   [EROFS_FT_SYMLINK]  = DT_LNK,
> +};

Hmm.

The EROFS_FT_XYZ values seem to match the normal FT_XYZ values, and
we've lately tried to just have filesystems use the standard ones
instead of having a (pointless) duplicate conversion between the two.

And then you can use the common "fs_ftype_to_dtype()" to convert from
FT_XYZ to DT_XYZ.

Maybe I'm missing something, and the EROFS_FT_x list actually differs
from the normal FT_x list some way, but it would be good to not
introduce another case of this in normal filesystems, just as we've
been getting rid of them.

See for example commit e10892189428 ("ext2: use common file type conversion").

   Linus


Re: [PATCH v2] Add flags option to get xattr method paired to __vfs_getxattr

2019-08-15 Thread Jan Kara
On Wed 14-08-19 07:54:16, Mark Salyzyn wrote:
> On 8/14/19 4:00 AM, Jan Kara wrote:
> > On Tue 13-08-19 07:55:06, Mark Salyzyn wrote:
> > ...
> > > diff --git a/fs/xattr.c b/fs/xattr.c
> > > index 90dd78f0eb27..71f887518d6f 100644
> > > --- a/fs/xattr.c
> > > +++ b/fs/xattr.c
> > ...
> > >   ssize_t
> > >   __vfs_getxattr(struct dentry *dentry, struct inode *inode, const char 
> > > *name,
> > > -void *value, size_t size)
> > > +void *value, size_t size, int flags)
> > >   {
> > >   const struct xattr_handler *handler;
> > > -
> > > - handler = xattr_resolve_name(inode, );
> > > - if (IS_ERR(handler))
> > > - return PTR_ERR(handler);
> > > - if (!handler->get)
> > > - return -EOPNOTSUPP;
> > > - return handler->get(handler, dentry, inode, name, value, size);
> > > -}
> > > -EXPORT_SYMBOL(__vfs_getxattr);
> > > -
> > > -ssize_t
> > > -vfs_getxattr(struct dentry *dentry, const char *name, void *value, 
> > > size_t size)
> > > -{
> > > - struct inode *inode = dentry->d_inode;
> > >   int error;
> > > + if (flags & XATTR_NOSECURITY)
> > > + goto nolsm;
> > Hum, is it OK for XATTR_NOSECURITY to skip even the xattr_permission()
> > check? I understand that for reads of security xattrs it actually does not
> > matter in practice but conceptually that seems wrong to me as
> > XATTR_NOSECURITY is supposed to skip just security-module checks to avoid
> > recursion AFAIU.
> 
> Good catch I think.
> 
> I was attempting to make this change purely inert, no change in
> functionality, only a change in API. Adding a call to xattr_permission would
> incur a change in overall functionality, as it would introduce into the
> current and original __vfs_getxattr a call to xattr_permission that was not
> there before.
> 
> (I will have to defer the real answer and requirements to the security
> folks)
> 
> AFAIK you are correct, and to make the call would reduce the attack surface,
> trading a very small amount of CPU utilization, for a much larger amount of
> trust.
> 
> Given the long history of this patch set (for overlayfs) and the large
> amount of stakeholders, I would _prefer_ to submit a followup independent
> functionality/security change to _vfs_get_xattr _after_ this makes it in.

You're right. The problem was there before. So ack to changing this later.

> > > diff --git a/include/uapi/linux/xattr.h b/include/uapi/linux/xattr.h
> > > index c1395b5bd432..1216d777d210 100644
> > > --- a/include/uapi/linux/xattr.h
> > > +++ b/include/uapi/linux/xattr.h
> > > @@ -17,8 +17,9 @@
> > >   #if __UAPI_DEF_XATTR
> > >   #define __USE_KERNEL_XATTR_DEFS
> > > -#define XATTR_CREATE 0x1 /* set value, fail if attr already 
> > > exists */
> > > -#define XATTR_REPLACE0x2 /* set value, fail if attr does not 
> > > exist */
> > > +#define XATTR_CREATE  0x1/* set value, fail if attr already 
> > > exists */
> > > +#define XATTR_REPLACE 0x2/* set value, fail if attr does not 
> > > exist */
> > > +#define XATTR_NOSECURITY 0x4 /* get value, do not involve security 
> > > check */
> > >   #endif
> > It seems confusing to export XATTR_NOSECURITY definition to userspace when
> > that is kernel-internal flag. I'd just define it in include/linux/xattr.h
> > somewhere from the top of flags space (like 0x4000).
> > 
> > Otherwise the patch looks OK to me (cannot really comment on the security
> > module aspect of this whole thing though).
> 
> Good point. However, we do need to keep these flags together to reduce
> maintenance risk, I personally abhor two locations for flags bits even if
> one comes from the opposite bit-side; collisions are undetectable at build
> time. Although I have not gone through the entire thought experiment, I am
> expecting that fuse could possibly benefit from this flag (if exposed) since
> it also has a security recursion. That said, fuse is probably the example of
> a gaping wide attack surface if user space had access to it ... your
> xattr_permissions call addition requested above would be realistically, not
> just pedantically, required!

Yeah, flags bits in two places are bad as well. So maybe at least
#ifdef __KERNEL__ bit around the definitiona and a comment that it is
kernel internal flag?

Honza
-- 
Jan Kara 
SUSE Labs, CR


[PATCH v4] Add flags option to get xattr method paired to __vfs_getxattr

2019-08-15 Thread Mark Salyzyn via Linux-erofs
Add a flag option to get xattr method that could have a bit flag of
XATTR_NOSECURITY passed to it.  XATTR_NOSECURITY is generally then
set in the __vfs_getxattr path.

This handles the case of a union filesystem driver that is being
requested by the security layer to report back the xattr data.

For the use case where access is to be blocked by the security layer.

The path then could be security(dentry) ->
__vfs_getxattr(dentry...XATTR_NOSECUIRTY) ->
handler->get(dentry...XATTR_NOSECURITY) ->
__vfs_getxattr(lower_dentry...XATTR_NOSECUIRTY) ->
lower_handler->get(lower_dentry...XATTR_NOSECUIRTY)
which would report back through the chain data and success as
expected, the logging security layer at the top would have the
data to determine the access permissions and report back the target
context that was blocked.

Without the get handler flag, the path on a union filesystem would be
the errant security(dentry) -> __vfs_getxattr(dentry) ->
handler->get(dentry) -> vfs_getxattr(lower_dentry) -> nested ->
security(lower_dentry, log off) -> lower_handler->get(lower_dentry)
which would report back through the chain no data, and -EACCES.

For selinux for both cases, this would translate to a correctly
determined blocked access. In the first case with this change a correct avc
log would be reported, in the second legacy case an incorrect avc log
would be reported against an uninitialized u:object_r:unlabeled:s0
context making the logs cosmetically useless for audit2allow.

This patch series is inert and is the wide-spread addition of the
flags option for xattr functions, and a replacement of _vfs_getxattr
with __vfs_getxattr(...XATTR_NOSECURITY).

Signed-off-by: Mark Salyzyn 
Cc: Stephen Smalley 
Cc: linux-ker...@vger.kernel.org
Cc: kernel-t...@android.com
Cc: linux-security-mod...@vger.kernel.org
Cc: Jan Kara 
Cc: sta...@vger.kernel.org # 4.4, 4.9, 4.14 & 4.19
---
v4:
- ifdef __KERNEL__ around XATTR_NOSECURITY to
  keep it colocated in uapi headers.

v3:
- poor aim on ubifs not ubifs_xattr_get, but static xattr_get

v2:
- Missed a spot: ubifs, erofs and afs.

v1:
- Removed from an overlayfs patch set, and made independent.
  Expect this to be the basis of some security improvements.
---
 drivers/staging/erofs/xattr.c |  3 ++-
 fs/9p/acl.c   |  3 ++-
 fs/9p/xattr.c |  3 ++-
 fs/afs/xattr.c|  8 +++
 fs/btrfs/xattr.c  |  3 ++-
 fs/ceph/xattr.c   |  3 ++-
 fs/cifs/xattr.c   |  2 +-
 fs/ecryptfs/inode.c   |  6 --
 fs/ecryptfs/mmap.c|  2 +-
 fs/ext2/xattr_trusted.c   |  2 +-
 fs/ext2/xattr_user.c  |  2 +-
 fs/ext4/xattr_security.c  |  2 +-
 fs/ext4/xattr_trusted.c   |  2 +-
 fs/ext4/xattr_user.c  |  2 +-
 fs/f2fs/xattr.c   |  4 ++--
 fs/fuse/xattr.c   |  4 ++--
 fs/gfs2/xattr.c   |  3 ++-
 fs/hfs/attr.c |  2 +-
 fs/hfsplus/xattr.c|  3 ++-
 fs/hfsplus/xattr_trusted.c|  3 ++-
 fs/hfsplus/xattr_user.c   |  3 ++-
 fs/jffs2/security.c   |  3 ++-
 fs/jffs2/xattr_trusted.c  |  3 ++-
 fs/jffs2/xattr_user.c |  3 ++-
 fs/jfs/xattr.c|  5 +++--
 fs/kernfs/inode.c |  3 ++-
 fs/nfs/nfs4proc.c |  6 --
 fs/ocfs2/xattr.c  |  9 +---
 fs/orangefs/xattr.c   |  3 ++-
 fs/overlayfs/super.c  |  8 ---
 fs/posix_acl.c|  2 +-
 fs/reiserfs/xattr_security.c  |  3 ++-
 fs/reiserfs/xattr_trusted.c   |  3 ++-
 fs/reiserfs/xattr_user.c  |  3 ++-
 fs/squashfs/xattr.c   |  2 +-
 fs/ubifs/xattr.c  |  3 ++-
 fs/xattr.c| 36 +++
 fs/xfs/xfs_xattr.c|  3 ++-
 include/linux/xattr.h |  9 
 include/uapi/linux/xattr.h|  7 --
 mm/shmem.c|  3 ++-
 net/socket.c  |  3 ++-
 security/commoncap.c  |  6 --
 security/integrity/evm/evm_main.c |  3 ++-
 security/selinux/hooks.c  | 11 ++
 security/smack/smack_lsm.c|  5 +++--
 46 files changed, 126 insertions(+), 84 deletions(-)

diff --git a/drivers/staging/erofs/xattr.c b/drivers/staging/erofs/xattr.c
index df40654b9fbb..69440065432c 100644
--- a/drivers/staging/erofs/xattr.c
+++ b/drivers/staging/erofs/xattr.c
@@ -463,7 +463,8 @@ int erofs_getxattr(struct inode *inode, int index,
 
 static int erofs_xattr_generic_get(const struct xattr_handler *handler,
   struct dentry *unused, struct inode *inode,
-  const char *name, void *buffer, size_t size)
+  const char *name, void *buffer, size_t size,
+  int flags)
 {
struct 

Re: [PATCH v8 00/24] erofs: promote erofs from staging v8

2019-08-15 Thread Greg Kroah-Hartman
On Thu, Aug 15, 2019 at 12:41:31PM +0800, Gao Xiang wrote:
> [I strip the previous cover letter, the old one can be found in v6:
>  https://lore.kernel.org/r/20190802125347.166018-1-gaoxian...@huawei.com/]
> 
> We'd like to submit a formal moving patch applied to staging tree
> for 5.4, before that we'd like to hear if there are some ACKs,
> suggestions or NAKs, objections of EROFS. Therefore, we can improve
> it in this round or rethink about the whole thing.
> 
> As related materials mentioned [1] [2], the goal of EROFS is to
> save extra storage space with guaranteed end-to-end performance
> for read-only files, which has better performance over exist Linux
> compression filesystems based on fixed-sized output compression
> and inplace decompression. It even has better performance in
> a large compression ratio range compared with generic uncompressed
> filesystems with proper CPU-storage combinations. And we think this
> direction is correct and a dedicated kernel team is continuously /
> actively working on improving it, enough testers and beta / end
> users using it.
> 
> EROFS has been applied to almost all in-service HUAWEI smartphones
> (Yes, the number is still increasing by time) and it seems like
> a success. It can be used in more wider scenarios. We think it's
> useful for Linux / Android OS community and it's the time moving
> out of staging.
> 
> In order to get started, latest stable mkfs.erofs is available at
> 
> git://git.kernel.org/pub/scm/linux/kernel/git/xiang/erofs-utils.git -b dev
> 
> with README in the repository.
> 
> We are still tuning sequential read performance for ultra-fast
> speed NVME SSDs like Samsung 970PRO, but at least now you can
> try on your PC with some data with proper compression ratio,
> the latest Linux kernel, USB stick for convenience sake and
> a not very old-fashioned CPU. There are also benchmarks available
> in the above materials mentioned.
> 
> EROFS is a self-contained filesystem driver. Although there are
> still some TODOs to be more generic, we will actively keep on
> developping / tuning EROFS with the evolution of Linux kernel
> as the other in-kernel filesystems.
> 
> As I mentioned before in LSF/MM 2019, in the future, we'd like
> to generalize the decompression engine into a library for other
> fses to use after the whole system is mature like fscrypt.
> However, such metadata should be designed respectively for
> each fs, and synchronous metadata read cost will be larger
> than EROFS because of those ondisk limitation. Therefore EROFS
> is still a better choice for read-only scenarios.
> 
> EROFS is now ready for reviewing and moving, and the code is
> already cleaned up as shiny floors... Please kindly take some
> precious time, share your comments about EROFS and let us know
> your opinion about this. It's really important for us since
> generally speaking, we like to use Linux _in-tree_ stuffs rather
> than lack of supported out-of-tree / orphan stuffs as well.

I know everyone is busy, but given the length this has been in staging,
and the constant good progress toward cleaning it all up that has been
happening, I want to get this moved out of staging soon.

So, unless there are any objections, I'll take this patchset in a week
into my staging tree to move the filesystem into the "real" part of the
kernel.

thanks,

greg k-h


Re: [PATCH v7 08/24] erofs: add namei functions

2019-08-15 Thread Pavel Machek
Hi!

> > > + /*
> > > +  * on-disk error, let's only BUG_ON in the debugging mode.
> > > +  * otherwise, it will return 1 to just skip the invalid name
> > > +  * and go on (in consideration of the lookup performance).
> > > +  */
> > > + DBG_BUGON(qd->name > qd->end);
> > 
> > I believe you should check for errors in non-debug mode, too.
> 
> Thanks for your kindly reply!
> 
> The following is just my personal thought... If I am wrong, please
> kindly point out...
> 
> As you can see, this is a new prefixed string binary search algorithm
> which can provide similar performance with hashed approach (but no
> need to store hash value at all), so I really care about its lookup
> performance.
> 
> There is something needing to be concerned, is, whether namei() should
> report any potential on-disk issues or just return -ENOENT for these
> corrupted dirs, I think I tend to use the latter one.

-ENOENT is okay for corrupted directories, as long as corrupted
directories do not cause some kind of security bugs (memory
corruption, crashes, ...)


Best regards,
Pavel
-- 
DENX Software Engineering GmbH,  Managing Director: Wolfgang Denk
HRB 165235 Munich, Office: Kirchenstr.5, D-82194 Groebenzell, Germany


signature.asc
Description: Digital signature