Re: [PATCH v2] statx: optimize copy of struct statx to userspace

2017-03-13 Thread Eric Biggers
On Mon, Mar 13, 2017 at 11:27:32AM +0100, Florian Weimer wrote:
> On 03/13/2017 05:34 AM, Andreas Dilger wrote:
> >Not that it is a huge deal either way, but I'd think it is harder for the
> >compiler to optimize across a function call boundary like memset() vs. a
> >struct initialization in the same function where it can see that all but
> >a few of the fields are being overwritten immediately before they are used.
> 
> GCC treats memset as a function call only if options such as
> -ffreestanding or -fno-builtin are enabled, or if memset is
> redefined in a header file.  Does the kernel do this?
> 

No, it does not.  So gcc treats memset() as a request to clear memory, not as a
request to call a function called "memset()" specifically.  On x86_64 it's
compiling it into a "rep stos" instruction.

- Eric


Re: [PATCH v2] statx: optimize copy of struct statx to userspace

2017-03-13 Thread Eric Biggers
On Mon, Mar 13, 2017 at 11:27:32AM +0100, Florian Weimer wrote:
> On 03/13/2017 05:34 AM, Andreas Dilger wrote:
> >Not that it is a huge deal either way, but I'd think it is harder for the
> >compiler to optimize across a function call boundary like memset() vs. a
> >struct initialization in the same function where it can see that all but
> >a few of the fields are being overwritten immediately before they are used.
> 
> GCC treats memset as a function call only if options such as
> -ffreestanding or -fno-builtin are enabled, or if memset is
> redefined in a header file.  Does the kernel do this?
> 

No, it does not.  So gcc treats memset() as a request to clear memory, not as a
request to call a function called "memset()" specifically.  On x86_64 it's
compiling it into a "rep stos" instruction.

- Eric


Re: [PATCH v2] statx: optimize copy of struct statx to userspace

2017-03-13 Thread Florian Weimer

On 03/13/2017 05:34 AM, Andreas Dilger wrote:

Not that it is a huge deal either way, but I'd think it is harder for the
compiler to optimize across a function call boundary like memset() vs. a
struct initialization in the same function where it can see that all but
a few of the fields are being overwritten immediately before they are used.


GCC treats memset as a function call only if options such as 
-ffreestanding or -fno-builtin are enabled, or if memset is redefined in 
a header file.  Does the kernel do this?



I don't think the designated initializer is any less clear to the reader
that the struct is zeroed out compared to using memset().  Possibly the
best compromise is to use a designated initializer that specifies all of
the known fields, and leaves it to the compiler to initialize unset fields
or padding.


GCC will not always initialize padding if you specify a designated 
initializer because padding values are unspeficied and their value does 
not matter from a language point of view.


Thanks,
Florian


Re: [PATCH v2] statx: optimize copy of struct statx to userspace

2017-03-13 Thread Florian Weimer

On 03/13/2017 05:34 AM, Andreas Dilger wrote:

Not that it is a huge deal either way, but I'd think it is harder for the
compiler to optimize across a function call boundary like memset() vs. a
struct initialization in the same function where it can see that all but
a few of the fields are being overwritten immediately before they are used.


GCC treats memset as a function call only if options such as 
-ffreestanding or -fno-builtin are enabled, or if memset is redefined in 
a header file.  Does the kernel do this?



I don't think the designated initializer is any less clear to the reader
that the struct is zeroed out compared to using memset().  Possibly the
best compromise is to use a designated initializer that specifies all of
the known fields, and leaves it to the compiler to initialize unset fields
or padding.


GCC will not always initialize padding if you specify a designated 
initializer because padding values are unspeficied and their value does 
not matter from a language point of view.


Thanks,
Florian


Re: [PATCH v2] statx: optimize copy of struct statx to userspace

2017-03-12 Thread Andreas Dilger
On Mar 11, 2017, at 11:01 PM, Eric Biggers  wrote:
> 
> On Sat, Mar 11, 2017 at 08:02:06PM -0800, Eric Biggers wrote:
>> On Sun, Mar 12, 2017 at 02:29:27AM +, Al Viro wrote:
>>> 
>>> Oh, I agree that multiple __put_user() are wrong; I also agree that bulk
>>> copy is the right approach (when we get the unsafe stuff right, we can
>>> revisit that, but I suspect that on quite a few architectures a bulk copy
>>> will still give better time, no matter what).
>>> 
 If padding is a concern at all (AFAICS it's not actually an issue now
 with struct statx, but people tend to have different opinions on how
 careful they want to be with padding), then I think we'll just have to
 start by memsetting the whole struct to 0.
>>> 
>>> My point is simply that it's worth a comment in that code.
>> 
>> Okay, thanks.  I'll add a comment about the padding assumption, and I think
>> I'll take the suggestion to use a designated initializer.  Then at least
>> all *fields* get initialized by default.  And if in the future someone
>> wants to conditionally initialize fields, then they can use ?: or they can
>> do it after the initializer.  Either way, at least they won't be able to
>> forget to zero some field.
> 
> Okay, well, I may have changed my mind again...  I tried the designated
> initializer on x86_64 with gcc 4.8 and 6.3, and also on arm64 with gcc 4.8.
> In each case, it was compiled into first zeroing all 256 bytes of the struct,
> just like memset(, 0, sizeof(tmp)).  Yes, this was with
> CC_OPTIMIZE_FOR_PERFORMANCE=y.  So I think we might as well just write the
> full memset(), making it completely clear that everything is initialized.
> (This is especially useful for people who are auditing code paths like this
> for information leaks.)  Also, a smart compiler could potentially optimize
> away parts of the memset() anyway...

Not that it is a huge deal either way, but I'd think it is harder for the
compiler to optimize across a function call boundary like memset() vs. a
struct initialization in the same function where it can see that all but
a few of the fields are being overwritten immediately before they are used.

I don't think the designated initializer is any less clear to the reader
that the struct is zeroed out compared to using memset().  Possibly the
best compromise is to use a designated initializer that specifies all of
the known fields, and leaves it to the compiler to initialize unset fields
or padding.  That avoids double zeroing without any risk of exposing unset
fields to userspace:

static int cp_statx(const struct kstat *stat, struct statx __user *buffer)
{
struct statx tmp = {
.stx_mask = stat->result_mask;
.stx_blksize = stat->blksize;
.stx_attributes = stat->attributes;
.stx_nlink = stat->nlink;
.stx_uid = from_kuid_munged(current_user_ns(), stat->uid);
.stx_gid = from_kgid_munged(current_user_ns(), stat->gid);
.stx_mode = stat->mode;
.stx_ino = stat->ino;
.stx_size = stat->size;
.stx_blocks = stat->blocks;
.stx_atime.tv_sec = stat->atime.tv_sec;
.stx_atime.tv_nsec = stat->atime.tv_nsec;
.stx_btime.tv_sec = stat->btime.tv_sec;
.stx_btime.tv_nsec = stat->btime.tv_nsec;
.stx_ctime.tv_sec = stat->ctime.tv_sec;
.stx_ctime.tv_nsec = stat->ctime.tv_nsec;
.stx_mtime.tv_sec = stat->mtime.tv_sec;
.stx_mtime.tv_nsec = stat->mtime.tv_nsec;
.stx_rdev_major = MAJOR(stat->rdev);
.stx_rdev_minor = MINOR(stat->rdev);
.stx_dev_major = MAJOR(stat->dev);
.stx_dev_minor = MINOR(stat->dev);
};

return copy_to_user(buffer, , sizeof(tmp)) ? -EFAULT : 0;
}

Cheers, Andreas







signature.asc
Description: Message signed with OpenPGP


Re: [PATCH v2] statx: optimize copy of struct statx to userspace

2017-03-12 Thread Andreas Dilger
On Mar 11, 2017, at 11:01 PM, Eric Biggers  wrote:
> 
> On Sat, Mar 11, 2017 at 08:02:06PM -0800, Eric Biggers wrote:
>> On Sun, Mar 12, 2017 at 02:29:27AM +, Al Viro wrote:
>>> 
>>> Oh, I agree that multiple __put_user() are wrong; I also agree that bulk
>>> copy is the right approach (when we get the unsafe stuff right, we can
>>> revisit that, but I suspect that on quite a few architectures a bulk copy
>>> will still give better time, no matter what).
>>> 
 If padding is a concern at all (AFAICS it's not actually an issue now
 with struct statx, but people tend to have different opinions on how
 careful they want to be with padding), then I think we'll just have to
 start by memsetting the whole struct to 0.
>>> 
>>> My point is simply that it's worth a comment in that code.
>> 
>> Okay, thanks.  I'll add a comment about the padding assumption, and I think
>> I'll take the suggestion to use a designated initializer.  Then at least
>> all *fields* get initialized by default.  And if in the future someone
>> wants to conditionally initialize fields, then they can use ?: or they can
>> do it after the initializer.  Either way, at least they won't be able to
>> forget to zero some field.
> 
> Okay, well, I may have changed my mind again...  I tried the designated
> initializer on x86_64 with gcc 4.8 and 6.3, and also on arm64 with gcc 4.8.
> In each case, it was compiled into first zeroing all 256 bytes of the struct,
> just like memset(, 0, sizeof(tmp)).  Yes, this was with
> CC_OPTIMIZE_FOR_PERFORMANCE=y.  So I think we might as well just write the
> full memset(), making it completely clear that everything is initialized.
> (This is especially useful for people who are auditing code paths like this
> for information leaks.)  Also, a smart compiler could potentially optimize
> away parts of the memset() anyway...

Not that it is a huge deal either way, but I'd think it is harder for the
compiler to optimize across a function call boundary like memset() vs. a
struct initialization in the same function where it can see that all but
a few of the fields are being overwritten immediately before they are used.

I don't think the designated initializer is any less clear to the reader
that the struct is zeroed out compared to using memset().  Possibly the
best compromise is to use a designated initializer that specifies all of
the known fields, and leaves it to the compiler to initialize unset fields
or padding.  That avoids double zeroing without any risk of exposing unset
fields to userspace:

static int cp_statx(const struct kstat *stat, struct statx __user *buffer)
{
struct statx tmp = {
.stx_mask = stat->result_mask;
.stx_blksize = stat->blksize;
.stx_attributes = stat->attributes;
.stx_nlink = stat->nlink;
.stx_uid = from_kuid_munged(current_user_ns(), stat->uid);
.stx_gid = from_kgid_munged(current_user_ns(), stat->gid);
.stx_mode = stat->mode;
.stx_ino = stat->ino;
.stx_size = stat->size;
.stx_blocks = stat->blocks;
.stx_atime.tv_sec = stat->atime.tv_sec;
.stx_atime.tv_nsec = stat->atime.tv_nsec;
.stx_btime.tv_sec = stat->btime.tv_sec;
.stx_btime.tv_nsec = stat->btime.tv_nsec;
.stx_ctime.tv_sec = stat->ctime.tv_sec;
.stx_ctime.tv_nsec = stat->ctime.tv_nsec;
.stx_mtime.tv_sec = stat->mtime.tv_sec;
.stx_mtime.tv_nsec = stat->mtime.tv_nsec;
.stx_rdev_major = MAJOR(stat->rdev);
.stx_rdev_minor = MINOR(stat->rdev);
.stx_dev_major = MAJOR(stat->dev);
.stx_dev_minor = MINOR(stat->dev);
};

return copy_to_user(buffer, , sizeof(tmp)) ? -EFAULT : 0;
}

Cheers, Andreas







signature.asc
Description: Message signed with OpenPGP


Re: [PATCH v2] statx: optimize copy of struct statx to userspace

2017-03-11 Thread Eric Biggers
On Sat, Mar 11, 2017 at 08:02:06PM -0800, Eric Biggers wrote:
> On Sun, Mar 12, 2017 at 02:29:27AM +, Al Viro wrote:
> > 
> > Oh, I agree that multiple __put_user() are wrong; I also agree that bulk 
> > copy is
> > the right approach (when we get the unsafe stuff right, we can revisit 
> > that, but
> > I suspect that on quite a few architectures a bulk copy will still give 
> > better
> > time, no matter what).
> > 
> > > If padding is a concern at all (AFAICS it's not actually an issue now with
> > > struct statx, but people tend to have different opinions on how careful 
> > > they
> > > want to be with padding), then I think we'll just have to start by 
> > > memsetting
> > > the whole struct to 0.
> > 
> > My point is simply that it's worth a comment in that code.
> 
> Okay, thanks.  I'll add a comment about the padding assumption, and I think 
> I'll
> take the suggestion to use a designated initializer.  Then at least all 
> *fields*
> get initialized by default.  And if in the future someone wants to 
> conditionally
> initialize fields, then they can use ?: or they can do it after the 
> initializer.
> Either way, at least they won't be able to forget to zero some field.
> 
> - Eric

Okay, well, I may have changed my mind again...  I tried the designated
initializer on x86_64 with gcc 4.8 and 6.3, and also on arm64 with gcc 4.8.  In
each case, it was compiled into first zeroing all 256 bytes of the struct, just
like memset(, 0, sizeof(tmp)).  Yes, this was with
CC_OPTIMIZE_FOR_PERFORMANCE=y.  So I think we might as well just write the full
memset(), making it completely clear that everything is initialized.  (This is
especially useful for people who are auditing code paths like this for
information leaks.)  Also, a smart compiler could potentially optimize away
parts of the memset() anyway...

Eric


Re: [PATCH v2] statx: optimize copy of struct statx to userspace

2017-03-11 Thread Eric Biggers
On Sat, Mar 11, 2017 at 08:02:06PM -0800, Eric Biggers wrote:
> On Sun, Mar 12, 2017 at 02:29:27AM +, Al Viro wrote:
> > 
> > Oh, I agree that multiple __put_user() are wrong; I also agree that bulk 
> > copy is
> > the right approach (when we get the unsafe stuff right, we can revisit 
> > that, but
> > I suspect that on quite a few architectures a bulk copy will still give 
> > better
> > time, no matter what).
> > 
> > > If padding is a concern at all (AFAICS it's not actually an issue now with
> > > struct statx, but people tend to have different opinions on how careful 
> > > they
> > > want to be with padding), then I think we'll just have to start by 
> > > memsetting
> > > the whole struct to 0.
> > 
> > My point is simply that it's worth a comment in that code.
> 
> Okay, thanks.  I'll add a comment about the padding assumption, and I think 
> I'll
> take the suggestion to use a designated initializer.  Then at least all 
> *fields*
> get initialized by default.  And if in the future someone wants to 
> conditionally
> initialize fields, then they can use ?: or they can do it after the 
> initializer.
> Either way, at least they won't be able to forget to zero some field.
> 
> - Eric

Okay, well, I may have changed my mind again...  I tried the designated
initializer on x86_64 with gcc 4.8 and 6.3, and also on arm64 with gcc 4.8.  In
each case, it was compiled into first zeroing all 256 bytes of the struct, just
like memset(, 0, sizeof(tmp)).  Yes, this was with
CC_OPTIMIZE_FOR_PERFORMANCE=y.  So I think we might as well just write the full
memset(), making it completely clear that everything is initialized.  (This is
especially useful for people who are auditing code paths like this for
information leaks.)  Also, a smart compiler could potentially optimize away
parts of the memset() anyway...

Eric


Re: [PATCH v2] statx: optimize copy of struct statx to userspace

2017-03-11 Thread Eric Biggers
On Sun, Mar 12, 2017 at 02:29:27AM +, Al Viro wrote:
> 
> Oh, I agree that multiple __put_user() are wrong; I also agree that bulk copy 
> is
> the right approach (when we get the unsafe stuff right, we can revisit that, 
> but
> I suspect that on quite a few architectures a bulk copy will still give better
> time, no matter what).
> 
> > If padding is a concern at all (AFAICS it's not actually an issue now with
> > struct statx, but people tend to have different opinions on how careful they
> > want to be with padding), then I think we'll just have to start by 
> > memsetting
> > the whole struct to 0.
> 
> My point is simply that it's worth a comment in that code.

Okay, thanks.  I'll add a comment about the padding assumption, and I think I'll
take the suggestion to use a designated initializer.  Then at least all *fields*
get initialized by default.  And if in the future someone wants to conditionally
initialize fields, then they can use ?: or they can do it after the initializer.
Either way, at least they won't be able to forget to zero some field.

- Eric


Re: [PATCH v2] statx: optimize copy of struct statx to userspace

2017-03-11 Thread Eric Biggers
On Sun, Mar 12, 2017 at 02:29:27AM +, Al Viro wrote:
> 
> Oh, I agree that multiple __put_user() are wrong; I also agree that bulk copy 
> is
> the right approach (when we get the unsafe stuff right, we can revisit that, 
> but
> I suspect that on quite a few architectures a bulk copy will still give better
> time, no matter what).
> 
> > If padding is a concern at all (AFAICS it's not actually an issue now with
> > struct statx, but people tend to have different opinions on how careful they
> > want to be with padding), then I think we'll just have to start by 
> > memsetting
> > the whole struct to 0.
> 
> My point is simply that it's worth a comment in that code.

Okay, thanks.  I'll add a comment about the padding assumption, and I think I'll
take the suggestion to use a designated initializer.  Then at least all *fields*
get initialized by default.  And if in the future someone wants to conditionally
initialize fields, then they can use ?: or they can do it after the initializer.
Either way, at least they won't be able to forget to zero some field.

- Eric


Re: [PATCH v2] statx: optimize copy of struct statx to userspace

2017-03-11 Thread Al Viro
On Sat, Mar 11, 2017 at 06:16:55PM -0800, Eric Biggers wrote:
> Hi Al,
> 
> On Sun, Mar 12, 2017 at 01:24:15AM +, Al Viro wrote:
> > On Sat, Mar 11, 2017 at 01:45:55PM -0800, Eric Biggers wrote:
> > > From: Eric Biggers 
> > > 
> > > I found that statx() was significantly slower than stat().  As a
> > > microbenchmark, I compared 10,000,000 invocations of fstat() on a tmpfs
> > > file to the same with statx() passed a NULL path:
> > 
> > Umm...
> > 
> 
> Well, it's a silly benchmark, but stat performance is important, and usually
> things are cached already so most of the time is just overhead --- which this
> measures.  And since nothing actually uses statx() yet, you can't do a 
> benchmark
> just by running some command like 'git status' or whatever.

Oh, I agree that multiple __put_user() are wrong; I also agree that bulk copy is
the right approach (when we get the unsafe stuff right, we can revisit that, but
I suspect that on quite a few architectures a bulk copy will still give better
time, no matter what).

> If padding is a concern at all (AFAICS it's not actually an issue now with
> struct statx, but people tend to have different opinions on how careful they
> want to be with padding), then I think we'll just have to start by memsetting
> the whole struct to 0.

My point is simply that it's worth a comment in that code.


Re: [PATCH v2] statx: optimize copy of struct statx to userspace

2017-03-11 Thread Al Viro
On Sat, Mar 11, 2017 at 06:16:55PM -0800, Eric Biggers wrote:
> Hi Al,
> 
> On Sun, Mar 12, 2017 at 01:24:15AM +, Al Viro wrote:
> > On Sat, Mar 11, 2017 at 01:45:55PM -0800, Eric Biggers wrote:
> > > From: Eric Biggers 
> > > 
> > > I found that statx() was significantly slower than stat().  As a
> > > microbenchmark, I compared 10,000,000 invocations of fstat() on a tmpfs
> > > file to the same with statx() passed a NULL path:
> > 
> > Umm...
> > 
> 
> Well, it's a silly benchmark, but stat performance is important, and usually
> things are cached already so most of the time is just overhead --- which this
> measures.  And since nothing actually uses statx() yet, you can't do a 
> benchmark
> just by running some command like 'git status' or whatever.

Oh, I agree that multiple __put_user() are wrong; I also agree that bulk copy is
the right approach (when we get the unsafe stuff right, we can revisit that, but
I suspect that on quite a few architectures a bulk copy will still give better
time, no matter what).

> If padding is a concern at all (AFAICS it's not actually an issue now with
> struct statx, but people tend to have different opinions on how careful they
> want to be with padding), then I think we'll just have to start by memsetting
> the whole struct to 0.

My point is simply that it's worth a comment in that code.


Re: [PATCH v2] statx: optimize copy of struct statx to userspace

2017-03-11 Thread Eric Biggers
Hi Al,

On Sun, Mar 12, 2017 at 01:24:15AM +, Al Viro wrote:
> On Sat, Mar 11, 2017 at 01:45:55PM -0800, Eric Biggers wrote:
> > From: Eric Biggers 
> > 
> > I found that statx() was significantly slower than stat().  As a
> > microbenchmark, I compared 10,000,000 invocations of fstat() on a tmpfs
> > file to the same with statx() passed a NULL path:
> 
> Umm...
> 

Well, it's a silly benchmark, but stat performance is important, and usually
things are cached already so most of the time is just overhead --- which this
measures.  And since nothing actually uses statx() yet, you can't do a benchmark
just by running some command like 'git status' or whatever.

> > +   tmp.stx_dev_major = MAJOR(stat->dev);
> > +   tmp.stx_dev_minor = MINOR(stat->dev);
> > +   memset(tmp.__spare2, 0, sizeof(tmp.__spare2));
> > +
> > +   return copy_to_user(buffer, , sizeof(tmp)) ? -EFAULT : 0;
> 
> That relies upon there being no padding in the damn structure.
> It's true and probably will be true on any target, but
>   a) it's bloody well worth stating explicitly
> and
>   b) struct statx tmp = {.stx_mask = stat->result_mask};
> will get rid of those memset() you've got there by implicit
> zeroing of fields missing in partial structure initializer.
> Padding is *not* included into that, but you are relying upon
> having no padding anyway...

If padding is a concern at all (AFAICS it's not actually an issue now with
struct statx, but people tend to have different opinions on how careful they
want to be with padding), then I think we'll just have to start by memsetting
the whole struct to 0.

stat() actually does that, except that it's overridden on some architectures,
like x86, to only memset the actual reserved fields.  I had kind of assumed that
as long as we're defining a new struct that has the same layout on all
architectures, with no padding, that we'd want to take the opportunity to only
memset the actual reserved fields, to make it a bit faster.  And yes, a
designated initializer would make this look a bit nicer, though it might have to
be changed later if any future statx() extensions involve fields that are
unioned or conditionally written.

Note that another approach would be to copy the fields individually, but with
unsafe_put_user() instead of __put_user().  That would avoid the overhead of
user_access_begin()/user_access_end() which was the main performance problem.
However, the infrastructure around this doesn't seem to be ready quite yet:
there aren't good implementations of unsafe_put_user() yet, nor is there an
unsafe_clear_user() yet.

- Eric


Re: [PATCH v2] statx: optimize copy of struct statx to userspace

2017-03-11 Thread Eric Biggers
Hi Al,

On Sun, Mar 12, 2017 at 01:24:15AM +, Al Viro wrote:
> On Sat, Mar 11, 2017 at 01:45:55PM -0800, Eric Biggers wrote:
> > From: Eric Biggers 
> > 
> > I found that statx() was significantly slower than stat().  As a
> > microbenchmark, I compared 10,000,000 invocations of fstat() on a tmpfs
> > file to the same with statx() passed a NULL path:
> 
> Umm...
> 

Well, it's a silly benchmark, but stat performance is important, and usually
things are cached already so most of the time is just overhead --- which this
measures.  And since nothing actually uses statx() yet, you can't do a benchmark
just by running some command like 'git status' or whatever.

> > +   tmp.stx_dev_major = MAJOR(stat->dev);
> > +   tmp.stx_dev_minor = MINOR(stat->dev);
> > +   memset(tmp.__spare2, 0, sizeof(tmp.__spare2));
> > +
> > +   return copy_to_user(buffer, , sizeof(tmp)) ? -EFAULT : 0;
> 
> That relies upon there being no padding in the damn structure.
> It's true and probably will be true on any target, but
>   a) it's bloody well worth stating explicitly
> and
>   b) struct statx tmp = {.stx_mask = stat->result_mask};
> will get rid of those memset() you've got there by implicit
> zeroing of fields missing in partial structure initializer.
> Padding is *not* included into that, but you are relying upon
> having no padding anyway...

If padding is a concern at all (AFAICS it's not actually an issue now with
struct statx, but people tend to have different opinions on how careful they
want to be with padding), then I think we'll just have to start by memsetting
the whole struct to 0.

stat() actually does that, except that it's overridden on some architectures,
like x86, to only memset the actual reserved fields.  I had kind of assumed that
as long as we're defining a new struct that has the same layout on all
architectures, with no padding, that we'd want to take the opportunity to only
memset the actual reserved fields, to make it a bit faster.  And yes, a
designated initializer would make this look a bit nicer, though it might have to
be changed later if any future statx() extensions involve fields that are
unioned or conditionally written.

Note that another approach would be to copy the fields individually, but with
unsafe_put_user() instead of __put_user().  That would avoid the overhead of
user_access_begin()/user_access_end() which was the main performance problem.
However, the infrastructure around this doesn't seem to be ready quite yet:
there aren't good implementations of unsafe_put_user() yet, nor is there an
unsafe_clear_user() yet.

- Eric


Re: [PATCH v2] statx: optimize copy of struct statx to userspace

2017-03-11 Thread Al Viro
On Sat, Mar 11, 2017 at 01:45:55PM -0800, Eric Biggers wrote:
> From: Eric Biggers 
> 
> I found that statx() was significantly slower than stat().  As a
> microbenchmark, I compared 10,000,000 invocations of fstat() on a tmpfs
> file to the same with statx() passed a NULL path:

Umm...

> + struct statx tmp;
> +
> + tmp.stx_mask = stat->result_mask;
> + tmp.stx_blksize = stat->blksize;
> + tmp.stx_attributes = stat->attributes;
> + tmp.stx_nlink = stat->nlink;
> + tmp.stx_uid = from_kuid_munged(current_user_ns(), stat->uid);
> + tmp.stx_gid = from_kgid_munged(current_user_ns(), stat->gid);
> + tmp.stx_mode = stat->mode;
> + memset(tmp.__spare0, 0, sizeof(tmp.__spare0));
> + tmp.stx_ino = stat->ino;
> + tmp.stx_size = stat->size;
> + tmp.stx_blocks = stat->blocks;
> + memset(tmp.__spare1, 0, sizeof(tmp.__spare1));
> + init_statx_timestamp(_atime, >atime);
> + init_statx_timestamp(_btime, >btime);
> + init_statx_timestamp(_ctime, >ctime);
> + init_statx_timestamp(_mtime, >mtime);
> + tmp.stx_rdev_major = MAJOR(stat->rdev);
> + tmp.stx_rdev_minor = MINOR(stat->rdev);
> + tmp.stx_dev_major = MAJOR(stat->dev);
> + tmp.stx_dev_minor = MINOR(stat->dev);
> + memset(tmp.__spare2, 0, sizeof(tmp.__spare2));
> +
> + return copy_to_user(buffer, , sizeof(tmp)) ? -EFAULT : 0;

That relies upon there being no padding in the damn structure.
It's true and probably will be true on any target, but
a) it's bloody well worth stating explicitly
and
b) struct statx tmp = {.stx_mask = stat->result_mask};
will get rid of those memset() you've got there by implicit
zeroing of fields missing in partial structure initializer.
Padding is *not* included into that, but you are relying upon
having no padding anyway...


Re: [PATCH v2] statx: optimize copy of struct statx to userspace

2017-03-11 Thread Al Viro
On Sat, Mar 11, 2017 at 01:45:55PM -0800, Eric Biggers wrote:
> From: Eric Biggers 
> 
> I found that statx() was significantly slower than stat().  As a
> microbenchmark, I compared 10,000,000 invocations of fstat() on a tmpfs
> file to the same with statx() passed a NULL path:

Umm...

> + struct statx tmp;
> +
> + tmp.stx_mask = stat->result_mask;
> + tmp.stx_blksize = stat->blksize;
> + tmp.stx_attributes = stat->attributes;
> + tmp.stx_nlink = stat->nlink;
> + tmp.stx_uid = from_kuid_munged(current_user_ns(), stat->uid);
> + tmp.stx_gid = from_kgid_munged(current_user_ns(), stat->gid);
> + tmp.stx_mode = stat->mode;
> + memset(tmp.__spare0, 0, sizeof(tmp.__spare0));
> + tmp.stx_ino = stat->ino;
> + tmp.stx_size = stat->size;
> + tmp.stx_blocks = stat->blocks;
> + memset(tmp.__spare1, 0, sizeof(tmp.__spare1));
> + init_statx_timestamp(_atime, >atime);
> + init_statx_timestamp(_btime, >btime);
> + init_statx_timestamp(_ctime, >ctime);
> + init_statx_timestamp(_mtime, >mtime);
> + tmp.stx_rdev_major = MAJOR(stat->rdev);
> + tmp.stx_rdev_minor = MINOR(stat->rdev);
> + tmp.stx_dev_major = MAJOR(stat->dev);
> + tmp.stx_dev_minor = MINOR(stat->dev);
> + memset(tmp.__spare2, 0, sizeof(tmp.__spare2));
> +
> + return copy_to_user(buffer, , sizeof(tmp)) ? -EFAULT : 0;

That relies upon there being no padding in the damn structure.
It's true and probably will be true on any target, but
a) it's bloody well worth stating explicitly
and
b) struct statx tmp = {.stx_mask = stat->result_mask};
will get rid of those memset() you've got there by implicit
zeroing of fields missing in partial structure initializer.
Padding is *not* included into that, but you are relying upon
having no padding anyway...