Re: [PATCH 1/6] fs: Add flag to file_system_type to indicate content is generated

2021-02-14 Thread Ian Lance Taylor
On Sun, Feb 14, 2021 at 4:38 PM Dave Chinner wrote:
>
> On Fri, Feb 12, 2021 at 03:54:48PM -0800, Darrick J. Wong wrote:
> > On Sat, Feb 13, 2021 at 10:27:26AM +1100, Dave Chinner wrote:
> >
> > > If you can't tell from userspace that a file has data in it other
> > > than by calling read() on it, then you can't use cfr on it.
> >
> > I don't know how to do that, Dave. :)
>
> If stat returns a non-zero size, then userspace knows it has at
> least that much data in it, whether it be zeros or previously
> written data. cfr will copy that data. The special zero length
> regular pipe files fail this simple "how much data is there to copy
> in this file" check...

This suggests that if the Go standard library sees that
copy_file_range reads zero bytes, we should assume that it failed, and
should use the fallback path as though copy_file_range returned
EOPNOTSUPP or EINVAL.  This will cause an extra system call for an
empty file, but as long as copy_file_range does not return an error
for cases that it does not support we are going to need an extra
system call anyhow.
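
A minimal Go sketch of this mitigation, under stated assumptions: `fastCopy` is a hypothetical stand-in for a copy_file_range wrapper (it is not the Go standard library's internal code), and `errUnsupported` stands in for EOPNOTSUPP or EINVAL. A first call that reports zero bytes without an error is treated like a failure, and an ordinary read/write loop takes over.

```go
package main

import (
	"errors"
	"fmt"
	"io"
	"strings"
)

// fastCopy is a hypothetical stand-in for one copy_file_range call: it
// reports how many bytes it moved from src to dst.
type fastCopy func(dst io.Writer, src io.Reader) (int64, error)

// errUnsupported stands in for EOPNOTSUPP or EINVAL (an assumption made
// to keep this sketch portable).
var errUnsupported = errors.New("fast path unsupported")

// slowCopy is the generic fallback: a plain read/write loop.
func slowCopy(dst io.Writer, src io.Reader) (int64, error) {
	buf := make([]byte, 32*1024)
	var total int64
	for {
		n, rerr := src.Read(buf)
		if n > 0 {
			w, werr := dst.Write(buf[:n])
			total += int64(w)
			if werr != nil {
				return total, werr
			}
			if w < n {
				return total, io.ErrShortWrite
			}
		}
		if rerr == io.EOF {
			return total, nil
		}
		if rerr != nil {
			return total, rerr
		}
	}
}

// copyWithFallback tries the fast path first. If the very first call
// reports "unsupported" or copies zero bytes without an error, the result
// is ambiguous (empty file or silent failure), so the slow path runs.
// The bool result reports whether the fallback was used.
func copyWithFallback(dst io.Writer, src io.Reader, fast fastCopy) (int64, bool, error) {
	var total int64
	for {
		n, err := fast(dst, src)
		total += n
		if err != nil {
			if total == 0 && errors.Is(err, errUnsupported) {
				break // nothing copied yet: safe to fall back
			}
			return total, false, err
		}
		if n == 0 {
			if total == 0 {
				break // zero bytes on the first call: assume failure
			}
			return total, false, nil // zero after progress: end of file
		}
	}
	n, err := slowCopy(dst, src)
	return n, true, err
}

// silentZero mimics the behavior reported in this thread: success with
// zero bytes copied.
func silentZero(io.Writer, io.Reader) (int64, error) { return 0, nil }

// unsupportedFast mimics a kernel that reports an error instead.
func unsupportedFast(io.Writer, io.Reader) (int64, error) { return 0, errUnsupported }

// runDemo copies input through copyWithFallback and returns the output
// and whether the fallback path was used.
func runDemo(input string, fast fastCopy) (string, bool, error) {
	var out strings.Builder
	_, fellBack, err := copyWithFallback(&out, strings.NewReader(input), fast)
	return out.String(), fellBack, err
}

func main() {
	out, fellBack, err := runDemo("./foo\x00", silentZero)
	fmt.Printf("copied %q, fallback used: %v, err: %v\n", out, fellBack, err)
}
```

For a source that really is empty, the fallback costs one extra read that returns end of file, which is the extra system call mentioned above.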

Does that seem like a possible mitigation?  That is, are there cases
where copy_file_range will fail to do a correct copy, and will return
success, and will not return zero?

Ian


Re: [PATCH 1/6] fs: Add flag to file_system_type to indicate content is generated

2021-02-12 Thread Ian Lance Taylor
On Fri, Feb 12, 2021 at 3:03 PM Dave Chinner wrote:
>
> On Fri, Feb 12, 2021 at 04:45:41PM +0100, Greg KH wrote:
> > On Fri, Feb 12, 2021 at 07:33:57AM -0800, Ian Lance Taylor wrote:
> > > On Fri, Feb 12, 2021 at 12:38 AM Greg KH wrote:
> > > >
> > > > Why are people trying to use copy_file_range on simple /proc and /sys
> > > > files in the first place?  They can not seek (well most can not), so
> > > > that feels like a "oh look, a new syscall, let's use it everywhere!"
> > > > problem that userspace should not do.
> > >
> > > This may have been covered elsewhere, but it's not that people are
> > > saying "let's use copy_file_range on files in /proc."  It's that the
> > > Go language standard library provides an interface to operating system
> > > files.  When Go code uses the standard library function io.Copy to
> > > copy the contents of one open file to another open file, then on Linux
> > > kernels 5.3 and greater the Go standard library will use the
> > > copy_file_range system call.  That seems to be exactly what
> > > copy_file_range is intended for.  Unfortunately it appears that when
> > > people writing Go code open a file in /proc and use io.Copy the
> > > contents to another open file, copy_file_range does nothing and
> > > reports success.  There isn't anything on the copy_file_range man page
> > > explaining this limitation, and there isn't any documented way to know
> > > that the Go standard library should not use copy_file_range on certain
> > > files.
> >
> > But, is this a bug in the kernel in that the syscall being made is not
> > working properly, or a bug in that Go decided to do this for all types
> > of files not knowing that some types of files can not handle this?
> >
> > If the kernel has always worked this way, I would say that Go is doing
> > the wrong thing here.  If the kernel used to work properly, and then
> > changed, then it's a regression on the kernel side.
> >
> > So which is it?
>
> Both Al Viro and myself have said "copy file range is not a generic
> method for copying data between two file descriptors". It is a
> targeted solution for *regular files only* on filesystems that store
> persistent data and can accelerate the data copy in some way (e.g.
> clone, server side offload, hardware offload, etc). It is not
> intended as a copy mechanism for copying data from one random file
> descriptor to another.
>
> The use of it as a general file copy mechanism in the Go system
> library is incorrect and wrong. It is a userspace bug.  Userspace
> has done the wrong thing, userspace needs to be fixed.

OK, we'll take it out.

I'll just make one last plea that I think that copy_file_range could
be much more useful if there were some way that a program could know
whether it would work or not.  It's pretty unfortunate that we can't
use it in the Go standard library, or, indeed, in any general purpose
code, in any language, that is intended to support arbitrary file
names.  To be pedantically clear, I'm not saying that copy_file_range
should work on all file systems.  I'm only saying that on file systems
for which it doesn't work it should fail rather than silently
returning success without doing anything.
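
One heuristic raised elsewhere in this thread (by Dave Chinner) is that a regular file whose stat size is zero gives userspace no evidence that there is data to copy. A hedged Go sketch of such a pre-check follows. `worthTryingRangeCopy` and `checkPath` are hypothetical helper names, and the rule encodes only the heuristic as described in the thread, not a documented kernel guarantee.

```go
package main

import (
	"fmt"
	"os"
	"path/filepath"
)

// worthTryingRangeCopy applies the heuristic: only a regular file with a
// non-zero stat size is considered a candidate for copy_file_range.
func worthTryingRangeCopy(fi os.FileInfo) bool {
	return fi.Mode().IsRegular() && fi.Size() > 0
}

// checkPath stats path and applies the heuristic to the result.
func checkPath(path string) (bool, error) {
	fi, err := os.Stat(path)
	if err != nil {
		return false, err
	}
	return worthTryingRangeCopy(fi), nil
}

// demo creates an empty and a non-empty temporary file and returns the
// heuristic's answer for each, in that order.
func demo() (bool, bool, error) {
	dir, err := os.MkdirTemp("", "cfr-check")
	if err != nil {
		return false, false, err
	}
	defer os.RemoveAll(dir)

	empty := filepath.Join(dir, "empty")
	full := filepath.Join(dir, "full")
	if err := os.WriteFile(empty, nil, 0o600); err != nil {
		return false, false, err
	}
	if err := os.WriteFile(full, []byte("data"), 0o600); err != nil {
		return false, false, err
	}
	emptyOK, err := checkPath(empty)
	if err != nil {
		return false, false, err
	}
	fullOK, err := checkPath(full)
	return emptyOK, fullOK, err
}

func main() {
	emptyOK, fullOK, err := demo()
	fmt.Println("empty file:", emptyOK, "non-empty file:", fullOK, "err:", err)

	// Generated files such as this one are the case the thread is about.
	ok, err := checkPath("/proc/self/cmdline")
	fmt.Println("/proc/self/cmdline:", ok, "err:", err)
}
```

The check costs one stat call per copy and rejects genuinely empty regular files as well, which is harmless because there is nothing to copy from them.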

Ian


Re: [PATCH 1/6] fs: Add flag to file_system_type to indicate content is generated

2021-02-12 Thread Ian Lance Taylor
On Fri, Feb 12, 2021 at 8:28 AM Greg KH wrote:
>
> On Fri, Feb 12, 2021 at 07:59:04AM -0800, Ian Lance Taylor wrote:
> > On Fri, Feb 12, 2021 at 7:45 AM Greg KH wrote:
> > >
> > > On Fri, Feb 12, 2021 at 07:33:57AM -0800, Ian Lance Taylor wrote:
> > > > On Fri, Feb 12, 2021 at 12:38 AM Greg KH wrote:
> > > > >
> > > > > Why are people trying to use copy_file_range on simple /proc and /sys
> > > > > files in the first place?  They can not seek (well most can not), so
> > > > > that feels like a "oh look, a new syscall, let's use it everywhere!"
> > > > > problem that userspace should not do.
> > > >
> > > > This may have been covered elsewhere, but it's not that people are
> > > > saying "let's use copy_file_range on files in /proc."  It's that the
> > > > Go language standard library provides an interface to operating system
> > > > files.  When Go code uses the standard library function io.Copy to
> > > > copy the contents of one open file to another open file, then on Linux
> > > > kernels 5.3 and greater the Go standard library will use the
> > > > copy_file_range system call.  That seems to be exactly what
> > > > copy_file_range is intended for.  Unfortunately it appears that when
> > > > people writing Go code open a file in /proc and use io.Copy the
> > > > contents to another open file, copy_file_range does nothing and
> > > > reports success.  There isn't anything on the copy_file_range man page
> > > > explaining this limitation, and there isn't any documented way to know
> > > > that the Go standard library should not use copy_file_range on certain
> > > > files.
> > >
> > > But, is this a bug in the kernel in that the syscall being made is not
> > > working properly, or a bug in that Go decided to do this for all types
> > > of files not knowing that some types of files can not handle this?
> > >
> > > If the kernel has always worked this way, I would say that Go is doing
> > > the wrong thing here.  If the kernel used to work properly, and then
> > > changed, then it's a regression on the kernel side.
> > >
> > > So which is it?
> >
> > I don't work on the kernel, so I can't tell you which it is.  You will
> > have to decide.
>
> As you have the userspace code, it should be easier for you to test this
> on an older kernel.  I don't have your userspace code...

Sorry, I'm not sure what you are asking.

I've attached a sample Go program.  On kernel version 2.6.32 this
program exits 0.  On kernel version 5.7.17 it prints

got "" want "./foo\x00"

and exits with status 1.

This program hardcodes the string "/proc/self/cmdline" for
convenience, but of course the same results would happen if this were
a generic copy program that somebody invoked with /proc/self/cmdline
as a command line option.

I could write the same program in C easily enough, by explicitly
calling copy_file_range.  Would it help to see a sample C program?


> > From my perspective, as a kernel user rather than a kernel developer,
> > a system call that silently fails for certain files and that provides
> > no way to determine either 1) ahead of time that the system call will
> > fail, or 2) after the call that the system call did fail, is a useless
> > system call.
>
> Great, then don't use copy_file_range() yet as it seems like it fits
> that category at the moment :)

That seems like an unfortunate result, but if that is the determining
opinion then I guess that is what we will have to do in the Go
standard library.

Ian

package main

import (
	"bytes"
	"fmt"
	"io"
	"io/ioutil"
	"os"
)

// main copies /proc/self/cmdline into a temporary file and exits with a
// non-zero status if the copy does not match the original.
func main() {
	tmpfile, err := ioutil.TempFile("", "copy_file_range")
	if err != nil {
		fmt.Fprintln(os.Stderr, err)
		os.Exit(1)
	}
	status := copy(tmpfile)
	os.Remove(tmpfile.Name())
	os.Exit(status)
}

// copy uses io.Copy to copy /proc/self/cmdline into tmpfile, then
// compares the result with a plain read of the same file.  It returns
// the exit status for the process.
func copy(tmpfile *os.File) int {
	cmdline, err := os.Open("/proc/self/cmdline")
	if err != nil {
		fmt.Fprintln(os.Stderr, err)
		return 1
	}
	defer cmdline.Close()
	// io.Copy between two files is the call that may use copy_file_range.
	if _, err := io.Copy(tmpfile, cmdline); err != nil {
		fmt.Fprintf(os.Stderr, "copy failed: %v\n", err)
		return 1
	}
	if err := tmpfile.Close(); err != nil {
		fmt.Fprintln(os.Stderr, err)
		return 1
	}
	// Read both files with ordinary reads and compare the contents.
	old, err := ioutil.ReadFile("/proc/self/cmdline")
	if err != nil {
		fmt.Fprintln(os.Stderr, err)
		return 1
	}
	new, err := ioutil.ReadFile(tmpfile.Name())
	if err != nil {
		fmt.Fprintln(os.Stderr, err)
		return 1
	}
	if !bytes.Equal(old, new) {
		fmt.Fprintf(os.Stderr, "got %q want %q\n", new, old)
		return 1
	}
	return 0
}


Re: [PATCH 1/6] fs: Add flag to file_system_type to indicate content is generated

2021-02-12 Thread Ian Lance Taylor
On Fri, Feb 12, 2021 at 7:45 AM Greg KH wrote:
>
> On Fri, Feb 12, 2021 at 07:33:57AM -0800, Ian Lance Taylor wrote:
> > On Fri, Feb 12, 2021 at 12:38 AM Greg KH wrote:
> > >
> > > Why are people trying to use copy_file_range on simple /proc and /sys
> > > files in the first place?  They can not seek (well most can not), so
> > > that feels like a "oh look, a new syscall, let's use it everywhere!"
> > > problem that userspace should not do.
> >
> > This may have been covered elsewhere, but it's not that people are
> > saying "let's use copy_file_range on files in /proc."  It's that the
> > Go language standard library provides an interface to operating system
> > files.  When Go code uses the standard library function io.Copy to
> > copy the contents of one open file to another open file, then on Linux
> > kernels 5.3 and greater the Go standard library will use the
> > copy_file_range system call.  That seems to be exactly what
> > copy_file_range is intended for.  Unfortunately it appears that when
> > people writing Go code open a file in /proc and use io.Copy the
> > contents to another open file, copy_file_range does nothing and
> > reports success.  There isn't anything on the copy_file_range man page
> > explaining this limitation, and there isn't any documented way to know
> > that the Go standard library should not use copy_file_range on certain
> > files.
>
> But, is this a bug in the kernel in that the syscall being made is not
> working properly, or a bug in that Go decided to do this for all types
> of files not knowing that some types of files can not handle this?
>
> If the kernel has always worked this way, I would say that Go is doing
> the wrong thing here.  If the kernel used to work properly, and then
> changed, then it's a regression on the kernel side.
>
> So which is it?

I don't work on the kernel, so I can't tell you which it is.  You will
have to decide.

From my perspective, as a kernel user rather than a kernel developer,
a system call that silently fails for certain files and that provides
no way to determine either 1) ahead of time that the system call will
fail, or 2) after the call that the system call did fail, is a useless
system call.  I can never use that system call, because I don't know
whether or not it will work.  So as a kernel user I would say that you
should fix the system call to report failure, or document some way to
know whether the system call will fail, or you should remove the
system call.  But I'm not a kernel developer, I don't have all the
information, and it's obviously your call.

I'll note that to the best of my knowledge this failure started
happening with the 5.3 kernel, as before 5.3 the problematic calls
would report a failure (EXDEV).  Since 5.3 isn't all that old I
personally wouldn't say that the kernel "has always worked this way."
But I may be mistaken about this.


> > So ideally the kernel will report EOPNOTSUPP or EINVAL when using
> > copy_file_range on a file in /proc or some other file system that
> > fails (and, minor side note, the copy_file_range man page should
> > document that it can return EOPNOTSUPP or EINVAL in some cases, which
> > does already happen on at least some kernel versions using at least
> > some file systems).
>
> Documentation is good, but what the kernel does is the true "definition"
> of what is going right or wrong here.

Sure.  The documentation comment was just a side note.  I hope that we
can all agree that accurate man pages are better than inaccurate ones.

Ian


Re: [PATCH 1/6] fs: Add flag to file_system_type to indicate content is generated

2021-02-12 Thread Ian Lance Taylor
On Fri, Feb 12, 2021 at 12:38 AM Greg KH wrote:
>
> Why are people trying to use copy_file_range on simple /proc and /sys
> files in the first place?  They can not seek (well most can not), so
> that feels like a "oh look, a new syscall, let's use it everywhere!"
> problem that userspace should not do.

This may have been covered elsewhere, but it's not that people are
saying "let's use copy_file_range on files in /proc."  It's that the
Go language standard library provides an interface to operating system
files.  When Go code uses the standard library function io.Copy to
copy the contents of one open file to another open file, then on Linux
kernels 5.3 and greater the Go standard library will use the
copy_file_range system call.  That seems to be exactly what
copy_file_range is intended for.  Unfortunately it appears that when
people writing Go code open a file in /proc and use io.Copy the
contents to another open file, copy_file_range does nothing and
reports success.  There isn't anything on the copy_file_range man page
explaining this limitation, and there isn't any documented way to know
that the Go standard library should not use copy_file_range on certain
files.

So ideally the kernel will report EOPNOTSUPP or EINVAL when using
copy_file_range on a file in /proc or some other file system that
fails (and, minor side note, the copy_file_range man page should
document that it can return EOPNOTSUPP or EINVAL in some cases, which
does already happen on at least some kernel versions using at least
some file systems).  Or, less ideally, there will be some documented
way that the Go standard library can determine that copy_file_range
will fail before trying to use it.  If neither of those can be done,
then I think the only option is for the Go standard library to never
use copy_file_range, as even though it will almost always work and
work well, in some unpredictable number of cases it will fail
silently.
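
A sketch of the first option from the caller's side, assuming the kernel does report an error: the Go code would only need to decide which errors mean "not supported here, use read/write instead". `shouldFallBack` and `exampleErr` are hypothetical names, and including ENOSYS in the list is an assumption added for kernels that lack the system call. EXDEV is included because it is the error mentioned elsewhere in this thread for kernels before 5.3.

```go
package main

import (
	"errors"
	"fmt"
	"os"
	"syscall"
)

// shouldFallBack reports whether err, as returned by a hypothetical
// copy_file_range wrapper, means the fast path is unavailable for these
// files and the caller should use ordinary reads and writes instead.
func shouldFallBack(err error) bool {
	switch {
	case errors.Is(err, syscall.EOPNOTSUPP), // file system does not support it
		errors.Is(err, syscall.EINVAL), // file type or arguments not accepted
		errors.Is(err, syscall.EXDEV),  // cross-file-system copy refused
		errors.Is(err, syscall.ENOSYS): // assumption: kernel lacks the call
		return true
	}
	return false
}

// exampleErr builds the kind of wrapped error a file API might return.
func exampleErr() error {
	return &os.PathError{
		Op:   "copy_file_range",
		Path: "/proc/self/cmdline",
		Err:  syscall.EINVAL,
	}
}

func main() {
	fmt.Println(shouldFallBack(exampleErr())) // wrapped EINVAL
	fmt.Println(shouldFallBack(syscall.EIO))  // a real I/O error
	fmt.Println(shouldFallBack(nil))          // success
}
```

A success that copies nothing cannot be classified this way, which is why an error return is the preferred option.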

Thanks.

Ian


Re: [BUG] copy_file_range with sysfs file as input

2021-01-25 Thread Ian Lance Taylor
Thanks for the note.  I'm not a kernel developer, but to me this
sounds like a kernel bug.  It seems particularly unfortunate that
copy_file_range returns 0 in this case.  From the perspective of the
Go standard library, what we would need is some mechanism to detect
when the copy_file_range system call will not or did not work
correctly.  As the biggest hammer, we currently only call
copy_file_range on kernel versions 5.3 and newer.  We can bump that
requirement if necessary.
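
A hedged Go sketch of such a version gate, for illustration only. `kernelAtLeast` and `leadingInt` are hypothetical helpers, reading `/proc/sys/kernel/osrelease` is just one way to obtain the release string, and this is not the check the Go standard library actually performs.

```go
package main

import (
	"fmt"
	"os"
	"strconv"
	"strings"
)

// leadingInt parses the decimal digits at the start of s, so that a
// component such as "10-rc1" still yields 10.
func leadingInt(s string) (int, bool) {
	end := 0
	for end < len(s) && s[end] >= '0' && s[end] <= '9' {
		end++
	}
	if end == 0 {
		return 0, false
	}
	n, err := strconv.Atoi(s[:end])
	return n, err == nil
}

// kernelAtLeast reports whether a release string such as "5.7.17" or
// "5.10.3-foo" is at least major.minor.  Unparseable input yields false,
// which keeps the fast path disabled.
func kernelAtLeast(release string, major, minor int) bool {
	parts := strings.SplitN(strings.TrimSpace(release), ".", 3)
	if len(parts) < 2 {
		return false
	}
	gotMajor, ok1 := leadingInt(parts[0])
	gotMinor, ok2 := leadingInt(parts[1])
	if !ok1 || !ok2 {
		return false
	}
	return gotMajor > major || (gotMajor == major && gotMinor >= minor)
}

func main() {
	data, err := os.ReadFile("/proc/sys/kernel/osrelease")
	if err != nil {
		fmt.Fprintln(os.Stderr, "cannot read kernel release:", err)
		return
	}
	fmt.Println("kernel passes the 5.3 gate:", kernelAtLeast(string(data), 5, 3))
}
```

The kernels named in the report quoted below (5.4.89, 5.7.17 and 5.10.3) all pass a 5.3 gate, so a version check alone does not avoid the problem unless the minimum is raised.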

Please feel free to open a bug about this at https://golang.org/issue,
but we'll need guidance as to what we should do to avoid the problem.
Thanks.

Ian

On Sun, Jan 24, 2021 at 11:54 PM Nicolas Boichat wrote:
>
> Hi copy_file_range experts,
>
> We hit this interesting issue when upgrading Go compiler from 1.13 to
> 1.15 [1]. Basically we use Go's `io.Copy` to copy the content of
> `/sys/kernel/debug/tracing/trace` to a temporary file.
>
> Under the hood, Go now uses `copy_file_range` syscall to optimize the
> copy operation. However, that fails to copy any content when the input
> file is from sysfs/tracefs, with an apparent size of 0 (but there is
> still content when you `cat` it, of course).
>
> A repro case is available in comment7 (adapted from the man page),
> also copied below [2].
>
> Output looks like this (on kernels 5.4.89 (chromeos), 5.7.17 and
> 5.10.3 (chromeos))
> $ ./copyfrom /sys/kernel/debug/tracing/trace x
> 0 bytes copied
> $ cat x
> $ cat /sys/kernel/debug/tracing/trace
> # tracer: nop
> #
> # entries-in-buffer/entries-written: 0/0   #P:8
> #
> #_-=> irqs-off
> #   / _=> need-resched
> #  | / _---=> hardirq/softirq
> #  || / _--=> preempt-depth
> #  ||| / delay
> #   TASK-PID CPU#     TIMESTAMP  FUNCTION
> #  | | |     | |
>
> I can try to dig further, but thought you'd like to get a bug report
> as soon as possible.
>
> Thanks,
>
> Nicolas
>
> [1] http://issuetracker.google.com/issues/178332739
> [2]
> #define _GNU_SOURCE
> #include <fcntl.h>
> #include <stdio.h>
> #include <stdlib.h>
> #include <sys/stat.h>
> #include <sys/syscall.h>
> #include <unistd.h>
>
> int
> main(int argc, char **argv)
> {
>     int fd_in, fd_out;
>     loff_t ret;
>
>     if (argc != 3) {
>         fprintf(stderr, "Usage: %s <source> <destination>\n", argv[0]);
>         exit(EXIT_FAILURE);
>     }
>
>     fd_in = open(argv[1], O_RDONLY);
>     if (fd_in == -1) {
>         perror("open (argv[1])");
>         exit(EXIT_FAILURE);
>     }
>
>     fd_out = open(argv[2], O_CREAT | O_WRONLY | O_TRUNC, 0644);
>     if (fd_out == -1) {
>         perror("open (argv[2])");
>         exit(EXIT_FAILURE);
>     }
>
>     ret = copy_file_range(fd_in, NULL, fd_out, NULL, 1024, 0);
>     if (ret == -1) {
>         perror("copy_file_range");
>         exit(EXIT_FAILURE);
>     }
>     printf("%d bytes copied\n", (int)ret);
>
>     close(fd_in);
>     close(fd_out);
>     exit(EXIT_SUCCESS);
> }


Re: RFC: adding Linux vsyscall-disable and similar backwards-incompatibility flags to ELF headers?

2015-09-01 Thread Ian Lance Taylor
On Tue, Sep 1, 2015 at 5:51 PM, Andy Lutomirski wrote:
>
> Linux has a handful of weird features that are only supported for
> backwards compatibility.  The big one is the x86_64 vsyscall page, but
> uselib probably belongs on the list, too, and we might end up with
> more at some point.
>
> I'd like to add a way that new programs can turn these features off.
> In particular, I want the vsyscall page to be completely gone from the
> perspective of any new enough program.  This is straightforward if we
> add a system call to ask for the vsyscall page to be disabled, but I'm
> wondering if we can come up with a non-syscall way to do it.
>
> I think that the ideal behavior would be that anything linked against
> a sufficiently new libc would be detected, but I don't see a good way
> to do that using existing toolchain features.
>
> Ideas?  We could add a new phdr for this, but then we'd need to play
> linker script games, and I'm not sure that could be done in a clean,
> extensible way.

What sets up the vsyscall page, and what information does it have
before doing so?

I'm guessing it's the kernel that sets it up, and that all it can see
at that point is the program headers.

We could pass information using an appropriate note section.  My
recollection is that the linkers will turn an SHF_ALLOC note section
into a PT_NOTE program header.
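
To make the note idea concrete, here is a hedged Go sketch of the ELF note record layout (name size, descriptor size, type, then the name and descriptor, each padded to a 4-byte boundary). The note name "Linux", the type value and the flag word used here are hypothetical and chosen only for illustration; they are not an existing kernel ABI. The sketch assumes little-endian encoding.

```go
package main

import (
	"encoding/binary"
	"fmt"
)

// hypotheticalNoVsyscall is an invented note type, for illustration only.
const hypotheticalNoVsyscall = 0x1000

// note is one decoded ELF note record.
type note struct {
	Name string
	Type uint32
	Desc []byte
}

// align4 rounds n up to a multiple of 4, the padding notes use here.
func align4(n int) int { return (n + 3) &^ 3 }

// buildNote encodes a single note record in little-endian form.
func buildNote(name string, typ uint32, desc []byte) []byte {
	nameBytes := append([]byte(name), 0) // the name is NUL-terminated
	out := make([]byte, 12)
	binary.LittleEndian.PutUint32(out[0:], uint32(len(nameBytes)))
	binary.LittleEndian.PutUint32(out[4:], uint32(len(desc)))
	binary.LittleEndian.PutUint32(out[8:], typ)
	out = append(out, nameBytes...)
	out = append(out, make([]byte, align4(len(nameBytes))-len(nameBytes))...)
	out = append(out, desc...)
	out = append(out, make([]byte, align4(len(desc))-len(desc))...)
	return out
}

// parseNotes decodes consecutive note records, as a loader might when
// scanning a PT_NOTE segment.  It stops at the first malformed record.
func parseNotes(data []byte) []note {
	var notes []note
	for len(data) >= 12 {
		namesz := int(binary.LittleEndian.Uint32(data[0:]))
		descsz := int(binary.LittleEndian.Uint32(data[4:]))
		typ := binary.LittleEndian.Uint32(data[8:])
		data = data[12:]
		if namesz < 0 || descsz < 0 || namesz > len(data) {
			break
		}
		name := data[:namesz]
		if namesz > 0 && name[namesz-1] == 0 {
			name = name[:namesz-1] // drop the terminating NUL
		}
		skip := align4(namesz)
		if skip > len(data) {
			skip = len(data)
		}
		data = data[skip:]
		if descsz > len(data) {
			break
		}
		desc := append([]byte(nil), data[:descsz]...)
		skip = align4(descsz)
		if skip > len(data) {
			skip = len(data)
		}
		data = data[skip:]
		notes = append(notes, note{Name: string(name), Type: typ, Desc: desc})
	}
	return notes
}

func main() {
	blob := buildNote("Linux", hypotheticalNoVsyscall, []byte{1, 0, 0, 0})
	for _, n := range parseNotes(blob) {
		fmt.Printf("name=%q type=%#x desc=%v\n", n.Name, n.Type, n.Desc)
	}
}
```

A loader that already reads the program headers could scan such records without any new system call, which is the property the original question asked for.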

Ian
--
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/


Re: [RFC 0/2] __vdso_findsym

2014-06-16 Thread Ian Lance Taylor
On Mon, Jun 16, 2014 at 7:38 AM, Andi Kleen wrote:
>> I think this issue started when some of the Go developers questioned
>> why the kernel needed to provide a very complex interface--parsing an
>> ELF shared library--for very simple functionality--looking up
>> the address of a magic function.  This approach has required special
>> support not just in Go, but also in the dynamic linker and gdb, and
>> does not work well for statically linked binaries.  The support in gdb
>> is perhaps a good idea, but elsewhere it does not make sense.
>>
>> So why not provide a simple interface?
>
> What good would it do now that everyone already supports it?

Do statically linked binaries use the vDSO calls?

Ian


Re: [RFC 0/2] __vdso_findsym

2014-06-16 Thread Ian Lance Taylor
On Sun, Jun 15, 2014 at 7:36 PM, Andi Kleen wrote:
>
> I haven't looked into this in detail, but my initial assumption would
> be that it wouldn't be useful to add new vdso interfaces just for Go.
> After all would you really want to force every Go user to upgrade their
> kernel just to get fast time? So it has to work with whatever is already
> there anyways.

Go works fine with the current interface.  There was, arguably, a bug
in Go's old implementation in that it assumed that the vDSO would have
a normal symbol table.  That bug has already been fixed.

I think this issue started when some of the Go developers questioned
why the kernel needed to provide a very complex interface--parsing an
ELF shared library--for very simple functionality--looking up
the address of a magic function.  This approach has required special
support not just in Go, but also in the dynamic linker and gdb, and
does not work well for statically linked binaries.  The support in gdb
is perhaps a good idea, but elsewhere it does not make sense.

So why not provide a simple interface?

Ian


Re: [RFC 0/2] __vdso_findsym

2014-06-15 Thread Ian Lance Taylor
On Sun, Jun 15, 2014 at 12:31 PM, H. Peter Anvin wrote:
> The weak symbols are well-known names.  The __vdso symbols are strong.

I see.  But I don't understand how this is supposed to work.  When I
link a program against gettimeofday, I get a reference to gettimeofday
with version GLIBC_2.2.5.  After all, I only link against libc.so; I
don't link against the vDSO.  The vDSO provides gettimeofday with
version LINUX_2.6.  Since those versions don't match, the gettimeofday
reference in my executable will not be satisfied by the definition in
the vDSO.  So at dynamic link time my program is always going to be
linked with the gettimeofday in libc.so, which will in turn call the
gettimeofday in the vDSO.
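
The matching rule described above can be written down as a toy model. This is only a sketch of the rule as stated in this message (a versioned reference binds to a definition with the same name and the same version); it is not the dynamic linker's actual algorithm, and the type and function names are invented for illustration.

```go
package main

import "fmt"

// definition is a versioned symbol exported by some object.
type definition struct {
	Object  string
	Name    string
	Version string
}

// reference is a versioned symbol needed by the executable.
type reference struct {
	Name    string
	Version string
}

// resolve returns the first definition in search order whose name and
// version both match the reference.
func resolve(ref reference, defs []definition) (definition, bool) {
	for _, d := range defs {
		if d.Name == ref.Name && d.Version == ref.Version {
			return d, true
		}
	}
	return definition{}, false
}

// exampleScope lists the two definitions discussed above.  The vDSO is
// deliberately placed first to show that search order does not help it.
func exampleScope() []definition {
	return []definition{
		{Object: "vDSO", Name: "gettimeofday", Version: "LINUX_2.6"},
		{Object: "libc.so", Name: "gettimeofday", Version: "GLIBC_2.2.5"},
	}
}

func main() {
	ref := reference{Name: "gettimeofday", Version: "GLIBC_2.2.5"}
	if d, ok := resolve(ref, exampleScope()); ok {
		fmt.Printf("%s@%s binds to the definition in %s\n", ref.Name, ref.Version, d.Object)
	}
	// prints: gettimeofday@GLIBC_2.2.5 binds to the definition in libc.so
}
```

Under this model the LINUX_2.6 definition is never selected for a reference created by linking against libc.so, which is the point of the question that follows.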

Am I missing something that makes the definition of gettimeofday with
version LINUX_2.6 in the vDSO useful?

Ian



> On June 15, 2014 12:22:17 PM PDT, Ian Lance Taylor wrote:
>>On Sun, Jun 15, 2014 at 12:14 PM, H. Peter Anvin wrote:
>>>
>>> If it doesn't, then you incur an additional indirection penalty.  The
>>> strong __vdso symbol allows the libc wrapper to fall back to the vdso
>>> implementation, the weak symbol allows there to be no wrapper at all.
>>> This is good.
>>>
>>> The reason for changing ABI would be shifting types.  This is very
>>> much how glibc manages transitions.
>>
>>The purpose of symbol versioning is so that symbols with well known
>>names, like stat, can continue to use those same names while changing
>>types.  Both old and new programs can continue to use the name stat
>>and continue to work even though they use different types.
>>
>>I don't see how this applies to the kernel vDSO.  Those symbols do not
>>use well-known names; they use names like __vdso_time.  If you change
>>the types used by those symbols, you can change the name as well.
>>What is the downside?
>>
>>Ian
>
> --
> Sent from my mobile phone.  Please pardon brevity and lack of formatting.


Re: [RFC 0/2] __vdso_findsym

2014-06-15 Thread Ian Lance Taylor
On Sun, Jun 15, 2014 at 12:14 PM, H. Peter Anvin wrote:
>
> If it doesn't, then you incur an additional indirection penalty.  The strong
> __vdso symbol allows the libc wrapper to fall back to the vdso
> implementation, the weak symbol allows there to be no wrapper at all.  This
> is good.
>
> The reason for changing ABI would be shifting types.  This is very much how 
> glibc manages transitions.

The purpose of symbol versioning is so that symbols with well known
names, like stat, can continue to use those same names while changing
types.  Both old and new programs can continue to use the name stat
and continue to work even though they use different types.

I don't see how this applies to the kernel vDSO.  Those symbols do not
use well-known names; they use names like __vdso_time.  If you change
the types used by those symbols, you can change the name as well.
What is the downside?

Ian

