Re: compat_sys_times() bogus until jiffies >= 0.

2007-12-22 Thread Andi Kleen
On Fri, Dec 21, 2007 at 08:36:40PM -0800, David Miller wrote:
> From: Andi Kleen <[EMAIL PROTECTED]>
> Date: Sat, 22 Dec 2007 02:53:11 +0100
> 
> > > And to handle potentially ambiguous cases we, for a long time, have
> > > the force_successful_syscall_return() arch hook. 
> > 
> > Ah I see what you mean now.
> > 
> > Thanks for the clarification.
> 
> Thanks for continuing to insist it's "impossible" :-)

It's still hard -- e.g. I'm not sure your condition flag setting would
even be possible for i386 SYSEXIT, which does not restore EFLAGS
from memory and has some other constraints too. And there is no
other free register to use for this on either i386 or x86-64.

OK, you could always disable SYSEXIT on force_successful_syscall_return(),
but then e.g. all lseek()s would use the slow path, which might not
be a good idea.

-Andi

--
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to [EMAIL PROTECTED]
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/


Re: compat_sys_times() bogus until jiffies >= 0.

2007-12-21 Thread David Miller
From: Andi Kleen <[EMAIL PROTECTED]>
Date: Sat, 22 Dec 2007 02:53:11 +0100

> > And to handle potentially ambiguous cases we, for a long time, have
> > the force_successful_syscall_return() arch hook. 
> 
> Ah I see what you mean now.
> 
> Thanks for the clarification.

Thanks for continuing to insist it's "impossible" :-)


Re: compat_sys_times() bogus until jiffies >= 0.

2007-12-21 Thread Andi Kleen
> And to handle potentially ambiguous cases we, for a long time, have
> the force_successful_syscall_return() arch hook. 

Ah I see what you mean now.

Thanks for the clarification.

Ok that could be in theory made to work yes. The migration would
probably be ugly though (how would glibc figure out if the kernel
does that or not?)

-Andi


Re: compat_sys_times() bogus until jiffies >= 0.

2007-12-21 Thread Andi Kleen
> I'm suggesting that you set the condition codes based upon whether
> there is an error or not. 

And the only way the syscall code could find out if there is an error is by
checking err < 0 && err >= -4095 like glibc (except for the compat
syscall on 64bit kernel case)

Or rewrite all code that returns errors to system calls to pass
a separate flag too.

> That is the critical thing x86 doesn't do
> that all the other platforms do.

It doesn't do it because it's useless without a kernel rewrite.

I frankly doubt it really works on Sparc :-) Maybe it could work
there on a hypothetical rewritten kernel, but not today.

-Andi



Re: compat_sys_times() bogus until jiffies >= 0.

2007-12-21 Thread David Miller
From: David Miller <[EMAIL PROTECTED]>
Date: Fri, 21 Dec 2007 17:41:24 -0800 (PST)

> I'm suggesting that you set the condition codes based upon whether
> there is an error or not.  That is the critical thing x86 doesn't do
> that all the other platforms do.

And if you still don't get it, I'm saying that x86, in the syscall
trap return path, should set the condition codes based upon whether the
system call is really signalling an error or not.

And to handle potentially ambiguous cases we have, for a long time, had
the force_successful_syscall_return() arch hook.  System call
implementations use this when the return values they give could be
misconstrued as error values.

And if you'll notice x86 makes no attempt to implement that hook,
because it currently can't.  That's what needs to be fixed.
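
A rough userspace model of the scheme described above, written in C. This
is only a sketch and not kernel code: struct sysret, its error flag
(standing in for a condition code such as a carry bit), finish_syscall()
and the force_success parameter are all hypothetical names chosen for
illustration. It is loosely modeled on the idea that the return path
guesses from the value unless the syscall implementation has said
"this is a success".

```c
#include <assert.h>
#include <stdbool.h>

#define MAX_ERRNO 4095L

/* Hypothetical model of what the trap return path hands to userspace:
 * the raw return value plus an out-of-band error flag that stands in
 * for a condition code. */
struct sysret {
    long value;
    bool error;
};

/* Model of the return path. Without a hint it can only guess from the
 * value. force_success models the syscall implementation having called
 * force_successful_syscall_return() beforehand. */
static struct sysret finish_syscall(long raw, bool force_success)
{
    struct sysret r;
    bool looks_like_errno = raw < 0 && raw >= -MAX_ERRNO;

    r.error = looks_like_errno && !force_success;
    r.value = r.error ? -raw : raw;   /* positive errno on error */
    return r;
}
```

With this split, a value such as -14 is reported as an error only when
the implementation did not force success, so userspace no longer has to
guess from the integer alone.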


Re: compat_sys_times() bogus until jiffies >= 0.

2007-12-21 Thread David Miller
From: Andi Kleen <[EMAIL PROTECTED]>
Date: Sat, 22 Dec 2007 01:42:02 +0100

> David Miller <[EMAIL PROTECTED]> writes:
> 
> > Only on x86 platforms.  Sparc, IA64, MIPS, powerpc, etc. all get this
> > case right.
> 
> Do they for 32bit kernels or for 64bit userland? 

Both.  This is not a compat issue.

> > Yes it's another unfortunate side effect of how error status is
> > indicated for x86 system calls.
> 
> Maybe I'm dense, but doesn't all the kernel code pass it the
> same way as the x86 syscall code? For your proposal you
> would need a separate error bit coming out of the sys_* to
> handle this case. Basically rewrite all code that ever returns
> errors in the kernel. Or do I miss something? 

I'm suggesting that you set the condition codes based upon whether
there is an error or not.  That is the critical thing x86 doesn't do
that all the other platforms do.

x86 relies on interpretation of purely the integer returned from the
system call to userspace, and that means a certain chunk of the return
value space can never represent valid values.

If you use the condition codes to signal "the return value is an
error" you don't have these problems.
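
To illustrate the "chunk of the return value space" point, here is a
minimal C sketch of the kind of range test a 32-bit x86 userspace
wrapper has to apply. The helper name looks_like_error32() is
hypothetical, and the bound of 4095 is taken from the "-1 to -4095"
range mentioned elsewhere in this thread, not from any particular
glibc source file.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* True if a raw 32-bit syscall return value falls in the range that is
 * read as -errno, i.e. -4095..-1 when viewed as a signed number. */
static bool looks_like_error32(uint32_t raw)
{
    return raw >= (uint32_t)-4095;
}
```

Any legitimate result that happens to land in that range (a tick count,
a file offset, a negated process group ID) is indistinguishable from an
error under this convention.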


Re: compat_sys_times() bogus until jiffies >= 0.

2007-12-21 Thread Andi Kleen
David Miller <[EMAIL PROTECTED]> writes:

> Only on x86 platforms.  Sparc, IA64, MIPS, powerpc, etc. all get this
> case right.

Do they for 32bit kernels or for 64bit userland? 

> Yes it's another unfortunate side effect of how error status is
> indicated for x86 system calls.

Maybe I'm dense, but doesn't all the kernel code pass it the
same way as the x86 syscall code? For your proposal you
would need a separate error bit coming out of the sys_* to
handle this case. Basically rewrite all code that ever returns
errors in the kernel. Or do I miss something? 

Or are you talking about solving it only for 32bit compat emulation
on 64bit kernels? There it would be possible, but only doing 
it there and not on native 32bit systems wouldn't seem clean to me.

At least on x86-64 the compat code's (near) only goal in life is to be
as compatible with 32bit as possible, not better.

-Andi


Re: compat_sys_times() bogus until jiffies >= 0.

2007-12-20 Thread David Miller
From: Michael Kerrisk <[EMAIL PROTECTED]>
Date: Thu, 20 Dec 2007 12:36:52 +0100

> Some testing just now shows me that lseek() on /dev/mem suffers similar
> problems when seeking to bytes 0xfffff001 through to 0xffffffff.

Only on x86 platforms.  Sparc, IA64, MIPS, powerpc, etc. all get this
case right.

Yes it's another unfortunate side effect of how error status is
indicated for x86 system calls.

I would suggest, that we put something in place to fix this
in the long term:

1) Start setting the condition codes properly to indicate
   error in the system call return path on x86 like other
   platforms do now.  Make sure that it tips off on
   force_successful_syscall_return(), as needed.

2) Come up with a transition plan for glibc et al. to take
   advantage of this.

It actually sounds like the kind of problem that could be
solved well using the VDSO page. :-)



Re: compat_sys_times() bogus until jiffies >= 0.

2007-12-20 Thread Michael Kerrisk
David, Andrew, Paul,

A late coda to this thread, but I'll just note some changes I'm making to
the man page (which I'd like you to review -- please see below), and note a
few other points.

Andrew, you asked about what happens for x86 with the -1 to -4095 return
for other syscalls.  At least two other syscalls suffer the same problem.
From the fcntl(2) man page:

   BUGS
   A limitation of the Linux system call conventions on some
   architectures  (notably  x86)  means that if a (negative)
   process group ID to be returned by F_GETOWN falls in  the
   range  -1  to  -4095,  then  the  return value is wrongly
   interpreted by glibc as an error in the system call; that
   is,  the  return  value  of fcntl() will be -1, and errno
   will contain the (positive) process group ID.

Some testing just now shows me that lseek() on /dev/mem suffers similar
problems when seeking to bytes 0xfffff001 through to 0xffffffff.

Ulrich Drepper wrote:
> Chris Friesen wrote:
>>> A possible remedy is to return the ticks since process start time, which
>>> delays the wrap around much further.  POSIX only demands consistency
>>> within the same process.
>> This would be an interesting solution.
> 
>> The man page for linux states that the return code is time since system
>> boot, so that could realistically be expected to correlate between
>> different processes.
> 
> The Linux man page is documenting existing functionality on top of what
> the standard requires.  Programmers should ever only require what the
> standard guarantees.
> 
> I am perfectly willing to support a solution where the time is measured
> from process startup time.  The only code using times() I found is
> cross-platform and most likely does not depend on the value returned is
> usable in isolation (only in a difference).

Did I miss anything?  Is anyone actually working on a solution along these
lines?

In the meantime, for man-pages-2.74, I've reworked the description of the
return value:

   RETURN VALUE
   times() returns the  number  of  clock  ticks  that  have
   elapsed since an arbitrary point in the past.  The return
   value may overflow the possible range  of  type  clock_t.
   On  error,  (clock_t) -1  is  returned,  and errno is set
   appropriately.

I moved the Linux specific details of the return value to NOTES, and added
a warning about relying on those details:

   NOTES
   ...

   On Linux, the "arbitrary point in the  past"  from  which
   the return value of times() is measured has varied across
   kernel versions.  On Linux 2.4 and earlier this point  is
   the  moment the system was booted.  Since Linux 2.6, this
   point is (2^32/HZ) - 300 (i.e., about 429  million)  sec-
   onds  before  system  boot time.  This variability across
   kernel versions (and across Unix  implementations),  com-
   bined  with the fact that the returned value may overflow
   the range of clock_t, means that a  portable  application
   would  be  wise  to  avoid  using this value.  To measure
   changes in elapsed time, use gettimeofday(2) instead.
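
As a small illustration of the gettimeofday(2) advice in the NOTES text
above, the following C sketch computes the elapsed time between two
samples. The helper name elapsed_usec() is an assumption made for this
example; only gettimeofday() and struct timeval come from the system
headers.

```c
#include <assert.h>
#include <sys/time.h>

/* Elapsed microseconds between two struct timeval samples. */
static long long elapsed_usec(const struct timeval *start,
                              const struct timeval *end)
{
    return (long long)(end->tv_sec - start->tv_sec) * 1000000LL
         + (end->tv_usec - start->tv_usec);
}
```

A caller would fill the two samples with gettimeofday(&start, NULL) and
gettimeofday(&end, NULL) around the code being timed. Note that
gettimeofday() follows the wall clock, so the result can be skewed if
the system time is changed between the two samples.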

Under BUGS I added:

   BUGS
   A limitation of the Linux system call conventions on some
   architectures (notably x86) means that on Linux 2.6 there
   is  a small time window (41 seconds) soon after boot when
   times(2) can return -1, falsely indicating that an  error
   occurred.   The  same  problem  can occur when the return
   value wraps past the maximum value  that  can  be  stored
   in clock_t.

Look okay to you folks?
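
For reference, the 41-second figure in the BUGS text can be reproduced
with a little arithmetic. The C sketch below assumes that clock ticks
are reported to userspace at 100 per second (USER_HZ on x86) and that
the counter starts 300 seconds before it reaches zero, as in the NOTES
text; the macro and function names are made up for this example.

```c
#include <assert.h>

/* Assumptions: 100 clock ticks per second as seen by userspace, and a
 * tick counter that starts 300 seconds short of wrapping to zero. */
#define TICKS_PER_SEC      100L
#define INITIAL_OFFSET_SEC 300L
#define MAX_ERRNO          4095L

/* Length in seconds of the window during which the tick count lies in
 * -4095..-1 and is therefore mistaken for an error: 4095 / 100. */
static double error_window_seconds(void)
{
    return (double)MAX_ERRNO / TICKS_PER_SEC;
}

/* Seconds after boot at which that window opens. */
static double error_window_start(void)
{
    return INITIAL_OFFSET_SEC - error_window_seconds();
}
```

Under these assumptions the window is 40.95 seconds long and opens
roughly 259 seconds after boot, closing when the counter reaches zero.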

Cheers,

Michael
-- 
Michael Kerrisk
Maintainer of the Linux man-pages project
http://www.kernel.org/doc/man-pages/
Want to report a man-pages bug?  Look here:
http://www.kernel.org/doc/man-pages/reporting_bugs.html




Re: compat_sys_times() bogus until jiffies >= 0.

2007-11-09 Thread Ulrich Drepper
Chris Friesen wrote:
>> A possible remedy is to return the ticks since process start time, which
>> delays the wrap around much further.  POSIX only demands consistency
>> within the same process.
> 
> This would be an interesting solution.
> 
> The man page for linux states that the return code is time since system
> boot, so that could realistically be expected to correlate between
> different processes.

The Linux man page is documenting existing functionality on top of what
the standard requires.  Programmers should ever only require what the
standard guarantees.

I am perfectly willing to support a solution where the time is measured
from process startup time.  The only code using times() I found is
cross-platform and most likely does not depend on the value returned is
usable in isolation (only in a difference).

--
➧ Ulrich Drepper ➧ Red Hat, Inc. ➧ 444 Castro St ➧ Mountain View, CA ❖


Re: compat_sys_times() bogus until jiffies >= 0.

2007-11-08 Thread Denys Vlasenko
On Thursday 08 November 2007 03:07, Andrew Morton wrote:
> > On Thu, 8 Nov 2007 12:53:57 +1100 Paul Mackerras <[EMAIL PROTECTED]> wrote:
> > Andrew Morton writes:
> > 
> > > Given all this stuff, the return value from sys_times() doesn't seem a
> > > particularly useful or reliable kernel interface.
> > 
> > I think the best thing would be to ignore any error from copy_to_user
> > and always return the number of clock ticks.  We should call
> > force_successful_syscall_return, and glibc on x86 should be taught not
> > to interpret negative values as an error.
> 
> Changing glibc might be hard ;)
> 
> > POSIX doesn't require us to return an EFAULT error if the buf argument
> > is bogus.  If userspace does supply a bogus buf pointer, then either
> > it will dereference it itself and get a segfault, or it won't
> > dereference it, in which case it obviously didn't care about the
> > values we tried to put there.
> > 
> > If we try to return an error under some circumstances, then there is
> > at least one 32-bit value for the number of ticks that will cause
> > confusion.  We can either change that value (or values) to some other
> > value, which seems pretty bogus, or we can just decide not to return
> > any errors.  The latter seems to me to have no significant downside
> > and to be the simplest solution to the problem.
> 
> "the latter" is what my protopatch does isn't it?  It wraps at 0x7fffffff.
> It appears that glibc treats all of 0x80000000-0xffffffff as an error.

The best solution is to change the kernel to never return an error
and to change glibc to never treat the return value as an error.
--
vda


Re: compat_sys_times() bogus until jiffies >= 0.

2007-11-08 Thread Denys Vlasenko
On Thursday 08 November 2007 02:09, David Miller wrote:
> > I think the best thing would be to ignore any error from copy_to_user
> > and always return the number of clock ticks.  We should call
> > force_successful_syscall_return, and glibc on x86 should be taught not
> > to interpret negative values as an error.
> > 
> > POSIX doesn't require us to return an EFAULT error if the buf argument
> > is bogus.  If userspace does supply a bogus buf pointer, then either
> > it will dereference it itself and get a segfault, or it won't
> > dereference it, in which case it obviously didn't care about the
> > values we tried to put there.
> > 
> > If we try to return an error under some circumstances, then there is
> > at least one 32-bit value for the number of ticks that will cause
> > confusion.  We can either change that value (or values) to some other
> > value, which seems pretty bogus, or we can just decide not to return
> > any errors.  The latter seems to me to have no significant downside
> > and to be the simplest solution to the problem.
> 
> I agree with this analysis.
> 
> The Linux man page for times() explicitly lists (clock_t) -1 as a
> return value meaning error.
> 
> So even if we did make some effort to return errors "properly" (via
> force_successful_syscall_return() et al.) userspace would still be
> screwed because (clock_t) -1 would be interpreted as an error.
> 
> Actually I think this basically proves we cannot return (clock_t) -1
> ever because all existing userland (I'm not talking about inside
> glibc, I'm talking about inside of applications) will see this as an
> error.

What error? I'd argue it's perfectly sane for an application to
assume that times() never fails.

struct tms t;
clock_t start = times(&t);
...
clock_t end = times(&t);
clock_t delta = end - start;

The only error from the kernel's POV is that the passed pointer can be
invalid. But from the application's POV in the above example that
cannot be the case, and

if (start == -1)
 error("error in times!");

would be an exercise in wasting CPU cycles, producing dead code
and feeding one's paranoia.

> User applications have no other way to check for error.

And in all realistic scenarios it doesn't need to.
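
The start/end pattern in this message only depends on the difference
between two samples. A minimal C sketch of that idea, assuming the tick
counts are treated as 32-bit unsigned values (the helper name
ticks_delta32() is hypothetical):

```c
#include <assert.h>
#include <stdint.h>

/* Difference between two 32-bit tick samples, computed modulo 2^32 so
 * that a single wraparound between the samples still gives the right
 * answer. */
static uint32_t ticks_delta32(uint32_t start, uint32_t end)
{
    return (uint32_t)(end - start);
}
```

Because the subtraction is done in unsigned arithmetic, the result is
the same whether or not the counter wrapped once between the two calls,
which is why the absolute value (including one that looks like -1)
does not matter to such code.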

In this particular case, it makes sense to ignore standards and
never return an error. If user indeed passed invalid pointer,
just don't store anything there, but still return valid value.
--
vda
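
The usage pattern argued for above can be sketched as follows. This is only an illustration, assuming clock_t is as wide as long (as on Linux); the helper names ticks_between and elapsed_ticks_since are made up for this example and do not come from the thread.

```c
#include <sys/times.h>

/* Elapsed clock ticks between two times() return values.
 * The subtraction is done in unsigned arithmetic so it stays well
 * defined even if the counter wrapped between the two samples.
 * Assumption: clock_t has the same width as long. */
static unsigned long ticks_between(clock_t start, clock_t end)
{
    return (unsigned long)end - (unsigned long)start;
}

/* Ticks elapsed since an earlier times() sample.  The buffer is a
 * valid local object, so the only documented failure (EFAULT) cannot
 * occur, and the return value is deliberately not compared with -1. */
static unsigned long elapsed_ticks_since(clock_t start)
{
    struct tms t;
    clock_t now = times(&t);
    return ticks_between(start, now);
}
```

A caller would take one sample with times(&t) at the start of the interval and pass it to elapsed_ticks_since() at the end; dividing by sysconf(_SC_CLK_TCK) converts the result to seconds.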


Re: compat_sys_times() bogus until jiffies >= 0.

2007-11-08 Thread Chris Friesen

Andreas Schwab wrote:


> A possible remedy is to return the ticks since process start time, which
> delays the wrap around much further.  POSIX only demands consistency
> within the same process.


This would be an interesting solution.

The man page for linux states that the return code is time since system 
boot, so that could realistically be expected to correlate between 
different processes.


Could we get away with changing the man page and breaking any apps 
relying on this previously-documented behaviour?




Chris


Re: compat_sys_times() bogus until jiffies >= 0.

2007-11-08 Thread Andreas Schwab
David Miller <[EMAIL PROTECTED]> writes:

> I agree with this analysis.
>
> The Linux man page for times() explicitly lists (clock_t) -1 as a
> return value meaning error.
>
> So even if we did make some effort to return errors "properly" (via
> force_successful_syscall_return() et al.) userspace would still be
> screwed because (clock_t) -1 would be interpreted as an error.
>
> Actually I think this basically proves we cannot return (clock_t) -1
> ever because all existing userland (I'm not talking about inside
> glibc, I'm talking about inside of applications) will see this as an
> error.
>
> User applications have no other way to check for error.
>
> This API is definitely very poorly designed, no matter which way we
> "fix" this some case will remain broken.

A possible remedy is to return the ticks since process start time, which
delays the wrap around much further.  POSIX only demands consistency
within the same process.

Andreas.
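
A rough sketch of the idea, under the assumption that the kernel side has a 64-bit tick counter and records each process's start tick. The function name and both parameters are hypothetical, chosen only for illustration.

```c
#include <stdint.h>

/* Hypothetical: report ticks relative to the process's own start
 * instead of relative to boot.  The 32-bit result only reaches the
 * -4095..-1 window after the process itself has run for almost 2^32
 * ticks, regardless of how long the machine has been up. */
static uint32_t ticks_since_process_start(uint64_t now_ticks,
                                          uint64_t process_start_ticks)
{
    return (uint32_t)(now_ticks - process_start_ticks);
}
```

Values returned to two different processes would no longer be comparable, which is the trade-off raised elsewhere in the thread.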

-- 
Andreas Schwab, SuSE Labs, [EMAIL PROTECTED]
SuSE Linux Products GmbH, Maxfeldstraße 5, 90409 Nürnberg, Germany
PGP key fingerprint = 58CA 54C7 6D53 942B 1756  01D3 44D5 214B 8276 4ED5
"And now for something completely different."


Re: compat_sys_times() bogus until jiffies >= 0.

2007-11-08 Thread Paul Mackerras
David Miller writes:

> On x86 only.  We could use force_successful_syscall_return()
> to make sure the condition codes get set correctly on
> other platforms.
> 
> But even in that case we'd still be broken when the return
> value is exactly -1 and that's what the application is going
> to compare against to test for errors.

We could special-case that and turn it into 0.  That would introduce a
0.01 second blip, which would be better than a 41 second window for
bad behaviour like we have at the moment.

It's also possible that many applications already don't check for
errors.  For example, glibc deliberately doesn't check for errors when
it calls __times in the clock() implementation.  There is a comment in
sysdeps/unix/sysv/linux/clock.c that says this:

  /* We don't check for errors here.  The only error the kernel
 returns is EFAULT if the value cannot be written to the struct we
 pass a pointer to.  Otherwise the kernel returns an `unsigned
 long' value which is the number of jiffies since system start.
 But this number can be negative (when read as `long') when the
 system is up for some time.  Ignoring errors should therefore
 have no negative impacts but solve the problem.  */
  __times (&buf);

Paul.
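
The special case suggested above could look roughly like this. It is a sketch only, assuming a 32-bit tick value; the helper name avoid_error_sentinel is invented for the example.

```c
#include <stdint.h>

/* Hypothetical helper: the single tick count whose bit pattern equals
 * (clock_t) -1 is reported as 0; every other value passes through.
 * The caller sees 0 one tick early, i.e. a one-tick blip. */
static uint32_t avoid_error_sentinel(uint32_t ticks)
{
    if (ticks == UINT32_MAX)   /* same bits as a 32-bit -1 */
        return 0;
    return ticks;
}
```

This only removes the exact -1 collision; a libc that treats the whole -4095..-1 range as errors would still need changing.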



Re: compat_sys_times() bogus until jiffies >= 0.

2007-11-08 Thread Denys Vlasenko
On Thursday 08 November 2007 03:07, Andrew Morton wrote:
> > On Thu, 8 Nov 2007 12:53:57 +1100 Paul Mackerras <[EMAIL PROTECTED]> wrote:
> > Andrew Morton writes:
> > 
> > > Given all this stuff, the return value from sys_times() doesn't seem a
> > > particularly useful or reliable kernel interface.
> > 
> > I think the best thing would be to ignore any error from copy_to_user
> > and always return the number of clock ticks.  We should call
> > force_successful_syscall_return, and glibc on x86 should be taught not
> > to interpret negative values as an error.
> 
> Changing glibc might be hard ;)
> 
> > POSIX doesn't require us to return an EFAULT error if the buf argument
> > is bogus.  If userspace does supply a bogus buf pointer, then either
> > it will dereference it itself and get a segfault, or it won't
> > dereference it, in which case it obviously didn't care about the
> > values we tried to put there.
> > 
> > If we try to return an error under some circumstances, then there is
> > at least one 32-bit value for the number of ticks that will cause
> > confusion.  We can either change that value (or values) to some other
> > value, which seems pretty bogus, or we can just decide not to return
> > any errors.  The latter seems to me to have no significant downside
> > and to be the simplest solution to the problem.
> 
> "the latter" is what my protopatch does isn't it?  It wraps at 0x7fffffff.
> It appears that glibc treats all of 0x80000000-0xffffffff as an error.

The best solution is to change the kernel to never return an error
and to change glibc to never treat return as an error.
--
vda


Re: compat_sys_times() bogus until jiffies >= 0.

2007-11-07 Thread David Miller
From: Andrew Morton <[EMAIL PROTECTED]>
Date: Wed, 7 Nov 2007 23:09:16 -0800

> I don't think that's a big problem?  This syscall can (oddly) return any
> 32-bit (64-bit) number and a smart application developer (after saying wtf)
> would realise that he just can't check for errors and have correctly
> working code.
> 
> Then again, if he was smart he just wouldn't use times(2)'s return value
> for anything.  But what is the alternative?  I don't think there is one,
> apart from much saner things like gettimeofday().

You and I would say "wtf", but the manual states what it does:

On error, (clock_t) -1 is returned, and errno is set appropriately.

And I think this (obviously bogus) convention is something we
are really stuck with.

Another awful aspect of this is that glibc is going to overwrite
'errno' for this return value range.  That will likely cause more
application misbehavior than some of the other side effects we've been
discussing.

In short we have two problems:

1) glibc thinks -4096 < x < 0 is an error, and will write this
   value into errno and return -1 to the application

2) the manual states that -1 means error



Re: compat_sys_times() bogus until jiffies >= 0.

2007-11-07 Thread Andrew Morton
> On Wed, 07 Nov 2007 22:25:30 -0800 (PST) David Miller <[EMAIL PROTECTED]> 
> wrote:
> From: Andrew Morton <[EMAIL PROTECTED]>
> Date: Wed, 7 Nov 2007 21:20:05 -0800
> 
> > Yup.  But userspace will already have a fit if either the start or end time
> > advanced into the glibc-thought-that-was-an-error range.
> 
> On x86 only.  We could use force_successful_syscall_return()
> to make sure the condition codes get set correctly on
> other platforms.
> 
> But even in that case we'd still be broken when the return
> value is exactly -1 and that's what the application is going
> to compare against to test for errors.

I don't think that's a big problem?  This syscall can (oddly) return any
32-bit (64-bit) number and a smart application developer (after saying wtf)
would realise that he just can't check for errors and have correctly
working code.

Then again, if he was smart he just wouldn't use times(2)'s return value
for anything.  But what is the alternative?  I don't think there is one,
apart from much saner things like gettimeofday().



Re: compat_sys_times() bogus until jiffies >= 0.

2007-11-07 Thread David Miller
From: Andrew Morton <[EMAIL PROTECTED]>
Date: Wed, 7 Nov 2007 21:20:05 -0800

> Yup.  But userspace will already have a fit if either the start or end time
> advanced into the glibc-thought-that-was-an-error range.

On x86 only.  We could use force_successful_syscall_return()
to make sure the condition codes get set correctly on
other platforms.

But even in that case we'd still be broken when the return
value is exactly -1 and that's what the application is going
to compare against to test for errors.


Re: compat_sys_times() bogus until jiffies >= 0.

2007-11-07 Thread David Miller
From: Paul Mackerras <[EMAIL PROTECTED]>
Date: Thu, 8 Nov 2007 16:15:51 +1100

> David Miller writes:
> 
> > I can't see where x86 is doing this though, so perhaps for x86
> > glibc does make the negative value check.  But I doubt it is
> > > checking the range 0x80000000-0xffffffff, otherwise mmap() would
> > be busted.
> 
> At least for the INTERNAL_SYSCALL macro in glibc, the error check is:
> 
> #define INTERNAL_SYSCALL_ERROR_P(val, err) \
> >   ((unsigned int) (val) >= 0xfffff001u)
> 
> in sysdeps/unix/sysv/linux/i386/sysdep.h.  Similarly the PSEUDO macro
> in that file does a cmpl $-4095,%eax to test for error.  (There is also
> a PSEUDO_NOERRNO which doesn't test for error.)
> 
> So the convention on (32-bit) x86 is that -4095 .. -1 are error
> values, and other values are successful return values.

Thanks for figuring that out.

Really there is no way to fix sys_times() return values
universally.  Each proposed solution either doesn't fix
the problem, or adds a new failure mode.


Re: compat_sys_times() bogus until jiffies >= 0.

2007-11-07 Thread David Miller
From: Paul Mackerras <[EMAIL PROTECTED]>
Date: Thu, 8 Nov 2007 15:59:12 +1100

> Not on powerpc.  On powerpc the error indication is carried separately
> in a condition register bit.  So a force_successful_syscall_return()
> call will make glibc automatically do the right thing without any
> glibc changes on powerpc.

It still won't fix the problem.

When the return value is (clock_t) -1, all the
force_successful_syscall_return() calls and glibc condition
codes checks in the world are not going to fix the application
code which checks for error using -1.


Re: compat_sys_times() bogus until jiffies >= 0.

2007-11-07 Thread Andrew Morton
> On Thu, 8 Nov 2007 16:36:08 +1100 Paul Mackerras <[EMAIL PROTECTED]> wrote:
> Andrew Morton writes:
> 
> > Yup.  But userspace will already have a fit if either the start or end time
> > advanced into the glibc-thought-that-was-an-error range.
> 
> Not nearly as much of a fit.  The effect on x86 is that values between
> -4095 and -1 are reported as -1, so the end-start difference will be
> out by less than 41 seconds.  That's not nearly as dramatic as a
> difference of 21 million seconds (about 248 days). :)
> 
> I really think that wrapping at 0x7fffffff makes the situation worse,
> not better.
> 

Sure.

So we need to do what you say: never return an error from sys_times() and
change glibc to not perform error-interpretation on sys_times() return
values and recommend that people bypass libc and go direct to the syscall
so they'll work correctly on older glibc.   Lovely.

I wonder what happens with things like F_GETOWN, shmat() and lseek(/dev/mem)
on x86 (things which use force_successful_syscall_return()).  According
to the comment in include/linux/ptrace.h, glibc should be special-casing
these.


Re: compat_sys_times() bogus until jiffies >= 0.

2007-11-07 Thread David Brown

On Wed, Nov 07, 2007 at 03:28:33PM -0800, Andrew Morton wrote:

On Wed, 7 Nov 2007 14:47:22 -0800 David Brown <[EMAIL PROTECTED]> wrote:



will return '-1' to user space and set the negated clock_t value to errno.

At minimum, perhaps it should return a sane errno value.


RETURN VALUE
  times() returns the number of clock ticks that have elapsed since an
  arbitrary point in the past.  For Linux 2.4 and earlier this point is
  the moment the system was booted.  Since Linux 2.6, this point is
  (2^32/HZ) - 300 (i.e., about 429 million) seconds before system boot
  time.  The return value may overflow the possible range of type
  clock_t.  On error, (clock_t) -1 is returned, and errno is set
  appropriately.


The strange -1 behavior is enshrined in history.  I think a better answer
is to tell people to use getrusage() if they want a return result without
this problem.

Adding INITIAL_JIFFIES will fix the case where an embedded system is booted
up to run a test and then shut down, and the mask, although it causes
discontinuities periodically at least moves them away from the early boot.

INITIAL_JIFFIES was a good idea, but it is probably best to keep it inside
of the kernel.

David Brown
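
For comparison, a minimal sketch of the getrusage() route mentioned above. The helper name process_cpu_seconds is an assumption made for this example; getrusage() and RUSAGE_SELF are the standard interface.

```c
#include <sys/resource.h>
#include <sys/time.h>

/* CPU time (user + system) used so far by the calling process, in
 * seconds.  getrusage() reports failure through its own 0 / -1 return
 * value, separate from the data, so no time value can be mistaken
 * for an error.  Returns -1.0 if getrusage() fails. */
static double process_cpu_seconds(void)
{
    struct rusage ru;

    if (getrusage(RUSAGE_SELF, &ru) != 0)
        return -1.0;

    return (double)ru.ru_utime.tv_sec + ru.ru_utime.tv_usec / 1e6
         + (double)ru.ru_stime.tv_sec + ru.ru_stime.tv_usec / 1e6;
}
```

Note that this covers the CPU-time fields of struct tms, not the elapsed-ticks return value of times().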


Re: compat_sys_times() bogus until jiffies >= 0.

2007-11-07 Thread Paul Mackerras
Andrew Morton writes:

> Yup.  But userspace will already have a fit if either the start or end time
> advanced into the glibc-thought-that-was-an-error range.

Not nearly as much of a fit.  The effect on x86 is that values between
-4095 and -1 are reported as -1, so the end-start difference will be
out by less than 41 seconds.  That's not nearly as dramatic as a
difference of 21 million seconds (about 248 days). :)

I really think that wrapping at 0x7fffffff makes the situation worse,
not better.

Paul.
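
The two figures above follow from simple arithmetic, sketched here under the assumption of 100 clock ticks per second (the USER_HZ macro and both helper names are introduced only for this example).

```c
#include <stdint.h>

#define USER_HZ 100u   /* assumption: clock ticks per second */

/* Convert a number of clock ticks to seconds. */
static double ticks_to_seconds(uint64_t ticks)
{
    return (double)ticks / USER_HZ;
}

/* Convert seconds to days. */
static double seconds_to_days(double seconds)
{
    return seconds / 86400.0;
}
```

With these, the 4095-tick error window comes to ticks_to_seconds(4095) = 40.95 seconds, and a wrap at 2^31 ticks comes to ticks_to_seconds(0x80000000) = 21474836.48 seconds, which seconds_to_days() puts at roughly 248.5 days.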


Re: compat_sys_times() bogus until jiffies >= 0.

2007-11-07 Thread Andrew Morton
> On Thu, 8 Nov 2007 15:59:12 +1100 Paul Mackerras <[EMAIL PROTECTED]> wrote:
> Andrew Morton writes:
> 
> > "the latter" is what my protopatch does isn't it?  It wraps at 0x7fff.
> > It appears that glibc treats all of 0x8000-0x as an error.
> 
> Not on powerpc.  On powerpc the error indication is carried separately
> in a condition register bit.  So a force_successful_syscall_return()
> call will make glibc automatically do the right thing without any
> glibc changes on powerpc.

OK

> Wrapping at 0x7fffffff will cause programs to see large negative
> deltas between successive calls when the wrap occurs.  I can see that
> giving userspace fits. :)
> 

Yup.  But userspace will already have a fit if either the start or end time
advanced into the glibc-thought-that-was-an-error range.



Re: compat_sys_times() bogus until jiffies >= 0.

2007-11-07 Thread Paul Mackerras
David Miller writes:

> I can't see where x86 is doing this though, so perhaps for x86
> glibc does make the negative value check.  But I doubt it is
> checking the range 0x80000000-0xffffffff, otherwise mmap() would
> be busted.

At least for the INTERNAL_SYSCALL macro in glibc, the error check is:

#define INTERNAL_SYSCALL_ERROR_P(val, err) \
  ((unsigned int) (val) >= 0xfffff001u)

in sysdeps/unix/sysv/linux/i386/sysdep.h.  Similarly the PSEUDO macro
in that file does a cmpl $-4095,%eax to test for error.  (There is also
a PSEUDO_NOERRNO which doesn't test for error.)

So the convention on (32-bit) x86 is that -4095 .. -1 are error
values, and other values are successful return values.

Paul.
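
To make the effect on times() concrete, here is a small model of the range check described above. It mirrors the -4095 .. -1 convention for a 32-bit return value; the function names are hypothetical and this is not glibc code.

```c
#include <stdbool.h>
#include <stdint.h>

/* Range test equivalent to the -4095 .. -1 convention: raw values
 * 0xfffff001 .. 0xffffffff are taken to be negated errno codes. */
static bool looks_like_error(uint32_t raw)
{
    return raw >= 0xfffff001u;
}

/* Model of what the application receives: inside the window the real
 * tick count is lost and (clock_t) -1 comes back instead. */
static uint32_t seen_by_application(uint32_t raw)
{
    return looks_like_error(raw) ? UINT32_MAX : raw;
}
```

A tick count such as 0xfffffff0 is a perfectly valid time, yet seen_by_application() turns it into 0xffffffff, which is why the window shows up as a stretch of identical -1 results.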


Re: compat_sys_times() bogus until jiffies >= 0.

2007-11-07 Thread Paul Mackerras
Andrew Morton writes:

> "the latter" is what my protopatch does isn't it?  It wraps at 0x7fffffff.
> It appears that glibc treats all of 0x80000000-0xffffffff as an error.

Not on powerpc.  On powerpc the error indication is carried separately
in a condition register bit.  So a force_successful_syscall_return()
call will make glibc automatically do the right thing without any
glibc changes on powerpc.

Wrapping at 0x7fffffff will cause programs to see large negative
deltas between successive calls when the wrap occurs.  I can see that
giving userspace fits. :)

Paul.
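
The negative delta can be illustrated with a short model, assuming a 32-bit clock_t and a mask that keeps the low 31 bits. Both function names are invented for this sketch.

```c
#include <stdint.h>

/* Hypothetical model of the proposed masking: keep the low 31 bits,
 * so the reported value is never negative. */
static int32_t masked_ticks(uint32_t raw)
{
    return (int32_t)(raw & 0x7fffffffu);
}

/* Difference as a naive application would compute it (end - start),
 * widened to 64 bits so the result is shown exactly. */
static int64_t naive_delta(int32_t start, int32_t end)
{
    return (int64_t)end - (int64_t)start;
}
```

One tick before the wrap the masked value is 2147483647 and one tick later it is 0, so naive_delta() reports -2147483647 ticks for an interval that really lasted a single tick.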



Re: compat_sys_times() bogus until jiffies >= 0.

2007-11-07 Thread David Miller
From: Andrew Morton <[EMAIL PROTECTED]>
Date: Wed, 7 Nov 2007 19:07:14 -0800

> It appears that glibc treats all of 0x80000000-0xffffffff as an
> error.

glibc treats it as an error if the system call returns with
the carry condition code set.  At least that's how I've
understood it to work and at a minimum this is how it works
on sparc, ppc, ia64, mips, etc.

The error indication is being created by the system call return path
in the kernel.  It tests for values between -512 and 0, and marks
those as errors unless force_successful_syscall_return() has been called.

I can't see where x86 is doing this though, so perhaps for x86
glibc does make the negative value check.  But I doubt it is
checking the range 0x80000000-0xffffffff, otherwise mmap() would
be busted.
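
As a rough user-space model of the convention described above (not kernel code): the return path carries an error flag next to the value, and a forced-success marker suppresses the flag. The struct, the function name, and the exact bounds are assumptions based on the "between -512 and 0" description.

```c
#include <stdbool.h>

/* A syscall result with the error indication carried out of band,
 * the way a condition-code bit would carry it. */
struct syscall_result {
    long value;
    bool error_flag;
};

/* Hypothetical return path: small negative values are flagged as
 * errors unless the syscall asked for a forced successful return. */
static struct syscall_result finish_syscall(long value, bool force_success)
{
    struct syscall_result r = { value, false };

    if (!force_success && value < 0 && value > -512)
        r.error_flag = true;
    return r;
}
```

With force_success set, a tick count that happens to be a small negative number reaches the C library unflagged; the application-level comparison against (clock_t) -1 is unaffected either way.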



Re: compat_sys_times() bogus until jiffies >= 0.

2007-11-07 Thread Andrew Morton
> On Thu, 8 Nov 2007 12:53:57 +1100 Paul Mackerras <[EMAIL PROTECTED]> wrote:
> Andrew Morton writes:
> 
> > Given all this stuff, the return value from sys_times() doesn't seem a
> > particularly useful or reliable kernel interface.
> 
> I think the best thing would be to ignore any error from copy_to_user
> and always return the number of clock ticks.  We should call
> force_successful_syscall_return, and glibc on x86 should be taught not
> to interpret negative values as an error.

Changing glibc might be hard ;)

> POSIX doesn't require us to return an EFAULT error if the buf argument
> is bogus.  If userspace does supply a bogus buf pointer, then either
> it will dereference it itself and get a segfault, or it won't
> dereference it, in which case it obviously didn't care about the
> values we tried to put there.
> 
> If we try to return an error under some circumstances, then there is
> at least one 32-bit value for the number of ticks that will cause
> confusion.  We can either change that value (or values) to some other
> value, which seems pretty bogus, or we can just decide not to return
> any errors.  The latter seems to me to have no significant downside
> and to be the simplest solution to the problem.

"the latter" is what my protopatch does isn't it?  It wraps at 0x7fffffff.
It appears that glibc treats all of 0x80000000-0xffffffff as an error.


Re: compat_sys_times() bogus until jiffies >= 0.

2007-11-07 Thread David Miller
From: Paul Mackerras <[EMAIL PROTECTED]>
Date: Thu, 8 Nov 2007 12:53:57 +1100

> Andrew Morton writes:
> 
> > Given all this stuff, the return value from sys_times() doesn't seem a
> > particularly useful or reliable kernel interface.
> 
> I think the best thing would be to ignore any error from copy_to_user
> and always return the number of clock ticks.  We should call
> force_successful_syscall_return, and glibc on x86 should be taught not
> to interpret negative values as an error.
> 
> POSIX doesn't require us to return an EFAULT error if the buf argument
> is bogus.  If userspace does supply a bogus buf pointer, then either
> it will dereference it itself and get a segfault, or it won't
> dereference it, in which case it obviously didn't care about the
> values we tried to put there.
> 
> If we try to return an error under some circumstances, then there is
> at least one 32-bit value for the number of ticks that will cause
> confusion.  We can either change that value (or values) to some other
> value, which seems pretty bogus, or we can just decide not to return
> any errors.  The latter seems to me to have no significant downside
> and to be the simplest solution to the problem.

I agree with this analysis.

The Linux man page for times() explicitly lists (clock_t) -1 as a
return value meaning error.

So even if we did make some effort to return errors "properly" (via
force_successful_syscall_return() et al.) userspace would still be
screwed because (clock_t) -1 would be interpreted as an error.

Actually I think this basically proves we cannot return (clock_t) -1
ever because all existing userland (I'm not talking about inside
glibc, I'm talking about inside of applications) will see this as an
error.

User applications have no other way to check for error.

This API is definitely very poorly designed, no matter which way we
"fix" this some case will remain broken.


Re: compat_sys_times() bogus until jiffies >= 0.

2007-11-07 Thread Paul Mackerras
Andrew Morton writes:

> Given all this stuff, the return value from sys_times() doesn't seem a
> particularly useful or reliable kernel interface.

I think the best thing would be to ignore any error from copy_to_user
and always return the number of clock ticks.  We should call
force_successful_syscall_return, and glibc on x86 should be taught not
to interpret negative values as an error.

POSIX doesn't require us to return an EFAULT error if the buf argument
is bogus.  If userspace does supply a bogus buf pointer, then either
it will dereference it itself and get a segfault, or it won't
dereference it, in which case it obviously didn't care about the
values we tried to put there.

If we try to return an error under some circumstances, then there is
at least one 32-bit value for the number of ticks that will cause
confusion.  We can either change that value (or values) to some other
value, which seems pretty bogus, or we can just decide not to return
any errors.  The latter seems to me to have no significant downside
and to be the simplest solution to the problem.

Paul.


Re: compat_sys_times() bogus until jiffies >= 0.

2007-11-07 Thread Andrew Morton
> On Thu, 08 Nov 2007 01:54:40 +0100 Andreas Schwab <[EMAIL PROTECTED]> wrote:
> Andrew Morton <[EMAIL PROTECTED]> writes:
> 
> > diff -puN kernel/compat.c~a kernel/compat.c
> > --- a/kernel/compat.c~a
> > +++ a/kernel/compat.c
> > @@ -162,7 +162,8 @@ asmlinkage long compat_sys_times(struct 
> > > if (copy_to_user(tbuf, &tmp, sizeof(tmp)))
> > return -EFAULT;
> > }
> > -   return compat_jiffies_to_clock_t(jiffies);
> > +   return compat_jiffies_to_clock_t((jiffies + INITIAL_JIFFIES) &
> > +   LONG_MAX);
> 
> Are you sure you want LONG_MAX here, not 0x7fffffff?
> 

I'm not sure of anything - I'm just trolling ;)

That's 0x7fffffff for architectures which implement this function.
I think that lines up correctly with jiffies and the return value from
compat_sys_times().

Perhaps formally it should be USERSPACE_CLOCK_T_MAX, but we don't have that.


Re: compat_sys_times() bogus until jiffies >= 0.

2007-11-07 Thread Andrew Morton
> On Wed, 07 Nov 2007 16:50:22 -0800 (PST) David Miller <[EMAIL PROTECTED]> 
> wrote:
> From: Andrew Morton <[EMAIL PROTECTED]>
> Date: Wed, 7 Nov 2007 15:28:33 -0800
> 
> > Perhaps this is a bug in glibc: it is interpreting the times() return value
> > in the same way as other syscalls.
> 
> The problem is more likely that we are failing to
> invoke force_successful_syscall_return() here.
> 
> Otherwise the syscall return path interprets negative
> values as errors, and sets the cpu condition codes.
> 
> And that is what userspace is actually checking for
> to determine if there is an error or not.

hm, I'd forgotten about that.

It seems to be a no-op on lots of architectures?


Re: compat_sys_times() bogus until jiffies >= 0.

2007-11-07 Thread Andreas Schwab
Andrew Morton <[EMAIL PROTECTED]> writes:

> diff -puN kernel/compat.c~a kernel/compat.c
> --- a/kernel/compat.c~a
> +++ a/kernel/compat.c
> @@ -162,7 +162,8 @@ asmlinkage long compat_sys_times(struct 
>   if (copy_to_user(tbuf, &tmp, sizeof(tmp)))
>   return -EFAULT;
>   }
> - return compat_jiffies_to_clock_t(jiffies);
> + return compat_jiffies_to_clock_t((jiffies + INITIAL_JIFFIES) &
> + LONG_MAX);

Are you sure you want LONG_MAX here, not 0x7fffffff?

Andreas.

-- 
Andreas Schwab, SuSE Labs, [EMAIL PROTECTED]
SuSE Linux Products GmbH, Maxfeldstraße 5, 90409 Nürnberg, Germany
PGP key fingerprint = 58CA 54C7 6D53 942B 1756  01D3 44D5 214B 8276 4ED5
"And now for something completely different."


Re: compat_sys_times() bogus until jiffies >= 0.

2007-11-07 Thread David Miller
From: Andrew Morton <[EMAIL PROTECTED]>
Date: Wed, 7 Nov 2007 15:28:33 -0800

> Perhaps this is a bug in glibc: it is interpreting the times() return value
> in the same way as other syscalls.

The problem is more likely that we are failing to
invoke force_successful_syscall_return() here.

Otherwise the syscall return path interprets negative
values as errors, and sets the cpu condition codes.

And that is what userspace is actually checking for
to determine if there is an error or not.


Re: compat_sys_times() bogus until jiffies >= 0.

2007-11-07 Thread Andrew Morton
> On Wed, 7 Nov 2007 15:28:33 -0800 Andrew Morton <[EMAIL PROTECTED]> wrote:
> > On Wed, 7 Nov 2007 14:47:22 -0800 David Brown <[EMAIL PROTECTED]> wrote:
> > compat_sys_times() has bogus return until jiffies is >= 0.  I discovered
> > this running LTP within 5 minutes of booting.
> > 
> > The return result
> > 
> > return compat_jiffies_to_clock_t(jiffies);
> > 
> > will return '-1' to user space and set the negated clock_t value to errno.
> > 
> > I'm not sure what the correct fix for this is.  I can come up with a patch
> > if anyone has ideas on how to fix it.
> > 
> > At minimum, perhaps it should return a sane errno value.
> 
> RETURN VALUE
>    times()  returns  the  number of clock ticks that have elapsed since an
>    arbitrary point in the past.  For Linux 2.4 and earlier this  point  is
>    the  moment  the  system  was  booted.   Since Linux 2.6, this point is
>    (2^32/HZ) - 300 (i.e., about 429 million) seconds  before  system  boot
>    time.   The  return  value  may  overflow  the  possible  range of type
>    clock_t.  On error, (clock_t) -1 is returned, and errno is  set
>    appropriately.
> 
> 
> Perhaps this is a bug in glibc: it is interpreting the times() return value
> in the same way as other syscalls.
> 
> It would have been sensible for us to add INITIAL_JIFFIES to the value
> instead of exposing this kernel-only detail to the world, although the
> problem will of course reoccur once jiffies hits 0x80000000.  Unfortunately
> we've even gone and enshrined this bogon in the manpage.
> 
> Proposed fix:
> 
> -return compat_jiffies_to_clock_t(jiffies);
> +return compat_jiffies_to_clock_t((jiffies + INITIAL_JIFFIES) &
> + 0x7fffffff);
> 
> ?

Like this?

It gets messy.


From: Andrew Morton <[EMAIL PROTECTED]>

David Brown points out that compat_sys_times() (and sys_times()) can return
arbitrary 32-bit (or 64-bit) values.  If these happen to be negative (jiffy
wrap, or before INITIAL_JIFFIES) then libc will interpret this as an error and
will return -1 to the libc user and will set errno.

The manpage for times(2) says:

   times()  returns  the  number of clock ticks that have elapsed since an
   arbitrary point in the past.  For Linux 2.4 and earlier this  point  is
   the  moment  the  system  was  booted.   Since Linux 2.6, this point is
   (2^32/HZ) - 300 (i.e., about 429 million) seconds  before  system  boot
   time.   The  return  value  may  overflow  the  possible  range of type
   clock_t.  On error, (clock_t) -1 is returned, and errno is  set  appro-
   priately.

We can fix this by masking the return value down to a 31-bit (63-bit) value.

Also, let's correct for INITIAL_JIFFIES - this isn't a detail which should be
exposed to userspace.

Unfortunately this change can break userspace.  If a program was (correctly)
doing:

unsigned long start = times(...);
...
unsigned long end = times(...);
unsigned long delta = end - start;

then `delta' can be grossly wrong if we wrapped in the interval.  Instead
userspace will need to mask `delta' by 0x7fffffff (0x7fffffffffffffff) to get
the correct number.

But userspace was already busted in the presence of wraparound, due to glibc's
convert-to-negative-one behaviour.

Given all this stuff, the return value from sys_times() doesn't seem a
particularly useful or reliable kernel interface.

Cc: David Brown <[EMAIL PROTECTED]>
Cc: Ulrich Drepper <[EMAIL PROTECTED]>
Cc: Michael Kerrisk <[EMAIL PROTECTED]>
Signed-off-by: Andrew Morton <[EMAIL PROTECTED]>
---

 kernel/compat.c |3 ++-
 kernel/sys.c|3 ++-
 2 files changed, 4 insertions(+), 2 deletions(-)

diff -puN kernel/sys.c~a kernel/sys.c
--- a/kernel/sys.c~a
+++ a/kernel/sys.c
@@ -897,7 +897,8 @@ asmlinkage long sys_times(struct tms __u
 		if (copy_to_user(tbuf, &tmp, sizeof(struct tms)))
 			return -EFAULT;
 	}
-   return (long) jiffies_64_to_clock_t(get_jiffies_64());
+   return jiffies_64_to_clock_t((get_jiffies_64() + INITIAL_JIFFIES) &
+   LONG_MAX);
 }
 
 /*
diff -puN kernel/compat.c~a kernel/compat.c
--- a/kernel/compat.c~a
+++ a/kernel/compat.c
@@ -162,7 +162,8 @@ asmlinkage long compat_sys_times(struct 
 		if (copy_to_user(tbuf, &tmp, sizeof(tmp)))
 			return -EFAULT;
 	}
-   return compat_jiffies_to_clock_t(jiffies);
+   return compat_jiffies_to_clock_t((jiffies + INITIAL_JIFFIES) &
+   LONG_MAX);
 }
 
 /*
_
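
A rough userspace sketch of the wraparound arithmetic discussed in the changelog above, assuming a 32-bit tick counter.  The function names are made up for illustration and are not kernel or glibc interfaces.

```c
#include <stdint.h>

/* Elapsed ticks when the counter wraps over the full 32-bit range.
 * Unsigned subtraction is already modulo 2^32, so no mask is needed. */
static uint32_t delta_full_range(uint32_t start, uint32_t end)
{
    return end - start;
}

/* Elapsed ticks when the counter has been masked to 31 bits, so that it
 * wraps from 0x7fffffff back to 0.  The difference has to be masked the
 * same way, otherwise the top bit is left over after a wrap. */
static uint32_t delta_masked_31(uint32_t start, uint32_t end)
{
    return (end - start) & 0x7fffffffu;
}
```

With start = 0x7ffffff0 and end = 0x10 (a counter that wrapped at 31 bits), the plain difference comes out as 0x80000020, while the masked difference is the expected 0x20 ticks.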



Re: compat_sys_times() bogus until jiffies >= 0.

2007-11-07 Thread Andrew Morton
> On Wed, 7 Nov 2007 14:47:22 -0800 David Brown <[EMAIL PROTECTED]> wrote:
> compat_sys_times() has bogus return until jiffies is >= 0.  I discovered
> this running LTP within 5 minutes of booting.
> 
> The return result
> 
>   return compat_jiffies_to_clock_t(jiffies);
> 
> will return '-1' to user space and set the negated clock_t value to errno.
> 
> I'm not sure what the correct fix for this is.  I can come up with a patch
> if anyone has ideas on how to fix it.
> 
> At minimum, perhaps it should return a sane errno value.

RETURN VALUE
   times()  returns  the  number of clock ticks that have elapsed since an
   arbitrary point in the past.  For Linux 2.4 and earlier this  point  is
   the  moment  the  system  was  booted.   Since Linux 2.6, this point is
   (2^32/HZ) - 300 (i.e., about 429 million) seconds  before  system  boot
   time.   The  return  value  may  overflow  the  possible  range of type
   clock_t.  On error, (clock_t) -1 is returned, and errno is  set  appro-
   priately.


Perhaps this is a bug in glibc: it is interpreting the times() return value
in the same way as other syscalls.

It would have been sensible for us to add INITIAL_JIFFIES to the value
instead of exposing this kernel-only detail to the world, although the
problem will of course reoccur once jiffies hits 0x80000000.  Unfortunately
we've even gone and enshrined this bogon in the manpage.

Proposed fix:

-return compat_jiffies_to_clock_t(jiffies);
+return compat_jiffies_to_clock_t((jiffies + INITIAL_JIFFIES) &
+   0x7fffffff);

?


compat_sys_times() bogus until jiffies >= 0.

2007-11-07 Thread David Brown

compat_sys_times() has bogus return until jiffies is >= 0.  I discovered
this running LTP within 5 minutes of booting.

The return result

return compat_jiffies_to_clock_t(jiffies);

will return '-1' to user space and set the negated clock_t value to errno.

I'm not sure what the correct fix for this is.  I can come up with a patch
if anyone has ideas on how to fix it.

At minimum, perhaps it should return a sane errno value.

Thanks,
David Brown


Re: compat_sys_times() bogus until jiffies >= 0.

2007-11-07 Thread Andrew Morton
> On Thu, 8 Nov 2007 15:59:12 +1100 Paul Mackerras <[EMAIL PROTECTED]> wrote:
> Andrew Morton writes:
> 
> > the latter is what my protopatch does isn't it?  It wraps at 0x7fffffff.
> > It appears that glibc treats all of 0x80000000-0xffffffff as an error.
> 
> Not on powerpc.  On powerpc the error indication is carried separately
> in a condition register bit.  So a force_successful_syscall_return()
> call will make glibc automatically do the right thing without any
> glibc changes on powerpc.

OK

> Wrapping at 0x7fffffff will cause programs to see large negative
> deltas between successive calls when the wrap occurs.  I can see that
> giving userspace fits. :)
> 

Yup.  But userspace will already have a fit if either the start or end time
advanced into the glibc-thought-that-was-an-error range.



Re: compat_sys_times() bogus until jiffies >= 0.

2007-11-07 Thread Paul Mackerras
Andrew Morton writes:

> Yup.  But userspace will already have a fit if either the start or end time
> advanced into the glibc-thought-that-was-an-error range.

Not nearly as much of a fit.  The effect on x86 is that values between
-4095 and -1 are reported as -1, so the end-start difference will be
out by less than 41 seconds.  That's not nearly as dramatic as a
difference of 21 million seconds (over 16 years). :)

I really think that wrapping at 0x7fffffff makes the situation worse,
not better.

Paul.


Re: compat_sys_times() bogus until jiffies >= 0.

2007-11-07 Thread David Miller
From: Andrew Morton <[EMAIL PROTECTED]>
Date: Wed, 7 Nov 2007 19:07:14 -0800

> It appears that glibc treats all of 0x80000000-0xffffffff as an
> error.

glibc treats it as an error if the system call returns with
the carry condition code set.  At least that's how I've
understood it to work and at a minimum this is how it works
on sparc, ppc, ia64, mips, etc.

The error indication is being created by the system call return path
in the kernel.  It tests for values between -512 and 0, and marks
those as errors unless force_successful_syscall_return() has been called.

I can't see where x86 is doing this though, so perhaps for x86
glibc does make the negative value check.  But I doubt it is
checking the range 0x80000000-0xffffffff, otherwise mmap() would
be busted.
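
The return-path test described above can be modelled in a few lines of plain C.  This is only a sketch: the function name and the forced_success parameter are hypothetical stand-ins, and the -512 bound is taken from the message rather than from any particular architecture's code.

```c
#include <stdbool.h>

/* Hypothetical model of the kernel's syscall return path on an
 * architecture that reports errors through a condition-code bit.
 * forced_success stands in for "force_successful_syscall_return()
 * was called during this syscall". */
static bool return_path_flags_error(long ret, bool forced_success)
{
    if (forced_success)
        return false;              /* value is passed through as-is */
    return ret >= -512 && ret < 0; /* assumed bounds: -512 .. -1 */
}
```

Under this model a tick count of -1 is reported as an error unless the syscall forces success, and a large negative value such as -100000 is never flagged either way.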



Re: compat_sys_times() bogus until jiffies >= 0.

2007-11-07 Thread David Miller
From: Paul Mackerras <[EMAIL PROTECTED]>
Date: Thu, 8 Nov 2007 12:53:57 +1100

> Andrew Morton writes:
> 
> > Given all this stuff, the return value from sys_times() doesn't seem a
> > particularly useful or reliable kernel interface.
> 
> I think the best thing would be to ignore any error from copy_to_user
> and always return the number of clock ticks.  We should call
> force_successful_syscall_return, and glibc on x86 should be taught not
> to interpret negative values as an error.
> 
> POSIX doesn't require us to return an EFAULT error if the buf argument
> is bogus.  If userspace does supply a bogus buf pointer, then either
> it will dereference it itself and get a segfault, or it won't
> dereference it, in which case it obviously didn't care about the
> values we tried to put there.
> 
> If we try to return an error under some circumstances, then there is
> at least one 32-bit value for the number of ticks that will cause
> confusion.  We can either change that value (or values) to some other
> value, which seems pretty bogus, or we can just decide not to return
> any errors.  The latter seems to me to have no significant downside
> and to be the simplest solution to the problem.

I agree with this analysis.

The Linux man page for times() explicitly lists (clock_t) -1 as a
return value meaning error.

So even if we did make some effort to return errors properly (via
force_successful_syscall_return() et al.) userspace would still be
screwed because (clock_t) -1 would be interpreted as an error.

Actually I think this basically proves we cannot return (clock_t) -1
ever because all existing userland (I'm not talking about inside
glibc, I'm talking about inside of applications) will see this as an
error.

User applications have no other way to check for error.

This API is definitely very poorly designed, no matter which way we
fix this some case will remain broken.


Re: compat_sys_times() bogus until jiffies >= 0.

2007-11-07 Thread Andrew Morton
> On Wed, 07 Nov 2007 22:25:30 -0800 (PST) David Miller <[EMAIL PROTECTED]> 
> wrote:
> From: Andrew Morton <[EMAIL PROTECTED]>
> Date: Wed, 7 Nov 2007 21:20:05 -0800
> 
> > Yup.  But userspace will already have a fit if either the start or end time
> > advanced into the glibc-thought-that-was-an-error range.
> 
> On x86 only.  We could use force_successful_syscall_return()
> to make sure the condition codes get set correctly on
> other platforms.
> 
> But even in that case we'd still be broken when the return
> value is exactly -1 and that's what the application is going
> to compare against to test for errors.

I don't think that's a big problem?  This syscall can (oddly) return any
32-bit (64-bit) number and a smart application developer (after saying wtf)
would realise that he just can't check for errors and have correctly
working code.

Then again, if he was smart he just wouldn't use times(2)'s return value
for anything.  But what is the alternative?  I don't think there is one,
apart from much saner things like gettimeofday().
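
For interval timing, the gettimeofday() route mentioned above avoids depending on the times() return value altogether.  A minimal sketch follows; elapsed_usec is a hypothetical helper name, not an existing API.

```c
#include <sys/time.h>

/* Microseconds between two gettimeofday() samples.  Both fields are
 * signed, so a smaller tv_usec in the later sample works out correctly. */
static long long elapsed_usec(const struct timeval *start,
                              const struct timeval *end)
{
    return (long long)(end->tv_sec - start->tv_sec) * 1000000LL
         + (long long)(end->tv_usec - start->tv_usec);
}
```

A caller would fill two struct timeval values with gettimeofday(&tv, NULL) around the code being timed and pass them in.  Note that gettimeofday() follows the wall clock, so the result can jump if the system time is adjusted in between.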



Re: compat_sys_times() bogus until jiffies >= 0.

2007-11-07 Thread David Miller
From: Paul Mackerras <[EMAIL PROTECTED]>
Date: Thu, 8 Nov 2007 16:15:51 +1100

> David Miller writes:
> 
> > I can't see where x86 is doing this though, so perhaps for x86
> > glibc does make the negative value check.  But I doubt it is
> > checking the range 0x80000000-0xffffffff, otherwise mmap() would
> > be busted.
> 
> At least for the INTERNAL_SYSCALL macro in glibc, the error check is:
> 
> #define INTERNAL_SYSCALL_ERROR_P(val, err) \
>   ((unsigned int) (val) >= 0xfffff001u)
> 
> in sysdeps/unix/sysv/linux/i386/sysdep.h.  Similarly the PSEUDO macro
> in that file does a cmpl $-4095,%eax to test for error.  (There is also
> a PSEUDO_NOERRNO which doesn't test for error.)
> 
> So the convention on (32-bit) x86 is that -4095 .. -1 are error
> values, and other values are successful return values.

Thanks for figuring that out.

Really there is no way to fix sys_times() return values
universally.  Each proposed solution either doesn't fix
the problem, or adds a new failure mode.
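
To make the -4095 .. -1 convention concrete, here is a small userspace model of it, including the "return -1 and put the negated value in errno" behaviour from the original report.  The function names are hypothetical; this is a sketch of the convention as described in this thread, not glibc source.

```c
#include <stdbool.h>
#include <stdint.h>

/* True if a raw 32-bit syscall return value falls in -4095 .. -1.
 * Cast to unsigned, that range becomes 0xfffff001 .. 0xffffffff, so a
 * single unsigned comparison covers it. */
static bool raw_return_is_error(int32_t raw)
{
    return (uint32_t)raw >= 0xfffff001u;
}

/* Hypothetical wrapper model: on "error", store the negated value in
 * *err and hand back -1; otherwise pass the value through untouched. */
static int32_t wrapper_result(int32_t raw, int *err)
{
    if (raw_return_is_error(raw)) {
        *err = -raw;
        return -1;
    }
    return raw;
}
```

A tick count of -300 therefore comes back as -1 with an error code of 300, while -4096 or a large negative value (such as a high mmap() address seen as a signed number) passes through unchanged.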


Re: compat_sys_times() bogus until jiffies >= 0.

2007-11-07 Thread Andrew Morton
> On Thu, 8 Nov 2007 16:36:08 +1100 Paul Mackerras <[EMAIL PROTECTED]> wrote:
> Andrew Morton writes:
> 
> > Yup.  But userspace will already have a fit if either the start or end time
> > advanced into the glibc-thought-that-was-an-error range.
> 
> Not nearly as much of a fit.  The effect on x86 is that values between
> -4095 and -1 are reported as -1, so the end-start difference will be
> out by less than 41 seconds.  That's not nearly as dramatic as a
> difference of 21 million seconds (over 16 years). :)
> 
> I really think that wrapping at 0x7fffffff makes the situation worse,
> not better.
> 

Sure.

So we need to do what you say: never return an error from sys_times() and
change glibc to not perform error-interpretation on sys_times() return
values and recommend that people bypass libc and go direct to the syscall
so they'll work correctly on older glibc.   Lovely.

I wonder what happens with things like F_GETOWN, shmat() and lseek(/dev/mem)
on x86 (things which use force_successful_syscall_return()).  According
to the comment in include/linux/ptrace.h, glibc should be special-casing
these.


Re: compat_sys_times() bogus until jiffies = 0.

2007-11-07 Thread Andrew Morton
 On Wed, 7 Nov 2007 15:28:33 -0800 Andrew Morton [EMAIL PROTECTED] wrote:
  On Wed, 7 Nov 2007 14:47:22 -0800 David Brown [EMAIL PROTECTED] wrote:
  compat_sys_times() has bogus return until jiffies is = 0.  I discovered
  this running LTP within 5 minutes of booting.
  
  The return result
  
  return compat_jiffies_to_clock_t(jiffies);
  
  will return '-1' to user space and set the negated clock_t value to errno.
  
  I'm not sure what the correct fix for this is.  I can come up with a patch
  if anyone has ideas on how to fix it.
  
  At minimum, perhaps it should return a sane errno value.
 
 RETURN VALUE
times()  returns  the  number of clock ticks that have elapsed since an
arbitrary point in the past.  For Linux 2.4 and earlier this  point  is
the  moment  the  system  was  booted.   Since Linux 2.6, this point is
(2^32/HZ) - 300 (i.e., about 429 million) seconds  before  system  boot
time.   The  return  value  may  overflow  the  possible  range of type
clock_t.  On error, (clock_t) -1 is returned, and errno is  set  appro-
priately.
 
 
 Perhaps this is a bug in glibc: it is interpreting the times() return value
 in the same way as other syscalls.
 
 It would have been sensible for us to add INITIAL_JIFFIES to the value
 instead of exposing this kernel-only detail to the world, although the
 problem will of course reoccur once jiffies hits 0x8000.  Unfortunately
 we've even gone and enshrined this bogon in the manpage.
 
 Proposed fix:
 
 -return compat_jiffies_to_clock_t(jiffies);
 +return compat_jiffies_to_clock_t((jiffies + INITIAL_JIFFIES) 
 + 0x7fff);
 
 ?

Like this?

It gets messy.


From: Andrew Morton [EMAIL PROTECTED]

David Brown points out that compat_sys_times() (and sys_times()) can return
arbitrary 32-bit (or 64-bit values).  If these happen to be negative (jiffy
wrap, or before INITIAL_JIFFIES) then libc will interpret this as an error and
will return -1 to the libc user and will set errno.

The manpage for times(2) says:

   times()  returns  the  number of clock ticks that have elapsed since an
   arbitrary point in the past.  For Linux 2.4 and earlier this  point  is
   the  moment  the  system  was  booted.   Since Linux 2.6, this point is
   (2^32/HZ) - 300 (i.e., about 429 million) seconds  before  system  boot
   time.   The  return  value  may  overflow  the  possible  range of type
   clock_t.  On error, (clock_t) -1 is returned, and errno is  set  appro-
   priately.

We can fix this by masking the return value down to a 31-bit (63-bit) value.

Also, let's correct for INTIAL_JIFFIES - this isn't a detail which should be
exposed to userspace.

Unfortunately this change can break userspace.  If a program was (correctly)
doing:

unsigned long start = times(...);
...
unsigned long end = times(...);
unsigned long delta = end - start;

then `delta' can be grossly wrong if we wrapped in the interval.  Instead
userspace will need to mask `delta' by 0x7fff (0x7fff) to get
the correct number.

But userspace was already busted in the presence of wraparound, due to glibc's
convert-to-negative-one behaviour.

Given all this stuff, the return value from sys_times() doesn't seem a
particularly useful or reliable kernel interface.

Cc: David Brown [EMAIL PROTECTED]
Cc: Ulrich Drepper [EMAIL PROTECTED]
Cc: Michael Kerrisk [EMAIL PROTECTED]
Signed-off-by: Andrew Morton [EMAIL PROTECTED]
---

 kernel/compat.c |3 ++-
 kernel/sys.c|3 ++-
 2 files changed, 4 insertions(+), 2 deletions(-)

diff -puN kernel/sys.c~a kernel/sys.c
--- a/kernel/sys.c~a
+++ a/kernel/sys.c
@@ -897,7 +897,8 @@ asmlinkage long sys_times(struct tms __u
if (copy_to_user(tbuf, tmp, sizeof(struct tms)))
return -EFAULT;
}
-   return (long) jiffies_64_to_clock_t(get_jiffies_64());
+   return jiffies_64_to_clock_t((get_jiffies_64() + INITIAL_JIFFIES) 
+   LONG_MAX);
 }
 
 /*
diff -puN kernel/compat.c~a kernel/compat.c
--- a/kernel/compat.c~a
+++ a/kernel/compat.c
@@ -162,7 +162,8 @@ asmlinkage long compat_sys_times(struct 
if (copy_to_user(tbuf, tmp, sizeof(tmp)))
return -EFAULT;
}
-   return compat_jiffies_to_clock_t(jiffies);
+   return compat_jiffies_to_clock_t((jiffies + INITIAL_JIFFIES) 
+   LONG_MAX);
 }
 
 /*
_

-
To unsubscribe from this list: send the line unsubscribe linux-kernel in
the body of a message to [EMAIL PROTECTED]
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/


Re: compat_sys_times() bogus until jiffies = 0.

2007-11-07 Thread Paul Mackerras
Andrew Morton writes:

 Given all this stuff, the return value from sys_times() doesn't seem a
 particularly useful or reliable kernel interface.

I think the best thing would be to ignore any error from copy_to_user
and always return the number of clock ticks.  We should call
force_successful_syscall_return, and glibc on x86 should be taught not
to interpret negative values as an error.

POSIX doesn't require us to return an EFAULT error if the buf argument
is bogus.  If userspace does supply a bogus buf pointer, then either
it will dereference it itself and get a segfault, or it won't
dereference it, in which case it obviously didn't care about the
values we tried to put there.

If we try to return an error under some circumstances, then there is
at least one 32-bit value for the number of ticks that will cause
confusion.  We can either change that value (or values) to some other
value, which seems pretty bogus, or we can just decide not to return
any errors.  The latter seems to me to have no significant downside
and to be the simplest solution to the problem.

Paul.
-
To unsubscribe from this list: send the line unsubscribe linux-kernel in
the body of a message to [EMAIL PROTECTED]
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/


Re: compat_sys_times() bogus until jiffies = 0.

2007-11-07 Thread David Miller
From: Andrew Morton [EMAIL PROTECTED]
Date: Wed, 7 Nov 2007 15:28:33 -0800

 Perhaps this is a bug in glibc: it is interpreting the times() return value
 in the same way as other syscalls.

The problem is more likely that we are failing to
invoke force_successful_syscall_return() here.

Otherwise the syscall return path interprets negative
values as errors, and sets the cpu condition codes.

And that is what userspace is actually checking for
to determine if there is an error or not.
-
To unsubscribe from this list: send the line unsubscribe linux-kernel in
the body of a message to [EMAIL PROTECTED]
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/


Re: compat_sys_times() bogus until jiffies = 0.

2007-11-07 Thread Andreas Schwab
Andrew Morton [EMAIL PROTECTED] writes:

 diff -puN kernel/compat.c~a kernel/compat.c
 --- a/kernel/compat.c~a
 +++ a/kernel/compat.c
 @@ -162,7 +162,8 @@ asmlinkage long compat_sys_times(struct 
   if (copy_to_user(tbuf, tmp, sizeof(tmp)))
   return -EFAULT;
   }
 - return compat_jiffies_to_clock_t(jiffies);
 + return compat_jiffies_to_clock_t((jiffies + INITIAL_JIFFIES) 
 + LONG_MAX);

Are you sure you want LONG_MAX here, not 0x7fff?

Andreas.

-- 
Andreas Schwab, SuSE Labs, [EMAIL PROTECTED]
SuSE Linux Products GmbH, Maxfeldstraße 5, 90409 Nürnberg, Germany
PGP key fingerprint = 58CA 54C7 6D53 942B 1756  01D3 44D5 214B 8276 4ED5
And now for something completely different.
-
To unsubscribe from this list: send the line unsubscribe linux-kernel in
the body of a message to [EMAIL PROTECTED]
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/


Re: compat_sys_times() bogus until jiffies = 0.

2007-11-07 Thread Andrew Morton
 On Wed, 07 Nov 2007 16:50:22 -0800 (PST) David Miller [EMAIL PROTECTED] 
 wrote:
 From: Andrew Morton [EMAIL PROTECTED]
 Date: Wed, 7 Nov 2007 15:28:33 -0800
 
  Perhaps this is a bug in glibc: it is interpreting the times() return value
  in the same way as other syscalls.
 
 The problem is more likely that we are failing to
 invoke force_successful_syscall_return() here.
 
 Otherwise the syscall return path interprets negative
 values as errors, and sets the cpu condition codes.
 
 And that is what userspace is actually checking for
 to determine if there is an error or not.

hm, I'd forgotten about that.

It seems to be a no-op on lots of architectures?
-
To unsubscribe from this list: send the line unsubscribe linux-kernel in
the body of a message to [EMAIL PROTECTED]
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/


Re: compat_sys_times() bogus until jiffies = 0.

2007-11-07 Thread Andrew Morton
 On Thu, 08 Nov 2007 01:54:40 +0100 Andreas Schwab [EMAIL PROTECTED] wrote:
 Andrew Morton [EMAIL PROTECTED] writes:
 
  diff -puN kernel/compat.c~a kernel/compat.c
  --- a/kernel/compat.c~a
  +++ a/kernel/compat.c
  @@ -162,7 +162,8 @@ asmlinkage long compat_sys_times(struct 
  if (copy_to_user(tbuf, tmp, sizeof(tmp)))
  return -EFAULT;
  }
  -   return compat_jiffies_to_clock_t(jiffies);
  +   return compat_jiffies_to_clock_t((jiffies + INITIAL_JIFFIES) 
  +   LONG_MAX);
 
 Are you sure you want LONG_MAX here, not 0x7fff?
 

I'm not sure of anything - I'm just trolling ;)

That's 0x7fff for architectures which implement this function. 
I think that lines up correctly with jiffies and the return value from
compat_sys_times().

Perhaps formally it should be USERSPACE_CLOCK_T_MAX, but we don't have that.


Re: compat_sys_times() bogus until jiffies >= 0.

2007-11-07 Thread David Brown

On Wed, Nov 07, 2007 at 03:28:33PM -0800, Andrew Morton wrote:

On Wed, 7 Nov 2007 14:47:22 -0800 David Brown [EMAIL PROTECTED] wrote:



will return '-1' to user space and set the negated clock_t value to errno.

At minimum, perhaps it should return a sane errno value.


RETURN VALUE
  times()  returns  the  number of clock ticks that have elapsed since an
  arbitrary point in the past.  For Linux 2.4 and earlier this  point  is
  the  moment  the  system  was  booted.   Since Linux 2.6, this point is
  (2^32/HZ) - 300 (i.e., about 429 million) seconds  before  system  boot
  time.   The  return  value  may  overflow  the  possible  range of type
  clock_t.  On error, (clock_t) -1 is returned, and errno is  set  appro-
  priately.


The strange -1 behavior is enshrined in history.  I think a better answer
is to tell people to use getrusage() if they want a return result without
this problem.

Adding INITIAL_JIFFIES will fix the case where an embedded system is booted
up to run a test and then shut down, and the mask, although it causes
periodic discontinuities, at least moves them away from early boot.

INITIAL_JIFFIES was a good idea, but it is probably best to keep it inside
of the kernel.

David Brown
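
A minimal sketch of the getrusage() route suggested above, assuming a POSIX
system. The helper name cpu_seconds_used() is hypothetical. getrusage()
reports failure through a plain 0 / -1 status and hands the CPU times back
in a struct, so no valid time value can collide with the error value. It
only covers the per-process CPU times, not the elapsed-ticks return value
of times().

```c
#include <sys/resource.h>
#include <sys/time.h>

/* Hypothetical helper: user + system CPU time consumed by the calling
 * process, in seconds. Returns 0 on success, -1 if getrusage() failed
 * (errno is then set by getrusage() itself). */
static int cpu_seconds_used(double *out)
{
    struct rusage ru;

    if (getrusage(RUSAGE_SELF, &ru) != 0)
        return -1;

    /* ru_utime and ru_stime are struct timeval (seconds + microseconds). */
    *out = (double)ru.ru_utime.tv_sec + ru.ru_utime.tv_usec / 1e6
         + (double)ru.ru_stime.tv_sec + ru.ru_stime.tv_usec / 1e6;
    return 0;
}
```

A caller checks the status first and reads the output value only when the
status is 0, so the error path and the data path never share a value.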


Re: compat_sys_times() bogus until jiffies >= 0.

2007-11-07 Thread David Miller
From: Paul Mackerras <[EMAIL PROTECTED]>
Date: Thu, 8 Nov 2007 15:59:12 +1100

> Not on powerpc.  On powerpc the error indication is carried separately
> in a condition register bit.  So a force_successful_syscall_return()
> call will make glibc automatically do the right thing without any
> glibc changes on powerpc.

It still won't fix the problem.

When the return value is (clock_t) -1, all the
force_successful_syscall_return() calls and glibc condition-code
checks in the world are not going to fix the application
code which checks for error using -1.
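
The point can be sketched with a toy model in C. This is not kernel or
glibc code: struct raw_result, wrapper_return() and app_thinks_failed()
are hypothetical names, and the model simply assumes an architecture where
the error indication travels in a separate flag and the value register
holds a positive errno code whenever that flag is set.

```c
#include <errno.h>
#include <stdint.h>

/* Hypothetical model of what comes back from the kernel on an
 * architecture with an out-of-band error indication. */
struct raw_result {
    int32_t value;     /* register holding the return value */
    int     err_flag;  /* condition-code bit: 1 = error, 0 = success */
};

/* Model of a libc wrapper that trusts only the flag, never the value. */
static int32_t wrapper_return(struct raw_result r)
{
    if (r.err_flag) {
        errno = r.value;   /* assumption: value is a positive errno code */
        return -1;
    }
    return r.value;        /* any bit pattern is passed through as success */
}

/* The check the manual tells applications to write. */
static int app_thinks_failed(int32_t ret)
{
    return ret == (int32_t)-1;
}
```

With err_flag clear and value equal to -1 (a tick count of all ones), the
wrapper correctly passes the value through, yet app_thinks_failed() still
reports an error. Only that one value is affected in this model.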


Re: compat_sys_times() bogus until jiffies >= 0.

2007-11-07 Thread David Miller
From: Andrew Morton <[EMAIL PROTECTED]>
Date: Wed, 7 Nov 2007 21:20:05 -0800

> Yup.  But userspace will already have a fit if either the start or end time
> advanced into the glibc-thought-that-was-an-error range.

On x86 only.  We could use force_successful_syscall_return()
to make sure the condition codes get set correctly on
other platforms.

But even in that case we'd still be broken when the return
value is exactly -1 and that's what the application is going
to compare against to test for errors.


Re: compat_sys_times() bogus until jiffies >= 0.

2007-11-07 Thread Andrew Morton
> On Thu, 8 Nov 2007 12:53:57 +1100 Paul Mackerras <[EMAIL PROTECTED]> wrote:
> Andrew Morton writes:
>
> > Given all this stuff, the return value from sys_times() doesn't seem a
> > particularly useful or reliable kernel interface.
>
> I think the best thing would be to ignore any error from copy_to_user
> and always return the number of clock ticks.  We should call
> force_successful_syscall_return, and glibc on x86 should be taught not
> to interpret negative values as an error.

Changing glibc might be hard ;)

> POSIX doesn't require us to return an EFAULT error if the buf argument
> is bogus.  If userspace does supply a bogus buf pointer, then either
> it will dereference it itself and get a segfault, or it won't
> dereference it, in which case it obviously didn't care about the
> values we tried to put there.
>
> If we try to return an error under some circumstances, then there is
> at least one 32-bit value for the number of ticks that will cause
> confusion.  We can either change that value (or values) to some other
> value, which seems pretty bogus, or we can just decide not to return
> any errors.  The latter seems to me to have no significant downside
> and to be the simplest solution to the problem.

The latter is what my protopatch does, isn't it?  It wraps at 0x7fffffff.
It appears that glibc treats all of 0x80000000-0xffffffff as an error.


Re: compat_sys_times() bogus until jiffies >= 0.

2007-11-07 Thread David Miller
From: Andrew Morton <[EMAIL PROTECTED]>
Date: Wed, 7 Nov 2007 23:09:16 -0800

> I don't think that's a big problem?  This syscall can (oddly) return any
> 32-bit (64-bit) number and a smart application developer (after saying wtf)
> would realise that he just can't check for errors and have correctly
> working code.
>
> Then again, if he was smart he just wouldn't use times(2)'s return value
> for anything.  But what is the alternative?  I don't think there is one,
> apart from much saner things like gettimeofday().

You and I would say wtf, but the manual states what it does:

On error, (clock_t) -1 is returned, and errno is  set  appro-
priately.

And I think this (obviously bogus) convention is something we
are really stuck with.

Another awful aspect of this is that glibc is going to overwrite
'errno' for this return value range.  That will likely cause more
application misbehavior than some of the other side effects we've been
discussing.

In short we have two problems:

1) glibc thinks -4096 < x < 0 is an error, and will write this
   value into errno and return -1 to the application

2) the manual states that -1 means error



Re: compat_sys_times() bogus until jiffies >= 0.

2007-11-07 Thread Paul Mackerras
Andrew Morton writes:

> the latter is what my protopatch does isn't it?  It wraps at 0x7fffffff.
> It appears that glibc treats all of 0x80000000-0xffffffff as an error.

Not on powerpc.  On powerpc the error indication is carried separately
in a condition register bit.  So a force_successful_syscall_return()
call will make glibc automatically do the right thing without any
glibc changes on powerpc.

Wrapping at 0x7fffffff will cause programs to see large negative
deltas between successive calls when the wrap occurs.  I can see that
giving userspace fits. :)

Paul.
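
The negative-delta concern can be illustrated with a small C sketch. It
assumes the wrap point is a 31-bit mask (0x7fffffff) applied to the tick
counter. All three function names are hypothetical and exist only for
this illustration.

```c
#include <stdint.h>

/* Tick counter masked to 31 bits (assumed wrap point: 0x7fffffff). */
static int32_t masked_ticks(uint64_t ticks)
{
    return (int32_t)(ticks & 0x7fffffffu);
}

/* Naive application code: signed difference of two samples. */
static int64_t signed_delta(int32_t start, int32_t end)
{
    return (int64_t)end - (int64_t)start;
}

/* Wrap-tolerant difference for a counter that uses all 32 bits:
 * unsigned subtraction is performed modulo 2^32. */
static uint32_t wrapping_delta(uint32_t start, uint32_t end)
{
    return (uint32_t)(end - start);
}
```

Sampling 0x7ffffff0 and then a value 0x20 ticks later yields masked values
0x7ffffff0 and 0x10, so signed_delta() reports roughly minus two billion
ticks. A counter left to wrap across the full 32 bits gives the expected
0x20 through wrapping_delta(), but then half of its range is negative when
viewed as a signed clock_t, which is the other side of the trade-off
discussed in this thread.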



Re: compat_sys_times() bogus until jiffies >= 0.

2007-11-07 Thread Paul Mackerras
David Miller writes:

> I can't see where x86 is doing this though, so perhaps for x86
> glibc does make the negative value check.  But I doubt it is
> checking the range 0x80000000-0xffffffff, otherwise mmap() would
> be busted.

At least for the INTERNAL_SYSCALL macro in glibc, the error check is:

#define INTERNAL_SYSCALL_ERROR_P(val, err) \
  ((unsigned int) (val) >= 0xfffff001u)

in sysdeps/unix/sysv/linux/i386/sysdep.h.  Similarly the PSEUDO macro
in that file does a cmpl $-4095,%eax to test for error.  (There is also
a PSEUDO_NOERRNO which doesn't test for error.)

So the convention on (32-bit) x86 is that -4095 .. -1 are error
values, and other values are successful return values.

Paul.
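
As an illustration of the "-4095 .. -1" convention described above, here is
a small standalone C sketch. is_syscall_error() and error_window_seconds()
are hypothetical names, not glibc functions. The constant 0xfffff001 is
simply -4095 reinterpreted as an unsigned 32-bit value.

```c
#include <stdint.h>

/* Returns nonzero if a raw 32-bit syscall return value falls in the
 * range -4095 .. -1, which the i386 convention treats as an error. */
static int is_syscall_error(int32_t val)
{
    return (uint32_t)val >= 0xfffff001u;
}

/* How long a tick counter spends inside that range, in seconds, for a
 * given tick rate (the rate is a parameter here, not a kernel fact). */
static double error_window_seconds(unsigned int ticks_per_second)
{
    return 4095.0 / ticks_per_second;
}
```

Under this check, -1 and -4095 count as errors while -4096, 0 and large
negative values such as INT32_MIN do not. At an assumed 100 ticks per
second, a counter passes through the error range in about 41 seconds.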