race condition in mv -i (Was: race condition with set -C)

2016-11-07 Thread Stephane Chazelas
2016-11-07 17:19:13 +, Geoff Clare:
[...]
> > Is that allowed by POSIX?
> 
> No; the standard clearly says "The mv utility shall perform actions
> equivalent to the rename() function" in step 3.

Well, when moving across file systems, it already does something
very different from an atomic rename(), so it seems a bit
artificial to mandate a rename() equivalent call for the same-fs
moves.

I suppose the issue with:

if (link(a, b) fails) {
  if (error was EEXIST) {
    prompt user;
    if (yes) {
      rename(a, b);
    }
  } else die;
} else {
  unlink(a);
}

is that we could unlink a different "a" file (like if someone
does a rename(c, a) in between the link and unlink, but then if
they had done that before the link() we couldn't tell the
difference).

That sounds like less of a problem than "mv" clobbering a file
when we explicitly requested it not to.

However I suppose link() has restrictions that rename() doesn't
have (like the /proc/sys/fs/protected_hardlinks in Linux 3.6+)
so it couldn't be used for at least that reason.
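
For illustration, the link-then-unlink idea can be tried from the shell with the XSI link and unlink utilities, which call link() and unlink() directly. This is only a sketch under assumptions: move_noclobber is a hypothetical helper name, the scratch path is made up, the helper still has the link/unlink window described above, and link may be unavailable or restricted on some systems.

```shell
# Hypothetical helper: move $1 to $2 without ever replacing an existing $2.
# link(1) fails if $2 already exists, so there is no check-then-rename window.
move_noclobber() {
    link "$1" "$2" 2>/dev/null || return 1   # destination exists, or link() not permitted
    unlink "$1"                              # drop the old name; until now both names referred to one file
}

# Demo in a private scratch directory (the path is an assumption).
scratch=${TMPDIR:-/tmp}/mvdemo.$$
mkdir -m 700 "$scratch"
echo old > "$scratch/b"
echo new > "$scratch/a"
move_noclobber "$scratch/a" "$scratch/b" || echo "refused: b exists"
move_noclobber "$scratch/a" "$scratch/c" && echo "moved a to c"
```

The helper cannot distinguish EEXIST from other link() failures, so a caller that needs to prompt, as in the pseudocode above, would have to test for the destination separately.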

-- 
Stephane



Re: macOS 10.12, broken PTHREAD_CANCEL_DISABLE and UNIX certification

2016-11-07 Thread Shware Systems
Given frames 6 and 7, it looks like write() is calling pthread_exit() directly
rather than going through pthread_cancel(), so that would be where the bug is,
unless write() is required to exit in that particular circumstance. If it has
to exit, then it looks like the setup code needed to avoid that is missing from
the main thread's code. As I don't see the main thread reading the data the
other thread writes, it may be that write() is generating a SIGPIPE, due to
EPIPE, that is unblocked and has a T default action. I expect this would occur
after an arbitrary time with no read() by any thread on the read descriptor; I
wouldn't consider just keeping the descriptor open, without even a single
read() attempt, sufficient to avoid it.

I'm pretty sure someone at Open Group has fielding non-conformance reports in 
their job description, but who that would be at this point I have no idea, 
sorry.

On Sunday, November 6, 2016 Per Mildner  wrote:


On 5 Nov 2016, at 10:22, Shware Systems  wrote:

>From the output, I'm wondering about the source of the Illegal instruction: 4 
>diagnostic. If SIGILL isn't blocked, it would also exit the process, and I 
>believe run cancel handlers as part of process shutdown, whatever cancelstate 
>is set to. So something about the code is suspect, but it may be a problem 
>internal to the pipe reads or writes, not the pthread routines or how they're 
>being used; possibly a buffer overrun or aggressive optimization issue, as a 
>guess.



The illegal instruction is because of an ud2 instruction used as a last 
fallback in abort() (really in __abort()). Repeating the test with a debugger 
attached verifies that the cleanup handler is called when the write() in 
pthread_start_routine is cancelled, i.e. something that would not happen if 
PTHREAD_CANCEL_DISABLE was working.


Starting test, 1 iterations, sleep interval 10ms
cancel_leak.c:46: ERROR cancelled while PTHREAD_CANCEL_DISABLE
Process 28027 stopped
* thread #2: tid = 0x4f4c05, 0x7fffbc4ec4db libsystem_c.dylib`__abort + 172, stop reason = EXC_BAD_INSTRUCTION (code=EXC_I386_INVOP, subcode=0x0)
    frame #0: 0x7fffbc4ec4db libsystem_c.dylib`__abort + 172
libsystem_c.dylib`__abort:
->  0x7fffbc4ec4db <+172>: ud2    

libsystem_c.dylib`abort_report_np:
    0x7fffbc4ec4dd <+0>:   pushq  %rbp
    0x7fffbc4ec4de <+1>:   movq   %rsp, %rbp
    0x7fffbc4ec4e1 <+4>:   pushq  %r14
(lldb) bt
bt
warning: could not load any Objective-C class information. This will significantly reduce the quality of type information available.
* thread #2: tid = 0x4f4c05, 0x7fffbc4ec4db libsystem_c.dylib`__abort + 172, stop reason = EXC_BAD_INSTRUCTION (code=EXC_I386_INVOP, subcode=0x0)
  * frame #0: 0x7fffbc4ec4db libsystem_c.dylib`__abort + 172
    frame #1: 0x7fffbc4ec42f libsystem_c.dylib`abort + 144
    frame #2: 0x00011d6a cancel_leak`cleanup_routine(arg=0x) + 74 at cancel_leak.c:48
    frame #3: 0x7fffbc671233 libsystem_pthread.dylib`_pthread_exit + 130
    frame #4: 0x7fffbc671da8 libsystem_pthread.dylib`pthread_exit + 30
    frame #5: 0x7fffbc66ee26 libsystem_pthread.dylib`_pthread_exit_if_canceled + 71
    frame #6: 0x7fffbc57fda1 libsystem_kernel.dylib`cerror + 13
    frame #7: 0x00011c25 cancel_leak`pthread_start_routine(vcookie=0x7fff5fbff8b0) + 549 at cancel_leak.c:80
    frame #8: 0x7fffbc66faab libsystem_pthread.dylib`_pthread_body + 180
    frame #9: 0x7fffbc66f9f7 libsystem_pthread.dylib`_pthread_start + 286
    frame #10: 0x7fffbc66f221 libsystem_pthread.dylib`thread_start + 13
(lldb) 

As to certification, the person running the conformance test suites and
submitting the results probably doesn't monitor bug reports. If the test suite
passes, they are happy, go on vacation, and figure any actual bugs are a
feature that can be ignored or are some underling's job to handle. If it
doesn't pass, they file reports, don't read them, and wait for someone to tell
them to try running it again. This may be unfair, but it is accurate often
enough. Whether the test suite has sufficient test cases to catch intermittent,
environmentally induced failures is also unknown, and is another possibility,
but at least one of the test suite maintainers does monitor this list.

Is there a way to make formal bug-reports against conformance, i.e. a formal 
way to tell the Unix certification authority about non-conformance? It seems 
possible that a vendor is not really interested in fixing a conformance problem 
unless it is reported by many users, or the vendor risks losing the marketing 
benefit of Unix certification. And, as you point out, it may well be that the 
ones responsible for certification at the vendor do not even hear about the 
bugs reported to the vendor bug reporting system. A nudge from the 
certification authority may be more likely to reach the right people.

Regards,


On Friday, November 4, 2016 Per Mildner  wrote:


What does the standard say about whether assignments are visible for subsequent expansions in a simple command without a command name?

2016-11-07 Thread Mark Galeck
Hello,

the current shell standard says in Section 2.9.1:

---

If no command name results, variable assignments shall affect the current 
execution environment. 

If the command name is not a special built-in utility or function, the variable 
assignments (...) shall not affect the current execution environment (...) In 
this case it is unspecified: 

Whether or not the assignments are visible for subsequent expansions in step 4

---

As you can see, if we do have a command, the standard says it is unspecified
whether the assignments are subsequently visible.  So one can say that the
notion of subsequent visibility needs to be discussed.


Yet if we don't have a command, the standard does not discuss that at all...

One could perhaps interpret that to mean it is unspecified in that case also.
That would mean there is no need to discuss the notion of subsequent
visibility.


Thus we apparently have a contradiction: on one hand there is a need to
discuss some X, on the other there is not.


As an aside, for dash such assignments are not visible:

$ A=; A=a B=$A; echo $B 

$ 



but for bash, they are:

~>A=; A=a B=$A; echo $B 
a 
~>
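
The difference is easy to probe from a script. The following sketch (probe_visibility is a hypothetical name) runs the same assignment-only command in whichever shell executes it and reports which behaviour that shell shows; it makes no claim about which one the standard requires.

```shell
# Hypothetical probe: is an earlier assignment visible to a later one
# in the same command when there is no command name?
probe_visibility() {
    A=
    A=a B=$A          # the construct under discussion
    if [ "$B" = a ]; then
        echo "visible"        # behaviour reported above for bash
    else
        echo "not visible"    # behaviour reported above for dash
    fi
}

probe_visibility
```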


Questions:

1.  Is that apparent contradiction a bug in the standard that should be fixed, 
either by adding the visibility clause for both cases, or deleting it for both 
cases? (in that case, I will make a report).

2.  In the larger sense, if the standard does not specify some behaviour that a 
shell must implement, does that automatically mean it is "unspecified" or 
"undefined", in the sense of Base Definitions, Section 1.5?

3. In this specific case, is it unspecified what the behaviour is? Or is it 
specified (which would mean dash or bash have a bug), and what is it?


Thank you

Mark



Re: [1003.1(2013)/Issue7+TC1 0001016]: race condition with set -C

2016-11-07 Thread Martijn Dekker
Op 07-11-16 om 03:55 schreef Shware Systems:
> To last question, yes, but the effects are supposed to be documented so
> generic guard code that may invoke platform specific pre-ln attempt
> handling can be written. This is a compromise to disqualifying a system
> that defines additional file types from being considered conforming at
> all. In a script this might look like:
> if [ -e /app/$platform/linkhandler ] ;
> then { . .../linkhandler }
> else { do ln directly }; fi

Thanks for the reply. Where can I find more info about this? Is there a
standardised /app directory structure? I don't find it on any actual
system I've access to.

> To some extent it's also the operator's responsibility to sandbox use of
> non-standard file types outside directories the standard says portable
> applications need access to, such as $TMP, to avoid issues.

That makes sense. Many things would break if /tmp does not operate portably.

> A platform
> aware application might create such a file in a $TMP/appsubdir directory
> but shouldn't link it into /tmp after, iow, but to an ~/app/files type
> directory instead. That is more a training issue to me, not something
> the standard can reasonably address or make a requirement.

Unfortunately, by far the most common use case of 'mktemp' is to create
a temporary file in /tmp, so my cross-platform shell implementation of
it will have to be able to do that.

For the time being, unless anyone has concrete evidence or convincing
arguments to the contrary, I will assume that any issues with 'ln'
atomicity into /tmp are theoretical.

- M.



Re: [1003.1(2013)/Issue7+TC1 0001016]: race condition with set -C

2016-11-07 Thread Stephane Chazelas
2016-11-02 13:32:44 +, Martijn Dekker:
[...]
> If both 'mkdir' and 'ln' operate atomically, there could be a safe
> workaround for creating a regular file directly under /tmp. It would
> involve creating a (very) temporary directory under /tmp using 'mkdir
> -m700', then creating the file inside there, setting the mode, etc. with
> no need for atomicity, then attempting to 'ln' that file back to /tmp
> until we've got an available name. Do you think this could work?
[...]

I don't think you can use ln here.

ln "$tempdir/file" "$tempfile"

would create a "$tempfile/file" link if "$tempfile" existed and
was of type directory or a symlink eventually resolving to a
directory. Same problem with "mv" (which I think would work just
as well (with LC_ALL=C mv -i < /dev/null 2> /dev/null))

It would not clobber a file but could create one in unwanted
places like /etc/profile.d or /var/spool/cron/crontabs or just
/tmp/foo/ where the attacker could replace it with his own one.
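
The directory case is simple to reproduce. In this sketch the scratch paths are assumptions, and "$tempfile" is deliberately pre-created as a directory to stand in for a name that an attacker got to first:

```shell
# Scratch area; names are illustrative only.
base=${TMPDIR:-/tmp}/lndemo.$$
mkdir -m 700 "$base" "$base/tempdir"
: > "$base/tempdir/file"

tempfile=$base/victim
mkdir "$tempfile"                 # the "attacker" created a directory with our target name

# ln succeeds, but not by creating "$tempfile": it links inside the directory.
ln "$base/tempdir/file" "$tempfile" && ls "$tempfile"
```

A caller that only checks the exit status of ln would wrongly conclude that "$tempfile" is now its own regular file.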

You could use "link" (Unix, not POSIX), or "ln -T" (GNU, not
POSIX) or "mv -Tn" (GNU) instead.
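
Written with link, the workaround quoted above might look like the sketch below. Everything in it is an assumption made for illustration: the helper name make_tempfile, the naming scheme, and the retry limit. The names are predictable, which is tolerable here only because link never replaces or descends into an existing entry.

```shell
# Hypothetical helper: create an empty mode-600 regular file directly under
# the directory $1 and print its name. Relies on mkdir and link(1) being atomic.
make_tempfile() {
    dir=$1
    priv=$dir/.priv.$$
    mkdir -m 700 "$priv" || return 1          # fails if the name is already taken
    (umask 077 && : > "$priv/file") || { rmdir "$priv"; return 1; }
    n=0
    while [ "$n" -lt 100 ]; do
        name=$dir/tmp.$$.$n
        if link "$priv/file" "$name" 2>/dev/null; then
            rm -f "$priv/file"; rmdir "$priv"
            printf '%s\n' "$name"
            return 0
        fi
        n=$((n + 1))                          # name taken (or link failed): try the next one
    done
    rm -f "$priv/file"; rmdir "$priv"
    return 1
}
```

A caller would use it as tmp=$(make_tempfile "${TMPDIR:-/tmp}") and must still remove the file when done.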

-- 
Stephane



Re: [1003.1(2013)/Issue7+TC1 0001016]: race condition with set -C

2016-11-07 Thread Stephane Chazelas
2016-11-07 15:57:25 +, Stephane Chazelas:
> 2016-11-07 15:40:15 +, Geoff Clare:
> [...]
> > > Same problem with "mv" (which I think would work just
> > > as well (with LC_ALL=C mv -i < /dev/null 2> /dev/null))
> > 
> > No, mv -i doesn't work just as well - it has a race condition.
> > If a file is created in between the existence check and the
> > rename() call, mv will remove the file.
> 
> How so? "mv -i" with /dev/null as stdin ("no" answer to prompt)
> is not supposed to remove anything.
[...]

However, at least with GNU mv, the exit code if the file exists
will be 0. So it can't be used here.

BTW, there's an issue in the spec for "mv":

> EXIT STATUS
>
>  The following exit values shall be returned:
>
>   0
>  All input files were moved successfully.
>  >0
>  An error occurred.

In

mv -i a b

if the user says "no", "a" will not be moved successfully, and
there will not have been any error.
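
Which exit status a given mv returns after a declined prompt is easy to check. The sketch below (scratch paths are assumptions) answers the prompt with an empty stdin and prints the status together with the target's contents. Results vary between implementations, so no particular output is claimed here.

```shell
# Scratch directory; names are illustrative only.
d=${TMPDIR:-/tmp}/mvi.$$
mkdir -m 700 "$d"
echo source > "$d/a"
echo target > "$d/b"

# An empty stdin is read as a non-affirmative answer by implementations
# that prompt; stderr is discarded only to hide the prompt text.
status=0
LC_ALL=C mv -i "$d/a" "$d/b" < /dev/null 2> /dev/null || status=$?

echo "exit status: $status"
echo "b now contains: $(cat "$d/b")"
```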

Should probably be something like:

   0
  All input files (approved by the user with -i) were
  moved successfully.

Also, should failure to write the prompt or read the answer be
considered an error?

Using:

   mv -i a b  2>&- <&-

or

   mv -i a b < /dev/null 2>&-

seem to work with GNU or Solaris10 mv (in that it returns with
an error when a prompt fails to be issued) but not with
FreeBSD's

-- 
Stephane



Re: [1003.1(2013)/Issue7+TC1 0001016]: race condition with set -C

2016-11-07 Thread Stephane Chazelas
2016-11-07 16:20:08 +, Geoff Clare:
[...]
> > How so? "mv -i" with /dev/null as stdin ("no" answer to prompt)
> > is not supposed to remove anything.
> 
> Only if it prompts.  If the existence check fails, mv will not
> prompt and will just call rename().

OK, sorry, I had assumed rename() would fail if the target exists
already.

So that's another "broken" command then, if it can end up removing
the target without asking the user despite the fact "-i" was
passed?

I suppose "mv" could use "link(src, dst) && unlink(src)" here to
work around that (like it does copy+unlink when across file
systems).

Is that allowed by POSIX? Do any implementations do it?

-- 
Stephane



Re: [1003.1(2013)/Issue7+TC1 0001016]: race condition with set -C

2016-11-07 Thread Geoff Clare
Stephane Chazelas  wrote, on 07 Nov 2016:
>
> 2016-11-07 16:20:08 +, Geoff Clare:
> [...]
> > > How so? "mv -i" with /dev/null as stdin ("no" answer to prompt)
> > > is not supposed to remove anything.
> > 
> > Only if it prompts.  If the existence check fails, mv will not
> > prompt and will just call rename().
> 
> OK, sorry, I had assumed rename() would fail if the target exists
> already.
> 
> So that's another "broken" command then if it can end-up removing
> the target without asking the user despite the fact "-i" was
> passed?
> 
> I suppose "mv" could use "link(src, dst) && unlink(src)" here to
> work around that (like it does copy+unlink when across file
> systems).

That would just exchange one race condition for another, although
I suppose it would be a smaller time window between link() and
unlink() than the one between the existence check and calling
rename() - particularly if the user takes time to answer the
prompt.

> Is that allowed by POSIX?

No; the standard clearly says "The mv utility shall perform actions
equivalent to the rename() function" in step 3.

> Do any implementations do it?

Only museum pieces from before rename() existed (I assume).

-- 
Geoff Clare 
The Open Group, Apex Plaza, Forbury Road, Reading, RG1 1AX, England



Re: [1003.1(2013)/Issue7+TC1 0001016]: race condition with set -C

2016-11-07 Thread Geoff Clare
Stephane Chazelas  wrote, on 07 Nov 2016:
>
> 2016-11-07 16:57:34 +, Stephane Chazelas:
> [...]
> > OK, sorry, I had assumed rename() would fail if the target exists
> > already.
> [...]
> 
> BTW, in the spec of link(2):
> 
>  [EEXIST]
>   The path2 argument resolves to an existing
> directory entry or refers to a symbolic link.
> 
> Why the "or refers to a symbolic link"? That would still be a
> directory entry.

Because it says "resolves".  Pathname resolution follows symlinks
by default.  So if it did not mention symlinks explicitly here,
a symlink with a non-existent target would not cause EEXIST.
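
A quick way to see the dangling-symlink case from the shell is with ln, on the assumption that ln simply reports whatever link() reports. Scratch paths are illustrative:

```shell
# Scratch directory; names are illustrative only.
d=${TMPDIR:-/tmp}/eexist.$$
mkdir -m 700 "$d"
: > "$d/file"
ln -s "$d/no-such-target" "$d/dangling"   # symlink whose target does not exist

# The name does not resolve to any file, yet the link should be refused
# because the directory entry "dangling" itself exists.
if ln "$d/file" "$d/dangling" 2>/dev/null; then
    echo "link created"
else
    echo "link refused"
fi
```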

-- 
Geoff Clare 
The Open Group, Apex Plaza, Forbury Road, Reading, RG1 1AX, England



Re: [1003.1(2013)/Issue7+TC1 0001016]: race condition with set -C

2016-11-07 Thread Geoff Clare
Stephane Chazelas  wrote, on 07 Nov 2016:
>
> 2016-11-07 15:40:15 +, Geoff Clare:
> [...]
> > > Same problem with "mv" (which I think would work just
> > > as well (with LC_ALL=C mv -i < /dev/null 2> /dev/null))
> > 
> > No, mv -i doesn't work just as well - it has a race condition.
> > If a file is created in between the existence check and the
> > rename() call, mv will remove the file.
> 
> How so? "mv -i" with /dev/null as stdin ("no" answer to prompt)
> is not supposed to remove anything.

Only if it prompts.  If the existence check fails, mv will not
prompt and will just call rename().

> > > You could use "link" (Unix, not POSIX), or "ln -T" (GNU, not
> > > POSIX) or "mv -Tn" (GNU) instead.
> > 
> > The standard allows systems to make "link" available only to
> > processes with appropriate privileges, so that solution might
> > not be sufficiently portable.
> [...]
> 
> That seems to be in contradiction with it calling the link()
> system call.
> 
> I suspect that text is there for attempts to call "link" on a
> directory.

If that were the case, I would have expected it to be worded like
Joerg's man page quote.  Instead it just says "A user may need
appropriate privileges to invoke the link utility."

-- 
Geoff Clare 
The Open Group, Apex Plaza, Forbury Road, Reading, RG1 1AX, England



Re: [1003.1(2013)/Issue7+TC1 0001016]: race condition with set -C

2016-11-07 Thread Stephane Chazelas
2016-11-07 16:10:01 +, Stephane Chazelas:
[...]
>    mv -i a b < /dev/null 2>&-
> 
> seem to work with GNU or Solaris10 mv (in that it returns with
> an error when a prompt fails to be issued) but not with
> FreeBSD's
[...]

Sorry, I messed up my tests on Solaris. Solaris /bin/mv doesn't
issue a prompt when stdin is not a terminal (and renames the
file!). I did observe a non-zero exit status with that one but
that was because I had renamed the file in an earlier test.

/usr/xpg4/bin/mv behaves POSIXly, but still exits with 0 if the
prompt can't be issued or the answer can't be read.

-- 
Stephane



RE: [1003.1(2013)/Issue7+TC1 0001016]: race condition with set -C

2016-11-07 Thread Schwarz, Konrad
> -Original Message-
> From: Geoff Clare [mailto:g...@opengroup.org]
> Sent: Monday, November 07, 2016 5:20 PM
> To: austin-group-l@opengroup.org
> Subject: Re: [1003.1(2013)/Issue7+TC1 0001016]: race condition with set
> -C
> > That seems to be in contradiction with it calling the link() system
> > call.
> >
> > I suspect that text is there for attempts to call "link" on a
> > directory.
> 
> If that were the case, I would have expected it to be worded like
> Joerg's man page quote.  Instead it just says "A user may need
> appropriate privileges to invoke the link utility."

When do you call link(1) in lieu of ln(1)?

BTW: the rationale for ln says:
This volume of POSIX.1-2008 does not allow the ln utility
to unlink existing destination paths by default for the
following reasons:

The ln utility has historically been used to provide locking for
shell applications, a usage that is incompatible with ln unlinking
the destination path by default.  There was no corresponding
technical advantage to adding this functionality.



Re: [1003.1(2013)/Issue7+TC1 0001016]: race condition with set -C

2016-11-07 Thread Geoff Clare
Stephane Chazelas  wrote, on 07 Nov 2016:
>
> BTW, there's an issue in the spec for "mv":
> 
> > EXIT STATUS
> >
> >  The following exit values shall be returned:
> >
> >   0
> >  All input files were moved successfully.
> >  >0
> >  An error occurred.
> 
> In
> 
> mv -i a b
> 
> if the user says "no", "a" will not be moved successfully, and
> there will not have been any error.

Good catch.

> Should probably be something like:
> 
>    0
> All input files (approved by the user with -i) were
> moved successfully.

We should change it to match rm, which says:

Each directory entry was successfully removed, unless its removal
was canceled by a non-affirmative response to a prompt for
confirmation.

> Also, should failure to write the prompt or read the answer be
> considered an error?

I would say yes.

> Using:
> 
>    mv -i a b  2>&- <&-
> 
> or
> 
>    mv -i a b < /dev/null 2>&-

However, that's not a valid test because of the following text on the
execl() page:

If file descriptor 0, 1, or 2 would otherwise be closed after
a successful call to one of the exec family of functions,
implementations may open an unspecified file for the file descriptor
in the new process image. If a standard utility or a conforming
application is executed with file descriptor 0 not open for reading
or with file descriptor 1 or 2 not open for writing, the environment
in which the utility or application is executed shall be deemed
non-conforming, and consequently the utility or application might
not behave as described in this standard.
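
In practice this means a test that wants a utility to see no input and to have nowhere useful to write diagnostics should redirect to /dev/null rather than close the descriptors. A minimal sketch, with run_quiet as a hypothetical helper name:

```shell
# Hypothetical helper: run a command with an empty stdin and discarded stderr,
# keeping descriptors 0 and 2 open so the environment stays conforming.
run_quiet() {
    "$@" < /dev/null 2> /dev/null
}

# By contrast, "cmd <&- 2>&-" starts cmd with descriptors 0 and 2 closed
# (or reopened on an unspecified file), so its behaviour is not guaranteed.
run_quiet sh -c 'read answer || echo "stdin is open but empty"'
```

This does not exercise the "prompt cannot be written" case at all, which is the point: that case cannot be set up by closing descriptors in a conforming environment.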

-- 
Geoff Clare 
The Open Group, Apex Plaza, Forbury Road, Reading, RG1 1AX, England



Re: [1003.1(2013)/Issue7+TC1 0001016]: race condition with set -C

2016-11-07 Thread Stephane Chazelas
2016-11-07 16:57:34 +, Stephane Chazelas:
[...]
> OK, sorry, I had assumed rename() would fail if the target exists
> already.
[...]

BTW, in the spec of link(2):

 [EEXIST]
  The path2 argument resolves to an existing
  directory entry or refers to a symbolic link.

Why the "or refers to a symbolic link"? That would still be a
directory entry.

-- 
Stephane



Re: [1003.1(2013)/Issue7+TC1 0001016]: race condition with set -C

2016-11-07 Thread Shware Systems
No, there is no app dir structure mandated
by POSIX. That is considered an administrative issue outside of the base POSIX 
scope, and is generally covered by packaging conventions like DEB and RPM as de 
facto standards. I don't think any platform implemented the proposed 1003.1j 
standard. Only /., /.., /dev and /tmp are required as fixed root directory 
names, per XBD 10. Each vendor can use whatever paths they prefer, besides 
those 4, for the symbolic names listed in XBD 8, such as $HOME or $LIB. In an 
embedded system a vendor might well put all utilities and shared libraries 
directly into / on a ROM and mount /tmp as an alias of a /dev/NVRAMFS device 
subdir.

As to mktemp, as long as it only creates files of regular type, or parent 
directories of one, there shouldn't be any problem I see with it using /tmp 
since ln is required to handle both those types. A platform that adds, as a 
contrived example, a "mimo" file type as a scatter/gather superset of fifos 
abstraction for persisting iovec_t templates, might need guard code for use of 
the complementary mkmimo utility in a script, rather than emulating the 
functionality with multiple use or redirects of fifos to a regular file in a cp 
pipe or for loop, as the portable option for a gather operation.

On Monday, November 7, 2016 Martijn Dekker  wrote:

Op 07-11-16 om 03:55 schreef Shware Systems:
> To last question, yes, but the effects are supposed to be documented so
> generic guard code that may invoke platform specific pre-ln attempt
> handling can be written. This is a compromise to disqualifying a system
> that defines additional file types from being considered conforming at
> all. In a script this might look like:
> if [ -e /app/$platform/linkhandler ] ;
> then { . .../linkhandler }
> else { do ln directly }; fi

Thanks for the reply. Where can I find more info about this? Is there a
standardised /app directory structure? I don't find it on any actual
system I've access to.

> To some extent it's also the operator's responsibility to sandbox use of
> non-standard file types outside directories the standard says portable
> applications need access to, such as $TMP, to avoid issues.

That makes sense. Many things would break if /tmp does not operate portably.

> A platform
> aware application might create such a file in a $TMP/appsubdir directory
> but shouldn't link it into /tmp after, iow, but to an ~/app/files type
> directory instead. That is more a training issue to me, not something
> the standard can reasonably address or make a requirement.

Unfortunately, by far the most common use case of 'mktemp' is to create
a temporary file in /tmp, so my cross-platform shell implementation of
it will have to be able to do that.

For the time being, unless anyone has concrete evidence or convincing
arguments to the contrary, I will assume that any issues with 'ln'
atomicity into /tmp are theoretical.

- M.





Re: Intended difference between waitpid() and waitid() ??

2016-11-07 Thread Robert Elz
Date:        Sun, 6 Nov 2016 23:19:30 -0500
From:        Shware Systems 
Message-ID:  <1583d033695-e5e-e...@webprd-a49.mail.aol.com>

  | I believe the difference
  | is to take into account a pid may be both a specific process id and
  | process group id for a pipe,

That's true of both waitpid() and waitid() - the latter has a cleaner
arg syntax, and could be extended more easily (though it seems to be
lacking the (defined) ability to wait for a child in my own process group,
that waitpid() offers -- not that that is much of a problem to overcome)
and waitid() returns results in a different way (less packing of bits into
an int ...) but aside from that they are essentially the same.

It is the difference in the way that results are returned that makes me
wonder if the error code usage is supposed to be different as well ...
On success waitid() always returns 0 (which includes the WNOHANG case
where there is a process that will eventually, one presumes, exit, but
has not done so yet) - so should it also return 0 in the case where there
is a child process (ECHILD is apparently only for when the process has
no children) but the pid passed (with P_PID) is not a child of the current
process ?   Or if there are no children of the current process that are
in the process group passed (with P_PGID) ?

This also raises another question, what should waitid() do if called in a
way that (other than an interrupt by a signal) can never succeed, or fail,
as for example

child = fork(); /* ignore test for error here */
if (child == 0) _exit(0);
sleep(60);
what = waitid(P_PID, child, , WCONTINUED|WSTOPPED);

That's not EINVAL: the flags are valid (unless one considers this
case to be what an "invalid set of processes" means). It isn't ECHILD:
"child" is certainly an existing (zombie) child of the process (which,
because it is a zombie, can never stop or continue).   EINTR can happen
of course, but only if something external causes it.   Aside from that,
what is expected here?   A hang forever?  Some error (which?)

This isn't an issue for waitpid() as WEXITED is implicit there.

kre



Re: Intended difference between waitpid() and waitid() ??

2016-11-07 Thread Robert Elz
Date:        Mon, 7 Nov 2016 09:51:32 +
From:        Geoff Clare 
Message-ID:  <20161107095132.GA14686@lt.loopback>

  | I am fairly certain the difference is not intentional.

OK, thanks.

  | All certified UNIX systems give an ECHILD error from waitid()
  | when there are no children, regardless of what flags are used.

Great, that simplifies my life a bit.

  | I think EINVAL is intended to cover completely invalid values,

That is what I would have assumed/guessed.

  | Seems to me that the standard is silent on this case.  We would need
  | to do a survey of existing practice if we want to rectify that.

OK ... waitid() (and friends) is relatively new in NetBSD, and we're
just in the process of making sure it really works as it should.   This
kind of thing can be made to do whatever seems best, so I'd appreciate
opinions from people as to what is best here - my default would probably
be to copy waitpid() and return ECHILD if there's no child that matches the
idtype/id pair, regardless of whether or not there are other children
(partly because we want to have one super-wait in the kernel that can
be used to implement the three wait*() functions POSIX defines, as well
as all of the others that have existed historically on BSD (which includes
old compat functions from when the usage structs that wait3() (and 4 and 6)
return contained time values that were 32-bit limited, etc, so old binaries
keep working) - so the fewer differences between them the better.)

But then there's the question I asked a bit earlier (in a message that
apparently got hung up in my laptop, and not sent - until just now when
I wondered why I had not seen it back from the list) of what to do with
the impossible-to-succeed cases - currently we just hang (the semi-code
(when fleshed out so it actually compiles) when run on NetBSD is no
different than pause()).   However I doubt that's ideal, and some kind of
error return would be better.

kre




[1003.1(2013)/Issue7+TC1 0001020]: snprintf: the description of the n argument conflicts with ISO C

2016-11-07 Thread Austin Group Bug Tracker

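A NOTE has been added to this issue.
======================================================================
http://austingroupbugs.net/view.php?id=1020
======================================================================
Reported By:                ch3root
Assigned To:
======================================================================
Project:                    1003.1(2013)/Issue7+TC1
Issue ID:                   1020
Category:                   System Interfaces
Type:                       Error
Severity:                   Objection
Priority:                   normal
Status:                     Resolved
Name:                       Alexander Cherepanov
Organization:
User Reference:
Section:                    fprintf() description of snprintf()
Page Number:                900
Line Number:                30166-30167
Interp Status:              ---
Final Accepted Text:        http://austingroupbugs.net/view.php?id=1020#c3482
Resolution:                 Accepted As Marked
Fixed in Version:
======================================================================
Date Submitted:             2016-01-05 16:34 UTC
Last Modified:              2016-11-07 09:21 UTC
======================================================================
Summary:                    snprintf: the description of the n argument conflicts with ISO C
======================================================================
Relationships       ID      Summary
----------------------------------------------------------------------
related to          761     Requirement of error for snprintf with ...
======================================================================

----------------------------------------------------------------------
 (0003483) shware_systems (reporter) - 2016-11-07 09:21
 http://austingroupbugs.net/view.php?id=1020#c3483
----------------------------------------------------------------------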
Re: 3020
I agree the language is vaguer on details than other parts of the
standard, but I don't see it as particularly sloppy. The point of the '_s'
interfaces is having versions that don't have common security holes like
buffer overruns, so when it says 'fit within' or 'too big', to me the intent
is that implementations >>shall do what's necessary, documented or not<< to
prevent those overruns. It seems that 'how' to work that magic is left open
there so implementations, at their discretion, can use their existing code
or new methods to get it done.

Either way could reference the same data in the memory manager that
free(void *) has to, to get the value used with malloc() or realloc(), or
calculate a usable size if block chaining/coalescing is used. This applies
whether the malloc() calls happen at runtime or when creating object
descriptions for an object file at compile time, for auto or static
declarations. On POSIX systems there's the complexity of supporting dlsym()
too, in some fashion, for externs exported by libraries, but code exists for
that also. The vague aspect I see is whether the sizes are 'as declared' or
'as allocated to satisfy alignment practices', not that sizing involves any
magic.

Issue History
Date Modified    Username        Field                    Change
======================================================================
2016-01-05 16:34 ch3root         New Issue
2016-01-05 16:34 ch3root         Name                      => Alexander Cherepanov
2016-01-05 16:34 ch3root         Section                   => snprintf
2016-01-05 16:34 ch3root         Page Number               => (page or range of pages)
2016-01-05 16:34 ch3root         Line Number               => (Line or range of lines)
2016-01-05 17:49 jsm28           Note Added: 0002999
2016-01-05 20:07 shware_systems  Note Added: 0003000
2016-01-05 20:18 eblake          Relationship added       related to 761
2016-01-05 22:26 Florian Weimer  Issue Monitored: Florian Weimer
2016-01-05 22:35 ch3root         Note Added: 0003006
2016-01-07 05:02 shware_systems  Note Added: 0003014
2016-01-07 10:37 ch3root         Note Added: 0003016
2016-01-07 11:43 Vincent Lefevre Note Added: 0003018
2016-01-07 13:51 shware_systems  Note Added: 0003019
2016-01-07 14:14 shware_systems  Note Edited: 0003019
2016-01-08 00:45 random832       Note Added: 0003020
2016-01-08 02:52 Don Cragun      Section                  snprintf => fprintf() description of snprintf()
2016-01-08 02:52 Don Cragun      Page Number              (page or range of pages) => 900
2016-01-08 02:52 Don Cragun      Line Number              (Line or range of lines) => 30166-30167
2016-01-08 02:52 Don Cragun      Interp Status             => ---

Re: Intended difference between waitpid() and waitid() ??

2016-11-07 Thread Geoff Clare
Robert Elz  wrote, on 06 Nov 2016:
>
> The spec (C165) for wait() (though this is only relevant to waitpid())
> says ...
> 
>   If waitpid( ) was invoked with WNOHANG set in options, it
>   has at least one child process specified by pid for which status
>   is not available, and status is not available for any process
>   specified by pid, 0 is returned. Otherwise, -1 shall be returned,
>   and errno set to indicate the error.
> 
> Whereas for waitid() the similar spec is ...
> 
>   If WNOHANG was specified and status is not available for any process
>   specified by idtype and id, 0 shall be returned. If waitid( ) returns
>   due to the change of state of one of its children, 0 shall be returned.
>   Otherwise, -1 shall be returned and errno set to indicate the error.
> 
> Note a lack in waitid() of anything corresponding to the "it has at least
> one child process..." clause that exists for waitpid().
> 
> Is this intentional?   That is, should I assume that a process with no
> children (at all) which does a waitid() with WNOHANG set is not intended
> to receive ECHILD, but just a 0 return (with the appropriate fields of the
> siginfo set to 0 as well of course.)

I am fairly certain the difference is not intentional.  All certified
UNIX systems give an ECHILD error from waitid() when there are no
children, regardless of what flags are used.  (It is covered by one of
the conformance tests, with 31 combinations of flags, 16 of which
include WNOHANG.)

> And while I am here, what is expected (from waitid()) if the
> process identified by idtype & id (for waitid) does not exist,
> but there are other child processes ?
> 
> One might expect ECHILD, as it is clear for waitpid() that is the
> correct error - except in waitid() ...
> 
>[ECHILD]  The calling process has no existing unwaited-for child processes.
> 
> This is not true in the postulated case, there are existing unwaited for
> child processes, but not the one requested.
> 
> That is, let us assume the current process is pid 4, it has a single
> child process (pid 5) whose state does not matter here, and pid 4 does
> 
>   pid = waitid(P_PID, 6, ...);    /* remaining args assumed correct,
>                                      but are irrelevant here */
> 
> what is expected to happen in that case?   The calling process(4) has
> existing unwaited for child processes (pid 5) so ECHILD is apparently
> not the correct error.
> 
> Perhaps:
> [EINVAL]  An invalid value was specified for options, or idtype
>   and id specify an invalid set of processes.
> 
> Is specifying a pid which is not a child of the current process "an
> invalid set of processes" or does that text mean something different?

I think EINVAL is intended to cover completely invalid values, such
as P_PID with a pid of -1.

> If EINVAL is correct here, is that an intended difference from waitpid()
> or just an error?

Seems to me that the standard is silent on this case.  We would need
to do a survey of existing practice if we want to rectify that.

-- 
Geoff Clare 
The Open Group, Apex Plaza, Forbury Road, Reading, RG1 1AX, England