On 04/05/2020 20:32, Bob Proulx wrote:
Jonny Grant wrote:
Paul Eggert wrote:
Jonny Grant wrote:
Is a more accurate strerror considered unreliable?

Current:
mkdir: cannot create directory ‘test’: File exists

Proposed:
mkdir: cannot create directory ‘test’: Is a directory

I don't understand this comment. As I understand it you're proposing a change to
the mkdir command not a change to the strerror library function, and the change
you're proposing would introduce a race condition to the mkdir command.

As the mkdir error returned to the shell is the same, I don't feel the
difference between the words "File exists" and "Is a directory" on the
terminal can be considered a race condition.

I read the message thread carefully and the proposal was to add an
additional non-atomic stat(2) call to the logic.  That sets up the
race condition.

The difference in the words of the error string is not the race
condition.  The race condition is created when trying to stat(2) the
file to see why it failed.  That can only be done as a separate
action.  That cannot be an atomic operation.  That can only create a
race condition.

For the low level utilities it is almost always a bad idea to layer in
additional system calls that are not otherwise there.  Doing so almost
always creates additional bugs.  And then there will be new bug
reports about those problems.  And those will be completely valid.

Try this experiment on your own.

   /tmp$ strace -e trace=mkdir mkdir foodir1
   mkdir("foodir1", 0777)                  = 0
   +++ exited with 0 +++

   /tmp$ strace -e trace=mkdir mkdir foodir1
   mkdir("foodir1", 0777)                  = -1 EEXIST (File exists)
   mkdir: cannot create directory ‘foodir1’: File exists
   +++ exited with 1 +++

The first mkdir("foodir1", 0777) call succeeded.  The second
mkdir("foodir1", 0777) call fail, returned -1, set errno = EEXIST,
EEXIST is the error number for "File exists".

Note that this output line:

   mkdir("foodir1", 0777)                  = -1 EEXIST (File exists)

That line was entirely reported by the 'strace' command and is not any
code related to the Coreutils mkdir command.  The strace command
reported the same "File exists" message as mkdir did later, due to the
EEXIST error code.

Let's try the same experiment with a file.  And also with a pipe and a
character device too.

   /tmp$ touch file1

   /tmp$ strace -e trace=mkdir mkdir file1
   mkdir("file1", 0777)                    = -1 EEXIST (File exists)
   mkdir: cannot create directory ‘file1’: File exists
   +++ exited with 1 +++

   /tmp$ mkfifo fifo1

   strace -e trace=mkdir mkdir fifo1
   mkdir("fifo1", 0777)                    = -1 EEXIST (File exists)
   mkdir: cannot create directory ‘fifo1’: File exists
   +++ exited with 1 +++

   /tmp$ sudo mknod char1 c 5 0

   /tmp$ strace -e trace=mkdir mkdir char1
   mkdir("char1", 0777)                    = -1 EEXIST (File exists)
   mkdir: cannot create directory ‘char1’: File exists
   +++ exited with 1 +++

And so we see that the kernel is returning the same EEXIST error code
for *all* cases where a file previously exists.  And it is correct
because all of those are files.  Because directories are files, pipes
are files, and files are files.  Everything is a file.  Therefore
EEXIST is a correct error message.

In order to correctly change the message being reported the change
should be made in the kernel so that the kernel, which has the
information at that time atomically, could report an error providing
more detail than simply EEXIST.

You have proposed that mkdir add a stat(2) system call to extract this
additional information.

as it's easy enough to call stat() like other package maintainers
do, as you can see in binutils.

*That* stat() addition creates the race condition.  Adding a stat()
call cannot be done atomically.

It would need to be done either before the mkdir(), after the mkdir(),
or both before and after.  Let's see how that can go wrong.  Let's say
we stat(), does not exist, we continue with mkdir(), fails with EEXIST
because another process got there first.  So then we stat() again and
by that time the other process has already finished processing and
removed the directory again.  A system call trace would look like
this.

   lstat("foodir1", 0x7ffcafc12800)       = -1 ENOENT (No such file or 
directory)
   mkdir("foodir1", 0777)                 = -1 EEXIST (File exists)
   lstat("foodir1", 0x7ffcafc12800)       = -1 ENOENT (No such file or 
directory)

Okay.  That's confusing.  The only value in hand being EEXIST then
that is the error to be reported.  If this were repeated many times
then sometimes we would catch it as an actual directory.

   lstat("foodir1", 0x7ffcafc12800)       = -1 ENOENT (No such file or 
directory)
   mkdir("foodir1", 0777)                 = -1 EEXIST (File exists)
   lstat("foodir1", {st_mode=S_IFDIR|0775, st_size=40, ...}) = 0

In that case the proposal is to report it as EISDIR.  If we were to
set up two processes coordinating using a directory as a semaphore and
printing out the errors then we would see a stream of output that is
sometimes the creation of the directory, sometimes the failure since a
directory already exists, and sometimes a failure because it is a
file.  Race condition.  That would be extremely confusing!  I assure
you the valid bug reports would be swift and correctly merciless!

And there is another possibility that is also bad too.  We stat(),
does not exist, we continue with mkdir(), fails with EEXIST because
another process got there first creating it as a file.  So then we
stat() again and by that time the other process has removed the file
and replaced it with a fifo device node.

   lstat("file1", 0x7ffdedf9aca0)         = -1 ENOENT (No such file or 
directory)
   mkdir("file1", 0777)                   = -1 EEXIST (File exists)
   lstat("fifo1", {st_mode=S_IFIFO|0664, st_size=0, ...}) = 0

At that point the reason the mkdir actually failed was that it was a
file but the second stat() would have a report that it was a fifo
device which would be incorrect.  Race condition.

For the low level utilities it is almost always best to remain close
to the kernel and report the actual error reported by the kernel.
Constructing a layer in the low level utilities is almost always a bad
idea.

However in your own shell scripts if you wish to construct exactly
this same layer that you are proposing then that is your prerogative.
But it would open your program up to valid bug reports for when the
race condition tripped one of the problem cases.

You're right, there will be a race condition where two processes are both
creating and deleting the same files. Any software which is creating and
deleting the same directories in parallel will encounter a multitude of
errors - all bets are off.

I am glad that you agree.  That is exactly why adding non-atomic
operations to the low level utilities creating those race coditions is
a bad idea.

A better fix would be to change the mkdir system call so that it sets errno to
EISDIR in this situation. This would fix not only the mkdir utility, but also
lots of other programs; and it wouldn't introduce a race condition. So if you're
interested in getting the problem fixed, I suggest that you propose such a
change to the Linux kernel developers.

Yes, if Linux kernel developers would deviate from POSIX.  I emailed
linux-e...@vger.kernel.org the lines of code to change.

I am glad to see that you are following up in the right place.  That
place being in the kernel.  It can't be addressed correctly in user space.

I'm not confident it will get in, even harder to get into POSIX I expect.

ext4_match() is what would need to be updated to check if an entry is a
directory

https://git.kernel.org/pub/scm/linux/kernel/git/stable/linux.git/tree/fs/ext4/namei.c

ext4?  But what if I am using xfs?  Does this then behave differently
on ext4 versus ext3 versus xfs versus nfs4 versus...  Well...  You get
the idea.

Sometimes the only way to win is not to play.  Is the current EEXIST
really so bad that we would create additional problems just to avoid
it?  Simplest is almost always best.

Bob


Thank you for your reply, and going through it all.

yes, I only asked Linux kernel ext4 team, as it is one place to ask the question, and judge the response. They also don't want to make any change. We're stuck with all these old interfaces. Unless someone wants to come up with mkdir2() and get it into POSIX.

Maybe a simple localised string is better in your mkdir tool? After-all the man page and POSIX gives the exact meaning of EEXIST in this context?

"pathname already exists"


The output is only incorrect because it defaults to system localised strerror(EEXIST)

So with this change, it would be:

mkdir: cannot create directory ‘test’: pathname already exists

Cheers, Jonny



Reply via email to