Re: du counting files in hard-linked directories twice

2009-08-27 Thread Eric Blake

According to Matthew Woehlke on 8/27/2009 5:09 PM:
> Hard-linked *directories*? I wasn't aware those were allowed... (If I
> read 'man 3 link' right, they aren't allowed on Linux.)

POSIX permits, but does not require, them (and Linux intentionally does
not permit them).  Solaris has them (for superuser only), but they are
annoying; they break some optimizations that would otherwise be possible
due to unlink() vs. rmdir() semantics.  Therefore, coreutils already
intentionally drops privileges on Solaris in order to remove the ability
to unlink a hard-linked directory.
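
For the curious, a minimal sketch of what that privilege drop might look
like; PRIV_SYS_LINKDIR and the priv_set() call reflect my recollection of
the Solaris privilege API, so treat the exact names as assumptions rather
than the actual coreutils code:

#include <priv.h>    /* Solaris privilege API -- assumption */
#include <stdio.h>

int main(void)
{
  /* Remove the privilege that permits link()/unlink() on directories
     from the effective set, so a later unlink("some-dir") fails with
     EPERM instead of damaging the file system.  */
  if (priv_set(PRIV_OFF, PRIV_EFFECTIVE, PRIV_SYS_LINKDIR, NULL) != 0)
    perror("priv_set");
  return 0;
}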

--
Don't work too hard, make some time for fun as well!

Eric Blake e...@byu.net




Re: du counting files in hard-linked directories twice

2009-08-27 Thread Michael Schwarz

On 2009-08-28 at 01:09, Matthew Woehlke wrote:

Hard-linked *directories*? I wasn't aware those were allowed... (If  
I read 'man 3 link' right, they aren't allowed on Linux.)


That got added in Mac OS X 10.5. It allows Time Machine to create a
backup in O(number of changed files) time instead of O(number of files),
since unchanged subtrees can be hard-linked from a previous backup.
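
For readers who have not seen the technique, the widely used file-level
analogue is rsync's --link-dest option (a real flag; the paths below are
made up), which hard-links unchanged files from the previous backup:

# Each new backup hard-links unchanged files from the previous one,
# so only changed files cost time and space.  Time Machine goes one
# step further and hard-links whole unchanged directories.
rsync -a --link-dest=/backups/2009-08-26 /home/ /backups/2009-08-27/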


This is also where I saw this bug, when doing a du -hs * in the backup
dir to get a rundown of how much new data is in each backup.


Michael




Re: [PATCH] open: introduce O_NOSTD

2009-08-27 Thread Ulrich Drepper
On Thu, Aug 27, 2009 at 15:55, Davide Libenzi wrote:
> Can't the handling be done on close(), like (modulo some errno save/restore):

No.  Any of the standard file descriptors can already be closed when the
process starts; such a process may never call close() at all, so a
special close() would never run.
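
A concrete illustration (some-program is a made-up name): the shell can
start a child with a standard descriptor already closed, and the child
need never call close() itself:

# fd 1 is closed before some-program even starts; if the program
# never calls close(), a close()-side fixup has no chance to run:
./some-program >&-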




Re: [PATCH] open: introduce O_NOSTD

2009-08-27 Thread Davide Libenzi
On Thu, 27 Aug 2009, Eric Blake wrote:

> 
> According to Davide Libenzi on 8/25/2009 3:53 PM:
> >> Another solution is for the application to sanitize all newly-created
> >> fds: GNU coreutils provides a wrapper open_safer, which does nothing
> >> extra in the common case that open() returned 3 or larger, but calls
> >> fcntl(n,F_DUPFD,3)/close(n) before returning if n was less than 3.
> >> However, this leads to triple the syscall cost for every open() call
> >> if the process starts life with a std fd closed; and if O_CLOEXEC is
> >> not used, still leaves a window of time where the fd can be leaked
> >> through another thread's use of fork/exec.
> > 
> > I think we can say that the vast majority of software is not going to
> > notice the proposed open_safer(), performance-wise, since the first three
> > fds are always filled. So IMO the performance impact argument is a weak one.
> > If CLOEXEC semantics are needed in the open operation, F_DUPFD_CLOEXEC can 
> > be used to match it.
> 
> The current gnulib implementation of open_safer (pre-O_CLOEXEC support) is
> (roughly):
> 
> /* Wrap open, to guarantee that a successful return value is >= 3.  */
> int open_safer (const char *name, int flags, int mode)
> {
>   int fd = open (name, flags, mode);
>   if (0 <= fd && fd <= 2)
> {
>   int dup = fcntl (fd, F_DUPFD, 3);
>   int saved_errno = errno;
>   close (fd);
>   errno = saved_errno;
>   fd = dup;
> }
>   return fd;
> }
> 
> which has the desired property of no overhead in the common case of all
> standard fds open.  But it obviously mishandles the O_CLOEXEC flag.
> Here's a first cut at supporting it:
> 
> int open_safer (const char *name, int flags, int mode)
> {
>   int fd = open (name, flags, mode);
>   if (0 <= fd && fd <= 2)
> {
>   int dup = fcntl (fd, ((flags & O_CLOEXEC)
> ? F_DUPFD_CLOEXEC : F_DUPFD), 3);
>   int saved_errno = errno;
>   close (fd);
>   errno = saved_errno;
>   fd = dup;
> }
>   return fd;
> }
> 
> If the user requested open_safer(O_CLOEXEC), then we still have the
> desired property of no syscall overhead and no fd leak.  But if the user
> intentionally does not pass O_CLOEXEC because they wanted to create an
> inheritable fd, but without stomping on standard fds, then this version
> still has a window for an fd leak.  So let's try this version, which
> guarantees no fd leak, while still keeping the semantics of giving the
> user an inheritable fd outside the std fd range:
> 
> int open_safer (const char *name, int flags, int mode)
> {
>   int fd = open (name, flags | O_CLOEXEC, mode);
>   if (0 <= fd && fd <= 2)
> {
>   int dup = fcntl (fd, ((flags & O_CLOEXEC)
> ? F_DUPFD_CLOEXEC : F_DUPFD), 3);
>   int saved_errno = errno;
>   close (fd);
>   errno = saved_errno;
>   fd = dup;
> }
>   else if (!(flags & O_CLOEXEC))
> {
>   if ((flags = fcntl (fd, F_GETFD)) < 0
>   || fcntl (fd, F_SETFD, flags & ~FD_CLOEXEC) == -1)
> {
>   int saved_errno = errno;
>   close (fd);
>   fd = -1;
>   errno = saved_errno;
> }
> }
>   return fd;
> }
> 
> This solves the fd leak, and open_safer(O_CLOEXEC) is still cheap in the
> common case.  But now the case of open_safer without O_CLOEXEC costs 3
> syscalls, regardless of whether the standard fds were already open (if we
> assumed there were no possibility of new FD_* flags, we could cut the
> common-case penalty from three to two by skipping the fcntl(fd,F_GETFD)
> and just using fcntl(fd,F_SETFD,0), but that's asking for problems).
> 
> > While the patch is simple, IMO this is something that can easily be taken
> > care of in glibc layers without huge drawbacks.
> 
> I hope that my example shows why doing it in the kernel is desirable -
> there is no safe way to keep the pre-O_CLOEXEC efficiency using just the
> library, but there IS a way to do it with kernel support:
> 
> int open_safer (const char *name, int flags, int mode)
> {
>   return open (name, flags | O_NOSTD, mode);
> }

Can't the handling be done on close(), like (modulo some errno save/restore):

int safer_close(int fd) {
	int error = close(fd);

	if (fd < 3 && fd >= 0) {
		if ((fd = open("/dev/null", O_RDWR)) > 2)
			close(fd);
	}

	return error;
}



- Davide






Re: du counting files in hard-linked directories twice

2009-08-27 Thread Matthew Woehlke

Michael Schwarz wrote:
The problem I see as a bug is that, in some circumstances, du counts the
same files multiple times when they appear inside hard-linked
directories.


Hard-linked *directories*? I wasn't aware those were allowed... (If I 
read 'man 3 link' right, they aren't allowed on Linux.)


...Aside, of course, from 'd' and all 'd/*/..' being the same.

--
Matthew
Please do not quote my e-mail address unobfuscated in message bodies.
--
For great justice!! -- Captain (Zero Wing)





Re: Correct semantics in rm -rf ?

2009-08-27 Thread shailesh jain
On Thu, Aug 27, 2009 at 12:40 PM, Jim Meyering  wrote:

> shailesh jain wrote:
> > Here it is.
> > Outcome is that Child gets deleted but parent does not get deleted.
> > (Now you will say it's a filesystem issue ;) )
>
> Let's start with the command you used here.
> It must not have been rm -rf, because the ENOENT is not being ignored:
>

Huh? I used rm -rf. If ENOENT should have been ignored, then there you
see another bug ;).



>
> > fstatat64(AT_FDCWD, "parent", {st_mode=S_IFDIR|0755, st_size=72, ...},
> > AT_SYMLINK_NOFOLLOW) = 0
> > unlinkat(AT_FDCWD, "parent", 0) = -1 EISDIR (Is a directory)
> > openat(AT_FDCWD, "parent",
> > O_RDONLY|O_NOCTTY|O_NONBLOCK|O_LARGEFILE|O_NOFOLLOW)
> > = 3
> > fstat64(3, {st_mode=S_IFDIR|0755, st_size=72, ...}) = 0
> > fstat64(3, {st_mode=S_IFDIR|0755, st_size=72, ...}) = 0
> > fcntl64(3, F_GETFL) = 0x28800 (flags
> > O_RDONLY|O_NONBLOCK|O_LARGEFILE|O_NOFOLLOW)
> > fcntl64(3, F_SETFD, FD_CLOEXEC) = 0
> > fstat64(3, {st_mode=S_IFDIR|0755, st_size=72, ...}) = 0
> > getdents64(3, /* 3 entries */, 1024)= 80
> > openat(3, "Child", O_RDONLY|O_NOCTTY|O_NONBLOCK|O_LARGEFILE|O_NOFOLLOW) =
> 4
> > fstat64(4, {st_mode=S_IFDIR|0755, st_size=48, ...}) = 0
> > fstat64(4, {st_mode=S_IFDIR|0755, st_size=48, ...}) = 0
> > fcntl64(4, F_GETFL) = 0x28800 (flags
> > O_RDONLY|O_NONBLOCK|O_LARGEFILE|O_NOFOLLOW)
> > fcntl64(4, F_SETFD, FD_CLOEXEC) = 0
> > close(3)= 0
> > fcntl64(4, F_GETFD) = 0x1 (flags FD_CLOEXEC)
> > fstat64(4, {st_mode=S_IFDIR|0755, st_size=48, ...}) = 0
> > getdents64(4, /* 2 entries */, 1024)= 48
> > getdents64(4, /* 0 entries */, 1024)= 0
> > fcntl64(4, F_GETFD) = 0x1 (flags FD_CLOEXEC)
> > openat(4, "..", O_RDONLY|O_LARGEFILE)   = 3
> > close(4)= 0
> > fstat64(3, {st_mode=S_IFDIR|0755, st_size=72, ...}) = 0
> > unlinkat(3, "Child", AT_REMOVEDIR)  = 0
> > fstat64(3, {st_mode=S_IFDIR|0755, st_size=72, ...}) = 0
> > fcntl64(3, F_GETFL) = 0x8000 (flags
> > O_RDONLY|O_LARGEFILE)
> > fcntl64(3, F_SETFD, FD_CLOEXEC) = 0
> > fstat64(3, {st_mode=S_IFDIR|0755, st_size=72, ...}) = 0
> > getdents64(3, /* 3 entries */, 1024)= 80
> > openat(3, "Child", O_RDONLY|O_NOCTTY|O_NONBLOCK|O_LARGEFILE|O_NOFOLLOW) =
> -1
> > ENOENT (No such file or directory)
> > getdents64(3, /* 0 entries */, 1024)= 0
> > fcntl64(3, F_GETFD) = 0x1 (flags FD_CLOEXEC)
>
> POSIX allows readdir to work that way.
>

Sorry, I do not follow your comment. In what way does POSIX allow readdir to work?


>
> I already suggested that making this change might help you:
>
> -CONSECUTIVE_READDIR_UNLINK_THRESHOLD = 10
> +CONSECUTIVE_READDIR_UNLINK_THRESHOLD = 1
>

I did not try the above option. I will try it later tonight.



Shailesh Jain


Re: Correct semantics in rm -rf ?

2009-08-27 Thread Jim Meyering
shailesh jain wrote:
> Here it is.
> Outcome is that Child gets deleted but parent does not get deleted.
> (Now you will say it's a filesystem issue ;) )

Let's start with the command you used here.
It must not have been rm -rf, because the ENOENT is not being ignored:

> fstatat64(AT_FDCWD, "parent", {st_mode=S_IFDIR|0755, st_size=72, ...},
> AT_SYMLINK_NOFOLLOW) = 0
> unlinkat(AT_FDCWD, "parent", 0) = -1 EISDIR (Is a directory)
> openat(AT_FDCWD, "parent",
> O_RDONLY|O_NOCTTY|O_NONBLOCK|O_LARGEFILE|O_NOFOLLOW)
> = 3
> fstat64(3, {st_mode=S_IFDIR|0755, st_size=72, ...}) = 0
> fstat64(3, {st_mode=S_IFDIR|0755, st_size=72, ...}) = 0
> fcntl64(3, F_GETFL) = 0x28800 (flags
> O_RDONLY|O_NONBLOCK|O_LARGEFILE|O_NOFOLLOW)
> fcntl64(3, F_SETFD, FD_CLOEXEC) = 0
> fstat64(3, {st_mode=S_IFDIR|0755, st_size=72, ...}) = 0
> getdents64(3, /* 3 entries */, 1024)= 80
> openat(3, "Child", O_RDONLY|O_NOCTTY|O_NONBLOCK|O_LARGEFILE|O_NOFOLLOW) = 4
> fstat64(4, {st_mode=S_IFDIR|0755, st_size=48, ...}) = 0
> fstat64(4, {st_mode=S_IFDIR|0755, st_size=48, ...}) = 0
> fcntl64(4, F_GETFL) = 0x28800 (flags
> O_RDONLY|O_NONBLOCK|O_LARGEFILE|O_NOFOLLOW)
> fcntl64(4, F_SETFD, FD_CLOEXEC) = 0
> close(3)= 0
> fcntl64(4, F_GETFD) = 0x1 (flags FD_CLOEXEC)
> fstat64(4, {st_mode=S_IFDIR|0755, st_size=48, ...}) = 0
> getdents64(4, /* 2 entries */, 1024)= 48
> getdents64(4, /* 0 entries */, 1024)= 0
> fcntl64(4, F_GETFD) = 0x1 (flags FD_CLOEXEC)
> openat(4, "..", O_RDONLY|O_LARGEFILE)   = 3
> close(4)= 0
> fstat64(3, {st_mode=S_IFDIR|0755, st_size=72, ...}) = 0
> unlinkat(3, "Child", AT_REMOVEDIR)  = 0
> fstat64(3, {st_mode=S_IFDIR|0755, st_size=72, ...}) = 0
> fcntl64(3, F_GETFL) = 0x8000 (flags
> O_RDONLY|O_LARGEFILE)
> fcntl64(3, F_SETFD, FD_CLOEXEC) = 0
> fstat64(3, {st_mode=S_IFDIR|0755, st_size=72, ...}) = 0
> getdents64(3, /* 3 entries */, 1024)= 80
> openat(3, "Child", O_RDONLY|O_NOCTTY|O_NONBLOCK|O_LARGEFILE|O_NOFOLLOW) = -1
> ENOENT (No such file or directory)
> getdents64(3, /* 0 entries */, 1024)= 0
> fcntl64(3, F_GETFD) = 0x1 (flags FD_CLOEXEC)

POSIX allows readdir to work that way: when a file is removed after the
most recent opendir or rewinddir, whether a subsequent readdir returns
an entry for it is unspecified.

I already suggested that making this change might help you:

-CONSECUTIVE_READDIR_UNLINK_THRESHOLD = 10
+CONSECUTIVE_READDIR_UNLINK_THRESHOLD = 1
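
To make the trade-off concrete, here is a minimal sketch of the pattern
such a threshold controls; this is an illustration of the idea, not the
actual remove.c code, and the helper names are made up:

#include <dirent.h>
#include <stdbool.h>

/* Trust an exhausted directory stream only if few entries were
   unlinked since the last rewind; otherwise rewind once and re-scan,
   because some file systems do not reflect unlinks in an already-open
   dirent stream.  A threshold of 1 re-scans after every pass that
   removed anything: maximally robust, measurably slower.  */
enum { THRESHOLD = 10 };

static void remove_all_entries (DIR *dirp)
{
  bool rescan = true;
  while (rescan)
    {
      int n_unlinked = 0;
      struct dirent *dp;
      while ((dp = readdir (dirp)) != NULL)
        {
          /* ... skip "." and "..", unlink or recurse, and count
             each successful removal ... */
          n_unlinked++;
        }
      rescan = (THRESHOLD <= n_unlinked);
      if (rescan)
        rewinddir (dirp);
    }
}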




Re: Correct semantics in rm -rf ?

2009-08-27 Thread shailesh jain
Here it is.
Outcome is that Child gets deleted but parent does not get deleted.
(Now you will say it's a filesystem issue ;) )


fstatat64(AT_FDCWD, "parent", {st_mode=S_IFDIR|0755, st_size=72, ...},
AT_SYMLINK_NOFOLLOW) = 0
unlinkat(AT_FDCWD, "parent", 0) = -1 EISDIR (Is a directory)
openat(AT_FDCWD, "parent",
O_RDONLY|O_NOCTTY|O_NONBLOCK|O_LARGEFILE|O_NOFOLLOW)
= 3
fstat64(3, {st_mode=S_IFDIR|0755, st_size=72, ...}) = 0
fstat64(3, {st_mode=S_IFDIR|0755, st_size=72, ...}) = 0
fcntl64(3, F_GETFL) = 0x28800 (flags
O_RDONLY|O_NONBLOCK|O_LARGEFILE|O_NOFOLLOW)
fcntl64(3, F_SETFD, FD_CLOEXEC) = 0
fstat64(3, {st_mode=S_IFDIR|0755, st_size=72, ...}) = 0
getdents64(3, /* 3 entries */, 1024)= 80
openat(3, "Child", O_RDONLY|O_NOCTTY|O_NONBLOCK|O_LARGEFILE|O_NOFOLLOW) = 4
fstat64(4, {st_mode=S_IFDIR|0755, st_size=48, ...}) = 0
fstat64(4, {st_mode=S_IFDIR|0755, st_size=48, ...}) = 0
fcntl64(4, F_GETFL) = 0x28800 (flags
O_RDONLY|O_NONBLOCK|O_LARGEFILE|O_NOFOLLOW)
fcntl64(4, F_SETFD, FD_CLOEXEC) = 0
close(3)= 0
fcntl64(4, F_GETFD) = 0x1 (flags FD_CLOEXEC)
fstat64(4, {st_mode=S_IFDIR|0755, st_size=48, ...}) = 0
getdents64(4, /* 2 entries */, 1024)= 48
getdents64(4, /* 0 entries */, 1024)= 0
fcntl64(4, F_GETFD) = 0x1 (flags FD_CLOEXEC)
openat(4, "..", O_RDONLY|O_LARGEFILE)   = 3
close(4)= 0
fstat64(3, {st_mode=S_IFDIR|0755, st_size=72, ...}) = 0
unlinkat(3, "Child", AT_REMOVEDIR)  = 0
fstat64(3, {st_mode=S_IFDIR|0755, st_size=72, ...}) = 0
fcntl64(3, F_GETFL) = 0x8000 (flags
O_RDONLY|O_LARGEFILE)
fcntl64(3, F_SETFD, FD_CLOEXEC) = 0
fstat64(3, {st_mode=S_IFDIR|0755, st_size=72, ...}) = 0
getdents64(3, /* 3 entries */, 1024)= 80
openat(3, "Child", O_RDONLY|O_NOCTTY|O_NONBLOCK|O_LARGEFILE|O_NOFOLLOW) = -1
ENOENT (No such file or directory)
getdents64(3, /* 0 entries */, 1024)= 0
fcntl64(3, F_GETFD) = 0x1 (flags FD_CLOEXEC)
close(3)= 0
close(0)= 0
close(1)= 0
close(2)= 0
exit_group(1)   = ?


Shailesh Jain



On Thu, Aug 27, 2009 at 3:44 AM, Jim Meyering  wrote:

> shailesh jain wrote:
> ...
> >> > /* Now get directory entries of 'parent'. There are no guarantees that
> >> > updates should have been reflected here, but rm -rf assumes this!
> >> > Updates are guaranteed to be reflected only on the next openat() */
> >>
> >> Or rewinddir.  In practice, only a few unusual (I'd say crufty) file
> >> systems have had problems here.  Hence the periodic rewinddir call,
> >> when deemed necessary.  I think one was hfs on MacOS of a few
> >> years ago.  That version of that OS appears no longer to be relevant.
> >>
> >
> >
> > Yes, that is what I expected: either a rewinddir() or a close()/open()
> > call.  I agree that 'rm -rf' will work on most (if not all) filesystems,
> > but clearly just because it works on most filesystems does not make it
> > compliant.
> >
> > And thus I do not understand when you say "If you see behavior from rm
> > that *is* not POSIX compliant, given compliant infrastructure, then I'll
> > be happy to investigate it."
> >
> > It clearly isn't.
>
> How can it be clear when you have not demonstrated a failure?
> Did rm -rf fail to remove a hierarchy that POSIX specifies it must?
> Your strace output did not show any failure.
>
> While rm -rf may do extra work on a file system like you describe,
> I would be surprised if rm -rf fails to conform, since
> with -rf, rm ignores any failure due to e.g., ENOENT.
>
> > Underlying file-systems are not required by POSIX to respect the
> > assumption that 'rm' is making.  Why not have a rewinddir() call there?
>
> Try it, and you'll see that performance is affected.
>
> This might do it:
>
> -CONSECUTIVE_READDIR_UNLINK_THRESHOLD = 10
> +CONSECUTIVE_READDIR_UNLINK_THRESHOLD = 1
>
> > Given that you plan to remove remove.c, you might not want to do it. But
>
> I have rewritten the core of remove.c.  I'm not removing the file.
>
> > really it's just one call to rewinddir() :) which will make
> > POSIX-conformers (like me) happy as well.
>
> IMHO, you still haven't shown how rm fails to conform.
> Did you demonstrate how rm -rf fails?
> Did you demonstrate how some other invocation of rm fails?
>




du counting files in hard-linked directories twice

2009-08-27 Thread Michael Schwarz

Hello everybody

The problem I see as a bug is that, in some circumstances, du counts the
same files multiple times when they appear inside hard-linked
directories.


I have tested this with GNU coreutils 7.4 (MacPorts) and 7.5 (vanilla)
on Mac OS X 10.5.8 on x86-64. I did all tests on an HFS+ volume,
formatted as non-case-sensitive.


Judging from du.c, the problem seems to be that du only inserts a file
into its table of visited files if its link count is greater than one
(du.c, lines 513-517):


  if (skip
  || (!opt_count_all
  && ! S_ISDIR (sb->st_mode)
  && 1 < sb->st_nlink
  && ! hash_ins (sb->st_ino, sb->st_dev)))

Replacing the 1 in the condition above with a 0 solves the problem,
although it would probably be a much better idea to insert hard-linked
directories into the hash table instead, as sketched below. This would
also prevent du from counting the size of directory inodes twice.
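
A minimal sketch of that alternative (my guess at the shape of such a
fix, not an actual du.c patch): record every directory in the visited
table, so a second hard link to the same directory inode is neither
counted nor traversed again.

#include <sys/stat.h>
#include <stdbool.h>

/* hash_ins() is assumed (as in du.c) to return false when the
   (inode, device) pair is already in the visited-files table.  */
extern bool hash_ins (ino_t ino, dev_t dev);

static bool
should_skip (bool skip, bool opt_count_all, struct stat const *sb)
{
  return skip
    || (!opt_count_all
        /* Hash directories unconditionally: a directory reached
           through a second hard link is already in the table.  */
        && (S_ISDIR (sb->st_mode) || 1 < sb->st_nlink)
        && ! hash_ins (sb->st_ino, sb->st_dev));
}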


Appended is a simple script which runs three tests:
* One where there is a single file with two links.
* One where there is a directory with two links containing one file.
* One where there is a directory with two links containing two links to a single file.


After the script comes its output on my machine, and the output of
ls -ilR for reference.


In each of the tests there is exactly one file, 1 MiB in size, yet for
the second test du reports a total size of 2 MiB.


I hope you agree that this is a bug in du and that I did not miss a
problem on my side in evaluating this. I will gladly help fix it in any
way I can, e.g. by testing patches for people who do not have a setup
that can reproduce the problem.


Thank you
Michael


$ cat test.sh
#! /usr/bin/env bash

set -e -o pipefail

# Test with a single file with two links.
(
echo 'Test with file hard links:'
rm -rf 'test-d1-f2'
mkdir 'test-d1-f2'
cd 'test-d1-f2'

mkdir 'top-1'
mkdir 'top-2'

dd bs=1k count=1k < /dev/zero > 'top-1/file' 2> /dev/null
link 'top-1/file' 'top-2/file'
du -h
echo
)

# Test with a directory with two links containing a single file with one link.

(
echo 'Test with directory hard links:'
rm -rf 'test-d2-f1'
mkdir 'test-d2-f1'
cd 'test-d2-f1'

# Because of a restriction in HFS+ or Mac OS, we need to put the two
# directory entries for the hard-linked directory into separate
# directories.

mkdir 'top-1'
mkdir 'top-2'

mkdir 'top-1/dir'
link 'top-1/dir' 'top-2/dir'
dd bs=1k count=1k < /dev/zero > 'top-1/dir/file' 2> /dev/null
du -h
echo
)

# Test with a directory with two links containing a single file with two links.

(
echo 'Test with file and directory hard links:'
rm -rf 'test-d2-f2'
mkdir 'test-d2-f2'
cd 'test-d2-f2'

mkdir 'top-1'
mkdir 'top-2'

mkdir 'top-1/dir'
link 'top-1/dir' 'top-2/dir'
dd bs=1k count=1k < /dev/zero > 'top-1/dir/file-1' 2> /dev/null
link 'top-1/dir/file-1' 'top-1/dir/file-2'
du -h
echo
)
$ ./test.sh
Test with file hard links:
1.0M	./top-1
0	./top-2
1.0M	.

Test with directory hard links:
1.0M	./top-1/dir
1.0M	./top-1
1.0M	./top-2/dir
1.0M	./top-2
2.0M	.

Test with file and directory hard links:
1.0M	./top-1/dir
1.0M	./top-1
0	./top-2/dir
0	./top-2
1.0M	.

$ ls -ilR
.:
total 4
12382224 drwxr-xr-x 4 michi michi  136 Aug 27 18:06 test-d1-f2
12382230 drwxr-xr-x 4 michi michi  136 Aug 27 18:06 test-d2-f1
12382237 drwxr-xr-x 4 michi michi  136 Aug 27 18:06 test-d2-f2
12382194 -rwxr-xr-x 1 michi michi 1207 Aug 27 18:06 test.sh

./test-d1-f2:
total 0
12382225 drwxr-xr-x 3 michi michi 102 Aug 27 18:06 top-1
12382226 drwxr-xr-x 3 michi michi 102 Aug 27 18:06 top-2

./test-d1-f2/top-1:
total 1024
12382227 -rw-r--r-- 2 michi michi 1048576 Aug 27 18:06 file

./test-d1-f2/top-2:
total 1024
12382227 -rw-r--r-- 2 michi michi 1048576 Aug 27 18:06 file

./test-d2-f1:
total 0
12382231 drwxr-xr-x 3 michi michi 102 Aug 27 18:06 top-1
12382232 drwxr-xr-x 3 michi michi 102 Aug 27 18:06 top-2

./test-d2-f1/top-1:
total 0
12382233 drwxr-xr-x 3 michi michi 102 Aug 27 18:06 dir

./test-d2-f1/top-1/dir:
total 1024
12382236 -rw-r--r-- 1 michi michi 1048576 Aug 27 18:06 file

./test-d2-f1/top-2:
total 0
12382233 drwxr-xr-x 3 michi michi 102 Aug 27 18:06 dir

./test-d2-f1/top-2/dir:
total 1024
12382236 -rw-r--r-- 1 michi michi 1048576 Aug 27 18:06 file

./test-d2-f2:
total 0
12382238 drwxr-xr-x 3 michi michi 102 Aug 27 18:06 top-1
12382239 drwxr-xr-x 3 michi michi 102 Aug 27 18:06 top-2

./test-d2-f2/top-1:
total 0
12382240 drwxr-xr-x 4 michi michi 136 Aug 27 18:06 dir

./test-d2-f2/top-1/dir:
total 2048
12382243 -rw-r--r-- 2 michi michi 1048576 Aug 27 18:06 file-1
12382243 -rw-r--r--

Re: [PATCH] open: introduce O_NOSTD

2009-08-27 Thread Florian Weimer
* Eric Blake:

> int open_safer (const char *name, int flags, int mode)
> {
>   int fd = open (name, flags | O_CLOEXEC, mode);
>   if (0 <= fd && fd <= 2)
> {
>   int dup = fcntl (fd, ((flags & O_CLOEXEC)
> ? F_DUPFD_CLOEXEC : F_DUPFD), 3);
>   int saved_errno = errno;
>   close (fd);
>   errno = saved_errno;
>   fd = dup;
> }
>   else if (!(flags & O_CLOEXEC))
> {
>   if ((flags = fcntl (fd, F_GETFD)) < 0
>   || fcntl (fd, F_SETFD, flags & ~FD_CLOEXEC) == -1)
> {
>   int saved_errno = errno;
>   close (fd);
>   fd = -1;
>   errno = saved_errno;
> }
> }
>   return fd;
> }

> This solves the fd leak,

It's still buggy.  You need something like this:

int open_safer(const char *name, int flags, int mode)
{
  int opened_fd[3] = {0, 0, 0};  /* which of fds 0..2 we had to claim */
  int fd, errno_saved;
  while (1) {
    fd = open(name, flags | O_CLOEXEC, mode);
    if (fd < 0 || fd > 2) {
      break;  /* error, or a descriptor outside the std range */
    }
    opened_fd[fd] = 1;  /* landed on a std fd; try again */
  }
  for (int i = 0; i <= 2; ++i) {
    if (opened_fd[i]) {
      errno_saved = errno;
      close(i);  /* release the placeholder descriptors */
      errno = errno_saved;
    }
  }
  return fd;
}

It's untested, so it's probably still buggy.

(O_CLOEXEC should have been a thread attribute, like the base path in
the *_at functions. *sigh*)

-- 
Florian Weimer
BFK edv-consulting GmbH   http://www.bfk.de/
Kriegsstraße 100  tel: +49-721-96201-1
D-76133 Karlsruhe fax: +49-721-96201-99




Re: [PATCH] open: introduce O_NOSTD

2009-08-27 Thread Ulrich Drepper

On 08/27/2009 06:54 AM, Eric Blake wrote:

I hope that my example shows why doing it in the kernel is desirable -
there is no safe way to keep the pre-O_CLOEXEC efficiency using just the
library, but there IS a way to do it with kernel support:


You're describing a very special case where the performance implications
are really minimal, and you try to argue that this is a good enough
reason?  I don't think so.


If a program really has to do thousands of these safe open calls, then it
can invest time into opening /dev/null for any of the unallocated
descriptors < 3.  You can even embed this logic in the safer_open function.
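
A hedged sketch of that suggestion (the function name is made up): claim
every free slot below 3 with /dev/null once, after which a plain open()
can never return a standard descriptor:

#include <fcntl.h>
#include <unistd.h>

/* Point any closed standard descriptor at /dev/null.  Afterwards
   open() can only return fds >= 3, so open_safer() degenerates to
   plain open().  Sketch only; error handling kept minimal.  */
static void fill_std_fds(void)
{
  int fd;
  do
    fd = open("/dev/null", O_RDWR);
  while (0 <= fd && fd <= 2);   /* keep claiming slots below 3 */
  if (2 < fd)
    close(fd);                  /* fds 0-2 were already occupied */
}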


--
➧ Ulrich Drepper ➧ Red Hat, Inc. ➧ 444 Castro St ➧ Mountain View, CA ❖




Re: [PATCH] open: introduce O_NOSTD

2009-08-27 Thread Eric Blake

According to Davide Libenzi on 8/25/2009 3:53 PM:
>> Another solution is for the application to sanitize all newly-created
>> fds: GNU coreutils provides a wrapper open_safer, which does nothing
>> extra in the common case that open() returned 3 or larger, but calls
>> fcntl(n,F_DUPFD,3)/close(n) before returning if n was less than 3.
>> However, this leads to triple the syscall cost for every open() call
>> if the process starts life with a std fd closed; and if O_CLOEXEC is
>> not used, still leaves a window of time where the fd can be leaked
>> through another thread's use of fork/exec.
> 
I think we can say that the vast majority of software is not going to
notice the proposed open_safer(), performance-wise, since the first three
fds are always filled. So IMO the performance impact argument is a weak one.
> If CLOEXEC semantics are needed in the open operation, F_DUPFD_CLOEXEC can 
> be used to match it.

The current gnulib implementation of open_safer (pre-O_CLOEXEC support) is
(roughly):

/* Wrap open, to guarantee that a successful return value is >= 3.  */
int open_safer (const char *name, int flags, int mode)
{
  int fd = open (name, flags, mode);
  if (0 <= fd && fd <= 2)
{
  int dup = fcntl (fd, F_DUPFD, 3);
  int saved_errno = errno;
  close (fd);
  errno = saved_errno;
  fd = dup;
}
  return fd;
}

which has the desired property of no overhead in the common case of all
standard fds open.  But it obviously mishandles the O_CLOEXEC flag.
Here's a first cut at supporting it:

int open_safer (const char *name, int flags, int mode)
{
  int fd = open (name, flags, mode);
  if (0 <= fd && fd <= 2)
{
  int dup = fcntl (fd, ((flags & O_CLOEXEC)
? F_DUPFD_CLOEXEC : F_DUPFD), 3);
  int saved_errno = errno;
  close (fd);
  errno = saved_errno;
  fd = dup;
}
  return fd;
}

If the user requested open_safer(O_CLOEXEC), then we still have the
desired property of no syscall overhead and no fd leak.  But if the user
intentionally does not pass O_CLOEXEC because they wanted to create an
inheritable fd, but without stomping on standard fds, then this version
still has a window for an fd leak.  So let's try this version, which
guarantees no fd leak, while still keeping the semantics of giving the
user an inheritable fd outside the std fd range:

int open_safer (const char *name, int flags, int mode)
{
  int fd = open (name, flags | O_CLOEXEC, mode);
  if (0 <= fd && fd <= 2)
{
  int dup = fcntl (fd, ((flags & O_CLOEXEC)
? F_DUPFD_CLOEXEC : F_DUPFD), 3);
  int saved_errno = errno;
  close (fd);
  errno = saved_errno;
  fd = dup;
}
  else if (!(flags & O_CLOEXEC))
{
  if ((flags = fcntl (fd, F_GETFD)) < 0
  || fcntl (fd, F_SETFD, flags & ~FD_CLOEXEC) == -1)
{
  int saved_errno = errno;
  close (fd);
  fd = -1;
  errno = saved_errno;
}
}
  return fd;
}

This solves the fd leak, and open_safer(O_CLOEXEC) is still cheap in the
common case.  But now the case of open_safer without O_CLOEXEC costs 3
syscalls, regardless of whether the standard fds were already open (if we
assumed there were no possibility of new FD_* flags, we could cut the
common-case penalty from three to two by skipping the fcntl(fd,F_GETFD)
and just using fcntl(fd,F_SETFD,0), but that's asking for problems).

> While the patch is simple, IMO this is something that can easily be taken
> care of in glibc layers without huge drawbacks.

I hope that my example shows why doing it in the kernel is desirable -
there is no safe way to keep the pre-O_CLOEXEC efficiency using just the
library, but there IS a way to do it with kernel support:

int open_safer (const char *name, int flags, int mode)
{
  return open (name, flags | O_NOSTD, mode);
}

Also, remember this statement from Ulrich in the series that first
introduced O_CLOEXEC/O_NONBLOCK support across all the fd creation APIs:

"In future there will be other uses.  Like a O_* flag to enable the use
of non-sequential descriptors."
http://marc.info/?l=linux-kernel&m=121010907127190&w=2

--
Don't work too hard, make some time for fun as well!

Eric Blake e...@byu.net




Re: sort -R not working

2009-08-27 Thread Pádraig Brady
Barty Pleydell-Bouverie wrote:
> Hi 
> 
> sort -R fails to randomly sort files. I am using fedora8 - any known reason 
> for this?

I vaguely remember there being a bug where sort -R was
not working in some locales, but the git history is not
helping me out. Can you retry with `LANG=C sort -R`?
I can't reproduce it here, though:

$ sudo chroot /f8
# rpm -q coreutils
coreutils-6.9-12.fc8
# echo $LANG
en_IE.UTF-8
# seq 5 | sort -R
3
2
1
4
5
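
One common source of confusion, in case it applies here: sort -R sorts
by a random hash of the key, so identical input lines end up adjacent
rather than scattered, while shuf produces a true random permutation:

$ printf 'a\na\nb\nb\n' | sort -R   # duplicate lines stay together
$ printf 'a\na\nb\nb\n' | shuf      # any order is possible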

cheers,
Pádraig.




Re: sort -R not working

2009-08-27 Thread Kamil Dudka
On Thursday, 27 August 2009 12:46:51, Barty Pleydell-Bouverie wrote:
> Hi
>
> sort -R fails to randomly sort files. I am using fedora8 - any known reason
> for this?

Please define "not working".

What is the command you have used?

What is the output?

What is the expected output?

Fedora 8 is EOL [1]. Are you able to reproduce the bug on a newer Fedora?

Kamil

[1] http://fedoraproject.org/wiki/LifeCycle/EOL




Re: sort -R not working

2009-08-27 Thread Jim Meyering
Barty Pleydell-Bouverie wrote:
> sort -R fails to randomly sort files. I am using fedora8 - any known reason 
> for this?

Fedora 8 is rather old.  I suggest you upgrade.

Please show us what command you ran, and the inputs and outputs.
And what sort --version prints.




Re: sort -R not working

2009-08-27 Thread Philip Rowlands

On Thu, 27 Aug 2009, Barty Pleydell-Bouverie wrote:


sort -R fails to randomly sort files. I am using fedora8 - any known reason for 
this?


Fedora 8 was declared end-of-life (i.e. unsupported) on January 7, 2009. 
Can you reproduce this problem either with a current Fedora release, or 
with the latest stable coreutils source?


Also, when filing bug reports please state with as much detail as 
possible:

- What you did
- What happened (this is lacking in detail above)
- What you expected to happen


Cheers,
Phil




sort -R not working

2009-08-27 Thread Barty Pleydell-Bouverie
Hi 

sort -R fails to randomly sort files. I am using fedora8 - any known reason for 
this?



[PATCH] build: prefix a few rules with $(AM_V_GEN)

2009-08-27 Thread Jim Meyering
FYI,

From de619c8fa5ea4e5cd3ca4a9632af81a487c33c0b Mon Sep 17 00:00:00 2001
From: Jim Meyering 
Date: Thu, 27 Aug 2009 10:17:55 +0200
Subject: [PATCH] build: prefix a few rules with $(AM_V_GEN)

* Makefile.am (.version, dist-hook, gen-ChangeLog): Use $(AM_V_GEN)
and $(AM_V_at), so that automake's silent-rules option (make V=1/V=0)
now controls whether the commands are printed at build time.
(THANKS-to-translators, check-ls-dircolors): Likewise.
---
 Makefile.am |   12 ++--
 1 files changed, 6 insertions(+), 6 deletions(-)

diff --git a/Makefile.am b/Makefile.am
index 57b3e33..603a444 100644
--- a/Makefile.am
+++ b/Makefile.am
@@ -98,20 +98,20 @@ rm_subst = \

 BUILT_SOURCES = .version
 .version:
-   echo $(VERSION) > $@-t && mv $@-t $@
+   $(AM_V_GEN)echo $(VERSION) > $@-t && mv $@-t $@

 # Arrange so that .tarball-version appears only in the distribution
 # tarball, and never in a checked-out repository.
 # The perl substitution is to change some key uses of "rm" to "/bin/rm".
 # See the rm_subst comment for details.
 dist-hook: gen-ChangeLog
-   echo $(VERSION) > $(distdir)/.tarball-version
-   perl -pi -e '$(rm_subst)' $(distdir)/src/Makefile.in
+   $(AM_V_GEN)echo $(VERSION) > $(distdir)/.tarball-version
+   $(AM_V_at)perl -pi -e '$(rm_subst)' $(distdir)/src/Makefile.in

 gen_start_date = 2008-02-08
 .PHONY: gen-ChangeLog
 gen-ChangeLog:
-   if test -d .git; then   \
+   $(AM_V_GEN)if test -d .git; then\
  $(top_srcdir)/build-aux/gitlog-to-changelog   \
--since=$(gen_start_date) > $(distdir)/cl-t;\
  rm -f $(distdir)/ChangeLog;   \
@@ -125,7 +125,7 @@ distcheck-hook: check-ls-dircolors
 DISTCLEANFILES = VERSION
 MAINTAINERCLEANFILES = THANKS-to-translators
 THANKS-to-translators: po/LINGUAS THANKStt.in
-   (   \
+   $(AM_V_GEN)(\
  cat $(srcdir)/THANKStt.in;\
  for lang in `cat $(srcdir)/po/LINGUAS`; do\
echo http://translationproject.org/team/$$lang.html;\
@@ -136,7 +136,7 @@ THANKS-to-translators: po/LINGUAS THANKStt.in
 # remain in sync.
 .PHONY: check-ls-dircolors
 check-ls-dircolors:
-   dc=$$(sed -n '/static.*ls_codes\[/,/};'/p   \
+   $(AM_V_GEN)dc=$$(sed -n '/static.*ls_codes\[/,/};'/p\
$(srcdir)/src/dircolors.c   \
  |sed -n '/^  *"/p'|tr , '\n'|sed 's/^  *//'   \
  |sed -n 's/^"\(..\)"/\1/p'|sort -u);  \
--
1.6.4.1.359.g4fc77
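
For context, a simplified sketch of the machinery automake generates for
its silent-rules option (reconstructed from memory of automake's output,
so details may vary by version):

# With V=0 (silent) $(AM_V_GEN) expands to an @echo that prints a
# terse "  GEN    target" line in place of the command; with V=1 it
# expands to nothing, so make echoes the full command as usual.
AM_V_GEN = $(am__v_GEN_$(V))
am__v_GEN_ = $(am__v_GEN_$(AM_DEFAULT_VERBOSITY))
am__v_GEN_0 = @echo "  GEN   " $@;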




Re: Correct semantics in rm -rf ?

2009-08-27 Thread Jim Meyering
shailesh jain wrote:
...
>> > /* Now get directory entries of 'parent'. There are no guarantees that
>> > updates should have been reflected here, but rm -rf assumes this! Updates
>> > are guaranteed to be reflected only on the next openat() */
>>
>> Or rewinddir.  In practice, only a few unusual (I'd say crufty) file
>> systems have had problems here.  Hence the periodic rewinddir call,
>> when deemed necessary.  I think one was hfs on MacOS of a few
>> years ago.  That version of that OS appears no longer to be relevant.
>>
>
>
> Yes, that is what I expected: either a rewinddir() or a close()/open()
> call.  I agree that 'rm -rf' will work on most (if not all) filesystems, but
> clearly just because it works on most filesystems does not make it
> compliant.
>
> And thus I do not understand when you say "If you see behavior from rm that
> *is* not POSIX compliant, given compliant infrastructure, then I'll be happy
> to investigate it."
>
> It clearly isn't.

How can it be clear when you have not demonstrated a failure?
Did rm -rf fail to remove a hierarchy that POSIX specifies it must?
Your strace output did not show any failure.

While rm -rf may do extra work on a file system like you describe,
I would be surprised if rm -rf fails to conform, since
with -rf, rm ignores any failure due to e.g., ENOENT.

> Underlying file-systems are not required by POSIX to respect the
> assumption that 'rm' is making.  Why not have a rewinddir() call there?

Try it, and you'll see that performance is affected.

This might do it:

-CONSECUTIVE_READDIR_UNLINK_THRESHOLD = 10
+CONSECUTIVE_READDIR_UNLINK_THRESHOLD = 1

> Given that you plan to remove remove.c, you might not want to do it. But

I have rewritten the core of remove.c.  I'm not removing the file.

> really it's just one call to rewinddir() :) which will make
> POSIX-conformers (like me) happy as well.

IMHO, you still haven't shown how rm fails to conform.
Did you demonstrate how rm -rf fails?
Did you demonstrate how some other invocation of rm fails?