bug#24730: rmdir/mkdir error(s) and/or not working "reciprocally" w/each other
Assaf Gordon wrote: Hello, Before deciding on the wording, it's worth noting that the errors, and the reasons for the errors, differ between mkdir and rmdir, and between the two cases.

On 10/18/2016 03:49 PM, L. A. Walsh wrote:

  mkdir -p ./a/b/c   # no error
  rmdir -p ./a/b/c   # error msg, but a, b, c removed

The error in this case (at least on Linux) is "Invalid argument", because 'rmdir .' is invalid and rejected by the kernel (EINVAL), while 'mkdir .' returns EEXIST -- and '-p' specifically instructs mkdir to silently ignore EEXIST.

I see... so in "./a/b/c", a, b, c are removed, but the error comes from "."?

  $ strace -e rmdir rmdir .
  rmdir(".")  = -1 EINVAL (Invalid argument)
  rmdir: failed to remove '.': Invalid argument

This is also mandated by POSIX (http://pubs.opengroup.org/onlinepubs/9699919799/functions/rmdir.html): "If the path argument refers to a path whose final component is either dot or dot-dot, rmdir() shall fail."
---
Ok, but is "-p" a POSIX switch in mkdir or rmdir? If not, wouldn't the behavior be undefined?

  mkdir -p a/../b   # no error
  rmdir -p a/../b   # error, but a & b removed

At least on my system (Linux kernel 3.13), 'a' is not removed in the above example, and the error is due to a non-empty directory:
---
Right... my bad, I only checked 'b'. (*duh*)

Note that coreutils' mkdir contains an optimization not to try to make '..', as it is guaranteed to exist (implemented in gnulib's mkancesdirs.c).
---
Right, same when rmdir traverses: when it hits "..", that must exist (even in '/').

However, by definition, when 'rmdir' traverses the directories on the given path, the directory 'a/..' is not empty (it contains 'a') -- so this must fail.

With "-p"? I am talking about the case where you create a directory with mkdir -p "$dirname" and later run rmdir -p "$dirname" (where dirname is passed as a parameter, but is not user input).
If you want 'rmdir' to silently ignore non-empty directories, there's a GNU extension option, 'rmdir --ignore-fail-on-non-empty':

  $ strace -e rmdir rmdir --ignore-fail-on-non-empty -p a/../b
  rmdir("a/../b")  = 0
  rmdir("a/..")    = -1 ENOTEMPTY (Directory not empty)
  +++ exited with 0 +++

But note that this causes 'rmdir' to stop upon the first failure, and 'a' is still not removed.
==>
Seems to be the best wording & solution: "mkdir -p" should really be restated as "follow the given path and make directories where possible"; then "rmdir -p" could be "follow the given path and delete directories if empty".

This does not accurately reflect how 'rmdir' is currently implemented. A more accurate description is "follow the given path and delete directories, until the first encountered failure".

Right, but I am trying to get rmdir to be a useful "opposite" of mkdir -- at least w/r/t "-p"...

If you want a behavior where 'rmdir' will continue beyond the first failure and try to delete all directories in the given path (e.g. a/b/c in the example above), then it sounds like a new feature request, and a non-standard behavior which will be a GNU extension (and also quite confusing behavior, IMHO).
---
Yeah -- a new feature, but confusing? I think the fact that you can mkdir -p XXX and later canNOT rmdir -p XXX is quite confusing behavior... But is -p's behavior in mkdir and rmdir mandated by POSIX?
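The asymmetry under discussion is easy to reproduce; a minimal sketch (assuming GNU coreutils on Linux, run in a throwaway directory):

```shell
# mkdir -p vs rmdir -p on the same path: mkdir succeeds silently,
# rmdir reports an error -- yet the directories are in fact removed.
cd "$(mktemp -d)"

mkdir -p ./a/b/c               # no error: '-p' silently ignores EEXIST on '.'
rmdir -p ./a/b/c               # removes c, then b, then a, then fails on '.'
echo "rmdir exit status: $?"   # non-zero, despite a, b, c being gone

test ! -d a && echo "a/b/c were removed anyway"
```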
bug#22567: Factoring 38 nines
Jim Meyering wrote: On Fri, Feb 5, 2016 at 11:29 AM, Eric Blake wrote: On 02/05/2016 11:30 AM, SasQ wrote: OK, this convinces me this is not a bug. 4m30s on my machine. But it's definitely a user-interface fail ;) It should at least output some warning that the computation might take longer, or display some progress status / estimated time along the way.
---
I thought factoring was something that could be made faster with parallel processing? I notice that on Linux and Cygwin, only one CPU is used. If someone **really** wanted to speed things up, I think using video cards that support much larger parallelism would likely be a big boost... but that would be even more likely to be a _little used feature_.
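For context, GNU factor handles ordinary-sized inputs instantly; it is only special forms like the 38-digit repunit of nines that push it into minutes of single-threaded CPU time. A quick illustration (assuming coreutils' factor is on PATH):

```shell
# Twelve nines factor instantly:
factor 999999999999
# → 999999999999: 3 3 3 7 11 13 37 101 9901

# The 38-nines case from the report takes minutes on one core
# (commented out so this sketch stays fast):
# factor 99999999999999999999999999999999999999
```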
bug#22001: Is it possible to tab separate concatenated files?
Bob Proulx wrote: That example shows a completely different problem. It shows that your input plain-text files have no terminating newline, making them officially [sic] not plain-text files but binary files. Because every plain-text line in a file must be terminated with a newline.

That's only a recent POSIX definition. It's not related to real life. When I looked for a text-file definition on Google, nothing was mentioned about needing a newline on the last line -- except on one site -- and that site was clearly not talking about 'text' files, but Unix text-record files, with each record terminated by a NL character. On a Mac, txt files have records separated by CR, and on DOS/Windows, txt files have text records separated by CRLF. Wikipedia quotes the Unicode definition of txt files -- which doesn't require the POSIX txt-record definition.

Also, POSIX limits txt format to LINE_MAX bytes -- notice it says 'bytes' and not characters. Yet a Unicode line of 256 characters can easily exceed 1024 bytes. And never in the history of the English language have lines been restricted to some number of bytes or characters. But one could note that the POSIX definition ONLY refers to files -- not streams of TEXT (whatever the character set). Specifically, note that 'TEXT COLUMNS' describes text columns measured in column widths -- yet that conflicts with the definition of Text File, in that text files use 'bytes' for a maximum line length, while text columns use 'characters' (which can be 1-4 bytes in Unicode, UTF-8 or UTF-16 encoded).

Of specific note: "text" composed of characters MUST support NUL (as well as the audio bell (control-G), the backspace (control-H), vertical tab (U+000B), and form feed (U+000C)). No standard definition outside POSIX includes any of those characters -- because text characters are supposed to be readable and visible.
But POSIX compatibility claims that the Portable Character Set (http://pubs.opengroup.org/onlinepubs/9699919799/basedefs/V1_chap06.html#tag_06_01) must include those characters. The 'text-files-must-have-NL' group ignores the POSIX 2008 definition of a portable character set -- but latches onto the implied definition of a text line as part of a 'text file'. But as already noted, POSIX has conflicting definitions of what text is (Unicode, measured in chars/columns, or ASCII, measured in bytes). Yet POSIX 2008 (same URL as above) clearly states: "A null character, NUL, which has all bits set to zero, shall be in the set of [supported] characters."

In all plain-text definitions, it is mentioned that 'text' is a set of displayable characters that can be broken into lines with the text-line separator definition. The last line of the file needs no separation character at the end, as it doesn't need to be separated from anything. The GNU standard should not limit itself to an *arcane* (and not well known outside of POSIX fans) definition of text, as it makes text files created before 2008 potentially incompatible. POSIX was supposed to be about portability... it certainly doesn't follow the internet design maxim of "accept input liberally, and generate output conservatively".

If they are not [newline-terminated] then it isn't a text line. Must be binary.
---
Whereas I maintain that newlines are required to break plain text into records -- but not at the end of file, since there is no record following.

Why isn't there a newline at the end of the file? Fix that and all of your problems and many others go away.
---
It didn't used to be a requirement -- it was added because of a broken interpretation of the POSIX standard. Please remember that a POSIXified definition of 'X' (for any X) may not be the same as a real-life 'X'. In this case, we have a file containing *text* by the POSIX definition, which you claim doesn't meet the POSIX definition of a "text file".
It's similar to Orwellian speak -- redefining common terms to mean something else so people don't notice the requirement change, then later telling others to clean up their old input code/data that doesn't meet the newly created definition. Text files have been around a lot longer than 8 years. POSIX disqualifies most text files -- for example, those created on the most widely used laptop/desktop/commercial computer OS in the world (Windows). I think what may be true is that 'POSIX text files' describe a data format that may not be how it is stored on disk. I find it very interesting that NUL is defined to be part of the POSIX character set for any app that claims to support or process 'text'. It's sad to see the GNU utils becoming less flexible and more restricted over time -- much like the trend in computers to steer the public away from general-purpose processing (and computers that can do such) to a tightly controlled, walled garden where consumers are only allowed to do what the manufacturer tells them to do.
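Whichever side of the definition one takes, the practical effect of a missing final newline is easy to demonstrate (a sketch using standard shell tools; the file name 'f' is arbitrary):

```shell
printf 'alpha\nbeta' > f    # two "lines" of text, no newline after "beta"

wc -l < f                   # prints 1: wc counts newline characters,
                            # so the final "beta" is not counted as a line

tail -n 1 f                 # prints "beta": other tools still see it

cat f f                     # the concatenation problem from the original
                            # report: "beta" and "alpha" run together as
                            # "betaalpha" on one line
```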
bug#21084: rm appears to no longer be POSIX compliant (as of 2013 edition) re: deleting empty dirs and files under path/.
Paul Eggert wrote: Linda Walsh wrote:

  time rm -fr .
  183.23sec 0.69usr 36.25sys (20.16% cpu)
  time find . ! -name . -prune -exec rm -fr {} +
  219.58sec 0.87usr 40.81sys (18.98% cpu)
  -- about 36 seconds (~20%) longer

Benchmarks like this are often suspect, since a lot depends on factors that are hard to reproduce. That being said, when I tried a similar benchmark on my machine, the 'find' solution was over 30%
---
Did you run them on separate partitions over the same file structure? Neither rm nor find had a hot or even warm cache, as I mounted the file systems just for this test. You can use the same partitions/files if you use dropcaches:

  #!/bin/bash
  function dropcaches () {
    echo -n 3 | sudo dd status=none of=/proc/sys/vm/drop_caches
  }
  if [[ ${#BASH_LINENO[@]} == 0 ]]; then
    time dropcaches
  fi

If you run it, it runs the function; if you source it, you'd have to insert 'time' manually later -- which I usually do, as I like to know how long things take. I have had it take over 60 seconds on my system, though more often it's under 10.

faster. In any event, the minor performance improvements we're talking about would not be a compelling argument for adding UI complexity to 'rm', even if the 'rm' approach were uniformly faster.
---
I was addressing Bernhard's explicitly stated concerns. They were not my concerns.

But you also didn't address points (3), (4) or (5)...

They aren't a problem either. As I mentioned, the find approach conforms to POSIX and so is quite portable; that covers (3).
---
You can't believe that. People with older systems don't always keep up to date with the latest versions -- and likely wrote most of their maintenance scripts under the original POSIX charter. They won't know until they are bitten, and then they'll complain a lot louder than I'm comfortable with. It doesn't solve (4), since that's about users wanting similar behaviors not only in other packages, but within the same package.
It's a broken wart that is not restricted in other utils -- and it causes you to have to defend --one-file-system no longer being usable, because the users "should have known". It would handle (5), _probably_. But allowing dir/. would not cause the same problems a complete ban has, nor is it a likely candidate for abuse or accident. Can you think of anything I've suggested that you've been supportive on, yet I know I
bug#21084: rm appears to no longer be POSIX compliant (as of 2013 edition) re: deleting empty dirs and files under path/.
Paul Eggert wrote: Linda Walsh wrote:

  time rm -fr .
  183.23sec 0.69usr 36.25sys (20.16% cpu)
  time find . ! -name . -prune -exec rm -fr {} +
  219.58sec 0.87usr 40.81sys (18.98% cpu)
  -- about 36 seconds (~20%) longer

Benchmarks like this are often suspect, since a lot depends on factors that are hard to reproduce. That being said, when I tried a similar benchmark on my machine, the 'find' solution was over 30% faster.
---
Nearly impossible, unless you leave out that find DOES have some multi-CPU capability in its system scan. If rm had the same, then the addition of 500,000+ calls to an external process can't take 'zero time'.

In any event, the minor performance improvements we're talking about would not be a compelling argument for adding UI complexity to 'rm', even if the 'rm' approach were uniformly faster.

But you also didn't address points (3), (4) or (5)...

They aren't a problem either. As I mentioned, the find approach conforms to POSIX and so is quite portable; that covers (3).

You claim it adheres to POSIX, but there is no single POSIX -- what version are we talking about? The version of POSIX from 2003 or earlier wouldn't have these problems. POSIX was supposed to describe what was actually implemented in the systems out there, so people could move to a common base to provide API compatibility. Adding descriptions of the base commands and the arguments they support was to help write shell scripts that would be portable. Removing functionality in a backwards-incompatible way is anything but helping portability. (3) is not maintained.

If you don't want to cross file system boundaries, add the POSIX-required -xdev option to 'find' and the GNU extension --one-file-system argument to 'rm'; that covers (4).

Not really -- we are talking about the 'rm' command, not rewriting scripts and humans to use new commands. Answer this: how does disabling functionality make something more portable?
GNU coreutils have tons of things that enable new functionality, which are not portable unless you assume all platforms will have the new functionality. But **removing** capabilities from programs can never provide backwards compatibility -- rm -fr dir/. was there for 30 years, and now you think removing that feature makes it portable? I'm sorry, your logic is not logical. If you want to use the safety reason as an overriding one, then I can see banning ., .. and / (even though GNU went for a workaround on /). But safety wouldn't be an excuse for removing rm -fr this_is_a_dir/.. I've never even heard of '.' being a problem, and it is supported in the rest of coreutils (except rmdir -- where dirname_this_is/. should also be allowed).

People that don't keep up to date can't rely on any changes that we would make to 'rm'.

I keep up to date -- that's why it bit me. But I still haven't upgraded my perl beyond 5.16, because too many of my scripts break due to the installing or removing of various supported features. I'm still working on that -- but it's a lot of scripts.

I ran my little test on the same file system.

Did you at least drop the caches between runs (i.e. by echoing '3' to /proc/sys/vm/drop_caches)? I'm afraid I'm not persuaded. You really think when people find they can't do:

  cp -axf /usr/. /usr2/.    # ...no, wanted that in /usr3
  mkdir /usr3
  cp -alxf /usr2/. /usr3/.  # ...ENOSPC...!
  rm -fxr /usr2/. /usr3/.   ## except this will fail...
  cp -axf /usr/. /usr3/.

they'll instantly think of find? -- Where else besides rmdir is dir/. banned?
bug#21084: rm appears to no longer be POSIX compliant (as of 2013 edition) re: deleting empty dirs and files under path/.
Bernhard Voelker wrote: On 08/02/2015 10:15 AM, Paul Eggert wrote: Linda Walsh wrote: find, by itself, has no way to remove all of the items under a tree, even if you own them all.

That's not a problem. Have 'find' call 'rm'. Something like this, say:

  find . ! -name . -prune -exec rm -fr {} +

So there's no need to change 'rm'.

+1. Adding additional code to find out if the file to remove is still on the same file system would add bloat, and would open another can of worms: corner cases, races and a big performance penalty.

Um... the code to find out if the file to remove is on the same file system is already in rm. -3 for attempting to create strawmen.

  #!/usr/bin/perl
  eval 'exec /usr/bin/perl -S $0 ${1+"$@"}'
      if 0; # not running under some shell

  # inspired by treescan by Jamie Lokier ja...@imbolc.ucc.ie
  # about 40% faster than the original version (on my fs and raid :)

  use strict;
  use Getopt::Long;
  use Time::HiRes ();
  use IO::AIO;

  our $VERSION = $IO::AIO::VERSION;

  Getopt::Long::Configure ("bundling", "no_ignore_case", "require_order",
                           "auto_help", "auto_version");

  my ($opt_silent, $opt_print0, $opt_stat, $opt_nodirs,
      $opt_nofiles, $opt_grep, $opt_progress);

  GetOptions
     "quiet|q"    => \$opt_silent,
     "print0|0"   => \$opt_print0,
     "stat|s"     => \$opt_stat,
     "dirs|d"     => \$opt_nofiles,
     "files|f"    => \$opt_nodirs,
     "grep|g=s"   => \$opt_grep,
     "progress|p" => \$opt_progress,
        or die "Usage: try $0 --help";

  @ARGV = "." unless @ARGV;

  $opt_grep &&= qr{$opt_grep}s;

  my ($n_dirs, $n_files, $n_stats) = (0, 0, 0);
  my ($n_last, $n_start) = (Time::HiRes::time) x 2;

  sub printfn {
     my ($prefix, $files, $suffix) = @_;

     if ($opt_grep) {
        @$files = grep "$prefix$_" =~ $opt_grep, @$files;
     }

     if ($opt_print0) {
        print map "$prefix$_$suffix\0", @$files;
     } elsif (!$opt_silent) {
        print map "$prefix$_$suffix\n", @$files;
     }
  }

  sub scan {
     my ($path) = @_;

     $path .= "/";

     IO::AIO::poll_cb;

     if ($opt_progress and $n_last + 1 < Time::HiRes::time) {
        $n_last = Time::HiRes::time;
        my $d = $n_last - $n_start;
        printf STDERR "\r%d dirs (%g/s) %d files (%g/s) %d stats (%g/s) ",
               $n_dirs, $n_dirs / $d,
               $n_files, $n_files / $d,
               $n_stats, $n_stats / $d
           if $opt_progress;
     }

     aioreq_pri (-1);

     ++$n_dirs;
     aio_scandir $path, 8, sub {
        my ($dirs, $files) = @_
           or warn "$path: $!\n";

        printfn "", [$path] unless $opt_nodirs;
        printfn $path, $files unless $opt_nofiles;

        $n_files += @$files;

        if ($opt_stat) {
           aio_wd $path, sub {
              my $wd = shift;

              aio_lstat [$wd, $_] for @$files;
              $n_stats += @$files;
           };
        }

        scan ("$path$_") for @$dirs;
     };
  }

  IO::AIO::max_outstanding 100; # two fds per directory, so limit accordingly
  IO::AIO::min_parallel 20;

  for my $seed (@ARGV) {
     $seed =~ s/\/+$//;
     aio_lstat "$seed/.", sub {
        if ($_[0]) {
           print STDERR "$seed: $!\n";
        } elsif (-d _) {
           scan $seed;
        } else {
           printfn "", [$seed], "/";
        }
     };
  }

  IO::AIO::flush;
bug#21084: rm appears to no longer be POSIX compliant (as of 2013 edition) re: deleting empty dirs and files under path/.
Paul Eggert wrote: Linda Walsh wrote: find, by itself, has no way to remove all of the items under a tree, even if you own them all.

That's not a problem. Have 'find' call 'rm'. Something like this, say:

  find . ! -name . -prune -exec rm -fr {} +

So there's no need to change 'rm'.

Bernhard is worried about performance. Do you know how long it would take for find to call rm half a million times? Um:

  time rm -fr .
  183.23sec 0.69usr 36.25sys (20.16% cpu)
  time find . ! -name . -prune -exec rm -fr {} +
  219.58sec 0.87usr 40.81sys (18.98% cpu)
  -- about 36 seconds (~20%) longer

So you've already slowed things down -- and those times were just for my home directory...! (Non-critical data was used for these tests: copies of my home directory that existed on different partitions.)

But you also didn't address points (3), (4) or (5)... -.5
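For readers who want to try the find incantation from this thread, here is a self-contained sketch on a throwaway tree (the directory names are made up for the example):

```shell
tmp=$(mktemp -d)
mkdir -p "$tmp/tree/alpha/one" "$tmp/tree/beta/two"
touch "$tmp/tree/alpha/one/file" "$tmp/tree/beta/two/file"

# Empty out 'tree' without removing 'tree' itself -- the suggested
# stand-in for the now-rejected `rm -fr tree/.`:
( cd "$tmp/tree" && find . ! -name . -prune -exec rm -fr {} + )

ls -A "$tmp/tree"    # prints nothing: contents gone, directory kept
```

The `! -name . -prune` pair matches every top-level entry except '.' itself and stops find from descending, so rm -fr receives each top-level entry once and does the recursion itself.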
bug#21084: rm appears to no longer be POSIX compliant (as of 2013 edition) re: deleting empty dirs and files under path/.
Paul Eggert wrote: Linda Walsh wrote: Since there is no opposition to this, I presume all you need now is a patch?

My impression is that hardly anybody cares about this corner case. How about the following idea instead? We could have --no-preserve-root also skip the special treatment for '.' and '..'. That way, we shouldn't need to add an option.
---
Though I've never had a problem with doing something like 'rm -fr /', I'd prefer not to chance it -- and I can't believe this is a corner case, not given the putrid hate spewed at me by some BSD supporter who pushed through the mandatory restrictions. Instead of 'rm -fr /', I prefer:

  dd if=/dev/sda2 of=/dev/sda

...wait, where was my new partition? ARG!

Another issue I haven't raised yet, because this one is more important to me, is the horrible execution of --one-file-system. Since rm -fr --one-file-system foo/. was removed and the suggested replacement was "use '*'"... gee... so you mean under foo/ you had bind mounts to your root, /usr and /home partitions? But --one-file-system didn't catch them, because they all were presented to 'rm' as command-line args -- and the man-page legalese says: "when removing a hierarchy recursively, skip any directory that is on a file system different from that of the corresponding command line argument". So it limits the protection, deleting all the files you used to be safe from deleting when dir/. was allowed. Example on my system...
snapshot dir:

  Filesystem                           Size  Used Avail Use% Mounted on
  /dev/Data/Home-2015.04.22-03.07.02   6.8G  5.5G  1.3G  81% /home/.snapdir/@GMT-2015.04.22-03.07.02
  /dev/Data/Home-2015.04.30-03.07.02   1.1G  913M  186M  84% /home/.snapdir/@GMT-2015.04.30-03.07.02
  /dev/Data/Home-2015.05.17-13.11.21   762M  647M  115M  85% /home/.snapdir/@GMT-2015.05.17-13.11.21
  /dev/Data/Home-2015.05.18-00.40.55   1.2G  981M  193M  84% /home/.snapdir/@GMT-2015.05.18-00.40.55
  /dev/Data/Home-2015.05.18-13.05.04   1.7G  1.4G  287M  83% /home/.snapdir/@GMT-2015.05.18-13.05.04
  /dev/Data/Home-2015.05.19-04.08.02   1.2G  957M  189M  84% /home/.snapdir/@GMT-2015.05.19-04.08.02
  /dev/Data/Home-2015.05.20-04.08.02   922M  774M  149M  84% /home/.snapdir/@GMT-2015.05.20-04.08.02
  /dev/Data/Home-2015.05.21-04.08.03   802M  676M  126M  85% /home/.snapdir/@GMT-2015.05.21-04.08.03
  /dev/Data/Home-2015.05.22-04.08.02   2.3G  1.9G  421M  82% /home/.snapdir/@GMT-2015.05.22-04.08.02
  /dev/Data/Home-2015.05.23-04.08.02   4.5G  3.7G  874M  81% /home/.snapdir/@GMT-2015.05.23-04.08.02
  /dev/Data/Home-2015.05.24-04.08.04   7.2G  5.8G  1.4G  81% /home/.snapdir/@GMT-2015.05.24-04.08.04
  /dev/Data/Home-2015.05.26-03.39.31   1.3G  1.1G  218M  84% /home/.snapdir/@GMT-2015.05.26-03.39.31
  /dev/Data/Home-2015.05.27-04.08.05   5.4G  4.4G  1.1G  82% /home/.snapdir/@GMT-2015.05.27-04.08.05
  /dev/Data/Home-2015.06.01-14.19.28   4.1G  3.3G  779M  82% /home/.snapdir/@GMT-2015.06.01-14.19.28
  /dev/Data/Home                       1.5T  1.1T  494G  68% /home
  /dev/Data/Home-2015.06.02-12.53.34   1.5T  1.1T  502G  68% /home/.snapdir/@GMT-2015.06.02-12.53.34

That's the type of harm caused by removing the

  cd snapshots; rm -fr --one-file-system .

or

  rm -fr --one-file-system snapshots/.

usage, and telling people it's not needed in rm because the shell's '*' will expand it. I could safely use the disabled features in that dir -- sometimes junk builds up where something got copied into a directory that didn't have the corresponding partition mounted, as an example.
With snapshots becoming more in vogue, in another decade or so POSIX will require banning wildcard usage from a shell (if shell access hasn't been disabled before that, of course... ;^). Besides, with my suggested change, rm would only need 1 new switch, not 2 like '/' did ;-), though I admit to wanting to add -x (find, rsync, and maybe others have such a switch, meaning stay on one device (--xdev)) -- and if I had my druthers, using -x would NOT use the fact that cmdline args were on different filesystems as an excuse to do more than operate on one file system... (would it be that hard to check the device IDs of the cmd-line args before starting a recursive delete based off them?)... Eh... like I said, for me, just the special option to allow . or dir/. is far more important.
bug#21084: rm appears to no longer be POSIX compliant (as of 2013 edition) re: deleting empty dirs and files under path/.
Linda Walsh wrote: Andreas Schwab wrote: "If either of the files dot or dot-dot are specified as the basename portion of an operand (that is, the final pathname component) or if an operand resolves to the root directory, rm shall write a diagnostic message to standard error and do nothing more with such operands."

I'll grant it also says you can't remove /. So a special flag --use_depth_first_inspection, that says not to look at a basename until its children have been processed, wouldn't be any more out of place than special flags to handle / processing, right? The fact that they put ., .. and / together, outside of the 1-4 processing steps, leads one to the idea that they should be treated similarly, no?
---
Since there is no opposition to this, I presume all you need now is a patch? I.e. POSIX now demands that /, . and .. all be ignored in a basename, yet some smart GNU folks decided that leaving in a non-default optional behavior to override the new dumbed-down restrictions would best serve the community. So I might reason that they would be equally smart and/or use similar logic to allow a non-default option to remove the dumbing-down on the '.' path.

NOTE: I have no issue with NOT _attempting_ a delete on '.' after doing the designed depth-first traversal. Applying the POSIX restriction on not attempting to delete '.' makes perfect sense to me, since I know that doing so can give inconsistent and undefined behavior depending on the OS. But '.' is used as a semantic placeholder to let one reference a starting point for some action (imagine using 'find' if '.' were banned as a starting point):

  $ find '' -type f
  find: '': No such file or directory

*cheers*
bug#21084: rm appears to no longer be POSIX compliant (as of 2013 edition) re: deleting empty dirs and files under path/.
In looking at the 2013 specification for rm (http://pubs.opengroup.org/onlinepubs/9699919799/utilities/rm.html), it no longer says to stop processing if the path basename equals '.' or '..'. It says that the entries '.' and '..' shall not be removed. It also says 'rm emptydir' shall behave like rmdir -- i.e., it will delete empty directories. But in the case of foo/. it would be expected to process child inodes before processing the directory itself. And step 4 on that page says that rm should remove empty directories without requiring other special switches.
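As an aside, GNU rm already exposes the empty-directory case mentioned here via an explicit flag; `-d` (`--dir`) is a GNU/BSD extension rather than the unadorned POSIX behavior being argued about, but it shows the rmdir-like mode in action:

```shell
cd "$(mktemp -d)"
mkdir empty full
touch full/file

rm -d empty    # succeeds: behaves like rmdir on an empty directory
rm -d full     # fails: 'full' is not empty (and -r was not given)
```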
bug#21084: rm appears to no longer be POSIX compliant (as of 2013 edition) re: deleting empty dirs and files under path/.
reopen 21084
thanks

Andreas Schwab wrote: Linda Walsh coreut...@tlinx.org writes: In looking at the 2013 specification for rm (http://pubs.opengroup.org/onlinepubs/9699919799/utilities/rm.html), it no longer says to stop processing if the path basename equals '.' or '..'.

"If either of the files dot or dot-dot are specified as the basename portion of an operand (that is, the final pathname component) or if an operand resolves to the root directory, rm shall write a diagnostic message to standard error and do nothing more with such operands."

I'll grant it also says you can't remove /. So a special flag --use_depth_first_inspection, that says not to look at a basename until its children have been processed, wouldn't be any more out of place than special flags to handle / processing, right? The fact that they put ., .. and / together, outside of the 1-4 processing steps, leads one to the idea that they should be treated similarly, no?
bug#21000: coreutils 8.24 sort -h gets ordering wrong
Paul Eggert wrote: Linda Walsh wrote: Since the output is targeted for 3-4 digits plus a suffix, one would just normalize them.

If the output is that of 'du' or 'df' and has just a few digits and always uses a consistent suffix, then there is no need to normalize it; that is what 'sort' does now. In this case 'sort' doesn't need to know whether the output uses powers of 1024 or powers of 1000. However, if the output is from some other random program that uses suffixes inconsistently, all that goes out the window: normalization and some form of arbitrary-precision arithmetic would be required, and 'sort' would need to be told whether the suffixes use powers of 1024 or powers of 1000. I didn't understand the code snippet you sent, but it appears to have rounding errors. 'sort' doesn't have rounding errors when doing the comparison in question. Let's not introduce them now.
---
It has to. It's rounding to 2-3 digits.
bug#21000: coreutils 8.24 sort -h gets ordering wrong
Paul Eggert wrote: Linda Walsh wrote: I didn't understand the code snippet you sent, but it appears to have rounding errors. 'sort' doesn't have rounding errors when doing the comparison in question. Let's not introduce them now.
---
It has to. It's rounding to 2-3 digits.

Sorry, I don't understand that comment. 'sort -h' does not round. It doesn't have to. Your code *does* round, and I suppose in some sense it has to, but that's a bug compared to what 'sort' does now.
---
Perhaps, but what it does now is fail on the type of input presented by the OP; the code you are saying has a 'bug' is the code that correctly sorts the OP's test input -- and the rounding bit is in the output routine. Your needs may be different than mine, but I think you were just talking about the expectation of using normalized metric numbers -- thus my consideration of it being fair to produce rounded, normalized metric output. Whereas the existing 'sort -h' doesn't sort the input -- which pretty much is a fail as far as handling non-normalized binary input goes. You can say, well, it wasn't designed to do that... but it could, up to 64 bits, which would provide utility for the output of coreutils' other tools. I'd also add an --si switch to make it clear it is tied to the same units used elsewhere.

The C++ code has similar output constraints, in that it needs to put out a 2-3 digit number. But sort doesn't need a summary number -- it just needs to sort the input -- where it fails. My sorting code loses no precision up to the exabyte range (it fits in a 64-bit unsigned). The OP's input only used numbers up to the low gigabyte range. To actually convert each number to a suffix-free 64-bit unsigned, sort that, then print the original numbers out in sorted order would be well within sort's purview. The issue of binary vs. decimal prefixes could be solved with a --si switch paralleling the output of 'du'. Please don't be offended by my suggestion that sort *could* handle the OP's case w/o rounding.
I get that, right now, this is an academic discussion. However, if one offered such sorting, wouldn't it follow that a summation ability might be asked for? What to choose for output? Wouldn't the default be to choose the number of digits of your fewest-digit input (least precision)? Anyway, I was just disappointed at how quickly the OP's expectation for sort was shot down.
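The core complaint can be shown in two lines: as implemented, GNU `sort -h` compares the suffix letters first and only then the digits, and never normalizes values (a minimal reproduction, assuming GNU coreutils):

```shell
# 1025K is 1,049,600 bytes and 1M is 1,048,576 -- numerically 1025K is
# larger, but sort -h ranks any K-suffixed value below any M-suffixed one:
printf '1025K\n1M\n' | sort -h
# → 1025K
#   1M
```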
bug#20936: suggestion for a 'wart-ish' extension off of 'sort'
I admit the ability to show a summary line might not be the first thing you'd think a pure-sorting utility might do, but it would be awfully handy if sort had a 'Numeric sum' option (-N; I'd have preferred '-s', but it's already taken) to go with the -h sorting, a la:

  du -sh *|sort -h|tail
  6.0M  firmware
  6.7M  kernel
  8.4M  tools
  26M   net
  29M   sound
  30M   Documentation
  31M   include
  37M   fs
  128M  arch
  330M  drivers

--- vs. ---

  du -sh *|hsort -s|tail -12
  6.0M  firmware
  6.7M  kernel
  8.4M  tools
  26M   net
  29M   sound
  30M   Documentation
  31M   include
  37M   fs
  128M  arch
  330M  drivers
  -
  649.4M  TOTAL

I'd donate the code for hsort, but it's in perl -- I wrote it several years ago to do what 'sort -h' does, but also put in the option for a summary line -- a handy companion for 'human numbers', which would otherwise take a lot more typing (I think -- unless there's some hidden switch I don't know about).
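For anyone wanting the effect today without the Perl script, a rough shell equivalent of the hypothetical `hsort -s` can be cobbled together from `sort -h` and `numfmt` (both GNU coreutils). The function name `hsort_sum` and the choice of IEC (1024-based) units are assumptions of this sketch, and it only reads column 1:

```shell
# Sort du-style "human" sizes and append a TOTAL line, roughly like the
# proposed `hsort -s`.
hsort_sum() {
    data=$(cat)
    printf '%s\n' "$data" | sort -h
    printf '%s\n' "$data" | awk '{print $1}' |
        numfmt --from=iec |                       # suffixes -> raw bytes
        awk '{ s += $1 } END { print s }' |       # exact integer sum
        numfmt --to=iec |                         # back to a human figure
        awk '{ printf "-\n%s\tTOTAL\n", $1 }'
}

printf '1.5M\tsperl,v\n512K\tmisc\n' | hsort_sum
```

Because the sum is done on raw byte counts, only the final TOTAL figure is rounded, not the intermediate arithmetic.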
bug#20936: suggestion for a 'wart-ish' extension off of 'sort'
On 6/30/2015 3:51 AM, Erik Auerswald wrote:

  Ishtar:/tmp/dutest> du -shc * |sort -h|tail
  1.5M  sperl,v
  3.6M  total
  5.0M  total

  Ishtar:/tmp/dutest> du -sh * |hsort -s|tail
  1.5M  sperl,v
  3.6M  total
  -
  5.1M  TOTAL

But a more obvious problem is that 'du -shc' seems to be coming up with the wrong number -- i.e. 1.5+3.6 = 5.1, not 5.0.

Those are probably rounding errors avoided by du, that hsort cannot avoid anymore.
---
1) I think you're right, but 2) it still looks odd to see 1.5+3.6 = 5.0 and not 5.1.
bug#20936: suggestion for a 'wart-ish' extension off of 'sort'
On 6/30/2015 12:46 AM, Erik Auerswald wrote:

  du -sh *|sort -h|tail

Why not use 'du -shc * | sort -h | tail -n11'? The total produced by du will sort after all the individual parts.

Good idea -- I didn't know about '-c'. But two things, one troubling, the other a confusion. If you have a dir named 'total', it can be slightly confusing:

  Ishtar:/tmp/dutest> du -shc * |sort -h|tail
  1.5M  sperl,v
  3.6M  total
  5.0M  total

  Ishtar:/tmp/dutest> du -sh * |hsort -s|tail
  1.5M  sperl,v
  3.6M  total
  -
  5.1M  TOTAL

But a more obvious problem is that 'du -shc' seems to be coming up with the wrong number -- i.e. 1.5+3.6 = 5.1, not 5.0. In my original example, it's off by more:

  Ishtar:linux/linux-4.1.0> du -sch *|sort -h|tail
  6.7M  kernel
  8.4M  tools
  26M   net
  29M   sound
  30M   Documentation
  31M   include
  37M   fs
  128M  arch
  330M  drivers
  645M  total

  Ishtar:linux/linux-4.1.0> du -sh *|hsort -s|tail
  8.4M  tools
  26M   net
  29M   sound
  30M   Documentation
  31M   include
  37M   fs
  128M  arch
  330M  drivers
  -
  649.4M  TOTAL
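The 5.0 vs 5.1 oddity is consistent with per-entry ceiling rounding: du -h rounds each displayed figure up to the shown precision (numfmt's default from-zero rounding behaves the same way for positive values), and a sum of rounded-up parts can exceed the rounded-up true sum. A sketch with made-up byte counts:

```shell
numfmt --to=iec 1520000                  # ~1.45 MiB rounds up to 1.5M
numfmt --to=iec 3700000                  # ~3.53 MiB rounds up to 3.6M
numfmt --to=iec $((1520000 + 3700000))   # true sum ~4.98 MiB rounds up to 5.0M
# displayed parts: 1.5 + 3.6 = 5.1, yet the displayed total is 5.0
```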
bug#20678: new bug that Paul asked for... grep -P aborts on non-utf8 input.
Bernhard Voelker wrote:
On 05/28/2015 12:24 AM, Linda Walsh wrote:
ok... ARG -- I just installed the new version of grep from my distro (suse13.2) -- grep-2.20-2.4.1.x86_64. I think they'll be out with a new distro release in about a year... (yes, I can probably build my own... like I have to with a growing body of software)

This is openSUSE specific. When you've built your own version with a patch for a problem, nothing prevents you from simply creating a submit request for that patch on OBS to Base:System/grep, and maybe even creating a maintenance request for openSUSE:13.2/grep. Get involved.

The main thing my patch does is restore functionality of 'rm' to allow rm -fr . -- I'm not daft enough to try to sneak that in as a default. Maybe in a different command, maybe as a non-default, but I'm anything but duplicitous (unfortunately). I _have_ always thought that a shorthand combination of rd and rm might be nice -- maybe 'r'... Of course it would only work like rmdir on empty dirs unless they specify the -r flag so it could remove contents first. And of course it would pay attention to the posix rule about not trying to delete '.' after it finished its depth-first traversal... But no one else seems to really care that much, so I'm not sure how much effort I want to put into packaging something like that up. But it has entered my mind...

Cheers, Linda
bug#20678: new bug that Paul asked for... grep -P aborts on non-utf8 input.
Paul Eggert wrote:
On 05/27/2015 02:41 PM, L. A. Walsh wrote:
*** file = libvtkUtilitiesPythonInitializer-pv4.2.so.1
grep: invalid UTF-8 byte sequence in input

This looks like you're using an old version of libpcre, or of grep. I can't reproduce the problem with the latest stable versions of both (libpcre 8.37, grep-2.21). I can find similar problems if I use old libpcre.
---
ok... ARG -- I just installed the new version of grep from my distro (suse13.2) -- grep-2.20-2.4.1.x86_64. I think they'll be out with a new distro release in about a year... (yes, I can probably build my own... like I have to with a growing body of software) -- something that has gotten me in trouble with my distro at times when I've caught them locking different pieces of software to specific libraries (not == xxx but ==)... grrr... I could acknowledge their point that most people wouldn't bother rebuilding all the perl modules if they upgraded perl... but that's not *everyone*!... sigh.

coreutils isn't as stable as it used to be (not entirely the CU-devel team either: I've caught suse's hand in 1-2)... Just ran into problems in their new gvim and sudo -- I think the sudo prob is the sudo-dev team... but the gvim one I filed a bug on in a previous version... guess it didn't get fixed. Filing bugs more often than not is a big waste of time. *grump* *grump*... ;-)
bug#20442: bug+patch: du output misaligned on different terminals
Pádraig Brady wrote:
There are backwards compatibility issues to consider.

Could you demonstrate any where similar issues wouldn't affect output from 'ls'? Second, I don't remember numfmt being part of posix, but my solution seems to fall under POSIX:

From: Pádraig Brady
Subject: Re: du: POSIX mandating a single space instead of tab?
Date: Tue, 28 Apr 2015 16:51:06 +0100
User-agent: Mozilla/5.0 (X11; Linux x86_64; rv:31.0) Gecko/20100101 Thunderbird/31.5.0

On 28/04/15 15:42, Eric Blake wrote:
No, the space stands for any (positive) amount of white space. http://pubs.opengroup.org/onlinepubs/9699919799/basedefs/V1_chap05.html#tag_05
Andreas.

Thanks for pointing that out Andreas. So a ' ' in a format implies any amount of blank chars.

Correct.

So we could separate the du columns with spaces rather than tab,

Yes, I'd prefer that we did that. It is much easier to guarantee alignment when using spaces and completely avoiding whatever tab stops people have set up.
=
I find myself agreeing with Eric on this issue. Actually, I would prefer an envvar USE_TABS=[[ytfnx]] (y=yes, t=true, f=false, n=no, x=expand), plus their upper-case versions for simplicity, defaulting to 8 spaces/tab -- though I could also support the absence of 'USE_TABS' as being compatible with current functionality. I can see the use of something like this in desktop programs, tty/consoles, editors, etc... I would want to see standard column terminology (ala sort/cut, ranging from 1-80). If the var is undefined, I think it would be _more_ _predictable_ to go with expansion at '8', OR to behave like the tty_tab command w/no args: it shows you the current settings. (Why doesn't 'tabs' show them?) However -- if we agree on the envvar for tab presence/expansion and tabstop definitions, I could also agree on leaving behaviors in current programs the same as they are now (w/o the USE_TABS envvar).
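As a concrete sketch of how a utility might honor the proposed variable (USE_TABS is hypothetical -- no coreutils program reads it; the x-means-expand convention follows the [[ytfnx]] letters above):

```shell
# Hypothetical sketch: a filter honoring the proposed USE_TABS variable.
# 'x'/'X' (expand) may carry a width, e.g. USE_TABS=x2; any other value
# passes tabs through untouched. USE_TABS is invented, not a real envvar.
emit_tabs() {
  case ${USE_TABS-} in
    [xX]*)
      width=${USE_TABS#[xX]}
      expand -t "${width:-8}"    # expand tabs at the requested stop width
      ;;
    *)
      cat                        # default: leave tabs alone
      ;;
  esac
}
printf '12K\tconfiguration/Core.cfg\n' | USE_TABS=x4 emit_tabs
```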
tabs
tabs: no tab-list given
--- hmmm, not too useful. Also, tabs doesn't take columns but spaces of separation -- tabstops are usually at set points like 1/2 or 1/4 or some fixed number, so I'd use tabs 2. One might like 'tabs' to display the current tabs:
tty_tab
(from 1, tabs skip to column: 3 5 7 9 11 13 15 17 19 21 23 25 27 29 31 33 35 37 39 41 43 45 47 49 51 53 55 57 59 61 63 65 67 69 71 73 75 77 79 80
I note the tabs program also supports irregular tabs:
for i in a a2 c c2 c3 f p s u; do
  tabs -$i
  tty_tab
done
(from 1, tabs skip to column: 10 16 36 72 80
(from 1, tabs skip to column: 10 16 40 72 80
(from 1, tabs skip to column: 8 20 55 80
(from 1, tabs skip to column: 6 10 14 49 80
(from 1, tabs skip to column: 6 10 22 26 30 34 38 42 46 50 54 58 62 67 80
(from 1, tabs skip to column: 7 11 15 19 23 80
(from 1, tabs skip to column: 5 9 13 17 21 25 29 33 37 49 53 80
(from 1, tabs skip to column: 10 55 80
(from 1, tabs skip to column: 12 20 44 80
-
I would like to see this in an ENV var so people could use it for other utils in their session (like vim/emacs or whatever). Some files, like the /etc/fstab file, really need variable tabs.
bug#20442: bug+patch: du output misaligned on different terminals
reopen 20442
thanks
===
Your more general case doesn't work:
du -sh /tmp/t*|numfmt --format %10f
numfmt: rejecting suffix in input: ‘4.0K’ (consider using --from)
du -sh --time /tmp/t*|numfmt --format %10f
numfmt: rejecting suffix in input: ‘4.0K’ (consider using --from)
I usually use other arguments with 'du'. Your external tool solution doesn't handle the general case of du's output. The point was to correct 'du's output, not find a *custom* solution to correct assumptions made by 'du'. Why would you reject something that fixes this problem? Are you proposing to remove the special tab-handling in 'dir', 'ls', 'cat', 'expand', 'pr', 'unexpand', 'vdir', among many other cmdline utils? Relying on hard-coded constants is usually considered poor programming practice. In this case, you are relying on all terminals/output devices conforming to a fixed value that you deem correct. Is there a benefit to choosing an inferior design that doesn't work across different terminal sizes? The patch resolves the problem and works on all terminals.

Pádraig Brady wrote:
tag 20442 wontfix
close 20442
stop

On 27/04/15 20:11, L. A. Walsh wrote: This is a fix/work-around for RFE#19849 (bug#19849), which was about adding options to expand tabs and/or set a tabsize for output from 'du' so output would line up as intended. Without that enhancement, the current output is messed up on terminals/consoles that don't use hard-coded-constant widths for tabs (like many or most of the Xterm linux consoles). Adding the switches is more work than I want to chew off right now, but the misaligned output made for difficult reading (besides looking bad), especially w/a monospace font where it is clear that the columns were meant to line up. So I threw together a quick patch against the current git source (changes limited to 'du.c').
If someone would look it over, try it, or such, and apply it to the current coreutils source tree (it's in patch form against 'src/du.c') for some soon future release (at least until such time as the above mentioned RFE can be addressed).

123456789 123456789 123456789 123456789 123456789 123456789 123456789 123456789

The current du output (example from my tmp dir) on a term w/o hard-coded-constant expansion looks like:

Ishtar:tools/coreutils/work/src /usr/bin/du /tmp/t*
4 /tmp/t
1160 /tmp/t1
680 /tmp/t2
4 /tmp/tab2.patch
20 /tmp/tabs
4 /tmp/tmpf
4 /tmp/topcmds
24 /tmp/topcmds-hlps
24 /tmp/topcmds2
8 /tmp/topcmds2.txt
4 /tmp/tq1
32 /tmp/tt

In fairness, this is with the unusual case after running `tabs 2`. *Without* the assumption of hard-coded or fixed tabs (using 8 spaces/tab, as seems to be the implementor's assumption/intention), the output columns, again, line up vertically:

Ishtar:tools/coreutils/work/src ./du /tmp/t*
4    /tmp/t
1160 /tmp/t1
680  /tmp/t2
4    /tmp/tab2.patch
20   /tmp/tabs
4    /tmp/tmpf
4    /tmp/topcmds
24   /tmp/topcmds-hlps
24   /tmp/topcmds2
8    /tmp/topcmds2.txt
4    /tmp/tq1
32   /tmp/tt

While not addressing the RFE, at least the original output format should look the same on all terminals.

Thanks for the patch, however the same could be achieved more generally with external tools. For example numbers are better for human consumption when right aligned, so you could achieve both with: du | numfmt --format %10f
cheers, Pádraig.
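Incidentally, the "rejecting suffix" failure reported against this numfmt suggestion elsewhere in the thread is just numfmt defaulting to --from=none. A sketch of the pipeline with suffix parsing enabled (still not a fix for du's tab output, and not exercised against every du option combination):

```shell
# --from=iec parses du -h's "4.0K"-style sizes; --to=iec re-emits them,
# right-aligned in 10 columns via --format. /tmp/t* is the thread's example.
du -sh /tmp/t* 2>/dev/null | numfmt --from=iec --to=iec --format '%10f'
```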
bug#19969: problem: wc -c doesn't read actual # of bytes in file
Jim Meyering wrote:
As root:
# cd /proc
# find -H [^0-9]* -name self -prune -o -name thread-self -prune -o -type f ! -name kmsg ! -name kcore ! -name kpagecount ! -name kpageflags -print0|wc -c --files0-from=- |sort -n

Thanks for the report. However, with wc from coreutils-8.23 and a 3.10 kernel, this is no longer an issue.
---
With coreutils 8.23 from suse 13.2 and uname: Linux Ishtar 3.18.5-Isht-Van #1 SMP PREEMPT Wed Feb 4 14:50:44 PST 2015 x86_64 x86_64 x86_64 GNU/Linux, it is an issue. All the /proc/sys entries are still 0. Here's the output (with some lines elided)...

0 mpt/summary 0 net/netfilter/nfnetlink_log 0 sys/abi/vsyscall32 0 sys/debug/exception-trace 0 sys/dev/hpet/max-user-freq 0 sys/dev/raid/speed_limit_max 0 sys/dev/raid/speed_limit_min 0 sys/dev/scsi/logging_level 0 sys/fs/aio-max-nr 0 sys/fs/aio-nr 0 sys/fs/dentry-state 0 sys/fs/dir-notify-enable 0 sys/fs/epoll/max_user_watches 0 sys/fs/file-max 0 sys/fs/file-nr 0 sys/fs/inode-nr 0 sys/fs/inode-state 0 sys/fs/inotify/max_queued_events 0 sys/fs/inotify/max_user_instances 0 sys/fs/inotify/max_user_watches 0 sys/fs/lease-break-time 0 sys/fs/leases-enable 0 sys/fs/mqueue/msg_default 0 sys/fs/mqueue/msg_max 0 sys/fs/mqueue/msgsize_default 0 sys/fs/mqueue/msgsize_max 0 sys/fs/mqueue/queues_max 0 sys/fs/nr_open 0 sys/fs/overflowgid 0 sys/fs/overflowuid 0 sys/fs/pipe-max-size 0 sys/fs/protected_hardlinks 0 sys/fs/protected_symlinks 0 sys/fs/suid_dumpable 0 sys/fs/xfs/age_buffer_centisecs 0 sys/fs/xfs/error_level 0 sys/fs/xfs/filestream_centisecs 0 sys/fs/xfs/inherit_noatime 0 sys/fs/xfs/inherit_nodefrag 0 sys/fs/xfs/inherit_nodump 0 sys/fs/xfs/inherit_nosymlinks 0 sys/fs/xfs/inherit_sync 0 sys/fs/xfs/irix_sgid_inherit 0 sys/fs/xfs/irix_symlink_mode 0 sys/fs/xfs/panic_mask 0 sys/fs/xfs/rotorstep 0 sys/fs/xfs/speculative_prealloc_lifetime 0 sys/fs/xfs/stats_clear 0 sys/fs/xfs/xfsbufd_centisecs 0 sys/fs/xfs/xfssyncd_centisecs 0 sys/fscache/object_max_active 0 sys/fscache/operation_max_active 0 
sys/kernel/acct 0 sys/kernel/auto_msgmni 0 sys/kernel/bootloader_type 0 sys/kernel/bootloader_version 0 sys/kernel/cad_pid 0 sys/kernel/cap_last_cap 0 sys/kernel/compat-log 0 sys/kernel/core_pattern 0 sys/kernel/core_pipe_limit 0 sys/kernel/core_uses_pid 0 sys/kernel/ctrl-alt-del 0 sys/kernel/dmesg_restrict 0 sys/kernel/domainname 0 sys/kernel/ftrace_dump_on_oops 0 sys/kernel/hostname 0 sys/kernel/hotplug 0 sys/kernel/hung_task_check_count 0 sys/kernel/hung_task_panic 0 sys/kernel/hung_task_timeout_secs 0 sys/kernel/hung_task_warnings 0 sys/kernel/io_delay_type 0 sys/kernel/kexec_load_disabled 0 sys/kernel/keys/gc_delay 0 sys/kernel/keys/maxbytes 0 sys/kernel/keys/maxkeys 0 sys/kernel/keys/persistent_keyring_expiry 0 sys/kernel/keys/root_maxbytes 0 sys/kernel/keys/root_maxkeys 0 sys/kernel/kptr_restrict 0 sys/kernel/kstack_depth_to_print 0 sys/kernel/latencytop 0 sys/kernel/lock_stat 0 sys/kernel/max_lock_depth 0 sys/kernel/modprobe 0 sys/kernel/modules_disabled 0 sys/kernel/msg_next_id 0 sys/kernel/msgmax 0 sys/kernel/msgmnb 0 sys/kernel/msgmni 0 sys/kernel/ngroups_max 0 sys/kernel/nmi_watchdog 0 sys/kernel/ns_last_pid 0 sys/kernel/numa_balancing 0 sys/kernel/numa_balancing_scan_delay_ms 0 sys/kernel/numa_balancing_scan_period_max_ms 0 sys/kernel/numa_balancing_scan_period_min_ms 0 sys/kernel/numa_balancing_scan_size_mb 0 sys/kernel/osrelease 0 sys/kernel/ostype 0 sys/kernel/overflowgid 0 sys/kernel/overflowuid 0 sys/kernel/panic 0 sys/kernel/panic_on_io_nmi 0 sys/kernel/panic_on_oops 0 sys/kernel/panic_on_unrecovered_nmi 0 sys/kernel/perf_cpu_time_max_percent 0 sys/kernel/perf_event_max_sample_rate 0 sys/kernel/perf_event_mlock_kb 0 sys/kernel/perf_event_paranoid 0 sys/kernel/pid_max 0 sys/kernel/poweroff_cmd 0 sys/kernel/print-fatal-signals 0 sys/kernel/printk 0 sys/kernel/printk_delay 0 sys/kernel/printk_ratelimit 0 sys/kernel/printk_ratelimit_burst 0 sys/kernel/pty/max 0 sys/kernel/pty/nr 0 sys/kernel/pty/reserve 0 sys/kernel/random/boot_id 0 
sys/kernel/random/entropy_avail 0 sys/kernel/random/poolsize 0 sys/kernel/random/read_wakeup_threshold 0 sys/kernel/random/urandom_min_reseed_secs 0 sys/kernel/random/uuid 0 sys/kernel/random/write_wakeup_threshold 0 sys/kernel/randomize_va_space 0 sys/kernel/sched_autogroup_enabled 0 sys/kernel/sched_cfs_bandwidth_slice_us 0 sys/kernel/sched_child_runs_first 0 sys/kernel/sched_domain/cpu{0..11}/domain0/busy_factor 0 sys/kernel/sched_domain/cpu{0..11}/domain0/busy_idx 0 sys/kernel/sched_domain/cpu{0..11}/domain0/cache_nice_tries 0 sys/kernel/sched_domain/cpu{0..11}/domain0/flags 0 sys/kernel/sched_domain/cpu{0..11}/domain0/forkexec_idx 0 sys/kernel/sched_domain/cpu{0..11}/domain0/idle_idx 0 sys/kernel/sched_domain/cpu{0..11}/domain0/imbalance_pct 0 sys/kernel/sched_domain/cpu{0..11}/domain0/max_interval 0 sys/kernel/sched_domain/cpu{0..11}/domain0/max_newidle_lb_cost 0 sys/kernel/sched_domain/cpu{0..11}/domain0/min_interval 0 sys/kernel/sched_domain/cpu{0..11}/domain0/name 0
bug#19969: problem: wc -c doesn't read actual # of bytes in file
Bernhard Voelker wrote:
On 02/28/2015 09:59 AM, Linda Walsh wrote:
(coreutils-8.21-7.7.7) wc -c (bytes) doesn't seem to reliably read the number of bytes in a file. I was wanting to find out what the largest data-source files in '/proc' and '/sys' were (didn't get around to trying /sys, since all the files under /proc/sys return 0 bytes). Note -- wc -l doesn't return '0' on the /proc/sys files. As root:
# cd /proc
# find -H [^0-9]* -name self -prune -o -name thread-self -prune -o -type f ! -name kmsg ! -name kcore ! -name kpagecount ! -name kpageflags -print0|wc -c --files0-from=- |sort -n

Thanks for the report. However, I'm not 100% sure what the problem is - as you didn't narrow the case down to a certain file, nor did you show us the output and what you expected.
---
Yes, I did. Anything under /proc/sys is affected. There is more than one 'certain file' that is affected. What I would _expect_ is for it to show the actual # of bytes in the file rather than a size of 0. That's why I included the command to duplicate the problem. Are you saying you tried it and it doesn't do the same thing on your system? Without me debugging the source, how much more do you want me to narrow it down? If you change the wc -c to wc -l, you will see non-zero numbers for files under /proc/sys. BTW... it seems many files under /sys are also affected. As for files under /proc, the 'net' directory seems to work. Are you wanting a list like this:

Ishtar:/proc find -H sys net -name self -prune -o -name thread-self -prune -o -type f ! -name kmsg ! -name kcore ! -name kpagecount ! 
-name kpageflags -print0|wc -c --files0-from=- |sort -n wc: sys/fs/protected_hardlinks: Permission denied wc: sys/fs/protected_symlinks: Permission denied wc: sys/kernel/cad_pid: Permission denied wc: sys/kernel/usermodehelper/bset: Permission denied wc: sys/kernel/usermodehelper/inheritable: Permission denied wc: sys/net/ipv4/route/flush: Permission denied wc: sys/net/ipv4/tcp_fastopen_key: Permission denied wc: sys/vm/compact_memory: Permission denied 0 net/netfilter/nfnetlink_log 0 sys/abi/vsyscall32 0 sys/debug/exception-trace 0 sys/dev/hpet/max-user-freq 0 sys/dev/raid/speed_limit_max 0 sys/dev/raid/speed_limit_min 0 sys/dev/scsi/logging_level 0 sys/fs/aio-max-nr 0 sys/fs/aio-nr 0 sys/fs/dentry-state 0 sys/fs/dir-notify-enable 0 sys/fs/epoll/max_user_watches 0 sys/fs/file-max 0 sys/fs/file-nr 0 sys/fs/inode-nr 0 sys/fs/inode-state 0 sys/fs/inotify/max_queued_events 0 sys/fs/inotify/max_user_instances 0 sys/fs/inotify/max_user_watches 0 sys/fs/lease-break-time ... (pattern is /proc/sys/* = size==0) vs. directory /proc/net: 11 net/ip_tables_names 17 net/ip_tables_targets 36 net/psched 39 net/connector 47 net/mcfilter 54 net/ip_mr_cache 55 net/ip_tables_matches 71 net/ip_mr_vif 128 net/icmp 128 net/rt_cache 128 net/udplite 142 net/sockstat 171 net/ptype 233 net/raw 321 net/netfilter/nf_log 381 net/packet 644 net/dev_mcast 733 net/fib_triestat 822 net/igmp 896 net/route 1062 net/protocols 1090 net/dev 1188 net/softnet_stat 1234 net/arp 1385 net/snmp 1506 net/stat/arp_cache 1678 net/netlink 1718 net/fib_trie 2077 net/stat/rt_cache 2412 net/netstat 4096 net/rt_acct 6528 net/udp 2 net/unix 41550 net/tcp 92581 total --- Sorry to be unclear, but I would expect the numbers under /proc/sys to look something like the numbers under /proc/net. FWIW, the 'prune' and exceptions in the find are to filter out some degenerate cases (like byte counts in /proc/kmem) as well as too much verbosity (skip the processes, including self and thread-self). 
I also used the -H switch to follow symlinks given on the command line, but ignore lower ones. For the numbers under /sys, most seem to return a wrong size -- a size of 0 -- but doing a 'cat FILENAME|wc -c' shows the correct number. I didn't give much data about /sys, as getting a test case that would run just for /proc, w/o hanging or looping, took too long (like trying to read 'kmsg' takes forever and will basically hang the test case).

A quick shot is http://bugs.gnu.org/18621
---
Certainly looks like what I'm seeing. HOWEVER, why do I get numbers in /proc/net but not under /sys? Seems like there is something else going on here.

and the corresponding commit (after the latest v8.23 release) http://git.sv.gnu.org/cgit/coreutils.git/commit/?id=2662702b Does this - i.e., a wc(1) built from git - solve your problem?
---
It may work *around* the problem, but doesn't explain the inconsistencies between, for example, /proc/sys and /proc/net. If the patch forces a read, there's a good chance it will work, but the inconsistent returns from /proc/sys, /proc/net AND /sys (which often seems to show 4K, as noted in the other bug report) would seem to indicate a kernel bug somewhere. If they were ALL 0 or all 4K, then I'd say the kernel doesn't return size for either of those interfaces (and that may be as far as that goes and the workaround
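The stat-versus-read distinction at the heart of this bug can be shown directly (assuming a Linux /proc; the file chosen here is an arbitrary world-readable example, not one from the report):

```shell
# /proc/sys files advertise st_size == 0; a wc -c that trusts stat()
# therefore printed 0, while actually reading the file yields real bytes.
f=/proc/sys/kernel/ostype    # arbitrary, world-readable example file
stat -c %s "$f"              # kernel-reported size: 0
cat "$f" | wc -c             # bytes actually readable: non-zero
```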
bug#19969: problem: wc -c doesn't read actual # of bytes in file
(coreutils-8.21-7.7.7) wc -c (bytes) doesn't seem to reliably read the number of bytes in a file. I was wanting to find out what the largest data-source files in '/proc' and '/sys' were (didn't get around to trying /sys, since all the files under /proc/sys return 0 bytes). Note -- wc -l doesn't return '0' on the /proc/sys files. As root:
# cd /proc
# find -H [^0-9]* -name self -prune -o -name thread-self -prune -o -type f ! -name kmsg ! -name kcore ! -name kpagecount ! -name kpageflags -print0|wc -c --files0-from=- |sort -n
bug#19849: RFE: du output uses undefined screen-tabsize: expand tabs to spaces or -Ttabsize option?
I run a linux compat term that allows setting the tab size. Since most of my usage is with tabsize=2, I set the term's tabsize to such when it comes up. Programs that can display tabs in output, like 'ls', 'diff', 'less' (or more), to name a few, have some type of expand-tabs or -[tT] option to expand tabs on output (or input, to line up input columns). Ex:

ls:
  -T, --tabsize=COLS   assume tab stops at each COLS instead of 8
diff:
  -t, --expand-tabs    expand tabs to spaces in output
  -T, --initial-tab    make tabs line up by prepending a tab
  --tabsize=NUM        tab stops every NUM (default 8) print columns
(etc..)

I propose 'du' gain a -T option like 'ls' to allow for formatted output. So instead of:

20K     My layouts/linda-default.fcl
20K     My layouts/new-default.fcl
0       My layouts/foo.fcl
2.2M    autobackup/autobackup.20141103-042819.zip
2.3M    autobackup/bak
12K     configuration/Core.cfg
12K     playlists/0106.fpl
24K     playlists/index.dat
2.1M    pygrabber/libs
28K     pygrabber/scripts
1.3M    user-components/foo_AdvancedControls

I could see:

20K  My layouts/linda-default.fcl
20K  My layouts/new-default.fcl
0    My layouts/foo.fcl
2.2M autobackup/autobackup.20141103-042819.zip
2.3M autobackup/bak
12K  configuration/Core.cfg
12K  playlists/0106.fpl
24K  playlists/index.dat
2.1M pygrabber/libs
28K  pygrabber/scripts
1.3M user-components/foo_AdvancedControls

Two other readability examples from different programs follow, and a description of the attachment.
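Until du grows such an option, a workaround sketch is to post-process with expand; this only rewrites du's tabs as spaces, it doesn't give du real -T column semantics:

```shell
# Render du's tab-separated output with a chosen tab-stop width by
# expanding the tabs to spaces before the terminal ever sees them.
du -sh -- * | expand -t 8    # or: expand -t 2 for a tabsize=2 terminal
```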
Of note, 'ls' defaults to expanding tabs to spaces, so it doesn't have the problem of variable expansion, but if one tells it to use 8 spaces/tab (example pruned from /tmp):

4.0K 0bPwr3N_7s 4.0K cyg2lin.env4.0K prereqs.txt 4.0K 1 16K diff 4.0K rmdirs 4.0K 24.0K dirs0 ssh-Y3YzuDAD5w/ 0 3173-f1.txt4.0K do_diffs* 0 ssh-a9nNm0VQ2c/ 4.0K 5QXcX6apwV 4.0K done0 ssh-oszB2InjXA/ 0 CPAN-Reporter-lib-1WVP/ 4.0K fZuwIWpHXO 0 ssh-pOlsxOkr0U/ 0 CPAN-Reporter-lib-wDln/ 4.0K files 0 ssh-vSPNXq8i3I/ 4.0K HUk8j_zP_d 4.0K fq22uj4fYU 0 t1 4.0K all 4.0K lnx.txt 0 veKj4PS/ 4.0K awstest.out456K log 0 vq0XVTv/ 104K boot-cons.msg4.0K lt.h 40K x.log 0 boot.msgs/ 4.0K meterlist 24K x.txt

vs. ls -CFhsT2:

4.0K 0bPwr3N_7s 4.0K cyg2lin.env4.0K prereqs.txt 4.0K 1 16K diff 4.0K rmdirs 4.0K 24.0K dirs 0 ssh-Y3YzuDAD5w/ 0 3173-f1.txt 4.0K do_diffs* 0 ssh-a9nNm0VQ2c/ 4.0K 5QXcX6apwV 4.0K done 0 ssh-oszB2InjXA/ 0 CPAN-Reporter-lib-1WVP/ 4.0K fZuwIWpHXO0 ssh-pOlsxOkr0U/ 0 CPAN-Reporter-lib-wDln/ 4.0K files 0 ssh-vSPNXq8i3I/ 4.0K HUk8j_zP_d 4.0K fq22uj4fYU0 t1 4.0K all 4.0K lnx.txt 0 veKj4PS/ 4.0K awstest.out 456K log 0 vq0XVTv/ 104K boot-cons.msg4.0K lt.h40K x.log 0 boot.msgs/ 4.0K meterlist 24K x.txt

As a final short example -- something I use to print a shortened version of my current directory in my prompt (viewed with default 8-col tabs: less -x8 spwd):

#!/bin/bash
cols() {
  declare size=$(stty size </dev/tty)
  echo ${size#* }
}
export -f cols
shopt -s expand_aliases
alias int=declare\ -i _e=echo _pf=printf exp=export ret=return
exp __dpf__='local -a PF=(
  /$1/$2/$3/../\${$[$#-1]}/\${$#}
  /$1/$2/../\${$[$#-1]}/\${$#}
  /$1/../\${$[$#-1]}/\${$#}
  /$1/../\${$#}
  .../\${$#}
  ... )'
function spwd () {
  (($#)) || { _e spwd called with null arg; ret 1; }
  int w=${COLUMNS:-$(cols)}/2
  ( _pf -v _p %s $1 ; exp IFS=/
    set $_p; shift; unset IFS
    t=${_p#${HOME%${USER}}}
    int tl=${#t}
    if (($#=6 tlw));then ((tl=2))
      { _e -En ${_p};ret 0; }
    else
bug#19544: RFE: please fix limited dd output control (want xfer stats, but not blocks).
Pádraig Brady wrote:
There is a new status=progress option that will output the above format every second, but on a single updated line.
---
excellent.

(Which, BTW, uses program intelligence to use the same output units as the user used for input units, rather than giving them units in an unfamiliar dialect.)

this has been discussed previously.
---
True, but it hasn't been fixed, and reasons against *localization* to the user's stated dialect and preference were never given. How is it that giving users what they want meets with such resistance? It seems I wasn't the only one who held the opinion I presented, and there was no good technical reason why it shouldn't be done. As you say, this was previously discussed, and the result was there was no good technical reason for not doing it, but those who own 'dd' decided not to do it in their version at that time. That doesn't mean you might not have come up with some technical reason not to do it since your last discussion.

thanks, Pádraig
thanks as well, ;-) Linda
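For reference, a sketch of the option under discussion (status=progress landed in coreutils 8.24; the sizes here are arbitrary):

```shell
# status=progress replaces the "records in/out" summary with a single,
# periodically-updated transfer-statistics line on stderr.
dd if=/dev/zero of=/dev/null bs=64K count=1600 status=progress
```

The same level mechanism offers status=none to suppress the statistics entirely.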
bug#19544: RFE: please fix limited dd output control (want xfer stats, but not blocks).
The blocks are a bit uninteresting:
7+0 records in
7+0 records out
6+0 records in
11+0 records out
8+0 records in
8+0 records out
2+0 records in
...
2+0 records out
15+0 records in
15+0 records out
--- Tells me nothing -- not size of recs, nor time... nothing interesting. What I'd rather see:
983040 bytes (983 KB) copied, 0.0135631 s, 72.5 MB/s
327680 bytes (328 KB) copied, 0.00869602 s, 37.7 MB/s
393216 bytes (393 KB) copied, 0.00978036 s, 40.2 MB/s
458752 bytes (459 KB) copied, 0.00906681 s, 50.6 MB/s
...
65536 bytes (66 KB) copied, 0.00843794 s, 7.8 MB/s
65536 bytes (66 KB) copied, 0.00845365 s, 7.8 MB/s
983040 bytes (983 KB) copied, 0.0128341 s, 76.6 MB/s
262144 bytes (262 KB) copied, 0.01019 s, 25.7 MB/s
262144 bytes (262 KB) copied, 0.00933135 s, 28.1 MB/s
589824 bytes (590 KB) copied, 0.0124597 s, 47.3 MB/s
1048576 bytes (1.0 MB) copied, 0.0138104 s, 75.9 MB/s
--- (Which, BTW, uses program intelligence to use the same output units as the user used for input units, rather than giving them units in an unfamiliar dialect.)

Side note: this is consistent with the use of other non-metric units, like hours (you don't hear about kilo-hours or kilo-minutes), because those units are not a multiple of 10 appropriate for Metric prefixes. I think the litmus test is whether or not the unit being used is some power of 10 (with no arbitrary constant 'k' needed to convert it to a metric unit). Thus bits, being a single, recognizable flux change on disk, correspond on a 1:1 basis (sans compression) to flux changes, and the number of flux changes one can pack into a cm^2 is a pure calculation with base-10 metric units. VS. speaking of 'Bytes', or Sectors, one is no longer speaking of a direct metric unit, but of a convertible one. It is rare and not official practice to use metric prefixes with non-Metric units. As bytes are not a metric unit, using base-10 metric prefixes shouldn't be a point of discussion or contention.
Bytes imply base-2 multiple quantities that can only be precisely specified by base-2 prefixes.
bug#19051: rm symboliclink/ # Is a directory
Eric Blake wrote:
Still, my point remains when you use 'rm -r b/': on Linux, it fails (so does 'rm -r b@' as a symlink to a file).

The linux way to address the directory has been rm -r b/. POSIX blocked the linux way of addressing the directory in rm, though; for example, it still works in 'cp': cp -rl b/. a/. correctly makes links in 'a/.' of 'b/.' files. But it doesn't see the '.' in a and fail to execute normally. In fact, attempting that copy isn't even an error in 'cp', even though b/. is a symlink to a/. This came up before with symlinked targets and 'rm', and supposedly coreutils would always attempt a dir-op with / appended, but try a symlink-op w/o it.

Linux is behaving consistently, in that rm applies to symlinks, not the targets. On linux, to address the content of a directory, dir/. has generally been used, but as you mention below, POSIX intentionally violated the linux behavior in its 2008 update. I.e. the linux behavior was present long before POSIX was changed to prevent the linux addressing strategy (cannot remove 'b/': Not a directory); on Solaris it succeeds at removing 'a' and leaving 'b' dangling.

The fact that Linux intentionally violates POSIX on some of the corner cases related to symlinks to directories makes it harder to definitively state what coreutils should do.
---
POSIX semantics changing away from the original POSIX standard break the portability aspect of POSIX, and are hard to take seriously as a portability standard when it can't even maintain standards between its own versions. Given the relative number of users of the various *nix's, it sorta looks like POSIX is attempting to wag the dog. I strongly question the logic in linux following some minority standard when the largest user base is likely to be used to the linux behavior.
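A sketch of the corner case in question; the behavior described in the comments is what the thread reports for GNU/Linux (Solaris reportedly removes the target directory instead):

```shell
# 'b' is a symlink to directory 'a'. With a trailing slash, Linux rm
# fails ("Not a directory"); without it, rm removes only the symlink.
cd "$(mktemp -d)"        # scratch directory for the demo
mkdir a
ln -s a b
rm -r b/ 2>&1 || true    # Linux: cannot remove 'b/': Not a directory
rm b                     # no slash: removes just the symlink; 'a' survives
```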
bug#18681: cp Specific fail example
Bob Proulx wrote:
Wow. Just to be clear an rsync copy took 75 to 90 minutes but a cp
---
Actually in the case I used for illustration, it was 110 minutes, but that was longer than normal. Last night's figures:

: rsync took 87m, 34s [which is fairly quick given the size of the diffs.]
: Empty-directory removal took 1m, 58s
: Find used space for /home.diff...sz=2.5GB, min=3.1GB, extsz=4.0MB, n-ext'=806
: Copying diffs to dated static snap...Time: 0m, 17s.

It wasn't a copy, but a diff between 2 volumes (the same volume, really, but one is a ~24+hour snapshot started on the previous run). So I look at the differences between two temporal copies, then copy that to a 3rd partition that starts out empty. So rsync is comparing file times (it doesn't do file reads, _by_ _default_, unless it needs to move the data, as indicated by size and timestamps) -- it examines all file time/dates on my 'home' partition, and compares those against a mostly-the-same active LVM snapshot. Out of 871G, on the long day, it found ~5G of changes -- last night was only 3G... varies based on how much change happened to the volume over the period... smallest size now is 600m, largest I've seen has been about 18G.

Once the *difference* is on the 3rd volume (home.diff), I destroy the active snapshot created 'yesterday', then recreate it as a dynamically sized static one -- enough to hold the diff. Then cp is used to move whatever diffs were put on the diff volume by rsync. So those diffs -- most of them are _likely_ to be in memory -- AND as I mentioned, I didn't do a sync after the copy (it happens automatically, but isn't included in the timing). But if I used rsync to do that exact same copy, it would take at least 2-3 times as long... actually... hold on... I can copy it from that partition made yesterday... into the diff partition... but will tar up the source to prime the cache... This is the volume:

df .
Filesystem                          Size  Used Avail Use% Mounted on
/dev/Data/Home-2014.10.08-03.07.05  5.5G  4.4G  1.1G  81% /home/.snapdir/@GMT-2014.10.08-03.07.05

Ishtar:.snapdir/@GMT-2014.10.08-03.07.05 du -sh .
4.4G .

ok... running cp 1st, then remove, then rsync...:

Ishtar:.snapdir/@GMT-2014.10.08-03.07.05 time sudo cp -a . /home.diff/.
6.39sec 0.15usr 6.23sys (99.81% cpu)
Ishtar:.snapdir/@GMT-2014.10.08-03.07.05 time sudo rm -fr /home.diff/.
1.69sec 0.03usr 1.64sys (99.43% cpu)
Ishtar:.snapdir/@GMT-2014.10.08-03.07.05 time sudo rsync -aHAX . /home.diff/.
20.83sec 27.02usr 11.68sys (185.84% cpu)

185% cpu!... hey! that's cheating, and still 3x slower... here's 1 core:

Ishtar:.snapdir/@GMT-2014.10.08-03.07.05 time sudo rm -fr /home.diff/.
1.73sec 0.03usr 1.69sys (99.39% cpu)
Ishtar:.snapdir/@GMT-2014.10.08-03.07.05 time sudo taskset -a 02 rsync -aHAX . /home.diff/.
38.52sec 25.92usr 11.90sys (98.18% cpu)

--- so limiting it to 1 cpu... 6x slower. (Remember this is all memory-buffered.) Note... rsync has been sped up slightly over the past couple of years and 'cp' has slowed down somewhat over the same time period, so these diffs used to be worse.

Then 'cp' is used to copy the image on 'home.diff' to the dynamically sized

copy took less than 1 minute? I find that very suspicious.
---
Well, hopefully the above explanation is more clear and highlights what we wanted to measure.

It appears that you are using features from rsync that do not exist in cp. Therefore the work being done in the task isn't equivalent work. In that case it is probably quite reasonable for rsync to be slower than cp.

Yup... Never would argue differently, but for what it does, rsync is still pig slow; yet when the amount of data you need to move is hundreds of times smaller than the total, it can't be beat!

Also consider that if cp were to acquire all of the enhancements that have been requested for cp as time has gone by then cp would be just as featureful (bloated!) 
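The shape of the comparison above can be reproduced on any small tree; the paths below are made-up placeholders (the thread's numbers came from a 4.4G snapshot), and each copy would be wrapped in time(1) to reproduce the timings:

```shell
# Compare cp -a against rsync -aHAX on a cache-warm placeholder tree.
# 'testtree'/'copytree' are invented names; rsync may not be installed,
# and -A/-X need ACL/xattr support, hence the guard and || true.
cd "$(mktemp -d)"
src=testtree dst=copytree
mkdir -p "$src/d" "$dst"
echo data > "$src/d/f"
tar -cf /dev/null "$src"         # prime the page cache, as in the thread
cp -a "$src/." "$dst/."          # plain in-process copy
if command -v rsync >/dev/null 2>&1; then
  rm -rf "$dst"; mkdir -p "$dst"
  rsync -aHAX "$src/" "$dst/" || true   # client-server model, even locally
fi
```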
as rsync and likely just as slow as rsync too. Nope... rsync is slow because it does everything over a client-server model -- even when it is local. So everything is written through a pipe... that's why it can't come close to cp -- and why cp would never be so slow -- I can't imagine it using a pipe to copy a file anywhere! This is something to consider every time someone asks for a creeping feature in cp. Especially if they say they want the feature in cp because it is faster than rsync. The natural progression is that cp would become rsync. Not even! Note: cp already has a comparison function built in that it uses during cp -u... but it doesn't go through pipes. It used to use larger buffer sizes, or maybe tell posix to pre-alloc the destination space, dunno, but it used to be faster... I can't say for certain, but it seems to be using smaller buffer sizes. Another reason rsync is so slow -- it uses a relatively
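The timing comparison above can be reproduced in miniature. This is a rough sketch only: the tree and paths are throwaway placeholders (the original ran against a 4.4G LVM snapshot), though the rsync flags mirror the ones used above.

```shell
#!/bin/bash
# Sketch of the cp-vs-rsync comparison, on a tiny throwaway tree.
set -e
src=$(mktemp -d)
dst=$(mktemp -d)

# Build a small tree to copy.
mkdir -p "$src/a/b"
for i in 1 2 3; do printf 'data %s\n' "$i" > "$src/a/b/f$i"; done

# cp works in-process, issuing read/write syscalls directly.
time cp -a "$src/." "$dst/."

# rsync forks a client/server pair even for local copies, so the
# data passes through a pipe between the two processes.
if command -v rsync >/dev/null 2>&1; then
    rm -rf "$dst"; mkdir "$dst"
    time rsync -aHAX "$src/." "$dst/."
fi

rm -rf "$src" "$dst"
```

On a tree this small the times are all near zero; the point of the sketch is only the shape of the measurement, not the numbers.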
bug#18681: cp Specific fail example
Bob Proulx wrote: Meanwhile... I would be one of those suggesting that perhaps you should try using rsync instead of cp. The cp command is lean and mean by comparison to rsync (and should stay that way). But rsync has many attractive features for doing large copies. --- fwiw... Like large execution times... From the latest snapshot on my system -- I use rsync to only move differences between yesterday and today [whenever a new snap is taken]... it was a larger-than-normal snap -- most only take 75-90 minutes... but rsync (these are the script messages, with some debugging output still turned on)... even an rm over the resulting diff took 101 seconds... then cp comes along... even w/a sync it would still be under a minute. I.e. rsync copied just the diffs to /home.diff, then find with -empty -delete is used to get rid of empty dirs (rsync creates many of these). Then a static partition is created to hold the diff output -- and cp walked and copied the tree in 12s. (Output wasn't flushed, but it's not that long... a minute...) If rsync wasn't so slow at local I/O... *sigh*
rsync took 110m, 14s
Empty-directory removal took 1m, 41s
Find used space for /home.diff...sz=4.3GB, min=5.4GB, extsz=4.0MB, n-ext'=1388
target extents num=1388, size=4.0M
Old volume active: Deactivated. Removed.
Create vol. Home-2014.10.08-03.07.05, size 5.4G {L=141008030705, /dev/Data/Home-2014.10.08-03.07.05=CODE(0xbf24a0), f=CODE(0xbf24e8), d={su=64k, sw=1}, i={maxpct=10, size=256}, s={size=4096}}
About to copy base-diff dir to static
Copying diffs to dated static snap...Time: 0m, 12s.
mklabel@ /home/.snapdir/@GMT-2014.10.08-03.07.05/./._snapdat_=snap_copy_complete
after copy2staticsnap: complete
bug#8527: cp/mv in coreutils don't respect the default ACL of parent
Sorry, I didn't forward this to the right list... The user data / extended attribute forks are where linux stores the ACLs. ext4 should be configurable to do what you want to do, but I haven't personally used it -- but I understand it has similar functionality to xfs. The process umask is a masking-off of privs/permissions one sets on a normal file (ACLs aside). It affects the permission bits on the file. So if your umask was 077, then when you open a file for rwx rwx rwx, it would mask off group and other, allowing the permissions to be 700, or rwx --- ---. (I might have the order backwards, but it's the standard order you see in ls with numeric permissions)... Your umask will affect your file mode creation, but it depends on what flags you use when you use 'cp' -- which is one of the main points of my detail... After everything was shown to be working correctly in my case, a setting I have in an alias to my cp would have overridden any other settings and made it look like 'cp' ignored the directory ACL (or it sounds like you might be talking about group-ownership -- of a dir -- or are you talking about both?). Really, I'm not a member of the coreutils devel group, so I really prefer you send your answers and questions there, as they'll catch a lot more things than I would -- I was just showing an example of how your settings can override everything you think you are setting -- so you'll need to provide more detail about what your umask is (type umask at the prompt to see), and whether or not you have any aliases or ENV vars in effect that could alter things. If you can give an exact formula along the lines of what I did to demonstrate your problem, that will help the developers the most. The detail I gave was only to show how things you don't think of may be affecting you, and to be sure to check for them. I'm cc'ing the list on my reply, but leaving your email off of it, so if you want to ask them if they need more information that's fine...
otherwise, write down the exact commands you typed and your environment, to repeat it... (umask included). If you want to use my lsacl script... it's a trivial build on top of the chacl program. But please post to the list so everyone can be on the same page.

more lsacl
#!/bin/bash
acllen=0
for fn in "$@"; do
  out=$(chacl -l "$fn")
  qfn=$(printf %q "$fn")
  perm=${out#$qfn}
  thislen=${#perm}
  if ((thislen > acllen)); then acllen=$thislen; fi
  printf "%-${acllen}s %s\n" "$perm" "$fn"
done

Very trivial... but it allowed me to look at multiple files at a time... IF you can give a recipe or script that duplicates the problem you saw, that would be the best way to move this bug along (toward cockpit error or a new special case found!). Best of luck either way!
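As a concrete illustration of the umask point above (a sketch, not from the original thread): touch asks the kernel for mode 666, so with umask 077 the resulting file comes out 600; a program requesting full 777 would end up with 700, as in the rwx example.

```shell
#!/bin/bash
# Show umask masking off permission bits at file-creation time.
tmp=$(mktemp -d)
(
  umask 077          # mask off all group and other bits
  touch "$tmp/f"     # touch requests 666; 666 & ~077 = 600
)
stat -c '%a' "$tmp/f"   # prints: 600
rm -rf "$tmp"
```

Note the subshell: umask is per-process, so changing it in ( ) leaves the caller's umask untouched.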
bug#18386: logname fails with error logname: no login name (is there an echo in here)
Linda Walsh wrote: I do not have auditing nor selinux incorporated, nor do I really want to. Seems like glibc relying on having a specific security model loaded would seem to be a bug. I.e. it should work on normal linux security without extra security modules configured. BTW, FWIW, including audit in the kernel enables this command to work, though who is still snafu'd... That's a more complicated issue, as not only has the login table moved, but its maintenance has been moved to systemd... Would have been nice if they'd updated the standalone version to work and had systemd use that, but... it's part of the systemd-lockin mentality.
bug#18503: [bug-report] the output of ls -lsh
On 09/19/2014 12:17 AM, Linda A. Walsh wrote: gemfield wrote: 4 * 1K blocks = 4.0K blocks -- er, 4.0K *bytes*. I think the ambiguity is that there is no unit in the output. With the human output options, bytes are the implicit unit rather than blocks. Those darn trees! Can't see the forest because of...
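The arithmetic behind that ambiguity, using coreutils' numfmt to mimic the human-readable rendering (a sketch):

```shell
# 4 blocks x 1K = 4096 bytes; the -h style output renders the
# byte count, not the block count, hence "4.0K" meaning bytes.
echo $((4 * 1024))             # prints: 4096
numfmt --to=iec $((4 * 1024))  # prints: 4.0K
```

So "4" (blocks) and "4.0K" (bytes) describe the same allocation, which is exactly where the confusion comes from.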
bug#18386: logname fails with error logname: no login name (is there an echo in here)
Bernhard Voelker wrote: On 09/02/2014 02:06 AM, Linda Walsh wrote:
logname
logname: no login name
logname(1) works here on a regular openSUSE-13.1 system, and just calls getlogin(3) to get the information as required:
$ ltrace logname
...
getlogin() = "berny"
--- With the same I get:
ltrace -f logname | more
[pid 40807] __libc_start_main(0x4017c0, 1, 0x7fff8b0306f8, 0x4042d0 <unfinished ...>
[pid 40807] strrchr("logname", '/') = nil
[pid 40807] setlocale(LC_ALL, "") = "LC_CTYPE=en_US.UTF-8;LC_NUMERIC=..."
[pid 40807] bindtextdomain("coreutils", "/usr/share/locale") = "/usr/share/locale"
[pid 40807] textdomain("coreutils") = "coreutils"
[pid 40807] __cxa_atexit(0x401c20, 0, 0, 0x736c6974756572) = 0
[pid 40807] getopt_long(1, 0x7fff8b0306f8, "", 0, nil) = -1
[pid 40807] getlogin() = nil
which in turn seems to retrieve the information like this:
$ strace logname
...
open("/proc/self/loginuid", O_RDONLY) = 3
--- I don't have a /proc/self/loginuid. How is it enabled in a vanilla kernel?
read(3, "717", 12) = 3
close(3) = 0
socket(PF_LOCAL, SOCK_STREAM|SOCK_CLOEXEC|SOCK_NONBLOCK, 0) = 3
connect(3, {sa_family=AF_LOCAL, sun_path="/var/run/nscd/socket"}, 110) = 0
sendto(3, "\2\0\0\0\v\0\0\0\7\0\0\0passwd\0", 19, MSG_NOSIGNAL, NULL, 0) = 19
...
Don't you have /proc mounted? --- Yup, mine is stock from kernel.org. I don't recall seeing any option for loginuid. What module is it? I probably don't have it mounted.
bug#18386: logname fails with error logname: no login name (is there an echo in here)
I do not have auditing nor selinux incorporated, nor do I really want to. Seems like glibc relying on having a specific security model loaded would seem to be a bug. I.e. it should work on normal linux security without extra security modules configured. BTW, I specifically try to specify differences between my installation and standard in case they do contribute to any problems. It wouldn't be in my interest to complicate issues by hiding such details. Whether or not they 'should' be relevant is another matter, though. That's why I try to operate with vanilla versions of some things (kernels being most common) in my installation, as a bellwether for compatibility.
bug#18386: logname fails with error logname: no login name (is there an echo in here)
This is at a pty:
tty
/dev/pts/0
logname
logname: no login name
logname --version
logname (GNU coreutils) 8.21
Copyright (C) 2013 Free Software Foundation, Inc. License GPLv3+: GNU GPL version 3 or later http://gnu.org/licenses/gpl.html. This is free software: you are free to change and redistribute it. There is NO WARRANTY, to the extent permitted by law.
Written by FIXME: unknown.
I notice that while 'whoami' prints my login name (as does id, with a bunch of groups), the who command also seems AWOL:
Ishtar:/ who --version
who (GNU coreutils) 8.21
Copyright (C) 2013 Free Software Foundation, Inc. License GPLv3+: GNU GPL version 3 or later http://gnu.org/licenses/gpl.html. This is free software: you are free to change and redistribute it. There is NO WARRANTY, to the extent permitted by law.
Written by Joseph Arceneaux, David MacKenzie, and Michael Stone.
Ishtar:/ who -a
Ishtar:/ who -t
Ishtar:/ who -u
Ishtar:/ who -b
Linux Ishtar 3.16.0-Isht-Van #1 SMP PREEMPT Tue Aug 12 11:11:02 PDT 2014 x86_64 x86_64 x86_64 GNU/Linux
I wonder if the who command's output being broken is related? Distro is a 13.1 SuSE base, but I wouldn't call it a stock system (it still boots w/sysvinit).
bug#18186: cat.c needs fixing since 1998
Paul Eggert wrote: James Simmons, President CEO wrote: cat file1.txt file2.txt > output.txt -- file1.txt contains a single line containing IP addresses. file2.txt contains a single line with more IP addresses. output.txt SHOULD contain a single line. If the input contains two lines, the output should too. Sorry, I don't see a bug here. --- If he is using an editor like Vim, it will not show you the last linefeed in the file, because it always inserts one automatically unless you are in binary mode. I had a similar problem with vim force-inserting an LF on a file update, just to help me. The problem comes when you want to concatenate text files where neither is terminated by an LF. I think that may be what the OP is expecting, not knowing that the problem may be in their text editor, which won't inhibit LF@EOF in text mode even though it has a flag for it.
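A quick way to see the effect (file contents here are hypothetical): files that lack a final newline concatenate onto one line, while editor-saved files, which usually get a newline appended, stay on separate lines.

```shell
#!/bin/bash
# Concatenating two files that have no trailing newline.
tmp=$(mktemp -d)
printf '1.2.3.4' > "$tmp/f1"      # no final LF
printf '5.6.7.8' > "$tmp/f2"      # no final LF
cat "$tmp/f1" "$tmp/f2"           # prints: 1.2.3.45.6.7.8

# After an editor "helpfully" appends an LF to f1:
printf '1.2.3.4\n' > "$tmp/f1"
cat "$tmp/f1" "$tmp/f2"           # now two separate lines
rm -rf "$tmp"
```

Note that wc -l counts newline characters, so the first pair concatenates to a count of 0 and the second to 1 -- a quick way to check whether your editor appended an LF behind your back.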
bug#17505: Pádraig: does this solve your consistency concern? (was bug#17505: dd statistics output)
Christian Groessler wrote: On 07/27/14 19:11, Linda Walsh wrote: It is more common to specify transfer sizes in SI and mean IEC if you are in the US, where the digital computer was created. People in the US have not adopted SI units and many wouldn't know a meter from a molehill, so SI units aren't the first thing that they are likely to be meaning. Computer scientists and the industry here grew up using IEC prefixes where multiples of 8 are already in use. I.e. if you are talking *bytes*, you are using base 2. I didn't grow up in the US, and grew up with the metric system, but when I'm talking about memory sizes I always mean IEC (2^10) and never SI (10^3). The only pitfall here is hard disk sizes, where I have to remember that they mean SI. --- I was trying to come up with some reason for Pádraig's belief that people usually mean SI when using such prefixes for computer-based units like bytes (2^3 bits) or sectors (2^12 bits)... now what power of 10 is that? I've never heard of anyone supporting Pádraig's position -- so I assumed it must be some foreign country where the metric system and metric prefixes are meant to apply to non-unary and non-base-10 quantities. Pádraig: where did you get your impression? When it comes to disk space -- computers always give it in IEC -- except where they've bought the line that mixing base-2 quantities and power-of-10 prefixes is a good thing; then they try to get others to buy into such. But the reality is that one can't express disk space exactly as a power of 10, as no small power of 10 lines up with a 512-byte multiple. I.e. the system is designed to be inaccurate and confuse the issue, to make it harder for consumers to do comparisons. I don't get the reason for the dynamic switch at all. Can somebody enlighten me? --- I think it was thrown in as a red herring, as I can't think of any useful case for it. Having the output vary units randomly, not at the behest of the user, doesn't seem especially useful.
bug#17505: Pádraig: does this solve your consistency concern? (was bug#17505: dd statistics output)
Pádraig Brady wrote: That was the original approach, but it is a bit worse than the dynamic approach, since it's common to specify transfer sizes in IEC units for SI-sized data. --- It is more common to specify transfer sizes in SI and mean IEC if you are in the US, where the digital computer was created. People in the US have not adopted SI units and many wouldn't know a meter from a molehill, so SI units aren't the first thing that they are likely to be meaning. Computer scientists and the industry here grew up using IEC prefixes where multiples of 8 are already in use. I.e. if you are talking *bytes*, you are using base 2. It is inconsistent to switch to decimal prefixes when talking about binary numbers. OTOH, if you are talking *bits*, I would say usage meaning SI units is more common. Bytes = 2^3 bits, not 10 bits. Now, I was willing to go so far as to not force incompatible or bad nomenclature upon others, but to use their own nomenclature when replying to them. If someone came up to you and spoke a question in French, would you answer them in English and make some comment about people using French by accident and how they really mean to use English? If your goal was clear communication, you'd try to answer in the language they were querying in (presuming you knew it). Only giving responses in English, when you accept input in French, would likely be thought insulting. If people are that concerned to get the output they want in SI, they might be bothered to use it on input (or read the manpage and find out how to make it happen). For those that are concerned to get the output they want in computer-compatible binary, you seem to be saying they are S-O-L, which seems a poor and selfish attitude to take. BTW, I was playing devil's advocate with my mention of the SIGUSR1 inconsistency. I'm still of the opinion that the dynamic switch of human units based on current transferred amount is the lesser of two evils, since this output is destined for human consumption.
If it is for human consumption, humans like consistency -- if they speak to you in one language, they likely appreciate being replied to in the same language... the same goes for terminology and units. If someone asks you how many kilometers it is to XXX and you come back with 38 miles, do you think that's a user-friendly design? cheers, Pádraig.
bug#17505: Pádraig: does this solve your consistency concern? (was bug#17505: dd statistics output)
Pádraig: you may have missed this as it was a reply to an old thread, but changing the subj and composing as new should prevent that (I hope). You were concerned that the user would get different outputs based on the previously suggested algorithm -- as well as possibly different output when SIGUSR1 came in. This idea seems to solve both of those -- so if the patch that was proposed for this were modified in line with this suggestion, would there be any further problems? Linda Walsh wrote: Found old bug, still open... Pádraig Brady wrote: On 07/16/2014 10:38 AM, Pádraig Brady wrote: http://bugs.gnu.org/17505#37 was proposed to do the following automatically (depending on the amount output): 268435456 bytes (256 MiB) copied, 0.0248346 s, 10.8 GB/s However that wasn't applied due to inconsistency concerns. I'm still of the opinion that the change above would be a net gain, as the number in brackets is for human interpretation, and in the vast majority of cases would be the best representation for that. --- One patch that would not be inconsistent: if the user uses units of a single system (i.e. doesn't mix 'si' and base-2 units in the same statement), then display the summary units using the same notation the user used: dd if=xx bs=256M ...(256M copied) vs. dd if=xx bs=256MB ...(256MB copied)... Note another reason to _not_ apply the patch is that requests to print the statistics can come async through SIGUSR1, and thus increase the chances of inconsistent output. --- This solves that too, since the units are decided when the command is parsed, so SIGUSR1 would use the same units as would come out in a final summary. Or is using units consistent w/what the user uses not ok? Note: for statements w/o units (or with mixed systems), there would be no reason to change current behavior.
bug#17505: dd statistics output
Found old bug, still open... Pádraig Brady wrote: On 07/16/2014 10:38 AM, Pádraig Brady wrote: http://bugs.gnu.org/17505#37 was proposed to do the following automatically (depending on the amount output): 268435456 bytes (256 MiB) copied, 0.0248346 s, 10.8 GB/s However that wasn't applied due to inconsistency concerns. I'm still of the opinion that the change above would be a net gain, as the number in brackets is for human interpretation, and in the vast majority of cases would be the best representation for that. --- One patch that would not be inconsistent: if the user uses units of a single system (i.e. doesn't mix 'si' and base-2 units in the same statement), then display the summary units using the same notation the user used: dd if=xx bs=256M ...(256M copied) vs. dd if=xx bs=256MB ...(256MB copied)... Note another reason to _not_ apply the patch is that requests to print the statistics can come async through SIGUSR1, and thus increase the chances of inconsistent output. --- This solves that too, since the units are decided when the command is parsed, so SIGUSR1 would use the same units as would come out in a final summary. Or is using units consistent w/what the user uses not ok? Note: for statements w/o units (or with mixed systems), there would be no reason to change current behavior.
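dd already distinguishes the two notations on input, so the proposal is just to echo the user's choice back on output. A quick check of the input side (the expected lines below are abridged; the exact summary wording varies by coreutils version):

```shell
# dd parses 'M' as 2^20 bytes and 'MB' as 10^6 bytes; the status
# line on stderr then reports the exact byte count either way.
dd if=/dev/zero of=/dev/null bs=1M count=1 2>&1 | grep bytes
# -> 1048576 bytes ... copied ...
dd if=/dev/zero of=/dev/null bs=1MB count=1 2>&1 | grep bytes
# -> 1000000 bytes ... copied ...
```

The suggestion above amounts to carrying this already-parsed distinction forward into the (256M copied) / (256MB copied) summary.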
bug#17505: Interface inconsistency, use of intelligent defaults.
Paul Eggert wrote: Linda Walsh wrote: "125MB/s is literally impossible with a 1Gbit/s line - there will be overhead" This comment is using the usual powers-of-1000 abbreviations for both the first figure (125 MB/s) and the second one (1 Gb/s), so it supports the assertion that powers-of-1000 are more common in ordinary usage. 125 MB/s is impossible because there is some overhead at lower protocol levels, which means that you cannot possibly transfer 1 Gb of data over a 1 Gb/s line in one second, i.e., you cannot possibly transfer 125 MB of data over that line in one second, and that's what the comment says. --- I see what you are saying, but having done that measurement myself, I can assure you that 125MB/s is exactly what 'dd' reports (using direct I/O). As I stated previously, when talking about bits, I see decimal usage as often as not. But when people talk about timings, they want to know how long it will take to transfer the data on their disk -- given in base-2 units. Compare to 'ls', 'du' -- all give base-2 units. If you think about it, the only way it would be impossible is if they thought it was 125 * 2^20. But getting 125*10^6 is relatively trivial if your overhead is under 1% -- dd won't show it. I could ask for clarification whether they were using 2^20 or 10^6 for M. But 'dd' only requires that the overhead be less than .4 or .5% to display 125.
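The arithmetic behind the 125 figure (a sketch; awk is used only for the final division):

```shell
# 1 Gbit/s = 10^9 bits/s; at 8 bits/byte that is 125,000,000 B/s,
# i.e. exactly 125 MB/s in SI units but only ~119.2 MiB/s in
# binary units -- which is why the commenters thought it impossible.
echo $((1000000000 / 8))                              # prints: 125000000
numfmt --to=si 125000000                              # prints: 125M
awk 'BEGIN { printf "%.1f\n", 125000000 / 1048576 }'  # prints: 119.2
```

So reading dd's "125 MB/s" as 125 MiB/s makes a merely-near-saturated link look like a physical impossibility.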
bug#17505: Interface inconsistency, use of intelligent defaults.
Pádraig Brady wrote: On 05/16/2014 11:01 AM, Ruediger Meier wrote: On Friday 16 May 2014, Pádraig Brady wrote: The attached patch changes the output to: $ dd if=/dev/zero of=/dev/null bs=256M count=2 2+0 records in 2+0 records out 536870912 bytes (512 MiB) copied, 0.152887 s, 3.3 GiB/s Thanks! What about just 512 M which looks IMO better, is a valid input unit and is explained in the man page. That would be less clear I think, since in standards notation 512M is 512000000. Also adding the B removes any ambiguity as to whether this referred to bytes or blocks. --- Since 'B' already refers to (most commonly) 2^3 bits of information, saying KiB = 1024 information Bytes. What other type of bytes are there? I would acknowledge some ambiguity when using the prefixes with 'bits', but with 'Bytes' you only have their usage/reference in relation to 'information'. Note that in the information field, when referring to timings, milli, micro, nano -- all refer to an abstract, non-information quantity (time in 's'). When referring to non-computer units, SI prefixes would be the default. But for space, in 'bytes' -- bytes are an 'information unit' that has no physical basis for measurement. I think the SI standard was too hastily pushed upon the nascent computer industry by established and more dominant companies that were used to talking about units that relate to concrete physical quantities. I'm beginning to wonder how one would go about correcting the SI standard so as not to introduce inaccuracies in measurement in the computer industry.
bug#17505: Interface inconsistency, use of intelligent defaults.
Paul Eggert wrote: Pádraig Brady wrote: The attached patch changes the output to: $ dd if=/dev/zero of=/dev/null bs=256M count=2 2+0 records in 2+0 records out 536870912 bytes (512 MiB) copied, 0.152887 s, 3.3 GiB/s I recall considering this when I added this kind of diagnostic to GNU dd back in 2004, and going with powers-of-1000 abbreviations because secondary storage devices are normally measured that way. For this reason, I expect many users will prefer powers-of-1000 here. This is particularly true for transfer rates: it's rare to see GiB/s in real-world prose. So it'd be unwise to make this change. --- When users see 512 MB copied, they expect it to mean 512*1024*1024. The same goes for the GB/s figure. If you went with Gb/s -- that's different, as we are more used to seeing bits/s, which is why I could go either way on that. The simplest thing to do is to leave dd alone, which is my mild preference. Alternatively, we could make the proposed behavior optional, with the default being the current behavior. If we do that, though, the behavior shouldn't be affected by the abbreviation chosen for the block size. Even if the block size is given in powers-of-1024 (which is common, because block sizes are about internal memory units, where powers-of-1024 are typical), the total number of bytes transferred and the transfer rates are more commonly interpreted in the external world, where powers-of-1000 are typical. --- What external world are you talking about? Where do you talk about MB or GB/s outside of the computer world? If what you said were true, then people wouldn't have responded that 125MB/s was impossible (in the external world) on 1Gb ethernet. Yet that's what 'dd' displays. See http://superuser.com/questions/753597/fastest-way-to-copy-1tb-safely-over-the-wire/753617. See the comments under the 2nd answer.
"125MB/s is literally impossible with a 1Gbit/s line - there will be overhead..." (Bob) and "Without very significant compression (which is only achievable on extremely low entropy data), you're never going to see 125 MB/s in any direction on GbE." (allquixotic). They don't believe 125MB/s is possible even though that's what 'dd' stated. It never occurs to people talking about computers and speeds that someone has slipped in decimal -- it never happened before disk manufacturers wanted to inflate their figures. By not putting a stop to the nonsense that MB != 1024*1024 when disk manufacturers muddied the waters, it's led to all sorts of miscommunications. The industry leader in computing doesn't use KB to mean 1000B, nor M to mean 10^6... Microsoft's disk space and rates both use 1024-based measurements. So what external world (whose opinion matters in the computer world) are you talking about?
bug#17505: Interface inconsistency, use of intelligent defaults.
On programs that allow input and output by specifying computer base-2 powers of K/M/G OR decimal-based powers of 10: if the input units are specified in powers of 2, then the output should be given in the same units. Example: dd if=/dev/zero of=/dev/null bs=256M count=2 ... So 512MB total... but what do I see: 536870912 bytes (537 MB) copied, 0.225718 s, 2.4 GB/s Clearly 256*2 != 537. At the very least this violates the design principle of 'least surprise' and/or 'least astonishment'. The SI suffixes are a pox put on us by the disk manufacturers, because they wanted to pretend to have 2GB or 4GB drives when they really only had 1.8GB, or 1907MB. Either way, disks are created in multiples of 512 (or 4096) byte sectors, so while you can exactly specify sizes in powers of 1024, you can't do the same with powers of 1000 (where the result must be some multiple of 512, or 4096 for some new disks). If I compare this to df, and see my disk taking 2G, then I should be able to xfer it to another 2G disk, but this is not the case due to immoral actions on the part of disk makers. People knew, at the time, that 9600 baud was 960 characters/second -- it was a phone communication speed where decimal was used, but for storage, units were expressed in multiples of 512 (which the power-of-10 prefixes are not). (Yes, I know that for official purposes, and where the existing establishment held sway before the advent of computers, metric-base-10 became understood as power-of-10 based, but in computers there was never confusion until disk manufacturers tried to take advantage of people.) Memory does not come in 'kB', 'mB' or 'gB' (kmg=10^(3*{1,2,3}))... it comes in sizes of KB/MB/GB (KMG=2^(10*{1,2,3})). But this isn't about changing all units everywhere... but maintaining consistency with the units the user used on input (where such can be verified). Reasonable? Or are inconsistent results more reasonable? ;-)
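The surprise can be reproduced with coreutils' numfmt (a sketch): the byte count is exact either way; only the prefix system changes.

```shell
# 2 blocks of 256M (2^28 bytes each) total 536,870,912 bytes.
echo $((2 * 256 * 1024 * 1024))   # prints: 536870912
numfmt --to=si  536870912         # prints: 537M  (what dd reports)
numfmt --to=iec 536870912         # prints: 512M  (what was requested)
```

Same number of bytes, but the SI rendering rounds 536.87 * 10^6 up to "537 MB", while the IEC rendering lands exactly on the "512M" the user typed.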
bug#17103: regression: cp -al doesn't copy symlinks, but tries to link to them (fail)
Kees Cook wrote: I outline some of it in the original commit: http://git.kernel.org/cgit/linux/kernel/git/torvalds/linux.git/commit/?id=800179c9b8a1e796e441674776d11cd4c05d61d7 --- (I had already read that, though not from that source)... It seems more like use of a blunt instrument rather than making use of the mode bits (or DACL) on a symlink. As far as the given reasoning for symlink control, I've not heard of any issues related to TOCTOU on devices/pipes or other file system objects that couldn't also be applied to files. I.e. do you know why they'd blanket-ban everything except files? The best example of hardlink insanity is for a system where /usr/bin is on the same partition as /tmp or /home. A local user can hardlink /usr/bin/sudo to $HOME/sudo, and when a flaw is found in sudo, the administrator will upgrade the sudo package. However, due to the package manager deleting /usr/bin/sudo and replacing it, the original sudo remains in $HOME/sudo, leaving the security flaw available for exploitation by the local user. --- OK, then why restrict hardlinks to symlinks -- they can't be setXID. Same with anything other than a file: they can't be used in the same way. The restrictions on 'non-files' became worse -- in that the DACL (incl. mode) is ignored. For files... disallow linking to setXid (or setcap) files, and for 'icing', disallow hardlinks to/from files in world-readable sticky dirs. Wouldn't those restrictions have given the same benefit... (focusing, BTW, on the restrictions on hardlinks)? TOCTOU races for hardlinks (like symlinks) also exist. Say some local root daemon writes to /tmp/bad-idea.log; a local user could hardlink (or symlink) this to /etc/passwd and destroy the system. --- In that case, should root's DACL-override ability trump the protections set up by the sticky bit?
If root programs are going to insist on using the same world-writeable sticky dirs as all other users, it seems only prudent that increased restrictions apply -- not only to protect root, but other users. If root can just overwrite any file -- say a tmp file that was about to be read back for final saving of an important file into a user's dir -- wouldn't that be equally bad? I.e. maybe root shouldn't be able to open an existing file (not owned by root) in a sticky dir -- but would need to move or remove it first. Wouldn't the above restrictions accomplish the same security goals with less impact to compatibility w/existing features?
bug#17138: how to respect target symlinks w/cp? problem?
Is some server bottled up somewhere? This bug was the last one I saw come through... Linda Walsh wrote: I was wanting to copy a source tree to a target where the target had some symlink'd dirs... /arch64/ \cp -r usr/. ../usr/. cp: cannot overwrite non-directory `../usr/././share' with directory `usr/./share' I have a setup on a bi-arch machine where /usr/share under each 'arch' points to a common /common/share/... I see options in the manpage for having cp respect the SOURCE symlinks, but see no option to have it respect symlinks in the target. Note: if I wanted the target's symlinks to be overwritten or ignored, I would use cp -fa; that would overwrite the symlinks (I think) and create them as directories, but barring that, why doesn't it just follow the path? The purpose of symlinks was to allow seamless redirection, but now more utils seem to be ignoring that usage, just like, on a security level, group access is being increasingly ignored. tar just overwrites the symlink with the dir w/o warning...
bug#17103: regression: cp -al doesn't copy symlinks, but tries to link to them (fail)
Kees Cook wrote: So, allowing a hardlink to a symlink means that symlinks owned by the target user could be hardlinked to in /tmp. --- How would that be different from the user creating a symlink in tmp? I.e. theoretically, they can already create a symlink in tmp. --- The attack gets more and more remote, but these kinds of flaws are not unheard of. --- If there's a URL to explain why this is needed, I'd love to read more. My background is computer science and I have worked in security, so I'm aware of the theory, but logically, I am still not seeing the chain of events. It seems like the protected symlink was designed for use in a world-writeable dir w/ the sticky bit set, so I'm not seeing the need for the extra check on hardlinks in relation to that. It seems more like use of a blunt instrument rather than making use of the mode bits (or DACL) on a symlink. As far as the given reasoning for symlink control, I've not heard of any issues related to TOCTOU on devices/pipes or other file system objects that couldn't also be applied to files. I.e. do you know why they'd blanket-ban everything except files? BTW -- you said: "Though this case of hardlink-copying a writable unowned tree is pretty unusual already! :)" --- The business of using groups for access control is being increasingly ignored in many system utils -- something I find very annoying. Maybe, with increasing use of user- and group-specific ACLs, someone might realize that group access can also be used selectively. Too many system utils are checking to see that they are root-user-only writeable, which supersedes a local site's security policy. "Regardless, if your system isn't at risk for such attacks, makes sense to turn it off if it's getting in your way. :)" --- I wasn't aware of the problem until just recently, and I think the hardlink part of it is 'not well considered' given the current evidence, but I'd really like to know better what made it a problem. At least we are agreed on 'cp' doing the most good for the most users.
That principle seems increasingly lost amongst the legalistically inclined (cf. shoveling snow off the sidewalk in front of your house, where owners were held liable if they had shoveled (i.e. taken mitigating action), but not if they didn't). Sigh.
bug#17138: how to respect target symlinks w/cp? problem?
I was wanting to copy a source tree to a target where the target had some symlink'd dirs. In /arch64/: \cp -r usr/. ../usr/. cp: cannot overwrite non-directory `../usr/././share' with directory `usr/./share' I have a setup on a bi-arch machine where /usr/share under each 'arch' points to a common /common/share/... I see options in the manpage for having cp respect the SOURCE symlinks, but see no option to have it respect symlinks in the target. Note: if I wanted the target's symlinks to be overwritten or ignored, I would use cp -fa; that would overwrite the symlinks (I think) and create them as directories, but barring that, why doesn't it just follow the path? The purpose of symlinks was to allow seamless redirection, but now more utils seem to be ignoring that usage, just as, at the security level, group access is being increasingly ignored. tar just overwrites the symlink with the dir w/o warning...
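The failure can be reconstructed with a small scratch tree. This is a sketch: the paths are illustrative stand-ins for the reporter's /arch64 layout, and the refusal shown is GNU cp's behavior as reported.

```shell
# Scratch layout: the target's 'share' is a symlink to a shared directory,
# while the source has a real 'share' directory (illustrative names)
mkdir -p src/usr/share common/share tgt/usr
ln -s ../../common/share tgt/usr/share
# GNU cp (per the report) refuses rather than following the target symlink:
cp -r src/usr/. tgt/usr/. || true
```

Either way the target symlink itself is left in place; the complaint is precisely that cp will not descend through it.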
bug#17103: regression: cp -al doesn't copy symlinks, but tries to link to them (fail)
Eric Blake wrote: On 03/26/2014 10:44 PM, Linda Walsh wrote: cp has a workaround for directories and it has exactly this workaround on other OSes that don't support hardlinking. That workaround is behavior mandated by POSIX, and has existed prior to POSIX even being standardized. I don't see why this shouldn't be treated similarly to the 2nd case, as the OS no longer supports hardlinking in as many cases as it used to -- so why shouldn't it fall back? It's better to not second-guess the kernel; you may want to take this up on the kernel lists if you want something changed. --- It's not second-guessing -- it's responding to a lower-capability (or less privileged) environment. It also depends on whether or not those features are turned on (i.e. by assigning 0/1 to /proc/sys/fs/protected_{hard,soft}links). AFAIK, my vendor may have them set somewhere in boot code I haven't audited (I've never had a need to audit my startup code till they started forcing systemd down everyone's throat and putting dummy wrapper calls in the sysVinit code). I.e. they've converted it from a system where doing a 'cp -al' worked reliably to one where it doesn't. If it doesn't work reliably, then it seems the, *cough*, POSIX mandate should be followed. But the outcome would be that cp would still just work -- just within the bounds of what it is allowed to do. Instead the attitude seems to be: gosh, if we can't have it the way it was, we shouldn't try to have it at all. The kernel bug is a separate issue -- since I CAN link to the file (permissions allow it), but the same permissions on the link are ignored. Note -- I grant that permissions AND ownership on symlinks have always been ignored (at least in my experience), but if they are going to start using permissions to enable/disallow hardlinking, maybe they shouldn't carve out a special exception for symlinks. But those are separate from how cp should behave on filesystems with varying, assumed capabilities (i.e. failing because one can't link to a symlink, when linking to symlinks isn't a requirement for this to be allowed on systems that don't support symlinking at all). I.e. as it stands, the ability to hardlink to a file is dependent on what features and policies your kernel has built in. cp should work as well as possible regardless of those policies. I.e. what is the POSIX policy for handling linking requests when the OS has disabled them? If they wanted to disable copying the tree, they would make it non-readable.
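The knobs in question can be checked directly under /proc/sys/fs; a value of 1 means the restriction is active (a quick sketch, valid on Linux 3.6 and later):

```shell
# Read the current hardlink/symlink protection settings (Linux 3.6+)
cat /proc/sys/fs/protected_hardlinks
cat /proc/sys/fs/protected_symlinks
```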
bug#17103: regression: cp -al doesn't copy symlinks, but tries to link to them (fail)
Pádraig Brady wrote: On 03/27/2014 02:10 PM, Linda Walsh wrote: But those are separate from how cp should behave on filesystems with varying, assumed capabilities... (i.e. failing because one can't link to a symlink when linking to symlinks isn't a requirement for this to be allowed on systems that don't support symlinking at all). I.e. as it stands, the ability to hardlink to a file is dependent on what features and policies your kernel has built in. cp should work as well as possible regardless of those policies. Agreed, but :) Old systems that didn't support hardlinks to symlinks would not depend on that functionality, and thus the workaround of creating new symlinks is OK. Going forward, all systems will support hardlinks to symlinks and those systems might rely on that functionality. The above statement is no longer true on linux with the new feature -- which is enabled by default (I find nothing under '/etc/' that would change or reference 'protected_' other than some reference where it is in an ENV string, but nothing sets it to 'on' at boot). I'll have to reboot my machine to find out for sure as it's been up for 26 days, but will **likely be out for the rest of the day**. Since some distros are shipping it that way by default now, the above statement doesn't always apply on linux-based systems. This is my main concern with the fallback. The inconsistency concern (with not also handling setuid files or fifos etc.) is valid also, but secondary, as it's less likely and shouldn't cause a logic issue like the above. --- The above wouldn't work on a linux system 3 years ago if the fs they ran that on was a windows-type file system -- an esoteric case, but possible -- it's not a very portable construct to begin with.
bug#17103: regression: cp -al doesn't copy symlinks, but tries to link to them (fail)
Kees Cook wrote: Regardless of the outcome for cp, it seems like turning off this restriction on the system you're doing this on would be the best short-term solution. It sounds like you're not using a Debian or Ubuntu system, which carry defaults in /etc/sysctl.d/ files. That's where the user's mods can go. There are 6 other locations, w/5 of those being directories, to check. Found the culprit in /usr/lib/sysctl.d/50-defaults (comment inside was 'distribution defaults'). So openSUSE went with the other lemmings... and made it a default. Have already reviewed and expunged unwanted settings. Fedora's systemd likes to put defaults into /lib/sysctl.d, if that helps you track it down. I think systemd recognizes /etc/sysctl.d for overriding settings in /lib/sysctl.d, so you might be able to set it there instead. You might find they are moving it to /usr/lib as well -- supposedly that's why suse is doing it.
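The hunt described above can be scripted. This is a sketch: the fragment name 99-local.conf is an arbitrary choice, the override needs root, and the set of sysctl.d directories varies by distro:

```shell
# Find which packaged fragment sets the default, then override it from /etc
grep -rs protected_hardlinks /usr/lib/sysctl.d /lib/sysctl.d /etc/sysctl.d
echo 'fs.protected_hardlinks = 0' > /etc/sysctl.d/99-local.conf
sysctl --system    # re-read every sysctl.d fragment in precedence order
```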
bug#17103: regression: cp -al doesn't copy symlinks, but tries to link to them (fail)
Some other thoughts on the rest... Kees Cook wrote: Yeah, this is a poor interaction. The trouble with hardlinks is that they retain the original ownership. Seems like doing the fallback would serve most users best. Just keep it documented in the case of really weird stuff like above. Though this case of hardlink-copying a writable unowned tree is pretty unusual already! :) --- It was something that happened recently when I was in a hurry to build a new tree -- permissions were failing so I just made sure group root had write access (as I was already in group root). So, how it used to be was that I didn't have write access to the files, so I couldn't save edits w/o renaming the original and then saving the edits as a new version. I spent more time tracking down the problem because I got the error this time even with the group perms set... that led me to the hardlink-symlink practice... So the only reason I had write access was to get around the first bits of this problem -- I wonder if they disallowed any non-regular file in a more recent update than when the other stuff went in. Either that, or a package update from my vendor added the default-on rule. I spent more time investigating this time (as I duped 3.13 - 3.13.7 and applied patches), because I found that I didn't need to reapply a local patch to the last kernel build w/new sources, and that was because my changes now didn't need to be made to a copy, but flowed right into the source tree for any sub-releases dependent on that Maj-Min combo. Oh well. I.e. I relied on the r/o access to keep myself in line... ;^). Yeah, it'd be nice if the symlink bits had meaning. However, relaxing this check on the kernel side results in bad scenarios, especially when combined with the symlink restrictions. E.g., creating a hardlink to a symlink that matches the directory owner in /tmp. Now an attacker controls the destination of a followable symlink. --- Huh? A symlink doesn't act like an SUID (or has that changed?) -- if the object the symlink pointed to was write-protected against the user, they would still be hard pressed to exploit it. How would having a pointer to the file (that still follows tree-traversal rules, i.e. only allowing access to paths the user could have gotten to anyway) confer new access rights to a user?
bug#17103: regression: cp -al doesn't copy symlinks, but tries to link to them (fail)
have a simple test case: as root (w/ umask 002): mkdir -p dir/{a,b} touch dir/b/file ln -s ../b/file dir/a/symfile --- So now tree should look like: tree -AFugp dir dir +-- [drwxrwxr-x root root] a/ | +-- [lrwxrwxrwx root root] symfile -> ../b/file +-- [drwxrwxr-x root root] b/ +-- [-rw-rw-r-- root root] file Now, w/normal user, who is in group root, try: cp -al dir dir2 cp: cannot create hard link ‘dir2/dir/a/symfile’ to ‘dir/a/symfile’: Operation not permitted --- Trying to link to a symlink is the bug -- it used to duplicate the symlink. This is a recent behavior change -- i.e. looking at earlier behavior, the symlinks, like the directories, are created as the 'user', and only files are linked to. Core utils version: 8.21 (suse rpm coreutils-8.21-7.7.7.x86_64) Any idea how this managed to be broken?
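The setup half of the test case runs anywhere; the failing cp is the part that needs a non-root user with fs.protected_hardlinks=1, so it is left as a comment in this sketch:

```shell
# Build the reported tree (per the report: as root, umask 002)
umask 002
mkdir -p dir/a dir/b
touch dir/b/file
ln -s ../b/file dir/a/symfile
# Then, as a non-root member of group root, with protected_hardlinks on:
#   cp -al dir dir2
#   cp: cannot create hard link 'dir2/dir/a/symfile' to 'dir/a/symfile'
```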
bug#17103: regression: cp -al doesn't copy symlinks, but tries to link to them (fail)
Pádraig Brady wrote: On 03/26/2014 06:08 PM, Linda Walsh wrote: have a simple test case: as root (w/ umask 002): mkdir -p dir/{a,b} touch dir/b/file ln -s ../b/file dir/a/symfile --- So now tree should look like: tree -AFugp dir dir +-- [drwxrwxr-x root root] a/ | +-- [lrwxrwxrwx root root] symfile -> ../b/file +-- [drwxrwxr-x root root] b/ +-- [-rw-rw-r-- root root] file Now, w/normal user, who is in group root, try: cp -al dir dir2 cp: cannot create hard link ‘dir2/dir/a/symfile’ to ‘dir/a/symfile’: Operation not permitted --- Trying to link to a symlink is the bug -- it used to duplicate the symlink. This is a recent behavior change -- i.e. looking at earlier behavior, the symlinks, like the directories, are created as the 'user', and only files are linked to. Core utils version: 8.21 (suse rpm coreutils-8.21-7.7.7.x86_64) Any idea how this managed to be broken? So I think the change to use hardlinks to symlinks rather than new symlinks happened with: http://git.sv.gnu.org/gitweb/?p=coreutils.git;a=commitdiff;h=594292a1 I.e. the new symlink behaviour only happened between v8.0 and v8.10 inclusive. So why is the hardlink to symlink being disallowed? I wonder is it due to protected_hardlinks: http://danwalsh.livejournal.com/64493.html As far as I know, you could never hardlink to a symlink, only to a file. A symlink is more like a directory in that regard. Looking at the article, it doesn't seem it should apply... it says 1) The solution is to permit symlinks to only be followed when outside a sticky world-writable directory, [it isn't in a sticky world-writable directory] or when the uid of the symlink and follower match, or when the directory owner matches the symlink's owner. [in the created dir, the symlink and directory owner match] 2) The solution is to permit hardlinks to only be created when the user is already the existing file's owner, or if they already have read/write access to the existing file.
[Already have r/w access to the file via being in group root and the group having write access] I think I did run into this change, though, and because of it, my system is less secure. ... I.e. in my use case, am copying a source tree into a 2nd tree. I am the directory owner of all dirs in the 2nd tree, but all the source files should allow my linking to them, as the permissions on the inode protect the contents of the inode. The access to create a hardlink to that inode has always been controlled by the directory owner, but it never gives them write access to the content. Their exploit case was for a stupid admin who chowned all files in a user's dir for them -- why would that EVER be done? I.e. the root admin should be immediately suspicious as to why they'd need that done. Since you can't hardlink directories, the only way for a foreign-owned directory to get into someone's home space would be if they opened the permissions on the parent and later closed them. The whole premise of their change relies on the user tricking the admin. But if that's so easy, the user already has root access, effectively, and the game's up. If we fell back to using symlinks, would that only push the perm issue on to when the symlink was followed? No; the file it points to, if you look at my example, is rw for the 'group' members.
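The "no new access" point can be demonstrated directly: a hard link is a second name for the same inode, so both names always report the identical mode, owner and link count (a minimal sketch with throwaway file names):

```shell
# Two directory entries, one inode: the mode bits travel with the inode,
# so creating a link never grants access the bits don't already allow
touch f.txt
chmod 600 f.txt
ln f.txt f.link
stat -c '%a %h' f.txt f.link   # same mode (600) and link count (2) for both
```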
bug#17103: regression: cp -al doesn't copy symlinks, but tries to link to them (fail)
Eric Blake wrote: On 03/26/2014 02:21 PM, Linda Walsh wrote: As far as I know, you could never hardlink to a symlink, only to a file. Wrong. --- How can you presume to say what I wrote about my 'knowledge' (as far as I knew) is wrong? You are implying that I wrote differently than I thought -- an accusation of deliberate lying if one was to look at your response (which I don't think is what you intended -- you were just trying to be abrasive in your answer). You could say what I knew was dated, but simply saying 'wrong' to that sentence shows a lack of understanding of what was said. Nevertheless, if it can't link to the symlink (just as it cannot link to directories), then it should copy the symlink. Whether it is a relative or an absolute symlink, the symlink is readable and can be copied. Furthermore, the man page says of -l: 'hard link files instead of copying'. AFAIK, symlinks are not files any more than directories are (or is that wrong too, and they are now considered to be 'files'?). FWIW, the behavior does seem to be caused by the new buggy security behavior. I.e. it makes no sense that I can hardlink to a file, but not a symlink in the same directory.
bug#17103: regression: cp -al doesn't copy symlinks, but tries to link to them (fail)
Pádraig Brady wrote: That is true, but I confirmed that this is caused by protected_hardlinks. Perhaps there is a blanket ban on symlinks if you're not the owner, since the symlink could be later changed to point somewhere more sensitive? Kees, do you know if this is the case? --- If you have 'write' access to the symlink, I would say yes; if not, then no. However, traditionally, the ownership and permissions on symlinks haven't been considered important. Still -- that I can link to a file but not a symlink is an obvious flaw in the implementation. I.e. I have write access to the file -- so I should be able to link to it under their new rules -- but I also have write access to the symlink, as the mode bits are 777. That's a bit bogus. They are creating a special case where there shouldn't be one. I'm the directory owner -- I should be able to create arbitrary 'entries' in the directory as I own the directory's content -- that's been the traditional interpretation. Though the traditional rules never applied to symlinks -- and now they've come up with an incompatible access method for symlinks... If they really wanted to make them non-linkable, they should start recognizing the mode bits on the symlink (to change the content of the symlink -- which, in this case, is where it points).
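The "mode bits are 777" remark is easy to verify: on Linux a symlink's reported mode is always 777, and there is no chmod that applies to the link itself (a small sketch; the target name is arbitrary and need not exist):

```shell
# A freshly created symlink reports mode 777 regardless of umask
ln -s some-target lnk
stat -c '%a' lnk   # 777 on Linux
```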
bug#17103: regression: cp -al doesn't copy symlinks, but tries to link to them (fail)
Paul Eggert wrote: I've managed to reproduce the problem on Fedora 20. It indeed is a kernel thing having to do with security. Not only can 'cp' not create hardlinks to symlinks; it also can't create hardlinks to regular files. Actually, we've already verified that creating hardlinks to regular files works, because I have write access to the file I am linking to. I also have write access to the symlink if you look at the mode bits. Pádraig Brady also verified this. There is more than one problem or issue here. The part that is in coreutils is that if for some reason a symlink cannot be linked to, then the symlink should be copied (not what the symlink is pointing to, but the actual contents of the symlink inode -- i.e. the redirection path). Like directories copied with cp -al, symlinks would be owned by the user if this situation comes up -- but they would still be functional. The kernel part of this is the set of changes associated with /proc/sys/fs/protected_{hard,soft}links -- which seem to be on by default, resulting in incompatible behavior. Pádraig referenced a blog where these new behaviors were justified on the theory that sysadmins might be easily tricked into doing random chowns of all files for users on request, without investigating or wondering why... *ahem*... yah...
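In shell terms, the coreutils-side fallback described above amounts to copying the symlink's redirection path rather than linking to its inode (a sketch with illustrative names; actual cp would do the equivalent via readlink(2) and symlink(2)):

```shell
# Recreate an equivalent symlink: same target string, new inode/owner
ln -s ../b/file orig.sym
target=$(readlink orig.sym)
ln -s "$target" copy.sym   # functional duplicate, owned by the copier
```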
bug#17103: regression: cp -al doesn't copy symlinks, but tries to link to them (fail)
Pádraig Brady wrote: I think I see the reason for excluding symlinks here. It's so one is able to remove a sensitive symlink and know there are no more links to it that could be later replaced back. Allowing that could bypass subsequent protected_symlinks checks. --- What would a 'sensitive symlink' be? I.e. AFAIK, any user can create a symlink to any file in a directory they own. That doesn't give any privilege that I am aware of -- it's just a shortcut to the file. Hardlinks, I could see that, since the user is incrementing (and changing) the link count in the inode, they'll end up changing the CTIME value -- before, nothing controlled that. Now we could fall back to creating separate symlinks, like we do on systems that don't support hard links to symlinks. In my case, that's 1) what I thought it was doing, 2) all that is needed, as I'm using the 2nd area as a build tree. I wanted the source files r/o (i.e. I'd prefer I not have write access so I don't accidentally change the sources in the source tree). But I didn't even think about this problem when I first encountered it -- I just made sure setGID was set on the dirs and made sure they were group-writeable. Of course, IMO, I'm less protected than I was before, as before, I'd have to explicitly su/sudo to root to change a root-owned file. But now... I can just write to it because I've ensured group has write. This could be useful given that these less security-sensitive symlinks might be just as useful to the user. I've attached a patch for illustration. The security wasn't an issue until the changes went in... couldn't be that long ago... but don't know. However I don't like it because it doesn't deal with, 1. fifos, device files, setuid regular files, nor, 2. relative symlinks that traverse outside the copied hierarchy. Well, that's ok. I'm not using it to create a mirror. Just a linked copy such that, if I wanted to change a source file, I have had to move the original out of the way to save any edits. Which now I don't... wonderful. But for devices, at least, recreating the same device by Maj,Min type would be fine for the purpose you mention. Not sure how often people use that to mirror... certainly one wouldn't normally think to link to a device in /dev: ln /dev/zero xxx ln: failed to create hard link ‘xxx’ => ‘/dev/zero’: Invalid cross-device link If it is in my own directory, it works, but not sure if I care... Fifos are a bit different, no? Aren't they unique per path? But still, if I was doing a copy of something that was unique, I wouldn't expect to be able to do anything other than make a symlink to it somewhere. suid files... someone might expect that... but it's never been a feature I've relied on, since almost any change used to turn off the suid feature and require explicitly resetting it... but it appears that isn't true now either. 3. Also if you were using `cp -al source mirror`, and subsequently wanted to use the link count to see what was added in source, then symlinks having a link count of only 1 would mess that up. Um... if you are trying to use cp -al source mirror and you can't copy the symlink (or create a symlink to it), the issue is moot. I.e. I think the proposal -- and what I thought cp already, effectively, did -- was link to what was linkable and copy the rest. It doesn't link to directories either -- which some OSes allow; does that mean take it out for files? If the OS disallows a 'linkto' action, then it would be sensible to at least try to make a copy of it. If I really want an exact copy, I'd su to root and probably use a dump/restore... So given this is a system security policy that's restricting the operation, and falling back to a less security-sensitive operation has at least the above 3 disadvantages, I'm not sure there is anything we should do here. --- What disadvantages? If cp already does this on systems that don't support hardlinking, that's what we have here. linux has changed, and no longer allows it except under special circumstances. Given that, as close a copy as possible should be made (-a)... I.e. I want cp to do *literally* what the manpage says... create a copy, but link to files (which posix can redefine to include directories -- but it would change whether or not linux could link to them). I don't consider the devices/fifos etc. to be files. They are types of FS objects, but a file is something that holds data (by common understanding, not posix newspeak). Go to google and type in "what is a file"... It's a resource for storing information... a collection of data or information... etc. That doesn't really include sockets, fifos, pipes and devices... Except for the setuid issue -- but even there -- if the user is in the affected group, that should be mutated into a copy as well. Since you mention that making copies is what is already done in situations where the OS doesn't support it, I can't see why you
bug#17103: regression: cp -al doesn't copy symlinks, but tries to link to them (fail)
Pádraig Brady wrote: On 03/27/2014 02:57 AM, Pádraig Brady wrote: I've attached a patch for illustration. However I don't like it because it doesn't deal with, 1. fifos, device files, setuid regular files, nor, 2. relative symlinks that traverse outside the copied hierarchy. Actually point 2 isn't specific to this issue at all, so forget about that disadvantage. --- And it isn't really a problem. Many times you'll see a link outside the hierarchy to a directory or such -- even a file -- rather than including it in the dir -- because hardlinks are often not practical (cross devs... etc). As for relative links outside the tree, they'd still work too if you are making a copy parallel in the same tree. I have these little gems in my linux source/build tree:
lrwxrwxrwx 1 15 Jun 26 2010 clean_tree -> kbin/clean_tree*
lrwxrwxrwx 1 15 Jun 26 2010 doit -> kbin/install_it*
lrwxrwxrwx 1 15 Jun 26 2010 install_it -> kbin/install_it*
lrwxrwxrwx 1 11 Feb 28 2013 kbin -> ../law/kbin/
lrwxrwxrwx 1  8 Mar 26 20:52 linux -> ish-3137/
lrwxrwxrwx 1 16 Jun 26 2010 list_source -> kbin/list_source*
--- But also for # of kernels: ls -1d asa-* ish-* linux-* | wc -l gives 64. At over 300 to now over 500MB/tree, that would be a lot of wasted space if they were not linked; but linked, even diff versions are relatively small, because of tree linking:
With            W/o
567M ish-3101   567M ish-3101
 12M ish-3102   567M ish-3102
 25M ish-3105   567M ish-3105
 27M ish-3106   567M ish-3106
 27M ish-3107   567M ish-3107
475M ish-311    581M ish-311
 18M ish-3113   581M ish-3113
 25M ish-3116   579M ish-3116
1.1G TOTAL      4.5G TOTAL
the only reason I can keep so many copies is most of it is duplicate info...
bug#17103: regression: cp -al doesn't copy symlinks, but tries to link to them (fail)
Paul Eggert wrote: Pádraig Brady wrote: I'm not sure there is anything we should do here. I looked at http://lwn.net/Articles/503671/ and as far as I can tell symlinks are vulnerable to none of the attacks they mention, because symlinks are unalterable. However, the non-symlink hardlink attacks are a real problem, and it would seem silly for cp -al to have a workaround for symlinks (which I expect we can do reasonably safely) when cp can't and shouldn't try to have a workaround for anything else. --- No? Why couldn't it create a device or other object under the user account? I.e. if I use a fifo in my build process at the top -- all that I need is for it to exist -- it doesn't need to be, and probably shouldn't be, a hardlink. cp has a workaround for directories and it has exactly this workaround on other OSes that don't support hardlinking. I don't see why this shouldn't be treated similarly to the 2nd case, as the OS no longer supports hardlinking in as many cases as it used to -- so why shouldn't it fall back? If the user is IN a group that is setGID, then it can be recreated under their UID; if it is another USER... again, that might not be what is needed -- maybe it needs to be the user who created the tree. It is possible to work around most of those cases, if not all. But most important -- what % of usage are those use cases for cp -al? I.e. copying trees w/devices, FIFOs, et al. that are owned by someone else? The dirs+files (regular) are the normal case; symlinks can be done because it makes sense; the rest, I think, should be there as well, but I don't care about them as much. So I'm with you; let's leave this one alone. --- core utils are becoming less functional and less core with every new feature. If you aren't flexible you'll eventually have next to nada.
bug#16287: RFE rm -x == --one-file-system
Bernhard Voelker wrote: * These coreutils programs have a --one-file-system option: cp du rm - Guess I got a bit carried away. On the above list with --one-file-system, only 'rm' is missing -x as a shorthand for it. I don't know that either BSD or Solaris have a one-file-system option, so it seems unlikely they would have a -x. secure rm does have -x, which has the same meaning. You can create any arbitrary set of conditions that fulfill your need for acceptance or denial. tar also uses -x and find uses -xdev. Clearly there are 4 other utils that use -x to mean stay on this file-system with -xdev being a weak fifth since find doesn't have many (if any) --long options.
bug#16282: revisit; reasoning for not using ENV vars to provide workarounds for POSIX limitations?
Linda Walsh wrote: And no matter what the name is, if it makes a standard utility behave in odd ways, it'll break scripts that don't expect the odd behavior. That's the essential objection here. Having rm -fr . not follow historical depth-first behavior and, out of sequence, check for a '.' is odd behavior. That's the essential objection -- and I'm trying to get back the original behavior -- not ask for some new behavior. -- The other alternative to this (which I'm not averse to) would be reading a system rc (and/or) a per-user rc config file that allows or disables various behaviors. Specifically, rm had both -i and -I to give different levels of prompting that could be put in an alias. It also had -f, --force, which were supposed to force never prompting and do what it could -- that extra switch was supposed to override such a check but was hamstrung -- yet it was specifically designed to circumvent the errors it could and be silent about it. Maybe rm -ffr to doubly force it?... Given the addition of -i, -I and -f over the years, it *seems* like this issue has ping-ponged back and forth between those who want to disable such functionality and those who want it. Only site-wide or per-user configurability of the command via .rc or ENV vars would seem to offer both sides what they want. To claim that ENV vars always cause trouble seems myopic at best, and just ignoring a long-standing issue invites custom versions that will allow no trackability of what is in effect. At least with ENV vars, they can be captured in an ENV snapshot or test (less so, config files).
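For reference, the three prompting levels being contrasted behave like this in GNU rm (driven non-interactively here so the sketch can run unattended; the file names are throwaways):

```shell
touch a.tmp b.tmp c.tmp
rm -f a.tmp            # -f: never prompt; missing operands are not errors
rm -f no-such-file     # still exits 0
yes | rm -i b.tmp      # -i: asks per file (answered here via yes)
rm -I c.tmp            # -I: only prompts for >3 files or recursive removal
```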
bug#16287: RFE rm -x == --one-file-system
Would it be possible to let rm have a -x flag to be consistent with other utils that use -x to mean --one-file-system? It seems to be a widespread convention.
bug#16287: RFE rm -x == --one-file-system
Bernhard Voelker wrote: On 12/29/2013 06:10 PM, Linda Walsh wrote: Would it be possible to let rm have a -x flag to be consistent with other utils that use -x to mean --one-file-system? It seems to be a widespread convention. Thanks for the suggestion. However, although -x is indeed a common option of several programs, we are reluctant to add new short options. I'd only consider doing so for compatibility reasons. I'm looking at compatibility reasons with coreutils programs that recurse directories. More important than other implementations would be an expectation of similar switch options within one distribution of these programs. Of the core utils that recurse directories, only chgrp does not have an option to stay on the current file system. All of the other *recursive* core utils that have the ability to isolate action to 1 file system have -x: chmod, cp, df, ls, dir, du. find uses -xdev, tar uses -x, secure rm (srm) uses -x, mkzftree uses -x (makes a zisofs). Primarily I was thinking about consistency in the coreutils -- for that matter, chgrp should probably follow suit in providing the ability to stay on 1 fs, and -x, as it's the only recursive util that doesn't provide that ability. As you mention, the only other 'rm' util, secure rm, also provides -x. Suppose you didn't put it to use to mean what all those other utilities use it for. How good would it be if it took on some completely different (and perhaps cross-purpose) meaning? Wouldn't consistency among those tools that have recursive options be desirable?
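The convention as it stands today can be seen side by side (GNU tools; a scratch directory stands in for a real mount boundary, so the -x flags simply have nothing to exclude here):

```shell
mkdir -p tree/sub && touch tree/sub/f
du -x tree                       # du: -x is short for --one-file-system
cp -ax tree tree2                # cp: likewise accepts -x
find tree2 -xdev -name f         # find's spelling of the same idea
rm --one-file-system -r tree2    # rm today: long option only, no -x
```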
bug#16282: revisit; reasoning for not using ENV vars to provide workarounds for POSIX limitations?
I didn't fully understand the reasoning for not wanting ENV vars to override unwanted behaviors. Specifically, I'm thinking about rm -fr ., but there are some others it could apply to as well. ENV vars are used to configure all sorts of GNU utils -- so why the reluctance to do so in order to provide backwards compatibility in overcoming prescribed limitations imposed by POSIX? It's not like it's impossible to create ENV vars that are unlikely to collide with normal ENV var usage, e.g. _rm::EXPERT=allow_dot[,..other features]. Adding colons to the middle of the env var should both prevent any accidental setting/usage and make such overrides easy to find and filter on: if all included '::' after the util name, for example, and all started with _, they would tend to collate together, and the '::' would likely be unique enough to filter on in a grep of the environment. If the issue was accidental setting or collision with other usage, something like that would seem to address it. If there are other issues, I'm not aware of them... Thanks for any input...
bug#16282: revisit; reasoning for not using ENV vars to provide workarounds for POSIX limitations?
Paul Eggert wrote: Linda Walsh wrote: Adding colons to the middle of the env var That would make the var impossible to use from the shell. That's what I thought you'd say -- meaning it would be well protected against accidental usage. However: env foobar::snore=1 | grep :: prints foobar::snore=1. And no matter what the name is, if it makes a standard utility behave in odd ways, it'll break scripts that don't expect the odd behavior. That's the essential objection here. Having rm -fr . not follow historical depth-first behavior and, out of sequence, check for a '.' is odd behavior. That's the essential objection -- and I'm trying to get back the original behavior -- not ask for some new behavior.
bug#16168: uniq mis-handles UTF8 (8bit) characters
Maybe he was hoping for a uniq [-b|--bytes] ? Suggestion to Shlomo (if you use bash): alias uniq='LC_ALL=C \uniq' or, if you want it in your shell scripts too: uniq() { LC_ALL=C command uniq "$@"; }; export -f uniq On 12/16/2013 9:33 AM, Pádraig Brady wrote: tag 16168 notabug close 16168 stop On 12/16/2013 01:50 PM, Shlomo Urbach wrote: Lines with CJK letters are deemed equal by length only, since the characters seem to be ignored. I understand this is due to locale. But, it would be nice if a simple flag would do a locale-free comparison (i.e. equal = all bytes are equal). If you want to compare byte by byte: LC_ALL=C uniq thanks, Pádraig.
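The LC_ALL=C workaround suggested above can be seen inline; in the C locale, uniq compares lines byte by byte:

```shell
# Adjacent identical lines collapse to one under byte-wise comparison:
printf 'one\none\ntwo\n' | LC_ALL=C uniq
```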
bug#16094: bug: cp/mv cannot copy/move a file's extended attrs if they start with 'security'
On 12/10/2013 12:52 AM, Pádraig Brady wrote: Note since you're writing to /tmp it might be an issue with tmpfs? --- df /tmp Filesystem Size Used Avail Use% Mounted on /dev/sdc2 7.8G 3.5G 4.4G 45% /tmp xfs_info /tmp meta-data=/dev/sdc2 isize=256 agcount=4, agsize=519101 blks = sectsz=512 attr=2 I don't think so... --- Have a look at whether the recent TMPFS_SECURITY and TMPFS_XATTR kernel options are enabled. Also there are acl mount options that might impact here too. --- zgrep TMPFS /proc/config.gz CONFIG_DEVTMPFS=y CONFIG_DEVTMPFS_MOUNT=y CONFIG_TMPFS=y CONFIG_TMPFS_POSIX_ACL=y CONFIG_TMPFS_XATTR=y They are enabled, but I don't think they are relevant, since /tmp is a normal xfs file system in my case. (Actually it's a dir on /var named /var/rtmp that gets 'rbound' (rbind) to /tmp, so my root can remain relatively static.)
bug#16094: bug: cp/mv cannot copy/move a file's extended attrs if they start with 'security'
I saved a file to my home directory on linux via windows. I wanted to move it to /tmp. I got: mv /home/law/tmp/oVars.pm /tmp mv: setting attribute ‘security.NTACL’ for ‘security.NTACL’: Operation not permitted So what's up with this? Shouldn't the NTACL be able to be stored/moved with the file?
bug#16094: bug: cp/mv cannot copy/move a file's extended attrs if they start with 'security'
On 12/9/2013 2:24 PM, Pádraig Brady wrote: So what's up with this? Shouldn't the NTACL be able to be stored/moved with the file? This would be security policy enforced by the system I suspect. I.E. mv is not filtering these explicitly. Ideas as to how? I.e. Is it part of the gnu libraries? I only build the standard linux security model into my kernel, so unless it's a part of a fs driver or something, I'm fairly sure it is not coming from the kernel...
bug#15992: 'ls' ignores term capabilities when generating color.
I logged in on a *dumb* terminal and did an 'ls'. Rather than a file list, I got: \x1b[00;32mwpad.dat\x1b[0m* \x1b[00mwpad_socks.dat\x1b[0m \x1b[00mwuredir.xml\x1b[0m \x1b[00mx.c\x1b[0m \x1b[00mx.c.orig\x1b[0m \x1b[00;32mx1\x1b[0m* \x1b[00mxaml\x1b[0m \x1b[01;35mxmision-web\x1b[0m/ \x1b[01;35mxrdp\x1b[0m/ \x1b[00mxrdp-sesadmin.log\x1b[0m \x1b[00;32mxtree117.zip\x1b[0m* \x1b[00;32mxtree2b-20050606.zip\x1b[0m* \x1b[01;35mxx\x1b[0m/ \x1b[01;35mxxx\x1b[0m/ \x1b[00;32myast2.txt\x1b[0m* \x1b[40;33;01mzero\x1b[0m \x1b[01;35mzips\x1b[0m/ \x1b[01;35mztars\x1b[0m/ \x1b[01;35mztest\x1b[0m/ \x1b[00mzyppinst\x1b[0m -- While I do have an alias that says: alias ls='ls -CF --show-control-chars --color=always' If the terminal HAS NO color capabilities, I would expect it not to display anything where color was selected, as the mapping for switching colors on a dumb terminal is empty. I tried settings for TERM of none, dumb, and (empty). All gave the same strings as would be correct for a 16-color terminal. IMO, ls shouldn't print out bogus strings for color that are not in the listed TERMinal's capabilities. Wouldn't that be the wisest course of action? Or is there a requirement, poSomewhereIx, to print garbage strings to terminals that don't have specific capabilities? ;-)
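Worth noting alongside the report: the alias forces --color=always, which emits escape sequences unconditionally. With --color=auto, ls suppresses them whenever stdout is not a terminal; a small sketch (scratch names adir/afile are made up):

```shell
tmp=$(mktemp -d); cd "$tmp"
mkdir adir; touch afile
ls --color=auto     # stdout is a pipe/file here: plain names, no escapes
ls --color=always   # forces escape sequences regardless of the terminal
```

This doesn't address the author's point about consulting the TERM capabilities, but it is the usual way to avoid the garbage on dumb terminals and in pipelines.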
bug#15986: df in current coreutils shows the device for loop mounts, not the file
On 11/28/2013 3:16 PM, Bernhard Voelker wrote: In the end, mtab as symlink to /proc/self/mounts is the right thing, as it presents the view of the kernel ... which is always the correct one. --- I can think of 2 cases where the kernel displays the wrong information concerning mounts. 1) rootfs -- I have no such device. I even tell the kernel on the boot command line root=/dev/sdc1. There is no excuse for displaying garbage in that field. 2) I specify the standard LVM names /dev/VG/LV -- which the kernel then mangles. The kernel gets it wrong again -- it loses the information of what the user mounted there. Device wise, they may be similar or the same, but conceptually, the kernel can't even get the names of my devices as I used them, correctly. The lvm people say the /dev/mapper/ names are not standard form and not to use them. If the kernel actually reflected the information the user provided, I'd agree that it was right. As it can't seem to either hold on to or echo that information, I would say it is defective.
bug#15926: RFE: unlink command already uses 'unlink' call; make 'rm' use 'remove' call
On 20/11/2013 22:32, Bernhard Voelker wrote: On 11/21/2013 01:48 AM, Linda Walsh wrote: Isn't it my computer? How do I override such a refusal? $ rm -rv $(pwd -P) removed directory: ‘/tmp/xx’ -- That doesn't give the same behavior and isn't what I want. Compare to cp. Say I want to create a copy of what is in dir a inside of a pre-existing dir b. In dir a are files and sub dirs. On some of those subdirs, other file systems *may* be mounted -- EITHER in the dir immediately under a, OR lower: I would use cp -alx a/. b/. Sometime later, I want to remove the contents of 'b' w/o disturbing 'b'. 'b' may have file systems mounted under it or not. Again, I would use the dot notation. rm -fxr b/. rm -fxr path/b, as you suggest isn't the same thing. Directories are containers. I want to work just with the contents -- not the directory itself. So how do I override the refusal -- and get the same results?
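One way to get the "contents only" effect the author wants from 'rm -fxr b/.' without the refused dot argument is GNU find's -mindepth/-delete, which removes everything below a directory but leaves the directory itself; a sketch with made-up names:

```shell
tmp=$(mktemp -d); cd "$tmp"
mkdir -p b/sub; touch b/f b/sub/g
# Delete the contents of b, but not b itself:
find b -mindepth 1 -delete
```

This is not a drop-in replacement for the requested rm behavior (e.g. it has -xdev rather than -x for the one-file-system case), but it treats the directory as a container the way the post describes.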
bug#15926: RFE: unlink command already uses 'unlink' call; make 'rm' use 'remove' call
On 21/11/2013 09:50, Bob Proulx wrote: Eric Blake wrote: Pádraig Brady wrote: as I don't see it as specific to rm. I.E. other tools like chmod etc would have the same requirement, and they might be handled with various shell globbing constructs. Even more generally find(1) could be used to handle arbitrarily many files and commands that don't support recursion internally. Could you explain why rm would get this and say chmod would not? Argh! Feature creep! The reason that rm should have it but chmod should not is that it is to work around the POSIX nanny rule around '.' and '..'. Chmod does not have such a nanny rule and therefore does not need that option. ... This is actually the best argument against it. It is a slippery slope. Let's not implement 'find' all over again. Let's just use '-F' to force rm to adhere to its original depth first path examination. -F disallows applying any path related rules until AFTER depth-first recursive execution has been completed on the path.
bug#15926: RFE: unlink command already uses 'unlink' call; make 'rm' use 'remove' call
On 21/11/2013 09:18, Bob Proulx wrote: Eric Blake wrote: But that's not what Linda is asking for. She is not asking to pull . out from under her feet. Actually as I understand it she is expecting the call to succeed if the system kernel allows it. I believe that is the way rm used to work before removing '.' was disallowed. --- Um... I *expect* .* to be unremovable. The entries . and .. are required in all directories. These are required for a properly structured directory. The only way to remove the . and .. entries in a directory is to remove the directory name itself: rm -fr /tmp/mytmp would remove mytmp + any contents (including the structural entries . and .. inside 'mytmp'). However, if I type rm -fr /tmp/mytmp/. as is implemented in most OS's, it would do a depth-first traversal removal. At least on linux, you couldn't remove . as it is your current position. You can remove the directory which . is in, as you show below: mkdir /tmp/testdir cd /tmp/testdir rmdir /tmp/testdir echo $? 0 ls -ldog /tmp/testdir ls: cannot access /tmp/testdir: No such file or directory /bin/pwd /bin/pwd: couldn't find directory entry in ‘..’ with matching i-node So I expect anything containing a trailing foo/. to FAIL -- but only AFTER it has already done the depth-first traversal. Adding the -f flag was to silence the error and have the exit code set to '0' despite any failures. POSIX mandates checking '.' *first* when doing a recursive removal, even with -f. So how about using -F as a gnu extension to ignore that case? That POSIX would have declared rm -fr . illegal on nanny grounds goes against the original spirit of why the -f flag was added in the 1st place. It was meant to force the issue, *if* possible (if permissions allowed). I have no issue with error messages due to permission problems -- as they'd indicate the directory wasn't cleaned out -- rm -fr . was to clean out the contents of a dir to ready it for some reuse.
So I propose adding a -F to force rm to adhere to its original algorithm and override the POSIX restriction (as well as serving the purpose of -f to force any removals). Instead, she wants a command that will recursively remove the children of ., but then leave . itself unremoved (whether by virtue of the fact that rmdir(.) must fail) --- I am missing this part. Why must it fail? And in fact as per my test case above it succeeds. --- You didn't remove . -- you removed the directory . is contained in. A direct removal of . or .. should be disallowed because they are a required part of any directory. To remove them you must remove the directory by name, but addressing the structural entries must fail.
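The distinction argued here can be shown directly: POSIX requires rmdir to fail when the final path component is dot, while removing the same directory by name succeeds. A minimal sketch:

```shell
tmp=$(mktemp -d); cd "$tmp"
mkdir d
rmdir d/. 2>/dev/null && dot_ok=yes || dot_ok=no   # must fail: final component is '.'
rmdir d && plain_ok=yes || plain_ok=no             # removing by name succeeds
echo "$dot_ok $plain_ok"
```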
bug#15926: RFE: unlink command already uses 'unlink' call; make 'rm' use 'remove' call
On 20/11/2013 16:03, Bernhard Voelker wrote: $ src/rm -r src/. src/rm: refusing to remove '.' or '..' directory: skipping 'src/.' --- That gets back to what Bob mentioned about it being a nanny-restriction. The inevitable comment to be asked by someone is Refuse? Isn't it my computer? How do I override such a refusal? I seem to remember reading that the -f flag was specifically added to override such a refusal w/no further comment. Answer: well, yeah it was, but they caught MS-itus, and wanted to put in are you really sure? (y/[n]), but weren't allowed to ask more questions, so it just wins because it's not your system anymore. **- Is it true that you can override this with a -supercalifragilisticexpialidocious flag? 1/2:-) I still think an ENV flag that lists the command and behavior to override would be a specific enough, yet general enough solution to safely make a case for allowing it. I.e. _EXPERT_=rm(.) command(feature1 feature2) find(.error)
bug#15943: [PATCH] doc: enhance diagnostic when rm skips . or .. arguments
On 20/11/2013 15:47, Bernhard Voelker wrote: - /* If a command line argument resolves to / (and --preserve-root + /* POSIX also says: + If a command line argument resolves to / (and --preserve-root is in effect -- default) diagnose and skip it. */ if (ROOT_DEV_INO_CHECK (x->root_dev_ino, ent->fts_statp)) { --- So it is easier to delete everything under '/' than under /tmp/. Hmm... Maybe, since '/' doesn't really delete the file system itself, but only files and dirs underneath '/', then the correct solution is: if a user says to remove /tmp/, it will remove everything under /tmp but not /tmp itself? That doesn't seem to be disallowed by POSIX... (it's a bit absurd, but as long as it conforms to POSIX it should be fine, right? ;-/)
bug#15926: RFE: unlink command already uses 'unlink' call; make 'rm' use 'remove' call
Since there is already an unlink command that corresponds to unlinking a file, but there seems to be no command corresponding to the POSIX remove call, it seems upgrading 'rm' to use the 'remove' POSIX call would be a beneficial move, given all the recent POSIX changes. So how about upgrading 'rm' to use the remove function so it would work on empty directories as well?
bug#15926: RFE: unlink command already uses 'unlink' call; make 'rm' use 'remove' call
On 19/11/2013 04:15, Pádraig Brady wrote: tag 15926 notabug close 15926 stop On 11/19/2013 11:56 AM, Linda Walsh wrote: Since there is already an unlink command that corresponds to unlinking a file, but there seems to be no command corresponding to the POSIX remove call, it seems upgrading 'rm' to use the 'remove' POSIX call would be a beneficial move, given all the recent POSIX changes. So how about upgrading 'rm' to use the remove function so it would work on empty directories as well. Well we have the -d option to rm to explicitly do that. --- Does the posix remove call require a -d? I thought it made more sense to have rm correspond to remove than to create another command called remove that called remove. Are you saying that you think it would be better to have a remove command that does this? As for the posix document you pointed at, I'm suggesting doing step 4 after step 1... same thing posix does, just reordering things a bit. This came up, BTW, because a useless error message was added to find such that if you used find dir/. -type d -empty -delete, it now exits with an error status telling you nothing, other than that it didn't delete the . entry from inside the dir (which most people know can't be done anyway). But now there is no way to determine if the find command failed, or if it is just making noise over things it never has done. Alternatively, I suggested adding a -f flag that would occur positionally before any path arguments (like the -H/-L/-P opts now), that would silence such failures and return a 0 status for delete failures that the OS cannot do. No useful purpose is served by the new error message, and creating it breaks compatibility with scripts.
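The -d option mentioned in the reply can be sketched briefly: GNU rm's -d removes an empty directory without needing -r, which is close to the remove(3) semantics the original request asks for:

```shell
tmp=$(mktemp -d); cd "$tmp"
mkdir empty
rm -d empty                     # empty directory: removed without -r
mkdir full; touch full/f
rm -d full 2>/dev/null || echo "rm -d refuses a non-empty directory"
```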
bug#15926: RFE: unlink command already uses 'unlink' call; make 'rm' use 'remove' call
On 19/11/2013 10:34, Eric Blake wrote: On 11/19/2013 11:17 AM, Linda Walsh wrote: Well we have the -d option to rm to explicitly do that. --- Does the posix remove call require a -d? Huh? There is no POSIX remove(1). --- Since when do you think of a call as being a command? --- Sorry, but to change that, you'd have to go back in time 30 or 40 years to when rm(1) was first written. People have grown to rely on 'rm(1)' being a wrapper around unlink(2), 'rmdir(1)' being a wrapper around rmdir(2), and 'rm(1) -R' being a wrapper around rmdir(2) or unlink(2) as needed. --- People relied on rm removing files in a depth-first search order for 30-40 years. POSIX changed that, requiring special checks for '.'. Scripts relied on that behavior for 30-40 years as well... If you want to use that reasoning, rm should go back to doing depth-first deletion and reporting an error with deleting . when it is finished. --- Sorry, but doing things in rm(1) in a different order than POSIX describes would lead to subtle breakage to lots of existing scripts. --- I claim not. Come up with 1 case where scripts rely on the current behavior -- to die before doing anything useful, vs. the pre-existing behavior which was to issue an error (suppressible with -f) on the final deletion failing. I am calling your bluff -- show me the scripts (other than a posix conformance test script), that would fail -- subtly or otherwise. I assert they don't exist for 2 reasons. The foremost being that working scripts cannot rely upon the deletion failure stopping any useful work being done by the command. The 2nd being that it was a new change in posix that appeared in gnu utils in only the past few years. The previous 25-35 years of scripts would have relied on it working as *documented* (depth first). Checking pathnames before you start depth-first traversal is not strictly depth first. By your own standards for not changing something, rm should be fixed to be the way it was for 30-40 years.
The problem is that by implementing that change, functionality was lost and removed from rm. The earlier version had more functionality. So you can't come up with scripts that rely on missing functionality to get things done. It's like relying on missing water to water your plants, or missing food to feed yourself. You can't rely on the absence of a feature to do something positive with it.
bug#15926: RFE: unlink command already uses 'unlink' call; make 'rm' use 'remove' call
On 19/11/2013 12:45, Bob Proulx wrote: Since when do you think of a call as being a command? We don't. But from what you wrote (Does the posix remove call require a -d?) it makes it appear that you think the posix remove(3) library call is a command. Because library calls do not take options like that while commands do. --- Well, that was a bit of metaphor, inasmuch as the POSIX calls talk about flag passing with a value of AT_REMOVEDIR to do something different... -d in a call would likely be passed as an option. --- Sorry, but to change that, you'd have to go back in time 30 or 40 years to when rm(1) was first written. People have grown to rely on 'rm(1)' being a wrapper around unlink(2), 'rmdir(1)' being a wrapper around rmdir(2), and 'rm(1) -R' being a wrapper around rmdir(2) or unlink(2) as needed. Agreed. Let's not break this. --- I'm not seeing what would break. Instead of failing to remove a non-empty directory, it would remove it (if it was empty and permissions allowed). --- People relied on rm removing files in a depth first search order for 30-40 years. When using 'rm -rf' they do. But only with the -r option. Never without it. --- Absolutely... that's what I was referring to, sorry for the lack of clarity. --- The change you have proposed would for the first time in all of those years have rm remove a directory without the -r option. That would be bad because it would introduce bugs into scripts that are expecting the existing behavior of rm not removing directories. (Unless the -r option is given.) --- I would ask how it would fail in a script -- i.e. we are only talking empty directories -- and currently an error would be returned if a script tried to do that. If the script was catching errors or running with -e, the script would terminate under the current implementation. So no script could ever work relying on itself to fail -- it's a contradiction in terms. --- POSIX changed that, requiring special checks for '.'. Scripts relied on that behavior for 30-40 years as well...
If you want to use that reasoning, rm should go back to doing depth first deletion and reporting an error with deleting . when it is finished. I actually agree with you on that point. ;-) (Regarding 'rm -rf .') --- Well, that's what much of my interest in this area and the newly created bug in find hinges on. --- Sorry, but doing things in rm(1) in a different order than POSIX describes would lead to subtle breakage to lots of existing scripts. I claim not. Come up with 1 case where scripts rely on the current behavior -- to die before doing anything useful, vs. the pre-existing behavior which was to issue an error (suppressible with -f) on the final deletion failing. --- I have seen many varied script writing styles. Some of them are atrocious indeed. I would expect to at some point see someone use rm on file system objects with abandon thinking that they won't remove directories. Suddenly they will be able to remove directories. That would be a change in behavior. Changing established behavior is bad. --- *Empty directories*... let's be clear. I am NOT proposing that rm recursively remove anything without -r... --- I routinely use the opposite side of things myself. I routinely use rmdir on empty directories knowing that if the directory is not empty that it will fail and not remove any actual file content. I could definitely see someone counting on the behavior that rm removes files but never removes directories. (Unless the -r option is given.) --- I use the fact that rm doesn't remove non-empty directories as well... I'm not questioning that behavior. If someone relied on using rm to clean out a directory of any fluff or dander that had built up... I strongly believe that if rm also removed *empty* directories, it would be more likely to be appreciated by someone who was using it to get rid of all the spurious file entries -- as empty directory entries are often just as spurious -- I'm often cleaning out skeletal trees of directories on my systems...
I've even thought about the usefulness of rmdir -r -- which would recursively remove empty dirs starting at the top (with a limitation for -xdev)... (I know about rmdir -p a/b/c), but empty, wider skeletons are more common than skinny+tall dir-structures (skinny = 1 dir wide, many dirs deep). --- I am calling your bluff -- show me the scripts (other than a posix conformance test script), that would fail -- subtly or otherwise. --- Anything I show would be contrived. But that doesn't mean that changing the behavior should be done. mkdir a b c; touch aa bb cc; for f in *; do rm $f; done With existing behavior the directories would remain. With the new proposed behavior the directories would be removed. That is a behavior change that should not be made. --- Why not? if you want 1 command that does both, why not implement a remove command == but wouldn't it
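A rough stand-in for the imagined 'rmdir -r' already exists in GNU find: -delete implies depth-first order, so an empty skeleton is pruned bottom-up while any directory still holding a file survives. A sketch with made-up names:

```shell
tmp=$(mktemp -d); cd "$tmp"
mkdir -p tree/a/b/c tree/keep; touch tree/keep/f
# Prune all empty directories, deepest first; non-empty ones remain:
find tree -type d -empty -delete
```

After this, tree/a/b/c is gone (each level becomes empty in turn), while tree/keep and its file remain.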
bug#15727: Bug: cp -a|-archive (w/-f|--remove-destination) breaks if one of files is a dir and other not
So why not enhance rsync regarding performance? --- It's designed for slow, remote copies. It special-cases local copies, but uses much of the old code. It also follows its manpage. It also does id translation and storage of ACLs and xattrs on file systems that don't support them. It has a completely different set of requirements. cp was designed for locally mounted file systems. Having cp really be able to do updates with -u, really treat destinations as files (-T), and really remove destinations before attempting a copy seems much more reasonable and in line with what cp should do. For that matter, 'cp' could _relearn_ a thing or two from 'dd' when it comes to speed. --- IMO no: this would add further bloat to the sources, which in turn is harder to maintain. Ever looked at copy.c, e.g. at copy_internal()? --- Yes. It's grown considerably in the past few years. But the main thing it could relearn from 'dd' is using larger buffer sizes. It's only half the speed of dd, so it's not like it's *that* slow (vs. rsync, being ~32x slower for a LAN-mounted file to a local copy). Whereas cp from ~5 years ago used to be about 80% the speed of 'dd'. So cp's speed has dropped by ~33%... That's what I meant by relearn a thing or two from dd (likely in regard to buffer size, where with dd I'll specify 16M for over-LAN copies). It used to be nearly as fast as 'dd', but now that often doesn't seem to be the case. --- hmm, well, could it be that this is because many features have been added over the years? --- ... possibly -- but if it used larger buffer sizes, that problem would mostly go away. Like I said -- it's not that much worse than 'dd'. --- I think rm is simply not the right tool for such kind of synchronization. --- Synchronization implies deleting files on target that don't exist on source. That's not updating, or treating the remote as a file for purposes of update (-T). You need to make the docs much clearer about cp's limitations. update isn't really update, and -T is certainly wrong at the very least.
If you feel you'd rather document cp's limitations, that's fine... cp is a great tool, don't get me wrong! But when it added update, -T, and --remove-destination, it started implying or promising more than you were willing to deliver. That should be documented. To be honest, I missed the 'file' bit in --remove-destination and focused on how it was different from -f in removing the destination before the copy (manpage says to contrast w/-f)... That still doesn't let it off the hook w/regards to -T, where it doesn't treat the destination as a file and remove it first.
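The buffer-size point made above can be illustrated with dd itself, which exposes the copy block size directly (16M is the size mentioned in the thread; the file names are scratch names for the demo):

```shell
tmp=$(mktemp -d); cd "$tmp"
dd if=/dev/zero of=src bs=1M count=4 2>/dev/null   # make a 4 MiB test file
dd if=src of=dst bs=16M 2>/dev/null                # copy it with a large block size
cmp src dst                                        # verify the copy is identical
```

Larger block sizes amortize per-read/write syscall overhead, which is the mechanism the author is pointing at when comparing cp against dd.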
bug#15727: Bug: cp -a|-archive (w/-f|--remove-destination) breaks if one of files is a dir and other not
On 10/27/2013 5:38 PM, Pádraig Brady wrote: So overwriting files with dirs and vice versa is prohibited by POSIX. The existing cp options do not adjust this aspect. If you don't care what's in the destination at all, why not just rm it before the copy? --- Um -- isn't that what the --remove-destination option is supposed to do? Also note, I tried to use it with the update option, since in my real case, in /var/log, some programs had changed from using a single log file to using their own dir under /var/log and multiple log files. It's not like I'm making that up. As for the posix comment: since when is FSF/Gnu governed by a corporate-controlled POSIX? I.e. POSIX is supposed to supply interop standards, not limitations on functionality.
bug#15727: Bug: cp -a|-archive (w/-f|--remove-destination) breaks if one of files is a dir and other not
On 10/28/2013 1:56 PM, Bernhard Voelker wrote: If you don't care what's in the destination at all, why not just rm it before the copy? --remove-destination option is supposed to do? info coreutils 'cp invocation' says: --- But if I specify -a or -r, those imply recursive. Why has it been assumed that recursive doesn't mean to remove files (and then empty directories) on the target? --- Also note, I tried to use it with the update option, [...] ... and it may be a question to discuss whether --remove-destination should be able to rmdir() empty directories, but that GNU extension should never help you out in the case of non-empty directories. --- But if combined with recursive? There's also the -T option, which says to treat the destination as a file and not a dir -- which could be construed as a deliberate choice on the part of the user to circumvent restrictions on directory removal -- i.e. specifically since its intent is to ignore the fact that the target is a directory and to treat it as a file. --- I got the overall impression that you try to sync a source to a destination. Tools like rsync may be better for such a scenario than cp(1), which is made primarily for what its summary says -- cp: copy files and directories. --- That was true until various options (and restrictions) were added. -f used to be more forceful about removing obstacles (though I can't say it ever used to override directory restrictions). But adding --update and --remove-destination to its copy 'mission' gave a basis for expanding the definition of cp. As for 'rsync', its speed is *comparatively* abysmal for local-to-local copies. For that matter, 'cp' could _relearn_ a thing or two from 'dd' when it comes to speed. Used to be that cp was nearly as fast as 'dd', but now that often doesn't seem to be the case.
I'm fine with the idea of requiring a '__GNU_ENHANCEMENTS__=[list]' option in the ENV, or consulting an rc file in something like (~/.rc/Gnu/package|tool|/etc/Gnu/coreutils) that could list areas to override restrictions/feature removals, like (cp_dir_restrictions, rm_dot_override[, etc.]), to restore or allow enhancements beyond the growing posix-restrictions. Even a posix=pre2003 would be useful, since it was around the 2003 updates that posix started changing from descriptive to proscriptive. Please remember the idea behind things like 'ksh' and 'bash' (bourne-again) was to allow features not in the original posix spec -- many of which posix eventually accepted. If there is nothing that provides an outlet for these, posix will only devolve instead of evolve. FSF/Gnu used to offer evolutionary options over posix -- but now... it seems fsf/gnu has decided to accept the posix restrictions as a ceiling. I still assert that any posix rules that go beyond the original POSIX mission statement to be Descriptive (or to provide for minimum requirements/features), and that move into the *proscriptive*, or disallowance of features, are, by definition, outside the bounds of the original POSIX and shouldn't really be abusing the name (I *liked* POSIX up until it started removing features!) Have a nice day, --- Cheers!
bug#15738: Inconsistency bug cp -a A B doesn't reliably create 'B' as a copy of 'A' even w/--remove-destination
with --remove-destination, I think this a bug. W/o it, it is inconsistent with the requirement not to use dirname/. to refer to contents in 'rm'. details: mkdir a touch a/file mkdir a/dir cp -a a b #'ok' touch a/file2 now: cp -au a b (or) cp -a a b Either form creates another complete copy of 'a' in directory 'b'. (Note, this is historical behavior). FWIW, cp -a a/ b/ behaves the same way. However, cp -a a/. b/. fails the 1st time, but not if 'b' is already created (by using a form from above). Technically, though, the above should fail, as it says to copy the dir-entry for '.' over the dot entry in 'b'. While using '.' as a shorthand for contents of a dir has been normal, it no longer works in utils like 'rm'. ---!!!--- *More troublesome* is that: cp -a --remove-destination a b doesn't work, but duplicates the behaviors of the above. If 'b' is removed *before* the copy, as is documented, then it should be no different than the case where 'a' is copied to a non-existent 'b'.
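The "another complete copy inside b" behavior described above is easy to reproduce; when the destination already exists as a directory, cp copies the source *into* it rather than over it:

```shell
tmp=$(mktemp -d); cd "$tmp"
mkdir a; touch a/file; mkdir a/dir
cp -a a b    # first copy: b is created as a copy of a
cp -a a b    # second copy: b exists, so a is copied *into* it as b/a
```

After the second command both b/file (from the first copy) and b/a/file (the nested copy) exist, which is the historical behavior the report describes.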
bug#15727: Bug: cp -a|-archive (w/-f|--remove-destination) breaks if one of files is a dir and other not
I was trying to update a copy of /var (=/root2/var). It issues non-fatal, failure messages when copying from or to a non-dir entry to a dir entry. I tried -a to force the removal of the target dir, or target file. It didn't work. I tried --remove-destination, which is clearly not file specific (i.e. should remove dir); it didn't work either. The problem doesn't happen with pipes or sockets overwriting files or any combo... seems just dirs are a problem.
mkdir a
touch a/file2dir
mkdir a/dir2file
cp -a a b
mv a/file2dir a/x
mv a/dir2file a/file2dir
mv a/x a/dir2file
--- state=
ll a b
a:
total 0
-rw-rw-r-- 1 0 Oct 26 20:37 dir2file
drwxrwxr-x 2 6 Oct 26 20:37 file2dir/
b:
total 0
drwxrwxr-x 2 6 Oct 26 20:37 dir2file/
-rw-rw-r-- 1 0 Oct 26 20:37 file2dir
now try copying new changes from 'a' to 'b' -- can't do it. cp -a a/. b/. cp: cannot overwrite non-directory ‘b/././file2dir’ with directory ‘a/./file2dir’ cp: cannot overwrite directory ‘b/././dir2file’ with non-directory -a/--archive is supposed to duplicate *types* on the target -- it should be able to make either change without --force or pre-removal. HOWEVER, I'd suggest 1) allowing directories to overwrite files, as that seems reasonable... and 2) with either -f OR --remove-destination, a file would overwrite a directory (dir removed). 2b) One might argue that with -f, where dest is a file, the failure comes first, aborting a try for removal; however, certainly, with --remove-destination, the destination (whatever it is) should be removed 1st and no error should be occurring. Version: (on suse linux)... cp (GNU coreutils) 8.21 - Copyright (C) 2013
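The report's scenario can be condensed into a self-contained reproduction; after swapping a file and a directory in 'a', copying over 'b' fails even with -a:

```shell
tmp=$(mktemp -d); cd "$tmp"
mkdir a; touch a/file2dir; mkdir a/dir2file
cp -a a b
# Swap the two entries' types in 'a':
mv a/file2dir a/x; mv a/dir2file a/file2dir; mv a/x a/dir2file
# Re-copy: cp refuses to overwrite a non-directory with a directory
# and vice versa, and exits nonzero:
cp -a a/. b/. 2>/dev/null && cp_ok=yes || cp_ok=no
echo "$cp_ok"
```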
bug#9987: closed (Re: bug#9987: [PATCH] groups, id: add -0, --null option)
GNU bug Tracking System wrote: Your bug report #9987: RFE: 'groups' command ADD command switches -0, and -1 which was filed against the coreutils package, has been closed. The explanation is attached below, along with your original report. If you require more details, please reply to 9...@debbugs.gnu.org. + id accepts a new option: --zero (-z) to delimit the output entries by + a NUL instead of a white space character. --- Curious: how many commands use -z to emit output with 0 termination vs. commands that have switches with '0' in them?
bug#9987: closed (Re: bug#9987: [PATCH] groups, id: add -0, --null option)
Pádraig Brady wrote: On 09/22/2013 03:08 AM, Linda Walsh wrote: GNU bug Tracking System wrote: Your bug report #9987: RFE: 'groups' command ADD command switches -0, and -1 which was filed against the coreutils package, has been closed. The explanation is attached below, along with your original report. If you require more details, please reply to 9...@debbugs.gnu.org. + id accepts a new option: --zero (-z) to delimit the output entries by + a NUL instead of a white space character. Curious: how many commands use -z to emit output with 0 termination vs. commands that have switches with '0' in them? === -0 === du (since v5.1.0) (POSIX doesn't mention -0) env (since v8.0) printenv (since v8.0) === -z === basename dirname id join readlink realpath shuf sort uniq sed grep --- Some of those are fairly new. At least for id and join, they aren't even in my manpages yet. The others using -z are fairly new, and the use of -z w/sed and grep is suspect given their long options both refer to null. You left out find and xargs -- which ARE core utilities even if they are not part of coreutils ;-). It seems like -0 was the standard in the initial utils -- probably because 'zero' is only one of hundreds of words for '0' across languages, but the arabic number system is used in almost all (if not all) countries/languages. It seems like someone, in the past few years, didn't follow the pre-existing standard for such functionality. Since -z is only a recent addition compared to the non-locale-specific -0, wouldn't it make sense to try to use -0 where it doesn't conflict, since -z can't be used in tools like grep, as there it indicates that input files are null terminated, and grep uses -Z for output? I thought -0 might conflict somewhere, but it seems it wouldn't conflict in any of the utilities. Wouldn't it be a better choice -- especially considering it is international? Or...
was there some reason why -z was chosen, as it makes the switches less consistent and more confusing...(especially given the age of the switches and that -Z has other meanings)...
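Whatever letter the switch ends up using, the point of NUL delimiting is easy to demonstrate. A minimal sketch (not from the thread) using find's -print0, which is the consumer-side convention both camps agree on: filenames may contain spaces and even newlines, and only a NUL can never appear in a name.

```shell
# A filename with an embedded newline breaks newline-delimited output
# but survives NUL-delimited output intact.
tmp=$(mktemp -d)
touch "$tmp/plain" "$tmp/has space" "$tmp/has
newline"

# Newline-delimited: 3 files produce 4 "lines".
find "$tmp" -type f | wc -l

# NUL-delimited: exactly one NUL per file; names arrive whole.
find "$tmp" -type f -print0 | tr -cd '\0' | wc -c

rm -rf "$tmp"
```

The same reasoning applies whether the producing switch is spelled -0 (du, env) or -z (id, sort, ...); the usual consumer is xargs -0.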
bug#15328: Bug or dubious feature? (coreutils consistency prob)
Linda Walsh wrote: Whatever the problem is, it's not in 'mv'... But there is a consistency problem in core utils. Even though /usr/share/fonts and /home/share/fonts are the same directory with /home/share being mounted with rbind on /usr/share, mv cannot move files between the two dirs without copying them. That might be ok, except that du -sh /usr/share/fonts /home/share/fonts/ sees them as 1 directory. Why does mv fail to rename (vs. physical copy) when du sees them as the same directory -- (and only lists the first one if trying to du both): Ishtar:/ du -sh /home/share/fonts/. /usr/share/fonts/. 8.8G  /home/share/fonts/. Ishtar:/ du -sh /usr/share/fonts/. /home/share/fonts/. 8.8G  /usr/share/fonts/. --- Also, is it intentional to leave out args from the listing of du -- I know it doesn't double-count the space (by default), but I would have thought to see a 2nd entry w/0 space: Ishtar:/ du -sh /home/share/fonts/. /usr/share/fonts/. 8.8G  /home/share/fonts/. 0  /usr/share/fonts/.
bug#15328: Bug or dubious feature?
I was trying to move a directory with the 'mv' command. Instead of renaming the files it copied and deleted them. The source and destination were on the same disk, which a stat of the source and destination would confirm. The oddity is that I was moving between /usr/share/fonts and /home/law/fonts, where I use rbind to mount part of home in /usr/share: (fstab) /home/share /usr/share none rbind stat shows: stat -c %D /usr/share /home/share fe03 fe03 So... why didn't it rename if the device numbers are the same? I'm sure it wouldn't try a rename if I moved from /usr to /usr/share, as share is a different partition/device number, so it seems 'mv' does check device ids to verify sameness upon move. Then why not in moving what was effectively /home/share/fonts/dirname = /home/law/fonts/. ? It was only 12G of files, so it was done in ~5-10 minutes, but I was expecting a few seconds...?? Coreutils 8.21 / linux 3.9.11 / xfs filesystem.
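As a hedged aside (not from the thread): when mv does take the rename(2) path, the inode number survives the move, which gives a cheap way to confirm no copy happened. A minimal sketch, assuming GNU stat's -c %i:

```shell
# If mv renamed rather than copied, the inode number is unchanged.
d=$(mktemp -d)
touch "$d/a"
before=$(stat -c %i "$d/a")
mv "$d/a" "$d/b"
after=$(stat -c %i "$d/b")
[ "$before" = "$after" ] && echo "same inode: renamed, not copied"
rm -rf "$d"
```

The report above shows that a matching device ID (stat -c %D) does not by itself guarantee this fast path is available.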
bug#15328: Bug or dubious feature?
Whatever the problem is, it's not in 'mv'... I tried to run my dedup'r, and got this: /usr/share/fonts/sys time ndedup /home/law/fonts/sys2/ . Paths: 34474, uniq nodes: 20192, total_size: 12.8GB (12.9GB allocated). ERROR: Invalid cross-device link doing link(./desktop.ini, /home/law/fonts/sys2//desktop.ini): \ (Invalid cross-device link) at /home/law/bin/ndedup line 1107 10.40sec 10.11usr 0.28sys (99.98% cpu) But this works (same dir's): /usr/share/fonts/sys cd /home/share/fonts/sys /home/share/fonts/sys time ndedup /home/law/fonts/sys2/ . Paths: 34474, uniq nodes: 20191, total_size: 12.8GB (12.9GB allocated). Nodes(done/uniq); Space(read / alloc'd) 35011/1275825.5GB / 7.2GB space ;cursz: 37.8MB; newlnks: 7433(1730 cmps_skip, 11046 cmps_full) 5.7GB in 7434 duplicate files found. 49.75sec 35.34usr 12.90sys (96.98% cpu) Anyone know if this is a documented kernel feature? Seems odd.
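It is documented, if obscurely: link(2) and rename(2) return EXDEV when the two paths are on different *mounts*, not merely different devices, and each bind mount is its own vfsmount even though st_dev matches. A hedged sketch of a guard, assuming GNU stat's %m format (mount point, coreutils >= 8.6):

```shell
# Same st_dev does not imply same vfsmount; stat -c %m exposes the
# difference by printing the mount point each path lives under.
same_mount() {
    [ "$(stat -c %m -- "$1")" = "$(stat -c %m -- "$2")" ]
}

d=$(mktemp -d)
if same_mount "$d" "$d/.."; then
    echo "same vfsmount: link(2)/rename(2) can succeed"
else
    echo "different vfsmounts: expect EXDEV even when %D matches"
fi
rmdir "$d"
```

Whether a given kernel permits a cross-mount link on the same filesystem is kernel policy, which is why checking %m first is the safer heuristic than comparing %D.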
bug#15157: bug#15127: grep not taking last conflicting option as choice?
Eric Blake wrote: On 08/21/2013 10:54 PM, Linda Walsh wrote: Ok, thank you for sharing, but doesn't '-E' mean egrep pattern syntax? That even '-E' fails, telling the user that they can only use the syntax they are specifying, seems abusive. That other options in grep DO take the 'last' option, but the syntax options are disallowed, is inconsistent, unuseful and creates breakages in existing scripts that don't know they should clear GREP_OPTIONS in order for egrep and fgrep to function correctly. Anyone that sets -E via GREP_OPTIONS is already breaking their own system, and we have no sympathy for their dumb action. --- Anyone making broad statements that apply to all of humanity about their own systems is hardly someone who should be commenting on 'dumb'. Why is it that some people use their positions to modify widely used source as a chance to implement power-over and domination over other people? Isn't the point of software to give users the freedom to make their choices -- to help them do their job? It's not to enforce a particular mind-think or dogma. An explicit error is better than silently making a multitude of scripts misbehave. They won't misbehave -- they will fail if the expressions are not compatible. There are few cases where someone deliberately needs | in an expression or + in a regex to NOT have its normal meaning. That doesn't mean they don't exist, but it is rare. Furthermore, if someone wants a particular *engine* for matching -- which was my point -- the engine on the command line would take precedence over any in the environment. Also, for egrep/fgrep -- they are reading GREP's options. If they don't understand the option/can't use it (-F/-E/-P), then they should ignore options they don't understand, as they are not reading their own options but those of 'grep' which DOES understand those options.
It was also my suggestion that *if* the user explicitly specified an option on the command line -- then it should use the option on the command line no matter if the program is grep/fgrep or egrep. A grep by any other name still knows how to use alternate engines. A deliberate crippling of utilities just to enforce your narrow-minded view of how others should use their own systems is the height of arrogance. In my opinion, GREP_OPTIONS is a mistake - its ONLY useful and safe purpose is to do automatic coloring when used from a terminal, but that can be done with an alias or shell function, and could have been done with an explicitly-named GREP_COLOR instead of GREP_OPTIONS - if only we could go back 20+ years and design it that way to begin with. --- I think all should pay attention to a .greprc/.fgreprc/.egreprc -- it would be more easily tailored and not have the env issues you mention -- BUT it is STILL the case that command line options would override previously set options in a config file. Could 15127 also be re-opened as it was closed unilaterally in the presence of obvious bugs. Thanks... These are not obvious bugs. --- Inconsistent treatment of options is still confusing to users and causes errors. On one hand you have grep paying attention to the last option specified, like most other utilities have for 20+ years, and on the other hand, you have some new options that are inconsistent with previously implemented options. To have them operate on their switches half and half is a design flaw --- a bug. As POSIX permits both behaviors (mutually exclusive options giving an error, vs. last option wins), it is merely a quality of implementation question, which crosses the line into subjective rather than objective. === Implementing things down to the worst behaviors allowed by POSIX is worse than adhering to new posix rules that dumb down behaviors of utilities to protect 5-year-old children.
That it meets the lowest-common-denominator standard is hardly worth bragging about, let alone using as a justification for inconsistent and flaky design.
bug#15127: grep not taking last conflicting option as choice?
Eric Blake wrote: On 08/21/2013 10:54 PM, Linda Walsh wrote: Ok, thank you for sharing, but doesn't '-E' mean egrep pattern syntax? That even '-E' fails, telling the user that they can only use the syntax they are specifying, seems abusive. That other options in grep DO take the 'last' option, but the syntax options are disallowed, is inconsistent, unuseful and creates breakages in existing scripts that don't know they should clear GREP_OPTIONS in order for egrep and fgrep to function correctly. Anyone that sets -E via GREP_OPTIONS is already breaking their own system, and we have no sympathy for their dumb action. --- (1) Anyone making broad statements that apply to all of humanity about their own systems is hardly someone who should be commenting on 'dumb'. (2) in the above example there was no -E in the user's GREP_OPTIONS. The conflict came in using egrep -E -- a user specifically asking for extended grep syntax from the egrep command. (3) As to your opinion on the proper use of GREP_OPTIONS: anything you say is suspect, since you feel that GREP_OPTIONS shouldn't have been implemented as an ENV var to begin with. It seems odd that you would bother to complain about how GREP_OPTIONS is used, when you feel that it shouldn't be present in the first place. I can't say that your opinion would be representative of someone who isn't pre-biased against the ENV option. An explicit error is better than silently making a multitude of scripts misbehave. You are hand-waving using made-up statistics. Most scripts won't misbehave and it is a minority of scripts that use features like | or + in REs that don't want their extended meanings. The desire for expressive syntax in grep led to the addition of the even-more-complete Perl-regex. If the demand was not there for the more expressive syntax, that wouldn't have happened.
If someone wants a particular *RE-engine* for matching -- then either the specialized names (fgrep/egrep) would override ENV options, and those would be over-ridden by explicit specification on the command line (regardless of cmd-name). The current behavior is already problematic in that 'egrep/fgrep' are reading **GREP**'s options. If they don't understand a grep option or cannot use it (any syntax specification causes a problem with egrep/fgrep), then they should ignore those options that are grep-specific. It was also my suggestion that *if* the user explicitly specified an option on the command line -- then it should use the option on the command line no matter if the program is grep/fgrep or egrep. A grep by any other name still knows how to use alternate engines. A deliberate crippling of utilities to enforce the one, right, true way that you believe is the only acceptable use seems to be an undesirable demonstration of power. In my opinion, GREP_OPTIONS is a mistake - its ONLY useful and safe purpose is to do automatic coloring when used from a terminal, but that can be done with an alias or shell function, and could have been done with an explicitly-named GREP_COLOR instead of GREP_OPTIONS - if only we could go back 20+ years and design it that way to begin with. --- Personally, I think all should pay attention to a .greprc/.fgreprc/.egreprc. They would be more easily tailored and not have the env issues you mention, but even in that case I would still have command line options override previously set options in a config. Could 15127 also be re-opened as it was closed unilaterally in the presence of obvious bugs. Thanks... These are not obvious bugs. --- Inconsistent treatment of options is confusing to users and causes errors. On one hand you have grep paying attention to the last option specified, like most other utilities have for 20+ years, and on the other hand, you have some new options that are inconsistent with previously implemented options.
To have them operate on their switches half and half is a design flaw--- a bug. As POSIX permits both behaviors (mutually exclusive options giving an error, vs. last option wins), it is merely a quality of implementation question, which crosses the line into subjective rather than objective. === To use POSIX as a justification for bad or user-unfriendly design is hardly a glowing recommendation.
bug#15127: grep not taking last conflicting option as choice?
Eric Blake wrote: On 08/22/2013 04:18 PM, Linda Walsh wrote: Isn't the point of software to give users the freedom to make their choices -- to help them do their job? It's not to enforce a particular mind-think or dogma. And the point of free software is that YOU are free to modify the software to fit your needs, and share your modifications; Oh, so if I submit patches to fix the problems I've raised, they will be incorporated? Or is someone using their position of source-code maintainer/gatekeeper to implement their own vision while excluding others? not to rant that you got something at no price while demanding that someone else fix it to meet your whims. Share patches, rather than rants, and you will gain a lot more friends in the world of free software. I have shared multiple patches -- having them dropped on the floor makes me unwilling to submit patches for things that the gatekeepers are going to unilaterally reject anyway. More than 50% of your mail was ranting about the behavior of grep, which we already established is NOT part of the coreutils package. Sorry, I thought 15127 was the bug that got assigned when I sent the email to bug-grep, reported here: http://lists.gnu.org/archive/html/bug-grep/2013-08/msg00017.html and responded to here: http://lists.gnu.org/archive/html/bug-grep/2013-08/msg00018.html It doesn't seem that the grep bug was assigned a bug number, though it appeared to be rejected and my response was in regards to it.
bug#15127: grep not taking last conflicting option as choice?
Eric Blake wrote: On 08/22/2013 06:18 PM, Linda Walsh wrote: [a repeated message to bug-coreutils] Observe: http://debbugs.gnu.org/15127 In particular, note that your messages 27 and 36 are practically identical in content (with slight reformatting). http://debbugs.gnu.org/cgi/bugreport.cgi?bug=15127#27 http://debbugs.gnu.org/cgi/bugreport.cgi?bug=15127#36 Interesting that you take them as identical in content when the 2nd was specifically reworded to remove verbiage that some found offensive. I would point out to people that while some people take offense to the verbiage used in the first one, the difference in wording doesn't even show up as a blip on others' reading of the two notes... FWIW, Eric, the 2nd was rewritten by request. I find it unsurprising that you didn't notice the deltas. (That's an observation, not a veiled insult.)
bug#15157: join doesn't follow norms and dies instead of doing something useful w/duplicate options
join is inconsistent with other utils (like cut, for example) in how it handles a specification of a switch value that has already been set. 1) if a switch is set more than once with the same value, it doesn't complain, but if the options differ, unlike utilities like 'cut', the tool dies rather than taking the final specification as what is meant. ex: cut -d'TAB' -d: -f1 /etc/passwd doesn't issue any errors. But the same thing with join: join -t'TAB' -t: -f1 /etc/passwd /etc/group join: incompatible tabs ??? tabs? they are field separators. Historically, options specified on the command line take precedence over options in an init/rc-file or in the ENV. Many utils in a build process build up command lines in pieces -- with the expectation that later values take precedence, allowing higher-level make files to set defaults, while allowing makes in subdirectories to override options set in a parent. Defaulting to fail, rather than proceeding with the latest input data, is rarely useful for humans. It's arguable whether or not it is useful for machines in most cases. In the past, unix utils have tried to do what the user meant rather than deliberately playing stupid and pretending to have no clue about what was likely expected.
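The contrast described above is easy to reproduce. The cut half can be tested anywhere; the join error text is copied from the report and may vary by version:

```shell
# cut: the last -d specified silently wins.
printf 'a:b,c\n' | cut -d, -d: -f1    # delimiter ':' wins; field 1 is "a"
printf 'a:b,c\n' | cut -d: -d, -f1    # delimiter ',' wins; field 1 is "a:b"

# join, by contrast, dies on a second, different -t (per the report):
#   join -t'TAB' -t: file1 file2
#   join: incompatible tabs
```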
bug#15157: bug#15158: sort complains about incompatible options (that aren't) AND when it shouldn't
Paul Eggert wrote: The -g and -h options of 'sort' are indeed incompatible; they result in different outputs. More generally, these bug reports reargue the 'grep' bug reported here: http://lists.gnu.org/archive/html/bug-grep/2013-08/msg00017.html --- Not really. There was nothing about the similarity and overlap of sort options, and how integer is still in the class of general numbers, which includes 'e' as an abbreviation for a power prefix, just like K/M/G are power prefixes as well. You can't claim that 3.2e+3 -- where 'e' indicates that a power of 10 follows and is used as a scaling factor for 3.2 -- is that different from 3.2k, where 'k' is a scaling factor indicating that 3.2 is scaled by a factor of 10**3. They are both general numeric cases... I don't think I had anything to say, at all, about the overlap of such in grep, which is inconsistent with itself: grep -d read -d skip --color=auto --color=always foo (no error)... GREP_OPTIONS='-d skip --color=auto -P' grep -E 'this|that' grep: conflicting matchers specified ??? I didn't specify them on the command line -- OR now: egrep 'this|that' egrep: egrep can only use the egrep pattern syntax But I didn't specify any other syntax... oh.. it reads the cmdline options of 'grep', but it can't function like grep? That sounds a bit ill-designed. --- Please don't try to take over and confuse bugs on other utils, just to put forth your view that utilities should default to broken unless the user invokes them perfectly. That's bad User Interface design -- computers are supposed to be there to help us -- not to enable those who enjoy making others wrong to do so on a wide scale.
and replied to here: http://lists.gnu.org/archive/html/bug-grep/2013-08/msg00018.html Generally speaking, the GNU utilities follow the POSIX utility syntax guidelines, in particular Guideline 11: The order of different options relative to one another should not matter, unless the options are documented as mutually-exclusive and such an option is documented to override any incompatible options preceding it. http://pubs.opengroup.org/onlinepubs/9699919799/basedefs/V1_chap12.html#tag_12_02 It sounds like you're disputing the main part of this guideline and are advocating always taking the second (exceptional) part. --- The 2nd part has been the norm, while you are claiming that the exceptional design should be the norm. Let me know how specifying your options out of order on gnu tar works out... There are many contexts where one option enables or disables others. The idea that options should be order-independent is as absurd as the idea that the lines in a C program should also be order-independent. This is a perfect example of the ivory-tower thinking that is dominating POSIX now. While they used to be more practical and describe what was, now they've jumped to the forefront to dictate bad design and dumbed-down interfaces. Have you ever thought about the fact that they are funded by corporations who might have an interest in seeing Linux's open nature be killed off -- and/or its usage reduced? It's not clear to me that this makes sense, and there are good arguments for sticking with the more-cautious approach. So you have said, but name some that would be true for most people and make more sense than the previous, deterministic approach... As it is, I could point to sources by them and talk about what they said and everyone knows...
to make my point, but I'll rely on the common sense that most people have in knowing that having programs that are resilient in the face of odd user input and have it do something useful and predictable is far better than having fragile programs that break on all but perfect input.
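For what it's worth, the -g/-h incompatibility at the top of this thread is real in a way the join tabs example is not: the two options can order the same input oppositely, so neither "last wins" nor "ignore the duplicate" preserves the user's intent. A quick check:

```shell
# -g parses only the leading number (suffixes ignored): 1G=1 < 2M=2.
printf '2M\n1G\n' | sort -g     # "1G" sorts first

# -h honors SI suffixes: 2M = 2e6 < 1G = 1e9.
printf '2M\n1G\n' | sort -h     # "2M" sorts first
```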
bug#15157: join doesn't follow norms and dies instead of doing something useful w/duplicate options
Pádraig Brady wrote: On 08/21/2013 10:44 PM, Linda Walsh wrote: Historically, options specified on the command line take precedence over options in an init/rc-file or in the ENV. Many utils in a build process build up command lines in pieces -- with the expectation that later values take precedence, allowing higher-level make files to set defaults, while allowing makes in subdirectories to override options set in a parent. Defaulting to fail, rather than proceeding with the latest input data, is rarely useful for humans. It's arguable whether or not it is useful for machines in most cases. In the past, unix utils have tried to do what the user meant rather than deliberately playing stupid and pretending to have no clue about what was likely expected. Right, to support subsequent specification in scripts etc. it's useful to allow options to be overridden. In addition this is how other systems behave with respect to input field separator options, for example. Now on the other hand, the ambiguity being diagnosed here in such an obtuse manner is that one might think that _any_ of the specified separators are supported in the input, rather than the last taking precedence. There are other utilities not all officially under the official 'coreutils' project, but definitely under the core unix utilities definition. One of those which started me looking at the inconsistencies was/is grep(+flavors). There, you have the ENV var GREP_OPTIONS, which I would argue should take the least precedence when compared with the 'command name' and 'options on the command line'. The -[FEPG] options are mutually exclusive and can easily override each other w/o harm. To add spice, egrep uses the 'GREP_OPTIONS', but isn't really compatible with 'grep' (as it is now) with regards to -- for example -- the search-type switches.
I'm not sure why, but egrep, right now, refuses all pattern options -- this is a real kicker: egrep -E foo egrep: egrep can only use the egrep pattern syntax Ok, thank you for sharing, but doesn't '-E' mean egrep pattern syntax? That even '-E' fails, telling the user that they can only use the syntax they are specifying, seems abusive. That other options in grep DO take the 'last' option, but the syntax options are disallowed, is inconsistent, unuseful and creates breakages in existing scripts that don't know they should clear GREP_OPTIONS in order for egrep and fgrep to function correctly. There is no reason why last-specified shouldn't apply there as well (with the ENV being specified before the command was entered, thus having lowest priority; the command name being the 2nd thing typed, having next priority; and options specified to the command being the last thing typed, left to right). It so happens that 'join' was used as a justification for this behavior in 'grep', which was one of the reasons why I looked at join (along with sort, and a few others) to note where there might be inconsistencies, and whether the trend of failure taking precedence over the deterministic, working behaviors that have been normal on *nix for as long as I can remember was spreading. Do you see a reason why grep(+e/f) should fail -- or, especially, why e/fgrep should fail due to conflicting options in a GREP env var... or reject specification of a correct format? New users of these tools may be caught out though. --- They wouldn't have any previous history to be caught by. When I came to *nix, I read the man page and noted that nearly all of the utilities showed the same behavior (with the exception of sort, which might have its options confused as applying to different fields; not sure how likely that is).
I have come to rely on option-override working in a number of utils -- with config files taking the lowest priority (they are present before the user logs in), followed by ENV vars (set each session), command name and switches...(usually command name isn't part of that list, but to make things consistent...) We could display a warning but that would negate most of the benefit of allowing overriding the option. I suppose we could support --debug on all utils to diagnose ambiguities like this, rather than disallowing them. I'll look at doing both of the above. --lint? debug has other connotations... or --anal^h^h^h^hstrict ? ;^) thanks, Pádraig. -- ditto.. and I need to know how to phrase the problem for the kernel folks as they have quite a few places calling grep where they don't check for status (let alone, now being affected by conflicting options)... Could 15127 also be re-opened as it was closed unilaterally in the presence of obvious bugs. Thanks...
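The precedence order argued for in this thread (config file < ENV < command line) can be sketched as a wrapper that simply places lower-priority words earlier, so any last-option-wins parser resolves them correctly. MYGREP_RC and GREP_DEFAULTS are hypothetical names invented for this sketch, not real grep knobs; -m is used for the demo because grep lets a later -m override an earlier one:

```shell
# Defaults (lowest priority) are injected before "$@" (highest priority),
# so under last-option-wins parsing the explicit flags dominate.
mygrep() {
    rc=${MYGREP_RC:-$HOME/.greprc}
    # shellcheck disable=SC2046,SC2086  # word splitting is intended here
    set -- $( [ -r "$rc" ] && cat "$rc" ) ${GREP_DEFAULTS-} "$@"
    grep "$@"
}

MYGREP_RC=/dev/null            # no rc-file defaults for this demo
GREP_DEFAULTS='-m 3'           # environment default: stop after 3 matches
printf 'x\nx\nx\n' | mygrep -m 1 x   # explicit -m 1 overrides the default
```

Note this only works for options grep treats as overridable; the matcher switches (-E/-F/-P) still conflict, which is exactly the inconsistency being complained about.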
bug#15127: grep not taking last conflicting option as choice?
Eric Blake wrote: Grep is its own project (bug-g...@gnu.org, per 'grep --help' output). As such, I'm closing this as not a coreutils bug. --- I see... Sorry about that.. It's hard to think of some things that are their own separate projects as not being coreutils, as they were a core part of any unix system... (That isn't meant to say anything about where they should be located, just that the name of coreutils tends to be a bit all-encompassing of a few side projects...) Sigh.
bug#15127: grep not taking last conflicting option as choice?
Isn't it usually the case with conflicting options that the last one gets taken as the choice, with choices on the command line overriding choices in the environment? Grep doesn't seem to follow this convention. Is there a reason why grep doesn't, or did it used to and now chooses to do nothing in the case of conflicting options? (e.g. -P vs. -E) I think the earlier behavior, especially in respect to cmdline values overriding the environment, is more useful; otherwise command lines built up by successive passes in make files -- and usage by those who specify defaults in GREP_OPTIONS but expect cmd-line options to override the ENV -- will break. (coreutils 8.21) (or is grep not part of coreutils these days...hmmm...)
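The convention being asked about is easy to see in other utilities, where a repeated option is not an error and the last occurrence wins:

```shell
# head/tail take the last -n without complaint.
printf '1\n2\n3\n' | tail -n 2 -n 1   # last -n wins: prints "3"
printf '1\n2\n3\n' | head -n 2 -n 1   # last -n wins: prints "1"

# grep, by contrast, rejects a second matcher (-P after -E, etc.)
# instead of letting the last one win -- the behavior questioned above.
```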