bug#17601: printf not interpreting hex escapes

2014-05-26 Thread Phillip Susi

printf '\x41' prints x41 instead of A.

Also, it has no --help or --version switch, but I seem to be running
version 8.21-1ubuntu5 (obviously on Ubuntu).






bug#17601: printf not interpreting hex escapes

2014-05-26 Thread Phillip Susi

On 05/26/2014 12:50 PM, Pádraig Brady wrote:
 $ type printf
 printf is a shell builtin

Of course, darn builtins!  Sorry for the noise.
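
A quick way to see the builtin/external split for yourself (a sketch; `env` re-resolves the command through $PATH, bypassing shell builtins, so it always runs the coreutils binary):

```shell
type printf           # reports that your shell's builtin shadows /usr/bin/printf
env printf '\x41\n'   # forces the external coreutils printf, which expands \xHH: prints A
```

Whether the builtin itself handles `\xHH` depends on the shell; coreutils printf does.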







bug#15955: df hides original mount point instead of bind mount

2013-11-26 Thread Phillip Susi

On 11/26/2013 06:37 PM, Bernhard Voelker wrote:
 As already mentioned, the current implementation is not ideal. It
 is a compromise between the requirements which hit 'df' at that
 time:
 
 * showing the real root file system instead of early-boot rootfs:
 
 $ src/df -a | grep ' /$'
 rootfs        12095032   798   3491744  70% /
 /dev/sda1     12095032   798   3491744  70% /

That doesn't seem to be at all related since the path on both is
exactly the same, not one longer than the other, and is filtered by
the fstype being rootfs.  For that matter, this has always seemed
like a bug in the kernel to me: rootfs isn't mounted in /, it is
mounted above / and therefore is not visible to this process, so it
shouldn't be shown in /proc/mounts.

 * suppressing massive bind-mounts with hundreds or thousands of
 bind mounts of the same file system:
 
 $ for f in $(seq 1000) ; do mkdir dir$f && mount -o bind /etc dir$f ; done
 and then look at 'df' vs. 'df -a'.

Yes, the whole topic is hiding bind mounts; the question is how to
choose which one to hide.  Why use the path length instead of which
was mounted first?

 * IIRC there was a another issue re. shortening mount information
 like: 
 /dev/disk/by-id/scsi-SATA_Hitachi_HDS7210_JP2911N03AR0WV-part1

Again, that seems totally unrelated.  The by-id is a symlink so it is
followed to the underlying devnode and that's what is reported in
/proc/mounts, and it doesn't have anything to do with the length of
the mount point.







bug#15955: df hides original mount point instead of bind mount

2013-11-24 Thread Phillip Susi

On 11/24/2013 05:24 AM, Bernhard Voelker wrote:
 Thanks for the suggestion, but that is not possible.  For the kernel,
 all bind mounts are actually equal among each other, and there's no
 information about bind flags in /proc/self/mounts (which 'df' uses).

I'm aware of that, but the order they are reported in /proc/mounts at least 
seems to be the order they were mounted in, which would be a better basis 
for deciding which to show and which to hide than the length of the path.








bug#15633: dd and host protected area

2013-10-16 Thread Phillip Susi

On 10/16/2013 3:19 AM, Peter D. wrote:
 Hi,
 
 Is it deliberate that dd can not read from, or write to the host
 protected area?  Or is it a bug?

The HPA is a feature of the drive, not the OS or software, so dd has
no idea whether there is one and cannot get around it.  To unlock the
full capacity of the drive you have to send commands to the drive.
You can have the libata driver do this by setting libata.ignore_hpa=1.







bug#8533: Mailing list has shifted domains?

2011-04-25 Thread Phillip Susi
On 4/21/2011 4:10 PM, Phillip Susi wrote:
 I noticed that some posts to the list today were not being filtered into
 the correct mailbox because the List-Id and List-Post fields were
 changed from gnu.org to nongnu.org.  Was this intentional, and why?

I didn't file a bug report.  This message says I sent an email to
8...@debbugs.gnu.org, but I did not; I sent it to the usual mailing list
address: bug-cureut...@gnu.org.





bug#5817: false core util

2010-04-01 Thread Phillip Susi
On 4/1/2010 7:49 AM, phoenix wrote:
 Hey guys,
 I've found a serious error in the program false: If I pipe some text to it
 (e.g. echo i lost the game | false), it does not fail but rather returns a
 success. Any suggestions? I need this program very much for completing my
 thesis on Pseudo-Riemannian manifolds.
 It would very kind of you if you'd inform me when the error has been fixed.
 (Remark: I am using Ubuntu 9.10.)
 Best regards,
 Hajo Pei

You must be checking it wrong because on Ubuntu 9.10 here it correctly
returns an exit status of 1 for failure.
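
The check is easy to reproduce; `false` ignores its stdin entirely, and its only job is to exit non-zero:

```shell
# Piping text into false changes nothing: it reads no input.
echo "i lost the game" | false
echo "exit status: $?"    # prints: exit status: 1
```

A likely source of confusion is checking `$?` after an intervening command, which reports that command's status instead.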






Re: Suggestion for rm(1)

2010-03-11 Thread Phillip Susi
On 3/11/2010 7:37 AM, Andreas Schwab wrote:
 Incidentally, due to the increasing use of SSD and their tendency not to
 reuse recently used blocks it may become again easier in future to
 recover data.

Actually once TRIM support becomes common recovering deleted files on
SSD will be impossible since the flash blocks will be erased rather than
just being left unallocated.

I think the man page should be changed to state that the file *MAY* be
recoverable using forensics.  That makes sufficiently clear that rm is
not a secure erase, while also conveying that recovery, if it is
possible at all, is not easy.




Re: [bug #25538] excluded files are still stat()ed

2009-02-11 Thread Phillip Susi

Kevin Pulo wrote:
r...@bebique:~# du -axk /home/kev/mnt/sf 
du: cannot access `/home/kev/mnt/sf/home': Permission denied

4   /home/kev/mnt/sf
r...@bebique:~# du -axk --exclude=/home/kev/mnt/sf/home /home/kev/mnt/sf
du: cannot access `/home/kev/mnt/sf/home': Permission denied
4   /home/kev/mnt/sf
r...@bebique:~# echo $? 
1
r...@bebique:~# 


The fuse mount point is /home/kev/mnt/sf/home right?  I believe this is 
a bug in the fuse kernel filesystem driver.  While it can refuse access 
to the contents of the filesystem, a stat on the mount point itself 
should work and probably return a mode indicating you do not have read 
or execute access to the directory so tools know not to try traversing it.




___
Bug-coreutils mailing list
Bug-coreutils@gnu.org
http://lists.gnu.org/mailman/listinfo/bug-coreutils


Re: Threaded versions of cp, mv, ls for high latency / parallel filesystems?

2008-11-12 Thread Phillip Susi

James Youngman wrote:

This version should be race-free:

find -type f -print0 |
 xargs -0 -n 8 --max-procs=16 md5sum > ~/md5sums 2>&1

I think that writing into a pipe should be OK, since pipes are
non-seekable.  However, with pipes in this situation you still have a
problem if processes try to write more than PIPE_BUF bytes.


You aren't using a pipe there.  What you are doing is having the shell 
open the file, then the md5sum processes all inherit that fd so they all 
share the same offset.  As long as they write() the entire line at once, 
the file pointer will be updated atomically for all processes and the 
lines from each process won't clobber each other.
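
The shared-offset behavior is easy to demonstrate (a sketch: both background loops inherit fd 1 from one redirection, so they share a single file offset, and each whole-line write() lands intact):

```shell
out=$(mktemp)
{
  # Two concurrent writers sharing one open file description for stdout.
  for i in 1 2 3 4 5; do printf 'A line %s\n' "$i"; done &
  for i in 1 2 3 4 5; do printf 'B line %s\n' "$i"; done &
  wait
} > "$out"
wc -l < "$out"    # 10 complete lines; none truncated or clobbered
rm -f "$out"
```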






Re: making GNU ls -i (--inode) work around the linux readdir bug

2008-07-16 Thread Phillip Susi

Micah Cowan wrote:

He means that there _is_ no optimization. When you're applying ls -i
directly to files (ls -i non-directory, the scenario he mentioned as
not being affected), there is no readdir, there are no directory
entries, and so there is no optimization to be made. A call to stat is
required. There is no effect to reduce.

I may be completely off-base here, but that's how I read it, at least
(how do you get inode info from dir entries you don't have?!).


<facepalm/>

Of course... I wasn't keying in on the non-directory argument; for some 
reason I was taking that as indicating that you were still doing an 
ls -i in a directory that *contained* non-directory children.






Re: making GNU ls -i (--inode) work around the linux readdir bug

2008-07-15 Thread Phillip Susi

Jim Meyering wrote:

When I say not affected I mean it.
Turning off the readdir optimization affects ls -i only
when it reads directory entries.


You mean you are only disabling the optimization and calling stat() 
anyway for directory entries, and not normal files?  Then the effect is 
only reduced, not eliminated, but probably reduced enough to not be a 
big deal, since most directories contain few subdirectories relative to 
their other files.






Re: making GNU ls -i (--inode) work around the linux readdir bug

2008-07-11 Thread Phillip Susi

Jim Meyering wrote:

EVERY application that invokes ls -i is affected.


Please name one.


I'm not sure why this isn't getting through to you.  ANY and EVERY 
invoker of ls -i that does or possibly could exist is affected by a 
degradation of its performance.






Re: making GNU ls -i (--inode) work around the linux readdir bug

2008-07-11 Thread Phillip Susi

Jim Meyering wrote:

Here are two reasons:

  - lack of convincing arguments: any program that runs
ls -i non-directory ... is not affected at all.


Of course it is affected -- it takes much longer to run.


  - lack of evidence that users would be adversely affected:
the only program alleged to be impacted is one that (so far) I've
found no reference to, so I suspect very few people use it.


Every single time that ls -i is run, by anyone, or anything, anywhere, 
EVER, it will be slower with this change.  That's adversely affecting 
every user, which is a lot more than one.





Re: making GNU ls -i (--inode) work around the linux readdir bug

2008-07-10 Thread Phillip Susi

Jim Meyering wrote:

From what I've read, POSIX does not specify this.
If you know of wording that is more precise, please post a quote.


That was my point: the standard does not specify that this behavior 
is an error, and since every Unix system since the dawn of time has 
behaved this way, it is NOT an error, as you claim.



As far as I've heard, only one application would be affected
adversely by this change (extra stat calls would hurt performance),
and that application, magicmirror is not widely used -- since I
found no reference to it in the first few pages of search results.


EVERY application that invokes ls -i is affected.


If you know of other applications that run ls -i and depend on
the post-coreutils-6.0 behavior of not stat'ing some files,
please let me know.


EVERY user of ls wishes it to be as quick as possible.  Since it has 
been shown that simply returning the inode number from the d_ent is not 
really an error, and that fetching the different number from stat 
provides no actual benefit, slowing down ls for that purpose is a 
regression, not a bug fix.





Re: making GNU ls -i (--inode) work around the linux readdir bug

2008-07-10 Thread Phillip Susi

Wayne Pollock wrote:

How can either ls or readdir be considered correct when
the output is so inconsistent?  What behavior do you expect from
backup scripts (and similar tools) that use find (or readdir)?
It seems clear to me that returning the underlying inode numbers
must result in having the wrong files backed up in this case.


As has already been discussed, the backup tools are already in error if 
they are relying on the inode number as reported by readdir. 
Essentially the number is meaningless and should be ignored in the 
context of a mounted directory.



If some folk feel the underlying numbers reported by the
(buggy; sorry but I don't know what other adjective to use) readdir,
then for the sake of consistency stat must be considered buggy
for reporting the root inode number of mount points.  Consistency
first, and performance be damned!


It is a logical fallacy ( false choice ) to assume that if one is right, 
the other MUST be wrong.  The kill command is both a shell builtin and 
an external utility.  The builtin version understands job specs, but the 
external only accepts pids.  So, depending on whether you end up using 
the builtin or the external, the behavior is inconsistent.  That does 
not mean that the external utility is broken.



From my (admittedly inexpert) point of view the inode of the underlying
filesystem directory is useless except to find hidden files.  So
the stat behavior, even if useless without a device number too, is
the better value to return in all cases.


The number is useless for finding ANY file, hidden or otherwise.


It might be argued which value is better, but inconsistent values
(and perhaps unpredictable values without reading the source code)
are clearly wrong.


Inconsistency and unpredictability by themselves are not necessarily 
wrong.  If you have an uninitialized local variable in C for instance, 
its value is unpredictable, but this is quite acceptable because the 
standard clearly indicates that its value is undefined and should not be 
relied on.


What I'm saying is, add a note to the man page to document this 
somewhat confusing fact, and let it be.


You will find a similar inconsistency if you run du on an entire disk, 
then compare the size reported with that from df.  Often they will be 
(sometimes vastly) different, and this confuses people who do not 
understand why.  The reason is that files removed from the directory 
tree do not immediately have their space freed - that is only done once 
the file has no names pointing to it, AND no process has it open.  This 
does not mean that the number reported by df is wrong, and that it 
should be changed to do the work that du does in order to get the same 
result.
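
The deleted-but-still-open case is easy to reproduce (a sketch: du stops counting the file the moment its name is gone, while df keeps counting the blocks until the last descriptor closes):

```shell
tmp=$(mktemp)
exec 3>"$tmp"          # hold the file open on fd 3
echo "some data" >&3
rm "$tmp"              # unlink: the name is gone, du no longer sees it...
[ ! -e "$tmp" ] && echo "name removed"
echo "more data" >&3   # ...but the inode lives on and writes still succeed
exec 3>&-              # closing the last reference finally frees the space
```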



The POSIX standard is correct in my opinion; glibc is currently buggy.
I would claim (and I guess I did) that this behavior is buggy even
if every Unix system in the universe always worked this way.
A long standing bug is still a bug.


It isn't glibc, it is the kernel.  Indeed, just because it is 
longstanding does not mean it is not a bug, but the fact that it is 
confusing or inconsistent likewise does not make it a bug either.  You 
consider it a bug because you got a result that is not what you expected. 
It is your _expectation_ that needs to change, not the result.






Re: making GNU ls -i (--inode) work around the linux readdir bug

2008-07-08 Thread Phillip Susi

Jim Meyering wrote:

Ultimately, neither POSIX nor any other official standard defines what
is right for coreutils.  POSIX usually serves as a fine reference, but
I don't follow it blindly.  In rare cases I've had a well-considered
disagreement with some aspect of a standard, and I have implemented
default behavior that technically makes an application non-conforming.
The standard evolves, too.


Going against the standard behavior is not strictly anathema, true, but 
you had better have a good reason for it.  This 'fix' gains NOTHING 
since any application (whether it exists now or conceivably could in 
the future) that depends on your preferred behavior is already 
inherently broken.  With nothing to gain, and both conformance and 
performance to lose, this fix seems to be bad form.



Furthermore it _is_ right even in absolute terms.


Then we'll have to agree to disagree.


So far your well-considered disagreement with the standard seems to 
consist solely of "it is confusing that the two inums don't match." 
That seems a fairly weak argument for technical correctness.  While I 
agree that your way makes more sense, that alone does not outweigh both 
performance and conformance.  There are plenty of things that don't seem 
to make sense at first glance, but we still do them with good reason.



When ls -i prints an inode number not equal to the one lstat would
produce, I consider it inaccurate.
ls -ldi DIR prints an accurate inode number for DIR.
ls -i DIR/.. |grep DIR fails to do so when DIR is a mount point
(but works on Cygwin)


I would argue that it is not inaccurate, since as far as correct 
operation of the machine is concerned, it has no effect.  The perceived 
inaccuracy is only in your own mind while you incorrectly attempt to 
assign meaning to the value which has none.  To justify the cost of 
conformance and performance, something a bit more substantial than 
making a human feel better needs to be achieved.



There are *no existing programs* and *no plausible correct programs*
which depend on your new behaviour.


Easy to say.  Even if it were known to be true,
imho, it's irrelevant, since your definition of correct
presupposes the broken behavior.


You seem to have this backwards.  Ian's definition of correct 
presupposes that the inums are useless in the presence of mount points 
in the first place, which they are without a dnum.  It is your 
definition of correct that is founded solely on what you feel the 
behavior should be, which by all logical measures, is broken.


You have no logical reason for arguing that your way is correct other 
than that you find the other way unsettling.



Not yet.  But I haven't done a complete survey, either.
Maybe this exposure will prod the affected systems to provide
a readdir-like interface with always-lstat-consistent d_ino values.


I'd bet money that Linux won't, since there is zero reason to do so, and 
several reasons not to.







Re: making GNU ls -i (--inode) work around the linux readdir bug

2008-07-08 Thread Phillip Susi

Jim Meyering wrote:

The change I expect to implement does not going against POSIX.
On the contrary, the standard actually says the current readdir
behavior is buggy.  See my previous reference to a quote from
readdir's rationale.


Going against the standard behavior means differing in behavior that 
all other implementations share.  The new POSIX standard verbiage you 
pointed out only _hints_ that it is incorrect behavior, with no 
justification for that position.






Re: tee logs no output if stdout is closed

2008-07-03 Thread Phillip Susi

Andreas Schwab wrote:

It would match the behaviour as defined by ASYNCHRONOUS EVENTS in 1.11
Utility Description Defaults.


Could you quote that section or give me a URL where I can see it 
myself?  I have no idea what it says nor where to look it up.


Also what about the issue where tee will try to write() to the now 
broken fd and fail?





Re: tee logs no output if stdout is closed

2008-07-01 Thread Phillip Susi

Andreas Schwab wrote:

It seems to me that tee should have a SIGPIPE handler which closes the
broken fd and stops trying to write to it, and if ALL outputs have been
closed, exit.


That would not be compatible with POSIX.


In what way?

Also, won't ignoring SIGPIPE cause problems later when tee tries to 
write() to the broken fd and gets back an error?






Re: tee logs no output if stdout is closed

2008-06-30 Thread Phillip Susi

Andreas Schwab wrote:

Bruno Haible [EMAIL PROTECTED] writes:


How about adding an option '-p' to 'tee', that causes it to ignore SIGPIPE
while writing to stdout?


Just add a trap '' SIGPIPE before starting tee.


Wouldn't that only trap SIGPIPE sent to the shell, not tee?  Aren't all 
signal handlers reset on exec()?


It seems to me that tee should have a SIGPIPE handler which closes the 
broken fd and stops trying to write to it, and if ALL outputs have been 
closed, exit.






Re: cp: performance improvement with small files

2008-05-23 Thread Phillip Susi

Hauke Laging wrote:

Hello,

I just read an interesting hint in the German shell Usenet group 
([EMAIL PROTECTED]). As I could not find anything about 
that point in your mailing list archive I would like to mention it here.


The author claims that he achieved a huge performance increase (more than 
factor 10) when copying a big amount of small files (1-10 KiB) by sorting 
by inode numbers first. This probably reduces the disk access time which 
becomes the dominating factor for small files.


It depends on what filesystem you are using.  In ext3 this would help, 
but not on reiserfs, where there is no relationship between inode number 
and disk position.


In any case, this would significantly increase the complexity of cp 
for, at best, dubious gains, so it isn't likely to happen.  Rather than 
sort by inode, it would be better if filesystems that would benefit from 
that kept the directory list sorted that way, so that the list would 
already be sorted when passed to cp.  IIRC, the defrag package sorts 
directories this way on ext2/3.
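
The reported trick can be sketched in a few lines (an illustration, not cp itself; GNU find's `%i` directive prints each file's inode number, and the `src`/`dst` directories here are hypothetical):

```shell
cd "$(mktemp -d)"             # work in a scratch directory
mkdir -p src dst
printf 1 > src/a; printf 2 > src/b; printf 3 > src/c

# Copy files in inode-number order, so reads tend to follow on-disk
# layout on filesystems like ext2/3 where inode order tracks position.
find src -maxdepth 1 -type f -printf '%i\t%p\n' |
  sort -n | cut -f2- |
  while IFS= read -r f; do cp "$f" dst/; done

ls dst                        # all three files arrived
```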






Re: bug in sha1sum

2008-05-13 Thread Phillip Susi

Philip Rowlands wrote:
Coreutils manpages tend to be short reference sheets listing the 
available options. Further documentation is provided in the info 
command, as should be mentioned at the end of each manpage.


From the docs:
`-b'
`--binary'
 Treat each input file as binary, by reading it in binary mode and
 outputting a `*' flag.  This is the inverse of `--text'.  On
 systems like GNU that do not distinguish between binary and text
 files, this option merely flags each input file as binary: the MD5
 checksum is unaffected.  This option is the default on systems
 like MS-DOS that distinguish between binary and text files, except
 for reading standard input when standard input is a terminal.

`-t'
`--text'
 Treat each input file as text, by reading it in text mode and
 outputting a ` ' flag.  This is the inverse of `--binary'.  This
 option is the default on systems like GNU that do not distinguish
 between binary and text files.  On other systems, it is the
 default for reading standard input when standard input is a
 terminal.


I have to agree with Dave on this then.  It is a severe bug that text 
mode is the default since this means that you will get different results 
for the checksum on MS-DOS/Windows than on a GNU/Linux system.
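
The two modes are easy to compare side by side (a sketch; on a GNU system only the flag character differs, while on MS-DOS/Windows text mode would translate line endings and change the digest):

```shell
printf 'hello\n' > sample.txt
md5sum -b sample.txt   # binary mode: "<digest> *sample.txt"
md5sum -t sample.txt   # text mode:   "<digest>  sample.txt"
rm -f sample.txt
```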







Re: Qustions about CPU usage of dd process

2007-12-04 Thread Phillip Susi

Pádraig Brady wrote:

The CPU percentage of dd process sometimes is 30% to 50%,
which is higher than we expect (<= 20%), and there is no other big
program running at the same time.
If the disc in SATA ODD is CD-R instead of DVD-R, the percentage is
much smaller (<= 20%).


That just means that dd is waiting on the CD-R more than
on the DVD-R as the DVD-R is probably faster.


iowait != busy.  It's using CPU time to actively copy data around in RAM, 
or to/from I/O ports if not using DMA, not waiting on the hardware.



So my questions are:
(1) Is there an official normal range(or criteria) to the dd CPU
percentage?
(2) Can we say that it's abnormal if it is higher than 30% or even 50%?
(3) And what kinds of factors lead to the high CPU percentage of dd
and
how to decrease it?


1) No, it entirely depends on your hardware and kernel
2) I certainly don't like to see it even as high as 20% on typical 
desktop PC type hardware.  I prefer to see it less than 5%.
3) Ideas include the small block size, hardware not using DMA, hardware 
generating a lot of interrupts/only transferring a single sector at a time.
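
The block-size point can be demonstrated without any real hardware (a sketch: the same bytes move either way, but bs=512 makes 2048 read/write pairs while bs=1M makes one, and fewer syscalls means less CPU burned in dd itself):

```shell
# Copy 1 MiB with small blocks: 2048 tiny transfers.
dd if=/dev/zero of=/dev/null bs=512 count=2048

# Copy the same 1 MiB with one large block: a single transfer.
dd if=/dev/zero of=/dev/null bs=1M count=1
```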







Re: df not for subfs?

2007-12-03 Thread Phillip Susi

Bob Proulx wrote:

It appears that something failed getting the file system values.  Try
debugging this using strace.  The following produces useful debug
output on my GNU/Linux system for usb storage devices.


Most likely that is the case, as this subfs does not appear to have been 
actively maintained since 2004, before the 2.6 kernel series' new 
plug-and-play order came about.  I'd suggest updating your distribution 
to a new one that does not use subfs.








Re: du: fts and vfat

2007-11-26 Thread Phillip Susi

Jim Meyering wrote:

Yes, it's expected, whenever you use a file system with
imperfect-by-definition inode emulation.


AFAIR, the fat driver uses the starting cluster of the file as the inode 
number, so unless you defrag or something, it shouldn't change.






Re: strange DF (disk full) output

2007-09-19 Thread Phillip Susi

Roberto Spadim wrote:
i think that DF is understanding /mnt/smb (smbfs mount point) as / disk 
usage

but if i umount it and get again df and du -s /*, df still with 88%


No, df asks the filesystem itself for the information with statfs(), so 
the only way it is wrong is if the fs is damaged.  You might want to 
fsck it.


what could i do? i think that DU is right, but DF is the main problem 
since, if i do:

dd if=/dev/zero of=/test bs=1024 count=very big count
i get:

[EMAIL PROTECTED] /]# dd if=/dev/zero of=/test bs=1024 count=50
rm /test
dd: writing `/test': No space left on device
1194693+0 records in
1194692+0 records out
1223364608 bytes (1,2 GB) copied, 37,6201 s, 32,5 MB/s



so, df is my problem since i have only 1.2 gb,


df said you had 1.2 gb free, so it appears you have confirmed that it is 
correct.







Re: who(1) shows more than one user on the same terminal

2007-08-16 Thread Phillip Susi

[EMAIL PROTECTED] wrote:

Is it normal to see two users on the same tty?
$ who
jidanni  pts/0 ...
ralphpts/0 ...
jim  pts/1 ...
$ ls -l /dev/pts/0
crw--w 1 jidanni tty 136, 0 2007-08-17 00:58 /dev/pts/0

The administrator (Dreamhost) says

You could potentially see many more than that at any given time.
There are other users with whom you share the hosting, as well as
the admins. This is normal.


Yes, many users, but not on the same pts/0?!
ps -u anybody_other_than_me won't show anything, so I can't
investigate further. Perhaps stale utmp entries?


Yes, stale utmp entry.





Re: mv: cannot move `dir' to a subdirectory of itself, `../dir'

2007-08-14 Thread Phillip Susi

Andreas Schwab wrote:

Chris Moore [EMAIL PROTECTED] writes:


$ mv dir ..
mv: cannot move `dir' to a subdirectory of itself, `../dir'


With coreutils 6.9 you'll get "Directory not empty".


That also seems incorrect.  Shouldn't the error be "A file (directory) 
with that name already exists"?
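
The message comes straight from the kernel: moving a directory onto an existing, non-empty directory of the same name makes rename() fail with ENOTEMPTY (a sketch; the directory names are illustrative only):

```shell
cd "$(mktemp -d)"                    # scratch directory
mkdir -p here/data target/data
touch target/data/occupied           # destination name exists and is non-empty

cd here
mv data ../target/ || echo "mv refused"   # mv reports "Directory not empty"
```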






Re: nohup feature request / Please fwd to Jim Meyering

2007-06-14 Thread Phillip Susi

Bob Proulx wrote:

 NOT this:
 
   $* > nohup.out.$name-$pid
 
 But this:
 
   $* | sed "s/^/$name-$pid: /" >> nohup.out
   $* | sed "s/^/$timestamp: /" >> nohup.out


OH!  I see now... yea, that would require active participation.


  trap "" 1 15
  if test -t 2>&1 ; then
    echo "Sending output to 'nohup.out'"
    exec nice -5 $* >> nohup.out 2>&1
  else
    exec nice -5 $* 2>&1
  fi

All that nohup does is to ignore SIGHUP and SIGTERM and redirect the
output if it is not already redirected.  Before job control this was
all that was needed to avoid a controlling terminal disconnection from
killing the process.  Unfortunately with job control a little more is
needed.


Trapping the signals in the shell does not trap them in the exec'd child 
program, so I don't see how this would work.






Re: nohup feature request / Please fwd to Jim Meyering

2007-06-14 Thread Phillip Susi

Micah Cowan wrote:

Untrue, actually: _handling_ the signals would not handle them in the
exec'd child (for obvious reasons), but POSIX requires blocked signals
to remain blocked after an exec.

Try the following:


#!/bin/sh
trap '' TSTP
exec yes > /dev/null


and then try suspending the program with ^Z, with and without the trap
line commented out. :)


Interesting... I thought that exec reset all signal handlers to their 
default behavior...
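
The distinction is between handlers and dispositions: exec() resets installed handlers to their defaults, but a signal set to be *ignored* stays ignored in the new program. A sketch that shows it directly:

```shell
# Ignore SIGTERM in a subshell, then exec a plain external program.
( trap '' TERM; exec sleep 2 ) &   # sleep starts with SIGTERM ignored
pid=$!
sleep 0.5
kill -TERM "$pid"                  # has no effect: the disposition survived exec
sleep 0.5
kill -0 "$pid" && echo "still alive after SIGTERM"
wait "$pid"                        # sleep finishes its 2 seconds normally
```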







Re: nohup feature request / Please fwd to Jim Meyering

2007-06-13 Thread Phillip Susi

Bob Proulx wrote:

Uhm...  I think we drifted from the feature discussion:


How so?


Jack van de Vossenberg wrote:

My request is: could the output be preceded by
1) the name/PID of the process that produces the output.
2) the time that the output was produced.


I don't think that is possible without an active participation by a
process.


What requires any kind of active participation?  The only thing 
requested is that nohup open a file named with the current time and pid 
instead of /dev/null prior to exec()ing the given command.







Re: nohup feature request / Please fwd to Jim Meyering

2007-06-12 Thread Phillip Susi

Bob Proulx wrote:

Well, perhaps in a sense *anything* is possible with enough code to
implement it.  However as originally designed and currently written it
is not possible for nohup to do this.  It is only possible for nohup
if it were rewritten to be a completely different program.  It would
need to be an active participant in the I/O.  And I would hate to
invalidate all of the Unix documentation written to date on nohup.


It has no need to be an active participant in the IO; it just needs to 
change the name it passes to open() when it sets up the file descriptors 
prior to exec().







Re: nohup feature request / Please fwd to Jim Meyering

2007-06-07 Thread Phillip Susi

Pádraig Brady wrote:

My request is: could the output be preceded by
1) the name/PID of the process that produces the output.


That's not possible unfortunately, as nohup just
sets things up, and replaces itself with the command.
It might suffice to have separate files for each command,
which you can specify by using redirection:


Of course it is possible; nohup knows its pid as well as the command it 
is asked to run.  When it opens the output file it just needs to use 
that information to name it.



nohup nice command1 > command1.out
nohup nice command2 > command2.out


This works too.


Personally I much prefer using screen to nohup:
http://www.pixelbeat.org/docs/screen/


Screen is nice for interactive commands that you want to come back to 
later.  For things that you just want to run in the background and 
forget about, I find the batch command to be very useful.  Works similar 
to nohup but instead of having to check an output file for the results 
later and check ps to see if the command is still running, you get the 
results in an email when it is finished.







Re: kilo is k and not K

2007-02-27 Thread Phillip Susi

Alfred M. Szmidt wrote:

Standards should never be followed blindly, and standards should be
broken when one thinks one has good reasons.

SI also conflicts with POSIX in this case.  Not to mention that SI
does not define prefixes for all possible units, only SI units, and a
byte is not a SI unit.  So SI-wise, there is nothing wrong about using
k or K as a prefix symbol for `kilo'.


From (coreutils)Block size:


`k'
`K'
`KiB'
 kibibyte: 2^10 = 1024.  `K' is special: the SI prefix is `k' and
 the IEC 60027-2 prefix is `Ki', but tradition and POSIX use `k' to
 mean `KiB'.


Well put.  Personally I can't stand the fact that someone decided to 
make up such a silly sounding word as kibi because they don't like the 
fact that the contextualized definition of Kilo-Byte was slightly at 
odds with the SI use of the Kilo prefix.  It was well established that a 
Kilo-Byte was 1024 bytes and was abbreviated as KB well before this 
silly 'kibi' nonsense started.


SI does not define what a kilo-byte is, computer scientists do, and they 
defined it as 1024.
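
The distinction shows up directly in GNU dd's size suffixes (a quick sketch, assuming GNU coreutils and a writable /tmp; file names are invented):

```shell
# GNU size suffixes: K means 1024 bytes, kB means 1000.
dd if=/dev/zero of=/tmp/one_K  bs=1K  count=1 2>/dev/null
dd if=/dev/zero of=/tmp/one_kB bs=1kB count=1 2>/dev/null
k=$(wc -c < /tmp/one_K)
kb=$(wc -c < /tmp/one_kB)
echo "bs=1K  wrote $k bytes"    # 1024
echo "bs=1kB wrote $kb bytes"   # 1000
rm -f /tmp/one_K /tmp/one_kB
```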






Re: stat() order performance issues

2007-01-26 Thread Phillip Susi

Jim Meyering wrote:

Which ls option(s) are you using?


I used ls -Ui to list the inode number and do not sort.  I expected this 
to simply return the contents from getdents, but I see stat64 calls on 
each file, I believe in the order they are returned by getdents in, 
which causes a massive seek storm.



Which file system?  As you probably know, it really matters.


In my case, reiserfs, but this should apply equally as well to ext2/3.


If it's just ls -U, then ls may not have to perform a single stat call.
If it's ls -l, then the stat per file is inevitable.
But if it's ls --inode or ls --file-type, with the right file system,
ls gets all it needs via readdir, and can skip all stat calls.  But with
some other file system types, it still has to stat every file.



It seems that ls -U does not stat, but ls -Ui does.  It seems it 
shouldn't because the name and inode number are returned by readdir 
aren't they?



For example, when I run ls --file-type on three maildirs containing
over 160K entries, it's nearly instantaneous.  There are only 3 stat calls:

$ strace -c ls -1 a b c|wc -l


Are a, b and c files or directories?  If they are files, then of course 
it would only stat 3 times, because you have only asked ls to look up 3 
files.  Try just ls -Ui without the a b c parameters.



du in a Maildir with many thousands of small files takes ages to
complete.  I have investigated and believe this is due to the order in


Yep.  du has to perform the stat calls.

ages?  Give us numbers.  Is NFS involved?  A slow disk?
I've just run du -s on a directory containing almost 70,000 entries,
and it didn't take *too* long with a cold cache: 21 seconds.


Modest disk, no NFS, 114k entries, and it takes 10-15 minutes with cold 
cache.  When I sorted the directory listing by inode number and ran stat 
on each in that order with cold caches, it only took something like 1 
minute.



Post your patch, so others can try easily.
If sorting entries (when possible, i.e., for du, and some invocations
of ls) before stating them really does result in a 10x speed-up on
important systems, then there's a good chance we'll do it.


I have no patch, I merely did some instrumentation with shell scripts, 
ls, and stat.







Re: stat() order performance issues

2007-01-26 Thread Phillip Susi

Jim Meyering wrote:

That's good, but libc version matters too.
And the kernel version.  Here, I have linux-2.6.18 and
Debian/unstable's libc-2.3.6.


How does the kernel or libc version matter at all?  What matters is the 
on disk filesystem layout and how it is not optimized for fetching stat 
information on files in what is essentially a random order, instead of 
inode order.  In the case of ext2/3, the inodes are stored on disk in 
numerical order, and for reiserfs, they tend to be stored in order, but 
don't have to be.  On ext2/3 I believe file names are stored in the 
order they were created in, and on reiserfs, they are stored in order of 
their hash.  In both cases the ordering of inodes and the ordering of 
names returned from readdir are essentially randomly related.


Anyhow, I am running kernel 2.6.15 and libc 2.3.6.


10-15 minutes is very bad.
Something needs an upgrade.


Or a bugfix/enhancement, unless there already is a newer version of 
coreutils that stats in inode order.  My version of coreutils is 5.93.



I presume you used xargs -- you wouldn't run stat 114K times...


Yes

ls -Ui > files
cat files | sort -g | cut -c 9- > files-sorted
cat files | cut -c 9- > files-unsorted
time cat files-unsorted | xargs stat > /dev/null
< clear cache >
time cat files-sorted | xargs stat > /dev/null


Sorting by inode number made the stats at least 10 times faster.






stat() order performance issues

2007-01-25 Thread Phillip Susi
I have noticed that performing commands such as ls ( even with -U ) and 
du in a Maildir with many thousands of small files takes ages to 
complete.  I have investigated and believe this is due to the order in 
which the files are stat()ed.  I believe that these utilities are simply 
stat()ing the files in the order that they are returned by readdir(), 
and this causes a lot of random disk reads to fetch the inodes from disk 
out of order.


My initial testing indicates that sorting the files into inode order and 
calling stat() on them in order is around an order of magnitude faster, 
so I would suggest that utilities be modified to behave this way.


Questions/comments?
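
The experiment reduces to a few lines (a sketch assuming GNU ls and stat; the real measurement also needs a cold cache between runs, which this omits, and the directory and file names are invented):

```shell
# List inodes in readdir order, sort numerically, then stat in that order.
dir=$(mktemp -d)
for i in 1 2 3 4 5; do touch "$dir/msg$i"; done
cd "$dir"
ls -Ui > /tmp/files                     # "<inode> <name>" per line, unsorted
sort -n /tmp/files | awk '{print $2}' > /tmp/files-sorted
n=$(xargs stat < /tmp/files-sorted | grep -c 'Inode:')
echo "stat'ed $n entries in inode order"
```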





Re: [bug #17903] cp/mv only read/write 512 byte blocks when filesystem blksize > 4MB

2006-10-03 Thread Phillip Susi
Why not simply cap the size at 4 MB?  If it is greater than 4 MB just go 
with 4 MB instead of 512 bytes.  In fact, you might even want to cap it 
at less than 4 MB, say 1 MB or 512 KB.  I think you will find that any 
size larger than the 32-128 kb range yields no further performance 
increase and can even be detrimental due to the increased memory pressure.


Tony Ernst wrote:

Hi Paul,

Unfortunately, there isn't really a largest legitimate st_blksize 
for XFS file systems, or I should say the maximum is whatever fits 
into a size_t (64 bits).  It's dependent on the stripe width.  I 
talked with an XFS developer who told me that 2GB is not out of the 
question today.


Now there is also the question of whether or not we really want cp/mv
allocating a 2GB buffer, but that's a different issue (and a site with 
that kind of disk set-up probably also has plenty of memory).


Since the 4MB check was to fix a specific problem on hpux, it seems 
like that check should occur inside the # if defined hpux ... section.
At the very least, since the bogus value returned by hpux is such an 
strange number, maybe we could just change:

 (statbuf).st_blksize <= (1 << 22)) /* 4MB */ \
to:
 (statbuf).st_blksize != 2147421096) \

I would be very surprised if 2147421096 was ever a valid st_blksize on 
any system/filesystem.  It's not a power of 2, or even a multiple 
of 128, 512, etc.


% factor 2147421096
2147421096: 2 2 2 3 3 17 37 47417

Thanks,
Tony






Re: [bug #17903] cp/mv only read/write 512 byte blocks when filesystem blksize > 4MB

2006-10-03 Thread Phillip Susi

Tony Ernst wrote:
I believe the larger block sizes are especially beneficial with RAID. 
I'm adding Geoffrey Wehrman to the CC list, as he understands disk I/O 
much better than I do.


I believe most kernels always perform the actual IO in the same size 
chunks due to the block layer and cache, even if user space passes down 
a large buffer.  The exception to this on Linux would be when you use 
O_DIRECT IO, then using buffer sizes at least as large as the stripe 
width is definitely good for keeping all the disks spinning.


In any case, the raid case falls under the purview of the kernel buffer 
cache read-ahead mechanism, and is beyond the scope of coreutils.  From 
the perspective of coreutils, using a larger buffer size has the benefit 
of reducing the number of system calls needed, but causes more memory to 
be locked down.  As such you need to have some kind of limit on the size 
of the buffer so you don't try to exhaust system memory.


On the other hand, in a way it is up to the kernel to provide a 
reasonable value for the block size knowing full well that applications 
that use that value use it as a guide to the IO buffer size to use.  In 
that case I suspect that XFS really should not be setting that value to 
many megabytes.




I don't think a cap is really necessary.  This test and arbitrary 
limitation were put in to work around a specific problem on hpux.  But 
it has side-effects that reach beyond hpux and the original problem. 
So the test should either be limited to hpux-only, or fixed to get rid 
of the side-effects by making it more specific to the original problem.









Re: dd new iflag= oflag= flags directory, nolinks

2006-03-21 Thread Phillip Susi
What is atomic about having dd do this?  open() with O_DIRECTORY to test 
for existence of a directory is exactly what test does isn't it?  If 
your goal is to test for the existence of a directory then test is what 
you want to use, not dd.  The purpose of dd is to copy/convert data 
between devices, not test if files or directories exist, so I don't 
think this patch is very appropriate. 


Paul Eggert wrote:

You can open a directory without writing to it.  Something like this:

   $ mkdir foo
   $ dd if=foo iflag=directory count=0
   0+0 records in
   0+0 records out
   0 bytes (0 B) copied, 2e-05 seconds, 0 B/s

You can think of this as a way to do a test -d foo && test -r foo
atomically, without race conditions.  Admittedly this is somewhat
twisted, but the new documentation does say that iflag=directory is of
limited utility

  




Re: dd new iflag= oflag= flags directory, nolinks

2006-03-21 Thread Phillip Susi

Paul Eggert wrote:

No, because test -d foo && test -r foo is _two_ invocations of
test, not one.  A race condition is therefore possible.  The race
condition is not possible with dd if=foo iflag=directory count=0.

  


Ok, so this allows you to atomically test if the named object is both a 
directory and is readable, but why bother?  It's a bit silly to worry 
about a race between the two tests when you still have a race between 
the dd-test and whatever it is the script actually does with the 
directory, such as try to list the files in it.  Practically, 
eliminating the one pedantic race condition doesn't solve any problems 
because you've still got a dozen others. 




Admittedly this usage is unusual.  The main arguments for it are
(1) dd supports all the other O_* flags, so why not O_DIRECTORY? and
(2) I ran into a situation where I could use O_DIRECTORY.
(1) is far more important than (2).


The purpose of dd is to transfer data.  Since O_DIRECTORY excludes 
transfering data, its use goes against the grain of dd, and since it 
doesn't actually accomplish anything useful either, I don't think it 
should be done.  It just adds complexity for no good reason.





Re: df enhancment for removable media

2006-03-21 Thread Phillip Susi
This sounds like an autofs problem.  I'm running ubuntu and hal auto 
mounts removable media when it is inserted.  When it is not mounted, df 
will not show a line for it at all, since df only shows mounted points. 
 I think what you are seeing is an autofs mount point being mounted 
there which is why df shows a line for the mount point, but autofs has 
decided to unmount the real fs and return bogus stat values.


I'd suggest not using autofs.  In any case, this isn't a bug with df.

Kevin R. Bulgrien wrote:

Did anything final every come out of this thread.  I've written a
plug-in script for amaroK that a Suse user is complaining about.
I never heard of a system unmounting a disk automagically behind
the user's back when a mount was explicitly requested.

df is reporting USB media to be have 0 bytes free.  The simple
ls /media/USB_DISK > /dev/null; df /media/USB_DISK example is not
sufficient to get df to report something real.

Does anyone feel like giving some idea how this might best be
handled.  I hate to kludge things by creating a file on the USB
device, and it seems silly to over-access the media just to get
a df to output what I want.  I could do a find, because some
other function in my script does that and df reports properly,
but it seems pretty heavy-handed for a simple space check to
run through a USB drive that might be a few GB in size with
hundreds of files.

Is this something that was fixed so that a later version of
Suse might have a df that works?  If so, I might just allow the
user to override the space checking in the event he has an
affected system.

Thanks,

Kevin R. Bulgrien

Juergen Weigert wrote:


Hi coreutils people!

On a recent SUSE Linux df became unreliable for e.g. USB-drives.
This is because hald automatically mounts and unmounts such drives
as they are accessed.

Usually I get something like:

$ df /media/USB_DISK
Filesystem   1K-blocks  Used Available Use% Mounted on
/dev/sda10 0 0   -  /media/USB_DISK

only if the USB_DISK is being accessed, I get the expected output.

$ ls /media/USB_DISK > /dev/null; df /media/USB_DISK
Filesystem   1K-blocks  Used Available Use% Mounted on
/dev/sda1   252522  238718  13804  95% /media/USB_DISK

A simple enhancement for df is to actively access the USB_DISK while
running statfs(). I've added an opendir() call in the attached patch. This
can be suppressed with a new commandline option -n.

Please keep me in CC, I am not subscribed.

thanks,
Jw.


Paul Eggert wrote:


Juergen Weigert [EMAIL PROTECTED] writes:


Unless I'm missing something I'd rather not change the default behavor
of df, as that would be a compatibility hassle.  That is, df shouldn't
attempt to mount file systems by default; it should do so only if the
user asks, with a new option.

These hald mounts are different. For almost every aspect such a device
appears to be mounted. So I figured, df should also pretend the
device is mounted.

But lots of programs other than df invoke statfs.  We shouldn't have
to change them all.  Instead, it would be much better to fix statfs to
do the right thing with hald mounts.  statfs should return values that
are consistent with every other system call: it should not return
incorrect values simply for the convenience of some low-level hardware
abstraction layer.

Please also see the message from Ivan Guyrdiev of Cornell archived at
http://www.nsa.gov/selinux/list-archive/0507/thread_body36.cfm dated
2005-07-20 in which he says something similar: the statfs
implementation needs to get fixed.



Looks ugly in df.c, right. But in fsusage.c we'd have to place the
new code in multiple implementations. Ugly too.

It would only need to be placed in sections corresponding to
implementations that have the bug.  Currently, that's just one
implementation: GNU/Linux, and only a small subset of these hosts as
well.  Since the workaround issues more system calls, it would be nice
to detect the broken implementations at compile-time somehow, or at
least filter out the obviously non-broken implementations.









Re: dd new iflag= oflag= flags directory, nolinks

2006-03-07 Thread Phillip Susi
I'm confused.  You can't open() and write() to a directory, so how does 
it make any sense to ask dd to set O_DIRECTORY?


Paul Eggert wrote:

I wanted to use dd iflag=directory (to test whether a file is a
directory, atomically), and noticed that dd didn't have it.  The use
was a fairly obscure one (just testing the O_DIRECTORY flag) but I
figured dd should support all the flags.  I noticed also that dd
doesn't support the Solaris 10 O_NOLINKS option.  So I installed this:

2006-03-05  Paul Eggert  [EMAIL PROTECTED]

* doc/coreutils.texi (dd invocation): New flags directory, nolinks.
Alphabetize nofollow.
* src/dd.c (flags, usage): New flags directory, nolinks.
* src/system.h (O_NOLINKS): Define to 0 if not already defined.

Index: doc/coreutils.texi
===
RCS file: /fetish/cu/doc/coreutils.texi,v
retrieving revision 1.315
diff -p -u -r1.315 coreutils.texi
--- doc/coreutils.texi  27 Feb 2006 10:47:23 -  1.315
+++ doc/coreutils.texi  6 Mar 2006 07:15:27 -
@@ -7003,7 +7003,8 @@ argument(s).  (No spaces around any comm
 Access the output file using the flags specified by the @var{flag}
 argument(s).  (No spaces around any comma(s).)
 
-Flags:
+Here are the flags.  Not every flag is supported on every operating
+system.
 
 @table @samp
 
@@ -7019,6 +7020,13 @@ contents of the file.  This flag makes s

 @cindex direct I/O
 Use direct I/O for data, avoiding the buffer cache.
 
+@item directory
+@opindex directory
+@cindex directory I/O
+
+Fail unless the file is a directory.  Most operating systems do not
+allow I/O to a directory, so this flag has limited utility.
+
 @item dsync
 @opindex dsync
 @cindex synchronized data reads
@@ -7043,11 +7051,6 @@ Use non-blocking I/O.
 @cindex access time
 Do not update the file's access time.
 
-@item nofollow
-@opindex nofollow
-@cindex symbolic links, following
-Do not follow symbolic links.
-
 @item noctty
 @opindex noctty
 @cindex controlling terminal
@@ -7056,6 +7059,16 @@ This has no effect when the file is not 
 On many hosts (e.g., @acronym{GNU}/Linux hosts), this option has no effect
 at all.
 
+@item nofollow
+@opindex nofollow
+@cindex symbolic links, following
+Do not follow symbolic links.
+
+@item nolinks
+@opindex nolinks
+@cindex hard links
+Fail if the file has multiple hard links.
+
 @item binary
 @opindex binary
 @cindex binary I/O
Index: src/dd.c
===
RCS file: /fetish/cu/src/dd.c,v
retrieving revision 1.190
diff -p -u -r1.190 dd.c
--- src/dd.c7 Dec 2005 21:12:12 -   1.190
+++ src/dd.c6 Mar 2006 07:15:27 -
@@ -263,10 +263,12 @@ static struct symbol_value const flags[]
   {"append",    O_APPEND},
   {"binary",    O_BINARY},
   {"direct",    O_DIRECT},
+  {"directory", O_DIRECTORY},
   {"dsync",     O_DSYNC},
   {"noatime",   O_NOATIME},
   {"noctty",    O_NOCTTY},
   {"nofollow",  O_NOFOLLOW},
+  {"nolinks",   O_NOLINKS},
   {"nonblock",  O_NONBLOCK},
   {"sync",      O_SYNC},
   {"text",      O_TEXT},
@@ -460,6 +462,8 @@ Each FLAG symbol may be:\n\
 ), stdout);
   if (O_DIRECT)
     fputs (_("  direct    use direct I/O for data\n"), stdout);
+  if (O_DIRECTORY)
+    fputs (_("  directory fail unless a directory\n"), stdout);
   if (O_DSYNC)
     fputs (_("  dsync     use synchronized I/O for data\n"), stdout);
   if (O_SYNC)
@@ -468,11 +472,13 @@ Each FLAG symbol may be:\n\
     fputs (_("  nonblock  use non-blocking I/O\n"), stdout);
   if (O_NOATIME)
     fputs (_("  noatime   do not update access time\n"), stdout);
-  if (O_NOFOLLOW)
-    fputs (_("  nofollow  do not follow symlinks\n"), stdout);
   if (O_NOCTTY)
     fputs (_("  noctty    do not assign controlling terminal from file\n"),
            stdout);
+  if (O_NOFOLLOW)
+    fputs (_("  nofollow  do not follow symlinks\n"), stdout);
+  if (O_NOLINKS)
+    fputs (_("  nolinks   fail if multiply-linked\n"), stdout);
   if (O_BINARY)
     fputs (_("  binary    use binary I/O for data\n"), stdout);
   if (O_TEXT)
Index: src/system.h
===
RCS file: /fetish/cu/src/system.h,v
retrieving revision 1.143
diff -p -u -r1.143 system.h
--- src/system.h26 Feb 2006 10:03:17 -  1.143
+++ src/system.h6 Mar 2006 07:15:27 -
@@ -193,6 +193,10 @@ initialize_exit_failure (int status)
 # define O_NOFOLLOW 0
 #endif
 
+#if !defined O_NOLINKS
+# define O_NOLINKS 0
+#endif
+
 #if !defined O_RSYNC
 # define O_RSYNC 0
 #endif
  




Re: comparing string with regular expression using test command in unix

2006-02-22 Thread Phillip Susi

Or grep.

Paul Eggert wrote:

N Gandhi Raja [EMAIL PROTECTED] writes:


Can we use test command in UNIX to compare a *string* with the
*regular expression*?


No.  You might look at 'expr' or 'awk' instead.
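
For example, both approaches in one sketch (POSIX grep and expr; the sample string is invented):

```shell
# Matching a string against a regular expression with grep and expr.
s=abc123
# grep: the exit status says whether the string matches
if printf '%s\n' "$s" | grep -q '^[a-z][a-z]*[0-9][0-9]*$'; then
  match=yes
else
  match=no
fi
# expr: prints the length of the leading portion matched by the BRE
len=$(expr "$s" : '[a-z]*')
echo "match=$match leading_alpha_len=$len"
```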








Re: coreutils-5.94 imminent

2006-02-13 Thread Phillip Susi
Shouldn't it be made consistent?  IMHO, the command mv a b/ means move 
the file or directory named a into the directory named b, so if b does 
not exist or is not a directory, it should fail.  If you want to make mv 
deviate from this behavior, then at least shouldn't it behave the same 
on all platforms, without depending on the implementation of rename()?


Jim Meyering wrote:

Eric Blake [EMAIL PROTECTED] wrote:

According to Jim Meyering on 2/9/2006 1:56 AM:

I'm hoping to release coreutils-5.94 soon.  If anyone sees
something on the trunk that they want but that's not yet
in 5.94, please speak up now.  So far, my policy has been
to apply only bug fixes.

Whatever happened to this thread:
http://lists.gnu.org/archive/html/bug-coreutils/2005-11/msg00285.html


Thanks for bringing that up.
In spite of saying I wanted to fix it for 5.94, I decided to let it wait.
The probable fix and tests seem complicated/risky enough that I'd prefer
to release 5.94 now.  For example, the tests will have to be dependent
on how rename works, and will have to ensure that e.g.,

  rm -rf a b; touch a; mv a b/

still fails on systems with a rename syscall that honors trailing slashes.

I'd rather see such changes made to the trunk first,
and get some exposure through a test release.

Besides, no one volunteered to do the work ;-)
The code changes are trivial.  Writing tests and changing
documentation will take more time.
The fact that changing this part of mv induces no failure (on Linux)
in the test suite is a sure indication we need more tests.






Re: RFC: How du counts size of hardlinked files

2006-01-13 Thread Phillip Susi
Maybe I misunderstood you but you seem to think that each hard link to 
the same file can have different ownerships.  This is not the case. 
Hard links are just additional names for the same inode, and permissions 
and ownership is associated with the inode, not the name(s).


Also I just tested it and du doesn't report the size used by duplicate 
hard links in the tree twice.  I did a cp -al foo bar, then a du -sh, du 
-sh foo, and they were both the same size.
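
A quick way to reproduce that observation (a sketch, assuming GNU du; all names are invented):

```shell
# A 100 KiB file plus a hard link to it: du should count the data once.
d=$(mktemp -d)
cd "$d"
dd if=/dev/zero of=f bs=1K count=100 2>/dev/null
ln f g                                  # second name for the same inode
kb=$(du -sk . | awk '{print $1}')
echo "du -sk reports ${kb}K (about 100K plus directory overhead, not 200K)"
```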




Johannes Niess wrote:

Hi list,

du (with default options) seems to count files with multiple hard links in the 
first directory it traverses.


The -l option changes that.

But there are other valid viewpoints.

Somehow the byte count of multiple hardlinks partially belongs to all of them, 
even when not part of traversed directories. In this mode a file with 10 
bytes and 3 hardlinks would be counted as 3 files with 3 bytes (an only one 
hardlink) each. The rounding error of integers is acceptable in this 
'approximate' mode. Programmatically this is should be very similar to the -l 
mode. Use case: Different physical owners of the hardlinks and doing fair 
accounting for them. (Of course the inode has only one common logical owner 
for all directory entries).


Not counting multiple AND out-of-tree hardlinks is also usefull. It tells us 
how much space we really gain when deleting that tree. 'rm-size' could be a 
name for this mode. Programmatically this is similar to default mode: In Perl 
I'd use hash keys for the test in default mode. In 'rm-size' mode I'd 
increase the hash values of visited inodes.  Finally compare # of visited 
directory entries to the # of links.


du seems to be the natural home for this functionality. Or is it feature 
bloat?


Background: Backups via 'cp -l' need (almost) no space for files unchanged in 
several cycles. But these shadow forests of hardlinks are difficult to 
account for. Especially when combined with finding and linking identical  
files across several physical owners.


Johannes Niess

P.S: I'm not volunteering to implement this. I did not even feel enough need 
to do the perl scripts.








Re: Using DD to write to a specific LBA (converting LBA to offset, blocksize, count)

2006-01-05 Thread Phillip Susi
Hard drive sectors are 512 bytes so use a bs of 512 and skip FF68 
blocks.  I'm not sure if dd will accept hex numbers, try prefixing it 
with a 0x ( the C convention for hex numbers ).  Otherwise, convert the 
hex number to decimal.
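
For the example range given, the arithmetic can be done directly in the shell, since POSIX arithmetic expansion accepts the 0x prefix (the device name below is illustrative only; note that for writing, dd positions the output with seek=, while skip= applies to the input):

```shell
# Convert an LBA range (hex) to dd parameters, assuming 512-byte sectors.
first=$((0x00FF68))              # 65384
last=$((0x00FF6F))               # 65391
count=$((last - first + 1))      # 8 sectors
echo "dd if=/dev/zero of=/dev/sda bs=512 seek=$first count=$count"
```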


Mark Perino wrote:

How does one convert from LBA to skip, blocksize, and count?

IE I want to write zero's over LBA 00FF68 through 00FF6F on a raw device
/dev/sda /dev/rhdisk4, /drv/C0T0D0S1, etc..

As an example under AIX I would like to use:

dd if=/dev/zero of=/dev/rhdisk4 skip=<skip> blocksize=<bs> count=<count>

Where can I obtain the info to calculate <skip>, <bs> and <count>?  Whatever
platform you can show me how to do the math on I can probably figure out
how to do this on AIX, Linux, Solaris, etc..








Re: sleep command generates hardisk activity

2005-12-20 Thread Phillip Susi
Most likely this is the access timestamps being updated on the files 
being read, try adding the noatime option to your mount options to 
prevent this.


Jochen Baier wrote:

hi,

i ran into a weird problem: a script which uses the sleep command 
generates hard disk access every x seconds. this is really annoying because 
i can hear the hard disk sound.

another user reported the same behavior.

the script:

while true; do
sleep 3
done


output from strace  sleep 3:






Re: Manual page for mount

2005-12-09 Thread Phillip Susi
I have always thought that the very name sync is completely 
misleading.  The option really has nothing at all to do with IO being 
synchronous or asynchronous, you can still perform IO either way ( think 
non blocking and linux async IO ).  What this option really does is 
simply cause the cache to switch to write-through mode instead of 
write-back mode.


I would have it say:

All I/O to the file system should be synced to the disk immediately. 
This effectively changes kernel disk caching for the device from 
write-back to write-through mode, causing more writes to occur.  Media 
with limited write cycles, flash for example, will age prematurely. 
Many operations may be slowed down significantly by use of this option, 
but the filesystem will be more up to date in the event of a system crash.


Jonathan Andrews wrote:

The sync and async options have an impact on disk caching, yet the
manual pages avoid the term cache - I assume being careful to be
general about capabilities of the underlying kernel.

This does make it difficult for people trying to find options relating
to disk cache behaviour, could the sync and async options be changed
to refer to disk caches explicitly. 



For example, sync currently reads.

All I/O to the file system should be done synchronously. In case of
media with limited number of write cycles (e.g. some flash drives)
sync may cause life-cycle shortening.


Something like this may be more meaningful to a lot of users, it may
oversimplify things with the use of the term cache, but in most cases
its the reference users are looking (searching) for.

All I/O to the file system should be done synchronously. This
effectively removes kernel disk caching for the device causing more
writes to occur. Media with limited write cycles, flash for example,
will age prematurely.


Thanks,
Jon







Re: better buffer size for copy

2005-11-22 Thread Phillip Susi

Robert Latham wrote:

I mean no offense cutting out most of your points.  You describe great
ways to achieve high I/O rates for anyone writing a custom file mover.
I shouldn't have mentioned network file systems.  It's a distraction
from the real point of my patch: cp(1) should consider both the source
and the destination st_blksize.



No problem... I kind of went off on a tangent there.


All I expect from st_blksize is what the stat(2)
manpage suggests:

   The value st_blocks gives the size of  the  file  in  512-byte
   blocks.  (This  may  be  smaller than st_size/512 e.g. when the
   file has holes.) The value st_blksize gives the preferred
   blocksize for efficient file system  I/O.  (Writing to a file
   in smaller chunks may cause an inefficient
   read-modify-rewrite.)

All I really want is for cp(1) to do the right thing no matter what
the source or destination st_blksize value might be. 



Ok, I see what you are talking about now.  Using a copy block size 
smaller than the filesystem block size can result in a lot of extra IO, 
thus reducing throughput.  Of course, this doesn't really apply in a 
typical use case because the kernel will cache the writes and combine 
them when it flushes the IO to disk; however, yes... it is a good idea 
to use an IO block size that is at least as large as the larger of the 
source and destination filesystem block sizes.



In copying from a 4k blocksize file sytem to a 64k blocksize
filesystem, cp(1) will perform well, as it is using a 64k buffer.  


In copying *from* that 64k blocksize filesystem *to* a 4k blocksize
filesytem, cp(1) will not perform as well: it's using a 4k buffer and
so reading from the source filesystem in less-than-ideal chunks.



Again, this probably won't happen in real practice due to the influence 
of the filesystem cache, but I do see your point.  In practice though, I 
don't know of any filesystem with a 64k block size.  By default ext2/3 
use 1k, and reiserfs uses 4k.  These are going to be typical values for 
st_blksize, yet if you use a copy block size of say, 64k, I think you 
will find the performance to be significantly better than either 1k or 
4k.  I think that a good case in point is copying to/from a typical 
ext2/3 filesystem using a 1k block size.  Using a buffer smaller than a 
single 4k page is going to significantly degrade performance.  You 
certainly do not want to go smaller than the block size, but really, you 
should be going larger.
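The policy argued for here - a buffer at least as large as both filesystems' st_blksize, but never below a healthy floor - can be sketched in a few lines of Python. The function name and the 64 KB floor are illustrative assumptions for this discussion, not coreutils' actual code.

```python
import os
import tempfile

def copy_buffer_size(src_fd, dst_fd, floor=64 * 1024):
    """Pick a copy buffer at least as large as both filesystems'
    preferred I/O sizes (st_blksize), but never smaller than `floor`
    (the 64 KB sweet spot suggested above)."""
    src_bs = os.fstat(src_fd).st_blksize
    dst_bs = os.fstat(dst_fd).st_blksize
    return max(src_bs, dst_bs, floor)

# Example: size a buffer for copying between two files.
src = tempfile.TemporaryFile()
dst = tempfile.TemporaryFile()
bufsize = copy_buffer_size(src.fileno(), dst.fileno())
```

This never reads or writes in chunks smaller than either filesystem's block size, while still going larger when both block sizes are small.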



Thanks again for taking the time to respond.  I hope I have made the
intent of my patch more clear. 


==rob



You did... and I thank you as well and hope that I have made myself more 
clear.




___
Bug-coreutils mailing list
Bug-coreutils@gnu.org
http://lists.gnu.org/mailman/listinfo/bug-coreutils


Re: can we remove a directory from within that directory??

2005-11-21 Thread Phillip Susi
It is a general design philosophy of linux, and unix in general, that 
the kernel will not enforce locking of files.  This is why you can 
upgrade software without rebooting: the old file can be deleted and 
replaced with the new file, even though it is still in use.  Of course, 
it isn't actually deleted until everyone using it closes it, but its 
name is removed from the directory tree immediately.


If you really want to mess up a system, you can rm -fr / ( as root of 
course ) and it will happily delete all the files on the system. 
Whatever is running at the time will keep running, but new opens will 
fail.  This behavior is pretty much by design.
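The unlink-while-open semantics described above can be demonstrated in a few lines of Python (file names here are illustrative): the name disappears from the directory tree at once, but an already-open descriptor keeps reading the data until the last close.

```python
import os
import tempfile

path = os.path.join(tempfile.mkdtemp(), "victim")
with open(path, "w") as f:
    f.write("still here")

f = open(path)          # keep the file open
os.unlink(path)         # the name disappears immediately...
name_gone = not os.path.exists(path)
data = f.read()         # ...but the data survives until the last close
f.close()
```

This is the same mechanism that lets an upgraded binary replace a running one without a reboot.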


kuldeep vyas wrote:
Hi,


I'm using Redhat 9 (kernel 2.4.20-8 on i686)
I logged in as k(username), then I started terminal, 
then I gave following commands:-

k> pwd
/home/k

k> mkdir my_dir
  // i created a directory: my_dir

k> cd my_dir
  // let's go in my_dir

  // now let's try to remove my_dir

k> rmdir /home/k/my_dir
  // no error;

k> ls /home/k/
  // my_dir gone 


k> pwd
/home/k/my_dir
  // oops!!

  // let's create my_file here!!
k> cat my_file
bash: my_file: no such file or directory
  // I'm not allowed to create a file here.


pwd says I'm in my_dir, but my_dir doesn't exist.
I think a user should not be allowed to remove a directory
until and unless he is placed in a directory which is
hierarchically above the one he has chosen to remove.


If my approach is not right, I'd like to know the 
philosophy behind this.


Happy contributing to LINUX!!

kuldeep vyas
  









Re: better buffer size for copy

2005-11-20 Thread Phillip Susi
What would such network filesystems report as their blocksize?  I have a 
feeling it isn't going to be on the order of a MB.  At least for local 
filesystems, the ideal transfer block size is going to be quite a bit 
larger than the filesystem block size ( if the filesystem is even block 
oriented... think reiser4, or cramfs ).  In the case of network 
filesystems, they should be performing readahead in the background 
between small block copies to keep the pipeline full.  As long as the 
copy program isn't blocked elsewhere for long periods, say in the write 
to the destination, then the readahead mechanism should keep the 
pipeline full.  Up to a point, using larger block sizes saves some cpu 
by lowering the number of system calls.  After a certain point, the copy 
program can start to waste enough time in the write that the readahead 
stops and stalls the pipeline. 

If you want really fast copies of large files, then you want to send 
down multiple overlapped aio ( real aio, not the glibc threaded 
implementation ) O_DIRECT reads and writes, but that gets quite 
complicated.  Simply using blocking O_DIRECT reads into a memory mapped 
destination file buffer performs nearly as well, provided you use a 
decent block size.  On my system I have found that 128 KB+ buffers are 
needed to keep the pipeline full because I'm using a 2 disk raid0 with a 
64k stripe factor.  As a result, blocks smaller than 128 KB only keep 
one disk going at a time.  That's probably getting a bit too complicated 
though for this conversation. 

If we are talking about the conventional blocking cached read, followed 
by a blocking cached write, then I think you will find that using a 
buffer size of several pages ( say 32 or 64 KB ) will be MUCH more 
efficient than 1024 bytes ( the typical local filesystem block size ), 
so using st_blksize for the size of the read/write buffer is not good.  
I think you may be ascribing meaning to st_blksize that is not there. 
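The "blocking O_DIRECT reads into an aligned buffer" approach mentioned above can be sketched in Python. Everything here is illustrative: the 128 KB block matches the 2-disk 64k-stripe raid0 example, an anonymous mmap provides the page alignment O_DIRECT requires, and the code falls back to a cached read on filesystems (e.g. tmpfs) that reject O_DIRECT.

```python
import mmap
import os
import tempfile

BLOCK = 128 * 1024          # two 64k raid0 stripes, per the text above

path = os.path.join(tempfile.mkdtemp(), "big")
payload = os.urandom(BLOCK)
with open(path, "wb") as f:
    f.write(payload)

# O_DIRECT requires a suitably aligned user buffer; an anonymous mmap
# is page-aligned, which satisfies the usual 512-byte/4k rules.  Fall
# back to an ordinary cached read where the filesystem (e.g. tmpfs)
# does not support O_DIRECT, or where the flag does not exist.
try:
    fd = os.open(path, os.O_RDONLY | getattr(os, "O_DIRECT", 0))
except OSError:
    fd = os.open(path, os.O_RDONLY)
buf = mmap.mmap(-1, BLOCK)
nread = os.readv(fd, [buf])
os.close(fd)
```

Real overlapped aio would issue several such reads in flight at once; this single blocking read is the simpler variant the paragraph says performs nearly as well.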



Robert Latham wrote:


In local file systems, i'm sure you are correct.  If you are working
with a remote file system, however, the optimal size is on the order
of megabytes, not kilobytes.  For a specific example, consider the
PVFS2 file system, where the plateau in blocksize vs. bandwidth is
two orders of magnitude larger than 64 KB.  PVFS2 is a parallel file
system for linux clusters.  I am not nearly as familiar with Lustre,
GPFS, or GFS, but I suspect those filesystems too would benefit from
block sizes larger than 64 KB.  


Are you taking umbrage at the idea of using st_blksize to direct how
large the transfer size should be for I/O?  I don't know what other
purpose st_blksize should have, nor are there any other fields which
are remotely valid for that purpose.  

Thanks for your feedback. 
==rob


 







Re: better buffer size for copy

2005-11-19 Thread Phillip Susi
I don't see why the filesystem's cluster size should have a thing to do 
with the buffer size used to copy files.  For optimal performance, the 
larger the buffer, the better.  Diminishing returns applies of course, 
so at some point the increase in buffer size results in little to no 
further increase in performance, so that's the size you should use.  I 
believe that the optimal size is about 64 KB. 




Robert Latham wrote:


(README says to ping if there's not been an ack of a patch after two
weeks.  here i go)

This patch to today's (18 Nov 2005) coreutils CVS makes copy.c
consider both the source and destination blocksize when computing
buf_size.  With this patch, src/copy.c will use the LCM of the source
and destination block sizes.  As Paul suggested, I used the buffer_lcm
routine from diffutils. 
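The idea behind buffer_lcm is simple enough to sketch in Python (this is a hypothetical stand-in, not diffutils' actual C routine; the 8 MB cap is an illustrative guess so a pathological pair of sizes cannot demand an enormous buffer):

```python
from math import gcd

def buffer_lcm(a, b, limit=8 * 1024 * 1024):
    """Least common multiple of two block sizes, in the spirit of
    diffutils' buffer_lcm: a buffer that is a whole multiple of both
    the source and destination block sizes."""
    if a == 0 or b == 0:
        return max(a, b, 1)
    l = a // gcd(a, b) * b
    return l if l <= limit else max(a, b)
```

With a 4k source and a 4 MB destination (or vice versa), this yields a 4 MB buffer, so both directions of the copy read and write in whole blocks.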


For what it's worth, this patch does not introduce any regressions
into the coreutils testsuite.

When copying from a remote filesystem with a block size of 4MB to a
filesystem with a 4k blocksize, the copy is *very* slow.  Going from a
filesystem with 4k blocks to a filesystem with 4MB blocks is much
faster.  With this patch, both operations are equally performant.

I went ahead and added a ChangeLog entry as well.  


Thanks.  I'll be more than happy to incorporate any suggestions or
comments.

==rob


 







Re: O_DIRECT support in dd

2005-11-18 Thread Phillip Susi

The man pages at:

http://www.gnu.org/software/coreutils/manual/html_chapter/coreutils_11.html#SEC65

Do not document an iflag parameter.  Is this simply an error in the 
documentation?  It looks like I'm still using coreutils 5.2, so I guess 
I'll have to upgrade.  I'm wondering now why there was over a year 
between the release of 5.2.1 and 5.92, with nothing in between.


Paul Eggert wrote:

Phillip Susi [EMAIL PROTECTED] writes:


I searched the archives and found a thread from over a year ago
talking about adding support to dd for O_DIRECT, but it is not
documented in the man pages.


It's in coreutils 5.93 dd.  Try, e.g., dd iflag=direct.  It's
not documented with the phrase O_DIRECT, though, which is possibly
why you missed it.


does dd use aio so as to keep the IO pipelined, since a blocking
read/write would only be able to access one file/disk at a time?


Nope.  Might be nice to add, I suppose.









O_DIRECT support in dd

2005-11-17 Thread Phillip Susi
I searched the archives and found a thread from over a year ago talking 
about adding support to dd for O_DIRECT, but it is not documented in the 
man pages.  Did the man pages not get updated, or did this patch not 
make it in?


If O_DIRECT is supported, but not documented, then I wonder: does dd use 
aio so as to keep the IO pipelined, since a blocking read/write would 
only be able to access one file/disk at a time?





