Re: df command should suppress duplicates

2012-12-04 Thread Bernhard Voelker
Hi Ondrej,

On 12/03/2012 07:09 PM, Ondrej Oprala wrote:
 thanks for the rebase :) .

no worries, you're welcome. ;-)

 I've modified the patch a bit, so I don't interfere with output if df -a
 is specified.

Thanks.

I just had a quick look at the patch and I like the idea
of combining the filter for rootfs with that of the
duplicate entries.

I don't have time for a detailed review right now.
A few things certainly need to be straightened, e.g. I think
we can't ignore a stat failure:

 +static bool
 +dev_examined (struct devlist *devlist, char *mount_dir, char *devname)
 +{
 [...]
 +  stat (mount_dir, &buf);

and it probably deserves a little more work on the texi
documentation and on the test (which doesn't cover rootfs
hiding yet).

Have a nice day,
Berny



Re: Make mv work better with SELinux.

2012-12-04 Thread Pádraig Brady

On 10/08/2012 09:24 PM, Daniel J Walsh wrote:

-----BEGIN PGP SIGNED MESSAGE-----
Hash: SHA1

One of, if not the, most common problems people hit with SELinux is the mv
command, which keeps the file context of the source location.

mv /home/dwalsh/index.html /var/www/html/

This blows up on everybody and then the users have no idea why.

I was thinking about adding -Z (--restorecon) to mv and having it basically do an
internal restorecon on the destination.

Then we could suggest people who get burnt by this to:

alias mv='mv -Z'

In Fedora 18 we have greatly enhanced matchpathcon, by pre-compiling the
regex, so there should be very little slow down in doing this.


A question on performance.
So there was a large matchpathcon() performance issue in Fedora 11 time,
where we had a 20x slow down if matchpathcon_init_prefix() wasn't called
https://bugzilla.redhat.com/show_bug.cgi?id=479502#c24

Does calling matchpathcon_init_prefix() still provide benefit on Fedora 18?
More importantly, since the new selinux::restorecon_private() doesn't
call matchpathcon_init_prefix(), will it have the large performance
issues on Fedora <= 17 and other SELinux supporting platforms?

Not a huge issue since install(1) enables setdefaultfilecon() by default,
whereas the new proposal would only enable when -Z is specified.
That's an inconsistency in the patch in this thread actually.
install -Z runs the new restorecon(), while also running the old
setdefaultfilecon(). Seems like we may need to drop the new install -Z
code for now, and possibly in future merge restorecon() and setdefaultfilecon().

cheers,
Pádraig.
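
For readers following along: the failure described in the quoted message can be worked around without any new mv option by relabelling after the move. The following is only a sketch under stated assumptions. mv_relabel is a hypothetical helper name (it is not part of coreutils or of the patch under review), and it assumes the restorecon utility from the SELinux userland tools is installed; where it is missing, the function degrades to a plain mv.

```shell
# mv_relabel SRC DEST: move a file, then reset the SELinux label of the result.
# Hypothetical helper for illustration; not part of coreutils.
mv_relabel() (
  src=$1
  dest=$2
  # Work out where the file ends up when DEST is an existing directory.
  if [ -d "$dest" ]; then
    target=$dest/$(basename -- "$src")
  else
    target=$dest
  fi
  mv -- "$src" "$dest" || exit
  # Relabel only where restorecon exists; a relabel failure is reported, not fatal.
  if command -v restorecon >/dev/null 2>&1; then
    restorecon "$target" || echo "mv_relabel: could not relabel $target" >&2
  fi
  exit 0
)
```

Usage would look like `mv_relabel /home/dwalsh/index.html /var/www/html/`. Treating a relabel failure as a warning rather than an error is a design choice of this sketch, not something the thread settles.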



Re: fifo unlimited buffer size?

2012-12-04 Thread Peng Yu
On Tue, Dec 4, 2012 at 6:24 AM, Pádraig Brady p...@draigbrady.com wrote:
 tag 13075 + notabug
 close 13075
 thanks

 On 12/04/2012 03:19 AM, Peng Yu wrote:

 Hi,

 I have the following script. When the number to the right of 'seq' is
 large (as 10 in the example), the script will hang. But when the
 number is small (say 1000), the script can be finished correctly. I
 suspect that the problem is that there is a limit on the buffer size
 for fifo. Is it so? Is there a way to make the following script work
 no matter how large the number is? Thanks!

 ~/linux/test/gnu/gnu/coreutils/mkfifo/tee$ cat main2.sh
 #!/usr/bin/env bash

 rm -rf a b c
 mkfifo a b c
 seq 10 | tee a > b &
 sort -k 1,1n a > c &
 join -j 1 <(awk 'BEGIN{OFS="\t"; FS="\t"} {print $1, $1+10}' < c) \
 <(awk 'BEGIN{OFS="\t"; FS="\t"}{print $1, $1+20}' < b)


 So this is problematic due to `sort`.
 That's special as it needs to consume all its input before
 producing any output. Therefore unless the buffers connecting
 the other commands in || can consume the data, there will be a deadlock.

I can't parse "Therefore unless the buffers connecting the other
commands in || can consume the data...". What is '||'?


 This version doesn't block for example as
 the input is being generated asynchronously for the sort command.

 #!/usr/bin/env bash
 rm -rf a b c
 mkfifo a b c
 join -j 1 <(awk 'BEGIN{OFS="\t"; FS="\t"} {print $1, $1+10}' < c) \
 <(awk 'BEGIN{OFS="\t"; FS="\t"}{print $1, $1+20}' < b) &
 seq 10 | sort -k 1,1n > c &
 seq 10 > b
 wait

 Obviously, if your input is expensive to generate,
 then you'd be best copying to another file
 and sorting that.

I should send the message to the regular mailing list. I have two
implicit requirements.

1. The input 'seq 10' can not be run twice, it has to be called once.
2. There can not be intermediate files generated.

Given the above requirements, is there no solution?

The general question is this: one input stream fans out to
multiple streams, each of which is then processed. These
processed streams then converge on one program, which produces one output
stream. This seems to be a general usage pattern. The pattern can be
nested arbitrarily, in which case the above two requirements
had better hold. Does this make sense?

-- 
Regards,
Peng
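
As an illustration of the fallback Pádraig mentions (copying the input to a file so that each pipeline has its own data source), here is a minimal sketch. It deliberately relaxes requirement 2 above, since it does create an intermediate file. The function name fan_out_join and the temporary directory layout are assumptions made for this example only.

```shell
# fan_out_join N: run both awk branches over "seq N" and join the results.
# The input is generated exactly once and copied to a temporary file, so
# neither branch can stall the other by filling a FIFO buffer.
fan_out_join() (
  n=$1
  dir=$(mktemp -d) || exit 1
  seq "$n" > "$dir/input"
  mkfifo "$dir/b"
  # Branch b reads its own copy of the data and feeds the FIFO in the background.
  awk 'BEGIN{OFS="\t"; FS="\t"} {print $1, $1+20}' "$dir/input" > "$dir/b" &
  # Branch c may take as long as it likes: sort reads the file, not a shared pipe.
  sort -k 1,1n "$dir/input" |
    awk 'BEGIN{OFS="\t"; FS="\t"} {print $1, $1+10}' |
    join -j 1 - "$dir/b"
  status=$?
  wait
  rm -rf "$dir"
  exit "$status"
)

fan_out_join 5
```

The cost is one full copy of the input on disk, which is exactly the overhead the two requirements above are trying to avoid.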



Re: Make mv work better with SELinux.

2012-12-04 Thread Pádraig Brady

On 12/04/2012 03:38 PM, Pádraig Brady wrote:

On 10/08/2012 09:24 PM, Daniel J Walsh wrote:

-----BEGIN PGP SIGNED MESSAGE-----
Hash: SHA1

One of, if not the, most common problems people hit with SELinux is the mv
command, which keeps the file context of the source location.

mv /home/dwalsh/index.html /var/www/html/

This blows up on everybody and then the users have no idea why.

I was thinking about adding -Z (--restorecon) to mv and having it basically do an
internal restorecon on the destination.

Then we could suggest people who get burnt by this to:

alias mv='mv -Z'

In Fedora 18 we have greatly enhanced matchpathcon, by pre-compiling the
regex, so there should be very little slow down in doing this.


A question on performance.
So there was a large matchpathcon() performance issue in Fedora 11 time,
where we had a 20x slow down if matchpathcon_init_prefix() wasn't called
https://bugzilla.redhat.com/show_bug.cgi?id=479502#c24

Does calling matchpathcon_init_prefix() still provide benefit on Fedora 18?
More importantly, since the new selinux::restorecon_private() doesn't
call matchpathcon_init_prefix(), will it have the large performance
issues on Fedora <= 17 and other SELinux supporting platforms?

Not a huge issue since install(1) enables setdefaultfilecon() by default,
whereas the new proposal would only enable when -Z is specified.
That's an inconsistency in the patch in this thread actually.
install -Z runs the new restorecon(), while also running the old
setdefaultfilecon(). Seems like we may need to drop the new install -Z
code for now, and possibly in future merge restorecon() and setdefaultfilecon().


Also could you comment on the different schemes used by
restorecon() and setdefaultfilecon().
The old setdefaultfilecon() sets the context of the dest files
to that returned by matchpathcon directly, whereas the new
restorecon() only uses the type portion of the context
from matchpathcon() and inserts that into the existing
context for the dest file.

thanks,
Pádraig.



Command-line program to convert 'human' sizes?

2012-12-04 Thread Assaf Gordon
Hello,

Is there a command-line program (or a recommended way) to expose coreutils'
common functionality of converting raw sizes to 'human' sizes and vice versa?

I'm referring to the -h parameter that du/df/sort accept,
reporting human sizes, but also the reverse (where sort's -h accepts 40M
as valid input).

I found two relevant threads, but no resolution:
http://lists.gnu.org/archive/html/coreutils/2011-08/msg00035.html
http://lists.gnu.org/archive/html/coreutils/2012-02/msg00088.html

Thanks,
 -gordon
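
To make the question concrete, here is a small sketch of the existing behaviour being referred to: sort -h already understands size suffixes when ordering lines, but it only compares them and never converts them to raw numbers. The wrapper name sort_human is an assumption for this example.

```shell
# sort_human: order lines by human-readable size, e.g. 2K < 40M < 1G.
sort_human() {
  sort -h
}

# Prints 2K, 40M, 1G (one per line).
printf '%s\n' 40M 2K 1G | sort_human
```

The same filter is commonly put after du -h to rank directories by size; what is missing is a standalone way to turn 40M into a byte count or back.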



Re: fifo unlimited buffer size? (possibly tee related)

2012-12-04 Thread Peng Yu
 I understand the structure, but the concurrent pipelines
 need separate data sources (process or file copy), or otherwise
 deadlock may happen as data overflows various buffers.
 I suppose this could be encapsulated in tee(1) with non-blocking
 writes and internal buffering, but that would just end up
 being a data copy anyway, so I'm not sure it's warranted.

So the thing to improve would be tee? If so, I'm glad that we have at least
figured this out.

Let me explain why file copying can be bad. In my pathological example
based on sort, sort needs to take all the input before generating any
output. However, there are many applications which just need to see a
portion (let's call it lookahead) of input before generating any
output, in which case copying the whole input is a waste, especially
when the input is very large (say hundreds of GB) and lookahead is much
smaller (say a few MB).

Could the maintainer of tee see if there is anything that can be improved?

-- 
Regards,
Peng



Re: Command-line program to convert 'human' sizes?

2012-12-04 Thread Pádraig Brady

On 12/04/2012 04:25 PM, Assaf Gordon wrote:

Hello,

Is there a command-line program (or a recommended way) to expose coreutils'
common functionality of converting raw sizes to 'human' sizes and vice versa?

I'm referring to the -h parameter that du/df/sort accept, reporting human sizes, but
also the reverse (where sort's -h accepts 40M as valid input).

I found two relevant threads, but no resolution:
http://lists.gnu.org/archive/html/coreutils/2011-08/msg00035.html
http://lists.gnu.org/archive/html/coreutils/2012-02/msg00088.html


Nothing yet. The plan is to make a numfmt command available with this interface:
http://lists.gnu.org/archive/html/coreutils/2012-02/msg00085.html

cheers,
Pádraig.
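
Editorial note for later readers: a numfmt program did subsequently ship with GNU coreutils (from release 8.21). The sketch below assumes such a version is installed; the wrapper names to_human and from_human are hypothetical and exist only for this example.

```shell
# to_human BYTES: raw number to an IEC-suffixed size.
to_human() {
  numfmt --to=iec "$1"
}

# from_human SIZE: IEC-suffixed size back to a raw number.
from_human() {
  numfmt --from=iec "$1"
}

to_human 11505426432
from_human 40M   # 40 * 1024 * 1024 = 41943040
```

On systems with an older coreutils the command will simply be missing, which matches the "Nothing yet" answer above.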



[PATCH 2/6] cp: -Z: simplify return code handling in selinux routines

2012-12-04 Thread Pádraig Brady
* src/selinux.c: Since we don't have to distinguish
return codes other than -1, simplify the handling of
rc in these routines.
---
 src/selinux.c |   36 +++++++++++-------------------------
 1 files changed, 11 insertions(+), 25 deletions(-)

diff --git a/src/selinux.c b/src/selinux.c
index afb3959..b1186e9 100644
--- a/src/selinux.c
+++ b/src/selinux.c
@@ -109,18 +109,13 @@ defaultcon (char const *path, mode_t mode)
   security_context_t scon = NULL, tcon = NULL;
   context_t scontext = NULL, tcontext = NULL;
 
-  rc = matchpathcon (path, mode, &scon);
-  if (rc < 0)
+  if (matchpathcon (path, mode, &scon) < 0)
     goto quit;
-  rc = computecon (path, mode, &tcon);
-  if (rc < 0)
+  if (computecon (path, mode, &tcon) < 0)
     goto quit;
-  scontext = context_new (scon);
-  rc = -1;
-  if (!scontext)
+  if (!(scontext = context_new (scon)))
     goto quit;
-  tcontext = context_new (tcon);
-  if (!tcontext)
+  if (!(tcontext = context_new (tcon)))
     goto quit;
 
   context_type_set (tcontext, context_type_get (scontext));
@@ -171,41 +166,32 @@ restorecon_private (char const *path, bool preserve)
 
   if (fd)
     {
-      rc = fstat (fd, &sb);
-      if (rc < 0)
+      if (fstat (fd, &sb) < 0)
         goto quit;
     }
   else
     {
-      rc = lstat (path, &sb);
-      if (rc < 0)
+      if (lstat (path, &sb) < 0)
         goto quit;
     }
 
-  rc = matchpathcon (path, sb.st_mode, &scon);
-  if (rc < 0)
+  if (matchpathcon (path, sb.st_mode, &scon) < 0)
     goto quit;
-  scontext = context_new (scon);
-  rc = -1;
-  if (!scontext)
+  if (!(scontext = context_new (scon)))
     goto quit;
 
   if (fd)
     {
-      rc = fgetfilecon (fd, &tcon);
-      if (rc < 0)
+      if (fgetfilecon (fd, &tcon) < 0)
         goto quit;
     }
   else
     {
-      rc = lgetfilecon (path, &tcon);
-      if (rc < 0)
+      if (lgetfilecon (path, &tcon) < 0)
         goto quit;
     }
 
-  rc = -1;
-  tcontext = context_new (tcon);
-  if (!tcontext)
+  if (!(tcontext = context_new (tcon)))
     goto quit;
 
   context_type_set (tcontext, context_type_get (scontext));
-- 
1.7.6.4




[PATCH 1/6] cp: -Z: restorecon(): fix detection and indication of errors

2012-12-04 Thread Pádraig Brady
* src/selinux.c (restorecon_private): Check for correct error code
from [lf]getfilecon().  Note gnulib ensures these functions
always return -1 on error.  Also indicate return with an error if
context_new() fails.
---
 src/selinux.c |    6 ++++--
 1 files changed, 4 insertions(+), 2 deletions(-)

diff --git a/src/selinux.c b/src/selinux.c
index e708b55..afb3959 100644
--- a/src/selinux.c
+++ b/src/selinux.c
@@ -193,15 +193,17 @@ restorecon_private (char const *path, bool preserve)
   if (fd)
     {
       rc = fgetfilecon (fd, &tcon);
-      if (!rc)
+      if (rc < 0)
         goto quit;
     }
   else
     {
       rc = lgetfilecon (path, &tcon);
-      if (!rc)
+      if (rc < 0)
         goto quit;
     }
+
+  rc = -1;
   tcontext = context_new (tcon);
   if (!tcontext)
     goto quit;
-- 
1.7.6.4




[PATCH 3/6] cp: -Z: check for more errors in selinux routines

2012-12-04 Thread Pádraig Brady
* src/selinux.c (defaultcon): Handle error returns from
context_type_get(), context_type_set() and context_str().
(restorecon_private): Likewise.
---
 src/selinux.c |   25 ++++++++++++++++++++-----
 1 files changed, 20 insertions(+), 5 deletions(-)

diff --git a/src/selinux.c b/src/selinux.c
index b1186e9..3235309 100644
--- a/src/selinux.c
+++ b/src/selinux.c
@@ -108,6 +108,8 @@ defaultcon (char const *path, mode_t mode)
   int rc = -1;
   security_context_t scon = NULL, tcon = NULL;
   context_t scontext = NULL, tcontext = NULL;
+  const char *contype;
+  char *constr;
 
   if (matchpathcon (path, mode, &scon) < 0)
     goto quit;
@@ -118,8 +120,14 @@ defaultcon (char const *path, mode_t mode)
   if (!(tcontext = context_new (tcon)))
     goto quit;
 
-  context_type_set (tcontext, context_type_get (scontext));
-  rc = setfscreatecon (context_str (tcontext));
+  if (!(contype = context_type_get (scontext)))
+    goto quit;
+  if (context_type_set (tcontext, contype))
+    goto quit;
+  if (!(constr = context_str (tcontext)))
+    goto quit;
+
+  rc = setfscreatecon (constr);
 
 //  printf("defaultcon %s %s\n", path, context_str(tcontext));
 quit:
@@ -149,6 +157,8 @@ restorecon_private (char const *path, bool preserve)
   struct stat sb;
   security_context_t scon = NULL, tcon = NULL;
   context_t scontext = NULL, tcontext = NULL;
+  const char *contype;
+  char *constr;
   int fd;
 
   if (preserve)
@@ -194,12 +204,17 @@ restorecon_private (char const *path, bool preserve)
   if (!(tcontext = context_new (tcon)))
     goto quit;
 
-  context_type_set (tcontext, context_type_get (scontext));
+  if (!(contype = context_type_get (scontext)))
+    goto quit;
+  if (context_type_set (tcontext, contype))
+    goto quit;
+  if (!(constr = context_str (tcontext)))
+    goto quit;
 
   if (fd)
-    rc = fsetfilecon (fd, context_str (tcontext));
+    rc = fsetfilecon (fd, constr);
   else
-    rc = lsetfilecon (path, context_str (tcontext));
+    rc = lsetfilecon (path, constr);
 
 //  printf("restorcon %s %s\n", path, context_str(tcontext));
 quit:
-- 
1.7.6.4




[PATCH 6/6] cp: -Z: fix handling of open errors in restorecon()

2012-12-04 Thread Pádraig Brady
* src/selinux.c (restorecon_private): open() returns -1 on error.
---
 src/selinux.c |   11 ++++++-----
 1 files changed, 6 insertions(+), 5 deletions(-)

diff --git a/src/selinux.c b/src/selinux.c
index e1c122b..0bb6bcc 100644
--- a/src/selinux.c
+++ b/src/selinux.c
@@ -170,10 +170,10 @@ restorecon_private (char const *path, bool local)
     }
 
   fd = open (path, O_RDONLY | O_NOFOLLOW);
-  if (!fd && (errno != ELOOP))
+  if (fd == -1 && (errno != ELOOP))
     goto quit;
 
-  if (fd)
+  if (fd != -1)
     {
       if (fstat (fd, &sb) < 0)
         goto quit;
@@ -189,7 +189,7 @@ restorecon_private (char const *path, bool local)
   if (!(scontext = context_new (scon)))
     goto quit;
 
-  if (fd)
+  if (fd != -1)
     {
       if (fgetfilecon (fd, &tcon) < 0)
         goto quit;
@@ -210,13 +210,14 @@ restorecon_private (char const *path, bool local)
   if (!(constr = context_str (tcontext)))
     goto quit;
 
-  if (fd)
+  if (fd != -1)
     rc = fsetfilecon (fd, constr);
   else
     rc = lsetfilecon (path, constr);
 
 quit:
-  close (fd);
+  if (fd != -1)
+    close (fd);
   context_free (scontext);
   context_free (tcontext);
   freecon (scon);
-- 
1.7.6.4




[PATCH 5/6] cp: -Z: rename PRESERVE bool param to LOCAL

2012-12-04 Thread Pádraig Brady
* src/selinux.c (restorecon): PRESERVE is badly named,
since there is no distinction as to what context is being set.
Also clarify the function comments as to what the boolean
controls exactly.
---
 src/selinux.c |   31 ++++++++++++++++---------------
 1 files changed, 16 insertions(+), 15 deletions(-)

diff --git a/src/selinux.c b/src/selinux.c
index 565022d..e1c122b 100644
--- a/src/selinux.c
+++ b/src/selinux.c
@@ -138,10 +138,10 @@ quit:
 }
 
 /*
-  This function takes a PATH of an existing file system object, and a boolean
-  that indicates whether the function should preserve the object's label or
-  generate a new label using matchpathcon.  If the function
-  is called with preserve, it will ask the SELinux Kernel what the default label
+  This function takes a PATH of an existing file system object, and a LOCAL
+  boolean that indicates whether the function should set the object's label
+  to the default for the local process, or one using system wide settings.
+  If LOCAL == true, it will ask the SELinux Kernel what the default label
   for all objects created should be and then sets the label on the object.
   Otherwise it calls matchpathcon on the object to ask the system what the
   default label should be, extracts the type field and then modifies the file
@@ -150,7 +150,7 @@ quit:
   Returns -1 on failure.  errno will be set appropriately.
 */
 static int
-restorecon_private (char const *path, bool preserve)
+restorecon_private (char const *path, bool local)
 {
   int rc = -1;
   struct stat sb;
@@ -160,7 +160,7 @@ restorecon_private (char const *path, bool preserve)
   char *constr;
   int fd;
 
-  if (preserve)
+  if (local)
     {
       if (getfscreatecon (&tcon) < 0)
         return rc;
@@ -226,25 +226,26 @@ quit:
 
 /*
   This function takes three parameters:
-  Path of an existing file system object.
-  A boolean indicating whether it should call restorecon_private recursively.
-  A boolean that indicates whether the function should preserve the object's
-  label or generate a new label using matchpathcon.
 
-  If Recurse is selected and the file system object is a directory, restorecon
-  calls restorecon_private on every file system object in the directory.
+  PATH of an existing file system object.
+
+  A RECURSE boolean which if the file system object is a directory, will
+  call restorecon_private on every file system object in the directory.
+
+  A LOCAL boolean that indicates whether the function should set object labels
+  to the default for the local process, or use system wide settings.
 
   Returns false on failure.  errno will be set appropriately.
 */
 bool
-restorecon (char const *path, bool recurse, bool preserve)
+restorecon (char const *path, bool recurse, bool local)
 {
   const char *mypath[2] = { path, NULL };
   FTS *fts;
   bool ok = true;
 
   if (!recurse)
-    return restorecon_private (path, preserve);
+    return restorecon_private (path, local);
 
   fts = fts_open ((char *const *) mypath, FTS_PHYSICAL, NULL);
   while (1)
@@ -263,7 +264,7 @@ restorecon (char const *path, bool recurse, bool preserve)
           break;
         }
 
-      ok = restorecon_private (fts->fts_path, preserve);
+      ok = restorecon_private (fts->fts_path, local);
     }
 
   if (fts_close (fts) != 0)
-- 
1.7.6.4




[PATCH 4/6] cp: -Z: tweak comments for selinux routines

2012-12-04 Thread Pádraig Brady
* src/selinux.c: Remove debugging comments and
standardise existing comments a bit.
---
 src/selinux.c |   21 +++++++++------------
 1 files changed, 9 insertions(+), 12 deletions(-)

diff --git a/src/selinux.c b/src/selinux.c
index 3235309..565022d 100644
--- a/src/selinux.c
+++ b/src/selinux.c
@@ -59,11 +59,11 @@ mode_to_security_class (mode_t m)
 }
 
 /*
-  This function takes a path and a mode and then asks SELinux what the label
+  This function takes a PATH and a MODE and then asks SELinux what the label
   of the path object would be if the current process label created it.
-  it then returns the label.
+  It then returns the label.
 
-  Returns -1 on failure. errno will be set appropriately.
+  Returns -1 on failure.  errno will be set appropriately.
 */
 
 static int
@@ -100,7 +100,7 @@ quit:
   default type into label.  It tells the SELinux Kernel to label all new file
   system objects created by the current process with this label.
 
-  Returns -1 on failure. errno will be set appropriately.
+  Returns -1 on failure.  errno will be set appropriately.
 */
 int
 defaultcon (char const *path, mode_t mode)
@@ -129,7 +129,6 @@ defaultcon (char const *path, mode_t mode)
 
   rc = setfscreatecon (constr);
 
-//  printf("defaultcon %s %s\n", path, context_str(tcontext));
 quit:
   context_free (scontext);
   context_free (tcontext);
@@ -139,8 +138,8 @@ quit:
 }
 
 /*
-  This function takes a path of an existing file system object, and a boolean
-  that indicates whether the function should preserve the objects label or
+  This function takes a PATH of an existing file system object, and a boolean
+  that indicates whether the function should preserve the object's label or
   generate a new label using matchpathcon.  If the function
   is called with preserve, it will ask the SELinux Kernel what the default label
   for all objects created should be and then sets the label on the object.
@@ -148,7 +147,7 @@ quit:
   default label should be, extracts the type field and then modifies the file
   system object.
 
-  Returns -1 on failure. errno will be set appropriately.
+  Returns -1 on failure.  errno will be set appropriately.
 */
 static int
 restorecon_private (char const *path, bool preserve)
@@ -216,7 +215,6 @@ restorecon_private (char const *path, bool preserve)
   else
     rc = lsetfilecon (path, constr);
 
-//  printf("restorcon %s %s\n", path, context_str(tcontext));
 quit:
   close (fd);
   context_free (scontext);
@@ -229,15 +227,14 @@ quit:
 /*
   This function takes three parameters:
   Path of an existing file system object.
-  A boolean indicating whether it should call restorecon_private recursively
-  or not.
+  A boolean indicating whether it should call restorecon_private recursively.
   A boolean that indicates whether the function should preserve the object's
   label or generate a new label using matchpathcon.
 
   If Recurse is selected and the file system object is a directory, restorecon
   calls restorecon_private on every file system object in the directory.
 
-  Returns false on failure. errno will be set appropriately.
+  Returns false on failure.  errno will be set appropriately.
 */
 bool
 restorecon (char const *path, bool recurse, bool preserve)
-- 
1.7.6.4




Re: df command should suppress duplicates

2012-12-04 Thread Jim Meyering
Bernhard Voelker wrote:
 Hi Ondrej,

 On 12/03/2012 07:09 PM, Ondrej Oprala wrote:
 thanks for the rebase :) .

 no worries, you're welcome. ;-)

 I've modified the patch a bit, so I dont interfere with output if df -a
 is specified.

 Thanks.

 I just had a quick look on the patch and I like the idea
 of combining the filter for rootfs with that of the
 duplicate entries.

 I don't have time for a detailed review right now.
 A few things certainly need to be straightened, e.g. I think
 we can't ignore a stat failure:

 +static bool
 +dev_examined (struct devlist *devlist, char *mount_dir, char *devname)
 +{
 [...]
 +  stat (mount_dir, buf);

 and it probably deserves a little more work on the texi
 documentation and on the test (which doesn't cover rootfs
 hiding yet).

Thanks to both of you for keeping this thread moving.
I've had very little spare time recently.



Re: Command-line program to convert 'human' sizes?

2012-12-04 Thread Assaf Gordon
Hello,

Pádraig Brady wrote, On 12/04/2012 11:30 AM:
 On 12/04/2012 04:25 PM, Assaf Gordon wrote:
 
 Nothing yet. The plan is to make a numfmt command available with this 
 interface:
 http://lists.gnu.org/archive/html/coreutils/2012-02/msg00085.html
 

Attached is a stub for such a program (mostly command-line processing, no 
actual conversion yet).

Please let me know if you're willing to eventually include this program (and 
I'll add more functionality, tests, docs, etc.).

I tried to follow the existing code conventions in other programs, but all 
comments and suggestions are welcome.

-gordon

From bb5162a7521aee6b95c902acc65c1d3800ba4f30 Mon Sep 17 00:00:00 2001
From: Assaf Gordon assafgor...@gmail.com
Date: Tue, 4 Dec 2012 15:32:05 -0500
Subject: [PATCH] numfmt: stub code for new program

---
 build-aux/gen-lists-of-programs.sh |1 +
 src/.gitignore |1 +
 src/numfmt.c   |  298 
 3 files changed, 300 insertions(+), 0 deletions(-)
 create mode 100644 src/numfmt.c

diff --git a/build-aux/gen-lists-of-programs.sh b/build-aux/gen-lists-of-programs.sh
index 212ce02..bf63ee3 100755
--- a/build-aux/gen-lists-of-programs.sh
+++ b/build-aux/gen-lists-of-programs.sh
@@ -85,6 +85,7 @@ normal_progs='
 nl
 nproc
 nohup
+numfmt
 od
 paste
 pathchk
diff --git a/src/.gitignore b/src/.gitignore
index 181..25573df 100644
--- a/src/.gitignore
+++ b/src/.gitignore
@@ -59,6 +59,7 @@ nice
 nl
 nohup
 nproc
+numfmt
 od
 paste
 pathchk
diff --git a/src/numfmt.c b/src/numfmt.c
new file mode 100644
index 000..e513194
--- /dev/null
+++ b/src/numfmt.c
@@ -0,0 +1,298 @@
+/* Reformat numbers like 11505426432 to the more human-readable 11G
+   Copyright (C) 2012 Free Software Foundation, Inc.
+
+   This program is free software: you can redistribute it and/or modify
+   it under the terms of the GNU General Public License as published by
+   the Free Software Foundation, either version 3 of the License, or
+   (at your option) any later version.
+
+   This program is distributed in the hope that it will be useful,
+   but WITHOUT ANY WARRANTY; without even the implied warranty of
+   MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See the
+   GNU General Public License for more details.
+
+   You should have received a copy of the GNU General Public License
+   along with this program.  If not, see <http://www.gnu.org/licenses/>.  */
+
+#include <config.h>
+#include <getopt.h>
+#include <stdio.h>
+#include <sys/types.h>
+
+#include "argmatch.h"
+#include "error.h"
+#include "system.h"
+#include "xstrtol.h"
+
+/* The official name of this program (e.g., no 'g' prefix).  */
+#define PROGRAM_NAME "numfmt"
+
+#define AUTHORS proper_name ()
+
+#define BUFFER_SIZE (16 * 1024)
+
+enum
+{
+  FROM_OPTION = CHAR_MAX + 1,
+  FROM_UNIT_OPTION,
+  TO_OPTION,
+  TO_UNIT_OPTION,
+  ROUND_OPTION,
+  SUFFIX_OPTION
+};
+
+enum scale_type
+{
+  scale_none, /* the default: no scaling */
+  scale_auto, /* --from only */
+  scale_SI,
+  scale_IEC,
+  scale_custom  /* --to only, custom scale */
+};
+
+static char const *const scale_from_args[] =
+{
+  "auto", "SI", "IEC", NULL
+};
+static enum scale_type const scale_from_types[] =
+{
+  scale_auto, scale_SI, scale_IEC
+};
+
+static char const *const scale_to_args[] =
+{
+  "SI", "IEC", NULL
+};
+static enum scale_type const scale_to_types[] =
+{
+  scale_SI, scale_IEC
+};
+
+
+enum round_type
+{
+  round_ceiling,
+  round_floor,
+  round_nearest
+};
+
+static char const *const round_args[] =
+{
+  "ceiling", "floor", "nearest", NULL
+};
+
+static enum round_type const round_types[] =
+{
+  round_ceiling, round_floor, round_nearest
+};
+
+static struct option const longopts[] =
+{
+  {"from", required_argument, NULL, FROM_OPTION},
+  {"from-unit", required_argument, NULL, FROM_UNIT_OPTION},
+  {"to", required_argument, NULL, TO_OPTION},
+  {"to-unit", required_argument, NULL, TO_UNIT_OPTION},
+  {"round", required_argument, NULL, ROUND_OPTION},
+  {"format", required_argument, NULL, 'f'},
+  {"suffix", required_argument, NULL, SUFFIX_OPTION},
+  {GETOPT_HELP_OPTION_DECL},
+  {GETOPT_VERSION_OPTION_DECL},
+  {NULL, 0, NULL, 0}
+};
+
+
+enum scale_type scale_from=scale_none;
+enum scale_type scale_to=scale_none;
+enum round_type _round=round_ceiling;
+char const *format_str = NULL;
+const char *suffix = NULL;
+uintmax_t from_unit_size=1;
+uintmax_t to_unit_size=1;
+
+/* Convert a string of decimal digits, N_STRING, with an optional suffix
+   to an integral value.  Upon successful conversion,
+   return that value.  If it cannot be converted, give a diagnostic and exit.
+*/
+static uintmax_t
+string_to_integer (const char *n_string)
+{
+  strtol_error s_err;
+  uintmax_t n;
+
+  s_err = xstrtoumax (n_string, NULL, 10, &n, "bkKmMGTPEZY0");
+
+  if (s_err == LONGINT_OVERFLOW)
+    {
+      error (EXIT_FAILURE, 0,
+             _("%s: unit size is so large that it is not representable"),
+             n_string);
+    }
+  if (s_err != LONGINT_OK)
+{
+  

New feature in mv

2012-12-04 Thread Paweł Lampe
Hi there !

A few minutes ago, my mate asked me 'is there any way to swap two
files?'. I realized that mv has the -S option, but that is all about
suffixes. I think there should also be -s for swapping. It should work
like:
mv -s a b
The effect should be like:
a -> tmp
b -> a
tmp -> b

Think about it 
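
For illustration, the three renames listed above can already be scripted with the existing mv. This is only a sketch: swap_files is a hypothetical name, it is meant for regular files, and unlike a real mv option it is not atomic, so an interruption between steps can leave the temporary name behind.

```shell
# swap_files A B: exchange the contents of two paths using three renames.
swap_files() (
  [ "$#" -eq 2 ] || { echo 'usage: swap_files A B' >&2; exit 2; }
  a=$1
  b=$2
  # Temporary name next to A, so every step is a rename within one directory.
  tmp=$(mktemp "$a.swap.XXXXXX") || exit 1
  mv -- "$a" "$tmp" &&   # a -> tmp
  mv -- "$b" "$a" &&     # b -> a
  mv -- "$tmp" "$b"      # tmp -> b
)
```

Usage: `swap_files a b`. Creating the temporary file with mktemp avoids clobbering an unrelated file that happens to be called tmp.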




Re: modechange.c (Feature added) - Binary mode support.

2012-12-04 Thread Eric Blake
On 11/23/2012 07:30 PM, Raphael S Carvalho wrote:
 Hi,
 I found some questions on the internet asking whether the chmod tool
 supports binary numbers as input.
 Even though it doesn't, I downloaded the coreutils source and
 started reading the chmod code.
 
 Let's get to the point: I added binary input support by making changes
 to the /lib/modechange.c file.
 PS: the (b) prefix is not case sensitive.
 
 Tests:
 ./chmod b10111 t

Thanks for the suggestion.  However, I don't think it adds any value, as
bash can already generically process binary numbers at the command line
for ANY application, not just chmod, with just a bit more syntax:

chmod $(printf %o $((2#10111))) t

You can further wrap things in a shell function to reduce your
day-to-day typing:

b2d() {
  case $1 in
    *[!01]*) echo 'invalid input' >&2 ;;
    *) eval printf %d \$((2#$1)) ;;
  esac
}
b2o() {
  case $1 in
    *[!01]*) echo 'invalid input' >&2 ;;
    *) eval printf %o \$((2#$1)) ;;
  esac
}

chmod $(b2o 10111) t

That is, teaching chmod how to parse binary is just code bloat, when you
already have a generic binary parser in your shell.

-- 
Eric Blake   eblake redhat com    +1-919-301-3266
Libvirt virtualization library http://libvirt.org



Re: Command-line program to convert 'human' sizes?

2012-12-04 Thread Pádraig Brady

On 12/04/2012 10:55 PM, Assaf Gordon wrote:

Hello,


Pádraig Brady wrote, On 12/04/2012 11:30 AM:

Nothing yet. The plan is to make a numfmt command available with this interface:
http://lists.gnu.org/archive/html/coreutils/2012-02/msg00085.html



Attached is a patch, with a proof-of-concept working 'numfmt'.

Works: from=SI/IEC/AUTO, to=SI/IEC, from-units, to-units, suffix, round.
Doesn't work: format, to=NUMBER,field=N .

The code isn't clean and can be improved.
Currently, either  every (non-option) command-line parameter is expected to be 
a number, or every line on stdin is expected to start with a number.


Thanks a lot for working on this.
All I'll say at this stage is to take it
as far as you can as per the interface specified
at the above URL with a mind to reusing stuff from
lib/human.c if possible.

We'll review it then with a view to including it ASAP.

thanks,
Pádraig.



Re: Command-line program to convert 'human' sizes?

2012-12-04 Thread Assaf Gordon
Pádraig Brady wrote, On 12/04/2012 06:11 PM:
 On 12/04/2012 10:55 PM, Assaf Gordon wrote:
 Hello,

 Pádraig Brady wrote, On 12/04/2012 11:30 AM:
 Nothing yet. The plan is to make a numfmt command available with this 
 interface:
 http://lists.gnu.org/archive/html/coreutils/2012-02/msg00085.html


 Attached is a patch, with a proof-of-concept working 'numfmt'.

 
 Thanks a lot for working on this.
 All I'll say at this stage is to take it
 as far as you can as per the interface specified
 at the above URL with a mind to reusing stuff from
 lib/human.c if possible.
 
 We'll review it then with a view to including it ASAP.

Thanks!

Input-wise, I had to copy and modify the xstrtol implementation, because the 
original function doesn't allow the caller to force SI or IEC or AUTO (it has 
internal logic to deduce it, based on parameters and user input).

Output-wise, human_readable() from lib/human.c is called as-is (no code 
modification).

Regarding the advanced options:
1. I'm wondering what the reason/need is for --to=NUMBER? A base 
different from 1024/1000 would result in values like 4K that are very 
unintuitive (since they don't mean 4096/4000).

2. FORMAT: is the only use-case adding spaces before/after the number, and 
grouping?
human_readable() already has support for grouping, and padding might be added 
with different parameters?

I'm asking about #1 and #2, because if we forgo them, human_readable() could 
be used as-is. Otherwise, it will require copypasting and some modifications.

3. SUFFIX - is the purpose of this simply to print a string following the 
number? or are there some more complications?

4. Should non-suffix characters following a parsed number cause errors, or 
be ignored? e.g. 4KQO








Re: fifo unlimited buffer size?

2012-12-04 Thread Pádraig Brady

On 12/04/2012 03:46 PM, Peng Yu wrote:

On Tue, Dec 4, 2012 at 6:24 AM, Pádraig Brady p...@draigbrady.com wrote:

tag 13075 + notabug
close 13075
thanks

On 12/04/2012 03:19 AM, Peng Yu wrote:


Hi,

I have the following script. When the number to the right of 'seq' is
large (as 10 in the example), the script will hang. But when the
number is small (say 1000), the script can be finished correctly. I
suspect that the problem is that there is a limit on the buffer size
for fifo. Is it so? Is there a way to make the following script work
no matter how large the number is? Thanks!

~/linux/test/gnu/gnu/coreutils/mkfifo/tee$ cat main2.sh
#!/usr/bin/env bash

rm -rf a b c
mkfifo a b c
seq 10 | tee a > b &
sort -k 1,1n a > c &
join -j 1 <(awk 'BEGIN{OFS="\t"; FS="\t"} {print $1, $1+10}' < c) \
<(awk 'BEGIN{OFS="\t"; FS="\t"}{print $1, $1+20}' < b)



So this is problematic due to `sort`.
That's special as it needs to consume all its input before
producing any output. Therefore unless the buffers connecting
the other commands in || can consume the data, there will be a deadlock.


I can't parse "Therefore unless the buffers connecting the other
commands in || can consume the data...".  What is '||'?


Sorry, parallel.


This version doesn't block for example as
the input is being generated asynchronously for the sort command.

#!/usr/bin/env bash
rm -rf a b c
mkfifo a b c
join -j 1 <(awk 'BEGIN{OFS="\t"; FS="\t"} {print $1, $1+10}' < c) \
<(awk 'BEGIN{OFS="\t"; FS="\t"}{print $1, $1+20}' < b) &
seq 10 | sort -k 1,1n > c &
seq 10 > b
wait
wait

Obviously, if your input is expensive to generate,
then you'd be best copying to another file
and sorting that.


I should send the message to the regular mailing list. I have two
implicit requirements.

1. The input 'seq 10' can not be run twice, it has to be called once.


If the input is expensive to generate,
then it would need to be copied.


2. There can not be intermediate files generated.


Ah :(


Given the above requirements, there is no solution?


Not one I can think of.


The general question is this: one input stream fans out to
multiple streams, each of which undergoes some processing. These
processed streams then converge on one program, which produces one output
stream. This seems to be a general usage pattern. The pattern can be
nested arbitrarily, in which case the above two requirements had
better hold. Does this make sense?


I understand the structure, but the concurrent pipelines
need separate data sources (process or file copy), or otherwise
deadlock may happen as data overflows various buffers.
I suppose this could be encapsulated in tee(1) with non-blocking
writes and internal buffering, but that would just end up
being a data copy anyway, so I'm not sure it's warranted.

thanks,
Pádraig.
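
If the "no intermediate files" requirement can be relaxed, the file-copy
approach mentioned above can be sketched roughly as follows. This is only an
illustration, not something posted in the thread: the function name
fan_out_join and the temporary directory layout are made up for the example.
The producer runs once, its output is copied to a temporary file, and each
branch reads that copy independently, so a slow consumer such as sort cannot
stall the other branch.

```shell
#!/bin/sh
# fan_out_join N: hypothetical helper, not part of coreutils.
# Generates 1..N once, then feeds two independent pipelines from a copy.
fan_out_join() {
  dir=$(mktemp -d) || return 1
  seq "$1" > "$dir/input"            # the producer runs exactly once
  mkfifo "$dir/b" || { rm -rf "$dir"; return 1; }

  # Branch 1 (background): no sorting, adds 20 to each key.
  awk 'BEGIN{OFS="\t"} {print $1, $1+20}' "$dir/input" > "$dir/b" &

  # Branch 2 (foreground): numeric sort, adds 10; join reads it on stdin.
  sort -k 1,1n "$dir/input" |
    awk 'BEGIN{OFS="\t"} {print $1, $1+10}' |
    join -j 1 - "$dir/b"
  status=$?

  wait                               # reap the background awk
  rm -rf "$dir"
  return "$status"
}
```

For example, `fan_out_join 3` prints the three joined lines "1 11 21",
"2 12 22" and "3 13 23". Larger values of N do not hang, because sort reads
from the file copy rather than from a fifo that tee has to keep filling.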



Re: Command-line program to convert 'human' sizes?

2012-12-04 Thread Pádraig Brady

On 12/04/2012 11:35 PM, Assaf Gordon wrote:

Pádraig Brady wrote, On 12/04/2012 06:11 PM:

On 12/04/2012 10:55 PM, Assaf Gordon wrote:

Hello,


Pádraig Brady wrote, On 12/04/2012 11:30 AM:

Nothing yet. The plan is to make a numfmt command available with this interface:
http://lists.gnu.org/archive/html/coreutils/2012-02/msg00085.html



Attached is a patch, with a proof-of-concept working 'numfmt'.



Thanks a lot for working on this.
All I'll say at this stage is to take it
as far as you can as per the interface specified
at the above URL with a mind to reusing stuff from
lib/human.c if possible.

We'll review it then with a view to including it ASAP.


Thanks!

Input-wise, I had to copy and modify the xstrtol implementation, because the 
original function doesn't allow the caller to force SI or IEC or AUTO (it has 
internal logic to deduce it, based on parameters and user input).


We can tweak xstrtol() in gnulib if needed
I suppose we can refactor after.


Output-wise, human_readable() from lib/human.c is called as-is (no code 
modification).

Regarding the advanced options:
1. I'm wondering what is the reason/need for --to=NUMBER ? A base different than 
1024/1000 would result in values like 4K that are very unintuitive (since they don't mean 
4096/4000).


Drats I can't remember now.


2. FORMAT: is the only use-case adding spaces before/after the number, and 
grouping?
human_readable() already has support for grouping, and padding might be added 
with different parameters?


we can do padding outside of human_readable() using mbsalign() I think,
and that would be auto enabled with the --field option.

I was thinking that --format would be a central place for tweaking
grouping and spacing etc. of numbers using standard printf format modifiers.



I'm asking about #1 and #2, because if we forgo them, human_readable() could be 
used as-is. Otherwise, it will require copypasting and some modifications.


We can tweak human_readable() in gnulib if needed
I suppose we can refactor after.



3. SUFFIX - is the purpose of this simply to print a string following the 
number? or are there some more complications?


That's it basically.
It could be done with --format, but it's such a common requirement
that I thought it warranted a separate option.
Note human_readable() may output a 'B' suffix in certain cases,
which it shouldn't, and we should suppress it if it does as
noted in the spec.



4. Should non-suffix characters following a parsed number cause errors, or be ignored? e.g. 
4KQO


Ignored I would say.

thanks,
Pádraig.



Re: fifo unlimited buffer size? (possibly tee related)

2012-12-04 Thread Pádraig Brady

On 12/04/2012 04:26 PM, Peng Yu wrote:

I understand the structure, but the concurrent pipelines
need separate data sources (process or file copy), or otherwise
deadlock may happen as data overflows various buffers.
I suppose this could be encapsulated in tee(1) with non-blocking
writes and internal buffering, but that would just end up
being a data copy anyway, so I'm not sure it's warranted.


So the point to improve shall be tee? If so, I'm glad that we at least
figure this out.

Let me explain why file copying can be bad. In my pathological example
based on sort, sort needs to take all the input before generating any
output. However, there are many applications which just need to see a
portion (let's call it lookahead) of input before generating any
output, in which case copying the whole input is a waste, especially
when the input is very large (say hundreds of GB) and lookahead is much
smaller (say a few MB).

Could the maintainer of tee see if there is anything that can be improved?


I notice something similar with:
http://code.dogmap.org/fdtools/multitee/

but it notes issues with non blocking IO,
and it doesn't seem to handle the buffering I mentioned.

So I would say it's a possible enhancement,
though given the awkwardness of implementation,
and more importantly the esoteric nature of
the use case, it would be way down the priority list.

thanks,
Pádraig.



Re: Command-line program to convert 'human' sizes?

2012-12-04 Thread Jim Meyering
Pádraig Brady wrote:

 On 12/04/2012 11:35 PM, Assaf Gordon wrote:
 Pádraig Brady wrote, On 12/04/2012 06:11 PM:
 On 12/04/2012 10:55 PM, Assaf Gordon wrote:
 Hello,

 Pádraig Brady wrote, On 12/04/2012 11:30 AM:
 Nothing yet. The plan is to make a numfmt command available with
 this interface:
 http://lists.gnu.org/archive/html/coreutils/2012-02/msg00085.html


 Attached is a patch, with a proof-of-concept working 'numfmt'.


 Thanks a lot for working on this.
 All I'll say at this stage is to take it
 as far as you can as per the interface specified
 at the above URL with a mind to reusing stuff from
 lib/human.c if possible.

 We'll review it then with a view to including it ASAP.

 Thanks!

 Input-wise, I had to copy and modify the xstrtol implementation,
 because the original function doesn't allow the caller to force SI
 or IEC or AUTO (it has internal logic to deduce it, based on
 parameters and user input).

 We can tweak xstrtol() in gnulib if needed
 I suppose we can refactor after.

 Output-wise, human_readable() from lib/human.c is called as-is
 (no code modification).

 Regarding the advanced options:
 1. I'm wondering what is the reason/need for --to=NUMBER ? A
 base different than 1024/1000 would result in values like 4K that
 are very unintuitive (since they don't mean 4096/4000).

 Drats I can't remember now.

 2. FORMAT: is the only use-case adding spaces before/after the
 number, and grouping?
 human_readable() already has support for grouping, and padding
 might be added with different parameters?

 we can do padding outside of human_readable() using mbsalign() I think,
 and that would be auto enabled with the --field option.

 I was thinking that --format would be a central place for tweaking
 grouping and spacing etc. of numbers using standard printf format modifiers.


 I'm asking about #1 and #2, because if we forgo them,
 human_readable() could be used as-is. Otherwise, it will require
 copypasting and some modifications.

 We can tweak human_readable() in gnulib if needed
 I suppose we can refactor after.


 3. SUFFIX - is the purpose of this simply to print a string
 following the number? or are there some more complications?

 That's it basically.
 It could be done with --format, but it's such a common requirement
 that I thought it warranted a separate option.
 Note human_readable() may output a 'B' suffix in certain cases,
 which it shouldn't, and we should suppress it if it does as
 noted in the spec.


 4. Should non-suffix characters following a parsed number cause
 errors, or be ignored? e.g. 4KQO

 Ignored I would say.

Ignoring trailing bytes for a string like 4O could be misleading, no?
I'd rather get a diagnostic that there's this O (capital O) at the
end of my number.   Same thing for 1's vs l's: 9111l looks a lot like
91,111 in some fonts.



Re: New feature in mv

2012-12-04 Thread Pádraig Brady

On 12/04/2012 05:04 PM, Paweł Lampe wrote:

Hi there !

Few minutes ago, my mate asked me 'is there any way to swap two
files ?'. I have realized, that the mv has option -S but it's all about
suffix. I think, there should be also -s for swapping. It should work
like:
mv -s a b
Effect should be like:
a -> tmp
b -> a
tmp -> b

Think about it


A fairly useful feature, but something as noted here
that might be more suited to a separate script,
that could be maintained within coreutils under the
recently mentioned contrib/ directory

http://lists.gnu.org/archive/html/bug-coreutils/2009-02/msg00014.html

thanks,
Pádraig.



Re: Command-line program to convert 'human' sizes?

2012-12-04 Thread Pádraig Brady

On 12/05/2012 12:19 AM, Jim Meyering wrote:

Pádraig Brady wrote:


On 12/04/2012 11:35 PM, Assaf Gordon wrote:

Pádraig Brady wrote, On 12/04/2012 06:11 PM:

On 12/04/2012 10:55 PM, Assaf Gordon wrote:

Hello,


Pádraig Brady wrote, On 12/04/2012 11:30 AM:

Nothing yet. The plan is to make a numfmt command available with
this interface:
http://lists.gnu.org/archive/html/coreutils/2012-02/msg00085.html



Attached is a patch, with a proof-of-concept working 'numfmt'.



Thanks a lot for working on this.
All I'll say at this stage is to take it
as far as you can as per the interface specified
at the above URL with a mind to reusing stuff from
lib/human.c if possible.

We'll review it then with a view to including it ASAP.


Thanks!

Input-wise, I had to copy and modify the xstrtol implementation,
because the original function doesn't allow the caller to force SI
or IEC or AUTO (it has internal logic to deduce it, based on
parameters and user input).


We can tweak xstrtol() in gnulib if needed
I suppose we can refactor after.


Output-wise, human_readable() from lib/human.c is called as-is
(no code modification).

Regarding the advanced options:
1. I'm wondering what is the reason/need for --to=NUMBER ? A
base different than 1024/1000 would result in values like 4K that
are very unintuitive (since they don't mean 4096/4000).


Drats I can't remember now.


2. FORMAT: is the only use-case adding spaces before/after the
number, and grouping?
human_readable() already has support for grouping, and padding
might be added with different parameters?


we can do padding outside of human_readable() using mbsalign() I think,
and that would be auto enabled with the --field option.

I was thinking that --format would be a central place for tweaking
grouping and spacing etc. of numbers using standard printf format modifiers.



I'm asking about #1 and #2, because if we forgo them,
human_readable() could be used as-is. Otherwise, it will require
copypasting and some modifications.


We can tweak human_readable() in gnulib if needed
I suppose we can refactor after.



3. SUFFIX - is the purpose of this simply to print a string
following the number? or are there some more complications?


That's it basically.
It could be done with --format, but it's such a common requirement
that I thought it warranted a separate option.
Note human_readable() may output a 'B' suffix in certain cases,
which it shouldn't, and we should suppress it if it does as
noted in the spec.



4. Should non-suffix characters following a parsed number cause
errors, or be ignored? e.g. 4KQO


Ignored I would say.


Ignoring trailing bytes for a string like 4O could be misleading, no?
I'd rather get a diagnostic that there's this O (capital O) at the
end of my number.   Same thing for 1's vs l's: 9111l looks a lot like
91,111 in some fonts.


Fair point.
Though if someone wanted to use O for octets,
or to count 4KOwls, 2Owls, ..
Hmm, maybe that's another reason I had the separate --suffix option,
so that only that suffix was allowed, and others were rejected.
Let's go with that for now.

thanks,
Pádraig.
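
The rule agreed on here (reject trailing characters unless they match the
given suffix) could be prototyped in shell along these lines. This is a rough
sketch and not the numfmt implementation: from_human is a hypothetical name,
only K/M/G and Ki/Mi/Gi are handled, and the scaling follows the "auto"
convention from the proposal (K = 1000, Ki = 1024).

```shell
#!/bin/sh
# from_human VALUE [SUFFIX]: hypothetical helper, not the numfmt code.
# Prints VALUE as a plain integer, or fails on unexpected trailing text.
from_human() {
  value=$1
  suffix=${2-}

  # Strip the user-supplied suffix (e.g. "Owls") if the value ends with it.
  if [ -n "$suffix" ]; then
    value=${value%"$suffix"}
  fi

  digits=${value%%[!0-9]*}     # leading run of digits
  unit=${value#"$digits"}      # everything after the digits

  if [ -z "$digits" ]; then
    echo "from_human: invalid number: '$1'" >&2
    return 1
  fi

  # Drop leading zeros so the shell does not read the number as octal.
  while [ "${#digits}" -gt 1 ] && [ "${digits#0}" != "$digits" ]; do
    digits=${digits#0}
  done

  case $unit in
    '') scale=1 ;;
    K)  scale=1000 ;;
    M)  scale=1000000 ;;
    G)  scale=1000000000 ;;
    Ki) scale=1024 ;;
    Mi) scale=1048576 ;;
    Gi) scale=1073741824 ;;
    *)  echo "from_human: invalid suffix in input: '$1'" >&2
        return 1 ;;
  esac

  echo "$((digits * scale))"
}
```

With this sketch, `from_human 4KOwls Owls` prints 4000, while
`from_human 4KQO` and `from_human 4O` fail with a diagnostic instead of
silently dropping the trailing characters.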



Re: New feature in mv

2012-12-04 Thread Raphael S Carvalho
On Tue, Dec 4, 2012 at 10:22 PM, Pádraig Brady p...@draigbrady.com wrote:
 On 12/04/2012 05:04 PM, Paweł Lampe wrote:

 Hi there !

 Few minutes ago, my mate asked me 'is there any way to swap two
 files ?'. I have realized, that the mv has option -S but it's all about
 suffix. I think, there should be also -s for swapping. It should work
 like:
 mv -s a b
 Effect should be like:
 a -> tmp
 b -> a
 tmp -> b

 Think about it


 A fairly useful feature, but something as noted here
 that might be more suited to a separate script,
 that could be maintained within coreutils under the
 recently mentioned contrib/ directory

 http://lists.gnu.org/archive/html/bug-coreutils/2009-02/msg00014.html

 thanks,
 Pádraig.


I read this page:
http://lists.gnu.org/archive/html/bug-coreutils/2009-02/msg00014.html
So I would really like to create that program. Is it still even needed?

Att, Raphael SC raphael.sc...@gmail.com



Re: Command-line program to convert 'human' sizes?

2012-12-04 Thread Assaf Gordon
Pádraig Brady wrote, On 12/04/2012 07:31 PM:
 On 12/05/2012 12:19 AM, Jim Meyering wrote:
 Pádraig Brady wrote:
 On 12/04/2012 11:35 PM, Assaf Gordon wrote:
 Pádraig Brady wrote, On 12/04/2012 06:11 PM:
 On 12/04/2012 10:55 PM, Assaf Gordon wrote:
 Pádraig Brady wrote, On 12/04/2012 11:30 AM:

 snip long discussion 

Would the following be acceptable:
1. remove --to=NUMBER option
2. surplus characters following immediately after converted number trigger a 
warning (error?), 
  except if the following characters match exactly the suffix parameter.


Regarding --format:
The implementation doesn't really use printf, so %d isn't directly usable.
One option is to tell the user to use %s (instead of %d), and we'll simply 
put the result of human_readable() as the string parameter in vasnprintf - 
this will be flexible in terms of alignment.
Another option is to remove the --format option and replace it with --padding 
or similar.

Regarding grouping (thousands separator):
This only has an effect when not using --to=SI or --to=IEC, right?
Perhaps we can add a separate option --grouping, and simply turn on the 
human_grouping flag? (easy to implement).







Re: New feature in mv

2012-12-04 Thread Pádraig Brady

On 12/05/2012 12:45 AM, Raphael S Carvalho wrote:

On Tue, Dec 4, 2012 at 10:22 PM, Pádraig Brady p...@draigbrady.com wrote:

On 12/04/2012 05:04 PM, Paweł Lampe wrote:


Hi there !

Few minutes ago, my mate asked me 'is there any way to swap two
files ?'. I have realized, that the mv has option -S but it's all about
suffix. I think, there should be also -s for swapping. It should work
like:
mv -s a b
Effect should be like:
a -> tmp
b -> a
tmp -> b

Think about it



A fairly useful feature, but something as noted here
that might be more suited to a separate script,
that could be maintained within coreutils under the
recently mentioned contrib/ directory

http://lists.gnu.org/archive/html/bug-coreutils/2009-02/msg00014.html

thanks,
Pádraig.



I read this page:
http://lists.gnu.org/archive/html/bug-coreutils/2009-02/msg00014.html
So I would really like to create that program. Is it still even needed?


Cool, thanks!

I would prototype a script in shell,
with possibly some of the same considerations as in this rewrite script,
http://lists.gnu.org/archive/html/bug-coreutils/2010-03/txtNTX6owUFov.txt

thanks,
Pádraig.
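
A shell prototype of the swap could start out something like the following.
It is a minimal sketch under stated assumptions, not the script referenced
above: swap_files is a hypothetical name, it handles regular files only, and
the three renames are not atomic as a group. The temporary name is created
next to the first file so the renames stay on one file system.

```shell
#!/bin/sh
# swap_files A B: hypothetical prototype, exchanges the names of two files.
swap_files() {
  if [ "$#" -ne 2 ]; then
    echo "usage: swap_files FILE1 FILE2" >&2
    return 2
  fi
  a=$1
  b=$2
  if [ ! -f "$a" ] || [ ! -f "$b" ]; then
    echo "swap_files: both arguments must be existing regular files" >&2
    return 1
  fi

  # Reserve a temporary name in the same directory as A.
  tmp=$(mktemp "$(dirname -- "$a")/.swap.XXXXXX") || return 1

  # a -> tmp, b -> a, tmp -> b; undo the first step if the second fails.
  mv -f -- "$a" "$tmp" || { rm -f -- "$tmp"; return 1; }
  mv -- "$b" "$a" || { mv -- "$tmp" "$a"; return 1; }
  mv -- "$tmp" "$b"
}
```

Since mv does the actual work, the wrapper stays thin; a real script would
also need to consider directories, symlinks and interruption between the
renames.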



Re: New feature in mv

2012-12-04 Thread Raphael S Carvalho
On Tue, Dec 4, 2012 at 11:05 PM, Pádraig Brady p...@draigbrady.com wrote:
 On 12/05/2012 12:45 AM, Raphael S Carvalho wrote:

 On Tue, Dec 4, 2012 at 10:22 PM, Pádraig Brady p...@draigbrady.com wrote:

 On 12/04/2012 05:04 PM, Paweł Lampe wrote:


 Hi there !

 Few minutes ago, my mate asked me 'is there any way to swap two
 files ?'. I have realized, that the mv has option -S but it's all about
 suffix. I think, there should be also -s for swapping. It should work
 like:
 mv -s a b
 Effect should be like:
 a -> tmp
 b -> a
 tmp -> b

 Think about it



 A fairly useful feature, but something as noted here
 that might be more suited to a separate script,
 that could be maintained within coreutils under the
 recently mentioned contrib/ directory

 http://lists.gnu.org/archive/html/bug-coreutils/2009-02/msg00014.html

 thanks,
 Pádraig.


 I read this page:
 http://lists.gnu.org/archive/html/bug-coreutils/2009-02/msg00014.html
 So I would really like to create that program. Is it still even needed?


 Cool, thanks!

 I would prototype a script in shell,
 with possibly some of the same considerations as in this rewrite script,
 http://lists.gnu.org/archive/html/bug-coreutils/2010-03/txtNTX6owUFov.txt

 thanks,
 Pádraig.

I was thinking about writing such a program in C, though
I'm not sure it would be the best choice. I could search for the
best data exchange algorithm to make the program fast and useful.
I'm not sure whether the maintainers would accept a new program in
coreutils. As I see it, the program would provide a way to exchange
data between two files, besides swapping filenames. Should any other
feature be implemented?
I also thought about using Python.

Regards, Raphael S.Carvalho raphael.sc...@gmail.com



Re: Command-line program to convert 'human' sizes?

2012-12-04 Thread Pádraig Brady

On 12/05/2012 12:56 AM, Assaf Gordon wrote:

Pádraig Brady wrote, On 12/04/2012 07:31 PM:

On 12/05/2012 12:19 AM, Jim Meyering wrote:

Pádraig Brady wrote:

On 12/04/2012 11:35 PM, Assaf Gordon wrote:

Pádraig Brady wrote, On 12/04/2012 06:11 PM:

On 12/04/2012 10:55 PM, Assaf Gordon wrote:

Pádraig Brady wrote, On 12/04/2012 11:30 AM:


 snip long discussion 

Would the following be acceptable:
1. remove --to=NUMBER option


I wish I could find my notes on this.
Yes ignore for now. I might remember what
I was thinking here later.


2. surplus characters following immediately after converted number trigger a 
warning (error?),
   except if the following characters match exactly the suffix parameter.


Right. Might as well error as give a warning?


Regarding --format:
The implementation doesn't really use printf, so %d isn't directly usable.


So we lose zero padding, base conversion, etc. but...


One option is to tell the user to use %s (instead of %d), and we'll simply put the 
result of human_readable() as the string parameter in vasnprintf - this will be flexible in terms 
of alignment.
Another option is to remove the --format option and replace it with --padding 
or similar.




Regarding grouping (thousands separator):
This only has an effect when not using --to=SI or --to=IEC, right?
Perhaps we can add a separate option --grouping, and simply turn on the 
human_grouping flag? (easy to implement).


... such features may be better with explicit options
for --padding, --grouping, --base.
Yes grouping wouldn't make much sense with --to=SI

cheers,
Pádraig.
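
To illustrate the %s idea discussed above: once the number has already been
turned into a string such as 2.0K, a printf-style %s conversion still gives
width and alignment control, even though %d modifiers no longer apply. The
shell function below is only an illustration using the shell's own printf;
format_human is a hypothetical name and is unrelated to the C code in the
patch.

```shell
#!/bin/sh
# format_human FORMAT VALUE...: hypothetical helper.
# Applies a printf %s-style FORMAT to each already-converted VALUE.
format_human() {
  fmt=$1
  shift
  for v in "$@"; do
    printf "$fmt\n" "$v"
  done
}
```

For instance, `format_human '%6s' 2.0K` right-aligns the value in a
six-character field, and `format_human '%-6s|' 1.1M` left-aligns it.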



Re: df command should suppress duplicates

2012-12-04 Thread Bernhard Voelker


On 12/04/2012 07:18 PM, Jim Meyering wrote:

 Thanks to both of you for keeping this thread moving.

Thanks.

I have moved the change around a bit to make the overall
change in df.c smaller. I thought that it would be better
to keep such filtering stuff in get_dev() as all the other
filters, e.g. --local.

I also added some other test cases and another test that
examines rootfs hiding.
Finally, I added some texi documentation.

In the end, that change wasn't that small actually, so I
added a Co-authored line to the patch. ;-)

Ondrej (and others): WDYT?

Have a nice day,
Berny
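
The filtering rules described in this message can be modelled in a few lines
of shell, which may help when reasoning about the expected test output. This
is a simplified model and not the C code from the patch: filter_mounts is a
hypothetical name, and the input format (device number, file system type,
mount point per line) is an assumption made for the example.

```shell
#!/bin/sh
# filter_mounts [SHOW_ALL]: hypothetical model of the df filtering rules.
# Reads "DEVNUM FSTYPE MOUNTPOINT" lines on stdin.
# With SHOW_ALL=1 (like df -a) nothing is filtered; otherwise rootfs
# entries are dropped and only the first entry per device number is kept.
filter_mounts() {
  show_all=${1:-0}
  awk -v all="$show_all" '
    all != 0        { print; next }   # -a: show everything
    $2 == "rootfs"  { next }          # hide the early-boot rootfs
    !seen[$1]++                       # first occurrence of each device
  '
}
```

Feeding it a list with a rootfs entry and a bind mount of the same device
leaves one line per real device, while `filter_mounts 1` passes every line
through unchanged.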

From b8e311fde1803e900b2f894364922eb9cdd9841c Mon Sep 17 00:00:00 2001
From: Ondrej Oprala oopr...@redhat.com
Date: Wed, 5 Dec 2012 01:39:44 +0100
Subject: [PATCH] df: do not print duplicate entries and rootfs by default

* src/df.c (struct devlist): Add new struct for storing already-
examined device numbers.
(devlist_head): Add new store of the above type.
(show_rootfs): Add new global boolean to not skip rootfs.
(dev_examined): Add new function to check if the device has
already been traversed.
(get_dev): Filter out rootfs unless -t rootfs or the -a
option is specified. Filter out duplicate entries by calling
the above new dev_examined unless the -a option is specified.
(main): Set the show_rootfs variable appropriately when the -t
option is specified for rootfs. Free device list (guarded by
IF_LINT).
* tests/df/skip-duplicates.sh: Add test to exercise the skipping
of duplicate entries.
* tests/df/skip-rootfs.sh: Add test to exercise the skipping
of the rootfs pseudo file system.
* tests/local.mk: Add the above new tests.
* NEWS (Changes in behavior): Mention the changes.
* doc/coreutils.texi (df invocation): Document df's behavior about
skipping rootfs and duplicate entries.

Co-authored-by: Bernhard Voelker.
---
 NEWS|6 +++
 doc/coreutils.texi  |8 
 src/df.c|   58 
 tests/df/skip-duplicates.sh |   77 +++
 tests/df/skip-rootfs.sh |   46 +
 tests/local.mk  |2 +
 6 files changed, 197 insertions(+), 0 deletions(-)
 create mode 100755 tests/df/skip-duplicates.sh
 create mode 100755 tests/df/skip-rootfs.sh

diff --git a/NEWS b/NEWS
index d4aebeb..0694ec5 100644
--- a/NEWS
+++ b/NEWS
@@ -40,6 +40,12 @@ GNU coreutils NEWS-*- outline -*-
   field can be in any column.  If there is no source column, then df
   prints 'total' into the target column.
 
+  df now properly outputs file system information with bind mounts present on
+  the system by skipping duplicate entries (identified by the device number).
+
+  df now skips the early-boot pseudo file system type rootfs unless either the
+  -a option or -t rootfs is specified.
+
   nl no longer supports the --page-increment option which was deprecated
   since coreutils-7.5.  Use --line-increment instead.
 
diff --git a/doc/coreutils.texi b/doc/coreutils.texi
index 46d3680..21400ad 100644
--- a/doc/coreutils.texi
+++ b/doc/coreutils.texi
@@ -10600,6 +10600,14 @@ Normally the disk space is printed in units of
 1024 bytes, but this can be overridden (@pxref{Block size}).
 Non-integer quantities are rounded up to the next higher unit.
 
+For bind mounts and without arguments, @command{df} only outputs the statistics
+for the first occurrence of that device in the list of file systems (@var{mtab}),
+i.e., it hides duplicate entries, unless the @option{-a} option is specified.
+
+By default, @command{df} omits the early-boot pseudo file system type
+@samp{rootfs}, unless the @option{-a} option is specified or that file system
+type is explicitly to be included by using the @option{-t} option.
+
 @cindex disk device file
 @cindex device file, disk
 If an argument @var{file} is a disk device file containing a mounted
diff --git a/src/df.c b/src/df.c
index cac26b7..63c8b31 100644
--- a/src/df.c
+++ b/src/df.c
@@ -43,6 +43,17 @@
   proper_name ("David MacKenzie"), \
   proper_name ("Paul Eggert")
 
+/* Filled with device numbers of examined file systems to avoid
+   duplicities in output.  */
+struct devlist
+{
+  dev_t dev_num;
+  struct devlist *next;
+};
+
+/* Store of already-processed device numbers.  */
+static struct devlist *devlist_head;
+
 /* If true, show even file systems with zero size or
uninteresting types.  */
 static bool show_all_fs;
@@ -54,6 +65,12 @@ static bool show_local_fs;
command line argument -- even if it's a dummy (automounter) entry.  */
 static bool show_listed_fs;
 
+/* If true, include rootfs in the output.  */
+static bool show_rootfs;
+
+/* The literal name of the initial root file system.  */
+static char const *ROOTFS = "rootfs";
+
 /* Human-readable options for output.  */
 static int human_output_opts;
 
@@ -589,6 +606,29 @@ excluded_fstype (const char *fstype)
   return false;
 }
 
+/* Check if the device was already examined.  */
+
+static 

Re: New feature in mv

2012-12-04 Thread Bernhard Voelker
On 12/05/2012 02:34 AM, Pádraig Brady wrote:
 On 12/05/2012 01:25 AM, Raphael S Carvalho wrote:
 I was thinking about writing such a program using C language,
 [...]


 The idea here is that swapping files would be just a thin
 wrapper around the mv or cp or ln utilities which
 already do the heavy lifting in C for copying data
 around (often using complicated techniques), or
 in fact just renaming as appropriate.

Well, maybe a C program execing mv/cp/ln would be easier
to maintain than a shell script regarding portability ...
;-)

Have a nice day,
Berny



Re: New feature in mv

2012-12-04 Thread Pádraig Brady

On 12/05/2012 01:41 AM, Bernhard Voelker wrote:

On 12/05/2012 02:34 AM, Pádraig Brady wrote:

On 12/05/2012 01:25 AM, Raphael S Carvalho wrote:

I was thinking about writing such a program using C language,
[...]




The idea here is that swapping files would be just a thin
wrapper around the mv or cp or ln utilities which
already do the heavy lifting in C for copying data
around (often using complicated techniques), or
in fact just renaming as appropriate.


Well, maybe a C program execing mv/cp/ln would be easier
to maintain than a shell script regarding portability ...
;-)


Sure.

But shell would be a good prototype for this at least.
Also having scripts like this portable to the vast
majority of shells would be a good source of robust
shell examples that use coreutils to the fullest.
They would also ensure to some extent that we were
providing appropriate interfaces from the C utilities,
to be robustly and efficiently used by shell scripts.

cheers,
Pádraig.



Re: Command-line program to convert 'human' sizes?

2012-12-04 Thread Assaf Gordon
Hello,

 Pádraig Brady wrote, On 12/04/2012 11:30 AM:
 Nothing yet. The plan is to make a numfmt command available with this 
 interface:
 http://lists.gnu.org/archive/html/coreutils/2012-02/msg00085.html


Attached is a patch, with a proof-of-concept working 'numfmt'.

Works: from=SI/IEC/AUTO, to=SI/IEC, from-units, to-units, suffix, round.
Doesn't work: format, to=NUMBER,field=N .

The code isn't clean and can be improved.
Currently, either  every (non-option) command-line parameter is expected to be 
a number, or every line on stdin is expected to start with a number.
 
Comments are welcomed,
 -gordon


Examples:

$ ./src/numfmt --from=auto 2K
2000
$ ./src/numfmt --from=auto 2Ki
2048
$ ./src/numfmt --from=SI 2K
2000
$ ./src/numfmt --from=SI 2Ki
2000
$ ./src/numfmt --from=IEC  2Ki
2048
$ ./src/numfmt --from=SI --to=IEC 2Ki
2.0K
$ ./src/numfmt --from=IEC --to=SI 2K 
2.1k
$ ./src/numfmt --from=IEC 1M
1048576
$ ./src/numfmt --from=IEC --to=SI 1M
1.1M
$ ./src/numfmt --from=IEC --to-unit=20 1M
52429
$ ./src/numfmt --from-unit=512 --to=IEC 4
2.0K
$ ./src/numfmt --round=ceiling --to=IEC 2000
2.0K
$ ./src/numfmt --round=floor --to=IEC 2000
1.9K
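
The arithmetic behind these results can be checked with integer shell
arithmetic. The helpers below are only a sketch for verifying the numbers
(scale_tenths and ceil_div are hypothetical names, not functions from the
patch): the value is scaled to one decimal digit and rounded up or down, or
divided by a unit size with ceiling rounding.

```shell
#!/bin/sh
# scale_tenths VALUE UNIT [MODE]: hypothetical helper.
# Prints VALUE/UNIT with one decimal digit; MODE is ceiling (default) or floor.
scale_tenths() {
  value=$1
  unit=$2
  mode=${3:-ceiling}
  case $mode in
    ceiling) tenths=$(( (value * 10 + unit - 1) / unit )) ;;
    floor)   tenths=$(( value * 10 / unit )) ;;
    *) echo "scale_tenths: unknown mode '$mode'" >&2; return 1 ;;
  esac
  echo "$((tenths / 10)).$((tenths % 10))"
}

# ceil_div VALUE UNIT: integer division rounded up, as with --to-unit.
ceil_div() {
  echo "$(( ($1 + $2 - 1) / $2 ))"
}
```

For example, 2Ki read as IEC is 2048; `scale_tenths 2048 1000` gives 2.1,
matching the 2.1k line. Likewise `scale_tenths 2000 1024 floor` gives 1.9,
and `ceil_div 1048576 20` gives 52429.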


Help screen
===
$ ./src/numfmt --help 
Usage: ./src/numfmt [OPTIONS] [NUMBER]
Reformats NUMBER(s) to/from human-readable values.
Numbers can be processed either from stdin or command arguments.

  --from=UNIT Auto-scale input numbers (auto, SI, IEC)
  If not specified, input suffixed are ignored.
  --from-unit=N   Specifiy the input unit size (instead of the default 1).
  --to=UNIT   Auto-scale output numbres (SI,IEC,N).
  If not specified, 
  --to-unit=N Specifiy the output unit size (instead of the default 1).
  --rount=METHOD  Round input numbers. METHOD can be:
  ceiling (the default), floor, nearest
  -f, --format=FORMAT   use printf style output FORMAT.
Default output format is %d .
  --suffix=SUFFIX   
  
  --help display this help and exit
  --version  output version information and exit

UNIT options:
 auto ('--from' only):
  1K  = 1000
  1Ki = 1024
  1G  = 100
  1Gi = 1048576
 SI:
  1K* = 1000
  (additional suffixes after K/G/T do not alter the scale)
 IEC:
  1K* = 1024
  (additional suffixes after K/G/T do not alter the scale)
 N ('--to' only):
  Use number N as the scale.


Examples:
  ./src/numfmt --to=SI 1000   -> 1K
  echo 1K | ./src/numfmt --from=SI    -> 1000
  echo 1K | ./src/numfmt --from=IEC   -> 1024

Report numfmt bugs to bug-coreut...@gnu.org
GNU coreutils home page: http://www.gnu.org/software/coreutils/
General help using GNU software: http://www.gnu.org/gethelp/
Report numfmt translation bugs to http://translationproject.org/team/
For complete documentation, run: info coreutils 'numfmt invocation'
===

 build-aux/gen-lists-of-programs.sh |1 +
 src/.gitignore |1 +
 src/numfmt.c   |  549 
 3 files changed, 551 insertions(+), 0 deletions(-)

diff --git a/build-aux/gen-lists-of-programs.sh b/build-aux/gen-lists-of-programs.sh
index 212ce02..bf63ee3 100755
--- a/build-aux/gen-lists-of-programs.sh
+++ b/build-aux/gen-lists-of-programs.sh
@@ -85,6 +85,7 @@ normal_progs='
 nl
 nproc
 nohup
+numfmt
 od
 paste
 pathchk
diff --git a/src/.gitignore b/src/.gitignore
index 181..25573df 100644
--- a/src/.gitignore
+++ b/src/.gitignore
@@ -59,6 +59,7 @@ nice
 nl
 nohup
 nproc
+numfmt
 od
 paste
 pathchk
diff --git a/src/numfmt.c b/src/numfmt.c
new file mode 100644
index 000..99b1450
--- /dev/null
+++ b/src/numfmt.c
@@ -0,0 +1,549 @@
+/* Reformat numbers like 11505426432 to the more human-readable 11G
+   Copyright (C) 2012 Free Software Foundation, Inc.
+
+   This program is free software: you can redistribute it and/or modify
+   it under the terms of the GNU General Public License as published by
+   the Free Software Foundation, either version 3 of the License, or
+   (at your option) any later version.
+
+   This program is distributed in the hope that it will be useful,
+   but WITHOUT ANY WARRANTY; without even the implied warranty of
+   MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See the
+   GNU General Public License for more details.
+
+   You should have received a copy of the GNU General Public License
+   along with this program.  If not, see <http://www.gnu.org/licenses/>.  */
+
+#include <config.h>
+#include <getopt.h>
+#include <stdio.h>
+#include <sys/types.h>
+
+#include "argmatch.h"
+#include "error.h"
+#include "system.h"
+#include "xstrtol.h"
+#include "human.h"
+
+/* The official name of this program (e.g., no 'g' prefix).  */
+#define PROGRAM_NAME "numfmt"
+
+#define AUTHORS proper_name ()
+
+#define BUFFER_SIZE (16 * 1024)
+
+enum
+{
+  FROM_OPTION = CHAR_MAX + 1,
+  FROM_UNIT_OPTION,
+  TO_OPTION,
+  TO_UNIT_OPTION,
+  ROUND_OPTION,
+  SUFFIX_OPTION
+};
+

bug#13075: fifo unlimited buffer size?

2012-12-04 Thread Pádraig Brady

tag 13075 + notabug
close 13075
thanks

On 12/04/2012 03:19 AM, Peng Yu wrote:

Hi,

I have the following script. When the number to the right of 'seq' is
large (as 10 in the example), the script will hang. But when the
number is small (say 1000), the script can be finished correctly. I
suspect that the problem is that there is a limit on the buffer size
for fifo. Is it so? Is there a way to make the following script work
no matter how large the number is? Thanks!

~/linux/test/gnu/gnu/coreutils/mkfifo/tee$ cat main2.sh
#!/usr/bin/env bash

rm -rf a b c
mkfifo a b c
seq 10 | tee a > b &
sort -k 1,1n a > c &
join -j 1 <(awk 'BEGIN{OFS="\t"; FS="\t"} {print $1, $1+10}' < c) \
<(awk 'BEGIN{OFS="\t"; FS="\t"}{print $1, $1+20}' < b)


So this is problematic due to `sort`, which is special in that it
needs to consume all of its input before producing any output.
Therefore, unless the buffers connecting the other commands running
in parallel can hold all of the data, there will be a deadlock.
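
The size of those buffers can be checked directly. The following is a
minimal sketch, not part of coreutils: a hypothetical function
pipe_capacity fills an anonymous pipe with non-blocking writes until the
kernel refuses more. It assumes that a FIFO made with mkfifo is backed by
the same kind of kernel buffer as an anonymous pipe.

```c
#include <assert.h>
#include <errno.h>
#include <fcntl.h>
#include <unistd.h>

/* Hypothetical helper: fill a fresh pipe with non-blocking writes and
   return how many bytes it accepted before the kernel reported it
   full, or -1 on any unexpected error.  */
static long
pipe_capacity (void)
{
  int fds[2];
  char chunk[512] = { 0 };
  long total = 0;
  int flags;

  if (pipe (fds) != 0)
    return -1;

  /* Make the write end non-blocking so a full pipe yields EAGAIN
     instead of hanging, which is what happens in the script.  */
  flags = fcntl (fds[1], F_GETFL);
  if (flags == -1 || fcntl (fds[1], F_SETFL, flags | O_NONBLOCK) == -1)
    total = -1;

  while (total >= 0)
    {
      ssize_t n = write (fds[1], chunk, sizeof chunk);
      if (n > 0)
        {
          assert (n <= (ssize_t) sizeof chunk);
          total += n;
        }
      else if (n < 0 && errno == EAGAIN)
        break;                  /* The pipe is full.  */
      else if (! (n < 0 && errno == EINTR))
        total = -1;             /* Unexpected failure.  */
    }

  close (fds[0]);
  close (fds[1]);
  return total;
}
```

On current Linux kernels, pipe(7) documents the default capacity as
65536 bytes, so anything much larger than that has to be drained by a
reader while it is being written.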

This version, for example, doesn't block, as the input for the
sort command is generated independently of the other reader.

#!/usr/bin/env bash
rm -rf a b c
mkfifo a b c
join -j 1 <(awk 'BEGIN{OFS="\t"; FS="\t"} {print $1, $1+10}' < c) \
<(awk 'BEGIN{OFS="\t"; FS="\t"}{print $1, $1+20}' < b) &
seq 10 | sort -k 1,1n > c &
seq 10 > b
wait

Obviously, if your input is expensive to generate,
then you'd be better off copying it to a regular file
and sorting that.

thanks,
Pádraig.





bug#13080: [PATCH] improve error reporting

2012-12-04 Thread Pádraig Brady

On 12/04/2012 03:32 PM, Alexandru Cojocaru wrote:

From 8ecd92b3c11abc5cde184f0c511cd469190511af Mon Sep 17 00:00:00 2001

From: Cojocaru Alexandru xo...@gmx.com
Date: Tue, 4 Dec 2012 16:08:42 +0100
Subject: [PATCH] cut: improve error reporting

* src/cut.c (main): Report error on `-d '' -b1'

* src/cut.c (set_fields): Change the error message when
the given list is invalid.
---
  src/cut.c | 8 ++++----
  1 file changed, 4 insertions(+), 4 deletions(-)

diff --git a/src/cut.c b/src/cut.c
index 4219d24..d55da51 100644
--- a/src/cut.c
+++ b/src/cut.c
@@ -365,7 +365,7 @@ set_fields (const char *fieldstr)
in_digits = false;
/* Starting a range. */
if (dash_found)
-FATAL_ERROR (_("invalid byte or field list"));
+FATAL_ERROR (_("invalid byte, character or field list"));
dash_found = true;
fieldstr++;

@@ -491,7 +491,7 @@ set_fields (const char *fieldstr)
fieldstr++;
  }
else
-FATAL_ERROR (_("invalid byte or field list"));
+FATAL_ERROR (_("invalid byte, character or field list"));
  }

max_range_endpoint = 0;


The above is fine.


@@ -781,7 +781,7 @@ main (int argc, char **argv)
/* By default, all non-delimited lines are printed.  */
suppress_non_delimited = false;

-  delim = '\0';
+  delim = 0x7F + 1;
have_read_stdin = false;

while ((optc = getopt_long (argc, argv, "b:c:d:f:ns", longopts, NULL)) != -1)
@@ -846,7 +846,7 @@ main (int argc, char **argv)
if (operating_mode == undefined_mode)
  FATAL_ERROR (_("you must specify a list of bytes, characters, or fields"));

-  if (delim != '\0' && operating_mode != field_mode)
+  if (delim != 0x7F + 1 && operating_mode != field_mode)
  FATAL_ERROR (_("an input delimiter may be specified only\
  when operating on fields"));


Keying on 0x80 doesn't seem right.
I'll apply this hunk instead in your name.
Please confirm.

diff --git a/src/cut.c b/src/cut.c
index 4219d24..f2e63dc 100644
--- a/src/cut.c
+++ b/src/cut.c
@@ -846,7 +846,7 @@ main (int argc, char **argv)
   if (operating_mode == undefined_mode)
 FATAL_ERROR (_("you must specify a list of bytes, characters, or fields"));

-  if (delim != '\0' && operating_mode != field_mode)
+  if (delim_specified && operating_mode != field_mode)
 FATAL_ERROR (_("an input delimiter may be specified only\
  when operating on fields"));
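
To illustrate why a separate flag is more robust than a sentinel
delimiter value, here is a minimal sketch. It is not the actual cut.c
code: the struct and both function names are hypothetical, and only
delim_specified is taken from the hunk above. It assumes that an empty
-d argument selects the NUL byte, so that every byte value 0..255 is a
legitimate delimiter and none is free to mean "not given".

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical stand-in for cut's option state.  */
struct delim_opt
{
  unsigned char delim;    /* The delimiter byte; any value is valid.  */
  bool delim_specified;   /* True once -d has been seen.  */
};

/* Record the argument of -d.  An empty argument selects the NUL byte,
   which a '\0' sentinel could not tell apart from "no -d given".
   Returns false if ARG holds more than one byte.  */
static bool
set_delim (struct delim_opt *opt, const char *arg)
{
  assert (opt != NULL && arg != NULL);

  if (arg[0] != '\0' && arg[1] != '\0')
    return false;

  opt->delim = (unsigned char) arg[0];
  opt->delim_specified = true;
  return true;
}

/* The check from the hunk: -d is only meaningful in field mode.  */
static bool
delim_misused (const struct delim_opt *opt, bool field_mode)
{
  assert (opt != NULL);
  return opt->delim_specified && !field_mode;
}
```

With the flag, both an empty -d argument and a delimiter byte of 0x80
outside field mode are reported, whereas each sentinel value would miss
one of those two cases.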

thanks,
Pádraig.