Re: Command line parsing of ls with genparse: inline code

2008-07-18 Thread Michael Geng
On Mon, Jun 16, 2008 at 09:43:37PM +0200, Jim Meyering wrote:
 [EMAIL PROTECTED] (Michael Geng) wrote:
 ...
  I think emitting into a .h file would be a good solution. But wouldn't
  it then be better to directly add a genparse file (e.g. ls.gp) instead of
  extracting it from ls.c? ls.c would no longer have to be modified then.
 
 The disadvantage is that there would then be bits of code in ls.gp
 that depends on declarations in ls.c.  I.e., static, file-scoped
 variables in ls.c would now also appear in ls.gp.  That is too distant
 for my taste.  That would also add ~100 more version-controlled *.gp
 source files.

I uploaded a new patch to 
http://genparse.sourceforge.net/coreutils/genparse-07-18-2008.patch.
With that patch the distribution tarball will include ls.in.c which 
contains the genparse input file at the bottom. Make will automatically
extract it and move the generated parser to ls-gp.h which will also be
part of the distribution tarball. ls.c will not be distributed but
generating it from ls.in.c only requires sed, not genparse. So there is
no code duplication in the distribution. Conceptionally genparse is not a 
build time requirement but actually it still is because the configure-script
doesn't yet check if genparse is available.

About translations: I suggest that gettext is applied on the generated parser
(ls-gp.h), not on the genparse input at the bottom of ls.in.c. On the rest, 
shall gettext be applied on ls.c or ls.in.c? The difference between ls.in.c 
and ls.c is only genparse input file which is attached at the end of ls.c as a 
comment. ls.c is a generated file now. For this reason make distcheck fails
with this patch (because it doesn't find ls.c during the translation checks) 
although make check succeeds.

You will need genparse v0.7.7 in order to process the parser in this patch.
It is available from http://genparse.sourceforge.net/ as usual.

How about this concept?

Michael


___
Bug-coreutils mailing list
Bug-coreutils@gnu.org
http://lists.gnu.org/mailman/listinfo/bug-coreutils


Re: truncate.c fails to compile on make distcheck

2008-06-26 Thread Michael Geng
On Wed, Jun 25, 2008 at 11:39:50PM +0100, Pádraig Brady wrote:
 There may be nothing wrong with your system, I'm just speculating.
 For example on 64 bit systems which I don't have access to,
 there may be a mismatch between intmax_t and off_t which is not
 obvious to me? Anyway it would be good to confirm that you
 can infact create a large file using: `truncate -s3G large_file`

I could successfully create a 3GB file with `truncate -s3G large_file`
using truncate from the sources I compiled yesterday.

Michael


___
Bug-coreutils mailing list
Bug-coreutils@gnu.org
http://lists.gnu.org/mailman/listinfo/bug-coreutils


truncate.c fails to compile on make distcheck

2008-06-25 Thread Michael Geng
Hi,

When I'm running make distcheck on the Coreutils I get an error message

...
gcc -std=gnu99  -I. -I../lib  -I../lib   -Werror -ansi -Wno-long-long -MT 
truncate.o -MD -MP -MF .deps/truncate.Tpo -c -o truncate.o truncate.c
cc1: warnings being treated as errors
truncate.c: In function 'parse_len':
truncate.c:78: error: passing argument 4 of 'xstrtoimax' from incompatible 
pointer type
make[6]: *** [truncate.o] Error 1
...

I'm running Linux on a PC and I compiled the Coreutils from git
from this morning bootstrapped with Gnulib also from git from
this morning.

Is this a bug in the Coreutils or is there something wrong on my system?

Michael


___
Bug-coreutils mailing list
Bug-coreutils@gnu.org
http://lists.gnu.org/mailman/listinfo/bug-coreutils


Re: truncate.c fails to compile on make distcheck

2008-06-25 Thread Michael Geng
On Wed, Jun 25, 2008 at 05:29:57PM +0100, Pádraig Brady wrote:
 Michael Geng wrote:
  Hi,
  
  When I'm running make distcheck on the Coreutils I get an error message
  
  ...
  gcc -std=gnu99  -I. -I../lib  -I../lib   -Werror -ansi -Wno-long-long -MT 
  truncate.o -MD -MP -MF .deps/truncate.Tpo -c -o truncate.o truncate.c
  cc1: warnings being treated as errors
  truncate.c: In function 'parse_len':
  truncate.c:78: error: passing argument 4 of 'xstrtoimax' from incompatible 
  pointer type
  make[6]: *** [truncate.o] Error 1
  ...
  
  I'm running Linux on a PC and I compiled the Coreutils from git
  from this morning bootstrapped with Gnulib also from git from
  this morning.
  
  Is this a bug in the Coreutils or is there something wrong on my system?
 
 What version of glibc are you using.
 I make the assumption that OFF_T_MAX = INTMAX_MAX,
 but this is only true when _FILE_OFFSET_BITS=64
 I'll look into fixing that up this evening.
 I wonder why that is not set in your lib/config.h ?
 
 cheers,
 Pádraig.

I'm using glibc version 2.7 (Debian packages libc6 and libc6-dev 
version 2.7-12).

In my lib/config.h in my sources of the Coreutils I have 
_FILE_OFFSET_BITS set to 64.

Michael


___
Bug-coreutils mailing list
Bug-coreutils@gnu.org
http://lists.gnu.org/mailman/listinfo/bug-coreutils


Re: truncate.c fails to compile on make distcheck

2008-06-25 Thread Michael Geng
On Wed, Jun 25, 2008 at 07:21:51PM +0200, Jim Meyering wrote:
 Pádraig Brady [EMAIL PROTECTED] wrote:
  Michael Geng wrote:
  Hi,
 
  When I'm running make distcheck on the Coreutils I get an error message
 
  ...
  gcc -std=gnu99  -I. -I../lib  -I../lib   -Werror -ansi -Wno-long-long -MT 
  truncate.o -MD -MP -MF .deps/truncate.Tpo -c -o truncate.o truncate.c
  cc1: warnings being treated as errors
  truncate.c: In function 'parse_len':
  truncate.c:78: error: passing argument 4 of 'xstrtoimax' from incompatible 
  pointer type
  make[6]: *** [truncate.o] Error 1
  ...
 
  I'm running Linux on a PC and I compiled the Coreutils from git
  from this morning bootstrapped with Gnulib also from git from
  this morning.
 
  Is this a bug in the Coreutils or is there something wrong on my system?
 
  What version of glibc are you using.
  I make the assumption that OFF_T_MAX = INTMAX_MAX,
 
 Humph.  I should have caught that in review.
 Here's a tentative patch:
 
 diff --git a/src/truncate.c b/src/truncate.c
 index 8febd58..52500d2 100644
 --- a/src/truncate.c
 +++ b/src/truncate.c
 @@ -74,10 +74,21 @@ static int
  parse_len (char const *str, off_t *size)
  {
enum strtol_error e;
 -  /* OFF_T_MAX = INTMAX_MAX */
 -  e = xstrtoimax (str, NULL, 10, size, EgGkKmMPtTYZ0);
 -  errno = (e == LONGINT_OVERFLOW) ? EOVERFLOW : 0;
 -  return (e == LONGINT_OK) ? 0 : -1;
 +  intmax_t tmp_size;
 +  e = xstrtoimax (str, NULL, 10, tmp_size, EgGkKmMPtTYZ0);
 +  if (e == LONGINT_OK
 +   !(OFF_T_MIN = tmp_size  tmp_size = OFF_T_MAX))
 +e = LONGINT_OVERFLOW;
 +
 +  if (e == LONGINT_OK)
 +{
 +  errno = 0;
 +  *size = tmp_size;
 +  return 0;
 +}
 +
 +  errno = (e == LONGINT_OVERFLOW ? EOVERFLOW : 0);
 +  return -1;
  }
 
  static void

With your patch I could successfully compile the Coreutils
and it also passed make distcheck this time.

Thank you very much for providing the patch so quickly!
Michael


___
Bug-coreutils mailing list
Bug-coreutils@gnu.org
http://lists.gnu.org/mailman/listinfo/bug-coreutils


Re: truncate.c fails to compile on make distcheck

2008-06-25 Thread Michael Geng
On Wed, Jun 25, 2008 at 08:19:56PM +0100, Pádraig Brady wrote:
 Micheal there still is something wrong with your system I think.
 Did you perhaps compile glibc from scratch and didn't pass
 a required configure option for Large File Support?

I didn't compile glibc myself. So I don't know what's wrong on
my system. But never mind, thanks to Jim's quick patch it works 
now :-)

Michael


___
Bug-coreutils mailing list
Bug-coreutils@gnu.org
http://lists.gnu.org/mailman/listinfo/bug-coreutils


Re: check-AUTHORS fails because of non ansi characters

2008-06-23 Thread Michael Geng
On Sat, Jun 21, 2008 at 06:10:28PM +0200, Bruno Haible wrote:
 Michael must be using a locale in ASCII encoding; if it were a Latin1 
 encoding,
 the output would have contained a cedilla, not a question mark.

I did not have the en_US.UTF8 locale installed. After installing it the 
problem disappeared.

Thanks,
Michael


___
Bug-coreutils mailing list
Bug-coreutils@gnu.org
http://lists.gnu.org/mailman/listinfo/bug-coreutils


check-AUTHORS fails because of non ansi characters

2008-06-20 Thread Michael Geng
Hi,

when I'm doing make distcheck on the present git version it fails
with the following message:

...
diff authors-actual authors-dotdot  rm -f authors-actual authors-dotdot
...
58c58
 ptx: Fran?ois Pinard
---
 ptx: François Pinard
...
make[3]: *** [check-AUTHORS] Error 1
...

It clearly has to do with non ansi characters in names. But what exactly 
is wrong?

Does AUTHORS now have to be updated? Or do the source files (ptx.c in this 
case) have to be updated? Or is there anything wrong with my Linux 
installation?

Michael


___
Bug-coreutils mailing list
Bug-coreutils@gnu.org
http://lists.gnu.org/mailman/listinfo/bug-coreutils


Re: Command line parsing of ls with genparse: inline code

2008-06-16 Thread Michael Geng
On Sat, Jun 14, 2008 at 06:17:30PM +0200, Jim Meyering wrote:
 [EMAIL PROTECTED] (Michael Geng) wrote:
  when I posted a patch around Christmas which showed how genparse
  could generate the parser code for the ping command of the inetutils
  Alfred Szmidt replied that in that example there are 2 loops (see
  http://lists.gnu.org/archive/html/bug-inetutils/2007-12/msg7.html):
 
  1. The switch statement which sets the arg_t struct with the
 results of getopt_long in ping-clp.c
 
  2. A list of if-clauses which sets the already existing variables
 in ping.c from the arg_t struct
 
  I now released a new version of genparse (0.7.6) which allows to put
  the generated parser code inline of existing code, replacing almost
  exactly the existing C code. This version has 2 extensions:
 
  1. A __CODE__(statements) directive which allows to let the user define
 the code which is executed in the switch statement which evaluates
 the results of getopt_long.
 
  2. A global directive #no_struct. If specified in the genparse file
 then no arg_t struct struct is generated.
 
  I uploaded a patch which shows how this works for the ls command of
  the coreutils. It transforms ls.c.in which has the genparse file inline
  into ls.c which has almost exactly the same command line parser code
  inline as the present version of ls.c. The patch is targeted to today's
  git version of the coreutils. I verified that it passes make distcheck.
 
  Please use the following links:
 
  patch:   
  http://genparse.sourceforge.net/coreutils/genparse-ls-inline-git-01-19-2008.patch
  ls.c.in: http://genparse.sourceforge.net/coreutils/ls.c.in (included in the 
  patch)
  ls.c http://genparse.sourceforge.net/coreutils/ls.c(included in the 
  patch)
 
 Hi Michael,
 
 If you're still interested (and willing to work! ;-)
 I think we're getting close.  I hope you understand the delay.
 We're contemplating a massive change, and I want to be sure
 I don't end up regretting it.
 
 Now that coreutils-6.12 is out, once two remaining points
 are addressed, then I think we'll finally be ready to switch
 a few trial programs.
 
 Please add #line directives in the emitted C code, so that
 debuggers know the source file/line-numbers.
 
 I'd really like to avoid having to move the primary source
 for 100+ C files from .c to .c.in files (or .in.c).
 I'm not sure how best to address that.  Here are some thoughts:
 
 We might arrange for automake to emit an augmented .c-to-.o rule
 that invokes genparse along the way (this is where #line directives
 become essential -- the postprocessed intermediate .c file may not even
 exist after it's compiled.  Note however that that would add genparse
 as a *build-time* requirement, so isn't really an option.  The usual
 approach to solving this problem is to distribute both primary sources
 and generated sources (e.g., yacc and lex sources ship with generated
 .c files), but that would bloat the distribution tarballs.
 
 One possibility would be to emit into a .h file the static
 definitions for functions like Cmdline.  Then there would be
 far less duplication.
 
 Jim

Hi Jim,

I think emitting into a .h file would be a good solution. But wouldn't 
it then be better to directly add a genparse file (e.g. ls.gp) instead of 
extracting it from ls.c? ls.c would no longer have to be modified then. 
How about simply renaming the genparse generated ls-clp.c into a header
file, modifying it the same way the Makefile with my last patch already 
does and including it in ls.c? Here's a drawing of this idea:

   genparse sed
ls.gp -- ls-clp.c - ls-genparse.h

This would mean that ls.c becomes more tiny and readable (because the
command line part is removed) but ls.gp will be added and in order not to
have genparse as a build-time requirement ls-genparse.h will also have to 
be added to the distribution. ls-clp.c becomes an intermediate file which 
need not be distributed, I hope that ls-clp.h will not be needed at all. 
There will be no code duplication except between the genparse file and the 
generated command line parser.

I haven't yet had time to test my idea, I hope I will find time to do that
in the next days.

Michael


___
Bug-coreutils mailing list
Bug-coreutils@gnu.org
http://lists.gnu.org/mailman/listinfo/bug-coreutils


Command line parsing of ls with genparse: inline code

2008-01-19 Thread Michael Geng
Hi,

when I posted a patch around Christmas which showed how genparse 
could generate the parser code for the ping command of the inetutils
Alfred Szmidt replied that in that example there are 2 loops (see
http://lists.gnu.org/archive/html/bug-inetutils/2007-12/msg7.html):

1. The switch statement which sets the arg_t struct with the 
   results of getopt_long in ping-clp.c

2. A list of if-clauses which sets the already existing variables
   in ping.c from the arg_t struct

I now released a new version of genparse (0.7.6) which allows to put
the generated parser code inline of existing code, replacing almost
exactly the existing C code. This version has 2 extensions:

1. A __CODE__(statements) directive which allows to let the user define 
   the code which is executed in the switch statement which evaluates
   the results of getopt_long.

2. A global directive #no_struct. If specified in the genparse file 
   then no arg_t struct struct is generated. 

I uploaded a patch which shows how this works for the ls command of 
the coreutils. It transforms ls.c.in which has the genparse file inline
into ls.c which has almost exactly the same command line parser code 
inline as the present version of ls.c. The patch is targeted to today's 
git version of the coreutils. I verified that it passes make distcheck.

Please use the following links:

patch:   
http://genparse.sourceforge.net/coreutils/genparse-ls-inline-git-01-19-2008.patch
ls.c.in: http://genparse.sourceforge.net/coreutils/ls.c.in (included in the 
patch)
ls.c http://genparse.sourceforge.net/coreutils/ls.c(included in the 
patch)

How do you think about it?

Michael


___
Bug-coreutils mailing list
Bug-coreutils@gnu.org
http://lists.gnu.org/mailman/listinfo/bug-coreutils


Translator comments and other extensions to genparse

2007-10-13 Thread Michael Geng
Hi Bruno, Jim, Andreas, hello mailing list,

I released genparse version 0.7.2 which addresses some issues which we 
recently discussed in the mailing list.

Splitting usage text in smaller pieces
==

Regarding http://lists.gnu.org/archive/html/bug-coreutils/2007-10/msg00048.html:

genparse now has a new command line option -P / --manyprints which causes the
description for every command line parameter to be printed in a seperate print 
command. See example below.

Note that genparse also has a global directive #break_lines which allows to 
break lines in the usage text automatically to a specified width.

Translator comments
===

Release 0.7.2 also has a new directive __COMMENT__ which allows to add comments
to the source code which prints the help text. The example below shows how
this can be used especially for translator comments.

puts instead of printf
==

Regarding http://lists.gnu.org/archive/html/bug-coreutils/2007-10/msg00087.html:

genparse automatically uses the puts command now if possible. 

An updated version of genparse, updated docu and examples can be accessed from
http://genparse.sourceforge.net/ as usual.



Below is a modified version of the genparse file for the wc command:

#include coreutils.h

c / bytes   flagprint the byte counts
m / chars   flagprint the character counts
l / lines   flagprint the newline counts
NONE / files0-from=Fstring  read input from the files specified by
  NUL-terminated names in file F
L / max-line-length flag__COMMENT__(TRANSLATORS: the length here is 
actually the screen width)
print the length of the longest line
w / words   flagprint the word counts
#gp_include help_version.gp

#usage_begin
__COMMENT__(TRANSLATORS: Keep the colons aligned.)
Usage: __PROGRAM_NAME__ [OPTION]... [FILE]...
  or:  __PROGRAM_NAME__ [OPTION]...__NEW_PRINT__
Print newline, word, and byte counts for each FILE, and a total line if
more than one FILE is specified.  With no FILE, or when FILE is -,
read standard input.__NEW_PRINT__
__GLOSSARY_GNU__(25)
__COMMAND__(emit_bug_reporting_address ())
#usage_end



And below is the usage () function of the parser genparse 0.7.2 generates 
when invoked with
genparse --longmembers --internationalize --static-headers --gnulib 
--manyprints -o wc-clp wc.gp

/*
**
** usage ()
**
** Print out usage information, then exit
**
**--*/

void usage (int status, char *program_name)
{
  if (status != EXIT_SUCCESS)
fprintf (stderr, _(Try `%s --help' for more information.\n),
program_name);
  else
{
  /* TRANSLATORS: Keep the colons aligned. */
  printf (_(\
Usage: %s [OPTION]... [FILE]...\n\
  or:  %s [OPTION]...\n), program_name, program_name);
  puts (_(\
Print newline, word, and byte counts for each FILE, and a total line if\n\
more than one FILE is specified.  With no FILE, or when FILE is -,\n\
read standard input.));
  puts (_(\
  -c, --bytesprint the byte counts));
  puts (_(\
  -m, --charsprint the character counts));
  puts (_(\
  -l, --linesprint the newline counts));
  puts (_(\
  --files0-from=Fread input from the files specified by\n\
   NUL-terminated names in file F));
  /* TRANSLATORS: the length here is actually the screen width */
  puts (_(\
  -L, --max-line-length  print the length of the longest line));
  puts (_(\
  -w, --wordsprint the word counts));
  puts (_(\
  --help display this help and exit));
  puts (_(\
  --version  output version information and exit));
  emit_bug_reporting_address ();
}
  exit (status);
}

Michael


___
Bug-coreutils mailing list
Bug-coreutils@gnu.org
http://lists.gnu.org/mailman/listinfo/bug-coreutils


Re: Translator comments and other extensions to genparse

2007-10-13 Thread Michael Geng
On Sat, Oct 13, 2007 at 04:01:09PM +0200, Bruno Haible wrote:
 Michael Geng wrote:
  I released genparse version 0.7.2 which addresses some issues which we 
  recently discussed in the mailing list.
 
 Thank you very much!
 
 The only remaining question is: Why is --manyprints not implied by
 --internationalize? Or why does the doc of --internationalize not say:
 When you use this option, the option --manyprints is also recommended.?
 
 Bruno

Is there really consensus that in future every command line parameter shall 
be output in a seperate print command?

Michael


___
Bug-coreutils mailing list
Bug-coreutils@gnu.org
http://lists.gnu.org/mailman/listinfo/bug-coreutils


Re: [PATCH] Command line parsing of ls, tail and wc with genparse

2007-10-09 Thread Michael Geng
On Sat, Oct 06, 2007 at 05:35:43PM +0200, Jim Meyering wrote:
 [EMAIL PROTECTED] (Michael Geng) wrote:
 ...
  file you extracted above is from the wc command. You can watch the genparse
  generated parser for it from 
  http://genparse.sourceforge.net/examples/wc_clp.c.
 
 It's nice to see the continuing improvements.
 
 I noticed that you transformed the uses of fputs in wc's usage
 function into uses of fprintf in your generated wc_clp.c.
 That is probably ok in most cases, but in some (with a string
 containing %), it's not -- unless you escape them.
 Besides, I switched from fprintf to fputs for a reason: so that
 I (and translators) don't have to worry about such escaping.
 In addition, I don't mind that fputs is lighter-weight than fprintf.

Is there a special reason why you use fprintf (... , stdout) and 
fputs (stdout, ...) instead of printf and puts? 

Michael


___
Bug-coreutils mailing list
Bug-coreutils@gnu.org
http://lists.gnu.org/mailman/listinfo/bug-coreutils


Re: [PATCH] Command line parsing of ls, tail and wc with genparse

2007-10-07 Thread Michael Geng
On Sat, Oct 06, 2007 at 05:09:15PM +0200, Bruno Haible wrote:
 Michael Geng wrote:
  My intention is in fact to invoke xgettext on the parser files which 
  genparse generates. ... You can watch the genparse
  generated parser for it from 
  http://genparse.sourceforge.net/examples/wc_clp.c.
 
 OK, this kind of generated code is not bad; it's acceptable to have
 translators look at it.
 
 But there are still two problems:
 
 1) The gettext documentation [1] recommends to not make the _() arguments
 unnecessarily large.
 
   printf (_(\
   --files0-from=Fread input from the files specified by\n\
NUL-terminated names in file F\n\
   -L, --max-line-length  print the length of the longest line\n\
   -w, --wordsprint the word counts\n));
 
 This is too large. When the maintainer adds another option, the translator
 should not have to re-proofread the 3 existing options. So you should better
 generate code like this:

The text is partitioned exactly as it is in the existing code of tail.c (I'm 
looking at a cvs archive copy from sept 9). This is from tail.c:

  fputs (_(\
  --files0-from=Fread input from the files specified by\n\
   NUL-terminated names in file F\n\
  -L, --max-line-length  print the length of the longest line\n\
  -w, --wordsprint the word counts\n\
), stdout);

-

   printf (_(\
   --files0-from=Fread input from the files specified by\n\
NUL-terminated names in file F\n));
   printf (_(\
   -L, --max-line-length  print the length of the longest line\n));
   printf (_(\
   -w, --wordsprint the word counts\n));

Genparse could easily generate the above code if you added more __NEW_PRINT__
directives to the genparse file like this:

NONE / files0-from=Fstring  read input from the files specified by
  NUL-terminated names in file F__NEW_PRINT__
L / max-line-length flagprint the length of the longest 
line__NEW_PRINT__
w / words   flagprint the word counts__NEW_PRINT__

 2) What about translator comments? How can a maintainer provide
 additional information for the translators? Does this work?
 
   /* TRANSLATORS: the length here is actually the screen width */
   L / max-line-length  flagprint the length of the longest line
 
 
 /* TRANSLATORS: Keep the colons aligned. */
 Usage: __PROGRAM_NAME__ [OPTION]... [FILE]...
   or:  __PROGRAM_NAME__ [OPTION]... --files0-from=F__NEW_PRINT__
 
 
 Bruno
 
 
 [1] 
 http://www.gnu.org/software/gettext/manual/html_node/Preparing-Strings.html

That could be a reason for a new directive in the genparse file. When I grep for
TRANSLATORS in the coreutils src/ directory I can find only 3 hits however and
none of them is directly in the command line parsing code. system.h contains a
TRANSLATORS comment in the emit_bug_reporting_address () function but that can
stay as it is because genparse only instanstiates the call to 
emit_bug_reporting_address () (see e.g. 
http://genparse.sourceforge.net/examples/wc_clp.c).
Do you see more examples of translator comments in the coreutils code for which 
you think genparse should handle them?

Michael


___
Bug-coreutils mailing list
Bug-coreutils@gnu.org
http://lists.gnu.org/mailman/listinfo/bug-coreutils


Re: [PATCH] Command line parsing of ls, tail and wc with genparse

2007-10-07 Thread Michael Geng
On Sat, Oct 06, 2007 at 05:35:43PM +0200, Jim Meyering wrote:
 [EMAIL PROTECTED] (Michael Geng) wrote:
 ...
  file you extracted above is from the wc command. You can watch the genparse
  generated parser for it from 
  http://genparse.sourceforge.net/examples/wc_clp.c.
 
 It's nice to see the continuing improvements.
 
 I noticed that you transformed the uses of fputs in wc's usage
 function into uses of fprintf in your generated wc_clp.c.
 That is probably ok in most cases, but in some (with a string
 containing %), it's not -- unless you escape them.
 Besides, I switched from fprintf to fputs for a reason: so that
 I (and translators) don't have to worry about such escaping.
 In addition, I don't mind that fputs is lighter-weight than fprintf.

Genparse could issue fputs () calls whenever the printed string doesn't contain
arguments. I will probably change this in one of the next genparse releases.

Michael


___
Bug-coreutils mailing list
Bug-coreutils@gnu.org
http://lists.gnu.org/mailman/listinfo/bug-coreutils


Re: [PATCH] Command line parsing of ls, tail and wc with genparse

2007-10-07 Thread Michael Geng
On Sun, Oct 07, 2007 at 02:48:43PM +0200, Bruno Haible wrote:
 Michael Geng wrote:
   1) The gettext documentation [1] recommends to not make the _() arguments
   unnecessarily large.
  
  The text is partitioned exactly as it is in the existing code of tail.c 
  (I'm 
  looking at a cvs archive copy from sept 9). This is from tail.c:
  
fputs (_(\
--files0-from=Fread input from the files specified by\n\
 NUL-terminated names in file F\n\
-L, --max-line-length  print the length of the longest line\n\
-w, --wordsprint the word counts\n\
  ), stdout);
 
 This is too large already now in tail.c. I haven't told Jim about it in
 the past because there are dozens of program maintainers who would have
 to change hundreds of usage messages manually.
 
 The difference is that you are writing a tool, and your tool already knows
 where the description of each option starts and ends.
 
  Genparse could easily generate the above code if you added more 
  __NEW_PRINT__
  directives to the genparse file like this:
  
  NONE / files0-from=Fstring  read input from the files specified by
NUL-terminated names in file F__NEW_PRINT__
  L / max-line-length flagprint the length of the longest 
  line__NEW_PRINT__
  w / words   flagprint the word counts__NEW_PRINT__
 
 Why should the maintainer have to write __NEW_PRINT__ everywhere here?
 It's clear from the input where the description of each option ends. Your
 tool can add __NEW_PRINT__ implicitly there.

I didn't know that it is desired to have the description for every command line
argument in a seperate string. I could add a new command line switch to 
genparse 
which automatically splits text this way.

   2) What about translator comments? How can a maintainer provide
   additional information for the translators? Does this work?
 
  When I grep for TRANSLATORS in the coreutils src/ directory I can find
  only 3 hits however ...
  Do you see more examples of translator comments in the coreutils code for
  which you think genparse should handle them?
 
 This is not a strong argument.
 
 Bruno

I understand that you don't want to loose the possibility to add such translator
comments. Maybe I will add a new directive to the genparse file for it in one 
of 
the next versions.

Michael


___
Bug-coreutils mailing list
Bug-coreutils@gnu.org
http://lists.gnu.org/mailman/listinfo/bug-coreutils


Re: [PATCH] Command line parsing of ls, tail and wc with genparse

2007-10-07 Thread Michael Geng
On Sun, Oct 07, 2007 at 09:47:31AM +0200, Jim Meyering wrote:
 [EMAIL PROTECTED] (Michael Geng) wrote:
 ...
  The text is partitioned exactly as it is in the existing code of tail.c (I'm
  looking at a cvs archive copy from sept 9). This is from tail.c:
 
 I think tail.c is still in sync, but...
 
 Please don't use cvs anymore (use git instead).
 Something went wrong with a git-to-cvs commit-export,
 so that at least the cvs ChangeLog is out of sync with
 the upstream git repository.
 
 I'll fix it eventually, but the point is you shouldn't
 rely on cvs being up to date.

I retrieved the git version today and bootstrapped it with an updated cvs 
version of Gnulib. Unfortunately I still have problems to compile it. It
stops with the following messages:

...
gcc -std=gnu99  -I. -I../lib  -I../lib   -g -O2 -MT cp.o -MD -MP -MF 
.deps/cp.Tpo -c -o cp.o cp.c
mv -f .deps/cp.Tpo .deps/cp.Po
gcc -std=gnu99  -I. -I../lib  -I../lib   -g -O2 -MT copy.o -MD -MP -MF 
.deps/copy.Tpo -c -o copy.o copy.c
copy.c:42:22: error: file-set.h: No such file or directory
copy.c:48:25: error: hash-triple.h: No such file or directory
copy.c: In function 'dest_info_init':
copy.c:908: error: 'triple_hash' undeclared (first use in this function)
copy.c:908: error: (Each undeclared identifier is reported only once
copy.c:908: error: for each function it appears in.)
copy.c:909: error: 'triple_compare' undeclared (first use in this function)
copy.c:910: error: 'triple_free' undeclared (first use in this function)
copy.c: In function 'src_info_init':
copy.c:930: error: 'triple_hash_no_name' undeclared (first use in this function)
copy.c:931: error: 'triple_compare' undeclared (first use in this function)
copy.c:932: error: 'triple_free' undeclared (first use in this function)
copy.c: In function 'copy_internal':
copy.c:1040: warning: implicit declaration of function 'seen_file'
copy.c:1047: warning: implicit declaration of function 'record_file'
make[2]: *** [copy.o] Error 1
make[2]: Leaving directory `/home/linux/tmp/coreutils-git-10-07-2007/src'
make[1]: *** [all] Error 2
make[1]: Leaving directory `/home/linux/tmp/coreutils-git-10-07-2007/src'
make: *** [all-recursive] Fehler 1


___
Bug-coreutils mailing list
Bug-coreutils@gnu.org
http://lists.gnu.org/mailman/listinfo/bug-coreutils


tail.c: variable pid not initialized

2007-09-28 Thread Michael Geng
Hi,

I don't understand where the static variable pid in tail.c gets 
initialized when tail is invoked without --pid option. 

I'm looking at the latest version in the cvs archive.

Michael


___
Bug-coreutils mailing list
Bug-coreutils@gnu.org
http://lists.gnu.org/mailman/listinfo/bug-coreutils


Re: [PATCH] Command line parsing of ls with genparse

2007-08-30 Thread Michael Geng
On Wed, Aug 29, 2007 at 11:39:46PM +0200, Jim Meyering wrote:
 [EMAIL PROTECTED] (Michael Geng) wrote:
 ...
  of text. In the present version of ls the usage() function calls
  fputs() several times. The genparse version prints everything in
  1 single call to printf(). So the usage() text in the present ls.c
  is split into multiple _() macros, whereas ls-clp.c uses 1 single _()
  macro for the whole help screen.
 
 Consider separating it into strings no longer than 509 bytes each
 and printing them separately.  That's a portability limitation imposed
 by some c89 compilers.  (see gcc's -Woverlength-strings)
 
 If you were to run make distcheck, this and some other problems
 would be exposed.  For example, you added at least one function
 that was not declared static.
 
 With your changes does make check still pass?

Yes, I verified that that it passes make check. make distcheck
is new to me however, I haven't yet tried that.

Michael


___
Bug-coreutils mailing list
Bug-coreutils@gnu.org
http://lists.gnu.org/mailman/listinfo/bug-coreutils


Re: [PATCH] Command line parsing of ls with genparse

2007-08-29 Thread Michael Geng
On Tue, Aug 28, 2007 at 09:08:51PM -0600, Eric Blake wrote:
 According to Michael Geng on 8/28/2007 12:33 PM:
  
  In the present version of genparse new strings are always printed
  in new lines. For example (also from the ls commmand):
  
  d / directory   flaglist directory entries instead 
  of contents,
and do not dereference symbolic 
  links
 
 Why not make genparse a bit smarter, and let the user supply free-form
 text as the option description.  Genparse should then wrap it to fit an
 80-column screen before generating the resulting usage() in the .c file.
 Then the above example would simply be:
 
 d / directory flag \
 list directory entries instead of contents, and do not dereference
 symbolic links
 
 with the __GNU_GLOSSARY__(29) being the formatting hint of where the
 auto-wrapping should occur in the output English text.

I think that's a good idea. How about adding a --linebreak[=width] command 
line switch to genparse which enables breaking lines on the help screen 
automatically to the specified width or 80 columns if --linebreak is given
without argument?

Michael


___
Bug-coreutils mailing list
Bug-coreutils@gnu.org
http://lists.gnu.org/mailman/listinfo/bug-coreutils


Re: [PATCH] Command line parsing of ls with genparse

2007-08-29 Thread Michael Geng
On Tue, Aug 28, 2007 at 01:21:46PM -0700, Eric Blake-1 wrote:
 
  2. ls.c depends ls-clp.h (the generated parser)
 ls-clp.h depends on ls.gp (the genparse file)
 ls.gp depends on ls.c because ls.gp is embedded as a comment in ls.c
 - There is a circular dependency!
 
 That seems wrong to me.  Isn't it really:
 
 ls$(EXEEXT) directly depends on ls.o and ls-clp.o
 ls.o directly depends on ls.c and ls-clp.h
 ls-clp.o directly depends on ls-clp.c and ls-clp.h
 ls-clp.c directly depends on ls.gp
 ls-clp.h directly depends on ls.gp
 ls.gp directly depends on ls.c
 
 No cycle there, even though ls.c is an indirect
 dependency of ls$(EXEEXT) through more than one
 leg of the transitive closure.

You are right. I verified this and it builds properly with the 
following modifications on src/Makefile.am:

--- coreutils-6.9.orig/src/Makefile.am  2007-03-20 08:24:27.0 +0100
+++ coreutils-6.9/src/Makefile.am   2007-08-29 21:14:29.0 +0200
@@ -48,7 +48,7 @@
 EXTRA_DIST = dcgen dircolors.hin tac-pipe.c \
   groups.sh wheel-gen.pl extract-magic c99-to-c89.diff
 BUILT_SOURCES =
-CLEANFILES = $(SCRIPTS) su
+CLEANFILES = $(SCRIPTS) su *.gp *-clp.c *-clp.h
 
 AM_CPPFLAGS = -I$(top_srcdir)/lib
 
@@ -185,14 +185,16 @@
 __SOURCES = lbracket.c
 
 cp_SOURCES = cp.c copy.c cp-hash.c
-dir_SOURCES = ls.c ls-dir.c
-vdir_SOURCES = ls.c ls-vdir.c
-ls_SOURCES = ls.c ls-ls.c
+dir_SOURCES = ls.c ls-dir.c ls-clp.c ls-clp.h
+vdir_SOURCES = ls.c ls-vdir.c ls-clp.c ls-clp.h
+ls_SOURCES = ls.c ls-ls.c ls-clp.c ls-clp.h
 chown_SOURCES = chown.c chown-core.c
 chgrp_SOURCES = chgrp.c chown-core.c
 
 mv_SOURCES = mv.c copy.c cp-hash.c remove.c
 rm_SOURCES = rm.c remove.c
+tail_SOURCES = tail.c tail-clp.c
+wc_SOURCES = wc.c wc-clp.c
 
 md5sum_SOURCES = md5sum.c
 md5sum_CPPFLAGS = -DHASH_ALGO_MD5=1 $(AM_CPPFLAGS)
@@ -363,3 +365,14 @@
| grep -Ev -f $$t \
  { echo 'the above variables should have static scope' 12;   \
exit 1; } || :
+
+ls.$(OBJEXT): ls-clp.c ls-clp.h
+tail.$(OBJEXT): ls-clp.c tail-clp.h
+wc.$(OBJEXT): ls-clp.c wc-clp.h
+
+%-clp.c %-clp.h: %.gp
+   genparse --longmembers --internationalize -o $(*F)-clp $
+
+%.gp: %.c
+   sed -n -e '/genparse file starts here/,/genparse file ends here/p'  
$(*F).c | \
+   sed -e '/genparse file ends here/d' -n -e '2,$$p'  $@

Michael


___
Bug-coreutils mailing list
Bug-coreutils@gnu.org
http://lists.gnu.org/mailman/listinfo/bug-coreutils


Re: [PATCH] Command line parsing of ls with genparse

2007-08-29 Thread Michael Geng
On Tue, Aug 28, 2007 at 01:21:46PM -0700, Eric Blake-1 wrote:
  +++ coreutils-6.9/src/ls.c  2007-08-26 19:58:20.0 +0200
  @@ -76,7 +76,6 @@
   # define SA_RESTART 0
   #endif
   
  -#include system.h
   #include fnmatch.h
 
 Why are you deleting this include?  Without it, how do you ensure
 that config.h is pulled in before anything else?  If you intend for
 ls-clp.h to fill this role, then it must be included before any
 system files.  Also, are you sure you are not falling foul of
 any 'make distcheck' rules in Makefile.maint?

I need the following definitions in ls-clp.c:

1. the i18n macro _()
2. the definition of PACKAGE_BUGREPORT
3. the definition of true and false

I got everything by including system.h in ls-clp.c. Unfortunately
I had to exclude it from ls.c then because there were duplicate 
definitons. I entered this when I wrote the patch for the wc command
and at that time I was happy to get it compiled, that's all.

I'm sure there is a better solution. Maybe I have to include other
files. I agree that this has to be fixed.

  +  Cmdline(cmdline, argc, argv);
 
 GNU coding standards want a space between the function
 name and open (.

Right, thanks.

  +/* Extract the following section an process it with genparse
  +   (see http://genparse.sourceforge.net) in order to generate a parser
  +   for the command line arguments and a usage function for printing a
  help 
  +   screen. */
  +
  +/* genparse file starts here
  +#include config.h
  +#include system.h
  +#include ls.h
  +
  +#exit_value LS_FAILURE
 
 I know the C standard requires this, but in practice, are all
 C preprocessors tolerant of comments that contain lines
 that look like preprocessing directives but which are not?

That's potentially another drawback of embedding the genparse files in the
C sources.

  +NONE / helpflagdisplay this help and exit
  +NONE / version flagoutput version information and 
  exit
 
 It looks like one drawback of using genparse is that you lose
 the system.h magic that ensures consistency between all
 the apps with --help and --version, since you can't really
 use the preprocessor macros *_HELP_OPTION_* here.

I could imagine that this can be solved by adding the capability 
to include parameter definitions in a genparse file, i.e. include 
genparse files in other genparse files. There could be a shared 
genparse file with the parameter definitions for help and version 
which could be included by all other genparse files.

  +Report bugs to __STRING__(PACKAGE_BUGREPORT).
 
 What happened to the TRANSLATOR comment that reminds
 them to add a second line, including the address to report
 translation bugs to?  Also, it isn't very obvious how this
 will affect xgettext extraction of strings that need
 translation.  Are you sure you haven't broken things
 for other locales?  Would the generated ls-clp.c need
 to be added to POTFILES.in, or is your intent still to
 have all translatable strings reside in ls.c?

If I understood the i18n mechanism right then the C preprocesor
is needed for the _() macros to take effect. So the genparse files
can't be translated directly, even if they are embedded in C files
because they are still inside of a comment. So I think ls-clp.c
would have to be added to POTFILES.in. 

I haven't investigated how a genparse based solution affects
i18n and I generally have very view experiance with i18n. I 
would expect that problems are caused by different partitioning 
of text. In the present version of ls the usage() function calls 
fputs() several times. The genparse version prints everything in 
1 single call to printf(). So the usage() text in the present ls.c
is split into multiple _() macros, whereas ls-clp.c uses 1 single _() 
macro for the whole help screen. Do you agree that this is the main
source of trouble? Do you see other problems? I haven't fully 
thought this through but I think I could change genparse such that 
the user can control when a new print command should start thus 
giving control of partitioning translatable text to the user.

Michael


___
Bug-coreutils mailing list
Bug-coreutils@gnu.org
http://lists.gnu.org/mailman/listinfo/bug-coreutils


Re: [PATCH] Command line parsing of ls with genparse

2007-08-28 Thread Michael Geng
On Tue, Aug 28, 2007 at 10:48:55AM -0700, Eric Blake-1 wrote:
 
  1. The usage function for the ls command has the following 
 explanation for the -F, --classify command liune switch:
 append indicator (one of */=@|) to entries.
 
 Can you rewrite it as append indicator (one of * /=@|) to entries,
 and rely on string concatenation to produce the same output?
 
 -- Eric Blake

In the present version of genparse new strings are always printed
in new lines. For example (also from the ls commmand):

d / directory   flaglist directory entries instead of 
contents,
  and do not dereference symbolic 
links

in the genparse (.gp) file is translated into

  -d, --directorylist directory entries instead of contents,\n\
   and do not dereference symbolic links\n\

in the generated usage() function. Or

NONE / group-directories-first  flag group directories before files

could be used to generate

  --group-directories-first\n\
 group directories before files\n\

in the usage() function.

So presently it doesn't work, but your idea is not bad. I will think 
about it again and maybe change genparse such that multiple comment
strings in the same line will simply be concatenated.

Michael


___
Bug-coreutils mailing list
Bug-coreutils@gnu.org
http://lists.gnu.org/mailman/listinfo/bug-coreutils


[PATCH] Command line parsing of tail with genparse

2007-08-19 Thread Michael Geng
Hi Jim, hello everybody,

genparse version 0.6.8 has support for command line parameters
with optional arguments now. This was the most important missing
feature for generating the command line parsing code for the tail 
command. I focused on the tail command because in a previous email 
you suggested tail as a candidate for genparse.

The attached patch contains the genparse extensions for both the
wc command (as previously posted) and the tail command. I verified
that it successfully passes make check. I also added tail to
the examples on the genparse homepage 
(http://genparse.sourceforge.net/examples.html) so that you can 
have a look at the generated code without downloading and building
the latest version of genparse.

As usual: Any comments welcome!

Michael
diff -u -N -r coreutils-6.9.orig/doc/Makefile.am coreutils-6.9/doc/Makefile.am
--- coreutils-6.9.orig/doc/Makefile.am  2007-03-18 22:36:43.0 +0100
+++ coreutils-6.9/doc/Makefile.am   2007-08-19 13:45:25.0 +0200
@@ -32,10 +32,10 @@
 # old systems.
 AM_MAKEINFOFLAGS = --no-split
 
-constants.texi: $(top_srcdir)/src/tail.c
+constants.texi: $(top_srcdir)/src/tail.h
LC_ALL=C \
sed -n -e 's/^#define \(DEFAULT_MAX[_A-Z]*\) \(.*\)/@set \1 \2/p' \
- $(top_srcdir)/src/tail.c  t-$@
+ $(top_srcdir)/src/tail.h  t-$@
mv t-$@ $@
 
 MAINTAINERCLEANFILES = constants.texi
diff -u -N -r coreutils-6.9.orig/doc/Makefile.in coreutils-6.9/doc/Makefile.in
--- coreutils-6.9.orig/doc/Makefile.in  2007-03-22 22:20:21.0 +0100
+++ coreutils-6.9/doc/Makefile.in   2007-08-19 13:46:04.0 +0200
@@ -937,10 +937,10 @@
uninstall-ps-am
 
 
-constants.texi: $(top_srcdir)/src/tail.c
+constants.texi: $(top_srcdir)/src/tail.h
LC_ALL=C \
sed -n -e 's/^#define \(DEFAULT_MAX[_A-Z]*\) \(.*\)/@set \1 \2/p' \
- $(top_srcdir)/src/tail.c  t-$@
+ $(top_srcdir)/src/tail.h  t-$@
mv t-$@ $@
 
 $(DVIS): $(EXTRA_DIST)
diff -u -N -r coreutils-6.9.orig/src/Makefile.am coreutils-6.9/src/Makefile.am
--- coreutils-6.9.orig/src/Makefile.am  2007-03-20 08:24:27.0 +0100
+++ coreutils-6.9/src/Makefile.am   2007-08-19 13:46:56.0 +0200
@@ -363,3 +363,12 @@
| grep -Ev -f $$t \
  { echo 'the above variables should have static scope' 12;   \
exit 1; } || :
+
+wc_SOURCES = wc_clp.c wc.c
+wc_clp.c: wc.gp
+   genparse --longmembers --internationalize -o wc_clp wc.gp
+
+
+tail_SOURCES = tail_clp.c tail.c
+tail_clp.c: tail.gp
+   genparse --longmembers --internationalize -o tail_clp tail.gp
diff -u -N -r coreutils-6.9.orig/src/Makefile.in coreutils-6.9/src/Makefile.in
--- coreutils-6.9.orig/src/Makefile.in  2007-03-22 22:23:20.0 +0100
+++ coreutils-6.9/src/Makefile.in   2007-08-19 13:47:14.0 +0200
@@ -569,8 +569,8 @@
 tac_LDADD = $(LDADD)
 tac_DEPENDENCIES = ../lib/libcoreutils.a $(am__DEPENDENCIES_1) \
../lib/libcoreutils.a
-tail_SOURCES = tail.c
-tail_OBJECTS = tail.$(OBJEXT)
+am_tail_OBJECTS = tail_clp.$(OBJEXT) tail.$(OBJEXT)
+tail_OBJECTS = $(am_tail_OBJECTS)
 tail_DEPENDENCIES = $(am__DEPENDENCIES_3)
 tee_SOURCES = tee.c
 tee_OBJECTS = tee.$(OBJEXT)
@@ -636,8 +636,8 @@
 vdir_OBJECTS = $(am_vdir_OBJECTS)
 vdir_DEPENDENCIES = $(am__DEPENDENCIES_2) $(am__DEPENDENCIES_1) \
$(am__DEPENDENCIES_1) $(am__DEPENDENCIES_1)
-wc_SOURCES = wc.c
-wc_OBJECTS = wc.$(OBJEXT)
+am_wc_OBJECTS = wc_clp.$(OBJEXT) wc.$(OBJEXT)
+wc_OBJECTS = $(am_wc_OBJECTS)
 wc_LDADD = $(LDADD)
 wc_DEPENDENCIES = ../lib/libcoreutils.a $(am__DEPENDENCIES_1) \
../lib/libcoreutils.a
@@ -677,10 +677,10 @@
rmdir.c seq.c setuidgid.c $(sha1sum_SOURCES) \
$(sha224sum_SOURCES) $(sha256sum_SOURCES) $(sha384sum_SOURCES) \
$(sha512sum_SOURCES) shred.c shuf.c sleep.c sort.c split.c \
-   stat.c stty.c su.c sum.c sync.c tac.c tail.c tee.c test.c \
-   touch.c tr.c true.c tsort.c tty.c uname.c unexpand.c uniq.c \
-   unlink.c uptime.c users.c $(vdir_SOURCES) wc.c who.c whoami.c \
-   yes.c
+   stat.c stty.c su.c sum.c sync.c tac.c $(tail_SOURCES) tee.c \
+   test.c touch.c tr.c true.c tsort.c tty.c uname.c unexpand.c \
+   uniq.c unlink.c uptime.c users.c $(vdir_SOURCES) $(wc_SOURCES) \
+   who.c whoami.c yes.c
 DIST_SOURCES = $(__SOURCES) base64.c basename.c cat.c $(chgrp_SOURCES) \
chmod.c $(chown_SOURCES) chroot.c cksum.c comm.c $(cp_SOURCES) \
csplit.c cut.c date.c dd.c df.c $(dir_SOURCES) dircolors.c \
@@ -693,10 +693,10 @@
rmdir.c seq.c setuidgid.c $(sha1sum_SOURCES) \
$(sha224sum_SOURCES) $(sha256sum_SOURCES) $(sha384sum_SOURCES) \
$(sha512sum_SOURCES) shred.c shuf.c sleep.c sort.c split.c \
-   stat.c stty.c su.c sum.c sync.c tac.c tail.c tee.c test.c \
-   touch.c tr.c true.c tsort.c tty.c uname.c unexpand.c uniq.c \
-   unlink.c uptime.c users.c $(vdir_SOURCES) wc.c who.c whoami.c \
- 

Re: [PATCH] Command line parsing of wc with genparse

2007-07-22 Thread Michael Geng
On Mon, Jul 16, 2007 at 09:55:47PM +0200, Michael Geng wrote:
 On Mon, Jul 16, 2007 at 10:02:13AM +0200, Jim Meyering wrote:
  [EMAIL PROTECTED] (Michael Geng) wrote:
   I released version 0.6.6 of genparse which supports
   internationalization now. A link to the download page
   and to the updated documentation can be accessed from
   the genparse project page http://genparse.sourceforge.net/.
  
   I also prepared a patch which demonstrates how genparse
   could be used for parsing the command line arguments of
   the wc command. The patch is relative to the coreutils version
   6.9. In order to compile the coreutils after you applied
   the patch you will need genparse version 0.6.6 in your path.
  
   Does this approach appear reasonable to you? What else should
   be improved on genparse?
  
  I took a quick look only at the patch (not at the code genparse generates)
  and spotted some minor problems:
  
   +  if (cmdline.version)
   +{
   +  version_etc (stdout, program_name, GNU_PACKAGE, VERSION, 
  AUTHORS,
  
  Here, it should be PROGRAM_NAME, not program_name.

I changed it to PROGRAM_NAME. Thanks.

   diff -u -N -r coreutils-6.9.orig/src/wc.gp coreutils-6.9/src/wc.gp
   --- coreutils-6.9.orig/src/wc.gp  1970-01-01 01:00:00.0 
  +0100
   +++ coreutils-6.9/src/wc.gp   2007-07-10 18:17:33.0 +0200
   @@ -0,0 +1,24 @@
  ...
   +Report bugs to [EMAIL PROTECTED].
   +#usage_end
  
  s/fileutils/coreutils/
  or, perhaps better: PACKAGE_BUGREPORT

Again I released a new version of genparse (0.6.7) which now allows to include
string macros into the usage() function. So genparse is now able to use the 
PACKAGE_BUGREPORT macro.

  Did you ensure that wc still passes make check?
 
 I did this now - and I found a bug: wc -l doesn't work because I didn't assign
 the print_lines variable. That's not a bug in genparse but in my hand written
 code in wc.c. But after fixing this make check still fails because the usage()
 function doesn't print a proper PACKAGE_BUGREPORT as you also found.

I 'm attaching an updated patch which passes make check. You need genpare 
version
0.6.7 in your path in order to compile coreutils after you applied the patch.

  Make sure valgrind wc ... shows no leaks, too.
  In particular, test that with the --files0-from=F option.

valgrind wc --files0-from=F tells me something about 3 not-freed blocks.
But it gives exactly the same message if I'm executing valgrind wc 
--files0-from=F
from a coreutils build from the original sources of version 6.9. Is there a 
known memory leak with the --files0-from=F option of wc?

I tried vagrind wc ... with some other options but it didn't report any errors
except for the --files0-from option.

  Is there any significant difference in performance (I doubt it) or binary 
  size?

I inspected the wc which was generated using genparse with the original one and
found that the .text segment increased by 16 bytes. BTW, how can I build the
coreutils without debug information? configure CFLAGS=-O2 -g0 doesn't
remove debug information completely. Would you build coreutils differently for 
comparing binary file sizes? How exactly would you compare binary file sizes? 
I think comparing the .text segment size is not enough.

  I suggest you also adapt a program that takes integer arguments.
  Can genparse handle arguments of all different types, from int
  through uintmax_t?

Presently genparse is able to handle int, float, char, string or flag types.
No unsigned int for example. That's another extension which I will have to
add.

  
  tail looks like a good candidate.  It also had an optional argument:
  --follow[={name|descriptor}]
  
  There are quite a few programs in the coreutils that have unusual
  argument-parsing constraints, so it's hard to point to a small subset
  and say if you adapt just these few, then that should be proof enough.
  I'm afraid that if you're interested in having coreutils convert, you'll
  have to do most of the work, and then demonstrate that there's no
  resulting regression (and presumably cleaner code).
  
  For now, let's start with tail, and maybe ls if you're ambitious :)

That's probably a good path to move on, but now that I started with the wc 
command I wanted to get this one working first. 

  Another detail:
  IMHO, the contents of your wc.gp file should reside in wc.c.
  Might as well leave it in a delimited comment in place of the usage
  function you're eliminating, then have a makefile rule use sed or awk
  to extract it into wc.gp.

Could be done this way.

Michael
diff -u -N -r coreutils-6.9.orig/src/Makefile.am coreutils-6.9/src/Makefile.am
--- coreutils-6.9.orig/src/Makefile.am  2007-03-20 08:24:27.0 +0100
+++ coreutils-6.9/src/Makefile.am   2007-07-10 18:17:23.0 +0200
@@ -363,3 +363,7 @@
| grep -Ev -f $$t \
  { echo 'the above variables should have

Re: [PATCH] Command line parsing of wc with genparse

2007-07-16 Thread Michael Geng
On Mon, Jul 16, 2007 at 10:02:13AM +0200, Jim Meyering wrote:
 [EMAIL PROTECTED] (Michael Geng) wrote:
  I released version 0.6.6 of genparse which supports
  internationalization now. A link to the download page
  and to the updated documentation can be accessed from
  the genparse project page http://genparse.sourceforge.net/.
 
  I also prepared a patch which demonstrates how genparse
  could be used for parsing the command line arguments of
  the wc command. The patch is relative to the coreutils version
  6.9. In order to compile the coreutils after you applied
  the patch you will need genparse version 0.6.6 in your path.
 
  Does this approach appear reasonable to you? What else should
  be improved on genparse?
 
 I took a quick look only at the patch (not at the code genparse generates)
 and spotted some minor problems:
 
  +  if (cmdline.version)
  +{
  +  version_etc (stdout, program_name, GNU_PACKAGE, VERSION, AUTHORS,
 
 Here, it should be PROGRAM_NAME, not program_name.
 ...
  diff -u -N -r coreutils-6.9.orig/src/wc.gp coreutils-6.9/src/wc.gp
  --- coreutils-6.9.orig/src/wc.gp1970-01-01 01:00:00.0 
 +0100
  +++ coreutils-6.9/src/wc.gp 2007-07-10 18:17:33.0 +0200
  @@ -0,0 +1,24 @@
 ...
  +Report bugs to [EMAIL PROTECTED].
  +#usage_end
 
 s/fileutils/coreutils/
 or, perhaps better: PACKAGE_BUGREPORT
 
 Did you ensure that wc still passes make check?

I did this now - and I found a bug: wc -l doesn't work because I didn't assign
the print_lines variable. That's not a bug in genparse but in my hand written
code in wc.c. But after fixing this make check still fails because the usage()
function doesn't print a proper PACKAGE_BUGREPORT as you also found.

 Make sure valgrind wc ... shows no leaks, too.
 In particular, test that with the --files0-from=F option.
 Is there any significant difference in performance (I doubt it) or binary 
 size?
 
 I suggest you also adapt a program that takes integer arguments.
 Can genparse handle arguments of all different types, from int
 through uintmax_t?
 
 tail looks like a good candidate.  It also had an optional argument:
 --follow[={name|descriptor}]
 
 There are quite a few programs in the coreutils that have unusual
 argument-parsing constraints, so it's hard to point to a small subset
 and say if you adapt just these few, then that should be proof enough.
 I'm afraid that if you're interested in having coreutils convert, you'll
 have to do most of the work, and then demonstrate that there's no
 resulting regression (and presumably cleaner code).
 
 For now, let's start with tail, and maybe ls if you're ambitious :)
 
 Another detail:
 IMHO, the contents of your wc.gp file should reside in wc.c.
 Might as well leave it in a delimited comment in place of the usage
 function you're eliminating, then have a makefile rule use sed or awk
 to extract it into wc.gp.

I will need some time to think about all the issues you mentioned and
find solutions. I appreciate your constructive commments! 

Michael


___
Bug-coreutils mailing list
Bug-coreutils@gnu.org
http://lists.gnu.org/mailman/listinfo/bug-coreutils


[PATCH] Command line parsing of wc with genparse

2007-07-15 Thread Michael Geng
Hi,

I released version 0.6.6 of genparse which supports 
internationalization now. A link to the download page
and to the updated documentation can be accessed from
the genparse project page http://genparse.sourceforge.net/.

I also prepared a patch which demonstrates how genparse 
could be used for parsing the command line arguments of 
the wc command. The patch is relative to the coreutils version 
6.9. In order to compile the coreutils after you applied 
the patch you will need genparse version 0.6.6 in your path.

Does this approach appear reasonable to you? What else should 
be improved on genparse? 

Michael
diff -u -N -r coreutils-6.9.orig/src/Makefile.am coreutils-6.9/src/Makefile.am
--- coreutils-6.9.orig/src/Makefile.am  2007-03-20 08:24:27.0 +0100
+++ coreutils-6.9/src/Makefile.am   2007-07-10 18:17:23.0 +0200
@@ -363,3 +363,7 @@
| grep -Ev -f $$t \
  { echo 'the above variables should have static scope' 12;   \
exit 1; } || :
+
+wc_SOURCES = wc_clp.c wc.c
+wc_clp.c: wc.gp
+   genparse --longmembers --internationalize -o wc_clp wc.gp
diff -u -N -r coreutils-6.9.orig/src/wc.c coreutils-6.9/src/wc.c
--- coreutils-6.9.orig/src/wc.c 2007-03-18 22:36:43.0 +0100
+++ coreutils-6.9/src/wc.c  2007-07-10 18:49:27.0 +0200
@@ -24,13 +24,13 @@
 #include getopt.h
 #include sys/types.h
 
-#include system.h
 #include error.h
 #include inttostr.h
 #include quote.h
 #include readtokens0.h
 #include safe-read.h
 #include wcwidth.h
+#include wc_clp.h
 
 #if !defined iswspace  !HAVE_ISWSPACE
 # define iswspace(wc) \
@@ -77,60 +77,6 @@
   struct stat st;
 };
 
-/* For long options that have no equivalent short option, use a
-   non-character as a pseudo short option, starting with CHAR_MAX + 1.  */
-enum
-{
-  FILES0_FROM_OPTION = CHAR_MAX + 1
-};
-
-static struct option const longopts[] =
-{
-  {bytes, no_argument, NULL, 'c'},
-  {chars, no_argument, NULL, 'm'},
-  {lines, no_argument, NULL, 'l'},
-  {words, no_argument, NULL, 'w'},
-  {files0-from, required_argument, NULL, FILES0_FROM_OPTION},
-  {max-line-length, no_argument, NULL, 'L'},
-  {GETOPT_HELP_OPTION_DECL},
-  {GETOPT_VERSION_OPTION_DECL},
-  {NULL, 0, NULL, 0}
-};
-
-void
-usage (int status)
-{
-  if (status != EXIT_SUCCESS)
-fprintf (stderr, _(Try `%s --help' for more information.\n),
-program_name);
-  else
-{
-  printf (_(\
-Usage: %s [OPTION]... [FILE]...\n\
-  or:  %s [OPTION]... --files0-from=F\n\
-),
- program_name, program_name);
-  fputs (_(\
-Print newline, word, and byte counts for each FILE, and a total line if\n\
-more than one FILE is specified.  With no FILE, or when FILE is -,\n\
-read standard input.\n\
-  -c, --bytesprint the byte counts\n\
-  -m, --charsprint the character counts\n\
-  -l, --linesprint the newline counts\n\
-), stdout);
-  fputs (_(\
-  --files0-from=Fread input from the files specified by\n\
-   NUL-terminated names in file F\n\
-  -L, --max-line-length  print the length of the longest line\n\
-  -w, --wordsprint the word counts\n\
-), stdout);
-  fputs (HELP_OPTION_DESCRIPTION, stdout);
-  fputs (VERSION_OPTION_DESCRIPTION, stdout);
-  printf (_(\nReport bugs to %s.\n), PACKAGE_BUGREPORT);
-}
-  exit (status);
-}
-
 /* FILE is the name of the file (or NULL for standard input)
associated with the specified counters.  */
 static void
@@ -580,6 +526,7 @@
   char *files_from = NULL;
   struct fstatus *fstatus;
   struct Tokens tok;
+  struct arg_t cmdline;
 
   initialize_main (argc, argv);
   program_name = argv[0];
@@ -589,44 +536,27 @@
 
   atexit (close_stdout);
 
-  print_lines = print_words = print_chars = print_bytes = false;
-  print_linelength = false;
   total_lines = total_words = total_chars = total_bytes = max_line_length = 0;
 
-  while ((optc = getopt_long (argc, argv, clLmw, longopts, NULL)) != -1)
-switch (optc)
-  {
-  case 'c':
-   print_bytes = true;
-   break;
-
-  case 'm':
-   print_chars = true;
-   break;
-
-  case 'l':
-   print_lines = true;
-   break;
-
-  case 'w':
-   print_words = true;
-   break;
-
-  case 'L':
-   print_linelength = true;
-   break;
-
-  case FILES0_FROM_OPTION:
-   files_from = optarg;
-   break;
-
-  case_GETOPT_HELP_CHAR;
-
-  case_GETOPT_VERSION_CHAR (PROGRAM_NAME, AUTHORS);
-
-  default:
-   usage (EXIT_FAILURE);
-  }
+  Cmdline(cmdline, argc, argv);
+  print_bytes  = cmdline.bytes;
+  print_chars  = cmdline.lines;
+  print_words  = cmdline.words;
+  print_linelength = cmdline.max_line_length;
+  files_from   = cmdline.files0_from;
+  print_bytes  = cmdline.bytes;
+  print_bytes  = cmdline.bytes;
+  print_bytes  = cmdline.bytes;
+
+  if (cmdline.help)
+usage (EXIT_SUCCESS, 

Re: Simplifying command line parsing with Genparse

2007-06-23 Thread Michael Geng
On Sat, Jun 16, 2007 at 07:23:09PM +0200, Michael Geng wrote:
 On Thu, Jun 07, 2007 at 02:44:38PM +0200, Jim Meyering wrote:
  However, before I even consider it seriously, it'll need
  some improvements:
  
- it must detect any and all write failures[*]
 
 I just added a new release of genparse 
 (http://sourceforge.net/project/showfiles.php?group_id=4341package_id=4354release_id=516584)
 which now also checks file write and close operations, not only
 file open operations as it was before.
 
- packaging so that I can run ./configure  make  make check, and
if I don't happen to have cppunit infrastructure, gcj or something
similar, it should tell me about it directly, rather than causing
harder-to-diagnose build failures.
 
 I went through all these tools and updated configure.ac (and related files)
 such that you get a warning if something is missing but you can still
 compile genparse. For example if you don't have gcj installed then it 
 automatically removes the java example from the list of build targets.
 It now checks for cunit, cppunit, junit, gcj, doxygen, texi2html, texi2pdf 
 and man2html.
 
- one of the generated .c files I looked at calls strdup but doesn't
check for a NULL return value or (less important) attempt to avoid
the leak when the corresponding --backup=S option appears twice or 
  more.
 
 I'm working on that and I'll let you know when this is finished.

I again released a new version of genparse (0.6.5) which doesn't allocate
any memory on the heap any more because I think this is a better concept.
So it doesn't generate any calls to strdup() or malloc() any more. As a
consequence the parser function changed from 

struct arg_t *Cmdline(int argc, char *argv[]) 

to

void Cmdline(struct arg_t *my_args, int argc, char *argv[]). 

See the Genparse examples page (http://genparse.sourceforge.net/examples.html) 
which is also updated.

 
- not exactly essential, but highly recommended: it should work
with the latest autoconf and automake
 
 I'm using autoconf version 2.59 and automake 1.9.5. Aditionally I renamed 
 configure.in to configure.ac which appears to be the standard today and 
 removed acconfig.h because it appears to be deprecated. This might have
 produced some warnings before. 
 
  [*] with genparse-0.6.3, I get this:
  
$ strace -o log -e write ./genparse -o /full/tmp/foo ../examples/ls.gp \
   echo exit status 0
creating /full/tmp/foo.h...done
creating /full/tmp/foo.c...done
exit status 0
$ tail -2 log
write(3, /***..., 80) = -1 ENOSPC (No space 
  left on device)
write(1, creating /full/tmp/foo.c...done\n, 32) = 32
  
  The two files it claims to have created are both empty,
  and genparse exited successfully.
  This means genparse is not detecting write or close failures.
  
  Note that in the example above, /full/tmp is a full file system that still
  has free inodes, so open can create new files, but writes always fail.
 
 I could verify this behavior and I hope it's fixed now. Now it tells you
 that it can't write to the file and returns with a non zero exit status.
 
 Thanks again for your input,
 Michael
 
 
 ___
 Bug-coreutils mailing list
 Bug-coreutils@gnu.org
 http://lists.gnu.org/mailman/listinfo/bug-coreutils


___
Bug-coreutils mailing list
Bug-coreutils@gnu.org
http://lists.gnu.org/mailman/listinfo/bug-coreutils


Re: Simplifying command line parsing with Genparse

2007-06-17 Thread Michael Geng
On Thu, Jun 07, 2007 at 02:44:38PM +0200, Jim Meyering wrote:
 However, before I even consider it seriously, it'll need
 some improvements:
 
   - it must detect any and all write failures[*]

I just added a new release of genparse 
(http://sourceforge.net/project/showfiles.php?group_id=4341package_id=4354release_id=516584)
which now also checks file write and close operations, not only
file open operations as it was before.

   - packaging so that I can run ./configure  make  make check, and
   if I don't happen to have cppunit infrastructure, gcj or something
   similar, it should tell me about it directly, rather than causing
   harder-to-diagnose build failures.

I went through all these tools and updated configure.ac (and related files)
such that you get a warning if something is missing but you can still
compile genparse. For example if you don't have gcj installed then it 
automatically removes the java example from the list of build targets.
It now checks for cunit, cppunit, junit, gcj, doxygen, texi2html, texi2pdf 
and man2html.

   - one of the generated .c files I looked at calls strdup but doesn't
   check for a NULL return value or (less important) attempt to avoid
   the leak when the corresponding --backup=S option appears twice or more.

I'm working on that and I'll let you know when this is finished.

   - not exactly essential, but highly recommended: it should work
   with the latest autoconf and automake

I'm using autoconf version 2.59 and automake 1.9.5. Aditionally I renamed 
configure.in to configure.ac which appears to be the standard today and 
removed acconfig.h because it appears to be deprecated. This might have
produced some warnings before. 

 [*] with genparse-0.6.3, I get this:
 
   $ strace -o log -e write ./genparse -o /full/tmp/foo ../examples/ls.gp \
  echo exit status 0
   creating /full/tmp/foo.h...done
   creating /full/tmp/foo.c...done
   exit status 0
   $ tail -2 log
   write(3, /***..., 80) = -1 ENOSPC (No space 
 left on device)
   write(1, creating /full/tmp/foo.c...done\n, 32) = 32
 
 The two files it claims to have created are both empty,
 and genparse exited successfully.
 This means genparse is not detecting write or close failures.
 
 Note that in the example above, /full/tmp is a full file system that still
 has free inodes, so open can create new files, but writes always fail.

I could verify this behavior and I hope it's fixed now. Now it tells you
that it can't write to the file and returns with a non zero exit status.

Thanks again for your input,
Michael


___
Bug-coreutils mailing list
Bug-coreutils@gnu.org
http://lists.gnu.org/mailman/listinfo/bug-coreutils


Re: Simplifying command line parsing with Genparse

2007-06-08 Thread Michael Geng
On Thu, Jun 07, 2007 at 03:49:53PM -0600, Bob Proulx wrote:
 Michael Geng wrote:
  I would expect Genparse to generate faster code than argp because it does 
  part of the work at compile time while argp does everything at run time
  since it's a library function.
 
 Is the performance of parsing program arguments really a concern?  I
 would think that the difference would not be significant.  I would
 value clean code over minor performance improvements.  For coreutils I
 would value object code size over performance.
 
 Bob

I could imagine that especially for the Coreutils speed might be more a 
concern than for other tools because some of the Coreutils tools are
called very often. And I would also expect that with Genparse you also get 
smaller code than with argp again because it does part of the work while
it generates the parser code while argp is a library function and so it 
does everything at run time.

Michael


___
Bug-coreutils mailing list
Bug-coreutils@gnu.org
http://lists.gnu.org/mailman/listinfo/bug-coreutils


Simplifying command line parsing with Genparse

2007-06-07 Thread Michael Geng
Hi,

would it be an option to use Genparse (http://genparse.sourceforge.net/)
for command line parsing in the GNU Coreutils? 

I'm one of the developers of Genparse and I recently used some of the 
well known Coreutils as an exercise for testing Genparse (see
http://genparse.sourceforge.net/examples.html). Using Genparse for 
generating the command line parsing code could reduce the amount of
coreutils source code because the input to Genparse is a short config 
file only. The overhead of writing the parser code would be delegated 
to the tool then.

The Genparse generated parsers call getopt() (or getopt_long()) exactly
the same way the Coreutils's command line parsers do it today. This
wouldn't be changed. So the code Genparse generates will be very similar
to the existing hand written parsers of the individual Coreutils tools. 
But calling getopt() is only part of the work, preparing and evaluating 
the results of getopt() also causes coding work which could be delegated 
to Genparse. Genparse also automatically generates a help screen which 
would no longer have to be done manually.

Michael


___
Bug-coreutils mailing list
Bug-coreutils@gnu.org
http://lists.gnu.org/mailman/listinfo/bug-coreutils


Re: Simplifying command line parsing with Genparse

2007-06-07 Thread Michael Geng
On Thu, Jun 07, 2007 at 02:44:38PM +0200, Jim Meyering wrote:
 [EMAIL PROTECTED] (Michael Geng) wrote:
  would it be an option to use Genparse (http://genparse.sourceforge.net/)
  for command line parsing in the GNU Coreutils?
 
  I'm one of the developers of Genparse and I recently used some of the
  well known Coreutils as an exercise for testing Genparse (see
  http://genparse.sourceforge.net/examples.html). Using Genparse for
  generating the command line parsing code could reduce the amount of
  coreutils source code because the input to Genparse is a short config
  file only. The overhead of writing the parser code would be delegated
  to the tool then.
 
  The Genparse generated parsers call getopt() (or getopt_long()) exactly
  the same way the Coreutils's command line parsers do it today. This
  wouldn't be changed. So the code Genparse generates will be very similar
  to the existing hand written parsers of the individual Coreutils tools.
  But calling getopt() is only part of the work, preparing and evaluating
  the results of getopt() also causes coding work which could be delegated
  to Genparse. Genparse also automatically generates a help screen which
  would no longer have to be done manually.
 
 Hi Michael,
 
 Genparse looks promising.
 I like the examples.  But there are almost 100 programs in the
 coreutils.  If genparse can really handle all of those use cases
 without causing any significant degradation in the tools, then
 it will be hard to object.
 
 How does it deal with internationalization?

There is no internationalization. I looked at the Coreutils 
internationalization but I neither fully understood how it works 
nor do I have an idea right now how to internationalize Genparse's
output. internationalization appears to be a difficult thing.

 However, before I even consider it seriously, it'll need
 some improvements:

I appreciate your concerns very much. All of them are valid and valuable
input for Genparse development. At least some of them will need more
discussion however which I think is not interesting to everybody listening
to the Coreutils mailing list. For this reason I started a new thread
on the genparse-users mailing list
(http://sourceforge.net/mailarchive/forum.php?thread_name=20070607150633.GA26405%40t-online.deforum_name=genparse-users).
Any posts to the genparse-users mailing list from anybody is welcome!
I will let you (and the coreutils-bug mailing list) know if there is any
progress on ony of the problems you listed.

   - it must detect any and all write failures[*]
   - packaging so that I can run ./configure  make  make check, and
   if I don't happen to have cppunit infrastructure, gcj or something
   similar, it should tell me about it directly, rather than causing
   harder-to-diagnose build failures.
   - one of the generated .c files I looked at calls strdup but doesn't
   check for a NULL return value or (less important) attempt to avoid
   the leak when the corresponding --backup=S option appears twice or more.
   - not exactly essential, but highly recommended: it should work
   with the latest autoconf and automake
 
 
 [*] with genparse-0.6.3, I get this:
 
   $ strace -o log -e write ./genparse -o /full/tmp/foo ../examples/ls.gp \
  echo exit status 0
   creating /full/tmp/foo.h...done
   creating /full/tmp/foo.c...done
   exit status 0
   $ tail -2 log
   write(3, /***..., 80) = -1 ENOSPC (No space 
 left on device)
   write(1, creating /full/tmp/foo.c...done\n, 32) = 32
 
 The two files it claims to have created are both empty,
 and genparse exited successfully.
 This means genparse is not detecting write or close failures.
 
 Note that in the example above, /full/tmp is a full file system that still
 has free inodes, so open can create new files, but writes always fail.


___
Bug-coreutils mailing list
Bug-coreutils@gnu.org
http://lists.gnu.org/mailman/listinfo/bug-coreutils