bug#20553: 'echo -e' does not escape backslash correctly

2015-05-12 Thread Erik Auerswald
Hi,

On Mon, May 11, 2015 at 11:17:34PM +0100, Stephane Chazelas wrote:
 2015-05-11 23:50:25 +0200, Jo Drexl (FFGR-IT):
  Hi guys,
  I had to write a Windows bat file for twentysomething users and - as
  Linux geek - wrote a small Bash script for it. The code in question is
  as follows:
  
  echo -e net use z: srv\\aqs /persistent:no /user:%USERNAME%
  $BG_PASSWD\r
 [...]
 
 If that's a bash script, then that has nothing to do with GNU
 coreutils as bash has its own builtin version of echo.
 
 In any case, there's no bug here, and GNU coreutils echo and the
 bash one behave the same.
 
 \ is used as an escape character both for the bash language
 within double quotes, and for echo -e.
 
 echo -e 
 
 Passes 3 arguments to echo: echo, -e and \\

So you need to add another handful of \ characters:

$ echo -e 
\\
$ /bin/echo -e 
\\
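
A minimal sketch of the two layers described above, assuming a POSIX shell. `printf '%b'` is swapped in for `echo -e` here because `echo -e` itself is not portable; it applies the same kind of backslash-escape processing to its argument. The variable names are illustrative only.

```shell
# Layer 1: inside double quotes the shell turns each \\ into a single \.
arg="\\\\"                      # the variable now holds two backslashes

# Layer 2: escape processing (echo -e, or printf '%b') halves them again.
raw=$(printf '%s' "$arg")       # no escape processing: two backslashes
cooked=$(printf '%b' "$arg")    # escape processing: one backslash

printf 'raw=%s cooked=%s\n' "$raw" "$cooked"
```

Each backslash that should reach the output therefore has to be written four times inside double quotes, or twice inside single quotes.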

Erik





bug#20511: split : does not account for --numeric-suffixes=FROM in calculation of suffix length?

2015-05-12 Thread Pádraig Brady
On 06/05/15 11:53, Pádraig Brady wrote:
 On 06/05/15 05:29, Ben Rusholme wrote:
 As you say, this can always be fixed by the --suffix-length argument, but 
 it’s only required for certain combinations of FROM and CHUNK, (and “split” 
 already has all the information it needs).

 Now you could bump the suffix length based on the start number,
 though I don't think we should as that would impact on future
 processing (ordering) of the resultant files.  I.E. specifying
 a FROM value to --numeric-suffixes should only impact the
 start value, rather than the width.

 Could you clarify this for me? Doesn’t the zero-padding ensure correct 
 processing order?
 
 There are two use cases supported by specifying FROM.
 1. Setting the start for a single run (FROM is usually 1 in this case)
 2. Setting the offset for multiple independent split runs.
 In the second case we can't infer the size of the total set
 in any particular run, and thus require that --suffix-length is specified 
 appropriately.
 I.E. for multiple independent runs, the suffix length needs to be
 fixed width across the entire set for the total ordering to be correct.
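
A small illustration of why the width has to be fixed across the whole set, assuming only POSIX printf, sort and tr. The `x` prefix and the suffix values are made up for the example; lexical sorting stands in for the order in which a shell glob such as `x*` would list the files.

```shell
# Mixed-width suffixes: lexical order no longer matches numeric order.
mixed=$(printf '%s\n' x98 x99 x100 x101 | LC_ALL=C sort | tr '\n' ' ')

# Fixed-width suffixes (what -a 3 would give): both orders agree.
fixed=$(printf '%s\n' x098 x099 x100 x101 | LC_ALL=C sort | tr '\n' ' ')

printf 'mixed: %s\nfixed: %s\n' "$mixed" "$fixed"
```

In the mixed case `x100` and `x101` sort ahead of `x98` and `x99`, which is the ordering problem a too-short suffix in an earlier run would cause.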
 
 
 Things we could change are...
 
 1. Special case FROM=1 to assume a single run and thus
 enable auto suffix expansion or appropriately sized suffix with CHUNK.
 This would be a backwards incompat change and also not
 guaranteed a single run, so I'm reluctant to do that.
 
 2. Give an early error with specified FROM and CHUNK
 that would overflow the suffix size for CHUNK.
 This would save some processing, though doesn't add
 any protections against latent issues. I.E. you still get
 the error which is dependent on the parameters rather than the input data 
 size.
 Therefore it's probably not worth the complication.
 
 3. Leave suffix length at 2 when both FROM and CHUNK are specified.
 In retrospect, this would probably have been the best option
 to avoid ambiguities like this. However now we'd be breaking
 compat with scripts with FROM=1 and CHUNK=200 etc.
 While CHUNK values > 100 would be unusual
 
 4. Auto set the suffix len based on FROM + CHUNK.
 That would support use case 1 (single run),
 but _silently_ break subsequent processing order
 of outputs from multiple split runs
 (as FROM is increased in multiples of CHUNK size).
 We could mitigate the _silent_ breakage though
 by limiting this change to when FROM < CHUNK.
 
 5. Document in man page and with more detail in info docs
 that -a is recommended when specifying FROM
 
 So I'll do 4 and 5 I think.
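
The rule in option 4 can be sketched roughly as follows. This is not the code from the patch: `suffix_len` is a hypothetical helper, and it assumes a default width of 2 and that the last suffix used in a single run is FROM + CHUNK - 1.

```shell
# Hypothetical helper: suffix width for --numeric-suffixes=FROM -n CHUNK.
suffix_len() {
  from=$1
  chunks=$2
  len=2                                # assumed default suffix length
  if [ "$from" -lt "$chunks" ]; then   # treated as a single run
    last=$((from + chunks - 1))        # largest suffix that will be written
    need=${#last}                      # number of decimal digits in it
    if [ "$need" -gt "$len" ]; then
      len=$need
    fi
  fi
  printf '%s\n' "$len"
}

suffix_len 1 100     # single run, suffixes 1..100
suffix_len 200 100   # offset run, width left alone
```

With FROM >= CHUNK the width is left at the default, so multiple independent runs still need an explicit -a.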

Attached.

cheers,
Pádraig

From 4d5e6c4f4a2ba8407420e56282c0d4e37b2691ee Mon Sep 17 00:00:00 2001
From: =?UTF-8?q?P=C3=A1draig=20Brady?= p...@draigbrady.com
Date: Wed, 6 May 2015 01:48:40 +0100
Subject: [PATCH] split: auto set suffix len for --numeric-suffixes=N
 --number=N

Supporting `split --numeric-suffixes=1 -n100` for example.

* doc/coreutils.texi (split invocation): Mention the two
use cases for the FROM parameter, and the consequences on
the suffix length determination.
* src/split.c (set_suffix_length): Use the --numeric-suffixes
FROM parameter in the suffix width calculation, when it's
less than the number of files specified in --number.
* tests/split/suffix-auto-length.sh: Add test cases.
Fixes http://bugs.gnu.org/20511
---
 doc/coreutils.texi| 11 ---
 src/split.c   | 22 --
 tests/split/suffix-auto-length.sh | 21 -
 3 files changed, 44 insertions(+), 10 deletions(-)

diff --git a/doc/coreutils.texi b/doc/coreutils.texi
index 51d96b4..f887e04 100644
--- a/doc/coreutils.texi
+++ b/doc/coreutils.texi
@@ -3181,9 +3181,14 @@ specified, will auto increase the length by 2 as required.
 @opindex --numeric-suffixes
 Use digits in suffixes rather than lower-case letters.  The numerical
 suffix counts from @var{from} if specified, 0 otherwise.
-Note specifying a @var{from} value also disables the default
-auto suffix length expansion described above, and so you may also
-want to specify @option{-a} to allow suffixes beyond @samp{99}.
+
+@var{from} is used to either set the initial suffix for a single run,
+or to set the suffix offset for independently split inputs, and consequently
+the auto suffix length expansion described above is disabled.  Therefore
+you may also want to use option @option{-a} to allow suffixes beyond @samp{99}.
+Note if option @option{--number} is specified and the number of files is less
+than @var{from}, a single run is assumed and the minimum suffix length
+required is automatically determined.
 
 @item --additional-suffix=@var{suffix}
 @opindex --additional-suffix
diff --git a/src/split.c b/src/split.c
index 5d6043f..b6fe2dd 100644
--- a/src/split.c
+++ b/src/split.c
@@ -39,6 +39,7 @@
 #include "sig2str.h"
 #include "xfreopen.h"
 #include "xdectoint.h"
+#include "xstrtol.h"
 
 /* The official name of this program (e.g., no 'g' prefix).  */
 #define PROGRAM_NAME "split"
@@ -173,9 +174,26 @@ set_suffix_length (uintmax_t 

bug#20553: 'echo -e' does not escape backslash correctly

2015-05-12 Thread Stephane Chazelas
2015-05-11 17:36:50 -0600, Eric Blake:
 On 05/11/2015 04:14 PM, Pádraig Brady wrote:
 
  echo -e net use z: srv\\aqs /persistent:no /user:%USERNAME% 
  $BG_PASSWD\r
 
 'echo -e' is non-portable.  POSIX recommends that you use printf
 instead, as the POSIX version of echo is supposed to behave as follows:
 
 $ echo -e 'a\nb'
 -e a\nb
[...]

Strictly speaking without XSI (Unix conformance), the behaviour
of that command is unspecified because the arguments contain
backslash.

With XSI, the behaviour is specified but the expected output is:

-e a<newline>b<newline>

 You are relying on non-POSIX behavior for backslash interpolation.
 
  Note echo is not portable to other systems, and if that's required,
 
 In fact, it's not even portable to bash:
 
 $ shopt -s xpg_echo
 
 tells bash to turn on POSIX rules for echo, invalidating any bash script
 that relies on 'echo -e'.

xpg_echo alone doesn't make bash's echo POSIX compliant. It just means
-e is implicit. It wouldn't break scripts that use echo -e,
just scripts that use echo without -e and expect escape
sequences not to be expanded.

To have a Unix (POSIX+XSI) conformant echo, you need both the
posix (set -o posix) and xpg_echo (shopt -s xpg_echo) options.

Those can also be enabled via the environment (BASHOPTS,
SHELLOPTS, POSIXLY_CORRECT) or at compile time. bash will also
enable posix when called as sh.
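
A short sketch of the combinations described above, assuming a `bash` binary on PATH that is started without POSIXLY_CORRECT, BASHOPTS or SHELLOPTS in the environment. Each case runs in its own `bash -c` so the options do not leak between them; the variable names are illustrative.

```shell
# Default bash: no escape processing without -e.
plain=$(bash -c "echo 'a\nb'")

# xpg_echo only: escapes are expanded, and -e is still taken as an option.
xpg=$(bash -c "shopt -s xpg_echo; echo 'a\nb'")
xpg_e=$(bash -c "shopt -s xpg_echo; echo -e 'a\nb'")

# posix + xpg_echo: no options are recognised, so -e is printed as text.
unix=$(bash -c "set -o posix; shopt -s xpg_echo; echo -e 'a\nb'")

printf '[%s]\n' "$plain" "$xpg" "$xpg_e" "$unix"
```

Only the last case matches the XSI output shown earlier, where `-e` appears in the output and `\n` becomes a newline.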

See
https://unix.stackexchange.com/questions/65803/why-is-printf-better-than-echo
for more info.

 
  printf(1) is a better option, though that will have different
  quoting again due to the % chars etc.
 
 % doesn't need quoting in shell.  But yes, printf is more portable.

% needs to be escaped (with %) in the format argument though, which is
probably what Pádraig was referring to.
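
For illustration, two ways to get a literal % through printf, assuming a POSIX shell. The `/user:%USERNAME%` text is taken from the command quoted earlier in the thread; the variable names are made up.

```shell
# Option 1: the text is part of the format, so each % is doubled.
in_format=$(printf '/user:%%USERNAME%%')

# Option 2: the text is passed as data to %s, so no doubling is needed.
as_data=$(printf '%s' '/user:%USERNAME%')

printf '%s\n' "$in_format" "$as_data"
```

Passing the text as an argument to `%s` is usually the simpler choice when it comes from a variable.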

-- 
Stephane