Bug#1028002: dash: sid dash globs no longer allow [^...] to negate a class; upcoming breaking change from bullseye

2023-11-07 Thread Alexandre Ferrieux
On Tue, 7 Nov 2023 16:14:58 + patrick.a...@orange.com wrote:
>
> So it means that the kind of code below in anyone's script will have an
> opposite result depending on whether we use dash of the Bookworm version or
> dash of the Bullseye version and that this same command interpreted by Bash
> or dash will always have an opposite result ? Not at all sure it's serious
>
> - Dash Bullseye and bash
>    # case "A" in [^A]) echo "character not accepted" ;;esac
>
> - Dash bookworm
>    # case "A" in [^A]) echo "character not accepted" ;;esac
>    character not accepted
>
> thanks for reconsideration,
> Patrick

+1000!

The fact that one year has elapsed with that terrible regression in sid
without any complaint does NOT mean that it is "okay"; it only means that a
huge population of Debian sysadmins only ever stick to stable.
In this huge population, how many might be using #! /bin/sh as a shebang ?
And among them, how many use caret-negation in "case..esac" ?
And within this (still hefty IMO) subset, how many are operating stuff like
nuclear facilities, planes or brain surgery tools ?

Does this perspective make it sound reasonable to break decade-old
semantics in the most central piece of modern software after the Linux
kernel ?

TL;DR: please please please NO, don't freakin' break the Shell !

-Alex


Bug#1028002: dash: sid dash globs no longer allow [^...] to negate a class; upcoming breaking change from bullseye

2023-11-07 Thread patrick . anat
Hi,

On Thu, 13 Apr 2023 11:48:10 +0200 Paul Gevers  wrote: 
> Control: clone -1 -2 
> Control: reassign -2 release-notes 
> 
> On 12-04-2023 16:57, Santiago Ruano Rincón wrote: 
> > If the current behaviour 
> > would be part of bookworm, a NEWS entry would be great. 
> 
> And a release note would be worth it too I guess. 
> 
> Paul 

So it means that the kind of code below in anyone's script will have an 
opposite result depending on whether we use dash of the Bookworm version or 
dash of the Bullseye version and that this same command interpreted by Bash or 
dash will always have an opposite result ? Not at all sure it's serious

- Dash Bullseye and bash
   # case "A" in [^A]) echo "character not accepted" ;;esac

- Dash bookworm
   # case "A" in [^A]) echo "character not accepted" ;;esac
   character not accepted

thanks for reconsideration,
Patrick
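
For scripts that have to behave the same under bullseye's dash, bookworm's dash and bash, the test above can be spelled with the POSIX negation character '!'. A minimal sketch; the helper name is_accepted is purely illustrative and does not come from this thread:

```shell
# Hypothetical helper: succeeds (status 0) only when $1 is exactly "A".
# [!A] is the POSIX spelling of "one character other than A";
# the result of [^A] is left unspecified by POSIX.
is_accepted() {
    case "$1" in
        [!A]) return 1 ;;   # a single character that is not A
        A)    return 0 ;;
        *)    return 1 ;;   # empty, or longer than one character
    esac
}

if is_accepted "A"; then
    echo "character accepted"
else
    echo "character not accepted"
fi
```

Written this way, the branch taken should not depend on how the shell treats a leading caret.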




Bug#1034344: Bug#1028002: dash: sid dash globs no longer allow [^...] to negate a class; upcoming breaking change from bullseye

2023-05-29 Thread Max Nikulin

On 29/05/2023 17:59, Paul Gevers wrote:
> On 29-05-2023 12:51, Max Nikulin wrote:
> > I am unaware of another dash implementation. Do you mean ash from
> > which dash was forked?
>
> No, I understood from Andrej that dash *internally* has two ways to do
> the matching. One embedded implementation, and one using system library
> calls.

Thank you for the clarification. I did not realize that you were writing
about the glob/fnmatch implementation in glibc, which supports [^c]
negation, while the internal alternative treats it as a literal. Other libc
variants are out of the scope of the debian package.
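
One way to see which of the two behaviours a given /bin/sh has is to probe it directly. A rough sketch, assuming only the two outcomes discussed in this thread; caret_mode is a made-up name:

```shell
# Hypothetical probe: report how the running shell treats a leading ^
# inside a bracket expression (POSIX leaves the result unspecified).
caret_mode() {
    case "x" in
        [^a]) echo "negation" ;;  # x matched as "any character but a"
        *)    echo "literal"  ;;  # [^a] only matched ^ or a
    esac
}

caret_mode
```

Going by the reports in this thread, bash and bullseye's dash would be expected to print "negation", and bookworm's dash "literal".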




Bug#1034344: Bug#1028002: dash: sid dash globs no longer allow [^...] to negate a class; upcoming breaking change from bullseye

2023-05-29 Thread Paul Gevers

Hi,

On 29-05-2023 12:51, Max Nikulin wrote:
> I am unaware of another dash implementation. Do you mean ash from which
> dash was forked?

No, I understood from Andrej that dash *internally* has two ways to do
the matching. One embedded implementation, and one using system library
calls. Which one is used depends on the configure options during the
build. Both code paths are now made consistent (with the way dash
maintainers always meant it to be).


Paul




Bug#1034344: Bug#1028002: dash: sid dash globs no longer allow [^...] to negate a class; upcoming breaking change from bullseye

2023-05-29 Thread Max Nikulin

On 29/05/2023 17:30, Paul Gevers wrote:
> On 29-05-2023 12:02, Max Nikulin wrote:
> > Strictly speaking, behavior of circumflex is *unspecified* in POSIX:
> >
> > > ... A bracket expression
> > > starting with an unquoted <circumflex> character produces
> > > unspecified results.
>
> Right. Maybe better to say it now matches the other implementation (dash
> has two implementations and they were behaving differently).


I am unaware of another dash implementation. Do you mean ash from which
dash was forked? I have checked
https://en.wikipedia.org/wiki/Debian_Almquist_shell and noticed that the
busybox ash implementation was derived from dash, but a similar issue
is still open in their tracker.


I would recommend that users check their scripts with the "shellcheck"
static analyzer, but I am unsure whether such a suggestion is suitable for
the release notes or for the Debian NEWS in the dash package.

https://www.shellcheck.net/wiki/SC3026



Bug#1034344: Bug#1028002: dash: sid dash globs no longer allow [^...] to negate a class; upcoming breaking change from bullseye

2023-05-29 Thread Paul Gevers

Hi,

On 29-05-2023 12:02, Max Nikulin wrote:
> Strictly speaking, behavior of circumflex is *unspecified* in POSIX:
>
> > ... A bracket expression
> > starting with an unquoted <circumflex> character produces unspecified
> > results.

Right. Maybe better to say it now matches the other implementation (dash
has two implementations and they were behaving differently).


Paul




Bug#1034344: Bug#1028002: dash: sid dash globs no longer allow [^...] to negate a class; upcoming breaking change from bullseye

2023-05-29 Thread Max Nikulin

On 29/05/2023 02:53, Paul Gevers wrote:
> Our (crafted with Andrej) proposal is here:
> https://salsa.debian.org/ddp-team/release-notes/-/merge_requests/181

from the diff:

> ... as a literal
> character, as was always the intended POSIX-compliant
> behavior.

Strictly speaking, behavior of circumflex is *unspecified* in POSIX:

> ... A bracket expression
> starting with an unquoted <circumflex> character produces unspecified
> results.


Moreover, it is intentionally left unspecified:
https://www.austingroupbugs.net/view.php?id=1558



Bug#1034344: Bug#1028002: dash: sid dash globs no longer allow [^...] to negate a class; upcoming breaking change from bullseye

2023-05-28 Thread Paul Gevers

Control: tags -1 pending patch

Hi,

On Thu, 13 Apr 2023 11:48:10 +0200 Paul Gevers wrote:
> On 12-04-2023 16:57, Santiago Ruano Rincón wrote:
> > If the current behaviour
> > would be part of bookworm, a NEWS entry would be great.
>
> And a release note would be worth it too I guess.


Our (crafted with Andrej) proposal is here:
https://salsa.debian.org/ddp-team/release-notes/-/merge_requests/181

Paul




Bug#1034344: Bug#1028002: dash: sid dash globs no longer allow [^...] to negate a class; upcoming breaking change from bullseye

2023-05-21 Thread Richard Lewis
On Fri, 06 Jan 2023 10:52:31 +0100 "Andrej Shadura"  wrote:
> On Thu, 5 Jan 2023, at 21:32, наб wrote:

> > Please for the love of god add this to the NEWS.
> > I /guarantee/ people are using '[^0-9]' to mean "not 0-9",
> > and similar constructs, even if they are well-versed in the shell language.

> I’m actually considering reverting that patch, as it seems a bit too late in 
> the release cycle to introduce such a breaking change.

Hi - what is the status of these bugs about globbing in dash:  is
there a change in dash and a need to add to release-notes or not?

https://bugs.debian.org/cgi-bin/bugreport.cgi?bug=1028002 against dash,
asking for a NEWS entry, is still open;
https://salsa.debian.org/debian/dash/-/blob/debian/unstable/debian/dash.NEWS
has not been updated since 2009;
and the message above says the change might be reverted.

So should https://bugs.debian.org/cgi-bin/bugreport.cgi?bug=1034344
against release-notes, asking to document this, be closed?



Bug#1034344: Bug#1028002: dash: sid dash globs no longer allow [^...] to negate a class; upcoming breaking change from bullseye

2023-04-19 Thread Maxim Nikulin

On Thu, 13 Apr 2023 11:48:10 +0200 Paul Gevers wrote:
> On 12-04-2023 16:57, Santiago Ruano Rincón wrote:
> > If the current behaviour
> > would be part of bookworm, a NEWS entry would be great.
>
> And a release note would be worth it too I guess.

The ShellCheck static analyzer detects the issue with [^c] in pattern
matching. I think it could be recommended, either for installation
(https://packages.debian.org/bookworm/shellcheck) or as an online tool
(https://www.shellcheck.net/).

The warning concerning globs points to the following page:

https://www.shellcheck.net/wiki/SC3026
SC3026 In POSIX sh, ^ in place of ! in glob bracket expressions is 
undefined.
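
Where ShellCheck is not at hand, a plain text search can at least produce a list of lines to review. This is only a rough sketch and no substitute for the analyzer: find_caret_brackets is a hypothetical helper, and the search also reports regular expressions passed to grep or sed, where [^...] is perfectly valid.

```shell
# Hypothetical helper: print file:line:text for every line of the given
# files that contains "[^", the usual start of a caret-negated bracket.
# -F makes grep treat the pattern as a fixed string, not a regex.
# The extra /dev/null operand forces the file name to be printed even
# when only one file is given.
find_caret_brackets() {
    grep -n -F '[^' -- "$@" /dev/null
}

# Usage (the path is a placeholder):
#   find_caret_brackets /path/to/script.sh
```

Each reported line then needs a human decision: shell patterns should use [!...], while regular expressions can stay as they are.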




Bug#1028002: dash: sid dash globs no longer allow [^...] to negate a class; upcoming breaking change from bullseye

2023-04-13 Thread Paul Gevers

Control: clone -1 -2
Control: reassign -2 release-notes

On 12-04-2023 16:57, Santiago Ruano Rincón wrote:
> If the current behaviour
> would be part of bookworm, a NEWS entry would be great.


And a release note would be worth it too I guess.

Paul




Bug#1028002: dash: sid dash globs no longer allow [^...] to negate a class; upcoming breaking change from bullseye

2023-04-12 Thread Santiago Ruano Rincón
Control: severity -1 important

Hi!

On Fri, 6 Jan 2023 12:31:47 +0100 наб wrote:
> Hi!
> 
> On Fri, Jan 06, 2023 at 10:52:31AM +0100, Andrej Shadura wrote:
> > On Thu, 5 Jan 2023, at 21:32, наб wrote:
> > > Bisecting over the upstream git, I got
> > >   commit 8f9cca055bc661c4c690a5f5e1ca71370d129bc3 (HEAD, refs/bisect/bad)
> > >   Author: Herbert Xu 
> > >   Date:   Wed Jan 19 16:37:54 2022 +1100
> > >  
> > >   expand: Always quote caret when using fnmatch
> > 
> > > as the first bad commit with default configuration (HAVE_FNMATCH=1).
> > >
> > > I /cannot/ find a set-up where configuring like Debian
> > > (--disable-fnmatch --disable-lineno --disable-glob)
> > > isn't broken.
> > 
> > I’m not sure why this also affects configurations with --disable-fnmatch — 
> > from the description of it, it shouldn’t?
> 
> Well, dash's built-in globs Just Don't Support ^. Never have.
> (Defined as "current code doesn't and it blames to start-of-git".)
> They're strictly POSIX, and ^ is a regular character for them.
> 
> 8f9cca0 fixes the fact that glibc fnmatch() has a special meaning for ^
> by unconditionally escaping it (if configured for libc fnmatch) ‒
> it normalises [^0-9] to always mean [0-9^],
> regardless of --with-fnmatch/--disable-fnmatch.
> 
> > > Y'know what, I bisected the Salsa git, too, but then I consulted POSIX.
> > > Apparently, this is fine.
> > 
> > > Please for the love of god add this to the NEWS.
> > > I /guarantee/ people are using '[^0-9]' to mean "not 0-9",
> > > and similar constructs, even if they are well-versed in the shell 
> > > language.
> > >
> > > This is a breaking change going from bullseye, and quite an insidious one.
> > > I assume my reaction is gonna mirror others' quite well.
> > >
> > > /Please/ add this to the NEWS.
> > 
> > I’m actually considering reverting that patch, as it seems a bit too late 
> > in the release cycle to introduce such a breaking change.
> 
> I've bisected across snapshot.d.o, and the first Debian version
> that exhibits this behaviour is 0.5.11+git20210903+057cd650a4ed-4:
>   
> http://snapshot.debian.org/package/dash/0.5.11%2Bgit20210903%2B057cd650a4ed-4/
> 
> Which, if I understand it right, has landed in sid on 2022-03-04.
> Since march of last year, sid and testing have been using this;
> quoth tracker.d.o:
>   [2022-03-07] dash 0.5.11+git20210903+057cd650a4ed-7 MIGRATED to testing 
> (Debian testing watch) 
> 
> So it's been a good part of a year and no-one's complained
> (maybe I'm the idiot what doesn't know globs are negated with !s),
> from the point of view of "system compatibility",
> I think this has passed the test.
> 
> From the point of user code, a NEWS entry I'd consider sufficient,
> as usual for breaking-for-compat user-observable changes.
> 
> Reverting this now would probably have the opposite effect

I am taking the liberty of increasing the severity of this bug. I'd say it
is serious, but I'd let the maintainer or the release team decide on
that.

I am aware of at least one user hit by this. If the current behaviour
would be part of bookworm, a NEWS entry would be great.

Thanks,

 -- Santiago




Bug#1028002: dash: sid dash globs no longer allow [^...] to negate a class; upcoming breaking change from bullseye

2023-01-06 Thread наб
Hi!

On Fri, Jan 06, 2023 at 10:52:31AM +0100, Andrej Shadura wrote:
> On Thu, 5 Jan 2023, at 21:32, наб wrote:
> > Bisecting over the upstream git, I got
> >   commit 8f9cca055bc661c4c690a5f5e1ca71370d129bc3 (HEAD, refs/bisect/bad)
> >   Author: Herbert Xu 
> >   Date:   Wed Jan 19 16:37:54 2022 +1100
> >  
> >   expand: Always quote caret when using fnmatch
> 
> > as the first bad commit with default configuration (HAVE_FNMATCH=1).
> >
> > I /cannot/ find a set-up where configuring like Debian
> > (--disable-fnmatch --disable-lineno --disable-glob)
> > isn't broken.
> 
> I’m not sure why this also affects configurations with --disable-fnmatch — 
> from the description of it, it shouldn’t?

Well, dash's built-in globs Just Don't Support ^. Never have.
(Defined as "current code doesn't and it blames to start-of-git".)
They're strictly POSIX, and ^ is a regular character for them.

8f9cca0 fixes the fact that glibc fnmatch() has a special meaning for ^
by unconditionally escaping it (if configured for libc fnmatch) ‒
it normalises [^0-9] to always mean [0-9^],
regardless of --with-fnmatch/--disable-fnmatch.
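
To make the "[0-9^]" reading concrete, here is a small sketch that tests single characters against that set. With the caret in last position it is an ordinary member of the set in any shell, so this version does not depend on the behaviour under discussion. The helper name in_digit_or_caret is an illustration only:

```shell
# Hypothetical helper: classify one character against [0-9^],
# i.e. "a digit or a literal caret".
in_digit_or_caret() {
    case "$1" in
        [0-9^]) echo "match" ;;
        *)      echo "no match" ;;
    esac
}

in_digit_or_caret 7     # match
in_digit_or_caret '^'   # match
in_digit_or_caret x     # no match
```

This is the set that a script written with [^0-9] ends up matching once the caret is treated as a literal.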

> > Y'know what, I bisected the Salsa git, too, but then I consulted POSIX.
> > Apparently, this is fine.
> 
> > Please for the love of god add this to the NEWS.
> > I /guarantee/ people are using '[^0-9]' to mean "not 0-9",
> > and similar constructs, even if they are well-versed in the shell language.
> >
> > This is a breaking change going from bullseye, and quite an insidious one.
> > I assume my reaction is gonna mirror others' quite well.
> >
> > /Please/ add this to the NEWS.
> 
> I’m actually considering reverting that patch, as it seems a bit too late in 
> the release cycle to introduce such a breaking change.

I've bisected across snapshot.d.o, and the first Debian version
that exhibits this behaviour is 0.5.11+git20210903+057cd650a4ed-4:
  http://snapshot.debian.org/package/dash/0.5.11%2Bgit20210903%2B057cd650a4ed-4/

Which, if I understand it right, has landed in sid on 2022-03-04.
Since march of last year, sid and testing have been using this;
quoth tracker.d.o:
  [2022-03-07] dash 0.5.11+git20210903+057cd650a4ed-7 MIGRATED to testing 
(Debian testing watch) 

So it's been a good part of a year and no-one's complained
(maybe I'm the idiot what doesn't know globs are negated with !s),
from the point of view of "system compatibility",
I think this has passed the test.

From the point of user code, a NEWS entry I'd consider sufficient,
as usual for breaking-for-compat user-observable changes.

Reverting this now would probably have the opposite effect
(breaking
 (and in this case this /is/ breaking, since the new behaviour is correct)
 people's globs late in the release cycle).

But what do I know,
наб




Bug#1028002: dash: sid dash globs no longer allow [^...] to negate a class; upcoming breaking change from bullseye

2023-01-06 Thread Andrej Shadura
Hi,

On Thu, 5 Jan 2023, at 21:32, наб wrote:
> (I built 0.5.12-2 from the .dsc,
>  the binary packages don't appear to have propagated yet.
>  I also originally wrote this without knowing that glob
>  classes are negated by !, not ^.
>  s/correct/compatible/ and s/broken/incompatible/, i guess)

Thanks for the report.

> Ruh-roh!!! That's /horrific/.
> In my original reproducer that test is for checking the input 
> is an integer, this is a common pattern.
>
> This smells an awful lot like it'd affect all globs, right?
> Yeah.

<...>

> Bisecting over the upstream git, I got
>   commit 8f9cca055bc661c4c690a5f5e1ca71370d129bc3 (HEAD, refs/bisect/bad)
>   Author: Herbert Xu 
>   Date:   Wed Jan 19 16:37:54 2022 +1100
>  
>   expand: Always quote caret when using fnmatch

> as the first bad commit with default configuration (HAVE_FNMATCH=1).
>
> I /cannot/ find a set-up where configuring like Debian
> (--disable-fnmatch --disable-lineno --disable-glob)
> isn't broken.

I’m not sure why this also affects configurations with --disable-fnmatch — from 
the description of it, it shouldn’t?

> Y'know what, I bisected the Salsa git, too, but then I consulted POSIX.
> Apparently, this is fine.

> Please for the love of god add this to the NEWS.
> I /guarantee/ people are using '[^0-9]' to mean "not 0-9",
> and similar constructs, even if they are well-versed in the shell language.
>
> This is a breaking change going from bullseye, and quite an insidious one.
> I assume my reaction is gonna mirror others' quite well.
>
> /Please/ add this to the NEWS.

I’m actually considering reverting that patch, as it seems a bit too late in 
the release cycle to introduce such a breaking change.

-- 
Cheers,
  Andrej



Bug#1028002: dash: sid dash globs no longer allow [^...] to negate a class; upcoming breaking change from bullseye

2023-01-05 Thread наб
Package: dash
Version: 0.5.12-2
Version: 0.5.11+git20210903+057cd650a4ed-9
Severity: wishlist

Dear Maintainer,

(I built 0.5.12-2 from the .dsc,
 the binary packages don't appear to have propagated yet.
 I also originally wrote this without knowing that glob
 classes are negated by !, not ^.
 s/correct/compatible/ and s/broken/incompatible/, i guess)

Original reproducer:
  sh -xc 'rerat_secs=7200; [ "${rerat_secs%[^0-9]*}" != "$rerat_secs" ]; echo 
$?'
reduced for testing:
  sh -c 'i=10; echo "${i%[^0-9]*}"'

The /correct/ output, given by 0.5.11+git20200708+dd9ef66-5 (bullseye)
(and bash, and any other shell), is, naturally "10":
we're removing, from the end, a nondigit, then anything.
There are no nondigits, so nothing is removed.

Let's observe:
  bullseye$ sh -c 'i=10; echo "${i%[^0-9]*}"'
  10
  sid$ sh -c 'i=10; echo "${i%[^0-9]*}"'
  1
  0.5.12-2$ sh -c 'i=10; echo "${i%[^0-9]*}"'
  1
  trunk$ sh -c 'i=10; echo "${i%[^0-9]*}"'
  1
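
For comparison, a sketch of the same test written with the POSIX negation character, which should not depend on how the shell treats a caret. The function name has_nondigit is an assumption made for this example:

```shell
# Hypothetical helper mirroring the reproducer: succeeds (status 0) when
# removing a trailing "non-digit, then anything" changes the value,
# i.e. when $1 contains at least one non-digit character.
has_nondigit() {
    [ "${1%[!0-9]*}" != "$1" ]
}

i=10
echo "${i%[!0-9]*}"    # 10: there is no non-digit to remove

if has_nondigit "$i"; then
    echo "not an integer"
else
    echo "digits only"
fi
```

Note that an empty string also passes as "digits only" here, so a real integer check would need to reject it separately.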

Ruh-roh!!! That's /horrific/.
In my original reproducer that test is for checking the input 
is an integer, this is a common pattern.

This smells an awful lot like it'd affect all globs, right?
Yeah.
  $ ls
  1  10  2  3  4  5  6  7  8  9  bin  DEBIAN  usr
  $ echo [^0-9]*  # bash, bullseye dash
  bin DEBIAN usr
  $ sh -c 'echo [^0-9]*'  # sid dash, dash 0.5.12+ trunk
  1 10 2 3 4 5 6 7 8 9
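
The same applies to pathname expansion. A sketch with the POSIX spelling, run in a throwaway directory so the listing is predictable; list_non_numeric is a hypothetical name, and mktemp -d is assumed to be available (it is not required by POSIX):

```shell
# Hypothetical helper: print the names in directory $1 whose first
# character is not a digit. The body runs in a subshell, so the
# caller's working directory is left alone. If nothing matches,
# the pattern itself is printed unexpanded.
list_non_numeric() (
    cd "$1" || exit 1
    echo [!0-9]*
)

dir=$(mktemp -d)
touch "$dir/1" "$dir/10" "$dir/2" "$dir/bin" "$dir/usr"
list_non_numeric "$dir"    # bin usr
rm -rf "$dir"
```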

Terrifying.

Bisecting over the upstream git, I got
  commit 8f9cca055bc661c4c690a5f5e1ca71370d129bc3 (HEAD, refs/bisect/bad)
  Author: Herbert Xu 
  Date:   Wed Jan 19 16:37:54 2022 +1100
  
  expand: Always quote caret when using fnmatch
  
  This patch forces ^ to be a literal when we use fnmatch.
  
  In order to allow for the extra space to quote the caret, the
  function _rmescapes will allocate up to twice the memory if the
  flag RMESCAPE_GLOB is set.
  
  Fixes: 7638476c18f2 ("shell: Enable fnmatch/glob by default")
  Reported-by: Christoph Anton Mitterer 
  Suggested-by: Harald van Dijk 
  Signed-off-by: Herbert Xu 
as the first bad commit with default configuration (HAVE_FNMATCH=1).

I /cannot/ find a set-up where configuring like Debian
(--disable-fnmatch --disable-lineno --disable-glob)
isn't broken.




Y'know what, I bisected the Salsa git, too, but then I consulted POSIX.
Apparently, this is fine. Apparently, XCU, 2.13.1 Patterns Matching a Single 
Character: 
  When unquoted and outside a bracket expression, the following three
  characters shall have special meaning in the specification of patterns:
  [
    If an open bracket introduces a bracket expression as in XBD RE
    Bracket Expression, except that the <exclamation-mark> character
    ( '!' ) shall replace the <circumflex> character ( '^' ) in its
    role in a non-matching list in the regular expression notation, it
    shall introduce a pattern bracket expression. A bracket expression
    starting with an unquoted <circumflex> character produces unspecified
    results. Otherwise, '[' shall match the character itself.


Please for the love of god add this to the NEWS.
I /guarantee/ people are using '[^0-9]' to mean "not 0-9",
and similar constructs, even if they are well-versed in the shell language.

This is a breaking change going from bullseye, and quite an insidious one.
I assume my reaction is gonna mirror others' quite well.

/Please/ add this to the NEWS.

Thanks,
наб

-- System Information:
Debian Release: bookworm/sid
  APT prefers unstable
  APT policy: (500, 'unstable')
Architecture: x32 (x86_64)
Foreign Architectures: amd64, i386

Kernel: Linux 6.0.0-6-amd64 (SMP w/2 CPU threads; PREEMPT)
Kernel taint flags: TAINT_PROPRIETARY_MODULE, TAINT_OOT_MODULE, 
TAINT_UNSIGNED_MODULE
Locale: LANG=en_GB.UTF-8, LC_CTYPE=en_GB.UTF-8 (charmap=UTF-8), LANGUAGE not set
Shell: /bin/sh linked to /usr/bin/dash
Init: systemd (via /run/systemd/system)
LSM: AppArmor: enabled

Versions of packages dash depends on:
ii  debianutils  5.7-0.4
ii  dpkg         1.21.15
ii  libc6        2.36-7

dash recommends no packages.

dash suggests no packages.

-- debconf information excluded

