Bug#972672: bash SIGSEGV related to locale

2020-11-04 Thread Chet Ramey
On 11/4/20 7:16 AM, Thomas Schwinge wrote:

>> I suspect that the bug is the "len" argument on the previous line
>>   n = wcsrtombs(pathname, (const wchar_t **), len, );
>>
>> Here "len" is byte length obtained for the original string from
>> strlen(). But the call seems to expect the length of the wide character
>> version in wpathname which was obtained above with xdupmbstowcs(), and
>> so the code should use the return value of that function (in variable
>> n) instead of len. Using too long a length makes wcsrtombs() set the
>> pointer to NULL when it continues to a zero character.
> 
> So this is different from the change that Chet (CCed) applied to upstream
> bash-5.1-rc2, 'lib/glob/glob.c':
> 
> (via ).
> 
> I cannot comment on the details, as I'm not at all familiar with these
> string APIs.  Chet?

The prototype for wcsrtombs is (simplified):

size_t wcsrtombs(char *dst, wchar_t **src, size_t len, mbstate_t *ps)

In the prototype, LEN is the maximum number of bytes to store in DST.

In bash, since LEN is set from the number of bytes in the original
pathname, and the possibly-modified multibyte character pathname cannot
contain more than that number of bytes, LEN is appropriate.

-- 
``The lyf so short, the craft so long to lerne.'' - Chaucer
 ``Ars longa, vita brevis'' - Hippocrates
Chet Ramey, UTech, CWRU    c...@case.edu    http://tiswww.cwru.edu/~chet/



Bug#972672: bash SIGSEGV related to locale

2020-11-04 Thread Thomas Schwinge
Hi!

I saw your message only when now looking at
 -- apparently, the Debian BTS doesn't
put the submitter (me) onto CC by default?  (Strange.)


Anyway:

On 2020-10-25T04:36:58+0200, Uoti Urpala wrote:
> On Thu, 22 Oct 2020 11:47:44 +0200 Thomas Schwinge wrote:
>> ..., that is, SIGSEGV, supposedly when bash tries to expand the last path
>> component glob ('*'):
>
> I also encountered this bug. After installing bash-dbgsym, gdb says it
> crashes at glob.c line 487 in wdequote_pathname(). The immediate cause
> of the crash there seems to be that wpathname is NULL.
>
> I suspect that the bug is the "len" argument on the previous line
>   n = wcsrtombs(pathname, (const wchar_t **), len, );
>
> Here "len" is byte length obtained for the original string from
> strlen(). But the call seems to expect the length of the wide character
> version in wpathname which was obtained above with xdupmbstowcs(), and
> so the code should use the return value of that function (in variable
> n) instead of len. Using too long a length makes wcsrtombs() set the
> pointer to NULL when it continues to a zero character.

So this is different from the change that Chet (CCed) applied to upstream
bash-5.1-rc2, 'lib/glob/glob.c':

(via ).

I cannot comment on the details, as I'm not at all familiar with these
string APIs.  Chet?


Regards
 Thomas



Bug#972672: bash SIGSEGV related to locale

2020-11-04 Thread Thomas Schwinge
Control: forcemerge 972286 -1


Hi!

On 2020-10-22T11:47:44+0200, I wrote:
> [...], SIGSEGV, supposedly when bash tries to expand the last path
> component glob ('*'):

> 2020-10-22 06:51:53 upgrade bash:amd64 5.0-7 5.1~rc1-2
> [...]
>
> So that's probably it.  I'll later retry with downgraded bash.

The issue has already been fixed in upstream bash-5.1-rc2, specifically
its 'lib/glob/glob.c' change,
.


Regards
 Thomas



Bug#972672: bash SIGSEGV related to locale

2020-10-24 Thread Uoti Urpala
On Thu, 22 Oct 2020 11:47:44 +0200 Thomas Schwinge wrote:
> ..., that is, SIGSEGV, supposedly when bash tries to expand the last path
> component glob ('*'):

I also encountered this bug. After installing bash-dbgsym, gdb says it
crashes at glob.c line 487 in wdequote_pathname(). The immediate cause
of the crash there seems to be that wpathname is NULL.

I suspect that the bug is the "len" argument on the previous line
  n = wcsrtombs(pathname, (const wchar_t **), len, );

Here "len" is byte length obtained for the original string from
strlen(). But the call seems to expect the length of the wide character
version in wpathname which was obtained above with xdupmbstowcs(), and
so the code should use the return value of that function (in variable
n) instead of len. Using too long a length makes wcsrtombs() set the
pointer to NULL when it continues to a zero character.



Bug#972672: bash SIGSEGV related to locale

2020-10-22 Thread Matthias Klose
On 10/22/20 11:47 AM, Thomas Schwinge wrote:
> So that's probably it.  I'll later retry with downgraded bash.

Please use the bashbug script to forward that issue, if that's new behavior in
5.1.



Bug#972672: bash SIGSEGV related to locale

2020-10-22 Thread Thomas Schwinge
Package: bash
Version: 5.1~rc1-2

Hi!

Up-to-date Debian testing GNU/Linux x86_64 system.

Given a directory path
'/home/thomas/Mail/thomas\@schwinge.name/list/käufer\*innen-forum.forum.fairmondo.de'
containing the special characters '@', 'ä' ('\303\244'), and '*', with
directory contents as follows:

$ find /home/thomas/Mail/thomas\@schwinge.name/list/käufer\*innen-forum.forum.fairmondo.de/ -ls
  4563522  4 drwxrwx---   4 thomas   thomas   4096 Oct  4 10:14 /home/thomas/Mail/tho...@schwinge.name/list/k\303\244ufer*innen-forum.forum.fairmondo.de/
 11485361  4 drwxrwx---   5 thomas   thomas   4096 Oct  4 10:14 /home/thomas/Mail/tho...@schwinge.name/list/k\303\244ufer*innen-forum.forum.fairmondo.de/2020-10
 11485362  4 drwxrwx---   2 thomas   thomas   4096 Oct 10 15:01 /home/thomas/Mail/tho...@schwinge.name/list/k\303\244ufer*innen-forum.forum.fairmondo.de/2020-10/tmp
 11485363  4 drwxrwx---   2 thomas   thomas   4096 Oct 11 12:39 /home/thomas/Mail/tho...@schwinge.name/list/k\303\244ufer*innen-forum.forum.fairmondo.de/2020-10/new
 11485364  4 drwxrwx---   2 thomas   thomas   4096 Oct  4 10:14 /home/thomas/Mail/tho...@schwinge.name/list/k\303\244ufer*innen-forum.forum.fairmondo.de/2020-10/cur
  4564560  4 drwxrwx---   5 thomas   thomas   4096 Sep  1 10:34 /home/thomas/Mail/tho...@schwinge.name/list/k\303\244ufer*innen-forum.forum.fairmondo.de/2020-09
  4564561  4 drwxrwx---   2 thomas   thomas   4096 Sep 30 22:43 /home/thomas/Mail/tho...@schwinge.name/list/k\303\244ufer*innen-forum.forum.fairmondo.de/2020-09/tmp
  4564562  4 drwxrwx---   2 thomas   thomas   4096 Oct  3 17:20 /home/thomas/Mail/tho...@schwinge.name/list/k\303\244ufer*innen-forum.forum.fairmondo.de/2020-09/new
  4564563  4 drwxrwx---   2 thomas   thomas   4096 Sep  1 10:34 /home/thomas/Mail/tho...@schwinge.name/list/k\303\244ufer*innen-forum.forum.fairmondo.de/2020-09/cur

..., that is, two empty maildirs: '2020-09', '2020-10', and:

$ locale
LANG=C.UTF-8
LANGUAGE=C
LC_CTYPE=de_DE.utf8
LC_NUMERIC=de_DE.utf8
LC_TIME=de_DE.utf8
LC_COLLATE=C
LC_MONETARY=de_DE.utf8
LC_MESSAGES=de_DE.utf8
LC_PAPER=de_DE.utf8
LC_NAME=de_DE.utf8
LC_ADDRESS=de_DE.utf8
LC_TELEPHONE=de_DE.utf8
LC_MEASUREMENT=de_DE.utf8
LC_IDENTIFICATION=de_DE.utf8
LC_ALL=

..., we get:

$ strace -ff -o s bash -c ': /home/thomas/Mail/thomas\@schwinge.name/list/käufer\*innen-forum.forum.fairmondo.de/*'
Segmentation fault

..., that is, SIGSEGV, supposedly when bash tries to expand the last path
component glob ('*'):

[...]
openat(AT_FDCWD, "/usr/lib/locale/C.UTF-8/LC_CTYPE", O_RDONLY|O_CLOEXEC) = 3
fstat(3, {st_mode=S_IFREG|0644, st_size=201272, ...}) = 0
mmap(NULL, 201272, PROT_READ, MAP_PRIVATE, 3, 0) = 0x7f829f2b5000
close(3)                                = 0
rt_sigprocmask(SIG_BLOCK, NULL, [], 8)  = 0
getpeername(0, 0x7fff90f70c20, [16])    = -1 ENOTSOCK (Socket operation on non-socket)
rt_sigprocmask(SIG_BLOCK, NULL, [], 8)  = 0
--- SIGSEGV {si_signo=SIGSEGV, si_code=SEGV_MAPERR, si_addr=NULL} ---
+++ killed by SIGSEGV +++

There is no such problem under 'LC_ALL=C':

$ LC_ALL=C strace -ff -o s_C bash -c ': /home/thomas/Mail/thomas\@schwinge.name/list/käufer\*innen-forum.forum.fairmondo.de/*'

[...]
rt_sigprocmask(SIG_BLOCK, NULL, [], 8)  = 0
getpeername(0, 0x7ffe452fba20, [16])    = -1 ENOTSOCK (Socket operation on non-socket)
rt_sigprocmask(SIG_BLOCK, NULL, [], 8)  = 0
openat(AT_FDCWD, "/home/thomas/Mail/tho...@schwinge.name/list/k\303\244ufer*innen-forum.forum.fairmondo.de/", O_RDONLY|O_NONBLOCK|O_CLOEXEC|O_DIRECTORY) = 3
fstat(3, {st_mode=S_IFDIR|0770, st_size=4096, ...}) = 0
getdents64(3, /* 4 entries */, 32768)   = 112
getdents64(3, /* 0 entries */, 32768)   = 0
close(3)                                = 0
rt_sigprocmask(SIG_BLOCK, [CHLD], [], 8) = 0
rt_sigprocmask(SIG_SETMASK, [], NULL, 8) = 0
exit_group(0)   = ?
+++ exited with 0 +++

(That means, I do have a workaround.)

I cannot tell exactly when this started failing, but it must have been very
recently, within the last few days.

Recent (automated) package updates that are supposedly relevant, per
'/var/log/dpkg.log':

[...]
2020-10-19 08:28:50 upgrade libc6:amd64 2.31-3 2.31-4
[...]
2020-10-19 08:29:59 upgrade libc-l10n:all 2.31-3 2.31-4
[...]
2020-10-19 08:32:22 upgrade locales-all:amd64 2.31-3 2.31-4
[...]
2020-10-19 08:37:57 upgrade locales:all 2.31-3 2.31-4
[...]

I can't confirm for sure, but I'm reasonably confident that the problem
did not yet occur at this point.  Then, last night:

[...]
2020-10-22 06:51:53 upgrade bash:amd64 5.0-7 5.1~rc1-2
[...]

So that's probably it.  I'll later retry with downgraded bash.


Regards
 Thomas