Processed: Re: Bug#965091: glibc: setgroups: Bad address [2.31/x32, regression from 2.30]

2020-07-19 Thread Debian Bug Tracking System
Processing commands for cont...@bugs.debian.org:

> severity 965091 important
Bug #965091 [libc6] glibc: setgroups: Bad address [2.31/x32, regression from 
2.30]
Severity set to 'important' from 'grave'
> thanks
Stopping processing here.

Please contact me if you need assistance.
-- 
965091: https://bugs.debian.org/cgi-bin/bugreport.cgi?bug=965091
Debian Bug Tracking System
Contact ow...@bugs.debian.org with problems



Bug#965091: glibc: setgroups: Bad address [2.31/x32, regression from 2.30]

2020-07-19 Thread Aurelien Jarno
severity 965091 important
thanks

Hi,

On 2020-07-15 23:21, Thorsten Glaser wrote:
> Package: libc6
> Version: 2.31-1
> Severity: grave
> Justification: renders package unusable
> 
> This is related to #965086 and #965087 (and, in fact, possibly
> causing them). After a glibc upgrade half the system services
> (postfix, sshd, apt-get(!)) don’t work any more.

The set of packages involved in the glibc transition is ready and this
bug is the only thing blocking the migration to testing. The patch
fixing that issue is being developed upstream, but not fully ready yet.

I am therefore downgrading the severity of this bug given that:
- The x32 port is not an official port and doesn't have testing. Having
  this glibc package in testing won't change anything for that port.
- The libc6-x32 is affected by the issue, but x32 syscalls are disabled
  by default in the debian kernels.

This doesn't mean that the bug won't be fixed. Once the package has
migrated to testing and we have a patch upstream, I'll do another upload
fixing the issue.

Regards,
Aurelien

-- 
Aurelien Jarno  GPG: 4096R/1DDD8C9B
aurel...@aurel32.net http://www.aurel32.net



Bug#965091: glibc: setgroups: Bad address [2.31/x32, regression from 2.30]

2020-07-15 Thread H.J. Lu
On Wed, Jul 15, 2020 at 5:22 PM Jessica Clarke  wrote:
>
> [H.J. Lu Cc'ed as author of what looks like the problematic commit]
>
> Looks like df76ff3a446a787a95cf74cb15c285464d73a93d broke this upstream
> (x32: Properly pass long to syscall [BZ #25810]).
>
> setgroups uses INLINE_SETXID_SYSCALL, which in turn passes its
> arguments through the various id fields in xid_command. Unfortunately,
> those are `long int`, and so, when the `const gid_t *list` argument
> gets later passed through INTERNAL_SYSCALL_NCS, it's treated as an
> integer argument and thus sign-extended, despite actually being a
> pointer. I think xid_command's id fields need to become __kernel_ulong
> or equivalent, with ARGIFY happening inside INLINE_SETXID_SYSCALL when
> it still knows what the types are.
>

Please open a glibc bug with a small testcase.

-- 
H.J.



Bug#965091: glibc: setgroups: Bad address [2.31/x32, regression from 2.30]

2020-07-15 Thread Jessica Clarke
[H.J. Lu Cc'ed as author of what looks like the problematic commit]

On Wed, 15 Jul 2020 23:47:27 +0200 (CEST) Thorsten Glaser  
wrote:
> Some more analysis:
> 
> We enter libc from openssh code with the correct values in rsi and rdi:
> 
> 
> (gdb) u
> => 0x56560e45 : call   0x5655d0b0 
> (gdb) info r
> rax            0xfffe              65534
> rbx            0x5663a000          1449369600
> rcx            0x0                 0
> rdx            0x0                 0
> rsi            0xffffd2e0          4294955744
> rdi            0x1                 1
> rbp            0x56641b50          1449401168
> rsp            0xffffd260          4294955616
> r8             0x0                 0
> r9             0x0                 0
> r10            0xf7a88888          4155017352
> r11            0x246               582
> r12            0x565d71d1          1448964561
> r13            0x3                 3
> r14            0xe2cc              58060
> r15            0x5663c730          1449379632
> rip            0x56560e45          1448480325
> eflags         0x282               [ SF IF ]
> cs             0x33                51
> ss             0x2b                43
> ds             0x2b                43
> es             0x2b                43
> fs             0x0                 0
> gs             0x0                 0
> 
> 
> Inside glibc, there’s an indirect call:
> 
> 
> => 0xf79949f4 <__GI_setgroups+100>: call   rax
> rax            0xf7833500          4152571136
> => 0xf7833500 <__nptl_setxid>:  push   r15
> 
> 
> Some time in __nptl_setxid later, here’s the bug:
> 
> 
> 1162    in allocatestack.c
> rax            0xf77ca840          4152141888
> rbx            0xffffd230          4294955568
> rcx            0x0                 0
> rdx            0x1                 1
> rsi            0xffffd2e0          4294955744
> rdi            0xf77ca840          4152141888
> rbp            0xf77ca840          4152141888
> rsp            0xffffd1d0          4294955472
> r8             0x0                 0
> r9             0x0                 0
> r10            0xf77caac0          4152142528
> r11            0x246               582
> r12            0xf784a360          4152664928
> r13            0xf784a360          4152664928

Looks like df76ff3a446a787a95cf74cb15c285464d73a93d broke this upstream
(x32: Properly pass long to syscall [BZ #25810]).

setgroups uses INLINE_SETXID_SYSCALL, which in turn passes its
arguments through the various id fields in xid_command. Unfortunately,
those are `long int`, and so, when the `const gid_t *list` argument
gets later passed through INTERNAL_SYSCALL_NCS, it's treated as an
integer argument and thus sign-extended, despite actually being a
pointer. I think xid_command's id fields need to become __kernel_ulong
or equivalent, with ARGIFY happening inside INLINE_SETXID_SYSCALL when
it still knows what the types are.
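
To make the failure mode concrete, here is a minimal sketch in portable C. It assumes that an x32 `long int` slot can be modelled as `int32_t` and that the syscall argument register is 64 bits wide. The names `x32_long`, `store_address`, `widen_as_integer` and `widen_as_pointer` are hypothetical and only serve the illustration; this is not glibc code.

```c
#include <stdint.h>

/* Hypothetical stand-in for an x32 `long int` slot such as the id fields
   of xid_command: 32 bits wide and signed. */
typedef int32_t x32_long;

/* Store a 32-bit user-space address in such a slot. The pointer type is
   lost at this point; only the bit pattern survives. (Converting an
   out-of-range value to int32_t is implementation-defined; GCC wraps.) */
static x32_long store_address(uint32_t address)
{
    return (x32_long) address;
}

/* Widen the slot as a signed integer, which is what a sign-extending
   load (movsxd) into a 64-bit register does. */
static uint64_t widen_as_integer(x32_long slot)
{
    return (uint64_t) (int64_t) slot;
}

/* Widen the slot as a pointer should be widened: zero-extended, which is
   what a plain 32-bit mov into a 64-bit register does. */
static uint64_t widen_as_pointer(x32_long slot)
{
    return (uint64_t) (uint32_t) slot;
}
```

Fed with the stack address from the trace, the integer path yields -11552 when read as a signed 64-bit value and the pointer path yields 4294955744, which are the two decimal values gdb reports for rsi after and before the faulty load.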

Jess



Bug#965091: glibc: setgroups: Bad address [2.31/x32, regression from 2.30]

2020-07-15 Thread Thorsten Glaser
> So something clearly changed…

Compiler output, most probably. I cannot reproduce it. I tried:

struct xid_command
{
  int syscall_no;
  long int id[3];
  volatile int cntr;
  volatile int error; /* -1: no call yet, 0: success seen, >0: error seen.  */
};

extern void a_barrier(int *);

# define REGISTERS_CLOBBERED_BY_SYSCALL "cc", "r11", "cx"

/* NB: This also works when X is an array.  For an array X,  type of
   (X) - (X) is ptrdiff_t, which is signed, since size of ptrdiff_t
   == size of pointer, cast is a NOP.   */
#define TYPEFY1(X) __typeof__ ((X) - (X))
/* Explicit cast the argument.  */
#define ARGIFY(X) ((TYPEFY1 (X)) (X))
/* Create a variable 'name' based on type of variable 'X' to avoid
   explicit types.  */
#define TYPEFY(X, name) __typeof__ (ARGIFY (X)) name


#undef INTERNAL_SYSCALL_NCS
#define INTERNAL_SYSCALL_NCS(number, err, nr, args...) \
        internal_syscall##nr (number, err, args)

#undef internal_syscall3
#define internal_syscall3(number, err, arg1, arg2, arg3) \
({ \
    unsigned long int resultvar; \
    TYPEFY (arg3, __arg3) = ARGIFY (arg3); \
    TYPEFY (arg2, __arg2) = ARGIFY (arg2); \
    TYPEFY (arg1, __arg1) = ARGIFY (arg1); \
    register TYPEFY (arg3, _a3) asm ("rdx") = __arg3; \
    register TYPEFY (arg2, _a2) asm ("rsi") = __arg2; \
    register TYPEFY (arg1, _a1) asm ("rdi") = __arg1; \
    asm volatile ( \
    "syscall\n\t" \
    : "=a" (resultvar) \
    : "0" (number), "r" (_a1), "r" (_a2), "r" (_a3) \
    : "memory", REGISTERS_CLOBBERED_BY_SYSCALL); \
    (long int) resultvar; \
})

int
foo(struct xid_command *cmdp)
{
  int result;
  asm volatile ("xor rsi,rsi\n\txor rdi,rdi" : : : "rsi", "rdi");
  result = INTERNAL_SYSCALL_NCS (cmdp->syscall_no, err, 3,
 cmdp->id[0], cmdp->id[1], cmdp->id[2]);
  a_barrier(&result);
  return result;
}


Save as x.c then:

x86_64-linux-gnux32-gcc-10 -c -std=gnu11 -fgnu89-inline -pipe -O2 -g -Wall \
    -Wwrite-strings -Wundef -Werror -fmerge-all-constants -frounding-math \
    -fstack-protector-strong -Wstrict-prototypes -Wold-style-definition \
    -fmath-errno -fpie -ftls-model=initial-exec -D_LIBC_REENTRANT -DPIC -S \
    -masm=intel x.c -Wno-error

This doesn’t yield any “movsxd” in the output, though, unlike the glibc
build, which has:

 b32:   67 48 63 73 08          movsxd rsi,DWORD PTR [ebx+0x8]
 b37:   67 48 63 7b 04          movsxd rdi,DWORD PTR [ebx+0x4]
 b3c:   67 48 63 53 0c          movsxd rdx,DWORD PTR [ebx+0xc]
 b41:   67 8b 03                mov    eax,DWORD PTR [ebx]
 b44:   0f 05                   syscall

(disassembly of pthread_create.o from libpthread.a 2.31)

I’m unsure whether this is a glibc or gcc issue… without a reproducer
I’m stuck.
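
One thing that can be checked without an x32 toolchain is which type the TYPEFY1 macro from the reproducer picks for different arguments. The sketch below is only an illustration under stated assumptions: it copies TYPEFY1 verbatim (GNU C, since it relies on `__typeof__`), while `IS_SIGNED_TYPE` and the four helper functions are hypothetical names that exist in neither glibc nor the reproducer.

```c
#include <stddef.h>

/* Copied from the reproducer: the type of (X) - (X). */
#define TYPEFY1(X) __typeof__ ((X) - (X))

/* Hypothetical helper: true if the arithmetic type T is signed. */
#define IS_SIGNED_TYPE(T) ((T) -1 < (T) 0)

/* For a pointer argument, (X) - (X) is a pointer difference, so the
   selected type has the size of ptrdiff_t ... */
static size_t pointer_arg_size(const unsigned int *list)
{
    return sizeof (TYPEFY1 (list));
}

/* ... and is signed. */
static int pointer_arg_is_signed(const unsigned int *list)
{
    return IS_SIGNED_TYPE (TYPEFY1 (list));
}

/* A `long int` argument, like the id fields of xid_command, stays a
   signed long no matter what was stored in it. */
static int long_arg_is_signed(long int id)
{
    return IS_SIGNED_TYPE (TYPEFY1 (id));
}

/* An unsigned long argument keeps its unsignedness. */
static int unsigned_long_arg_is_signed(unsigned long int id)
{
    return IS_SIGNED_TYPE (TYPEFY1 (id));
}
```

Under these macros both a real pointer and a pointer that was stored in a `long int` end up with a signed type. Whether that becomes a sign-extending load presumably depends on how wide the final register variable is compared to that type, which this sketch does not model.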

I’ll have to downgrade to 2.30 for now, to keep the system ssh-in-able…

bye,
//mirabilos
-- 
tarent solutions GmbH
Rochusstraße 2-4, D-53123 Bonn • http://www.tarent.de/
Tel: +49 228 54881-393 • Fax: +49 228 54881-235
HRB 5168 (AG Bonn) • USt-ID (VAT): DE122264941
Geschäftsführer: Dr. Stefan Barth, Kai Ebenrett, Boris Esser, Alexander Steeg



Bug#965091: glibc: setgroups: Bad address [2.31/x32, regression from 2.30]

2020-07-15 Thread Thorsten Glaser
Some more analysis:

We enter libc from openssh code with the correct values in rsi and rdi:


(gdb) u
=> 0x56560e45 : call   0x5655d0b0 
(gdb) info r
rax            0xfffe              65534
rbx            0x5663a000          1449369600
rcx            0x0                 0
rdx            0x0                 0
rsi            0xffffd2e0          4294955744
rdi            0x1                 1
rbp            0x56641b50          1449401168
rsp            0xffffd260          4294955616
r8             0x0                 0
r9             0x0                 0
r10            0xf7a88888          4155017352
r11            0x246               582
r12            0x565d71d1          1448964561
r13            0x3                 3
r14            0xe2cc              58060
r15            0x5663c730          1449379632
rip            0x56560e45          1448480325
eflags         0x282               [ SF IF ]
cs             0x33                51
ss             0x2b                43
ds             0x2b                43
es             0x2b                43
fs             0x0                 0
gs             0x0                 0


Inside glibc, there’s an indirect call:


=> 0xf79949f4 <__GI_setgroups+100>: call   rax
rax            0xf7833500          4152571136
=> 0xf7833500 <__nptl_setxid>:  push   r15


Some time in __nptl_setxid later, here’s the bug:


1162    in allocatestack.c
rax            0xf77ca840          4152141888
rbx            0xffffd230          4294955568
rcx            0x0                 0
rdx            0x1                 1
rsi            0xffffd2e0          4294955744
rdi            0xf77ca840          4152141888
rbp            0xf77ca840          4152141888
rsp            0xffffd1d0          4294955472
r8             0x0                 0
r9             0x0                 0
r10            0xf77caac0          4152142528
r11            0x246               582
r12            0xf784a360          4152664928
r13            0xf784a360          4152664928
r14            0xf78482c8          4152656584
r15            0x400000ca          1073742026
rip            0xf7833752          4152571730
eflags         0x246               [ PF ZF IF ]
cs             0x33                51
ss             0x2b                43
ds             0x2b                43
es             0x2b                43
fs             0x0                 0
gs             0x0                 0

=> 0xf7833752 <__nptl_setxid+594>:  movsxd rsi,DWORD PTR [ebx+0x8]
   0xf7833757 <__nptl_setxid+599>:  movsxd rdi,DWORD PTR [ebx+0x4]
   0xf783375c <__nptl_setxid+604>:  movsxd rdx,DWORD PTR [ebx+0xc]
(gdb) t
=> 0xf7833752 <__nptl_setxid+594>:  movsxd rsi,DWORD PTR [ebx+0x8]

1162    in allocatestack.c
rax            0xf77ca840          4152141888
rbx            0xffffd230          4294955568
rcx            0x0                 0
rdx            0x1                 1
rsi            0xffffffffffffd2e0  -11552
rdi            0xf77ca840          4152141888
rbp            0xf77ca840          4152141888
rsp            0xffffd1d0          4294955472
r8             0x0                 0
r9             0x0                 0
r10            0xf77caac0          4152142528
r11            0x246               582
r12            0xf784a360          4152664928
r13            0xf784a360          4152664928
r14            0xf78482c8          4152656584
r15            0x400000ca          1073742026
rip            0xf7833757          4152571735
eflags         0x246               [ PF ZF IF ]
cs             0x33                51
ss             0x2b                43
ds             0x2b                43
es             0x2b                43
fs             0x0                 0
gs             0x0                 0


Looking at the next instructions…


=> 0xf7833757 <__nptl_setxid+599>:  movsxd rdi,DWORD PTR [ebx+0x4]
   0xf783375c <__nptl_setxid+604>:  movsxd rdx,DWORD PTR [ebx+0xc]
   0xf7833761 <__nptl_setxid+609>:  mov    eax,DWORD PTR [ebx]
   0xf7833764 <__nptl_setxid+612>:  syscall 


… this most likely corresponds to this C source:


 1162   result = INTERNAL_SYSCALL_NCS (cmdp->syscall_no, err, 3,
 1163  cmdp->id[0], cmdp->id[1], cmdp->id[2]);


Diffing glibc-2.30..glibc-2.31 shows no noticeable delta
in nptl/allocatestack.c, so I kept looking.

struct xid_command (nptl/descr.h) also did not change.

Looking at pthread_create.o (for whatever reason, this is the file that
__nptl_setxid ends up in) from 2.30-8, the code in question looks like this:

 c3d:   67 8b 75 08             mov    esi,DWORD PTR [ebp+0x8]
 c41:   67 8b 7d 04             mov    edi,DWORD PTR [ebp+0x4]
 c45:   67 8b 55 0c             mov    edx,DWORD PTR [ebp+0xc]
 c49:   67 8b 45 00             mov    eax,DWORD PTR [ebp+0x0]
 c4d:   0f 05                   syscall

So something clearly changed…

//mirabilos

Bug#965091: glibc: setgroups: Bad address [2.31/x32, regression from 2.30]

2020-07-15 Thread Thorsten Glaser
Package: libc6
Version: 2.31-1
Severity: grave
Justification: renders package unusable

This is related to #965086 and #965087 (and, in fact, possibly
causing them). After a glibc upgrade half the system services
(postfix, sshd, apt-get(!)) don’t work any more.

Downgrading with dpkg -i the following set of packages fixes it:

libc-bin_2.30-8_x32.deb
libc-dev-bin_2.30-8_x32.deb
libc-l10n_2.30-8_all.deb
libc6-dbg_2.30-8_x32.deb
libc6-dev_2.30-8_x32.deb
libc6_2.30-8_amd64.deb
libc6_2.30-8_i386.deb
libc6_2.30-8_x32.deb
locales-all_2.30-8_x32.deb
locales_2.30-8_all.deb
unscd_0.53-1+b3_x32.deb

Snippet from strace:

[…]
9839  getpid()                          = 9839
9839  chroot("/run/sshd")               = 0
9839  chdir("/")                        = 0
9839  write(7, "\0\0\0$\0\0\0\7\0\0\0\34privsep user:group 1"..., 40) = 40
9839  setgroups(1, 0xffffffffff866750 <unfinished ...>
9794  <... poll resumed>)               = 1 ([{fd=6, revents=POLLIN}])
9839  <... setgroups resumed>)          = -1 EFAULT (Bad address)
9794  read(6,  <unfinished ...>
9839  write(7, "\0\0\0\36\0\0\0\1\0\0\0\26setgroups: Bad addre"..., 34 <unfinished ...>

[…]

Noticeable: the sign-extended address.

I haven’t yet managed to reproduce this in a stand-alone program.
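
A stand-alone attempt could look roughly like the sketch below. It rests on two assumptions that have not been verified here: that the failing path is only taken once libpthread is in use (the gdb trace goes through __nptl_setxid), and that the group list has to live at an address with bit 31 set, as stack addresses on x32 do. The names `idle_thread` and `setgroups_roundtrip` are hypothetical.

```c
#define _GNU_SOURCE
#include <errno.h>
#include <grp.h>
#include <limits.h>
#include <pthread.h>
#include <stddef.h>
#include <sys/types.h>
#include <unistd.h>

/* Thread body that does nothing; the thread only exists so that the
   process actually uses libpthread. */
static void *idle_thread(void *arg)
{
    return arg;
}

/* Read the current supplementary groups and set them again unchanged.
   Returns 0 on success, otherwise the error number of the failing call. */
static int setgroups_roundtrip(void)
{
    gid_t list[NGROUPS_MAX];   /* deliberately on the stack */
    pthread_t thread;

    int rc = pthread_create(&thread, NULL, idle_thread, NULL);
    if (rc != 0)
        return rc;
    pthread_join(thread, NULL);

    int count = getgroups(NGROUPS_MAX, list);
    if (count < 0)
        return errno;

    if (setgroups((size_t) count, list) < 0)
        return errno;
    return 0;
}
```

Built with `-pthread` and called from `main`, the function is expected to return 0 when run as root and EPERM when run without the privilege to change groups. A return value of EFAULT would match the failure in the strace snippet.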

-- System Information:
Debian Release: bullseye/sid
  APT prefers unreleased
  APT policy: (500, 'unreleased'), (500, 'buildd-unstable'), (500, 'unstable'), 
(100, 'experimental')
Architecture: x32 (x86_64)
Foreign Architectures: i386, amd64

Kernel: Linux 5.7.0-1-amd64 (SMP w/4 CPU threads)
Kernel taint flags: TAINT_FIRMWARE_WORKAROUND
Locale: LANG=C, LC_CTYPE=en_US.UTF-8 (charmap=UTF-8), LANGUAGE not set
Shell: /bin/sh linked to /bin/lksh
Init: sysvinit (via /sbin/init)

Versions of packages libc6 depends on:
ii  libcrypt1  1:4.4.16-1
ii  libgcc-s1  10.1.0-6

Versions of packages libc6 recommends:
ii  libidn2-0  2.3.0-1

Versions of packages libc6 suggests:
ii  debconf [debconf-2.0]  1.5.74
ii  glibc-doc              2.31-1
ii  libc-l10n              2.31-1
ii  locales                2.31-1

-- debconf information:
  glibc/disable-screensaver:
* libraries/restart-without-asking: true
  glibc/restart-failed:
  glibc/kernel-too-old:
* glibc/upgrade: true
* glibc/restart-services: postfix openbsd-inetd cups cron
  glibc/kernel-not-supported: