Bug#818909: Segfaults caused by new DT_MIPS_RLD_MAP_REL tag and RPATH removers

2016-04-17 Thread Drew Parsons
On Sat, 9 Apr 2016 13:25:35 +0200 Aurelien Jarno wrote:
> 
> The chrpath issue has been fixed. I have scheduled binNMUs to get a
> fixed openmpi on mipsel and mips64el.
> 
> I am keeping this bug open with severity minor so as not to forget to
> remove the workaround that has been added (thanks for that). There is
> no urgency for it; it can wait for other changes.
> 


petsc now builds on all official architectures except for mipsel.  It
does build on mips and mips64el, but not mipsel.  I'm not certain if
the build failure is related to the final work needed in this bug for
mipsel, or if it's something else.  Further details are in Bug#816101.

Drew



Bug#818909: Segfaults caused by new DT_MIPS_RLD_MAP_REL tag and RPATH removers

2016-04-09 Thread Aurelien Jarno
control: retitle -1 openmpi: please remove mips chrpath workaround
control: severity -1 minor

On 2016-04-09 09:38, Gilles Filippini wrote:
> Control: reopen -1
> 
> On Fri, 8 Apr 2016 13:34:04 +0200 Emilio Pozuelo Monfort
>  wrote:
> > On Fri, 8 Apr 2016 12:16:47 +0200 Alastair McKinstry
> >  wrote:
> > > 
> > > >> OpenMPI maintainers (and anyone else affected):
> > > >> One possible workaround is to use chrpath -r ""  on mips*
> > > >> architectures until this is fixed since that command does not cause any
> > > >> tags to be moved. It has a tiny performance penalty but should
> > > >> otherwise work properly.
> > > > Thanks for the workaround.
> > > >
> > > > Aurelien
> > > >
> > > Thanks.
> > > > I've tested this fix within openmpi on mips (works) and have uploaded a
> > > > new version with the workaround.
> > 
> > Thanks! Unfortunately you forgot to apply this same workaround to mipsel and
> > mips64el. Could you apply it in those architectures as well?
> 
> Reopening, until the problem is fixed for mipsel and mips64el.

The chrpath issue has been fixed. I have scheduled binNMUs to get a
fixed openmpi on mipsel and mips64el.

I am keeping this bug open with severity minor so as not to forget to
remove the workaround that has been added (thanks for that). There is
no urgency for it; it can wait for other changes.

Aurelien

-- 
Aurelien Jarno  GPG: 4096R/1DDD8C9B
aurel...@aurel32.net http://www.aurel32.net




Bug#818909: Segfaults caused by new DT_MIPS_RLD_MAP_REL tag and RPATH removers

2016-04-09 Thread Gilles Filippini
Control: reopen -1

On Fri, 8 Apr 2016 13:34:04 +0200 Emilio Pozuelo Monfort wrote:
> On Fri, 8 Apr 2016 12:16:47 +0200 Alastair McKinstry
>  wrote:
> > 
> > >> OpenMPI maintainers (and anyone else affected):
> > >> One possible workaround is to use chrpath -r ""  on mips*
> > >> architectures until this is fixed since that command does not cause any
> > >> tags to be moved. It has a tiny performance penalty but should
> > >> otherwise work properly.
> > > Thanks for the workaround.
> > >
> > > Aurelien
> > >
> > Thanks.
> > I've tested this fix within openmpi on mips (works) and have uploaded a
> > new version with the workaround.
> 
> Thanks! Unfortunately you forgot to apply this same workaround to mipsel and
> mips64el. Could you apply it in those architectures as well?

Reopening, until the problem is fixed for mipsel and mips64el.

Thanks,

_g.





Bug#818909: Segfaults caused by new DT_MIPS_RLD_MAP_REL tag and RPATH removers

2016-04-08 Thread Aurelien Jarno
On 2016-04-08 10:45, Aurelien Jarno wrote:
> Hi,
> 
> On 2016-04-07 11:51, James Cowgill wrote:
> > 
> > Based only on chrpath and cmake reverse dependencies, there is an upper
> > bound of about 1500 binNMUs (after the tools are fixed). Hopefully
> > that can be reduced!
> >
> > I really don't have any time to fix all this. Please can someone else
> > have a look!
> 
> I'll try to do an archive scan asap to get a real idea of how many
> packages are affected. After that I'll look at how to fix chrpath, but
> help would be welcome as I also don't have a lot of time.

Please find below a small quick-and-dirty Python script to check that.
There is probably a better and cleaner way to do it, but it does the
job.

I will start an archive scan and will start working on a chrpath fix.

Aurelien



#!/usr/bin/python3

from sys import argv, exit
from elftools.elf.elffile import ELFFile
from elftools.elf.dynamic import DynamicSection

# MIPS specific dynamic tags (values as defined in glibc's elf/elf.h)
DT_MIPS_BASE_ADDRESS = 0x70000006
DT_MIPS_RLD_MAP = 0x70000016
DT_MIPS_RLD_MAP_REL = 0x70000035

good = True
filename = argv[1]

with open(filename, 'rb') as f:
    for section in ELFFile(f).iter_sections():
        if not isinstance(section, DynamicSection):
            continue

        base = None
        rld_map = None
        rld_map_rel = None
        rld_map_rel_offset = None
        for index, tag in enumerate(section.iter_tags()):
            if tag.entry.d_tag == DT_MIPS_BASE_ADDRESS:
                base = tag.entry.d_val
            elif tag.entry.d_tag == DT_MIPS_RLD_MAP:
                rld_map = tag.entry.d_val
            elif tag.entry.d_tag == DT_MIPS_RLD_MAP_REL:
                rld_map_rel = tag.entry.d_val
                # Offset of this tag within the file
                rld_map_rel_offset = (section.header.sh_offset
                                      + index * section.header.sh_entsize)

        # The old absolute tag must agree with the new relative one
        if base and rld_map and rld_map_rel:
            if rld_map != base + rld_map_rel + rld_map_rel_offset:
                good = False

print('%s: %s' % (filename, 'ok' if good else 'bad'))
exit(0 if good else 1)

-- 
Aurelien Jarno  GPG: 4096R/1DDD8C9B
aurel...@aurel32.net http://www.aurel32.net




Bug#818909: Segfaults caused by new DT_MIPS_RLD_MAP_REL tag and RPATH removers

2016-04-08 Thread Emilio Pozuelo Monfort
On Fri, 8 Apr 2016 12:16:47 +0200 Alastair McKinstry wrote:
> 
> >> OpenMPI maintainers (and anyone else affected):
> >> One possible workaround is to use chrpath -r ""  on mips*
> >> architectures until this is fixed since that command does not cause any
> >> tags to be moved. It has a tiny performance penalty but should
> >> otherwise work properly.
> > Thanks for the workaround.
> >
> > Aurelien
> >
> Thanks.
> I've tested this fix within openmpi on mips (works) and have uploaded a
> new version with the workaround.

Thanks! Unfortunately you forgot to apply this same workaround to mipsel and
mips64el. Could you apply it in those architectures as well?

Thanks,
Emilio



Bug#818909: Segfaults caused by new DT_MIPS_RLD_MAP_REL tag and RPATH removers

2016-04-08 Thread Alastair McKinstry

>> OpenMPI maintainers (and anyone else affected):
>> One possible workaround is to use chrpath -r ""  on mips*
>> architectures until this is fixed since that command does not cause any
>> tags to be moved. It has a tiny performance penalty but should
>> otherwise work properly.
> Thanks for the workaround.
>
> Aurelien
>
Thanks.
I've tested this fix within openmpi on mips (works) and have uploaded a
new version with the workaround.

regards
Alastair

-- 
Alastair McKinstry
https://diaspora.sceal.ie/u/amckinstry
Misentropy: doubting that the Universe is becoming more disordered. 



Bug#818909: Segfaults caused by new DT_MIPS_RLD_MAP_REL tag and RPATH removers

2016-04-08 Thread Aurelien Jarno
Hi,

On 2016-04-07 11:51, James Cowgill wrote:
> Hi,
> 
> I've managed to find the cause of the openmpi segfault (#818909). It
> might affect a number of different packages.

Thanks for working on that.

> The segfault is caused by the interaction of the
> new DT_MIPS_RLD_MAP_REL dynamic tag (from binutils 2.26) and chrpath.
> Unlike all other tags, this tag is relative to the offset of the tag
> within the executable. chrpath is used to remove rpaths from ELF files.
> It does this by moving all of the other dynamic tags up one entry, but
> since the DT_MIPS_RLD_MAP_REL is not updated, it now points to an
> incorrect offset. The dynamic linker will then overwrite some other
> memory when processing the DT_MIPS_RLD_MAP_REL tag.
> 
> The openmpi segfault was caused by a global variable being initialized
> incorrectly (overwritten by the dynamic linker). I expect other
> executables using chrpath will also be affected - possibly in strange
> ways (not necessarily a segfault).
> 
> It also seems that at least cmake uses the same technique for removing
> the RPATH so any cmake reverse dependencies could be affected. The
> DT_MIPS_RLD_MAP_REL is only created for executables which limits the
> effect of this slightly. Only packages built using binutils
> >= 2.25.51.20151014-1 will be affected.

It seems the other condition is to use glibc 2.22, which contains the
following corresponding commit:

| commit a2057c984e4314c3740f04cf54e36c824e4c8f32
| Author: Matthew Fortune 
| Date:   Thu Jun 11 10:43:48 2015 +0100
| 
| Add support for DT_MIPS_RLD_MAP_REL.
| 
| This tag allows debugging of MIPS position independent executables
| and provides access to shared library information.
| 
| * elf/elf.h (DT_MIPS_RLD_MAP_REL): New macro.
| (DT_MIPS_NUM): Update.
| * sysdeps/mips/dl-machine.h (ELF_MACHINE_DEBUG_SETUP): Handle
| DT_MIPS_RLD_MAP_REL.

Maybe we can temporarily revert this commit until the problem is fixed
in chrpath and the packages are rebuilt. 

> There is a convenient way to test if a package is broken using the
> presence of the old DT_MIPS_RLD_MAP tag. When the tag is correct,
> (DT_MIPS_RLD_MAP_REL + tag offset + executable base address) equals
> DT_MIPS_RLD_MAP, so someone could analyze the archive to find which
> packages are affected (and if any tools other than chrpath and cmake
> are broken).
> 
> Based only on chrpath and cmake reverse dependencies, there is an upper
> bound of about 1500 binNMUs (after the tools are fixed). Hopefully
> that can be reduced!
>
> I really don't have any time to fix all this. Please can someone else
> have a look!

I'll try to do an archive scan asap to get a real idea of how many
packages are affected. After that I'll look at how to fix chrpath, but
help would be welcome as I also don't have a lot of time.

> OpenMPI maintainers (and anyone else affected):
> One possible workaround is to use chrpath -r ""  on mips*
> architectures until this is fixed since that command does not cause any
> tags to be moved. It has a tiny performance penalty but should
> otherwise work properly.

Thanks for the workaround.

Aurelien

-- 
Aurelien Jarno  GPG: 4096R/1DDD8C9B
aurel...@aurel32.net http://www.aurel32.net




Bug#818909: Segfaults caused by new DT_MIPS_RLD_MAP_REL tag and RPATH removers

2016-04-07 Thread James Cowgill
Hi,

I've managed to find the cause of the openmpi segfault (#818909). It
might affect a number of different packages.

The segfault is caused by the interaction of the
new DT_MIPS_RLD_MAP_REL dynamic tag (from binutils 2.26) and chrpath.
Unlike all other tags, this tag is relative to the offset of the tag
within the executable. chrpath is used to remove rpaths from ELF files.
It does this by moving all of the other dynamic tags up one entry, but
since the DT_MIPS_RLD_MAP_REL is not updated, it now points to an
incorrect offset. The dynamic linker will then overwrite some other
memory when processing the DT_MIPS_RLD_MAP_REL tag.
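
To make the failure mode concrete, here is a toy model of the paragraph
above. It is only a sketch: the dynamic section is a plain Python list, and
DYN_ADDR, RLD_MAP_ADDR and the helper names are made up for illustration
(they are not taken from chrpath or glibc). It assumes 8-byte Elf32_Dyn
entries and a DT_MIPS_RLD_MAP_REL tag located after the DT_RPATH entry.

```python
# Toy model: dynamic entries are (tag, value) pairs laid out back to back.
DT_NEEDED = 1
DT_RPATH = 15
DT_MIPS_RLD_MAP_REL = 0x70000035

DYN_ADDR = 0x1000      # hypothetical address of the first dynamic entry
ENTSIZE = 8            # size of one Elf32_Dyn entry
RLD_MAP_ADDR = 0x2000  # hypothetical address the tag is meant to point at


def entry_addr(index):
    """Address of the dynamic entry at the given index."""
    return DYN_ADDR + index * ENTSIZE


def resolve_rld_map(dynamic):
    """Address obtained by adding the tag's value to the tag's own address."""
    for index, (tag, value) in enumerate(dynamic):
        if tag == DT_MIPS_RLD_MAP_REL:
            return entry_addr(index) + value
    return None


def remove_rpath_by_shifting(dynamic):
    """Drop DT_RPATH and move later entries up; values are left untouched."""
    return [(tag, value) for tag, value in dynamic if tag != DT_RPATH]


dynamic = [
    (DT_NEEDED, 0x10),
    (DT_RPATH, 0x20),
    (DT_MIPS_RLD_MAP_REL, RLD_MAP_ADDR - entry_addr(2)),
]

before = resolve_rld_map(dynamic)
after = resolve_rld_map(remove_rpath_by_shifting(dynamic))
print(hex(before), hex(after))
```

In this model the resolved address drops by exactly one entry size once the
tag has moved up, so the write lands on whatever happens to sit just before
the intended location.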

The openmpi segfault was caused by a global variable being initialized
incorrectly (overwritten by the dynamic linker). I expect other
executables using chrpath will also be affected - possibly in strange
ways (not necessarily a segfault).

It also seems that at least cmake uses the same technique for removing
the RPATH so any cmake reverse dependencies could be affected. The
DT_MIPS_RLD_MAP_REL is only created for executables which limits the
effect of this slightly. Only packages built using binutils
>= 2.25.51.20151014-1 will be affected.

There is a convenient way to test if a package is broken using the
presence of the old DT_MIPS_RLD_MAP tag. When the tag is correct,
(DT_MIPS_RLD_MAP_REL + tag offset + executable base address) equals
DT_MIPS_RLD_MAP, so someone could analyze the archive to find which
packages are affected (and if any tools other than chrpath and cmake
are broken).
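
As a worked example of that identity, the check can be written as a single
comparison. This is a sketch with made-up numbers: the function name and the
sample base address, tag offset and DT_MIPS_RLD_MAP value are hypothetical,
and extracting the real values from an ELF file is left out.

```python
def rld_map_rel_is_consistent(base, rld_map, rld_map_rel, tag_offset):
    """True when DT_MIPS_RLD_MAP_REL + tag offset + base == DT_MIPS_RLD_MAP."""
    return rld_map == base + tag_offset + rld_map_rel


# Hypothetical values for an executable with base address 0x400000.
base = 0x400000
tag_offset = 0x1b8        # offset of the DT_MIPS_RLD_MAP_REL entry
rld_map = 0x411000        # value of the old absolute DT_MIPS_RLD_MAP tag
rld_map_rel = rld_map - base - tag_offset

# Tag still at its original offset.
intact = rld_map_rel_is_consistent(base, rld_map, rld_map_rel, tag_offset)
# Tag moved up by one 8-byte entry without its value being updated.
shifted = rld_map_rel_is_consistent(base, rld_map, rld_map_rel, tag_offset - 8)
print(intact, shifted)
```

An executable that lacks the old DT_MIPS_RLD_MAP tag cannot be checked this
way, since there is nothing to compare against.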

Based only on chrpath and cmake reverse dependencies, there is an upper
bound of about 1500 binNMUs (after the tools are fixed). Hopefully
that can be reduced!

I really don't have any time to fix all this. Please can someone else
have a look!

OpenMPI maintainers (and anyone else affected):
One possible workaround is to use chrpath -r ""  on mips*
architectures until this is fixed since that command does not cause any
tags to be moved. It has a tiny performance penalty but should
otherwise work properly.

James
