Re: [Rpm-maint] [rpm-software-management/rpm] debugedit.c patches (#171)

2017-03-06 Thread Jeff Johnson
Another issue that might make debugedit.c more portable is similar to inlining 
DW_FORM* instead of
`#include `

Instead of using EU version dependent #defines for strtab_init() etc (which 
needs -lebl), it may be more "portable" to just inline the routines and 
maintain a copy in debugedit.c.

That would permit non-elfutils -lelf to be used "portably" even if there is 
code duplication.

I dunno (yet) how much code would need to be duplicated, but will take a look 
soonishly.

-- 
You are receiving this because you commented.
Reply to this email directly or view it on GitHub:
https://github.com/rpm-software-management/rpm/issues/171#issuecomment-284468205___
Rpm-maint mailing list
Rpm-maint@lists.rpm.org
http://lists.rpm.org/mailman/listinfo/rpm-maint


Re: [Rpm-maint] [rpm-software-management/rpm] debugedit.c patches (#171)

2017-03-06 Thread Mark Hatle
Re: https://patchwork.openembedded.org/patch/46887

The bug and related information are available at:

https://bugzilla.yoctoproject.org/show_bug.cgi?id=4089

I am currently working through this to see if the current rpm (4) version 
suffers from the same failure.  I will update this if it does or does not.

-- 
You are receiving this because you commented.
Reply to this email directly or view it on GitHub:
https://github.com/rpm-software-management/rpm/issues/171#issuecomment-284464186___
Rpm-maint mailing list
Rpm-maint@lists.rpm.org
http://lists.rpm.org/mailman/listinfo/rpm-maint


Re: [Rpm-maint] [rpm-software-management/rpm] debugedit.c patches (#171)

2017-03-06 Thread Jeff Johnson
Yes all the patches were reversed.

Meanwhile I just sorted the first patch with Mark Hatle (aka "fray"). The 
segfault is not reproducible with EU-0.168.

The hex printing was an attempt to avoid pgpHexStr() which differs between 
rpm.org and rpm5.org.

You explained away the valid_file() check

The other changes to debugedit --help are largely cosmetic. I maintain POPT and 
so am slightly OCD ;-)


-- 
You are receiving this because you commented.
Reply to this email directly or view it on GitHub:
https://github.com/rpm-software-management/rpm/issues/171#issuecomment-284463573___
Rpm-maint mailing list
Rpm-maint@lists.rpm.org
http://lists.rpm.org/mailman/listinfo/rpm-maint


Re: [Rpm-maint] [rpm-software-management/rpm] Allow SOURCE_DATE_EPOCH to override file timestamps (#144)

2017-03-06 Thread Bernhard M. Wiedemann
on the semi-offtopic python .pyc timestamp issue see: 
https://github.com/python/cpython/pull/296

-- 
You are receiving this because you are subscribed to this thread.
Reply to this email directly or view it on GitHub:
https://github.com/rpm-software-management/rpm/pull/144#issuecomment-284450175___
Rpm-maint mailing list
Rpm-maint@lists.rpm.org
http://lists.rpm.org/mailman/listinfo/rpm-maint


Re: [Rpm-maint] [PATCH] debugedit: Support String/Line table rewriting for larger/smaller paths.

2017-03-06 Thread Panu Matilainen

On 03/06/2017 01:15 PM, Mark Wielaard wrote:

On Sat, 2017-03-04 at 11:28 +0200, Panu Matilainen wrote:

I dont feel qualified to really review this, and perhaps others are
feeling the same way since it's been out there for a week now with no
comments at all.

Because this seems quite awesome (even if also a bit scary), to avoid
stalling forever with no-one to review it: if there are no objections
raised by Monday I'm going to just apply it (and the couple of other
recent debuginfo patches). So anybody having doubts, speak up now.


Thanks. Obviously I think the patch is awesome and should be integrated
sooner rather than later :) But being slightly worried this means "you
touched it last, you own it now"

[...]

That's the price of course :)

Applied (all four patches), thanks for all the work and comprehensive 
explanation and self-review!


- Panu -

___
Rpm-maint mailing list
Rpm-maint@lists.rpm.org
http://lists.rpm.org/mailman/listinfo/rpm-maint


Re: [Rpm-maint] [rpm-software-management/rpm] debug edit build generation is endian dependent when cross-compiling? (#171)

2017-03-06 Thread Mark Wielaard
On Sun, 2017-03-05 at 09:58 -0800, Jeff Johnson wrote:
> Portability of debugedit.c on systems that do not support #include
>  (solaris, *BSD) can be achieved by duplicating certain DWARF
> standard constants:

I think you reversed the patch you are proposing again.
I'll leave it to others to say whether it makes sense or if rpm should
just require up to date elf.h and dwarf.h headers be present.

Cheers,

Mark
___
Rpm-maint mailing list
Rpm-maint@lists.rpm.org
http://lists.rpm.org/mailman/listinfo/rpm-maint


Re: [Rpm-maint] [rpm-software-management/rpm] debug edit build generation is endian dependent when cross-compiling? (#171)

2017-03-06 Thread Mark Wielaard
On Sun, 2017-03-05 at 09:48 -0800, Jeff Johnson wrote:
> For rpm portability with older versions of elfutils, this harmless sanity 
> check might be useful:
> 
> (excuse the different hash functions in use in the patch snippet below)
> @@ -1533,40 +1493,12 @@
>/* Now format the build ID bits in hex to print out.  */
>{   
>  const uint8_t * id = (uint8_t *)build_id->d_buf + build_id_offset;
> -char hex[build_id_size * 2 + 1];
> -int n = snprintf (hex, 3, "%02" PRIx8, id[0]);
> -assert (n == 2);
> -for (i = 1; i < (int)build_id_size; ++i)
> -  {
> -   n = snprintf ([i * 2], 3, "%02" PRIx8, id[i]);
> -   assert (n == 2);
> -  }
> +char *hex = pgpHexStr(id, build_id_size);
>  puts (hex);
> +free(hex);
>}   
>  }

I don't understand this hunk. What is it attempting to do?

Also I assume you posted the reverse patch?

> -/* It avoided the segment fault while file's bss offset have a large number.
> -   See https://bugzilla.redhat.com/show_bug.cgi?id=1019707
> -   https://bugzilla.redhat.com/show_bug.cgi?id=1020842 for detail. */

This is indeed a bug in elfutils, fixed in elfutils 0.162.
But also only occurs for bogus/bad ELF files in the first place.

> -void valid_file(int fd)
> -{
> -  Elf *elf = elf_begin (fd, ELF_C_RDWR, NULL);
> -  if (elf == NULL)
> -  {
> -error (1, 0, "elf_begin: %s", elf_errmsg (-1));
> -return;
> -  }
> -
> -  elf_flagelf (elf, ELF_C_SET, ELF_F_LAYOUT);
> -
> -  if (elf_update (elf, ELF_C_WRITE) < 0)
> -error (1, 0, "elf_update: %s", elf_errmsg (-1));
> -
> -  elf_end (elf);
> -
> -  return;
> -}

So this opens and closes the ELF file and tries to write it out to disk,
but since nothing changed all this does is make libelf do a layout check
of all headers. Where it would notice nothing changed, so ELF_C_WRITE
would not actually write something out. Except if it does see a
discrepancy and might "correct" a header field. It sounds slightly safer
to do an ELF_C_UPDATE so that nothing might accidentally be written out.

Although it might indeed catch the issue with the bogus ELF file earlier
(and so prevent a crash with a buggy elfutils) I am not sure it is
really the task of rpm/debugedit to make sure the user doesn't have
bogus/bad ELF files and/or an old/buggy libelf implementation. But if
you don't mind opening/closing/reading the ELF file twice this might
indeed not be that expensive (compared to all the other work debugedit
does). It is somewhat bogus and obscure though.

Cheers,

Mark
___
Rpm-maint mailing list
Rpm-maint@lists.rpm.org
http://lists.rpm.org/mailman/listinfo/rpm-maint


Re: [Rpm-maint] [rpm-software-management/rpm] debug edit build generation is endian dependent when cross-compiling? (#171)

2017-03-06 Thread Mark Wielaard
On Sun, 2017-03-05 at 09:09 -0800, Jeff Johnson wrote:
> rpm5.org carries this patch from Yocto:
>   https://patchwork.openembedded.org/patch/46887
> which likely should be added when updating debug edit.c soon.

It is slightly hard to review a patch to a patch.
Could you extract the precise patch you suggest would be helpful and
post it to rpm-maint for review?

Also I suspect that the latest patch that moved away from using
ELF_C_RDWR_MMAP makes the original problem also go away.

Thanks,

Mark
___
Rpm-maint mailing list
Rpm-maint@lists.rpm.org
http://lists.rpm.org/mailman/listinfo/rpm-maint


Re: [Rpm-maint] [PATCH] debugedit: Support String/Line table rewriting for larger/smaller paths.

2017-03-06 Thread Mark Wielaard
On Sat, 2017-03-04 at 11:28 +0200, Panu Matilainen wrote:
> I dont feel qualified to really review this, and perhaps others are 
> feeling the same way since it's been out there for a week now with no 
> comments at all.
> 
> Because this seems quite awesome (even if also a bit scary), to avoid 
> stalling forever with no-one to review it: if there are no objections 
> raised by Monday I'm going to just apply it (and the couple of other 
> recent debuginfo patches). So anybody having doubts, speak up now.

Thanks. Obviously I think the patch is awesome and should be integrated
sooner rather than later :) But being slightly worried this means "you
touched it last, you own it now", let me do at least my own review.

Questions I would ask about this patch if someone else would have
submitted it:

- Why depend on elfutils libebl/libdw for the .debug_str string table
  reconstruction? Isn't a string table just a simple concatenation of
  strings?

It is indeed a simple concatenation of strings, but to pack them
efficiently substrings can be reused/shared. e.g. if the table contains
the string "easy_and_slow" at index 16, then it also contains the string
"slow" at index 25. The elfutils strtab code already handles this
efficiently, so I wanted to depend on that instead of open coding it.

- Why introduce those string storage allocation blocks? Can't you just
  malloc whenever you need a new string?

To make sure we don't leak too much memory we need to track our mallocs
already. strings come either from the existing .debug_str table or from
newly created (replacement) strings. I felt it was a good idea to pack
those strings as closely together because they might be compared often.
So keep them close together helps data cache locality. But this was
probably premature optimisation since we end up at the strings
indirectly through the original indexes and we compare those first (if
the original indexes are equal then the (replacement) strings would be
the same too (and so we end up not having to compare that many strings
anyway). But we needed a structure to keep track of our allocated
strings anyway. So I kept it as is.

- To compare/find existing string table indexes the code uses a binary
  search tree (tsearch), but for line table indexes the code uses a
  simple (sorted) array (bsearch). Why different mechanisms for the same
  kind of lookup?

There are much, much more string table indexes than line table offsets
(e.g. libstdc++.so has 125 line tables, and 10331 full strings [ignoring
shared "tails"]). So for line tables we don't mind if we sometimes have
to do a quick linear search (in phase 0). But for the string indexes it
is important to always be able to search/insert them efficiently. So for
the string table indexes we always use the binary search tree balanced.
For line tables we simply sort the array when we collected all offsets.
The binary search tree does have a bit more memory overhead, but it was
worth it for the fast lookups [*].

- Does the switch from mmap and rawdata to simple read/write and
  elf_data cause much more memory usage?

Yes and no. When mmapping the file the memory is still allocated, but
might not really be used. We do in most cases end up using most of it
because for recalculating the build-id we need to mmap/read in all
section data. The reading in of the ELF data through elf_getdata is
slightly less efficient that elf_rawdata because it might have to do
endian conversion if the ELF file is not native. But in practice it
always is (we just build it). We can probably do better by being smarter
about when/how we update the build-ids. I didn't try to do that yet.

- The relocation rewriting seems scary.

Yes it is :{ Luckily this is only needed for object files (ET_REL) and
normally they aren't packaged. The big exceptions are kernel modules
(why those are ET_REL files and not simply ET_DYN is beyond me) and
static ar archives. If we would have to handle all possible
(architecture specific) relocations this would be really impossible.
Luckily for the indexes into .debug sections only simple relocation
types are used (basically it provides an extra offset into the section).
The line table relocations pointing into code sections are also simple
because we don't care about the actual relocation type, just where in
the section they are. So we can simply move the relocation offset.
We don't handle static ar archives at all. Which is something we might
want to look at at some later point.

- The code ends up touching allocated sections! Isn't that a no-no?

Yes. Anything allocated really shouldn't be touched. Only the
non-allocated (debug) sections should be touched/written out. Sadly a
bug in elfutils means we do end up having to mark the allocated sections
dirty to make sure the whole ELF file ends up correct on disk. This is
really unfortunate and might make things a bit slower. But no
(allocated) data is really changed at all. And once the bug is fixed (in
elfutils 0.169) this 

Re: [Rpm-maint] [PATCH] debugedit: Support String/Line table rewriting for larger/smaller paths.

2017-03-06 Thread Thierry Vignaud
On 4 March 2017 at 10:28, Panu Matilainen  wrote:
>> From: Mark Wielaard 
>>
>> debugedit --base to --dest rewriting of debug source file paths only
>> supported dest paths that were smaller or equal than the base path
>> (and the size should differ more than 1 character for correct debug
>> lines).
>> All paths were changed "in place". Which could in theory mess up debug str
>> sharing.
>>
>> This rewrite supports base and dest strings of any size (some limitations,
>> see below). This is done by reconstructing the debug_str and debug_line
>> tables and updating the references in the debug_info attributes pointing
>> to these tables. Plus, if necessary (only for ET_REL kernel modules),
>> updating any relocations for the debug_info and debug_line sections.
>>
>> This has the nice benefit of merging any duplicate strings in the
>> debug_str table which might resulting on slightly smaller files.
>> kernel modules are ET_REL files that often contain a lot of duplicate
>> strings.
>>
>> The rewrite uses elfutils (either libebl or libdw) to reconstruct the
>> debug_str table. Since we are changing some section sizes now we cannot
>> just use mmap and rawdata to poke the values, but need to read in and
>> write out the changed sections. This does take a bit more memory because
>> we now also need to keep track of all string/line references.
>
>
> I dont feel qualified to really review this, and perhaps others are feeling
> the same way since it's been out there for a week now with no comments at
> all.
>
> Because this seems quite awesome (even if also a bit scary), to avoid
> stalling forever with no-one to review it: if there are no objections raised
> by Monday I'm going to just apply it (and the couple of other recent
> debuginfo patches). So anybody having doubts, speak up now.

If it does fix the infamous "Dest dir longer than base dir is not
supported", the merrier.
Though that didn't impact that much users in practice (but we did get
a report in mga a long time ago from one guy who set _topdir to
/RPM...).
___
Rpm-maint mailing list
Rpm-maint@lists.rpm.org
http://lists.rpm.org/mailman/listinfo/rpm-maint