On Mon, 2019-07-29 at 16:24 +0200, Mark Wielaard wrote: > On Mon, Jul 29, 2019 at 10:43:56AM +0200, Florian Weimer wrote: > > * Mark Wielaard: > > > > > + if (elf == NULL) > > > + { > > > + /* This likely means it just isn't an ELF file, probably not a > > > + real issue, but warn if verbose reporting. */ > > > + if (verbose > 0) > > > + fprintf (stderr, "warning: %s: %s\n", current_path, elf_errmsg (-1)); > > > + return false; > > > + } > > > > Is it possible to distinguish the error from a memory allocation error? > > It would be wrong to mis-classify a file just because the system is low > > on memory. > > You are right this is not the proper way to report the issue. > Normally, when just using elf_begin, a NULL return should be reported > through elf_issue (which will set the issues flag). > > But, because I added -z, we are using either elf_begin or > dwelf_elf_begin. dwelf_elf_begin will return NULL (instead of a an > empty (ELF_K_NONE) Elf descriptor when there is an issue, or the > (decompressed) file wasn't an ELF file. > > So we should split the error reporting. If we called elf_begin and get > NULL we should call elf_issue to report the proper issue. > > If we called dwefl_elf_begin and we get NULL, I am not sure yet what > the proper way is to detect whether it is a real issue, or "just" not > a (decompressed) ELF file. I am afraid the current handling is the > best we can do. > > Maybe we can fix dwelf_elf_begin to return an empty (ELF_K_NONE) Elf > descriptor if there was no issue, but the (decompressed) file wasn't > an ELF file.
Sorry this took so long. And this is indeed the last issue holding up the release. But this is a tricky problem. We made a mistake when we wrote the contract for dwelf_elf_begin by saying it would never return ELF_K_NONE. That made it different from elf_begin and made it impossible to distinguish between a real (file or decompression) error and whether the file simply wasn't an ELF file and also wasn't a compressed ELF file. I think we should fix the contract. Technically it would be an API break, but I think no user is really relying on the fact that the Elf handle returned is never ELF_K_NONE. Users still need to distinguish between ELF_K_ELF and ELF_K_AR (and theoretically any other ELF_K_type, like COFF, which we currently don't support, but we do define it). So that is what the attached patch does. I also audited all decompression code to make sure it returns error codes consistently. The decompression will either decompress successfully and return DWFL_E_NOERROR, or if the file wasn't compressed (or an embedded image) it will return DWFL_E_BADELF. In all other cases (file or decompression error) it will set a a different DWFL_E error. This "only" leaves the problem that we don't have a good way to translate those errors into "real" libelf error codes. So for now we just fake one if it wasn't an elf_errno value. I don't intent to try to solve this error translation issue before the release (I don't know how to do it yet). What do you think about this change to dwelf_elf_begin? The change would make it possible to detect real errors in the elfclassify code, whether elf_begin or dwelf_elf_begin was used. So we would not misclassify files (but return an error status of 2). Thanks, Mark
From 648837a9f1be7628e9ceee6818bf56c80b9d3fa1 Mon Sep 17 00:00:00 2001 From: Mark Wielaard <m...@klomp.org> Date: Mon, 12 Aug 2019 00:43:22 +0200 Subject: [PATCH] libdwelf: Make dwelf_elf_begin return NULL only when there is an error. dwelf_elf_begin was slightly different from elf_begin in case the file turned out to not be an ELF file. elf_begin would return an Elf handle with ELF_K_NONE. But dwelf_elf_begin would return NULL. This made it impossible to tell the difference between a file or decompression error and a (decompressed) file not being an ELF file. Since dwelf_elf_begin could still return different kinds of ELF files (ELF_K_ELF or ELF_K_AR - and theoretically ELF_K_COFF) this was not really useful anyway. So make it so that dwelf_elf_begin always returns an Elf handle unless there was a real error reading or decompressing the file. Otherwise return NULL to make clear there was a real error. Make sure that the decompression function returns DWFL_E_BADELF only when the file isn't compressed. In which case the Elf handle won't be replaced and can be returned (as ELF_K_NONE). Add a new version to dwelf_elf_begin so programs can rely on it returning NULL only for real errors. Signed-off-by: Mark Wielaard <m...@klomp.org> --- libdw/ChangeLog | 4 ++++ libdw/libdw.map | 4 ++++ libdwelf/ChangeLog | 6 ++++++ libdwelf/dwelf_elf_begin.c | 12 +++++++----- libdwelf/libdwelf.h | 9 ++++++--- libdwfl/ChangeLog | 8 ++++++++ libdwfl/gzip.c | 5 +++-- libdwfl/open.c | 10 +++++++--- 8 files changed, 45 insertions(+), 13 deletions(-) diff --git a/libdw/ChangeLog b/libdw/ChangeLog index 6b779e77..bf1f4857 100644 --- a/libdw/ChangeLog +++ b/libdw/ChangeLog @@ -1,3 +1,7 @@ +2019-08-12 Mark Wielaard <m...@klomp.org> + + * libdw.map (ELFUTILS_0.177): Add new version of dwelf_elf_begin. + 2019-06-28 Mark Wielaard <m...@klomp.org> * libdw.map (ELFUTILS_0.177): New section. Add diff --git a/libdw/libdw.map b/libdw/libdw.map index 2e1c0e9e..decac05c 100644 --- a/libdw/libdw.map +++ b/libdw/libdw.map @@ -365,4 +365,8 @@ ELFUTILS_0.175 { ELFUTILS_0.177 { global: dwelf_elf_e_machine_string; + # Replaced ELFUTILS_0.175 versions. Both versions point to the + # same implementation, but users of the new symbol version can + # presume that NULL is only returned on error (otherwise ELF_K_NONE). + dwelf_elf_begin; } ELFUTILS_0.175; diff --git a/libdwelf/ChangeLog b/libdwelf/ChangeLog index 29f9a509..5b48ed8f 100644 --- a/libdwelf/ChangeLog +++ b/libdwelf/ChangeLog @@ -1,3 +1,9 @@ +2019-08-12 Mark Wielaard <m...@klomp.org> + + * libdwelf.h (dwelf_elf_begin): Update documentation. + * dwelf_elf_begin.c (dwelf_elf_begin): Don't suppress ELF_K_NONE. + Mark old and new version. + 2019-06-28 Mark Wielaard <m...@klomp.org> * Makefile.am (libdwelf_a_SOURCES): Add dwelf_elf_e_machine_string.c. diff --git a/libdwelf/dwelf_elf_begin.c b/libdwelf/dwelf_elf_begin.c index 79825338..c7d63a1c 100644 --- a/libdwelf/dwelf_elf_begin.c +++ b/libdwelf/dwelf_elf_begin.c @@ -41,13 +41,13 @@ dwelf_elf_begin (int fd) { Elf *elf = NULL; Dwfl_Error e = __libdw_open_elf (fd, &elf); - if (elf != NULL && elf_kind (elf) != ELF_K_NONE) + if (e == DWFL_E_NOERROR) return elf; - /* Elf wasn't usable. Make sure there is a proper elf error message. */ - - if (elf != NULL) - elf_end (elf); + /* Elf wasn't usable. Make sure there is a proper elf error + message. This is probably not the real error, because there is + no good way to propagate errnos or decompression errors, but + better than nothing. */ if (e != DWFL_E_LIBELF) { @@ -60,3 +60,5 @@ dwelf_elf_begin (int fd) return NULL; } +OLD_VERSION (dwelf_elf_begin, ELFUTILS_0.175) +NEW_VERSION (dwelf_elf_begin, ELFUTILS_0.177) diff --git a/libdwelf/libdwelf.h b/libdwelf/libdwelf.h index cb7ea091..dbb8f08c 100644 --- a/libdwelf/libdwelf.h +++ b/libdwelf/libdwelf.h @@ -128,9 +128,12 @@ extern void dwelf_strtab_free (Dwelf_Strtab *st) /* Creates a read-only Elf handle from the given file handle. The file may be compressed and/or contain a linux kernel image header, in which case it is eagerly decompressed in full and the Elf handle - is created as if created with elf_memory (). On error NULL is - returned. The Elf handle should be closed with elf_end (). The - file handle will not be closed. Does not return ELF_K_NONE handles. */ + is created as if created with elf_memory (). On decompression or + file errors NULL is returned (and elf_errno will be set). If there + was no error, but the file is not an ELF file, then an ELF_K_NONE + Elf handle is returned (just like with elf_begin). The Elf handle + should be closed with elf_end (). The file handle will not be + closed. */ extern Elf *dwelf_elf_begin (int fd); /* Returns a human readable string for the given ELF header e_machine diff --git a/libdwfl/ChangeLog b/libdwfl/ChangeLog index 8cbe90c9..04a39637 100644 --- a/libdwfl/ChangeLog +++ b/libdwfl/ChangeLog @@ -1,3 +1,11 @@ +2019-08-12 Mark Wielaard <m...@klomp.org> + + * gzip.c (open_stream): Return DWFL_E_ERRNO on bad file operation. + * open.c (libdw_open_elf): New argument bad_elf_ok. Check it and + return DWFL_E_NOERROR in case it is set and error was DWFL_E_BADELF. + (__libdw_open_file): Call libdw_open_elf with bad_elf_ok false. + (__libdw_open_elf): Call libdw_open_elf with bad_elf_ok true. + 2019-08-05 Omar Sandoval <osan...@fb.com> * dwfl_segment_report_module.c (dwfl_segment_report_module): Assign diff --git a/libdwfl/gzip.c b/libdwfl/gzip.c index c2c13baf..043d0b6e 100644 --- a/libdwfl/gzip.c +++ b/libdwfl/gzip.c @@ -139,14 +139,14 @@ open_stream (int fd, off_t start_offset, struct unzip_state *state) { int d = dup (fd); if (unlikely (d < 0)) - return DWFL_E_BADELF; + return DWFL_E_ERRNO; if (start_offset != 0) { off_t off = lseek (d, start_offset, SEEK_SET); if (off != start_offset) { close (d); - return DWFL_E_BADELF; + return DWFL_E_ERRNO; } } state->zf = gzdopen (d, "r"); @@ -288,6 +288,7 @@ unzip (int fd, off_t start_offset, if (result == DWFL_E_NOERROR && gzdirect (state.zf)) { gzclose (state.zf); + /* Not a compressed stream after all. */ return fail (&state, DWFL_E_BADELF); } diff --git a/libdwfl/open.c b/libdwfl/open.c index 74367359..35fc5283 100644 --- a/libdwfl/open.c +++ b/libdwfl/open.c @@ -118,7 +118,7 @@ what_kind (int fd, Elf **elfp, Elf_Kind *kind, bool *may_close_fd) static Dwfl_Error libdw_open_elf (int *fdp, Elf **elfp, bool close_on_fail, bool archive_ok, - bool never_close_fd) + bool never_close_fd, bool bad_elf_ok) { bool may_close_fd = false; @@ -164,6 +164,10 @@ libdw_open_elf (int *fdp, Elf **elfp, bool close_on_fail, bool archive_ok, && !(archive_ok && kind == ELF_K_AR)) error = DWFL_E_BADELF; + /* This basically means, we keep a ELF_K_NONE Elf handle and return it. */ + if (bad_elf_ok && error == DWFL_E_BADELF) + error = DWFL_E_NOERROR; + if (error != DWFL_E_NOERROR) { elf_end (elf); @@ -184,11 +188,11 @@ libdw_open_elf (int *fdp, Elf **elfp, bool close_on_fail, bool archive_ok, Dwfl_Error internal_function __libdw_open_file (int *fdp, Elf **elfp, bool close_on_fail, bool archive_ok) { - return libdw_open_elf (fdp, elfp, close_on_fail, archive_ok, false); + return libdw_open_elf (fdp, elfp, close_on_fail, archive_ok, false, false); } Dwfl_Error internal_function __libdw_open_elf (int fd, Elf **elfp) { - return libdw_open_elf (&fd, elfp, false, true, true); + return libdw_open_elf (&fd, elfp, false, true, true, true); } -- 2.18.1