Re: [PATCH] PR30879: debuginfod intermittent terminate()

2023-10-01 Thread Frank Ch. Eigler via Elfutils-devel
Hi -

> The new terminate_handler seems really useful, I wonder why something
> like this isn't the default behaviour.

Yeah.

> I think I understand the idea behind wrapping scan () to catch the
> reportable_exception.

Yup.

> The sqlite_checkpoint_pb looks simpler now. But I must admit I don't
> fully understand if the lifetime of the object. The sqlite_ps had a
> destructor that called finalize, which I assume isn't necessary?

In the new case, we're calling "sqlite3_execute" directly, which does
not need finalization.

> The code looks good to me.

Thanks!

> Did you already catch (haha, pun intended) where/what caused that odd
> sparc uncaught exception?

Nope. :( In the cases I saw, the error occurred right during the
object ctor, but was not able to trigger it by hand or get a core dump
to analyze further.

- FChE



Re: [PATCH 00/14] elfutils: DWARF package (.dwp) file support

2023-09-27 Thread Frank Ch. Eigler via Elfutils-devel
Hi -

> This patch series adds support for DWARF package files to libdw and the
> elfutils tools. It supports the GNU DebugFission format for DWARF 4 [1]
> and the format standardized in DWARF 5 (section 7.3.5 "DWARF Package
> Files"). It supports both automatically opening the .dwp file for a
> skeleton unit and examining the .dwp file alone, just like for .dwo
> files. [...]

Does this work have any implications for debuginfod?

- FChE



[PATCH] PR30879: debuginfod intermittent terminate()

2023-09-22 Thread Frank Ch. Eigler via Elfutils-devel


Author: Frank Ch. Eigler 
Date:   Fri Sep 22 15:30:51 2023 -0400

PR30879: intermittent debuginfod crash with unhandled exception

Code inspection identified two places where sqlite_ps objects were
being created/used outside try/catch protection.  This patch wraps or
replaces them.

* configure.ac: Look for glibc backtrace headers.
* debuginfod.cxx (scan): New function wrapped by a try/catch loop.
  (sqlite_checkpoint_pb): Use non-exception-producing sqlite functions.
  (main, my_terminate_handler): New terminate() handler.

diff --git a/configure.ac b/configure.ac
index 4b67c84425fa..29ed32feaee6 100644
--- a/configure.ac
+++ b/configure.ac
@@ -839,6 +839,7 @@ AS_IF([test "x$enable_libdebuginfod" = "xdummy"],
   [AC_DEFINE([DUMMY_LIBDEBUGINFOD], [1], [Build dummy libdebuginfod])])
 AM_CONDITIONAL([LIBDEBUGINFOD],[test "x$enable_libdebuginfod" = "xyes" || test 
"x$enable_libdebuginfod" = "xdummy"])
 AM_CONDITIONAL([DUMMY_LIBDEBUGINFOD],[test "x$enable_libdebuginfod" = 
"xdummy"])
+AC_CHECK_HEADERS([execinfo.h])
 
 # Look for libmicrohttpd, libarchive, sqlite for debuginfo server
 # minimum versions as per rhel7.
diff --git a/debuginfod/debuginfod.cxx b/debuginfod/debuginfod.cxx
index d72d2ad16960..e53228803bb0 100644
--- a/debuginfod/debuginfod.cxx
+++ b/debuginfod/debuginfod.cxx
@@ -44,6 +44,12 @@ extern "C" {
 }
 #endif
 
+#ifdef HAVE_EXECINFO_H
+extern "C" {
+#include 
+}
+#endif
+
 extern "C" {
 #include "printversion.h"
 #include "system.h"
@@ -95,6 +101,7 @@ extern "C" {
 #include 
 #include 
 #include 
+#include 
 #include 
 // #include  // on rhel7 gcc 4.8, not competent
 #include 
@@ -1152,22 +1159,13 @@ struct sqlite_ps
 
 struct sqlite_checkpoint_pb: public periodic_barrier
 {
-  sqlite_ps ckpt;
-
+  // NB: don't use sqlite_ps since it can throw exceptions during ctor etc.
   sqlite_checkpoint_pb(unsigned t, unsigned p):
-periodic_barrier(t, p), ckpt(db, "periodic wal checkpoint",
- "pragma wal_checkpoint(truncate);") {}
+periodic_barrier(t, p) { }
   
   void periodic_barrier_work() noexcept
   {
-try
-  {
-ckpt.reset().step_ok_done();
-  }
-catch (const reportable_exception& e)
-  {
-e.report(clog);
-  }
+(void) sqlite3_exec (db, "pragma wal_checkpoint(truncate);", NULL, NULL, 
NULL);
   }
 };
   
@@ -3714,11 +3712,9 @@ scan_archive_file (const string& rps, const stat_t& st,
 // The thread that consumes file names off of the scanq.  We hold
 // the persistent sqlite_ps's at this level and delegate file/archive
 // scanning to other functions.
-static void*
-thread_main_scanner (void* arg)
+static void
+scan ()
 {
-  (void) arg;
-
   // all the prepared statements fit to use, the _f_ set:
   sqlite_ps ps_f_upsert_buildids (db, "file-buildids-intern", "insert or 
ignore into " BUILDIDS "_buildids VALUES (NULL, ?);");
   sqlite_ps ps_f_upsert_fileparts (db, "file-fileparts-intern", "insert or 
ignore into " BUILDIDS "_fileparts VALUES (NULL, ?);");
@@ -3845,8 +3841,25 @@ thread_main_scanner (void* arg)
   inc_metric("thread_work_total","role","scan");
 }
 
-
   add_metric("thread_busy", "role", "scan", -1);
+}
+
+
+// Use this function as the thread entry point, so it can catch our
+// fleet of exceptions (incl. the sqlite_ps ctors) and report.
+static void*
+thread_main_scanner (void* arg)
+{
+  (void) arg;
+  while (! interrupted)
+try
+  {
+scan();
+  }
+catch (const reportable_exception& e)
+  {
+e.report(cerr);
+  }
   return 0;
 }
 
@@ -4359,6 +4372,20 @@ default_concurrency() // guaranteed >= 1
 }
 
 
+// 30879: Something to help out in case of an uncaught exception.
+void my_terminate_handler()
+{
+#if defined(__GLIBC__)
+  void *array[40];
+  int size = backtrace (array, 40);
+  backtrace_symbols_fd (array, size, STDERR_FILENO);
+#endif
+#if defined(__GLIBCXX__) || defined(__GLIBCPP__)
+  __gnu_cxx::__verbose_terminate_handler();
+#endif
+  abort();
+}
+
 
 int
 main (int argc, char *argv[])
@@ -4367,6 +4394,8 @@ main (int argc, char *argv[])
   (void) bindtextdomain (PACKAGE_TARNAME, LOCALEDIR);
   (void) textdomain (PACKAGE_TARNAME);
 
+  std::set_terminate(& my_terminate_handler);
+
   /* Tell the library which version we are expecting.  */
   elf_version (EV_CURRENT);
 


Re: Building Elfutils with Mingw32

2023-09-18 Thread Frank Ch. Eigler via Elfutils-devel
Hi -

> [...] I would prefer to use MinGW instead of Cygwin. For some extra
> context, we use the Yocto Project to build our Embedded Linux
> images, which can provide a cross compiler toolchain and other host
> development tools for Windows using MinGW. [...]

FWIW, I'm not aware of any conflict between these toolsets, to the
extent that anything would prevent you from using together some tools
built for the mingw runtime, and other tools built for cygwin.

- FChE



Re: Building Elfutils with Mingw32

2023-09-13 Thread Frank Ch. Eigler via Elfutils-devel
Hi, Colin -

> I'm currently trying to determine the level of effort required to
> compile Elfutils for Windows using MinGW. I'd like to get a version
> of Elfutils compiled with libdebginfod in order to compile GDB with
> debuginfod support on Windows. I've currently explored two
> avenues...

Nice.  Please be aware of the distinct configury options to build the
debuginfod client (library linked into gdb) and the server (program).
Some of the packages you're looking for ("fts") are only used on the
server, so you wouldn't need them for your gdb project.

.../configure --help:

  --enable-libdebuginfod  Build debuginfod client library (can be =dummy)
  --enable-debuginfod Build debuginfod server
  --enable-debuginfod-urls[=URLS]

(I'm sorry I can't be more helpful with your substantial questions.)


- FChE



[PATCH] PR28204, debuginfod IMA

2023-09-07 Thread Frank Ch. Eigler via Elfutils-devel
Hi -

Here's a squashed/rebased version of the big IMA patch.  I also
tweaked a few documentation oriented bits, and removed the
"ima:default" tag.


commit 4e45a08aee42958298a3fad6043cbf96243d13a5 (HEAD -> 
users/fche/try-bz28204, origin/users/fche/try-bz28204)
Author: Ryan Goldberg 
Date:   Mon Aug 14 13:51:00 2023 -0400

debuginfod: PR28204 - RPM IMA per-file signature verification

Recent versions of Fedora/RHEL include per-file cryptographic
signatures in RPMs, not just an overall RPM signature.  This work
extends debuginfod client & server to extract, transfer, and verify
those signatures.  These allow clients to assure users that the
downloaded files have not been corrupted since their original
packaging.  Downloads that fail the test are rejected.

Clients may select a desired level of enforcement for sets of URLs in
the DEBUGINFOD_URLS by inserting special markers ahead of them:

ima:ignore   pay no attention to absence or presence of signatures
ima:permissive   require signatures to be correct, but accept absent 
signatures
ima:enforcingrequire every file to be correctly signed

The default is ima:permissive mode, which allows signatures to
function like a checksum to detect accidental corruption, but accepts
operation in a mix of signed and unsigned packages & servers.

NB: debuginfod section queries are excluded from signature
verification at this time, and function as though ima:ignore were in
effect.

IMA signatures are verified against a set of signing certificates.
These are normally published by distributions.  A selection of such
certificates is included with the debuginfod client, but some system
directories are also searched.  See $DEBUGINFOD_IMA_CERT_PATH.  These
certificates are assumed trusted.

Signed-off-by: Ryan Goldberg 
Signed-off-by: Frank Ch. Eigler 

diff --git a/ChangeLog b/ChangeLog
index 6aed95b6974e..b3b1a8ebc93a 100644
--- a/ChangeLog
+++ b/ChangeLog
@@ -1,3 +1,9 @@
+2023-08-14  Ryan Goldberg  
+
+   * configure.ac (ENABLE_IMA_VERIFICATION): Look for librpm, libimaevm 
and libcrypto
+   * (DEBUGINFOD_IMA_CERT_PATH): Default path for ima certificate 
containing
+   dirs. See also profile.*.in.
+
 2023-03-27  Di Chen  
 
* NEWS: Support readelf -Ds for using dynamic segment to
diff --git a/config/ChangeLog b/config/ChangeLog
index ce1f74f621aa..30fc3deea09e 100644
--- a/config/ChangeLog
+++ b/config/ChangeLog
@@ -1,3 +1,10 @@
+2023-08-14  Ryan Goldberg  
+
+   * profile.csh.in: Set DEBUGINFOD_IMA_CERT_PATH directly.
+   * profile.sh.in: Set DEBUGINFOD_IMA_CERT_PATH directly.
+   * elfutils.spec.in: Add BuildRequires rpm-devel,
+   ima-evm-utils-devel, openssl-devel, rpm-sign.
+
 2023-02-21  Mark Wielaard  
 
* eu.am (USE_AFTER_FREE3_WARNING): Define.
diff --git a/config/elfutils.spec.in b/config/elfutils.spec.in
index 9277c08f7c82..2e962bb40d07 100644
--- a/config/elfutils.spec.in
+++ b/config/elfutils.spec.in
@@ -43,6 +43,12 @@ BuildRequires: curl
 # For run-debuginfod-response-headers.sh test case
 BuildRequires: socat
 
+# For debuginfod rpm IMA verification
+BuildRequires: rpm-devel
+BuildRequires: ima-evm-utils-devel
+BuildRequires: openssl-devel
+BuildRequires: rpm-sign
+
 %define _gnu %{nil}
 %define _programprefix eu-
 
diff --git a/config/profile.csh.in b/config/profile.csh.in
index d962d969c05b..2a2ecacb3c80 100644
--- a/config/profile.csh.in
+++ b/config/profile.csh.in
@@ -4,13 +4,17 @@
 # See also [man debuginfod-client-config] for other environment variables
 # such as $DEBUGINFOD_MAXSIZE, $DEBUGINFOD_MAXTIME, $DEBUGINFOD_PROGRESS.
 
+set prefix="@prefix@"
 if (! $?DEBUGINFOD_URLS) then
-set prefix="@prefix@"
 set DEBUGINFOD_URLS=`sh -c 'cat /dev/null "$0"/*.urls 2>/dev/null; :' 
"@sysconfdir@/debuginfod" | tr '\n' ' '`
 if ( "$DEBUGINFOD_URLS" != "" ) then
 setenv DEBUGINFOD_URLS "$DEBUGINFOD_URLS"
+if (! $?DEBUGINFOD_IMA_CERT_PATH) then
+set 
DEBUGINFOD_IMA_CERT_PATH="@sysconfdir@/debuginfod/ima-certs/:@DEBUGINFOD_IMA_CERT_PATH@"
+setenv DEBUGINFOD_IMA_CERT_PATH "$DEBUGINFOD_IMA_CERT_PATH"
+endif
 else
 unset DEBUGINFOD_URLS
 endif
-unset prefix
 endif
+unset prefix
diff --git a/config/profile.sh.in b/config/profile.sh.in
index 3f4397dcb44d..adc06a7ed939 100644
--- a/config/profile.sh.in
+++ b/config/profile.sh.in
@@ -4,9 +4,14 @@
 # See also [man debuginfod-client-config] for other environment variables
 # such as $DEBUGINFOD_MAXSIZE, $DEBUGINFOD_MAXTIME, $DEBUGINFOD_PROGRESS.
 
+prefix="@prefix@"
 if [ -z "$DEBUGINFOD_URLS" ]; then
-prefix="@prefix@"
 DEBUGINFOD_URLS=$(cat /dev/null "@sysconfdir@/debuginfod"/*.urls 
2>/dev/null | tr '\n' ' ')
 [ -n "$DEBUGINFOD_URLS" ] && export DEBUGINFOD_URLS || unset 
DEBUGINFOD_URLS
-unset prefix
+
+if [ -z "$DEBUGINFOD_IMA_CERT_PATH" ]; then

PR30809, was Re: [PATCH v2 2/2] debuginfod-client.c: Fix x-debuginfod-size counts differently than CURLINFO_SIZE_DOWNLOAD_T

2023-08-29 Thread Frank Ch. Eigler via Elfutils-devel
Hi -

> What is the status of this patch/discussion?

U forgot about it.  But that's OK, filed PR30809, and wrote &
tested this little patch:

commit 3ef3fab0d64c89a52dd6e2ce0d01dd5e713d7b5a
Author: Frank Ch. Eigler 
Date:   Tue Aug 29 14:08:04 2023 -0400

PR30809: improve debuginfod client progress-callback parameters

* debuginfod-client.c (debuginfod_query_server): Use fstat(3)
  of the file handle being downloaded into as the preferred
  source of download progress.

Tested by hand, as the testsuite doesn't have enough machinery to
simulate compressed vs. uncompressed service.  Hand testing with
(unmodified) fedora-38 gdb and debuginfod-find shows dramatically
improved progress displays: all have quantitative figures when
fetching from real (unmodified) upstream servers.

Signed-off-by: Frank Ch. Eigler 

diff --git a/debuginfod/debuginfod-client.c b/debuginfod/debuginfod-client.c
index d92d8d62c982..6882cb190d3c 100644
--- a/debuginfod/debuginfod-client.c
+++ b/debuginfod/debuginfod-client.c
@@ -1533,6 +1533,20 @@ debuginfod_query_server (debuginfod_client *c,
   long pa = loops; /* default param for progress callback */
   if (target_handle) /* we've committed to a server; report its 
download progress */
 {
+  /* PR30809: Check actual size of cached file.  This same
+ fd is shared by all the multi-curl handles (but only
+ one will end up writing to it).  Another way could be
+ to tabulate totals in debuginfod_write_callback(). */
+  struct stat cached;
+  int statrc = fstat(fd, );
+  if (statrc == 0)
+pa = (long) cached.st_size;
+  else
+{
+  /* Otherwise, query libcurl for its tabulated total.
+ However, that counts http body length, not
+ decoded/decompressed content length, so does not
+ measure quite the same thing as dl. */
   CURLcode curl_res;
 #if CURL_AT_LEAST_VERSION(7, 55, 0)
   curl_off_t dl;
@@ -1549,7 +1563,7 @@ debuginfod_query_server (debuginfod_client *c,
   if (curl_res == 0)
 pa = (dl >= (double)(LONG_MAX+1UL) ? LONG_MAX : (long)dl);
 #endif
-
+}
 }
 
   if ((*c->progressfn) (c, pa, dl_size == -1 ? 0 : dl_size))



Re: Fwd: Anton Baltser, bug with libdebuginfod

2023-07-30 Thread Frank Ch. Eigler via Elfutils-devel
Hi -

> I need to install 
> I tried to install elfutils - 0.189 version , and I could solve installing
> libdebuginfod.
> My system ubuntu 20.04/wsl2
> 
> I used ./configure --prefix=/usr\
>   --disable-debuginfod \
>   --enable-libdebuginfod=dummy \
>   --libdir=/lib

debuginfod/Makefile.am says:

if LIBDEBUGINFOD
pkginclude_HEADERS = debuginfod.h
endif

which is active with =dummy.  Just tried this same configure here, and
after a "make install", the header file does get installed.

- FChE



Re: Performance issue with systemd-coredump and container process linking 2000 shared libraries.

2023-06-21 Thread Frank Ch. Eigler via Elfutils-devel
Hi -

> On our side Francois also told me this afternoon that he didn’t
> really reproduce the same thing with my reproducer posted here and
> the real systemd-coredump issue he witnessed live, and also noticed
> that with DEBUGINFOD_URLS unset/set to the empty string my
> reproducer has no problem anymore. [...]

Just doing the math from the debuginfod-client point of view (ignoring
other the later systemd side fix that moots this):

For an application that processes these elf/dwarf files sequentially,
queries for each synthetic solib are going to result in 2000 https-404
transactions, sans debuginfod caching.  If you're lucky (reusing a
dwfl object), elfutils may be able to reuse a long-lived https
connection to a server, otherwise a new https connection might have to
be spun up for each.  But even with reuse, we're talking about 2000
pingponging messages.  That will take a handful of minutes of elapsed
time just by itself.

If the calling code made these queries in parallel batches, it would
be much faster overall.

- FChE



Re: Local Build ID Directory Lookup (DEBUGINFOD_LOCAL_PATH)

2023-06-01 Thread Frank Ch. Eigler via Elfutils-devel
Hi -

> Hmm, how would the effective behavior of this differ from directly
> returning the path? The symlink could become invalid at any time [...]

Effective behaviour is about the same, but code logic and explanation
is simpler.

> It would make sense if the cache were made to contain a hard link to the
> file if it were on the same filesystem as the cache [...]

And this could be a QoI implementation detail: prefer hardlink, fall
back to symlink.

- FChE



Re: Local Build ID Directory Lookup (DEBUGINFOD_LOCAL_PATH)

2023-06-01 Thread Frank Ch. Eigler via Elfutils-devel
Hi -

> So I guess, sans the format, the feature request would just be that
> it would have a shortcut for file URLs to produce the path directly
> in response to e.g. a debuginfod_find_debuginfo, rather than making
> a copy of the file via libcurl.

A compromise solution could be for new code to produce a symlink in
the .cache/debuginfod_client directory that points to the matching
file:// bit, and return that symlink name/fd to the calling app.

At future accesses, the client can determine if the symlink is
broken and reject/unlink it.

- FChE



Re: Local Build ID Directory Lookup (DEBUGINFOD_LOCAL_PATH)

2023-05-31 Thread Frank Ch. Eigler via Elfutils-devel
Hi -

> Ah, I had forgotten completely that file URLs worked out of the box
> (libcurl, duh).

Yeah.

> When we tried that way back when, the issue was just that it
> actually does a libcurl fetch out of that path and saves another
> copy of the file into the cache.

Yes, a copy is made.  This is "traditional" in the sense that the
client code ensures that even if an original upstream server goes
away, cached files stay accessible for a while.  I can also see how
this might be less valuable than just getting at the files with
minimum overhead.  Does anyone think we need to have both as options?

- FChE



Re: Local Build ID Directory Lookup (DEBUGINFOD_LOCAL_PATH)

2023-05-31 Thread Frank Ch. Eigler via Elfutils-devel
Hi -

> [...]
> It would be really nice if there were, a DEBUGINFOD_LOCAL_PATH environment
> variable that provided a colon-separated list of directories in the GDB
> `.build-id//` [...]

Please note that $DEBUGINFOD_URLS can already include file:// URLs
that identify local filesystems whose directory structure mirrors the
webapi.  The old-school /usr/lib/debug/.build-id/xx/.debug
file hierarchy doesn't match that, and of course lacks source code
entirely.  

OTOH a shell script that creates debuginfod-style tree of symlinks
into the old-school hierarchy - for debuginfo/executables only - could
work, and work with current day unmodified debuginfod clients &
servers.  i.e.,:

  find /usr/lib/debug/.build-id -name '*.debug' | while read g; do
   buildid=`echo $g | cut -f6,7 -d/ | tr -d / | cut -f1 -d.`
   mkdir -p buildid/$buildid
   ln -s $g buildid/$buildid/debuginfo
  done

and similar for .../executable

Then the resulting tree is suitable for use as DEBUGINFOD_URLS=file://`pwd`
to a debuginfod server or a client.


- FChE



Re: Segfault in dwfl_module_getsrc

2023-05-09 Thread Frank Ch. Eigler via Elfutils-devel
Hi -

> I tried to use elfutils to retrieve the backtrace within a signal handle,
> but I am having a segfault inside `dwfl_module_getsrc`.
> [...]

A red flag is "within a signal handler".  The kinds of code that one
may safely invoke in a signal handler are highly limited.  Most
elfutils functions are not safe this way.  See

% man signal-safety

- FChE



[patch]: PR30316: debuginfod wal checkpointing

2023-05-08 Thread Frank Ch. Eigler via Elfutils-devel
Hi -

This is patch 2/2 of this series.


commit 9b70bce69268eaae7621791ba7c72639469d4a69
Author: Frank Ch. Eigler 
Date:   Mon May 8 11:05:48 2023 -0400

PR30316: debuginfod wal checkpointing

Add a "--scan-checkpoint=NUM" option to debuginfod to control forced
synchronization & sqlite -wal checkpointing for the multithreaded
scanning process.  In absence of this, a server that's busy with other
read & write operations can accumulate potentially large SQLITE WAL
temporary files.  This option causes the server to take intermittent
quiescent breaks during scanning, during which the -wal file can be
processed and truncated.

Signed-off-by: Frank Ch. Eigler 

diff --git a/NEWS b/NEWS
index 3c63a66026d7..53c717eba546 100644
--- a/NEWS
+++ b/NEWS
@@ -2,6 +2,12 @@ Version 0.190
 
 readelf: Support readelf -Ds, --use-dynamic --symbol.
 
+debuginfod: Schema change (reindexing required, sorry!) for a 60%
+compression in filename representation, which was a large
+part of the sqlite index; also, more deliberate sqlite
+-wal management during scanning using the
+--scan-checkpoint option.
+
 Version 0.189 "Don't deflate!"
 
 configure: eu-nm, eu-addr2line and eu-stack can provide demangled symbols
diff --git a/debuginfod/debuginfod.cxx b/debuginfod/debuginfod.cxx
index 7b19bb3a5d72..d72d2ad16960 100644
--- a/debuginfod/debuginfod.cxx
+++ b/debuginfod/debuginfod.cxx
@@ -429,6 +429,8 @@ static const struct argp_option options[] =
{ "passive", ARGP_KEY_PASSIVE, NULL, 0, "Do not scan or groom, read-only 
database.", 0 },
 #define ARGP_KEY_DISABLE_SOURCE_SCAN 0x1009
{ "disable-source-scan", ARGP_KEY_DISABLE_SOURCE_SCAN, NULL, 0, "Do not 
scan dwarf source info.", 0 },
+#define ARGP_SCAN_CHECKPOINT 0x100A
+   { "scan-checkpoint", ARGP_SCAN_CHECKPOINT, "NUM", 0, "Number of files 
scanned before a WAL checkpoint.", 0 },
{ NULL, 0, NULL, 0, NULL, 0 },
   };
 
@@ -483,6 +485,7 @@ static unsigned forwarded_ttl_limit = 8;
 static bool scan_source_info = true;
 static string tmpdir;
 static bool passive_p = false;
+static long scan_checkpoint = 256;
 
 static void set_metric(const string& key, double value);
 // static void inc_metric(const string& key);
@@ -684,6 +687,11 @@ parse_opt (int key, char *arg,
 case ARGP_KEY_DISABLE_SOURCE_SCAN:
   scan_source_info = false;
   break;
+case ARGP_SCAN_CHECKPOINT:
+  scan_checkpoint = atol (arg);
+  if (scan_checkpoint < 0)
+argp_failure(state, 1, EINVAL, "scan checkpoint");
+  break;
   // case 'h': argp_state_help (state, stderr, 
ARGP_HELP_LONG|ARGP_HELP_EXIT_OK);
 default: return ARGP_ERR_UNKNOWN;
 }
@@ -929,6 +937,72 @@ class unique_set_reserver
 };
 
 
+
+
+// periodic_barrier is a concurrency control object that lets N threads
+// periodically (based on counter value) agree to wait at a barrier,
+// let one of them carry out some work, then be set free
+
+class periodic_barrier
+{
+private:
+  unsigned period; // number of count() reports to trigger barrier activation
+  unsigned threads; // number of threads participating
+  mutex mtx; // protects all the following fields
+  unsigned counter; // count of count() reports in the current generation
+  unsigned generation; // barrier activation generation
+  unsigned waiting; // number of threads waiting for barrier
+  bool dead; // bring out your
+  condition_variable cv;
+public:
+  periodic_barrier(unsigned t, unsigned p):
+period(p), threads(t), counter(0), generation(0), waiting(0), dead(false) 
{ }
+  virtual ~periodic_barrier() {}
+
+  virtual void periodic_barrier_work() noexcept = 0;
+  void nuke() {
+unique_lock lock(mtx);
+dead = true;
+cv.notify_all();
+  }
+  
+  void count()
+  {
+unique_lock lock(mtx);
+unsigned prev_generation = this->generation;
+if (counter < period-1) // normal case: counter just freely running
+  {
+counter ++;
+return;
+  }
+else if (counter == period-1) // we're the doer
+  {
+counter = period; // entering barrier holding phase
+cv.notify_all();
+while (waiting < threads-1 && !dead)
+  cv.wait(lock);
+// all other threads are now stuck in the barrier
+this->periodic_barrier_work(); // NB: we're holding the mutex the 
whole time
+// reset for next barrier, releasing other waiters
+counter = 0;
+generation ++;
+cv.notify_all();
+return;
+  }
+else if (counter == period) // we're a waiter, in holding phase
+  {
+waiting ++;
+cv.notify_all();
+while (counter == period && generation == prev_generation && !dead)
+  cv.wait(lock);
+waiting --;
+return;
+  }
+  }
+};
+
+
+
 
 
 
@@ -1073,6 

[patch] PR30378: debuginfod, _files table compression

2023-05-08 Thread Frank Ch. Eigler via Elfutils-devel
Hi -

This is patch 1 of 2 of important filesystem usage reduction work for
debuginfod.


commit 8ee1cc26b08f773c1c6d2f8ed2f68a88534cf93d
Author: Frank Ch. Eigler 
Date:   Fri May 5 13:56:23 2023 -0400

debuginfod: PR30378: better compression for _files table

Split the _files table into two links into a new _fileparts table,
interning the dirname and basename of each file name string.  This
reduces storage requirements for many thousands of almost-identical
long paths that are evident in large builds like kernels.

This is unfortunately a schema-breaking change, so requires reindexing
of the corpus.

While in the vicinity, the file scan work queue is changed from a
 to an .  The intent is that files be scanned in a
more random sequence instead of sorted.  If they're sorted, then files
that contain errors will tend to be retried over and over again at the
next scan cycle, even at the expense of making progress on the other
files in the queue.

Signed-off-by: Frank Ch. Eigler 

diff --git a/debuginfod/debuginfod.cxx b/debuginfod/debuginfod.cxx
index a1ddeb569a36..7b19bb3a5d72 100644
--- a/debuginfod/debuginfod.cxx
+++ b/debuginfod/debuginfod.cxx
@@ -85,6 +85,7 @@ extern "C" {
 #include 
 #include 
 #include 
+#include 
 #include 
 #include 
 #include 
@@ -138,7 +139,7 @@ string_endswith(const string& haystack, const string& 
needle)
 
 
 // Roll this identifier for every sqlite schema incompatibility.
-#define BUILDIDS "buildids9"
+#define BUILDIDS "buildids10"
 
 #if SQLITE_VERSION_NUMBER >= 3008000
 #define WITHOUT_ROWID "without rowid"
@@ -157,10 +158,23 @@ static const char DEBUGINFOD_SQLITE_DDL[] =
   // NB: all these are overridable with -D option
 
   // Normalization table for interning file names
-  "create table if not exists " BUILDIDS "_files (\n"
+  "create table if not exists " BUILDIDS "_fileparts (\n"
   "id integer primary key not null,\n"
   "name text unique not null\n"
   ");\n"
+  "create table if not exists " BUILDIDS "_files (\n"
+  "id integer primary key not null,\n"
+  "dirname integer not null,\n"
+  "basename integer not null,\n"
+  "unique (dirname, basename),\n"
+  "foreign key (dirname) references " BUILDIDS "_fileparts(id) on 
delete cascade,\n"
+  "foreign key (basename) references " BUILDIDS "_fileparts(id) on 
delete cascade\n"
+  ");\n"
+  "create view if not exists " BUILDIDS "_files_v as\n" // a 
+  "select f.id, n1.name || '/' || n2.name as name\n"
+  "from " BUILDIDS "_files f, " BUILDIDS "_fileparts n1, " BUILDIDS 
"_fileparts n2\n"
+  "where f.dirname = n1.id and f.basename = n2.id;\n"
+  
   // Normalization table for interning buildids
   "create table if not exists " BUILDIDS "_buildids (\n"
   "id integer primary key not null,\n"
@@ -230,33 +244,33 @@ static const char DEBUGINFOD_SQLITE_DDL[] =
   "create view if not exists " BUILDIDS "_query_d as \n"
   "select\n"
   "b.hex as buildid, n.mtime, 'F' as sourcetype, f0.name as source0, 
n.mtime as mtime, null as source1\n"
-  "from " BUILDIDS "_buildids b, " BUILDIDS "_files f0, " BUILDIDS 
"_f_de n\n"
+  "from " BUILDIDS "_buildids b, " BUILDIDS "_files_v f0, " BUILDIDS 
"_f_de n\n"
   "where b.id = n.buildid and f0.id = n.file and n.debuginfo_p = 1\n"
   "union all select\n"
   "b.hex as buildid, n.mtime, 'R' as sourcetype, f0.name as source0, 
n.mtime as mtime, f1.name as source1\n"
-  "from " BUILDIDS "_buildids b, " BUILDIDS "_files f0, " BUILDIDS 
"_files f1, " BUILDIDS "_r_de n\n"
+  "from " BUILDIDS "_buildids b, " BUILDIDS "_files_v f0, " BUILDIDS 
"_files_v f1, " BUILDIDS "_r_de n\n"
   "where b.id = n.buildid and f0.id = n.file and f1.id = n.content and 
n.debuginfo_p = 1\n"
   ";"
   // ... and for E queries
   "create view if not exists " BUILDIDS "_query_e as \n"
   "select\n"
   "b.hex as buildid, n.mtime, 'F' as sourcetype, f0.name as source0, 
n.mtime as mtime, null as source1\n"
-  "from " BUILDIDS "_buildids b, " BUILDIDS "_files f0, " BUILDIDS 
"_f_de n\n"
+  "from " BUILDIDS "_buildids b, " BUILDIDS "_files_v f0, " BUILDIDS 
"_f_de n\n"
   "where b.id = n.buildid and f0.id = n.file and n.executable_p = 1\n"
   "union all select\n"
   "b.hex as buildid, n.mtime, 'R' as sourcetype, f0.name as source0, 
n.mtime as mtime, f1.name as source1\n"
-  "from " BUILDIDS "_buildids b, " BUILDIDS "_files f0, " BUILDIDS 
"_files f1, " BUILDIDS "_r_de n\n"
+  "from " BUILDIDS "_buildids b, " BUILDIDS "_files_v f0, " BUILDIDS 
"_files_v f1, " BUILDIDS "_r_de n\n"
   "where b.id = n.buildid and f0.id = n.file and f1.id = n.content and 
n.executable_p = 1\n"
   ";"
   // ... and for S queries
   "create view if not exists " BUILDIDS "_query_s as \n"
   "select\n"
   "b.hex as buildid, fs.name as 

[patch] PR30377: debuginfod -r -X

2023-04-21 Thread Frank Ch. Eigler via Elfutils-devel
Hi -

Planning to push this one-liner fix shortly.
A new test case exercises the -r / -I / -X ops.


commit bc6c774dfb61ac61a66195a269579544f97eeb30
Author: Frank Ch. Eigler 
Date:   Fri Apr 21 17:04:08 2023 -0400

PR30377: fix debuginfod -r -X combination

Until this fix, debuginfod -r -X '.*' didn't trigger groom-time removal
of everything, because the -I include regex overrode it.  Corrected logic
to match the scan-time tie-breaking between -I / -X.

Signed-off-by: Frank Ch. Eigler 

diff --git a/debuginfod/ChangeLog b/debuginfod/ChangeLog
index 070dad03fba1..0e4810bba501 100644
--- a/debuginfod/ChangeLog
+++ b/debuginfod/ChangeLog
@@ -1,3 +1,7 @@
+2023-04-21  Frank Ch. Eigler 
+
+   * debuginfod.cxx (groom): Fix -r / -X logic.
+
 2023-04-13  Frank Ch. Eigler 
 
* debuginfod.cxx (archive_classify, scan_archive_file): Catch and
diff --git a/debuginfod/debuginfod.cxx b/debuginfod/debuginfod.cxx
index e981d1372233..a1ddeb569a36 100644
--- a/debuginfod/debuginfod.cxx
+++ b/debuginfod/debuginfod.cxx
@@ -3893,7 +3893,7 @@ void groom()
 {
   bool reg_include = !regexec (_include_regex, filename, 0, 0, 0);
   bool reg_exclude = !regexec (_exclude_regex, filename, 0, 0, 0);
-  regex_file_drop = reg_exclude && !reg_include;
+  regex_file_drop = !reg_include || reg_exclude; // match logic of 
scan_source_paths  
 }
 
   rc = stat(filename, );
diff --git a/doc/debuginfod.8 b/doc/debuginfod.8
index 07cb01aeed10..5358aaba42c9 100644
--- a/doc/debuginfod.8
+++ b/doc/debuginfod.8
@@ -172,7 +172,10 @@ interrupting a groom pass (if any).
 
 .TP
 .B "\-r"
-Apply the -I and -X during groom cycles, so that files excluded by the regexes 
are removed from the index. These parameters are in addition to what normally 
qualifies a file for grooming, not a replacement.
+Apply the -I and -X during groom cycles, so that most content related
+to files excluded by the regexes are removed from the index.  Not all
+content can be practically removed, so eventually a "\-G"
+"maximal-groom" operation may be needed.
 
 .TP
 .B "\-g SECONDS" "\-\-groom\-time=SECONDS"
diff --git a/tests/ChangeLog b/tests/ChangeLog
index eb3e1118fa00..d816873ce083 100644
--- a/tests/ChangeLog
+++ b/tests/ChangeLog
@@ -1,3 +1,8 @@
+2023-04-21  Frank Ch. Eigler 
+
+   * run-debuginfod-IXr.sh: New test.
+   * Makefile.am: Run it, ship it.
+
 2023-02-10  Mark Wielaard  
 
* varlocs.c (print_expr): Handle DW_OP_GNU_uninit.
diff --git a/tests/Makefile.am b/tests/Makefile.am
index 7e32f1170c1b..0ca8aa9e7e81 100644
--- a/tests/Makefile.am
+++ b/tests/Makefile.am
@@ -261,7 +261,8 @@ TESTS += run-debuginfod-dlopen.sh \
  run-debuginfod-response-headers.sh \
  run-debuginfod-extraction-passive.sh \
 run-debuginfod-webapi-concurrency.sh \
-run-debuginfod-section.sh
+run-debuginfod-section.sh \
+run-debuginfod-IXr.sh
 endif
 if !OLD_LIBMICROHTTPD
 # Will crash on too old libmicrohttpd
@@ -578,6 +579,7 @@ EXTRA_DIST = run-arextract.sh run-arsymtest.sh run-ar.sh \
  run-debuginfod-extraction-passive.sh \
  run-debuginfod-webapi-concurrency.sh \
 run-debuginfod-section.sh \
+run-debuginfod-IXr.sh \
 debuginfod-rpms/fedora30/hello2-1.0-2.src.rpm \
 debuginfod-rpms/fedora30/hello2-1.0-2.x86_64.rpm \
 debuginfod-rpms/fedora30/hello2-debuginfo-1.0-2.x86_64.rpm \
diff --git a/tests/run-debuginfod-IXr.sh b/tests/run-debuginfod-IXr.sh
new file mode 100755
index ..631b7bbfbfd2
--- /dev/null
+++ b/tests/run-debuginfod-IXr.sh
@@ -0,0 +1,94 @@
+#!/usr/bin/env bash
+#
+# Copyright (C) 2019-2023 Red Hat, Inc.
+# This file is part of elfutils.
+#
+# This file is free software; you can redistribute it and/or modify
+# it under the terms of the GNU General Public License as published by
+# the Free Software Foundation; either version 3 of the License, or
+# (at your option) any later version.
+#
+# elfutils is distributed in the hope that it will be useful, but
+# WITHOUT ANY WARRANTY; without even the implied warranty of
+# MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See the
+# GNU General Public License for more details.
+#
+# You should have received a copy of the GNU General Public License
+# along with this program.  If not, see .
+
+. $srcdir/debuginfod-subr.sh
+
+# for test case debugging, uncomment:
+set -x
+unset VALGRIND_CMD
+
+mkdir R Z
+# This variable is essential and ensures no time-race for claiming ports occurs
+# set base to a unique multiple of 100 not used in any other 
'run-debuginfod-*' test
+base=10100
+get_ports
+
+DB=${PWD}/.debuginfod_tmp.sqlite
+tempfiles $DB
+export DEBUGINFOD_CACHE_PATH=${PWD}/.client_cache
+
+cp -rvp ${abs_srcdir}/debuginfod-rpms R
+
+if [ "$zstd" = "false" ]; then  # nuke the zstd fedora 31 ones
+rm -vrf R/debuginfod-rpms/fedora31
+rpms=0
+

[patch] [obv] PR30348: debuginfod: retry partial archive scans

2023-04-13 Thread Frank Ch. Eigler via Elfutils-devel
Planning to push this shortly:

Author: Frank Ch. Eigler 
Date:   Thu Apr 13 13:11:56 2023 -0400

PR30348: debuginfod: retry partial archive scans

On some public debuginfod servers, it was observed that errors
may occur during individual archive scanning operations.  That's
fine, but previous code still went ahead and marked the archive
"done" by inserting a record into the *_file_mtime_scanned table.

New code ensures that exceptions propagate for these cases, and an
archive that encountered an error while scanning will be retried
later.

Signed-off-by: Frank Ch. Eigler 

diff --git a/debuginfod/ChangeLog b/debuginfod/ChangeLog
index f13c28d5c6f7..070dad03fba1 100644
--- a/debuginfod/ChangeLog
+++ b/debuginfod/ChangeLog
@@ -1,3 +1,8 @@
+2023-04-13  Frank Ch. Eigler 
+
+   * debuginfod.cxx (archive_classify, scan_archive_file): Catch and
+   propagate exceptions during archive scans.
+
 2023-03-30  Jan Alexander Steffens (heftig) 
 
* debuginfod-client.c (update_atime): New function.
diff --git a/debuginfod/debuginfod.cxx b/debuginfod/debuginfod.cxx
index 5ef6cc32189b..e981d1372233 100644
--- a/debuginfod/debuginfod.cxx
+++ b/debuginfod/debuginfod.cxx
@@ -3268,6 +3268,7 @@ archive_classify (const string& rps, string& 
archive_extension,
   if (verbose > 3)
 obatched(clog) << "libarchive scanning " << rps << endl;
 
+  bool any_exceptions = false;
   while(1) // parse archive entries
 {
 if (interrupted)
@@ -3405,8 +3406,17 @@ archive_classify (const string& rps, string& 
archive_extension,
   catch (const reportable_exception& e)
 {
   e.report(clog);
+  any_exceptions = true;
+  // NB: but we allow the libarchive iteration to continue, in
+  // case we can still gather some useful information.  That
+  // would allow some webapi queries to work, until later when
+  // this archive is rescanned.  (Its vitals won't go into the
+  // _file_mtime_scanned table until after a successful scan.)
 }
 }
+
+  if (any_exceptions)
+throw reportable_exception("exceptions encountered during archive scan");
 }
 
 
@@ -3453,6 +3463,7 @@ scan_archive_file (const string& rps, const stat_t& st,
   // extract the archive contents
   unsigned my_fts_executable = 0, my_fts_debuginfo = 0, my_fts_sref = 0, 
my_fts_sdef = 0;
   bool my_fts_sref_complete_p = true;
+  bool any_exceptions = false;
   try
 {
   string archive_extension;
@@ -3475,6 +3486,7 @@ scan_archive_file (const string& rps, const stat_t& st,
   catch (const reportable_exception& e)
 {
   e.report(clog);
+  any_exceptions = true;
 }
 
   if (verbose > 2)
@@ -3484,6 +3496,7 @@ scan_archive_file (const string& rps, const stat_t& st,
<< " debuginfos=" << my_fts_debuginfo
<< " srefs=" << my_fts_sref
<< " sdefs=" << my_fts_sdef
+   << " exceptions=" << any_exceptions
<< endl;
 
   fts_executable += my_fts_executable;
@@ -3491,6 +3504,9 @@ scan_archive_file (const string& rps, const stat_t& st,
   fts_sref += my_fts_sref;
   fts_sdef += my_fts_sdef;
 
+  if (any_exceptions)
+throw reportable_exception("exceptions encountered during archive scan");
+
   if (my_fts_sref_complete_p) // leave incomplete?
 ps_scan_done
   .reset()


PR29472: debuginfod metadata query

2023-04-12 Thread Frank Ch. Eigler via Elfutils-devel
Hi -

At long last, for your review, an updated & merged patch for $SUBJECT.
"git diff -s" against git/master follows, code also on
users/fche/try-pr29472e.


Author: Frank Ch. Eigler 
Date:   Tue Apr 11 23:35:25 2023 -0400

PR29472: debuginfod: add metadata query webapi, C api, client

This patch extends the debuginfod API with a "metadata query"
operation.  It allows clients to request an enumeration of file names
known to debuginfod servers, returning a JSON response including the
matching buildids.  This lets clients later download debuginfo for a
range of versions of the same named binaries, in case they need to to
prospective work (like systemtap-based live-patching).  It also lets
server operators implement prefetch triggering operations for popular
but slow debuginfo slivers like kernel vdso.debug files on fedora.

Implementation requires a modern enough json-c library.

% debuginfod-find metadata file /bin/ls
% debuginfod-find metadata glob "/usr/local/bin/c*"

Documentation and testing are included.

Signed-off-by: Ryan Goldberg 
Signed-off-by: Frank Ch. Eigler 

diff --git a/ChangeLog b/ChangeLog
index 10c23002279e..c35a19dd51c4 100644
--- a/ChangeLog
+++ b/ChangeLog
@@ -1,3 +1,9 @@
+2023-04-12  Ryan Golberg , Frank Ch. Eigler 

+
+   PR29472: debuginfod metadata query
+   * NEWS: Mention this.
+   * configure.ac: Look for json-c library.
+
 2023-03-03  Mark Wielaard  
 
* NEWS: Add ELFCOMPRESS_ZSTD support for libelf and elfcompress.
diff --git a/NEWS b/NEWS
index 16e37eca5731..1bf38a69f64c 100644
--- a/NEWS
+++ b/NEWS
@@ -1,3 +1,7 @@
+Version 0.190
+
+debuginfod: New API for metadata queries: file name -> buildid.
+
 Version 0.189 "Don't deflate!"
 
 configure: eu-nm, eu-addr2line and eu-stack can provide demangled symbols
diff --git a/configure.ac b/configure.ac
index 4b67c84425fa..b319a0119180 100644
--- a/configure.ac
+++ b/configure.ac
@@ -665,6 +665,12 @@ case "$ac_cv_search__obstack_free" in
 esac
 AC_SUBST([obstack_LIBS])
 
+dnl formerly checked for json_object_array_add, but debuginfod needs a newer 
function
+AC_CHECK_LIB(json-c, json_object_from_fd, [
+  AC_DEFINE([HAVE_JSON_C], [1], [Define if json-c is on the machine])
+  AC_SUBST(jsonc_LIBS, '-ljson-c')
+])
+
 dnl The directories with content.
 
 dnl Documentation.
diff --git a/debuginfod/ChangeLog b/debuginfod/ChangeLog
index f13c28d5c6f7..ac3fd6aa862f 100644
--- a/debuginfod/ChangeLog
+++ b/debuginfod/ChangeLog
@@ -1,3 +1,26 @@
+2023-04-12  Ryan Golberg , Frank Ch. Eigler 

+
+   PR29472: debuginfod metadata query
+   * Makefile.am: Add json-c usage.
+   * debuginfod-client.c (debuginfod_find_metadata): New function.
+   (handle_data): New fields to hold metadata being received.
+   (debuginfod_clean_cache): Clean metadata too.
+   (header_callback): Simplify to realloc only.
+   (metadata_callback): New function.
+   (init_server_urls, init_handle, perform_queries, make_cache_path):
+   New refactored functions.
+   (debuginfod_query_server_by_buildid): Renamed, refactored.  Update
+   callers.
+   * debuginfod-find.c (main): Handle metadata queries.
+   * debuginfod.cxx (DEBUGINFOD_SQLITE_DDL): Add an index or two.
+   (metadata_maxtime_s, parse_opt): New parameter for load control.
+   (add_client_federation_headers): New refactored function.
+   (handle_metadata): New function.
+   (handler_cb): Call it for /metadata URL.  Trace it.
+   (groom): Tweak sqlite_ps object lifetimes.
+   * debuginfod.h.in (debuginfod_find_metadata): New decl.
+   * libdebuginfod.map: Export it under ELFUTILS_0.190.
+
 2023-03-30  Jan Alexander Steffens (heftig) 
 
* debuginfod-client.c (update_atime): New function.
diff --git a/debuginfod/Makefile.am b/debuginfod/Makefile.am
index 125be97bbfcc..73ffe0e56468 100644
--- a/debuginfod/Makefile.am
+++ b/debuginfod/Makefile.am
@@ -70,7 +70,7 @@ bin_PROGRAMS += debuginfod-find
 endif
 
 debuginfod_SOURCES = debuginfod.cxx
-debuginfod_LDADD = $(libdw) $(libelf) $(libeu) $(libdebuginfod) $(argp_LDADD) 
$(fts_LIBS) $(libmicrohttpd_LIBS) $(sqlite3_LIBS) $(libarchive_LIBS) -lpthread 
-ldl
+debuginfod_LDADD = $(libdw) $(libelf) $(libeu) $(libdebuginfod) $(argp_LDADD) 
$(fts_LIBS) $(libmicrohttpd_LIBS) $(sqlite3_LIBS) $(libarchive_LIBS) 
$(jsonc_LIBS) $(libcurl_LIBS) -lpthread -ldl
 
 debuginfod_find_SOURCES = debuginfod-find.c
 debuginfod_find_LDADD = $(libdw) $(libelf) $(libeu) $(libdebuginfod) 
$(argp_LDADD) $(fts_LIBS)
@@ -97,7 +97,7 @@ libdebuginfod_so_LIBS = libdebuginfod_pic.a
 if DUMMY_LIBDEBUGINFOD
 libdebuginfod_so_LDLIBS =
 else
-libdebuginfod_so_LDLIBS = -lpthread $(libcurl_LIBS) $(fts_LIBS) $(libelf)
+libdebuginfod_so_LDLIBS = -lpthread $(libcurl_LIBS) $(fts_LIBS) $(libelf) 
$(jsonc_LIBS)
 endif
 $(LIBDEBUGINFOD_SONAME): $(srcdir)/libdebuginfod.map $(libdebuginfod_so_LIBS)
$(AM_V_CCLD)$(LINK) 

Re: Some ideas for process improvements/changes

2023-04-06 Thread Frank Ch. Eigler via Elfutils-devel
Hi -

> - Get rid of ChangeLog files and trivial ChangeLog entries
>   [...]

Yes please!

> - Use patchwork more
>   [...]

This doesn't seem like something for community/contributors
to do - patchwork seems mostly a maintainer/committer tool.

>   It would be nice if it was automated a bit more by have a git
>   commit hook that flagged whether a patch was committed. And if
>   the buildbot try-branch system would flag pass/fail on the patch.

Sounds like a sourceware infrastructure RFE.

> - Don't require "real names" in Signed-off-by lines.
>   [...]
> +The name you use as your identity should not be an anonymous id
> +or false name that misrepresents who you are.

(No strong opinion on this one, except that a declaration that is this
informal would have little weight, should it ever be relied upon in
legal proceedings.)


> - "Security" bug guidance
>   [...]

Yeah, a brief SECURITY file would be nice.


- FChE



Re: [PATCH v2 2/2] debuginfod-client.c: Fix x-debuginfod-size counts differently than CURLINFO_SIZE_DOWNLOAD_T

2023-03-30 Thread Frank Ch. Eigler via Elfutils-devel
Hi -

> > Hey, great idea actually tallying up writes in the callback function.
> > (We need to take care to clear that counter, in case of client object
> > reuse.)  Also, can you think of some reason not to just use that value
> > at all times, i.e., without any of that "if and only if ..." business?
> 
> The counter is in handle_data, and it is already cleared at
> "query_in_parallel:". I don't find other places that may reuse them.

OK.

> The written_size is actual file size (uncompressed), but IIUC
> Content-Length is the compressed size if Content-Encoding says the
> content is compressed. I haven't seen any compressed responses with
> Content-Length, but from the spec I don't read they are not allowed.

OK, so to spell out the hypothetical problem: what if a httpd server
does send back a Content-Length: response header for a compressed
file, and we use that as the denominator for progress reporting.  If
we use the decompressed actual file length as numerator, we'd go over
100%.

Then ISTM a simpler way to handle this would be to say that if the
x-debuginfod-size: response header is found (as denominator), then go
ahead and use the actual data[committed_to].written_size (as
numerator).  Don't even try the CURLINFO_SIZE* queries then.

- FChE



Re: [PATCH 3/3] debuginfod: When retrieving files from cache, update atime manually

2023-03-29 Thread Frank Ch. Eigler via Elfutils-devel
Hi -

> The cache cleaning logic requires atime to be correct (strictatime) but
> most users on Linux only have relatime or even noatime.

This is not really correct: relatime is the kernel default and works
fine with the cache.  atime values updated once a day are still plenty
for caches with a multi-day preservation default.

> Attempt to update the atime manually so that the cache works properly.

Has this been observed to work on noatime-mounted filesystems?  It's
at worst harmless so we could merge it, but it's kind of weird.  Do
you know such precedents in other software that consumes atime?

- FChE



Re: [PATCH v2 1/2] debuginfod-client.c: Fix download size not correctly fallbacks to x-debuginfod-size header

2023-03-29 Thread Frank Ch. Eigler via Elfutils-devel
> +2023-03-29  lilydjwg  
> +
> + * debuginfod/debuginfod-client.c: Fix download size not correctly
> + fallbacks to x-debuginfod-size header

Thanks, merged.

- FChE



Re: [PATCH 1/3] debuginfod: Replace futimes with futimens

2023-03-29 Thread Frank Ch. Eigler via Elfutils-devel
Hi -

> Similar to what 8c4aa0ef998191ed828a37190dc179b91649938a did for ar and
> strip, replace the non-standard futimes with the POSIX futimens.

Thanks, merged (with a little ChangeLog thing added ... please free us, mjw!).

- FChE



Re: [PATCH v2 2/2] debuginfod-client.c: Fix x-debuginfod-size counts differently than CURLINFO_SIZE_DOWNLOAD_T

2023-03-29 Thread Frank Ch. Eigler via Elfutils-devel
Hi -

> x-debuginfod-size is the actual file size, but CURLINFO_SIZE_DOWNLOAD_T
> is transferred size, i.e. the gzipped one if gzip is on.
> Let's count written data and use that if and only if x-debuginfod-size
> is used.

Hey, great idea actually tallying up writes in the callback function.
(We need to take care to clear that counter, in case of client object
reuse.)  Also, can you think of some reason not to just use that value
at all times, i.e., without any of that "if and only if ..." business?

- FChE



Re: [PATCH] debuginfod-client.c: Fix download size not correctly fallbacks to x-debuginfod-size header

2023-03-29 Thread Frank Ch. Eigler via Elfutils-devel
Hi -

> [...]
> The reason is that when Content-Length is unavailable, cl is set to -1
> by curl

Is that behaviour from new versions of curl?

>  but dl_size remains as 0, but later dl_size == -1 is checked.

Or perhaps dl_size needs to be initialized to -1 ("unknown") vs 0
("known to be zero"), and that value mapped to 0 only in the
*c->progressfn() call down below.


- FChE



Re: [PATCH v2 4/7] x86_64_return_value_location: Support lvalue and rvalue references

2023-02-10 Thread Frank Ch. Eigler via Elfutils-devel
Hi -

> $ cat 1.cpp
> int () { throw; }
> int &() { throw; }
> $ g++ -g 1.cpp -c
> and then running
> 
> $ LD_LIBRARY_PATH=../libelf:../libdw ./funcretval -e 1.o
> 
> What would be a good way to integrate such a testcase?
> Currently all tests are built with gcc, is it okay to add one
> built with g++? [...]

Sure.


- FChE



Re: ☠ Buildbot (Sourceware): elfutils - failed test (failure) (master)

2023-02-08 Thread Frank Ch. Eigler via Elfutils-devel
Hi -

> > A new failure has been detected on builder elfutils-gentoo-sparc while 
> > building elfutils.
> >
> > Full details are available at:
> > https://builder.sourceware.org/buildbot/#builders/225/builds/10
> > - test-suite.log: 
> > https://builder.sourceware.org/buildbot/#builders/225/builds/10/steps/7/logs/test-suite_log
> 
> Failure due to make command timeout. It looks like the only test that didn't
> finish is run-debuginfod-section.sh, which is related to my changes.
> 
> But I'm not seeing how the changes might cause a timeout and unfortunately
> test-suite.log is empty.

Some useful info may be here:

https://builder.sourceware.org/testrun/fc4dda20768ee7e8fd2b3edf9c391cd84db15edd?focus=filelist

- FChE



preservation of signed rpms

2023-01-26 Thread Frank Ch. Eigler via Elfutils-devel
Hi -

Recently, fedora koji has started applying per-file (IMA) signatures
to RPMs it has built.  This is in addition to the overall GPG
signature of the RPM payload as a whole.  While this extra capability
is not yet fully developed in userspace, we do have one ready user
(elfutils debuginfod [1]), which is unfortunately frustrated by a
policy in the koji code base.

That policy problem is the periodic pruning of signed RPMs.  That is,
"koji prune-signed-copies" is run on the fedora infra every now and
then.  This operation nukes data/signed/KEYHEX/ARCH/N-V-R.rpm files.
While it leaves behind data/sigcache/ARCH/*, those files appear not to
include the IMA signature content.  That means the IMA signatures are
simply lost, like tears in rain.

We'd like to correct this somehow - to make the IMA signatures
available indefinitely - at least as long as the built RPMs stay of
any interest.  (That means: not restricted to the most-recent-update
of a given fedora release.) 

But how?

1) have the pruning operate by replacing the unsigned binaries with the
   signed ones (hardlink or rename)?

2) have the pruning operate on the unsigned binaries, preserving the
   signed ones?

3) preserve the IMA signature content somewhere nearby (sigcache?)
   to give us a chance at finding the data there after a prune

4) ---> some other way? <---


[1] https://fedoraproject.org/wiki/Debuginfod


- FChE



Re: ☠ Buildbot (GNU Toolchain): elfutils-try-debian-armhf - failed compile (failure) (users/marxin/try-zstd-support-v2)

2022-12-21 Thread Frank Ch. Eigler via Elfutils-devel
Hi -

> [...]
> Unfortunately buildbot itself doesn't show the config.log.
> Trying to get that...

Ah darn.  It would have been saved in bunsen, but the buildbot
configuration means that a make-stage failure means it won't even try.

- FChE



Re: PATCH: Bug debuginfod/29472 followup

2022-11-10 Thread Frank Ch. Eigler via Elfutils-devel
Hi -

> Are you sure the interface is correct? 

Hard to be sure, so it's left generalized.  The APIs would be
unchanged if future search strategies are added (or subtracted);
they'd affect the choice of acceptable KEY strings.

We know we want glob patterns over executable file names.  I've seen
cases where an exact match query produces a different sqlite query
plan from the glob one, but not sure how much performance difference
that implies.  Searching for source files by glob/match is removed
from today's version because it doesn't run fast enough (without a new
large index).

> Is the sqlite "glob" pattern standardized? 

Yes.

> Can it be provided if the underlying server database isn't sqlite?

Yeah, in that unlikely case we undertake this someday.  glob
expressions can be translated to regular expressions, which are
themselves supported in postgres & mysql.


> I haven't read the whole diff yet. There are several refactorings
> which would be nice to see a separate patch.

One such part that occurs to me is the debuginfod_query_server() ->
init_handles() / perform_queries() subdivision.  Are there others?


> Why does debuginfod-client.c use json-c? Can't the server sent the
> json object as a normal char string? Why does the string from the
> server need to be interpreted as a json object and then turned into a
> string again?

Use of the library allows robust processing (checking & merging) of
incoming json data from multiple upstream servers.  Luckily, json-c is
a small & self-contained library.

- FChE



patch, debuginfod

2022-11-03 Thread Frank Ch. Eigler via Elfutils-devel
Hi -

Showing diff -w to omit reindentation whitespace noise.  Maybe worth
backporting to 0.188 releases, could affect federation intermediate
servers with burst workload. 

commit ec166cf3c8d825a2f02aca448a0823de12e78991 (HEAD -> master)
Author: Frank Ch. Eigler 
Date:   Thu Nov 3 10:07:31 2022 -0400

debuginfod.cxx: fix coverity-found use-after-release error

The debuginfod_client object lifetime needs more careful handling,
made easier with the defer_dtor<> gadget.

Signed-off-by: Frank Ch. Eigler 

diff --git a/debuginfod/ChangeLog b/debuginfod/ChangeLog
index 1efa080fe337..4d576caec2a5 100644
--- a/debuginfod/ChangeLog
+++ b/debuginfod/ChangeLog
@@ -1,3 +1,8 @@
+2022-11-03  Frank Ch. Eigler 
+
+   * debuginfod.cxx (handle_buildid): Correctly manage lifetime
+   of debuginfod_client federation callout object.
+
 2022-11-02  Mark Wielaard  
 
* debuginfod-client.c (extract_section): Mark static.
diff --git a/debuginfod/debuginfod.cxx b/debuginfod/debuginfod.cxx
index f46da6eff4a7..02a11477a515 100644
--- a/debuginfod/debuginfod.cxx
+++ b/debuginfod/debuginfod.cxx
@@ -2249,8 +2249,10 @@ handle_buildid (MHD_Connection* conn,
 
   int fd = -1;
   debuginfod_client *client = debuginfod_pool_begin ();
-  if (client != NULL)
-{
+  if (client == NULL)
+throw libc_exception(errno, "debuginfod client pool alloc");
+  defer_dtor client_closer (client, 
debuginfod_pool_end);
+  
   debuginfod_set_progressfn (client, & debuginfod_find_progress);
 
   if (conn)
@@ -2323,11 +2325,6 @@ and will not query the upstream servers");
   (const unsigned char*) buildid.c_str(),
   0, section.c_str(), NULL);
   
-}
-  else
-fd = -errno; /* Set by debuginfod_begin.  */
-  debuginfod_pool_end (client);
-
   if (fd >= 0)
 {
   if (conn != 0)



PATCH: Bug debuginfod/29472 followup

2022-11-01 Thread Frank Ch. Eigler via Elfutils-devel
Hi -

On the users/fche/try-pr29472 branch, I pushed a followup to Ryan's
PR29472 draft from a bunch of weeks ago.  It's missing some
ChangeLog's but appears otherwise complete.  It's structured as Ryan's
original patch plus my followup that changes things around, so as to
preserve both contributions in the history.  I paste the overall diff
here.

There will be some minor merge conflicts between this and amerey's
section-extraction extensions that are also aiming for this release.
I'll be glad to deconflict whichever way.


diff --git a/ChangeLog b/ChangeLog
index 7bbb2c0fe97e..efce07161abe 100644
--- a/ChangeLog
+++ b/ChangeLog
@@ -1,3 +1,8 @@
+2022-10-06  Ryan Goldberg 
+
+   * configure.ac (HAVE_JSON_C): Defined iff libjson-c
+   is found, and debuginfod metadata querying is thus enabled.
+
 2022-10-20  Mark Wielaard  
 
* Makefile.am (rpm): Remove --sign.
diff --git a/configure.ac b/configure.ac
index 1084b4695e2c..6077d52a7daf 100644
--- a/configure.ac
+++ b/configure.ac
@@ -600,6 +600,11 @@ case "$ac_cv_search__obstack_free" in
 esac
 AC_SUBST([obstack_LIBS])
 
+AC_CHECK_LIB(json-c, json_object_array_add, [
+  AC_DEFINE([HAVE_JSON_C], [1], [Define if json-c is on the machine])
+  AC_SUBST(jsonc_LIBS, '-ljson-c')
+])
+
 dnl The directories with content.
 
 dnl Documentation.
diff --git a/debuginfod/ChangeLog b/debuginfod/ChangeLog
index 1df903fe4ac2..79f827d95217 100644
--- a/debuginfod/ChangeLog
+++ b/debuginfod/ChangeLog
@@ -1,3 +1,27 @@
+2022-10-06  Ryan Goldberg 
+
+   * Makefile.am (debuginfod_LDADD): Add jsonc_LIBS.
+   (libdebuginfod_so_LDLIBS): Likewise.
+   * debuginfod-find.c (main): Add command line interface for
+   metadata query by path.
+   * debuginfod.h.in: Added debuginfod_find_metadata.
+   * debuginfod.cxx (add_client_federation_headers): New function
+   created from existing code to remove code duplication.
+   (handle_buildid_match): Calls new add_client_federation_headers
+   function.
+   (handle_metadata): New function which queries local DB and
+   upstream for metadata.
+   (handler_cb): New accepted url type, /metadata.
+   * debuginfod-client.c (struct handle_data): New fields: metadata,
+   metadata_size, to store incoming metadata.
+   (metadata_callback): New function called by curl upon reciving
+   metedata
+   (init_server_urls, init_handle, perform_queries) : New functions 
created from
+   existing code within debuginfod_query_server to reduce code duplication.
+   (debuginfod_query_server_by_buildid): debuginfod_query_server renamed, 
and above
+   functions used in place of identical previously inline code.
+   (debuginfod_find_metadata): New function.
+
 2022-10-18  Daniel Thornburgh 
 
   * debuginfod-client.c (debuginfod_query_server): Add DEBUGINFOD_HEADERS_FILE
diff --git a/debuginfod/Makefile.am b/debuginfod/Makefile.am
index 435cb8a6839c..3d6bc26ecc4a 100644
--- a/debuginfod/Makefile.am
+++ b/debuginfod/Makefile.am
@@ -70,7 +70,7 @@ bin_PROGRAMS += debuginfod-find
 endif
 
 debuginfod_SOURCES = debuginfod.cxx
-debuginfod_LDADD = $(libdw) $(libelf) $(libeu) $(libdebuginfod) $(argp_LDADD) 
$(fts_LIBS) $(libmicrohttpd_LIBS) $(sqlite3_LIBS) $(libarchive_LIBS) -lpthread 
-ldl
+debuginfod_LDADD = $(libdw) $(libelf) $(libeu) $(libdebuginfod) $(argp_LDADD) 
$(fts_LIBS) $(libmicrohttpd_LIBS) $(sqlite3_LIBS) $(libarchive_LIBS) 
$(jsonc_LIBS) $(libcurl_LIBS) -lpthread -ldl
 
 debuginfod_find_SOURCES = debuginfod-find.c
 debuginfod_find_LDADD = $(libdw) $(libelf) $(libeu) $(libdebuginfod) 
$(argp_LDADD) $(fts_LIBS)
@@ -97,7 +97,7 @@ libdebuginfod_so_LIBS = libdebuginfod_pic.a
 if DUMMY_LIBDEBUGINFOD
 libdebuginfod_so_LDLIBS =
 else
-libdebuginfod_so_LDLIBS = -lpthread $(libcurl_LIBS) $(fts_LIBS)
+libdebuginfod_so_LDLIBS = -lpthread $(libcurl_LIBS) $(fts_LIBS) $(jsonc_LIBS)
 endif
 $(LIBDEBUGINFOD_SONAME): $(srcdir)/libdebuginfod.map $(libdebuginfod_so_LIBS)
$(AM_V_CCLD)$(LINK) $(dso_LDFLAGS) -o $@ \
diff --git a/debuginfod/debuginfod-client.c b/debuginfod/debuginfod-client.c
index 716cb7695617..4a75eef2303a 100644
--- a/debuginfod/debuginfod-client.c
+++ b/debuginfod/debuginfod-client.c
@@ -56,6 +56,8 @@ int debuginfod_find_executable (debuginfod_client *c, const 
unsigned char *b,
 int s, char **p) { return -ENOSYS; }
 int debuginfod_find_source (debuginfod_client *c, const unsigned char *b,
 int s, const char *f, char **p)  { return -ENOSYS; 
}
+int debuginfod_find_metadata (debuginfod_client *c,
+  const char *k, const char* v, char** m) { return 
-ENOSYS; }
 void debuginfod_set_progressfn(debuginfod_client *c,
   debuginfod_progressfn_t fn) { }
 void debuginfod_set_verbose_fd(debuginfod_client *c, int fd) { }
@@ -103,6 +105,10 @@ void debuginfod_end (debuginfod_client *c) { }
 
 #include 
 
+#ifdef HAVE_JSON_C
+  #include 
+#endif
+
 static 

Re: [PATCH] debuginfod: Support queries for ELF/DWARF sections

2022-10-26 Thread Frank Ch. Eigler via Elfutils-devel
Hi -

> Is/should the section name be URL-encoded?

Yes!

> I would drop the maybe_debuginfo_section heuristics. There are some
> sections like .strtab/.symtab that are probably in the debug file, but
> might be in the executable. I would assume that a named section can
> normally be found in the debugfile and only use the executable as
> fallback.

That heuristic would work fine for the case of .gdb_index, that
motivated this whole piece of work.  Sure.


> Finally, if the section comes from a file in the cache or if we have to
> download it in full anyway, then extracting the section into its own
> file seems slightly wasteful. It would be great if we could just report
> back "here is the full exe/debug file which does contain the requested
> section name".  [...]

It'd be fine to pass back the extracted section content anyway, even
if the full elf and/or dwarf file is already there.  Consider
federated debuginfod servers.  Intermediate servers may be
willing/able to do this extraction on behalf of clients who really
only want the section in question.  And if they cache the result, as
in amerey's draft code, then this will also help accelerate other
future clients.  That's just the usage scenario (gdb acceleration).


> int
> debuginfod_find_section (debuginfod_client *client,
>  const unsigned char *build_id,
>  int build_id_len,
>  const char *section, char **path,
>  bool *file_is_elf)
> 
> Maybe that is over-designed to avoid a little bit of disk waste?

(Then the client code would have to learn elfutils API internals in
order to extract the section it was actually interested in.)


- FChE



Re: [PATCH] debuginfod: Support queries for ELF/DWARF sections

2022-10-24 Thread Frank Ch. Eigler via Elfutils-devel
Hi -

Generally looks fine, thanks a lot.
A few nits:

- use of write(2) to put files onto disk is not quite right; write(2) can
  be partial, so you need a loop (or a macro wrapping a loop)

- not sure I understand why the code worries about dots in or not in
  section names.  Why not just pass them verbatim throughout the code
  base, and not worry about whether or not there's a dot?  Does the
  ELF standard even require a dot?

- not sure whether/why the I queries require a new _query_i view, as
  opposed to running the _query_d & _query_e views union'd together.
  I see an ORDER BY that's different here but not sure why bother;
  if anything, the server could prefer one or the other type, based
  on the same section-name heuristic as the client

- don't really see a need for the X-DEBUGINFOD-SECTION response
  header, which simply echoes back the very same parameter the client
  just requested; the other X-DEBUGINFOD-* headers are novel metadata

- re. verbose logging in the section vs non-section case, suggest
  just keeping the code simple (even if it makes the logs more verbose),
  i.e., not duplicating if (...) clog << STUFF else clog << STUFF;
  no biggie tho

- the webapi docs in debuginfod.8 should document the new query type


Otherwise lgtm.  Lots of nice work.

- FChE



Re: [PATCH 1/7] Rename 'hello2.spec.' -> 'hello2.spec' 'hello3.spec.' -> 'hello3.spec'

2022-10-21 Thread Frank Ch. Eigler via Elfutils-devel
Hi -

> C:\work\xemu\elfutils>git reset --hard
> 4cc429d2761846967678fb8cf5868d311d1f7862
> error: invalid path 'tests/debuginfod-rpms/hello2.spec.'
> fatal: Could not reset index file to revision
> '4cc429d2761846967678fb8cf5868d311d1f7862'.

Sounds like a git-induced problem.  Maybe try a different git client?

- FChE



Re: Fuzzing elfutils

2022-10-21 Thread Frank Ch. Eigler via Elfutils-devel
Hi -

> > Cf https://oss-fuzz.com/testcases?open=yes=Arbitrary=elfutils

This is inaccessible without logins.

> > I would like to know what you think about this. Is this a bug to
> > you ? Or is it expected ?  [...]

Crashes on crafted inputs are generally bugs.  Security implications
are usually not dire for a tool such as elfutils.


- FChE



Re: [PATCH 1/7] Rename 'hello2.spec.' -> 'hello2.spec' 'hello3.spec.' -> 'hello3.spec'

2022-10-19 Thread Frank Ch. Eigler via Elfutils-devel
Hi -

> I really want this to be merged :) ping Frank,
> as this would stop me clone elfutils on windows

If it doesn't break "make rpm" (or at least rpm -ts elfutils*.tar.bz2),
it's fine.  But I don't understand the problem - my windows machines have
no problem with files named "hello2.spec2." with two periods.

- FChE



Re: ☠ Buildbot (GNU Toolchain): elfutils - failed test (failure) (master)

2022-10-17 Thread Frank Ch. Eigler via Elfutils-devel
Hi -

> > Fatal error in GNU libmicrohttpd daemon.c:3831: Failed to remove FD
> > from epoll set.
> > Very odd. I don't have any hypothesis for why these are occuring.
> 
> I believe this is an intermittent libmicrohttpd bug. :-(

Pushing this patch as obvious ... and will try to track this one down
more urgently.


commit 4cc429d2761846967678fb8cf5868d311d1f7862 (HEAD -> master)
Author: Frank Ch. Eigler 
Date:   Mon Oct 17 10:07:39 2022 -0400

debuginfod: report libmicrohttpd version on startup

To assist troubleshooting with intermittent bugs.

Signed-off-by: Frank Ch. Eigler 

diff --git a/debuginfod/ChangeLog b/debuginfod/ChangeLog
index 8fb65133f3e1..59d50df1fc8a 100644
--- a/debuginfod/ChangeLog
+++ b/debuginfod/ChangeLog
@@ -1,3 +1,7 @@
+2022-10-17  Frank Ch. Eigler  
+
+   * debuginfod.cxx (main): Report libmicrohttpd version.
+
 2022-09-28  Aaron Merey  
 
* debuginfod-client.c (debuginfod_query_server): Switch sign of some
diff --git a/debuginfod/debuginfod.cxx b/debuginfod/debuginfod.cxx
index 8e7ee4997e0a..9dc4836bbe12 100644
--- a/debuginfod/debuginfod.cxx
+++ b/debuginfod/debuginfod.cxx
@@ -3956,6 +3956,8 @@ main (int argc, char *argv[])
 }
 }
 
+  obatched(clog) << "libmicrohttpd version " << MHD_get_version() << endl;
+  
   /* If '-C' wasn't given or was given with no arg, pick a reasonable default
  for the number of worker threads.  */
   if (connection_pool == 0)



Re: ☠ Buildbot (GNU Toolchain): elfutils - failed test (failure) (master)

2022-10-17 Thread Frank Ch. Eigler via Elfutils-devel
Hi -

> Fatal error in GNU libmicrohttpd daemon.c:3831: Failed to remove FD
> from epoll set.
> Very odd. I don't have any hypothesis for why these are occuring.

I believe this is an intermittent libmicrohttpd bug. :-(

- FChE



Re: [PATCH] debuginfod: Support queries for ELF/DWARF sections.

2022-09-28 Thread Frank Ch. Eigler via Elfutils-devel
Hi, Aaron -

On Tue, Sep 27, 2022 at 10:10:52PM -0400, Aaron Merey via Elfutils-devel wrote:

> [...]  In order to distinguish between debuginfo and executable
> files with the same build-id, this function includes a bool
> parameter use_debuginfo.  If true, attempt to retrieve the section
> from the debuginfo file with the given build-id. If false, use the
> executable instead.  [...]

How would a client know which one to use?  Does it provide power or
benefit to force them to choose (or iterate?).  Is there a scenario
where the content could be different between the two (if both
existed)?

If that decisionmaking is not warranted to put upon the shoulders of
the client, the server could just be asked for a section name "as if
from an unstripped executable", and let it find that in the executable
or debuginfo, whereever.


> [...] Although this patch does not implement it, we could generate
> .gdb_index on-the-fly if the target file does not contain it.
> However for large debuginfo files generating the index can take a
> non-trivial amount of time (ex. about 25 seconds for a 2.5GB
> debuginfo file).  [...]

Even that is not too bad, considering that the alternative would be
having to download that 2.5GB file.  I recall you saying that on some
distros, gdb-index sections are always there anyway, so we wouldn't
have to rush to implement this feature.

- FChE



Re: [PATCH] Retrive 64bit timestamp from curl_easy_getinfo on _TIME_BITS=64 environment

2022-09-27 Thread Frank Ch. Eigler via Elfutils-devel
Hi -

> I'm not sure the fix is the correct way, but
> since -D_TIME_BITS=64 mandates -D_FILE_OFFSET_BITS=64 in glibc,
> this does work on glibc environment.

Thanks, this looked fine in a quick test, so merged.

- FChE



Re: [patch git] PR28284 - debuginfod x-debuginfod* header processing

2022-09-08 Thread Frank Ch. Eigler via Elfutils-devel
Hi -

> > Good point, we don't want an aborted new transfer to retain records
> > from a previous run, will fix that.
> 
> Not just a new transfer, but also when we hit the cache before doing a
> new transfer. Currently when we hit the cache and don't do any transfer
> the winning_headers will point to the last http transfer which will
> have nothing to do with the returned (cached) result. Just like we
> clear client->url early.

Got it, pushing this:

diff --git a/debuginfod/debuginfod-client.c b/debuginfod/debuginfod-client.c
index 272a6a7a007f..5e5c140ab205 100644
--- a/debuginfod/debuginfod-client.c
+++ b/debuginfod/debuginfod-client.c
@@ -588,9 +588,11 @@ debuginfod_query_server (debuginfod_client *c,
   goto out;
 }
 
-  /* Clear the obsolete URL from a previous _find operation. */
+  /* Clear the obsolete data from a previous _find operation. */
   free (c->url);
   c->url = NULL;
+  free (c->winning_headers);
+  c->winning_headers = NULL;
 
   /* PR 27982: Add max size if DEBUGINFOD_MAXSIZE is set. */
   long maxsize = 0;

- FChE



Re: [patch git] PR28284 - debuginfod x-debuginfod* header processing

2022-09-06 Thread Frank Ch. Eigler via Elfutils-devel
Hi -

Thanks for giving it a look.

> > I had a bit of time to tweak Noah Sanci's PR28284 work, and I believe
> > it addresses Mark's last set of comments (from a while ago).  This
> > follow-up patch corrects things like case sensitivity, spacing, \r\n
> > processing, and tweaks documentation.
> 
> I hadn't thought about the \r\n at the end of the HTTP headers. Thanks.
> I assume \r\n at the end of HTTP headers required, since you are
> expecting in the code now, or could it also be \n on its own?

At the wire protocol level, yes, required.  But this code strips them
from the normal C/userspace processing (and libcurl adds them back as
needed).

> > The gist of it is to add a new client api function
> > debuginfod_get_headers(), documented to return x-debuginfod* headers
> > from current or previous fetch requests.
> 
> This looks good, but I think c->winning_headers needs to be
> freed/cleared at the start of debuginfod_query_server. Otherwise if you
> reuse the debuginfod_client and you hit the cache, the user gets the
> headers from the last use of debuginfod_client that did fetch something
> from a server. Which imho is confusing (the headers won't match the
> cached result returned).

Good point, we don't want an aborted new transfer to retain records from
a previous run, will fix that.

> >   debuginfod-find prints those
> > in -v verbose mode, and debuginfod relays them in federation.
> 
> This is the only thing I am not 100% happy about. It means you can only
> see the headers using debuginfod-find but no longer with any other
> client when DEBUGINFOD_VERBOSE is set. Is this really what we want?

Well, dunno, can't speak for we all. :-) For debugging purposes (which
is what DEBUGINFOD_VERBOSE is), we could print all headers that come
in, from any server we're connecting to.  For API / human-friendly
use, limiting to x-debuginfod* ones from the winning server seems more
useful.  (I hope we don't have to rethink again once the signature
contents start coming down that pipe too - hope they're not too long.)

Will add a commit to that effect shortly and run it through the
trybots.

- FChE



Re: [PATCH] debuginfod: Use auto-sized connection pool when -C is not given with arg

2022-09-05 Thread Frank Ch. Eigler via Elfutils-devel
Hi -

> [...]
> Fix this by using an auto-sized thread pool when '-C' is not given, just
> as we do when it's given with no argument. Update the doc's description
> of '-C'.

Thanks, pushed.

- FChE



[patch git] PR28284 - debuginfod x-debuginfod* header processing

2022-09-02 Thread Frank Ch. Eigler via Elfutils-devel
Hi -

I had a bit of time to tweak Noah Sanci's PR28284 work, and I believe
it addresses Mark's last set of comments (from a while ago).  This
follow-up patch corrects things like case sensitivity, spacing, \r\n
processing, and tweaks documentation.

The gist of it is to add a new client api function
debuginfod_get_headers(), documented to return x-debuginfod* headers
from current or previous fetch requests.  debuginfod-find prints those
in -v verbose mode, and debuginfod relays them in federation.

This stuff is an enabler for rgoldber's subsequent
signature-passing/checking code, to which I plan to turn my attention
next.

Please see users/fche/try-pr28284d for this draft of the code.  I'd
like to keep it as two separate commits to preserve Noah's id in the
git history, even though that makes it a bit harder to give a final
review.

git diff --stat origin/master...origin/users/fche/try-pr28284d

 debuginfod/ChangeLog | 22 ++
 debuginfod/debuginfod-client.c   | 24 ++--
 debuginfod/debuginfod-find.c |  3 +++
 debuginfod/debuginfod.cxx| 22 ++
 debuginfod/debuginfod.h.in   |  4 
 debuginfod/libdebuginfod.map |  3 +++
 doc/ChangeLog| 10 ++
 doc/debuginfod_find_debuginfo.3  | 14 ++
 doc/debuginfod_get_headers.3 |  2 ++
 tests/ChangeLog  | 12 
 tests/run-debuginfod-response-headers.sh | 53 
+
 11 files changed, 155 insertions(+), 14 deletions(-)


- FChE



debuginfod metadata patch review

2022-08-31 Thread Frank Ch. Eigler via Elfutils-devel
Hi, Ryan -

Sorry it's been taking time.  These are some notes for what I believe
is the newest snapshot in git: users/rgoldber/try-metadata_query.

Overall things look fine.  The gist of it (for the sake of audience)
is the introduction of a new webapi to query debuginfod servers for
metadata about files matching a glob pattern.  So, fetching

 /metadata/GLOBPATTERN

would return a json document with an array of metadata kind of like

[ { "atype": "e", "buildid":
"f17a29b5a25bd4960531d82aa6b07c8abe84fa66", "mtime": 98298348,
"stype": "R", "source0": "/foo/D/hithere_1.0-1_amd64.deb",
"source1": "/usr/bin/hithere", "artifactsrc": "", "source0ref": ""
} ]

based on the current index of the debuginfod server, plus all of its
upstream federated peers.  So it'd be a really easy way of finding all
known versions of "/lib*/libc.so.6" for example.  Corresponding C api
& cli included.  Consumers can use this to map from path names to
buildids (such as for a forthcoming systemtap feature).


Quibble 1: the webapi quoting is a bit clumsy.  The testcase says:
  http://127.0.0.1:$PORT2/metadata/%2Fusr%2Abin%2Ahi%2A
Let's switch to something like:
  http://127.0.0.1:$PORT2/metadata?glob=/usr/bin/*hi*
because this leaves room for future expandability (beyond "glob="),
and does not require quite as much unnatural urlencoding.


Quibble 2: the C api has to be super clear about the lifetime of
the returned string: the m parameter here.

+int debuginfod_find_metadata (debuginfod_client *c,
+  const char* p, char** m) { return -ENOSYS; }

In the other _find_ routines, the result is written to the *m 
pointer (if non-NULL!) and as a copied string that the caller is expected
to free.  That's fine here too, but then debuginfod_find_metadata 
cannot possibly just set   *metadata = "[ ]"; which is not a free'able
copied string.


Quibble 3: autoconf json-c can just feed HAVE_JSON_C rather than
ENABLE_FIND_METADATA.


Quibble 4: in the client, metadata_callback() could be simplified,
as realloc can take NULL first parameter, so the
   if (data->*data == NULL)
 malloc()
   else
 realloc()
isn't needed.  Just use realloc.  (I see the same code in header_callback,
so same simplification can work there too.)


Quibble 5: In debuginfod-find, call PATH a GLOB, or something like that


Quibble 6: in debuginfod.cxx handle_metadata(), you don't really need
three separate sqlite queries.  One can combine the three via

  select XYZ from ABC
  union all
  select XYZ from DEF
  union all   
  select XYZ from GHI

Why the order by?


Quibble 7: Same function, bottom.  In case the metadata queries are
not supported, it's better not to give fake data "[ ]" back.  Older
servers will just reject the query anyway with a 503 or similar via an
exception.  It's okay to do the same here.


Quibble 8: Server, handler_cb().  There is no need for "after you"
type synchronization for metadata.  These queries are not worth
delaying, for "after you" purposes, because near-concurrent query
results r or prefetches are not retained in a cache.


Quibble 9: debuginfod.h doc for new function, spell out that the
**metadata must be non-NULL to receive the string.  Don't bother spell
out that things are "ESCAPED", it's just a normal json array string.


Quibble 10.  We need to decide how much of the individual json
metadata objects to document, i.e., which ones we promise to apps to
expect.  We can leave some fuzziness too, like "extra fields may
appear".  Absent fields are probably better than empty-string fields.
We should document whether there is any duplicate elimination provided
by the client, I think the answer is no, but let's be clear.


Quibble 11.  The test case test-compiles a C file to look for json-c.
It could probably just run `pkg-config json-c` to see if suceeds or
fails.  And heck the main autoconf could use
PKG_CHECK_MODULE([json-c]...)  also actually.  No big deal.


Quibble 12.  The test case should process the resulting files with jq
(really easy to extract exactly all the buildids from a json
array-of-objects), or else get some guarantee from debuginfod-find
that the output is formatted in a specific way and then grep or
whatever.  The test case shouldn't rely on undocumented formatting
accidents from debuginfod.


- FChE



Re: Artix Linux debuginfod

2022-08-25 Thread Frank Ch. Eigler via Elfutils-devel
Hi -

> Artix Linux has deployed[0] its own instance of debuginfod and we'd be happy
> if you added it to your federated server list. Our CI creates debug packages
> for all eligible software and our -debug repos are currently getting
> populated. The server is available at https://debuginfod.artixlinux.org

Thanks for letting us know!  I added it to our index page and will look into
federation.

https://sourceware.org/elfutils/Debuginfod.html

- FChE



Re: [PATCH] debuginfod: fix http_requests_total{type="debuginfo"} when dwz is used

2022-08-17 Thread Frank Ch. Eigler via Elfutils-devel
Hi -

> When dwarf_extract_source_paths is called, it can call handle_buildid
> when a rpm file used dwz. Ignore such internal request in
> http_requests_total statistics.

Noble goal:

> @@ -1906,7 +1906,7 @@ handle_buildid (MHD_Connection* conn,
>  const string& buildid /* unsafe */,
>  string& artifacttype /* unsafe, cleanse on exception/return 
> */,
>  const string& suffix /* unsafe */,
> -int *result_fd)
> +int *result_fd, bool update_metrics = true)

 but no need for an extra parameter.  When conn==0, we have
an internal request, so can use that as a flag.


- FChE



Re: [PATCH] debuginfod: print filename for "cannot open archive" error

2022-08-17 Thread Frank Ch. Eigler via Elfutils-devel
Hi -

> Report the file that has such a problem so that one can inspect it.
> Signed-off-by: Martin Liska 

The complication with this type of change is that the exception text
objects also make it into the prometheus metrics.  When file names and
such variables show up, they will bifurcate all the affected error
metrics into families.

Not sure what is the best solution to that.  A separate verbose log
print to explain the specifics of this error?  An extra parameter to
the exception object, which is only logged but not passed to
prometheus metric naming?  Some other way?

- FChE



Re: [PATCH] Support nullglob in profile.*.in files

2022-08-16 Thread Frank Ch. Eigler via Elfutils-devel
Hi -

> In openSUSE we have the following RPM checker that uses
> shopt -s nullglob:
> https://github.com/openSUSE/post-build-checks/blob/master/checks/50-check-libtool-deps#L31

OK, lgtm.

We'd hate to have a stuck cat.

- FChE


Re: debuginfod Credential Helper RFC

2022-08-08 Thread Frank Ch. Eigler via Elfutils-devel
Hi -

> [...]  I could also see file-based config being useful if some
> aspect of the debuginfod configuration can change from
> moment-to-moment. Environment variables could be used for that, but
> it would require either changing those variables in the calling
> shell or wrapping each debuginfod client utility.

So-so ... if the file contents are modified, but the environment
variable that points to the file is fixed, then one may get into parse
race conditions as different debuginfod client objects in the process
may be active at the same time.


> [...]  You could also do this more granularly:
> DEBUGINFOD_HEADERS_FILES would work for us, and other lists could be
> created for other dynamically controllable aspects of the system.
> [...]

I see some value in doing this sort of thing more broadly,
hypothetically, but it's vague/speculative enough that I'd be just as
glad to limit the concept to the present case ("also add all headers
in given file").  So how about a $DEBUGINFOD_HEADERS and perhaps
$DEBUGINFOD_HEADERS_FILE env vars for now?

- FChE



Re: [PATCH] debuginfod: optimize regular expressions in groom()

2022-08-03 Thread Frank Ch. Eigler via Elfutils-devel
Hi -

> Subject: [PATCH] debuginfod: optimize regular expressions in groom()

Thanks, pushed.

- FChE



Re: [PATCH 1/1] debuginfod: create indexes to speed up grooming

2022-07-30 Thread Frank Ch. Eigler via Elfutils-devel
Hi -

> Create indexes on _r_de and _f_de tables
> to speed up delete operations called by groom() function.
> Primary keys of those tables are optimalized to search rows
> by buildids so delete by file and mtime attributes
> has to scan the whole table. [...]

By the way, another possible approach to this is could be to create
those indexes only for the duration of a groom operation: Create at
start, drop at end.  It'd speed up a set of delete's almost as well,
and also save disk space during normal operations.  OTOH, the peak
disk space requirement would be the same, so if the server's running
low on index storage disk, it'll fail either way, so probably not
worth doing this.

- FChE



Re: [PATCH 1/1] debuginfod: create indexes to speed up grooming

2022-07-29 Thread Frank Ch. Eigler via Elfutils-devel
Hi -

> Subject: [PATCH 1/1] debuginfod: create indexes to speed up grooming
> Create indexes on _r_de and _f_de tables
> to speed up delete operations called by groom() function.

Pushed, thanks!

- FChE



Re: [PATCH] libdwfl: name dwfl_get_debuginfod_client stub's arg

2022-07-18 Thread Frank Ch. Eigler via Elfutils-devel
Hi -

> Since the stub version of "dwfl_get_debuginfod_client" doesn't name its
> parameter, building elfuitls fails on a system with gcc 10.2.1:

Thanks, pushed this patch.

- FChE



Re: elfutils configuration issue on musl

2022-07-11 Thread Frank Ch. Eigler via Elfutils-devel
Hi -

>  I am working on building elfutils for a project where I am building for 
> i3x86 with musl and I am facing configure: WARNING: "libc does not have argp"
> checking for argp_parse in -largp... no
> configure: error: "no libargp found"

Yes, musl does not include an implementation of the argp API.
This is brought up with musl folks regularly.  This may help:

https://www.openwall.com/lists/musl/2019/01/06/2

- FChE



Re: Expanding control over debuginfod usage from dwfl [was: Re: Questions regarding debuginfod.h API]

2022-07-06 Thread Frank Ch. Eigler via Elfutils-devel
Hi -

> I'm working on GUI applications (hotspot [1], gammaray [2]), people do not 
> look at the command line output. I want to show the download progress and 
> status graphically instead. [...]

Aha, got it.  I'd say a

extern debuginfod_client *dwfl_get_debuginfod_client (Dwfl *);

type function would probably be your bet.  That could also be built
upon later, to provide finer grained control (to override environment
variable settings e.g.).


The other approach of a custom Dwfl_Callbacks .find_debuginfo hook is
also available, wherein your own app can take charge of finding /
downloading debuginfo, with its own bespoke debuginfod_client object.
This would have the benefit of working with existing elfutils.


- FChE



Re: Expanding control over debuginfod usage from dwfl [was: Re: Questions regarding debuginfod.h API]

2022-07-06 Thread Frank Ch. Eigler via Elfutils-devel
Hi -

> a) Notifying the user that a download is ongoing. Right now, it feels like 
> the 
> tool is frozen as no feedback is given to the user.

Right, can you explain how DEBUGINFOD_PROGRESS=1 is not a good fit in your case?

- FChE



Re: Expanding control over debuginfod usage from dwfl [was: Re: Questions regarding debuginfod.h API]

2022-07-06 Thread Frank Ch. Eigler via Elfutils-devel
Hi -

> > Thus my proposal, and RFC:
> > 
> > ```
> > /* Let us mirror the debuginfod progressfn for dwfl and forward it to
> >the internal debuginfod client, if available.
> >This way, dwfl's usage of debuginfod can stay optional and we would not
> >need to link to debuginfod directly from code using dwfl.
> >  */
> > typedef int (*dwfl_debuginfod_progressfn_t)(Dwfl *dwfl, long a, long b);
> > extern void dwfl_set_debuginfod_progressfn(Dwfl *dwfl,
> >dwfl_debuginfod_progressfn_t fn);
> > ```

(Don't quite see how this extension would let you disable or enable
debuginfod support on a given dwfl.  By the time a progressfn is
called, some debuginfod traffic would be attempted.)n


> Alternately:
> [...]
> > /* We could just give full access to the internal client
> >but doing so may clash with future usages of e.g.
> >debuginfod_{set,get}_user_data in dwfl and users of dwfl.
> >Otherwise, this is obviously easiest and gives most flexibility.
> >We just need to forward get_client from debuginfod-client.c
> >  */
> > extern debuginfod_client *dwfl_debuginfod_client (Dwfl *);
> > ```
> > What do you all think?

In order to -influence- things, would you not need to -change- the
client somehow?  We don't have debuginfod-client apis to disable or
reconfigure any particular client object.  Such an API wouldn't let
you replace it with a hook object of your own either.


So, dunno, could you perhaps remind us what your current usage
interests are?

To enable/disable it on a per-dwfl basis?  If yes, and if statically,
maybe a new Dwfl_Callbacks option for .find_debuginfo() could be your
ticket: a dwfl_standard_find_debuginfo variant sans debuginfod at the
end.


- FChE



Re: [PATCH] config: Remove unnecessary setting/unsetting of prefix in profiles

2022-06-29 Thread Frank Ch. Eigler via Elfutils-devel
Hi -

> In both profile.csh.in and profile.sh.in we set and then unset
> prefix, but never use it.

It is used, because the autoconf @sysconfdir@ macro expands to a
string like "$prefix/...", which requires the shell variable $prefix
to be set.

- FChE



Re: [PATCH] debuginfod: add --disable-source-scan option.

2022-06-03 Thread Frank Ch. Eigler via Elfutils-devel
Hi -

> --disable-source-scan disables scanning of the dwarf source info
> of debuginfo sections. The source info is not required in setups
> without source code access.

Thanks, merged (with a bit of ChangeLoggery added afterwards).

- FChE



Re: debuginfod & Debian source packages

2022-06-02 Thread Frank Ch. Eigler via Elfutils-devel
Hi -

> [...]
> OK, this was going to be my next question.  Out of curiosity, how would
> debuginfod invoke this external program?

That's the best part.  It doesn't need to. :-) Whatever program you're
using to freshen the repository of .deb's that your debuginfod
scrapes, could also run this program for any new .dsc's.  i.e., keep
the problem from debuginfod entirely.

- FChE



Re: debuginfod & Debian source packages

2022-06-02 Thread Frank Ch. Eigler via Elfutils-devel
Hi -

> I'm the maintainer of debuginfod.debian.net (currently offline due to a
> hardware issue :-/).

O no.  (And also, please consider upgrading your elfutils version, as
later ones have less pessimal behavior with long grooming ops.)

> The service provides only debuginfo for now, but I
> would like to start indexing the source code of each package as well.

Neat!

> [...]
> I'm now thinking about an alternative solution to this problem.  Debian
> source packages already contain everything needed to be indexed and
> served to debuginfo consumers; it has the full source code + all the
> downstream patches.  It's represented by a .dsc file that can be fed to
> dget(1) which will download the source tarball and apply all the patches
> automatically.
> 
> Do you think we can teach debuginfod to consume these files and do the
> right thing when indexing the source code of each package? [...]

Interesting idea, but I'd throw it back at you thusly: does debuginfod
need to this itself on the fly?  Other than perhaps a time/storage
tradeoff, is there some reason an auxiliary program couldn't do this
in the background?  Take designated .dsc's, download, apply patches,
and assemble the patched sources into, well, any old random tarball
format?  If someone(tm) were to write such a script, we could ship it
alongside debuginfod if desired.

As long as the archive file name was a close match to the binary deb
file names, and as long as the constituent files were named with the
same paths as mentioned in the binary dwarf, debuginfod would gladly
take the result and serve from it.


- FChE



Re: ☠ Buildbot (GNU Toolchain): elfutils - failed test (failure) (master)

2022-05-27 Thread Frank Ch. Eigler via Elfutils-devel
Hi -

n-metrics.sh: line 198: 27160 Aborted (core dumped) env 
LD_LIBRARY_PATH=$ldpath ${abs_builddir}/../debuginfod/debuginfod $VERBOSE -d 
${DB} -F -U -t0 -g0 -p $PORT1 L D F > vlog$PORT1 2>&1
> [...]
> Process 27160 (debuginfod) of user 994 killed by SIGABRT - ignoring (low free 
> space)
> Lets see what happens on a rebuild.

Did the machine run out of memory / storage?

- FChE



Re: [Bug debuginfod/29098] set default prefetch limits to >0

2022-05-09 Thread Frank Ch. Eigler via Elfutils-devel
Hi -

Thanks, committing this, with corrected comments and changelog entries
and a bit of man page cleanup.


diff --git a/debuginfod/ChangeLog b/debuginfod/ChangeLog
index 619ebd8c9202..026908c85000 100644
--- a/debuginfod/ChangeLog
+++ b/debuginfod/ChangeLog
@@ -1,3 +1,7 @@
+2022-05-09  Noah Sanci  
+
+   * debuginfod.cxx (main): Set nonzero defaults for fdcache.
+
 2022-05-04  Frank Ch. Eigler 
Mark Wielaard  
 
diff --git a/debuginfod/debuginfod.cxx b/debuginfod/debuginfod.cxx
index 4aaf41c0886e..fde4e194b526 100644
--- a/debuginfod/debuginfod.cxx
+++ b/debuginfod/debuginfod.cxx
@@ -3826,6 +3826,13 @@ main (int argc, char *argv[])
   error (EXIT_FAILURE, 0,
  "unexpected argument: %s", argv[remaining]);
 
+  // Make the prefetch cache spaces a fraction of the main fdcache if
+  // unspecified.
+  if (fdcache_prefetch_fds == 0)
+fdcache_prefetch_fds = fdcache_fds / 2;
+  if (fdcache_prefetch_mbs == 0)
+fdcache_prefetch_mbs = fdcache_mbs / 2;
+
   if (scan_archives.size()==0 && !scan_files && source_paths.size()>0)
 obatched(clog) << "warning: without -F -R -U -Z, ignoring PATHs" << endl;
 
diff --git a/doc/ChangeLog b/doc/ChangeLog
index 303e3dc05dc5..cb754d04ba3f 100644
--- a/doc/ChangeLog
+++ b/doc/ChangeLog
@@ -1,3 +1,7 @@
+2022-05-09  Frank Ch. Eigler  
+
+   * debuginfod.8: Tweak prefetch descriptions.
+
 2022-01-31  Frank Ch. Eigler  
 
* debuginfod-client-config.7: Elaborate DEBUGINFOD_URLS.
diff --git a/doc/debuginfod.8 b/doc/debuginfod.8
index ee8e4078e5b5..95b827e9cc35 100644
--- a/doc/debuginfod.8
+++ b/doc/debuginfod.8
@@ -232,34 +232,36 @@ loops in the symbolic directory tree might lead to 
\fIinfinite
 traversal\fP.
 
 .TP
-.B "\-\-fdcache\-fds=NUM"  "\-\-fdcache\-mbs=MB"  "\-\-fdcache\-prefetch=NUM2"
+.B "\-\-fdcache\-fds=NUM"  "\-\-fdcache\-mbs=MB"
 Configure limits on a cache that keeps recently extracted files from
 archives.  Up to NUM requested files and up to a total of MB megabytes
 will be kept extracted, in order to avoid having to decompress their
-archives over and over again.  In addition, up to NUM2 other files
-from an archive may be prefetched into the cache before they are even
-requested.  The default NUM, NUM2, and MB values depend on the
-concurrency of the system, and on the available disk space on the
+archives over and over again. The default NUM and MB values depend on
+the concurrency of the system, and on the available disk space on the
 $TMPDIR or \fB/tmp\fP filesystem.  This is because that is where the
-most recently used extracted files are kept.  Grooming cleans this
+most recently used extracted files are kept.  Grooming cleans out this
 cache.
 
 .TP
 .B "\-\-fdcache\-\-prefetch\-fds=NUM"  "\-\-fdcache\-\-prefetch\-mbs=MB"
-Configure how many file descriptors (fds) and megabytes (mbs) are
-allocated to the prefetch fdcache. If unspecified, values of
-\fB\-\-prefetch\-fds\fP and \fB\-\-prefetch\-mbs\fP depend
-on concurrency of the system and on the available disk space on
-the $TMPDIR. Allocating more to the prefetch cache will improve
-performance in environments where different parts of several large
-archives are being accessed.
+.B "\-\-fdcache\-prefetch=NUM2"
+
+In addition to the main fdcache, up to NUM2 other files from an
+archive may be prefetched into another cache before they are even
+requested.  Configure how many file descriptors (fds) and megabytes
+(mbs) are allocated to the prefetch fdcache. If unspecified, these
+values depend on concurrency of the system and on the available disk
+space on the $TMPDIR.  Allocating more to the prefetch cache will
+improve performance in environments where different parts of several
+large archives are being accessed.  This cache is also cleaned out
+during grooming.
 
 .TP
 .B "\-\-fdcache\-mintmp=NUM"
-Configure a disk space threshold for emergency flushing of the cache.
-The filesystem holding the cache is checked periodically.  If the
-available space falls below the given percentage, the cache is
-flushed, and the fdcache will stay disabled until the next groom
+Configure a disk space threshold for emergency flushing of the caches.
+The filesystem holding the caches is checked periodically.  If the
+available space falls below the given percentage, the caches are
+flushed, and the fdcaches will stay disabled until the next groom
 cycle.  This mechanism, along a few associated /metrics on the webapi,
 are intended to give an operator notice about storage scarcity - which
 can translate to RAM scarcity if the disk happens to be on a RAM



Re: [Bug debuginfod/29098] set default prefetch limits to >0

2022-05-09 Thread Frank Ch. Eigler via Elfutils-devel
Hi -

> +  // Make the prefetch cache spaces smaller than the normal
> +  // fd cache if rpm scanning is on. This is to not waste memory
> +  // since the prefetch cache isn't used when -R isn't specified
> +  // Set to 1/2 arbitrarily
> +  if ( scan_archives[".rpm"] == "cat" )
> +{
> +  if ( fdcache_prefetch_fds == 0 )
> +fdcache_prefetch_fds = fdcache_fds * .5;
> +  if ( fdcache_prefetch_mbs == 0 )
> +fdcache_prefetch_mbs = fdcache_mbs * .5;
> +}

Yeah, something like that.  It turns out that space for the prefetch
cache is not used at all if RPM (and don't forget about other archive
types!) is not in effect.  So it's harmless to set those defaults
unconditionally.  How about a simpler:

 if ( fdcache_prefetch_fds == 0 )
   fdcache_prefetch_fds = fdcache_fds * .5;
 if ( fdcache_prefetch_mbs == 0 )
   fdcache_prefetch_mbs = fdcache_mbs * .5;

- FChE



RFC patch: generated AUTHORS

2022-04-24 Thread Frank Ch. Eigler via Elfutils-devel
Hi -

The following patch adds a tool AUTHORS.sh to compute a complete
AUTHORS list to cover the entire git history.  We do this in systemtap
and it's a nice way of showing off the large list of contributors over
time.


commit 72823322be9a8f0143de21ccfb67a89696a9522f (HEAD -> master2)
Author: Frank Ch. Eigler 
Date:   Sun Apr 24 19:39:47 2022 -0400

AUTHORS: Use generator script & git mailmap

We can compute a pretty complete list of contributors.

Signed-off-by: Frank Ch. Eigler 

diff --git a/.mailmap b/.mailmap
new file mode 100644
index ..fd42169b8a3f
--- /dev/null
+++ b/.mailmap
@@ -0,0 +1,25 @@
+# help out git with erroneous or incomplete email address mappings
+
+Piotr Drąg 
+Domingo Becker 
+Ulrich Drepper 
+Ulrich Drepper 
+Hyu_gabaru Ryu_ichi 
+kiyoto hashida  
+Claudio Rodrigo Pereyra Diaz 
+Gladys Guerrero 
+Wei Liu 
+Michael Münch 
+Noah Sanci 
+Noriko Mizumoto 
+Cornelius Neckenig 
+Francesco Tombolini 
+Thomas Spura 
+Geert Warrink 
+Yulia Poyarkova 
+Yuri Chornoivan 
+Ahelenia Ziemiańska 
+Daniel Cabrera 
+Thomas Canniot 
+Ahelenia Ziemiańska  наб via Elfutils-devel 

+
diff --git a/AUTHORS b/AUTHORS
index ef3c5430a963..45cb1f8d8e9f 100644
--- a/AUTHORS
+++ b/AUTHORS
@@ -1,4 +1,104 @@
-For Now:
-Ulrich Drepper.
-Roland McGrath
+Aaron Merey
+Adam Markey
+Adrian Ratiu
+Ahelenia Ziemiańska
+Akihiko Odaki
+Alexander Cherepanov
+Alexander Kanavin
+Alexander Miller
+Alice Zhang
+Andreas Krebbel
+Andreas Schwab
+Andrei Homescu
+Anthony G. Basile
+Ben Woodard
+Chih-Hung Hsieh
+Claudio Rodrigo Pereyra Diaz
+Colin Cross
+Cornelius Neckenig
+Daniel Cabrera
+David Abdurachmanov
+Di Chen
+Dima Kogan
+Dimitris Glezos
+Dmitry V. Levin
+Dodji Seketeli
+Domingo Becker
+Eduardo Santiago
+Eli Schwartz
+Érico Nogueira
+Érico Rolim
+Filipe Brandenburger
+Florian Weimer
+Francesco Tombolini
+Frank Ch. Eigler
+Geert Warrink
+Gladys Guerrero
+Gustavo Romero
+Hayatsu Shunsuke
+H.J. Lu
+Hyu_gabaru Ryu_ichi
+Jakub Jelinek
+Jan Kratochvil
+Jan Pokorný
+Jason P. Leasure
+Jean Pihet
+Jeff Kenton
+Jim Wilson
+John M Mellor-Crummey
+John Ogness
+Jonathan Lebon
+Jonathon Anderson
+Jose E. Marchesi
+Josh Stone
+Joshua Watt
+Kevin Cernekee
+kiyoto hashida
+Konrad Kleine
+Kurt Roeckx
+Kyle McMartin
+Lei Zhang
+Lubomir Rintel
+Luca Boccassi
+Luiz Angelo Daros de Luca
+Mao Han
+Marek Polacek
+Mark Wielaard
+Martin Liska
+Masatake YAMATO
+Matt Fleming
+Matthias Klose
+Matthias Maennich
+Max Filippov
+Michael Forney
+Michael Münch
+Mike Frysinger
+Milian Wolff
+Namhyung Kim
+Noah Sanci
+Noriko Mizumoto
+Omar Sandoval
 Petr Machata
+Pino Toscano
+Piotr Drąg
+Ravi Bangoria
+Richard Henderson
+Roland McGrath
+Rosen Penev
+Ross Burton
+Saleem Abdulrasool
+Sergei Trofimovich
+Srđan Milaković
+Steven Chamberlain
+Thomas Canniot
+Thomas Spura
+Timm Bäder
+Tom Tromey
+Ulf Hermann
+Ulrich Drepper
+Wei Liu
+William Cohen
+Yonghong Song
+Yulia Poyarkova
+Yunlian Jiang
+Yuri Chornoivan
diff --git a/AUTHORS.sh b/AUTHORS.sh
new file mode 100644
index ..ff048a6489e0
--- /dev/null
+++ b/AUTHORS.sh
@@ -0,0 +1,12 @@
+#! /bin/sh
+
+# Create the AUTHORS file, by searching the git history.
+
+# Run as "AUTHORS.sh" to get complete history
+# Run with "AUTHORS.sh commitish..commitish" for history between tags
+
+# shortlog will canonicalize the names using the file .mailmap
+git shortlog -s ${1-} |
+sed -e 's, via Elfutils-devel,,' |
+cut -b8- | # strip the commit counts
+sort | uniq
diff --git a/ChangeLog b/ChangeLog
index f0cd28a8ddbe..5ad93a16eeba 100644
--- a/ChangeLog
+++ b/ChangeLog
@@ -1,3 +1,8 @@
+2022-04-24  Frank Ch. Eigler  
+
+   * AUTHORS.sh, .mailmap: New files.
+   * AUTHORS: Regenerated to cover entire history of elfutils.
+
 2022-04-24  Mark Wielaard  
 
* configure.ac (AC_CHECK_FUNCS): Add mremap.



Re: run-debuginfod-webapi-concurrency.sh (Was: ☠ Buildbot (GNU Toolchain): elfutils - failed test (failure)) (master)

2022-04-23 Thread Frank Ch. Eigler via Elfutils-devel
Hi -

> But there is another way to prevent the "Server reached connection
> limit. Closing inbound connection." Pass the MHD_USE_ITC flag to
> MHD_start_daemon:

Yeah, that looked promising to me too.  When I was last working on
this, that would have been my next thing to try.  I can't think of
a relevant downside, so let's try it.  (Add a #ifdef guard around
that macro, for older libmicrohttpd, like rhel7 methinks.)

- FChE



Re: run-debuginfod-webapi-concurrency.sh

2022-04-23 Thread Frank Ch. Eigler via Elfutils-devel
Hi -

> [...]
> I think that might be Frank's local experimentation. I also got email
> about my workers being unavailable. If you run a buildbot locally it
> will not see any workers connect and after an hour it will try to
> notify the owners.

Sorry about that.  After this noisy testing, I pushed to the
sourceware builder a configuration change for elfutils builds that
automatically upload testsuite results (both make check and make
distcheck) into the local bunsen testsuite database for future
analysis.

- FChE



Re: [PATCH 2/2] debuginfod: ensure X-DEBUGINFOD-SIZE contains file size

2022-04-22 Thread Frank Ch. Eigler via Elfutils-devel
Hi -

> -  add_mhd_response_header (r, "X-DEBUGINFOD-SIZE",
> -   to_string(fs.st_size).c_str());

> +  rc = fstat (fd, );
> +  if (rc == 0)
> +add_mhd_response_header (r, "X-DEBUGINFOD-SIZE",
> + to_string(fs.st_size).c_str());
> +  else
> +{
> +  close (fd);
> +  throw libc_exception (errno, string("stat ") + b_source1 + " 
> archive " + b_source0);
> +}


It shouldn't require a new fstat -- the archive component file's size
should be available from libarchive already: archive_entry_size(e);

- FChE



Re: PATCH: testsuite debuginfod

2022-04-22 Thread Frank Ch. Eigler via Elfutils-devel
> Date: Sat, 16 Apr 2022 14:37:23 -0400


> Mark Wielaard  writes:
> 
> > I think we should drop this patch for now. Or are you still working on
> > it?
> 
> Sure, let's drop it.
> Will bring in the xfail "cmd..." based one instead before too long.
> 
> - FChE

- FChE



Re: Questions regarding debuginfod.h API

2022-04-16 Thread Frank Ch. Eigler via Elfutils-devel
Hi -

> > > My `debuginfod.h` also does not show any (useful) inline API
> > > documentation for most of that file. Could this please be improved?
> > > The doxygen for dwfl is great and can be read directly together with
> > > the code,
> > 
> > As they say, patches welcome. :-) The header contains some curt
> > comments documenting each function.
> 
> Would you be OK with me simply copying over the contents from the man page 
> over to doxygen? Or is there a better process in place to prevent such kind 
> of 
> documentation duplication? I would have expected that the man pages for API 
> documentation are generated from e.g. doxygen which does not seem to be the 
> case here?

Correct, elfutils does not have much API documentation generally.  The
debuginfod parts are a recent addition that buck that trend. :-) I
don't have a strong opinion about doxygen syntax markup or tooling
dependence, per se.  OTOH I would not want to see documentation
duplication or loss of man pages.

- FChE



Re: ☠ Buildbot (Wildebeest Builder): elfutils - failed test (failure) (master)

2022-04-13 Thread Frank Ch. Eigler via Elfutils-devel
Hi -

> So maybe corruptin the sqlite database prevents a proper shutdown of
> the debuginfod process?

Yeah, that test is a bit blunt and unpredictable.  (We're literally
overwriting a piece of the database file that the process has open and
has random parts in cache.)  I'm thinking this is not worth testing.

- FChE



Re: parallel downloads of multiple debuginfo files

2022-04-08 Thread Frank Ch. Eigler via Elfutils-devel
Hi -

> But once again - isn't this a problem that everyone using dwfl is going to 
> encounter? Should we not try to find a general solution to this problem and 
> fix it for all consumers of the API?

I suspect not many apps are going to have a complete list of files
they know they'll need a priori.  A live profiler tends to find out
about target files gradually.  A debugger uses debuginfo to help
enumerate other debuginfo.  DWZ files can only be even known of when
the base DWARF file is fetched.  So yeah I'm not sure there exists a
widespread general problem for which multithreading is not a workable
solution.

- FChE



Re: parallel downloads of multiple debuginfo files

2022-04-08 Thread Frank Ch. Eigler via Elfutils-devel
Hi -

> one more debuginfod question: Would it be possible to extend the API
> to allow downloading of multiple debug info files in parallel?  The
> `debuginfod_find_*` API currently only supports looking at multiple
> server urls in parallel. I would like to ask multiple files in
> parallel.

Spin up N threads, do one set of debuginfod_begin/_find/_end ops in each.
Bob is your uncle!

- FChE



Re: caching failed lookups of debuginfo?

2022-04-08 Thread Frank Ch. Eigler via Elfutils-devel
Hi -

> another debuginfod related question, but unrelated to the other thread I 
> started earlier today. In a work branch I have ported my heaptrack profiler 
> over to elfutils. I have then run the analyzer that uses elfutils (and thus 
> debuginfod internally via dwfl) on a recorded data file to have it download 
> all the debug info files it can find.

Nice.


> These negative lookups are not cached. Meaning rerunning the same process 
> using dwfl and debuginfod on the same data would always incur a significant 
> slowdown, as we would again and again try to look for something that's not 
> there. The lookups take roughly ~200ms for me to realize the data is not on 
> the server.

That's not correct, as of elfutils 0.184 (commit 5f72c51a7e5c0),
with some more recent tweaks in (commit 7d64173fb11c6).

- FChE



> What's worse, I'm seeing multiple lookups for the same buildid *within the 
> same process*. I.e.:
> 
> ```
> export DEBUGINFOD_VERBOSE=1
> ./heaptrack_interpret ... |& egrep "^url 0 https" | sort | uniq -c | sort
> ...
>   6 url 0 https://debuginfod.archlinux.org/buildid/
> 7f4b16b4b407cbae2d7118d6f99610e29a18a56a/debuginfo
>   8 url 0 https://debuginfod.archlinux.org/buildid/
> c09c6f50f6bcec73c64a0b4be77eadb8f7202410/debuginfo
>  14 url 0 https://debuginfod.archlinux.org/buildid/
> 85766e9d8458b16e9c7ce6e07c712c02b8471dbc/debuginfo
> ```
> 
> Here, we are paying roughly `14 * 0.2s = 2.8s` just for a single library.
> 
> Can we find a way to improve this situation somehow generically? I would 
> personally even be OK with caching the 404 error case locally for some time 
> (say, one hour or one day or ...). Then at least we would at most pay this 
> cost once per library, and not multiple times. And rerunning the analysis a 
> second time would become much faster again.
> 
> Was there a deliberate decision against caching negative server side lookups?
> 
> Thanks
> -- 
> Milian Wolff
> m...@milianw.de
> http://milianw.de




Re: Questions regarding debuginfod.h API

2022-04-08 Thread Frank Ch. Eigler via Elfutils-devel
Hi -

> > (See also the DEBUGINFOD_MAXTIME and DEBUGINFOD_MAXSIZE env vars
> > that can limit this.)

> I did come across those, but what are suggested best practices in
> setting those? When using GDB or a profiler on larger non-trivial UI
> applications on Linux for the first time, we would start to download
> dozens of debug packages, taking hundreds of megabytes of
> space. [...]

Yes, and each of those downloads can be limited by these criteria.  An
application such as gdb could adds its own throttling on top, if it is
going to do a long series of downloads.  (Alternately, gdb can try not
to download all this stuff early; we're investigating gdbindex-only
fetch operations.)


> > > Looking at `debuginfod.h` I `debuginfod_set_progressfn` which looks
> > > ideal. But how do I access the default `debuginfod_client *` which
> > > seems to exist without me ever calling `debuginfod_begin` anywhere?
> > > Or do I need to create a new client here via `debuginfod_begin`?
> > 
> > You do need to create a new client object.  You can reuse it.
> 
> Will the default code that uses debuginfod from within dwfl then pick up my 
> new client object and use that? I feel like there's a fundamental confusion 
> about this on my side about this. More on that at the end of this mail below.

Ah yes, I didn't realize you were talking about the hidden debuginfod
client usage in libdwfl.  Yes, you have not much control over that,
because it is   hidden & automatic.  There is a
DEBUGINFOD_PROGRESS env var with which it can activate some
notification to stderr, but that's about it.  No API to get at the
internal transient objects.

If you wish to take control of this part, you could probably still use
dwfl.  You would do all the debuginfod client API calls yourself, then
"report" the dwarf file content to dwfl via dwarf_begin_elf(),
dwarf_begin(), dwarf_setalt() etc.


> ```
> $ man debuginfod_set_progressfn
> No manual entry for debuginfod_set_progressfn
> ```

Well go complain to your distro for this packaging bug. :-)
It's an alias of [man debuginfod_find_debuginfo].


> My `debuginfod.h` also does not show any (useful) inline API
> documentation for most of that file. Could this please be improved?
> The doxygen for dwfl is great and can be read directly together with
> the code, 

As they say, patches welcome. :-) The header contains some curt
comments documenting each function.

> I never used `man` for that purpose in the past?

There is a whole section of POSIX man pages for API docs: 3.


- FChE



Re: Questions regarding debuginfod.h API

2022-04-08 Thread Frank Ch. Eigler via Elfutils-devel
Hi -

> now that archlinux is supporting debuginfod, I have finally tried it
> out. It's such a game changer, many thanks for everyone involved in
> working on this!

Our pleasure!

> Now to my question: In applications using elfutils, we will now
> automatically download debug information when DEBUGINFOD_URLS is
> defined. But doing that can take a very long time. 

(See also the DEBUGINFOD_MAXTIME and DEBUGINFOD_MAXSIZE env vars
that can limit this.)

> I would like to give the user feedback about this situation and
> ideally provide a means to cancel the download.

OK.

> Looking at `debuginfod.h` I `debuginfod_set_progressfn` which looks
> ideal. But how do I access the default `debuginfod_client *` which
> seems to exist without me ever calling `debuginfod_begin` anywhere?
> Or do I need to create a new client here via `debuginfod_begin`?

You do need to create a new client object.  You can reuse it.

> Then, how would I cancel an ongoing download job? GDB seems to do
> that when I press `CTRL+C`, but I do not see any API I could use for
> that purpose?

See [man debuginfod_set_progressfn].  The return code from that
progressfn callback is the ticket.


> Finally, what is the recommended way to check whether debuginfod is
> available?  Should one rely on the build system to discover the
> `debuginfod.h` header file, or is some other compile time check
> suggested to detect it?  [...]

To decide whether or not to compile in support, you'd need a
compile-time check such as for the debuginfod.h header.  (Alternately,
you could opt not to compile in support at all, and instead call out
to the debuginfod-find(1) program.)  To decide at run time whether or
not to use it, you could just use the *_find APIs and get back an
error code if things are not set up.  Or you can check the
DEBUGINFOD_URLS for being set/unset before you call in.


- FChE



Re: patch: debuginfod ipv4-ipv6 dual-stack

2022-04-04 Thread Frank Ch. Eigler via Elfutils-devel
Hi -


> That IN6_IS_ADDR_V4MAPPED is funny. Is that actually used in practice?

Yeah, actually all the time.  On a dual-stack socket, an ipv4 connection
gets an ipv6 peer-address that renders like :::1.2.3.4.


> Too bad this cannot be merged with conninfo above.
> But this calculates less info.

Yeah.


> >obatched(clog) << "started http server on "
> > - << (d4 != NULL ? "IPv4 " : "")
> > - << (d6 != NULL ? "IPv6 " : "")
> > + << "IPv4 "
> > + << "IPv6 "
> >   << "port=" << http_port << endl;
> 
> This keeps the log output the same, but I wouldn't mind just removing
> the IPv4 and IPv6 strings.

Yeah, makes sense.  OK, will push with that change.


- FChE



patch: debuginfod ipv4-ipv6 dual-stack

2022-04-03 Thread Frank Ch. Eigler via Elfutils-devel
Hi -

This little improvement reduces wasted memory from duplicated
worker threads for ipv4 vs ipv6 stacks.


commit 3a00f412b6554ba70f7b7e3d3aa6f67f278868af (HEAD -> master)
Author: Frank Ch. Eigler 
Date:   Sun Apr 3 19:42:48 2022 -0400

debuginfod: use single ipv4+ipv6 microhttpd daemon configuration

Use a single MHD_USE_DUAL_STACK mhd daemon.  This way, the thread
connection pool is not doubled, saving memory and better matching user
expectations.  A slight tweak to logging is required to pull IPv4
remote addresses back out, and also to allow IPv6 ::-laden address
forwarding through federation links.

Signed-off-by: Frank Ch. Eigler 

diff --git a/debuginfod/ChangeLog b/debuginfod/ChangeLog
index 38a389e7dfd3..578d951d0102 100644
--- a/debuginfod/ChangeLog
+++ b/debuginfod/ChangeLog
@@ -1,3 +1,12 @@
+2022-04-03  Frank Ch. Eigler 
+
+   * debuginfod.cxx (main): Use single dual-stack daemon setup,
+   rather than duplicate ipv4 and ipv6.
+   (conninfo, handle_buildid): Represent ipv4-mapped ipv6 addresses
+   in their native ipv4 form for logging and X-F-F: purposes.
+   * debuginfod-client.c (debuginfod_add_http_header): Tolerate
+   colons in http header values.
+
 2022-04-03  Frank Ch. Eigler 
 
* debuginfod.cxx (main): Use MHD_USE_EPOLL for libmicrohttpd, to
diff --git a/debuginfod/debuginfod-client.c b/debuginfod/debuginfod-client.c
index 024b09545d99..41ba88a56911 100644
--- a/debuginfod/debuginfod-client.c
+++ b/debuginfod/debuginfod-client.c
@@ -1548,13 +1548,13 @@ int debuginfod_find_source(debuginfod_client *client,
 int debuginfod_add_http_header (debuginfod_client *client, const char* header)
 {
   /* Sanity check header value is of the form Header: Value.
- It should contain exactly one colon that isn't the first or
+ It should contain at least one colon that isn't the first or
  last character.  */
-  char *colon = strchr (header, ':');
-  if (colon == NULL
-  || colon == header
-  || *(colon + 1) == '\0'
-  || strchr (colon + 1, ':') != NULL)
+  char *colon = strchr (header, ':'); /* first colon */
+  if (colon == NULL /* present */
+  || colon == header /* not at beginning - i.e., have a header name */
+  || *(colon + 1) == '\0') /* not at end - i.e., have a value */
+/* NB: but it's okay for a value to contain other colons! */
 return -EINVAL;
 
   struct curl_slist *temp = curl_slist_append (client->headers, header);
diff --git a/debuginfod/debuginfod.cxx b/debuginfod/debuginfod.cxx
index 99924d36f260..48e0f37b33f0 100644
--- a/debuginfod/debuginfod.cxx
+++ b/debuginfod/debuginfod.cxx
@@ -1063,9 +1063,22 @@ conninfo (struct MHD_Connection * conn)
 sts = getnameinfo (so, sizeof (struct sockaddr_in), hostname, sizeof 
(hostname), servname,
sizeof (servname), NI_NUMERICHOST | NI_NUMERICSERV);
   } else if (so && so->sa_family == AF_INET6) {
-sts = getnameinfo (so, sizeof (struct sockaddr_in6), hostname, sizeof 
(hostname),
-   servname, sizeof (servname), NI_NUMERICHOST | 
NI_NUMERICSERV);
+struct sockaddr_in6* addr6 = (struct sockaddr_in6*) so;
+if (IN6_IS_ADDR_V4MAPPED(>sin6_addr)) {
+  struct sockaddr_in addr4;
+  memset (, 0, sizeof(addr4));
+  addr4.sin_family = AF_INET;
+  addr4.sin_port = addr6->sin6_port;
+  memcpy (_addr.s_addr, addr6->sin6_addr.s6_addr+12, 
sizeof(addr4.sin_addr.s_addr));
+  sts = getnameinfo ((struct sockaddr*) , sizeof (addr4),
+ hostname, sizeof (hostname), servname, sizeof 
(servname),
+ NI_NUMERICHOST | NI_NUMERICSERV);
+} else {
+  sts = getnameinfo (so, sizeof (struct sockaddr_in6), hostname, sizeof 
(hostname), NULL, 0,
+ NI_NUMERICHOST);
+}
   }
+  
   if (sts != 0) {
 hostname[0] = servname[0] = '\0';
   }
@@ -2008,13 +2021,26 @@ and will not query the upstream servers");

MHD_CONNECTION_INFO_CLIENT_ADDRESS);
   struct sockaddr *so = u ? u->client_addr : 0;
   char hostname[256] = ""; // RFC1035
-  if (so && so->sa_family == AF_INET)
+  if (so && so->sa_family == AF_INET) {
 (void) getnameinfo (so, sizeof (struct sockaddr_in), hostname, 
sizeof (hostname), NULL, 0,
 NI_NUMERICHOST);
-  else if (so && so->sa_family == AF_INET6)
-(void) getnameinfo (so, sizeof (struct sockaddr_in6), hostname, 
sizeof (hostname), NULL, 0,
-NI_NUMERICHOST);
-
+  } else if (so && so->sa_family == AF_INET6) {
+struct sockaddr_in6* addr6 = (struct sockaddr_in6*) so;
+if (IN6_IS_ADDR_V4MAPPED(>sin6_addr)) {
+  struct sockaddr_in addr4;
+  memset (, 0, sizeof(addr4));
+  addr4.sin_family = AF_INET;
+  addr4.sin_port = 

[PATCH] PR28708 debuginfod testsuite

2022-04-03 Thread Frank Ch. Eigler via Elfutils-devel
Hi -

Planning to commit this shortly.


commit e646e363e72e06e0ed5574c929236d815ddcbbaf (HEAD -> master)
Author: Frank Ch. Eigler 
Date:   Sun Apr 3 12:47:17 2022 -0400

PR28708: debuginfod: use MHD_USE_EPOLL for microhttpd threads

Testing on s390x and other architectures indicates that this
configuration reduces thundering-herd wakeups and saturation of a
small number of threads.  The run-debuginfod-webapi-concurrency.sh
test appears solid now.

Signed-off-by: Frank Ch. Eigler 

diff --git a/debuginfod/ChangeLog b/debuginfod/ChangeLog
index dfb5d42ec8a5..38a389e7dfd3 100644
--- a/debuginfod/ChangeLog
+++ b/debuginfod/ChangeLog
@@ -1,3 +1,8 @@
+2022-04-03  Frank Ch. Eigler 
+
+   * debuginfod.cxx (main): Use MHD_USE_EPOLL for libmicrohttpd, to
+   encourage more round-robin dispatch of incoming connections.
+
 2021-12-09  Alexander Kanavin 
 
* debuginfod-client.c (cache_clean_default_interval_s): Change type to
diff --git a/debuginfod/debuginfod.cxx b/debuginfod/debuginfod.cxx
index bb8e8e102896..99924d36f260 100644
--- a/debuginfod/debuginfod.cxx
+++ b/debuginfod/debuginfod.cxx
@@ -3880,6 +3880,9 @@ main (int argc, char *argv[])
  | MHD_USE_INTERNAL_POLLING_THREAD
 #else
  | MHD_USE_SELECT_INTERNALLY
+#endif
+#ifdef MHD_USE_EPOLL
+ | MHD_USE_EPOLL
 #endif
  | MHD_USE_DEBUG, /* report errors to 
stderr */
  http_port,
@@ -3894,6 +3897,9 @@ main (int argc, char *argv[])
  | MHD_USE_INTERNAL_POLLING_THREAD
 #else
  | MHD_USE_SELECT_INTERNALLY
+#endif
+#ifdef MHD_USE_EPOLL
+ | MHD_USE_EPOLL
 #endif
  | MHD_USE_IPv6
  | MHD_USE_DEBUG, /* report errors to 
stderr */



Re: [PATCH] Avoid dlopen on debuginfod-client when not configured

2022-03-29 Thread Frank Ch. Eigler via Elfutils-devel
Hi -

> > Avoid dlopen() when no debuginfo url is set.
> 
> The patch itself looks right. But I am slightly afraid this
> (theoretically?) will break some programs which set DEBUGINFOD_URLS
> themselves. We currently have no other way to make libdwfl use
> debuginfod-client. Thoughts?

I've been thinking that doing all this dlopen work in the real dwfl
function(s) rather than the solib ctor would be fine.

- FChE



Re: Signing keys for releases?

2022-02-25 Thread Frank Ch. Eigler via Elfutils-devel
Hi -

> Could you please point me toward the PGP/GPG public keys necessary to
> validate the elfutils release tarballs against their .sig files?

Try these:

https://sourceware.org/elfutils/ftp/gpgkey-1AA44BE649DE760A.gpg
https://sourceware.org/elfutils/ftp/gpgkey-5C1D1AA44BE649DE760A.gpg

- FChE



RFC PATCH: debuginfod client doc tweak

2022-01-31 Thread Frank Ch. Eigler via Elfutils-devel
commit c4726913b7766634cc5d8a97624e1b4708452d73 (HEAD -> master)
Author: Frank Ch. Eigler 
Date:   Mon Jan 31 18:13:40 2022 -0500

man debuginfod-client-config.7: Elaborate $DEBUGINFOD_URLS

Add reference to /etc/profile.d and /etc/debuginfod/*.urls as possible
source of default.  (No need to autoconf @prefix@ it, these paths are
customarily distro standard rather than elfutils configurables.)
Drop warning about federation loops, due to protection via PR27917 (0.186).

Signed-off-by: Frank Ch. Eigler 

diff --git a/doc/ChangeLog b/doc/ChangeLog
index f25bcd2ee2b3..303e3dc05dc5 100644
--- a/doc/ChangeLog
+++ b/doc/ChangeLog
@@ -1,3 +1,7 @@
+2022-01-31  Frank Ch. Eigler  
+
+   * debuginfod-client-config.7: Elaborate DEBUGINFOD_URLS.
+
 2021-12-08  Frank Ch. Eigler  
 
PR28661
diff --git a/doc/debuginfod-client-config.7 b/doc/debuginfod-client-config.7
index 1cc19215dd6d..fecc6038df75 100644
--- a/doc/debuginfod-client-config.7
+++ b/doc/debuginfod-client-config.7
@@ -24,8 +24,8 @@ temporary files.  The default is /tmp.
 .B $DEBUGINFOD_URLS
 This environment variable contains a list of URL prefixes for trusted
 debuginfod instances.  Alternate URL prefixes are separated by space.
-Avoid referential loops that cause a server to contact itself, directly
-or indirectly - the results would be hilarious.
+This environment variable may be set by /etc/profile.d scripts
+reading /etc/debuginfod/*.urls files.
 
 .TP
 .B $DEBUGINFOD_CACHE_PATH



Re: [PATCH] config: simplify profile.*sh.in

2022-01-18 Thread Frank Ch. Eigler via Elfutils-devel
Hi -


> 1. Remove needless profile=@profile@

You mean "prefix=@prefix@".  And it's not needless,
because @sysconfdir@ often expands to "$prefix/something",
which requires a prefix var to be set for evaluation.

> 2. Simplify needless sh -c "cat glob 2>/dev/null"
>into cat glob 2>/dev/null

This is not needless, but I forget the exact details.  It probably has
to do with the $prefix expansion just above, or perhaps glob
non-matching error handling.

> 3. Use $( instead of ` under sh

I'm neutral on this, assuming it's not a bashism.

> 4. Assign to D_U directly and either export it or unset it

I'm neutral on this.


- FChE



Re: patch rfc: PR28661: debuginfod thread-pool

2021-12-09 Thread Frank Ch. Eigler via Elfutils-devel
Hi, Mark -

> So I would recommend to simply add a testcase that just uses no-option, 
> -C and -C 256 (or take any arbitrary number, 1 might be an interesting
> corner case) and see that if you have 4 (8, 16, ...?) debuginfod-find
> (or curl metric) processes doing some parallel queries works as
> expected.

OK, here's the current version, with a test, and with one more small
improvement to the error message printer (to add timestamp & batching).


Author: Frank Ch. Eigler 
Date:   Wed Dec 8 22:41:47 2021 -0500

PR28661: debuginfo connection thread pool support

Add an option -C, which activates libmicrohttpd's thread-pool mode for
handling incoming http connections.  Add libmicrohttpd error-logging
callback function so as to receive indication of its internal errors,
and relay counts to our metrics.  Some of these internal errors tipped
us off to a microhttpd bug that thread pooling works around.  Document
in debuginfod.8 page.  Hand-tested against "ulimit -u NNN" shells, and
with a less strenuous new test case.

Signed-off-by: Frank Ch. Eigler 

diff --git a/NEWS b/NEWS
index 490932ae4ef9..6be58866f100 100644
--- a/NEWS
+++ b/NEWS
@@ -1,3 +1,8 @@
+Version 0.187 after 0.186
+
+debuginfod: Support -C option for connection thread pooling.
+
+
 Version 0.186
 
 debuginfod-client: Default $DEBUGINFOD_URLS is computed from drop-in files
diff --git a/debuginfod/ChangeLog b/debuginfod/ChangeLog
index df373201d5c6..2642ef5eeaf1 100644
--- a/debuginfod/ChangeLog
+++ b/debuginfod/ChangeLog
@@ -3,6 +3,14 @@
* debuginfod.cxx (database_stats_report): Don't format clog
using 'right' and 'setw(20)'.
 
+2021-12-08  Frank Ch. Eigler  
+
+   * debuginfod.cxx (connection_pool): New global.
+   (parse_opt): Parse & check -C option to set it.
+   (error_cb): New callback for libmicrohttpd error counting.
+   (main): Activate MHD_OPTION_THREAD_POOL_SIZE if appropriate.
+   Activate error_cb.
+
 2021-12-04  Mark Wielaard  
 
* debuginfod.cxx (main): Call debuginfod_pool_groom before exit.
diff --git a/debuginfod/debuginfod.cxx b/debuginfod/debuginfod.cxx
index 887e4f5abd44..bb8e8e102896 100644
--- a/debuginfod/debuginfod.cxx
+++ b/debuginfod/debuginfod.cxx
@@ -353,7 +353,9 @@ static const struct argp_option options[] =
{ "rescan-time", 't', "SECONDS", 0, "Number of seconds to wait between 
rescans, 0=disable.", 0 },
{ "groom-time", 'g', "SECONDS", 0, "Number of seconds to wait between 
database grooming, 0=disable.", 0 },
{ "maxigroom", 'G', NULL, 0, "Run a complete database groom/shrink pass at 
startup.", 0 },
-   { "concurrency", 'c', "NUM", 0, "Limit scanning thread concurrency to 
NUM.", 0 },
+   { "concurrency", 'c', "NUM", 0, "Limit scanning thread concurrency to NUM, 
default=#CPUs.", 0 },
+   { "connection-pool", 'C', "NUM", OPTION_ARG_OPTIONAL,
+ "Use webapi connection pool with NUM threads, default=unlim.", 0 },
{ "include", 'I', "REGEX", 0, "Include files matching REGEX, default=all.", 
0 },
{ "exclude", 'X', "REGEX", 0, "Exclude files matching REGEX, 
default=none.", 0 },
{ "port", 'p', "NUM", 0, "HTTP port to listen on, default 8002.", 0 },
@@ -412,6 +414,7 @@ static unsigned rescan_s = 300;
 static unsigned groom_s = 86400;
 static bool maxigroom = false;
 static unsigned concurrency = std::thread::hardware_concurrency() ?: 1;
+static int connection_pool = 0;
 static set source_paths;
 static bool scan_files = false;
 static map scan_archives;
@@ -558,6 +561,16 @@ parse_opt (int key, char *arg,
   concurrency = (unsigned) atoi(arg);
   if (concurrency < 1) concurrency = 1;
   break;
+case 'C':
+  if (arg)
+{
+  connection_pool = atoi(arg);
+  if (connection_pool < 2)
+argp_failure(state, 1, EINVAL, "-C NUM minimum 2");
+}
+  else // arg not given
+connection_pool = std::thread::hardware_concurrency() * 2 ?: 2;
+  break;
 case 'I':
   // NB: no problem with unconditional free here - an earlier failed 
regcomp would exit program
   if (passive_p)
@@ -1368,6 +1381,7 @@ class libarchive_fdcache
   if (verbose > 3)
 obatched(clog) << "fdcache interned a=" << a << " b=" << b
<< " fd=" << fd << " mb=" << mb << " front=" << front_p 
<< endl;
+  
   set_metrics();
 }
 
@@ -3708,7 +3722,15 @@ sigusr2_handler (int /* sig */)
 }
 
 
-
+static void // error logging callback from libmicrohttpd internals
+error_cb (void *arg, const char *fmt, va_list ap)
+{
+  (void) arg;
+  inc_metric("error_count","libmicrohttpd",fmt);
+  char errmsg[512];
+  (void) vsnprintf (errmsg, sizeof(errmsg), fmt, ap); // ok if slightly 
truncated
+  obatched(cerr) << "libmicrohttpd error: " << errmsg; // MHD_DLOG calls 
already include \n
+}
 
 
 // A user-defined sqlite function, to score the sharedness of the
@@ -3853,7 +3875,7 @@ main (int argc, char *argv[])
 
   // Start httpd server threads. 

Re: patch rfc: PR28661: debuginfod thread-pool

2021-12-09 Thread Frank Ch. Eigler via Elfutils-devel
Hi -

> [...]
> If you can use ulimit -u or ulimit -T in the run-test.sh script then
> please use that, but that probably requires launching sub-shells and
> you quickly end up in shell-hell.

A problem I found with that is that ulimit -u appears to be systemwide
in the sense that a new process/thread will be prevented if the total
systemwide processes of that user exceed this number.  It's not just
for accounting that particular processes' own children.  (ulimit -T
doesn't work on my f35 machine with bash or zsh or csh.)


> So I would recommend to simply add a testcase that just uses no-option, 
> -C and -C 256 (or take any arbitrary number, 1 might be an interesting
> corner case) and see that if you have 4 (8, 16, ...?) debuginfod-find
> (or curl metric) processes doing some parallel queries works as
> expected.

Yeah, I can/will do that.  Well, for any of these -C settings, even
thousands of concurrently issued curl jobs should succeed, just not
quite as quickly.

- FChE



Re: right, trying --enable-sanitizer-address on armv7l

2021-12-09 Thread Frank Ch. Eigler via Elfutils-devel
Hi -

> I was trying the new --enable-sanitizer-address on our armv7l buildbot
> worker and it almost works as is, except for...
> 
> debuginfod.cxx:3472:12: runtime error: reference binding to misaligned
> address 0x00561ec9 for type '', which requires 2 byte
> alignment
> [...]
> But I don't understand why. It might be a bug in gcc/libasan (this is
> gcc 8.3.0 Debian 10.11 - Buster). [...]

It must be a bug in gcc/libasan or something.

> Also, do we really want to right align the log here? We don't seem to
> align the log text anywhere else.

This one just prettifies the messages because there is a sequence
metrics & values being printed at startup, so it makes the numbers
line up.  But no great loss to drop; we export those as prometheus
metrics too.

- FChE



Re: [PATCH] debuginfod/debuginfod-client.c: use long for cache time configurations

2021-12-09 Thread Frank Ch. Eigler via Elfutils-devel
Hi -

> x32, riscv32, arc use 64bit time_t even while they are 32bit
> architectures, therefore directly using integer printf formats will not
> work portably.

> Use a plain long everywhere as the intervals are small enough
> that it will not be problematic.

lgtm!

- FChE



patch rfc: PR28661: debuginfod thread-pool

2021-12-08 Thread Frank Ch. Eigler via Elfutils-devel
Hi -

While I think this patch itself is fine, and works around the
libmicrohttpd bug that motivated it, I don't know how to test it
seriously in the testsuite.  (We can certainly try few -C options for
parsing & operability.)  The error edge cases only appear to occur
under very high load and task limits such as those imposed by systemd
cgroups.


Author: Frank Ch. Eigler 
Date:   Wed Dec 8 22:41:47 2021 -0500

PR28661: debuginfo connection thread pool support

Add an option -C, which activates libmicrohttpd's thread-pool mode for
handling incoming http connections.  Add libmicrohttpd error-logging
callback function so as to receive indication of its internal errors,
and relay counts to our metrics.  Some of these internal errors tipped
us off to a microhttpd bug that thread pooling works around.  Document
in debuginfod.8 page.  Hand-tested against "ulimit -u NNN" shells.

Signed-off-by: Frank Ch. Eigler 

diff --git a/debuginfod/ChangeLog b/debuginfod/ChangeLog
index 21d0721ef080..125e7d816f51 100644
--- a/debuginfod/ChangeLog
+++ b/debuginfod/ChangeLog
@@ -1,3 +1,11 @@
+2021-12-08  Frank Ch. Eigler  
+
+   * debuginfod.cxx (connection_pool): New global.
+   (parse_opt): Parse & check -C option to set it.
+   (error_cb): New callback for libmicrohttpd error counting.
+   (main): Activate MHD_OPTION_THREAD_POOL_SIZE if appropriate.
+   Activate error_cb.
+
 2021-12-01  Mark Wielaard  
 
* debuginfod-client.c (debuginfod_query_server): Free tmp_url on
diff --git a/debuginfod/debuginfod.cxx b/debuginfod/debuginfod.cxx
index 0d3f02978ee2..bb0c6cd65670 100644
--- a/debuginfod/debuginfod.cxx
+++ b/debuginfod/debuginfod.cxx
@@ -352,7 +352,9 @@ static const struct argp_option options[] =
{ "rescan-time", 't', "SECONDS", 0, "Number of seconds to wait between 
rescans, 0=disable.", 0 },
{ "groom-time", 'g', "SECONDS", 0, "Number of seconds to wait between 
database grooming, 0=disable.", 0 },
{ "maxigroom", 'G', NULL, 0, "Run a complete database groom/shrink pass at 
startup.", 0 },
-   { "concurrency", 'c', "NUM", 0, "Limit scanning thread concurrency to 
NUM.", 0 },
+   { "concurrency", 'c', "NUM", 0, "Limit scanning thread concurrency to NUM, 
default=#CPUs.", 0 },
+   { "connection-pool", 'C', "NUM", OPTION_ARG_OPTIONAL,
+ "Use webapi connection pool with NUM threads, default=unlim.", 0 },
{ "include", 'I', "REGEX", 0, "Include files matching REGEX, default=all.", 
0 },
{ "exclude", 'X', "REGEX", 0, "Exclude files matching REGEX, 
default=none.", 0 },
{ "port", 'p', "NUM", 0, "HTTP port to listen on, default 8002.", 0 },
@@ -411,6 +413,7 @@ static unsigned rescan_s = 300;
 static unsigned groom_s = 86400;
 static bool maxigroom = false;
 static unsigned concurrency = std::thread::hardware_concurrency() ?: 1;
+static int connection_pool = 0;
 static set source_paths;
 static bool scan_files = false;
 static map scan_archives;
@@ -557,6 +560,16 @@ parse_opt (int key, char *arg,
   concurrency = (unsigned) atoi(arg);
   if (concurrency < 1) concurrency = 1;
   break;
+case 'C':
+  if (arg)
+{
+  connection_pool = atoi(arg);
+  if (connection_pool < 2)
+argp_failure(state, 1, EINVAL, "-C NUM minimum 2");
+}
+  else // arg not given
+connection_pool = std::thread::hardware_concurrency() * 2 ?: 2;
+  break;
 case 'I':
   // NB: no problem with unconditional free here - an earlier failed 
regcomp would exit program
   if (passive_p)
@@ -3684,7 +3698,13 @@ sigusr2_handler (int /* sig */)
 }
 
 
-
+static void // error logging callback from libmicrohttpd internals
+error_cb (void *arg, const char *fmt, va_list ap)
+{
+  (void) arg;
+  inc_metric("error_count","libmicrohttpd",fmt);
+  (void) vdprintf (STDERR_FILENO, fmt, ap);
+}
 
 
 // A user-defined sqlite function, to score the sharedness of the
@@ -3829,7 +3849,7 @@ main (int argc, char *argv[])
 
   // Start httpd server threads.  Separate pool for IPv4 and IPv6, in
   // case the host only has one protocol stack.
-  MHD_Daemon *d4 = MHD_start_daemon (MHD_USE_THREAD_PER_CONNECTION
+  MHD_Daemon *d4 = MHD_start_daemon ((connection_pool ? 0 : 
MHD_USE_THREAD_PER_CONNECTION)
 #if MHD_VERSION >= 0x00095300
  | MHD_USE_INTERNAL_POLLING_THREAD
 #else
@@ -3839,8 +3859,11 @@ main (int argc, char *argv[])
  http_port,
  NULL, NULL, /* default accept policy */
  handler_cb, NULL, /* handler callback */
+ MHD_OPTION_EXTERNAL_LOGGER, error_cb, 
NULL,
+ (connection_pool ? 
MHD_OPTION_THREAD_POOL_SIZE : MHD_OPTION_END),
+ (connection_pool ? (int)connection_pool : 
MHD_OPTION_END),
  MHD_OPTION_END);
-  

Re: [PATCH] debuginfod/debuginfod-client.c: correct string format on 32bit arches with 64bit time_t

2021-12-08 Thread Frank Ch. Eigler via Elfutils-devel
Hi -

> [...]
> We seem to not expect these intervals to be much bigger than a week,
> so an int should always be big enough (even when stretched up to a
> whole year).

Yes, ints are fine for these humane-number-of-seconds kinds of values
in the cache configuration.  There's no need for maximum length
integers or even longs for purposes of storage/parse.  In the later
interval arithmetic related to time_t values, we can widen them to
(time_t) then and there.

- FChE



obv patch: debuginfod concurrency fix

2021-12-08 Thread Frank Ch. Eigler via Elfutils-devel
Hi -

Committing as obvious.

Author: Frank Ch. Eigler 
Date:   Wed Dec 8 10:20:58 2021 -0500

debuginfod: correct concurrency bug in fdcache metrics

The intern() function called set_metrics() outside a necessary lock
being held.  helgrind identified this race condition.  No QA impact.

Signed-off-by: Frank Ch. Eigler 

diff --git a/debuginfod/ChangeLog b/debuginfod/ChangeLog
index 8c937d6629a3..8cbaa9aa6fd9 100644
--- a/debuginfod/ChangeLog
+++ b/debuginfod/ChangeLog
@@ -1,3 +1,7 @@
+2021-12-08  Frank Ch. Eigler  
+
+   * debuginfod.cxx (intern): Call set_metrics() holding the fdcache mutex.
+
 2021-12-04  Mark Wielaard  
 
* debuginfod-client.c (debuginfod_query_server): Free winning_headers.
diff --git a/debuginfod/debuginfod.cxx b/debuginfod/debuginfod.cxx
index 0d3f02978ee2..a26e7e8fce37 100644
--- a/debuginfod/debuginfod.cxx
+++ b/debuginfod/debuginfod.cxx
@@ -1354,8 +1354,8 @@ class libarchive_fdcache
   if (verbose > 3)
 obatched(clog) << "fdcache interned a=" << a << " b=" << b
<< " fd=" << fd << " mb=" << mb << " front=" << front_p 
<< endl;
+  set_metrics();
 }
-set_metrics();
 
 // NB: we age the cache at lookup time too
 if (statfs_free_enough_p(tmpdir, "tmpdir", fdcache_mintmp))



Re: [PATCHv3] debuginfod: Check result of calling MHD_add_response_header.

2021-12-08 Thread Frank Ch. Eigler via Elfutils-devel
Hi -

> > Why move the *size assignment in there?
>
> Because both statements are unnecessary if r == NULL (aka the response
> couldn't even be created). [...]
> But it is also harmless to do, so if you want I can move it outside the
> if statement.

OK, whichever, doesn't much matter then.

- FChE



Re: [PATCHv3] debuginfod: Check result of calling MHD_add_response_header.

2021-12-08 Thread Frank Ch. Eigler via Elfutils-devel
Hi -

> Although unlikely the MHD_add_response_header can fail for
> various reasons.  If it fails something odd is going on.
> So check we can actually add a response header and log an
> error if we cannot.

Sure, if you insist. :-)

except:

> -  *size = os.size();
> -  MHD_add_response_header (r, "Content-Type", "text/plain");
> +  if (r != NULL)
> +{
> +  *size = os.size();
> +  add_mhd_response_header (r, "Content-Type", "text/plain");
> +}

Why move the *size assignment in there?


> -  *size = version.size ();
> -  MHD_add_response_header (r, "Content-Type", "text/plain");
> +  if (r != NULL)
> +{
> +  *size = version.size ();
> +  add_mhd_response_header (r, "Content-Type", "text/plain");
> +}
>return r;
>  }

ditto?


- FChE



Re: [PATCH] debuginfod: Clear and reset debuginfod_client winning_headers on reuse

2021-12-06 Thread Frank Ch. Eigler via Elfutils-devel
Hi -

> gcc address sanitizer detected a leak of the debuginfod_client
> winning_headers when the handle was reused. Make sure to free and
> reset the winning_headers field before reuse.

This is good.

(Note we're not actually using these winning_headers bits for
anything.  That was to be part of the "return headers to application"
work whose API we were discussing under PR28284.  I suspect it'll come
to a more thorough motivation & understanding when we try to work out
PR28204 - passing file signatures).

- FChE



Re: [PATCH v2] libebl: recognize FDO Packaging Metadata ELF note

2021-12-02 Thread Frank Ch. Eigler via Elfutils-devel
Hi -

> JSON has been targeted at the Windows/Java UTF-16 world, there is always
> going to be a mismatch if you try to represent it in UTF-8 or anything
> that doesn't have surrogate pairs.

The JSON RFC8259 8.1 mandates UTF-8 encoding for situations like ours.


> > Yes, and yet we have had the bidi situation recently where UTF-8 raw
> > codes could visually confuse a human reader whereas escaped \u
> > wouldn't.  If we forbid \u unilaterally, we literally become
> > incompatible with JSON (RFC8259 7. String. "Any character may be
> > escaped."), and for what?
> 
> RFC 8259 says this:
> 
>However, the ABNF in this specification allows member names and
>string values to contain bit sequences that cannot encode Unicode
>characters; for example, "\uDEAD" (a single unpaired UTF-16
>surrogate).  Instances of this have been observed, for example, when
>a library truncates a UTF-16 string without checking whether the
>truncation split a surrogate pair.  The behavior of software that
>receives JSON texts containing such values is unpredictable; for
>example, implementations might return different values for the length
>of a string value or even suffer fatal runtime exceptions.
> 
> A UTF-8 environment has to enforce *some* additional constraints
> compared to the official JSON syntax.

I'm sorry, I don't see how.  If a JSON string were to include the
suspect "\uDEAD", but from observing our hypothetical "no escapes!"
rule they could reencode it as the UTF-8 octets 0xED 0xBA 0xAD.
ISTM we're no better off.


- FChE



Re: [PATCHv2] debuginfod: Check result of calling MHD_add_response_header.

2021-12-01 Thread Frank Ch. Eigler via Elfutils-devel
Hi -

> Although unlikely the MHD_add_response_header can fail for
> various reasons.  If it fails something odd is going on.
> So check we can actually add a response header and log an
> error if we cannot.

TBH I wouldn't bother even this much checking.  It just uglifies the
code.  If it would make covscan happier, I'd stick a (void) in front
of the add-header calls.



> -MHD_add_response_header (r, "Content-Type", "text/plain");
> -MHD_RESULT rc = MHD_queue_response (c, code, r);
> +MHD_RESULT rc1, rc2;
> +rc1 = MHD_add_response_header (r, "Content-Type", "text/plain");
> +rc2 = MHD_queue_response (c, code, r);
>  MHD_destroy_response (r);
> -return rc;
> +return (rc1 == MHD_NO || rc2 == MHD_NO) ? MHD_NO : MHD_YES;


e.g. this part won't work: returning MHD_NO causes libmicrohttpd to
send a 503 error back to the caller, regardless of our intended one.

> +if (MHD_add_response_header (resp, "Last-Modified", datebuf) == 
> MHD_NO)
> +  if (verbose)
> +obatched(clog) << "Error: couldn't add Last-Modified header"
> +   << endl;
>  }

e.g., we normally report errors to the logs, regardless of verbosity
settings.


> +  if (MHD_add_response_header (r, "Content-Type",
> +   "application/octet-stream") == MHD_NO
> +  || MHD_add_response_header (r, "X-DEBUGINFOD-SIZE",
> +  to_string(s.st_size).c_str()) == MHD_NO
> +  || MHD_add_response_header (r, "X-DEBUGINFOD-FILE",
> +  file.c_str()) == MHD_NO)

e.g., this formulation makes it impossible to add some headers if a
previous one failed.


- FChE



Re: [PATCH] debuginfod: Check result of calling MHD_add_response_header.

2021-12-01 Thread Frank Ch. Eigler via Elfutils-devel
Hi -

> Although unlikely the MHD_add_response_header can fail for
> various reasons.  If it fails something odd is going on.
> So check we can actually add a response header before using
> the response.

ISTM it is okay to send the response object (the body), even if
something goes wrong with adding optional headers later.


- FChE



Re: [PATCH v2] libebl: recognize FDO Packaging Metadata ELF note

2021-11-30 Thread Frank Ch. Eigler via Elfutils-devel
Hi -

On Tue, Nov 30, 2021 at 12:25:41PM +0100, Mark Wielaard wrote:
> [...]
> The spec does explain the requirements for JSON numbers, but doesn't
> mention any for JSON strings or JSON objects. It would be good to also
> explain how to make the strings and objects unambiguous. [...]
> For Objects it should require that all names are unique. [...]
> For Strings it should require that \u escaping isn't used [...]
> 
> That should get rid of various corner cases that different parsers are
> known to get wrong. 

Are such buggy parsers seen in the wild, now, after all this time with
JSON?  It seems to me it's not really elfutils' or systemd's place to
impose -additional- constraints on customary JSON.


> Especially \u escaping is really confusing when using the UTF-8
> encoding (and it is totally necessary since UTF-8 can express any
> valid UTF character already).

Yes, and yet we have had the bidi situation recently where UTF-8 raw
codes could visually confuse a human reader whereas escaped \u
wouldn't.  If we forbid \u unilaterally, we literally become
incompatible with JSON (RFC8259 7. String. "Any character may be
escaped."), and for what?

Both these seem to be personal aesthetic decisions that need not be
imposed on a routine application of well-established standards.  We
have many industrial-strength json parser libraries to choose from,
so that part is IMO no longer a timely concern.


- FChE



  1   2   3   >