Re: Error reporting should be improved

2009-12-30 Thread Peter Rosin

Den 2009-12-30 22:22 skrev Bob Friesenhahn:

On Wed, 30 Dec 2009, Peter Rosin wrote:


Aha, you have misunderstood the patch, that's why we don't understand
each other. The patch does indeed allocate a buffer, that's the whole
point of sending the error message to lt_dladderror. The error message
is cached indefinitely inside lt_dladderror. In order to not grow that


In what line of code is a string buffer allocated?  The only buffer 
allocation I see is a REALLOC to increase the buffer which holds 
diagnostic pointers.  Then the passed pointer is assigned directly to 
the pointer at errindex without any allocation or string copy.


Oh, I knew I was too tired when I wrote that patch. You are of course
correct.

*blush*

Cheers,
Peter





Re: Error reporting should be improved

2009-12-30 Thread Bob Friesenhahn

On Wed, 30 Dec 2009, Peter Rosin wrote:


Aha, you have misunderstood the patch, that's why we don't understand
each other. The patch does indeed allocate a buffer, that's the whole
point of sending the error message to lt_dladderror. The error message
is cached indefinitely inside lt_dladderror. In order to not grow that


In what line of code is a string buffer allocated?  The only buffer 
allocation I see is a REALLOC to increase the buffer which holds 
diagnostic pointers.  Then the passed pointer is assigned directly to 
the pointer at errindex without any allocation or string copy.



"cache", I implemented a comparison so that registering the same user
error multiple times results in returning the index of the old error
message instead of inventing a new error number each time.


Yes.  And so it is impossible to decipher meaningless errors from 
meaningful ones.  If the error comes from a loader which should not 
have worked in the first place, then the user should be able to ignore 
it.



Note that user_error_strings is an array of strings, not one big long
string.


Yes, it is an array of string pointers.


The patch does exactly the first part, but not the second. There is simply
no way to know when the user is done with the string, so it has to be
kept indefinitely. We could document the lifetime rules, but until then
there is no way to know when the message can be freed.


The documentation should say that this data is only valid until the 
next lt_* call, which is currently the case.



Yes, that is true. But that is not a new problem, just magnified by
increased usage of lt_dladderror.


It was essentially unused before since it was only used by a rarely 
used loader on OS-X.


This discussion should completely move to the patches list so I will 
delete the Cc: to the bugs list.


Bob
--
Bob Friesenhahn
bfrie...@simple.dallas.tx.us, http://www.simplesystems.org/users/bfriesen/
GraphicsMagick Maintainer,http://www.GraphicsMagick.org/




Re: Error reporting should be improved

2009-12-30 Thread Peter Rosin

Den 2009-12-30 17:37 skrev Bob Friesenhahn:

On Wed, 30 Dec 2009, Peter Rosin wrote:


There used to be some useless effort to make it thread safe. 
Apparently that is gone now.  Regardless, there is also a potential 
re-entrancy issue, which could be encountered if a legacy OS 
interface is supported (re-implemented) via dlopen() or if some other 
implementation logic (e.g. lazy symbol resolution) uses dlsym().


I can't say that I understand what you are describing, but dlerror need
not be reentrant[1], and we do various manipulations of internal state
during lt_dlopen etc, so I don't see how any argument about reentrancy
can hold. What am I missing this time?


What I am saying is that an OS may choose to re-implement its legacy 
loader interfaces via dlopen() and since we still provide a loader 
driver for the legacy interface, use of this other loader may scramble 
the error reports from the dlopen() loader.



This patch does of course not address the original issue, it only
addresses the issue with dlerror returning a string to the ltdl user
that might have a limited life span.


The patch does not seem to address that since it does not allocate a 
buffer to store the error string.  What it does address is how to 
collect multiple error messages.


Aha, you have misunderstood the patch, that's why we don't understand
each other. The patch does indeed allocate a buffer, that's the whole
point of sending the error message to lt_dladderror. The error message
is cached indefinitely inside lt_dladderror. In order to not grow that
"cache", I implemented a comparison so that registering the same user
error multiple times results in returning the index of the old error
message instead of inventing a new error number each time.

Note that user_error_strings is an array of strings, not one big long
string.

Unfortunately, lt_dladderror() is publically exposed so it is 
difficult to change its arguments even though it is not documented.


This patch does not change lt_dladderror in a way that I classify as
significant. Do you disagree?


I don't disagree, but I had suggested that it is quite useful to know 
the origin of the error report.  The origin could have been handled via 
another parameter.


This origin idea is completely orthogonal to what the patch is doing.


I have not found anything in your message that "kills" the patch, but
yet I get the feeling that you do not like it, and I would like to know
what you think is wrong with it...


It seems to me that these error strings need to be in allocated buffer 
storage so that they don't point to potentially volatile storage. Also, 
there needs to be a way to clear out the existing errors when a 
lt_dlopen*() or lt_dlsym() starts so that the errors only correspond to 
the last requested operation.


The patch does exactly the first part, but not the second. There is simply
no way to know when the user is done with the string, so it has to be
kept indefinitely. We could document the lifetime rules, but until then
there is no way to know when the message can be freed.

The current implementation appears to be a memory leak since nothing is 
freeing the error string pointer (user_error_strings).


Yes, that is true. But that is not a new problem, just magnified by
increased usage of lt_dladderror.

Cheers,
Peter

--
They are in the crowd with the answer before the question.
> Why do you dislike Jeopardy?




Re: Error reporting should be improved

2009-12-30 Thread Bob Friesenhahn

On Wed, 30 Dec 2009, Peter Rosin wrote:


There used to be some useless effort to make it thread safe. Apparently 
that is gone now.  Regardless, there is also a potential re-entrancy issue, 
which could be encountered if a legacy OS interface is supported 
(re-implemented) via dlopen() or if some other implementation logic (e.g. 
lazy symbol resolution) uses dlsym().


I can't say that I understand what you are describing, but dlerror need
not be reentrant[1], and we do various manipulations of internal state
during lt_dlopen etc, so I don't see how any argument about reentrancy
can hold. What am I missing this time?


What I am saying is that an OS may choose to re-implement its legacy 
loader interfaces via dlopen() and since we still provide a loader 
driver for the legacy interface, use of this other loader may scramble 
the error reports from the dlopen() loader.



This patch does of course not address the original issue, it only
addresses the issue with dlerror returning a string to the ltdl user
that might have a limited life span.


The patch does not seem to address that since it does not allocate a 
buffer to store the error string.  What it does address is how to 
collect multiple error messages.


Unfortunately, lt_dladderror() is publically exposed so it is difficult to 
change its arguments even though it is not documented.


This patch does not change lt_dladderror in a way that I classify as
significant. Do you disagree?


I don't disagree, but I had suggested that it is quite useful to know 
the origin of the error report.  The origin could have been handled 
via another parameter.



I have not found anything in your message that "kills" the patch, but
yet I get the feeling that you do not like it, and I would like to know
what you think is wrong with it...


It seems to me that these error strings need to be in allocated buffer 
storage so that they don't point to potentially volatile storage. 
Also, there needs to be a way to clear out the existing errors when a 
lt_dlopen*() or lt_dlsym() starts so that the errors only correspond 
to the last requested operation.


The current implementation appears to be a memory leak since nothing 
is freeing the error string pointer (user_error_strings).


Bob
--
Bob Friesenhahn
bfrie...@simple.dallas.tx.us, http://www.simplesystems.org/users/bfriesen/
GraphicsMagick Maintainer,http://www.GraphicsMagick.org/




Re: Error reporting should be improved

2009-12-30 Thread Peter Rosin

Den 2009-12-30 06:39 skrev Bob Friesenhahn:

On Wed, 30 Dec 2009, Peter Rosin wrote:


The thread safety thing can be ignored as libltdl isn't thread safe
and needs to be mutexed anyway. Care to test this patch?


There used to be some useless effort to make it thread safe. Apparently 
that is gone now.  Regardless, there is also a potential re-entrancy 
issue, which could be encountered if a legacy OS interface is supported 
(re-implemented) via dlopen() or if some other implementation logic 
(e.g. lazy symbol resolution) uses dlsym().


I can't say that I understand what you are describing, but dlerror need
not be reentrant[1], and we do various manipulations of internal state
during lt_dlopen etc, so I don't see how any argument about reentrancy
can hold. What am I missing this time?

[1] http://www.opengroup.org/onlinepubs/009695399/functions/dlerror.html

An alternative approach would be to store loader-specific errors in 
loader-specific storage and iterate through the loaders in order to 
obtain their error messages.


That is outside the scope of the provided patch.

With this patch, how will the accumulated error strings be reported to 
the API user? Some of the error reports will be useless and so it may 
not be useful to combine them since a useless report would have the same 
significance as an important one.  The origin (id of the reporting 
loader) would be useful to record.  If the origin was part of the string 
(or the string is owned by the loader) then that would serve to keep 
them distinct.


This patch does of course not address the original issue, it only
addresses the issue with dlerror returning a string to the ltdl user
that might have a limited life span.

Unfortunately, lt_dladderror() is publically exposed so it is difficult 
to change its arguments even though it is not documented.


This patch does not change lt_dladderror in a way that I classify as
significant. Do you disagree?

I have not found anything in your message that "kills" the patch, but
yet I get the feeling that you do not like it, and I would like to know
what you think is wrong with it...

Cheers,
Peter





Re: Error reporting should be improved

2009-12-29 Thread Bob Friesenhahn

On Wed, 30 Dec 2009, Peter Rosin wrote:


The thread safety thing can be ignored as libltdl isn't thread safe
and needs to be mutexed anyway. Care to test this patch?


There used to be some useless effort to make it thread safe. 
Apparently that is gone now.  Regardless, there is also a potential 
re-entrancy issue, which could be encountered if a legacy OS interface 
is supported (re-implemented) via dlopen() or if some other 
implementation logic (e.g. lazy symbol resolution) uses dlsym().


An alternative approach would be to store loader-specific errors in 
loader-specific storage and iterate through the loaders in order to 
obtain their error messages.


With this patch, how will the accumulated error strings be reported to 
the API user? Some of the error reports will be useless and so it may 
not be useful to combine them since a useless report would have the 
same significance as an important one.  The origin (id of the 
reporting loader) would be useful to record.  If the origin was part 
of the string (or the string is owned by the loader) then that would 
serve to keep them distinct.


Unfortunately, lt_dladderror() is publically exposed so it is 
difficult to change its arguments even though it is not documented.


Bob
--
Bob Friesenhahn
bfrie...@simple.dallas.tx.us, http://www.simplesystems.org/users/bfriesen/
GraphicsMagick Maintainer,http://www.GraphicsMagick.org/




Re: Error reporting should be improved

2009-12-29 Thread Peter Rosin

Den 2009-12-30 01:25 skrev Bob Friesenhahn:

This note in the Solaris manual page for dlerror() is interesting:

NOTES
 The messages returned by  dlerror() can reside in  a  static
 buffer that is overwritten on each call to dlerror(). Appli-
 cation code should not write to this buffer.  Programs want-
 ing  to  preserve  an  error  message  should make their own
 copies of that message.

The implementation of lt__set_last_error  is:

  static const char   *last_error = 0;
  const char *
  lt__set_last_error (const char *errormsg)
  {
return last_error = errormsg;
  }

so the current implementation is not reliable since the buffer pointed 
to may be updated several times before it is used.  It is also not 
thread safe.  Note that dlerror() may only be invoked once per error 
since calling it a second time returns a null pointer.  The Linux manual 
page recommends invoking dlerror() to clear any existing error message 
before invoking dlsym() (which we are not doing) so a symbol which has a 
valid value of zero will result in picking up some old error message.


It is not clear how to make the interface completely thread safe, but it 
is clear that the message needs to be copied immediately to a private 
buffer before returning back to the caller in order to make sure that 
the error returned is due to invoking a libltdl function.


Hi!

The thread safety thing can be ignored as libltdl isn't thread safe
and needs to be mutexed anyway. Care to test this patch?

Cheers,
Peter

Make a copy of the returned dlerror string.

* libltdl/lt_error.c (lt_dladderror): Don't allocate a new
error number if the diagnostic string already exists.
* libltdl/loaders/dlopen.c: Make use of the above and make
sure the returned error string is copied as it might not
survive another dlerror call.
Report by Bob Friesenhahn.

diff --git a/libltdl/loaders/dlopen.c b/libltdl/loaders/dlopen.c
index 1d052b4..f67ca81 100644
--- a/libltdl/loaders/dlopen.c
+++ b/libltdl/loaders/dlopen.c
@@ -140,7 +140,7 @@ get_vtable (lt_user_data loader_data)
 #endif /* !RTLD_LOCAL */

 #if defined(HAVE_DLERROR)
-#  define DLERROR(arg) dlerror ()
+#  define DLERROR(arg) lt_dlseterror (lt_dladderror (dlerror ()))
 #else
 #  define DLERROR(arg) LT__STRERROR (arg)
 #endif
diff --git a/libltdl/lt_error.c b/libltdl/lt_error.c
index d7af36d..bfac8b5 100644
--- a/libltdl/lt_error.c
+++ b/libltdl/lt_error.c
@@ -51,7 +51,14 @@ lt_dladderror (const char *diagnostic)

   assert (diagnostic);

-  errindex = errorcount - LT_ERROR_MAX;
+  for (errindex = 0; errindex < errorcount - LT_ERROR_MAX; ++errindex)
+{
+  if (!streq (user_error_strings[errindex], diagnostic))
+   continue;
+
+  return errindex + LT_ERROR_MAX;
+}
+
   temp = REALLOC (const char *, user_error_strings, 1 + errindex);
   if (temp)
 {