Re: Error reporting should be improved
Den 2009-12-30 22:22 skrev Bob Friesenhahn: On Wed, 30 Dec 2009, Peter Rosin wrote: Aha, you have misunderstood the patch, that's why we don't understand each other. The patch does indeed allocate a buffer, that's the whole point of sending the error message to lt_dladderror. The error message is cached indefinitely inside lt_dladderror. In order to not grow that In what line of code is a string buffer allocated? The only buffer allocation I see is a REALLOC to increase the buffer which holds diagnostic pointers. Then the passed pointer is assigned directly to the pointer at errindex without any allocation or string copy. Oh, I knew I was too tired when I wrote that patch. You are of course correct. *blush* Cheers, Peter
Re: Error reporting should be improved
On Wed, 30 Dec 2009, Peter Rosin wrote: Aha, you have misunderstood the patch, that's why we don't understand each other. The patch does indeed allocate a buffer, that's the whole point of sending the error message to lt_dladderror. The error message is cached indefinitely inside lt_dladderror. In order to not grow that In what line of code is a string buffer allocated? The only buffer allocation I see is a REALLOC to increase the buffer which holds diagnostic pointers. Then the passed pointer is assigned directly to the pointer at errindex without any allocation or string copy. "cache", I implemented a comparison so that registering the same user error multiple times results in returning the index of the old error message instead of inventing a new error number each time. Yes. And so it is impossible to decipher meaningless errors from meaningful ones. If the error comes from a loader which should not have worked in the first place, then the user should be able to ignore it. Note that user_error_strings is an array of strings, not one big long string. Yes, it is an array of string pointers. The patch does exactly the first part, but not the second. There is simply no way to know when the user is done with the string, so it has to be kept indefinitely. We could document the lifetime rules, but until then there is no way to know when the message can be freed. The documentation should say that this data is only valid until the next lt_* call, which is currently the case. Yes, that is true. But that is not a new problem, just magnified by increased usage of lt_dladderror. It was essentially unused before since it was only used by a rarely used loader on OS-X. This discussion should completely move to the patches list so I will delete the Cc: to the bugs list. Bob -- Bob Friesenhahn bfrie...@simple.dallas.tx.us, http://www.simplesystems.org/users/bfriesen/ GraphicsMagick Maintainer,http://www.GraphicsMagick.org/
Re: Error reporting should be improved
Den 2009-12-30 17:37 skrev Bob Friesenhahn: On Wed, 30 Dec 2009, Peter Rosin wrote: There used to be some useless effort to make it thread safe. Apparently that is gone now. Regardless, there is also a potential re-entrancy issue, which could be encountered if a legacy OS interface is supported (re-implemented) via dlopen() or if some other implementation logic (e.g. lazy symbol resolution) uses dlsym(). I can't say that I understand what you are describing, but dlerror need not be reentrant[1], and we do various manipulations of internal state during lt_dlopen etc, so I don't see how any argument about reentrancy can hold. What am I missing this time? What I am saying is that an OS may choose to re-implement its legacy loader interfaces via dlopen() and since we still provide a loader driver for the legacy interface, use of this other loader may scramble the error reports from the dlopen() loader. This patch does of course not address the original issue, it only addresses the issue with dlerror returning a string to the ltdl user that might have a limited life span. The patch does not seem to address that since it does not allocate a buffer to store the error string. What it does address is how to collect multiple error messages. Aha, you have misunderstood the patch, that's why we don't understand each other. The patch does indeed allocate a buffer, that's the whole point of sending the error message to lt_dladderror. The error message is cached indefinitely inside lt_dladderror. In order to not grow that "cache", I implemented a comparison so that registering the same user error multiple times results in returning the index of the old error message instead of inventing a new error number each time. Note that user_error_strings is an array of strings, not one big long string. Unfortunately, lt_dladderror() is publically exposed so it is difficult to change its arguments even though it is not documented. This patch does not change lt_dladderror in a way that I classify as significant. Do you disagree? I don't disagree, but I had suggested that it is quite useful to know the origin of the error report. The origin could have been handled via another parameter. This origin idea is completely orthogonal to what the patch is doing. I have not found anything in your message that "kills" the patch, but yet I get the feeling that you do not like it, and I would like to know what you think is wrong with it... It seems to me that these error strings need to be in allocated buffer storage so that they don't point to potentially volatile storage. Also, there needs to be a way to clear out the existing errors when a lt_dlopen*() or lt_dlsym() starts so that the errors only correspond to the last requested operation. The patch does exactly the first part, but not the second. There is simply no way to know when the user is done with the string, so it has to be kept indefinitely. We could document the lifetime rules, but until then there is no way to know when the message can be freed. The current implementation appears to be a memory leak since nothing is freeing the error string pointer (user_error_strings). Yes, that is true. But that is not a new problem, just magnified by increased usage of lt_dladderror. Cheers, Peter -- They are in the crowd with the answer before the question. > Why do you dislike Jeopardy?
Re: Error reporting should be improved
On Wed, 30 Dec 2009, Peter Rosin wrote: There used to be some useless effort to make it thread safe. Apparently that is gone now. Regardless, there is also a potential re-entrancy issue, which could be encountered if a legacy OS interface is supported (re-implemented) via dlopen() or if some other implementation logic (e.g. lazy symbol resolution) uses dlsym(). I can't say that I understand what you are describing, but dlerror need not be reentrant[1], and we do various manipulations of internal state during lt_dlopen etc, so I don't see how any argument about reentrancy can hold. What am I missing this time? What I am saying is that an OS may choose to re-implement its legacy loader interfaces via dlopen() and since we still provide a loader driver for the legacy interface, use of this other loader may scramble the error reports from the dlopen() loader. This patch does of course not address the original issue, it only addresses the issue with dlerror returning a string to the ltdl user that might have a limited life span. The patch does not seem to address that since it does not allocate a buffer to store the error string. What it does address is how to collect multiple error messages. Unfortunately, lt_dladderror() is publically exposed so it is difficult to change its arguments even though it is not documented. This patch does not change lt_dladderror in a way that I classify as significant. Do you disagree? I don't disagree, but I had suggested that it is quite useful to know the origin of the error report. The origin could have been handled via another parameter. I have not found anything in your message that "kills" the patch, but yet I get the feeling that you do not like it, and I would like to know what you think is wrong with it... It seems to me that these error strings need to be in allocated buffer storage so that they don't point to potentially volatile storage. Also, there needs to be a way to clear out the existing errors when a lt_dlopen*() or lt_dlsym() starts so that the errors only correspond to the last requested operation. The current implementation appears to be a memory leak since nothing is freeing the error string pointer (user_error_strings). Bob -- Bob Friesenhahn bfrie...@simple.dallas.tx.us, http://www.simplesystems.org/users/bfriesen/ GraphicsMagick Maintainer,http://www.GraphicsMagick.org/
Re: Error reporting should be improved
Den 2009-12-30 06:39 skrev Bob Friesenhahn: On Wed, 30 Dec 2009, Peter Rosin wrote: The thread safety thing can be ignored as libltdl isn't thread safe and needs to be mutexed anyway. Care to test this patch? There used to be some useless effort to make it thread safe. Apparently that is gone now. Regardless, there is also a potential re-entrancy issue, which could be encountered if a legacy OS interface is supported (re-implemented) via dlopen() or if some other implementation logic (e.g. lazy symbol resolution) uses dlsym(). I can't say that I understand what you are describing, but dlerror need not be reentrant[1], and we do various manipulations of internal state during lt_dlopen etc, so I don't see how any argument about reentrancy can hold. What am I missing this time? [1] http://www.opengroup.org/onlinepubs/009695399/functions/dlerror.html An alternative approach would be to store loader-specific errors in loader-specific storage and iterate through the loaders in order to obtain their error messages. That is outside the scope of the provided patch. With this patch, how will the accumulated error strings be reported to the API user? Some of the error reports will be useless and so it may not be useful to combine them since a useless report would have the same significance as an important one. The origin (id of the reporting loader) would be useful to record. If the origin was part of the string (or the string is owned by the loader) then that would serve to keep them distinct. This patch does of course not address the original issue, it only addresses the issue with dlerror returning a string to the ltdl user that might have a limited life span. Unfortunately, lt_dladderror() is publically exposed so it is difficult to change its arguments even though it is not documented. This patch does not change lt_dladderror in a way that I classify as significant. Do you disagree? I have not found anything in your message that "kills" the patch, but yet I get the feeling that you do not like it, and I would like to know what you think is wrong with it... Cheers, Peter
Re: Error reporting should be improved
On Wed, 30 Dec 2009, Peter Rosin wrote: The thread safety thing can be ignored as libltdl isn't thread safe and needs to be mutexed anyway. Care to test this patch? There used to be some useless effort to make it thread safe. Apparently that is gone now. Regardless, there is also a potential re-entrancy issue, which could be encountered if a legacy OS interface is supported (re-implemented) via dlopen() or if some other implementation logic (e.g. lazy symbol resolution) uses dlsym(). An alternative approach would be to store loader-specific errors in loader-specific storage and iterate through the loaders in order to obtain their error messages. With this patch, how will the accumulated error strings be reported to the API user? Some of the error reports will be useless and so it may not be useful to combine them since a useless report would have the same significance as an important one. The origin (id of the reporting loader) would be useful to record. If the origin was part of the string (or the string is owned by the loader) then that would serve to keep them distinct. Unfortunately, lt_dladderror() is publically exposed so it is difficult to change its arguments even though it is not documented. Bob -- Bob Friesenhahn bfrie...@simple.dallas.tx.us, http://www.simplesystems.org/users/bfriesen/ GraphicsMagick Maintainer,http://www.GraphicsMagick.org/
Re: Error reporting should be improved
Den 2009-12-30 01:25 skrev Bob Friesenhahn: This note in the Solaris manual page for dlerror() is interesting: NOTES The messages returned by dlerror() can reside in a static buffer that is overwritten on each call to dlerror(). Appli- cation code should not write to this buffer. Programs want- ing to preserve an error message should make their own copies of that message. The implementation of lt__set_last_error is: static const char *last_error = 0; const char * lt__set_last_error (const char *errormsg) { return last_error = errormsg; } so the current implementation is not reliable since the buffer pointed to may be updated several times before it is used. It is also not thread safe. Note that dlerror() may only be invoked once per error since calling it a second time returns a null pointer. The Linux manual page recommends invoking dlerror() to clear any existing error message before invoking dlsym() (which we are not doing) so a symbol which has a valid value of zero will result in picking up some old error message. It is not clear how to make the interface completely thread safe, but it is clear that the message needs to be copied immediately to a private buffer before returning back to the caller in order to make sure that the error returned is due to invoking a libltdl function. Hi! The thread safety thing can be ignored as libltdl isn't thread safe and needs to be mutexed anyway. Care to test this patch? Cheers, Peter Make a copy of the returned dlerror string. * libltdl/lt_error.c (lt_dladderror): Don't allocate a new error number if the diagnostic string already exists. * libltdl/loaders/dlopen.c: Make use of the above and make sure the returned error string is copied as it might not survive another dlerror call. Report by Bob Friesenhahn. diff --git a/libltdl/loaders/dlopen.c b/libltdl/loaders/dlopen.c index 1d052b4..f67ca81 100644 --- a/libltdl/loaders/dlopen.c +++ b/libltdl/loaders/dlopen.c @@ -140,7 +140,7 @@ get_vtable (lt_user_data loader_data) #endif /* !RTLD_LOCAL */ #if defined(HAVE_DLERROR) -# define DLERROR(arg) dlerror () +# define DLERROR(arg) lt_dlseterror (lt_dladderror (dlerror ())) #else # define DLERROR(arg) LT__STRERROR (arg) #endif diff --git a/libltdl/lt_error.c b/libltdl/lt_error.c index d7af36d..bfac8b5 100644 --- a/libltdl/lt_error.c +++ b/libltdl/lt_error.c @@ -51,7 +51,14 @@ lt_dladderror (const char *diagnostic) assert (diagnostic); - errindex = errorcount - LT_ERROR_MAX; + for (errindex = 0; errindex < errorcount - LT_ERROR_MAX; ++errindex) +{ + if (!streq (user_error_strings[errindex], diagnostic)) + continue; + + return errindex + LT_ERROR_MAX; +} + temp = REALLOC (const char *, user_error_strings, 1 + errindex); if (temp) {