* Josh Triplett ([email protected]) wrote: > On Thu, Jun 21, 2012 at 03:48:38PM -0400, Mathieu Desnoyers wrote: > > * Josh Triplett ([email protected]) wrote: > > > On Thu, Jun 21, 2012 at 03:03:06PM -0400, Mathieu Desnoyers wrote: > > > > * Josh Triplett ([email protected]) wrote: > > > > > On Thu, Jun 21, 2012 at 12:41:13PM -0400, Mathieu Desnoyers wrote: > > > > > > Currently, liburcu calls "exit(-1)" upon internal consistency error. > > > > > > This is not pretty, and usually frowned upon in libraries. > > > > > > > > > > Agreed. > > > > > > > > > > > One example of failure path where we use this is if > > > > > > pthread_mutex_lock() > > > > > > would happen to fail within synchronize_rcu(). Clearly, this should > > > > > > _never_ happen: it would typically be triggered only by memory > > > > > > corruption (or other terrible things like that). That being said, we > > > > > > clearly don't want to make "synchronize_rcu()" return errors like > > > > > > that > > > > > > to the application, because it would complexify the application > > > > > > error > > > > > > handling needlessly. > > > > > > > > > > I think you can safely ignore any error conditions you know you can't > > > > > trigger. pthread_mutex_lock can only return an error under two > > > > > conditions: an uninitialized mutex, or an error-checking mutex already > > > > > locked by the current thread. Neither of those can happen in this > > > > > case. > > > > > Given that, I'd suggest either calling pthread_mutex_lock and ignoring > > > > > any possibility of error, or adding an assert. > > > > > > > > > > > So instead of calling exit(-1), one possibility would be to do > > > > > > something > > > > > > like this: > > > > > > > > > > > > #include <signal.h> > > > > > > #include <pthread.h> > > > > > > #include <stdio.h> > > > > > > > > > > > > #define urcu_die(fmt, ...) \ > > > > > > do { \ > > > > > > fprintf(stderr, fmt, ##__VA_ARGS__); \ > > > > > > (void) pthread_kill(pthread_self(), SIGBUS); \ > > > > > > } while (0) > > > > > > > > > > > > and call urcu_die(); in those "unrecoverable error" cases, instead > > > > > > of > > > > > > calling exit(-1). Therefore, if an application chooses to trap those > > > > > > signals, it can, which is otherwise not possible with a direct call > > > > > > to > > > > > > exit(). > > > > > > > > > > It looks like you want to use signals as a kind of exception > > > > > mechanism, > > > > > to allow the application to clean up (though not to recover). assert > > > > > seems much clearer to me for "this can't happen" cases, and assert > > > > > also > > > > > generates a signal that the application can catch and clean up. > > > > > > > > Within discussions with other LTTng developers, we considered the the > > > > assert, but the thought that this case might be silently ignored on > > > > production systems (which compile with assertions disabled) makes me > > > > uncomfortable. This is why I would prefer a SIGBUS to an assertion. > > > > > > > > Using assert() would be similar to turning the Linux kernel BUG_ON() > > > > mechanism into no-ops on production systems because "it should never > > > > happen" (tm) ;-) > > > > > > Just don't define NDEBUG then. :) > > > > Well, AFAIK, it is usual for some distribution packages to define > > NDEBUG (maybe distro maintainers reading lttng-dev could confirm or > > infirm this assumption ?). So in a context where upstream does not have > > total control on the specific tweaks done by distro packages, I prefer > > not to rely on NDEBUG not being defined to catch internal consistency > > errors in the wild. > > #undef NDEBUG > #include <assert.h> > > Or if you don't consider that sufficient for some reason, you could > define your own assert(), but that seems like an odd thing to not count > on. Nonetheless, if you define your own assert, I'd still suggest > making it look as much like assert() as possible, including the call to > abort().
#undef NDEBUG is unwanted, due to its side-effects. We use "assert()" in other locations of the code, for which we want the assertion check to be disabled if NDEBUG is defined in production. I agree with you that calling "abort()" is exactly what we want, and it's much more standard that sending a signal chosen with a fair roll of dices. How about the following ? [...] diff --git a/urcu-die.h b/urcu-die.h new file mode 100644 index 0000000..227c8dc --- /dev/null +++ b/urcu-die.h @@ -0,0 +1,37 @@ +#ifndef _URCU_DIE_H +#define _URCU_DIE_H + +/* + * urcu-die.h + * + * Userspace RCU library unrecoverable error handling + * + * Copyright (c) 2012 Mathieu Desnoyers <[email protected]> + * + * This library is free software; you can redistribute it and/or + * modify it under the terms of the GNU Lesser General Public + * License as published by the Free Software Foundation; either + * version 2.1 of the License, or (at your option) any later version. + * + * This library is distributed in the hope that it will be useful, + * but WITHOUT ANY WARRANTY; without even the implied warranty of + * MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the GNU + * Lesser General Public License for more details. + * + * You should have received a copy of the GNU Lesser General Public + * License along with this library; if not, write to the Free Software + * Foundation, Inc., 51 Franklin Street, Fifth Floor, Boston, MA 02110-1301 USA + */ + +#include <stdlib.h> +#include <stdio.h> +#include <string.h> + +#define urcu_die(cause) \ +do { \ + fprintf(stderr, "(" __FILE__ ":%s@%u) Unrecoverable error: %s\n", \ + __func__, __LINE__, strerror(cause)); \ + abort(); \ +} while (0) + +#endif /* _URCU_DIE_H */ [...] -- Mathieu Desnoyers Operating System Efficiency R&D Consultant EfficiOS Inc. http://www.efficios.com _______________________________________________ lttng-dev mailing list [email protected] http://lists.lttng.org/cgi-bin/mailman/listinfo/lttng-dev
