Hi Again,
looking at the differences between 1.2.0 and 1.2.3 it appears that
some problem was detected in logsys.c -- the log messages have a
dedicated queue independent of the flight recorder, the inter-thread
signalling was changed from condition variables to semaphores, ...
This seems to make the code even more complicated, because now there's
one mechanism for signalling, and one for locking, although condition
variables should be quite sufficient for a thread-safe queue and a
couple of global flags -- if used correctly, which doesn't seem to be
the case in 1.2.0 (many condition variables are used without a
condition!). Also, it appears that some code could be simplified by
making some of the mutexes recursive.
The following function in logsys.c has only one caller that supplies a
sufficiently large buffer, which is a good thing, because the
string-handling does not only contain redundant operations, it could
also overflow a too small buffer...
static char *decode_mode(int subsysid, char *buf, size_t buflen)
{
memset(buf, 0, buflen); // Optimisation: buf[ 0 ] = 0 would be
sufficient.
if (logsys_loggers[subsysid].mode & LOGSYS_MODE_OUTPUT_FILE)
snprintf(buf+strlen(buf), buflen, "FILE,"); // Correctness: It
must
read buflen - strlen(buf) instead of buflen
// Ok, we first cleared the buffer, so in the first snprintf,
buf+strlen(buf) could be replaced with buf, but the later snprintf()
invocations REALLY need the buflen - strlen(buf).
if (logsys_loggers[subsysid].mode & LOGSYS_MODE_OUTPUT_STDERR)
snprintf(buf+strlen(buf), buflen, "STDERR,"); // Same here
if (logsys_loggers[subsysid].mode & LOGSYS_MODE_OUTPUT_SYSLOG)
snprintf(buf+strlen(buf), buflen, "SYSLOG,"); // Same here
if (subsysid == LOGSYS_MAX_SUBSYS_COUNT) {
if (logsys_loggers[subsysid].mode & LOGSYS_MODE_FORK)
snprintf(buf+strlen(buf), buflen, "FORK,"); // Same
here
if (logsys_loggers[subsysid].mode & LOGSYS_MODE_THREADED)
snprintf(buf+strlen(buf), buflen, "THREADED,"); //
Same here
}
memset(buf+strlen(buf)-1,0,1); // Completely redundant, strlen()
expects a zero-terminated string; using memset() to re-write that
single 0-byte again won't change anything.
return buf;
}
Anyhow, we got 1.2.0 stable by short-circuiting _logsys_log_vprintf()
by inserting
syslog(rec_ident, "%s", logsys_print_buffer);
return;
at line 1335. Without this line, corosync eventually goes into a 100%
CPU loop which seems to be triggered by a non-log entry in the
flight-recorder with length 0, so that it doesn't get skipped
correctly (this is not fully confirmed, it's our current hypothesis).
Regards, Colin
_______________________________________________
Openais mailing list
[email protected]
https://lists.linux-foundation.org/mailman/listinfo/openais