On 05/26/2010 03:37 AM, Colin wrote:
> Hi Again,
>
> looking at the differences between 1.2.0 and 1.2.3 it appears that
> some problem was detected in logsys.c -- the log messages have a
> dedicated queue independent of the flight recorder, the inter-thread
> signalling was changed from condition variables to semaphores, ...
> This seems to make the code even more complicated, because now there's
> one mechanism for signalling, and one for locking, although condition
> variables should be quite sufficient for a thread-safe queue and a
Using the condition variable's lock in the main log api creates blocking
- one of the four requirements we want to solve - see below.
> couple of global flags -- if used correctly, which doesn't seem to be
> the case in 1.2.0 (many condition variables are used without a
> condition!). Also, it appears that some code could be simplified by
> making some of the mutexes recursive.
>
> The following function in logsys.c has only one caller that supplies a
> sufficiently large buffer, which is a good thing, because the
> string-handling does not only contain redundant operations, it could
> also overflow a too small buffer...
>
>
this could be improved - pls send patches
> static char *decode_mode(int subsysid, char *buf, size_t buflen)
> {
> memset(buf, 0, buflen); // Optimisation: buf[ 0 ] = 0 would be
> sufficient.
>
> if (logsys_loggers[subsysid].mode& LOGSYS_MODE_OUTPUT_FILE)
> snprintf(buf+strlen(buf), buflen, "FILE,"); // Correctness: It
> must
> read buflen - strlen(buf) instead of buflen
>
> // Ok, we first cleared the buffer, so in the first snprintf,
> buf+strlen(buf) could be replaced with buf, but the later snprintf()
> invocations REALLY need the buflen - strlen(buf).
>
> if (logsys_loggers[subsysid].mode& LOGSYS_MODE_OUTPUT_STDERR)
> snprintf(buf+strlen(buf), buflen, "STDERR,"); // Same here
>
> if (logsys_loggers[subsysid].mode& LOGSYS_MODE_OUTPUT_SYSLOG)
> snprintf(buf+strlen(buf), buflen, "SYSLOG,"); // Same here
>
> if (subsysid == LOGSYS_MAX_SUBSYS_COUNT) {
> if (logsys_loggers[subsysid].mode& LOGSYS_MODE_FORK)
> snprintf(buf+strlen(buf), buflen, "FORK,"); // Same
> here
>
> if (logsys_loggers[subsysid].mode& LOGSYS_MODE_THREADED)
> snprintf(buf+strlen(buf), buflen, "THREADED,"); //
> Same here
> }
>
> memset(buf+strlen(buf)-1,0,1); // Completely redundant, strlen()
> expects a zero-terminated string; using memset() to re-write that
> single 0-byte again won't change anything.
>
> return buf;
> }
>
We are working on simplifying the code - hopefully 1.2.0->1.2.3 shows
how that simplification is taking place:
1) use of counting posix semaphores instead of pthread condition
variables which have to maintain their own counting externally with
extra locks
2) using posix memory map to avoid all the complicated memory copies
around buffer endings
>
> Anyhow, we got 1.2.0 stable by short-circuiting _logsys_log_vprintf()
> by inserting
>
> syslog(rec_ident, "%s", logsys_print_buffer);
> return;
>
> at line 1335. Without this line, corosync eventually goes into a 100%
> CPU loop which seems to be triggered by a non-log entry in the
> flight-recorder with length 0, so that it doesn't get skipped
> correctly (this is not fully confirmed, it's our current hypothesis).
>
With debug: on we noticed the 100% spin when used with pacemaker. This
was addressed in 1.2.2. I understand flight recorder looks complicated.
I am a believer in KISS, but sometimes complication is necessary to
meet requirements.
I have very specific requirements based upon 8+ years of debugging field
deployments of this code base:
1. Record all log events in a buffer so they may be replayed at a
segfault (with corosync-fplay)
2. Do non-block output to the three log output targets (stderr, syslog,
file) (this is why there is a separate logging thread)
3. Allow subsystems to be configured independently for various parts of
the software (so for example, totem messages can log to file at debug
level).
4. low overhead - high performance - this is achieved by all log events
going to memory and other log events going to disk or syslog.
A simple solution of using syslog api doesn't meet any of the other
requirements 1-4 above.
I'm happy to merge patches which simplify the code.
Regards
-steve
Here are the goals of the flight recorder
> Regards, Colin
> _______________________________________________
> Openais mailing list
> [email protected]
> https://lists.linux-foundation.org/mailman/listinfo/openais
_______________________________________________
Openais mailing list
[email protected]
https://lists.linux-foundation.org/mailman/listinfo/openais