Steve,
Please find attached a patch against 2.3.6 that, when checkpointing,
notices if we identify an audit.log file to use (ie the dev and inode
match) but we find a complete event whose time is > 2 seconds past the
checkpoint time. This should not happen, as the checkpoint event should
be found BEFORE any other complete event for the checkpoint event was
the last displayed complete event in the file. When this occurs, a
message is printed to stderr and ausearch will terminate with an exit
code of 12.
This typically occurs if there is a lot of processing or a long time
occurs between two invocations of ausearch --checkpoint. Basically, an
inode is reused in one of the new audit.log files.
Rgds
On Mon, 2014-04-14 at 20:11 -0400, Steve Grubb wrote:
> On Sunday, April 13, 2014 11:51:45 AM Burn Alting wrote:
> > A patch is attached that addresses this.
> >
> > Essentially the modification
> > - notices if we identify an audit.log file to use but we do not find the
> > recorded audit event in that log file and so report an error (to stderr)
> > and return a new exit code (12)
> > - allows checkpointing to only use the recorded time from the checkpoint
> > file for comparisons.
>
> I'd like to look at these two pieces separately. Let's have 1 bug/feature per
> patch. This way if something looks good, it can be applied immediately.
> Whereas if something needs more discussion, it would block application of the
> part that is good.
>
>
> > You will note that the patch also contains changes to swig/audit.py.
> > Although this file is automatically generated, it is part of the 2.3.6
> > release ... should it be?
>
> I suppose it should be. What is in the release is decided by
> automake/autoconf. If there are any mistakes in the Makefile.am file, I would
> take a patch.
>
>
> > I also note that a lot of Makefile.in's are also part of the release. Again,
> > should these automatically generated files be part of the release?
>
> The audit package release is done by a script that pretty much does the
> following (its way more complicated than this, but this is the essential
> pieces):
>
> mkdir audit
> cd audit
> svn co http://svn.fedorahosted.org/svn/audit/trunk .
> ./autogen.sh
> ./configure
> make -j 8 distcheck
>
> If it finishes saying it created the tar ball, I send it to rawhide to make
> sure it builds on a current OS. If that is also successful, then I push it to
> my people page and then commit a branch in svn. I also run the development
> audit package on all my systems during the whole development cycle to make
> sure bugs are fixed, nothing new shows up, and its builds under normal
> conditions.
>
> So, anything that is there, is because autotools think it should be there
> unless I made a mistake in a Makefile.am. :-) Patches are welcome.
>
> Thanks,
> -Steve
diff -Npru audit-2.3.6/docs/ausearch.8 audit-2.3.6_checkpt_1/docs/ausearch.8
--- audit-2.3.6/docs/ausearch.8 2014-04-12 05:49:28.000000000 +1000
+++ audit-2.3.6_checkpt_1/docs/ausearch.8 2014-04-18 11:20:39.968218322 +1000
@@ -46,8 +46,7 @@ ausearch will load this checkpoint data
complete events until it matches the checkpointed one. At this point, it will start
outputting complete events.
-Should the file or checkpointed completed event not be found, an error will result and ausearch
-will terminate.
+Should the file or checkpointed completed event not be found, one of a number of errors will result and ausearch will terminate. See \fBEXIT STATUS\fP for detail.
.TP
.BR \-e,\ \-\-exit \ \fIexit-code-or-errno\fP
@@ -173,16 +172,22 @@ String based matches must match the whol
.BR \-x ,\ \-\-executable \ \fIexecutable\fP
Search for an event matching the given \fIexecutable\fP name.
-.SS "Exit status:"
-.TP
+.SH "EXIT STATUS"
+.TP 5
0
if OK,
+.TP
1
if nothing found, or argument errors or minor file acces/read errors,
+.TP
10
-bad checkpoint data,
+invalid checkpoint data found in checkpoint file,
+.TP
11
checkpoint processing error
+.TP
+12
+checkpoint event not found in matching log file
.SH "SEE ALSO"
.BR auditd (8),
.BR pam_loginuid (8).
diff -Npru audit-2.3.6/src/ausearch.c audit-2.3.6_checkpt_1/src/ausearch.c
--- audit-2.3.6/src/ausearch.c 2014-04-12 05:49:25.000000000 +1000
+++ audit-2.3.6_checkpt_1/src/ausearch.c 2014-04-18 10:57:30.743412031 +1000
@@ -153,8 +153,15 @@ int main(int argc, char *argv[])
save_ChkPt(checkpt_filename);
free_ChkPtMemory();
free((void *)checkpt_filename);
- if (checkpt_failure)
- rc = 11;
+ /*
+ * A checkpoint failure at this point means either
+ * - we failed in attempting to create the checkpouint file
+ * and so we will return 11
+ * - we had a corrupted checkpoint file and so we will return 12
+ */
+ if (checkpt_failure) {
+ rc = ((checkpt_failure & CP_CORRUPTED) == CP_CORRUPTED) ? 12 : 11;
+ }
}
lol_clear(&lo);
@@ -305,6 +312,11 @@ static int process_logs(void)
* 0 no output
* 1 can output
* 2 can output but not this event
+ * 3 we have found an event whose time is > MAX_EVENT_DELTA_SECS secs
+ * past our checkpoint time, which means this particulare event is complete.
+ * This should not happen, for we should have found our checkpoint event
+ * before ANY other completed event.
+ *
*/
static int chkpt_output_decision(event * e)
{
@@ -345,10 +357,35 @@ static int chkpt_output_decision(event *
return 2; /* output after this event */
}
/*
- * The nodes are different. Drop through to a no output return
- * value
+ * The nodes are different. Drop through to further checks.
*/
}
+ /*
+ * If the event we are looking at is more than MAX_EVENT_DELTA_SECS
+ * seconds past our checkpoint event, then by definition we should
+ * have had a complete event (ie a complete event is one where at
+ * least MAX_EVENT_DELTA_SECS seconds have passed since it's last
+ * output record).
+ * This means there is a problem, for the recorded checkpoint event was
+ * the last complete event in the file when we last processed it.
+ * Normally we see this if the checkpoint is very old and the system
+ * has used the same inode again in an audit log file.
+ */
+ if (
+ (chkpt_input_levent.sec < e->sec)
+ &&
+ ((e->sec - chkpt_input_levent.sec) > MAX_EVENT_DELTA_SECS)
+ ) {
+#if 0
+ fprintf(stderr, "%s %lu.%03d:%lu vs %s %lu.%03d:%lu\n",
+ chkpt_input_levent.host ? chkpt_input_levent.host : "-",
+ chkpt_input_levent.sec, chkpt_input_levent.milli,
+ chkpt_input_levent.serial,
+ e->host, e->sec, e->milli, e->serial);
+#endif
+ return 3;
+ }
+
return 0;
}
@@ -391,6 +428,17 @@ static int process_log_fd(void)
return 4; /* no memory */
}
}
+ } else if (do_output == 3) {
+ fprintf(stderr,
+ "Corrupted checkpoint file. Inode match, but newer complete event (%lu.%03d:%lu) found before loaded checkpoint %lu.%03d:%lu\n",
+ entries->e.sec, entries->e.milli, entries->e.serial,
+ chkpt_input_levent.sec, chkpt_input_levent.milli,
+ chkpt_input_levent.serial);
+ checkpt_failure |= CP_CORRUPTED;
+ list_clear(entries);
+ free(entries);
+ fclose(log_fd);
+ return 10;
}
if (just_one) {
list_clear(entries);
diff -Npru audit-2.3.6/src/ausearch-checkpt.h audit-2.3.6_checkpt_1/src/ausearch-checkpt.h
--- audit-2.3.6/src/ausearch-checkpt.h 2014-04-12 05:49:26.000000000 +1000
+++ audit-2.3.6_checkpt_1/src/ausearch-checkpt.h 2014-04-18 10:57:58.360753565 +1000
@@ -31,6 +31,7 @@ int load_ChkPt(const char *fn);
#define CP_STATFAILED 0x0002 /* stat() call on last log file failed */
#define CP_STATUSIO 0x0004 /* cannot open/read/write checkpoint file */
#define CP_STATUSBAD 0x0008 /* malformed status checkpoint entries */
+#define CP_CORRUPTED 0x0010 /* corrupted times in checkpoint file */
extern unsigned checkpt_failure;
diff -Npru audit-2.3.6/src/ausearch-common.h audit-2.3.6_checkpt_1/src/ausearch-common.h
--- audit-2.3.6/src/ausearch-common.h 2014-04-12 05:49:26.000000000 +1000
+++ audit-2.3.6_checkpt_1/src/ausearch-common.h 2014-04-18 10:59:16.313185340 +1000
@@ -28,6 +28,17 @@
#include "ausearch-string.h"
+/*
+ * MAX_EVENT_DELTA_SECS is the maximum number of seconds it would take for
+ * auditd and the kernel to emit all of an events' records. Thus, when scanning
+ * a list of audit records without any End of Event marker, we can determine if
+ * all an event's records have been collected if we compare that event's time
+ * with the time of the event we are currently scanning. If
+ * MAX_EVENT_DELTA_SECS have passed, then the event is deamed to be complete
+ * and we have all it's records.
+ */
+#define MAX_EVENT_DELTA_SECS 2
+
/* Global variables that describe what search is to be performed */
extern time_t start_time, end_time;
extern unsigned int event_id;
--
Linux-audit mailing list
[email protected]
https://www.redhat.com/mailman/listinfo/linux-audit