On Tue, 2021-01-05 at 07:12 +1100, Burn Alting wrote:
> On Mon, 2021-01-04 at 09:46 -0500, Steve Grubb wrote:
> > On Monday, January 4, 2021 2:55:25 AM EST Burn Alting wrote:
> > > On Sun, 2021-01-03 at 10:41 -0500, Steve Grubb wrote:
> > > > On Friday, January 1, 2021 4:22:33 PM EST Burn Alting wrote:
> > > > > Sometimes, events recorded in /var/log/audit/audit.log appear 
> > > > > someseconds
> > > > > past co- located events which results inauparse:au_check_events()
> > > > > marking  these events complete before theyare. An example of this can 
> > > > > be
> > > > > seen below with the offending event id44609.
> > > > > This has been plaguing me for a year or two and this morning was 
> > > > > thefirst
> > > > > time I still had access to the raw audit.log files (I monitor a lotof
> > > > > event types and the log files roll over fairly quickly).The example 
> > > > > below
> > > > > is from a fully patched Centos 7 but I have also seenthis on a patched
> > > > > Fedora 32.
> > > > > Has this been seen before? Do we need to re-evaluate how
> > > > > auparse'completes' an event (ie 2 seconds is too quick).
> > > > 
> > > > I have never seen this. But on the way to disk, auditd only does
> > > > lightprocessing of the event.  If the format is enriched, it looks 
> > > > things
> > > > upon a record by record basis. It does not collect events until they
> > > > arecomplete - it dumps it to disk as soon as it can tack on the
> > > > extrainformation.
> > > > So, the question would be, does this delay happen on the way to disk? 
> > > > Oris
> > > > this an artifact of post processing the logs with an auparse 
> > > > basedutility?
> > > > Can this be observed repeatedly on the same raw logs? If so,then maybe
> > > > auparse does have some issue. But if this is a postprocessing issue, 
> > > > then
> > > > the wall clock doesn't matter because this eventshould have collected up
> > > > together.
> > > > I'd say this merits some investigation.
> > > 
> > > OK. I think this needs to be addressed on two fronts. There may be
> > > more.A.  Within post processing ... a 2 second timeout is not sufficient.
> > > Iwould suggest we modify auparse.c:au_check_events() to i) perform 
> > > theevent
> > > type checks first, then  ii) increase the timeout of 2 seconds to be a 
> > > larger
> > > value based onempirical tests.
> > 
> > In the post processing, there are 2 use cases. The first is events that are 
> > on
> > disk. In this usage, the 2 second timeout does not come into effect because 
> > the
> > events are run through probably within nanoseconds or microseconds at the 
> > worst.
> > The only time it would come into effect is if the terminating record is 
> > missing.
> 
> In this first case, the 2 second timeout is on the event's time, not the
> 'processing time'.  See ausearch-lol.c:check_events() and
> auparse.c:au_get_ready_event().
> And I use the checkpointing code to avoid the incomplete event issue.
> 
> In my case, I have not lost records, it's just that an event has arrived on 
> disk
> with an event time more than 2 seconds after the previously written event.
> Basically,
> a. The event was delayed getting to auditd and we look to the kernel for a
> solution.
> b. The event arrived at a reasonable point in time at auditd and for some 
> reason
> auditd delayed it's printing (by the way I tend to use RAW log format, not
> enriched.
> 
> In either case, I believe ausearch-lol.c:check_events() and
> auparse.c:au_get_ready_event() do need to be changed as we have complete 
> events
> written by auditd
> which these two routines fail to process properly.
> 
Changing the two second timeout in ausearch-lol.c:check_events() and, one 
assumes
in  auparse.c:au_get_ready_event() (but I have not tested the auparse code) 
fixes
the processing  of the delayed event.
Changing the value to say 10 seconds fixes my example use case, but given the 
kernel
or auditd could emit an event with a larger delay, should this be a 
configuration
item in /etc/audit/auditd.conf?

I have raised both a bugzilla report (
https://bugzilla.redhat.com/show_bug.cgi?id=1914603) and Issue (
https://github.com/linux-audit/audit-userspace/issues/148)

How do you want me to proceed ... a simple change to 10 seconds or a more 
versatile
configuration item in auditd.conf?
I can perform either and issue a PR if so required.
 
> > The other use case is realtime processing as an audispd plugin. In this use 
> > case
> > the wall clock could matter because records could potentially get lost due 
> > to
> > overflows or another plugin taking too long. This is the use case where the 
> > wall
> > clock matters. And again, it matters when records get lost or delayed in
> > transit. As long as everything is flowing, it should not factor into event
> > processing.
> > > B. I will build a temporary auditd daemon to perform some empirical 
> > > testingto
> > > see how long events can reside within the daemon. I may need someadvice on
> > > this. I assume that the code that sets the timestamp is
> > > insrc/auditd.c:send_audit_event().
> > 
> > This is only for audit daemon's internal events. For all "real" events, 
> > it's set
> > in the kernel.
> 
> If that is the case and the kernel is establishing the timestamp, then either 
> the
> kernel has delayed the eventsarrival at the daemon or the daemon has delayed 
> it's
> writing.
> > > If so, I will see if I can put orchestration debug code in to monitor
> > > anevent's 'time in daemon' until this point. I will then report on this.
> > > I believe given that AUDIT_PROCTITLE and AUDIT_EOE is fairly 
> > > widespread,then
> > > the testing switch in A. will not be a big issue (time cost wise). Itwill 
> > > also
> > > mean that if we over compensate the timeout that would causeadditional 
> > > memory
> > > cost in auparse() then this is mittigated.
> > 
> > I'd suggest breaking up the event completion tests so that an exact 
> > collection
> > termination reason code could be associated to the event.
> > > With respect to 'There may be more' fronts. Are there other points in
> > > the'audit ecosystem' that makes use of the '2 second timeout'.
> > 
> > Ausearch/report has its own special copy of the event collection logic. It
> > should be nearly identical to what auparse does.
> 
> They appear identical  ... ausearch-lol.c:check_events() and
> auparse.c:au_get_ready_event().
> > > I will start work on this, this coming weekend if the above makes sense.
> > 
> > One other thought, the current shipping code is audit-3.0, doing a diff 
> > between
> > it and audit-2.8.5 for the auparse directory does show some differences in 
> > event
> > collection/grouping/next_event. A lot of the differences are cosmetic to fix
> > extra whitespace or indentation. But if you skip all that, there are some 
> > real
> > changes that probably were because of bug reports. For example,
> 
> I will go through these, although this occurs on my Centos 7's (audit-2.8.5-
> 4.el7.x86_64) as well as my 8's 
> (audit-3.0-0.17.20191104git1c2f876.el8.x86_64).
> > @@ -259,15 +260,6 @@ static event_list_t *au_get_ready_event(        if 
> > (lowest
> > && lowest->status == EBS_COMPLETE) {                lowest->status =
> > EBS_EMPTY;                au->au_ready--;-               // Try to 
> > consolidate
> > the array so that we iterate-               // over a smaller portion next
> > time-               if (lowest == &lol->array[lol->maxi])
> > {-                       au_lolnode *ptr = lowest;-                       
> > while
> > (ptr->status == EBS_EMPTY && lol->maxi > 0)
> > {-                               lol->maxi--;-                              
> >  ptr
> > = &lol->array[lol
> > ->maxi];-                       }-               }                return 
> > lowest-
> > >l;        }
> > and
> > @@ -1536,6 +1550,13 @@ static int
> > au_auparse_next_event(auparse                aup_list_create(l);            
> >     
> > aup_list_set_event(l, &e);                aup_list_append(l, au->cur_buf, 
> > au-
> > >list_idx, au-
> > > line_number);
> > +               // Eat standalone EOE - main event was already marked
> > complete+               if (l->head->type == AUDIT_EOE)
> > {+                       au->cur_buf =
> > NULL;+                       aup_list_clear(l);+                       
> > free(l);+
> >                        continue;+               }                if
> > (au_lol_append(au->au_lo, l) == NULL) {                        free((char
> > *)e.host); #ifdef LOL_EVENTS_DEBUG01
> > I don't know if those have an effect on what you are seeing. But that is the
> > only substantial changes that I can see.
> > -Steve

Burn
--
Linux-audit mailing list
Linux-audit@redhat.com
https://www.redhat.com/mailman/listinfo/linux-audit

Reply via email to