On 2021-01-19 14:11, Paul Moore wrote: > On Tue, Jan 19, 2021 at 3:18 AM Burn Alting <burn.alt...@iinet.net.au> wrote: > > I tend to have a rigorous auditing posture (see the rules loaded in > > https://github.com/linux-audit/audit-userspace/issues/148) which is not > > normal for most. Perhaps, Paul, you have hit the nail on the head by > > stating that this 'severe delay' is not that unreasonable given my rules > > posture and we just need to 'deal with it' in user space. > > We still get the event data, I just need to adjust the user space tools to > > deal with this occurrence. > > As for what the system is doing, in my home case it's a Centos 7 VM running > > a tomcat service which only gets busy every 20 minutes and the other is a > > HPE Z800 running Centos 8 with 4-5 VM's mostly dormant. I can put any code > > in these hosts to assist in 'validating'/testing the delay. Advise and I > > will run. > > I took a (slightly) closer look at the queuing code just now and the > corner case I suspected doesn't look very promising here. On failure > to send a record to userspace, the record is put back at the front of > the queue to preserve ordering so the kernel *should* still emit > records in order even when auditd is under pressure, dead, or in the > process of a restart. Also, in this case the kernel kicks the auditd > connection, disconnecting auditd's netlink socket; I imagine that > would be a notable event on your systems. > > However, looking at the timestamps in the audit events you posted, I > noticed something a little odd. If I align the timestamps a bit > better let's see if it jumps out at you guys ... > > type=XXX msg=audit(1609519900.159:44606): ... > type=XXX msg=audit(1609519900.161:44607): ... > type=XXX msg=audit(1609519900.163:44608): ... > type=XXX msg=audit(1609519896.829:44609): ... > type=XXX msg=audit(1609519900.170:44610): ... > > Let me guess Burn, you're running NTP/PTP :) I'm not sure how auditd > handles things like this, but it looks like there was a small negative > time correction between events which caused the odd scenario where > event N+1 actually occurred before event N according to the wall > clock. In other words, your system is time traveling ;)
Interesting... The timestamp is assigned on syscall entry. The serial number is assigned on the creation of the first audit record of an event. >From these timings above, NTP/PTP could explain this, but the third and fifth are too close together to make that likely, I think. I'd blame time namespaces, but the host system is all in the same time namespace. I think I'd need to see a larger sequence to be convinced of this... > paul moore - RGB -- Richard Guy Briggs <r...@redhat.com> Sr. S/W Engineer, Kernel Security, Base Operating Systems Remote, Ottawa, Red Hat Canada IRC: rgb, SunRaycer Voice: +1.647.777.2635, Internal: (81) 32635 -- Linux-audit mailing list Linux-audit@redhat.com https://www.redhat.com/mailman/listinfo/linux-audit