Note: I might have confused voices of Valdis and Paul together ...
The notes were taken by me (loulwa) and Debora Velarde.
5/01/2006 lspp Meeting Minutes:
===============================
Known Attendees:
Matt Anderson (HP) - MA
Russell Coker (Red Hat) - RC
Amy Griffis (HP) - AG
Steve Grubb (Red Hat) - SG
Chad Hanson (TCS) - CH
Linda Knippers (HP) - LK
Joy Latten (IBM) - JL
Paul Moore (HP) - PM
Loulwa Salem (IBM) - LS
Al Viro (Red Hat) - AV
Dan Walsh (Red Hat) - DW
George Wilson (IBM) - GW
Janak Desai (IBM) - JD
Klaus Weidner (Atsec) - KW
Darrel Goeddel (TCS) - DG
Bill O'Donnell (SGI) - BO
Valdis Kletnieks (VT) - VAL
GW: Let us start the call. The overriding concern this week has been audit
and the issue we've had with it. The deadlock issue, the performance
issues, and the completion. Al sent out an extensive write up on his
proposal to fix the performance, we should discuss that unless anyone
objects. I think it's the number one issue right now.
Kernel update
-------------
LSPP kernel issues
------------------
AV: The problems with deadlock are sorted out and fixed. shouldn't be a
problem anymore
GW: excellent. Do we want to talk about it occurring in different
kernel versions, or discuss privately later
LK: Is steve building new kernel with the fixes?
SG: Yes .. it's been in the build system since 10 o'clock this morning
waiting on ppc
GW: Ok .. we'll discuss other problems with version privately.
AV: it doesn't quite work. if we are taking rules, and sending delete on
each and waiting for ACK before sending next delete. we still have to
wait for connection and we still have to remember what we've got.
SG: I'll fix it to catch them all in linked list, then when we got them all,
go delete them one by one
AV: we have to be ready to remember rules before we send next delete
SG: true, but not something admin needs to remember, i don't think you would
be deleting and adding in another window.
AV: when doing bulk delete, you have to be able to see more rules coming
from list before deleting. In any case, you have to keep a buffer.
SG: we'll read them all and put them in list, then delete them one by one. I
think we are saying same thing
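A minimal sketch of the read-then-delete approach SG and AV converge on above: buffer every rule from the listing first, then send deletes one at a time and wait for each ACK. The FakeKernel class and its list_rules/delete_rule methods are hypothetical stand-ins for illustration, not the real auditctl or netlink interface.

```python
from collections import deque


class FakeKernel:
    """Hypothetical stand-in for the kernel rule list (not a real audit API)."""

    def __init__(self, rules):
        self.rules = list(rules)

    def list_rules(self):
        # Yield rules one at a time, like a stream of list replies.
        for rule in list(self.rules):
            yield rule

    def delete_rule(self, rule):
        # Return True as the ACK when the rule was present and removed.
        if rule in self.rules:
            self.rules.remove(rule)
            return True
        return False


def delete_all_rules(kernel):
    # Phase 1: buffer the whole listing before any delete is sent.
    pending = deque(kernel.list_rules())
    deleted = []
    # Phase 2: delete one rule at a time, waiting for each ACK.
    while pending:
        rule = pending.popleft()
        if kernel.delete_rule(rule):
            deleted.append(rule)
    return deleted
```

The buffer is what AV asks for: the listing is fully consumed before the first delete, so no reply has to be remembered while a delete is in flight.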
AV: one thing is really important question comes up again .. how can
auditctl know what system calls for example modified a directory. How do
we deal with creation of new system calls. We have to resolve these,
user doesn't know, kernel does. We need to actually reserve numbers on
top of our bitmask that would expand directory modifying and file
modifying system calls, and have kernel side expand that to appropriate
set
SG: that's what RHEL4 did, but we called it permissions bits, write, read,
and append. my problem was with the attributes. We need to remap
attributes to a bit. There was bit mapping called permission
AV: here we can use bitmap where we're using syscalls as far as I can tell
that addresses the problem. I think it's the right thing to do
regardless, at least for usability reasons.
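A rough sketch of the idea AV describes: reserve selector numbers above the ordinary syscall range and let the kernel side expand them into the matching syscall bitmask, so the user-space tool never needs to know the member list. All numbers, class names, and member lists here are illustrative assumptions, not real kernel values.

```python
# Hypothetical selectors reserved above the ordinary syscall number range.
CLASS_BASE = 1 << 16
CLASS_DIR_WRITE = CLASS_BASE + 0
CLASS_FILE_ATTR = CLASS_BASE + 1

# Illustrative syscall numbers only; real numbers differ per architecture.
SYSCALLS = {"mkdir": 1, "rmdir": 2, "rename": 3, "unlink": 4,
            "chmod": 5, "chown": 6}

# Maintained on the "kernel side"; the user-space tool never sees this table.
CLASS_MEMBERS = {
    CLASS_DIR_WRITE: ["mkdir", "rmdir", "rename", "unlink"],
    CLASS_FILE_ATTR: ["chmod", "chown"],
}


def expand(selector):
    """Return a syscall bitmask for a plain syscall number or a class."""
    if selector >= CLASS_BASE:
        mask = 0
        for name in CLASS_MEMBERS[selector]:
            mask |= 1 << SYSCALLS[name]
        return mask
    return 1 << selector


def rule_matches(mask, syscall_name):
    # A rule applies when the bit for the invoked syscall is set.
    return bool(mask & (1 << SYSCALLS[syscall_name]))
```

With this split, adding a new directory-modifying syscall means editing CLASS_MEMBERS only; existing rules that name the class pick it up without changes in user space.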
SG: if we have bitmap, who will maintain it after certification
AV: not much trouble to maintain. system calls are not changed that often. 5
minutes work. That is something we can push on people adding new system
calls.
VAL: how are you going to add that to syscall
AV: for example, I am dealing with people adding syscalls. They maintain
table of all syscall, when new one is added, we decide if it should be
there or not.
VAL: this depends on someone to catch it, if someone adds syscall forgets to
update table.
AV: comparing size to system calls. Have a non empty element and you got it.
That's doable and easy to do
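One way to read AV's answer to VAL, sketched under assumptions: keep a classification table with one entry per syscall, and flag any syscall whose entry is missing or empty. The table contents and the name "newcall" are made up for illustration.

```python
# Illustrative syscall table; "newcall" stands for a freshly added syscall.
SYSCALL_TABLE = ["read", "write", "mkdir", "rename", "newcall"]

# One entry per syscall, in the same order. The string "none" means
# "reviewed, belongs to no audit class"; a missing or empty entry means
# nobody has classified the syscall yet.
AUDIT_CLASS = ["none", "none", "dir_write", "dir_write"]


def unclassified(syscalls, classes):
    """Return the syscalls that have no classification entry."""
    missing = []
    for index, name in enumerate(syscalls):
        if index >= len(classes) or not classes[index]:
            missing.append(name)
    return missing
```

Run at build time, a check like this fails as soon as the two tables differ in size, which addresses the case where someone adds a syscall and forgets the table.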
SG: one issue i keep thinking about tying syscalls and auditing together,
you might get something that has nothing to do with a file. if event
came back and has an inode with it, it touched the filesystem, if write.
read. execute. on the way out when it decides if auditing is needed. I
am looking for not caring what syscall is involved
LK: why would you not care.
SG: when you add watch, for example I want to know anything that happened to
/etc/passwd
KW: it should be able to use auditctl to select watch for a write access,
without having syscall.
AV: what are you going to do with mmap
VAL: you can map to permission bits
AV: for mmap you need open file descriptor
SG: look at flags to see read, write, or both
VAL: it gets messier.
KW: don't see inconsistency here. if you want a watch on file, it get a
record as soon it is opened. We don't care if it is open or another
variance in the future
AV: unlike access paths, open is open
KW: you also want to know about renames and other things which are not
related to open. makes it easy to say I want write type access to this
path or watch, and you get all actions.
AV: it doesn't really work that way. system modifications there are file
system operations that don't do anything with file inode. you can copy
/etc, then rename it. anybody trying to open /etc after that will get
your modified copy. nothing happened to directory where it is. only thing
that happened is modification of grandparent. there are possibilities of
getting events
KW: for CAPP, you assume non hostile admin. you use trusted tools. and you
are not worried about admin rearranging filesystem.
AV: not obvious which if you are doing it in terms of what should be
considered. Consider something being written into a file, you get tons
of system calls, tons of writes generating a record, insane.
KW: you have suid program, but no similar tools to mess with filesystem. If
normal users do things with elevated privileges, it's good to get audit
for those scenarios. you should handle case when users use passwd
program so you get record, that's why system watches was created. good
to audit that, without putting too much requirements.
AV: alright.. put it that way .. it's not obvious what should be considered
modification. ex. something being written to a file, you get tons of
syscalls. generating record for each of them is insane.
KW: it is discouraged to audit each write syscall
AV: each system calls that modify contents of the file rather than putting a
file in its place
KW: i see your point. you want a generic mechanism to catch just interesting
ones.
AV: syscalls that modify contents of files are all best not watched on
regular basis simply because it's very rarely done in single system file
KW: if we do have mechanism in kernel to audit syscall, it shouldn't audit
reads/writes unless users need them.
AV: we need it for write. it puts us in interesting position
KW: this is heading on wrong track for lspp, we don't need to know when file
contents are modified.
AV: we are talking about modification of directories. so directory
modifying syscalls .. I can maintain these sets indefinitely
LK: looks like you two are saying same thing, to have a set of syscalls we
want to audit.
AV: watching for modification of filesystem objects is only useful for
directories, if we do it for files, we get too many records for every
action.
KW: we don't want to audit changing file content.
AV: all we are interested in is auditing system calls. unlike regular files,
that can get contents modified. for directories, and file metadata, it's
all straight forward. not many syscall that we use. we want to see which
operation happened because without that it's absolutely useless. i don't
see what reason we have for approach different audit syscalls. only way
to make it dependable. there are ways of accessing files we have to
catch on syscall level. if we really want to catch all modification for
file, we have to tie it to actual system object.
KW: another way looking at it. i don't know how much is it a priority to
foolproof it for future systems. it is important to have it easily
maintainable. but we don't need to guarantee having auditing work for
other versions.
SG: what about other distros like fedora, and the community?
AV: they can use table, and if that changes use shell script and flag it.
that's it. It's up to you to understand what you are doing. supporting
fedora is wonderful. it turns out that when you add syscalls, they often
tend to be lacking permissions checks. new syscall being added to kernel
causes big red flags anyway, and you want to see what it does,
historically that causes trouble.
SG: for the most part, I want to turn my attention to other things instead
of verifying auditctl works with every new kernel.
AV: auditctl don't need to change. define syscalls by what they are doing
and put that in kernel, a way to say which set I want . Something
auditctl doesn't have to understand or know about. that's maintaining
kernel side is easier rather than checks to be done when new syscall is
added, it doesn't have to be done by you or anyone that is
audit related. Hell I can do it with no problem in 10 minutes.
SG: when we might have patch that we can try. I'll have another kernel on
Wednesday, or Thursday. will it be lspp.22, or is that something that
can be done quickly
AV: We can do it, but we'll talk on IRC to decide exactly how.
SG: I really like to keep compatibility with RHEL4 so people updating from
RHEL4 to 5 don't lose their audit setup. I think we should redefine
append to attribute. I can't imagine people auditing and wanting to
audit append.
AV: we are running out of time. we'll discuss on IRC and post results on
list. Now there is also bunch of patches that need to be merged with
branch. the iteration of sending stuff from -mm to linus has been done
this morning. we'll then know to start the next branch. there will be
some reshuffling of git tree. Another thing to discuss on IRC, which
recently posted patches need to go there. which go to -mm and which stay
in lspp for a while, we'll discuss on IRC after call.
Audit performance and stability issues
--------------------------------------
GW: about performance.. proposal to split list and turn it into more
reasonable data structure, your notes Al are more detailed. what's the
plan, Amy or you will make changes
AV: Amy said she'll do it. if she does it's fine, if not I can do it.
AG: I'm putting the rules into the inode into a separate patch, if that's
the part you're referring to, then that's what I'm working on
AV: that should be most of it. split set of rules into roughly equal sets.
spend more time deciding which subset to cover. going by inode number is good
since a lot of rules will be split this way. rules bearing inode
number can be carved into as many chains as we need. so far, I haven't
seen any other practical splits.
AG: did you feel the 31 number is good as it is. or it can be provided as a
flag.
AV: it's a random number ... we just need to see. We have from 30 -100,
probably 64 is a good number for initial tests.
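A small sketch of the split AV and AG discuss: rules that carry an inode number are distributed over a fixed number of chains keyed by inode, so a lookup walks one short chain instead of the whole list. The RuleTable class, the modulo hash, and the separate list for rules without an inode are assumptions for illustration; 64 is the bucket count AV suggests for initial tests.

```python
class RuleTable:
    """Hypothetical rule store split into chains by inode number."""

    def __init__(self, nbuckets=64):
        self.nbuckets = nbuckets
        self.buckets = [[] for _ in range(nbuckets)]
        self.other = []  # rules that carry no inode number

    def add(self, rule, inode=None):
        if inode is None:
            self.other.append(rule)
        else:
            self.buckets[inode % self.nbuckets].append((inode, rule))

    def candidates(self, inode):
        # Only one chain is walked, plus the rules that have no inode.
        chain = self.buckets[inode % self.nbuckets]
        return [rule for (ino, rule) in chain if ino == inode] + self.other
```

Making nbuckets a constructor argument mirrors AG's question about providing the number as a flag, which makes it easy to compare 31, 64, and larger values in a profiling run.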
GW: once that's done. do we need another profiling run to see the benefits,
or is this all we have time for.
AV: it depends on how long profiling runs take.
GW: 1 to 2 hours
AV: then it is worth it.
GW: after split, I'll get Josie to do another profiling run, he said some
inlining will also save 5-6%. Also no one is doing the multiple CPU
AuditFS/inotify completion
---------------------------
AG: I found issue in my patch where I am not releasing the data, I'm working
on that, and will send to list. I've been working on the splitting the
inode into hash, and the inotify work. I'll try to get performance patch
tomorrow. Next thing is the fix problem with file and directory remove
not being audited
GW: Joy has more bandwidth and can help.
AG: great will keep in touch with her on IRC
Audit of POSIX message queues
------------------------------
GW: I created patch to audit posix msgque. since posix msgque goes through
filesystem interface. my question is what additional data should I be
gathering. Are we just interested in security relevant parameters, this
is question for Steve, and klaus. what should we be gathering?
SG: I like it to be on the smaller size for disk space consumption. I don't
see people using posix system auditing. I don't think it will be heavily
used feature, so i would go for the sparse side.
KW: lspp requires you be able to audit different file objects including
msgque. so we need it to claim coverage for lspp basically.
GW: ok .. I'll put patches
KW: for lspp everything that is viewed as object should be audited. for CAPP
it was easier. It might look like object but not need to be audited.
GW: we have the patches, so we'll see if we need them. I'll go ahead and
post them
KW: if we create/modify security attributes we need to audit, but not
read/write.
GW: What about time_snd and time_rcv .. I have hooks in that, do i not need
that
KW: I'll recheck protection profile.
GW: there are certain syscalls entry or exit, i set an entry, and it comes
back with EEXIST. is that a bug.. looks like a bug to me
KW: and it doesn't exist?
GW: yeah ... there are no rules.
KW: if it is unsupported, that is different, but if not it's a bug
SG: I like to see the command you used .. might be a syntax error
GW: it's the same command I've used always
KW: loulwa was having similar problems when user space and kernel don't
match .. it tries to do something it doesn't completely understand. It
is still a bug, should be updated to get more informative message at
least.
GW: I'll post what i have.. still didn't modify userspace. hopefully it is
overkill, and I can change if needed.
SG: don't need to update userspace
GW: yes .. it says my records are auxiliary types
SG: oh .. you need to add to mesg type table
Audit API
---------
SG: I am releasing new version of audit in next 1 or 2 days. not much
work done this week.
Audit failure action inquiry function
--------------------------------------
GW: lisa put out a nice write up. maybe more than what we need.
DW: we don't want applications to bring the system down
SG: something that we don't want - the cups application to bring the system
down. from selinux point of view - anything that links with libaudit now
needs to have all of those.
GW: Casey was right about having the daemon do that.
SG: daemon just writes to disk
GW: maybe it acts as intermediary
CH: If something fails, should it shut itself down or keep going
SG: there are two things we can't do, ignore and denial of service
GW: need to shut down.
DW: if we want to do something with that. cups should know how to do that.
how does library know how to shut down cups.
GW: it still needs to know its environment
SG: just needs to get an ENUM back - return or denied
LK: didn't we get here since we thought it wasn't just cups, but could
be other trusted programs. we wanted a central place to perform the
failure handling.. mostly to have a default action.
DW: there are already existing things... like the pam that may use the
library.
GW: You can also have variation of cups daemon that is lspp aware.. to not
come up when audit is not running. but if you want a common binary, it
needs to know its environment.
MA: wouldn't there be an exec, and there would be another process that shuts
down the system
DW: what if cups lost ability to audit, but can still shut down the system
GW: cups wants to send msg saying i can't audit. but how does it know to
shut down. applications need to query. you would have to change your
application
SG: it does anyway .. and we can change since cups code isn't submitted yet
SG: what if cups loses ability to audit.. not just it doesn't have
permissions. it should decide if it prints or exits.
GW: depends if you care or not. how does it know that
SG: it has a library .. either ignore or deny it...
LK: also suggesting post audit call query rather than a wrapper.
SG: with all logging, you call one function, rather having to wrap each one.
LK: continue the current hide error audit_send_user_msg - tells libaudit to
hide the error. sounds like 2 ways to do the same thing
SG: we could possibly do that .. i think it's coming from pam .. i have to
check that. If we have query function it is more flexible and can work
with more audit calls
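A hedged sketch of the query-function idea: the application asks one central function what to do after an audit call fails and gets an enum back, instead of wrapping every logging call. The names FailAction, query_fail_action, and the failure_action config key are hypothetical and are not libaudit's API; the ignore default is also an assumption.

```python
from enum import Enum


class FailAction(Enum):
    IGNORE = "ignore"
    DENY = "deny"


def query_fail_action(config_text, default=FailAction.IGNORE):
    """Look up the admin's configured action for an audit failure."""
    for line in config_text.splitlines():
        line = line.split("#", 1)[0].strip()  # drop comments
        if not line or "=" not in line:
            continue
        key, value = (part.strip() for part in line.split("=", 1))
        if key == "failure_action":
            try:
                return FailAction(value.lower())
            except ValueError:
                return default  # unknown value: fall back to the default
    return default


def handle_request(audit_ok, config_text):
    # The application queries only after an audit call has failed.
    if audit_ok:
        return "served"
    if query_fail_action(config_text) is FailAction.DENY:
        return "denied"
    return "served-unaudited"
```

For example, handle_request(False, "failure_action = deny") refuses the request, while an empty configuration lets it proceed unaudited. The application, not the library, still decides what "deny" means for it, which matches DW's point that the library cannot know how to shut cups down.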
GW: we have something that flags it is a certified system?
SG: not in CAPP... it was up to sys admin. I believe it was pam login_uid
had to change.
KW: single config flag doesn't match reality very well. many things need to
be there for it to apply.
MA: are we back to config files to specify their action
GW: yes .. otherwise how you know what sysadmin wants to do
LK: what does that mean for shadow utils?
SG: I am not planning to touch it
GW: what do you mean affect to shadow utils
LK: for example we have cups instrumented .. what about shadow?
SG: some are easy to modify .. like hardware clock .. but shadow utils is
hard.. has about 350 audit msgs.
GW: I am hearing this is a no-op, or being pushed to application
KW: it is unfortunate, since requirements are if action is not auditable and
supposed to be, it is supposed to be failure. maybe we have an
application like a watchdog, that if audit stops communicating, we can
shut down .. for compliance reason.
VAL: There are so many scenarios that the watch dog can think audit is not
running. that if the application is mislabeled.
KW: it might not be perfect, but in config system the system critical
applications should be correctly labeled. we can assume it is correct.
if admin stops auditd, and doesn't restart it in reasonable time
watchdog can restart it.
LK: doesn't sound reasonable to me.
KW: it doesn't need to be running by default .. but just to claim lspp
compliance
LK: I think we should have audit do something, good to have independent way
to verify audit is working. If self tests fail, what should it do.
KW: people didn't like that approach. have userspace tool keep a record. use
live monitoring feature. people don't like library changes
LK: doesn't have to be default.
KW: might not get to audit system .. let's say if audit is broken.
LK: but wouldn't it get to the library at least
CH: it is not perfect.. if something is mislabeled. wouldn't audit open
fail?
KW: we need to expect labels are in general correct form. we need some
assumptions in there. it is acceptable to have ways that break system.
as long as we say people have to use rpm to set up system and not use
relabeling and mess up things.
LK: seems we had the same question with self test... what did it do?
GW: it logs anomaly records.
KW: you can combine it and see labels for certain programs are correct.
GW: we want self tests to shut the system down?
PM: there are other things other than different labeling.
KW: we might be going overboard with this. it is ok to do reasonable job to
meet objectives .. but doesn't need to be perfect
GW: self tests .. can do that, since it already logs stuff ... what should
it do. maybe it can shut system down
KW: have option to configure it that way.
GW: should it shut it itself, or have general mechanism to do that?
LK: general mechanism is what we discussed
KW: we should have options to decide what to do.
GW: security panic?
KW: something like that
GW: where should it reside?
MA: in self tests is good.
GW: maybe have something in /etc/security configs to tell it what to do
KW: have a script to tell it what to do
MA: isn't that what lisa suggested .. to have scripts
DW: this is different since it is based on one application
KW: two things, one is health check for system in good state, and to check
if you can audit
DW: wouldn't the self test check if you can audit
CH: for cups, you can decide if you want it to audit or not. should
applications fail if you can't audit. I agree with self test, if some
critical problem they can restart audit or shut machine down.
SG: I agree, programs shouldn't fail if it can't audit. and self tests
should do that. i don't want cups to shut system down .. like ssh, and
login, it denied access, but not shut down.
CH: we have daemons that should record when it shuts down
SG: we don't have any daemons recording shutdown
KW: bringing down system can be restricted to self tests tool
GW: what if when admin didn't configure audit, but wants cups to come up
LK: you have it ignored. I'm bummed by auditing failing after operation
SG: it doesn't open early to see if it is communicating or not.
KW: this is again a case, where it doesn't need to be perfect.
MA: I have no idea regarding implementing auditing in cups to have
no/yes/maybe.
GW: i agree, we just have to decide
AG: having centralized system, we have one way of dealing with things.
GW: that's why i like idea of changing it in library
KW: reasonable approach. .. not shut down, but just have it indicate. you
configure system to do that
PM: user has no idea what happened
GW: point is not having an audited action
MA: don't need to shut down, but maybe send a TERM then a KILL
PM: if system is not auditing, you would want to exit that process quickly.
maybe writing 0 to null is enough
KW: you would have limitation. but that is fairly reasonable.
GW: if system stays sick, next time self tests execute it brings system
down. sounds like a reasonable compromise. Who will change it?
LK: how about we skip write up .. and we'll work on code.
GW: we used all our 1.5 hours. do we want to continue.
LK: we are in process of creating sourceforge project for our audit test
suite. might not be useful for you since you already have that.
GW: great .. thanks ...
LK: you will see a number of IBM copyright notices .. it was forked from the
SLES9 evaluation, but it has been changed. we are trying to return the
favor.
GW: anything else .. we'll adjourn the call...