2006 Minutes

Loulwa Salem Wed, 03 May 2006 11:57:31 -0700

Note: I might have confused voices of Valdis and Paul together ...
The notes were taken by me (loulwa) and Debora Velarde.


5/01/2006 lspp Meeting Minutes:
===============================
Known Attendees:
   Matt Anderson (HP) - MA
   Russell Coker (Red Hat) - RC
   Amy Griffis (HP) - AG
   Steve Grubb (Red Hat) - SG
   Chad Hanson (TCS) - CH
   Linda Knippers (HP) - LK
   Joy Latten (IBM) - JL
   Paul Moore (HP) - PM
   Loulwa Salem (IBM) - LS
   Al Viro (Red Hat) - AV
   Dan Walsh (Red Hat) - DW
   George Wilson (IBM) - GW
   Janak Desai (IBM) - JD
   Klaus Weidner (Atsec) - KW
   Darrel Goeddel (TCS) - DG
   Bill O'Donnell (SGI) - BO
   Valdis Kletnieks (VT) - VAL

   GW:  Let us start the call. The overriding concern this week has been audit
        and the issues we've had with it: the deadlock issue, the performance
        issues, and the completion. Al sent out an extensive write-up on his
        proposal to fix the performance; we should discuss that unless anyone
        objects. I think it's the number one issue right now.

Kernel update
-------------
LSPP kernel issues
------------------
   AV:  The problems with deadlock are sorted out and fixed. shouldn't be a
        problem anymore
   GW:  excellent. Do we want to talk about it occurring in different
        kernel versions, or discuss privately later
   LK:  Is steve building new kernel with the fixes?
   SG:  Yes .. it's been in the build system since 10 o'clock this morning
        waiting on ppc
   GW:  Ok .. we'll discuss other problems with version privately.
   AV:  it doesn't quite work. if we are taking rules, and sending delete on
        each and waiting for ACK before sending next delete. we still have to
        wait for connection and we still have to remember what we've got.
   SG:  I'll fix it to catch them all in linked list, then when we got them all,
        go delete them one by one
   AV:  we have to be ready to remember rules before we send next delete
   SG:  true, but not something admin needs to remember, i don't think you would
        be deleting and adding in another window.
   AV:  when doing bulk delete, you have to be able to see more rules coming
        from list before deleting. In any case, you have to keep a buffer.
   SG:  we'll read them all and put them in list, then delete them one by one. I
        think we are saying same thing
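
Note: a minimal sketch of the "read them all, then delete one by one" approach
SG describes above. The FakeAuditKernel class and its methods are hypothetical
stand-ins for illustration only; they are not the audit netlink interface or
the auditctl implementation.

```python
from collections import deque


class FakeAuditKernel:
    """Hypothetical stand-in for the kernel side holding the rule list."""

    def __init__(self, rules):
        self.rules = list(rules)

    def list_rules(self):
        # Yield a snapshot so later deletions do not disturb the listing.
        yield from list(self.rules)

    def delete_rule(self, rule):
        # Return True (the ACK) when the rule existed and was removed.
        if rule in self.rules:
            self.rules.remove(rule)
            return True
        return False


def delete_all_rules(kernel):
    # Phase 1: buffer the complete listing before sending any delete.
    pending = deque(kernel.list_rules())
    deleted = 0
    # Phase 2: delete one rule at a time, waiting for each ACK.
    while pending:
        rule = pending.popleft()
        if kernel.delete_rule(rule):
            deleted += 1
    return deleted
```

Buffering first means no delete is sent while list replies are still arriving,
which is the point AV raises about having to remember the rules.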
        
   AV:  one really important question comes up again .. how can auditctl know
        which system calls, for example, modify a directory. How do we deal
        with creation of new system calls. We have to resolve these,
        user doesn't know, kernel does. We need to actually reserve numbers on
        top of our bitmask that would expand directory modifying and file
        modifying system calls, and have kernel side expand that to appropriate
        set
   SG:  that's what RHEL4 did, but we called it permissions bits, write, read,
        and append. my problem was with the attributes. We need to remap
        attributes to a bit. There was bit mapping called permission
   AV:  here we can use bitmap where we're using syscalls. as far as I can tell
        that addresses the problem. I think it's the right thing to do
        regardless, at least for usability reasons.
   SG:  if we have bitmap, who will maintain it after certification
   AV:  not much trouble to maintain. system calls are not changed that often. 5
        minutes work. That is something we can push on people adding new system
        calls.
   VAL: how are you going to add that to syscall
   AV:  for example, I am dealing with people adding syscalls. They maintain
        table of all syscall, when new one is added, we decide if it should be
        there or not.
   VAL: this depends on someone to catch it, if someone adds syscall forgets to
        update table.
   AV:  comparing size to system calls. Have a non empty element and you got it.
        That's doable and easy to do
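
Note: a rough sketch of the idea discussed above, where each syscall is
assigned to a class (for example "modifies a directory") in one table, and a
simple check flags any syscall that was added without being classified. The
class names, the bit values and the short syscall list are assumptions made
for illustration; this is not the kernel's actual table.

```python
# Hypothetical class bits; a real table would live on the kernel side.
CLASS_NONE = 0x0       # reviewed, belongs to no audited class
CLASS_DIR_WRITE = 0x1  # modifies a directory
CLASS_FILE_ATTR = 0x2  # changes file metadata

# Illustrative subset of syscalls, not a complete list.
SYSCALLS = ["mkdir", "rmdir", "rename", "unlink", "link",
            "chmod", "chown", "getpid"]

SYSCALL_CLASS = {
    "mkdir": CLASS_DIR_WRITE,
    "rmdir": CLASS_DIR_WRITE,
    "rename": CLASS_DIR_WRITE,
    "unlink": CLASS_DIR_WRITE,
    "link": CLASS_DIR_WRITE,
    "chmod": CLASS_FILE_ATTR,
    "chown": CLASS_FILE_ATTR,
    "getpid": CLASS_NONE,
}


def unclassified(syscalls, table):
    """Return syscalls with no table entry, i.e. added but never reviewed."""
    return [name for name in syscalls if name not in table]


def expand(class_bit, table=SYSCALL_CLASS):
    """Expand a class bit into the sorted list of syscalls it covers."""
    return sorted(name for name, bits in table.items() if bits & class_bit)
```

Under these assumptions, user space only names a class and the expansion stays
next to the syscall table, so a new syscall without an entry is caught by the
check rather than by someone remembering to update a tool.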
   SG:  one issue i keep thinking about tying syscalls and auditing together,
        you might get something that has nothing to do with a file. if event
        came back and has an inode with it, it touched the filesystem, if write.
        read. execute. on the way out when it decides if auditing is needed. I
        am looking for not caring what syscall is involved
   LK:  why would you not care.
   SG:  when you add watch, for example I want to know anything that happened to
        /etc/passwd
   KW:  it should be able to use auditctl to select watch for a write access,
        without having syscall.
   AV:  what are you going to do with mmap
   VAL: you can map to permission bits
   AV:  for mmap you need open file descriptor
   SG:  look at flags to see read, write, or both
   VAL: it gets messier.
   KW:  don't see inconsistency here. if you want a watch on file, you get a
        record as soon as it is opened. We don't care if it is open or another
        variant in the future
   AV:  unlike access paths, open is open
   KW:  you also want to know about renames and other things which are not
        related to open. makes it easy to say I want write type access to this
        path or watch, and you get all actions.
   AV:  it doesn't really work that way. system modifications: there are file
        system operations that don't do anything with file inode. you can copy
        /etc, then rename it. anybody trying to open /etc after that will get
        your modified copy. nothing happened to directory where it is. only
        thing that happened is modification of grandparent. there are
        possibilities of getting events
   KW:  for CAPP, you assume non hostile admin. you use trusted tools. and you
        are not worried about admin rearranging filesystem.
   AV:  not obvious which if you are doing it in terms of what should be
        considered. Consider something being written into a file, you get tons
        of system calls, tons of writes generating a record, insane.
   KW:  you have suid program, but no similar tools to mess with filesystem. If
        normal users do things with elevated privileges, it's good to get audit
        for those scenarios. you should handle case when users use passwd
        program so you get record, that's why system watches were created. good
        to audit that, without putting too many requirements.
   AV:  alright.. put it that way .. it's not obvious what should be considered
        modification. ex. something being written to a file, you get tons of
        syscalls. generating record for each of them is insane.
   KW:  it is discouraged to audit each write syscall
   AV:  each system calls that modify contents of the file rather than putting a
        file in its place
   KW:  i see your point. you want a generic mechanism to catch just interesting
        ones.
   AV:  syscalls that modify contents of files are all best not watched on
        regular basis simply because it's very rarely done in single system file
   KW:  if we do have mechanism in kernel to audit syscall, it shouldn't audit
        reads/writes unless users need them.
   AV:  we need it for write. it puts us in interesting position
   KW:  this is heading on wrong track for lspp, we don't need to know when file
        contents are modified.
   AV:  we are talking about modification of directories. so directory
        modifying syscalls .. I can maintain these sets indefinitely
   LK:  looks like you two are saying same thing, to have a set of syscalls we
        want to audit.
   AV:  watching for modification of filesystem objects is only useful for
        directories, if we do it for files, we get too many records for every
        action.
   KW:  we don't want to audit changing file content.
   AV:  all we are interested in is auditing system calls. unlike regular files,
        that can get contents modified. for directories, and file metadata, it's
        all straightforward. not many syscalls that we use. we want to see which
        operation happened because without that it's absolutely useless. i don't
        see what reason we have for approach different audit syscalls. only way
        to make it dependable. there are ways of accessing files we have to
        catch on syscall level. if we really want to catch all modification for
        file, we have to tie it to actual system object.
   KW:  another way of looking at it. i don't know how much of a priority it is
        to foolproof it for future systems. it is important to have it easily
        maintainable. but we don't need to guarantee having auditing work for
        other versions.
   SG:  what about other distros like fedora, and the community?
   AV:  they can use table, and if that changes use shell script and flag it.
        that's it. It's up to you to understand what you are doing. supporting
        fedora is wonderful. it turns out that when you add syscalls, they often
        tend to be lacking permissions checks. new syscall being added to kernel
        causes big red flags anyway, and you want to see what it does,
        historically that causes trouble.
   SG:  for the most part, I want to turn my attention to other things instead
        of verifying auditctl works with every new kernel.
   AV:  auditctl doesn't need to change. define syscalls by what they are doing
        and put that in kernel, a way to say which set I want. Something
        auditctl doesn't have to understand or know about. maintaining that on
        kernel side is easier; the checks to be done when new syscall is
        added don't have to be done by you or anyone that is
        audit related. Hell I can do it with no problem in 10 minutes.
   SG:  when might we have a patch that we can try? I'll have another kernel on
        Wednesday, or Thursday. will it be lspp.22, or is that something that
        can be done quickly
   AV:  We can do it, but we'll talk on IRC to decide exactly how.
   SG:  I'd really like to keep compatibility with RHEL4 so people updating from
        RHEL4 to 5 don't lose their audit setup. I think we should redefine
        append to attribute. I can't imagine people auditing and wanting to
        audit append.
   AV:  we are running out of time. we'll discuss on IRC and post results on
        list. Now there is also bunch of patches that need to be merged with
        branch. the iteration of sending stuff from -mm to linus has been done
        this morning. we'll then know to start the next branch. there will be
        some reshuffling of git tree. Another thing to discuss on IRC, which
        recently posted patches need to go there. which go to -mm and which stay
        in lspp for a while, we'll discuss on IRC after call.


Audit performance and stability issues
--------------------------------------
   GW:  about performance.. proposal to split list and turn it into more
        reasonable data structure, your notes Al are more detailed. what's the
        plan, Amy or you will make changes
   AV:  Amy said she'll do it. if she does it's fine, if not I can do it.
   AG:  I'm putting the rules into the inode into a separate patch, if that's
        the part you're referring to, then that's what I'm working on
   AV:  that should be most of it. split set of rules into roughly equal sets.
        spend more time deciding which subset to cover. going by inode number is
        good since a lot of rules will be split this way. rules bearing inode
        number can be carved into as many chains as we need. so far, I haven't
        seen any other practical splits.
   AG:  did you feel the 31 number is good as it is. or it can be provided as a
        flag.
   AV:  it's a random number ... we just need to see. We have from 30 -100,
        probably 64 is a good number for initial tests.
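
Note: a small sketch of splitting inode-bearing rules into a fixed number of
chains keyed by inode number, as discussed above. The chain count of 64 is the
value AV suggests for initial tests; the RuleTable class, its method names and
the modulo hash are assumptions for illustration and do not reflect the actual
kernel patch.

```python
NUM_CHAINS = 64  # value suggested in the discussion for initial tests


class RuleTable:
    """Hypothetical container that spreads rules over hash chains."""

    def __init__(self, num_chains=NUM_CHAINS):
        self.chains = [[] for _ in range(num_chains)]

    def _index(self, inode):
        # Simple modulo hash on the inode number (an assumption).
        return inode % len(self.chains)

    def add(self, inode, rule):
        self.chains[self._index(inode)].append((inode, rule))

    def lookup(self, inode):
        # Only one chain is walked instead of the whole rule list.
        chain = self.chains[self._index(inode)]
        return [rule for ino, rule in chain if ino == inode]
```

With rules spread roughly evenly, a lookup walks about 1/64 of the rules that
a single list would require, which is the benefit a profiling run would be
expected to show.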
   GW:  once that's done. do we need another profiling run to see the benefits,
        or is this all we have time for.
   AV:  it depends on how long profiling runs take.
   GW:  1 to 2 hours
   AV:  then it is worth it.
   GW:  after split, I'll get Josie to do another profiling run, he said some
        inlining will also save 5-6%. Also no one is doing the multiple CPU

AuditFS/inotify completion
---------------------------

   AG:  I found issue in my patch where I am not releasing the data, I'm working
        on that, and will send to list. I've been working on splitting the
        inode into hash, and the inotify work. I'll try to get performance patch
        tomorrow. Next thing is to fix the problem with file and directory
        remove not being audited
   GW:  Joy has more bandwidth and can help.
   AG:  great will keep in touch with her on IRC

Audit of POSIX message queues
------------------------------
   GW:  I created patch to audit posix msgque. since posix msgque goes through
        filesystem interface. my question is what additional data should I be
        gathering. Are we just interested in security relevant parameters, this
        is question for Steve, and klaus. what should we be gathering?
   SG:  I like it to be on the smaller side for disk space consumption. I don't
        see people using posix system auditing. I don't think it will be heavily
        used feature, so i would go for the sparse side.
   KW:  lspp requires you be able to audit different file objects including
        msgque. so we need it to claim coverage for lspp basically.
   GW:  ok .. I'll put patches
   KW:  for lspp everything that is viewed as object should be audited. for CAPP
        it was easier. It might look like object but not need to be audited.
   GW:  we have the patches, so we'll see if we need them. I'll go ahead and
        post them
   KW:  if we create/modify security attributes we need to audit, but not
        read/write.
   GW:  What about time_snd and time_rcv .. I have hooks in that, do i not need
        that
   KW:  I'll recheck protection profile.
   GW:  there are certain syscalls entry or exit, i set an entry, and it comes
        back with EEXIST. is that a bug.. looks like a bug to me
   KW:  and it doesn't exist?
   GW:  yeah ... there are no rules.
   KW:  if it is unsupported, that is different, but if not it's a bug
   SG:  I like to see the command you used .. might be a syntax error
   GW:  it's the same command I've used always
   KW:  loulwa was having similar problems when user space and kernel don't
        match .. it tries to do something it doesn't completely understand. It
        is still a bug, should be updated to get more informative message at
        least.
   GW:  I'll post what i have.. still didn't modify userspace. hopefully it is
        overkill, and I can change if needed.
   SG:  don't need to update userspace
   GW:  yes .. it says my records are auxiliary types
   SG:  oh .. you need to add to mesg type table

Audit API
---------
   SG:  I am releasing new version of audit in next 1 or 2 days. not much work
        done this week.

Audit failure action inquiry function
--------------------------------------
   GW:  lisa put out a nice write up. maybe more than what we need.
   DW:  we don't want applications to bring the system down
   SG:  something that we don't want - the cups application to bring the system
        down. from selinux point of view - anything that links with libaudit now
        needs to have all of those.
   GW:  Casey was right about having the daemon do that.
   SG:  daemon just writes to disk
   GW:  maybe it acts as intermediary
   CH:  If something fails, should it shut itself down or keep going    
   SG:  there are two things we can't do, ignore and denial of service
   GW:  need to shut down.
   DW:  if we want to do something with that. cups should know how to do that.
        how does library know how to shut down cups.
   GW:  it still needs to know its environment
   SG:  just needs to get an ENUM back - return or denied
   LK:  didn't we get here since we thought it wasn't just cups, but could
        be other trusted programs. we wanted a central place to perform the
        failure handling.. mostly to have a default action.
   DW:  there are already existing things... like the pam that may use the
        library.
   GW:  You can also have variation of cups daemon that is lspp aware.. to not
        come up when audit is not running. but if you want a common binary, it
        needs to know its environment.
   MA:  wouldn't there be an exec, and there would be another process that shuts
        down the system
   DW:  what if cups lost ability to audit, but can still shut down the system
   GW:  cups wants to send msg saying i can't audit. but how does it know to
        shut down. applications need to query. you would have to change your
        application
   SG:  it does anyway .. and we can change since cups code isn't submitted yet
   SG:  what if cups loses ability to audit.. not just it doesn't have
        permissions. it should decide if it prints or exits.
   GW:  depends if you care or not. how does it know that
   SG:  it has a library .. either ignore or deny it...
   LK:  also suggesting post audit call query rather than a wrapper.
   SG:  with all logging, you call one function, rather having to wrap each one.
   LK:  continue the current hide error audit_send_user_msg - tells libaudit to
        hide the error. sounds like 2 ways to do the same thing
   SG:  we could possibly do that .. i think it's coming from pam .. i have to
        check that. If we have query function it is more flexible and can work
        with more audit calls
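
Note: a hedged sketch of the "query function" idea discussed above, where an
application asks one central place what to do when an audit call fails and
gets an enum back. The FailAction enum, the failure_action config key and both
function names are hypothetical; none of them is an existing libaudit
interface, and defaulting to deny is an assumption.

```python
from enum import Enum


class FailAction(Enum):
    IGNORE = "ignore"  # carry on even though the event was not audited
    DENY = "deny"      # refuse the operation that could not be audited


def parse_failure_action(config_text, default=FailAction.DENY):
    """Read a hypothetical 'failure_action = ...' line from config text."""
    for line in config_text.splitlines():
        line = line.split("#", 1)[0].strip()  # drop comments and blanks
        if not line:
            continue
        key, sep, value = line.partition("=")
        if sep and key.strip() == "failure_action":
            try:
                return FailAction(value.strip().lower())
            except ValueError:
                return default  # unknown value: fall back to the default
    return default


def handle_print_job(audit_ok, action):
    """Decide what a cups-like daemon does with one job."""
    if audit_ok or action is FailAction.IGNORE:
        return "printed"
    return "denied"
```

In this sketch the application never shuts the system down; it only prints or
denies, which matches the position that bringing the system down should be
left to something other than cups.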
   GW:  we have something that flags it is a certified system?
   SG:  not in CAPP... it was up to sys admin. I believe it was pam login_uid
        had to change.
   KW:  single config flag doesn't match reality very well. many things need to
        be there for it to apply.
   MA:  are we back to config files to specify their action
   GW:  yes .. otherwise how you know what sysadmin wants to do
   LK:  what does that mean for shadow utils?
   SG:  I am not planning to touch it
   GW:  what do you mean affect to shadow utils
   LK:  for example we have cups instrumented .. what about shadow?
   SG:  some are easy to modify .. like hardware clock .. but shadow utils is
        hard.. has about 350 audit msgs.
   GW:  I am hearing this is a no-op, or being pushed to application
   KW:  it is unfortunate, since requirements are if action is not auditable and
        supposed to be, it is supposed to be failure. maybe we have an
        application like a watchdog, that if audit stops communicating, we can
        shut down .. for compliance reason.
   VAL: There are so many scenarios that the watch dog can think audit is not
        running. that if the application is mislabeled.
   KW:  it might not be perfect, but in config system the system critical
        applications should be correctly labeled. we can assume it is correct.
        if admin stops auditd, and doesn't restart it in reasonable time
        watchdog can restart it.
   LK:  doesn't sound reasonable to me.
   KW:  it doesn't need to be running by default .. but just to claim lspp
        compliance
   LK:  I think we should have audit do something, good to have independent way
        to verify audit is working. If self tests fail, what should it do.
   KW:  people didn't like that approach. have userspace tool keep a record. use
        live monitoring feature. people don't like library changes
   LK:  doesn't have to be default.
   KW:  might not get to audit system .. let's say if audit is broken.
   LK:  but wouldn't it get to the library at least
   CH:  it is not perfect.. if something is mislabeled. wouldn't audit open
        fail?
   KW:  we need to expect labels are in general correct form. we need some
        assumptions in there. it is acceptable to have ways that break system.
        as long as we say people have to use rpm to set up system and not use
        relabeling and mess up things.
   LK:  seems we had the same question with self test... what did it do?
   GW:  it logs anomaly records.
   KW:  you can combine it and see labels for certain  programs are correct.
   GW:  we want self tests to shut the system down?
   PM:  there are other things other than different labeling.
   KW:  we might be going overboard with this. it is ok to do reasonable job to
        meet objectives .. but doesn't need to be perfect
   GW:  self tests .. can do that, since it already logs stuff ... what should
        it do. maybe it can shut system down
   KW:  have option to configure it that way.
   GW:  should it shut it itself, or have general mechanism to do that?
   LK:  general mechanism is what we discussed
   KW:  we should have options to decide what to do.
   GW:  security panic?
   KW:  something like that
   GW:  where should it reside?
   MA:  in self tests is good.
   GW:  maybe have something in /etc/security configs to tell it what to do
   KW:  have a script to tell it what to do
   MA:  isn't that what lisa suggested .. to have scripts
   DW:  this is different since it is based on one application
   KW:  two things, one is health check for system in good state, and to check
        if you can audit
   DW:  wouldn't the self test check if you can audit
   CH:  for cups, you can decide if you want it to audit or not. should
        applications fail if you can't audit. I agree with self test, if some
        critical problem they can restart audit or shut machine down.
   SG:  I agree, programs shouldn't fail if it can't audit. and self tests
        should do that. i don't want cups to shut system down .. like ssh, and
        login, it denied access, but not shut down.
   CH:  we have daemons that should record when it shuts down
   SG:  we don't have any daemons recording shutdown
   KW:  bringing down system can be restricted to self tests tool
   GW:  what if when admin didn't configure audit, but wants cups to come up
   LK:  you have it ignored. I'm bummed by auditing failing after operation
   SG:  it doesn't open early to see if it is communicating or not.
   KW:  this is again a case, where it doesn't need to be perfect.
   MA:  I have no idea regarding implementing auditing in cups to have
        no/yes/maybe.
   GW:  i agree, we just have to decide
   AG:  having centralized system, we have one way of dealing with things.
   GW:  that's why i like idea of changing it in library
   KW:  reasonable approach. .. not shut down, but just have it indicate. you
        configure system to do that
   PM:  user has no idea what happened
   GW:  point is not having an audited action
   MA:  don't need to shut down, but maybe send a TERM then a KILL
   PM:  if system is not auditing, you would want to exit that process quickly.
        maybe writing 0 to null is enough
   KW:  you would have limitation. but that is fairly reasonable.
   GW:  if system stays sick, next time self tests execute it brings system
        down. sounds like a reasonable compromise. Who will change it?
   LK:  how about we skip write up .. and we'll work on code.

   GW:  we used all our 1.5 hours. do we want to continue.
   LK:  we are in process of creating sourceforge project for our audit test
        suite. might not be useful for you since you already have that.
   GW:  great .. thanks ...
   LK:  you will see a number of IBM copyright notices .. it was forked from the
        SLES9 evaluation, but it has been changed. we are trying to return the
        favor.
   GW:  anything else .. we'll adjourn the call...

--
redhat-lspp mailing list
[email protected]
https://www.redhat.com/mailman/listinfo/redhat-lspp

[redhat-lspp] LSPP Development Telecon 05/01/2006 Minutes