Below are some thoughts that I have been kicking around for a while.
I'm not sure that I will have time to implement all of this any time
soon, but if there is sufficient interest perhaps others can chip in
and we can work towards this.

Problem Statement
=================

Historical login information has the following problems in
OpenSolaris:

- /var/adm/wtmpx is an append-only log file.  Unless I am missing
  something, the only way to remove old entries is to truncate the
  file to zero bytes.  Presumably periodic jobs could be used to
  rotate wtmpx before truncating, but long-running login sessions may
  lose their login record before the logout record is written.

- /var/adm/lastlog has the potential to be an extremely sparse file.
  Experience shows that when an enterprise uses large UIDs[1],
  /var/adm/lastlog may have a size[2] of up to 26 GB while the blocks
  actually allocated amount to a couple of megabytes or less.

These pose problems for a variety of operations, including but not
limited to the following:

- Custom solutions are required to maintain proper login history on
  systems where wtmpx is rotated.

- Inexperienced (or hurried) system administrators looking to free a
  lot of space in the root file system may inadvertently remove or
  truncate /var/adm/lastlog with no substantial benefit.

- Backups performed through POSIX file operations will consume minutes
  of CPU time reading gigabytes of null (unallocated) blocks.  The
  problem compounds when tens of non-global zones exist on a given
  server.  It spreads to other systems as gigabytes of zeros are
  transmitted across Ethernet and Fibre Channel networks, only to be
  compressed to nothing by tape drives.  This leads to high network
  utilization with low tape drive utilization.

- Restores of highly sparse files commonly do not preserve the
  sparseness of the files, so even generously sized root file systems
  may be filled by /var/adm/lastlog.

- System[3] and zone[4] cloning mechanisms tend not to be sparse-file
  aware and exhibit the aforementioned space and time problems for
  provisioning operations, which tend to be much more common than
  recovery operations.

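To make the sparseness concrete, here is a rough sketch of how a file
laid out as an array of fixed-size records indexed by UID ends up with
a huge apparent size but few allocated blocks.  The 28-byte record
layout (4-byte time, 8-byte line, 16-byte host) is an assumption made
for illustration, as are the helper names; none of this is taken from
the Solaris sources.  Under that assumption, a single login by UID
999,999,999 places the end of the file at 28,000,000,000 bytes, or
about 26 GiB, which is consistent with the size quoted above.

```python
import os
import struct

# Assumed record layout: 4-byte time, 8-byte line, 16-byte host (28 bytes).
RECORD = struct.Struct("<i8s16s")


def record_offset(uid):
    """Byte offset of the record for `uid` in a UID-indexed array file."""
    return uid * RECORD.size


def write_lastlog(path, uid, when, line, host):
    """Write one record at the UID's slot without touching other slots.

    Everything between existing data and the new slot is left as a hole
    on file systems that support sparse files.
    """
    fd = os.open(path, os.O_RDWR | os.O_CREAT, 0o600)
    try:
        os.pwrite(fd, RECORD.pack(when, line, host), record_offset(uid))
    finally:
        os.close(fd)


def apparent_and_allocated(path):
    """Return (st_size, bytes actually allocated) for `path`."""
    st = os.stat(path)
    # st_blocks is counted in 512-byte units on POSIX systems.
    return st.st_size, st.st_blocks * 512
```

Calling write_lastlog() for one small UID and one very large UID, then
comparing the two values from apparent_and_allocated(), should show the
gap between apparent size and allocated space on a file system that
supports holes.  A tool that simply reads the file from start to end
still has to process every byte of st_size.
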
It should be noted that wtmpx grows by appending fixed-size records
during login and logout events.  lastlog is an array of fixed-size
records indexed by UID.  The size of lastlog is determined by the
largest UID used for a login session over the life of the file.  The
block usage of lastlog is determined by the distribution of the UIDs
of the users that have logged into a system over the life of the
lastlog file.

Proposed Solution
=================

I propose that the current record storage mechanisms be replaced with
a format that allows old records to be removed and greatly reduces the
sparseness of data files.  Rather than developing a proprietary file
format or simply providing a method to store existing utmpx or lastlog
records as blobs, transitioning to an SQLite database is proposed.

In the transition to SQLite, a schema would be developed such that all
of the existing fields tracked in the utmpx and lastlog data structures
are tracked for each login session.  The schema would be designed so
that common operations are optimized for performance while allowing
for future extensions.

A utility command would be provided for use by logadm(1M) or similar
to expire (delete) old entries.  The space consumed by the expired
entries could then be reused by future login sessions.

Implementation Strategy
=======================

SQLite is already a component of Solaris and is used to store the SMF
service configuration.

Existing access to lastlog is found in various PAM modules[5], sshd,
in.ftpd, finger, uucpd, pppd, and libresolv's PRNG mechanism.
Transitioning to this new mechanism would remove duplicate or
functionally equivalent code from the various commands.  Arguably,
libresolv should not use a private PRNG algorithm and should instead
rely upon /dev/random.

According to utmpx(4), all access to utmpx and wtmpx files should be
through the various access and update library functions documented in
getutxent(3C).

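As a starting point for discussion, the per-session storage and the
expire operation might look something like the sketch below.  The
table name, column names, and helper functions are all hypothetical,
and only a subset of the utmpx fields is shown; this is not a proposed
final schema.

```python
import sqlite3

# Hypothetical schema: one row per login session.  Column comments name
# the utmpx field each column is loosely modeled on.
SCHEMA = """
CREATE TABLE IF NOT EXISTS session (
    id          INTEGER PRIMARY KEY,
    user        TEXT    NOT NULL,  -- ut_user
    line        TEXT    NOT NULL,  -- ut_line
    host        TEXT,              -- ut_host
    pid         INTEGER,           -- ut_pid
    login_time  INTEGER NOT NULL,  -- seconds since the epoch
    logout_time INTEGER            -- NULL while the session is open
);
CREATE INDEX IF NOT EXISTS session_user_login
    ON session (user, login_time);
"""


def open_db(path=":memory:"):
    db = sqlite3.connect(path)
    db.executescript(SCHEMA)
    return db


def record_login(db, user, line, host, pid, when):
    """Insert a session row and return its id."""
    cur = db.execute(
        "INSERT INTO session (user, line, host, pid, login_time) "
        "VALUES (?, ?, ?, ?, ?)",
        (user, line, host, pid, when),
    )
    db.commit()
    return cur.lastrowid


def record_logout(db, session_id, when):
    """Close a session by filling in its logout time."""
    db.execute(
        "UPDATE session SET logout_time = ? WHERE id = ?",
        (when, session_id),
    )
    db.commit()


def last_login(db, user):
    """Most recent login time for `user`, or None (the lastlog question)."""
    row = db.execute(
        "SELECT MAX(login_time) FROM session WHERE user = ?", (user,)
    ).fetchone()
    return row[0]


def expire(db, cutoff):
    """Delete closed sessions that ended before `cutoff`; return the count.

    Open sessions are never deleted, so a long-running login keeps its
    login record until the logout has been written.
    """
    cur = db.execute(
        "DELETE FROM session "
        "WHERE logout_time IS NOT NULL AND logout_time < ?",
        (cutoff,),
    )
    db.commit()
    return cur.rowcount
```

One limitation of this sketch: expiring a user's only session also
discards that user's last login time.  A real design would probably
need to retain the most recent login per user separately, or exempt it
from expiry.
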
The getutxent(3C) functions would therefore become wrappers for access
to the SQLite database.  Access to files other than /var/adm/utmpx and
/var/adm/wtmpx would still be permitted through these library calls.
Calls to updwtmpx(3C) and utmpxname(3C) that refer to /var/adm/utmpx
or /var/adm/wtmpx would be intercepted and directed to the SQLite
database.

Standards Impact
================

While the existing behavior of lastlog is common across Solaris and
likely other UNIX implementations and Linux distributions, it is not
believed that /var/adm/lastlog is required for standards conformance.
No common access or update routines for lastlog are known to exist in
any Solaris library.

SUSv2, and likely subsequent versions, require utmpx.h and a subset of
the *utmpx* library calls that are to be wrapped.  It is believed that
no industry standards apply to the format of the "user accounting
database" accessed by getutxline() et al. or who(1).

Future Enhancements
===================

Since the privileges, roles, etc. assigned to a user will likely vary
over time, moving away from a fixed-size record would open up the
ability to record more detailed information about each session.  This
information can be useful to assess which of the logged-in users may
have had access to perform actions that are under investigation.

While who(1) and last(1) will continue to exist and provide basic
reporting capabilities, much richer or more focused reports could be
generated with minimal effort.  For example, the question of "who was
logged in at 04:17:23 last Monday" becomes much easier to answer than
by tedious matching of login and logout records.

By consolidating utmpx, wtmpx, and lastlog updates through an API and
removing the expectation/requirement for files to exist in a
particular format in /var/adm, a distributed database could be
developed to allow account aging policies to extend beyond an
individual machine.

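To illustrate the "who was logged in at a given moment" report
mentioned above: if each session were stored as one row carrying both
its login and logout time, the question reduces to a single range
query.  The table layout and function names below are hypothetical and
exist only for this example.

```python
import sqlite3


def logged_in_at(db, when):
    """Return (user, line) for every session that was open at `when`.

    A session counts as open if it started at or before `when` and has
    either not ended yet or ended after `when`.
    """
    rows = db.execute(
        "SELECT user, line FROM session "
        "WHERE login_time <= ? "
        "AND (logout_time IS NULL OR logout_time > ?) "
        "ORDER BY user, line",
        (when, when),
    )
    return rows.fetchall()


def demo_db():
    """Build a small in-memory table with made-up sessions."""
    db = sqlite3.connect(":memory:")
    db.execute(
        "CREATE TABLE session ("
        "user TEXT, line TEXT, login_time INTEGER, logout_time INTEGER)"
    )
    db.executemany(
        "INSERT INTO session VALUES (?, ?, ?, ?)",
        [
            ("alice", "pts/1", 100, 200),
            ("bob", "pts/2", 150, None),  # still logged in
            ("carol", "pts/3", 250, 300),
        ],
    )
    return db
```

With the made-up data in demo_db(), logged_in_at(db, 180) returns the
alice and bob sessions, while logged_in_at(db, 220) returns only bob.
With wtmpx, the same answer requires scanning the file and pairing each
login record with its logout record.
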
This same consolidated data source could also be used for much richer
reports that stretch beyond the boundary of an individual machine.

Alternatives
============

In previous discussions in this area, the use of SEEK_HOLE has been
suggested[6] as a way of mitigating the impact of lastlog sparseness.
While making cpio(1) sparse-file aware in this way would solve the
problem for flash archives, live upgrade, and zoneadm clone, it would
not help any of the other problem areas cited.

According to the direction set forth by Caiman and potentially other
projects, the Solaris root file system as well as zone paths will be
forced onto ZFS.  As part of this, boot environment and zone cloning
operations will take place through "zfs clone".  Presumably a live
upgrade replacement will come into use that relies upon "zfs send" and
"zfs receive" rather than a cpio archive.  Again, this solves the
problem with sparse files only for flash archives, live upgrade, and
zoneadm clone.  It would not address any of the other problem areas.

Various wtmp rotation or cleanup utilities have existed, but none are
included in OpenSolaris.  While such a utility would be much easier to
write and integrate, it would do nothing to make reporting or
extending to new functionality easier.

Footnotes
=========

1. In the example that comes to mind, UIDs are concentrated in the
   0 - 1,000 and 60,000 - 65,000 ranges and scattered throughout
   100,000,000 - 999,999,999.
2. From struct stat st_size.
3. Live Upgrade uses cpio, which is not sparse-file aware.
4. When the zonepath is not on ZFS, "zoneadm clone" also uses cpio.
5. The unix_session and unix_account PAM modules each directly access
   /var/adm/lastlog.
6. http://www.opensolaris.org/jive/thread.jspa?threadID=7070#29815

-- 
Mike Gerdts
http://mgerdts.blogspot.com/
