Hello everyone,

I'm Vishesh Yadav, a 3rd year Computer Science and Engineering student from India and a Google Summer of Code 2012 student. I'm interested in Computer Architecture and Operating System development, which led me to apply to DFBSD.

The goal of my project is to implement Linux's inotify interface and to write an indexing service for locate that uses inotify to keep the locate database more up to date. The new inotify system calls will be exposed to the Linux compatibility layer as well. I've appended my GSoC proposal to this message.

This is the first time I'll be doing kernel programming, so I would really appreciate any advice you may have. I will do my best to deliver what I have promised, and I am looking forward to hacking on DFBSD.

[Proposal] http://www.google-melange.com/gsoc/proposal/review/google/gsoc2012/vishesh/20002

inotify System Calls and Indexing Service for Filesystem
========================================================

Name    : Vishesh Yadav
Email   : vishes...@gmail.com
Address : A2/637, Himsagar Aptts Pocket P4 Greater Noida (U.P) India - 201310

Abstract
--------

The goal of this project is to provide file system monitoring facilities in DragonFly BSD. The project is divided into two parts -

1. Implement inotify.
2. Implement a filesystem indexing service for 'locate' over inotify.

### inotify ###

DragonFly BSD provides the kqueue/kevent interface to monitor file events, which works very well. However, it comes with the overhead of needing a separate open file descriptor for each watched file. Also, to watch changes in a directory, each file inside that directory has to be opened and watched separately using kqueue/kevent. A system may reach the global or per-user file descriptor limit when watching a large number of files.

To solve this problem, I propose to implement Linux's inotify interface. The inotify interface can be used to monitor both files and directories, and each inotify instance uses one file descriptor.

Implementing inotify will benefit various applications that use inotify, e.g. Gamin, GIO, KIO etc. It will benefit developers who want to implement file indexers, semantic desktops and malware scanners, as well as end users who are looking for application compatibility with Linux.

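
To make the descriptor argument concrete, here is a minimal sketch of how an application uses the existing inotify interface on Linux. It is an illustration only, not part of the proposed DragonFly BSD implementation; the helper name `first_created_name`, its parameters and the file names are hypothetical. A single inotify descriptor watches a directory and reports the name of a file created inside it, without opening a descriptor per watched file.

```c
/* Linux-only sketch: one inotify descriptor watches a whole directory. */
#include <assert.h>
#include <fcntl.h>
#include <stdio.h>
#include <string.h>
#include <sys/inotify.h>
#include <sys/stat.h>
#include <sys/types.h>
#include <unistd.h>

/*
 * Hypothetical helper (not from the proposal): watch `dir` for file
 * creation, create `name` inside it, and copy the file name reported by
 * the first IN_CREATE event into `out`. Returns 0 on success, -1 on error.
 */
int
first_created_name(const char *dir, const char *name, char *out, size_t outlen)
{
    char path[4096];
    /* Event buffer aligned for struct inotify_event. */
    char buf[4096]
        __attribute__((aligned(__alignof__(struct inotify_event))));
    int ifd, wd, fd, n, rc = -1;
    ssize_t len;

    assert(out != NULL && outlen > 0);

    ifd = inotify_init();               /* the only descriptor we keep open */
    if (ifd < 0)
        return -1;

    wd = inotify_add_watch(ifd, dir, IN_CREATE);
    if (wd < 0)
        goto out;

    n = snprintf(path, sizeof(path), "%s/%s", dir, name);
    if (n < 0 || (size_t)n >= sizeof(path))
        goto out;

    /* O_EXCL guarantees a real creation, so an IN_CREATE event must follow. */
    fd = open(path, O_CREAT | O_EXCL | O_WRONLY, 0600);
    if (fd < 0)
        goto out;
    close(fd);

    /* Blocking read returns once at least one event is queued. */
    len = read(ifd, buf, sizeof(buf));
    if (len >= (ssize_t)sizeof(struct inotify_event)) {
        const struct inotify_event *ev = (const struct inotify_event *)buf;

        if ((ev->mask & IN_CREATE) && ev->len > 0 &&
            strlen(ev->name) < outlen) {
            strcpy(out, ev->name);
            rc = 0;
        }
    }

    unlink(path);
    inotify_rm_watch(ifd, wd);
out:
    close(ifd);
    return rc;
}
```

Called on an empty scratch directory as `first_created_name(dir, "hello.txt", got, sizeof(got))`, the helper should leave the created file's name in `got`. The same directory could hold thousands of files and the watcher would still hold only the one inotify descriptor.
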
### Filesystem indexing service ###

The second part of this project proposes to implement a filesystem indexing service. The service will prepare the database used by the locate utility. Unlike the traditional updatedb program, it will listen for filesystem changes and update the database immediately, so the database will be more accurate and up to date.

If time allows (or after GSoC), I propose to extend the locate utility to store additional information about files such as size, owner, permissions etc.

Project Goals and Deliverables
------------------------------

During the GSoC period the goal is to -

### inotify ###

* Implement the inotify system calls.
* Write manpages for inotify.
* Write a few tests.
* Check well-known software that uses inotify, such as Gamin, GIO and KIO.
* Expose the new system calls through the Linux ABI and test a few native Linux binaries against them.

### Filesystem Indexing Service ###

* Write the indexing service based on the new inotify interface.
* Extend locate to store additional information about the files.
* Write tests for the new indexing service.
* Update the man pages.

Implementation Details
----------------------

### inotify ###

There are essentially two ways by which we can have inotify on DragonFly BSD -

* Emulate over kqueue - This has been tried earlier in NetBSD (but not in-kernel). However, it still doesn't solve the problem of having an open file descriptor for each monitored file (in kernel). Secondly, inotify provides a few notifications that are not provided by kqueue. On the other hand, this approach avoids much complexity, as well as changes to the VNODE structure and hooking of VNOPS. To avoid bloating the vnode structure, and subject to review by the project mentor, this is the preferred method.
* Develop from scratch - In this approach we will have to add a couple of members to the VNODE structure and call inotify event functions in VNOPS. VNOPS will forward the events to the inotify system. Each vnode will keep a count and a list of its watches.
  The approach is essentially the same as the one implemented in Linux, where all file system events are sent to fsnotify, which is used by inotify/dnotify.

Overall -

* Three new system calls, viz. inotify_init, inotify_add_watch and inotify_rm_watch, will be implemented.
* Each open inotify instance is associated with an open inotify_device. It keeps a list of watches and queued events.
* inotify_watch represents a watch request on a specific node and is associated with an inotify device and a vnode.
* The lifetime of inotify_device and inotify_watch is managed by reference counts.
* inotify_kernel_event is an inotify event. A list of these is associated with each inotify device.
* A basic pseudo filesystem called 'inotifyfs' will be created. It will implement the basic file operations associated with it, such as read, ioctl and release.

Once the inotify system is implemented, it will be made available in the Linux emulation layer.

### Filesystem Indexing Service ###

The filesystem indexing service will run as a daemon. We will maintain a temporary database that contains all changes since the last update. For every search, this temporary database will be checked first for changes. Whenever updatedb is run, the changes in the temporary database will be committed to the main database. This lets us benefit from the already mature updatedb, use the new inotify interface to make recent changes available, and avoid rewriting the huge database file every time.

Optionally, if time allows (or after the GSoC period), the locate utility could be further extended to store additional information about files such as owner, permissions, file type and other attributes. The user could then query and filter according to these attributes.

Milestones
----------

### Week 1 -

* Write headers and interface.
* Make dummy system calls (inotify_*).
* Implement the inotify device and filesystem.

### Week 2-3 -

* Implement the in-kernel API to add/remove watches and queue events.

### Week 4-7 -

* Implement the system calls inotify_init, inotify_add_watch and inotify_rm_watch.
* Complete the implementation of inotify.
* Test the new interface. Write some tests.

### Week 8 -

* Write man pages.
* Document code.
* Test Gamin, GIO (and KIO if possible) against the new interface.
* Prepare code for mid-term evaluation and review.

### Week 8 -

* Start working on the filesystem indexing service.
* Write basic interfaces and data types.
* Read config files and set up the service accordingly.

### Week 9-10 -

* Listen to filesystem changes.
* Write these changes to the temporary database.
* Commit the temporary database to the master database whenever updatedb is run.

### Week 11-12 -

* Test the service.
* Write tests.
* Document code.
* Write/update man pages.
* Prepare code for master.
* Get the code reviewed.

### Week 13-14 -

* Buffer period for any possible delays or weird bugs, and for more testing.

About Me
--------

I'm a 3rd year B.Tech Computer Science and Engineering student from India. I have a deep interest in Operating Systems, System Programming and Open Source development. I'm well versed in C and the UNIX C API and have been working with them for quite a few years. Apart from C, I also program in C++, Python and Scheme.

Although I have never worked directly on a kernel before, I have a good understanding of Operating System internals. I have also taken an Operating Systems course at university. I have spent the last few weeks studying the architecture and implementation of the DragonFly BSD and Linux kernel codebases, as well as the basic kernel development/debugging workflow.

I know string matching algorithms and have a rudimentary understanding of n-grams, which will be helpful while working on locate and the indexing service.

My interest in Operating Systems and Open Source development made me apply to DragonFly BSD. I'm very excited and looking forward to being part of the team, contributing code and doing my best.

I previously participated in GSoC 2011 for KDE, successfully completed my project, and am still maintaining it.

---------end proposal--------

Regards,
Vishesh Yadav