On Thu, 2005-08-04 at 12:58 -0400, Daniel Veillard wrote:
> On Wed, Aug 03, 2005 at 12:26:22PM -0400, John McCutchan wrote:
> > On Wed, 2005-08-03 at 12:20 -0400, John McCutchan wrote:
> > > On Wed, 2005-08-03 at 11:15 -0400, Daniel Veillard wrote:
> > > > It then must be implemented at the user level.
> > > > It is not acceptable to argue about a specific problem in dnotify
> > > > support to just cancel this fundamental property. inotify would not
> > > > need to maintain a tree of stat() info, but one per cancelled kernel
> > > > monitor.
> > >
> > > Keeping a stat() tree for each cancelled kernel monitor isn't as easy
> > > as it sounds. That is a very racy operation. It would be easy to miss
> > > events in between your last inotify event and the scan of the
> > > directory.
> >
> > Replying to myself,
> >
> > I'm not saying that I wouldn't want this if we can show that it really
> > is useful. I'd just like to see some real justification (i.e. benchmark
> > numbers) showing that we do need to provide it. Performance is excellent
> > for me without any gamin-supplied flow control.
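As an aside, to make the stat()-tree point above concrete: the fallback has
to keep rescanning every directory whose kernel monitor was dropped,
something like the purely illustrative sketch below. This is not gamin's
actual poll code; lookup_snapshot()/update_snapshot()/emit_changed_event()
are made-up names standing in for the real stat tree, and they are left
commented out so the sketch compiles on its own. The race is visible in the
structure: anything that changes between the last kernel event we received
and this rescan is only noticed at the next rescan, and only if its
recorded mtime actually differs.

/* Illustrative only: a stat()-based rescan of one directory, the kind
 * of work a "cancelled kernel monitor" fallback has to keep doing. */
#include <stdio.h>
#include <dirent.h>
#include <sys/stat.h>

void rescan_directory(const char *dir)
{
    DIR *d = opendir(dir);
    struct dirent *ent;
    char path[4096];
    struct stat st;

    if (d == NULL)
        return;
    while ((ent = readdir(d)) != NULL) {
        snprintf(path, sizeof(path), "%s/%s", dir, ent->d_name);
        if (stat(path, &st) != 0)
            continue;
        /* Hypothetical helpers, not real gamin functions:
         *
         *   if (st.st_mtime != lookup_snapshot(path)) {
         *       emit_changed_event(path);
         *       update_snapshot(path, st.st_mtime);
         *   }
         */
    }
    closedir(d);
}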
> You tested it but for activity on a single file growing fast.
> This is to me a micro-benchmark, and usually making design decisions
> based on micro-benchmarks is a tempting but dangerous pitfall.

Correct.

> Let's say you have a system with a large number of files and directories,
> like, hum, rpmfind.net. You want to do "something" when files change.
> You want to know when the change is done and what files (in rpmfind's
> case, what directories) are affected. The phenomenon of locality of
> accesses should hold there too, i.e. one subset will change rapidly,
> then another subset will change, etc.

If access is local (within bursts), the kernel flow control will be able to
help quite a bit. So will my per-connection flow control that I just added.

> If you remove flow control, in my case I'm gonna get a zillion events
> as rsync mirrors sets of files. What I really want to know from gamin
> is when they are done, i.e. say a 10s timeout after any change to a
> monitored directory; I couldn't care less about the N events per second
> per modified file at a given time. Worse, those will generate many
> context switches, which on a server are what generates load.

Because of the limited FAM event vocabulary, we can't send a "file that was
open for writing was closed" event (which inotify supports). So I don't
think that FAM (gamin) is the right choice for that workload.

Context switching, however, isn't a problem when using inotify. Let me
explain how the inotify backend keeps context switches as low as possible
(a rough sketch of this batched read is included further down):

USER:   gam_server blocks on inotify_device_fd
KERNEL: event is queued on inotify_device_fd
USER:   gam_server wakes up, sees that only 1 event has been queued,
        decides to go back to sleep
KERNEL: another event is queued on inotify_device_fd
USER:   gam_server stays asleep
KERNEL: more events queue up
USER:   gam_server's sleep time is up; it wakes up and snarfs all the
        events that queued up

In this case, there is only a handful of context switches. The inotify
backend has knobs we can play with that control how long we sleep waiting
for more events, and how many events are enough to stop sleeping. The
longer we sleep, the more context switches we can avoid. This was not
possible with dnotify, since each event (signal delivery) causes a context
switch.

> I think if I were to switch rpmfind.net to kernel-based update
> reporting, then
> 1/ I will need inotify, as dnotify would explode the number of open fds

Yep.

> 2/ I want flow control at the user level, since the kernel doesn't have
> ways to limit to just the events I need (well, especially through gamin).

First off, I don't think gamin is the right choice for this kind of
workload. And you wouldn't need flow control if you were talking directly
to inotify, since you can just ask for only IN_CLOSE_WRITE events. Also,
I'd like to see how gam_server (using inotify) handles this kind of load.
I have a feeling that the performance would be better than expected.

> I think this applies to an awful lot of server usage (NNTP, SMTP,
> even HTTP to regenerate from modified templates). I think if you were
> to switch beagle to gamin you would have to either extend the API or
> add flow control at the user level, otherwise the kernel is just
> gonna drop events.

Beagle doesn't use any flow control at all. The kernel will queue up to
16384 events per inotify instance. That is a ton.
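To make those two points concrete -- asking the kernel for only
IN_CLOSE_WRITE, and coalescing a burst of events into a single wakeup --
here is a rough sketch. It uses the sys/inotify.h syscall interface rather
than the inotify device fd the current backend talks to, and the
half-second settle delay is just an illustrative knob, so the details do
not match the real gam_server code.

/* Sketch: watch one directory for "file closed after being opened for
 * writing" and drain bursts of events with as few wakeups as possible. */
#include <stdio.h>
#include <unistd.h>
#include <poll.h>
#include <sys/inotify.h>

int main(int argc, char **argv)
{
    char buf[16 * 1024];
    const char *dir = argc > 1 ? argv[1] : ".";

    int fd = inotify_init();
    if (fd < 0) { perror("inotify_init"); return 1; }

    /* Only ask the kernel for the events we care about. */
    if (inotify_add_watch(fd, dir, IN_CLOSE_WRITE) < 0) {
        perror("inotify_add_watch");
        return 1;
    }

    for (;;) {
        struct pollfd pfd = { .fd = fd, .events = POLLIN };

        /* Block until at least one event is queued ... */
        if (poll(&pfd, 1, -1) <= 0)
            continue;

        /* ... then deliberately wait a bit so that a burst of related
         * events piles up in the kernel queue. One wakeup then drains
         * many events instead of one. */
        usleep(500 * 1000);   /* illustrative "knob": how long to wait */

        ssize_t len = read(fd, buf, sizeof(buf));
        for (ssize_t i = 0; i < len; ) {
            struct inotify_event *ev = (struct inotify_event *)(buf + i);
            if ((ev->mask & IN_CLOSE_WRITE) && ev->len > 0)
                printf("closed after write: %s/%s\n", dir, ev->name);
            i += sizeof(*ev) + ev->len;
        }
    }
}

The only tunables here are how long to wait after the first event and how
large the read buffer is; everything the kernel queued during the wait is
drained by a single read().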
> Of course it's hard to benchmark correctly because correctness is the #1
> factor. I believe first in getting the architecture right, and only then
> benchmarking and optimizing, not the other way around ;-)

I think we should wait until we can find a load that causes a problem
before we add "fallback to poll" flow control. We have all the code; it is
trivial to hook it back into the inotify backend. I'd just like to see a
real case where the new backend causes a performance problem. Besides, we
can save TONS of memory by going this route, and right now memory is much
scarcer than CPU.

-- 
John McCutchan <[EMAIL PROTECTED]>
