Hi All,

The attached small patch improves maildir performance by assuming that nothing modifies the underlying message files in place. It uses the mtimes of cur/ and new/ to decide whether the files in each directory need to be polled at all. The initial poll is still somewhat slow (NFS for me again), but subsequent polls are much faster.

I'm also caching the mtimes in sources.yaml, but for now this doesn't save anything across restarts of sup, since the ids (and the id -> fn map) are not cached across restarts.
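To show the idea outside of sup, here is a minimal sketch of mtime-gated polling. The class MtimeGatedScanner and its methods are made up for this example and are not part of the patch; unlike the patch, it rescans on any mtime change rather than only on a newer one. It relies on the fact that a directory's mtime changes when entries are added, removed or renamed, but not when an existing file's contents are rewritten.

```ruby
# Hypothetical helper (not part of sup): remembers the mtime of each maildir
# subdirectory and relists its files only when that mtime has changed.
class MtimeGatedScanner
  SUBDIRS = %w[cur new].freeze

  attr_reader :scans

  def initialize(root)
    @root = root
    @mtimes = {}                                # subdir -> mtime at last scan
    @listings = { 'cur' => [], 'new' => [] }    # subdir -> cached file names
    @scans = 0                                  # directory listings performed
  end

  # Returns all known file names, relisting only the changed subdirectories.
  def poll
    SUBDIRS.each do |d|
      path = File.join(@root, d)
      raise ArgumentError, "#{path} is not a directory" unless File.directory?(path)

      mtime = File.mtime(path)
      next if @mtimes[d] == mtime               # unchanged: reuse cached listing

      @listings[d] = Dir.children(path).sort
      @mtimes[d] = mtime
      @scans += 1
    end
    @listings.values.flatten
  end
end
```

Calling poll twice in a row on an untouched maildir lists each subdirectory only once; the second call costs one stat per subdirectory.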
I hope you find this useful.

-Ben
-- 
Ben Walton <[EMAIL PROTECTED]>

When one person suffers from a delusion, it is called insanity. When many people suffer from a delusion it is called Religion.
Robert M. Pirsig, Zen and the Art of Motorcycle Maintenance
From 7eb73b3800a86008b3b133e508074f9823f81d30 Mon Sep 17 00:00:00 2001
From: Ben Walton <[EMAIL PROTECTED]>
Date: Wed, 30 Apr 2008 14:41:27 -0400
Subject: [PATCH] maildir speed improvements

* When generating a unique id for a maildir message, call stat and use
  methods from it instead of File.mtime and File.size which sees two
  syscalls happen.  This is a noticeable speedup on nfs mounted maildirs.
* Allow storing the mtimes of the new/ and cur/ maildir subfolders in the
  sources.yaml file.
* Use the mtime stamps to avoid repolling a whole directory if the mtime
  hasn't changed.  [A poll is forced if there are no ids loaded.]  This
  makes all but the initial poll of a maildir reasonably fast.
* TODO: Move mail from new/ to cur/ (updating cur.mtime) to avoid polling
  cur/ in all but sync --changed cases.
---
 lib/sup/maildir.rb |   50 +++++++++++++++++++++++++++++++++++---------------
 1 files changed, 35 insertions(+), 15 deletions(-)

diff --git a/lib/sup/maildir.rb b/lib/sup/maildir.rb
index 620e8e2..ddca44a 100644
--- a/lib/sup/maildir.rb
+++ b/lib/sup/maildir.rb
@@ -12,8 +12,8 @@ class Maildir < Source
   SCAN_INTERVAL = 30 # seconds
 
   ## remind me never to use inheritance again.
-  yaml_properties :uri, :cur_offset, :usual, :archived, :id, :labels
-  def initialize uri, last_date=nil, usual=true, archived=false, id=nil, labels=[]
+  yaml_properties :uri, :cur_offset, :usual, :archived, :id, :labels, :mdirs
+  def initialize uri, last_date=nil, usual=true, archived=false, id=nil, labels=[], mdirs={}
     super uri, last_date, usual, archived, id
     uri = URI(Source.expand_filesystem_uri(uri))
 
@@ -27,6 +27,9 @@ class Maildir < Source
     @ids_to_fns = {}
     @last_scan = nil
     @mutex = Mutex.new
+    #the value of this will be dir mtime
+    @mdirs = mdirs.nil? ? { 'cur' => nil, 'new' => nil } : mdirs
+    @dir_ids = { 'cur' => [], 'new' => [] }
   end
 
   def file_path; @dir end
@@ -79,21 +82,35 @@ class Maildir < Source
     return unless @ids.empty? || opts[:rescan]
     return if @last_scan && (Time.now - @last_scan) < SCAN_INTERVAL
 
-    Redwood::log "scanning maildir..."
-    cdir = File.join(@dir, 'cur')
-    ndir = File.join(@dir, 'new')
-
-    raise FatalSourceError, "#{cdir} not a directory" unless File.directory? cdir
-    raise FatalSourceError, "#{ndir} not a directory" unless File.directory? ndir
+    initial_poll = @ids.empty?
+    Redwood::log "scanning maildir [EMAIL PROTECTED]"
 
     begin
-      @ids, @ids_to_fns = [], {}
-      (Dir[File.join(cdir, "*")] + Dir[File.join(ndir, "*")]).map do |fn|
-        id = make_id fn
-        @ids << id
-        @ids_to_fns[id] = fn
+      @mdirs.each_key do |d|
+        subdir = File.join(@dir, d)
+        raise FatalSourceError, "#{subdir} not a directory" unless File.directory? subdir
+        @mdirs[d] = File.mtime subdir if @mdirs[d].nil? #record an initial stamp
+        mtime = File.mtime subdir
+
+        #only scan the dir if the mtime on the dir is newer
+        if @mdirs[d] < mtime || initial_poll
+          Logger::log "#{'initial poll ' if initial_poll}mtime on #{d} [o: [EMAIL PROTECTED], n: #{mtime}]"
+          @mdirs[d] = mtime if @mdirs[d] < mtime
+          @dir_ids[d] = []
+          Dir[File.join(subdir, '*')].map do |fn|
+            id = make_id fn
+            @dir_ids[d] << id
+            @ids_to_fns[id] = fn
+          end
+        else
+          Logger::log "no poll on #{subdir} [o: [EMAIL PROTECTED], n: #{mtime}]"
+        end
       end
-      @ids.sort!
+      @ids = @dir_ids.values.flatten.uniq.sort!
+      #remove old id to fn mappings...hopefully this doesn't actually change
+      #anything...normally, we'll add to this list but never remove mail.
+      @ids_to_fns.delete_if { |k, v| [EMAIL PROTECTED](k) }
+      [EMAIL PROTECTED]
     rescue SystemCallError, IOError => e
       raise FatalSourceError, "Problem scanning Maildir directories: #{e.message}."
     end
@@ -145,8 +162,11 @@ class Maildir < Source
   private
 
   def make_id fn
+    #doing this means 1 syscall instead of 2 (File.mtime, File.size).
+    #makes a noticeable difference on nfs.
+    stat = File.stat(fn)
     # use 7 digits for the size. why 7? seems nice.
-    sprintf("%d%07d", File.mtime(fn), File.size(fn) % 10000000).to_i
+    sprintf("%d%07d", stat.mtime, stat.size % 10000000).to_i
   end
 
   def with_file_for id
-- 
1.5.5.1
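For anyone who wants to try the make_id change in isolation, here is a rough standalone sketch of the two variants. The method names make_id_two_calls and make_id_one_stat are invented for this example, and the explicit .to_i on the mtime is my addition to keep the conversion obvious; the patch itself passes the Time object straight to sprintf.

```ruby
# Hypothetical standalone versions of the id computation, for comparison only.

# Old approach: File.mtime and File.size each look the file up separately.
def make_id_two_calls(fn)
  sprintf("%d%07d", File.mtime(fn).to_i, File.size(fn) % 10_000_000).to_i
end

# New approach: one File.stat call, then read both fields from its result.
def make_id_one_stat(fn)
  stat = File.stat(fn)
  sprintf("%d%07d", stat.mtime.to_i, stat.size % 10_000_000).to_i
end
```

Both return the same integer for an unchanged file: the mtime in seconds followed by the size modulo 10000000, zero-padded to 7 digits.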
_______________________________________________
sup-talk mailing list
sup-talk@rubyforge.org
http://rubyforge.org/mailman/listinfo/sup-talk