Hi All,

The attached small patch improves maildir performance by assuming
that nothing modifies the underlying message files in place.  It uses
the mtimes of cur/ and new/ to decide whether the files in each
directory need to be polled again.  The initial poll is still somewhat
slow (on NFS for me, again), but subsequent polls are much faster.
I'm caching the mtimes in sources.yaml too, but currently this doesn't
save anything across restarts of sup, since the ids (and the id -> fn
map) are not cached across restarts.
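
As a rough illustration of the mtime check described above, here is a
minimal standalone sketch.  The class DirScanCache and its methods are
hypothetical names made up for this example; they are not part of sup
or of the patch.  The sketch relies on the same assumption as the patch:
a directory's mtime changes when entries are added, removed, or renamed,
but not when an existing file is rewritten in place.

```ruby
require 'tmpdir'

# Hypothetical helper (not sup code): lists a directory only when its
# mtime is newer than the one recorded at the previous scan.
class DirScanCache
  attr_reader :scans

  def initialize
    @mtimes = {}  # subdir path => directory mtime seen at the last scan
    @files  = {}  # subdir path => file list from the last scan
    @scans  = 0   # number of real directory listings performed
  end

  def files_in subdir
    mtime = File.mtime subdir
    # Scan on the first call, or when the directory changed since then.
    if @mtimes[subdir].nil? || @mtimes[subdir] < mtime
      @mtimes[subdir] = mtime
      @files[subdir] = Dir[File.join(subdir, '*')].sort
      @scans += 1
    end
    @files[subdir]
  end
end

# Usage: the second call is answered from the cache.
Dir.mktmpdir do |root|
  cur = File.join(root, 'cur')
  Dir.mkdir cur
  File.write File.join(cur, 'msg1'), 'hello'

  cache = DirScanCache.new
  cache.files_in cur   # first call always lists the directory
  cache.files_in cur   # mtime unchanged, so no new listing
  puts "directory listings performed: #{cache.scans}"
end
```

On NFS the saving comes from skipping the per-file work for an
unchanged directory; one File.mtime call on the directory replaces a
full listing plus one stat per message.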

I hope you find this useful.
-Ben
-- 
---------------------------------------------------------------------------------------------------------------------------
Ben Walton <[EMAIL PROTECTED]>

When one person suffers from a delusion, it is called insanity. When
many people suffer from a delusion it is called Religion.
Robert M. Pirsig, Zen and the Art of Motorcycle Maintenance

---------------------------------------------------------------------------------------------------------------------------
From 7eb73b3800a86008b3b133e508074f9823f81d30 Mon Sep 17 00:00:00 2001
From: Ben Walton <[EMAIL PROTECTED]>
Date: Wed, 30 Apr 2008 14:41:27 -0400
Subject: [PATCH] maildir speed improvements

* When generating a unique id for a maildir message, call stat once and use
  its methods instead of calling File.mtime and File.size, which costs two
  syscalls.  This is a noticeable speedup on NFS-mounted maildirs.
* Allow storing the mtimes of the new/ and cur/ maildir subfolders in the
  sources.yaml file.
* Use the mtime stamps to avoid repolling a whole directory if the mtime
  hasn't changed.  [A poll is forced if there are no ids loaded.]  This makes
  all but the initial poll of a maildir reasonably fast.

* TODO: Move mail from new/ to cur/ (updating cur.mtime) to avoid polling cur/
  in all but sync --changed cases.
---
 lib/sup/maildir.rb |   50 +++++++++++++++++++++++++++++++++++---------------
 1 files changed, 35 insertions(+), 15 deletions(-)

diff --git a/lib/sup/maildir.rb b/lib/sup/maildir.rb
index 620e8e2..ddca44a 100644
--- a/lib/sup/maildir.rb
+++ b/lib/sup/maildir.rb
@@ -12,8 +12,8 @@ class Maildir < Source
   SCAN_INTERVAL = 30 # seconds
 
   ## remind me never to use inheritance again.
-  yaml_properties :uri, :cur_offset, :usual, :archived, :id, :labels
-  def initialize uri, last_date=nil, usual=true, archived=false, id=nil, labels=[]
+  yaml_properties :uri, :cur_offset, :usual, :archived, :id, :labels, :mdirs
+  def initialize uri, last_date=nil, usual=true, archived=false, id=nil, labels=[], mdirs={}
     super uri, last_date, usual, archived, id
     uri = URI(Source.expand_filesystem_uri(uri))
 
@@ -27,6 +27,9 @@ class Maildir < Source
     @ids_to_fns = {}
     @last_scan = nil
     @mutex = Mutex.new
+    #the value of this will be dir mtime
+    @mdirs = mdirs.nil? ? { 'cur' => nil, 'new' => nil } : mdirs
+    @dir_ids = { 'cur' => [], 'new' => [] }
   end
 
   def file_path; @dir end
@@ -79,21 +82,35 @@ class Maildir < Source
     return unless @ids.empty? || opts[:rescan]
     return if @last_scan && (Time.now - @last_scan) < SCAN_INTERVAL
 
-    Redwood::log "scanning maildir..."
-    cdir = File.join(@dir, 'cur')
-    ndir = File.join(@dir, 'new')
-    
-    raise FatalSourceError, "#{cdir} not a directory" unless File.directory? cdir
-    raise FatalSourceError, "#{ndir} not a directory" unless File.directory? ndir
+    initial_poll = @ids.empty?
 
+    Redwood::log "scanning maildir [EMAIL PROTECTED]"
     begin
-      @ids, @ids_to_fns = [], {}
-      (Dir[File.join(cdir, "*")] + Dir[File.join(ndir, "*")]).map do |fn|
-        id = make_id fn
-        @ids << id
-        @ids_to_fns[id] = fn
+      @mdirs.each_key do |d|
+	subdir = File.join(@dir, d)
+	raise FatalSourceError, "#{subdir} not a directory" unless File.directory? subdir
+	@mdirs[d] = File.mtime subdir if @mdirs[d].nil?	#record an initial stamp
+	mtime = File.mtime subdir
+
+	#only scan the dir if the mtime on the dir is newer
+	if @mdirs[d] < mtime || initial_poll
+	  Logger::log "#{'initial poll ' if initial_poll}mtime on #{d} [o: #{@mdirs[d]}, n: #{mtime}]"
+	  @mdirs[d] = mtime if @mdirs[d] < mtime
+	  @dir_ids[d] = []
+	  Dir[File.join(subdir, '*')].map do |fn|
+	    id = make_id fn
+	    @dir_ids[d] << id
+	    @ids_to_fns[id] = fn
+	  end
+	else
+	  Logger::log "no poll on #{subdir} [o: #{@mdirs[d]}, n: #{mtime}]"
+	end
       end
-      @ids.sort!
+      @ids = @dir_ids.values.flatten.uniq.sort!
+      #remove old id to fn mappings...hopefully this doesn't actually change
+      #anything...normally, we'll add to this list but never remove mail.
+      @ids_to_fns.delete_if { |k, v| !@ids.include?(k) }
+      [EMAIL PROTECTED]
     rescue SystemCallError, IOError => e
       raise FatalSourceError, "Problem scanning Maildir directories: #{e.message}."
     end
@@ -145,8 +162,11 @@ class Maildir < Source
 private
 
   def make_id fn
+    #doing this means 1 syscall instead of 2 (File.mtime, File.size).
+    #makes a noticeable difference on nfs.
+    stat = File.stat(fn)
     # use 7 digits for the size. why 7? seems nice.
-    sprintf("%d%07d", File.mtime(fn), File.size(fn) % 10000000).to_i
+    sprintf("%d%07d", stat.mtime, stat.size % 10000000).to_i
   end
 
   def with_file_for id
-- 
1.5.5.1
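
The make_id change in the patch above can also be tried in isolation.
The sketch below keeps the id scheme from the patch (mtime in seconds
followed by the size modulo 10**7, zero-padded to seven digits) and
takes both values from a single File.stat call.  The explicit .to_i on
the mtime and the temporary file used for the demonstration are
additions for this example, not part of the patch.

```ruby
require 'tempfile'

# One File.stat call supplies both the mtime and the size, instead of
# separate File.mtime and File.size calls.
def make_id fn
  stat = File.stat fn
  # Seven digits for the size, as in the original code.
  sprintf("%d%07d", stat.mtime.to_i, stat.size % 10_000_000).to_i
end

# Usage: build an id for a throwaway file.
Tempfile.create('msg') do |f|
  f.write 'hello'
  f.flush
  puts make_id(f.path)
end
```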
