Hi Suvayu :) welcome to notmuch and python.
Quoting Patrick Totzke (2012-10-13 18:58:51) > > > And If I look for my own name, this takes over a minute, > > > eventually dying. This could be an issue with libnotmuch though. > > > Possibly, your algorithm takes very long and then reads from an initially > > > opened Database object again, which was invalidated by concurrent writes > > > of other processes.. Hm no, see below. > > > ------------------------------- > > > [~] time nbook Patrick > > > > > > Error opening /home/pazz/mail/gmail/[Google Mail].All > > > Mail/cur/1330682270_0.12958.megatron,U=8766,FMD5=66ff6a8bc18a8a3ac4b311daa93d358a:2,S: > > > Too many open files > > > Traceback (most recent call last): > > > File "/home/pazz/bin/nbook", line 167, in <module> > > > File "/home/pazz/bin/nbook", line 71, in __init__ > > > File > > > "/home/pazz/.local/lib/python2.7/site-packages/notmuch/message.py", line > > > 233, in get_header > > > notmuch.errors.NullPointerError > > > Error in sys.excepthook: > > > Traceback (most recent call last): > > > File "/usr/lib/python2.7/dist-packages/apport_python_hook.py", line 66, > > > in apport_excepthook > > > ImportError: No module named fileutils > > > > > > Original exception was: > > > Traceback (most recent call last): > > > File "/home/pazz/bin/nbook", line 167, in <module> > > > File "/home/pazz/bin/nbook", line 71, in __init__ > > > File > > > "/home/pazz/.local/lib/python2.7/site-packages/notmuch/message.py", line > > > 233, in get_header > > > notmuch.errors.NullPointerError > > > nbook Patrick 3.20s user 5.47s system 12% cpu 1:11.65 total > > > ------------------------------------ > > > > > > > Yes someone else pointed this out too. Again I'm not sure how to > > proceed here. I had a quick look at this last week and it seemed to me > > the limitation comes from within the python bindings for notmuch. Do > > you have any ideas? > > As mentioned before, I think you invalidate the Database object concurrently > while your long-running algorithm goes through all messages. > Xapian doesn't handle concurrent access to the index like a normalâ„¢ database > would. > This means you are notified by this error that some changes were detected. > Maybe the error message should be more telling here though. Teythoon? The reason for this error is exactly what the error message says, you are opening to many files. Check out this limit using ulimit -n: % ulimit -n 4096 This problem is subtle. Here is a minimal test case: ~~~ snip ~~~ import notmuch with notmuch.Database() as db: query = notmuch.Query(db, 'a').search_messages() for msg in query: msg.get_header('from') with notmuch.Database() as db: query = notmuch.Query(db, 'a').search_messages() for msg in list(query): msg.get_header('from') ~~~ snap ~~~ % python test.py Error opening /home/teythoon/Maildir/.lists.notmuch/cur/1323251462.M53044P18514.thinkbox,S=7306,W=7466:2,: Too many open files Traceback (most recent call last): File "test.py", line 11, in <module> msg.get_header('from') File "/home/teythoon/.local/lib/python2.7/site-packages/notmuch/message.py", line 237, in get_header raise NullPointerError() notmuch.errors.NullPointerError Observe that it blows up in line 11, the first version works. The only difference is that the second version creates a list from the notmuch query. This prevents the garbage collector from collecting the message objects and thus closing the file handles. So here's your fix: ~~~ snip ~~~ diff --git a/nbook b/nbook index 387c71d..b3d4fd6 100755 --- a/nbook +++ b/nbook @@ -173,7 +173,7 @@ class AddressHeaders(object): # Search db = Database() query = Query(db, 'from:"{0}" or to:"{0}"'.format(querystr)) -msgs = list(query.search_messages()) +msgs = query.search_messages() addresses = AddressHeaders(msgs, querystr) print addresses ~~~ snap ~~~ A few more comments: > from notmuch import * Please avoid * imports, they prevent tools like pyflakes from checking whether you accidentally misspelled any identifiers. > pyversion = float('%d.%d' % (sys.version_info.major, sys.version_info.minor)) > if pyversion < 2.7: Converting this to float feels wrong. Consider doing sth like if sys.version_info.major > 2 or (sys.version_info.major == 2 and sys.version_info.minor >= 7): > print '`nbook\' needs Python 2.7 or higher for argparse' Note that in py3k print is a function and not a statement, so you need to use braces. Consider dropping this at the beginning of all your python files to make py2.7 use the new features: from __future__ import print_function, absolute_import, unicode_literals > exit(-1) exit is not a builtin function. You have to use sys.exit. Tools like pyflakes can spot this kind of mistakes. Also, sys.exit also accepts a string as argument which it prints to stderr before exiting with an error code. > self.__fromhdr__ += ',' + msg.get_header('from') Hm, this is somewhat unpythonic. It used to be the case that building strings this way was a lot slower than building a list and then joining it on a delimiter of your choice (i.e. ','.join(from_headers)). This is (was?) because strings are immutable in python and constantly creating strings just to throw them away in the next iteration puts a lot of pressure on the memory management system. Somewhat recent discussion here: http://stackoverflow.com/questions/1316887/what-is-the-most-efficient-string-concatenation-method-in-python > def print_addrs(self, fmtstr='', query=''): > if '' == fmtstr: fmtstr = '%s %s\n' Ok, several things here: * The comparison looks weird, you are using the string constant as the first operand. While this is technically not wrong, it is somewhat unpythonic b/c if you read it out loud (''if the empty string is equal to fmtstr'') it somewhat bends the 1:1 mapping of the semantic of your program and the English sentence. It looks like this c hack that is actually unnecessary in python b/c you cannot use the assignment operator as a value (except for a=b=c=0 style assignments). * Please don't put multiple statements in one line. * This can be written shorter and more idiomatic (yay keyword arguments): def print_addrs(self, fmtstr='%s %s\n', query=''): [...] Happy hacking :) Justus _______________________________________________ notmuch mailing list notmuch@notmuchmail.org http://notmuchmail.org/mailman/listinfo/notmuch