On Mon, Jun 03 2013, Austin Clements <[email protected]> wrote:
>> * Killing a search buffer that is still in the process of being filled
>>   causes errors to be thrown.  I'm seeing both of the following
>>   intermittently:
>>
>>     [Sun Jun 2 08:26:40 2013]
>>     notmuch exited with status killed
>>     command: notmuch search --format\=sexp --format-version\=1
>>       --sort\=newest-first to\:jrollins
>>     exit signal: killed
>>
>>     [Sun Jun 2 08:32:26 2013]
>>     notmuch exited with status hangup
>>     command: notmuch search --format\=sexp --format-version\=1
>>       --sort\=newest-first to\:jrollins
>>     exit signal: hangup
>>
>>   This is somewhat understandable, as the notmuch binary exits with an
>>   error if it hasn't finished dumping the output, but given how common
>>   this particular scenario is I think we should try to avoid throwing
>>   errors in this circumstance.  I wonder if we shouldn't just modify the
>>   binary to not return non-zero if it was manually killed while
>>   processing the output, or at least special-case the particular error
>>   caused by manually killing the search.
>
> Your assessment is correct, of course.  The right place to fix this is
> in Emacs, not the CLI (the CLI *can't* do anything about this, since it
> gets killed by a signal).  Probably we should do something different in
> the sentinel if the search process's buffer is no longer live.  Clearly
> we should suppress the status error for the signal, but I think we still
> should report anything that appeared in err-file because it may be
> relevant to why the user killed the buffer (e.g., maybe a notmuch
> wrapper was blocked on something).
That seems like a reasonable approach to me: suppress the status error
for the signal, but continue to report anything from err-file in the
*Notmuch errors* buffer.
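To make that concrete, here is a rough sketch of the sentinel logic as I
understand the proposal. It is not the actual notmuch code: the `my-`
function names are made up, and I'm assuming the process carries
`err-file' and `results-buffer' properties, which may not match what
notmuch.el really does. It also just uses `message' where the real thing
would write to the *Notmuch errors* buffer.

```elisp
;; Sketch only.  All `my-' names and the process properties
;; `err-file' and `results-buffer' are assumptions for illustration.

(defun my-search-report-status-p (status results-buffer)
  "Return non-nil if a process STATUS error should be reported.
STATUS is a symbol as returned by `process-status'.  A death by
signal is suppressed when RESULTS-BUFFER is no longer live, since
in that case Emacs itself signalled the process."
  (not (and (eq status 'signal)
            (not (buffer-live-p results-buffer)))))

(defun my-search-sentinel (proc _event)
  "Hypothetical sentinel: always report stderr, sometimes the status."
  (let ((status (process-status proc))
        (err-file (process-get proc 'err-file))
        (results-buf (process-get proc 'results-buffer)))
    (when (memq status '(exit signal))
      ;; Report whatever the process wrote to its error file,
      ;; regardless of how the process died.
      (when (and err-file (file-exists-p err-file))
        (with-temp-buffer
          (insert-file-contents err-file)
          (unless (= (buffer-size) 0)
            (message "notmuch stderr: %s" (buffer-string))))
        (delete-file err-file))
      ;; Report the status only for abnormal exits that were not
      ;; caused by the user killing the search buffer.
      (when (and (my-search-report-status-p status results-buf)
                 (or (eq status 'signal)
                     (/= 0 (process-exit-status proc))))
        (message "notmuch exited abnormally: %s" status)))))
```

The point of splitting out the predicate is that the "was this our own
SIGHUP?" decision stays separate from the err-file reporting, so the
latter happens unconditionally.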
>> * The next thing I'm seeing is this:
>>
>> Opening input file: no such file or directory,
>> /home/jrollins/tmp/nmerr5390CAY
>>
>> I'm not exactly sure what causes this error, but it looks to me like
>> the temporary error file was removed before we were finished with it.
>
> This one's pretty awesome (and I think is a bug in Emacs). At a high
> level, the sentinel is getting run twice and since the first call
> deletes the error file, the second call fails. At a low level, what
> causes this is fascinating.
>
> 1) You kill the search buffer. This invokes kill_buffer_processes,
> which sends a SIGHUP to notmuch, but doesn't do anything else.
> Meanwhile, the notmuch search process has printed some more output,
> but Emacs hasn't consumed it yet (this is critical).
>
> 2) Emacs gets a SIGCHLD from the dying notmuch process, which invokes
> handle_child_signal, which sets the new process status, but can't do
> anything else because it's a signal handler.
>
> 3) Emacs returns to its idle loop, which calls status_notify, which sees
> that the notmuch process has a new status. This is where things get
> interesting.
>
> 3.1) Emacs guarantees that it will run process filters on any unconsumed
> output before running the process sentinel, so status_notify calls
> read_process_output, which consumes the final output and calls
> notmuch-search-process-filter.
>
> 3.1.1) notmuch-search-process-filter contains code to check if the
> search buffer is still alive and, since it's not, it calls
> delete-process.
>
> 3.1.1.1) delete-process correctly sees that the process is already dead
> and doesn't try to send another signal, *but* it still modifies
> the status to "killed". To deal with the new status, it calls
> status_notify. Dun dun dun. We've seen this function before.
>
> 3.1.1.1.1) The *recursive* status_notify invocation sees that the
> process has a new status and doesn't have any more output to
> consume, so it invokes our sentinel and returns.
>
> 3.2) The outer status_notify call (which we're still in) is now done
> flushing pending process output, so it *also* invokes our sentinel.
>
> It might be that the answer is to just remove the delete-process call
> from the filter. It seems completely redundant (and racy) with Emacs'
> automatic SIGHUP'ing.
Wow, awesome detective work. As mentioned on IRC, this suggestion of
Austin's does seem to fix the problem:
diff --git a/emacs/notmuch.el b/emacs/notmuch.el
index 5a8c957..975ef2b 100644
--- a/emacs/notmuch.el
+++ b/emacs/notmuch.el
@@ -817,7 +817,7 @@ non-authors is found, assume that all of the authors match."
         (inhibit-read-only t)
         done)
     (if (not (buffer-live-p results-buf))
-        (delete-process proc)
+        t
       (with-current-buffer parse-buf
         ;; Insert new data
         (save-excursion
I'm not sure if this is the ultimate solution, but it does cause the
missing tmp file errors to go away.
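If we wanted belt and braces on top of that, the sentinel could also be
made to tolerate being called twice. The following is only a sketch of
that idea, not something I've tested against notmuch itself: the `my-`
names, the `my-sentinel-done' flag and the `err-file' process property
are all invented for illustration.

```elisp
;; Sketch only.  All `my-' names, the `my-sentinel-done' flag and the
;; `err-file' process property are assumptions for illustration.

(defvar my-cleanup-count 0
  "How many times the hypothetical cleanup has run (for illustration).")

(defun my-search-cleanup (proc)
  "Hypothetical one-time cleanup: remove PROC's temporary error file."
  (let ((err-file (process-get proc 'err-file)))
    (when (and err-file (file-exists-p err-file))
      (delete-file err-file)))
  (setq my-cleanup-count (1+ my-cleanup-count)))

(defun my-search-sentinel-once (proc _event)
  "Run `my-search-cleanup' at most once for PROC, even if called twice."
  (when (and (memq (process-status proc) '(exit signal))
             (not (process-get proc 'my-sentinel-done)))
    ;; Set the flag before doing any work, so that a nested or
    ;; repeated sentinel call sees it and does nothing.
    (process-put proc 'my-sentinel-done t)
    (my-search-cleanup proc)))
```

With this, a second sentinel invocation for the same process becomes a
no-op instead of tripping over the already-deleted error file.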
>> * Finally, something happened that caused *12,000* of the following lines
>> to be sent to the *Notmuch errors* buffer:
>>
>> A Xapian exception occurred performing query: The revision being read has
>> been discarded - you should call Xapian::Database::reopen() and retry the
>> operation
>>
>> Again, this was related to killing a search buffer that was still
>> being filled. I'm pretty sure the database was not modified during
>> this process.
>
> I have no insight on this one. My best guess is that this has nothing
> to do with this change except that this change makes these warnings
> visible rather than burying them somewhere down in the search results
> buffer.
Yeah, I suspected as much as well.
jamie.
