Re: [notmuch] [PATCH 2/2] * avoid gcc 4.4.1 compiler warning due to ignored 'fflush' return value

2009-12-01 Thread Dirk-Jan C . Binnema
Hi Carl,

 Carl == Carl Worth cwo...@cworth.org writes:

Carl [1  text/plain (quoted-printable)]
Carl On Mon, 23 Nov 2009 08:21:50 +0200, Dirk-Jan C. Binnema 
djcb.b...@gmail.com wrote:
 -#define prompt(format, ...) \
 -do {\
 -printf (format, ##__VA_ARGS__); \
 -fflush (stdout);\
 -getline (response, response_size, stdin); \
 -chomp_newline (response);   \
 +#define prompt(format, ...) \
 +do {\
 +int ignored;\
 +printf (format, ##__VA_ARGS__); \
 +fflush (stdout);\
 +ignored = getline (response, response_size, stdin);   \
 +chomp_newline (response);   \
 } while (0)

Carl This patch is incorrect. Ignoring the return value of getline results 
in
Carl the program invoking undefined behavior by reading uninitialized
Carl memory. This is easily tested by, for example, typing Control-D to
Carl provide EOF to a prompt from notmuch setup.

Carl How about just exiting in case of EOF as in the patch below?

Sure, that's the better solution, but note that my patch did not introduce the
undefined behavior -- it was there before. I was trying a minimal patch to
silencing the warning. Note that prompt seems to leak a bit, even after the
committed patch; attached are two more micro patches to fix this and another
small leak. I try to do minimal changes, but the prompt business gets a bit
unwieldy. The leaks are one-time at not critical, but anyway it's always good
stay vigilant.



0001-notmuch-config-fix-small-leak-from-g_key_file_to_dat.patch
Description: Binary data




0002-free-the-response-data-from-prompt.patch
Description: Binary data



Best wishes,
Dirk.

-- 
Dirk-Jan C. Binnema  Helsinki, Finland
e:d...@djcbsoftware.nl   w:www.djcbsoftware.nl
pgp: D09C E664 897D 7D39 5047 A178 E96A C7A1 017D DA3C

___
notmuch mailing list
notmuch@notmuchmail.org
http://notmuchmail.org/mailman/listinfo/notmuch


[notmuch] Subject: [PATCH] update the check whether a dir entry should be ignored.

2009-12-12 Thread Dirk-Jan C . Binnema
Hi all,

This is a draft patch which hopefully improves the check whether a dir entry
should be ignored for that. It adds one feature: if you put a file '.noindex'
in a dir, the whole dir will be ignored for indexing. I find this very useful
for removing e.g. folders with spam messages from the indexing.

There is one maybe controversial change, namely that it ignores all dot-dirs;
this works fine for .notmuch and .nnmaildir (gnus), but maybe there is some
valid use case for having mail in dot-dirs. Maybe one of the IMAP-servers does
this? Not sure. Anyway, I can change that part.

If the overall approach is considered OK, I can make a new patch

Best wishes,
Dirk.


---
 notmuch-new.c |   48 ++--
 1 files changed, 34 insertions(+), 14 deletions(-)

diff --git a/notmuch-new.c b/notmuch-new.c
index 9d20616..28f69bc 100644
--- a/notmuch-new.c
+++ b/notmuch-new.c
@@ -109,6 +109,30 @@ is_maildir (struct dirent **entries, int count)
 return 0;
 }
 
+
+static int
+ignore_dir_entry (const char* path, struct dirent *entry)
+{
+char noindex[4096]; /* any path will fit */
+
+/* ignore everything starting with a dot; this covers hidden
+* files, as well as special dir (. and ..), but also things like
+* gnus .nnmaildir or .notmuch */
+if (entry-d_name[0] == '.')
+   return 1;
+
+/* we also check if dir contains a file called '.noindex'; if so,
+ * we ignore this directory; alloca would be suitable here, if not
+ * for the portability. */
+snprintf (noindex, sizeof(noindex), %s/%s/.noindex, path, entry-d_name);
+if (access (noindex, F_OK) == 0)
+   return 1;
+
+return 0; /* don't ignore */
+}
+
+
+
 /* Examine 'path' recursively as follows:
  *
  *   o Ask the filesystem for the mtime of 'path' (path_mtime)
@@ -181,21 +205,17 @@ add_files_recursive (notmuch_database_t *notmuch,
if (path_mtime = path_dbtime  entry-d_type == DT_REG)
continue;
 
-   /* Ignore special directories to avoid infinite recursion.
-* Also ignore the .notmuch directory.
-*/
-   /* XXX: Eventually we'll want more sophistication to let the
-* user specify files to be ignored. */
-   if (strcmp (entry-d_name, .) == 0 ||
-   strcmp (entry-d_name, ..) == 0 ||
-   (entry-d_type == DT_DIR 
-(strcmp (entry-d_name, tmp) == 0) 
-is_maildir (namelist, num_entries)) ||
-   strcmp (entry-d_name, .notmuch) ==0)
-   {
-   continue;
-   }
 
+   /* ignore tmp Maildirs, for obvious reasons */
+   if (entry-d_type == DT_DIR 
+   (strcmp (entry-d_name, tmp) == 0) 
+   is_maildir (namelist, num_entries))
+   continue;
+   
+   /* ignore special directories and files */
+   if (ignore_dir_entry (path, entry))
+   continue;
+   
next = talloc_asprintf (notmuch, %s/%s, path, entry-d_name);
 
if (stat (next, st)) {
-- 
1.6.3.3

___
notmuch mailing list
notmuch@notmuchmail.org
http://notmuchmail.org/mailman/listinfo/notmuch


Re: [notmuch] indexing mail?

2010-01-14 Thread Dirk-Jan C . Binnema
 Olly == Olly Betts o...@survex.com writes:

Olly On 2010-01-14, Carl Worth wrote:
 On Thu, 14 Jan 2010 18:38:54 +0100, Adrian Perez de Castro 
ape...@igalia.com wrote:
 I am using XFS, which always returns DT_UNKNOWN. Taking into account 
that
 there is a good deal of people using filesystems other than the ones you
 mention, and that other non-linux filesystems may also return 
DT_UNKNOWN,
 in my opinion there should be a fall-back. I will try to post a patch
 Anytime Soon=E2=84=A2.
 
 We definitely want the fallback. I can attempt to code it, but I don't
 have ready access to an afflicted filesystem, so I'd need help testing
 anyway.
 
 I'd love to see a patch for this bug soon. Be sure to CC me when the
 patch is sent and that will help me commit it sooner.

Olly Not a full patch, but I already posted what this code should look like
Olly to handle both systems without d_type, and those which return 
DT_UNKNOWN:

Olly http://article.gmane.org/gmane.mail.notmuch.general/1044

I take a slighly different approach in mu:

/* if the file system does not support entry-d_type, we add it ourselves
 * this is slower (extra stat) but at least it works
 */
static gboolean
_set_dtype (const char* path, struct dirent *entry)
{
struct stat statbuf;
char fullpath[4096];

snprintf (fullpath, sizeof(fullpath), %s%c%s,
  path, G_DIR_SEPARATOR, entry-d_name);

if (stat (fullpath, statbuf) != 0) {
g_warning (stat failed on %s: %s, fullpath,
   strerror(errno));
return FALSE;
}

/* we only care about dirs, regular files and links */
if (S_ISREG (statbuf.st_mode))
entry-d_type = DT_REG;
else if (S_ISDIR (statbuf.st_mode))
entry-d_type = DT_DIR;
else if (S_ISLNK (statbuf.st_mode))
entry-d_type = DT_LNK;

return TRUE;
}


and then in some other places:

/* handle FSs that don't support entry-d_type */
if (entry-d_type == DT_UNKNOWN) 
_set_dtype (dirname, entry);


Note, that is untested as of yet.

Best wishes,
Dirk.

-- 
Dirk-Jan C. Binnema  Helsinki, Finland
e:d...@djcbsoftware.nl   w:www.djcbsoftware.nl
pgp: D09C E664 897D 7D39 5047 A178 E96A C7A1 017D DA3C
___
notmuch mailing list
notmuch@notmuchmail.org
http://notmuchmail.org/mailman/listinfo/notmuch


Re: [notmuch] indexing mail?

2010-01-15 Thread Dirk-Jan C . Binnema
Hi Olly,

 Olly == Olly Betts o...@survex.com writes:

Olly On 2010-01-15, Dirk-Jan C  Binnema wrote:
 Olly == Olly Betts o...@survex.com writes:
Olly Not a full patch, but I already posted what this code should look like
Olly to handle both systems without d_type, and those which return 
DT_UNKNOWN:
 
Olly http://article.gmane.org/gmane.mail.notmuch.general/1044

 static gboolean
 _set_dtype (const char* path, struct dirent *entry)

Olly Underscore prefixed identifiers are reserved by ISO C at file-scope;
Olly using them yourself is undefined behaviour...

Ah, thanks for reminding, I thought it was __ and _C (capital), but you are
right:

, (7.1.3 Reserved identifiers)
|  All identifiers that begin with an underscore and either an uppercase letter
|  or another underscore are always reserved for any use.
| 
| — All identifiers that begin with an underscore are always reserved for use as
| identifiers with file scope in both the ordinary and tag name spaces.
`

 /* we only care about dirs, regular files and links */
 if (S_ISREG (statbuf.st_mode))
entry- d_type = DT_REG;
 else if (S_ISDIR (statbuf.st_mode))
entry- d_type = DT_DIR;
 else if (S_ISLNK (statbuf.st_mode))
entry- d_type = DT_LNK;

Olly This addresses the case where the FS returns DT_UNKNOWN for d_type,
Olly but doesn't deal with the case of platforms where struct dirent has
Olly no d_type member - from the Linux readdir man page:

Olly   The only fields in the dirent structure that are mandated by
Olly   POSIX.1 are: d_name[], of unspecified size, with at most NAME_MAX
Olly   characters preceding the terminating null byte; and (as an XSI
Olly   extension) d_ino.  The other fields are unstandardized, and not
Olly   present on all systems; see NOTES below for some further details.

Olly And in NOTES:

Olly   Other than Linux, the d_type field is available mainly only on BSD
Olly   systems.

Yes, my patch could me generalized a bit more, just like your patch could not
hardcode the '/'-separator :)

In practice though, what Unices in use today do not support d_type?

Best wishes,
Dirk,

-- 
Dirk-Jan C. Binnema  Helsinki, Finland
e:d...@djcbsoftware.nl   w:www.djcbsoftware.nl
pgp: D09C E664 897D 7D39 5047 A178 E96A C7A1 017D DA3C
___
notmuch mailing list
notmuch@notmuchmail.org
http://notmuchmail.org/mailman/listinfo/notmuch


Re: Re: thread ordering based on references and/or in-reply-to

2011-11-04 Thread Dirk-Jan C . Binnema


On Wed 02 Nov 2011 04:37:05 PM EET, Austin Clements wrote:

  On Mon, Oct 31, 2011 at 7:07 PM, Florian Friesdorf f...@chaoflow.net wrote:
  
   Hi,
  
   I'm looking into taking the References header into account for thread
   ordering. So far only In-Reply-To is used. My C/C++ is rusty at best, so
   I'd need some help to get this done.

snip
 
  I know this came up on IRC, but have you looked at jwz's threading
  algorithm (http://www.jwz.org/doc/threading.html)?  Carl mentioned
  that notmuch already implements it (except for subject matching), but
  notmuch only implements the subset of it necessary to group messages
  into threads without structure.  Much of the algorithm is devoted to
  exactly this problem of piecing together the thread structure based on
  all of the information in both In-Reply-To and References.  The
  algorithm as described combines the issues of grouping and structuring
  since it's expecting a giant pile of mail as input, but there's no
  reason these can't be teased apart.

I've implemented it for mu[1], maybe some of it can be reusable for notmuch;
see mu-threader.[ch] and mu-container.[ch] in

   http://gitorious.org/mu/mu/blobs/master/src/

(starting point is mu_threader_calculate).
   
I didn't implement subject matching yet, but it does build the hierarchy as
per JWZ and References:.

Best wishes,
Dirk.

-- 
Dirk-Jan C. Binnema  Helsinki, Finland
e:d...@djcbsoftware.nl   w:www.djcbsoftware.nl
pgp: D09C E664 897D 7D39 5047 A178 E96A C7A1 017D DA3C
___
notmuch mailing list
notmuch@notmuchmail.org
http://notmuchmail.org/mailman/listinfo/notmuch


Re: ignore folders patch?

2012-01-22 Thread Dirk-Jan C . Binnema

Hi,

On 2012-01-22T13:32:12 EET, Fabio Zanini wrote:

  Hi!
 
  This is my first message to the list. In 2010 a patch was developed that
  enabled notmuch new to ignore certain subdirectories of the mail folder:
 
  http://notmuchmail.org/pipermail/notmuch/2010/003170.html
 
  However, I cannot find any reference to it in the git history, and it
  does not seem to have been implemented.
 
  This feature seems to be relevant, what would you think of reviving it
  and including it into the codebase? I could try to update the patch
  myself.

I don't think my patch made it either:
 http://notmuchmail.org/pipermail/notmuch/2010/001103.html

which support ignoring directories by dropping a '.noindex' file in them. I've
been happily using that with the mu program for years.

Best wishes,
Dirk.

--
Dirk-Jan C. Binnema  Helsinki, Finland
e:d...@djcbsoftware.nl   w:www.djcbsoftware.nl
pgp: D09C E664 897D 7D39 5047 A178 E96A C7A1 017D DA3C
___
notmuch mailing list
notmuch@notmuchmail.org
http://notmuchmail.org/mailman/listinfo/notmuch


[notmuch] interesting project!

2009-11-21 Thread Dirk-Jan C. Binnema
Hi all,

Wow, 'notmuch' looks like a very interesting project. In 2008, I wrote an
e-mail (Maildir) search tool called 'mu'[1], also using Xapian and GMime; my
plan was at some point to turn it into a mail reader (use
offlineimap/fetchmail etc. for getting the mail, and something else for
sending it), but I never got that far. Search works pretty well
though. Anyhow, it seems notmuch is getting there quickly.

Anyhow, I'll study the notmuch code and see if there are some useful bits in
my code that might make sense there, e.g., various dir scanning optimizations,
see [2].

Good luck!
Dirk.


[1] http://www.djcbsoftware.nl/code/mu/
[2] http://djcbflux.blogspot.com/2008/10/seek-destroy.html

-- 
Dirk-Jan C. Binnema  Helsinki, Finland
e:djcb at djcbsoftware.nl   w:www.djcbsoftware.nl
pgp: D09C E664 897D 7D39 5047 A178 E96A C7A1 017D DA3C


[notmuch] interesting project!

2009-11-22 Thread Dirk-Jan C. Binnema
Hi Carl,

>>>>> "Carl" == Carl Worth  writes:

>> Anyhow, I'll study the notmuch code and see if there are some useful
>> bits in my code that might make sense there, e.g., various dir scanning
>> optimizations, see [2].

Carl> That sounds great. It's also good to have people with experience in
Carl> this area join and help out. I'll look forward to any ideas or other
Carl> contributions you will have.

Thanks for the nice words!

A small question: it seems that notmuch is avoiding the use of GLib directly
(of course, it depend on it anyway through GMime); is this because of
OOM-handling? It'd be nice if GLib could be used, it would make some things
quite a bit easier.

Best wishes,
Dirk.

-- 
Dirk-Jan C. Binnema  Helsinki, Finland
e:djcb at djcbsoftware.nl   w:www.djcbsoftware.nl
pgp: D09C E664 897D 7D39 5047 A178 E96A C7A1 017D DA3C


[notmuch] [PATCH] notmuch: Add Maildir directory name as tag name for messages

2009-11-22 Thread Dirk-Jan C. Binnema
>>>>> "Carl" == Carl Worth  writes:

Carl> On Sun, 22 Nov 2009 00:25:43 +0100, Michiel Buddingh'
Carl>  wrote:

Carl> Hi Michel, welcome to Notmuch!

>> Any attempt to match tags up to directories will eventually have 
>> to deal with with the fact that tags can't be neatly mapped onto 
>> them.  If I remove a directory-tag from a message, does this 
>> mean the message is removed from that directory?  What if a 
>> message has two directory-tags, does it mean it's present in both
>> directories?

Carl> Right. We definitely don't want a strong mapping here. I think one
Carl> answer could be that the initial import is different and can take the
Carl> directory names as a "hint" for initializing tag values. After that,
Carl> they are just tags in the database and the user can do whatever they
Carl> want.

In 'mu' I took a slightly different approach, by simply allowing for searches
on the path to the message; ie., by adding the path as just another property
of the message, just like subject: or from:.

Now, the trouble with that the path of a message is not very stable -- people
move messages around. In 'mu' this was handled by simply considering the moved
message as a new one, and removing message from the database whose paths do
not point to message anymore. This makes things quite robust against people
moving things around or deleting them. Tags could be connected to the
hopefully unique Maildir file name then. 

Carl>   notmuch search tag:foo and not tag:foo-search
Carl>   notmuch search tag:foo-search and not tag:foo

Carl> So even if in the end the user comes up with something that could
Carl> replace the original tag, it's probably still beneficial to have it
Carl> there when starting out.

So, an alternative would be something like:
 notmuch search path:inbox

But this requires notmuch to be able to update things when paths change.

Best wishes,
Dirk.

-- 
Dirk-Jan C. Binnema  Helsinki, Finland
e:djcb at djcbsoftware.nl   w:www.djcbsoftware.nl
pgp: D09C E664 897D 7D39 5047 A178 E96A C7A1 017D DA3C


[notmuch] [PATCH] notmuch: Add Maildir directory name as tag name for messages

2009-11-22 Thread Dirk-Jan C. Binnema
Hi Michiel,

>>>>> "Michiel" == Michiel Buddingh'  writes:

Michiel> Another thing; notmuch currently indexes by message-id (or SHA-1
Michiel> hash if that is not present).  Maildir, on the other hand, uses
Michiel> file names as stable (when stripped of the parts trailing the
Michiel> colon) unique (knock on wood) identifiers.  A Maildir-aware
Michiel> notmuch could incorporate this to be far more resistant to bulk
Michiel> mail moves done by other clients, by using filename lookups to
Michiel> avoid accessing and parsing the mail files themselves.

I think that is a good idea.

Michiel> I should re-iterate that I'm new to notmuch, and it's obvious
Michiel> that I'm trying to beat it into becoming something it was never
Michiel> intended to be; on the other hand, I'd like people to chime in on
Michiel> this.

I'm new here too, so please be gentle as I learn how things work

Michiel> +
Michiel> +static void
Michiel> +derive_tags_from_maildir_flags (notmuch_message_t *message, const 
char *
Michiel> path)

I see you don't handle the "N" -- is that deliberate? Also, a minor addition
may to also allow for '!' instead of ':' as a separator, as that's the
semi-official way to use Maildirs on (V)FAT filesystems (which don't allow for
colons in filenames).


Best wishes,
Dirk.


-- 
Dirk-Jan C. Binnema  Helsinki, Finland
e:djcb at djcbsoftware.nl   w:www.djcbsoftware.nl
pgp: D09C E664 897D 7D39 5047 A178 E96A C7A1 017D DA3C


[notmuch] [PATCH] fix compiler warnings

2009-11-22 Thread Dirk-Jan C. Binnema

(hopefully this is the correct way to send patches...)

With these minor changes, notmuch compiles warning-free with gcc 4.4.1

---
 notmuch-new.c   |4 +++-
 notmuch-setup.c |   13 +++--
 notmuch-tag.c   |4 +++-
 3 files changed, 13 insertions(+), 8 deletions(-)

diff --git a/notmuch-new.c b/notmuch-new.c
index 0dd2784..88b48a6 100644
--- a/notmuch-new.c
+++ b/notmuch-new.c
@@ -36,7 +36,9 @@ static void
 handle_sigint (unused (int sig))
 {
 static char msg[] = "Stopping... \n";
-write(2, msg, sizeof(msg)-1);
+if (write(2, msg, sizeof(msg)-1) < 0) {
+   /* ignore...*/
+}
 interrupted = 1;
 }

diff --git a/notmuch-setup.c b/notmuch-setup.c
index 482efd2..0d3f186 100644
--- a/notmuch-setup.c
+++ b/notmuch-setup.c
@@ -99,12 +99,13 @@ notmuch_setup_command (unused (void *ctx),
 unsigned int i;
 int is_new;

-#define prompt(format, ...)\
-do {   \
-   printf (format, ##__VA_ARGS__); \
-   fflush (stdout);\
-   getline (, _size, stdin); \
-   chomp_newline (response);   \
+#define prompt(format, ...)\
+do {   \
+   printf (format, ##__VA_ARGS__); \
+   fflush (stdout);\
+   if (getline (, _size, stdin) < 0) \
+   break;  \
+   chomp_newline (response);   \
 } while (0)

 config = notmuch_config_open (ctx, NULL, _new);
diff --git a/notmuch-tag.c b/notmuch-tag.c
index e2311f6..5e40f50 100644
--- a/notmuch-tag.c
+++ b/notmuch-tag.c
@@ -26,7 +26,9 @@ static void
 handle_sigint (unused (int sig))
 {
 static char msg[] = "Stopping... \n";
-write(2, msg, sizeof(msg)-1);
+if (write(2, msg, sizeof(msg)-1) < 0) {
+   /* ignore... */
+}
 interrupted = 1;
 }

-- 
1.6.3.3



[notmuch] [PATCH] notmuch: Add Maildir directory name as tag name for messages

2009-11-22 Thread Dirk-Jan C. Binnema
Hi Michiel,

>>>>> "MB" == Michiel Buddingh'  writes:

    MB> Dirk-Jan C. Binnema  wrote:
Michiel> +
Michiel> +static void
Michiel> +derive_tags_from_maildir_flags (notmuch_message_t 
Michiel> *message, const char *
Michiel> path)
>> 
>> I see you don't handle the "N" -- is that deliberate? Also, a 
>> minor addition may to also allow for '!' instead of ':' as a 
>> separator, as that's the semi-official way to use Maildirs on 
>> (V)FAT filesystems (which don't allow for colons in filenames).

MB> Not deliberate.  Simply unaware of the "N" flag, nor aware of 
MB> practices for storing Maildirs on (V)FAT.

Ah, you are right. In my code,
 http://gitorious.org/mu/mainline/blobs/master/msg/mu-msg-flags.c
I handled it because apparently some programs where using that. But I guess
it's better to stick to DJB's Maildir spec for now...

Regarding the VFAT-limitations, see eg. the Note in:
  http://docs.python.org/lib/mailbox-maildir.html
anyway, a minor point.


Best wishes,
Dirk.

-- 
Dirk-Jan C. Binnema  Helsinki, Finland
e:djcb at djcbsoftware.nl   w:www.djcbsoftware.nl
pgp: D09C E664 897D 7D39 5047 A178 E96A C7A1 017D DA3C


[notmuch] [PATCH 1/2] * avoid gcc 4.4.1 compiler warnings due to ignored write return values

2009-11-23 Thread Dirk-Jan C. Binnema
From: Dirk-Jan C. Binnema <djcb.b...@gmail.com>
Date: Mon, 23 Nov 2009 08:03:35 +0200


---
 notmuch-new.c |4 +++-
 notmuch-tag.c |4 +++-
 2 files changed, 6 insertions(+), 2 deletions(-)

diff --git a/notmuch-new.c b/notmuch-new.c
index a2b30bd..3d04efa 100644
--- a/notmuch-new.c
+++ b/notmuch-new.c
@@ -35,8 +35,10 @@ static volatile sig_atomic_t interrupted;
 static void
 handle_sigint (unused (int sig))
 {
+ssize_t ignored;
 static char msg[] = "Stopping... \n";
-write(2, msg, sizeof(msg)-1);
+
+ignored = write(2, msg, sizeof(msg)-1);
 interrupted = 1;
 }

diff --git a/notmuch-tag.c b/notmuch-tag.c
index e2311f6..ec98c3b 100644
--- a/notmuch-tag.c
+++ b/notmuch-tag.c
@@ -25,8 +25,10 @@ static volatile sig_atomic_t interrupted;
 static void
 handle_sigint (unused (int sig))
 {
+ssize_t ignored;
+   
 static char msg[] = "Stopping... \n";
-write(2, msg, sizeof(msg)-1);
+ignored = write(2, msg, sizeof(msg)-1);
 interrupted = 1;
 }

-- 
1.6.3.3



[notmuch] [PATCH 2/2] * avoid gcc 4.4.1 compiler warning due to ignored 'fflush' return value

2009-11-23 Thread Dirk-Jan C. Binnema


---
 notmuch-setup.c |   13 +++--
 1 files changed, 7 insertions(+), 6 deletions(-)

diff --git a/notmuch-setup.c b/notmuch-setup.c
index d06fbf8..c50f812 100644
--- a/notmuch-setup.c
+++ b/notmuch-setup.c
@@ -100,12 +100,13 @@ notmuch_setup_command (unused (void *ctx),
 unsigned int i;
 int is_new;

-#define prompt(format, ...)\
-do {   \
-   printf (format, ##__VA_ARGS__); \
-   fflush (stdout);\
-   getline (, _size, stdin); \
-   chomp_newline (response);   \
+#define prompt(format, ...)\
+do {   \
+   int ignored;\
+   printf (format, ##__VA_ARGS__); \
+   fflush (stdout);\
+   ignored = getline (, _size, stdin);   \
+   chomp_newline (response);   \
 } while (0)

 config = notmuch_config_open (ctx, NULL, _new);
-- 
1.6.3.3



[notmuch] interesting project!

2009-11-23 Thread Dirk-Jan C. Binnema
Hi Carl,

>>>>> "Carl" == Carl Worth  writes:

Carl> On Sun, 22 Nov 2009 14:23:10 +0200, Dirk-Jan C. Binnema
Carl>  wrote:
>> A small question: it seems that notmuch is avoiding the use of GLib 
directly
>> (of course, it depend on it anyway through GMime); is this because of
>> OOM-handling? It'd be nice if GLib could be used, it would make some 
things
>> quite a bit easier.

Carl> It's true that I don't like the OOM handling in glib. I also think 
that
Carl> glib tries to be too many different things at the same time. And
Carl> finally, having some talloc-friendly data structures (like a 
hash-table)
Carl> would be really nice.

Well, the counter point to the OOM-problems is that is that in many programs,
the 'malloc returns NULL'-case is often not very well tested (because it's
rather hard to test), and that at least on Linux, it's unlikely that malloc
ever does return NULL. Lennart Poettering wrote this up in some more
detail[1]. Of course, the requirements for notmuch may be a bit different and
I definitely don't want to suggest any radical change here after only finding
out about notmuch a few days ago :)

(BTW, there is a hashtable implementation in libc, (hcreate(3) etc.). Is that
one not sufficiently 'talloc-friendly'? It's not very user-friendly, but
that's another matter)

Carl> In the meantime, as you say, we're already linking with glib because 
of
Carl> GMime, so there's really no reason not to call functions that are 
there
Carl> and that do what we want. What kinds of things were you thinking of 
that
Carl> would be easier with glib?

I could imagine the string functions could replace the ones in talloc. There
are many more string functions, e.g., for handling file names / paths, which
are quite useful. Then there are wrappers for gcc'isms (G_UNLIKELY etc.) that
would make the ones in notmuch unneeded, and a lot of compatibility things
like G_DIR_SEPARATOR. And the datastructures (GSlice/GList/GHashtable) are
nice. The UTF8 functionality might come in handy.

Anyway, I was just curious, people have survived without GLib before, and if
you dislike the OOM-strategy, it's a bit of a no-no of course.

Best wishes,
Dirk.


[1] http://article.gmane.org/gmane.comp.audio.jackit/19998

-- 
Dirk-Jan C. Binnema  Helsinki, Finland
e:djcb at djcbsoftware.nl   w:www.djcbsoftware.nl
pgp: D09C E664 897D 7D39 5047 A178 E96A C7A1 017D DA3C


[notmuch] interesting project!

2009-11-24 Thread Dirk-Jan C. Binnema
Hi Carl,

>>>>> "Carl" == Carl Worth  writes:

Carl> I agree that trying to support OOM doesn't make sense without
Carl> testing. But that's why I want to test notmuch with memory-fault
Carl> injection. We've been doing this with the cairo library with good
Carl> success for a while.

Carl> As for "unlikely that malloc ever returns NULL", that's simply a
Carl> system-configuration away (just turn off overcommit). And I can 
imagine
Carl> notmuch being used in lots of places, (netbooks, web servers, etc.), 
so
Carl> I do want to make it as robust as possible.

That is a very laudable goal! But it's also quite hard to achieve, considering
that both GMime and Xapian may have some different ideas about that. And at
least in the current code, I see fprintfs in 'malloc-returns-NULL'-cases --
but fprintf itself will probably allocate memory too. Also, at least now, the
bad?alloc exceptions for C++ are not caught. Of course, that can be changed,
but it's just to show that these things are hard to get right.

Carl> Thanks for mentioning the hash table. The hash table is one of the few
Carl> things that I *am* using from glib right now in notmuch. It's got a
Carl> couple of bizarre things about it:

Carl>   1. The simpler-appearing g_hash_table_new function is useless
Carl>  for common cases like hashing strings. It will just leak
Carl>  memory. So g_hash_table_new_full is the only one worth using.

Hmmm, I never noticed that behavior. Tf you are using dynamically allocated
strings, GHashTable won't free them for you -- but I can really see how it
could (given that it takes generic pointers), so you have to free those
yourself. But any memleaks beyond that?

Carl>   2. There are two lookup functions, g_hash_table_lookup, and
Carl>  g_hash_table_lookup_extended.

Carl>  So, it might make sense if a hash-table interface supported
Carl>  these two modes well. What's bizarre about GHashTable though,
Carl>  is that in the "just a set" case, we only use NULL as the
Carl>  value when inserting. And distinguish "previously inserted
Carl>  with NULL" from "never inserted" is the one thing that
Carl>  g_hash_table_lookup can't do. So I've only found that I could
Carl>  ever use g_hash_table_lookup_extended, (and pass a pair of
Carl>  NULLs for the return arguments I don't need).

Hmmn, well in I found that returning NULL for 'not set' works in many cases,
and it makes it quite easy for that. If you need to distinguish between NULL
and 'not set', you can use either the _extended version as you mention, or use
some special NOT_SET static ptr you can compare with (and handle it
appropriately in the destructor).

Carl> I definitely like the idea of having tiny, focused libraries that do
Carl> one thing and do it well, (and maybe even some things so tiny that
Carl> they are actually designed to be copied into the application---like
Carl> with gnulib and with Eric's new hash table).

Ok; glib fills the role pretty well for me, and I don't really pay for the
parts that I don't use. But tastes differ, no problem ;-)

Carl> Thanks for understanding. :-)
Carl> And I enjoy the conversation,

Same here :) 

Best wishes,
Dirk.

-- 
Dirk-Jan C. Binnema  Helsinki, Finland
e:djcb at djcbsoftware.nl   w:www.djcbsoftware.nl
pgp: D09C E664 897D 7D39 5047 A178 E96A C7A1 017D DA3C


[notmuch] [PATCH] notmuch-count: make sure all created items are freed, even in error paths

2009-11-24 Thread Dirk-Jan C. Binnema

Another minor patch, fixing a couple of resource leaks in error paths.

---
 notmuch-count.c |   52 
 1 files changed, 36 insertions(+), 16 deletions(-)

diff --git a/notmuch-count.c b/notmuch-count.c
index 77aa433..b5808f5 100644
--- a/notmuch-count.c
+++ b/notmuch-count.c
@@ -28,7 +28,8 @@ notmuch_count_command (void *ctx, int argc, char *argv[])
 notmuch_database_t *notmuch;
 notmuch_query_t *query;
 char *query_str;
-int i;
+int i, retval;
+
 #if 0
 char *opt, *end;
 int i, first = 0, max_threads = -1;
@@ -76,35 +77,54 @@ notmuch_count_command (void *ctx, int argc, char *argv[])
 argc -= i;
 argv += i;

+config= NULL;
+query_str = NULL;
+notmuch   = NULL;
+query = NULL;
+
 config = notmuch_config_open (ctx, NULL, NULL);
-if (config == NULL)
-   return 1;
-
-notmuch = notmuch_database_open (notmuch_config_get_database_path (config),
-NOTMUCH_DATABASE_MODE_READ_ONLY);
-if (notmuch == NULL)
-   return 1;
-
+if (config == NULL) {
+   retval = 1;
+   goto out;
+}
+
 query_str = query_string_from_args (ctx, argc, argv);
 if (query_str == NULL) {
fprintf (stderr, "Out of memory.\n");
-   return 1;
+   retval = 1;
+   goto out;
 }
 if (*query_str == '\0') {
fprintf (stderr, "Error: notmuch count requires at least one count 
term.\n");
-   return 1;
+   retval = 1;
+   goto out;
 }
-
+
+notmuch = notmuch_database_open (notmuch_config_get_database_path (config),
+NOTMUCH_DATABASE_MODE_READ_ONLY);
+if (notmuch == NULL) {
+   fprintf (stderr, "Failed to open database at %s\n",
+notmuch_config_get_database_path (config));
+   retval = 1;
+   goto out;
+}
+
 query = notmuch_query_create (notmuch, query_str);
 if (query == NULL) {
fprintf (stderr, "Out of memory\n");
-   return 1;
+   retval = 1;
+   goto out;
 }

 printf ("%u\n", notmuch_query_count_messages(query));
+retval = 0;

+out:
+talloc_free (query_str);
+notmuch_config_close (config);
 notmuch_query_destroy (query);
-notmuch_database_close (notmuch);
-
-return 0;
+if (notmuch) /* _close does not allow NULL */
+   notmuch_database_close (notmuch);
+
+return retval;
 }
-- 
1.6.3.3



[notmuch] [PATCH 2/2] * avoid gcc 4.4.1 compiler warning due to ignored 'fflush' return value

2009-12-01 Thread Dirk-Jan C. Binnema
Hi Carl,

>>>>> "Carl" == Carl Worth  writes:

Carl> [1  ]
Carl> On Mon, 23 Nov 2009 08:21:50 +0200, Dirk-Jan C. Binnema  wrote:
>> -#define prompt(format, ...) \
>> -do {\
>> -printf (format, ##__VA_ARGS__); \
>> -fflush (stdout);\
>> -getline (, _size, stdin); \
>> -chomp_newline (response);   \
>> +#define prompt(format, ...) \
>> +do {\
>> +int ignored;\
>> +printf (format, ##__VA_ARGS__); \
>> +fflush (stdout);\
>> +ignored = getline (, _size, stdin);   \
>> +chomp_newline (response);   \
>> } while (0)

Carl> This patch is incorrect. Ignoring the return value of getline results 
in
Carl> the program invoking undefined behavior by reading uninitialized
Carl> memory. This is easily tested by, for example, typing Control-D to
Carl> provide EOF to a prompt from "notmuch setup".

Carl> How about just exiting in case of EOF as in the patch below?

Sure, that's the better solution, but note that my patch did not introduce the
undefined behavior -- it was there before. I was trying a minimal patch to
silencing the warning. Note that prompt seems to leak a bit, even after the
committed patch; attached are two more micro patches to fix this and another
small leak. I try to do minimal changes, but the prompt business gets a bit
unwieldy. The leaks are one-time at not critical, but anyway it's always good
stay vigilant.

-- next part --
A non-text attachment was scrubbed...
Name: 0001-notmuch-config-fix-small-leak-from-g_key_file_to_dat.patch
Type: application/octet-stream
Size: 491 bytes
Desc: not available
URL: 
<http://notmuchmail.org/pipermail/notmuch/attachments/20091201/6a42087a/attachment.obj>
-- next part --


-- next part --
A non-text attachment was scrubbed...
Name: 0002-free-the-response-data-from-prompt.patch
Type: application/octet-stream
Size: 1674 bytes
Desc: not available
URL: 
<http://notmuchmail.org/pipermail/notmuch/attachments/20091201/6a42087a/attachment-0001.obj>
-- next part --



Best wishes,
Dirk.

-- 
Dirk-Jan C. Binnema  Helsinki, Finland
e:djcb at djcbsoftware.nl   w:www.djcbsoftware.nl
pgp: D09C E664 897D 7D39 5047 A178 E96A C7A1 017D DA3C



[notmuch] [PATCH 2/2] * free the response data from 'prompt'

2009-12-02 Thread Dirk-Jan C. Binnema
Free the results of the prompt; this patch does the minimal job for that.
It may be nice to refactor the function a bit. 

Signed-off-by: Dirk-Jan C. Binnema 
---
 notmuch-setup.c |   15 +--
 1 files changed, 13 insertions(+), 2 deletions(-)

diff --git a/notmuch-setup.c b/notmuch-setup.c
index 622bbaa..293c852 100644
--- a/notmuch-setup.c
+++ b/notmuch-setup.c
@@ -119,12 +119,16 @@ notmuch_setup_command (unused (void *ctx),
 prompt ("Your full name [%s]: ", notmuch_config_get_user_name (config));
 if (strlen (response))
notmuch_config_set_user_name (config, response);
-
+free (response);
+response = NULL;
+
 prompt ("Your primary email address [%s]: ",
notmuch_config_get_user_primary_email (config));
 if (strlen (response))
notmuch_config_set_user_primary_email (config, response);
-
+free (response);
+response = NULL;
+
 other_emails = g_ptr_array_new ();

 old_other_emails = notmuch_config_get_user_other_email (config,
@@ -136,12 +140,17 @@ notmuch_setup_command (unused (void *ctx),
else
g_ptr_array_add (other_emails, talloc_strdup (ctx,
 old_other_emails[i]));
+   free (response);
+   response = NULL;
 }

 do {
prompt ("Additional email address [Press 'Enter' if none]: ");
if (strlen (response))
g_ptr_array_add (other_emails, talloc_strdup (ctx, response));
+   free (response);
+   response = NULL;
+   
 } while (strlen (response));
 if (other_emails->len)
notmuch_config_set_user_other_email (config,
@@ -158,6 +167,8 @@ notmuch_setup_command (unused (void *ctx),
absolute_path = make_path_absolute (ctx, response);
notmuch_config_set_database_path (config, absolute_path);
 }
+free (response);
+response = NULL;

 if (! notmuch_config_save (config)) {
if (is_new)
-- 
1.6.3.3



[notmuch] [PATCH 1/2] * notmuch-config: fix small leak from 'g_key_file_to_data'

2009-12-02 Thread Dirk-Jan C. Binnema

Signed-off-by: Dirk-Jan C. Binnema 
---
 notmuch-config.c |2 ++
 1 files changed, 2 insertions(+), 0 deletions(-)

diff --git a/notmuch-config.c b/notmuch-config.c
index fc65d6b..95430db 100644
--- a/notmuch-config.c
+++ b/notmuch-config.c
@@ -317,9 +317,11 @@ notmuch_config_save (notmuch_config_t *config)
fprintf (stderr, "Error saving configuration to %s: %s\n",
 config->filename, error->message);
g_error_free (error);
+   g_free (data);
return 1;
 }

+g_free (data);
 return 0;
 }

-- 
1.6.3.3



[notmuch] Subject: [PATCH] update the check whether a dir entry should be ignored.

2009-12-12 Thread Dirk-Jan C. Binnema
Hi all,

This is a draft patch which hopefully improves the check whether a dir entry
should be ignored for that. It adds one feature: if you put a file '.noindex'
in a dir, the whole dir will be ignored for indexing. I find this very useful
for removing e.g. folders with spam messages from the indexing.

There is one maybe controversial change, namely that it ignores all dot-dirs;
this works fine for .notmuch and .nnmaildir (gnus), but maybe there is some
valid use case for having mail in dot-dirs. Maybe one of the IMAP-servers does
this? Not sure. Anyway, I can change that part.

If the overall approach is considered OK, I can make a new patch

Best wishes,
Dirk.


---
 notmuch-new.c |   48 ++--
 1 files changed, 34 insertions(+), 14 deletions(-)

diff --git a/notmuch-new.c b/notmuch-new.c
index 9d20616..28f69bc 100644
--- a/notmuch-new.c
+++ b/notmuch-new.c
@@ -109,6 +109,30 @@ is_maildir (struct dirent **entries, int count)
 return 0;
 }

+
+static int
+ignore_dir_entry (const char* path, struct dirent *entry)
+{
+char noindex[4096]; /* any path will fit */
+
+/* ignore everything starting with a dot; this covers hidden
+* files, as well as special dir (. and ..), but also things like
+* gnus .nnmaildir or .notmuch */
+if (entry->d_name[0] == '.')
+   return 1;
+
+/* we also check if dir contains a file called '.noindex'; if so,
+ * we ignore this directory; alloca would be suitable here, if not
+ * for the portability. */
+snprintf (noindex, sizeof(noindex), "%s/%s/.noindex", path, entry->d_name);
+if (access (noindex, F_OK) == 0)
+   return 1;
+
+return 0; /* don't ignore */
+}
+
+
+
 /* Examine 'path' recursively as follows:
  *
  *   o Ask the filesystem for the mtime of 'path' (path_mtime)
@@ -181,21 +205,17 @@ add_files_recursive (notmuch_database_t *notmuch,
if (path_mtime <= path_dbtime && entry->d_type == DT_REG)
continue;

-   /* Ignore special directories to avoid infinite recursion.
-* Also ignore the .notmuch directory.
-*/
-   /* XXX: Eventually we'll want more sophistication to let the
-* user specify files to be ignored. */
-   if (strcmp (entry->d_name, ".") == 0 ||
-   strcmp (entry->d_name, "..") == 0 ||
-   (entry->d_type == DT_DIR &&
-(strcmp (entry->d_name, "tmp") == 0) &&
-is_maildir (namelist, num_entries)) ||
-   strcmp (entry->d_name, ".notmuch") ==0)
-   {
-   continue;
-   }

+   /* ignore tmp Maildirs, for obvious reasons */
+   if (entry->d_type == DT_DIR &&
+   (strcmp (entry->d_name, "tmp") == 0) &&
+   is_maildir (namelist, num_entries))
+   continue;
+   
+   /* ignore special directories and files */
+   if (ignore_dir_entry (path, entry))
+   continue;
+   
next = talloc_asprintf (notmuch, "%s/%s", path, entry->d_name);

if (stat (next, st)) {
-- 
1.6.3.3



[notmuch] Subject: [PATCH] update the check whether a dir entry should be ignored.

2009-12-12 Thread Dirk-Jan C. Binnema
Hi David,

> "DM" == David Maus  writes:

>> There is one maybe controversial change, namely that it ignores all
>> dot-dirs; this works fine for .notmuch and .nnmaildir (gnus), but maybe
>> there is some valid use case for having mail in dot-dirs. Maybe one of
>> the IMAP-servers does this? Not sure. Anyway, I can change that part.

DM> Yes, ignore dot-dirs completely is not a good idea as the "." is a
DM> common separator for mailbox hierarchies on imap servers. Dovecot uses
DM> it by default, Courier too.

Thanks. I was suspecting something like that. Attached, updated patch that
also updates the counting part.

---
 notmuch-new.c |   75 ++---
 1 files changed, 50 insertions(+), 25 deletions(-)

diff --git a/notmuch-new.c b/notmuch-new.c
index 9d20616..411e084 100644
--- a/notmuch-new.c
+++ b/notmuch-new.c
@@ -109,6 +109,44 @@ is_maildir (struct dirent **entries, int count)
 return 0;
 }

+
+static int
+ignore_dir_entry (const char* path, struct dirent *entry)
+{
+char noindex[4096]; /* any path will fit */
+
+/* ignore everything starting with a dot; this covers hidden
+* files, as well as special dir (. and ..), but also things like
+* gnus .nnmaildir or .notmuch */
+
+/* special handling for dot-dirs */
+if (entry->d_name[0] == '.') {
+
+   /* ignore '.' and '..' */
+   if (entry->d_name[1] == '\0' ||
+   (entry->d_name[1] == '.' && entry->d_name[2] == '\0')) 
+   return 1;
+   
+   if (entry->d_name[1] == 'n')  { /* optimization */
+   /* ignore notmuch, gnus special dirs (or such-named files) */
+   if (strcmp (entry->d_name, ".notmuch") == 0 ||
+   strcmp (entry->d_name, ".nnmaildir") == 0)
+   return 1;
+   }
+}
+
+/* we also check if dir contains a file called '.noindex'; if so,
+ * we ignore this directory; alloca would be suitable here, if not
+ * for the portability. */
+snprintf (noindex, sizeof(noindex), "%s/%s/.noindex", path, entry->d_name);
+if (access (noindex, F_OK) == 0)
+   return 1;
+
+return 0; /* don't ignore */
+}
+
+
+
 /* Examine 'path' recursively as follows:
  *
  *   o Ask the filesystem for the mtime of 'path' (path_mtime)
@@ -181,21 +219,17 @@ add_files_recursive (notmuch_database_t *notmuch,
if (path_mtime <= path_dbtime && entry->d_type == DT_REG)
continue;

-   /* Ignore special directories to avoid infinite recursion.
-* Also ignore the .notmuch directory.
-*/
-   /* XXX: Eventually we'll want more sophistication to let the
-* user specify files to be ignored. */
-   if (strcmp (entry->d_name, ".") == 0 ||
-   strcmp (entry->d_name, "..") == 0 ||
-   (entry->d_type == DT_DIR &&
-(strcmp (entry->d_name, "tmp") == 0) &&
-is_maildir (namelist, num_entries)) ||
-   strcmp (entry->d_name, ".notmuch") ==0)
-   {
-   continue;
-   }

+   /* ignore tmp Maildirs, for obvious reasons */
+   if (entry->d_type == DT_DIR &&
+   (strcmp (entry->d_name, "tmp") == 0) &&
+   is_maildir (namelist, num_entries))
+   continue;
+   
+   /* ignore special directories and files */
+   if (ignore_dir_entry (path, entry))
+   continue;
+   
next = talloc_asprintf (notmuch, "%s/%s", path, entry->d_name);

if (stat (next, st)) {
@@ -394,18 +428,9 @@ count_files (const char *path, int *count)

 entry= namelist[i++];

-   /* Ignore special directories to avoid infinite recursion.
-* Also ignore the .notmuch directory.
-*/
-   /* XXX: Eventually we'll want more sophistication to let the
-* user specify files to be ignored. */
-   if (strcmp (entry->d_name, ".") == 0 ||
-   strcmp (entry->d_name, "..") == 0 ||
-   strcmp (entry->d_name, ".notmuch") == 0)
-   {
+   if (ignore_dir_entry (path, entry))
continue;
-   }
-
+   
if (asprintf (, "%s/%s", path, entry->d_name) == -1) {
next = NULL;
fprintf (stderr, "Error descending from %s to %s: Out of memory\n",
-- 
1.6.3.3



[notmuch] [PATCH] * notmuch-new.c: refactor and improve dirs-to-ignore a bit

2010-01-10 Thread Dirk-Jan C. Binnema

Below, an updated patch for the latest notmuch; this refractors the dir check
a bit, and adds the useful feature of making 'notmuch new' ignore directories
with a '.noindex'-file in them.

Best wishes,
Dirk.


 [PATCH] * notmuch-new.c: refactor and improve dirs-to-ignore a bit

  add a new function ignore_dir_entry, which determines whether a dir entry
  should be ignored (not entered).

  dirs to ignore are: '.' and '..', '.notmuch' and 'nnmaildir' (the later from
  gnus). Also, ignore dirs that contain a file called '.noindex'; thus, we can
  tell not much not to consider e.g. dirs with spam messages.
---
 notmuch-new.c |   76 ++---
 1 files changed, 56 insertions(+), 20 deletions(-)

diff --git a/notmuch-new.c b/notmuch-new.c
index b740ee2..d1526cd 100644
--- a/notmuch-new.c
+++ b/notmuch-new.c
@@ -169,6 +169,47 @@ _entries_resemble_maildir (struct dirent **entries, int 
count)
 return 0;
 }

+
+/* Ignore special directories to avoid infinite recursion.
+ * Also ignore the .notmuch directory and any "tmp" directory
+ * that appears within a maildir.
+ */
+static int
+ignore_dir_entry (const char* path, struct dirent *entry)
+{
+char noindex[4096]; /* any path will fit */
+
+/* ignore everything starting with a dot; this covers hidden
+ * files, as well as special dir (. and ..), but also things like
+ * gnus .nnmaildir or .notmuch */
+
+/* special handling for dot-dirs */
+if (entry->d_name[0] == '.') {
+   
+   /* ignore '.' and '..' */
+   if (entry->d_name[1] == '\0' ||
+   (entry->d_name[1] == '.' && entry->d_name[2] == '\0')) 
+   return 1;
+   
+   if (entry->d_name[1] == 'n')  { /* optimization */
+   /* ignore notmuch, gnus special dirs (or such-named files) */
+   if (strcmp (entry->d_name, ".notmuch") == 0 ||
+   strcmp (entry->d_name, ".nnmaildir") == 0)
+   return 1;
+   }
+}
+
+/* we also check if dir contains a file called '.noindex'; if so,
+ * we ignore this directory; alloca would be suitable here, if not
+ * for the portability. */
+snprintf (noindex, sizeof(noindex), "%s/%s/.noindex", path, entry->d_name);
+if (access (noindex, F_OK) == 0)
+   return 1;
+
+return 0; /* don't ignore */
+}
+
+
 /* Examine 'path' recursively as follows:
  *
  *   o Ask the filesystem for the mtime of 'path' (fs_mtime)
@@ -275,21 +316,18 @@ add_files_recursive (notmuch_database_t *notmuch,

if (entry->d_type != DT_DIR && entry->d_type != DT_LNK)
continue;
+   
+   /* ignore tmp Maildirs, for obvious reasons */
+   if (is_maildir && strcmp (entry->d_name, "tmp") == 0)
+   continue;

/* Ignore special directories to avoid infinite recursion.
-* Also ignore the .notmuch directory and any "tmp" directory
-* that appears within a maildir.
+* Also ignore Maildir tmp-dirs, dirs contain .noindex files, and
+* the .notmuch and .nnmaildir directories.
 */
-   /* XXX: Eventually we'll want more sophistication to let the
-* user specify files to be ignored. */
-   if (strcmp (entry->d_name, ".") == 0 ||
-   strcmp (entry->d_name, "..") == 0 ||
-   (is_maildir && strcmp (entry->d_name, "tmp") == 0) ||
-   strcmp (entry->d_name, ".notmuch") ==0)
-   {
+   if (ignore_dir_entry (path, entry))
continue;
-   }
-
+   
next = talloc_asprintf (notmuch, "%s/%s", path, entry->d_name);
status = add_files_recursive (notmuch, next, state);
if (status && ret == NOTMUCH_STATUS_SUCCESS)
@@ -575,18 +613,16 @@ count_files (const char *path, int *count)

 entry = fs_entries[i++];

+   /* Note: it seems we're missing the '_entries_resemble_maildir' check
+* here */
+   
/* Ignore special directories to avoid infinite recursion.
-* Also ignore the .notmuch directory.
+* Also ignore Maildir tmp-dirs, dirs contain .noindex files, and
+* the .notmuch and .nnmaildir directories.
 */
-   /* XXX: Eventually we'll want more sophistication to let the
-* user specify files to be ignored. */
-   if (strcmp (entry->d_name, ".") == 0 ||
-   strcmp (entry->d_name, "..") == 0 ||
-   strcmp (entry->d_name, ".notmuch") == 0)
-   {
+   if (ignore_dir_entry (path, entry))
        continue;
-   }
-
+   
if (asprintf (, "%s/%s", path, entry->d_name) == -1) {
next = NULL;
fprintf (stderr, "Error descending from %s to %s: Out of memory\n",
-- 
1.6.3.3

-- 
Dirk-Jan C. Binnema  Helsinki, Finland
e:djcb at djcbsoftware.nl   w:www.djcbsoftware.nl
pgp: D09C E664 897D 7D39 5047 A178 E96A C7A1 017D DA3C


[notmuch] indexing mail?

2010-01-15 Thread Dirk-Jan C. Binnema
>>>>> "Olly" == Olly Betts  writes:

Olly> On 2010-01-14, Carl Worth wrote:
>> On Thu, 14 Jan 2010 18:38:54 +0100, Adrian Perez de Castro  wrote:
>>> I am using XFS, which always returns DT_UNKNOWN. Taking into account 
that
>>> there is a good deal of people using filesystems other than the ones you
>>> mention, and that other non-linux filesystems may also return 
DT_UNKNOWN,
>>> in my opinion there should be a fall-back. I will try to post a patch
>>> Anytime Soon=E2=84=A2.
>> 
>> We definitely want the fallback. I can attempt to code it, but I don't
>> have ready access to an afflicted filesystem, so I'd need help testing
>> anyway.
>> 
>> I'd love to see a patch for this bug soon. Be sure to CC me when the
>> patch is sent and that will help me commit it sooner.

Olly> Not a full patch, but I already posted what this code should look like
Olly> to handle both systems without d_type, and those which return 
DT_UNKNOWN:

Olly> http://article.gmane.org/gmane.mail.notmuch.general/1044

I take a slighly different approach in mu:

/* if the file system does not support entry->d_type, we add it ourselves
 * this is slower (extra stat) but at least it works
 */
static gboolean
_set_dtype (const char* path, struct dirent *entry)
{
struct stat statbuf;
char fullpath[4096];

snprintf (fullpath, sizeof(fullpath), "%s%c%s",
  path, G_DIR_SEPARATOR, entry->d_name);

if (stat (fullpath, ) != 0) {
g_warning ("stat failed on %s: %s", fullpath,
   strerror(errno));
return FALSE;
}

/* we only care about dirs, regular files and links */
if (S_ISREG (statbuf.st_mode))
entry->d_type = DT_REG;
else if (S_ISDIR (statbuf.st_mode))
entry->d_type = DT_DIR;
else if (S_ISLNK (statbuf.st_mode))
entry->d_type = DT_LNK;

return TRUE;
}


and then in some other places:

/* handle FSs that don't support entry->d_type */
if (entry->d_type == DT_UNKNOWN) 
_set_dtype (dirname, entry);


Note, that is untested as of yet.

Best wishes,
Dirk.

-- 
Dirk-Jan C. Binnema  Helsinki, Finland
e:djcb at djcbsoftware.nl   w:www.djcbsoftware.nl
pgp: D09C E664 897D 7D39 5047 A178 E96A C7A1 017D DA3C


[notmuch] indexing mail?

2010-01-15 Thread Dirk-Jan C. Binnema
Hi Olly,

>>>>> "Olly" == Olly Betts  writes:

Olly> On 2010-01-15, Dirk-Jan C  Binnema wrote:
>>>>>>> "Olly" == Olly Betts  writes:
Olly> Not a full patch, but I already posted what this code should look like
Olly> to handle both systems without d_type, and those which return 
DT_UNKNOWN:
>> 
Olly> http://article.gmane.org/gmane.mail.notmuch.general/1044

>> static gboolean
>> _set_dtype (const char* path, struct dirent *entry)

Olly> Underscore prefixed identifiers are reserved by ISO C at file-scope;
Olly> using them yourself is undefined behaviour...

Ah, thanks for reminding, I thought it was __ and _C (capital), but you are
right:

, (7.1.3 Reserved identifiers)
|  All identifiers that begin with an underscore and either an uppercase letter
|  or another underscore are always reserved for any use.
| 
| ? All identifiers that begin with an underscore are always reserved for use as
| identifiers with file scope in both the ordinary and tag name spaces.
`

>> /* we only care about dirs, regular files and links */
>> if (S_ISREG (statbuf.st_mode))
entry-> d_type = DT_REG;
>> else if (S_ISDIR (statbuf.st_mode))
entry-> d_type = DT_DIR;
>> else if (S_ISLNK (statbuf.st_mode))
entry-> d_type = DT_LNK;

Olly> This addresses the case where the FS returns DT_UNKNOWN for d_type,
Olly> but doesn't deal with the case of platforms where struct dirent has
Olly> no d_type member - from the Linux readdir man page:

Olly>   The only fields in the dirent structure that are mandated by
Olly>   POSIX.1 are: d_name[], of unspecified size, with at most NAME_MAX
Olly>   characters preceding the terminating null byte; and (as an XSI
Olly>   extension) d_ino.  The other fields are unstandardized, and not
Olly>   present on all systems; see NOTES below for some further details.

Olly> And in NOTES:

Olly>   Other than Linux, the d_type field is available mainly only on BSD
Olly>   systems.

Yes, my patch could me generalized a bit more, just like your patch could not
hardcode the '/'-separator :)

In practice though, what Unices in use today do not support d_type?

Best wishes,
Dirk,

-- 
Dirk-Jan C. Binnema  Helsinki, Finland
e:djcb at djcbsoftware.nl   w:www.djcbsoftware.nl
pgp: D09C E664 897D 7D39 5047 A178 E96A C7A1 017D DA3C


[notmuch] indexing mail?

2010-01-16 Thread Dirk-Jan C. Binnema
Hi Olly,

>>>>> "OB" == Olly Betts  writes:

    OB> On 2010-01-15, Dirk-Jan C  Binnema wrote:
Olly> Other than Linux, the d_type field is available mainly only on BSD
Olly> systems.
>> 
>> Yes, my patch could me generalized a bit more, just like your patch 
could not
>> hardcode the '/'-separator :)

OB> Well, '/' works as a directory separator for all Unix systems and also
OB> for Microsoft Windows at this level.  Is there a system which doesn't
OB> accept '/' in this place which is still relevant?

Note the ':)' This was just point that it's very hard to write software that
does not include *some* degree of platform-preference.

Anyway (going slightly OT here) regarding the '\', I think the one system in
wide use that insists on '\' would be Symbian. Porting Unix stuff is usually
not too hard with the Symbian's compatiblity-libc. (But maybe the
dir-separator is still the least of your concerns).

>> In practice though, what Unices in use today do not support d_type?

OB> Solaris 10 doesn't for starters.  I don't have ready access to the other
OB> non-Linux, non-BSD Unix flavours to check those right now.

Ah, thanks. Solaris 10 is fairly important indeed.  Another one is Cygwin (it
added it fairly recently), Linux before 2.6.4. I've added a configure check
for now. Performance must be quite terrible with a working d_type.

Best wishes,
Dirk.

-- 
Dirk-Jan C. Binnema  Helsinki, Finland
e:djcb at djcbsoftware.nl   w:www.djcbsoftware.nl
pgp: D09C E664 897D 7D39 5047 A178 E96A C7A1 017D DA3C


thread ordering based on references and/or in-reply-to

2011-11-04 Thread Dirk-Jan C. Binnema


On Wed 02 Nov 2011 04:37:05 PM EET, Austin Clements wrote:

 > On Mon, Oct 31, 2011 at 7:07 PM, Florian Friesdorf  
 > wrote:
 > >
 > > Hi,
 > >
 > > I'm looking into taking the References header into account for thread
 > > ordering. So far only In-Reply-To is used. My C/C++ is rusty at best, so
 > > I'd need some help to get this done.



 > I know this came up on IRC, but have you looked at jwz's threading
 > algorithm (http://www.jwz.org/doc/threading.html)?  Carl mentioned
 > that notmuch already implements it (except for subject matching), but
 > notmuch only implements the subset of it necessary to group messages
 > into threads without structure.  Much of the algorithm is devoted to
 > exactly this problem of piecing together the thread structure based on
 > all of the information in both In-Reply-To and References.  The
 > algorithm as described combines the issues of grouping and structuring
 > since it's expecting a giant pile of mail as input, but there's no
 > reason these can't be teased apart.

I've implemented it for mu[1], maybe some of it can be reusable for notmuch;
see mu-threader.[ch] and mu-container.[ch] in

   http://gitorious.org/mu/mu/blobs/master/src/

(starting point is mu_threader_calculate).

I didn't implement subject matching yet, but it does build the hierarchy as
per JWZ and "References:".

Best wishes,
Dirk.

-- 
Dirk-Jan C. Binnema  Helsinki, Finland
e:djcb at djcbsoftware.nl   w:www.djcbsoftware.nl
pgp: D09C E664 897D 7D39 5047 A178 E96A C7A1 017D DA3C


ignore folders patch?

2012-01-23 Thread Dirk-Jan C. Binnema

Hi,

On 2012-01-22T13:32:12 EET, Fabio Zanini wrote:

 > Hi!
 >
 > This is my first message to the list. In 2010 a patch was developed that
 > enabled notmuch new to ignore certain subdirectories of the mail folder:
 >
 > http://notmuchmail.org/pipermail/notmuch/2010/003170.html
 >
 > However, I cannot find any reference to it in the git history, and it
 > does not seem to have been implemented.
 >
 > This feature seems to be relevant, what would you think of reviving it
 > and including it into the codebase? I could try to update the patch
 > myself.

I don't think my patch made it either:
 http://notmuchmail.org/pipermail/notmuch/2010/001103.html

which support ignoring directories by dropping a '.noindex' file in them. I've
been happily using that with the "mu" program for years.

Best wishes,
Dirk.

--
Dirk-Jan C. Binnema  Helsinki, Finland
e:djcb at djcbsoftware.nl   w:www.djcbsoftware.nl
pgp: D09C E664 897D 7D39 5047 A178 E96A C7A1 017D DA3C