notmuch-lib questions and observations

2013-11-19 Thread Tomi Valkeinen
On 2013-11-19 14:12, Jani Nikula wrote:
> On Mon, 18 Nov 2013, Tomi Valkeinen wrote:
>> Hi,
>>
>> I found out about notmuch quite recently, and now I've been tinkering
>> with it, prototyping a GUI client. I have some questions and observations:
> 
> Hello Tomi, glad you've found notmuch too! ;)

Well, I'm still using Thunderbird... ;)

>> 3.
>>
>> How is a client using notmuch supposed to find out there are new
>> messages, and which messages are new?
> 
> 'notmuch new' tags any new messages it finds with tags specified in
> new.tags config option (man notmuch-config), "inbox" and "unread" by
> default. Some people like to change that to "new", and run a post-new
> hook (man notmuch-hooks) that looks at messages matching tag:new,
> retagging as desired.
> 
>> My current thought is to make 'notmuch new' run a script that tags the
>> messages, and make it add a 'new-gui' or such tag to all new messages.
>> The client would then periodically make a query for that tag, and at the
>> same time remove the tag for any returned messages.
> 
> As said, 'notmuch new' does that, and it can also run a script for you.

I think I wasn't very clear on what I meant. I was thinking about the
behavior that graphical mail clients have: they periodically refresh the
emails, showing new ones if there are any, and they'll show some icon or
such which tells the user this email is "new" (which could mean received
in the last periodic refresh).

So with notmuch, the client would somehow need to know that there have
been changes in the database, and then know which emails are new.

For the former, I have no good idea as there doesn't seem to be any way
to find out the db was changed since the last open. For the latter, I
guess the tagging method I mentioned above should work.

If the xapian document id was available, I believe it could also be used
for the latter, as it should always be increasing.
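
The document-id idea can be sketched without a real database. What follows
is only an illustration, and it rests on the assumption stated above that
ids grow as messages are added. The names watermark_t and collect_newer are
made up for this sketch and are not part of notmuch or Xapian.

```c
#include <stddef.h>

/* Hypothetical client-side state: the highest document id seen so far. */
typedef struct {
    unsigned last_seen;
} watermark_t;

/*
 * Copy every id greater than the watermark into `out` (which must have
 * room for `n` entries), then advance the watermark to the largest id
 * encountered. Returns the number of ids copied.
 */
size_t
collect_newer (watermark_t *w, const unsigned *ids, size_t n, unsigned *out)
{
    size_t found = 0;
    unsigned highest = w->last_seen;

    for (size_t i = 0; i < n; i++) {
        if (ids[i] > w->last_seen) {
            out[found++] = ids[i];
            if (ids[i] > highest)
                highest = ids[i];
        }
    }
    w->last_seen = highest;
    return found;
}
```

A client would keep one watermark_t between refreshes and call
collect_newer with the ids returned by each query. If ids can ever be
reused or assigned out of order, this scheme misses messages, so the
tagging method is the safer of the two.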

>> 5.
>>
>> This one is just a vague thought that came to my mind. At the moment
>> notmuch hides Xapian totally behind notmuch's interface, which probably
>> makes things simpler (and gives a nice C API), but also (afaik) prevents
>> using Xapian features that are not at the moment supported in the
>> notmuch API.
>>
>> I wonder how an approach would work where notmuch would be a bit more
>> like a helper library, allowing full use of Xapian's features but making
>> it simple to manage the notmuch database. So, for example, when making a
>> query, you'd create a Xapian query with notmuch, and then use Xapian to
>> run the query.
>>
>> I don't have anything clear in mind, and obviously Xapian being C++
>> might make the whole idea unimplementable.
> 
> I think the database implementation has been abstracted on purpose, so
> we could, at least in theory, switch from xapian to something else. I
> don't know how feasible that would be though. I think Austin has
> experimented with that.

Ah, a valid point, I didn't think of that.

 Tomi





notmuch-lib questions and observations

2013-11-19 Thread Jani Nikula
On Mon, 18 Nov 2013, Tomi Valkeinen wrote:
> Hi,
>
> I found out about notmuch quite recently, and now I've been tinkering
> with it, prototyping a GUI client. I have some questions and observations:

Hello Tomi, glad you've found notmuch too! ;)

Your mail would deserve a more thorough answer than I have time for
right now, but hopefully someone(tm) will fill in.

> 1.
>
> The API seems to be a bit broken. I think many of the functions should
> return notmuch_status_t. I encountered this issue with get_header() and
> get_date(), which I happened to call after the DB had been changed
> twice, leading to Xapian::DatabaseModifiedError.
>
> Neither function handles the exception, causing a crash, which is
> obviously a bug, but even if they did handle the exception they don't
> return any sensible error information. Even worse, consider
> count_messages(), for which a return value of 0 is valid.

We should never leak exceptions or do INTERNAL_ERROR() on things that
are not internal errors. So agreed those are bugs.

> So, as far as I see, many of the funcs should be changed to something like:
>
> notmuch_status_t
> notmuch_query_count_messages (notmuch_query_t *query, unsigned *count);

We're about to release 0.17, and we expect to have to break the API a
bit after the release anyway. IMO it would be a good time to review this
kind of stuff.

The bug fixes you sent for not crashing should probably be considered
for 0.17.

> 2.
>
> This is more about Xapian, I guess. The behavior that a db reader will
> start failing if the db has been changed twice is rather bad. If I'm not
> mistaken, having a rather long read-only query could be blocked (or,
> well, re-tried) forever, if there just happens to be a few db writes
> during the read.
>
> I think a better approach would be to allow only one change to the db if
> there are open db readers. If a second db writer tries to open the db,
> it would get a failure (instead of the readers).
>
> Anyone know if this has been discussed, or if my suggestion is plain silly?
>
> 3.
>
> How is a client using notmuch supposed to find out there are new
> messages, and which messages are new?

'notmuch new' tags any new messages it finds with tags specified in
new.tags config option (man notmuch-config), "inbox" and "unread" by
default. Some people like to change that to "new", and run a post-new
hook (man notmuch-hooks) that looks at messages matching tag:new,
retagging as desired.

> My current thought is to make 'notmuch new' run a script that tags the
> messages, and make it add a 'new-gui' or such tag to all new messages.
> The client would then periodically make a query for that tag, and at the
> same time remove the tag for any returned messages.

As said, 'notmuch new' does that, and it can also run a script for you.

> 4.
>
> Has there been discussion on returning integer IDs instead of strings
> from various functions like notmuch_message_get_message_id() and
> notmuch_tags_get()?
>
> I have two things behind this question:
>
> - Marshaling strings from native code to managed code requires
> allocating memory and copying the string, whereas returning an int is
> more or less a no-op [1][2]. E.g. at the moment if I fetch tag 'inbox'
> for 10k messages, I'm creating a new 'inbox' string 10k times. I'd
> rather fetch an int 10k times, and the 'inbox' string once.
>
> - My prototype fetches the message ids for all the messages returned by
> the query, so that it can later load the message if the user wants to
> read it. Fetching and storing only an int per message versus a long-ish
> string per message would most likely be good for performance with large
> queries.
>
> 5.
>
> This one is just a vague thought that came to my mind. At the moment
> notmuch hides Xapian totally behind notmuch's interface, which probably
> makes things simpler (and gives a nice C API), but also (afaik) prevents
> using Xapian features that are not at the moment supported in the
> notmuch API.
>
> I wonder how an approach would work where notmuch would be a bit more
> like a helper library, allowing full use of Xapian's features but making
> it simple to manage the notmuch database. So, for example, when making a
> query, you'd create a Xapian query with notmuch, and then use Xapian to
> run the query.
>
> I don't have anything clear in mind, and obviously Xapian being C++
> might make the whole idea unimplementable.

I think the database implementation has been abstracted on purpose, so
we could, at least in theory, switch from xapian to something else. I
don't know how feasible that would be though. I think Austin has
experimented with that.
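
To make the abstraction argument concrete, here is a small sketch of how a
C library can hide its storage engine behind a table of function pointers.
It is not how notmuch is implemented. Every name in it (backend_ops_t,
store_t, store_count_tag and the two toy backends) is hypothetical and
exists only to show the shape of the idea.

```c
#include <stddef.h>
#include <string.h>

/* Hypothetical backend interface: the only thing callers ever see. */
typedef struct {
    const char *name;
    unsigned (*count_tag) (const void *state, const char *tag);
} backend_ops_t;

typedef struct {
    const backend_ops_t *ops;
    const void *state;
} store_t;

/* Public entry point; it has no idea which backend sits behind `ops`. */
unsigned
store_count_tag (const store_t *store, const char *tag)
{
    return store->ops->count_tag (store->state, tag);
}

/* Toy backend 1: a flat array of tag strings, scanned linearly. */
typedef struct {
    const char *const *tags;
    size_t n;
} array_state_t;

static unsigned
array_count_tag (const void *state, const char *tag)
{
    const array_state_t *s = state;
    unsigned total = 0;

    for (size_t i = 0; i < s->n; i++)
        if (strcmp (s->tags[i], tag) == 0)
            total++;
    return total;
}

/* Toy backend 2: an empty store that never matches anything. */
static unsigned
empty_count_tag (const void *state, const char *tag)
{
    (void) state;
    (void) tag;
    return 0;
}

const backend_ops_t array_backend = { "array", array_count_tag };
const backend_ops_t empty_backend = { "empty", empty_count_tag };
```

Swapping the ops pointer changes the engine without touching callers. The
price is the one raised in point 5: anything the interface does not expose
is out of reach, however capable the engine underneath may be.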


Cheers,
Jani.


>  Tomi
>
>
> [1] That's on C#. I wouldn't be surprised if it's also the same with
> other higher level languages.
>
> [2] That's not entirely true, as strings can be passed as is, if the
> managed code is given the ownership of the string, and the managed code
> will free the string eventually.
>
> 

notmuch-lib questions and observations

2013-11-19 Thread Jesse Rosenthal
Tomi Valkeinen writes:

> I think I wasn't very clear on what I meant. I was thinking about the
> behavior that graphical mail clients have: they periodically refresh the
> emails, showing new ones if there are any, and they'll show some icon or
> such which tells the user this email is "new" (which could mean received
> in the last periodic refresh).

I do something similar to what you were describing. I put two tags,
"fresh" and "new" on mails that have just come in. "fresh" is for
internal use -- it allows me to run scripts on certain mails that
haven't been checked before, and it is taken off of everything before I
see it. "new" is left on, and means that it just came in with the last
poll. This is all done as a post-new hook. Then, as a pre-new hook, I
remove all the "new" tags. So when I poll again, I only see the ones
that came in with the newest poll.

If I want to see what I've received since the last poll, I just run a
search with "tag:new AND tag:inbox".

Now, this is done with the hooks that the command-line client uses, so
you'd have to implement it yourself for your client, but that shouldn't
be too hard.
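
For a client that has to implement this cycle itself, the bookkeeping
might look roughly like the sketch below. It is a loose in-memory model of
the workflow described above, not notmuch code. The mail_t type, the
arrive() stand-in for the import step, and the bit-flag tags are all
assumptions made for illustration.

```c
#include <stddef.h>

enum {
    TAG_INBOX = 1 << 0,
    TAG_FRESH = 1 << 1,   /* internal: not yet processed by scripts */
    TAG_NEW   = 1 << 2    /* visible: arrived in the latest poll */
};

typedef struct {
    unsigned tags;
} mail_t;

/* Before polling: forget which mails counted as "new" last time. */
void
pre_new (mail_t *mails, size_t n)
{
    for (size_t i = 0; i < n; i++)
        mails[i].tags &= ~(unsigned) TAG_NEW;
}

/* Stand-in for the import step: mark one mail as just arrived. */
void
arrive (mail_t *mail)
{
    mail->tags |= TAG_INBOX | TAG_FRESH | TAG_NEW;
}

/*
 * After polling: handle every fresh mail exactly once, then drop the
 * "fresh" tag so it is never shown. Returns the number handled.
 */
size_t
post_new (mail_t *mails, size_t n)
{
    size_t handled = 0;

    for (size_t i = 0; i < n; i++) {
        if (mails[i].tags & TAG_FRESH) {
            /* per-mail scripts would run here */
            mails[i].tags &= ~(unsigned) TAG_FRESH;
            handled++;
        }
    }
    return handled;
}
```

Each refresh runs pre_new, the import, then post_new. Whatever still
carries TAG_NEW afterwards arrived in the most recent poll.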




___
notmuch mailing list
notmuch@notmuchmail.org
http://notmuchmail.org/mailman/listinfo/notmuch




notmuch-lib questions and observations

2013-11-18 Thread Tomi Valkeinen
Hi,

I found out about notmuch quite recently, and now I've been tinkering
with it, prototyping a GUI client. I have some questions and observations:

1.

The API seems to be a bit broken. I think many of the functions should
return notmuch_status_t. I encountered this issue with get_header() and
get_date(), which I happened to call after the DB had been changed
twice, leading to Xapian::DatabaseModifiedError.

Neither function handles the exception, causing a crash, which is
obviously a bug, but even if they did handle the exception they don't
return any sensible error information. Even worse, consider
count_messages(), for which a return value of 0 is valid.

So, as far as I see, many of the funcs should be changed to something like:

notmuch_status_t
notmuch_query_count_messages (notmuch_query_t *query, unsigned *count);
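
The following self-contained sketch shows why this shape helps. The status
travels in the return value and the count in an out-parameter, so a count
of 0 and a failure can no longer be confused. The types and the sketch_
names are invented stand-ins, not the real notmuch declarations, and the
backend_failed flag merely simulates a database exception.

```c
#include <stddef.h>

/* Hypothetical status codes, loosely modelled on notmuch_status_t. */
typedef enum {
    SKETCH_STATUS_SUCCESS = 0,
    SKETCH_STATUS_NULL_POINTER,
    SKETCH_STATUS_DATABASE_ERROR
} sketch_status_t;

/* Stand-in for a query object. */
typedef struct {
    unsigned matching;     /* how many messages match */
    int backend_failed;    /* non-zero simulates a backend exception */
} sketch_query_t;

sketch_status_t
sketch_query_count_messages (const sketch_query_t *query, unsigned *count)
{
    if (query == NULL || count == NULL)
        return SKETCH_STATUS_NULL_POINTER;

    if (query->backend_failed)
        return SKETCH_STATUS_DATABASE_ERROR;   /* *count is left untouched */

    *count = query->matching;
    return SKETCH_STATUS_SUCCESS;
}
```

A caller checks the status first and reads the count only on success, so
an empty result and an error take separate paths.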


2.

This is more about Xapian, I guess. The behavior that a db reader will
start failing if the db has been changed twice is rather bad. If I'm not
mistaken, having a rather long read-only query could be blocked (or,
well, re-tried) forever, if there just happens to be a few db writes
during the read.

I think a better approach would be to allow only one change to the db if
there are open db readers. If a second db writer tries to open the db,
it would get a failure (instead of the readers).

Anyone know if this has been discussed, or if my suggestion is plain silly?
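
For reference, the usual client-side answer to a modified-database error is
to reopen and retry. The sketch below puts a bound on the retries so that a
reader cannot spin forever. It runs against a simulated reader rather than
Xapian or notmuch, and sim_reader_t, sim_read, sim_reopen and
read_with_retry are hypothetical names.

```c
typedef enum {
    READ_OK = 0,
    READ_MODIFIED,   /* snapshot invalidated by concurrent writes */
    READ_GAVE_UP     /* retry budget exhausted */
} read_status_t;

/* Simulated reader: fails with READ_MODIFIED `failures_left` times. */
typedef struct {
    unsigned failures_left;
    unsigned reopen_count;
} sim_reader_t;

static read_status_t
sim_read (sim_reader_t *r)
{
    if (r->failures_left > 0) {
        r->failures_left--;
        return READ_MODIFIED;
    }
    return READ_OK;
}

static void
sim_reopen (sim_reader_t *r)
{
    r->reopen_count++;
}

/* Try the read up to `max_attempts` times, reopening after each failure. */
read_status_t
read_with_retry (sim_reader_t *r, unsigned max_attempts)
{
    for (unsigned attempt = 0; attempt < max_attempts; attempt++) {
        read_status_t status = sim_read (r);

        if (status != READ_MODIFIED)
            return status;
        sim_reopen (r);   /* pick up the latest revision, then retry */
    }
    return READ_GAVE_UP;
}
```

The bound turns the starvation described above into an explicit
READ_GAVE_UP result that the caller can report. It does not remove the
underlying problem of writers invalidating long reads.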

3.

How is a client using notmuch supposed to find out there are new
messages, and which messages are new?

My current thought is to make 'notmuch new' run a script that tags the
messages, and make it add a 'new-gui' or such tag to all new messages.
The client would then periodically make a query for that tag, and at the
same time remove the tag for any returned messages.

4.

Has there been discussion on returning integer IDs instead of strings
from various functions like notmuch_message_get_message_id() and
notmuch_tags_get()?

I have two things behind this question:

- Marshaling strings from native code to managed code requires
allocating memory and copying the string, whereas returning an int is
more or less a no-op [1][2]. E.g. at the moment if I fetch tag 'inbox'
for 10k messages, I'm creating a new 'inbox' string 10k times. I'd
rather fetch an int 10k times, and the 'inbox' string once.

- My prototype fetches the message ids for all the messages returned by
the query, so that it can later load the message if the user wants to
read it. Fetching and storing only an int per message versus a long-ish
string per message would most likely be good for performance with large
queries.
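
The "fetch an int 10k times, and the string once" idea is essentially
string interning. Here is a minimal sketch of such a table, assuming a
small fixed capacity and a linear search for simplicity. intern_table_t
and the intern_ functions are illustrative names only, and nothing like
them is claimed to exist in notmuch.

```c
#include <stdlib.h>
#include <string.h>

#define INTERN_MAX 64

/* Hypothetical intern table: each distinct string gets a small int id. */
typedef struct {
    char *names[INTERN_MAX];
    int count;
} intern_table_t;

/* Return the id for `name`, adding it on first sight; -1 if full or OOM. */
int
intern_id (intern_table_t *t, const char *name)
{
    for (int i = 0; i < t->count; i++)
        if (strcmp (t->names[i], name) == 0)
            return i;

    if (t->count == INTERN_MAX)
        return -1;

    size_t len = strlen (name) + 1;
    char *copy = malloc (len);
    if (copy == NULL)
        return -1;
    memcpy (copy, name, len);

    t->names[t->count] = copy;
    return t->count++;
}

/* Map an id back to its string; NULL for an unknown id. */
const char *
intern_name (const intern_table_t *t, int id)
{
    return (id >= 0 && id < t->count) ? t->names[id] : NULL;
}

/* Release every stored copy and reset the table. */
void
intern_free (intern_table_t *t)
{
    for (int i = 0; i < t->count; i++)
        free (t->names[i]);
    t->count = 0;
}
```

With a table like this on the native side, managed code could receive the
id for every message and marshal each distinct string once, through
intern_name. A real implementation would want a hash table instead of a
linear scan.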

5.

This one is just a vague thought that came to my mind. At the moment
notmuch hides Xapian totally behind notmuch's interface, which probably
makes things simpler (and gives a nice C API), but also (afaik) prevents
using Xapian features that are not at the moment supported in the
notmuch API.

I wonder how an approach would work where notmuch would be a bit more
like a helper library, allowing full use of Xapian's features but making
it simple to manage the notmuch database. So, for example, when making a
query, you'd create a Xapian query with notmuch, and then use Xapian to
run the query.

I don't have anything clear in mind, and obviously Xapian being C++
might make the whole idea unimplementable.

 Tomi


[1] That's on C#. I wouldn't be surprised if it's also the same with
other higher level languages.

[2] That's not entirely true, as strings can be passed as is, if the
managed code is given the ownership of the string, and the managed code
will free the string eventually.



