Strange match to my query

2011-04-19 Thread Jameson Graef Rollins
On Thu, 14 Apr 2011 09:32:30 +0200, Florian Friesdorf wrote:
> On Tue, 01 Mar 2011 15:15:22 -0800, Jameson Rollins wrote:
> > > A simple rebuild when you go to bed can look like:
> > 
> > I think you're missing an important step:
> > 
> > notmuch dump >dump.txt
> > mv $(notmuch config get database.path){,.bak}
> 
> Catching up and confused here: Shouldn't this be:
> 
> mv $(notmuch config get database.path)/.notmuch{,.bak}

Yes, you're right.  Nice correction correction!

jamie.



Re: Strange match to my query

2011-04-14 Thread Florian Friesdorf
On Tue, 01 Mar 2011 15:15:22 -0800, Jameson Rollins jroll...@finestructure.net wrote:
> > A simple rebuild when you go to bed can look like:
> 
> I think you're missing an important step:
> 
> notmuch dump >dump.txt
> mv $(notmuch config get database.path){,.bak}

Catching up and confused here: Shouldn't this be:

mv $(notmuch config get database.path)/.notmuch{,.bak}

Otherwise I would move away all my emails, not just notmuch's database.

> notmuch new
> notmuch restore dump.txt
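
Putting the correction together, here is a minimal sketch of the rebuild with only the index directory moved aside. The helper name `backup_index` is made up for illustration, and the sketch assumes the index lives in `.notmuch` under the path printed by `notmuch config get database.path`, as described above. The notmuch commands are left as comments so the helper can be tried on a scratch directory first.

```shell
#!/bin/sh
# backup_index MAIL_ROOT
# Hypothetical helper: moves MAIL_ROOT/.notmuch to MAIL_ROOT/.notmuch.bak
# and leaves everything else under MAIL_ROOT (the mail itself) in place.
backup_index() {
    mail_root=$1
    if [ ! -d "$mail_root/.notmuch" ]; then
        echo "no .notmuch directory under $mail_root" >&2
        return 1
    fi
    if [ -e "$mail_root/.notmuch.bak" ]; then
        echo "$mail_root/.notmuch.bak already exists" >&2
        return 1
    fi
    mv "$mail_root/.notmuch" "$mail_root/.notmuch.bak"
}

# Full flow, using the commands quoted in this thread:
#   notmuch dump >dump.txt
#   backup_index "$(notmuch config get database.path)"
#   notmuch new
#   notmuch restore dump.txt
```

Refusing to overwrite an existing `.notmuch.bak` is a design choice of this sketch, so a second run cannot silently clobber the only copy of the old index.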

-- 
Florian Friesdorf f...@chaoflow.net
  GPG FPR: 7A13 5EEE 1421 9FC2 108D  BAAF 38F8 99A3 0C45 F083
Jabber/XMPP: f...@chaoflow.net
IRC: chaoflow on freenode,ircnet,blafasel,OFTC


___
notmuch mailing list
notmuch@notmuchmail.org
http://notmuchmail.org/mailman/listinfo/notmuch


Strange match to my query

2011-03-01 Thread Mark Anderson
On Tue, 1 Mar 2011 17:15:22 -0600, Jameson Rollins wrote:
> On Tue, 1 Mar 2011 16:00:51 -0700, Mark Anderson wrote:
>  
> > A simple rebuild when you go to bed can look like:
> 
> I think you're missing an important step:
> 
> notmuch dump >dump.txt
> mv $(notmuch config get database.path){,.bak}
> notmuch new
> notmuch restore dump.txt

True, that would be much better than my proposed flow.

-Mark



Strange match to my query

2011-03-01 Thread Mark Anderson
On Fri, 25 Feb 2011 15:29:05 -0600, Jameson Rollins wrote:
> So I am in fact still seeing this bug, although I am ostensibly using a
> version that includes the patch to fix it (db70f3f0).  Does this fix
> require rebuilding the database?

Yes.

The termlist is constructed when the message is added to the database,
so the database must be reconstructed.

Newer messages will index email addresses so that they can't be matched
by overlapping term indexes.  However, the corpus of your database is
not going to change without manual intervention.

A simple rebuild when you go to bed can look like:

notmuch dump >dump.txt; notmuch new; notmuch restore dump.txt

-Mark



Strange match to my query

2011-03-01 Thread Jameson Rollins
On Tue, 1 Mar 2011 16:00:51 -0700, Mark Anderson wrote:
> On Fri, 25 Feb 2011 15:29:05 -0600, Jameson Rollins wrote:
> > So I am in fact still seeing this bug, although I am ostensibly using a
> > version that includes the patch to fix it (db70f3f0).  Does this fix
> > require rebuilding the database?
> 
> Yes.
>
> The termlist is constructed when the message is added to the database,
> so the database must be reconstructed.
> 
> Newer messages will index email addresses so that they can't be matched
> by overlapping term indexes.  However, the corpus of your database is
> not going to change without manual intervention.

Ok, that's what I thought.  Thanks for the feedback, Mark.

> A simple rebuild when you go to bed can look like:

I think you're missing an important step:

notmuch dump >dump.txt
mv $(notmuch config get database.path){,.bak}
notmuch new
notmuch restore dump.txt

but I get the idea ;)

Thanks again.

jamie.



Strange match to my query

2011-02-25 Thread Mark Anderson
On Fri, 25 Feb 2011 12:19:21 -0600, Jameson Rollins wrote:
> On Tue, 25 Jan 2011 16:29:22 -0700, Mark Anderson wrote:
> > Apparently matching on email addresses doesn't work the way I hoped.
> > 
> > While debugging why my to:x@y.com search was matching far too many
> > entries, I whittled it down to this:
> > 
> > WORD1=hello
> > WORD2=goodbye
> > MSGID=junk$(date +%s)
> > TESTDIR=$(notmuch config get database.path)/.tmp/new
> > TESTMAIL=$TESTDIR/$MSGID:2,
> > 
> > mkdir -p $TESTDIR
> > 
> > echo Testcase for $WORD1@$WORD2, msgid: $MSGID@junk.com
> > 
> > echo "From: nobody@nobody.com
> > To: c@${WORD1}.com, K-R@${WORD2}.com
> > Date: Mon, 24 Jan 2011 23:41:34 -0600
> > Subject: Error
> > Message-ID: <$MSGID@junk.com>
> > 
> > Not empty body.=
> > 
> > " > $TESTMAIL
> > 
> > notmuch new
> > notmuch search --output=files to:$WORD1@$WORD2
> > notmuch search --output=files to:\"$WORD1@$WORD2\"
> > 
> > Why does that match, but this doesn't?
> > 
> > notmuch search --output=files to:\'$WORD1@$WORD2\'
> 
> Hey, guys.  Reopening an old thread here, found while trying to track
> down a similar problem.
> 
> I'm confused why any of these searches should return anything at all.
> "$WORD1@$WORD2" doesn't actually match either of the addresses in the
> test message, especially when quoted.  The expanded addresses should be:
> 
>   c@hello.com
>   K-R@goodbye.com
> 
> Why should
> 
>   hello@goodbye
> 
> match anything?  And in fact it doesn't for me if I recreate the same
> setup.  Am I missing something?

It shouldn't match anything, that's the value of finding this bug.

What happened is the term counter was reset for each email address, so
the term list for emails in "to:" looks something like this:

0 c      K
1 hello  R
2 com    goodbye
3        com

So it matched a hello at 1 and a goodbye at 2.

I don't remember where the discussion on this went, but it was on the
list.

Perhaps you should search for it, it should take notmuch to
find... *duck*

-Mark



Strange match to my query

2011-02-25 Thread Jameson Rollins
On Fri, 25 Feb 2011 13:57:23 -0700, Mark Anderson wrote:
> It shouldn't match anything, that's the value of finding this bug.
> 
> What happened is the term counter was reset for each email address, so
> the term list for emails in "to:" looks something like this:
> 
> 0 c      K
> 1 hello  R
> 2 com    goodbye
> 3        com
> 
> So it matched a hello at 1 and a goodbye at 2.

I see now.  I was confused about which problem you were reporting.

So I am in fact still seeing this bug, although I am ostensibly using a
version that includes the patch to fix it (db70f3f0).  Does this fix
require rebuilding the database?

jamie.



Strange match to my query

2011-02-25 Thread Jameson Rollins
On Tue, 25 Jan 2011 16:29:22 -0700, Mark Anderson wrote:
> Apparently matching on email addresses doesn't work the way I hoped.
> 
> While debugging why my to:x@y.com search was matching far too many
> entries, I whittled it down to this:
> 
> WORD1=hello
> WORD2=goodbye
> MSGID=junk$(date +%s)
> TESTDIR=$(notmuch config get database.path)/.tmp/new
> TESTMAIL=$TESTDIR/$MSGID:2,
> 
> mkdir -p $TESTDIR
> 
> echo Testcase for $WORD1@$WORD2, msgid: $MSGID@junk.com
> 
> echo "From: nobody@nobody.com
> To: c@${WORD1}.com, K-R@${WORD2}.com
> Date: Mon, 24 Jan 2011 23:41:34 -0600
> Subject: Error
> Message-ID: <$MSGID@junk.com>
> 
> Not empty body.=
> 
> " > $TESTMAIL
> 
> notmuch new
> notmuch search --output=files to:$WORD1@$WORD2
> notmuch search --output=files to:\"$WORD1@$WORD2\"
> 
> Why does that match, but this doesn't?
> 
> notmuch search --output=files to:\'$WORD1@$WORD2\'

Hey, guys.  Reopening an old thread here, found while trying to track
down a similar problem.

I'm confused why any of these searches should return anything at all.
"$WORD1@$WORD2" doesn't actually match either of the addresses in the
test message, especially when quoted.  The expanded addresses should be:

  c@hello.com
  K-R@goodbye.com

Why should

  hello@goodbye

match anything?  And in fact it doesn't for me if I recreate the same
setup.  Am I missing something?

jamie.



Strange match to my query

2011-01-26 Thread Carl Worth
On Wed, 26 Jan 2011 12:19:17 +1000, Carl Worth wrote:
> And thanks, Mark, for the bug report and the nice test case. I'll add
> this to the test suite, and fix it. And that will give us yet one more
> reason for all of us to rebuild our databases after the upcoming
> release.

I've added a test case for this now, fixed the bug, and pushed out the
new code.

Thanks again for the bug report.

-Carl



Strange match to my query

2011-01-26 Thread Carl Worth
On Tue, 25 Jan 2011 19:51:14 -0500, Austin Clements wrote:
> Well-constructed test message.  Xapian's query parser is actually doing the
> right thing [1] and this is a bug in the way notmuch indexes address list
> headers.  For each address, _notmuch_message_gen_terms resets the term
> generator's term position, so your To header indexes with positions as
>   c:1 hello:2 com:3 K:1 R:2 goodbye:3 com:4

Thanks, Austin!

I was actually giving a demo of notmuch to someone yesterday who was
really interested in the details of how Xapian actually stores things.

I dug around a bit with delve and we were both really surprised by the
position results we were seeing. Neither of us could make any sense of
them at all.

And thanks, Mark, for the bug report and the nice test case. I'll add
this to the test suite, and fix it. And that will give us yet one more
reason for all of us to rebuild our databases after the upcoming
release.

-Carl

-- 
carl.d.worth at intel.com



Strange match to my query

2011-01-26 Thread Mark Anderson
On Tue, 25 Jan 2011 23:59:50 -0600, Carl Worth wrote:
> On Wed, 26 Jan 2011 12:19:17 +1000, Carl Worth wrote:
> > And thanks, Mark, for the bug report and the nice test case. I'll add
> > this to the test suite, and fix it. And that will give us yet one more
> > reason for all of us to rebuild our databases after the upcoming
> > release.
> 
> I've added a test case for this now, fixed the bug, and pushed out the
> new code.
> 
> Thanks again for the bug report.

That's great. Apparently submitting the test case was the best thing I
could do, because I didn't realize that I needed a two-part name to align
the term lists, although I did start from one.  And now at least I know
that I can't construct the correct query without an updated notmuch.

It was very confusing trying to bend my head around the issue and tell
myself that I just didn't understand how notmuch worked at all on
searching through email addresses.

Glad to see such a quick response to my bug report.

> -Carl




Strange match to my query

2011-01-25 Thread Austin Clements
Well-constructed test message.  Xapian's query parser is actually doing the
right thing [1] and this is a bug in the way notmuch indexes address list
headers.  For each address, _notmuch_message_gen_terms resets the term
generator's term position, so your To header indexes with positions as
  c:1 hello:2 com:3 K:1 R:2 goodbye:3 com:4
Thus, the phrase query "hello goodbye" matches hello in position 2 and goodbye
in position 3.  Probably the right thing for notmuch to do is to jump up the
term generator position between each address so phrase queries don't cross
them or span them.

[1] Your to:\'$WORD1@$WORD2\' query didn't work because Xapian doesn't
accept a single quote after a prefix.
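
The position reset and the proposed gap can be tried out without notmuch or Xapian. The following is only a toy model in shell and awk, not notmuch's indexing code: the helper names `index_positions` and `phrase_hits` are invented here, terms are split on any non-alphanumeric character, and the gap of 100 is an arbitrary assumption.

```shell
#!/bin/sh
# index_positions MODE ADDRESS...
# Prints "term position" pairs, one per line.
# MODE "reset" restarts the position counter for every address (the bug
# described above); any other MODE keeps counting and skips ahead by an
# arbitrary gap between addresses.
index_positions() {
    mode=$1
    shift
    printf '%s\n' "$@" | awk -v mode="$mode" '
        {
            if (mode == "reset") pos = 0
            else if (NR > 1) pos += 100    # assumed gap size
            n = split($0, terms, /[^A-Za-z0-9]+/)
            for (i = 1; i <= n; i++)
                if (terms[i] != "") print terms[i], ++pos
        }'
}

# phrase_hits FIRST SECOND
# Reads "term position" pairs on stdin and succeeds if SECOND occurs at
# the position immediately after some occurrence of FIRST.
phrase_hits() {
    awk -v a="$1" -v b="$2" '
        $1 == a { first[$2] = 1 }
        $1 == b { second[$2] = 1 }
        END {
            for (p in first) if ((p + 1) in second) exit 0
            exit 1
        }'
}

# The two addresses from the test message:
if index_positions reset 'c@hello.com' 'K-R@goodbye.com' | phrase_hits hello goodbye
then echo "reset: phrase matches across addresses"
fi
if ! index_positions gap 'c@hello.com' 'K-R@goodbye.com' | phrase_hits hello goodbye
then echo "gap: no match across addresses"
fi
```

With the reset, hello lands at position 2 and goodbye at position 3, so the phrase test succeeds; with the gap, goodbye is pushed far past hello and the test fails.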

On Tue, Jan 25, 2011 at 6:29 PM, Mark Anderson wrote:

> Hi guys, What's up? ("Notmuch")
>
> Apparently matching on email addresses doesn't work the way I hoped.
>
> While debugging why my to:x@y.com search was matching far too many
> entries, I whittled it down to this:
>
> WORD1=hello
> WORD2=goodbye
> MSGID=junk$(date +%s)
> TESTDIR=$(notmuch config get database.path)/.tmp/new
> TESTMAIL=$TESTDIR/$MSGID:2,
>
> mkdir -p $TESTDIR
>
> echo Testcase for $WORD1@$WORD2, msgid: $MSGID@junk.com
>
> echo "From: nobody@nobody.com
> To: c@${WORD1}.com, K-R@${WORD2}.com
> Date: Mon, 24 Jan 2011 23:41:34 -0600
> Subject: Error
> Message-ID: <$MSGID@junk.com>
>
> Not empty body.=
>
> " > $TESTMAIL
>
> notmuch new
> notmuch search --output=files to:$WORD1@$WORD2
> notmuch search --output=files to:\"$WORD1@$WORD2\"
>
> Why does that match, but this doesn't?
>
> notmuch search --output=files to:\'$WORD1@$WORD2\'
>
> Apparently double quotes are the only quotes Xapian's parser accepts?
>
> I guess this is a strong vote for the quick integration of the custom
> parser with optimization passes that turn emails into phrases that can't
> match across multiple emails.
>
> This was just an egregious example of notmuch giving me notmuch of what
> I wanted, or actually, far too much of what I didn't want.
>
> Thanks,
> -Mark
>



Strange match to my query

2011-01-25 Thread Mark Anderson
Hi guys, What's up? ("Notmuch")

Apparently matching on email addresses doesn't work the way I hoped.

While debugging why my to:x@y.com search was matching far too many
entries, I whittled it down to this:

WORD1=hello
WORD2=goodbye
MSGID=junk$(date +%s)
TESTDIR=$(notmuch config get database.path)/.tmp/new
TESTMAIL=$TESTDIR/$MSGID:2,

mkdir -p $TESTDIR

echo Testcase for $WORD1@$WORD2, msgid: $MSGID@junk.com

echo "From: nobody@nobody.com
To: c@${WORD1}.com, K-R@${WORD2}.com
Date: Mon, 24 Jan 2011 23:41:34 -0600
Subject: Error
Message-ID: <$MSGID@junk.com>

Not empty body.=

" > $TESTMAIL

notmuch new
notmuch search --output=files to:$WORD1@$WORD2
notmuch search --output=files to:\"$WORD1@$WORD2\"

Why does that match, but this doesn't?

notmuch search --output=files to:\'$WORD1@$WORD2\'

Apparently double quotes are the only quotes Xapian's parser accepts?
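
To see exactly which string each of the three searches hands to notmuch after the shell has processed the backslashes, the arguments can be printed instead of searched. This is a small sketch; `show_arg` is a made-up helper that stands in for the notmuch command.

```shell
#!/bin/sh
WORD1=hello
WORD2=goodbye

# show_arg is a hypothetical helper: it prints its first argument
# exactly as a command such as notmuch would receive it.
show_arg() {
    printf '%s\n' "$1"
}

show_arg to:$WORD1@$WORD2       # prints: to:hello@goodbye
show_arg to:\"$WORD1@$WORD2\"   # prints: to:"hello@goodbye"
show_arg to:\'$WORD1@$WORD2\'   # prints: to:'hello@goodbye'
```

So the quote characters themselves reach the query parser in the second and third forms; how they are interpreted from there is up to Xapian, not the shell.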

I guess this is a strong vote for the quick integration of the custom
parser with optimization passes that turn emails into phrases that can't
match across multiple emails.

This was just an egregious example of notmuch giving me notmuch of what
I wanted, or actually, far too much of what I didn't want.

Thanks,
-Mark


