[notmuch] strange behavior of indexing of and searching for strings containing '[]'

2010-02-08 Thread Jed Brown
On Mon, 08 Feb 2010 12:24:06 -0500, Jameson Rollins wrote:
> I don't think that this is exactly correct.  The quoting is interpreted
> by the shell in order to construct a single string that is then passed
> as an argument to the program.

The command line distinguishes, but the constructed query does not.
Look at query-string.c, the arguments are just concatenated.
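
A minimal sketch of why the distinction disappears once the arguments
are joined. The join_args function is a hypothetical stand-in for the
concatenation step, not the actual code in query-string.c; it assumes
the arguments are glued together with single spaces.

```sh
#!/bin/sh
# Hypothetical stand-in for the joining step: "$*" glues all
# arguments together with single spaces (default IFS).
join_args() {
    printf '%s\n' "$*"
}

join_args subject:'foo bar' baz   # two arguments from the shell
join_args subject:foo bar baz     # three arguments from the shell
# Both calls print: subject:foo bar baz
```

After joining, nothing in the resulting string records where one
argument ended and the next began.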

Jed


[notmuch] strange behavior of indexing of and searching for strings containing '[]'

2010-02-08 Thread Jameson Rollins
On Mon, 08 Feb 2010 18:35:44 +0100, Jed Brown wrote:
> On Mon, 08 Feb 2010 12:24:06 -0500, Jameson Rollins wrote:
> > I don't think that this is exactly correct.  The quoting is interpreted
> > by the shell in order to construct a single string that is then passed
> > as an argument to the program.
> 
> The command line distinguishes, but the constructed query does not.
> Look at query-string.c, the arguments are just concatenated.

Hi, Jed.  Yes, this is clear from the behavior, but I'm claiming it's a
bug that should be fixed.  It produces unexpected behavior with
confusing results.

jamie.



[notmuch] strange behavior of indexing of and searching for strings containing '[]'

2010-02-08 Thread Jameson Rollins
On Fri, 5 Feb 2010 23:48:03 + (UTC), Olly Betts  wrote:
> On 2010-02-05, Jameson Rollins wrote:
> > Hey, folks.  I've been noticing some strange behavior of notmuch search
> > results for strings containing '[]'.  Here are some searches for some
> > exact strings in messages subjects:
> 
> The '[]' is a red herring.  Xapian's TermGenerator and QueryParser classes
> treat these two characters pretty much as if they were spaces.

Ah.  Thanks for the response, Olly.  This clears things up a lot.

> > servo:~ 0$ notmuch search subject:'emacs paned UI'
> 
> Note that the '' is quoting for the shell only here.  So Xapian sees:
> 
> subject:emacs paned UI
> 
> Assuming you are defaulting to an AND search, that's `emacs in the subject'
> AND `paned anywhere in the indexed text' AND `UI anywhere in the indexed 
> text'.

I don't think that this is exactly correct.  The quoting is interpreted
by the shell in order to construct a single string that is then passed
as an argument to the program.  Notmuch should then be seeing the single
string argument as the search parameter, and not breaking it up further.

Here's an example of what I mean:

servo:~/tmp/cdtemp.AYroUf 0$ cat parse 
#!/bin/bash
for arg; do echo "$arg"; done
servo:~/tmp/cdtemp.AYroUf 0$ ./parse subject:foo bar baz
subject:foo
bar
baz
servo:~/tmp/cdtemp.AYroUf 0$ ./parse subject:'foo bar' baz
subject:foo bar
baz
servo:~/tmp/cdtemp.AYroUf 0$ ./parse subject:"foo bar" baz
subject:foo bar
baz
servo:~/tmp/cdtemp.AYroUf 0$ 

As you can see in the last command, the argument subject:"foo bar" is
passed as a single string by the shell, and should therefore be
interpreted as such by notmuch.
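
One way the requested behavior could be approximated is sketched below.
This is only an illustration under stated assumptions, not notmuch's
code or an agreed fix: requote_args is a hypothetical helper that wraps
any argument containing a space in double quotes before the arguments
are joined, so that the joined string still marks it as one phrase.

```sh
#!/bin/sh
# Hypothetical helper: re-quote arguments that contain spaces so the
# joined query string keeps them together as a phrase.
requote_args() {
    out=
    for arg in "$@"; do
        case $arg in
            # prefix:some words  ->  prefix:"some words"
            *:*' '*) arg="${arg%%:*}:\"${arg#*:}\"" ;;
            # some words         ->  "some words"
            *' '*)   arg="\"$arg\"" ;;
        esac
        out="${out:+$out }$arg"
    done
    printf '%s\n' "$out"
}

requote_args subject:'foo bar' baz   # prints: subject:"foo bar" baz
```

The sketch does not handle arguments that already contain double
quotes, or a colon that appears somewhere other than after a prefix.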

jamie.



[notmuch] strange behavior of indexing of and searching for strings containing '[]'

2010-02-05 Thread Olly Betts
On 2010-02-05, Jameson Rollins wrote:
> Hey, folks.  I've been noticing some strange behavior of notmuch search
> results for strings containing '[]'.  Here are some searches for some
> exact strings in messages subjects:

The '[]' is a red herring.  Xapian's TermGenerator and QueryParser classes
treat these two characters pretty much as if they were spaces.

> servo:~ 0$ notmuch search subject:'emacs paned UI'

Note that the '' is quoting for the shell only here.  So Xapian sees:

subject:emacs paned UI

Assuming you are defaulting to an AND search, that's `emacs in the subject'
AND `paned anywhere in the indexed text' AND `UI anywhere in the indexed text'.
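
The reading above can be illustrated with a small sketch. It is not
Xapian's QueryParser, only a hypothetical explain_query function that
splits the string on spaces and shows that the subject: prefix attaches
to the first word alone.

```sh
#!/bin/sh
# Hypothetical illustration: a prefix binds only to the word it is
# attached to; the remaining words are unprefixed.
explain_query() {
    printf '%s\n' "$1" | tr -s ' ' '\n' | while read -r word; do
        [ -n "$word" ] || continue
        case $word in
            subject:*) echo "${word#subject:} -> subject only" ;;
            *)         echo "$word -> any indexed text" ;;
        esac
    done
}

explain_query 'subject:emacs paned UI'
# emacs -> subject only
# paned -> any indexed text
# UI -> any indexed text
```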

To specify a quoted phrase you want "" anyway (not ''), so the command
matching what I think you intended to search for is:

notmuch search 'subject:"emacs paned UI"'
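
To see what each quoting style actually hands to the program, a small
sketch along these lines may help. show_args is a hypothetical helper
that prints every argument it receives in square brackets; it stands in
for any program started from the shell.

```sh
#!/bin/sh
# Hypothetical helper: print each received argument in brackets.
show_args() {
    for arg in "$@"; do
        printf '[%s]\n' "$arg"
    done
}

show_args subject:'emacs paned UI'      # prints [subject:emacs paned UI]
show_args 'subject:"emacs paned UI"'    # prints [subject:"emacs paned UI"]
```

Only in the second form do the double quotes survive the shell and
reach the program as part of the argument.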

> servo:~ 0$ notmuch search subject:'[notmuch] emacs paned UI'

notmuch search 'subject:"[notmuch] emacs paned UI"'

Which should return identical results to:

notmuch search 'subject:"notmuch emacs paned UI"'
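
A rough sketch of why the two phrases are equivalent, assuming only
what is stated above: that '[' and ']' are treated much like spaces.
rough_terms is a hypothetical approximation, not Xapian's tokenizer; it
also lowercases the words, which is an assumption made here so that
"Emacs" and "emacs" compare equal.

```sh
#!/bin/sh
# Rough stand-in for term splitting (NOT Xapian's real tokenizer):
# turn '[' and ']' into spaces, lowercase, print one word per line.
rough_terms() {
    printf '%s\n' "$1" \
        | sed 's/[][]/ /g' \
        | tr '[:upper:]' '[:lower:]' \
        | tr -s ' ' '\n' \
        | sed '/^$/d'
}

rough_terms '[notmuch] emacs paned UI'
rough_terms 'notmuch emacs paned UI'
# Each call prints the same four lines: notmuch, emacs, paned, ui
```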

> thread:5f2cb4b108773a39161b33c86e54f7fd  4 mins. ago [1/1] Jameson Rollins; [notmuch] loss of duplicate messages (inbox)
> servo:~ 0$
>
> Not only did it not turn up the message that *does* match that exact
> string in its subject line, it actually turns up a completely different
> message that doesn't match the search term at all!

It matches the notmuch in the subject, and presumably emacs, paned, and UI
in the body.

> [snip the rest - the same explanations apply]

Cheers,
Olly



[notmuch] strange behavior of indexing of and searching for strings containing '[]'

2010-02-05 Thread Oliver Charles
On Fri, Feb 5, 2010 at 4:44 PM, Jameson Rollins wrote:
> Does anyone have any idea what's going on here?  I think I saw mention
> of this issue on IRC somewhere, but I thought I should bring it up
> explicitly here.  This is definitely some buggy behavior.

Afaik, stuff in between [] is not indexed, but that doesn't quite
explain the other weird results.

-- 
Oliver Charles / aCiD2


[notmuch] strange behavior of indexing of and searching for strings containing '[]'

2010-02-05 Thread Jameson Rollins
Hey, folks.  I've been noticing some strange behavior of notmuch search
results for strings containing '[]'.  Here are some searches for some
exact strings in messages subjects:

servo:~ 0$ notmuch search subject:'emacs paned UI'
thread:533da424197bb6ba61a42b667d5d8d8f   Wed. 14:12 [2/2] Tad Fisher, Jameson Rollins; [notmuch] Emacs paned UI ()
servo:~ 0$ 

So that's fine and expected.  This however is not:

servo:~ 0$ notmuch search subject:'[notmuch] emacs paned UI'
thread:5f2cb4b108773a39161b33c86e54f7fd  4 mins. ago [1/1] Jameson Rollins; [notmuch] loss of duplicate messages (inbox)
servo:~ 0$ 

Not only did it not turn up the message that *does* match that exact
string in its subject line, it actually turns up a completely different
message that doesn't match the search term at all!

This search actually turns up both:

servo:~ 0$ notmuch search subject:'notmuch emacs paned UI'
thread:5f2cb4b108773a39161b33c86e54f7fd  5 mins. ago [1/1] Jameson Rollins; [notmuch] loss of duplicate messages (inbox)
thread:533da424197bb6ba61a42b667d5d8d8f   Wed. 14:12 [2/2] Tad Fisher, Jameson Rollins; [notmuch] Emacs paned UI ()
servo:~ 0$ 

Which is again strange, because the second message does not at all match
that search term.

Does anyone have any idea what's going on here?  I think I saw mention
of this issue on IRC somewhere, but I thought I should bring it up
explicitly here.  This is definitely some buggy behavior.

jamie.