edismax with windows path input?

2011-02-10 Thread Ryan McKinley
I am using the edismax query parser -- it's awesome!  It works well for
standard dismax type queries, and allows explicit fields when
necessary.

I have hit a snag when people enter something that looks like a windows path:
<lst name="params">
 <str name="q">F:\path\to\a\file</str>
</lst>
this gets parsed as:
<str name="rawquerystring">F:\path\to\a\file</str>
<str name="querystring">F:\path\to\a\file</str>
<str name="parsedquery">+()</str>

Putting it in quotes makes the not-quite-right query:
<str name="rawquerystring">"F:\path\to\a\file"</str>
<str name="querystring">"F:\path\to\a\file"</str>
<str name="parsedquery">
+DisjunctionMaxQuery((path:f:pathtoafile^4.0 | name:f (pathtoafile
fpathtoafile)^7.0)~0.01)
</str>
<str name="parsedquery_toString">
+(path_path:f:pathtoafile^4.0 | name:f (pathtoafile fpathtoafile)^7.0)~0.01
</str>

Telling people to escape the query:
q=F\:\\path\\to\\a\\file
is unrealistic, but gives the proper parsed query:

+DisjunctionMaxQuery((path_path:f:/path/to/a/file^4.0 | name:f path
to a (file fpathtoafile)^7.0)~0.01)
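
If the escaping is done by the application rather than by the user, it can be a small helper. The following is only a minimal sketch, similar in spirit to SolrJ's ClientUtils.escapeQueryChars; the function name and the exact set of special characters are assumptions for illustration and may not match every Lucene/Solr version.

```python
# Characters treated as special by the Lucene query syntax. This set is an
# assumption for illustration and may differ between Lucene/Solr versions.
SPECIAL_CHARS = set('\\+-!():^[]"{}~*?|&;/')


def escape_query_chars(text: str) -> str:
    """Backslash-escape every special character and any whitespace."""
    out = []
    for ch in text:
        if ch in SPECIAL_CHARS or ch.isspace():
            out.append("\\")  # prefix the character with one backslash
        out.append(ch)
    return "".join(out)


print(escape_query_chars(r"F:\path\to\a\file"))
# prints F\:\\path\\to\\a\\file
```

Escaping the whole input this way also disables explicit field queries, so it only fits cases where the input is known to be plain text.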

Any ideas on how to support this?  I could try looking for things like
paths in the app, and then modify the query, or maybe look at
extending edismax.  Perhaps when F: does not match a given field, it
could auto escape the rest of the word?
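
The "look for paths in the app" option could be sketched roughly as below: a pre-processing step that escapes a clause only when the text before the first colon is not a known field. This is a hedged illustration, not how edismax behaves; KNOWN_FIELDS, escape and preprocess are hypothetical names, and the field set is simply borrowed from the debug output above.

```python
import re

# Hypothetical schema; a real application would load its own field names.
KNOWN_FIELDS = {"path_path", "name"}

# Assumed set of Lucene special characters (may vary by version).
SPECIAL = re.compile(r'([\\+\-!():^\[\]"{}~*?|&;/])')


def escape(text):
    """Put a backslash in front of every special character."""
    return SPECIAL.sub(r"\\\1", text)


def preprocess(query, known_fields=KNOWN_FIELDS):
    """Escape whitespace-separated clauses whose 'field:' prefix is unknown."""
    clauses = []
    for clause in query.split():
        prefix, sep, _rest = clause.partition(":")
        if sep and prefix not in known_fields:
            clause = escape(clause)  # not a real field: treat as literal text
        clauses.append(clause)
    return " ".join(clauses)


print(preprocess(r"F:\path\to\a\file"))  # prints F\:\\path\\to\\a\\file
print(preprocess("name:report"))         # prints name:report
```

Splitting on whitespace is a simplification: quoted phrases and paths containing spaces would need a real tokenizer.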

thanks
ryan


Re: edismax with windows path input?

2011-02-10 Thread Chris Hostetter

: extending edismax.  Perhaps when F: does not match a given field, it
: could auto escape the rest of the word?

that's actually what yonik initially said it was supposed to do, but when i
tried to add a param to let you control which fields would be supported
using the : syntax i discovered it didn't work but couldn't figure out
why ... details are in the SOLR-1553 comments


-Hoss


Re: edismax with windows path input?

2011-02-10 Thread Ryan McKinley
ah -- that makes sense.

Yonik... looks like you were assigned to it last week -- should I take
a look, or do you already have something in the works?


Re: edismax with windows path input?

2011-02-10 Thread Yonik Seeley
On Thu, Feb 10, 2011 at 2:52 PM, Chris Hostetter
<hossman_luc...@fucit.org> wrote:

> : extending edismax.  Perhaps when F: does not match a given field, it
> : could auto escape the rest of the word?
>
> that's actually what yonik initially said it was supposed to do

Hmmm, not really.
"essentially that FOO:BAR and FOO\:BAR would be equivalent if FOO is
not the name of a real field according to the IndexSchema"

That part is true, but doesn't say anything about escaping.  And for
some unknown reason, this no longer works.

foo_s:foo\-bar
is a valid lucene query (with only a dash between the foo and the
bar), and presumably it should be treated the same in edismax.
Treating it as foo_s:foo\\-bar (a backslash and a dash between foo and
bar) might cause more problems than it's worth?

-Yonik
http://lucidimagination.com


Re: edismax with windows path input?

2011-02-10 Thread Yonik Seeley
On Thu, Feb 10, 2011 at 3:05 PM, Ryan McKinley <ryan...@gmail.com> wrote:
> ah -- that makes sense.
>
> Yonik... looks like you were assigned to it last week -- should I take
> a look, or do you already have something in the works?

I got busy on other things, and don't have anything in the works.
I think edismax should probably just be marked as experimental for 3.1.

-Yonik
http://lucidimagination.com


Re: edismax with windows path input?

2011-02-10 Thread Chris Hostetter

: essentially that FOO:BAR and FOO\:BAR would be equivalent if FOO is
: not the name of a real field according to the IndexSchema
: 
: That part is true, but doesn't say anything about escaping.  And for
: some unknown reason, this no longer works.

that's the only part i was referring to.


-Hoss


Re: edismax with windows path input?

2011-02-10 Thread Ryan McKinley

> foo_s:foo\-bar
> is a valid lucene query (with only a dash between the foo and the
> bar), and presumably it should be treated the same in edismax.
> Treating it as foo_s:foo\\-bar (a backslash and a dash between foo and
> bar) might cause more problems than it's worth?


I don't think we should escape anything that has a valid field name.
If foo_s is a field, then foo_s:foo\-bar should be used as is.

If foo_s is not a field, I would want the whole thing escaped to:
foo_s\:foo\\-bar before getting passed to the rest of the dismax mojo.

Does that make sense?

marking edismax as experimental for 3.1 makes sense!

ryan


Re: edismax with windows path input?

2011-02-10 Thread Yonik Seeley
On Thu, Feb 10, 2011 at 5:51 PM, Ryan McKinley <ryan...@gmail.com> wrote:
>
>> foo_s:foo\-bar
>> is a valid lucene query (with only a dash between the foo and the
>> bar), and presumably it should be treated the same in edismax.
>> Treating it as foo_s:foo\\-bar (a backslash and a dash between foo and
>> bar) might cause more problems than it's worth?
>
>
> I don't think we should escape anything that has a valid field name.
> If foo_s is a field, then foo_s:foo\-bar should be used as is.
>
> If foo_s is not a field, I would want the whole thing escaped to:
> foo_s\:foo\\-bar before getting passed to the rest of the dismax mojo.
>
> Does that make sense?

Maybe in some scenarios, but probably not in others?

The clause "bar\-baz" will be equiv to field:bar-baz
Hence it seems like the clause foo:bar\-baz should be equiv to
field:foo:bar-baz (assuming foo is not a field)
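
The two proposals in this thread differ only in what happens to escapes the user already typed. A rough sketch of the difference, assuming foo and foo_s are not fields; unescape, escape_colon_only and escape_everything are hypothetical helpers written for this comparison, not edismax code:

```python
def unescape(term):
    """Drop each escaping backslash, keeping the character it protects."""
    out, i = [], 0
    while i < len(term):
        if term[i] == "\\" and i + 1 < len(term):
            i += 1  # skip the backslash, keep the next character literally
        out.append(term[i])
        i += 1
    return "".join(out)


def escape_colon_only(clause):
    """Escape just the first colon; escapes already present stay as written."""
    return clause.replace(":", "\\:", 1)


def escape_everything(clause):
    """Escape every backslash already in the clause, then every colon."""
    return clause.replace("\\", "\\\\").replace(":", "\\:")


# Colon-only: the user's \- is still an escaped dash.
print(unescape(escape_colon_only(r"foo:bar\-baz")))    # prints foo:bar-baz

# Escape everything: the user's backslash becomes part of the term.
print(escape_everything(r"foo_s:foo\-bar"))            # prints foo_s\:foo\\-bar
print(unescape(escape_everything(r"foo_s:foo\-bar")))  # prints foo_s:foo\-bar
```

The first form keeps the backslash as an escape, the second turns it into literal text, which is what a Windows path needs but not what foo:bar\-baz expects.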

-Yonik
http://lucidimagination.com