I'm just as confused by QueryParser and character escaping as the next guy :)

Jean-Francois' patches seemed fine to me, although if I remember correctly there were lots of patches all merged together. I'm wary of applying too many things all at once. Maybe I'm wrong about the patches though.

Erik


On Mar 2, 2004, at 2:33 PM, Otis Gospodnetic wrote:


Yes, I'm aware of the patch.  I was looking at it today, and then your
old email below.  The patch assumes that the existing code and even
unit tests have a bug, and had it all along, which sounds amazing, so I
want to double-check with somebody on lucene-dev who knows QueryParser
and escaping issues better than me....Erik? :)

Once we resolve this, I'll apply your patch, if the unit tests and the
code it tests really are buggy.

-Otis


--- Jean-Francois Halleux <[EMAIL PROTECTED]> wrote:
I fixed the escaping bug and others in the patch I submitted for Bug
24665:
"[PATCH] Query parser doesn't handle escaped field names"

I think the fix was clean. I traced it to an image token returned by
JavaCC still containing the escaped char. I included several tests as
well, if I remember correctly.
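
For readers following the thread, here is a minimal sketch of the kind of fix described above: removing the escape character from a token image before it is used as term text. It is not the code from the patch; the class name UnescapeSketch and the helper discardEscapeChar are hypothetical and chosen only for illustration, and the token image is assumed to be a plain String.

```java
public class UnescapeSketch {
    // Hypothetical helper (not taken from the patch): drops each escaping
    // backslash and keeps the character that follows it.
    static String discardEscapeChar(String image) {
        StringBuilder out = new StringBuilder(image.length());
        boolean escaped = false;
        for (int i = 0; i < image.length(); i++) {
            char c = image.charAt(i);
            if (!escaped && c == '\\') {
                // Remember the escape, but do not copy the backslash itself.
                escaped = true;
                continue;
            }
            out.append(c);
            escaped = false;
        }
        // A lone trailing backslash escapes nothing, so it is kept as-is.
        if (escaped) {
            out.append('\\');
        }
        return out.toString();
    }

    public static void main(String[] args) {
        // The Java literal "\\+string" is the query text \+string.
        System.out.println(discardEscapeChar("\\+string"));
        System.out.println(discardEscapeChar("\\-blah"));
    }
}
```

With this behavior, the query text \+string would yield the term +string, which is what a WhitespaceAnalyzer would have indexed for the input +string.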

This patch never got applied, don't know why.


KR,


Jean-Francois Halleux

-----Original Message-----
From: Otis Gospodnetic [mailto:[EMAIL PROTECTED]
Sent: Tuesday, 2 March 2004 16:35
To: Lucene Developers List; [EMAIL PROTECTED]
Subject: Re: Question regarding escaped sequence


I have a feeling that query escaping really is broken in Lucene. Try running the class below like this:

prompt> java Escaper '+string' '\+string'

I get:

$ java Escaper '+string' '\+string'
0: +string
1: \+string
QUERY: \+string
HITS: 0

That should give me 1 hit, shouldn't it?

import org.apache.lucene.queryParser.QueryParser;
import org.apache.lucene.analysis.*;
import org.apache.lucene.search.*;
import org.apache.lucene.index.*;
import org.apache.lucene.store.*;
import org.apache.lucene.document.*;

public class Escaper
{
    public static void main(String[] args) throws Exception
    {
        System.out.println("0: " + args[0]);
        System.out.println("1: " + args[1]);

        Directory dir = new RAMDirectory();
        IndexWriter writer = new IndexWriter(dir, new WhitespaceAnalyzer(), true);
        Document doc = new Document();
        doc.add(Field.Text("text", args[0]));
        writer.addDocument(doc);
        writer.optimize();
        writer.close();

        QueryParser qp = new QueryParser("text", new WhitespaceAnalyzer());
        Query q = qp.parse(args[1]);
        System.out.println("QUERY: " + q.toString("text"));

        IndexSearcher searcher = new IndexSearcher(dir);
        Hits hits = searcher.search(q);
        System.out.println("HITS: " + hits.length());
        searcher.close();
    }
}

Thanks,
Otis


--- Jean-Francois Halleux <[EMAIL PROTECTED]> wrote:
Hello,

in TestQueryParser, method testEscaped(), I see the following:

...
assertQueryEquals("\\+blah", a, "\\+blah");
assertQueryEquals("\\(blah", a, "\\(blah");

assertQueryEquals("\\-blah", a, "\\-blah");
assertQueryEquals("\\!blah", a, "\\!blah");
assertQueryEquals("\\{blah", a, "\\{blah");
assertQueryEquals("\\}blah", a, "\\}blah");
...

Is this really the expected behavior? Shouldn't \\-blah be interpreted
as -blah and \\!blah as !blah?
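
One thing that is easy to lose track of when reading these assertions is that there are two escaping layers: the Java compiler consumes one backslash of each pair in a string literal, so the parser only ever sees a single backslash. The small sketch below illustrates just that point; the class name LiteralLayers and the constant QUERY are made up for the example and do not appear in TestQueryParser.

```java
public class LiteralLayers {
    // In Java source, "\\+blah" denotes the six characters \+blah.
    // The name QUERY is illustrative only.
    static final String QUERY = "\\+blah";

    public static void main(String[] args) {
        // What the query parser would actually receive.
        System.out.println(QUERY);
        // Number of characters after the compiler has processed the literal.
        System.out.println(QUERY.length());
        // The first character is a single backslash.
        System.out.println(QUERY.charAt(0) == '\\');
    }
}
```

So the open question in the assertions is whether that one remaining backslash should survive into the parsed query, not whether two should.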

Thanks,

Jean-Francois Halleux



---------------------------------------------------------------------
To unsubscribe, e-mail: [EMAIL PROTECTED]
For additional commands, e-mail: [EMAIL PROTECTED]



