I fixed the escaping bug and others in the patch I submitted for Bug 24665:
"[PATCH] Query parser doesn't handle escaped field names"

I think the fix was clean. I traced the bug to the token image returned by
JavaCC, which still contained the escape sequence. If I remember correctly,
I also included several tests.
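
To illustrate the idea only (this is not the code from the patch, and the
class and method names are made up for the example), a minimal sketch of
stripping the escaping backslashes from a token image before it is turned
into a term could look like this:

```java
// Hypothetical helper for illustration; NOT the code from the Bug 24665 patch.
final class EscapeSketch {

    /** Drops each escaping backslash and keeps the character that follows it. */
    static String unescape(String image) {
        StringBuilder out = new StringBuilder(image.length());
        boolean escaped = false;
        for (int i = 0; i < image.length(); i++) {
            char c = image.charAt(i);
            if (!escaped && c == '\\') {
                // Skip the backslash and remember that the next char is literal.
                escaped = true;
            } else {
                out.append(c);
                escaped = false;
            }
        }
        if (escaped) {
            // A trailing lone backslash has nothing to escape; keep it as-is.
            out.append('\\');
        }
        return out.toString();
    }

    public static void main(String[] args) {
        // Token images as the parser would see them for \+blah and \!blah.
        System.out.println(unescape("\\+blah"));
        System.out.println(unescape("\\!blah"));
    }
}
```

With something along these lines applied to the token image, a query such
as \+string would be searched as the term +string rather than \+string.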

The patch was never applied; I don't know why.


KR,

Jean-Francois Halleux

-----Original Message-----
From: Otis Gospodnetic [mailto:[EMAIL PROTECTED]
Sent: Tuesday, March 2, 2004 16:35
To: Lucene Developers List; [EMAIL PROTECTED]
Subject: Re: Question regarding escaped sequence


I have a feeling that query escaping really is broken in Lucene.
Try running the class below like this:

prompt> java Escaper '+string' '\+string'

I get:

$ java Escaper '+string' '\+string'
0: +string
1: \+string
QUERY: \+string
HITS: 0

That should give me 1 hit, shouldn't it?

import org.apache.lucene.queryParser.QueryParser;
import org.apache.lucene.analysis.*;
import org.apache.lucene.search.*;
import org.apache.lucene.index.*;
import org.apache.lucene.store.*;
import org.apache.lucene.document.*;

public class Escaper
{
    public static void main(String[] args) throws Exception
    {
        System.out.println("0: " + args[0]);
        System.out.println("1: " + args[1]);

        Directory dir = new RAMDirectory();
        IndexWriter writer = new IndexWriter(dir, new WhitespaceAnalyzer(), true);
        Document doc = new Document();
        doc.add(Field.Text("text", args[0]));
        writer.addDocument(doc);
        writer.optimize();
        writer.close();

        QueryParser qp = new QueryParser("text", new WhitespaceAnalyzer());
        Query q = qp.parse(args[1]);
        System.out.println("QUERY: " + q.toString("text"));

        IndexSearcher searcher = new IndexSearcher(dir);
        Hits hits = searcher.search(q);
        System.out.println("HITS: " + hits.length());
        searcher.close();
    }
}

Thanks,
Otis


--- Jean-Francois Halleux <[EMAIL PROTECTED]> wrote:
> Hello,
>
>       in TestQueryParser, method testEscaped(), I see the following:
>
> ...
> assertQueryEquals("\\+blah", a, "\\+blah");
> assertQueryEquals("\\(blah", a, "\\(blah");
>
> assertQueryEquals("\\-blah", a, "\\-blah");
> assertQueryEquals("\\!blah", a, "\\!blah");
> assertQueryEquals("\\{blah", a, "\\{blah");
> assertQueryEquals("\\}blah", a, "\\}blah");
> ...
>
> Is this really the expected behavior? Shouldn't \\-blah be interpreted
> as -blah and \\!blah as !blah?
>
> Thanks,
>
> Jean-Francois Halleux
>
>
> ---------------------------------------------------------------------
> To unsubscribe, e-mail: [EMAIL PROTECTED]
> For additional commands, e-mail: [EMAIL PROTECTED]
>



