I question whether your example actually searches what you think. What I suggest is that you get a copy of Luke and look at what your queries actually produce. That'll give you a much better idea of what happens under the covers.
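If Luke is not at hand, here is a rough stand-in for the "small test program" idea. It is only a sketch in plain Java, not Lucene's own tokenizer: it assumes that whitespace-only analysis breaks text wherever Character.isWhitespace is true and leaves everything else alone. The class and method names are made up for illustration.

```java
import java.util.ArrayList;
import java.util.List;

final class WhitespaceSplitSketch {
    // Splits text on whitespace only; punctuation stays inside the tokens.
    // Hypothetical helper, not a Lucene class.
    static List<String> tokens(String text) {
        List<String> out = new ArrayList<>();
        StringBuilder current = new StringBuilder();
        for (int i = 0; i < text.length(); i++) {
            char c = text.charAt(i);
            if (Character.isWhitespace(c)) {
                // A whitespace character closes the token in progress, if any.
                if (current.length() > 0) {
                    out.add(current.toString());
                    current.setLength(0);
                }
            } else {
                current.append(c);
            }
        }
        if (current.length() > 0) {
            out.add(current.toString());
        }
        return out;
    }

    public static void main(String[] args) {
        System.out.println(tokens("re: fyi.dat"));
        System.out.println(tokens("jason dartling (e-mail)"));
    }
}
```

Under that assumption "re: fyi.dat" yields the two tokens re: and fyi.dat, so the ':' and the parentheses would end up in the index as part of the terms rather than being chopped off. Luke remains the way to confirm what a real index holds.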
Also, query.toString() is your friend. Try printing out your queries (again, I recommend a small test program) to get a better feel for what each analyzer does to your queries.

Of course you have to escape the ':' character when it does not refer to a field. How else could you imagine that the query parser can distinguish between a field designation and a field value? Consider the query:

f:stuff g:morestuff

The parser has to understand that the ':' separates the field 'f' from the value 'stuff' and that 'g' is another field. So if you want to parse

f:stuff g:somemore:stuff

the parser gets all confused by whether the ':' between somemore and stuff is a field delimiter or a value, unless it's escaped.

Best
Erick

On Tue, Sep 16, 2008 at 9:49 AM, miztaken <[EMAIL PROTECTED]> wrote:

> Hi there,
> I will check that out but what do you suggest for searching??
> without escaping works for query string "fw: fyi.dat" but i have to escape
> : char for query string "fw:" so i am having two cases?
>
> Please help me
>
> Erick Erickson wrote:
> >
> > You can easily answer the questions about what WhitespaceTokenizer
> > produces by getting a copy of Luke and looking at your index. Or writing
> > a really simple test program that prints out tokens.
> >
> > At the bottom of this page is a list of special characters for escaping:
> > http://lucene.apache.org/java/docs/queryparsersyntax.html
> >
> > Best
> > Erick
> >
> > On Tue, Sep 16, 2008 at 9:05 AM, miztaken <[EMAIL PROTECTED]> wrote:
> >
> >> Hi there,
> >> I am using WhiteSpaceAnalyser to index documents. I have used this because i
> >> need to split tokens based on space only. Also Tokensized=true
> >> While indexing what does it do with special characters like
> >> + - && || ! ( ) { } [ ] ^ " ~ * ? : \, will these characters be indexed or
> >> will be chopped off? I am confused about this.
> >>
> >> Now i am having problem while searching as well..
> >> for query strings like "jason dartling (e-mail)" and "re: fyi.dat", i don't
> >> have to escape the special characters ( , ) and : but for input such as
> >> "re:" queryParser is producing error so i have escaped characters here.
> >> So it seems like i have two cases to deal with..
> >> Can anyone suggest me one generic way to deal with both the cases?
> >>
> >> Basically how to index and search string with escape characters will be my
> >> generalized question?
> >>
> >> Please help me
> >> miztaken
> >>
> >> --
> >> View this message in context:
> >> http://www.nabble.com/Issues-with-Special-Characters-tp19511428p19511428.html
> >> Sent from the Lucene - Java Users mailing list archive at Nabble.com.
> >>
> >> ---------------------------------------------------------------------
> >> To unsubscribe, e-mail: [EMAIL PROTECTED]
> >> For additional commands, e-mail: [EMAIL PROTECTED]
> >
> --
> View this message in context:
> http://www.nabble.com/Issues-with-Special-Characters-tp19511428p19512277.html
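On the "one generic way" question in the quoted message: one possible approach, assuming every special character in the user's input is meant literally and never as a field prefix or operator, is to backslash-escape all of them before the string reaches the query parser. The sketch below is plain Java with hypothetical names; the character list follows the query syntax page linked in the thread. As far as I know QueryParser also has a static escape(String) method, which is worth checking before rolling your own.

```java
final class QueryEscapeSketch {
    // Characters treated as special by the query syntax: \ + - ! ( ) { } [ ] ^ " ~ * ? : | &
    // (| and & are escaped singly, which also covers || and &&).
    private static final String SPECIAL = "\\+-!(){}[]^\"~*?:|&";

    // Hypothetical helper: prefixes every special character with a backslash.
    static String escape(String raw) {
        StringBuilder out = new StringBuilder(raw.length() + 8);
        for (int i = 0; i < raw.length(); i++) {
            char c = raw.charAt(i);
            if (SPECIAL.indexOf(c) >= 0) {
                out.append('\\');
            }
            out.append(c);
        }
        return out.toString();
    }

    public static void main(String[] args) {
        String[] inputs = {"re:", "re: fyi.dat", "jason dartling (e-mail)"};
        for (String input : inputs) {
            System.out.println(input + " -> " + escape(input));
        }
    }
}
```

With this, "re:" and "re: fyi.dat" take the same path: both come out with the ':' escaped, so neither is read as a field named re. Printing query.toString() on the parsed result is still the way to confirm what the parser made of the escaped string.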
