Re: HELP in QueyParsing !!

2003-07-14 Thread Victor Hadianto
 Input:   Query created:   Remarks:
 c\+\+    c                (Escape character not working)

The StandardTokenizer and QueryParser will drop the ++ signs. This problem is 
similar to one in a recent thread. Search the archive for the following string:
'-' character not interpreted correctly in field names

You may be able to implement a solution similar to the one that I posted there. 

Actually, your query got me interested. I tried my solution on c-- and the 
-- signs are dropped. This is because I defined DASHESWORD as 

| <DASHESWORD: <ALPHANUM> ("-" <ALPHANUM>)+ >

This will match t-shirt, but not tshirt-, because every dash must be followed 
by another ALPHANUM. Yet another QueryParser peculiarity :)
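
To see why the trailing dashes are lost, the DASHESWORD rule can be 
approximated with a plain Java regular expression. This is only a sketch: it 
assumes ALPHANUM is a run of ASCII letters and digits (the real 
StandardTokenizer definition is broader), and the class and method names are 
made up for illustration.

```java
import java.util.regex.Pattern;

// Rough regex stand-in for the DASHESWORD rule: ALPHANUM ("-" ALPHANUM)+
// Assumption: ALPHANUM is simplified to ASCII letters and digits.
class DashesWordSketch {
    private static final String ALPHANUM = "[A-Za-z0-9]+";
    private static final Pattern DASHESWORD =
            Pattern.compile(ALPHANUM + "(-" + ALPHANUM + ")+");

    // True only if the whole input would be a single DASHESWORD token.
    static boolean isDashesWord(String input) {
        return DASHESWORD.matcher(input).matches();
    }

    public static void main(String[] args) {
        for (String s : new String[] {"t-shirt", "tshirt-", "c--"}) {
            System.out.println(s + " -> " + isDashesWord(s));
        }
    }
}
```

Under these assumptions only t-shirt is accepted as a whole token. Both 
tshirt- and c-- end in a dash with no ALPHANUM after it, so the rule cannot 
cover the dash.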

If you absolutely have to search for c++, then I suggest you define another 
token that covers alphanumeric words together with the plus sign. For example 
(modify StandardTokenizer.jj):

| <MYTOKEN: (<ALPHANUM> | "+")+ >

add the line:

token = <MYTOKEN>

in the next() method. This may work.
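
Before rebuilding the tokenizer, the proposed MYTOKEN pattern can be checked 
quickly with a Java regular expression. Again this is a sketch, not the 
tokenizer itself: ALPHANUM is assumed to be ASCII letters and digits, and the 
class and method names are hypothetical.

```java
import java.util.regex.Pattern;

// Rough regex stand-in for the proposed MYTOKEN rule: (ALPHANUM | "+")+
// Assumption: ALPHANUM is simplified to ASCII letters and digits.
class PlusTokenSketch {
    private static final Pattern MYTOKEN =
            Pattern.compile("(?:[A-Za-z0-9]|\\+)+");

    // True only if the whole input would be kept as one MYTOKEN token.
    static boolean isMyToken(String input) {
        return MYTOKEN.matcher(input).matches();
    }

    public static void main(String[] args) {
        for (String s : new String[] {"c++", "c", "c--", "c++-"}) {
            System.out.println(s + " -> " + isMyToken(s));
        }
    }
}
```

With this pattern c++ stays in one piece, while c-- and c++- are still not 
covered because of the dash. Note that a bare ++ also matches, which may or 
may not be what you want.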

 c++-   (Parser throws an exception) [NOTE-1]
As expected.

 *c -   (throws an exception -   [NOTE-2]
There have been a number of discussions on this subject; search the mailing 
list for more information. 

 Does that mean that the program should take care of validating the
 user input and then pass the query string to QueryParser?

It depends on how you look at it. QueryParser will throw a ParseException if 
it has trouble parsing the query, so you can to some extent treat that as the 
validation.
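
The pattern of using the parser's exception as the validation step might look 
roughly like the sketch below. Everything in it is a stand-in: StrictParser, 
QuerySyntaxException and TOY_PARSER are hypothetical names, not the Lucene 
API, and the toy rules (reject a leading '*' or a trailing '-') only imitate 
the two failing inputs from this thread. In a real application the call inside 
the try block would be the QueryParser call, and the catch would handle its 
ParseException.

```java
import java.util.Optional;

// Sketch: let the parser's exception act as the validation of user input.
// All types here are hypothetical stand-ins, NOT the Lucene API.
class QueryValidationSketch {

    // Plays the role of the parser's checked ParseException.
    static class QuerySyntaxException extends Exception {
        QuerySyntaxException(String message) { super(message); }
    }

    // Plays the role of the query parser.
    interface StrictParser {
        String parse(String query) throws QuerySyntaxException;
    }

    // Returns an error message for the user, or empty if the query parsed.
    static Optional<String> validate(StrictParser parser, String userInput) {
        try {
            parser.parse(userInput);
            return Optional.empty();
        } catch (QuerySyntaxException e) {
            return Optional.of("Cannot understand query '" + userInput
                    + "': " + e.getMessage());
        }
    }

    // Toy parser for the demo only: rejects a leading '*' or a trailing '-'.
    static final StrictParser TOY_PARSER = query -> {
        String q = query.trim();
        if (q.isEmpty() || q.startsWith("*") || q.endsWith("-")) {
            throw new QuerySyntaxException("misplaced wildcard or operator");
        }
        return q;
    };

    public static void main(String[] args) {
        for (String s : new String[] {"t-shirt", "c++-", "*c -"}) {
            System.out.println(s + " -> " + validate(TOY_PARSER, s).orElse("ok"));
        }
    }
}
```

The advantage of this approach is that the application does not have to 
duplicate the parser's syntax rules; it only has to turn the exception into a 
message the user can act on.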


HTH,
victor


-
To unsubscribe, e-mail: [EMAIL PROTECTED]
For additional commands, e-mail: [EMAIL PROTECTED]



RE: HELP in QueyParsing !!

2003-07-14 Thread Bharatbhushan_Shetty
Thanks, Victor. I'll look into your earlier postings for the solution. 
But I was wondering: there might be many more scenarios that a user
might search for. 

