hi marcel,

thanks for the information.
can you add your comments to the jira issue?
https://issues.apache.org/jira/browse/JCR-1248

ok, if i try to run the query like this

//element(*, nt:base)[jcr:contains(., 'test\"!\"')]"

it works fine

but i think jackrabbit should handle the query properly even if the special 
character is at the end.

>>What I propose is to limit the set to only those that are really required. 
>>e.g. 
>>the "!" is equivalent to "-" and the keyword NOT. And then clearly document 
>>it.

yes, clear documentation is often the problem :-)

>>This however means that you need to escape more than the specified set of 
>>characters.

should we add a utility class that handles this kind of escaping? we already 
have ISO9075, which handles filenames, and ISO8601, which handles date/time 
values, so it would be good to have something that encodes search literals 
as well.
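
a rough sketch of what such a helper could look like. the class name 
SearchLiteralEscaper and its escape method are made up for illustration and 
are not part of jackrabbit. the two character sets are taken from marcel's 
message quoted below: the set required by JSR 170 and the extended set that 
jackrabbit currently treats as special.

```java
// Hypothetical helper, not an existing Jackrabbit class.
final class SearchLiteralEscaper {

    // Characters JSR 170 says must be escaped inside a searchexp literal:
    // single quote, double quote, hyphen and the backslash itself.
    static final String SPEC_CHARS = "'\"-\\";

    // The extended set of special characters listed in the quoted message.
    static final String EXTENDED_CHARS = "\\+-!():^[]\"{}~*?";

    private SearchLiteralEscaper() {
    }

    // Returns the literal with a backslash placed in front of every
    // character that occurs in specialChars; other characters are copied.
    static String escape(String literal, String specialChars) {
        StringBuilder sb = new StringBuilder(literal.length() + 8);
        for (int i = 0; i < literal.length(); i++) {
            char c = literal.charAt(i);
            if (specialChars.indexOf(c) >= 0) {
                sb.append('\\');
            }
            sb.append(c);
        }
        return sb.toString();
    }
}
```

with this, SearchLiteralEscaper.escape(userInput, 
SearchLiteralEscaper.EXTENDED_CHARS) would be called on the user input before 
it is placed into jcr:contains. which of the two sets is the right one depends 
on what we decide to document.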

BR,
claus


-----Original Message-----
From: Marcel Reutegger [mailto:[EMAIL PROTECTED] 
Sent: Friday, 30 November 2007 11:11
To: [email protected]
Subject: Re: AW: FullText Search Problem


KÖLL Claus wrote:
> so either i will filter some characters from the search string or jackrabbit 
> should handle it.
> i think the second one will be better

JSR 170 specifies a set of characters that need to be escaped if one wishes to 
use them as literals rather than with the semantics the spec gives them:

"Within the searchexp literal instances of single quote ("'"), double quote 
(""") and hyphen ("-") must be escaped with a backslash ("\"). Backslash itself 
must therefore also be escaped, ending up as double backslash ("\\")."

Jackrabbit extended this set to provide additional functionality. For example, 
you can do a fuzzy search: test~

This however means that you need to escape more than the specified set of 
characters. Strictly speaking this is a violation of the spec. But without 
extending this set of characters additional functionality is very difficult to 
implement.

The current set of special characters that need escaping is:

"\\", "+", "-", "!", "(", ")", ":", "^", "[", "]", "\"", "{", "}", "~", "*", "?"

What I propose is to limit the set to only those that are really required. e.g. 
the "!" is equivalent to "-" and the keyword NOT. And then clearly document it.

regards
  marcel
