[
https://issues.apache.org/jira/browse/STANBOL-877?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14977889#comment-14977889
]
Rupert Westenthaler commented on STANBOL-877:
---------------------------------------------
An analysis based on the information provided in [1] showed that Virtuoso no
longer accepts quotes in full text query strings. After some digging I found
the specification of full text queries in the Virtuoso documentation at [2].
{code:none}
expr ::= proximity_expr
    | expr AND expr
    | expr OR expr
    | expr AND NOT expr
    | '(' expr ')'

word_expr ::=
    word
    | '"' phrase '"'

proximity_expr ::=
    word_expr
    | proximity_expr NEAR word_expr

word ::=
    <word char>*

phrase ::=
    word
    | phrase <whitespace> word

word_char ::= alphanumeric characters, '*', ISO Latin accented characters.
{code}
To ensure that only alphanumeric characters are used in full text query parts,
query strings are now split on '{{\W}}' instead of '{{\s}}'.
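The effect of the changed separator can be sketched as follows. This is only
an illustration, not the actual Entityhub code: the class name
FullTextQuerySplit and the helper tokenize are hypothetical. Note that with
Java's default regex flags '\W' matches everything outside [a-zA-Z_0-9], so
this sketch would also split on '*' and on accented characters.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical helper, for illustration only (not part of Stanbol).
class FullTextQuerySplit {

    /** Splits the query on the given separator regex and drops empty tokens. */
    static List<String> tokenize(String query, String separatorRegex) {
        List<String> tokens = new ArrayList<>();
        for (String part : query.split(separatorRegex)) {
            if (!part.isEmpty()) {
                tokens.add(part);
            }
        }
        return tokens;
    }

    public static void main(String[] args) {
        String query = "the \"quoted text\" example";
        // Splitting on whitespace keeps the double quotes inside the tokens.
        System.out.println(tokenize(query, "\\s+"));
        // Splitting on non-word characters removes the double quotes.
        System.out.println(tokenize(query, "\\W+"));
    }
}
```

With the whitespace separator the tokens still carry the double quotes; with
the non-word separator only the words themselves remain.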
[2] http://docs.openlinksw.com/virtuoso/queryingftcols.html#textexprsyntax
> Double quote in query text cause sparql query to fail
> -----------------------------------------------------
>
> Key: STANBOL-877
> URL: https://issues.apache.org/jira/browse/STANBOL-877
> Project: Stanbol
> Issue Type: Bug
> Components: Entityhub
> Reporter: Florent ANDRE
> Assignee: Rupert Westenthaler
> Attachments: SPARQL-grammar-escapes-STANBOL-877_rw.patch,
> escape-quote-877.patch
>
>
> With the use of NLP engines and content that contains quoted text, quotes
> can appear in the string searched by the Entityhub.
> When an RDF store is used, the generated SPARQL query is not legal because
> the double quote is not escaped.
> Patch submitted against rev 1420034, as I'm currently stuck at that revision.
> This patch contains:
> * A unit test at the
> query/clerezza/src/test/java/org/apache/stanbol/entityhub/query/clerezza/SparqlQueryUtilsTest.java
> level
> * Quote escaping in
> generic/servicesapi/src/main/java/org/apache/stanbol/entityhub/servicesapi/query/TextConstraint.java
> so that quotes are escaped in all query generation cases
> * A removal in
> generic/servicesapi/src/main/java/org/apache/stanbol/entityhub/servicesapi/util/PatternUtils.java
> because that code escaped something that was already escaped, which led to
> the characters no longer being escaped during regex part generation.
> The whole project compiles with this patch at this rev.
> ++