[
https://issues.apache.org/jira/browse/JENA-1645?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16709241#comment-16709241
]
Code Ferret commented on JENA-1645:
-----------------------------------
It would be helpful to see example queries and how you have used the subject
URI.
I agree that {{concreteSubject}} *should* create Lucene queries that
include a term of the form:
{code}
... AND uri:http://example.org/data/resource/R0123
{code}
Currently the code for {{concreteSubject}} collects results for all possible
subjects and then, once the results are returned, keeps only the ones
corresponding to the provided {{subject}} and discards the rest.
Quite inefficient!
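To make the contrast concrete, here is a minimal sketch in plain Java (no Lucene dependency). The class {{ConcreteSubjectSketch}}, the {{Hit}} type and both method names are hypothetical and are not taken from {{TextQueryPF}}. It also assumes that the {{uri}} field is indexed un-tokenized, so that a quoted value matches the whole URI; the quoting is my assumption about how to get characters such as {{:}} and {{/}} past the query parser.

```java
import java.util.ArrayList;
import java.util.List;

/** Illustration only: hypothetical names, not the actual TextQueryPF code. */
final class ConcreteSubjectSketch {

    /** A text-index hit: the subject URI of the matching document and its score. */
    static final class Hit {
        final String subjectUri;
        final float score;

        Hit(String subjectUri, float score) {
            this.subjectUri = subjectUri;
            this.score = score;
        }
    }

    /** Behaviour as described above: take hits for all subjects, keep only one subject. */
    static List<Hit> postFilter(List<Hit> allHits, String subjectUri) {
        List<Hit> kept = new ArrayList<>();
        for (Hit hit : allHits) {
            if (hit.subjectUri.equals(subjectUri)) {
                kept.add(hit);
            }
        }
        return kept;
    }

    /**
     * Proposed alternative: constrain the query string itself, so the index
     * only returns hits for the given subject.
     * Assumption: quoting the URI is enough to protect ':' and '/' from the parser.
     */
    static String constrainToSubject(String queryString, String subjectUri) {
        // Inside a quoted term only backslash and double quote need escaping.
        String escaped = subjectUri.replace("\\", "\\\\").replace("\"", "\\\"");
        return "(" + queryString + ") AND uri:\"" + escaped + "\"";
    }
}
```

With the second approach the index does the selection, so the cost of a call no longer grows with the number of documents that match the text alone.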
This behavior is transparent to the user apart from the performance. However,
if there is some reason to keep it, then the _new_ behavior can be
handled by adding a {{boolean}} {{TextIndex}} option to the configuration:
{{text:useConcreteSubject}}.
The implementation involves threading the subject into
{{TextIndex.query(...)}}, which means adding a new query method to {{TextIndex}},
{{TextIndexLucene}} and {{TextIndexES}}. It should be rather straightforward.
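A rough sketch of what threading the subject through could look like, again in plain Java. {{SketchTextIndex}} and {{InMemorySketchIndex}} are hypothetical stand-ins, not the real {{TextIndex}} or {{TextIndexLucene}}, and the method signatures are assumptions made for illustration. The constructor flag plays the role of the proposed {{text:useConcreteSubject}} option.

```java
import java.util.ArrayList;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

/** Hypothetical stand-in for a text index; NOT Jena's actual TextIndex API. */
interface SketchTextIndex {

    /** Existing style of call: subjects whose text matches, across all subjects. */
    List<String> query(String term, int limit);

    /**
     * New overload with the subject threaded in. The default post-filters,
     * so an implementation that does not override it keeps the old behaviour.
     */
    default List<String> query(String subjectUri, String term, int limit) {
        List<String> kept = new ArrayList<>();
        for (String hit : query(term, limit)) {
            if (hit.equals(subjectUri)) {
                kept.add(hit);
            }
        }
        return kept;
    }
}

/** Toy in-memory index standing in for a Lucene-backed implementation. */
final class InMemorySketchIndex implements SketchTextIndex {
    private final Map<String, String> textBySubject = new LinkedHashMap<>();
    private final boolean useConcreteSubject; // mirrors the proposed config option

    InMemorySketchIndex(boolean useConcreteSubject) {
        this.useConcreteSubject = useConcreteSubject;
    }

    void add(String subjectUri, String text) {
        textBySubject.put(subjectUri, text);
    }

    @Override
    public List<String> query(String term, int limit) {
        List<String> hits = new ArrayList<>();
        for (Map.Entry<String, String> entry : textBySubject.entrySet()) {
            if (hits.size() >= limit) {
                break;
            }
            if (entry.getValue().contains(term)) {
                hits.add(entry.getKey());
            }
        }
        return hits;
    }

    @Override
    public List<String> query(String subjectUri, String term, int limit) {
        if (!useConcreteSubject) {
            // Option off: fall back to the post-filtering default.
            return SketchTextIndex.super.query(subjectUri, term, limit);
        }
        // Option on: look only at the given subject.
        List<String> hits = new ArrayList<>();
        String text = textBySubject.get(subjectUri);
        if (limit > 0 && text != null && text.contains(term)) {
            hits.add(subjectUri);
        }
        return hits;
    }
}
```

In this toy version the difference shows up as soon as a limit is involved: with the option off, a limit of 1 can be used up by another subject's hit before the filter runs, which is the "missed results" problem from the issue description.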
> Poor performance with full text search (Lucene)
> -----------------------------------------------
>
> Key: JENA-1645
> URL: https://issues.apache.org/jira/browse/JENA-1645
> Project: Apache Jena
> Issue Type: Question
> Components: Jena
> Affects Versions: Jena 3.9.0
> Reporter: Vasyl Danyliuk
> Priority: Major
>
> Situation: half a million documents (emails, actually) indexed by Lucene;
> searching for emails by sender/receiver and some text.
> If the text filter is put at the start of the SPARQL query, it executes once, but
> for very common words there are a lot of results (100 000+), which leads to
> poor performance; limiting the result count may end up with missed results.
> If the text search is put as the last condition, it executes once per
> already-found subject. That's completely OK, but the text search completely
> ignores the subject URI.
> I found two methods in the TextQueryPF class: variableSubject(...) for the first
> case, and concreteSubject(...) for the second one.
> The question is: why can't the subject URI be used as a constraint in the text
> search?