[
https://issues.apache.org/jira/browse/JENA-1749?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16922239#comment-16922239
]
Brian McBride edited comment on JENA-1749 at 9/4/19 7:49 AM:
-------------------------------------------------------------
[[
I'm in email contact and can discuss.
]]
That's great. Thank you. If you are on vacation, this can wait. Right now I
think there is a technical solution, but I am also conscious I may not have
understood something important.
[[
The _functionality_ has been clearly documented since 3.6.0 as _not supported_.
]]
I understand and respect that position. Mine is a bit different, though I think
there is a way of looking at this that would reconcile our positions. Whilst
that would be an interesting discussion that appeals to my philosophical
nature, I should follow your lead and see if we can figure out an actual
working solution.
[[
The use of {{text:withFields}} is an approach that came to mind.
]]
I had a variation on the same idea.
[[
I appreciate your opinion. Otoh, the model for the integration of Jena w/
Lucene is _one triple == one document_ and the use of {{text:withFields}} is
one way of unambiguously indicating that the query is based on a different
model and the complexities that arise can be dealt with in a clear manner.
]]
You are right - I was expressing a value judgement and yours may differ. In
practical terms, if the property name were to change for the behaviour we use,
that would affect up to 5 applications and libraries (it may just be one library
but I can't be sure of that without checking) and, more significantly, we have
a public SPARQL endpoint so it would affect all our users who use text queries.
Now that is my problem - I just would like to be clear I'm not being capricious
here.
[[
Do you have another approach to dealing with the inherent ambiguity that your
use case exploits in:
{code:java}
?s text:query ( "query with fields" LUCENE_LIMIT )
{code}
]]
I'm not sure what you mean by "inherent ambiguity" so I may be missing
something obvious.
The query itself is unambiguous. If it does not specify a field then lucene
will search the default field.
In terms of the result, there is no ambiguity - there is only one subject for
each document.
If you are considering the case of the problematic query form:
(?s ?score ?lit) text:query ( "query with or without fields" LUCENE_LIMIT )
then there are two cases:
1) the normal case - the Lucene hit has only one field - so use that to return
the ?lit value (which I guess is what happens in 3.12.0)
2) the multi-field index case - the Lucene hit has more than one field, so don't
bind ?lit and possibly throw an exception
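To make the two cases concrete, here is a minimal sketch in plain Java. It is not the jena-text code: the class name, the method name and the use of a Map to stand in for the text fields of one Lucene hit are all assumptions made for illustration.

```java
import java.util.Map;
import java.util.Optional;

// Hypothetical sketch; none of these names exist in jena-text.
final class LitBindingSketch {

    // 'textFields' stands in for the text fields of a single Lucene hit
    // (field name -> stored value). Assumed shape, for illustration only.
    static Optional<String> literalFor(Map<String, String> textFields) {
        if (textFields.size() == 1) {
            // Case 1: exactly one field, so its value can be bound to ?lit.
            return Optional.of(textFields.values().iterator().next());
        }
        // Case 2: more than one field (or none), so leave ?lit unbound.
        return Optional.empty();
    }

    public static void main(String[] args) {
        System.out.println(literalFor(Map.of("label", "The Street")));
        System.out.println(literalFor(Map.of("street", "The Street", "town", "Bristol")));
    }
}
```

Returning an empty Optional models "don't bind ?lit"; throwing an exception instead would be the stricter alternative mentioned in case 2.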
I recognise I am probably missing something - maybe this will help identify it.
Another way in would be to ask - why has it changed since 3.12.0? Is there
something about OR that required this change?
Another thought: if it's the case that the code needs to know that it is
dealing with a multi-field index, then a config parameter could be used to tell
it rather than a property name.
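A rough sketch of what I mean by a config parameter, again in plain Java. The multiFieldIndex flag is hypothetical and is not an existing jena-text setting, and the class and method names are invented for illustration.

```java
import java.util.Map;
import java.util.Optional;

// Hypothetical sketch; 'multiFieldIndex' is an assumed setting, not a real jena-text option.
final class MultiFieldConfigSketch {

    private final boolean multiFieldIndex;

    MultiFieldConfigSketch(boolean multiFieldIndex) {
        this.multiFieldIndex = multiFieldIndex;
    }

    // 'textFields' stands in for the text fields of one Lucene hit (assumed shape).
    Optional<String> literalFor(Map<String, String> textFields) {
        if (multiFieldIndex) {
            // The index is declared multi-field: never try to bind ?lit.
            return Optional.empty();
        }
        if (textFields.size() != 1) {
            // One triple == one document was promised, but the hit says otherwise.
            throw new IllegalStateException("expected exactly one text field, found " + textFields.size());
        }
        return Optional.of(textFields.values().iterator().next());
    }

    public static void main(String[] args) {
        MultiFieldConfigSketch multi = new MultiFieldConfigSketch(true);
        System.out.println(multi.literalFor(Map.of("street", "The Street", "town", "Bristol")));
    }
}
```

The point of the flag is that the query syntax stays as {{text:query}}; only the index configuration says which model applies.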
> Support lucene field names in jena text queries
> -----------------------------------------------
>
> Key: JENA-1749
> URL: https://issues.apache.org/jira/browse/JENA-1749
> Project: Apache Jena
> Issue Type: Bug
> Components: Text
> Affects Versions: Jena 3.13.0
> Reporter: Brian McBride
> Priority: Major
> Attachments: stacktrace.txt
>
>
> Until recent changes made during implementation of JENA-1723, it was possible
> to have a Lucene text query that used Lucene field names. With the
> implementation of JENA-1723 such queries result in an exception.
> For example:
> {quote}
> PREFIX xsd: <http://www.w3.org/2001/XMLSchema#>
> PREFIX text: <http://jena.apache.org/text#>
> PREFIX ppd: <http://landregistry.data.gov.uk/def/ppi/>
> PREFIX lrcommon: <http://landregistry.data.gov.uk/def/common/>
>
> SELECT * {
>   ?ppd_propertyAddress text:query ( "street: the" 3000000 ) .
> } LIMIT 1
>
> Cannot parse 'text:street: the ': Encountered " ":" ": "" at line 1, column 11.
> {quote}
> This is a simplified query from a running production system that works in
> 3.12.0 but is failing in 3.13.0-SNAPSHOT.
> Some discussion and analysis of this issue has occurred in email:
> [https://lists.apache.org/thread.html/ccc1d5c5eaebcddafc2dbae85f3b5901396e3ab203df6bb4014e8270@%3Cusers.jena.apache.org%3E]
>