Yes, the dev list is the more appropriate place for discussing new
features, enhancements, patches, etc.

Rob

On 04/12/2013 05:41, "Osma Suominen" <[email protected]> wrote:

>Hi all,
>
>there have been no replies so far to my suggestion for jena-text
>enhancements, which I'd like to implement to get better performance when
>there are many named graphs. Should I maybe post this to jena-dev instead?
>
>-Osma
>
>On 29.11.2013 14:02, Osma Suominen wrote:
>> Hi Andy!
>>
>>> Should this be per map entry / per predicate?  I don't know which is
>>> best - an index-wide configuration, or whether it might be that
>>> some predicates are indexed one way and some another.
>>
>> For now, I think this can be global, i.e. not possible to set per
>> predicate.
>>
>>> (and if there is no lang, presumably "") .
>>
>> Probably yes, though I'll defer the lang discussion for now and
>> concentrate on getting the graph information into the index first
>> because that is more critical for me - I have dozens of graphs, but only
>> a few languages in each graph.
>>
>>> Sounds sane.
>>
>> Great!
>>
>>> What would the query predicate in SPARQL look like?
>>
>> For the graph part, I think there is no need to introduce any new
>> syntax. Simply having the text:query within the context of a specific
>> graph should be enough, i.e. this should work:
>>
>> GRAPH <http://example.com/mygraph> {
>>    ?s text:query "keyword" .
>> }
>>
>> For the language part, I'm not so sure, but I'll defer the discussion
>> for now.
>>
>>> If it all defaults back to the current mode of operation, we have a
>>> non-disruptive upgrade path, which would be better if possible.  It's a
>>> change of disk format, which is always more of an issue for existing
>>> use.
>>
>> Yes, that is my intent, to not disrupt existing use in any way.
>>
>> Attached is a first draft patch which is my attempt at adding graph
>> information to the index, iff graphField has been set in the config
>> file, as in the attached config file.
>>
>> With this patch, you can use a query such as this:
>>
>> SELECT ?s {
>>    ?s text:query '+res* +graph:"http\\://example.com/graphA"' .
>> }
>>
>> and you will only get results from within the specified graph. This is
>> obviously a bit awkward since you have to know the name of the graph
>> field, and also the URI quoting is ugly. But at least it proves that the
>> graph information was successfully stored in the index and can be used
>> for retrieval.
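To illustrate the two layers of quoting involved in the query above, here is a minimal sketch in plain Java (no Jena or Lucene dependency). The class and method names are hypothetical, and the escaping rules are an assumption modelled on the example in this message: the colon, backslash and double quote are escaped for Lucene, and backslashes are then doubled again for the SPARQL string literal.

```java
// Hypothetical helper, not part of jena-text: builds the text:query string
// shown above from a keyword expression, a graph field name and a graph URI.
final class GraphQuerySketch {
    private GraphQuerySketch() {}

    // Escape a value for use inside a quoted Lucene phrase.
    // Assumption: only backslash, double quote and colon are escaped,
    // which reproduces the escaping used in the example query.
    public static String escapePhrase(String value) {
        StringBuilder sb = new StringBuilder();
        for (int i = 0; i < value.length(); i++) {
            char c = value.charAt(i);
            if (c == '\\' || c == '"' || c == ':') {
                sb.append('\\');
            }
            sb.append(c);
        }
        return sb.toString();
    }

    // Combine the keyword part and the graph restriction as two required clauses.
    public static String luceneQuery(String keywords, String graphField, String graphUri) {
        return "+" + keywords + " +" + graphField + ":\"" + escapePhrase(graphUri) + "\"";
    }

    // Wrap a string as a single-quoted SPARQL literal: backslashes are
    // doubled first, then single quotes are escaped.
    public static String toSparqlLiteral(String s) {
        return "'" + s.replace("\\", "\\\\").replace("'", "\\'") + "'";
    }

    public static void main(String[] args) {
        String q = luceneQuery("res*", "graph", "http://example.com/graphA");
        System.out.println("?s text:query " + toSparqlLiteral(q) + " .");
    }
}
```

The field name ("graph" here) still has to match whatever graphField is set to in the config file, which is the awkward part mentioned above.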
>>
>> However, I couldn't figure out how to get the URI of the current graph
>> at query time so that an explicit "graph:" query part could be avoided.
>>
>> An ExecutionContext is passed to TextQueryPF methods and it has a
>> getActiveGraph() method which looks promising. But neither the Graph
>> interface nor the GraphBase implementation seem to be aware of the URI
>> (or Node in general) they are identified by. The only (possible,
>> untested) way that I could think of would be to also call
>> ExecutionContext.getDataset(); then call DatasetGraph.listGraphNodes();
>> and for each of the Nodes, call DatasetGraph.getGraph(node) and see if
>> the result matches the Graph that getActiveGraph() returned. But this
>> seems awfully inefficient, especially if there are lots of graphs. Is
>> there a better way to find out the URI of the current graph within
>> TextQueryPF methods?
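The reverse lookup described above can be sketched as follows. This is a stand-in written in plain Java, not real Jena code: the Map plays the role of iterating DatasetGraph.listGraphNodes() and calling DatasetGraph.getGraph(node), and the class and method names are hypothetical. Comparing by object identity is also an assumption; like the approach itself, it is untested against Jena, and it would fail if the dataset hands out a fresh graph object on every call.

```java
import java.util.Map;

// Hypothetical sketch of the linear scan: find the name under which the
// active graph is registered by checking every named graph in turn.
final class ActiveGraphLookupSketch {
    private ActiveGraphLookupSketch() {}

    // Returns the name whose graph is the same object as activeGraph,
    // or null if no named graph matches (e.g. the default graph).
    public static <G> String findGraphName(Map<String, G> namedGraphs, G activeGraph) {
        for (Map.Entry<String, G> entry : namedGraphs.entrySet()) {
            if (entry.getValue() == activeGraph) { // identity comparison (assumption)
                return entry.getKey();
            }
        }
        return null;
    }
}
```

The cost is one comparison per named graph for every text:query evaluation, which is why this looks unattractive for datasets with many graphs.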
>>
>> Finally some misc notes:
>> - TextDocProducerEntities seems to be unused - not touched
>> - TextDocProducerTriples.[qQ]uadsToTriples is unused - not touched
>> - TextIndexLucene.get$ - it seems a bit stupid to use a QueryParser
>>    when you could directly create a Query programmatically - not touched
>> - I think get$ was broken anyway because it doesn't take into account
>>    that the query is tokenized by StandardAnalyzer - but this should now
>>    be fixed as a side effect of using PerFieldAnalyzerWrapper
>> - I made similar changes in TextIndexSolr as in TextIndexLucene, but
>>    have so far tested only the Lucene part
>>
>> -Osma
>>
>
>
>-- 
>Osma Suominen
>D.Sc. (Tech), Information Systems Specialist
>National Library of Finland
>P.O. Box 26 (Teollisuuskatu 23)
>00014 HELSINGIN YLIOPISTO
>Tel. +358 50 3199529
>[email protected]
>http://www.nationallibrary.fi



