Yes, the dev list is the more appropriate place for discussing new features, enhancements, patches, etc.

Rob

On 04/12/2013 05:41, "Osma Suominen" <[email protected]> wrote:

>Hi all,
>
>there have been no replies so far to my suggestion for jena-text
>enhancements that I'd like to implement to get better performance when
>there are many named graphs. Should I maybe post this to jena-dev instead?
>
>-Osma
>
>29.11.2013 14:02, Osma Suominen wrote:
>> Hi Andy!
>>
>>> Should this be per map entry/ per predicate? I don't know which is
>>> best - whether a index-wide configuration or whether it might be
>>> some predicates are indexed one way and some another.
>>
>> For now, I think this can be global, i.e. not possible to set per
>> predicate.
>>
>>> (and if there is no lang, presumably "") .
>>
>> Probably yes, though I'll defer the lang discussion for now and
>> concentrate on getting the graph information into the index first
>> because that is more critical for me - I have dozens of graphs, but only
>> a few languages in each graph.
>>
>>> Sounds sane.
>>
>> Great!
>>
>>> What would the query predicate in SPARQL look like?
>>
>> For the graph part, I think there is no need to introduce any new
>> syntax. Simply having the text:query within the context of a specific
>> graph should be enough, i.e. this should work:
>>
>> GRAPH <http://example.com/mygraph> {
>>   ?s text:query "keyword" .
>> }
>>
>> For the language part, I'm not so sure, but I'll defer the discussion
>> for now.
>>
>>> If it all defaults back to the current mode of operations, we have a
>>> non-disruptive upgrade path which would be better if possible. It's a
>>> change of disk-format which is always more of an issue for existing
>>> use.
>>
>> Yes, that is my intent, to not disrupt existing use in any way.
>>
>> Attached is a first draft patch which is my attempt at adding graph
>> information to the index, iff graphField has been set in the config
>> file, as in the attached config file.
>>
>> With this patch, you can use a query such as this:
>>
>> SELECT ?s {
>>   ?s text:query '+res* +graph:"http\\://example.com/graphA"' .
>> }
>>
>> and you will only get results from within the specified graph. This is
>> obviously a bit awkward since you have to know the name of the graph
>> field, and also the URI quoting is ugly. But at least it proves that the
>> graph information was successfully stored in the index and can be used
>> for retrieval.
>>
>> However, I couldn't figure out how to get the URI of the current graph
>> at query time so that an explicit "graph:" query part could be avoided.
>>
>> An ExecutionContext is passed to TextQueryPF methods and it has a
>> getActiveGraph() method which looks promising. But neither the Graph
>> interface nor the GraphBase implementation seem to be aware of the URI
>> (or Node in general) they are identified by. The only (possible,
>> untested) way that I could think of would be to also call
>> ExecutionContext.getDataset(); then call DatasetGraph.listGraphNodes();
>> and for each of the Nodes, call DatasetGraph.getGraph(node) and see if
>> the result matches the Graph that getActiveGraph() returned. But this
>> seems awfully inefficient, especially if there are lots of graphs. Is
>> there a better way to find out the URI of the current graph within
>> TextQueryPF methods?
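
For illustration only, the reverse lookup described in the quoted paragraph above can be sketched in plain Java without any Jena dependency. Everything here is a stand-in: the class GraphNameLookup, the method findName, and the use of a Map from graph name to graph object are hypothetical and are not Jena API. The map plays the role of the dataset's named graphs, and the identity comparison is an assumption; whether the real Graph objects can be compared this way is not established in this thread.

```java
import java.util.LinkedHashMap;
import java.util.Map;
import java.util.Optional;

public class GraphNameLookup {

    /**
     * Hypothetical helper: scan every (name, graph) pair and return the name
     * whose graph is the same object as 'active'. This is a linear scan, so
     * the cost grows with the number of named graphs.
     */
    public static <G> Optional<String> findName(Map<String, G> namedGraphs, G active) {
        for (Map.Entry<String, G> entry : namedGraphs.entrySet()) {
            // Identity comparison is an assumption made for this sketch.
            if (entry.getValue() == active) {
                return Optional.of(entry.getKey());
            }
        }
        return Optional.empty();
    }

    public static void main(String[] args) {
        // Plain objects stand in for graphs; the names are example URIs.
        Map<String, Object> dataset = new LinkedHashMap<>();
        Object graphA = new Object();
        Object graphB = new Object();
        dataset.put("http://example.com/graphA", graphA);
        dataset.put("http://example.com/graphB", graphB);

        System.out.println(findName(dataset, graphB).orElse("(not found)"));
    }
}
```

The sketch makes the cost concern concrete: each lookup visits up to every named graph, so with dozens of graphs and one lookup per query the scan is repeated work that a direct "name of the active graph" accessor would avoid.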
>>
>> Finally some misc notes:
>> - TextDocProducerEntities seems to be unused - not touched
>> - TextDocProducerTriples.[qQ]uadsToTriples is unused - not touched
>> - TextIndexLucene.get$ - it seems a bit stupid to use a QueryParser
>>   when you could directly create a Query programmatically - not touched
>> - I think get$ was broken anyway because it doesn't take into account
>>   that the query is tokenized by StandardAnalyzer - but this should now
>>   be fixed as a side effect of using PerFieldAnalyzerWrapper
>> - I made similar changes in TextIndexSolr as in TextIndexLucene, but
>>   have so far tested only the Lucene part
>>
>> -Osma
>>
>
>
>--
>Osma Suominen
>D.Sc. (Tech), Information Systems Specialist
>National Library of Finland
>P.O. Box 26 (Teollisuuskatu 23)
>00014 HELSINGIN YLIOPISTO
>Tel. +358 50 3199529
>[email protected]
>http://www.nationallibrary.fi
