[GitHub] jena issue #335: Jena 1453 reduce docs
Github user afs commented on the issue: https://github.com/apache/jena/pull/335 Documentation changes applied. Thanks! ---
[GitHub] jena issue #335: Jena 1453 reduce docs
Github user xristy commented on the issue: https://github.com/apache/jena/pull/335 Yes it was ready to merge. The documentation updates are queued in the anonymous "improve this page" commit that I made a week ago. ---
[GitHub] jena issue #335: Jena 1453 reduce docs
Github user afs commented on the issue: https://github.com/apache/jena/pull/335 Presumably this was ready to merge. I've noted the announcement text as well, thanks, and will sort out the documentation (unless someone beats me to it). ---
[GitHub] jena issue #335: Jena 1453 reduce docs
Github user xristy commented on the issue: https://github.com/apache/jena/pull/335 Happy to. Here's a brief statement: This release includes updates to the Jena Lucene integration that reduce the size of the documents indexed by Lucene and the size of the resulting indexes. Re-indexing is not necessary, as the changes are compatible with existing indexes. Additionally, there is an optional output argument for `text:query` that makes it possible to retrieve the graph that contains a result triple. See the updated [jena text documentation](http://jena.apache.org/documentation/query/text-query.html). ---
[GitHub] jena issue #335: Jena 1453 reduce docs
Github user afs commented on the issue: https://github.com/apache/jena/pull/335 A few words for the release announcement would be most helpful, especially about not needing to rebuild indexes, but that if you do, they are smaller. ---
[GitHub] jena issue #335: Jena 1453 reduce docs
Github user xristy commented on the issue: https://github.com/apache/jena/pull/335 I've submitted an update to the jena-text documentation to reflect the graph output argument for `text:query`. ---
[GitHub] jena issue #335: Jena 1453 reduce docs
Github user ajs6f commented on the issue: https://github.com/apache/jena/pull/335 @xristy That's what I meant-- I didn't mean to suggest an in-place op, sorry. ---
[GitHub] jena issue #335: Jena 1453 reduce docs
Github user xristy commented on the issue: https://github.com/apache/jena/pull/335 @ajs6f I haven't thought about trying to do an in-place update of a text index; however, perhaps one could use jena/textindexer with an assembler file modified to create a Lucene index off-to-the-side and then halt the main server and swap in the new index. ---
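The off-to-the-side swap suggested above could be sketched roughly as follows. This is only an illustration of the directory swap step, not part of Jena: the function name `swap_index` and the three directory arguments are hypothetical, and the sketch assumes the server has already been halted, the new index has already been built by the text indexer into its own directory, and all three paths sit on the same filesystem so that each rename is a single step.

```python
import os
from pathlib import Path


def swap_index(live: Path, rebuilt: Path, backup: Path) -> None:
    """Move the live index aside and put the rebuilt index in its place.

    Assumption: nothing has the index open while this runs (server halted).
    """
    if not rebuilt.is_dir():
        raise FileNotFoundError(f"rebuilt index not found: {rebuilt}")
    if backup.exists():
        raise FileExistsError(f"backup location already in use: {backup}")

    # Keep the old index so the swap can be undone by hand if needed.
    if live.exists():
        os.rename(live, backup)

    try:
        os.rename(rebuilt, live)
    except OSError:
        # Put the old index back if the new one could not be moved in.
        if backup.exists() and not live.exists():
            os.rename(backup, live)
        raise
```

Keeping the old directory under `backup` rather than deleting it means the previous index can be restored simply by renaming it back before restarting the server.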
[GitHub] jena issue #335: Jena 1453 reduce docs
Github user xristy commented on the issue: https://github.com/apache/jena/pull/335 Ah! I should have written on this. Upgrading to this PR does not affect an existing text index. The changes to the Lucene documents that are indexed will affect triples that are added after the upgrade: their documents will have fewer fields, and if the graph field is enabled, a single stored graph field will be present rather than several instances. This PR removes redundant or unreferenced fields when indexing new triple documents. The triple/document deletion functionality will behave as before if it was enabled when the text index was created. The graph return feature will function with older indexes provided that the graph field was enabled when the text index was created. Re-indexing should generally reduce the size of the text index. ---
[GitHub] jena issue #335: Jena 1453 reduce docs
Github user afs commented on the issue: https://github.com/apache/jena/pull/335 Does this change the on-disk format? I don't think it does, but confirmation would be good. ---
[GitHub] jena issue #335: Jena 1453 reduce docs
Github user xristy commented on the issue: https://github.com/apache/jena/pull/335 Assuming the PR is accepted, I'll update the jena-text doc to reflect that there is an additional output arg for `text:query`. An example is: `select ?g ?s ?lit ?sc where { (?s ?sc ?lit ?g) text:query (skos:altLabel "one" 100 "lang:en") . }` where the `?g` reports the graph in which the matching triples occur. This is likely to be rather more performant than iterating over all graphs or collecting the graph URIs after the fact. ---