[GitHub] jena issue #335: Jena 1453 reduce docs

2018-01-11 Thread afs
Github user afs commented on the issue:

https://github.com/apache/jena/pull/335
  
Documentation changes applied. Thanks!


---


[GitHub] jena issue #335: Jena 1453 reduce docs

2018-01-10 Thread xristy
Github user xristy commented on the issue:

https://github.com/apache/jena/pull/335
  
Yes it was ready to merge. The documentation updates are queued in the 
anonymous "improve this page" commit that I made a week ago.


---


[GitHub] jena issue #335: Jena 1453 reduce docs

2018-01-10 Thread afs
Github user afs commented on the issue:

https://github.com/apache/jena/pull/335
  
Presumably this was ready to merge. I've noted the announcement text as 
well, thanks, and will sort out the documentation (unless someone beats me to 
it).




---


[GitHub] jena issue #335: Jena 1453 reduce docs

2018-01-06 Thread xristy
Github user xristy commented on the issue:

https://github.com/apache/jena/pull/335
  
Happy to. Here's a brief statement:

This release includes updates to the Jena Lucene integration that reduce 
the size of the documents indexed by Lucene and of the resulting indexes. 
Re-indexing is not necessary, as the changes are compatible with existing 
indexes. Additionally, there is an optional output argument for 
`text:query` that retrieves the graph containing a result triple. 
See the updated [jena text 
documentation](http://jena.apache.org/documentation/query/text-query.html).


---


[GitHub] jena issue #335: Jena 1453 reduce docs

2018-01-06 Thread afs
Github user afs commented on the issue:

https://github.com/apache/jena/pull/335
  
A few words for the release announcement would be most helpful, especially 
about not needing to rebuild indexes, but that if you do, they are smaller.



---


[GitHub] jena issue #335: Jena 1453 reduce docs

2018-01-03 Thread xristy
Github user xristy commented on the issue:

https://github.com/apache/jena/pull/335
  
I've submitted an update to the jena-text documentation to reflect the 
graph output argument for `text:query`.


---


[GitHub] jena issue #335: Jena 1453 reduce docs

2018-01-03 Thread ajs6f
Github user ajs6f commented on the issue:

https://github.com/apache/jena/pull/335
  
@xristy That's what I meant-- I didn't mean to suggest an in-place op, 
sorry.


---


[GitHub] jena issue #335: Jena 1453 reduce docs

2018-01-03 Thread xristy
Github user xristy commented on the issue:

https://github.com/apache/jena/pull/335
  
@ajs6f I haven't thought about trying to do an in-place update of a text 
index; however, perhaps one could use jena/textindexer with an assembler file 
modified to create a Lucene index off-to-the-side and then halt the main server 
and swap in the new index.


---


[GitHub] jena issue #335: Jena 1453 reduce docs

2018-01-03 Thread xristy
Github user xristy commented on the issue:

https://github.com/apache/jena/pull/335
  
Ah! I should have written on this.

Upgrading to this PR does not affect an existing text index.

The changes to the indexed Lucene documents affect only triples added after 
the upgrade: their documents will have fewer fields and, if the graph field is 
enabled, a single stored graph field rather than several instances. 

This PR removes redundant or unreferenced fields when indexing new 
triple documents. 

The triple/document deletion functionality will behave as before if it was 
enabled when the text index was created.

The graph return feature will function with older indexes provided that the 
graph field was enabled when the text index was created.

Re-indexing should generally reduce the size of the text index.




---


[GitHub] jena issue #335: Jena 1453 reduce docs

2018-01-03 Thread afs
Github user afs commented on the issue:

https://github.com/apache/jena/pull/335
  
Does this change the on-disk format? I don't think it does, but confirmation 
would be good.


---


[GitHub] jena issue #335: Jena 1453 reduce docs

2017-12-28 Thread xristy
Github user xristy commented on the issue:

https://github.com/apache/jena/pull/335
  
Assuming the PR is accepted, I'll update the jena-text doc to reflect that 
there is an additional output arg for `text:query`. An example is:

PREFIX text: <http://jena.apache.org/text#>
PREFIX skos: <http://www.w3.org/2004/02/skos/core#>

select ?g ?s ?lit ?sc
where {
   (?s ?sc ?lit ?g) text:query (skos:altLabel "one" 100 "lang:en") .
}

where `?g` reports the graph in which each matching triple occurs. This 
is likely to be considerably faster than iterating over all graphs or 
collecting the graph URIs after the fact.
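
For comparison, here is a rough sketch of what collecting the graph after the 
fact might look like without the new output argument. This is only an 
illustration, assuming the same `skos:altLabel` property and search string as 
the example above; whether it matches the exact workaround meant here is an 
assumption.

```sparql
PREFIX text: <http://jena.apache.org/text#>
PREFIX skos: <http://www.w3.org/2004/02/skos/core#>

# Run the text search first, without the graph output argument ...
SELECT ?g ?s ?lit ?sc
WHERE {
  (?s ?sc ?lit) text:query (skos:altLabel "one" 100 "lang:en") .
  # ... then look up which named graph(s) hold each matching triple.
  GRAPH ?g { ?s skos:altLabel ?lit }
}
```

The extra `GRAPH ?g { ... }` pattern has to be matched against the dataset for 
every text hit, which is the additional work that the `?g` output argument is 
intended to avoid.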


---