GitHub user ehedgehog opened a pull request:
https://github.com/apache/jena/pull/39
Updated text indexing
This change addresses JENA-686 (support for cross-field conjunctive
queries in jena-text) by allowing TextDocProducers to be specified in a
TextDatasetAssembler assembly, and by allowing existing index entities
to be updated as well as added.
DatasetTextGraph - in commit, the call to the monitor's finish()
has moved to the top of the method. A batching TextDocProducer
(as we have in our external app) may have quads buffered up
awaiting end-of-batch, and finish() must flush them before the
commit runs; if it does not, they are auto-flushed after the
commit has run and an exception is thrown.
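The ordering issue in the DatasetTextGraph note can be illustrated with a small sketch. It uses only hypothetical stand-in classes (FakeIndex, BatchingProducer), none of which are Jena types, and it assumes that a write arriving after the commit is what raises the exception:

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical stand-ins for illustration only; these are not Jena classes.
class CommitOrderSketch {

    /** A pretend index that rejects writes once it has been committed. */
    static final class FakeIndex {
        final List<String> log = new ArrayList<>();
        private boolean committed = false;

        void add(String quad) {
            if (committed) {
                throw new IllegalStateException("write after commit: " + quad);
            }
            log.add("add " + quad);
        }

        void commit() {
            committed = true;
            log.add("commit");
        }
    }

    /** A producer that buffers quads and writes them only in finish(). */
    static final class BatchingProducer {
        private final FakeIndex index;
        private final List<String> buffer = new ArrayList<>();

        BatchingProducer(FakeIndex index) { this.index = index; }

        void change(String quad) { buffer.add(quad); }

        void finish() {
            for (String quad : buffer) index.add(quad);
            buffer.clear();
        }
    }

    /** New ordering: finish() flushes the batch, then the index commits. */
    static List<String> commitFlushingFirst(String... quads) {
        FakeIndex index = new FakeIndex();
        BatchingProducer producer = new BatchingProducer(index);
        for (String quad : quads) producer.change(quad);
        producer.finish();
        index.commit();
        return index.log;
    }

    /** Old ordering: commit first, so the late flush hits a committed index. */
    static boolean commitThenFlushFails(String... quads) {
        FakeIndex index = new FakeIndex();
        BatchingProducer producer = new BatchingProducer(index);
        for (String quad : quads) producer.change(quad);
        index.commit();
        try {
            producer.finish();
            return false;
        } catch (IllegalStateException e) {
            return true;
        }
    }

    public static void main(String[] args) {
        System.out.println(commitFlushingFirst("q1", "q2"));
        System.out.println(commitThenFlushFails("q1"));
    }
}
```

In this sketch the buffered quads reach the index only when finish() runs, which is why it has to come before the commit.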
TextDatasetFactory - needed methods to create datasets with
doc producers and close-index-on-close flags
TextIndex - added new operation updateEntity to allow update
of (possibly) existing entities
TextIndexLucene - added implementation of updateEntity. Added
deleteDocuments(Term) for deletion of documents (as is used
in ppd-text-index text doc producer batch).
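As a rough illustration of the difference between adding and updating an entity, the sketch below models an index as a list of documents keyed by entity URI. The class FakeTextIndex and its String-keyed methods are assumptions made for this example; the real TextIndex and TextIndexLucene signatures are not reproduced here. Lucene's own IndexWriter.updateDocument(Term, Document) has similar delete-then-add semantics, though whether the PR relies on it is not stated in this message.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical in-memory model; not the jena-text TextIndex API.
class UpdateEntitySketch {

    /** One indexed document: an entity URI plus its text. */
    static final class Doc {
        final String uri;
        final String text;

        Doc(String uri, String text) {
            this.uri = uri;
            this.text = text;
        }
    }

    static final class FakeTextIndex {
        private final List<Doc> docs = new ArrayList<>();

        /** Plain add: always appends, even if the entity is already indexed. */
        void addEntity(String uri, String text) {
            docs.add(new Doc(uri, text));
        }

        /** Removes every document stored for the given entity URI. */
        void deleteDocuments(String uri) {
            docs.removeIf(d -> d.uri.equals(uri));
        }

        /** Update: drop any existing documents for the entity, then add. */
        void updateEntity(String uri, String text) {
            deleteDocuments(uri);
            addEntity(uri, text);
        }

        /** Returns the texts currently stored for the entity, in insertion order. */
        List<String> texts(String uri) {
            List<String> out = new ArrayList<>();
            for (Doc d : docs) {
                if (d.uri.equals(uri)) out.add(d.text);
            }
            return out;
        }
    }

    public static void main(String[] args) {
        FakeTextIndex index = new FakeTextIndex();
        index.addEntity("urn:example:e1", "first label");
        index.addEntity("urn:example:e1", "second label");
        System.out.println(index.texts("urn:example:e1"));
        index.updateEntity("urn:example:e1", "replacement label");
        System.out.println(index.texts("urn:example:e1"));
    }
}
```

Because the update works on an entity that may or may not exist yet, calling updateEntity for an unknown URI in this model simply behaves like an add.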
TextDatasetAssembler - updated to allow specification of doc
producer (since done by mainline Jena) and
close-index-on-close.
AbstractTestDatasetWithLuceneIndex, DummyDocProducer,
TestTextDatasetAssembler - fixed up some issues in the test
framework (use of statics vs use of instance variables,
remembering to close datasets in @After once done, etc).
You can merge this pull request into a Git repository by running:
$ git pull https://github.com/epimorphics/jena-config-doc-producer updated-text-indexing
Alternatively you can review and apply these changes as the patch at:
https://github.com/apache/jena/pull/39.patch
To close this pull request, make a commit to your master/trunk branch
with (at least) the following in the commit message:
This closes #39
----
commit 1a919696739e60d46f92b1e58d0fbefb50c14dee
Author: Chris Dollin <[email protected]>
Date: 2015-02-19T15:44:20Z
Integrated changes to jena-text from work on JENA-686.
commit 5c4e91981f3564252cd74d638285497d149229b2
Author: Chris Dollin <[email protected]>
Date: 2015-02-20T11:31:27Z
Merged in the changes since the apache-jena fork (over 400 commits)
and fixed up the jena-text changes that got conflicts (mostly due to
the automatic merge not handling all the cases well).
commit c702ba48388304dfaf88e84788aed59d3566d685
Author: Chris Dollin <[email protected]>
Date: 2015-02-26T09:40:33Z
Fixed conflicts following merge with latest jena master.
commit d7040a9955a62eeef2e16d38eefb101420a07c06
Author: Chris Dollin <[email protected]>
Date: 2015-02-26T12:02:52Z
Updated dataset assembler so that it can handle document
producers that need a dataset as well as a text index, viz. the
dependent text indexer.
----