[
https://issues.apache.org/jira/browse/CONNECTORS-1219?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14681211#comment-14681211
]
Shinichiro Abe commented on CONNECTORS-1219:
--------------------------------------------
Progress report: multiple IndexWriters writing to one index with NoLockFactory
corrupt the index.
{noformat}
ERROR 2015-08-04 08:17:27,565 (Worker thread '32') - Exception tossed:
org.apache.lucene.index.CorruptIndexException: codec footer mismatch (file
truncated?): actual footer=1768776044 vs expected footer=-1071082520
(resource=_br_Lucene50_0.pos)
org.apache.manifoldcf.core.interfaces.ManifoldCFException:
org.apache.lucene.index.CorruptIndexException: codec footer mismatch (file
truncated?): actual footer=1768776044 vs expected footer=-1071082520
(resource=_br_Lucene50_0.pos)
{noformat}
In Oak, even though there are multiple IndexWriters, in practice only a single
thread in the cluster writes to an index.
http://markmail.org/thread/2awr5or54vpexzx2
In MCF, I think we have three alternatives.
* Use LockManager.enterWriteLock() in multiprocess mode to take a global lock
and guarantee a single writer while writing.
(But it didn't work when I tried it; maybe my code was wrong. Also, a single
writer gives up the speed of multiple concurrent indexers, so I don't want to
use that.)
* Use RMI.
(As far as I can see there is no other real option at this time, but it will
take a lot of time to implement.)
* Don't support multiprocess mode in this connector unless MCF supports
removeDocument per process.
(Does this violate MCF's multiprocess specification?)
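For illustration only, the first alternative (a global write lock that
guarantees a single writer) can be sketched roughly as below. This is a
minimal, single-JVM sketch under stated assumptions: GlobalLock and FakeIndex
are hypothetical stand-ins, not the ManifoldCF lock manager or a Lucene
IndexWriter, and only the method names enterWriteLock/leaveWriteLock are
borrowed to mirror the idea. A real multiprocess setup would need a lock that
is shared across processes.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.TimeUnit;
import java.util.concurrent.atomic.AtomicInteger;
import java.util.concurrent.locks.ReentrantReadWriteLock;

public class SingleWriterSketch {

    // Hypothetical stand-in for a cluster-wide lock. It only works inside one JVM.
    static final class GlobalLock {
        private final ReentrantReadWriteLock lock = new ReentrantReadWriteLock();
        void enterWriteLock() { lock.writeLock().lock(); }
        void leaveWriteLock() { lock.writeLock().unlock(); }
    }

    // Hypothetical stand-in for the shared index. It is deliberately not
    // thread-safe on its own, and it records how many writers were inside add() at once.
    static final class FakeIndex {
        final List<String> docs = new ArrayList<>();
        final AtomicInteger activeWriters = new AtomicInteger();
        final AtomicInteger maxConcurrentWriters = new AtomicInteger();

        void add(String id) {
            int now = activeWriters.incrementAndGet();
            maxConcurrentWriters.accumulateAndGet(now, Math::max);
            docs.add(id);
            activeWriters.decrementAndGet();
        }
    }

    // Starts `workers` threads that each add `docsPerWorker` documents,
    // taking the global write lock around every single write.
    static FakeIndex run(int workers, int docsPerWorker) {
        GlobalLock lock = new GlobalLock();
        FakeIndex index = new FakeIndex();
        ExecutorService pool = Executors.newFixedThreadPool(workers);
        for (int w = 0; w < workers; w++) {
            final int worker = w;
            pool.submit(() -> {
                for (int d = 0; d < docsPerWorker; d++) {
                    lock.enterWriteLock();
                    try {
                        index.add("w" + worker + "-d" + d);
                    } finally {
                        lock.leaveWriteLock();
                    }
                }
            });
        }
        pool.shutdown();
        try {
            if (!pool.awaitTermination(30, TimeUnit.SECONDS)) {
                throw new IllegalStateException("workers did not finish in time");
            }
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            throw new IllegalStateException(e);
        }
        return index;
    }

    public static void main(String[] args) {
        FakeIndex index = run(4, 1000);
        System.out.println("documents written: " + index.docs.size());
        System.out.println("max concurrent writers: " + index.maxConcurrentWriters.get());
    }
}
```

Because every write goes through the same lock, no two workers are ever inside
add() at the same time. That is also the drawback noted above: the writes are
fully serialized, so the speed of parallel indexing is lost.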
I'm likely to give up on this connector unless I get some help. I'll postpone
this ticket for the time being.
> Lucene Output Connector
> -----------------------
>
> Key: CONNECTORS-1219
> URL: https://issues.apache.org/jira/browse/CONNECTORS-1219
> Project: ManifoldCF
> Issue Type: New Feature
> Reporter: Shinichiro Abe
> Assignee: Shinichiro Abe
> Attachments: CONNECTORS-1219-v0.1patch.patch,
> CONNECTORS-1219-v0.2.patch, CONNECTORS-1219-v0.3.patch
>
>
> An output connector that writes directly to a local Lucene index, not via a
> remote search engine. It would be nice if we could use the various Lucene
> APIs on the index directly, even though we could do the same thing with a
> Solr or Elasticsearch index. I assume we could do things like classification,
> categorization, and tagging, using e.g. the lucene-classification package.