[
https://issues.apache.org/jira/browse/CONNECTORS-1219?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14627952#comment-14627952
]
Karl Wright commented on CONNECTORS-1219:
-----------------------------------------
Hi Abe-san,
No, it is not necessary to serialize the IndexWriter. I think you may have
misunderstood the proposal. To make it clear:
(1) ALL Lucene activity would happen in one sidecar process, including the
Lucene searcher and a separate Jetty instance it would run under
(2) ManifoldCF would have multiple processes
(3) Communication between the ManifoldCF processes and the Lucene process would
be via a socket
(4) The socket protocol would either be Java-serialization-based RMI (which I
would need to research), or some other low-level protocol. The goal would be
to NOT use REST or XML or JSON or any other heavyweight, open protocol.
(5) The reason an open protocol is undesirable is that we definitely don't
want to reinvent ElasticSearch, Solr, or any other Lucene wrapper. The reason
to have a separate process, though, is that Lucene's memory and disk model is
inconsistent with ManifoldCF's.
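To illustrate points (3) and (4), here is a minimal sketch of what a
Java-serialization-based exchange over a socket might look like. Everything in
it is an assumption made for illustration: the names SidecarSketch,
AddDocumentRequest, Sidecar, and the "indexed:" acknowledgement are
hypothetical, the in-memory map only stands in for the real Lucene index, and
plain object streams are used rather than RMI, which still needs research.
The point is only that small data messages cross the socket, while the
IndexWriter itself never leaves the sidecar.

```java
import java.io.IOException;
import java.io.ObjectInputStream;
import java.io.ObjectOutputStream;
import java.io.Serializable;
import java.net.InetAddress;
import java.net.ServerSocket;
import java.net.Socket;
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;

// Hypothetical sketch: none of these names exist in ManifoldCF or Lucene.
class SidecarSketch {

  /** Hypothetical message: only plain data crosses the socket. */
  static final class AddDocumentRequest implements Serializable {
    private static final long serialVersionUID = 1L;
    final String id;
    final String body;
    AddDocumentRequest(String id, String body) {
      this.id = id;
      this.body = body;
    }
  }

  /** Stand-in for the sidecar process; the map replaces the real Lucene index. */
  static final class Sidecar implements Runnable {
    final ServerSocket server;
    final Map<String, String> index = new ConcurrentHashMap<>();

    Sidecar() throws IOException {
      // Port 0 asks the OS for any free port on the loopback interface.
      server = new ServerSocket(0, 1, InetAddress.getLoopbackAddress());
    }

    int port() {
      return server.getLocalPort();
    }

    @Override
    public void run() {
      // Serve exactly one connection and one request, then shut down.
      try (ServerSocket listener = server;
           Socket s = listener.accept();
           ObjectInputStream in = new ObjectInputStream(s.getInputStream());
           ObjectOutputStream out = new ObjectOutputStream(s.getOutputStream())) {
        AddDocumentRequest req = (AddDocumentRequest) in.readObject();
        index.put(req.id, req.body); // a real sidecar would call Lucene here
        out.writeObject("indexed:" + req.id);
        out.flush();
      } catch (IOException | ClassNotFoundException e) {
        throw new RuntimeException(e);
      }
    }
  }

  /** What a ManifoldCF-side process might do: send one request, read one ack. */
  static String send(int port, AddDocumentRequest req)
      throws IOException, ClassNotFoundException {
    try (Socket s = new Socket(InetAddress.getLoopbackAddress(), port);
         ObjectOutputStream out = new ObjectOutputStream(s.getOutputStream())) {
      out.writeObject(req);
      out.flush();
      // Open the input side only after sending, so neither end waits on the
      // other's stream header.
      try (ObjectInputStream in = new ObjectInputStream(s.getInputStream())) {
        return (String) in.readObject();
      }
    }
  }

  /** Runs both ends in one JVM for demonstration; real use would be two processes. */
  static String roundTrip(String id, String body) {
    try {
      Sidecar sidecar = new Sidecar();
      Thread t = new Thread(sidecar);
      t.start();
      String ack = send(sidecar.port(), new AddDocumentRequest(id, body));
      t.join();
      if (!body.equals(sidecar.index.get(id))) {
        throw new IllegalStateException("document was not stored by the sidecar");
      }
      return ack;
    } catch (IOException | ClassNotFoundException | InterruptedException e) {
      throw new RuntimeException(e);
    }
  }

  public static void main(String[] args) {
    System.out.println(roundTrip("doc-1", "hello")); // prints "indexed:doc-1"
  }
}
```

In this sketch the two ends share one JVM only so that it can be run as a
single file; in the proposal the Sidecar side would live in the Lucene process
and send() would be called from each ManifoldCF process.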
Does this make sense?
> Lucene Output Connector
> -----------------------
>
> Key: CONNECTORS-1219
> URL: https://issues.apache.org/jira/browse/CONNECTORS-1219
> Project: ManifoldCF
> Issue Type: New Feature
> Reporter: Shinichiro Abe
> Assignee: Shinichiro Abe
> Attachments: CONNECTORS-1219-v0.1patch.patch,
> CONNECTORS-1219-v0.2.patch, CONNECTORS-1219-v0.3.patch
>
>
> An output connector that writes to a local Lucene index directly, not via a
> remote search engine. It would be nice if we could use Lucene's various APIs
> on the index directly, even though we could do the same thing with a Solr or
> Elasticsearch index. I assume we could do classification, categorization, and
> tagging using, e.g., the lucene-classification package.