[jira] [Updated] (CONNECTORS-1219) Lucene Output Connector

Shinichiro Abe (JIRA) Thu, 02 Jul 2015 10:00:47 -0700

     [ 
https://issues.apache.org/jira/browse/CONNECTORS-1219?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]


Shinichiro Abe updated CONNECTORS-1219:
---------------------------------------
    Attachment: CONNECTORS-1219-v0.1patch.patch

This is a strawman patch; it still needs more improvement.
I think this connector will need a lot of heap memory to work well.
Where are the memory problems you mentioned? Is it multiple threads writing to
one index? If so, I have taken that into account, as described below.
In the Tika connector, on the other hand, BodyContentHandler should be replaced 
with WriteOutContentHandler, because any connector might have to handle a very 
large string object. WriteOutContentHandler has a writeLimit parameter and has 
been used by the Tika facade and by Jackrabbit Oak's Solr integration to avoid 
consuming too much memory. 
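
To make the write-limit idea concrete, here is a minimal sketch that needs only the JDK. LimitedTextHandler is a hypothetical stand-in, not Tika's WriteOutContentHandler, and its method names are assumptions made for illustration. It only shows how a SAX handler can cap the number of characters it buffers and abort the parse once the cap is hit.

```java
import org.xml.sax.SAXException;
import org.xml.sax.helpers.DefaultHandler;

// Hypothetical stand-in, NOT Tika's WriteOutContentHandler: a SAX handler
// that buffers at most writeLimit characters and then aborts the parse.
class LimitedTextHandler extends DefaultHandler {
  private final StringBuilder buffer = new StringBuilder();
  private final int writeLimit;
  private boolean limitReached = false;

  LimitedTextHandler(int writeLimit) {
    if (writeLimit < 0) {
      throw new IllegalArgumentException("writeLimit must not be negative");
    }
    this.writeLimit = writeLimit;
  }

  @Override
  public void characters(char[] ch, int start, int length) throws SAXException {
    // The buffer never grows past writeLimit, so this is never negative.
    int remaining = writeLimit - buffer.length();
    if (length <= remaining) {
      buffer.append(ch, start, length);
      return;
    }
    // Keep only what still fits, then stop the parser so the rest of the
    // document text is never accumulated in memory.
    buffer.append(ch, start, remaining);
    limitReached = true;
    throw new SAXException("write limit of " + writeLimit + " characters reached");
  }

  boolean isLimitReached() {
    return limitReached;
  }

  String text() {
    return buffer.toString();
  }
}
```

A caller would catch the SAXException, check isLimitReached(), and then index the truncated text() rather than failing the whole document.
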
Also, I plan to introduce mcf-search-api-service.war based on this 
connector, since MCF would then be able to have a search engine with the 
pull-agent; it's just an idea of mine, though. As for Lucene memory, multiple 
connections of this connector share one client instance per local path for 
these reasons, and I also have an idea to use that instance from the 
search-api side.
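
The "one client instance per local path" sharing can be sketched as follows. This is not code from the patch: SharedClientRegistry, clientFor, and the factory parameter are hypothetical names, and the client type is left generic. The motivation is that a Lucene index directory can be held open for writing by only one IndexWriter at a time, so connections that point at the same path need to reuse one object.

```java
import java.nio.file.Path;
import java.nio.file.Paths;
import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.ConcurrentMap;
import java.util.function.Function;

// Hypothetical registry, not taken from the patch: hands out one shared
// client object per normalized local index path.
class SharedClientRegistry<C> {
  private final ConcurrentMap<Path, C> clients = new ConcurrentHashMap<>();
  private final Function<Path, C> factory;

  SharedClientRegistry(Function<Path, C> factory) {
    this.factory = factory;
  }

  C clientFor(String localPath) {
    // Normalize so that "index/./a" and "index/a" map to the same client.
    Path key = Paths.get(localPath).toAbsolutePath().normalize();
    // computeIfAbsent runs the factory at most once per key, even when
    // several connection threads ask for the same path concurrently.
    return clients.computeIfAbsent(key, factory);
  }

  int size() {
    return clients.size();
  }
}
```

In a real connector the factory would open the index writer for the given path, and the registry would also need a way to close and remove clients when the last connection using a path goes away; that part is left out here.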

> Lucene Output Connector
> -----------------------
>
>                 Key: CONNECTORS-1219
>                 URL: https://issues.apache.org/jira/browse/CONNECTORS-1219
>             Project: ManifoldCF
>          Issue Type: New Feature
>            Reporter: Shinichiro Abe
>            Assignee: Shinichiro Abe
>         Attachments: CONNECTORS-1219-v0.1patch.patch
>
>
> An output connector that writes to a local Lucene index directly, not via a 
> remote search engine. It would be nice if we could use Lucene's various APIs 
> against the index directly, even though we could do the same thing with a 
> Solr or Elasticsearch index. I assume we could do classification, 
> categorization, and tagging using, e.g., the lucene-classification package.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)
