[ https://issues.apache.org/jira/browse/CONNECTORS-1219?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14613588#comment-14613588 ]

Shinichiro Abe commented on CONNECTORS-1219:
--------------------------------------------

I created a branch (r1689110) and added some fixes (r1689113).

The branch has the following known issues and limitations:
* Indexing large content with parallel output connections may cause an 
OutOfMemoryError (OOM). To avoid this:
** reduce the throttling size
** make Tika truncate content at a limit (not implemented)
** turn term vectors off (not implemented)
* Online schema changes are not reflected until IConnector.disconnect() is 
called or the connection expires in poll().
* The analyzer resource path is not implemented.
* Field types other than string and text are not implemented.
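
As a rough illustration of the "truncate content at a limit" idea above, here 
is a minimal plain-Java sketch. It is not code from the branch and does not 
use Tika; the class name ContentLimiter and the method readLimited are 
hypothetical, and it only shows the general technique of reading at most a 
fixed number of characters from a Reader before handing the text to the 
indexer.

```java
import java.io.IOException;
import java.io.Reader;
import java.io.StringReader;

// Hypothetical helper, not part of the branch: caps how much text is
// buffered in memory for a single document.
public class ContentLimiter {

    /** Reads at most maxChars characters from the reader and returns them. */
    public static String readLimited(Reader reader, int maxChars) throws IOException {
        StringBuilder sb = new StringBuilder();
        char[] buf = new char[4096];
        int remaining = maxChars;
        while (remaining > 0) {
            // Never request more than the remaining budget.
            int n = reader.read(buf, 0, Math.min(buf.length, remaining));
            if (n == -1) {
                break; // end of stream reached before the limit
            }
            sb.append(buf, 0, n);
            remaining -= n;
        }
        return sb.toString();
    }

    public static void main(String[] args) throws IOException {
        String text = readLimited(new StringReader("a large document body"), 7);
        System.out.println(text); // prints "a large"
    }
}
```

With a cap like this, the memory held per document is bounded by the limit, 
regardless of how many output connections index in parallel.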

Please review the branch. Running LuceneClientTest.java with Maven and 
inspecting the index with the Luke index browser might be helpful for testing. 
Thank you.

> Lucene Output Connector
> -----------------------
>
>                 Key: CONNECTORS-1219
>                 URL: https://issues.apache.org/jira/browse/CONNECTORS-1219
>             Project: ManifoldCF
>          Issue Type: New Feature
>            Reporter: Shinichiro Abe
>            Assignee: Shinichiro Abe
>         Attachments: CONNECTORS-1219-v0.1patch.patch
>
>
> An output connector that writes directly to a local Lucene index, not via a 
> remote search engine. It would be nice if we could use Lucene's various APIs 
> on the index directly, even though we could do the same thing with a Solr or 
> Elasticsearch index. I assume we could do things such as classification, 
> categorization, and tagging, using e.g. the lucene-classification package.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)
