[jira] [Commented] (CONNECTORS-962) Support multiple output connections for a single job

Rafa Haro (JIRA) Thu, 12 Jun 2014 03:15:29 -0700

    [ 
https://issues.apache.org/jira/browse/CONNECTORS-962?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14028992#comment-14028992
 ]


Rafa Haro commented on CONNECTORS-962:
--------------------------------------

Hi [[email protected]]. Although the use case of multiple processing 
pipelines with different outputs is also very useful for us, our current 
approach actually generates different repository documents from the 
original one as the result of a SINGLE processing pipeline. So rather than 
sending the same stream to different processing pipelines, we get different 
document representations as the result of an enhancement process, each 
document corresponding to a different output connector. That is how we are 
doing the trick right now (although it was a quick and dirty solution). You 
probably don't want to consider it right now, but Apache Camel is a great 
resource for achieving the processing architecture that you seem to be aiming 
for, because it allows you to easily define different processing routes, such 
as sending the same event to different pipelines.
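
To make the distinction concrete, here is a minimal plain-Java sketch of the "one pipeline, several derived documents" idea. It is only an illustration, not the code described above: the class name, the enhance() method and the two output names are hypothetical, and it uses neither ManifoldCF nor Camel APIs.

```java
import java.util.LinkedHashMap;
import java.util.Map;

public class PerOutputDocumentsSketch {

    /**
     * Hypothetical enhancement step: a single pass over the original
     * document yields one derived representation per target output.
     */
    public static Map<String, String> enhance(String original) {
        Map<String, String> byOutput = new LinkedHashMap<>();
        // Representation intended for a full-text output: the text as-is.
        byOutput.put("fulltext-output", original);
        // Representation intended for a metadata-only output: a crude summary.
        byOutput.put("metadata-output", "length=" + original.length());
        return byOutput;
    }

    public static void main(String[] args) {
        // Each derived document would then be handed to its own output connector.
        enhance("Some crawled text").forEach(
            (output, doc) -> System.out.println(output + " <- " + doc));
    }
}
```

The point is that the fan-out happens at the end of a single pipeline, rather than by feeding the same stream into several pipelines.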

> Support multiple output connections for a single job
> ----------------------------------------------------
>
>                 Key: CONNECTORS-962
>                 URL: https://issues.apache.org/jira/browse/CONNECTORS-962
>             Project: ManifoldCF
>          Issue Type: Improvement
>          Components: Framework crawler agent
>    Affects Versions: ManifoldCF 1.7
>            Reporter: Karl Wright
>            Assignee: Karl Wright
>             Fix For: ManifoldCF 1.7
>
>
> Zaizi has a requirement to support multiple outputs for a single job.  In 
> theory this requirement can be met by doing the following:
> - Allow multiple output connections, and multiple pipelines, per job
> - Keep a distinct ingeststatus record for each document/output combination
> - Modify WorkerThread to call IncrementalIndexer multiple times for every 
> document fetched
> Places where different things need to happen are:
> - RepositoryDocument - because one binary stream will not do for multiple 
> outputs
> - UI, obviously, because there will need to be multiple pipelines, not just 
> one, and in addition it would probably be important to be able to "split" the 
> pipeline at arbitrary points
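
The points in the quoted description (one indexing call per output, one status record per document/output combination, and a single binary stream not being enough for several outputs) can be sketched roughly as follows. This is a self-contained illustration under stated assumptions: Output, countingOutput and indexToAll are hypothetical names invented for the example, not ManifoldCF's WorkerThread, IncrementalIndexer or RepositoryDocument, and buffering the whole document in memory is just one simple way to give each output its own stream.

```java
import java.io.ByteArrayInputStream;
import java.io.IOException;
import java.io.InputStream;
import java.nio.charset.StandardCharsets;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

public class MultiOutputSketch {

    /** Hypothetical stand-in for an output connection; not a ManifoldCF interface. */
    public interface Output {
        String name();
        /** Consumes the stream and returns a status string for bookkeeping. */
        String index(String documentId, InputStream content) throws IOException;
    }

    /** Toy output that only counts the bytes it receives. */
    public static Output countingOutput(String name) {
        return new Output() {
            public String name() { return name; }
            public String index(String documentId, InputStream content) throws IOException {
                return "indexed " + content.readAllBytes().length + " bytes";
            }
        };
    }

    /**
     * Reads the source stream once, then calls every output with a fresh
     * stream and records one status per document/output pair.
     */
    public static Map<String, String> indexToAll(
            String documentId, InputStream source, List<Output> outputs) throws IOException {
        // A stream can be consumed only once, so buffer it (Java 9+ readAllBytes).
        byte[] bytes = source.readAllBytes();
        Map<String, String> statusByPair = new LinkedHashMap<>();
        for (Output output : outputs) {
            try (InputStream fresh = new ByteArrayInputStream(bytes)) {
                statusByPair.put(documentId + "|" + output.name(),
                                 output.index(documentId, fresh));
            }
        }
        return statusByPair;
    }

    public static void main(String[] args) throws IOException {
        InputStream source =
            new ByteArrayInputStream("hello".getBytes(StandardCharsets.UTF_8));
        Map<String, String> status = indexToAll(
            "doc-1", source, List.of(countingOutput("first"), countingOutput("second")));
        status.forEach((pair, s) -> System.out.println(pair + " -> " + s));
    }
}
```

For large documents, in-memory buffering would likely need to be replaced by a temporary file or by re-fetching the content; the sketch only shows why each output needs its own readable copy.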



--
This message was sent by Atlassian JIRA
(v6.2#6252)
