Karl Wright created CONNECTORS-946:
--------------------------------------

             Summary: Add support for pipeline connector
                 Key: CONNECTORS-946
                 URL: https://issues.apache.org/jira/browse/CONNECTORS-946
             Project: ManifoldCF
          Issue Type: New Feature
          Components: Framework crawler agent
    Affects Versions: ManifoldCF 1.7
            Reporter: Karl Wright
            Assignee: Karl Wright
             Fix For: ManifoldCF 1.7


In the Amazon Search Connector, we finally found an example of an output 
connector that needs to do full document processing in order to work.  This 
ticket covers the framework work needed to introduce the concept of a "pipeline 
connector".  Pipeline connections would receive RepositoryDocument objects and 
transform them into new RepositoryDocument objects.  There would be a single 
important method:

{code}
public void transformDocument(RepositoryDocument rd, ITransformationActivities activities)
  throws ServiceInterruption, ManifoldCFException;
{code}

... where ITransformationActivities would include a method that would send a 
RepositoryDocument object onward to either the output connection or to the next 
pipeline connection.
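
As a rough illustration of the method above, here is a minimal sketch of one 
pipeline stage.  Everything other than the transformDocument signature is an 
assumption: the classes below are simplified stand-ins for the real framework 
types, sendDocument is a hypothetical name for the "send onward" method, and 
PipelineConnector, UppercaseTitleConnector and CollectingActivities are invented 
for the example.

```java
import java.util.ArrayList;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Locale;
import java.util.Map;

// Simplified stand-ins for the framework types named in the ticket.
// The real ManifoldCF classes are richer than this.
class ManifoldCFException extends Exception {
  ManifoldCFException(String message) { super(message); }
}

class ServiceInterruption extends Exception {
  ServiceInterruption(String message) { super(message); }
}

class RepositoryDocument {
  private final Map<String, String> fields = new LinkedHashMap<>();
  void addField(String name, String value) { fields.put(name, value); }
  String getField(String name) { return fields.get(name); }
  Map<String, String> getFields() { return fields; }
}

interface ITransformationActivities {
  // Hypothetical method name: passes the document to the next pipeline
  // connection, or to the output connection if this is the last stage.
  void sendDocument(RepositoryDocument rd)
    throws ServiceInterruption, ManifoldCFException;
}

// Hypothetical interface holding the single method proposed in the ticket.
interface PipelineConnector {
  void transformDocument(RepositoryDocument rd, ITransformationActivities activities)
    throws ServiceInterruption, ManifoldCFException;
}

// Example stage: copies every field into a new document, upper-cases the
// "title" field if present, and sends the new document onward.
class UppercaseTitleConnector implements PipelineConnector {
  @Override
  public void transformDocument(RepositoryDocument rd, ITransformationActivities activities)
    throws ServiceInterruption, ManifoldCFException {
    RepositoryDocument out = new RepositoryDocument();
    for (Map.Entry<String, String> e : rd.getFields().entrySet()) {
      out.addField(e.getKey(), e.getValue());
    }
    String title = out.getField("title");
    if (title != null) {
      out.addField("title", title.toUpperCase(Locale.ROOT));
    }
    activities.sendDocument(out);
  }
}

// Test double standing in for the framework: records what was sent onward.
class CollectingActivities implements ITransformationActivities {
  final List<RepositoryDocument> received = new ArrayList<>();
  @Override
  public void sendDocument(RepositoryDocument rd) { received.add(rd); }
}
```

In this sketch the stage builds a fresh RepositoryDocument rather than mutating 
its input, which matches the "transform them to new RepositoryDocument objects" 
wording; whether the real framework would require that is not settled here.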

Each pipeline connection would have:
- A name
- A description
- Configuration data
- An optional prerequisite pipeline connection

Every output connection would have a new field, which is an optional 
prerequisite pipeline connection.
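
To make the prerequisite chaining concrete, here is a small sketch of how the 
stages for one output connection could be resolved into execution order.  It 
assumes that a prerequisite runs before the connection that names it; 
PipelineConnectionDef and PipelineChainResolver are hypothetical names, not 
framework classes.

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.HashSet;
import java.util.List;
import java.util.Map;
import java.util.Set;

// Hypothetical definition record with the four properties listed above.
class PipelineConnectionDef {
  final String name;
  final String description;
  final Map<String, String> configuration;
  final String prerequisite; // name of the prerequisite connection, or null

  PipelineConnectionDef(String name, String description,
                        Map<String, String> configuration, String prerequisite) {
    this.name = name;
    this.description = description;
    this.configuration = configuration;
    this.prerequisite = prerequisite;
  }
}

class PipelineChainResolver {
  // Walks backward from the output connection's prerequisite through each
  // pipeline connection's own prerequisite, then reverses the result so the
  // returned list is in execution order (first stage first).
  static List<String> resolve(String outputPrerequisite,
                              Map<String, PipelineConnectionDef> defs) {
    List<String> chain = new ArrayList<>();
    Set<String> seen = new HashSet<>();
    String current = outputPrerequisite;
    while (current != null) {
      if (!seen.add(current)) {
        throw new IllegalStateException("Prerequisite cycle at " + current);
      }
      PipelineConnectionDef def = defs.get(current);
      if (def == null) {
        throw new IllegalArgumentException("Unknown pipeline connection: " + current);
      }
      chain.add(current);
      current = def.prerequisite;
    }
    Collections.reverse(chain);
    return chain;
  }
}
```

An output connection with no prerequisite resolves to an empty chain, so 
documents would go straight to the output connection as they do today.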

This design is based loosely on how mapping connections and authority 
connections interrelate.  An alternative design would involve per-job 
specification information, but I think this would wind up being far too complex 
for very little benefit, since each pipeline connection/stage would be expected 
to do relatively simple, granular things that usually do not involve interaction 
with an external system.



--
This message was sent by Atlassian JIRA
(v6.2#6252)
