Karl Wright created CONNECTORS-946:
--------------------------------------
Summary: Add support for pipeline connector
Key: CONNECTORS-946
URL: https://issues.apache.org/jira/browse/CONNECTORS-946
Project: ManifoldCF
Issue Type: New Feature
Components: Framework crawler agent
Affects Versions: ManifoldCF 1.7
Reporter: Karl Wright
Assignee: Karl Wright
Fix For: ManifoldCF 1.7
In the Amazon Search Connector, we finally found an example of an output
connector that needs to do full document processing in order to work. This
ticket covers the framework work needed to introduce the concept of a "pipeline
connector". Pipeline connections would receive RepositoryDocument objects and
transform them into new RepositoryDocument objects. There would be a single
important method:
{code}
public void transformDocument(RepositoryDocument rd, ITransformationActivities activities)
  throws ServiceInterruption, ManifoldCFException;
{code}
... where ITransformationActivities would include a method that would send a
RepositoryDocument object onward to either the output connection or to the next
pipeline connection.
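To make the proposed contract concrete, here is a minimal, self-contained sketch of how one pipeline stage might look. It is only an illustration under stated assumptions: the classes below are simplified stand-ins for the framework types, and the names IPipelineConnector, sendDocument, and UppercaseTitleStage are hypothetical, since the ticket says only that ITransformationActivities "would include a method" for forwarding a document.

```java
import java.util.Collections;
import java.util.HashMap;
import java.util.Locale;
import java.util.Map;

// Simplified stand-ins for the framework types named in the ticket.
// The real ManifoldCF classes carry far more state than this.
class RepositoryDocument {
  private final Map<String, String> fields = new HashMap<>();
  void addField(String name, String value) { fields.put(name, value); }
  String getField(String name) { return fields.get(name); }
  Map<String, String> getFields() { return Collections.unmodifiableMap(fields); }
}

class ServiceInterruption extends Exception {
  ServiceInterruption(String message) { super(message); }
}

class ManifoldCFException extends Exception {
  ManifoldCFException(String message) { super(message); }
}

// Assumption: the forwarding method is called sendDocument. The ticket does
// not name it.
interface ITransformationActivities {
  void sendDocument(RepositoryDocument rd)
    throws ServiceInterruption, ManifoldCFException;
}

// Assumption: a hypothetical interface name for a pipeline connector.
interface IPipelineConnector {
  void transformDocument(RepositoryDocument rd, ITransformationActivities activities)
    throws ServiceInterruption, ManifoldCFException;
}

// Hypothetical stage: builds a new document whose "title" field is upper-cased,
// then hands it to whatever comes next (another stage or the output connection).
class UppercaseTitleStage implements IPipelineConnector {
  @Override
  public void transformDocument(RepositoryDocument rd, ITransformationActivities activities)
      throws ServiceInterruption, ManifoldCFException {
    RepositoryDocument out = new RepositoryDocument();
    for (Map.Entry<String, String> e : rd.getFields().entrySet()) {
      out.addField(e.getKey(), e.getValue());
    }
    String title = rd.getField("title");
    if (title != null) {
      out.addField("title", title.toUpperCase(Locale.ROOT));
    }
    activities.sendDocument(out);
  }
}
```

The stage never needs to know whether the next hop is another pipeline connection or the output connection. It only calls the activities object, which is what keeps each stage small and composable.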
Each pipeline connection would have:
- A name
- A description
- Configuration data
- An optional prerequisite pipeline connection
Every output connection would have a new field, which is an optional
prerequisite pipeline connection.
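As a rough sketch of how the prerequisite links could be resolved into an execution order, the following walks backward from the output connection's prerequisite and reverses the result. It assumes that a "prerequisite" stage runs before the connection that names it, which the ticket implies but does not spell out. PipelineConnection and PipelineChain are hypothetical names, and configuration data is omitted for brevity.

```java
import java.util.ArrayDeque;
import java.util.ArrayList;
import java.util.Deque;
import java.util.HashSet;
import java.util.List;
import java.util.Map;
import java.util.Set;

// Hypothetical record of a pipeline connection as described in the ticket
// (configuration data left out of this sketch).
class PipelineConnection {
  final String name;
  final String description;
  final String prerequisite; // name of the prerequisite connection, or null

  PipelineConnection(String name, String description, String prerequisite) {
    this.name = name;
    this.description = description;
    this.prerequisite = prerequisite;
  }
}

class PipelineChain {
  // Returns stage names in the order documents would flow through them,
  // ending with the stage that feeds the output connection.
  static List<String> resolveOrder(String outputPrerequisite,
                                   Map<String, PipelineConnection> byName) {
    Deque<String> order = new ArrayDeque<>();
    Set<String> seen = new HashSet<>();
    String current = outputPrerequisite;
    while (current != null) {
      if (!seen.add(current)) {
        throw new IllegalStateException("Cycle in pipeline prerequisites at: " + current);
      }
      PipelineConnection pc = byName.get(current);
      if (pc == null) {
        throw new IllegalArgumentException("Unknown pipeline connection: " + current);
      }
      order.addFirst(current); // earlier prerequisites end up at the front
      current = pc.prerequisite;
    }
    return new ArrayList<>(order);
  }
}
```

Because each connection names at most one prerequisite, the result is always a simple linear chain, much like a linked list. The cycle check guards against a misconfiguration where two connections name each other.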
This design is based loosely on how mapping connections and authority
connections interrelate. An alternative design would involve having per-job
specification information, but I think this would wind up being way too complex
for very little benefit, since each pipeline connection/stage would be expected
to do relatively simple/granular things, not usually involving interaction with
an external system.