[ 
https://issues.apache.org/jira/browse/CONNECTORS-946?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14011436#comment-14011436
 ] 

Karl Wright commented on CONNECTORS-946:
----------------------------------------

Here's a more concrete example, for IRepositoryConnector.  The following 
methods have a new canonical signature:

{code}
  /** Output the specification header section.
  * This method is called in the head section of a job page which has selected a repository connection of the
  * current type.  Its purpose is to add the required tabs to the list, and to output any javascript methods
  * that might be needed by the job editing HTML.
  * The connector will be connected before this method can be called.
  *@param out is the output to which any HTML should be sent.
  *@param locale is the locale the output is preferred to be in.
  *@param ds is the current document specification for this job.
  *@param connectionSequenceNumber is the unique number of this connection within the job.
  *@param tabsArray is an array of tab names.  Add to this array any tab names that are specific to the connector.
  */
  public void outputSpecificationHeader(IHTTPOutput out, Locale locale, DocumentSpecification ds,
    int connectionSequenceNumber, List<String> tabsArray)
    throws ManifoldCFException, IOException;
  
  /** Output the specification body section.
  * This method is called in the body section of a job page which has selected a repository connection of the
  * current type.  Its purpose is to present the required form elements for editing.
  * The coder can presume that the HTML that is output from this configuration will be within appropriate
  *  <html>, <body>, and <form> tags.  The name of the form is always "editjob".
  * The connector will be connected before this method can be called.
  *@param out is the output to which any HTML should be sent.
  *@param locale is the locale the output is preferred to be in.
  *@param ds is the current document specification for this job.
  *@param connectionSequenceNumber is the unique number of this connection within the job.
  *@param actualSequenceNumber is the connection within the job that has currently been selected.
  *@param tabName is the current tab name.  (actualSequenceNumber, tabName) form a unique tuple within
  *  the job.
  */
  public void outputSpecificationBody(IHTTPOutput out, Locale locale, DocumentSpecification ds,
    int connectionSequenceNumber, int actualSequenceNumber, String tabName)
    throws ManifoldCFException, IOException;
  
  /** Process a specification post.
  * This method is called at the start of job's edit or view page, whenever there is a possibility that form
  * data for a connection has been posted.  Its purpose is to gather form information and modify the
  * document specification accordingly.  The name of the posted form is always "editjob".
  * The connector will be connected before this method can be called.
  *@param variableContext contains the post data, including binary file-upload information.
  *@param locale is the locale the output is preferred to be in.
  *@param ds is the current document specification for this job.
  *@param connectionSequenceNumber is the unique number of this connection within the job.
  *@return null if all is well, or a string error message if there is an error that should prevent saving of
  * the job (and cause a redirection to an error page).
  */
  public String processSpecificationPost(IPostParameters variableContext, Locale locale, DocumentSpecification ds,
    int connectionSequenceNumber)
    throws ManifoldCFException;
  
  /** View specification.
  * This method is called in the body section of a job's view page.  Its purpose is to present the document
  * specification information to the user.  The coder can presume that the HTML that is output from
  * this configuration will be within appropriate <html> and <body> tags.
  * The connector will be connected before this method can be called.
  *@param out is the output to which any HTML should be sent.
  *@param locale is the locale the output is preferred to be in.
  *@param ds is the current document specification for this job.
  *@param connectionSequenceNumber is the unique number of this connection within the job.
  */
  public void viewSpecification(IHTTPOutput out, Locale locale, DocumentSpecification ds,
    int connectionSequenceNumber)
    throws ManifoldCFException, IOException;
{code}

Right now, outputSpecificationBody() typically does something like the
following:

if (tabName == 'XXX')
  -- do tab XXX
else
  -- do hiddens for XXX
  
if (tabName == 'YYY')
  -- do tab YYY
else
  -- do hiddens for YYY
  
Because the sequence number is *ignored* here, this relies on the tab names
XXX and YYY being unique, which is the current behavior.

Canonically, the description of a tab is as follows: (Tab name, optional 
sequence number)
The proper way to do it is thus:

if ((sequenceNumber == null || sequenceNumber == connectionSequenceNumber) &&
    tabName == 'XXX')
  -- do tab XXX for connectionSequenceNumber
else
  -- do hiddens for XXX for connectionSequenceNumber
  
etc.
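
The check above can be sketched in Java roughly as follows.  This is only an
illustration, not framework code: the class TabDispatchSketch and its methods
are hypothetical names, the optional sequence number is modeled as a nullable
Integer, and strings stand in for the HTML a connector would really write.

```java
import java.util.Objects;

/** Hypothetical, self-contained sketch of the tuple-based tab check. */
public class TabDispatchSketch {

  /**
   * Returns true when candidateTab of the connection identified by
   * connectionSequenceNumber is the currently selected tab.
   * A null sequenceNumber means "no sequence number was supplied", in which
   * case only the tab name is compared (the legacy behavior).
   */
  public static boolean isSelectedTab(Integer sequenceNumber, int connectionSequenceNumber,
      String tabName, String candidateTab) {
    boolean sequenceMatches =
        sequenceNumber == null || sequenceNumber.intValue() == connectionSequenceNumber;
    // Compare string contents, not references.
    return sequenceMatches && Objects.equals(tabName, candidateTab);
  }

  /** Describes what would be rendered: the full tab or only its hidden fields. */
  public static String renderSection(Integer sequenceNumber, int connectionSequenceNumber,
      String tabName, String candidateTab) {
    if (isSelectedTab(sequenceNumber, connectionSequenceNumber, tabName, candidateTab)) {
      return "tab " + candidateTab + " for connection " + connectionSequenceNumber;
    }
    return "hiddens " + candidateTab + " for connection " + connectionSequenceNumber;
  }

  public static void main(String[] args) {
    // No sequence number posted: the tab name alone decides.
    System.out.println(renderSection(null, 0, "XXX", "XXX"));
    // Sequence number selects connection 1, so connection 0 emits hiddens only.
    System.out.println(renderSection(1, 0, "XXX", "XXX"));
  }
}
```

Each connection would run the same check once per tab it owns, so exactly one
(sequence number, tab name) pair renders as a full tab and everything else is
carried along as hidden fields.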

The SelectTab() JavaScript method currently takes just the tab name.  Since
tabs are now identified by a tuple, we'll have two variants going forward:
SelectTab(tabName) and SelectTab(sequenceNumber,tabName).  All these do is set
the TabName form variable and, optionally, the SequenceNumber form variable,
and then repost the form.
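
On the receiving side, the two form variables could be turned back into the
tab tuple along these lines.  This is a minimal sketch under assumptions: a
plain Map stands in for the posted form data, the class SelectedTabSketch and
its members are hypothetical, and a missing or empty SequenceNumber is taken
to mean that SelectTab(tabName) was used.

```java
import java.util.HashMap;
import java.util.Map;

/** Hypothetical sketch: rebuild the (sequence number, tab name) tuple from a post. */
public class SelectedTabSketch {

  /** Immutable holder for the tuple; sequenceNumber is null when not posted. */
  public static final class SelectedTab {
    public final Integer sequenceNumber;
    public final String tabName;

    public SelectedTab(Integer sequenceNumber, String tabName) {
      this.sequenceNumber = sequenceNumber;
      this.tabName = tabName;
    }
  }

  /**
   * Reads the TabName and SequenceNumber form variables.
   * Throws NumberFormatException if SequenceNumber is present but not an integer.
   */
  public static SelectedTab fromPost(Map<String, String> form) {
    String tabName = form.get("TabName");
    String rawSequence = form.get("SequenceNumber");
    Integer sequenceNumber = null;
    if (rawSequence != null && !rawSequence.trim().isEmpty()) {
      sequenceNumber = Integer.valueOf(rawSequence.trim());
    }
    return new SelectedTab(sequenceNumber, tabName);
  }

  public static void main(String[] args) {
    Map<String, String> form = new HashMap<>();
    form.put("TabName", "XXX");
    form.put("SequenceNumber", "2");
    SelectedTab selected = fromPost(form);
    System.out.println(selected.sequenceNumber + ", " + selected.tabName);
  }
}
```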


> Add support for pipeline connector
> ----------------------------------
>
>                 Key: CONNECTORS-946
>                 URL: https://issues.apache.org/jira/browse/CONNECTORS-946
>             Project: ManifoldCF
>          Issue Type: New Feature
>          Components: Framework crawler agent
>    Affects Versions: ManifoldCF 1.7
>            Reporter: Karl Wright
>            Assignee: Karl Wright
>             Fix For: ManifoldCF 1.7
>
>
> In the Amazon Search Connector, we finally found an example of an output 
> connector that needed to do full document processing in order to work.  This 
> ticket represents work in the framework to create a concept of "pipeline 
> connector".  Pipeline connections would receive RepositoryDocument objects, 
> and transform them to new RepositoryDocument objects.  There would be a 
> single important method:
> {code}
> public void transformDocument(RepositoryDocument rd, 
> ITransformationActivities activities) throws ServiceInterruption, 
> ManifoldCFException;
> {code}
> ... where ITransformationActivities would include a method that would send a 
> RepositoryDocument object onward to either the output connection or to the 
> next pipeline connection.
> Each pipeline connection would have:
> - A name
> - A description
> - Configuration data
> - An optional prerequisite pipeline connection
> Every output connection would have a new field, which is an optional 
> prerequisite pipeline connection.
> This design is based loosely on how mapping connections and authority 
> connections interrelate.  An alternate design would involve having per-job 
> specification information, but I think this would wind up being way too 
> complex for very little benefit, since each pipeline connection/stage would 
> be expected to do relatively simple/granular things, not usually involving 
> interaction with an external system.



--
This message was sent by Atlassian JIRA
(v6.2#6252)
