[ 
https://issues.apache.org/jira/browse/CONNECTORS-916?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14004453#comment-14004453
 ] 

Karl Wright commented on CONNECTORS-916:
----------------------------------------

Hi Takumi,

Yes, you have the basic idea.  My detailed writeup:

- We keep a local repository on disk.  Each document in the repository consists 
of: (key, deleted/not_deleted, amazon_data).  The key is the Amazon key, i.e. 
the document URI (or its hash).  deleted/not_deleted is a flag which is set if 
this record represents a deletion.  amazon_data is all the data needed for 
transmission to Amazon (JSON?).
- addOrReplaceDocument() adds or replaces the document in the local repository 
only, and clears the deleted/not_deleted flag.
- deleteDocument() adds or replaces the document's record in the local 
repository only, and sets the deleted/not_deleted flag.
- There is a method called transmit().  transmit() sends a chunk of documents 
from the local repository to Amazon, say 1000 at a time.  If transmit() is 
successful, all documents it sent are removed from the local repository.  
Otherwise they are left in place.
- transmit() returns false if there are no more documents to be transmitted.  
notifyOfCompletion() calls transmit() until either it throws an exception or 
it returns false.
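
The scheme above can be sketched roughly as follows.  This is only an 
illustration under stated assumptions, not the connector's actual code: the 
class name LocalRepositorySketch, the Entry and Sender types, and 
pendingCount() are all hypothetical, the repository is held in memory rather 
than on disk, failures are signalled with unchecked exceptions, and transmit() 
is assumed to return false when it finds nothing left to send.

```java
import java.util.ArrayList;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

/** Hypothetical sketch of the local-repository batching scheme. */
class LocalRepositorySketch {
  /** One record: (key, deleted/not_deleted, amazon_data). */
  static final class Entry {
    final String key;
    final boolean deleted;
    final String amazonData; // null when the record represents a deletion

    Entry(String key, boolean deleted, String amazonData) {
      this.key = key;
      this.deleted = deleted;
      this.amazonData = amazonData;
    }
  }

  /** Stand-in for the upload to Amazon; throws an unchecked exception on failure. */
  interface Sender {
    void send(List<Entry> batch);
  }

  // In-memory stand-in for the on-disk repository (assumption for brevity).
  private final Map<String, Entry> entries = new LinkedHashMap<>();
  private final Sender sender;
  private final int chunkSize;

  LocalRepositorySketch(Sender sender, int chunkSize) {
    this.sender = sender;
    this.chunkSize = chunkSize;
  }

  /** Touches the local repository only; clears the deleted flag. */
  void addOrReplaceDocument(String key, String amazonData) {
    entries.put(key, new Entry(key, false, amazonData));
  }

  /** Touches the local repository only; sets the deleted flag. */
  void deleteDocument(String key) {
    entries.put(key, new Entry(key, true, null));
  }

  /** Sends one chunk; returns false if there was nothing to transmit. */
  boolean transmit() {
    if (entries.isEmpty()) {
      return false;
    }
    List<Entry> batch = new ArrayList<>();
    for (Entry e : entries.values()) {
      if (batch.size() >= chunkSize) {
        break;
      }
      batch.add(e);
    }
    // If send() throws, the entries are left in the repository.
    sender.send(batch);
    for (Entry e : batch) {
      entries.remove(e.key);
    }
    return true;
  }

  /** Drains the repository, stopping on an exception or when transmit() returns false. */
  void notifyOfCompletion() {
    while (transmit()) {
      // keep sending chunks
    }
  }

  int pendingCount() {
    return entries.size();
  }
}
```

Because an add followed by a delete of the same key simply overwrites the 
record, only the latest state of each document is ever sent, and a failed 
chunk is retried on the next notifyOfCompletion() call.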



> Amazon CloudSearch output connector
> -----------------------------------
>
>                 Key: CONNECTORS-916
>                 URL: https://issues.apache.org/jira/browse/CONNECTORS-916
>             Project: ManifoldCF
>          Issue Type: New Feature
>          Components: Amazon CloudSearch output connector
>    Affects Versions: ManifoldCF 1.7
>            Reporter: Takumi Yoshida
>            Assignee: Karl Wright
>             Fix For: ManifoldCF 1.7
>
>         Attachments: 0507.diff, 0520.diff, 0520_2.diff, 1.patch, 2.diff, 
> 3.diff, AmazonCloudSearchParam.java, AmazonCloudSearchSpecs.java, 
> exception_handling.diff, exception_handling_2.diff, licenselist.txt
>
>
> I wrote some code snippets for an output connector for Amazon CloudSearch.
> I would like you to review my code. You can crawl a web site and feed HTML 
> pages to Amazon CloudSearch.
> However, it is not yet complete, for the following reasons:
> - No code has been written for the configuration page.
> - The only supported file type is HTML.
> Thank you for your time,
>  Takumi Yoshida



--
This message was sent by Atlassian JIRA
(v6.2#6252)
