[jira] [Commented] (CONNECTORS-916) Amazon CloudSearch output connector

Takumi Yoshida (JIRA) Tue, 08 Apr 2014 08:08:49 -0700

    [ 
https://issues.apache.org/jira/browse/CONNECTORS-916?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13963065#comment-13963065
 ]


Takumi Yoshida commented on CONNECTORS-916:
-------------------------------------------

Hi Karl,

Thank you for your work and comments! I could build and run it in my 
environment on the new branch CONNECTOR-916!
Here are my answers to your questions.

> I know the license for the jackson libraries, but not for tagsoup. Do you 
> know what it is?
The license for tagsoup is the Apache License 2.0 
(http://home.ccil.org/~cowan/XML/tagsoup/).
Tika uses tagsoup to parse HTML documents and extract meta information and 
body text.

> (1) I take it that it is these parameters that need to be configurable:
Wow! That sounds great!

> Question: What kind of authorization is used for this? Where/how should that 
> be specified?
No authorization is required. On Amazon CloudSearch, clients that send 
documents are filtered by IP address, and a user can set which IP addresses 
are permitted or denied in the AWS console.

> (2) For the addOrReplaceDocument() method, the error handling simply prints 
> all exceptions. That's not a reasonable thing to do. Exceptions should either 
> result in a document rejection, or a ServiceInterruption exception, or a 
> ManifoldCFException, depending on what you want ManifoldCF to do when it 
> received one.
You are absolutely correct. I thought that in all of these exception cases 
the connector should reject the document and process the next one. What is the 
difference between rejecting a document, throwing a ServiceInterruption 
exception, and throwing a ManifoldCFException?


> Amazon CloudSearch output connector
> -----------------------------------
>
>                 Key: CONNECTORS-916
>                 URL: https://issues.apache.org/jira/browse/CONNECTORS-916
>             Project: ManifoldCF
>          Issue Type: New Feature
>          Components: Amazon CloudSearch output connector
>    Affects Versions: ManifoldCF 1.7
>            Reporter: Takumi Yoshida
>            Assignee: Karl Wright
>             Fix For: ManifoldCF 1.7
>
>         Attachments: 1.patch, 2.diff
>
>
> I wrote some code snippets for an output connector for Amazon CloudSearch.
> I would like you to review my code. With it, you can crawl a web site and 
> feed HTML pages to Amazon CloudSearch.
> However, it is not complete yet, for the following reasons:
> - There is no code for the configuration page.
> - The only supported file type is HTML.
> Thank you for your time,
>  Takumi Yoshida



--
This message was sent by Atlassian JIRA
(v6.2#6252)
