[
https://issues.apache.org/jira/browse/CONNECTORS-916?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13963065#comment-13963065
]
Takumi Yoshida commented on CONNECTORS-916:
-------------------------------------------
Hi Karl,
Thank you for your work and comments! I could build and run it in my
environment from the new branch CONNECTOR-916!
Here are my answers to your questions.
> I know the license for the jackson libraries, but not for tagsoup. Do you
> know what it is?
The license for TagSoup is the Apache License 2.0.
(http://home.ccil.org/~cowan/XML/tagsoup/)
Tika uses TagSoup to parse HTML documents and extract the meta information and
body text.
> (1) I take it that it is these parameters that need to be configurable:
Wow, that sounds great!
> Question: What kind of authorization is used for this? Where/how should that
> be specified?
No authorization is required. On Amazon CloudSearch, clients that send
documents are filtered by IP address, and a user can set which IP addresses
are permitted or denied in the AWS console.
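To illustrate what sending a document without credentials might look like, here
is a minimal Java sketch. It assumes the 2013-01-01 document batch JSON format
and the /2013-01-01/documents/batch path. The class name, the helper names, and
the endpoint host are made up for illustration, and this is not the connector's
actual code.

```java
import java.io.IOException;
import java.net.HttpURLConnection;
import java.net.URI;
import java.util.LinkedHashMap;
import java.util.Map;

// Hypothetical helper, for illustration only.
class CloudSearchBatchSketch {

    // Quote a string as a JSON string literal, escaping what JSON requires.
    static String jsonString(String s) {
        StringBuilder sb = new StringBuilder("\"");
        for (int i = 0; i < s.length(); i++) {
            char c = s.charAt(i);
            switch (c) {
                case '"':  sb.append("\\\""); break;
                case '\\': sb.append("\\\\"); break;
                case '\n': sb.append("\\n");  break;
                case '\r': sb.append("\\r");  break;
                case '\t': sb.append("\\t");  break;
                default:
                    if (c < 0x20) sb.append(String.format("\\u%04x", (int) c));
                    else sb.append(c);
            }
        }
        return sb.append('"').toString();
    }

    // Build a batch containing a single "add" operation (assumed 2013-01-01 format).
    static String addBatch(String id, Map<String, String> fields) {
        StringBuilder sb = new StringBuilder("[{\"type\":\"add\",\"id\":");
        sb.append(jsonString(id)).append(",\"fields\":{");
        boolean first = true;
        for (Map.Entry<String, String> e : fields.entrySet()) {
            if (!first) sb.append(',');
            sb.append(jsonString(e.getKey())).append(':').append(jsonString(e.getValue()));
            first = false;
        }
        return sb.append("}}]").toString();
    }

    // Prepare (but do not send) the POST. No credentials are attached, because
    // access is assumed to be controlled by the IP-based policy on the AWS side.
    static HttpURLConnection prepare(String endpoint) throws IOException {
        URI uri = URI.create(endpoint + "/2013-01-01/documents/batch");
        HttpURLConnection conn = (HttpURLConnection) uri.toURL().openConnection();
        conn.setRequestMethod("POST");
        conn.setRequestProperty("Content-Type", "application/json");
        conn.setDoOutput(true);
        return conn;
    }

    public static void main(String[] args) throws IOException {
        Map<String, String> fields = new LinkedHashMap<>();
        fields.put("title", "Example page");
        System.out.println(addBatch("doc1", fields));
        // The host below is a placeholder, not a real document endpoint.
        System.out.println(prepare("http://doc-example.invalid").getRequestMethod());
    }
}
```

The sketch only builds the request. The connection is opened when the body is
actually written, so a request from an IP address that is not permitted would
presumably fail at that point.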
> (2) For the addOrReplaceDocument() method, the error handling simply prints
> all exceptions. That's not a reasonable thing to do. Exceptions should either
> result in a document rejection, or a ServiceInterruption exception, or a
> ManifoldCFException, depending on what you want ManifoldCF to do when it
> received one.
You are absolutely correct. I thought that in all of these exception cases the
connector should reject the document and process the next one. What is the
difference between rejecting a document, throwing a ServiceInterruption
exception, and throwing a ManifoldCFException?
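To make the question concrete, here is a rough sketch of how the three outcomes
might be kept apart instead of printing every exception. All the types below
are stand-ins I made up so that the sketch compiles on its own; they are not
the ManifoldCF classes. The mapping (bad document: reject and continue, I/O
failure: retry later, anything unexpected: stop) is only my guess at the
intended behavior.

```java
import java.io.IOException;

// Hypothetical sketch; none of these types are ManifoldCF API.
class ErrorHandlingSketch {

    // Stand-in for an accepted/rejected document status.
    enum Outcome { ACCEPTED, REJECTED }

    // Assumed retry delay: five minutes.
    static final long RETRY_DELAY_MILLIS = 5L * 60L * 1000L;

    // Stand-in for a "service is temporarily unavailable, try again later" signal.
    static class RetryLaterException extends Exception {
        final long retryAtMillis;
        RetryLaterException(String message, Throwable cause, long retryAtMillis) {
            super(message, cause);
            this.retryAtMillis = retryAtMillis;
        }
    }

    // Stand-in for a fatal error that should stop the job.
    static class FatalConnectorException extends Exception {
        FatalConnectorException(String message, Throwable cause) {
            super(message, cause);
        }
    }

    // Stand-in for "the service refused this particular document".
    static class BadDocumentException extends Exception {
        BadDocumentException(String message) {
            super(message);
        }
    }

    // Stand-in for whatever actually posts the batch to the search service.
    interface Sender {
        void send(String batchJson) throws IOException, BadDocumentException;
    }

    static Outcome addOrReplace(String batchJson, Sender sender, long nowMillis)
            throws RetryLaterException, FatalConnectorException {
        try {
            sender.send(batchJson);
            return Outcome.ACCEPTED;
        } catch (BadDocumentException e) {
            // Only this document is at fault: skip it and go on to the next one.
            return Outcome.REJECTED;
        } catch (IOException e) {
            // The service may be down for a while: ask to be called again later.
            throw new RetryLaterException("send failed: " + e.getMessage(), e,
                    nowMillis + RETRY_DELAY_MILLIS);
        } catch (RuntimeException e) {
            // Unexpected failure: retrying is unlikely to help, so stop.
            throw new FatalConnectorException("unexpected error", e);
        }
    }

    public static void main(String[] args) throws Exception {
        Sender alwaysOk = json -> { };
        System.out.println(addOrReplace("[]", alwaysOk, System.currentTimeMillis()));
    }
}
```

If this separation is roughly right, my original idea of rejecting the
document in every case would only fit the first catch block.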
> Amazon CloudSearch output connector
> -----------------------------------
>
> Key: CONNECTORS-916
> URL: https://issues.apache.org/jira/browse/CONNECTORS-916
> Project: ManifoldCF
> Issue Type: New Feature
> Components: Amazon CloudSearch output connector
> Affects Versions: ManifoldCF 1.7
> Reporter: Takumi Yoshida
> Assignee: Karl Wright
> Fix For: ManifoldCF 1.7
>
> Attachments: 1.patch, 2.diff
>
>
> I wrote some code snippets for an output connector for Amazon CloudSearch.
> I would like you to review my code. You can crawl a web site and feed HTML
> pages to Amazon CloudSearch.
> However, it is not complete yet, for the following reasons:
> - There is no code for the configuration page yet.
> - The only supported file type is HTML.
> Thank you for your time,
> Takumi Yoshida
--
This message was sent by Atlassian JIRA
(v6.2#6252)