[ 
https://issues.apache.org/jira/browse/CONNECTORS-916?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13975138#comment-13975138
 ] 

Takumi Yoshida commented on CONNECTORS-916:
-------------------------------------------

Hi Karl,

It is because of the CloudSearch specification. Let me explain.

- CloudSearch drops any document that contains a field that is not defined 
in the CloudSearch schema.
- To feed binary files, the output connector extracts their content using 
Tika, because CloudSearch does not extract binary files on the server side 
(unlike Solr).
- So, if there is no mapping page, users need to define in the schema all of 
the fields that Tika extracts. Otherwise they cannot feed those documents.
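
To illustrate the idea behind the mapping, here is a minimal sketch in Python. 
It is not the connector's actual code (the connector is written in Java). The 
function name map_fields, the sample metadata keys, and the schema field names 
are all assumptions made up for this example. It keeps only the extracted 
fields that have a mapping and renames them to schema field names, so that no 
undefined field is sent.

```python
def map_fields(extracted, mapping):
    """Keep only mapped metadata fields, renamed to their schema field names.

    extracted: dict of field name -> value, as produced by text extraction.
    mapping:   dict of extracted field name -> schema field name.
    Both structures are hypothetical and only illustrate the idea.
    """
    mapped = {}
    for source_name, value in extracted.items():
        target_name = mapping.get(source_name)
        if target_name is not None:
            # The field has a mapping, so send it under its schema name.
            mapped[target_name] = value
        # Unmapped fields are left out so the document is not dropped.
    return mapped


# Hypothetical metadata, in the style of what an extractor might return.
extracted = {
    "dc:title": "Annual report",
    "Content-Type": "application/pdf",
    "xmpTPg:NPages": "12",
}

# Hypothetical mapping onto fields assumed to exist in the schema.
mapping = {
    "dc:title": "title",
    "Content-Type": "content_type",
}

document_fields = map_fields(extracted, mapping)
```

Here "xmpTPg:NPages" has no mapping, so it is left out and the document only 
carries fields that the schema is assumed to define.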

Of course, I still need to write some tests and documentation. Which do you 
prefer I do first? I will follow the community's way.

> Amazon CloudSearch output connector
> -----------------------------------
>
>                 Key: CONNECTORS-916
>                 URL: https://issues.apache.org/jira/browse/CONNECTORS-916
>             Project: ManifoldCF
>          Issue Type: New Feature
>          Components: Amazon CloudSearch output connector
>    Affects Versions: ManifoldCF 1.7
>            Reporter: Takumi Yoshida
>            Assignee: Karl Wright
>             Fix For: ManifoldCF 1.7
>
>         Attachments: 1.patch, 2.diff, 3.diff, exception_handling.diff, 
> exception_handling_2.diff
>
>
> I wrote some code snippets for an output connector for Amazon CloudSearch.
> I would like you to review my code. You can crawl a web site and feed HTML 
> pages to Amazon CloudSearch.
> However, it is not complete, for the following reasons:
> - There is no code yet for the configuration page.
> - The only supported file type is HTML.
> Thank you for your time,
>  Takumi Yoshida



--
This message was sent by Atlassian JIRA
(v6.2#6252)
