[
https://issues.apache.org/jira/browse/CONNECTORS-916?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14002600#comment-14002600
]
Karl Wright commented on CONNECTORS-916:
----------------------------------------
bq. About (5) - there are two limitations on the id. You cannot use multi-byte
characters, and the id can be up to 64 characters long. Do you have any good
idea for generating the id from the document URI?
I would use ManifoldCF.hash(document_id) to create an ID that Amazon
CloudSearch will accept. The hash produced is a hex-encoded SHA-1 hash, which
is 40 characters long and contains only ASCII characters, so it satisfies both
limits. There is no need to URL-encode it afterwards either.
ManifoldCF uses this everywhere.
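To illustrate the idea, here is a minimal standalone sketch of hashing a
document URI into a fixed-length ASCII id. It is not the ManifoldCF
implementation: the class name DocumentIdHasher and the method hashId are
hypothetical, and it assumes SHA-1 with lowercase hex encoding (a 20-byte
digest in hex is exactly 40 characters). The real ManifoldCF.hash may differ
in details such as letter case.

```java
import java.nio.charset.StandardCharsets;
import java.security.MessageDigest;
import java.security.NoSuchAlgorithmException;

// Hypothetical stand-in for ManifoldCF.hash(), for illustration only.
public class DocumentIdHasher {
  private static final char[] HEX = "0123456789abcdef".toCharArray();

  // Returns a 40-character, ASCII-only id derived from the document URI.
  public static String hashId(String documentUri) {
    try {
      MessageDigest digest = MessageDigest.getInstance("SHA-1");
      // Encode as UTF-8 first so multi-byte characters in the URI are handled.
      byte[] bytes = digest.digest(documentUri.getBytes(StandardCharsets.UTF_8));
      StringBuilder sb = new StringBuilder(bytes.length * 2);
      for (byte b : bytes) {
        sb.append(HEX[(b >> 4) & 0x0f]); // high nibble
        sb.append(HEX[b & 0x0f]);        // low nibble
      }
      return sb.toString();
    } catch (NoSuchAlgorithmException e) {
      // SHA-1 is required to be present on every Java platform.
      throw new IllegalStateException("SHA-1 not available", e);
    }
  }

  public static void main(String[] args) {
    // A URI containing multi-byte characters still yields a plain ASCII id.
    String id = hashId("http://example.com/\u65e5\u672c\u8a9e/page.html");
    System.out.println(id + " (" + id.length() + " characters)");
  }
}
```

Because the id is a one-way hash, the original URI cannot be recovered from
it, so the URI itself would need to be sent as a separate field if it is
wanted in search results.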
> Amazon CloudSearch output connector
> -----------------------------------
>
> Key: CONNECTORS-916
> URL: https://issues.apache.org/jira/browse/CONNECTORS-916
> Project: ManifoldCF
> Issue Type: New Feature
> Components: Amazon CloudSearch output connector
> Affects Versions: ManifoldCF 1.7
> Reporter: Takumi Yoshida
> Assignee: Karl Wright
> Fix For: ManifoldCF 1.7
>
> Attachments: 0507.diff, 0520.diff, 1.patch, 2.diff, 3.diff,
> AmazonCloudSearchParam.java, AmazonCloudSearchSpecs.java,
> exception_handling.diff, exception_handling_2.diff
>
>
> I wrote some code snippets for an output connector for Amazon CloudSearch.
> I would like you to review my code. You can crawl a web site and feed HTML
> pages to Amazon CloudSearch.
> However, it is not yet complete, for the following reasons:
> - No code has been written for the configuration page.
> - The only supported file type is HTML.
> Thank you for your time,
> Takumi Yoshida