[
https://issues.apache.org/jira/browse/NUTCH-1517?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14601640#comment-14601640
]
Ji Kwon Lim commented on NUTCH-1517:
------------------------------------
Hi,
We are attempting to use Nutch with CloudSearch, using the patch provided in
this ticket. However, the patch seems to be incomplete: we had to manually
edit org.apache.nutch.parse.MetaTagsParser.java and replace every occurrence
of 'metadata.add("metatag."' with 'metadata.add("metatag_"', swapping the
period for an underscore. Is there a newer patch that addresses this issue,
or a newer process altogether for getting Nutch to work with CloudSearch? If
not, could the patch be updated to include the MetaTagsParser.java change
that the indexer needs to work properly?
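
For reference, the rename we applied by hand amounts to something like the
sketch below. This is a standalone illustration, not code from the patch: the
class and method names are hypothetical, and a plain Map stands in for the
Nutch metadata object, so nothing here depends on the Nutch API.

```java
import java.util.LinkedHashMap;
import java.util.Map;

// Hypothetical helper: mirrors the manual edit of replacing the
// "metatag." key prefix with "metatag_" before indexing.
final class MetaTagFieldNames {
    static final String OLD_PREFIX = "metatag.";
    static final String NEW_PREFIX = "metatag_";

    // Rewrites a single key; keys without the old prefix are returned unchanged.
    static String toUnderscoreName(String key) {
        if (key.startsWith(OLD_PREFIX)) {
            return NEW_PREFIX + key.substring(OLD_PREFIX.length());
        }
        return key;
    }

    // Returns a copy of the map with every key rewritten, preserving order.
    static Map<String, String> renameAll(Map<String, String> metadata) {
        Map<String, String> renamed = new LinkedHashMap<>();
        for (Map.Entry<String, String> entry : metadata.entrySet()) {
            renamed.put(toUnderscoreName(entry.getKey()), entry.getValue());
        }
        return renamed;
    }

    public static void main(String[] args) {
        Map<String, String> metadata = new LinkedHashMap<>();
        metadata.put("metatag.description", "example page");
        metadata.put("title", "Home");
        System.out.println(renameAll(metadata));
    }
}
```

Only the prefix is rewritten here, which matches the change we made; any
other periods in a meta tag name would need separate handling.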
Regards,
Ji Kwon Lim
> CloudSearch indexer
> -------------------
>
> Key: NUTCH-1517
> URL: https://issues.apache.org/jira/browse/NUTCH-1517
> Project: Nutch
> Issue Type: New Feature
> Components: indexer
> Reporter: Julien Nioche
> Fix For: 1.11
>
> Attachments: 0023883254_1377197869_indexer-cloudsearch.patch,
> 0025666929_1382393138_indexer-cloudsearch.20131021.patch
>
>
> Once we have made the indexers pluggable, we should add a plugin for Amazon
> CloudSearch. See http://aws.amazon.com/cloudsearch/. Apparently it uses a
> JSON-based representation, the Search Data Format (SDF), which we could
> reuse for a file-based indexer.