[ 
https://issues.apache.org/jira/browse/NUTCH-1517?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14601640#comment-14601640
 ] 

Ji Kwon Lim commented on NUTCH-1517:
------------------------------------

Hi,

We are trying to use Nutch with CloudSearch, using the patch provided in this 
ticket. However, the patch appears to be incomplete: it requires a manual 
change to org.apache.nutch.parse.MetaTagsParser.java, replacing every 
occurrence of 'metadata.add("metatag."' with 'metadata.add("metatag_"', that 
is, swapping the period for an underscore. Is there a newer patch that 
addresses this issue, or a newer process altogether for getting Nutch to work 
with CloudSearch? If not, could the patch be updated to include the change to 
org.apache.nutch.parse.MetaTagsParser.java that is necessary for the indexer 
to work properly?
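
To illustrate the rename described above, here is a minimal sketch in Java. 
The class MetatagFieldNames and its methods are hypothetical names chosen for 
this example; they are not part of Nutch or of the attached patches. The 
manual change itself was made directly in the metadata.add(...) calls, whereas 
this sketch assumes the same effect could be had by renaming field names just 
before they are handed to the indexer.

```java
import java.util.LinkedHashMap;
import java.util.Map;

/** Hypothetical helper for illustration only; not part of Nutch or the patch. */
final class MetatagFieldNames {

    private static final String OLD_PREFIX = "metatag.";
    private static final String NEW_PREFIX = "metatag_";

    private MetatagFieldNames() {}

    /** Rewrites a leading "metatag." to "metatag_"; other names are returned unchanged. */
    static String rename(String fieldName) {
        if (fieldName.startsWith(OLD_PREFIX)) {
            return NEW_PREFIX + fieldName.substring(OLD_PREFIX.length());
        }
        return fieldName;
    }

    /** Returns a copy of the map with every key renamed, preserving insertion order. */
    static Map<String, String> renameAll(Map<String, String> fields) {
        Map<String, String> renamed = new LinkedHashMap<>();
        for (Map.Entry<String, String> entry : fields.entrySet()) {
            renamed.put(rename(entry.getKey()), entry.getValue());
        }
        return renamed;
    }
}
```

For example, MetatagFieldNames.rename("metatag.description") yields 
"metatag_description", while a name such as "title" passes through untouched. 
Only the leading prefix is rewritten, so any later periods in a name are left 
as they are.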


Regards,

Ji Kwon Lim

> CloudSearch indexer
> -------------------
>
>                 Key: NUTCH-1517
>                 URL: https://issues.apache.org/jira/browse/NUTCH-1517
>             Project: Nutch
>          Issue Type: New Feature
>          Components: indexer
>            Reporter: Julien Nioche
>             Fix For: 1.11
>
>         Attachments: 0023883254_1377197869_indexer-cloudsearch.patch, 
> 0025666929_1382393138_indexer-cloudsearch.20131021.patch
>
>
> Once we have made the indexers pluggable, we should add a plugin for Amazon 
> CloudSearch. See http://aws.amazon.com/cloudsearch/. Apparently it uses a 
> JSON-based representation, the Search Data Format (SDF), which we could 
> reuse for a file-based indexer.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)
