[ https://issues.apache.org/jira/browse/NUTCH-2079?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14693364#comment-14693364 ]
Yogendra Kumar Soni commented on NUTCH-2079:
--------------------------------------------
You can add the information you have parsed to the parse metadata (parsemeta).
If you are writing your own parser plugin, you need to register it in
parse-plugins.xml. After that, you need to write an indexing plugin to index
the field you are adding, and finally enable your plugins in nutch-site.xml.
To summarize, you need to:
1. Change the parse plugin's getParse(Content) to add the new <key, value> pair
to the Metadata object.
2. Change the index plugin to add your new field to the Nutch Document or WebPage.
3. Change gora-mongodb-mapping to add the new field.
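
The data flow behind steps 1 and 2 can be sketched without any Nutch
dependency. This is only an illustration under stated assumptions, not Nutch
code: plain Java maps stand in for the Metadata object and the Nutch Document,
and the class name, the methods parse() and index(), the key "product.price"
and the price markup are all hypothetical. A real plugin would implement
Nutch's own parser and indexing-filter extension points instead.

```java
import java.util.HashMap;
import java.util.LinkedHashMap;
import java.util.Map;

// Dependency-free sketch: maps stand in for Nutch's Metadata and Document types.
class CustomFieldPipelineSketch {

    // Hypothetical metadata key; this name is not defined by Nutch.
    public static final String PRICE_KEY = "product.price";

    // Step 1 stand-in: the parse step extracts a value from the page and
    // stores it as a <key, value> pair in the parse metadata.
    public static Map<String, String> parse(String html) {
        Map<String, String> parseMeta = new HashMap<>();
        String marker = "<span class=\"price\">"; // assumed markup, for illustration only
        int start = html.indexOf(marker);
        if (start >= 0) {
            int from = start + marker.length();
            int end = html.indexOf("</span>", from);
            if (end > from) {
                parseMeta.put(PRICE_KEY, html.substring(from, end).trim());
            }
        }
        return parseMeta;
    }

    // Step 2 stand-in: the indexing step copies the metadata value into a
    // field of the document that is handed to the storage backend.
    public static Map<String, Object> index(String url, Map<String, String> parseMeta) {
        Map<String, Object> doc = new LinkedHashMap<>();
        doc.put("url", url);
        String price = parseMeta.get(PRICE_KEY);
        if (price != null) {
            doc.put("price", price); // skipped when the parser found nothing
        }
        return doc;
    }

    public static void main(String[] args) {
        String html = "<html><body><span class=\"price\"> 19.99 </span></body></html>";
        Map<String, String> meta = parse(html);
        Map<String, Object> doc = index("http://example.com/item", meta);
        System.out.println(doc);
    }
}
```

The point of the sketch is that the parser and the indexer only share the
metadata key, so the two plugins can be written and tested separately. Step 3
has no counterpart here: the new field still has to be declared in the storage
mapping before it can be written to MongoDB.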
> Tika Parsing plugin issue
> -------------------------
>
> Key: NUTCH-2079
> URL: https://issues.apache.org/jira/browse/NUTCH-2079
> Project: Nutch
> Issue Type: New Feature
> Components: deployment
> Affects Versions: 2.3
> Environment: Ubuntu 14.04
> Reporter: Pradumna Panditrao
> Fix For: 2.3
>
>
> Hi,
> I am trying to parse particular data and post it to MongoDB. However, when I
> try to modify the parse-tika plugin, it is so tightly coupled with other
> classes that it misses the data. I want to pick up particular data from a
> website using the same plugin and put it into MongoDB.
> Please suggest how to do this.
--
This message was sent by Atlassian JIRA
(v6.3.4#6332)