Hi nutch-users,

I would like to write a Nutch plugin that parses each URL, extracts selected elements from the page (using something like the jsoup parser), builds a JSON document from them, and writes it to S3 (I am running my Nutch cluster on AWS). Is there an existing plugin that already does some of this work for me?

The wiki has an example of how to write a parser plugin (https://wiki.apache.org/nutch/WritingPluginExample-1.2), but I would also like to hear from people who have tried a similar use case, so that I can learn from their experience.

Thanks,
Srini

