Re: Post process Nutch data

Julien Nioche Mon, 05 May 2014 08:28:29 -0700

Hi

As mentioned earlier in a different discussion on this list, Behemoth would
be the right tool for this.


Julien

On Monday, 5 May 2014, Srikanth Shankara Rao <[email protected]> wrote:

>
> Hi All,
>
> I have crawled data using Nutch 1.8. The data is in HDFS. I would like to
> post-process this data before indexing it into Solr. The idea is to transform
> the data based on the content and add a few additional fields that describe
> the content.
>
> I would like to do this as part of a Hadoop job. What would be the best
> place to add the code?
>
> Thanks
> Srikanth
>


-- 

Open Source Solutions for Text Engineering

http://digitalpebble.blogspot.com/
http://www.digitalpebble.com
http://twitter.com/digitalpebble
