Hi,

You can do that as a custom protocol implementation. The fetcher code would stay the same, but the byte content returned for a given URL would be produced by PhantomJS or whichever Selenium backend you'd like to use.
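To make the idea concrete, here is a rough sketch in plain Java. It is not Nutch's actual protocol plugin API: `PageProtocol`, `RenderingProtocol` and `SimpleFetcher` are hypothetical names, and `render.js` is an assumed PhantomJS script that prints the rendered HTML of the URL it is given to stdout. The point is only that the fetcher talks to an interface, so a rendering backend can be swapped in behind it.

```java
import java.io.IOException;
import java.io.InputStream;
import java.nio.charset.StandardCharsets;
import java.util.ArrayList;
import java.util.List;

/** Demo entry point; assumes phantomjs and a render.js script are available. */
class RenderingProtocolDemo {
    public static void main(String[] args) throws Exception {
        String url = args.length > 0 ? args[0] : "http://example.com/";
        // Assumption: "phantomjs render.js <url>" prints the rendered page to stdout.
        PageProtocol protocol = new RenderingProtocol(List.of("phantomjs", "render.js"));
        System.out.println(new SimpleFetcher(protocol).fetchAsText(url));
    }
}

/** Hypothetical stand-in for a protocol plugin: maps a URL to raw page bytes. */
interface PageProtocol {
    byte[] fetch(String url) throws IOException, InterruptedException;
}

/** Produces the bytes by running an external renderer with the URL appended. */
final class RenderingProtocol implements PageProtocol {
    private final List<String> command;

    RenderingProtocol(List<String> command) {
        this.command = new ArrayList<>(command);
    }

    @Override
    public byte[] fetch(String url) throws IOException, InterruptedException {
        List<String> argv = new ArrayList<>(command);
        argv.add(url);
        // Discard stderr so a chatty renderer cannot block on a full pipe.
        Process process = new ProcessBuilder(argv)
                .redirectError(ProcessBuilder.Redirect.DISCARD)
                .start();
        byte[] body;
        try (InputStream out = process.getInputStream()) {
            body = out.readAllBytes();
        }
        int exit = process.waitFor();
        if (exit != 0) {
            throw new IOException("renderer exited with status " + exit + " for " + url);
        }
        return body;
    }
}

/** Hypothetical fetcher: identical code whichever protocol is plugged in. */
final class SimpleFetcher {
    private final PageProtocol protocol;

    SimpleFetcher(PageProtocol protocol) {
        this.protocol = protocol;
    }

    String fetchAsText(String url) throws IOException, InterruptedException {
        // Assumes the renderer emits UTF-8.
        return new String(protocol.fetch(url), StandardCharsets.UTF_8);
    }
}
```

In a real setup the same split would presumably live inside a Nutch protocol plugin, and a long-running browser session would likely be cheaper than spawning one process per URL.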
HTH

Julien

On 7 June 2014 11:35, remi tassing <[email protected]> wrote:

> I'm currently looking at those separately but an integrated option would be
> more efficient.
>
> Looking forward for any experience sharing
>
>
> On Sat, Jun 7, 2014 at 6:25 PM, Patrick Kirsch <[email protected]> wrote:
>
> > Hey list,
> > I'm sure this issue was asked several times, but a quick look in the
> > nutch user archive did not help, so:
> >
> > Has anyone documentation or tried to use a browser (like chromium) or
> > phantomjs etc. for fetching web pages?
> >
> > Due to a heavily loaded javascript site, nutch needs to see the fully
> > rendered page.
> >
> > Second question, would it be better to implement it as plugin or rather
> > native in the fetcher class?
> >
> > Regards,
> > Patrick
> >
>

--
Open Source Solutions for Text Engineering

http://digitalpebble.blogspot.com/
http://www.digitalpebble.com
http://twitter.com/digitalpebble

