Sorry, I made a typo: I am actually using 1.11-trunk. Thank you anyway!
Junpeng Luo

> On Oct 8, 2015, at 2:45 PM, Mattmann, Chris A (3980) <[email protected]> wrote:
>
> You should be using nutch 1.11-trunk for your assignment
>
> Sent from my iPhone
>
> On Oct 8, 2015, at 1:55 PM, Junpeng Luo <[email protected]> wrote:
>
>> Hi everyone,
>>
>> I am using nutch 1.10 and try to use the interactive selenium plugin of the
>> following link:
>> https://github.com/apache/nutch/tree/trunk/src/plugin/protocol-interactiveselenium
>>
>> And I try to crawl some websites that requires login.
>>
>> What I found is that when the website return a http response with code 403
>> in the first time, even if the interactive selenium process the website and
>> got the new content after it login successfully, nutch still consider the
>> response code of 403 and would not fetch the page.
>>
>> When I go through the code of interactive selenium plugin, I found it didn’t
>> update the http response status after got the new content. Is that something
>> supposed to happen? Or do I miss some detail about using the plugin?
>>
>> Best,
>>
>> Junpeng Luo

