You should be using Nutch 1.11-trunk for your assignment.
On Oct 8, 2015, at 1:55 PM, Junpeng Luo <[email protected]> wrote:

Hi everyone,

I am using Nutch 1.10 and trying to use the interactive Selenium plugin from the following link:

https://github.com/apache/nutch/tree/trunk/src/plugin/protocol-interactiveselenium

I am trying to crawl some websites that require login. What I found is that when a website returns an HTTP response with code 403 the first time, Nutch still treats the response code as 403 and does not fetch the page, even if the interactive Selenium plugin processes the website and gets the new content after logging in successfully.

When I went through the code of the interactive Selenium plugin, I found that it does not update the HTTP response status after getting the new content. Is that supposed to happen? Or am I missing some detail about using the plugin?

Best,
Junpeng Luo
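A minimal sketch of the behaviour described in the message above may help frame the question. It is a toy model only: `Response`, `would_be_stored`, `keep_original_status`, and `update_status_after_login` are hypothetical names invented for illustration, not Nutch or plugin APIs, and the sketch assumes (as the message reports, without checking the plugin source) that the fetcher decides from the status code alone.

```python
from dataclasses import dataclass


@dataclass
class Response:
    """Hypothetical stand-in for a protocol response (not a Nutch class)."""
    status: int
    content: str


def would_be_stored(response: Response) -> bool:
    # Assumption: the fetcher only keeps pages whose status is 200.
    return response.status == 200


def keep_original_status(first: Response, rendered: str) -> Response:
    # Models the reported behaviour: the content is replaced with what the
    # browser rendered after login, but the original status is left as-is.
    return Response(status=first.status, content=rendered)


def update_status_after_login(first: Response, rendered: str,
                              login_succeeded: bool) -> Response:
    # Models the alternative the message asks about: once the login handler
    # succeeds, report a success status alongside the new content.
    status = 200 if login_succeeded else first.status
    return Response(status=status, content=rendered)


first = Response(status=403, content="Forbidden")
rendered = "<html>page after login</html>"

stale = keep_original_status(first, rendered)
fresh = update_status_after_login(first, rendered, login_succeeded=True)
```

In this model `stale` carries the post-login content but is still rejected because its status remains 403, while `fresh` is accepted. Whether the real plugin should behave like the second function is exactly the open question in the message.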

