Definitely sounds like a bug to me. A patch would be super awesome ;)
-- Jimmy

On Thu, Oct 8, 2015 at 1:54 PM, Junpeng Luo <[email protected]> wrote:

> Hi everyone,
>
> I am using nutch 1.10 and try to use the interactive selenium plugin of
> the following link:
>
> https://github.com/apache/nutch/tree/trunk/src/plugin/protocol-interactiveselenium
>
> And I try to crawl some websites that requires login.
>
> What I found is that when the website return a http response with code 403
> in the first time, even if the interactive selenium process the website and
> got the new content after it login successfully, nutch still consider the
> response code of 403 and would not fetch the page.
>
> When I go through the code of interactive selenium plugin, I found it
> didn’t update the http response status after got the new content. Is that
> something supposed to happen? Or do I miss some detail about using the
> plugin?
>
> Best,
>
> Junpeng Luo

