I remember someone else in the class mentioning the same problem, so you're probably not alone, and it probably is indeed a bug.
On Thu, Oct 8, 2015 at 2:56 PM Junpeng Luo <[email protected]> wrote:

> Sorry I made a typo, actually I am using the 1.11-trunk. Thank you anyway!
>
> Junpeng Luo
>
> On Oct 8, 2015, at 2:45 PM, Mattmann, Chris A (3980) <[email protected]> wrote:
>
> You should be using nutch 1.11-trunk for your assignment
>
> Sent from my iPhone
>
> On Oct 8, 2015, at 1:55 PM, Junpeng Luo <[email protected]> wrote:
>
> Hi everyone,
>
> I am using nutch 1.10 and try to use the interactive selenium plugin of
> the following link:
>
> https://github.com/apache/nutch/tree/trunk/src/plugin/protocol-interactiveselenium
>
> And I try to crawl some websites that requires login.
>
> What I found is that when the website return a http response with code 403
> in the first time, even if the interactive selenium process the website and
> got the new content after it login successfully, nutch still consider the
> response code of 403 and would not fetch the page.
>
> When I go through the code of interactive selenium plugin, I found it
> didn’t update the http response status after got the new content. Is that
> something supposed to happen? Or do I miss some detail about using the
> plugin?
>
> Best,
>
> Junpeng Luo
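
To make the reported behaviour concrete, here is a minimal sketch in Python. It is not Nutch code: `FetchResponse`, `fetch_with_handler`, `login_handler`, and `should_store` are all hypothetical names, and treating a successful handler run as status 200 is only an assumption about what a fix might look like, not a description of what the plugin does.

```python
from dataclasses import dataclass
from typing import Callable, Optional


@dataclass
class FetchResponse:
    """Hypothetical stand-in for a protocol response; not a Nutch class."""
    status: int
    content: bytes


def login_handler(response: FetchResponse) -> Optional[bytes]:
    """Hypothetical handler: pretend a browser logged in and got the real page."""
    if response.status == 403:
        return b"<html>content after login</html>"
    return None  # nothing to do for other responses


def fetch_with_handler(
    initial: FetchResponse,
    handler: Callable[[FetchResponse], Optional[bytes]],
) -> FetchResponse:
    """Run the handler and return the response the crawler should act on."""
    new_content = handler(initial)
    if new_content is None:
        return initial
    # Assumption: once the handler has produced new content, the stale 403
    # is replaced. Keeping initial.status here instead would reproduce the
    # behaviour described in the thread (new content, old status).
    return FetchResponse(status=200, content=new_content)


def should_store(response: FetchResponse) -> bool:
    """Hypothetical crawler check: only 2xx responses are kept."""
    return 200 <= response.status < 300


first = FetchResponse(status=403, content=b"Forbidden")
final = fetch_with_handler(first, login_handler)
print(should_store(first), should_store(final))
```

The point of the sketch is only that the status and the content have to be updated together; if the content is replaced while the 403 is kept, a check like `should_store` would still reject the page.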

