Hi everyone, I am using Nutch 1.10 and trying to use the interactive Selenium plugin from the following link: https://github.com/apache/nutch/tree/trunk/src/plugin/protocol-interactiveselenium
I am trying to crawl some websites that require login. What I found is that when a website first returns an HTTP response with status code 403, Nutch still treats the response code as 403 and does not fetch the page, even if the interactive Selenium plugin processes the page and gets the new content after logging in successfully. When I went through the code of the interactive Selenium plugin, I found that it does not update the HTTP response status after getting the new content. Is that the intended behavior, or am I missing some detail about using the plugin?

Best,
Junpeng Luo

