Hi Byzen,

Yes, my team and I have been running Selenium with Nutch across a large Amazon EMR cluster. We have some really interesting results that we are now writing up for a conference paper, and we can likely share some of our tips and configuration experiences soon.

Kim Whitehall, CC’ed, was leading this effort on my team.

Cheers,
Chris

++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++
Chris Mattmann, Ph.D.
Chief Architect
Instrument Software and Science Data Systems Section (398)
NASA Jet Propulsion Laboratory Pasadena, CA 91109 USA
Office: 168-519, Mailstop: 168-527
Email: chris.a.mattm...@nasa.gov
WWW: http://sunset.usc.edu/~mattmann/
++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++
Adjunct Associate Professor, Computer Science Department
University of Southern California, Los Angeles, CA 90089 USA
++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++

-----Original Message-----
From: Baizhang Ma <baizhang...@gmail.com>
Reply-To: "user@nutch.apache.org" <user@nutch.apache.org>
Date: Monday, December 21, 2015 at 10:17 PM
To: "user@nutch.apache.org" <user@nutch.apache.org>
Subject: Re: How to deploy Selenium on Server?

>Hi, Mattmann, thank you for your reply!
>
>I used the same manual you offered to deploy my Nutch, and it functions
>well in local mode. However, when I move it to the remote server, it
>does not work. I wonder what the differences are between the local machine
>and the remote server, since I also installed a desktop on the remote
>server. In my understanding, a remote server with a desktop should be the
>same as a local computer, which can be accessed through vnc4server and
>vncviewer.
>
>By the way, you said this plugin is old. Do you have a recommendation
>for me that is easy to deploy, as I am a quite inexperienced Nutch user?
>
>Thanks again, Mattmann.
>
>Best Regards,
>Byzen. Ma
>
>2015-12-22 1:44 GMT+08:00 Mattmann, Chris A (3980) <
>chris.a.mattm...@jpl.nasa.gov>:
>
>> Hi Byzen,
>>
>> That’s the old plugin, we integrated it into Nutch trunk.
>>
>> Have a look at it integrated with Nutch here:
>>
>> https://wiki.apache.org/nutch/AdvancedAjaxInteraction
>>
>> Cheers,
>> Chris
>>
>> -----Original Message-----
>> From: Baizhang Ma <baizhang...@gmail.com>
>> Reply-To: "user@nutch.apache.org" <user@nutch.apache.org>
>> Date: Monday, December 21, 2015 at 4:54 AM
>> To: "user@nutch.apache.org" <user@nutch.apache.org>
>> Subject: How to deploy Selenium on Server?
>>
>> >Hi, everyone.
>> >I want to use the Selenium plugin to crawl the dynamic content of pages.
>> >I deployed it as https://github.com/momer/nutch-selenium says, and it
>> >runs normally on my local computer (my own computer). However, the
>> >plugin doesn't work after I deploy it on the remote server. At first, I
>> >thought it might need a display or desktop, the same as in local mode,
>> >so I installed a desktop on the server, but unfortunately it still
>> >can't work. Does anyone have ideas about this? Thanks very much!
>> >
>> >Best Regards,
>> >Byzen. Ma