Re: How to deploy Selenium on Server?

2015-12-22 Thread Mattmann, Chris A (3980)
Hi Byzen,

Yes, my team and I have been running Selenium with Nutch
across a large Amazon EMR cluster. We have some really interesting
results that we are now writing up for a conference
paper. We can likely share some of the tips and configuration
experiences soon.

Kim Whitehall, CC’ed, was leading this effort on my team.

Cheers,
Chris

++
Chris Mattmann, Ph.D.
Chief Architect
Instrument Software and Science Data Systems Section (398)
NASA Jet Propulsion Laboratory Pasadena, CA 91109 USA
Office: 168-519, Mailstop: 168-527
Email: chris.a.mattm...@nasa.gov
WWW:  http://sunset.usc.edu/~mattmann/
++
Adjunct Associate Professor, Computer Science Department
University of Southern California, Los Angeles, CA 90089 USA
++








Re: How to deploy Selenium on Server?

2015-12-22 Thread Baizhang Ma
Hi, Mattmann.
It is really great to hear that! I need to deploy Nutch and
Selenium on my server, but there is little reference material about this on
the Internet. I have tried again and again, but on the server it never crawls
the whole page, unlike local mode, which fetches the whole page normally. So
if you and your team make any progress, please let me know! Thanks
in advance, Mattmann.

Best Regards,
Byzen.Ma



Re: How to deploy Selenium on Server?

2015-12-21 Thread Karanjeet Singh
Hi Byzen,

I hope you have installed all the libraries Selenium requires (Firefox, Xvfb)
on your remote server. Could you please share your logs
(${NUTCH_HOME}/logs/hadoop.log) so we can get some insight into this issue?

Thanks & Regards,
Karanjeet Singh
CS Graduate Student
University of Southern California
karan...@usc.edu
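
A minimal sketch of the checks suggested above, written in Python only for
illustration. It assumes the binaries are named "firefox" and "Xvfb" and that
the browser finds its X server through the DISPLAY environment variable; the
function names and the keyword list are hypothetical and are not part of Nutch
or of the Selenium plugin.

```python
import os
import shutil


def check_selenium_prereqs(binaries=("firefox", "Xvfb"), env=None):
    """Return a list of problems; an empty list means the basic checks passed."""
    env = os.environ if env is None else env
    problems = []
    for name in binaries:
        # shutil.which searches the given PATH much as a shell would.
        if shutil.which(name, path=env.get("PATH")) is None:
            problems.append(f"{name} not found on PATH")
    # X11 programs such as Firefox locate their display through DISPLAY.
    if not env.get("DISPLAY"):
        problems.append("DISPLAY is not set for this process")
    return problems


def suspicious_log_lines(lines, keywords=("ERROR", "WARN", "selenium")):
    """Yield lines that mention any keyword (case-insensitive match)."""
    lowered = [k.lower() for k in keywords]
    for line in lines:
        text = line.lower()
        if any(k in text for k in lowered):
            yield line.rstrip("\n")


if __name__ == "__main__":
    for problem in check_selenium_prereqs():
        print("problem:", problem)
    # Assumed log location, taken from the path mentioned in the message above.
    log_path = os.path.join(os.environ.get("NUTCH_HOME", "."), "logs", "hadoop.log")
    if os.path.exists(log_path):
        with open(log_path, errors="replace") as handle:
            for line in suspicious_log_lines(handle):
                print(line)
```

Run it as the same user and from the same shell that starts the crawl, since
PATH and DISPLAY can differ between an SSH login and a VNC desktop session.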




Re: How to deploy Selenium on Server?

2015-12-21 Thread Baizhang Ma
Hi, Mattmann, thank you for your reply!

I used the same manual you offered to deploy Nutch, and it functions
well in local mode. However, when I move it to the remote server, it
doesn't work well. I wonder what the difference is between the local machine
and the remote server, since I also installed a desktop on the remote server.
As I understand it, a remote server with a desktop, which can be accessed
through vnc4server and vncviewer, should be the same as a local computer.

By the way, you said this plugin is old. Do you have any recommendations
for me that are easy to deploy, as I am a quite inexperienced Nutch user?

Thanks again, Mattmann.

Best Regards,
Byzen. Ma



How to deploy Selenium on Server?

2015-12-21 Thread Baizhang Ma
Hi, everyone.
I want to use the Selenium plugins to crawl the dynamic content of pages. I
deployed them as https://github.com/momer/nutch-selenium describes, and they
run normally on my local computer (my own machine). However, the plugins don't
work after I deploy them on the remote server. At first I thought they might
need a display or desktop, as in local mode, so I installed a desktop on the
server, but unfortunately it still doesn't work. Does anyone have
ideas about this? Thanks very much!

Best Regards,
Byzen. Ma


Re: How to deploy Selenium on Server?

2015-12-21 Thread Mattmann, Chris A (3980)
Hi Byzen,

That’s the old plugin; we have since integrated it into Nutch trunk.

Have a look at it integrated with Nutch here:

https://wiki.apache.org/nutch/AdvancedAjaxInteraction


Cheers,
Chris





