This is the wiki page I used:
https://wiki.apache.org/nutch/AdvancedAjaxInteraction

> On Apr 13, 2016, at 1:30 AM, Mattmann, Chris A (3980) 
> <[email protected]> wrote:
> 
> Hi, the plugin is now part of Nutch, so you don’t need to use the
> GitHub one. Can you show me the wiki page by linking to it? It’s
> likely out of date.
> 
> ++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++
> Chris Mattmann, Ph.D.
> Chief Architect
> Instrument Software and Science Data Systems Section (398)
> NASA Jet Propulsion Laboratory Pasadena, CA 91109 USA
> Office: 168-519, Mailstop: 168-527
> Email: [email protected]
> WWW:  http://sunset.usc.edu/~mattmann/
> ++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++
> Director, Information Retrieval and Data Science Group (IRDS)
> Adjunct Associate Professor, Computer Science Department
> University of Southern California, Los Angeles, CA 90089 USA
> WWW: http://irds.usc.edu/
> ++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++
> 
> On 4/12/16, 10:29 PM, "Sabah Sajjad Khan" <[email protected]> wrote:
> 
>> The link that I provided is the same as the one on the wiki page.
>> 
>>> On Apr 13, 2016, at 1:13 AM, Mattmann, Chris A (3980) 
>>> <[email protected]> wrote:
>>> 
>>> Please use the selenium plugin that is part of Nutch and described
>>> on the wiki in the Advanced Ajax Interaction section.
>>> 
>>> On 4/12/16, 9:38 PM, "Sabah Sajjad Khan" <[email protected]> wrote:
>>> 
>>>> Hello,
>>>> 
>>>> 
>>>> I am very new to Nutch and am having trouble crawling the content
>>>> that I need. I am crawling electronic parts websites to see prices,
>>>> but when I use readdb to dump, I don't see all the data under content.
>>>> I have attached the dump file.
>>>> 
>>>> 
>>>> 
>>>> 
>>>> My setup is Nutch with Selenium, following this link:
>>>> https://github.com/momer/nutch-selenium
>>>> I don't run the last command (bin/crawl) because I am not using Solr.
>>>> Selenium seems to be working, as does the headless browser, but it
>>>> just doesn't seem to extract any data. Any help would be appreciated.
>>>> Like I said, I'm very new, so if there is any other information I
>>>> could provide to help you understand my problem, or a way I could
>>>> track it down myself, please let me know.
>>>> 
>>>> 
>>>> Thank you in advance.
>>>> 
>>>> 
>> 
