Re: nutch javascript capabilities

Lewis John Mcgibbney Sun, 13 Jan 2013 10:08:30 -0800

Yes, that should be correct.
If you look at the plugin source you can see the patterns it uses to
extract links.
You can also check what's in your crawldb using the readdb command.
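
To illustrate why pattern-based extraction cannot follow a scripted form
submit, here is a minimal sketch in Python. The regular expression and the
names URL_LITERAL and extract_links are hypothetical and made up for this
example; they are not the patterns or code parse-js actually uses (check the
plugin source for those). The sketch only shows the general idea: a matcher
that scans script text for URL-like string literals finds nothing when the
script never spells out a URL.

```python
import re

# Hypothetical pattern, NOT the one parse-js uses: matches quoted string
# literals that look like absolute http(s) URLs.
URL_LITERAL = re.compile(r"""["'](https?://[^"'\s]+)["']""")


def extract_links(js_source):
    """Return URL-like string literals found in a piece of JavaScript."""
    return URL_LITERAL.findall(js_source)


# A handler that names a URL as a literal: a pattern matcher can see it.
opens_window = "window.open('http://example.com/result.html');"
# A handler that only submits a form: there is no URL literal to find.
submits_form = "document.forms[0].submit();"

print(extract_links(opens_window))
print(extract_links(submits_form))
```

The second call returns an empty list, because the script is never executed
and the target of the form submit does not appear as text. To see which URLs
actually ended up in the crawldb, readdb can report on it, for example
bin/nutch readdb <crawldb> -stats or bin/nutch readdb <crawldb> -dump <out_dir>,
where <crawldb> and <out_dir> stand in for your own paths.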
Hth
Lewis


On Saturday, January 12, 2013, Michael Gang <[email protected]> wrote:
> Hi,
>
> So if there is JavaScript that actually submits a form, Nutch won't
> follow the link, because it only deals with URLs.
> Is this correct?
>
> Thanks,
> David
>
>
> On Tue, Jan 8, 2013 at 5:15 PM, Michael Gang <[email protected]> wrote:
>
>> Hi all,
>>
>> From the Nutch feature list
>> http://wiki.apache.org/nutch/Features
>> I understand that there is some sort of JavaScript support:
>>
>> JavaScript (for extracting links only?) (parse-js)
>>
>> I don't understand exactly what this means.
>> Let's say I have a link
>> <a onclick="do_something">
>> or a jQuery binding in onready,
>> and in this code I open a new window and show the result of a form
>> submit there.
>> Will Nutch extract the resulting page as a link for me?
>>
>> Thanks,
>> David
>>
>>
>

-- 
*Lewis*
