Hi, I understand. Is there a way to use another browser as the fetcher for a set of predefined pages? For example, would it be possible to tell Nutch that it should use Firefox or HtmlUnit as a fetcher? There are many sites that load content via AJAX, or where a click triggers a form submit, so no real HTML links exist in the markup.
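
Something along these lines is what I have in mind, roughly sketched below: fetch the page with HtmlUnit so that the JavaScript runs before the HTML is handed to a parser. This is only an illustration; I don't know whether it can be wired into Nutch as a protocol plugin, and the exact HtmlUnit calls may differ between versions.

// Minimal sketch (not Nutch code): fetch a page with HtmlUnit so that
// JavaScript executes before the markup is read.
import com.gargoylesoftware.htmlunit.WebClient;
import com.gargoylesoftware.htmlunit.html.HtmlPage;

public class HtmlUnitFetchSketch {
    public static void main(String[] args) throws Exception {
        WebClient webClient = new WebClient();
        try {
            webClient.getOptions().setJavaScriptEnabled(true);
            webClient.getOptions().setThrowExceptionOnScriptError(false);

            HtmlPage page = webClient.getPage("http://www.example.com/");
            // Give AJAX requests a chance to finish before reading the DOM.
            webClient.waitForBackgroundJavaScript(5000);

            // Rendered markup, including content added by scripts, which a
            // plain HTTP fetcher would never see.
            String renderedHtml = page.asXml();
            System.out.println(renderedHtml.length() + " characters of rendered HTML");
        } finally {
            webClient.closeAllWindows();
        }
    }
}
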
Thanks,
David

On Sun, Jan 13, 2013 at 8:08 PM, Lewis John Mcgibbney <[email protected]> wrote:

> This should be correct, yes.
> If you look at the plugin source you can see the patterns it uses to
> extract links.
> Also, you can check what's in your crawldb using the readdb command.
> Hth
> Lewis
>
> On Saturday, January 12, 2013, Michael Gang <[email protected]> wrote:
> > Hi,
> >
> > So if there is JavaScript which actually submits a form, Nutch won't
> > follow the link, because it only deals with URLs.
> > Is this correct?
> >
> > Thanks,
> > David
> >
> > On Tue, Jan 8, 2013 at 5:15 PM, Michael Gang <[email protected]> wrote:
> >
> >> Hi all,
> >>
> >> From the features of Nutch
> >> http://wiki.apache.org/nutch/Features
> >> I understand that there is a sort of JavaScript support:
> >>
> >> JavaScript (for extracting links only?) (parse-js)
> >>
> >> I don't understand what this means exactly.
> >> Let's say I have a link
> >> <a onclick="do_something">
> >> or a jQuery binding in onready,
> >> and in this code I open a new window and show there the result of a
> >> form submit.
> >> Will Nutch extract the resulting page as a link for me?
> >>
> >> Thanks,
> >> David
>
> --
> *Lewis*
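
To illustrate the point in the thread above: a pattern-based extractor along the lines of parse-js can only pick up URL-like string literals that appear in the script text; a form submit wired up in an onclick handler gives it nothing to extract. A rough, simplified sketch (illustrative only, not the actual parse-js source):

// Illustrative sketch of pattern-based link extraction: URL literals inside
// JavaScript are found, a scripted form submit yields no outlink at all.
import java.util.regex.Matcher;
import java.util.regex.Pattern;

public class JsLinkExtractionSketch {
    // Simplified pattern for absolute URLs inside quoted strings.
    private static final Pattern URL_IN_STRING =
        Pattern.compile("[\"'](https?://[^\"'\\s]+)[\"']");

    public static void main(String[] args) {
        String extractable =
            "<a onclick=\"window.open('http://example.com/report.html')\">report</a>";
        String notExtractable =
            "<a onclick=\"document.forms['search'].submit()\">search</a>";

        printLinks("literal URL in the script", extractable);
        printLinks("form submit, no URL literal", notExtractable);
    }

    private static void printLinks(String label, String snippet) {
        Matcher m = URL_IN_STRING.matcher(snippet);
        System.out.println(label + ":");
        boolean found = false;
        while (m.find()) {
            System.out.println("  outlink -> " + m.group(1));
            found = true;
        }
        if (!found) {
            System.out.println("  no outlink extracted");
        }
    }
}
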

