OK, JavaScript seems to be one problem. Thank you, Andrzej. I activated the JavaScript parser and some more pages are now being indexed. But the entries of the left menu are still missing.
Is there another solution besides building a 'sitemap'?

Andrzej Bialecki <[EMAIL PROTECTED]> wrote on 03.02.2006 16:15:37:

> mos wrote:
> > The problem at www.gildemeister.com is the use of JavaScript for link
> > generation.
> > That's the reason why nutch can't find the other pages (the links are
> > invisible).
> > Two ideas:
> > - You need something like a sitemap that links the other main pages.
> >   If it's not available right now, you should try to generate it
> >   (e.g. use the apache log-file)
> > - Enhance the nutch html parser and make it able to interpret the
> >   JavaScript links
> >
>
> You can try activating parse-js - it can extract JavaScript snippets
> embedded in HTML actions, and figure out the links. It works reasonably
> well, at least most of the time... ;-)
>
> --
> Best regards,
> Andrzej Bialecki
> Information Retrieval, Semantic Web
> Embedded Unix, System Integration
> http://www.sigram.com Contact: info at sigram dot com

_______________________________________________
Nutch-general mailing list
https://lists.sourceforge.net/lists/listinfo/nutch-general
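For the "generate a sitemap from the apache log-file" idea quoted above, a minimal sketch could look like the following. It is only an illustration under stated assumptions: the log is in Apache common or combined format, the names `paths_from_log` and `build_sitemap` are made up for this example, and the base URL is a placeholder. It is not part of nutch.

```python
import re
from html import escape

# Matches the request line and status code of an Apache common/combined
# log entry, e.g.: "GET /en/products.htm HTTP/1.1" 200 5120
LOG_LINE = re.compile(r'"(?:GET|HEAD) (?P<path>\S+) HTTP/[\d.]+" (?P<status>\d{3}) ')

# Assumption: these extensions are static assets, not pages worth linking.
ASSET = re.compile(r"\.(?:gif|jpe?g|png|css|js|ico)$", re.IGNORECASE)


def paths_from_log(lines):
    """Return the sorted, de-duplicated paths that were served with status 200."""
    paths = set()
    for line in lines:
        match = LOG_LINE.search(line)
        if not match or match.group("status") != "200":
            continue
        # Assumption: query strings are dropped; keep them if the site
        # serves distinct pages per query string.
        path = match.group("path").split("?", 1)[0]
        if not ASSET.search(path):
            paths.add(path)
    return sorted(paths)


def build_sitemap(paths, base_url):
    """Build a plain HTML page with one ordinary <a href> link per path."""
    base = base_url.rstrip("/")
    items = []
    for path in paths:
        url = escape(base + path, quote=True)
        items.append('<li><a href="%s">%s</a></li>' % (url, escape(path)))
    return "<html><body><ul>\n%s\n</ul></body></html>" % "\n".join(items)


if __name__ == "__main__":
    import sys
    # Usage (hypothetical file names): python sitemap.py access.log > sitemap.html
    with open(sys.argv[1], encoding="utf-8", errors="replace") as log:
        print(build_sitemap(paths_from_log(log), "http://www.example.com"))
```

The resulting page contains only plain links, so the crawler can follow them without interpreting any JavaScript; it would still have to be placed somewhere reachable from a seed URL.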
