Hi, I'm using Nutch 0.9 with Cygwin. I started an iterative generate, fetch, update process, which had reached a depth of 5 when it hung. I run the command in the background with nohup. When the fetch process reaches a site, it freezes (I am monitoring nohup.out). The hadoop.log shows Nutch accessing the plugins, and I don't understand what that means. I would appreciate any help, since I have spent 5 days on this fetching process. Also, is there any way to resume the process, or does someone have a plugin to do that? When a crawl fails at greater depths, the wasted time is painful. Here is my log. Thanks again.
2007-07-03 19:10:08,728 INFO fetcher.Fetcher - fetching http://www.lanparty.com.uy/phpBB2/privmsg.php?folder=inbox&sid=3ffb48875ad340581a2d943a59108063
2007-07-03 19:10:09,269 INFO fetcher.Fetcher - fetching http://www.infoteca.com.uy/foro/faq.php?sid=952f0c285d497774edf61305910881df
2007-07-03 19:10:12,714 INFO fetcher.Fetcher - fetching http://www.derechocomercial.edu.uy/ClaseContSoc01.htm#_ftnref13
2007-07-03 19:41:25,417 WARN util.NativeCodeLoader - Unable to load native-hadoop library for your platform... using builtin-java classes where applicable
2007-07-03 19:41:25,567 INFO plugin.PluginRepository - Plugins: looking in: D:\nutch\nutch-0.9\plugins
2007-07-03 19:41:26,388 INFO plugin.PluginRepository - Plugin Auto-activation mode: [true]
2007-07-03 19:41:26,388 INFO plugin.PluginRepository - Registered Plugins:
2007-07-03 19:41:26,388 INFO plugin.PluginRepository - CyberNeko HTML Parser (lib-nekohtml)
2007-07-03 19:41:26,388 INFO plugin.PluginRepository - Site Query Filter (query-site)
2007-07-03 19:41:26,388 INFO plugin.PluginRepository - MSWord Parse Plug-in (parse-msword)
2007-07-03 19:41:26,388 INFO plugin.PluginRepository - Html Parse Plug-in (parse-html)
2007-07-03 19:41:26,388 INFO plugin.PluginRepository - Regex URL Filter Framework (lib-regex-filter)
2007-07-03 19:41:26,388 INFO plugin.PluginRepository - Basic Indexing Filter (index-basic)
2007-07-03 19:41:26,388 INFO plugin.PluginRepository - Pdf Parse Plug-in (parse-pdf)
2007-07-03 19:41:26,388 INFO plugin.PluginRepository - Jakarta POI - Java API To Access Microsoft Format Files (lib-jakarta-poi)
2007-07-03 19:41:26,388 INFO plugin.PluginRepository - Text Parse Plug-in (parse-text)
2007-07-03 19:41:26,388 INFO plugin.PluginRepository - Basic Query Filter (query-basic)
2007-07-03 19:41:26,388 INFO plugin.PluginRepository - Regex URL Filter (urlfilter-regex)
2007-07-03 19:41:26,388 INFO plugin.PluginRepository - HTTP Framework (lib-http)
2007-07-03 19:41:26,388 INFO plugin.PluginRepository - URL Query Filter (query-url)
2007-07-03 19:41:26,388 INFO plugin.PluginRepository - Parse MS Documents Framework (lib-parsems)
2007-07-03 19:41:26,388 INFO plugin.PluginRepository - Log4j (lib-log4j)
2007-07-03 19:41:26,388 INFO plugin.PluginRepository - Http Protocol Plug-in (protocol-http)
2007-07-03 19:41:26,388 INFO plugin.PluginRepository - More Indexing Filter (index-more)
2007-07-03 19:41:26,388 INFO plugin.PluginRepository - the nutch core extension points (nutch-extensionpoints)
2007-07-03 19:41:26,388 INFO plugin.PluginRepository - More Query Filter (query-more)
2007-07-03 19:41:26,388 INFO plugin.PluginRepository - OPIC Scoring Plug-in (scoring-opic)
2007-07-03 19:41:26,388 INFO plugin.PluginRepository - Registered Extension-Points:
2007-07-03 19:41:26,388 INFO plugin.PluginRepository - Nutch Summarizer (org.apache.nutch.searcher.Summarizer)
2007-07-03 19:41:26,388 INFO plugin.PluginRepository - Nutch Scoring (org.apache.nutch.scoring.ScoringFilter)
2007-07-03 19:41:26,388 INFO plugin.PluginRepository - Nutch Protocol (org.apache.nutch.protocol.Protocol)
2007-07-03 19:41:26,388 INFO plugin.PluginRepository - Nutch URL Normalizer (org.apache.nutch.net.URLNormalizer)
2007-07-03 19:41:26,388 INFO plugin.PluginRepository - Nutch URL Filter (org.apache.nutch.net.URLFilter)
2007-07-03 19:41:26,388 INFO plugin.PluginRepository - HTML Parse Filter (org.apache.nutch.parse.HtmlParseFilter)
2007-07-03 19:41:26,388 INFO plugin.PluginRepository - Nutch Online Search Results Clustering Plugin (org.apache.nutch.clustering.OnlineClusterer)
2007-07-03 19:41:26,388 INFO plugin.PluginRepository - Nutch Indexing Filter (org.apache.nutch.indexer.IndexingFilter)
2007-07-03 19:41:26,388 INFO plugin.PluginRepository - Nutch Content Parser (org.apache.nutch.parse.Parser)
2007-07-03 19:41:26,388 INFO plugin.PluginRepository - Ontology Model Loader (org.apache.nutch.ontology.Ontology)
2007-07-03 19:41:26,388 INFO plugin.PluginRepository - Nutch Analysis (org.apache.nutch.analysis.NutchAnalyzer)
2007-07-03 19:41:26,388 INFO plugin.PluginRepository - Nutch Query Filter (org.apache.nutch.searcher.QueryFilter)
at which it ends
