Hello,

I was trying to crawl a website listed in my urls directory, but the inject step never finishes; it prints the following output endlessly.

Command:
Crawl c:\urls -thread 1 -depth 1

Output:
060118 162225 parsing file:/C:/Documents%20and%20Settings/Sameer%20Tamsekar/My%20Documents/project/NutchNight/bin/nutch-default.xml
060118 162225 parsing file:/C:/Documents%20and%20Settings/Sameer%20Tamsekar/My%20Documents/project/NutchNight/bin/crawl-tool.xml
060118 162225 parsing file:/C:/Documents%20and%20Settings/Sameer%20Tamsekar/My%20Documents/project/NutchNight/bin/mapred-default.xml
060118 162225 parsing file:/C:/Documents%20and%20Settings/Sameer%20Tamsekar/My%20Documents/project/NutchNight/bin/nutch-site.xml
060118 162225 crawl started in: crawl-20060118162225
060118 162225 rootUrlDir = c:\urls
060118 162225 threads = 10
060118 162225 depth = 5
060118 162225 parsing file:/C:/Documents%20and%20Settings/Sameer%20Tamsekar/My%20Documents/project/NutchNight/bin/nutch-default.xml
060118 162225 parsing file:/C:/Documents%20and%20Settings/Sameer%20Tamsekar/My%20Documents/project/NutchNight/bin/crawl-tool.xml
060118 162225 parsing file:/C:/Documents%20and%20Settings/Sameer%20Tamsekar/My%20Documents/project/NutchNight/bin/nutch-site.xml
060118 162225 Injector: starting
060118 162225 Injector: crawlDb: crawl-20060118162225\crawldb
060118 162225 Injector: urlDir: c:\urls
060118 162225 Injector: Converting injected urls to crawl db entries.
060118 162225 parsing file:/C:/Documents%20and%20Settings/Sameer%20Tamsekar/My%20Documents/project/NutchNight/bin/nutch-default.xml
060118 162225 parsing file:/C:/Documents%20and%20Settings/Sameer%20Tamsekar/My%20Documents/project/NutchNight/bin/crawl-tool.xml
060118 162225 parsing file:/C:/Documents%20and%20Settings/Sameer%20Tamsekar/My%20Documents/project/NutchNight/bin/mapred-default.xml
060118 162225 parsing file:/C:/Documents%20and%20Settings/Sameer%20Tamsekar/My%20Documents/project/NutchNight/bin/mapred-default.xml
060118 162225 parsing file:/C:/Documents%20and%20Settings/Sameer%20Tamsekar/My%20Documents/project/NutchNight/bin/nutch-site.xml
060118 162225 Running job: job_ipqvtf
060118 162225 parsing file:/C:/Documents%20and%20Settings/Sameer%20Tamsekar/My%20Documents/project/NutchNight/bin/nutch-default.xml
060118 162225 parsing file:/C:/Documents%20and%20Settings/Sameer%20Tamsekar/My%20Documents/project/NutchNight/bin/mapred-default.xml
060118 162225 parsing \tmp\nutch\mapred\local\localRunner\job_ipqvtf.xml
060118 162225 parsing file:/C:/Documents%20and%20Settings/Sameer%20Tamsekar/My%20Documents/project/NutchNight/bin/nutch-site.xml
060118 162225 Using URL normalizer: org.apache.nutch.net.BasicUrlNormalizer
060118 162225 Plugins: looking in: C:\Documents and Settings\Sameer Tamsekar\My Documents\project\NutchNight\bin\plugins
060118 162226 Plugin Auto-activation mode: [true]
060118 162226 Registered Plugins:
060118 162226 URL Query Filter (query-url)
060118 162226 Site Query Filter (query-site)
060118 162226 Http / Https Protocol Plug-in (protocol-httpclient)
060118 162226 Html Parse Plug-in (parse-html)
060118 162226 the nutch core extension points (nutch-extensionpoints)
060118 162226 Basic Indexing Filter (index-basic)
060118 162226 Text Parse Plug-in (parse-text)
060118 162226 JavaScript Parser (parse-js)
060118 162226 Regex URL Filter (urlfilter-regex)
060118 162226 Basic Query Filter (query-basic)
060118 162226 Registered Extension-Points:
060118 162226 Nutch Protocol (org.apache.nutch.protocol.Protocol)
060118 162226 Nutch URL Filter (org.apache.nutch.net.URLFilter)
060118 162226 HTML Parse Filter (org.apache.nutch.parse.HtmlParseFilter)
060118 162226 Nutch Online Search Results Clustering Plugin (org.apache.nutch.clustering.OnlineClusterer)
060118 162226 Nutch Indexing Filter (org.apache.nutch.indexer.IndexingFilter)
060118 162226 Nutch Content Parser (org.apache.nutch.parse.Parser)
060118 162226 Ontology Model Loader (org.apache.nutch.ontology.Ontology)
060118 162226 Nutch Analysis (org.apache.nutch.analysis.NutchAnalyzer)
060118 162226 Nutch Query Filter (org.apache.nutch.searcher.QueryFilter)
060118 162226 found resource crawl-urlfilter.txt at file:/C:/Documents%20and%20Settings/Sameer%20Tamsekar/My%20Documents/project/NutchNight/bin/crawl-urlfilter.txt
060118 162226 map 0%
060118 162226 c:\urls\urllist.txt:0+20
060118 162227 map -485200%
060118 162227 c:\urls\urllist.txt:0+20
060118 162228 map -1433905%
060118 162228 c:\urls\urllist.txt:0+20
060118 162229 map -2369110%
060118 162229 c:\urls\urllist.txt:0+20
060118 162230 map -3320875%
060118 162230 c:\urls\urllist.txt:0+20
060118 162231 map -4257505%
060118 162231 c:\urls\urllist.txt:0+20
060118 162232 map -5189935%
060118 162232 c:\urls\urllist.txt:0+20
060118 162233 map -6120770%
060118 162233 c:\urls\urllist.txt:0+20
060118 162234 map -7072030%
060118 162234 c:\urls\urllist.txt:0+20
060118 162235 map -7974285%
060118 162235 c:\urls\urllist.txt:0+20
060118 162236 map -8906220%
060118 162236 c:\urls\urllist.txt:0+20
060118 162237 map -9854044%
060118 162237 c:\urls\urllist.txt:0+20
060118 162238 map -10800620%
060118 162238 c:\urls\urllist.txt:0+20
060118 162239 map -11773884%
060118 162240 c:\urls\urllist.txt:0+20
060118 162240 map -12733334%
060118 162241 c:\urls\urllist.txt:0+20
060118 162241 map -13682200%
060118 162242 c:\urls\urllist.txt:0+20
060118 162242 map -14657790%
060118 162243 c:\urls\urllist.txt:0+20
060118 162243 map -15636600%
060118 162244 c:\urls\urllist.txt:0+20
060118 162244 map -16615990%
060118 162245 c:\urls\urllist.txt:0+20
060118 162245 map -17590724%
060118 162246 c:\urls\urllist.txt:0+20
060118 162246 map -18569596%
060118 162247 c:\urls\urllist.txt:0+20
060118 162247 map -19514354%
060118 162248 c:\urls\urllist.txt:0+20
060118 162248 map -20477716%
060118 162249 c:\urls\urllist.txt:0+20
060118 162249 map -21455666%
060118 162250 c:\urls\urllist.txt:0+20
060118 162250 map -22386566%
