I crawl my sites with Nutch 1.3. When Nutch crawls them, I see this exception
in my log:
Malformed URL: '', skipping (java.net.MalformedURLException: no protocol:
    at java.net.URL.<init>(URL.java:567)
    at java.net.URL.<init>(URL.java:464)
    at java.net.URL.<init>(URL.java:413)
    at org.apache.nutch.crawl.Generator$Selector.reduce(Generator.java:247)
    at org.apache.nutch.crawl.Generator$Selector.reduce(Generator.java:109)
    at org.apache.hadoop.mapred.ReduceTask.runOldReducer(ReduceTask.java:463)
    at org.apache.hadoop.mapred.ReduceTask.run(ReduceTask.java:411)
    at org.apache.hadoop.mapred.LocalJobRunner$Job.run(LocalJobRunner.java:216)
)
--
View this message in context:
http://lucene.472066.n3.nabble.com/Malformed-URL-skipping-java-net-MalformedURLException-tp3590161p3590161.html
Sent from the Nutch - User mailing list archive at Nabble.com.