Hi,
I tried injecting www.google.com into my crawldb without prepending
http:// to it.
The inject step accepted it without complaint, but when I ran generate it gave
the following warning:
"Malformed URL: 'www.google.com', skipping (java.net.MalformedURLException:
no protocol: www.google.com"

Why doesn't Nutch assume that web links beginning with www. use the http
protocol?
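
For reference, the exception in that warning comes from java.net.URL itself and
can be reproduced outside Nutch. Below is a minimal sketch of that, plus one way
the seed could be fixed up before injecting. The class name SeedUrlCheck and the
helper withDefaultScheme are hypothetical and not part of Nutch, and defaulting
to http:// is only an assumption about what such a fallback might look like.

```java
import java.net.MalformedURLException;
import java.net.URL;

public class SeedUrlCheck {

    // Hypothetical helper (not a Nutch API): prepend http:// when the
    // seed has no scheme separator at all.
    static String withDefaultScheme(String seed) {
        String s = seed.trim();
        return s.contains("://") ? s : "http://" + s;
    }

    // Returns true if java.net.URL accepts the string, false if it
    // throws MalformedURLException (as it does for "www.google.com").
    static boolean parses(String s) {
        try {
            new URL(s);
            return true;
        } catch (MalformedURLException e) {
            return false;
        }
    }

    public static void main(String[] args) {
        String seed = "www.google.com";
        System.out.println(seed + " parses: " + parses(seed));
        String fixed = withDefaultScheme(seed);
        System.out.println(fixed + " parses: " + parses(fixed));
    }
}
```

Running the helper over the seed list before inject would sidestep the warning,
at the cost of guessing the scheme for every bare host name.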

Thanks,
Diaa
