https://wiki.apache.org/nutch/IndexMetatags 

As soon as I switch to nutch-site_v2, Nutch throws protocol-missing errors
during the crawl.

2016-09-06 12:23:53,102 INFO fetcher.Fetcher - -activeThreads=50, spinWaiting=50, fetchQueues.totalSize=442, fetchQueues.getQueueCount=1
2016-09-06 12:23:53,576 INFO fetcher.FetcherThread - fetching https://snip/inside/events/events_summary/documents/Harford_Co_Sheriff_Special_Brief.pdf (queue crawl delay=500ms)
2016-09-06 12:23:53,576 INFO fetcher.FetcherThread - fetch of https://snip/inside/events/events_summary/documents/Harford_Co_Sheriff_Special_Brief.pdf failed with: org.apache.nutch.protocol.ProtocolNotFound: protocol not found for url=https
    at org.apache.nutch.protocol.ProtocolFactory.getProtocol(ProtocolFactory.java:84)
    at org.apache.nutch.fetcher.FetcherThread.run(FetcherThread.java:257)

How can I fix this?

Kris 

Attachment: nutch-site.xml
Description: XML document

Attachment: nutch-site_v2.xml
Description: XML document
