Hi,

Has anyone been able to use SFTP with Nutch 2.0?

* I have enabled the out-of-the-box SFTP plugin in nutch-site.xml / plugin.includes property
* I have added the appropriate line to prefix-urlfilter.txt
* I configured Nutch to accept everything in regex-urlfilter.txt
* I am trying to inject a single URL with SFTP to a clean HBase / Nutch / Solr setup

I consider my setup working properly otherwise since I am able to inject / generate / fetch / parse / etc. a sample of 1,000 URLs from the DMOZ Open Directory (similar to the Nutch 1.x tutorial).

Here is the output of the inject command:

InjectorJob: starting
InjectorJob: urlDir: ***censored***
Skipping sftp://***censored***/:java.net.MalformedURLException: unknown protocol: sftp
InjectorJob: finished

Here is the related snippet from the log file with TRACE level:

2012-09-27 11:21:50,874 DEBUG plugin.PluginRepository - parsing: /home/totha/development/apache-nutch-2.0/plugins/protocol-sftp/plugin.xml
2012-09-27 11:21:50,875 DEBUG plugin.PluginRepository - plugin: id=protocol-sftp name=Sftp Protocol Plug-in version=1.0.0 provider=nutch.orgclass=null
2012-09-27 11:21:50,875 DEBUG plugin.PluginRepository - impl: point=org.apache.nutch.protocol.Protocol class=org.apache.nutch.protocol.sftp.Sftp
...
2012-09-27 11:21:50,880 INFO plugin.PluginRepository - Registered Plugins:
2012-09-27 11:21:50,881 INFO plugin.PluginRepository - the nutch core extension points (nutch-extensionpoints)
2012-09-27 11:21:50,881 INFO plugin.PluginRepository - Basic URL Normalizer (urlnormalizer-basic)
2012-09-27 11:21:50,881 INFO plugin.PluginRepository - Html Parse Plug-in (parse-html)
2012-09-27 11:21:50,881 INFO plugin.PluginRepository - Basic Indexing Filter (index-basic)
2012-09-27 11:21:50,881 INFO plugin.PluginRepository - HTTP Framework (lib-http)
2012-09-27 11:21:50,881 INFO plugin.PluginRepository - Pass-through URL Normalizer (urlnormalizer-pass)
2012-09-27 11:21:50,881 INFO plugin.PluginRepository - Regex URL Filter (urlfilter-regex)
2012-09-27 11:21:50,881 INFO plugin.PluginRepository - Http Protocol Plug-in (protocol-http)
2012-09-27 11:21:50,881 INFO plugin.PluginRepository - Sftp Protocol Plug-in (protocol-sftp)
2012-09-27 11:21:50,881 INFO plugin.PluginRepository - Regex URL Normalizer (urlnormalizer-regex)
2012-09-27 11:21:50,881 INFO plugin.PluginRepository - Tika Parser Plug-in (parse-tika)
2012-09-27 11:21:50,881 INFO plugin.PluginRepository - OPIC Scoring Plug-in (scoring-opic)
2012-09-27 11:21:50,881 INFO plugin.PluginRepository - CyberNeko HTML Parser (lib-nekohtml)
2012-09-27 11:21:50,881 INFO plugin.PluginRepository - Anchor Indexing Filter (index-anchor)
2012-09-27 11:21:50,881 INFO plugin.PluginRepository - Regex URL Filter Framework (lib-regex-filter)

Thanks.
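
P.S. The same "unknown protocol: sftp" message can be reproduced outside Nutch with plain java.net.URL, which has no built-in handler for the sftp scheme. My guess (an assumption, not checked against the Nutch source) is that the injector parses the URL this way before the protocol-sftp plugin gets involved. A minimal sketch; the class name SftpUrlCheck and the host example.com are placeholders of mine:

```java
import java.net.MalformedURLException;
import java.net.URL;

public class SftpUrlCheck {
    // Returns null if java.net.URL accepts the string,
    // otherwise the message of the MalformedURLException.
    static String tryParse(String spec) {
        try {
            new URL(spec);
            return null;
        } catch (MalformedURLException e) {
            return e.getMessage();
        }
    }

    public static void main(String[] args) {
        // http has a built-in stream handler in the JDK.
        System.out.println("http -> " + tryParse("http://example.com/"));
        // sftp has none unless a URLStreamHandler is registered for it.
        System.out.println("sftp -> " + tryParse("sftp://example.com/"));
    }
}
```

On a stock JVM with no extra URLStreamHandler registered, the second call reports "unknown protocol: sftp", matching the message from the inject output.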

