Hello there.

Is it possible that the plugin lifecycle is broken or at least buggy?

I'm trying to set up Nutch 1.13 with Solr 6.6.1 so that it crawls the intranet.
A lot of our documents are accessed via SMB, and to make the URLs in the
search results actually clickable, I want Nutch to fetch the documents via
SMB/jcifs.

So first I configured Nutch to scan URLs like smb://server/share.
Nutch writes into the logs that the smb protocol is unknown and therefore the
URL is skipped (yes, it had already passed all the regex filters).
Then I installed the protocol-smb plugin from here:
https://issues.apache.org/jira/browse/NUTCH-427
Nutch confirms that protocol-smb is loaded on startup and registered in the
PluginRepository.
But right after that, Nutch again writes into the logs that the smb protocol
is unknown and therefore the URL is skipped...

So I wondered what had happened here and went to check the plugin source
code.
It seems that as soon as the protocol-smb plugin is instantiated, it writes a
log message indicating this fact. Then it tries to register the SMB protocol
URLHandler with the JVM and again writes a log message. I have not seen
either of these two messages.

Then I checked the Nutch 1.13 source code, especially the PluginRepository
class. It detects and successfully registers the plugins, and the code
comments say it saves resources by instantiating plugins only when they are
required. So it is intentional that the protocol-smb plugin is registered
but not instantiated. This creates a chicken-and-egg problem.
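
To illustrate what I mean by "registered but not instantiated", here is a
minimal sketch of lazy instantiation. It is not the Nutch PluginRepository
code; Registry, SmbPlugin and the log list are names I made up for the
example, and I am only assuming the real class behaves roughly like this.

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;
import java.util.function.Supplier;

final class LazyPluginSketch {

    /** Hypothetical toy registry: stores a factory per plugin id, builds the plugin on first lookup. */
    static final class Registry {
        final List<String> log = new ArrayList<>();
        private final Map<String, Supplier<Object>> factories = new HashMap<>();
        private final Map<String, Object> instances = new HashMap<>();

        /** Registration only remembers the factory; no plugin code runs here. */
        void register(String id, Supplier<Object> factory) {
            factories.put(id, factory);
            log.add("registered " + id);
        }

        /** The plugin constructor (and any side effect in it) runs only on the first call. */
        Object get(String id) {
            Object existing = instances.get(id);
            if (existing != null) {
                return existing;
            }
            Supplier<Object> factory = factories.get(id);
            if (factory == null) {
                throw new IllegalArgumentException("unknown plugin: " + id);
            }
            Object created = factory.get();
            instances.put(id, created);
            return created;
        }
    }

    /** Hypothetical plugin whose constructor has a side effect, like registering a URL handler. */
    static final class SmbPlugin {
        SmbPlugin(List<String> log) {
            log.add("instantiated protocol-smb");
        }
    }

    public static void main(String[] args) {
        Registry registry = new Registry();
        registry.register("protocol-smb", () -> new SmbPlugin(registry.log));
        System.out.println("after register:  " + registry.log);
        registry.get("protocol-smb");
        System.out.println("after first use: " + registry.log);
    }
}
```

In this sketch the constructor's side effect only appears in the log after
the first get() call. If nothing ever asks for the plugin because its
protocol is rejected earlier, the side effect never happens.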

If the protocol plugin does not get instantiated, it cannot register its
protocol. So although the plugin is registered, the smb://... URLs will throw
MalformedURLExceptions.
And more generally speaking: plugins are not able to initialize right after
being registered, only when they are instantiated on demand. My feeling is
that something is missing from the plugin lifecycle...
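
The MalformedURLException part can be reproduced outside Nutch. The sketch
below assumes a stock JVM without jcifs on the classpath and without
java.protocol.handler.pkgs set; StubSmbHandler is a made-up placeholder, not
the handler from the protocol-smb plugin, and it never opens a connection.

```java
import java.io.IOException;
import java.net.MalformedURLException;
import java.net.URL;
import java.net.URLConnection;
import java.net.URLStreamHandler;

final class SmbUrlDemo {

    /** Hypothetical stand-in for a real SMB handler; it only lets the URL be parsed. */
    static final class StubSmbHandler extends URLStreamHandler {
        @Override
        protected URLConnection openConnection(URL u) throws IOException {
            throw new IOException("stub handler: no SMB support here");
        }
    }

    /** Returns true if java.net.URL accepts the spec using only the handlers the JVM knows. */
    static boolean parsesWithDefaultHandlers(String spec) {
        try {
            new URL(spec);
            return true;
        } catch (MalformedURLException e) {
            // Thrown when no handler is known for the protocol.
            return false;
        }
    }

    /** Parses the spec with an explicitly supplied handler, bypassing the JVM-wide lookup. */
    static URL parseWithHandler(String spec, URLStreamHandler handler) {
        try {
            return new URL(null, spec, handler);
        } catch (MalformedURLException e) {
            throw new IllegalArgumentException(e);
        }
    }

    public static void main(String[] args) {
        String spec = "smb://server/share";
        System.out.println("default handlers accept it: " + parsesWithDefaultHandlers(spec));
        URL url = parseWithHandler(spec, new StubSmbHandler());
        System.out.println("with explicit handler: protocol=" + url.getProtocol()
                + " host=" + url.getHost() + " path=" + url.getPath());
    }
}
```

So the URL only becomes parseable once some handler for smb is known, which
is exactly the step that never seems to run in my setup.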

Any ideas? Or should this post go to the developers' list?

Hiran


Hiran Chaudhuri
Principal Support Engineer
Service Reliability Engineering - Custom
Amadeus Data Processing GmbH
Berghamer Strasse 6
85435 Erding
T: +49-8122-43x3662
hiran.chaudh...@amadeus.com
http://amadeus.com