Hi Breno,

On Tue, Jun 2, 2015 at 1:38 AM, <[email protected]> wrote:

>
> We are indexing several domains for a specific project, which may contain
> duplicated content (e.g. pdf files). The users of the system come from
> different organisations and wonder why the content is not appearing under
> certain domains. It's a usability issue (with a political aftertaste).
>

Thanks for the explanation.


>
> Yes, I extended Signature, and I'm also able to use it through the
> db.signature.class property, if I pack the class into its own jar and put
> it into nutch/lib. I'd much rather like to include it in our existing
> plugin jar, though.


This is rather strange, as Signatures are part of the *core* codebase, i.e.
/src/java and not /src/plugins. Does this make sense?
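
For illustration only, here is a minimal, standalone sketch of what a
host-aware signature might compute, assuming the goal is to keep identical
documents served from different domains distinct. The class and method names
are hypothetical, and the sketch deliberately does not extend Nutch's
Signature class; in Nutch the equivalent logic would sit in a Signature
subclass selected through db.signature.class.

```java
import java.net.URI;
import java.nio.charset.StandardCharsets;
import java.security.MessageDigest;
import java.security.NoSuchAlgorithmException;
import java.util.Arrays;

// Hypothetical sketch, not Nutch code: mixes the URL's host into an MD5
// digest so that the same bytes on two hosts yield two different signatures.
public class HostAwareSignatureSketch {

    // Assumed helper: digest over host + NUL separator + raw content bytes.
    static byte[] calculate(String url, byte[] content) {
        try {
            MessageDigest md5 = MessageDigest.getInstance("MD5");
            String host = URI.create(url).getHost();
            md5.update((host == null ? "" : host).getBytes(StandardCharsets.UTF_8));
            md5.update((byte) 0); // separator between host and content
            md5.update(content);
            return md5.digest();
        } catch (NoSuchAlgorithmException e) {
            // MD5 is required on every Java platform, so this should not happen.
            throw new IllegalStateException(e);
        }
    }

    public static void main(String[] args) {
        byte[] pdf = "same bytes".getBytes(StandardCharsets.UTF_8);
        byte[] a = calculate("http://a.example.org/doc.pdf", pdf);
        byte[] b = calculate("http://b.example.org/doc.pdf", pdf);
        // Compares the two signatures for identical content on different hosts.
        System.out.println(Arrays.equals(a, b));
    }
}
```

With a signature along these lines, deduplication would only collapse copies
that live on the same host, which is one possible way to address the
"missing under certain domains" complaint.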


> I'm not sure what you mean by ".job jar.".


If you build the Nutch source, you'll see /runtime/deploy/nutch.XXX.job.
This is the main artifact sent to deployment clusters (JobTracker).


> We have been developing our plugin outside of nutch and placing the
> corresponding jars into a plugin directory together with the plugin.xml. Is
> there any "magic" happening regarding the classpath when one has ant
> building it inside nutch?


In general, our plugin documentation can be found here:
http://wiki.apache.org/nutch/PluginCentral
For class loading specifically, see:
http://wiki.apache.org/nutch/WhatsTheProblemWithPluginsAndClass-loading
This is why I think it is a bit strange that you've implemented your
signature as a plugin and not as part of the core codebase.



> Is there a naming convention regarding the plugin name and corresponding
> jar? Do they have to match?
>

For plugins, their accompanying and required files, and naming conventions,
please see
http://wiki.apache.org/nutch/WritingPluginExample


>
> The reason behind developing our plugin outside of nutch and decoupling
> the build environment is to make updates of nutch easier. That way we can
> simply download the binary release and overlay our plugin. I realize now
> this seems to be a little off the usual way of writing plugins for nutch.
>

I understand that, and I would say it makes sense. However, as I said before,
Signatures are usually part of the core codebase and not implemented as
plugins (at least I've never implemented a signature as a plugin).
hth
Lewis
