No worries, thanks for the responses so far. Unfortunately, I've been
looking at parse-tika for the last hour or so, racking my brain trying to
figure out what is different there vs. what I am doing. From what I can see,
pretty much everything is the same. Sorry to flood you with info, but
here are some more details. I'm going to continue troubleshooting and see if
I get anywhere; if you have any suggestions, let me know!
Here is my plugin.xml:
<plugin
   id="index-vulns"
   name="Nutch Website Vulnerability Indexer"
   version="1.0.0"
   provider-name="nutch.org">
   <runtime>
      <library name="index-vulns.jar">
         <export name="*"/>
      </library>
      <library name="commons-codec-1.6.jar"/>
      <library name="commons-io-2.4.jar"/>
      <library name="commons-logging-1.1.1.jar"/>
      <library name="fluent-hc-4.2.5.jar"/>
      <library name="httpclient-4.2.5.jar"/>
      <library name="httpclient-cache-4.2.5.jar"/>
      <library name="httpcore-4.2.4.jar"/>
      <library name="httpmime-4.2.5.jar"/>
      <library name="web3-scanner-1.0.jar"/>
   </runtime>
   <requires>
      <import plugin="nutch-extensionpoints"/>
   </requires>
   <extension id="org.apache.nutch.indexer.vulns"
              name="Nutch Website Vulnerability Indexer"
              point="org.apache.nutch.indexer.IndexingFilter">
      <implementation id="VulnIndexingFilter"
                      class="org.apache.nutch.indexer.vulns.VulnIndexingFilter"/>
   </extension>
</plugin>
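Since every <library name="..."/> entry has to match a jar file name exactly, a quick sanity check is to compare the manifest against the files actually deployed. The following is only a minimal sketch: the class name PluginLibraryCheck is made up for illustration, and it assumes the jars sit in the same directory as plugin.xml (the default path in main is a guess at the layout, not something confirmed here).

```java
import java.io.File;
import java.util.ArrayList;
import java.util.List;
import javax.xml.parsers.DocumentBuilderFactory;
import org.w3c.dom.Document;
import org.w3c.dom.Element;
import org.w3c.dom.NodeList;

// Hypothetical helper, not part of Nutch.
public class PluginLibraryCheck {

    /** Returns the <library name="..."> values that have no matching file in pluginDir. */
    static List<String> missingLibraries(File pluginDir) throws Exception {
        Document doc = DocumentBuilderFactory.newInstance()
                .newDocumentBuilder()
                .parse(new File(pluginDir, "plugin.xml"));
        NodeList libraries = doc.getElementsByTagName("library");
        List<String> missing = new ArrayList<>();
        for (int i = 0; i < libraries.getLength(); i++) {
            String name = ((Element) libraries.item(i)).getAttribute("name");
            // Assumption: each jar is expected directly next to plugin.xml.
            if (!new File(pluginDir, name).isFile()) {
                missing.add(name);
            }
        }
        return missing;
    }

    public static void main(String[] args) throws Exception {
        // Assumed default location; pass the real plugin directory as the first argument.
        File pluginDir = new File(args.length > 0 ? args[0] : "plugins/index-vulns");
        List<String> missing = missingLibraries(pluginDir);
        if (missing.isEmpty()) {
            System.out.println("All libraries listed in plugin.xml were found.");
        } else {
            System.out.println("Listed in plugin.xml but not found: " + missing);
        }
    }
}
```

Running it against the deployed plugin directory (rather than the source tree) is the more telling check, since that is what the runtime actually sees.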
Everything seems to go just fine in the build. Then when I try to run
indexchecker I get:
punk@punk-kali:~/HGDev/web3-spider-compiled/nutch/runtime/local$ bin/nutch indexchecker "http://sqli1.hyperiongray.com/?user=root"
fetching: http://sqli1.hyperiongray.com/?user=root
http://sqli1.hyperiongray.com/?user=root skipped. Content of size 88 was truncated to 69
Content is truncated, parse may fail!
parsing: http://sqli1.hyperiongray.com/?user=root
contentType: text/html
Exception in thread "main" java.lang.NoClassDefFoundError: org/apache/http/client/utils/URIBuilder
    at com.hyperiongray.web3scanner.UriGenerator.getUrlsToFuzz(UriGenerator.java:20)
    at com.hyperiongray.web3scanner.SqliScanner.generateUrls(SqliScanner.java:35)
    at com.hyperiongray.web3scanner.SqliScanner.scan(SqliScanner.java:74)
    at org.apache.nutch.indexer.vulns.VulnIndexingFilter.filter(VulnIndexingFilter.java:63)
    at org.apache.nutch.indexer.IndexingFilters.filter(IndexingFilters.java:109)
    at org.apache.nutch.indexer.IndexingFiltersChecker.run(IndexingFiltersChecker.java:126)
    at org.apache.hadoop.util.ToolRunner.run(ToolRunner.java:65)
    at org.apache.nutch.indexer.IndexingFiltersChecker.main(IndexingFiltersChecker.java:150)
Caused by: java.lang.ClassNotFoundException: org.apache.http.client.utils.URIBuilder
    at java.net.URLClassLoader$1.run(URLClassLoader.java:366)
    at java.net.URLClassLoader$1.run(URLClassLoader.java:355)
    at java.security.AccessController.doPrivileged(Native Method)
    at java.net.URLClassLoader.findClass(URLClassLoader.java:354)
    at java.lang.ClassLoader.loadClass(ClassLoader.java:424)
    at sun.misc.Launcher$AppClassLoader.loadClass(Launcher.java:308)
    at java.lang.ClassLoader.loadClass(ClassLoader.java:357)
    ... 8 more
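The "Caused by" part of the trace shows the lookup for URIBuilder going through sun.misc.Launcher$AppClassLoader, so it may help to confirm which class loader actually defined the scanner classes and which jar they came from. Below is a rough sketch using only standard JDK calls; the class name ClassOrigin is made up for illustration, and where to call it from (for example, logging describe(SqliScanner.class) inside the filter) is just a suggestion.

```java
import java.security.CodeSource;

// Hypothetical diagnostic helper, not part of Nutch or the scanner library.
public class ClassOrigin {

    /** Describes which class loader defined the class and where its bytes came from. */
    static String describe(Class<?> type) {
        // getClassLoader() returns null for classes defined by the bootstrap loader.
        ClassLoader loader = type.getClassLoader();
        // The code source can be null as well, e.g. for core JDK classes.
        CodeSource source = type.getProtectionDomain().getCodeSource();
        String location = (source == null || source.getLocation() == null)
                ? "unknown location"
                : source.getLocation().toString();
        String loaderName = (loader == null) ? "bootstrap loader" : loader.toString();
        return type.getName() + " <- " + loaderName + " @ " + location;
    }

    public static void main(String[] args) throws Exception {
        // Defaults to inspecting this class; pass a fully qualified class name to inspect another.
        String name = args.length > 0 ? args[0] : ClassOrigin.class.getName();
        System.out.println(describe(Class.forName(name)));
    }
}
```

If the scanner classes turn out to come from the application class loader rather than the plugin's own loader, that would at least be consistent with the trace above, though it is only one possible explanation.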
On Wed, Jul 24, 2013 at 12:43 AM, Lewis John Mcgibbney <
[email protected]> wrote:
> parse-tika is an excellent example here.
> I am sorry I'm not in front of the code right now.
> hth
>
> On Tuesday, July 23, 2013, AC Nutch <[email protected]> wrote:
> > Excellent, thank you! It did clarify a bit - in particular now I know that
> > I'm not crazy and that this is indeed a somewhat common problem with a
> > solution. However, I'm still a little lost on the solution part of it :-).
> > If I understand correctly...
> >
> > In the plugin manifest file I need to define the dependencies and they will
> > be added to the class-loader such that they will get used. However, I'm a
> > little shaky on how to do that. In my plugin.xml file I have the following:
> >
> > <runtime>
> > <library name="index-vulns.jar">
> > <export name="*"/>
> > </library>
> > <library name="commons-codec-1.6.jar"/>
> > <library name="commons-io-2.4.jar"/>
> > <library name="commons-logging-1.1.1.jar"/>
> > <library name="fluent-hc-4.2.5.jar.jar"/>
> > <library name="httpclient-4.2.5.jar"/>
> > <library name="httpclient-cache-4.2.5.jar"/>
> > <library name="httpcore-4.2.4.jar"/>
> > <library name="httpmime-4.2.5"/>
> > <library name="web3-scanner-1.0.jar"/>
> > </runtime>
> >
> > If I understand properly this should be sufficient to have the jars added
> > to the plugin class-loader. However, that doesn't appear to be the case - I
> > must be missing something, but I'm not sure what that is...?
> >
> > Alex
> >
> >
> >
> > On Tue, Jul 23, 2013 at 11:30 PM, Lewis John Mcgibbney <
> > [email protected]> wrote:
> >
> >> Hi Alex,
> >> About now is a good time to read how Nutch deals with classloading.
> >> Navigate to plugin central on the wiki and you will see the documentation.
> >> hth you out
> >> Lewis
> >>
> >> On Tuesday, July 23, 2013, AC Nutch <[email protected]> wrote:
> >> > Hi All,
> >> >
> >> > I'm attempting to build a Nutch plugin on Nutch 1.7 with some external
> >> > dependencies. The way I've handled this in the past is to just put the
> >> > dependencies in lib/ and be done with it. However, now I have some
> >> > dependencies that are newer versions of dependencies already present in
> >> > Nutch. For example, I'm using the Apache httpclient-4.2.5.jar library
> >> > whereas Nutch appears to use httpclient-4.1.2 and httpclient-4.1.3.
> >> > Unfortunately, I do need the plugin to compile and run with my version (I
> >> > can't just downgrade, not a valid solution).
> >> >
> >> > I've added the necessary jars to my plugin's ivy.xml and they download just
> >> > fine and the plugin compiles, which is wonderful. However, when I go to run
> >> > it (in local mode) I get "Exception in thread "main"
> >> > java.lang.NoClassDefFoundError:
> >> > org/apache/http/client/utils/URLEncodedUtils". Moving my own later versions
> >> > of the file to the ClassPath in runtime/local/lib/ gives me additional
> >> > issues - "Exception in thread "main" java.lang.NoSuchMethodError:
> >> > org.apache.http.client.utils.URLEncodedUtils.parse" which appears to be an
> >> > error caused by Nutch trying to use an older version of the httpclient
> >> > library (it disappears if I remove the "older" jars from the classpath).
> >> >
> >> > My question is - how would I go about adding these dependencies such that
> >> > my plugin would use these jars and I'm not having to remove libraries that
> >> > Nutch may need? I believe I'm missing a critical step here or a missing
> >> > directive in one of the plugin XML directive files... any ideas?
> >> >
> >> > Thanks!
> >> >
> >> > Alex
> >> >
> >>
> >> --
> >> *Lewis*
> >>
> >
>
> --
> *Lewis*
>