No worries, thanks for the responses so far. Unfortunately, I've been
looking at parse-tika for the last hour or so, racking my brain trying to
figure out what is different there vs. what I am doing. From what I see,
pretty much everything is the same. Sorry to flood you with info, but
here are some more details. I'm going to continue troubleshooting and see if
I get anywhere; if you have any suggestions, let me know!

Here is my plugin.xml:

<plugin
   id="index-vulns"
   name="Nutch Website Vulnerability Indexer"
   version="1.0.0"
   provider-name="nutch.org">

   <runtime>
      <library name="index-vulns.jar">
         <export name="*"/>
      </library>
      <library name="commons-codec-1.6.jar"/>
      <library name="commons-io-2.4.jar"/>
      <library name="commons-logging-1.1.1.jar"/>
      <library name="fluent-hc-4.2.5.jar"/>
      <library name="httpclient-4.2.5.jar"/>
      <library name="httpclient-cache-4.2.5.jar"/>
      <library name="httpcore-4.2.4.jar"/>
      <library name="httpmime-4.2.5.jar"/>
      <library name="web3-scanner-1.0.jar"/>
   </runtime>

   <requires>
      <import plugin="nutch-extensionpoints"/>
   </requires>

   <extension id="org.apache.nutch.indexer.vulns"
              name="Nutch Website Vulnerability Indexer"
              point="org.apache.nutch.indexer.IndexingFilter">
      <implementation id="VulnIndexingFilter"
                      class="org.apache.nutch.indexer.vulns.VulnIndexingFilter"/>
   </extension>
</plugin>
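
One way to rule out a simple name mismatch between the <library> entries
and the jars that actually get deployed is a small check like the one below.
It is only a sketch: the class name PluginLibCheck is made up for
illustration and is not part of Nutch, and it assumes the jars sit in the
same directory as the deployed plugin.xml (for a local runtime, presumably
the plugin's own folder under plugins/).

```java
import java.io.File;
import java.util.ArrayList;
import java.util.List;
import javax.xml.parsers.DocumentBuilderFactory;
import org.w3c.dom.Element;
import org.w3c.dom.NodeList;

// Hypothetical helper, not part of Nutch.
public class PluginLibCheck {

    /**
     * Returns the name of every <library name="..."> entry in pluginXml
     * for which no regular file exists in the same directory.
     * Assumption: jars are deployed next to plugin.xml.
     */
    static List<String> missingLibraries(File pluginXml) throws Exception {
        NodeList libs = DocumentBuilderFactory.newInstance()
                .newDocumentBuilder()
                .parse(pluginXml)
                .getElementsByTagName("library");
        File dir = pluginXml.getAbsoluteFile().getParentFile();
        List<String> missing = new ArrayList<>();
        for (int i = 0; i < libs.getLength(); i++) {
            String name = ((Element) libs.item(i)).getAttribute("name");
            if (!new File(dir, name).isFile()) {
                missing.add(name);
            }
        }
        return missing;
    }

    public static void main(String[] args) throws Exception {
        // Path to the deployed plugin.xml; defaults to the current directory.
        File xml = new File(args.length > 0 ? args[0] : "plugin.xml");
        for (String name : missingLibraries(xml)) {
            System.out.println("listed but not found: " + name);
        }
    }
}
```

Run against the deployed copy of plugin.xml, this would flag any entry whose
file name does not match a jar on disk, such as a doubled or missing ".jar"
suffix.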

Everything seems to go just fine in the build. Then when I try to run
indexchecker I get:

punk@punk-kali:~/HGDev/web3-spider-compiled/nutch/runtime/local$ bin/nutch indexchecker "http://sqli1.hyperiongray.com/?user=root"
fetching: http://sqli1.hyperiongray.com/?user=root
http://sqli1.hyperiongray.com/?user=root skipped. Content of size 88 was truncated to 69
Content is truncated, parse may fail!
parsing: http://sqli1.hyperiongray.com/?user=root
contentType: text/html
Exception in thread "main" java.lang.NoClassDefFoundError: org/apache/http/client/utils/URIBuilder
        at com.hyperiongray.web3scanner.UriGenerator.getUrlsToFuzz(UriGenerator.java:20)
        at com.hyperiongray.web3scanner.SqliScanner.generateUrls(SqliScanner.java:35)
        at com.hyperiongray.web3scanner.SqliScanner.scan(SqliScanner.java:74)
        at org.apache.nutch.indexer.vulns.VulnIndexingFilter.filter(VulnIndexingFilter.java:63)
        at org.apache.nutch.indexer.IndexingFilters.filter(IndexingFilters.java:109)
        at org.apache.nutch.indexer.IndexingFiltersChecker.run(IndexingFiltersChecker.java:126)
        at org.apache.hadoop.util.ToolRunner.run(ToolRunner.java:65)
        at org.apache.nutch.indexer.IndexingFiltersChecker.main(IndexingFiltersChecker.java:150)
Caused by: java.lang.ClassNotFoundException: org.apache.http.client.utils.URIBuilder
        at java.net.URLClassLoader$1.run(URLClassLoader.java:366)
        at java.net.URLClassLoader$1.run(URLClassLoader.java:355)
        at java.security.AccessController.doPrivileged(Native Method)
        at java.net.URLClassLoader.findClass(URLClassLoader.java:354)
        at java.lang.ClassLoader.loadClass(ClassLoader.java:424)
        at sun.misc.Launcher$AppClassLoader.loadClass(Launcher.java:308)
        at java.lang.ClassLoader.loadClass(ClassLoader.java:357)
        ... 8 more
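
The trace bottoms out in sun.misc.Launcher$AppClassLoader, so it may be
worth checking which class loader actually loaded the scanner classes and
from which jar. Below is a rough sketch of a probe for that. LoaderProbe is
a made-up name, not anything from Nutch or the scanner, and it uses only
standard JDK calls.

```java
import java.security.CodeSource;

// Hypothetical diagnostic helper, not part of Nutch.
public class LoaderProbe {

    /** Class loader chain for c, child first, always ending in "bootstrap". */
    static String loaderChain(Class<?> c) {
        StringBuilder sb = new StringBuilder();
        for (ClassLoader cl = c.getClassLoader(); cl != null; cl = cl.getParent()) {
            sb.append(cl.getClass().getName()).append(" -> ");
        }
        return sb.append("bootstrap").toString();
    }

    /** Location (jar or directory) c was loaded from, or "(unknown)". */
    static String origin(Class<?> c) {
        CodeSource src = c.getProtectionDomain().getCodeSource();
        if (src == null || src.getLocation() == null) {
            return "(unknown)";
        }
        return src.getLocation().toString();
    }

    public static void main(String[] args) throws Exception {
        // Class to inspect; defaults to a JDK class so the probe runs anywhere.
        String name = args.length > 0 ? args[0] : "java.lang.String";
        Class<?> c = Class.forName(name);
        System.out.println(name + " loaded by:   " + loaderChain(c));
        System.out.println(name + " loaded from: " + origin(c));
    }
}
```

Calling loaderChain and origin from inside the indexing filter and logging
the results for the filter class and for a scanner class would show whether
the scanner jar is being picked up from the plugin folder or from the main
classpath.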




On Wed, Jul 24, 2013 at 12:43 AM, Lewis John Mcgibbney <
[email protected]> wrote:

> parse-tika is an excellent example here.
> I am sorry I'm not in front of the code right now.
> hth
>
> On Tuesday, July 23, 2013, AC Nutch <[email protected]> wrote:
> > Excellent, thank you! It did clarify a bit - in particular now I know that
> > I'm not crazy and that this is indeed a somewhat common problem with a
> > solution. However, I'm still a little lost on the solution part of it :-).
> > If I understand correctly...
> >
> > In the plugin manifest file I need to define the dependencies and they will
> > be added to the class-loader such that they will get used.  However, I'm a
> > little shady on how to do that, in my plugin.xml file I have the following:
> >
> >    <runtime>
> >       <library name="index-vulns.jar">
> >          <export name="*"/>
> >       </library>
> >       <library name="commons-codec-1.6.jar"/>
> >       <library name="commons-io-2.4.jar"/>
> >       <library name="commons-logging-1.1.1.jar"/>
> >       <library name="fluent-hc-4.2.5.jar.jar"/>
> >       <library name="httpclient-4.2.5.jar"/>
> >       <library name="httpclient-cache-4.2.5.jar"/>
> >       <library name="httpcore-4.2.4.jar"/>
> >       <library name="httpmime-4.2.5"/>
> >       <library name="web3-scanner-1.0.jar"/>
> >    </runtime>
> >
> > If I understand properly this should be sufficient to have the jars added
> > to the plugin class-loader. However, that doesn't appear to be the case - I
> > must be missing something, but I'm not sure what that is...?
> >
> > Alex
> >
> >
> >
> > On Tue, Jul 23, 2013 at 11:30 PM, Lewis John Mcgibbney <
> > [email protected]> wrote:
> >
> >> Hi Alex,
> >> About now is a good time to read how Nutch deals with classloading.
> >> Navigate to plugin central on the wiki and you will see the documentation.
> >> hth you out
> >> Lewis
> >>
> >> On Tuesday, July 23, 2013, AC Nutch <[email protected]> wrote:
> >> > Hi All,
> >> >
> >> > I'm attempting to build a Nutch plugin on Nutch 1.7 with some external
> >> > dependencies. The way I've handled this in the past is to just put the
> >> > dependencies in lib/ and be done with it. However, now I have some
> >> > dependencies that are newer versions of dependencies already present in
> >> > Nutch. For example, I'm using the Apache httpclient-4.2.5.jar library
> >> > whereas Nutch appears to use httpclient-4.1.2 and httpclient-4.1.3.
> >> > Unfortunately, I do need the plugin to compile and run with my version (I
> >> > can't just downgrade, not a valid solution).
> >> >
> >> > I've added the necessary jars to my plugin's ivy.xml and they download just
> >> > fine and the plugin compiles, which is wonderful. However, when I go to run
> >> > it (in local mode) I get "Exception in thread "main"
> >> > java.lang.NoClassDefFoundError:
> >> > org/apache/http/client/utils/URLEncodedUtils". Moving my own later versions
> >> > of the file to the ClassPath in runtime/local/lib/ gives me additional
> >> > issues - "Exception in thread "main" java.lang.NoSuchMethodError:
> >> > org.apache.http.client.utils.URLEncodedUtils.parse" which appears to be an
> >> > error caused by Nutch trying to use an older version of the httpclient
> >> > library (it disappears if I remove the "older" jars from the classpath).
> >> >
> >> > My question is - how would I go about adding these dependencies such that
> >> > my plugin would use these jars and I'm not having to remove libraries that
> >> > Nutch may need? I believe I'm missing a critical step here or a missing
> >> > directive in one of the plugin XML directive files... any ideas?
> >> >
> >> > Thanks!
> >> >
> >> > Alex
> >> >
> >>
> >> --
> >> *Lewis*
> >>
> >
>
> --
> *Lewis*
>
