As an update: after compilation, I removed the "older" libraries from
runtime/local/lib entirely and still got the same error. It seems the
plugin's libs are never actually making it onto my execution classpath. I
must be doing something wrong here...
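
For anyone following along, one way to check where a class really comes from
is to print its classloader and code source. The trace quoted below shows the
failed lookup going through sun.misc.Launcher$AppClassLoader, which suggests
the scanner classes were resolved on the main classpath rather than through
the plugin's own loader. The following is only a minimal sketch: WhereLoaded
and describe are hypothetical names, not part of Nutch or of my plugin.

```java
import java.security.CodeSource;

// Hypothetical helper (not part of Nutch): reports which classloader and
// which jar or directory a class was actually loaded from.
public class WhereLoaded {

    static String describe(Class<?> c) {
        ClassLoader cl = c.getClassLoader(); // null means the bootstrap loader
        CodeSource src = c.getProtectionDomain().getCodeSource();
        String loader = (cl == null) ? "bootstrap" : cl.getClass().getName();
        String origin = (src == null || src.getLocation() == null)
                ? "unknown"
                : src.getLocation().toString();
        return c.getName() + " loaded by " + loader + " from " + origin;
    }

    public static void main(String[] args) throws ClassNotFoundException {
        // Pass a fully qualified class name; defaults to this class itself.
        String name = (args.length > 0) ? args[0] : WhereLoaded.class.getName();
        System.out.println(describe(Class.forName(name)));
    }
}
```

Calling describe(getClass()) from inside the indexing filter, and on one of the
scanner classes such as UriGenerator, should show whether both are coming from
the plugin's loader or whether the scanner jar is being picked up from
runtime/local/lib instead.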


On Wed, Jul 24, 2013 at 1:45 AM, AC Nutch <[email protected]> wrote:

> No worries, thanks for the responses so far. Unfortunately, I've been
> looking at parse-tika for the last hour or so, racking my brain trying to
> figure out what is different there vs. what I am doing. From what I see
> pretty much everything is the same... sorry to flood you with info, but
> here are some more details. I'm going to continue troubleshooting and see if
> I get anywhere; if you have any suggestions let me know!
>
> Here is my plugin.xml:
>
> <plugin
>    id="index-vulns"
>    name="Nutch Website Vulnerability Indexer"
>    version="1.0.0"
>    provider-name="nutch.org">
>
>
>    <runtime>
>       <library name="index-vulns.jar">
>          <export name="*"/>
>       </library>
>       <library name="commons-codec-1.6.jar"/>
>       <library name="commons-io-2.4.jar"/>
>       <library name="commons-logging-1.1.1.jar"/>
>       <library name="fluent-hc-4.2.5.jar"/>
>
>       <library name="httpclient-4.2.5.jar"/>
>       <library name="httpclient-cache-4.2.5.jar"/>
>       <library name="httpcore-4.2.4.jar"/>
>       <library name="httpmime-4.2.5.jar"/>
>
>       <library name="web3-scanner-1.0.jar"/>
>    </runtime>
>
>    <requires>
>       <import plugin="nutch-extensionpoints"/>
>    </requires>
>
>    <extension id="org.apache.nutch.indexer.vulns"
>               name="Nutch Website Vulnerability Indexer"
>               point="org.apache.nutch.indexer.IndexingFilter">
>       <implementation id="VulnIndexingFilter"
>                       class="org.apache.nutch.indexer.vulns.VulnIndexingFilter"/>
>    </extension>
> </plugin>
>
> Everything seems to go just fine in the build. Then when I try to run
> indexchecker I get:
>
> punk@punk-kali:~/HGDev/web3-spider-compiled/nutch/runtime/local$
> bin/nutch indexchecker "http://sqli1.hyperiongray.com/?user=root";
> fetching: http://sqli1.hyperiongray.com/?user=root
> http://sqli1.hyperiongray.com/?user=root skipped. Content of size 88 was
> truncated to 69
> Content is truncated, parse may fail!
> parsing: http://sqli1.hyperiongray.com/?user=root
> contentType: text/html
> Exception in thread "main" java.lang.NoClassDefFoundError: org/apache/http/client/utils/URIBuilder
>         at com.hyperiongray.web3scanner.UriGenerator.getUrlsToFuzz(UriGenerator.java:20)
>         at com.hyperiongray.web3scanner.SqliScanner.generateUrls(SqliScanner.java:35)
>         at com.hyperiongray.web3scanner.SqliScanner.scan(SqliScanner.java:74)
>         at org.apache.nutch.indexer.vulns.VulnIndexingFilter.filter(VulnIndexingFilter.java:63)
>         at org.apache.nutch.indexer.IndexingFilters.filter(IndexingFilters.java:109)
>         at org.apache.nutch.indexer.IndexingFiltersChecker.run(IndexingFiltersChecker.java:126)
>         at org.apache.hadoop.util.ToolRunner.run(ToolRunner.java:65)
>         at org.apache.nutch.indexer.IndexingFiltersChecker.main(IndexingFiltersChecker.java:150)
> Caused by: java.lang.ClassNotFoundException: org.apache.http.client.utils.URIBuilder
>         at java.net.URLClassLoader$1.run(URLClassLoader.java:366)
>         at java.net.URLClassLoader$1.run(URLClassLoader.java:355)
>         at java.security.AccessController.doPrivileged(Native Method)
>         at java.net.URLClassLoader.findClass(URLClassLoader.java:354)
>         at java.lang.ClassLoader.loadClass(ClassLoader.java:424)
>         at sun.misc.Launcher$AppClassLoader.loadClass(Launcher.java:308)
>         at java.lang.ClassLoader.loadClass(ClassLoader.java:357)
>         ... 8 more
>
>
>
>
> On Wed, Jul 24, 2013 at 12:43 AM, Lewis John Mcgibbney <
> [email protected]> wrote:
>
>> parse-tika is an excellent example here.
>> I am sorry I'm not in front of the code right now.
>> hth
>>
>> On Tuesday, July 23, 2013, AC Nutch <[email protected]> wrote:
>> > Excellent, thank you! It did clarify a bit - in particular now I know that
>> > I'm not crazy and that this is indeed a somewhat common problem with a
>> > solution. However, I'm still a little lost on the solution part of it :-).
>> > If I understand correctly...
>> >
>> > In the plugin manifest file I need to define the dependencies and they will
>> > be added to the class-loader such that they will get used. However, I'm a
>> > little hazy on how to do that; in my plugin.xml file I have the following:
>> >
>> >    <runtime>
>> >       <library name="index-vulns.jar">
>> >          <export name="*"/>
>> >       </library>
>> >       <library name="commons-codec-1.6.jar"/>
>> >       <library name="commons-io-2.4.jar"/>
>> >       <library name="commons-logging-1.1.1.jar"/>
>> >       <library name="fluent-hc-4.2.5.jar.jar"/>
>> >       <library name="httpclient-4.2.5.jar"/>
>> >       <library name="httpclient-cache-4.2.5.jar"/>
>> >       <library name="httpcore-4.2.4.jar"/>
>> >       <library name="httpmime-4.2.5"/>
>> >       <library name="web3-scanner-1.0.jar"/>
>> >    </runtime>
>> >
>> > If I understand properly this should be sufficient to have the jars added
>> > to the plugin class-loader. However, that doesn't appear to be the case - I
>> > must be missing something, but I'm not sure what that is...?
>> >
>> > Alex
>> >
>> >
>> >
>> > On Tue, Jul 23, 2013 at 11:30 PM, Lewis John Mcgibbney <
>> > [email protected]> wrote:
>> >
>> >> Hi Alex,
>> >> About now is a good time to read how Nutch deals with classloading.
>> >> Navigate to plugin central on the wiki and you will see the documentation.
>> >> hth you out
>> >> Lewis
>> >>
>> >> On Tuesday, July 23, 2013, AC Nutch <[email protected]> wrote:
>> >> > Hi All,
>> >> >
>> >> > I'm attempting to build a Nutch plugin on Nutch 1.7 with some external
>> >> > dependencies. The way I've handled this in the past is to just put the
>> >> > dependencies in lib/ and be done with it. However, now I have some
>> >> > dependencies that are newer versions of dependencies already present in
>> >> > Nutch. For example, I'm using the Apache httpclient-4.2.5.jar library
>> >> > whereas Nutch appears to use httpclient-4.1.2 and httpclient-4.1.3.
>> >> > Unfortunately, I do need the plugin to compile and run with my version (I
>> >> > can't just downgrade, not a valid solution).
>> >> >
>> >> > I've added the necessary jars to my plugin's ivy.xml and they download just
>> >> > fine and the plugin compiles, which is wonderful. However, when I go to run
>> >> > it (in local mode) I get "Exception in thread "main"
>> >> > java.lang.NoClassDefFoundError:
>> >> > org/apache/http/client/utils/URLEncodedUtils". Moving my own later versions
>> >> > of the jars to the classpath in runtime/local/lib/ gives me additional
>> >> > issues - "Exception in thread "main" java.lang.NoSuchMethodError:
>> >> > org.apache.http.client.utils.URLEncodedUtils.parse" which appears to be an
>> >> > error caused by Nutch trying to use an older version of the httpclient
>> >> > library (it disappears if I remove the "older" jars from the classpath).
>> >> >
>> >> > My question is - how would I go about adding these dependencies such that
>> >> > my plugin would use these jars and I'm not having to remove libraries that
>> >> > Nutch may need? I believe I'm missing a critical step here or a directive
>> >> > in one of the plugin XML files... any ideas?
>> >> >
>> >> > Thanks!
>> >> >
>> >> > Alex
>> >> >
>> >>
>> >> --
>> >> *Lewis*
>> >>
>> >
>>
>> --
>> *Lewis*
>>
>
>
