As an update: after compiling, I removed the "older" libraries from runtime/local/lib entirely and still got the same error. It seems the plugin's libs never actually make it onto my execution classpath. I must be doing something wrong here...
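
One way to test the "never making it to my execution classpath" theory is to ask the JVM which class loader and which jar a class actually comes from. The following is a minimal sketch, not part of Nutch: the class `WhereIsClass` and its methods `describe` and `locate` are hypothetical names made up for illustration, and it uses only standard JDK calls.

```java
import java.net.URL;
import java.security.CodeSource;

public class WhereIsClass {

    /** Reports the class loader and code source (jar or directory) of an already loaded class. */
    public static String describe(Class<?> cls) {
        ClassLoader loader = cls.getClassLoader(); // null means the bootstrap loader
        CodeSource source = cls.getProtectionDomain().getCodeSource();
        URL location = (source == null) ? null : source.getLocation();
        return cls.getName()
                + " loader=" + (loader == null ? "bootstrap" : loader.getClass().getName())
                + " source=" + (location == null ? "unknown" : location.toString());
    }

    /** Tries to resolve a class by name through the given loader without initializing it. */
    public static String locate(String className, ClassLoader loader) {
        try {
            return describe(Class.forName(className, false, loader));
        } catch (ClassNotFoundException | LinkageError e) {
            return className + " NOT visible: " + e;
        }
    }

    public static void main(String[] args) {
        // Standalone use: checks the class names given as arguments against this class's own loader.
        ClassLoader loader = WhereIsClass.class.getClassLoader();
        for (String name : args) {
            System.out.println(locate(name, loader));
        }
    }
}
```

Called from inside the plugin, for example as `WhereIsClass.locate("org.apache.http.client.utils.URIBuilder", getClass().getClassLoader())` in the filter, it would show whether the plugin's own loader can see the class. Calling `describe` on one of the web3-scanner classes would show which loader loaded the scanner itself, which may matter because the stack trace quoted below ends in `sun.misc.Launcher$AppClassLoader`.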
On Wed, Jul 24, 2013 at 1:45 AM, AC Nutch <[email protected]> wrote:
> No worries, thanks for the responses so far. Unfortunately, I've been
> looking at parse-tika for the last hour or so wracking my brain trying to
> figure out what is different there vs. what I am doing. From what I see
> pretty much everything is the same... sorry to flood you with info, but
> here is some more details. I'm going to continue troubleshooting and see if
> I get anywhere, if you have any suggestions let me know!
>
> Here is my plugin.xml:
>
> <plugin
>    id="index-vulns"
>    name="Nutch Website Vulnerability Indexer"
>    version="1.0.0"
>    provider-name="nutch.org">
>
>
>    <runtime>
>       <library name="index-vulns.jar">
>          <export name="*"/>
>       </library>
>       <library name="commons-codec-1.6.jar"/>
>       <library name="commons-io-2.4.jar"/>
>       <library name="commons-logging-1.1.1.jar"/>
>       <library name="fluent-hc-4.2.5.jar"/>
>
>       <library name="httpclient-4.2.5.jar"/>
>       <library name="httpclient-cache-4.2.5.jar"/>
>       <library name="httpcore-4.2.4.jar"/>
>       <library name="httpmime-4.2.5.jar"/>
>
>       <library name="web3-scanner-1.0.jar"/>
>    </runtime>
>
>    <requires>
>       <import plugin="nutch-extensionpoints"/>
>    </requires>
>
>    <extension id="org.apache.nutch.indexer.vulns"
>               name="Nutch Website Vulnerabiliy Indexer"
>               point="org.apache.nutch.indexer.IndexingFilter">
>       <implementation id="VulnIndexingFilter"
>                       class="org.apache.nutch.indexer.vulns.VulnIndexingFilter"/>
>    </extension>
> </plugin>
>
> Everything seems to go just fine in the build. Then when I try to run
> indexchecker I get:
>
> punk@punk-kali:~/HGDev/web3-spider-compiled/nutch/runtime/local$ bin/nutch indexchecker "http://sqli1.hyperiongray.com/?user=root"
> fetching: http://sqli1.hyperiongray.com/?user=root
> http://sqli1.hyperiongray.com/?user=root skipped. Content of size 88 was truncated to 69
> Content is truncated, parse may fail!
> parsing: http://sqli1.hyperiongray.com/?user=root
> contentType: text/html
> Exception in thread "main" java.lang.NoClassDefFoundError: org/apache/http/client/utils/URIBuilder
>     at com.hyperiongray.web3scanner.UriGenerator.getUrlsToFuzz(UriGenerator.java:20)
>     at com.hyperiongray.web3scanner.SqliScanner.generateUrls(SqliScanner.java:35)
>     at com.hyperiongray.web3scanner.SqliScanner.scan(SqliScanner.java:74)
>     at org.apache.nutch.indexer.vulns.VulnIndexingFilter.filter(VulnIndexingFilter.java:63)
>     at org.apache.nutch.indexer.IndexingFilters.filter(IndexingFilters.java:109)
>     at org.apache.nutch.indexer.IndexingFiltersChecker.run(IndexingFiltersChecker.java:126)
>     at org.apache.hadoop.util.ToolRunner.run(ToolRunner.java:65)
>     at org.apache.nutch.indexer.IndexingFiltersChecker.main(IndexingFiltersChecker.java:150)
> Caused by: java.lang.ClassNotFoundException: org.apache.http.client.utils.URIBuilder
>     at java.net.URLClassLoader$1.run(URLClassLoader.java:366)
>     at java.net.URLClassLoader$1.run(URLClassLoader.java:355)
>     at java.security.AccessController.doPrivileged(Native Method)
>     at java.net.URLClassLoader.findClass(URLClassLoader.java:354)
>     at java.lang.ClassLoader.loadClass(ClassLoader.java:424)
>     at sun.misc.Launcher$AppClassLoader.loadClass(Launcher.java:308)
>     at java.lang.ClassLoader.loadClass(ClassLoader.java:357)
>     ... 8 more
>
>
> On Wed, Jul 24, 2013 at 12:43 AM, Lewis John Mcgibbney <[email protected]> wrote:
>
>> parse-tika is an excellent example here.
>> I am sorry I'm not in front of the code right now.
>> hth
>>
>> On Tuesday, July 23, 2013, AC Nutch <[email protected]> wrote:
>> > Excellent, thank you! It did clarify a bit - in particular now I know that
>> > I'm not crazy and that this is indeed a somewhat common problem with a
>> > solution. However, I'm still a little lost on the solution part of it :-).
>> > If I understand correctly...
>> >
>> > In the plugin manifest file I need to define the dependencies and they will
>> > be added to the class-loader such that they will get used. However, I'm a
>> > little shady on how to do that, in my plugin.xml file I have the following:
>> >
>> > <runtime>
>> >    <library name="index-vulns.jar">
>> >       <export name="*"/>
>> >    </library>
>> >    <library name="commons-codec-1.6.jar"/>
>> >    <library name="commons-io-2.4.jar"/>
>> >    <library name="commons-logging-1.1.1.jar"/>
>> >    <library name="fluent-hc-4.2.5.jar.jar"/>
>> >    <library name="httpclient-4.2.5.jar"/>
>> >    <library name="httpclient-cache-4.2.5.jar"/>
>> >    <library name="httpcore-4.2.4.jar"/>
>> >    <library name="httpmime-4.2.5"/>
>> >    <library name="web3-scanner-1.0.jar"/>
>> > </runtime>
>> >
>> > If I understand properly this should be sufficient to have the jars added
>> > to the plugin class-loader. However, that doesn't appear to be the case - I
>> > must be missing something, but I'm not sure what that is...?
>> >
>> > Alex
>> >
>> >
>> > On Tue, Jul 23, 2013 at 11:30 PM, Lewis John Mcgibbney <[email protected]> wrote:
>> >
>> >> Hi Alex,
>> >> About now is a good time to read how Nutch deals with classloading.
>> >> Navigate to plugin central on the wiki and you will see the documentation.
>> >> hth you out
>> >> Lewis
>> >>
>> >> On Tuesday, July 23, 2013, AC Nutch <[email protected]> wrote:
>> >> > Hi All,
>> >> >
>> >> > I'm attempting to build a Nutch plugin on Nutch 1.7 with some external
>> >> > dependencies. The way I've handled this in the past is to just put the
>> >> > dependencies in lib/ and be done with it. However, now I have some
>> >> > dependencies that are newer versions of dependencies already present in
>> >> > Nutch. For example, I'm using the Apache httpclient-4.2.5.jar library
>> >> > whereas Nutch appears to use httpclient-4.1.2 and httpclient-4.1.3.
>> >> > Unfortunately, I do need the plugin to compile and run with my version (I
>> >> > can't just downgrade, not a valid solution).
>> >> >
>> >> > I've added the necessary jars to my plugin's ivy.xml and they download just
>> >> > fine and the plugin compiles, which is wonderful. However, when I go to run
>> >> > it (in local mode) I get "Exception in thread "main"
>> >> > java.lang.NoClassDefFoundError:
>> >> > org/apache/http/client/utils/URLEncodedUtils". Moving my own later versions
>> >> > of the file to the ClassPath in runtime/local/lib/ gives me additional
>> >> > issues - "Exception in thread "main" java.lang.NoSuchMethodError:
>> >> > org.apache.http.client.utils.URLEncodedUtils.parse" which appears to be an
>> >> > error caused by Nutch trying to use an older version of the httpclient
>> >> > library (it disappears if I remove the "older" jars from the classpath).
>> >> >
>> >> > My question is - how would I go about adding these dependencies such that
>> >> > my plugin would use these jars and I'm not having to remove libraries that
>> >> > Nutch may need? I believe I'm missing a critical step here or a missing
>> >> > directive in one of the plugin XML directive files... any ideas?
>> >> >
>> >> > Thanks!
>> >> >
>> >> > Alex
>> >> >
>> >>
>> >> --
>> >> *Lewis*
>> >>
>> >
>>
>> --
>> *Lewis*
>>
>
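
The two <runtime> blocks quoted in this thread differ in two entries: the earlier one lists "fluent-hc-4.2.5.jar.jar" and "httpmime-4.2.5", the later one "fluent-hc-4.2.5.jar" and "httpmime-4.2.5.jar". A mismatch like that between a <library name="..."/> entry and the file on disk can be checked mechanically. The following is a minimal sketch under stated assumptions: `PluginLibCheck` and `missingLibraries` are hypothetical names, and it assumes the jars sit next to plugin.xml in the deployed plugin directory (for example somewhere under runtime/local/plugins/, which is an assumption about the layout, not something stated in the thread).

```java
import java.io.File;
import java.util.ArrayList;
import java.util.List;
import javax.xml.parsers.DocumentBuilderFactory;
import org.w3c.dom.Element;
import org.w3c.dom.NodeList;

public class PluginLibCheck {

    /**
     * Returns the name attribute of every library element in pluginDir/plugin.xml
     * for which no regular file of that name exists in pluginDir.
     */
    public static List<String> missingLibraries(File pluginDir) throws Exception {
        File manifest = new File(pluginDir, "plugin.xml");
        NodeList libs = DocumentBuilderFactory.newInstance()
                .newDocumentBuilder()
                .parse(manifest)
                .getElementsByTagName("library");
        List<String> missing = new ArrayList<>();
        for (int i = 0; i < libs.getLength(); i++) {
            String name = ((Element) libs.item(i)).getAttribute("name");
            // Assumption: library names are resolved relative to the plugin directory.
            if (!new File(pluginDir, name).isFile()) {
                missing.add(name);
            }
        }
        return missing;
    }

    public static void main(String[] args) throws Exception {
        // Usage: java PluginLibCheck <path-to-deployed-plugin-directory>
        for (String name : missingLibraries(new File(args[0]))) {
            System.out.println("missing: " + name);
        }
    }
}
```

An empty result only says that every declared jar exists where this sketch looks for it; it does not prove that the plugin class loader is the one resolving the classes at runtime.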

