Hi Lewis,

Thanks for your reply. You're right, there's no homebrew recipe for Nutch. I use the official nutch 2.3 OS X release download from the Apache website. I run nutch from /runtime/local/bin. The homebrew packages are other dependent software (mongo, cassandra, hbase, etc.).
All the problems I described are with the nutch 2.3 download, not the homebrew packages.

Where do I download nutch 2.3.1? Should I just pull the latest from http://svn.apache.org/viewvc/nutch/trunk/ ?

Cheers,
Sherban

On 9/27/15, 9:57 AM, "Lewis John Mcgibbney" <[email protected]> wrote:

>Hi Drulea,
>
>On Sun, Sep 27, 2015 at 7:36 AM, <[email protected]> wrote:
>
>> I'm using nutch 2.3 on OS X 10.9.5 with homebrew.
>
>From the start I would like to point you at the current release candidate
>for Nutch 2.3.1. The VOTE is currently open and the release candidate is
>being tested by the community. There are a number of bugs fixed down in
>Gora (particularly within the gora-mongodb module) which Nutch 2.3.1 will
>benefit from.
>It can be obtained from here
>http://www.mail-archive.com/dev%40nutch.apache.org/msg19271.html
>
>Another thing here is that, AFAIK, we are not publishing Homebrew recipes!
>Wherever you got your recipe from I can guarantee you that it is not an
>official Nutch one! I do however see two
>
>lmcgibbn@LMC-032857 /usr/local(joshua) $ brew search nutch
>No formula found for "nutch".
>==> Searching pull requests...
>Closed pull requests:
>Added formula for Apache Nutch (https://github.com/Homebrew/homebrew/pull/26587)
>Added Apache Nutch 2.2.1 (https://github.com/Homebrew/homebrew/pull/22004)
>
>None of these are from the release managers at Nutch... maybe this is
>something we should look into.
>
>> I've been unable to use the crawl command with MySQL, Mongo, or Cassandra.
>> The inject step fails in each configuration with the following arcane
>> errors:
>>
>> 1.) MySQL (after downgrading to gora-core 0.2.1 in ivy.xml as per comments)
>
>The MySQL backend for Gora is broken by now. Things have changed and moved on
>with the SQL module being left in the dust. Avro has also moved on
>significantly and we now utilize a MUCH newer version of Avro so your
>NoSuchMethodError below is entirely understandable.
>
>> InjectorJob: Injecting urlDir: urls
>
>[...snip]
>
>> 2.) Mongo with default 0.5 gora
>>
>> InjectorJob: Injecting urlDir: urls
>>
>> InjectorJob: org.apache.gora.util.GoraException:
>> java.lang.NullPointerException
>
>[...snip]
>
>This is gone in the Nutch 2.3.1 release candidate.
>
>> 3.) Mongo (upgrading to gora 0.6.1 to resolve previous issue above)
>>
>> InjectorJob: Injecting urlDir: urls
>>
>> InjectorJob: java.lang.UnsupportedOperationException: Not implemented by
>> the DistributedFileSystem FileSystem implementation
>
>[...snip]
>
>Can you please try with the 2.3.1 release candidate and provide the same
>feedback?
>
>> 4.) Cassandra using default gora 0.5
>>
>> InjectorJob: Injecting urlDir: urls
>>
>> Exception in thread "main" java.lang.NoSuchMethodError:
>> org.apache.avro.Schema.access$1400()Ljava/lang/ThreadLocal;
>
>[...snip]
>
>I've never seen this before. On another note, Renato and I are currently
>overhauling the gora-cassandra driver from Hector --> Datastax Java Driver.
>Work is ongoing here
>https://github.com/renato2099/gora/tree/gora-datastax-cassandra
>
>> Does the "crawl" script inject task work with any backend storage reliably
>> on OS X?
>
>Well, we can better answer that question if and when you and more people try
>out the 2.3.1 release candidate.
>
>> Which backend is the most reliable to use with nutch 2.3?
>
>HBase 0.94.14
>
>> It's frustrating that 3 common (and supposedly supported) backends don't
>> work with nutch due to arcane errors.
>
>I agree. But let's not throw the baby out with the bath water here. How
>about you try out the above and respond and we can take it from there?
>It would be great to have more developers submitting patches for the 2.X branch.
>If you are keen then it would be great to have you on board.
>Thanks
>Lewis

