Hi Drulea,

On Sun, Sep 27, 2015 at 7:36 AM, <[email protected]> wrote:

>
> I’m using nutch 2.3 on OS X 10.9.5 with homebrew.
>


From the start I would like to point you at the current release candidate
for Nutch 2.3.1. The VOTE is currently open and the release candidate is
being tested by the community. There are a number of bugs fixed down in
Gora (particularly within the gora-mongodb module) which Nutch 2.3.1 will
benefit from.
It can be obtained from here
http://www.mail-archive.com/dev%40nutch.apache.org/msg19271.html

Another thing: AFAIK we are not publishing Homebrew recipes! Wherever you
got your recipe from, I can guarantee you that it is not an official Nutch
one! I do, however, see two closed pull requests:

lmcgibbn@LMC-032857 /usr/local(joshua) $ brew search nutch
No formula found for "nutch".
==> Searching pull requests...
Closed pull requests:
Added formula for Apache Nutch (
https://github.com/Homebrew/homebrew/pull/26587)
Added Apache Nutch 2.2.1 (https://github.com/Homebrew/homebrew/pull/22004)

Neither of these is from the release managers at Nutch... maybe this is
something we should look into.


>
> I’ve been unable to use the crawl command with MySQL, Mongo, or Cassandra.
> The inject step fails in each configuration with the following arcane
> errors:
>
> 1.) MySQL (after downgrading to gora-core 0.2.1 in ivy.xml as per comments)
>


The MySQL backend for Gora is broken by now. Things have changed and moved
on, with the SQL module being left in the dust. Avro has also moved on
significantly and we now utilize a MUCH newer version of Avro, so your
NoSuchMethodError below is entirely understandable.
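
If you want to confirm which Avro jar is actually being picked up, something
along the lines of the sketch below may help. It is not part of Nutch or
Gora; the class name WhichJar and the helper locate() are made up for
illustration, and it only uses plain JDK reflection. It assumes you compile
it yourself and run it with the same classpath your Nutch job uses.

```java
import java.security.CodeSource;

// Hypothetical diagnostic helper, not part of Nutch or Gora.
public class WhichJar {

    /** Returns where a class was loaded from, or a short note if that is unknown. */
    static String locate(String className) {
        try {
            // Load without running static initializers; we only need the class origin.
            Class<?> c = Class.forName(className, false, WhichJar.class.getClassLoader());
            CodeSource src = c.getProtectionDomain().getCodeSource();
            return src == null
                    ? "(bootstrap/JDK class, no code source)"
                    : src.getLocation().toString();
        } catch (ClassNotFoundException e) {
            return "(not on classpath)";
        }
    }

    public static void main(String[] args) {
        // Defaults to the Avro class named in the NoSuchMethodError.
        String name = args.length > 0 ? args[0] : "org.apache.avro.Schema";
        System.out.println(name + " -> " + locate(name));
    }
}
```

If the location printed is an older Avro jar than the one your Gora version
was built against, that would be consistent with the NoSuchMethodError.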


>       InjectorJob: Injecting urlDir: urls
>

[...snip]



>
>
> 2.) Mongo with default 0.5 gora
>
> InjectorJob: Injecting urlDir: urls
>
> InjectorJob: org.apache.gora.util.GoraException:
> java.lang.NullPointerException
>
>
>
[...snip]

This is gone in the Nutch 2.3.1 release candidate.


> 3.) Mongo (upgrading to gora 0.6.1 to resolve previous issue above)
>
> InjectorJob: Injecting urlDir: urls
>
> InjectorJob: java.lang.UnsupportedOperationException: Not implemented by
> the DistributedFileSystem FileSystem implementation
>
>
>
[...snip]

Can you please try with the 2.3.1 release candidate and provide the same
feedback?


> 4.) Cassandra using default gora 0.5
>
> InjectorJob: Injecting urlDir: urls
>
> Exception in thread "main" java.lang.NoSuchMethodError:
> org.apache.avro.Schema.access$1400()Ljava/lang/ThreadLocal;
>
>
>
[...snip]

I've never seen this before. On another note, Renato and I are currently
overhauling the gora-cassandra driver, moving it from Hector to the DataStax
Java Driver. Work is ongoing here:
https://github.com/renato2099/gora/tree/gora-datastax-cassandra


> Does the “crawl" script inject task work with any backend storage reliably
> on OS X?
>

Well, we can better answer that question if and when you and more people try
out the 2.3.1 release candidate.



>
> Which backend is the most reliable to use with nutch 2.3?
>

HBase 0.94.14


>
> It’s frustrating that 3 common (and supposedly supported) backends don’t
> work with nutch due to arcane errors.
>
>
I agree. But let's not throw the baby out with the bath water here. How
about you try out the above and respond, and we can take it from there?
It would be great to have more developers submitting patches for the 2.X
branch. If you are keen then it would be great to have you on board.
Thanks
Lewis
