I'd think that this would be more a case for the universal exporter (a.k.a.
multiple indexing backends) that we mentioned several times. The REST API
is more a way of piloting a crawl remotely. It could certainly be twisted
into doing all sorts of things but I am not sure it would be very
Thanks Julien
Looking at Andrzej's comments on the issue I saw that as you mention
he's run a full crawl using POST over REST and retrieving results as
JSON and this sounded appealing to get moving with.
From my own pov it appears that Nutch 2.X is 'closer' to the model
required for a multiple
Hi Lewis
I realise I was thinking about NUTCH-880, not NUTCH-932 which is indeed
about retrieving crawl results as JSON
From my own pov it appears that Nutch 2.X is 'closer' to the model
required for a multiple backends implementation although there is
still quite a bit of work to do here.
Hi Julian,
Just to share our experiences with using Nutch 2.0:
Indexing in Nutch actually has nothing to do with indexing itself. It just
selects some fields from a WebPage, does some very minimal processing (both
typically in the indexing filter plugins) and sends the result to a writer.
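
A rough sketch of the flow described above (select fields from a page, apply minimal processing in filters, hand the result to a writer) might look like the following. This is illustrative only: it is written in Python for brevity, and every name in it (`WebPage`, `basic_fields`, `content_field`, `ListWriter`, `index_page`) is hypothetical and does not reflect the actual Nutch 2.0 classes or plugin interfaces.

```python
from dataclasses import dataclass
from typing import Callable, Dict, List

# NOTE: all names below are hypothetical stand-ins, not the Nutch API.


@dataclass
class WebPage:
    """Simplified stand-in for a stored crawl record."""
    url: str
    title: str
    text: str


Document = Dict[str, str]
IndexingFilter = Callable[[WebPage, Document], Document]


def basic_fields(page: WebPage, doc: Document) -> Document:
    """Select the URL and title, trimming whitespace around the title."""
    doc["url"] = page.url
    doc["title"] = page.title.strip()
    return doc


def content_field(page: WebPage, doc: Document) -> Document:
    """Select the page text, collapsing runs of whitespace."""
    doc["content"] = " ".join(page.text.split())
    return doc


class ListWriter:
    """Collects documents in memory; a real writer would send them to a backend."""

    def __init__(self) -> None:
        self.docs: List[Document] = []

    def write(self, doc: Document) -> None:
        self.docs.append(doc)


def index_page(page: WebPage, filters: List[IndexingFilter], writer: ListWriter) -> Document:
    """Run each filter in order over the page, then pass the result to the writer."""
    doc: Document = {}
    for apply_filter in filters:
        doc = apply_filter(page, doc)
    writer.write(doc)
    return doc
```

Under these assumptions, swapping `ListWriter` for another object with a `write` method is all it would take to target a different backend, which is the point being made about the "indexing" step.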
Hi
Thanks for your comments. This confirms, if need be, that
https://issues.apache.org/jira/browse/NUTCH-1047 would be a useful thing to
have.
J
On 11 July 2012 13:22, Mathijs Homminga mathijs.hommi...@kalooga.com wrote:
Hi Julian,
Just to share our experiences with using Nutch 2.0:
Hi,
I was trying to set up Nutch 2.0 and I am facing problems setting up HBase
properly. When I run the start-hbase.sh script, it seems like it has
worked perfectly, but in the hbase shell I can't create tables etc. and
instead get a myriad of errors. I suspect that it's because I couldn't start
What about asking on the HBase list instead?
On 11 July 2012 23:35, Prasanna. Suman prasa...@growerideas.com wrote:
Hi,
I was trying to set up Nutch 2.0 and I am facing problems setting up HBase
properly. When I run the start-hbase.sh script, it seems like it has
worked perfectly, but in