Re: Indexing to Solr4.2 with nutch 1.6

2013-04-10 Thread Amit Sela
I saw the patch for Nutch 2.x where you replaced CommonsHttpSolrServer with ConcurrentUpdateSolrServer, but in 1.6 SolrUtils.getCommonsHttpSolrServer is used for getting the SolrServer. Should we add a getConcurrentUpdateSolrServer to SolrUtils? As I understand it, the exception I got was caused by

Re: Indexing to Solr4.2 with nutch 1.6

2013-04-10 Thread Amit Sela
Yep. That seemed to be the problem. If the id field is to be set by schema.xml then it shouldn't be a constant. Or we decide that Nutch always uses id as the unique key. On Apr 10, 2013 6:01 PM, Amit Sela am...@infolinks.com wrote: I saw the patch for nutch 2.x where you replaced CommonsHttpSolrServer

Re: Indexing to Solr4.2 with nutch 1.6

2013-04-10 Thread Lewis John Mcgibbney
From memory, we always use the id as the unique key, with no exceptions. As for use of the ConcurrentUpdateSolrServer, this is not correct (my bad); we should just use HttpSolrServer with the defaults. I will update the patch and I will also cook up one for trunk. Thanks for your feedback, Amit. It

Re: Indexing to Solr4.2 with nutch 1.6

2013-04-09 Thread Amit Sela
Well, according to our other correspondence, the only thing I did differently in my schema.xml (schema-solr4.xml) before rebuilding Nutch was to use <uniqueKey>url</uniqueKey> instead of <uniqueKey>id</uniqueKey>. It all goes well until the dedup phase, where the MapReduce job throws:

Re: Indexing to Solr4.2 with nutch 1.6

2013-04-09 Thread Lewis John Mcgibbney
Before we do the upgrade we need to consolidate all of these use cases. What criteria do we want to review and accept as the unique key? Will this change between Nutch trunk and 2.x? On Tuesday, April 9, 2013, Amit Sela am...@infolinks.com wrote: Well, according to our other correspondence, the

Indexing to Solr4.2 with nutch 1.6

2013-04-08 Thread Amit Sela
Is it possible? I saw a Jira issue open about connecting to SolrCloud via ZooKeeper, but with a direct connection to one of the servers, is it possible to index with Nutch 1.6 into Solr 4.2 set up as a cloud with a ZooKeeper ensemble? I ask because I keep getting IndexOutOfBounds exceptions in the dedup M/R phase.

Re: Indexing to Solr4.2 with nutch 1.6

2013-04-08 Thread Lewis John Mcgibbney
It would probably be best to describe what you've tried here, possibly with a paste of your schema, what you've done (if anything) to the Nutch source to get it working with Solr 4, etc. The stack trace you get would also be beneficial. Thank you Lewis On Mon, Apr 8, 2013 at 4:13 AM, Amit Sela