Is it possible to get around this problem by using an older version of Solr or
Nutch or both?
From: Michael Coffey <[email protected]>
To: "[email protected]" <[email protected]>; Furkan KAMACI
<[email protected]>; Michael Coffey <[email protected]>
Sent: Tuesday, December 20, 2016 8:41 PM
Subject: Re: nutch 1.12 and Solr 5.4.1
This should work, shouldn't it? But it is not working. I am using Nutch 1.12
with the recommended version of Solr (5.4.1) and Hadoop 2.7.2. I haven't
changed any Java code, but I get a low-level Java error when trying to write to
the index. Is this not a tested configuration? Based on web searching, I know
that others have had similar problems, going back several months, but I haven't
seen any solutions. I did try a couple of variations on the patch posted for
NUTCH-2267 (a slightly different manifestation) and that did not help. I notice
that the 2267 patch has been reverted in the master branch.
I am willing to work on some Java code, if necessary, to help resolve this. At
this point, I don't know what to try next, other than switching to
ElasticSearch.
From: Michael Coffey <[email protected]>
To: "[email protected]" <[email protected]>; Furkan KAMACI
<[email protected]>; Michael Coffey <[email protected]>
Sent: Monday, December 19, 2016 7:13 PM
Subject: Re: nutch 1.12 and Solr 5.4.1
Some additional info: I am using solr.server.type=http, not cloud. I have tried
plugin.includes with protocol-http and also with protocol-httpclient. My
current settings are listed below. Also, I am using Hadoop 2.7.2, in case that
matters.
<property>
<name>plugin.includes</name>
<value>protocol-http|urlfilter-regex|parse-(html|tika)|index-(basic|anchor)|indexer-solr|scoring-opic|urlnormalizer-(pass|regex|basic)</value>
</property>
<property>
<name>solr.server.type</name>
<value>http</value>
<description>
Specifies the SolrServer implementation to use. This is a string value
of one of the following 'cloud', 'concurrent', 'http' or 'lb'.
The values represent CloudSolrServer, ConcurrentUpdateSolrServer,
HttpSolrServer or LBHttpSolrServer respectively.
</description>
</property>
<property>
<name>solr.server.url</name>
<value>http://solr5-00:8983/solr/nutch-0</value>
<description>
Defines the Solr URL into which data should be indexed using the
indexer-solr plugin.
</description>
</property>
From: Michael Coffey <[email protected]>
To: Furkan KAMACI <[email protected]>; "[email protected]"
<[email protected]>
Sent: Monday, December 19, 2016 5:10 PM
Subject: Re: nutch 1.12 and Solr 5.4.1
I'm not sure how to do that. According to a find command, I have more than one
solrj on the nutch machine:
./hadass/apache-nutch-1.12/runtime/local/plugins/indexer-solr/solr-solrj-5.4.1.jar
./hadass/apache-nutch-1.12/build/plugins/indexer-solr/solr-solrj-5.4.1.jar
./.ivy2/cache/org.apache.solr/solr-solrj
./.ivy2/cache/org.apache.solr/solr-solrj/jars/solr-solrj-5.4.1.jar
./.ivy2/cache/org.apache.solr/solr-solrj/jars/solr-solrj-4.6.0.jar
On the solr machine, I have:
./solr-5.4.1/dist/solrj-lib
./solr-5.4.1/server/solr-webapp/webapp/WEB-INF/lib/solr-solrj-5.4.1.jar
./solr-5.4.1/docs/solr-solrj
./solr-5.4.1/docs/solr-solrj/org/apache/solr/client/solrj
./solr-5.4.1/docs/solr-core/org/apache/solr/client/solrj
Should I make the change to SolrUtils.java mentioned in NUTCH-2267
(https://issues.apache.org/jira/browse/NUTCH-2267)?
Lewis and Stephen might know about this.
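For comparing library versions across the two machines, a small scan like the one below may be easier to read than raw find output. It is only a sketch: the class name JarVersionScan is made up for illustration, and it assumes the jars follow the usual name-version.jar naming. The solr-solrj name comes from the listing above; httpclient and httpcore are included on the assumption that the Apache HttpComponents jars are also worth checking, since the error message names org/apache/http classes. It is written to compile on Java 7. Multiple versions in a build cache such as .ivy2 are not necessarily a problem; what matters is what ends up on the runtime classpath.

```java
import java.io.IOException;
import java.nio.file.FileVisitResult;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.Paths;
import java.nio.file.SimpleFileVisitor;
import java.nio.file.attribute.BasicFileAttributes;
import java.util.Map;
import java.util.Set;
import java.util.TreeMap;
import java.util.TreeSet;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Hypothetical helper, not part of Nutch or Solr.
public class JarVersionScan {
    // Matches file names such as solr-solrj-5.4.1.jar (assumed naming scheme).
    private static final Pattern JAR =
        Pattern.compile("(solr-solrj|httpclient|httpcore)-(\\d[\\w.]*)\\.jar");

    /** Maps each matching artifact name to the set of versions found under root. */
    static Map<String, Set<String>> scan(Path root) throws IOException {
        final Map<String, Set<String>> found = new TreeMap<String, Set<String>>();
        Files.walkFileTree(root, new SimpleFileVisitor<Path>() {
            @Override
            public FileVisitResult visitFile(Path file, BasicFileAttributes attrs) {
                Matcher m = JAR.matcher(file.getFileName().toString());
                if (m.matches()) {
                    Set<String> versions = found.get(m.group(1));
                    if (versions == null) {
                        versions = new TreeSet<String>();
                        found.put(m.group(1), versions);
                    }
                    versions.add(m.group(2));
                }
                return FileVisitResult.CONTINUE;
            }
        });
        return found;
    }

    public static void main(String[] args) throws IOException {
        // Scan the directory given as the first argument, or the current one.
        Path root = Paths.get(args.length > 0 ? args[0] : ".");
        for (Map.Entry<String, Set<String>> e : scan(root).entrySet()) {
            String flag = e.getValue().size() > 1 ? "  <-- more than one version" : "";
            System.out.println(e.getKey() + " " + e.getValue() + flag);
        }
    }
}
```

Running it once from the Nutch install directory and once from the Hadoop install directory would show whether the two trees carry different versions of the same library.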
From: Furkan KAMACI <[email protected]>
To: Michael Coffey <[email protected]>; [email protected]
Sent: Monday, December 19, 2016 4:13 PM
Subject: Re: nutch 1.12 and Solr 5.4.1
Hi Michael,
Could you check the version of solrj on your Nutch machine and compare it with
the version of Solr on your server?
Kind Regards,
Furkan KAMACI
On Dec 20, 2016 1:01 AM, "Michael Coffey" <[email protected]> wrote:
What is the recommended fix (or workaround) for the "bad return type" error
related to "Type 'org/apache/http/impl/client/DefaultHttpClient' (current
frame, stack[0]) is not assignable to
'org/apache/http/impl/client/CloseableHttpClient'"?
It seems that switching to different versions of Solr has not helped (6.3.0,
5.5.3, 5.4.1). FWIW, I have the same version of Java on both machines.
OpenJDK Runtime Environment (IcedTea 2.6.8) (7u121-2.6.8-1ubuntu0.14.04.1)
OpenJDK 64-Bit Server VM (build 24.121-b00, mixed mode)
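A "bad return type" verification error of this kind is often a sign that two classes that should come from the same library release were loaded from different jars. Whether that is what happens here is an assumption, not something established in this thread. One way to check is to print where each class named in the error was actually loaded from. The sketch below uses only standard JDK reflection and compiles on Java 7; the class name ClassOrigin is made up for illustration, and the three class names in main are taken from the error text. To be meaningful it has to run with the same classpath as the failing job.

```java
import java.security.CodeSource;

// Hypothetical diagnostic helper, not part of Nutch or Solr.
public class ClassOrigin {
    /** Returns the jar or directory a class was loaded from, or a short note. */
    static String locate(String className) {
        try {
            // initialize=false: load the class without running static initializers.
            Class<?> c = Class.forName(className, false,
                                       ClassOrigin.class.getClassLoader());
            CodeSource src = c.getProtectionDomain().getCodeSource();
            return src == null
                ? "(no code source, e.g. a JDK class)"
                : String.valueOf(src.getLocation());
        } catch (ClassNotFoundException e) {
            return "(not on the classpath)";
        } catch (LinkageError e) {
            return "(found but failed to load: " + e + ")";
        }
    }

    public static void main(String[] args) {
        String[] names = {
            "org.apache.http.impl.client.DefaultHttpClient",
            "org.apache.http.impl.client.CloseableHttpClient",
            "org.apache.solr.client.solrj.impl.HttpClientUtil"
        };
        for (String name : names) {
            System.out.println(name + " <- " + locate(name));
        }
    }
}
```

If the two org.apache.http classes resolve to different jars, or to a jar other than the one shipped alongside solr-solrj, that would point to a classpath conflict rather than a Solr server version problem.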
From: Michael Coffey <[email protected]>
To: "[email protected]" <[email protected]>; Michael Coffey
<[email protected]>
Sent: Saturday, November 19, 2016 8:05 AM
Subject: Re: nutch 1.12 and Solr 6.3.0
I think this is what Lewis and Furkan know as NUTCH-2267. I get the same
problem with Solr 5.5.3.
I really would like to know which versions of Nutch and Solr work together "out
of the box".
From: Michael Coffey <[email protected]>
To: "[email protected]" <[email protected]>
Sent: Friday, November 18, 2016 2:04 PM
Subject: nutch 1.12 and Solr 6.3.0
I decided to plunge ahead with Solr indexing, but so far it doesn't work. The
first error I got is listed below. Could it be because I am running JDK 7 on
the Nutch server and JDK 8 on the Solr server? As far as I know, Nutch 1.x
won't work with JDK 8 and Solr 6.3 won't work with a JDK older than 8. Any
suggestions or advice?
16/11/18 13:59:52 INFO mapreduce.Job: Task Id : attempt_1479499237600_0021_r_000000_0, Status : FAILED
Error: Bad return type
Exception Details:
Location:
org/apache/solr/client/solrj/impl/HttpClientUtil.createClient(Lorg/apache/solr/common/params/SolrParams;Lorg/apache/http/conn/ClientConnectionManager;)Lorg/apache/http/impl/client/CloseableHttpClient; @58: areturn
Reason:
Type 'org/apache/http/impl/client/DefaultHttpClient' (current frame, stack[0]) is not assignable to 'org/apache/http/impl/client/CloseableHttpClient' (from method signature)
Current Frame:
bci: @58
flags: { }
locals: { 'org/apache/solr/common/params/SolrParams', 'org/apache/http/conn/ClientConnectionManager', 'org/apache/solr/common/params/ModifiableSolrParams', 'org/apache/http/impl/client/DefaultHttpClient' }
stack: { 'org/apache/http/impl/client/DefaultHttpClient' }
Bytecode:
0000000: bb00 0359 2ab7 0004 4db2 0005 b900 0601
0000010: 0099 001e b200 05bb 0007 59b7 0008 1209
0000020: b600 0a2c b600 0bb6 000c b900 0d02 002b
0000030: b800 104e 2d2c b800 0f2d b0
Stackmap Table:
append_frame(@47,Object[#143])
Container killed by the ApplicationMaster.