Hi Michael, It's great that your problem is resolved ;) Don't hesitate to ask if you have any other questions.
Kind Regards,
Furkan KAMACI

On Thu, Dec 22, 2016 at 9:44 PM, Michael Coffey <[email protected]> wrote:

> Thank you very much for replying. I know it's holiday season and you
> probably have a million things to do!
>
> OMG, it is working now that I am using the version of SolrUtils you
> pointed to. I had previously focused on a version where it uses
> SystemDefaultHttpClient but not as a static. It seems that making it static
> made a critical difference. So this is awesome.
>
> For the record, I would say I am using solrj 5.4.1, based on the presence
> of the following files in my Nutch directories.
> ./apache-nutch-1.12/runtime/local/plugins/indexer-solr/solr-solrj-5.4.1.jar
> ./apache-nutch-1.12/build/plugins/indexer-solr/solr-solrj-5.4.1.jar
>
> For httpclient, within the nutch 1.12 directories, I have a lot of jars in
> my nutch folder.
> ./apache-nutch-1.12/runtime/local/lib/httpclient-4.3.5.jar
> ./apache-nutch-1.12/runtime/local/lib/commons-httpclient-3.1.jar
> ./apache-nutch-1.12/runtime/local/plugins/protocol-httpclient/protocol-httpclient.jar
> ./apache-nutch-1.12/runtime/local/plugins/indexer-solr/httpclient-4.4.1.jar
> ./apache-nutch-1.12/runtime/local/plugins/lib-htmlunit/httpclient-4.3.4.jar
> ./apache-nutch-1.12/runtime/local/plugins/lib-selenium/httpclient-4.5.1.jar
> ./apache-nutch-1.12/runtime/local/plugins/indexer-cloudsearch/httpclient-4.3.6.jar
> ./apache-nutch-1.12/build/protocol-httpclient/protocol-httpclient.jar
> ./apache-nutch-1.12/build/lib/httpclient-4.3.5.jar
> ./apache-nutch-1.12/build/lib/commons-httpclient-3.1.jar
> ./apache-nutch-1.12/build/plugins/protocol-httpclient/protocol-httpclient.jar
> ./apache-nutch-1.12/build/plugins/indexer-solr/httpclient-4.4.1.jar
> ./apache-nutch-1.12/build/plugins/lib-htmlunit/httpclient-4.3.4.jar
> ./apache-nutch-1.12/build/plugins/lib-selenium/httpclient-4.5.1.jar
> ./apache-nutch-1.12/build/plugins/indexer-cloudsearch/httpclient-4.3.6.jar
>
> The hadoop directory has the following httpclient-related jars
> /posix/hadoop-2.7.2/share/hadoop/kms/tomcat/webapps/kms/WEB-INF/lib/httpclient-4.2.5.jar
> /posix/hadoop-2.7.2/share/hadoop/httpfs/tomcat/webapps/webhdfs/WEB-INF/lib/httpclient-4.2.5.jar
> /posix/hadoop-2.7.2/share/hadoop/tools/lib/httpclient-4.2.5.jar
> /posix/hadoop-2.7.2/share/hadoop/tools/lib/commons-httpclient-3.1.jar
> /posix/hadoop-2.7.2/share/hadoop/common/lib/httpclient-4.2.5.jar
> /posix/hadoop-2.7.2/share/hadoop/common/lib/commons-httpclient-3.1.jar
>
> Over on the Solr5 machine, we have
> ./solr-5.4.1/dist/solrj-lib/httpclient-4.4.1.jar
> ./solr-5.4.1/server/solr-webapp/webapp/WEB-INF/lib/httpclient-4.4.1.jar
>
> thanks again
>
> From: Furkan KAMACI <[email protected]>
> To: Michael Coffey <[email protected]>
> Cc: "[email protected]" <[email protected]>
> Sent: Thursday, December 22, 2016 10:29 AM
> Subject: Re: nutch 1.12 and Solr 5.4.1
>
> Hi Michael,
>
> The dependencies you sent are from the ivy cache. I need to know the versions
> of Solr and HTTP Client. Your problem is probably a jar mismatch between
> hadoop and Solr. Nutch 1.12 should work with Solr 5.4.1, as you can check
> from here:
> https://github.com/apache/nutch/blob/release-1.12/src/plugin/indexer-solr/ivy.xml
>
> So there may be a bug in Nutch. Here is a workaround from the issue you mentioned:
> https://issues.apache.org/jira/browse/NUTCH-2267
> Could you apply it to SolrUtils.java
> (https://github.com/sjwoodard/nutch/blob/master/src/plugin/indexer-solr/src/java/org/apache/nutch/indexwriter/solr/SolrUtils.java)
> and check again? If you still get that error, I can try to fix it.
>
> Kind Regards,
> Furkan KAMACI
>
> On Thu, Dec 22, 2016 at 6:26 PM, Michael Coffey <[email protected]> wrote:
>
> > Is it possible to get around this problem by using an older version of
> > Solr or Nutch or both?
> >
> > ------------------------------
> > *From:* Michael Coffey <[email protected]>
> > *To:* "[email protected]" <[email protected]>; Furkan KAMACI <[email protected]>; Michael Coffey <[email protected]>
> > *Sent:* Tuesday, December 20, 2016 8:41 PM
> > *Subject:* Re: nutch 1.12 and Solr 5.4.1
> >
> > This should work, shouldn't it? But it is not working. I am using Nutch
> > 1.12 with the recommended version of Solr (5.4.1) and Hadoop 2.7.2. I
> > haven't changed any Java code, but I get a low-level Java error when trying
> > to write to the index. Is this not a tested configuration? Based on web
> > searching, I know that others have had similar problems, going back several
> > months, but I haven't seen any solutions. I did try a couple of variations
> > on the patch posted for NUTCH-2267 (a slightly different manifestation) and
> > that did not help. I notice that the 2267 patch has been reverted in the
> > master branch.
> >
> > I am willing to work on some Java code, if necessary, to help resolve
> > this. At this point, I don't know what to try next, other than switching to
> > ElasticSearch.
> >
> > From: Michael Coffey <[email protected]>
> > To: "[email protected]" <[email protected]>; Furkan KAMACI <[email protected]>; Michael Coffey <[email protected]>
> > Sent: Monday, December 19, 2016 7:13 PM
> > Subject: Re: nutch 1.12 and Solr 5.4.1
> >
> > Some additional info: I am using solr.server.type=http, not cloud. I have
> > tried plugin.includes with protocol-http and also with protocol-httpclient.
> > My current settings are listed below. Also, I am using hadoop 2.7.2, in
> > case that matters.
> > <property>
> >   <name>plugin.includes</name>
> >   <value>protocol-http|urlfilter-regex|parse-(html|tika)|index-(basic|anchor)|indexer-solr|scoring-opic|urlnormalizer-(pass|regex|basic)</value>
> > </property>
> >
> > <property>
> >   <name>solr.server.type</name>
> >   <value>http</value>
> >   <description>
> >     Specifies the SolrServer implementation to use. This is a string value
> >     of one of the following 'cloud', 'concurrent', 'http' or 'lb'.
> >     The values represent CloudSolrServer, ConcurrentUpdateSolrServer,
> >     HttpSolrServer or LBHttpSolrServer respectively.
> >   </description>
> > </property>
> >
> > <property>
> >   <name>solr.server.url</name>
> >   <value>http://solr5-00:8983/solr/nutch-0</value>
> >   <description>
> >     Defines the Solr URL into which data should be indexed using the
> >     indexer-solr plugin.
> >   </description>
> > </property>
> >
> > From: Michael Coffey <[email protected]>
> > To: Furkan KAMACI <[email protected]>; "[email protected]" <[email protected]>
> > Sent: Monday, December 19, 2016 5:10 PM
> > Subject: Re: nutch 1.12 and Solr 5.4.1
> >
> > I'm not sure how to do that. According to a find command, I have more than
> > one solrj on the nutch machine.
> > ./hadass/apache-nutch-1.12/runtime/local/plugins/indexer-solr/solr-solrj-5.4.1.jar
> > ./hadass/apache-nutch-1.12/build/plugins/indexer-solr/solr-solrj-5.4.1.jar
> > ./.ivy2/cache/org.apache.solr/solr-solrj
> > ./.ivy2/cache/org.apache.solr/solr-solrj/jars/solr-solrj-5.4.1.jar
> > ./.ivy2/cache/org.apache.solr/solr-solrj/jars/solr-solrj-4.6.0.jar
> >
> > On the solr machine, I have
> > ./solr-5.4.1/dist/solrj-lib
> > ./solr-5.4.1/server/solr-webapp/webapp/WEB-INF/lib/solr-solrj-5.4.1.jar
> > ./solr-5.4.1/docs/solr-solrj
> > ./solr-5.4.1/docs/solr-solrj/org/apache/solr/client/solrj
> > ./solr-5.4.1/docs/solr-core/org/apache/solr/client/solrj
> >
> > Should I make the change to SolrUtils.java, mentioned in
> > https://issues.apache.org/jira/browse/NUTCH-2267
> > Lewis and Stephen might know about this.
> >
> > From: Furkan KAMACI <[email protected]>
> > To: Michael Coffey <[email protected]>; [email protected]
> > Sent: Monday, December 19, 2016 4:13 PM
> > Subject: Re: nutch 1.12 and Solr 5.4.1
> >
> > Hi Michael,
> > Could you check the version of solrj at your Nutch and compare it with
> > version of Solr at your server?
> > Kind Regards,
> > Furkan KAMACI
> >
> > On Dec 20, 2016 1:01 AM, "Michael Coffey" <[email protected]> wrote:
> >
> > What is the recommended fix (or workaround) for the "bad return type"
> > error related to "Type 'org/apache/http/impl/client/DefaultHttpClient'
> > (current frame, stack[0]) is not assignable to
> > 'org/apache/http/impl/client/CloseableHttpClient'"
> > It seems that switching to different versions of Solr has not helped
> > (6.3.0, 5.5.3, 5.4.1). FWIW, I have same version of Java on both machines.
> >
> > OpenJDK Runtime Environment (IcedTea 2.6.8) (7u121-2.6.8-1ubuntu0.14.04.1)
> > OpenJDK 64-Bit Server VM (build 24.121-b00, mixed mode)
> >
> > From: Michael Coffey <[email protected]>
> > To: "[email protected]" <[email protected]>; Michael Coffey <[email protected]>
> > Sent: Saturday, November 19, 2016 8:05 AM
> > Subject: Re: nutch 1.12 and Solr 6.3.0
> >
> > I think this is what Lewis and Furkan know as NUTCH-2267. I get the same
> > problem with Solr 5.5.3.
> >
> > I really would like to know which versions of nutch/Solr work together
> > "out of the box".
> >
> > From: Michael Coffey <[email protected]>
> > To: "[email protected]" <[email protected]>
> > Sent: Friday, November 18, 2016 2:04 PM
> > Subject: nutch 1.12 and Solr 6.3.0
> >
> > I decided to plunge ahead with Solr indexing, but so far it doesn't work.
> > The first error I got is listed below. Could it be that I am running JDK 7
> > on the nutch server and JDK 8 on the Solr server? As far as I know Nutch
> > 1.x won't work with JDK 8 and Solr 6.3 won't work with JDK less than 8. Any
> > suggestions or advice?
> >
> > 16/11/18 13:59:52 INFO mapreduce.Job: Task Id : attempt_1479499237600_0021_r_000000_0, Status : FAILED
> > Error: Bad return type
> > Exception Details:
> >   Location:
> >     org/apache/solr/client/solrj/impl/HttpClientUtil.createClient(Lorg/apache/solr/common/params/SolrParams;Lorg/apache/http/conn/ClientConnectionManager;)Lorg/apache/http/impl/client/CloseableHttpClient; @58: areturn
> >   Reason:
> >     Type 'org/apache/http/impl/client/DefaultHttpClient' (current frame, stack[0]) is not assignable to 'org/apache/http/impl/client/CloseableHttpClient' (from method signature)
> >   Current Frame:
> >     bci: @58
> >     flags: { }
> >     locals: { 'org/apache/solr/common/params/SolrParams', 'org/apache/http/conn/ClientConnectionManager', 'org/apache/solr/common/params/ModifiableSolrParams', 'org/apache/http/impl/client/DefaultHttpClient' }
> >     stack: { 'org/apache/http/impl/client/DefaultHttpClient' }
> >   Bytecode:
> >     0000000: bb00 0359 2ab7 0004 4db2 0005 b900 0601
> >     0000010: 0099 001e b200 05bb 0007 59b7 0008 1209
> >     0000020: b600 0a2c b600 0bb6 000c b900 0d02 002b
> >     0000030: b800 104e 2d2c b800 0f2d b0
> >   Stackmap Table:
> >     append_frame(@47,Object[#143])
> >
> > Container killed by the ApplicationMaster.
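For anyone hitting the same "Bad return type" error, one way to test the jar-mismatch theory from this thread is to ask the JVM which jar each class named in the error was actually loaded from. The sketch below is not part of Nutch or SolrJ. The class name ClassOrigin and the helper describe are made up for illustration, and it assumes you run it with the same classpath as the failing job. As far as I know, CloseableHttpClient only exists from HttpClient 4.3 onward, so a DefaultHttpClient that resolves to an older jar (such as the httpclient-4.2.5.jar under the hadoop directories listed above) would be a likely suspect.

```java
import java.security.CodeSource;

// Hypothetical diagnostic helper, not part of Nutch, SolrJ or HttpClient.
class ClassOrigin {

    /** Returns "<class name> -> <jar or directory the class was loaded from>". */
    static String describe(String className) {
        try {
            // initialize=false: resolve the class without running its static initializers
            Class<?> c = Class.forName(className, false, ClassOrigin.class.getClassLoader());
            CodeSource src = c.getProtectionDomain().getCodeSource();
            // JDK classes loaded by the bootstrap loader have no code source
            String where = (src == null || src.getLocation() == null)
                    ? "(JDK/bootstrap class, no jar location)"
                    : src.getLocation().toString();
            return className + " -> " + where;
        } catch (ClassNotFoundException | LinkageError e) {
            return className + " -> not found on this classpath ("
                    + e.getClass().getSimpleName() + ")";
        }
    }

    public static void main(String[] args) {
        // Class names taken from the error message quoted in this thread
        String[] names = {
            "org.apache.http.impl.client.CloseableHttpClient",
            "org.apache.http.impl.client.DefaultHttpClient",
            "org.apache.solr.client.solrj.impl.HttpClientUtil"
        };
        for (String name : names) {
            System.out.println(describe(name));
        }
    }
}
```

If the two HttpClient classes report different jars, or CloseableHttpClient is reported as not found, that points at a classpath conflict and not at the Solr version. How the job classpath is assembled on a given cluster is outside what this sketch can tell you.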

