I downloaded and ran Solr with the Jetty example server and the indexing works there, so I guess there's something messed up with my solrconfig on the other server. Can't find anything in any of the logs though.
Thanks for your help!

Jens

From: [email protected] [mailto:[email protected]]
Sent: 21 July 2010 16:10
To: [email protected]
Subject: RE: java.lang.NullPointerException while trying to crawl RSS feed to Solr

The response you are seeing from the Solr webapp is quite unhelpful. Tomcat usually has several logs you can dig through - stdout captures, stderr captures, etc - but they differ somewhat based on what platform you are using. If you can't find any exceptions there, then remember that Solr has a configurable logging setup that may also be useful, but you'll have to refer to the Solr documentation for how to set that up & where to look. I usually just run Solr using the Jetty example server, so I'm not going to be much help to you, I'm afraid.

Karl

From: ext Jens Bengtsson [mailto:[email protected]]
Sent: Wednesday, July 21, 2010 10:04 AM
To: [email protected]
Subject: RE: java.lang.NullPointerException while trying to crawl RSS feed to Solr

OK. The Solr connection is still saying "Connection working". Looking at the Tomcat log I can't find any Solr exceptions. Pardon my ignorance, but is there any other log than the catalina log I should be looking at?

Here's some output from the simple history:

Internal Server Error
07-21-2010 15:56:50.283 document ingest (solr) http://giantbomb.com/transformers-war-for-cybertron/61-29405/...reviews/ 500 133674 327 Internal Server Error
07-21-2010 15:56:50.032 document ingest (solr) http://giantbomb.com/podcast/?podcast_id=160 500 82109 380 Internal Server Error
07-21-2010 15:56:49.787 document ingest (solr) http://giantbomb.com/modnation-racers/61-26848/reviews/ 500 131147 303 Internal Server Error
07-21-2010 15:56:49.583 document ingest (solr) http://giantbomb.com/sin-punishment-star-successor/61-23966/r...eviews/ 500 99252 297 Internal Server Error
07-21-2010 15:56:49.458 document ingest (solr) http://giantbomb.com/podcast/?podcast_id=163 500 82524 280 Internal Server Error
07-21-2010 15:56:49.216 document ingest (solr) http://giantbomb.com/podcast/?podcast_id=156 500 82493 328 Internal Server Error
07-21-2010 15:56:47.772 document ingest (solr) http://giantbomb.com/ufc-undisputed-2010/61-29376/reviews/ 500 132435 342 Internal Server Error

Jens

From: [email protected] [mailto:[email protected]]
Sent: 21 July 2010 15:55
To: [email protected]
Subject: RE: java.lang.NullPointerException while trying to crawl RSS feed to Solr

What that means is that the response coming from Solr is not the expected XML either. It sounds like it is just plain old HTML, which is strange if you are actually talking to Solr. When you view your Solr connection in the LCF UI, does it still say "Connection working"?

The error code of 500 you reported is also almost certainly coming from Solr, so you should be able to get a stack trace from it that would explain the problem. There may well be additional Solr arguments you need to add to the connection to make everything work. The stack trace will tell us what the problem actually is.

An alternative might be to look at the Simple History report for one of the failed Solr indexing attempts - that may well list the actual response back from Solr as the error text.

Karl

From: ext Jens Bengtsson [mailto:[email protected]]
Sent: Wednesday, July 21, 2010 9:49 AM
To: [email protected]
Subject: RE: java.lang.NullPointerException while trying to crawl RSS feed to Solr

Missed this in the console:

[Fatal Error] :115:120: The element type "HR" must be terminated by the matching end-tag "</HR>".
org.apache.lcf.core.interfaces.LCFException: XML parsing error: The element type "HR" must be terminated by the matching end-tag "</HR>".
    at org.apache.lcf.core.common.XMLDoc.init(XMLDoc.java:369)
    at org.apache.lcf.core.common.XMLDoc.<init>(XMLDoc.java:317)
    at org.apache.lcf.agents.output.solr.HttpPoster.getResponse(HttpPoster.java:537)

So there's a parsing error for the XML.

From: Jens Bengtsson [mailto:[email protected]]
Sent: 21 July 2010 15:36
To: [email protected]
Subject: RE: java.lang.NullPointerException while trying to crawl RSS feed to Solr

So I updated and this is the error I get in the log now:

Service interruption reported for job 1279719088042 connection 'Giantbomb RSS': Error 500 from ingestion request; ingestion will be retried again later

From: Jens Bengtsson [mailto:[email protected]]
Sent: 21 July 2010 14:41
To: [email protected]
Subject: RE: java.lang.NullPointerException while trying to crawl RSS feed to Solr

No worries! I'm very thankful for your help.

Jens

From: [email protected] [mailto:[email protected]]
Sent: 21 July 2010 14:34
To: [email protected]
Subject: RE: java.lang.NullPointerException while trying to crawl RSS feed to Solr

Yesterday should be fine. I overlooked something and have checked in a fix. My apologies.

Karl

From: ext Jens Bengtsson [mailto:[email protected]]
Sent: Wednesday, July 21, 2010 8:26 AM
To: [email protected]
Subject: RE: java.lang.NullPointerException while trying to crawl RSS feed to Solr

That's strange, because I did the checkout from https://svn.apache.org/repos/asf/incubator/lcf/trunk yesterday, and I did an update today and rebuilt everything, so things should be in sync with trunk.

Jens

From: [email protected] [mailto:[email protected]]
Sent: 21 July 2010 13:55
To: [email protected]
Subject: RE: java.lang.NullPointerException while trying to crawl RSS feed to Solr

Hi Jens,

The trace you gave me is out of date wrt trunk by at least a month. Would you be willing to sync up to the latest LCF, and see how you do with that? If you still see a trace, I'd be happy to analyze it and perhaps check in a patch.
Karl

From: Wright Karl (Nokia-MS/Cambridge)
Sent: Wednesday, July 21, 2010 6:51 AM
To: [email protected]; [email protected]
Subject: RE: java.lang.NullPointerException while trying to crawl RSS feed to Solr

The 'connection working' from RSS doesn't mean much. But the 'connection working' from Solr means that LCF could talk to Solr and do a ping. In any case, you should never see an NPE from LCF, so I am going to look into this at the earliest opportunity. It is possible that the NPE is masking some other error, but maybe it is just broken.

Karl

--- original message ---
From: "ext Jens Bengtsson" <[email protected]>
Subject: java.lang.NullPointerException while trying to crawl RSS feed to Solr
Date: July 21, 2010
Time: 6:38:7 AM

Hi!

I have set up a connector against an RSS feed with output to a Solr server. The repository connection and output connection both report that the connection is OK. When I run the job it seems to retrieve the RSS feed and process everything as it should, but the data does not seem to get indexed into Solr. If I look in the LCF log file I find the following:

Error tossed: null
java.lang.NullPointerException
    at org.apache.lcf.agents.output.solr.HttpPoster$IngestThread.run(HttpPoster.java:976)

I guess there's an error when it tries to post the data to Solr, but I can't figure out what the problem is. If I look at the catalina log for the Tomcat where Solr is run, I can't find any errors or anything else.

Does anyone have any tips?
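
Note on the "HR" parsing error quoted in the thread above: that message is what a strict XML parser typically reports when it is handed an HTML error page instead of the XML it expects, which matches Karl's reading that the 500 response was plain HTML. The following is a minimal Java sketch of that failure mode only. It is not LCF's XMLDoc code; the class name HtmlAsXmlDemo and the sample HTML body are made up for illustration, and it uses just the JDK's built-in XML parser.

```java
import java.io.StringReader;
import javax.xml.parsers.DocumentBuilder;
import javax.xml.parsers.DocumentBuilderFactory;
import org.xml.sax.InputSource;
import org.xml.sax.SAXParseException;
import org.xml.sax.helpers.DefaultHandler;

// Hypothetical demo class; not part of LCF or Solr.
class HtmlAsXmlDemo {

    /**
     * Tries to parse the given text as XML.
     * Returns null if it is well-formed, otherwise the parser's error message.
     */
    public static String parseError(String body) throws Exception {
        DocumentBuilder builder =
                DocumentBuilderFactory.newInstance().newDocumentBuilder();
        // DefaultHandler rethrows fatal errors without printing the
        // "[Fatal Error]" line that the default handler writes to the console.
        builder.setErrorHandler(new DefaultHandler());
        try {
            builder.parse(new InputSource(new StringReader(body)));
            return null;
        } catch (SAXParseException e) {
            return e.getMessage();
        }
    }

    public static void main(String[] args) throws Exception {
        // Assumed shape of a servlet-container error page: an <HR> with no
        // closing tag is fine in HTML but not well-formed XML.
        String htmlErrorPage =
                "<html><body><h1>HTTP Status 500</h1><HR size=\"1\"></body></html>";
        // Assumed shape of an XML reply; the element names are illustrative.
        String xmlReply = "<response><status>0</status></response>";

        System.out.println("HTML page: " + parseError(htmlErrorPage));
        System.out.println("XML reply: " + parseError(xmlReply));
    }
}
```

If the parser complains about an unterminated HTML element such as HR, the useful next step is to look at the raw response body itself, since the real error text is inside that HTML page rather than in the parser message.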
