The response you are seeing from the Solr webapp is not very helpful.  Tomcat 
usually has several logs you can dig through (stdout captures, stderr 
captures, etc.), but they differ somewhat depending on the platform you are 
using.  If you can't find any exceptions there, remember that Solr has a 
configurable logging setup that may also be useful, but you'll have to refer to 
the Solr documentation for how to set that up and where to look.  I usually 
just run Solr using the Jetty example server, so I'm afraid I'm not going to be 
much help to you there.

Karl


From: ext Jens Bengtsson [mailto:[email protected]]
Sent: Wednesday, July 21, 2010 10:04 AM
To: [email protected]
Subject: RE: java.lang.NullPointerException while trying to crawl RSS feed to 
Solr

OK.

The Solr connection is still saying "Connection working".

Looking at the Tomcat log, I can't find any Solr exceptions. Pardon my 
ignorance, but is there any log other than the catalina log I should be looking 
at?


Here's some output from the simple history:

Internal Server Error 07-21-2010 15:56:50.283 document ingest (solr) 
http://giantbomb.com/transformers-war-for-cybertron/61-29405/...reviews/ 500 
133674 327
Internal Server Error 07-21-2010 15:56:50.032 document ingest (solr) 
http://giantbomb.com/podcast/?podcast_id=160 500 82109 380
Internal Server Error 07-21-2010 15:56:49.787 document ingest (solr) 
http://giantbomb.com/modnation-racers/61-26848/reviews/ 500 131147 303
Internal Server Error 07-21-2010 15:56:49.583 document ingest (solr) 
http://giantbomb.com/sin-punishment-star-successor/61-23966/r... eviews/ 500 
99252 297
Internal Server Error 07-21-2010 15:56:49.458 document ingest (solr) 
http://giantbomb.com/podcast/?podcast_id=163 500 82524 280
Internal Server Error 07-21-2010 15:56:49.216 document ingest (solr) 
http://giantbomb.com/podcast/?podcast_id=156 500 82493 328
Internal Server Error 07-21-2010 15:56:47.772 document ingest (solr) 
http://giantbomb.com/ufc-undisputed-2010/61-29376/reviews/ 500 132435 342 
Internal Server Error

Jens

From: [email protected] [mailto:[email protected]]
Sent: 21 July 2010 15:55
To: [email protected]
Subject: RE: java.lang.NullPointerException while trying to crawl RSS feed to 
Solr

What that means is that the response coming from Solr is not the expected XML 
either.  It sounds like it is just plain old HTML, which is strange if you are 
actually talking to Solr.

When you view your Solr connection in the LCF UI, does it still say "Connection 
working"?

The error code of 500 you reported is also almost certainly coming from Solr, 
so you should be able to get a stack trace from it that would explain the 
problem.  There may well be additional Solr arguments you need to add to the 
connection to make everything work.  The stack trace will tell us what the 
problem actually is.

An alternative might be to look at the Simple History report for one of the 
failed Solr indexing attempts - that may well list the actual response back 
from Solr as the error text.
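
If neither the logs nor the Simple History show the response body, one way to 
see it is to post a small test document to Solr by hand and print whatever 
comes back.  The following is only a rough sketch, not what LCF does 
internally.  The class name SolrResponseDump, the URL 
http://localhost:8080/solr/update, and the "id" field are assumptions; adjust 
them to match your own install and schema.

```java
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.InputStream;
import java.io.OutputStream;
import java.net.HttpURLConnection;
import java.net.URL;
import java.nio.charset.StandardCharsets;

// Hypothetical helper for illustration; not part of LCF or Solr.
class SolrResponseDump {
    /** POSTs an XML body and returns "status\nbody", reading the error stream on 4xx/5xx. */
    static String post(String url, String xml) throws IOException {
        HttpURLConnection conn = (HttpURLConnection) new URL(url).openConnection();
        conn.setRequestMethod("POST");
        conn.setDoOutput(true);
        conn.setRequestProperty("Content-Type", "text/xml; charset=UTF-8");
        try (OutputStream out = conn.getOutputStream()) {
            out.write(xml.getBytes(StandardCharsets.UTF_8));
        }
        int status = conn.getResponseCode();
        // For error statuses the body is only available via getErrorStream().
        InputStream in = status >= 400 ? conn.getErrorStream() : conn.getInputStream();
        ByteArrayOutputStream buf = new ByteArrayOutputStream();
        if (in != null) {
            try (InputStream body = in) {
                byte[] chunk = new byte[4096];
                for (int n; (n = body.read(chunk)) != -1; ) {
                    buf.write(chunk, 0, n);
                }
            }
        }
        return status + "\n" + new String(buf.toByteArray(), StandardCharsets.UTF_8);
    }

    public static void main(String[] args) throws IOException {
        // Assumed URL: adjust host, port, and path to your own Solr install.
        String url = args.length > 0 ? args[0] : "http://localhost:8080/solr/update";
        System.out.println(post(url, "<add><doc><field name=\"id\">test-1</field></doc></add>"));
    }
}
```

If the body that comes back is an HTML error page rather than Solr's XML, the 
text of that page usually says which component rejected the request.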

Karl


From: ext Jens Bengtsson [mailto:[email protected]]
Sent: Wednesday, July 21, 2010 9:49 AM
To: [email protected]
Subject: RE: java.lang.NullPointerException while trying to crawl RSS feed to 
Solr

Missed this in the console:

[Fatal Error] :115:120: The element type "HR" must be terminated by the matching end-tag "</HR>".
org.apache.lcf.core.interfaces.LCFException: XML parsing error: The element type "HR" must be terminated by the matching end-tag "</HR>".
        at org.apache.lcf.core.common.XMLDoc.init(XMLDoc.java:369)
        at org.apache.lcf.core.common.XMLDoc.<init>(XMLDoc.java:317)
        at org.apache.lcf.agents.output.solr.HttpPoster.getResponse(HttpPoster.java:537)

So there's a parsing error for the XML.
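
For what it's worth, this kind of message is what a strict XML parser reports 
when it is handed HTML containing an unclosed tag.  Below is a small sketch 
using the JDK's built-in parser.  The class name HtmlAsXmlDemo and the 
ERROR_PAGE string are made up for illustration; the string is only a stand-in 
for an HTML error page, not the actual response from the server.

```java
import java.io.ByteArrayInputStream;
import java.nio.charset.StandardCharsets;
import javax.xml.parsers.DocumentBuilder;
import javax.xml.parsers.DocumentBuilderFactory;
import org.xml.sax.SAXParseException;

// Hypothetical demo class; not part of LCF.
class HtmlAsXmlDemo {
    // Assumed stand-in for a servlet container's HTML error page (note the unclosed <HR>).
    static final String ERROR_PAGE =
        "<html><body><h1>HTTP Status 500</h1><HR size=\"1\">"
        + "<p>Internal Server Error</p></body></html>";

    /** Returns null if the text parses as XML, otherwise the parser's error message. */
    static String xmlParseError(String text) throws Exception {
        DocumentBuilder builder = DocumentBuilderFactory.newInstance().newDocumentBuilder();
        try {
            builder.parse(new ByteArrayInputStream(text.getBytes(StandardCharsets.UTF_8)));
            return null;
        } catch (SAXParseException e) {
            // HTML that is not well-formed XML ends up here.
            return e.getMessage();
        }
    }

    public static void main(String[] args) throws Exception {
        System.out.println(xmlParseError(ERROR_PAGE));
    }
}
```

So the parse error most likely says more about the kind of response (HTML 
instead of XML) than about the document being indexed.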

From: Jens Bengtsson [mailto:[email protected]]
Sent: 21 July 2010 15:36
To: [email protected]
Subject: RE: java.lang.NullPointerException while trying to crawl RSS feed to 
Solr

So I updated and this is the error I get in the log now:

Service interruption reported for job 1279719088042 connection 'Giantbomb RSS': 
Error 500 from ingestion request; ingestion will be retried again later

From: Jens Bengtsson [mailto:[email protected]]
Sent: 21 July 2010 14:41
To: [email protected]
Subject: RE: java.lang.NullPointerException while trying to crawl RSS feed to 
Solr

No worries!

I'm very thankful for your help.

Jens

From: [email protected] [mailto:[email protected]]
Sent: 21 July 2010 14:34
To: [email protected]
Subject: RE: java.lang.NullPointerException while trying to crawl RSS feed to 
Solr

Yesterday should be fine.  I overlooked something and have checked in a fix.  
My apologies.

Karl



From: ext Jens Bengtsson [mailto:[email protected]]
Sent: Wednesday, July 21, 2010 8:26 AM
To: [email protected]
Subject: RE: java.lang.NullPointerException while trying to crawl RSS feed to 
Solr

That's strange, because I did the checkout from 
https://svn.apache.org/repos/asf/incubator/lcf/trunk yesterday, and I did an 
update today and rebuilt everything, so things should be in sync with trunk.

Jens

From: [email protected] [mailto:[email protected]]
Sent: 21 July 2010 13:55
To: [email protected]
Subject: RE: java.lang.NullPointerException while trying to crawl RSS feed to 
Solr

Hi Jens,

The trace you gave me is out of date with respect to trunk by at least a month.  
Would you be willing to sync up to the latest LCF and see how you do with that?  
If you still see a trace, I'd be happy to analyze it and perhaps check in a 
patch.

Karl


From: Wright Karl (Nokia-MS/Cambridge)
Sent: Wednesday, July 21, 2010 6:51 AM
To: [email protected]; [email protected]
Subject: RE: java.lang.NullPointerException while trying to crawl RSS feed to 
Solr


The 'connection working' from RSS doesn't mean much.  But the 'connection 
working' from Solr means that LCF could talk to Solr and do a ping.



In any case, you should never see an NPE from LCF, so I am going to look into 
this at the earliest opportunity.  It is possible that the NPE is masking some 
other error, but maybe it is just broken.



Karl



--- original message ---

From: "ext Jens Bengtsson" <[email protected]>

Subject: java.lang.NullPointerException while trying to crawl RSS feed to Solr

Date: July 21, 2010

Time: 6:38:7  AM


Hi!

I have set up a connector against an RSS feed with output to a Solr server. The 
repository connection and output connection both report that the connection is 
OK.

When I run the job, it seems to retrieve the RSS feed and process everything as 
it should, but the data does not seem to get indexed into Solr.

If I look in the lcf log file I find the following:
Error tossed: null
java.lang.NullPointerException
        at org.apache.lcf.agents.output.solr.HttpPoster$IngestThread.run(HttpPoster.java:976)

I guess there's an error when it tries to post the data to Solr, but I can't 
figure out what the problem is. If I look at the catalina log for the Tomcat 
instance where Solr runs, I can't find any errors or anything else relevant.

Does anyone have any tips?
