Weird problems with document size

2008-05-09 Thread Andrew Savory
Hi,

I'm trying to debug a misbehaving solr search setup. Here's the scenario:

- custom index client that posts insert/delete events to solr via http;
- custom content handlers in solr;
- tcpmon in the middle to see what's going on

When I post an add event of less than about 5k to solr, everything works:
lock/
add/
commit/
unlock/

When I post a larger event, it goes wrong. The response from solr is a
500 server error (text of which is below).

The content should be good - it's lorem ipsum.
The tomcat server has maxPostSize disabled.
The solr config has field size set to a large number (and we've tested
with several big fields less than the limit, as well as one big field
- anything over about 5k trips it, regardless of how the data is stored
in the fields).

I've also tried pushing the same content using the command line and
curl - with the same result.

At this point I'm baffled - any suggestions?
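
One way to narrow down the size threshold is to generate add documents of a controlled size and bisect on the first size that fails. The sketch below is only an illustration, not the index client described above: the field name `body`, the id value `size-test` and the `post` callable are hypothetical, and `post` is assumed to return True when solr accepts the document.

```python
from typing import Callable
from xml.sax.saxutils import escape


def build_add_xml(size: int, field: str = "body") -> str:
    """Build a Solr <add> message whose field text is `size` characters long.

    The field name and the id value are placeholders for illustration.
    """
    filler = ("lorem ipsum " * (size // 12 + 1))[:size]
    return (
        "<add><doc>"
        '<field name="id">size-test</field>'
        f'<field name="{field}">{escape(filler)}</field>'
        "</doc></add>"
    )


def first_failing_size(post: Callable[[str], bool], low: int, high: int) -> int:
    """Bisect for the smallest field size that `post` rejects.

    Assumes `post` succeeds at `low`, fails at `high`, and that failures
    are monotonic in size.
    """
    while high - low > 1:
        mid = (low + high) // 2
        if post(build_add_xml(mid)):
            low = mid
        else:
            high = mid
    return high
```

In practice `post` would wrap whatever HTTP client is already in use (the custom client or curl), so the same bisection can be repeated against different servlet containers.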


Those pesky errors:

java.io.EOFException: no more data available - expected end tags
</field></doc></add> to close start tag
<field> from line 1 and start tag <doc> from line 1 and
start tag <add> from line 1, parser stopped on START_TAG seen
...ipsum\tDolor sit amet\tlorem ipsum\tfoo\tbar... @1:8192
at org.xmlpull.mxp1.MXParser.fillBuf(MXParser.java:3015)
at org.xmlpull.mxp1.MXParser.more(MXParser.java:3026)
at org.xmlpull.mxp1.MXParser.nextImpl(MXParser.java:1384)
at org.xmlpull.mxp1.MXParser.next(MXParser.java:1093)
at org.xmlpull.mxp1.MXParser.nextText(MXParser.java:1058)
at org.apache.solr.handler.XmlUpdateRequestHandler.readDoc(XmlUpdateRequestHandler.java:332)
at org.apache.solr.handler.XmlUpdateRequestHandler.update(XmlUpdateRequestHandler.java:162)
at org.apache.solr.handler.XmlUpdateRequestHandler.doLegacyUpdate(XmlUpdateRequestHandler.java:355)
at org.apache.solr.servlet.SolrUpdateServlet.doPost(SolrUpdateServlet.java:58)
at javax.servlet.http.HttpServlet.service(HttpServlet.java:710)
at javax.servlet.http.HttpServlet.service(HttpServlet.java:803)
at org.apache.catalina.core.ApplicationFilterChain.internalDoFilter(ApplicationFilterChain.java:290)
at org.apache.catalina.core.ApplicationFilterChain.doFilter(ApplicationFilterChain.java:206)
at org.apache.solr.servlet.SolrDispatchFilter.doFilter(SolrDispatchFilter.java:185)
at org.apache.catalina.core.ApplicationFilterChain.internalDoFilter(ApplicationFilterChain.java:235)
at org.apache.catalina.core.ApplicationFilterChain.doFilter(ApplicationFilterChain.java:206)
at org.apache.catalina.core.StandardWrapperValve.invoke(StandardWrapperValve.java:233)
at org.apache.catalina.core.StandardContextValve.invoke(StandardContextValve.java:175)
at org.apache.catalina.core.StandardHostValve.invoke(StandardHostValve.java:128)
at org.apache.catalina.valves.ErrorReportValve.invoke(ErrorReportValve.java:102)
at org.apache.catalina.core.StandardEngineValve.invoke(StandardEngineValve.java:109)
at org.apache.catalina.connector.CoyoteAdapter.service(CoyoteAdapter.java:286)
at org.apache.coyote.http11.Http11Processor.process(Http11Processor.java:844)
at org.apache.coyote.http11.Http11Protocol$Http11ConnectionHandler.process(Http11Protocol.java:583)
at org.apache.tomcat.util.net.JIoEndpoint$Worker.run(JIoEndpoint.java:447)
at java.lang.Thread.run(Thread.java:619)
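
The position marker at the end of the parser message (@1:8192) gives the line and column at which the input ran out. As a small sketch, the snippet below pulls those numbers out of the message and checks whether the column falls on a multiple of 1024. A hit suggests, but does not prove, that the body ended at a buffer boundary rather than at a bad character in the content.

```python
import re

# Parser message copied from the trace above (shortened).
message = "parser stopped on START_TAG seen ...lorem ipsum\\tfoo\\tbar... @1:8192"

# The trailing "@line:column" marks where the parser ran out of input.
match = re.search(r"@(\d+):(\d+)\s*$", message)
line, column = (int(group) for group in match.groups())

# 8192 is exactly 8 * 1024, a common I/O buffer size.
on_buffer_boundary = column % 1024 == 0
print(line, column, on_buffer_boundary)
```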


HTTP/1.1 500 Internal Server Error
Server: Apache-Coyote/1.1
Content-Type: text/html;charset=utf-8
Content-Length: 2509
Date: Fri, 09 May 2008 10:03:46 GMT
Connection: close



Apache Tomcat/6.0.16 - Error report

HTTP Status 500 -

type Exception report
message
description The server encountered an internal error () that prevented it from
fulfilling this request.
exception

java.lang.RuntimeException:
org.xmlpull.v1.XmlPullParserException: only whitespace content allowed
before start tag and not t (position: START_DOCUMENT seen t... @1:1)
com.wiley.wps.search.common.impl.servlet.SolrServletConfig.process(SolrServletConfig.java:106)
com.wiley.wps.search.common.impl.servlet.SolrServletConfig.doPost(SolrServletConfig.java:63)

Re: Weird problems with document size

2008-05-09 Thread Andrew Savory
Hi,

On 09/05/2008, Otis Gospodnetic [EMAIL PROTECTED] wrote:

  I don't understand what that lock and unlock is for...
  Just do this:
  add
  add
  add
  add
  ...
  ...
  optionally commit or optimize

Yeah, I didn't understand what the lock/unlock was for either - but on
further reviewing the code, we have a wrapper around the solr servlet
which does a crude type of locking to ensure only one index updater
process can run at a time. Not sure it's needed, as I'd guess that
solr would handle things gracefully anyway, but it at least stops
multiple index clients firing up.
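
For illustration only, a crude "one updater at a time" lock of the kind described above can be built on an exclusively created lock file. This is a sketch under assumptions, not the actual wrapper around the solr servlet: the function name `single_updater` and the lock path are hypothetical.

```python
import os
from contextlib import contextmanager


@contextmanager
def single_updater(lock_path: str):
    """Hold an exclusive lock file for the duration of an index update.

    O_CREAT | O_EXCL makes creation fail if the file already exists, so a
    second updater gets FileExistsError instead of running concurrently.
    """
    fd = os.open(lock_path, os.O_CREAT | os.O_EXCL | os.O_WRONLY)
    try:
        # Record the owner's process id to help with manual cleanup.
        os.write(fd, str(os.getpid()).encode())
        yield
    finally:
        os.close(fd)
        os.remove(lock_path)
```

A client would wrap its add and commit calls in `with single_updater(path):`. A lock of this kind goes stale if the process dies before the `finally` block runs, which is one reason such schemes are usually described as crude.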

Meanwhile it seems that these documents can successfully be added to
solr when it is running in jetty, so I'm now trying to find out what
Tomcat is doing to break things.

Thanks for the reply,


Andrew.


Warning: latest Tomcat 6 release is broken (was Re: Weird problems with document size)

2008-05-13 Thread Andrew Savory
Hi,

Here's a warning for anyone trying to use solr in the latest release
of tomcat, 6.0.16.

Previously I was having problems successfully posting updates to a
solr instance running in tomcat:

2008/5/9 Andrew Savory [EMAIL PROTECTED]:

  Meanwhile it seems that these documents can successfully be added to
  solr when it is running in jetty, so I'm now trying to find out what
  Tomcat is doing to break things.

A colleague (thanks, Alexis!) has just unearthed a regression bug in
tomcat dating back to February that causes posts of more than 8k to be
truncated: https://issues.apache.org/bugzilla/show_bug.cgi?id=44494

So if you're using Tomcat, aim for 6.0.14 instead.


Andrew.
--
[EMAIL PROTECTED] / [EMAIL PROTECTED]
http://www.andrewsavory.com/


Re: Release date of SOLR 1.3

2008-05-19 Thread Andrew Savory
Hi,

2008/5/16 Noble Paul നോബിള്‍ नोब्ळ् [EMAIL PROTECTED]:
 If you have an immediate need, I must advise you against waiting
 for the solr1.3 release. The best strategy
 would be to take a nightly and start using it. Test it thoroughly and
 if bugs are found report them back. If everything is fine go into
 production with that.

Since most production environments are reluctant to use nightly builds
(regardless of how stable the trunk is), and since there's not been a
solr release in some time, would it be worth looking at what
outstanding issues are critical for 1.3 and perhaps pushing some over
to 1.4, and trying to do a release soon?

I think trunk has already sufficiently diverged to make it worth doing
a release, and I'd be happy to help wherever I can (since I could
really do with a more recent release to run).


Andrew.
--
[EMAIL PROTECTED] / [EMAIL PROTECTED]
http://www.andrewsavory.com/


Re: Release date of SOLR 1.3

2008-05-21 Thread Andrew Savory
Hi,

2008/5/21 Dan Thomas [EMAIL PROTECTED]:

 One year between releases is a very long time for such a useful and
 dynamic system.  Are project leaders willing to (re)consider the
 development process to prioritize improvements/features scope into
 chunks that can be accomplished in shorter time frames - say 90 days?
 In my experience, short dev iteration cycles that fix time and vary
 scope produce better results from all perspectives.

Well, in an active meritocracy, the "us and them" divide doesn't
really exist. If we (the users and developers) feel the need to have a
release, it's up to us to make it happen -- by contributing wherever
and whatever we can.

Fixed release cycles are difficult to achieve on any voluntary
project; a release usually happens only when enough people need to
scratch that particular itch.


Andrew.
--
[EMAIL PROTECTED] / [EMAIL PROTECTED]
http://www.andrewsavory.com/


*-enable and *-disable scripts

2008-05-23 Thread Andrew Savory
Hi,

I'm trying to understand what the enable and disable scripts do, for
example rsyncd-enable and rsyncd-disable.

As far as I can tell, all they do is touch or remove a file,
logs/rsyncd-enabled; they don't do anything more than that. The daemon
start script checks for the presence of this file before starting the
daemon.

This seems like an extra level of work that's not needed: if you can
enable the daemon you can also start it, so why not just start it?
What am I missing?
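
The mechanism as described above can be sketched in a few lines. This is a model of the behaviour outlined in this message, not a translation of the actual scripts: the function names are hypothetical, and the real start script launches the daemon where the sketch only returns True.

```python
from pathlib import Path


def enable(flag: Path) -> None:
    """Model of the *-enable script: create the marker file."""
    flag.touch()


def disable(flag: Path) -> None:
    """Model of the *-disable script: remove the marker file if present."""
    flag.unlink(missing_ok=True)


def start_daemon(flag: Path) -> bool:
    """Start only if enabled; returns whether the daemon would be started."""
    if not flag.exists():
        return False
    # The real start script would launch rsyncd here.
    return True
```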


Andrew.
--
[EMAIL PROTECTED] / [EMAIL PROTECTED]
http://www.andrewsavory.com/


Re: solr on ubuntu 8.04

2008-05-28 Thread Andrew Savory
Hi Jack,

2008/5/28 Jack Bates [EMAIL PROTECTED]:
 Thanks for your suggestions. I have now tried installing Solr on two
 different machines. On one machine I installed the Ubuntu solr-tomcat5.5
 package, and on the other I simply dropped solr.war
 into /var/lib/tomcat5.5/webapps

 Both machines are running Tomcat 5.5

 I get the same error message on both machines:

 SEVERE: Exception starting filter SolrRequestFilter
 java.lang.NoClassDefFoundError: Could not initialize class
 org.apache.solr.core.SolrConfig

 The full error message is attached.

Can you check that solr.home is set correctly?
I don't have an Ubuntu box handy at the moment, but I will try to look
into it tonight.
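
As a quick sanity check, the sketch below reports which of the usual configuration files are missing from a candidate solr home directory. It assumes the standard layout with conf/solrconfig.xml and conf/schema.xml under the home directory; the function name is hypothetical, and the error above can have other causes even when both files are present.

```python
from pathlib import Path

# Files a Solr 1.x home directory is expected to contain.
REQUIRED = ("conf/solrconfig.xml", "conf/schema.xml")


def missing_from_solr_home(home: str) -> list[str]:
    """Return the required config files that are absent under `home`."""
    root = Path(home)
    return [rel for rel in REQUIRED if not (root / rel).is_file()]
```

An empty list means both files were found under the given directory; anything else names what to look for first.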


Andrew.
--
[EMAIL PROTECTED] / [EMAIL PROTECTED]
http://www.andrewsavory.com/