Hi Erick,

After reading the discussion you guys were having about renaming optimize
to forceMerge, I realized I was guilty of over-optimizing in exactly the
way you were worried about! We have about 15 million docs indexed now and
we push about 50-300 adds per second, 24/7, most of them updates to
existing documents whose data has changed since the last time it was
indexed (which we keep track of in a DB table). There are some new
documents being added in the mix, and some deletes as well.
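To make that concrete, each of our updater threads does roughly the
following for a changed document (a simplified SolrJ 3.6 sketch; the URL
and field names are made-up placeholders, not our real schema):

import org.apache.solr.client.solrj.SolrServer;
import org.apache.solr.client.solrj.impl.HttpSolrServer;
import org.apache.solr.common.SolrInputDocument;

public class UpdaterSketch {
    public static void main(String[] args) throws Exception {
        SolrServer solr = new HttpSolrServer("http://master:8080/solr");

        SolrInputDocument doc = new SolrInputDocument();
        doc.addField("id", "sku-12345");  // same uniqueKey as the old version
        doc.addField("price", 19.99);     // fresh data pulled from the DB
        doc.addField("inStock", true);

        // Adding a doc whose uniqueKey already exists deletes the old
        // version and indexes the new one, so every "update" is really
        // a delete plus a re-add as far as the index is concerned.
        solr.add(doc);
        // commits happen separately (autoCommit on the master)
    }
}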
I understand now how the merge policy caps the number of segments; I used
to think they would grow unbounded and that optimize was therefore
required. But how does our large volume of updates to existing documents
affect the need to optimize, given that each update amounts to a delete
plus a re-add? I suppose that means the index size tends to grow with the
deleted docs hanging around in the background, as it were. So in our
situation, what frequency of optimize would you recommend? We're on 3.6.1
btw...

Thanks,
Robi

-----Original Message-----
From: Erick Erickson [mailto:erickerick...@gmail.com]
Sent: Thursday, October 11, 2012 5:29 AM
To: solr-user@lucene.apache.org
Subject: Re: anyone have any clues about this exception

Well, you'll still be able to optimize; it's just called forceMerge now.
But the point is that optimize seems like something that _of course_ you
want to do, when in reality it's not something you usually should do at
all.

Optimize does two things:
1> merges all the segments into one (usually)
2> removes all of the info associated with deleted documents

Of the two, point <2> is the one that really counts, and that's done
whenever segment merging happens anyway. So unless you have a very large
number of deletes (or updates of the same document), optimize buys you
very little. You can tell how much deleted data you're carrying by the
difference between numDocs and maxDoc on the admin page.
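If you want to check that difference without eyeballing the admin page,
something like this untested SolrJ 3.6 sketch (the URL is a placeholder)
pulls both numbers from the Luke request handler:

import org.apache.solr.client.solrj.SolrServer;
import org.apache.solr.client.solrj.impl.HttpSolrServer;
import org.apache.solr.client.solrj.request.LukeRequest;
import org.apache.solr.client.solrj.response.LukeResponse;
import org.apache.solr.common.util.NamedList;

public class DeletedDocsCheck {
    public static void main(String[] args) throws Exception {
        SolrServer solr = new HttpSolrServer("http://master:8080/solr");

        // /admin/luke reports index-level stats, including doc counts
        LukeResponse rsp = new LukeRequest().process(solr);
        NamedList<Object> index = rsp.getIndexInfo();

        int numDocs = (Integer) index.get("numDocs"); // live (searchable) docs
        int maxDoc  = (Integer) index.get("maxDoc");  // live docs plus deletes not yet merged away
        System.out.println("deleted docs still in the index: " + (maxDoc - numDocs));
    }
}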
So what happens if you just don't bother to optimize? As an alternative,
take a look at the merge policy settings to help control how merging
happens.

Best,
Erick

On Wed, Oct 10, 2012 at 3:04 PM, Petersen, Robert <rober...@buy.com> wrote:
> You could be right. Going back in the logs, I noticed it used to happen
> less frequently, and always towards the end of an optimize operation. It
> is probably my indexer timing out while waiting for updates to complete
> during optimizes. The errors grew recently because I upped the indexer
> thread count to 22 threads, so there are a lot more timeouts occurring
> now. Also, our index has grown to double its old size, so the optimize
> operation has started taking a lot longer, which also contributes to
> what I'm seeing. I have just changed my optimize frequency from three
> times a day to once a day after reading the following:
>
> Here they are talking about completely deprecating the optimize command
> in the next version of Solr...
> https://issues.apache.org/jira/browse/SOLR-3141
>
>
> -----Original Message-----
> From: Alexandre Rafalovitch [mailto:arafa...@gmail.com]
> Sent: Wednesday, October 10, 2012 11:10 AM
> To: solr-user@lucene.apache.org
> Subject: Re: anyone have any clues about this exception
>
> Something timed out and the other end closed the connection. Then this
> end tried to write to the closed pipe and died, and something tried to
> catch that exception, write its own error, and died even worse? Just
> making it up really, but it sounds plausible (plus a three-year Java
> tech-support hunch).
>
> If it happens often enough, see if you can run WireShark on that
> machine's network interface and catch the whole network conversation in
> action. Often there are enough clues just from looking at the TCP
> packets and/or the stuff transmitted. WireShark is a power tool, so it
> takes a little while the first time, but the learning will pay for
> itself over and over again.
>
> Regards,
>    Alex.
>
> Personal blog: http://blog.outerthoughts.com/
> LinkedIn: http://www.linkedin.com/in/alexandrerafalovitch
> - Time is the quality of nature that keeps events from happening all at
> once. Lately, it doesn't seem to be working. (Anonymous - via GTD book)
>
>
> On Wed, Oct 10, 2012 at 11:31 PM, Petersen, Robert <rober...@buy.com> wrote:
>> The Tomcat localhost log (not the catalina log) for my Solr 3.6.1
>> (master) instance contains lots of these exceptions, but Solr itself
>> seems to be doing fine... any ideas? I'm not seeing these exceptions
>> logged on my slave servers btw, just on the master, where we do all
>> our indexing.
>>
>> Oct 9, 2012 5:34:11 PM org.apache.catalina.core.StandardWrapperValve invoke
>> SEVERE: Servlet.service() for servlet default threw exception
>> java.lang.IllegalStateException
>>         at org.apache.catalina.connector.ResponseFacade.sendError(ResponseFacade.java:407)
>>         at org.apache.solr.servlet.SolrDispatchFilter.sendError(SolrDispatchFilter.java:389)
>>         at org.apache.solr.servlet.SolrDispatchFilter.doFilter(SolrDispatchFilter.java:291)
>>         at org.apache.catalina.core.ApplicationFilterChain.internalDoFilter(ApplicationFilterChain.java:235)
>>         at org.apache.catalina.core.ApplicationFilterChain.doFilter(ApplicationFilterChain.java:206)
>>         at org.apache.catalina.core.StandardWrapperValve.invoke(StandardWrapperValve.java:233)
>>         at org.apache.catalina.core.StandardContextValve.invoke(StandardContextValve.java:191)
>>         at org.apache.catalina.core.StandardHostValve.invoke(StandardHostValve.java:128)
>>         at com.googlecode.psiprobe.Tomcat60AgentValve.invoke(Tomcat60AgentValve.java:30)
>>         at org.apache.catalina.valves.ErrorReportValve.invoke(ErrorReportValve.java:102)
>>         at org.apache.catalina.core.StandardEngineValve.invoke(StandardEngineValve.java:109)
>>         at org.apache.catalina.connector.CoyoteAdapter.service(CoyoteAdapter.java:293)
>>         at org.apache.coyote.http11.Http11Processor.process(Http11Processor.java:849)
>>         at org.apache.coyote.http11.Http11Protocol$Http11ConnectionHandler.process(Http11Protocol.java:583)
>>         at org.apache.tomcat.util.net.JIoEndpoint$Worker.run(JIoEndpoint.java:454)
>>         at java.lang.Thread.run(Unknown Source)