On 8/16/2016 8:53 PM, Jaspal Sawhney wrote:
> We are running solr 4.6 in master-slave configuration where in our master is 
> used entirely for indexing. No search traffic comes to master ever.
> Off late we have started to get the early EOF error on the solr Master which 
> results in a Broken Pipe error on the commerce application from where 
> Indexing was kicked off from.
<snip>
>   1.  Since we are not using solrCloud – I want to understand the impact of 
> bypassing transaction log

The transaction log is required for SolrCloud, but it is highly
recommended for ANY Solr install.

Here's how to keep the tlog directory from growing out of
control:  Configure autoCommit with openSearcher set to false, no
maxDocs setting, and a maxTime of 60000 (one minute).  You'll commonly
see 15 seconds recommended as a maxTime -- my opinion is that this is
too frequent, but if you choose to use it, I doubt you'll have any
issues.  I believe the newest versions of Solr ship with autoCommit
configured like this, using a 15 second maxTime.
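
As a sketch, the autoCommit settings described above would look
something like this in solrconfig.xml, inside the updateHandler
element (exact surrounding config will vary with your install):

```xml
<updateHandler class="solr.DirectUpdateHandler2">
  <!-- Hard commit every 60 seconds to flush segments and truncate
       the tlog, without opening a new searcher (no impact on what
       queries see, which doesn't matter on an indexing-only master
       anyway). -->
  <autoCommit>
    <maxTime>60000</maxTime>
    <openSearcher>false</openSearcher>
  </autoCommit>
</updateHandler>
```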

If you disable the transaction log and have some kind of crash during
indexing, you may lose documents.  When it is present, the transaction
log will be replayed when the core starts.
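
For reference, the transaction log is controlled by the updateLog
element inside updateHandler in solrconfig.xml -- deleting it is how
you would disable the tlog, which as I said above is not recommended:

```xml
<!-- Remove this element to disable the transaction log
     (NOT recommended -- you lose crash recovery). -->
<updateLog>
  <str name="dir">${solr.ulog.dir:}</str>
</updateLog>
```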

>   2.  How does solr take documents which are sent to it to storage as in what 
> is the journey of a document from segment to tlog to storage.

Assuming no cloud mode, when a document arrives for indexing, it is
written to the tlog and sent to Lucene for processing.  When the Lucene
indexing buffer fills up, or a commit is issued, then the segment is
flushed.  Most of the time it will be flushed to disk, but if the
segment is very small and a soft commit is used, it may be flushed to
RAM instead -- this is a function of NRTCachingDirectoryFactory, which
is the default.

Cloud mode is slightly more complicated, but the behavior would be the
same once the document arrives at the correct core(s) that will index it.

The "early EOF" exception came from Jetty, not Solr.  Based on how EOF
is used by Jetty errors in other contexts, I think it means that the
indexing client closed the connection before all the data was sent,
which probably means that you have a low socket timeout on the client. 
The server likely paused while receiving the data, probably to handle
the data it had already received ... and the pause was longer than the
socket timeout, causing the client to close the connection.  Another
possibility is that the network is not working well, or that one of the
operating systems or software libraries involved has TCP or HTTP bugs.

I could be wrong about what the exception means, but the information I
was able to quickly locate supports the idea.

If I am right, then you will need to either reduce the amount of data
that you send in a single update request, or increase the socket timeout
that the indexing client is using on its connections.
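
If you go the route of reducing the request size, the usual approach
is to split the document stream into smaller batches on the client
side.  A minimal sketch in Python -- send_batch here is a hypothetical
stand-in for whatever update call your commerce application makes:

```python
def chunked(docs, batch_size):
    """Yield successive batches of at most batch_size documents."""
    for i in range(0, len(docs), batch_size):
        yield docs[i:i + batch_size]

# Hypothetical usage: send each smaller batch as its own update
# request, so no single request takes long enough for the server-side
# pause to exceed the client's socket timeout.
#
# for batch in chunked(all_docs, 500):
#     send_batch(batch)   # your existing update call
```

Smaller batches also keep each request comfortably under any POST
body size limit, which ties into Erick's point below.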

Erick's idea that your update request might exceed the maximum POST
body size is something I hadn't thought of.  The default for this
limit is 2MB; it can be increased in solrconfig.xml.  I suspect this
isn't the problem, but it's worth investigating.
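
The limit Erick mentioned lives in the requestParsers element inside
requestDispatcher in solrconfig.xml.  The values below are the 2MB
defaults, expressed in kilobytes -- raise them if your update requests
really are that large:

```xml
<requestDispatcher>
  <!-- Both limits default to 2048 KB (2MB). -->
  <requestParsers multipartUploadLimitInKB="2048"
                  formdataUploadLimitInKB="2048" />
</requestDispatcher>
```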

Thanks,
Shawn
