On 8/16/2016 8:53 PM, Jaspal Sawhney wrote:
> We are running solr 4.6 in master-slave configuration wherein our master is
> used entirely for indexing. No search traffic comes to master ever.
> Of late we have started to get the early EOF error on the solr Master, which
> results in a Broken Pipe error on the commerce application from which
> indexing was kicked off.
<snip>
> 1. Since we are not using solrCloud – I want to understand the impact of
> bypassing transaction log
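(For reference, both the transaction log and the commit settings discussed below are configured in solrconfig.xml, inside the updateHandler element. A sketch, with element placement based on the stock example config shipped with 4.x:)

```xml
<updateHandler class="solr.DirectUpdateHandler2">
  <!-- Enables the transaction log; dir falls back to the data directory -->
  <updateLog>
    <str name="dir">${solr.ulog.dir:}</str>
  </updateLog>
  <!-- Hard commit once a minute, without opening a new searcher -->
  <autoCommit>
    <maxTime>60000</maxTime>
    <openSearcher>false</openSearcher>
  </autoCommit>
</updateHandler>
```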
The transaction log is required for SolrCloud, but it is highly recommended for ANY Solr install. Here's how to keep the tlog directory from growing out of control: configure autoCommit with openSearcher set to false, no maxDocs setting, and a maxTime of 60000 (one minute). You'll commonly see 15 seconds recommended for maxTime -- my opinion is that this is too frequent, but if you choose to use that, I doubt you'll have any issues. I think the newest versions of Solr do set up autoCommit like this with a 15-second maxTime.

If you disable the transaction log and there is some kind of crash during indexing, you may lose documents. When the transaction log is present, it is replayed when the core starts.

> 2. How does solr take documents which are sent to it to storage as in what
> is the journey of a document from segment to tlog to storage.

Assuming no cloud mode: when a document arrives for indexing, it is written to the tlog and handed to Lucene for processing. When the Lucene indexing buffer fills up, or a commit is issued, the segment is flushed. Most of the time it will be flushed to disk, but if the segment is very small and a soft commit is used, it may be flushed to RAM instead -- this is a function of NRTCachingDirectoryFactory, which is the default. Cloud mode is slightly more complicated, but the behavior is the same once the document arrives at the correct core(s) that will index it.

The "early EOF" exception came from Jetty, not Solr. Based on how EOF is used by Jetty errors in other contexts, I think it means that the indexing client closed the connection before all the data was sent, which probably means you have a low socket timeout on the client. The server likely paused while receiving the data, probably to handle the data it had already received ... and the pause was longer than the socket timeout, causing the client to close the connection.
Another possibility is that the network is not working well, or that one of the operating systems or software libraries involved has TCP or HTTP bugs. I could be wrong about what the exception means, but the information I was able to locate quickly supports the idea. If I am right, then you will need to either reduce the amount of data you send in a single update request, or increase the socket timeout that the indexing client is using on its connections.

Erick's idea that your update request is exceeding the maximum POST body size is something I hadn't thought of. The default for this limit is 2MB, and it can be increased in solrconfig.xml. I suspect that this isn't the problem, but it's something to investigate.

Thanks,
Shawn
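P.S. If the POST body limit does turn out to be the culprit, it is raised in the requestDispatcher section of solrconfig.xml. A sketch -- the attribute values shown are the 2MB defaults, expressed in kilobytes:

```xml
<requestDispatcher>
  <!-- Both limits default to 2048 KB (2MB); raise them together -->
  <requestParsers enableRemoteStreaming="false"
                  multipartUploadLimitInKB="2048"
                  formdataUploadLimitInKB="2048" />
</requestDispatcher>
```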