[ https://issues.apache.org/jira/browse/SOLR-906?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12655298#action_12655298 ]
Ryan McKinley commented on SOLR-906:
------------------------------------

One basic problem with calling add( SolrInputDocument ) on the CommonsHttpSolrServer is that it logs a request for each document. This can have a substantial impact. For example, indexing 40K docs on my machine takes ~3 1/2 mins. If I turn logging off, the time drops to ~2 1/2 mins. With the streaming approach, the time drops to 20 sec!

Some of that is obviously because it limits the logging:

{code}
INFO: {add=[id1,id2,id3,id4, ...(38293 more)]} 0 20714
{code}

> Buffered / Streaming SolrServer implementation
> ----------------------------------------------
>
>          Key: SOLR-906
>          URL: https://issues.apache.org/jira/browse/SOLR-906
>      Project: Solr
>   Issue Type: New Feature
>   Components: clients - java
>     Reporter: Ryan McKinley
>      Fix For: 1.4
>
>
> While indexing lots of documents, the CommonsHttpSolrServer add(
> SolrInputDocument ) is less than optimal. This makes a new request for each
> document.
> With a "StreamingHttpSolrServer", documents are buffered and then written to
> a single open Http connection.
> For related discussion see:
> http://www.nabble.com/solr-performance-tt9055437.html#a20833680
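
To make the request-count difference concrete, here is a minimal sketch in plain Java. It is not SolrJ code and not the implementation proposed in this issue: BufferingIndexer, Transport and CountingTransport are hypothetical names, the transport only counts calls instead of opening an HTTP connection, and the batch size of 1000 is an arbitrary assumption. It also models fixed-size buffered batches, which is a simplification of writing to a single open connection.

```java
import java.util.ArrayList;
import java.util.List;

public class BufferingDemo {

    /** Hypothetical stand-in for the HTTP layer; one send() call models one request. */
    interface Transport {
        void send(List<String> docIds);
    }

    /** Counts requests and documents instead of talking to a server. */
    static class CountingTransport implements Transport {
        int requests = 0;
        int docs = 0;

        @Override
        public void send(List<String> docIds) {
            requests++;
            docs += docIds.size();
        }
    }

    /** Buffers documents and sends each full batch as a single request. */
    static class BufferingIndexer {
        private final Transport transport;
        private final int batchSize;
        private final List<String> buffer = new ArrayList<>();

        BufferingIndexer(Transport transport, int batchSize) {
            if (batchSize < 1) {
                throw new IllegalArgumentException("batchSize must be >= 1");
            }
            this.transport = transport;
            this.batchSize = batchSize;
        }

        void add(String docId) {
            buffer.add(docId);
            if (buffer.size() >= batchSize) {
                flush();
            }
        }

        /** Sends whatever is buffered; a no-op when the buffer is empty. */
        void flush() {
            if (buffer.isEmpty()) {
                return;
            }
            transport.send(new ArrayList<>(buffer));
            buffer.clear();
        }
    }

    public static void main(String[] args) {
        CountingTransport perDoc = new CountingTransport();
        CountingTransport batched = new CountingTransport();
        // batchSize 1 mimics one request per document.
        BufferingIndexer unbuffered = new BufferingIndexer(perDoc, 1);
        BufferingIndexer buffered = new BufferingIndexer(batched, 1000);

        for (int i = 0; i < 40000; i++) {
            unbuffered.add("id" + i);
            buffered.add("id" + i);
        }
        unbuffered.flush();
        buffered.flush();

        System.out.println("per-document requests: " + perDoc.requests); // 40000
        System.out.println("batched requests: " + batched.requests);     // 40
    }
}
```

With one log line per request, the per-document path would log 40000 entries for the same 40K documents, against 40 in the buffered path. Note that the final flush() matters: without it, a partially filled buffer would never be sent.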