[ https://issues.apache.org/jira/browse/NUTCH-2179?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15034511#comment-15034511 ]
Sebastian Nagel commented on NUTCH-2179:
----------------------------------------

+1: SolrIndexWriter should queue the deletions the same way as done for additions/updates.

Looks like the bulk commit by an UpdateRequest is already assumed because numDeletes is taken into account when checking whether the batchSize is reached (SolrIndexWriter, line 125: {{if (inputDocs.size() + numDeletes >= batchSize) {}})

> Cleanup job for SOLR Performance Boost
> --------------------------------------
>
>                 Key: NUTCH-2179
>                 URL: https://issues.apache.org/jira/browse/NUTCH-2179
>             Project: Nutch
>          Issue Type: Improvement
>          Components: indexer
>    Affects Versions: 1.9, 1.10, 1.11
>            Reporter: David Johnson
>            Priority: Minor
>              Labels: patch
>   Original Estimate: 2h
>  Remaining Estimate: 2h
>
> During a cleanup job, index deletes are scheduled one by one, which can make
> a large job take days

--
This message was sent by Atlassian JIRA
(v6.3.4#6332)
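
To illustrate the suggestion in the comment above, here is a minimal, self-contained Java sketch of queueing deletions alongside additions and flushing both once their combined count reaches the batch size. It is not the actual SolrIndexWriter code and does not use the SolrJ API: BulkIndexClient, sendBatch, and BatchingWriter are hypothetical names introduced only for this example. Only the combined-size check mirrors the condition quoted from line 125.

```java
import java.util.ArrayList;
import java.util.List;

/** Hypothetical stand-in for the index backend; NOT the SolrJ API. */
interface BulkIndexClient {
    /** Sends queued additions and deletions in one bulk request. */
    void sendBatch(List<String> addIds, List<String> deleteIds);
}

/** Sketch: deletions are queued like additions and flushed together. */
class BatchingWriter {
    private final BulkIndexClient client;
    private final int batchSize;
    private final List<String> pendingAdds = new ArrayList<>();
    private final List<String> pendingDeletes = new ArrayList<>();

    BatchingWriter(BulkIndexClient client, int batchSize) {
        this.client = client;
        this.batchSize = batchSize;
    }

    /** Queue a document for addition/update. */
    void write(String docId) {
        pendingAdds.add(docId);
        flushIfFull();
    }

    /** Queue a deletion instead of sending it immediately. */
    void delete(String docId) {
        pendingDeletes.add(docId);
        flushIfFull();
    }

    /** Same shape as the quoted check: adds plus deletes against batchSize. */
    private void flushIfFull() {
        if (pendingAdds.size() + pendingDeletes.size() >= batchSize) {
            flush();
        }
    }

    /** Send everything queued so far; call once more when the job ends. */
    void flush() {
        if (pendingAdds.isEmpty() && pendingDeletes.isEmpty()) {
            return; // nothing queued, so no request is made
        }
        // Pass copies so the client never sees the lists being cleared.
        client.sendBatch(new ArrayList<>(pendingAdds),
                         new ArrayList<>(pendingDeletes));
        pendingAdds.clear();
        pendingDeletes.clear();
    }
}
```

In this sketch, with a batch size of 3, two deletes followed by one add trigger a single bulk call, and a final flush() at the end of the job sends whatever is still queued.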