Chris M. Hostetter created SOLR-17430:
-----------------------------------------

             Summary: Redesign ExportWriter / ExportBuffers to work better with 
large batchSizes and slow consumption
                 Key: SOLR-17430
                 URL: https://issues.apache.org/jira/browse/SOLR-17430
             Project: Solr
          Issue Type: Improvement
      Security Level: Public (Default Security Level. Issues are Public)
            Reporter: Chris M. Hostetter


As mentioned in SOLR-17416, the design of the {{ExportBuffers}} class used by 
the {{ExportHandler}} is brittle, and the absolute time limit on how long the 
buffer-swapping threads will wait for each other isn't suitable for very 
long-running streaming expressions...
{quote}The problem, however, is that this 600 second timeout may not be enough 
to account for really slow downstream consumption of the data.  With really 
large collections, and really complicated streaming expressions, this can 
happen even with well-behaved clients that are actively trying to consume data.
{quote}
...but another sub-optimal aspect of this buffer-swapping design is that the 
"writer" thread is initially blocked completely, and can't write out a single 
document, until the "filler" thread has read a full {{batchSize}} of 
documents into its buffer and opted to swap.  Likewise, after buffer swapping 
has occurred at least once, every document in the {{outputBuffer}} that the 
writer has already processed hangs around, taking up RAM, until the next swap, 
while one of the threads sits idle.  Suppose {{batchSize=30000}}, and the 
"filler" thread is ready to go with a full {{fillBuffer}} while the "writer" 
has only been able to emit 29999 of the documents in its {{outputBuffer}} 
before being blocked and forced to wait (on the downstream consumer of the 
output bytes) to emit the last document in its batch.  In that situation both 
the "writer" thread and the "filler" thread are stalled, taking up 2x the 
batchSize of RAM, even though half of that is data that is no longer needed.

The bigger the {{batchSize}}, the worse the initial delay (and the 
steady-state wasted RAM) becomes.
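To make the stall concrete, here is a minimal sketch of the full-batch handoff pattern described above, built on {{java.util.concurrent.Exchanger}}.  This is NOT the actual {{ExportBuffers}} implementation (which uses its own barrier-based coordination), and the class and method names here are hypothetical; the point is only that the writer cannot emit even one document until the filler has accumulated an entire batch, and each side holds a full batch of RAM while waiting for the swap.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.concurrent.Exchanger;

// Minimal sketch (not the real ExportBuffers code) of a two-thread,
// full-batch buffer swap: the writer blocks until the filler has
// accumulated an entire batch, and vice versa.
public class BufferSwapSketch {

    // Hypothetical helper: stream `totalDocs` doc ids through a
    // filler/writer pair that swap full buffers of size `batchSize`.
    // (Assumes totalDocs is a multiple of batchSize, for brevity.)
    static List<Integer> runPipeline(int batchSize, int totalDocs) throws InterruptedException {
        Exchanger<List<Integer>> exchanger = new Exchanger<>();

        Thread filler = new Thread(() -> {
            try {
                List<Integer> fillBuffer = new ArrayList<>();
                for (int doc = 0; doc < totalDocs; doc++) {
                    fillBuffer.add(doc);
                    if (fillBuffer.size() == batchSize) {
                        // Hand the full buffer to the writer; block here
                        // until the writer is ready to swap.
                        fillBuffer = exchanger.exchange(fillBuffer);
                        // The already-written docs linger in RAM until now.
                        fillBuffer.clear();
                    }
                }
            } catch (InterruptedException e) {
                Thread.currentThread().interrupt();
            }
        });
        filler.start();

        List<Integer> written = new ArrayList<>();
        List<Integer> outputBuffer = new ArrayList<>();
        for (int batch = 0; batch < totalDocs / batchSize; batch++) {
            // The writer is stuck here until the filler has a FULL batch;
            // it cannot emit even one document early.
            outputBuffer = exchanger.exchange(outputBuffer);
            // A slow downstream consumer here stalls the filler's next swap too.
            written.addAll(outputBuffer);
        }
        filler.join();
        return written;
    }

    public static void main(String[] args) throws InterruptedException {
        System.out.println(runPipeline(3, 6));
    }
}
```

With a tiny {{batchSize=3}}, the writer's first document is delayed until the filler finishes all three; scale that to 30000 and the initial latency and the 2x-batch RAM footprint at each swap point follow directly.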


