[
https://issues.apache.org/jira/browse/DRILL-7801?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
]
Artem Trush updated DRILL-7801:
-------------------------------
Fix Version/s: 1.19.0
> Changing the scope of output_batch_size
> ---------------------------------------
>
> Key: DRILL-7801
> URL: https://issues.apache.org/jira/browse/DRILL-7801
> Project: Apache Drill
> Issue Type: Improvement
> Affects Versions: 1.14.0, 1.15.0, 1.16.0, 1.17.0
> Reporter: Artem Trush
> Assignee: Artem Trush
> Priority: Major
> Fix For: 1.19.0, 1.18.0
>
>
> {{*Drill.exec.memory.operator.output_batch_size*}} parameter caused problems
> with the execution speed of certain queries, in particular, it led to
> situations where the number of batch was equal to the number of records, such
> as 99890 batch and 99890 records.
> After comparing drill 1.13, where the query is executed in a few minutes, and
> 1.16, where the query is executed in a few hours, I came to the following
> conclusion.
> The problem is in the formation of the size of the record batch transmitted
> between operators.
> For example, lets take a look at *{{ProjectRecordBatch}}* .
> We have incoming batch that comes from another operator with 2000 records
> inside.
> *Drill 1.13*
> We have function *_doWork_*. There is simple logic inside. This function is
> calling every time when we have incoming batch in Project operator.
> In a few words outgoing batch size depends on just incoming batch size. And
> in most cases value of outgoing batch size equal to incoming batch size. So
> 2000 will come, 2000 will go.
> {code:java}
> final int outputRecords = projector.projectRecords(0, incomingRecordCount,
> 0);{code}
> As we can see outputRecords depends just on incomingRecords.
> *Drill 1.16*
> Now we have a memoryManager which takes as parameter our option
> outgoing_batch_size.
> Lets look at this function doWork again.
> Firstly, we got this
> {code:java}
> //calculate the output row count
> memoryManager.update();{code}
> Inside we have
> ( getOutputBatchSize() is our config and batchSizer.rowCount() is incoming
> batch size)
> {code:java}
> //if rowWidth is not zero, set the output row count in the sizer
> setOutputRowCount(getOutputBatchSize(), rowWidth);
> // if more rows can be allowed than the incoming row count, then set the
> // output row count to the incoming row count.
> outPutRowCount = Math.min(getOutputRowCount(), batchSizer.rowCount()); {code}
> Back to the function _*doWork*_.
> memoryManager.update() fills variable called maxOuputRecordCount. Here
> {code:java}
> int maxOuputRecordCount = memoryManager.getOutputRowCount();{code}
> And the main difference between 13 and 16 with using a new parameter
> {code:java}
> final int outputRecords = projector.projectRecords(this.incoming,0,
> maxOuputRecordCount, 0);
> {code}
> If maxOutputRecordCount smaller than incomingBatch size, then number of
> outputRecords will decrease and the number of batches will increase. So will
> come 2000, will go 600 600 600 .. or another value depends of
> output_batch_size.
> As you could see in both cases the number of output batches is always not
> bigger than number of incoming batches. And the same rule is following in
> every operator with memoryManager.
> This leads to a situation where the number of batches grows excessively. A
> simple solution to this problem is to increase the value for
> *{{drill.exec.memory.operator.output_batch_size}}* . However, because this
> option is set at the system level, changing it results in
> *{color:#FF0000}{{RESOURCE ERROR: One or more nodes ran out of
> memory}}{color}* in other queries.
> My suggestion is to change the scope of
> *{{drill.exec.memory.operator.output_batch_size}}* from system to system and
> session. Which will allow you to increase this option only for problematic
> requests, without affecting the work of all others. As for me I don't see any
> reason to prevent this change. If you have any information about the negative
> effects of changing the scope of this parameter, please share it.
--
This message was sent by Atlassian Jira
(v8.3.4#803005)