[
https://issues.apache.org/jira/browse/DRILL-6616?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16559285#comment-16559285
]
ASF GitHub Bot commented on DRILL-6616:
---------------------------------------
sohami commented on a change in pull request #1401: DRILL-6616: Batch
Processing for Lateral/Unnest
URL: https://github.com/apache/drill/pull/1401#discussion_r205671776
##########
File path:
exec/java-exec/src/main/java/org/apache/drill/exec/physical/impl/unnest/UnnestRecordBatch.java
##########
@@ -103,16 +109,19 @@ public void update() {
     final MaterializedField field = incoming.getSchema().getColumn(typedFieldId.getFieldIds()[0]);
     // Get column size of unnest column.
+
     RecordBatchSizer.ColumnSize columnSize = getRecordBatchSizer().getColumn(field.getName());
+    RecordBatchSizer.ColumnSize rowIdColumnSize = getRecordBatchSizer().new ColumnSize(rowIdVector, null);
+
     // Average rowWidth of single element in the unnest list.
     // subtract the offset vector size from column data size.
     final int avgRowWidthSingleUnnestEntry = RecordBatchSizer
       .safeDivide(columnSize.getTotalNetSize() - (getOffsetVectorWidth() * columnSize.getValueCount()), columnSize
         .getElementCount());
     // Average rowWidth of outgoing batch.
-    final int avgOutgoingRowWidth = avgRowWidthSingleUnnestEntry;
+    final int avgOutgoingRowWidth = avgRowWidthSingleUnnestEntry + rowIdColumnSize.getDataSizePerEntry();
Review comment:
If this still uses RecordBatchSizer, then replace `getDataSizePerEntry()`
with `getStdDataSizePerEntry()`. The first one will return zero because the
vector is empty to start with.
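
The arithmetic behind this comment can be illustrated with a small standalone sketch. It is a toy model, not Drill's RecordBatchSizer: the class `UnnestWidthSketch`, its methods, the 4-byte row id width, and the sample numbers are all assumptions made for illustration. It only shows why a width measured from an empty vector contributes nothing to the outgoing row width, while a fixed per-type width does.

```java
/**
 * Toy model of the row-width estimate discussed in the review comment.
 * Hypothetical code: this is NOT Drill's RecordBatchSizer, only a sketch.
 */
final class UnnestWidthSketch {

    /** Assumed standard width of the row id column (a 4-byte integer). */
    static final int ASSUMED_STD_ROW_ID_WIDTH = 4;

    /** Division that yields 0 for a zero divisor; rounding up is a choice of this sketch. */
    static int safeDivide(long num, long denom) {
        if (denom == 0) {
            return 0;
        }
        return (int) Math.ceil((double) num / denom);
    }

    /** Average width of one unnested element, excluding the offset vector bytes. */
    static int avgEntryWidth(long totalNetSize, long valueCount,
                             int offsetVectorWidth, long elementCount) {
        return safeDivide(totalNetSize - offsetVectorWidth * valueCount, elementCount);
    }

    /** Width measured from the bytes actually stored; 0 when the vector is empty. */
    static int measuredWidthPerEntry(long dataBytes, long valueCount) {
        return safeDivide(dataBytes, valueCount);
    }

    public static void main(String[] args) {
        // Sample numbers (made up): 100 rows, 1000 list elements, 4400 bytes in total.
        int entryWidth = avgEntryWidth(4400, 100, 4, 1000);

        // The row id vector has not been filled yet, so it holds no data.
        int measured = measuredWidthPerEntry(0, 0);

        System.out.println("entry width             = " + entryWidth);
        System.out.println("outgoing width measured = " + (entryWidth + measured));
        System.out.println("outgoing width standard = "
            + (entryWidth + ASSUMED_STD_ROW_ID_WIDTH));
    }
}
```

With the measured width, the row id column adds nothing to the estimate, so the outgoing row width is the same as before the change in the diff. Using a type-based width accounts for the extra column even before any data has been written.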
> Batch Processing for Lateral/Unnest
> -----------------------------------
>
> Key: DRILL-6616
> URL: https://issues.apache.org/jira/browse/DRILL-6616
> Project: Apache Drill
> Issue Type: Improvement
> Components: Execution - Relational Operators
> Affects Versions: 1.14.0
> Reporter: Sorabh Hamirwasia
> Assignee: Sorabh Hamirwasia
> Priority: Major
> Fix For: 1.15.0
>
>
> Implement the execution-side and planner-side changes for the batch
> processing done by lateral and unnest. In the prototype, performance was
> much better than with the initial row-by-row execution.