Github user kiszk commented on a diff in the pull request:
https://github.com/apache/spark/pull/20850#discussion_r178498692
--- Diff: sql/catalyst/src/main/java/org/apache/spark/sql/catalyst/expressions/codegen/UnsafeRowWriter.java ---
@@ -20,49 +20,78 @@
import org.apache.spark.sql.catalyst.expressions.UnsafeRow;
import org.apache.spark.sql.types.Decimal;
import org.apache.spark.unsafe.Platform;
-import org.apache.spark.unsafe.array.ByteArrayMethods;
import org.apache.spark.unsafe.bitset.BitSetMethods;
-import org.apache.spark.unsafe.types.CalendarInterval;
-import org.apache.spark.unsafe.types.UTF8String;
/**
* A helper class to write data into a global row buffer using the `UnsafeRow` format.
*
* It will remember the offset of the row buffer where it starts to write, and move the cursor of
* the row buffer while writing. If new data (the input record if this is the outermost writer, or
* a nested struct if this is an inner writer) comes, the starting cursor of the row buffer may be
- * changed, so we need to call `UnsafeRowWriter.reset` before writing, to update the
+ * changed, so we need to call `UnsafeRowWriter.resetRowWriter` before writing, to update the
* `startingOffset` and clear out null bits.
*
* Note that if this is the outermost writer, which means we will always write from the very
* beginning of the global row buffer, we don't need to update `startingOffset` and can just call
* `zeroOutNullBytes` before writing new data.
+ *
+ * Generally we should call `UnsafeRowWriter.setTotalSize` to update the size of the result row
+ * after writing a record to the buffer. However, we can skip this step if the fields of the row
+ * are all fixed-length, as the size of the result row is also fixed.
--- End diff --
Got it. We will merge `setTotalSize` and `getRow` into `getRow`.
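To make the protocol in the doc comment concrete, here is a minimal, self-contained sketch. The class and method bodies below are hypothetical simplifications (a toy `ToyRowWriter` over a `ByteBuffer`, not the real Spark `UnsafeRowWriter`), showing the outermost-writer flow: `zeroOutNullBytes` before each record, fixed-length writes, and the total-size update folded into `getRow` as agreed above.

```java
import java.nio.ByteBuffer;

// Toy illustration of the writer protocol (hypothetical names, not Spark's API).
// An outermost writer always starts at offset 0, so it only needs
// zeroOutNullBytes() before each record; no startingOffset bookkeeping.
class ToyRowWriter {
    private final ByteBuffer buffer = ByteBuffer.allocate(64);
    private final int nullBitsetBytes = 8; // room for the null-tracking bits
    private int cursor;                    // next write position in the buffer
    private int totalSize;                 // filled in when getRow() is called

    // Clear the null bits and rewind the cursor before writing a new record.
    void zeroOutNullBytes() {
        for (int i = 0; i < nullBitsetBytes; i++) {
            buffer.put(i, (byte) 0);
        }
        cursor = nullBitsetBytes;
    }

    // Fixed-length write: advance the cursor by the field width.
    void writeLong(long v) {
        buffer.putLong(cursor, v);
        cursor += 8;
    }

    // getRow() finalizes the record: it updates the total size (the step this
    // review suggests merging into getRow) and returns the finished bytes.
    byte[] getRow() {
        totalSize = cursor; // all fields fixed-length, so this is deterministic
        byte[] row = new byte[totalSize];
        buffer.position(0);
        buffer.get(row, 0, totalSize);
        return row;
    }

    int totalSize() {
        return totalSize;
    }
}
```

With two 8-byte longs after an 8-byte null bitset, `getRow()` returns a 24-byte row; the caller never invokes a separate `setTotalSize`.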
---