Github user kiszk commented on the issue:
https://github.com/apache/spark/pull/13909
Yes, #15780 did not directly change the size of the generated code. However,
because it changes `nullable` in the DataFrame schema, the generated code
changes as a result, and so does its size.
The following example shows the difference. Without #15780, the generated code
contains if-statements for null checks; with #15780, it contains none.
Precise schema information can improve the quality of the generated code and
reduce its size.
With #15780
```java
protected void processNext() throws java.io.IOException {
  while (inputadapter_input.hasNext()) {
    InternalRow inputadapter_row = (InternalRow) inputadapter_input.next();
    int inputadapter_value = inputadapter_row.getInt(0);

    boolean project_isNull1 = false;

    int project_value1 = -1;
    project_value1 = inputadapter_value + 1;
    project_values[0] = project_value1;

    boolean project_isNull4 = false;

    int project_value4 = -1;
    project_value4 = inputadapter_value + 2;
    project_values[1] = project_value4;
    final ArrayData project_value = org.apache.spark.sql.catalyst.expressions.UnsafeArrayData.fromPrimitiveArray(project_values);
    project_holder.reset();

    // Remember the current cursor so that we can calculate how many bytes are
    // written later.
    final int project_tmpCursor = project_holder.cursor;
    ...
```
Without #15780
```java
protected void processNext() throws java.io.IOException {
  while (inputadapter_input.hasNext()) {
    InternalRow inputadapter_row = (InternalRow) inputadapter_input.next();
    int inputadapter_value = inputadapter_row.getInt(0);

    final boolean project_isNull = false;
    this.project_values = new Object[2];
    boolean project_isNull1 = true;
    int project_value1 = -1;

    if (!false) {
      project_isNull1 = false; // resultCode could change nullability.
      project_value1 = inputadapter_value + 1;

    }
    if (project_isNull1) {
      project_values[0] = null;
    } else {
      project_values[0] = project_value1;
    }

    boolean project_isNull4 = true;
    int project_value4 = -1;

    if (!false) {
      project_isNull4 = false; // resultCode could change nullability.
      project_value4 = inputadapter_value + 2;

    }
    if (project_isNull4) {
      project_values[1] = null;
    } else {
      project_values[1] = project_value4;
    }

    final ArrayData project_value = new org.apache.spark.sql.catalyst.util.GenericArrayData(project_values);
    this.project_values = null;
    project_holder.reset();

    project_rowWriter.zeroOutNullBytes();

    if (project_isNull) {
      project_rowWriter.setNullAt(0);
    } else {
      // Remember the current cursor so that we can calculate how many bytes are
      // written later.
      final int project_tmpCursor = project_holder.cursor;
      ...
```
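To make the mechanism concrete, here is a toy sketch of how a code generator can use a `nullable` flag to decide whether to emit a null check for an array element. This is not Spark's actual codegen: the class `NullableCodegenSketch` and the method `genElement` are hypothetical names made up for illustration, and the emitted text only loosely mimics the shape of the listings in this comment.

```java
class NullableCodegenSketch {
    // Hypothetical helper, not a Spark API: returns Java source text that
    // stores one int expression into slot `index` of the array `arrayName`.
    public static String genElement(String arrayName, int index,
                                    String valueExpr, boolean nullable) {
        String value = "value" + index;
        String slot = arrayName + "[" + index + "]";
        StringBuilder sb = new StringBuilder();
        sb.append("int ").append(value).append(" = ").append(valueExpr).append(";\n");
        if (nullable) {
            // The schema says the element may be null, so the generator has
            // to emit a flag and a branch. A real generator would compute the
            // flag from the child expression; this sketch hard-codes false.
            String isNull = "isNull" + index;
            sb.append("boolean ").append(isNull).append(" = false;\n");
            sb.append("if (").append(isNull).append(") {\n");
            sb.append("  ").append(slot).append(" = null;\n");
            sb.append("} else {\n");
            sb.append("  ").append(slot).append(" = ").append(value).append(";\n");
            sb.append("}\n");
        } else {
            // The schema guarantees a non-null element: a plain store suffices.
            sb.append(slot).append(" = ").append(value).append(";\n");
        }
        return sb.toString();
    }

    public static void main(String[] args) {
        String expr = "inputadapter_value + 1";
        System.out.println("// nullable = false");
        System.out.println(genElement("project_values", 0, expr, false));
        System.out.println("// nullable = true");
        System.out.println(genElement("project_values", 0, expr, true));
    }
}
```

The branch is emitted only when the schema cannot rule out nulls, so a more precise `nullable` directly shortens the generated text in this sketch.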