[GitHub] [spark] bersprockets opened a new pull request, #36442: [SPARK-39093][SQL] Avoid codegen compilation error when dividing year-month intervals or day-time intervals by an integral

GitBox Tue, 03 May 2022 13:37:57 -0700


bersprockets opened a new pull request, #36442:
URL: https://github.com/apache/spark/pull/36442


   ### What changes were proposed in this pull request?
   
   In `DivideYMInterval#doGenCode` and `DivideDTInterval#doGenCode`, rely on 
the operand variable names provided by `nullSafeCodeGen` rather than calling 
`genCode` on the operands twice.
   
   
   ### Why are the changes needed?
   
   `DivideYMInterval#doGenCode` and `DivideDTInterval#doGenCode` call `genCode` 
on the operands twice (once directly, and once indirectly via 
`nullSafeCodeGen`). However, if you call `genCode` on an operand twice, you 
might not get back the same variable name for both calls (e.g., when the 
operand is not a `BoundReference` or if whole-stage codegen is turned off). 
When that happens, `nullSafeCodeGen` generates initialization code for one set 
of variables, but the divide expression generates usage code for another set of 
variables, resulting in compilation errors like this:
   ```
   spark-sql> create or replace temp view v1 as
            > select * FROM VALUES
            > (interval '10' months, interval '10' day, 2)
            > as v1(period, duration, num);
   Time taken: 2.81 seconds
   spark-sql> cache table v1;
   Time taken: 2.184 seconds
   spark-sql> select period/(num + 3) from v1;
   22/05/03 08:56:37 ERROR CodeGenerator: failed to compile: org.codehaus.commons.compiler.CompileException: File 'generated.java', Line 40, Column 44: Expression "project_value_2" is not an rvalue
   ...
   22/05/03 08:56:37 WARN UnsafeProjection: Expr codegen error and falling back to interpreter mode
   ...
   0-2
   Time taken: 0.149 seconds, Fetched 1 row(s)
   spark-sql> select duration/(num + 3) from v1;
   22/05/03 08:57:29 ERROR CodeGenerator: failed to compile: org.codehaus.commons.compiler.CompileException: File 'generated.java', Line 40, Column 54: Expression "project_value_2" is not an rvalue
   ...
   22/05/03 08:57:29 WARN UnsafeProjection: Expr codegen error and falling back to interpreter mode
   ...
   2 00:00:00.000000000
   Time taken: 0.089 seconds, Fetched 1 row(s)
   ```
   The error is not fatal (unless you have `spark.sql.codegen.fallback` set to 
`false`), but it muddies the log and can slow the query (since the expression 
is interpreted).
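   
   To make the mismatch concrete, here is a minimal, self-contained sketch of the failure mode. It is a toy model, not Spark's code: `Ctx`, `ExprCode`, `ComputedOperand`, `nullSafe`, `buggyDivide`, and `fixedDivide` are hypothetical stand-ins that only mimic the shape of the codegen context, `genCode`, and `nullSafeCodeGen`.
   
   ```scala
   // Toy model only: none of these types are Spark's real classes.
   object DoubleGenCodeSketch {
     // Result of generating code for one operand: the declaration code plus
     // the name of the variable that holds the value.
     final case class ExprCode(code: String, value: String)

     // Hands out a fresh variable name on every request.
     final class Ctx {
       private var counter = 0
       def freshName(prefix: String): String = {
         val name = s"${prefix}_$counter"
         counter += 1
         name
       }
     }

     // An operand that must be computed (think `num + 3`), so every genCode
     // call declares a brand-new variable.
     final class ComputedOperand(javaExpr: String) {
       def genCode(ctx: Ctx): ExprCode = {
         val v = ctx.freshName("value")
         ExprCode(s"int $v = $javaExpr;", v)
       }
     }

     // Stand-in for nullSafeCodeGen: generates each operand once, emits the
     // declarations, and hands the resulting variable names to `f`.
     def nullSafe(ctx: Ctx, left: ComputedOperand, right: ComputedOperand)(
         f: (String, String) => String): String = {
       val l = left.genCode(ctx)
       val r = right.genCode(ctx)
       s"${l.code}\n${r.code}\n${f(l.value, r.value)}"
     }

     // Buggy shape: calls genCode a second time and ignores the divisor name
     // passed to the callback, so the usage refers to an undeclared variable.
     def buggyDivide(ctx: Ctx, left: ComputedOperand, right: ComputedOperand): String = {
       val again = right.genCode(ctx)
       nullSafe(ctx, left, right)((l, _) => s"int result = $l / ${again.value};")
     }

     // Fixed shape: uses only the names provided by the callback.
     def fixedDivide(ctx: Ctx, left: ComputedOperand, right: ComputedOperand): String =
       nullSafe(ctx, left, right)((l, r) => s"int result = $l / $r;")

     def main(args: Array[String]): Unit = {
       val left = new ComputedOperand("months")
       val right = new ComputedOperand("num + 3")
       println(buggyDivide(new Ctx, left, right))
       println("----")
       println(fixedDivide(new Ctx, left, right))
     }
   }
   ```
   
   In the buggy output the division reads `value_0`, a name that is never declared in the emitted code, which is the same kind of mismatch the compiler reports above. The fixed variant declares and uses the same two names.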
   
   ### Does this PR introduce _any_ user-facing change?
   
   No.
   
   ### How was this patch tested?
   
   New unit tests (unit tests run with `spark.sql.codegen.fallback` set to 
`false`, so the new tests fail without the fix).
   


