[ https://issues.apache.org/jira/browse/SPARK-41395?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]
Bruce Robbins updated SPARK-41395:
----------------------------------
    Description: 
The following returns the wrong answer:
{noformat}
set spark.sql.codegen.wholeStage=false;
set spark.sql.codegen.factoryMode=NO_CODEGEN;

select max(col1), max(col2) from values
(cast(null as decimal(27,2)), cast(null as decimal(27,2))),
(cast(77.77 as decimal(27,2)), cast(245.00 as decimal(27,2)))
as data(col1, col2);

+---------+---------+
|max(col1)|max(col2)|
+---------+---------+
|null     |239.88   |
+---------+---------+
{noformat}
This is because {{InterpretedMutableProjection}} inappropriately uses {{InternalRow#setNullAt}} to set null for decimal types with precision > {{Decimal.MAX_LONG_DIGITS}}.

The path to corruption goes like this.

Unsafe buffer at start:
{noformat}
                          offset/len for    offset/len for
                          1st decimal       2nd decimal
offset: 0                 8                 16 (0x10)         24 (0x18)         32 (0x20)
data:   0300000000000000  0000000018000000  0000000028000000  0000000000000000  0000000000000000  0000000000000000  0000000000000000
{noformat}
When processing the first incoming row ([null, null]), {{InterpretedMutableProjection}} calls {{setNullAt}} for the decimal fields. As a result, the pointers to the storage areas for the two decimals in the variable-length region get zeroed out (the first snippet after the log excerpt below shows this step in isolation).

Buffer after projecting the first row (null, null):
{noformat}
                          offset/len for    offset/len for
                          1st decimal       2nd decimal
offset: 0                 8                 16 (0x10)         24 (0x18)         32 (0x20)
data:   0300000000000000  0000000000000000  0000000000000000  0000000000000000  0000000000000000  0000000000000000  0000000000000000
{noformat}
When it's time to project the second row into the buffer, {{UnsafeRow#setDecimal}} uses the zeroed-out offsets and therefore overwrites the null-tracking bit set with decimal data:
{noformat}
        null-tracking
        bit area
offset: 0                 8                 16 (0x10)         24 (0x18)         32 (0x20)
data:   5db4000000000000  0000000000000000  0200000000000000  0000000000000000  0000000000000000  0000000000000000  0000000000000000
{noformat}
The null-tracking bit set is overwritten with 239.88 (0x5db4) rather than 245.00 (0x5fb4) because {{setDecimal}} indirectly calls {{setNotNullAt(1)}}, which turns off the null-tracking bit associated with the field at index 1. In addition, the decimal at field index 0 now reads as null because of the corruption of the null-tracking bit set.

When a decimal type with precision > {{Decimal.MAX_LONG_DIGITS}} is null, {{InterpretedMutableProjection}} should write a null {{Decimal}} value rather than call {{setNullAt}} (a sketch of this approach appears after the log excerpt below).

This bug could get exercised during codegen fallback. Take for example this case where I forced codegen to fail for the {{Greatest}} expression:
{noformat}
spark-sql> select max(col1), max(col2) from values
(cast(null as decimal(27,2)), cast(null as decimal(27,2))),
(cast(77.77 as decimal(27,2)), cast(245.00 as decimal(27,2)))
as data(col1, col2);
22/12/05 08:18:54 ERROR CodeGenerator: failed to compile: org.codehaus.commons.compiler.CompileException: File 'generated.java', Line 58, Column 1: ';' expected instead of 'if'
org.codehaus.commons.compiler.CompileException: File 'generated.java', Line 58, Column 1: ';' expected instead of 'if'
	at org.codehaus.janino.TokenStreamImpl.compileException(TokenStreamImpl.java:362)
	at org.codehaus.janino.TokenStreamImpl.read(TokenStreamImpl.java:149)
	at org.codehaus.janino.Parser.read(Parser.java:3787)
...
22/12/05 08:18:56 WARN MutableProjection: Expr codegen error and falling back to interpreter mode
java.util.concurrent.ExecutionException: org.codehaus.commons.compiler.CompileException: File 'generated.java', Line 43, Column 1: failed to compile: org.codehaus.commons.compiler.CompileException: File 'generated.java', Line 43, Column 1: ';' expected instead of 'boolean'
	at com.google.common.util.concurrent.AbstractFuture$Sync.getValue(AbstractFuture.java:306)
	at com.google.common.util.concurrent.AbstractFuture$Sync.get(AbstractFuture.java:293)
	at org.apache.spark.sql.catalyst.expressions.codegen.CodeGenerator$$anon$1.load(CodeGenerator.scala:1583)
	at org.apache.spark.sql.catalyst.expressions.codegen.CodeGenerator$$anon$1.load(CodeGenerator.scala:1580)
	at com.google.common.cache.LocalCache$LoadingValueReference.loadFuture(LocalCache.java:3599)
	at com.google.common.cache.LocalCache$Segment.loadSync(LocalCache.java:2379)
	... 36 more
...
NULL	239.88	<== incorrect result, should be (77.77, 245.00)
Time taken: 6.132 seconds, Fetched 1 row(s)
spark-sql>
{noformat}
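For reference, the offset-zeroing step described above can be reproduced in isolation. The snippet below is not from the Spark code base; it is a small standalone sketch that assumes spark-catalyst is on the classpath and relies on internal APIs ({{UnsafeProjection}}, {{InternalRow}}) that are not stable:
{noformat}
import org.apache.spark.sql.catalyst.InternalRow
import org.apache.spark.sql.catalyst.expressions.UnsafeProjection
import org.apache.spark.sql.types._

// Build an UnsafeRow with two decimal(27,2) columns. Because the precision is greater
// than Decimal.MAX_LONG_DIGITS, each value lives in the variable-length region and the
// fixed-length slot holds an offset/length word pointing at it.
val schema = StructType(Seq(
  StructField("col1", DecimalType(27, 2)),
  StructField("col2", DecimalType(27, 2))))
val toUnsafe = UnsafeProjection.create(schema)
val row = toUnsafe(InternalRow(Decimal("77.77"), Decimal("245.00")))

println(row.getLong(0).toHexString)  // offset/length word for col1: non-zero
row.setNullAt(0)                     // what InterpretedMutableProjection does for a null value
println(row.getLong(0).toHexString)  // now 0: the pointer into the variable-length region is
                                     // gone, so a later setDecimal on this field writes at
                                     // offset 0, i.e. on top of the null-tracking bit set
{noformat}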
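And here is a minimal sketch of the suggested direction for a fix, using a hypothetical helper (the actual patch may look different): for wide decimals, write the null through {{InternalRow#setDecimal}}, which lets {{UnsafeRow}} keep the offset/length word, instead of calling {{setNullAt}}:
{noformat}
import org.apache.spark.sql.catalyst.InternalRow
import org.apache.spark.sql.types.{DataType, Decimal, DecimalType}

// Hypothetical helper illustrating the proposed behavior; not the actual patch.
def writeNull(target: InternalRow, ordinal: Int, dataType: DataType): Unit = dataType match {
  case dt: DecimalType if dt.precision > Decimal.MAX_LONG_DIGITS =>
    // UnsafeRow#setDecimal(ordinal, null, precision) marks the field null but preserves the
    // offset into the variable-length region, so later non-null writes stay in bounds.
    target.setDecimal(ordinal, null, dt.precision)
  case _ =>
    target.setNullAt(ordinal)
}
{noformat}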
> InterpretedMutableProjection can corrupt unsafe buffer when used with decimal data
> -----------------------------------------------------------------------------------
>
>                 Key: SPARK-41395
>                 URL: https://issues.apache.org/jira/browse/SPARK-41395
>             Project: Spark
>          Issue Type: Bug
>          Components: SQL
>    Affects Versions: 3.3.1, 3.2.3, 3.4.0
>            Reporter: Bruce Robbins
>            Priority: Major
>
--
This message was sent by Atlassian Jira
(v8.20.10#820010)

---------------------------------------------------------------------
To unsubscribe, e-mail: issues-unsubscr...@spark.apache.org
For additional commands, e-mail: issues-h...@spark.apache.org