[ 
https://issues.apache.org/jira/browse/SPARK-15165?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Kousuke Saruta updated SPARK-15165:
-----------------------------------
    Description: 
The toCommentSafeString method replaces "\u" with "\ \u" to keep codegen from breaking.
But if an even number of "\" precedes "u", like "\ \u", in a string 
literal in the query, codegen can still break.

The following code causes a compilation error.

{code}
val df = Seq(...).toDF
df.select("'\\\\\\\\u002A/'").show
{code}

The compilation error occurs because "\\\\\\\\u002A/" is translated 
into "*/" (the end of a comment). 

Because of this, arbitrary code can be injected, as follows.

{code}
val df = Seq(...).toDF
// Inject "System.exit(1)"
df.select("'\\\\\\\\u002A/{System.exit(1);}/*'").show
{code}
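
The two cases above can be traced with a small simulation. The sketch below is not Spark's code: naiveCommentSafe is a hypothetical stand-in for the backslash-u replacement described above (it leaves out any other escaping the real method may do), and decodeUnicodeEscapes is a simplified model of the Unicode-escape pass that a Java compiler applies to source text, comments included, assuming a single "u" per escape. Strings are built from bs so that the example itself contains no literal escape sequences.

```scala
object CommentEscapeSketch {
  val bs = "\\" // a single backslash character

  // Hypothetical stand-in for the escaping described above: each
  // backslash-u pair gets one extra backslash in front of it.
  def naiveCommentSafe(s: String): String =
    s.replace(bs + "u", bs + bs + "u")

  // Simplified model of the Java Unicode-escape pass: a backslash starts
  // an escape only if it follows an even number of contiguous backslashes
  // and is itself followed by 'u' and four hex digits.
  def decodeUnicodeEscapes(src: String): String = {
    val out = new StringBuilder
    var i = 0
    var run = 0 // number of backslashes immediately before position i
    while (i < src.length) {
      val c = src.charAt(i)
      val isEscape =
        c == '\\' && run % 2 == 0 &&
          i + 6 <= src.length && src.charAt(i + 1) == 'u' &&
          src.substring(i + 2, i + 6).forall(ch => Character.digit(ch, 16) >= 0)
      if (isEscape) {
        out.append(Integer.parseInt(src.substring(i + 2, i + 6), 16).toChar)
        i += 6
        run = 0
      } else {
        out.append(c)
        run = if (c == '\\') run + 1 else 0
        i += 1
      }
    }
    out.toString
  }

  // Text as the compiler would see it after escaping and decoding.
  def asSeenByCompiler(literal: String): String =
    decodeUnicodeEscapes(naiveCommentSafe(literal))

  def main(args: Array[String]): Unit = {
    val odd = bs + "u002A/"      // one backslash before u
    val even = bs * 2 + "u002A/" // two backslashes before u
    println(asSeenByCompiler(odd).contains("*/"))  // false: escape is neutralised
    println(asSeenByCompiler(even).contains("*/")) // true: comment is closed early
    // The injection case, wrapped in a comment as generated code would do.
    println("/* " + asSeenByCompiler(even + "{System.exit(1);}/*") + " */")
  }
}
```

With one backslash before "u" the escape is neutralised. With two, the added backslash makes three: the first two pair up and the third starts a live escape, so the comment is closed early and the injected block lands outside it.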


  was:
The toCommentSafeString method replaces "\u" with "\\u" to keep codegen from breaking.
But if an even number of "\" precedes "u", like "\\u", in a string 
literal in the query, codegen can still break.

The following code causes a compilation error.

{code}
val df = Seq(...).toDF
df.select("'\\\\\\\\u002A/'").show
{code}

The compilation error occurs because "\\\\\\\\u002A/" is translated 
into "*/" (the end of a comment). 

Because of this, arbitrary code can be injected, as follows.

{code}
val df = Seq(...).toDF
// Inject "System.exit(1)"
df.select("'\\\\\\\\u002A/{System.exit(1);}/*'").show
{code}



> Codegen can break because toCommentSafeString is not actually safe
> ------------------------------------------------------------------
>
>                 Key: SPARK-15165
>                 URL: https://issues.apache.org/jira/browse/SPARK-15165
>             Project: Spark
>          Issue Type: Bug
>          Components: SQL
>    Affects Versions: 2.0.0
>            Reporter: Kousuke Saruta
>
> The toCommentSafeString method replaces "\u" with "\ \u" to keep codegen 
> from breaking.
> But if an even number of "\" precedes "u", like "\ \u", in a string 
> literal in the query, codegen can still break.
> The following code causes a compilation error.
> {code}
> val df = Seq(...).toDF
> df.select("'\\\\\\\\u002A/'").show
> {code}
> The compilation error occurs because "\\\\\\\\u002A/" is translated 
> into "*/" (the end of a comment). 
> Because of this, arbitrary code can be injected, as follows.
> {code}
> val df = Seq(...).toDF
> // Inject "System.exit(1)"
> df.select("'\\\\\\\\u002A/{System.exit(1);}/*'").show
> {code}



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)

---------------------------------------------------------------------
To unsubscribe, e-mail: issues-unsubscr...@spark.apache.org
For additional commands, e-mail: issues-h...@spark.apache.org
