Kousuke Saruta created SPARK-15165:
--------------------------------------

             Summary: Codegen can break because toCommentSafeString is not 
actually safe
                 Key: SPARK-15165
                 URL: https://issues.apache.org/jira/browse/SPARK-15165
             Project: Spark
          Issue Type: Bug
          Components: SQL
    Affects Versions: 2.0.0
            Reporter: Kousuke Saruta


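The toCommentSafeString method replaces "\u" with "\\u" so that Unicode 
escapes in a string literal cannot break the generated code.
However, if an even number of "\" characters precedes "u" in the string 
literal in the query, as in "\\u", the replacement leaves an odd number of 
"\" before "u", the escape is still interpreted, and codegen can break.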

The following code causes a compilation error.

{code}
val df = Seq(...).toDF
df.select("'\\\\\\\\u002A/'").show
{code}

The compilation error occurs because "\\\\\\\\u002A/" is translated 
into "*/", which ends the comment.
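
The translation described above can be reproduced outside Spark with a small 
sketch. The object and method names (CommentEscapeSketch, naiveCommentSafe, 
translateUnicodeEscapes) are hypothetical and are not Spark's actual source. 
naiveCommentSafe only mirrors the replacement described in this report, and 
translateUnicodeEscapes is a simplified model of how a Java compiler handles 
Unicode escapes, assuming a single "u" followed by exactly four hex digits.

```scala
object CommentEscapeSketch {
  // Hypothetical stand-in for the replacement described in this report:
  // every backslash-u pair gets one extra backslash in front of it.
  def naiveCommentSafe(s: String): String = s.replace("\\u", "\\\\u")

  // Simplified model (an assumption, not a full lexer) of Java's Unicode
  // escape translation: a backslash starts an escape only when it is
  // preceded by an even number of contiguous backslashes.
  def translateUnicodeEscapes(src: String): String = {
    val out = new StringBuilder
    var i = 0
    var run = 0 // contiguous backslashes immediately before position i
    while (i < src.length) {
      val c = src.charAt(i)
      val isEscape =
        c == '\\' && run % 2 == 0 && i + 6 <= src.length &&
          src.charAt(i + 1) == 'u' &&
          src.substring(i + 2, i + 6).forall(ch => Character.digit(ch, 16) >= 0)
      if (isEscape) {
        // Replace the six-character escape with the character it encodes.
        out.append(Integer.parseInt(src.substring(i + 2, i + 6), 16).toChar)
        i += 6
        run = 0
      } else {
        out.append(c)
        run = if (c == '\\') run + 1 else 0
        i += 1
      }
    }
    out.toString
  }
}
```

Under this model, a literal value with two backslashes before "u002A/" gains a 
third backslash from naiveCommentSafe, and translateUnicodeEscapes then turns 
the result into two backslashes followed by "*/". A value with a single 
backslash before "u002A/" ends up with two backslashes and is left untouched, 
which is the only case the replacement protects against.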

Because of this, arbitrary code can be injected, as follows.

{code}
val df = Seq(...).toDF
// Inject "System.exit(1)"
df.select("'\\\\\\\\u002A/{System.exit(1);}/*'").show
{code}



