Rick Hillegas created SPARK-10977:
-------------------------------------

             Summary: SQL injection bugs in JdbcUtils and DataFrameWriter
                 Key: SPARK-10977
                 URL: https://issues.apache.org/jira/browse/SPARK-10977
             Project: Spark
          Issue Type: Bug
          Components: SQL
    Affects Versions: 1.5.1
            Reporter: Rick Hillegas
            Priority: Minor


SPARK-10857 identifies a SQL injection bug in the JDBC dialect code. A similar 
SQL injection bug can be found in 2 places in JdbcUtils and another place in 
DataFrameWriter:

{noformat}
The DROP TABLE logic in JdbcUtils concatenates boilerplate with a user-supplied 
string:

def dropTable(conn: Connection, table: String): Unit = {
    conn.prepareStatement(s"DROP TABLE $table").executeUpdate()
  }


Same for the INSERT logic in JdbcUtils:

def insertStatement(conn: Connection, table: String, rddSchema: StructType): 
PreparedStatement = {
    val sql = new StringBuilder(s"INSERT INTO $table VALUES (")
    var fieldsLeft = rddSchema.fields.length
    while (fieldsLeft > 0) {
      sql.append("?")
      if (fieldsLeft > 1) sql.append(", ") else sql.append(")")
      fieldsLeft = fieldsLeft - 1
    }
    conn.prepareStatement(sql.toString())
  }


Same for the CREATE TABLE logic in DataFrameWriter:

  def jdbc(url: String, table: String, connectionProperties: Properties): Unit 
= {

   ...
   
    if (!tableExists) {
        val schema = JdbcUtils.schemaString(df, url)
        val sql = s"CREATE TABLE $table ($schema)"
        conn.prepareStatement(sql).executeUpdate()
      }

   ...

  }
{noformat}

Maybe we can find a common solution to all of these SQL injection bugs. 
Something like this:

1) Parse the user-supplied table name into a table identifier and an optional 
schema identifier. We can borrow logic from org.apache.derby.iapi.util.IdUtil 
in order to do this.

2) Double-quote (and escape as necessary) the schema and table identifiers so 
that the database interprets them as delimited ids.

That should prevent the SQL injection attacks.

With this solution, if the user specifies table names like cityTable and 
trafficSchema.congestionTable, then the generated DROP TABLE statements would be

{noformat}
DROP TABLE "CITYTABLE"

DROP TABLE "TRAFFICSCHEMA"."CONGESTIONTABLE"
{noformat}




--
This message was sent by Atlassian JIRA
(v6.3.4#6332)

---------------------------------------------------------------------
To unsubscribe, e-mail: [email protected]
For additional commands, e-mail: [email protected]

Reply via email to