[ https://issues.apache.org/jira/browse/SPARK-11953?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]
Siva Gudavalli updated SPARK-11953:
-----------------------------------
Description:
DataFrameWriter is not honoring Append mode.
In Append mode, it checks whether the table exists, creates a new table if it
is not there, and then inserts the data.
If the table-exists check fails for whatever reason, it still assumes the
table doesn't exist and creates a new table.
As per my understanding, Append should perform no DDL. It is okay to throw an
error saying "table doesn't exist for inserts", but creating a table should not
be allowed in Append mode.
*I believe we should do two things here*
perform DDL => if and only if the table-exists check results in "success"
do not perform DDL => if SaveMode is Append
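
To make the two points above concrete, here is a minimal sketch of the decision logic I have in mind. It is not Spark code: Mode, Plan, and AppendSketch.plan are hypothetical stand-ins so the snippet runs without Spark, and it assumes the existence check is exposed as a Try[Boolean] so that "the check failed" can be told apart from "the table is confirmed absent".

```scala
import scala.util.{Try, Success, Failure}

// Hypothetical stand-ins for Spark's SaveMode values (not the real enum).
sealed trait Mode
case object Append extends Mode
case object Overwrite extends Mode
case object ErrorIfExists extends Mode
case object Ignore extends Mode

// Hypothetical description of what the writer does before/around the insert.
sealed trait Plan
case object InsertOnly extends Plan            // no DDL at all
case object CreateThenInsert extends Plan      // CREATE TABLE, then insert
case object DropCreateThenInsert extends Plan  // DROP, CREATE, then insert
case object Skip extends Plan                  // do nothing
final case class Fail(reason: String) extends Plan

object AppendSketch {
  // `exists` is a Failure when the check itself could not run, and
  // Success(false) only when the table is confirmed to be absent.
  def plan(mode: Mode, table: String, exists: Try[Boolean]): Plan = exists match {
    // Point 1: never issue DDL when the existence check did not succeed.
    case Failure(e) =>
      Fail(s"Could not determine whether table $table exists: ${e.getMessage}")
    case Success(true) => mode match {
      case Append        => InsertOnly
      case Overwrite     => DropCreateThenInsert
      case ErrorIfExists => Fail(s"Table $table already exists.")
      case Ignore        => Skip
    }
    case Success(false) => mode match {
      // Point 2: Append never creates a table; it reports the missing table.
      case Append => Fail(s"Table $table does not exist for inserts.")
      case _      => CreateThenInsert
    }
  }
}
```

With this shape, Append on a missing table yields an error instead of a CREATE TABLE, and a failed check never leads to DDL in any mode.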
was:
In Spark 1.3.1 we have 2 methods, i.e., CreateJdbcTable and InsertIntoJdbc.
They are replaced with write.jdbc() in Spark 1.4.1.
When we specify SaveMode.Append we are letting the application know that there
is a table in the database, which means "tableExists = true", and we do not
need to perform "JdbcUtils.tableExists(conn, table)".
Please let me know if you think differently.
Regards
Shiv
def jdbc(url: String, table: String, connectionProperties: Properties): Unit = {
  val conn = JdbcUtils.createConnection(url, connectionProperties)
  try {
    var tableExists = JdbcUtils.tableExists(conn, table)

    if (mode == SaveMode.Ignore && tableExists) {
      return
    }

    if (mode == SaveMode.ErrorIfExists && tableExists) {
      sys.error(s"Table $table already exists.")
    }

    if (mode == SaveMode.Overwrite && tableExists) {
      JdbcUtils.dropTable(conn, table)
      tableExists = false
    }

    // Create the table if the table didn't exist.
    if (!tableExists) {
      val schema = JDBCWriteDetails.schemaString(df, url)
      val sql = s"CREATE TABLE $table ($schema)"
      conn.prepareStatement(sql).executeUpdate()
    }
  } finally {
    conn.close()
  }

  JDBCWriteDetails.saveTable(df, url, table, connectionProperties)
}
> CLONE - Sparksql-1.4.1 DataFrameWrite.jdbc() SaveMode.Append Bug
> ----------------------------------------------------------------
>
> Key: SPARK-11953
> URL: https://issues.apache.org/jira/browse/SPARK-11953
> Project: Spark
> Issue Type: Bug
> Components: Java API, Spark Submit, SQL
> Affects Versions: 1.4.1, 1.5.1
> Environment: Spark stand alone cluster
> Reporter: Siva Gudavalli
>
> DataFrameWriter is not honoring Append mode.
> In Append mode, it checks whether the table exists, creates a new table if it
> is not there, and then inserts the data.
> If the table-exists check fails for whatever reason, it still assumes the
> table doesn't exist and creates a new table.
> As per my understanding, Append should perform no DDL. It is okay to throw an
> error saying "table doesn't exist for inserts", but creating a table should
> not be allowed in Append mode.
> *I believe we should do two things here*
> perform DDL => if and only if the table-exists check results in "success"
> do not perform DDL => if SaveMode is Append