Suresh Thalamati created SPARK-10849:
----------------------------------------
Summary: Allow user to specify database column type for data frame
fields when writing data to jdbc data sources.
Key: SPARK-10849
URL: https://issues.apache.org/jira/browse/SPARK-10849
Project: Spark
Issue Type: Improvement
Components: SQL
Affects Versions: 1.5.0
Reporter: Suresh Thalamati
Priority: Minor
Mapping data frame field types to database column types is addressed to a large
extent by adding dialects, and by the maxlength option added in SPARK-10101 to
set the VARCHAR length.
In some cases it is hard to determine the maximum supported VARCHAR size. For
example, on DB2 z/OS the VARCHAR size depends on the page size, and some
databases also have row-size limits that constrain VARCHAR. Defaulting all
String columns to CLOB will likely make reads and writes slow.
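For reference, the existing dialect mechanism mentioned above handles this per
database rather than per column. A minimal sketch of that approach (assuming a
hypothetical DB2 z/OS dialect; the URL prefix and CLOB mapping here are
illustrative, not a tested configuration):

{code}
import org.apache.spark.sql.jdbc.{JdbcDialect, JdbcDialects, JdbcType}
import org.apache.spark.sql.types.{DataType, StringType}

// Illustrative dialect: maps every StringType column to CLOB for
// matching JDBC URLs. This applies to ALL String columns, which is
// exactly the coarse granularity this issue proposes to improve on.
case object DB2zDialect extends JdbcDialect {
  override def canHandle(url: String): Boolean = url.startsWith("jdbc:db2")
  override def getJDBCType(dt: DataType): Option[JdbcType] = dt match {
    case StringType => Some(JdbcType("CLOB", java.sql.Types.CLOB))
    case _ => None // fall back to the default mapping
  }
}

JdbcDialects.registerDialect(DB2zDialect)
{code}

Because a dialect decision is keyed only on the Spark SQL DataType, it cannot
distinguish a short name column from a large report column; per-field metadata
would.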
Allowing users to specify the database type corresponding to a data frame field
would be useful when users want to fine-tune the mapping for one or two fields
and are fine with the defaults for all other fields.
I propose to make the following two properties available for users to set in
the data frame metadata when writing to JDBC data sources:
database.column.type -- the column type to use in the CREATE TABLE statement.
jdbc.column.type -- the JDBC type (java.sql.Types constant) to use when setting null values.
Example:
val secdf = sc.parallelize(Array(("Apple", "Revenue ..."),
  ("Google", "Income:123213"))).toDF("name", "report")
val metadataBuilder = new MetadataBuilder()
metadataBuilder.putString("database.column.type", "CLOB(100K)")
metadataBuilder.putLong("jdbc.column.type", java.sql.Types.CLOB)
val metadata = metadataBuilder.build()
val secReportDF = secdf.withColumn("report", col("report").as("report", metadata))
secReportDF.write.jdbc("jdbc:mysql://<URL>/secdata", "reports", mysqlProps)
--
This message was sent by Atlassian JIRA
(v6.3.4#6332)
---------------------------------------------------------------------
To unsubscribe, e-mail: [email protected]
For additional commands, e-mail: [email protected]