[ 
https://issues.apache.org/jira/browse/SPARK-4574?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14271984#comment-14271984
 ] 

Yin Huai commented on SPARK-4574:
---------------------------------

Let me try to summarize [~scwf]'s PR (with my updates). 
With this PR, users can define a table with an explicit schema using the following syntax:
{code:sql}
CREATE TEMPORARY TABLE tableName [(columnName dataType, ...)]
USING ...
OPTIONS (...)
{code}
For example,
{code:sql}
CREATE TEMPORARY TABLE avroTable(a int, b string)
USING org.apache.spark.sql.avro
OPTIONS (path "../hive/src/test/resources/data/files/episodes.avro")
{code}
This PR introduces a new relation provider:
{code}
trait SchemaRelationProvider {
  def createRelation(
      sqlContext: SQLContext,
      parameters: Map[String, String],
      schema: StructType): BaseRelation
}
{code}

Through createRelation in SchemaRelationProvider, a data source receives the 
schema that the user specified for the created table. A relation provider can 
extend both RelationProvider and SchemaRelationProvider to support both schema 
inference and user-specified schemas.
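
To illustrate the last point, here is a minimal sketch of a provider that mixes in both traits. It uses small stand-in types so that it compiles without Spark on the classpath; SQLContext, StructType, StructField and BaseRelation below are simplified placeholders, not the real Spark classes. ExampleRelation, DefaultSource and inferSchema are hypothetical names, and the RelationProvider signature is assumed to be the two-argument form without a schema. Only the SchemaRelationProvider signature is taken from the PR.

```scala
// Stand-in types so the sketch is self-contained. They are NOT the real
// Spark classes; only the SchemaRelationProvider shape mirrors the PR.
object ProviderSketch {
  final class SQLContext
  final case class StructField(name: String, dataType: String)
  final case class StructType(fields: Seq[StructField])
  trait BaseRelation { def schema: StructType }

  // Assumed shape of the existing provider: no schema argument.
  trait RelationProvider {
    def createRelation(
        sqlContext: SQLContext,
        parameters: Map[String, String]): BaseRelation
  }

  // Shape of the new provider, as given in the PR summary.
  trait SchemaRelationProvider {
    def createRelation(
        sqlContext: SQLContext,
        parameters: Map[String, String],
        schema: StructType): BaseRelation
  }

  // Hypothetical relation that only remembers its path and schema.
  final case class ExampleRelation(path: String, schema: StructType)
    extends BaseRelation

  // Hypothetical provider that supports both code paths.
  class DefaultSource extends RelationProvider with SchemaRelationProvider {
    // Placeholder for real inference, which would inspect the file at `path`.
    private def inferSchema(path: String): StructType =
      StructType(Seq(StructField("inferred", "string")))

    // Used when the DDL has no column list: infer the schema.
    override def createRelation(
        sqlContext: SQLContext,
        parameters: Map[String, String]): BaseRelation = {
      val path = parameters("path")
      ExampleRelation(path, inferSchema(path))
    }

    // Used when the DDL has a column list: trust the user-specified schema.
    override def createRelation(
        sqlContext: SQLContext,
        parameters: Map[String, String],
        schema: StructType): BaseRelation =
      ExampleRelation(parameters("path"), schema)
  }
}
```

In this sketch the two overloads differ only in the schema argument, so one class can serve CREATE TEMPORARY TABLE statements with or without a column list.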

The string representations of data types used in DDLs are summarized in the 
table below.

||String representation||Spark SQL data type||
|string|StringType|
|varchar|StringType|
|binary|BinaryType|
|boolean|BooleanType|
|tinyint|ByteType|
|smallint|ShortType|
|int|IntegerType|
|bigint|LongType|
|float|FloatType|
|double|DoubleType|
|decimal|DecimalType.Unlimited|
|decimal(precision, scale)|DecimalType with specified precision and scale|
|date|DateType|
|timestamp|TimestampType|
|array<elementType>|ArrayType|
|map<keyType, valueType>|MapType|
|struct<fieldName: fieldType, ...>|StructType|

> Adding support for defining schema in foreign DDL commands.
> -----------------------------------------------------------
>
>                 Key: SPARK-4574
>                 URL: https://issues.apache.org/jira/browse/SPARK-4574
>             Project: Spark
>          Issue Type: Sub-task
>          Components: SQL
>    Affects Versions: 1.1.0
>            Reporter: wangfei
>            Priority: Blocker
>
> Adding support for defining schema in foreign DDL commands. Currently, foreign DDL 
> supports commands like:
>    CREATE TEMPORARY TABLE avroTable
>    USING org.apache.spark.sql.avro
>    OPTIONS (path "../hive/src/test/resources/data/files/episodes.avro")
> Let users define the schema instead of inferring it from the file, so that we can 
> support DDL commands as follows:
>    CREATE TEMPORARY TABLE avroTable(a int, b string)
>    USING org.apache.spark.sql.avro
>    OPTIONS (path "../hive/src/test/resources/data/files/episodes.avro")


