[ https://issues.apache.org/jira/browse/SPARK-4574?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14271984#comment-14271984 ]
Yin Huai commented on SPARK-4574:
---------------------------------

Let me try to summarize [~scwf]'s PR (with my updates). With this PR, users can define a table with a schema through ...
{code:sql}
CREATE TEMPORARY TABLE tableName [(columnName dataType, ...)]
USING ...
OPTIONS (...)
{code}
For example,
{code:sql}
CREATE TEMPORARY TABLE avroTable(a int, b string)
USING org.apache.spark.sql.avro
OPTIONS (path "../hive/src/test/resources/data/files/episodes.avro")
{code}

This PR introduces a new relation provider:
{code}
trait SchemaRelationProvider {
  def createRelation(
      sqlContext: SQLContext,
      parameters: Map[String, String],
      schema: StructType): BaseRelation
}
{code}
Through createRelation in the SchemaRelationProvider, users can specify the schema of the created table. A relation provider can inherit both RelationProvider and SchemaRelationProvider to support both schema inference and user-specified schemas.

The string representations of data types used in DDLs are summarized in the table below.
||String representation||Spark SQL data type||
|string|StringType|
|varchar|StringType|
|binary|BinaryType|
|boolean|BooleanType|
|tinyint|ByteType|
|smallint|ShortType|
|int|IntegerType|
|bigint|LongType|
|float|FloatType|
|double|DoubleType|
|decimal|DecimalType.Unlimited|
|decimal(precision, scale)|DecimalType with specified precision and scale|
|date|DateType|
|timestamp|TimestampType|
|array<elementType>|ArrayType|
|map<keyType, valueType>|MapType|
|struct<fieldName: fieldType, ...>|StructType|

> Adding support for defining schema in foreign DDL commands.
> -----------------------------------------------------------
>
>                 Key: SPARK-4574
>                 URL: https://issues.apache.org/jira/browse/SPARK-4574
>             Project: Spark
>          Issue Type: Sub-task
>          Components: SQL
>    Affects Versions: 1.1.0
>            Reporter: wangfei
>            Priority: Blocker
>
> Adding support for defining schema in foreign DDL commands.
> Currently, foreign DDL supports commands like:
> CREATE TEMPORARY TABLE avroTable
> USING org.apache.spark.sql.avro
> OPTIONS (path "../hive/src/test/resources/data/files/episodes.avro")
> Let users define the schema instead of inferring it from the file, so that we can
> support DDL commands such as:
> CREATE TEMPORARY TABLE avroTable(a int, b string)
> USING org.apache.spark.sql.avro
> OPTIONS (path "../hive/src/test/resources/data/files/episodes.avro")
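
To illustrate the point in the comment that a relation provider can inherit both RelationProvider and SchemaRelationProvider, here is a rough, self-contained sketch. It is not code from the PR. SQLContext, StructType, StructField and BaseRelation are simplified stand-ins defined locally so the snippet compiles without Spark, the two-argument RelationProvider signature is assumed, and ExampleSource, ExampleRelation, inferSchema and requirePath are hypothetical names.

```scala
// Stand-in types so the sketch compiles without Spark on the classpath.
// They are simplified placeholders, NOT the real Spark SQL classes.
final class SQLContext
final case class StructField(name: String, dataType: String)
final case class StructType(fields: Seq[StructField])
trait BaseRelation { def schema: StructType }

// Assumed shape of the existing provider: the schema is inferred by the source.
trait RelationProvider {
  def createRelation(
      sqlContext: SQLContext,
      parameters: Map[String, String]): BaseRelation
}

// The provider from the comment: the schema is supplied by the user.
trait SchemaRelationProvider {
  def createRelation(
      sqlContext: SQLContext,
      parameters: Map[String, String],
      schema: StructType): BaseRelation
}

// Hypothetical relation that only records its path and schema.
final case class ExampleRelation(path: String, schema: StructType) extends BaseRelation

// Hypothetical data source that mixes in both traits.
class ExampleSource extends RelationProvider with SchemaRelationProvider {

  // Placeholder inference: a real source would inspect the file at `path`.
  private def inferSchema(path: String): StructType =
    StructType(Seq(StructField("value", "string")))

  // Reads the required "path" option and fails if it is missing.
  private def requirePath(parameters: Map[String, String]): String =
    parameters.getOrElse("path", sys.error("Option 'path' is required."))

  // No column list in the DDL: fall back to schema inference.
  override def createRelation(
      sqlContext: SQLContext,
      parameters: Map[String, String]): BaseRelation = {
    val path = requirePath(parameters)
    ExampleRelation(path, inferSchema(path))
  }

  // Column list given in the DDL: use the user-specified schema as is.
  override def createRelation(
      sqlContext: SQLContext,
      parameters: Map[String, String],
      schema: StructType): BaseRelation =
    ExampleRelation(requirePath(parameters), schema)
}
```

With a source shaped like this, a CREATE TEMPORARY TABLE statement without a column list would presumably go through the two-argument createRelation, and one with a column list such as (a int, b string) would go through the three-argument overload.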