GitHub user yhuai opened a pull request:
https://github.com/apache/spark/pull/12363
[SQL] Implement CREATE TABLE
Just want to try
https://github.com/apache/spark/commit/ab70cb751cce8ca2e0757b9fa523534207864328
You can merge this pull request into a Git repository by running:
$ git pull https://github.com/yhuai/spark createTable
Alternatively you can review and apply these changes as the patch at:
https://github.com/apache/spark/pull/12363.patch
To close this pull request, make a commit to your master/trunk branch
with (at least) the following in the commit message:
This closes #12363
----
commit 014c38e28e8f4545f926ef60ccb2ee4acae07b59
Author: Andrew Or <[email protected]>
Date: 2016-04-08T21:32:39Z
Parse various parts of the CREATE TABLE command
We need to reconcile the differences between what's added here in
SparkSqlParser and HiveSqlParser. That will come in the next
commit.
This currently still fails tests, obviously because create table
is not implemented yet!
commit 15bb3b6c76e61d708538bee5d797981689ab6a8f
Author: Andrew Or <[email protected]>
Date: 2016-04-01T21:20:37Z
Refactor CatalogTable column semantics
Before: CatalogTable has schema, partitionColumns and sortColumns.
There are no constraints among the three. However, Hive will
complain if schema and partitionColumns overlap.
After: CatalogTable has schema, partitionColumnNames,
sortColumnNames, bucketColumnNames and skewColumnNames. All the
columns must be a subset of schema. This means splitting up
schema into (schema, partitionCols) before passing it to Hive.
This allows us to store the columns more uniformly. Otherwise
partition columns would be the odd one out. This commit also
fixes "alter table bucketing", which was incorrectly using
partition columns as bucket columns.
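The column semantics described in this commit can be sketched roughly as follows. This is a simplified, hypothetical model whose names follow the commit message, not the exact Spark source (the real CatalogTable lives in org.apache.spark.sql.catalyst.catalog and has more fields):

```scala
// Simplified, hypothetical sketch of the refactored CatalogTable semantics.
case class CatalogColumn(name: String, dataType: String)

case class CatalogTable(
    schema: Seq[CatalogColumn],          // ALL columns, including partition columns
    partitionColumnNames: Seq[String],
    sortColumnNames: Seq[String],
    bucketColumnNames: Seq[String]) {

  private val allNames = schema.map(_.name).toSet
  // The new invariant: every special column list is a subset of the schema.
  require(partitionColumnNames.forall(allNames.contains),
    "partition columns must appear in schema")
  require(sortColumnNames.forall(allNames.contains),
    "sort columns must appear in schema")
  require(bucketColumnNames.forall(allNames.contains),
    "bucket columns must appear in schema")

  // Split the unified schema back into data / partition columns only at the
  // point where the table is handed to Hive, which stores them separately.
  def partitionSchema: Seq[CatalogColumn] =
    schema.filter(c => partitionColumnNames.contains(c.name))
  def dataSchema: Seq[CatalogColumn] =
    schema.filterNot(c => partitionColumnNames.contains(c.name))
}
```

Under this sketch, partition columns are stored uniformly with the rest of the schema, and the split happens only at the Hive boundary.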
commit b6b4d293c2efeb537110ef56fa9ffdcad90c9bb0
Author: Andrew Or <[email protected]>
Date: 2016-04-09T00:53:18Z
Implement CREATE TABLE in Hive parser
This involves reverting part of the changes in an earlier commit,
where we tried to implement the parsing logic in the general SQL
parser and introduced a bunch of case classes that we won't end
up using.
As of this commit the actual CREATE TABLE logic is not there yet.
It will come in a future commit.
commit 5e0fe03bfa655c6de854bc8adaa73186a17a0b0c
Author: Andrew Or <[email protected]>
Date: 2016-04-09T06:52:20Z
Implement it
commit f7501d9ebc5c4f08374788a937de6a56689258b8
Author: Andrew Or <[email protected]>
Date: 2016-04-09T07:00:30Z
Revert unnecessary changes (small)
commit 66970a89e6a5478773e76e7822a5945fa228b930
Author: Andrew Or <[email protected]>
Date: 2016-04-11T20:37:23Z
Merge branch 'master' of github.com:apache/spark into create-table-ddl
commit 3af954d355c3dc3c5fb982d3bcdf2a0a3e3c4580
Author: Andrew Or <[email protected]>
Date: 2016-04-11T22:37:00Z
Address comment
commit 2e95ecf790dc5d5b12b6ec72c0bd2b4bca99b17d
Author: Andrew Or <[email protected]>
Date: 2016-04-11T23:41:57Z
Add all the tests
commit c8edb75a2d9216c2bac5682a1733d678cfef4f62
Author: Andrew Or <[email protected]>
Date: 2016-04-12T17:48:26Z
Merge branch 'master' of github.com:apache/spark into create-table-ddl
commit 250f402372e9826865749f3b81cd96a7cdaff657
Author: Andrew Or <[email protected]>
Date: 2016-04-12T17:58:42Z
Not OK
commit efecac9b01b3ff8be296234392f4a6c922fa2d25
Author: Andrew Or <[email protected]>
Date: 2016-04-12T22:22:54Z
Fix part of InsertIntoHiveTableSuite
We weren't using the right default serde in Hive. Note that this
still fails a test with "Reference 'ds' is ambiguous ...", but
this error is common across many tests so it will be addressed
in a future commit.
commit 50a2054ec7a7276d45c1ab5adabd4550e00c7811
Author: Andrew Or <[email protected]>
Date: 2016-04-12T22:39:15Z
Fix ambiguous reference bug
In HiveMetastoreCatalog we already combined the schema and the
partition keys to compensate for the fact that Hive separates them.
Now this logic is pushed to the edges where Spark talks to Hive.
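A minimal, hypothetical illustration of the bug class being fixed here (the actual code path is in HiveMetastoreCatalog): once the stored schema already contains the partition columns, appending the partition keys again yields a duplicate column, which surfaces as an ambiguous reference.

```scala
// Hypothetical illustration: with unified CatalogTable semantics the stored
// schema already contains the partition column "ds", so combining it with
// the partition keys a second time duplicates "ds".
val schema        = Seq("key", "value", "ds") // unified schema, includes partition col
val partitionKeys = Seq("ds")

val combinedTwice = schema ++ partitionKeys   // Seq("key", "value", "ds", "ds")
val duplicated    = combinedTwice.diff(combinedTwice.distinct) // Seq("ds")
// Resolving the name "ds" against this list is ambiguous; keeping the schema
// unified internally and splitting it only at the Hive boundary avoids this.
```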
commit 8dc554a38c9989fc43b119645bfe5c8ceb7b6cdb
Author: Andrew Or <[email protected]>
Date: 2016-04-12T23:42:29Z
Fix ParquetMetastoreSuite
Previously we always converted the data type string to lower case.
However, for struct fields this also converts the struct field
names to lower case. This is not what tests (or perhaps user code)
expect.
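The failure mode described here can be illustrated with hypothetical strings (the real code operates on parsed DataType objects, not raw strings):

```scala
// Hypothetical illustration of the case-sensitivity bug described above.
// Lowercasing the entire type string normalizes the type keywords, but it
// also destroys the case of struct FIELD NAMES, which should be preserved.
val typeString = "STRUCT<UserId:INT,UserName:STRING>"

val naive = typeString.toLowerCase
// "struct<userid:int,username:string>" -- field names lost their case

// A safer sketch: lowercase only the type keywords, keeping field names
// intact, by splitting each "name:type" pair and lowering just the type.
val inner = typeString.stripPrefix("STRUCT<").stripSuffix(">")
val fixed = "struct<" + inner.split(",").map { field =>
  val Array(name, tpe) = field.split(":", 2)
  s"$name:${tpe.toLowerCase}"
}.mkString(",") + ">"
// "struct<UserId:int,UserName:string>"
```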
commit a4f67f2a53ecb63decd348ce57b22519e3cd78c0
Author: Andrew Or <[email protected]>
Date: 2016-04-12T23:53:01Z
Fix SQLQuerySuite
commit 045820cf8a5aaf74304aea763d804ddfe98d2806
Author: Andrew Or <[email protected]>
Date: 2016-04-13T01:01:48Z
Fix HiveCompatibilitySuite (ignored some tests)
commit 8e273fdc4f95d08cb6d09f4641472861587a3a01
Author: Andrew Or <[email protected]>
Date: 2016-04-13T01:05:02Z
Fix HiveDDLCommandSuite
commit 59edce332f87b07bdfb07e2e385431b2b123e1b0
Author: Andrew Or <[email protected]>
Date: 2016-04-13T06:12:10Z
Fix SQLQuerySuite CTAS
commit 7b1a1e381c97cbcb59fa2c36e15523273e6f7c28
Author: Andrew Or <[email protected]>
Date: 2016-04-13T06:26:54Z
Fix all but 1 ignored test in HiveCompatibilitySuite
There were a few differences in DESCRIBE TABLE:
- output format should be HiveIgnoreKeyTextOutputFormat
- num buckets should be -1
- last access time should be -1
- EXTERNAL should not be set to false for managed table
After making these changes our result now matches Hive's.
commit a60e66a71dec96973043ec62e2b6d4213c5add2c
Author: Andrew Or <[email protected]>
Date: 2016-04-13T06:31:31Z
Fix last ignored test in HiveCompatibilitySuite
CatalystSqlParser knows how to parse decimal(5)!
commit 02738fed54ba0eebc2c8d887430f0bef34213c68
Author: Andrew Or <[email protected]>
Date: 2016-04-13T06:32:02Z
Merge branch 'master' of github.com:apache/spark into create-table-ddl
commit ab70cb751cce8ca2e0757b9fa523534207864328
Author: Yin Huai <[email protected]>
Date: 2016-04-13T16:08:36Z
Preserve an existing behavior.
----