GitHub user dongjoon-hyun opened a pull request:
https://github.com/apache/spark/pull/19124
[SPARK-21912][SQL] Creating ORC datasource table should check invalid
column names
## What changes were proposed in this pull request?
Currently, creating an ORC data source table with invalid column names
aborts the job at write time. We should prevent this by raising
**AnalysisException** with a hint to use aliases instead, as Parquet data
source tables already do.
**BEFORE**
```scala
scala> sql("CREATE TABLE orc1 USING ORC AS SELECT 1 `a b`")
17/09/04 13:28:21 ERROR Utils: Aborting task
java.lang.IllegalArgumentException: Error: : expected at the position 8 of 'struct<a b:int>' but ' ' is found.
17/09/04 13:28:21 ERROR FileFormatWriter: Job job_20170904132821_0001 aborted.
17/09/04 13:28:21 ERROR Executor: Exception in task 0.0 in stage 1.0 (TID 1)
org.apache.spark.SparkException: Task failed while writing rows.
```
**AFTER**
```scala
scala> sql("CREATE TABLE orc1 USING ORC AS SELECT 1 `a b`")
17/09/04 13:27:40 ERROR CreateDataSourceTableAsSelectCommand: Failed to write to table orc1
org.apache.spark.sql.AnalysisException: Attribute name "a b" contains invalid character(s) among " ,;{}()\n\t=". Please use alias to rename it.;
```
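For illustration only, the kind of up-front check described above might look roughly like the following standalone sketch. It is not the code from this patch: the object name `ColumnNameCheck` and its methods are hypothetical, and `IllegalArgumentException` stands in for `AnalysisException` so that the snippet runs without Spark on the classpath. The character set is taken from the error message shown in the **AFTER** output.

```scala
// Hypothetical standalone sketch; not the implementation in this pull request.
object ColumnNameCheck {
  // Characters listed in the error message: space , ; { } ( ) newline tab =
  private val invalidChars: Set[Char] = " ,;{}()\n\t=".toSet

  // Returns the first column name that contains an invalid character, if any.
  def findInvalid(names: Seq[String]): Option[String] =
    names.find(_.exists(invalidChars.contains))

  // Fails before any write job is started. Spark would raise an
  // AnalysisException here; IllegalArgumentException is a stand-in (assumption).
  def checkColumnNames(names: Seq[String]): Unit =
    findInvalid(names).foreach { name =>
      throw new IllegalArgumentException(
        "Attribute name \"" + name + "\" contains invalid character(s) among " +
          "\" ,;{}()\\n\\t=\". Please use alias to rename it.")
    }
}
```

With such a check, a name like `a b` is rejected during analysis rather than inside a running task, and the user can work around it by aliasing the column to a name without those characters.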
## How was this patch tested?
Passes Jenkins with a new test case.
You can merge this pull request into a Git repository by running:
$ git pull https://github.com/dongjoon-hyun/spark SPARK-21912
Alternatively you can review and apply these changes as the patch at:
https://github.com/apache/spark/pull/19124.patch
To close this pull request, make a commit to your master/trunk branch
with (at least) the following in the commit message:
This closes #19124
----
commit 808dfe0fcd9de2f43b33f0d1d084172b5624f2a8
Author: Dongjoon Hyun <[email protected]>
Date: 2017-09-04T20:46:15Z
[SPARK-21912][SQL] Creating ORC datasource table should check invalid
column names
----