GitHub user wangyum opened a pull request:
https://github.com/apache/spark/pull/20248
[SPARK-23058][SQL] Fix non printable field delim issue
## What changes were proposed in this pull request?
Create a table with a non-printable field delimiter, for example:
```sql
CREATE EXTERNAL TABLE `t1`(`col1` bigint)
ROW FORMAT SERDE 'org.apache.hadoop.hive.serde2.lazy.LazySimpleSerDe'
WITH SERDEPROPERTIES (
'field.delim' = '\177',
'serialization.format' = '\003'
)
STORED AS
INPUTFORMAT 'org.apache.hadoop.mapred.SequenceFileInputFormat'
OUTPUTFORMAT 'org.apache.hadoop.hive.ql.io.HiveSequenceFileOutputFormat'
LOCATION 'file:/tmp/t1';
```
Running `SHOW CREATE TABLE t1` then returns:
```sql
CREATE EXTERNAL TABLE `t1`(`col1` bigint)
ROW FORMAT SERDE 'org.apache.hadoop.hive.serde2.lazy.LazySimpleSerDe'
WITH SERDEPROPERTIES (
'field.delim' = '',
'serialization.format' = ''
)
STORED AS
INPUTFORMAT 'org.apache.hadoop.mapred.SequenceFileInputFormat'
OUTPUTFORMAT 'org.apache.hadoop.hive.ql.io.HiveSequenceFileOutputFormat'
LOCATION 'file:/tmp/t1'
TBLPROPERTIES (
'transient_lastDdlTime' = '1515766958'
)
```
`'\177'` and `'\003'` are not shown correctly: both values appear as empty strings in the output. This PR fixes the issue.
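To illustrate the kind of escaping that would make such values round-trip through `SHOW CREATE TABLE`, here is a minimal sketch in Python. It is not the code in this PR: the helper name `escape_non_printable` is hypothetical, and treating only ASCII control characters (below 0x20, plus 0x7F) as non-printable is an assumption. It also leaves quotes and backslashes untouched, which a real implementation would need to handle.

```python
def escape_non_printable(value: str) -> str:
    """Render ASCII control characters as three-digit octal escapes.

    Hypothetical helper for illustration only; not taken from Spark.
    """
    out = []
    for ch in value:
        code = ord(ch)
        # Assumption: "non-printable" means C0 control characters and DEL.
        if code < 0x20 or code == 0x7F:
            out.append("\\%03o" % code)
        else:
            out.append(ch)
    return "".join(out)


# The two SerDe property values from the example table.
print(escape_non_printable("\x7f"))  # prints \177
print(escape_non_printable("\x03"))  # prints \003
```

With escaping along these lines, the generated DDL would contain `'field.delim' = '\177'` rather than a raw DEL character, so the statement could be copied and re-executed.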
## How was this patch tested?
Manual tests.
You can merge this pull request into a Git repository by running:
$ git pull https://github.com/wangyum/spark non-printable-field-delim
Alternatively you can review and apply these changes as the patch at:
https://github.com/apache/spark/pull/20248.patch
To close this pull request, make a commit to your master/trunk branch
with (at least) the following in the commit message:
This closes #20248
----
commit d44f242955503cf6195c5a47bbf631500406027d
Author: Yuming Wang <yumwang@...>
Date: 2018-01-12T14:28:22Z
Fix non printable field delim issue
----