[
https://issues.apache.org/jira/browse/IMPALA-7056?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
]
Alex Rodoni closed IMPALA-7056.
-------------------------------
Resolution: Fixed
Fix Version/s: Impala 3.1.0, Impala 2.13.0
> Changing Text Delimiter Does Not Work
> -------------------------------------
>
> Key: IMPALA-7056
> URL: https://issues.apache.org/jira/browse/IMPALA-7056
> Project: IMPALA
> Issue Type: Bug
> Components: Catalog, Docs
> Affects Versions: Impala 2.12.0
> Reporter: Alan Jackoway
> Assignee: Alex Rodoni
> Priority: Major
> Fix For: Impala 2.13.0, Impala 3.1.0
>
>
> The wording on
> https://impala.apache.org/docs/build/html/topics/impala_alter_table.html
> makes it seem like you can change the delimiter of text tables after they are
> created.
> I did the following to simulate a table that needed to switch between comma
> and pipe delimited:
> {code}
> hadoop fs -mkdir /user/alanj
> hadoop fs -mkdir /user/alanj/test_delim
> echo "A,B|C" > delim.txt
> hadoop fs -put delim.txt /user/alanj/test_delim
> {code}
> Then I created the table in Impala and tried to change the delimiter:
> {code:sql}
> > create external table default.alanj_test_delim(A string, B string) ROW
> > FORMAT DELIMITED FIELDS TERMINATED BY "," LOCATION '/user/alanj/test_delim';
> > select * from default.alanj_test_delim;
> Query: select * from default.alanj_test_delim
> +---+-----+
> | a | b   |
> +---+-----+
> | A | B|C |
> +---+-----+
> > alter table default.alanj_test_delim set SERDEPROPERTIES
> > ('serialization.format'='|', 'field.delim'='|');
> > select * from default.alanj_test_delim;
> +---+-----+
> | a | b   |
> +---+-----+
> | A | B|C |
> +---+-----+
> > show create table default.alanj_test_delim;
> +----------------------------------------------------------------------------------------------------------------------+
> | result                                                                                                               |
> +----------------------------------------------------------------------------------------------------------------------+
> | CREATE EXTERNAL TABLE default.alanj_test_delim (                                                                     |
> |   a STRING,                                                                                                          |
> |   b STRING                                                                                                           |
> | )                                                                                                                    |
> | ROW FORMAT DELIMITED FIELDS TERMINATED BY '|'                                                                        |
> | WITH SERDEPROPERTIES ('field.delim'='|', 'serialization.format'='|')                                                 |
> | STORED AS TEXTFILE                                                                                                   |
> | LOCATION 'hdfs://namenode:8020/user/alanj/test_delim'                                                                |
> | TBLPROPERTIES ('COLUMN_STATS_ACCURATE'='false', 'numFiles'='0', 'numRows'='-1', 'rawDataSize'='-1', 'totalSize'='0') |
> +----------------------------------------------------------------------------------------------------------------------+
> {code}
> So SHOW CREATE TABLE reports the new SERDEPROPERTIES, but Impala doesn't actually use
> them to read the existing data.
> If you then insert data (as the docs suggest), Impala writes that data with the
> new delimiter, and the existing row is now also read with the new delimiter:
> {code:sql}
> > insert into default.alanj_test_delim values('D', 'E,F');
> > select * from alanj_test_delim;
> +-----+-----+
> | a   | b   |
> +-----+-----+
> | A,B | C   |
> | D   | E,F |
> +-----+-----+
> # hadoop fs -cat /user/alanj/test_delim/a54bb0ec14646492-a738811400000000_1498283208_data.0.
> D|E,F
> {code}
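
For reference, the query results quoted above match what plain single-character splitting gives for the two sample rows. The following is only a minimal shell sketch, not Impala code: split_fields is a hypothetical helper that splits one line with awk and ignores any escaping rules a real text scanner may apply.

```shell
#!/bin/sh
# split_fields is a hypothetical helper for illustration only; it is not an
# Impala or Hadoop tool. It splits one line on a single-character delimiter
# and prints each field wrapped in square brackets.
split_fields() {
    # $1 = delimiter (single character), $2 = line of text
    printf '%s\n' "$2" | awk -v d="$1" '{
        n = split($0, f, d)          # a single-character separator is taken literally
        out = ""
        for (i = 1; i <= n; i++) out = out "[" f[i] "]"
        print out
    }'
}

split_fields ',' 'A,B|C'   # original delimiter: prints [A][B|C]
split_fields '|' 'A,B|C'   # new delimiter:      prints [A,B][C]
split_fields '|' 'D|E,F'   # inserted row:       prints [D][E,F]
```

The first call corresponds to the SELECT output before the ALTER TABLE, and the last two correspond to the rows returned after the INSERT.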
--
This message was sent by Atlassian JIRA
(v7.6.3#76005)