[ https://issues.apache.org/jira/browse/IMPALA-7056?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Alex Rodoni closed IMPALA-7056.
-------------------------------
       Resolution: Fixed
    Fix Version/s: Impala 3.1.0
                   Impala 2.13.0

> Changing Text Delimiter Does Not Work
> -------------------------------------
>
>                 Key: IMPALA-7056
>                 URL: https://issues.apache.org/jira/browse/IMPALA-7056
>             Project: IMPALA
>          Issue Type: Bug
>          Components: Catalog, Docs
>    Affects Versions: Impala 2.12.0
>            Reporter: Alan Jackoway
>            Assignee: Alex Rodoni
>            Priority: Major
>             Fix For: Impala 2.13.0, Impala 3.1.0
>
>
> The wording on
> https://impala.apache.org/docs/build/html/topics/impala_alter_table.html
> makes it seem like you can change the delimiter of text tables after they are
> created.
> I did the following to simulate a table that needed to switch between comma
> and pipe delimited:
> {code}
> hadoop fs -mkdir /user/alanj
> hadoop fs -mkdir /user/alanj/test_delim
> echo "A,B|C" > delim.txt
> hadoop fs -put delim.txt /user/alanj/test_delim
> {code}
> Then created in impala and tried to change delimiters:
> {code:sql}
> > create external table default.alanj_test_delim(A string, B string) ROW
> > FORMAT DELIMITED FIELDS TERMINATED BY "," LOCATION '/user/alanj/test_delim';
> > select * from default.alanj_test_delim;
> Query: select * from default.alanj_test_delim
> +---+-----+
> | a | b   |
> +---+-----+
> | A | B|C |
> +---+-----+
> > alter table default.alanj_test_delim set SERDEPROPERTIES
> > ('serialization.format'='|', 'field.delim'='|');
> > select * from default.alanj_test_delim;
> +---+-----+
> | a | b   |
> +---+-----+
> | A | B|C |
> +---+-----+
> > show create table default.alanj_test_delim;
> +----------------------------------------------------------------------------------------------------------------------+
> | result                                                                                                               |
> +----------------------------------------------------------------------------------------------------------------------+
> | CREATE EXTERNAL TABLE default.alanj_test_delim (                                                                     |
> | a STRING,                                                                                                            |
> | b STRING                                                                                                             |
> | )                                                                                                                    |
> | ROW FORMAT DELIMITED FIELDS TERMINATED BY '|'                                                                       |
> | WITH SERDEPROPERTIES ('field.delim'='|', 'serialization.format'='|')                                                 |
> | STORED AS TEXTFILE                                                                                                   |
> | LOCATION 'hdfs://namenode:8020/user/alanj/test_delim'                                                                |
> | TBLPROPERTIES ('COLUMN_STATS_ACCURATE'='false', 'numFiles'='0', 'numRows'='-1', 'rawDataSize'='-1', 'totalSize'='0') |
> +----------------------------------------------------------------------------------------------------------------------+
> {code}
> So it shows the right serdeproperties, but impala doesn't actually use them
> to read the data.
> If you then insert data (as the docs suggest), it writes that data with the
> new delimiter:
> {code:sql}
> > insert into default.alanj_test_delim values('D', 'E,F');
> > select * from alanj_test_delim;
> +-----+-----+
> | a   | b   |
> +-----+-----+
> | A,B | C   |
> | D   | E,F |
> +-----+-----+
> # hadoop fs -cat /user/alanj/test_delim/a54bb0ec14646492-a738811400000000_1498283208_data.0.
> D|E,F
> {code}



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)
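
For reference, the two different result sets in the report are the same test row, A,B|C, split on a different single character. Below is a minimal local sketch of that, assuming a POSIX shell with tr available. split_fields is a hypothetical helper written for this illustration, not an Impala or Hadoop command, and it does not model Impala's text scanner.

```shell
#!/bin/sh
# Hypothetical helper (not an Impala or Hadoop command): split one line on a
# single-character delimiter and print one field per line.
split_fields() {
    delim=$1
    line=$2
    # tr turns every occurrence of the delimiter into a newline.
    printf '%s\n' "$line" | tr "$delim" '\n'
}

# Row from delim.txt read with the original comma delimiter.
split_fields ',' 'A,B|C'

# Same row read with the pipe delimiter set by ALTER TABLE.
split_fields '|' 'A,B|C'
```

The first call yields the fields A and B|C, matching the SELECT output both before and immediately after the ALTER TABLE. The second yields A,B and C, matching the SELECT output after the INSERT, which is what a reader of the documentation would have expected right after the ALTER TABLE.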