Paul Rogers created IMPALA-8051:
-----------------------------------
Summary: Compute stats fails on a column with special character in name
Key: IMPALA-8051
URL: https://issues.apache.org/jira/browse/IMPALA-8051
Project: IMPALA
Issue Type: Bug
Components: Frontend
Affects Versions: Impala 3.1.0
Reporter: Paul Rogers
Assignee: Paul Rogers
Problem - A "compute stats" query executed on a table that has the special
character "#" in one of its column names fails with the following error:
WARNINGS: AnalysisException: Syntax error in line 1:
...length(cola)), NDV(colb#) AS colb#, CAST(-1 as BIG...
^
Encountered: Unexpected character
Expected: ADD, ALTER, AND, ARRAY, AS, ASC, BETWEEN, BIGINT, BINARY, BLOCK_SIZE,
BOOLEAN, CACHED, CASCADE, CHANGE, CHAR, COMMENT, COMPRESSION, CROSS, DATE,
DATETIME, DECIMAL, DEFAULT, DESC, DIV, REAL, DROP, ELSE, ENCODING, END, FLOAT,
FOLLOWING, FROM, FULL, GROUP, IGNORE, HAVING, ILIKE, IN, INNER, INTEGER,
IREGEXP, IS, JOIN, LEFT, LIKE, LIMIT, LOCATION, MAP, NOT, NULL, NULLS, OFFSET,
ON, OR, ORDER, PARTITION, PARTITIONED, PRECEDING, PRIMARY, PURGE, RANGE,
RECOVER, REGEXP, RENAME, REPLACE, RESTRICT, RIGHT, RLIKE, ROW, ROWS, SELECT,
SET, SMALLINT, SORT, STORED, STRAIGHT_JOIN, STRING, STRUCT, TABLESAMPLE,
TBLPROPERTIES, THEN, TIMESTAMP, TINYINT, TO, UNCACHED, UNION, USING, VALUES,
VARCHAR, WHEN, WHERE, WITH, COMMA, IDENTIFIER
Steps to reproduce the issue -
# From Hive, create a table with a special character in one of its column names.
For example:
{code:sql}
CREATE TABLE test_special_character (`id#` int);
{code}
# Execute "INVALIDATE METADATA test_special_character" from Impala.
# Execute "COMPUTE STATS test_special_character" from Impala. It fails with the
error shown above.
Impala does not allow creating tables whose column names contain special
characters, but Hive allows it when the name is escaped with backticks (``).
However, Impala can still load the metadata of such a table, and it can read
from the column when the name is escaped with backticks (``). For example, the
following query can be executed from Impala -
{code:sql}
select `id#` from test_special_character;
{code}
However, when the "compute stats" command is executed, the query it generates
to compute column-level stats does not wrap the column name in backticks,
because it does not know that the column name contains a special character.
This is the cause of the issue.
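The idea behind a fix can be sketched as follows. This is only an illustration in Python, not Impala's actual frontend code: the helper names quote_ident and column_stats_select are hypothetical, and the "safe identifier" pattern is an assumption that ignores reserved words and other lexer rules. It shows the generated select list quoting a column name only when the name is not a plain identifier.

```python
import re
from typing import List

# Assumption: a name made of letters, digits and underscores that starts with
# a letter is safe unquoted. Reserved words and embedded backticks are NOT
# handled here; a real implementation would have to cover them.
_SAFE_IDENT = re.compile(r"^[A-Za-z][A-Za-z0-9_]*$")


def quote_ident(name: str) -> str:
    """Return name unchanged if it is a plain identifier, else backtick-quoted."""
    if _SAFE_IDENT.match(name):
        return name
    return "`" + name + "`"


def column_stats_select(table: str, columns: List[str]) -> str:
    """Build a simplified column-stats query (hypothetical; NDV only)."""
    items = []
    for col in columns:
        quoted = quote_ident(col)
        # Quote the column both as the NDV argument and as the alias,
        # since the error above shows both positions were unquoted.
        items.append("NDV({0}) AS {0}".format(quoted))
    return "SELECT {} FROM {}".format(", ".join(items), quote_ident(table))


if __name__ == "__main__":
    print(column_stats_select("test_special_character", ["id#"]))
```

With the table from the reproduction steps, the sketch emits NDV(`id#`) AS `id#` instead of the unquoted form that triggers the syntax error.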
(Reported by a user.)