[
https://issues.apache.org/jira/browse/IMPALA-14007?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17947976#comment-17947976
]
Fang-Yu Rao edited comment on IMPALA-14007 at 4/28/25 11:44 PM:
----------------------------------------------------------------
The reported behavior of Apache Calcite's validator seems to be expected.
Specifically, the exception was thrown from {{validateSelect()}} in
[SqlValidatorImpl.java|https://github.com/apache/calcite/blob/main/core/src/main/java/org/apache/calcite/sql/validate/SqlValidatorImpl.java].
For the given query "{{{}select test_db_01.test_tbl_01.id,
test_db_02.test_tbl_01.id from test_db_01.test_tbl_01,
test_db_02.test_tbl_01{}}}", the aliases of the two tables involved in the
query are both "{{{}test_tbl_01{}}}" and thus 'names' in the code snippet as
follows has 2 entries of "{{{}TEST_TBL_01{}}}", resulting in an validation
error as reported.
{code:java}
protected void validateSelect(
SqlSelect select,
RelDataType targetRowType) {
requireNonNull(targetRowType, "targetRowType");
...
// Make sure that items in FROM clause have distinct aliases.
final SelectScope fromScope = (SelectScope) getFromScope(select);
List<@Nullable String> names = fromScope.getChildNames();
if (!catalogReader.nameMatcher().isCaseSensitive()) {
//noinspection RedundantTypeArguments
names = names.stream()
.<@Nullable String>map(s -> s == null ? null :
s.toUpperCase(Locale.ROOT))
.collect(Collectors.toList());
}
final int duplicateAliasOrdinal = Util.firstDuplicate(names);
if (duplicateAliasOrdinal >= 0) {
final ScopeChild child =
fromScope.children.get(duplicateAliasOrdinal);
throw newValidationError(
requireNonNull(
child.namespace.getEnclosingNode(),
() -> "enclosingNode of namespace of " + child.name),
RESOURCE.fromAliasDuplicate(child.name));
}
}
{code}
This behavior seems to be expected according to the code comment.
{code:java}
// Make sure that items in FROM clause have distinct aliases.
...
{code}
If we manually rewrite the original query to the following, then the query
could be validated and executed correctly.
{code:java}
select t1.id, t2.id from test_db_01.test_tbl_01 t1, test_db_02.test_tbl_01 t2;
{code}
On a related note, we also verified the *Hive Server2*
({*}3.1.3000.7.3.1.0-160{*}) currently shipped with Apache Impala master does
not support the query where the table names are not re-written as above to
avoid ambiguity.
{code:java}
0: jdbc:hive2://localhost:11050/default> select test_db_01.test_tbl_01.id,
test_db_02.test_tbl_01.id from test_db_01.test_tbl_01, test_db_02.test_tbl_01;
Error: Error while compiling statement: FAILED: SemanticException [Error
10008]: Line 1:89 Ambiguous table alias 'test_tbl_01' (state=42000,code=10008)
{code}
was (Author: fangyurao):
The reported behavior of Apache Calcite's validator seems to be expected.
Specifically, the exception was thrown from {{validateSelect()}} in
[SqlValidatorImpl.java|https://github.com/apache/calcite/blob/main/core/src/main/java/org/apache/calcite/sql/validate/SqlValidatorImpl.java].
For the given query "{{{}select test_db_01.test_tbl_01.id,
test_db_02.test_tbl_01.id from test_db_01.test_tbl_01,
test_db_02.test_tbl_01{}}}", the aliases of the two tables involved in the
query are both "{{{}test_tbl_01{}}}" and thus 'names' has 2 entries of
"{{{}TEST_TBL_01{}}}", resulting in an validation error as reported.
{code:java}
protected void validateSelect(
SqlSelect select,
RelDataType targetRowType) {
requireNonNull(targetRowType, "targetRowType");
...
// Make sure that items in FROM clause have distinct aliases.
final SelectScope fromScope = (SelectScope) getFromScope(select);
List<@Nullable String> names = fromScope.getChildNames();
if (!catalogReader.nameMatcher().isCaseSensitive()) {
//noinspection RedundantTypeArguments
names = names.stream()
.<@Nullable String>map(s -> s == null ? null :
s.toUpperCase(Locale.ROOT))
.collect(Collectors.toList());
}
final int duplicateAliasOrdinal = Util.firstDuplicate(names);
if (duplicateAliasOrdinal >= 0) {
final ScopeChild child =
fromScope.children.get(duplicateAliasOrdinal);
throw newValidationError(
requireNonNull(
child.namespace.getEnclosingNode(),
() -> "enclosingNode of namespace of " + child.name),
RESOURCE.fromAliasDuplicate(child.name));
}
}
{code}
This behavior seems to be expected according to the code comment.
{code:java}
// Make sure that items in FROM clause have distinct aliases.
...
{code}
If we manually rewrite the original query to the following, then the query
could be validated and executed correctly.
{code:java}
select t1.id, t2.id from test_db_01.test_tbl_01 t1, test_db_02.test_tbl_01 t2;
{code}
On a related note, we also verified the *Hive Server2*
({*}3.1.3000.7.3.1.0-160{*}) currently shipped with Apache Impala master does
not support the query where the table names are not re-written as above to
avoid ambiguity.
{code:java}
0: jdbc:hive2://localhost:11050/default> select test_db_01.test_tbl_01.id,
test_db_02.test_tbl_01.id from test_db_01.test_tbl_01, test_db_02.test_tbl_01;
Error: Error while compiling statement: FAILED: SemanticException [Error
10008]: Line 1:89 Ambiguous table alias 'test_tbl_01' (state=42000,code=10008)
{code}
> Calcite planner could not correctly validate queries involving the same table
> name
> ----------------------------------------------------------------------------------
>
> Key: IMPALA-14007
> URL: https://issues.apache.org/jira/browse/IMPALA-14007
> Project: IMPALA
> Issue Type: Bug
> Reporter: Fang-Yu Rao
> Assignee: Fang-Yu Rao
> Priority: Major
>
> We found that the Calcite planner could not correctly validate queries
> involving the same table name for 2 tables in different databases, whereas
> this case could be supported by the classic Impala frontend. The issue could
> be reproduced as follows in impala-shell if we start the Impala cluster
> after "{{{}export USE_CALCITE_PLANNER=true{}}}" on the command line.
> - {{create database test_db_01;}}
> - {{create database test_db_02;}}
> - {{create table test_db_01.test_tbl_01 (id int);}}
> - {{create table test_db_02.test_tbl_01 (id int);}}
> - {{set use_calcite_planner=1;}}
> - {{select test_db_01.test_tbl_01.id, test_db_02.test_tbl_01.id from
> test_db_01.test_tbl_01, test_db_02.test_tbl_01;}}
>
> We will get the following exception after the SELECT query above.
> {code:java}
> [localhost:21050] default> select test_db_01.test_tbl_01.id,
> test_db_02.test_tbl_01.id from test_db_01.test_tbl_01, test_db_02.test_tbl_01;
> Query: select test_db_01.test_tbl_01.id, test_db_02.test_tbl_01.id from
> test_db_01.test_tbl_01, test_db_02.test_tbl_01
> Query submitted at: 2025-04-28 11:48:18 (Coordinator:
> http://fangyu-upstream-dev.gce.cloudera.com:25000)
> 2025-04-28 11:48:25 [Exception] ERROR: Query
> 2e4d548884a9a1aa:595f9b6d00000000 failed:
> AnalysisException: From line 1, column 90 to line 1, column 111: Duplicate
> relation name 'TEST_TBL_01' in FROM clause
> CAUSED BY: SqlValidatorException: Duplicate relation name 'TEST_TBL_01' in
> FROM clause
> {code}
--
This message was sent by Atlassian Jira
(v8.20.10#820010)
---------------------------------------------------------------------
To unsubscribe, e-mail: [email protected]
For additional commands, e-mail: [email protected]