[ 
https://issues.apache.org/jira/browse/IMPALA-14007?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17947976#comment-17947976
 ] 

Fang-Yu Rao edited comment on IMPALA-14007 at 4/28/25 11:46 PM:
----------------------------------------------------------------

The reported behavior of Apache Calcite's validator seems to be expected. 
Specifically, the exception was thrown from {{validateSelect()}} in 
[SqlValidatorImpl.java|https://github.com/apache/calcite/blob/main/core/src/main/java/org/apache/calcite/sql/validate/SqlValidatorImpl.java].
 For the given query "{{{}select test_db_01.test_tbl_01.id, 
test_db_02.test_tbl_01.id from test_db_01.test_tbl_01, 
test_db_02.test_tbl_01{}}}", the aliases of the two tables involved in the 
query are both "{{{}test_tbl_01{}}}" and thus 'names' in the code snippet as 
follows has 2 entries of "{{{}TEST_TBL_01{}}}", resulting in the validation 
error as reported.
{code:java}
  protected void validateSelect(
      SqlSelect select,
      RelDataType targetRowType) {
    requireNonNull(targetRowType, "targetRowType");
    ...
    // Make sure that items in FROM clause have distinct aliases.
    final SelectScope fromScope = (SelectScope) getFromScope(select);
    List<@Nullable String> names = fromScope.getChildNames();
    if (!catalogReader.nameMatcher().isCaseSensitive()) {
      //noinspection RedundantTypeArguments
      names = names.stream()
          .<@Nullable String>map(s -> s == null ? null : 
s.toUpperCase(Locale.ROOT))
          .collect(Collectors.toList());
    }
    final int duplicateAliasOrdinal = Util.firstDuplicate(names);
    if (duplicateAliasOrdinal >= 0) {
      final ScopeChild child =
          fromScope.children.get(duplicateAliasOrdinal);
      throw newValidationError(
          requireNonNull(
              child.namespace.getEnclosingNode(),
              () -> "enclosingNode of namespace of " + child.name),
          RESOURCE.fromAliasDuplicate(child.name));
    }
  }
{code}
 

This behavior seems to be expected according to the code comment.
{code:java}
    // Make sure that items in FROM clause have distinct aliases.
    ...
{code}
 

If we manually rewrite the original query to the following, then the query 
could be validated and executed correctly.
{code:java}
select t1.id, t2.id from test_db_01.test_tbl_01 t1, test_db_02.test_tbl_01 t2;
{code}
 

On a related note, we also verified the *Hive Server2* 
({*}3.1.3000.7.3.1.0-160{*}) currently shipped with Apache Impala master does 
NOT support the query where the table names are not re-written as above to 
avoid ambiguity.
{code:java}
0: jdbc:hive2://localhost:11050/default> select test_db_01.test_tbl_01.id, 
test_db_02.test_tbl_01.id from test_db_01.test_tbl_01, test_db_02.test_tbl_01;
Error: Error while compiling statement: FAILED: SemanticException [Error 
10008]: Line 1:89 Ambiguous table alias 'test_tbl_01' (state=42000,code=10008)
{code}


was (Author: fangyurao):
The reported behavior of Apache Calcite's validator seems to be expected. 
Specifically, the exception was thrown from {{validateSelect()}} in 
[SqlValidatorImpl.java|https://github.com/apache/calcite/blob/main/core/src/main/java/org/apache/calcite/sql/validate/SqlValidatorImpl.java].
 For the given query "{{{}select test_db_01.test_tbl_01.id, 
test_db_02.test_tbl_01.id from test_db_01.test_tbl_01, 
test_db_02.test_tbl_01{}}}", the aliases of the two tables involved in the 
query are both "{{{}test_tbl_01{}}}" and thus 'names' in the code snippet as 
follows has 2 entries of "{{{}TEST_TBL_01{}}}", resulting in the validation 
error as reported.
{code:java}
  protected void validateSelect(
      SqlSelect select,
      RelDataType targetRowType) {
    requireNonNull(targetRowType, "targetRowType");
    ...
    // Make sure that items in FROM clause have distinct aliases.
    final SelectScope fromScope = (SelectScope) getFromScope(select);
    List<@Nullable String> names = fromScope.getChildNames();
    if (!catalogReader.nameMatcher().isCaseSensitive()) {
      //noinspection RedundantTypeArguments
      names = names.stream()
          .<@Nullable String>map(s -> s == null ? null : 
s.toUpperCase(Locale.ROOT))
          .collect(Collectors.toList());
    }
    final int duplicateAliasOrdinal = Util.firstDuplicate(names);
    if (duplicateAliasOrdinal >= 0) {
      final ScopeChild child =
          fromScope.children.get(duplicateAliasOrdinal);
      throw newValidationError(
          requireNonNull(
              child.namespace.getEnclosingNode(),
              () -> "enclosingNode of namespace of " + child.name),
          RESOURCE.fromAliasDuplicate(child.name));
    }
  }
{code}
 

This behavior seems to be expected according to the code comment.
{code:java}
    // Make sure that items in FROM clause have distinct aliases.
    ...
{code}
 

If we manually rewrite the original query to the following, then the query 
could be validated and executed correctly.
{code:java}
select t1.id, t2.id from test_db_01.test_tbl_01 t1, test_db_02.test_tbl_01 t2;
{code}
 

On a related note, we also verified the *Hive Server2* 
({*}3.1.3000.7.3.1.0-160{*}) currently shipped with Apache Impala master does 
not support the query where the table names are not re-written as above to 
avoid ambiguity.
{code:java}
0: jdbc:hive2://localhost:11050/default> select test_db_01.test_tbl_01.id, 
test_db_02.test_tbl_01.id from test_db_01.test_tbl_01, test_db_02.test_tbl_01;
Error: Error while compiling statement: FAILED: SemanticException [Error 
10008]: Line 1:89 Ambiguous table alias 'test_tbl_01' (state=42000,code=10008)
{code}

> Calcite planner could not correctly validate queries involving the same table 
> name
> ----------------------------------------------------------------------------------
>
>                 Key: IMPALA-14007
>                 URL: https://issues.apache.org/jira/browse/IMPALA-14007
>             Project: IMPALA
>          Issue Type: Bug
>            Reporter: Fang-Yu Rao
>            Assignee: Fang-Yu Rao
>            Priority: Major
>
> We found that the Calcite planner could not correctly validate queries 
> involving the same table name for 2 tables in different databases, whereas 
> this case could be supported by the classic Impala frontend. The issue could 
> be reproduced as follows in impala-shell if  we start the Impala cluster 
> after "{{{}export USE_CALCITE_PLANNER=true{}}}" on the command line.
>  - {{create database test_db_01;}}
>  - {{create database test_db_02;}}
>  - {{create table test_db_01.test_tbl_01 (id int);}}
>  - {{create table test_db_02.test_tbl_01 (id int);}}
>  - {{set use_calcite_planner=1;}}
>  - {{select test_db_01.test_tbl_01.id, test_db_02.test_tbl_01.id from 
> test_db_01.test_tbl_01, test_db_02.test_tbl_01;}}
>  
> We will get the following exception after the SELECT query above.
> {code:java}
> [localhost:21050] default> select test_db_01.test_tbl_01.id, 
> test_db_02.test_tbl_01.id from test_db_01.test_tbl_01, test_db_02.test_tbl_01;
> Query: select test_db_01.test_tbl_01.id, test_db_02.test_tbl_01.id from 
> test_db_01.test_tbl_01, test_db_02.test_tbl_01
> Query submitted at: 2025-04-28 11:48:18 (Coordinator: 
> http://fangyu-upstream-dev.gce.cloudera.com:25000)
> 2025-04-28 11:48:25 [Exception]  ERROR: Query 
> 2e4d548884a9a1aa:595f9b6d00000000 failed:
> AnalysisException: From line 1, column 90 to line 1, column 111: Duplicate 
> relation name 'TEST_TBL_01' in FROM clause
> CAUSED BY: SqlValidatorException: Duplicate relation name 'TEST_TBL_01' in 
> FROM clause
> {code}



--
This message was sent by Atlassian Jira
(v8.20.10#820010)

---------------------------------------------------------------------
To unsubscribe, e-mail: [email protected]
For additional commands, e-mail: [email protected]

Reply via email to