zhangqs0205 opened a new issue, #10597:
URL: https://github.com/apache/seatunnel/issues/10597

   ### Search before asking
   
   - [x] I had searched in the 
[issues](https://github.com/apache/seatunnel/issues?q=is%3Aissue+label%3A%22bug%22)
 and found no similar issues.
   
   
   ### What happened
   
   I found that in `JdbcSource`, the `CatalogTable` obtained from JDBC metadata 
has empty `partitionKeys`.
   
     The scenario where I noticed this problem was metadata retrieval, but the 
issue itself is more general: `JdbcSource` does not preserve partition key 
metadata in the generated `CatalogTable`.
   
     For a partitioned table, the resulting metadata is still similar to:
   
     ```log
     CatalogTable{..., partitionKeys=[], ...}
     ```
   
     I checked the JDBC catalog-building code and found that `partitionKeys` is 
explicitly initialized as empty in the current implementation.
   
     In `AbstractJdbcCatalog#getTable`, the `CatalogTable` is created with empty 
partition keys:
   
     ```java
     return CatalogTable.of(
             tableIdentifier,
             tableSchemaBuilder.build(),
             buildConnectorOptions(tablePath),
             Collections.emptyList(),
             "",
             catalogName);
     ```
   
     In `CatalogUtils`, the `CatalogTable` built from query metadata also uses empty 
partition keys:
   
     ```java
     return CatalogTable.of(
             tableIdentifier,
             tableSchema,
             new HashMap<>(),
             new ArrayList<>(),
             "",
             catalogName);
     ```
   
     Because of this, `JdbcSource` loses partition metadata in the resulting 
`CatalogTable`.
   
   
   The scenario where I found this issue was metadata retrieval, but the 
problem itself is in JdbcSource metadata generation:
   
     - JdbcSource gets a CatalogTable
     - the partitionKeys in that CatalogTable are empty
     - partition metadata is lost on the JDBC source side
   
     Expected behavior:
   
     - for partitioned source tables, CatalogTable.partitionKeys should contain 
the actual partition key columns
   
     Actual behavior:
   
     - CatalogTable.partitionKeys is empty in JdbcSource
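
One possible direction for a fix on the Hive side, as a rough sketch only: Hive's `DESCRIBE` output for a partitioned table repeats the partition columns under a `# Partition Information` section, so those names could be collected and passed to `CatalogTable.of(...)` instead of an empty list. The class and method names below are hypothetical and not part of SeaTunnel, and the exact row layout may differ between Hive versions, so the parsing rules are assumptions.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

/**
 * Hypothetical helper (not part of SeaTunnel): extracts partition column
 * names from the rows returned by Hive's DESCRIBE statement.
 */
public class HivePartitionKeyParser {

    /**
     * @param describeRows rows of DESCRIBE output; element 0 of each row is col_name
     * @return partition column names in declaration order, or an empty list
     */
    public static List<String> parsePartitionKeys(List<String[]> describeRows) {
        List<String> partitionKeys = new ArrayList<>();
        boolean inPartitionSection = false;
        for (String[] row : describeRows) {
            String colName =
                    (row == null || row.length == 0 || row[0] == null) ? "" : row[0].trim();
            if (colName.startsWith("# Partition Information")) {
                inPartitionSection = true;
                continue;
            }
            if (!inPartitionSection || colName.isEmpty()) {
                continue; // regular column, or a blank separator row
            }
            if (colName.startsWith("# col_name")) {
                continue; // header row inside the partition section
            }
            if (colName.startsWith("#")) {
                break; // a following section, e.g. in DESCRIBE FORMATTED output
            }
            partitionKeys.add(colName);
        }
        return partitionKeys;
    }

    public static void main(String[] args) {
        // Assumed shape of DESCRIBE output for a table partitioned by dt.
        List<String[]> rows = Arrays.asList(
                new String[] {"id", "int", ""},
                new String[] {"name", "string", ""},
                new String[] {"dt", "string", ""},
                new String[] {"", null, null},
                new String[] {"# Partition Information", null, null},
                new String[] {"# col_name", "data_type", "comment"},
                new String[] {"", null, null},
                new String[] {"dt", "string", ""});
        System.out.println(parsePartitionKeys(rows)); // prints [dt]
    }
}
```

The resulting list could then take the place of `Collections.emptyList()` in the `CatalogTable.of(...)` call quoted earlier. Other databases would need their own lookup, since JDBC metadata has no standard way to expose partition columns.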
   
   ### SeaTunnel Version
   
   2.3.9
   
   ### SeaTunnel Config
   
   ```conf
   env {
     parallelism = 1
     job.mode = "BATCH"
   }
   
   source {
     Jdbc {
       url = "jdbc:hive2://localhost:10000/default"
       driver = "org.apache.hive.jdbc.HiveDriver"
       user = "test"
       password = "test"
       table_path = "default.partitioned_table"
     }
   }
   
   sink {
     Console {}
   }
   ```
   
   ### Running Command
   
   ```shell
   ./bin/seatunnel.sh --config config/jdbc_hive_test.conf
   ```
   
   ### Error Exception
   
   ```log
   No explicit exception is thrown.
   
     But the CatalogTable used by JdbcSource contains:
     CatalogTable{..., partitionKeys=[], ...}
   ```
   
   ### Zeta or Flink or Spark Version
   
   Flink 1.18
   
   ### Java or Scala Version
   
   Java 8
   
   ### Screenshots
   
   _No response_
   
   ### Are you willing to submit PR?
   
   - [x] Yes I am willing to submit a PR!
   
   ### Code of Conduct
   
   - [x] I agree to follow this project's [Code of 
Conduct](https://www.apache.org/foundation/policies/conduct)
   


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

