BentsiLeviav commented on code in PR #37611:
URL: https://github.com/apache/beam/pull/37611#discussion_r2813148816


##########
sdks/java/io/clickhouse/src/main/java/org/apache/beam/sdk/io/clickhouse/ClickHouseIO.java:
##########
@@ -491,57 +633,106 @@ private static String tuplePreprocessing(String payload) 
{
         String.join(",", l).trim().replaceAll("Tuple\\(", 
"Tuple('").replaceAll(",", ",'");
     return content;
   }
+
   /**
-   * Returns {@link TableSchema} for a given table.
+   * Returns {@link TableSchema} for a given table using JDBC URL format.
+   *
+   * <p><b>Deprecated:</b> Use {@link #getTableSchema(String, String, String, 
Properties)} instead
+   * with separate URL, database, table, and properties parameters.
+   *
+   * <p>This method parses the JDBC URL to extract connection details and 
properties. For new code,
+   * use the explicit parameter version for better clarity and control.
+   *
+   * <p>Example migration:
    *
-   * @param jdbcUrl jdbc connection url
+   * <pre>{@code
+   * // Old way (deprecated):
+   * TableSchema schema = ClickHouseIO.getTableSchema(
+   *     "jdbc:clickhouse://localhost:8123/mydb?user=admin", "my_table");
+   *
+   * // New way:
+   * Properties props = new Properties();
+   * props.setProperty("user", "admin");
+   * TableSchema schema = ClickHouseIO.getTableSchema(
+   *     "http://localhost:8123";, "mydb", "my_table", props);
+   * }</pre>
+   *
+   * @param jdbcUrl JDBC connection URL (e.g., 
jdbc:clickhouse://host:port/database?param=value)
    * @param table table name
    * @return table schema
+   * @deprecated Use {@link #getTableSchema(String, String, String, 
Properties)} with explicit
+   *     parameters
    */
+  @Deprecated
   public static TableSchema getTableSchema(String jdbcUrl, String table) {
-    List<TableSchema.Column> columns = new ArrayList<>();
-
-    try (ClickHouseConnection connection = new 
ClickHouseDataSource(jdbcUrl).getConnection();
-        Statement statement = connection.createStatement()) {
-
-      ResultSet rs = null; // try-finally is used because findbugs doesn't 
like try-with-resource
-      try {
-        rs = statement.executeQuery("DESCRIBE TABLE " + 
quoteIdentifier(table));
-
-        while (rs.next()) {
-          String name = rs.getString("name");
-          String type = rs.getString("type");
-          String defaultTypeStr = rs.getString("default_type");
-          String defaultExpression = rs.getString("default_expression");
+    ClickHouseJdbcUrlParser.ParsedJdbcUrl parsed = 
ClickHouseJdbcUrlParser.parse(jdbcUrl);
+    return getTableSchema(
+        parsed.getClickHouseUrl(), parsed.getDatabase(), table, 
parsed.getProperties());
+  }
 
-          ColumnType columnType = null;
-          if (type.toLowerCase().trim().startsWith("tuple(")) {
-            String content = tuplePreprocessing(type);
-            columnType = ColumnType.parse(content);
-          } else {
-            columnType = ColumnType.parse(type);
-          }
-          DefaultType defaultType = 
DefaultType.parse(defaultTypeStr).orElse(null);
+  /**
+   * Returns {@link TableSchema} for a given table using ClickHouse Java 
Client.
+   *
+   * @param clickHouseUrl ClickHouse connection url
+   * @param database ClickHouse database
+   * @param table table name
+   * @param properties connection properties
+   * @return table schema
+   * @since 2.72.0
+   */
+  public static TableSchema getTableSchema(
+      String clickHouseUrl, String database, String table, Properties 
properties) {
+    List<TableSchema.Column> columns = new ArrayList<>();
 
-          Object defaultValue;
-          if (DefaultType.DEFAULT.equals(defaultType)
-              && !Strings.isNullOrEmpty(defaultExpression)) {
-            defaultValue = ColumnType.parseDefaultExpression(columnType, 
defaultExpression);
-          } else {
-            defaultValue = null;
+    try {
+      String user = properties.getProperty("user", "default");
+      String password = properties.getProperty("password", "");
+
+      // Create ClickHouse Java Client
+      Client.Builder clientBuilder =
+          new Client.Builder()
+              .addEndpoint(clickHouseUrl)
+              .setUsername(user)
+              .setPassword(password)
+              .setDefaultDatabase(database)
+              .setClientName(
+                  String.format("Apache Beam/%s", 
ReleaseInfo.getReleaseInfo().getSdkVersion()));
+
+      try (Client client = clientBuilder.build()) {

Review Comment:
   Thanks for the suggestion! I agree that using the client's built-in 
`getTableSchema()` would be more elegant. However, this would entail:
   
   1. Mapping between `com.clickhouse.client.api.metadata.TableSchema` (Java 
Client v2) and Beam's `org.apache.beam.sdk.io.clickhouse.TableSchema` 
(different class hierarchies)
   2. Testing the conversion logic for all supported ClickHouse types
   3. Verifying consistency with the current implementation
   
   Since this PR is already on the JDBC → Java Client v2 migration, I think 
this schema refactoring is out of scope and deserves a separate PR to keep 
things focused and reviewable.
   
   I've filed #37613 to track this improvement and left a TODO comment in the 
code to refer to it. We can revisit this as a follow-up PR once this main 
migration is merged.
   
   Would you like me to add this to the current PR, or are you okay with 
handling this in a separate PR?



-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: [email protected]

For queries about this service, please contact Infrastructure at:
[email protected]

Reply via email to