TheNeuralBit commented on a change in pull request #12498:
URL: https://github.com/apache/beam/pull/12498#discussion_r467335905
##########
File path: sdks/python/apache_beam/io/jdbc.py
##########
@@ -168,41 +182,33 @@ def __init__(
:param expansion_service: The address (host:port) of the ExpansionService.
"""
- super(WriteToJdbc, self).__init__(
- self.URN,
- NamedTupleBasedPayloadBuilder(
- WriteToJdbcSchema(
- driver_class_name=driver_class_name,
- jdbc_url=jdbc_url,
- username=username,
- password=password,
- statement=statement,
- connection_properties=connection_properties,
- connection_init_sqls=connection_init_sqls,
+ super(WriteToJdbc, self).__init__(
+ self.URN,
+ NamedTupleBasedPayloadBuilder(
+ ReadFromWriteToJdbcSchema(
+ location=jdbc_url,
Review comment:
I think the better analog for "location" would actually be the table
name, and the JDBC URL would be part of the configuration. Unfortunately it
looks like the typical use-case (i.e. what we have in xlang_jdbcio_it_test)
just has the table name implicitly in the query:
https://github.com/apache/beam/blob/84cbd9bdfbe45221b8f3302cedad7bdb47cd2f5a/sdks/python/apache_beam/io/external/xlang_jdbcio_it_test.py#L151
and in the statement:
https://github.com/apache/beam/blob/84cbd9bdfbe45221b8f3302cedad7bdb47cd2f5a/sdks/python/apache_beam/io/external/xlang_jdbcio_it_test.py#L120
Maybe what we should do here is require a `table_name` parameter that we
will use for the location. Then, in the JdbcSchemaIOProvider, we can actually
generate the query and statement. The user would still have the ability to
override both query and statement if they prefer. WDYT?
----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
For queries about this service, please contact Infrastructure at:
[email protected]