Hello,
Here is a case where you need to have a statement and a preparedStatementSetter.
    PCollection<Row> dataCollection = pipeline.apply(Create.of(data));

    // First write: withResults() exposes a signal that the rows were written.
    PCollection<Void> rowsWritten =
        dataCollection.apply(
            JdbcIO.<Row>write()
                .withDataSourceConfiguration(DATA_SOURCE_CONFIGURATION)
                .withBatchSize(10L)
                .withTable(firstTableName)
                .withResults());

    // Second write: starts only after the first write has finished for the window.
    dataCollection
        .apply(Wait.on(rowsWritten))
        .apply(
            JdbcIO.<Row>write()
                .withDataSourceConfiguration(DATA_SOURCE_CONFIGURATION)
                .withBatchSize(10L)
                .withTable(secondTableName));

    pipeline.run();
In this case, we write data to one table and then to the other, but only after
the window of data has been fully written to the first table. It is not
possible to do this with the existing JdbcIO.Write functionality.
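To make the workaround concrete, here is a minimal sketch of the kind of statement that currently has to be supplied by hand (together with a PreparedStatementSetter) when withResults() is used. It is plain Java with no Beam dependency; InsertStatementSketch and generateInsert are hypothetical names used only for illustration, and the exact SQL that JdbcIO generates internally may differ.

```java
import java.util.Arrays;
import java.util.Collections;
import java.util.List;

/** Hypothetical helper, not part of Beam: builds a parameterized INSERT from column names. */
final class InsertStatementSketch {

  /** Returns e.g. "INSERT INTO t(a, b) VALUES(?, ?)" for table "t" and columns [a, b]. */
  static String generateInsert(String table, List<String> columns) {
    if (columns.isEmpty()) {
      throw new IllegalArgumentException("at least one column is required");
    }
    // Column names in schema order, and one "?" placeholder per column.
    String names = String.join(", ", columns);
    String placeholders = String.join(", ", Collections.nCopies(columns.size(), "?"));
    return String.format("INSERT INTO %s(%s) VALUES(%s)", table, names, placeholders);
  }

  public static void main(String[] args) {
    // Assumed table and column names, for demonstration only.
    System.out.println(generateInsert("first_table", Arrays.asList("id", "name")));
  }
}
```

The resulting string is the kind of value one would pass to withStatement(...), with a matching PreparedStatementSetter that binds each Row field by position.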
Another option for this specific case could be extending the existing class
instead of adding a Schema API-specific class. We could add additional
conditions and move some functionality from Write to WriteVoid to infer the
Beam schema. What do you think about these options?
Schema providers are not very well documented in Beam, and they are a bit
confusing to us. We use Beam Row as a common abstraction in our pipelines,
which really meets our requirements. Looking through the Beam docs and code, we
saw SchemaProviders for some IOs. Those providers seem to be wrappers around
IOs that help work with schemas and convert data to Beam Rows. Could you please
clarify this a little? If we want to improve the Beam Schema API, what is the
architecturally right way to do that?
Thank you,
Raphael.
________________________________
From: Brian Hulette <[email protected]>
Sent: June 9, 2021, 19:12:41
To: dev
Cc: Reuven Lax; [email protected]; Ilya Kozyrev
Subject: [EXTERNAL] Re:
> And also the ticket and "// TODO: BEAM-10396 use writeRows() when it's
> available" appeared later than this functionality was added to "JdbcIO.Write".
Note that this TODO has been moved around through a few refactors. It was
initially added last summer [1].
You're right that JdbcIO.Write's statement generation functionality was added
about a year before that [2]. It's possible that the author of [1] didn't
realize [2] was done. Or maybe there's some reason why it doesn't work there?
+1 for Alexey's requests:
- Identify cases where statement generation in JdbcIO.Write is insufficient, if
they exist (e.g. can we just use it where that TODO is [3]? If not what goes
wrong?).
- Update documentation to avoid this confusion in the future.
Brian
[1] https://github.com/apache/beam/pull/12145
[2] https://github.com/apache/beam/pull/8962
[3] https://github.com/apache/beam/pull/14954#discussion_r648456230
On Wed, Jun 9, 2021 at 7:49 AM Alexey Romanenko <[email protected]> wrote:
Hello Raphael,
On 9 Jun 2021, at 09:31, Raphael Sanamyan <[email protected]> wrote:
The "JdbcIO.Write" allows you to write rows without a statement or statement
preparer, but not all functionality works without them.
Could you show a use case when the current functionality is not enough?
The method "withResults" requires a statement and statement preparer. Also, the
ticket (https://issues.apache.org/jira/browse/BEAM-10396) and the comment
"// TODO: BEAM-10396 use writeRows() when it's available"
(https://github.com/apache/beam/blob/master/sdks/java/io/jdbc/src/main/java/org/apache/beam/sdk/io/jdbc/JdbcSchemaIOProvider.java#L142)
appeared later than this functionality was added to "JdbcIO.Write". And from
the documentation alone, without reading the code, it's not clear that the
schema is enough.
Agreed, but the documentation can be updated. On the other hand, it would be
great to have some examples that show the need for WriteRows.
Thanks,
Alexey
Thank you,
Raphael.
________________________________
From: Pablo Estrada <[email protected]>
Sent: June 7, 2021, 22:43:24
To: dev; Reuven Lax
Cc: Ilya Kozyrev
Subject: Re:
+Reuven Lax do you know if this is already supported or not?
I have been able to use `JdbcIO.write()` without specifying a statement nor a
statement preparer. Is that not what's necessary? I've done this with a named
class with schemas (i.e. not Row) - is this perhaps the difference?
Best
-P.
On Fri, Jun 4, 2021 at 3:44 PM Robert Bradshaw <[email protected]> wrote:
That would be great! I don't know much about this particular issue,
but tips for getting started in general can be found at
https://beam.apache.org/contribute/
On Thu, Jun 3, 2021 at 10:55 AM Raphael Sanamyan <[email protected]> wrote:
>
> Hi, community,
>
> I would like to start working on the task BEAM-10396; I hope nobody minds.
> Also, if anyone has any details or developments on this task, I would be glad
> if you could share them.
>
> Thank you,
> Raphael.
>
>