PostCommitTopology question

Péter Váry Tue, 27 Aug 2024 12:41:52 -0700

Hi everyone,

I am working on the Iceberg Connector's Table Maintenance function [1], and
we plan to utilize the SinkV2 SupportsPostCommitTopology (formerly known as
WithPostCommitTopology) to start the compaction after several commits.
Steven Zhen Wu and I have been debating [2] what is expected of the
topology added by the addPostCommitTopology method.


We see the following possibilities:
- The topology should end in another SinkV2 or a DummySink, to make sure
that this part of the DAG is executed and not optimized out
- The topology should end with an Operator whose output type is Void, since
nobody uses the output anyway
- The topology should end with an Operator with any output - for example
some "CompactionResult" which could be used for testing. In production it
will not be used, as there is no way to get the result stream from the Sink
(am I right here?).

What expectations or best practices do other implementations of
SupportsPostCommitTopology follow?

Thanks,
Peter

[1] - https://docs.google.com/document/d/16g3vR18mVBy8jbFaLjf2JwAANuYOmIwr15yDDxovdnA
[2] - https://github.com/apache/iceberg/pull/11010
