spancer commented on issue #2208:
URL: https://github.com/apache/iceberg/issues/2208#issuecomment-908899467
> Currently, the Flink Iceberg sink can only write to one table with
> exactly-once semantics.
>
> If the table partition doesn't fit your requirements, maybe you can use
> filter or side output to split the stream into multiple sub-streams, then
> attach a separate Iceberg sink to each sub DataStream. This probably won't work
> if the number of tables/datasets is really high (like hundreds). In that case,
> maybe split the uber stream before the Iceberg ingestion jobs.
Hi Steven,
Using a side-output stream and attaching a separate sink to each sub-stream
doesn't solve the problem gracefully. It requires a fixed number of sub-streams,
or we have to bind each sub-stream to a pre-defined name; otherwise we can't
use them in the follow-up sinks, because the job graph has to be pre-defined
before the job is submitted. I think using two jobs might achieve this.
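For context, here is a minimal sketch (plain Java, no Flink dependency; the `Row` record, table names, and `route` helper are all hypothetical) of the constraint being discussed: routing to per-table outputs only works for a set of destinations fixed up front, analogous to how Flink `OutputTag`s must exist when the job graph is built, so records for tables not known at submit time can only land in a catch-all.

```java
import java.util.*;

public class SideOutputSketch {
    // Hypothetical record: a payload plus the name of its target table.
    record Row(String table, String payload) {}

    // Route rows to one bucket per pre-declared table; anything else
    // falls into a catch-all, mirroring a default/side-output stream.
    static Map<String, List<Row>> route(List<Row> input, Set<String> knownTables) {
        Map<String, List<Row>> outputs = new HashMap<>();
        for (String t : knownTables) outputs.put(t, new ArrayList<>());
        outputs.put("__default__", new ArrayList<>()); // catch-all bucket
        for (Row r : input) {
            // Unknown table names cannot get their own sink; they can
            // only go to the catch-all, since the "graph" is fixed.
            outputs.getOrDefault(r.table(), outputs.get("__default__")).add(r);
        }
        return outputs;
    }

    public static void main(String[] args) {
        Set<String> known = Set.of("db.t1", "db.t2"); // fixed at build time
        List<Row> rows = List.of(new Row("db.t1", "a"), new Row("db.t3", "b"));
        Map<String, List<Row>> out = route(rows, known);
        System.out.println(out.get("db.t1").size());       // row for db.t1
        System.out.println(out.get("__default__").size()); // db.t3 lands here
    }
}
```

The sketch shows why a dynamic table set is awkward with this approach: adding `db.t3` as a first-class destination would mean rebuilding and resubmitting the job with a new set of outputs.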
--
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
To unsubscribe, e-mail: [email protected]
For queries about this service, please contact Infrastructure at:
[email protected]