pvary opened a new pull request, #7571:
URL: https://github.com/apache/iceberg/pull/7571

   We found that if the current implementation of the IcebergSource hits a
downstream error, it silently retries the planning again and again for as long
as the error persists. The only effect on the job is that no new records are
emitted, and no alarms are raised.
   
   This is similar to how the Flink FileSource works, but it is confusing for
users.
   Other sources might handle errors differently. Kafka, for example, fails
immediately on an error. This is also not desirable for our user base because
we expect more resiliency from our jobs.
   
   Based on our discussion with @zhen-wu2 and @gyula-fora, I have created this 
PR which adds the possibility to retry the failed planning and to configure the 
number of retries in a few different ways:
   - `IcebergSource.Builder.planRetryNum` - if the source is created from Java
code
   - Using the Flink configuration key `connector.iceberg.plan-retry-num`
   - Through the read option `plan-retry-num`
   
   The default value is 3, which means that if the 4th planning attempt fails,
the Flink job fails.
   If the original behaviour is needed, the value `-1` should be used.
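
   The retry semantics described above can be illustrated with a minimal,
self-contained sketch. This is not the code from the PR: the class
`PlanRetrySketch`, the method `planWithRetry` and the constant `RETRY_FOREVER`
are hypothetical names, and the sketch assumes that a single planning call is
retried in a simple loop.

```java
import java.util.concurrent.Callable;

// Hypothetical illustration of the retry semantics; not the IcebergSource code.
class PlanRetrySketch {
    // Assumed sentinel mirroring the `-1` value: keep retrying without a limit.
    static final int RETRY_FOREVER = -1;

    // Calls the planner and retries on failure. With planRetryNum = 3 the
    // first three failures are retried and the fourth failure is rethrown.
    static <T> T planWithRetry(Callable<T> planner, int planRetryNum) throws Exception {
        int failures = 0;
        while (true) {
            try {
                return planner.call();
            } catch (Exception e) {
                failures++;
                if (planRetryNum != RETRY_FOREVER && failures > planRetryNum) {
                    throw e; // retries exhausted: surface the error to the caller
                }
                // otherwise fall through and try the planning again
            }
        }
    }

    public static void main(String[] args) throws Exception {
        int[] attempts = {0};
        // A stand-in planner that fails twice and then succeeds.
        String result = PlanRetrySketch.<String>planWithRetry(() -> {
            if (++attempts[0] < 3) {
                throw new IllegalStateException("downstream error");
            }
            return "planned";
        }, 3);
        System.out.println(result + " after " + attempts[0] + " attempts");
    }
}
```

   In this sketch a planner that always fails is invoked four times with the
default of 3 before the error propagates, while `-1` keeps the loop running
until planning succeeds, which corresponds to the original behaviour.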


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: [email protected]

For queries about this service, please contact Infrastructure at:
[email protected]

