[ https://issues.apache.org/jira/browse/SPARK-22963?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]
Jose Torres updated SPARK-22963:
--------------------------------
    Description: 
Spark native task restarts don't work well for continuous processing. They will process all data from the task's original start offset, even data which has already been committed. This is not semantically incorrect under at-least-once semantics, but it's awkward and bad.

Fortunately, they're also not necessary; the central coordinator can restart every task from the checkpointed offsets without losing much. So we should force

> Clean up continuous processing failure recovery
> -----------------------------------------------
>
>                 Key: SPARK-22963
>                 URL: https://issues.apache.org/jira/browse/SPARK-22963
>             Project: Spark
>          Issue Type: Improvement
>          Components: Structured Streaming
>    Affects Versions: 2.3.0
>            Reporter: Jose Torres
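
To make the difference described in the issue concrete, here is a minimal toy model in plain Python. It is not Spark code and does not use any Spark API: PartitionTask, native_restart, coordinator_restart, run_with_failure and replayed_committed are all hypothetical names invented for this sketch, and the single in-memory "log" of integers stands in for one source partition. It only illustrates why restarting from the task's original start offset replays committed data, while restarting from the checkpointed offset does not.

```python
class PartitionTask:
    """Toy stand-in for one long-running task reading one partition."""

    def __init__(self, start_offset):
        self.start_offset = start_offset  # offset the task was launched with
        self.position = start_offset      # next offset to read

    def process(self, log, max_records, sink):
        """Read up to max_records from the current position into the sink."""
        batch = log[self.position:self.position + max_records]
        sink.extend(batch)
        self.position += len(batch)


def native_restart(failed_task, committed_offset):
    # Models a task-level retry: the new attempt reuses the ORIGINAL start
    # offset and ignores anything committed since the task was launched.
    return PartitionTask(failed_task.start_offset)


def coordinator_restart(failed_task, committed_offset):
    # Models a coordinator-driven restart: the task is relaunched from the
    # last checkpointed (committed) offset.
    return PartitionTask(committed_offset)


def run_with_failure(restart, log, commit_after=6, fail_after=8):
    """Process some records, commit, process more, fail, restart, finish."""
    sink = []
    task = PartitionTask(0)
    task.process(log, commit_after, sink)
    committed_offset = task.position  # checkpoint taken here
    task.process(log, fail_after - commit_after, sink)  # uncommitted work
    # The task fails at this point and is replaced by a new one.
    task = restart(task, committed_offset)
    task.process(log, len(log), sink)
    return sink, committed_offset


def replayed_committed(sink, committed_offset):
    """Count how many already-committed records were written a second time.

    Assumes each record's value equals its offset, as in list(range(n)).
    """
    seen = sum(1 for record in sink if record < committed_offset)
    return seen - committed_offset
```

With a 10-record log (list(range(10))), a commit after 6 records and a failure after 8, the native-style restart writes 6 committed records a second time, while the coordinator-style restart only replays the 2 uncommitted ones. Both outcomes are valid under at-least-once semantics; the second simply avoids redoing committed work.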