[jira] [Commented] (FLINK-10573) Support task revocation

2018-10-22 Thread zhijiang (JIRA)


[ 
https://issues.apache.org/jira/browse/FLINK-10573?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16660123#comment-16660123
 ] 

zhijiang commented on FLINK-10573:
--

I have not started on this implementation yet. If your Jira relies on 
{{DataConsumptionException}}, you can assign my Jira to yourself and implement 
it if you like. Or, if it is not blocking you, you could wait for me to submit 
the PR. I think I can do that next month.

> Support task revocation
> ---
>
> Key: FLINK-10573
> URL: https://issues.apache.org/jira/browse/FLINK-10573
> Project: Flink
> Issue Type: Sub-task
> Components: JobManager
> Reporter: JIN SUN
> Assignee: JIN SUN
> Priority: Major
> Fix For: 1.7.0
>
>
> In batch mode, when a downstream task fails because a partition is missing, 
> it indicates that the output of the upstream task has been lost. For the job 
> to succeed, we need to rerun the upstream task to reproduce the data, which 
> we call task revocation (revoking the success of the upstream task).
> For revocation, we need to identify the missing-partition issue, and it is 
> better to detect the missing partition accurately:
>  * Ideally, we would get a specific exception indicating that the data 
> source is missing, which makes things much easier.
>  * When a task gets an IOException, it doesn't necessarily mean the source 
> data has issues. It might also be related to the target task, for example 
> the target task having network issues.
>  * If multiple tasks cannot read the same source, it is highly likely that 
> the source data is missing.
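
The three detection rules in the description above can be sketched roughly as follows. This is only an illustration under stated assumptions, not Flink code: `RevocationSketch`, `PartitionFailureTracker`, `shouldRevoke` and the consumer threshold are hypothetical names, and the nested `DataConsumptionException` is a local stand-in for the exception proposed in FLINK-6227.

```java
import java.io.IOException;
import java.util.HashMap;
import java.util.HashSet;
import java.util.Map;
import java.util.Set;

public class RevocationSketch {

    /** Local stand-in for the exception proposed in FLINK-6227 (not Flink's class). */
    static class DataConsumptionException extends IOException {
        DataConsumptionException(String message) {
            super(message);
        }
    }

    /** Hypothetical helper deciding when an upstream task's success should be revoked. */
    static class PartitionFailureTracker {
        // Assumed threshold: how many distinct consumers must fail on one partition.
        private final int minDistinctConsumers;
        // Partition id -> ids of consumers that hit a generic IOException on it.
        private final Map<String, Set<String>> failedConsumers = new HashMap<>();

        PartitionFailureTracker(int minDistinctConsumers) {
            this.minDistinctConsumers = minDistinctConsumers;
        }

        boolean shouldRevoke(String partitionId, String consumerId, Throwable cause) {
            // Rule 1: a specific exception identifies the missing source directly.
            if (cause instanceof DataConsumptionException) {
                return true;
            }
            // Rule 2: a plain IOException may be the consumer's own problem,
            // so it is only recorded as evidence.
            if (cause instanceof IOException) {
                Set<String> consumers =
                        failedConsumers.computeIfAbsent(partitionId, k -> new HashSet<>());
                consumers.add(consumerId);
                // Rule 3: several distinct consumers failing on the same partition
                // suggests the source data is missing.
                return consumers.size() >= minDistinctConsumers;
            }
            // Any other failure is not treated as a partition problem.
            return false;
        }
    }

    public static void main(String[] args) {
        PartitionFailureTracker tracker = new PartitionFailureTracker(2);
        // One consumer with a network error: not enough evidence, prints false.
        System.out.println(tracker.shouldRevoke("p1", "consumer-a", new IOException("reset")));
        // A second, different consumer fails on the same partition: prints true.
        System.out.println(tracker.shouldRevoke("p1", "consumer-b", new IOException("timeout")));
        // The specific exception triggers revocation immediately: prints true.
        System.out.println(
                tracker.shouldRevoke("p2", "consumer-a", new DataConsumptionException("lost")));
    }
}
```

The threshold of two consumers is arbitrary here; a real scheduler would also have to decide how long such evidence stays valid.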



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Commented] (FLINK-10573) Support task revocation

2018-10-22 Thread JIN SUN (JIRA)


[ 
https://issues.apache.org/jira/browse/FLINK-10573?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16659782#comment-16659782
 ] 

JIN SUN commented on FLINK-10573:
-

Thanks Zhijiang, I would like to use this exception. I see the Jira is still 
open; do you have any update or patch? 



[jira] [Commented] (FLINK-10573) Support task revocation

2018-10-18 Thread zhijiang (JIRA)


[ 
https://issues.apache.org/jira/browse/FLINK-10573?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16654833#comment-16654833
 ] 

zhijiang commented on FLINK-10573:
--

I think my previous Jira 
[FLINK-6227|https://issues.apache.org/jira/browse/FLINK-6227], which proposes 
{{DataConsumptionException}}, may satisfy your requirements. :)
