Re: When Spark job shows FetchFailedException it creates few duplicate data and we see few data also missing , please explain why

2024-03-05 Thread Mich Talebzadeh
speculatively execute tasks on different executors to improve performance. If a task fails due to the *FetchFailedException*, a speculative task might be launched on another executor. This is where fun and games start. If the unavailable node recovers before t

Re: When Spark job shows FetchFailedException it creates few duplicate data and we see few data also missing , please explain why

2024-03-04 Thread Prem Sahoo
e recovers before the speculative task finishes, both the original and speculative tasks might complete successfully, *resulting in duplicates*. With regard to missing data, if the data node reboot leads to data corru
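To make the duplicate mechanism described in this thread concrete, here is a toy illustration in plain Python. It is not Spark code and does not model Spark's actual scheduler or output committer; `run_attempt`, `naive_sink`, `staged` and `committed` are hypothetical names made up for the sketch. It only shows why two attempts of the same task that both write to a shared output can leave duplicate rows, and why promoting exactly one finished attempt avoids that.

```python
from collections import Counter


def run_attempt(rows, fail_after=None):
    """Return the rows one task attempt manages to emit.

    fail_after=N mimics an attempt that dies after emitting N rows.
    """
    out = []
    for i, row in enumerate(rows):
        if fail_after is not None and i >= fail_after:
            break
        out.append(row)
    return out


partition = ["a", "b", "c", "d"]

# Non-idempotent sink: every attempt appends straight to the final output,
# so the partial rows of the failed attempt stay next to the retry's rows.
naive_sink = []
naive_sink += run_attempt(partition, fail_after=2)  # attempt 0 dies midway
naive_sink += run_attempt(partition)                # attempt 1 completes

# Commit-style sink: each attempt writes to its own staging area and
# only one attempt that finished is promoted to the final output.
staged = {
    0: run_attempt(partition, fail_after=2),
    1: run_attempt(partition),
}
committed = staged[1]

# Rows that appear more than once in the naive output.
duplicates = [row for row, n in Counter(naive_sink).items() if n > 1]
print("naive:", naive_sink)
print("duplicates:", duplicates)
print("committed:", committed)
```

Whether a real job behaves like the naive sink or the commit-style sink depends on how its output is written, which this sketch deliberately leaves out.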

Re: When Spark job shows FetchFailedException it creates few duplicate data and we see few data also missing , please explain why

2024-03-04 Thread Jason Xu
I think when a task fails midway and a retry task starts and completes, it may create duplicates, since the failed task has some data and the retry task has the full data. But my question is why Spark keeps the delta data, or according to you i

Re: When Spark job shows FetchFailedException it creates few duplicate data and we see few data also missing , please explain why

2024-03-04 Thread Prem Sahoo
a task fails due to the *FetchFailedException*, a speculative task might be launched on another executor. This is where fun and games start. If the unavailable node recovers before the speculative task finishes, both the original and specul

Re: When Spark job shows FetchFailedException it creates few duplicate data and we see few data also missing , please explain why

2024-03-04 Thread Mich Talebzadeh
me features to mitigate these issues, but it might not guarantee complete elimination of duplicates or data loss. You can adjust parameters like *spark.shuffle.retry.wait* and *spark.speculation* to control retry attempts and speculative exe

Re: When Spark job shows FetchFailedException it creates few duplicate data and we see few data also missing , please explain why

2024-03-03 Thread Prem Sahoo
On Thu, Feb 29, 2024 at 9:50 PM Dongjoon Hyun wrote: Please use the URL as the full string, including the '()' part. Or you can search directly at ASF Jira with the 'Spark' project and three

Re: When Spark job shows FetchFailedException it creates few duplicate data and we see few data also missing , please explain why

2024-03-02 Thread Mich Talebzadeh
ception due to a data node reboot, then Spark should handle it gracefully, shouldn't it? Or how should we handle it? On Fri, Mar 1, 2024 at 5:35 PM Mich Talebzadeh wrote: Hi, Your point -> "When Spark job shows Fe

Re: When Spark job shows FetchFailedException it creates few duplicate data and we see few data also missing , please explain why

2024-03-01 Thread Prem Sahoo
at 5:35 PM Mich Talebzadeh wrote: Hi, Your point -> "When Spark job shows FetchFailedException it creates few duplicate data and we see few data also missing , please explain why. We have scenario when spark job complains FetchFailedException as one of the

Re: When Spark job shows FetchFailedException it creates few duplicate data and we see few data also missing , please explain why

2024-03-01 Thread Mich Talebzadeh
Hi, Your point -> "When Spark job shows FetchFailedException it creates few duplicate data and we see few data also missing , please explain why. We have scenario when spark job complains *FetchFailedException as one of the data node got rebooted middle of job running."* As

Re: When Spark job shows FetchFailedException it creates few duplicate data and we see few data also missing , please explain why

2024-03-01 Thread Prem Sahoo
Hello All, in the list of JIRAs I didn't find anything related to FetchFailedException. As mentioned above: "When Spark job shows FetchFailedException it creates few duplicate data and we see few data also missing , please explain why. We have a scenario when spark job comp

Re: When Spark job shows FetchFailedException it creates few duplicate data and we see few data also missing , please explain why

2024-02-29 Thread Dongjoon Hyun
ld be helpful if you can report any correctness issues with Apache Spark 3.5.1. Thanks, Dongjoon. On 2024/02/29 15:04:41 Prem Sahoo wrote: When Spark job shows FetchFailedException it creates few duplicate data and

Re: When Spark job shows FetchFailedException it creates few duplicate data and we see few data also missing , please explain why

2024-02-29 Thread Prem Sahoo
s with Apache Spark 3.5.1. Thanks, Dongjoon. On 2024/02/29 15:04:41 Prem Sahoo wrote: When Spark job shows FetchFailedException it creates few duplicate data and we see few data also missing , please explain why. We have scenario when

Re: When Spark job shows FetchFailedException it creates few duplicate data and we see few data also missing , please explain why

2024-02-29 Thread Dongjoon Hyun
if you can report any correctness issues with Apache Spark 3.5.1. Thanks, Dongjoon. On 2024/02/29 15:04:41 Prem Sahoo wrote: When Spark job shows FetchFailedException it creates few duplicate data and we see few data also missing , please explain why. We have scenario when

When Spark job shows FetchFailedException it creates few duplicate data and we see few data also missing , please explain why

2024-02-29 Thread Prem Sahoo
When a Spark job shows FetchFailedException, it creates some duplicate data and we also see some data missing; please explain why. We have a scenario where the Spark job reports FetchFailedException because one of the data nodes was rebooted in the middle of the job run. Now due to this we have few duplicate data