[ 
https://issues.apache.org/jira/browse/SPARK-26261?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16708153#comment-16708153
 ] 

Hyukjin Kwon commented on SPARK-26261:
--------------------------------------

Mind if I ask what initial test you ran?

> Spark does not check completeness temporary file 
> -------------------------------------------------
>
>                 Key: SPARK-26261
>                 URL: https://issues.apache.org/jira/browse/SPARK-26261
>             Project: Spark
>          Issue Type: Bug
>          Components: Spark Core
>    Affects Versions: 2.3.2
>            Reporter: Jialin LIu
>            Priority: Minor
>
> Spark does not check the completeness of its temporary files. When persisting 
> to disk is enabled for some RDDs, a number of temporary files are created in 
> the blockmgr folder. The block manager can detect missing blocks, but it 
> cannot detect that a file's content was modified during execution. 
> Our initial test shows that if we truncate a block file before executors 
> use it, the program finishes without detecting any error, but the 
> result is completely wrong.
> We believe every RDD block file should be protected by a checksum.
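
To make the checksum proposal in the description concrete, here is a minimal
sketch in plain Python of the general idea: record a length and CRC32 when a
block file is written, and verify both before the file is used. This is not
Spark code and not how the block manager works; the helper names
(write_block, read_block, demo), the ".crc" sidecar file, and the block file
name "rdd_0_0" are all assumptions made for illustration.

```python
import os
import tempfile
import zlib


def write_block(path: str, data: bytes) -> None:
    """Write a block file plus a sidecar file holding its length and CRC32."""
    with open(path, "wb") as f:
        f.write(data)
    # Hypothetical sidecar format: "<length> <crc32>" as decimal text.
    with open(path + ".crc", "w") as f:
        f.write(f"{len(data)} {zlib.crc32(data)}")


def read_block(path: str) -> bytes:
    """Read a block file; raise IOError if it no longer matches its sidecar."""
    with open(path + ".crc") as f:
        expected_len, expected_crc = (int(x) for x in f.read().split())
    with open(path, "rb") as f:
        data = f.read()
    if len(data) != expected_len or zlib.crc32(data) != expected_crc:
        raise IOError(f"block file {path} failed its integrity check")
    return data


def demo() -> bool:
    """Return True if truncating a block file is detected on read."""
    with tempfile.TemporaryDirectory() as d:
        path = os.path.join(d, "rdd_0_0")  # hypothetical block file name
        write_block(path, b"x" * 1024)
        assert read_block(path) == b"x" * 1024  # an intact file reads back
        # Simulate the reported scenario: truncate the file before it is read.
        with open(path, "r+b") as f:
            f.truncate(100)
        try:
            read_block(path)
        except IOError:
            return True
        return False
```

With a check like this, the truncation described above would surface as an
error at read time instead of producing a silently wrong result. The cost is
an extra pass over each block on write and on read.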



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)
