Jialin Liu created SPARK-26261:
----------------------------------

             Summary: Spark does not check completeness of temporary files
                 Key: SPARK-26261
                 URL: https://issues.apache.org/jira/browse/SPARK-26261
             Project: Spark
          Issue Type: Bug
          Components: Spark Core
    Affects Versions: 2.3.2
            Reporter: Jialin Liu
Spark does not check the completeness of temporary files. When persist-to-disk is enabled on some RDDs, a number of temporary files are created in the blockmgr folder. The block manager is able to detect missing blocks, but it cannot detect file contents being modified during execution. Our initial test shows that if we truncate a block file before it is read by the executors, the program finishes without reporting any error, yet the result is completely wrong. We believe every on-disk RDD block file should carry a checksum, and these files should be verified against it before use.

--
This message was sent by Atlassian JIRA
(v7.6.3#76005)

---------------------------------------------------------------------
To unsubscribe, e-mail: issues-unsubscr...@spark.apache.org
For additional commands, e-mail: issues-h...@spark.apache.org
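The proposed protection could be sketched as follows. This is a minimal standalone Python illustration, not Spark's actual BlockManager code; the `write_block` and `verify_block` helpers are hypothetical names chosen for the example. It shows how recording a per-block checksum at write time would catch the truncation scenario described in the report:

```python
import hashlib
import os
import tempfile

def write_block(path: str, data: bytes) -> str:
    """Write a block file and return its SHA-256 checksum
    (hypothetical helper, not Spark's BlockManager API)."""
    with open(path, "wb") as f:
        f.write(data)
    return hashlib.sha256(data).hexdigest()

def verify_block(path: str, expected: str) -> bool:
    """Re-hash the on-disk file and compare with the recorded checksum."""
    with open(path, "rb") as f:
        actual = hashlib.sha256(f.read()).hexdigest()
    return actual == expected

path = os.path.join(tempfile.mkdtemp(), "rdd_0_0")
checksum = write_block(path, b"block contents" * 1024)
assert verify_block(path, checksum)       # intact file passes verification

# Truncate the block file, as in the reported test scenario.
with open(path, "r+b") as f:
    f.truncate(100)
assert not verify_block(path, checksum)   # truncation is now detected
```

With such a check in place, an executor reading a tampered or truncated block would fail fast with a verifiable error instead of silently producing a wrong result.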