[jira] [Commented] (SPARK-29037) [Core] Spark gives duplicate result when an application was killed and rerun

2020-05-08 Thread Afroz Baig (Jira)
[ https://issues.apache.org/jira/browse/SPARK-29037?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17102837#comment-17102837 ] Afroz Baig commented on SPARK-29037: spark.hadoop.mapreduce.fileoutputcommitter.algorithm.version=2

[jira] [Commented] (SPARK-29037) [Core] Spark gives duplicate result when an application was killed and rerun

2020-04-29 Thread t oo (Jira)
[ https://issues.apache.org/jira/browse/SPARK-29037?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17095434#comment-17095434 ] t oo commented on SPARK-29037: With Spark 2.3.4 and Hadoop 2.8.5: I am facing this doing simple Overwrite

[jira] [Commented] (SPARK-29037) [Core] Spark gives duplicate result when an application was killed and rerun

2019-09-15 Thread feiwang (Jira)
[ https://issues.apache.org/jira/browse/SPARK-29037?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16929916#comment-16929916 ] feiwang commented on SPARK-29037: [~advancedxy] Hi, I found that even with dynamicPartitionOverwrite,

[jira] [Commented] (SPARK-29037) [Core] Spark gives duplicate result when an application was killed and rerun

2019-09-12 Thread feiwang (Jira)
[ https://issues.apache.org/jira/browse/SPARK-29037?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16928674#comment-16928674 ] feiwang commented on SPARK-29037: [~advancedxy] I just checked the code, as shown below.

[jira] [Commented] (SPARK-29037) [Core] Spark gives duplicate result when an application was killed and rerun

2019-09-12 Thread feiwang (Jira)
[ https://issues.apache.org/jira/browse/SPARK-29037?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16928656#comment-16928656 ] feiwang commented on SPARK-29037: In detail, I think we need to change the logic of

[jira] [Commented] (SPARK-29037) [Core] Spark gives duplicate result when an application was killed and rerun

2019-09-12 Thread Xianjin YE (Jira)
[ https://issues.apache.org/jira/browse/SPARK-29037?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16928650#comment-16928650 ] Xianjin YE commented on SPARK-29037: > About output check, I think it is not appropriate, because

[jira] [Commented] (SPARK-29037) [Core] Spark gives duplicate result when an application was killed and rerun

2019-09-12 Thread feiwang (Jira)
[ https://issues.apache.org/jira/browse/SPARK-29037?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16928598#comment-16928598 ] feiwang commented on SPARK-29037: The implementation of InsertIntoHiveTable prevents reusing the same

[jira] [Commented] (SPARK-29037) [Core] Spark gives duplicate result when an application was killed and rerun

2019-09-12 Thread feiwang (Jira)
[ https://issues.apache.org/jira/browse/SPARK-29037?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16928596#comment-16928596 ] feiwang commented on SPARK-29037: [~advancedxy] 1. We re-submit the same application again. We meet

[jira] [Commented] (SPARK-29037) [Core] Spark gives duplicate result when an application was killed and rerun

2019-09-12 Thread Xianjin YE (Jira)
[ https://issues.apache.org/jira/browse/SPARK-29037?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16928561#comment-16928561 ] Xianjin YE commented on SPARK-29037: [~hzfeiwang] By rerunning the application, do you mean re-submit

[jira] [Commented] (SPARK-29037) [Core] Spark gives duplicate result when an application was killed and rerun

2019-09-12 Thread Wenchen Fan (Jira)
[ https://issues.apache.org/jira/browse/SPARK-29037?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16928503#comment-16928503 ] Wenchen Fan commented on SPARK-29037: [~advancedxy] Can you take a look?

[jira] [Commented] (SPARK-29037) [Core] Spark gives duplicate result when an application was killed and rerun

2019-09-12 Thread feiwang (Jira)
[ https://issues.apache.org/jira/browse/SPARK-29037?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16928394#comment-16928394 ] feiwang commented on SPARK-29037: But for version 2, it may produce a partial result when we kill an

[jira] [Commented] (SPARK-29037) [Core] Spark gives duplicate result when an application was killed and rerun

2019-09-12 Thread feiwang (Jira)
[ https://issues.apache.org/jira/browse/SPARK-29037?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16928375#comment-16928375 ] feiwang commented on SPARK-29037: If we set

[jira] [Commented] (SPARK-29037) [Core] Spark gives duplicate result when an application was killed and rerun

2019-09-11 Thread feiwang (Jira)
[ https://issues.apache.org/jira/browse/SPARK-29037?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16928180#comment-16928180 ] feiwang commented on SPARK-29037: [~cloud_fan]

[jira] [Commented] (SPARK-29037) [Core] Spark gives duplicate result when an application was killed and rerun

2019-09-11 Thread feiwang (Jira)
[ https://issues.apache.org/jira/browse/SPARK-29037?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16928170#comment-16928170 ] feiwang commented on SPARK-29037: If we have several applications that insert overwrite a partition

[jira] [Commented] (SPARK-29037) [Core] Spark gives duplicate result when an application was killed and rerun

2019-09-11 Thread feiwang (Jira)
[ https://issues.apache.org/jira/browse/SPARK-29037?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16928168#comment-16928168 ] feiwang commented on SPARK-29037: This committedTaskPath is hard-coded in the FileOutputCommitter class.

[jira] [Commented] (SPARK-29037) [Core] Spark gives duplicate result when an application was killed and rerun

2019-09-11 Thread feiwang (Jira)
[ https://issues.apache.org/jira/browse/SPARK-29037?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16928165#comment-16928165 ] feiwang commented on SPARK-29037: This is the unit test log. !screenshot-1.png! We can see that the