[GitHub] spark pull request: [SPARK-4094][CORE] checkpoint should still be ...
Github user srowen commented on the pull request: https://github.com/apache/spark/pull/2956#issuecomment-103156205 I don't think this is going to be merged. Per comments in the JIRA this was 'as intended' for Spark. Do you mind closing this PR? --- If your project is set up for it, you can reply to this email and have your reply appear on GitHub as well. If your project does not have this feature enabled and wishes so, or if the feature is enabled but not working, please contact infrastructure at infrastruct...@apache.org or file a JIRA ticket with INFRA. --- - To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org For additional commands, e-mail: reviews-h...@spark.apache.org
[GitHub] spark pull request: [SPARK-4094][CORE] checkpoint should still be ...
Github user liyezhang556520 closed the pull request at: https://github.com/apache/spark/pull/2956
[GitHub] spark pull request: [SPARK-4094][CORE] checkpoint should still be ...
Github user liyezhang556520 commented on the pull request: https://github.com/apache/spark/pull/2956#issuecomment-103291479 I'm closing this, thanks.
[GitHub] spark pull request: [SPARK-4094][CORE] checkpoint should still be ...
Github user dding3 commented on the pull request: https://github.com/apache/spark/pull/2956#issuecomment-67931182

We have tested the patch in the scenarios below and found that it works:

1. Checkpoint applied. The RDD was flushed to disk as expected.
2. Checkpoint not applied. There was no performance degradation; our app (PageRank) spent only 1 more second compared to Spark without the patch.
[GitHub] spark pull request: [SPARK-4094][CORE] checkpoint should still be ...
Github user SparkQA commented on the pull request: https://github.com/apache/spark/pull/2956#issuecomment-67459431 [Test build #24585 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/24585/consoleFull) for PR 2956 at commit [`a473241`](https://github.com/apache/spark/commit/a47324118358802fcc6821e77ead77fd37003904). * This patch **does not merge cleanly**.
[GitHub] spark pull request: [SPARK-4094][CORE] checkpoint should still be ...
Github user SparkQA commented on the pull request: https://github.com/apache/spark/pull/2956#issuecomment-67468461 [Test build #24585 has finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/24585/consoleFull) for PR 2956 at commit [`a473241`](https://github.com/apache/spark/commit/a47324118358802fcc6821e77ead77fd37003904). * This patch **fails PySpark unit tests**. * This patch **does not merge cleanly**. * This patch adds no public classes.
[GitHub] spark pull request: [SPARK-4094][CORE] checkpoint should still be ...
Github user AmplabJenkins commented on the pull request: https://github.com/apache/spark/pull/2956#issuecomment-67468465 Test FAILed. Refer to this link for build results (access rights to CI server needed): https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/24585/ Test FAILed.
[GitHub] spark pull request: [SPARK-4094][CORE] checkpoint should still be ...
Github user liyezhang556520 commented on the pull request: https://github.com/apache/spark/pull/2956#issuecomment-67483754 jenkins, retest this please
[GitHub] spark pull request: [SPARK-4094][CORE] checkpoint should still be ...
Github user SparkQA commented on the pull request: https://github.com/apache/spark/pull/2956#issuecomment-67484171 [Test build #24589 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/24589/consoleFull) for PR 2956 at commit [`a473241`](https://github.com/apache/spark/commit/a47324118358802fcc6821e77ead77fd37003904). * This patch **does not merge cleanly**.
[GitHub] spark pull request: [SPARK-4094][CORE] checkpoint should still be ...
Github user AmplabJenkins commented on the pull request: https://github.com/apache/spark/pull/2956#issuecomment-67494552 Test FAILed. Refer to this link for build results (access rights to CI server needed): https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/24589/ Test FAILed.
[GitHub] spark pull request: [SPARK-4094][CORE] checkpoint should still be ...
Github user SparkQA commented on the pull request: https://github.com/apache/spark/pull/2956#issuecomment-67494546 [Test build #24589 has finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/24589/consoleFull) for PR 2956 at commit [`a473241`](https://github.com/apache/spark/commit/a47324118358802fcc6821e77ead77fd37003904). * This patch **fails PySpark unit tests**. * This patch **does not merge cleanly**. * This patch adds no public classes.
[GitHub] spark pull request: [SPARK-4094][CORE] checkpoint should still be ...
Github user SparkQA commented on the pull request: https://github.com/apache/spark/pull/2956#issuecomment-67603039 [Test build #24629 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/24629/consoleFull) for PR 2956 at commit [`be7c1fa`](https://github.com/apache/spark/commit/be7c1fae8deb2922b276fca2c46f747c2cdb05f1). * This patch merges cleanly.
[GitHub] spark pull request: [SPARK-4094][CORE] checkpoint should still be ...
Github user SparkQA commented on the pull request: https://github.com/apache/spark/pull/2956#issuecomment-67607694 [Test build #24629 has finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/24629/consoleFull) for PR 2956 at commit [`be7c1fa`](https://github.com/apache/spark/commit/be7c1fae8deb2922b276fca2c46f747c2cdb05f1). * This patch **passes all tests**. * This patch merges cleanly. * This patch adds no public classes.
[GitHub] spark pull request: [SPARK-4094][CORE] checkpoint should still be ...
Github user AmplabJenkins commented on the pull request: https://github.com/apache/spark/pull/2956#issuecomment-67607704 Test PASSed. Refer to this link for build results (access rights to CI server needed): https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/24629/ Test PASSed.
[GitHub] spark pull request: [SPARK-4094][CORE] checkpoint should still be ...
Github user SparkQA commented on the pull request: https://github.com/apache/spark/pull/2956#issuecomment-66239252 [Test build #24239 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/24239/consoleFull) for PR 2956 at commit [`c73ee63`](https://github.com/apache/spark/commit/c73ee632d3531f28d38cdc245739921acdcd2795). * This patch **does not merge cleanly**.
[GitHub] spark pull request: [SPARK-4094][CORE] checkpoint should still be ...
Github user SparkQA commented on the pull request: https://github.com/apache/spark/pull/2956#issuecomment-66246526 [Test build #24239 has finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/24239/consoleFull) for PR 2956 at commit [`c73ee63`](https://github.com/apache/spark/commit/c73ee632d3531f28d38cdc245739921acdcd2795). * This patch **passes all tests**. * This patch **does not merge cleanly**. * This patch adds no public classes.
[GitHub] spark pull request: [SPARK-4094][CORE] checkpoint should still be ...
Github user AmplabJenkins commented on the pull request: https://github.com/apache/spark/pull/2956#issuecomment-66246533 Test PASSed. Refer to this link for build results (access rights to CI server needed): https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/24239/ Test PASSed.
[GitHub] spark pull request: [SPARK-4094][CORE] checkpoint should still be ...
Github user SparkQA commented on the pull request: https://github.com/apache/spark/pull/2956#issuecomment-61751034 [Test build #22911 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/22911/consoleFull) for PR 2956 at commit [`81dacc5`](https://github.com/apache/spark/commit/81dacc5881d40906a4dc63dd43243853f3020bbd). * This patch merges cleanly.
[GitHub] spark pull request: [SPARK-4094][CORE] checkpoint should still be ...
Github user SparkQA commented on the pull request: https://github.com/apache/spark/pull/2956#issuecomment-61756810 [Test build #22911 has finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/22911/consoleFull) for PR 2956 at commit [`81dacc5`](https://github.com/apache/spark/commit/81dacc5881d40906a4dc63dd43243853f3020bbd). * This patch **passes all tests**. * This patch merges cleanly. * This patch adds no public classes.
[GitHub] spark pull request: [SPARK-4094][CORE] checkpoint should still be ...
Github user AmplabJenkins commented on the pull request: https://github.com/apache/spark/pull/2956#issuecomment-61756815 Test PASSed. Refer to this link for build results (access rights to CI server needed): https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/22911/ Test PASSed.
[GitHub] spark pull request: [SPARK-4094][CORE] checkpoint should still be ...
Github user SparkQA commented on the pull request: https://github.com/apache/spark/pull/2956#issuecomment-60867321 [Test build #22419 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/22419/consoleFull) for PR 2956 at commit [`b8633c7`](https://github.com/apache/spark/commit/b8633c7714aacf5d8c87037f3108564a88c555b5). * This patch merges cleanly.
[GitHub] spark pull request: [SPARK-4094][CORE] checkpoint should still be ...
Github user liyezhang556520 commented on the pull request: https://github.com/apache/spark/pull/2956#issuecomment-60867864

RDD checkpoint should also support a case like this:

`rdd0 = sc.makeRDD(...)`
`rdd1 = rdd0.flatMap(...)`
`rdd1.collect()`
`rdd0.checkpoint()`
`rdd1.count()` // rdd0 should be checkpointed here

That is, a checkpoint requested after an action should also work for RDDs on which the action was not called directly. This requires traversing the whole RDD lineage until reaching RDDs that have already been checkpointed. The traversal only checks the status of each RDD and does not trigger any recomputation, so its impact on performance should be trivial.
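The traversal argument in the comment above can be illustrated with a toy model. This is only a sketch and not Spark code: `ToyRdd`, `LineageWalk`, `checkpointPending`, and `computeCount` are hypothetical names invented for the illustration. It assumes the walk inspects one status flag per RDD, "writes" any ancestor that was marked for checkpointing, and stops descending once it meets an RDD that is already checkpointed.

```scala
// Toy model only: ToyRdd and LineageWalk are hypothetical, not Spark classes.
final class ToyRdd(val name: String, val parents: List[ToyRdd] = Nil) {
  var markedForCheckpoint = false // set when the user calls checkpoint()
  var checkpointed = false        // set once the data would have been written
  var computeCount = 0            // would be incremented only by real evaluation

  def checkpoint(): Unit = { markedForCheckpoint = true }
}

object LineageWalk {
  /** Walks up the lineage from `rdd`, marks every pending ancestor as written,
    * and returns how many nodes were inspected. The walk stops at nodes that
    * are already checkpointed and never modifies computeCount. */
  def checkpointPending(rdd: ToyRdd): Int = {
    if (rdd.checkpointed) {
      1 // already checkpointed: inspect it, but do not look at its parents
    } else {
      if (rdd.markedForCheckpoint) { rdd.checkpointed = true }
      1 + rdd.parents.map(checkpointPending).sum
    }
  }

  def main(args: Array[String]): Unit = {
    val rdd0 = new ToyRdd("rdd0")
    val rdd1 = new ToyRdd("rdd1", List(rdd0))
    rdd0.checkpoint() // requested on the parent, not on the RDD the action runs on
    val inspected = checkpointPending(rdd1)
    println(s"inspected=$inspected, rdd0.checkpointed=${rdd0.checkpointed}")
  }
}
```

In this model the cost of the walk grows with the number of lineage nodes visited, not with the size of the data, which is the sense in which the comment calls the overhead trivial. Whether that also holds for very long lineages in real Spark jobs is not something this sketch can show.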
[GitHub] spark pull request: [SPARK-4094][CORE] checkpoint should still be ...
Github user SparkQA commented on the pull request: https://github.com/apache/spark/pull/2956#issuecomment-60874724 **[Test build #22419 timed out](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/22419/consoleFull)** for PR 2956 at commit [`b8633c7`](https://github.com/apache/spark/commit/b8633c7714aacf5d8c87037f3108564a88c555b5) after a configured wait of `120m`.
[GitHub] spark pull request: [SPARK-4094][CORE] checkpoint should still be ...
Github user AmplabJenkins commented on the pull request: https://github.com/apache/spark/pull/2956#issuecomment-60874726 Test FAILed. Refer to this link for build results (access rights to CI server needed): https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/22419/ Test FAILed.
[GitHub] spark pull request: [SPARK-4094][CORE] checkpoint should still be ...
GitHub user liyezhang556520 opened a pull request: https://github.com/apache/spark/pull/2956

[SPARK-4094][CORE] checkpoint should still be available after any rdd actions

JIRA URL: [SPARK-4094](https://issues.apache.org/jira/browse/SPARK-4094)

You can merge this pull request into a Git repository by running:

    $ git pull https://github.com/liyezhang556520/spark cpAfterAction

Alternatively you can review and apply these changes as the patch at: https://github.com/apache/spark/pull/2956.patch

To close this pull request, make a commit to your master/trunk branch with (at least) the following in the commit message:

    This closes #2956

commit 719c29da025519c9282940ac39c398ab860f700f
Author: Zhang, Liye liye.zh...@intel.com
Date: 2014-10-27T06:50:50Z

    [SPARK-4094][CORE] checkpoint should still be available after any rdd actions
[GitHub] spark pull request: [SPARK-4094][CORE] checkpoint should still be ...
Github user SparkQA commented on the pull request: https://github.com/apache/spark/pull/2956#issuecomment-60556120 [Test build #22281 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/22281/consoleFull) for PR 2956 at commit [`719c29d`](https://github.com/apache/spark/commit/719c29da025519c9282940ac39c398ab860f700f). * This patch merges cleanly.
[GitHub] spark pull request: [SPARK-4094][CORE] checkpoint should still be ...
Github user srowen commented on a diff in the pull request: https://github.com/apache/spark/pull/2956#discussion_r19391269

--- Diff: core/src/main/scala/org/apache/spark/rdd/RDD.scala ---
@@ -1204,6 +1204,8 @@ abstract class RDD[T: ClassTag](
     } else if (checkpointData.isEmpty) {
       checkpointData = Some(new RDDCheckpointData(this))
       checkpointData.get.markForCheckpoint()
+      // There is supposed to be doCheckpoint in the following, reset doCheckpointCalled first
+      doCheckpointCalled = false
--- End diff --

From the docs, it's clear that this is not intended to be called after operations have executed on the RDD. These changes kind of hack it so it doesn't directly fail, but are you certain this is valid? Race conditions and so on? What's the point of `doCheckpointCalled` after this change, really? The criteria seem to collapse to: allow a checkpoint if no checkpoint data has been written. If it's that easy, I do wonder why it wasn't this way in the first place.
[GitHub] spark pull request: [SPARK-4094][CORE] checkpoint should still be ...
Github user SparkQA commented on the pull request: https://github.com/apache/spark/pull/2956#issuecomment-60560699 [Test build #22281 has finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/22281/consoleFull) for PR 2956 at commit [`719c29d`](https://github.com/apache/spark/commit/719c29da025519c9282940ac39c398ab860f700f). * This patch **passes all tests**. * This patch merges cleanly. * This patch adds no public classes.
[GitHub] spark pull request: [SPARK-4094][CORE] checkpoint should still be ...
Github user AmplabJenkins commented on the pull request: https://github.com/apache/spark/pull/2956#issuecomment-60560702 Test PASSed. Refer to this link for build results (access rights to CI server needed): https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/22281/ Test PASSed.
[GitHub] spark pull request: [SPARK-4094][CORE] checkpoint should still be ...
Github user liyezhang556520 commented on a diff in the pull request: https://github.com/apache/spark/pull/2956#discussion_r19397687

--- Diff: core/src/main/scala/org/apache/spark/rdd/RDD.scala ---
@@ -1204,6 +1204,8 @@ abstract class RDD[T: ClassTag](
     } else if (checkpointData.isEmpty) {
       checkpointData = Some(new RDDCheckpointData(this))
       checkpointData.get.markForCheckpoint()
+      // There is supposed to be doCheckpoint in the following, reset doCheckpointCalled first
+      doCheckpointCalled = false
--- End diff --

Hi @srowen, thanks for your comment. The change is admittedly a bit of a hack. `doCheckpointCalled` still serves the purpose it had before this change. My concern is that it may be a little weird for users that checkpoint never works once a job has been executed on the RDD, while cache() has no such issue. Maybe this was done with automatic checkpointing in mind. Anyway, it would be better if this could be solved. @pwendell, can you share your expertise on the original design and your opinion on this? This is not a bug, only a convention for how users should checkpoint in Spark. If the situation I listed in the JIRA is not something Spark intends to support, I will close the JIRA and this PR.
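To make the exchange above easier to follow, here is a minimal toy model of the guard being discussed. It is a sketch under stated assumptions and not the code in RDD.scala: `GuardedRdd`, `runJob`, `written`, and `resetFlagOnCheckpoint` are hypothetical names, and the model assumes that checkpoint data is written at the end of a job only the first time the guard is passed. Only `checkpointData` and `doCheckpointCalled` are names taken from the diff.

```scala
// Toy model only: GuardedRdd is hypothetical and merely imitates the two
// fields named in the diff (checkpointData and doCheckpointCalled).
final class GuardedRdd(resetFlagOnCheckpoint: Boolean) {
  private var checkpointData: Option[String] = None
  private var doCheckpointCalled = false
  var written = false // true once the checkpoint would have been written

  /** User-facing request: mark this RDD for checkpointing. */
  def checkpoint(): Unit = {
    if (checkpointData.isEmpty) {
      checkpointData = Some("marked")
      // Stands in for the two lines added by the diff.
      if (resetFlagOnCheckpoint) { doCheckpointCalled = false }
    }
  }

  /** Stands in for the end of any job (action) that runs on this RDD. */
  def runJob(): Unit = {
    if (!doCheckpointCalled) {
      doCheckpointCalled = true
      if (checkpointData.isDefined) { written = true }
    }
  }
}

object GuardDemo {
  def main(args: Array[String]): Unit = {
    for (patched <- Seq(false, true)) {
      val rdd = new GuardedRdd(resetFlagOnCheckpoint = patched)
      rdd.runJob()     // first action, no checkpoint requested yet
      rdd.checkpoint() // checkpoint requested after the action
      rdd.runJob()     // second action
      println(s"patched=$patched written=${rdd.written}")
    }
  }
}
```

With the reset in place, the only remaining condition in this model is whether checkpoint data already exists, which is the collapse that srowen points out. The model is single-threaded and says nothing about the race conditions he asks about.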