[GitHub] spark pull request: [SPARK-4030] Make destroy public for broadcast...

2014-10-27 Thread SparkQA
Github user SparkQA commented on the pull request:

https://github.com/apache/spark/pull/2922#issuecomment-60553565
  
  [Test build #22277 has 
started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/22277/consoleFull)
 for   PR 2922 at commit 
[`a11abab`](https://github.com/apache/spark/commit/a11ababb21a0c2378a4d2f665f16d16112d7b469).
 * This patch merges cleanly.


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastruct...@apache.org or file a JIRA ticket
with INFRA.
---

-
To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org
For additional commands, e-mail: reviews-h...@spark.apache.org



[GitHub] spark pull request: [SPARK-4030] Make destroy public for broadcast...

2014-10-27 Thread shivaram
Github user shivaram commented on the pull request:

https://github.com/apache/spark/pull/2922#issuecomment-60553667
  
@pwendell - I made destroy blocking by default and only made that version 
public (its not clear we need the non-blocking version to also be public -- we 
can add it later if required)

Also all the Broadcast stuff in the Java API seems to come directly from 
the java classes ? Let me know if I missed something.


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastruct...@apache.org or file a JIRA ticket
with INFRA.
---

-
To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org
For additional commands, e-mail: reviews-h...@spark.apache.org



[GitHub] spark pull request: [SPARK-4030] Make destroy public for broadcast...

2014-10-27 Thread pwendell
Github user pwendell commented on the pull request:

https://github.com/apache/spark/pull/2922#issuecomment-60554586
  
Oh right yeah. Great, LGTM.


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastruct...@apache.org or file a JIRA ticket
with INFRA.
---

-
To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org
For additional commands, e-mail: reviews-h...@spark.apache.org



[GitHub] spark pull request: [SPARK-4030] Make destroy public for broadcast...

2014-10-27 Thread AmplabJenkins
Github user AmplabJenkins commented on the pull request:

https://github.com/apache/spark/pull/2922#issuecomment-60557108
  
Test PASSed.
Refer to this link for build results (access rights to CI server needed): 
https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/22277/
Test PASSed.


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastruct...@apache.org or file a JIRA ticket
with INFRA.
---

-
To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org
For additional commands, e-mail: reviews-h...@spark.apache.org



[GitHub] spark pull request: [SPARK-4030] Make destroy public for broadcast...

2014-10-27 Thread SparkQA
Github user SparkQA commented on the pull request:

https://github.com/apache/spark/pull/2922#issuecomment-60557102
  
  [Test build #22277 has 
finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/22277/consoleFull)
 for   PR 2922 at commit 
[`a11abab`](https://github.com/apache/spark/commit/a11ababb21a0c2378a4d2f665f16d16112d7b469).
 * This patch **passes all tests**.
 * This patch merges cleanly.
 * This patch adds no public classes.


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastruct...@apache.org or file a JIRA ticket
with INFRA.
---

-
To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org
For additional commands, e-mail: reviews-h...@spark.apache.org



[GitHub] spark pull request: [SPARK-4030] Make destroy public for broadcast...

2014-10-27 Thread shivaram
Github user shivaram commented on the pull request:

https://github.com/apache/spark/pull/2922#issuecomment-60615291
  
Thanks. Merging this


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastruct...@apache.org or file a JIRA ticket
with INFRA.
---

-
To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org
For additional commands, e-mail: reviews-h...@spark.apache.org



[GitHub] spark pull request: [SPARK-4030] Make destroy public for broadcast...

2014-10-27 Thread asfgit
Github user asfgit closed the pull request at:

https://github.com/apache/spark/pull/2922


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastruct...@apache.org or file a JIRA ticket
with INFRA.
---

-
To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org
For additional commands, e-mail: reviews-h...@spark.apache.org



[GitHub] spark pull request: [SPARK-4030] Make destroy public for broadcast...

2014-10-26 Thread pwendell
Github user pwendell commented on a diff in the pull request:

https://github.com/apache/spark/pull/2922#discussion_r19382271
  
--- Diff: core/src/main/scala/org/apache/spark/broadcast/Broadcast.scala ---
@@ -60,6 +62,8 @@ abstract class Broadcast[T: ClassTag](val id: Long) 
extends Serializable {
*/
   @volatile private var _isValid = true
 
+  private var _destroySite = 
--- End diff --

@srowen yeah I asked for tracking the callsite because we now have this 
case where someone can try to use a destroyed broadcast (destroyed by e.g. 
another thread) and it will be very hard for users to debug this. Tracking the 
callsite has almost no overhead here and it seemed like it might be useful for 
debugging.


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastruct...@apache.org or file a JIRA ticket
with INFRA.
---

-
To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org
For additional commands, e-mail: reviews-h...@spark.apache.org



[GitHub] spark pull request: [SPARK-4030] Make destroy public for broadcast...

2014-10-26 Thread pwendell
Github user pwendell commented on a diff in the pull request:

https://github.com/apache/spark/pull/2922#discussion_r19382277
  
--- Diff: core/src/main/scala/org/apache/spark/broadcast/Broadcast.scala ---
@@ -87,10 +91,13 @@ abstract class Broadcast[T: ClassTag](val id: Long) 
extends Serializable {
   /**
* Destroy all data and metadata related to this broadcast variable. Use 
this with caution;
* once a broadcast variable has been destroyed, it cannot be used again.
+   * @param blocking Whether to block until destroy has completed
*/
-  private[spark] def destroy(blocking: Boolean) {
+  def destroy(blocking: Boolean) {
--- End diff --

Okay if that's the case I'd propose making `destroy()` public and keeping 
`destroy(blocking: Boolean)` as private. That way we minimize the surface area 
of public APIs.


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastruct...@apache.org or file a JIRA ticket
with INFRA.
---

-
To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org
For additional commands, e-mail: reviews-h...@spark.apache.org



[GitHub] spark pull request: [SPARK-4030] Make destroy public for broadcast...

2014-10-26 Thread SparkQA
Github user SparkQA commented on the pull request:

https://github.com/apache/spark/pull/2922#issuecomment-60553077
  
  [Test build #22276 has 
started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/22276/consoleFull)
 for   PR 2922 at commit 
[`bed9c9d`](https://github.com/apache/spark/commit/bed9c9dcf7e46eb703ca1e721609c4293639c9af).
 * This patch merges cleanly.


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastruct...@apache.org or file a JIRA ticket
with INFRA.
---

-
To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org
For additional commands, e-mail: reviews-h...@spark.apache.org



[GitHub] spark pull request: [SPARK-4030] Make destroy public for broadcast...

2014-10-26 Thread SparkQA
Github user SparkQA commented on the pull request:

https://github.com/apache/spark/pull/2922#issuecomment-60553130
  
  [Test build #22276 has 
finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/22276/consoleFull)
 for   PR 2922 at commit 
[`bed9c9d`](https://github.com/apache/spark/commit/bed9c9dcf7e46eb703ca1e721609c4293639c9af).
 * This patch **fails Scala style tests**.
 * This patch merges cleanly.
 * This patch adds no public classes.


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastruct...@apache.org or file a JIRA ticket
with INFRA.
---

-
To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org
For additional commands, e-mail: reviews-h...@spark.apache.org



[GitHub] spark pull request: [SPARK-4030] Make destroy public for broadcast...

2014-10-26 Thread AmplabJenkins
Github user AmplabJenkins commented on the pull request:

https://github.com/apache/spark/pull/2922#issuecomment-60553132
  
Test FAILed.
Refer to this link for build results (access rights to CI server needed): 
https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/22276/
Test FAILed.


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastruct...@apache.org or file a JIRA ticket
with INFRA.
---

-
To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org
For additional commands, e-mail: reviews-h...@spark.apache.org



[GitHub] spark pull request: [SPARK-4030] Make destroy public for broadcast...

2014-10-24 Thread shivaram
GitHub user shivaram opened a pull request:

https://github.com/apache/spark/pull/2922

[SPARK-4030] Make destroy public for broadcast variables

This change makes the destroy function public for broadcast variables. 
Motivation for the change is described in 
https://issues.apache.org/jira/browse/SPARK-4030.
This patch also logs where destroy was called from if a broadcast variable 
is used after destruction.

You can merge this pull request into a Git repository by running:

$ git pull https://github.com/shivaram/spark-1 broadcast-destroy

Alternatively you can review and apply these changes as the patch at:

https://github.com/apache/spark/pull/2922.patch

To close this pull request, make a commit to your master/trunk branch
with (at least) the following in the commit message:

This closes #2922


commit e80c1abca25f13e2a6e9b0f0b83a9fb7032ba4ca
Author: Shivaram Venkataraman shiva...@cs.berkeley.edu
Date:   2014-10-24T06:36:46Z

Make destroy public for broadcast variables
Also log where destroy was called from if a broadcast variable is used 
after destruction.




---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastruct...@apache.org or file a JIRA ticket
with INFRA.
---

-
To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org
For additional commands, e-mail: reviews-h...@spark.apache.org



[GitHub] spark pull request: [SPARK-4030] Make destroy public for broadcast...

2014-10-24 Thread shivaram
Github user shivaram commented on the pull request:

https://github.com/apache/spark/pull/2922#issuecomment-60350235
  
cc @pwendell @rxin for review


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastruct...@apache.org or file a JIRA ticket
with INFRA.
---

-
To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org
For additional commands, e-mail: reviews-h...@spark.apache.org



[GitHub] spark pull request: [SPARK-4030] Make destroy public for broadcast...

2014-10-24 Thread SparkQA
Github user SparkQA commented on the pull request:

https://github.com/apache/spark/pull/2922#issuecomment-60350674
  
  [Test build #22125 has 
started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/22125/consoleFull)
 for   PR 2922 at commit 
[`e80c1ab`](https://github.com/apache/spark/commit/e80c1abca25f13e2a6e9b0f0b83a9fb7032ba4ca).
 * This patch merges cleanly.


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastruct...@apache.org or file a JIRA ticket
with INFRA.
---

-
To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org
For additional commands, e-mail: reviews-h...@spark.apache.org



[GitHub] spark pull request: [SPARK-4030] Make destroy public for broadcast...

2014-10-24 Thread SparkQA
Github user SparkQA commented on the pull request:

https://github.com/apache/spark/pull/2922#issuecomment-60350745
  
  [Test build #22125 has 
finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/22125/consoleFull)
 for   PR 2922 at commit 
[`e80c1ab`](https://github.com/apache/spark/commit/e80c1abca25f13e2a6e9b0f0b83a9fb7032ba4ca).
 * This patch **fails Scala style tests**.
 * This patch merges cleanly.
 * This patch adds the following public classes _(experimental)_:
  * `abstract class Broadcast[T: ClassTag](val id: Long) extends 
Serializable with Logging `



---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastruct...@apache.org or file a JIRA ticket
with INFRA.
---

-
To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org
For additional commands, e-mail: reviews-h...@spark.apache.org



[GitHub] spark pull request: [SPARK-4030] Make destroy public for broadcast...

2014-10-24 Thread AmplabJenkins
Github user AmplabJenkins commented on the pull request:

https://github.com/apache/spark/pull/2922#issuecomment-60350746
  
Test FAILed.
Refer to this link for build results (access rights to CI server needed): 
https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/22125/
Test FAILed.


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastruct...@apache.org or file a JIRA ticket
with INFRA.
---

-
To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org
For additional commands, e-mail: reviews-h...@spark.apache.org



[GitHub] spark pull request: [SPARK-4030] Make destroy public for broadcast...

2014-10-24 Thread pwendell
Github user pwendell commented on a diff in the pull request:

https://github.com/apache/spark/pull/2922#discussion_r19325043
  
--- Diff: core/src/main/scala/org/apache/spark/broadcast/Broadcast.scala ---
@@ -87,10 +91,13 @@ abstract class Broadcast[T: ClassTag](val id: Long) 
extends Serializable {
   /**
* Destroy all data and metadata related to this broadcast variable. Use 
this with caution;
* once a broadcast variable has been destroyed, it cannot be used again.
+   * @param blocking Whether to block until destroy has completed
*/
-  private[spark] def destroy(blocking: Boolean) {
+  def destroy(blocking: Boolean) {
--- End diff --

should we only expose a version where blocking is set to true for users? It 
seems like asynchronous destroy is a bit more complex. @shivaram does your app 
need the async version?


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastruct...@apache.org or file a JIRA ticket
with INFRA.
---

-
To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org
For additional commands, e-mail: reviews-h...@spark.apache.org



[GitHub] spark pull request: [SPARK-4030] Make destroy public for broadcast...

2014-10-24 Thread pwendell
Github user pwendell commented on the pull request:

https://github.com/apache/spark/pull/2922#issuecomment-60350936
  
small question - this looks good overall


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastruct...@apache.org or file a JIRA ticket
with INFRA.
---

-
To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org
For additional commands, e-mail: reviews-h...@spark.apache.org



[GitHub] spark pull request: [SPARK-4030] Make destroy public for broadcast...

2014-10-24 Thread pwendell
Github user pwendell commented on the pull request:

https://github.com/apache/spark/pull/2922#issuecomment-60350983
  
oh one thing - can we add a java version of this? should be pretty simple, 
right?


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastruct...@apache.org or file a JIRA ticket
with INFRA.
---

-
To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org
For additional commands, e-mail: reviews-h...@spark.apache.org



[GitHub] spark pull request: [SPARK-4030] Make destroy public for broadcast...

2014-10-24 Thread srowen
Github user srowen commented on a diff in the pull request:

https://github.com/apache/spark/pull/2922#discussion_r19340941
  
--- Diff: core/src/main/scala/org/apache/spark/broadcast/Broadcast.scala ---
@@ -60,6 +62,8 @@ abstract class Broadcast[T: ClassTag](val id: Long) 
extends Serializable {
*/
   @volatile private var _isValid = true
 
+  private var _destroySite = 
--- End diff --

How useful is it to store this? it only helps in case of an invalid 
`Broadcast` instance. (PS you can use string interpolation instead of `format` 
in these changes if you care to)


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastruct...@apache.org or file a JIRA ticket
with INFRA.
---

-
To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org
For additional commands, e-mail: reviews-h...@spark.apache.org



[GitHub] spark pull request: [SPARK-4030] Make destroy public for broadcast...

2014-10-24 Thread shivaram
Github user shivaram commented on a diff in the pull request:

https://github.com/apache/spark/pull/2922#discussion_r19373743
  
--- Diff: core/src/main/scala/org/apache/spark/broadcast/Broadcast.scala ---
@@ -87,10 +91,13 @@ abstract class Broadcast[T: ClassTag](val id: Long) 
extends Serializable {
   /**
* Destroy all data and metadata related to this broadcast variable. Use 
this with caution;
* once a broadcast variable has been destroyed, it cannot be used again.
+   * @param blocking Whether to block until destroy has completed
*/
-  private[spark] def destroy(blocking: Boolean) {
+  def destroy(blocking: Boolean) {
--- End diff --

No - I actually prefer the synchronous version in my applications. I will 
add another destroy() which is always synchronous and make it public.


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastruct...@apache.org or file a JIRA ticket
with INFRA.
---

-
To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org
For additional commands, e-mail: reviews-h...@spark.apache.org



[GitHub] spark pull request: [SPARK-4030] Make destroy public for broadcast...

2014-10-24 Thread shivaram
Github user shivaram commented on a diff in the pull request:

https://github.com/apache/spark/pull/2922#discussion_r19373771
  
--- Diff: core/src/main/scala/org/apache/spark/broadcast/Broadcast.scala ---
@@ -60,6 +62,8 @@ abstract class Broadcast[T: ClassTag](val id: Long) 
extends Serializable {
*/
   @volatile private var _isValid = true
 
+  private var _destroySite = 
--- End diff --

@pwendell requested this in the JIRA -- The main reason is that we'd like 
to make it easy to debug if users call `destroy` by mistake. 


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastruct...@apache.org or file a JIRA ticket
with INFRA.
---

-
To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org
For additional commands, e-mail: reviews-h...@spark.apache.org