[GitHub] spark pull request: [SPARK-2315] Implement drop, dropRight and dro...

2015-06-13 Thread erikerlandson
Github user erikerlandson commented on the pull request:

https://github.com/apache/spark/pull/1839#issuecomment-111766746
  
@JoshRosen Yes, that's fine.   I'll ping @willb about listing silex on 
spark-packages.org.


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastruct...@apache.org or file a JIRA ticket
with INFRA.
---

-
To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org
For additional commands, e-mail: reviews-h...@spark.apache.org



[GitHub] spark pull request: [SPARK-2315] Implement drop, dropRight and dro...

2015-06-13 Thread erikerlandson
Github user erikerlandson closed the pull request at:

https://github.com/apache/spark/pull/1839





[GitHub] spark pull request: [SPARK-2315] Implement drop, dropRight and dro...

2015-06-13 Thread cloud-fan
Github user cloud-fan commented on a diff in the pull request:

https://github.com/apache/spark/pull/1839#discussion_r32375677
  
--- Diff: core/src/main/scala/org/apache/spark/rdd/DropRDDFunctions.scala ---
@@ -0,0 +1,172 @@
+/*
+ * Licensed to the Apache Software Foundation (ASF) under one or more
+ * contributor license agreements.  See the NOTICE file distributed with
+ * this work for additional information regarding copyright ownership.
+ * The ASF licenses this file to You under the Apache License, Version 2.0
+ * (the "License"); you may not use this file except in compliance with
+ * the License.  You may obtain a copy of the License at
+ *
+ *    http://www.apache.org/licenses/LICENSE-2.0
+ *
+ * Unless required by applicable law or agreed to in writing, software
+ * distributed under the License is distributed on an "AS IS" BASIS,
+ * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+ * See the License for the specific language governing permissions and
+ * limitations under the License.
+ */
+
+package org.apache.spark.rdd
+
+import scala.reflect.ClassTag
+
+import org.apache.spark.{SparkContext, Logging, Partition, TaskContext}
+import org.apache.spark.{Dependency, NarrowDependency, OneToOneDependency}
+
+import org.apache.spark.SparkContext.rddToPromiseRDDFunctions
+
+
+private [spark]
+class FanInDep[T: ClassTag](rdd: RDD[T]) extends NarrowDependency[T](rdd) {
+  // Assuming parent RDD type having only one partition
+  override def getParents(pid: Int) = List(0)
+}
+
+
+/**
+ * Extra functions available on RDDs for providing the RDD analogs of Scala drop,
+ * dropRight and dropWhile, which return an RDD as a result
+ */
+class DropRDDFunctions[T : ClassTag](self: RDD[T]) extends Logging with Serializable {
+
+  /**
+   * Return a new RDD formed by dropping the first (n) elements of the input RDD
+   */
+  def drop(n: Int):RDD[T] = {
+    if (n <= 0) return self
+
+    // locate partition that includes the nth element
+    val locate = (partitions: Array[Partition], input: RDD[T], ctx: TaskContext) => {
+      var rem = n
+      var p = 0
+      var np = 0
+      while (rem > 0 && p < partitions.length) {
+        np = input.iterator(partitions(p), ctx).length
--- End diff --

Is it really lazy? I think computation will happen here.
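
To illustrate the concern: calling `length` on a Scala iterator walks every element, so the work happens at whatever point that call actually executes. The sketch below is plain Scala with no Spark involved, and `IteratorLengthDemo`, `source` and `countLater` are hypothetical names. It shows only that wrapping the call in a function value defers it until the function is invoked; it does not settle when the `locate` closure in this PR is invoked.

```scala
object IteratorLengthDemo {
  // Counts how many elements have actually been produced so far.
  var produced = 0

  // A lazily evaluated iterator with a visible side effect per element.
  def source(): Iterator[Int] = Iterator.tabulate(5) { i => produced += 1; i }

  def main(args: Array[String]): Unit = {
    // Wrapping the call in a function value defers it: nothing runs yet.
    val countLater: () => Int = () => source().length
    println(s"after defining the closure: produced = $produced")

    // Invoking the closure calls length, which traverses the whole iterator.
    val n = countLater()
    println(s"after invoking it: n = $n, produced = $produced")
  }
}
```

Whether `drop` is lazy therefore depends on whether the closure runs when the RDD is defined or only inside a task when a job executes.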





[GitHub] spark pull request: [SPARK-2315] Implement drop, dropRight and dro...

2015-06-13 Thread JoshRosen
Github user JoshRosen commented on the pull request:

https://github.com/apache/spark/pull/1839#issuecomment-111742946
  
Hey @erikerlandson, since I don't think we're going to merge this 
functionality into core right now, do you mind closing this issue?  BTW, it 
would be cool to list Silex on http://spark-packages.org, since that would put 
the library in front of a lot more users / eyeballs.





[GitHub] spark pull request: [SPARK-2315] Implement drop, dropRight and dro...

2015-06-05 Thread erikerlandson
Github user erikerlandson commented on the pull request:

https://github.com/apache/spark/pull/1839#issuecomment-109312395
  
@AlexNisnevich 
drop, dropRight and dropWhile are now available on the silex project:

http://silex.freevariable.com/latest/api/#com.redhat.et.silex.rdd.drop.DropRDDFunctions






[GitHub] spark pull request: [SPARK-2315] Implement drop, dropRight and dro...

2015-04-22 Thread AlexNisnevich
Github user AlexNisnevich commented on the pull request:

https://github.com/apache/spark/pull/1839#issuecomment-95293840
  
Have any admins verified this patch? `drop` functionality in RDDs would be 
very useful to have.





[GitHub] spark pull request: [SPARK-2315] Implement drop, dropRight and dro...

2015-04-22 Thread JoshRosen
Github user JoshRosen commented on the pull request:

https://github.com/apache/spark/pull/1839#issuecomment-95299921
  
@erikerlandson What do you think about releasing this (and maybe #1909) as 
a library on Maven or http://spark-packages.org?  I'm not sure that this is an 
API that we necessarily want to put in core yet, but if you publish it as a 
package then folks would be able to use it with their existing Spark 
deployments without having to upgrade.  The interface for users could still be 
pretty nice: just add an implicit class / object or set of implicit 
conversions, then have users import that.

Spark Packages has a helpful command line tool for creating a project 
template, which might be a timesaver if you decide to go this route: 
http://spark-packages.org/package/databricks/spark-package-cmd-tool.
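
As a rough sketch of the "implicit class, then have users import that" idea, the pattern could look like the following. It enriches a plain `Seq` rather than an RDD so that it runs without Spark; `DropSyntax`, `DropOps` and `dropFirst` are hypothetical names and not the API of this PR or of any published package.

```scala
object DropSyntax {
  // Hypothetical enrichment: adds dropFirst to any Seq once imported.
  implicit class DropOps[T](val self: Seq[T]) extends AnyVal {
    def dropFirst(n: Int): Seq[T] = if (n <= 0) self else self.drop(n)
  }
}

object DropSyntaxDemo {
  import DropSyntax._ // the single import a user would add

  def main(args: Array[String]): Unit = {
    // The extra method is available without modifying Seq itself.
    println(Seq("header", "a", "b").dropFirst(1))
  }
}
```

A library built this way leaves the core classes untouched, which is why it can ship separately and work with an existing deployment.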





[GitHub] spark pull request: [SPARK-2315] Implement drop, dropRight and dro...

2015-04-22 Thread erikerlandson
Github user erikerlandson commented on the pull request:

https://github.com/apache/spark/pull/1839#issuecomment-95311919
  
Hi @JoshRosen, publishing some of these odds and ends in some form has been 
on my to-do list for a while.  If there's interest, I can bump it up in 
priority.





[GitHub] spark pull request: [SPARK-2315] Implement drop, dropRight and dro...

2015-04-22 Thread AlexNisnevich
Github user AlexNisnevich commented on the pull request:

https://github.com/apache/spark/pull/1839#issuecomment-95328807
  
@JoshRosen @erikerlandson That would be great.





[GitHub] spark pull request: [SPARK-2315] Implement drop, dropRight and dro...

2014-10-30 Thread SparkQA
Github user SparkQA commented on the pull request:

https://github.com/apache/spark/pull/1839#issuecomment-61194329
  
  [Test build #22578 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/22578/consoleFull) for PR 1839 at commit [`af73e1f`](https://github.com/apache/spark/commit/af73e1f3ffab0909acaebdca154889030f1187f7).
 * This patch merges cleanly.





[GitHub] spark pull request: [SPARK-2315] Implement drop, dropRight and dro...

2014-10-30 Thread SparkQA
Github user SparkQA commented on the pull request:

https://github.com/apache/spark/pull/1839#issuecomment-61201640
  
  [Test build #22578 has finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/22578/consoleFull) for PR 1839 at commit [`af73e1f`](https://github.com/apache/spark/commit/af73e1f3ffab0909acaebdca154889030f1187f7).
 * This patch **passes all tests**.
 * This patch merges cleanly.
 * This patch adds the following public classes _(experimental)_:
  * `class FanInDep[T: ClassTag](rdd: RDD[T]) extends NarrowDependency[T](rdd) `
  * `class DropRDDFunctions[T : ClassTag](self: RDD[T]) extends Logging with Serializable `
  * `class FanOutDep[T: ClassTag](rdd: RDD[T]) extends NarrowDependency[T](rdd) `
  * `class PromisePartition extends Partition `
  * `class PromiseRDD[V: ClassTag](expr: => (TaskContext => V),`
  * `class PromiseArgPartition(p: Partition, argv: Seq[PromiseRDD[_]]) extends Partition `
  * `class PromiseRDDFunctions[T : ClassTag](self: RDD[T]) extends Logging with Serializable `






[GitHub] spark pull request: [SPARK-2315] Implement drop, dropRight and dro...

2014-10-30 Thread AmplabJenkins
Github user AmplabJenkins commented on the pull request:

https://github.com/apache/spark/pull/1839#issuecomment-61201646
  
Test PASSed.
Refer to this link for build results (access rights to CI server needed): 
https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/22578/
Test PASSed.





[GitHub] spark pull request: [SPARK-2315] Implement drop, dropRight and dro...

2014-10-11 Thread SparkQA
Github user SparkQA commented on the pull request:

https://github.com/apache/spark/pull/1839#issuecomment-58765937
  
  [QA tests have started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/21648/consoleFull) for PR 1839 at commit [`af73e1f`](https://github.com/apache/spark/commit/af73e1f3ffab0909acaebdca154889030f1187f7).
 * This patch merges cleanly.





[GitHub] spark pull request: [SPARK-2315] Implement drop, dropRight and dro...

2014-10-11 Thread AmplabJenkins
Github user AmplabJenkins commented on the pull request:

https://github.com/apache/spark/pull/1839#issuecomment-58767530
  
Test PASSed.
Refer to this link for build results (access rights to CI server needed): 
https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/21648/
Test PASSed.





[GitHub] spark pull request: [SPARK-2315] Implement drop, dropRight and dro...

2014-10-11 Thread SparkQA
Github user SparkQA commented on the pull request:

https://github.com/apache/spark/pull/1839#issuecomment-58767527
  
  [QA tests have finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/21648/consoleFull) for PR 1839 at commit [`af73e1f`](https://github.com/apache/spark/commit/af73e1f3ffab0909acaebdca154889030f1187f7).
 * This patch **passes all tests**.
 * This patch merges cleanly.
 * This patch adds the following public classes _(experimental)_:
  * `class FanInDep[T: ClassTag](rdd: RDD[T]) extends NarrowDependency[T](rdd) `
  * `class DropRDDFunctions[T : ClassTag](self: RDD[T]) extends Logging with Serializable `
  * `class FanOutDep[T: ClassTag](rdd: RDD[T]) extends NarrowDependency[T](rdd) `
  * `class PromisePartition extends Partition `
  * `class PromiseRDD[V: ClassTag](expr: => (TaskContext => V),`
  * `class PromiseArgPartition(p: Partition, argv: Seq[PromiseRDD[_]]) extends Partition `
  * `class PromiseRDDFunctions[T : ClassTag](self: RDD[T]) extends Logging with Serializable `






[GitHub] spark pull request: [SPARK-2315] Implement drop, dropRight and dro...

2014-09-05 Thread SparkQA
Github user SparkQA commented on the pull request:

https://github.com/apache/spark/pull/1839#issuecomment-54694497
  
Can one of the admins verify this patch?





[GitHub] spark pull request: [SPARK-2315] Implement drop, dropRight and dro...

2014-08-11 Thread erikerlandson
Github user erikerlandson commented on the pull request:

https://github.com/apache/spark/pull/1839#issuecomment-51806430
  
Assuming this is correct, 'okay' is not the same as 'ok':

 The following regex checks that: .*ok\W+to\W+test.* 
 So I think you should be able to use it in a sentence or whatever. 


https://groups.google.com/forum/#!msg/quicksilver---development/Bn7RPYqAfTI/cQ-_u1BbMEQJ
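
A quick way to check the quoted pattern against the two phrasings tried in this thread is sketched below. It assumes the pattern is applied as a Java-style regular expression that must match the whole comment, which the quote does not confirm; `OkToTestRegexDemo` and `triggers` are hypothetical names.

```scala
object OkToTestRegexDemo {
  // Pattern quoted above; Java regex syntax and a full match are assumptions.
  val pattern = """.*ok\W+to\W+test.*"""

  // True when the comment would satisfy the pattern.
  def triggers(comment: String): Boolean = comment.matches(pattern)

  def main(args: Array[String]): Unit = {
    println(triggers("Jenkins, this is ok to test."))   // "ok" followed by a non-word character
    println(triggers("Jenkins, this is okay to test.")) // "ok" followed by "ay", so \W+ cannot match
  }
}
```

Under those assumptions the pattern needs a non-word character directly after "ok", so "okay to test" does not satisfy it.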






[GitHub] spark pull request: [SPARK-2315] Implement drop, dropRight and dro...

2014-08-11 Thread JoshRosen
Github user JoshRosen commented on the pull request:

https://github.com/apache/spark/pull/1839#issuecomment-51825586
  
Jenkins, this is ok to test.  Jenkins, test this please.





[GitHub] spark pull request: [SPARK-2315] Implement drop, dropRight and dro...

2014-08-10 Thread erikerlandson
Github user erikerlandson commented on the pull request:

https://github.com/apache/spark/pull/1839#issuecomment-51720727
  
Jenkins is still not getting the memo. How strict is Jenkins with commands? 
Is 'okay' the same as 'ok'?





[GitHub] spark pull request: [SPARK-2315] Implement drop, dropRight and dro...

2014-08-07 Thread erikerlandson
Github user erikerlandson closed the pull request at:

https://github.com/apache/spark/pull/1254





[GitHub] spark pull request: [SPARK-2315] Implement drop, dropRight and dro...

2014-08-07 Thread erikerlandson
Github user erikerlandson commented on the pull request:

https://github.com/apache/spark/pull/1839#issuecomment-51496145
  
This is a reboot of:
https://github.com/apache/spark/pull/1254





[GitHub] spark pull request: [SPARK-2315] Implement drop, dropRight and dro...

2014-08-07 Thread AmplabJenkins
Github user AmplabJenkins commented on the pull request:

https://github.com/apache/spark/pull/1839#issuecomment-51496120
  
Can one of the admins verify this patch?





[GitHub] spark pull request: [SPARK-2315] Implement drop, dropRight and dro...

2014-08-07 Thread concretevitamin
Github user concretevitamin commented on the pull request:

https://github.com/apache/spark/pull/1839#issuecomment-51520540
  
Jenkins, this is okay to test.





[GitHub] spark pull request: [SPARK-2315] Implement drop, dropRight and dro...

2014-08-04 Thread erikerlandson
Github user erikerlandson commented on the pull request:

https://github.com/apache/spark/pull/1254#issuecomment-51142701
  
Should I consider creating a fresh PR, or is there some better way to get 
Jenkins to test?





[GitHub] spark pull request: [SPARK-2315] Implement drop, dropRight and dro...

2014-08-04 Thread rxin
Github user rxin commented on the pull request:

https://github.com/apache/spark/pull/1254#issuecomment-51142915
  
I'm not sure what's happening. Maybe Jenkins is lazy today. We can retry 
tomorrow, and if it doesn't work, create a new PR.
 





[GitHub] spark pull request: [SPARK-2315] Implement drop, dropRight and dro...

2014-08-02 Thread erikerlandson
Github user erikerlandson commented on the pull request:

https://github.com/apache/spark/pull/1254#issuecomment-50966402
  
Starting to worry I confused it by pushing the PR branch using '+'





[GitHub] spark pull request: [SPARK-2315] Implement drop, dropRight and dro...

2014-08-01 Thread erikerlandson
Github user erikerlandson commented on the pull request:

https://github.com/apache/spark/pull/1254#issuecomment-50905932
  
Jenkins appears to be AWOL.




[GitHub] spark pull request: [SPARK-2315] Implement drop, dropRight and dro...

2014-08-01 Thread JoshRosen
Github user JoshRosen commented on the pull request:

https://github.com/apache/spark/pull/1254#issuecomment-50906403
  
Let me give it a try:

Jenkins, this is ok to test.

Jenkins, retest this please.




[GitHub] spark pull request: [SPARK-2315] Implement drop, dropRight and dro...

2014-07-31 Thread erikerlandson
Github user erikerlandson commented on the pull request:

https://github.com/apache/spark/pull/1254#issuecomment-50759649
  
O Jenkins Where Art Thou?




[GitHub] spark pull request: [SPARK-2315] Implement drop, dropRight and dro...

2014-07-30 Thread erikerlandson
Github user erikerlandson commented on the pull request:

https://github.com/apache/spark/pull/1254#issuecomment-50648091
  
Should Jenkins run an automatic build on PR update?




[GitHub] spark pull request: [SPARK-2315] Implement drop, dropRight and dro...

2014-07-30 Thread rxin
Github user rxin commented on the pull request:

https://github.com/apache/spark/pull/1254#issuecomment-50673706
  
Jenkins, test this please.




[GitHub] spark pull request: [SPARK-2315] Implement drop, dropRight and dro...

2014-07-29 Thread erikerlandson
Github user erikerlandson commented on the pull request:

https://github.com/apache/spark/pull/1254#issuecomment-50554859
  
I updated this PR so that drop(), dropRight() and dropWhile() are now lazy 
transforms.  A description of what I did is here:

http://erikerlandson.github.io/blog/2014/07/29/deferring-spark-actions-to-lazy-transforms-with-the-promise-rdd/






[GitHub] spark pull request: [SPARK-2315] Implement drop, dropRight and dro...

2014-07-21 Thread jayunit100
Github user jayunit100 commented on the pull request:

https://github.com/apache/spark/pull/1254#issuecomment-49602573
  
Adding the drop function to a contrib library of functions (which requires 
a manual import), as Erik suggests, seems like a really good option. I could 
see such a contrib library also being useful for other esoteric but 
nevertheless important tasks, like dealing with binary data formats.




[GitHub] spark pull request: [SPARK-2315] Implement drop, dropRight and dro...

2014-06-28 Thread rxin
Github user rxin commented on the pull request:

https://github.com/apache/spark/pull/1254#issuecomment-47419953
  
Thanks - I can see why this might be useful, but it is a pretty high bar 
now to add new APIs to the RDD interface, and we need to be very careful about 
APIs that might have very bad performance behaviors (dropping a large number 
can be very slow, in particular if it crosses many partitions). 

For this reason, it might make more sense for this to be an example program 
or a blog post that's easily indexable, so people can find it. 




[GitHub] spark pull request: [SPARK-2315] Implement drop, dropRight and dro...

2014-06-28 Thread rxin
Github user rxin commented on the pull request:

https://github.com/apache/spark/pull/1254#issuecomment-47420008
  
BTW it is just my personal opinion. Feel free to debate or find support :)




[GitHub] spark pull request: [SPARK-2315] Implement drop, dropRight and dro...

2014-06-28 Thread erikerlandson
Github user erikerlandson commented on the pull request:

https://github.com/apache/spark/pull/1254#issuecomment-47420288
  
My reasoning is that most use cases (or at least the ones I had in mind) 
are something like rdd.drop(n), where n is much smaller than rdd.count(), 
generally 1 or some other small number. FWIW, I implemented it via an 
implicit object, so it's not directly on the RDD class per se. Another way to 
look at it: these functions aren't worse than rdd.take(), as they use similar 
logic.

However, it's true that if (n) is a large fraction of the size of the RDD, 
then it will invoke computation of a large fraction of the partitions.




[GitHub] spark pull request: [SPARK-2315] Implement drop, dropRight and dro...

2014-06-28 Thread rxin
Github user rxin commented on the pull request:

https://github.com/apache/spark/pull/1254#issuecomment-47420413
  
The thing is we must scan data twice to make sure this actually works 
(because we need to verify the number of partitions we checked is sufficient). 
Usually a user's specific use case can be solved with a very simple workaround 
despite the lack of RDD.drop (e.g. for CSV files with a header that you want to 
drop, you can just drop it in the first partition using a drop within 
mapPartitions). 
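
A minimal sketch of that workaround follows. To stay runnable without a cluster it models an RDD as a plain Vector of partitions; `Partitions`, the local `mapPartitionsWithIndex` helper, and `dropHeader` are hypothetical names for illustration. On a real RDD the same closure would be passed to `RDD.mapPartitionsWithIndex`.

```scala
object HeaderDropSketch {
  // Hypothetical stand-in for an RDD: each inner Vector is one partition.
  type Partitions[T] = Vector[Vector[T]]

  // Mirrors the shape of RDD.mapPartitionsWithIndex: the function receives
  // the partition index and an iterator over that partition's elements.
  def mapPartitionsWithIndex[T, U](parts: Partitions[T])(
      f: (Int, Iterator[T]) => Iterator[U]): Partitions[U] =
    parts.zipWithIndex.map { case (p, i) => f(i, p.iterator).toVector }

  // Drop the header only in partition 0; other partitions pass through.
  def dropHeader(parts: Partitions[String], headerLines: Int): Partitions[String] =
    mapPartitionsWithIndex(parts) { (idx, iter) =>
      if (idx == 0) iter.drop(headerLines) else iter
    }
}
```

Note that this only works when the whole header sits in the first partition; a header that spills into a second partition would not be fully removed.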




[GitHub] spark pull request: [SPARK-2315] Implement drop, dropRight and dro...

2014-06-28 Thread erikerlandson
Github user erikerlandson commented on the pull request:

https://github.com/apache/spark/pull/1254#issuecomment-47426628
  
It will scan one partition twice:  the one containing the boundary 
between things dropped and not-dropped.   Any partitions prior to that boundary 
are ignored by the resulting RDD (so they are scanned once), and any partitions 
after the boundary are not examined unless/until the result RDD is evaluated.
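
The boundary idea can be sketched in plain Scala. This is an illustration only, not the code in the PR: it assumes the per-partition element counts are already known (`partitionSizes`), whereas a real job would only learn them by scanning. `Boundary` and `locate` are hypothetical names.

```scala
object DropBoundarySketch {
  // Which partition holds the boundary, and how many elements still have
  // to be dropped from the front of that partition.
  final case class Boundary(partition: Int, dropInPartition: Int)

  // partitionSizes(i) is the element count of partition i, in order.
  // Returns None when n covers every element, so nothing survives.
  def locate(partitionSizes: Seq[Int], n: Int): Option[Boundary] = {
    var remaining = math.max(n, 0)
    var idx = 0
    // Walk partitions front to back, skipping each one that is dropped whole.
    while (idx < partitionSizes.length && remaining >= partitionSizes(idx)) {
      remaining -= partitionSizes(idx)
      idx += 1
    }
    if (idx < partitionSizes.length) Some(Boundary(idx, remaining)) else None
  }
}
```

For example, with partition sizes 3, 3, 3 and n = 4, the boundary is partition 1 with one element left to drop; with n = 1 it is partition 0, so no later partition needs to be examined.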




[GitHub] spark pull request: [SPARK-2315] Implement drop, dropRight and dro...

2014-06-28 Thread erikerlandson
Github user erikerlandson commented on the pull request:

https://github.com/apache/spark/pull/1254#issuecomment-47426817
  
Tangentially, one thing I noticed is that currently all the 
XxxRDDFunctions implicits are automatically defined in SparkContext, and so I 
held to that pattern in this PR. However, another option might be to not 
automatically define it, and a user would import DropRDDFunctions for 
themselves if they wanted to use drop methods.

In fact, that seems like a good pattern generally for reducing unneeded 
imports; one might say the same thing for OrderedRDDFunctions, etc:  import 
XxxRDDFunctions if you need it.
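
A rough sketch of the opt-in pattern, in plain Scala so it runs without Spark. The names `DropFunctions`, `DropOps` and `dropAcross` are hypothetical, and a Vector of partitions stands in for an RDD; the point is only that the extension method is invisible until the user imports it.

```scala
object OptInImplicitSketch {
  // Hypothetical enrichment holder; nothing here is in scope until imported.
  object DropFunctions {
    implicit class DropOps[T](private val parts: Vector[Vector[T]]) {
      // Drop the first n elements across partitions, keeping partition layout.
      def dropAcross(n: Int): Vector[Vector[T]] = {
        var remaining = math.max(n, 0)
        parts.map { p =>
          val k = math.min(remaining, p.length)
          remaining -= k
          p.drop(k)
        }
      }
    }
  }

  def demo(): Vector[Vector[Int]] = {
    import DropFunctions._ // the explicit, user-chosen import
    Vector(Vector(1, 2), Vector(3, 4)).dropAcross(3)
  }
}
```

Without the `import DropFunctions._` line, the call to `dropAcross` would not compile, which is the trade-off being proposed: less implicit scope by default, one extra import for users who want the methods.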




[GitHub] spark pull request: [SPARK-2315] Implement drop, dropRight and dro...

2014-06-28 Thread erikerlandson
Github user erikerlandson commented on the pull request:

https://github.com/apache/spark/pull/1254#issuecomment-47428789
  
Note, in a typical case where one is invoking something like rdd.drop(1), 
or other small number, only one partition gets evaluated by drop - the first 
one.





[GitHub] spark pull request: [SPARK-2315] Implement drop, dropRight and dro...

2014-06-28 Thread erikerlandson
Github user erikerlandson commented on the pull request:

https://github.com/apache/spark/pull/1254#issuecomment-47429672
  
I also envision typical use cases as being either pre- or post-processing.  
 That is, not something that would often appear inside a tight loop.




[GitHub] spark pull request: [SPARK-2315] Implement drop, dropRight and dro...

2014-06-27 Thread erikerlandson
GitHub user erikerlandson opened a pull request:

https://github.com/apache/spark/pull/1254

[SPARK-2315] Implement drop, dropRight and dropWhile for RDDs

drop, dropRight and dropWhile methods for RDDs that return a new RDD as the 
result.

// example: load in some text and skip header lines
val txt = sc.textFile("data_with_header.txt")
val data = txt.drop(3)
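
For intuition, and assuming the three methods are meant to mirror the like-named Scala collection methods (the description above does not spell this out), the intended semantics can be seen on an ordinary Vector. The sample data is made up for illustration.

```scala
object DropSemanticsSketch {
  val lines = Vector("# header 1", "# header 2", "# header 3", "a", "b", "c")

  val dropped      = lines.drop(3)                       // skip the first 3 elements
  val droppedRight = lines.dropRight(1)                  // skip the last element
  val droppedWhile = lines.dropWhile(_.startsWith("#"))  // skip the leading run of matches
}
```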


You can merge this pull request into a Git repository by running:

$ git pull https://github.com/erikerlandson/spark rdd_drop_master

Alternatively you can review and apply these changes as the patch at:

https://github.com/apache/spark/pull/1254.patch

To close this pull request, make a commit to your master/trunk branch
with (at least) the following in the commit message:

This closes #1254


commit aa3c87984907d26b626dcc1e7c356d642147e840
Author: Erik Erlandson eerla...@redhat.com
Date:   2014-06-28T01:06:35Z

[SPARK-2315] Implement drop, dropRight and dropWhile for RDDs, which
take RDD as input and return new RDD with elements dropped.






[GitHub] spark pull request: [SPARK-2315] Implement drop, dropRight and dro...

2014-06-27 Thread AmplabJenkins
Github user AmplabJenkins commented on the pull request:

https://github.com/apache/spark/pull/1254#issuecomment-47418701
  
Can one of the admins verify this patch?

