[GitHub] spark pull request: [SPARK-5654] Integrate SparkR

2015-04-09 Thread asfgit
Github user asfgit closed the pull request at:

https://github.com/apache/spark/pull/5096





[GitHub] spark pull request: [SPARK-5654] Integrate SparkR

2015-04-08 Thread concretevitamin
Github user concretevitamin commented on the pull request:

https://github.com/apache/spark/pull/5096#issuecomment-91123979
  
:+1: 





[GitHub] spark pull request: [SPARK-5654] Integrate SparkR

2015-04-08 Thread shivaram
Github user shivaram commented on the pull request:

https://github.com/apache/spark/pull/5096#issuecomment-91121381
  
Thanks @andrewor14 @pwendell for the reviews. Now that Jenkins is happy I
am going to merge this in, and I'll file follow-up issues for things like YARN
cluster mode which we didn't get to in this PR.





[GitHub] spark pull request: [SPARK-5654] Integrate SparkR

2015-04-08 Thread SparkQA
Github user SparkQA commented on the pull request:

https://github.com/apache/spark/pull/5096#issuecomment-91121091
  
  [Test build #29919 has finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/29919/consoleFull) for PR 5096 at commit [`da64742`](https://github.com/apache/spark/commit/da64742dc1543346623acc420beac209c0c951ce).
 * This patch **passes all tests**.
 * This patch merges cleanly.
 * This patch adds no public classes.
 * This patch does not change any dependencies.





[GitHub] spark pull request: [SPARK-5654] Integrate SparkR

2015-04-08 Thread AmplabJenkins
Github user AmplabJenkins commented on the pull request:

https://github.com/apache/spark/pull/5096#issuecomment-91121106
  
Test PASSed.
Refer to this link for build results (access rights to CI server needed): 
https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/29919/
Test PASSed.





[GitHub] spark pull request: [SPARK-5654] Integrate SparkR

2015-04-08 Thread SparkQA
Github user SparkQA commented on the pull request:

https://github.com/apache/spark/pull/5096#issuecomment-91104742
  
  [Test build #29919 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/29919/consoleFull) for PR 5096 at commit [`da64742`](https://github.com/apache/spark/commit/da64742dc1543346623acc420beac209c0c951ce).





[GitHub] spark pull request: [SPARK-5654] Integrate SparkR

2015-04-08 Thread shivaram
Github user shivaram commented on the pull request:

https://github.com/apache/spark/pull/5096#issuecomment-91088926
  
@pwendell It's around 2 minutes on my laptop. Here is the output on my 
machine:
```
time ./run-tests.sh


./run-tests.sh  1:56.96 total
```





[GitHub] spark pull request: [SPARK-5654] Integrate SparkR

2015-04-08 Thread pwendell
Github user pwendell commented on the pull request:

https://github.com/apache/spark/pull/5096#issuecomment-91084853
  
@shivaram - hey, one thing I forgot to ask: how much time do the SparkR 
tests add to the overall Spark test run?





[GitHub] spark pull request: [SPARK-5654] Integrate SparkR

2015-04-08 Thread SparkQA
Github user SparkQA commented on the pull request:

https://github.com/apache/spark/pull/5096#issuecomment-91080612
  
  [Test build #29908 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/29908/consoleFull) for PR 5096 at commit [`bac3a6b`](https://github.com/apache/spark/commit/bac3a6bc05fecca9d7ebb3e544b2edcfdca1c50d).





[GitHub] spark pull request: [SPARK-5654] Integrate SparkR

2015-04-08 Thread AmplabJenkins
Github user AmplabJenkins commented on the pull request:

https://github.com/apache/spark/pull/5096#issuecomment-91070385
  
Test FAILed.
Refer to this link for build results (access rights to CI server needed): 
https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/29894/
Test FAILed.





[GitHub] spark pull request: [SPARK-5654] Integrate SparkR

2015-04-08 Thread SparkQA
Github user SparkQA commented on the pull request:

https://github.com/apache/spark/pull/5096#issuecomment-91070376
  
  [Test build #29894 has finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/29894/consoleFull) for PR 5096 at commit [`59266d1`](https://github.com/apache/spark/commit/59266d14416a614d900447788806f958ab1088f9).
 * This patch **fails SparkR unit tests**.
 * This patch merges cleanly.
 * This patch adds no public classes.
 * This patch does not change any dependencies.





[GitHub] spark pull request: [SPARK-5654] Integrate SparkR

2015-04-08 Thread JoshRosen
Github user JoshRosen commented on the pull request:

https://github.com/apache/spark/pull/5096#issuecomment-91052419
  
> also, i can't believe how long this build is... sad panda etc.

Test parallelization is going to be a lot of work, but I think we could see 
huge speedups for the pull request builders if we didn't run all tests for 
every PR.  Most PRs touch the higher-level libraries and not core, so it should 
be safe to skip most of the tests if core hasn't been modified.
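
A hypothetical sketch of that idea in Scala (the module names and path prefixes are illustrative assumptions, not Spark's actual build tooling):

```
// Map the files a PR touches to the test modules worth running.
// A change under core/ invalidates everything; otherwise only run
// the modules whose directories were actually touched.
def modulesToTest(changedFiles: Seq[String]): Set[String] = {
  val allModules = Set("core", "sql", "mllib", "streaming", "R")
  if (changedFiles.exists(_.startsWith("core/"))) {
    allModules
  } else {
    allModules.filter(m => changedFiles.exists(_.startsWith(m + "/")))
  }
}
```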





[GitHub] spark pull request: [SPARK-5654] Integrate SparkR

2015-04-08 Thread SparkQA
Github user SparkQA commented on the pull request:

https://github.com/apache/spark/pull/5096#issuecomment-91042724
  
  [Test build #29894 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/29894/consoleFull) for PR 5096 at commit [`59266d1`](https://github.com/apache/spark/commit/59266d14416a614d900447788806f958ab1088f9).





[GitHub] spark pull request: [SPARK-5654] Integrate SparkR

2015-04-08 Thread shaneknapp
Github user shaneknapp commented on the pull request:

https://github.com/apache/spark/pull/5096#issuecomment-91041984
  
also, i can't believe how long this build is...  sad panda etc.





[GitHub] spark pull request: [SPARK-5654] Integrate SparkR

2015-04-08 Thread shaneknapp
Github user shaneknapp commented on the pull request:

https://github.com/apache/spark/pull/5096#issuecomment-91041894
  
jenkins, test this please





[GitHub] spark pull request: [SPARK-5654] Integrate SparkR

2015-04-08 Thread shivaram
Github user shivaram commented on the pull request:

https://github.com/apache/spark/pull/5096#issuecomment-91041394
  
Thanks @shaneknapp! Could you re-trigger this build once it's upped?





[GitHub] spark pull request: [SPARK-5654] Integrate SparkR

2015-04-08 Thread shaneknapp
Github user shaneknapp commented on the pull request:

https://github.com/apache/spark/pull/5096#issuecomment-91040738
  
i'll up it to 180, just so we have some headroom.







[GitHub] spark pull request: [SPARK-5654] Integrate SparkR

2015-04-08 Thread shivaram
Github user shivaram commented on the pull request:

https://github.com/apache/spark/pull/5096#issuecomment-91040259
  
@brennonyork The overall Jenkins build runner has a timeout of 130 minutes 
right now (cc @shaneknapp). So all the RAT tests, Mima checks, style checks, 
and new dependency checks, plus all the unit tests, have to run within 130 
minutes, and this PR seems to be failing that.

@shaneknapp can we increase the 130 min timeout to, say, 140 minutes?





[GitHub] spark pull request: [SPARK-5654] Integrate SparkR

2015-04-08 Thread andrewor14
Github user andrewor14 commented on the pull request:

https://github.com/apache/spark/pull/5096#issuecomment-91039248
  
SparkSubmit parts LGTM. We should merge this soon so people can start 
testing this well in advance of the release window.





[GitHub] spark pull request: [SPARK-5654] Integrate SparkR

2015-04-08 Thread andrewor14
Github user andrewor14 commented on a diff in the pull request:

https://github.com/apache/spark/pull/5096#discussion_r28013923
  
--- Diff: core/src/main/scala/org/apache/spark/api/r/SerDe.scala ---
@@ -0,0 +1,341 @@
+/*
+ * Licensed to the Apache Software Foundation (ASF) under one or more
+ * contributor license agreements.  See the NOTICE file distributed with
+ * this work for additional information regarding copyright ownership.
+ * The ASF licenses this file to You under the Apache License, Version 2.0
+ * (the "License"); you may not use this file except in compliance with
+ * the License.  You may obtain a copy of the License at
+ *
+ *    http://www.apache.org/licenses/LICENSE-2.0
+ *
+ * Unless required by applicable law or agreed to in writing, software
+ * distributed under the License is distributed on an "AS IS" BASIS,
+ * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+ * See the License for the specific language governing permissions and
+ * limitations under the License.
+ */
+
+package org.apache.spark.api.r
+
+import java.io.{DataInputStream, DataOutputStream}
+import java.sql.{Date, Time}
+
+import scala.collection.JavaConversions._
+
+/**
+ * Utility functions to serialize, deserialize objects to / from R
+ */
+private[spark] object SerDe {
+
+  // Type mapping from R to Java
+  //
+  // NULL -> void
+  // integer -> Int
+  // character -> String
+  // logical -> Boolean
+  // double, numeric -> Double
+  // raw -> Array[Byte]
+  // Date -> Date
+  // POSIXlt/POSIXct -> Time
+  //
+  // list[T] -> Array[T], where T is one of above mentioned types
+  // environment -> Map[String, T], where T is a native type
+  // jobj -> Object, where jobj is an object created in the backend
+
+  def readObjectType(dis: DataInputStream): Char = {
+    dis.readByte().toChar
+  }
+
+  def readObject(dis: DataInputStream): Object = {
+    val dataType = readObjectType(dis)
+    readTypedObject(dis, dataType)
+  }
+
+  def readTypedObject(
+      dis: DataInputStream,
+      dataType: Char): Object = {
+    dataType match {
+      case 'n' => null
+      case 'i' => new java.lang.Integer(readInt(dis))
+      case 'd' => new java.lang.Double(readDouble(dis))
+      case 'b' => new java.lang.Boolean(readBoolean(dis))
+      case 'c' => readString(dis)
+      case 'e' => readMap(dis)
+      case 'r' => readBytes(dis)
+      case 'l' => readList(dis)
+      case 'D' => readDate(dis)
+      case 't' => readTime(dis)
+      case 'j' => JVMObjectTracker.getObject(readString(dis))
+      case _ => throw new IllegalArgumentException(s"Invalid type $dataType")
+    }
+  }
+
+  def readBytes(in: DataInputStream): Array[Byte] = {
+    val len = readInt(in)
+    val out = new Array[Byte](len)
+    val bytesRead = in.readFully(out)
+    out
+  }
+
+  def readInt(in: DataInputStream): Int = {
+    in.readInt()
+  }
+
+  def readDouble(in: DataInputStream): Double = {
+    in.readDouble()
+  }
+
+  def readString(in: DataInputStream): String = {
+    val len = in.readInt()
+    val asciiBytes = new Array[Byte](len)
+    in.readFully(asciiBytes)
+    assert(asciiBytes(len - 1) == 0)
+    val str = new String(asciiBytes.dropRight(1).map(_.toChar))
+    str
+  }
+
+  def readBoolean(in: DataInputStream): Boolean = {
+    val intVal = in.readInt()
+    if (intVal == 0) false else true
--- End diff --

can be `intVal != 0`
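
A minimal sketch of the suggested simplification (a hypothetical standalone form, not the PR's code):

```
import java.io.DataInputStream

// An Int read from the stream is true iff it is non-zero; the direct
// comparison replaces `if (intVal == 0) false else true`.
def readBoolean(in: DataInputStream): Boolean = {
  val intVal = in.readInt()
  intVal != 0
}
```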





[GitHub] spark pull request: [SPARK-5654] Integrate SparkR

2015-04-08 Thread andrewor14
Github user andrewor14 commented on a diff in the pull request:

https://github.com/apache/spark/pull/5096#discussion_r28013485
  
--- Diff: yarn/src/main/scala/org/apache/spark/deploy/yarn/ApplicationMaster.scala ---
@@ -469,6 +469,9 @@ private[spark] class ApplicationMaster(
       System.setProperty("spark.submit.pyFiles",
         PythonRunner.formatPaths(args.pyFiles).mkString(","))
     }
+    if (args.primaryRFile != null && args.primaryRFile.endsWith(".R")) {
+      // TODO(davies): add R dependencies here
--- End diff --

That's fine. We can add full support for SparkR in YARN cluster mode later.





[GitHub] spark pull request: [SPARK-5654] Integrate SparkR

2015-04-08 Thread brennonyork
Github user brennonyork commented on the pull request:

https://github.com/apache/spark/pull/5096#issuecomment-91028275
  
@shivaram a few things after looking at the build code some more...

1. The timeout value comes from the line [here in `dev/run-tests-jenkins`](https://github.com/apache/spark/blob/master/dev/run-tests-jenkins#L50). It's currently set at 120 minutes and **doesn't** include the time it takes for PRs to be tested against the master branch (i.e. for dependencies). We could certainly up that value, but since I'm assuming the `dev/run-tests` script on this PR runs all the new SparkR tests (plus any additional ones you've added for core Spark), I'd ask that you run `dev/run-tests` locally and update the timeout in `dev/run-tests-jenkins` for this PR by whatever additional time is needed. The impetus for running locally first is that I'd much rather get a baseline for what all the new tests take to run and then add 15ish minutes for fluff, rather than throw a number into the wind.
2. Completely agree we should get some timing metrics for the various PR tests (thanks for the idea!). I'll generate a JIRA for that and take a look soon. That said, just to reiterate, those tests **are not** holding up the actual Spark test suite from finishing, unless Jenkins has some deeper timing hooks than I know about. I assume it's merely a factor of the large corpus of tests that were likely added in this PR.





[GitHub] spark pull request: [SPARK-5654] Integrate SparkR

2015-04-08 Thread AmplabJenkins
Github user AmplabJenkins commented on the pull request:

https://github.com/apache/spark/pull/5096#issuecomment-91010961
  
Test FAILed.
Refer to this link for build results (access rights to CI server needed): 
https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/29870/
Test FAILed.





[GitHub] spark pull request: [SPARK-5654] Integrate SparkR

2015-04-08 Thread SparkQA
Github user SparkQA commented on the pull request:

https://github.com/apache/spark/pull/5096#issuecomment-90977866
  
  [Test build #29870 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/29870/consoleFull) for PR 5096 at commit [`59266d1`](https://github.com/apache/spark/commit/59266d14416a614d900447788806f958ab1088f9).





[GitHub] spark pull request: [SPARK-5654] Integrate SparkR

2015-04-08 Thread shivaram
Github user shivaram commented on the pull request:

https://github.com/apache/spark/pull/5096#issuecomment-90977367
  
Jenkins, retest this please (is the fourth time lucky?)





[GitHub] spark pull request: [SPARK-5654] Integrate SparkR

2015-04-08 Thread shivaram
Github user shivaram commented on the pull request:

https://github.com/apache/spark/pull/5096#issuecomment-90975665
  
@brennonyork Thanks - another thing that might be helpful is to log how 
long it took to run the tests. I am trying to figure out where the 120 minutes 
we have are being spent, and it's tricky to get a breakdown right now.





[GitHub] spark pull request: [SPARK-5654] Integrate SparkR

2015-04-08 Thread brennonyork
Github user brennonyork commented on the pull request:

https://github.com/apache/spark/pull/5096#issuecomment-90827465
  
We can certainly set the timeout to be something larger. Let me take a look 
at the previous builds and see if I can find a good timeout number and if there 
might be anything else we can do. 
@pwendell any other ideas?





[GitHub] spark pull request: [SPARK-5654] Integrate SparkR

2015-04-07 Thread shivaram
Github user shivaram commented on the pull request:

https://github.com/apache/spark/pull/5096#issuecomment-90806814
  
@brennonyork @pwendell The new dependency checks seem to add around 20 
minutes to a Jenkins run, and this PR, which has SQL and pom.xml changes, has 
timed out thrice now (it didn't even get a chance to run the SparkR tests, so I 
don't think that is the problem). Is there any way we can increase the timeout 
or speed up the new dependency checks?





[GitHub] spark pull request: [SPARK-5654] Integrate SparkR

2015-04-07 Thread AmplabJenkins
Github user AmplabJenkins commented on the pull request:

https://github.com/apache/spark/pull/5096#issuecomment-90806760
  
Test FAILed.
Refer to this link for build results (access rights to CI server needed): 
https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/29828/
Test FAILed.





[GitHub] spark pull request: [SPARK-5654] Integrate SparkR

2015-04-07 Thread SparkQA
Github user SparkQA commented on the pull request:

https://github.com/apache/spark/pull/5096#issuecomment-90792602
  
  [Test build #29828 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/29828/consoleFull) for PR 5096 at commit [`59266d1`](https://github.com/apache/spark/commit/59266d14416a614d900447788806f958ab1088f9).





[GitHub] spark pull request: [SPARK-5654] Integrate SparkR

2015-04-07 Thread shivaram
Github user shivaram commented on the pull request:

https://github.com/apache/spark/pull/5096#issuecomment-90792138
  
Jenkins, retest this please





[GitHub] spark pull request: [SPARK-5654] Integrate SparkR

2015-04-07 Thread pwendell
Github user pwendell commented on the pull request:

https://github.com/apache/spark/pull/5096#issuecomment-90768943
  
Okay, LGTM from a packaging perspective. Once @andrewor14 signs off on the 
spark-submit stuff I think this is ready to go.





[GitHub] spark pull request: [SPARK-5654] Integrate SparkR

2015-04-07 Thread SparkQA
Github user SparkQA commented on the pull request:

https://github.com/apache/spark/pull/5096#issuecomment-90768068
  
  [Test build #640 has started](https://amplab.cs.berkeley.edu/jenkins/job/NewSparkPullRequestBuilder/640/consoleFull) for PR 5096 at commit [`59266d1`](https://github.com/apache/spark/commit/59266d14416a614d900447788806f958ab1088f9).





[GitHub] spark pull request: [SPARK-5654] Integrate SparkR

2015-04-07 Thread AmplabJenkins
Github user AmplabJenkins commented on the pull request:

https://github.com/apache/spark/pull/5096#issuecomment-90767895
  
Test FAILed.
Refer to this link for build results (access rights to CI server needed): 
https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/29814/
Test FAILed.





[GitHub] spark pull request: [SPARK-5654] Integrate SparkR

2015-04-07 Thread davies
Github user davies commented on the pull request:

https://github.com/apache/spark/pull/5096#issuecomment-90748077
  
@andrewor14 We should have addressed all your comments; could you take 
another look?

Waiting for Jenkins.





[GitHub] spark pull request: [SPARK-5654] Integrate SparkR

2015-04-07 Thread SparkQA
Github user SparkQA commented on the pull request:

https://github.com/apache/spark/pull/5096#issuecomment-90747911
  
  [Test build #29814 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/29814/consoleFull) for PR 5096 at commit [`59266d1`](https://github.com/apache/spark/commit/59266d14416a614d900447788806f958ab1088f9).





[GitHub] spark pull request: [SPARK-5654] Integrate SparkR

2015-04-07 Thread davies
Github user davies commented on a diff in the pull request:

https://github.com/apache/spark/pull/5096#discussion_r27926454
  
--- Diff: yarn/src/main/scala/org/apache/spark/deploy/yarn/ApplicationMaster.scala ---
@@ -469,6 +469,9 @@ private[spark] class ApplicationMaster(
       System.setProperty("spark.submit.pyFiles",
         PythonRunner.formatPaths(args.pyFiles).mkString(","))
     }
+    if (args.primaryRFile != null && args.primaryRFile.endsWith(".R")) {
+      // TODO(davies): add R dependencies here
--- End diff --

Right now, SparkR does not support shipping other packages as dependencies. 
That may be added in the future.





[GitHub] spark pull request: [SPARK-5654] Integrate SparkR

2015-04-07 Thread davies
Github user davies commented on a diff in the pull request:

https://github.com/apache/spark/pull/5096#discussion_r27926373
  
--- Diff: yarn/src/main/scala/org/apache/spark/deploy/yarn/ApplicationMasterArguments.scala ---
@@ -92,6 +97,7 @@ class ApplicationMasterArguments(val args: Array[String]) {
       |  --jar JAR_PATH       Path to your application's JAR file
       |  --class CLASS_NAME   Name of your application's main class
       |  --primary-py-file    A main Python file
+      |  --primary-r-file     A main R file
--- End diff --

done





[GitHub] spark pull request: [SPARK-5654] Integrate SparkR

2015-04-07 Thread davies
Github user davies commented on a diff in the pull request:

https://github.com/apache/spark/pull/5096#discussion_r27926357
  
--- Diff: yarn/src/main/scala/org/apache/spark/deploy/yarn/Client.scala ---
@@ -497,12 +503,15 @@ private[spark] class Client(
     if (args.primaryPyFile != null && args.primaryPyFile.endsWith(".py")) {
       args.userArgs = ArrayBuffer(args.primaryPyFile, args.pyFiles) ++ args.userArgs
     }
+    if (args.primaryRFile != null && args.primaryRFile.endsWith(".R")) {
+      args.userArgs = ArrayBuffer(args.primaryRFile) ++ args.userArgs
+    }
     val userArgs = args.userArgs.flatMap { arg =>
       Seq("--arg", YarnSparkHadoopUtil.escapeForShell(arg))
     }
     val amArgs =
-      Seq(amClass) ++ userClass ++ userJar ++ primaryPyFile ++ pyFiles ++ userArgs ++
-        Seq(
+      Seq(amClass) ++ userClass ++ userJar ++ primaryPyFile ++ pyFiles ++ primaryRFile ++
--- End diff --

done





[GitHub] spark pull request: [SPARK-5654] Integrate SparkR

2015-04-07 Thread AmplabJenkins
Github user AmplabJenkins commented on the pull request:

https://github.com/apache/spark/pull/5096#issuecomment-90408348
  
Test FAILed.
Refer to this link for build results (access rights to CI server needed): 
https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/29780/
Test FAILed.





[GitHub] spark pull request: [SPARK-5654] Integrate SparkR

2015-04-07 Thread SparkQA
Github user SparkQA commented on the pull request:

https://github.com/apache/spark/pull/5096#issuecomment-90408326
  
  [Test build #29780 has finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/29780/consoleFull) for PR 5096 at commit [`f731b48`](https://github.com/apache/spark/commit/f731b48c1fdaff80020d8683252f67f0e24502c1).
 * This patch **fails Spark unit tests**.
 * This patch merges cleanly.
 * This patch adds no public classes.
 * This patch does not change any dependencies.





[GitHub] spark pull request: [SPARK-5654] Integrate SparkR

2015-04-06 Thread SparkQA
Github user SparkQA commented on the pull request:

https://github.com/apache/spark/pull/5096#issuecomment-90362850
  
  [Test build #29780 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/29780/consoleFull) for PR 5096 at commit [`f731b48`](https://github.com/apache/spark/commit/f731b48c1fdaff80020d8683252f67f0e24502c1).





[GitHub] spark pull request: [SPARK-5654] Integrate SparkR

2015-04-06 Thread shivaram
Github user shivaram commented on the pull request:

https://github.com/apache/spark/pull/5096#issuecomment-90362775
  
@pwendell I tried to reproduce the build on a clean Ubuntu VM without R 
installed, and I actually got a failure with a relevant error message:
```
[ERROR] Failed to execute goal org.codehaus.mojo:exec-maven-plugin:1.3.2:exec (sparkr-pkg) on project spark-core_2.10: Command execution failed. Process exited with an error: 127 (Exit value: 127) -> [Help 1]
```
And before the Maven summary I also see
```
../R/install-dev.sh: line 36: R: command not found
```

On a related note, I also changed `run-tests` to not run the SparkR tests if R 
is not installed.






[GitHub] spark pull request: [SPARK-5654] Integrate SparkR

2015-04-06 Thread shivaram
Github user shivaram commented on the pull request:

https://github.com/apache/spark/pull/5096#issuecomment-90356825
  
@andrewor14 I think the test failure is because of a SparkSubmit unit test 
that is broken right now. If you look at the test at 
https://github.com/apache/spark/blob/e40ea8742a8771ecd46b182f45b5fcd8bd6dd725/core/src/test/scala/org/apache/spark/deploy/SparkSubmitSuite.scala#L97
we don't pass in a primaryResource or a mainClass and yet expect the 
argument parsing to succeed. The test will exit at 
https://github.com/apache/spark/blob/e40ea8742a8771ecd46b182f45b5fcd8bd6dd725/core/src/main/scala/org/apache/spark/deploy/SparkSubmitArguments.scala#L214
but we replace the `exitFn`, so it doesn't actually exit.

Any good ideas on how to change the test case? Should we just add a dummy 
primary resource and main class to it?
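
A minimal sketch of that dummy-resource option (the argument values are hypothetical placeholders, not what the suite must use):

```
// In SparkSubmitSuite: hand the parser a main class and a primary
// resource so argument parsing succeeds instead of reaching the
// error path that is hidden by the replaced exitFn.
val clArgs = Seq(
  "--name", "myApp",
  "--class", "Foo",  // dummy main class
  "thejar.jar")      // dummy primary resource
val appArgs = new SparkSubmitArguments(clArgs)
```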





[GitHub] spark pull request: [SPARK-5654] Integrate SparkR

2015-04-06 Thread AmplabJenkins
Github user AmplabJenkins commented on the pull request:

https://github.com/apache/spark/pull/5096#issuecomment-90354390
  
Test FAILed.
Refer to this link for build results (access rights to CI server needed): 
https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/29768/
Test FAILed.





[GitHub] spark pull request: [SPARK-5654] Integrate SparkR

2015-04-06 Thread SparkQA
Github user SparkQA commented on the pull request:

https://github.com/apache/spark/pull/5096#issuecomment-90354388
  
  [Test build #29768 has finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/29768/consoleFull) for PR 5096 at commit [`64eda24`](https://github.com/apache/spark/commit/64eda24017d1d05c3dfba3a84c96b557cbd2ca76).
 * This patch **fails Spark unit tests**.
 * This patch merges cleanly.
 * This patch adds no public classes.
 * This patch does not change any dependencies.





[GitHub] spark pull request: [SPARK-5654] Integrate SparkR

2015-04-06 Thread shivaram
Github user shivaram commented on the pull request:

https://github.com/apache/spark/pull/5096#issuecomment-90322086
  
Thanks @pwendell @andrewor14 @JoshRosen for your comments. I am still 
investigating the return code problem for machines without R, but other than 
that most of the comments have been fixed.

It would be great if you could take another pass!





[GitHub] spark pull request: [SPARK-5654] Integrate SparkR

2015-04-06 Thread shivaram
Github user shivaram commented on a diff in the pull request:

https://github.com/apache/spark/pull/5096#discussion_r27847946
  
--- Diff: yarn/src/main/scala/org/apache/spark/deploy/yarn/Client.scala ---
@@ -497,12 +503,15 @@ private[spark] class Client(
     if (args.primaryPyFile != null && args.primaryPyFile.endsWith(".py")) {
       args.userArgs = ArrayBuffer(args.primaryPyFile, args.pyFiles) ++ args.userArgs
     }
+    if (args.primaryRFile != null && args.primaryRFile.endsWith(".R")) {
+      args.userArgs = ArrayBuffer(args.primaryRFile) ++ args.userArgs
+    }
     val userArgs = args.userArgs.flatMap { arg =>
       Seq("--arg", YarnSparkHadoopUtil.escapeForShell(arg))
     }
     val amArgs =
-      Seq(amClass) ++ userClass ++ userJar ++ primaryPyFile ++ pyFiles ++ userArgs ++
-        Seq(
+      Seq(amClass) ++ userClass ++ userJar ++ primaryPyFile ++ pyFiles ++ primaryRFile ++
--- End diff --

They should be mutually exclusive. So if SparkSubmit is the only class used 
to create these arguments, we have an `else if` block [1] that prevents both 
primaryPyFile and primaryRFile from being set. I could also add an 
exclusivity check somewhere -- which file would be the best place to do this?

[1] https://github.com/amplab-extras/spark/blob/64eda24017d1d05c3dfba3a84c96b557cbd2ca76/core/src/main/scala/org/apache/spark/deploy/SparkSubmit.scala#L482
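
A hedged sketch of what such a check could look like (the placement and message are assumptions, not part of this PR):

```
// Hypothetical guard during argument validation: a submission may
// carry a primary Python file or a primary R file, but never both.
require(args.primaryPyFile == null || args.primaryRFile == null,
  "Cannot specify both a primary Python file and a primary R file")
```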





[GitHub] spark pull request: [SPARK-5654] Integrate SparkR

2015-04-06 Thread shivaram
Github user shivaram commented on a diff in the pull request:

https://github.com/apache/spark/pull/5096#discussion_r27847871
  
--- Diff: core/src/main/scala/org/apache/spark/api/r/RRDD.scala ---
@@ -0,0 +1,450 @@
+/*
+ * Licensed to the Apache Software Foundation (ASF) under one or more
+ * contributor license agreements.  See the NOTICE file distributed with
+ * this work for additional information regarding copyright ownership.
+ * The ASF licenses this file to You under the Apache License, Version 2.0
+ * (the "License"); you may not use this file except in compliance with
+ * the License.  You may obtain a copy of the License at
+ *
+ *    http://www.apache.org/licenses/LICENSE-2.0
+ *
+ * Unless required by applicable law or agreed to in writing, software
+ * distributed under the License is distributed on an "AS IS" BASIS,
+ * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+ * See the License for the specific language governing permissions and
+ * limitations under the License.
+ */
+
+package org.apache.spark.api.r
+
+import java.io._
+import java.net.ServerSocket
+import java.util.{Map => JMap}
+
+import scala.collection.JavaConversions._
+import scala.io.Source
+import scala.reflect.ClassTag
+import scala.util.Try
+
+import org.apache.spark._
+import org.apache.spark.api.java.{JavaPairRDD, JavaRDD, JavaSparkContext}
+import org.apache.spark.broadcast.Broadcast
+import org.apache.spark.rdd.RDD
+import org.apache.spark.util.Utils
+
+private abstract class BaseRRDD[T: ClassTag, U: ClassTag](
+    parent: RDD[T],
+    numPartitions: Int,
+    func: Array[Byte],
+    deserializer: String,
+    serializer: String,
+    packageNames: Array[Byte],
+    rLibDir: String,
+    broadcastVars: Array[Broadcast[Object]])
+  extends RDD[U](parent) with Logging {
+  override def getPartitions = parent.partitions
+
+  override def compute(partition: Partition, context: TaskContext): Iterator[U] = {
+
+    // The parent may be also an RRDD, so we should launch it first.
+    val parentIterator = firstParent[T].iterator(partition, context)
+
+    // we expect two connections
+    val serverSocket = new ServerSocket(0, 2)
+    val listenPort = serverSocket.getLocalPort()
+
+    // The stdout/stderr is shared by multiple tasks, because we use one daemon
+    // to launch child process as worker.
+    val errThread = RRDD.createRWorker(rLibDir, listenPort)
+
+    // We use two sockets to separate input and output, then it's easy to manage
+    // the lifecycle of them to avoid deadlock.
+    // TODO: optimize it to use one socket
+
+    // the socket used to send out the input of task
+    serverSocket.setSoTimeout(1)
+    val inSocket = serverSocket.accept()
+    startStdinThread(inSocket.getOutputStream(), parentIterator, partition.index)
+
+    // the socket used to receive the output of task
+    val outSocket = serverSocket.accept()
+    val inputStream = new BufferedInputStream(outSocket.getInputStream)
+    val dataStream = openDataStream(inputStream)
+    serverSocket.close()
+
+    try {
+
+      return new Iterator[U] {
+        def next(): U = {
+          val obj = _nextObj
+          if (hasNext) {
+            _nextObj = read()
+          }
+          obj
+        }
+
+        var _nextObj = read()
+
+        def hasNext(): Boolean = {
+          val hasMore = (_nextObj != null)
+          if (!hasMore) {
+            dataStream.close()
+          }
+          hasMore
+        }
+      }
+    } catch {
+      case e: Exception =>
+        throw new SparkException("R computation failed with\n " + errThread.getLines())
+    }
+  }
+
+  /**
+   * Start a thread to write RDD data to the R process.
+   */
+  private def startStdinThread[T](
+      output: OutputStream,
+      iter: Iterator[T],
+      partition: Int) = {
+
+    val env = SparkEnv.get
+    val bufferSize = System.getProperty("spark.buffer.size", "65536").toInt
+    val stream = new BufferedOutputStream(output, bufferSize)
+
+    new Thread("writer for R") {
+      override def run() {
+        try {
+          SparkEnv.set(env)
+          val dataOut = new DataOutputStream(stream)
+          dataOut.writeInt(partition)
+
+          SerDe.writeString(dataOut, deserializer)
+          SerDe.writeString(dataOut, serializer)
+
+          dataOut.writeInt(packageNames.length)
+          dataOut.write(packageNames)
+
+          dataOut.writeInt(func.length)
+          dataOut.write(func)
+
+

[GitHub] spark pull request: [SPARK-5654] Integrate SparkR

2015-04-06 Thread shivaram
Github user shivaram commented on a diff in the pull request:

https://github.com/apache/spark/pull/5096#discussion_r27847873
  
--- Diff: core/src/main/scala/org/apache/spark/deploy/SparkSubmit.scala ---
@@ -317,11 +328,32 @@ object SparkSubmit {
       }
     }
 
-    // In yarn-cluster mode for a python app, add primary resource and pyFiles to files
-    // that can be distributed with the job
-    if (args.isPython && isYarnCluster) {
-      args.files = mergeFileLists(args.files, args.primaryResource)
-      args.files = mergeFileLists(args.files, args.pyFiles)
+    // If we're running a R app, set the main class to our specific R runner
+    if (args.isR && deployMode == CLIENT) {
+      if (args.primaryResource == SPARKR_SHELL) {
+        args.mainClass = "org.apache.spark.api.r.RBackend"
+      } else {
+        // If a R file is provided, add it to the child arguments and list of files to deploy.
+        // Usage: PythonAppRunner  [app arguments]
--- End diff --

Fixed now
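
(For reference, a hedged note: assuming the final merged SparkSubmit.scala, the corrected comment reads roughly as below.)

```
// Usage: RRunner <main R file> [app arguments]
```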





[GitHub] spark pull request: [SPARK-5654] Integrate SparkR

2015-04-06 Thread shivaram
Github user shivaram commented on a diff in the pull request:

https://github.com/apache/spark/pull/5096#discussion_r27847862
  
--- Diff: core/src/main/scala/org/apache/spark/api/r/RBackendHandler.scala ---
@@ -0,0 +1,222 @@
+/*
+ * Licensed to the Apache Software Foundation (ASF) under one or more
+ * contributor license agreements.  See the NOTICE file distributed with
+ * this work for additional information regarding copyright ownership.
+ * The ASF licenses this file to You under the Apache License, Version 2.0
+ * (the "License"); you may not use this file except in compliance with
+ * the License.  You may obtain a copy of the License at
+ *
+ *    http://www.apache.org/licenses/LICENSE-2.0
+ *
+ * Unless required by applicable law or agreed to in writing, software
+ * distributed under the License is distributed on an "AS IS" BASIS,
+ * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+ * See the License for the specific language governing permissions and
+ * limitations under the License.
+ */
+
+package org.apache.spark.api.r
+
+import java.io.{ByteArrayInputStream, ByteArrayOutputStream, DataInputStream, DataOutputStream}
+
+import scala.collection.mutable.HashMap
+
+import io.netty.channel.ChannelHandler.Sharable
+import io.netty.channel.{ChannelHandlerContext, SimpleChannelInboundHandler}
+
+import org.apache.spark.Logging
+import org.apache.spark.api.r.SerDe._
+
+/**
+ * Handler for RBackend
+ * TODO: This is marked as sharable to get a handle to RBackend. Is it safe to re-use
+ * this across connections ?
+ */
+@Sharable
+private[r] class RBackendHandler(server: RBackend)
+  extends SimpleChannelInboundHandler[Array[Byte]] with Logging {
+
+  override def channelRead0(ctx: ChannelHandlerContext, msg: Array[Byte]) {
+    val bis = new ByteArrayInputStream(msg)
+    val dis = new DataInputStream(bis)
+
+    val bos = new ByteArrayOutputStream()
+    val dos = new DataOutputStream(bos)
+
+    // First bit is isStatic
+    val isStatic = readBoolean(dis)
+    val objId = readString(dis)
+    val methodName = readString(dis)
+    val numArgs = readInt(dis)
+
+    if (objId == "SparkRHandler") {
+      methodName match {
+        case "stopBackend" =>
+          writeInt(dos, 0)
+          writeType(dos, "void")
+          server.close()
+        case "rm" =>
+          try {
+            val t = readObjectType(dis)
+            assert(t == 'c')
+            val objToRemove = readString(dis)
+            JVMObjectTracker.remove(objToRemove)
+            writeInt(dos, 0)
+            writeObject(dos, null)
+          } catch {
+            case e: Exception =>
+              logError(s"Removing $objId failed", e)
+              writeInt(dos, -1)
+          }
+        case _ => dos.writeInt(-1)
+      }
+    } else {
+      handleMethodCall(isStatic, objId, methodName, numArgs, dis, dos)
+    }
+
+    val reply = bos.toByteArray
+    ctx.write(reply)
+  }
+
+  override def channelReadComplete(ctx: ChannelHandlerContext) {
+    ctx.flush()
+  }
+
+  override def exceptionCaught(ctx: ChannelHandlerContext, cause: Throwable) {
+    // Close the connection when an exception is raised.
+    cause.printStackTrace()
+    ctx.close()
+  }
+
+  def handleMethodCall(
+      isStatic: Boolean,
+      objId: String,
+      methodName: String,
+      numArgs: Int,
+      dis: DataInputStream,
+      dos: DataOutputStream) {
+    var obj: Object = null
+    try {
+      val cls = if (isStatic) {
+        Class.forName(objId)
+      } else {
+        JVMObjectTracker.get(objId) match {
+          case None => throw new IllegalArgumentException("Object not found " + objId)
+          case Some(o) =>
+            obj = o
+            o.getClass
+        }
+      }
+
+      val args = readArgs(numArgs, dis)
+
+      val methods = cls.getMethods
+      val selectedMethods = methods.filter(m => m.getName == methodName)
+      if (selectedMethods.length > 0) {
+        val methods = selectedMethods.filter { x =>
+          matchMethod(numArgs, args, x.getParameterTypes)
+        }
+        if (methods.isEmpty) {
+          logWarning(s"cannot find matching method ${cls}.$methodName. "
+            + s"Candidates are:")
+          selectedMethods.foreach { method =>
+            logWarning(s"$methodName(${method.getParameterTypes.mkString(",")})")
+          }
+          throw new Exception(s"No matched method found for $cls.$methodName")
+        }
+        val ret = methods.

[GitHub] spark pull request: [SPARK-5654] Integrate SparkR

2015-04-06 Thread shivaram
Github user shivaram commented on a diff in the pull request:

https://github.com/apache/spark/pull/5096#discussion_r27847889
  
--- Diff: 
launcher/src/main/java/org/apache/spark/launcher/SparkSubmitCommandBuilder.java 
---
@@ -243,6 +258,36 @@
 return pyargs;
   }
 
+  private List<String> buildSparkRCommand(Map<String, String> env) throws IOException {
+if (!appArgs.isEmpty() && appArgs.get(0).endsWith(".R")) {
+  appResource = appArgs.get(0);
+  appArgs.remove(0);
+  return buildCommand(env);
+}
+
+Properties props = loadPropertiesFile();
+mergeEnvPathList(env, getLibPathEnvName(),
+firstNonEmptyValue(SparkLauncher.DRIVER_EXTRA_LIBRARY_PATH, 
conf, props));
+
+// Store spark-submit arguments in an environment variable, since 
there's no way to pass
+// them to sparkR on the command line.
+StringBuilder submitArgs = new StringBuilder();
+for (String arg : buildSparkSubmitArgs()) {
+  if (submitArgs.length() > 0) {
+submitArgs.append(" ");
+  }
+  submitArgs.append(quoteForPython(arg));
--- End diff --

I refactored this into a new function. Let me know if this looks okay (I'm 
not very familiar with the code here)
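
For context, a minimal Scala-flavored sketch of what such a shared helper could look
like (the real launcher code is Java, and the names `quoteArg` and
`constructEnvVarArgs` here are hypothetical, not necessarily what the patch uses):

```
import scala.collection.mutable

object SubmitArgsEnv {
  // Assumed stand-in for the launcher's quoting helper: wrap the arg in
  // double quotes and escape embedded backslashes and quotes.
  private def quoteArg(arg: String): String =
    "\"" + arg.replace("\\", "\\\\").replace("\"", "\\\"") + "\""

  // Join the quoted spark-submit args and expose them through a single
  // environment variable, so the PySpark and SparkR shells can share the
  // same code path instead of each duplicating this loop.
  def constructEnvVarArgs(
      env: mutable.Map[String, String],
      envVarName: String,
      submitArgs: Seq[String]): Unit = {
    env(envVarName) = submitArgs.map(quoteArg).mkString(" ")
  }
}
```

For example, the R branch could call
`SubmitArgsEnv.constructEnvVarArgs(env, "SPARKR_SUBMIT_ARGS", buildSparkSubmitArgs())`
and the Python branch could call the same helper with its own variable name.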



[GitHub] spark pull request: [SPARK-5654] Integrate SparkR

2015-04-06 Thread shivaram
Github user shivaram commented on a diff in the pull request:

https://github.com/apache/spark/pull/5096#discussion_r27847877
  
--- Diff: core/src/main/scala/org/apache/spark/deploy/SparkSubmit.scala ---
@@ -406,7 +438,7 @@ object SparkSubmit {
 // Add the application jar automatically so the user doesn't have to 
call sc.addJar
 // For YARN cluster mode, the jar is already distributed on each node 
as "app.jar"
 // For python files, the primary resource is already distributed as a 
regular file
--- End diff --

done



[GitHub] spark pull request: [SPARK-5654] Integrate SparkR

2015-04-06 Thread shivaram
Github user shivaram commented on a diff in the pull request:

https://github.com/apache/spark/pull/5096#discussion_r27847879
  
--- Diff: 
core/src/main/scala/org/apache/spark/deploy/SparkSubmitArguments.scala ---
@@ -211,9 +212,9 @@ private[deploy] class SparkSubmitArguments(args: 
Seq[String], env: Map[String, S
   printUsageAndExit(-1)
 }
 if (primaryResource == null) {
-  SparkSubmit.printErrorAndExit("Must specify a primary resource (JAR 
or Python file)")
+  SparkSubmit.printErrorAndExit("Must specify a primary resource (JAR 
or Python or R file)")
 }
-if (mainClass == null && !isPython) {
+if (mainClass == null && !isPython && !isR) {
--- End diff --

Fixed



[GitHub] spark pull request: [SPARK-5654] Integrate SparkR

2015-04-06 Thread SparkQA
Github user SparkQA commented on the pull request:

https://github.com/apache/spark/pull/5096#issuecomment-90321290
  
  [Test build #29768 has 
started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/29768/consoleFull)
 for   PR 5096 at commit 
[`64eda24`](https://github.com/apache/spark/commit/64eda24017d1d05c3dfba3a84c96b557cbd2ca76).



[GitHub] spark pull request: [SPARK-5654] Integrate SparkR

2015-04-06 Thread shivaram
Github user shivaram commented on a diff in the pull request:

https://github.com/apache/spark/pull/5096#discussion_r27847867
  
--- Diff: core/src/main/scala/org/apache/spark/api/r/RRDD.scala ---
@@ -0,0 +1,450 @@
+/*
+ * Licensed to the Apache Software Foundation (ASF) under one or more
+ * contributor license agreements.  See the NOTICE file distributed with
+ * this work for additional information regarding copyright ownership.
+ * The ASF licenses this file to You under the Apache License, Version 2.0
+ * (the "License"); you may not use this file except in compliance with
+ * the License.  You may obtain a copy of the License at
+ *
+ *http://www.apache.org/licenses/LICENSE-2.0
+ *
+ * Unless required by applicable law or agreed to in writing, software
+ * distributed under the License is distributed on an "AS IS" BASIS,
+ * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+ * See the License for the specific language governing permissions and
+ * limitations under the License.
+ */
+
+package org.apache.spark.api.r
+
+import java.io._
+import java.net.ServerSocket
+import java.util.{Map => JMap}
+
+import scala.collection.JavaConversions._
+import scala.io.Source
+import scala.reflect.ClassTag
+import scala.util.Try
+
+import org.apache.spark._
+import org.apache.spark.api.java.{JavaPairRDD, JavaRDD, JavaSparkContext}
+import org.apache.spark.broadcast.Broadcast
+import org.apache.spark.rdd.RDD
+import org.apache.spark.util.Utils
+
+private abstract class BaseRRDD[T: ClassTag, U: ClassTag](
+parent: RDD[T],
+numPartitions: Int,
+func: Array[Byte],
+deserializer: String,
+serializer: String,
+packageNames: Array[Byte],
+rLibDir: String,
+broadcastVars: Array[Broadcast[Object]])
+  extends RDD[U](parent) with Logging {
+  override def getPartitions = parent.partitions
+
+  override def compute(partition: Partition, context: TaskContext): 
Iterator[U] = {
+
+// The parent may be also an RRDD, so we should launch it first.
+val parentIterator = firstParent[T].iterator(partition, context)
+
+// we expect two connections
+val serverSocket = new ServerSocket(0, 2)
+val listenPort = serverSocket.getLocalPort()
+
+// The stdout/stderr is shared by multiple tasks, because we use one 
daemon
+// to launch child process as worker.
+val errThread = RRDD.createRWorker(rLibDir, listenPort)
+
+// We use two sockets to separate input and output, then it's easy to 
manage
+// the lifecycle of them to avoid deadlock.
+// TODO: optimize it to use one socket
+
+// the socket used to send out the input of task
+serverSocket.setSoTimeout(10000)
+val inSocket = serverSocket.accept()
+startStdinThread(inSocket.getOutputStream(), parentIterator, 
partition.index)
+
+// the socket used to receive the output of task
+val outSocket = serverSocket.accept()
+val inputStream = new BufferedInputStream(outSocket.getInputStream)
+val dataStream = openDataStream(inputStream)
+serverSocket.close()
+
+try {
+
+  return new Iterator[U] {
+def next(): U = {
+  val obj = _nextObj
+  if (hasNext) {
+_nextObj = read()
+  }
+  obj
+}
+
+var _nextObj = read()
+
+def hasNext(): Boolean = {
+  val hasMore = (_nextObj != null)
+  if (!hasMore) {
+dataStream.close()
+  }
+  hasMore
+}
+  }
+} catch {
+  case e: Exception =>
+throw new SparkException("R computation failed with\n " + 
errThread.getLines())
+}
+  }
+
+  /**
+   * Start a thread to write RDD data to the R process.
+   */
+  private def startStdinThread[T](
+output: OutputStream,
+iter: Iterator[T],
+partition: Int) = {
--- End diff --

Done



[GitHub] spark pull request: [SPARK-5654] Integrate SparkR

2015-04-06 Thread shivaram
Github user shivaram commented on a diff in the pull request:

https://github.com/apache/spark/pull/5096#discussion_r27847838
  
--- Diff: core/src/main/scala/org/apache/spark/api/r/RBackendHandler.scala 
---
@@ -0,0 +1,222 @@
+/*
+ * Licensed to the Apache Software Foundation (ASF) under one or more
+ * contributor license agreements.  See the NOTICE file distributed with
+ * this work for additional information regarding copyright ownership.
+ * The ASF licenses this file to You under the Apache License, Version 2.0
+ * (the "License"); you may not use this file except in compliance with
+ * the License.  You may obtain a copy of the License at
+ *
+ *http://www.apache.org/licenses/LICENSE-2.0
+ *
+ * Unless required by applicable law or agreed to in writing, software
+ * distributed under the License is distributed on an "AS IS" BASIS,
+ * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+ * See the License for the specific language governing permissions and
+ * limitations under the License.
+ */
+
+package org.apache.spark.api.r
+
+import java.io.{ByteArrayInputStream, ByteArrayOutputStream, 
DataInputStream, DataOutputStream}
+
+import scala.collection.mutable.HashMap
+
+import io.netty.channel.ChannelHandler.Sharable
+import io.netty.channel.{ChannelHandlerContext, 
SimpleChannelInboundHandler}
+
+import org.apache.spark.Logging
+import org.apache.spark.api.r.SerDe._
+
+/**
+ * Handler for RBackend
+ * TODO: This is marked as sharable to get a handle to RBackend. Is it 
safe to re-use
+ * this across connections ?
+ */
+@Sharable
+private[r] class RBackendHandler(server: RBackend)
+  extends SimpleChannelInboundHandler[Array[Byte]] with Logging {
+
+  override def channelRead0(ctx: ChannelHandlerContext, msg: Array[Byte]) {
+val bis = new ByteArrayInputStream(msg)
+val dis = new DataInputStream(bis)
+
+val bos = new ByteArrayOutputStream()
+val dos = new DataOutputStream(bos)
+
+// First bit is isStatic
+val isStatic = readBoolean(dis)
+val objId = readString(dis)
+val methodName = readString(dis)
+val numArgs = readInt(dis)
+
+if (objId == "SparkRHandler") {
+  methodName match {
+case "stopBackend" =>
+  writeInt(dos, 0)
+  writeType(dos, "void")
+  server.close()
+case "rm" =>
+  try {
+val t = readObjectType(dis)
+assert(t == 'c')
+val objToRemove = readString(dis)
+JVMObjectTracker.remove(objToRemove)
+writeInt(dos, 0)
+writeObject(dos, null)
+  } catch {
+case e: Exception =>
+  logError(s"Removing $objId failed", e)
+  writeInt(dos, -1)
+  }
+case _ => dos.writeInt(-1)
+  }
+} else {
+  handleMethodCall(isStatic, objId, methodName, numArgs, dis, dos)
+}
+
+val reply = bos.toByteArray
+ctx.write(reply)
+  }
+  
+  override def channelReadComplete(ctx: ChannelHandlerContext) {
+ctx.flush()
+  }
+
+  override def exceptionCaught(ctx: ChannelHandlerContext, cause: 
Throwable) {
+// Close the connection when an exception is raised.
+cause.printStackTrace()
+ctx.close()
+  }
+
+  def handleMethodCall(
+  isStatic: Boolean,
+  objId: String,
+  methodName: String,
+  numArgs: Int,
+  dis: DataInputStream,
+  dos: DataOutputStream) {
+var obj: Object = null
+try {
+  val cls = if (isStatic) {
+Class.forName(objId)
+  } else {
+JVMObjectTracker.get(objId) match {
+  case None => throw new IllegalArgumentException("Object not 
found " + objId)
+  case Some(o) =>
+obj = o
+o.getClass
+}
+  }
+
+  val args = readArgs(numArgs, dis)
+
+  val methods = cls.getMethods
+  val selectedMethods = methods.filter(m => m.getName == methodName)
+  if (selectedMethods.length > 0) {
+val methods = selectedMethods.filter { x =>
+  matchMethod(numArgs, args, x.getParameterTypes)
+}
+if (methods.isEmpty) {
+  logWarning(s"cannot find matching method ${cls}.$methodName. "
++ s"Candidates are:")
+  selectedMethods.foreach { method =>
+
logWarning(s"$methodName(${method.getParameterTypes.mkString(",")})")
+  }
+  throw new Exception(s"No matched method found for 
$cls.$methodName")
+}
+val ret = methods.

[GitHub] spark pull request: [SPARK-5654] Integrate SparkR

2015-04-06 Thread shivaram
Github user shivaram commented on a diff in the pull request:

https://github.com/apache/spark/pull/5096#discussion_r27847834
  
--- Diff: core/src/main/scala/org/apache/spark/api/r/RBackendHandler.scala 
---
@@ -0,0 +1,222 @@
+/*
+ * Licensed to the Apache Software Foundation (ASF) under one or more
+ * contributor license agreements.  See the NOTICE file distributed with
+ * this work for additional information regarding copyright ownership.
+ * The ASF licenses this file to You under the Apache License, Version 2.0
+ * (the "License"); you may not use this file except in compliance with
+ * the License.  You may obtain a copy of the License at
+ *
+ *http://www.apache.org/licenses/LICENSE-2.0
+ *
+ * Unless required by applicable law or agreed to in writing, software
+ * distributed under the License is distributed on an "AS IS" BASIS,
+ * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+ * See the License for the specific language governing permissions and
+ * limitations under the License.
+ */
+
+package org.apache.spark.api.r
+
+import java.io.{ByteArrayInputStream, ByteArrayOutputStream, 
DataInputStream, DataOutputStream}
+
+import scala.collection.mutable.HashMap
+
+import io.netty.channel.ChannelHandler.Sharable
+import io.netty.channel.{ChannelHandlerContext, 
SimpleChannelInboundHandler}
+
+import org.apache.spark.Logging
+import org.apache.spark.api.r.SerDe._
+
+/**
+ * Handler for RBackend
+ * TODO: This is marked as sharable to get a handle to RBackend. Is it 
safe to re-use
+ * this across connections ?
+ */
+@Sharable
+private[r] class RBackendHandler(server: RBackend)
+  extends SimpleChannelInboundHandler[Array[Byte]] with Logging {
+
+  override def channelRead0(ctx: ChannelHandlerContext, msg: Array[Byte]) {
--- End diff --

Thanks  - Done now



[GitHub] spark pull request: [SPARK-5654] Integrate SparkR

2015-04-06 Thread shivaram
Github user shivaram commented on a diff in the pull request:

https://github.com/apache/spark/pull/5096#discussion_r27847828
  
--- Diff: core/src/main/scala/org/apache/spark/api/r/RBackend.scala ---
@@ -0,0 +1,145 @@
+/*
+ * Licensed to the Apache Software Foundation (ASF) under one or more
+ * contributor license agreements.  See the NOTICE file distributed with
+ * this work for additional information regarding copyright ownership.
+ * The ASF licenses this file to You under the Apache License, Version 2.0
+ * (the "License"); you may not use this file except in compliance with
+ * the License.  You may obtain a copy of the License at
+ *
+ *http://www.apache.org/licenses/LICENSE-2.0
+ *
+ * Unless required by applicable law or agreed to in writing, software
+ * distributed under the License is distributed on an "AS IS" BASIS,
+ * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+ * See the License for the specific language governing permissions and
+ * limitations under the License.
+ */
+
+package org.apache.spark.api.r
+
+import java.io.{DataOutputStream, File, FileOutputStream, IOException}
+import java.net.{InetSocketAddress, ServerSocket}
+import java.util.concurrent.TimeUnit
+
+import io.netty.bootstrap.ServerBootstrap
+import io.netty.channel.{ChannelFuture, ChannelInitializer, EventLoopGroup}
+import io.netty.channel.nio.NioEventLoopGroup
+import io.netty.channel.socket.SocketChannel
+import io.netty.channel.socket.nio.NioServerSocketChannel
+import io.netty.handler.codec.LengthFieldBasedFrameDecoder
+import io.netty.handler.codec.bytes.{ByteArrayDecoder, ByteArrayEncoder}
+
+import org.apache.spark.Logging
+
+/**
+ * Netty-based backend server that is used to communicate between R and 
Java.
+ */
+private[spark] class RBackend {
+
+  private[this] var channelFuture: ChannelFuture = null
+  private[this] var bootstrap: ServerBootstrap = null
+  private[this] var bossGroup: EventLoopGroup = null
+
+  def init(): Int = {
+bossGroup = new NioEventLoopGroup(2)
+val workerGroup = bossGroup
+val handler = new RBackendHandler(this)
+  
+bootstrap = new ServerBootstrap()
+  .group(bossGroup, workerGroup)
+  .channel(classOf[NioServerSocketChannel])
+  
+bootstrap.childHandler(new ChannelInitializer[SocketChannel]() {
+  def initChannel(ch: SocketChannel) = {
+ch.pipeline()
+  .addLast("encoder", new ByteArrayEncoder())
+  .addLast("frameDecoder",
+// maxFrameLength = 2G
+// lengthFieldOffset = 0
+// lengthFieldLength = 4
+// lengthAdjustment = 0
+// initialBytesToStrip = 4, i.e. strip out the length field 
itself
+new LengthFieldBasedFrameDecoder(Integer.MAX_VALUE, 0, 4, 0, 
4))
+  .addLast("decoder", new ByteArrayDecoder())
+  .addLast("handler", handler)
+  }
+})
+
+channelFuture = bootstrap.bind(new InetSocketAddress(0))
+channelFuture.syncUninterruptibly()
+
channelFuture.channel().localAddress().asInstanceOf[InetSocketAddress].getPort()
+  }
+
+  def run(): Unit = {
+channelFuture.channel.closeFuture().syncUninterruptibly()
+  }
+
+  def close(): Unit = {
+if (channelFuture != null) {
+  // close is a local operation and should finish within milliseconds; 
timeout just to be safe
+  channelFuture.channel().close().awaitUninterruptibly(10, 
TimeUnit.SECONDS)
+  channelFuture = null
+}
+if (bootstrap != null && bootstrap.group() != null) {
+  bootstrap.group().shutdownGracefully()
+}
+if (bootstrap != null && bootstrap.childGroup() != null) {
+  bootstrap.childGroup().shutdownGracefully()
+}
+bootstrap = null
+  }
+
+}
+
+private[spark] object RBackend extends Logging {
+  def main(args: Array[String]) {
+if (args.length < 1) {
+  System.err.println("Usage: RBackend <tempFilePath>")
+  System.exit(-1)
+}
+val sparkRBackend = new RBackend()
+try {
+  // bind to random port
+  val boundPort = sparkRBackend.init()
+  val serverSocket = new ServerSocket(0, 1)
+  val listenPort = serverSocket.getLocalPort()
+
+  // tell the R process via temporary file
+  val path = args(0)
+  val f = new File(path + ".tmp")
+  val dos = new DataOutputStream(new FileOutputStream(f))
+  dos.writeInt(boundPort)
+  dos.writeInt(listenPort)
+  dos.close()
+  f.renameTo(new File(path))
+
+  // wait for the end of stdin, then exit
+  new Thread("wait 

[GitHub] spark pull request: [SPARK-5654] Integrate SparkR

2015-04-06 Thread shivaram
Github user shivaram commented on a diff in the pull request:

https://github.com/apache/spark/pull/5096#discussion_r27847796
  
--- Diff: R/create-docs.sh ---
@@ -0,0 +1,46 @@
+#!/bin/bash
+
+#
+# Licensed to the Apache Software Foundation (ASF) under one or more
+# contributor license agreements.  See the NOTICE file distributed with
+# this work for additional information regarding copyright ownership.
+# The ASF licenses this file to You under the Apache License, Version 2.0
+# (the "License"); you may not use this file except in compliance with
+# the License.  You may obtain a copy of the License at
+#
+#http://www.apache.org/licenses/LICENSE-2.0
+#
+# Unless required by applicable law or agreed to in writing, software
+# distributed under the License is distributed on an "AS IS" BASIS,
+# WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+# See the License for the specific language governing permissions and
+# limitations under the License.
+#
+
+# Script to create API docs for SparkR
+# This requires `devtools` and `knitr` to be installed on the machine.
+
+# After running this script the html docs can be found in 
+# $SPARK_HOME/R/pkg/html
+
+# Figure out where the script is
+export FWDIR="$(cd "`dirname "$0"`"; pwd)"
--- End diff --

Done



[GitHub] spark pull request: [SPARK-5654] Integrate SparkR

2015-04-06 Thread shivaram
Github user shivaram commented on a diff in the pull request:

https://github.com/apache/spark/pull/5096#discussion_r27847808
  
--- Diff: bin/sparkR2.cmd ---
@@ -0,0 +1,26 @@
+@echo off
+
+rem
+rem Licensed to the Apache Software Foundation (ASF) under one or more
+rem contributor license agreements.  See the NOTICE file distributed with
+rem this work for additional information regarding copyright ownership.
+rem The ASF licenses this file to You under the Apache License, Version 2.0
+rem (the "License"); you may not use this file except in compliance with
+rem the License.  You may obtain a copy of the License at
+rem
+remhttp://www.apache.org/licenses/LICENSE-2.0
+rem
+rem Unless required by applicable law or agreed to in writing, software
+rem distributed under the License is distributed on an "AS IS" BASIS,
+rem WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or 
implied.
+rem See the License for the specific language governing permissions and
+rem limitations under the License.
+rem
+
+rem Figure out where the Spark framework is installed
+set SPARK_HOME=%~dp0..
+
+rem Load environment variables from conf\spark-env.cmd, if it exists
+if exist "%SPARK_HOME%\conf\spark-env.cmd" call 
"%SPARK_HOME%\conf\spark-env.cmd"
--- End diff --

I followed the changes in #5328 now



[GitHub] spark pull request: [SPARK-5654] Integrate SparkR

2015-04-06 Thread shivaram
Github user shivaram commented on a diff in the pull request:

https://github.com/apache/spark/pull/5096#discussion_r27847771
  
--- Diff: R/README.md ---
@@ -0,0 +1,73 @@
+# R on Spark
+
+SparkR is an R package that provides a light-weight frontend to use Spark 
from R.
+
+### SparkR development
+
+#### Build Spark
+
+Build Spark with [Maven](http://spark.apache.org/docs/latest/building-spark.html#building-with-buildmvn)
and include the `-Psparkr` profile to build the R package. For example, to use
the default Hadoop versions you can run
+```
+  build/mvn -DskipTests -Psparkr package
+```
+
+#### Running sparkR
+
+You can start using SparkR by launching the SparkR shell with
+
+./bin/sparkR
+
+The `sparkR` script automatically creates a SparkContext with Spark by 
default in
+local mode. To specify the Spark master of a cluster for the automatically 
created
+SparkContext, you can run
+
+./bin/sparkR --master "local[2]"
+
+To set other options like driver memory, executor memory etc. you can pass 
in the 
[spark-submit](http://spark.apache.org/docs/latest/submitting-applications.html)
 arguments to `./bin/sparkR`
+
+#### Using SparkR from RStudio
+
+If you wish to use SparkR from RStudio or other R frontends you will need 
to set some environment variables which point SparkR to your Spark 
installation. For example 
+```
+# Set this to where Spark is installed
+Sys.setenv(SPARK_HOME="/Users/shivaram/spark")
+# This line loads SparkR from the installed directory
+.libPaths(c(file.path(Sys.getenv("SPARK_HOME"), "R", "lib"), .libPaths()))
+library(SparkR)
+sc <- sparkR.init(master="local")
+```
+
+#### Making changes to SparkR
+
+The 
[instructions](https://cwiki.apache.org/confluence/display/SPARK/Contributing+to+Spark)
 for making contributions to Spark also apply to SparkR.
+If you only make R file changes (i.e. no Scala changes) then you can just 
re-install the R package using `R/install-dev.sh` and test your changes.
+Once you have made your changes, please include unit tests for them and 
run existing unit tests using the `run-tests.sh` script as described below. 
+
+#### Generating documentation
+
+The SparkR documentation (Rd files and HTML files) is not part of the source
repository. To generate it, run the script `R/create-docs.sh`. This script uses
`devtools` and `knitr` to generate the docs, and these packages need to be
installed on the machine before using the script.
+
+### Examples, Unit tests
+
+SparkR comes with several sample programs in the `examples/src/main/r` 
directory.
+To run one of them, use `./bin/sparkR <filename> <args>`. For example:
+
+./bin/sparkR examples/src/main/r/pi.R local[2]
+
+You can also run the unit-tests for SparkR by running (you need to install 
the [testthat](http://cran.r-project.org/web/packages/testthat/index.html) 
package first):
+
+R -e 'install.packages("testthat", repos="http://cran.us.r-project.org")'
+./R/run-tests.sh
+
+### Running on YARN
+The `./bin/spark-submit` and `./bin/sparkR` can also be used to submit 
jobs to YARN clusters. You will need to set YARN conf dir before doing so. For 
example on CDH you can run
+```
+export YARN_CONF_DIR=/etc/hadoop/conf
+./bin/spark-submit --master yarn examples/src/main/r/pi.R 4
+```
+
+### Report Issues/Feedback 
+
+For better tracking and collaboration, issues and TODO items are reported 
to a dedicated [SparkR JIRA](https://sparkr.atlassian.net/browse/SPARKR/).
--- End diff --

Done



[GitHub] spark pull request: [SPARK-5654] Integrate SparkR

2015-04-06 Thread shivaram
Github user shivaram commented on a diff in the pull request:

https://github.com/apache/spark/pull/5096#discussion_r27847768
  
--- Diff: 
yarn/src/main/scala/org/apache/spark/deploy/yarn/ApplicationMaster.scala ---
@@ -469,6 +469,9 @@ private[spark] class ApplicationMaster(
   System.setProperty("spark.submit.pyFiles",
 PythonRunner.formatPaths(args.pyFiles).mkString(","))
 }
+if (args.primaryRFile != null && args.primaryRFile.endsWith(".R")) {
+  // TODO(davies): add R dependencies here
--- End diff --

Well, this is yes and no: in YARN cluster mode we construct the correct
command line, but the launcher script can't find the SparkR location because
we currently use SPARK_HOME to find the install location. I'll open a JIRA for
this and describe the details.



[GitHub] spark pull request: [SPARK-5654] Integrate SparkR

2015-04-06 Thread shivaram
Github user shivaram commented on a diff in the pull request:

https://github.com/apache/spark/pull/5096#discussion_r27847734
  
--- Diff: R/pkg/DESCRIPTION ---
@@ -0,0 +1,35 @@
+Package: SparkR
+Type: Package
+Title: R frontend for Spark
+Version: 0.1
--- End diff --

Made it 1.4 - R doesn't like having `-SNAPSHOT` in the version number :)



[GitHub] spark pull request: [SPARK-5654] Integrate SparkR

2015-04-06 Thread shivaram
Github user shivaram commented on a diff in the pull request:

https://github.com/apache/spark/pull/5096#discussion_r27841793
  
--- Diff: R/SparkR_prep-0.1.sh ---
@@ -0,0 +1,52 @@
+#!/bin/sh
+
+#
+# Licensed to the Apache Software Foundation (ASF) under one or more
+# contributor license agreements.  See the NOTICE file distributed with
+# this work for additional information regarding copyright ownership.
+# The ASF licenses this file to You under the Apache License, Version 2.0
+# (the "License"); you may not use this file except in compliance with
+# the License.  You may obtain a copy of the License at
+#
+#http://www.apache.org/licenses/LICENSE-2.0
+#
+# Unless required by applicable law or agreed to in writing, software
+# distributed under the License is distributed on an "AS IS" BASIS,
+# WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+# See the License for the specific language governing permissions and
+# limitations under the License.
+#
+
+# Create and move to a new directory that can be easily cleaned up
+mkdir build_SparkR
--- End diff --

This was a helper script for a getting-started guide. I think we can maintain
it in a wiki or somewhere else; we don't need it in the tree. Removing it.



[GitHub] spark pull request: [SPARK-5654] Integrate SparkR

2015-04-06 Thread shivaram
Github user shivaram commented on a diff in the pull request:

https://github.com/apache/spark/pull/5096#discussion_r27841627
  
--- Diff: 
yarn/src/main/scala/org/apache/spark/deploy/yarn/ApplicationMaster.scala ---
@@ -469,6 +469,9 @@ private[spark] class ApplicationMaster(
   System.setProperty("spark.submit.pyFiles",
 PythonRunner.formatPaths(args.pyFiles).mkString(","))
 }
+if (args.primaryRFile != null && args.primaryRFile.endsWith(".R")) {
+  // TODO(davies): add R dependencies here
--- End diff --

In YARN cluster mode we will launch the Runner with the right R file etc.
However, the SparkR package won't be found unless it's already installed on
all the machines. I'll open a JIRA for this.



[GitHub] spark pull request: [SPARK-5654] Integrate SparkR

2015-04-06 Thread SparkQA
Github user SparkQA commented on the pull request:

https://github.com/apache/spark/pull/5096#issuecomment-90280783
  
  [Test build #29758 has 
finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/29758/consoleFull)
 for   PR 5096 at commit 
[`eb5da53`](https://github.com/apache/spark/commit/eb5da53d6bda72e1d181cd4e7ef3b8e48308d101).
 * This patch **fails Scala style tests**.
 * This patch merges cleanly.
 * This patch adds no public classes.
 * This patch **removes the following dependencies:**
   * `RoaringBitmap-0.4.5.jar`
   * `activation-1.1.jar`
   * `akka-actor_2.10-2.3.4-spark.jar`
   * `akka-remote_2.10-2.3.4-spark.jar`
   * `akka-slf4j_2.10-2.3.4-spark.jar`
   * `aopalliance-1.0.jar`
   * `arpack_combined_all-0.1.jar`
   * `avro-1.7.7.jar`
   * `breeze-macros_2.10-0.11.2.jar`
   * `breeze_2.10-0.11.2.jar`
   * `chill-java-0.5.0.jar`
   * `chill_2.10-0.5.0.jar`
   * `commons-beanutils-1.7.0.jar`
   * `commons-beanutils-core-1.8.0.jar`
   * `commons-cli-1.2.jar`
   * `commons-codec-1.10.jar`
   * `commons-collections-3.2.1.jar`
   * `commons-compress-1.4.1.jar`
   * `commons-configuration-1.6.jar`
   * `commons-digester-1.8.jar`
   * `commons-httpclient-3.1.jar`
   * `commons-io-2.1.jar`
   * `commons-lang-2.5.jar`
   * `commons-lang3-3.3.2.jar`
   * `commons-math-2.1.jar`
   * `commons-math3-3.1.1.jar`
   * `commons-net-2.2.jar`
   * `compress-lzf-1.0.0.jar`
   * `config-1.2.1.jar`
   * `core-1.1.2.jar`
   * `curator-client-2.4.0.jar`
   * `curator-framework-2.4.0.jar`
   * `curator-recipes-2.4.0.jar`
   * `gmbal-api-only-3.0.0-b023.jar`
   * `grizzly-framework-2.1.2.jar`
   * `grizzly-http-2.1.2.jar`
   * `grizzly-http-server-2.1.2.jar`
   * `grizzly-http-servlet-2.1.2.jar`
   * `grizzly-rcm-2.1.2.jar`
   * `groovy-all-2.3.7.jar`
   * `guava-14.0.1.jar`
   * `guice-3.0.jar`
   * `hadoop-annotations-2.2.0.jar`
   * `hadoop-auth-2.2.0.jar`
   * `hadoop-client-2.2.0.jar`
   * `hadoop-common-2.2.0.jar`
   * `hadoop-hdfs-2.2.0.jar`
   * `hadoop-mapreduce-client-app-2.2.0.jar`
   * `hadoop-mapreduce-client-common-2.2.0.jar`
   * `hadoop-mapreduce-client-core-2.2.0.jar`
   * `hadoop-mapreduce-client-jobclient-2.2.0.jar`
   * `hadoop-mapreduce-client-shuffle-2.2.0.jar`
   * `hadoop-yarn-api-2.2.0.jar`
   * `hadoop-yarn-client-2.2.0.jar`
   * `hadoop-yarn-common-2.2.0.jar`
   * `hadoop-yarn-server-common-2.2.0.jar`
   * `ivy-2.4.0.jar`
   * `jackson-annotations-2.4.0.jar`
   * `jackson-core-2.4.4.jar`
   * `jackson-core-asl-1.8.8.jar`
   * `jackson-databind-2.4.4.jar`
   * `jackson-jaxrs-1.8.8.jar`
   * `jackson-mapper-asl-1.8.8.jar`
   * `jackson-module-scala_2.10-2.4.4.jar`
   * `jackson-xc-1.8.8.jar`
   * `jansi-1.4.jar`
   * `javax.inject-1.jar`
   * `javax.servlet-3.0.0.v201112011016.jar`
   * `javax.servlet-3.1.jar`
   * `javax.servlet-api-3.0.1.jar`
   * `jaxb-api-2.2.2.jar`
   * `jaxb-impl-2.2.3-1.jar`
   * `jcl-over-slf4j-1.7.10.jar`
   * `jersey-client-1.9.jar`
   * `jersey-core-1.9.jar`
   * `jersey-grizzly2-1.9.jar`
   * `jersey-guice-1.9.jar`
   * `jersey-json-1.9.jar`
   * `jersey-server-1.9.jar`
   * `jersey-test-framework-core-1.9.jar`
   * `jersey-test-framework-grizzly2-1.9.jar`
   * `jets3t-0.7.1.jar`
   * `jettison-1.1.jar`
   * `jetty-util-6.1.26.jar`
   * `jline-0.9.94.jar`
   * `jline-2.10.4.jar`
   * `jodd-core-3.6.3.jar`
   * `json4s-ast_2.10-3.2.10.jar`
   * `json4s-core_2.10-3.2.10.jar`
   * `json4s-jackson_2.10-3.2.10.jar`
   * `jsr305-1.3.9.jar`
   * `jtransforms-2.4.0.jar`
   * `jul-to-slf4j-1.7.10.jar`
   * `kryo-2.21.jar`
   * `log4j-1.2.17.jar`
   * `lz4-1.2.0.jar`
   * `management-api-3.0.0-b012.jar`
   * `mesos-0.21.0-shaded-protobuf.jar`
   * `metrics-core-3.1.0.jar`
   * `metrics-graphite-3.1.0.jar`
   * `metrics-json-3.1.0.jar`
   * `metrics-jvm-3.1.0.jar`
   * `minlog-1.2.jar`
   * `netty-3.8.0.Final.jar`
   * `netty-all-4.0.23.Final.jar`
   * `objenesis-1.2.jar`
   * `opencsv-2.3.jar`
   * `oro-2.0.8.jar`
   * `paranamer-2.6.jar`
   * `parquet-column-1.6.0rc3.jar`
   * `parquet-common-1.6.0rc3.jar`
   * `parquet-encoding-1.6.0rc3.jar`
   * `parquet-format-2.2.0-rc1.jar`
   * `parquet-generator-1.6.0rc3.jar`
   * `parquet-hadoop-1.6.0rc3.jar`
   * `parquet-jackson-1.6.0rc3.jar`
   * `protobuf-java-2.4.1.jar`
   * `protobuf-java-2.5.0-spark.jar`
   * `py4j-0.8.2.1.jar`
   * `pyrolite-2.0.1.jar`
   * `quasiquotes_2.10-2.0.1.jar`
   * `reflectasm-1.07-shaded.jar`
   * `scala-compiler-2.10.4.jar`
   * `scala-library-2.10.4.jar`
   *

[GitHub] spark pull request: [SPARK-5654] Integrate SparkR

2015-04-06 Thread AmplabJenkins
Github user AmplabJenkins commented on the pull request:

https://github.com/apache/spark/pull/5096#issuecomment-90280801
  
Test FAILed.
Refer to this link for build results (access rights to CI server needed): 
https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/29758/
Test FAILed.



[GitHub] spark pull request: [SPARK-5654] Integrate SparkR

2015-04-06 Thread SparkQA
Github user SparkQA commented on the pull request:

https://github.com/apache/spark/pull/5096#issuecomment-90275418
  
  [Test build #29758 has 
started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/29758/consoleFull)
 for   PR 5096 at commit 
[`eb5da53`](https://github.com/apache/spark/commit/eb5da53d6bda72e1d181cd4e7ef3b8e48308d101).



[GitHub] spark pull request: [SPARK-5654] Integrate SparkR

2015-04-06 Thread SparkQA
Github user SparkQA commented on the pull request:

https://github.com/apache/spark/pull/5096#issuecomment-90163530
  
  [Test build #29746 has 
finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/29746/consoleFull)
 for   PR 5096 at commit 
[`5133f3a`](https://github.com/apache/spark/commit/5133f3ae448b992fcaedc19d834aed9a649aa740).
 * This patch **fails Scala style tests**.
 * This patch merges cleanly.
 * This patch adds no public classes.
 * This patch **removes the following dependencies:**
   * `RoaringBitmap-0.4.5.jar`
   * `activation-1.1.jar`
   * `akka-actor_2.10-2.3.4-spark.jar`
   * `akka-remote_2.10-2.3.4-spark.jar`
   * `akka-slf4j_2.10-2.3.4-spark.jar`
   * `aopalliance-1.0.jar`
   * `arpack_combined_all-0.1.jar`
   * `avro-1.7.7.jar`
   * `breeze-macros_2.10-0.11.2.jar`
   * `breeze_2.10-0.11.2.jar`
   * `chill-java-0.5.0.jar`
   * `chill_2.10-0.5.0.jar`
   * `commons-beanutils-1.7.0.jar`
   * `commons-beanutils-core-1.8.0.jar`
   * `commons-cli-1.2.jar`
   * `commons-codec-1.10.jar`
   * `commons-collections-3.2.1.jar`
   * `commons-compress-1.4.1.jar`
   * `commons-configuration-1.6.jar`
   * `commons-digester-1.8.jar`
   * `commons-httpclient-3.1.jar`
   * `commons-io-2.1.jar`
   * `commons-lang-2.5.jar`
   * `commons-lang3-3.3.2.jar`
   * `commons-math-2.1.jar`
   * `commons-math3-3.1.1.jar`
   * `commons-net-2.2.jar`
   * `compress-lzf-1.0.0.jar`
   * `config-1.2.1.jar`
   * `core-1.1.2.jar`
   * `curator-client-2.4.0.jar`
   * `curator-framework-2.4.0.jar`
   * `curator-recipes-2.4.0.jar`
   * `gmbal-api-only-3.0.0-b023.jar`
   * `grizzly-framework-2.1.2.jar`
   * `grizzly-http-2.1.2.jar`
   * `grizzly-http-server-2.1.2.jar`
   * `grizzly-http-servlet-2.1.2.jar`
   * `grizzly-rcm-2.1.2.jar`
   * `groovy-all-2.3.7.jar`
   * `guava-14.0.1.jar`
   * `guice-3.0.jar`
   * `hadoop-annotations-2.2.0.jar`
   * `hadoop-auth-2.2.0.jar`
   * `hadoop-client-2.2.0.jar`
   * `hadoop-common-2.2.0.jar`
   * `hadoop-hdfs-2.2.0.jar`
   * `hadoop-mapreduce-client-app-2.2.0.jar`
   * `hadoop-mapreduce-client-common-2.2.0.jar`
   * `hadoop-mapreduce-client-core-2.2.0.jar`
   * `hadoop-mapreduce-client-jobclient-2.2.0.jar`
   * `hadoop-mapreduce-client-shuffle-2.2.0.jar`
   * `hadoop-yarn-api-2.2.0.jar`
   * `hadoop-yarn-client-2.2.0.jar`
   * `hadoop-yarn-common-2.2.0.jar`
   * `hadoop-yarn-server-common-2.2.0.jar`
   * `ivy-2.4.0.jar`
   * `jackson-annotations-2.4.0.jar`
   * `jackson-core-2.4.4.jar`
   * `jackson-core-asl-1.8.8.jar`
   * `jackson-databind-2.4.4.jar`
   * `jackson-jaxrs-1.8.8.jar`
   * `jackson-mapper-asl-1.8.8.jar`
   * `jackson-module-scala_2.10-2.4.4.jar`
   * `jackson-xc-1.8.8.jar`
   * `jansi-1.4.jar`
   * `javax.inject-1.jar`
   * `javax.servlet-3.0.0.v201112011016.jar`
   * `javax.servlet-3.1.jar`
   * `javax.servlet-api-3.0.1.jar`
   * `jaxb-api-2.2.2.jar`
   * `jaxb-impl-2.2.3-1.jar`
   * `jcl-over-slf4j-1.7.10.jar`
   * `jersey-client-1.9.jar`
   * `jersey-core-1.9.jar`
   * `jersey-grizzly2-1.9.jar`
   * `jersey-guice-1.9.jar`
   * `jersey-json-1.9.jar`
   * `jersey-server-1.9.jar`
   * `jersey-test-framework-core-1.9.jar`
   * `jersey-test-framework-grizzly2-1.9.jar`
   * `jets3t-0.7.1.jar`
   * `jettison-1.1.jar`
   * `jetty-util-6.1.26.jar`
   * `jline-0.9.94.jar`
   * `jline-2.10.4.jar`
   * `jodd-core-3.6.3.jar`
   * `json4s-ast_2.10-3.2.10.jar`
   * `json4s-core_2.10-3.2.10.jar`
   * `json4s-jackson_2.10-3.2.10.jar`
   * `jsr305-1.3.9.jar`
   * `jtransforms-2.4.0.jar`
   * `jul-to-slf4j-1.7.10.jar`
   * `kryo-2.21.jar`
   * `log4j-1.2.17.jar`
   * `lz4-1.2.0.jar`
   * `management-api-3.0.0-b012.jar`
   * `mesos-0.21.0-shaded-protobuf.jar`
   * `metrics-core-3.1.0.jar`
   * `metrics-graphite-3.1.0.jar`
   * `metrics-json-3.1.0.jar`
   * `metrics-jvm-3.1.0.jar`
   * `minlog-1.2.jar`
   * `netty-3.8.0.Final.jar`
   * `netty-all-4.0.23.Final.jar`
   * `objenesis-1.2.jar`
   * `opencsv-2.3.jar`
   * `oro-2.0.8.jar`
   * `paranamer-2.6.jar`
   * `parquet-column-1.6.0rc3.jar`
   * `parquet-common-1.6.0rc3.jar`
   * `parquet-encoding-1.6.0rc3.jar`
   * `parquet-format-2.2.0-rc1.jar`
   * `parquet-generator-1.6.0rc3.jar`
   * `parquet-hadoop-1.6.0rc3.jar`
   * `parquet-jackson-1.6.0rc3.jar`
   * `protobuf-java-2.4.1.jar`
   * `protobuf-java-2.5.0-spark.jar`
   * `py4j-0.8.2.1.jar`
   * `pyrolite-2.0.1.jar`
   * `quasiquotes_2.10-2.0.1.jar`
   * `reflectasm-1.07-shaded.jar`
   * `scala-compiler-2.10.4.jar`
   * `scala-library-2.10.4.jar`
   *

[GitHub] spark pull request: [SPARK-5654] Integrate SparkR

2015-04-06 Thread AmplabJenkins
Github user AmplabJenkins commented on the pull request:

https://github.com/apache/spark/pull/5096#issuecomment-90163536
  
Test FAILed.
Refer to this link for build results (access rights to CI server needed): 
https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/29746/
Test FAILed.



[GitHub] spark pull request: [SPARK-5654] Integrate SparkR

2015-04-06 Thread SparkQA
Github user SparkQA commented on the pull request:

https://github.com/apache/spark/pull/5096#issuecomment-90160377
  
  [Test build #29746 has 
started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/29746/consoleFull)
 for   PR 5096 at commit 
[`5133f3a`](https://github.com/apache/spark/commit/5133f3ae448b992fcaedc19d834aed9a649aa740).



[GitHub] spark pull request: [SPARK-5654] Integrate SparkR

2015-04-03 Thread andrewor14
Github user andrewor14 commented on the pull request:

https://github.com/apache/spark/pull/5096#issuecomment-89479811
  
Spark submit changes look fine for the most part. The comments I left are 
mostly minor. One thing I noticed is that there are a lot of places where the 
return type is not specified. Even if a method returns nothing, we should
enforce adding `: Unit` to the signature so that developers new to the project
will do the same.

I intend to take another pass after the comments are addressed. I have not 
looked in detail at changes outside of Spark submit.
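
To illustrate the convention being asked for, a minimal sketch (the method
names here are made up for the example):

```
// Preferred: the return type is spelled out even though it is Unit.
def stopBackend(): Unit = {
  println("stopping backend")
}

// Discouraged: compiles to the same thing, but the inferred Unit return
// type is easy to miss when reading the code.
def stopBackendInferred() = {
  println("stopping backend")
}
```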



[GitHub] spark pull request: [SPARK-5654] Integrate SparkR

2015-04-03 Thread andrewor14
Github user andrewor14 commented on a diff in the pull request:

https://github.com/apache/spark/pull/5096#discussion_r27765342
  
--- Diff: 
yarn/src/main/scala/org/apache/spark/deploy/yarn/ApplicationMasterArguments.scala
 ---
@@ -92,6 +97,7 @@ class ApplicationMasterArguments(val args: Array[String]) 
{
   |  --jar JAR_PATH   Path to your application's JAR file
   |  --class CLASS_NAME   Name of your application's main class
   |  --primary-py-fileA main Python file
+  |  --primary-r-file A main R file
--- End diff --

IIUC if the user wants to run R then s/he shouldn't have Python files. Maybe
we should validate that only one of `primaryRFile` and `primaryPyFile` can be
set at a given time.
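
A sketch of that validation, assuming it lives next to the existing argument
checks (the exact message and error-handling style here are illustrative, not
taken from the patch):

```
// Only one of a primary Python file and a primary R file may be set.
if (primaryPyFile != null && primaryRFile != null) {
  System.err.println("Cannot have primary-py-file and primary-r-file" +
    " at the same time")
  System.exit(-1)
}
```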



[GitHub] spark pull request: [SPARK-5654] Integrate SparkR

2015-04-03 Thread andrewor14
Github user andrewor14 commented on a diff in the pull request:

https://github.com/apache/spark/pull/5096#discussion_r27765330
  
--- Diff: yarn/src/main/scala/org/apache/spark/deploy/yarn/Client.scala ---
@@ -497,12 +503,15 @@ private[spark] class Client(
 if (args.primaryPyFile != null && args.primaryPyFile.endsWith(".py")) {
   args.userArgs = ArrayBuffer(args.primaryPyFile, args.pyFiles) ++ 
args.userArgs
 }
+if (args.primaryRFile != null && args.primaryRFile.endsWith(".R")) {
+  args.userArgs = ArrayBuffer(args.primaryRFile) ++ args.userArgs
+}
 val userArgs = args.userArgs.flatMap { arg =>
   Seq("--arg", YarnSparkHadoopUtil.escapeForShell(arg))
 }
 val amArgs =
-  Seq(amClass) ++ userClass ++ userJar ++ primaryPyFile ++ pyFiles ++ 
userArgs ++
-Seq(
+  Seq(amClass) ++ userClass ++ userJar ++ primaryPyFile ++ pyFiles ++ 
primaryRFile ++
--- End diff --

this is getting kind of scary. Can we add a comment or a check that ensures
the Python and the R files are mutually exclusive? (are they?)
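
One way to make the invariant explicit before assembling `amArgs` would be a
fail-fast check like the sketch below (assuming the two primaries are indeed
mutually exclusive):

```
// Fail fast if both a Python and an R primary resource were passed through.
require(args.primaryPyFile == null || args.primaryRFile == null,
  "Cannot specify both a primary Python file and a primary R file")
```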



[GitHub] spark pull request: [SPARK-5654] Integrate SparkR

2015-04-03 Thread andrewor14
Github user andrewor14 commented on a diff in the pull request:

https://github.com/apache/spark/pull/5096#discussion_r27765306
  
--- Diff: 
yarn/src/main/scala/org/apache/spark/deploy/yarn/ApplicationMaster.scala ---
@@ -469,6 +469,9 @@ private[spark] class ApplicationMaster(
   System.setProperty("spark.submit.pyFiles",
 PythonRunner.formatPaths(args.pyFiles).mkString(","))
 }
+if (args.primaryRFile != null && args.primaryRFile.endsWith(".R")) {
+  // TODO(davies): add R dependencies here
--- End diff --

wait, so will this work on YARN?



[GitHub] spark pull request: [SPARK-5654] Integrate SparkR

2015-04-03 Thread andrewor14
Github user andrewor14 commented on a diff in the pull request:

https://github.com/apache/spark/pull/5096#discussion_r27765296
  
--- Diff: 
launcher/src/main/java/org/apache/spark/launcher/SparkSubmitCommandBuilder.java 
---
@@ -243,6 +258,36 @@
 return pyargs;
   }
 
+  private List<String> buildSparkRCommand(Map<String, String> env) throws IOException {
+if (!appArgs.isEmpty() && appArgs.get(0).endsWith(".R")) {
+  appResource = appArgs.get(0);
+  appArgs.remove(0);
+  return buildCommand(env);
+}
+
+Properties props = loadPropertiesFile();
+mergeEnvPathList(env, getLibPathEnvName(),
+firstNonEmptyValue(SparkLauncher.DRIVER_EXTRA_LIBRARY_PATH, 
conf, props));
+
+// Store spark-submit arguments in an environment variable, since 
there's no way to pass
+// them to sparkR on the command line.
+StringBuilder submitArgs = new StringBuilder();
+for (String arg : buildSparkSubmitArgs()) {
+  if (submitArgs.length() > 0) {
+submitArgs.append(" ");
+  }
+  submitArgs.append(quoteForPython(arg));
--- End diff --

not a big deal, but these few lines are duplicated with the Python path. Could
we have a common method that takes the spark-submit args, escapes them
properly, and puts them in an environment variable?



[GitHub] spark pull request: [SPARK-5654] Integrate SparkR

2015-04-03 Thread andrewor14
Github user andrewor14 commented on a diff in the pull request:

https://github.com/apache/spark/pull/5096#discussion_r27765269
  
--- Diff: 
launcher/src/main/java/org/apache/spark/launcher/SparkSubmitCommandBuilder.java 
---
@@ -243,6 +258,36 @@
 return pyargs;
   }
 
+  private List<String> buildSparkRCommand(Map<String, String> env) throws IOException {
+if (!appArgs.isEmpty() && appArgs.get(0).endsWith(".R")) {
+  appResource = appArgs.get(0);
+  appArgs.remove(0);
+  return buildCommand(env);
+}
+
+Properties props = loadPropertiesFile();
+mergeEnvPathList(env, getLibPathEnvName(),
+firstNonEmptyValue(SparkLauncher.DRIVER_EXTRA_LIBRARY_PATH, 
conf, props));
+
+// Store spark-submit arguments in an environment variable, since 
there's no way to pass
+// them to sparkR on the command line.
+StringBuilder submitArgs = new StringBuilder();
+for (String arg : buildSparkSubmitArgs()) {
+  if (submitArgs.length() > 0) {
+submitArgs.append(" ");
+  }
+  submitArgs.append(quoteForPython(arg));
--- End diff --

we should rename this instead of just reusing it from the Python code path



[GitHub] spark pull request: [SPARK-5654] Integrate SparkR

2015-04-03 Thread andrewor14
Github user andrewor14 commented on a diff in the pull request:

https://github.com/apache/spark/pull/5096#discussion_r27765229
  
--- Diff: 
core/src/main/scala/org/apache/spark/deploy/SparkSubmitArguments.scala ---
@@ -211,9 +212,9 @@ private[deploy] class SparkSubmitArguments(args: 
Seq[String], env: Map[String, S
   printUsageAndExit(-1)
 }
 if (primaryResource == null) {
-  SparkSubmit.printErrorAndExit("Must specify a primary resource (JAR 
or Python file)")
+  SparkSubmit.printErrorAndExit("Must specify a primary resource (JAR 
or Python or R file)")
 }
-if (mainClass == null && !isPython) {
+if (mainClass == null && !isPython && !isR) {
--- End diff --

how about `if (mainClass == null && isUserJar(primaryResource))`?
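
For reference, a sketch of the suggested form, assuming an `isUserJar` helper
that returns true only when the primary resource is a plain JAR (i.e. not a
Python/R file or a shell marker); the error message is illustrative:

```
if (mainClass == null && isUserJar(primaryResource)) {
  SparkSubmit.printErrorAndExit(
    "No main class set in JAR; please specify one with --class")
}
```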



[GitHub] spark pull request: [SPARK-5654] Integrate SparkR

2015-04-03 Thread andrewor14
Github user andrewor14 commented on a diff in the pull request:

https://github.com/apache/spark/pull/5096#discussion_r27765206
  
--- Diff: core/src/main/scala/org/apache/spark/deploy/SparkSubmit.scala ---
@@ -406,7 +438,7 @@ object SparkSubmit {
     // Add the application jar automatically so the user doesn't have to call sc.addJar
     // For YARN cluster mode, the jar is already distributed on each node as "app.jar"
     // For python files, the primary resource is already distributed as a regular file
--- End diff --

need to update this to include R




[GitHub] spark pull request: [SPARK-5654] Integrate SparkR

2015-04-03 Thread andrewor14
Github user andrewor14 commented on a diff in the pull request:

https://github.com/apache/spark/pull/5096#discussion_r27765167
  
--- Diff: core/src/main/scala/org/apache/spark/deploy/SparkSubmit.scala ---
@@ -317,11 +328,32 @@ object SparkSubmit {
       }
     }
 
-    // In yarn-cluster mode for a python app, add primary resource and pyFiles to files
-    // that can be distributed with the job
-    if (args.isPython && isYarnCluster) {
-      args.files = mergeFileLists(args.files, args.primaryResource)
-      args.files = mergeFileLists(args.files, args.pyFiles)
+    // If we're running a R app, set the main class to our specific R runner
+    if (args.isR && deployMode == CLIENT) {
+      if (args.primaryResource == SPARKR_SHELL) {
+        args.mainClass = "org.apache.spark.api.r.RBackend"
+      } else {
+        // If a R file is provided, add it to the child arguments and list of files to deploy.
+        // Usage: PythonAppRunner  [app arguments]
--- End diff --

forgot to change this?




[GitHub] spark pull request: [SPARK-5654] Integrate SparkR

2015-04-03 Thread andrewor14
Github user andrewor14 commented on a diff in the pull request:

https://github.com/apache/spark/pull/5096#discussion_r27765097
  
--- Diff: core/src/main/scala/org/apache/spark/api/r/RRDD.scala ---
@@ -0,0 +1,450 @@
+/*
+ * Licensed to the Apache Software Foundation (ASF) under one or more
+ * contributor license agreements.  See the NOTICE file distributed with
+ * this work for additional information regarding copyright ownership.
+ * The ASF licenses this file to You under the Apache License, Version 2.0
+ * (the "License"); you may not use this file except in compliance with
+ * the License.  You may obtain a copy of the License at
+ *
+ *    http://www.apache.org/licenses/LICENSE-2.0
+ *
+ * Unless required by applicable law or agreed to in writing, software
+ * distributed under the License is distributed on an "AS IS" BASIS,
+ * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+ * See the License for the specific language governing permissions and
+ * limitations under the License.
+ */
+
+package org.apache.spark.api.r
+
+import java.io._
+import java.net.ServerSocket
+import java.util.{Map => JMap}
+
+import scala.collection.JavaConversions._
+import scala.io.Source
+import scala.reflect.ClassTag
+import scala.util.Try
+
+import org.apache.spark._
+import org.apache.spark.api.java.{JavaPairRDD, JavaRDD, JavaSparkContext}
+import org.apache.spark.broadcast.Broadcast
+import org.apache.spark.rdd.RDD
+import org.apache.spark.util.Utils
+
+private abstract class BaseRRDD[T: ClassTag, U: ClassTag](
+    parent: RDD[T],
+    numPartitions: Int,
+    func: Array[Byte],
+    deserializer: String,
+    serializer: String,
+    packageNames: Array[Byte],
+    rLibDir: String,
+    broadcastVars: Array[Broadcast[Object]])
+  extends RDD[U](parent) with Logging {
+  override def getPartitions = parent.partitions
+
+  override def compute(partition: Partition, context: TaskContext): Iterator[U] = {
+
+    // The parent may be also an RRDD, so we should launch it first.
+    val parentIterator = firstParent[T].iterator(partition, context)
+
+    // we expect two connections
+    val serverSocket = new ServerSocket(0, 2)
+    val listenPort = serverSocket.getLocalPort()
+
+    // The stdout/stderr is shared by multiple tasks, because we use one daemon
+    // to launch child process as worker.
+    val errThread = RRDD.createRWorker(rLibDir, listenPort)
+
+    // We use two sockets to separate input and output, then it's easy to manage
+    // the lifecycle of them to avoid deadlock.
+    // TODO: optimize it to use one socket
+
+    // the socket used to send out the input of task
+    serverSocket.setSoTimeout(10000)
+    val inSocket = serverSocket.accept()
+    startStdinThread(inSocket.getOutputStream(), parentIterator, partition.index)
+
+    // the socket used to receive the output of task
+    val outSocket = serverSocket.accept()
+    val inputStream = new BufferedInputStream(outSocket.getInputStream)
+    val dataStream = openDataStream(inputStream)
+    serverSocket.close()
+
+    try {
+
+      return new Iterator[U] {
+        def next(): U = {
+          val obj = _nextObj
+          if (hasNext) {
+            _nextObj = read()
+          }
+          obj
+        }
+
+        var _nextObj = read()
+
+        def hasNext(): Boolean = {
+          val hasMore = (_nextObj != null)
+          if (!hasMore) {
+            dataStream.close()
+          }
+          hasMore
+        }
+      }
+    } catch {
+      case e: Exception =>
+        throw new SparkException("R computation failed with\n " + errThread.getLines())
+    }
+  }
+
+  /**
+   * Start a thread to write RDD data to the R process.
+   */
+  private def startStdinThread[T](
+      output: OutputStream,
+      iter: Iterator[T],
+      partition: Int) = {
+
+    val env = SparkEnv.get
+    val bufferSize = System.getProperty("spark.buffer.size", "65536").toInt
+    val stream = new BufferedOutputStream(output, bufferSize)
+
+    new Thread("writer for R") {
+      override def run() {
+        try {
+          SparkEnv.set(env)
+          val dataOut = new DataOutputStream(stream)
+          dataOut.writeInt(partition)
+
+          SerDe.writeString(dataOut, deserializer)
+          SerDe.writeString(dataOut, serializer)
+
+          dataOut.writeInt(packageNames.length)
+          dataOut.write(packageNames)
+
+          dataOut.writeInt(func.length)
+          dataOut.write(func)
+

[GitHub] spark pull request: [SPARK-5654] Integrate SparkR

2015-04-03 Thread andrewor14
Github user andrewor14 commented on a diff in the pull request:

https://github.com/apache/spark/pull/5096#discussion_r27765090
  
--- Diff: core/src/main/scala/org/apache/spark/api/r/RRDD.scala ---
@@ -0,0 +1,450 @@
+/*
+ * Licensed to the Apache Software Foundation (ASF) under one or more
+ * contributor license agreements.  See the NOTICE file distributed with
+ * this work for additional information regarding copyright ownership.
+ * The ASF licenses this file to You under the Apache License, Version 2.0
+ * (the "License"); you may not use this file except in compliance with
+ * the License.  You may obtain a copy of the License at
+ *
+ *    http://www.apache.org/licenses/LICENSE-2.0
+ *
+ * Unless required by applicable law or agreed to in writing, software
+ * distributed under the License is distributed on an "AS IS" BASIS,
+ * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+ * See the License for the specific language governing permissions and
+ * limitations under the License.
+ */
+
+package org.apache.spark.api.r
+
+import java.io._
+import java.net.ServerSocket
+import java.util.{Map => JMap}
+
+import scala.collection.JavaConversions._
+import scala.io.Source
+import scala.reflect.ClassTag
+import scala.util.Try
+
+import org.apache.spark._
+import org.apache.spark.api.java.{JavaPairRDD, JavaRDD, JavaSparkContext}
+import org.apache.spark.broadcast.Broadcast
+import org.apache.spark.rdd.RDD
+import org.apache.spark.util.Utils
+
+private abstract class BaseRRDD[T: ClassTag, U: ClassTag](
+    parent: RDD[T],
+    numPartitions: Int,
+    func: Array[Byte],
+    deserializer: String,
+    serializer: String,
+    packageNames: Array[Byte],
+    rLibDir: String,
+    broadcastVars: Array[Broadcast[Object]])
+  extends RDD[U](parent) with Logging {
+  override def getPartitions = parent.partitions
+
+  override def compute(partition: Partition, context: TaskContext): Iterator[U] = {
+
+    // The parent may be also an RRDD, so we should launch it first.
+    val parentIterator = firstParent[T].iterator(partition, context)
+
+    // we expect two connections
+    val serverSocket = new ServerSocket(0, 2)
+    val listenPort = serverSocket.getLocalPort()
+
+    // The stdout/stderr is shared by multiple tasks, because we use one daemon
+    // to launch child process as worker.
+    val errThread = RRDD.createRWorker(rLibDir, listenPort)
+
+    // We use two sockets to separate input and output, then it's easy to manage
+    // the lifecycle of them to avoid deadlock.
+    // TODO: optimize it to use one socket
+
+    // the socket used to send out the input of task
+    serverSocket.setSoTimeout(10000)
+    val inSocket = serverSocket.accept()
+    startStdinThread(inSocket.getOutputStream(), parentIterator, partition.index)
+
+    // the socket used to receive the output of task
+    val outSocket = serverSocket.accept()
+    val inputStream = new BufferedInputStream(outSocket.getInputStream)
+    val dataStream = openDataStream(inputStream)
+    serverSocket.close()
+
+    try {
+
+      return new Iterator[U] {
+        def next(): U = {
+          val obj = _nextObj
+          if (hasNext) {
+            _nextObj = read()
+          }
+          obj
+        }
+
+        var _nextObj = read()
+
+        def hasNext(): Boolean = {
+          val hasMore = (_nextObj != null)
+          if (!hasMore) {
+            dataStream.close()
+          }
+          hasMore
+        }
+      }
+    } catch {
+      case e: Exception =>
+        throw new SparkException("R computation failed with\n " + errThread.getLines())
+    }
+  }
+
+  /**
+   * Start a thread to write RDD data to the R process.
+   */
+  private def startStdinThread[T](
+      output: OutputStream,
+      iter: Iterator[T],
+      partition: Int) = {
--- End diff --

`Unit`
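
That is, `startStdinThread` should declare its result type explicitly rather
than rely on procedure syntax. A minimal illustration of the requested style
(general Scala style, not code from this PR):

    class Example {
      def before(n: Int) { println(n) }         // procedure syntax, discouraged
      def after(n: Int): Unit = { println(n) }  // explicit Unit return type
    }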




[GitHub] spark pull request: [SPARK-5654] Integrate SparkR

2015-04-03 Thread andrewor14
Github user andrewor14 commented on a diff in the pull request:

https://github.com/apache/spark/pull/5096#discussion_r27765075
  
--- Diff: core/src/main/scala/org/apache/spark/api/r/RBackendHandler.scala ---
@@ -0,0 +1,222 @@
+/*
+ * Licensed to the Apache Software Foundation (ASF) under one or more
+ * contributor license agreements.  See the NOTICE file distributed with
+ * this work for additional information regarding copyright ownership.
+ * The ASF licenses this file to You under the Apache License, Version 2.0
+ * (the "License"); you may not use this file except in compliance with
+ * the License.  You may obtain a copy of the License at
+ *
+ *    http://www.apache.org/licenses/LICENSE-2.0
+ *
+ * Unless required by applicable law or agreed to in writing, software
+ * distributed under the License is distributed on an "AS IS" BASIS,
+ * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+ * See the License for the specific language governing permissions and
+ * limitations under the License.
+ */
+
+package org.apache.spark.api.r
+
+import java.io.{ByteArrayInputStream, ByteArrayOutputStream, DataInputStream, DataOutputStream}
+
+import scala.collection.mutable.HashMap
+
+import io.netty.channel.ChannelHandler.Sharable
+import io.netty.channel.{ChannelHandlerContext, SimpleChannelInboundHandler}
+
+import org.apache.spark.Logging
+import org.apache.spark.api.r.SerDe._
+
+/**
+ * Handler for RBackend
+ * TODO: This is marked as sharable to get a handle to RBackend. Is it safe to re-use
+ * this across connections ?
+ */
+@Sharable
+private[r] class RBackendHandler(server: RBackend)
+  extends SimpleChannelInboundHandler[Array[Byte]] with Logging {
+
+  override def channelRead0(ctx: ChannelHandlerContext, msg: Array[Byte]) {
+    val bis = new ByteArrayInputStream(msg)
+    val dis = new DataInputStream(bis)
+
+    val bos = new ByteArrayOutputStream()
+    val dos = new DataOutputStream(bos)
+
+    // First bit is isStatic
+    val isStatic = readBoolean(dis)
+    val objId = readString(dis)
+    val methodName = readString(dis)
+    val numArgs = readInt(dis)
+
+    if (objId == "SparkRHandler") {
+      methodName match {
+        case "stopBackend" =>
+          writeInt(dos, 0)
+          writeType(dos, "void")
+          server.close()
+        case "rm" =>
+          try {
+            val t = readObjectType(dis)
+            assert(t == 'c')
+            val objToRemove = readString(dis)
+            JVMObjectTracker.remove(objToRemove)
+            writeInt(dos, 0)
+            writeObject(dos, null)
+          } catch {
+            case e: Exception =>
+              logError(s"Removing $objId failed", e)
+              writeInt(dos, -1)
+          }
+        case _ => dos.writeInt(-1)
+      }
+    } else {
+      handleMethodCall(isStatic, objId, methodName, numArgs, dis, dos)
+    }
+
+    val reply = bos.toByteArray
+    ctx.write(reply)
+  }
+
+  override def channelReadComplete(ctx: ChannelHandlerContext) {
+    ctx.flush()
+  }
+
+  override def exceptionCaught(ctx: ChannelHandlerContext, cause: Throwable) {
+    // Close the connection when an exception is raised.
+    cause.printStackTrace()
+    ctx.close()
+  }
+
+  def handleMethodCall(
+      isStatic: Boolean,
+      objId: String,
+      methodName: String,
+      numArgs: Int,
+      dis: DataInputStream,
+      dos: DataOutputStream) {
+    var obj: Object = null
+    try {
+      val cls = if (isStatic) {
+        Class.forName(objId)
+      } else {
+        JVMObjectTracker.get(objId) match {
+          case None => throw new IllegalArgumentException("Object not found " + objId)
+          case Some(o) =>
+            obj = o
+            o.getClass
+        }
+      }
+
+      val args = readArgs(numArgs, dis)
+
+      val methods = cls.getMethods
+      val selectedMethods = methods.filter(m => m.getName == methodName)
+      if (selectedMethods.length > 0) {
+        val methods = selectedMethods.filter { x =>
+          matchMethod(numArgs, args, x.getParameterTypes)
+        }
+        if (methods.isEmpty) {
+          logWarning(s"cannot find matching method ${cls}.$methodName. "
+            + s"Candidates are:")
+          selectedMethods.foreach { method =>
+            logWarning(s"$methodName(${method.getParameterTypes.mkString(",")})")
+          }
+          throw new Exception(s"No matched method found for $cls.$methodName")
+        }
+        val ret = method

[GitHub] spark pull request: [SPARK-5654] Integrate SparkR

2015-04-03 Thread andrewor14
Github user andrewor14 commented on a diff in the pull request:

https://github.com/apache/spark/pull/5096#discussion_r27765041
  
--- Diff: core/src/main/scala/org/apache/spark/api/r/RBackendHandler.scala ---
@@ -0,0 +1,222 @@
+/*
+ * Licensed to the Apache Software Foundation (ASF) under one or more
+ * contributor license agreements.  See the NOTICE file distributed with
+ * this work for additional information regarding copyright ownership.
+ * The ASF licenses this file to You under the Apache License, Version 2.0
+ * (the "License"); you may not use this file except in compliance with
+ * the License.  You may obtain a copy of the License at
+ *
+ *    http://www.apache.org/licenses/LICENSE-2.0
+ *
+ * Unless required by applicable law or agreed to in writing, software
+ * distributed under the License is distributed on an "AS IS" BASIS,
+ * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+ * See the License for the specific language governing permissions and
+ * limitations under the License.
+ */
+
+package org.apache.spark.api.r
+
+import java.io.{ByteArrayInputStream, ByteArrayOutputStream, DataInputStream, DataOutputStream}
+
+import scala.collection.mutable.HashMap
+
+import io.netty.channel.ChannelHandler.Sharable
+import io.netty.channel.{ChannelHandlerContext, SimpleChannelInboundHandler}
+
+import org.apache.spark.Logging
+import org.apache.spark.api.r.SerDe._
+
+/**
+ * Handler for RBackend
+ * TODO: This is marked as sharable to get a handle to RBackend. Is it safe to re-use
+ * this across connections ?
+ */
+@Sharable
+private[r] class RBackendHandler(server: RBackend)
+  extends SimpleChannelInboundHandler[Array[Byte]] with Logging {
+
+  override def channelRead0(ctx: ChannelHandlerContext, msg: Array[Byte]) {
+    val bis = new ByteArrayInputStream(msg)
+    val dis = new DataInputStream(bis)
+
+    val bos = new ByteArrayOutputStream()
+    val dos = new DataOutputStream(bos)
+
+    // First bit is isStatic
+    val isStatic = readBoolean(dis)
+    val objId = readString(dis)
+    val methodName = readString(dis)
+    val numArgs = readInt(dis)
+
+    if (objId == "SparkRHandler") {
+      methodName match {
+        case "stopBackend" =>
+          writeInt(dos, 0)
+          writeType(dos, "void")
+          server.close()
+        case "rm" =>
+          try {
+            val t = readObjectType(dis)
+            assert(t == 'c')
+            val objToRemove = readString(dis)
+            JVMObjectTracker.remove(objToRemove)
+            writeInt(dos, 0)
+            writeObject(dos, null)
+          } catch {
+            case e: Exception =>
+              logError(s"Removing $objId failed", e)
+              writeInt(dos, -1)
+          }
+        case _ => dos.writeInt(-1)
+      }
+    } else {
+      handleMethodCall(isStatic, objId, methodName, numArgs, dis, dos)
+    }
+
+    val reply = bos.toByteArray
+    ctx.write(reply)
+  }
+
+  override def channelReadComplete(ctx: ChannelHandlerContext) {
+    ctx.flush()
+  }
+
+  override def exceptionCaught(ctx: ChannelHandlerContext, cause: Throwable) {
+    // Close the connection when an exception is raised.
+    cause.printStackTrace()
+    ctx.close()
+  }
+
+  def handleMethodCall(
+      isStatic: Boolean,
+      objId: String,
+      methodName: String,
+      numArgs: Int,
+      dis: DataInputStream,
+      dos: DataOutputStream) {
+    var obj: Object = null
+    try {
+      val cls = if (isStatic) {
+        Class.forName(objId)
+      } else {
+        JVMObjectTracker.get(objId) match {
+          case None => throw new IllegalArgumentException("Object not found " + objId)
+          case Some(o) =>
+            obj = o
+            o.getClass
+        }
+      }
+
+      val args = readArgs(numArgs, dis)
+
+      val methods = cls.getMethods
+      val selectedMethods = methods.filter(m => m.getName == methodName)
+      if (selectedMethods.length > 0) {
+        val methods = selectedMethods.filter { x =>
+          matchMethod(numArgs, args, x.getParameterTypes)
+        }
+        if (methods.isEmpty) {
+          logWarning(s"cannot find matching method ${cls}.$methodName. "
+            + s"Candidates are:")
+          selectedMethods.foreach { method =>
+            logWarning(s"$methodName(${method.getParameterTypes.mkString(",")})")
+          }
+          throw new Exception(s"No matched method found for $cls.$methodName")
+        }
+        val ret = method

[GitHub] spark pull request: [SPARK-5654] Integrate SparkR

2015-04-03 Thread andrewor14
Github user andrewor14 commented on a diff in the pull request:

https://github.com/apache/spark/pull/5096#discussion_r27765001
  
--- Diff: core/src/main/scala/org/apache/spark/api/r/RBackendHandler.scala ---
@@ -0,0 +1,222 @@
+/*
+ * Licensed to the Apache Software Foundation (ASF) under one or more
+ * contributor license agreements.  See the NOTICE file distributed with
+ * this work for additional information regarding copyright ownership.
+ * The ASF licenses this file to You under the Apache License, Version 2.0
+ * (the "License"); you may not use this file except in compliance with
+ * the License.  You may obtain a copy of the License at
+ *
+ *    http://www.apache.org/licenses/LICENSE-2.0
+ *
+ * Unless required by applicable law or agreed to in writing, software
+ * distributed under the License is distributed on an "AS IS" BASIS,
+ * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+ * See the License for the specific language governing permissions and
+ * limitations under the License.
+ */
+
+package org.apache.spark.api.r
+
+import java.io.{ByteArrayInputStream, ByteArrayOutputStream, DataInputStream, DataOutputStream}
+
+import scala.collection.mutable.HashMap
+
+import io.netty.channel.ChannelHandler.Sharable
+import io.netty.channel.{ChannelHandlerContext, SimpleChannelInboundHandler}
+
+import org.apache.spark.Logging
+import org.apache.spark.api.r.SerDe._
+
+/**
+ * Handler for RBackend
+ * TODO: This is marked as sharable to get a handle to RBackend. Is it safe to re-use
+ * this across connections ?
+ */
+@Sharable
+private[r] class RBackendHandler(server: RBackend)
+  extends SimpleChannelInboundHandler[Array[Byte]] with Logging {
+
+  override def channelRead0(ctx: ChannelHandlerContext, msg: Array[Byte]) {
--- End diff --

`: Unit = `




[GitHub] spark pull request: [SPARK-5654] Integrate SparkR

2015-04-03 Thread andrewor14
Github user andrewor14 commented on a diff in the pull request:

https://github.com/apache/spark/pull/5096#discussion_r27764985
  
--- Diff: core/src/main/scala/org/apache/spark/api/r/RBackend.scala ---
@@ -0,0 +1,145 @@
+/*
+ * Licensed to the Apache Software Foundation (ASF) under one or more
+ * contributor license agreements.  See the NOTICE file distributed with
+ * this work for additional information regarding copyright ownership.
+ * The ASF licenses this file to You under the Apache License, Version 2.0
+ * (the "License"); you may not use this file except in compliance with
+ * the License.  You may obtain a copy of the License at
+ *
+ *    http://www.apache.org/licenses/LICENSE-2.0
+ *
+ * Unless required by applicable law or agreed to in writing, software
+ * distributed under the License is distributed on an "AS IS" BASIS,
+ * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+ * See the License for the specific language governing permissions and
+ * limitations under the License.
+ */
+
+package org.apache.spark.api.r
+
+import java.io.{DataOutputStream, File, FileOutputStream, IOException}
+import java.net.{InetSocketAddress, ServerSocket}
+import java.util.concurrent.TimeUnit
+
+import io.netty.bootstrap.ServerBootstrap
+import io.netty.channel.{ChannelFuture, ChannelInitializer, EventLoopGroup}
+import io.netty.channel.nio.NioEventLoopGroup
+import io.netty.channel.socket.SocketChannel
+import io.netty.channel.socket.nio.NioServerSocketChannel
+import io.netty.handler.codec.LengthFieldBasedFrameDecoder
+import io.netty.handler.codec.bytes.{ByteArrayDecoder, ByteArrayEncoder}
+
+import org.apache.spark.Logging
+
+/**
+ * Netty-based backend server that is used to communicate between R and Java.
+ */
+private[spark] class RBackend {
+
+  private[this] var channelFuture: ChannelFuture = null
+  private[this] var bootstrap: ServerBootstrap = null
+  private[this] var bossGroup: EventLoopGroup = null
+
+  def init(): Int = {
+    bossGroup = new NioEventLoopGroup(2)
+    val workerGroup = bossGroup
+    val handler = new RBackendHandler(this)
+
+    bootstrap = new ServerBootstrap()
+      .group(bossGroup, workerGroup)
+      .channel(classOf[NioServerSocketChannel])
+
+    bootstrap.childHandler(new ChannelInitializer[SocketChannel]() {
+      def initChannel(ch: SocketChannel) = {
+        ch.pipeline()
+          .addLast("encoder", new ByteArrayEncoder())
+          .addLast("frameDecoder",
+            // maxFrameLength = 2G
+            // lengthFieldOffset = 0
+            // lengthFieldLength = 4
+            // lengthAdjustment = 0
+            // initialBytesToStrip = 4, i.e. strip out the length field itself
+            new LengthFieldBasedFrameDecoder(Integer.MAX_VALUE, 0, 4, 0, 4))
+          .addLast("decoder", new ByteArrayDecoder())
+          .addLast("handler", handler)
+      }
+    })
+
+    channelFuture = bootstrap.bind(new InetSocketAddress(0))
+    channelFuture.syncUninterruptibly()
+    channelFuture.channel().localAddress().asInstanceOf[InetSocketAddress].getPort()
+  }
+
+  def run(): Unit = {
+    channelFuture.channel.closeFuture().syncUninterruptibly()
+  }
+
+  def close(): Unit = {
+    if (channelFuture != null) {
+      // close is a local operation and should finish within milliseconds; timeout just to be safe
+      channelFuture.channel().close().awaitUninterruptibly(10, TimeUnit.SECONDS)
+      channelFuture = null
+    }
+    if (bootstrap != null && bootstrap.group() != null) {
+      bootstrap.group().shutdownGracefully()
+    }
+    if (bootstrap != null && bootstrap.childGroup() != null) {
+      bootstrap.childGroup().shutdownGracefully()
+    }
+    bootstrap = null
+  }
+
+}
+
+private[spark] object RBackend extends Logging {
+  def main(args: Array[String]) {
+    if (args.length < 1) {
+      System.err.println("Usage: RBackend <tempFilePath>")
+      System.exit(-1)
+    }
+    val sparkRBackend = new RBackend()
+    try {
+      // bind to random port
+      val boundPort = sparkRBackend.init()
+      val serverSocket = new ServerSocket(0, 1)
+      val listenPort = serverSocket.getLocalPort()
+
+      // tell the R process via temporary file
+      val path = args(0)
+      val f = new File(path + ".tmp")
+      val dos = new DataOutputStream(new FileOutputStream(f))
+      dos.writeInt(boundPort)
+      dos.writeInt(listenPort)
+      dos.close()
+      f.renameTo(new File(path))
+
+      // wait for the end of stdin, then exit
+      new Thread("wai
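
For context, a minimal sketch (not from this PR) of the wire format the
pipeline in `init()` implies: every message carries a 4-byte big-endian length
header, which the frame decoder strips before handing the payload to the
handler.

    import java.io.{ByteArrayOutputStream, DataOutputStream}

    def frame(payload: Array[Byte]): Array[Byte] = {
      val bos = new ByteArrayOutputStream()
      val dos = new DataOutputStream(bos)
      dos.writeInt(payload.length)  // length field: 4 bytes, big-endian
      dos.write(payload)            // payload, as seen by ByteArrayDecoder
      dos.flush()
      bos.toByteArray
    }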

[GitHub] spark pull request: [SPARK-5654] Integrate SparkR

2015-04-03 Thread andrewor14
Github user andrewor14 commented on a diff in the pull request:

https://github.com/apache/spark/pull/5096#discussion_r27764957
  
--- Diff: bin/sparkR2.cmd ---
@@ -0,0 +1,26 @@
+@echo off
+
+rem
+rem Licensed to the Apache Software Foundation (ASF) under one or more
+rem contributor license agreements.  See the NOTICE file distributed with
+rem this work for additional information regarding copyright ownership.
+rem The ASF licenses this file to You under the Apache License, Version 2.0
+rem (the "License"); you may not use this file except in compliance with
+rem the License.  You may obtain a copy of the License at
+rem
+rem    http://www.apache.org/licenses/LICENSE-2.0
+rem
+rem Unless required by applicable law or agreed to in writing, software
+rem distributed under the License is distributed on an "AS IS" BASIS,
+rem WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+rem See the License for the specific language governing permissions and
+rem limitations under the License.
+rem
+
+rem Figure out where the Spark framework is installed
+set SPARK_HOME=%~dp0..
+
+rem Load environment variables from conf\spark-env.cmd, if it exists
+if exist "%SPARK_HOME%\conf\spark-env.cmd" call "%SPARK_HOME%\conf\spark-env.cmd"
--- End diff --

This needs to be updated after #5328 goes in. SparkR on Windows will not 
work with these current changes.




[GitHub] spark pull request: [SPARK-5654] Integrate SparkR

2015-04-03 Thread pwendell
Github user pwendell commented on the pull request:

https://github.com/apache/spark/pull/5096#issuecomment-89422690
  
I took a pass on this. I still haven't wrapped my head totally around the 
packaging, but here are a few comments:

- I’m not quite sure what all the scripts are for. Can we document or 
remove some of these? It would be helpful to explain the general packaging 
mechanisms somewhere. For instance the install script seems important, but I 
don’t get what “install” means in this context (vs “package”).
- We should file a blocker for 1.4 to integrate the docs into our doc 
build and our published API docs.
- All of the examples use the Spark core API, but I imagine one of the 
biggest use cases would be using Data frames. What about an example that uses 
the data frame API?
- Changes to Spark submit look good to me.





[GitHub] spark pull request: [SPARK-5654] Integrate SparkR

2015-04-03 Thread pwendell
Github user pwendell commented on a diff in the pull request:

https://github.com/apache/spark/pull/5096#discussion_r27752993
  
--- Diff: R/README.md ---
@@ -0,0 +1,73 @@
+# R on Spark
+
+SparkR is an R package that provides a light-weight frontend to use Spark from R.
--- End diff --

Can you add a TODO comment here to merge this into the existing spark docs?




[GitHub] spark pull request: [SPARK-5654] Integrate SparkR

2015-04-03 Thread pwendell
Github user pwendell commented on a diff in the pull request:

https://github.com/apache/spark/pull/5096#discussion_r27752135
  
--- Diff: R/create-docs.sh ---
@@ -0,0 +1,46 @@
+#!/bin/bash
+
+#
+# Licensed to the Apache Software Foundation (ASF) under one or more
+# contributor license agreements.  See the NOTICE file distributed with
+# this work for additional information regarding copyright ownership.
+# The ASF licenses this file to You under the Apache License, Version 2.0
+# (the "License"); you may not use this file except in compliance with
+# the License.  You may obtain a copy of the License at
+#
+#    http://www.apache.org/licenses/LICENSE-2.0
+#
+# Unless required by applicable law or agreed to in writing, software
+# distributed under the License is distributed on an "AS IS" BASIS,
+# WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+# See the License for the specific language governing permissions and
+# limitations under the License.
+#
+
+# Script to create API docs for SparkR
+# This requires `devtools` and `knitr` to be installed on the machine.
+
+# After running this script the html docs can be found in 
+# $SPARK_HOME/R/pkg/html
+
+# Figure out where the script is
+export FWDIR="$(cd "`dirname "$0"`"; pwd)"
--- End diff --

Can this script be wired up so that our normal doc generation invokes it?




[GitHub] spark pull request: [SPARK-5654] Integrate SparkR

2015-04-03 Thread pwendell
Github user pwendell commented on a diff in the pull request:

https://github.com/apache/spark/pull/5096#discussion_r27752106
  
--- Diff: R/SparkR_prep-0.1.sh ---
@@ -0,0 +1,52 @@
+#!/bin/sh
+
+#
+# Licensed to the Apache Software Foundation (ASF) under one or more
+# contributor license agreements.  See the NOTICE file distributed with
+# this work for additional information regarding copyright ownership.
+# The ASF licenses this file to You under the Apache License, Version 2.0
+# (the "License"); you may not use this file except in compliance with
+# the License.  You may obtain a copy of the License at
+#
+#    http://www.apache.org/licenses/LICENSE-2.0
+#
+# Unless required by applicable law or agreed to in writing, software
+# distributed under the License is distributed on an "AS IS" BASIS,
+# WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+# See the License for the specific language governing permissions and
+# limitations under the License.
+#
+
+# Create and move to a new directory that can be easily cleaned up
+mkdir build_SparkR
--- End diff --

What is this script for?




[GitHub] spark pull request: [SPARK-5654] Integrate SparkR

2015-03-30 Thread AmplabJenkins
Github user AmplabJenkins commented on the pull request:

https://github.com/apache/spark/pull/5096#issuecomment-87956612
  
Test FAILed.
Refer to this link for build results (access rights to CI server needed): 
https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/29444/
Test FAILed.




[GitHub] spark pull request: [SPARK-5654] Integrate SparkR

2015-03-30 Thread SparkQA
Github user SparkQA commented on the pull request:

https://github.com/apache/spark/pull/5096#issuecomment-87927735
  
  [Test build #29444 has 
started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/29444/consoleFull)
 for   PR 5096 at commit 
[`0e788c0`](https://github.com/apache/spark/commit/0e788c08f3b418acc05d4d27298b65de8b6f8407).




[GitHub] spark pull request: [SPARK-5654] Integrate SparkR

2015-03-30 Thread SparkQA
Github user SparkQA commented on the pull request:

https://github.com/apache/spark/pull/5096#issuecomment-87881000
  
  [Test build #29425 has 
finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/29425/consoleFull)
 for   PR 5096 at commit 
[`1d1802e`](https://github.com/apache/spark/commit/1d1802eb454ae3628ac22f38534b50c33345f2e1).
 * This patch **passes all tests**.
 * This patch merges cleanly.
 * This patch adds no public classes.
 * This patch does not change any dependencies.




[GitHub] spark pull request: [SPARK-5654] Integrate SparkR

2015-03-30 Thread AmplabJenkins
Github user AmplabJenkins commented on the pull request:

https://github.com/apache/spark/pull/5096#issuecomment-87881007
  
Test PASSed.
Refer to this link for build results (access rights to CI server needed): 
https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/29425/
Test PASSed.




[GitHub] spark pull request: [SPARK-5654] Integrate SparkR

2015-03-30 Thread SparkQA
Github user SparkQA commented on the pull request:

https://github.com/apache/spark/pull/5096#issuecomment-87845362
  
  [Test build #29425 has 
started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/29425/consoleFull)
 for   PR 5096 at commit 
[`1d1802e`](https://github.com/apache/spark/commit/1d1802eb454ae3628ac22f38534b50c33345f2e1).




[GitHub] spark pull request: [SPARK-5654] Integrate SparkR

2015-03-27 Thread redbaron
Github user redbaron commented on a diff in the pull request:

https://github.com/apache/spark/pull/5096#discussion_r27294415
  
--- Diff: yarn/src/main/scala/org/apache/spark/deploy/yarn/ApplicationMaster.scala ---
@@ -469,6 +469,9 @@ private[spark] class ApplicationMaster(
       System.setProperty("spark.submit.pyFiles",
         PythonRunner.formatPaths(args.pyFiles).mkString(","))
     }
+    if (args.primaryRFile != null && args.primaryRFile.endsWith(".R")) {
+      // TODO(davies): add R dependencies here
--- End diff --

Why is it restricting it to the .R extension anyway?




[GitHub] spark pull request: [SPARK-5654] Integrate SparkR

2015-03-26 Thread JoshRosen
Github user JoshRosen commented on the pull request:

https://github.com/apache/spark/pull/5096#issuecomment-86755263
  
What are the major areas where this needs review?  Is it pretty much just 
the `spark-submit` changes?  This patch doesn't significantly modify any 
existing Spark code, so it seems like it would be safe to merge soon in order 
to unblock collaboration on new features / tests.

It looks like R is already set up on AMPLab Jenkins, so we don't need to do 
anything else for testing infra.




[GitHub] spark pull request: [SPARK-5654] Integrate SparkR

2015-03-26 Thread JoshRosen
Github user JoshRosen commented on a diff in the pull request:

https://github.com/apache/spark/pull/5096#discussion_r27266139
  
--- Diff: R/README.md ---
@@ -0,0 +1,73 @@
+# R on Spark
+
+SparkR is an R package that provides a light-weight frontend to use Spark from R.
+
+### SparkR development
+
+#### Build Spark
+
+Build Spark with [Maven](http://spark.apache.org/docs/latest/building-spark.html#building-with-buildmvn) and include the `-PsparkR` profile to build the R package. For example to use the default Hadoop versions you can run
+```
+  build/mvn -DskipTests -Psparkr package
+```
+
+#### Running sparkR
+
+You can start using SparkR by launching the SparkR shell with
+
+    ./bin/sparkR
+
+The `sparkR` script automatically creates a SparkContext with Spark by default in
+local mode. To specify the Spark master of a cluster for the automatically created
+SparkContext, you can run
+
+    ./bin/sparkR --master "local[2]"
+
+To set other options like driver memory, executor memory etc. you can pass in the [spark-submit](http://spark.apache.org/docs/latest/submitting-applications.html) arguments to `./bin/sparkR`
+
+#### Using SparkR from RStudio
+
+If you wish to use SparkR from RStudio or other R frontends you will need to set some environment variables which point SparkR to your Spark installation. For example
+```
+# Set this to where Spark is installed
+Sys.setenv(SPARK_HOME="/Users/shivaram/spark")
+# This line loads SparkR from the installed directory
+.libPaths(c(file.path(Sys.getenv("SPARK_HOME"), "R", "lib"), .libPaths()))
+library(SparkR)
+sc <- sparkR.init(master="local")
+```
+
+#### Making changes to SparkR
+
+The [instructions](https://cwiki.apache.org/confluence/display/SPARK/Contributing+to+Spark) for making contributions to Spark also apply to SparkR.
--- End diff --

Post-merge, we can update the wiki to include R-specific instructions.

