[GitHub] spark pull request: [SPARK-12316] Wait a minutes to avoid cycle ca...

2015-12-28 Thread sarutak
Github user sarutak commented on the pull request:

https://github.com/apache/spark/pull/10475#issuecomment-167655015
  
CC: @harishreedharan 

@SaintBacchus could you add test cases for this change?


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastruct...@apache.org or file a JIRA ticket
with INFRA.
---

-
To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org
For additional commands, e-mail: reviews-h...@spark.apache.org



[GitHub] spark pull request: [DOC] Adjust coverage for partitionBy()

2015-12-28 Thread SparkQA
Github user SparkQA commented on the pull request:

https://github.com/apache/spark/pull/10499#issuecomment-167655997
  
**[Test build #48373 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/48373/consoleFull)** for PR 10499 at commit [`7884e87`](https://github.com/apache/spark/commit/7884e87975e8655f0e3a20cc0455e0d7cd614fe4).





[GitHub] spark pull request: [SPARK-12489][Core][SQL][MLib]Fix minor issues...

2015-12-28 Thread jkbradley
Github user jkbradley commented on the pull request:

https://github.com/apache/spark/pull/10440#issuecomment-167656781
  
ML changes look good to me.  Thanks!





[GitHub] spark pull request: [SPARK-12513] [Streaming] SocketReceiver hang ...

2015-12-28 Thread zsxwing
Github user zsxwing commented on the pull request:

https://github.com/apache/spark/pull/10464#issuecomment-167661660
  
Looks like a race condition between `restart` and `finally { ... socket.stop() ... }`: `restart` starts a new thread and calls `receiver.onStart`, so `receiver.onStart` may run before `socket.stop()`.

However, it seems unlikely in practice, since the restart path sleeps 2 seconds before calling 
`startReceiver()`.
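The timing argument above can be modeled with a small standalone sketch. This is illustrative only: `receiver.onStart` and `socket.stop` are just event labels here, not Spark's actual `ReceiverSupervisor` API, and the 200 ms sleep stands in for the 2-second restart delay.

```java
import java.util.List;
import java.util.concurrent.CopyOnWriteArrayList;

final class ReceiverRaceSketch {
    // Shared, thread-safe event log so both threads can record their steps.
    static final List<String> events = new CopyOnWriteArrayList<>();

    public static void main(String[] args) throws InterruptedException {
        // Models restart(): spawn a new thread that starts the receiver
        // only after a delay.
        Thread restart = new Thread(() -> {
            try {
                Thread.sleep(200); // stands in for the 2-second restart delay
            } catch (InterruptedException ignored) {
            }
            events.add("receiver.onStart"); // new receiver thread starting up
        });
        restart.start();
        events.add("socket.stop"); // old thread's cleanup in finally { ... }
        restart.join();
        // With the delay in place, stop-before-onStart is the overwhelmingly
        // likely order, which is why the race exists but is rarely observed.
        System.out.println(String.join(" -> ", events));
    }
}
```

Removing the sleep makes the two events genuinely unordered, which is the race being described.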





[GitHub] spark pull request: [SPARK-12539][SQL][WIP] support writing bucket...

2015-12-28 Thread rxin
Github user rxin commented on the pull request:

https://github.com/apache/spark/pull/10498#issuecomment-167664157
  
BTW, on GitHub you can use square brackets to create a checklist, e.g.

```
- [ ] item a
- [ ] item b
```

becomes

- [ ] item a
- [ ] item b





[GitHub] spark pull request: [SPARKR] [SPARK-11199] Improve R context manag...

2015-12-28 Thread falaki
Github user falaki commented on the pull request:

https://github.com/apache/spark/pull/9185#issuecomment-167664825
  
ping @marmbrus 





[GitHub] spark pull request: [SPARK-12222] [Core] Deserialize RoaringBitmap...

2015-12-28 Thread sarutak
Github user sarutak commented on the pull request:

https://github.com/apache/spark/pull/10253#issuecomment-167665206
  
LGTM.  Merging this into `master` and `branch-1.6`.





[GitHub] spark pull request: [SPARKR] [SPARK-11199] Improve R context manag...

2015-12-28 Thread AmplabJenkins
Github user AmplabJenkins commented on the pull request:

https://github.com/apache/spark/pull/9185#issuecomment-167668475
  
Merged build finished. Test FAILed.





[GitHub] spark pull request: [SPARKR] [SPARK-11199] Improve R context manag...

2015-12-28 Thread AmplabJenkins
Github user AmplabJenkins commented on the pull request:

https://github.com/apache/spark/pull/9185#issuecomment-167668477
  
Test FAILed.
Refer to this link for build results (access rights to CI server needed): 
https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/48375/
Test FAILed.





[GitHub] spark pull request: [SPARK-12525] Fix fatal compiler warnings in K...

2015-12-28 Thread SparkQA
Github user SparkQA commented on the pull request:

https://github.com/apache/spark/pull/10479#issuecomment-167668461
  
**[Test build #48376 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/48376/consoleFull)** for PR 10479 at commit [`422ef49`](https://github.com/apache/spark/commit/422ef494b56f9ac4c770311743fb2a01a9d19ae1).





[GitHub] spark pull request: [SPARK-7995][SPARK-6280][Core]Remove AkkaRpcEn...

2015-12-28 Thread SparkQA
Github user SparkQA commented on the pull request:

https://github.com/apache/spark/pull/10459#issuecomment-167654637
  
**[Test build #48372 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/48372/consoleFull)** for PR 10459 at commit [`1f5a523`](https://github.com/apache/spark/commit/1f5a5237c9fe238a23d0601293da3ae33f1f9fa2).





[GitHub] spark pull request: [SPARK-7995][SPARK-6280][Core]Remove AkkaRpcEn...

2015-12-28 Thread zsxwing
Github user zsxwing commented on a diff in the pull request:

https://github.com/apache/spark/pull/10459#discussion_r48505487
  
--- Diff: core/src/test/scala/org/apache/spark/util/AkkaUtilsSuite.scala ---
@@ -61,9 +55,14 @@ class AkkaUtilsSuite extends SparkFunSuite with LocalSparkContext with ResetSyst
 
     val slaveRpcEnv = RpcEnv.create("spark-slave", hostname, 0, conf, securityManagerBad)
     val slaveTracker = new MapOutputTrackerWorker(conf)
-    intercept[akka.actor.ActorNotFound] {
+    try {
       slaveTracker.trackerEndpoint =
-        slaveRpcEnv.setupEndpointRef("spark", rpcEnv.address, MapOutputTracker.ENDPOINT_NAME)
+        slaveRpcEnv.setupEndpointRef(rpcEnv.address, MapOutputTracker.ENDPOINT_NAME)
+    } catch {
+      case e: RuntimeException =>
+        assert(e.getMessage.contains("javax.security.sasl.SaslException"))
+      case e: SparkException =>
+        assert(e.getMessage.contains("Message is dropped because Outbox is stopped"))
--- End diff --

Adding this catch clause because there is a race condition in `Outbox` that 
may throw `SparkException` instead. Imagine the following execution order:

Execution Order | Thread1 | Thread2
- | - | -
1 | nettyEnv.createClient (Outbox.scala; will call channel.close in this method if authentication fails) | 
2 | catch NonFatal(e) | 
3 | | connectionTerminated (NettyRpcHandler)
4 | | nettyEnv.removeOutbox
5 | | outbox.stop
6 | | message.onFailure(new SparkException("Message is dropped because Outbox is stopped"))
7 | Outbox.handleNetworkFailure | 






[GitHub] spark pull request: [SPARK-12415] Do not use closure serializer to...

2015-12-28 Thread tedyu
Github user tedyu commented on a diff in the pull request:

https://github.com/apache/spark/pull/10368#discussion_r48507734
  
--- Diff: 
core/src/main/scala/org/apache/spark/serializer/KryoSerializer.scala ---
@@ -109,6 +111,9 @@ class KryoSerializer(conf: SparkConf)
 kryo.register(classOf[SerializableJobConf], new KryoJavaSerializer())
 kryo.register(classOf[HttpBroadcast[_]], new KryoJavaSerializer())
 kryo.register(classOf[PythonBroadcast], new KryoJavaSerializer())
+kryo.register(classOf[TaskMetrics], new KryoJavaSerializer())
+kryo.register(classOf[DirectTaskResult[_]], new KryoJavaSerializer())
+kryo.register(classOf[IndirectTaskResult[_]], new KryoJavaSerializer())
--- End diff --

> people may forget to register new classes if they just add an Option field to TaskMetrics in future

Any addition to TaskMetrics would be reviewed, right?
A comment could be added to TaskMetrics reminding contributors to register the 
corresponding class.





[GitHub] spark pull request: [SPARK-12486] Worker should kill the executors...

2015-12-28 Thread SparkQA
Github user SparkQA commented on the pull request:

https://github.com/apache/spark/pull/10438#issuecomment-167664384
  
**[Test build #48374 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/48374/consoleFull)** for PR 10438 at commit [`67611ac`](https://github.com/apache/spark/commit/67611acec29cb6cadadc038f27759c19578f6e21).





[GitHub] spark pull request: [SPARK-12222] [Core] Deserialize RoaringBitmap...

2015-12-28 Thread asfgit
Github user asfgit closed the pull request at:

https://github.com/apache/spark/pull/10253





[GitHub] spark pull request: [SPARKR] [SPARK-11199] Improve R context manag...

2015-12-28 Thread marmbrus
Github user marmbrus commented on the pull request:

https://github.com/apache/spark/pull/9185#issuecomment-167665605
  
This seems fine to me as a first step.  Eventually we will probably want to 
make the RBackend multi-session aware.





[GitHub] spark pull request: [SPARK-12525] Fix fatal compiler warnings in K...

2015-12-28 Thread JoshRosen
Github user JoshRosen commented on the pull request:

https://github.com/apache/spark/pull/10479#issuecomment-167665578
  
Jenkins, retest this please.





[GitHub] spark pull request: [SPARK-12489][Core][SQL][MLib]Fix minor issues...

2015-12-28 Thread zsxwing
Github user zsxwing commented on the pull request:

https://github.com/apache/spark/pull/10440#issuecomment-167668145
  
@andrewor14 could you take a look at this pr? Thanks!





[GitHub] spark pull request: [SPARK-12489][Core][SQL][MLib]Fix minor issues...

2015-12-28 Thread andrewor14
Github user andrewor14 commented on a diff in the pull request:

https://github.com/apache/spark/pull/10440#discussion_r48510554
  
--- Diff: mllib/src/main/scala/org/apache/spark/ml/tree/Node.scala ---
@@ -386,9 +386,9 @@ private[tree] object LearningNode {
     var levelsToGo = indexToLevel(nodeIndex)
     while (levelsToGo > 0) {
       if ((nodeIndex & (1 << levelsToGo - 1)) == 0) {
-        tmpNode = tmpNode.leftChild.asInstanceOf[LearningNode]
+        tmpNode = tmpNode.leftChild.get
--- End diff --

@jkbradley was this code never run before?





[GitHub] spark pull request: [SPARK-12489][Core][SQL][MLib]Fix minor issues...

2015-12-28 Thread andrewor14
Github user andrewor14 commented on the pull request:

https://github.com/apache/spark/pull/10440#issuecomment-167671951
  
Looks good.





[GitHub] spark pull request: [SPARK-12489][Core][SQL][MLib]Fix minor issues...

2015-12-28 Thread andrewor14
Github user andrewor14 commented on a diff in the pull request:

https://github.com/apache/spark/pull/10440#discussion_r48510515
  
--- Diff: launcher/src/main/java/org/apache/spark/launcher/Main.java ---
@@ -151,7 +151,7 @@ private static String prepareWindowsCommand(List cmd, Map

[GitHub] spark pull request: [SPARK-12490] Don't use Javascript for web UI'...

2015-12-28 Thread JoshRosen
Github user JoshRosen commented on the pull request:

https://github.com/apache/spark/pull/10441#issuecomment-167672868
  
@zsxwing, I've pushed a new commit which aims to preserve the old behavior 
when increasing the number of items displayed per page while pageNumber > 1; 
see fd2d1f2a49ad4f6bc2b5ed8bf3aecd65093abc65.





[GitHub] spark pull request: [SPARKR] [SPARK-11199] Improve R context manag...

2015-12-28 Thread SparkQA
Github user SparkQA commented on the pull request:

https://github.com/apache/spark/pull/9185#issuecomment-167689398
  
**[Test build #48386 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/48386/consoleFull)** for PR 9185 at commit [`0633a73`](https://github.com/apache/spark/commit/0633a73ddbc6a328d579434f3c3ec349765d70ef).


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastruct...@apache.org or file a JIRA ticket
with INFRA.
---

-
To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org
For additional commands, e-mail: reviews-h...@spark.apache.org



[GitHub] spark pull request: [SPARK-6624][WIP] Draft of another alternative...

2015-12-28 Thread liancheng
Github user liancheng commented on a diff in the pull request:

https://github.com/apache/spark/pull/10444#discussion_r48516709
  
--- Diff: 
sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/expressions/predicates.scala
 ---
@@ -47,6 +48,34 @@ trait Predicate extends Expression {
   override def dataType: DataType = BooleanType
 }
 
+object Predicate extends PredicateHelper {
+  def toCNF(predicate: Expression, maybeThreshold: Option[Double] = None): Expression = {
+    val cnf = new CNFExecutor(predicate).execute(predicate)
+    val threshold = maybeThreshold.map(predicate.size * _).getOrElse(Double.MaxValue)
+    if (cnf.size > threshold) predicate else cnf
--- End diff --

Maximizing the number of simple predicates sounds reasonable. We may do the 
conversion in a depth-first manner, i.e. always convert the left branch of an 
`And` and then its right branch, until either no more predicates can be 
converted or we reach the size limit. In this way the intermediate result is 
still useful.

BTW, I searched for CNF conversion in Hive and found [HIVE-9166][1], which 
also tries to put an upper limit on ORC SARG CNF conversion. @nongli any clues 
about how Impala does this?

[1]: https://issues.apache.org/jira/browse/HIVE-9166
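The guarded conversion in the diff above (convert to CNF, then fall back to the original predicate if the result exceeds a size budget) can be sketched on a toy expression AST. This is an illustrative model only, not Spark's actual Catalyst `Expression` classes or the proposed `CNFExecutor`:

```java
// Minimal boolean-expression AST; record equality is structural.
interface Expr { int size(); }
record Atom(String name) implements Expr {
    public int size() { return 1; }
}
record And(Expr l, Expr r) implements Expr {
    public int size() { return l.size() + r.size() + 1; }
}
record Or(Expr l, Expr r) implements Expr {
    public int size() { return l.size() + r.size() + 1; }
}

final class Cnf {
    // Classic CNF rewrite: push Or below And by distribution.
    static Expr toCnf(Expr e) {
        if (e instanceof And a) {
            return new And(toCnf(a.l()), toCnf(a.r()));
        }
        if (e instanceof Or o) {
            Expr l = toCnf(o.l());
            Expr r = toCnf(o.r());
            // (a && b) || c  =>  (a || c) && (b || c)
            if (l instanceof And a) {
                return new And(toCnf(new Or(a.l(), r)), toCnf(new Or(a.r(), r)));
            }
            if (r instanceof And a) {
                return new And(toCnf(new Or(l, a.l())), toCnf(new Or(l, a.r())));
            }
            return new Or(l, r);
        }
        return e; // atoms are already in CNF
    }

    // Guarded variant mirroring the diff's threshold idea: if the CNF form
    // grew past `threshold` times the input size, keep the original predicate.
    static Expr toCnfBounded(Expr e, double threshold) {
        Expr cnf = toCnf(e);
        return cnf.size() > e.size() * threshold ? e : cnf;
    }
}
```

Distribution can blow up the expression exponentially in the worst case, which is exactly why both this thread and HIVE-9166 cap the output size.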





[GitHub] spark pull request: [SPARK-12522] [SQL] [MINOR] Add the missing do...

2015-12-28 Thread AmplabJenkins
Github user AmplabJenkins commented on the pull request:

https://github.com/apache/spark/pull/10471#issuecomment-167689959
  
Test PASSed.
Refer to this link for build results (access rights to CI server needed): 
https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/48382/
Test PASSed.





[GitHub] spark pull request: [SPARK-12522] [SQL] [MINOR] Add the missing do...

2015-12-28 Thread AmplabJenkins
Github user AmplabJenkins commented on the pull request:

https://github.com/apache/spark/pull/10471#issuecomment-167689958
  
Merged build finished. Test PASSed.





[GitHub] spark pull request: [SPARK-12522] [SQL] [MINOR] Add the missing do...

2015-12-28 Thread SparkQA
Github user SparkQA commented on the pull request:

https://github.com/apache/spark/pull/10471#issuecomment-167689909
  
**[Test build #48382 has finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/48382/consoleFull)** for PR 10471 at commit [`91ec5de`](https://github.com/apache/spark/commit/91ec5ded23df41006554c5c3401e94e5f0a1fa5d).
 * This patch passes all tests.
 * This patch merges cleanly.
 * This patch adds no public classes.





[GitHub] spark pull request: [SPARK-12522] [SQL] [MINOR] Add the missing do...

2015-12-28 Thread asfgit
Github user asfgit closed the pull request at:

https://github.com/apache/spark/pull/10471





[GitHub] spark pull request: [SPARK-12522] [SQL] [MINOR] Add the missing do...

2015-12-28 Thread rxin
Github user rxin commented on the pull request:

https://github.com/apache/spark/pull/10471#issuecomment-167690183
  
Thanks - I've merged it.






[GitHub] spark pull request: [SPARK-12536] [SQL] Added "Empty Seq" in Expla...

2015-12-28 Thread cloud-fan
Github user cloud-fan commented on a diff in the pull request:

https://github.com/apache/spark/pull/10494#discussion_r48517137
  
--- Diff: 
sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/plans/logical/LocalRelation.scala
 ---
@@ -62,6 +62,10 @@ case class LocalRelation(output: Seq[Attribute], data: Seq[InternalRow] = Nil)
     case _ => false
   }
 
+  override def simpleString: String =
+    if (data == Seq.empty) super.simpleString + " [Empty Seq]"
--- End diff --

should be 
https://github.com/apache/spark/blob/master/sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/trees/TreeNode.scala#L401





[GitHub] spark pull request: [SPARK-12534][DOC] update documentation to lis...

2015-12-28 Thread felixcheung
Github user felixcheung commented on a diff in the pull request:

https://github.com/apache/spark/pull/10491#discussion_r48517225
  
--- Diff: docs/configuration.md ---
@@ -120,7 +120,8 @@ of the most common options to set are:
   spark.driver.cores
   1
   
-Number of cores to use for the driver process, only in cluster mode.
+Number of cores to use for the driver process, only in cluster mode. This can be set through
+--driver-cores command line option.
--- End diff --

I moved these to the running-on-yarn doc, would that work?





[GitHub] spark pull request: [SPARK-12534][DOC] update documentation to lis...

2015-12-28 Thread SparkQA
Github user SparkQA commented on the pull request:

https://github.com/apache/spark/pull/10491#issuecomment-167692412
  
**[Test build #48387 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/48387/consoleFull)** for PR 10491 at commit [`27c6976`](https://github.com/apache/spark/commit/27c6976cb33c8a418635a46255301b027db8615c).





[GitHub] spark pull request: [SPARK-12224][SPARKR] R support for JDBC sourc...

2015-12-28 Thread sun-rui
Github user sun-rui commented on a diff in the pull request:

https://github.com/apache/spark/pull/10480#discussion_r48517423
  
--- Diff: R/pkg/R/DataFrame.R ---
@@ -2272,3 +2260,40 @@ setMethod("with",
 newEnv <- assignNewEnv(data)
 eval(substitute(expr), envir = newEnv, enclos = newEnv)
   })
+
+#' Saves the content of the DataFrame to an external database table via JDBC
+#'
+#' Additional JDBC database connection properties can be set (...)
+#'
+#' Also, mode is used to specify the behavior of the save operation when
+#' data already exists in the data source. There are four modes: \cr
+#'  append: Contents of this DataFrame are expected to be appended to existing data. \cr
+#'  overwrite: Existing data is expected to be overwritten by the contents of this DataFrame. \cr
+#'  error: An exception is expected to be thrown. \cr
+#'  ignore: The save operation is expected to not save the contents of the DataFrame
+#' and to not change the existing data. \cr
+#'
+#' @param x A SparkSQL DataFrame
+#' @param url JDBC database url of the form `jdbc:subprotocol:subname`
+#' @param tableName The name of the table in the external database
+#' @param mode One of 'append', 'overwrite', 'error', 'ignore' save mode (it is 'error' by default)
+#' @family DataFrame functions
+#' @rdname write.jdbc
+#' @name write.jdbc
+#' @export
+#' @examples
+#'\dontrun{
+#' sc <- sparkR.init()
+#' sqlContext <- sparkRSQL.init(sc)
+#' jdbcUrl <- "jdbc:mysql://localhost:3306/databasename"
+#' write.jdbc(df, jdbcUrl, "table", user = "username", password = "password")
+#' }
+setMethod("write.jdbc",
+          signature(x = "DataFrame", url = "character", tableName = "character"),
+          function(x, url, tableName, mode = "error", ...){
+            jmode <- convertToJSaveMode(mode)
+            jprops <- envToJProperties(varargsToEnv(...))
--- End diff --

vararg -> env -> properties seems a little redundant. I would prefer vararg 
-> properties.
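The direct vararg-to-properties conversion suggested here can be modeled in Java for illustration (`fromPairs` is a hypothetical helper, not part of SparkR; the R helpers `varargsToEnv`/`envToJProperties` are the ones from the PR):

```java
import java.util.Map;
import java.util.Properties;

final class JdbcProps {
    // Build JDBC connection Properties directly from key/value pairs,
    // skipping the intermediate environment step.
    static Properties fromPairs(Map<String, String> pairs) {
        Properties props = new Properties();
        pairs.forEach(props::setProperty);
        return props;
    }
}
```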





[GitHub] spark pull request: [SPARK-12224][SPARKR] R support for JDBC sourc...

2015-12-28 Thread sun-rui
Github user sun-rui commented on a diff in the pull request:

https://github.com/apache/spark/pull/10480#discussion_r48517433
  
--- Diff: R/pkg/R/SQLContext.R ---
@@ -556,3 +556,61 @@ createExternalTable <- function(sqlContext, tableName, 
path = NULL, source = NUL
   sdf <- callJMethod(sqlContext, "createExternalTable", tableName, source, 
options)
   dataFrame(sdf)
 }
+
+#' Create a DataFrame representing the database table accessible via JDBC 
URL
+#'
+#' Additional JDBC database connection properties can be set (...)
+#'
+#' Only one of partitionColumn or predicates should be set. Partitions of 
the table will be
+#' retrieved in parallel based on the `numPartitions` or by the predicates.
+#'
+#' Don't create too many partitions in parallel on a large cluster; 
otherwise Spark might crash
+#' your external database systems.
+#'
+#' @param sqlContext SQLContext to use
+#' @param url JDBC database url of the form `jdbc:subprotocol:subname`
+#' @param tableName the name of the table in the external database
+#' @param partitionColumn the name of a column of integral type that will 
be used for partitioning
+#' @param lowerBound the minimum value of `partitionColumn` used to decide 
partition stride
+#' @param upperBound the maximum value of `partitionColumn` used to decide 
partition stride
+#' @param numPartitions the number of partitions. This, along with
`lowerBound` (inclusive),
+#'  `upperBound` (exclusive), form partition strides 
for generated WHERE
+#'  clause expressions used to split the column 
`partitionColumn` evenly.
+#'  This defaults to SparkContext.defaultParallelism 
when unset.
+#' @param predicates a list of conditions in the where clause; each one 
defines one partition
--- End diff --

OK



[GitHub] spark pull request: [SPARK-12224][SPARKR] R support for JDBC sourc...

2015-12-28 Thread sun-rui
Github user sun-rui commented on a diff in the pull request:

https://github.com/apache/spark/pull/10480#discussion_r48517532
  
--- Diff: R/pkg/R/generics.R ---
@@ -537,6 +537,12 @@ setGeneric("write.df", function(df, path, ...) { 
standardGeneric("write.df") })
 #' @export
 setGeneric("saveDF", function(df, path, ...) { standardGeneric("saveDF") })
 
--- End diff --

yeah, correct



[GitHub] spark pull request: [SPARK-12224][SPARKR] R support for JDBC sourc...

2015-12-28 Thread sun-rui
Github user sun-rui commented on a diff in the pull request:

https://github.com/apache/spark/pull/10480#discussion_r48517634
  
--- Diff: R/pkg/R/SQLContext.R ---
@@ -556,3 +556,61 @@ createExternalTable <- function(sqlContext, tableName, 
path = NULL, source = NUL
   sdf <- callJMethod(sqlContext, "createExternalTable", tableName, source, 
options)
   dataFrame(sdf)
 }
+
+#' Create a DataFrame representing the database table accessible via JDBC 
URL
+#'
+#' Additional JDBC database connection properties can be set (...)
+#'
+#' Only one of partitionColumn or predicates should be set. Partitions of 
the table will be
+#' retrieved in parallel based on the `numPartitions` or by the predicates.
+#'
+#' Don't create too many partitions in parallel on a large cluster; 
otherwise Spark might crash
+#' your external database systems.
+#'
+#' @param sqlContext SQLContext to use
+#' @param url JDBC database url of the form `jdbc:subprotocol:subname`
+#' @param tableName the name of the table in the external database
+#' @param partitionColumn the name of a column of integral type that will 
be used for partitioning
+#' @param lowerBound the minimum value of `partitionColumn` used to decide 
partition stride
+#' @param upperBound the maximum value of `partitionColumn` used to decide 
partition stride
+#' @param numPartitions the number of partitions. This, along with
`lowerBound` (inclusive),
+#'  `upperBound` (exclusive), form partition strides 
for generated WHERE
+#'  clause expressions used to split the column 
`partitionColumn` evenly.
+#'  This defaults to SparkContext.defaultParallelism 
when unset.
+#' @param predicates a list of conditions in the where clause; each one 
defines one partition
+#' @return DataFrame
+#' @rdname read.jdbc
+#' @name read.jdbc
+#' @export
+#' @examples
+#'\dontrun{
+#' sc <- sparkR.init()
+#' sqlContext <- sparkRSQL.init(sc)
+#' jdbcUrl <- "jdbc:mysql://localhost:3306/databasename"
+#' df <- read.jdbc(sqlContext, jdbcUrl, "table", predicates = 
list("field<=123"), user = "username")
+#' df2 <- read.jdbc(sqlContext, jdbcUrl, "table2", partitionColumn = 
"index", lowerBound = 0,
+#'  upperBound = 1, user = "username", password = 
"password")
+#' }
+
+read.jdbc <- function(sqlContext, url, tableName,
+  partitionColumn = NULL, lowerBound = NULL, 
upperBound = NULL,
+  numPartitions = 0L, predicates = list(), ...) {
--- End diff --

Can the default value for `predicates` be NULL?
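The partition-stride behavior documented in the quoted `read.jdbc` docs (`partitionColumn`, `lowerBound`, `upperBound`, `numPartitions`) can be sketched roughly as follows. This is plain Python; `column_partitions` is a hypothetical helper, deliberately simplified from Spark's real JDBC partitioning logic.

```python
def column_partitions(column, lower, upper, num_partitions):
    """Sketch of JDBC partition-stride WHERE clauses.

    Simplification: Spark's actual implementation also clamps
    num_partitions and keeps the first/last partitions unbounded so rows
    outside [lower, upper) are not lost; this sketch only shows the
    stride idea from the quoted docs.
    """
    stride = (upper - lower) // num_partitions
    clauses = []
    current = lower
    for i in range(num_partitions):
        # first partition has no lower bound, last has no upper bound
        lo = None if i == 0 else "%s >= %d" % (column, current)
        current += stride
        hi = None if i == num_partitions - 1 else "%s < %d" % (column, current)
        clauses.append(" AND ".join(c for c in (lo, hi) if c) or None)
    return clauses
```

For example, `column_partitions("index", 0, 10, 2)` would produce one clause per partition, splitting the column range evenly — which is why the docs warn that `lowerBound`/`upperBound` decide the stride, not a filter on the data.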



[GitHub] spark pull request: [SPARK-12534][DOC] update documentation to lis...

2015-12-28 Thread AmplabJenkins
Github user AmplabJenkins commented on the pull request:

https://github.com/apache/spark/pull/10491#issuecomment-167697095
  
Merged build finished. Test PASSed.



[GitHub] spark pull request: [SPARK-12534][DOC] update documentation to lis...

2015-12-28 Thread SparkQA
Github user SparkQA commented on the pull request:

https://github.com/apache/spark/pull/10491#issuecomment-167696817
  
**[Test build #48387 has 
finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/48387/consoleFull)**
 for PR 10491 at commit 
[`27c6976`](https://github.com/apache/spark/commit/27c6976cb33c8a418635a46255301b027db8615c).
 * This patch passes all tests.
 * This patch merges cleanly.
 * This patch adds no public classes.



[GitHub] spark pull request: [SPARK-12534][DOC] update documentation to lis...

2015-12-28 Thread AmplabJenkins
Github user AmplabJenkins commented on the pull request:

https://github.com/apache/spark/pull/10491#issuecomment-167697099
  
Test PASSed.
Refer to this link for build results (access rights to CI server needed): 
https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/48387/
Test PASSed.



[GitHub] spark pull request: [SPARK-12495][SQL] use true as default value f...

2015-12-28 Thread SparkQA
Github user SparkQA commented on the pull request:

https://github.com/apache/spark/pull/10443#issuecomment-167697369
  
**[Test build #48389 has 
started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/48389/consoleFull)**
 for PR 10443 at commit 
[`a6b826c`](https://github.com/apache/spark/commit/a6b826c4cd55545e2ca2f1478a16c030bc0a86df).



[GitHub] spark pull request: [DOC] Adjust coverage for partitionBy()

2015-12-28 Thread AmplabJenkins
Github user AmplabJenkins commented on the pull request:

https://github.com/apache/spark/pull/10499#issuecomment-167697956
  
Merged build finished. Test PASSed.



[GitHub] spark pull request: [DOC] Adjust coverage for partitionBy()

2015-12-28 Thread SparkQA
Github user SparkQA commented on the pull request:

https://github.com/apache/spark/pull/10499#issuecomment-167697671
  
**[Test build #48381 has 
finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/48381/consoleFull)**
 for PR 10499 at commit 
[`f655bbe`](https://github.com/apache/spark/commit/f655bbe37fee7903ca8446996f971f629b1c5450).
 * This patch passes all tests.
 * This patch merges cleanly.
 * This patch adds no public classes.



[GitHub] spark pull request: [DOC] Adjust coverage for partitionBy()

2015-12-28 Thread AmplabJenkins
Github user AmplabJenkins commented on the pull request:

https://github.com/apache/spark/pull/10499#issuecomment-167697959
  
Test PASSed.
Refer to this link for build results (access rights to CI server needed): 
https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/48381/
Test PASSed.



[GitHub] spark pull request: [SPARK-12480][SQL] add Hash expression that ca...

2015-12-28 Thread SparkQA
Github user SparkQA commented on the pull request:

https://github.com/apache/spark/pull/10435#issuecomment-167698232
  
**[Test build #48388 has 
started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/48388/consoleFull)**
 for PR 10435 at commit 
[`6311aa7`](https://github.com/apache/spark/commit/6311aa75a7a41fee8464ee96e5949ccad3e7d7a5).



[GitHub] spark pull request: [SPARK-12503] [SQL] Pushing Limit Through Unio...

2015-12-28 Thread cloud-fan
Github user cloud-fan commented on a diff in the pull request:

https://github.com/apache/spark/pull/10451#discussion_r48517911
  
--- Diff: 
sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/optimizer/Optimizer.scala
 ---
@@ -153,6 +153,15 @@ object SetOperationPushDown extends Rule[LogicalPlan] 
with PredicateHelper {
 )
   )
 
+// Adding extra Limit below UNION ALL if both left and right children
are not Limit.
+// This heuristic is valid assuming there does not exist any Limit 
push-down rule.
+case Limit(exp, Union(left, right))
+  if left.maxRows.isEmpty || right.maxRows.isEmpty =>
--- End diff --

is `left.maxRows.isEmpty` equal to `check if left is a Limit`?



[GitHub] spark pull request: [SPARK-12503] [SQL] Pushing Limit Through Unio...

2015-12-28 Thread cloud-fan
Github user cloud-fan commented on a diff in the pull request:

https://github.com/apache/spark/pull/10451#discussion_r48518043
  
--- Diff: 
sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/optimizer/Optimizer.scala
 ---
@@ -153,6 +153,15 @@ object SetOperationPushDown extends Rule[LogicalPlan] 
with PredicateHelper {
 )
   )
 
+// Adding extra Limit below UNION ALL if both left and right children
are not Limit.
+// This heuristic is valid assuming there does not exist any Limit 
push-down rule.
+case Limit(exp, Union(left, right))
+  if left.maxRows.isEmpty || right.maxRows.isEmpty =>
--- End diff --

Actually, I think this branch is safe without this check; did I miss
something here?



[GitHub] spark pull request: [SPARK-12503] [SQL] Pushing Limit Through Unio...

2015-12-28 Thread marmbrus
Github user marmbrus commented on a diff in the pull request:

https://github.com/apache/spark/pull/10451#discussion_r48518201
  
--- Diff: 
sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/optimizer/Optimizer.scala
 ---
@@ -153,6 +153,15 @@ object SetOperationPushDown extends Rule[LogicalPlan] 
with PredicateHelper {
 )
   )
 
+// Adding extra Limit below UNION ALL if both left and right children
are not Limit.
+// This heuristic is valid assuming there does not exist any Limit 
push-down rule.
+case Limit(exp, Union(left, right))
+  if left.maxRows.isEmpty || right.maxRows.isEmpty =>
+  Limit(exp,
+Union(
+  CombineLimits(Limit(exp, left)),
+  CombineLimits(Limit(exp, right
--- End diff --

We can get rid of this manual call now.



[GitHub] spark pull request: [SPARK-12503] [SQL] Pushing Limit Through Unio...

2015-12-28 Thread marmbrus
Github user marmbrus commented on a diff in the pull request:

https://github.com/apache/spark/pull/10451#discussion_r48518190
  
--- Diff: 
sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/optimizer/Optimizer.scala
 ---
@@ -153,6 +153,15 @@ object SetOperationPushDown extends Rule[LogicalPlan] 
with PredicateHelper {
 )
   )
 
+// Adding extra Limit below UNION ALL if both left and right children
are not Limit.
+// This heuristic is valid assuming there does not exist any Limit 
push-down rule.
+case Limit(exp, Union(left, right))
+  if left.maxRows.isEmpty || right.maxRows.isEmpty =>
--- End diff --

The goal is to avoid double pushdown even if the limit has been pushed past
another operator (e.g. a project).
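The rewrite being reviewed — wrapping each side of a UNION ALL in its own limit while keeping the outer limit — can be checked on plain lists. This sketch is not the Catalyst rule itself; `limit`, `union_all`, and `pushed_limit_union` are hypothetical stand-ins that show why the rewrite preserves results while letting each child produce fewer rows.

```python
def limit(n, rows):
    # stand-in for Limit: take at most n rows
    return rows[:n]

def union_all(left, right):
    # stand-in for UNION ALL: concatenation, duplicates kept
    return left + right

def pushed_limit_union(n, left, right):
    # Limit(n, Union(left, right)) rewritten with extra per-child limits:
    # the outer limit still caps the final result; the inner limits only
    # cut how much each child needs to produce.
    return limit(n, union_all(limit(n, left), limit(n, right)))

left, right = list(range(100)), list(range(100, 200))
assert pushed_limit_union(5, left, right) == limit(5, union_all(left, right))
```

The equivalence holds for any `n`: each child can contribute at most `n` rows to the final output, so truncating the children to `n` rows before the union loses nothing the outer limit would have kept.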



[GitHub] spark pull request: [SPARK-12503] [SQL] Pushing Limit Through Unio...

2015-12-28 Thread marmbrus
Github user marmbrus commented on a diff in the pull request:

https://github.com/apache/spark/pull/10451#discussion_r48518228
  
--- Diff: 
sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/plans/logical/LogicalPlan.scala
 ---
@@ -91,6 +91,11 @@ abstract class LogicalPlan extends 
QueryPlan[LogicalPlan] with Logging {
   }
 
   /**
+   * Returns the limited number of rows to be returned.
--- End diff --

Specify that any operator that a `Limit` can be pushed past should
override this function.



[GitHub] spark pull request: Add more exceptions to Guava relocation

2015-12-28 Thread andrewor14
Github user andrewor14 commented on the pull request:

https://github.com/apache/spark/pull/10442#issuecomment-167700718
  
@microhello please file a JIRA and add it to the title of this PR. See how 
other patches are opened.



[GitHub] spark pull request: [SPARK-12224][SPARKR] R support for JDBC sourc...

2015-12-28 Thread felixcheung
Github user felixcheung commented on a diff in the pull request:

https://github.com/apache/spark/pull/10480#discussion_r48518375
  
--- Diff: core/src/main/scala/org/apache/spark/api/r/SerDe.scala ---
@@ -355,6 +355,13 @@ private[spark] object SerDe {
   writeInt(dos, v.length)
   v.foreach(elem => writeObject(dos, elem))
 
+// Handle Properties
--- End diff --

My preference is to do more in R. If you feel strongly about having a
helper in Scala instead of handling Properties, then we could move most of the
code into a Scala helper.



[GitHub] spark pull request: [SPARK-12503] [SQL] Pushing Limit Through Unio...

2015-12-28 Thread marmbrus
Github user marmbrus commented on the pull request:

https://github.com/apache/spark/pull/10451#issuecomment-167700694
  
> add a comment and explain the current solution. In the future, if we add 
such an operator, we can change the current way and fix the issue? (Already 
added a comment in the code)

I like this option



[GitHub] spark pull request: [SPARK-12503] [SQL] Pushing Limit Through Unio...

2015-12-28 Thread marmbrus
Github user marmbrus commented on a diff in the pull request:

https://github.com/apache/spark/pull/10451#discussion_r48518442
  
--- Diff: 
sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/plans/logical/LogicalPlan.scala
 ---
@@ -91,6 +91,11 @@ abstract class LogicalPlan extends 
QueryPlan[LogicalPlan] with Logging {
   }
 
   /**
+   * Returns the limited number of rows to be returned.
--- End diff --

And, thus, we should fix `Project` too.



[GitHub] spark pull request: [SPARK-12224][SPARKR] R support for JDBC sourc...

2015-12-28 Thread sun-rui
Github user sun-rui commented on a diff in the pull request:

https://github.com/apache/spark/pull/10480#discussion_r48518467
  
--- Diff: core/src/main/scala/org/apache/spark/api/r/SerDe.scala ---
@@ -355,6 +355,13 @@ private[spark] object SerDe {
   writeInt(dos, v.length)
   v.foreach(elem => writeObject(dos, elem))
 
+// Handle Properties
--- End diff --

I got it; java.util.Properties implements the Map interface.



[GitHub] spark pull request: [SPARK-12512][SQL] support column name with do...

2015-12-28 Thread marmbrus
Github user marmbrus commented on the pull request:

https://github.com/apache/spark/pull/10500#issuecomment-167700900
  
ok to test



[GitHub] spark pull request: [DOC] Adjust coverage for partitionBy()

2015-12-28 Thread marmbrus
Github user marmbrus commented on a diff in the pull request:

https://github.com/apache/spark/pull/10499#discussion_r48518567
  
--- Diff: 
sql/core/src/main/scala/org/apache/spark/sql/DataFrameWriter.scala ---
@@ -119,7 +119,7 @@ final class DataFrameWriter private[sql](df: DataFrame) 
{
* Partitions the output by the given columns on the file system. If 
specified, the output is
* laid out on the file system similar to Hive's partitioning scheme.
*
-   * This is only applicable for Parquet at the moment.
+   * This was initially applicable for Parquet but in 1.5+ covers JSON as
well.
--- End diff --

also "text"



[GitHub] spark pull request: [SPARK-12457] [SQL] Add ExpressionDescription ...

2015-12-28 Thread SparkQA
Github user SparkQA commented on the pull request:

https://github.com/apache/spark/pull/10418#issuecomment-167701068
  
**[Test build #48385 has 
finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/48385/consoleFull)**
 for PR 10418 at commit 
[`64f32a4`](https://github.com/apache/spark/commit/64f32a43848a0d458b5d42f37688e7af17c5f336).
 * This patch passes all tests.
 * This patch merges cleanly.
 * This patch adds no public classes.



[GitHub] spark pull request: [SPARK-12526][SPARKR]`ifelse`, `when`, `otherw...

2015-12-28 Thread sun-rui
Github user sun-rui commented on a diff in the pull request:

https://github.com/apache/spark/pull/10481#discussion_r48518527
  
--- Diff: R/pkg/R/column.R ---
@@ -225,7 +225,7 @@ setMethod("%in%",
 setMethod("otherwise",
   signature(x = "Column", value = "ANY"),
   function(x, value) {
-value <- ifelse(class(value) == "Column", value@jc, value)
+value <- if(class(value) == "Column") { value@jc } else { 
value }
--- End diff --

if( :)



[GitHub] spark pull request: [SPARK-12457] [SQL] Add ExpressionDescription ...

2015-12-28 Thread AmplabJenkins
Github user AmplabJenkins commented on the pull request:

https://github.com/apache/spark/pull/10418#issuecomment-167701117
  
Test PASSed.
Refer to this link for build results (access rights to CI server needed): 
https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/48385/
Test PASSed.



[GitHub] spark pull request: [SPARK-12457] [SQL] Add ExpressionDescription ...

2015-12-28 Thread AmplabJenkins
Github user AmplabJenkins commented on the pull request:

https://github.com/apache/spark/pull/10418#issuecomment-167701116
  
Merged build finished. Test PASSed.



[GitHub] spark pull request: [SPARK-12443][SQL] encoderFor should support D...

2015-12-28 Thread viirya
Github user viirya commented on the pull request:

https://github.com/apache/spark/pull/10399#issuecomment-167702564
  
@rxin Could you check this? Thanks.



[GitHub] spark pull request: [SPARK-12512][SQL] support column name with do...

2015-12-28 Thread SparkQA
Github user SparkQA commented on the pull request:

https://github.com/apache/spark/pull/10500#issuecomment-167702840
  
**[Test build #48390 has 
started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/48390/consoleFull)**
 for PR 10500 at commit 
[`6372f92`](https://github.com/apache/spark/commit/6372f92d7ce57cdf12ed98af513b11d97e613a88).



[GitHub] spark pull request: [SPARK-12547][SQL] Tighten scala style checker...

2015-12-28 Thread rxin
GitHub user rxin opened a pull request:

https://github.com/apache/spark/pull/10501

[SPARK-12547][SQL] Tighten scala style checker enforcement for UDF 
registration

We use scalastyle:off to turn off style checks in certain places where it
is not possible to follow the style guide. This is usually ok. However, in UDF
registration, we disable the checker for a large amount of code simply because
some lines exceed the 100-character line limit. It is better to disable only
the line-length check rather than everything.

In this pull request, I only disabled the line-length check and fixed one
problem (a lack of explicit types for public methods).


You can merge this pull request into a Git repository by running:

$ git pull https://github.com/rxin/spark SPARK-12547

Alternatively you can review and apply these changes as the patch at:

https://github.com/apache/spark/pull/10501.patch

To close this pull request, make a commit to your master/trunk branch
with (at least) the following in the commit message:

This closes #10501


commit 5157f276a68eef3eebf70df66ee526f1529ac354
Author: Reynold Xin 
Date:   2015-12-29T02:40:04Z

[SPARK-12547][SQL] Tighten scala style checker enforcement for UDF 
registration.







[GitHub] spark pull request: [DOC] Adjust coverage for partitionBy()

2015-12-28 Thread SparkQA
Github user SparkQA commented on the pull request:

https://github.com/apache/spark/pull/10499#issuecomment-167703422
  
**[Test build #48391 has 
started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/48391/consoleFull)**
 for PR 10499 at commit 
[`dff3935`](https://github.com/apache/spark/commit/dff3935b571bcbf121aa017b1cf52bc5757d04ab).





[GitHub] spark pull request: [SPARK-12453] [Streaming] Spark Streaming Kine...

2015-12-28 Thread srowen
Github user srowen commented on the pull request:

https://github.com/apache/spark/pull/10416#issuecomment-167509793
  
Roger that. @Schadix, would you mind closing this PR?





[GitHub] spark pull request: [SPARK-8641][SPARK-12455][SQL] Native Spark Wi...

2015-12-28 Thread SparkQA
Github user SparkQA commented on the pull request:

https://github.com/apache/spark/pull/10402#issuecomment-167516759
  
**[Test build #48362 has 
started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/48362/consoleFull)**
 for PR 10402 at commit 
[`767305a`](https://github.com/apache/spark/commit/767305a58c47fffb1ced4483e3c4a938e5383143).





[GitHub] spark pull request: [SPARK-12457] [SQL] Add ExpressionDescription ...

2015-12-28 Thread hvanhovell
Github user hvanhovell commented on a diff in the pull request:

https://github.com/apache/spark/pull/10418#discussion_r48469649
  
--- Diff: 
sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/expressions/collectionOperations.scala
 ---
@@ -44,6 +48,11 @@ case class Size(child: Expression) extends 
UnaryExpression with ExpectsInputType
  * Sorts the input array in ascending / descending order according to the 
natural ordering of
  * the array elements and returns it.
  */
+@ExpressionDescription(
+  usage = "_FUNC_(array, ascendingOrder) - Sorts the input array for the 
given column in " +
--- End diff --

This will fail to compile in scala 2.11. Use a raw string here.
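
What that fix looks like, sketched against the diff above (the annotation and class names follow the diff; the fragment is abbreviated and not meant to compile standalone):

```scala
// Before: Java annotation arguments must be compile-time constants, and in
// Scala 2.11 a `+`-concatenation of string literals is not constant-folded,
// so this fails to compile:
//   @ExpressionDescription(
//     usage = "_FUNC_(array, ascendingOrder) - Sorts the input array for the " +
//       "given column in ascending order.")

// After: a single raw (triple-quoted) string literal is a constant expression.
@ExpressionDescription(
  usage = """_FUNC_(array, ascendingOrder) - Sorts the input array for the
    given column in ascending order.""")
case class SortArray(base: Expression, ascendingOrder: Expression)
```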





[GitHub] spark pull request: [SPARK-12224][SPARKR] R support for JDBC sourc...

2015-12-28 Thread sun-rui
Github user sun-rui commented on a diff in the pull request:

https://github.com/apache/spark/pull/10480#discussion_r48472149
  
--- Diff: R/pkg/R/SQLContext.R ---
@@ -556,3 +556,61 @@ createExternalTable <- function(sqlContext, tableName, 
path = NULL, source = NUL
   sdf <- callJMethod(sqlContext, "createExternalTable", tableName, source, 
options)
   dataFrame(sdf)
 }
+
+#' Create a DataFrame representing the database table accessible via JDBC 
URL
+#'
+#' Additional JDBC database connection properties can be set (...)
+#'
+#' Only one of partitionColumn or predicates should be set. Partitions of 
the table will be
+#' retrieved in parallel based on the `numPartitions` or by the predicates.
+#'
+#' Don't create too many partitions in parallel on a large cluster; 
otherwise Spark might crash
+#' your external database systems.
+#'
+#' @param sqlContext SQLContext to use
+#' @param url JDBC database url of the form `jdbc:subprotocol:subname`
+#' @param tableName the name of the table in the external database
+#' @param partitionColumn the name of a column of integral type that will 
be used for partitioning
+#' @param lowerBound the minimum value of `partitionColumn` used to decide 
partition stride
+#' @param upperBound the maximum value of `partitionColumn` used to decide 
partition stride
+#' @param numPartitions the number of partitions, This, along with 
`lowerBound` (inclusive),
+#'  `upperBound` (exclusive), form partition strides 
for generated WHERE
+#'  clause expressions used to split the column 
`partitionColumn` evenly.
+#'  This defaults to SparkContext.defaultParallelism 
when unset.
+#' @param predicates a list of conditions in the where clause; each one 
defines one partition
--- End diff --

State that the `predicates` parameter is mutually exclusive with
`partitionColumn`/`lowerBound`/`upperBound`/`numPartitions`.
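
The stride behavior the roxygen doc describes for `numPartitions`/`lowerBound`/`upperBound` can be sketched in Scala. The helper name and the exact clause shapes are assumptions for illustration, not Spark's actual JDBCRelation code:

```scala
// Derive one WHERE-clause predicate per partition from
// (lowerBound inclusive, upperBound exclusive, numPartitions).
def partitionWhereClauses(
    column: String,
    lowerBound: Long,
    upperBound: Long,
    numPartitions: Int): Seq[String] = {
  val stride = (upperBound - lowerBound) / numPartitions
  (0 until numPartitions).map { i =>
    val lo = lowerBound + i * stride
    val hi = lo + stride
    if (i == 0) s"$column < $hi"                        // first partition: open below
    else if (i == numPartitions - 1) s"$column >= $lo"  // last partition: open above
    else s"$column >= $lo AND $column < $hi"
  }
}
```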





[GitHub] spark pull request: [SPARK-12224][SPARKR] R support for JDBC sourc...

2015-12-28 Thread sun-rui
Github user sun-rui commented on a diff in the pull request:

https://github.com/apache/spark/pull/10480#discussion_r48472173
  
--- Diff: R/pkg/R/SQLContext.R ---
@@ -556,3 +556,61 @@ createExternalTable <- function(sqlContext, tableName, 
path = NULL, source = NUL
   sdf <- callJMethod(sqlContext, "createExternalTable", tableName, source, 
options)
   dataFrame(sdf)
 }
+
+#' Create a DataFrame representing the database table accessible via JDBC 
URL
+#'
+#' Additional JDBC database connection properties can be set (...)
+#'
+#' Only one of partitionColumn or predicates should be set. Partitions of 
the table will be
+#' retrieved in parallel based on the `numPartitions` or by the predicates.
+#'
+#' Don't create too many partitions in parallel on a large cluster; 
otherwise Spark might crash
+#' your external database systems.
+#'
+#' @param sqlContext SQLContext to use
+#' @param url JDBC database url of the form `jdbc:subprotocol:subname`
+#' @param tableName the name of the table in the external database
+#' @param partitionColumn the name of a column of integral type that will 
be used for partitioning
+#' @param lowerBound the minimum value of `partitionColumn` used to decide 
partition stride
+#' @param upperBound the maximum value of `partitionColumn` used to decide 
partition stride
+#' @param numPartitions the number of partitions, This, along with 
`lowerBound` (inclusive),
+#'  `upperBound` (exclusive), form partition strides 
for generated WHERE
+#'  clause expressions used to split the column 
`partitionColumn` evenly.
+#'  This defaults to SparkContext.defaultParallelism 
when unset.
+#' @param predicates a list of conditions in the where clause; each one 
defines one partition
+#' @return DataFrame
+#' @rdname read.jdbc
+#' @name read.jdbc
+#' @export
+#' @examples
+#'\dontrun{
+#' sc <- sparkR.init()
+#' sqlContext <- sparkRSQL.init(sc)
+#' jdbcUrl <- "jdbc:mysql://localhost:3306/databasename"
+#' df <- read.jdbc(sqlContext, jdbcUrl, "table", predicates = 
list("field<=123"), user = "username")
+#' df2 <- read.jdbc(sqlContext, jdbcUrl, "table2", partitionColumn = 
"index", lowerBound = 0,
+#'  upperBound = 1, user = "username", password = 
"password")
+#' }
+
+read.jdbc <- function(sqlContext, url, tableName,
+  partitionColumn = NULL, lowerBound = NULL, 
upperBound = NULL,
+  numPartitions = 0L, predicates = list(), ...) {
+  jprops <- envToJProperties(varargsToEnv(...))
+
+  read <- callJMethod(sqlContext, "read")
+  if (!is.null(partitionColumn)) {
--- End diff --

Add a mutual-exclusion check for `predicates`?





[GitHub] spark pull request: [SPARK-12453][Streaming] Remove explicit depen...

2015-12-28 Thread srowen
Github user srowen commented on the pull request:

https://github.com/apache/spark/pull/10492#issuecomment-167543175
  
@JoshRosen it's not the scope that's the issue but the version. Not 
specifying it lets the SDK version required by the Kinesis client come in at 
whatever it needs to be. I am not sure `provided` scope works, since the SDK 
really does need to be bundled and isn't necessarily available from the 
environment otherwise.

There's a little wrinkle here in that the Kinesis code uses SDK classes 
directly, so technically the POM should declare that. However, the SDK is used 
only in the context of the Kinesis client, so it seems like the lesser evil to 
rely on it as a transitive dependency, at the right version.
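
A hypothetical POM fragment of that arrangement (the artifact coordinates are the real Kinesis client ones; the version property name here is invented for illustration):

```xml
<!-- Depend on the Kinesis client directly; do not pin aws-java-sdk. -->
<dependency>
  <groupId>com.amazonaws</groupId>
  <artifactId>amazon-kinesis-client</artifactId>
  <version>${aws.kinesis.client.version}</version>
</dependency>
<!-- No explicit aws-java-sdk dependency: the SDK classes used by the
     Kinesis connector then resolve transitively at whatever SDK version
     the client itself requires. -->
```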





[GitHub] spark pull request: [SPARK-12530][Build] Fix build break at Spark-...

2015-12-28 Thread hvanhovell
Github user hvanhovell commented on a diff in the pull request:

https://github.com/apache/spark/pull/10488#discussion_r48475897
  
--- Diff: 
sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/expressions/misc.scala
 ---
@@ -57,9 +57,10 @@ case class Md5(child: Expression) extends 
UnaryExpression with ImplicitCastInput
  * the hash length is not one of the permitted values, the return value is 
NULL.
  */
 @ExpressionDescription(
-  usage = "_FUNC_(input, bitLength) - Returns a checksum of SHA-2 family 
as a hex string of the " +
-"input. SHA-224, SHA-256, SHA-384, and SHA-512 are supported. Bit 
length of 0 is equivalent " +
-"to 256",
+  usage =
+"""_FUNC_(input, bitLength) - Returns a checksum of SHA-2 family as a 
hex string of the input.
+  SHA-224, SHA-256, SHA-384, and SHA-512 are supported. Bit length of 
0 is equivalent to 256."""
+  ,
--- End diff --

Does that work under 2.11?





[GitHub] spark pull request: [SPARK-12224][SPARKR] R support for JDBC sourc...

2015-12-28 Thread sun-rui
Github user sun-rui commented on the pull request:

https://github.com/apache/spark/pull/10480#issuecomment-167517583
  
To test JDBC, we can add a helper function on the Scala side that reuses 
code in JDBCSuite to start an in-memory JDBC server?





[GitHub] spark pull request: [SPARK-8641][SPARK-12455][SQL] Native Spark Wi...

2015-12-28 Thread SparkQA
Github user SparkQA commented on the pull request:

https://github.com/apache/spark/pull/10402#issuecomment-167520443
  
**[Test build #48363 has 
started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/48363/consoleFull)**
 for PR 10402 at commit 
[`13f9c95`](https://github.com/apache/spark/commit/13f9c95590bbee7790e74768e7b42fb0e0161b9d).





[GitHub] spark pull request: [SPARK-12525] Fix fatal compiler warnings in K...

2015-12-28 Thread srowen
Github user srowen commented on the pull request:

https://github.com/apache/spark/pull/10479#issuecomment-167539129
  
LGTM





[GitHub] spark pull request: [Spark-10625] [SQL] Spark SQL JDBC read/write ...

2015-12-28 Thread SparkQA
Github user SparkQA commented on the pull request:

https://github.com/apache/spark/pull/8785#issuecomment-167539588
  
**[Test build #2258 has 
started](https://amplab.cs.berkeley.edu/jenkins/job/NewSparkPullRequestBuilder/2258/consoleFull)**
 for PR 8785 at commit 
[`6af8fd8`](https://github.com/apache/spark/commit/6af8fd8b824f5f343a01868560b74a1f55acd02f).





[GitHub] spark pull request: [Spark-10625] [SQL] Spark SQL JDBC read/write ...

2015-12-28 Thread SparkQA
Github user SparkQA commented on the pull request:

https://github.com/apache/spark/pull/8785#issuecomment-167539724
  
**[Test build #2258 has 
finished](https://amplab.cs.berkeley.edu/jenkins/job/NewSparkPullRequestBuilder/2258/consoleFull)**
 for PR 8785 at commit 
[`6af8fd8`](https://github.com/apache/spark/commit/6af8fd8b824f5f343a01868560b74a1f55acd02f).
 * This patch **fails Scala style tests**.
 * This patch merges cleanly.
 * This patch adds no public classes.





[GitHub] spark pull request: [SPARK-12536] [SQL] Added "Empty Seq" in Expla...

2015-12-28 Thread AmplabJenkins
Github user AmplabJenkins commented on the pull request:

https://github.com/apache/spark/pull/10494#issuecomment-167539570
  
Test PASSed.
Refer to this link for build results (access rights to CI server needed): 
https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/48361/
Test PASSed.





[GitHub] spark pull request: [SPARK-12340][SQL]fix Int overflow in the Spar...

2015-12-28 Thread srowen
Github user srowen commented on the pull request:

https://github.com/apache/spark/pull/10487#issuecomment-167540446
  
LGTM





[GitHub] spark pull request: [SPARK-12513] [Streaming] SocketReceiver hang ...

2015-12-28 Thread srowen
Github user srowen commented on the pull request:

https://github.com/apache/spark/pull/10464#issuecomment-167541966
  
Does this really solve the problem? The current code already appears to clean 
up the socket on stopping, so I wonder why this would fix it. Did you test it?
Wouldn't it make more sense to open the socket in onStart if you close it in 
onStop?
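
The lifecycle being suggested, as a loose sketch. The method names follow Spark's Receiver API, but this standalone class does not extend it and is not the actual SocketReceiver code:

```scala
import java.io.IOException
import java.net.Socket

// Open the socket in onStart and close it in onStop, so that stopping the
// receiver unblocks any thread stuck reading from the stream.
class SocketLifecycleSketch(host: String, port: Int) {
  @volatile private var socket: Socket = _

  def onStart(): Unit = {
    socket = new Socket(host, port) // connect when the receiver starts
    new Thread("Socket Receiver") {
      override def run(): Unit = receive()
    }.start()
  }

  def onStop(): Unit = {
    // close() makes a blocked read in receive() fail fast instead of hanging
    if (socket != null) socket.close()
  }

  private def receive(): Unit =
    try {
      val in = socket.getInputStream
      // ... read records from `in` and store them until the stream ends ...
    } catch {
      case _: IOException => // socket closed by onStop; exit quietly
    }
}
```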





[GitHub] spark pull request: we should have training and test sets

2015-12-28 Thread XD-DENG
Github user XD-DENG commented on the pull request:

https://github.com/apache/spark/pull/10434#issuecomment-167541958
  
Thanks for clarifying. I'll have a look and see whether I can proceed via 
JIRA as you suggested.

Thanks





[GitHub] spark pull request: Add more exceptions to Guava relocation

2015-12-28 Thread microhello
Github user microhello commented on a diff in the pull request:

https://github.com/apache/spark/pull/10442#discussion_r48467911
  
--- Diff: pom.xml ---
@@ -99,14 +99,14 @@
 sql/hive
 unsafe
 assembly
-external/twitter
-external/flume
-external/flume-sink
-external/flume-assembly
-external/mqtt
-external/mqtt-assembly
-external/zeromq
-examples
+
--- End diff --

OK





[GitHub] spark pull request: [SPARK-12536] [SQL] Added "Empty Seq" in Expla...

2015-12-28 Thread cloud-fan
Github user cloud-fan commented on a diff in the pull request:

https://github.com/apache/spark/pull/10494#discussion_r48468287
  
--- Diff: 
sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/plans/logical/LocalRelation.scala
 ---
@@ -62,6 +62,10 @@ case class LocalRelation(output: Seq[Attribute], data: 
Seq[InternalRow] = Nil)
 case _ => false
   }
 
+  override def simpleString: String =
+if (data == Seq.empty) super.simpleString + " [Empty Seq]"
--- End diff --

should we do it in `TreeNode`? cc @marmbrus @yhuai 





[GitHub] spark pull request: [SPARK-8641][SPARK-12455][SQL] Native Spark Wi...

2015-12-28 Thread AmplabJenkins
Github user AmplabJenkins commented on the pull request:

https://github.com/apache/spark/pull/10402#issuecomment-167539426
  
Merged build finished. Test PASSed.





[GitHub] spark pull request: [SPARK-8641][SPARK-12455][SQL] Native Spark Wi...

2015-12-28 Thread SparkQA
Github user SparkQA commented on the pull request:

https://github.com/apache/spark/pull/10402#issuecomment-167539345
  
**[Test build #48362 has 
finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/48362/consoleFull)**
 for PR 10402 at commit 
[`767305a`](https://github.com/apache/spark/commit/767305a58c47fffb1ced4483e3c4a938e5383143).
 * This patch passes all tests.
 * This patch merges cleanly.
 * This patch adds no public classes.





[GitHub] spark pull request: [SPARK-12530][Build] Fix build break at Spark-...

2015-12-28 Thread kiszk
Github user kiszk commented on a diff in the pull request:

https://github.com/apache/spark/pull/10488#discussion_r48472393
  
--- Diff: 
sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/expressions/misc.scala
 ---
@@ -57,9 +57,10 @@ case class Md5(child: Expression) extends 
UnaryExpression with ImplicitCastInput
  * the hash length is not one of the permitted values, the return value is 
NULL.
  */
 @ExpressionDescription(
-  usage = "_FUNC_(input, bitLength) - Returns a checksum of SHA-2 family 
as a hex string of the " +
-"input. SHA-224, SHA-256, SHA-384, and SHA-512 are supported. Bit 
length of 0 is equivalent " +
-"to 256",
+  usage =
+"""_FUNC_(input, bitLength) - Returns a checksum of SHA-2 family as a 
hex string of the input.
+  SHA-224, SHA-256, SHA-384, and SHA-512 are supported. Bit length of 
0 is equivalent to 256."""
+  ,
--- End diff --

Yes, this keeps each line within 100 characters.





[GitHub] spark pull request: [SPARK-8641][SPARK-12455][SQL] Native Spark Wi...

2015-12-28 Thread AmplabJenkins
Github user AmplabJenkins commented on the pull request:

https://github.com/apache/spark/pull/10402#issuecomment-167539427
  
Test PASSed.
Refer to this link for build results (access rights to CI server needed): 
https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/48362/
Test PASSed.





[GitHub] spark pull request: [SPARK-12263][Docs]: IllegalStateException: Me...

2015-12-28 Thread srowen
Github user srowen commented on the pull request:

https://github.com/apache/spark/pull/10483#issuecomment-167539239
  
@nssalian you need to fix the line that is now too long.





[GitHub] spark pull request: we should have training and test sets

2015-12-28 Thread srowen
Github user srowen commented on the pull request:

https://github.com/apache/spark/pull/10434#issuecomment-167539553
  
@XD-DENG can you address my comments or close this PR?





[GitHub] spark pull request: [SPARK-12536] [SQL] Added "Empty Seq" in Expla...

2015-12-28 Thread SparkQA
Github user SparkQA commented on the pull request:

https://github.com/apache/spark/pull/10494#issuecomment-167539491
  
**[Test build #48361 has 
finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/48361/consoleFull)**
 for PR 10494 at commit 
[`21080af`](https://github.com/apache/spark/commit/21080afd995e3df141db1531c065ace6eac4fa77).
 * This patch passes all tests.
 * This patch merges cleanly.
 * This patch adds no public classes.





[GitHub] spark pull request: [Spark-10625] [SQL] Spark SQL JDBC read/write ...

2015-12-28 Thread srowen
Github user srowen commented on a diff in the pull request:

https://github.com/apache/spark/pull/8785#discussion_r48472434
  
--- Diff: 
sql/core/src/test/scala/org/apache/spark/sql/jdbc/UnserializableDriverHelper.scala
 ---
@@ -0,0 +1,53 @@
+/*
+ * Licensed to the Apache Software Foundation (ASF) under one or more
+ * contributor license agreements.  See the NOTICE file distributed with
+ * this work for additional information regarding copyright ownership.
+ * The ASF licenses this file to You under the Apache License, Version 2.0
+ * (the "License"); you may not use this file except in compliance with
+ * the License.  You may obtain a copy of the License at
+ *
+ *http://www.apache.org/licenses/LICENSE-2.0
+ *
+ * Unless required by applicable law or agreed to in writing, software
+ * distributed under the License is distributed on an "AS IS" BASIS,
+ * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+ * See the License for the specific language governing permissions and
+ * limitations under the License.
+ */
+
+package org.apache.spark.sql.jdbc
+
+import java.sql.{DriverManager, Connection}
+import java.util.Properties
+import java.util.logging.Logger
+
+object UnserializableDriverHelper {
+
+  import scala.collection.JavaConverters._
--- End diff --

Why is this imported locally in a few places? Below, you don't import 
org.h2.Driver, though. I'm not worried about changing it.





[GitHub] spark pull request: [SPARK-12536] [SQL] Added "Empty Seq" in Expla...

2015-12-28 Thread AmplabJenkins
Github user AmplabJenkins commented on the pull request:

https://github.com/apache/spark/pull/10494#issuecomment-167539569
  
Merged build finished. Test PASSed.





[GitHub] spark pull request: we should have training and test sets

2015-12-28 Thread XD-DENG
Github user XD-DENG commented on the pull request:

https://github.com/apache/spark/pull/10434#issuecomment-167540341
  
@srowen Hi Owen, sure. Thanks a lot for your clarification. 

My understanding was that you found this modification unnecessary, so I didn't proceed further.

Happy new year.





[GitHub] spark pull request: we should have training and test sets

2015-12-28 Thread XD-DENG
Github user XD-DENG closed the pull request at:

https://github.com/apache/spark/pull/10434





[GitHub] spark pull request: [SPARK-12457] [SQL] Add ExpressionDescription ...

2015-12-28 Thread hvanhovell
Github user hvanhovell commented on a diff in the pull request:

https://github.com/apache/spark/pull/10418#discussion_r48469610
  
--- Diff: sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/analysis/FunctionRegistry.scala ---
@@ -335,7 +335,7 @@ object FunctionRegistry {
 val df = clazz.getAnnotation(classOf[ExpressionDescription])
 if (df != null) {
   (name,
-(new ExpressionInfo(clazz.getCanonicalName, name, df.usage(), df.extended()),
+(new ExpressionInfo(clazz.getCanonicalName, name, df.usage(), df.extended().stripMargin),
--- End diff --

Why would you want to add a new line character in a raw string?

It would be nice to add ```stripMargin``` to ```df.usage``` as well.
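For readers unfamiliar with `stripMargin`: it removes leading whitespace up to and including a margin character (`|` by default) on each line of a multi-line string, so a raw string can be indented in the source yet produce flush-left text. A standalone sketch, not taken from the PR; note that lines without a margin character are left unchanged:

```scala
object StripMarginSketch {
  def main(args: Array[String]): Unit = {
    // Everything up to and including each '|' is stripped; the first line
    // has no margin character and is left as-is.
    val usage =
      """_FUNC_(expr) - Example usage text.
        |Second line, indented in the source but flush after stripMargin.""".stripMargin
    assert(usage ==
      "_FUNC_(expr) - Example usage text.\nSecond line, indented in the source but flush after stripMargin.")
    println(usage)
  }
}
```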





[GitHub] spark pull request: [SPARK-12530][Build] Fix build break at Spark-...

2015-12-28 Thread hvanhovell
Github user hvanhovell commented on a diff in the pull request:

https://github.com/apache/spark/pull/10488#discussion_r48470392
  
--- Diff: sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/expressions/misc.scala ---
@@ -57,9 +57,10 @@ case class Md5(child: Expression) extends UnaryExpression with ImplicitCastInput
  * the hash length is not one of the permitted values, the return value is NULL.
  */
 @ExpressionDescription(
-  usage = "_FUNC_(input, bitLength) - Returns a checksum of SHA-2 family as a hex string of the " +
-    "input. SHA-224, SHA-256, SHA-384, and SHA-512 are supported. Bit length of 0 is equivalent " +
-    "to 256",
+  usage =
+    """_FUNC_(input, bitLength) - Returns a checksum of SHA-2 family as a hex string of the input.
+      SHA-224, SHA-256, SHA-384, and SHA-512 are supported. Bit length of 0 is equivalent to 256."""
+  ,
--- End diff --

Nit style. I guess this was needed to stay within 100 characters?
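As background on the behaviour the usage string describes, here is a standalone Scala sketch of a SHA-2 digest keyed by bit length, using `java.security.MessageDigest`. This is an illustration only, not Spark's implementation: the Spark expression returns NULL for unsupported bit lengths, while this sketch throws.

```scala
import java.security.MessageDigest

object Sha2Sketch {
  def main(args: Array[String]): Unit = {
    // Map the bit length to a JDK algorithm name; a bit length of 0 is
    // documented as equivalent to 256.
    def sha2Hex(input: Array[Byte], bitLength: Int): String = {
      val algorithm = bitLength match {
        case 0 | 256 => "SHA-256"
        case 224     => "SHA-224"
        case 384     => "SHA-384"
        case 512     => "SHA-512"
        case other   => throw new IllegalArgumentException(s"Unsupported bit length: $other")
      }
      MessageDigest.getInstance(algorithm).digest(input).map("%02x".format(_)).mkString
    }
    val hex = sha2Hex("Spark".getBytes("UTF-8"), 0)
    assert(hex.length == 64) // SHA-256 yields 32 bytes, i.e. 64 hex characters
    println(hex)
  }
}
```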





[GitHub] spark pull request: [SPARK-12534][DOC] update documentation to lis...

2015-12-28 Thread srowen
Github user srowen commented on a diff in the pull request:

https://github.com/apache/spark/pull/10491#discussion_r48471720
  
--- Diff: docs/configuration.md ---
@@ -120,7 +120,8 @@ of the most common options to set are:
   spark.driver.cores
   1
   
-Number of cores to use for the driver process, only in cluster mode.
+Number of cores to use for the driver process, only in cluster mode. This can be set through
+--driver-cores command line option.
--- End diff --

I don't think the purpose of this file is to document how the CLI works. It 
should stick to documenting the underlying properties.





[GitHub] spark pull request: [SPARK-12526][SPARKR]`ifelse`, `when`, `otherw...

2015-12-28 Thread sun-rui
Github user sun-rui commented on the pull request:

https://github.com/apache/spark/pull/10481#issuecomment-167540565
  
The fix is good, but some style nit:
if (...) { ... } else { ... }






[GitHub] spark pull request: [SPARK-12424][ML] The implementation of ParamM...

2015-12-28 Thread AmplabJenkins
Github user AmplabJenkins commented on the pull request:

https://github.com/apache/spark/pull/10381#issuecomment-167551863
  
Test PASSed.
Refer to this link for build results (access rights to CI server needed): 
https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/48364/
Test PASSed.





[GitHub] spark pull request: [SPARK-12424][ML] The implementation of ParamM...

2015-12-28 Thread SparkQA
Github user SparkQA commented on the pull request:

https://github.com/apache/spark/pull/10381#issuecomment-167551695
  
**[Test build #48364 has finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/48364/consoleFull)** for PR 10381 at commit [`ea924e9`](https://github.com/apache/spark/commit/ea924e935ea8adb7cb9bdc8a7ac0da1fa32c0328).
 * This patch passes all tests.
 * This patch merges cleanly.
 * This patch adds no public classes.





[GitHub] spark pull request: [SPARK-12424][ML] The implementation of ParamM...

2015-12-28 Thread AmplabJenkins
Github user AmplabJenkins commented on the pull request:

https://github.com/apache/spark/pull/10381#issuecomment-167551862
  
Merged build finished. Test PASSed.




