[GitHub] spark pull request: [SPARK-5526][SQL] fix issue about cast to date
Github user viper-kun closed the pull request at: https://github.com/apache/spark/pull/4307 --- If your project is set up for it, you can reply to this email and have your reply appear on GitHub as well. If your project does not have this feature enabled and wishes so, or if the feature is enabled but not working, please contact infrastructure at infrastruct...@apache.org or file a JIRA ticket with INFRA. --- - To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org For additional commands, e-mail: reviews-h...@spark.apache.org
[GitHub] spark pull request: [SPARK-5574] use given name prefix in dir
Github user pwendell commented on the pull request: https://github.com/apache/spark/pull/4344#issuecomment-72779281 LGTM too
[GitHub] spark pull request: [SPARK-4707][STREAMING] Reliable Kafka Receive...
Github user tdas commented on the pull request: https://github.com/apache/spark/pull/3655#issuecomment-72784543 ok to test.
[GitHub] spark pull request: [SPARK-2945][YARN][Doc]add doc for spark.execu...
Github user WangTaoTheTonic commented on the pull request: https://github.com/apache/spark/pull/4350#issuecomment-72785278 retest this please.
[GitHub] spark pull request: [SPARK-2945][YARN][Doc]add doc for spark.execu...
Github user SparkQA commented on the pull request: https://github.com/apache/spark/pull/4350#issuecomment-72785441 [Test build #26713 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/26713/consoleFull) for PR 4350 at commit [`4c3913a`](https://github.com/apache/spark/commit/4c3913add23b39e4c5a5120d8a56917f972e1b4b). * This patch merges cleanly.
[GitHub] spark pull request: [SPARK-5470][Core]use defaultClassLoader to lo...
Github user SparkQA commented on the pull request: https://github.com/apache/spark/pull/4258#issuecomment-72786916 [Test build #26707 has finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/26707/consoleFull) for PR 4258 at commit [`73b719f`](https://github.com/apache/spark/commit/73b719f5bc7b69fca8d51cb8b991074dc92e50ed). * This patch **passes all tests**. * This patch merges cleanly. * This patch adds no public classes.
[GitHub] spark pull request: [SPARK-5380][GraphX] Solve an ArrayIndexOutOfB...
Github user maropu commented on the pull request: https://github.com/apache/spark/pull/4176#issuecomment-72786951 I wonder if stopping the process is the best solution. If there is only one illegal entry in the last line, we need to retry loading the whole file, which is time-consuming. Another idea is to silently redirect illegal entries into a file or similar, and only report the number of redirected entries at the end of bulk loading. Moreover, I think it'd be better to add a new API to append these entries to an existing Graph, such as GraphOps.addEdges(edges: RDD[Edge[ED]]).
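The redirect idea above can be sketched in plain Scala. All names here are illustrative: the real loader works on RDDs, and `Edge` stands in for GraphX's `Edge`, so this is only a sketch of the partition-and-count approach, not the actual GraphLoader code.

```scala
import scala.util.Try

// Stand-in for org.apache.spark.graphx.Edge; parsing rules are illustrative.
case class Edge(src: Long, dst: Long)

// Split input lines into parsed edges and rejected (illegal) lines, so a
// single bad line does not force reloading the whole file.
def parseEdges(lines: Seq[String]): (Seq[Edge], Seq[String]) = {
  val parsed = lines.map { line =>
    val parts = line.trim.split("\\s+")
    // Try swallows malformed lines (missing fields, non-numeric ids).
    (line, Try(Edge(parts(0).toLong, parts(1).toLong)).toOption)
  }
  val edges   = parsed.collect { case (_, Some(e)) => e }
  val rejects = parsed.collect { case (l, None)    => l }
  (edges, rejects)
}

val (edges, rejects) = parseEdges(Seq("1 2", "3 4", "oops"))
// Only rejects.size needs to be reported at the end of bulk loading.
```
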
[GitHub] spark pull request: [SPARK-5426][SQL] Add SparkSQL Java API helper...
Github user kul commented on the pull request: https://github.com/apache/spark/pull/4243#issuecomment-72786838 @marmbrus Thanks for the review! Rebased against master and squashed into a new commit, renaming `schemaRDDOperations` to the more aptly named `dataFrameRDDOperations`.
[GitHub] spark pull request: [SPARK-5470][Core]use defaultClassLoader to lo...
Github user AmplabJenkins commented on the pull request: https://github.com/apache/spark/pull/4258#issuecomment-72786922 Test PASSed. Refer to this link for build results (access rights to CI server needed): https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/26707/ Test PASSed.
[GitHub] spark pull request: [SPARK-5583][SQL][WIP] Support unique join in ...
GitHub user scwf opened a pull request: https://github.com/apache/spark/pull/4354 [SPARK-5583][SQL][WIP] Support unique join in hive context

Support unique join in hive context. The basic idea is to transform a unique join into an outer join plus a filter in Spark SQL:

FROM UNIQUEJOIN [PRESERVE] T1 a (a.key), [PRESERVE] T2 b (b.key), [PRESERVE] T3 c (c.key) ...

If all the tables have the PRESERVE keyword => T1 full outer join T2 full outer join T3 ...
If none of the tables has the PRESERVE keyword => T1 inner join T2 inner join T3 ...
Otherwise => T = (T1 full outer join T2 full outer join T3 ...), then filter T, keeping the rows in which any preserved key field is not null.

For example:
1. T1 a (a.key), PRESERVE T2 b (b.key), PRESERVE T3 c (c.key) => we keep the row if b.key is not null or c.key is not null
2. T1 a (a.key), T2 b (b.key), PRESERVE T3 c (c.key) => we keep the row if c.key is not null

Correct me if I am wrong. Todos: add tests for this.

You can merge this pull request into a Git repository by running: $ git pull https://github.com/scwf/spark unique-join Alternatively you can review and apply these changes as the patch at: https://github.com/apache/spark/pull/4354.patch To close this pull request, make a commit to your master/trunk branch with (at least) the following in the commit message: This closes #4354 commit b7e89a94cbeddcb53aac779d4b9d7de2d94e0325 Author: wangfei wangf...@huawei.com Date: 2015-02-03T05:29:09Z support unique join in hive context
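To make the filter step concrete, here is a minimal sketch of the "keep rows with any preserved key non-null" rule over plain Scala collections. The actual change operates on Catalyst plans; `JoinedRow` and `uniqueJoinFilter` are hypothetical names used only for illustration.

```scala
// One Option per table's join key, as produced by the full outer join.
case class JoinedRow(keys: Seq[Option[Int]])

// preserve(i) is true when table i carried the PRESERVE keyword.
def uniqueJoinFilter(rows: Seq[JoinedRow], preserve: Seq[Boolean]): Seq[JoinedRow] =
  rows.filter { row =>
    // Keep the row if any preserved table's key is non-null.
    row.keys.zip(preserve).exists { case (key, p) => p && key.isDefined }
  }

// Example 1 above: T1 a (a.key), PRESERVE T2 b (b.key), PRESERVE T3 c (c.key)
val rows = Seq(
  JoinedRow(Seq(Some(1), Some(1), None)), // b.key non-null: kept
  JoinedRow(Seq(Some(2), None, None))     // no preserved key non-null: dropped
)
val kept = uniqueJoinFilter(rows, Seq(false, true, true))
```
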
[GitHub] spark pull request: [SQL] Use HiveContext's sessionState in HiveMe...
GitHub user yhuai opened a pull request: https://github.com/apache/spark/pull/4355 [SQL] Use HiveContext's sessionState in HiveMetastoreCatalog.hiveDefaultTableFilePath `client.getDatabaseCurrent` uses SessionState's local variable, which can be an issue. You can merge this pull request into a Git repository by running: $ git pull https://github.com/yhuai/spark defaultTablePath Alternatively you can review and apply these changes as the patch at: https://github.com/apache/spark/pull/4355.patch To close this pull request, make a commit to your master/trunk branch with (at least) the following in the commit message: This closes #4355 commit 84a29e51b657f7b265f166a6ec25600c944cc440 Author: Yin Huai yh...@databricks.com Date: 2015-02-04T04:46:26Z Use HiveContext's sessionState instead of using SessionState's thread local variable.
[GitHub] spark pull request: [SPARK-4969][STREAMING][PYTHON] Add binaryReco...
Github user freeman-lab commented on the pull request: https://github.com/apache/spark/pull/3803#issuecomment-72793042 Thanks for the detailed look @tdas! Think I addressed both nits.
[GitHub] spark pull request: [SPARK-5498][SQL]fix bug when query the data w...
Github user chenghao-intel commented on the pull request: https://github.com/apache/spark/pull/4289#issuecomment-7280 In general I think the change looks reasonable to me, and we'd better use the Hive `ObjectConverter` directly; some of the code can then be cleaner.
[GitHub] spark pull request: [SPARK-4964] [Streaming] Exactly-once semantic...
Github user tdas commented on the pull request: https://github.com/apache/spark/pull/3798#issuecomment-72779779 Why did you choose the parameters metadata.broker.list and bootstrap.servers as the required kafka params? I looked at the Kafka docs, and it says that for consumers, the necessary properties are zookeeper.connect and group.id. And intuitively the application is consuming, so the consumer configs should apply (not group.id, but zookeeper.connect). So our interface should also require zookeeper.connect and not the other two. Isn't it?
[GitHub] spark pull request: [SPARK-5470][Core]use defaultClassLoader to lo...
Github user SparkQA commented on the pull request: https://github.com/apache/spark/pull/4258#issuecomment-72780813 [Test build #26707 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/26707/consoleFull) for PR 4258 at commit [`73b719f`](https://github.com/apache/spark/commit/73b719f5bc7b69fca8d51cb8b991074dc92e50ed). * This patch merges cleanly.
[GitHub] spark pull request: [SPARK-4964] [Streaming] Exactly-once semantic...
Github user SparkQA commented on the pull request: https://github.com/apache/spark/pull/3798#issuecomment-72782334 [Test build #26701 has finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/26701/consoleFull) for PR 3798 at commit [`8c31855`](https://github.com/apache/spark/commit/8c31855cf6b7327c6b6611e715457ba15bb79355). * This patch **passes all tests**. * This patch merges cleanly. * This patch adds the following public classes _(experimental)_: * ` class DeterministicKafkaInputDStreamCheckpointData extends DStreamCheckpointData(this) ` * `class KafkaCluster(val kafkaParams: Map[String, String]) extends Serializable ` * ` case class LeaderOffset(host: String, port: Int, offset: Long)` * `class KafkaRDDPartition(` * `trait HasOffsetRanges `
[GitHub] spark pull request: [SPARK-4964] [Streaming] Exactly-once semantic...
Github user AmplabJenkins commented on the pull request: https://github.com/apache/spark/pull/3798#issuecomment-72782343 Test PASSed. Refer to this link for build results (access rights to CI server needed): https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/26701/ Test PASSed.
[GitHub] spark pull request: [SPARK-4879] [WIP] Use driver to coordinate Ha...
Github user JoshRosen commented on the pull request: https://github.com/apache/spark/pull/4066#issuecomment-72782219 After some more thought and testing, I don't know if it's safe to ignore task failures that are due to commits being denied, since doing so risks infinite rescheduling if all commits are denied. On the other hand, treating these as failures could lead to spurious job failures in cases where you have many copies of one slow, speculated task (the old behavior would treat these as successful task completions).
[GitHub] spark pull request: [SPARK-4939] revive offers periodically in Loc...
Github user SparkQA commented on the pull request: https://github.com/apache/spark/pull/4147#issuecomment-72782236 [Test build #576 has finished](https://amplab.cs.berkeley.edu/jenkins/job/NewSparkPullRequestBuilder/576/consoleFull) for PR 4147 at commit [`33ac9bb`](https://github.com/apache/spark/commit/33ac9bb57f9e0e6a60e9ffd0eeeac7599aec8c49). * This patch **passes all tests**. * This patch merges cleanly. * This patch adds no public classes.
[GitHub] spark pull request: [SPARK-4795][Core] Redesign the primitive typ...
Github user zsxwing commented on a diff in the pull request: https://github.com/apache/spark/pull/3642#discussion_r24060542 --- Diff: graphx/src/test/scala/org/apache/spark/graphx/lib/ShortestPathsSuite.scala --- @@ -40,7 +40,7 @@ class ShortestPathsSuite extends FunSuite with LocalSparkContext { val graph = Graph.fromEdgeTuples(edges, 1) val landmarks = Seq(1, 4).map(_.toLong) val results = ShortestPaths.run(graph, landmarks).vertices.collect.map { -case (v, spMap) => (v, spMap.mapValues(_.get)) --- End diff -- If they are not ambiguous, I'd add the implicits back to make sure we never break. I added them back.
[GitHub] spark pull request: [SPARK-5388] Provide a stable application subm...
Github user SparkQA commented on the pull request: https://github.com/apache/spark/pull/4216#issuecomment-72786346 [Test build #26716 has finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/26716/consoleFull) for PR 4216 at commit [`792e112`](https://github.com/apache/spark/commit/792e1121dff43e69c84fe9cfff4fc1be61ba2af5). * This patch **fails Scala style tests**. * This patch merges cleanly. * This patch adds the following public classes _(experimental)_: * ` case class MasterStateResponse(` * `class LocalSparkCluster(` * ` * (4) the main class for the child` * ` case class BoundPortsResponse(actorPort: Int, webUIPort: Int, restPort: Option[Int])` * ` throw new SubmitRestMissingFieldException("Main class must be set in submit request.")` * `class SubmitRestProtocolException(message: String, cause: Exception = null)` * `class SubmitRestMissingFieldException(message: String) extends SubmitRestProtocolException(message)` * `abstract class SubmitRestProtocolMessage `
[GitHub] spark pull request: [SPARK-5388] Provide a stable application subm...
Github user AmplabJenkins commented on the pull request: https://github.com/apache/spark/pull/4216#issuecomment-72786347 Test FAILed. Refer to this link for build results (access rights to CI server needed): https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/26716/ Test FAILed.
[GitHub] spark pull request: [WIP] [SPARK-4587] [mllib] ML model import/exp...
Github user jkbradley commented on a diff in the pull request: https://github.com/apache/spark/pull/4233#discussion_r24061275 --- Diff: mllib/src/test/scala/org/apache/spark/mllib/classification/LogisticRegressionSuite.scala --- @@ -459,7 +461,41 @@ class LogisticRegressionSuite extends FunSuite with MLlibTestSparkContext with M // very steep curve in logistic function so that when we draw samples from distribution, it's // very easy to assign to another labels. However, this prediction result is consistent to R. validatePrediction(model.predict(validationRDD.map(_.features)).collect(), validationData, 0.47) + } + + test("model export/import") { --- End diff -- I'm reorganizing the code somewhat to make it easier to keep exporters for each version. It should be pretty maintainable and will allow for better testing.
[GitHub] spark pull request: [SPARK-4943][SPARK-5251][SQL] Allow table name...
Github user SparkQA commented on the pull request: https://github.com/apache/spark/pull/4062#issuecomment-72786283 [Test build #26715 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/26715/consoleFull) for PR 4062 at commit [`057d23e`](https://github.com/apache/spark/commit/057d23e2223f7dd0a2d5cde5e3b5f0d47df59059). * This patch merges cleanly.
[GitHub] spark pull request: [SPARK-4943][SPARK-5251][SQL] Allow table name...
Github user SparkQA commented on the pull request: https://github.com/apache/spark/pull/4062#issuecomment-72791305 [Test build #26715 has finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/26715/consoleFull) for PR 4062 at commit [`057d23e`](https://github.com/apache/spark/commit/057d23e2223f7dd0a2d5cde5e3b5f0d47df59059). * This patch **passes all tests**. * This patch merges cleanly. * This patch adds no public classes.
[GitHub] spark pull request: [SPARK-5388] Provide a stable application subm...
Github user SparkQA commented on the pull request: https://github.com/apache/spark/pull/4216#issuecomment-72791644 [Test build #26717 has finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/26717/consoleFull) for PR 4216 at commit [`c643f64`](https://github.com/apache/spark/commit/c643f646ce2cab7fa76a69fa64f9c6a5320111d1). * This patch **passes all tests**. * This patch merges cleanly. * This patch adds the following public classes _(experimental)_: * ` case class MasterStateResponse(` * `class LocalSparkCluster(` * ` * (4) the main class for the child` * ` case class BoundPortsResponse(actorPort: Int, webUIPort: Int, restPort: Option[Int])` * ` throw new SubmitRestMissingFieldException("Main class must be set in submit request.")` * `class SubmitRestProtocolException(message: String, cause: Exception = null)` * `class SubmitRestMissingFieldException(message: String) extends SubmitRestProtocolException(message)` * `abstract class SubmitRestProtocolMessage `
[GitHub] spark pull request: [SPARK-5498][SQL]fix bug when query the data w...
Github user chenghao-intel commented on a diff in the pull request: https://github.com/apache/spark/pull/4289#discussion_r24057600 --- Diff: sql/hive/src/main/scala/org/apache/spark/sql/hive/TableReader.scala --- @@ -315,9 +335,23 @@ private[hive] object HadoopTableReader extends HiveInspectors { } } +val partTblObjectInspectorConverter = ObjectInspectorConverters.getConverter( + deserializer.getObjectInspector, soi) + // Map each tuple to a row object iterator.map { value => - val raw = deserializer.deserialize(value) + val raw = convertdeserializer match { --- End diff -- In general, we'd better not do the pattern matching within the iterator; we can do it like `xx match { case xxx => iterator.map { ... } case yyy => iterator.map { ... } }`. For this case, as I showed above, if we pass the converter directly into `fillObject`, I don't think we need the pattern match here.
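The suggestion above (match once, then map) can be illustrated with a self-contained sketch; the `Converter` hierarchy here is hypothetical, not the actual TableReader types:

```scala
sealed trait Converter
case object Identity  extends Converter
case object Uppercase extends Converter

// The match is evaluated once per iterator, not once per record, so each
// branch returns a mapped iterator with no per-record dispatch cost.
def convertAll(converter: Converter, records: Iterator[String]): Iterator[String] =
  converter match {
    case Identity  => records                    // no conversion needed
    case Uppercase => records.map(_.toUpperCase) // conversion applied lazily
  }

val out = convertAll(Uppercase, Iterator("a", "b")).toList
```

Hoisting the match this way preserves the iterator's laziness while avoiding a branch on every row.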
[GitHub] spark pull request: [SPARK-4939] revive offers periodically in Loc...
Github user davies commented on the pull request: https://github.com/apache/spark/pull/4147#issuecomment-72775916 @kayousterhout done
[GitHub] spark pull request: [SPARK-4964] [Streaming] Exactly-once semantic...
Github user SparkQA commented on the pull request: https://github.com/apache/spark/pull/3798#issuecomment-72775833 [Test build #26701 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/26701/consoleFull) for PR 3798 at commit [`8c31855`](https://github.com/apache/spark/commit/8c31855cf6b7327c6b6611e715457ba15bb79355). * This patch merges cleanly.
[GitHub] spark pull request: [SPARK-4964] [Streaming] Exactly-once semantic...
Github user tdas commented on the pull request: https://github.com/apache/spark/pull/3798#issuecomment-72777123 Ohh I meant createStream -> createDirectStream. I would have preferred something like createReceiverLessStream but that's a mouthful. I think direct is something that comes close without being a mouthful. Had not occurred to me until Patrick suggested it. And the underlying assumptions, I confess, are not super concrete. Some things like binary compatibility issues (e.g., do not use Scala traits with implemented methods) are fairly concrete, whereas things about API elegance (e.g. rdd.asInstanceOf[KafkaRDD] vs rdd.asInstanceOf[HasOffsetRanges]) are a little fuzzy, and opinions vary from person to person. Often what seems intuitive to me is not intuitive to someone else, even among the key committers like Patrick, Michael, Matei, etc. We usually argue about this in design docs, get as many eyeballs as possible, and try to reach a consensus. It is indeed a bit fuzzy, but it's all towards making the API that we *think* will be the best for developers.
[GitHub] spark pull request: [SPARK-4795][Core] Redesign the primitive typ...
Github user SparkQA commented on the pull request: https://github.com/apache/spark/pull/3642#issuecomment-72777210 [Test build #26704 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/26704/consoleFull) for PR 3642 at commit [`914b2d6`](https://github.com/apache/spark/commit/914b2d6a65afe19b582b436ed1eb6501d5c16db3). * This patch merges cleanly.
[GitHub] spark pull request: [SPARK-4964] [Streaming] Exactly-once semantic...
Github user SparkQA commented on the pull request: https://github.com/apache/spark/pull/3798#issuecomment-72778614 [Test build #26706 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/26706/consoleFull) for PR 3798 at commit [`59e29f6`](https://github.com/apache/spark/commit/59e29f61cd6a730eeea4e47a5316cbbe47615618). * This patch merges cleanly.
[GitHub] spark pull request: [WIP] [SPARK-5577] Python udf for DataFrame
Github user AmplabJenkins commented on the pull request: https://github.com/apache/spark/pull/4351#issuecomment-72778536 Test FAILed. Refer to this link for build results (access rights to CI server needed): https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/26703/ Test FAILed.
[GitHub] spark pull request: [SPARK-4964] [Streaming] Exactly-once semantic...
Github user koeninger commented on the pull request: https://github.com/apache/spark/pull/3798#issuecomment-72780349 High-level consumers connect to ZK. Simple consumers (which is what this is using) connect to brokers directly instead. See https://cwiki.apache.org/confluence/display/KAFKA/0.8.0+SimpleConsumer+Example I chose to accept either of the two existing means in Kafka of specifying a list of seed brokers, rather than making up yet a third way. On Tue, Feb 3, 2015 at 8:36 PM, Tathagata Das notificati...@github.com wrote: Why did you choose the parameters metadata.broker.list and bootstrap.servers as the required Kafka params? I looked at the Kafka docs, and it says that for consumers, the necessary properties are zookeeper.connect and group.id. And intuitively the application is consuming, so the consumer configs should apply (not group.id, but zookeeper.connect). So our interface should also require zookeeper.connect and not the other two. Isn't it?
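Both of Kafka's existing keys for seed brokers (metadata.broker.list and bootstrap.servers) take the same comma-separated host:port list, so accepting either is mostly a matter of which key to look up. A minimal sketch of that lookup (parseBrokers is a hypothetical helper, not Spark or Kafka API):

```scala
// Hedged sketch: accept either of Kafka's two seed-broker config keys,
// both of which use the "host1:port1,host2:port2" format.
def parseBrokers(kafkaParams: Map[String, String]): Seq[(String, Int)] = {
  val brokerList = kafkaParams
    .get("metadata.broker.list")
    .orElse(kafkaParams.get("bootstrap.servers"))
    .getOrElse(throw new IllegalArgumentException(
      "Must specify metadata.broker.list or bootstrap.servers"))
  brokerList.split(",").toSeq.map { hp =>
    val Array(host, port) = hp.trim.split(":")
    (host, port.toInt)
  }
}

println(parseBrokers(Map("metadata.broker.list" -> "broker1:9092,broker2:9092")))
```

This mirrors the trade-off in the thread: brokers are required because the simple consumer talks to them directly, whereas zookeeper.connect only matters for the high-level consumer.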
[GitHub] spark pull request: [SPARK-5578][SQL][DataFrame] Provide a conveni...
Github user AmplabJenkins commented on the pull request: https://github.com/apache/spark/pull/4345#issuecomment-72781888 Test PASSed. Refer to this link for build results (access rights to CI server needed): https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/26698/ Test PASSed.
[GitHub] spark pull request: [SPARK-2827][GraphX]Add degree distribution op...
Github user maropu commented on the pull request: https://github.com/apache/spark/pull/1767#issuecomment-72782993 What's the status of this patch? If it can be merged into master, I'll refactor the code and add unit tests.
[GitHub] spark pull request: [SPARK-4964] [Streaming] Exactly-once semantic...
Github user AmplabJenkins commented on the pull request: https://github.com/apache/spark/pull/3798#issuecomment-72784748 Test PASSed. Refer to this link for build results (access rights to CI server needed): https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/26706/ Test PASSed.
[GitHub] spark pull request: [SPARK-4964] [Streaming] Exactly-once semantic...
Github user SparkQA commented on the pull request: https://github.com/apache/spark/pull/3798#issuecomment-72784745 [Test build #26706 has finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/26706/consoleFull) for PR 3798 at commit [`59e29f6`](https://github.com/apache/spark/commit/59e29f61cd6a730eeea4e47a5316cbbe47615618). * This patch **passes all tests**. * This patch merges cleanly. * This patch adds the following public classes _(experimental)_: * `class DirectKafkaInputDStreamCheckpointData extends DStreamCheckpointData(this)` * `class KafkaCluster(val kafkaParams: Map[String, String]) extends Serializable` * `case class LeaderOffset(host: String, port: Int, offset: Long)` * `class KafkaRDDPartition(` * `trait HasOffsetRanges`
[GitHub] spark pull request: [SPARK-4707][STREAMING] Reliable Kafka Receive...
Github user tdas commented on the pull request: https://github.com/apache/spark/pull/3655#issuecomment-72784728 @harishreedharan This begs a higher-level question: whether the write-ahead log (which is probably the component most likely to fail) should have its own retries, independent of the receiver retrying.
[GitHub] spark pull request: [SPARK-5388] Provide a stable application subm...
Github user SparkQA commented on the pull request: https://github.com/apache/spark/pull/4216#issuecomment-72786661 [Test build #26717 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/26717/consoleFull) for PR 4216 at commit [`c643f64`](https://github.com/apache/spark/commit/c643f646ce2cab7fa76a69fa64f9c6a5320111d1). * This patch merges cleanly.
[GitHub] spark pull request: [SPARK-4964] [Streaming] Exactly-once semantic...
Github user tdas commented on the pull request: https://github.com/apache/spark/pull/3798#issuecomment-72787965 I think the simplest solution is to assign zookeeper.connect. But you are assigning it in KafkaCluster lines 338 - 345. So why is this warning being thrown?
[GitHub] spark pull request: [SPARK-4795][Core] Redesign the primitive typ...
Github user rxin commented on the pull request: https://github.com/apache/spark/pull/3642#issuecomment-72788058 Ok, I'm going to merge this. Thanks for working on it.
[GitHub] spark pull request: [SPARK-2945][YARN][Doc]add doc for spark.execu...
Github user SparkQA commented on the pull request: https://github.com/apache/spark/pull/4350#issuecomment-72789745 [Test build #26713 has finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/26713/consoleFull) for PR 4350 at commit [`4c3913a`](https://github.com/apache/spark/commit/4c3913add23b39e4c5a5120d8a56917f972e1b4b). * This patch **fails PySpark unit tests**. * This patch merges cleanly. * This patch adds no public classes.
[GitHub] spark pull request: [SPARK-2945][YARN][Doc]add doc for spark.execu...
Github user AmplabJenkins commented on the pull request: https://github.com/apache/spark/pull/4350#issuecomment-72789756 Test FAILed. Refer to this link for build results (access rights to CI server needed): https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/26713/ Test FAILed.
[GitHub] spark pull request: [SPARK-4707][STREAMING] Reliable Kafka Receive...
Github user AmplabJenkins commented on the pull request: https://github.com/apache/spark/pull/3655#issuecomment-72789875 Test PASSed. Refer to this link for build results (access rights to CI server needed): https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/26710/ Test PASSed.
[GitHub] spark pull request: [SPARK-4964] [Streaming] Exactly-once semantic...
Github user jerryshao commented on the pull request: https://github.com/apache/spark/pull/3798#issuecomment-72789850 Hi @tdas, should we add an example to show users how to use this new Kafka API correctly?
[GitHub] spark pull request: [SQL] Use HiveContext's sessionState in HiveMe...
Github user SparkQA commented on the pull request: https://github.com/apache/spark/pull/4355#issuecomment-72790535 [Test build #26724 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/26724/consoleFull) for PR 4355 at commit [`84a29e5`](https://github.com/apache/spark/commit/84a29e51b657f7b265f166a6ec25600c944cc440). * This patch merges cleanly.
[GitHub] spark pull request: [SPARK-4964] [Streaming] Exactly-once semantic...
Github user tdas commented on the pull request: https://github.com/apache/spark/pull/3798#issuecomment-72791220 Holy crap! Don't bother about this at all. This can wait. I hope everything is okay. Take care and all the best! On Feb 3, 2015 8:45 PM, Cody Koeninger notificati...@github.com wrote: The warning is for metadata.broker.list, since it's not expected by the existing ConsumerConfig (it's used by other config classes). Couldn't get subclassing to work; the VerifiableProperties class it uses is very dependent on order of operations during construction. I think the simplest thing is a class that is constructed using kafkaParams and uses the static defaults from the ConsumerConfig object. I'm currently waiting in an ER with my child with a 105 fever, so won't be getting to it for a few hours to say the least. On Feb 3, 2015 10:15 PM, Tathagata Das notificati...@github.com wrote: I think the simplest solution is to assign zookeeper.connect. But you are assigning it in KafkaCluster lines 338 - 345. So why is this warning being thrown?
[GitHub] spark pull request: [SPARK-4943][SPARK-5251][SQL] Allow table name...
Github user AmplabJenkins commented on the pull request: https://github.com/apache/spark/pull/4062#issuecomment-72791312 Test PASSed. Refer to this link for build results (access rights to CI server needed): https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/26715/ Test PASSed.
[GitHub] spark pull request: [SPARK-5583][SQL][WIP] Support unique join in ...
Github user rxin commented on the pull request: https://github.com/apache/spark/pull/4354#issuecomment-72792534 Do you mind adding more inline comments? My worry is just complexity. If nobody uses this, it's going to be a bunch of code that exists just for the sake of supporting a thing in Hive. Do any other database systems support this unique join syntax (or something similar)?
[GitHub] spark pull request: [SPARK-5579][SQL][DataFrame] Support for proje...
Github user davies commented on the pull request: https://github.com/apache/spark/pull/4348#issuecomment-72794234 select() and filter() in Python do not support this yet.
[GitHub] spark pull request: [SPARK-5579][SQL][DataFrame] Support for proje...
Github user davies commented on a diff in the pull request: https://github.com/apache/spark/pull/4348#discussion_r24063817 --- Diff: sql/core/src/main/scala/org/apache/spark/sql/DataFrameImpl.scala --- @@ -179,10 +179,20 @@ private[sql] class DataFrameImpl protected[sql]( select((col +: cols).map(Column(_)) :_*) } + override def selectExpr(exprs: String*): DataFrame = { --- End diff -- I think this one could be merged into select(); a column is also a valid expression.
[GitHub] spark pull request: [SPARK-4969][STREAMING][PYTHON] Add binaryReco...
Github user tdas commented on a diff in the pull request: https://github.com/apache/spark/pull/3803#discussion_r24064149 --- Diff: core/src/main/scala/org/apache/spark/SparkContext.scala --- @@ -671,7 +674,11 @@ class SparkContext(config: SparkConf) extends Logging with ExecutorAllocationCli classOf[LongWritable], classOf[BytesWritable], conf=conf) -val data = br.map { case (k, v) => v.getBytes } +val data = br.map { case (k, v) => + val bytes = v.getBytes + assert(bytes.length == recordLength, "Byte array does not have correct length") + bytes --- End diff -- I meant: should the user be told that the system can throw an error when the records are not of the expected size? I don't have any strong feeling on this, just wondering.
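The assert in the diff above amounts to a per-record length check. A standalone sketch of that check (checkRecord is an illustrative helper, not the actual Spark code; it uses require so callers get the failure even with assertions disabled):

```scala
// Sketch of the fixed-length record check discussed above: each byte array
// read from the binary input must match the declared recordLength.
def checkRecord(bytes: Array[Byte], recordLength: Int): Array[Byte] = {
  require(bytes.length == recordLength,
    s"Byte array does not have correct length: got ${bytes.length}, expected $recordLength")
  bytes
}

// A well-sized record passes through unchanged; a wrong-sized one throws.
println(checkRecord(Array[Byte](1, 2, 3), 3).length)
```

Whether to surface this failure mode in the user-facing docs is exactly the open question in the comment above.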
[GitHub] spark pull request: [SPARK-2945][YARN][Doc]add doc for spark.execu...
Github user AmplabJenkins commented on the pull request: https://github.com/apache/spark/pull/4350#issuecomment-72778202 Test FAILed. Refer to this link for build results (access rights to CI server needed): https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/26699/ Test FAILed.
[GitHub] spark pull request: [SPARK-2945][YARN][Doc]add doc for spark.execu...
Github user SparkQA commented on the pull request: https://github.com/apache/spark/pull/4350#issuecomment-72778196 [Test build #26699 has finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/26699/consoleFull) for PR 4350 at commit [`4c3913a`](https://github.com/apache/spark/commit/4c3913add23b39e4c5a5120d8a56917f972e1b4b). * This patch **fails Spark unit tests**. * This patch merges cleanly. * This patch adds no public classes.
[GitHub] spark pull request: [SPARK-4964] [Streaming] Exactly-once semantic...
Github user koeninger commented on the pull request: https://github.com/apache/spark/pull/3798#issuecomment-72779615 Yeah, there's a weird distinction in Kafka between simple consumers and high-level consumers: they have a lot of common configuration parameters, but one of them talks directly to brokers and the other goes through ZK. I'll see if I can make a private subclass of ConsumerConfig to shut that warning up. On Tue, Feb 3, 2015 at 8:28 PM, Tathagata Das notificati...@github.com wrote: Hey Cody, I was trying it and I found an odd behavior. It was printing this repeatedly. 15/02/03 18:22:08 WARN VerifiableProperties: Property metadata.broker.list is not valid I was using this code. val kafkaParams = Map[String, String]("metadata.broker.list" -> brokerList) val lines = KafkaUtils.createNewStream[String, String, StringDecoder, StringDecoder]( ssc, kafkaParams, topicsSet) I chose metadata.broker.list from the code in KafkaCluster, because without that I was getting an exception from the KafkaCluster.
[GitHub] spark pull request: [SPARK-5470][Core]use defaultClassLoader to lo...
Github user lianhuiwang commented on the pull request: https://github.com/apache/spark/pull/4258#issuecomment-72780420 Jenkins, retest this please.
[GitHub] spark pull request: [SPARK-5379][Streaming] Add awaitTerminationOr...
Github user tdas commented on the pull request: https://github.com/apache/spark/pull/4171#issuecomment-72785026 Also, could you update the Python API as well?
[GitHub] spark pull request: [SPARK-5379][Streaming] Add awaitTerminationOr...
Github user tdas commented on the pull request: https://github.com/apache/spark/pull/4171#issuecomment-72784927 Please add unit tests for this behavior! It should be in StreamingContextSuite.
[GitHub] spark pull request: [SPARK-5582] [history] Ignore empty log direct...
GitHub user vanzin opened a pull request: https://github.com/apache/spark/pull/4352 [SPARK-5582] [history] Ignore empty log directories. Empty log directories are not useful at the moment, but if one ends up showing in the log root, it breaks the code that checks for log directories. You can merge this pull request into a Git repository by running: $ git pull https://github.com/vanzin/spark SPARK-5582 Alternatively you can review and apply these changes as the patch at: https://github.com/apache/spark/pull/4352.patch To close this pull request, make a commit to your master/trunk branch with (at least) the following in the commit message: This closes #4352 commit 1a6a3d45c64276ab6cb14341e57cbc8d397a1afc Author: Marcelo Vanzin van...@cloudera.com Date: 2015-02-04T03:18:26Z [SPARK-5582] Fix exception when looking at empty directories. Empty log directories are not useful at the moment, but if one ends up showing in the log root, it breaks the code that checks for log directories.
[GitHub] spark pull request: [SPARK-5388] Provide a stable application subm...
Github user SparkQA commented on the pull request: https://github.com/apache/spark/pull/4216#issuecomment-72786281 [Test build #26716 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/26716/consoleFull) for PR 4216 at commit [`792e112`](https://github.com/apache/spark/commit/792e1121dff43e69c84fe9cfff4fc1be61ba2af5). * This patch merges cleanly.
[GitHub] spark pull request: [SPARK-5426][SQL] Add SparkSQL Java API helper...
Github user SparkQA commented on the pull request: https://github.com/apache/spark/pull/4243#issuecomment-72787085 [Test build #26718 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/26718/consoleFull) for PR 4243 at commit [`2390fba`](https://github.com/apache/spark/commit/2390fba337e80eb63fa25b0c4fa6adc9945b6d2d). * This patch merges cleanly.
[GitHub] spark pull request: [SPARK-5579][SQL][DataFrame] Support for proje...
Github user SparkQA commented on the pull request: https://github.com/apache/spark/pull/4348#issuecomment-72788843 [Test build #26723 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/26723/consoleFull) for PR 4348 at commit [`2baeef2`](https://github.com/apache/spark/commit/2baeef2f4035bad7aa829cf52fc338245f52fafd). * This patch merges cleanly.
[GitHub] spark pull request: [SPARK-4964] [Streaming] Exactly-once semantic...
Github user koeninger commented on the pull request: https://github.com/apache/spark/pull/3798#issuecomment-72790044 The warning is for metadata.broker.list, since it's not expected by the existing ConsumerConfig (it's used by other config classes). Couldn't get subclassing to work; the VerifiableProperties class it uses is very dependent on order of operations during construction. I think the simplest thing is a class that is constructed using kafkaParams and uses the static defaults from the ConsumerConfig object. I'm currently waiting in an ER with my child with a 105 fever, so won't be getting to it for a few hours to say the least. On Feb 3, 2015 10:15 PM, Tathagata Das notificati...@github.com wrote: I think the simplest solution is to assign zookeeper.connect. But you are assigning it in KafkaCluster lines 338 - 345. So why is this warning being thrown?
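The workaround described above, a plain class built from kafkaParams that falls back to static defaults instead of subclassing ConsumerConfig, might look roughly like this sketch (class names and default values are illustrative, not Kafka's actual API):

```scala
// Hedged sketch of a config class constructed from kafkaParams that falls
// back to static defaults, avoiding the order-sensitive VerifiableProperties
// machinery inside Kafka's ConsumerConfig constructor.
object ConsumerDefaults {
  // Illustrative defaults; the real ones live on Kafka's ConsumerConfig object.
  val defaults: Map[String, String] = Map(
    "socket.timeout.ms" -> "30000",
    "fetch.message.max.bytes" -> "1048576")
}

class SimpleConsumerConfig(kafkaParams: Map[String, String]) {
  // User-supplied params win; otherwise fall back to the static defaults.
  def get(key: String): String =
    kafkaParams.getOrElse(key,
      ConsumerDefaults.defaults.getOrElse(key,
        throw new NoSuchElementException(key)))
}

val config = new SimpleConsumerConfig(Map("socket.timeout.ms" -> "10000"))
println(config.get("socket.timeout.ms"))       // user-supplied override
println(config.get("fetch.message.max.bytes")) // falls back to default
```

The point of the design is that unknown keys like metadata.broker.list simply ride along in kafkaParams without triggering validation warnings.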
[GitHub] spark pull request: [SPARK-4879] [WIP] Use driver to coordinate Ha...
Github user SparkQA commented on the pull request: https://github.com/apache/spark/pull/4066#issuecomment-72790079 [Test build #26712 has finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/26712/consoleFull) for PR 4066 at commit [`3969f5f`](https://github.com/apache/spark/commit/3969f5f27f85e1092c8271d575e23cc834ca9ffb). * This patch **passes all tests**. * This patch merges cleanly. * This patch adds the following public classes _(experimental)_: * `case class TaskCommitDenied(` * `class CommitDeniedException(` * ` class OutputCommitCoordinatorActor(outputCommitCoordinator: OutputCommitCoordinator)`
[GitHub] spark pull request: [SPARK-4879] [WIP] Use driver to coordinate Ha...
Github user AmplabJenkins commented on the pull request: https://github.com/apache/spark/pull/4066#issuecomment-72790083 Test PASSed. Refer to this link for build results (access rights to CI server needed): https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/26712/
[GitHub] spark pull request: [SPARK-5582] [history] Ignore empty log direct...
Github user SparkQA commented on the pull request: https://github.com/apache/spark/pull/4352#issuecomment-72790111 [Test build #26711 has finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/26711/consoleFull) for PR 4352 at commit [`1a6a3d4`](https://github.com/apache/spark/commit/1a6a3d45c64276ab6cb14341e57cbc8d397a1afc). * This patch **passes all tests**. * This patch merges cleanly. * This patch adds no public classes.
[GitHub] spark pull request: [SPARK-5484] Checkpoint every 25 iterations in...
Github user maropu commented on the pull request: https://github.com/apache/spark/pull/4273#issuecomment-72790567 How about adding a new configuration, e.g., spark.graphx.pregel.checkpoint.interval in SparkConf?
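The interval-based checkpointing under discussion can be sketched, outside of GraphX, as a plain loop that checkpoints every N iterations. The names `checkpoint_interval` and `checkpoint` below are illustrative stand-ins for the Pregel/SparkConf machinery, not the actual Spark API.

```python
def run_iterations(num_iterations, checkpoint_interval, checkpoint):
    """Run num_iterations supersteps, calling checkpoint(i) every
    checkpoint_interval iterations (0 disables checkpointing).

    checkpoint is a callable standing in for RDD/Graph checkpointing.
    Returns the list of iterations at which a checkpoint was taken.
    """
    taken = []
    for i in range(1, num_iterations + 1):
        # ... one Pregel superstep would happen here ...
        if checkpoint_interval > 0 and i % checkpoint_interval == 0:
            checkpoint(i)
            taken.append(i)
    return taken
```

With an interval of 25 over 100 iterations (the setting in the PR title), checkpoints fall at iterations 25, 50, 75, and 100; making the interval configurable only changes the second argument.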
[GitHub] spark pull request: [SPARK-5579][SQL][DataFrame] Support for proje...
Github user SparkQA commented on the pull request: https://github.com/apache/spark/pull/4348#issuecomment-72796410 [Test build #26723 has finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/26723/consoleFull) for PR 4348 at commit [`2baeef2`](https://github.com/apache/spark/commit/2baeef2f4035bad7aa829cf52fc338245f52fafd). * This patch **passes all tests**. * This patch merges cleanly. * This patch adds no public classes.
[GitHub] spark pull request: [SPARK-5579][SQL][DataFrame] Support for proje...
Github user AmplabJenkins commented on the pull request: https://github.com/apache/spark/pull/4348#issuecomment-72796417 Test PASSed. Refer to this link for build results (access rights to CI server needed): https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/26723/
[GitHub] spark pull request: [WIP] [SPARK-5577] Python udf for DataFrame
GitHub user davies opened a pull request: https://github.com/apache/spark/pull/4351 [WIP] [SPARK-5577] Python udf for DataFrame You can merge this pull request into a Git repository by running: $ git pull https://github.com/davies/spark python_udf Alternatively you can review and apply these changes as the patch at: https://github.com/apache/spark/pull/4351.patch To close this pull request, make a commit to your master/trunk branch with (at least) the following in the commit message: This closes #4351

Commits (all by Davies Liu, dav...@databricks.com):

* 3ab26614b5278edce6e8571e5c51fe0b67e3124e (2015-02-03T08:08:00Z) add more tests for DataFrame
* 6040ba73431cc22d8d777555db6b35241275bdce (2015-02-03T09:09:36Z) fix docs
* 9ab78b4262961deafe0256c8c28d2911a4c07b0a (2015-02-03T09:10:54Z) Merge branch 'master' of github.com:apache/spark into fix_df (conflicts: sql/core/src/main/scala/org/apache/spark/sql/Column.scala)
* 78ebcfa6ba750e081f6b5c7b07c8d04f32c2d4d6 (2015-02-03T09:12:02Z) add sql_test.py in run_tests
* 35ccb9f5721266a3a25df7e5f6d4b2c98f5f18d5 (2015-02-03T09:23:16Z) fix build
* 8dd19a912e8595dddeec56fea964ab40b5b9f738 (2015-02-03T18:00:04Z) fix tests in python 2.6
* c052f6fe0aaaf688a8f08e0fe04abdeea8933448 (2015-02-03T18:44:36Z) Merge branch 'master' of github.com:apache/spark into fix_df
* 83c92fedc4f69dfff909d61899c906cea357498f (2015-02-03T20:21:08Z) address comments
* 467332cacca8754f04271a70bbaf15c8f2afd5c6 (2015-02-03T20:34:16Z) support string in cast()
* dd9919f115d3b8f4b66d213c4a57bc832ed8ed57 (2015-02-03T22:17:09Z) fix tests
* 1e4766485b20629a9cee12fc1c4751fc427cc569 (2015-02-04T01:24:15Z) Merge branch 'master' of github.com:apache/spark into python_udf
[GitHub] spark pull request: [SPARK-4879] [WIP] Use driver to coordinate Ha...
Github user SparkQA commented on the pull request: https://github.com/apache/spark/pull/4066#issuecomment-72780049 [Test build #26700 has finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/26700/consoleFull) for PR 4066 at commit [`97da5fe`](https://github.com/apache/spark/commit/97da5feb6fe49255afaac1dc9d5db1edf8c1ff42). * This patch **passes all tests**. * This patch merges cleanly. * This patch adds the following public classes _(experimental)_: * `case class TaskCommitDenied(` * `class CommitDeniedException(` * ` class OutputCommitCoordinatorActor(outputCommitCoordinator: OutputCommitCoordinator)`
[GitHub] spark pull request: [SPARK-4939] revive offers periodically in Loc...
Github user AmplabJenkins commented on the pull request: https://github.com/apache/spark/pull/4147#issuecomment-72782830 Test PASSed. Refer to this link for build results (access rights to CI server needed): https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/26702/
[GitHub] spark pull request: [SPARK-4939] revive offers periodically in Loc...
Github user SparkQA commented on the pull request: https://github.com/apache/spark/pull/4147#issuecomment-72782813 [Test build #26702 has finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/26702/consoleFull) for PR 4147 at commit [`2acdf9d`](https://github.com/apache/spark/commit/2acdf9d1eb6034581eb33ef3df1c8fc652bf325a). * This patch **passes all tests**. * This patch merges cleanly. * This patch adds no public classes.
[GitHub] spark pull request: [SPARK-4795][Core] Redesign the primitive typ...
Github user SparkQA commented on the pull request: https://github.com/apache/spark/pull/3642#issuecomment-72783410 [Test build #26704 has finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/26704/consoleFull) for PR 3642 at commit [`914b2d6`](https://github.com/apache/spark/commit/914b2d6a65afe19b582b436ed1eb6501d5c16db3). * This patch **passes all tests**. * This patch merges cleanly. * This patch adds no public classes.
[GitHub] spark pull request: #SPARK-2808 update kafka to version 0.8.2
Github user tdas commented on the pull request: https://github.com/apache/spark/pull/3631#issuecomment-72784310 Aah cool. However, 0.8.1 and 0.8.2 have pretty big changes between them, so let's merge this for the next release. We are already doing a lot of experimental Kafka stuff in this release (the feature merge window has closed).
[GitHub] spark pull request: [SPARK-3039] [BUILD] Spark assembly for new ha...
Github user medale commented on the pull request: https://github.com/apache/spark/pull/4315#issuecomment-72785613 The problem was that the Spark project hive-exec 0.13.1a depends on

```
<dependency>
  <groupId>org.apache.avro</groupId>
  <artifactId>avro-mapred</artifactId>
  <version>${avro.version}</version>
</dependency>
```

(see http://central.maven.org/maven2/org/spark-project/hive/hive-exec/0.13.1a/hive-exec-0.13.1a.pom). Its parent defines avro.version as 1.7.5 (`<avro.version>1.7.5</avro.version>`, see http://central.maven.org/maven2/org/spark-project/hive/hive/0.13.1a/hive-0.13.1a.pom). The only places hive-exec is used as a dependency (per find . -name pom.xml | xargs grep hive-exec) are the main pom.xml (where we define it in the dependencyManagement section) and sql/hive/pom.xml (in actual dependencies). In sql/hive/pom.xml we also explicitly have a dependency on:

```
<dependency>
  <groupId>org.apache.avro</groupId>
  <artifactId>avro-mapred</artifactId>
  <classifier>${avro.mapred.classifier}</classifier>
</dependency>
```

Therefore if we choose a profile that does not define avro.mapred.classifier, this field is left empty (see the main pom.xml: `<avro.mapred.classifier></avro.mapred.classifier>`), and we pull avro-mapred-1.7.6.jar (exactly the same as avro-mapred-1.7.6-hadoop1.jar), as it should be. If we choose a profile like hadoop-2.4, we set it to hadoop2 and pull avro-mapred-1.7.6-hadoop2.jar, as it should be:

```
<profile>
  <id>hadoop-2.4</id>
  <properties>
    <hadoop.version>2.4.0</hadoop.version>
    <protobuf.version>2.5.0</protobuf.version>
    <jets3t.version>0.9.0</jets3t.version>
    <hbase.version>0.98.7-hadoop2</hbase.version>
    <commons.math3.version>3.1.1</commons.math3.version>
    <avro.mapred.classifier>hadoop2</avro.mapred.classifier>
  </properties>
</profile>
```

However, with changes in 1.3.0-SNAPSHOT the avro-mapred scope is newly defined as:

```
<dependency>
  <groupId>org.apache.avro</groupId>
  <artifactId>avro-mapred</artifactId>
  <version>${avro.version}</version>
  <classifier>${avro.mapred.classifier}</classifier>
  <scope>${hive.deps.scope}</scope>
</dependency>
```

That scope is `<hive.deps.scope>compile</hive.deps.scope>` in the main pom.xml, but `<hive.deps.scope>provided</hive.deps.scope>` in assembly/pom.xml and examples/pom.xml. Same for hive-exec. So competing avro-mapred classes will no longer be included in the spark-assembly.jar. They are not included on the Hadoop classpath (only Avro), so they need to be supplied by the job. That will be new for Avro users. But excluding the hive-exec dependency and explicitly specifying avro-mapred to be only 1.7.6 with the correct classifier will be necessary if anything like maven enforcer is ever run.
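The classifier behavior described above reduces to a small property-interpolation rule: an empty avro.mapred.classifier yields the plain jar, while the hadoop-2.4 profile's hadoop2 value yields the -hadoop2 jar. A hypothetical helper (not Maven itself) makes the rule concrete:

```python
def avro_mapred_jar(version, properties):
    """Return the avro-mapred jar name that would be resolved, given the
    effective Maven properties. A hypothetical stand-in for Maven's own
    ${...} interpolation of the <classifier> element."""
    classifier = properties.get("avro.mapred.classifier", "")
    if classifier:
        return "avro-mapred-%s-%s.jar" % (version, classifier)
    # Empty or missing classifier: the plain (hadoop1-equivalent) jar.
    return "avro-mapred-%s.jar" % version
```

With no profile active, avro_mapred_jar("1.7.6", {}) gives avro-mapred-1.7.6.jar; with the hadoop-2.4 profile's properties it gives avro-mapred-1.7.6-hadoop2.jar, matching the cases in the comment.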
[GitHub] spark pull request: [SPARK-4969][STREAMING][PYTHON] Add binaryReco...
Github user tdas commented on a diff in the pull request: https://github.com/apache/spark/pull/3803#discussion_r24061014 --- Diff: core/src/main/scala/org/apache/spark/SparkContext.scala ---

```
@@ -671,7 +674,11 @@ class SparkContext(config: SparkConf) extends Logging with ExecutorAllocationCli
       classOf[LongWritable], classOf[BytesWritable], conf = conf)
-    val data = br.map { case (k, v) => v.getBytes }
+    val data = br.map { case (k, v) =>
+      val bytes = v.getBytes
+      assert(bytes.length == recordLength, "Byte array does not have correct length")
+      bytes
```

--- End diff -- nit: Is this something that the user should be made aware of in the docs?
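The check under discussion — that each binary record actually has the configured length — can be sketched outside Spark as a plain fixed-record-length reader. This is an illustrative helper, not the SparkContext.binaryRecords API itself:

```python
def binary_records(data, record_length):
    """Split a byte string into fixed-length records, asserting that
    nothing is left over (mirroring the length check in the diff)."""
    assert len(data) % record_length == 0, \
        "Byte array does not have correct length"
    return [data[i:i + record_length]
            for i in range(0, len(data), record_length)]
```

A 6-byte input with record_length=2 yields three records; a 5-byte input trips the assertion, which is exactly the condition the review comment suggests surfacing in the docs.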
[GitHub] spark pull request: [SPARK-5583][SQL][WIP] Support unique join in ...
Github user SparkQA commented on the pull request: https://github.com/apache/spark/pull/4354#issuecomment-72788451 [Test build #26721 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/26721/consoleFull) for PR 4354 at commit [`015fe2f`](https://github.com/apache/spark/commit/015fe2f7fede76ef25102f1dc928ee5c57c6d167). * This patch **does not merge cleanly**.
[GitHub] spark pull request: [SPARK-5583][SQL][WIP] Support unique join in ...
Github user SparkQA commented on the pull request: https://github.com/apache/spark/pull/4354#issuecomment-72792001 [Test build #26722 has finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/26722/consoleFull) for PR 4354 at commit [`dd34ebf`](https://github.com/apache/spark/commit/dd34ebf60295046cb1bc82862b6c0a86ce2f8837). * This patch **fails Spark unit tests**. * This patch merges cleanly. * This patch adds no public classes.
[GitHub] spark pull request: [SPARK-4969][STREAMING][PYTHON] Add binaryReco...
Github user freeman-lab commented on a diff in the pull request: https://github.com/apache/spark/pull/3803#discussion_r24063184 --- Diff: streaming/src/main/scala/org/apache/spark/streaming/api/java/JavaStreamingContext.scala ---

```
@@ -210,6 +211,20 @@ class JavaStreamingContext(val ssc: StreamingContext) extends Closeable {
   }

   /**
    * :: Experimental ::
    *
    * Create an input stream that monitors a Hadoop-compatible filesystem
    * for new files and reads them as flat binary files with fixed record lengths,
    * yielding byte arrays
    * @param directory HDFS directory to monitor for new files
    * @param recordLength The length at which to split the records
    */
```

--- End diff -- Thanks, added!
[GitHub] spark pull request: [SPARK-5426][SQL] Add SparkSQL Java API helper...
Github user SparkQA commented on the pull request: https://github.com/apache/spark/pull/4243#issuecomment-72792070 [Test build #26718 has finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/26718/consoleFull) for PR 4243 at commit [`2390fba`](https://github.com/apache/spark/commit/2390fba337e80eb63fa25b0c4fa6adc9945b6d2d). * This patch **passes all tests**. * This patch merges cleanly. * This patch adds no public classes.
[GitHub] spark pull request: [SPARK-5426][SQL] Add SparkSQL Java API helper...
Github user AmplabJenkins commented on the pull request: https://github.com/apache/spark/pull/4243#issuecomment-72792072 Test PASSed. Refer to this link for build results (access rights to CI server needed): https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/26718/
[GitHub] spark pull request: [SPARK-5583][SQL][WIP] Support unique join in ...
Github user AmplabJenkins commented on the pull request: https://github.com/apache/spark/pull/4354#issuecomment-72792006 Test FAILed. Refer to this link for build results (access rights to CI server needed): https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/26722/
[GitHub] spark pull request: [SPARK-5583][SQL][WIP] Support unique join in ...
Github user scwf commented on the pull request: https://github.com/apache/spark/pull/4354#issuecomment-72792978 It seems this is Hive-specific syntax, as far as I know...
[GitHub] spark pull request: [SPARK-4969][STREAMING][PYTHON] Add binaryReco...
Github user freeman-lab commented on a diff in the pull request: https://github.com/apache/spark/pull/3803#discussion_r24063473 --- Diff: core/src/main/scala/org/apache/spark/SparkContext.scala ---

```
@@ -671,7 +674,11 @@ class SparkContext(config: SparkConf) extends Logging with ExecutorAllocationCli
       classOf[LongWritable], classOf[BytesWritable], conf = conf)
-    val data = br.map { case (k, v) => v.getBytes }
+    val data = br.map { case (k, v) =>
+      val bytes = v.getBytes
+      assert(bytes.length == recordLength, "Byte array does not have correct length")
+      bytes
```

--- End diff -- Do you mean something more than these notes we're adding? I just clarified the notes a bit to make it obvious the check is on the byte array.
[GitHub] spark pull request: [WIP] [SPARK-5577] Python udf for DataFrame
Github user davies commented on a diff in the pull request: https://github.com/apache/spark/pull/4351#discussion_r24063741 --- Diff: python/pyspark/sql.py ---

```
@@ -2263,18 +2263,6 @@ def subtract(self, other):
         return DataFrame(getattr(self._jdf, "except")(other._jdf), self.sql_ctx)

-    def sample(self, withReplacement, fraction, seed=None):
```

--- End diff -- there are two sample().
[GitHub] spark pull request: [SPARK-5579][SQL][DataFrame] Support for proje...
Github user davies commented on a diff in the pull request: https://github.com/apache/spark/pull/4348#discussion_r24063992 --- Diff: sql/core/src/main/scala/org/apache/spark/sql/DataFrameImpl.scala ---

```
@@ -179,10 +179,20 @@ private[sql] class DataFrameImpl protected[sql](
     select((col +: cols).map(Column(_)) :_*)
   }

+  override def selectExpr(exprs: String*): DataFrame = {
```

--- End diff -- It should work in these cases with this implementation.

```
select('a', '`the name`', 'a + 1', 'min(b) * 3')
```
[GitHub] spark pull request: [SPARK-5379][Streaming] Add awaitTerminationOr...
Github user SparkQA commented on the pull request: https://github.com/apache/spark/pull/4171#issuecomment-72796748 [Test build #26726 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/26726/consoleFull) for PR 4171 at commit [`c9e660b`](https://github.com/apache/spark/commit/c9e660b4c8e4547a16c00364fa7baa2a40536345). * This patch merges cleanly.
[GitHub] spark pull request: [SPARK-5379][Streaming] Add awaitTerminationOr...
Github user tdas commented on the pull request: https://github.com/apache/spark/pull/4171#issuecomment-72797695 LGTM, will merge when tests pass.
[GitHub] spark pull request: [SPARK-5574] use given name prefix in dir
Github user pwendell commented on the pull request: https://github.com/apache/spark/pull/4344#issuecomment-72798456 Jenkins, test this please.
[GitHub] spark pull request: [SPARK-5498][SQL]fix bug when query the data w...
Github user chenghao-intel commented on a diff in the pull request: https://github.com/apache/spark/pull/4289#discussion_r24057968 --- Diff: sql/hive/v0.12.0/src/main/scala/org/apache/spark/sql/hive/Shim12.scala ---

```
@@ -242,6 +242,11 @@ private[hive] object HiveShim {
     }
   }

+  // make getConvertedOI compatible between 0.12.0 and 0.13.1
+  def getConvertedOI(inputOI: ObjectInspector, outputOI: ObjectInspector): ObjectInspector = {
+    ObjectInspectorConverters.getConvertedOI(inputOI, outputOI, new java.lang.Boolean(true))
```

--- End diff -- Just `true`?
[GitHub] spark pull request: [SPARK-4879] [WIP] Use driver to coordinate Ha...
Github user AmplabJenkins commented on the pull request: https://github.com/apache/spark/pull/4066#issuecomment-72777921 Test PASSed. Refer to this link for build results (access rights to CI server needed): https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/26695/
[GitHub] spark pull request: [SPARK-4939] revive offers periodically in Loc...
Github user kayousterhout commented on the pull request: https://github.com/apache/spark/pull/4147#issuecomment-72777970 LGTM; I'll merge this as soon as tests pass. @tdas @pwendell this is fine with me to merge into 1.2 (although I realize it won't make it until 1.2.2); does that seem ok with you?
[GitHub] spark pull request: [SPARK-4879] [WIP] Use driver to coordinate Ha...
Github user SparkQA commented on the pull request: https://github.com/apache/spark/pull/4066#issuecomment-72777913

[Test build #26695 has finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/26695/consoleFull) for PR 4066 at commit [`97da5fe`](https://github.com/apache/spark/commit/97da5feb6fe49255afaac1dc9d5db1edf8c1ff42).

* This patch **passes all tests**.
* This patch merges cleanly.
* This patch adds the following public classes _(experimental)_:
  * `case class TaskCommitDenied(`
  * `class CommitDeniedException(`
  * `class OutputCommitCoordinatorActor(outputCommitCoordinator: OutputCommitCoordinator)`
[GitHub] spark pull request: [SPARK-5278][SQL] complete the check of ambigu...
Github user AmplabJenkins commented on the pull request: https://github.com/apache/spark/pull/4068#issuecomment-72778669 Test PASSed. Refer to this link for build results (access rights to CI server needed): https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/26697/ Test PASSed.
[GitHub] spark pull request: [FIX][MLLIB] fix seed handling in Python GMM
Github user SparkQA commented on the pull request: https://github.com/apache/spark/pull/4349#issuecomment-72778764

[Test build #26696 has finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/26696/consoleFull) for PR 4349 at commit [`3be5926`](https://github.com/apache/spark/commit/3be592612f9e4b5b6a1fbc2bf84ac006fa223bfb).

* This patch **passes all tests**.
* This patch merges cleanly.
* This patch adds no public classes.
[GitHub] spark pull request: [FIX][MLLIB] fix seed handling in Python GMM
Github user AmplabJenkins commented on the pull request: https://github.com/apache/spark/pull/4349#issuecomment-72778768 Test PASSed. Refer to this link for build results (access rights to CI server needed): https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/26696/ Test PASSed.
[GitHub] spark pull request: [SPARK-5278][SQL] complete the check of ambigu...
Github user SparkQA commented on the pull request: https://github.com/apache/spark/pull/4068#issuecomment-72778662

[Test build #26697 has finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/26697/consoleFull) for PR 4068 at commit [`340223d`](https://github.com/apache/spark/commit/340223d44fce76096e9952cc8aab9eb46ff9d1f8).

* This patch **passes all tests**.
* This patch merges cleanly.
* This patch adds the following public classes _(experimental)_:
  * `class Dsl(object):`
  * `class ExamplePointUDT(UserDefinedType):`
  * `class SQLTests(ReusedPySparkTestCase):`
  * `case class UnresolvedGetField(child: Expression, fieldName: String) extends UnaryExpression`
  * `case class GetField(child: Expression, field: StructField, ordinal: Int) extends UnaryExpression`
[GitHub] spark pull request: [SPARK-4879] [WIP] Use driver to coordinate Ha...
Github user AmplabJenkins commented on the pull request: https://github.com/apache/spark/pull/4066#issuecomment-72780055 Test PASSed. Refer to this link for build results (access rights to CI server needed): https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/26700/ Test PASSed.