[GitHub] spark issue #14365: [SPARK-16628][SQL] Translate file-based relation schema ...

2016-07-28 Thread AmplabJenkins
Github user AmplabJenkins commented on the issue:

https://github.com/apache/spark/pull/14365
  
Test PASSed.
Refer to this link for build results (access rights to CI server needed): 
https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/62993/
Test PASSed.


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastruct...@apache.org or file a JIRA ticket
with INFRA.
---

-
To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org
For additional commands, e-mail: reviews-h...@spark.apache.org



[GitHub] spark issue #14365: [SPARK-16628][SQL] Translate file-based relation schema ...

2016-07-28 Thread AmplabJenkins
Github user AmplabJenkins commented on the issue:

https://github.com/apache/spark/pull/14365
  
Merged build finished. Test PASSed.





[GitHub] spark issue #14365: [SPARK-16628][SQL] Translate file-based relation schema ...

2016-07-28 Thread SparkQA
Github user SparkQA commented on the issue:

https://github.com/apache/spark/pull/14365
  
**[Test build #62993 has 
finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/62993/consoleFull)**
 for PR 14365 at commit 
[`0c17748`](https://github.com/apache/spark/commit/0c177484909e75c6a90555aa84e76ccede570df5).
 * This patch passes all tests.
 * This patch merges cleanly.
 * This patch adds no public classes.





[GitHub] spark issue #14395: [SPARK-16748][SQL] SparkExceptions during planning shoul...

2016-07-28 Thread AmplabJenkins
Github user AmplabJenkins commented on the issue:

https://github.com/apache/spark/pull/14395
  
Merged build finished. Test FAILed.





[GitHub] spark issue #14395: [SPARK-16748][SQL] SparkExceptions during planning shoul...

2016-07-28 Thread AmplabJenkins
Github user AmplabJenkins commented on the issue:

https://github.com/apache/spark/pull/14395
  
Test FAILed.
Refer to this link for build results (access rights to CI server needed): 
https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/62996/
Test FAILed.





[GitHub] spark issue #14395: [SPARK-16748][SQL] SparkExceptions during planning shoul...

2016-07-28 Thread SparkQA
Github user SparkQA commented on the issue:

https://github.com/apache/spark/pull/14395
  
**[Test build #62996 has 
finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/62996/consoleFull)**
 for PR 14395 at commit 
[`91625fd`](https://github.com/apache/spark/commit/91625fd747cd336ebde4f68f0ef67e1749bb9a83).
 * This patch **fails Scala style tests**.
 * This patch merges cleanly.
 * This patch adds no public classes.





[GitHub] spark issue #14395: [SPARK-16748][SQL] SparkExceptions during planning shoul...

2016-07-28 Thread SparkQA
Github user SparkQA commented on the issue:

https://github.com/apache/spark/pull/14395
  
**[Test build #62996 has 
started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/62996/consoleFull)**
 for PR 14395 at commit 
[`91625fd`](https://github.com/apache/spark/commit/91625fd747cd336ebde4f68f0ef67e1749bb9a83).





[GitHub] spark pull request #11514: [SPARK-13671] [SPARK-13311] [SQL] Use different p...

2016-07-28 Thread rxin
Github user rxin commented on a diff in the pull request:

https://github.com/apache/spark/pull/11514#discussion_r72743160
  
--- Diff: 
sql/core/src/main/scala/org/apache/spark/sql/execution/ExistingRDD.scala ---
@@ -101,17 +101,76 @@ private[sql] case class LogicalRDD(
 private[sql] case class PhysicalRDD(
 output: Seq[Attribute],
 rdd: RDD[InternalRow],
-override val nodeName: String,
-override val metadata: Map[String, String] = Map.empty,
-isUnsafeRow: Boolean = false,
-override val outputPartitioning: Partitioning = UnknownPartitioning(0))
+override val nodeName: String) extends LeafNode {
+
+  private[sql] override lazy val metrics = Map(
+"numOutputRows" -> SQLMetrics.createLongMetric(sparkContext, "number of output rows"))
+
+  protected override def doExecute(): RDD[InternalRow] = {
+val numOutputRows = longMetric("numOutputRows")
+rdd.mapPartitionsInternal { iter =>
+  val proj = UnsafeProjection.create(schema)
+  iter.map { r =>
+numOutputRows += 1
+proj(r)
+  }
+}
+  }
+
+  override def simpleString: String = {
+s"Scan $nodeName${output.mkString("[", ",", "]")}"
+  }
+}
+
+/** Physical plan node for scanning data from a relation. */
+private[sql] case class DataSourceScan(
+output: Seq[Attribute],
+rdd: RDD[InternalRow],
+@transient relation: BaseRelation,
+override val metadata: Map[String, String] = Map.empty)
   extends LeafNode with CodegenSupport {
 
+  override val nodeName: String = relation.toString
+
+  // Ignore rdd when checking results
+  override def sameResult(plan: SparkPlan ): Boolean = plan match {
+case other: DataSourceScan => relation == other.relation && metadata == other.metadata
--- End diff --

this is actually wrong because we cannot ignore the rdd, otherwise scans of 
different partitions are treated as "sameResult"!
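
A minimal sketch of the concern, using simplified stand-in types rather than Spark's actual classes. `ToyRelation`, `ToyScan`, and the `partitions` field are hypothetical; in the real `DataSourceScan` the information about which data is read lives in the `rdd`.

```scala
// Hypothetical stand-ins for illustration; these are not Spark's classes.
case class ToyRelation(path: String)

// `partitions` models the data that the real scan's RDD would read.
case class ToyScan(
    relation: ToyRelation,
    metadata: Map[String, String],
    partitions: Set[Int]) {

  // Mirrors the check in the diff: compares relation and metadata only.
  def sameResultIgnoringData(other: ToyScan): Boolean =
    relation == other.relation && metadata == other.metadata

  // Additionally compares the data that is actually scanned.
  def sameResultWithData(other: ToyScan): Boolean =
    sameResultIgnoringData(other) && partitions == other.partitions
}

object SameResultDemo {
  val rel = ToyRelation("/data/events")
  // Two scans of the same relation that read disjoint partitions.
  val scanA = ToyScan(rel, Map.empty, Set(0, 1))
  val scanB = ToyScan(rel, Map.empty, Set(2, 3))
}
```

In this toy model, `scanA.sameResultIgnoringData(scanB)` is true even though the two scans read different partitions, while `sameResultWithData` tells them apart.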






[GitHub] spark issue #14397: [SPARK-16771][SQL] WITH clause should not fall into infi...

2016-07-28 Thread SparkQA
Github user SparkQA commented on the issue:

https://github.com/apache/spark/pull/14397
  
**[Test build #62995 has 
started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/62995/consoleFull)**
 for PR 14397 at commit 
[`5bca528`](https://github.com/apache/spark/commit/5bca528d971cad317f748cb3f2d050ce0b8f99a7).





[GitHub] spark pull request #14397: [SPARK-16771][SQL] WITH clause should not fall in...

2016-07-28 Thread dongjoon-hyun
GitHub user dongjoon-hyun opened a pull request:

https://github.com/apache/spark/pull/14397

[SPARK-16771][SQL] WITH clause should not fall into infinite loop.

## What changes were proposed in this pull request?

This PR changes the table resolving rule to use database tables before CTE 
tables in order to prevent infinite loops on CTE table name resolution.

**Reported Error Scenarios**
```scala
scala> spark.range(10).createOrReplaceTempView("t")
scala> sql("WITH t AS (SELECT 1 FROM t) SELECT * FROM t")
java.lang.StackOverflowError
...
```

```scala
scala> spark.range(10).createOrReplaceTempView("t1")
scala> spark.range(10).createOrReplaceTempView("t2")
scala> sql("WITH t1 AS (SELECT 1 FROM t2), t2 AS (SELECT 1 FROM t1) SELECT * FROM t1, t2")
java.lang.StackOverflowError
...
```
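
The effect of the lookup order can be sketched with a toy resolver. This is not Catalyst's analyzer: `CteResolutionDemo`, `resolveRef`, `expandCte`, and the `MaxDepth` guard are all hypothetical names, and a CTE body is reduced to the list of table names it selects from. The depth guard stands in for the `StackOverflowError` so the looping case returns `None` instead of crashing.

```scala
object CteResolutionDemo {
  // A CTE body is modeled as the list of table names it selects from.
  type Body = List[String]

  // Guard that stands in for the stack overflow seen in the report.
  val MaxDepth = 50

  // Resolves one table reference found inside a CTE body.
  // None means "gave up": either the depth guard fired or the name is unknown.
  def resolveRef(
      ref: String,
      ctes: Map[String, Body],
      catalog: Set[String],
      catalogFirst: Boolean,
      depth: Int): Option[Set[String]] = {
    if (depth > MaxDepth) None
    else if (catalogFirst && catalog.contains(ref)) Some(Set(ref))
    else if (ctes.contains(ref)) expandCte(ref, ctes, catalog, catalogFirst, depth + 1)
    else if (catalog.contains(ref)) Some(Set(ref))
    else None
  }

  // Expands a CTE into the set of catalog tables it ultimately reads.
  def expandCte(
      name: String,
      ctes: Map[String, Body],
      catalog: Set[String],
      catalogFirst: Boolean,
      depth: Int = 0): Option[Set[String]] =
    ctes(name).foldLeft(Option(Set.empty[String])) { (acc, ref) =>
      for {
        soFar <- acc
        tables <- resolveRef(ref, ctes, catalog, catalogFirst, depth)
      } yield soFar ++ tables
    }
}
```

With `ctes = Map("t" -> List("t"))` and `catalog = Set("t")`, the CTE-first order keeps re-expanding `t` until the guard fires, whereas the catalog-first order resolves the inner `t` to the existing view and stops.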


## How was this patch tested?

Passes the Jenkins tests with two new test cases.

You can merge this pull request into a Git repository by running:

$ git pull https://github.com/dongjoon-hyun/spark SPARK-16771-TREENODE

Alternatively you can review and apply these changes as the patch at:

https://github.com/apache/spark/pull/14397.patch

To close this pull request, make a commit to your master/trunk branch
with (at least) the following in the commit message:

This closes #14397


commit 5bca528d971cad317f748cb3f2d050ce0b8f99a7
Author: Dongjoon Hyun 
Date:   2016-07-29T04:41:20Z

[SPARK-16771][SQL] WITH clause should not fall into infinite loop.







[GitHub] spark issue #14396: [SPARK-16787] SparkContext.addFile() should not throw if...

2016-07-28 Thread SparkQA
Github user SparkQA commented on the issue:

https://github.com/apache/spark/pull/14396
  
**[Test build #62994 has 
started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/62994/consoleFull)**
 for PR 14396 at commit 
[`0d7dd0d`](https://github.com/apache/spark/commit/0d7dd0d12ace12bf54c3d6b62b802ffa4de800a3).





[GitHub] spark pull request #14396: [SPARK-16787] SparkContext.addFile() should not t...

2016-07-28 Thread JoshRosen
GitHub user JoshRosen opened a pull request:

https://github.com/apache/spark/pull/14396

[SPARK-16787] SparkContext.addFile() should not throw if called twice with 
the same file

## What changes were proposed in this pull request?

The behavior of `SparkContext.addFile()` changed slightly with the 
introduction of the Netty-RPC-based file server, which was introduced in Spark 
1.6 (where it was disabled by default) and became the default / only file 
server in Spark 2.0.0.

Prior to 2.0, calling `SparkContext.addFile()` with files that have the 
same name and identical contents would succeed. This behavior was never 
explicitly documented but Spark has behaved this way since very early 1.x 
versions.

In 2.0 (or 1.6 with the Netty file server enabled), the second `addFile()` 
call will fail with a requirement error because NettyStreamManager tries to 
guard against duplicate file registration.

This problem also affects `addJar()` in a more subtle way: the 
`fileServer.addJar()` call will also fail with an exception but that exception 
is logged and ignored; I believe that the problematic exception-catching path 
was mistakenly copied from some old code which was only relevant to very old 
versions of Spark and YARN mode.

I believe that this change of behavior was unintentional, so this patch 
weakens the `require` check so that adding the same filename at the same path 
will succeed.

At file download time, Spark tasks will fail with exceptions if an executor 
already has a local copy of a file and that file's contents do not match the 
contents of the file being downloaded / added. As a result, it's important that 
we prevent files with the same name and different contents from being served 
because allowing that can effectively brick an executor by preventing it from 
successfully launching any new tasks. Before this patch's change, this was 
prevented by forbidding `addFile()` from being called twice on files with the 
same name. Because Spark does not defensively copy local files that are passed 
to `addFile` it is vulnerable to files' contents changing, so I think it's okay 
to rely on an implicit assumption that these files are intended to be immutable 
(since if they _are_ mutable then this can lead to either explicit task 
failures or implicit incorrectness (in case new executors silently get newer 
copies of the file while old executors continue to use an older
version)). To guard against this, I have decided to only update the file 
addition timestamps on the first call to `addFile()`; duplicate calls will 
succeed but will not update the timestamp. This behavior is fine as long as we 
assume files are immutable, which seems reasonable given the behaviors 
described above.

As part of this change, I also improved the thread-safety of the 
`addedJars` and `addedFiles` maps; this is important because these maps may be 
concurrently read by a task launching thread and written by a driver thread in 
case the user's driver code is multi-threaded. 
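
A rough sketch of the behavior described above, assuming a simplified registry keyed by file name. `ToyFileRegistry` and its methods are hypothetical and are not the actual `SparkContext` or `NettyStreamManager` code; the sketch only illustrates the weakened `require` check, the first-call-only timestamp, and the use of a concurrent map.

```scala
import java.util.concurrent.ConcurrentHashMap

// Hypothetical registry for illustration; not Spark's implementation.
class ToyFileRegistry {
  // file name -> (path it was registered from, timestamp of the first registration)
  private val files = new ConcurrentHashMap[String, (String, Long)]()

  /** Returns true if this call registered the file, false if it was a duplicate. */
  def addFile(name: String, path: String, now: Long): Boolean = {
    // putIfAbsent is atomic, so concurrent callers cannot overwrite the first entry.
    val existing = files.putIfAbsent(name, (path, now))
    // Weakened check: the same name is accepted as long as the path is identical.
    require(existing == null || existing._1 == path,
      s"File $name was already registered with a different path")
    existing == null
  }

  /** Timestamp recorded by the first successful registration, if any. */
  def timestampOf(name: String): Option[Long] =
    Option(files.get(name)).map(_._2)
}
```

Calling `addFile("a.txt", "/tmp/a.txt", ...)` twice succeeds and leaves the first timestamp in place, while registering `a.txt` from a different path fails the `require` check.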

## How was this patch tested?

I added regression tests in `SparkContextSuite`.

You can merge this pull request into a Git repository by running:

$ git pull https://github.com/JoshRosen/spark SPARK-16787

Alternatively you can review and apply these changes as the patch at:

https://github.com/apache/spark/pull/14396.patch

To close this pull request, make a commit to your master/trunk branch
with (at least) the following in the commit message:

This closes #14396


commit 99d9855c109233bbeb4a3501041e4ecf4825c278
Author: Josh Rosen 
Date:   2016-07-29T01:36:24Z

Add failing regression test.

commit 3ecdb88f52fa21dc3b8c89b27a82a33e6d2d937d
Author: Josh Rosen 
Date:   2016-07-29T01:36:34Z

Remove catch case which masked error for addJar.

commit c412f991c145cf02affd7c2d38de016a9b7f548d
Author: Josh Rosen 
Date:   2016-07-29T01:56:49Z

Fix bug.

commit b98d1492ec6a6cea4e5a8cca8e1e23fee2c120e9
Author: Josh Rosen 
Date:   2016-07-29T02:54:34Z

Add back require but weaken it to only accept identical paths.

commit 0d7dd0d12ace12bf54c3d6b62b802ffa4de800a3
Author: Josh Rosen 
Date:   2016-07-29T04:32:03Z

Improve thread-safety; do not update timestamp for file under assumption of 
immutability.





[GitHub] spark pull request #14110: [SPARK-16455] Add a new hook in CoarseGrainedSche...

2016-07-28 Thread rxin
Github user rxin commented on a diff in the pull request:

https://github.com/apache/spark/pull/14110#discussion_r72740499
  
--- Diff: 
core/src/main/scala/org/apache/spark/scheduler/cluster/CoarseGrainedSchedulerBackend.scala
 ---
@@ -345,6 +351,8 @@ class CoarseGrainedSchedulerBackend(scheduler: 
TaskSchedulerImpl, val rpcEnv: Rp
 driverEndpoint = createDriverEndpointRef(properties)
   }
 
+  def isClusterAvailableForNewOffers(): Boolean = true
--- End diff --

document what this does?





[GitHub] spark issue #14155: [SPARK-16498][SQL] move hive hack for data source table ...

2016-07-28 Thread AmplabJenkins
Github user AmplabJenkins commented on the issue:

https://github.com/apache/spark/pull/14155
  
Test PASSed.
Refer to this link for build results (access rights to CI server needed): 
https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/62992/
Test PASSed.





[GitHub] spark issue #14155: [SPARK-16498][SQL] move hive hack for data source table ...

2016-07-28 Thread AmplabJenkins
Github user AmplabJenkins commented on the issue:

https://github.com/apache/spark/pull/14155
  
Merged build finished. Test PASSed.





[GitHub] spark issue #14155: [SPARK-16498][SQL] move hive hack for data source table ...

2016-07-28 Thread SparkQA
Github user SparkQA commented on the issue:

https://github.com/apache/spark/pull/14155
  
**[Test build #62992 has 
finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/62992/consoleFull)**
 for PR 14155 at commit 
[`a52`](https://github.com/apache/spark/commit/a520e2c41eea73d65ace4d8c80ff13badbcd).
 * This patch passes all tests.
 * This patch merges cleanly.
 * This patch adds no public classes.





[GitHub] spark issue #14365: [SPARK-16628][SQL] Translate file-based relation schema ...

2016-07-28 Thread SparkQA
Github user SparkQA commented on the issue:

https://github.com/apache/spark/pull/14365
  
**[Test build #62993 has 
started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/62993/consoleFull)**
 for PR 14365 at commit 
[`0c17748`](https://github.com/apache/spark/commit/0c177484909e75c6a90555aa84e76ccede570df5).





[GitHub] spark pull request #14365: [SPARK-16628][SQL] Translate file-based relation ...

2016-07-28 Thread viirya
Github user viirya commented on a diff in the pull request:

https://github.com/apache/spark/pull/14365#discussion_r72738757
  
--- Diff: 
sql/core/src/main/scala/org/apache/spark/sql/execution/SchemaMapping.scala ---
@@ -0,0 +1,80 @@
+/*
+ * Licensed to the Apache Software Foundation (ASF) under one or more
+ * contributor license agreements.  See the NOTICE file distributed with
+ * this work for additional information regarding copyright ownership.
+ * The ASF licenses this file to You under the Apache License, Version 2.0
+ * (the "License"); you may not use this file except in compliance with
+ * the License.  You may obtain a copy of the License at
+ *
+ *http://www.apache.org/licenses/LICENSE-2.0
+ *
+ * Unless required by applicable law or agreed to in writing, software
+ * distributed under the License is distributed on an "AS IS" BASIS,
+ * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+ * See the License for the specific language governing permissions and
+ * limitations under the License.
+ */
+
+package org.apache.spark.sql.execution
+
+import org.apache.spark.sql.catalyst.expressions.{AttributeReference, 
Expression}
+import org.apache.spark.sql.types.{StructField, StructType}
+
+/**
+ * An interface for mapping two different schemas. For the relations that 
have are backed by files,
+ * the inferred schema from the files might be different with the schema 
stored in the catalog. In
+ * such case, the interface helps mapping inconsistent schemas.
--- End diff --

I've added more detailed documentation. Please take a look. Thanks.





[GitHub] spark issue #14258: [Spark-16579][SparkR] add install_spark function

2016-07-28 Thread shivaram
Github user shivaram commented on the issue:

https://github.com/apache/spark/pull/14258
  
I'm not sure a clean rebuild will solve the problem here. The problem is 
that `install_spark` loads the Spark 2.0.0 JARs (which don't have the fix in 
#14095), while the R test code in the master branch has the updated unit 
test. In other words, we need to use R code that doesn't have the test (i.e. 
branch-2.0); the master branch cannot be used with Spark 2.0.0 JARs.

This may not be a big problem if we only enable CRAN checks on branch-2.0, 
but it seems like disabling the tests as in #14357 is an easy way to avoid 
confusion for now.





[GitHub] spark pull request #14294: [SPARK-16646][SQL] LEAST and GREATEST doesn't acc...

2016-07-28 Thread HyukjinKwon
Github user HyukjinKwon closed the pull request at:

https://github.com/apache/spark/pull/14294





[GitHub] spark issue #14294: [SPARK-16646][SQL] LEAST and GREATEST doesn't accept num...

2016-07-28 Thread HyukjinKwon
Github user HyukjinKwon commented on the issue:

https://github.com/apache/spark/pull/14294
  
Closing this. Please refer to https://issues.apache.org/jira/browse/SPARK-16646





[GitHub] spark pull request #14363: [SPARK-16731][SQL] use StructType in CatalogTable...

2016-07-28 Thread cloud-fan
Github user cloud-fan commented on a diff in the pull request:

https://github.com/apache/spark/pull/14363#discussion_r72735867
  
--- Diff: 
sql/hive/src/main/scala/org/apache/spark/sql/hive/client/HiveClientImpl.scala 
---
@@ -721,16 +723,16 @@ private[hive] class HiveClientImpl(
 Utils.classForName(name)
   .asInstanceOf[Class[_ <: 
org.apache.hadoop.hive.ql.io.HiveOutputFormat[_, _]]]
 
-  private def toHiveColumn(c: CatalogColumn): FieldSchema = {
-new FieldSchema(c.name, c.dataType, c.comment.orNull)
+  private def toHiveColumn(c: StructField): FieldSchema = {
+new FieldSchema(c.name, c.dataType.catalogString, 
c.getComment().orNull)
   }
 
-  private def fromHiveColumn(hc: FieldSchema): CatalogColumn = {
-new CatalogColumn(
+  private def fromHiveColumn(hc: FieldSchema): StructField = {
+val f = StructField(
   name = hc.getName,
-  dataType = hc.getType,
-  nullable = true,
-  comment = Option(hc.getComment))
+  dataType = CatalystSqlParser.parseDataType(hc.getType),
--- End diff --

So the behaviour change is: previously, if a Hive table contained a type 
string that we can't parse, we could still describe it, but we threw an 
exception when trying to read it. After this PR, we will throw an exception 
when we try to read its table metadata from the Hive metastore.

I think it's OK to break this, but we need a better error message. What do 
you think? cc @yhuai @liancheng 
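
One possible shape for a clearer error, sketched with a toy parser. `HiveTypeParsingDemo`, `parseType`, and `columnType` are hypothetical names and the parser only accepts a few type names; it is a stand-in for the real type-string parsing in the diff, not a proposal for the actual wording.

```scala
import scala.util.{Failure, Success, Try}

object HiveTypeParsingDemo {
  // Hypothetical stand-in for a real type parser; accepts only a few names.
  def parseType(s: String): String = s.toLowerCase match {
    case t @ ("int" | "string" | "double") => t
    case other => throw new IllegalArgumentException(s"unsupported type: $other")
  }

  // Wraps the low-level failure with the column name and the raw type string,
  // so the message says which column of the table could not be understood.
  def columnType(column: String, hiveType: String): Either[String, String] =
    Try(parseType(hiveType)) match {
      case Success(t) => Right(t)
      case Failure(e) =>
        Left(s"Cannot recognize Hive type string '$hiveType' for column '$column': ${e.getMessage}")
    }
}
```

The point of the sketch is only that the failure raised while loading table metadata carries the column name and the original type string, rather than a bare parser error.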





[GitHub] spark pull request #14363: [SPARK-16731][SQL] use StructType in CatalogTable...

2016-07-28 Thread cloud-fan
Github user cloud-fan commented on a diff in the pull request:

https://github.com/apache/spark/pull/14363#discussion_r72735500
  
--- Diff: 
sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/catalog/interface.scala
 ---
@@ -78,28 +78,6 @@ object CatalogStorageFormat {
 }
 
 /**
- * A column in a table.
- */
-case class CatalogColumn(
-name: String,
-// TODO: make this type-safe; this is left as a string due to issues 
in converting Hive
-// varchars to and from SparkSQL strings.
--- End diff --

I don't know either...  cc @rxin @andrewor14 


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastruct...@apache.org or file a JIRA ticket
with INFRA.
---

-
To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org
For additional commands, e-mail: reviews-h...@spark.apache.org



[GitHub] spark issue #14155: [SPARK-16498][SQL] move hive hack for data source table ...

2016-07-28 Thread SparkQA
Github user SparkQA commented on the issue:

https://github.com/apache/spark/pull/14155
  
**[Test build #62992 has 
started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/62992/consoleFull)**
 for PR 14155 at commit 
[`a52`](https://github.com/apache/spark/commit/a520e2c41eea73d65ace4d8c80ff13badbcd).





[GitHub] spark issue #14395: [SPARK-16748][SQL] SparkExceptions during planning shoul...

2016-07-28 Thread SparkQA
Github user SparkQA commented on the issue:

https://github.com/apache/spark/pull/14395
  
**[Test build #62991 has 
finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/62991/consoleFull)**
 for PR 14395 at commit 
[`ab5120f`](https://github.com/apache/spark/commit/ab5120fde86502dbbcb4e10ed9cd595df2160176).
 * This patch **fails Scala style tests**.
 * This patch merges cleanly.
 * This patch adds no public classes.





[GitHub] spark issue #14395: [SPARK-16748][SQL] SparkExceptions during planning shoul...

2016-07-28 Thread AmplabJenkins
Github user AmplabJenkins commented on the issue:

https://github.com/apache/spark/pull/14395
  
Test FAILed.
Refer to this link for build results (access rights to CI server needed): 
https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/62991/
Test FAILed.





[GitHub] spark issue #14395: [SPARK-16748][SQL] SparkExceptions during planning shoul...

2016-07-28 Thread AmplabJenkins
Github user AmplabJenkins commented on the issue:

https://github.com/apache/spark/pull/14395
  
Merged build finished. Test FAILed.





[GitHub] spark issue #14395: [SPARK-16748][SQL] SparkExceptions during planning shoul...

2016-07-28 Thread SparkQA
Github user SparkQA commented on the issue:

https://github.com/apache/spark/pull/14395
  
**[Test build #62991 has 
started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/62991/consoleFull)**
 for PR 14395 at commit 
[`ab5120f`](https://github.com/apache/spark/commit/ab5120fde86502dbbcb4e10ed9cd595df2160176).





[GitHub] spark pull request #14395: [SPARK-16748][SQL] SparkExceptions during plannin...

2016-07-28 Thread tdas
GitHub user tdas opened a pull request:

https://github.com/apache/spark/pull/14395

[SPARK-16748][SQL] SparkExceptions during planning should not be wrapped in 
TreeNodeException

## What changes were proposed in this pull request?
We do not want SparkExceptions from job failures in the planning phase to 
be wrapped in a TreeNodeException. Hence do not wrap SparkException in 
TreeNodeException.
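
The idea can be sketched outside Spark. The following is a minimal Python analogue, not the Scala code in this patch: `SparkException`, `TreeNodeException`, and `attach_tree` are hypothetical stand-ins defined here only for illustration. It shows a wrapper that attaches plan context to unexpected errors while letting job-failure exceptions propagate unchanged.

```python
class SparkException(Exception):
    """Hypothetical stand-in for a job-failure exception."""


class TreeNodeException(Exception):
    """Hypothetical stand-in for an error that carries the offending plan tree."""

    def __init__(self, tree, message, cause=None):
        super().__init__(f"{message}, tree:\n{tree}")
        self.tree = tree
        self.cause = cause


def attach_tree(tree, message, fn):
    """Run fn(); add tree context to failures, except job failures."""
    try:
        return fn()
    except SparkException:
        # Job failures are re-raised as-is, so callers see the original error.
        raise
    except Exception as exc:
        # Any other failure is wrapped so the plan tree is visible in the message.
        raise TreeNodeException(tree, message, exc) from exc
```

With this shape, a caller that catches `SparkException` keeps working even when the failure happens inside a planning step, which appears to be the motivation stated above.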

## How was this patch tested?
New unit test




You can merge this pull request into a Git repository by running:

$ git pull https://github.com/tdas/spark SPARK-16748

Alternatively you can review and apply these changes as the patch at:

https://github.com/apache/spark/pull/14395.patch

To close this pull request, make a commit to your master/trunk branch
with (at least) the following in the commit message:

This closes #14395


commit ab5120fde86502dbbcb4e10ed9cd595df2160176
Author: Tathagata Das 
Date:   2016-07-29T02:06:21Z

Fixed bug







[GitHub] spark issue #14395: [SPARK-16748][SQL] SparkExceptions during planning shoul...

2016-07-28 Thread tdas
Github user tdas commented on the issue:

https://github.com/apache/spark/pull/14395
  
@marmbrus 





[GitHub] spark issue #14394: [SPARK-16786] [Python] [WIP] LDA topic distributions API...

2016-07-28 Thread AmplabJenkins
Github user AmplabJenkins commented on the issue:

https://github.com/apache/spark/pull/14394
  
Can one of the admins verify this patch?





[GitHub] spark pull request #14394: [SPARK-16786] [Python] [WIP] LDA topic distributi...

2016-07-28 Thread supremekai
GitHub user supremekai opened a pull request:

https://github.com/apache/spark/pull/14394

[SPARK-16786] [Python] [WIP] LDA topic distributions API Call for python

## What changes were proposed in this pull request?

Implemented a Python call to topicDistributions for 
pyspark.mllib.clustering.LDAModel

## How was this patch tested?
Ran ./dev/run-tests, all passing
Manually verified.
Used function parameter types, return types etc. from existing API calls so 
all behaviour is consistent with existing behaviour.




You can merge this pull request into a Git repository by running:

$ git pull https://github.com/supremekai/spark pyspark-topic-distributions

Alternatively you can review and apply these changes as the patch at:

https://github.com/apache/spark/pull/14394.patch

To close this pull request, make a commit to your master/trunk branch
with (at least) the following in the commit message:

This closes #14394


commit 00d93ccba2cfbc298f820d3d4391c4ad11211b4f
Author: Jordan 
Date:   2016-07-28T22:58:09Z

Added pyspark API call to MLlib LDAModel topicDistributions function

commit 5f36d785a689d21cb4392f56a30afdc8188bbc2a
Author: Jordan 
Date:   2016-07-29T01:12:37Z

Fixed imports and styling







[GitHub] spark issue #14176: [SPARK-16525][SQL] Enable Row Based HashMap in HashAggre...

2016-07-28 Thread AmplabJenkins
Github user AmplabJenkins commented on the issue:

https://github.com/apache/spark/pull/14176
  
Test PASSed.
Refer to this link for build results (access rights to CI server needed): 
https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/62990/
Test PASSed.





[GitHub] spark issue #14176: [SPARK-16525][SQL] Enable Row Based HashMap in HashAggre...

2016-07-28 Thread AmplabJenkins
Github user AmplabJenkins commented on the issue:

https://github.com/apache/spark/pull/14176
  
Merged build finished. Test PASSed.





[GitHub] spark issue #14176: [SPARK-16525][SQL] Enable Row Based HashMap in HashAggre...

2016-07-28 Thread SparkQA
Github user SparkQA commented on the issue:

https://github.com/apache/spark/pull/14176
  
**[Test build #62990 has 
finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/62990/consoleFull)**
 for PR 14176 at commit 
[`def94cc`](https://github.com/apache/spark/commit/def94ccf2b0630bcbc88067dc4e6f57434afb8ee).
 * This patch passes all tests.
 * This patch merges cleanly.
 * This patch adds no public classes.





[GitHub] spark issue #14182: [SPARK-16444][SparkR]: Isotonic Regression wrapper in Sp...

2016-07-28 Thread AmplabJenkins
Github user AmplabJenkins commented on the issue:

https://github.com/apache/spark/pull/14182
  
Merged build finished. Test PASSed.





[GitHub] spark issue #14182: [SPARK-16444][SparkR]: Isotonic Regression wrapper in Sp...

2016-07-28 Thread AmplabJenkins
Github user AmplabJenkins commented on the issue:

https://github.com/apache/spark/pull/14182
  
Test PASSed.
Refer to this link for build results (access rights to CI server needed): 
https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/62989/
Test PASSed.





[GitHub] spark issue #14182: [SPARK-16444][SparkR]: Isotonic Regression wrapper in Sp...

2016-07-28 Thread SparkQA
Github user SparkQA commented on the issue:

https://github.com/apache/spark/pull/14182
  
**[Test build #62989 has 
finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/62989/consoleFull)**
 for PR 14182 at commit 
[`f90e6b0`](https://github.com/apache/spark/commit/f90e6b000cfa6f71c8f2dabcf97c3cac8e58d444).
 * This patch passes all tests.
 * This patch merges cleanly.
 * This patch adds no public classes.





[GitHub] spark issue #14176: [SPARK-16525][SQL] Enable Row Based HashMap in HashAggre...

2016-07-28 Thread SparkQA
Github user SparkQA commented on the issue:

https://github.com/apache/spark/pull/14176
  
**[Test build #62990 has 
started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/62990/consoleFull)**
 for PR 14176 at commit 
[`def94cc`](https://github.com/apache/spark/commit/def94ccf2b0630bcbc88067dc4e6f57434afb8ee).





[GitHub] spark issue #14182: [SPARK-16444][SparkR]: Isotonic Regression wrapper in Sp...

2016-07-28 Thread SparkQA
Github user SparkQA commented on the issue:

https://github.com/apache/spark/pull/14182
  
**[Test build #62989 has 
started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/62989/consoleFull)**
 for PR 14182 at commit 
[`f90e6b0`](https://github.com/apache/spark/commit/f90e6b000cfa6f71c8f2dabcf97c3cac8e58d444).





[GitHub] spark issue #14384: [Spark-16443][SparkR] Alternating Least Squares (ALS) wr...

2016-07-28 Thread AmplabJenkins
Github user AmplabJenkins commented on the issue:

https://github.com/apache/spark/pull/14384
  
Test FAILed.
Refer to this link for build results (access rights to CI server needed): 
https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/62987/
Test FAILed.





[GitHub] spark issue #14384: [Spark-16443][SparkR] Alternating Least Squares (ALS) wr...

2016-07-28 Thread AmplabJenkins
Github user AmplabJenkins commented on the issue:

https://github.com/apache/spark/pull/14384
  
Merged build finished. Test FAILed.





[GitHub] spark issue #14384: [Spark-16443][SparkR] Alternating Least Squares (ALS) wr...

2016-07-28 Thread SparkQA
Github user SparkQA commented on the issue:

https://github.com/apache/spark/pull/14384
  
**[Test build #62987 has 
finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/62987/consoleFull)**
 for PR 14384 at commit 
[`ed8681e`](https://github.com/apache/spark/commit/ed8681e34b6e4b554cd2cef24a88f4eb13795c71).
 * This patch **fails SparkR unit tests**.
 * This patch merges cleanly.
 * This patch adds no public classes.





[GitHub] spark issue #14386: [SPARK-16761][DOC][ML] Fix doc link in docs/ml-guide.md

2016-07-28 Thread AmplabJenkins
Github user AmplabJenkins commented on the issue:

https://github.com/apache/spark/pull/14386
  
Merged build finished. Test PASSed.





[GitHub] spark issue #14386: [SPARK-16761][DOC][ML] Fix doc link in docs/ml-guide.md

2016-07-28 Thread AmplabJenkins
Github user AmplabJenkins commented on the issue:

https://github.com/apache/spark/pull/14386
  
Test PASSed.
Refer to this link for build results (access rights to CI server needed): 
https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/62988/
Test PASSed.





[GitHub] spark issue #14386: [SPARK-16761][DOC][ML] Fix doc link in docs/ml-guide.md

2016-07-28 Thread SparkQA
Github user SparkQA commented on the issue:

https://github.com/apache/spark/pull/14386
  
**[Test build #62988 has 
finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/62988/consoleFull)**
 for PR 14386 at commit 
[`4d1b3c5`](https://github.com/apache/spark/commit/4d1b3c58a4a35339fc088652b242cc6acd3838cb).
 * This patch passes all tests.
 * This patch merges cleanly.
 * This patch adds no public classes.





[GitHub] spark issue #14182: [SPARK-16444][SparkR]: Isotonic Regression wrapper in Sp...

2016-07-28 Thread AmplabJenkins
Github user AmplabJenkins commented on the issue:

https://github.com/apache/spark/pull/14182
  
Test PASSed.
Refer to this link for build results (access rights to CI server needed): 
https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/62986/
Test PASSed.





[GitHub] spark issue #14182: [SPARK-16444][SparkR]: Isotonic Regression wrapper in Sp...

2016-07-28 Thread SparkQA
Github user SparkQA commented on the issue:

https://github.com/apache/spark/pull/14182
  
**[Test build #62986 has 
finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/62986/consoleFull)**
 for PR 14182 at commit 
[`989a0b3`](https://github.com/apache/spark/commit/989a0b38244c009f4aba6d9f34cd050a48cf0b31).
 * This patch passes all tests.
 * This patch merges cleanly.
 * This patch adds no public classes.





[GitHub] spark issue #14182: [SPARK-16444][SparkR]: Isotonic Regression wrapper in Sp...

2016-07-28 Thread AmplabJenkins
Github user AmplabJenkins commented on the issue:

https://github.com/apache/spark/pull/14182
  
Merged build finished. Test PASSed.





[GitHub] spark pull request #14312: [SPARK-15857]Add caller context in Spark: invoke ...

2016-07-28 Thread Sherry302
Github user Sherry302 closed the pull request at:

https://github.com/apache/spark/pull/14312





[GitHub] spark issue #14386: [SPARK-16761][DOC][ML] Fix doc link in docs/ml-guide.md

2016-07-28 Thread SparkQA
Github user SparkQA commented on the issue:

https://github.com/apache/spark/pull/14386
  
**[Test build #62988 has 
started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/62988/consoleFull)**
 for PR 14386 at commit 
[`4d1b3c5`](https://github.com/apache/spark/commit/4d1b3c58a4a35339fc088652b242cc6acd3838cb).





[GitHub] spark pull request #14182: [SPARK-16444][SparkR]: Isotonic Regression wrappe...

2016-07-28 Thread junyangq
Github user junyangq commented on a diff in the pull request:

https://github.com/apache/spark/pull/14182#discussion_r72715742
  
--- Diff: R/pkg/R/mllib.R ---
@@ -292,6 +299,78 @@ setMethod("summary", signature(object = "NaiveBayesModel"),
             return(list(apriori = apriori, tables = tables))
           })
 
+#' Isotonic Regression Model
+#'
+#' Fits an Isotonic Regression model against a Spark DataFrame, similarly to R's isoreg().
+#' Users can print, make predictions on the produced model and save the model to the input path.
+#'
+#' @param data SparkDataFrame for training
+#' @param formula A symbolic description of the model to be fitted. Currently only a few formula
+#'                operators are supported, including '~', '.', ':', '+', and '-'.
+#' @param isotonic Whether the output sequence should be isotonic/increasing (true) or
+#'                 antitonic/decreasing (false)
+#' @param featureIndex The index of the feature if \code{featuresCol} is a vector column (default: `0`),
+#'                     no effect otherwise
+#' @return \code{spark.isoreg} returns a fitted Isotonic Regression model
+#' @rdname spark.isoreg
+#' @aliases spark.isoreg,SparkDataFrame,formula-method
+#' @name spark.isoreg
+#' @export
+#' @examples
+#' \dontrun{
+#' sparkR.session()
+#' data <- list(list(7.0, 0.0), list(5.0, 1.0), list(3.0, 2.0),
+#'              list(5.0, 3.0), list(1.0, 4.0))
+#' df <- createDataFrame(data, c("label", "feature"))
+#' model <- spark.isoreg(df, label ~ feature, isotonic = FALSE)
+#' # return model boundaries and prediction as lists
+#' result <- summary(model, df)
+#'
+#' # save fitted model to input path
+#' path <- "path/to/model"
+#' write.ml(model, path)
+#'
+#' # can also read back the saved model and print
+#' savedModel <- read.ml(path)
+#' summary(savedModel)
+#' }
+#' @note spark.isoreg since 2.1.0
+setMethod("spark.isoreg", signature(data = "SparkDataFrame", formula = "formula"),
+          function(data, formula, isotonic = TRUE, featureIndex = 0) {
+            formula <- paste0(deparse(formula), collapse = "")
+            jobj <- callJStatic("org.apache.spark.ml.r.IsotonicRegressionWrapper", "fit",
+                                data@sdf, formula, as.logical(isotonic), as.integer(featureIndex))
+            return(new("IsotonicRegressionModel", jobj = jobj))
+          })
+
+#  Predicted values based on an isotonicRegression model
+
+#' @param object a fitted isotonicRegressionModel
+#' @param newData A SparkDataFrame for testing
+#' @return \code{predict} returns a SparkDataFrame containing predicted values
+#' @rdname spark.isoreg
+#' @export
+#' @note predict(isotonicRegressionModel) since 2.1.0
+setMethod("predict", signature(object = "IsotonicRegressionModel"),
+          function(object, newData) {
+            return(dataFrame(callJMethod(object@jobj, "transform", newData@sdf)))
+          })
+
+#  Get the summary of a isotonicRegressionModel model
+
+#' @param object A fitted isotonicRegressionModel model
--- End diff --

Documenting `object` in both `predict` and `summary` may create duplicate 
argument fields in the generated doc.
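
For readers unfamiliar with the model documented in the diff above, here is a minimal Python sketch of what an isotonic (or antitonic) least-squares fit computes. It uses the pool-adjacent-violators approach on labels that are assumed to be already ordered by feature; the function name `pava` is hypothetical, and nothing in this thread states that the wrapper computes its fit this way. The input is the toy data from the roxygen example.

```python
def pava(values, increasing=True):
    """Least-squares monotone fit by pooling adjacent violators.

    `values` must already be ordered by the feature. With increasing=False
    the fit is antitonic (non-increasing).
    """
    sign = 1.0 if increasing else -1.0
    blocks = []  # each block is [sum of signed values, count]
    for v in values:
        blocks.append([sign * v, 1])
        # Merge backwards while the previous block's mean exceeds the last one's.
        while (len(blocks) > 1
               and blocks[-2][0] / blocks[-2][1] > blocks[-1][0] / blocks[-1][1]):
            total, count = blocks.pop()
            blocks[-1][0] += total
            blocks[-1][1] += count
    fitted = []
    for total, count in blocks:
        # Every point in a pooled block gets the block mean.
        fitted.extend([sign * total / count] * count)
    return fitted


# Labels from the example above, ordered by feature 0..4, fitted as antitonic.
labels = [7.0, 5.0, 3.0, 5.0, 1.0]
print(pava(labels, increasing=False))  # prints [7.0, 5.0, 4.0, 4.0, 1.0]
```

The third and fourth labels (3.0 and 5.0) violate the decreasing order, so they are pooled into their mean, 4.0.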





[GitHub] spark issue #14386: [SPARK-16761][DOC][ML] Fix doc link in docs/ml-guide.md

2016-07-28 Thread srowen
Github user srowen commented on the issue:

https://github.com/apache/spark/pull/14386
  
Jenkins test this please





[GitHub] spark issue #14273: [SPARK-9140] [ML] Replace TimeTracker by MultiStopwatch

2016-07-28 Thread MechCoder
Github user MechCoder commented on the issue:

https://github.com/apache/spark/pull/14273
  
@jkbradley Would you be able to have a look?





[GitHub] spark issue #13756: [SPARK-16041][SQL] Disallow Duplicate Columns in partiti...

2016-07-28 Thread AmplabJenkins
Github user AmplabJenkins commented on the issue:

https://github.com/apache/spark/pull/13756
  
Merged build finished. Test PASSed.





[GitHub] spark issue #13756: [SPARK-16041][SQL] Disallow Duplicate Columns in partiti...

2016-07-28 Thread AmplabJenkins
Github user AmplabJenkins commented on the issue:

https://github.com/apache/spark/pull/13756
  
Test PASSed.
Refer to this link for build results (access rights to CI server needed): 
https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/62985/
Test PASSed.





[GitHub] spark issue #13756: [SPARK-16041][SQL] Disallow Duplicate Columns in partiti...

2016-07-28 Thread SparkQA
Github user SparkQA commented on the issue:

https://github.com/apache/spark/pull/13756
  
**[Test build #62985 has 
finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/62985/consoleFull)**
 for PR 13756 at commit 
[`08b5374`](https://github.com/apache/spark/commit/08b5374e827f6680b4e4a00ed700ef689dce22ff).
 * This patch passes all tests.
 * This patch merges cleanly.
 * This patch adds no public classes.





[GitHub] spark issue #14258: [Spark-16579][SparkR] add install_spark function

2016-07-28 Thread junyangq
Github user junyangq commented on the issue:

https://github.com/apache/spark/pull/14258
  
@shivaram Would rebuilding after cleaning solve the problem, as in #14357? So 
you mean disabling those tests in this PR first?





[GitHub] spark issue #14386: [SPARK-16761][DOC][ML] Fix doc link in docs/ml-guide.md

2016-07-28 Thread BryanCutler
Github user BryanCutler commented on the issue:

https://github.com/apache/spark/pull/14386
  
LGTM!





[GitHub] spark issue #14384: [Spark-16443][SparkR] Alternating Least Squares (ALS) wr...

2016-07-28 Thread SparkQA
Github user SparkQA commented on the issue:

https://github.com/apache/spark/pull/14384
  
**[Test build #62987 has 
started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/62987/consoleFull)**
 for PR 14384 at commit 
[`ed8681e`](https://github.com/apache/spark/commit/ed8681e34b6e4b554cd2cef24a88f4eb13795c71).





[GitHub] spark pull request #14393: [SPARK-16772] Correct API doc references to PySpa...

2016-07-28 Thread asfgit
Github user asfgit closed the pull request at:

https://github.com/apache/spark/pull/14393





[GitHub] spark issue #14393: [SPARK-16772] Correct API doc references to PySpark clas...

2016-07-28 Thread rxin
Github user rxin commented on the issue:

https://github.com/apache/spark/pull/14393
  
Thanks - merging in master and 2.0.






[GitHub] spark issue #14182: [SPARK-16444][SparkR]: Isotonic Regression wrapper in Sp...

2016-07-28 Thread SparkQA
Github user SparkQA commented on the issue:

https://github.com/apache/spark/pull/14182
  
**[Test build #62986 has 
started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/62986/consoleFull)**
 for PR 14182 at commit 
[`989a0b3`](https://github.com/apache/spark/commit/989a0b38244c009f4aba6d9f34cd050a48cf0b31).





[GitHub] spark issue #14390: [SPARK-15541] Casting ConcurrentHashMap to ConcurrentMap

2016-07-28 Thread srowen
Github user srowen commented on the issue:

https://github.com/apache/spark/pull/14390
  
... but this needs to be opened vs master first. Don't worry about the 
flaky test here.





[GitHub] spark issue #14378: [SPARK-16750] [ML] Fix GaussianMixture training failed d...

2016-07-28 Thread srowen
Github user srowen commented on the issue:

https://github.com/apache/spark/pull/14378
  
Seems reasonable to me.





[GitHub] spark issue #14390: [SPARK-15541] Casting ConcurrentHashMap to ConcurrentMap

2016-07-28 Thread srowen
Github user srowen commented on the issue:

https://github.com/apache/spark/pull/14390
  
LGTM





[GitHub] spark issue #14393: [SPARK-16772] Correct API doc references to PySpark clas...

2016-07-28 Thread nchammas
Github user nchammas commented on the issue:

https://github.com/apache/spark/pull/14393
  
Yes, I built the docs and reviewed several (but not all) of the changes 
locally in my browser and confirmed that the corrections I wanted took place as 
expected.

(Apologies about not using the PR template when I first opened the PR. 
GitHub Desktop seems not to support that yet. I've updated the PR description 
to include this info now.)





[GitHub] spark issue #14258: [Spark-16579][SparkR] add install_spark function

2016-07-28 Thread shivaram
Github user shivaram commented on the issue:

https://github.com/apache/spark/pull/14258
  
@junyangq I just ran the CRAN checks locally and I see the problem you ran 
into in #14357. The problem is that tests which depend on a Java-side change 
that is in master but not in 2.0.0 will fail. I think this shouldn't be a 
problem in the long run, in the sense that we should match the SparkR and 
Spark versions closely. However, for the first cut I'm fine with re-opening 
#14357 and, say, disabling some of the tests temporarily?





[GitHub] spark issue #14393: [SPARK-16772] Correct API doc references to PySpark clas...

2016-07-28 Thread rxin
Github user rxin commented on the issue:

https://github.com/apache/spark/pull/14393
  
@nchammas did you build the docs to verify? If yes I'm going to merge it.






[GitHub] spark issue #14241: [SPARK-16596] [SQL] Refactor DataSourceScanExec to do pa...

2016-07-28 Thread AmplabJenkins
Github user AmplabJenkins commented on the issue:

https://github.com/apache/spark/pull/14241
  
Merged build finished. Test PASSed.





[GitHub] spark issue #14241: [SPARK-16596] [SQL] Refactor DataSourceScanExec to do pa...

2016-07-28 Thread AmplabJenkins
Github user AmplabJenkins commented on the issue:

https://github.com/apache/spark/pull/14241
  
Test PASSed.
Refer to this link for build results (access rights to CI server needed): 
https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/62979/
Test PASSed.





[GitHub] spark issue #14393: [SPARK-16772] Correct API doc references to PySpark clas...

2016-07-28 Thread AmplabJenkins
Github user AmplabJenkins commented on the issue:

https://github.com/apache/spark/pull/14393
  
Test PASSed.
Refer to this link for build results (access rights to CI server needed): 
https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/62984/
Test PASSed.





[GitHub] spark issue #14393: [SPARK-16772] Correct API doc references to PySpark clas...

2016-07-28 Thread AmplabJenkins
Github user AmplabJenkins commented on the issue:

https://github.com/apache/spark/pull/14393
  
Merged build finished. Test PASSed.





[GitHub] spark issue #14241: [SPARK-16596] [SQL] Refactor DataSourceScanExec to do pa...

2016-07-28 Thread SparkQA
Github user SparkQA commented on the issue:

https://github.com/apache/spark/pull/14241
  
**[Test build #62979 has 
finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/62979/consoleFull)**
 for PR 14241 at commit 
[`18f5543`](https://github.com/apache/spark/commit/18f5543e6b7e56e093e07ec599fe48f3e305dc7b).
 * This patch passes all tests.
 * This patch merges cleanly.
 * This patch adds no public classes.





[GitHub] spark issue #14393: [SPARK-16772] Correct API doc references to PySpark clas...

2016-07-28 Thread SparkQA
Github user SparkQA commented on the issue:

https://github.com/apache/spark/pull/14393
  
**[Test build #62984 has 
finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/62984/consoleFull)**
 for PR 14393 at commit 
[`16ef570`](https://github.com/apache/spark/commit/16ef5704f567e5372448d8d4dd2d92ee6133f02e).
 * This patch passes all tests.
 * This patch merges cleanly.
 * This patch adds no public classes.





[GitHub] spark issue #14308: [SPARK-16421][EXAMPLES][ML] Improve ML Example Outputs

2016-07-28 Thread BryanCutler
Github user BryanCutler commented on the issue:

https://github.com/apache/spark/pull/14308
  
ping @mengxr @jkbradley @MLnick, would any of you mind taking a look at this? 
There were a few Java examples I fixed up that wouldn't run because they used 
mllib.linalg.Vectors. If it would be easier, I could separate those into another 
PR to get that in ASAP. Thanks!





[GitHub] spark issue #14311: [SPARK-16550] [core] Certain classes fail to deserialize...

2016-07-28 Thread AmplabJenkins
Github user AmplabJenkins commented on the issue:

https://github.com/apache/spark/pull/14311
  
Merged build finished. Test FAILed.





[GitHub] spark issue #14311: [SPARK-16550] [core] Certain classes fail to deserialize...

2016-07-28 Thread AmplabJenkins
Github user AmplabJenkins commented on the issue:

https://github.com/apache/spark/pull/14311
  
Test FAILed.
Refer to this link for build results (access rights to CI server needed): 
https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/62981/
Test FAILed.





[GitHub] spark issue #14311: [SPARK-16550] [core] Certain classes fail to deserialize...

2016-07-28 Thread SparkQA
Github user SparkQA commented on the issue:

https://github.com/apache/spark/pull/14311
  
**[Test build #62981 has 
finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/62981/consoleFull)**
 for PR 14311 at commit 
[`7543c4a`](https://github.com/apache/spark/commit/7543c4abc67de3559da92f2c290c792cb4ca78bc).
 * This patch **fails Spark unit tests**.
 * This patch merges cleanly.
 * This patch adds no public classes.





[GitHub] spark issue #13756: [SPARK-16041][SQL] Disallow Duplicate Columns in partiti...

2016-07-28 Thread SparkQA
Github user SparkQA commented on the issue:

https://github.com/apache/spark/pull/13756
  
**[Test build #62985 has 
started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/62985/consoleFull)**
 for PR 13756 at commit 
[`08b5374`](https://github.com/apache/spark/commit/08b5374e827f6680b4e4a00ed700ef689dce22ff).





[GitHub] spark issue #13756: [SPARK-16041][SQL] Disallow Duplicate Columns in partiti...

2016-07-28 Thread gatorsmile
Github user gatorsmile commented on the issue:

https://github.com/apache/spark/pull/13756
  
retest this please





[GitHub] spark issue #14393: [SPARK-16772] Correct API doc references to PySpark clas...

2016-07-28 Thread SparkQA
Github user SparkQA commented on the issue:

https://github.com/apache/spark/pull/14393
  
**[Test build #62984 has 
started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/62984/consoleFull)**
 for PR 14393 at commit 
[`16ef570`](https://github.com/apache/spark/commit/16ef5704f567e5372448d8d4dd2d92ee6133f02e).





[GitHub] spark pull request #14363: [SPARK-16731][SQL] use StructType in CatalogTable...

2016-07-28 Thread gatorsmile
Github user gatorsmile commented on a diff in the pull request:

https://github.com/apache/spark/pull/14363#discussion_r72695248
  
--- Diff: sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/catalog/interface.scala ---
@@ -78,28 +78,6 @@ object CatalogStorageFormat {
 }
 
 /**
- * A column in a table.
- */
-case class CatalogColumn(
-name: String,
-// TODO: make this type-safe; this is left as a string due to issues in converting Hive
-// varchars to and from SparkSQL strings.
--- End diff --

Do you know what the issue is when converting varchars to and from 
SparkSQL strings? Sorry, I am unable to find the answer.





[GitHub] spark pull request #14363: [SPARK-16731][SQL] use StructType in CatalogTable...

2016-07-28 Thread gatorsmile
Github user gatorsmile commented on a diff in the pull request:

https://github.com/apache/spark/pull/14363#discussion_r72694426
  
--- Diff: sql/hive/src/main/scala/org/apache/spark/sql/hive/client/HiveClientImpl.scala ---
@@ -721,16 +723,16 @@ private[hive] class HiveClientImpl(
 Utils.classForName(name)
   .asInstanceOf[Class[_ <: org.apache.hadoop.hive.ql.io.HiveOutputFormat[_, _]]]
 
-  private def toHiveColumn(c: CatalogColumn): FieldSchema = {
-new FieldSchema(c.name, c.dataType, c.comment.orNull)
+  private def toHiveColumn(c: StructField): FieldSchema = {
+new FieldSchema(c.name, c.dataType.catalogString, c.getComment().orNull)
   }
 
-  private def fromHiveColumn(hc: FieldSchema): CatalogColumn = {
-new CatalogColumn(
+  private def fromHiveColumn(hc: FieldSchema): StructField = {
+val f = StructField(
   name = hc.getName,
-  dataType = hc.getType,
-  nullable = true,
-  comment = Option(hc.getComment))
+  dataType = CatalystSqlParser.parseDataType(hc.getType),
--- End diff --

This is the change we have to make if we convert `CatalogColumn` to 
`StructField`. It sounds like `hc.getType` could return null, or Hive could 
return some data types we do not recognize. In either case we would hit an 
exception from the parser, right?

That means the caller of `fromHiveColumn`, which is `getTableOption`, will 
also get the exception. I am wondering whether we want to avoid surfacing this 
kind of exception from `getTableOption`, or maybe issue a nicer error message 
here?
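
To make the "nicer error message" idea concrete, here is a minimal, 
self-contained sketch. It does not use Spark's or Hive's classes: 
`SimpleType`, `Field`, `parseType` and `fromColumn` are hypothetical stand-ins 
for `DataType`, `StructField`, `CatalystSqlParser.parseDataType` and 
`fromHiveColumn`, and the exception type chosen is only an assumption. The 
point is just to wrap the parse call so a failure names the column and the 
raw type string.

```scala
import scala.util.{Failure, Success, Try}

// Hypothetical stand-ins, NOT Spark's classes: just enough to show the idea.
sealed trait SimpleType
case object IntType extends SimpleType
case object StringType extends SimpleType

final case class Field(name: String, dataType: SimpleType, comment: Option[String])

object ColumnConversion {
  // Toy parser standing in for the real data type parser.
  // It throws on a null or unrecognized type string.
  def parseType(raw: String): SimpleType = raw match {
    case null => throw new IllegalArgumentException("type string is null")
    case s if s.trim.equalsIgnoreCase("int") => IntType
    case s if s.trim.equalsIgnoreCase("string") => StringType
    case other => throw new IllegalArgumentException(s"unsupported type: $other")
  }

  // Wrap the parse so the caller sees which column and raw type failed,
  // while keeping the original exception as the cause.
  def fromColumn(name: String, rawType: String, comment: String): Field =
    Try(parseType(rawType)) match {
      case Success(dt) => Field(name, dt, Option(comment))
      case Failure(e) =>
        throw new IllegalStateException(
          s"Cannot recognize the data type '$rawType' of column '$name'", e)
    }
}
```

Whether to rethrow, as above, or to skip the table and return `None` from the 
caller is a separate design choice; the sketch only covers the error message.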





[GitHub] spark pull request #14387: [SPARK-16764][SQL] Recommend disabling vectorized...

2016-07-28 Thread asfgit
Github user asfgit closed the pull request at:

https://github.com/apache/spark/pull/14387





[GitHub] spark issue #14387: [SPARK-16764][SQL] Recommend disabling vectorized parque...

2016-07-28 Thread rxin
Github user rxin commented on the issue:

https://github.com/apache/spark/pull/14387
  
Merging in master/2.0.






[GitHub] spark issue #14393: [SPARK-16772] Correct API doc references to PySpark clas...

2016-07-28 Thread nchammas
Github user nchammas commented on the issue:

https://github.com/apache/spark/pull/14393
  
Apologies for making a fairly "noisy" PR, with changes in several scattered 
places. However, as a PySpark user it's important to me that the API docs be 
properly formatted and that docstring class references work.

Feel free to ping me on Python docstring changes in the future. I would be 
happy to review them.

cc @rxin @davies - Ready for review.





[GitHub] spark issue #14379: [SPARK-16751] Upgrade derby to 10.12.1.1

2016-07-28 Thread AmplabJenkins
Github user AmplabJenkins commented on the issue:

https://github.com/apache/spark/pull/14379
  
Merged build finished. Test PASSed.





[GitHub] spark issue #14379: [SPARK-16751] Upgrade derby to 10.12.1.1

2016-07-28 Thread AmplabJenkins
Github user AmplabJenkins commented on the issue:

https://github.com/apache/spark/pull/14379
  
Test PASSed.
Refer to this link for build results (access rights to CI server needed): 
https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/62973/
Test PASSed.





[GitHub] spark issue #14379: [SPARK-16751] Upgrade derby to 10.12.1.1

2016-07-28 Thread SparkQA
Github user SparkQA commented on the issue:

https://github.com/apache/spark/pull/14379
  
**[Test build #62973 has 
finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/62973/consoleFull)**
 for PR 14379 at commit 
[`f3815cf`](https://github.com/apache/spark/commit/f3815cfd1b6b6d7ba29dc8f2e12b1334ad415847).
 * This patch passes all tests.
 * This patch merges cleanly.
 * This patch adds no public classes.





[GitHub] spark issue #14132: [SPARK-16475][SQL] Broadcast Hint for SQL Queries

2016-07-28 Thread AmplabJenkins
Github user AmplabJenkins commented on the issue:

https://github.com/apache/spark/pull/14132
  
Merged build finished. Test PASSed.





[GitHub] spark issue #14132: [SPARK-16475][SQL] Broadcast Hint for SQL Queries

2016-07-28 Thread AmplabJenkins
Github user AmplabJenkins commented on the issue:

https://github.com/apache/spark/pull/14132
  
Test PASSed.
Refer to this link for build results (access rights to CI server needed): 
https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/62975/
Test PASSed.





[GitHub] spark issue #14182: [SPARK-16444][SparkR]: Isotonic Regression wrapper in Sp...

2016-07-28 Thread AmplabJenkins
Github user AmplabJenkins commented on the issue:

https://github.com/apache/spark/pull/14182
  
Merged build finished. Test PASSed.





[GitHub] spark issue #14182: [SPARK-16444][SparkR]: Isotonic Regression wrapper in Sp...

2016-07-28 Thread AmplabJenkins
Github user AmplabJenkins commented on the issue:

https://github.com/apache/spark/pull/14182
  
Test PASSed.
Refer to this link for build results (access rights to CI server needed): 
https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/62980/
Test PASSed.





[GitHub] spark issue #14132: [SPARK-16475][SQL] Broadcast Hint for SQL Queries

2016-07-28 Thread SparkQA
Github user SparkQA commented on the issue:

https://github.com/apache/spark/pull/14132
  
**[Test build #62975 has 
finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/62975/consoleFull)**
 for PR 14132 at commit 
[`4023d97`](https://github.com/apache/spark/commit/4023d974f34052bb29e12fd93aeb187ea12b536f).
 * This patch passes all tests.
 * This patch merges cleanly.
 * This patch adds no public classes.





[GitHub] spark issue #14182: [SPARK-16444][SparkR]: Isotonic Regression wrapper in Sp...

2016-07-28 Thread SparkQA
Github user SparkQA commented on the issue:

https://github.com/apache/spark/pull/14182
  
**[Test build #62980 has 
finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/62980/consoleFull)**
 for PR 14182 at commit 
[`b125a45`](https://github.com/apache/spark/commit/b125a4523c2ea9b31e9c500cb0f572ccaed1162c).
 * This patch passes all tests.
 * This patch merges cleanly.
 * This patch adds no public classes.





[GitHub] spark issue #14393: [SPARK-16772] Correct API doc references to PySpark clas...

2016-07-28 Thread AmplabJenkins
Github user AmplabJenkins commented on the issue:

https://github.com/apache/spark/pull/14393
  
Test PASSed.
Refer to this link for build results (access rights to CI server needed): 
https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/62983/





[GitHub] spark issue #14393: [SPARK-16772] Correct API doc references to PySpark clas...

2016-07-28 Thread AmplabJenkins
Github user AmplabJenkins commented on the issue:

https://github.com/apache/spark/pull/14393
  
Merged build finished. Test PASSed.





[GitHub] spark issue #14393: [SPARK-16772] Correct API doc references to PySpark clas...

2016-07-28 Thread SparkQA
Github user SparkQA commented on the issue:

https://github.com/apache/spark/pull/14393
  
**[Test build #62983 has finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/62983/consoleFull)** for PR 14393 at commit [`493b61e`](https://github.com/apache/spark/commit/493b61ea5b25816b252f75241223bb29b1d4d8d2).
 * This patch passes all tests.
 * This patch merges cleanly.
 * This patch adds no public classes.





[GitHub] spark issue #14182: [SPARK-16444][SparkR]: Isotonic Regression wrapper in Sp...

2016-07-28 Thread AmplabJenkins
Github user AmplabJenkins commented on the issue:

https://github.com/apache/spark/pull/14182
  
Test PASSed.
Refer to this link for build results (access rights to CI server needed): 
https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/62978/




