[GitHub] spark issue #14953: [Minor] [ML] [MLlib] Remove work around for breeze spars...

2016-09-03 Thread SparkQA
Github user SparkQA commented on the issue:

https://github.com/apache/spark/pull/14953
  
**[Test build #64912 has 
started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/64912/consoleFull)**
 for PR 14953 at commit 
[`c403ac6`](https://github.com/apache/spark/commit/c403ac6668bae5a3785ae4a15a9f57d0da3bac9f).


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastruct...@apache.org or file a JIRA ticket
with INFRA.
---

-
To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org
For additional commands, e-mail: reviews-h...@spark.apache.org



[GitHub] spark pull request #14953: [Minor] [ML] [MLlib] Remove work around for breez...

2016-09-03 Thread yanboliang
GitHub user yanboliang opened a pull request:

https://github.com/apache/spark/pull/14953

[Minor] [ML] [MLlib] Remove work around for breeze sparse matrix.

## What changes were proposed in this pull request?
Since we have updated Breeze to version 0.12, we should remove the workaround 
for the Breeze sparse matrix bug in v0.11.


## How was this patch tested?
Existing tests.

You can merge this pull request into a Git repository by running:

$ git pull https://github.com/yanboliang/spark matrices

Alternatively you can review and apply these changes as the patch at:

https://github.com/apache/spark/pull/14953.patch

To close this pull request, make a commit to your master/trunk branch
with (at least) the following in the commit message:

This closes #14953


commit c403ac6668bae5a3785ae4a15a9f57d0da3bac9f
Author: Yanbo Liang 
Date:   2016-09-04T05:14:29Z

Remove work around for breeze sparse matrix.







[GitHub] spark pull request #14452: [SPARK-16849][SQL] Improve subquery execution by ...

2016-09-03 Thread viirya
Github user viirya commented on a diff in the pull request:

https://github.com/apache/spark/pull/14452#discussion_r77446696
  
--- Diff: sql/core/src/main/scala/org/apache/spark/sql/execution/subquery/CommonSubquery.scala ---
@@ -0,0 +1,60 @@
+/*
+ * Licensed to the Apache Software Foundation (ASF) under one or more
+ * contributor license agreements.  See the NOTICE file distributed with
+ * this work for additional information regarding copyright ownership.
+ * The ASF licenses this file to You under the Apache License, Version 2.0
+ * (the "License"); you may not use this file except in compliance with
+ * the License.  You may obtain a copy of the License at
+ *
+ *http://www.apache.org/licenses/LICENSE-2.0
+ *
+ * Unless required by applicable law or agreed to in writing, software
+ * distributed under the License is distributed on an "AS IS" BASIS,
+ * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+ * See the License for the specific language governing permissions and
+ * limitations under the License.
+ */
+
+package org.apache.spark.sql.execution.subquery
+
+import org.apache.spark.rdd.RDD
+import org.apache.spark.sql.catalyst.InternalRow
+import org.apache.spark.sql.catalyst.expressions._
+import org.apache.spark.sql.catalyst.plans.QueryPlan
+import org.apache.spark.sql.catalyst.plans.logical
+import org.apache.spark.sql.catalyst.plans.logical.{LogicalPlan, Statistics}
+import org.apache.spark.sql.execution.SparkPlan
+import org.apache.spark.util.Utils
+
+private[sql] case class CommonSubquery(
+output: Seq[Attribute],
+@transient child: SparkPlan)(
+@transient val logicalChild: LogicalPlan,
+private[sql] val _statistics: Statistics,
+@transient private[sql] var _computedOutput: RDD[InternalRow] = null)
--- End diff --

@hvanhovell On rethinking this, it is indeed incorrect. I have changed it to 
use a helper class to do the RDD materialization.





[GitHub] spark issue #14859: [SPARK-17200][PROJECT INFRA][BUILD][SPARKR] Automate bui...

2016-09-03 Thread SparkQA
Github user SparkQA commented on the issue:

https://github.com/apache/spark/pull/14859
  
**[Test build #64911 has 
started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/64911/consoleFull)**
 for PR 14859 at commit 
[`2e04911`](https://github.com/apache/spark/commit/2e049111d7cdbc1c11fd1495a3e438d6cd2fd4c7).





[GitHub] spark issue #14859: [SPARK-17200][PROJECT INFRA][BUILD][SPARKR] Automate bui...

2016-09-03 Thread SparkQA
Github user SparkQA commented on the issue:

https://github.com/apache/spark/pull/14859
  
**[Test build #64910 has 
started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/64910/consoleFull)**
 for PR 14859 at commit 
[`4f2db1e`](https://github.com/apache/spark/commit/4f2db1e931753908784530b356569b83514ba5af).





[GitHub] spark issue #14919: [SPARK-17354][SQL] Partitioning by dates/timestamps shou...

2016-09-03 Thread AmplabJenkins
Github user AmplabJenkins commented on the issue:

https://github.com/apache/spark/pull/14919
  
Test PASSed.
Refer to this link for build results (access rights to CI server needed): 
https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/64909/
Test PASSed.





[GitHub] spark issue #14919: [SPARK-17354][SQL] Partitioning by dates/timestamps shou...

2016-09-03 Thread AmplabJenkins
Github user AmplabJenkins commented on the issue:

https://github.com/apache/spark/pull/14919
  
Merged build finished. Test PASSed.





[GitHub] spark issue #14919: [SPARK-17354][SQL] Partitioning by dates/timestamps shou...

2016-09-03 Thread SparkQA
Github user SparkQA commented on the issue:

https://github.com/apache/spark/pull/14919
  
**[Test build #64909 has 
finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/64909/consoleFull)**
 for PR 14919 at commit 
[`acf2a3d`](https://github.com/apache/spark/commit/acf2a3d2a1c673a0cdaf19ad0be86abc217ebb88).
 * This patch passes all tests.
 * This patch merges cleanly.
 * This patch adds no public classes.





[GitHub] spark issue #14928: [SPARK-17369][SQL] MetastoreRelation toJSON should not t...

2016-09-03 Thread rxin
Github user rxin commented on the issue:

https://github.com/apache/spark/pull/14928
  
Can you update the pull request title / description to say more about the 
actual issue?






[GitHub] spark issue #14919: [SPARK-17354][SQL] Partitioning by dates/timestamps shou...

2016-09-03 Thread SparkQA
Github user SparkQA commented on the issue:

https://github.com/apache/spark/pull/14919
  
**[Test build #64909 has 
started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/64909/consoleFull)**
 for PR 14919 at commit 
[`acf2a3d`](https://github.com/apache/spark/commit/acf2a3d2a1c673a0cdaf19ad0be86abc217ebb88).





[GitHub] spark issue #13690: [SPARK-15767][R][ML] Decision Tree Regression wrapper in...

2016-09-03 Thread shivaram
Github user shivaram commented on the issue:

https://github.com/apache/spark/pull/13690
  
@vectorijk Is this ready for another round of review?





[GitHub] spark pull request #14433: [SPARK-16829][SparkR]:sparkR sc.setLogLevel doesn...

2016-09-03 Thread asfgit
Github user asfgit closed the pull request at:

https://github.com/apache/spark/pull/14433





[GitHub] spark pull request #14433: [SPARK-16829][SparkR]:sparkR sc.setLogLevel doesn...

2016-09-03 Thread shivaram
Github user shivaram commented on a diff in the pull request:

https://github.com/apache/spark/pull/14433#discussion_r77443065
  
--- Diff: core/src/main/scala/org/apache/spark/internal/Logging.scala ---
@@ -135,7 +135,8 @@ private[spark] trait Logging {
 val replLevel = Option(replLogger.getLevel()).getOrElse(Level.WARN)
 if (replLevel != rootLogger.getEffectiveLevel()) {
   System.err.printf("Setting default log level to \"%s\".\n", replLevel)
-  System.err.println("To adjust logging level use sc.setLogLevel(newLevel).")
+  System.err.println("To adjust logging level use sc.setLogLevel(newLevel). " +
+  "For SparkR, use setLogLevel(newLevel).")
--- End diff --

I fixed this up during the merge





[GitHub] spark issue #14433: [SPARK-16829][SparkR]:sparkR sc.setLogLevel doesn't work

2016-09-03 Thread shivaram
Github user shivaram commented on the issue:

https://github.com/apache/spark/pull/14433
  
Merged into master





[GitHub] spark pull request #14945: [SPARK-17386] Set default trigger interval to 1/1...

2016-09-03 Thread wangmiao1981
Github user wangmiao1981 commented on a diff in the pull request:

https://github.com/apache/spark/pull/14945#discussion_r77442963
  
--- Diff: sql/core/src/test/scala/org/apache/spark/sql/streaming/StreamTest.scala ---
@@ -152,8 +152,8 @@ trait StreamTest extends QueryTest with SharedSQLContext with Timeouts {
 
   /** Starts the stream, resuming if data has already been processed. It must not be running. */
   case class StartStream(
-  trigger: Trigger = ProcessingTime(0),
-  triggerClock: Clock = new SystemClock)
+  trigger: Trigger = ProcessingTime.defaultTriggerInterval,
--- End diff --

Leading spaces should be removed.





[GitHub] spark pull request #14952: [SPARK-17110] Fix StreamCorruptionException in Bl...

2016-09-03 Thread ericl
Github user ericl commented on a diff in the pull request:

https://github.com/apache/spark/pull/14952#discussion_r77442140
  
--- Diff: core/src/main/scala/org/apache/spark/storage/BlockManager.scala ---
@@ -520,10 +520,11 @@ private[spark] class BlockManager(
*
* This does not acquire a lock on this block in this JVM.
*/
-  private def getRemoteValues(blockId: BlockId): Option[BlockResult] = {
+  private def getRemoteValues[T: ClassTag](blockId: BlockId): Option[BlockResult] = {
+val ct = implicitly[ClassTag[T]]
 getRemoteBytes(blockId).map { data =>
   val values =
-serializerManager.dataDeserializeStream(blockId, data.toInputStream(dispose = true))
+serializerManager.dataDeserializeStream(blockId, data.toInputStream(dispose = true))(ct)
--- End diff --

Is it possible for dataDeserializeStream to require a classtag to be 
explicitly passed?
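
For context on the question above, here is a minimal sketch of the two signature styles being contrasted. The object and method names are hypothetical and are not Spark's `dataDeserializeStream`; the sketch only shows that a context bound lets the compiler supply a tag silently, while an explicit parameter list forces each call site to pass one.

```scala
import scala.reflect.ClassTag

// Hypothetical helper for illustration only; not part of Spark.
object ClassTagPassingSketch {
  // Context bound: the compiler looks up the tag implicitly, so a caller
  // that never thinks about class tags still compiles, using whatever tag
  // matches the inferred element type.
  def elementTypeImplicit[T: ClassTag](values: Iterator[T]): String =
    implicitly[ClassTag[T]].runtimeClass.getSimpleName

  // Explicit parameter list: every call site must pass a tag by hand,
  // so omitting it is a compile error rather than a silent default.
  def elementTypeExplicit[T](values: Iterator[T])(ct: ClassTag[T]): String =
    ct.runtimeClass.getSimpleName
}
```

With the explicit form, a call such as `elementTypeExplicit(Iterator("a"))(scala.reflect.classTag[String])` has to name the tag, which is the kind of enforcement the question is asking about.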





[GitHub] spark issue #14952: [SPARK-17110] Fix StreamCorruptionException in BlockMana...

2016-09-03 Thread AmplabJenkins
Github user AmplabJenkins commented on the issue:

https://github.com/apache/spark/pull/14952
  
Test FAILed.
Refer to this link for build results (access rights to CI server needed): 
https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/64908/
Test FAILed.





[GitHub] spark issue #14952: [SPARK-17110] Fix StreamCorruptionException in BlockMana...

2016-09-03 Thread AmplabJenkins
Github user AmplabJenkins commented on the issue:

https://github.com/apache/spark/pull/14952
  
Merged build finished. Test FAILed.





[GitHub] spark issue #14952: [SPARK-17110] Fix StreamCorruptionException in BlockMana...

2016-09-03 Thread SparkQA
Github user SparkQA commented on the issue:

https://github.com/apache/spark/pull/14952
  
**[Test build #64908 has 
finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/64908/consoleFull)**
 for PR 14952 at commit 
[`9eb75f5`](https://github.com/apache/spark/commit/9eb75f57bbb7ee0c555bbdd26cf4187ee0ad3671).
 * This patch **fails Spark unit tests**.
 * This patch merges cleanly.
 * This patch adds no public classes.





[GitHub] spark pull request #14881: [SPARK-17315][SparkR] Kolmogorov-Smirnov test Spa...

2016-09-03 Thread asfgit
Github user asfgit closed the pull request at:

https://github.com/apache/spark/pull/14881





[GitHub] spark issue #14951: [SPARK-17391] [TEST] [2.0] Fix Two Test Failures After B...

2016-09-03 Thread AmplabJenkins
Github user AmplabJenkins commented on the issue:

https://github.com/apache/spark/pull/14951
  
Merged build finished. Test PASSed.





[GitHub] spark issue #14951: [SPARK-17391] [TEST] [2.0] Fix Two Test Failures After B...

2016-09-03 Thread AmplabJenkins
Github user AmplabJenkins commented on the issue:

https://github.com/apache/spark/pull/14951
  
Test PASSed.
Refer to this link for build results (access rights to CI server needed): 
https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/64907/
Test PASSed.





[GitHub] spark issue #14951: [SPARK-17391] [TEST] [2.0] Fix Two Test Failures After B...

2016-09-03 Thread SparkQA
Github user SparkQA commented on the issue:

https://github.com/apache/spark/pull/14951
  
**[Test build #64907 has 
finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/64907/consoleFull)**
 for PR 14951 at commit 
[`2f02bed`](https://github.com/apache/spark/commit/2f02bed506bd16da47db50ea9ff7e42422f5797d).
 * This patch passes all tests.
 * This patch merges cleanly.
 * This patch adds no public classes.





[GitHub] spark pull request #14915: [SPARK-17356][SQL] Fix out of memory issue when g...

2016-09-03 Thread yhuai
Github user yhuai commented on a diff in the pull request:

https://github.com/apache/spark/pull/14915#discussion_r77441612
  
--- Diff: sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/trees/TreeNode.scala ---
@@ -604,6 +604,7 @@ abstract class TreeNode[BaseType <: TreeNode[BaseType]] extends Product {
 }.toList
   }
 
+  // TODO: Fix toJSON so that we can more safely handle Map and Seq with loop.
--- End diff --

I think it is better to make the comment self-contained, so that readers of 
this part do not need to guess or search JIRA to understand what this line 
means.





[GitHub] spark issue #14828: [SPARK-17258][SQL] Parse scientific decimal literals as ...

2016-09-03 Thread AmplabJenkins
Github user AmplabJenkins commented on the issue:

https://github.com/apache/spark/pull/14828
  
Test PASSed.
Refer to this link for build results (access rights to CI server needed): 
https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/64906/
Test PASSed.





[GitHub] spark issue #14828: [SPARK-17258][SQL] Parse scientific decimal literals as ...

2016-09-03 Thread AmplabJenkins
Github user AmplabJenkins commented on the issue:

https://github.com/apache/spark/pull/14828
  
Merged build finished. Test PASSed.





[GitHub] spark issue #14828: [SPARK-17258][SQL] Parse scientific decimal literals as ...

2016-09-03 Thread SparkQA
Github user SparkQA commented on the issue:

https://github.com/apache/spark/pull/14828
  
**[Test build #64906 has 
finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/64906/consoleFull)**
 for PR 14828 at commit 
[`009ab39`](https://github.com/apache/spark/commit/009ab39a79b837352b65a054050d74a2cf44dfd7).
 * This patch passes all tests.
 * This patch merges cleanly.
 * This patch adds no public classes.





[GitHub] spark issue #14948: [SPARK-17389] [SPARK-3261] [MLLIB] Significant KMeans sp...

2016-09-03 Thread srowen
Github user srowen commented on the issue:

https://github.com/apache/spark/pull/14948
  
Ah yeah, it also removes runs from k-means||. The good news is I think 
these changes are actually not overlapping for the most part, and where they 
do, they're essentially the same change.

@yanboliang what do you think of this? This takes out `runs` entirely and I 
think simplifies the code even a bit further. But the real win was reducing the 
default init steps. I'm also here trying to fix the fact that duplicate 
centroids can be returned.
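
As a rough illustration of the duplicate-centroid point, one way to drop repeated centers is sketched below. This is an assumption for illustration, not the change made in the pull request: the object name is hypothetical, and centers are modeled as plain `Vector[Double]` values rather than MLlib vectors.

```scala
// Hypothetical helper for illustration only; not the code in the PR.
object DistinctCentersSketch {
  // Each center is an immutable Vector[Double], which compares
  // structurally. Array[Double] would compare by reference, so
  // `distinct` would not remove equal arrays.
  def dedupe(centers: Seq[Vector[Double]]): Seq[Vector[Double]] =
    centers.distinct
}
```

A consequence of deduplicating this way is that the result may contain fewer centers than the requested `k`, which callers would need to tolerate.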





[GitHub] spark issue #14948: [SPARK-17389] [SPARK-3261] [MLLIB] Significant KMeans sp...

2016-09-03 Thread sethah
Github user sethah commented on the issue:

https://github.com/apache/spark/pull/14948
  
I think https://github.com/apache/spark/pull/14937 also removes runs. cc 
@yanboliang can we coordinate these PRs? 





[GitHub] spark issue #14952: [SPARK-17110] Fix StreamCorruptionException in BlockMana...

2016-09-03 Thread SparkQA
Github user SparkQA commented on the issue:

https://github.com/apache/spark/pull/14952
  
**[Test build #64908 has 
started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/64908/consoleFull)**
 for PR 14952 at commit 
[`9eb75f5`](https://github.com/apache/spark/commit/9eb75f57bbb7ee0c555bbdd26cf4187ee0ad3671).





[GitHub] spark issue #14952: [SPARK-17110] Fix StreamCorruptionException in BlockMana...

2016-09-03 Thread JoshRosen
Github user JoshRosen commented on the issue:

https://github.com/apache/spark/pull/14952
  
/cc @ericl





[GitHub] spark pull request #14952: [SPARK-17110] Fix StreamCorruptionException in Bl...

2016-09-03 Thread JoshRosen
GitHub user JoshRosen opened a pull request:

https://github.com/apache/spark/pull/14952

[SPARK-17110] Fix StreamCorruptionException in 
BlockManager.getRemoteValues()

## What changes were proposed in this pull request?

This patch fixes a `java.io.StreamCorruptedException` error affecting 
remote reads of cached values when certain data types are used. The problem 
stems from #11801 / SPARK-13990, a patch to have Spark automatically pick the 
"best" serializer when caching RDDs. If PySpark cached a PythonRDD, then this 
would be cached as an `RDD[Array[Byte]]` and the automatic serializer selection 
would pick KryoSerializer for replication and block transfer. However, the 
`getRemoteValues()` / `getRemoteBytes()` code path did not pass proper class 
tags in order to enable the same serializer to be used during deserialization, 
causing Java to be inappropriately used instead of Kryo, leading to the 
StreamCorruptedException.

We already fixed a similar bug in #14311, which dealt with similar issues 
in block replication. Prior to that patch, it seems that we had no tests to 
ensure that block replication actually succeeded. Similarly, prior to this bug 
fix patch it looks like we had no tests to perform remote reads of cached data, 
which is why this bug was able to remain latent for so long.

This patch addresses the bug by modifying `BlockManager`'s `get()` and 
`getRemoteValues()` methods to accept ClassTags, allowing the proper class tag 
to be threaded through the `getOrElseUpdate` code path (which is used by 
`rdd.iterator`).
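
The mismatch described above can be sketched in isolation. The following is a simplified model, not Spark's implementation: `SerializerChoiceSketch`, `SerializerKind`, and the selection rule in `canUseKryoLike` are all assumptions made for illustration, and the untagged reader is assumed to behave as if it had been given `ClassTag.Any`.

```scala
import scala.reflect.ClassTag

// Hypothetical model for illustration only; not Spark classes.
object SerializerChoiceSketch {
  sealed trait SerializerKind
  case object JavaLike extends SerializerKind
  case object KryoLike extends SerializerKind

  // Assumed rule for this sketch: primitives, strings, and primitive
  // arrays are treated as safe for the Kryo-like serializer.
  def canUseKryoLike(ct: ClassTag[_]): Boolean = {
    val cls = ct.runtimeClass
    cls.isPrimitive || cls == classOf[String] ||
      (cls.isArray && cls.getComponentType.isPrimitive)
  }

  def pick(ct: ClassTag[_]): SerializerKind =
    if (canUseKryoLike(ct)) KryoLike else JavaLike

  // Writer side: knows the element type and picks from its tag.
  def writerSide[T: ClassTag]: SerializerKind = pick(implicitly[ClassTag[T]])

  // Reader side with no tag threaded through (assumed to act like Any).
  def readerWithoutTag: SerializerKind = pick(ClassTag.Any)

  // Reader side once the caller's tag is threaded through.
  def readerWithTag[T: ClassTag]: SerializerKind = pick(implicitly[ClassTag[T]])
}
```

In this model, `writerSide[Array[Byte]]` and `readerWithoutTag` select different serializer kinds, which mirrors the write/read disagreement described above, while `readerWithTag[Array[Byte]]` agrees with the writer.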

## How was this patch tested?

Extended the caching tests in `DistributedSuite` to exercise the 
`getRemoteValues` path, plus manual testing to verify that the PySpark bug 
reproduction in SPARK-17110 is fixed.

You can merge this pull request into a Git repository by running:

$ git pull https://github.com/JoshRosen/spark SPARK-17110

Alternatively you can review and apply these changes as the patch at:

https://github.com/apache/spark/pull/14952.patch

To close this pull request, make a commit to your master/trunk branch
with (at least) the following in the commit message:

This closes #14952


commit 470380e48a9bf574ee6cfc2700bd044b70276cd8
Author: Josh Rosen 
Date:   2016-09-03T17:26:52Z

Add regression test.

commit 9eb75f57bbb7ee0c555bbdd26cf4187ee0ad3671
Author: Josh Rosen 
Date:   2016-09-03T17:31:43Z

Fix bug by threading proper ClassTag




---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastruct...@apache.org or file a JIRA ticket
with INFRA.
---

-
To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org
For additional commands, e-mail: reviews-h...@spark.apache.org



[GitHub] spark issue #14444: [SPARK-16839] [SQL] redundant aliases after cleanupAlias...

2016-09-03 Thread eyalfa
Github user eyalfa commented on the issue:

https://github.com/apache/spark/pull/1
  
That was basically the idea behind the trait you asked me to remove...
The thing is, I don't feel it belongs in CreateNamedStruct, so I settled for 
putting it in CreateStruct and reusing it in CreateStructUnsafe.
A better approach might be adding a toUnsafe method on CreateNamedStruct; 
this could be useful, as most of the time the unsafe version is built from the 
safe version (without even touching the children). What do you think about this 
approach?





[GitHub] spark issue #14950: [SPARK-17390][ML][MLLib] Optimize MultivariantOnlineSumm...

2016-09-03 Thread AmplabJenkins
Github user AmplabJenkins commented on the issue:

https://github.com/apache/spark/pull/14950
  
Merged build finished. Test PASSed.





[GitHub] spark issue #14950: [SPARK-17390][ML][MLLib] Optimize MultivariantOnlineSumm...

2016-09-03 Thread AmplabJenkins
Github user AmplabJenkins commented on the issue:

https://github.com/apache/spark/pull/14950
  
Test PASSed.
Refer to this link for build results (access rights to CI server needed): 
https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/64905/
Test PASSed.





[GitHub] spark issue #14950: [SPARK-17390][ML][MLLib] Optimize MultivariantOnlineSumm...

2016-09-03 Thread SparkQA
Github user SparkQA commented on the issue:

https://github.com/apache/spark/pull/14950
  
**[Test build #64905 has 
finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/64905/consoleFull)**
 for PR 14950 at commit 
[`3029468`](https://github.com/apache/spark/commit/302946821db75117b4ab2346b4b445472ed50eb4).
 * This patch passes all tests.
 * This patch merges cleanly.
 * This patch adds the following public classes _(experimental)_:
  * `class MultivariateOnlineSummarizer(mask: Int)`





[GitHub] spark issue #14951: [SPARK-17391] [TEST] Fix Two Test Failures After Backpor...

2016-09-03 Thread gatorsmile
Github user gatorsmile commented on the issue:

https://github.com/apache/spark/pull/14951
  
cc @cloud-fan @hvanhovell @davies @sameeragarwal 





[GitHub] spark issue #14951: [SPARK-17391] [TEST] Fix Two Test Failures After Backpor...

2016-09-03 Thread SparkQA
Github user SparkQA commented on the issue:

https://github.com/apache/spark/pull/14951
  
**[Test build #64907 has 
started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/64907/consoleFull)**
 for PR 14951 at commit 
[`2f02bed`](https://github.com/apache/spark/commit/2f02bed506bd16da47db50ea9ff7e42422f5797d).





[GitHub] spark pull request #14951: [SPARK-17391] [TEST] Fix Two Test Failures After ...

2016-09-03 Thread gatorsmile
GitHub user gatorsmile opened a pull request:

https://github.com/apache/spark/pull/14951

[SPARK-17391] [TEST] Fix Two Test Failures After Backport

### What changes were proposed in this pull request?
In the latest branch 2.0, we have two test case failures due to a backport.

- test("ALTER VIEW AS should keep the previous table properties, comment, 
create_time, etc.")
- test("SPARK-6212: The EXPLAIN output of CTAS only shows the analyzed 
plan")

### How was this patch tested?
N/A

You can merge this pull request into a Git repository by running:

$ git pull https://github.com/gatorsmile/spark fixTestFailure

Alternatively you can review and apply these changes as the patch at:

https://github.com/apache/spark/pull/14951.patch

To close this pull request, make a commit to your master/trunk branch
with (at least) the following in the commit message:

This closes #14951


commit 2f02bed506bd16da47db50ea9ff7e42422f5797d
Author: gatorsmile 
Date:   2016-09-03T17:38:24Z

fix.







[GitHub] spark issue #14550: [SPARK-16959] [SQL] Rebuild Table Comment when Retrievin...

2016-09-03 Thread gatorsmile
Github user gatorsmile commented on the issue:

https://github.com/apache/spark/pull/14550
  
I need to fix the test case failure in Branch 2.0





[GitHub] spark issue #14864: [SPARK-15453] [SQL] FileSourceScanExec to extract `outpu...

2016-09-03 Thread AmplabJenkins
Github user AmplabJenkins commented on the issue:

https://github.com/apache/spark/pull/14864
  
Test FAILed.
Refer to this link for build results (access rights to CI server needed): 
https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/64901/
Test FAILed.





[GitHub] spark issue #14864: [SPARK-15453] [SQL] FileSourceScanExec to extract `outpu...

2016-09-03 Thread AmplabJenkins
Github user AmplabJenkins commented on the issue:

https://github.com/apache/spark/pull/14864
  
Merged build finished. Test FAILed.





[GitHub] spark issue #14864: [SPARK-15453] [SQL] FileSourceScanExec to extract `outpu...

2016-09-03 Thread SparkQA
Github user SparkQA commented on the issue:

https://github.com/apache/spark/pull/14864
  
**[Test build #64901 has 
finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/64901/consoleFull)**
 for PR 14864 at commit 
[`568b742`](https://github.com/apache/spark/commit/568b742a7087e39c39a47caac300aad74174ec7d).
 * This patch **fails Spark unit tests**.
 * This patch merges cleanly.
 * This patch adds no public classes.





[GitHub] spark issue #14797: [SPARK-17230] [SQL] Should not pass optimized query into...

2016-09-03 Thread gatorsmile
Github user gatorsmile commented on the issue:

https://github.com/apache/spark/pull/14797
  
This does not catch all the cases. In CTAS, we still optimize the query. By 
following the way in this PR, I can try to fix that case. Thanks!





[GitHub] spark issue #14949: [SPARK-17057] [ML] ProbabilisticClassifierModels' predic...

2016-09-03 Thread AmplabJenkins
Github user AmplabJenkins commented on the issue:

https://github.com/apache/spark/pull/14949
  
Merged build finished. Test PASSed.





[GitHub] spark issue #14949: [SPARK-17057] [ML] ProbabilisticClassifierModels' predic...

2016-09-03 Thread AmplabJenkins
Github user AmplabJenkins commented on the issue:

https://github.com/apache/spark/pull/14949
  
Test PASSed.
Refer to this link for build results (access rights to CI server needed): 
https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/64903/
Test PASSed.





[GitHub] spark issue #14949: [SPARK-17057] [ML] ProbabilisticClassifierModels' predic...

2016-09-03 Thread SparkQA
Github user SparkQA commented on the issue:

https://github.com/apache/spark/pull/14949
  
**[Test build #64903 has 
finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/64903/consoleFull)**
 for PR 14949 at commit 
[`2fa331e`](https://github.com/apache/spark/commit/2fa331ebdf3df31b08e3299c8410c29efa645bf1).
 * This patch passes all tests.
 * This patch merges cleanly.
 * This patch adds no public classes.





[GitHub] spark pull request #14938: [SPARK-17335][SQL] Fix ArrayType and MapType Cata...

2016-09-03 Thread asfgit
Github user asfgit closed the pull request at:

https://github.com/apache/spark/pull/14938





[GitHub] spark issue #14938: [SPARK-17335][SQL] Fix ArrayType and MapType CatalogStri...

2016-09-03 Thread hvanhovell
Github user hvanhovell commented on the issue:

https://github.com/apache/spark/pull/14938
  
Merging to master/branch-2.0.





[GitHub] spark pull request #14867: [SPARK-17296][SQL] Simplify parser join processin...

2016-09-03 Thread hvanhovell
Github user hvanhovell commented on a diff in the pull request:

https://github.com/apache/spark/pull/14867#discussion_r77439638
  
--- Diff: sql/catalyst/src/main/antlr4/org/apache/spark/sql/catalyst/parser/SqlBase.g4 ---
@@ -374,15 +374,17 @@ setQuantifier
 ;
 
 relation
-: left=relation
-  ((CROSS | joinType) JOIN right=relation joinCriteria?
-  | NATURAL joinType JOIN right=relation
-  )   #joinRelation
-| relationPrimary #relationDefault
+: relationPrimary joinRelation*
+;
+
+joinRelation
+: joinType JOIN right=relationPrimary joinCriteria?
+| NATURAL joinType JOIN right=relationPrimary
 ;
--- End diff --

I had to move the code around. I have added a check in the AstBuilder 
(spark side of the parser) to catch this.
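
For readers unfamiliar with the grammar change in the diff above: the left-recursive `relation` rule is replaced by `relationPrimary joinRelation*`, so the parser now yields a first relation plus a flat list of join clauses. Below is a minimal sketch of how such a list could be folded back into a left-deep join tree. The types `Plan`, `Table`, `Join`, `JoinClause` and the object `JoinFolding` are hypothetical and are not Spark's classes; this is an illustration, not the AstBuilder code from the PR.

```scala
// Toy AST; these types are illustrative and are not Spark's LogicalPlan classes.
sealed trait Plan
final case class Table(name: String) extends Plan
final case class Join(left: Plan, right: Plan, joinType: String) extends Plan

// One parsed `joinRelation`: the join type plus its right-hand `relationPrimary`.
final case class JoinClause(joinType: String, right: Plan)

object JoinFolding {
  // `relationPrimary joinRelation*` arrives as a head plus a flat list;
  // foldLeft rebuilds the left-deep nesting that a left-recursive rule yields.
  def build(first: Plan, joins: Seq[JoinClause]): Plan =
    joins.foldLeft(first) { (left, clause) =>
      Join(left, clause.right, clause.joinType)
    }
}
```

With this sketch, `a JOIN b LEFT JOIN c` becomes `Join(Join(a, b, "inner"), c, "left")`, i.e. earlier joins end up nested on the left.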





[GitHub] spark issue #14828: [SPARK-17258][SQL] Parse scientific decimal literals as ...

2016-09-03 Thread SparkQA
Github user SparkQA commented on the issue:

https://github.com/apache/spark/pull/14828
  
**[Test build #64906 has 
started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/64906/consoleFull)**
 for PR 14828 at commit 
[`009ab39`](https://github.com/apache/spark/commit/009ab39a79b837352b65a054050d74a2cf44dfd7).





[GitHub] spark issue #14444: [SPARK-16839] [SQL] redundant aliases after cleanupAlias...

2016-09-03 Thread hvanhovell
Github user hvanhovell commented on the issue:

https://github.com/apache/spark/pull/1
  
Why not share the method for creating the named children, and use that in 
CreateUnsafeStruct?





[GitHub] spark issue #14444: [SPARK-16839] [SQL] redundant aliases after cleanupAlias...

2016-09-03 Thread eyalfa
Github user eyalfa commented on the issue:

https://github.com/apache/spark/pull/1
  
Thanks, I think I came up with something simpler: I've implemented the 
logic in CreateStruct and reused it for CreateStructUnsafe by calling 
CreateStruct(children), then extracting the new children and calling 
CreateNamedStructUnsafe's constructor. I'll push in a few moments (waiting for 
the unit tests to pass).





[GitHub] spark issue #14444: [SPARK-16839] [SQL] redundant aliases after cleanupAlias...

2016-09-03 Thread hvanhovell
Github user hvanhovell commented on the issue:

https://github.com/apache/spark/pull/1
  
Something like this:
```scala
object CreateNamedStruct {
  def nameColumns(expressions: Seq[Expression]): Seq[Expression] = {
    expressions.zipWithIndex.flatMap {
      case (named: NamedExpression, _) =>
        Seq(Literal(named.name), named)
      case (expression, index) =>
        Seq(Literal(s"col$index"), expression)
    }
  }

  def unnamed(expressions: Seq[Expression]): CreateNamedStruct = {
    CreateNamedStruct(nameColumns(expressions))
  }
}

// You could even drop the CreateStruct entirely
object CreateStruct {
  def apply(expressions: Seq[Expression]): Expression = {
    CreateNamedStruct.unnamed(expressions)
  }
}
```





[GitHub] spark issue #14950: [SPARK-17390][ML][MLLib] Optimize MultivariantOnlineSumm...

2016-09-03 Thread SparkQA
Github user SparkQA commented on the issue:

https://github.com/apache/spark/pull/14950
  
**[Test build #64905 has 
started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/64905/consoleFull)**
 for PR 14950 at commit 
[`3029468`](https://github.com/apache/spark/commit/302946821db75117b4ab2346b4b445472ed50eb4).





[GitHub] spark issue #14950: [SPARK-17390][ML][MLLib] Optimize MultivariantOnlineSumm...

2016-09-03 Thread WeichenXu123
Github user WeichenXu123 commented on the issue:

https://github.com/apache/spark/pull/14950
  
@srowen It's not only the CPU cost: if the data dimension is large, the 
serialization cost will be large too, as in 
https://github.com/apache/spark/pull/14109.
Also, computing every target doesn't seem right if we may add more summary 
targets in the future?
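
A rough sketch of the idea under discussion, assuming a bitmask that selects which statistics are tracked. The flag names in `Stat`, the class `MaskedSummarizer` and its methods are hypothetical; the mask layout actually used by `MultivariateOnlineSummarizer(mask: Int)` in this PR is not shown in this thread.

```scala
// Hypothetical bit flags; the PR's real mask layout is not shown in this thread.
object Stat {
  val Mean: Int = 1 << 0
  val Max: Int = 1 << 1
}

// Toy summarizer that only allocates and updates the requested statistics,
// so unrequested per-dimension arrays are never created or shipped.
final class MaskedSummarizer(mask: Int, dim: Int) {
  private def wants(flag: Int): Boolean = (mask & flag) != 0

  private var count = 0L
  private val sum: Array[Double] =
    if (wants(Stat.Mean)) new Array[Double](dim) else Array.empty
  private val maxv: Array[Double] =
    if (wants(Stat.Max)) Array.fill(dim)(Double.NegativeInfinity) else Array.empty

  def add(row: Array[Double]): this.type = {
    require(row.length == dim, s"expected $dim values, got ${row.length}")
    count += 1
    var i = 0
    while (i < dim) {
      if (wants(Stat.Mean)) sum(i) += row(i)
      if (wants(Stat.Max)) maxv(i) = math.max(maxv(i), row(i))
      i += 1
    }
    this
  }

  // Callers must know how the instance was configured:
  // a statistic that was not requested comes back as None.
  def mean: Option[Array[Double]] =
    if (wants(Stat.Mean) && count > 0) Some(sum.map(_ / count)) else None

  def max: Option[Array[Double]] =
    if (wants(Stat.Max) && count > 0) Some(maxv.clone()) else None
}
```

The `Option` return types make the trade-off visible: less state to compute and serialize, at the price of callers having to know which statistics a given instance was configured for.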





[GitHub] spark issue #14950: [SPARK-17390][ML][MLLib] Optimize MultivariantOnlineSumm...

2016-09-03 Thread AmplabJenkins
Github user AmplabJenkins commented on the issue:

https://github.com/apache/spark/pull/14950
  
Merged build finished. Test FAILed.





[GitHub] spark issue #14950: [SPARK-17390][ML][MLLib] Optimize MultivariantOnlineSumm...

2016-09-03 Thread AmplabJenkins
Github user AmplabJenkins commented on the issue:

https://github.com/apache/spark/pull/14950
  
Test FAILed.
Refer to this link for build results (access rights to CI server needed): 
https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/64904/
Test FAILed.





[GitHub] spark issue #14950: [SPARK-17390][ML][MLLib] Optimize MultivariantOnlineSumm...

2016-09-03 Thread SparkQA
Github user SparkQA commented on the issue:

https://github.com/apache/spark/pull/14950
  
**[Test build #64904 has 
finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/64904/consoleFull)**
 for PR 14950 at commit 
[`be286eb`](https://github.com/apache/spark/commit/be286eb630fbed17ad55f97d0b53d1f25794ed37).
 * This patch **fails Scala style tests**.
 * This patch merges cleanly.
 * This patch adds the following public classes _(experimental)_:
  * `class MultivariateOnlineSummarizer(mask: Int)`





[GitHub] spark issue #14950: [SPARK-17390][ML][MLLib] Optimize MultivariantOnlineSumm...

2016-09-03 Thread SparkQA
Github user SparkQA commented on the issue:

https://github.com/apache/spark/pull/14950
  
**[Test build #64904 has 
started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/64904/consoleFull)**
 for PR 14950 at commit 
[`be286eb`](https://github.com/apache/spark/commit/be286eb630fbed17ad55f97d0b53d1f25794ed37).





[GitHub] spark issue #14950: [SPARK-17390][ML][MLLib] Optimize MultivariantOnlineSumm...

2016-09-03 Thread srowen
Github user srowen commented on the issue:

https://github.com/apache/spark/pull/14950
  
Hm, how much does this really save? These are pretty cheap sufficient 
statistics. Now you have to know whether or not your particular object was 
configured to return the answer you want.


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastruct...@apache.org or file a JIRA ticket
with INFRA.
---

-
To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org
For additional commands, e-mail: reviews-h...@spark.apache.org



[GitHub] spark issue #14950: [SPARK-17390][ML][MLLib] Optimize MultivariantOnlineSumm...

2016-09-03 Thread SparkQA
Github user SparkQA commented on the issue:

https://github.com/apache/spark/pull/14950
  
**[Test build #64902 has 
finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/64902/consoleFull)**
 for PR 14950 at commit 
[`dc44bb9`](https://github.com/apache/spark/commit/dc44bb974f3969ab70032e947dd9e06b87e317db).
 * This patch **fails Scala style tests**.
 * This patch merges cleanly.
 * This patch adds the following public classes _(experimental)_:
  * `class MultivariateOnlineSummarizer(mask: Int)`





[GitHub] spark issue #14950: [SPARK-17390][ML][MLLib] Optimize MultivariantOnlineSumm...

2016-09-03 Thread AmplabJenkins
Github user AmplabJenkins commented on the issue:

https://github.com/apache/spark/pull/14950
  
Merged build finished. Test FAILed.


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastruct...@apache.org or file a JIRA ticket
with INFRA.
---

-
To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org
For additional commands, e-mail: reviews-h...@spark.apache.org



[GitHub] spark issue #14950: [SPARK-17390][ML][MLLib] Optimize MultivariantOnlineSumm...

2016-09-03 Thread AmplabJenkins
Github user AmplabJenkins commented on the issue:

https://github.com/apache/spark/pull/14950
  
Test FAILed.
Refer to this link for build results (access rights to CI server needed): 
https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/64902/
Test FAILed.





[GitHub] spark issue #14949: [SPARK-17057] [ML] ProbabilisticClassifierModels' predic...

2016-09-03 Thread SparkQA
Github user SparkQA commented on the issue:

https://github.com/apache/spark/pull/14949
  
**[Test build #64903 has 
started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/64903/consoleFull)**
 for PR 14949 at commit 
[`2fa331e`](https://github.com/apache/spark/commit/2fa331ebdf3df31b08e3299c8410c29efa645bf1).





[GitHub] spark pull request #14864: [SPARK-15453] [SQL] FileSourceScanExec to extract...

2016-09-03 Thread cloud-fan
Github user cloud-fan commented on a diff in the pull request:

https://github.com/apache/spark/pull/14864#discussion_r77438939
  
--- Diff: sql/core/src/main/scala/org/apache/spark/sql/execution/DataSourceScanExec.scala ---
@@ -156,24 +155,57 @@ case class FileSourceScanExec(
 false
   }
 
-  override val outputPartitioning: Partitioning = {
+  @transient private lazy val selectedPartitions = relation.location.listFiles(partitionFilters)
+
+  override val (outputPartitioning, outputOrdering): (Partitioning, Seq[SortOrder]) = {
 val bucketSpec = if (relation.sparkSession.sessionState.conf.bucketingEnabled) {
   relation.bucketSpec
 } else {
   None
 }
-bucketSpec.map { spec =>
-  val numBuckets = spec.numBuckets
-  val bucketColumns = spec.bucketColumnNames.flatMap { n =>
-output.find(_.name == n)
-  }
-  if (bucketColumns.size == spec.bucketColumnNames.size) {
-HashPartitioning(bucketColumns, numBuckets)
-  } else {
-UnknownPartitioning(0)
-  }
-}.getOrElse {
-  UnknownPartitioning(0)
+bucketSpec match {
+  case Some(spec) =>
+val numBuckets = spec.numBuckets
+
+def toAttribute(colName: String, columnType: String): Attribute =
+  output.find(_.name == colName).getOrElse {
+throw new AnalysisException(s"Could not find $columnType column $colName for " +
--- End diff --

My concern is this: suppose a table has 3 columns, `i`, `j`, and `k`, and is
bucketed by `i` and `j` and sorted by `j` and `k`. If we read only `i` and `j`
from this table, the generated RDD should be bucketed, i.e. the number of
partitions of this RDD should equal the number of buckets. For each RDD
partition, can we treat it as sorted by `j`?
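
One way to reason about this question: rows ordered lexicographically by `(j, k)` are also ordered by `j` alone, so a leading prefix of the sort columns survives the projection, while `k` on its own would not. Below is a minimal sketch in plain Scala (no Spark involved). `Row`, `isSortedBy`, and the sample data are hypothetical, and it assumes the partition holds a single sorted file.

```scala
object SortPrefixSketch {
  // Hypothetical row type standing in for the table columns i, j, k.
  final case class Row(i: Int, j: Int, k: Int)

  // True when consecutive keys are in non-decreasing order.
  def isSortedBy[K](rows: Seq[Row])(key: Row => K)(implicit ord: Ordering[K]): Boolean =
    rows.zip(rows.drop(1)).forall { case (a, b) => ord.lteq(key(a), key(b)) }

  // One bucket's rows, sorted by (j, k) as the table metadata claims.
  val bucket: Seq[Row] =
    Seq(Row(1, 2, 9), Row(1, 1, 5), Row(1, 2, 3)).sortBy(r => (r.j, r.k))

  def main(args: Array[String]): Unit = {
    println(isSortedBy(bucket)(r => (r.j, r.k))) // true
    println(isSortedBy(bucket)(_.j))             // true: prefix of the sort key
    println(isSortedBy(bucket)(_.k))             // false: not a prefix
  }
}
```

This only illustrates the ordering argument; whether the scan node should report that ordering is still the design question raised above.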





[GitHub] spark issue #14950: [SPARK-17390][ML][MLLib] Optimize MultivariantOnlineSumm...

2016-09-03 Thread SparkQA
Github user SparkQA commented on the issue:

https://github.com/apache/spark/pull/14950
  
**[Test build #64902 has 
started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/64902/consoleFull)**
 for PR 14950 at commit 
[`dc44bb9`](https://github.com/apache/spark/commit/dc44bb974f3969ab70032e947dd9e06b87e317db).





[GitHub] spark pull request #14950: [SPARK-17390][ML][MLLib] Optimize MultivariantOnl...

2016-09-03 Thread WeichenXu123
GitHub user WeichenXu123 opened a pull request:

https://github.com/apache/spark/pull/14950

[SPARK-17390][ML][MLLib] Optimize MultivariantOnlineSummerizer by making 
the summarized target configurable

## What changes were proposed in this pull request?

Add a mask parameter to the `MultivariantOnlineSummerizer` constructor.
It can currently combine the following values:
meanMask
varianceMask
minMask
maxMask
numNonZerosMask

so that we can configure the summarized targets in the following way:
`new MultivariantOnlineSummerizer(meanMask|varianceMask)`
This means the summarizer will compute only the mean and variance.
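
As a rough illustration of how such a mask could work, here is a minimal sketch in plain Scala. The flag values and the `MaskedSummarizer` class are assumptions made for this example; the PR's actual constants and constructor may differ.

```scala
object SummarizerMaskSketch {
  // Hypothetical flag values: one bit per statistic.
  val meanMask: Int = 1 << 0
  val varianceMask: Int = 1 << 1
  val minMask: Int = 1 << 2
  val maxMask: Int = 1 << 3
  val numNonZerosMask: Int = 1 << 4

  // Stand-in for the summarizer: it only records which statistics were requested.
  final class MaskedSummarizer(mask: Int) {
    def computes(flag: Int): Boolean = (mask & flag) != 0
  }

  def main(args: Array[String]): Unit = {
    val summarizer = new MaskedSummarizer(meanMask | varianceMask)
    println(summarizer.computes(meanMask)) // true
    println(summarizer.computes(maxMask))  // false
  }
}
```

With one bit per statistic, the summarizer can skip the bookkeeping for any statistic whose bit is not set.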

## How was this patch tested?

unit test added.


You can merge this pull request into a Git repository by running:

$ git pull https://github.com/WeichenXu123/spark optimize_MultivariantOnlineSummerizer

Alternatively you can review and apply these changes as the patch at:

https://github.com/apache/spark/pull/14950.patch

To close this pull request, make a commit to your master/trunk branch
with (at least) the following in the commit message:

This closes #14950


commit 834bce992bf89e6e2e5886e6ab9e34c3752d8af4
Author: WeichenXu 
Date:   2016-09-01T01:41:03Z

update.

commit dc44bb974f3969ab70032e947dd9e06b87e317db
Author: WeichenXu 
Date:   2016-09-02T19:56:25Z

update







[GitHub] spark issue #14444: [SPARK-16839] [SQL] redundant aliases after cleanupAlias...

2016-09-03 Thread eyalfa
Github user eyalfa commented on the issue:

https://github.com/apache/spark/pull/1
  
Can you suggest another approach to prevent duplication of the code that
constructs the 'named' arguments?

On Sep 3, 2016 19:16, "Herman van Hovell"  wrote:

> @eyalfa there are IMO still two more things
> to do:
>
>1. Get rid of the CreateStructLikeFactory; that is way too complicated
>for what you are trying to do.
>2. Touch up the tests in order to be more consistent with the rest of
>Spark.
>






[GitHub] spark issue #14302: [SPARK-16663][SQL] desc table should be consistent betwe...

2016-09-03 Thread cloud-fan
Github user cloud-fan commented on the issue:

https://github.com/apache/spark/pull/14302
  
backported to 2.0





[GitHub] spark issue #14444: [SPARK-16839] [SQL] redundant aliases after cleanupAlias...

2016-09-03 Thread hvanhovell
Github user hvanhovell commented on the issue:

https://github.com/apache/spark/pull/1
  
@eyalfa there are IMO still two more things to do:

1. Get rid of the `CreateStructLikeFactory`; that is way too complicated for 
what you are trying to do.
2. Touch up the tests in order to be more consistent with the rest of Spark.





[GitHub] spark pull request #14864: [SPARK-15453] [SQL] FileSourceScanExec to extract...

2016-09-03 Thread tejasapatil
Github user tejasapatil commented on a diff in the pull request:

https://github.com/apache/spark/pull/14864#discussion_r77438866
  
--- Diff: sql/core/src/main/scala/org/apache/spark/sql/execution/DataSourceScanExec.scala ---
@@ -156,24 +156,57 @@ case class FileSourceScanExec(
     false
   }
 
-  override val outputPartitioning: Partitioning = {
+  @transient private lazy val selectedPartitions = relation.location.listFiles(partitionFilters)
+
+  override val (outputPartitioning, outputOrdering): (Partitioning, Seq[SortOrder]) = {
     val bucketSpec = if (relation.sparkSession.sessionState.conf.bucketingEnabled) {
       relation.bucketSpec
     } else {
       None
     }
-    bucketSpec.map { spec =>
-      val numBuckets = spec.numBuckets
-      val bucketColumns = spec.bucketColumnNames.flatMap { n =>
-        output.find(_.name == n)
-      }
-      if (bucketColumns.size == spec.bucketColumnNames.size) {
-        HashPartitioning(bucketColumns, numBuckets)
-      } else {
-        UnknownPartitioning(0)
-      }
-    }.getOrElse {
-      UnknownPartitioning(0)
+    bucketSpec match {
+      case Some(spec) =>
+        val numBuckets = spec.numBuckets
+        val bucketColumns = spec.bucketColumnNames.flatMap { n =>
+          output.find(_.name == n)
+        }
+        if (bucketColumns.size == spec.bucketColumnNames.size) {
+          val partitioning = HashPartitioning(bucketColumns, numBuckets)
+
+          val sortOrder = if (spec.sortColumnNames.nonEmpty) {
+            // In case of bucketing, its possible to have multiple files belonging to the
+            // same bucket in a given relation. Each of these files are locally sorted
+            // but those files combined together are not globally sorted. Given that,
+            // the RDD partition will not be sorted even if the relation has sort columns set
+            // Current solution is to check if all the buckets have a single file in it
+
+            val files = selectedPartitions.flatMap(partition => partition.files)
+            val bucketToFilesGrouping =
+              files.map(_.getPath.getName).groupBy(file => BucketingUtils.getBucketId(file))
+            val singleFilePartitions = bucketToFilesGrouping.forall(p => p._2.length <= 1)
+
+            if (singleFilePartitions) {
+              def toAttribute(colName: String): Attribute =
+                output.find(_.name == colName).getOrElse {
--- End diff --

@cloud-fan: Sure, I made this change.

I am throwing an exception because the end user should know that there is 
something wrong with the table metadata and that they need to look into it.
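
For context, the check in the quoted diff (trust the sort order only when every bucket has at most one file) can be sketched outside Spark. This is a rough illustration only: `bucketIdOf` and the file-name pattern are assumptions standing in for `BucketingUtils.getBucketId`, not its actual implementation.

```scala
object SingleFileBucketSketch {
  // Assumed naming scheme: the bucket id is the run of digits after the
  // last '_', optionally followed by a file extension.
  private val BucketedName = """.*_(\d+)(?:\..*)?$""".r

  // Hypothetical helper: extracts the bucket id from a file name, if any.
  def bucketIdOf(fileName: String): Option[Int] = fileName match {
    case BucketedName(id) => Some(id.toInt)
    case _ => None
  }

  // True when no bucket id maps to more than one file. Files without a
  // recognizable bucket id are grouped together under None.
  def singleFilePerBucket(fileNames: Seq[String]): Boolean =
    fileNames.groupBy(bucketIdOf).forall { case (_, files) => files.length <= 1 }
}
```

When two files share a bucket id, each file may be sorted on its own, but their concatenation is not, which is why the diff falls back to no ordering in that case.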





[GitHub] spark issue #14864: [SPARK-15453] [SQL] FileSourceScanExec to extract `outpu...

2016-09-03 Thread SparkQA
Github user SparkQA commented on the issue:

https://github.com/apache/spark/pull/14864
  
**[Test build #64901 has 
started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/64901/consoleFull)**
 for PR 14864 at commit 
[`568b742`](https://github.com/apache/spark/commit/568b742a7087e39c39a47caac300aad74174ec7d).





[GitHub] spark issue #14550: [SPARK-16959] [SQL] Rebuild Table Comment when Retrievin...

2016-09-03 Thread cloud-fan
Github user cloud-fan commented on the issue:

https://github.com/apache/spark/pull/14550
  
backported to 2.0





[GitHub] spark issue #14944: [SPARK-16334][BACKPORT] Reusing same dictionary column f...

2016-09-03 Thread gatorsmile
Github user gatorsmile commented on the issue:

https://github.com/apache/spark/pull/14944
  
Yeah, I also hit it. 





[GitHub] spark issue #14946: [SPARK-17353] [SPARK-16943] [SPARK-16942] [SPARK-16959] ...

2016-09-03 Thread gatorsmile
Github user gatorsmile commented on the issue:

https://github.com/apache/spark/pull/14946
  
It sounds like all the 2.0 builds failed the same test case. 


https://amplab.cs.berkeley.edu/jenkins/job/spark-branch-2.0-test-sbt-hadoop-2.3/
 

Let me try to fix it.





[GitHub] spark pull request #14864: [SPARK-15453] [SQL] FileSourceScanExec to extract...

2016-09-03 Thread cloud-fan
Github user cloud-fan commented on a diff in the pull request:

https://github.com/apache/spark/pull/14864#discussion_r77438506
  
--- Diff: sql/core/src/main/scala/org/apache/spark/sql/execution/DataSourceScanExec.scala ---
@@ -156,24 +156,57 @@ case class FileSourceScanExec(
     false
   }
 
-  override val outputPartitioning: Partitioning = {
+  @transient private lazy val selectedPartitions = relation.location.listFiles(partitionFilters)
+
+  override val (outputPartitioning, outputOrdering): (Partitioning, Seq[SortOrder]) = {
     val bucketSpec = if (relation.sparkSession.sessionState.conf.bucketingEnabled) {
       relation.bucketSpec
     } else {
       None
     }
-    bucketSpec.map { spec =>
-      val numBuckets = spec.numBuckets
-      val bucketColumns = spec.bucketColumnNames.flatMap { n =>
-        output.find(_.name == n)
-      }
-      if (bucketColumns.size == spec.bucketColumnNames.size) {
-        HashPartitioning(bucketColumns, numBuckets)
-      } else {
-        UnknownPartitioning(0)
-      }
-    }.getOrElse {
-      UnknownPartitioning(0)
+    bucketSpec match {
+      case Some(spec) =>
+        val numBuckets = spec.numBuckets
+        val bucketColumns = spec.bucketColumnNames.flatMap { n =>
+          output.find(_.name == n)
+        }
+        if (bucketColumns.size == spec.bucketColumnNames.size) {
+          val partitioning = HashPartitioning(bucketColumns, numBuckets)
+
+          val sortOrder = if (spec.sortColumnNames.nonEmpty) {
+            // In case of bucketing, its possible to have multiple files belonging to the
+            // same bucket in a given relation. Each of these files are locally sorted
+            // but those files combined together are not globally sorted. Given that,
+            // the RDD partition will not be sorted even if the relation has sort columns set
+            // Current solution is to check if all the buckets have a single file in it
+
+            val files = selectedPartitions.flatMap(partition => partition.files)
+            val bucketToFilesGrouping =
+              files.map(_.getPath.getName).groupBy(file => BucketingUtils.getBucketId(file))
+            val singleFilePartitions = bucketToFilesGrouping.forall(p => p._2.length <= 1)
+
+            if (singleFilePartitions) {
+              def toAttribute(colName: String): Attribute =
+                output.find(_.name == colName).getOrElse {
--- End diff --

Should we handle the sort columns the same way we handle the bucket columns? i.e.
```
val bucketColumns = spec.bucketColumnNames.flatMap { n =>
  output.find(_.name == n)
}
if (bucketColumns.size == spec.bucketColumnNames.size) {
```

If the required output doesn't contain the sort columns, should we just ignore 
the sorting, or throw an exception?
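
The two options in this question can be contrasted in a small sketch. It is plain Scala with no Spark dependency: `Attr`, `resolveOrIgnore`, and `resolveOrThrow` are hypothetical names, and `IllegalArgumentException` stands in for Spark's `AnalysisException`.

```scala
object ColumnLookupSketch {
  // Hypothetical stand-in for an output attribute.
  final case class Attr(name: String)

  // Option A: return None unless every requested column is present,
  // mirroring the size check used for the bucket columns.
  def resolveOrIgnore(output: Seq[Attr], names: Seq[String]): Option[Seq[Attr]] = {
    val found = names.flatMap(n => output.find(_.name == n))
    if (found.size == names.size) Some(found) else None
  }

  // Option B: fail loudly on the first missing column.
  def resolveOrThrow(output: Seq[Attr], names: Seq[String]): Seq[Attr] =
    names.map { n =>
      output.find(_.name == n).getOrElse(
        throw new IllegalArgumentException(s"Could not find column $n"))
    }
}
```

Option A quietly drops the ordering when a sort column is projected away, which is a normal situation for a query that selects a subset of columns; Option B treats the same situation as an error.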





[GitHub] spark issue #14550: [SPARK-16959] [SQL] Rebuild Table Comment when Retrievin...

2016-09-03 Thread cloud-fan
Github user cloud-fan commented on the issue:

https://github.com/apache/spark/pull/14550
  
yea, go ahead :)





[GitHub] spark issue #14949: [SPARK-17057] [ML] ProbabilisticClassifierModels' predic...

2016-09-03 Thread AmplabJenkins
Github user AmplabJenkins commented on the issue:

https://github.com/apache/spark/pull/14949
  
Merged build finished. Test FAILed.





[GitHub] spark issue #14949: [SPARK-17057] [ML] ProbabilisticClassifierModels' predic...

2016-09-03 Thread AmplabJenkins
Github user AmplabJenkins commented on the issue:

https://github.com/apache/spark/pull/14949
  
Test FAILed.
Refer to this link for build results (access rights to CI server needed): 
https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/64900/
Test FAILed.





[GitHub] spark issue #14949: [SPARK-17057] [ML] ProbabilisticClassifierModels' predic...

2016-09-03 Thread SparkQA
Github user SparkQA commented on the issue:

https://github.com/apache/spark/pull/14949
  
**[Test build #64900 has 
finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/64900/consoleFull)**
 for PR 14949 at commit 
[`08dbe43`](https://github.com/apache/spark/commit/08dbe43bbf961186ee94432a13f9f6cfc221e4a6).
 * This patch **fails Spark unit tests**.
 * This patch merges cleanly.
 * This patch adds no public classes.





[GitHub] spark issue #14864: [SPARK-15453] [SQL] FileSourceScanExec to extract `outpu...

2016-09-03 Thread AmplabJenkins
Github user AmplabJenkins commented on the issue:

https://github.com/apache/spark/pull/14864
  
Merged build finished. Test PASSed.





[GitHub] spark issue #14864: [SPARK-15453] [SQL] FileSourceScanExec to extract `outpu...

2016-09-03 Thread AmplabJenkins
Github user AmplabJenkins commented on the issue:

https://github.com/apache/spark/pull/14864
  
Test PASSed.
Refer to this link for build results (access rights to CI server needed): 
https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/64898/
Test PASSed.





[GitHub] spark issue #14948: [SPARK-17389] [SPARK-3261] [MLLIB] Significant KMeans sp...

2016-09-03 Thread AmplabJenkins
Github user AmplabJenkins commented on the issue:

https://github.com/apache/spark/pull/14948
  
Merged build finished. Test PASSed.





[GitHub] spark issue #14948: [SPARK-17389] [SPARK-3261] [MLLIB] Significant KMeans sp...

2016-09-03 Thread AmplabJenkins
Github user AmplabJenkins commented on the issue:

https://github.com/apache/spark/pull/14948
  
Test PASSed.
Refer to this link for build results (access rights to CI server needed): 
https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/64899/
Test PASSed.





[GitHub] spark issue #14864: [SPARK-15453] [SQL] FileSourceScanExec to extract `outpu...

2016-09-03 Thread SparkQA
Github user SparkQA commented on the issue:

https://github.com/apache/spark/pull/14864
  
**[Test build #64898 has 
finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/64898/consoleFull)**
 for PR 14864 at commit 
[`c60afd6`](https://github.com/apache/spark/commit/c60afd6e79aa4cf07a3c364ab2a9185c813f3f43).
 * This patch passes all tests.
 * This patch merges cleanly.
 * This patch adds no public classes.





[GitHub] spark issue #14948: [SPARK-17389] [SPARK-3261] [MLLIB] Significant KMeans sp...

2016-09-03 Thread SparkQA
Github user SparkQA commented on the issue:

https://github.com/apache/spark/pull/14948
  
**[Test build #64899 has 
finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/64899/consoleFull)**
 for PR 14948 at commit 
[`e7f12fa`](https://github.com/apache/spark/commit/e7f12fa3e1d3273f558f90455c6c5be8e6a9c8f6).
 * This patch passes all tests.
 * This patch merges cleanly.
 * This patch adds no public classes.





[GitHub] spark pull request #13767: [MINOR][SQL] Not dropping all necessary tables

2016-09-03 Thread asfgit
Github user asfgit closed the pull request at:

https://github.com/apache/spark/pull/13767





[GitHub] spark issue #13767: [MINOR][SQL] Not dropping all necessary tables

2016-09-03 Thread srowen
Github user srowen commented on the issue:

https://github.com/apache/spark/pull/13767
  
Merged to master/2.0 as that was obviously the intent of the test cleanup 
logic





[GitHub] spark issue #14949: [SPARK-17057] [ML] ProbabilisticClassifierModels' predic...

2016-09-03 Thread SparkQA
Github user SparkQA commented on the issue:

https://github.com/apache/spark/pull/14949
  
**[Test build #64900 has 
started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/64900/consoleFull)**
 for PR 14949 at commit 
[`08dbe43`](https://github.com/apache/spark/commit/08dbe43bbf961186ee94432a13f9f6cfc221e4a6).





[GitHub] spark issue #14577: [SPARK-16986][WEB UI] Make 'Started' time, 'Completed' t...

2016-09-03 Thread srowen
Github user srowen commented on the issue:

https://github.com/apache/spark/pull/14577
  
I think we should not make this particular change and can close this PR. 
It's possible another similar change could be OK.





[GitHub] spark pull request #14949: [SPARK-17057] [ML] ProbabilisticClassifierModels'...

2016-09-03 Thread srowen
GitHub user srowen opened a pull request:

https://github.com/apache/spark/pull/14949

[SPARK-17057] [ML] ProbabilisticClassifierModels' prediction more 
reasonable with multi zero thresholds

## What changes were proposed in this pull request?

See related discussion at https://github.com/apache/spark/pull/14643

This actually changes more than what the original JIRA encompassed, but 
does propose a more reasonable (?) and deterministic result in this and other 
corner cases.

Revise semantics of ProbabilisticClassifierModel thresholds so that classes 
can only be predicted if they exceed their threshold (meaning no class may be 
predicted), and otherwise ordering by highest probability, then lowest 
threshold, then by class index.
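
To make the proposed semantics concrete, here is a minimal sketch of the selection rule as described in the paragraph above. It is not the code in this PR: the `predict` signature is hypothetical, and it assumes "exceed" means strictly greater than the threshold.

```scala
object ThresholdPredictionSketch {
  // Returns the predicted class index, or None when no class exceeds its threshold.
  def predict(probabilities: Array[Double], thresholds: Array[Double]): Option[Int] = {
    require(probabilities.length == thresholds.length, "one threshold per class")

    // Only classes whose probability exceeds their threshold are eligible.
    val candidates = probabilities.indices.filter(i => probabilities(i) > thresholds(i))

    // Tie-breaking: highest probability, then lowest threshold, then lowest class index.
    def better(a: Int, b: Int): Boolean =
      if (probabilities(a) != probabilities(b)) probabilities(a) > probabilities(b)
      else if (thresholds(a) != thresholds(b)) thresholds(a) < thresholds(b)
      else a < b

    if (candidates.isEmpty) None
    else Some(candidates.reduceLeft((a, b) => if (better(b, a)) b else a))
  }
}
```

For example, under these assumptions `predict(Array(0.5, 0.5), Array(0.2, 0.1))` picks class 1 (equal probabilities, lower threshold), and `predict(Array(0.3, 0.7), Array(0.9, 0.9))` predicts no class.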

## How was this patch tested?

Existing and new unit tests.

You can merge this pull request into a Git repository by running:

$ git pull https://github.com/srowen/spark SPARK-17057.2

Alternatively you can review and apply these changes as the patch at:

https://github.com/apache/spark/pull/14949.patch

To close this pull request, make a commit to your master/trunk branch
with (at least) the following in the commit message:

This closes #14949


commit 08dbe43bbf961186ee94432a13f9f6cfc221e4a6
Author: Sean Owen 
Date:   2016-09-03T14:26:26Z

Revise semantics of ProbabilisticClassifierModel thresholds so that classes 
can only be predicted if they exceed their threshold (meaning no class may be 
predicted), and otherwise ordering by highest probability, then lowest 
threshold, then by class index







[GitHub] spark issue #14948: [SPARK-17389] [SPARK-3261] [MLLIB] Significant KMeans sp...

2016-09-03 Thread SparkQA
Github user SparkQA commented on the issue:

https://github.com/apache/spark/pull/14948
  
**[Test build #64899 has 
started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/64899/consoleFull)**
 for PR 14948 at commit 
[`e7f12fa`](https://github.com/apache/spark/commit/e7f12fa3e1d3273f558f90455c6c5be8e6a9c8f6).





[GitHub] spark issue #14948: [SPARK-17389] [SPARK-3261] [MLLIB] Significant KMeans sp...

2016-09-03 Thread srowen
Github user srowen commented on the issue:

https://github.com/apache/spark/pull/14948
  
Note this also now resolves SPARK-3261. This change already means that with 
k-means|| init, fewer than k cluster centers may be returned, which is probably 
correct (and faster). Now random init will also return no duplicate centers, 
and thus < k clusters when the input has size < k.
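
The behavior described for random init can be illustrated with a small local sketch. This is an assumption-laden stand-in that works on an in-memory collection, not the RDD-based code in the PR; `initRandom` and its parameters are hypothetical.

```scala
import scala.util.Random

object DistinctCentersSketch {
  // Picks up to k distinct points at random. Returns fewer than k centers
  // when the input has fewer than k distinct points.
  def initRandom(points: Seq[Vector[Double]], k: Int, seed: Long): Seq[Vector[Double]] =
    new Random(seed).shuffle(points.distinct).take(k)
}
```

With three input points of which two are identical and `k = 5`, this returns two centers rather than padding the result with duplicates.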





[GitHub] spark issue #14864: [SPARK-15453] [SQL] FileSourceScanExec to extract `outpu...

2016-09-03 Thread SparkQA
Github user SparkQA commented on the issue:

https://github.com/apache/spark/pull/14864
  
**[Test build #64898 has 
started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/64898/consoleFull)**
 for PR 14864 at commit 
[`c60afd6`](https://github.com/apache/spark/commit/c60afd6e79aa4cf07a3c364ab2a9185c813f3f43).





[GitHub] spark issue #14864: [SPARK-15453] [SQL] FileSourceScanExec to extract `outpu...

2016-09-03 Thread tejasapatil
Github user tejasapatil commented on the issue:

https://github.com/apache/spark/pull/14864
  
@cloud-fan : Thanks !! Did the change.

Jenkins, test this please





[GitHub] spark issue #13767: [MINOR][SQL] Not dropping all necessary tables

2016-09-03 Thread AmplabJenkins
Github user AmplabJenkins commented on the issue:

https://github.com/apache/spark/pull/13767
  
Test PASSed.
Refer to this link for build results (access rights to CI server needed): 
https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/64897/
Test PASSed.





[GitHub] spark issue #13767: [MINOR][SQL] Not dropping all necessary tables

2016-09-03 Thread AmplabJenkins
Github user AmplabJenkins commented on the issue:

https://github.com/apache/spark/pull/13767
  
Merged build finished. Test PASSed.





[GitHub] spark issue #13767: [MINOR][SQL] Not dropping all necessary tables

2016-09-03 Thread SparkQA
Github user SparkQA commented on the issue:

https://github.com/apache/spark/pull/13767
  
**[Test build #64897 has 
finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/64897/consoleFull)**
 for PR 13767 at commit 
[`a2bab62`](https://github.com/apache/spark/commit/a2bab62abf9de24b4f09f1c3a31bcc468f1af8a4).
 * This patch passes all tests.
 * This patch merges cleanly.
 * This patch adds no public classes.





[GitHub] spark issue #14444: [SPARK-16839] [SQL] redundant aliases after cleanupAlias...

2016-09-03 Thread eyalfa
Github user eyalfa commented on the issue:

https://github.com/apache/spark/pull/1
  
@cloud-fan, @hvanhovell, can you please review this? It's been waiting 
for some time now.




