date:20170411

[GitHub] spark pull request #16774: [SPARK-19357][ML] Adding parallel model evaluatio...

2017-04-11 Thread BryanCutler

Github user BryanCutler commented on a diff in the pull request:

https://github.com/apache/spark/pull/16774#discussion_r110979877
  
--- Diff: mllib/src/main/scala/org/apache/spark/ml/tuning/ValidatorParams.scala ---
@@ -67,6 +71,39 @@ private[ml] trait ValidatorParams extends HasSeed with Params {
   /** @group getParam */
   def getEvaluator: Evaluator = $(evaluator)
 
+  /**
+   * param to control the number of models evaluated in parallel
+   * Default: 1
+   *
+   * @group param
+   */
+  val numParallelEval: IntParam = new IntParam(this, "numParallelEval",
+"max number of models to evaluate in parallel, 1 for serial evaluation",
+ParamValidators.gtEq(1))
+
+  /** @group getParam */
+  def getNumParallelEval: Int = $(numParallelEval)
+
+  /**
+   * Creates a execution service to be used for validation, defaults to a thread-pool with
+   * size of `numParallelEval`
+   */
+  protected var executorServiceFactory: (Int) => ExecutorService = {
+(requestedMaxThreads: Int) => ThreadUtils.newDaemonCachedThreadPool(
--- End diff --

So my thinking was that if the thread calling fit is terminated, it would 
have to be because the JVM is shutting down, which would exit without waiting 
for these daemon threads. We don't really care at what point the daemon 
threads stop or whether they stop abruptly, since any unfinished work is 
useless. So I'm not sure adding a shutdownHook would do anything different.

On the other hand, if the SparkSession wanted to cancel the running threads 
with the JVM still running, I think it could do that if it provided its own 
ExecutorService.
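
To make the two cases above concrete, here is a minimal Java sketch. It is not code from this pull request: `DaemonPoolSketch`, `daemonFactory`, `newDaemonPool` and `cancelRunningWork` are hypothetical names, and a plain fixed-size pool stands in for whatever the default factory builds. It shows that daemon threads are created so they never block JVM exit, and that a caller who owns the `ExecutorService` can cancel in-flight work with `shutdownNow()` while the JVM keeps running.

```java
import java.util.concurrent.CountDownLatch;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.ThreadFactory;
import java.util.concurrent.TimeUnit;
import java.util.concurrent.atomic.AtomicBoolean;

// Illustrative only: DaemonPoolSketch and its methods are hypothetical names,
// not part of Spark or of this pull request.
final class DaemonPoolSketch {

  // Threads created here are daemons, so they never keep the JVM alive on exit.
  static ThreadFactory daemonFactory(String prefix) {
    return runnable -> {
      Thread thread = new Thread(runnable, prefix + "-worker");
      thread.setDaemon(true);
      return thread;
    };
  }

  // Stand-in for the default factory: a bounded pool of daemon threads.
  static ExecutorService newDaemonPool(int maxThreads) {
    return Executors.newFixedThreadPool(maxThreads, daemonFactory("validator"));
  }

  // A caller that owns the pool can cancel in-flight work while the JVM keeps
  // running. Returns true if the worker observed the interrupt.
  static boolean cancelRunningWork() {
    ExecutorService pool = newDaemonPool(2);
    AtomicBoolean interrupted = new AtomicBoolean(false);
    CountDownLatch started = new CountDownLatch(1);
    pool.submit(() -> {
      started.countDown();
      try {
        Thread.sleep(60_000); // placeholder for a long model fit
      } catch (InterruptedException e) {
        interrupted.set(true); // the cancellation reached the worker
      }
    });
    try {
      started.await();    // make sure the task is actually running
      pool.shutdownNow(); // interrupts running tasks, drops queued ones
      pool.awaitTermination(5, TimeUnit.SECONDS);
    } catch (InterruptedException e) {
      Thread.currentThread().interrupt();
    }
    return interrupted.get();
  }
}
```

A shutdown hook would only run when the JVM is already exiting, at which point the daemon threads are abandoned anyway; explicit cancellation needs a handle on the pool, as in `cancelRunningWork`.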


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastruct...@apache.org or file a JIRA ticket
with INFRA.
---

-
To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org
For additional commands, e-mail: reviews-h...@spark.apache.org

[GitHub] spark pull request #17606: [SPARK-20291][SQL] NaNvl(FloatType, NullType) sho...

2017-04-11 Thread dbtsai

Github user dbtsai commented on a diff in the pull request:

https://github.com/apache/spark/pull/17606#discussion_r110979361
  
--- Diff: sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/analysis/TypeCoercion.scala ---
@@ -571,6 +571,7 @@ object TypeCoercion {
 NaNvl(l, Cast(r, DoubleType))
   case NaNvl(l, r) if l.dataType == FloatType && r.dataType == DoubleType =>
 NaNvl(Cast(l, DoubleType), r)
+  case NaNvl(l, r) if r.dataType == NullType => NaNvl(l, Cast(r, l.dataType))
--- End diff --

Yeah, this PR prevents casting from `NaNvl(FloatType, NullType)` to 
`NaNvl(DoubleType, DoubleType)` since we want to minimize the casting as much 
as possible. Also, if we want to replace `NaN` by `null`, we want to keep the 
output type the same as input type.

Whether `NaNvl(FloatType, DoubleType)` should be cast into 
`NaNvl(DoubleType, DoubleType)` is another story. I agree with you, we should 
downcast the replacement `DoubleType` into `FloatType`. And in my opinion, 
doing this implicit casting is error-prone, and we should require users to 
cast explicitly instead. 

@gatorsmile maybe you can chime in and give feedback on whether we should 
cast `NaNvl(FloatType, DoubleType)` to `NaNvl(DoubleType, DoubleType)`.
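
As a rough illustration of the two rules visible in the diff, the following Java sketch models only the resulting operand types. It is a toy, not Catalyst code: `NaNvlCoercionSketch`, its `DataType` enum and `coerce` are hypothetical names, and any rule not shown in the diff is left out.

```java
// Toy model of the two coercion rules visible in the diff above.
// NaNvlCoercionSketch, DataType and coerce are hypothetical names.
final class NaNvlCoercionSketch {
  enum DataType { FLOAT, DOUBLE, NULL }

  // Returns {leftType, rightType} after the implicit casts are applied.
  static DataType[] coerce(DataType left, DataType right) {
    // Existing rule: NaNvl(Float, Double) widens the left side to Double.
    if (left == DataType.FLOAT && right == DataType.DOUBLE) {
      return new DataType[] { DataType.DOUBLE, right };
    }
    // Rule added by this PR: a null replacement takes the left side's type,
    // so the output type stays the same as the input type.
    if (right == DataType.NULL) {
      return new DataType[] { left, left };
    }
    // Anything else is left untouched in this sketch.
    return new DataType[] { left, right };
  }
}
```

With the new rule, `coerce(FLOAT, NULL)` yields `{FLOAT, FLOAT}`, so no widening to `DOUBLE` happens; the open question in the comment is whether the first rule should instead narrow the replacement.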



[GitHub] spark issue #17330: [SPARK-19993][SQL] Caching logical plans containing subq...

2017-04-11 Thread SparkQA

Github user SparkQA commented on the issue:

https://github.com/apache/spark/pull/17330
  
**[Test build #75710 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/75710/testReport)** for PR 17330 at commit [`362d62f`](https://github.com/apache/spark/commit/362d62ff393954d37d76ac55636d50ee0b4ffcb5).



[GitHub] spark pull request #17330: [SPARK-19993][SQL] Caching logical plans containi...

2017-04-11 Thread dilipbiswal

Github user dilipbiswal commented on a diff in the pull request:

https://github.com/apache/spark/pull/17330#discussion_r110977254
  
--- Diff: sql/core/src/test/scala/org/apache/spark/sql/CachedTableSuite.scala ---
@@ -670,4 +677,139 @@ class CachedTableSuite extends QueryTest with SQLTestUtils with SharedSQLContext
   assert(spark.read.parquet(path).filter($"id" > 4).count() == 15)
 }
   }
+
+  test("SPARK-19993 simple subquery caching") {
+withTempView("t1", "t2") {
+  Seq(1).toDF("c1").createOrReplaceTempView("t1")
+  Seq(1).toDF("c1").createOrReplaceTempView("t2")
--- End diff --

@cloud-fan Sorry... I actually had some of these tests combined, and when I 
split them I forgot to remove this. Will fix it.



[GitHub] spark pull request #17330: [SPARK-19993][SQL] Caching logical plans containi...

2017-04-11 Thread dilipbiswal

Github user dilipbiswal commented on a diff in the pull request:

https://github.com/apache/spark/pull/17330#discussion_r110977330
  
--- Diff: sql/core/src/test/scala/org/apache/spark/sql/CachedTableSuite.scala ---
@@ -670,4 +677,139 @@ class CachedTableSuite extends QueryTest with SQLTestUtils with SharedSQLContext
   assert(spark.read.parquet(path).filter($"id" > 4).count() == 15)
 }
   }
+
+  test("SPARK-19993 simple subquery caching") {
+withTempView("t1", "t2") {
+  Seq(1).toDF("c1").createOrReplaceTempView("t1")
+  Seq(1).toDF("c1").createOrReplaceTempView("t2")
+
+  sql(
+"""
+  |SELECT * FROM t1
+  |WHERE
+  |NOT EXISTS (SELECT * FROM t1)
+""".stripMargin).cache()
+
+  val cachedDs =
+sql(
+  """
+|SELECT * FROM t1
+|WHERE
+|NOT EXISTS (SELECT * FROM t1)
+  """.stripMargin)
+  assert(getNumInMemoryRelations(cachedDs) == 1)
+
+  // Additional predicate in the subquery plan should cause a cache miss
+  val cachedMissDs =
+  sql(
+"""
+  |SELECT * FROM t1
+  |WHERE
+  |NOT EXISTS (SELECT * FROM t1 where c1 = 0)
+""".stripMargin)
+  assert(getNumInMemoryRelations(cachedMissDs) == 0)
+}
+  }
+
+  test("SPARK-19993 subquery caching with correlated predicates") {
+withTempView("t1", "t2") {
+  Seq(1).toDF("c1").createOrReplaceTempView("t1")
+  Seq(1).toDF("c1").createOrReplaceTempView("t2")
+
+  // Simple correlated predicate in subquery
+  sql(
+"""
+  |SELECT * FROM t1
+  |WHERE
+  |t1.c1 in (SELECT t2.c1 FROM t2 where t1.c1 = t2.c1)
+""".stripMargin).cache()
+
+  val cachedDs =
+sql(
+  """
+|SELECT * FROM t1
+|WHERE
+|t1.c1 in (SELECT t2.c1 FROM t2 where t1.c1 = t2.c1)
+  """.stripMargin)
+  assert(getNumInMemoryRelations(cachedDs) == 1)
+}
+  }
+
+  test("SPARK-19993 subquery with cached underlying relation") {
+withTempView("t1", "t2") {
+  Seq(1).toDF("c1").createOrReplaceTempView("t1")
+  Seq(1).toDF("c1").createOrReplaceTempView("t2")
--- End diff --

@cloud-fan Sorry... I actually had some of these tests combined, and when I 
split them I forgot to remove this. Will fix it.



[GitHub] spark issue #17609: [SPARK-20296][TRIVIAL][DOCS] Count distinct error messag...

2017-04-11 Thread AmplabJenkins

Github user AmplabJenkins commented on the issue:

https://github.com/apache/spark/pull/17609
  
Can one of the admins verify this patch?



[GitHub] spark pull request #17609: [SPARK-20296][TRIVIAL][DOCS] Count distinct error...

2017-04-11 Thread jtoka

GitHub user jtoka opened a pull request:

https://github.com/apache/spark/pull/17609

[SPARK-20296][TRIVIAL][DOCS] Count distinct error message for streaming

## What changes were proposed in this pull request?
Update count distinct error message for streaming datasets/dataframes to 
match current behavior. These aggregations are not yet supported, regardless of 
whether the dataset/dataframe is aggregated.

You can merge this pull request into a Git repository by running:

$ git pull https://github.com/jtoka/spark master

Alternatively you can review and apply these changes as the patch at:

https://github.com/apache/spark/pull/17609.patch

To close this pull request, make a commit to your master/trunk branch
with (at least) the following in the commit message:

This closes #17609


commit a4d34c5bcfe53ef05c56f8ce6838bbcda30c9f7e
Author: jtoka 
Date:   2017-04-11T18:13:55Z

Count distinct error message

Update count distinct error message for streaming datasets/dataframes to 
match current behavior. These aggregations are not yet supported.





[GitHub] spark pull request #17330: [SPARK-19993][SQL] Caching logical plans containi...

2017-04-11 Thread dilipbiswal

Github user dilipbiswal commented on a diff in the pull request:

https://github.com/apache/spark/pull/17330#discussion_r110976069
  
--- Diff: sql/core/src/test/scala/org/apache/spark/sql/CachedTableSuite.scala ---
@@ -76,6 +76,13 @@ class CachedTableSuite extends QueryTest with SQLTestUtils with SharedSQLContext
 sum
   }
 
+  private def getNumInMemoryTableScanExecs(plan: SparkPlan): Int = {
--- End diff --

@cloud-fan So we are operating at the physical plan level in this method, 
whereas the other method, getNumInMemoryRelations, operates at the logical 
plan level. Here we are simply counting the InMemoryTableScanExec nodes 
in the plan. I have changed the function name to 
getNumInMemoryTablesRecursively. Does it look OK to you?
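
The counting described above amounts to a recursive walk over the plan tree. A minimal Java sketch of that idea follows; `PlanCountSketch`, `Node` and `countNodes` are hypothetical stand-ins and do not reproduce Spark's `SparkPlan` API.

```java
import java.util.Arrays;
import java.util.List;

// Toy plan tree; PlanCountSketch, Node and countNodes are hypothetical names.
final class PlanCountSketch {

  static final class Node {
    final String name;
    final List<Node> children;

    Node(String name, Node... children) {
      this.name = name;
      this.children = Arrays.asList(children);
    }
  }

  // Counts every node in the tree whose name equals targetName.
  static int countNodes(Node plan, String targetName) {
    int count = plan.name.equals(targetName) ? 1 : 0;
    for (Node child : plan.children) {
      count += countNodes(child, targetName); // recurse into the subtree
    }
    return count;
  }
}
```

For example, a `Filter` over a `Join` whose two inputs are both `InMemoryTableScanExec` leaves would give a count of 2 for that name.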



[GitHub] spark pull request #17604: [SPARK-20289][SQL] Use StaticInvoke to box primit...

2017-04-11 Thread asfgit

Github user asfgit closed the pull request at:

https://github.com/apache/spark/pull/17604



[GitHub] spark issue #17604: [SPARK-20289][SQL] Use StaticInvoke to box primitive typ...

2017-04-11 Thread rxin

Github user rxin commented on the issue:

https://github.com/apache/spark/pull/17604
  
Merging in master.



[GitHub] spark issue #17295: [SPARK-19556][core] Do not encrypt block manager data in...

2017-04-11 Thread mallman

Github user mallman commented on the issue:

https://github.com/apache/spark/pull/17295
  
> LGTM, cc @mallman to check the unmap part

LGTM, too. Sorry for the late reply... I've been away the past two weeks.



[GitHub] spark issue #17436: [SPARK-20101][SQL] Use OffHeapColumnVector when "spark.m...

2017-04-11 Thread AmplabJenkins

Github user AmplabJenkins commented on the issue:

https://github.com/apache/spark/pull/17436
  
Test PASSed.
Refer to this link for build results (access rights to CI server needed): 
https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/75709/
Test PASSed.



[GitHub] spark issue #17436: [SPARK-20101][SQL] Use OffHeapColumnVector when "spark.m...

2017-04-11 Thread AmplabJenkins

Github user AmplabJenkins commented on the issue:

https://github.com/apache/spark/pull/17436
  
Merged build finished. Test PASSed.



[GitHub] spark issue #17436: [SPARK-20101][SQL] Use OffHeapColumnVector when "spark.m...

2017-04-11 Thread SparkQA

Github user SparkQA commented on the issue:

https://github.com/apache/spark/pull/17436
  
**[Test build #75709 has finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/75709/testReport)** for PR 17436 at commit [`6443f59`](https://github.com/apache/spark/commit/6443f59754fec2330fc81e201ae28c7709da9f65).
 * This patch passes all tests.
 * This patch merges cleanly.
 * This patch adds no public classes.



[GitHub] spark issue #7652: [SPARK-9312] [ML] Added max confidence factor to OneVsRes...

2017-04-11 Thread AxenGitHub

Github user AxenGitHub commented on the issue:

https://github.com/apache/spark/pull/7652
  
Is there any news on this branch? We would benefit a lot from this feature.



[GitHub] spark issue #17527: [SPARK-20156][CORE][SQL][STREAMING][MLLIB] Java String t...

2017-04-11 Thread srowen

Github user srowen commented on the issue:

https://github.com/apache/spark/pull/17527
  
The general idea is to leave any lower-casing that affects strings in the 
user program alone, so that it keeps using the locale-sensitive 
`toLowerCase()`. This is the more conservative choice. All of the changes 
should only affect internal strings or API values, where there is no reason to 
be locale-specific. For example: checking a property value against a known 
list of enum string values in a case-insensitive way. This should address the 
underlying problem, where lower-casing an internal property gives the wrong 
result in the Turkish locale, without changing the results of a user program.
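
A small Java sketch of the Turkish-locale problem described above. `LocaleLowerCaseSketch`, `normalizeInternal`, `lowerInLocale` and `isKnownMode` are hypothetical names used for illustration, and `Locale.ROOT` is shown as one locale-insensitive option; the comment does not say which locale the patch actually passes.

```java
import java.util.Arrays;
import java.util.List;
import java.util.Locale;

// Illustrative only: the class and method names here are hypothetical.
final class LocaleLowerCaseSketch {

  // For internal identifiers and API values: the result must not depend on
  // the JVM's default locale.
  static String normalizeInternal(String value) {
    return value.toLowerCase(Locale.ROOT);
  }

  // Locale-sensitive lower-casing, as used for strings in a user program.
  static String lowerInLocale(String value, Locale locale) {
    return value.toLowerCase(locale);
  }

  // Case-insensitive check of a property value against known enum strings.
  static boolean isKnownMode(String value, List<String> knownLowerCaseModes) {
    return knownLowerCaseModes.contains(normalizeInternal(value));
  }

  public static void main(String[] args) {
    Locale turkish = Locale.forLanguageTag("tr");
    List<String> modes = Arrays.asList("permissive", "failfast");
    // In the Turkish locale, 'I' lower-cases to dotless 'ı' (U+0131), so the
    // locale-sensitive result no longer matches the ASCII enum value.
    System.out.println(modes.contains(lowerInLocale("PERMISSIVE", turkish)));
    System.out.println(isKnownMode("PERMISSIVE", modes));
  }
}
```

The first check fails under the Turkish locale because the lower-cased string contains a dotless `ı`, while the locale-insensitive check succeeds regardless of the default locale.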



[GitHub] spark issue #17455: [Spark-20044][Web UI] Support Spark UI behind front-end ...

2017-04-11 Thread ajbozarth

Github user ajbozarth commented on the issue:

https://github.com/apache/spark/pull/17455
  
It seems @holdenk's OK didn't take. @vanzin, would you mind okaying this to test?



[GitHub] spark issue #17608: [SPARK-20293][WEB UI][History]In the page of 'jobs' or '...

2017-04-11 Thread ajbozarth

Github user ajbozarth commented on the issue:

https://github.com/apache/spark/pull/17608
  
@guoxiaolongzte This seems familiar, are you using the latest version of 
Knox with your Spark UI?



[GitHub] spark issue #17527: [SPARK-20156][CORE][SQL][STREAMING][MLLIB] Java String t...

2017-04-11 Thread gatorsmile

Github user gatorsmile commented on the issue:

https://github.com/apache/spark/pull/17527
  
I am wondering why some of the `toLowerCase` calls are changed while the 
others remain unchanged.



[GitHub] spark pull request #17598: [SPARK-20284][CORE] Make {Des,S}erializationStrea...

2017-04-11 Thread superbobry

Github user superbobry commented on a diff in the pull request:

https://github.com/apache/spark/pull/17598#discussion_r110950456
  
--- Diff: core/src/main/scala/org/apache/spark/serializer/Serializer.scala ---
@@ -125,7 +125,7 @@ abstract class SerializerInstance {
  * A stream for writing serialized objects.
  */
 @DeveloperApi
-abstract class SerializationStream {
+abstract class SerializationStream extends Closeable {
--- End diff --

Sure, added that.



[GitHub] spark issue #17459: [SPARK-20109][MLlib] Rewrote toBlockMatrix method on Ind...

2017-04-11 Thread johnc1231

Github user johnc1231 commented on the issue:

https://github.com/apache/spark/pull/17459
  
@viirya Do you have any more comments on this, or are you happy with it? 



[GitHub] spark issue #9571: [SPARK-11373] [CORE] Add metrics to the History Server an...

2017-04-11 Thread AmplabJenkins

Github user AmplabJenkins commented on the issue:

https://github.com/apache/spark/pull/9571
  
Test PASSed.
Refer to this link for build results (access rights to CI server needed): 
https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/75708/
Test PASSed.



[GitHub] spark issue #9571: [SPARK-11373] [CORE] Add metrics to the History Server an...

2017-04-11 Thread AmplabJenkins

Github user AmplabJenkins commented on the issue:

https://github.com/apache/spark/pull/9571
  
Merged build finished. Test PASSed.



[GitHub] spark issue #9571: [SPARK-11373] [CORE] Add metrics to the History Server an...

2017-04-11 Thread SparkQA

Github user SparkQA commented on the issue:

https://github.com/apache/spark/pull/9571
  
**[Test build #75708 has finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/75708/testReport)** for PR 9571 at commit [`8903dcf`](https://github.com/apache/spark/commit/8903dcfe2b927c8fc3fed9df3e9939670a016944).
 * This patch passes all tests.
 * This patch merges cleanly.
 * This patch adds no public classes.



[GitHub] spark issue #17491: [SPARK-20175][SQL] Exists should not be evaluated in Joi...

2017-04-11 Thread viirya

Github user viirya commented on the issue:

https://github.com/apache/spark/pull/17491
  
I think the current approach will produce a LeftSemi join for this Exists 
subquery. Is that far from the optimal access plan you described?



[GitHub] spark issue #17436: [SPARK-20101][SQL] Use OffHeapColumnVector when "spark.m...

2017-04-11 Thread SparkQA

Github user SparkQA commented on the issue:

https://github.com/apache/spark/pull/17436
  
**[Test build #75709 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/75709/testReport)** for PR 17436 at commit [`6443f59`](https://github.com/apache/spark/commit/6443f59754fec2330fc81e201ae28c7709da9f65).



[GitHub] spark issue #17150: [SPARK-19810][BUILD][CORE] Remove support for Scala 2.10

2017-04-11 Thread AmplabJenkins

Github user AmplabJenkins commented on the issue:

https://github.com/apache/spark/pull/17150
  
Test PASSed.
Refer to this link for build results (access rights to CI server needed): 
https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/75707/
Test PASSed.



[GitHub] spark pull request #17436: [SPARK-20101][SQL] Use OffHeapColumnVector when "...

2017-04-11 Thread kiszk

Github user kiszk commented on a diff in the pull request:

https://github.com/apache/spark/pull/17436#discussion_r110923167
  
--- Diff: core/src/main/java/org/apache/spark/memory/MemoryConsumer.java ---
@@ -41,7 +41,7 @@ protected MemoryConsumer(TaskMemoryManager taskMemoryManager, long pageSize, Mem
   }
 
   protected MemoryConsumer(TaskMemoryManager taskMemoryManager) {
--- End diff --

[This test code](https://github.com/apache/spark/blob/master/core/src/test/java/org/apache/spark/memory/TestMemoryConsumer.java#L24) 
is the only case that specifies a memory mode different from 
`TaskMemoryManager.getTungstenMemoryMode()`.

To simplify the code, I have just replaced 
`MemoryConsumer(taskMemoryManager, pageSize, 
taskMemoryManager.getTungstenMemoryMode())` with 
`MemoryConsumer(taskMemoryManager, pageSize)`.
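
The simplification above can be pictured as a shorter constructor that delegates to the full one with the manager's own mode. This is a hedged Java sketch with stand-in classes: `ConsumerSketch`, `Manager`, `Consumer` and `Mode` are hypothetical and are not Spark's `TaskMemoryManager` or `MemoryConsumer`, and it assumes the two-argument form defaults to the manager's mode.

```java
// Stand-in types only; none of these are Spark's actual classes.
final class ConsumerSketch {

  enum Mode { ON_HEAP, OFF_HEAP }

  static final class Manager {
    private final Mode tungstenMode;

    Manager(Mode tungstenMode) {
      this.tungstenMode = tungstenMode;
    }

    Mode getTungstenMemoryMode() {
      return tungstenMode;
    }
  }

  static class Consumer {
    final long pageSize;
    final Mode mode;

    // Full form: the caller picks the mode explicitly (e.g. a test that wants
    // a mode different from the manager's).
    Consumer(Manager manager, long pageSize, Mode mode) {
      this.pageSize = pageSize;
      this.mode = mode;
    }

    // Short form: assumed to default to the manager's mode, so call sites
    // need not repeat manager.getTungstenMemoryMode().
    Consumer(Manager manager, long pageSize) {
      this(manager, pageSize, manager.getTungstenMemoryMode());
    }
  }
}
```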



[GitHub] spark issue #17150: [SPARK-19810][BUILD][CORE] Remove support for Scala 2.10

2017-04-11 Thread AmplabJenkins

Github user AmplabJenkins commented on the issue:

https://github.com/apache/spark/pull/17150
  
Merged build finished. Test PASSed.



[GitHub] spark issue #17150: [SPARK-19810][BUILD][CORE] Remove support for Scala 2.10

2017-04-11 Thread SparkQA

Github user SparkQA commented on the issue:

https://github.com/apache/spark/pull/17150
  
**[Test build #75707 has finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/75707/testReport)** for PR 17150 at commit [`cccfbdf`](https://github.com/apache/spark/commit/cccfbdf5d0c762b13c65986ea6fa06a06cb394a4).
 * This patch passes all tests.
 * This patch merges cleanly.
 * This patch adds no public classes.



[GitHub] spark issue #17520: [WIP][SPARK-19712][SQL] Move PullupCorrelatedPredicates ...

2017-04-11 Thread nsyca

Github user nsyca commented on the issue:

https://github.com/apache/spark/pull/17520
  
@cloud-fan: would you be interested in reviewing this PR, since I have not 
heard from @hvanhovell for a while? Note this is a WIP and I want to hear your 
feedback on the issues I put in the comments along with the code. The code, as 
it is, preserves the current behaviour, which is not necessarily the desired 
one.



[GitHub] spark issue #17491: [SPARK-20175][SQL] Exists should not be evaluated in Joi...

2017-04-11 Thread nsyca

Github user nsyca commented on the issue:

https://github.com/apache/spark/pull/17491
  
@cloud-fan wrote: "How useful is this optimization? It only works when 
Exists has no condition, is that a common case?"

One common case is an ACL check, where the application asks the database 
whether the user has the proper authority to access a certain set of data.

Ex:

select ... from controlled_table where exists (select 1 from acl_table 
where user = CURRENT_USER and role = ...)

From a runtime perspective, the optimal access plan places ACL_TABLE as the 
outer side of a nested-loop join that fetches only the first qualifying row. 
If such a row exists, processing continues with the inner table, 
CONTROLLED_TABLE; if no row from the outer qualifies, the inner table is not 
accessed at all.
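
The plan described above can be sketched in plain Scala. This is a toy model under stated assumptions, not Spark code: tables are in-memory `Seq`s, and `AclRow`, `DataRow`, and `UncorrelatedExists.query` are hypothetical names introduced only for illustration.

```scala
// Toy model (not Spark code): rows are case classes, tables are Seqs.
final case class AclRow(user: String, role: String)
final case class DataRow(id: Int, payload: String)

object UncorrelatedExists {
  // The EXISTS subquery does not reference the outer table, so its result
  // is the same for every outer row and can be computed once.
  def query(
      controlled: Seq[DataRow],
      acl: Seq[AclRow],
      currentUser: String,
      role: String): Seq[DataRow] = {
    // `exists` stops at the first qualifying ACL row.
    val allowed = acl.exists(r => r.user == currentUser && r.role == role)
    // The controlled table is only scanned when the check succeeded.
    if (allowed) controlled else Seq.empty
  }
}
```

When no ACL row qualifies, the function returns without touching `controlled`, which mirrors the "avoid the inner table completely" case.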




[GitHub] spark issue #16781: [SPARK-12297][SQL] Hive compatibility for Parquet Timest...

2017-04-11 Thread squito

Github user squito commented on the issue:

https://github.com/apache/spark/pull/16781
  
@ueshin thanks for taking a look earlier, sorry it has taken me some time 
to update this.

Things to note since last time:

1) Hive has since been updated in 
[HIVE-16231](https://issues.apache.org/jira/browse/HIVE-16231) to use the local 
timezone, not GMT, as the default for storing data.  Really, this is the change 
that should have been in HIVE-12767 -- otherwise you lose backwards 
compatibility with old datasets.

2) This PR now uses the session timezone, rather than the local timezone.  
There are tests to confirm that each combination of session timezone and 
storage timezone works correctly.

3) Predicate pushdown is handled.  I actually didn't need to change the 
behavior at all, since predicates are never pushed to int96 -- but there are 
some tests that confirm this.

I'm sure there is some minor cleanup that could be done, but overall I 
think this is ready now.  I'd appreciate it if you could take another look and 
make any suggestions.
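
The interaction between a storage timezone and a session timezone (point 2) can be illustrated with plain `java.time`. This is only a sketch, assuming a stored value is a wall-clock time recorded in some storage zone; `TimestampZones` and `toSessionZone` are hypothetical names, and this is not the conversion code in the PR.

```scala
import java.time.{LocalDateTime, ZoneId}

object TimestampZones {
  // Reinterpret a wall-clock value recorded in `storageZone` as the
  // wall-clock value of the same instant in `sessionZone`.
  def toSessionZone(
      stored: LocalDateTime,
      storageZone: ZoneId,
      sessionZone: ZoneId): LocalDateTime =
    stored.atZone(storageZone).withZoneSameInstant(sessionZone).toLocalDateTime
}
```

For example, noon on 2017-04-11 in `America/Los_Angeles` is 21:00 on the same day in `Europe/Berlin`; when both zones are equal the value is unchanged.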



[GitHub] spark issue #17598: [SPARK-20284][CORE] Make {Des,S}erializationStream exten...

2017-04-11 Thread SparkQA

Github user SparkQA commented on the issue:

https://github.com/apache/spark/pull/17598
  
**[Test build #3659 has 
finished](https://amplab.cs.berkeley.edu/jenkins/job/NewSparkPullRequestBuilder/3659/testReport)**
 for PR 17598 at commit 
[`75ba026`](https://github.com/apache/spark/commit/75ba026db26171e0ed59d48d0ab2855f2a2af757).
 * This patch passes all tests.
 * This patch merges cleanly.
 * This patch adds the following public classes _(experimental)_:
  * `abstract class SerializationStream extends Closeable `
  * `abstract class DeserializationStream extends Closeable `



[GitHub] spark pull request #17606: [SPARK-20291][SQL] NaNvl(FloatType, NullType) sho...

2017-04-11 Thread viirya

Github user viirya commented on a diff in the pull request:

https://github.com/apache/spark/pull/17606#discussion_r110901338
  
--- Diff: sql/catalyst/src/test/scala/org/apache/spark/sql/catalyst/analysis/TypeCoercionSuite.scala ---
@@ -656,14 +656,20 @@ class TypeCoercionSuite extends PlanTest {
 
   test("nanvl casts") {
     ruleTest(TypeCoercion.FunctionArgumentConversion,
-      NaNvl(Literal.create(1.0, FloatType), Literal.create(1.0, DoubleType)),
-      NaNvl(Cast(Literal.create(1.0, FloatType), DoubleType), Literal.create(1.0, DoubleType)))
+      NaNvl(Literal.create(1.0f, FloatType), Literal.create(1.0, DoubleType)),
+      NaNvl(Cast(Literal.create(1.0f, FloatType), DoubleType), Literal.create(1.0, DoubleType)))
     ruleTest(TypeCoercion.FunctionArgumentConversion,
-      NaNvl(Literal.create(1.0, DoubleType), Literal.create(1.0, FloatType)),
-      NaNvl(Literal.create(1.0, DoubleType), Cast(Literal.create(1.0, FloatType), DoubleType)))
+      NaNvl(Literal.create(1.0, DoubleType), Literal.create(1.0f, FloatType)),
+      NaNvl(Literal.create(1.0, DoubleType), Cast(Literal.create(1.0f, FloatType), DoubleType)))
     ruleTest(TypeCoercion.FunctionArgumentConversion,
       NaNvl(Literal.create(1.0, DoubleType), Literal.create(1.0, DoubleType)),
       NaNvl(Literal.create(1.0, DoubleType), Literal.create(1.0, DoubleType)))
+    ruleTest(TypeCoercion.FunctionArgumentConversion,
+      NaNvl(Literal.create(1.0f, FloatType), Literal.create(null, NullType)),
+      NaNvl(Literal.create(1.0f, FloatType), Literal.create(null, FloatType)))
+    ruleTest(TypeCoercion.FunctionArgumentConversion,
+      NaNvl(Literal.create(1.0, DoubleType), Literal.create(null, NullType)),
+      NaNvl(Literal.create(1.0, DoubleType), Literal.create(null, DoubleType)))
--- End diff --

then this should be `Cast(Literal.create(null, NullType), DoubleType)`, I 
think.



[GitHub] spark pull request #17606: [SPARK-20291][SQL] NaNvl(FloatType, NullType) sho...

2017-04-11 Thread viirya

Github user viirya commented on a diff in the pull request:

https://github.com/apache/spark/pull/17606#discussion_r110901088
  
--- Diff: sql/catalyst/src/test/scala/org/apache/spark/sql/catalyst/analysis/TypeCoercionSuite.scala ---
@@ -656,14 +656,20 @@ class TypeCoercionSuite extends PlanTest {
 
   test("nanvl casts") {
     ruleTest(TypeCoercion.FunctionArgumentConversion,
-      NaNvl(Literal.create(1.0, FloatType), Literal.create(1.0, DoubleType)),
-      NaNvl(Cast(Literal.create(1.0, FloatType), DoubleType), Literal.create(1.0, DoubleType)))
+      NaNvl(Literal.create(1.0f, FloatType), Literal.create(1.0, DoubleType)),
+      NaNvl(Cast(Literal.create(1.0f, FloatType), DoubleType), Literal.create(1.0, DoubleType)))
     ruleTest(TypeCoercion.FunctionArgumentConversion,
-      NaNvl(Literal.create(1.0, DoubleType), Literal.create(1.0, FloatType)),
-      NaNvl(Literal.create(1.0, DoubleType), Cast(Literal.create(1.0, FloatType), DoubleType)))
+      NaNvl(Literal.create(1.0, DoubleType), Literal.create(1.0f, FloatType)),
+      NaNvl(Literal.create(1.0, DoubleType), Cast(Literal.create(1.0f, FloatType), DoubleType)))
     ruleTest(TypeCoercion.FunctionArgumentConversion,
       NaNvl(Literal.create(1.0, DoubleType), Literal.create(1.0, DoubleType)),
       NaNvl(Literal.create(1.0, DoubleType), Literal.create(1.0, DoubleType)))
+    ruleTest(TypeCoercion.FunctionArgumentConversion,
+      NaNvl(Literal.create(1.0f, FloatType), Literal.create(null, NullType)),
+      NaNvl(Literal.create(1.0f, FloatType), Literal.create(null, FloatType)))
--- End diff --

oh. `Literal.create(null, NullType)` should be `Cast(Literal.create(null, 
NullType), FloatType)`.
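
The expected shape requested in these review comments can be sketched with a toy expression tree. This is an illustrative model only, not Catalyst: `DataType`, `Literal`, `Cast`, and `NaNvl` below are simplified stand-ins defined locally, and `ToyCoercion.coerceNaNvl` is a hypothetical rule written for this sketch.

```scala
// Toy stand-ins for Catalyst types; deliberately minimal.
sealed trait DataType
case object FloatType extends DataType
case object DoubleType extends DataType
case object NullType extends DataType

sealed trait Expr { def dataType: DataType }
final case class Literal(value: Any, dataType: DataType) extends Expr
final case class Cast(child: Expr, dataType: DataType) extends Expr
final case class NaNvl(left: Expr, right: Expr) extends Expr {
  def dataType: DataType = left.dataType
}

object ToyCoercion {
  // Widen float/double mismatches to double, and wrap a NullType right-hand
  // side in a Cast to the left-hand type instead of retyping the literal.
  def coerceNaNvl(e: NaNvl): NaNvl = (e.left.dataType, e.right.dataType) match {
    case (FloatType, DoubleType) => NaNvl(Cast(e.left, DoubleType), e.right)
    case (DoubleType, FloatType) => NaNvl(e.left, Cast(e.right, DoubleType))
    case (t, NullType) if t != NullType => NaNvl(e.left, Cast(e.right, t))
    case _ => e
  }
}
```

In this model the null literal keeps its `NullType` and the conversion is made explicit by the surrounding `Cast`, which is the form the comments ask the test expectations to use.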



[GitHub] spark issue #9571: [SPARK-11373] [CORE] Add metrics to the History Server an...

2017-04-11 Thread SparkQA

Github user SparkQA commented on the issue:

https://github.com/apache/spark/pull/9571
  
**[Test build #75708 has 
started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/75708/testReport)**
 for PR 9571 at commit 
[`8903dcf`](https://github.com/apache/spark/commit/8903dcfe2b927c8fc3fed9df3e9939670a016944).



[GitHub] spark pull request #17436: [SPARK-20101][SQL] Use OffHeapColumnVector when "...

2017-04-11 Thread cloud-fan

Github user cloud-fan commented on a diff in the pull request:

https://github.com/apache/spark/pull/17436#discussion_r110891992
  
--- Diff: sql/core/src/main/scala/org/apache/spark/sql/execution/datasources/parquet/ParquetFileFormat.scala ---
@@ -351,11 +351,12 @@ class ParquetFileFormat
   if (pushed.isDefined) {
     ParquetInputFormat.setFilterPredicate(hadoopAttemptContext.getConfiguration, pushed.get)
   }
+  val taskContext = Option(TaskContext.get())
   val parquetReader = if (enableVectorizedReader) {
     val vectorizedReader = new VectorizedParquetRecordReader()
     vectorizedReader.initialize(split, hadoopAttemptContext)
     logDebug(s"Appending $partitionSchema ${file.partitionValues}")
-    vectorizedReader.initBatch(partitionSchema, file.partitionValues)
+    vectorizedReader.initBatch(partitionSchema, file.partitionValues, taskContext.isDefined)
--- End diff --

`taskContext.isDefined` means enable off heap?



[GitHub] spark pull request #17491: [SPARK-20175][SQL] Exists should not be evaluated...

2017-04-11 Thread asfgit

Github user asfgit closed the pull request at:

https://github.com/apache/spark/pull/17491



[GitHub] spark issue #17491: [SPARK-20175][SQL] Exists should not be evaluated in Joi...

2017-04-11 Thread cloud-fan

Github user cloud-fan commented on the issue:

https://github.com/apache/spark/pull/17491
  
LGTM, merging to master!



[GitHub] spark pull request #17436: [SPARK-20101][SQL] Use OffHeapColumnVector when "...

2017-04-11 Thread cloud-fan

Github user cloud-fan commented on a diff in the pull request:

https://github.com/apache/spark/pull/17436#discussion_r110889293
  
--- Diff: core/src/test/scala/org/apache/spark/memory/StaticMemoryManagerSuite.scala ---
@@ -48,7 +48,10 @@ class StaticMemoryManagerSuite extends MemoryManagerSuite {
   conf.clone
     .set("spark.memory.fraction", "1")
     .set("spark.testing.memory", maxOnHeapExecutionMemory.toString)
-    .set("spark.memory.offHeap.size", maxOffHeapExecutionMemory.toString),
+    .set("spark.memory.offHeap.size",
+      if (maxOffHeapExecutionMemory != 0L) { maxOffHeapExecutionMemory.toString } else {
+        conf.get("spark.memory.offHeap.size", maxOffHeapExecutionMemory.toString)
--- End diff --

why this change?



[GitHub] spark pull request #17587: [SPARK-20274][SQL] support compatible array eleme...

2017-04-11 Thread asfgit

Github user asfgit closed the pull request at:

https://github.com/apache/spark/pull/17587



[GitHub] spark issue #17589: [SPARK-16544][SQL] Support for conversion from numeric c...

2017-04-11 Thread AmplabJenkins

Github user AmplabJenkins commented on the issue:

https://github.com/apache/spark/pull/17589
  
Test PASSed.
Refer to this link for build results (access rights to CI server needed): 
https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/75705/
Test PASSed.



[GitHub] spark issue #17589: [SPARK-16544][SQL] Support for conversion from numeric c...

2017-04-11 Thread AmplabJenkins

Github user AmplabJenkins commented on the issue:

https://github.com/apache/spark/pull/17589
  
Merged build finished. Test PASSed.



[GitHub] spark issue #17589: [SPARK-16544][SQL] Support for conversion from numeric c...

2017-04-11 Thread SparkQA

Github user SparkQA commented on the issue:

https://github.com/apache/spark/pull/17589
  
**[Test build #75705 has 
finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/75705/testReport)**
 for PR 17589 at commit 
[`cbf8a22`](https://github.com/apache/spark/commit/cbf8a224e9cb5744fd340a4f835bdf07cfdf5543).
 * This patch passes all tests.
 * This patch merges cleanly.
 * This patch adds no public classes.



[GitHub] spark issue #17491: [SPARK-20175][SQL] Exists should not be evaluated in Joi...

2017-04-11 Thread AmplabJenkins

Github user AmplabJenkins commented on the issue:

https://github.com/apache/spark/pull/17491
  
Test PASSed.
Refer to this link for build results (access rights to CI server needed): 
https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/75703/
Test PASSed.



[GitHub] spark issue #17491: [SPARK-20175][SQL] Exists should not be evaluated in Joi...

2017-04-11 Thread AmplabJenkins

Github user AmplabJenkins commented on the issue:

https://github.com/apache/spark/pull/17491
  
Merged build finished. Test PASSed.



[GitHub] spark issue #17587: [SPARK-20274][SQL] support compatible array element type...

2017-04-11 Thread cloud-fan

Github user cloud-fan commented on the issue:

https://github.com/apache/spark/pull/17587
  
thanks for the review, merging to master!



[GitHub] spark issue #17491: [SPARK-20175][SQL] Exists should not be evaluated in Joi...

2017-04-11 Thread SparkQA

Github user SparkQA commented on the issue:

https://github.com/apache/spark/pull/17491
  
**[Test build #75703 has 
finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/75703/testReport)**
 for PR 17491 at commit 
[`24ae5ce`](https://github.com/apache/spark/commit/24ae5ce866f82641470ed9598fad9fece450313c).
 * This patch passes all tests.
 * This patch merges cleanly.
 * This patch adds no public classes.



[GitHub] spark issue #17330: [SPARK-19993][SQL] Caching logical plans containing subq...

2017-04-11 Thread cloud-fan

Github user cloud-fan commented on the issue:

https://github.com/apache/spark/pull/17330
  
LGTM except some minor comments about test



[GitHub] spark pull request #17330: [SPARK-19993][SQL] Caching logical plans containi...

2017-04-11 Thread cloud-fan

Github user cloud-fan commented on a diff in the pull request:

https://github.com/apache/spark/pull/17330#discussion_r110885997
  
--- Diff: sql/core/src/test/scala/org/apache/spark/sql/CachedTableSuite.scala ---
@@ -76,6 +76,13 @@ class CachedTableSuite extends QueryTest with SQLTestUtils with SharedSQLContext
 sum
   }
 
+  private def getNumInMemoryTableScanExecs(plan: SparkPlan): Int = {
--- End diff --

We need a better name: this actually gets the in-memory tables recursively, 
which is different from `getNumInMemoryRelations`.
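
The naming concern can be illustrated with a toy plan tree. This is a sketch under assumptions, not the test helper from the PR: it assumes "recursively" means also descending into subquery plans, and `Plan`, `InMemoryScan`, `Node`, and `PlanCounts` are hypothetical names defined only for this example.

```scala
// Toy plan tree: a node has ordinary children plus plans of nested subqueries.
sealed trait Plan {
  def children: Seq[Plan]
  def subqueries: Seq[Plan]
}
final case class InMemoryScan(name: String) extends Plan {
  val children: Seq[Plan] = Nil
  val subqueries: Seq[Plan] = Nil
}
final case class Node(children: Seq[Plan], subqueries: Seq[Plan] = Nil) extends Plan

object PlanCounts {
  // Counts scans reachable through ordinary children only.
  def countInMainPlan(plan: Plan): Int = plan match {
    case _: InMemoryScan => 1
    case other => other.children.map(countInMainPlan).sum
  }

  // Also descends into subquery plans, so the result can be larger.
  def countIncludingSubqueries(plan: Plan): Int = {
    val self = plan match {
      case _: InMemoryScan => 1
      case _ => 0
    }
    self + (plan.children ++ plan.subqueries).map(countIncludingSubqueries).sum
  }
}
```

Because the two counts can differ for the same plan, a name that states which traversal is used avoids confusing the helpers.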



[GitHub] spark pull request #17330: [SPARK-19993][SQL] Caching logical plans containi...

2017-04-11 Thread cloud-fan

Github user cloud-fan commented on a diff in the pull request:

https://github.com/apache/spark/pull/17330#discussion_r110885627
  
--- Diff: sql/core/src/test/scala/org/apache/spark/sql/CachedTableSuite.scala ---
@@ -670,4 +677,139 @@ class CachedTableSuite extends QueryTest with SQLTestUtils with SharedSQLContext
   assert(spark.read.parquet(path).filter($"id" > 4).count() == 15)
 }
   }
+
+  test("SPARK-19993 simple subquery caching") {
+withTempView("t1", "t2") {
+  Seq(1).toDF("c1").createOrReplaceTempView("t1")
+  Seq(1).toDF("c1").createOrReplaceTempView("t2")
+
+  sql(
+"""
+  |SELECT * FROM t1
+  |WHERE
+  |NOT EXISTS (SELECT * FROM t1)
+""".stripMargin).cache()
+
+  val cachedDs =
+sql(
+  """
+|SELECT * FROM t1
+|WHERE
+|NOT EXISTS (SELECT * FROM t1)
+  """.stripMargin)
+  assert(getNumInMemoryRelations(cachedDs) == 1)
+
+  // Additional predicate in the subquery plan should cause a cache miss
+  val cachedMissDs =
+  sql(
+"""
+  |SELECT * FROM t1
+  |WHERE
+  |NOT EXISTS (SELECT * FROM t1 where c1 = 0)
+""".stripMargin)
+  assert(getNumInMemoryRelations(cachedMissDs) == 0)
+}
+  }
+
+  test("SPARK-19993 subquery caching with correlated predicates") {
+withTempView("t1", "t2") {
+  Seq(1).toDF("c1").createOrReplaceTempView("t1")
+  Seq(1).toDF("c1").createOrReplaceTempView("t2")
+
+  // Simple correlated predicate in subquery
+  sql(
+"""
+  |SELECT * FROM t1
+  |WHERE
+  |t1.c1 in (SELECT t2.c1 FROM t2 where t1.c1 = t2.c1)
+""".stripMargin).cache()
+
+  val cachedDs =
+sql(
+  """
+|SELECT * FROM t1
+|WHERE
+|t1.c1 in (SELECT t2.c1 FROM t2 where t1.c1 = t2.c1)
+  """.stripMargin)
+  assert(getNumInMemoryRelations(cachedDs) == 1)
+}
+  }
+
+  test("SPARK-19993 subquery with cached underlying relation") {
+withTempView("t1", "t2") {
+  Seq(1).toDF("c1").createOrReplaceTempView("t1")
+  Seq(1).toDF("c1").createOrReplaceTempView("t2")
--- End diff --

where is `t2` used?



[GitHub] spark pull request #17330: [SPARK-19993][SQL] Caching logical plans containi...

2017-04-11 Thread cloud-fan

Github user cloud-fan commented on a diff in the pull request:

https://github.com/apache/spark/pull/17330#discussion_r110885501
  
--- Diff: sql/core/src/test/scala/org/apache/spark/sql/CachedTableSuite.scala ---
@@ -670,4 +677,139 @@ class CachedTableSuite extends QueryTest with SQLTestUtils with SharedSQLContext
   assert(spark.read.parquet(path).filter($"id" > 4).count() == 15)
 }
   }
+
+  test("SPARK-19993 simple subquery caching") {
+withTempView("t1", "t2") {
+  Seq(1).toDF("c1").createOrReplaceTempView("t1")
+  Seq(1).toDF("c1").createOrReplaceTempView("t2")
--- End diff --

where is `t2` used?



[GitHub] spark issue #17150: [SPARK-19810][BUILD][CORE] Remove support for Scala 2.10

2017-04-11 Thread SparkQA

Github user SparkQA commented on the issue:

https://github.com/apache/spark/pull/17150
  
**[Test build #75707 has 
started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/75707/testReport)**
 for PR 17150 at commit 
[`cccfbdf`](https://github.com/apache/spark/commit/cccfbdf5d0c762b13c65986ea6fa06a06cb394a4).



[GitHub] spark issue #16677: [SPARK-19355][SQL] Use map output statistices to improve...

2017-04-11 Thread AmplabJenkins

Github user AmplabJenkins commented on the issue:

https://github.com/apache/spark/pull/16677
  
Test PASSed.
Refer to this link for build results (access rights to CI server needed): 
https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/75699/
Test PASSed.



[GitHub] spark issue #16677: [SPARK-19355][SQL] Use map output statistices to improve...

2017-04-11 Thread AmplabJenkins

Github user AmplabJenkins commented on the issue:

https://github.com/apache/spark/pull/16677
  
Merged build finished. Test PASSed.



[GitHub] spark issue #16677: [SPARK-19355][SQL] Use map output statistices to improve...

2017-04-11 Thread SparkQA

Github user SparkQA commented on the issue:

https://github.com/apache/spark/pull/16677
  
**[Test build #75699 has 
finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/75699/testReport)**
 for PR 16677 at commit 
[`b8a2275`](https://github.com/apache/spark/commit/b8a22755bfdef8f1ab78016aea6914155ada67c1).
 * This patch passes all tests.
 * This patch merges cleanly.
 * This patch adds no public classes.



[GitHub] spark pull request #12574: [SPARK-13857][ML][WIP] Add "recommend all" functi...

2017-04-11 Thread MLnick

Github user MLnick closed the pull request at:

https://github.com/apache/spark/pull/12574



[GitHub] spark issue #17598: [SPARK-20284][CORE] Make {Des,S}erializationStream exten...

2017-04-11 Thread SparkQA

Github user SparkQA commented on the issue:

https://github.com/apache/spark/pull/17598
  
**[Test build #3659 has 
started](https://amplab.cs.berkeley.edu/jenkins/job/NewSparkPullRequestBuilder/3659/testReport)**
 for PR 17598 at commit 
[`75ba026`](https://github.com/apache/spark/commit/75ba026db26171e0ed59d48d0ab2855f2a2af757).



[GitHub] spark issue #17608: [SPARK-20293][WEB UI][History]In the page of 'jobs' or '...

2017-04-11 Thread guoxiaolongzte

Github user guoxiaolongzte commented on the issue:

https://github.com/apache/spark/pull/17608
  
Yes. Encoding this way is the only way to keep the browser from escaping our 
special characters, so the page will not show an error.
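
As a generic illustration of encoding special characters before they are placed in a URL, the JDK's `java.net.URLEncoder` can be used as sketched below. The PR's actual change is not shown in this message, so this is an assumption-laden example rather than the patch itself; `UrlParams` is a hypothetical helper name.

```scala
import java.net.{URLDecoder, URLEncoder}

object UrlParams {
  // Encode a query-parameter value so that characters such as spaces,
  // '&' and '<' cannot be misread by the browser or the server.
  def encode(value: String): String = URLEncoder.encode(value, "UTF-8")

  // Inverse operation, applied when the parameter is read back.
  def decode(value: String): String = URLDecoder.decode(value, "UTF-8")
}
```

Decoding an encoded value returns the original string, so the special characters survive the round trip through the URL.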



[GitHub] spark issue #17587: [SPARK-20274][SQL] support compatible array element type...

2017-04-11 Thread AmplabJenkins

Github user AmplabJenkins commented on the issue:

https://github.com/apache/spark/pull/17587
  
Merged build finished. Test PASSed.



[GitHub] spark issue #17587: [SPARK-20274][SQL] support compatible array element type...

2017-04-11 Thread AmplabJenkins

Github user AmplabJenkins commented on the issue:

https://github.com/apache/spark/pull/17587
  
Test PASSed.
Refer to this link for build results (access rights to CI server needed): 
https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/75701/
Test PASSed.



[GitHub] spark issue #17608: [SPARK-20293][WEB UI][History]In the page of 'jobs' or '...

2017-04-11 Thread AmplabJenkins

Github user AmplabJenkins commented on the issue:

https://github.com/apache/spark/pull/17608
  
Can one of the admins verify this patch?



[GitHub] spark issue #17587: [SPARK-20274][SQL] support compatible array element type...

2017-04-11 Thread SparkQA

Github user SparkQA commented on the issue:

https://github.com/apache/spark/pull/17587
  
**[Test build #75701 has 
finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/75701/testReport)**
 for PR 17587 at commit 
[`17a308b`](https://github.com/apache/spark/commit/17a308b7aaee44a6c807c21dea4ebaf79d48f34f).
 * This patch passes all tests.
 * This patch merges cleanly.
 * This patch adds no public classes.



[GitHub] spark issue #17608: [SPARK-20293][WEB UI][History]In the page of 'jobs' or '...

2017-04-11 Thread srowen

Github user srowen commented on the issue:

https://github.com/apache/spark/pull/17608
  
I don't quite understand: you say that the problem was URL-encoding the URL, 
but the solution here is to URL-encode it again. Is that right? Maybe you can 
show a more concrete example of the URL as generated by the UI, exactly what 
it is interpreted as, and the error page. This isn't very clear now.



[GitHub] spark pull request #17608: [SPARK-20293][WEB UI][History]In the page of 'job...

2017-04-11 Thread guoxiaolongzte

GitHub user guoxiaolongzte opened a pull request:

https://github.com/apache/spark/pull/17608

[SPARK-20293][WEB UI][History]In the page of 'jobs' or 'stages' of history 
server web ui,,click the 'Go' button, query paging data, the page error

## What changes were proposed in this pull request?

On the 'jobs' or 'stages' page of the history server web UI, clicking the 
'Go' button to query paged data produces an error page, so the function 
cannot be used.
The reason is as follows:
'#' was escaped by the browser as %23.
With &completedStage.desc=true%23completed, the desc parameter is read as 
true%23completed, which causes the page to report an error. The error is as follows:
HTTP ERROR 400
Problem accessing /history/app-20170411132432-0004/stages/. Reason:
 For input string: "true#completed"
Powered by Jetty://
The fix is as follows:
The URL being accessed is escaped up front, so that the browser does not 
escape it.
Please see the attachment on 
'https://issues.apache.org/jira/browse/SPARK-20293'.
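
The failure mode described above can be reproduced outside Spark. The following is only a minimal sketch using the JDK's `java.net.URI` and `URLDecoder`; the object name `FragmentEscapeSketch` and its helpers are hypothetical, and the use of `toBoolean` is an assumption suggested by the `For input string` message, not a quote from the history server code.

```scala
import java.net.{URI, URLDecoder}
import scala.util.Try

// Hypothetical helper object, not part of Spark.
object FragmentEscapeSketch {
  // Decoded value of query parameter `name` in `url`, if present.
  def queryParam(url: String, name: String): Option[String] = {
    val rawQuery = Option(new URI(url).getRawQuery).getOrElse("")
    rawQuery.split("&").iterator
      .map(_.split("=", 2))
      .collectFirst { case Array(k, v) if k == name => URLDecoder.decode(v, "UTF-8") }
  }

  // Parse the sort-direction flag with a strict boolean parser
  // (assumption: the server does something comparable).
  def descFlag(url: String): Option[Try[Boolean]] =
    queryParam(url, "completedStage.desc").map(v => Try(v.toBoolean))

  def main(args: Array[String]): Unit = {
    val base = "/history/app-20170411132432-0004/stages/?completedStage.desc=true"
    // A literal '#' starts the fragment, so the parameter value is just "true".
    println(descFlag(base + "#completed"))
    // An escaped '#' (%23) stays inside the query, so the value is "true#completed".
    println(descFlag(base + "%23completed"))
  }
}
```

With a literal `#` the flag parses successfully; with `%23` the decoded value is `true#completed` and the parse fails, which matches the HTTP 400 shown above.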

## How was this patch tested?

manual tests

Please review http://spark.apache.org/contributing.html before opening a 
pull request.


You can merge this pull request into a Git repository by running:

$ git pull https://github.com/guoxiaolongzte/spark SPARK-20293

Alternatively you can review and apply these changes as the patch at:

https://github.com/apache/spark/pull/17608.patch

To close this pull request, make a commit to your master/trunk branch
with (at least) the following in the commit message:

This closes #17608


commit d383efba12c66addb17006dea107bb0421d50bc3
Author: 郭小龙 10207633
Date:   2017-03-31T13:57:09Z

[SPARK-20177]Document about compression way has some little detail changes.

commit 3059013e9d2aec76def14eb314b6761bea0e7ca0
Author: 郭小龙 10207633
Date:   2017-04-01T01:38:02Z

[SPARK-20177] event log add a space

commit 555cef88fe09134ac98fd0ad056121c7df2539aa
Author: guoxiaolongzte 
Date:   2017-04-02T00:16:08Z

'/applications/[app-id]/jobs' in rest api,status should be 
[running|succeeded|failed|unknown]

commit 46bb1ad3ddd9fb55b5607ac4f20213a90186cfe9
Author: 郭小龙 10207633
Date:   2017-04-05T03:16:50Z

Merge branch 'master' of https://github.com/apache/spark into SPARK-20177

commit 0efb0dd9e404229cce638fe3fb0c966276784df7
Author: 郭小龙 10207633
Date:   2017-04-05T03:47:53Z

[SPARK-20218]'/applications/[app-id]/stages' in REST API,add description.

commit 0e37fdeee28e31fc97436dabd001d3c85c5a7794
Author: 郭小龙 10207633
Date:   2017-04-05T05:22:54Z

[SPARK-20218] '/applications/[app-id]/stages/[stage-id]' in REST API,remove 
redundant description.

commit 52641bb01e55b48bd9e8579fea217439d14c7dc7
Author: 郭小龙 10207633
Date:   2017-04-07T06:24:58Z

Merge branch 'SPARK-20218'

commit d3977c9cab0722d279e3fae7aacbd4eb944c22f6
Author: 郭小龙 10207633
Date:   2017-04-08T07:13:02Z

Merge branch 'master' of https://github.com/apache/spark

commit 137b90e5a85cde7e9b904b3e5ea0bb52518c4716
Author: 郭小龙 10207633
Date:   2017-04-10T05:13:40Z

Merge branch 'master' of https://github.com/apache/spark

commit 0fe5865b8022aeacdb2d194699b990d8467f7a0a
Author: 郭小龙 10207633
Date:   2017-04-10T10:25:22Z

Merge branch 'SPARK-20190' of https://github.com/guoxiaolongzte/spark

commit cf6f42ac84466960f2232c025b8faeb5d7378fe1
Author: 郭小龙 10207633
Date:   2017-04-10T10:26:27Z

Merge branch 'master' of https://github.com/apache/spark

commit 9c1d634b9efe7cdd85e80d742e269aa69fd9994d
Author: 郭小龙 10207633
Date:   2017-04-11T06:38:01Z

Merge branch 'master' of https://github.com/apache/spark

commit 6c62262bebe5fc8d5473b7fcc2fdb2656e4f8cc0
Author: 郭小龙 10207633
Date:   2017-04-11T10:46:58Z

Merge branch 'master' of https://github.com/apache/spark

commit 1b22cfb8e13918d52a498e8d46b3a0c5c236d121
Author: 郭小龙 10207633
Date:   2017-04-11T11:03:01Z

[SPARK-20293]In the page of 'jobs' or 'stages' of history server web 
ui,,click the 'Go' button, query paging data, the page error





[GitHub] spark issue #17607: [DOCS] Add docstrings to non-operator binary ops in pysp...

2017-04-11 Thread AmplabJenkins

Github user AmplabJenkins commented on the issue:

https://github.com/apache/spark/pull/17607
  
Merged build finished. Test PASSed.



[GitHub] spark issue #17607: [DOCS] Add docstrings to non-operator binary ops in pysp...

2017-04-11 Thread SparkQA

Github user SparkQA commented on the issue:

https://github.com/apache/spark/pull/17607
  
**[Test build #75706 has 
finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/75706/testReport)**
 for PR 17607 at commit 
[`3497d12`](https://github.com/apache/spark/commit/3497d12d75db86b6a21c1c1bc5e5b9802deb19a9).
 * This patch passes all tests.
 * This patch merges cleanly.
 * This patch adds no public classes.



[GitHub] spark issue #17607: [DOCS] Add docstrings to non-operator binary ops in pysp...

2017-04-11 Thread AmplabJenkins

Github user AmplabJenkins commented on the issue:

https://github.com/apache/spark/pull/17607
  
Test PASSed.
Refer to this link for build results (access rights to CI server needed): 
https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/75706/
Test PASSed.



[GitHub] spark issue #17606: [SPARK-20291][SQL] NaNvl(FloatType, NullType) should not...

2017-04-11 Thread AmplabJenkins

Github user AmplabJenkins commented on the issue:

https://github.com/apache/spark/pull/17606
  
Merged build finished. Test FAILed.



[GitHub] spark issue #17606: [SPARK-20291][SQL] NaNvl(FloatType, NullType) should not...

2017-04-11 Thread AmplabJenkins

Github user AmplabJenkins commented on the issue:

https://github.com/apache/spark/pull/17606
  
Test FAILed.
Refer to this link for build results (access rights to CI server needed): 
https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/75702/
Test FAILed.



[GitHub] spark pull request #17607: [DOCS] Add docstrings to non-operator binary ops ...

2017-04-11 Thread zero323

Github user zero323 closed the pull request at:

https://github.com/apache/spark/pull/17607



[GitHub] spark issue #17606: [SPARK-20291][SQL] NaNvl(FloatType, NullType) should not...

2017-04-11 Thread SparkQA

Github user SparkQA commented on the issue:

https://github.com/apache/spark/pull/17606
  
**[Test build #75702 has 
finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/75702/testReport)**
 for PR 17606 at commit 
[`fa5e1af`](https://github.com/apache/spark/commit/fa5e1aff1319a75e89da8baf48f06b223b17eb8c).
 * This patch **fails Spark unit tests**.
 * This patch merges cleanly.
 * This patch adds no public classes.



[GitHub] spark issue #17604: [SPARK-20289][SQL] Use StaticInvoke to box primitive typ...

2017-04-11 Thread AmplabJenkins

Github user AmplabJenkins commented on the issue:

https://github.com/apache/spark/pull/17604
  
Merged build finished. Test PASSed.



[GitHub] spark issue #17604: [SPARK-20289][SQL] Use StaticInvoke to box primitive typ...

2017-04-11 Thread AmplabJenkins

Github user AmplabJenkins commented on the issue:

https://github.com/apache/spark/pull/17604
  
Test PASSed.
Refer to this link for build results (access rights to CI server needed): 
https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/75698/
Test PASSed.



[GitHub] spark issue #17604: [SPARK-20289][SQL] Use StaticInvoke to box primitive typ...

2017-04-11 Thread SparkQA

Github user SparkQA commented on the issue:

https://github.com/apache/spark/pull/17604
  
**[Test build #75698 has 
finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/75698/testReport)**
 for PR 17604 at commit 
[`8cbc617`](https://github.com/apache/spark/commit/8cbc617ee528ab92a755995a03b9ebefc2eb03a4).
 * This patch passes all tests.
 * This patch merges cleanly.
 * This patch adds no public classes.



[GitHub] spark issue #17607: [DOCS] Add docstrings to non-operator binary ops in pysp...

2017-04-11 Thread HyukjinKwon

Github user HyukjinKwon commented on the issue:

https://github.com/apache/spark/pull/17607
  
Or, I believe either of the two PRs could handle all of them. Cc @map222.



[GitHub] spark issue #17607: [DOCS] Add docstrings to non-operator binary ops in pysp...

2017-04-11 Thread HyukjinKwon

Github user HyukjinKwon commented on the issue:

https://github.com/apache/spark/pull/17607
  
Actually, there is a similar PR - 
https://github.com/apache/spark/pull/17469. How about doing only non-duplicated 
ones?



[GitHub] spark issue #17607: [DOCS] Add docstrings to non-operator binary ops in pysp...

2017-04-11 Thread SparkQA

Github user SparkQA commented on the issue:

https://github.com/apache/spark/pull/17607
  
**[Test build #75706 has 
started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/75706/testReport)**
 for PR 17607 at commit 
[`3497d12`](https://github.com/apache/spark/commit/3497d12d75db86b6a21c1c1bc5e5b9802deb19a9).



[GitHub] spark issue #17607: [DOCS] Add docstrings to non-operator binary ops in pysp...

2017-04-11 Thread zero323

Github user zero323 commented on the issue:

https://github.com/apache/spark/pull/17607
  
cc @holdenk



[GitHub] spark pull request #17607: [DOCS] Add docstrings to non-operator binary ops ...

2017-04-11 Thread zero323

GitHub user zero323 opened a pull request:

https://github.com/apache/spark/pull/17607

[DOCS] Add docstrings to non-operator binary ops in pyspark.sql.Column

## What changes were proposed in this pull request?

Add docstrings to the following `pyspark.sql.Column` binary ops:
 
- `bitwiseOR`, `bitwiseAND`, `bitwiseXOR`.
- `contains`, `rlike`, `like`, `startswith`, `endswith`.

## How was this patch tested?

Manual tests, docs build.



You can merge this pull request into a Git repository by running:

$ git pull https://github.com/zero323/spark BINARYOPS-DOCSTRINGS

Alternatively you can review and apply these changes as the patch at:

https://github.com/apache/spark/pull/17607.patch

To close this pull request, make a commit to your master/trunk branch
with (at least) the following in the commit message:

This closes #17607


commit 3497d12d75db86b6a21c1c1bc5e5b9802deb19a9
Author: zero323 
Date:   2017-04-11T10:24:14Z

Add docstrings to selected binary ops





[GitHub] spark issue #17606: [SPARK-20291][SQL] NaNvl(FloatType, NullType) should not...

2017-04-11 Thread viirya

Github user viirya commented on the issue:

https://github.com/apache/spark/pull/17606
  
LGTM, except for a question which might not be related to this issue.



[GitHub] spark issue #17533: [WIP][SPARK-20219] Schedule tasks based on size of input...

2017-04-11 Thread AmplabJenkins

Github user AmplabJenkins commented on the issue:

https://github.com/apache/spark/pull/17533
  
Merged build finished. Test PASSed.



[GitHub] spark issue #17533: [WIP][SPARK-20219] Schedule tasks based on size of input...

2017-04-11 Thread AmplabJenkins

Github user AmplabJenkins commented on the issue:

https://github.com/apache/spark/pull/17533
  
Test PASSed.
Refer to this link for build results (access rights to CI server needed): 
https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/75697/
Test PASSed.



[GitHub] spark issue #17533: [WIP][SPARK-20219] Schedule tasks based on size of input...

2017-04-11 Thread SparkQA

Github user SparkQA commented on the issue:

https://github.com/apache/spark/pull/17533
  
**[Test build #75697 has 
finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/75697/testReport)**
 for PR 17533 at commit 
[`e3a15c3`](https://github.com/apache/spark/commit/e3a15c3fffd699770738caff2e03f066bf0e149c).
 * This patch passes all tests.
 * This patch merges cleanly.
 * This patch adds no public classes.



[GitHub] spark issue #17589: [SPARK-16544][SQL] Support for conversion from numeric c...

2017-04-11 Thread SparkQA

Github user SparkQA commented on the issue:

https://github.com/apache/spark/pull/17589
  
**[Test build #75705 has 
started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/75705/testReport)**
 for PR 17589 at commit 
[`cbf8a22`](https://github.com/apache/spark/commit/cbf8a224e9cb5744fd340a4f835bdf07cfdf5543).



[GitHub] spark pull request #17606: [SPARK-20291][SQL] NaNvl(FloatType, NullType) sho...

2017-04-11 Thread viirya

Github user viirya commented on a diff in the pull request:

https://github.com/apache/spark/pull/17606#discussion_r110864960
  
--- Diff: sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/analysis/TypeCoercion.scala ---
@@ -571,6 +571,7 @@ object TypeCoercion {
         NaNvl(l, Cast(r, DoubleType))
       case NaNvl(l, r) if l.dataType == FloatType && r.dataType == DoubleType =>
         NaNvl(Cast(l, DoubleType), r)
+      case NaNvl(l, r) if r.dataType == NullType => NaNvl(l, Cast(r, l.dataType))
--- End diff --

One question I have: why should `NaNvl(FloatType, DoubleType)` be cast to 
`NaNvl(DoubleType, DoubleType)`, while `NaNvl(FloatType, NullType)` should not 
be cast to `NaNvl(DoubleType, DoubleType)`?

Both change the input type from `FloatType` to `DoubleType`. Won't the 
first cast cause a mismatch?
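
To make the two cases in this question concrete, here is a toy model of the coercion rules in the diff. It is only a sketch: the object `NaNvlCoercionSketch` and its `DataType`, `Col`, `Cast` and `NaNvl` classes are simplified stand-ins and not Spark's Catalyst classes. The guard on the first rule is also an assumption, since the diff context shows only that rule's result.

```scala
// Simplified stand-ins for Catalyst types; names are hypothetical.
object NaNvlCoercionSketch {
  sealed trait DataType
  case object FloatType extends DataType
  case object DoubleType extends DataType
  case object NullType extends DataType

  sealed trait Expr { def dataType: DataType }
  case class Col(name: String, dataType: DataType) extends Expr
  case class Cast(child: Expr, dataType: DataType) extends Expr
  // In this toy model the result type of NaNvl follows its left operand.
  case class NaNvl(left: Expr, right: Expr) extends Expr {
    def dataType: DataType = left.dataType
  }

  def coerce(e: NaNvl): NaNvl = e match {
    // Assumed guard: this rule's condition is not visible in the diff context.
    case NaNvl(l, r) if l.dataType == DoubleType && r.dataType == FloatType =>
      NaNvl(l, Cast(r, DoubleType))
    // Existing rule: the Float left side is widened to Double.
    case NaNvl(l, r) if l.dataType == FloatType && r.dataType == DoubleType =>
      NaNvl(Cast(l, DoubleType), r)
    // Rule added by the patch: a null right side takes the left side's type.
    case NaNvl(l, r) if r.dataType == NullType =>
      NaNvl(l, Cast(r, l.dataType))
    case other => other
  }
}
```

In this model `coerce(NaNvl(Col("f", FloatType), Col("d", DoubleType)))` has result type `DoubleType`, while `coerce(NaNvl(Col("f", FloatType), Col("n", NullType)))` keeps `FloatType`. That asymmetry is what the question is about.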



[GitHub] spark issue #17603: [SPARK-20288] Avoid generating the MapStatus by stageId ...

2017-04-11 Thread AmplabJenkins

Github user AmplabJenkins commented on the issue:

https://github.com/apache/spark/pull/17603
  
Merged build finished. Test PASSed.



[GitHub] spark issue #17603: [SPARK-20288] Avoid generating the MapStatus by stageId ...

2017-04-11 Thread AmplabJenkins

Github user AmplabJenkins commented on the issue:

https://github.com/apache/spark/pull/17603
  
Test PASSed.
Refer to this link for build results (access rights to CI server needed): 
https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/75696/
Test PASSed.



[GitHub] spark issue #17603: [SPARK-20288] Avoid generating the MapStatus by stageId ...

2017-04-11 Thread SparkQA

Github user SparkQA commented on the issue:

https://github.com/apache/spark/pull/17603
  
**[Test build #75696 has 
finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/75696/testReport)**
 for PR 17603 at commit 
[`5a93693`](https://github.com/apache/spark/commit/5a93693debb733154cb9f5916d3b8ee1d2d2b2e5).
 * This patch passes all tests.
 * This patch merges cleanly.
 * This patch adds no public classes.



[GitHub] spark issue #9571: [SPARK-11373] [CORE] Add metrics to the History Server an...

2017-04-11 Thread AmplabJenkins

Github user AmplabJenkins commented on the issue:

https://github.com/apache/spark/pull/9571
  
Merged build finished. Test FAILed.



[GitHub] spark issue #9571: [SPARK-11373] [CORE] Add metrics to the History Server an...

2017-04-11 Thread AmplabJenkins

Github user AmplabJenkins commented on the issue:

https://github.com/apache/spark/pull/9571
  
Test FAILed.
Refer to this link for build results (access rights to CI server needed): 
https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/75704/
Test FAILed.



[GitHub] spark issue #9571: [SPARK-11373] [CORE] Add metrics to the History Server an...

2017-04-11 Thread SparkQA

Github user SparkQA commented on the issue:

https://github.com/apache/spark/pull/9571
  
**[Test build #75704 has 
finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/75704/testReport)**
 for PR 9571 at commit 
[`ec1f2d7`](https://github.com/apache/spark/commit/ec1f2d7f8743ce6de3e83f2f9a82f1c940c8be52).
 * This patch **fails Scala style tests**.
 * This patch merges cleanly.
 * This patch adds no public classes.



[GitHub] spark issue #9571: [SPARK-11373] [CORE] Add metrics to the History Server an...

2017-04-11 Thread SparkQA

Github user SparkQA commented on the issue:

https://github.com/apache/spark/pull/9571
  
**[Test build #75704 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/75704/testReport)** for PR 9571 at commit [`ec1f2d7`](https://github.com/apache/spark/commit/ec1f2d7f8743ce6de3e83f2f9a82f1c940c8be52).



[GitHub] spark issue #17491: [SPARK-20175][SQL] Exists should not be evaluated in Joi...

2017-04-11 Thread viirya

Github user viirya commented on the issue:

https://github.com/apache/spark/pull/17491
  
@cloud-fan The optimization rule has been removed. This patch now only makes 
`Exists` subqueries without correlated references work. Please take another 
look. Thanks.



[GitHub] spark issue #17491: [SPARK-20175][SQL] Exists should not be evaluated in Joi...

2017-04-11 Thread SparkQA

Github user SparkQA commented on the issue:

https://github.com/apache/spark/pull/17491
  
**[Test build #75703 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/75703/testReport)** for PR 17491 at commit [`24ae5ce`](https://github.com/apache/spark/commit/24ae5ce866f82641470ed9598fad9fece450313c).



[GitHub] spark pull request #17491: [SPARK-20175][SQL] Exists should not be evaluated...

2017-04-11 Thread viirya

Github user viirya commented on a diff in the pull request:

https://github.com/apache/spark/pull/17491#discussion_r110859934
  
--- Diff: sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/optimizer/subquery.scala ---
@@ -498,3 +498,32 @@ object RewriteCorrelatedScalarSubquery extends Rule[LogicalPlan] {
   }
   }
 }
+
+/**
+ * This rule rewrites a EXISTS predicate sub-queries into an Aggregate with count.
+ * So it doesn't be converted to a JOIN later.
+ */
+object RewriteEmptyExists extends Rule[LogicalPlan] with PredicateHelper {
+  private def containsAgg(plan: LogicalPlan): Boolean = {
+    plan.collect {
+      case a: Aggregate => a
+    }.nonEmpty
+  }
+
+  def apply(plan: LogicalPlan): LogicalPlan = plan transform {
+    case Filter(condition, child) =>
+      val (withSubquery, withoutSubquery) =
+        splitConjunctivePredicates(condition).partition(SubqueryExpression.hasInOrExistsSubquery)
+      val newWithSubquery = withSubquery.map(_.transform {
+        case e @ Exists(sub, conditions, exprId) if conditions.isEmpty && !containsAgg(sub) =>
+          val countExpr = Alias(Count(Literal(1)).toAggregateExpression(), "count")()
+          val expr = Alias(GreaterThan(countExpr.toAttribute, Literal(0)), e.toString)()
+          ScalarSubquery(
+            Project(Seq(expr),
+              Aggregate(Nil, Seq(countExpr), LocalLimit(Literal(1), sub))),
--- End diff --

Btw, I am not sure this early-out benefits general usage beyond this kind of 
special case.
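
For illustration only, here is a minimal sketch of what the early-out in the quoted diff amounts to, using plain Scala collections rather than Catalyst plans. `ExistsEarlyOut`, `CountingSource`, `existsViaLimitedCount` and `existsViaFullCount` are hypothetical names that do not appear in the PR. The sketch assumes an uncorrelated subquery can be modelled as a lazy row source, and it compares counting over at most one row (the `LocalLimit(Literal(1), sub)` idea) with counting every row.

```scala
// Hypothetical model, not Spark code: an uncorrelated subquery is
// represented as a lazy row source that records how many rows it produced.
object ExistsEarlyOut {
  final class CountingSource(rows: Seq[Int]) {
    // Number of rows actually pulled from this source so far.
    var pulled = 0
    def iterator: Iterator[Int] = rows.iterator.map { r => pulled += 1; r }
  }

  // EXISTS expressed as "count over at most one row > 0",
  // mirroring Aggregate(count) on top of a limit of 1.
  def existsViaLimitedCount(source: CountingSource): Boolean =
    source.iterator.take(1).size > 0

  // The same predicate without the limit: every row is counted.
  def existsViaFullCount(source: CountingSource): Boolean =
    source.iterator.size > 0
}
```

Both functions return the same answer. The limited variant stops after pulling a single row, which is the whole benefit of the early-out, and it only matters when the subquery is otherwise expensive to enumerate.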




[GitHub] spark pull request #17491: [SPARK-20175][SQL] Exists should not be evaluated...

2017-04-11 Thread viirya

Github user viirya commented on a diff in the pull request:

https://github.com/apache/spark/pull/17491#discussion_r110858379
  
--- Diff: sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/optimizer/subquery.scala ---
@@ -498,3 +498,32 @@ object RewriteCorrelatedScalarSubquery extends Rule[LogicalPlan] {
   }
   }
 }
+
+/**
+ * This rule rewrites a EXISTS predicate sub-queries into an Aggregate with count.
+ * So it doesn't be converted to a JOIN later.
+ */
+object RewriteEmptyExists extends Rule[LogicalPlan] with PredicateHelper {
+  private def containsAgg(plan: LogicalPlan): Boolean = {
+    plan.collect {
+      case a: Aggregate => a
+    }.nonEmpty
+  }
+
+  def apply(plan: LogicalPlan): LogicalPlan = plan transform {
+    case Filter(condition, child) =>
+      val (withSubquery, withoutSubquery) =
+        splitConjunctivePredicates(condition).partition(SubqueryExpression.hasInOrExistsSubquery)
+      val newWithSubquery = withSubquery.map(_.transform {
+        case e @ Exists(sub, conditions, exprId) if conditions.isEmpty && !containsAgg(sub) =>
+          val countExpr = Alias(Count(Literal(1)).toAggregateExpression(), "count")()
+          val expr = Alias(GreaterThan(countExpr.toAttribute, Literal(0)), e.toString)()
+          ScalarSubquery(
+            Project(Seq(expr),
+              Aggregate(Nil, Seq(countExpr), LocalLimit(Literal(1), sub))),
--- End diff --

We can address the early-out in other work.



[GitHub] spark pull request #17491: [SPARK-20175][SQL] Exists should not be evaluated...

2017-04-11 Thread viirya

Github user viirya commented on a diff in the pull request:

https://github.com/apache/spark/pull/17491#discussion_r110858303
  
--- Diff: sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/optimizer/subquery.scala ---
@@ -498,3 +498,32 @@ object RewriteCorrelatedScalarSubquery extends Rule[LogicalPlan] {
   }
   }
 }
+
+/**
+ * This rule rewrites a EXISTS predicate sub-queries into an Aggregate with count.
+ * So it doesn't be converted to a JOIN later.
+ */
+object RewriteEmptyExists extends Rule[LogicalPlan] with PredicateHelper {
+  private def containsAgg(plan: LogicalPlan): Boolean = {
+    plan.collect {
+      case a: Aggregate => a
+    }.nonEmpty
+  }
+
+  def apply(plan: LogicalPlan): LogicalPlan = plan transform {
+    case Filter(condition, child) =>
+      val (withSubquery, withoutSubquery) =
+        splitConjunctivePredicates(condition).partition(SubqueryExpression.hasInOrExistsSubquery)
+      val newWithSubquery = withSubquery.map(_.transform {
+        case e @ Exists(sub, conditions, exprId) if conditions.isEmpty && !containsAgg(sub) =>
+          val countExpr = Alias(Count(Literal(1)).toAggregateExpression(), "count")()
+          val expr = Alias(GreaterThan(countExpr.toAttribute, Literal(0)), e.toString)()
+          ScalarSubquery(
+            Project(Seq(expr),
+              Aggregate(Nil, Seq(countExpr), LocalLimit(Literal(1), sub))),
--- End diff --

I think it is a special case, so I will remove this optimization and keep 
this PR's change minimal.



[GitHub] spark issue #17455: [Spark-20044][Web UI] Support Spark UI behind front-end ...

2017-04-11 Thread okoethibm

Github user okoethibm commented on the issue:

https://github.com/apache/spark/pull/17455
  
@ajbozarth Any other comments on this PR? Why isn't it being tested even 
though it has an "ok to test"?

