[GitHub] spark issue #21833: [PYSPARK] [TEST] [MINOR] Fix UDFInitializationTests

2018-07-20 Thread AmplabJenkins
Github user AmplabJenkins commented on the issue:

https://github.com/apache/spark/pull/21833
  
Can one of the admins verify this patch?


---

-
To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org
For additional commands, e-mail: reviews-h...@spark.apache.org



[GitHub] spark pull request #21833: [PYSPARK] [TEST] [MINOR] Fix UDFInitializationTes...

2018-07-20 Thread PenguinToast
GitHub user PenguinToast opened a pull request:

https://github.com/apache/spark/pull/21833

[PYSPARK] [TEST] [MINOR] Fix UDFInitializationTests

## What changes were proposed in this pull request?

Fix a typo in pyspark sql tests


You can merge this pull request into a Git repository by running:

$ git pull https://github.com/PenguinToast/spark fix-test-typo

Alternatively you can review and apply these changes as the patch at:

https://github.com/apache/spark/pull/21833.patch

To close this pull request, make a commit to your master/trunk branch
with (at least) the following in the commit message:

This closes #21833


commit c4f664bd49f701773ea52751ee135915af973014
Author: William Sheu 
Date:   2018-07-20T22:26:17Z

Fix typo in pyspark sql tests




---




[GitHub] spark pull request #21831: [SPARK-24880][BUILD]Fix the group id for spark-ku...

2018-07-20 Thread asfgit
Github user asfgit closed the pull request at:

https://github.com/apache/spark/pull/21831


---




[GitHub] spark issue #21831: [SPARK-24880][BUILD]Fix the group id for spark-kubernete...

2018-07-20 Thread gatorsmile
Github user gatorsmile commented on the issue:

https://github.com/apache/spark/pull/21831
  
Thanks! Merged to master.


---




[GitHub] spark issue #21802: [SPARK-23928][SQL] Add shuffle collection function.

2018-07-20 Thread rxin
Github user rxin commented on the issue:

https://github.com/apache/spark/pull/21802
  
Do we really need full codegen for all of these collection functions? They 
seem pretty slow and specialization with full codegen won't help perf that much 
(and might even hurt by blowing up the code size) right?



---




[GitHub] spark issue #21826: [SPARK-24872] Remove the symbol “||” of the “OR”...

2018-07-20 Thread rxin
Github user rxin commented on the issue:

https://github.com/apache/spark/pull/21826
  
cc @gatorsmile @cloud-fan @HyukjinKwon is this a good thing to do?



---




[GitHub] spark issue #21822: [SPARK-24865] Remove AnalysisBarrier - WIP

2018-07-20 Thread SparkQA
Github user SparkQA commented on the issue:

https://github.com/apache/spark/pull/21822
  
**[Test build #93370 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/93370/testReport)** for PR 21822 at commit [`38980ad`](https://github.com/apache/spark/commit/38980ad066d26327387673910e0dfd981102cab9).


---




[GitHub] spark issue #21826: [SPARK-24872] Remove the symbol “||” of the “OR”...

2018-07-20 Thread rxin
Github user rxin commented on the issue:

https://github.com/apache/spark/pull/21826
  
Jenkins, test this please.



---




[GitHub] spark issue #21822: [SPARK-24865] Remove AnalysisBarrier - WIP

2018-07-20 Thread AmplabJenkins
Github user AmplabJenkins commented on the issue:

https://github.com/apache/spark/pull/21822
  
Test PASSed.
Refer to this link for build results (access rights to CI server needed): 

https://amplab.cs.berkeley.edu/jenkins//job/testing-k8s-prb-make-spark-distribution-unified/1190/
Test PASSed.


---




[GitHub] spark issue #21822: [SPARK-24865] Remove AnalysisBarrier - WIP

2018-07-20 Thread AmplabJenkins
Github user AmplabJenkins commented on the issue:

https://github.com/apache/spark/pull/21822
  
Merged build finished. Test PASSed.


---




[GitHub] spark issue #21822: [SPARK-24865] Remove AnalysisBarrier - WIP

2018-07-20 Thread rxin
Github user rxin commented on the issue:

https://github.com/apache/spark/pull/21822
  
retest this please


---




[GitHub] spark issue #21831: [SPARK-24880][BUILD]Fix the group id for spark-kubernete...

2018-07-20 Thread AmplabJenkins
Github user AmplabJenkins commented on the issue:

https://github.com/apache/spark/pull/21831
  
Test PASSed.
Refer to this link for build results (access rights to CI server needed): 

https://amplab.cs.berkeley.edu/jenkins//job/testing-k8s-prb-make-spark-distribution-unified/1189/
Test PASSed.


---




[GitHub] spark issue #21831: [SPARK-24880][BUILD]Fix the group id for spark-kubernete...

2018-07-20 Thread SparkQA
Github user SparkQA commented on the issue:

https://github.com/apache/spark/pull/21831
  
Kubernetes integration test status success
URL: 
https://amplab.cs.berkeley.edu/jenkins/job/testing-k8s-prb-make-spark-distribution-unified/1189/



---




[GitHub] spark issue #21831: [SPARK-24880][BUILD]Fix the group id for spark-kubernete...

2018-07-20 Thread AmplabJenkins
Github user AmplabJenkins commented on the issue:

https://github.com/apache/spark/pull/21831
  
Merged build finished. Test PASSed.


---




[GitHub] spark issue #21832: [SPARK-24879][SQL] Fix NPE in Hive partition pruning fil...

2018-07-20 Thread SparkQA
Github user SparkQA commented on the issue:

https://github.com/apache/spark/pull/21832
  
**[Test build #93369 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/93369/testReport)** for PR 21832 at commit [`ce86fbe`](https://github.com/apache/spark/commit/ce86fbeda06eb2448ecd2c425982aacca3d66b45).


---




[GitHub] spark issue #21831: [SPARK-24880][BUILD]Fix the group id for spark-kubernete...

2018-07-20 Thread SparkQA
Github user SparkQA commented on the issue:

https://github.com/apache/spark/pull/21831
  
Kubernetes integration test starting
URL: 
https://amplab.cs.berkeley.edu/jenkins/job/testing-k8s-prb-make-spark-distribution-unified/1189/



---




[GitHub] spark pull request #21829: [SPARK-24876][SQL] Avro: simplify schema serializ...

2018-07-20 Thread asfgit
Github user asfgit closed the pull request at:

https://github.com/apache/spark/pull/21829


---




[GitHub] spark issue #21831: [SPARK-24880][BUILD]Fix the group id for spark-kubernete...

2018-07-20 Thread gatorsmile
Github user gatorsmile commented on the issue:

https://github.com/apache/spark/pull/21831
  
LGTM


---




[GitHub] spark issue #21832: [SPARK-24879][SQL] Fix NPE in Hive partition pruning fil...

2018-07-20 Thread gatorsmile
Github user gatorsmile commented on the issue:

https://github.com/apache/spark/pull/21832
  
add to whitelist


---




[GitHub] spark issue #21829: [SPARK-24876][SQL] Avro: simplify schema serialization

2018-07-20 Thread gatorsmile
Github user gatorsmile commented on the issue:

https://github.com/apache/spark/pull/21829
  
LGTM

Thanks! Merged to master


---




[GitHub] spark issue #21508: [SPARK-24488] [SQL] Fix issue when generator is aliased ...

2018-07-20 Thread gatorsmile
Github user gatorsmile commented on the issue:

https://github.com/apache/spark/pull/21508
  
cc @maropu Help review this?


---




[GitHub] spark issue #21831: [SPARK-24880][BUILD]Fix the group id for spark-kubernete...

2018-07-20 Thread SparkQA
Github user SparkQA commented on the issue:

https://github.com/apache/spark/pull/21831
  
**[Test build #93368 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/93368/testReport)** for PR 21831 at commit [`980d30c`](https://github.com/apache/spark/commit/980d30c8964c92f3965e725063fd27b5c4e60922).


---




[GitHub] spark pull request #21653: [SPARK-13343] speculative tasks that didn't commi...

2018-07-20 Thread tgravescs
Github user tgravescs commented on a diff in the pull request:

https://github.com/apache/spark/pull/21653#discussion_r20409
  
--- Diff: core/src/main/scala/org/apache/spark/scheduler/TaskSetManager.scala ---
@@ -723,6 +723,21 @@ private[spark] class TaskSetManager(
   def handleSuccessfulTask(tid: Long, result: DirectTaskResult[_]): Unit = {
     val info = taskInfos(tid)
     val index = info.index
+    // Check if any other attempt succeeded before this and this attempt has not been handled
+    if (successful(index) && killedByOtherAttempt.contains(tid)) {
+      calculatedTasks -= 1
--- End diff --

Please add a comment here explaining that this cleans up what was incremented 
earlier, when the result was handled as successful.


---




[GitHub] spark pull request #21653: [SPARK-13343] speculative tasks that didn't commi...

2018-07-20 Thread tgravescs
Github user tgravescs commented on a diff in the pull request:

https://github.com/apache/spark/pull/21653#discussion_r204177708
  
--- Diff: core/src/main/scala/org/apache/spark/scheduler/TaskSetManager.scala ---
@@ -723,6 +723,21 @@ private[spark] class TaskSetManager(
   def handleSuccessfulTask(tid: Long, result: DirectTaskResult[_]): Unit = {
     val info = taskInfos(tid)
     val index = info.index
+    // Check if any other attempt succeeded before this and this attempt has not been handled
+    if (successful(index) && killedByOtherAttempt.contains(tid)) {
+      calculatedTasks -= 1
+
+      val resultSizeAcc = result.accumUpdates.find(a =>
+        a.name == Some(InternalAccumulator.RESULT_SIZE))
+      if (resultSizeAcc.isDefined) {
+        totalResultSize -= resultSizeAcc.get.asInstanceOf[LongAccumulator].value
--- End diff --

The downside here is that we have already incremented, and other tasks could 
have checked and failed before we decrement. But unless someone else has a 
better idea, this is better than it is now.
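
A minimal sketch of the bookkeeping under discussion, assuming a simplified single-threaded tracker rather than the real `TaskSetManager`. `ResultSizeTracker`, `record`, `rollback`, and `withinLimit` are hypothetical names used only for illustration. It shows the window described above: the size is added first, so a limit check that runs before the rollback still sees the inflated total.

```scala
// Hypothetical model of the result-size accounting; none of these names are Spark's.
final class ResultSizeTracker(maxResultSize: Long) {
  private var totalResultSize = 0L
  private var calculatedTasks = 0

  // Called when a task result arrives: the result is counted immediately.
  def record(size: Long): Unit = {
    totalResultSize += size
    calculatedTasks += 1
  }

  // Called later if the attempt turns out to duplicate an already-successful
  // task: undo exactly what record() added.
  def rollback(size: Long): Unit = {
    totalResultSize -= size
    calculatedTasks -= 1
  }

  // Other tasks consult this check; it can fail between record() and rollback().
  def withinLimit: Boolean = totalResultSize <= maxResultSize
}

object ResultSizeDemo {
  // Returns (limit check during the window, limit check after the rollback).
  def run(): (Boolean, Boolean) = {
    val tracker = new ResultSizeTracker(maxResultSize = 100L)
    tracker.record(60L)                    // first successful attempt
    tracker.record(60L)                    // speculative duplicate, counted too early
    val duringWindow = tracker.withinLimit
    tracker.rollback(60L)                  // duplicate detected, accounting undone
    val afterRollback = tracker.withinLimit
    (duringWindow, afterRollback)
  }
}
```

In this toy run the check fails during the window and passes again after the rollback, which is the trade-off the comment accepts.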


---




[GitHub] spark pull request #21831: [SPARK-24880][BUILD]Fix the group id for spark-ku...

2018-07-20 Thread zsxwing
Github user zsxwing commented on a diff in the pull request:

https://github.com/apache/spark/pull/21831#discussion_r204177925
  
--- Diff: resource-managers/kubernetes/integration-tests/pom.xml ---
@@ -25,7 +25,7 @@
   
 
   spark-kubernetes-integration-tests_2.11
-  spark-kubernetes-integration-tests
--- End diff --

Discussed offline. `groupId` can be ignored and it will inherit from the 
parent module. I just removed it.


---




[GitHub] spark issue #21635: [SPARK-24594][YARN] Introducing metrics for YARN

2018-07-20 Thread tgravescs
Github user tgravescs commented on the issue:

https://github.com/apache/spark/pull/21635
  
+1 @jerryshao 


---




[GitHub] spark pull request #20761: [SPARK-20327][CORE][YARN] Add CLI support for YAR...

2018-07-20 Thread squito
Github user squito commented on a diff in the pull request:

https://github.com/apache/spark/pull/20761#discussion_r204171230
  
--- Diff: resource-managers/yarn/src/main/scala/org/apache/spark/deploy/yarn/ResourceTypeHelper.scala ---
@@ -0,0 +1,119 @@
+/*
+ * Licensed to the Apache Software Foundation (ASF) under one or more
+ * contributor license agreements.  See the NOTICE file distributed with
+ * this work for additional information regarding copyright ownership.
+ * The ASF licenses this file to You under the Apache License, Version 2.0
+ * (the "License"); you may not use this file except in compliance with
+ * the License.  You may obtain a copy of the License at
+ *
+ *http://www.apache.org/licenses/LICENSE-2.0
+ *
+ * Unless required by applicable law or agreed to in writing, software
+ * distributed under the License is distributed on an "AS IS" BASIS,
+ * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+ * See the License for the specific language governing permissions and
+ * limitations under the License.
+ */
+
+package org.apache.spark.deploy.yarn
+
+import java.lang.{Integer => JInteger, Long => JLong}
+import java.lang.reflect.InvocationTargetException
+
+import scala.collection.mutable
+import scala.util.control.NonFatal
+
+import org.apache.hadoop.yarn.api.records.Resource
+
+import org.apache.spark.internal.Logging
+import org.apache.spark.util.Utils
+
+/**
+ * This helper class uses some of Hadoop 3 methods from the YARN API,
+ * so we need to use reflection to avoid compile error when building against Hadoop 2.x
+ */
+private object ResourceTypeHelper extends Logging {
+  private val AMOUNT_AND_UNIT_REGEX = "([0-9]+)([A-Za-z]*)".r
+  private val RESOURCE_TYPES_NOT_AVAILABLE_ERROR_MESSAGE =
+"Ignoring updating resource with resource types because " +
+"the version of YARN does not support it!"
+
+  def setResourceInfoFromResourceTypes(
+  resourceTypes: Map[String, String],
+  resource: Resource): Resource = {
+require(resource != null, "Resource parameter should not be null!")
+
+    if (!ResourceTypeHelper.isYarnResourceTypesAvailable() && resourceTypes.nonEmpty) {
--- End diff --

Do you mean to return whether or not `resourceTypes` is empty, but only log the 
warning if it's empty?

I suspect that is the cause of the test failures
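
One reading of this question, sketched under assumptions: return early whenever resource types are unavailable, and log the warning only in the non-empty case (the case the quoted condition tests). `Resource` below is a plain case class standing in for YARN's, and `resourceTypesAvailable` is a hypothetical parameter replacing the reflection-based check, so this is not the code in the PR.

```scala
import scala.collection.mutable.ArrayBuffer

// Hypothetical stand-in for YARN's Resource record.
final case class Resource(types: Map[String, String])

object ResourceTypeSketch {
  // Collected instead of logged so the behavior is easy to inspect.
  val warnings: ArrayBuffer[String] = ArrayBuffer.empty[String]

  def setResourceInfo(
      resourceTypes: Map[String, String],
      resource: Resource,
      resourceTypesAvailable: Boolean): Resource = {
    require(resource != null, "Resource parameter should not be null!")
    if (!resourceTypesAvailable) {
      // Warn only when the caller actually asked for resource types...
      if (resourceTypes.nonEmpty) {
        warnings += "Ignoring resource types: this YARN version does not support them"
      }
      // ...but return the resource unchanged in both cases.
      resource
    } else {
      resource.copy(types = resource.types ++ resourceTypes)
    }
  }
}
```

With this shape, an empty `resourceTypes` map on an older YARN returns silently instead of falling through to the code path that needs the newer API.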


---




[GitHub] spark pull request #21822: [SPARK-24865] Remove AnalysisBarrier - WIP

2018-07-20 Thread dilipbiswal
Github user dilipbiswal commented on a diff in the pull request:

https://github.com/apache/spark/pull/21822#discussion_r204169456
  
--- Diff: sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/analysis/CheckAnalysis.scala ---
@@ -533,7 +537,8 @@ trait CheckAnalysis extends PredicateHelper {
 
 // Simplify the predicates before validating any unsupported correlation patterns
 // in the plan.
-BooleanSimplification(sub).foreachUp {
+// TODO(rxin): Why did this need to call BooleanSimplification???
--- End diff --

@hvanhovell Yeah. I agree.


---




[GitHub] spark issue #21508: [SPARK-24488] [SQL] Fix issue when generator is aliased ...

2018-07-20 Thread bkrieger
Github user bkrieger commented on the issue:

https://github.com/apache/spark/pull/21508
  
@gatorsmile @hvanhovell any chance you can take a look at this?


---




[GitHub] spark issue #19194: [SPARK-20589] Allow limiting task concurrency per stage

2018-07-20 Thread SparkQA
Github user SparkQA commented on the issue:

https://github.com/apache/spark/pull/19194
  
**[Test build #93367 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/93367/testReport)** for PR 19194 at commit [`aac8a6a`](https://github.com/apache/spark/commit/aac8a6a619c8d60f66f9ddb072e0c4f9a7782621).


---




[GitHub] spark issue #19194: [SPARK-20589] Allow limiting task concurrency per stage

2018-07-20 Thread AmplabJenkins
Github user AmplabJenkins commented on the issue:

https://github.com/apache/spark/pull/19194
  
Test PASSed.
Refer to this link for build results (access rights to CI server needed): 

https://amplab.cs.berkeley.edu/jenkins//job/testing-k8s-prb-make-spark-distribution-unified/1188/
Test PASSed.


---




[GitHub] spark issue #19194: [SPARK-20589] Allow limiting task concurrency per stage

2018-07-20 Thread AmplabJenkins
Github user AmplabJenkins commented on the issue:

https://github.com/apache/spark/pull/19194
  
Merged build finished. Test PASSed.


---




[GitHub] spark issue #19194: [SPARK-20589] Allow limiting task concurrency per stage

2018-07-20 Thread tgravescs
Github user tgravescs commented on the issue:

https://github.com/apache/spark/pull/19194
  
test this please


---




[GitHub] spark pull request #21822: [SPARK-24865] Remove AnalysisBarrier - WIP

2018-07-20 Thread hvanhovell
Github user hvanhovell commented on a diff in the pull request:

https://github.com/apache/spark/pull/21822#discussion_r204167870
  
--- Diff: sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/analysis/CheckAnalysis.scala ---
@@ -533,7 +537,8 @@ trait CheckAnalysis extends PredicateHelper {
 
 // Simplify the predicates before validating any unsupported correlation patterns
 // in the plan.
-BooleanSimplification(sub).foreachUp {
+// TODO(rxin): Why did this need to call BooleanSimplification???
--- End diff --

Well tests fail without it, so we don't really have a choice here. For a 
second I thought we could also create some utils class, but that would just 
mean moving the code in BooleanSimplification in there just for esthetics.


---




[GitHub] spark pull request #21822: [SPARK-24865] Remove AnalysisBarrier - WIP

2018-07-20 Thread dilipbiswal
Github user dilipbiswal commented on a diff in the pull request:

https://github.com/apache/spark/pull/21822#discussion_r204166360
  
--- Diff: sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/analysis/CheckAnalysis.scala ---
@@ -533,7 +537,8 @@ trait CheckAnalysis extends PredicateHelper {
 
 // Simplify the predicates before validating any unsupported correlation patterns
 // in the plan.
-BooleanSimplification(sub).foreachUp {
+// TODO(rxin): Why did this need to call BooleanSimplification???
--- End diff --

@hvanhovell Hi Herman, as you said, we do the actual pulling up of the 
predicates in the optimizer, in PullupCorrelatedPredicates in subquery.scala. 
We also do a BooleanSimplification there first, before traversing the plan. 
Here we are doing the error reporting, and I thought it would be better to keep 
the traversal the same. Basically, we previously did the error reporting and 
rewriting in the Analyzer; now we do the error reporting in CheckAnalysis and 
the rewriting in the Optimizer. Just refreshing your memory so you can help 
make the right call here :-)
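
A toy illustration of why the order matters, assuming a made-up predicate tree rather than Catalyst expressions. `Pred`, `Local`, `Correlated`, `simplify`, and `unsupported` are hypothetical, and treating "correlated predicate under an OR" as the unsupported pattern is an assumption of this model only. The point is that a checker run on the raw tree can report a pattern that the rewriting side, which simplifies first, would never see.

```scala
// Toy predicate tree; these types are hypothetical, not Catalyst's.
sealed trait Pred
case object True extends Pred
final case class Local(name: String) extends Pred       // refers only to the subquery
final case class Correlated(name: String) extends Pred  // refers to the outer query
final case class And(left: Pred, right: Pred) extends Pred
final case class Or(left: Pred, right: Pred) extends Pred

object CorrelationCheck {
  // A tiny stand-in for boolean simplification: fold away redundant `True`s.
  def simplify(p: Pred): Pred = p match {
    case And(l, r) =>
      (simplify(l), simplify(r)) match {
        case (True, x) => x
        case (x, True) => x
        case (x, y)    => And(x, y)
      }
    case Or(l, r) =>
      (simplify(l), simplify(r)) match {
        case (True, _) | (_, True) => True
        case (x, y)                => Or(x, y)
      }
    case other => other
  }

  // Error reporting only: collect correlated predicates that sit under an OR.
  def unsupported(p: Pred, underOr: Boolean = false): List[String] = p match {
    case Correlated(n) if underOr => List(n)
    case And(l, r) => unsupported(l, underOr) ++ unsupported(r, underOr)
    case Or(l, r)  => unsupported(l, underOr = true) ++ unsupported(r, underOr = true)
    case _         => Nil
  }
}
```

For example, `Or(True, Correlated("outer.a"))` is flagged when checked as written, but simplifies to `True` and passes when simplified first, so the checker and the rewriter only agree if both simplify before traversing.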


---




[GitHub] spark pull request #21831: [SPARK-24880][BUILD]Fix the group id for spark-ku...

2018-07-20 Thread zsxwing
Github user zsxwing commented on a diff in the pull request:

https://github.com/apache/spark/pull/21831#discussion_r204165909
  
--- Diff: resource-managers/kubernetes/integration-tests/pom.xml ---
@@ -25,7 +25,7 @@
   
 
   spark-kubernetes-integration-tests_2.11
-  spark-kubernetes-integration-tests
--- End diff --

> I am wondering if we need groupId?

Yes. Each project must have a `groupId` and `artifactId`.



---




[GitHub] spark issue #7786: [SPARK-9468][Yarn][Core] Avoid scheduling tasks on preemp...

2018-07-20 Thread chemikadze
Github user chemikadze commented on the issue:

https://github.com/apache/spark/pull/7786
  
@vanzin If those were implemented, would this have any chance of getting 
merged? We use preemption quite a lot and the current behavior is not the best 
we can get: logs sometimes get flooded with preemption side effects (RPC 
errors, etc.), which makes them hard to read and confuses some users. I agree 
that, depending on task size, the effect might be either positive or negative 
(longer tasks won't be able to complete anyway and will waste resources, but 
lots of shorter ones will not get a chance to run). Does that just mean it 
should be configurable behavior (spark.yarn.releasePreemptedContainers=true/false)?


---




[GitHub] spark pull request #21831: [SPARK-24880][BUILD]Fix the group id for spark-ku...

2018-07-20 Thread gatorsmile
Github user gatorsmile commented on a diff in the pull request:

https://github.com/apache/spark/pull/21831#discussion_r204165401
  
--- Diff: resource-managers/kubernetes/integration-tests/pom.xml ---
@@ -25,7 +25,7 @@
   
 
   spark-kubernetes-integration-tests_2.11
-  spark-kubernetes-integration-tests
--- End diff --

I am wondering if we need `groupId`?


---




[GitHub] spark issue #20856: [SPARK-23731][SQL] FileSourceScanExec throws NullPointer...

2018-07-20 Thread viirya
Github user viirya commented on the issue:

https://github.com/apache/spark/pull/20856
  
@HyukjinKwon @cloud-fan Thanks for pinging me, and sorry for the late reply. 
Yeah, I looked at the final fix in #21815; it looks good as a fix for this 
particular problem.

> It seems to me it would be better to always do codegen at driver side, to 
avoid complex expression/plan operations at executor side.(not sure if it's 
possible, cc ...).

I do agree that this sounds better. A major part of executor codegen is 
unsafe codegen classes such as unsafe projection. Most of them if not all are 
not serializable for now. In order to do codegen at driver side at all, we may 
need to make them serializable. Is it worth doing this?



---




[GitHub] spark issue #21831: [SPARK-24880][BUILD]Fix the group id for spark-kubernete...

2018-07-20 Thread gatorsmile
Github user gatorsmile commented on the issue:

https://github.com/apache/spark/pull/21831
  
cc @mccheah @ssuchter 


---




[GitHub] spark issue #21830: [SPARK-24878][SQL] Fix reverse function for array type o...

2018-07-20 Thread AmplabJenkins
Github user AmplabJenkins commented on the issue:

https://github.com/apache/spark/pull/21830
  
Test PASSed.
Refer to this link for build results (access rights to CI server needed): 
https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/93353/
Test PASSed.


---




[GitHub] spark issue #21830: [SPARK-24878][SQL] Fix reverse function for array type o...

2018-07-20 Thread AmplabJenkins
Github user AmplabJenkins commented on the issue:

https://github.com/apache/spark/pull/21830
  
Merged build finished. Test PASSed.


---




[GitHub] spark issue #21830: [SPARK-24878][SQL] Fix reverse function for array type o...

2018-07-20 Thread SparkQA
Github user SparkQA commented on the issue:

https://github.com/apache/spark/pull/21830
  
**[Test build #93353 has finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/93353/testReport)** for PR 21830 at commit [`91978e7`](https://github.com/apache/spark/commit/91978e79dd64189e9dbef47d8e8e720a34982d9b).
 * This patch passes all tests.
 * This patch merges cleanly.
 * This patch adds no public classes.


---




[GitHub] spark pull request #21822: [SPARK-24865] Remove AnalysisBarrier - WIP

2018-07-20 Thread rxin
Github user rxin commented on a diff in the pull request:

https://github.com/apache/spark/pull/21822#discussion_r204163484
  
--- Diff: sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/plans/logical/LogicalPlan.scala ---
@@ -33,6 +49,116 @@ abstract class LogicalPlan
   with QueryPlanConstraints
   with Logging {
 
+  private var _analyzed: Boolean = false
+
+  /**
+   * Marks this plan as already analyzed. This should only be called by [[CheckAnalysis]].
+   */
+  private[catalyst] def setAnalyzed(): Unit = { _analyzed = true }
+
+  /**
+   * Returns true if this node and its children have already been gone through analysis and
+   * verification.  Note that this is only an optimization used to avoid analyzing trees that
+   * have already been analyzed, and can be reset by transformations.
+   */
+  def analyzed: Boolean = _analyzed
+
+  /**
+   * Returns a copy of this node where `rule` has been recursively applied first to all of its
+   * children and then itself (post-order, bottom-up). When `rule` does not apply to a given node,
+   * it is left unchanged.  This function is similar to `transformUp`, but skips sub-trees that
+   * have already been marked as analyzed.
+   *
+   * @param rule the function use to transform this nodes children
+   */
+  def resolveOperators(rule: PartialFunction[LogicalPlan, LogicalPlan]): LogicalPlan = {
--- End diff --

todo: add unit tests
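
A rough sketch of the behavior the doc comment describes, and of the kind of property a unit test could pin down. It assumes a toy `Node` tree rather than `LogicalPlan`; `Node` and `ResolveSketch` are hypothetical names, and the real method may differ in its details.

```scala
// Toy plan node; hypothetical, not Spark's LogicalPlan.
final case class Node(name: String, children: List[Node] = Nil, analyzed: Boolean = false)

object ResolveSketch {
  // Bottom-up rewrite that leaves already-analyzed subtrees untouched.
  def resolveOperators(plan: Node)(rule: PartialFunction[Node, Node]): Node =
    if (plan.analyzed) {
      plan
    } else {
      // Rewrite the children first (post-order), then try the rule on this node.
      val withNewChildren =
        plan.copy(children = plan.children.map(resolveOperators(_)(rule)))
      // Nodes the rule does not match are returned unchanged.
      rule.applyOrElse(withNewChildren, identity[Node])
    }
}
```

A test along these lines would build a tree with one analyzed and one unanalyzed subtree, apply a rule that matches both, and check that only the unanalyzed one changes.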


---




[GitHub] spark pull request #21822: [SPARK-24865] Remove AnalysisBarrier - WIP

2018-07-20 Thread hvanhovell
Github user hvanhovell commented on a diff in the pull request:

https://github.com/apache/spark/pull/21822#discussion_r204163551
  
--- Diff: sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/analysis/CheckAnalysis.scala ---
@@ -533,7 +537,8 @@ trait CheckAnalysis extends PredicateHelper {
 
 // Simplify the predicates before validating any unsupported correlation patterns
 // in the plan.
-BooleanSimplification(sub).foreachUp {
+// TODO(rxin): Why did this need to call BooleanSimplification???
--- End diff --

Yeah, I added boolean simplification here. I didn't quite like it back 
then, and I still don't like it. I was hoping this was happening in the 
`Optimizer` now.


---

-
To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org
For additional commands, e-mail: reviews-h...@spark.apache.org



[GitHub] spark pull request #21822: [SPARK-24865] Remove AnalysisBarrier - WIP

2018-07-20 Thread rxin
Github user rxin commented on a diff in the pull request:

https://github.com/apache/spark/pull/21822#discussion_r204163424
  
--- Diff: sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/plans/logical/LogicalPlan.scala ---
@@ -23,8 +23,24 @@ import org.apache.spark.sql.catalyst.analysis._
 import org.apache.spark.sql.catalyst.expressions._
 import org.apache.spark.sql.catalyst.plans.QueryPlan
 import org.apache.spark.sql.catalyst.plans.logical.statsEstimation.LogicalPlanStats
-import org.apache.spark.sql.catalyst.trees.CurrentOrigin
+import org.apache.spark.sql.catalyst.trees.{CurrentOrigin, TreeNode}
 import org.apache.spark.sql.types.StructType
+import org.apache.spark.util.Utils
+
+
+object LogicalPlan {
+
+  private val resolveOperatorDepth = new ThreadLocal[Int] {
--- End diff --

todo: explain what this is


---




[GitHub] spark issue #21822: [SPARK-24865] Remove AnalysisBarrier - WIP

2018-07-20 Thread SparkQA
Github user SparkQA commented on the issue:

https://github.com/apache/spark/pull/21822
  
**[Test build #93366 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/93366/testReport)** for PR 21822 at commit [`38980ad`](https://github.com/apache/spark/commit/38980ad066d26327387673910e0dfd981102cab9).


---




[GitHub] spark pull request #21822: [SPARK-24865] Remove AnalysisBarrier - WIP

2018-07-20 Thread rxin
Github user rxin commented on a diff in the pull request:

https://github.com/apache/spark/pull/21822#discussion_r204163328
  
--- Diff: sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/analysis/Analyzer.scala ---
@@ -2390,16 +2375,21 @@ class Analyzer(
  * scoping information for attributes and can be removed once analysis is complete.
  */
 object EliminateSubqueryAliases extends Rule[LogicalPlan] {
-  def apply(plan: LogicalPlan): LogicalPlan = plan transformUp {
-    case SubqueryAlias(_, child) => child
+  // This is actually called in the beginning of the optimization phase, and as a result
+  // is using transformUp rather than resolveOperators. This is also often called in the
+  //
--- End diff --

note: finish comment


---




[GitHub] spark issue #21822: [SPARK-24865] Remove AnalysisBarrier - WIP

2018-07-20 Thread AmplabJenkins
Github user AmplabJenkins commented on the issue:

https://github.com/apache/spark/pull/21822
  
Test PASSed.
Refer to this link for build results (access rights to CI server needed): 

https://amplab.cs.berkeley.edu/jenkins//job/testing-k8s-prb-make-spark-distribution-unified/1187/
Test PASSed.


---




[GitHub] spark issue #21822: [SPARK-24865] Remove AnalysisBarrier - WIP

2018-07-20 Thread AmplabJenkins
Github user AmplabJenkins commented on the issue:

https://github.com/apache/spark/pull/21822
  
Merged build finished. Test PASSed.


---




[GitHub] spark issue #21832: [SPARK-24879][SQL] Fix NPE in Hive partition pruning fil...

2018-07-20 Thread gatorsmile
Github user gatorsmile commented on the issue:

https://github.com/apache/spark/pull/21832
  
Test this please


---




[GitHub] spark issue #18447: [SPARK-21232][SQL][SparkR][PYSPARK] New built-in SQL fun...

2018-07-20 Thread mmolimar
Github user mmolimar commented on the issue:

https://github.com/apache/spark/pull/18447
  
Hi @HyukjinKwon 
For me it's fine:
"In some SQL databases you have to query the table schema explicitly, e.g. 
`select data_type from all_tab_columns where table_name = 'my_table'` or 
something like that.
With the ARQ engine from Apache Jena you can call this function in 
SPARQL (see 
[W3C-SPARQL](https://www.w3.org/TR/rdf-sparql-query/#func-datatype)).
I find it useful because it avoids having to query the schema."


---




[GitHub] spark issue #21822: [SPARK-24865] Remove AnalysisBarrier - WIP

2018-07-20 Thread AmplabJenkins
Github user AmplabJenkins commented on the issue:

https://github.com/apache/spark/pull/21822
  
Test FAILed.
Refer to this link for build results (access rights to CI server needed): 
https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/93351/
Test FAILed.


---




[GitHub] spark issue #21822: [SPARK-24865] Remove AnalysisBarrier - WIP

2018-07-20 Thread AmplabJenkins
Github user AmplabJenkins commented on the issue:

https://github.com/apache/spark/pull/21822
  
Merged build finished. Test FAILed.


---




[GitHub] spark issue #21822: [SPARK-24865] Remove AnalysisBarrier - WIP

2018-07-20 Thread SparkQA
Github user SparkQA commented on the issue:

https://github.com/apache/spark/pull/21822
  
**[Test build #93351 has finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/93351/testReport)** for PR 21822 at commit [`7c76c83`](https://github.com/apache/spark/commit/7c76c83fe89f3e5aa28540fd76bdfc6016c35749).
 * This patch **fails Spark unit tests**.
 * This patch merges cleanly.
 * This patch adds no public classes.


---




[GitHub] spark pull request #21832: [SPARK-24879][SQL] Fix NPE in Hive partition prun...

2018-07-20 Thread gatorsmile
Github user gatorsmile commented on a diff in the pull request:

https://github.com/apache/spark/pull/21832#discussion_r204161199
  
--- Diff: sql/hive/src/main/scala/org/apache/spark/sql/hive/client/HiveShim.scala ---
@@ -606,7 +607,15 @@ private[client] class Shim_v0_13 extends Shim_v0_12 {
 
 object ExtractableLiterals {
   def unapply(exprs: Seq[Expression]): Option[Seq[String]] = {
-val extractables = exprs.map(ExtractableLiteral.unapply)
+// SPARK-24879: The Hive filter parser does not support "null", but we still want to push
+// down as many predicates as we can while still maintaining correctness. "x in (a, b,
+// null)" can be rewritten as "x in (a, b)" for the purposes of partition pruning, so we
--- End diff --

Maybe we should write down the rules here.
`1 in (2, NULL)` -> `NULL`
`1 in (1, NULL)` -> `true`
`1 in (2)` -> `false`

NULL is not equal to FALSE. However, since all the pushed-down predicates are 
NULL-intolerant and connected by AND or OR, NULL can be treated as FALSE.
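
The rules above can be sketched with a small model of SQL three-valued logic. This is an illustration in plain Python, not Spark or Hive code: `sql_in` and `keeps_row` are hypothetical helpers, `None` stands for SQL NULL, and the IN list is assumed to be non-empty.

```python
from typing import Optional, Sequence


def sql_in(value: Optional[int], candidates: Sequence[Optional[int]]) -> Optional[bool]:
    """Model `value IN (candidates)` under SQL three-valued logic (None = NULL)."""
    if value is None:
        return None  # NULL compared with anything is NULL
    if any(c == value for c in candidates if c is not None):
        return True  # a definite match wins, even if NULLs are present
    if any(c is None for c in candidates):
        return None  # no match, but a NULL makes the answer unknown
    return False  # no match and no NULLs


def keeps_row(result: Optional[bool]) -> bool:
    """A filter keeps a row only when the predicate is TRUE; NULL and FALSE both drop it."""
    return result is True


# The three cases listed above, in the same order.
print(sql_in(1, [2, None]), sql_in(1, [1, None]), sql_in(1, [2]))
```

In a filter position, `keeps_row` gives the same answer for NULL and FALSE, which is why the two can be merged there. The equivalence would not hold under a NOT, since NOT NULL is NULL while NOT FALSE is TRUE.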


---




[GitHub] spark pull request #21822: [SPARK-24865] Remove AnalysisBarrier - WIP

2018-07-20 Thread rxin
Github user rxin commented on a diff in the pull request:

https://github.com/apache/spark/pull/21822#discussion_r204160853
  
--- Diff: sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/analysis/CheckAnalysis.scala ---
@@ -533,7 +537,8 @@ trait CheckAnalysis extends PredicateHelper {
 
 // Simplify the predicates before validating any unsupported correlation patterns
 // in the plan.
-BooleanSimplification(sub).foreachUp {
+// TODO(rxin): Why did this need to call BooleanSimplification???
--- End diff --

Thanks. I'm going to add it back.


---




[GitHub] spark issue #21748: [SPARK-23146][K8S] Support client mode.

2018-07-20 Thread AmplabJenkins
Github user AmplabJenkins commented on the issue:

https://github.com/apache/spark/pull/21748
  
Merged build finished. Test PASSed.


---




[GitHub] spark issue #21748: [SPARK-23146][K8S] Support client mode.

2018-07-20 Thread AmplabJenkins
Github user AmplabJenkins commented on the issue:

https://github.com/apache/spark/pull/21748
  
Test PASSed.
Refer to this link for build results (access rights to CI server needed): 
https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/93360/
Test PASSed.


---




[GitHub] spark issue #21748: [SPARK-23146][K8S] Support client mode.

2018-07-20 Thread SparkQA
Github user SparkQA commented on the issue:

https://github.com/apache/spark/pull/21748
  
**[Test build #93360 has finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/93360/testReport)** for PR 21748 at commit [`72c96e0`](https://github.com/apache/spark/commit/72c96e03fe4e49ec1c9b4bfad816e20cff67d75d).
 * This patch passes all tests.
 * This patch merges cleanly.
 * This patch adds no public classes.


---




[GitHub] spark issue #21798: [SPARK-24836][SQL] New option for Avro datasource - igno...

2018-07-20 Thread SparkQA
Github user SparkQA commented on the issue:

https://github.com/apache/spark/pull/21798
  
**[Test build #93364 has finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/93364/testReport)** for PR 21798 at commit [`3206a20`](https://github.com/apache/spark/commit/3206a20fc9f3036e16eca20118e1559d722ff0b9).
 * This patch passes all tests.
 * This patch merges cleanly.
 * This patch adds no public classes.


---




[GitHub] spark issue #21798: [SPARK-24836][SQL] New option for Avro datasource - igno...

2018-07-20 Thread AmplabJenkins
Github user AmplabJenkins commented on the issue:

https://github.com/apache/spark/pull/21798
  
Merged build finished. Test PASSed.


---




[GitHub] spark issue #21798: [SPARK-24836][SQL] New option for Avro datasource - igno...

2018-07-20 Thread AmplabJenkins
Github user AmplabJenkins commented on the issue:

https://github.com/apache/spark/pull/21798
  
Test PASSed.
Refer to this link for build results (access rights to CI server needed): 
https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/93364/
Test PASSed.


---




[GitHub] spark pull request #21822: [SPARK-24865] Remove AnalysisBarrier - WIP

2018-07-20 Thread rxin
Github user rxin commented on a diff in the pull request:

https://github.com/apache/spark/pull/21822#discussion_r204160150
  
--- Diff: sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/analysis/Analyzer.scala ---
@@ -787,6 +782,7 @@ class Analyzer(
   right
 case Some((oldRelation, newRelation)) =>
   val attributeRewrites = AttributeMap(oldRelation.output.zip(newRelation.output))
+  // TODO(rxin): Why do we need transformUp here?
--- End diff --

cc @cloud-fan why do we need transformUp here?


---




[GitHub] spark issue #21118: SPARK-23325: Use InternalRow when reading with DataSourc...

2018-07-20 Thread AmplabJenkins
Github user AmplabJenkins commented on the issue:

https://github.com/apache/spark/pull/21118
  
Merged build finished. Test PASSed.


---




[GitHub] spark issue #21831: [SPARK-24880][BUILD]Fix the group id for spark-kubernete...

2018-07-20 Thread AmplabJenkins
Github user AmplabJenkins commented on the issue:

https://github.com/apache/spark/pull/21831
  
Test PASSed.
Refer to this link for build results (access rights to CI server needed): 

https://amplab.cs.berkeley.edu/jenkins//job/testing-k8s-prb-make-spark-distribution-unified/1185/
Test PASSed.


---




[GitHub] spark issue #21118: SPARK-23325: Use InternalRow when reading with DataSourc...

2018-07-20 Thread AmplabJenkins
Github user AmplabJenkins commented on the issue:

https://github.com/apache/spark/pull/21118
  
Test PASSed.
Refer to this link for build results (access rights to CI server needed): 

https://amplab.cs.berkeley.edu/jenkins//job/testing-k8s-prb-make-spark-distribution-unified/1186/
Test PASSed.


---




[GitHub] spark issue #21831: [SPARK-24880][BUILD]Fix the group id for spark-kubernete...

2018-07-20 Thread AmplabJenkins
Github user AmplabJenkins commented on the issue:

https://github.com/apache/spark/pull/21831
  
Merged build finished. Test PASSed.


---




[GitHub] spark issue #21831: [SPARK-24880][BUILD]Fix the group id for spark-kubernete...

2018-07-20 Thread SparkQA
Github user SparkQA commented on the issue:

https://github.com/apache/spark/pull/21831
  
Kubernetes integration test status success
URL: 
https://amplab.cs.berkeley.edu/jenkins/job/testing-k8s-prb-make-spark-distribution-unified/1185/



---




[GitHub] spark issue #21608: [SPARK-24626] [SQL] Improve location size calculation in...

2018-07-20 Thread AmplabJenkins
Github user AmplabJenkins commented on the issue:

https://github.com/apache/spark/pull/21608
  
Merged build finished. Test PASSed.


---




[GitHub] spark issue #21608: [SPARK-24626] [SQL] Improve location size calculation in...

2018-07-20 Thread AmplabJenkins
Github user AmplabJenkins commented on the issue:

https://github.com/apache/spark/pull/21608
  
Test PASSed.
Refer to this link for build results (access rights to CI server needed): 
https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/93352/
Test PASSed.


---




[GitHub] spark issue #21608: [SPARK-24626] [SQL] Improve location size calculation in...

2018-07-20 Thread SparkQA
Github user SparkQA commented on the issue:

https://github.com/apache/spark/pull/21608
  
**[Test build #93352 has finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/93352/testReport)** for PR 21608 at commit [`107f4c6`](https://github.com/apache/spark/commit/107f4c675978628bf0effc08924a5f7d397f3719).
 * This patch passes all tests.
 * This patch merges cleanly.
 * This patch adds no public classes.


---




[GitHub] spark issue #21798: [SPARK-24836][SQL] New option for Avro datasource - igno...

2018-07-20 Thread AmplabJenkins
Github user AmplabJenkins commented on the issue:

https://github.com/apache/spark/pull/21798
  
Test PASSed.
Refer to this link for build results (access rights to CI server needed): 
https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/93363/
Test PASSed.


---




[GitHub] spark issue #21798: [SPARK-24836][SQL] New option for Avro datasource - igno...

2018-07-20 Thread AmplabJenkins
Github user AmplabJenkins commented on the issue:

https://github.com/apache/spark/pull/21798
  
Merged build finished. Test PASSed.


---




[GitHub] spark issue #21798: [SPARK-24836][SQL] New option for Avro datasource - igno...

2018-07-20 Thread SparkQA
Github user SparkQA commented on the issue:

https://github.com/apache/spark/pull/21798
  
**[Test build #93363 has finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/93363/testReport)** for PR 21798 at commit [`0657508`](https://github.com/apache/spark/commit/0657508c7599a3a0ea70027bea96723e8088cc79).
 * This patch passes all tests.
 * This patch merges cleanly.
 * This patch adds the following public classes _(experimental)_:
  * `class AvroOptions(`


---




[GitHub] spark pull request #21832: [SPARK-24879][SQL] Fix NPE in Hive partition prun...

2018-07-20 Thread gatorsmile
Github user gatorsmile commented on a diff in the pull request:

https://github.com/apache/spark/pull/21832#discussion_r204157323
  
--- Diff: sql/hive/src/main/scala/org/apache/spark/sql/hive/client/HiveShim.scala ---
@@ -606,7 +606,15 @@ private[client] class Shim_v0_13 extends Shim_v0_12 {
 
 object ExtractableLiterals {
   def unapply(exprs: Seq[Expression]): Option[Seq[String]] = {
-val extractables = exprs.map(ExtractableLiteral.unapply)
+// SPARK-24879: The Hive filter parser does not support "null", but we still want to push
--- End diff --

-> `Hive metastore filter parser`


---




[GitHub] spark issue #21748: [SPARK-23146][K8S] Support client mode.

2018-07-20 Thread liyinan926
Github user liyinan926 commented on the issue:

https://github.com/apache/spark/pull/21748
  
LGTM for the docs updates.


---




[GitHub] spark pull request #21832: [SPARK-24879][SQL] Fix NPE in Hive partition prun...

2018-07-20 Thread gatorsmile
Github user gatorsmile commented on a diff in the pull request:

https://github.com/apache/spark/pull/21832#discussion_r204156800
  
--- Diff: sql/hive/src/test/scala/org/apache/spark/sql/hive/client/FiltersSuite.scala ---
@@ -72,6 +72,10 @@ class FiltersSuite extends SparkFunSuite with Logging with PlanTest {
   (Literal("p2\" and q=\"q2") === a("stringcol", StringType)) :: Nil,
 """stringcol = 'p1" and q="q1' and 'p2" and q="q2' = stringcol""")
 
+  filterTest("SPARK-24879 null literals should be ignored for IN constructs",
+Seq(a("intcol", IntegerType) in (Literal(1), Literal(null))),
--- End diff --

Let us add more test cases for better test coverage.


---




[GitHub] spark issue #21118: SPARK-23325: Use InternalRow when reading with DataSourc...

2018-07-20 Thread SparkQA
Github user SparkQA commented on the issue:

https://github.com/apache/spark/pull/21118
  
**[Test build #93365 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/93365/testReport)** for PR 21118 at commit [`d1fa32e`](https://github.com/apache/spark/commit/d1fa32e201e73f281a87d46a3510f0e3082c1d35).


---




[GitHub] spark issue #21798: [SPARK-24836][SQL] New option for Avro datasource - igno...

2018-07-20 Thread SparkQA
Github user SparkQA commented on the issue:

https://github.com/apache/spark/pull/21798
  
**[Test build #93364 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/93364/testReport)** for PR 21798 at commit [`3206a20`](https://github.com/apache/spark/commit/3206a20fc9f3036e16eca20118e1559d722ff0b9).


---




[GitHub] spark issue #21832: [SPARK-24879][SQL] Fix NPE in Hive partition pruning fil...

2018-07-20 Thread AmplabJenkins
Github user AmplabJenkins commented on the issue:

https://github.com/apache/spark/pull/21832
  
Can one of the admins verify this patch?


---




[GitHub] spark issue #21831: [SPARK-24880][BUILD]Fix the group id for spark-kubernete...

2018-07-20 Thread SparkQA
Github user SparkQA commented on the issue:

https://github.com/apache/spark/pull/21831
  
Kubernetes integration test starting
URL: 
https://amplab.cs.berkeley.edu/jenkins/job/testing-k8s-prb-make-spark-distribution-unified/1185/



---




[GitHub] spark issue #21832: [SPARK-24879][SQL] Fix NPE in Hive partition pruning fil...

2018-07-20 Thread AmplabJenkins
Github user AmplabJenkins commented on the issue:

https://github.com/apache/spark/pull/21832
  
Can one of the admins verify this patch?


---




[GitHub] spark issue #21832: [SPARK-24879][SQL] Fix NPE in Hive partition pruning fil...

2018-07-20 Thread gatorsmile
Github user gatorsmile commented on the issue:

https://github.com/apache/spark/pull/21832
  
ok to test


---




[GitHub] spark pull request #20057: [SPARK-22880][SQL] Add cascadeTruncate option to ...

2018-07-20 Thread asfgit
Github user asfgit closed the pull request at:

https://github.com/apache/spark/pull/20057


---




[GitHub] spark issue #21832: [SPARK-24879][SQL] Fix NPE in Hive partition pruning fil...

2018-07-20 Thread AmplabJenkins
Github user AmplabJenkins commented on the issue:

https://github.com/apache/spark/pull/21832
  
Can one of the admins verify this patch?


---




[GitHub] spark issue #20057: [SPARK-22880][SQL] Add cascadeTruncate option to JDBC da...

2018-07-20 Thread gatorsmile
Github user gatorsmile commented on the issue:

https://github.com/apache/spark/pull/20057
  
Thanks! Merged to master.


---




[GitHub] spark pull request #21832: [SPARK-24879][SQL] Fix NPE in Hive partition prun...

2018-07-20 Thread PenguinToast
GitHub user PenguinToast opened a pull request:

https://github.com/apache/spark/pull/21832

[SPARK-24879][SQL] Fix NPE in Hive partition pruning filter pushdown

## What changes were proposed in this pull request?
We get an NPE when we have a filter on a partition column of the form 
`col in (x, null)`. This is due to the filter converter in HiveShim not handling 
`null`s correctly. This patch fixes the bug while still pushing down as many 
of the partition pruning predicates as possible, by filtering out `null`s from 
any `in` predicate. Since Hive only supports very simple partition pruning 
filters, this change should preserve correctness.
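
As a rough check of the correctness claim, the sketch below compares pruning with and without the `null` entry in the IN list. It is a plain Python model rather than the HiveShim code: `sql_in` and `prune` are hypothetical names, `None` stands for `null`, and the filter is assumed to keep a partition only when the predicate evaluates to TRUE.

```python
from typing import List, Optional, Sequence


def sql_in(value: Optional[int], candidates: Sequence[Optional[int]]) -> Optional[bool]:
    """Model `value IN (candidates)` under SQL three-valued logic (None = null)."""
    if value is None:
        return None
    if any(c == value for c in candidates if c is not None):
        return True
    return None if any(c is None for c in candidates) else False


def prune(partition_values: Sequence[int], in_list: Sequence[Optional[int]]) -> List[int]:
    """Keep only partitions for which the IN predicate is definitely TRUE."""
    return [p for p in partition_values if sql_in(p, in_list) is True]


partitions = [1, 2, 3]
original_list = [1, 3, None]
stripped_list = [v for v in original_list if v is not None]  # the rewrite described above

with_null = prune(partitions, original_list)
without_null = prune(partitions, stripped_list)
print(with_null == without_null)
```

Under these assumptions both lists select the same partitions, because a `null` in the list can only turn a FALSE into a NULL, and the filter drops the partition in both cases.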

## How was this patch tested?
Unit tests, manual tests


You can merge this pull request into a Git repository by running:

$ git pull https://github.com/PenguinToast/spark partition-pruning-npe

Alternatively you can review and apply these changes as the patch at:

https://github.com/apache/spark/pull/21832.patch

To close this pull request, make a commit to your master/trunk branch
with (at least) the following in the commit message:

This closes #21832


commit 388caa978ead4bb8957b5e41a20c394fc90fe234
Author: William Sheu 
Date:   2018-07-20T18:41:27Z

Filter out `null` values for partition pruning predicates




---




[GitHub] spark issue #21118: SPARK-23325: Use InternalRow when reading with DataSourc...

2018-07-20 Thread rdblue
Github user rdblue commented on the issue:

https://github.com/apache/spark/pull/21118
  
Rebased on master to fix conflicts.


---




[GitHub] spark issue #21798: [SPARK-24836][SQL] New option for Avro datasource - igno...

2018-07-20 Thread SparkQA
Github user SparkQA commented on the issue:

https://github.com/apache/spark/pull/21798
  
**[Test build #93363 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/93363/testReport)** for PR 21798 at commit [`0657508`](https://github.com/apache/spark/commit/0657508c7599a3a0ea70027bea96723e8088cc79).


---




[GitHub] spark issue #21831: [SPARK-24880][BUILD]Fix the group id for spark-kubernete...

2018-07-20 Thread SparkQA
Github user SparkQA commented on the issue:

https://github.com/apache/spark/pull/21831
  
**[Test build #93362 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/93362/testReport)** for PR 21831 at commit [`4345139`](https://github.com/apache/spark/commit/4345139cd45e1506ac788dc55a4d9ed420ca6b78).


---




[GitHub] spark issue #21831: [SPARK-24880][BUILD]Fix the group id for spark-kubernete...

2018-07-20 Thread zsxwing
Github user zsxwing commented on the issue:

https://github.com/apache/spark/pull/21831
  
cc @gatorsmile 


---




[GitHub] spark issue #21774: [SPARK-24811][SQL]Avro: add new function from_avro and t...

2018-07-20 Thread gatorsmile
Github user gatorsmile commented on the issue:

https://github.com/apache/spark/pull/21774
  
Need to revert this PR since it breaks the build. 
spark-master-compile-maven-hadoop-2.6 #7902 (broken since this build)


---




[GitHub] spark pull request #21831: [SPARK-24880][BUILD]Fix the group id for spark-ku...

2018-07-20 Thread zsxwing
GitHub user zsxwing opened a pull request:

https://github.com/apache/spark/pull/21831

[SPARK-24880][BUILD]Fix the group id for spark-kubernetes-integration-tests

## What changes were proposed in this pull request?

The correct group id should be `org.apache.spark`. This is causing the 
nightly build failure: 
https://amplab.cs.berkeley.edu/jenkins/job/spark-master-maven-snapshots/2295/console

`
[ERROR] Failed to execute goal 
org.apache.maven.plugins:maven-deploy-plugin:2.8.2:deploy (default-deploy) on 
project spark-kubernetes-integration-tests_2.11: Failed to deploy artifacts: 
Could not transfer artifact 
spark-kubernetes-integration-tests:spark-kubernetes-integration-tests_2.11:jar:2.4.0-20180720.101629-1
 from/to apache.snapshots.https 
(https://repository.apache.org/content/repositories/snapshots): Access denied 
to: 
https://repository.apache.org/content/repositories/snapshots/spark-kubernetes-integration-tests/spark-kubernetes-integration-tests_2.11/2.4.0-SNAPSHOT/spark-kubernetes-integration-tests_2.11-2.4.0-20180720.101629-1.jar,
 ReasonPhrase: Forbidden. -> [Help 1]
[ERROR] 
`

## How was this patch tested?

Jenkins.


You can merge this pull request into a Git repository by running:

$ git pull https://github.com/zsxwing/spark fix-k8s-test

Alternatively you can review and apply these changes as the patch at:

https://github.com/apache/spark/pull/21831.patch

To close this pull request, make a commit to your master/trunk branch
with (at least) the following in the commit message:

This closes #21831


commit 4345139cd45e1506ac788dc55a4d9ed420ca6b78
Author: zsxwing 
Date:   2018-07-20T19:50:55Z

Fix the group id for spark-kubernetes-integration-tests




---




[GitHub] spark issue #21748: [SPARK-23146][K8S] Support client mode.

2018-07-20 Thread mccheah
Github user mccheah commented on the issue:

https://github.com/apache/spark/pull/21748
  
Never mind, think it's recovering now.


---




[GitHub] spark issue #21748: [SPARK-23146][K8S] Support client mode.

2018-07-20 Thread AmplabJenkins
Github user AmplabJenkins commented on the issue:

https://github.com/apache/spark/pull/21748
  
Merged build finished. Test PASSed.


---




[GitHub] spark issue #21748: [SPARK-23146][K8S] Support client mode.

2018-07-20 Thread SparkQA
Github user SparkQA commented on the issue:

https://github.com/apache/spark/pull/21748
  
**[Test build #93361 has finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/93361/testReport)** for PR 21748 at commit [`72c96e0`](https://github.com/apache/spark/commit/72c96e03fe4e49ec1c9b4bfad816e20cff67d75d).
 * This patch passes all tests.
 * This patch merges cleanly.
 * This patch adds no public classes.


---




[GitHub] spark issue #21748: [SPARK-23146][K8S] Support client mode.

2018-07-20 Thread AmplabJenkins
Github user AmplabJenkins commented on the issue:

https://github.com/apache/spark/pull/21748
  
Test PASSed.
Refer to this link for build results (access rights to CI server needed): 
https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/93361/
Test PASSed.


---




[GitHub] spark issue #21748: [SPARK-23146][K8S] Support client mode.

2018-07-20 Thread AmplabJenkins
Github user AmplabJenkins commented on the issue:

https://github.com/apache/spark/pull/21748
  
Merged build finished. Test FAILed.


---




[GitHub] spark issue #21748: [SPARK-23146][K8S] Support client mode.

2018-07-20 Thread AmplabJenkins
Github user AmplabJenkins commented on the issue:

https://github.com/apache/spark/pull/21748
  
Test FAILed.
Refer to this link for build results (access rights to CI server needed): 

https://amplab.cs.berkeley.edu/jenkins//job/testing-k8s-prb-make-spark-distribution-unified/1184/
Test FAILed.


---



