[GitHub] spark issue #21833: [PYSPARK] [TEST] [MINOR] Fix UDFInitializationTests
Github user AmplabJenkins commented on the issue: https://github.com/apache/spark/pull/21833 Can one of the admins verify this patch? --- - To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org For additional commands, e-mail: reviews-h...@spark.apache.org
[GitHub] spark pull request #21833: [PYSPARK] [TEST] [MINOR] Fix UDFInitializationTes...
GitHub user PenguinToast opened a pull request:

    https://github.com/apache/spark/pull/21833

    [PYSPARK] [TEST] [MINOR] Fix UDFInitializationTests

    ## What changes were proposed in this pull request?

    Fix a typo in pyspark sql tests

You can merge this pull request into a Git repository by running:

    $ git pull https://github.com/PenguinToast/spark fix-test-typo

Alternatively you can review and apply these changes as the patch at:

    https://github.com/apache/spark/pull/21833.patch

To close this pull request, make a commit to your master/trunk branch
with (at least) the following in the commit message:

    This closes #21833

commit c4f664bd49f701773ea52751ee135915af973014
Author: William Sheu
Date:   2018-07-20T22:26:17Z

    Fix typo in pyspark sql tests
[GitHub] spark pull request #21831: [SPARK-24880][BUILD]Fix the group id for spark-ku...
Github user asfgit closed the pull request at: https://github.com/apache/spark/pull/21831
[GitHub] spark issue #21831: [SPARK-24880][BUILD]Fix the group id for spark-kubernete...
Github user gatorsmile commented on the issue: https://github.com/apache/spark/pull/21831 Thanks! Merged to master.
[GitHub] spark issue #21802: [SPARK-23928][SQL] Add shuffle collection function.
Github user rxin commented on the issue: https://github.com/apache/spark/pull/21802 Do we really need full codegen for all of these collection functions? They seem pretty slow, and specialization with full codegen won't help perf that much (and might even hurt by blowing up the code size), right?
[GitHub] spark issue #21826: [SPARK-24872] Remove the symbol “||” of the “OR”...
Github user rxin commented on the issue: https://github.com/apache/spark/pull/21826 cc @gatorsmile @cloud-fan @HyukjinKwon Is this a good thing to do?
[GitHub] spark issue #21822: [SPARK-24865] Remove AnalysisBarrier - WIP
Github user SparkQA commented on the issue: https://github.com/apache/spark/pull/21822 **[Test build #93370 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/93370/testReport)** for PR 21822 at commit [`38980ad`](https://github.com/apache/spark/commit/38980ad066d26327387673910e0dfd981102cab9).
[GitHub] spark issue #21826: [SPARK-24872] Remove the symbol “||” of the “OR”...
Github user rxin commented on the issue: https://github.com/apache/spark/pull/21826 Jenkins, test this please.
[GitHub] spark issue #21822: [SPARK-24865] Remove AnalysisBarrier - WIP
Github user AmplabJenkins commented on the issue: https://github.com/apache/spark/pull/21822 Test PASSed. Refer to this link for build results (access rights to CI server needed): https://amplab.cs.berkeley.edu/jenkins//job/testing-k8s-prb-make-spark-distribution-unified/1190/ Test PASSed.
[GitHub] spark issue #21822: [SPARK-24865] Remove AnalysisBarrier - WIP
Github user AmplabJenkins commented on the issue: https://github.com/apache/spark/pull/21822 Merged build finished. Test PASSed.
[GitHub] spark issue #21822: [SPARK-24865] Remove AnalysisBarrier - WIP
Github user rxin commented on the issue: https://github.com/apache/spark/pull/21822 retest this please
[GitHub] spark issue #21831: [SPARK-24880][BUILD]Fix the group id for spark-kubernete...
Github user AmplabJenkins commented on the issue: https://github.com/apache/spark/pull/21831 Test PASSed. Refer to this link for build results (access rights to CI server needed): https://amplab.cs.berkeley.edu/jenkins//job/testing-k8s-prb-make-spark-distribution-unified/1189/ Test PASSed.
[GitHub] spark issue #21831: [SPARK-24880][BUILD]Fix the group id for spark-kubernete...
Github user SparkQA commented on the issue: https://github.com/apache/spark/pull/21831 Kubernetes integration test status success URL: https://amplab.cs.berkeley.edu/jenkins/job/testing-k8s-prb-make-spark-distribution-unified/1189/
[GitHub] spark issue #21831: [SPARK-24880][BUILD]Fix the group id for spark-kubernete...
Github user AmplabJenkins commented on the issue: https://github.com/apache/spark/pull/21831 Merged build finished. Test PASSed.
[GitHub] spark issue #21832: [SPARK-24879][SQL] Fix NPE in Hive partition pruning fil...
Github user SparkQA commented on the issue: https://github.com/apache/spark/pull/21832 **[Test build #93369 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/93369/testReport)** for PR 21832 at commit [`ce86fbe`](https://github.com/apache/spark/commit/ce86fbeda06eb2448ecd2c425982aacca3d66b45).
[GitHub] spark issue #21831: [SPARK-24880][BUILD]Fix the group id for spark-kubernete...
Github user SparkQA commented on the issue: https://github.com/apache/spark/pull/21831 Kubernetes integration test starting URL: https://amplab.cs.berkeley.edu/jenkins/job/testing-k8s-prb-make-spark-distribution-unified/1189/
[GitHub] spark pull request #21829: [SPARK-24876][SQL] Avro: simplify schema serializ...
Github user asfgit closed the pull request at: https://github.com/apache/spark/pull/21829
[GitHub] spark issue #21831: [SPARK-24880][BUILD]Fix the group id for spark-kubernete...
Github user gatorsmile commented on the issue: https://github.com/apache/spark/pull/21831 LGTM
[GitHub] spark issue #21832: [SPARK-24879][SQL] Fix NPE in Hive partition pruning fil...
Github user gatorsmile commented on the issue: https://github.com/apache/spark/pull/21832 add to whitelist
[GitHub] spark issue #21829: [SPARK-24876][SQL] Avro: simplify schema serialization
Github user gatorsmile commented on the issue: https://github.com/apache/spark/pull/21829 LGTM Thanks! Merged to master
[GitHub] spark issue #21508: [SPARK-24488] [SQL] Fix issue when generator is aliased ...
Github user gatorsmile commented on the issue: https://github.com/apache/spark/pull/21508 cc @maropu Could you help review this?
[GitHub] spark issue #21831: [SPARK-24880][BUILD]Fix the group id for spark-kubernete...
Github user SparkQA commented on the issue: https://github.com/apache/spark/pull/21831 **[Test build #93368 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/93368/testReport)** for PR 21831 at commit [`980d30c`](https://github.com/apache/spark/commit/980d30c8964c92f3965e725063fd27b5c4e60922).
[GitHub] spark pull request #21653: [SPARK-13343] speculative tasks that didn't commi...
Github user tgravescs commented on a diff in the pull request:

    https://github.com/apache/spark/pull/21653#discussion_r20409

    --- Diff: core/src/main/scala/org/apache/spark/scheduler/TaskSetManager.scala ---
    @@ -723,6 +723,21 @@ private[spark] class TaskSetManager(
       def handleSuccessfulTask(tid: Long, result: DirectTaskResult[_]): Unit = {
         val info = taskInfos(tid)
         val index = info.index
    +    // Check if any other attempt succeeded before this and this attempt has not been handled
    +    if (successful(index) && killedByOtherAttempt.contains(tid)) {
    +      calculatedTasks -= 1
    --- End diff --

    Please add a comment here about cleaning up the things that were incremented earlier while handling it as successful.
[GitHub] spark pull request #21653: [SPARK-13343] speculative tasks that didn't commi...
Github user tgravescs commented on a diff in the pull request:

    https://github.com/apache/spark/pull/21653#discussion_r204177708

    --- Diff: core/src/main/scala/org/apache/spark/scheduler/TaskSetManager.scala ---
    @@ -723,6 +723,21 @@ private[spark] class TaskSetManager(
       def handleSuccessfulTask(tid: Long, result: DirectTaskResult[_]): Unit = {
         val info = taskInfos(tid)
         val index = info.index
    +    // Check if any other attempt succeeded before this and this attempt has not been handled
    +    if (successful(index) && killedByOtherAttempt.contains(tid)) {
    +      calculatedTasks -= 1
    +
    +      val resultSizeAcc = result.accumUpdates.find(a =>
    +        a.name == Some(InternalAccumulator.RESULT_SIZE))
    +      if (resultSizeAcc.isDefined) {
    +        totalResultSize -= resultSizeAcc.get.asInstanceOf[LongAccumulator].value
    --- End diff --

    The downside here is that we already incremented, and other tasks could have checked and failed before we decrement. Unless someone else has a better idea, though, this is better than it is now.
[GitHub] spark pull request #21831: [SPARK-24880][BUILD]Fix the group id for spark-ku...
Github user zsxwing commented on a diff in the pull request:

    https://github.com/apache/spark/pull/21831#discussion_r204177925

    --- Diff: resource-managers/kubernetes/integration-tests/pom.xml ---
    @@ -25,7 +25,7 @@
       spark-kubernetes-integration-tests_2.11
    -  spark-kubernetes-integration-tests
    --- End diff --

    Discussed offline. `groupId` can be omitted, in which case the module inherits it from the parent module. I just removed it.
[GitHub] spark issue #21635: [SPARK-24594][YARN] Introducing metrics for YARN
Github user tgravescs commented on the issue: https://github.com/apache/spark/pull/21635 +1 @jerryshao
[GitHub] spark pull request #20761: [SPARK-20327][CORE][YARN] Add CLI support for YAR...
Github user squito commented on a diff in the pull request:

    https://github.com/apache/spark/pull/20761#discussion_r204171230

    --- Diff: resource-managers/yarn/src/main/scala/org/apache/spark/deploy/yarn/ResourceTypeHelper.scala ---
    @@ -0,0 +1,119 @@
    +/*
    + * Licensed to the Apache Software Foundation (ASF) under one or more
    + * contributor license agreements. See the NOTICE file distributed with
    + * this work for additional information regarding copyright ownership.
    + * The ASF licenses this file to You under the Apache License, Version 2.0
    + * (the "License"); you may not use this file except in compliance with
    + * the License. You may obtain a copy of the License at
    + *
    + *    http://www.apache.org/licenses/LICENSE-2.0
    + *
    + * Unless required by applicable law or agreed to in writing, software
    + * distributed under the License is distributed on an "AS IS" BASIS,
    + * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
    + * See the License for the specific language governing permissions and
    + * limitations under the License.
    + */
    +
    +package org.apache.spark.deploy.yarn
    +
    +import java.lang.{Integer => JInteger, Long => JLong}
    +import java.lang.reflect.InvocationTargetException
    +
    +import scala.collection.mutable
    +import scala.util.control.NonFatal
    +
    +import org.apache.hadoop.yarn.api.records.Resource
    +
    +import org.apache.spark.internal.Logging
    +import org.apache.spark.util.Utils
    +
    +/**
    + * This helper class uses some of Hadoop 3 methods from the YARN API,
    + * so we need to use reflection to avoid compile error when building against Hadoop 2.x
    + */
    +private object ResourceTypeHelper extends Logging {
    +  private val AMOUNT_AND_UNIT_REGEX = "([0-9]+)([A-Za-z]*)".r
    +  private val RESOURCE_TYPES_NOT_AVAILABLE_ERROR_MESSAGE =
    +    "Ignoring updating resource with resource types because " +
    +    "the version of YARN does not support it!"
    +
    +  def setResourceInfoFromResourceTypes(
    +      resourceTypes: Map[String, String],
    +      resource: Resource): Resource = {
    +    require(resource != null, "Resource parameter should not be null!")
    +
    +    if (!ResourceTypeHelper.isYarnResourceTypesAvailable() && resourceTypes.nonEmpty) {
    --- End diff --

    Do you mean to return whether or not `resourceTypes` is empty, but only log the warning if it's empty? I suspect that is the cause of the test failures.
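For readers unfamiliar with the `AMOUNT_AND_UNIT_REGEX` pattern in the diff above, here is a minimal Scala sketch of how a pattern of that shape splits a value such as `2g` into an amount and a unit. The object name `AmountAndUnit` and the `parse` helper are hypothetical and are not taken from the pull request; only the regular expression itself is copied from the diff.

```scala
import scala.util.Try

object AmountAndUnit {
  // Same pattern as AMOUNT_AND_UNIT_REGEX in the quoted diff:
  // one or more digits followed by an optional run of letters.
  private val AmountAndUnitRegex = "([0-9]+)([A-Za-z]*)".r

  // Hypothetical helper (not from the PR): split "2g" into (2, "g").
  // A Scala Regex pattern match must cover the whole string, so inputs
  // such as "g2" or "10 Mi" fall through to None.
  def parse(s: String): Option[(Long, String)] = s match {
    case AmountAndUnitRegex(amount, unit) =>
      // Try guards against digit strings too large for a Long.
      Try(amount.toLong).toOption.map(a => (a, unit))
    case _ => None
  }
}
```

A value without a unit, such as `1024`, still matches because the second group may be empty; in that case the unit comes back as an empty string.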
[GitHub] spark pull request #21822: [SPARK-24865] Remove AnalysisBarrier - WIP
Github user dilipbiswal commented on a diff in the pull request:

    https://github.com/apache/spark/pull/21822#discussion_r204169456

    --- Diff: sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/analysis/CheckAnalysis.scala ---
    @@ -533,7 +537,8 @@ trait CheckAnalysis extends PredicateHelper {
         // Simplify the predicates before validating any unsupported correlation patterns
         // in the plan.
    -    BooleanSimplification(sub).foreachUp {
    +    // TODO(rxin): Why did this need to call BooleanSimplification???
    --- End diff --

    @hvanhovell Yeah. I agree.
[GitHub] spark issue #21508: [SPARK-24488] [SQL] Fix issue when generator is aliased ...
Github user bkrieger commented on the issue: https://github.com/apache/spark/pull/21508 @gatorsmile @hvanhovell any chance you can take a look at this?
[GitHub] spark issue #19194: [SPARK-20589] Allow limiting task concurrency per stage
Github user SparkQA commented on the issue: https://github.com/apache/spark/pull/19194 **[Test build #93367 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/93367/testReport)** for PR 19194 at commit [`aac8a6a`](https://github.com/apache/spark/commit/aac8a6a619c8d60f66f9ddb072e0c4f9a7782621).
[GitHub] spark issue #19194: [SPARK-20589] Allow limiting task concurrency per stage
Github user AmplabJenkins commented on the issue: https://github.com/apache/spark/pull/19194 Test PASSed. Refer to this link for build results (access rights to CI server needed): https://amplab.cs.berkeley.edu/jenkins//job/testing-k8s-prb-make-spark-distribution-unified/1188/ Test PASSed.
[GitHub] spark issue #19194: [SPARK-20589] Allow limiting task concurrency per stage
Github user AmplabJenkins commented on the issue: https://github.com/apache/spark/pull/19194 Merged build finished. Test PASSed.
[GitHub] spark issue #19194: [SPARK-20589] Allow limiting task concurrency per stage
Github user tgravescs commented on the issue: https://github.com/apache/spark/pull/19194 test this please
[GitHub] spark pull request #21822: [SPARK-24865] Remove AnalysisBarrier - WIP
Github user hvanhovell commented on a diff in the pull request:

    https://github.com/apache/spark/pull/21822#discussion_r204167870

    --- Diff: sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/analysis/CheckAnalysis.scala ---
    @@ -533,7 +537,8 @@ trait CheckAnalysis extends PredicateHelper {
         // Simplify the predicates before validating any unsupported correlation patterns
         // in the plan.
    -    BooleanSimplification(sub).foreachUp {
    +    // TODO(rxin): Why did this need to call BooleanSimplification???
    --- End diff --

    Well, tests fail without it, so we don't really have a choice here. For a second I thought we could also create some utils class, but that would just mean moving the code from BooleanSimplification into it purely for aesthetics.
[GitHub] spark pull request #21822: [SPARK-24865] Remove AnalysisBarrier - WIP
Github user dilipbiswal commented on a diff in the pull request:

    https://github.com/apache/spark/pull/21822#discussion_r204166360

    --- Diff: sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/analysis/CheckAnalysis.scala ---
    @@ -533,7 +537,8 @@ trait CheckAnalysis extends PredicateHelper {
         // Simplify the predicates before validating any unsupported correlation patterns
         // in the plan.
    -    BooleanSimplification(sub).foreachUp {
    +    // TODO(rxin): Why did this need to call BooleanSimplification???
    --- End diff --

    @hvanhovell Hi Herman, as you said, we do the actual pulling up of the predicates in the optimizer, in PullupCorrelatedPredicates in subquery.scala. We also do a BooleanSimplification first before traversing the plan there. Here we are doing the error reporting, and I thought it would be better to keep the traversal the same way. Basically, we previously did the error reporting and rewriting in the Analyzer, and now we do the error reporting in checkAnalysis and the rewriting in the Optimizer. Just to refresh your memory so you can help make the right call here :-)
[GitHub] spark pull request #21831: [SPARK-24880][BUILD]Fix the group id for spark-ku...
Github user zsxwing commented on a diff in the pull request:

    https://github.com/apache/spark/pull/21831#discussion_r204165909

    --- Diff: resource-managers/kubernetes/integration-tests/pom.xml ---
    @@ -25,7 +25,7 @@
       spark-kubernetes-integration-tests_2.11
    -  spark-kubernetes-integration-tests
    --- End diff --

    > I am wondering if we need groupId?

    Yes. Each project must have a `groupId` and `artifactId`.
[GitHub] spark issue #7786: [SPARK-9468][Yarn][Core] Avoid scheduling tasks on preemp...
Github user chemikadze commented on the issue: https://github.com/apache/spark/pull/7786 @vanzin If those were implemented, would it have any chance of getting merged? We use preemption quite a lot and the current behavior is not the best we can get: logs sometimes get flooded with preemption side effects (RPC errors, etc.), which makes them hard to read and confuses some users. I agree that, depending on task size, the effect might be either positive or negative (longer tasks won't be able to complete anyway and will waste resources, but lots of shorter ones will not get a chance to run). Does it just mean it should be configurable behavior (spark.yarn.releasePreemptedContainers=true/false)?
[GitHub] spark pull request #21831: [SPARK-24880][BUILD]Fix the group id for spark-ku...
Github user gatorsmile commented on a diff in the pull request:

    https://github.com/apache/spark/pull/21831#discussion_r204165401

    --- Diff: resource-managers/kubernetes/integration-tests/pom.xml ---
    @@ -25,7 +25,7 @@
       spark-kubernetes-integration-tests_2.11
    -  spark-kubernetes-integration-tests
    --- End diff --

    I am wondering if we need `groupId`?
[GitHub] spark issue #20856: [SPARK-23731][SQL] FileSourceScanExec throws NullPointer...
Github user viirya commented on the issue: https://github.com/apache/spark/pull/20856

    @HyukjinKwon @cloud-fan Thanks for pinging me, and sorry for the late reply. Yeah, I looked at the final fix in #21815; it looks like a good fix for this particular problem.

    > It seems to me it would be better to always do codegen at driver side, to avoid complex expression/plan operations at executor side.(not sure if it's possible, cc ...).

    I do agree that this sounds better. A major part of executor codegen is the unsafe codegen classes, such as unsafe projection. Most of them, if not all, are not serializable for now. In order to do codegen at the driver side at all, we may need to make them serializable. Is it worth doing this?
[GitHub] spark issue #21831: [SPARK-24880][BUILD]Fix the group id for spark-kubernete...
Github user gatorsmile commented on the issue: https://github.com/apache/spark/pull/21831 cc @mccheah @ssuchter
[GitHub] spark issue #21830: [SPARK-24878][SQL] Fix reverse function for array type o...
Github user AmplabJenkins commented on the issue: https://github.com/apache/spark/pull/21830 Test PASSed. Refer to this link for build results (access rights to CI server needed): https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/93353/ Test PASSed.
[GitHub] spark issue #21830: [SPARK-24878][SQL] Fix reverse function for array type o...
Github user AmplabJenkins commented on the issue: https://github.com/apache/spark/pull/21830 Merged build finished. Test PASSed.
[GitHub] spark issue #21830: [SPARK-24878][SQL] Fix reverse function for array type o...
Github user SparkQA commented on the issue: https://github.com/apache/spark/pull/21830 **[Test build #93353 has finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/93353/testReport)** for PR 21830 at commit [`91978e7`](https://github.com/apache/spark/commit/91978e79dd64189e9dbef47d8e8e720a34982d9b).
 * This patch passes all tests.
 * This patch merges cleanly.
 * This patch adds no public classes.
[GitHub] spark pull request #21822: [SPARK-24865] Remove AnalysisBarrier - WIP
Github user rxin commented on a diff in the pull request:

    https://github.com/apache/spark/pull/21822#discussion_r204163484

    --- Diff: sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/plans/logical/LogicalPlan.scala ---
    @@ -33,6 +49,116 @@ abstract class LogicalPlan
       with QueryPlanConstraints
       with Logging {
    +  private var _analyzed: Boolean = false
    +
    +  /**
    +   * Marks this plan as already analyzed. This should only be called by [[CheckAnalysis]].
    +   */
    +  private[catalyst] def setAnalyzed(): Unit = { _analyzed = true }
    +
    +  /**
    +   * Returns true if this node and its children have already been gone through analysis and
    +   * verification. Note that this is only an optimization used to avoid analyzing trees that
    +   * have already been analyzed, and can be reset by transformations.
    +   */
    +  def analyzed: Boolean = _analyzed
    +
    +  /**
    +   * Returns a copy of this node where `rule` has been recursively applied first to all of its
    +   * children and then itself (post-order, bottom-up). When `rule` does not apply to a given node,
    +   * it is left unchanged. This function is similar to `transformUp`, but skips sub-trees that
    +   * have already been marked as analyzed.
    +   *
    +   * @param rule the function use to transform this nodes children
    +   */
    +  def resolveOperators(rule: PartialFunction[LogicalPlan, LogicalPlan]): LogicalPlan = {
    --- End diff --

    todo: add unit tests
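To illustrate the behaviour described in the scaladoc for `resolveOperators`, the following is a minimal sketch on a toy tree, not the Catalyst implementation. `Node`, its `analyzed` constructor flag, and this standalone `resolveOperators` are assumptions made for illustration; the real `LogicalPlan` in the diff tracks the flag in a private mutable field.

```scala
object ResolveSketch {
  // Toy plan node. `analyzed` mimics the flag from the quoted diff,
  // modelled here as an immutable constructor argument (an assumption).
  final case class Node(
      name: String,
      children: List[Node] = Nil,
      analyzed: Boolean = false)

  // Bottom-up (post-order) rewrite that returns already-analyzed
  // subtrees untouched instead of descending into them.
  def resolveOperators(plan: Node)(rule: PartialFunction[Node, Node]): Node =
    if (plan.analyzed) {
      plan
    } else {
      // Rewrite the children first, then try the rule on the node itself.
      val withNewChildren =
        plan.copy(children = plan.children.map(c => resolveOperators(c)(rule)))
      // Nodes the rule does not match are left unchanged.
      rule.applyOrElse(withNewChildren, (n: Node) => n)
    }
}
```

With a rule that renames nodes called `a`, a plain child named `a` is rewritten, while a node named `a` nested under a subtree marked `analyzed = true` keeps its name, because the traversal never enters that subtree.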
[GitHub] spark pull request #21822: [SPARK-24865] Remove AnalysisBarrier - WIP
Github user hvanhovell commented on a diff in the pull request:

    https://github.com/apache/spark/pull/21822#discussion_r204163551

    --- Diff: sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/analysis/CheckAnalysis.scala ---
    @@ -533,7 +537,8 @@ trait CheckAnalysis extends PredicateHelper {
         // Simplify the predicates before validating any unsupported correlation patterns
         // in the plan.
    -    BooleanSimplification(sub).foreachUp {
    +    // TODO(rxin): Why did this need to call BooleanSimplification???
    --- End diff --

    Yeah, I added boolean simplification here. I didn't quite like it back then, and I still don't like it. I was hoping this was happening in the `Optimizer` now.
[GitHub] spark pull request #21822: [SPARK-24865] Remove AnalysisBarrier - WIP
Github user rxin commented on a diff in the pull request:

    https://github.com/apache/spark/pull/21822#discussion_r204163424

    --- Diff: sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/plans/logical/LogicalPlan.scala ---
    @@ -23,8 +23,24 @@ import org.apache.spark.sql.catalyst.analysis._
     import org.apache.spark.sql.catalyst.expressions._
     import org.apache.spark.sql.catalyst.plans.QueryPlan
     import org.apache.spark.sql.catalyst.plans.logical.statsEstimation.LogicalPlanStats
    -import org.apache.spark.sql.catalyst.trees.CurrentOrigin
    +import org.apache.spark.sql.catalyst.trees.{CurrentOrigin, TreeNode}
     import org.apache.spark.sql.types.StructType
    +import org.apache.spark.util.Utils
    +
    +
    +object LogicalPlan {
    +
    +  private val resolveOperatorDepth = new ThreadLocal[Int] {
    --- End diff --

    todo: explain what this is
[GitHub] spark issue #21822: [SPARK-24865] Remove AnalysisBarrier - WIP
Github user SparkQA commented on the issue: https://github.com/apache/spark/pull/21822 **[Test build #93366 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/93366/testReport)** for PR 21822 at commit [`38980ad`](https://github.com/apache/spark/commit/38980ad066d26327387673910e0dfd981102cab9).
[GitHub] spark pull request #21822: [SPARK-24865] Remove AnalysisBarrier - WIP
Github user rxin commented on a diff in the pull request:

    https://github.com/apache/spark/pull/21822#discussion_r204163328

    --- Diff: sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/analysis/Analyzer.scala ---
    @@ -2390,16 +2375,21 @@ class Analyzer(
        * scoping information for attributes and can be removed once analysis is complete.
        */
       object EliminateSubqueryAliases extends Rule[LogicalPlan] {
    -    def apply(plan: LogicalPlan): LogicalPlan = plan transformUp {
    -      case SubqueryAlias(_, child) => child
    +    // This is actually called in the beginning of the optimization phase, and as a result
    +    // is using transformUp rather than resolveOperators. This is also often called in the
    +    //
    --- End diff --

    note: finish comment
[GitHub] spark issue #21822: [SPARK-24865] Remove AnalysisBarrier - WIP
Github user AmplabJenkins commented on the issue: https://github.com/apache/spark/pull/21822 Test PASSed. Refer to this link for build results (access rights to CI server needed): https://amplab.cs.berkeley.edu/jenkins//job/testing-k8s-prb-make-spark-distribution-unified/1187/ Test PASSed.
[GitHub] spark issue #21822: [SPARK-24865] Remove AnalysisBarrier - WIP
Github user AmplabJenkins commented on the issue: https://github.com/apache/spark/pull/21822 Merged build finished. Test PASSed.
[GitHub] spark issue #21832: [SPARK-24879][SQL] Fix NPE in Hive partition pruning fil...
Github user gatorsmile commented on the issue: https://github.com/apache/spark/pull/21832 Test this please
[GitHub] spark issue #18447: [SPARK-21232][SQL][SparkR][PYSPARK] New built-in SQL fun...
Github user mmolimar commented on the issue: https://github.com/apache/spark/pull/18447 Hi @HyukjinKwon For me it's fine: "In some SQL databases you have to query the table schema explicitly, e.g. `select data_type from all_tab_columns where table_name = 'my_table'` or something like that. In the case of the ARQ engine from Apache Jena, you can call this function in SPARQL (see [W3C-SPARQL](https://www.w3.org/TR/rdf-sparql-query/#func-datatype)). I find it useful because it avoids having to query the schema."
[GitHub] spark issue #21822: [SPARK-24865] Remove AnalysisBarrier - WIP
Github user AmplabJenkins commented on the issue: https://github.com/apache/spark/pull/21822 Test FAILed. Refer to this link for build results (access rights to CI server needed): https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/93351/ Test FAILed.
[GitHub] spark issue #21822: [SPARK-24865] Remove AnalysisBarrier - WIP
Github user AmplabJenkins commented on the issue: https://github.com/apache/spark/pull/21822 Merged build finished. Test FAILed.
[GitHub] spark issue #21822: [SPARK-24865] Remove AnalysisBarrier - WIP
Github user SparkQA commented on the issue: https://github.com/apache/spark/pull/21822 **[Test build #93351 has finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/93351/testReport)** for PR 21822 at commit [`7c76c83`](https://github.com/apache/spark/commit/7c76c83fe89f3e5aa28540fd76bdfc6016c35749). * This patch **fails Spark unit tests**. * This patch merges cleanly. * This patch adds no public classes.
[GitHub] spark pull request #21832: [SPARK-24879][SQL] Fix NPE in Hive partition prun...
Github user gatorsmile commented on a diff in the pull request: https://github.com/apache/spark/pull/21832#discussion_r204161199

    --- Diff: sql/hive/src/main/scala/org/apache/spark/sql/hive/client/HiveShim.scala ---
    @@ -606,7 +607,15 @@ private[client] class Shim_v0_13 extends Shim_v0_12 {
         object ExtractableLiterals {
           def unapply(exprs: Seq[Expression]): Option[Seq[String]] = {
    -        val extractables = exprs.map(ExtractableLiteral.unapply)
    +        // SPARK-24879: The Hive filter parser does not support "null", but we still want to push
    +        // down as many predicates as we can while still maintaining correctness. "x in (a, b,
    +        // null)" can be rewritten as "x in (a, b)" for the purposes of partition pruning, so we
    --- End diff --

Maybe we should write down the rules here.

* `1 in (2, NULL)` -> `NULL`
* `1 in (1, NULL)` -> `true`
* `1 in (2)` -> `false`

NULL is not the same as FALSE. However, since all the pushed-down predicates are NULL-intolerant and connected only by AND or OR, NULL can be treated as FALSE here.
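The rules in the review comment above can be checked with a small model of SQL three-valued logic. The following is only a sketch in plain Python, not Spark or Hive code: `None` stands in for NULL, and every function name (`sql_in`, `and3`, `or3`, `not3`, `keeps_row`, `null_as_false`) is hypothetical and introduced here for illustration. It assumes a non-empty IN list.

```python
from itertools import product
from typing import Optional, Sequence

Tri = Optional[bool]  # True, False, or None (None models SQL NULL)


def sql_in(value: Optional[int], candidates: Sequence[Optional[int]]) -> Tri:
    """Evaluate `value IN (candidates)` under three-valued logic."""
    if value is None:
        return None  # a NULL left-hand side compares as NULL
    if any(c is not None and c == value for c in candidates):
        return True  # one definite match is enough
    if any(c is None for c in candidates):
        return None  # no match, but a NULL entry leaves the answer unknown
    return False


def and3(a: Tri, b: Tri) -> Tri:
    """Three-valued AND: FALSE dominates, then NULL."""
    if a is False or b is False:
        return False
    if a is None or b is None:
        return None
    return True


def or3(a: Tri, b: Tri) -> Tri:
    """Three-valued OR: TRUE dominates, then NULL."""
    if a is True or b is True:
        return True
    if a is None or b is None:
        return None
    return False


def not3(a: Tri) -> Tri:
    """Three-valued NOT: NULL stays NULL."""
    return None if a is None else (not a)


def keeps_row(result: Tri) -> bool:
    """A filter keeps a row only when the predicate is TRUE."""
    return result is True


def null_as_false(result: Tri) -> bool:
    """Collapse NULL to FALSE, as the comment proposes for pushed-down predicates."""
    return bool(result)


# The three rules from the comment.
print(sql_in(1, [2, None]))  # None, i.e. NULL
print(sql_in(1, [1, None]))  # True
print(sql_in(1, [2]))        # False

# Under AND / OR, collapsing NULL to FALSE at the leaves never changes
# which rows the filter keeps.
for a, b in product([True, False, None], repeat=2):
    assert keeps_row(and3(a, b)) == (null_as_false(a) and null_as_false(b))
    assert keeps_row(or3(a, b)) == (null_as_false(a) or null_as_false(b))
```

The equivalence depends on the "connected by AND or OR" condition. With a NOT in the tree it fails: `keeps_row(not3(None))` is `False`, while `not null_as_false(None)` is `True`.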
[GitHub] spark pull request #21822: [SPARK-24865] Remove AnalysisBarrier - WIP
Github user rxin commented on a diff in the pull request: https://github.com/apache/spark/pull/21822#discussion_r204160853

    --- Diff: sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/analysis/CheckAnalysis.scala ---
    @@ -533,7 +537,8 @@ trait CheckAnalysis extends PredicateHelper {
         // Simplify the predicates before validating any unsupported correlation patterns
         // in the plan.
    -    BooleanSimplification(sub).foreachUp {
    +    // TODO(rxin): Why did this need to call BooleanSimplification???
    --- End diff --

Thanks. I'm going to add it back.
[GitHub] spark issue #21748: [SPARK-23146][K8S] Support client mode.
Github user AmplabJenkins commented on the issue: https://github.com/apache/spark/pull/21748 Merged build finished. Test PASSed.
[GitHub] spark issue #21748: [SPARK-23146][K8S] Support client mode.
Github user AmplabJenkins commented on the issue: https://github.com/apache/spark/pull/21748 Test PASSed. Refer to this link for build results (access rights to CI server needed): https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/93360/ Test PASSed.
[GitHub] spark issue #21748: [SPARK-23146][K8S] Support client mode.
Github user SparkQA commented on the issue: https://github.com/apache/spark/pull/21748 **[Test build #93360 has finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/93360/testReport)** for PR 21748 at commit [`72c96e0`](https://github.com/apache/spark/commit/72c96e03fe4e49ec1c9b4bfad816e20cff67d75d). * This patch passes all tests. * This patch merges cleanly. * This patch adds no public classes.
[GitHub] spark issue #21798: [SPARK-24836][SQL] New option for Avro datasource - igno...
Github user SparkQA commented on the issue: https://github.com/apache/spark/pull/21798 **[Test build #93364 has finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/93364/testReport)** for PR 21798 at commit [`3206a20`](https://github.com/apache/spark/commit/3206a20fc9f3036e16eca20118e1559d722ff0b9). * This patch passes all tests. * This patch merges cleanly. * This patch adds no public classes.
[GitHub] spark issue #21798: [SPARK-24836][SQL] New option for Avro datasource - igno...
Github user AmplabJenkins commented on the issue: https://github.com/apache/spark/pull/21798 Merged build finished. Test PASSed.
[GitHub] spark issue #21798: [SPARK-24836][SQL] New option for Avro datasource - igno...
Github user AmplabJenkins commented on the issue: https://github.com/apache/spark/pull/21798 Test PASSed. Refer to this link for build results (access rights to CI server needed): https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/93364/ Test PASSed.
[GitHub] spark pull request #21822: [SPARK-24865] Remove AnalysisBarrier - WIP
Github user rxin commented on a diff in the pull request: https://github.com/apache/spark/pull/21822#discussion_r204160150

    --- Diff: sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/analysis/Analyzer.scala ---
    @@ -787,6 +782,7 @@ class Analyzer(
               right
             case Some((oldRelation, newRelation)) =>
               val attributeRewrites = AttributeMap(oldRelation.output.zip(newRelation.output))
    +          // TODO(rxin): Why do we need transformUp here?
    --- End diff --

cc @cloud-fan why do we need transformUp here?
[GitHub] spark issue #21118: SPARK-23325: Use InternalRow when reading with DataSourc...
Github user AmplabJenkins commented on the issue: https://github.com/apache/spark/pull/21118 Merged build finished. Test PASSed.
[GitHub] spark issue #21831: [SPARK-24880][BUILD]Fix the group id for spark-kubernete...
Github user AmplabJenkins commented on the issue: https://github.com/apache/spark/pull/21831 Test PASSed. Refer to this link for build results (access rights to CI server needed): https://amplab.cs.berkeley.edu/jenkins//job/testing-k8s-prb-make-spark-distribution-unified/1185/ Test PASSed.
[GitHub] spark issue #21118: SPARK-23325: Use InternalRow when reading with DataSourc...
Github user AmplabJenkins commented on the issue: https://github.com/apache/spark/pull/21118 Test PASSed. Refer to this link for build results (access rights to CI server needed): https://amplab.cs.berkeley.edu/jenkins//job/testing-k8s-prb-make-spark-distribution-unified/1186/ Test PASSed.
[GitHub] spark issue #21831: [SPARK-24880][BUILD]Fix the group id for spark-kubernete...
Github user AmplabJenkins commented on the issue: https://github.com/apache/spark/pull/21831 Merged build finished. Test PASSed.
[GitHub] spark issue #21831: [SPARK-24880][BUILD]Fix the group id for spark-kubernete...
Github user SparkQA commented on the issue: https://github.com/apache/spark/pull/21831 Kubernetes integration test status success URL: https://amplab.cs.berkeley.edu/jenkins/job/testing-k8s-prb-make-spark-distribution-unified/1185/
[GitHub] spark issue #21608: [SPARK-24626] [SQL] Improve location size calculation in...
Github user AmplabJenkins commented on the issue: https://github.com/apache/spark/pull/21608 Merged build finished. Test PASSed.
[GitHub] spark issue #21608: [SPARK-24626] [SQL] Improve location size calculation in...
Github user AmplabJenkins commented on the issue: https://github.com/apache/spark/pull/21608 Test PASSed. Refer to this link for build results (access rights to CI server needed): https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/93352/ Test PASSed.
[GitHub] spark issue #21608: [SPARK-24626] [SQL] Improve location size calculation in...
Github user SparkQA commented on the issue: https://github.com/apache/spark/pull/21608 **[Test build #93352 has finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/93352/testReport)** for PR 21608 at commit [`107f4c6`](https://github.com/apache/spark/commit/107f4c675978628bf0effc08924a5f7d397f3719). * This patch passes all tests. * This patch merges cleanly. * This patch adds no public classes.
[GitHub] spark issue #21798: [SPARK-24836][SQL] New option for Avro datasource - igno...
Github user AmplabJenkins commented on the issue: https://github.com/apache/spark/pull/21798 Test PASSed. Refer to this link for build results (access rights to CI server needed): https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/93363/ Test PASSed.
[GitHub] spark issue #21798: [SPARK-24836][SQL] New option for Avro datasource - igno...
Github user AmplabJenkins commented on the issue: https://github.com/apache/spark/pull/21798 Merged build finished. Test PASSed.
[GitHub] spark issue #21798: [SPARK-24836][SQL] New option for Avro datasource - igno...
Github user SparkQA commented on the issue: https://github.com/apache/spark/pull/21798 **[Test build #93363 has finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/93363/testReport)** for PR 21798 at commit [`0657508`](https://github.com/apache/spark/commit/0657508c7599a3a0ea70027bea96723e8088cc79). * This patch passes all tests. * This patch merges cleanly. * This patch adds the following public classes _(experimental)_: * `class AvroOptions(`
[GitHub] spark pull request #21832: [SPARK-24879][SQL] Fix NPE in Hive partition prun...
Github user gatorsmile commented on a diff in the pull request: https://github.com/apache/spark/pull/21832#discussion_r204157323

    --- Diff: sql/hive/src/main/scala/org/apache/spark/sql/hive/client/HiveShim.scala ---
    @@ -606,7 +606,15 @@ private[client] class Shim_v0_13 extends Shim_v0_12 {
         object ExtractableLiterals {
           def unapply(exprs: Seq[Expression]): Option[Seq[String]] = {
    -        val extractables = exprs.map(ExtractableLiteral.unapply)
    +        // SPARK-24879: The Hive filter parser does not support "null", but we still want to push
    --- End diff --

`Hive filter parser` -> `Hive metastore filter parser`
[GitHub] spark issue #21748: [SPARK-23146][K8S] Support client mode.
Github user liyinan926 commented on the issue: https://github.com/apache/spark/pull/21748 LGTM for the docs updates.
[GitHub] spark pull request #21832: [SPARK-24879][SQL] Fix NPE in Hive partition prun...
Github user gatorsmile commented on a diff in the pull request: https://github.com/apache/spark/pull/21832#discussion_r204156800

    --- Diff: sql/hive/src/test/scala/org/apache/spark/sql/hive/client/FiltersSuite.scala ---
    @@ -72,6 +72,10 @@ class FiltersSuite extends SparkFunSuite with Logging with PlanTest {
           (Literal("p2\" and q=\"q2") === a("stringcol", StringType)) :: Nil,
         """stringcol = 'p1" and q="q1' and 'p2" and q="q2' = stringcol""")

    +  filterTest("SPARK-24879 null literals should be ignored for IN constructs",
    +    Seq(a("intcol", IntegerType) in (Literal(1), Literal(null))),
    --- End diff --

Let us add more test cases for better test coverage.
[GitHub] spark issue #21118: SPARK-23325: Use InternalRow when reading with DataSourc...
Github user SparkQA commented on the issue: https://github.com/apache/spark/pull/21118 **[Test build #93365 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/93365/testReport)** for PR 21118 at commit [`d1fa32e`](https://github.com/apache/spark/commit/d1fa32e201e73f281a87d46a3510f0e3082c1d35).
[GitHub] spark issue #21798: [SPARK-24836][SQL] New option for Avro datasource - igno...
Github user SparkQA commented on the issue: https://github.com/apache/spark/pull/21798 **[Test build #93364 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/93364/testReport)** for PR 21798 at commit [`3206a20`](https://github.com/apache/spark/commit/3206a20fc9f3036e16eca20118e1559d722ff0b9).
[GitHub] spark issue #21832: [SPARK-24879][SQL] Fix NPE in Hive partition pruning fil...
Github user AmplabJenkins commented on the issue: https://github.com/apache/spark/pull/21832 Can one of the admins verify this patch?
[GitHub] spark issue #21831: [SPARK-24880][BUILD]Fix the group id for spark-kubernete...
Github user SparkQA commented on the issue: https://github.com/apache/spark/pull/21831 Kubernetes integration test starting URL: https://amplab.cs.berkeley.edu/jenkins/job/testing-k8s-prb-make-spark-distribution-unified/1185/
[GitHub] spark issue #21832: [SPARK-24879][SQL] Fix NPE in Hive partition pruning fil...
Github user gatorsmile commented on the issue: https://github.com/apache/spark/pull/21832 ok to test
[GitHub] spark pull request #20057: [SPARK-22880][SQL] Add cascadeTruncate option to ...
Github user asfgit closed the pull request at: https://github.com/apache/spark/pull/20057
[GitHub] spark issue #20057: [SPARK-22880][SQL] Add cascadeTruncate option to JDBC da...
Github user gatorsmile commented on the issue: https://github.com/apache/spark/pull/20057 Thanks! Merged to master.
[GitHub] spark pull request #21832: [SPARK-24879][SQL] Fix NPE in Hive partition prun...
GitHub user PenguinToast opened a pull request: https://github.com/apache/spark/pull/21832

[SPARK-24879][SQL] Fix NPE in Hive partition pruning filter pushdown

## What changes were proposed in this pull request?

We get an NPE when a filter on a partition column has the form `col in (x, null)`. This happens because the filter converter in HiveShim does not handle `null`s correctly. This patch fixes the bug while still pushing down as many of the partition pruning predicates as possible, by filtering out `null`s from any `in` predicate. Since Hive only supports very simple partition pruning filters, this change should preserve correctness.

## How was this patch tested?

Unit tests, manual tests

You can merge this pull request into a Git repository by running:

    $ git pull https://github.com/PenguinToast/spark partition-pruning-npe

Alternatively you can review and apply these changes as the patch at:

    https://github.com/apache/spark/pull/21832.patch

To close this pull request, make a commit to your master/trunk branch with (at least) the following in the commit message:

    This closes #21832

commit 388caa978ead4bb8957b5e41a20c394fc90fe234
Author: William Sheu
Date: 2018-07-20T18:41:27Z

    Filter out `null` values for partition pruning predicates
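The idea described in the pull request, dropping `null`s from an `in` list before pushdown, can be illustrated with a short sketch. This is plain Python rather than the actual HiveShim code, which is Scala and is not reproduced here. `None` models `null`, and the helper names `prune_in_list` and `matches` are hypothetical.

```python
from typing import List, Optional


def prune_in_list(values: List[Optional[int]]) -> List[int]:
    """Drop NULL (None) entries from an IN list before it is pushed down."""
    return [v for v in values if v is not None]


def matches(partition_value: int, in_list: List[Optional[int]]) -> bool:
    """Return True only when `partition_value IN (in_list)` is TRUE.

    A comparison against NULL is never TRUE, so None entries cannot
    cause a partition to be selected.
    """
    return any(v is not None and v == partition_value for v in in_list)


original = [1, 3, None]            # models `col in (1, 3, null)`
pushed = prune_in_list(original)   # models `col in (1, 3)`

partitions = range(5)              # hypothetical partition values 0..4
kept_before = [p for p in partitions if matches(p, original)]
kept_after = [p for p in partitions if matches(p, pushed)]

# The rewritten list selects exactly the same partitions.
assert kept_before == kept_after
```

In this model the rewrite is safe because a `null` entry can only turn a FALSE result into NULL, never into TRUE, so it never adds a partition to the selected set.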
[GitHub] spark issue #21118: SPARK-23325: Use InternalRow when reading with DataSourc...
Github user rdblue commented on the issue: https://github.com/apache/spark/pull/21118 Rebased on master to fix conflicts.
[GitHub] spark issue #21798: [SPARK-24836][SQL] New option for Avro datasource - igno...
Github user SparkQA commented on the issue: https://github.com/apache/spark/pull/21798 **[Test build #93363 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/93363/testReport)** for PR 21798 at commit [`0657508`](https://github.com/apache/spark/commit/0657508c7599a3a0ea70027bea96723e8088cc79).
[GitHub] spark issue #21831: [SPARK-24880][BUILD]Fix the group id for spark-kubernete...
Github user SparkQA commented on the issue: https://github.com/apache/spark/pull/21831 **[Test build #93362 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/93362/testReport)** for PR 21831 at commit [`4345139`](https://github.com/apache/spark/commit/4345139cd45e1506ac788dc55a4d9ed420ca6b78).
[GitHub] spark issue #21831: [SPARK-24880][BUILD]Fix the group id for spark-kubernete...
Github user zsxwing commented on the issue: https://github.com/apache/spark/pull/21831 cc @gatorsmile
[GitHub] spark issue #21774: [SPARK-24811][SQL]Avro: add new function from_avro and t...
Github user gatorsmile commented on the issue: https://github.com/apache/spark/pull/21774 Need to revert this PR since it breaks the build. spark-master-compile-maven-hadoop-2.6 #7902 (broken since this build)
[GitHub] spark pull request #21831: [SPARK-24880][BUILD]Fix the group id for spark-ku...
GitHub user zsxwing opened a pull request: https://github.com/apache/spark/pull/21831

[SPARK-24880][BUILD]Fix the group id for spark-kubernetes-integration-tests

## What changes were proposed in this pull request?

The group id should be `org.apache.spark`. The incorrect group id is causing the nightly build to fail: https://amplab.cs.berkeley.edu/jenkins/job/spark-master-maven-snapshots/2295/console

` [ERROR] Failed to execute goal org.apache.maven.plugins:maven-deploy-plugin:2.8.2:deploy (default-deploy) on project spark-kubernetes-integration-tests_2.11: Failed to deploy artifacts: Could not transfer artifact spark-kubernetes-integration-tests:spark-kubernetes-integration-tests_2.11:jar:2.4.0-20180720.101629-1 from/to apache.snapshots.https (https://repository.apache.org/content/repositories/snapshots): Access denied to: https://repository.apache.org/content/repositories/snapshots/spark-kubernetes-integration-tests/spark-kubernetes-integration-tests_2.11/2.4.0-SNAPSHOT/spark-kubernetes-integration-tests_2.11-2.4.0-20180720.101629-1.jar, ReasonPhrase: Forbidden. -> [Help 1] [ERROR] `

## How was this patch tested?

Jenkins.

You can merge this pull request into a Git repository by running:

    $ git pull https://github.com/zsxwing/spark fix-k8s-test

Alternatively you can review and apply these changes as the patch at:

    https://github.com/apache/spark/pull/21831.patch

To close this pull request, make a commit to your master/trunk branch with (at least) the following in the commit message:

    This closes #21831

commit 4345139cd45e1506ac788dc55a4d9ed420ca6b78
Author: zsxwing
Date: 2018-07-20T19:50:55Z

    Fix the group id for spark-kubernetes-integration-tests
[GitHub] spark issue #21748: [SPARK-23146][K8S] Support client mode.
Github user mccheah commented on the issue: https://github.com/apache/spark/pull/21748 Never mind, think it's recovering now.
[GitHub] spark issue #21748: [SPARK-23146][K8S] Support client mode.
Github user AmplabJenkins commented on the issue: https://github.com/apache/spark/pull/21748 Merged build finished. Test PASSed.
[GitHub] spark issue #21748: [SPARK-23146][K8S] Support client mode.
Github user SparkQA commented on the issue: https://github.com/apache/spark/pull/21748 **[Test build #93361 has finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/93361/testReport)** for PR 21748 at commit [`72c96e0`](https://github.com/apache/spark/commit/72c96e03fe4e49ec1c9b4bfad816e20cff67d75d). * This patch passes all tests. * This patch merges cleanly. * This patch adds no public classes.
[GitHub] spark issue #21748: [SPARK-23146][K8S] Support client mode.
Github user AmplabJenkins commented on the issue: https://github.com/apache/spark/pull/21748 Test PASSed. Refer to this link for build results (access rights to CI server needed): https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/93361/ Test PASSed.
[GitHub] spark issue #21748: [SPARK-23146][K8S] Support client mode.
Github user AmplabJenkins commented on the issue: https://github.com/apache/spark/pull/21748 Merged build finished. Test FAILed.
[GitHub] spark issue #21748: [SPARK-23146][K8S] Support client mode.
Github user AmplabJenkins commented on the issue: https://github.com/apache/spark/pull/21748 Test FAILed. Refer to this link for build results (access rights to CI server needed): https://amplab.cs.berkeley.edu/jenkins//job/testing-k8s-prb-make-spark-distribution-unified/1184/ Test FAILed.