[GitHub] spark issue #20016: SPARK-22830 Scala Coding style has been improved in Spar...
Github user srowen commented on the issue: https://github.com/apache/spark/pull/20016 @chetkhatri not sure what you mean. The whole change here is considered, and doesn't pass style checks. Look at the test results and fix the errors. --- - To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org For additional commands, e-mail: reviews-h...@spark.apache.org
[GitHub] spark issue #19715: [SPARK-22397][ML]add multiple columns support to Quantil...
Github user AmplabJenkins commented on the issue: https://github.com/apache/spark/pull/19715 Test PASSed. Refer to this link for build results (access rights to CI server needed): https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/85199/ Test PASSed.
[GitHub] spark issue #19715: [SPARK-22397][ML]add multiple columns support to Quantil...
Github user AmplabJenkins commented on the issue: https://github.com/apache/spark/pull/19715 Merged build finished. Test PASSed.
[GitHub] spark issue #19715: [SPARK-22397][ML]add multiple columns support to Quantil...
Github user SparkQA commented on the issue: https://github.com/apache/spark/pull/19715 **[Test build #85199 has finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/85199/testReport)** for PR 19715 at commit [`99726a1`](https://github.com/apache/spark/commit/99726a15b3369d8119750143753d151201c9334c). * This patch passes all tests. * This patch merges cleanly. * This patch adds no public classes.
[GitHub] spark issue #20016: SPARK-22830 Scala Coding style has been improved in Spar...
Github user chetkhatri commented on the issue: https://github.com/apache/spark/pull/20016 @srowen why is only the most recent commit going to be merged? Can't we get a "squash merge"? Please re-run the test build and let me know if it still seems wrong.
[GitHub] spark issue #20035: [SPARK-22848][SQL] Eliminate mutable state from Stack
Github user AmplabJenkins commented on the issue: https://github.com/apache/spark/pull/20035 Merged build finished. Test FAILed.
[GitHub] spark issue #20035: [SPARK-22848][SQL] Eliminate mutable state from Stack
Github user AmplabJenkins commented on the issue: https://github.com/apache/spark/pull/20035 Test FAILed. Refer to this link for build results (access rights to CI server needed): https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/85194/ Test FAILed.
[GitHub] spark issue #20035: [SPARK-22848][SQL] Eliminate mutable state from Stack
Github user SparkQA commented on the issue: https://github.com/apache/spark/pull/20035 **[Test build #85194 has finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/85194/testReport)** for PR 20035 at commit [`d31ccd7`](https://github.com/apache/spark/commit/d31ccd7b8aae1f46e7e9e5bfbb7d390c0992678c). * This patch **fails PySpark unit tests**. * This patch merges cleanly. * This patch adds no public classes.
[GitHub] spark issue #20002: [SPARK-22465][Core][WIP] Add a safety-check to RDD defau...
Github user sujithjay commented on the issue: https://github.com/apache/spark/pull/20002 @tgravescs Thank you for keeping me informed. I look forward to receiving your review. Happy holidays!
[GitHub] spark pull request #20037: [SPARK-22849] ivy.retrieve pattern should also co...
Github user srowen commented on a diff in the pull request: https://github.com/apache/spark/pull/20037#discussion_r158097751 --- Diff: core/src/main/scala/org/apache/spark/deploy/SparkSubmit.scala --- @@ -1271,7 +1271,7 @@ private[spark] object SparkSubmitUtils { // retrieve all resolved dependencies ivy.retrieve(rr.getModuleDescriptor.getModuleRevisionId, packagesDirectory.getAbsolutePath + File.separator + -"[organization]_[artifact]-[revision].[ext]", +"[organization]_[artifact]-[revision](-[classifier]).[ext]", --- End diff -- FWIW this looks fine; I needed a similar change as part of https://github.com/srowen/spark/commit/41267020701be4877be352e1678113bd9870ec12 It's possible you may need some other changes from that WIP commit; not sure.
[GitHub] spark issue #20002: [SPARK-22465][Core][WIP] Add a safety-check to RDD defau...
Github user tgravescs commented on the issue: https://github.com/apache/spark/pull/20002 @sujithjay thanks for working on this. I will review, but I'm not sure I will get to it for a bit. I'm out for the holidays and not sure I can give this the time it needs for a full review today.
[GitHub] spark issue #20002: [SPARK-22465][Core][WIP] Add a safety-check to RDD defau...
Github user SparkQA commented on the issue: https://github.com/apache/spark/pull/20002 **[Test build #85200 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/85200/testReport)** for PR 20002 at commit [`ca6aa08`](https://github.com/apache/spark/commit/ca6aa08e3d2f6a053992fb31faed35baa46fb5a6).
[GitHub] spark issue #20002: [SPARK-22465][Core][WIP] Add a safety-check to RDD defau...
Github user tgravescs commented on the issue: https://github.com/apache/spark/pull/20002 ok to test
[GitHub] spark issue #20016: SPARK-22830 Scala Coding style has been improved in Spar...
Github user SparkQA commented on the issue: https://github.com/apache/spark/pull/20016 **[Test build #4018 has finished](https://amplab.cs.berkeley.edu/jenkins/job/NewSparkPullRequestBuilder/4018/testReport)** for PR 20016 at commit [`a14eb3e`](https://github.com/apache/spark/commit/a14eb3e772cafb52cabe75130f02b2d97fa49ae3). * This patch **fails Scala style tests**. * This patch merges cleanly. * This patch adds no public classes.
[GitHub] spark issue #20016: SPARK-22830 Scala Coding style has been improved in Spar...
Github user SparkQA commented on the issue: https://github.com/apache/spark/pull/20016 **[Test build #4018 has started](https://amplab.cs.berkeley.edu/jenkins/job/NewSparkPullRequestBuilder/4018/testReport)** for PR 20016 at commit [`a14eb3e`](https://github.com/apache/spark/commit/a14eb3e772cafb52cabe75130f02b2d97fa49ae3).
[GitHub] spark pull request #20037: [SPARK-22849] ivy.retrieve pattern should also co...
Github user gatorsmile commented on a diff in the pull request: https://github.com/apache/spark/pull/20037#discussion_r158090484 --- Diff: core/src/main/scala/org/apache/spark/deploy/SparkSubmit.scala --- @@ -1271,7 +1271,7 @@ private[spark] object SparkSubmitUtils { // retrieve all resolved dependencies ivy.retrieve(rr.getModuleDescriptor.getModuleRevisionId, packagesDirectory.getAbsolutePath + File.separator + -"[organization]_[artifact]-[revision].[ext]", +"[organization]_[artifact]-[revision](-[classifier]).[ext]", --- End diff -- The reason I am putting `classifier` at the end is that I am just following the [default artifact pattern](https://github.com/apache/ant-ivy/blob/12aeeec70feae05a87a5adfe7b9c2c63744be37f/src/java/org/apache/ivy/core/cache/DefaultRepositoryCacheManager.java#L83) in Apache Ivy.
[GitHub] spark pull request #20037: [SPARK-22849] ivy.retrieve pattern should also co...
Github user gatorsmile commented on a diff in the pull request: https://github.com/apache/spark/pull/20037#discussion_r158089460 --- Diff: core/src/main/scala/org/apache/spark/deploy/SparkSubmit.scala --- @@ -1271,7 +1271,7 @@ private[spark] object SparkSubmitUtils { // retrieve all resolved dependencies ivy.retrieve(rr.getModuleDescriptor.getModuleRevisionId, packagesDirectory.getAbsolutePath + File.separator + -"[organization]_[artifact]-[revision].[ext]", +"[organization]_[artifact]-[revision](-[classifier]).[ext]", --- End diff -- In my example, ``` zookeeper-jar: {artifact=zookeeper, ext=jar, module=zookeeper, classifier=tests, organisation=org.apache.zookeeper, type=test-jar, revision=3.4.6} zookeeper-jar: {artifact=zookeeper, ext=jar, module=zookeeper, organisation=org.apache.zookeeper, type=jar, revision=3.4.6} ``` Both dependencies will have the same name `org.apache.zookeeper_zookeeper-3.4.6.jar` and cause a collision. After my PR, they will have different names: `org.apache.zookeeper_zookeeper-3.4.6.jar` and `org.apache.zookeeper_zookeeper-3.4.6-tests.jar`.
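The collision described in the comment above can be illustrated with a small sketch. This is not Ivy's implementation: `IvyPatternSketch` and `expand` are hypothetical names, and the sketch assumes only two rules, namely that `[token]` is replaced by the attribute of that name and that a parenthesised group is dropped when a token inside it has no value. For simplicity the attribute map uses the key `organization`, whereas the artifact dumps above spell it `organisation`.

```scala
object IvyPatternSketch {
  // Hypothetical helper, NOT Ivy's code: expands [token] placeholders and drops a
  // parenthesised optional group when any token inside it has no value.
  def expand(pattern: String, attrs: Map[String, String]): String = {
    val optionalGroup = """\(([^()]*)\)""".r
    val token = """\[(\w+)\]""".r

    // Resolve optional groups first: keep the body only if all of its tokens are defined.
    val withoutOptionals = optionalGroup.replaceAllIn(pattern, m => {
      val body = m.group(1)
      val names = token.findAllMatchIn(body).map(_.group(1)).toList
      val kept = if (names.forall(attrs.contains)) body else ""
      java.util.regex.Matcher.quoteReplacement(kept)
    })

    // Substitute the remaining tokens; tokens without a value are left untouched.
    token.replaceAllIn(withoutOptionals, m =>
      java.util.regex.Matcher.quoteReplacement(attrs.getOrElse(m.group(1), m.matched)))
  }

  def main(args: Array[String]): Unit = {
    val regular = Map(
      "organization" -> "org.apache.zookeeper",
      "artifact" -> "zookeeper",
      "revision" -> "3.4.6",
      "ext" -> "jar")
    val tests = regular + ("classifier" -> "tests")

    val oldPattern = "[organization]_[artifact]-[revision].[ext]"
    val newPattern = "[organization]_[artifact]-[revision](-[classifier]).[ext]"

    // Old pattern: both artifacts map to the same file name.
    println(expand(oldPattern, regular))
    println(expand(oldPattern, tests))
    // New pattern: the classifier, when present, keeps the names apart.
    println(expand(newPattern, regular))
    println(expand(newPattern, tests))
  }
}
```

Under these assumptions the old pattern gives both artifacts the same name, while the new pattern appends `-tests` only for the artifact that carries a classifier.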
[GitHub] spark issue #19884: [SPARK-22324][SQL][PYTHON] Upgrade Arrow to 0.8.0
Github user BryanCutler commented on the issue: https://github.com/apache/spark/pull/19884 Thanks for doing the update @shaneknapp and thanks for looking into the details @HyukjinKwon! I'll look into the test issues with python2.7; they look to be related to timestamps.
[GitHub] spark pull request #19983: [SPARK-22788][streaming] Use correct hadoop confi...
Github user asfgit closed the pull request at: https://github.com/apache/spark/pull/19983
[GitHub] spark issue #19983: [SPARK-22788][streaming] Use correct hadoop config for f...
Github user squito commented on the issue: https://github.com/apache/spark/pull/19983 merged to master
[GitHub] spark issue #20016: SPARK-22830 Scala Coding style has been improved in Spar...
Github user chetkhatri commented on the issue: https://github.com/apache/spark/pull/20016 @HyukjinKwon @mgaido91 @srowen All the requested changes have been addressed and committed; please review and do the needful.
[GitHub] spark issue #20037: [SPARK-22849] ivy.retrieve pattern should also consider ...
Github user gatorsmile commented on the issue: https://github.com/apache/spark/pull/20037 @srowen I think we are resolving a different issue. Your PR https://github.com/apache/spark/pull/17416 is trying to resolve the issues raised by `spark-submit et al via --packages`, in which users explicitly specify the classifier. However, we still face another scenario: the package has external dependencies, and those dependencies have different classifiers. For example, [zookeeper 3.4.6](https://repo1.maven.org/maven2/org/apache/zookeeper/zookeeper/3.4.6/) has a tests.jar and a regular jar. In the current solution, we will download both to the same file and Ivy will throw an exception. ``` zookeeper-jar: {artifact=zookeeper, ext=jar, module=zookeeper, classifier=tests, organisation=org.apache.zookeeper, type=test-jar, revision=3.4.6} zookeeper-jar: {artifact=zookeeper, ext=jar, module=zookeeper, organisation=org.apache.zookeeper, type=jar, revision=3.4.6} ```
[GitHub] spark issue #19983: [SPARK-22788][streaming] Use correct hadoop config for f...
Github user squito commented on the issue: https://github.com/apache/spark/pull/19983 lgtm
[GitHub] spark issue #19715: [SPARK-22397][ML]add multiple columns support to Quantil...
Github user SparkQA commented on the issue: https://github.com/apache/spark/pull/19715 **[Test build #85199 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/85199/testReport)** for PR 19715 at commit [`99726a1`](https://github.com/apache/spark/commit/99726a15b3369d8119750143753d151201c9334c).
[GitHub] spark issue #19715: [SPARK-22397][ML]add multiple columns support to Quantil...
Github user huaxingao commented on the issue: https://github.com/apache/spark/pull/19715 retest this please
[GitHub] spark issue #20020: [SPARK-22834][SQL] Make insertion commands have real chi...
Github user AmplabJenkins commented on the issue: https://github.com/apache/spark/pull/20020 Merged build finished. Test FAILed.
[GitHub] spark issue #20020: [SPARK-22834][SQL] Make insertion commands have real chi...
Github user AmplabJenkins commented on the issue: https://github.com/apache/spark/pull/20020 Test FAILed. Refer to this link for build results (access rights to CI server needed): https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/85193/ Test FAILed.
[GitHub] spark issue #20020: [SPARK-22834][SQL] Make insertion commands have real chi...
Github user SparkQA commented on the issue: https://github.com/apache/spark/pull/20020 **[Test build #85193 has finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/85193/testReport)** for PR 20020 at commit [`7ccfd90`](https://github.com/apache/spark/commit/7ccfd909a71a531be25ab0eccdf7b347d06c2c0a). * This patch **fails PySpark unit tests**. * This patch merges cleanly. * This patch adds no public classes.
[GitHub] spark issue #19929: [SPARK-22629][PYTHON] Add deterministic flag to pyspark ...
Github user mgaido91 commented on the issue: https://github.com/apache/spark/pull/19929 kindly ping @cloud-fan @gatorsmile @HyukjinKwon @zero323
[GitHub] spark issue #20035: [SPARK-22848][SQL] Eliminate mutable state from Stack
Github user SparkQA commented on the issue: https://github.com/apache/spark/pull/20035 **[Test build #85198 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/85198/testReport)** for PR 20035 at commit [`f0163e7`](https://github.com/apache/spark/commit/f0163e7b68aa09fef5c1dc7f25e00170354a1ab2).
[GitHub] spark pull request #20035: [SPARK-22848][SQL] Eliminate mutable state from S...
Github user kiszk commented on a diff in the pull request: https://github.com/apache/spark/pull/20035#discussion_r158082275 --- Diff: sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/expressions/generators.scala --- @@ -214,11 +213,12 @@ case class Stack(children: Seq[Expression]) extends Generator { // Create the collection. val wrapperClass = classOf[mutable.WrappedArray[_]].getName -ctx.addMutableState( - s"$wrapperClass", - ev.value, - v => s"$v = $wrapperClass$$.MODULE$$.make($rowData);", useFreshName = false) -ev.copy(code = code, isNull = "false") +ev.copy(code = + s""" + |InternalRow[] $rowData = new InternalRow[$numRows]; --- End diff -- I see. I did not imagine that `numRows` is large. I will revert the code for `rowData`.
[GitHub] spark issue #20037: [SPARK-22849] ivy.retrieve pattern should also consider ...
Github user SparkQA commented on the issue: https://github.com/apache/spark/pull/20037 **[Test build #85197 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/85197/testReport)** for PR 20037 at commit [`331ba33`](https://github.com/apache/spark/commit/331ba338ce020a927fcfd88b3bd7e536fc8d3b66).
[GitHub] spark pull request #20010: [SPARK-22826][SQL] findWiderTypeForTwo Fails over...
Github user bdrillard commented on a diff in the pull request: https://github.com/apache/spark/pull/20010#discussion_r158081192 --- Diff: sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/analysis/TypeCoercion.scala --- @@ -158,11 +169,6 @@ object TypeCoercion { findTightestCommonType(t1, t2) .orElse(findWiderTypeForDecimal(t1, t2)) .orElse(stringPromotion(t1, t2)) - .orElse((t1, t2) match { -case (ArrayType(et1, containsNull1), ArrayType(et2, containsNull2)) => - findWiderTypeForTwo(et1, et2).map(ArrayType(_, containsNull1 || containsNull2)) -case _ => None - }) --- End diff -- @gczsjdy I've taken a shot at implementing your suggestion with `findWiderTypeForTwoComplex`, which takes as an argument a `widerTypeFunc`, describing which widening behavior to apply to point types (should they permit promotion to string or not). Because `ArrayType` instances that would require widening the type could be nested in `StructType` and `MapType`, I think it's necessary to have more case matching than would be in `findWiderTypeForArray`, hence `findWiderTypeForTwoComplex`.
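The idea in the comment above, recursing through nested complex types while delegating the leaf decision to a pluggable widening function, might look roughly like the sketch below. It is an illustration only and does not reproduce the PR's code: the `DType` hierarchy, `widerComplex`, `numericOnly`, and `withStringPromotion` are all hypothetical stand-ins for Spark's `DataType` classes and for `findWiderTypeForTwoComplex` with its `widerTypeFunc` argument.

```scala
object WiderTypeSketch {
  // Toy type model standing in for Spark's DataType hierarchy (an assumption of this sketch).
  sealed trait DType
  case object IntT extends DType
  case object LongT extends DType
  case object StringT extends DType
  final case class ArrayT(element: DType, containsNull: Boolean) extends DType
  final case class MapT(key: DType, value: DType, valueContainsNull: Boolean) extends DType
  final case class StructT(fields: List[(String, DType)]) extends DType

  // Recurse through arrays, maps, and structs; defer to widerPoint for everything else.
  def widerComplex(
      t1: DType,
      t2: DType,
      widerPoint: (DType, DType) => Option[DType]): Option[DType] = (t1, t2) match {
    case (a, b) if a == b => Some(a)
    case (ArrayT(e1, n1), ArrayT(e2, n2)) =>
      widerComplex(e1, e2, widerPoint).map(ArrayT(_, n1 || n2))
    case (MapT(k1, v1, n1), MapT(k2, v2, n2)) =>
      for {
        k <- widerComplex(k1, k2, widerPoint)
        v <- widerComplex(v1, v2, widerPoint)
      } yield MapT(k, v, n1 || n2)
    case (StructT(f1), StructT(f2)) if f1.map(_._1) == f2.map(_._1) =>
      // Widen field by field; the struct widens only if every field does.
      val widened = f1.zip(f2).map { case ((name, a), (_, b)) =>
        widerComplex(a, b, widerPoint).map(name -> _)
      }
      if (widened.forall(_.isDefined)) Some(StructT(widened.flatten)) else None
    case _ => widerPoint(t1, t2)
  }

  // Leaf rule without string promotion.
  val numericOnly: (DType, DType) => Option[DType] = (a, b) => (a, b) match {
    case (IntT, LongT) | (LongT, IntT) => Some(LongT)
    case _ => None
  }

  // Leaf rule that additionally promotes a numeric/string pair to string.
  val withStringPromotion: (DType, DType) => Option[DType] = (a, b) =>
    numericOnly(a, b).orElse((a, b) match {
      case (StringT, IntT | LongT) | (IntT | LongT, StringT) => Some(StringT)
      case _ => None
    })
}
```

Passing the leaf rule as a parameter lets one traversal serve both behaviors: an array of ints nested in a struct widens against an array of strings only when the caller supplies the string-promoting rule.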
[GitHub] spark issue #20035: [SPARK-22848][SQL] Eliminate mutable state from Stack
Github user AmplabJenkins commented on the issue: https://github.com/apache/spark/pull/20035 Merged build finished. Test FAILed.
[GitHub] spark issue #20035: [SPARK-22848][SQL] Eliminate mutable state from Stack
Github user AmplabJenkins commented on the issue: https://github.com/apache/spark/pull/20035 Test FAILed. Refer to this link for build results (access rights to CI server needed): https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/85191/ Test FAILed.
[GitHub] spark issue #20035: [SPARK-22848][SQL] Eliminate mutable state from Stack
Github user SparkQA commented on the issue: https://github.com/apache/spark/pull/20035 **[Test build #85191 has finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/85191/testReport)** for PR 20035 at commit [`81306c4`](https://github.com/apache/spark/commit/81306c436a533b83d21c30bed8d8504ff66f8c76). * This patch **fails PySpark unit tests**. * This patch merges cleanly. * This patch adds no public classes.
[GitHub] spark pull request #20037: [SPARK-22849] ivy.retrieve pattern should also co...
GitHub user gatorsmile opened a pull request: https://github.com/apache/spark/pull/20037 [SPARK-22849] ivy.retrieve pattern should also consider `classifier` ## What changes were proposed in this pull request? In the previous PR https://github.com/apache/spark/pull/5755#discussion_r157848354, we dropped `(-[classifier])` from the retrieval pattern. We should add it back; otherwise, > If this pattern for instance doesn't has the [type] or [classifier] token, Ivy will download the source/javadoc artifacts to the same file as the regular jar. ## How was this patch tested? The existing tests You can merge this pull request into a Git repository by running: $ git pull https://github.com/gatorsmile/spark addClassifier Alternatively you can review and apply these changes as the patch at: https://github.com/apache/spark/pull/20037.patch To close this pull request, make a commit to your master/trunk branch with (at least) the following in the commit message: This closes #20037 commit 331ba338ce020a927fcfd88b3bd7e536fc8d3b66 Author: gatorsmile Date: 2017-12-20T16:59:25Z fix
[GitHub] spark pull request #19954: [SPARK-22757][Kubernetes] Enable use of remote de...
Github user liyinan926 commented on a diff in the pull request: https://github.com/apache/spark/pull/19954#discussion_r158074477 --- Diff: resource-managers/kubernetes/core/src/main/scala/org/apache/spark/deploy/k8s/submit/steps/DriverInitContainerBootstrapStep.scala --- @@ -0,0 +1,94 @@ +/* + * Licensed to the Apache Software Foundation (ASF) under one or more + * contributor license agreements. See the NOTICE file distributed with + * this work for additional information regarding copyright ownership. + * The ASF licenses this file to You under the Apache License, Version 2.0 + * (the "License"); you may not use this file except in compliance with + * the License. You may obtain a copy of the License at + * + *http://www.apache.org/licenses/LICENSE-2.0 + * + * Unless required by applicable law or agreed to in writing, software + * distributed under the License is distributed on an "AS IS" BASIS, + * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied. + * See the License for the specific language governing permissions and + * limitations under the License. + */ +package org.apache.spark.deploy.k8s.submit.steps + +import java.io.StringWriter +import java.util.Properties + +import io.fabric8.kubernetes.api.model.{ConfigMap, ConfigMapBuilder, ContainerBuilder, HasMetadata} + +import org.apache.spark.deploy.k8s.Config._ +import org.apache.spark.deploy.k8s.submit.{InitContainerUtil, KubernetesDriverSpec} +import org.apache.spark.deploy.k8s.submit.steps.initcontainer.{InitContainerConfigurationStep, InitContainerSpec} + +/** + * Configures the driver init-container that localizes remote dependencies into the driver pod. + * It applies the given InitContainerConfigurationSteps in the given order to produce a final + * InitContainerSpec that is then used to configure the driver pod with the init-container attached. + * It also builds a ConfigMap that will be mounted into the init-container. The ConfigMap carries + * configuration properties for the init-container. + */ +private[spark] class DriverInitContainerBootstrapStep( +steps: Seq[InitContainerConfigurationStep], --- End diff -- I think we can address this as part of the refactoring work.
[GitHub] spark issue #19884: [SPARK-22324][SQL][PYTHON] Upgrade Arrow to 0.8.0
Github user HyukjinKwon commented on the issue: https://github.com/apache/spark/pull/19884 I need to go to sleep now and I guess @ueshin should be sleeping too. Let me leave my sign-off here: LGTM if the tests pass. I guess builds in other PRs will be broken until this PR is merged. Let me cc @cloud-fan and @srowen here, who I believe live in a different timezone.
[GitHub] spark pull request #20035: [SPARK-22848][SQL] Eliminate mutable state from S...
Github user cloud-fan commented on a diff in the pull request: https://github.com/apache/spark/pull/20035#discussion_r158071129 --- Diff: sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/expressions/generators.scala --- @@ -214,11 +213,12 @@ case class Stack(children: Seq[Expression]) extends Generator { // Create the collection. val wrapperClass = classOf[mutable.WrappedArray[_]].getName -ctx.addMutableState( - s"$wrapperClass", - ev.value, - v => s"$v = $wrapperClass$$.MODULE$$.make($rowData);", useFreshName = false) -ev.copy(code = code, isNull = "false") +ev.copy(code = + s""" + |InternalRow[] $rowData = new InternalRow[$numRows]; --- End diff -- this creates a large array every time, and I don't think we have data copy issues for generator expressions...
[GitHub] spark issue #20036: [SPARK-18016][SQL][FOLLOW-UP] Code Generation: Constant ...
Github user SparkQA commented on the issue: https://github.com/apache/spark/pull/20036 **[Test build #85196 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/85196/testReport)** for PR 20036 at commit [`53661eb`](https://github.com/apache/spark/commit/53661eb72bba55376bc6112b51c25489522d309c).
[GitHub] spark issue #20021: [SPARK-22668][SQL] Ensure no global variables in argumen...
Github user cloud-fan commented on the issue: https://github.com/apache/spark/pull/20021 Well, I proposed checking it only in tests at the beginning, but I don't have a strong preference now, as the new approach I took can guarantee that no place violates it, by looking at all the call sites of `ctx.splitExpressions`. Anyway, only checking it in tests is safer. WDYT @viirya?
[GitHub] spark issue #20035: [SPARK-22848][SQL] Eliminate mutable state from Stack
Github user AmplabJenkins commented on the issue: https://github.com/apache/spark/pull/20035 Test FAILed. Refer to this link for build results (access rights to CI server needed): https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/85189/ Test FAILed.
[GitHub] spark issue #20035: [SPARK-22848][SQL] Eliminate mutable state from Stack
Github user AmplabJenkins commented on the issue: https://github.com/apache/spark/pull/20035 Merged build finished. Test FAILed.
[GitHub] spark pull request #20036: [SPARK-18016][SQL][FOLLOW-UP] Code Generation: Co...
GitHub user kiszk opened a pull request: https://github.com/apache/spark/pull/20036 [SPARK-18016][SQL][FOLLOW-UP] Code Generation: Constant Pool Limit - reduce entries for mutable state ## What changes were proposed in this pull request? This PR addresses additional review comments in #19811 ## How was this patch tested? Existing test suites You can merge this pull request into a Git repository by running: $ git pull https://github.com/kiszk/spark SPARK-18066-followup Alternatively you can review and apply these changes as the patch at: https://github.com/apache/spark/pull/20036.patch To close this pull request, make a commit to your master/trunk branch with (at least) the following in the commit message: This closes #20036 commit 53661eb72bba55376bc6112b51c25489522d309c Author: Kazuaki Ishizaki Date: 2017-12-20T16:21:53Z initial commit
[GitHub] spark issue #20035: [SPARK-22848][SQL] Eliminate mutable state from Stack
Github user SparkQA commented on the issue: https://github.com/apache/spark/pull/20035 **[Test build #85189 has finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/85189/testReport)** for PR 20035 at commit [`2b65856`](https://github.com/apache/spark/commit/2b65856b78d663564a876f601d2f3bdfc7f353a6). * This patch **fails PySpark unit tests**. * This patch merges cleanly. * This patch adds no public classes.
[GitHub] spark issue #20030: [SPARK-10496][CORE] Efficient RDD cumulative sum
Github user AmplabJenkins commented on the issue: https://github.com/apache/spark/pull/20030 Merged build finished. Test FAILed.
[GitHub] spark issue #20030: [SPARK-10496][CORE] Efficient RDD cumulative sum
Github user AmplabJenkins commented on the issue: https://github.com/apache/spark/pull/20030 Test FAILed. Refer to this link for build results (access rights to CI server needed): https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/85188/ Test FAILed.
[GitHub] spark issue #20030: [SPARK-10496][CORE] Efficient RDD cumulative sum
Github user SparkQA commented on the issue: https://github.com/apache/spark/pull/20030 **[Test build #85188 has finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/85188/testReport)** for PR 20030 at commit [`a596e58`](https://github.com/apache/spark/commit/a596e588bfc5c321d6cb88c5f630bf683cd1c110). * This patch **fails PySpark unit tests**. * This patch merges cleanly. * This patch adds no public classes.
[GitHub] spark issue #20019: [SPARK-22361][SQL][TEST] Add unit test for Window Frames
Github user AmplabJenkins commented on the issue: https://github.com/apache/spark/pull/20019 Merged build finished. Test FAILed.
[GitHub] spark issue #20019: [SPARK-22361][SQL][TEST] Add unit test for Window Frames
Github user AmplabJenkins commented on the issue: https://github.com/apache/spark/pull/20019 Test FAILed. Refer to this link for build results (access rights to CI server needed): https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/85190/ Test FAILed.
[GitHub] spark issue #20019: [SPARK-22361][SQL][TEST] Add unit test for Window Frames
Github user SparkQA commented on the issue: https://github.com/apache/spark/pull/20019 **[Test build #85190 has finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/85190/testReport)** for PR 20019 at commit [`a2fa042`](https://github.com/apache/spark/commit/a2fa042b4adaf7ccee2ce9795ffa1e219aa08645). * This patch **fails PySpark unit tests**. * This patch merges cleanly. * This patch adds no public classes.
[GitHub] spark issue #19884: [SPARK-22324][SQL][PYTHON] Upgrade Arrow to 0.8.0
Github user SparkQA commented on the issue: https://github.com/apache/spark/pull/19884 **[Test build #85195 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/85195/testReport)** for PR 19884 at commit [`d92ae90`](https://github.com/apache/spark/commit/d92ae90e05f55955eaad8e7f55e6324bf333a6bc).
[GitHub] spark issue #19884: [SPARK-22324][SQL][PYTHON] Upgrade Arrow to 0.8.0
Github user shaneknapp commented on the issue: https://github.com/apache/spark/pull/19884 test this please
[GitHub] spark issue #20033: [SPARK-22847] [CORE] Remove redundant code in AppStatusL...
Github user SparkQA commented on the issue: https://github.com/apache/spark/pull/20033 **[Test build #4017 has started](https://amplab.cs.berkeley.edu/jenkins/job/NewSparkPullRequestBuilder/4017/testReport)** for PR 20033 at commit [`ace04a5`](https://github.com/apache/spark/commit/ace04a5c75a0dc46e0575677be6be77ab6b58895).
[GitHub] spark issue #20021: [SPARK-22668][SQL] Ensure no global variables in argumen...
Github user mgaido91 commented on the issue: https://github.com/apache/spark/pull/20021 Honestly, I much preferred running this check only in tests and not throwing an exception in production. IMHO throwing an exception in production is overkill. Moreover, in the unlikely case that we forget one place where this check throws even though there is no real issue (which is perfectly possible), this would also cause a regression. So, honestly, I am strongly against this solution.
[GitHub] spark pull request #20032: [SPARK-22845] [Scheduler] Modify spark.kubernetes...
Github user srowen commented on a diff in the pull request: https://github.com/apache/spark/pull/20032#discussion_r158064721
--- Diff: resource-managers/kubernetes/core/src/main/scala/org/apache/spark/scheduler/cluster/k8s/KubernetesClusterSchedulerBackend.scala ---
@@ -217,7 +217,7 @@ private[spark] class KubernetesClusterSchedulerBackend(
           .watch(new ExecutorPodsWatcher()))
     allocatorExecutor.scheduleWithFixedDelay(
-      allocatorRunnable, 0L, podAllocationInterval, TimeUnit.SECONDS)
+      allocatorRunnable, 0L, podAllocationInterval.toLong, TimeUnit.MILLISECONDS)
--- End diff --
Just checking this definitely returns ms? Looks good then.
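The question above is about keeping the delay value and its `TimeUnit` in agreement. A minimal sketch of that idea, assuming a hypothetical `podAllocationIntervalMs` value that is already in milliseconds (it stands in for the real configuration lookup, which is not shown in this thread):

```scala
import java.util.concurrent.{CountDownLatch, Executors, TimeUnit}

object AllocatorDelaySketch {
  // Hypothetical stand-in for a configured interval that is already in milliseconds.
  val podAllocationIntervalMs: Long = 50L

  // Schedules a task with a fixed delay and reports whether it ran
  // `ticks` times within five seconds.
  def ranAtLeast(ticks: Int): Boolean = {
    val executor = Executors.newSingleThreadScheduledExecutor()
    val latch = new CountDownLatch(ticks)
    val task = new Runnable { override def run(): Unit = latch.countDown() }
    try {
      // The numeric delay and the unit must describe the same quantity:
      // a millisecond value goes with TimeUnit.MILLISECONDS.
      executor.scheduleWithFixedDelay(task, 0L, podAllocationIntervalMs, TimeUnit.MILLISECONDS)
      latch.await(5, TimeUnit.SECONDS)
    } finally {
      executor.shutdownNow()
    }
  }

  def main(args: Array[String]): Unit = {
    // One second expressed in milliseconds, to make the factor of 1000 explicit.
    println(TimeUnit.SECONDS.toMillis(1))
    println(ranAtLeast(3))
  }
}
```

If a millisecond value were paired with `TimeUnit.SECONDS`, the task would run a thousand times less often than intended, which is why the reviewer asks what unit the configuration value is returned in.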
[GitHub] spark issue #20030: [SPARK-10496][CORE] Efficient RDD cumulative sum
Github user srowen commented on the issue: https://github.com/apache/spark/pull/20030 This is a lot of code for what sounds like a small win; what is the performance gain? You can already do this with windowing, right?
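For readers unfamiliar with the operation under discussion, here is a plain-Scala sketch of a cumulative sum over partitioned data. It is an illustration only, not the code from this PR: partitions are modelled as an ordinary `Seq[Seq[Long]]`, and the two-pass structure (per-partition totals first, then local running sums shifted by an offset) is an assumption about how such a computation is typically organised.

```scala
object CumulativeSumSketch {
  // Partitions are modelled as a plain Seq of Seqs; this is not Spark's RDD API.
  def cumulativeSum(partitions: Seq[Seq[Long]]): Seq[Seq[Long]] = {
    // Pass 1: total of each partition.
    val totals = partitions.map(_.sum)
    // Offset of a partition = sum of the totals of all earlier partitions.
    val offsets = totals.scanLeft(0L)(_ + _).init
    // Pass 2: running sum inside each partition, starting from its offset.
    partitions.zip(offsets).map { case (part, offset) =>
      part.scanLeft(offset)(_ + _).tail
    }
  }

  def main(args: Array[String]): Unit = {
    println(cumulativeSum(Seq(Seq(1L, 2L, 3L), Seq(4L, 5L), Seq(6L))))
  }
}
```

Each partition only needs the totals of the partitions before it, so the second pass can run independently per partition.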
[GitHub] spark issue #20035: [SPARK-22848][SQL] Eliminate mutable state from Stack
Github user SparkQA commented on the issue: https://github.com/apache/spark/pull/20035 **[Test build #85194 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/85194/testReport)** for PR 20035 at commit [`d31ccd7`](https://github.com/apache/spark/commit/d31ccd7b8aae1f46e7e9e5bfbb7d390c0992678c).
[GitHub] spark pull request #19811: [SPARK-18016][SQL] Code Generation: Constant Pool...
Github user kiszk commented on a diff in the pull request: https://github.com/apache/spark/pull/19811#discussion_r158060614
--- Diff: sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/expressions/regexpExpressions.scala ---
@@ -112,15 +112,15 @@ case class Like(left: Expression, right: Expression) extends StringRegexExpressi
   override protected def doGenCode(ctx: CodegenContext, ev: ExprCode): ExprCode = {
     val patternClass = classOf[Pattern].getName
     val escapeFunc = StringUtils.getClass.getName.stripSuffix("$") + ".escapeLikeRegex"
-    val pattern = ctx.freshName("pattern")
     if (right.foldable) {
       val rVal = right.eval()
       if (rVal != null) {
         val regexStr =
           StringEscapeUtils.escapeJava(escape(rVal.asInstanceOf[UTF8String].toString()))
-        ctx.addMutableState(patternClass, pattern,
-          s"""$pattern = ${patternClass}.compile("$regexStr");""")
+        // inline mutable state since not many Like operations in a task
--- End diff --
Sure
[GitHub] spark pull request #19250: [SPARK-12297] Table timezone correction for Times...
Github user squito closed the pull request at: https://github.com/apache/spark/pull/19250
[GitHub] spark issue #19975: [SPARK-22781][SS] Support creating streaming dataset wit...
Github user dongjoon-hyun commented on the issue: https://github.com/apache/spark/pull/19975 Thank you so much, @zsxwing and @brkyvz
[GitHub] spark issue #20035: [SPARK-22848][SQL] Eliminate mutable state from Stack
Github user kiszk commented on the issue: https://github.com/apache/spark/pull/20035 This PR comes from [this discussion](https://github.com/apache/spark/pull/19811#discussion_r157408061).
[GitHub] spark pull request #19035: [SPARK-21822][SQL]When insert Hive Table is finis...
Github user gaborgsomogyi commented on a diff in the pull request: https://github.com/apache/spark/pull/19035#discussion_r158047414
--- Diff: sql/hive/src/main/scala/org/apache/spark/sql/hive/execution/InsertIntoHiveTable.scala ---
@@ -435,6 +435,18 @@ case class InsertIntoHiveTable(
         logWarning(s"Unable to delete staging directory: $stagingDir.\n" + e)
     }
+    //delete the tmpLocation dir
--- End diff --
lint-scala gives an error for this line: Insert a space after the start of the comment.
[GitHub] spark pull request #19035: [SPARK-21822][SQL]When insert Hive Table is finis...
Github user gaborgsomogyi commented on a diff in the pull request: https://github.com/apache/spark/pull/19035#discussion_r158050733
--- Diff: sql/hive/src/main/scala/org/apache/spark/sql/hive/execution/InsertIntoHiveTable.scala ---
@@ -435,6 +435,18 @@ case class InsertIntoHiveTable(
         logWarning(s"Unable to delete staging directory: $stagingDir.\n" + e)
     }
+    //delete the tmpLocation dir
+    try {
+      val fs = tmpLocation.getFileSystem(hadoopConf)
+      if (fs.delete(tmpLocation, true)) {
+        // If we successfully delete the tmpLocation dir, remove it from FileSystem's cache.
+        fs.cancelDeleteOnExit(tmpLocation)
+      }
+    } catch {
+      case NonFatal(e) =>
+        logWarning(s"Unable to delete tmpLocation directory:" + tmpLocation.toString + "\n" + e)
--- End diff --
This can be compressed a bit: ` logWarning(s"Unable to delete tmpLocation directory: $tmpLocation\n$e") `
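The suggestion relies on Scala's `s` interpolator calling `toString` on each spliced value. A small sketch comparing the two forms, where `tmpLocation` is a hypothetical plain value rather than the Hadoop `Path` used in the PR, and the verbose form includes the space after the colon so that the two strings match exactly:

```scala
object LogMessageSketch {
  // Verbose form: string concatenation with an explicit toString call.
  def verbose(tmpLocation: Any, e: Throwable): String =
    "Unable to delete tmpLocation directory: " + tmpLocation.toString + "\n" + e

  // Compact form: the s-interpolator converts each spliced value to a string.
  def compact(tmpLocation: Any, e: Throwable): String =
    s"Unable to delete tmpLocation directory: $tmpLocation\n$e"

  def main(args: Array[String]): Unit = {
    val e = new java.io.IOException("disk error")
    println(compact("/tmp/hive-staging", e))
  }
}
```

For non-null values both functions build the same message, so the compact form is purely a readability change.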
[GitHub] spark issue #17425: [HOTFIX] [SQL] Fix the failed test cases in GeneratorFun...
Github user kiszk commented on the issue: https://github.com/apache/spark/pull/17425 @gatorsmile I have the same question. Which operation in Generator is bad? All of them, or only one?
[GitHub] spark issue #20008: [SPARK-22822][TEST] Basic tests for WindowFrameCoercion ...
Github user AmplabJenkins commented on the issue: https://github.com/apache/spark/pull/20008 Merged build finished. Test FAILed.
[GitHub] spark issue #20008: [SPARK-22822][TEST] Basic tests for WindowFrameCoercion ...
Github user AmplabJenkins commented on the issue: https://github.com/apache/spark/pull/20008 Test FAILed. Refer to this link for build results (access rights to CI server needed): https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/85187/ Test FAILed.
[GitHub] spark issue #20008: [SPARK-22822][TEST] Basic tests for WindowFrameCoercion ...
Github user SparkQA commented on the issue: https://github.com/apache/spark/pull/20008 **[Test build #85187 has finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/85187/testReport)** for PR 20008 at commit [`3291339`](https://github.com/apache/spark/commit/3291339bfa643f12e9d5c3d7cb68c02617f22afa). * This patch **fails Spark unit tests**. * This patch merges cleanly. * This patch adds no public classes.
[GitHub] spark issue #20020: [SPARK-22834][SQL] Make insertion commands have real chi...
Github user SparkQA commented on the issue: https://github.com/apache/spark/pull/20020 **[Test build #85193 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/85193/testReport)** for PR 20020 at commit [`7ccfd90`](https://github.com/apache/spark/commit/7ccfd909a71a531be25ab0eccdf7b347d06c2c0a).
[GitHub] spark issue #20008: [SPARK-22822][TEST] Basic tests for WindowFrameCoercion ...
Github user AmplabJenkins commented on the issue: https://github.com/apache/spark/pull/20008 Merged build finished. Test FAILed.
[GitHub] spark issue #20008: [SPARK-22822][TEST] Basic tests for WindowFrameCoercion ...
Github user AmplabJenkins commented on the issue: https://github.com/apache/spark/pull/20008 Test FAILed. Refer to this link for build results (access rights to CI server needed): https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/85186/ Test FAILed.
[GitHub] spark issue #20008: [SPARK-22822][TEST] Basic tests for WindowFrameCoercion ...
Github user SparkQA commented on the issue: https://github.com/apache/spark/pull/20008 **[Test build #85186 has finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/85186/testReport)** for PR 20008 at commit [`19bcca1`](https://github.com/apache/spark/commit/19bcca13ab03c9a5cb5399476e1afac26a30ec49). * This patch **fails Spark unit tests**. * This patch merges cleanly. * This patch adds no public classes.
[GitHub] spark issue #20002: [SPARK-22465][Core][WIP] Add a safety-check to RDD defau...
Github user AmplabJenkins commented on the issue: https://github.com/apache/spark/pull/20002 Merged build finished. Test FAILed.
[GitHub] spark issue #20002: [SPARK-22465][Core][WIP] Add a safety-check to RDD defau...
Github user AmplabJenkins commented on the issue: https://github.com/apache/spark/pull/20002 Test FAILed. Refer to this link for build results (access rights to CI server needed): https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/85192/ Test FAILed.
[GitHub] spark issue #20002: [SPARK-22465][Core][WIP] Add a safety-check to RDD defau...
Github user SparkQA commented on the issue: https://github.com/apache/spark/pull/20002 **[Test build #85192 has finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/85192/testReport)** for PR 20002 at commit [`4b2dcac`](https://github.com/apache/spark/commit/4b2dcac9462879bb58e626dbab124321d00d4110). * This patch **fails Scala style tests**. * This patch merges cleanly. * This patch adds no public classes.
[GitHub] spark issue #20002: [SPARK-22465][Core][WIP] Add a safety-check to RDD defau...
Github user SparkQA commented on the issue: https://github.com/apache/spark/pull/20002 **[Test build #85192 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/85192/testReport)** for PR 20002 at commit [`4b2dcac`](https://github.com/apache/spark/commit/4b2dcac9462879bb58e626dbab124321d00d4110).
[GitHub] spark issue #20002: [SPARK-22465][Core][WIP] Add a safety-check to RDD defau...
Github user tgravescs commented on the issue: https://github.com/apache/spark/pull/20002 Jenkins, test this please
[GitHub] spark issue #19977: [SPARK-22771][SQL] Concatenate binary inputs into a bina...
Github user AmplabJenkins commented on the issue: https://github.com/apache/spark/pull/19977 Test FAILed. Refer to this link for build results (access rights to CI server needed): https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/85182/ Test FAILed.
[GitHub] spark issue #19977: [SPARK-22771][SQL] Concatenate binary inputs into a bina...
Github user AmplabJenkins commented on the issue: https://github.com/apache/spark/pull/19977 Merged build finished. Test FAILed.
[GitHub] spark issue #19977: [SPARK-22771][SQL] Concatenate binary inputs into a bina...
Github user SparkQA commented on the issue: https://github.com/apache/spark/pull/19977 **[Test build #85182 has finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/85182/testReport)** for PR 19977 at commit [`fc14aeb`](https://github.com/apache/spark/commit/fc14aeb4e92e67aba1750fc1bc2b0fc9afaa5fac). * This patch **fails PySpark unit tests**. * This patch merges cleanly. * This patch adds no public classes.
[GitHub] spark issue #20035: [SPARK-22848][SQL] Eliminate mutable state from Stack
Github user SparkQA commented on the issue: https://github.com/apache/spark/pull/20035 **[Test build #85191 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/85191/testReport)** for PR 20035 at commit [`81306c4`](https://github.com/apache/spark/commit/81306c436a533b83d21c30bed8d8504ff66f8c76).
[GitHub] spark issue #19884: [SPARK-22324][SQL][PYTHON] Upgrade Arrow to 0.8.0
Github user shaneknapp commented on the issue: https://github.com/apache/spark/pull/19884 alright https://github.com/apache/spark/pull/18754 should now be unblocked. let me know if there's anything else that needs to happen.
[GitHub] spark issue #20035: Enable to execute generated code of function in Generato...
Github user SparkQA commented on the issue: https://github.com/apache/spark/pull/20035 **[Test build #85189 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/85189/testReport)** for PR 20035 at commit [`2b65856`](https://github.com/apache/spark/commit/2b65856b78d663564a876f601d2f3bdfc7f353a6).
[GitHub] spark issue #20019: [SPARK-22361][SQL][TEST] Add unit test for Window Frames
Github user SparkQA commented on the issue: https://github.com/apache/spark/pull/20019 **[Test build #85190 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/85190/testReport)** for PR 20019 at commit [`a2fa042`](https://github.com/apache/spark/commit/a2fa042b4adaf7ccee2ce9795ffa1e219aa08645).
[GitHub] spark issue #20019: [SPARK-22361][SQL][TEST] Add unit test for Window Frames
Github user gaborgsomogyi commented on the issue: https://github.com/apache/spark/pull/20019 @smurakozi nice catch, added them. Additionally found a nit.
[GitHub] spark issue #19884: [SPARK-22324][SQL][PYTHON] Upgrade Arrow to 0.8.0
Github user shaneknapp commented on the issue: https://github.com/apache/spark/pull/19884 we should be good to go:
```
$ pssh -h jenkins_workers.txt -t 0 "export PATH=/home/anaconda/envs/py3k/bin$PATH; pip install pyarrow==0.8.0"
[1] 05:55:00 [SUCCESS] amp-jenkins-worker-01
[2] 05:55:00 [SUCCESS] amp-jenkins-worker-03
[3] 05:55:00 [SUCCESS] amp-jenkins-worker-08
[4] 05:55:00 [SUCCESS] amp-jenkins-worker-07
[5] 05:55:00 [SUCCESS] amp-jenkins-worker-05
[6] 05:55:00 [SUCCESS] amp-jenkins-worker-04
[7] 05:55:00 [SUCCESS] amp-jenkins-worker-06
[8] 05:55:00 [SUCCESS] amp-jenkins-worker-02
```
...and...
```
$ pssh -h jenkins_workers.txt -t 0 -i "export PATH=/home/anaconda/envs/py3k/bin:$PATH; pip show pyarrow | grep ^Version"
[1] 05:56:28 [SUCCESS] amp-jenkins-worker-02
Version: 0.8.0
[2] 05:56:28 [SUCCESS] amp-jenkins-worker-06
Version: 0.8.0
[3] 05:56:28 [SUCCESS] amp-jenkins-worker-03
Version: 0.8.0
[4] 05:56:28 [SUCCESS] amp-jenkins-worker-05
Version: 0.8.0
[5] 05:56:28 [SUCCESS] amp-jenkins-worker-08
Version: 0.8.0
[6] 05:56:28 [SUCCESS] amp-jenkins-worker-04
Version: 0.8.0
[7] 05:56:28 [SUCCESS] amp-jenkins-worker-07
Version: 0.8.0
[8] 05:56:28 [SUCCESS] amp-jenkins-worker-01
Version: 0.8.0
```
[GitHub] spark pull request #20035: Enable to execute generated code of function in G...
Github user kiszk commented on a diff in the pull request: https://github.com/apache/spark/pull/20035#discussion_r158029035
--- Diff: sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/expressions/generators.scala ---
@@ -214,11 +214,15 @@ case class Stack(children: Seq[Expression]) extends Generator {
     // Create the collection.
     val wrapperClass = classOf[mutable.WrappedArray[_]].getName
-    ctx.addMutableState(
+    val wrappedArray = ctx.addMutableState(
       s"$wrapperClass",
-      ev.value,
-      v => s"$v = $wrapperClass$$.MODULE$$.make($rowData);", useFreshName = false)
-    ev.copy(code = code, isNull = "false")
+      "stackWrappedArray",
+      v => s"$v = $wrapperClass$$.MODULE$$.make($rowData);")
+    ev.copy(code =
+      s"""
+        |$code
+        |$wrapperClass ${ev.value} = $wrappedArray;
+      """.stripMargin, isNull = "false")
--- End diff --
This change does not use `inline = true` for correct code generation.
[GitHub] spark pull request #20035: Enable to execute generated code of function in G...
GitHub user kiszk opened a pull request: https://github.com/apache/spark/pull/20035
Enable to execute generated code of function in Generator
## What changes were proposed in this pull request?
This PR enables execution of the generated code for `Explode`, `PosExplode`, `Inline`, and `Stack` in Generator, for which `doGenCode` has already been implemented. This PR also fixes the incorrect code generated for `Stack`.
## How was this patch tested?
Existing test suites
You can merge this pull request into a Git repository by running: $ git pull https://github.com/kiszk/spark SPARK-22848
Alternatively you can review and apply these changes as the patch at: https://github.com/apache/spark/pull/20035.patch
To close this pull request, make a commit to your master/trunk branch with (at least) the following in the commit message: This closes #20035
commit 2b65856b78d663564a876f601d2f3bdfc7f353a6 Author: Kazuaki Ishizaki Date: 2017-12-20T13:51:39Z initial commit
[GitHub] spark pull request #19992: [SPARK-22805][CORE] Use StorageLevel aliases in e...
Github user superbobry commented on a diff in the pull request: https://github.com/apache/spark/pull/19992#discussion_r158028287
--- Diff: core/src/main/scala/org/apache/spark/util/JsonProtocol.scala ---
@@ -444,12 +444,15 @@ private[spark] object JsonProtocol {
     ("Disk Size" -> rddInfo.diskSize)
   }
-  def storageLevelToJson(storageLevel: StorageLevel): JValue = {
-    ("Use Disk" -> storageLevel.useDisk) ~
-    ("Use Memory" -> storageLevel.useMemory) ~
-    ("Deserialized" -> storageLevel.deserialized) ~
-    ("Replication" -> storageLevel.replication)
-  }
+  def storageLevelToJson(storageLevel: StorageLevel): JValue =
--- End diff --
Sorry, missed it. I've decided not to add braces to `storageLevelFromJson` because it seems to look OK with the top-level match.
[GitHub] spark issue #19884: [SPARK-22324][SQL][PYTHON] Upgrade Arrow to 0.8.0
Github user shaneknapp commented on the issue: https://github.com/apache/spark/pull/19884 @HyukjinKwon @wesm @BryanCutler alright. here's my plan for right now:
* python 3.4.5 -- upgrade pyarrow --> 0.8.0 (confirmed working on my staging environment)
what i'm not going to do today:
* install pyarrow for python 2.7
* mess with the pypy installation
i should have pyarrow updated across all workers in ~15 mins, tops. and please note that spark is only built on centos and ubuntu *nix distros @ RISELab (née AMPLab). we do not have, nor plan on having any windows build nodes in the immediate future.
[GitHub] spark issue #20020: [SPARK-22834][SQL] Make insertion commands have real chi...
Github user cloud-fan commented on the issue: https://github.com/apache/spark/pull/20020 Since insertion commands now have a child, it makes sense to wrap the child with `AnalysisBarrier` to avoid re-analyzing it. Also cc @viirya
[GitHub] spark issue #19946: [SPARK-22648] [Scheduler] Spark on Kubernetes - Document...
Github user AmplabJenkins commented on the issue: https://github.com/apache/spark/pull/19946 Test PASSed. Refer to this link for build results (access rights to CI server needed): https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/85179/ Test PASSed.
[GitHub] spark issue #19946: [SPARK-22648] [Scheduler] Spark on Kubernetes - Document...
Github user AmplabJenkins commented on the issue: https://github.com/apache/spark/pull/19946 Merged build finished. Test PASSed.
[GitHub] spark issue #19946: [SPARK-22648] [Scheduler] Spark on Kubernetes - Document...
Github user SparkQA commented on the issue: https://github.com/apache/spark/pull/19946 **[Test build #85179 has finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/85179/testReport)** for PR 19946 at commit [`702162b`](https://github.com/apache/spark/commit/702162b4ca9eab83adb0b362d5b4d9479b6b3d0a). * This patch passes all tests. * This patch merges cleanly. * This patch adds no public classes.
[GitHub] spark pull request #20020: [SPARK-22834][SQL] Make insertion commands have r...
Github user cloud-fan commented on a diff in the pull request: https://github.com/apache/spark/pull/20020#discussion_r158025766
--- Diff: sql/core/src/main/scala/org/apache/spark/sql/execution/command/DataWritingCommand.scala ---
@@ -20,30 +20,18 @@ package org.apache.spark.sql.execution.command
 import org.apache.hadoop.conf.Configuration
 import org.apache.spark.SparkContext
-import org.apache.spark.sql.catalyst.plans.logical.LogicalPlan
+import org.apache.spark.sql.{Row, SparkSession}
+import org.apache.spark.sql.catalyst.plans.logical.{Command, LogicalPlan}
+import org.apache.spark.sql.execution.SparkPlan
 import org.apache.spark.sql.execution.datasources.BasicWriteJobStatsTracker
 import org.apache.spark.sql.execution.metric.{SQLMetric, SQLMetrics}
 import org.apache.spark.util.SerializableConfiguration
-
 /**
  * A special `RunnableCommand` which writes data out and updates metrics.
  */
-trait DataWritingCommand extends RunnableCommand {
-
-  /**
-   * The input query plan that produces the data to be written.
-   */
-  def query: LogicalPlan
-
-  // We make the input `query` an inner child instead of a child in order to hide it from the
-  // optimizer. This is because optimizer may not preserve the output schema names' case, and we
-  // have to keep the original analyzed plan here so that we can pass the corrected schema to the
-  // writer. The schema of analyzed plan is what user expects(or specifies), so we should respect
-  // it when writing.
-  override protected def innerChildren: Seq[LogicalPlan] = query :: Nil
--- End diff --
now shall we define query as a child here?
[GitHub] spark issue #19946: [SPARK-22648] [Scheduler] Spark on Kubernetes - Document...
Github user AmplabJenkins commented on the issue: https://github.com/apache/spark/pull/19946 Merged build finished. Test PASSed.
[GitHub] spark issue #19946: [SPARK-22648] [Scheduler] Spark on Kubernetes - Document...
Github user AmplabJenkins commented on the issue: https://github.com/apache/spark/pull/19946 Test PASSed. Refer to this link for build results (access rights to CI server needed): https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/85178/ Test PASSed.
[GitHub] spark issue #19946: [SPARK-22648] [Scheduler] Spark on Kubernetes - Document...
Github user SparkQA commented on the issue: https://github.com/apache/spark/pull/19946 **[Test build #85178 has finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/85178/testReport)** for PR 19946 at commit [`d235847`](https://github.com/apache/spark/commit/d2358470e86ab44522371b8fd733a97527d95ec5). * This patch passes all tests. * This patch merges cleanly. * This patch adds no public classes.
[GitHub] spark issue #20030: [SPARK-10496][CORE] Efficient RDD cumulative sum
Github user SparkQA commented on the issue: https://github.com/apache/spark/pull/20030 **[Test build #85188 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/85188/testReport)** for PR 20030 at commit [`a596e58`](https://github.com/apache/spark/commit/a596e588bfc5c321d6cb88c5f630bf683cd1c110).