[GitHub] spark issue #18975: [SPARK-4131] Support "Writing data into the filesystem f...
Github user SparkQA commented on the issue: https://github.com/apache/spark/pull/18975

**[Test build #80959 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/80959/testReport)** for PR 18975 at commit [`0882dd1`](https://github.com/apache/spark/commit/0882dd1f3c300f832d731b69a0d57ef461e55038).

---
If your project is set up for it, you can reply to this email and have your reply appear on GitHub as well. If your project does not have this feature enabled and wishes so, or if the feature is enabled but not working, please contact infrastructure at infrastruct...@apache.org or file a JIRA ticket with INFRA.

---
To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org
For additional commands, e-mail: reviews-h...@spark.apache.org
[GitHub] spark issue #19015: [SPARK-21803] [TEST] Remove the HiveDDLCommandSuite
Github user AmplabJenkins commented on the issue: https://github.com/apache/spark/pull/19015

Merged build finished. Test FAILed.
[GitHub] spark issue #19015: [SPARK-21803] [TEST] Remove the HiveDDLCommandSuite
Github user AmplabJenkins commented on the issue: https://github.com/apache/spark/pull/19015

Test FAILed. Refer to this link for build results (access rights to CI server needed): https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/80956/ Test FAILed.
[GitHub] spark issue #18704: [SPARK-20783][SQL] Create CachedBatchColumnVector to abs...
Github user AmplabJenkins commented on the issue: https://github.com/apache/spark/pull/18704

Merged build finished. Test FAILed.
[GitHub] spark issue #19015: [SPARK-21803] [TEST] Remove the HiveDDLCommandSuite
Github user SparkQA commented on the issue: https://github.com/apache/spark/pull/19015

**[Test build #80956 has finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/80956/testReport)** for PR 19015 at commit [`191bde1`](https://github.com/apache/spark/commit/191bde194bbb56c40f5d33e8fbaf5c3505d792cc).
 * This patch **fails Spark unit tests**.
 * This patch merges cleanly.
 * This patch adds no public classes.
[GitHub] spark issue #18704: [SPARK-20783][SQL] Create CachedBatchColumnVector to abs...
Github user SparkQA commented on the issue: https://github.com/apache/spark/pull/18704

**[Test build #80958 has finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/80958/testReport)** for PR 18704 at commit [`a24a971`](https://github.com/apache/spark/commit/a24a971ed61f054766e3ed8212c2035f1d391d54).
 * This patch **fails to build**.
 * This patch merges cleanly.
 * This patch adds no public classes.
[GitHub] spark issue #18704: [SPARK-20783][SQL] Create CachedBatchColumnVector to abs...
Github user AmplabJenkins commented on the issue: https://github.com/apache/spark/pull/18704

Test FAILed. Refer to this link for build results (access rights to CI server needed): https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/80958/ Test FAILed.
[GitHub] spark issue #18704: [SPARK-20783][SQL] Create CachedBatchColumnVector to abs...
Github user SparkQA commented on the issue: https://github.com/apache/spark/pull/18704

**[Test build #80958 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/80958/testReport)** for PR 18704 at commit [`a24a971`](https://github.com/apache/spark/commit/a24a971ed61f054766e3ed8212c2035f1d391d54).
[GitHub] spark issue #18641: [SPARK-21413][SQL] Fix 64KB JVM bytecode limit problem i...
Github user kiszk commented on the issue: https://github.com/apache/spark/pull/18641

ping @cloud-fan
[GitHub] spark pull request #18492: [SPARK-19326] Speculated task attempts do not get...
Github user cloud-fan commented on a diff in the pull request: https://github.com/apache/spark/pull/18492#discussion_r134387763

--- Diff: core/src/test/scala/org/apache/spark/ExecutorAllocationManagerSuite.scala ---
@@ -188,6 +188,40 @@ class ExecutorAllocationManagerSuite
     assert(numExecutorsTarget(manager) === 10)
   }

+  test("add executors when speculative tasks added") {
+    sc = createSparkContext(0, 10, 0)
+    val manager = sc.executorAllocationManager.get
+
+    // Verify that we're capped at number of tasks including the speculative ones in the stage
+    sc.listenerBus.postToAll(SparkListenerSpeculativeTaskSubmitted(1))
+    assert(numExecutorsTarget(manager) === 0)
+    assert(numExecutorsToAdd(manager) === 1)
+    assert(addExecutors(manager) === 1)
+    sc.listenerBus.postToAll(SparkListenerSpeculativeTaskSubmitted(1))
+    sc.listenerBus.postToAll(SparkListenerSpeculativeTaskSubmitted(1))
+    sc.listenerBus.postToAll(SparkListenerStageSubmitted(createStageInfo(1, 2)))
+    assert(numExecutorsTarget(manager) === 1)
+    assert(numExecutorsToAdd(manager) === 2)
+    assert(addExecutors(manager) === 2)
+    assert(numExecutorsTarget(manager) === 3)
+    assert(numExecutorsToAdd(manager) === 4)
+    assert(addExecutors(manager) === 2)
+    assert(numExecutorsTarget(manager) === 5)
+    assert(numExecutorsToAdd(manager) === 1)
+
+    // Verify that running a task doesn't affect the target
--- End diff --

Can you explain more about this test? Why can the first 3 `SparkListenerSpeculativeTaskSubmitted` events trigger allocating more executors, but here we don't?
[GitHub] spark issue #18962: [SPARK-21714][CORE][YARN] Avoiding re-uploading remote r...
Github user AmplabJenkins commented on the issue: https://github.com/apache/spark/pull/18962

Merged build finished. Test PASSed.
[GitHub] spark issue #18962: [SPARK-21714][CORE][YARN] Avoiding re-uploading remote r...
Github user AmplabJenkins commented on the issue: https://github.com/apache/spark/pull/18962

Test PASSed. Refer to this link for build results (access rights to CI server needed): https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/80951/ Test PASSed.
[GitHub] spark issue #18962: [SPARK-21714][CORE][YARN] Avoiding re-uploading remote r...
Github user SparkQA commented on the issue: https://github.com/apache/spark/pull/18962

**[Test build #80951 has finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/80951/testReport)** for PR 18962 at commit [`16ce99f`](https://github.com/apache/spark/commit/16ce99fc1cea9260a96dae98f031bda9f8ed18f4).
 * This patch passes all tests.
 * This patch merges cleanly.
 * This patch adds no public classes.
[GitHub] spark issue #19015: [SPARK-21803] [TEST] Remove the HiveDDLCommandSuite
Github user SparkQA commented on the issue: https://github.com/apache/spark/pull/19015

**[Test build #80957 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/80957/testReport)** for PR 19015 at commit [`391c4db`](https://github.com/apache/spark/commit/391c4db6cc338f3fcbf8e8d4fd43e3a6dcb365ba).
[GitHub] spark issue #18973: [SPARK-21765] Set isStreaming on leaf nodes for streamin...
Github user AmplabJenkins commented on the issue: https://github.com/apache/spark/pull/18973

Test PASSed. Refer to this link for build results (access rights to CI server needed): https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/80952/ Test PASSed.
[GitHub] spark issue #18973: [SPARK-21765] Set isStreaming on leaf nodes for streamin...
Github user AmplabJenkins commented on the issue: https://github.com/apache/spark/pull/18973

Merged build finished. Test PASSed.
[GitHub] spark issue #18973: [SPARK-21765] Set isStreaming on leaf nodes for streamin...
Github user SparkQA commented on the issue: https://github.com/apache/spark/pull/18973

**[Test build #80952 has finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/80952/testReport)** for PR 18973 at commit [`8857cf5`](https://github.com/apache/spark/commit/8857cf51f142865063c53e4a7089dd027db4d3c3).
 * This patch passes all tests.
 * This patch merges cleanly.
 * This patch adds no public classes.
[GitHub] spark pull request #19015: [SPARK-21803] [TEST] Remove the HiveDDLCommandSui...
Github user gatorsmile commented on a diff in the pull request: https://github.com/apache/spark/pull/19015#discussion_r134385759

--- Diff: sql/core/src/test/scala/org/apache/spark/sql/execution/command/DDLCommandSuite.scala ---
@@ -22,19 +22,26 @@ import java.util.Locale

 import scala.reflect.{classTag, ClassTag}

+import org.apache.spark.sql.{AnalysisException, SaveMode}
 import org.apache.spark.sql.catalyst.TableIdentifier
+import org.apache.spark.sql.catalyst.analysis.UnresolvedAttribute
 import org.apache.spark.sql.catalyst.catalog._
+import org.apache.spark.sql.catalyst.dsl.expressions._
+import org.apache.spark.sql.catalyst.dsl.plans
+import org.apache.spark.sql.catalyst.dsl.plans.DslLogicalPlan
+import org.apache.spark.sql.catalyst.expressions.JsonTuple
 import org.apache.spark.sql.catalyst.parser.ParseException
 import org.apache.spark.sql.catalyst.plans.PlanTest
-import org.apache.spark.sql.catalyst.plans.logical.Project
+import org.apache.spark.sql.catalyst.plans.logical.{Generate, LogicalPlan, Project, ScriptTransformation}
 import org.apache.spark.sql.execution.SparkSqlParser
 import org.apache.spark.sql.execution.datasources.CreateTable
 import org.apache.spark.sql.internal.{HiveSerDe, SQLConf}
+import org.apache.spark.sql.test.SharedSQLContext
 import org.apache.spark.sql.types.{IntegerType, StringType, StructField, StructType}

 // TODO: merge this with DDLSuite (SPARK-14441)
-class DDLCommandSuite extends PlanTest {
+class DDLCommandSuite extends PlanTest with SharedSQLContext {
--- End diff --

Sure.
[GitHub] spark issue #19015: [SPARK-21803] [TEST] Remove the HiveDDLCommandSuite
Github user cloud-fan commented on the issue: https://github.com/apache/spark/pull/19015

LGTM
[GitHub] spark pull request #19015: [SPARK-21803] [TEST] Remove the HiveDDLCommandSui...
Github user cloud-fan commented on a diff in the pull request: https://github.com/apache/spark/pull/19015#discussion_r134385680

--- Diff: sql/core/src/test/scala/org/apache/spark/sql/execution/command/DDLCommandSuite.scala ---
@@ -22,19 +22,26 @@ import java.util.Locale

 import scala.reflect.{classTag, ClassTag}

+import org.apache.spark.sql.{AnalysisException, SaveMode}
 import org.apache.spark.sql.catalyst.TableIdentifier
+import org.apache.spark.sql.catalyst.analysis.UnresolvedAttribute
 import org.apache.spark.sql.catalyst.catalog._
+import org.apache.spark.sql.catalyst.dsl.expressions._
+import org.apache.spark.sql.catalyst.dsl.plans
+import org.apache.spark.sql.catalyst.dsl.plans.DslLogicalPlan
+import org.apache.spark.sql.catalyst.expressions.JsonTuple
 import org.apache.spark.sql.catalyst.parser.ParseException
 import org.apache.spark.sql.catalyst.plans.PlanTest
-import org.apache.spark.sql.catalyst.plans.logical.Project
+import org.apache.spark.sql.catalyst.plans.logical.{Generate, LogicalPlan, Project, ScriptTransformation}
 import org.apache.spark.sql.execution.SparkSqlParser
 import org.apache.spark.sql.execution.datasources.CreateTable
 import org.apache.spark.sql.internal.{HiveSerDe, SQLConf}
+import org.apache.spark.sql.test.SharedSQLContext
 import org.apache.spark.sql.types.{IntegerType, StringType, StructField, StructType}

 // TODO: merge this with DDLSuite (SPARK-14441)
-class DDLCommandSuite extends PlanTest {
+class DDLCommandSuite extends PlanTest with SharedSQLContext {
--- End diff --

shall we rename it to `DDLParserSuite`?
[GitHub] spark issue #18957: [SPARK-21744][CORE] Add retry logic for new broadcast in...
Github user jiangxb1987 commented on the issue: https://github.com/apache/spark/pull/18957

Currently we don't retry for Broadcast; it may be worthwhile to add general-purpose retry logic for Broadcast so it stays consistent with Spark jobs. We shouldn't retry only for one specific scenario, though.
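For illustration only, a general-purpose retry wrapper of the kind discussed above might look like the sketch below. The function name, backoff parameters, and failure model are assumptions for this sketch, not Spark's actual broadcast API:

```python
import time


def with_retries(action, max_attempts=3, backoff_s=0.1):
    """Run `action`, retrying on any exception with linear backoff.

    Re-raises the last error if every attempt fails, so callers see the
    original failure rather than a generic one.
    """
    last_error = None
    for attempt in range(1, max_attempts + 1):
        try:
            return action()
        except Exception as err:  # hypothetical: treat any error as transient
            last_error = err
            if attempt < max_attempts:
                time.sleep(backoff_s * attempt)
    raise last_error


# A flaky action that fails twice, then succeeds on the third attempt.
attempts = []

def flaky():
    attempts.append(1)
    if len(attempts) < 3:
        raise RuntimeError("transient failure")
    return "ok"

assert with_retries(flaky) == "ok"
assert len(attempts) == 3
```

The point of wrapping the whole operation, rather than one failure site, matches the comment: the retry policy stays generic instead of being baked into a single scenario.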
[GitHub] spark issue #18974: [SPARK-21750][SQL] Use Arrow 0.6.0
Github user BryanCutler commented on the issue: https://github.com/apache/spark/pull/18974

Thanks for this @kiszk. I was thinking we would need to do an upgrade for DecimalType support. I'm going to help out with that on the Arrow side, but it still might not be ready for another 1 or 2 releases. I'm not sure what the general Spark stance is on updating dependencies like Arrow, but I can say that I tested 0.6 myself and did not see anything that might cause issues. Maybe someone else can share the policies on upgrading?
[GitHub] spark issue #17849: [SPARK-10931][ML][PYSPARK] PySpark Models Copy Param Val...
Github user WeichenXu123 commented on the issue: https://github.com/apache/spark/pull/17849

What do you think about this? @jkbradley
[GitHub] spark issue #18968: [SPARK-21759][SQL] In.checkInputDataTypes should not wro...
Github user cloud-fan commented on the issue: https://github.com/apache/spark/pull/18968

After we correctly define the data type of `ListQuery`, can we remove the special handling of `ListQuery` in `In.checkInputDataTypes`? We can add `ListQuery.childOutputs: Seq[Attribute]`, so that even if we extend the project list of `ListQuery.plan`, we still keep the correct data type:

```
def dataType = if (childOutputs.length > 1) childOutputs.toStructType else childOutputs.head.dataType
```
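The rule proposed above — a multi-column subquery exposes a struct of its outputs, a single-column one exposes that column's type directly — can be sketched outside of Spark. This is a hypothetical Python illustration of the logic only (the real change would live in Scala's `ListQuery`; column representation and names here are assumptions):

```python
def list_query_data_type(child_outputs):
    """Pick the exposed data type for a subquery's output columns.

    `child_outputs` is a list of (name, type) pairs standing in for
    Seq[Attribute]. More than one column yields a struct of all columns;
    exactly one column yields that column's type directly.
    """
    if len(child_outputs) > 1:
        return {"struct": dict(child_outputs)}
    return child_outputs[0][1]


# Single-column subquery: the column's own type.
assert list_query_data_type([("a", "int")]) == "int"

# Multi-column subquery: a struct over all output columns.
assert list_query_data_type([("a", "int"), ("b", "string")]) == {
    "struct": {"a": "int", "b": "string"}
}
```

Because the rule depends only on `childOutputs`, extending the project list of the subquery plan changes the struct's fields but never breaks the single-column case, which is what the comment is after.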
[GitHub] spark issue #18975: [SPARK-4131] Support "Writing data into the filesystem f...
Github user janewangfb commented on the issue: https://github.com/apache/spark/pull/18975

@gatorsmile plan-parsing unit tests are already added in DDLCommandSuite.
[GitHub] spark pull request #18975: [SPARK-4131] Support "Writing data into the files...
Github user janewangfb commented on a diff in the pull request: https://github.com/apache/spark/pull/18975#discussion_r134383117

--- Diff: sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/analysis/UnsupportedOperationChecker.scala ---
@@ -142,10 +142,14 @@ object UnsupportedOperationChecker {
         "Distinct aggregations are not supported on streaming DataFrames/Datasets. Consider " +
           "using approx_count_distinct() instead.")

+
--- End diff --

reverted
[GitHub] spark issue #18945: Add option to convert nullable int columns to float colu...
Github user BryanCutler commented on the issue: https://github.com/apache/spark/pull/18945

Thanks for clarifying @HyukjinKwon, I see what you mean now. Since pandas will iterate over `self.collect()` anyway, I don't think your solution would impact performance at all, right? So your way might be better, but it is slightly more complicated.

Just to sum things up - @logannc does this still meet your requirements? Instead of having the `strict = True` option we do the following:

```
for each nullable int32 column:
    if there are null values:
        change column type to float32
    else:
        change column type to int32
```

I'm also guessing we will have the same problem with nullable ShortType - maybe others?
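The per-column rule summed up above can be expressed as a small helper. This is a sketch of the decision logic only, not Spark's actual `toPandas` implementation; the function name and the plain-list column representation are assumptions:

```python
def target_dtype(values, null_safe_dtype="float32", exact_dtype="int32"):
    """Pick a dtype for a nullable int column of collected values.

    Fall back to a float dtype only when nulls are actually present
    (NaN can represent them); a fully populated column keeps the exact
    integer dtype.
    """
    has_nulls = any(v is None for v in values)
    return null_safe_dtype if has_nulls else exact_dtype


# A column containing a null must widen; a fully populated one need not.
assert target_dtype([1, None, 3]) == "float32"
assert target_dtype([1, 2, 3]) == "int32"
```

The same check could be applied per type (e.g. a nullable ShortType column falling back to a float dtype) since the null scan is the only input to the decision.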
[GitHub] spark pull request #18975: [SPARK-4131] Support "Writing data into the files...
Github user janewangfb commented on a diff in the pull request: https://github.com/apache/spark/pull/18975#discussion_r134382990

--- Diff: sql/core/src/main/scala/org/apache/spark/sql/execution/command/InsertIntoDataSourceDirCommand.scala ---
@@ -0,0 +1,65 @@
+/*
+ * Licensed to the Apache Software Foundation (ASF) under one or more
+ * contributor license agreements.  See the NOTICE file distributed with
+ * this work for additional information regarding copyright ownership.
+ * The ASF licenses this file to You under the Apache License, Version 2.0
+ * (the "License"); you may not use this file except in compliance with
+ * the License.  You may obtain a copy of the License at
+ *
+ *    http://www.apache.org/licenses/LICENSE-2.0
+ *
+ * Unless required by applicable law or agreed to in writing, software
+ * distributed under the License is distributed on an "AS IS" BASIS,
+ * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+ * See the License for the specific language governing permissions and
+ * limitations under the License.
+ */
+
+package org.apache.spark.sql.execution.command
+
+import org.apache.spark.sql._
+import org.apache.spark.sql.catalyst.catalog._
+import org.apache.spark.sql.catalyst.plans.logical.LogicalPlan
+import org.apache.spark.sql.execution.datasources._
+
+/**
+ * A command used to write the result of a query to a directory.
+ *
+ * The syntax of using this command in SQL is:
+ * {{{
+ *   INSERT OVERWRITE DIRECTORY (path=STRING)?
+ *   USING format OPTIONS ([option1_name "option1_value", option2_name "option2_value", ...])
+ *   SELECT ...
+ * }}}
+ */
+case class InsertIntoDataSourceDirCommand(
+    storage: CatalogStorageFormat,
+    provider: Option[String],
+    query: LogicalPlan) extends RunnableCommand {
+
+  override def innerChildren: Seq[LogicalPlan] = Seq(query)
+
+  override def run(sparkSession: SparkSession): Seq[Row] = {
+    assert(innerChildren.length == 1)
+    assert(!storage.locationUri.isEmpty)
--- End diff --

updated.
[GitHub] spark pull request #18975: [SPARK-4131] Support "Writing data into the files...
Github user janewangfb commented on a diff in the pull request: https://github.com/apache/spark/pull/18975#discussion_r134383033

--- Diff: sql/core/src/main/scala/org/apache/spark/sql/execution/SparkSqlParser.scala ---
@@ -1509,4 +1509,84 @@ class SparkSqlAstBuilder(conf: SQLConf) extends AstBuilder(conf) {
       query: LogicalPlan): LogicalPlan = {
     RepartitionByExpression(expressions, query, conf.numShufflePartitions)
   }
+
+  /**
+   * Return the parameters for [[InsertIntoDir]] logical plan.
+   *
+   * Expected format:
+   * {{{
+   *   INSERT OVERWRITE DIRECTORY
+   *   [path]
+   *   [OPTIONS table_property_list]
+   *   select_statement;
+   * }}}
+   */
+  override def visitInsertOverwriteDir(
+      ctx: InsertOverwriteDirContext): InsertDirParams = withOrigin(ctx) {
+    val options = Option(ctx.options).map(visitPropertyKeyValues).getOrElse(Map.empty)
+    var storage = DataSource.buildStorageFormatFromOptions(options)
+
+    val path = Option(ctx.path) match {
+      case Some(s) => string(s)
+      case None => ""
+    }
+
+    if (!path.isEmpty && storage.locationUri.isDefined) {
+      throw new ParseException(
+        "Directory path and 'path' in OPTIONS are both used to indicate the directory path, " +
+          "you can only specify one of them.", ctx)
+    }
+    if (path.isEmpty && !storage.locationUri.isDefined) {
+      throw new ParseException(
+        "You need to specify directory path or 'path' in OPTIONS, but not both", ctx)
+    }
+
+    if (!path.isEmpty) {
+      val customLocation = Some(CatalogUtils.stringToURI(path))
+      storage = storage.copy(locationUri = customLocation)
+    }
+
+    val provider = ctx.tableProvider.qualifiedName.getText
+
+    (false, storage, Some(provider))
+  }
+
+  /**
+   * Return the parameters for [[InsertIntoDir]] logical plan.
+   *
+   * Expected format:
+   * {{{
+   *   INSERT OVERWRITE DIRECTORY
+   *   path
--- End diff --

added [LOCAL]
[GitHub] spark pull request #18975: [SPARK-4131] Support "Writing data into the files...
Github user janewangfb commented on a diff in the pull request: https://github.com/apache/spark/pull/18975#discussion_r134382855

--- Diff: sql/hive/src/test/scala/org/apache/spark/sql/hive/execution/SQLQuerySuite.scala ---
@@ -2040,4 +2040,80 @@ class SQLQuerySuite extends QueryTest with SQLTestUtils with TestHiveSingleton {
       assert(setOfPath.size() == pathSizeToDeleteOnExit)
     }
   }
+
+  test("insert overwrite to dir from hive metastore table") {
+    import org.apache.spark.util.Utils
+
+    val path = Utils.createTempDir()
+    path.delete()
+    checkAnswer(
+      sql(s"INSERT OVERWRITE LOCAL DIRECTORY '${path.toString}' SELECT * FROM src where key < 10"),
+      Seq.empty[Row])
+
+    checkAnswer(
+      sql(s"""INSERT OVERWRITE LOCAL DIRECTORY '${path.toString}'
+         |STORED AS orc
+         |SELECT * FROM src where key < 10""".stripMargin),
+      Seq.empty[Row])
+
+    // use orc data source to check the data of path is right.
+    sql(
+      s"""CREATE TEMPORARY TABLE orc_source
+         |USING org.apache.spark.sql.hive.orc
+         |OPTIONS (
+         |  PATH '${path.getCanonicalPath}'
+         |)
+       """.stripMargin)
+    checkAnswer(
+      sql("select * from orc_source"),
+      sql("select * from src where key < 10").collect()
+    )
+
+    Utils.deleteRecursively(path)
+    dropTempTable("orc_source")
+  }
+
+  test("insert overwrite to dir from temp table") {
+    import org.apache.spark.util.Utils
+
+    sparkContext
+      .parallelize(1 to 10)
+      .map(i => TestData(i, i.toString))
+      .toDF()
+      .registerTempTable("test_insert_table")
--- End diff --

updated
[GitHub] spark pull request #18975: [SPARK-4131] Support "Writing data into the files...
Github user janewangfb commented on a diff in the pull request: https://github.com/apache/spark/pull/18975#discussion_r134382822

--- Diff: sql/hive/src/test/scala/org/apache/spark/sql/hive/execution/SQLQuerySuite.scala ---
@@ -2040,4 +2040,80 @@ class SQLQuerySuite extends QueryTest with SQLTestUtils with TestHiveSingleton {
     assert(setOfPath.size() == pathSizeToDeleteOnExit)
   }
 }
+
+  test("insert overwrite to dir from hive metastore table") {
+    import org.apache.spark.util.Utils
+
+    val path = Utils.createTempDir()
+    path.delete()
+    checkAnswer(
+      sql(s"INSERT OVERWRITE LOCAL DIRECTORY '${path.toString}' SELECT * FROM src where key < 10"),
+      Seq.empty[Row])
+
+    checkAnswer(
+      sql(s"""INSERT OVERWRITE LOCAL DIRECTORY '${path.toString}'
+        |STORED AS orc
+        |SELECT * FROM src where key < 10""".stripMargin),
+      Seq.empty[Row])
+
+    // use orc data source to check the data of path is right.
+    sql(
+      s"""CREATE TEMPORARY TABLE orc_source
+        |USING org.apache.spark.sql.hive.orc
+        |OPTIONS (
+        |  PATH '${path.getCanonicalPath}'
+        |)
+      """.stripMargin)
+    checkAnswer(
+      sql("select * from orc_source"),
+      sql("select * from src where key < 10").collect()
+    )
+
+    Utils.deleteRecursively(path)
+    dropTempTable("orc_source")
+  }
+
+  test("insert overwrite to dir from temp table") {
+    import org.apache.spark.util.Utils
+
+    sparkContext
+      .parallelize(1 to 10)
+      .map(i => TestData(i, i.toString))
+      .toDF()
+      .registerTempTable("test_insert_table")
+
+    val path = Utils.createTempDir()

--- End diff --

updated.
[GitHub] spark pull request #18975: [SPARK-4131] Support "Writing data into the files...
Github user janewangfb commented on a diff in the pull request: https://github.com/apache/spark/pull/18975#discussion_r134382831

--- Diff: sql/hive/src/test/scala/org/apache/spark/sql/hive/execution/SQLQuerySuite.scala ---
@@ -2040,4 +2040,80 @@ class SQLQuerySuite extends QueryTest with SQLTestUtils with TestHiveSingleton {
     assert(setOfPath.size() == pathSizeToDeleteOnExit)
   }
 }
+
+  test("insert overwrite to dir from hive metastore table") {
+    import org.apache.spark.util.Utils
+
+    val path = Utils.createTempDir()
+    path.delete()
+    checkAnswer(
+      sql(s"INSERT OVERWRITE LOCAL DIRECTORY '${path.toString}' SELECT * FROM src where key < 10"),
+      Seq.empty[Row])
+
+    checkAnswer(
+      sql(s"""INSERT OVERWRITE LOCAL DIRECTORY '${path.toString}'
+        |STORED AS orc
+        |SELECT * FROM src where key < 10""".stripMargin),
+      Seq.empty[Row])
+
+    // use orc data source to check the data of path is right.
+    sql(
+      s"""CREATE TEMPORARY TABLE orc_source
+        |USING org.apache.spark.sql.hive.orc
+        |OPTIONS (
+        |  PATH '${path.getCanonicalPath}'
+        |)
+      """.stripMargin)
+    checkAnswer(
+      sql("select * from orc_source"),
+      sql("select * from src where key < 10").collect()
+    )
+
+    Utils.deleteRecursively(path)
+    dropTempTable("orc_source")
+  }
+
+  test("insert overwrite to dir from temp table") {
+    import org.apache.spark.util.Utils
+
+    sparkContext
+      .parallelize(1 to 10)
+      .map(i => TestData(i, i.toString))
+      .toDF()
+      .registerTempTable("test_insert_table")
+
+    val path = Utils.createTempDir()
+    path.delete()
+    checkAnswer(
+      sql(
+        s"""
+          |INSERT OVERWRITE LOCAL DIRECTORY '${path.toString}'
+          |ROW FORMAT DELIMITED FIELDS TERMINATED BY ','
+          |SELECT * FROM test_insert_table
+        """.stripMargin),
+      Seq.empty[Row])
+
+    checkAnswer(
+      sql(s"""INSERT OVERWRITE LOCAL DIRECTORY '${path.toString}'
+        |STORED AS orc
+        |SELECT * FROM test_insert_table""".stripMargin),
+      Seq.empty[Row])
+
+    // use orc data source to check the data of path is right.
+    sql(
+      s"""CREATE TEMPORARY TABLE orc_source
+        |USING org.apache.spark.sql.hive.orc
+        |OPTIONS (
+        |  PATH '${path.getCanonicalPath}'
+        |)
+      """.stripMargin)
+    checkAnswer(
+      sql("select * from orc_source"),
+      sql("select * from test_insert_table").collect()
+    )
+    Utils.deleteRecursively(path)
+    dropTempTable("test_insert_table")

--- End diff --

updated
[GitHub] spark pull request #18975: [SPARK-4131] Support "Writing data into the files...
Github user janewangfb commented on a diff in the pull request: https://github.com/apache/spark/pull/18975#discussion_r134382724

--- Diff: sql/hive/src/main/scala/org/apache/spark/sql/hive/execution/SaveAsHiveFile.scala ---
@@ -0,0 +1,77 @@
+/*
+ * Licensed to the Apache Software Foundation (ASF) under one or more
+ * contributor license agreements.  See the NOTICE file distributed with
+ * this work for additional information regarding copyright ownership.
+ * The ASF licenses this file to You under the Apache License, Version 2.0
+ * (the "License"); you may not use this file except in compliance with
+ * the License.  You may obtain a copy of the License at
+ *
+ *    http://www.apache.org/licenses/LICENSE-2.0
+ *
+ * Unless required by applicable law or agreed to in writing, software
+ * distributed under the License is distributed on an "AS IS" BASIS,
+ * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+ * See the License for the specific language governing permissions and
+ * limitations under the License.
+ */
+
+package org.apache.spark.sql.hive.execution
+
+import org.apache.hadoop.conf.Configuration
+
+import org.apache.spark.internal.io.FileCommitProtocol
+import org.apache.spark.sql.SparkSession
+import org.apache.spark.sql.catalyst.catalog.BucketSpec
+import org.apache.spark.sql.catalyst.expressions.Attribute
+import org.apache.spark.sql.execution.SparkPlan
+import org.apache.spark.sql.execution.command.DataWritingCommand
+import org.apache.spark.sql.execution.datasources.FileFormatWriter
+import org.apache.spark.sql.hive.HiveShim.{ShimFileSinkDesc => FileSinkDesc}
+
+// Base trait from which all hive insert statement physical execution extends.
+private[hive] trait SaveAsHiveFile extends DataWritingCommand {
+
+  protected def saveAsHiveFile(
+      sparkSession: SparkSession,
+      plan: SparkPlan,
+      hadoopConf: Configuration,
+      fileSinkConf: FileSinkDesc,
+      outputLocation: String,
+      partitionAttributes: Seq[Attribute] = Nil,
+      bucketSpec: Option[BucketSpec] = None,
+      options: Map[String, String] = Map.empty): Unit = {

--- End diff --

updated.
[GitHub] spark pull request #18975: [SPARK-4131] Support "Writing data into the files...
Github user janewangfb commented on a diff in the pull request: https://github.com/apache/spark/pull/18975#discussion_r134382590

--- Diff: sql/hive/src/main/scala/org/apache/spark/sql/hive/execution/InsertIntoHiveDirCommand.scala ---
@@ -0,0 +1,109 @@
+/*
+ * Licensed to the Apache Software Foundation (ASF) under one or more
+ * contributor license agreements.  See the NOTICE file distributed with
+ * this work for additional information regarding copyright ownership.
+ * The ASF licenses this file to You under the Apache License, Version 2.0
+ * (the "License"); you may not use this file except in compliance with
+ * the License.  You may obtain a copy of the License at
+ *
+ *    http://www.apache.org/licenses/LICENSE-2.0
+ *
+ * Unless required by applicable law or agreed to in writing, software
+ * distributed under the License is distributed on an "AS IS" BASIS,
+ * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+ * See the License for the specific language governing permissions and
+ * limitations under the License.
+ */
+
+package org.apache.spark.sql.hive.execution
+
+import java.util.Properties
+
+import scala.language.existentials
+
+import org.apache.hadoop.fs.{FileSystem, Path}
+import org.apache.hadoop.hive.common.FileUtils
+import org.apache.hadoop.hive.ql.plan.TableDesc
+import org.apache.hadoop.hive.serde.serdeConstants
+import org.apache.hadoop.hive.serde2.`lazy`.LazySimpleSerDe
+import org.apache.hadoop.mapred._
+
+import org.apache.spark.sql.{Row, SparkSession}
+import org.apache.spark.sql.catalyst.catalog.CatalogStorageFormat
+import org.apache.spark.sql.catalyst.plans.logical.LogicalPlan
+import org.apache.spark.sql.execution.SparkPlan
+import org.apache.spark.util.Utils
+
+
+case class InsertIntoHiveDirCommand(
+    isLocal: Boolean,
+    storage: CatalogStorageFormat,

--- End diff --

added
[GitHub] spark pull request #18975: [SPARK-4131] Support "Writing data into the files...
Github user janewangfb commented on a diff in the pull request: https://github.com/apache/spark/pull/18975#discussion_r134382618

--- Diff: sql/core/src/main/scala/org/apache/spark/sql/execution/command/InsertIntoDataSourceDirCommand.scala ---
@@ -0,0 +1,65 @@
+/*
+ * Licensed to the Apache Software Foundation (ASF) under one or more
+ * contributor license agreements.  See the NOTICE file distributed with
+ * this work for additional information regarding copyright ownership.
+ * The ASF licenses this file to You under the Apache License, Version 2.0
+ * (the "License"); you may not use this file except in compliance with
+ * the License.  You may obtain a copy of the License at
+ *
+ *    http://www.apache.org/licenses/LICENSE-2.0
+ *
+ * Unless required by applicable law or agreed to in writing, software
+ * distributed under the License is distributed on an "AS IS" BASIS,
+ * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+ * See the License for the specific language governing permissions and
+ * limitations under the License.
+ */
+
+package org.apache.spark.sql.execution.command
+
+import org.apache.spark.sql._
+import org.apache.spark.sql.catalyst.catalog._
+import org.apache.spark.sql.catalyst.plans.logical.LogicalPlan
+import org.apache.spark.sql.execution.datasources._
+
+/**
+ * A command used to write the result of a query to a directory.
+ *
+ * The syntax of using this command in SQL is:
+ * {{{
+ *   INSERT OVERWRITE DIRECTORY (path=STRING)?
+ *   USING format OPTIONS ([option1_name "option1_value", option2_name "option2_value", ...])
+ *   SELECT ...
+ * }}}
+ */
+case class InsertIntoDataSourceDirCommand(
+    storage: CatalogStorageFormat,
+    provider: Option[String],

--- End diff --

added
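The Scaladoc above documents the statement shape this command handles. As a quick illustration of how such a statement is assembled from its parts, here is a small, purely hypothetical Python helper (the function name and option-quoting style are assumptions for the sketch, not anything in Spark):

```python
def insert_overwrite_dir_sql(path, fmt, options, query):
    # Assemble the documented form:
    #   INSERT OVERWRITE DIRECTORY 'path' USING format OPTIONS (k "v", ...) SELECT ...
    opts = ", ".join(f'{k} "{v}"' for k, v in options.items())
    return (f"INSERT OVERWRITE DIRECTORY '{path}' "
            f"USING {fmt} OPTIONS ({opts}) {query}")

stmt = insert_overwrite_dir_sql(
    "/tmp/out", "parquet", {"compression": "snappy"}, "SELECT * FROM src")
assert stmt == ("INSERT OVERWRITE DIRECTORY '/tmp/out' "
                'USING parquet OPTIONS (compression "snappy") SELECT * FROM src')
```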
[GitHub] spark issue #19015: [SPARK-21803] [TEST] Remove the HiveDDLCommandSuite
Github user SparkQA commented on the issue: https://github.com/apache/spark/pull/19015 **[Test build #80956 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/80956/testReport)** for PR 19015 at commit [`191bde1`](https://github.com/apache/spark/commit/191bde194bbb56c40f5d33e8fbaf5c3505d792cc).
[GitHub] spark issue #19015: [SPARK-21803] [TEST] Remove the HiveDDLCommandSuite
Github user gatorsmile commented on the issue: https://github.com/apache/spark/pull/19015 cc @cloud-fan
[GitHub] spark pull request #19015: [SPARK-21803] [TEST] Remove the HiveDDLCommandSui...
GitHub user gatorsmile opened a pull request: https://github.com/apache/spark/pull/19015

[SPARK-21803] [TEST] Remove the HiveDDLCommandSuite

## What changes were proposed in this pull request?

We do not have any Hive-specific parser. It does not make sense to keep a parser-specific test suite `HiveDDLCommandSuite.scala` in the Hive package. This PR is to remove it.

## How was this patch tested?

N/A

You can merge this pull request into a Git repository by running:

    $ git pull https://github.com/gatorsmile/spark combineDDL

Alternatively you can review and apply these changes as the patch at:

    https://github.com/apache/spark/pull/19015.patch

To close this pull request, make a commit to your master/trunk branch with (at least) the following in the commit message:

    This closes #19015

commit b4de53dd8e99a1b43f723048360f947c2648bc0c
Author: gatorsmile
Date: 2017-08-22T04:14:41Z

    remove HiveDDLCommandSuite.scala

commit 191bde194bbb56c40f5d33e8fbaf5c3505d792cc
Author: gatorsmile
Date: 2017-08-22T04:22:09Z

    style fix.
[GitHub] spark pull request #18975: [SPARK-4131] Support "Writing data into the files...
Github user janewangfb commented on a diff in the pull request: https://github.com/apache/spark/pull/18975#discussion_r134381632

--- Diff: sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/plans/logical/basicLogicalOperators.scala ---
@@ -359,6 +359,17 @@ case class InsertIntoTable(
   override lazy val resolved: Boolean = false
 }
+case class InsertIntoDir(

--- End diff --

ok. added.
[GitHub] spark pull request #18975: [SPARK-4131] Support "Writing data into the files...
Github user janewangfb commented on a diff in the pull request: https://github.com/apache/spark/pull/18975#discussion_r134381489

--- Diff: sql/catalyst/src/main/antlr4/org/apache/spark/sql/catalyst/parser/SqlBase.g4 ---
@@ -740,6 +750,7 @@ nonReserved
     | AND | CASE | CAST | DISTINCT | DIV | ELSE | END | FUNCTION | INTERVAL | MACRO | OR | STRATIFY | THEN
     | UNBOUNDED | WHEN
     | DATABASE | SELECT | FROM | WHERE | HAVING | TO | TABLE | WITH | NOT | CURRENT_DATE | CURRENT_TIMESTAMP
+    | DIRECTORY

--- End diff --

it is already in TableIdentifierParserSuite
[GitHub] spark issue #17849: [SPARK-10931][ML][PYSPARK] PySpark Models Copy Param Val...
Github user BryanCutler commented on the issue: https://github.com/apache/spark/pull/17849 @holdenk , do you think this is good to go now?
[GitHub] spark issue #18989: [SPARK-21781][SQL] Modify DataSourceScanExec to use conc...
Github user SparkQA commented on the issue: https://github.com/apache/spark/pull/18989 **[Test build #80955 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/80955/testReport)** for PR 18989 at commit [`9effea9`](https://github.com/apache/spark/commit/9effea9379313b0aac1f392ca11ce0f678bb1e0c).
[GitHub] spark issue #18982: [SPARK-21685][PYTHON][ML] PySpark Params isSet state sho...
Github user BryanCutler commented on the issue: https://github.com/apache/spark/pull/18982 Thanks for reviewing @holdenk ! You brought up some good points, let me know if you prefer me to change them.
[GitHub] spark pull request #18982: [SPARK-21685][PYTHON][ML] PySpark Params isSet st...
Github user BryanCutler commented on a diff in the pull request: https://github.com/apache/spark/pull/18982#discussion_r134380704

--- Diff: python/pyspark/ml/wrapper.py ---
@@ -118,11 +118,13 @@ def _transfer_params_to_java(self):
         """
         Transforms the embedded params to the companion Java object.
         """
-        paramMap = self.extractParamMap()
         for param in self.params:
-            if param in paramMap:
-                pair = self._make_java_param_pair(param, paramMap[param])
+            if param in self._paramMap:
+                pair = self._make_java_param_pair(param, self._paramMap[param])
                 self._java_obj.set(pair)
+            if param in self._defaultParamMap:

--- End diff --

We usually make the assumption that Python defines the same default values as Java, in Spark ML at least, but given the circumstances of the JIRA - they defined their own Model - it's still possible for `hasDefault` or the default value to return something different than Python would. So I'm just being overly cautious here, but it's pretty cheap to just transfer the default values anyway, right?
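The diff above hinges on PySpark keeping explicitly set values (`_paramMap`) separate from declared defaults (`_defaultParamMap`), with `isSet` consulting only the former. A minimal stand-in class, with no PySpark dependency and simplified method signatures assumed for the sketch, shows why transferring the merged `extractParamMap()` loses the "explicitly set" distinction:

```python
class Params:
    # Minimal stand-in for pyspark.ml.param.Params: explicit values and
    # defaults live in separate maps; isSet consults only the explicit map.
    def __init__(self):
        self._paramMap = {}          # values the user explicitly set
        self._defaultParamMap = {}   # declared defaults

    def setDefault(self, name, value):
        self._defaultParamMap[name] = value

    def set(self, name, value):
        self._paramMap[name] = value

    def isSet(self, name):
        return name in self._paramMap

    def extractParamMap(self):
        # Merged view: defaults overridden by explicit values.
        return {**self._defaultParamMap, **self._paramMap}

b = Params()
b.setDefault("threshold", 0.0)
assert not b.isSet("threshold")
# The merged map contains the default, so transferring *from it* would make
# the JVM side believe "threshold" was explicitly set:
assert "threshold" in b.extractParamMap()
# Transferring the two maps separately (as the diff does) preserves the split:
assert "threshold" not in b._paramMap
```

This is the design choice the diff encodes: iterate the params once, send explicit values as set params, and send defaults through a separate channel so `isSet` stays accurate on both sides.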
[GitHub] spark pull request #18982: [SPARK-21685][PYTHON][ML] PySpark Params isSet st...
Github user BryanCutler commented on a diff in the pull request: https://github.com/apache/spark/pull/18982#discussion_r134380293

--- Diff: python/pyspark/ml/tests.py ---
@@ -455,6 +455,14 @@ def test_logistic_regression_check_thresholds(self):
             LogisticRegression,
             threshold=0.42, thresholds=[0.5, 0.5]
         )
+
+    def test_preserve_set_state(self):
+        model = Binarizer()
+        self.assertFalse(model.isSet("threshold"))
+        model._transfer_params_to_java()

--- End diff --

yeah, it would be a little better to call the actual `transform`, but we would still need to call `_transfer_params_from_java` or check `isSet` with a direct call to Java via py4j. I was going to do this, but the `ParamTests` class doesn't already create a `SparkSession` - I'm sure it's just a small amount of overhead, but that's why I thought to just use `_transfer_params_to_java`. Do you think it would be worth it to change `ParamTests` to inherit from `SparkSessionTestCase` so a session is created and I could make a `DataFrame` to transform?
[GitHub] spark issue #18931: [SPARK-21717][SQL] Decouple consume functions of physica...
Github user viirya commented on the issue: https://github.com/apache/spark/pull/18931 @maropu Thanks for running the benchmark and getting the numbers! So it looks like SPARK-21603 actually affects the performance improvement of this PR? The numbers show this PR can significantly improve long-codegen queries like Q17 and Q66, while SPARK-21603 with the default setting degrades many queries, including Q66. cc @cloud-fan @gatorsmile @kiszk
[GitHub] spark issue #15435: [SPARK-17139][ML] Add model summary for MultinomialLogis...
Github user SparkQA commented on the issue: https://github.com/apache/spark/pull/15435 **[Test build #80954 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/80954/testReport)** for PR 15435 at commit [`67c57e5`](https://github.com/apache/spark/commit/67c57e547b654ec2816fe4f33e067072a05c4d5e).
[GitHub] spark issue #15435: [SPARK-17139][ML] Add model summary for MultinomialLogis...
Github user WeichenXu123 commented on the issue: https://github.com/apache/spark/pull/15435 Jenkins, test this please.
[GitHub] spark issue #19013: [SPARK-21728][core] Allow SparkSubmit to use Logging.
Github user AmplabJenkins commented on the issue: https://github.com/apache/spark/pull/19013 Test FAILed. Refer to this link for build results (access rights to CI server needed): https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/80949/ Test FAILed.
[GitHub] spark issue #19013: [SPARK-21728][core] Allow SparkSubmit to use Logging.
Github user AmplabJenkins commented on the issue: https://github.com/apache/spark/pull/19013 Merged build finished. Test FAILed.
[GitHub] spark issue #19013: [SPARK-21728][core] Allow SparkSubmit to use Logging.
Github user SparkQA commented on the issue: https://github.com/apache/spark/pull/19013 **[Test build #80949 has finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/80949/testReport)** for PR 19013 at commit [`3d6016e`](https://github.com/apache/spark/commit/3d6016e14eb3fab5cea1bd24452842e59f721cad).
 * This patch **fails SparkR unit tests**.
 * This patch merges cleanly.
 * This patch adds no public classes.
[GitHub] spark issue #18986: [SPARK-21774][SQL] The rule PromoteStrings should cast a...
Github user stanzhai commented on the issue: https://github.com/apache/spark/pull/18986 @gatorsmile @DonnyZone When comparing a string to an int in Hive, it casts the string type to double.
```
hive> select * from tb;
0                       0
0.1                     0
true                    0
19157170390056971       0
hive> select * from tb where a = 0;
0       0
hive> select * from tb where a = 19157170390056973L;
WARNING: Comparing a bigint and a string may result in a loss of precision.
19157170390056973       0
hive> select 1 = 'true';
NULL
hive> select 19157170390056973L = '19157170390056971';
WARNING: Comparing a bigint and a string may result in a loss of precision.
true
```
So I think that casting a string to double type when comparing with a numeric is more reasonable. Actually, my usage scenarios are about Spark compatibility: I found the problem when I upgraded Spark to 2.2.0, and lots of SQL results were wrong.
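The Hive behaviour described above — both sides cast to double, with non-numeric strings yielding NULL — can be sketched in a few lines of plain Python, since Python's `float` is the same 64-bit IEEE double Hive casts to (the helper name is an assumption for the sketch):

```python
def hive_eq(s, n):
    # Sketch of the semantics above: comparing a string with a numeric
    # casts both sides to double; a non-numeric string yields NULL (None).
    try:
        return float(s) == float(n)
    except ValueError:
        return None  # NULL, as in `select 1 = 'true'`

assert hive_eq("0", 0) is True
assert hive_eq("true", 1) is None
# Precision loss: both integers round to the same 64-bit double
# (19157170390056972.0), so the comparison is true even though the
# values differ -- hence Hive's warning about bigint-vs-string compares.
assert hive_eq("19157170390056971", 19157170390056973) is True
```

This also makes the warning concrete: above 2^53, adjacent doubles are more than 1 apart, so distinct bigints can compare equal after the cast.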
[GitHub] spark issue #18931: [SPARK-21717][SQL] Decouple consume functions of physica...
Github user maropu commented on the issue: https://github.com/apache/spark/pull/18931 sorry for the late response (I re-ran it many times in various settings...), see: https://docs.google.com/spreadsheets/d/1LsnRIWDoqNtGhrWJ4jKVfYizL9X8KIJc04WisKHJig8/edit#gid=4103073
[GitHub] spark issue #19014: [MINOR][CORE] Add missing kvstore module in Laucher and ...
Github user jerryshao commented on the issue: https://github.com/apache/spark/pull/19014 CC @vanzin , please help to review, thanks!
[GitHub] spark issue #19014: [MINOR][CORE] Add missing kvstore module in Laucher and ...
Github user SparkQA commented on the issue: https://github.com/apache/spark/pull/19014 **[Test build #80953 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/80953/testReport)** for PR 19014 at commit [`bf03c76`](https://github.com/apache/spark/commit/bf03c76f670eba3efc721fe1482ed8f05531b5af).
[GitHub] spark pull request #19014: [MINOR][CORE] Add missing kvstore module in Lauch...
GitHub user jerryshao opened a pull request: https://github.com/apache/spark/pull/19014

[MINOR][CORE] Add missing kvstore module in Laucher and SparkSubmit code

There are two places, in Launcher and SparkSubmit, that explicitly list all the Spark submodules; the newly added kvstore module is missing from both, so this minor PR adds it.

You can merge this pull request into a Git repository by running:

    $ git pull https://github.com/jerryshao/apache-spark missing-kvstore

Alternatively you can review and apply these changes as the patch at:

    https://github.com/apache/spark/pull/19014.patch

To close this pull request, make a commit to your master/trunk branch with (at least) the following in the commit message:

    This closes #19014

commit bf03c76f670eba3efc721fe1482ed8f05531b5af
Author: jerryshao
Date: 2017-08-22T03:06:45Z

    Add missing kvstore module in Laucher and SparkSubmit code

    Change-Id: I35109bca61f9c0a246b7a9842a98947bc580c6dd
[GitHub] spark issue #18986: [SPARK-21774][SQL] The rule PromoteStrings should cast a...
Github user stanzhai commented on the issue: https://github.com/apache/spark/pull/18986 @DonnyZone @gatorsmile @cloud-fan PostgreSQL will throw an error when comparing a string to an int.
```
postgres=# select * from tb;
  a   | b
------+---
 0.1  | 1
 a    | 1
 true | 1
(3 rows)

postgres=# select * from tb where a>0;
ERROR:  operator does not exist: character varying > integer
LINE 1: select * from tb where a>0;
                               ^
HINT:  No operator matches the given name and argument type(s). You might need to add explicit type casts.
```
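As a side note on how widely engines diverge here, this is a small illustrative sketch using SQLite's `sqlite3` module (not Spark or PostgreSQL): instead of raising an error, SQLite applies TEXT affinity to the literal `0` and compares lexicographically, so every TEXT value matches.

```python
import sqlite3

# Hypothetical illustration (SQLite, not Spark): the same mixed-type
# comparison that PostgreSQL rejects above. SQLite converts the integer
# literal 0 to the text '0' via type affinity and compares as text,
# so all three rows are returned rather than an error being raised.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE tb (a TEXT, b INTEGER)")
conn.executemany("INSERT INTO tb VALUES (?, ?)",
                 [("0.1", 1), ("a", 1), ("true", 1)])

rows = conn.execute("SELECT a FROM tb WHERE a > 0").fetchall()
print(len(rows))  # 3
```

Which of these behaviors Spark's PromoteStrings rule should follow is exactly what is being debated in this thread.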
[GitHub] spark issue #15435: [SPARK-17139][ML] Add model summary for MultinomialLogis...
Github user AmplabJenkins commented on the issue: https://github.com/apache/spark/pull/15435 Test FAILed. Refer to this link for build results (access rights to CI server needed): https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/80947/ Test FAILed.
[GitHub] spark issue #15435: [SPARK-17139][ML] Add model summary for MultinomialLogis...
Github user AmplabJenkins commented on the issue: https://github.com/apache/spark/pull/15435 Merged build finished. Test FAILed.
[GitHub] spark issue #15435: [SPARK-17139][ML] Add model summary for MultinomialLogis...
Github user SparkQA commented on the issue: https://github.com/apache/spark/pull/15435 **[Test build #80947 has finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/80947/testReport)** for PR 15435 at commit [`67c57e5`](https://github.com/apache/spark/commit/67c57e547b654ec2816fe4f33e067072a05c4d5e). * This patch **fails SparkR unit tests**. * This patch merges cleanly. * This patch adds no public classes.
[GitHub] spark pull request #18892: [SPARK-21520][SQL]Improvement a special case for ...
Github user heary-cao closed the pull request at: https://github.com/apache/spark/pull/18892
[GitHub] spark issue #18973: [SPARK-21765] Set isStreaming on leaf nodes for streamin...
Github user SparkQA commented on the issue: https://github.com/apache/spark/pull/18973 **[Test build #80952 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/80952/testReport)** for PR 18973 at commit [`8857cf5`](https://github.com/apache/spark/commit/8857cf51f142865063c53e4a7089dd027db4d3c3).
[GitHub] spark pull request #18734: [SPARK-21070][PYSPARK] Attempt to update cloudpic...
Github user asfgit closed the pull request at: https://github.com/apache/spark/pull/18734
[GitHub] spark issue #18734: [SPARK-21070][PYSPARK] Attempt to update cloudpickle aga...
Github user HyukjinKwon commented on the issue: https://github.com/apache/spark/pull/18734 Merged to master.
[GitHub] spark issue #18962: [SPARK-21714][CORE][YARN] Avoiding re-uploading remote r...
Github user SparkQA commented on the issue: https://github.com/apache/spark/pull/18962 **[Test build #80951 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/80951/testReport)** for PR 18962 at commit [`16ce99f`](https://github.com/apache/spark/commit/16ce99fc1cea9260a96dae98f031bda9f8ed18f4).
[GitHub] spark issue #18734: [SPARK-21070][PYSPARK] Attempt to update cloudpickle aga...
Github user HyukjinKwon commented on the issue: https://github.com/apache/spark/pull/18734 LGTM. I am going to credit this to @rgbkrk per http://spark.apache.org/contributing.html > In case several people contributed, prefer to assign to the more "junior", non-committer contributor I just double-checked that the tests pass with Python 3.6.0, and that I could run the pi example with PyPy manually (SPARK-21753), with the current status.
[GitHub] spark issue #18734: [SPARK-21070][PYSPARK] Attempt to update cloudpickle aga...
Github user AmplabJenkins commented on the issue: https://github.com/apache/spark/pull/18734 Merged build finished. Test PASSed.
[GitHub] spark issue #18734: [SPARK-21070][PYSPARK] Attempt to update cloudpickle aga...
Github user AmplabJenkins commented on the issue: https://github.com/apache/spark/pull/18734 Test PASSed. Refer to this link for build results (access rights to CI server needed): https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/80950/ Test PASSed.
[GitHub] spark issue #18734: [SPARK-21070][PYSPARK] Attempt to update cloudpickle aga...
Github user SparkQA commented on the issue: https://github.com/apache/spark/pull/18734 **[Test build #80950 has finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/80950/testReport)** for PR 18734 at commit [`f986c25`](https://github.com/apache/spark/commit/f986c2591a9a0b6962862c5cdfc33a7d65be7eda). * This patch passes all tests. * This patch merges cleanly. * This patch adds no public classes.
[GitHub] spark issue #18957: [SPARK-21744][CORE] Add retry logic for new broadcast in...
Github user caneGuy commented on the issue: https://github.com/apache/spark/pull/18957 In my opinion, it is better to retry, since YARN has a health checker which can find the bad disk later. The current job should not fail because of this bad disk. @cloud-fan
[GitHub] spark issue #18931: [SPARK-21717][SQL] Decouple consume functions of physica...
Github user maropu commented on the issue: https://github.com/apache/spark/pull/18931 ok
[GitHub] spark issue #18931: [SPARK-21717][SQL] Decouple consume functions of physica...
Github user viirya commented on the issue: https://github.com/apache/spark/pull/18931 @maropu I saw you may open a follow-up to fix the default value; maybe you can do that in the follow-up too.
[GitHub] spark issue #18974: [SPARK-21750][SQL] Use Arrow 0.6.0
Github user kiszk commented on the issue: https://github.com/apache/spark/pull/18974 ping @srowen @ueshin @BryanCutler
[GitHub] spark issue #18734: [SPARK-21070][PYSPARK] Attempt to update cloudpickle aga...
Github user SparkQA commented on the issue: https://github.com/apache/spark/pull/18734 **[Test build #80950 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/80950/testReport)** for PR 18734 at commit [`f986c25`](https://github.com/apache/spark/commit/f986c2591a9a0b6962862c5cdfc33a7d65be7eda).
[GitHub] spark pull request #18966: [SPARK-21751][SQL] CodeGeneraor.splitExpressions ...
Github user kiszk commented on a diff in the pull request: https://github.com/apache/spark/pull/18966#discussion_r134368200
--- Diff: sql/catalyst/src/main/scala/org/apache/spark/sql/internal/SQLConf.scala ---
@@ -582,6 +582,15 @@ object SQLConf {
     .intConf
     .createWithDefault(2667)

+  val CODEGEN_MAX_CHARS_PER_FUNCTION = buildConf("spark.sql.codegen.maxCharactersPerFunction")
--- End diff --
In this PR, do we change this parameter to use `number of lines` instead of `number of characters`, too?
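For readers following along, here is a minimal, hypothetical Python sketch of the threshold being discussed (this is not Spark's actual `CodeGenerator.splitExpressions`): generated snippets are accumulated into one function until a size budget, measured here in characters, would be exceeded. Switching the measure to lines would only change what is counted.

```python
# Hypothetical sketch of the splitting threshold under discussion; not
# Spark's implementation, just an illustration of the idea.
def split_expressions(snippets, max_chars=1024):
    """Group code snippets into functions, starting a new function once the
    accumulated character count would exceed max_chars."""
    functions, current, size = [], [], 0
    for code in snippets:
        if current and size + len(code) > max_chars:
            functions.append("\n".join(current))
            current, size = [], 0
        current.append(code)
        size += len(code)
    if current:
        functions.append("\n".join(current))
    return functions

# Ten 7-character snippets with a 30-character budget split into 3 functions.
print(len(split_expressions(["x += 1;"] * 10, max_chars=30)))  # 3
```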
[GitHub] spark issue #18968: [SPARK-21759][SQL] In.checkInputDataTypes should not wro...
Github user viirya commented on the issue: https://github.com/apache/spark/pull/18968 @gatorsmile @cloud-fan Any more comments on this change? Thanks.
[GitHub] spark issue #18734: [SPARK-21070][PYSPARK] Attempt to update cloudpickle aga...
Github user HyukjinKwon commented on the issue: https://github.com/apache/spark/pull/18734 I am merging this because: cloudpickle looks initially ported from https://github.com/cloudpipe/cloudpickle/commit/7aebb7ed42258a9392c2ada9b4bb390d566630cc and https://github.com/cloudpipe/cloudpickle/commit/c4f885116126b2ac49deae7a31d4941d006f319f (-> https://github.com/apache/spark/commit/04e44b37cc04f62fbf9e08c7076349e0a4d12ea8), where I see both are identical. After https://github.com/apache/spark/commit/04e44b37cc04f62fbf9e08c7076349e0a4d12ea8, we have these diffs - https://github.com/apache/spark/commit/e044705b4402f86d0557ecd146f3565388c7eeb4, https://github.com/apache/spark/commit/55204181004c105c7a3e8c31a099b37e48bfd953, https://github.com/apache/spark/commit/ee913e6e2d58dfac20f3f06ff306081bd0e48066, https://github.com/apache/spark/commit/d48935400ca47275f677b527c636976af09332c8, https://github.com/apache/spark/commit/dbfc7aa4d0d5457bc92e1e66d065c6088d476843, https://github.com/apache/spark/commit/20e6280626fe243b170a2e7c5e018c67f3dac1db and https://github.com/apache/spark/commit/6297697f975960a3006c4e58b4964d9ac40eeaf5 **[SPARK-9116] [SQL] [PYSPARK] support Python only UDT in __main__**, https://github.com/apache/spark/commit/e044705b4402f86d0557ecd146f3565388c7eeb4: I think this part is the only one we are worried about. It looks like it supports `classmethod`, `staticmethod` and `property`. 
We have a test: https://github.com/apache/spark/blob/96608310501a43fa4ab9f2697f202d655dba98c5/python/pyspark/sql/tests.py#L141-L173 https://github.com/apache/spark/blob/96608310501a43fa4ab9f2697f202d655dba98c5/python/pyspark/sql/tests.py#L898-L927 **[SPARK-10542] [PYSPARK] fix serialize namedtuple**, https://github.com/apache/spark/commit/55204181004c105c7a3e8c31a099b37e48bfd953: We keep the changes: https://github.com/holdenk/spark/blob/f986c2591a9a0b6962862c5cdfc33a7d65be7eda/python/pyspark/cloudpickle.py#L1090-L1095 https://github.com/holdenk/spark/blob/f986c2591a9a0b6962862c5cdfc33a7d65be7eda/python/pyspark/cloudpickle.py#L433-L436 and the related test passes: https://github.com/apache/spark/blob/77cc0d67d5a7ea526f8efd37b2590923953cb8e0/python/pyspark/tests.py#L211-L219 **[SPARK-13697] [PYSPARK] Fix the missing module name of TransformFunctionSerializer.loads**, https://github.com/apache/spark/commit/ee913e6e2d58dfac20f3f06ff306081bd0e48066: We keep this change: https://github.com/holdenk/spark/blob/f986c2591a9a0b6962862c5cdfc33a7d65be7eda/python/pyspark/cloudpickle.py#L528 https://github.com/holdenk/spark/blob/f986c2591a9a0b6962862c5cdfc33a7d65be7eda/python/pyspark/cloudpickle.py#L1022-L1029 and the related test passes: https://github.com/apache/spark/blob/77cc0d67d5a7ea526f8efd37b2590923953cb8e0/python/pyspark/tests.py#L233-L237 We should probably port this one to `cloudpipe/cloudpickle`. 
**[SPARK-16077] [PYSPARK] catch the exception from pickle.whichmodule()**, https://github.com/apache/spark/commit/d48935400ca47275f677b527c636976af09332c8: We keep this change: https://github.com/holdenk/spark/blob/f986c2591a9a0b6962862c5cdfc33a7d65be7eda/python/pyspark/cloudpickle.py#L325-L330 https://github.com/holdenk/spark/blob/f986c2591a9a0b6962862c5cdfc33a7d65be7eda/python/pyspark/cloudpickle.py#L620-L625 This patch should be even safer, as @rgbkrk and I verified this with some tests: https://github.com/cloudpipe/cloudpickle/pull/112 **[SPARK-17472] [PYSPARK] Better error message for serialization failures of large objects in Python**, https://github.com/apache/spark/commit/dbfc7aa4d0d5457bc92e1e66d065c6088d476843: We keep this change: https://github.com/holdenk/spark/blob/f986c2591a9a0b6962862c5cdfc33a7d65be7eda/python/pyspark/cloudpickle.py#L240-L249 Probably, we should port this change into `cloudpipe/cloudpickle`. **[SPARK-19019] [PYTHON] Fix hijacked `collections.namedtuple` and port cloudpickle changes for PySpark to work with Python 3.6.0**, https://github.com/apache/spark/commit/20e6280626fe243b170a2e7c5e018c67f3dac1db This change was ported from `cloudpipe/cloudpickle`. I tested that our PySpark tests pass with Python 3.6.0 locally - https://github.com/apache/spark/pull/18734#issuecomment-319558550 **[SPARK-19505][PYTHON] AttributeError on Exception.message in Python3**, https://github.com/apache/spark/commit/6297697f975960a3006c4e58b4964d9ac40eeaf5: We keep this change: https://github.com/holdenk/spark/blob/f986c2591a9a0b6962862c5cdfc33a7d65be7eda/python/pyspark/cloudpickle.py#L240-L249
[GitHub] spark issue #18734: [SPARK-21070][PYSPARK] Attempt to update cloudpickle aga...
Github user HyukjinKwon commented on the issue: https://github.com/apache/spark/pull/18734 retest this please
[GitHub] spark pull request #18973: [SPARK-21765] Set isStreaming on leaf nodes for s...
Github user joseph-torres commented on a diff in the pull request: https://github.com/apache/spark/pull/18973#discussion_r134367553
--- Diff: sql/core/src/main/scala/org/apache/spark/sql/execution/streaming/memory.scala ---
@@ -118,8 +122,15 @@ case class MemoryStream[A : Encoder](id: Int, sqlContext: SQLContext)
       batches.slice(sliceStart, sliceEnd)
     }
-    logDebug(
-      s"MemoryBatch [$startOrdinal, $endOrdinal]: ${newBlocks.flatMap(_.collect()).mkString(", ")}")
+    logDebug({
--- End diff --
Done.
[GitHub] spark pull request #18973: [SPARK-21765] Set isStreaming on leaf nodes for s...
Github user joseph-torres commented on a diff in the pull request: https://github.com/apache/spark/pull/18973#discussion_r134367543
--- Diff: sql/core/src/main/scala/org/apache/spark/sql/SQLContext.scala ---
@@ -420,8 +420,10 @@ class SQLContext private[sql](val sparkSession: SparkSession)
    * converted to Catalyst rows.
    */
   private[sql]
-  def internalCreateDataFrame(catalystRows: RDD[InternalRow], schema: StructType) = {
-    sparkSession.internalCreateDataFrame(catalystRows, schema)
+  def internalCreateDataFrame(catalystRows: RDD[InternalRow],
--- End diff --
Done.
[GitHub] spark pull request #18973: [SPARK-21765] Set isStreaming on leaf nodes for s...
Github user joseph-torres commented on a diff in the pull request: https://github.com/apache/spark/pull/18973#discussion_r134367548
--- Diff: sql/core/src/test/scala/org/apache/spark/sql/streaming/StreamSuite.scala ---
@@ -728,7 +729,16 @@ class FakeDefaultSource extends FakeSource {
   override def getBatch(start: Option[Offset], end: Offset): DataFrame = {
     val startOffset = start.map(_.asInstanceOf[LongOffset].offset).getOrElse(-1L) + 1
-    spark.range(startOffset, end.asInstanceOf[LongOffset].offset + 1).toDF("a")
+    val ds = new Dataset[java.lang.Long](
--- End diff --
I've tried addressing this a few different ways, and I can't come up with anything cleaner than the current solution. Directly creating a DF doesn't set the isStreaming bit, and a bunch of copying and casting is required to get it set; using LocalRelation requires explicitly handling the encoding of the rows, since LocalRelation requires InternalRow input.
[GitHub] spark issue #18734: [SPARK-21070][PYSPARK] Attempt to update cloudpickle aga...
Github user HyukjinKwon commented on the issue: https://github.com/apache/spark/pull/18734 Yea, it looks so. The named tuple one reminds me of the workaround we have to make named tuples picklable - https://github.com/apache/spark/blob/d03aebbe6508ba441dc87f9546f27aeb27553d77/python/pyspark/serializers.py#L395-L446 Maybe we could take a look and see if we could get rid of it or port it. Anyway, let me take a final look.
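For context, here is a small, hypothetical sketch of the trick that kind of workaround relies on (this is not the `pyspark.serializers` code itself, and `_restore`/`_reduce_namedtuple` are illustrative names): a namedtuple class defined dynamically may not be importable on the deserializing side, so the serializer ships the class name and field list and rebuilds the class there.

```python
import pickle
from collections import namedtuple

def _restore(name, fields, values):
    # Recreate the namedtuple class on the deserializing side, then the instance.
    return namedtuple(name, fields)(*values)

def _reduce_namedtuple(obj):
    # Instead of pickling the class object itself (which may live only in
    # __main__ on the sender), ship its name, its fields, and the values.
    return (_restore, (type(obj).__name__, type(obj)._fields, tuple(obj)))

Point = namedtuple("Point", ["x", "y"])
p = Point(1, 2)

# Round-trip via the rebuild recipe: serialize it, then reconstruct.
fn, args = pickle.loads(pickle.dumps(_reduce_namedtuple(p)))
q = fn(*args)
print(q)  # Point(x=1, y=2)
```

Because namedtuples compare as plain tuples, the reconstructed instance is equal to the original even though its class object is freshly created.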
[GitHub] spark issue #18973: [SPARK-21765] Set isStreaming on leaf nodes for streamin...
Github user AmplabJenkins commented on the issue: https://github.com/apache/spark/pull/18973 Test PASSed. Refer to this link for build results (access rights to CI server needed): https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/80943/ Test PASSed.
[GitHub] spark issue #18810: [SPARK-21603][SQL]The wholestage codegen will be much sl...
Github user maropu commented on the issue: https://github.com/apache/spark/pull/18810 ok, I'll make a pr as follow-up.
[GitHub] spark issue #18973: [SPARK-21765] Set isStreaming on leaf nodes for streamin...
Github user AmplabJenkins commented on the issue: https://github.com/apache/spark/pull/18973 Merged build finished. Test PASSed.
[GitHub] spark issue #18973: [SPARK-21765] Set isStreaming on leaf nodes for streamin...
Github user SparkQA commented on the issue: https://github.com/apache/spark/pull/18973 **[Test build #80943 has finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/80943/testReport)** for PR 18973 at commit [`c837069`](https://github.com/apache/spark/commit/c83706921157bdf2af4f2b697244054bc1e8ffad). * This patch passes all tests. * This patch merges cleanly. * This patch adds no public classes.
[GitHub] spark issue #17373: [SPARK-12664][ML] Expose probability in mlp model
Github user AmplabJenkins commented on the issue: https://github.com/apache/spark/pull/17373 Merged build finished. Test PASSed.
[GitHub] spark issue #17373: [SPARK-12664][ML] Expose probability in mlp model
Github user AmplabJenkins commented on the issue: https://github.com/apache/spark/pull/17373 Test PASSed. Refer to this link for build results (access rights to CI server needed): https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/80946/ Test PASSed.
[GitHub] spark issue #18973: [SPARK-21765] Set isStreaming on leaf nodes for streamin...
Github user AmplabJenkins commented on the issue: https://github.com/apache/spark/pull/18973 Test PASSed. Refer to this link for build results (access rights to CI server needed): https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/80942/ Test PASSed.
[GitHub] spark issue #18973: [SPARK-21765] Set isStreaming on leaf nodes for streamin...
Github user AmplabJenkins commented on the issue: https://github.com/apache/spark/pull/18973 Merged build finished. Test PASSed.
[GitHub] spark issue #17373: [SPARK-12664][ML] Expose probability in mlp model
Github user SparkQA commented on the issue: https://github.com/apache/spark/pull/17373 **[Test build #80946 has finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/80946/testReport)** for PR 17373 at commit [`5369b08`](https://github.com/apache/spark/commit/5369b088e7fcb0fa35b0e4c840772cf60515c882). * This patch passes all tests. * This patch merges cleanly. * This patch adds no public classes.
[GitHub] spark issue #18968: [SPARK-21759][SQL] In.checkInputDataTypes should not wro...
Github user dilipbiswal commented on the issue: https://github.com/apache/spark/pull/18968 @viirya Thank you !!
[GitHub] spark issue #18973: [SPARK-21765] Set isStreaming on leaf nodes for streamin...
Github user SparkQA commented on the issue: https://github.com/apache/spark/pull/18973 **[Test build #80942 has finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/80942/testReport)** for PR 18973 at commit [`e55abe6`](https://github.com/apache/spark/commit/e55abe6be316f251bc51f845fc9108f4f721c601).
 * This patch passes all tests.
 * This patch merges cleanly.
 * This patch adds no public classes.
[GitHub] spark pull request #18992: [SPARK-19762][ML][FOLLOWUP]Add necessary comments...
Github user asfgit closed the pull request at: https://github.com/apache/spark/pull/18992
[GitHub] spark issue #19013: [SPARK-21728][core] Allow SparkSubmit to use Logging.
Github user SparkQA commented on the issue: https://github.com/apache/spark/pull/19013 **[Test build #80949 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/80949/testReport)** for PR 19013 at commit [`3d6016e`](https://github.com/apache/spark/commit/3d6016e14eb3fab5cea1bd24452842e59f721cad).
[GitHub] spark issue #18992: [SPARK-19762][ML][FOLLOWUP]Add necessary comments to L2R...
Github user yanboliang commented on the issue: https://github.com/apache/spark/pull/18992 Merged into master. Thanks.
[GitHub] spark issue #19012: [SPARK-17742][core] Fail launcher app handle if child pr...
Github user AmplabJenkins commented on the issue: https://github.com/apache/spark/pull/19012 Test PASSed. Refer to this link for build results (access rights to CI server needed): https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/80941/ Test PASSed.
[GitHub] spark issue #19012: [SPARK-17742][core] Fail launcher app handle if child pr...
Github user AmplabJenkins commented on the issue: https://github.com/apache/spark/pull/19012 Merged build finished. Test PASSed.
[GitHub] spark issue #18984: [SPARK-21773][BUILD][DOCS] Installs mkdocs if missing in...
Github user HyukjinKwon commented on the issue: https://github.com/apache/spark/pull/18984 Thanks for your effort @shaneknapp. I just checked the green.
[GitHub] spark issue #19012: [SPARK-17742][core] Fail launcher app handle if child pr...
Github user SparkQA commented on the issue: https://github.com/apache/spark/pull/19012 **[Test build #80941 has finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/80941/testReport)** for PR 19012 at commit [`4d5cc53`](https://github.com/apache/spark/commit/4d5cc5313c319f900abf1a7f2da0392bc2c396a8).
 * This patch passes all tests.
 * This patch merges cleanly.
 * This patch adds no public classes.