[GitHub] spark pull request #22461: [SPARK-25453][SQL][TEST] OracleIntegrationSuite I...
Github user maropu commented on a diff in the pull request: https://github.com/apache/spark/pull/22461#discussion_r219540966 --- Diff: docs/sql-programming-guide.md --- @@ -1489,7 +1489,7 @@ See the [Apache Avro Data Source Guide](avro-data-source-guide.html). * The JDBC driver class must be visible to the primordial class loader on the client session and on all executors. This is because Java's DriverManager class does a security check that results in it ignoring all drivers not visible to the primordial class loader when one goes to open a connection. One convenient way to do this is to modify compute_classpath.sh on all worker nodes to include your driver JARs. * Some databases, such as H2, convert all names to upper case. You'll need to use upper case to refer to those names in Spark SQL. - + * Users can specify vendor-specific JDBC connection properties in the data source options to enable special treatment. For example, `spark.read.format("jdbc").option("url", oracleJdbcUrl).option("oracle.jdbc.mapDateToTimestamp", "false")`. `oracle.jdbc.mapDateToTimestamp` defaults to true; users often need to disable this flag to avoid an Oracle DATE being resolved as a timestamp. --- End diff -- Ur.. I see. This section is troubleshooting, so I think it would be better to illustrate a concrete trouble case and then describe a solution for it. --- - To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org For additional commands, e-mail: reviews-h...@spark.apache.org
[GitHub] spark pull request #22519: [SPARK-25505][SQL] The output order of grouping c...
GitHub user maryannxue opened a pull request: https://github.com/apache/spark/pull/22519 [SPARK-25505][SQL] The output order of grouping columns in Pivot is different from the input order ## What changes were proposed in this pull request? The grouping columns from a Pivot query are inferred as "input columns - pivot columns - pivot aggregate columns", where input columns are the output of the child relation of Pivot. The grouping columns will be the leading columns in the pivot output and they should preserve the same order as specified by the input. For example, ``` SELECT * FROM ( SELECT course, earnings, "a" as a, "z" as z, "b" as b, "y" as y, "c" as c, "x" as x, "d" as d, "w" as w FROM courseSales ) PIVOT ( sum(earnings) FOR course IN ('dotNET', 'Java') ) ``` The output columns should be "a, z, b, y, c, x, d, w, ..." but now they are "a, b, c, d, w, x, y, z, ...". The fix is to use the child plan's `output` instead of `outputSet` so that the order can be preserved. ## How was this patch tested? Added UT. You can merge this pull request into a Git repository by running: $ git pull https://github.com/maryannxue/spark spark-25505 Alternatively you can review and apply these changes as the patch at: https://github.com/apache/spark/pull/22519.patch To close this pull request, make a commit to your master/trunk branch with (at least) the following in the commit message: This closes #22519 commit bd416bd74ee77329b2527fffecd21f7f90090334 Author: maryannxue Date: 2018-09-21T14:33:16Z [SPARK-25505][SQL] The output order of grouping columns in Pivot is different from the input order
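The `output`-vs-`outputSet` distinction behind this fix can be illustrated without Spark: a set difference has unspecified iteration order, while an order-preserving filter over the ordered output keeps the input order. A minimal Python analogy using the column names from the example above (this is an illustration, not Catalyst's actual code):

```python
# Analogy: grouping columns = input columns minus pivot/aggregate columns.
input_cols = ["a", "z", "b", "y", "c", "x", "d", "w", "course", "earnings"]
excluded = {"course", "earnings"}  # pivot column and aggregate input

# Set difference (like using outputSet): iteration order is unspecified.
unordered = set(input_cols) - excluded

# Order-preserving filter (like using output): keeps the input order.
ordered = [c for c in input_cols if c not in excluded]

print(ordered)  # ['a', 'z', 'b', 'y', 'c', 'x', 'd', 'w']
```

Both contain the same columns; only the ordered variant is guaranteed to match the order the user specified.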
[GitHub] spark issue #22490: [SPARK-25481][TEST] Refactor ColumnarBatchBenchmark to u...
Github user SparkQA commented on the issue: https://github.com/apache/spark/pull/22490 **[Test build #96435 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/96435/testReport)** for PR 22490 at commit [`02ecf3f`](https://github.com/apache/spark/commit/02ecf3f6ba737332464021a4bbf7320b4d71bd70).
[GitHub] spark pull request #22461: [SPARK-25453][SQL][TEST] OracleIntegrationSuite I...
Github user seancxmao commented on a diff in the pull request: https://github.com/apache/spark/pull/22461#discussion_r219539062 --- Diff: docs/sql-programming-guide.md --- @@ -1287,8 +1287,18 @@ bin/spark-shell --driver-class-path postgresql-9.4.1207.jar --jars postgresql-9. Tables from the remote database can be loaded as a DataFrame or Spark SQL temporary view using the Data Sources API. Users can specify the JDBC connection properties in the data source options. user and password are normally provided as connection properties for -logging into the data sources. In addition to the connection properties, Spark also supports -the following case-insensitive options: +logging into the data sources. Vendor-specific connection properties can also be passed to the +underlying JDBC driver in the same way. For example: + +{% highlight scala %} +// oracle.jdbc.mapDateToTimestamp defaults to true. If this flag is not disabled, a column of Oracle +// DATE type will be resolved as Catalyst TimestampType, which is probably not the desired behavior. +spark.read.format("jdbc") + .option("url", oracleJdbcUrl) + .option("oracle.jdbc.mapDateToTimestamp", "false") +{% endhighlight %} + --- End diff -- I have moved this description to the `Troubleshooting` section. I also tried to brush up the description. Writing good documentation is sometimes more difficult than writing code. I really need your help :)
[GitHub] spark issue #22461: [SPARK-25453][SQL][TEST] OracleIntegrationSuite IllegalA...
Github user SparkQA commented on the issue: https://github.com/apache/spark/pull/22461 **[Test build #96434 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/96434/testReport)** for PR 22461 at commit [`f6274a5`](https://github.com/apache/spark/commit/f6274a50177e18be7b36d87913c44103f2fa02d2).
[GitHub] spark pull request #22461: [SPARK-25453][SQL][TEST] OracleIntegrationSuite I...
Github user seancxmao commented on a diff in the pull request: https://github.com/apache/spark/pull/22461#discussion_r219537942 --- Diff: external/docker-integration-tests/src/test/scala/org/apache/spark/sql/jdbc/OracleIntegrationSuite.scala --- @@ -462,6 +468,12 @@ class OracleIntegrationSuite extends DockerJDBCIntegrationSuite with SharedSQLCo .option("lowerBound", "2018-07-04 03:30:00.0") .option("upperBound", "2018-07-27 14:11:05.0") .option("numPartitions", 2) + // oracle.jdbc.mapDateToTimestamp defaults to true. If this flag is not disabled, column d + // (Oracle DATE) will be resolved as Catalyst Timestamp, which will fail test result + // comparison. E.g. Expected 2018-07-06 while Actual 2018-07-06 00:00:00.0. --- End diff -- OK, I will remove the duplicate description.
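The comparison failure quoted above ("Expected 2018-07-06 while Actual 2018-07-06 00:00:00.0") is the generic date-vs-timestamp mismatch: the same instant rendered as a DATE and as a timestamp has different string forms, so a text-based result comparison fails. A small sketch with only the Python standard library (an analogy, not the suite's actual code):

```python
from datetime import date, datetime

# A DATE value and the same instant resolved as a timestamp.
d = date(2018, 7, 6)
ts = datetime(2018, 7, 6, 0, 0, 0)

# They denote the same day, but their string forms differ,
# which is what breaks a text-based test-result comparison.
print(str(d))   # 2018-07-06
print(str(ts))  # 2018-07-06 00:00:00
assert str(d) != str(ts)
```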
[GitHub] spark issue #22138: [SPARK-25151][SS] Apply Apache Commons Pool to KafkaData...
Github user HeartSaVioR commented on the issue: https://github.com/apache/spark/pull/22138 @gaborgsomogyi Totally makes sense. Let me address it while the patch is reviewed by committers. I may get recommendations to rename the config or even more, so addressing documentation would be the last part.
[GitHub] spark issue #22263: [SPARK-25269][SQL] SQL interface support specify Storage...
Github user SparkQA commented on the issue: https://github.com/apache/spark/pull/22263 **[Test build #96433 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/96433/testReport)** for PR 22263 at commit [`c3b6dfc`](https://github.com/apache/spark/commit/c3b6dfcee106c4d0e38975b1a31c3f3e97d2abc1).
[GitHub] spark issue #22263: [SPARK-25269][SQL] SQL interface support specify Storage...
Github user AmplabJenkins commented on the issue: https://github.com/apache/spark/pull/22263 Test PASSed. Refer to this link for build results (access rights to CI server needed): https://amplab.cs.berkeley.edu/jenkins//job/testing-k8s-prb-make-spark-distribution-unified/3350/ Test PASSed.
[GitHub] spark issue #22263: [SPARK-25269][SQL] SQL interface support specify Storage...
Github user AmplabJenkins commented on the issue: https://github.com/apache/spark/pull/22263 Merged build finished. Test PASSed.
[GitHub] spark issue #22288: [SPARK-22148][SPARK-15815][Scheduler] Acquire new execut...
Github user SparkQA commented on the issue: https://github.com/apache/spark/pull/22288 **[Test build #96432 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/96432/testReport)** for PR 22288 at commit [`ffbc9c3`](https://github.com/apache/spark/commit/ffbc9c32d14a0c82036defb90eb18167f93bad4d).
[GitHub] spark issue #22288: [SPARK-22148][SPARK-15815][Scheduler] Acquire new execut...
Github user AmplabJenkins commented on the issue: https://github.com/apache/spark/pull/22288 Build finished. Test PASSed.
[GitHub] spark issue #22288: [SPARK-22148][SPARK-15815][Scheduler] Acquire new execut...
Github user AmplabJenkins commented on the issue: https://github.com/apache/spark/pull/22288 Test PASSed. Refer to this link for build results (access rights to CI server needed): https://amplab.cs.berkeley.edu/jenkins//job/testing-k8s-prb-make-spark-distribution-unified/3349/ Test PASSed.
[GitHub] spark issue #22495: [SPARK-25486][TEST] Refactor SortBenchmark to use main m...
Github user AmplabJenkins commented on the issue: https://github.com/apache/spark/pull/22495 Merged build finished. Test PASSed.
[GitHub] spark issue #22495: [SPARK-25486][TEST] Refactor SortBenchmark to use main m...
Github user AmplabJenkins commented on the issue: https://github.com/apache/spark/pull/22495 Test PASSed. Refer to this link for build results (access rights to CI server needed): https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/96419/ Test PASSed.
[GitHub] spark issue #22495: [SPARK-25486][TEST] Refactor SortBenchmark to use main m...
Github user SparkQA commented on the issue: https://github.com/apache/spark/pull/22495 **[Test build #96419 has finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/96419/testReport)** for PR 22495 at commit [`bbdc202`](https://github.com/apache/spark/commit/bbdc2029c992803e75d04be9e5b2507d6df1779f). * This patch passes all tests. * This patch merges cleanly. * This patch adds no public classes.
[GitHub] spark issue #22198: [SPARK-25121][SQL] Supports multi-part table names for b...
Github user SparkQA commented on the issue: https://github.com/apache/spark/pull/22198 **[Test build #96431 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/96431/testReport)** for PR 22198 at commit [`9a6da27`](https://github.com/apache/spark/commit/9a6da2750554cd03ce468d6bfe0aeba8b75883b8).
[GitHub] spark issue #22198: [SPARK-25121][SQL] Supports multi-part table names for b...
Github user AmplabJenkins commented on the issue: https://github.com/apache/spark/pull/22198 Test PASSed. Refer to this link for build results (access rights to CI server needed): https://amplab.cs.berkeley.edu/jenkins//job/testing-k8s-prb-make-spark-distribution-unified/3348/ Test PASSed.
[GitHub] spark issue #22198: [SPARK-25121][SQL] Supports multi-part table names for b...
Github user AmplabJenkins commented on the issue: https://github.com/apache/spark/pull/22198 Merged build finished. Test PASSed.
[GitHub] spark issue #20433: [SPARK-23264][SQL] Make INTERVAL keyword optional in INT...
Github user maropu commented on the issue: https://github.com/apache/spark/pull/20433 ping
[GitHub] spark issue #22204: [SPARK-25196][SQL] Extends Analyze commands for cached t...
Github user maropu commented on the issue: https://github.com/apache/spark/pull/22204 ping
[GitHub] spark issue #22198: [SPARK-25121][SQL] Supports multi-part table names for b...
Github user maropu commented on the issue: https://github.com/apache/spark/pull/22198 retest this please
[GitHub] spark issue #22227: [SPARK-25202] [SQL] Implements split with limit sql func...
Github user maropu commented on the issue: https://github.com/apache/spark/pull/22227 LGTM except for the R/Python parts (I'm not familiar with these parts and I'll leave them to @felixcheung and @HyukjinKwon).
[GitHub] spark issue #22316: [SPARK-25048][SQL] Pivoting by multiple columns in Scala...
Github user AmplabJenkins commented on the issue: https://github.com/apache/spark/pull/22316 Test PASSed. Refer to this link for build results (access rights to CI server needed): https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/96420/ Test PASSed.
[GitHub] spark issue #22316: [SPARK-25048][SQL] Pivoting by multiple columns in Scala...
Github user AmplabJenkins commented on the issue: https://github.com/apache/spark/pull/22316 Merged build finished. Test PASSed.
[GitHub] spark issue #22316: [SPARK-25048][SQL] Pivoting by multiple columns in Scala...
Github user SparkQA commented on the issue: https://github.com/apache/spark/pull/22316 **[Test build #96420 has finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/96420/testReport)** for PR 22316 at commit [`382640b`](https://github.com/apache/spark/commit/382640be9bb9739929daea0bceb3093836d7f78d). * This patch passes all tests. * This patch merges cleanly. * This patch adds no public classes.
[GitHub] spark pull request #22461: [SPARK-25453][SQL][TEST] OracleIntegrationSuite I...
Github user maropu commented on a diff in the pull request: https://github.com/apache/spark/pull/22461#discussion_r219522094 --- Diff: docs/sql-programming-guide.md --- @@ -1287,8 +1287,18 @@ bin/spark-shell --driver-class-path postgresql-9.4.1207.jar --jars postgresql-9. Tables from the remote database can be loaded as a DataFrame or Spark SQL temporary view using the Data Sources API. Users can specify the JDBC connection properties in the data source options. user and password are normally provided as connection properties for -logging into the data sources. In addition to the connection properties, Spark also supports -the following case-insensitive options: +logging into the data sources. Vendor-specific connection properties can also be passed to the +underlying JDBC driver in the same way. For example: + +{% highlight scala %} +// oracle.jdbc.mapDateToTimestamp defaults to true. If this flag is not disabled, a column of Oracle +// DATE type will be resolved as Catalyst TimestampType, which is probably not the desired behavior. +spark.read.format("jdbc") + .option("url", oracleJdbcUrl) + .option("oracle.jdbc.mapDateToTimestamp", "false") +{% endhighlight %} + --- End diff -- Also, it would be better to brush up the description further so that users can understand it more easily...
[GitHub] spark pull request #22331: [SPARK-25331][SS] Make FileStreamSink ignore part...
Github user misutoth commented on a diff in the pull request: https://github.com/apache/spark/pull/22331#discussion_r219521421 --- Diff: sql/core/src/main/scala/org/apache/spark/sql/execution/streaming/StagingFileCommitProtocol.scala --- @@ -0,0 +1,141 @@ +/* + * Licensed to the Apache Software Foundation (ASF) under one or more + * contributor license agreements. See the NOTICE file distributed with + * this work for additional information regarding copyright ownership. + * The ASF licenses this file to You under the Apache License, Version 2.0 + * (the "License"); you may not use this file except in compliance with + * the License. You may obtain a copy of the License at + * + * http://www.apache.org/licenses/LICENSE-2.0 + * + * Unless required by applicable law or agreed to in writing, software + * distributed under the License is distributed on an "AS IS" BASIS, + * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied. + * See the License for the specific language governing permissions and + * limitations under the License. + */ + +package org.apache.spark.sql.execution.streaming + +import org.apache.hadoop.fs.{FileAlreadyExistsException, FileContext, Path} +import org.apache.hadoop.mapreduce.{JobContext, TaskAttemptContext} + +import org.apache.spark.internal.Logging +import org.apache.spark.internal.io.FileCommitProtocol +import org.apache.spark.internal.io.FileCommitProtocol.TaskCommitMessage + +class StagingFileCommitProtocol(jobId: String, path: String) + extends FileCommitProtocol with Serializable with Logging + with ManifestCommitProtocol { + private var stagingDir: Option[Path] = None --- End diff -- Yes, I will do that.
[GitHub] spark pull request #22461: [SPARK-25453][SQL][TEST] OracleIntegrationSuite I...
Github user maropu commented on a diff in the pull request: https://github.com/apache/spark/pull/22461#discussion_r219520789 --- Diff: docs/sql-programming-guide.md --- @@ -1287,8 +1287,18 @@ bin/spark-shell --driver-class-path postgresql-9.4.1207.jar --jars postgresql-9. Tables from the remote database can be loaded as a DataFrame or Spark SQL temporary view using the Data Sources API. Users can specify the JDBC connection properties in the data source options. user and password are normally provided as connection properties for -logging into the data sources. In addition to the connection properties, Spark also supports -the following case-insensitive options: +logging into the data sources. Vendor-specific connection properties can also be passed to the +underlying JDBC driver in the same way. For example: + +{% highlight scala %} +// oracle.jdbc.mapDateToTimestamp defaults to true. If this flag is not disabled, a column of Oracle +// DATE type will be resolved as Catalyst TimestampType, which is probably not the desired behavior. +spark.read.format("jdbc") + .option("url", oracleJdbcUrl) + .option("oracle.jdbc.mapDateToTimestamp", "false") +{% endhighlight %} + --- End diff -- Probably, I think it's better to put this description in the `Troubleshooting` section: https://spark.apache.org/docs/2.3.1/sql-programming-guide.html#troubleshooting
[GitHub] spark pull request #22461: [SPARK-25453][SQL][TEST] OracleIntegrationSuite I...
Github user maropu commented on a diff in the pull request: https://github.com/apache/spark/pull/22461#discussion_r219518705 --- Diff: external/docker-integration-tests/src/test/scala/org/apache/spark/sql/jdbc/OracleIntegrationSuite.scala --- @@ -462,6 +468,12 @@ class OracleIntegrationSuite extends DockerJDBCIntegrationSuite with SharedSQLCo .option("lowerBound", "2018-07-04 03:30:00.0") .option("upperBound", "2018-07-27 14:11:05.0") .option("numPartitions", 2) + // oracle.jdbc.mapDateToTimestamp defaults to true. If this flag is not disabled, column d + // (Oracle DATE) will be resolved as Catalyst Timestamp, which will fail test result + // comparison. E.g. Expected 2018-07-06 while Actual 2018-07-06 00:00:00.0. --- End diff -- I think we don't need this duplicate description.
[GitHub] spark pull request #22461: [SPARK-25453][SQL][TEST] OracleIntegrationSuite I...
Github user maropu commented on a diff in the pull request: https://github.com/apache/spark/pull/22461#discussion_r219518607 --- Diff: external/docker-integration-tests/src/test/scala/org/apache/spark/sql/jdbc/OracleIntegrationSuite.scala --- @@ -462,6 +468,12 @@ class OracleIntegrationSuite extends DockerJDBCIntegrationSuite with SharedSQLCo .option("lowerBound", "2018-07-04 03:30:00.0") .option("upperBound", "2018-07-27 14:11:05.0") .option("numPartitions", 2) + // oracle.jdbc.mapDateToTimestamp defaults to true. If this flag is not disabled, column d + // (Oracle DATE) will be resolved as Catalyst Timestamp, which will fail test result + // comparison. E.g. Expected 2018-07-06 while Actual 2018-07-06 00:00:00.0. + .option("oracle.jdbc.mapDateToTimestamp", "false") + .option("sessionInitStatement", +"ALTER SESSION SET NLS_TIMESTAMP_FORMAT = 'YYYY-MM-DD HH24:MI:SS.FF'") --- End diff -- I think we don't need this duplicate description.
[GitHub] spark issue #22498: [SPARK-18364] : Expose metrics for YarnShuffleService
Github user pgandhi999 commented on the issue: https://github.com/apache/spark/pull/22498 @tgravescs Fine with me, thank you.
[GitHub] spark issue #19868: [SPARK-22676] Avoid iterating all partition paths when s...
Github user cloud-fan commented on the issue: https://github.com/apache/spark/pull/19868 @jinxing64 Let's not talk too much about the problem of `spark.sql.hive.verifyPartitionPath`. We should just introduce `spark.sql.files.ignoreMissingFiles` and say this can replace `spark.sql.hive.verifyPartitionPath`.
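As a rough sketch of what an `ignoreMissingFiles`-style option provides — skipping files that disappeared between planning and reading instead of failing the job — here is a hypothetical standalone Python function (the helper name and file layout are invented for illustration; this is not Spark's implementation):

```python
import os
import tempfile

def read_partitions(paths, ignore_missing=False):
    """Read each partition file; optionally skip paths that no longer exist."""
    rows = []
    for p in paths:
        if not os.path.exists(p):
            if ignore_missing:
                continue  # behave like an ignoreMissingFiles=true setting
            raise FileNotFoundError(p)
        with open(p) as f:
            rows.extend(f.read().splitlines())
    return rows

with tempfile.TemporaryDirectory() as d:
    existing = os.path.join(d, "part-0")
    with open(existing, "w") as f:
        f.write("row1\nrow2\n")
    missing = os.path.join(d, "part-1")  # deleted after planning, never created

    print(read_partitions([existing, missing], ignore_missing=True))  # ['row1', 'row2']
```

Without the flag, the same call raises `FileNotFoundError`, which mirrors the default fail-fast behavior.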
[GitHub] spark pull request #22513: [SPARK-25499][TEST]Refactor BenchmarkBase and Ben...
Github user asfgit closed the pull request at: https://github.com/apache/spark/pull/22513
[GitHub] spark issue #22511: [SPARK-25422][CORE] Don't memory map blocks streamed to ...
Github user SparkQA commented on the issue: https://github.com/apache/spark/pull/22511 **[Test build #4346 has started](https://amplab.cs.berkeley.edu/jenkins/job/NewSparkPullRequestBuilder/4346/testReport)** for PR 22511 at commit [`aee82ab`](https://github.com/apache/spark/commit/aee82abe4cd9fbefa14fb280644276fe491bcf9a).
[GitHub] spark issue #22513: [SPARK-25499][TEST]Refactor BenchmarkBase and Benchmark
Github user cloud-fan commented on the issue: https://github.com/apache/spark/pull/22513 thanks, merging to master!
[GitHub] spark issue #22511: [SPARK-25422][CORE] Don't memory map blocks streamed to ...
Github user squito commented on the issue: https://github.com/apache/spark/pull/22511 > this seems like a big change, will we hit perf regression? Not vs. 2.3. It only affects things when stream-to-disk is enabled, and when it is enabled, for reading remote cached blocks, this is actually going back to the old behavior. See https://github.com/apache/spark/pull/19476/files#diff-2b643ea78c1add0381754b1f47eec132R692 -- `FileSegmentManagedBuffer.nioByteBuffer()` reads the entire file into a regular byte buffer. I changed it to memory map the file as an attempted optimization. This change will make things slower, but that's vs. other changes that are only in 2.4. There are TODOs in the code for ways to improve this further, but that should not go into 2.4. > is this a long-standing bug? The change here is not fixing a long-standing bug, it's just updating new changes for 2.4. However, I'm really wondering why TorrentBroadcast calls dispose on the blocks. For regular buffers, it's a no-op, so it hasn't mattered, but I can't come up with a reason that you *do* want to dispose those blocks. Secondly, it seems there is an implicit assumption that you never add memory-mapped byte buffers to the MemoryStore. Maybe that makes sense ... it's kind of messing with the Memory / Disk management Spark has. But the MemoryStore never checks that you don't add a mapped buffer, so you'll just get weird behavior like this later on. It seems there should be a check at the very least, to avoid this kind of issue. As neither of those things is new to 2.4, I don't think we should deal with them here. The major motivation for memory mapping the file was not for broadcast blocks, it was for reading large cached blocks.
But it actually makes more sense to change the interfaces in BlockManager to allow us to just get the managedBuffer, instead of a ChunkedByteBuffer (that's this TODO: https://github.com/apache/spark/blob/master/core/src/main/scala/org/apache/spark/storage/BlockManager.scala#L728)
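The distinction squito draws — reading a file fully into a byte buffer versus memory-mapping it, where a dispose/unmap step genuinely matters — can be sketched with Python's stdlib `mmap` (an analogy only, not `FileSegmentManagedBuffer` or Spark's ChunkedByteBuffer):

```python
import mmap
import os
import tempfile

with tempfile.NamedTemporaryFile(delete=False) as f:
    f.write(b"block-data" * 3)
    path = f.name

# Eager read: the whole file is copied into process memory up front.
with open(path, "rb") as f:
    heap_buf = f.read()

# Memory map: pages are faulted in lazily by the OS. Unlike a plain bytes
# buffer, a mapped buffer holds an OS resource, so "dispose" (close/unmap)
# is a real operation here rather than a no-op.
with open(path, "rb") as f:
    mm = mmap.mmap(f.fileno(), 0, access=mmap.ACCESS_READ)
    assert mm[:10] == b"block-data"
    mm.close()  # explicit unmap; accessing mm afterwards would fail

os.unlink(path)
print(heap_buf[:10])  # b'block-data'
```

This is why disposing a mapped buffer that something else still references causes the kind of breakage discussed in the thread, while disposing a heap buffer never did.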
[GitHub] spark issue #22173: [SPARK-24355] Spark external shuffle server improvement ...
Github user redsanket commented on the issue: https://github.com/apache/spark/pull/22173 closes #21402
[GitHub] spark pull request #22173: [SPARK-24355] Spark external shuffle server impro...
Github user asfgit closed the pull request at: https://github.com/apache/spark/pull/22173
[GitHub] spark issue #22173: [SPARK-24355] Spark external shuffle server improvement ...
Github user tgravescs commented on the issue: https://github.com/apache/spark/pull/22173 merged into master (2.5.0)
[GitHub] spark issue #19868: [SPARK-22676] Avoid iterating all partition paths when s...
Github user jinxing64 commented on the issue: https://github.com/apache/spark/pull/19868 @cloud-fan Thanks for the ping! I updated the description. Let me know if I should refine it.
[GitHub] spark pull request #22485: [SPARK-18364][YARN] Expose metrics for YarnShuffl...
Github user tgravescs commented on a diff in the pull request: https://github.com/apache/spark/pull/22485#discussion_r219508537

--- Diff: resource-managers/yarn/src/test/scala/org/apache/spark/network/yarn/YarnShuffleServiceMetricsSuite.scala ---
@@ -0,0 +1,75 @@
+/*
+ * Licensed to the Apache Software Foundation (ASF) under one or more
+ * contributor license agreements. See the NOTICE file distributed with
+ * this work for additional information regarding copyright ownership.
+ * The ASF licenses this file to You under the Apache License, Version 2.0
+ * (the "License"); you may not use this file except in compliance with
+ * the License. You may obtain a copy of the License at
+ *
+ *    http://www.apache.org/licenses/LICENSE-2.0
+ *
+ * Unless required by applicable law or agreed to in writing, software
+ * distributed under the License is distributed on an "AS IS" BASIS,
+ * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+ * See the License for the specific language governing permissions and
+ * limitations under the License.
+ */
+package org.apache.spark.network.yarn
+
+import scala.collection.JavaConverters._
+
+import org.apache.hadoop.metrics2.MetricsRecordBuilder
+import org.mockito.Matchers._
+import org.mockito.Mockito.{mock, times, verify, when}
+import org.scalatest.Matchers
+
+import org.apache.spark.SparkFunSuite
+import org.apache.spark.network.server.OneForOneStreamManager
+import org.apache.spark.network.shuffle.{ExternalShuffleBlockHandler, ExternalShuffleBlockResolver}
+
+class YarnShuffleServiceMetricsSuite extends SparkFunSuite with Matchers {
+
+  val streamManager = mock(classOf[OneForOneStreamManager])
+  val blockResolver = mock(classOf[ExternalShuffleBlockResolver])
+  when(blockResolver.getRegisteredExecutorsSize).thenReturn(42)
+
+  val metrics = new ExternalShuffleBlockHandler(streamManager, blockResolver).getAllMetrics
+
+  test("metrics named as expected") {
+    val allMetrics = Set(
+      "openBlockRequestLatencyMillis", "registerExecutorRequestLatencyMillis",
+      "blockTransferRateBytes", "registeredExecutorsSize")
+
+    metrics.getMetrics.keySet().asScala should be (allMetrics)
+  }
+
+  // these three metrics have the same effect on the collector
+  for (testname <- Seq("openBlockRequestLatencyMillis",
+      "registerExecutorRequestLatencyMillis",
+      "blockTransferRateBytes")) {
+    test(s"$testname - collector receives correct types") {
+      val builder = mock(classOf[MetricsRecordBuilder])
+      when(builder.addCounter(any(), anyLong())).thenReturn(builder)
+      when(builder.addGauge(any(), anyDouble())).thenReturn(builder)
+
+      YarnShuffleServiceMetrics.collectMetric(builder, testname,
+        metrics.getMetrics.get(testname))
+
+      verify(builder).addCounter(anyObject(), anyLong())
+      verify(builder, times(4)).addGauge(anyObject(), anyDouble())
+    }
+  }
+
+  // this metric writes only one gauge to the collector
+  test("registeredExecutorsSize - collector receives correct types") {
+    val builder = mock(classOf[MetricsRecordBuilder])
+    when(builder.addCounter(any(), anyLong())).thenReturn(builder)
+    when(builder.addGauge(any(), anyDouble())).thenReturn(builder)
+
+    YarnShuffleServiceMetrics.collectMetric(builder, "registeredExecutorsSize",
--- End diff --

you are calling a private function here

---

- To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org
For additional commands, e-mail: reviews-h...@spark.apache.org
[GitHub] spark pull request #22485: [SPARK-18364][YARN] Expose metrics for YarnShuffl...
Github user tgravescs commented on a diff in the pull request: https://github.com/apache/spark/pull/22485#discussion_r219508486

--- Diff: resource-managers/yarn/src/test/scala/org/apache/spark/network/yarn/YarnShuffleServiceMetricsSuite.scala ---
@@ -0,0 +1,75 @@
+/*
+ * Licensed to the Apache Software Foundation (ASF) under one or more
+ * contributor license agreements. See the NOTICE file distributed with
+ * this work for additional information regarding copyright ownership.
+ * The ASF licenses this file to You under the Apache License, Version 2.0
+ * (the "License"); you may not use this file except in compliance with
+ * the License. You may obtain a copy of the License at
+ *
+ *    http://www.apache.org/licenses/LICENSE-2.0
+ *
+ * Unless required by applicable law or agreed to in writing, software
+ * distributed under the License is distributed on an "AS IS" BASIS,
+ * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+ * See the License for the specific language governing permissions and
+ * limitations under the License.
+ */
+package org.apache.spark.network.yarn
+
+import scala.collection.JavaConverters._
+
+import org.apache.hadoop.metrics2.MetricsRecordBuilder
+import org.mockito.Matchers._
+import org.mockito.Mockito.{mock, times, verify, when}
+import org.scalatest.Matchers
+
+import org.apache.spark.SparkFunSuite
+import org.apache.spark.network.server.OneForOneStreamManager
+import org.apache.spark.network.shuffle.{ExternalShuffleBlockHandler, ExternalShuffleBlockResolver}
+
+class YarnShuffleServiceMetricsSuite extends SparkFunSuite with Matchers {
+
+  val streamManager = mock(classOf[OneForOneStreamManager])
+  val blockResolver = mock(classOf[ExternalShuffleBlockResolver])
+  when(blockResolver.getRegisteredExecutorsSize).thenReturn(42)
+
+  val metrics = new ExternalShuffleBlockHandler(streamManager, blockResolver).getAllMetrics
+
+  test("metrics named as expected") {
+    val allMetrics = Set(
+      "openBlockRequestLatencyMillis", "registerExecutorRequestLatencyMillis",
+      "blockTransferRateBytes", "registeredExecutorsSize")
+
+    metrics.getMetrics.keySet().asScala should be (allMetrics)
+  }
+
+  // these three metrics have the same effect on the collector
+  for (testname <- Seq("openBlockRequestLatencyMillis",
+      "registerExecutorRequestLatencyMillis",
+      "blockTransferRateBytes")) {
+    test(s"$testname - collector receives correct types") {
+      val builder = mock(classOf[MetricsRecordBuilder])
+      when(builder.addCounter(any(), anyLong())).thenReturn(builder)
+      when(builder.addGauge(any(), anyDouble())).thenReturn(builder)
+
+      YarnShuffleServiceMetrics.collectMetric(builder, testname,
--- End diff --

you are calling a private function here
[GitHub] spark issue #22498: [SPARK-18364] : Expose metrics for YarnShuffleService
Github user tgravescs commented on the issue: https://github.com/apache/spark/pull/22498 @pgandhi999 thanks for the PR. Since you have additional changes here, and there are further changes made by the original author in palantir#236, perhaps for now we should review the lowest-common-denominator version and get that in; we can then add the additional metrics on top of it in separate PRs.
[GitHub] spark pull request #22485: [SPARK-18364][YARN] Expose metrics for YarnShuffl...
Github user tgravescs commented on a diff in the pull request: https://github.com/apache/spark/pull/22485#discussion_r219506086

--- Diff: resource-managers/yarn/src/test/scala/org/apache/spark/network/yarn/YarnShuffleServiceMetricsSuite.scala ---
@@ -0,0 +1,75 @@
+/*
+ * Licensed to the Apache Software Foundation (ASF) under one or more
+ * contributor license agreements. See the NOTICE file distributed with
+ * this work for additional information regarding copyright ownership.
+ * The ASF licenses this file to You under the Apache License, Version 2.0
+ * (the "License"); you may not use this file except in compliance with
+ * the License. You may obtain a copy of the License at
+ *
+ *    http://www.apache.org/licenses/LICENSE-2.0
+ *
+ * Unless required by applicable law or agreed to in writing, software
+ * distributed under the License is distributed on an "AS IS" BASIS,
+ * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+ * See the License for the specific language governing permissions and
+ * limitations under the License.
+ */
+package org.apache.spark.network.yarn
+
+import scala.collection.JavaConverters._
+
+import org.apache.hadoop.metrics2.MetricsRecordBuilder
+import org.mockito.Matchers._
+import org.mockito.Mockito.{mock, times, verify, when}
+import org.scalatest.Matchers
+
+import org.apache.spark.SparkFunSuite
+import org.apache.spark.network.server.OneForOneStreamManager
+import org.apache.spark.network.shuffle.{ExternalShuffleBlockHandler, ExternalShuffleBlockResolver}
+
+class YarnShuffleServiceMetricsSuite extends SparkFunSuite with Matchers {
+
+  val streamManager = mock(classOf[OneForOneStreamManager])
+  val blockResolver = mock(classOf[ExternalShuffleBlockResolver])
+  when(blockResolver.getRegisteredExecutorsSize).thenReturn(42)
+
+  val metrics = new ExternalShuffleBlockHandler(streamManager, blockResolver).getAllMetrics
+
+  test("metrics named as expected") {
+    val allMetrics = Set(
+      "openBlockRequestLatencyMillis", "registerExecutorRequestLatencyMillis",
+      "blockTransferRateBytes", "registeredExecutorsSize")
+
+    metrics.getMetrics.keySet().asScala should be (allMetrics)
+  }
+
+  // these three metrics have the same effect on the collector
+  for (testname <- Seq("openBlockRequestLatencyMillis",
+      "registerExecutorRequestLatencyMillis",
+      "blockTransferRateBytes")) {
+    test(s"$testname - collector receives correct types") {
+      val builder = mock(classOf[MetricsRecordBuilder])
+      when(builder.addCounter(any(), anyLong())).thenReturn(builder)
+      when(builder.addGauge(any(), anyDouble())).thenReturn(builder)
+
+      YarnShuffleServiceMetrics.collectMetric(builder, testname,
+        metrics.getMetrics.get(testname))
+
+      verify(builder).addCounter(anyObject(), anyLong())
+      verify(builder, times(4)).addGauge(anyObject(), anyDouble())
+    }
+  }
+
+  // this metric writes only one gauge to the collector
+  test("registeredExecutorsSize - collector receives correct types") {
+    val builder = mock(classOf[MetricsRecordBuilder])
+    when(builder.addCounter(any(), anyLong())).thenReturn(builder)
--- End diff --

Don't seem to use these?
[GitHub] spark pull request #22494: [SPARK-22036][SQL][followup] add a new config for...
Github user mgaido91 commented on a diff in the pull request: https://github.com/apache/spark/pull/22494#discussion_r219505527

--- Diff: sql/catalyst/src/main/scala/org/apache/spark/sql/internal/SQLConf.scala ---
@@ -1345,6 +1345,15 @@ object SQLConf {
     .booleanConf
     .createWithDefault(true)
 
+  val LITERAL_PICK_MINIMUM_PRECISION =
+    buildConf("spark.sql.literal.pickMinimumPrecision")
--- End diff --

nit: I am thinking whether we can consider this as a `legacy` config. We can also remove it once the PR for SPARK-25454 is merged and/or we remove the support for negative-scale decimals. What do you think?
[GitHub] spark pull request #22485: [SPARK-18364][YARN] Expose metrics for YarnShuffl...
Github user tgravescs commented on a diff in the pull request: https://github.com/apache/spark/pull/22485#discussion_r219506036

--- Diff: resource-managers/yarn/src/test/scala/org/apache/spark/network/yarn/YarnShuffleServiceMetricsSuite.scala ---
@@ -0,0 +1,75 @@
+/*
+ * Licensed to the Apache Software Foundation (ASF) under one or more
+ * contributor license agreements. See the NOTICE file distributed with
+ * this work for additional information regarding copyright ownership.
+ * The ASF licenses this file to You under the Apache License, Version 2.0
+ * (the "License"); you may not use this file except in compliance with
+ * the License. You may obtain a copy of the License at
+ *
+ *    http://www.apache.org/licenses/LICENSE-2.0
+ *
+ * Unless required by applicable law or agreed to in writing, software
+ * distributed under the License is distributed on an "AS IS" BASIS,
+ * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+ * See the License for the specific language governing permissions and
+ * limitations under the License.
+ */
+package org.apache.spark.network.yarn
+
+import scala.collection.JavaConverters._
+
+import org.apache.hadoop.metrics2.MetricsRecordBuilder
+import org.mockito.Matchers._
+import org.mockito.Mockito.{mock, times, verify, when}
+import org.scalatest.Matchers
+
+import org.apache.spark.SparkFunSuite
+import org.apache.spark.network.server.OneForOneStreamManager
+import org.apache.spark.network.shuffle.{ExternalShuffleBlockHandler, ExternalShuffleBlockResolver}
+
+class YarnShuffleServiceMetricsSuite extends SparkFunSuite with Matchers {
+
+  val streamManager = mock(classOf[OneForOneStreamManager])
+  val blockResolver = mock(classOf[ExternalShuffleBlockResolver])
+  when(blockResolver.getRegisteredExecutorsSize).thenReturn(42)
+
+  val metrics = new ExternalShuffleBlockHandler(streamManager, blockResolver).getAllMetrics
+
+  test("metrics named as expected") {
+    val allMetrics = Set(
+      "openBlockRequestLatencyMillis", "registerExecutorRequestLatencyMillis",
+      "blockTransferRateBytes", "registeredExecutorsSize")
+
+    metrics.getMetrics.keySet().asScala should be (allMetrics)
+  }
+
+  // these three metrics have the same effect on the collector
+  for (testname <- Seq("openBlockRequestLatencyMillis",
+      "registerExecutorRequestLatencyMillis",
+      "blockTransferRateBytes")) {
+    test(s"$testname - collector receives correct types") {
+      val builder = mock(classOf[MetricsRecordBuilder])
+      when(builder.addCounter(any(), anyLong())).thenReturn(builder)
--- End diff --

I don't see where these when's are used?
[GitHub] spark pull request #22485: [SPARK-18364][YARN] Expose metrics for YarnShuffl...
Github user tgravescs commented on a diff in the pull request: https://github.com/apache/spark/pull/22485#discussion_r219311876

--- Diff: common/network-yarn/src/main/java/org/apache/spark/network/yarn/YarnShuffleService.java ---
@@ -168,6 +170,15 @@ protected void serviceInit(Configuration conf) throws Exception {
     TransportConf transportConf = new TransportConf("shuffle", new HadoopConfigProvider(conf));
     blockHandler = new ExternalShuffleBlockHandler(transportConf, registeredExecutorFile);
 
+    // register metrics on the block handler into the Node Manager's metrics system.
+    YarnShuffleServiceMetrics serviceMetrics =
+            new YarnShuffleServiceMetrics(blockHandler.getAllMetrics());
--- End diff --

indentation should be 2 spaces, fix here and below
[GitHub] spark issue #22513: [SPARK-25499][TEST]Refactor BenchmarkBase and Benchmark
Github user AmplabJenkins commented on the issue: https://github.com/apache/spark/pull/22513 Test PASSed. Refer to this link for build results (access rights to CI server needed): https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/96415/ Test PASSed.
[GitHub] spark issue #22513: [SPARK-25499][TEST]Refactor BenchmarkBase and Benchmark
Github user AmplabJenkins commented on the issue: https://github.com/apache/spark/pull/22513 Merged build finished. Test PASSed.
[GitHub] spark issue #22513: [SPARK-25499][TEST]Refactor BenchmarkBase and Benchmark
Github user SparkQA commented on the issue: https://github.com/apache/spark/pull/22513 **[Test build #96415 has finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/96415/testReport)** for PR 22513 at commit [`1a5e1e9`](https://github.com/apache/spark/commit/1a5e1e927072e4438ea1ce7dc579a6d5b0986835).
* This patch passes all tests.
* This patch merges cleanly.
* This patch adds no public classes.
[GitHub] spark issue #18544: [SPARK-21318][SQL]Improve exception message thrown by `l...
Github user cloud-fan commented on the issue: https://github.com/apache/spark/pull/18544 LGTM
[GitHub] spark pull request #18544: [SPARK-21318][SQL]Improve exception message throw...
Github user cloud-fan commented on a diff in the pull request: https://github.com/apache/spark/pull/18544#discussion_r219505379

--- Diff: sql/hive/src/main/scala/org/apache/spark/sql/hive/HiveSessionCatalog.scala ---
@@ -131,14 +131,14 @@ private[sql] class HiveSessionCatalog(
     Try(super.lookupFunction(funcName, children)) match {
       case Success(expr) => expr
       case Failure(error) =>
-        if (functionRegistry.functionExists(funcName)) {
-          // If the function actually exists in functionRegistry, it means that there is an
-          // error when we create the Expression using the given children.
+        if (super.functionExists(name)) {
+          // If the function actually exists in functionRegistry or externalCatalog,
+          // it means that there is an error when we create the Expression using the given children.
           // We need to throw the original exception.
           throw error
         } else {
-          // This function is not in functionRegistry, let's try to load it as a Hive's
-          // built-in function.
+          // This function is not in functionRegistry or externalCatalog,
--- End diff --

`This function does not exist (neither in functionRegistry nor externalCatalog) ...`
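The fallback logic under review here (rethrow the original error only when the function is known to exist in some registry; otherwise try loading it as a built-in) can be sketched in miniature. All names below are hypothetical stand-ins for illustration, not Spark's actual API:

```python
class FunctionNotFound(Exception):
    pass

def lookup_function(name, registry, external_catalog, builtins):
    """Look up `name`, rethrowing construction errors only when the
    function is known to exist; otherwise fall back to built-ins."""
    try:
        # Building the expression may itself fail, even for a known function.
        return registry[name]()
    except Exception as err:
        if name in registry or name in external_catalog:
            # The function exists, so the failure came from constructing the
            # expression with the given children: surface the original error.
            raise
        if name in builtins:
            # Unknown to both registries: try loading it as a built-in.
            return builtins[name]()
        raise FunctionNotFound(name) from err

def ok():
    return "expr"

def bad():
    raise ValueError("bad children")

registry = {"good": ok, "bad": bad}
builtin_fns = {"some_builtin": ok}
assert lookup_function("good", registry, {}, builtin_fns) == "expr"
assert lookup_function("some_builtin", registry, {}, builtin_fns) == "expr"  # fallback path
```

The key point the review comments are refining is the existence check: it must consult every place the function could live, or a construction error gets masked by a pointless fallback attempt.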
[GitHub] spark issue #22512: [SPARK-25498][SQL][WIP] Fix SQLQueryTestSuite failures w...
Github user SparkQA commented on the issue: https://github.com/apache/spark/pull/22512 **[Test build #96430 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/96430/testReport)** for PR 22512 at commit [`bff88ee`](https://github.com/apache/spark/commit/bff88ee81f57900cca38df8455c4a2eb78b4b758).
[GitHub] spark pull request #18544: [SPARK-21318][SQL]Improve exception message throw...
Github user cloud-fan commented on a diff in the pull request: https://github.com/apache/spark/pull/18544#discussion_r219505121

--- Diff: sql/hive/src/main/scala/org/apache/spark/sql/hive/HiveSessionCatalog.scala ---
@@ -131,14 +131,14 @@ private[sql] class HiveSessionCatalog(
     Try(super.lookupFunction(funcName, children)) match {
       case Success(expr) => expr
       case Failure(error) =>
-        if (functionRegistry.functionExists(funcName)) {
-          // If the function actually exists in functionRegistry, it means that there is an
-          // error when we create the Expression using the given children.
+        if (super.functionExists(name)) {
+          // If the function actually exists in functionRegistry or externalCatalog,
--- End diff --

`If the function exists (either in functionRegistry or externalCatalog), ...`
[GitHub] spark issue #22512: [SPARK-25498][SQL][WIP] Fix SQLQueryTestSuite failures w...
Github user AmplabJenkins commented on the issue: https://github.com/apache/spark/pull/22512 Merged build finished. Test PASSed.
[GitHub] spark issue #22512: [SPARK-25498][SQL][WIP] Fix SQLQueryTestSuite failures w...
Github user AmplabJenkins commented on the issue: https://github.com/apache/spark/pull/22512 Test PASSed. Refer to this link for build results (access rights to CI server needed): https://amplab.cs.berkeley.edu/jenkins//job/testing-k8s-prb-make-spark-distribution-unified/3347/ Test PASSed.
[GitHub] spark issue #22466: [SPARK-25464][SQL]On dropping the Database it will drop ...
Github user srowen commented on the issue: https://github.com/apache/spark/pull/22466 See JIRA, I don't think this should be merged.
[GitHub] spark issue #22494: [SPARK-22036][SQL][followup] add a new config for pickin...
Github user SparkQA commented on the issue: https://github.com/apache/spark/pull/22494 **[Test build #96429 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/96429/testReport)** for PR 22494 at commit [`b4fdd13`](https://github.com/apache/spark/commit/b4fdd1307059c7df7c386a96aad6bc17b593d9c5).
[GitHub] spark issue #22494: [SPARK-22036][SQL][followup] add a new config for pickin...
Github user AmplabJenkins commented on the issue: https://github.com/apache/spark/pull/22494 Merged build finished. Test PASSed.
[GitHub] spark issue #22494: [SPARK-22036][SQL][followup] add a new config for pickin...
Github user AmplabJenkins commented on the issue: https://github.com/apache/spark/pull/22494 Test PASSed. Refer to this link for build results (access rights to CI server needed): https://amplab.cs.berkeley.edu/jenkins//job/testing-k8s-prb-make-spark-distribution-unified/3346/ Test PASSed.
[GitHub] spark issue #22490: [SPARK-25481][TEST] Refactor ColumnarBatchBenchmark to u...
Github user AmplabJenkins commented on the issue: https://github.com/apache/spark/pull/22490 Merged build finished. Test PASSed.
[GitHub] spark issue #22490: [SPARK-25481][TEST] Refactor ColumnarBatchBenchmark to u...
Github user AmplabJenkins commented on the issue: https://github.com/apache/spark/pull/22490 Test PASSed. Refer to this link for build results (access rights to CI server needed): https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/96418/ Test PASSed.
[GitHub] spark issue #22490: [SPARK-25481][TEST] Refactor ColumnarBatchBenchmark to u...
Github user SparkQA commented on the issue: https://github.com/apache/spark/pull/22490 **[Test build #96418 has finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/96418/testReport)** for PR 22490 at commit [`cbfac03`](https://github.com/apache/spark/commit/cbfac03f4e34c32b797f9e26ec9146136a0c14be).
* This patch passes all tests.
* This patch merges cleanly.
* This patch adds no public classes.
[GitHub] spark issue #10130: SPARK-12105 - SPARK-SQL add convenient show functions
Github user davis-x commented on the issue: https://github.com/apache/spark/pull/10130 +1
[GitHub] spark issue #22518: [SPARK-25482][SQL] ReuseSubquery can be useless when the...
Github user SparkQA commented on the issue: https://github.com/apache/spark/pull/22518 **[Test build #96428 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/96428/testReport)** for PR 22518 at commit [`7c75067`](https://github.com/apache/spark/commit/7c75067767ed6935960d09c7915da86fea3553fa).
[GitHub] spark issue #22518: [SPARK-25482][SQL] ReuseSubquery can be useless when the...
Github user AmplabJenkins commented on the issue: https://github.com/apache/spark/pull/22518 Test PASSed. Refer to this link for build results (access rights to CI server needed): https://amplab.cs.berkeley.edu/jenkins//job/testing-k8s-prb-make-spark-distribution-unified/3345/ Test PASSed.
[GitHub] spark issue #22518: [SPARK-25482][SQL] ReuseSubquery can be useless when the...
Github user AmplabJenkins commented on the issue: https://github.com/apache/spark/pull/22518 Merged build finished. Test PASSed.
[GitHub] spark issue #22450: [SPARK-25454][SQL] Avoid precision loss in division with...
Github user AmplabJenkins commented on the issue: https://github.com/apache/spark/pull/22450 Merged build finished. Test PASSed.
[GitHub] spark issue #22450: [SPARK-25454][SQL] Avoid precision loss in division with...
Github user AmplabJenkins commented on the issue: https://github.com/apache/spark/pull/22450 Test PASSed. Refer to this link for build results (access rights to CI server needed): https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/96417/ Test PASSed.
[GitHub] spark issue #22450: [SPARK-25454][SQL] Avoid precision loss in division with...
Github user SparkQA commented on the issue: https://github.com/apache/spark/pull/22450 **[Test build #96417 has finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/96417/testReport)** for PR 22450 at commit [`4e240d9`](https://github.com/apache/spark/commit/4e240d9abea9ea67312f31e3af129416b8c3381a).
* This patch passes all tests.
* This patch merges cleanly.
* This patch adds no public classes.
[GitHub] spark issue #22518: [SPARK-25482][SQL] ReuseSubquery can be useless when the...
Github user AmplabJenkins commented on the issue: https://github.com/apache/spark/pull/22518 Merged build finished. Test PASSed.
[GitHub] spark issue #22518: [SPARK-25482][SQL] ReuseSubquery can be useless when the...
Github user SparkQA commented on the issue: https://github.com/apache/spark/pull/22518 **[Test build #96427 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/96427/testReport)** for PR 22518 at commit [`36fa664`](https://github.com/apache/spark/commit/36fa664c6d251901270984115ff2ebfd1b665fca).
[GitHub] spark issue #22518: [SPARK-25482][SQL] ReuseSubquery can be useless when the...
Github user AmplabJenkins commented on the issue: https://github.com/apache/spark/pull/22518 Test PASSed. Refer to this link for build results (access rights to CI server needed): https://amplab.cs.berkeley.edu/jenkins//job/testing-k8s-prb-make-spark-distribution-unified/3344/ Test PASSed.
[GitHub] spark pull request #22503: [SPARK-25493] [SQL] Fix multiline crlf
Github user MaxGekk commented on a diff in the pull request: https://github.com/apache/spark/pull/22503#discussion_r219495737

--- Diff: sql/core/src/main/scala/org/apache/spark/sql/execution/datasources/csv/CSVOptions.scala ---
@@ -212,6 +212,7 @@ class CSVOptions(
     settings.setEmptyValue(emptyValueInRead)
     settings.setMaxCharsPerColumn(maxCharsPerColumn)
     settings.setUnescapedQuoteHandling(UnescapedQuoteHandling.STOP_AT_DELIMITER)
+    settings.setLineSeparatorDetectionEnabled(true)
--- End diff --

The auto-detection mechanism is enabled for both multi-line and per-line modes. I guess it has some overhead for detecting new lines, which is not needed in per-line mode. I would benchmark it in both modes (see `CSVBenchmarks`), and if the overhead in per-line mode is significant, I would not enable the option when `multiLine` is set to `false`.
[GitHub] spark issue #22517: Branch 2.3 how can i fix error use Pyspark
Github user wangyum commented on the issue: https://github.com/apache/spark/pull/22517

Do you mind closing this PR? Questions and help requests should be sent to `u...@spark.apache.org`:
```
u...@spark.apache.org is for usage questions, help, and announcements. (subscribe) (unsubscribe) (archives)
d...@spark.apache.org is for people who want to contribute code to Spark. (subscribe) (unsubscribe) (archives)
```
http://spark.apache.org/community.html
[GitHub] spark pull request #22518: [SPARK-25482][SQL] ReuseSubquery can be useless w...
GitHub user mgaido91 opened a pull request:

https://github.com/apache/spark/pull/22518

[SPARK-25482][SQL] ReuseSubquery can be useless when the subqueries have the same id

## What changes were proposed in this pull request?

When a `ExecSubqueryExpression` is copied, it may happen that there are two instances with the same `exprId`. This can happen for instance when a filter containing a scalar subquery is pushed to a DataSource. In this scenario, `ReuseSubquery` becomes useless, as replacing the `SubqueryExec` is ignored since the new plan is equal to the previous one. The PR avoids this problem by creating a new `ExprId` for the subquery expression when it is changed in `ReuseSubquery`, so that the changes are not ignored.

## How was this patch tested?

added UT

You can merge this pull request into a Git repository by running:

    $ git pull https://github.com/mgaido91/spark SPARK-25482

Alternatively you can review and apply these changes as the patch at:

    https://github.com/apache/spark/pull/22518.patch

To close this pull request, make a commit to your master/trunk branch with (at least) the following in the commit message:

    This closes #22518

commit 36fa664c6d251901270984115ff2ebfd1b665fca
Author: Marco Gaido
Date: 2018-09-21T12:50:11Z

    [SPARK-25482][SQL] ReuseSubquery can be effectless when the subqueries have the same id
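The failure mode this PR describes, a rewrite silently dropped because the replacement compares equal to the original by id, can be illustrated with a toy model. The class and function names below are hypothetical stand-ins, not Spark's actual classes:

```python
import itertools

_ids = itertools.count()

class SubqueryExpr:
    """Toy expression whose equality is determined by expr_id alone."""
    def __init__(self, plan, expr_id=None):
        self.plan = plan
        self.expr_id = next(_ids) if expr_id is None else expr_id

    def __eq__(self, other):
        # equality driven only by the id, ignoring the plan inside
        return isinstance(other, SubqueryExpr) and self.expr_id == other.expr_id

def transform(expr, replacement):
    """A transform that, like many tree rewrites, skips no-op replacements."""
    return replacement if replacement != expr else expr

orig = SubqueryExpr(plan="SubqueryExec")
# The pushed-down copy keeps the same id, so swapping in the reused plan
# compares equal to the original and the rewrite is dropped:
same_id = SubqueryExpr(plan="ReusedSubquery", expr_id=orig.expr_id)
assert transform(orig, same_id) is orig

# With a fresh id, the rewrite is no longer mistaken for a no-op:
fresh_id = SubqueryExpr(plan="ReusedSubquery")
assert transform(orig, fresh_id) is fresh_id
```

This mirrors the PR's fix: assigning a new id to the rewritten subquery expression makes the change visible to equality-based deduplication instead of being discarded.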
[GitHub] spark issue #22517: Branch 2.3 how can i fix error use Pyspark
Github user AmplabJenkins commented on the issue: https://github.com/apache/spark/pull/22517 Can one of the admins verify this patch?
[GitHub] spark pull request #22494: [SPARK-22036][SQL][followup] add a new config for...
Github user mgaido91 commented on a diff in the pull request: https://github.com/apache/spark/pull/22494#discussion_r219493420

--- Diff: sql/core/src/test/scala/org/apache/spark/sql/SQLQuerySuite.scala ---
@@ -2858,6 +2858,13 @@ class SQLQuerySuite extends QueryTest with SharedSQLContext {
     val result = ds.flatMap(_.bar).distinct
     result.rdd.isEmpty
   }
+
+  test("SPARK-25454: decimal division with negative scale") {
+    // TODO: completely fix this issue even LITERAL_PRECISE_PRECISION is true.
--- End diff --

nit: `even when`
[GitHub] spark pull request #22494: [SPARK-22036][SQL][followup] add a new config for...
Github user mgaido91 commented on a diff in the pull request: https://github.com/apache/spark/pull/22494#discussion_r219492191

--- Diff: sql/catalyst/src/main/scala/org/apache/spark/sql/internal/SQLConf.scala ---
@@ -1345,6 +1345,16 @@ object SQLConf {
     .booleanConf
     .createWithDefault(true)
 
+  val LITERAL_PRECISE_PRECISION =
+    buildConf("spark.sql.literal.precisePrecision")
+      .internal()
+      .doc("When integral literals are used with decimals in binary operators, Spark will " +
+        "pick a precise precision for the literals to calculate the precision and scale " +
+        "of the result decimal, when this config is true. By picking a precise precision, we " +
+        "can avoid wasting precision, to reduce the possibility of overflow.")
--- End diff --

`to reduce the possibility of overflow`: actually this is not true; it depends on the value of `DECIMAL_OPERATIONS_ALLOW_PREC_LOSS`. If `DECIMAL_OPERATIONS_ALLOW_PREC_LOSS` is true, the risk is a precision loss, but we don't overflow. If it is false, then this statement is right.
[GitHub] spark pull request #22494: [SPARK-22036][SQL][followup] add a new config for...
Github user mgaido91 commented on a diff in the pull request: https://github.com/apache/spark/pull/22494#discussion_r219492885

--- Diff: sql/catalyst/src/main/scala/org/apache/spark/sql/internal/SQLConf.scala ---
@@ -1345,6 +1345,16 @@ object SQLConf {
     .booleanConf
     .createWithDefault(true)
 
+  val LITERAL_PRECISE_PRECISION =
--- End diff --

mmh... PRECISE_PRECISION sounds weird... can we look for a better one? I don't have a suggestion right now, but I'll think about it.
[GitHub] spark pull request #22494: [SPARK-22036][SQL][followup] add a new config for...
Github user mgaido91 commented on a diff in the pull request: https://github.com/apache/spark/pull/22494#discussion_r219493091 --- Diff: sql/catalyst/src/main/scala/org/apache/spark/sql/internal/SQLConf.scala --- @@ -1345,6 +1345,16 @@ object SQLConf { .booleanConf .createWithDefault(true) + val LITERAL_PRECISE_PRECISION = +buildConf("spark.sql.literal.precisePrecision") + .internal() + .doc("When integral literals are used with decimals in binary operators, Spark will " + +"pick a precise precision for the literals to calculate the precision and scale " + --- End diff -- `a precise precision` -> `the minimal precision required to represent the given value`?
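The "minimal precision required to represent the given value" suggested above can be sketched outside of Spark with the standard `decimal` module. Note `minimal_precision` is a hypothetical helper for illustration, not Spark's actual implementation:

```python
from decimal import Decimal

def minimal_precision(value: int) -> int:
    """Smallest number of significant digits needed to represent an
    integral literal exactly as a decimal, i.e. the precision that
    avoids wasting digits in result-type inference."""
    return len(Decimal(value).as_tuple().digits)

# With a precise precision, a literal like 123 can be typed as DECIMAL(3, 0)
# instead of the wide default Spark would otherwise assume for an integral
# literal, so binary-operator result types waste less precision.
print(minimal_precision(123))
```

The sign does not affect the digit count, and zero still needs one digit.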
[GitHub] spark pull request #22517: Branch 2.3 how can i fix error use Pyspark
GitHub user lovezeropython opened a pull request: https://github.com/apache/spark/pull/22517 Branch 2.3 how can i fix error use Pyspark how can i fix error use Pyspark ![image](https://user-images.githubusercontent.com/35518020/45883415-9cc23800-bde3-11e8-90b1-2bed95608c54.png) use spark 2.3 python 3.6 You can merge this pull request into a Git repository by running: $ git pull https://github.com/apache/spark branch-2.3 Alternatively you can review and apply these changes as the patch at: https://github.com/apache/spark/pull/22517.patch To close this pull request, make a commit to your master/trunk branch with (at least) the following in the commit message: This closes #22517 commit a9d0784e6733666a0608e8236322f1dc380e96b7 Author: smallory Date: 2018-03-15T02:58:54Z [SPARK-23642][DOCS] AccumulatorV2 subclass isZero scaladoc fix Added/corrected scaladoc for isZero on the DoubleAccumulator, CollectionAccumulator, and LongAccumulator subclasses of AccumulatorV2, particularly noting where there are requirements in addition to having a value of zero in order to return true. ## What changes were proposed in this pull request? Three scaladoc comments are updated in AccumulatorV2.scala No changes outside of comment blocks were made. ## How was this patch tested? Running "sbt unidoc", fixing style errors found, and reviewing the resulting local scaladoc in firefox. Author: smallory Closes #20790 from smallory/patch-1. (cherry picked from commit 4f5bad615b47d743b8932aea1071652293981604) Signed-off-by: hyukjinkwon commit 72c13ed844d6be6510ce2c5e3526c234d1d5e10f Author: hyukjinkwon Date: 2018-03-15T17:55:33Z [SPARK-23695][PYTHON] Fix the error message for Kinesis streaming tests ## What changes were proposed in this pull request? This PR proposes to fix the error message for Kinesis in PySpark when its jar is missing but explicitly enabled. 
```bash ENABLE_KINESIS_TESTS=1 SPARK_TESTING=1 bin/pyspark pyspark.streaming.tests ``` Before: ``` Skipped test_flume_stream (enable by setting environment variable ENABLE_FLUME_TESTS=1Skipped test_kafka_stream (enable by setting environment variable ENABLE_KAFKA_0_8_TESTS=1Traceback (most recent call last): File "/usr/local/Cellar/python/2.7.14_3/Frameworks/Python.framework/Versions/2.7/lib/python2.7/runpy.py", line 174, in _run_module_as_main "__main__", fname, loader, pkg_name) File "/usr/local/Cellar/python/2.7.14_3/Frameworks/Python.framework/Versions/2.7/lib/python2.7/runpy.py", line 72, in _run_code exec code in run_globals File "/.../spark/python/pyspark/streaming/tests.py", line 1572, in % kinesis_asl_assembly_dir) + NameError: name 'kinesis_asl_assembly_dir' is not defined ``` After: ``` Skipped test_flume_stream (enable by setting environment variable ENABLE_FLUME_TESTS=1Skipped test_kafka_stream (enable by setting environment variable ENABLE_KAFKA_0_8_TESTS=1Traceback (most recent call last): File "/usr/local/Cellar/python/2.7.14_3/Frameworks/Python.framework/Versions/2.7/lib/python2.7/runpy.py", line 174, in _run_module_as_main "__main__", fname, loader, pkg_name) File "/usr/local/Cellar/python/2.7.14_3/Frameworks/Python.framework/Versions/2.7/lib/python2.7/runpy.py", line 72, in _run_code exec code in run_globals File "/.../spark/python/pyspark/streaming/tests.py", line 1576, in "You need to build Spark with 'build/sbt -Pkinesis-asl " Exception: Failed to find Spark Streaming Kinesis assembly jar in /.../spark/external/kinesis-asl-assembly. You need to build Spark with 'build/sbt -Pkinesis-asl assembly/package streaming-kinesis-asl-assembly/assembly'or 'build/mvn -Pkinesis-asl package' before running this test. ``` ## How was this patch tested? Manually tested. Author: hyukjinkwon Closes #20834 from HyukjinKwon/minor-variable. 
(cherry picked from commit 56e8f48a43eb51e8582db2461a585b13a771a00a) Signed-off-by: Takuya UESHIN commit 2e1e274ed9d7a30656555e71c68e7de34a336a8a Author: Sahil Takiar Date: 2018-03-16T00:04:39Z [SPARK-23658][LAUNCHER] InProcessAppHandle uses the wrong class in getLogger ## What changes were proposed in this pull request? Changed `Logger` in `InProcessAppHandle` to use `InProcessAppHandle` instead of `ChildProcAppHandle` Author: Sahil Takiar Closes #20815 from sahilTakiar/master. (cherry picked from commit 7618896e855579f111dd92cd76794a5672a087e5) Signed-off-by: Marcelo Vanzin commit 52a52d5d26fc1650e788eec62ce478c76f627470 Author: Marcelo Vanzin Date: 2018-03-16T00:12:01Z [SPARK-23671][CORE] Fix condition to enable the SHS thread pool. Author: Marce
[GitHub] spark issue #22516: [SPARK-25468]Highlight current page index in the history...
Github user wangyum commented on the issue: https://github.com/apache/spark/pull/22516 Yes. I hit this issue also.
[GitHub] spark issue #22263: [SPARK-25269][SQL] SQL interface support specify Storage...
Github user SparkQA commented on the issue: https://github.com/apache/spark/pull/22263 **[Test build #96426 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/96426/testReport)** for PR 22263 at commit [`865e7d1`](https://github.com/apache/spark/commit/865e7d10be99c5f0ccfc89b2fc208de91e810ded).
[GitHub] spark pull request #22263: [SPARK-25269][SQL] SQL interface support specify ...
Github user wangyum commented on a diff in the pull request: https://github.com/apache/spark/pull/22263#discussion_r219490742 --- Diff: sql/catalyst/src/main/antlr4/org/apache/spark/sql/catalyst/parser/SqlBase.g4 --- @@ -162,7 +162,8 @@ statement tableIdentifier partitionSpec? describeColName? #describeTable | REFRESH TABLE tableIdentifier #refreshTable | REFRESH (STRING | .*?) #refreshResource -| CACHE LAZY? TABLE tableIdentifier (AS? query)? #cacheTable +| CACHE LAZY? storageLevel=identifier? TABLE +tableIdentifier (AS? query)? #cacheTable --- End diff -- Thanks @maropu, I changed it to an option.
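The grammar tweak in this diff would accept a storage level as a bare identifier before `TABLE`; per the follow-up comment, the PR moved to an option-based form instead. A hedged sketch of both syntaxes (the accepted level names are assumed to follow Spark's `StorageLevel` constants, e.g. `MEMORY_ONLY`, `DISK_ONLY`):

```sql
-- Syntax as originally proposed in this diff: level as an identifier
CACHE LAZY MEMORY_ONLY TABLE t;

-- Option-based form the PR switched to after review (sketch)
CACHE TABLE t OPTIONS ('storageLevel' 'DISK_ONLY');
```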
[GitHub] spark issue #22263: [SPARK-25269][SQL] SQL interface support specify Storage...
Github user AmplabJenkins commented on the issue: https://github.com/apache/spark/pull/22263 Test PASSed. Refer to this link for build results (access rights to CI server needed): https://amplab.cs.berkeley.edu/jenkins//job/testing-k8s-prb-make-spark-distribution-unified/3343/ Test PASSed.
[GitHub] spark issue #22263: [SPARK-25269][SQL] SQL interface support specify Storage...
Github user AmplabJenkins commented on the issue: https://github.com/apache/spark/pull/22263 Merged build finished. Test PASSed.
[GitHub] spark issue #22461: [SPARK-25453][SQL][TEST] OracleIntegrationSuite IllegalA...
Github user AmplabJenkins commented on the issue: https://github.com/apache/spark/pull/22461 Merged build finished. Test FAILed.
[GitHub] spark issue #22461: [SPARK-25453][SQL][TEST] OracleIntegrationSuite IllegalA...
Github user AmplabJenkins commented on the issue: https://github.com/apache/spark/pull/22461 Test FAILed. Refer to this link for build results (access rights to CI server needed): https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/96416/ Test FAILed.
[GitHub] spark issue #22461: [SPARK-25453][SQL][TEST] OracleIntegrationSuite IllegalA...
Github user SparkQA commented on the issue: https://github.com/apache/spark/pull/22461 **[Test build #96416 has finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/96416/testReport)** for PR 22461 at commit [`9a948e5`](https://github.com/apache/spark/commit/9a948e5e59ccb6a39e9fa844273188ae331d5abd). * This patch **fails PySpark unit tests**. * This patch merges cleanly. * This patch adds no public classes.
[GitHub] spark issue #22494: [SPARK-22036][SQL][followup] add a new config for pickin...
Github user SparkQA commented on the issue: https://github.com/apache/spark/pull/22494 **[Test build #96425 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/96425/testReport)** for PR 22494 at commit [`ad79c56`](https://github.com/apache/spark/commit/ad79c56ca038fac6797814410d665110ef43e826).
[GitHub] spark issue #22494: [SPARK-22036][SQL][followup] add a new config for pickin...
Github user AmplabJenkins commented on the issue: https://github.com/apache/spark/pull/22494 Test PASSed. Refer to this link for build results (access rights to CI server needed): https://amplab.cs.berkeley.edu/jenkins//job/testing-k8s-prb-make-spark-distribution-unified/3342/ Test PASSed.
[GitHub] spark issue #22494: [SPARK-22036][SQL][followup] add a new config for pickin...
Github user AmplabJenkins commented on the issue: https://github.com/apache/spark/pull/22494 Merged build finished. Test PASSed.
[GitHub] spark issue #22494: [SPARK-22036][SQL][followup] add a new config for pickin...
Github user cloud-fan commented on the issue: https://github.com/apache/spark/pull/22494 Sorry, I screwed up my local branch and came to a wrong conclusion. Turning off the precise precision for integral literals can fix the regression, so it's better to add a config for it.
[GitHub] spark issue #22514: [SPARK-25271][SQL] Hive ctas commands should use data so...
Github user viirya commented on the issue: https://github.com/apache/spark/pull/22514 cc @cloud-fan @HyukjinKwon
[GitHub] spark issue #22494: [SPARK-22036][SQL][followup] add a new config for pickin...
Github user AmplabJenkins commented on the issue: https://github.com/apache/spark/pull/22494 Test PASSed. Refer to this link for build results (access rights to CI server needed): https://amplab.cs.berkeley.edu/jenkins//job/testing-k8s-prb-make-spark-distribution-unified/3341/ Test PASSed.
[GitHub] spark issue #22494: [SPARK-22036][SQL][followup] add a new config for pickin...
Github user AmplabJenkins commented on the issue: https://github.com/apache/spark/pull/22494 Merged build finished. Test PASSed.