[GitHub] spark issue #18268: [SPARK-21054] [SQL] Reset Command support reset specific...
Github user gatorsmile commented on the issue: https://github.com/apache/spark/pull/18268 ping @ericsahit --- If your project is set up for it, you can reply to this email and have your reply appear on GitHub as well. If your project does not have this feature enabled and wishes so, or if the feature is enabled but not working, please contact infrastructure at infrastruct...@apache.org or file a JIRA ticket with INFRA. --- - To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org For additional commands, e-mail: reviews-h...@spark.apache.org
[GitHub] spark issue #18309: [SPARK-21079] [SQL] Calculate total size of a partition ...
Github user gatorsmile commented on the issue: https://github.com/apache/spark/pull/18309 Could you submit a backport PR to 2.1? Thanks!
[GitHub] spark pull request #18309: [SPARK-21079] [SQL] Calculate total size of a par...
Github user asfgit closed the pull request at: https://github.com/apache/spark/pull/18309
[GitHub] spark issue #18309: [SPARK-21079] [SQL] Calculate total size of a partition ...
Github user gatorsmile commented on the issue: https://github.com/apache/spark/pull/18309 Thanks! Merging to master/2.2
[GitHub] spark pull request #18309: [SPARK-21079] [SQL] Calculate total size of a par...
Github user gatorsmile commented on a diff in the pull request: https://github.com/apache/spark/pull/18309#discussion_r123890894
--- Diff: sql/hive/src/test/scala/org/apache/spark/sql/hive/StatisticsSuite.scala ---
@@ -128,6 +129,77 @@ class StatisticsSuite extends StatisticsCollectionTestBase with TestHiveSingleto
         TableIdentifier("tempTable"), ignoreIfNotExists = true, purge = false)
   }
+  test("SPARK-21079 - analyze table with location different than that of individual partitions") {
+    def queryTotalSize(tableName: String): BigInt =
+      spark.table(tableName).queryExecution.analyzed.stats(conf).sizeInBytes
+
+    val tableName = "analyzeTable_part"
+    withTable(tableName) {
+      withTempPath { path =>
+        sql(s"CREATE TABLE $tableName (key STRING, value STRING) PARTITIONED BY (ds STRING)")
+
+        val partitionDates = List("2010-01-01", "2010-01-02", "2010-01-03")
+        partitionDates.foreach { ds =>
+          sql(s"INSERT INTO TABLE $tableName PARTITION (ds='$ds') SELECT * FROM src")
+        }
+
+        sql(s"ALTER TABLE $tableName SET LOCATION '$path'")
+
+        sql(s"ANALYZE TABLE $tableName COMPUTE STATISTICS noscan")
+
+        assert(queryTotalSize(tableName) === BigInt(17436))
+      }
+    }
+  }
+
+  test("SPARK-21079 - analyze partitioned table with only a subset of partitions visible") {
+    def queryTotalSize(tableName: String): BigInt =
+      spark.table(tableName).queryExecution.analyzed.stats(conf).sizeInBytes
+
+    val sourceTableName = "analyzeTable_part"
+    val tableName = "analyzeTable_part_vis"
+    withTable(sourceTableName, tableName) {
+      withTempPath { path =>
+        // Create a table with 3 partitions all located under a single top-level directory 'path'
+        sql(
+          s"""
+             |CREATE TABLE $sourceTableName (key STRING, value STRING)
+             |PARTITIONED BY (ds STRING)
+             |LOCATION '$path'
+           """.stripMargin)
+
+        val partitionDates = List("2010-01-01", "2010-01-02", "2010-01-03")
+        partitionDates.foreach { ds =>
+          sql(
+            s"""
+               |INSERT INTO TABLE $sourceTableName PARTITION (ds='$ds')
+               |SELECT * FROM src
+             """.stripMargin)
+        }
+
+        // Create another table referring to the same location
+        sql(
+          s"""
+             |CREATE TABLE $tableName (key STRING, value STRING)
+             |PARTITIONED BY (ds STRING)
+             |LOCATION '$path'
+           """.stripMargin)
+
+        // Register only one of the partitions found on disk
+        val ds = partitionDates.head
+        sql(s"ALTER TABLE $tableName ADD PARTITION (ds='$ds')").collect()
+
+        // Analyze original table - expect 3 partitions
+        sql(s"ANALYZE TABLE $sourceTableName COMPUTE STATISTICS noscan")
+        assert(queryTotalSize(sourceTableName) === BigInt(3 * 5812))
+
+        // Analyze partial-copy table - expect only 1 partition
+        sql(s"ANALYZE TABLE $tableName COMPUTE STATISTICS noscan")
+        assert(queryTotalSize(tableName) === BigInt(5812))
--- End diff --
I am afraid these hard-coded values might not hold on some other platforms. We might need to adjust the approach. However, let me first merge this. If needed, we can submit a follow-up PR. cc @cloud-fan @wzhfy
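Editor's sketch of the concern raised in the comment above: one way to avoid platform-dependent byte counts is to derive the expected total from a measured per-partition size. The snippet below is only an illustration on plain files, not on Spark statistics; `PartitionSizeSketch`, `dirSize`, `writePartitions`, and the `part-00000` file name are hypothetical names introduced here, and nothing in the thread says the follow-up PR would take this form.

```scala
import java.io.File
import java.nio.file.Files

object PartitionSizeSketch {
  // Recursively sums the sizes of all regular files under `f`.
  def dirSize(f: File): Long =
    if (f.isFile) f.length()
    else Option(f.listFiles()).map(_.map(dirSize).sum).getOrElse(0L)

  // Writes the same payload into one ds=<date> directory per date
  // under a fresh temporary root, and returns that root.
  def writePartitions(dates: Seq[String], payload: String): File = {
    val root = Files.createTempDirectory("analyzeTable_part").toFile
    dates.foreach { ds =>
      val dir = new File(root, s"ds=$ds")
      dir.mkdirs()
      Files.write(new File(dir, "part-00000").toPath, payload.getBytes("UTF-8"))
    }
    root
  }
}
```

With identical data in every partition, a test could then assert `total == partitionCount * sizeOfOnePartition` rather than comparing against a literal such as 17436.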
[GitHub] spark issue #18309: [SPARK-21079] [SQL] Calculate total size of a partition ...
Github user gatorsmile commented on the issue: https://github.com/apache/spark/pull/18309 LGTM
[GitHub] spark issue #18413: [SPARK-21205][SQL] pmod(number, 0) should be null.
Github user AmplabJenkins commented on the issue: https://github.com/apache/spark/pull/18413 Merged build finished. Test PASSed.
[GitHub] spark issue #18413: [SPARK-21205][SQL] pmod(number, 0) should be null.
Github user SparkQA commented on the issue: https://github.com/apache/spark/pull/18413
**[Test build #78572 has finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/78572/testReport)** for PR 18413 at commit [`ffd3bcb`](https://github.com/apache/spark/commit/ffd3bcbf09c239516f44c206ce8ce74fbcd4d712).
* This patch passes all tests.
* This patch merges cleanly.
* This patch adds no public classes.
[GitHub] spark issue #18413: [SPARK-21205][SQL] pmod(number, 0) should be null.
Github user AmplabJenkins commented on the issue: https://github.com/apache/spark/pull/18413 Test PASSed. Refer to this link for build results (access rights to CI server needed): https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/78572/ Test PASSed.
[GitHub] spark pull request #18414: Update status of application to RUNNING if execut...
Github user brad-kaiser commented on a diff in the pull request: https://github.com/apache/spark/pull/18414#discussion_r123889666
--- Diff: core/src/main/scala/org/apache/spark/deploy/master/Master.scala ---
@@ -566,10 +566,15 @@ private[deploy] class Master(
     // Update application state if executors are accepted and RUNNING
     apps.foreach(appInfo => {
       val app = idToApp(appInfo.id)
-      if(app.executors.filter(_._2.state != ExecutorState.RUNNING).isEmpty) {
-        app.state = ApplicationState.RUNNING
-        logInfo(s"Application :: ${app.id} status updated to RUNNING state")
-      }})
+      apps.foreach(f = appInfo => {
+        val app = idToApp(appInfo.id)
+        if (app.executors.size > 0 &&
--- End diff --
It might be clearer to write this condition like this:
```scala
if (app.executors.nonEmpty && app.executors.forall(_._2.state == ExecutorState.RUNNING))
```
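Editor's sketch of why the emptiness check matters in the condition discussed above: `filter(...).isEmpty` is also true when there are no executors at all, while `nonEmpty && forall(...)` is not. The `State` type below is a simplified stand-in for Spark's `ExecutorState`, and `ExecutorStateCheck` with its two methods is a hypothetical name introduced only for this illustration.

```scala
object ExecutorStateCheck {
  // Simplified stand-in for Spark's ExecutorState values (assumption).
  sealed trait State
  case object Running extends State
  case object Loading extends State

  // Shape of the removed condition: true when no executor is in a
  // non-RUNNING state, which includes the case of zero executors.
  def allRunningOld(executors: Map[Int, State]): Boolean =
    executors.filter(_._2 != Running).isEmpty

  // Shape of the suggested condition: at least one executor exists
  // and every executor is RUNNING.
  def allRunningNew(executors: Map[Int, State]): Boolean =
    executors.nonEmpty && executors.forall(_._2 == Running)
}
```

For an application with no executors, `allRunningOld` returns true and `allRunningNew` returns false. Both return false as soon as one executor is in a state such as LOADING.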
[GitHub] spark pull request #18414: Update status of application to RUNNING if execut...
Github user brad-kaiser commented on a diff in the pull request: https://github.com/apache/spark/pull/18414#discussion_r123889361
--- Diff: core/src/main/scala/org/apache/spark/deploy/master/Master.scala ---
@@ -566,10 +566,15 @@ private[deploy] class Master(
     // Update application state if executors are accepted and RUNNING
     apps.foreach(appInfo => {
       val app = idToApp(appInfo.id)
-      if(app.executors.filter(_._2.state != ExecutorState.RUNNING).isEmpty) {
-        app.state = ApplicationState.RUNNING
-        logInfo(s"Application :: ${app.id} status updated to RUNNING state")
-      }})
+      apps.foreach(f = appInfo => {
--- End diff --
It would probably be clearer to write foreach like this:
```scala
apps.foreach { appInfo =>
  ...
}
```
[GitHub] spark issue #18414: Update status of application to RUNNING if executors are...
Github user AmplabJenkins commented on the issue: https://github.com/apache/spark/pull/18414 Can one of the admins verify this patch?
[GitHub] spark pull request #18414: Update status of application to RUNNING if execut...
GitHub user srini-daruna opened a pull request: https://github.com/apache/spark/pull/18414

Update status of application to RUNNING if executors are accepted RUNNING SPARK-21169

## What changes were proposed in this pull request?

In Spark HA, after the active master fails, a standby master is chosen and the workers and applications are re-registered with the new master. However, the application state does not move from WAITING to RUNNING. This change checks applications after recovery and, if all executors are RUNNING, moves the application to the RUNNING state.

In the method completeRecovery in the org.apache.spark.deploy.master.Master class, where cleanup of the workers and applications is done, I have added code to move the application to the RUNNING state if the application has at least one executor and all of its executors are in RUNNING status. In some cases executors will be in LOADING status, but those cannot be counted when changing the application state to RUNNING, because executors may also be in LOADING status due to resource unavailability.

## How was this patch tested?

To check the existing bug:
1) Created a ZooKeeper cluster.
2) Configured Spark with recovery mode zookeeper and updated spark-env.sh with the recovery mode settings.
3) Updated spark-defaults on both worker and master with both masters: spark://:7077,:7077
4) Started Spark master 1, Spark master 2, and the workers, in that order.
5) master1 is ACTIVE and master2 showed as STANDBY.
6) Started a sample streaming application.
7) Killed spark-master1 and waited for the workers and applications to appear in master2. They appeared, and the job showed up in WAITING state.

To check the implemented fix:
1) Built spark-core.
2) Removed the spark-core jar in the SPARK_HOME/jars folder and replaced it with the newly built one.
3) Performed the same steps as above and checked the job status.
4) Checked the spark-master logs to ensure the log message got printed.

![master2_ui_after_recovery_with_fix](https://user-images.githubusercontent.com/5573733/27513117-c73cc10c-5928-11e7-89d6-039c1410b43c.png)
![master1_ui_with_fix](https://user-images.githubusercontent.com/5573733/27513121-c73e23f8-5928-11e7-995b-2ebda7aad86c.png)
![master2_ui_after_master1_is_killed_before_fix](https://user-images.githubusercontent.com/5573733/27513118-c73d7624-5928-11e7-8854-2f6a4aaceeb9.png)
![master2_ui_before_fix](https://user-images.githubusercontent.com/5573733/27513120-c73db698-5928-11e7-8684-9a6d8adeabac.png)
![master1_ui_before_fix](https://user-images.githubusercontent.com/5573733/27513119-c73d7e76-5928-11e7-90c1-4a0151e6094b.png)

You can merge this pull request into a Git repository by running:
$ git pull https://github.com/srini-daruna/spark master

Alternatively you can review and apply these changes as the patch at: https://github.com/apache/spark/pull/18414.patch

To close this pull request, make a commit to your master/trunk branch with (at least) the following in the commit message: This closes #18414

commit e48a0b002d128128c2b351b492de7f36dfcc67a9
Author: Daruna, Srinivasarao
Date: 2017-06-25T00:18:42Z
Update status of application to RUNNING if executors are accepted and RUNNING SPARK-21169

commit 703742c2d937bca4459edab1b3aac3b01c788a39
Author: Daruna, Srinivasarao
Date: 2017-06-25T01:59:11Z
Adding changes necessary to move application state to RUNNING, if executors are accepted and running after recovery
[GitHub] spark pull request #18413: [SPARK-21205][SQL] pmod(number, 0) should be null...
GitHub user wangyum opened a pull request: https://github.com/apache/spark/pull/18413

[SPARK-21205][SQL] pmod(number, 0) should be null.

## What changes were proposed in this pull request?

Hive `pmod(3.13, 0)`:
```sql
hive> select pmod(3.13, 0);
OK
NULL
Time taken: 2.514 seconds, Fetched: 1 row(s)
hive>
```

Spark `mod(3.13, 0)`:
```sql
spark-sql> select mod(3.13, 0);
NULL
spark-sql>
```

But the Spark `pmod(3.13, 0)`:
```sql
spark-sql> select pmod(3.13, 0);
17/06/25 09:35:58 ERROR SparkSQLDriver: Failed in [select pmod(3.13, 0)]
java.lang.NullPointerException
	at org.apache.spark.sql.catalyst.expressions.Pmod.pmod(arithmetic.scala:504)
	at org.apache.spark.sql.catalyst.expressions.Pmod.nullSafeEval(arithmetic.scala:432)
	at org.apache.spark.sql.catalyst.expressions.BinaryExpression.eval(Expression.scala:419)
	at org.apache.spark.sql.catalyst.expressions.UnaryExpression.eval(Expression.scala:323)
...
```

This PR makes `pmod(number, 0)` return null.

## How was this patch tested?

unit tests

You can merge this pull request into a Git repository by running:
$ git pull https://github.com/wangyum/spark SPARK-21205

Alternatively you can review and apply these changes as the patch at: https://github.com/apache/spark/pull/18413.patch

To close this pull request, make a commit to your master/trunk branch with (at least) the following in the commit message: This closes #18413

commit ffd3bcbf09c239516f44c206ce8ce74fbcd4d712
Author: Yuming Wang
Date: 2017-06-25T01:45:28Z
pmod(number, 0) should be null.
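Editor's sketch of the behaviour the PR description asks for: a positive modulus that yields "no value" for a zero divisor instead of throwing. This is not the patch itself. `PmodSketch` is a hypothetical name, `Option[Double]` stands in for a nullable SQL value, and handling only `Double` is a simplification made for this illustration.

```scala
object PmodSketch {
  // Positive modulus: the result takes the sign of the divisor `n`.
  // A zero divisor yields None (standing in for SQL NULL) rather than failing.
  def pmod(a: Double, n: Double): Option[Double] =
    if (n == 0) None
    else {
      val r = a % n
      Some(if (r < 0) (r + n) % n else r)
    }
}
```

Under these assumptions `PmodSketch.pmod(3.13, 0)` is `None`, matching the Hive result shown above, while `PmodSketch.pmod(-7, 3)` is `Some(2.0)`.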
[GitHub] spark issue #18413: [SPARK-21205][SQL] pmod(number, 0) should be null.
Github user SparkQA commented on the issue: https://github.com/apache/spark/pull/18413 **[Test build #78572 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/78572/testReport)** for PR 18413 at commit [`ffd3bcb`](https://github.com/apache/spark/commit/ffd3bcbf09c239516f44c206ce8ce74fbcd4d712).
[GitHub] spark issue #18400: [SPARK-21188][CORE] releaseAllLocksForTask should synchr...
Github user AmplabJenkins commented on the issue: https://github.com/apache/spark/pull/18400 Test FAILed. Refer to this link for build results (access rights to CI server needed): https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/78571/ Test FAILed.
[GitHub] spark issue #18400: [SPARK-21188][CORE] releaseAllLocksForTask should synchr...
Github user AmplabJenkins commented on the issue: https://github.com/apache/spark/pull/18400 Merged build finished. Test FAILed.
[GitHub] spark issue #18400: [SPARK-21188][CORE] releaseAllLocksForTask should synchr...
Github user SparkQA commented on the issue: https://github.com/apache/spark/pull/18400
**[Test build #78571 has finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/78571/testReport)** for PR 18400 at commit [`5cdd328`](https://github.com/apache/spark/commit/5cdd328ee9a32969377cbdbfea229cc364dbee17).
* This patch **fails PySpark pip packaging tests**.
* This patch merges cleanly.
* This patch adds no public classes.
[GitHub] spark issue #18410: [SPARK-20971][SS] purge metadata log in FileStreamSource
Github user CodingCat commented on the issue: https://github.com/apache/spark/pull/18410 @zsxwing would you mind taking a look at this PR? What does this pip packaging test failure mean? Is it a flaky test?
[GitHub] spark issue #18410: [SPARK-20971][SS] purge metadata log in FileStreamSource
Github user AmplabJenkins commented on the issue: https://github.com/apache/spark/pull/18410 Test FAILed. Refer to this link for build results (access rights to CI server needed): https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/78570/ Test FAILed.
[GitHub] spark issue #18410: [SPARK-20971][SS] purge metadata log in FileStreamSource
Github user AmplabJenkins commented on the issue: https://github.com/apache/spark/pull/18410 Merged build finished. Test FAILed.
[GitHub] spark issue #18410: [SPARK-20971][SS] purge metadata log in FileStreamSource
Github user SparkQA commented on the issue: https://github.com/apache/spark/pull/18410
**[Test build #78570 has finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/78570/testReport)** for PR 18410 at commit [`b37ec11`](https://github.com/apache/spark/commit/b37ec112e01880e3d67d81972bae33487763c742).
* This patch **fails PySpark pip packaging tests**.
* This patch merges cleanly.
* This patch adds no public classes.
[GitHub] spark issue #18384: [SPARK-21170] [CORE] Utils.tryWithSafeFinallyAndFailureC...
Github user SparkQA commented on the issue: https://github.com/apache/spark/pull/18384
**[Test build #3813 has finished](https://amplab.cs.berkeley.edu/jenkins/job/NewSparkPullRequestBuilder/3813/testReport)** for PR 18384 at commit [`15b77df`](https://github.com/apache/spark/commit/15b77dfdaa9fbffe6c87e5615d2b298c7ea54c7b).
* This patch **fails Spark unit tests**.
* This patch merges cleanly.
* This patch adds no public classes.
[GitHub] spark issue #18400: [SPARK-21188][CORE] releaseAllLocksForTask should synchr...
Github user SparkQA commented on the issue: https://github.com/apache/spark/pull/18400 **[Test build #78571 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/78571/testReport)** for PR 18400 at commit [`5cdd328`](https://github.com/apache/spark/commit/5cdd328ee9a32969377cbdbfea229cc364dbee17).
[GitHub] spark issue #18400: [SPARK-21188][CORE] releaseAllLocksForTask should synchr...
Github user JoshRosen commented on the issue: https://github.com/apache/spark/pull/18400 jenkins retest this please
[GitHub] spark issue #18400: [SPARK-21188][CORE] releaseAllLocksForTask should synchr...
Github user JoshRosen commented on the issue: https://github.com/apache/spark/pull/18400 I actually asked @liufengdb to submit this PR after he spotted this problem while debugging a test. I did a bit of digging and spotted where this bug was introduced: in https://github.com/JoshRosen/spark/commit/b9d6e181db0c7293b0b00b3a679716f652df936f, an intermediate commit in an old PR, this was introduced when refactoring one map from a thread-safe LoadingCache (where entries were implicitly created) to an explicit map with explicit registration (which used synchronization). In making this change I incorrectly preserved a bad synchronization pattern which was unnecessarily fine-grained. Therefore this looks like the right change to me, so I'm going to retest and merge.
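Editor's sketch of the pattern described in the comment above: an explicit map with explicit registration, where every operation, including the release of all locks for a task, runs under one registry-wide monitor instead of a finer-grained lock. `TaskLockRegistry` and its methods are hypothetical and heavily simplified. This is not Spark's block manager code, and the actual change in the PR is not shown in this thread.

```scala
import scala.collection.mutable

// Hypothetical lock registry used only for illustration.
class TaskLockRegistry {
  // Block ids currently held by each registered task.
  private val locksByTask = mutable.HashMap.empty[Long, mutable.Set[String]]

  // Explicit registration: entries are never created implicitly on lookup.
  def registerTask(taskId: Long): Unit = synchronized {
    locksByTask.getOrElseUpdate(taskId, mutable.Set.empty[String])
    ()
  }

  // Records a held lock; fails if the task was never registered.
  def acquire(taskId: Long, blockId: String): Unit = synchronized {
    locksByTask(taskId) += blockId
    ()
  }

  // The whole read-and-remove runs under the same registry-wide monitor,
  // so it cannot interleave with registerTask or acquire.
  def releaseAllLocksForTask(taskId: Long): Seq[String] = synchronized {
    locksByTask.remove(taskId).map(_.toSeq.sorted).getOrElse(Seq.empty)
  }
}
```

The design choice illustrated is to keep one coarse lock around all accesses to the shared map. That is simpler to reason about than locking individual entries, at the cost of some concurrency.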
[GitHub] spark issue #18410: [SPARK-20971][SS] purge metadata log in FileStreamSource
Github user SparkQA commented on the issue: https://github.com/apache/spark/pull/18410 **[Test build #78570 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/78570/testReport)** for PR 18410 at commit [`b37ec11`](https://github.com/apache/spark/commit/b37ec112e01880e3d67d81972bae33487763c742).
[GitHub] spark issue #18410: [SPARK-20971][SS] purge metadata log in FileStreamSource
Github user CodingCat commented on the issue: https://github.com/apache/spark/pull/18410 retest this please
[GitHub] spark issue #18384: [SPARK-21170] [CORE] Utils.tryWithSafeFinallyAndFailureC...
Github user SparkQA commented on the issue: https://github.com/apache/spark/pull/18384 **[Test build #3813 has started](https://amplab.cs.berkeley.edu/jenkins/job/NewSparkPullRequestBuilder/3813/testReport)** for PR 18384 at commit [`15b77df`](https://github.com/apache/spark/commit/15b77dfdaa9fbffe6c87e5615d2b298c7ea54c7b).
[GitHub] spark pull request #18106: [SPARK-20754][SQL] Support TRUNC (number)
Github user felixcheung commented on a diff in the pull request: https://github.com/apache/spark/pull/18106#discussion_r123884847
--- Diff: python/pyspark/sql/functions.py ---
@@ -1028,20 +1028,28 @@ def to_timestamp(col, format=None):
 @since(1.5)
-def trunc(date, format):
+def trunc(data, format):
     """
-    Returns date truncated to the unit specified by the format.
+    Returns date truncated to the unit specified by the format or
+    number truncated by specified decimal places.
     :param format: 'year', '', 'yy' or 'month', 'mon', 'mm'
     >>> df = spark.createDataFrame([('1997-02-28',)], ['d'])
-    >>> df.select(trunc(df.d, 'year').alias('year')).collect()
+    >>> df.select(trunc(to_date(df.d), 'year').alias('year')).collect()
     [Row(year=datetime.date(1997, 1, 1))]
-    >>> df.select(trunc(df.d, 'mon').alias('month')).collect()
+    >>> df.select(trunc(to_date(df.d), 'mon').alias('month')).collect()
--- End diff --
This could be a bigger problem :)
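Editor's sketch of the numeric half of the new docstring ("number truncated by specified decimal places"), using plain Scala `BigDecimal` rather than any Spark API. `TruncSketch.truncNumber` is a hypothetical helper, and truncating toward zero is an assumption made here, since the thread does not spell out the rounding behaviour of the proposed TRUNC (number).

```scala
import scala.math.BigDecimal.RoundingMode

object TruncSketch {
  // Keeps `scale` digits after the decimal point and discards the rest,
  // truncating toward zero (assumed semantics, see the note above).
  def truncNumber(value: BigDecimal, scale: Int): BigDecimal =
    value.setScale(scale, RoundingMode.DOWN)
}
```

Under that assumption, `TruncSketch.truncNumber(BigDecimal("1234.567"), 2)` gives 1234.56. The date form of `trunc` in the docstring is a separate overload, which is the ambiguity the review comment points at.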
[GitHub] spark pull request #18106: [SPARK-20754][SQL] Support TRUNC (number)
Github user felixcheung commented on a diff in the pull request: https://github.com/apache/spark/pull/18106#discussion_r123884830

--- Diff: R/pkg/tests/fulltests/test_sparkSQL.R ---
    @@ -1382,8 +1382,8 @@ test_that("column functions", {
       c20 <- to_timestamp(c) + to_timestamp(c, "") + to_date(c, "")
       c21 <- posexplode_outer(c) + explode_outer(c)
       c22 <- not(c)
    -  c23 <- trunc(c, "year") + trunc(c, "") + trunc(c, "yy") +
    -    trunc(c, "month") + trunc(c, "mon") + trunc(c, "mm")
    +  c23 <- trunc(to_date(c), "year") + trunc(to_date(c), "") + trunc(to_date(c), "yy") +
    +    trunc(to_date(c), "month") + trunc(to_date(c), "mon") + trunc(to_date(c), "mm")
--- End diff --

That's a good point. Fortunately (?) trunc was only added to R in 2.3.0, so I think we need to make sure (manually, and by adding a unit test) that trunc works on both date columns and numeric columns.
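To make the date-versus-number question in the two comments above concrete, here is a minimal Python sketch of a `trunc` that dispatches on its input. It is not Spark's implementation: `trunc_sketch` is a hypothetical name, it operates on plain Python values rather than columns, and the accepted unit strings are limited to ones visible in the diff.

```python
import datetime
import math


def trunc_sketch(data, fmt):
    """Hypothetical stand-in for a trunc() that accepts dates or numbers.

    For a date, `fmt` names a unit ('year', 'yy', 'month', 'mon', 'mm').
    For a number, `fmt` is the number of decimal places to keep.
    """
    if isinstance(data, datetime.date):
        unit = str(fmt).lower()
        if unit in ("year", "yy"):
            return data.replace(month=1, day=1)
        if unit in ("month", "mon", "mm"):
            return data.replace(day=1)
        raise ValueError("unsupported date unit: %r" % (fmt,))
    if isinstance(data, (int, float)):
        scale = 10 ** int(fmt)
        # math.trunc drops the fractional part (rounds toward zero).
        return math.trunc(data * scale) / scale
    # Assumption of this sketch only: anything else, including a date
    # given as a string, is rejected instead of being cast implicitly.
    raise TypeError("trunc_sketch expects a date or a number, got %s" % type(data))


d = datetime.date(1997, 2, 28)
assert trunc_sketch(d, "year") == datetime.date(1997, 1, 1)
assert trunc_sketch(d, "mon") == datetime.date(1997, 2, 1)
assert trunc_sketch(3.14159, 2) == 3.14
```

In this sketch a string such as `'1997-02-28'` raises `TypeError`, which is one plausible reading of why the doctests and R tests in the diff now wrap the column in `to_date`. How Spark itself treats string input after this change is not established by the thread.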
[GitHub] spark issue #18253: [SPARK-18838][CORE] Introduce multiple queues in LiveLis...
Github user AmplabJenkins commented on the issue: https://github.com/apache/spark/pull/18253 Merged build finished. Test FAILed.
[GitHub] spark issue #18253: [SPARK-18838][CORE] Introduce multiple queues in LiveLis...
Github user AmplabJenkins commented on the issue: https://github.com/apache/spark/pull/18253 Test FAILed. Refer to this link for build results (access rights to CI server needed): https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/78569/ Test FAILed.
[GitHub] spark issue #18253: [SPARK-18838][CORE] Introduce multiple queues in LiveLis...
Github user SparkQA commented on the issue: https://github.com/apache/spark/pull/18253 **[Test build #78569 has finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/78569/testReport)** for PR 18253 at commit [`f423363`](https://github.com/apache/spark/commit/f423363d2712cfec7fd93f5ff2ef1a078408ce9f).
 * This patch **fails PySpark pip packaging tests**.
 * This patch merges cleanly.
 * This patch adds no public classes.
[GitHub] spark issue #9518: [SPARK-11574][Core] Add metrics StatsD sink
Github user xflin commented on the issue: https://github.com/apache/spark/pull/9518 Well, the unit test indeed passed. Some other (not related) test failed.
[GitHub] spark issue #17583: [SPARK-20271]Add FuncTransformer to simplify custom tran...
Github user hhbyyh commented on the issue: https://github.com/apache/spark/pull/17583 The error looks unrelated.
[GitHub] spark issue #18253: [SPARK-18838][CORE] Introduce multiple queues in LiveLis...
Github user SparkQA commented on the issue: https://github.com/apache/spark/pull/18253 **[Test build #78569 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/78569/testReport)** for PR 18253 at commit [`f423363`](https://github.com/apache/spark/commit/f423363d2712cfec7fd93f5ff2ef1a078408ce9f).
[GitHub] spark pull request #18403: [SPARK-21193][PYTHON] Specify Pandas version in s...
Github user BryanCutler commented on a diff in the pull request: https://github.com/apache/spark/pull/18403#discussion_r123881043

--- Diff: python/setup.py ---
    @@ -199,7 +199,7 @@ def _supports_symlinks():
             extras_require={
                 'ml': ['numpy>=1.7'],
                 'mllib': ['numpy>=1.7'],
    -            'sql': ['pandas']
    +            'sql': ['pandas>=0.13.0']
--- End diff --

Thanks for researching this @hyukjinkwon! I opened a follow-up to add more type support. I can do related docs there and we could also discuss whether or not to add pyarrow to the setup.py file once that's complete.
[GitHub] spark issue #14285: [SPARK-16649][SQL] Push partition predicates down into m...
Github user lianhuiwang commented on the issue: https://github.com/apache/spark/pull/14285 @cloud-fan OK, I will close it. Thanks.
[GitHub] spark pull request #14285: [SPARK-16649][SQL] Push partition predicates down...
Github user lianhuiwang closed the pull request at: https://github.com/apache/spark/pull/14285
[GitHub] spark pull request #17222: [SPARK-19439][PYSPARK][SQL] PySpark's registerJav...
Github user cloud-fan commented on a diff in the pull request: https://github.com/apache/spark/pull/17222#discussion_r123878866

--- Diff: sql/hive/src/test/scala/org/apache/spark/sql/hive/execution/AggregationQuerySuite.scala ---
    @@ -20,16 +20,19 @@ package org.apache.spark.sql.hive.execution
     import scala.collection.JavaConverters._
     import scala.util.Random
    +import test.org.apache.spark.sql.MyDoubleAvg
    +import test.org.apache.spark.sql.MyDoubleSum
    +
     import org.apache.spark.sql._
     import org.apache.spark.sql.catalyst.expressions.UnsafeRow
     import org.apache.spark.sql.expressions.{MutableAggregationBuffer, UserDefinedAggregateFunction}
     import org.apache.spark.sql.functions._
    -import org.apache.spark.sql.hive.aggregate.{MyDoubleAvg, MyDoubleSum}
     import org.apache.spark.sql.hive.test.TestHiveSingleton
     import org.apache.spark.sql.internal.SQLConf
     import org.apache.spark.sql.test.SQLTestUtils
     import org.apache.spark.sql.types._
    +
--- End diff --

When we move it to sql/core, we can make it extend `SharedSQLContext`.
[GitHub] spark issue #18384: [SPARK-21170] [CORE] Utils.tryWithSafeFinallyAndFailureC...
Github user SparkQA commented on the issue: https://github.com/apache/spark/pull/18384 **[Test build #3812 has finished](https://amplab.cs.berkeley.edu/jenkins/job/NewSparkPullRequestBuilder/3812/testReport)** for PR 18384 at commit [`15b77df`](https://github.com/apache/spark/commit/15b77dfdaa9fbffe6c87e5615d2b298c7ea54c7b).
 * This patch **fails PySpark pip packaging tests**.
 * This patch merges cleanly.
 * This patch adds no public classes.
[GitHub] spark pull request #18412: [SPARK-21203] [SQL] Fix wrong results of insertio...
Github user asfgit closed the pull request at: https://github.com/apache/spark/pull/18412
[GitHub] spark issue #18412: [SPARK-21203] [SQL] Fix wrong results of insertion of Ar...
Github user cloud-fan commented on the issue: https://github.com/apache/spark/pull/18412 the test failure is unrelated, merging to master/2.2/2.1, thanks!
[GitHub] spark pull request #18412: [SPARK-21203] [SQL] Fix wrong results of insertio...
Github user cloud-fan commented on a diff in the pull request: https://github.com/apache/spark/pull/18412#discussion_r123878391

--- Diff: sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/expressions/Cast.scala ---
    @@ -482,15 +482,15 @@ case class Cast(child: Expression, dataType: DataType, timeZoneId: Option[String
           case (fromField, toField) => cast(fromField.dataType, toField.dataType)
         }
         // TODO: Could be faster?
    -    val newRow = new GenericInternalRow(from.fields.length)
         buildCast[InternalRow](_, row => {
    +      val newRow = new GenericInternalRow(from.fields.length)
           var i = 0
           while (i < row.numFields) {
             newRow.update(i,
               if (row.isNullAt(i)) null else castFuncs(i)(row.get(i, from.apply(i).dataType)))
             i += 1
           }
    -      newRow.copy()
--- End diff --

Instead of sharing one row instance and copying it every time, I think it makes more sense to create a new row every time. Besides, I also had a PR to fix this: https://github.com/apache/spark/pull/15082. Maybe I should reopen it...
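The trade-off in this comment (one shared row captured by the closure versus a new row per call) can be sketched in plain Python. This is an illustration only: lists stand in for `GenericInternalRow`, the function names are hypothetical, and the shared variant omits the `copy()` call that the original Scala code made, so that the aliasing hazard shows in its simplest form.

```python
def make_cast_shared(cast_funcs):
    """Builds a row caster that reuses ONE output buffer for every call."""
    new_row = [None] * len(cast_funcs)  # allocated once, captured by the closure

    def cast(row):
        for i, value in enumerate(row):
            new_row[i] = None if value is None else cast_funcs[i](value)
        return new_row  # every caller receives the same list object

    return cast


def make_cast_fresh(cast_funcs):
    """Builds a row caster that allocates a new output row on each call."""

    def cast(row):
        new_row = [None] * len(cast_funcs)  # allocated per call
        for i, value in enumerate(row):
            new_row[i] = None if value is None else cast_funcs[i](value)
        return new_row

    return cast


shared = make_cast_shared([str, str])
first = shared([1, 2])
second = shared([3, 4])
assert first is second          # both names point at the same buffer
assert first == ["3", "4"]      # the first result was overwritten

fresh = make_cast_fresh([str, str])
first = fresh([1, 2])
second = fresh([3, 4])
assert first == ["1", "2"]      # earlier results stay intact
```

With a shared buffer, correctness depends entirely on the copy taken after each call being deep enough. Allocating per call removes that dependency at the cost of one allocation per row.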
[GitHub] spark issue #18412: [SPARK-21203] [SQL] Fix wrong results of insertion of Ar...
Github user AmplabJenkins commented on the issue: https://github.com/apache/spark/pull/18412 Test FAILed. Refer to this link for build results (access rights to CI server needed): https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/78568/ Test FAILed.
[GitHub] spark issue #18412: [SPARK-21203] [SQL] Fix wrong results of insertion of Ar...
Github user AmplabJenkins commented on the issue: https://github.com/apache/spark/pull/18412 Merged build finished. Test FAILed.
[GitHub] spark issue #18412: [SPARK-21203] [SQL] Fix wrong results of insertion of Ar...
Github user SparkQA commented on the issue: https://github.com/apache/spark/pull/18412 **[Test build #78568 has finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/78568/testReport)** for PR 18412 at commit [`6ad657c`](https://github.com/apache/spark/commit/6ad657c211a1f06fad6f6a33cdcb77cc67141e27).
 * This patch **fails PySpark pip packaging tests**.
 * This patch merges cleanly.
 * This patch adds no public classes.
[GitHub] spark issue #16924: [SPARK-19531] Send UPDATE_LENGTH for Spark History servi...
Github user cloud-fan commented on the issue: https://github.com/apache/spark/pull/16924 I'm confused: how can your customer see a still-running application in the history server?
[GitHub] spark pull request #17222: [SPARK-19439][PYSPARK][SQL] PySpark's registerJav...
Github user zjffdu commented on a diff in the pull request: https://github.com/apache/spark/pull/17222#discussion_r123876794

--- Diff: sql/hive/src/test/scala/org/apache/spark/sql/hive/execution/AggregationQuerySuite.scala ---
    @@ -20,16 +20,19 @@ package org.apache.spark.sql.hive.execution
     import scala.collection.JavaConverters._
     import scala.util.Random
    +import test.org.apache.spark.sql.MyDoubleAvg
    +import test.org.apache.spark.sql.MyDoubleSum
    +
     import org.apache.spark.sql._
     import org.apache.spark.sql.catalyst.expressions.UnsafeRow
     import org.apache.spark.sql.expressions.{MutableAggregationBuffer, UserDefinedAggregateFunction}
     import org.apache.spark.sql.functions._
    -import org.apache.spark.sql.hive.aggregate.{MyDoubleAvg, MyDoubleSum}
     import org.apache.spark.sql.hive.test.TestHiveSingleton
     import org.apache.spark.sql.internal.SQLConf
     import org.apache.spark.sql.test.SQLTestUtils
     import org.apache.spark.sql.types._
    +
--- End diff --

It depends on some hive stuff (`TestHiveSingleton`), so I guess it is intended to be put in sql/hive.
[GitHub] spark pull request #18412: [SPARK-21203] [SQL] Fix wrong results of insertio...
Github user hvanhovell commented on a diff in the pull request: https://github.com/apache/spark/pull/18412#discussion_r123876215

--- Diff: sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/expressions/Cast.scala ---
    @@ -482,15 +482,15 @@ case class Cast(child: Expression, dataType: DataType, timeZoneId: Option[String
           case (fromField, toField) => cast(fromField.dataType, toField.dataType)
         }
         // TODO: Could be faster?
    -    val newRow = new GenericInternalRow(from.fields.length)
         buildCast[InternalRow](_, row => {
    +      val newRow = new GenericInternalRow(from.fields.length)
           var i = 0
           while (i < row.numFields) {
             newRow.update(i,
               if (row.isNullAt(i)) null else castFuncs(i)(row.get(i, from.apply(i).dataType)))
             i += 1
           }
    -      newRow.copy()
--- End diff --

Isn't it better to just fix `GenericInternalRow.copy`? I think I broke it when I removed `MutableRow`.
[GitHub] spark issue #18412: [SPARK-21203] [SQL] Fix wrong results of insertion of Ar...
Github user SparkQA commented on the issue: https://github.com/apache/spark/pull/18412 **[Test build #78568 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/78568/testReport)** for PR 18412 at commit [`6ad657c`](https://github.com/apache/spark/commit/6ad657c211a1f06fad6f6a33cdcb77cc67141e27).
[GitHub] spark pull request #17222: [SPARK-19439][PYSPARK][SQL] PySpark's registerJav...
Github user cloud-fan commented on a diff in the pull request: https://github.com/apache/spark/pull/17222#discussion_r123876192

--- Diff: sql/hive/src/test/java/org/apache/spark/sql/hive/JavaDataFrameSuite.java ---
    @@ -31,7 +31,7 @@
     import org.apache.spark.sql.expressions.UserDefinedAggregateFunction;
     import static org.apache.spark.sql.functions.*;
     import org.apache.spark.sql.hive.test.TestHive$;
    -import org.apache.spark.sql.hive.aggregate.MyDoubleSum;
    +import test.org.apache.spark.sql.MyDoubleSum;
     public class JavaDataFrameSuite {
--- End diff --

yea
[GitHub] spark pull request #17222: [SPARK-19439][PYSPARK][SQL] PySpark's registerJav...
Github user cloud-fan commented on a diff in the pull request: https://github.com/apache/spark/pull/17222#discussion_r123876191

--- Diff: sql/hive/src/test/scala/org/apache/spark/sql/hive/execution/AggregationQuerySuite.scala ---
    @@ -20,16 +20,19 @@ package org.apache.spark.sql.hive.execution
     import scala.collection.JavaConverters._
     import scala.util.Random
    +import test.org.apache.spark.sql.MyDoubleAvg
    +import test.org.apache.spark.sql.MyDoubleSum
    +
     import org.apache.spark.sql._
     import org.apache.spark.sql.catalyst.expressions.UnsafeRow
     import org.apache.spark.sql.expressions.{MutableAggregationBuffer, UserDefinedAggregateFunction}
     import org.apache.spark.sql.functions._
    -import org.apache.spark.sql.hive.aggregate.{MyDoubleAvg, MyDoubleSum}
     import org.apache.spark.sql.hive.test.TestHiveSingleton
     import org.apache.spark.sql.internal.SQLConf
     import org.apache.spark.sql.test.SQLTestUtils
     import org.apache.spark.sql.types._
    +
--- End diff --

yea, seems there is no reason to leave this suite in sql/hive
[GitHub] spark issue #18412: [SPARK-21203] [SQL] Fix wrong results of insertion of Ar...
Github user cloud-fan commented on the issue: https://github.com/apache/spark/pull/18412 retest this please
[GitHub] spark issue #18384: [SPARK-21170] [CORE] Utils.tryWithSafeFinallyAndFailureC...
Github user SparkQA commented on the issue: https://github.com/apache/spark/pull/18384 **[Test build #3812 has started](https://amplab.cs.berkeley.edu/jenkins/job/NewSparkPullRequestBuilder/3812/testReport)** for PR 18384 at commit [`15b77df`](https://github.com/apache/spark/commit/15b77dfdaa9fbffe6c87e5615d2b298c7ea54c7b).
[GitHub] spark issue #18405: [SPARK-21194][SQL][WIP] Fail the putNullmethod when cont...
Github user jinxing64 commented on the issue: https://github.com/apache/spark/pull/18405 cc @kiszk
[GitHub] spark issue #17222: [SPARK-19439][PYSPARK][SQL] PySpark's registerJavaFuncti...
Github user AmplabJenkins commented on the issue: https://github.com/apache/spark/pull/17222 Test FAILed. Refer to this link for build results (access rights to CI server needed): https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/78567/ Test FAILed.
[GitHub] spark issue #17222: [SPARK-19439][PYSPARK][SQL] PySpark's registerJavaFuncti...
Github user AmplabJenkins commented on the issue: https://github.com/apache/spark/pull/17222 Merged build finished. Test FAILed.
[GitHub] spark issue #17222: [SPARK-19439][PYSPARK][SQL] PySpark's registerJavaFuncti...
Github user SparkQA commented on the issue: https://github.com/apache/spark/pull/17222 **[Test build #78567 has finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/78567/testReport)** for PR 17222 at commit [`ad5d2c9`](https://github.com/apache/spark/commit/ad5d2c99be23746c557264d51fcfcd480f2c848c).
 * This patch **fails PySpark pip packaging tests**.
 * This patch merges cleanly.
 * This patch adds the following public classes _(experimental)_:
   * `throw new AnalysisException(s\"UDF class $`
   * `throw new AnalysisException(s\"It is invalid to implement multiple UDF interfaces, UDF class $`
   * ` throw new AnalysisException(s\"UDF class with $`
   * `throw new AnalysisException(s\"Can not instantiate class $`
   * ` case e: ClassNotFoundException => throw new AnalysisException(s\"Can not load class $`
   * `throw new AnalysisException(s\"class $className doesn't implement interface UserDefinedAggregateFunction\")`
   * ` case e: ClassNotFoundException => throw new AnalysisException(s\"Can not load class $`
   * `throw new AnalysisException(s\"Can not instantiate class $`
[GitHub] spark pull request #17227: [SPARK-19507][PySpark][SQL] Show field name in _v...
Github user HyukjinKwon commented on a diff in the pull request: https://github.com/apache/spark/pull/17227#discussion_r123875271

--- Diff: python/pyspark/sql/types.py ---
    @@ -1249,7 +1249,7 @@ def _infer_schema_type(obj, dataType):
     }
    -def _verify_type(obj, dataType, nullable=True):
    +def _verify_type(obj, dataType, nullable=True, name="obj"):
--- End diff --

I guess this is the only place where we print "obj"? If so, let's set `name=None`.
[GitHub] spark pull request #17227: [SPARK-19507][PySpark][SQL] Show field name in _v...
Github user HyukjinKwon commented on a diff in the pull request: https://github.com/apache/spark/pull/17227#discussion_r123875195

--- Diff: python/pyspark/sql/types.py ---
    @@ -1249,7 +1249,7 @@ def _infer_schema_type(obj, dataType):
     }
    -def _verify_type(obj, dataType, nullable=True):
    +def _verify_type(obj, dataType, nullable=True, name="obj"):
--- End diff --

Let's fix this case.

```python
>>> from pyspark.sql.types import *
>>> spark.createDataFrame(["a"], StringType()).printSchema()
```

```
root
 |-- value: string (nullable = true)
```

```python
>>> from pyspark.sql.types import *
>>> spark.createDataFrame(["a"], IntegerType()).printSchema()
```

```
Traceback (most recent call last):
  File "", line 1, in
  File ".../spark/python/pyspark/sql/session.py", line 526, in createDataFrame
    rdd, schema = self._createFromLocal(map(prepare, data), schema)
  File ".../spark/python/pyspark/sql/session.py", line 387, in _createFromLocal
    data = list(data)
  File ".../spark/python/pyspark/sql/session.py", line 516, in prepare
    verify_func(obj, dataType)
  File ".../spark/python/pyspark/sql/types.py", line 1326, in _verify_type
    % (name, dataType, obj, type(obj)))
TypeError: obj: IntegerType can not accept object 'a' in type
```

It sounds like "obj" should be "value". It looks like we should specify the name around https://github.com/dgingrich/spark/blob/topic-spark-19507-verify-types/python/pyspark/sql/session.py#L516.
[GitHub] spark pull request #17227: [SPARK-19507][PySpark][SQL] Show field name in _v...
Github user HyukjinKwon commented on a diff in the pull request: https://github.com/apache/spark/pull/17227#discussion_r123875088

--- Diff: python/pyspark/sql/types.py ---
    @@ -1249,7 +1249,7 @@ def _infer_schema_type(obj, dataType):
     }
    -def _verify_type(obj, dataType, nullable=True):
    +def _verify_type(obj, dataType, nullable=True, name="obj"):
--- End diff --

I meant this case:

```python
>>> from pyspark.sql.types import *
>>> spark.createDataFrame(["a"], StringType()).printSchema()
```

```
root
 |-- value: string (nullable = true)
```

```python
>>> from pyspark.sql.types import *
>>> spark.createDataFrame(["a"], IntegerType()).printSchema()
```

```
Traceback (most recent call last):
  File "", line 1, in
  File ".../spark/python/pyspark/sql/session.py", line 526, in createDataFrame
    rdd, schema = self._createFromLocal(map(prepare, data), schema)
  File ".../spark/python/pyspark/sql/session.py", line 387, in _createFromLocal
    data = list(data)
  File ".../spark/python/pyspark/sql/session.py", line 516, in prepare
    verify_func(obj, dataType)
  File ".../spark/python/pyspark/sql/types.py", line 1326, in _verify_type
    % (name, dataType, obj, type(obj)))
TypeError: obj: IntegerType can not accept object 'a' in type
```

It sounds like "obj" should be "value". It looks like we should specify the name around https://github.com/dgingrich/spark/blob/topic-spark-19507-verify-types/python/pyspark/sql/session.py#L516.
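The `name=None` suggestion from this review thread can be sketched as follows. This is a simplified stand-in and not the PySpark `_verify_type`: `verify_type_sketch` is a hypothetical name, it checks against plain Python types instead of Spark `DataType` objects, and the nullability message is invented for the illustration. The type-error message format mirrors the one visible in the traceback.

```python
def verify_type_sketch(obj, expected_type, nullable=True, name=None):
    """Raises if obj does not match expected_type.

    When `name` is given, it prefixes the error message so the caller can
    see which field failed; when it is None, no prefix is printed.
    """
    prefix = "" if name is None else "%s: " % name
    if obj is None:
        if not nullable:
            raise ValueError(prefix + "this field is not nullable, but got None")
        return
    if not isinstance(obj, expected_type):
        raise TypeError(
            "%s%s can not accept object %r in type %s"
            % (prefix, expected_type.__name__, obj, type(obj))
        )


# The caller that knows the column name passes it along, as suggested for
# the call site in session.py.
try:
    verify_type_sketch("a", int, name="value")
except TypeError as e:
    message = str(e)

assert message.startswith("value: int can not accept object 'a'")
```

Defaulting to `None` keeps the placeholder "obj" out of messages entirely, so a name appears only when a caller actually has one to report.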
[GitHub] spark pull request #17227: [SPARK-19507][PySpark][SQL] Show field name in _v...
Github user HyukjinKwon commented on a diff in the pull request: https://github.com/apache/spark/pull/17227#discussion_r123874870

--- Diff: python/pyspark/sql/tests.py ---
@@ -2367,6 +2380,157 @@ def range_frame_match():
         importlib.reload(window)

+
+class TypesTest(unittest.TestCase):
+
+    def test_verify_type_ok_nullable(self):
+        for obj, data_type in [
+                (None, IntegerType()),
+                (None, FloatType()),
+                (None, StringType()),
+                (None, StructType([]))]:
--- End diff --

Let's do this like ...

```python
types = [IntegerType(), FloatType(), StringType(), StructType([])]
for ...
```

if you don't mind. I think factoring the repeated value out of the loop is slightly better.
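The suggested loop shape can be sketched without a Spark installation. The four classes below are empty stand-ins that only borrow the names of the PySpark types, and `verify_nullable` is a hypothetical placeholder for the function under test; none of this is the code from the pull request.

```python
class IntegerType(object):
    pass


class FloatType(object):
    pass


class StringType(object):
    pass


class StructType(object):
    def __init__(self, fields):
        self.fields = fields


def verify_nullable(obj, data_type):
    # Hypothetical stand-in: a nullable field of any type accepts None.
    if obj is not None:
        raise TypeError("expected None for this sketch, got %r" % (obj,))


# The repeated None is factored out; only the part that varies is listed.
types = [IntegerType(), FloatType(), StringType(), StructType([])]
checked = []
for data_type in types:
    verify_nullable(None, data_type)
    checked.append(type(data_type).__name__)

print(checked)
```

Listing only the data types keeps the constant `None` in one place, so a reader sees at a glance what the loop actually varies.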
[GitHub] spark pull request #17227: [SPARK-19507][PySpark][SQL] Show field name in _v...
Github user HyukjinKwon commented on a diff in the pull request: https://github.com/apache/spark/pull/17227#discussion_r123874817 --- Diff: python/pyspark/sql/tests.py --- @@ -30,6 +30,19 @@ import functools import time import datetime +import traceback + +if sys.version_info[:2] <= (2, 6): --- End diff -- Yea, let's leave it then. Not a big deal.
[GitHub] spark issue #16422: [SPARK-17642] [SQL] support DESC EXTENDED/FORMATTED tabl...
Github user AmplabJenkins commented on the issue: https://github.com/apache/spark/pull/16422 Merged build finished. Test PASSed.
[GitHub] spark issue #16422: [SPARK-17642] [SQL] support DESC EXTENDED/FORMATTED tabl...
Github user AmplabJenkins commented on the issue: https://github.com/apache/spark/pull/16422 Test PASSed. Refer to this link for build results (access rights to CI server needed): https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/78563/ Test PASSed.
[GitHub] spark issue #16422: [SPARK-17642] [SQL] support DESC EXTENDED/FORMATTED tabl...
Github user SparkQA commented on the issue: https://github.com/apache/spark/pull/16422 **[Test build #78563 has finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/78563/testReport)** for PR 16422 at commit [`7c901ce`](https://github.com/apache/spark/commit/7c901ceecc74eb8b3bf1fc10e60a79401b30367b). * This patch passes all tests. * This patch merges cleanly. * This patch adds no public classes.
[GitHub] spark pull request #18399: [SPARK-21189][INFRA] Handle unknown error codes i...
Github user asfgit closed the pull request at: https://github.com/apache/spark/pull/18399
[GitHub] spark issue #18399: [SPARK-21189][INFRA] Handle unknown error codes in Jenki...
Github user srowen commented on the issue: https://github.com/apache/spark/pull/18399 Merged to master
[GitHub] spark issue #18412: [SPARK-21203] [SQL] Fix wrong results of insertion of Ar...
Github user AmplabJenkins commented on the issue: https://github.com/apache/spark/pull/18412 Merged build finished. Test FAILed.
[GitHub] spark issue #18412: [SPARK-21203] [SQL] Fix wrong results of insertion of Ar...
Github user AmplabJenkins commented on the issue: https://github.com/apache/spark/pull/18412 Test FAILed. Refer to this link for build results (access rights to CI server needed): https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/78562/ Test FAILed.
[GitHub] spark issue #18412: [SPARK-21203] [SQL] Fix wrong results of insertion of Ar...
Github user SparkQA commented on the issue: https://github.com/apache/spark/pull/18412 **[Test build #78562 has finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/78562/testReport)** for PR 18412 at commit [`6ad657c`](https://github.com/apache/spark/commit/6ad657c211a1f06fad6f6a33cdcb77cc67141e27). * This patch **fails PySpark pip packaging tests**. * This patch merges cleanly. * This patch adds no public classes.
[GitHub] spark issue #17222: [SPARK-19439][PYSPARK][SQL] PySpark's registerJavaFuncti...
Github user SparkQA commented on the issue: https://github.com/apache/spark/pull/17222 **[Test build #78567 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/78567/testReport)** for PR 17222 at commit [`ad5d2c9`](https://github.com/apache/spark/commit/ad5d2c99be23746c557264d51fcfcd480f2c848c).
[GitHub] spark issue #16924: [SPARK-19531] Send UPDATE_LENGTH for Spark History servi...
Github user dosoft commented on the issue: https://github.com/apache/spark/pull/16924 This issue has been reported by our customer as follows: "Subsequently connecting to spark thriftserver via beeline and running any MR job, it doesn't get reflected in spark history server UI even after the job completion. If we stop and start spark history server, then this job info gets displayed in UI. Shouldn't this UI get auto refreshed with info on completed jobs?"
[GitHub] spark issue #17222: [SPARK-19439][PYSPARK][SQL] PySpark's registerJavaFuncti...
Github user AmplabJenkins commented on the issue: https://github.com/apache/spark/pull/17222 Merged build finished. Test FAILed.
[GitHub] spark issue #17222: [SPARK-19439][PYSPARK][SQL] PySpark's registerJavaFuncti...
Github user SparkQA commented on the issue: https://github.com/apache/spark/pull/17222 **[Test build #78566 has finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/78566/testReport)** for PR 17222 at commit [`813c501`](https://github.com/apache/spark/commit/813c5014e5688aa1ede17042654d2f3163548c46). * This patch **fails Python style tests**. * This patch merges cleanly. * This patch adds the following public classes _(experimental)_: * `throw new AnalysisException(s\"UDF class $` * `throw new AnalysisException(s\"It is invalid to implement multiple UDF interfaces, UDF class $` * ` throw new AnalysisException(s\"UDF class with $` * `throw new AnalysisException(s\"Can not instantiate class $` * ` case e: ClassNotFoundException => throw new AnalysisException(s\"Can not load class $` * `throw new AnalysisException(s\"class $className doesn't implement interface UserDefinedAggregateFunction\")` * ` case e: ClassNotFoundException => throw new AnalysisException(s\"Can not load class $` * `throw new AnalysisException(s\"Can not instantiate class $`
[GitHub] spark issue #17222: [SPARK-19439][PYSPARK][SQL] PySpark's registerJavaFuncti...
Github user AmplabJenkins commented on the issue: https://github.com/apache/spark/pull/17222 Test FAILed. Refer to this link for build results (access rights to CI server needed): https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/78566/ Test FAILed.
[GitHub] spark issue #17222: [SPARK-19439][PYSPARK][SQL] PySpark's registerJavaFuncti...
Github user SparkQA commented on the issue: https://github.com/apache/spark/pull/17222 **[Test build #78566 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/78566/testReport)** for PR 17222 at commit [`813c501`](https://github.com/apache/spark/commit/813c5014e5688aa1ede17042654d2f3163548c46).
[GitHub] spark issue #17422: [SPARK-20087] Attach accumulators / metrics to 'TaskKill...
Github user jiangxb1987 commented on the issue: https://github.com/apache/spark/pull/17422 ok to test
[GitHub] spark issue #17222: [SPARK-19439][PYSPARK][SQL] PySpark's registerJavaFuncti...
Github user AmplabJenkins commented on the issue: https://github.com/apache/spark/pull/17222 Merged build finished. Test FAILed.
[GitHub] spark issue #17222: [SPARK-19439][PYSPARK][SQL] PySpark's registerJavaFuncti...
Github user AmplabJenkins commented on the issue: https://github.com/apache/spark/pull/17222 Test FAILed. Refer to this link for build results (access rights to CI server needed): https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/78565/ Test FAILed.
[GitHub] spark issue #17222: [SPARK-19439][PYSPARK][SQL] PySpark's registerJavaFuncti...
Github user SparkQA commented on the issue: https://github.com/apache/spark/pull/17222 **[Test build #78565 has finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/78565/testReport)** for PR 17222 at commit [`92e74cd`](https://github.com/apache/spark/commit/92e74cde16bbd68a22a37b27b1567f0fefa8fe4d). * This patch **fails Python style tests**. * This patch merges cleanly. * This patch adds the following public classes _(experimental)_: * `throw new AnalysisException(s\"UDF class $` * `throw new AnalysisException(s\"It is invalid to implement multiple UDF interfaces, UDF class $` * ` throw new AnalysisException(s\"UDF class with $` * `throw new AnalysisException(s\"Can not instantiate class $` * ` case e: ClassNotFoundException => throw new AnalysisException(s\"Can not load class $` * `throw new AnalysisException(s\"class $className doesn't implement interface UserDefinedAggregateFunction\")` * ` case e: ClassNotFoundException => throw new AnalysisException(s\"Can not load class $` * `throw new AnalysisException(s\"Can not instantiate class $`
[GitHub] spark issue #17222: [SPARK-19439][PYSPARK][SQL] PySpark's registerJavaFuncti...
Github user SparkQA commented on the issue: https://github.com/apache/spark/pull/17222 **[Test build #78565 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/78565/testReport)** for PR 17222 at commit [`92e74cd`](https://github.com/apache/spark/commit/92e74cde16bbd68a22a37b27b1567f0fefa8fe4d).
[GitHub] spark issue #17222: [SPARK-19439][PYSPARK][SQL] PySpark's registerJavaFuncti...
Github user AmplabJenkins commented on the issue: https://github.com/apache/spark/pull/17222 Test FAILed. Refer to this link for build results (access rights to CI server needed): https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/78564/ Test FAILed.
[GitHub] spark issue #17222: [SPARK-19439][PYSPARK][SQL] PySpark's registerJavaFuncti...
Github user AmplabJenkins commented on the issue: https://github.com/apache/spark/pull/17222 Merged build finished. Test FAILed.
[GitHub] spark issue #17222: [SPARK-19439][PYSPARK][SQL] PySpark's registerJavaFuncti...
Github user SparkQA commented on the issue: https://github.com/apache/spark/pull/17222 **[Test build #78564 has finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/78564/testReport)** for PR 17222 at commit [`a23a7c3`](https://github.com/apache/spark/commit/a23a7c38c41c82e4c141bbe323b917d494859e5a). * This patch **fails Python style tests**. * This patch merges cleanly. * This patch adds the following public classes _(experimental)_: * `throw new AnalysisException(s\"UDF class $` * `throw new AnalysisException(s\"It is invalid to implement multiple UDF interfaces, UDF class $` * ` throw new AnalysisException(s\"UDF class with $` * `throw new AnalysisException(s\"Can not instantiate class $` * ` case e: ClassNotFoundException => throw new AnalysisException(s\"Can not load class $` * `throw new AnalysisException(s\"class $className doesn't implement interface UserDefinedAggregateFunction\")` * ` case e: ClassNotFoundException => throw new AnalysisException(s\"Can not load class $` * `throw new AnalysisException(s\"Can not instantiate class $`
[GitHub] spark issue #17222: [SPARK-19439][PYSPARK][SQL] PySpark's registerJavaFuncti...
Github user SparkQA commented on the issue: https://github.com/apache/spark/pull/17222 **[Test build #78564 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/78564/testReport)** for PR 17222 at commit [`a23a7c3`](https://github.com/apache/spark/commit/a23a7c38c41c82e4c141bbe323b917d494859e5a).
[GitHub] spark issue #18412: [SPARK-21203] [SQL] Fix wrong results of insertion of Ar...
Github user SparkQA commented on the issue: https://github.com/apache/spark/pull/18412 **[Test build #78562 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/78562/testReport)** for PR 18412 at commit [`6ad657c`](https://github.com/apache/spark/commit/6ad657c211a1f06fad6f6a33cdcb77cc67141e27).
[GitHub] spark issue #16422: [SPARK-17642] [SQL] support DESC EXTENDED/FORMATTED tabl...
Github user SparkQA commented on the issue: https://github.com/apache/spark/pull/16422 **[Test build #78563 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/78563/testReport)** for PR 16422 at commit [`7c901ce`](https://github.com/apache/spark/commit/7c901ceecc74eb8b3bf1fc10e60a79401b30367b).
[GitHub] spark issue #18412: [SPARK-21203] [SQL] Fix wrong results of insertion of Ar...
Github user gatorsmile commented on the issue: https://github.com/apache/spark/pull/18412 retest this please
[GitHub] spark issue #18412: [SPARK-21203] [SQL] Fix wrong results of insertion of Ar...
Github user AmplabJenkins commented on the issue: https://github.com/apache/spark/pull/18412 Merged build finished. Test FAILed.
[GitHub] spark issue #18410: [SPARK-20971][SS] purge metadata log in FileStreamSource
Github user AmplabJenkins commented on the issue: https://github.com/apache/spark/pull/18410 Merged build finished. Test FAILed.
[GitHub] spark issue #18412: [SPARK-21203] [SQL] Fix wrong results of insertion of Ar...
Github user AmplabJenkins commented on the issue: https://github.com/apache/spark/pull/18412 Test FAILed. Refer to this link for build results (access rights to CI server needed): https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/78559/ Test FAILed.
[GitHub] spark issue #18410: [SPARK-20971][SS] purge metadata log in FileStreamSource
Github user AmplabJenkins commented on the issue: https://github.com/apache/spark/pull/18410 Test FAILed. Refer to this link for build results (access rights to CI server needed): https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/78560/ Test FAILed.
[GitHub] spark pull request #16422: [SPARK-17642] [SQL] support DESC EXTENDED/FORMATT...
Github user wzhfy commented on a diff in the pull request: https://github.com/apache/spark/pull/16422#discussion_r123872260 --- Diff: sql/core/src/main/scala/org/apache/spark/sql/execution/command/tables.scala --- @@ -619,6 +620,104 @@ case class DescribeTableCommand( } } +/** + * A command to list the info for a column, including name, data type, column stats and comment. + * This function creates a [[DescribeColumnCommand]] logical plan. + * + * The syntax of using this command in SQL is: + * {{{ + * DESCRIBE [EXTENDED|FORMATTED] table_name column_name; + * }}} + */ +case class DescribeColumnCommand( +table: TableIdentifier, +column: String, +isFormatted: Boolean) + extends RunnableCommand { + + override val output: Seq[Attribute] = { +// The displayed names are based on Hive. +// (Link for the corresponding Hive Jira: https://issues.apache.org/jira/browse/HIVE-7050) +if (isFormatted) { + Seq( +AttributeReference("col_name", StringType, nullable = false, + new MetadataBuilder().putString("comment", "name of the column").build())(), +AttributeReference("data_type", StringType, nullable = false, + new MetadataBuilder().putString("comment", "data type of the column").build())(), +AttributeReference("min", StringType, nullable = true, + new MetadataBuilder().putString("comment", "min value of the column").build())(), +AttributeReference("max", StringType, nullable = true, --- End diff -- @cloud-fan If we want to get the data type for the given column name, we need to get the CatalogTable like we do in `def run(sparkSession)`, but it seems we can't get that here in `output`?
[GitHub] spark issue #17222: [SPARK-19439][PYSPARK][SQL] PySpark's registerJavaFuncti...
Github user SparkQA commented on the issue: https://github.com/apache/spark/pull/17222 **[Test build #78561 has finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/78561/testReport)** for PR 17222 at commit [`da71c93`](https://github.com/apache/spark/commit/da71c938a401a2e11ba61a9afe05ba8c689b98b1). * This patch **fails Python style tests**. * This patch merges cleanly. * This patch adds the following public classes _(experimental)_: * `throw new AnalysisException(s\"UDF class $` * `throw new AnalysisException(s\"It is invalid to implement multiple UDF interfaces, UDF class $` * ` throw new AnalysisException(s\"UDF class with $` * `throw new AnalysisException(s\"Can not instantiate class $` * ` case e: ClassNotFoundException => throw new AnalysisException(s\"Can not load class $` * `throw new AnalysisException(s\"class $className doesn't implement interface UserDefinedAggregateFunction\")` * ` case e: ClassNotFoundException => throw new AnalysisException(s\"Can not load class $` * `throw new AnalysisException(s\"Can not instantiate class $`
[GitHub] spark issue #17222: [SPARK-19439][PYSPARK][SQL] PySpark's registerJavaFuncti...
Github user AmplabJenkins commented on the issue: https://github.com/apache/spark/pull/17222 Merged build finished. Test FAILed.
[GitHub] spark issue #17222: [SPARK-19439][PYSPARK][SQL] PySpark's registerJavaFuncti...
Github user AmplabJenkins commented on the issue: https://github.com/apache/spark/pull/17222 Test FAILed. Refer to this link for build results (access rights to CI server needed): https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/78561/ Test FAILed.
[GitHub] spark issue #17222: [SPARK-19439][PYSPARK][SQL] PySpark's registerJavaFuncti...
Github user SparkQA commented on the issue: https://github.com/apache/spark/pull/17222 **[Test build #78561 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/78561/testReport)** for PR 17222 at commit [`da71c93`](https://github.com/apache/spark/commit/da71c938a401a2e11ba61a9afe05ba8c689b98b1).