[GitHub] [spark] yaooqinn commented on pull request #29202: [SPARK-32406][SQL] Make RESET syntax support single configuration reset

2020-07-29 Thread GitBox
yaooqinn commented on pull request #29202: URL: https://github.com/apache/spark/pull/29202#issuecomment-665645902 thanks, @gatorsmile for your suggestion. I will raise a followup to address your comments. I will also pay more attention to the test coverage in my future PRs, thanks again.

[GitHub] [spark] SparkQA removed a comment on pull request #29289: [SPARK-32482][SS][TESTS] Eliminate deprecated poll(long) API calls to avoid infinite wait in tests

2020-07-29 Thread GitBox
SparkQA removed a comment on pull request #29289: URL: https://github.com/apache/spark/pull/29289#issuecomment-665634620 This is an automated message from the Apache Git Service. To respond to the message, please log on to

[GitHub] [spark] AmplabJenkins removed a comment on pull request #29262: [SPARK-32332][SQL] Support columnar exchanges

2020-07-29 Thread GitBox
AmplabJenkins removed a comment on pull request #29262: URL: https://github.com/apache/spark/pull/29262#issuecomment-665674253

[GitHub] [spark] AmplabJenkins removed a comment on pull request #29289: [SPARK-32482][SS][TESTS] Eliminate deprecated poll(long) API calls to avoid infinite wait in tests

2020-07-29 Thread GitBox
AmplabJenkins removed a comment on pull request #29289: URL: https://github.com/apache/spark/pull/29289#issuecomment-665635287

[GitHub] [spark] AmplabJenkins commented on pull request #29290: [SPARK-32484][SQL] Not accurate Log Info in BroadcastExchangeExec.scala

2020-07-29 Thread GitBox
AmplabJenkins commented on pull request #29290: URL: https://github.com/apache/spark/pull/29290#issuecomment-665634915

[GitHub] [spark] AmplabJenkins commented on pull request #29289: [SPARK-32482][SS][TESTS] Eliminate deprecated poll(long) API calls to avoid infinite wait in tests

2020-07-29 Thread GitBox
AmplabJenkins commented on pull request #29289: URL: https://github.com/apache/spark/pull/29289#issuecomment-665635287

[GitHub] [spark] SparkQA commented on pull request #29146: [WIP][SPARK-32257][SQL] Reports explicit errors for invalid usage of SET/RESET command

2020-07-29 Thread GitBox
SparkQA commented on pull request #29146: URL: https://github.com/apache/spark/pull/29146#issuecomment-665634641

[GitHub] [spark] srowen commented on pull request #29250: [SPARK-32449][ML][PySpark] Add summary to MultilayerPerceptronClassificationModel

2020-07-29 Thread GitBox
srowen commented on pull request #29250: URL: https://github.com/apache/spark/pull/29250#issuecomment-665715951 Merged to master

[GitHub] [spark] SparkQA removed a comment on pull request #29279: [SPARK-31418][FOLLOW-UP][MINOR] Fix log messages to print stage id in…

2020-07-29 Thread GitBox
SparkQA removed a comment on pull request #29279: URL: https://github.com/apache/spark/pull/29279#issuecomment-665681945 **[Test build #126776 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/126776/testReport)** for PR 29279 at commit

[GitHub] [spark] AmplabJenkins commented on pull request #29276: [SPARK-32470][CORE] Remove task result size check for shuffle map stage

2020-07-29 Thread GitBox
AmplabJenkins commented on pull request #29276: URL: https://github.com/apache/spark/pull/29276#issuecomment-665793376

[GitHub] [spark] tgravescs commented on pull request #29002: [SPARK-32175][CORE] Fix the order between initialization for ExecutorPlugin and starting heartbeat thread

2020-07-29 Thread GitBox
tgravescs commented on pull request #29002: URL: https://github.com/apache/spark/pull/29002#issuecomment-665674505 finally, tests pass, merged to master and branch-3.0. thanks @sarutak

[GitHub] [spark] AmplabJenkins removed a comment on pull request #29286: [WIP}[SPARK-21708][Build] Migrate build to sbt 1.x

2020-07-29 Thread GitBox
AmplabJenkins removed a comment on pull request #29286: URL: https://github.com/apache/spark/pull/29286#issuecomment-665621406 Can one of the admins verify this patch?

[GitHub] [spark] AmplabJenkins removed a comment on pull request #29067: [SPARK-32274][SQL] Make SQL cache serialization pluggable

2020-07-29 Thread GitBox
AmplabJenkins removed a comment on pull request #29067: URL: https://github.com/apache/spark/pull/29067#issuecomment-665691334

[GitHub] [spark] SparkQA removed a comment on pull request #29262: [SPARK-32332][SQL] Support columnar exchanges

2020-07-29 Thread GitBox
SparkQA removed a comment on pull request #29262: URL: https://github.com/apache/spark/pull/29262#issuecomment-665673398 **[Test build #126775 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/126775/testReport)** for PR 29262 at commit

[GitHub] [spark] SparkQA removed a comment on pull request #29146: [WIP][SPARK-32257][SQL] Reports explicit errors for invalid usage of SET/RESET command

2020-07-29 Thread GitBox
SparkQA removed a comment on pull request #29146: URL: https://github.com/apache/spark/pull/29146#issuecomment-665634641

[GitHub] [spark] AmplabJenkins removed a comment on pull request #29284: [SPARK-32479] Fix the slicing logic in createDataFrame when converting pandas dataframe to arrow table

2020-07-29 Thread GitBox
AmplabJenkins removed a comment on pull request #29284: URL: https://github.com/apache/spark/pull/29284#issuecomment-665582034

[GitHub] [spark] srowen commented on pull request #29259: [SPARK-29918][SQL][FOLLOWUP][TEST] Fix endianness issues in tests in RecordBinaryComparatorSuite

2020-07-29 Thread GitBox
srowen commented on pull request #29259: URL: https://github.com/apache/spark/pull/29259#issuecomment-665700938 Maybe just leave it as is and write a comment about what the resulting byte order is, for readers

[GitHub] [spark] dongjoon-hyun commented on pull request #29281: [SPARK-32476][CORE] ResourceAllocator.availableAddrs should be deterministic

2020-07-29 Thread GitBox
dongjoon-hyun commented on pull request #29281: URL: https://github.com/apache/spark/pull/29281#issuecomment-665704971

[GitHub] [spark] SparkQA commented on pull request #29234: [SPARK-32431][SQL] Check duplicate nested columns in read from in-built datasources

2020-07-29 Thread GitBox
SparkQA commented on pull request #29234: URL: https://github.com/apache/spark/pull/29234#issuecomment-665520179

[GitHub] [spark] revans2 commented on pull request #29067: [SPARK-32274][SQL] Make SQL cache serialization pluggable

2020-07-29 Thread GitBox
revans2 commented on pull request #29067: URL: https://github.com/apache/spark/pull/29067#issuecomment-665781333 The test that failed appears to be unrelated to any of my changes

[GitHub] [spark] MaxGekk commented on pull request #29145: [SPARK-32346][SQL] Support filters pushdown in Avro datasource

2020-07-29 Thread GitBox
MaxGekk commented on pull request #29145: URL: https://github.com/apache/spark/pull/29145#issuecomment-665625674 @gengliangwang Are you ok with this PR?

[GitHub] [spark] AmplabJenkins removed a comment on pull request #29281: [SPARK-32476][CORE] ResourceAllocator.availableAddrs should be deterministic

2020-07-29 Thread GitBox
AmplabJenkins removed a comment on pull request #29281: URL: https://github.com/apache/spark/pull/29281#issuecomment-665554923

[GitHub] [spark] AmplabJenkins commented on pull request #29292: [SPARK-30322][DOCS] Add stage level scheduling docs

2020-07-29 Thread GitBox
AmplabJenkins commented on pull request #29292: URL: https://github.com/apache/spark/pull/29292#issuecomment-665722763

[GitHub] [spark] prgitpr opened a new pull request #29285: Modify log info in BroadcastExchangeExec.scala

2020-07-29 Thread GitBox
prgitpr opened a new pull request #29285: URL: https://github.com/apache/spark/pull/29285 “Cannot broadcast the table that is larger than 8GB: ${dataSize >> 30} GB” should be modified to be more accurate.
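
A minimal sketch of what the `${dataSize >> 30}` expression in the quoted message computes, written in Python rather than Scala. It assumes only that `dataSize` is a byte count; the function names and sample sizes are hypothetical and are not taken from the pull request. A right shift by 30 is floor division by 2^30, so every size from 8 GiB up to just under 9 GiB is printed as `8`, which is one way such a message can read as imprecise.

```python
BYTES_PER_GIB_SHIFT = 30  # 1 GiB = 2**30 bytes


def reported_gb(data_size: int) -> int:
    """Whole GiB as `dataSize >> 30` would print it; the fraction is dropped."""
    return data_size >> BYTES_PER_GIB_SHIFT


def precise_gb(data_size: int) -> float:
    """The same size in GiB, keeping the fractional part."""
    return data_size / (1 << BYTES_PER_GIB_SHIFT)


if __name__ == "__main__":
    gib = 1 << BYTES_PER_GIB_SHIFT
    # Hypothetical sample sizes at and above an 8 GiB threshold.
    for size in (8 * gib, 8 * gib + gib // 2, 9 * gib - 1):
        print(
            f"{size} bytes -> shifted: {reported_gb(size)} GB, "
            f"precise: {precise_gb(size):.2f} GiB"
        )
```

Whether the pull request changes the shifted value, the stated limit, or both is not visible in this preview.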

[GitHub] [spark] AmplabJenkins removed a comment on pull request #29234: [SPARK-32431][SQL] Check duplicate nested columns in read from in-built datasources

2020-07-29 Thread GitBox
AmplabJenkins removed a comment on pull request #29234: URL: https://github.com/apache/spark/pull/29234#issuecomment-665520954

[GitHub] [spark] gaborgsomogyi commented on pull request #29289: [SPARK-32482][SS][TESTS] Eliminate deprecated poll(long) API calls to avoid infinite wait in tests

2020-07-29 Thread GitBox
gaborgsomogyi commented on pull request #29289: URL: https://github.com/apache/spark/pull/29289#issuecomment-665636580

[GitHub] [spark] prgitpr opened a new pull request #29290: [SPARK-32484][SQL] Not accurate Log Info in BroadcastExchangeExec.scala

2020-07-29 Thread GitBox
prgitpr opened a new pull request #29290: URL: https://github.com/apache/spark/pull/29290 s"Cannot broadcast the table that is larger than 8GB: ${dataSize >> 30} GB" is not accurate info, because 8GB is not accurate. ### What changes were proposed in this pull request?

[GitHub] [spark] SparkQA commented on pull request #29281: [SPARK-32476][CORE] ResourceAllocator.availableAddrs should be deterministic

2020-07-29 Thread GitBox
SparkQA commented on pull request #29281: URL: https://github.com/apache/spark/pull/29281#issuecomment-665553817

[GitHub] [spark] dongjoon-hyun closed pull request #29287: [SPARK-27830][CORE][UI][2.4] Show Spark version at app lists of Spark History UI

2020-07-29 Thread GitBox
dongjoon-hyun closed pull request #29287: URL: https://github.com/apache/spark/pull/29287

[GitHub] [spark] AmplabJenkins removed a comment on pull request #29146: [WIP][SPARK-32257][SQL] Reports explicit errors for invalid usage of SET/RESET command

2020-07-29 Thread GitBox
AmplabJenkins removed a comment on pull request #29146: URL: https://github.com/apache/spark/pull/29146#issuecomment-665635179

[GitHub] [spark] dongjoon-hyun edited a comment on pull request #29281: [SPARK-32476][CORE] ResourceAllocator.availableAddrs should be deterministic

2020-07-29 Thread GitBox
dongjoon-hyun edited a comment on pull request #29281: URL: https://github.com/apache/spark/pull/29281#issuecomment-665711025 @tgravescs, `assignedAddrs` is updated, too. Could you review this once more?

[GitHub] [spark] danzafar commented on pull request #28379: [SPARK-28040][SPARK-28070][R] Write type object s3

2020-07-29 Thread GitBox
danzafar commented on pull request #28379: URL: https://github.com/apache/spark/pull/28379#issuecomment-665751086 why not just coerce inputs to class `character`? ``` SparkR:::sql.default <- function (sqlQuery) { sparkSession <- getSparkSession() sdf <-

[GitHub] [spark] HyukjinKwon commented on a change in pull request #29292: [SPARK-30322][DOCS] Add stage level scheduling docs

2020-07-29 Thread GitBox
HyukjinKwon commented on a change in pull request #29292: URL: https://github.com/apache/spark/pull/29292#discussion_r462390312 ## File path: docs/configuration.md ## @@ -3028,3 +3028,10 @@ There are configurations available to request resources for the driver: sp Spark will

[GitHub] [spark] huaxingao commented on pull request #29250: [SPARK-32449][ML][PySpark] Add summary to MultilayerPerceptronClassificationModel

2020-07-29 Thread GitBox
huaxingao commented on pull request #29250: URL: https://github.com/apache/spark/pull/29250#issuecomment-665761465 Thanks!

[GitHub] [spark] tgravescs commented on a change in pull request #29281: [SPARK-32476][CORE] ResourceAllocator.availableAddrs should be deterministic

2020-07-29 Thread GitBox
tgravescs commented on a change in pull request #29281: URL: https://github.com/apache/spark/pull/29281#discussion_r462305035 ## File path: core/src/main/scala/org/apache/spark/resource/ResourceAllocator.scala ## @@ -56,7 +56,7 @@ trait ResourceAllocator { def

[GitHub] [spark] HyukjinKwon commented on pull request #29292: [SPARK-30322][DOCS] Add stage level scheduling docs

2020-07-29 Thread GitBox
HyukjinKwon commented on pull request #29292: URL: https://github.com/apache/spark/pull/29292#issuecomment-665744147 From my reading, looks good.

[GitHub] [spark] AmplabJenkins commented on pull request #29284: [SPARK-32479] Fix the slicing logic in createDataFrame when converting pandas dataframe to arrow table

2020-07-29 Thread GitBox
AmplabJenkins commented on pull request #29284: URL: https://github.com/apache/spark/pull/29284#issuecomment-665582034

[GitHub] [spark] AmplabJenkins removed a comment on pull request #29291: [SPARK-30276][SQL] Support Filter expression allows simultaneous use of DISTINCT

2020-07-29 Thread GitBox
AmplabJenkins removed a comment on pull request #29291: URL: https://github.com/apache/spark/pull/29291#issuecomment-665704776

[GitHub] [spark] liucht-inspur opened a new pull request #29287: [SPARK-27830][CORE][UI][2.4] Show Spark version at app lists of Spark History UI

2020-07-29 Thread GitBox
liucht-inspur opened a new pull request #29287: URL: https://github.com/apache/spark/pull/29287 ## What changes were proposed in this pull request? This PR aims to show Spark version at application lists of Spark History UI for branch-2.4. From the following, the first Version

[GitHub] [spark] AmplabJenkins removed a comment on pull request #29285: Modify log info in BroadcastExchangeExec.scala

2020-07-29 Thread GitBox
AmplabJenkins removed a comment on pull request #29285: URL: https://github.com/apache/spark/pull/29285#issuecomment-665612517 Can one of the admins verify this patch?

[GitHub] [spark] tgravescs commented on a change in pull request #29292: [SPARK-30322][DOCS] Add stage level scheduling docs

2020-07-29 Thread GitBox
tgravescs commented on a change in pull request #29292: URL: https://github.com/apache/spark/pull/29292#discussion_r462399890 ## File path: docs/configuration.md ## @@ -3028,3 +3028,10 @@ There are configurations available to request resources for the driver: sp Spark will

[GitHub] [spark] SparkQA commented on pull request #29289: [SPARK-32482][SS][TESTS] Eliminate deprecated poll(long) API calls to avoid infinite wait in tests

2020-07-29 Thread GitBox
SparkQA commented on pull request #29289: URL: https://github.com/apache/spark/pull/29289#issuecomment-665634620

[GitHub] [spark] SparkQA removed a comment on pull request #29234: [SPARK-32431][SQL] Check duplicate nested columns in read from in-built datasources

2020-07-29 Thread GitBox
SparkQA removed a comment on pull request #29234: URL: https://github.com/apache/spark/pull/29234#issuecomment-665520179

[GitHub] [spark] dongjoon-hyun closed pull request #29282: [SPARK-32477][CORE] JsonProtocol.accumulablesToJson should be deterministic

2020-07-29 Thread GitBox
dongjoon-hyun closed pull request #29282: URL: https://github.com/apache/spark/pull/29282

[GitHub] [spark] SparkQA removed a comment on pull request #29282: [SPARK-32477][CORE] JsonProtocol.accumulablesToJson should be deterministic

2020-07-29 Thread GitBox
SparkQA removed a comment on pull request #29282: URL: https://github.com/apache/spark/pull/29282#issuecomment-665481384 **[Test build #126762 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/126762/testReport)** for PR 29282 at commit

[GitHub] [spark] SparkQA commented on pull request #29262: [SPARK-32332][SQL] Support columnar exchanges

2020-07-29 Thread GitBox
SparkQA commented on pull request #29262: URL: https://github.com/apache/spark/pull/29262#issuecomment-665673398

[GitHub] [spark] AmplabJenkins removed a comment on pull request #29287: [SPARK-27830][CORE][UI][2.4] Show Spark version at app lists of Spark History UI

2020-07-29 Thread GitBox
AmplabJenkins removed a comment on pull request #29287: URL: https://github.com/apache/spark/pull/29287#issuecomment-665624526 Can one of the admins verify this patch?

[GitHub] [spark] SparkQA removed a comment on pull request #28841: [SPARK-31962][SQL] Provide modifiedAfter and modifiedBefore options when filtering from a batch-based file data source

2020-07-29 Thread GitBox
SparkQA removed a comment on pull request #28841: URL: https://github.com/apache/spark/pull/28841#issuecomment-665558362 **[Test build #126768 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/126768/testReport)** for PR 28841 at commit

[GitHub] [spark] prgitpr closed pull request #29288: [SPARK-32484][SQL] Not accurate Log Info in BroadcastExchangeExec.scala

2020-07-29 Thread GitBox
prgitpr closed pull request #29288: URL: https://github.com/apache/spark/pull/29288

[GitHub] [spark] SparkQA removed a comment on pull request #28761: [SPARK-25557][SQL] Nested column predicate pushdown for ORC

2020-07-29 Thread GitBox
SparkQA removed a comment on pull request #28761: URL: https://github.com/apache/spark/pull/28761#issuecomment-665477914 **[Test build #126761 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/126761/testReport)** for PR 28761 at commit

[GitHub] [spark] AmplabJenkins commented on pull request #29291: [SPARK-30276][SQL] Support Filter expression allows simultaneous use of DISTINCT

2020-07-29 Thread GitBox
AmplabJenkins commented on pull request #29291: URL: https://github.com/apache/spark/pull/29291#issuecomment-665704776

[GitHub] [spark] AmplabJenkins commented on pull request #29286: [WIP}[SPARK-21708][Build] Migrate build to sbt 1.x

2020-07-29 Thread GitBox
AmplabJenkins commented on pull request #29286: URL: https://github.com/apache/spark/pull/29286#issuecomment-665621406

[GitHub] [spark] AmplabJenkins removed a comment on pull request #29290: [SPARK-32484][SQL] Not accurate Log Info in BroadcastExchangeExec.scala

2020-07-29 Thread GitBox
AmplabJenkins removed a comment on pull request #29290: URL: https://github.com/apache/spark/pull/29290#issuecomment-665634915 Can one of the admins verify this patch?

[GitHub] [spark] prgitpr closed pull request #29285: Modify log info in BroadcastExchangeExec.scala

2020-07-29 Thread GitBox
prgitpr closed pull request #29285: URL: https://github.com/apache/spark/pull/29285

[GitHub] [spark] AmplabJenkins commented on pull request #29146: [WIP][SPARK-32257][SQL] Reports explicit errors for invalid usage of SET/RESET command

2020-07-29 Thread GitBox
AmplabJenkins commented on pull request #29146: URL: https://github.com/apache/spark/pull/29146#issuecomment-665635179

[GitHub] [spark] tgravescs commented on pull request #29262: [SPARK-32332][SQL] Support columnar exchanges

2020-07-29 Thread GitBox
tgravescs commented on pull request #29262: URL: https://github.com/apache/spark/pull/29262#issuecomment-665671647

[GitHub] [spark] AmplabJenkins commented on pull request #29288: [SPARK-32484][SQL] Not accurate Log Info in BroadcastExchangeExec.scala

2020-07-29 Thread GitBox
AmplabJenkins commented on pull request #29288: URL: https://github.com/apache/spark/pull/29288#issuecomment-665630982 Can one of the admins verify this patch?

[GitHub] [spark] AmplabJenkins commented on pull request #29285: Modify log info in BroadcastExchangeExec.scala

2020-07-29 Thread GitBox
AmplabJenkins commented on pull request #29285: URL: https://github.com/apache/spark/pull/29285#issuecomment-665612517

[GitHub] [spark] AmplabJenkins removed a comment on pull request #29282: [SPARK-32477][CORE] JsonProtocol.accumulablesToJson should be deterministic

2020-07-29 Thread GitBox
AmplabJenkins removed a comment on pull request #29282: URL: https://github.com/apache/spark/pull/29282#issuecomment-665557833

[GitHub] [spark] gemelen opened a new pull request #29286: [WIP}[SPARK-21708][Build] Migrate build to sbt 1.x

2020-07-29 Thread GitBox
gemelen opened a new pull request #29286: URL: https://github.com/apache/spark/pull/29286 ### What changes were proposed in this pull request? Migrate sbt-launcher URL to download one for sbt 1.x. Update plugins versions where required by sbt update. Change sbt version to be

[GitHub] [spark] AmplabJenkins commented on pull request #28761: [SPARK-25557][SQL] Nested column predicate pushdown for ORC

2020-07-29 Thread GitBox
AmplabJenkins commented on pull request #28761: URL: https://github.com/apache/spark/pull/28761#issuecomment-665612446

[GitHub] [spark] SparkQA commented on pull request #29067: [SPARK-32274][SQL] Make SQL cache serialization pluggable

2020-07-29 Thread GitBox
SparkQA commented on pull request #29067: URL: https://github.com/apache/spark/pull/29067#issuecomment-665690606

[GitHub] [spark] SparkQA commented on pull request #28841: [SPARK-31962][SQL] Provide modifiedAfter and modifiedBefore options when filtering from a batch-based file data source

2020-07-29 Thread GitBox
SparkQA commented on pull request #28841: URL: https://github.com/apache/spark/pull/28841#issuecomment-665558362

[GitHub] [spark] HyukjinKwon commented on pull request #29283: [SPARK-32478][R][SQL] Error message to show the schema mismatch in gapply with Arrow vectorization

2020-07-29 Thread GitBox
HyukjinKwon commented on pull request #29283: URL: https://github.com/apache/spark/pull/29283#issuecomment-665527242 @viirya can you take a quick look when you're available?

[GitHub] [spark] prgitpr closed pull request #29290: [SPARK-32484][SQL] Not accurate Log Info in BroadcastExchangeExec.scala

2020-07-29 Thread GitBox
prgitpr closed pull request #29290: URL: https://github.com/apache/spark/pull/29290

[GitHub] [spark] tgravescs commented on pull request #29279: [SPARK-31418][FOLLOW-UP][MINOR] Fix log messages to print stage id in…

2020-07-29 Thread GitBox
tgravescs commented on pull request #29279: URL: https://github.com/apache/spark/pull/29279#issuecomment-665677593

[GitHub] [spark] SparkQA removed a comment on pull request #29291: [SPARK-30276][SQL] Support Filter expression allows simultaneous use of DISTINCT

2020-07-29 Thread GitBox
SparkQA removed a comment on pull request #29291: URL: https://github.com/apache/spark/pull/29291#issuecomment-665703948 **[Test build #126778 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/126778/testReport)** for PR 29291 at commit

[GitHub] [spark] SparkQA removed a comment on pull request #29284: [SPARK-32479][PYSPARK] Fix the slicing logic in createDataFrame when converting pandas dataframe to arrow table

2020-07-29 Thread GitBox
SparkQA removed a comment on pull request #29284: URL: https://github.com/apache/spark/pull/29284#issuecomment-665665154

[GitHub] [spark] tgravescs commented on a change in pull request #29276: [SPARK-32470][CORE] Remove task result size check for shuffle map stage

2020-07-29 Thread GitBox
tgravescs commented on a change in pull request #29276: URL: https://github.com/apache/spark/pull/29276#discussion_r462334961 ## File path: core/src/main/scala/org/apache/spark/scheduler/TaskSetManager.scala ## @@ -695,7 +696,7 @@ private[spark] class TaskSetManager( def

[GitHub] [spark] SparkQA commented on pull request #28761: [SPARK-25557][SQL] Nested column predicate pushdown for ORC

2020-07-29 Thread GitBox
SparkQA commented on pull request #28761: URL: https://github.com/apache/spark/pull/28761#issuecomment-665611663 **[Test build #126761 has finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/126761/testReport)** for PR 28761 at commit

[GitHub] [spark] AmplabJenkins commented on pull request #29281: [SPARK-32476][CORE] ResourceAllocator.availableAddrs should be deterministic

2020-07-29 Thread GitBox
AmplabJenkins commented on pull request #29281: URL: https://github.com/apache/spark/pull/29281#issuecomment-665452628

[GitHub] [spark] cloud-fan commented on pull request #29146: [WIP][SPARK-32257][SQL] Reports explicit errors for invalid usage of SET/RESET command

2020-07-29 Thread GitBox
cloud-fan commented on pull request #29146: URL: https://github.com/apache/spark/pull/29146#issuecomment-664952148 retest this please

[GitHub] [spark] AmplabJenkins removed a comment on pull request #29146: [SPARK-32257][SQL] Reports explicit errors for invalid usage of SET/RESET command

2020-07-29 Thread GitBox
AmplabJenkins removed a comment on pull request #29146: URL: https://github.com/apache/spark/pull/29146#issuecomment-664694827

[GitHub] [spark] ueshin opened a new pull request #29278: [SPARK-32160][CORE][PYSPARK] Add configs to switch allow/disallow to create SparkContext in executors.

2020-07-29 Thread GitBox
ueshin opened a new pull request #29278: URL: https://github.com/apache/spark/pull/29278 ### What changes were proposed in this pull request? This is a follow-up of #28986. This PR adds configs to switch allow/disallow to create SparkContext in executors. -

[GitHub] [spark] cloud-fan commented on pull request #29237: [SPARK-32382][SQL] Override table renaming in JDBC dialects

2020-07-29 Thread GitBox
cloud-fan commented on pull request #29237: URL: https://github.com/apache/spark/pull/29237#issuecomment-665012763 thanks, merging to master!

[GitHub] [spark] cloud-fan commented on pull request #29273: [SPARK-32469][SQL] ApplyColumnarRulesAndInsertTransitions should be idempotent

2020-07-29 Thread GitBox
cloud-fan commented on pull request #29273: URL: https://github.com/apache/spark/pull/29273#issuecomment-665164965 cc @andygrove @tgravescs @maryannxue

[GitHub] [spark] AmplabJenkins removed a comment on pull request #29170: [SPARK-30876][SQL] Optimizer fails to infer constraints within join

2020-07-29 Thread GitBox
AmplabJenkins removed a comment on pull request #29170: URL: https://github.com/apache/spark/pull/29170#issuecomment-664699039

[GitHub] [spark] HyukjinKwon edited a comment on pull request #29274: [SPARK-32397][BUILD] Allow specifying of time for build to keep time consistent between modules

2020-07-29 Thread GitBox
HyukjinKwon edited a comment on pull request #29274: URL: https://github.com/apache/spark/pull/29274#issuecomment-665424580 I didn't test it by myself but the change sounds fine to me. cc @dongjoon-hyun and @srowen

[GitHub] [spark] sarutak commented on pull request #29002: [SPARK-32175][CORE] Fix the order between initialization for ExecutorPlugin and starting heartbeat thread

2020-07-29 Thread GitBox
sarutak commented on pull request #29002: URL: https://github.com/apache/spark/pull/29002#issuecomment-665372255 It's strange that the reason for the R test failure implies that a deprecated function is used. But it's resolved in #29252. I'll rebase this change. ```

[GitHub] [spark] WinkerDu opened a new pull request #29000: [SPARK-27194][SPARK-29302][SQL] Fix commit collision in dynamic partition overwrite mode

2020-07-29 Thread GitBox
WinkerDu opened a new pull request #29000: URL: https://github.com/apache/spark/pull/29000 ### What changes were proposed in this pull request? When using dynamic partition overwrite, each task has its working dir under staging dir like `stagingDir/.spark-staging-{jobId}`,

[GitHub] [spark] AmplabJenkins removed a comment on pull request #29265: [SPARK-32462][WEBUI] Don't save the previous search text for datatable

2020-07-29 Thread GitBox
AmplabJenkins removed a comment on pull request #29265: URL: https://github.com/apache/spark/pull/29265#issuecomment-664692710

[GitHub] [spark] cloud-fan commented on a change in pull request #29276: [SPARK-32470][CORE] Remove task result size check for shuffle map stage

2020-07-29 Thread GitBox
cloud-fan commented on a change in pull request #29276: URL: https://github.com/apache/spark/pull/29276#discussion_r462013325 ## File path: core/src/main/scala/org/apache/spark/scheduler/TaskSetManager.scala ## @@ -695,7 +696,7 @@ private[spark] class TaskSetManager( def

[GitHub] [spark] MaxGekk commented on a change in pull request #29234: [SPARK-32431][SQL] Check duplicate nested columns in read from in-built datasources

2020-07-29 Thread GitBox
MaxGekk commented on a change in pull request #29234: URL: https://github.com/apache/spark/pull/29234#discussion_r461693321 ## File path: external/avro/src/test/scala/org/apache/spark/sql/avro/AvroSuite.scala ## @@ -1800,6 +1800,44 @@ abstract class AvroSuite extends

[GitHub] [spark] WinkerDu commented on pull request #29260: [SPARK-27194][SPARK-29302][SQL] Fix commit collision in dynamic partition overwrite mode

2020-07-29 Thread GitBox
WinkerDu commented on pull request #29260: URL: https://github.com/apache/spark/pull/29260#issuecomment-664913968 close this pr and reopen the previous pr, further code review and update please check #29000

[GitHub] [spark] Ngone51 opened a new pull request #29270: [SPARK-32466][TEST][SQL] Add PlanStabilitySuite to detect SparkPlan regression

2020-07-29 Thread GitBox
Ngone51 opened a new pull request #29270: URL: https://github.com/apache/spark/pull/29270 ### What changes were proposed in this pull request? This PR proposes to detect possible regression inside `SparkPlan`. To achieve this goal, this PR added a base test suite called

[GitHub] [spark] AmplabJenkins removed a comment on pull request #29277: [SPARK-32421][SQL] Add code-gen for shuffled hash join

2020-07-29 Thread GitBox
AmplabJenkins removed a comment on pull request #29277: URL: https://github.com/apache/spark/pull/29277#issuecomment-665248859

[GitHub] [spark] holdenk opened a new pull request #29263: [SPARK-32417][TEST][CORE] Fix test flakiness in BlockManagerDecommissionIntegrationSuite

2020-07-29 Thread GitBox
holdenk opened a new pull request #29263: URL: https://github.com/apache/spark/pull/29263 ### What changes were proposed in this pull request? This changes the test to ensure the executors are all the way up and accepting tasks before attempting to move on to decommissioning. This

[GitHub] [spark] AmplabJenkins removed a comment on pull request #29263: [SPARK-32417][TEST][CORE] Fix test flakiness in BlockManagerDecommissionIntegrationSuite

2020-07-29 Thread GitBox
AmplabJenkins removed a comment on pull request #29263: URL: https://github.com/apache/spark/pull/29263#issuecomment-664660244

[GitHub] [spark] Fokko commented on a change in pull request #29121: [SPARK-32319][PYSPARK] Remove unused imports

2020-07-29 Thread GitBox
Fokko commented on a change in pull request #29121: URL: https://github.com/apache/spark/pull/29121#discussion_r461329483 ## File path: python/pyspark/ml/tests/test_stat.py ## @@ -40,7 +40,7 @@ def test_chisquaretest(self): if __name__ == "__main__": -from

[GitHub] [spark] SparkQA removed a comment on pull request #29269: [SPARK-31792][SS][DOC][FOLLOW-UP] Rephrase the description for some operations

2020-07-29 Thread GitBox
SparkQA removed a comment on pull request #29269: URL: https://github.com/apache/spark/pull/29269#issuecomment-664908330 **[Test build #126708 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/126708/testReport)** for PR 29269 at commit

[GitHub] [spark] SparkQA commented on pull request #29261: [SPARK-32459][SQL] Support WrappedArray as customCollectionCls in MapObjects

2020-07-29 Thread GitBox
SparkQA commented on pull request #29261: URL: https://github.com/apache/spark/pull/29261#issuecomment-664763583

[GitHub] [spark] SparkQA commented on pull request #29002: [SPARK-32175][CORE] Fix the order between initialization for ExecutorPlugin and starting heartbeat thread

2020-07-29 Thread GitBox
SparkQA commented on pull request #29002: URL: https://github.com/apache/spark/pull/29002#issuecomment-665372757

[GitHub] [spark] HyukjinKwon commented on pull request #29274: [SPARK-32397][BUILD] Allow specifying of time for build to keep time consistent between modules

2020-07-29 Thread GitBox
HyukjinKwon commented on pull request #29274: URL: https://github.com/apache/spark/pull/29274#issuecomment-665424580 I didn't test it by myself but the change sounds fine to me.

[GitHub] [spark] abhishekd0907 commented on a change in pull request #29242: [SPARK-31448] [PYTHON] Fix storage level used in cache() in dataframe.py

2020-07-29 Thread GitBox
abhishekd0907 commented on a change in pull request #29242: URL: https://github.com/apache/spark/pull/29242#discussion_r462086489 ## File path: python/pyspark/sql/dataframe.py ## @@ -674,7 +674,7 @@ def cache(self): .. note:: The default storage level has changed to

[GitHub] [spark] SparkQA commented on pull request #28761: [SPARK-25557][SQL] Nested column predicate pushdown for ORC

2020-07-29 Thread GitBox
SparkQA commented on pull request #28761: URL: https://github.com/apache/spark/pull/28761#issuecomment-665464585

[GitHub] [spark] AmplabJenkins commented on pull request #29002: [SPARK-32175][CORE] Fix the order between initialization for ExecutorPlugin and starting heartbeat thread

2020-07-29 Thread GitBox
AmplabJenkins commented on pull request #29002: URL: https://github.com/apache/spark/pull/29002#issuecomment-665373210

[GitHub] [spark] AmplabJenkins removed a comment on pull request #29274: [SPARK-32397][BUILD] Allow specifying of time for build to keep time consistent between modules

2020-07-29 Thread GitBox
AmplabJenkins removed a comment on pull request #29274: URL: https://github.com/apache/spark/pull/29274#issuecomment-665218139

[GitHub] [spark] SparkQA removed a comment on pull request #29278: [WIP][SPARK-32160][CORE][PYSPARK] Add configs to switch allow/disallow to create SparkContext in executors.

2020-07-29 Thread GitBox
SparkQA removed a comment on pull request #29278: URL: https://github.com/apache/spark/pull/29278#issuecomment-665385821 **[Test build #126746 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/126746/testReport)** for PR 29278 at commit

[GitHub] [spark] AmplabJenkins commented on pull request #29244: [SPARK-20680][SQL][FOLLOW-UP] Make NullType.simpleString as void to support hive

2020-07-29 Thread GitBox
AmplabJenkins commented on pull request #29244: URL: https://github.com/apache/spark/pull/29244#issuecomment-664703549

[GitHub] [spark] andygrove edited a comment on pull request #29262: [SPARK-32332][SQL] Support columnar exchanges

2020-07-29 Thread GitBox
andygrove edited a comment on pull request #29262: URL: https://github.com/apache/spark/pull/29262#issuecomment-665066613 Thanks @cloud-fan. I have tested these changes both with Spark 3.1 and also back ported to the 3.0 branch and everything is working well, so LGTM. I wish I had
