[GitHub] [spark] SparkQA commented on pull request #28761: [SPARK-25557][SQL] Nested column predicate pushdown for ORC

2020-07-29 Thread GitBox
SparkQA commented on pull request #28761: URL: https://github.com/apache/spark/pull/28761#issuecomment-665611663 **[Test build #126761 has finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/126761/testReport)** for PR 28761 at commit

[GitHub] [spark] SparkQA removed a comment on pull request #29146: [WIP][SPARK-32257][SQL] Reports explicit errors for invalid usage of SET/RESET command

2020-07-29 Thread GitBox
SparkQA removed a comment on pull request #29146: URL: https://github.com/apache/spark/pull/29146#issuecomment-665634641 This is an automated message from the Apache Git Service. To respond to the message, please log on to

[GitHub] [spark] AmplabJenkins removed a comment on pull request #29284: [SPARK-32479] Fix the slicing logic in createDataFrame when converting pandas dataframe to arrow table

2020-07-29 Thread GitBox
AmplabJenkins removed a comment on pull request #29284: URL: https://github.com/apache/spark/pull/29284#issuecomment-665582034

[GitHub] [spark] SparkQA commented on pull request #29067: [SPARK-32274][SQL] Make SQL cache serialization pluggable

2020-07-29 Thread GitBox
SparkQA commented on pull request #29067: URL: https://github.com/apache/spark/pull/29067#issuecomment-665690606

[GitHub] [spark] AmplabJenkins commented on pull request #28761: [SPARK-25557][SQL] Nested column predicate pushdown for ORC

2020-07-29 Thread GitBox
AmplabJenkins commented on pull request #28761: URL: https://github.com/apache/spark/pull/28761#issuecomment-665612446

[GitHub] [spark] AmplabJenkins removed a comment on pull request #29285: Modify log info in BroadcastExchangeExec.scala

2020-07-29 Thread GitBox
AmplabJenkins removed a comment on pull request #29285: URL: https://github.com/apache/spark/pull/29285#issuecomment-665612517 Can one of the admins verify this patch?

[GitHub] [spark] tgravescs commented on a change in pull request #29292: [SPARK-30322][DOCS] Add stage level scheduling docs

2020-07-29 Thread GitBox
tgravescs commented on a change in pull request #29292: URL: https://github.com/apache/spark/pull/29292#discussion_r462399890 ## File path: docs/configuration.md ## @@ -3028,3 +3028,10 @@ There are configurations available to request resources for the driver: sp Spark will

[GitHub] [spark] SparkQA commented on pull request #29289: [SPARK-32482][SS][TESTS] Eliminate deprecated poll(long) API calls to avoid infinite wait in tests

2020-07-29 Thread GitBox
SparkQA commented on pull request #29289: URL: https://github.com/apache/spark/pull/29289#issuecomment-665634620

[GitHub] [spark] liucht-inspur opened a new pull request #29287: [SPARK-27830][CORE][UI][2.4] Show Spark version at app lists of Spark History UI

2020-07-29 Thread GitBox
liucht-inspur opened a new pull request #29287: URL: https://github.com/apache/spark/pull/29287 ## What changes were proposed in this pull request? This PR aims to show Spark version at application lists of Spark History UI for branch-2.4. From the following, the first Version

[GitHub] [spark] SparkQA removed a comment on pull request #29234: [SPARK-32431][SQL] Check duplicate nested columns in read from in-built datasources

2020-07-29 Thread GitBox
SparkQA removed a comment on pull request #29234: URL: https://github.com/apache/spark/pull/29234#issuecomment-665520179

[GitHub] [spark] dongjoon-hyun closed pull request #29282: [SPARK-32477][CORE] JsonProtocol.accumulablesToJson should be deterministic

2020-07-29 Thread GitBox
dongjoon-hyun closed pull request #29282: URL: https://github.com/apache/spark/pull/29282

[GitHub] [spark] SparkQA removed a comment on pull request #29282: [SPARK-32477][CORE] JsonProtocol.accumulablesToJson should be deterministic

2020-07-29 Thread GitBox
SparkQA removed a comment on pull request #29282: URL: https://github.com/apache/spark/pull/29282#issuecomment-665481384 **[Test build #126762 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/126762/testReport)** for PR 29282 at commit

[GitHub] [spark] SparkQA removed a comment on pull request #29279: [SPARK-31418][FOLLOW-UP][MINOR] Fix log messages to print stage id in…

2020-07-29 Thread GitBox
SparkQA removed a comment on pull request #29279: URL: https://github.com/apache/spark/pull/29279#issuecomment-665681945 **[Test build #126776 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/126776/testReport)** for PR 29279 at commit

[GitHub] [spark] AmplabJenkins commented on pull request #29276: [SPARK-32470][CORE] Remove task result size check for shuffle map stage

2020-07-29 Thread GitBox
AmplabJenkins commented on pull request #29276: URL: https://github.com/apache/spark/pull/29276#issuecomment-665793376

[GitHub] [spark] danzafar commented on pull request #28379: [SPARK-28040][SPARK-28070][R] Write type object s3

2020-07-29 Thread GitBox
danzafar commented on pull request #28379: URL: https://github.com/apache/spark/pull/28379#issuecomment-665751086 why not just coerce inputs to class `character`? ``` SparkR:::sql.default <- function (sqlQuery) { sparkSession <- getSparkSession() sdf <-

[GitHub] [spark] HyukjinKwon commented on a change in pull request #29292: [SPARK-30322][DOCS] Add stage level scheduling docs

2020-07-29 Thread GitBox
HyukjinKwon commented on a change in pull request #29292: URL: https://github.com/apache/spark/pull/29292#discussion_r462390312 ## File path: docs/configuration.md ## @@ -3028,3 +3028,10 @@ There are configurations available to request resources for the driver: sp Spark will

[GitHub] [spark] dongjoon-hyun edited a comment on pull request #29281: [SPARK-32476][CORE] ResourceAllocator.availableAddrs should be deterministic

2020-07-29 Thread GitBox
dongjoon-hyun edited a comment on pull request #29281: URL: https://github.com/apache/spark/pull/29281#issuecomment-665711025 @tgravescs , `assignedAddrs` is updated, too. Could you review this once more?

[GitHub] [spark] huaxingao commented on pull request #29250: [SPARK-32449][ML][PySpark] Add summary to MultilayerPerceptronClassificationModel

2020-07-29 Thread GitBox
huaxingao commented on pull request #29250: URL: https://github.com/apache/spark/pull/29250#issuecomment-665761465 Thanks!

[GitHub] [spark] tgravescs commented on a change in pull request #29281: [SPARK-32476][CORE] ResourceAllocator.availableAddrs should be deterministic

2020-07-29 Thread GitBox
tgravescs commented on a change in pull request #29281: URL: https://github.com/apache/spark/pull/29281#discussion_r462305035 ## File path: core/src/main/scala/org/apache/spark/resource/ResourceAllocator.scala ## @@ -56,7 +56,7 @@ trait ResourceAllocator { def

[GitHub] [spark] SparkQA removed a comment on pull request #29262: [SPARK-32332][SQL] Support columnar exchanges

2020-07-29 Thread GitBox
SparkQA removed a comment on pull request #29262: URL: https://github.com/apache/spark/pull/29262#issuecomment-665673398 **[Test build #126775 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/126775/testReport)** for PR 29262 at commit

[GitHub] [spark] AmplabJenkins commented on pull request #29292: [SPARK-30322][DOCS] Add stage level scheduling docs

2020-07-29 Thread GitBox
AmplabJenkins commented on pull request #29292: URL: https://github.com/apache/spark/pull/29292#issuecomment-665722763

[GitHub] [spark] prgitpr opened a new pull request #29285: Modify log info in BroadcastExchangeExec.scala

2020-07-29 Thread GitBox
prgitpr opened a new pull request #29285: URL: https://github.com/apache/spark/pull/29285 “Cannot broadcast the table that is larger than 8GB: ${dataSize >> 30} GB” should be modified to be more accurate.

[GitHub] [spark] AmplabJenkins removed a comment on pull request #29234: [SPARK-32431][SQL] Check duplicate nested columns in read from in-built datasources

2020-07-29 Thread GitBox
AmplabJenkins removed a comment on pull request #29234: URL: https://github.com/apache/spark/pull/29234#issuecomment-665520954

[GitHub] [spark] AmplabJenkins removed a comment on pull request #29282: [SPARK-32477][CORE] JsonProtocol.accumulablesToJson should be deterministic

2020-07-29 Thread GitBox
AmplabJenkins removed a comment on pull request #29282: URL: https://github.com/apache/spark/pull/29282#issuecomment-665557833

[GitHub] [spark] AmplabJenkins commented on pull request #29285: Modify log info in BroadcastExchangeExec.scala

2020-07-29 Thread GitBox
AmplabJenkins commented on pull request #29285: URL: https://github.com/apache/spark/pull/29285#issuecomment-665612517

[GitHub] [spark] tgravescs commented on pull request #29262: [SPARK-32332][SQL] Support columnar exchanges

2020-07-29 Thread GitBox
tgravescs commented on pull request #29262: URL: https://github.com/apache/spark/pull/29262#issuecomment-665671647

[GitHub] [spark] AmplabJenkins commented on pull request #29288: [SPARK-32484][SQL] Not accurate Log Info in BroadcastExchangeExec.scala

2020-07-29 Thread GitBox
AmplabJenkins commented on pull request #29288: URL: https://github.com/apache/spark/pull/29288#issuecomment-665630982 Can one of the admins verify this patch?

[GitHub] [spark] gemelen opened a new pull request #29286: [WIP}[SPARK-21708][Build] Migrate build to sbt 1.x

2020-07-29 Thread GitBox
gemelen opened a new pull request #29286: URL: https://github.com/apache/spark/pull/29286 ### What changes were proposed in this pull request? Migrate sbt-launcher URL to download one for sbt 1.x. Update plugins versions where required by sbt update. Change sbt version to be

[GitHub] [spark] SparkQA removed a comment on pull request #28761: [SPARK-25557][SQL] Nested column predicate pushdown for ORC

2020-07-29 Thread GitBox
SparkQA removed a comment on pull request #28761: URL: https://github.com/apache/spark/pull/28761#issuecomment-665477914 **[Test build #126761 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/126761/testReport)** for PR 28761 at commit

[GitHub] [spark] SparkQA commented on pull request #29146: [WIP][SPARK-32257][SQL] Reports explicit errors for invalid usage of SET/RESET command

2020-07-29 Thread GitBox
SparkQA commented on pull request #29146: URL: https://github.com/apache/spark/pull/29146#issuecomment-665634641

[GitHub] [spark] srowen commented on pull request #29250: [SPARK-32449][ML][PySpark] Add summary to MultilayerPerceptronClassificationModel

2020-07-29 Thread GitBox
srowen commented on pull request #29250: URL: https://github.com/apache/spark/pull/29250#issuecomment-665715951 Merged to master

[GitHub] [spark] AmplabJenkins commented on pull request #29289: [SPARK-32482][SS][TESTS] Eliminate deprecated poll(long) API calls to avoid infinite wait in tests

2020-07-29 Thread GitBox
AmplabJenkins commented on pull request #29289: URL: https://github.com/apache/spark/pull/29289#issuecomment-665635287

[GitHub] [spark] AmplabJenkins commented on pull request #29290: [SPARK-32484][SQL] Not accurate Log Info in BroadcastExchangeExec.scala

2020-07-29 Thread GitBox
AmplabJenkins commented on pull request #29290: URL: https://github.com/apache/spark/pull/29290#issuecomment-665634915

[GitHub] [spark] AmplabJenkins removed a comment on pull request #29289: [SPARK-32482][SS][TESTS] Eliminate deprecated poll(long) API calls to avoid infinite wait in tests

2020-07-29 Thread GitBox
AmplabJenkins removed a comment on pull request #29289: URL: https://github.com/apache/spark/pull/29289#issuecomment-665635287

[GitHub] [spark] AmplabJenkins removed a comment on pull request #29067: [SPARK-32274][SQL] Make SQL cache serialization pluggable

2020-07-29 Thread GitBox
AmplabJenkins removed a comment on pull request #29067: URL: https://github.com/apache/spark/pull/29067#issuecomment-665691334

[GitHub] [spark] tgravescs commented on pull request #29279: [SPARK-31418][FOLLOW-UP][MINOR] Fix log messages to print stage id in…

2020-07-29 Thread GitBox
tgravescs commented on pull request #29279: URL: https://github.com/apache/spark/pull/29279#issuecomment-665677593

[GitHub] [spark] SparkQA removed a comment on pull request #29291: [SPARK-30276][SQL] Support Filter expression allows simultaneous use of DISTINCT

2020-07-29 Thread GitBox
SparkQA removed a comment on pull request #29291: URL: https://github.com/apache/spark/pull/29291#issuecomment-665703948 **[Test build #126778 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/126778/testReport)** for PR 29291 at commit

[GitHub] [spark] prgitpr closed pull request #29290: [SPARK-32484][SQL] Not accurate Log Info in BroadcastExchangeExec.scala

2020-07-29 Thread GitBox
prgitpr closed pull request #29290: URL: https://github.com/apache/spark/pull/29290

[GitHub] [spark] tgravescs commented on a change in pull request #29276: [SPARK-32470][CORE] Remove task result size check for shuffle map stage

2020-07-29 Thread GitBox
tgravescs commented on a change in pull request #29276: URL: https://github.com/apache/spark/pull/29276#discussion_r462334961 ## File path: core/src/main/scala/org/apache/spark/scheduler/TaskSetManager.scala ## @@ -695,7 +696,7 @@ private[spark] class TaskSetManager( def

[GitHub] [spark] SparkQA removed a comment on pull request #29284: [SPARK-32479][PYSPARK] Fix the slicing logic in createDataFrame when converting pandas dataframe to arrow table

2020-07-29 Thread GitBox
SparkQA removed a comment on pull request #29284: URL: https://github.com/apache/spark/pull/29284#issuecomment-665665154

[GitHub] [spark] dongjoon-hyun commented on pull request #29281: [SPARK-32476][CORE] ResourceAllocator.availableAddrs should be deterministic

2020-07-29 Thread GitBox
dongjoon-hyun commented on pull request #29281: URL: https://github.com/apache/spark/pull/29281#issuecomment-665704971

[GitHub] [spark] srowen commented on pull request #29259: [SPARK-29918][SQL][FOLLOWUP][TEST] Fix endianness issues in tests in RecordBinaryComparatorSuite

2020-07-29 Thread GitBox
srowen commented on pull request #29259: URL: https://github.com/apache/spark/pull/29259#issuecomment-665700938 Maybe just leave it as is and write a comment about what the resulting byte order is, for readers

[GitHub] [spark] SparkQA commented on pull request #29234: [SPARK-32431][SQL] Check duplicate nested columns in read from in-built datasources

2020-07-29 Thread GitBox
SparkQA commented on pull request #29234: URL: https://github.com/apache/spark/pull/29234#issuecomment-665520179

[GitHub] [spark] MaxGekk commented on pull request #29145: [SPARK-32346][SQL] Support filters pushdown in Avro datasource

2020-07-29 Thread GitBox
MaxGekk commented on pull request #29145: URL: https://github.com/apache/spark/pull/29145#issuecomment-665625674 @gengliangwang Are you ok with this PR?

[GitHub] [spark] AmplabJenkins removed a comment on pull request #29281: [SPARK-32476][CORE] ResourceAllocator.availableAddrs should be deterministic

2020-07-29 Thread GitBox
AmplabJenkins removed a comment on pull request #29281: URL: https://github.com/apache/spark/pull/29281#issuecomment-665554923

[GitHub] [spark] revans2 commented on pull request #29067: [SPARK-32274][SQL] Make SQL cache serialization pluggable

2020-07-29 Thread GitBox
revans2 commented on pull request #29067: URL: https://github.com/apache/spark/pull/29067#issuecomment-665781333 The test that failed appears to be unrelated to any of my changes

[GitHub] [spark] AmplabJenkins removed a comment on pull request #29146: [WIP][SPARK-32257][SQL] Reports explicit errors for invalid usage of SET/RESET command

2020-07-29 Thread GitBox
AmplabJenkins removed a comment on pull request #29146: URL: https://github.com/apache/spark/pull/29146#issuecomment-665635179

[GitHub] [spark] dongjoon-hyun closed pull request #29287: [SPARK-27830][CORE][UI][2.4] Show Spark version at app lists of Spark History UI

2020-07-29 Thread GitBox
dongjoon-hyun closed pull request #29287: URL: https://github.com/apache/spark/pull/29287

[GitHub] [spark] HyukjinKwon commented on pull request #29292: [SPARK-30322][DOCS] Add stage level scheduling docs

2020-07-29 Thread GitBox
HyukjinKwon commented on pull request #29292: URL: https://github.com/apache/spark/pull/29292#issuecomment-665744147 From my reading, looks good.

[GitHub] [spark] AmplabJenkins removed a comment on pull request #29286: [WIP}[SPARK-21708][Build] Migrate build to sbt 1.x

2020-07-29 Thread GitBox
AmplabJenkins removed a comment on pull request #29286: URL: https://github.com/apache/spark/pull/29286#issuecomment-665621406 Can one of the admins verify this patch?

[GitHub] [spark] tgravescs commented on pull request #29002: [SPARK-32175][CORE] Fix the order between initialization for ExecutorPlugin and starting heartbeat thread

2020-07-29 Thread GitBox
tgravescs commented on pull request #29002: URL: https://github.com/apache/spark/pull/29002#issuecomment-665674505 finally, tests pass, merged to master and branch-3.0. thanks @sarutak

[GitHub] [spark] AmplabJenkins removed a comment on pull request #29291: [SPARK-30276][SQL] Support Filter expression allows simultaneous use of DISTINCT

2020-07-29 Thread GitBox
AmplabJenkins removed a comment on pull request #29291: URL: https://github.com/apache/spark/pull/29291#issuecomment-665704776

[GitHub] [spark] AmplabJenkins commented on pull request #29284: [SPARK-32479] Fix the slicing logic in createDataFrame when converting pandas dataframe to arrow table

2020-07-29 Thread GitBox
AmplabJenkins commented on pull request #29284: URL: https://github.com/apache/spark/pull/29284#issuecomment-665582034

[GitHub] [spark] AmplabJenkins commented on pull request #29146: [WIP][SPARK-32257][SQL] Reports explicit errors for invalid usage of SET/RESET command

2020-07-29 Thread GitBox
AmplabJenkins commented on pull request #29146: URL: https://github.com/apache/spark/pull/29146#issuecomment-665635179

[GitHub] [spark] prgitpr closed pull request #29288: [SPARK-32484][SQL] Not accurate Log Info in BroadcastExchangeExec.scala

2020-07-29 Thread GitBox
prgitpr closed pull request #29288: URL: https://github.com/apache/spark/pull/29288

[GitHub] [spark] SparkQA removed a comment on pull request #28841: [SPARK-31962][SQL] Provide modifiedAfter and modifiedBefore options when filtering from a batch-based file data source

2020-07-29 Thread GitBox
SparkQA removed a comment on pull request #28841: URL: https://github.com/apache/spark/pull/28841#issuecomment-665558362 **[Test build #126768 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/126768/testReport)** for PR 28841 at commit

[GitHub] [spark] SparkQA commented on pull request #29262: [SPARK-32332][SQL] Support columnar exchanges

2020-07-29 Thread GitBox
SparkQA commented on pull request #29262: URL: https://github.com/apache/spark/pull/29262#issuecomment-665673398

[GitHub] [spark] AmplabJenkins removed a comment on pull request #29287: [SPARK-27830][CORE][UI][2.4] Show Spark version at app lists of Spark History UI

2020-07-29 Thread GitBox
AmplabJenkins removed a comment on pull request #29287: URL: https://github.com/apache/spark/pull/29287#issuecomment-665624526 Can one of the admins verify this patch?

[GitHub] [spark] SparkQA removed a comment on pull request #29289: [SPARK-32482][SS][TESTS] Eliminate deprecated poll(long) API calls to avoid infinite wait in tests

2020-07-29 Thread GitBox
SparkQA removed a comment on pull request #29289: URL: https://github.com/apache/spark/pull/29289#issuecomment-665634620

[GitHub] [spark] AmplabJenkins removed a comment on pull request #29262: [SPARK-32332][SQL] Support columnar exchanges

2020-07-29 Thread GitBox
AmplabJenkins removed a comment on pull request #29262: URL: https://github.com/apache/spark/pull/29262#issuecomment-665674253

[GitHub] [spark] yaooqinn commented on pull request #29202: [SPARK-32406][SQL] Make RESET syntax support single configuration reset

2020-07-29 Thread GitBox
yaooqinn commented on pull request #29202: URL: https://github.com/apache/spark/pull/29202#issuecomment-665645902 Thanks, @gatorsmile, for your suggestion. I will raise a follow-up to address your comments. I will also pay more attention to test coverage in my future PRs. Thanks again.

[GitHub] [spark] SparkQA commented on pull request #28841: [SPARK-31962][SQL] Provide modifiedAfter and modifiedBefore options when filtering from a batch-based file data source

2020-07-29 Thread GitBox
SparkQA commented on pull request #28841: URL: https://github.com/apache/spark/pull/28841#issuecomment-665558362

[GitHub] [spark] HyukjinKwon commented on pull request #29283: [SPARK-32478][R][SQL] Error message to show the schema mismatch in gapply with Arrow vectorization

2020-07-29 Thread GitBox
HyukjinKwon commented on pull request #29283: URL: https://github.com/apache/spark/pull/29283#issuecomment-665527242 @viirya can you take a quick look when you're available?

[GitHub] [spark] AmplabJenkins commented on pull request #29291: [SPARK-30276][SQL] Support Filter expression allows simultaneous use of DISTINCT

2020-07-29 Thread GitBox
AmplabJenkins commented on pull request #29291: URL: https://github.com/apache/spark/pull/29291#issuecomment-665704776

[GitHub] [spark] AmplabJenkins commented on pull request #29286: [WIP}[SPARK-21708][Build] Migrate build to sbt 1.x

2020-07-29 Thread GitBox
AmplabJenkins commented on pull request #29286: URL: https://github.com/apache/spark/pull/29286#issuecomment-665621406

[GitHub] [spark] prgitpr closed pull request #29285: Modify log info in BroadcastExchangeExec.scala

2020-07-29 Thread GitBox
prgitpr closed pull request #29285: URL: https://github.com/apache/spark/pull/29285

[GitHub] [spark] AmplabJenkins removed a comment on pull request #29290: [SPARK-32484][SQL] Not accurate Log Info in BroadcastExchangeExec.scala

2020-07-29 Thread GitBox
AmplabJenkins removed a comment on pull request #29290: URL: https://github.com/apache/spark/pull/29290#issuecomment-665634915 Can one of the admins verify this patch?

[GitHub] [spark] SparkQA commented on pull request #29281: [SPARK-32476][CORE] ResourceAllocator.availableAddrs should be deterministic

2020-07-29 Thread GitBox
SparkQA commented on pull request #29281: URL: https://github.com/apache/spark/pull/29281#issuecomment-665553817

[GitHub] [spark] gaborgsomogyi commented on pull request #29289: [SPARK-32482][SS][TESTS] Eliminate deprecated poll(long) API calls to avoid infinite wait in tests

2020-07-29 Thread GitBox
gaborgsomogyi commented on pull request #29289: URL: https://github.com/apache/spark/pull/29289#issuecomment-665636580

[GitHub] [spark] prgitpr opened a new pull request #29290: [SPARK-32484][SQL] Not accurate Log Info in BroadcastExchangeExec.scala

2020-07-29 Thread GitBox
prgitpr opened a new pull request #29290: URL: https://github.com/apache/spark/pull/29290 The message s"Cannot broadcast the table that is larger than 8GB: ${dataSize >> 30} GB" is not accurate, because 8GB is not accurate. ### What changes were proposed in this pull request?

[GitHub] [spark] SparkQA commented on pull request #29276: [SPARK-32470][CORE] Remove task result size check for shuffle map stage

2020-07-29 Thread GitBox
SparkQA commented on pull request #29276: URL: https://github.com/apache/spark/pull/29276#issuecomment-665792999 **[Test build #126786 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/126786/testReport)** for PR 29276 at commit

[GitHub] [spark] cloud-fan commented on a change in pull request #29277: [SPARK-32421][SQL] Add code-gen for shuffled hash join

2020-07-29 Thread GitBox
cloud-fan commented on a change in pull request #29277: URL: https://github.com/apache/spark/pull/29277#discussion_r462369012 ## File path: sql/core/src/main/scala/org/apache/spark/sql/execution/joins/HashJoin.scala ## @@ -316,6 +318,387 @@ trait HashJoin extends BaseJoinExec

[GitHub] [spark] WeichenXu123 commented on pull request #29284: [SPARK-32479][PYSPARK] Fix the slicing logic in createDataFrame when converting pandas dataframe to arrow table

2020-07-29 Thread GitBox
WeichenXu123 commented on pull request #29284: URL: https://github.com/apache/spark/pull/29284#issuecomment-665662649

[GitHub] [spark] AmplabJenkins removed a comment on pull request #29270: [SPARK-32466][TEST][SQL] Add PlanStabilitySuite to detect SparkPlan regression

2020-07-29 Thread GitBox
AmplabJenkins removed a comment on pull request #29270: URL: https://github.com/apache/spark/pull/29270#issuecomment-665552184

[GitHub] [spark] dongjoon-hyun commented on pull request #29287: [SPARK-27830][CORE][UI][2.4] Show Spark version at app lists of Spark History UI

2020-07-29 Thread GitBox
dongjoon-hyun commented on pull request #29287: URL: https://github.com/apache/spark/pull/29287#issuecomment-665714221 I'll close this PR to prevent an accidental merge. We can discuss more on this PR.

[GitHub] [spark] SparkQA removed a comment on pull request #29270: [SPARK-32466][TEST][SQL] Add PlanStabilitySuite to detect SparkPlan regression

2020-07-29 Thread GitBox
SparkQA removed a comment on pull request #29270: URL: https://github.com/apache/spark/pull/29270#issuecomment-665495913

[GitHub] [spark] AmplabJenkins removed a comment on pull request #29283: [SPARK-32478][R][SQL] Error message to show the schema mismatch in gapply with Arrow vectorization

2020-07-29 Thread GitBox
AmplabJenkins removed a comment on pull request #29283: URL: https://github.com/apache/spark/pull/29283#issuecomment-665525169

[GitHub] [spark] HyukjinKwon commented on a change in pull request #29283: [SPARK-32478][R][SQL] Error message to show the schema mismatch in gapply with Arrow vectorization

2020-07-29 Thread GitBox
HyukjinKwon commented on a change in pull request #29283: URL: https://github.com/apache/spark/pull/29283#discussion_r462136305 ## File path: sql/core/src/main/scala/org/apache/spark/sql/execution/objects.scala ## @@ -567,7 +567,14 @@ case class FlatMapGroupsInRWithArrowExec(

[GitHub] [spark] c21 commented on a change in pull request #29277: [SPARK-32421][SQL] Add code-gen for shuffled hash join

2020-07-29 Thread GitBox
c21 commented on a change in pull request #29277: URL: https://github.com/apache/spark/pull/29277#discussion_r462401733 ## File path: sql/core/src/main/scala/org/apache/spark/sql/execution/joins/HashJoin.scala ## @@ -316,6 +318,387 @@ trait HashJoin extends BaseJoinExec {

[GitHub] [spark] AmplabJenkins commented on pull request #29281: [SPARK-32476][CORE] ResourceAllocator.availableAddrs should be deterministic

2020-07-29 Thread GitBox
AmplabJenkins commented on pull request #29281: URL: https://github.com/apache/spark/pull/29281#issuecomment-665554923

[GitHub] [spark] AmplabJenkins commented on pull request #29279: [SPARK-31418][FOLLOW-UP][MINOR] Fix log messages to print stage id in…

2020-07-29 Thread GitBox
AmplabJenkins commented on pull request #29279: URL: https://github.com/apache/spark/pull/29279#issuecomment-665678371

[GitHub] [spark] tgravescs opened a new pull request #29292: [SPARK-30322][DOCS] Add stage level scheduling docs

2020-07-29 Thread GitBox
tgravescs opened a new pull request #29292: URL: https://github.com/apache/spark/pull/29292 ### What changes were proposed in this pull request? Document the stage level scheduling feature. ### Why are the changes needed? Document the stage level scheduling

[GitHub] [spark] dongjoon-hyun commented on pull request #29280: [SPARK-32473][CORE][TESTS] Use === instead IndexSeqView

2020-07-29 Thread GitBox
dongjoon-hyun commented on pull request #29280: URL: https://github.com/apache/spark/pull/29280#issuecomment-665708576 Thanks, @srowen.

[GitHub] [spark] srowen closed pull request #29250: [SPARK-32449][ML][PySpark] Add summary to MultilayerPerceptronClassificationModel

2020-07-29 Thread GitBox
srowen closed pull request #29250: URL: https://github.com/apache/spark/pull/29250

[GitHub] [spark] cloud-fan commented on a change in pull request #29262: [SPARK-32332][SQL] Support columnar exchanges

2020-07-29 Thread GitBox
cloud-fan commented on a change in pull request #29262: URL: https://github.com/apache/spark/pull/29262#discussion_r462347999 ## File path: sql/core/src/main/scala/org/apache/spark/sql/execution/adaptive/AdaptiveSparkPlanExec.scala ## @@ -376,10 +382,14 @@ case class

[GitHub] [spark] beliefer opened a new pull request #29291: [SPARK-30276][SQL] Support Filter expression allows simultaneous use of DISTINCT

2020-07-29 Thread GitBox
beliefer opened a new pull request #29291: URL: https://github.com/apache/spark/pull/29291 ### What changes were proposed in this pull request? This PR is related to https://github.com/apache/spark/pull/26656. https://github.com/apache/spark/pull/26656 only supports using the FILTER clause

[GitHub] [spark] SparkQA commented on pull request #29291: [SPARK-30276][SQL] Support Filter expression allows simultaneous use of DISTINCT

2020-07-29 Thread GitBox
SparkQA commented on pull request #29291: URL: https://github.com/apache/spark/pull/29291#issuecomment-665703948

[GitHub] [spark] AmplabJenkins commented on pull request #29270: [SPARK-32466][TEST][SQL] Add PlanStabilitySuite to detect SparkPlan regression

2020-07-29 Thread GitBox
AmplabJenkins commented on pull request #29270: URL: https://github.com/apache/spark/pull/29270#issuecomment-665552184

[GitHub] [spark] AmplabJenkins removed a comment on pull request #28761: [SPARK-25557][SQL] Nested column predicate pushdown for ORC

2020-07-29 Thread GitBox
AmplabJenkins removed a comment on pull request #28761: URL: https://github.com/apache/spark/pull/28761#issuecomment-665612446

[GitHub] [spark] AmplabJenkins removed a comment on pull request #29292: [SPARK-30322][DOCS] Add stage level scheduling docs

2020-07-29 Thread GitBox
AmplabJenkins removed a comment on pull request #29292: URL: https://github.com/apache/spark/pull/29292#issuecomment-665722763

[GitHub] [spark] liangz1 opened a new pull request #29284: [SPARK-32479] Fix the slicing logic in createDataFrame when converting pandas dataframe to arrow table

2020-07-29 Thread GitBox
liangz1 opened a new pull request #29284: URL: https://github.com/apache/spark/pull/29284 ### What changes were proposed in this pull request? In `spark.createDataFrame(pdf: pd.DataFrame)`, the pdf will be sliced to `num_slices=defaultParallelism` slices before

[GitHub] [spark] AmplabJenkins commented on pull request #29283: [SPARK-32478][R][SQL] Error message to show the schema mismatch in gapply with Arrow vectorization

2020-07-29 Thread GitBox
AmplabJenkins commented on pull request #29283: URL: https://github.com/apache/spark/pull/29283#issuecomment-665525169

[GitHub] [spark] SparkQA commented on pull request #29284: [SPARK-32479][PYSPARK] Fix the slicing logic in createDataFrame when converting pandas dataframe to arrow table

2020-07-29 Thread GitBox
SparkQA commented on pull request #29284: URL: https://github.com/apache/spark/pull/29284#issuecomment-665665154

[GitHub] [spark] SparkQA removed a comment on pull request #29283: [SPARK-32478][R][SQL] Error message to show the schema mismatch in gapply with Arrow vectorization

2020-07-29 Thread GitBox
SparkQA removed a comment on pull request #29283: URL: https://github.com/apache/spark/pull/29283#issuecomment-665524460

[GitHub] [spark] dongjoon-hyun commented on pull request #29202: [SPARK-32406][SQL] Make RESET syntax support single configuration reset

2020-07-29 Thread GitBox
dongjoon-hyun commented on pull request #29202: URL: https://github.com/apache/spark/pull/29202#issuecomment-665722956 Thank you for your advice, @gatorsmile.

[GitHub] [spark] dongjoon-hyun commented on pull request #29282: [SPARK-32477][CORE] JsonProtocol.accumulablesToJson should be deterministic

2020-07-29 Thread GitBox
dongjoon-hyun commented on pull request #29282: URL: https://github.com/apache/spark/pull/29282#issuecomment-665705462

[GitHub] [spark] prgitpr opened a new pull request #29288: [SPARK-32484][SQL] Not accurate Log Info in BroadcastExchangeExec.scala

2020-07-29 Thread GitBox
prgitpr opened a new pull request #29288: URL: https://github.com/apache/spark/pull/29288 The log message s"Cannot broadcast the table that is larger than 8GB: ${dataSize >> 30} GB" is inaccurate, because the 8GB figure is not accurate. ### What changes were proposed in this pull request?

[GitHub] [spark] cloud-fan commented on a change in pull request #29146: [WIP][SPARK-32257][SQL] Reports explicit errors for invalid usage of SET/RESET command

2020-07-29 Thread GitBox
cloud-fan commented on a change in pull request #29146: URL: https://github.com/apache/spark/pull/29146#discussion_r462359914 ## File path: sql/catalyst/src/main/antlr4/org/apache/spark/sql/catalyst/parser/SqlBase.g4 ## @@ -244,11 +258,31 @@ statement | SET TIME ZONE

[GitHub] [spark] gaborgsomogyi opened a new pull request #29289: [SPARK-32482][SS][TESTS] Eliminate deprecated poll(long) API calls to avoid infinite wait in tests

2020-07-29 Thread GitBox
gaborgsomogyi opened a new pull request #29289: URL: https://github.com/apache/spark/pull/29289 ### What changes were proposed in this pull request? Structured Streaming Kafka connector tests are now using a deprecated `poll(long)` API which could cause infinite wait. In this PR I've

[GitHub] [spark] AmplabJenkins commented on pull request #29282: [SPARK-32477][CORE] JsonProtocol.accumulablesToJson should be deterministic

2020-07-29 Thread GitBox
AmplabJenkins commented on pull request #29282: URL: https://github.com/apache/spark/pull/29282#issuecomment-665557833
