[GitHub] [spark] msamirkhan edited a comment on pull request #29353: [SPARK-32532][SQL] Improve ORC read/write performance on nested structs and array of structs

2020-08-05 Thread GitBox
msamirkhan edited a comment on pull request #29353: URL: https://github.com/apache/spark/pull/29353#issuecomment-669459288 Created https://github.com/apache/spark/pull/29366 for the changes referred to in https://github.com/apache/spark/pull/29353#discussion_r465955120

[GitHub] [spark] AmplabJenkins commented on pull request #28761: [SPARK-25557][SQL][test-hive1.2] Nested column predicate pushdown for ORC

2020-08-05 Thread GitBox
AmplabJenkins commented on pull request #28761: URL: https://github.com/apache/spark/pull/28761#issuecomment-669599299 This is an automated message from the Apache Git Service. To respond to the message, please log on to

[GitHub] [spark] AmplabJenkins removed a comment on pull request #28761: [SPARK-25557][SQL][test-hive1.2] Nested column predicate pushdown for ORC

2020-08-05 Thread GitBox
AmplabJenkins removed a comment on pull request #28761: URL: https://github.com/apache/spark/pull/28761#issuecomment-669599299

[GitHub] [spark] AmplabJenkins commented on pull request #29366: [SPARK-32550][SQL] Make SpecificInternalRow constructors faster by using while loops instead of maps

2020-08-05 Thread GitBox
AmplabJenkins commented on pull request #29366: URL: https://github.com/apache/spark/pull/29366#issuecomment-669599127 Can one of the admins verify this patch?

[GitHub] [spark] AmplabJenkins commented on pull request #29366: [SPARK-32550][SQL] Make SpecificInternalRow constructors faster by using while loops instead of maps

2020-08-05 Thread GitBox
AmplabJenkins commented on pull request #29366: URL: https://github.com/apache/spark/pull/29366#issuecomment-669598823 Can one of the admins verify this patch?

[GitHub] [spark] SparkQA commented on pull request #28761: [SPARK-25557][SQL][test-hive1.2] Nested column predicate pushdown for ORC

2020-08-05 Thread GitBox
SparkQA commented on pull request #28761: URL: https://github.com/apache/spark/pull/28761#issuecomment-669598955 **[Test build #127109 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/127109/testReport)** for PR 28761 at commit

[GitHub] [spark] msamirkhan opened a new pull request #29366: [SPARK-32550][SQL] Make SpecificInternalRow constructors faster by using while loops instead of maps

2020-08-05 Thread GitBox
msamirkhan opened a new pull request #29366: URL: https://github.com/apache/spark/pull/29366 ### What changes were proposed in this pull request? Change maps in two constructors of SpecificInternalRow to while loops. ### Why are the changes needed? This was

[GitHub] [spark] viirya commented on pull request #28761: [SPARK-25557][SQL][test-hive2.3] Nested column predicate pushdown for ORC

2020-08-05 Thread GitBox
viirya commented on pull request #28761: URL: https://github.com/apache/spark/pull/28761#issuecomment-669597182 let me test hive-1.2 again.

[GitHub] [spark] viirya commented on pull request #28761: [SPARK-25557][SQL][test-hive1.2] Nested column predicate pushdown for ORC

2020-08-05 Thread GitBox
viirya commented on pull request #28761: URL: https://github.com/apache/spark/pull/28761#issuecomment-669597280 retest this please

[GitHub] [spark] holdenk commented on pull request #29211: [SPARK-31197][CORE] Shutdown executor once we are done decommissioning

2020-08-05 Thread GitBox
holdenk commented on pull request #29211: URL: https://github.com/apache/spark/pull/29211#issuecomment-669596854 Merged to the dev branch :)

[GitHub] [spark] asfgit closed pull request #29211: [SPARK-31197][CORE] Shutdown executor once we are done decommissioning

2020-08-05 Thread GitBox
asfgit closed pull request #29211: URL: https://github.com/apache/spark/pull/29211

[GitHub] [spark] holdenk commented on pull request #28817: [WIP][SPARK-31197][CORE] Exit the executor once all tasks and migrations are finished built on top of spark20629

2020-08-05 Thread GitBox
holdenk commented on pull request #28817: URL: https://github.com/apache/spark/pull/28817#issuecomment-669596710 It's been merged in the non-WIP version of this PR.

[GitHub] [spark] AmplabJenkins commented on pull request #29365: [WIP][SPARK-32549][PYSPARK] Add column name in _infer_schema error message

2020-08-05 Thread GitBox
AmplabJenkins commented on pull request #29365: URL: https://github.com/apache/spark/pull/29365#issuecomment-669596575

[GitHub] [spark] SparkQA removed a comment on pull request #29365: [WIP][SPARK-32549][PYSPARK] Add column name in _infer_schema error message

2020-08-05 Thread GitBox
SparkQA removed a comment on pull request #29365: URL: https://github.com/apache/spark/pull/29365#issuecomment-669589492 **[Test build #127108 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/127108/testReport)** for PR 29365 at commit

[GitHub] [spark] holdenk closed pull request #28817: [WIP][SPARK-31197][CORE] Exit the executor once all tasks and migrations are finished built on top of spark20629

2020-08-05 Thread GitBox
holdenk closed pull request #28817: URL: https://github.com/apache/spark/pull/28817

[GitHub] [spark] AmplabJenkins removed a comment on pull request #29365: [WIP][SPARK-32549][PYSPARK] Add column name in _infer_schema error message

2020-08-05 Thread GitBox
AmplabJenkins removed a comment on pull request #29365: URL: https://github.com/apache/spark/pull/29365#issuecomment-669596575

[GitHub] [spark] SparkQA commented on pull request #29365: [WIP][SPARK-32549][PYSPARK] Add column name in _infer_schema error message

2020-08-05 Thread GitBox
SparkQA commented on pull request #29365: URL: https://github.com/apache/spark/pull/29365#issuecomment-669596334 **[Test build #127108 has finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/127108/testReport)** for PR 29365 at commit

[GitHub] [spark] holdenk commented on pull request #29211: [SPARK-31197][CORE] Shutdown executor once we are done decommissioning

2020-08-05 Thread GitBox
holdenk commented on pull request #29211: URL: https://github.com/apache/spark/pull/29211#issuecomment-669595846 Since it's approved and passing Jenkins & GHA, I'm going to merge this, but I'll add the comment about the time in the follow-up, which I'll rebase on top of this after the merge.

[GitHub] [spark] AmplabJenkins removed a comment on pull request #28761: [SPARK-25557][SQL][test-hive2.3] Nested column predicate pushdown for ORC

2020-08-05 Thread GitBox
AmplabJenkins removed a comment on pull request #28761: URL: https://github.com/apache/spark/pull/28761#issuecomment-669593486

[GitHub] [spark] AmplabJenkins commented on pull request #28761: [SPARK-25557][SQL][test-hive2.3] Nested column predicate pushdown for ORC

2020-08-05 Thread GitBox
AmplabJenkins commented on pull request #28761: URL: https://github.com/apache/spark/pull/28761#issuecomment-669593486

[GitHub] [spark] SparkQA commented on pull request #28761: [SPARK-25557][SQL][test-hive2.3] Nested column predicate pushdown for ORC

2020-08-05 Thread GitBox
SparkQA commented on pull request #28761: URL: https://github.com/apache/spark/pull/28761#issuecomment-669592883 **[Test build #127102 has finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/127102/testReport)** for PR 28761 at commit

[GitHub] [spark] SparkQA removed a comment on pull request #28761: [SPARK-25557][SQL][test-hive2.3] Nested column predicate pushdown for ORC

2020-08-05 Thread GitBox
SparkQA removed a comment on pull request #28761: URL: https://github.com/apache/spark/pull/28761#issuecomment-669398172 **[Test build #127102 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/127102/testReport)** for PR 28761 at commit

[GitHub] [spark] AmplabJenkins removed a comment on pull request #29365: [WIP][SPARK-32549][PYSPARK] Add column name in _infer_schema error message

2020-08-05 Thread GitBox
AmplabJenkins removed a comment on pull request #29365: URL: https://github.com/apache/spark/pull/29365#issuecomment-669589886

[GitHub] [spark] AmplabJenkins commented on pull request #29365: [WIP][SPARK-32549][PYSPARK] Add column name in _infer_schema error message

2020-08-05 Thread GitBox
AmplabJenkins commented on pull request #29365: URL: https://github.com/apache/spark/pull/29365#issuecomment-669589886

[GitHub] [spark] SparkQA commented on pull request #29365: [WIP][SPARK-32549][PYSPARK] Add column name in _infer_schema error message

2020-08-05 Thread GitBox
SparkQA commented on pull request #29365: URL: https://github.com/apache/spark/pull/29365#issuecomment-669589492 **[Test build #127108 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/127108/testReport)** for PR 29365 at commit

[GitHub] [spark] AmplabJenkins removed a comment on pull request #29365: [WIP][SPARK-32549][PYSPARK] Add column name in _infer_schema error message

2020-08-05 Thread GitBox
AmplabJenkins removed a comment on pull request #29365: URL: https://github.com/apache/spark/pull/29365#issuecomment-669587791 Test FAILed. Refer to this link for build results (access rights to CI server needed):

[GitHub] [spark] AmplabJenkins commented on pull request #29365: [WIP][SPARK-32549][PYSPARK] Add column name in _infer_schema error message

2020-08-05 Thread GitBox
AmplabJenkins commented on pull request #29365: URL: https://github.com/apache/spark/pull/29365#issuecomment-669587783

[GitHub] [spark] AmplabJenkins removed a comment on pull request #29365: [WIP][SPARK-32549][PYSPARK] Add column name in _infer_schema error message

2020-08-05 Thread GitBox
AmplabJenkins removed a comment on pull request #29365: URL: https://github.com/apache/spark/pull/29365#issuecomment-669587783 Merged build finished. Test FAILed.

[GitHub] [spark] SparkQA removed a comment on pull request #29365: [WIP][SPARK-32549][PYSPARK] Add column name in _infer_schema error message

2020-08-05 Thread GitBox
SparkQA removed a comment on pull request #29365: URL: https://github.com/apache/spark/pull/29365#issuecomment-669580170 **[Test build #127107 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/127107/testReport)** for PR 29365 at commit

[GitHub] [spark] SparkQA commented on pull request #29365: [WIP][SPARK-32549][PYSPARK] Add column name in _infer_schema error message

2020-08-05 Thread GitBox
SparkQA commented on pull request #29365: URL: https://github.com/apache/spark/pull/29365#issuecomment-669587690 **[Test build #127107 has finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/127107/testReport)** for PR 29365 at commit

[GitHub] [spark] AmplabJenkins removed a comment on pull request #29365: [WIP] Add column name in _infer_schema error message

2020-08-05 Thread GitBox
AmplabJenkins removed a comment on pull request #29365: URL: https://github.com/apache/spark/pull/29365#issuecomment-669582185

[GitHub] [spark] AmplabJenkins commented on pull request #29365: [WIP] Add column name in _infer_schema error message

2020-08-05 Thread GitBox
AmplabJenkins commented on pull request #29365: URL: https://github.com/apache/spark/pull/29365#issuecomment-669582185

[GitHub] [spark] SparkQA commented on pull request #29365: [WIP] Add column name in _infer_schema error message

2020-08-05 Thread GitBox
SparkQA commented on pull request #29365: URL: https://github.com/apache/spark/pull/29365#issuecomment-669580170 **[Test build #127107 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/127107/testReport)** for PR 29365 at commit

[GitHub] [spark] liangz1 opened a new pull request #29365: [WIP] Add column name in _infer_schema error message

2020-08-05 Thread GitBox
liangz1 opened a new pull request #29365: URL: https://github.com/apache/spark/pull/29365 ### What changes were proposed in this pull request? ### Why are the changes needed? ### Does this PR introduce _any_ user-facing change? ### How

[GitHub] [spark] allisonwang-db commented on a change in pull request #29137: [SPARK-32337][SQL] Show initial plan in AQE plan tree string

2020-08-05 Thread GitBox
allisonwang-db commented on a change in pull request #29137: URL: https://github.com/apache/spark/pull/29137#discussion_r466043640 ## File path: sql/core/src/main/scala/org/apache/spark/sql/execution/adaptive/AdaptiveSparkPlanExec.scala ## @@ -279,34 +280,69 @@ case class

[GitHub] [spark] tianczha commented on a change in pull request #28618: [SPARK-31801][API][SHUFFLE] Register map output metadata

2020-08-05 Thread GitBox
tianczha commented on a change in pull request #28618: URL: https://github.com/apache/spark/pull/28618#discussion_r466028728 ## File path: core/src/main/java/org/apache/spark/shuffle/sort/io/LocalDiskShuffleDataIO.java ## @@ -35,12 +37,13 @@ public

[GitHub] [spark] AmplabJenkins commented on pull request #29364: [SPARK-32548][SQL] - Add Application attemptId support to SQL Rest API

2020-08-05 Thread GitBox
AmplabJenkins commented on pull request #29364: URL: https://github.com/apache/spark/pull/29364#issuecomment-669544416

[GitHub] [spark] AmplabJenkins removed a comment on pull request #29364: [SPARK-32548][SQL] - Add Application attemptId support to SQL Rest API

2020-08-05 Thread GitBox
AmplabJenkins removed a comment on pull request #29364: URL: https://github.com/apache/spark/pull/29364#issuecomment-669544416

[GitHub] [spark] SparkQA commented on pull request #29364: [SPARK-32548][SQL] - Add Application attemptId support to SQL Rest API

2020-08-05 Thread GitBox
SparkQA commented on pull request #29364: URL: https://github.com/apache/spark/pull/29364#issuecomment-669543966 **[Test build #127106 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/127106/testReport)** for PR 29364 at commit

[GitHub] [spark] viirya commented on pull request #28761: [SPARK-25557][SQL][test-hive2.3] Nested column predicate pushdown for ORC

2020-08-05 Thread GitBox
viirya commented on pull request #28761: URL: https://github.com/apache/spark/pull/28761#issuecomment-669540195 Thanks @dongjoon-hyun

[GitHub] [spark] dongjoon-hyun edited a comment on pull request #29363: [SPARK-32546][SQL] Get table names directly from Hive tables

2020-08-05 Thread GitBox
dongjoon-hyun edited a comment on pull request #29363: URL: https://github.com/apache/spark/pull/29363#issuecomment-669534406 Why don't you put your comment into the PR description? The "How was this patch tested?" section is designed for that. In addition to that, please note that what

[GitHub] [spark] dongjoon-hyun commented on pull request #29363: [SPARK-32546][SQL] Get table names directly from Hive tables

2020-08-05 Thread GitBox
dongjoon-hyun commented on pull request #29363: URL: https://github.com/apache/spark/pull/29363#issuecomment-669534406 Why don't you put your comment into the PR description? The "How was this patch tested?" section is designed for that.

[GitHub] [spark] dongjoon-hyun commented on pull request #28761: [SPARK-25557][SQL][test-hive2.3] Nested column predicate pushdown for ORC

2020-08-05 Thread GitBox
dongjoon-hyun commented on pull request #28761: URL: https://github.com/apache/spark/pull/28761#issuecomment-669530622 Oh, the Hive bug was the one which broke our Maven build until today. It's fixed here: https://github.com/apache/spark/commit/1b6f482adbd5e48e6376ed6896ff968dbe75c1d3

[GitHub] [spark] AmplabJenkins removed a comment on pull request #29364: [SPARK-32548][SQL] - Add Application attemptId support to SQL Rest API

2020-08-05 Thread GitBox
AmplabJenkins removed a comment on pull request #29364: URL: https://github.com/apache/spark/pull/29364#issuecomment-669527685 Test FAILed. Refer to this link for build results (access rights to CI server needed):

[GitHub] [spark] SparkQA removed a comment on pull request #29364: [SPARK-32548][SQL] - Add Application attemptId support to SQL Rest API

2020-08-05 Thread GitBox
SparkQA removed a comment on pull request #29364: URL: https://github.com/apache/spark/pull/29364#issuecomment-669526582 **[Test build #127105 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/127105/testReport)** for PR 29364 at commit

[GitHub] [spark] AmplabJenkins removed a comment on pull request #29364: [SPARK-32548][SQL] - Add Application attemptId support to SQL Rest API

2020-08-05 Thread GitBox
AmplabJenkins removed a comment on pull request #29364: URL: https://github.com/apache/spark/pull/29364#issuecomment-669527672 Merged build finished. Test FAILed.

[GitHub] [spark] SparkQA commented on pull request #29364: [SPARK-32548][SQL] - Add Application attemptId support to SQL Rest API

2020-08-05 Thread GitBox
SparkQA commented on pull request #29364: URL: https://github.com/apache/spark/pull/29364#issuecomment-669527660 **[Test build #127105 has finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/127105/testReport)** for PR 29364 at commit

[GitHub] [spark] AmplabJenkins commented on pull request #29364: [SPARK-32548][SQL] - Add Application attemptId support to SQL Rest API

2020-08-05 Thread GitBox
AmplabJenkins commented on pull request #29364: URL: https://github.com/apache/spark/pull/29364#issuecomment-669527672

[GitHub] [spark] AmplabJenkins removed a comment on pull request #29364: [SPARK-32548][SQL] - Add Application attemptId support to SQL Rest API

2020-08-05 Thread GitBox
AmplabJenkins removed a comment on pull request #29364: URL: https://github.com/apache/spark/pull/29364#issuecomment-669523576

[GitHub] [spark] SparkQA commented on pull request #29364: [SPARK-32548][SQL] - Add Application attemptId support to SQL Rest API

2020-08-05 Thread GitBox
SparkQA commented on pull request #29364: URL: https://github.com/apache/spark/pull/29364#issuecomment-669526582 **[Test build #127105 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/127105/testReport)** for PR 29364 at commit

[GitHub] [spark] erenavsarogullari commented on pull request #29364: [SPARK-32548][SQL] - Add Application attemptId support to SQL Rest API

2020-08-05 Thread GitBox
erenavsarogullari commented on pull request #29364: URL: https://github.com/apache/spark/pull/29364#issuecomment-669523666 cc @gengliangwang

[GitHub] [spark] AmplabJenkins commented on pull request #29364: [SPARK-32548][SQL] - Add Application attemptId support to SQL Rest API

2020-08-05 Thread GitBox
AmplabJenkins commented on pull request #29364: URL: https://github.com/apache/spark/pull/29364#issuecomment-669523576

[GitHub] [spark] erenavsarogullari opened a new pull request #29364: [SPARK-32548][SQL] - Add Application attemptId support to SQL Rest API

2020-08-05 Thread GitBox
erenavsarogullari opened a new pull request #29364: URL: https://github.com/apache/spark/pull/29364 ### What changes were proposed in this pull request? Currently, Spark Public Rest APIs support Application attemptId except SQL API. This causes `no such app: application_X` issue when

[GitHub] [spark] msamirkhan commented on pull request #29354: [SPARK-32533][SQL] Improve Avro read/write performance on nested structs and array of structs

2020-08-05 Thread GitBox
msamirkhan commented on pull request #29354: URL: https://github.com/apache/spark/pull/29354#issuecomment-669519533 The [pdf attached to the PR](https://github.com/apache/spark/files/5025167/AvroBenchmarks.pdf) contains the read and write time improvements with the commits. For

[GitHub] [spark] msamirkhan edited a comment on pull request #29354: [SPARK-32533][SQL] Improve Avro read/write performance on nested structs and array of structs

2020-08-05 Thread GitBox
msamirkhan edited a comment on pull request #29354: URL: https://github.com/apache/spark/pull/29354#issuecomment-669519533 The [pdf attached to the PR](https://github.com/apache/spark/files/5025167/AvroBenchmarks.pdf) contains the read and write time improvements with the commits. I have

[GitHub] [spark] AmplabJenkins removed a comment on pull request #29211: [SPARK-31197][CORE] Shutdown executor once we are done decommissioning

2020-08-05 Thread GitBox
AmplabJenkins removed a comment on pull request #29211: URL: https://github.com/apache/spark/pull/29211#issuecomment-669517201

[GitHub] [spark] stczwd commented on a change in pull request #29339: [Spark-32512][SQL] add alter table add/drop partition command for datasourcev2

2020-08-05 Thread GitBox
stczwd commented on a change in pull request #29339: URL: https://github.com/apache/spark/pull/29339#discussion_r466011087 ## File path: sql/core/src/main/scala/org/apache/spark/sql/execution/datasources/v2/AlterTableAddPartitionExec.scala ## @@ -0,0 +1,48 @@ +/* + * Licensed

[GitHub] [spark] AmplabJenkins commented on pull request #29211: [SPARK-31197][CORE] Shutdown executor once we are done decommissioning

2020-08-05 Thread GitBox
AmplabJenkins commented on pull request #29211: URL: https://github.com/apache/spark/pull/29211#issuecomment-669517201

[GitHub] [spark] SparkQA removed a comment on pull request #29211: [SPARK-31197][CORE] Shutdown executor once we are done decommissioning

2020-08-05 Thread GitBox
SparkQA removed a comment on pull request #29211: URL: https://github.com/apache/spark/pull/29211#issuecomment-669390805 **[Test build #127101 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/127101/testReport)** for PR 29211 at commit

[GitHub] [spark] SparkQA commented on pull request #29211: [SPARK-31197][CORE] Shutdown executor once we are done decommissioning

2020-08-05 Thread GitBox
SparkQA commented on pull request #29211: URL: https://github.com/apache/spark/pull/29211#issuecomment-669515934 **[Test build #127101 has finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/127101/testReport)** for PR 29211 at commit

[GitHub] [spark] msamirkhan commented on a change in pull request #29354: [SPARK-32533][SQL] Improve Avro read/write performance on nested structs and array of structs

2020-08-05 Thread GitBox
msamirkhan commented on a change in pull request #29354: URL: https://github.com/apache/spark/pull/29354#discussion_r465980330 ## File path: external/avro/src/main/scala/org/apache/spark/sql/avro/AvroOutputWriterFactory.scala ## @@ -40,6 +40,8 @@ private[sql] class

[GitHub] [spark] msamirkhan commented on a change in pull request #29354: [SPARK-32533][SQL] Improve Avro read/write performance on nested structs and array of structs

2020-08-05 Thread GitBox
msamirkhan commented on a change in pull request #29354: URL: https://github.com/apache/spark/pull/29354#discussion_r466004477 ## File path: external/avro/src/main/scala/org/apache/spark/sql/avro/SparkAvroDatumWriter.scala ## @@ -125,42 +125,42 @@ class

[GitHub] [spark] rdblue commented on a change in pull request #29339: [Spark-32512][SQL] add alter table add/drop partition command for datasourcev2

2020-08-05 Thread GitBox
rdblue commented on a change in pull request #29339: URL: https://github.com/apache/spark/pull/29339#discussion_r466002905 ## File path: sql/core/src/main/scala/org/apache/spark/sql/execution/datasources/v2/AlterTableAddPartitionExec.scala ## @@ -0,0 +1,48 @@ +/* + * Licensed

[GitHub] [spark] msamirkhan commented on a change in pull request #29354: [WIP][Spark-32533][SQL] Improve Avro read/write performance on nested structs and array of structs

2020-08-05 Thread GitBox
msamirkhan commented on a change in pull request #29354: URL: https://github.com/apache/spark/pull/29354#discussion_r465983317 ## File path: external/avro/src/main/scala/org/apache/spark/sql/avro/AvroDeserializer.scala ## @@ -182,19 +182,21 @@ class AvroDeserializer(

[GitHub] [spark] stczwd commented on a change in pull request #29339: [Spark-32512][SQL] add alter table add/drop partition command for datasourcev2

2020-08-05 Thread GitBox
stczwd commented on a change in pull request #29339: URL: https://github.com/apache/spark/pull/29339#discussion_r465996986 ## File path: sql/core/src/main/scala/org/apache/spark/sql/execution/datasources/v2/AlterTableAddPartitionExec.scala ## @@ -0,0 +1,48 @@ +/* + * Licensed

[GitHub] [spark] msamirkhan commented on a change in pull request #29354: [WIP][Spark-32533][SQL] Improve Avro read/write performance on nested structs and array of structs

2020-08-05 Thread GitBox
msamirkhan commented on a change in pull request #29354: URL: https://github.com/apache/spark/pull/29354#discussion_r466001347 ## File path: external/avro/src/main/scala/org/apache/spark/sql/avro/SparkAvroDatumReader.scala ## @@ -421,12 +418,10 @@ class

[GitHub] [spark] msamirkhan commented on a change in pull request #29354: [WIP][Spark-32533][SQL] Improve Avro read/write performance on nested structs and array of structs

2020-08-05 Thread GitBox
msamirkhan commented on a change in pull request #29354: URL: https://github.com/apache/spark/pull/29354#discussion_r466000797 ## File path: external/avro/src/main/scala/org/apache/spark/sql/avro/SparkAvroDatumReader.scala ## @@ -638,90 +628,57 @@ class

[GitHub] [spark] msamirkhan commented on a change in pull request #29354: [WIP][Spark-32533][SQL] Improve Avro read/write performance on nested structs and array of structs

2020-08-05 Thread GitBox
msamirkhan commented on a change in pull request #29354: URL: https://github.com/apache/spark/pull/29354#discussion_r466000481 ## File path: external/avro/src/main/scala/org/apache/spark/sql/avro/SparkAvroDatumReader.scala ## @@ -638,90 +628,57 @@ class

[GitHub] [spark] msamirkhan commented on a change in pull request #29354: [WIP][Spark-32533][SQL] Improve Avro read/write performance on nested structs and array of structs

2020-08-05 Thread GitBox
msamirkhan commented on a change in pull request #29354: URL: https://github.com/apache/spark/pull/29354#discussion_r465999766 ## File path: external/avro/src/main/scala/org/apache/spark/sql/avro/SparkAvroDatumReader.scala ## @@ -88,17 +87,10 @@ class SparkAvroDatumReader[T](

[GitHub] [spark] agrawaldevesh commented on a change in pull request #29211: [SPARK-31197][CORE] Shutdown executor once we are done decommissioning

2020-08-05 Thread GitBox
agrawaldevesh commented on a change in pull request #29211: URL: https://github.com/apache/spark/pull/29211#discussion_r465999198 ## File path: core/src/test/scala/org/apache/spark/storage/BlockManagerDecommissionIntegrationSuite.scala ## @@ -266,18 +266,17 @@ class

[GitHub] [spark] stczwd commented on a change in pull request #29339: [Spark-32512][SQL] add alter table add/drop partition command for datasourcev2

2020-08-05 Thread GitBox
stczwd commented on a change in pull request #29339: URL: https://github.com/apache/spark/pull/29339#discussion_r465999241 ## File path: sql/catalyst/src/main/scala/org/apache/spark/sql/execution/datasources/v2/DataSourceV2Implicits.scala ## @@ -62,4 +74,31 @@ object

[GitHub] [spark] msamirkhan commented on a change in pull request #29354: [WIP][Spark-32533][SQL] Improve Avro read/write performance on nested structs and array of structs

2020-08-05 Thread GitBox
msamirkhan commented on a change in pull request #29354: URL: https://github.com/apache/spark/pull/29354#discussion_r465998634 ## File path: external/avro/src/main/scala/org/apache/spark/sql/avro/SparkAvroDatumReader.scala ## @@ -809,51 +735,111 @@ class

[GitHub] [spark] squito commented on a change in pull request #28885: [SPARK-29375][SPARK-28940][SPARK-32041][SQL] Whole plan exchange and subquery reuse

2020-08-05 Thread GitBox
squito commented on a change in pull request #28885: URL: https://github.com/apache/spark/pull/28885#discussion_r465997277 ## File path: sql/core/src/test/scala/org/apache/spark/sql/execution/ExchangeSuite.scala ## @@ -156,4 +158,46 @@ class ExchangeSuite extends

[GitHub] [spark] msamirkhan commented on a change in pull request #29354: [WIP][Spark-32533][SQL] Improve Avro read/write performance on nested structs and array of structs

2020-08-05 Thread GitBox
msamirkhan commented on a change in pull request #29354: URL: https://github.com/apache/spark/pull/29354#discussion_r465997315 ## File path: external/avro/src/main/scala/org/apache/spark/sql/avro/SparkAvroDatumReader.scala ## @@ -809,51 +735,111 @@ class

[GitHub] [spark] msamirkhan commented on a change in pull request #29354: [WIP][Spark-32533][SQL] Improve Avro read/write performance on nested structs and array of structs

2020-08-05 Thread GitBox
msamirkhan commented on a change in pull request #29354: URL: https://github.com/apache/spark/pull/29354#discussion_r465996100 ## File path: external/avro/src/main/scala/org/apache/spark/sql/avro/SparkAvroDatumReader.scala ## @@ -809,51 +735,111 @@ class

[GitHub] [spark] msamirkhan commented on a change in pull request #29354: [WIP][Spark-32533][SQL] Improve Avro read/write performance on nested structs and array of structs

2020-08-05 Thread GitBox
msamirkhan commented on a change in pull request #29354: URL: https://github.com/apache/spark/pull/29354#discussion_r465995487 ## File path: external/avro/src/main/scala/org/apache/spark/sql/avro/SparkAvroDatumReader.scala ## @@ -420,10 +426,27 @@ class

[GitHub] [spark] msamirkhan commented on a change in pull request #29354: [WIP][Spark-32533][SQL] Improve Avro read/write performance on nested structs and array of structs

2020-08-05 Thread GitBox
msamirkhan commented on a change in pull request #29354: URL: https://github.com/apache/spark/pull/29354#discussion_r465994643 ## File path: external/avro/src/main/scala/org/apache/spark/sql/avro/SparkAvroDatumReader.scala ## @@ -452,71 +452,73 @@ class

[GitHub] [spark] msamirkhan commented on a change in pull request #29354: [WIP][Spark-32533][SQL] Improve Avro read/write performance on nested structs and array of structs

2020-08-05 Thread GitBox
msamirkhan commented on a change in pull request #29354: URL: https://github.com/apache/spark/pull/29354#discussion_r465994403 ## File path: external/avro/src/main/scala/org/apache/spark/sql/avro/SparkAvroDatumReader.scala ## @@ -452,71 +452,73 @@ class

[GitHub] [spark] msamirkhan commented on a change in pull request #29354: [WIP][Spark-32533][SQL] Improve Avro read/write performance on nested structs and array of structs

2020-08-05 Thread GitBox
msamirkhan commented on a change in pull request #29354: URL: https://github.com/apache/spark/pull/29354#discussion_r465994024 ## File path: sql/catalyst/src/main/java/org/apache/spark/sql/catalyst/expressions/UnsafeArrayData.java ## @@ -356,6 +356,17 @@ public void

[GitHub] [spark] msamirkhan commented on a change in pull request #29354: [WIP][Spark-32533][SQL] Improve Avro read/write performance on nested structs and array of structs

2020-08-05 Thread GitBox
msamirkhan commented on a change in pull request #29354: URL: https://github.com/apache/spark/pull/29354#discussion_r465991846 ## File path: external/avro/src/main/scala/org/apache/spark/sql/avro/SparkAvroDatumWriter.scala ## @@ -0,0 +1,419 @@ +/* + * Licensed to the Apache

[GitHub] [spark] msamirkhan commented on a change in pull request #29354: [WIP][Spark-32533][SQL] Improve Avro read/write performance on nested structs and array of structs

2020-08-05 Thread GitBox
msamirkhan commented on a change in pull request #29354: URL: https://github.com/apache/spark/pull/29354#discussion_r465989873 ## File path: external/avro/src/main/scala/org/apache/spark/sql/avro/SparkAvroDatumReader.scala ## @@ -0,0 +1,811 @@ +/* + * Licensed to the Apache

[GitHub] [spark] msamirkhan commented on a change in pull request #29354: [WIP][Spark-32533][SQL] Improve Avro read/write performance on nested structs and array of structs

2020-08-05 Thread GitBox
msamirkhan commented on a change in pull request #29354: URL: https://github.com/apache/spark/pull/29354#discussion_r465983317 ## File path: external/avro/src/main/scala/org/apache/spark/sql/avro/AvroDeserializer.scala ## @@ -182,19 +182,21 @@ class AvroDeserializer(

[GitHub] [spark] msamirkhan commented on a change in pull request #29354: [WIP][Spark-32533][SQL] Improve Avro read/write performance on nested structs and array of structs

2020-08-05 Thread GitBox
msamirkhan commented on a change in pull request #29354: URL: https://github.com/apache/spark/pull/29354#discussion_r465983050 ## File path: external/avro/src/main/scala/org/apache/spark/sql/avro/AvroDeserializer.scala ## @@ -367,15 +372,45 @@ class AvroDeserializer( }

[GitHub] [spark] msamirkhan commented on a change in pull request #29354: [WIP][Spark-32533][SQL] Improve Avro read/write performance on nested structs and array of structs

2020-08-05 Thread GitBox
msamirkhan commented on a change in pull request #29354: URL: https://github.com/apache/spark/pull/29354#discussion_r465982616 ## File path: external/avro/src/main/scala/org/apache/spark/sql/avro/AvroDeserializer.scala ## @@ -367,15 +372,45 @@ class AvroDeserializer( }

[GitHub] [spark] msamirkhan commented on a change in pull request #29354: [WIP][Spark-32533][SQL] Improve Avro read/write performance on nested structs and array of structs

2020-08-05 Thread GitBox
msamirkhan commented on a change in pull request #29354: URL: https://github.com/apache/spark/pull/29354#discussion_r465980330 ## File path: external/avro/src/main/scala/org/apache/spark/sql/avro/AvroOutputWriterFactory.scala ## @@ -40,6 +40,8 @@ private[sql] class

[GitHub] [spark] rdblue commented on a change in pull request #29339: [Spark-32512][SQL] add alter table add/drop partition command for datasourcev2

2020-08-05 Thread GitBox
rdblue commented on a change in pull request #29339: URL: https://github.com/apache/spark/pull/29339#discussion_r465974060 ## File path: sql/core/src/main/scala/org/apache/spark/sql/execution/datasources/v2/AlterTableAddPartitionExec.scala ## @@ -0,0 +1,48 @@ +/* + * Licensed

[GitHub] [spark] SparkQA commented on pull request #29353: [SPARK-32532][SQL] Improve ORC read/write performance on nested structs and array of structs

2020-08-05 Thread GitBox
SparkQA commented on pull request #29353: URL: https://github.com/apache/spark/pull/29353#issuecomment-669465889 **[Test build #127104 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/127104/testReport)** for PR 29353 at commit

[GitHub] [spark] AmplabJenkins removed a comment on pull request #29353: [SPARK-32532][SQL] Improve ORC read/write performance on nested structs and array of structs

2020-08-05 Thread GitBox
AmplabJenkins removed a comment on pull request #29353: URL: https://github.com/apache/spark/pull/29353#issuecomment-669461208 This is an automated message from the Apache Git Service. To respond to the message, please log on

[GitHub] [spark] AmplabJenkins commented on pull request #29353: [SPARK-32532][SQL] Improve ORC read/write performance on nested structs and array of structs

2020-08-05 Thread GitBox
AmplabJenkins commented on pull request #29353: URL: https://github.com/apache/spark/pull/29353#issuecomment-669461208 This is an automated message from the Apache Git Service. To respond to the message, please log on to

[GitHub] [spark] SparkQA commented on pull request #29353: [SPARK-32532][SQL] Improve ORC read/write performance on nested structs and array of structs

2020-08-05 Thread GitBox
SparkQA commented on pull request #29353: URL: https://github.com/apache/spark/pull/29353#issuecomment-669460508 **[Test build #127103 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/127103/testReport)** for PR 29353 at commit

[GitHub] [spark] msamirkhan commented on a change in pull request #29353: [SPARK-32532][SQL] Improve ORC read/write performance on nested structs and array of structs

2020-08-05 Thread GitBox
msamirkhan commented on a change in pull request #29353: URL: https://github.com/apache/spark/pull/29353#discussion_r465964580 ## File path: sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/expressions/SpecificInternalRow.scala ## @@ -192,24 +192,41 @@ final class

[GitHub] [spark] msamirkhan commented on pull request #29353: [SPARK-32532][SQL] Improve ORC read/write performance on nested structs and array of structs

2020-08-05 Thread GitBox
msamirkhan commented on pull request #29353: URL: https://github.com/apache/spark/pull/29353#issuecomment-669459288 These are the changes referred to in https://github.com/apache/spark/pull/29353#discussion_r465955120 I'll put up a separate PR with this.

[GitHub] [spark] msamirkhan commented on a change in pull request #29353: [SPARK-32532][SQL] Improve ORC read/write performance on nested structs and array of structs

2020-08-05 Thread GitBox
msamirkhan commented on a change in pull request #29353: URL: https://github.com/apache/spark/pull/29353#discussion_r465963022 ## File path: sql/core/src/main/scala/org/apache/spark/sql/execution/datasources/orc/OrcDeserializer.scala ## @@ -72,137 +74,191 @@ class

[GitHub] [spark] msamirkhan commented on a change in pull request #29353: [SPARK-32532][SQL] Improve ORC read/write performance on nested structs and array of structs

2020-08-05 Thread GitBox
msamirkhan commented on a change in pull request #29353: URL: https://github.com/apache/spark/pull/29353#discussion_r465963114 ## File path: sql/core/src/main/scala/org/apache/spark/sql/execution/datasources/orc/OrcDeserializer.scala ## @@ -72,137 +74,191 @@ class

[GitHub] [spark] rdblue commented on a change in pull request #29339: [Spark-32512][SQL] add alter table add/drop partition command for datasourcev2

2020-08-05 Thread GitBox
rdblue commented on a change in pull request #29339: URL: https://github.com/apache/spark/pull/29339#discussion_r465956261 ## File path: sql/catalyst/src/main/scala/org/apache/spark/sql/execution/datasources/v2/DataSourceV2Implicits.scala ## @@ -62,4 +74,31 @@ object

[GitHub] [spark] msamirkhan commented on a change in pull request #29353: [SPARK-32532][SQL] Improve ORC read/write performance on nested structs and array of structs

2020-08-05 Thread GitBox
msamirkhan commented on a change in pull request #29353: URL: https://github.com/apache/spark/pull/29353#discussion_r465860164 ## File path: sql/core/src/main/scala/org/apache/spark/sql/execution/datasources/orc/OrcSerializer.scala ## @@ -150,69 +156,110 @@ class

[GitHub] [spark] msamirkhan commented on a change in pull request #29353: [SPARK-32532][SQL] Improve ORC read/write performance on nested structs and array of structs

2020-08-05 Thread GitBox
msamirkhan commented on a change in pull request #29353: URL: https://github.com/apache/spark/pull/29353#discussion_r465955120 ## File path: sql/core/src/main/scala/org/apache/spark/sql/execution/datasources/orc/OrcDeserializer.scala ## @@ -73,136 +74,180 @@ class

[GitHub] [spark] AmplabJenkins removed a comment on pull request #29357: [SPARK-32539][INFRA] Disallow `FileSystem.get(Configuration conf)` in style check by default

2020-08-05 Thread GitBox
AmplabJenkins removed a comment on pull request #29357: URL: https://github.com/apache/spark/pull/29357#issuecomment-669414888 This is an automated message from the Apache Git Service. To respond to the message, please log on
