[GitHub] [spark] AmplabJenkins removed a comment on pull request #28766: [SPARK-31939][SQL] Fix Parsing day of year when year field pattern is missing

2020-06-09 Thread GitBox
AmplabJenkins removed a comment on pull request #28766: URL: https://github.com/apache/spark/pull/28766#issuecomment-641771608 This is an automated message from the Apache Git Service. To respond to the message, please log on

[GitHub] [spark] AmplabJenkins commented on pull request #28766: [SPARK-31939][SQL] Fix Parsing day of year when year field pattern is missing

2020-06-09 Thread GitBox
AmplabJenkins commented on pull request #28766: URL: https://github.com/apache/spark/pull/28766#issuecomment-641771608

[GitHub] [spark] dilipbiswal edited a comment on pull request #28750: [SPARK-31916][SQL] StringConcat can lead to StringIndexOutOfBoundsException

2020-06-09 Thread GitBox
dilipbiswal edited a comment on pull request #28750: URL: https://github.com/apache/spark/pull/28750#issuecomment-641766979 @maropu > Have you checked my last comment? #28750 (comment) The PR itself looks okay. Sorry, I missed that. I have added the comment now. @viirya

[GitHub] [spark] AmplabJenkins removed a comment on pull request #28733: [SPARK-31705][SQL] Push more possible predicates through Join via CNF conversion

2020-06-09 Thread GitBox
AmplabJenkins removed a comment on pull request #28733: URL: https://github.com/apache/spark/pull/28733#issuecomment-641770390

[GitHub] [spark] SparkQA removed a comment on pull request #28766: [SPARK-31939][SQL] Fix Parsing day of year when year field pattern is missing

2020-06-09 Thread GitBox
SparkQA removed a comment on pull request #28766: URL: https://github.com/apache/spark/pull/28766#issuecomment-641679788 **[Test build #123715 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/123715/testReport)** for PR 28766 at commit [`a11a049`](https://gi

[GitHub] [spark] SparkQA commented on pull request #28766: [SPARK-31939][SQL] Fix Parsing day of year when year field pattern is missing

2020-06-09 Thread GitBox
SparkQA commented on pull request #28766: URL: https://github.com/apache/spark/pull/28766#issuecomment-641770461 **[Test build #123715 has finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/123715/testReport)** for PR 28766 at commit [`a11a049`](https://github.co

[GitHub] [spark] AmplabJenkins commented on pull request #28733: [SPARK-31705][SQL] Push more possible predicates through Join via CNF conversion

2020-06-09 Thread GitBox
AmplabJenkins commented on pull request #28733: URL: https://github.com/apache/spark/pull/28733#issuecomment-641770390

[GitHub] [spark] cloud-fan commented on a change in pull request #28733: [SPARK-31705][SQL] Push more possible predicates through Join via CNF conversion

2020-06-09 Thread GitBox
cloud-fan commented on a change in pull request #28733: URL: https://github.com/apache/spark/pull/28733#discussion_r437898264 ## File path: sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/expressions/predicates.scala ## @@ -198,6 +200,90 @@ trait PredicateHelper {

[GitHub] [spark] AmplabJenkins removed a comment on pull request #28775: [SPARK-31486][CORE][FOLLOW-UP] Use ConfigEntry for config "spark.standalone.submit.waitAppCompletion"

2020-06-09 Thread GitBox
AmplabJenkins removed a comment on pull request #28775: URL: https://github.com/apache/spark/pull/28775#issuecomment-641770136

[GitHub] [spark] xuanyuanking commented on pull request #28737: [SPARK-31913][SQL] Fix StackOverflowError in FileScanRDD

2020-06-09 Thread GitBox
xuanyuanking commented on pull request #28737: URL: https://github.com/apache/spark/pull/28737#issuecomment-641769868 Let me clarify. The issue is that the recursive calls in FileScanRDD cause a StackOverflowError when there are too many empty files. Could you please quantify the number of em

[GitHub] [spark] AmplabJenkins commented on pull request #28775: [SPARK-31486][CORE][FOLLOW-UP] Use ConfigEntry for config "spark.standalone.submit.waitAppCompletion"

2020-06-09 Thread GitBox
AmplabJenkins commented on pull request #28775: URL: https://github.com/apache/spark/pull/28775#issuecomment-641770136

[GitHub] [spark] SparkQA commented on pull request #28733: [SPARK-31705][SQL] Push more possible predicates through Join via CNF conversion

2020-06-09 Thread GitBox
SparkQA commented on pull request #28733: URL: https://github.com/apache/spark/pull/28733#issuecomment-641769586 **[Test build #123729 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/123729/testReport)** for PR 28733 at commit [`15437b3`](https://github.com

[GitHub] [spark] viirya commented on a change in pull request #28761: [SPARK-25557][SQL] Nested column predicate pushdown for ORC

2020-06-09 Thread GitBox
viirya commented on a change in pull request #28761: URL: https://github.com/apache/spark/pull/28761#discussion_r437897971 ## File path: sql/core/src/test/scala/org/apache/spark/sql/execution/datasources/orc/OrcV1FilterSuite.scala ## @@ -19,7 +19,7 @@ package org.apache.spark.

[GitHub] [spark] viirya commented on a change in pull request #28761: [SPARK-25557][SQL] Nested column predicate pushdown for ORC

2020-06-09 Thread GitBox
viirya commented on a change in pull request #28761: URL: https://github.com/apache/spark/pull/28761#discussion_r437897740 ## File path: sql/core/src/main/scala/org/apache/spark/sql/execution/datasources/orc/OrcFiltersBase.scala ## @@ -37,12 +40,44 @@ trait OrcFiltersBase {

[GitHub] [spark] SparkQA removed a comment on pull request #28775: [SPARK-31486][CORE][FOLLOW-UP] Use ConfigEntry for config "spark.standalone.submit.waitAppCompletion"

2020-06-09 Thread GitBox
SparkQA removed a comment on pull request #28775: URL: https://github.com/apache/spark/pull/28775#issuecomment-641706549 **[Test build #123721 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/123721/testReport)** for PR 28775 at commit [`3d55ef8`](https://gi

[GitHub] [spark] SparkQA commented on pull request #28775: [SPARK-31486][CORE][FOLLOW-UP] Use ConfigEntry for config "spark.standalone.submit.waitAppCompletion"

2020-06-09 Thread GitBox
SparkQA commented on pull request #28775: URL: https://github.com/apache/spark/pull/28775#issuecomment-641768855 **[Test build #123721 has finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/123721/testReport)** for PR 28775 at commit [`3d55ef8`](https://github.co

[GitHub] [spark] viirya commented on a change in pull request #28761: [SPARK-25557][SQL] Nested column predicate pushdown for ORC

2020-06-09 Thread GitBox
viirya commented on a change in pull request #28761: URL: https://github.com/apache/spark/pull/28761#discussion_r437897212 ## File path: sql/core/src/main/scala/org/apache/spark/sql/execution/datasources/orc/OrcFiltersBase.scala ## @@ -37,12 +40,44 @@ trait OrcFiltersBase {

[GitHub] [spark] viirya commented on a change in pull request #28743: [SPARK-31920][PYTHON] Fix pandas conversion using Arrow with __arrow_array__ columns

2020-06-09 Thread GitBox
viirya commented on a change in pull request #28743: URL: https://github.com/apache/spark/pull/28743#discussion_r437896443 ## File path: python/pyspark/sql/pandas/serializers.py ## @@ -150,15 +151,22 @@ def _create_batch(self, series): series = ((s, None) if not isinst

[GitHub] [spark] AmplabJenkins removed a comment on pull request #28750: [SPARK-31916][SQL] StringConcat can lead to StringIndexOutOfBoundsException

2020-06-09 Thread GitBox
AmplabJenkins removed a comment on pull request #28750: URL: https://github.com/apache/spark/pull/28750#issuecomment-641766412 Test FAILed. Refer to this link for build results (access rights to CI server needed): https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/123

[GitHub] [spark] AmplabJenkins removed a comment on pull request #28750: [SPARK-31916][SQL] StringConcat can lead to StringIndexOutOfBoundsException

2020-06-09 Thread GitBox
AmplabJenkins removed a comment on pull request #28750: URL: https://github.com/apache/spark/pull/28750#issuecomment-641766399 Merged build finished. Test FAILed.

[GitHub] [spark] dilipbiswal commented on pull request #28750: [SPARK-31916][SQL] StringConcat can lead to StringIndexOutOfBoundsException

2020-06-09 Thread GitBox
dilipbiswal commented on pull request #28750: URL: https://github.com/apache/spark/pull/28750#issuecomment-641766979 @maropu > Have you checked my last comment? #28750 (comment) The PR itself looks okay. Sorry, I missed that. I have added the comment now. @viirya We cou

[GitHub] [spark] SparkQA removed a comment on pull request #28750: [SPARK-31916][SQL] StringConcat can lead to StringIndexOutOfBoundsException

2020-06-09 Thread GitBox
SparkQA removed a comment on pull request #28750: URL: https://github.com/apache/spark/pull/28750#issuecomment-641765460 **[Test build #123728 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/123728/testReport)** for PR 28750 at commit [`1050df3`](https://gi

[GitHub] [spark] AmplabJenkins removed a comment on pull request #28750: [SPARK-31916][SQL] StringConcat can lead to StringIndexOutOfBoundsException

2020-06-09 Thread GitBox
AmplabJenkins removed a comment on pull request #28750: URL: https://github.com/apache/spark/pull/28750#issuecomment-641766107

[GitHub] [spark] AmplabJenkins commented on pull request #28761: [SPARK-25557][SQL] Nested column predicate pushdown for ORC

2020-06-09 Thread GitBox
AmplabJenkins commented on pull request #28761: URL: https://github.com/apache/spark/pull/28761#issuecomment-641766138

[GitHub] [spark] SparkQA commented on pull request #28750: [SPARK-31916][SQL] StringConcat can lead to StringIndexOutOfBoundsException

2020-06-09 Thread GitBox
SparkQA commented on pull request #28750: URL: https://github.com/apache/spark/pull/28750#issuecomment-641766373 **[Test build #123728 has finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/123728/testReport)** for PR 28750 at commit [`1050df3`](https://github.co

[GitHub] [spark] AmplabJenkins removed a comment on pull request #28761: [SPARK-25557][SQL] Nested column predicate pushdown for ORC

2020-06-09 Thread GitBox
AmplabJenkins removed a comment on pull request #28761: URL: https://github.com/apache/spark/pull/28761#issuecomment-641766138

[GitHub] [spark] AmplabJenkins commented on pull request #28750: [SPARK-31916][SQL] StringConcat can lead to StringIndexOutOfBoundsException

2020-06-09 Thread GitBox
AmplabJenkins commented on pull request #28750: URL: https://github.com/apache/spark/pull/28750#issuecomment-641766399

[GitHub] [spark] AmplabJenkins commented on pull request #28750: [SPARK-31916][SQL] StringConcat can lead to StringIndexOutOfBoundsException

2020-06-09 Thread GitBox
AmplabJenkins commented on pull request #28750: URL: https://github.com/apache/spark/pull/28750#issuecomment-641766107

[GitHub] [spark] AmplabJenkins commented on pull request #28412: [SPARK-31608][CORE][WEBUI] Add a new type of KVStore to make loading UI faster

2020-06-09 Thread GitBox
AmplabJenkins commented on pull request #28412: URL: https://github.com/apache/spark/pull/28412#issuecomment-641765619

[GitHub] [spark] AmplabJenkins removed a comment on pull request #28412: [SPARK-31608][CORE][WEBUI] Add a new type of KVStore to make loading UI faster

2020-06-09 Thread GitBox
AmplabJenkins removed a comment on pull request #28412: URL: https://github.com/apache/spark/pull/28412#issuecomment-641765619

[GitHub] [spark] SparkQA commented on pull request #28761: [SPARK-25557][SQL] Nested column predicate pushdown for ORC

2020-06-09 Thread GitBox
SparkQA commented on pull request #28761: URL: https://github.com/apache/spark/pull/28761#issuecomment-641765400 **[Test build #123727 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/123727/testReport)** for PR 28761 at commit [`bd691ed`](https://github.com

[GitHub] [spark] SparkQA commented on pull request #28750: [SPARK-31916][SQL] StringConcat can lead to StringIndexOutOfBoundsException

2020-06-09 Thread GitBox
SparkQA commented on pull request #28750: URL: https://github.com/apache/spark/pull/28750#issuecomment-641765460 **[Test build #123728 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/123728/testReport)** for PR 28750 at commit [`1050df3`](https://github.com

[GitHub] [spark] viirya commented on a change in pull request #28743: [SPARK-31920][PYTHON] Fix pandas conversion using Arrow with __arrow_array__ columns

2020-06-09 Thread GitBox
viirya commented on a change in pull request #28743: URL: https://github.com/apache/spark/pull/28743#discussion_r437893088 ## File path: python/pyspark/sql/pandas/conversion.py ## @@ -394,10 +394,11 @@ def _create_from_pandas_with_arrow(self, pdf, schema, timezone):

[GitHub] [spark] SparkQA commented on pull request #28412: [SPARK-31608][CORE][WEBUI] Add a new type of KVStore to make loading UI faster

2020-06-09 Thread GitBox
SparkQA commented on pull request #28412: URL: https://github.com/apache/spark/pull/28412#issuecomment-641764410 **[Test build #123720 has finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/123720/testReport)** for PR 28412 at commit [`d6c7d98`](https://github.co

[GitHub] [spark] SparkQA removed a comment on pull request #28412: [SPARK-31608][CORE][WEBUI] Add a new type of KVStore to make loading UI faster

2020-06-09 Thread GitBox
SparkQA removed a comment on pull request #28412: URL: https://github.com/apache/spark/pull/28412#issuecomment-641701635 **[Test build #123720 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/123720/testReport)** for PR 28412 at commit [`d6c7d98`](https://gi

[GitHub] [spark] zhli1142015 edited a comment on pull request #28769: [SPARK-31929][WEBUI] Close leveldbiterator when leveldb.close

2020-06-09 Thread GitBox
zhli1142015 edited a comment on pull request #28769: URL: https://github.com/apache/spark/pull/28769#issuecomment-641752768 > Then how about capturing the exception and asking the user to increase the related configuration or to try loading the page again? Because there is disk space limitat

[GitHub] [spark] zhli1142015 edited a comment on pull request #28769: [SPARK-31929][WEBUI] Close leveldbiterator when leveldb.close

2020-06-09 Thread GitBox
zhli1142015 edited a comment on pull request #28769: URL: https://github.com/apache/spark/pull/28769#issuecomment-641752768 > Then how about capturing the exception and asking the user to increase the related configuration or to try loading the page again? Because there is disk space limitat

[GitHub] [spark] viirya commented on a change in pull request #28743: [SPARK-31920][PYTHON] Fix pandas conversion using Arrow with __arrow_array__ columns

2020-06-09 Thread GitBox
viirya commented on a change in pull request #28743: URL: https://github.com/apache/spark/pull/28743#discussion_r437887602 ## File path: python/pyspark/sql/pandas/conversion.py ## @@ -394,10 +394,11 @@ def _create_from_pandas_with_arrow(self, pdf, schema, timezone):

[GitHub] [spark] AmplabJenkins removed a comment on pull request #27617: [SPARK-30865][SQL] Refactor DateTimeUtils

2020-06-09 Thread GitBox
AmplabJenkins removed a comment on pull request #27617: URL: https://github.com/apache/spark/pull/27617#issuecomment-641754416

[GitHub] [spark] AmplabJenkins commented on pull request #27617: [SPARK-30865][SQL] Refactor DateTimeUtils

2020-06-09 Thread GitBox
AmplabJenkins commented on pull request #27617: URL: https://github.com/apache/spark/pull/27617#issuecomment-641754416

[GitHub] [spark] zhli1142015 edited a comment on pull request #28769: [SPARK-31929][WEBUI] Close leveldbiterator when leveldb.close

2020-06-09 Thread GitBox
zhli1142015 edited a comment on pull request #28769: URL: https://github.com/apache/spark/pull/28769#issuecomment-641752768 > Then how about capturing the exception and asking the user to increase the related configuration or to try loading the page again? Because there is disk space limitat

[GitHub] [spark] SparkQA commented on pull request #27617: [SPARK-30865][SQL] Refactor DateTimeUtils

2020-06-09 Thread GitBox
SparkQA commented on pull request #27617: URL: https://github.com/apache/spark/pull/27617#issuecomment-641753870 **[Test build #123726 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/123726/testReport)** for PR 27617 at commit [`311a47e`](https://github.com

[GitHub] [spark] zhli1142015 commented on pull request #28769: [SPARK-31929][WEBUI] Close leveldbiterator when leveldb.close

2020-06-09 Thread GitBox
zhli1142015 commented on pull request #28769: URL: https://github.com/apache/spark/pull/28769#issuecomment-641752768 > Then how about capturing the exception and asking the user to increase the related configuration or to try loading the page again? Because there is disk space limitation, we

[GitHub] [spark] gengliangwang commented on a change in pull request #28733: [SPARK-31705][SQL] Push more possible predicates through Join via CNF conversion

2020-06-09 Thread GitBox
gengliangwang commented on a change in pull request #28733: URL: https://github.com/apache/spark/pull/28733#discussion_r437884631 ## File path: sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/expressions/predicates.scala ## @@ -198,6 +200,90 @@ trait PredicateHelper

[GitHub] [spark] cloud-fan commented on a change in pull request #28733: [SPARK-31705][SQL] Push more possible predicates through Join via CNF conversion

2020-06-09 Thread GitBox
cloud-fan commented on a change in pull request #28733: URL: https://github.com/apache/spark/pull/28733#discussion_r437882318 ## File path: sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/expressions/predicates.scala ## @@ -198,6 +200,90 @@ trait PredicateHelper {

[GitHub] [spark] gengliangwang commented on pull request #28769: [SPARK-31929][WEBUI] Close leveldbiterator when leveldb.close

2020-06-09 Thread GitBox
gengliangwang commented on pull request #28769: URL: https://github.com/apache/spark/pull/28769#issuecomment-641747325 Then how about capturing the exception and asking the user to increase the related configuration or to try loading the page again? What is the benefit of this PR to users?

[GitHub] [spark] agrawaldevesh commented on pull request #27636: [SPARK-30873][CORE][YARN]Handling Node Decommissioning for Yarn cluster manger in Spark

2020-06-09 Thread GitBox
agrawaldevesh commented on pull request #27636: URL: https://github.com/apache/spark/pull/27636#issuecomment-641747304 @SaurabhChawla100, can you briefly update the PR description to reflect how this work relates to the recently merged https://github.com/apache/spark/pull/27864? Perhaps yo

[GitHub] [spark] gengliangwang commented on a change in pull request #28733: [SPARK-31705][SQL] Push more possible predicates through Join via CNF conversion

2020-06-09 Thread GitBox
gengliangwang commented on a change in pull request #28733: URL: https://github.com/apache/spark/pull/28733#discussion_r437880281 ## File path: sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/expressions/predicates.scala ## @@ -198,6 +200,90 @@ trait PredicateHelper

[GitHub] [spark] karuppayya commented on pull request #28662: [SPARK-31850][SQL]Prevent DetermineTableStats from computing stats multiple times for same table

2020-06-09 Thread GitBox
karuppayya commented on pull request #28662: URL: https://github.com/apache/spark/pull/28662#issuecomment-641746170 @viirya @maropu Can you please help review this PR?

[GitHub] [spark] cloud-fan commented on a change in pull request #28733: [SPARK-31705][SQL] Push more possible predicates through Join via CNF conversion

2020-06-09 Thread GitBox
cloud-fan commented on a change in pull request #28733: URL: https://github.com/apache/spark/pull/28733#discussion_r437879694 ## File path: sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/expressions/predicates.scala ## @@ -198,6 +200,90 @@ trait PredicateHelper {

[GitHub] [spark] zhli1142015 commented on pull request #28769: [SPARK-31929][WEBUI] Close leveldbiterator when leveldb.close

2020-06-09 Thread GitBox
zhli1142015 commented on pull request #28769: URL: https://github.com/apache/spark/pull/28769#issuecomment-641744773 > could you describe an end-to-end use case that can reproduce the error page in PR description? Does it only happen when leveldb is evicted or UI server is Sure, for

[GitHub] [spark] AmplabJenkins removed a comment on pull request #28743: [SPARK-31920][PYTHON] Fix pandas conversion using Arrow with __arrow_array__ columns

2020-06-09 Thread GitBox
AmplabJenkins removed a comment on pull request #28743: URL: https://github.com/apache/spark/pull/28743#issuecomment-641736976

[GitHub] [spark] AmplabJenkins commented on pull request #28743: [SPARK-31920][PYTHON] Fix pandas conversion using Arrow with __arrow_array__ columns

2020-06-09 Thread GitBox
AmplabJenkins commented on pull request #28743: URL: https://github.com/apache/spark/pull/28743#issuecomment-641736976

[GitHub] [spark] SparkQA removed a comment on pull request #28743: [SPARK-31920][PYTHON] Fix pandas conversion using Arrow with __arrow_array__ columns

2020-06-09 Thread GitBox
SparkQA removed a comment on pull request #28743: URL: https://github.com/apache/spark/pull/28743#issuecomment-641721923 **[Test build #123725 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/123725/testReport)** for PR 28743 at commit [`403f579`](https://gi

[GitHub] [spark] moskvax commented on a change in pull request #28743: [SPARK-31920][PYTHON] Fix pandas conversion using Arrow with __arrow_array__ columns

2020-06-09 Thread GitBox
moskvax commented on a change in pull request #28743: URL: https://github.com/apache/spark/pull/28743#discussion_r437872692 ## File path: python/pyspark/sql/pandas/serializers.py ## @@ -150,15 +151,22 @@ def _create_batch(self, series): series = ((s, None) if not isins

[GitHub] [spark] SparkQA commented on pull request #28743: [SPARK-31920][PYTHON] Fix pandas conversion using Arrow with __arrow_array__ columns

2020-06-09 Thread GitBox
SparkQA commented on pull request #28743: URL: https://github.com/apache/spark/pull/28743#issuecomment-641736450 **[Test build #123725 has finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/123725/testReport)** for PR 28743 at commit [`403f579`](https://github.co

[GitHub] [spark] siknezevic commented on pull request #27246: [SPARK-30536][CORE][SQL] Sort-merge join operator spilling performance improvements

2020-06-09 Thread GitBox
siknezevic commented on pull request #27246: URL: https://github.com/apache/spark/pull/27246#issuecomment-641736377 > Also, could you add some benchmark classes in https://github.com/apache/spark/tree/master/sql/core/src/test/scala/org/apache/spark/sql/execution/benchmark ? Hello @m

[GitHub] [spark] gengliangwang commented on pull request #28769: [SPARK-31929][WEBUI] Close leveldbiterator when leveldb.close

2020-06-09 Thread GitBox
gengliangwang commented on pull request #28769: URL: https://github.com/apache/spark/pull/28769#issuecomment-641735784 @zhli1142015 Sorry, I left comments in the code before I read the discussion in the PR. So, before you update the related code, could you describe an end-to-end use

[GitHub] [spark] yaooqinn commented on a change in pull request #28766: [SPARK-31939][SQL] Fix Parsing day of year when year field pattern is missing

2020-06-09 Thread GitBox
yaooqinn commented on a change in pull request #28766: URL: https://github.com/apache/spark/pull/28766#discussion_r437871769 ## File path: sql/catalyst/src/test/scala/org/apache/spark/sql/catalyst/util/TimestampFormatterSuite.scala ## @@ -433,4 +433,35 @@ class TimestampFormat

[GitHub] [spark] yaooqinn commented on a change in pull request #28766: [SPARK-31939][SQL] Fix Parsing day of year when year field pattern is missing

2020-06-09 Thread GitBox
yaooqinn commented on a change in pull request #28766: URL: https://github.com/apache/spark/pull/28766#discussion_r437871990 ## File path: sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/util/DateTimeFormatterHelper.scala ## @@ -39,6 +39,18 @@ trait DateTimeFormatter

[GitHub] [spark] AmplabJenkins removed a comment on pull request #28412: [SPARK-31608][CORE][WEBUI] Add a new type of KVStore to make loading UI faster

2020-06-09 Thread GitBox
AmplabJenkins removed a comment on pull request #28412: URL: https://github.com/apache/spark/pull/28412#issuecomment-641732806

[GitHub] [spark] gengliangwang commented on a change in pull request #28769: [SPARK-31929][WEBUI] Close leveldbiterator when leveldb.close

2020-06-09 Thread GitBox
gengliangwang commented on a change in pull request #28769: URL: https://github.com/apache/spark/pull/28769#discussion_r437870387 ## File path: common/kvstore/src/test/java/org/apache/spark/util/kvstore/LevelDBSuite.java ## @@ -276,6 +276,41 @@ public void testNegativeIndexVal

[GitHub] [spark] AmplabJenkins commented on pull request #28412: [SPARK-31608][CORE][WEBUI] Add a new type of KVStore to make loading UI faster

2020-06-09 Thread GitBox
AmplabJenkins commented on pull request #28412: URL: https://github.com/apache/spark/pull/28412#issuecomment-641732806

[GitHub] [spark] gengliangwang commented on a change in pull request #28769: [SPARK-31929][WEBUI] Close leveldbiterator when leveldb.close

2020-06-09 Thread GitBox
gengliangwang commented on a change in pull request #28769: URL: https://github.com/apache/spark/pull/28769#discussion_r437869878 ## File path: common/kvstore/src/main/java/org/apache/spark/util/kvstore/LevelDB.java ## @@ -189,7 +198,12 @@ public void delete(Class type, Object

[GitHub] [spark] dilipbiswal commented on pull request #28773: [SPARK-26905][SQL] Add `TYPE` in the ANSI non-reserved list

2020-06-09 Thread GitBox
dilipbiswal commented on pull request #28773: URL: https://github.com/apache/spark/pull/28773#issuecomment-641732487 LGTM

[GitHub] [spark] SparkQA removed a comment on pull request #28412: [SPARK-31608][CORE][WEBUI] Add a new type of KVStore to make loading UI faster

2020-06-09 Thread GitBox
SparkQA removed a comment on pull request #28412: URL: https://github.com/apache/spark/pull/28412#issuecomment-641689848 **[Test build #123719 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/123719/testReport)** for PR 28412 at commit [`1e514b9`](https://gi

[GitHub] [spark] gengliangwang commented on a change in pull request #28769: [SPARK-31929][WEBUI] Close leveldbiterator when leveldb.close

2020-06-09 Thread GitBox
gengliangwang commented on a change in pull request #28769: URL: https://github.com/apache/spark/pull/28769#discussion_r437869753 ## File path: common/kvstore/src/main/java/org/apache/spark/util/kvstore/LevelDB.java ## @@ -256,6 +275,7 @@ void closeIterator(LevelDBIterator it)

[GitHub] [spark] SparkQA commented on pull request #28412: [SPARK-31608][CORE][WEBUI] Add a new type of KVStore to make loading UI faster

2020-06-09 Thread GitBox
SparkQA commented on pull request #28412: URL: https://github.com/apache/spark/pull/28412#issuecomment-641732044 **[Test build #123719 has finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/123719/testReport)** for PR 28412 at commit [`1e514b9`](https://github.co

[GitHub] [spark] AmplabJenkins commented on pull request #28773: [SPARK-26905][SQL] Add `TYPE` in the ANSI non-reserved list

2020-06-09 Thread GitBox
AmplabJenkins commented on pull request #28773: URL: https://github.com/apache/spark/pull/28773#issuecomment-641725711

[GitHub] [spark] AmplabJenkins removed a comment on pull request #28773: [SPARK-26905][SQL] Add `TYPE` in the ANSI non-reserved list

2020-06-09 Thread GitBox
AmplabJenkins removed a comment on pull request #28773: URL: https://github.com/apache/spark/pull/28773#issuecomment-641725711

[GitHub] [spark] SparkQA removed a comment on pull request #28773: [SPARK-26905][SQL] Add `TYPE` in the ANSI non-reserved list

2020-06-09 Thread GitBox
SparkQA removed a comment on pull request #28773: URL: https://github.com/apache/spark/pull/28773#issuecomment-641651098 **[Test build #123711 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/123711/testReport)** for PR 28773 at commit [`1013ac8`](https://gi

[GitHub] [spark] SparkQA commented on pull request #28773: [SPARK-26905][SQL] Add `TYPE` in the ANSI non-reserved list

2020-06-09 Thread GitBox
SparkQA commented on pull request #28773: URL: https://github.com/apache/spark/pull/28773#issuecomment-641724830 **[Test build #123711 has finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/123711/testReport)** for PR 28773 at commit [`1013ac8`](https://github.co

[GitHub] [spark] AmplabJenkins removed a comment on pull request #28743: [SPARK-31920][PYTHON] Fix pandas conversion using Arrow with __arrow_array__ columns

2020-06-09 Thread GitBox
AmplabJenkins removed a comment on pull request #28743: URL: https://github.com/apache/spark/pull/28743#issuecomment-641722284 Test PASSed. Refer to this link for build results (access rights to CI server needed): https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder-K8

[GitHub] [spark] AmplabJenkins removed a comment on pull request #28743: [SPARK-31920][PYTHON] Fix pandas conversion using Arrow with __arrow_array__ columns

2020-06-09 Thread GitBox
AmplabJenkins removed a comment on pull request #28743: URL: https://github.com/apache/spark/pull/28743#issuecomment-641722278 Merged build finished. Test PASSed.

[GitHub] [spark] AmplabJenkins commented on pull request #28743: [SPARK-31920][PYTHON] Fix pandas conversion using Arrow with __arrow_array__ columns

2020-06-09 Thread GitBox
AmplabJenkins commented on pull request #28743: URL: https://github.com/apache/spark/pull/28743#issuecomment-641722278 This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHu

[GitHub] [spark] SparkQA commented on pull request #28743: [SPARK-31920][PYTHON] Fix pandas conversion using Arrow with __arrow_array__ columns

2020-06-09 Thread GitBox
SparkQA commented on pull request #28743: URL: https://github.com/apache/spark/pull/28743#issuecomment-641721923 **[Test build #123725 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/123725/testReport)** for PR 28743 at commit [`403f579`](https://github.com

[GitHub] [spark] AmplabJenkins removed a comment on pull request #27507: [SPARK-24884][SQL] Support regexp function regexp_extract_all

2020-06-09 Thread GitBox
AmplabJenkins removed a comment on pull request #27507: URL: https://github.com/apache/spark/pull/27507#issuecomment-641719947 Test FAILed. Refer to this link for build results (access rights to CI server needed): https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/123

[GitHub] [spark] AmplabJenkins removed a comment on pull request #28769: [SPARK-31929][WEBUI] Close leveldbiterator when leveldb.close

2020-06-09 Thread GitBox
AmplabJenkins removed a comment on pull request #28769: URL: https://github.com/apache/spark/pull/28769#issuecomment-641720102 This is an automated message from the Apache Git Service. To respond to the message, please log on

[GitHub] [spark] AmplabJenkins commented on pull request #28769: [SPARK-31929][WEBUI] Close leveldbiterator when leveldb.close

2020-06-09 Thread GitBox
AmplabJenkins commented on pull request #28769: URL: https://github.com/apache/spark/pull/28769#issuecomment-641720102 This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHu

[GitHub] [spark] SparkQA removed a comment on pull request #27507: [SPARK-24884][SQL] Support regexp function regexp_extract_all

2020-06-09 Thread GitBox
SparkQA removed a comment on pull request #27507: URL: https://github.com/apache/spark/pull/27507#issuecomment-641679815 **[Test build #123716 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/123716/testReport)** for PR 27507 at commit [`ca6c1c5`](https://gi

[GitHub] [spark] AmplabJenkins removed a comment on pull request #27507: [SPARK-24884][SQL] Support regexp function regexp_extract_all

2020-06-09 Thread GitBox
AmplabJenkins removed a comment on pull request #27507: URL: https://github.com/apache/spark/pull/27507#issuecomment-641719937 Merged build finished. Test FAILed. This is an automated message from the Apache Git Service. To r

[GitHub] [spark] SparkQA commented on pull request #28776: [3.0][SPARK-31935][SQL] Hadoop file system config should be effective in data source options

2020-06-09 Thread GitBox
SparkQA commented on pull request #28776: URL: https://github.com/apache/spark/pull/28776#issuecomment-641719758 **[Test build #123723 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/123723/testReport)** for PR 28776 at commit [`f6cca6b`](https://github.com

[GitHub] [spark] SparkQA commented on pull request #27507: [SPARK-24884][SQL] Support regexp function regexp_extract_all

2020-06-09 Thread GitBox
SparkQA commented on pull request #27507: URL: https://github.com/apache/spark/pull/27507#issuecomment-641719783 **[Test build #123716 has finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/123716/testReport)** for PR 27507 at commit [`ca6c1c5`](https://github.co

[GitHub] [spark] SparkQA commented on pull request #28769: [SPARK-31929][WEBUI] Close leveldbiterator when leveldb.close

2020-06-09 Thread GitBox
SparkQA commented on pull request #28769: URL: https://github.com/apache/spark/pull/28769#issuecomment-641719773 **[Test build #123724 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/123724/testReport)** for PR 28769 at commit [`84e9012`](https://github.com

[GitHub] [spark] AmplabJenkins commented on pull request #27507: [SPARK-24884][SQL] Support regexp function regexp_extract_all

2020-06-09 Thread GitBox
AmplabJenkins commented on pull request #27507: URL: https://github.com/apache/spark/pull/27507#issuecomment-641719937 This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHu

[GitHub] [spark] AmplabJenkins removed a comment on pull request #28776: [3.0][SPARK-31935][SQL] Hadoop file system config should be effective in data source options

2020-06-09 Thread GitBox
AmplabJenkins removed a comment on pull request #28776: URL: https://github.com/apache/spark/pull/28776#issuecomment-641717889 This is an automated message from the Apache Git Service. To respond to the message, please log on

[GitHub] [spark] AmplabJenkins removed a comment on pull request #28769: [SPARK-31929][WEBUI] Close leveldbiterator when leveldb.close

2020-06-09 Thread GitBox
AmplabJenkins removed a comment on pull request #28769: URL: https://github.com/apache/spark/pull/28769#issuecomment-641186609 Can one of the admins verify this patch? This is an automated message from the Apache Git Service.

[GitHub] [spark] cloud-fan commented on pull request #28769: [SPARK-31929][WEBUI] Close leveldbiterator when leveldb.close

2020-06-09 Thread GitBox
cloud-fan commented on pull request #28769: URL: https://github.com/apache/spark/pull/28769#issuecomment-641719161 ok to test This is an automated message from the Apache Git Service. To respond to the message, please log on

[GitHub] [spark] moskvax commented on a change in pull request #28743: [SPARK-31920][PYTHON] Fix pandas conversion using Arrow with __arrow_array__ columns

2020-06-09 Thread GitBox
moskvax commented on a change in pull request #28743: URL: https://github.com/apache/spark/pull/28743#discussion_r437858625 ## File path: python/pyspark/sql/tests/test_arrow.py ## @@ -30,10 +30,14 @@ pandas_requirement_message, pyarrow_requirement_message from pyspark.tes

[GitHub] [spark] cloud-fan commented on pull request #28769: [SPARK-31929][WEBUI] Close leveldbiterator when leveldb.close

2020-06-09 Thread GitBox
cloud-fan commented on pull request #28769: URL: https://github.com/apache/spark/pull/28769#issuecomment-641718840 cc @gengliangwang This is an automated message from the Apache Git Service. To respond to the message, please

[GitHub] [spark] moskvax commented on a change in pull request #28743: [SPARK-31920][PYTHON] Fix pandas conversion using Arrow with __arrow_array__ columns

2020-06-09 Thread GitBox
moskvax commented on a change in pull request #28743: URL: https://github.com/apache/spark/pull/28743#discussion_r437858389 ## File path: python/pyspark/sql/pandas/conversion.py ## @@ -394,10 +394,11 @@ def _create_from_pandas_with_arrow(self, pdf, schema, timezone):

[GitHub] [spark] AmplabJenkins commented on pull request #28774: [SPARK-31945][SQL][PYSPARK] Enable cache for the same Python function.

2020-06-09 Thread GitBox
AmplabJenkins commented on pull request #28774: URL: https://github.com/apache/spark/pull/28774#issuecomment-641718511 This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHu

[GitHub] [spark] AmplabJenkins removed a comment on pull request #28774: [SPARK-31945][SQL][PYSPARK] Enable cache for the same Python function.

2020-06-09 Thread GitBox
AmplabJenkins removed a comment on pull request #28774: URL: https://github.com/apache/spark/pull/28774#issuecomment-641718511 This is an automated message from the Apache Git Service. To respond to the message, please log on

[GitHub] [spark] AmplabJenkins commented on pull request #28776: [3.0][SPARK-31935][SQL] Hadoop file system config should be effective in data source options

2020-06-09 Thread GitBox
AmplabJenkins commented on pull request #28776: URL: https://github.com/apache/spark/pull/28776#issuecomment-641717889 This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHu

[GitHub] [spark] SparkQA removed a comment on pull request #28774: [SPARK-31945][SQL][PYSPARK] Enable cache for the same Python function.

2020-06-09 Thread GitBox
SparkQA removed a comment on pull request #28774: URL: https://github.com/apache/spark/pull/28774#issuecomment-641673972 **[Test build #123714 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/123714/testReport)** for PR 28774 at commit [`c2b6b86`](https://gi

[GitHub] [spark] gengliangwang opened a new pull request #28776: [SPARK-31935][SQL] Hadoop file system config should be effective in data source options

2020-06-09 Thread GitBox
gengliangwang opened a new pull request #28776: URL: https://github.com/apache/spark/pull/28776 ### What changes were proposed in this pull request? Make Hadoop file system config effective in data source options. From `org.apache.hadoop.fs.FileSystem.java`: ```

[GitHub] [spark] gengliangwang commented on pull request #28776: [SPARK-31935][SQL] Hadoop file system config should be effective in data source options

2020-06-09 Thread GitBox
gengliangwang commented on pull request #28776: URL: https://github.com/apache/spark/pull/28776#issuecomment-641717785 This PR backports https://github.com/apache/spark/pull/28760 to branch-3.0 This is an automated message fr

[GitHub] [spark] SparkQA commented on pull request #28774: [SPARK-31945][SQL][PYSPARK] Enable cache for the same Python function.

2020-06-09 Thread GitBox
SparkQA commented on pull request #28774: URL: https://github.com/apache/spark/pull/28774#issuecomment-641717749 **[Test build #123714 has finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/123714/testReport)** for PR 28774 at commit [`c2b6b86`](https://github.co

[GitHub] [spark] xccui commented on pull request #28768: [SPARK-31941][CORE] Replace SparkException to NoSuchElementException for applicationInfo in AppStatusStore

2020-06-09 Thread GitBox
xccui commented on pull request #28768: URL: https://github.com/apache/spark/pull/28768#issuecomment-641714353 Sorry that I didn't realize the potential impact of using `SparkException` or `NoSuchElementException`. +1 to this change.

[GitHub] [spark] zhli1142015 edited a comment on pull request #28769: [SPARK-31929][WEBUI] Close leveldbiterator when leveldb.close

2020-06-09 Thread GitBox
zhli1142015 edited a comment on pull request #28769: URL: https://github.com/apache/spark/pull/28769#issuecomment-641708063 > Of course relying on finalize is wrong, but I don't think the intent was to rely on finalize. Not closing these iterators is a bug. I see one case it clearly isn't;
