[GitHub] peter-toth commented on a change in pull request #23531: [SPARK-24497][SQL] Support recursive SQL query

2019-01-27 Thread GitBox
peter-toth commented on a change in pull request #23531: [SPARK-24497][SQL] Support recursive SQL query URL: https://github.com/apache/spark/pull/23531#discussion_r251307934 ## File path: sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/analysis/Analyzer.scala ##

[GitHub] httfighter commented on issue #23649: [SPARK-26726] The amount of memory used by the broadcast variable is …

2019-01-27 Thread GitBox
httfighter commented on issue #23649: [SPARK-26726] The amount of memory used by the broadcast variable is … URL: https://github.com/apache/spark/pull/23649#issuecomment-458029276 @vanzin Thank you for your review! I have added the test case.

[GitHub] peter-toth commented on a change in pull request #23531: [SPARK-24497][SQL] Support recursive SQL query

2019-01-27 Thread GitBox
peter-toth commented on a change in pull request #23531: [SPARK-24497][SQL] Support recursive SQL query URL: https://github.com/apache/spark/pull/23531#discussion_r251307934 ## File path: sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/analysis/Analyzer.scala ##

[GitHub] httfighter commented on a change in pull request #23649: [SPARK-26726] The amount of memory used by the broadcast variable is …

2019-01-27 Thread GitBox
httfighter commented on a change in pull request #23649: [SPARK-26726] The amount of memory used by the broadcast variable is … URL: https://github.com/apache/spark/pull/23649#discussion_r251307769 ## File path: core/src/main/scala/org/apache/spark/status/AppStatusListener.scala

[GitHub] felixcheung commented on issue #23514: [SPARK-24902][K8s] Add PV integration tests

2019-01-27 Thread GitBox
felixcheung commented on issue #23514: [SPARK-24902][K8s] Add PV integration tests URL: https://github.com/apache/spark/pull/23514#issuecomment-458027865 @shaneknapp ^^ This is an automated message from the Apache Git

[GitHub] felixcheung commented on issue #23657: [SPARK-26566][PYTHON][SQL] Upgrade Apache Arrow to version 0.12.0

2019-01-27 Thread GitBox
felixcheung commented on issue #23657: [SPARK-26566][PYTHON][SQL] Upgrade Apache Arrow to version 0.12.0 URL: https://github.com/apache/spark/pull/23657#issuecomment-458026691 probably a good idea - arrow moves quickly; 0.10 is kinda "dated"

[GitHub] AmplabJenkins removed a comment on issue #23669: [SPARK-26747][SQL] Makes GetMapValue nullability more precise

2019-01-27 Thread GitBox
AmplabJenkins removed a comment on issue #23669: [SPARK-26747][SQL] Makes GetMapValue nullability more precise URL: https://github.com/apache/spark/pull/23669#issuecomment-458023244 Test PASSed. Refer to this link for build results (access rights to CI server needed):

[GitHub] AmplabJenkins commented on issue #23669: [SPARK-26747][SQL] Makes GetMapValue nullability more precise

2019-01-27 Thread GitBox
AmplabJenkins commented on issue #23669: [SPARK-26747][SQL] Makes GetMapValue nullability more precise URL: https://github.com/apache/spark/pull/23669#issuecomment-458023240 Merged build finished. Test PASSed.

[GitHub] AmplabJenkins removed a comment on issue #23669: [SPARK-26747][SQL] Makes GetMapValue nullability more precise

2019-01-27 Thread GitBox
AmplabJenkins removed a comment on issue #23669: [SPARK-26747][SQL] Makes GetMapValue nullability more precise URL: https://github.com/apache/spark/pull/23669#issuecomment-458023240 Merged build finished. Test PASSed.

[GitHub] AmplabJenkins commented on issue #23669: [SPARK-26747][SQL] Makes GetMapValue nullability more precise

2019-01-27 Thread GitBox
AmplabJenkins commented on issue #23669: [SPARK-26747][SQL] Makes GetMapValue nullability more precise URL: https://github.com/apache/spark/pull/23669#issuecomment-458023244 Test PASSed. Refer to this link for build results (access rights to CI server needed):

[GitHub] srowen commented on a change in pull request #23580: [SPARK-26660]Add warning logs when broadcasting large task binary

2019-01-27 Thread GitBox
srowen commented on a change in pull request #23580: [SPARK-26660]Add warning logs when broadcasting large task binary URL: https://github.com/apache/spark/pull/23580#discussion_r251302189 ## File path: core/src/main/scala/org/apache/spark/scheduler/DAGScheduler.scala ##

[GitHub] SparkQA commented on issue #23669: [SPARK-26747][SQL] Makes GetMapValue nullability more precise

2019-01-27 Thread GitBox
SparkQA commented on issue #23669: [SPARK-26747][SQL] Makes GetMapValue nullability more precise URL: https://github.com/apache/spark/pull/23669#issuecomment-458022899 **[Test build #101748 has

[GitHub] maropu commented on issue #23669: [SPARK-26747][SQL] Makes GetMapValue nullability more precise

2019-01-27 Thread GitBox
maropu commented on issue #23669: [SPARK-26747][SQL] Makes GetMapValue nullability more precise URL: https://github.com/apache/spark/pull/23669#issuecomment-458022193 It seems `ElementAt` could have the same fix. This change is trivial, so I think it might be better to include

[GitHub] maropu opened a new pull request #23669: [SPARK-26747][SQL] Makes GetMapValue nullability more precise

2019-01-27 Thread GitBox
maropu opened a new pull request #23669: [SPARK-26747][SQL] Makes GetMapValue nullability more precise URL: https://github.com/apache/spark/pull/23669 ## What changes were proposed in this pull request? In master, `GetMapValue` nullable is always true;

[GitHub] HyukjinKwon edited a comment on issue #23657: [SPARK-26566][PYTHON][SQL] Upgrade Apache Arrow to version 0.12.0

2019-01-27 Thread GitBox
HyukjinKwon edited a comment on issue #23657: [SPARK-26566][PYTHON][SQL] Upgrade Apache Arrow to version 0.12.0 URL: https://github.com/apache/spark/pull/23657#issuecomment-458019236 For PyArrow 0.12.0, I don't think so but it will run with Arrow 0.12.0 + PyArrow 0.8.0 combination. I

[GitHub] HyukjinKwon commented on issue #23657: [SPARK-26566][PYTHON][SQL] Upgrade Apache Arrow to version 0.12.0

2019-01-27 Thread GitBox
HyukjinKwon commented on issue #23657: [SPARK-26566][PYTHON][SQL] Upgrade Apache Arrow to version 0.12.0 URL: https://github.com/apache/spark/pull/23657#issuecomment-458019236 For PyArrow 0.12.0, I don't think so but it will run with Arrow 0.12.0 + PyArrow 0.8.0 combination. I need to

[GitHub] eatoncys commented on issue #23010: [SPARK-26012][SQL]Null and '' values should not cause dynamic partition failure of string types

2019-01-27 Thread GitBox
eatoncys commented on issue #23010: [SPARK-26012][SQL]Null and '' values should not cause dynamic partition failure of string types URL: https://github.com/apache/spark/pull/23010#issuecomment-458018613 @cloud-fan Would you like to review it again, thanks.

[GitHub] maropu commented on a change in pull request #23668: [SPARK-26660][FOLLOWUP] Add warning logs when broadcasting large task binary

2019-01-27 Thread GitBox
maropu commented on a change in pull request #23668: [SPARK-26660][FOLLOWUP] Add warning logs when broadcasting large task binary URL: https://github.com/apache/spark/pull/23668#discussion_r251298706 ## File path: core/src/main/scala/org/apache/spark/scheduler/DAGScheduler.scala

[GitHub] eatoncys removed a comment on issue #23010: [SPARK-26012][SQL]Null and '' values should not cause dynamic partition failure of string types

2019-01-27 Thread GitBox
eatoncys removed a comment on issue #23010: [SPARK-26012][SQL]Null and '' values should not cause dynamic partition failure of string types URL: https://github.com/apache/spark/pull/23010#issuecomment-457778864 ping @cloud-fan

[GitHub] beliefer commented on a change in pull request #23650: [SPARK-26728]Make rdd.unpersist blocking configurable

2019-01-27 Thread GitBox
beliefer commented on a change in pull request #23650: [SPARK-26728]Make rdd.unpersist blocking configurable URL: https://github.com/apache/spark/pull/23650#discussion_r251297717 ## File path: core/src/main/scala/org/apache/spark/rdd/RDD.scala ## @@ -209,13 +210,14 @@

[GitHub] viirya commented on a change in pull request #23668: [SPARK-26660][FOLLOWUP] Add warning logs when broadcasting large task binary

2019-01-27 Thread GitBox
viirya commented on a change in pull request #23668: [SPARK-26660][FOLLOWUP] Add warning logs when broadcasting large task binary URL: https://github.com/apache/spark/pull/23668#discussion_r251297474 ## File path: core/src/main/scala/org/apache/spark/scheduler/DAGScheduler.scala

[GitHub] gatorsmile commented on issue #23657: [SPARK-26566][PYTHON][SQL] Upgrade Apache Arrow to version 0.12.0

2019-01-27 Thread GitBox
gatorsmile commented on issue #23657: [SPARK-26566][PYTHON][SQL] Upgrade Apache Arrow to version 0.12.0 URL: https://github.com/apache/spark/pull/23657#issuecomment-458015915 Will our Jenkins run Arrow 0.12?

[GitHub] SparkQA commented on issue #23656: [SPARK-26379][SS][BRANCH-2.3] Use dummy TimeZoneId to avoid UnresolvedException in CurrentBatchTimestamp

2019-01-27 Thread GitBox
SparkQA commented on issue #23656: [SPARK-26379][SS][BRANCH-2.3] Use dummy TimeZoneId to avoid UnresolvedException in CurrentBatchTimestamp URL: https://github.com/apache/spark/pull/23656#issuecomment-458013714 **[Test build #101747 has

[GitHub] AmplabJenkins commented on issue #23656: [SPARK-26379][SS][BRANCH-2.3] Use dummy TimeZoneId to avoid UnresolvedException in CurrentBatchTimestamp

2019-01-27 Thread GitBox
AmplabJenkins commented on issue #23656: [SPARK-26379][SS][BRANCH-2.3] Use dummy TimeZoneId to avoid UnresolvedException in CurrentBatchTimestamp URL: https://github.com/apache/spark/pull/23656#issuecomment-458013641 Merged build finished. Test PASSed.

[GitHub] AmplabJenkins removed a comment on issue #23656: [SPARK-26379][SS][BRANCH-2.3] Use dummy TimeZoneId to avoid UnresolvedException in CurrentBatchTimestamp

2019-01-27 Thread GitBox
AmplabJenkins removed a comment on issue #23656: [SPARK-26379][SS][BRANCH-2.3] Use dummy TimeZoneId to avoid UnresolvedException in CurrentBatchTimestamp URL: https://github.com/apache/spark/pull/23656#issuecomment-458013646 Test PASSed. Refer to this link for build results (access

[GitHub] AmplabJenkins removed a comment on issue #23656: [SPARK-26379][SS][BRANCH-2.3] Use dummy TimeZoneId to avoid UnresolvedException in CurrentBatchTimestamp

2019-01-27 Thread GitBox
AmplabJenkins removed a comment on issue #23656: [SPARK-26379][SS][BRANCH-2.3] Use dummy TimeZoneId to avoid UnresolvedException in CurrentBatchTimestamp URL: https://github.com/apache/spark/pull/23656#issuecomment-458013641 Merged build finished. Test PASSed.

[GitHub] caneGuy commented on a change in pull request #23519: [SPARK-26601][SQL] Make broadcast-exchange thread pool configurable

2019-01-27 Thread GitBox
caneGuy commented on a change in pull request #23519: [SPARK-26601][SQL] Make broadcast-exchange thread pool configurable URL: https://github.com/apache/spark/pull/23519#discussion_r251292265 ## File path: sql/core/src/test/scala/org/apache/spark/sql/execution/ExchangeSuite.scala

[GitHub] AmplabJenkins commented on issue #23656: [SPARK-26379][SS][BRANCH-2.3] Use dummy TimeZoneId to avoid UnresolvedException in CurrentBatchTimestamp

2019-01-27 Thread GitBox
AmplabJenkins commented on issue #23656: [SPARK-26379][SS][BRANCH-2.3] Use dummy TimeZoneId to avoid UnresolvedException in CurrentBatchTimestamp URL: https://github.com/apache/spark/pull/23656#issuecomment-458013646 Test PASSed. Refer to this link for build results (access rights to CI

[GitHub] HyukjinKwon commented on a change in pull request #23519: [SPARK-26601][SQL] Make broadcast-exchange thread pool configurable

2019-01-27 Thread GitBox
HyukjinKwon commented on a change in pull request #23519: [SPARK-26601][SQL] Make broadcast-exchange thread pool configurable URL: https://github.com/apache/spark/pull/23519#discussion_r251294071 ## File path:

[GitHub] dongjoon-hyun commented on issue #23656: [SPARK-26379][SS][BRANCH-2.3] Use dummy TimeZoneId to avoid UnresolvedException in CurrentBatchTimestamp

2019-01-27 Thread GitBox
dongjoon-hyun commented on issue #23656: [SPARK-26379][SS][BRANCH-2.3] Use dummy TimeZoneId to avoid UnresolvedException in CurrentBatchTimestamp URL: https://github.com/apache/spark/pull/23656#issuecomment-458012910 Retest this please.

[GitHub] HyukjinKwon commented on issue #23336: [SPARK-26378][SQL] Restore performance of queries against wide CSV/JSON tables

2019-01-27 Thread GitBox
HyukjinKwon commented on issue #23336: [SPARK-26378][SQL] Restore performance of queries against wide CSV/JSON tables URL: https://github.com/apache/spark/pull/23336#issuecomment-458012967 I think it's clear that it saves the time. Let's just push the results file here.

[GitHub] SparkQA removed a comment on issue #23656: [SPARK-26379][SS][BRANCH-2.3] Use dummy TimeZoneId to avoid UnresolvedException in CurrentBatchTimestamp

2019-01-27 Thread GitBox
SparkQA removed a comment on issue #23656: [SPARK-26379][SS][BRANCH-2.3] Use dummy TimeZoneId to avoid UnresolvedException in CurrentBatchTimestamp URL: https://github.com/apache/spark/pull/23656#issuecomment-457993454 **[Test build #101743 has

[GitHub] AmplabJenkins commented on issue #23656: [SPARK-26379][SS][BRANCH-2.3] Use dummy TimeZoneId to avoid UnresolvedException in CurrentBatchTimestamp

2019-01-27 Thread GitBox
AmplabJenkins commented on issue #23656: [SPARK-26379][SS][BRANCH-2.3] Use dummy TimeZoneId to avoid UnresolvedException in CurrentBatchTimestamp URL: https://github.com/apache/spark/pull/23656#issuecomment-458012248 Merged build finished. Test FAILed.

[GitHub] AmplabJenkins removed a comment on issue #23656: [SPARK-26379][SS][BRANCH-2.3] Use dummy TimeZoneId to avoid UnresolvedException in CurrentBatchTimestamp

2019-01-27 Thread GitBox
AmplabJenkins removed a comment on issue #23656: [SPARK-26379][SS][BRANCH-2.3] Use dummy TimeZoneId to avoid UnresolvedException in CurrentBatchTimestamp URL: https://github.com/apache/spark/pull/23656#issuecomment-458012250 Test FAILed. Refer to this link for build results (access

[GitHub] AmplabJenkins removed a comment on issue #23656: [SPARK-26379][SS][BRANCH-2.3] Use dummy TimeZoneId to avoid UnresolvedException in CurrentBatchTimestamp

2019-01-27 Thread GitBox
AmplabJenkins removed a comment on issue #23656: [SPARK-26379][SS][BRANCH-2.3] Use dummy TimeZoneId to avoid UnresolvedException in CurrentBatchTimestamp URL: https://github.com/apache/spark/pull/23656#issuecomment-458012248 Merged build finished. Test FAILed.

[GitHub] AmplabJenkins commented on issue #23656: [SPARK-26379][SS][BRANCH-2.3] Use dummy TimeZoneId to avoid UnresolvedException in CurrentBatchTimestamp

2019-01-27 Thread GitBox
AmplabJenkins commented on issue #23656: [SPARK-26379][SS][BRANCH-2.3] Use dummy TimeZoneId to avoid UnresolvedException in CurrentBatchTimestamp URL: https://github.com/apache/spark/pull/23656#issuecomment-458012250 Test FAILed. Refer to this link for build results (access rights to CI

[GitHub] SparkQA commented on issue #23656: [SPARK-26379][SS][BRANCH-2.3] Use dummy TimeZoneId to avoid UnresolvedException in CurrentBatchTimestamp

2019-01-27 Thread GitBox
SparkQA commented on issue #23656: [SPARK-26379][SS][BRANCH-2.3] Use dummy TimeZoneId to avoid UnresolvedException in CurrentBatchTimestamp URL: https://github.com/apache/spark/pull/23656#issuecomment-458012077 **[Test build #101743 has

[GitHub] deshanxiao edited a comment on issue #23637: [SPARK-26714][CORE][WEBUI] Show 0 partition job in WebUI

2019-01-27 Thread GitBox
deshanxiao edited a comment on issue #23637: [SPARK-26714][CORE][WEBUI] Show 0 partition job in WebUI URL: https://github.com/apache/spark/pull/23637#issuecomment-458003956 Yes, it takes no time at all and it always succeeds. Maybe using the same time in `SparkListenerJobStart` and

[GitHub] caneGuy commented on a change in pull request #23519: [SPARK-26601][SQL] Make broadcast-exchange thread pool configurable

2019-01-27 Thread GitBox
caneGuy commented on a change in pull request #23519: [SPARK-26601][SQL] Make broadcast-exchange thread pool configurable URL: https://github.com/apache/spark/pull/23519#discussion_r251292265 ## File path: sql/core/src/test/scala/org/apache/spark/sql/execution/ExchangeSuite.scala

[GitHub] HyukjinKwon commented on a change in pull request #23668: [SPARK-26660][FOLLOWUP] Add warning logs when broadcasting large task binary

2019-01-27 Thread GitBox
HyukjinKwon commented on a change in pull request #23668: [SPARK-26660][FOLLOWUP] Add warning logs when broadcasting large task binary URL: https://github.com/apache/spark/pull/23668#discussion_r251292205 ## File path:

[GitHub] liupc commented on issue #23614: [SPARK-26689]Support blacklisting bad disk directory and retry in DiskBlockManager

2019-01-27 Thread GitBox
liupc commented on issue #23614: [SPARK-26689]Support blacklisting bad disk directory and retry in DiskBlockManager URL: https://github.com/apache/spark/pull/23614#issuecomment-458010952 cc @srowen @vanzin @HyukjinKwon @squito @dongjoon-hyun Could anybody have a look at this PR and give

[GitHub] liupc commented on issue #23580: [SPARK-26660]Add warning logs when broadcasting large task binary

2019-01-27 Thread GitBox
liupc commented on issue #23580: [SPARK-26660]Add warning logs when broadcasting large task binary URL: https://github.com/apache/spark/pull/23580#issuecomment-458010241 @maropu @srowen @HyukjinKwon Really sorry for that, I will be more careful next time!

[GitHub] bersprockets edited a comment on issue #23336: [SPARK-26378][SQL] Restore performance of queries against wide CSV/JSON tables

2019-01-27 Thread GitBox
bersprockets edited a comment on issue #23336: [SPARK-26378][SQL] Restore performance of queries against wide CSV/JSON tables URL: https://github.com/apache/spark/pull/23336#issuecomment-458008758 @HyukjinKwon The last time CSVBenchmark-results.txt or JSONBenchmark-results.txt was

[GitHub] bersprockets commented on issue #23336: [SPARK-26378][SQL] Restore performance of queries against wide CSV/JSON tables

2019-01-27 Thread GitBox
bersprockets commented on issue #23336: [SPARK-26378][SQL] Restore performance of queries against wide CSV/JSON tables URL: https://github.com/apache/spark/pull/23336#issuecomment-458008758 @HyukjinKwon The last time CSVBenchmark-results.txt or JSONBenchmark-results.txt was

[GitHub] caneGuy commented on a change in pull request #23519: [SPARK-26601][SQL] Make broadcast-exchange thread pool configurable

2019-01-27 Thread GitBox
caneGuy commented on a change in pull request #23519: [SPARK-26601][SQL] Make broadcast-exchange thread pool configurable URL: https://github.com/apache/spark/pull/23519#discussion_r251289944 ## File path:

[GitHub] caneGuy commented on a change in pull request #23519: [SPARK-26601][SQL] Make broadcast-exchange thread pool configurable

2019-01-27 Thread GitBox
caneGuy commented on a change in pull request #23519: [SPARK-26601][SQL] Make broadcast-exchange thread pool configurable URL: https://github.com/apache/spark/pull/23519#discussion_r251289872 ## File path:

[GitHub] liupc commented on a change in pull request #23580: [SPARK-26660]Add warning logs when broadcasting large task binary

2019-01-27 Thread GitBox
liupc commented on a change in pull request #23580: [SPARK-26660]Add warning logs when broadcasting large task binary URL: https://github.com/apache/spark/pull/23580#discussion_r251289438 ## File path: core/src/main/scala/org/apache/spark/scheduler/DAGScheduler.scala ##

[GitHub] deshanxiao commented on a change in pull request #23668: [SPARK-26660][FOLLOWUP] Add warning logs when broadcasting large task binary

2019-01-27 Thread GitBox
deshanxiao commented on a change in pull request #23668: [SPARK-26660][FOLLOWUP] Add warning logs when broadcasting large task binary URL: https://github.com/apache/spark/pull/23668#discussion_r251289378 ## File path:

[GitHub] HyukjinKwon commented on a change in pull request #23580: [SPARK-26660]Add warning logs when broadcasting large task binary

2019-01-27 Thread GitBox
HyukjinKwon commented on a change in pull request #23580: [SPARK-26660]Add warning logs when broadcasting large task binary URL: https://github.com/apache/spark/pull/23580#discussion_r251287698 ## File path: core/src/main/scala/org/apache/spark/scheduler/DAGScheduler.scala

[GitHub] deshanxiao edited a comment on issue #23637: [SPARK-26714][CORE][WEBUI] Show 0 partition job in WebUI

2019-01-27 Thread GitBox
deshanxiao edited a comment on issue #23637: [SPARK-26714][CORE][WEBUI] Show 0 partition job in WebUI URL: https://github.com/apache/spark/pull/23637#issuecomment-458003956 Yes, it takes no time at all and it always succeeds. I think in this case, we could not check it twice. Event not

[GitHub] deshanxiao commented on issue #23637: [SPARK-26714][CORE][WEBUI] Show 0 partition job in WebUI

2019-01-27 Thread GitBox
deshanxiao commented on issue #23637: [SPARK-26714][CORE][WEBUI] Show 0 partition job in WebUI URL: https://github.com/apache/spark/pull/23637#issuecomment-458003956 Yes, it takes no time at all and it always succeeds. I think in this case, we could not check it twice. Event not creating

[GitHub] AmplabJenkins commented on issue #23665: [SPARK-26745][SQL] Skip empty lines in JSON-derived DataFrames when skipParsing optimization in effect

2019-01-27 Thread GitBox
AmplabJenkins commented on issue #23665: [SPARK-26745][SQL] Skip empty lines in JSON-derived DataFrames when skipParsing optimization in effect URL: https://github.com/apache/spark/pull/23665#issuecomment-458003268 Test PASSed. Refer to this link for build results (access rights to CI

[GitHub] AmplabJenkins removed a comment on issue #23665: [SPARK-26745][SQL] Skip empty lines in JSON-derived DataFrames when skipParsing optimization in effect

2019-01-27 Thread GitBox
AmplabJenkins removed a comment on issue #23665: [SPARK-26745][SQL] Skip empty lines in JSON-derived DataFrames when skipParsing optimization in effect URL: https://github.com/apache/spark/pull/23665#issuecomment-458003268 Test PASSed. Refer to this link for build results (access rights

[GitHub] SparkQA removed a comment on issue #23665: [SPARK-26745][SQL] Skip empty lines in JSON-derived DataFrames when skipParsing optimization in effect

2019-01-27 Thread GitBox
SparkQA removed a comment on issue #23665: [SPARK-26745][SQL] Skip empty lines in JSON-derived DataFrames when skipParsing optimization in effect URL: https://github.com/apache/spark/pull/23665#issuecomment-457973075 **[Test build #101739 has

[GitHub] AmplabJenkins removed a comment on issue #23665: [SPARK-26745][SQL] Skip empty lines in JSON-derived DataFrames when skipParsing optimization in effect

2019-01-27 Thread GitBox
AmplabJenkins removed a comment on issue #23665: [SPARK-26745][SQL] Skip empty lines in JSON-derived DataFrames when skipParsing optimization in effect URL: https://github.com/apache/spark/pull/23665#issuecomment-458003266 Merged build finished. Test PASSed.

[GitHub] SparkQA commented on issue #23665: [SPARK-26745][SQL] Skip empty lines in JSON-derived DataFrames when skipParsing optimization in effect

2019-01-27 Thread GitBox
SparkQA commented on issue #23665: [SPARK-26745][SQL] Skip empty lines in JSON-derived DataFrames when skipParsing optimization in effect URL: https://github.com/apache/spark/pull/23665#issuecomment-458003043 **[Test build #101739 has

[GitHub] AmplabJenkins commented on issue #23665: [SPARK-26745][SQL] Skip empty lines in JSON-derived DataFrames when skipParsing optimization in effect

2019-01-27 Thread GitBox
AmplabJenkins commented on issue #23665: [SPARK-26745][SQL] Skip empty lines in JSON-derived DataFrames when skipParsing optimization in effect URL: https://github.com/apache/spark/pull/23665#issuecomment-458003266 Merged build finished. Test PASSed.

[GitHub] zjf2012 commented on issue #23560: [SPARK-26632][Spark Core] Separate Thread Configurations of Driver and Executor

2019-01-27 Thread GitBox
zjf2012 commented on issue #23560: [SPARK-26632][Spark Core] Separate Thread Configurations of Driver and Executor URL: https://github.com/apache/spark/pull/23560#issuecomment-458002943 @jerryshao , I checked configuration.md as well as other related md files under the docs folder. I

[GitHub] AmplabJenkins removed a comment on issue #23668: [SPARK-26660][FOLLOWUP] Add warning logs when broadcasting large task binary

2019-01-27 Thread GitBox
AmplabJenkins removed a comment on issue #23668: [SPARK-26660][FOLLOWUP] Add warning logs when broadcasting large task binary URL: https://github.com/apache/spark/pull/23668#issuecomment-458001541 Merged build finished. Test PASSed.

[GitHub] AmplabJenkins commented on issue #23668: [SPARK-26660][FOLLOWUP] Add warning logs when broadcasting large task binary

2019-01-27 Thread GitBox
AmplabJenkins commented on issue #23668: [SPARK-26660][FOLLOWUP] Add warning logs when broadcasting large task binary URL: https://github.com/apache/spark/pull/23668#issuecomment-458001546 Test PASSed. Refer to this link for build results (access rights to CI server needed):

[GitHub] maropu commented on issue #23668: [SPARK-26660][FOLLOWUP] Add warning logs when broadcasting large task binary

2019-01-27 Thread GitBox
maropu commented on issue #23668: [SPARK-26660][FOLLOWUP] Add warning logs when broadcasting large task binary URL: https://github.com/apache/spark/pull/23668#issuecomment-458001665 LGTM

[GitHub] viirya commented on a change in pull request #22193: [SPARK-25186][SQL] Remove v2 save mode.

2019-01-27 Thread GitBox
viirya commented on a change in pull request #22193: [SPARK-25186][SQL] Remove v2 save mode. URL: https://github.com/apache/spark/pull/22193#discussion_r251284321 ## File path: sql/core/src/main/scala/org/apache/spark/sql/execution/datasources/v2/DataSourceV2Strategy.scala

[GitHub] AmplabJenkins removed a comment on issue #23668: [SPARK-26660][FOLLOWUP] Add warning logs when broadcasting large task binary

2019-01-27 Thread GitBox
AmplabJenkins removed a comment on issue #23668: [SPARK-26660][FOLLOWUP] Add warning logs when broadcasting large task binary URL: https://github.com/apache/spark/pull/23668#issuecomment-458001546 Test PASSed. Refer to this link for build results (access rights to CI server needed):

[GitHub] AmplabJenkins commented on issue #23668: [SPARK-26660][FOLLOWUP] Add warning logs when broadcasting large task binary

2019-01-27 Thread GitBox
AmplabJenkins commented on issue #23668: [SPARK-26660][FOLLOWUP] Add warning logs when broadcasting large task binary URL: https://github.com/apache/spark/pull/23668#issuecomment-458001541 Merged build finished. Test PASSed.

[GitHub] srowen commented on issue #21632: [SPARK-19591][ML][MLlib] Add sample weights to decision trees

2019-01-27 Thread GitBox
srowen commented on issue #21632: [SPARK-19591][ML][MLlib] Add sample weights to decision trees URL: https://github.com/apache/spark/pull/21632#issuecomment-458001245 Is there a good reason to scale it by the square of the samples? If not, yeah, worth a follow-up. If there is a good

[GitHub] SparkQA commented on issue #23668: [SPARK-26660][FOLLOWUP] Add warning logs when broadcasting large task binary

2019-01-27 Thread GitBox
SparkQA commented on issue #23668: [SPARK-26660][FOLLOWUP] Add warning logs when broadcasting large task binary URL: https://github.com/apache/spark/pull/23668#issuecomment-458001181 **[Test build #101746 has

[GitHub] HeartSaVioR edited a comment on issue #23634: [SPARK-26154][SS] Streaming left/right outer join should not return outer nulls for already matched rows

2019-01-27 Thread GitBox
HeartSaVioR edited a comment on issue #23634: [SPARK-26154][SS] Streaming left/right outer join should not return outer nulls for already matched rows URL: https://github.com/apache/spark/pull/23634#issuecomment-458000808 > The rows in one of the inputs are immediately discarded while I

[GitHub] HeartSaVioR commented on issue #23634: [SPARK-26154][SS] Streaming left/right outer join should not return outer nulls for already matched rows

2019-01-27 Thread GitBox
HeartSaVioR commented on issue #23634: [SPARK-26154][SS] Streaming left/right outer join should not return outer nulls for already matched rows URL: https://github.com/apache/spark/pull/23634#issuecomment-458000808 > The rows in one of the inputs are immediately discarded while I think

[GitHub] srowen opened a new pull request #23668: [SPARK-26660][FOLLOWUP] Add warning logs when broadcasting large task binary

2019-01-27 Thread GitBox
srowen opened a new pull request #23668: [SPARK-26660][FOLLOWUP] Add warning logs when broadcasting large task binary URL: https://github.com/apache/spark/pull/23668 ## What changes were proposed in this pull request? The warning introduced in

[GitHub] maropu commented on issue #23580: [SPARK-26660]Add warning logs when broadcasting large task binary

2019-01-27 Thread GitBox
maropu commented on issue #23580: [SPARK-26660]Add warning logs when broadcasting large task binary URL: https://github.com/apache/spark/pull/23580#issuecomment-458000585 oh...thanks!

[GitHub] srowen commented on issue #23580: [SPARK-26660]Add warning logs when broadcasting large task binary

2019-01-27 Thread GitBox
srowen commented on issue #23580: [SPARK-26660]Add warning logs when broadcasting large task binary URL: https://github.com/apache/spark/pull/23580#issuecomment-458000380 Oh, shoot, there's a bug here: `if (taskBinaryBytes.length * 1000 > TaskSetManager.TASK_SIZE_TO_WARN_KB)` The `*
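The condition quoted in this comment appears to compare a byte count against a threshold named in KB. A minimal sketch of that kind of unit mismatch follows. It is illustrative only: the object name `TaskSizeCheck`, the threshold value of 100, and the 1024 bytes-per-KB conversion are assumptions made for the example, not taken from the Spark source, and the rest of the comment is truncated in this archive, so the actual fix may differ.

```scala
// Hypothetical names and values throughout; this is not Spark's actual code.
object TaskSizeCheck {
  // Assumed threshold, expressed in KB to mirror the "_KB" suffix of the quoted constant.
  val TaskSizeToWarnKb: Long = 100L

  // Same shape as the quoted condition: the byte count is multiplied by 1000
  // and then compared with a KB value, so even a 1-byte payload trips the warning.
  def exceedsAsQuoted(lengthBytes: Long): Boolean =
    lengthBytes * 1000 > TaskSizeToWarnKb

  // One plausible correction (an assumption): convert the KB threshold to bytes
  // and compare like with like.
  def exceedsWithSameUnits(lengthBytes: Long): Boolean =
    lengthBytes > TaskSizeToWarnKb * 1024
}
```

With these assumed values, `exceedsAsQuoted` is true for any non-empty payload, while `exceedsWithSameUnits` only becomes true once the payload is larger than 102400 bytes.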

[GitHub] sadhen commented on a change in pull request #23276: [SPARK-26321][SQL] Improve the behavior of sql text splitting for the spark-sql command line

2019-01-27 Thread GitBox
sadhen commented on a change in pull request #23276: [SPARK-26321][SQL] Improve the behavior of sql text splitting for the spark-sql command line URL: https://github.com/apache/spark/pull/23276#discussion_r251283220 ## File path:

[GitHub] srowen commented on a change in pull request #23534: [SPARK-26610][PYTHON] Fix inconsistency between toJSON Method in Python and Scala.

2019-01-27 Thread GitBox
srowen commented on a change in pull request #23534: [SPARK-26610][PYTHON] Fix inconsistency between toJSON Method in Python and Scala. URL: https://github.com/apache/spark/pull/23534#discussion_r251283106 ## File path: docs/sql-migration-guide-upgrade.md ## @@ -45,6

[GitHub] AmplabJenkins removed a comment on issue #23667: [SPARK-26745][SQL] Revert count optimization in JSON datasource by SPARK-24959

2019-01-27 Thread GitBox
AmplabJenkins removed a comment on issue #23667: [SPARK-26745][SQL] Revert count optimization in JSON datasource by SPARK-24959 URL: https://github.com/apache/spark/pull/23667#issuecomment-457999378 Merged build finished. Test PASSed.

[GitHub] AmplabJenkins removed a comment on issue #23667: [SPARK-26745][SQL] Revert count optimization in JSON datasource by SPARK-24959

2019-01-27 Thread GitBox
AmplabJenkins removed a comment on issue #23667: [SPARK-26745][SQL] Revert count optimization in JSON datasource by SPARK-24959 URL: https://github.com/apache/spark/pull/23667#issuecomment-457999380 Test PASSed. Refer to this link for build results (access rights to CI server needed):

[GitHub] AmplabJenkins commented on issue #23667: [SPARK-26745][SQL] Revert count optimization in JSON datasource by SPARK-24959

2019-01-27 Thread GitBox
AmplabJenkins commented on issue #23667: [SPARK-26745][SQL] Revert count optimization in JSON datasource by SPARK-24959 URL: https://github.com/apache/spark/pull/23667#issuecomment-457999378 Merged build finished. Test PASSed.

[GitHub] AmplabJenkins commented on issue #23667: [SPARK-26745][SQL] Revert count optimization in JSON datasource by SPARK-24959

2019-01-27 Thread GitBox
AmplabJenkins commented on issue #23667: [SPARK-26745][SQL] Revert count optimization in JSON datasource by SPARK-24959 URL: https://github.com/apache/spark/pull/23667#issuecomment-457999380 Test PASSed. Refer to this link for build results (access rights to CI server needed):

[GitHub] SparkQA commented on issue #23667: [SPARK-26745][SQL] Revert count optimization in JSON datasource by SPARK-24959

2019-01-27 Thread GitBox
SparkQA commented on issue #23667: [SPARK-26745][SQL] Revert count optimization in JSON datasource by SPARK-24959 URL: https://github.com/apache/spark/pull/23667#issuecomment-457999116 **[Test build #101745 has

[GitHub] HyukjinKwon commented on issue #23665: [SPARK-26745][SQL] Skip empty lines in JSON-derived DataFrames when skipParsing optimization in effect

2019-01-27 Thread GitBox
HyukjinKwon commented on issue #23665: [SPARK-26745][SQL] Skip empty lines in JSON-derived DataFrames when skipParsing optimization in effect URL: https://github.com/apache/spark/pull/23665#issuecomment-457998561 @sumitsu, if we agree upon reverting it (at 23667), let's convert this PR

[GitHub] HyukjinKwon edited a comment on issue #23665: [SPARK-26745][SQL] Skip empty lines in JSON-derived DataFrames when skipParsing optimization in effect

2019-01-27 Thread GitBox
HyukjinKwon edited a comment on issue #23665: [SPARK-26745][SQL] Skip empty lines in JSON-derived DataFrames when skipParsing optimization in effect URL: https://github.com/apache/spark/pull/23665#issuecomment-457998561 @sumitsu, if we agree upon reverting it (at #23667), let's convert

[GitHub] HyukjinKwon opened a new pull request #23667: [SPARK-26745][SQL] Revert count optimization in JSON datasource by SPARK-24959

2019-01-27 Thread GitBox
HyukjinKwon opened a new pull request #23667: [SPARK-26745][SQL] Revert count optimization in JSON datasource by SPARK-24959 URL: https://github.com/apache/spark/pull/23667 ## What changes were proposed in this pull request? This PR reverts JSON count optimization part of #21909.

[GitHub] maropu commented on a change in pull request #23665: [SPARK-26745][SQL] Skip empty lines in JSON-derived DataFrames when skipParsing optimization in effect

2019-01-27 Thread GitBox
maropu commented on a change in pull request #23665: [SPARK-26745][SQL] Skip empty lines in JSON-derived DataFrames when skipParsing optimization in effect URL: https://github.com/apache/spark/pull/23665#discussion_r251281725 ## File path:

[GitHub] cloud-fan commented on a change in pull request #23276: [SPARK-26321][SQL] Improve the behavior of sql text splitting for the spark-sql command line

2019-01-27 Thread GitBox
cloud-fan commented on a change in pull request #23276: [SPARK-26321][SQL] Improve the behavior of sql text splitting for the spark-sql command line URL: https://github.com/apache/spark/pull/23276#discussion_r251281683 ## File path:

[GitHub] imatiach-msft commented on issue #21632: [SPARK-19591][ML][MLlib] Add sample weights to decision trees

2019-01-27 Thread GitBox
imatiach-msft commented on issue #21632: [SPARK-19591][ML][MLlib] Add sample weights to decision trees URL: https://github.com/apache/spark/pull/21632#issuecomment-457997552 @srowen thank you for the merge and the thorough review. I have some doubts about the tolerance we decided for

[GitHub] ueshin commented on a change in pull request #23534: [SPARK-26610][PYTHON] Fix inconsistency between toJSON Method in Python and Scala.

2019-01-27 Thread GitBox
ueshin commented on a change in pull request #23534: [SPARK-26610][PYTHON] Fix inconsistency between toJSON Method in Python and Scala. URL: https://github.com/apache/spark/pull/23534#discussion_r251281383 ## File path: docs/sql-migration-guide-upgrade.md ## @@ -45,6

[GitHub] cloud-fan commented on issue #23645: [SPARK-26725][TEST] Fix the input values of UnifiedMemoryManager constructor in test suites

2019-01-27 Thread GitBox
cloud-fan commented on issue #23645: [SPARK-26725][TEST] Fix the input values of UnifiedMemoryManager constructor in test suites URL: https://github.com/apache/spark/pull/23645#issuecomment-457997323 thanks, merging to master!

[GitHub] cloud-fan closed pull request #23645: [SPARK-26725][TEST] Fix the input values of UnifiedMemoryManager constructor in test suites

2019-01-27 Thread GitBox
cloud-fan closed pull request #23645: [SPARK-26725][TEST] Fix the input values of UnifiedMemoryManager constructor in test suites URL: https://github.com/apache/spark/pull/23645 This is an automated message from the Apache

[GitHub] HeartSaVioR commented on issue #23666: [SPARK-26718][SS][Master] Fixed integer overflow in SS kafka rateLimit calculation

2019-01-27 Thread GitBox
HeartSaVioR commented on issue #23666: [SPARK-26718][SS][Master] Fixed integer overflow in SS kafka rateLimit calculation URL: https://github.com/apache/spark/pull/23666#issuecomment-457997277 You might be confused regarding policy for the first time, but we only add `[BRANCH-NAME]` for

[GitHub] cloud-fan commented on a change in pull request #23644: [SPARK-26708][SQL] Incorrect result caused by inconsistency between a SQL cache's cached RDD and its physical plan

2019-01-27 Thread GitBox
cloud-fan commented on a change in pull request #23644: [SPARK-26708][SQL] Incorrect result caused by inconsistency between a SQL cache's cached RDD and its physical plan URL: https://github.com/apache/spark/pull/23644#discussion_r251281049 ## File path:

[GitHub] HeartSaVioR commented on a change in pull request #23666: [SPARK-26718][SS][Master] Fixed integer overflow in SS kafka rateLimit calculation

2019-01-27 Thread GitBox
HeartSaVioR commented on a change in pull request #23666: [SPARK-26718][SS][Master] Fixed integer overflow in SS kafka rateLimit calculation URL: https://github.com/apache/spark/pull/23666#discussion_r251280936 ## File path:

[GitHub] liupc edited a comment on issue #23647: [SPARK-26712]Support multi directories for executor shuffle info recovery in yarn shuffle serivce

2019-01-27 Thread GitBox
liupc edited a comment on issue #23647: [SPARK-26712]Support multi directories for executor shuffle info recovery in yarn shuffle serivce URL: https://github.com/apache/spark/pull/23647#issuecomment-457995661 @vanzin @HyukjinKwon we once run into a similar problem on Spark2.0.1 when

[GitHub] sadhen commented on a change in pull request #23276: [SPARK-26321][SQL] Improve the behavior of sql text splitting for the spark-sql command line

2019-01-27 Thread GitBox
sadhen commented on a change in pull request #23276: [SPARK-26321][SQL] Improve the behavior of sql text splitting for the spark-sql command line URL: https://github.com/apache/spark/pull/23276#discussion_r251280587 ## File path:

[GitHub] HyukjinKwon commented on a change in pull request #21909: [SPARK-24959][SQL] Speed up count() for JSON and CSV

2019-01-27 Thread GitBox
HyukjinKwon commented on a change in pull request #21909: [SPARK-24959][SQL] Speed up count() for JSON and CSV URL: https://github.com/apache/spark/pull/21909#discussion_r251280295 ## File path: sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/json/JacksonParser.scala

[GitHub] liupc commented on issue #23647: [SPARK-26712]Support multi directories for executor shuffle info recovery in yarn shuffle serivce

2019-01-27 Thread GitBox
liupc commented on issue #23647: [SPARK-26712]Support multi directories for executor shuffle info recovery in yarn shuffle serivce URL: https://github.com/apache/spark/pull/23647#issuecomment-457995661 @vanzin @HyukjinKwon we once run into a similar problem on Spark2.0.1 when

[GitHub] AmplabJenkins commented on issue #23666: [SPARK-26718][SS][Master] Fixed integer overflow in SS kafka rateLimit calculation

2019-01-27 Thread GitBox
AmplabJenkins commented on issue #23666: [SPARK-26718][SS][Master] Fixed integer overflow in SS kafka rateLimit calculation URL: https://github.com/apache/spark/pull/23666#issuecomment-457994835 Merged build finished. Test PASSed.

[GitHub] AmplabJenkins removed a comment on issue #23666: [SPARK-26718][SS][Master] Fixed integer overflow in SS kafka rateLimit calculation

2019-01-27 Thread GitBox
AmplabJenkins removed a comment on issue #23666: [SPARK-26718][SS][Master] Fixed integer overflow in SS kafka rateLimit calculation URL: https://github.com/apache/spark/pull/23666#issuecomment-457994835 Merged build finished. Test PASSed.
