[GitHub] [spark] gatorsmile commented on a change in pull request #24164: [SPARK-27225][SQL] Implement join strategy hints

2019-03-26 Thread GitBox
gatorsmile commented on a change in pull request #24164: [SPARK-27225][SQL] Implement join strategy hints URL: https://github.com/apache/spark/pull/24164#discussion_r269424547 ## File path: sql/core/src/main/scala/org/apache/spark/sql/execution/SparkStrategies.scala ## @@

[GitHub] [spark] SparkQA commented on issue #19096: [SPARK-21869][SS] A cached Kafka producer should not be closed if any task is using it - adds inuse tracking.

2019-03-26 Thread GitBox
SparkQA commented on issue #19096: [SPARK-21869][SS] A cached Kafka producer should not be closed if any task is using it - adds inuse tracking. URL: https://github.com/apache/spark/pull/19096#issuecomment-477003736 **[Test build #104003 has started](https://amplab.cs.berkeley.edu/jenkins/

[GitHub] [spark] AmplabJenkins removed a comment on issue #19096: [SPARK-21869][SS] A cached Kafka producer should not be closed if any task is using it - adds inuse tracking.

2019-03-26 Thread GitBox
AmplabJenkins removed a comment on issue #19096: [SPARK-21869][SS] A cached Kafka producer should not be closed if any task is using it - adds inuse tracking. URL: https://github.com/apache/spark/pull/19096#issuecomment-477003265 Test PASSed. Refer to this link for build results (access

[GitHub] [spark] AmplabJenkins removed a comment on issue #19096: [SPARK-21869][SS] A cached Kafka producer should not be closed if any task is using it - adds inuse tracking.

2019-03-26 Thread GitBox
AmplabJenkins removed a comment on issue #19096: [SPARK-21869][SS] A cached Kafka producer should not be closed if any task is using it - adds inuse tracking. URL: https://github.com/apache/spark/pull/19096#issuecomment-477003262 Merged build finished. Test PASSed. ---

[GitHub] [spark] AmplabJenkins commented on issue #19096: [SPARK-21869][SS] A cached Kafka producer should not be closed if any task is using it - adds inuse tracking.

2019-03-26 Thread GitBox
AmplabJenkins commented on issue #19096: [SPARK-21869][SS] A cached Kafka producer should not be closed if any task is using it - adds inuse tracking. URL: https://github.com/apache/spark/pull/19096#issuecomment-477003265 Test PASSed. Refer to this link for build results (access rights t

[GitHub] [spark] AmplabJenkins commented on issue #19096: [SPARK-21869][SS] A cached Kafka producer should not be closed if any task is using it - adds inuse tracking.

2019-03-26 Thread GitBox
AmplabJenkins commented on issue #19096: [SPARK-21869][SS] A cached Kafka producer should not be closed if any task is using it - adds inuse tracking. URL: https://github.com/apache/spark/pull/19096#issuecomment-477003262 Merged build finished. Test PASSed.

[GitHub] [spark] cloud-fan closed pull request #23998: [SPARK-27083][SQL]Add a new conf to control subqueryReuse

2019-03-26 Thread GitBox
cloud-fan closed pull request #23998: [SPARK-27083][SQL]Add a new conf to control subqueryReuse URL: https://github.com/apache/spark/pull/23998 This is an automated message from the Apache Git Service. To respond to the mess

[GitHub] [spark] HeartSaVioR commented on a change in pull request #19096: [SPARK-21869][SS] A cached Kafka producer should not be closed if any task is using it - adds inuse tracking.

2019-03-26 Thread GitBox
HeartSaVioR commented on a change in pull request #19096: [SPARK-21869][SS] A cached Kafka producer should not be closed if any task is using it - adds inuse tracking. URL: https://github.com/apache/spark/pull/19096#discussion_r269421228 ## File path: external/kafka-0-10-sql/src/m

[GitHub] [spark] HeartSaVioR commented on a change in pull request #19096: [SPARK-21869][SS] A cached Kafka producer should not be closed if any task is using it - adds inuse tracking.

2019-03-26 Thread GitBox
HeartSaVioR commented on a change in pull request #19096: [SPARK-21869][SS] A cached Kafka producer should not be closed if any task is using it - adds inuse tracking. URL: https://github.com/apache/spark/pull/19096#discussion_r269421228 ## File path: external/kafka-0-10-sql/src/m

[GitHub] [spark] cloud-fan commented on issue #23998: [SPARK-27083][SQL]Add a new conf to control subqueryReuse

2019-03-26 Thread GitBox
cloud-fan commented on issue #23998: [SPARK-27083][SQL]Add a new conf to control subqueryReuse URL: https://github.com/apache/spark/pull/23998#issuecomment-476999892 thanks, merging to master! This is an automated message fro

[GitHub] [spark] ScrapCodes commented on a change in pull request #19096: [SPARK-21869][SS] A cached Kafka producer should not be closed if any task is using it - adds inuse tracking.

2019-03-26 Thread GitBox
ScrapCodes commented on a change in pull request #19096: [SPARK-21869][SS] A cached Kafka producer should not be closed if any task is using it - adds inuse tracking. URL: https://github.com/apache/spark/pull/19096#discussion_r269419077 ## File path: external/kafka-0-10-sql/src/ma

[GitHub] [spark] dbtsai commented on issue #24220: [SPARK-27288][SQL] Pruning nested field in complex map key from object serializers

2019-03-26 Thread GitBox
dbtsai commented on issue #24220: [SPARK-27288][SQL] Pruning nested field in complex map key from object serializers URL: https://github.com/apache/spark/pull/24220#issuecomment-476996160 LGTM. This is an automated message fr

[GitHub] [spark] gatorsmile commented on issue #24136: [SPARK-27088][SQL] Add a configuration to set log level for each batch at RuleExecutor

2019-03-26 Thread GitBox
gatorsmile commented on issue #24136: [SPARK-27088][SQL] Add a configuration to set log level for each batch at RuleExecutor URL: https://github.com/apache/spark/pull/24136#issuecomment-476993535 LGTM except the above comment. --

[GitHub] [spark] gatorsmile commented on a change in pull request #24136: [SPARK-27088][SQL] Add a configuration to set log level for each batch at RuleExecutor

2019-03-26 Thread GitBox
gatorsmile commented on a change in pull request #24136: [SPARK-27088][SQL] Add a configuration to set log level for each batch at RuleExecutor URL: https://github.com/apache/spark/pull/24136#discussion_r269416911 ## File path: sql/catalyst/src/main/scala/org/apache/spark/sql/cata

[GitHub] [spark] 10110346 commented on a change in pull request #24214: [SPARK-27279][SQL] Reuse subquery should compare child plan of `SubqueryExec`

2019-03-26 Thread GitBox
10110346 commented on a change in pull request #24214: [SPARK-27279][SQL] Reuse subquery should compare child plan of `SubqueryExec` URL: https://github.com/apache/spark/pull/24214#discussion_r269413253 ## File path: sql/catalyst/src/main/scala/org/apache/spark/sql/internal/SQLConf

[GitHub] [spark] 10110346 commented on a change in pull request #24214: [SPARK-27279][SQL] Reuse subquery should compare child plan of `SubqueryExec`

2019-03-26 Thread GitBox
10110346 commented on a change in pull request #24214: [SPARK-27279][SQL] Reuse subquery should compare child plan of `SubqueryExec` URL: https://github.com/apache/spark/pull/24214#discussion_r269413177 ## File path: sql/core/src/main/scala/org/apache/spark/sql/execution/basicPhysi

[GitHub] [spark] HeartSaVioR commented on a change in pull request #23747: [SPARK-26848][SQL] Introduce new option to Kafka source: offset by timestamp (starting/ending)

2019-03-26 Thread GitBox
HeartSaVioR commented on a change in pull request #23747: [SPARK-26848][SQL] Introduce new option to Kafka source: offset by timestamp (starting/ending) URL: https://github.com/apache/spark/pull/23747#discussion_r269410616 ## File path: external/kafka-0-10-sql/src/main/scala/org/ap

[GitHub] [spark] HeartSaVioR commented on a change in pull request #23747: [SPARK-26848][SQL] Introduce new option to Kafka source: offset by timestamp (starting/ending)

2019-03-26 Thread GitBox
HeartSaVioR commented on a change in pull request #23747: [SPARK-26848][SQL] Introduce new option to Kafka source: offset by timestamp (starting/ending) URL: https://github.com/apache/spark/pull/23747#discussion_r269410616 ## File path: external/kafka-0-10-sql/src/main/scala/org/ap

[GitHub] [spark] chakravarthiT edited a comment on issue #24136: [SPARK-27088][SQL] Add a configuration to set log level for each batch at RuleExecutor

2019-03-26 Thread GitBox
chakravarthiT edited a comment on issue #24136: [SPARK-27088][SQL] Add a configuration to set log level for each batch at RuleExecutor URL: https://github.com/apache/spark/pull/24136#issuecomment-476948267 @maropu @HyukjinKwon handled review comments, and now that UT is passed, as messag

[GitHub] [spark] HeartSaVioR commented on a change in pull request #23747: [SPARK-26848][SQL] Introduce new option to Kafka source: offset by timestamp (starting/ending)

2019-03-26 Thread GitBox
HeartSaVioR commented on a change in pull request #23747: [SPARK-26848][SQL] Introduce new option to Kafka source: offset by timestamp (starting/ending) URL: https://github.com/apache/spark/pull/23747#discussion_r269410616 ## File path: external/kafka-0-10-sql/src/main/scala/org/ap

[GitHub] [spark] felixcheung commented on issue #23721: [SPARK-26797][SQL][WIP][test-maven] Start using the new logical types API of Parquet 1.11.0 instead of the deprecated one

2019-03-26 Thread GitBox
felixcheung commented on issue #23721: [SPARK-26797][SQL][WIP][test-maven] Start using the new logical types API of Parquet 1.11.0 instead of the deprecated one URL: https://github.com/apache/spark/pull/23721#issuecomment-476985992 that's great, honestly we can't merge this until parquet 1

[GitHub] [spark] HeartSaVioR commented on a change in pull request #23747: [SPARK-26848][SQL] Introduce new option to Kafka source: offset by timestamp (starting/ending)

2019-03-26 Thread GitBox
HeartSaVioR commented on a change in pull request #23747: [SPARK-26848][SQL] Introduce new option to Kafka source: offset by timestamp (starting/ending) URL: https://github.com/apache/spark/pull/23747#discussion_r269410616 ## File path: external/kafka-0-10-sql/src/main/scala/org/ap

[GitHub] [spark] HeartSaVioR commented on a change in pull request #23747: [SPARK-26848][SQL] Introduce new option to Kafka source: offset by timestamp (starting/ending)

2019-03-26 Thread GitBox
HeartSaVioR commented on a change in pull request #23747: [SPARK-26848][SQL] Introduce new option to Kafka source: offset by timestamp (starting/ending) URL: https://github.com/apache/spark/pull/23747#discussion_r269410616 ## File path: external/kafka-0-10-sql/src/main/scala/org/ap

[GitHub] [spark] cloud-fan closed pull request #24225: [SPARK-27286][SQL] Handles exceptions on proceeding to next record in FilePartitionReader

2019-03-26 Thread GitBox
cloud-fan closed pull request #24225: [SPARK-27286][SQL] Handles exceptions on proceeding to next record in FilePartitionReader URL: https://github.com/apache/spark/pull/24225 This is an automated message from the Apache Git

[GitHub] [spark] gatorsmile closed pull request #24119: [SPARK-27182][SQL] Move the conflict source code of the sql/core module to sql/core/v1.2.1

2019-03-26 Thread GitBox
gatorsmile closed pull request #24119: [SPARK-27182][SQL] Move the conflict source code of the sql/core module to sql/core/v1.2.1 URL: https://github.com/apache/spark/pull/24119 This is an automated message from the Apache G

[GitHub] [spark] cloud-fan commented on issue #24225: [SPARK-27286][SQL] Handles exceptions on proceeding to next record in FilePartitionReader

2019-03-26 Thread GitBox
cloud-fan commented on issue #24225: [SPARK-27286][SQL] Handles exceptions on proceeding to next record in FilePartitionReader URL: https://github.com/apache/spark/pull/24225#issuecomment-476983293 thanks, merging to master!

[GitHub] [spark] gatorsmile commented on issue #24119: [SPARK-27182][SQL] Move the conflict source code of the sql/core module to sql/core/v1.2.1

2019-03-26 Thread GitBox
gatorsmile commented on issue #24119: [SPARK-27182][SQL] Move the conflict source code of the sql/core module to sql/core/v1.2.1 URL: https://github.com/apache/spark/pull/24119#issuecomment-476982813 I am merging this to master. @wangyum Please submit your follow-up PR for making the actua

[GitHub] [spark] gatorsmile commented on a change in pull request #24119: [SPARK-27182][SQL] Move the conflict source code of the sql/core module to sql/core/v1.2.1

2019-03-26 Thread GitBox
gatorsmile commented on a change in pull request #24119: [SPARK-27182][SQL] Move the conflict source code of the sql/core module to sql/core/v1.2.1 URL: https://github.com/apache/spark/pull/24119#discussion_r269410080 ## File path: sql/core/v2.3.4/src/main/scala/org/apache/spark/sq

[GitHub] [spark] SparkQA commented on issue #24214: [SPARK-27279][SQL] Reuse subquery should compare child plan of `SubqueryExec`

2019-03-26 Thread GitBox
SparkQA commented on issue #24214: [SPARK-27279][SQL] Reuse subquery should compare child plan of `SubqueryExec` URL: https://github.com/apache/spark/pull/24214#issuecomment-476981979 **[Test build #104002 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/1040

[GitHub] [spark] AmplabJenkins removed a comment on issue #24214: [SPARK-27279][SQL] Reuse subquery should compare child plan of `SubqueryExec`

2019-03-26 Thread GitBox
AmplabJenkins removed a comment on issue #24214: [SPARK-27279][SQL] Reuse subquery should compare child plan of `SubqueryExec` URL: https://github.com/apache/spark/pull/24214#issuecomment-476981440 Merged build finished. Test PASSed.

[GitHub] [spark] AmplabJenkins removed a comment on issue #24214: [SPARK-27279][SQL] Reuse subquery should compare child plan of `SubqueryExec`

2019-03-26 Thread GitBox
AmplabJenkins removed a comment on issue #24214: [SPARK-27279][SQL] Reuse subquery should compare child plan of `SubqueryExec` URL: https://github.com/apache/spark/pull/24214#issuecomment-476981443 Test PASSed. Refer to this link for build results (access rights to CI server needed):

[GitHub] [spark] AmplabJenkins commented on issue #24214: [SPARK-27279][SQL] Reuse subquery should compare child plan of `SubqueryExec`

2019-03-26 Thread GitBox
AmplabJenkins commented on issue #24214: [SPARK-27279][SQL] Reuse subquery should compare child plan of `SubqueryExec` URL: https://github.com/apache/spark/pull/24214#issuecomment-476981440 Merged build finished. Test PASSed.

[GitHub] [spark] AmplabJenkins commented on issue #24214: [SPARK-27279][SQL] Reuse subquery should compare child plan of `SubqueryExec`

2019-03-26 Thread GitBox
AmplabJenkins commented on issue #24214: [SPARK-27279][SQL] Reuse subquery should compare child plan of `SubqueryExec` URL: https://github.com/apache/spark/pull/24214#issuecomment-476981443 Test PASSed. Refer to this link for build results (access rights to CI server needed): https:

[GitHub] [spark] gatorsmile commented on a change in pull request #24119: [SPARK-27182][SQL] Move the conflict source code of the sql/core module to sql/core/v1.2.1

2019-03-26 Thread GitBox
gatorsmile commented on a change in pull request #24119: [SPARK-27182][SQL] Move the conflict source code of the sql/core module to sql/core/v1.2.1 URL: https://github.com/apache/spark/pull/24119#discussion_r269408856 ## File path: sql/core/v2.3.4/src/main/scala/org/apache/spark/sq

[GitHub] [spark] SparkQA commented on issue #24214: [SPARK-27279][SQL] Reuse subquery should compare child plan of `SubqueryExec`

2019-03-26 Thread GitBox
SparkQA commented on issue #24214: [SPARK-27279][SQL] Reuse subquery should compare child plan of `SubqueryExec` URL: https://github.com/apache/spark/pull/24214#issuecomment-476979983 **[Test build #104001 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/1040

[GitHub] [spark] AmplabJenkins removed a comment on issue #24214: [SPARK-27279][SQL] Reuse subquery should compare child plan of `SubqueryExec`

2019-03-26 Thread GitBox
AmplabJenkins removed a comment on issue #24214: [SPARK-27279][SQL] Reuse subquery should compare child plan of `SubqueryExec` URL: https://github.com/apache/spark/pull/24214#issuecomment-476979501 Test PASSed. Refer to this link for build results (access rights to CI server needed):

[GitHub] [spark] AmplabJenkins removed a comment on issue #24214: [SPARK-27279][SQL] Reuse subquery should compare child plan of `SubqueryExec`

2019-03-26 Thread GitBox
AmplabJenkins removed a comment on issue #24214: [SPARK-27279][SQL] Reuse subquery should compare child plan of `SubqueryExec` URL: https://github.com/apache/spark/pull/24214#issuecomment-476979495 Merged build finished. Test PASSed.

[GitHub] [spark] AmplabJenkins commented on issue #24214: [SPARK-27279][SQL] Reuse subquery should compare child plan of `SubqueryExec`

2019-03-26 Thread GitBox
AmplabJenkins commented on issue #24214: [SPARK-27279][SQL] Reuse subquery should compare child plan of `SubqueryExec` URL: https://github.com/apache/spark/pull/24214#issuecomment-476979501 Test PASSed. Refer to this link for build results (access rights to CI server needed): https:

[GitHub] [spark] AmplabJenkins commented on issue #24214: [SPARK-27279][SQL] Reuse subquery should compare child plan of `SubqueryExec`

2019-03-26 Thread GitBox
AmplabJenkins commented on issue #24214: [SPARK-27279][SQL] Reuse subquery should compare child plan of `SubqueryExec` URL: https://github.com/apache/spark/pull/24214#issuecomment-476979495 Merged build finished. Test PASSed.

[GitHub] [spark] adrian-wang commented on a change in pull request #24214: [SPARK-27279][SQL] Reuse subquery should compare child plan of `SubqueryExec`

2019-03-26 Thread GitBox
adrian-wang commented on a change in pull request #24214: [SPARK-27279][SQL] Reuse subquery should compare child plan of `SubqueryExec` URL: https://github.com/apache/spark/pull/24214#discussion_r269407893 ## File path: sql/core/src/main/scala/org/apache/spark/sql/execution/subquer

[GitHub] [spark] beliefer commented on a change in pull request #24218: [SPARK-27281][DStreams] Change the way latest kafka offsets are retrieved to consumer#endOffsets

2019-03-26 Thread GitBox
beliefer commented on a change in pull request #24218: [SPARK-27281][DStreams] Change the way latest kafka offsets are retrieved to consumer#endOffsets URL: https://github.com/apache/spark/pull/24218#discussion_r269399030 ## File path: external/kafka-0-10/src/main/scala/org/apache/

[GitHub] [spark] HeartSaVioR commented on a change in pull request #23747: [SPARK-26848][SQL] Introduce new option to Kafka source: offset by timestamp (starting/ending)

2019-03-26 Thread GitBox
HeartSaVioR commented on a change in pull request #23747: [SPARK-26848][SQL] Introduce new option to Kafka source: offset by timestamp (starting/ending) URL: https://github.com/apache/spark/pull/23747#discussion_r269397115 ## File path: external/kafka-0-10-sql/src/main/scala/org/ap

[GitHub] [spark] cloud-fan edited a comment on issue #24215: [SPARK-27229][SQL] GroupBy Placement in Intersect Distinct

2019-03-26 Thread GitBox
cloud-fan edited a comment on issue #24215: [SPARK-27229][SQL] GroupBy Placement in Intersect Distinct URL: https://github.com/apache/spark/pull/24215#issuecomment-476959548 I'm afraid this may cause a big regression, as aggregate is expensive and here we do it twice. Aggregate needs

[GitHub] [spark] shivusondur commented on issue #23926: [SPARK-26872][STREAMING] Use a configurable value for final termination in the JobScheduler.stop() method

2019-03-26 Thread GitBox
shivusondur commented on issue #23926: [SPARK-26872][STREAMING] Use a configurable value for final termination in the JobScheduler.stop() method URL: https://github.com/apache/spark/pull/23926#issuecomment-476959691 @smrosenberry "only developer can configure" means it is still availabl

[GitHub] [spark] cloud-fan commented on issue #24215: [SPARK-27229][SQL] GroupBy Placement in Intersect Distinct

2019-03-26 Thread GitBox
cloud-fan commented on issue #24215: [SPARK-27229][SQL] GroupBy Placement in Intersect Distinct URL: https://github.com/apache/spark/pull/24215#issuecomment-476959548 I'm afraid this may cause a big regression, as aggregate is expensive and here we do it twice. Aggregate needs to buil

[GitHub] [spark] AmplabJenkins commented on issue #24200: [SPARK-27266][SQL] Support ANALYZE TABLE to collect tables stats for cached catalog views

2019-03-26 Thread GitBox
AmplabJenkins commented on issue #24200: [SPARK-27266][SQL] Support ANALYZE TABLE to collect tables stats for cached catalog views URL: https://github.com/apache/spark/pull/24200#issuecomment-476958489 Merged build finished. Test PASSed.

[GitHub] [spark] AmplabJenkins commented on issue #24200: [SPARK-27266][SQL] Support ANALYZE TABLE to collect tables stats for cached catalog views

2019-03-26 Thread GitBox
AmplabJenkins commented on issue #24200: [SPARK-27266][SQL] Support ANALYZE TABLE to collect tables stats for cached catalog views URL: https://github.com/apache/spark/pull/24200#issuecomment-476958495 Test PASSed. Refer to this link for build results (access rights to CI server needed):

[GitHub] [spark] AmplabJenkins removed a comment on issue #24200: [SPARK-27266][SQL] Support ANALYZE TABLE to collect tables stats for cached catalog views

2019-03-26 Thread GitBox
AmplabJenkins removed a comment on issue #24200: [SPARK-27266][SQL] Support ANALYZE TABLE to collect tables stats for cached catalog views URL: https://github.com/apache/spark/pull/24200#issuecomment-476958495 Test PASSed. Refer to this link for build results (access rights to CI server

[GitHub] [spark] AmplabJenkins removed a comment on issue #24200: [SPARK-27266][SQL] Support ANALYZE TABLE to collect tables stats for cached catalog views

2019-03-26 Thread GitBox
AmplabJenkins removed a comment on issue #24200: [SPARK-27266][SQL] Support ANALYZE TABLE to collect tables stats for cached catalog views URL: https://github.com/apache/spark/pull/24200#issuecomment-476958489 Merged build finished. Test PASSed.

[GitHub] [spark] cloud-fan commented on a change in pull request #24195: [SPARK-25496][SQL] Deprecate from_utc_timestamp and to_utc_timestamp

2019-03-26 Thread GitBox
cloud-fan commented on a change in pull request #24195: [SPARK-25496][SQL] Deprecate from_utc_timestamp and to_utc_timestamp URL: https://github.com/apache/spark/pull/24195#discussion_r269395838 ## File path: sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/expressions/dat

[GitHub] [spark] SparkQA removed a comment on issue #24200: [SPARK-27266][SQL] Support ANALYZE TABLE to collect tables stats for cached catalog views

2019-03-26 Thread GitBox
SparkQA removed a comment on issue #24200: [SPARK-27266][SQL] Support ANALYZE TABLE to collect tables stats for cached catalog views URL: https://github.com/apache/spark/pull/24200#issuecomment-476882510 **[Test build #103999 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPul

[GitHub] [spark] SparkQA commented on issue #24200: [SPARK-27266][SQL] Support ANALYZE TABLE to collect tables stats for cached catalog views

2019-03-26 Thread GitBox
SparkQA commented on issue #24200: [SPARK-27266][SQL] Support ANALYZE TABLE to collect tables stats for cached catalog views URL: https://github.com/apache/spark/pull/24200#issuecomment-476958017 **[Test build #103999 has finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullReques

[GitHub] [spark] HyukjinKwon commented on issue #24226: [SPARK-26660][FOLLOWUP] Raise task serialized size warning threshold to 1000 KiB

2019-03-26 Thread GitBox
HyukjinKwon commented on issue #24226: [SPARK-26660][FOLLOWUP] Raise task serialized size warning threshold to 1000 KiB URL: https://github.com/apache/spark/pull/24226#issuecomment-476949248 Oops, I thought I merged it to master. --

[GitHub] [spark] chakravarthiT commented on issue #24136: [SPARK-27088][SQL] Add a configuration to set log level for each batch at RuleExecutor

2019-03-26 Thread GitBox
chakravarthiT commented on issue #24136: [SPARK-27088][SQL] Add a configuration to set log level for each batch at RuleExecutor URL: https://github.com/apache/spark/pull/24136#issuecomment-476948267 @maropu @HyukjinKwon handled review comments, and now that UT won't fail, as message will

[GitHub] [spark] dilipbiswal commented on a change in pull request #24209: [SPARK-27255][SQL] Aggregate functions should not be allowed in WHERE

2019-03-26 Thread GitBox
dilipbiswal commented on a change in pull request #24209: [SPARK-27255][SQL] Aggregate functions should not be allowed in WHERE URL: https://github.com/apache/spark/pull/24209#discussion_r269388977 ## File path: sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/analysis/Che

[GitHub] [spark] beliefer edited a comment on issue #23841: [SPARK-26936][SQL] Fix bug of insert overwrite local dir can not create temporary path in local staging directory

2019-03-26 Thread GitBox
beliefer edited a comment on issue #23841: [SPARK-26936][SQL] Fix bug of insert overwrite local dir can not create temporary path in local staging directory URL: https://github.com/apache/spark/pull/23841#issuecomment-476472507 > Will we hit this bug when we deploy spark in cluster? Seems t

[GitHub] [spark] shaneknapp commented on issue #23514: [SPARK-24902][K8s] Add PV integration tests

2019-03-26 Thread GitBox
shaneknapp commented on issue #23514: [SPARK-24902][K8s] Add PV integration tests URL: https://github.com/apache/spark/pull/23514#issuecomment-476944643 > :D Great! Please merge this if this is finally done, @shaneknapp ! i will do so tomorrow, after i poke around and test a bit more

[GitHub] [spark] jose-torres commented on a change in pull request #23747: [SPARK-26848][SQL] Introduce new option to Kafka source: offset by timestamp (starting/ending)

2019-03-26 Thread GitBox
jose-torres commented on a change in pull request #23747: [SPARK-26848][SQL] Introduce new option to Kafka source: offset by timestamp (starting/ending) URL: https://github.com/apache/spark/pull/23747#discussion_r269384961 ## File path: external/kafka-0-10-sql/src/main/scala/org/ap

[GitHub] [spark] windpiger commented on issue #24215: [SPARK-27229][SQL] GroupBy Placement in Intersect Distinct

2019-03-26 Thread GitBox
windpiger commented on issue #24215: [SPARK-27229][SQL] GroupBy Placement in Intersect Distinct URL: https://github.com/apache/spark/pull/24215#issuecomment-476931507 > Yea, I think so. How about checking `#distinctRowCount / rowCount` before optimization? right, cost driven will be

[GitHub] [spark] windpiger commented on issue #24215: [SPARK-27229][SQL] GroupBy Placement in Intersect Distinct

2019-03-26 Thread GitBox
windpiger commented on issue #24215: [SPARK-27229][SQL] GroupBy Placement in Intersect Distinct URL: https://github.com/apache/spark/pull/24215#issuecomment-476931064 > Could this lead to performance regressions if the group by doesn't have reduction? Here provide a switch to turn i

[GitHub] [spark] maropu closed pull request #24226: [SPARK-26660][FOLLOWUP] Raise task serialized size warning threshold to 1000 KiB

2019-03-26 Thread GitBox
maropu closed pull request #24226: [SPARK-26660][FOLLOWUP] Raise task serialized size warning threshold to 1000 KiB URL: https://github.com/apache/spark/pull/24226 This is an automated message from the Apache Git Service. To

[GitHub] [spark] maropu commented on issue #24226: [SPARK-26660][FOLLOWUP] Raise task serialized size warning threshold to 1000 KiB

2019-03-26 Thread GitBox
maropu commented on issue #24226: [SPARK-26660][FOLLOWUP] Raise task serialized size warning threshold to 1000 KiB URL: https://github.com/apache/spark/pull/24226#issuecomment-476928566 I checked that most of the noisy warning logs are gone. Thanks! Merged to master. --

[GitHub] [spark] HyukjinKwon commented on issue #24226: [SPARK-26660][FOLLOWUP] Raise task serialized size warning threshold to 1000 KiB

2019-03-26 Thread GitBox
HyukjinKwon commented on issue #24226: [SPARK-26660][FOLLOWUP] Raise task serialized size warning threshold to 1000 KiB URL: https://github.com/apache/spark/pull/24226#issuecomment-476928355 Merged to master. This is an autom

[GitHub] [spark] AmplabJenkins commented on issue #24226: [SPARK-26660][FOLLOWUP] Raise task serialized size warning threshold to 1000 KiB

2019-03-26 Thread GitBox
AmplabJenkins commented on issue #24226: [SPARK-26660][FOLLOWUP] Raise task serialized size warning threshold to 1000 KiB URL: https://github.com/apache/spark/pull/24226#issuecomment-476927774 Test PASSed. Refer to this link for build results (access rights to CI server needed): http

[GitHub] [spark] AmplabJenkins removed a comment on issue #24226: [SPARK-26660][FOLLOWUP] Raise task serialized size warning threshold to 1000 KiB

2019-03-26 Thread GitBox
AmplabJenkins removed a comment on issue #24226: [SPARK-26660][FOLLOWUP] Raise task serialized size warning threshold to 1000 KiB URL: https://github.com/apache/spark/pull/24226#issuecomment-476927774 Test PASSed. Refer to this link for build results (access rights to CI server needed):

[GitHub] [spark] AmplabJenkins removed a comment on issue #24226: [SPARK-26660][FOLLOWUP] Raise task serialized size warning threshold to 1000 KiB

2019-03-26 Thread GitBox
AmplabJenkins removed a comment on issue #24226: [SPARK-26660][FOLLOWUP] Raise task serialized size warning threshold to 1000 KiB URL: https://github.com/apache/spark/pull/24226#issuecomment-476927765 Merged build finished. Test PASSed. -

[GitHub] [spark] AmplabJenkins commented on issue #24226: [SPARK-26660][FOLLOWUP] Raise task serialized size warning threshold to 1000 KiB

2019-03-26 Thread GitBox
AmplabJenkins commented on issue #24226: [SPARK-26660][FOLLOWUP] Raise task serialized size warning threshold to 1000 KiB URL: https://github.com/apache/spark/pull/24226#issuecomment-476927765 Merged build finished. Test PASSed. -

[GitHub] [spark] SparkQA removed a comment on issue #24226: [SPARK-26660][FOLLOWUP] Raise task serialized size warning threshold to 1000 KiB

2019-03-26 Thread GitBox
SparkQA removed a comment on issue #24226: [SPARK-26660][FOLLOWUP] Raise task serialized size warning threshold to 1000 KiB URL: https://github.com/apache/spark/pull/24226#issuecomment-476852479 **[Test build #103995 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestB

[GitHub] [spark] caneGuy commented on issue #23670: [SPARK-26601][SQL] Make broadcast-exchange thread pool configurable

2019-03-26 Thread GitBox
caneGuy commented on issue #23670: [SPARK-26601][SQL] Make broadcast-exchange thread pool configurable URL: https://github.com/apache/spark/pull/23670#issuecomment-476927512 @dongjoon-hyun could you help check this PR? Thanks

[GitHub] [spark] wypoon commented on a change in pull request #23767: [SPARK-26329][CORE] Faster polling of executor memory metrics.

2019-03-26 Thread GitBox
wypoon commented on a change in pull request #23767: [SPARK-26329][CORE] Faster polling of executor memory metrics. URL: https://github.com/apache/spark/pull/23767#discussion_r269378096 ## File path: core/src/test/scala/org/apache/spark/scheduler/DAGSchedulerSuite.scala ##

[GitHub] [spark] SparkQA commented on issue #24226: [SPARK-26660][FOLLOWUP] Raise task serialized size warning threshold to 1000 KiB

2019-03-26 Thread GitBox
SparkQA commented on issue #24226: [SPARK-26660][FOLLOWUP] Raise task serialized size warning threshold to 1000 KiB URL: https://github.com/apache/spark/pull/24226#issuecomment-476927260 **[Test build #103995 has finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/

[GitHub] [spark] beliefer commented on issue #23841: [SPARK-26936][SQL] Fix bug of insert overwrite local dir can not create temporary path in local staging directory

2019-03-26 Thread GitBox
beliefer commented on issue #23841: [SPARK-26936][SQL] Fix bug of insert overwrite local dir can not create temporary path in local staging directory URL: https://github.com/apache/spark/pull/23841#issuecomment-476924777 > That makes more sense if this isn't YARN-specific, but isn't this st

[GitHub] [spark] gatorsmile commented on a change in pull request #24209: [SPARK-27255][SQL] Aggregate functions should not be allowed in WHERE

2019-03-26 Thread GitBox
gatorsmile commented on a change in pull request #24209: [SPARK-27255][SQL] Aggregate functions should not be allowed in WHERE URL: https://github.com/apache/spark/pull/24209#discussion_r269376472 ## File path: sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/analysis/Chec

[GitHub] [spark] gatorsmile commented on issue #24209: [SPARK-27255][SQL] Aggregate functions should not be allowed in WHERE

2019-03-26 Thread GitBox
gatorsmile commented on issue #24209: [SPARK-27255][SQL] Aggregate functions should not be allowed in WHERE URL: https://github.com/apache/spark/pull/24209#issuecomment-476924219 I think we should verify it in the analyzer stage. The plan integrity verification is just for ensuring

[GitHub] [spark] maropu commented on a change in pull request #24223: [SPARK-27278][SQL] Optimize GetMapValue when the map is a foldable and the key is not

2019-03-26 Thread GitBox
maropu commented on a change in pull request #24223: [SPARK-27278][SQL] Optimize GetMapValue when the map is a foldable and the key is not URL: https://github.com/apache/spark/pull/24223#discussion_r269375161 ## File path: sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/o

[GitHub] [spark] viirya commented on issue #24220: [SPARK-27288][SQL] Pruning nested field in complex map key from object serializers

2019-03-26 Thread GitBox
viirya commented on issue #24220: [SPARK-27288][SQL] Pruning nested field in complex map key from object serializers URL: https://github.com/apache/spark/pull/24220#issuecomment-476920990 @dongjoon-hyun Thanks! Created a new JIRA for this. --

[GitHub] [spark] wangjiaochun commented on a change in pull request #24080: [SPARK-27147][TEST]Create new unit test cases for SortShuffleWriter

2019-03-26 Thread GitBox
wangjiaochun commented on a change in pull request #24080: [SPARK-27147][TEST]Create new unit test cases for SortShuffleWriter URL: https://github.com/apache/spark/pull/24080#discussion_r269371993 ## File path: core/src/test/scala/org/apache/spark/shuffle/sort/SortShuffleWriterSuit

[GitHub] [spark] HyukjinKwon commented on a change in pull request #24195: [SPARK-25496][SQL] Deprecate from_utc_timestamp and to_utc_timestamp

2019-03-26 Thread GitBox
HyukjinKwon commented on a change in pull request #24195: [SPARK-25496][SQL] Deprecate from_utc_timestamp and to_utc_timestamp URL: https://github.com/apache/spark/pull/24195#discussion_r269368914 ## File path: sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/expressions/d

[GitHub] [spark] HyukjinKwon commented on a change in pull request #24195: [SPARK-25496][SQL] Deprecate from_utc_timestamp and to_utc_timestamp

2019-03-26 Thread GitBox
HyukjinKwon commented on a change in pull request #24195: [SPARK-25496][SQL] Deprecate from_utc_timestamp and to_utc_timestamp URL: https://github.com/apache/spark/pull/24195#discussion_r269368585 ## File path: sql/core/src/main/scala/org/apache/spark/sql/functions.scala ##

[GitHub] [spark] wangyum commented on issue #24119: [SPARK-27182][SQL] Move the conflict source code of the sql/core module to sql/core/v1.2.1

2019-03-26 Thread GitBox
wangyum commented on issue #24119: [SPARK-27182][SQL] Move the conflict source code of the sql/core module to sql/core/v1.2.1 URL: https://github.com/apache/spark/pull/24119#issuecomment-476907659 @liancheng Yes. It's a subset of #23788.

[GitHub] [spark] maropu commented on issue #24215: [SPARK-27229][SQL] GroupBy Placement in Intersect Distinct

2019-03-26 Thread GitBox
maropu commented on issue #24215: [SPARK-27229][SQL] GroupBy Placement in Intersect Distinct URL: https://github.com/apache/spark/pull/24215#issuecomment-476907430 Yea, I think so. How about checking `#distinctRowCount / rowCount` before optimization? -

[GitHub] [spark] dongjoon-hyun commented on issue #23514: [SPARK-24902][K8s] Add PV integration tests

2019-03-26 Thread GitBox
dongjoon-hyun commented on issue #23514: [SPARK-24902][K8s] Add PV integration tests URL: https://github.com/apache/spark/pull/23514#issuecomment-476905635 :D Great! Please merge this if this is finally done, @shaneknapp ! Th

[GitHub] [spark] liancheng commented on issue #24119: [SPARK-27182][SQL] Move the conflict source code of the sql/core module to sql/core/v1.2.1

2019-03-26 Thread GitBox
liancheng commented on issue #24119: [SPARK-27182][SQL] Move the conflict source code of the sql/core module to sql/core/v1.2.1 URL: https://github.com/apache/spark/pull/24119#issuecomment-476904194 Hey @wangyum, sorry for the long delay. IIUC, this PR is basically a subset of #23788. Once

[GitHub] [spark] wypoon commented on a change in pull request #23767: [SPARK-26329][CORE] Faster polling of executor memory metrics.

2019-03-26 Thread GitBox
wypoon commented on a change in pull request #23767: [SPARK-26329][CORE] Faster polling of executor memory metrics. URL: https://github.com/apache/spark/pull/23767#discussion_r269357625 ## File path: core/src/main/scala/org/apache/spark/executor/ExecutorMetricsPoller.scala ###

[GitHub] [spark] rxin commented on issue #24215: [SPARK-27229][SQL] GroupBy Placement in Intersect Distinct

2019-03-26 Thread GitBox
rxin commented on issue #24215: [SPARK-27229][SQL] GroupBy Placement in Intersect Distinct URL: https://github.com/apache/spark/pull/24215#issuecomment-476895292 Could this lead to performance regressions if the group by doesn't have reduction?

[GitHub] [spark] skonto commented on issue #23514: [SPARK-24902][K8s] Add PV integration tests

2019-03-26 Thread GitBox
skonto commented on issue #23514: [SPARK-24902][K8s] Add PV integration tests URL: https://github.com/apache/spark/pull/23514#issuecomment-476894285 @shaneknapp looks stable. This is an automated message from the Apache Git Se

[GitHub] [spark] wypoon commented on a change in pull request #23767: [SPARK-26329][CORE] Faster polling of executor memory metrics.

2019-03-26 Thread GitBox
wypoon commented on a change in pull request #23767: [SPARK-26329][CORE] Faster polling of executor memory metrics. URL: https://github.com/apache/spark/pull/23767#discussion_r269352335 ## File path: core/src/main/scala/org/apache/spark/executor/ExecutorMetricsPoller.scala ###

[GitHub] [spark] AmplabJenkins removed a comment on issue #23514: [SPARK-24902][K8s] Add PV integration tests

2019-03-26 Thread GitBox
AmplabJenkins removed a comment on issue #23514: [SPARK-24902][K8s] Add PV integration tests URL: https://github.com/apache/spark/pull/23514#issuecomment-476891741 Merged build finished. Test PASSed. This is an automated mess

[GitHub] [spark] AmplabJenkins removed a comment on issue #23514: [SPARK-24902][K8s] Add PV integration tests

2019-03-26 Thread GitBox
AmplabJenkins removed a comment on issue #23514: [SPARK-24902][K8s] Add PV integration tests URL: https://github.com/apache/spark/pull/23514#issuecomment-476891749 Test PASSed. Refer to this link for build results (access rights to CI server needed): https://amplab.cs.berkeley.edu/j

[GitHub] [spark] AmplabJenkins commented on issue #23514: [SPARK-24902][K8s] Add PV integration tests

2019-03-26 Thread GitBox
AmplabJenkins commented on issue #23514: [SPARK-24902][K8s] Add PV integration tests URL: https://github.com/apache/spark/pull/23514#issuecomment-476891749 Test PASSed. Refer to this link for build results (access rights to CI server needed): https://amplab.cs.berkeley.edu/jenkins//

[GitHub] [spark] SparkQA commented on issue #23514: [SPARK-24902][K8s] Add PV integration tests

2019-03-26 Thread GitBox
SparkQA commented on issue #23514: [SPARK-24902][K8s] Add PV integration tests URL: https://github.com/apache/spark/pull/23514#issuecomment-476891729 Kubernetes integration test status success URL: https://amplab.cs.berkeley.edu/jenkins/job/testing-k8s-prb-make-spark-distribution-unified

[GitHub] [spark] AmplabJenkins commented on issue #23514: [SPARK-24902][K8s] Add PV integration tests

2019-03-26 Thread GitBox
AmplabJenkins commented on issue #23514: [SPARK-24902][K8s] Add PV integration tests URL: https://github.com/apache/spark/pull/23514#issuecomment-476891741 Merged build finished. Test PASSed. This is an automated message from

[GitHub] [spark] AmplabJenkins removed a comment on issue #24225: [WIP][SPARK-27286][SQL] Handles exceptions on proceeding to next record in FilePartitionReader

2019-03-26 Thread GitBox
AmplabJenkins removed a comment on issue #24225: [WIP][SPARK-27286][SQL] Handles exceptions on proceeding to next record in FilePartitionReader URL: https://github.com/apache/spark/pull/24225#issuecomment-476890580 Test PASSed. Refer to this link for build results (access rights to CI se

[GitHub] [spark] AmplabJenkins commented on issue #24225: [WIP][SPARK-27286][SQL] Handles exceptions on proceeding to next record in FilePartitionReader

2019-03-26 Thread GitBox
AmplabJenkins commented on issue #24225: [WIP][SPARK-27286][SQL] Handles exceptions on proceeding to next record in FilePartitionReader URL: https://github.com/apache/spark/pull/24225#issuecomment-476890580 Test PASSed. Refer to this link for build results (access rights to CI server nee

[GitHub] [spark] AmplabJenkins removed a comment on issue #24136: [SPARK-27088][SQL] Add a configuration to set log level for each batch at RuleExecutor

2019-03-26 Thread GitBox
AmplabJenkins removed a comment on issue #24136: [SPARK-27088][SQL] Add a configuration to set log level for each batch at RuleExecutor URL: https://github.com/apache/spark/pull/24136#issuecomment-476890393 Merged build finished. Test PASSed. --

[GitHub] [spark] AmplabJenkins removed a comment on issue #24136: [SPARK-27088][SQL] Add a configuration to set log level for each batch at RuleExecutor

2019-03-26 Thread GitBox
AmplabJenkins removed a comment on issue #24136: [SPARK-27088][SQL] Add a configuration to set log level for each batch at RuleExecutor URL: https://github.com/apache/spark/pull/24136#issuecomment-476890397 Test PASSed. Refer to this link for build results (access rights to CI server ne

[GitHub] [spark] AmplabJenkins removed a comment on issue #24225: [WIP][SPARK-27286][SQL] Handles exceptions on proceeding to next record in FilePartitionReader

2019-03-26 Thread GitBox
AmplabJenkins removed a comment on issue #24225: [WIP][SPARK-27286][SQL] Handles exceptions on proceeding to next record in FilePartitionReader URL: https://github.com/apache/spark/pull/24225#issuecomment-476890577 Merged build finished. Test PASSed. ---

[GitHub] [spark] AmplabJenkins commented on issue #24225: [WIP][SPARK-27286][SQL] Handles exceptions on proceeding to next record in FilePartitionReader

2019-03-26 Thread GitBox
AmplabJenkins commented on issue #24225: [WIP][SPARK-27286][SQL] Handles exceptions on proceeding to next record in FilePartitionReader URL: https://github.com/apache/spark/pull/24225#issuecomment-476890577 Merged build finished. Test PASSed. ---

[GitHub] [spark] AmplabJenkins commented on issue #24136: [SPARK-27088][SQL] Add a configuration to set log level for each batch at RuleExecutor

2019-03-26 Thread GitBox
AmplabJenkins commented on issue #24136: [SPARK-27088][SQL] Add a configuration to set log level for each batch at RuleExecutor URL: https://github.com/apache/spark/pull/24136#issuecomment-476890397 Test PASSed. Refer to this link for build results (access rights to CI server needed):
