[GitHub] spark pull request #16677: [SPARK-19355][SQL] Use map output statistics to i...

2018-09-18 Thread rxin
Github user rxin commented on a diff in the pull request: https://github.com/apache/spark/pull/16677#discussion_r218665902 --- Diff: sql/core/src/main/scala/org/apache/spark/sql/execution/limit.scala --- @@ -93,25 +96,93 @@ trait BaseLimitExec extends UnaryExecNode with

[GitHub] spark pull request #16677: [SPARK-19355][SQL] Use map output statistics to i...

2018-09-18 Thread viirya
Github user viirya commented on a diff in the pull request: https://github.com/apache/spark/pull/16677#discussion_r218652707 --- Diff: sql/core/src/main/scala/org/apache/spark/sql/execution/limit.scala --- @@ -93,25 +96,93 @@ trait BaseLimitExec extends UnaryExecNode with

[GitHub] spark pull request #16677: [SPARK-19355][SQL] Use map output statistics to i...

2018-09-18 Thread cloud-fan
Github user cloud-fan commented on a diff in the pull request: https://github.com/apache/spark/pull/16677#discussion_r218651545 --- Diff: sql/core/src/main/scala/org/apache/spark/sql/execution/limit.scala --- @@ -93,25 +96,93 @@ trait BaseLimitExec extends UnaryExecNode with

[GitHub] spark pull request #16677: [SPARK-19355][SQL] Use map output statistics to i...

2018-09-18 Thread rxin
Github user rxin commented on a diff in the pull request: https://github.com/apache/spark/pull/16677#discussion_r218640368 --- Diff: sql/core/src/main/scala/org/apache/spark/sql/execution/limit.scala --- @@ -93,25 +96,93 @@ trait BaseLimitExec extends UnaryExecNode with

[GitHub] spark pull request #16677: [SPARK-19355][SQL] Use map output statistics to i...

2018-09-18 Thread viirya
Github user viirya commented on a diff in the pull request: https://github.com/apache/spark/pull/16677#discussion_r218639550 --- Diff: sql/core/src/main/scala/org/apache/spark/sql/execution/limit.scala --- @@ -93,25 +96,93 @@ trait BaseLimitExec extends UnaryExecNode with

[GitHub] spark pull request #16677: [SPARK-19355][SQL] Use map output statistics to i...

2018-09-18 Thread viirya
Github user viirya commented on a diff in the pull request: https://github.com/apache/spark/pull/16677#discussion_r218639483 --- Diff: sql/core/src/main/scala/org/apache/spark/sql/execution/limit.scala --- @@ -93,25 +96,93 @@ trait BaseLimitExec extends UnaryExecNode with

[GitHub] spark pull request #16677: [SPARK-19355][SQL] Use map output statistics to i...

2018-09-18 Thread rxin
Github user rxin commented on a diff in the pull request: https://github.com/apache/spark/pull/16677#discussion_r218631745 --- Diff: sql/core/src/main/scala/org/apache/spark/sql/execution/limit.scala --- @@ -93,25 +96,93 @@ trait BaseLimitExec extends UnaryExecNode with

[GitHub] spark pull request #16677: [SPARK-19355][SQL] Use map output statistics to i...

2018-09-18 Thread rxin
Github user rxin commented on a diff in the pull request: https://github.com/apache/spark/pull/16677#discussion_r218631682 --- Diff: sql/core/src/main/scala/org/apache/spark/sql/execution/limit.scala --- @@ -93,25 +96,93 @@ trait BaseLimitExec extends UnaryExecNode with

[GitHub] spark pull request #16677: [SPARK-19355][SQL] Use map output statistics to i...

2018-09-18 Thread rxin
Github user rxin commented on a diff in the pull request: https://github.com/apache/spark/pull/16677#discussion_r218630513 --- Diff: sql/hive/src/test/scala/org/apache/spark/sql/hive/execution/PruningSuite.scala --- @@ -22,21 +22,29 @@ import scala.collection.JavaConverters._

[GitHub] spark pull request #16677: [SPARK-19355][SQL] Use map output statistics to i...

2018-09-18 Thread rxin
Github user rxin commented on a diff in the pull request: https://github.com/apache/spark/pull/16677#discussion_r218630488 --- Diff: sql/core/src/test/scala/org/apache/spark/sql/DataFrameAggregateSuite.scala --- @@ -557,11 +557,13 @@ class DataFrameAggregateSuite extends QueryTest

[GitHub] spark pull request #16677: [SPARK-19355][SQL] Use map output statistics to i...

2018-09-18 Thread rxin
Github user rxin commented on a diff in the pull request: https://github.com/apache/spark/pull/16677#discussion_r218630324 --- Diff: sql/catalyst/src/main/scala/org/apache/spark/sql/internal/SQLConf.scala --- @@ -204,6 +204,13 @@ object SQLConf { .intConf

[GitHub] spark pull request #16677: [SPARK-19355][SQL] Use map output statistics to i...

2018-09-18 Thread rxin
Github user rxin commented on a diff in the pull request: https://github.com/apache/spark/pull/16677#discussion_r218614872 --- Diff: core/src/main/scala/org/apache/spark/scheduler/MapStatus.scala --- @@ -44,18 +45,23 @@ private[spark] sealed trait MapStatus { * necessary

[GitHub] spark pull request #16677: [SPARK-19355][SQL] Use map output statistics to i...

2018-08-26 Thread viirya
Github user viirya commented on a diff in the pull request: https://github.com/apache/spark/pull/16677#discussion_r212844439 --- Diff: sql/core/src/main/scala/org/apache/spark/sql/execution/limit.scala --- @@ -93,25 +96,93 @@ trait BaseLimitExec extends UnaryExecNode with

[GitHub] spark pull request #16677: [SPARK-19355][SQL] Use map output statistics to i...

2018-08-26 Thread hvanhovell
Github user hvanhovell commented on a diff in the pull request: https://github.com/apache/spark/pull/16677#discussion_r212830045 --- Diff: sql/core/src/main/scala/org/apache/spark/sql/execution/limit.scala --- @@ -93,25 +96,93 @@ trait BaseLimitExec extends UnaryExecNode with

[GitHub] spark pull request #16677: [SPARK-19355][SQL] Use map output statistics to i...

2018-08-25 Thread viirya
Github user viirya commented on a diff in the pull request: https://github.com/apache/spark/pull/16677#discussion_r212811618 --- Diff: sql/core/src/main/scala/org/apache/spark/sql/execution/limit.scala --- @@ -93,25 +96,93 @@ trait BaseLimitExec extends UnaryExecNode with

[GitHub] spark pull request #16677: [SPARK-19355][SQL] Use map output statistics to i...

2018-08-25 Thread hvanhovell
Github user hvanhovell commented on a diff in the pull request: https://github.com/apache/spark/pull/16677#discussion_r212805707 --- Diff: sql/core/src/main/scala/org/apache/spark/sql/execution/limit.scala --- @@ -93,25 +96,93 @@ trait BaseLimitExec extends UnaryExecNode with

[GitHub] spark pull request #16677: [SPARK-19355][SQL] Use map output statistics to i...

2018-08-25 Thread hvanhovell
Github user hvanhovell commented on a diff in the pull request: https://github.com/apache/spark/pull/16677#discussion_r212805327 --- Diff: sql/core/src/main/scala/org/apache/spark/sql/execution/limit.scala --- @@ -93,25 +96,93 @@ trait BaseLimitExec extends UnaryExecNode with

[GitHub] spark pull request #16677: [SPARK-19355][SQL] Use map output statistics to i...

2018-08-25 Thread cloud-fan
Github user cloud-fan commented on a diff in the pull request: https://github.com/apache/spark/pull/16677#discussion_r212797691 --- Diff: sql/core/src/test/resources/sql-tests/inputs/subquery/in-subquery/in-limit.sql --- @@ -1,6 +1,9 @@ -- A test suite for IN LIMIT in parent

[GitHub] spark pull request #16677: [SPARK-19355][SQL] Use map output statistics to i...

2018-08-25 Thread viirya
Github user viirya commented on a diff in the pull request: https://github.com/apache/spark/pull/16677#discussion_r212792753 --- Diff: sql/core/src/test/resources/sql-tests/inputs/subquery/in-subquery/in-limit.sql --- @@ -1,6 +1,9 @@ -- A test suite for IN LIMIT in parent

[GitHub] spark pull request #16677: [SPARK-19355][SQL] Use map output statistics to i...

2018-08-25 Thread cloud-fan
Github user cloud-fan commented on a diff in the pull request: https://github.com/apache/spark/pull/16677#discussion_r212792225 --- Diff: sql/core/src/test/resources/sql-tests/inputs/subquery/in-subquery/in-limit.sql --- @@ -1,6 +1,9 @@ -- A test suite for IN LIMIT in parent

[GitHub] spark pull request #16677: [SPARK-19355][SQL] Use map output statistics to i...

2018-08-10 Thread asfgit
Github user asfgit closed the pull request at: https://github.com/apache/spark/pull/16677

[GitHub] spark pull request #16677: [SPARK-19355][SQL] Use map output statistics to i...

2018-07-23 Thread viirya
Github user viirya commented on a diff in the pull request: https://github.com/apache/spark/pull/16677#discussion_r204580788 --- Diff: sql/core/src/test/scala/org/apache/spark/sql/execution/TakeOrderedAndProjectSuite.scala --- @@ -22,6 +22,7 @@ import scala.util.Random

[GitHub] spark pull request #16677: [SPARK-19355][SQL] Use map output statistics to i...

2018-07-23 Thread sujith71955
Github user sujith71955 commented on a diff in the pull request: https://github.com/apache/spark/pull/16677#discussion_r204362254 --- Diff: sql/core/src/main/scala/org/apache/spark/sql/execution/exchange/ShuffleExchangeExec.scala --- @@ -231,6 +231,12 @@ object

[GitHub] spark pull request #16677: [SPARK-19355][SQL] Use map output statistics to i...

2018-07-23 Thread sujith71955
Github user sujith71955 commented on a diff in the pull request: https://github.com/apache/spark/pull/16677#discussion_r204361301 --- Diff: sql/core/src/test/scala/org/apache/spark/sql/execution/TakeOrderedAndProjectSuite.scala --- @@ -22,6 +22,7 @@ import scala.util.Random

[GitHub] spark pull request #16677: [SPARK-19355][SQL] Use map output statistics to i...

2018-07-21 Thread viirya
Github user viirya commented on a diff in the pull request: https://github.com/apache/spark/pull/16677#discussion_r204221009 --- Diff: sql/catalyst/src/main/scala/org/apache/spark/sql/internal/SQLConf.scala --- @@ -204,6 +204,13 @@ object SQLConf { .intConf

[GitHub] spark pull request #16677: [SPARK-19355][SQL] Use map output statistics to i...

2018-06-28 Thread viirya
Github user viirya commented on a diff in the pull request: https://github.com/apache/spark/pull/16677#discussion_r198746910 --- Diff: core/src/main/java/org/apache/spark/shuffle/sort/BypassMergeSortShuffleWriter.java --- @@ -145,10 +145,12 @@ public void write(Iterator<Product2<K, V>> records)

[GitHub] spark pull request #16677: [SPARK-19355][SQL] Use map output statistics to i...

2018-06-26 Thread mridulm
Github user mridulm commented on a diff in the pull request: https://github.com/apache/spark/pull/16677#discussion_r198360170 --- Diff: core/src/main/java/org/apache/spark/shuffle/sort/BypassMergeSortShuffleWriter.java --- @@ -145,10 +145,12 @@ public void write(Iterator<Product2<K, V>>

[GitHub] spark pull request #16677: [SPARK-19355][SQL] Use map output statistics to i...

2018-06-26 Thread viirya
Github user viirya commented on a diff in the pull request: https://github.com/apache/spark/pull/16677#discussion_r198328639 --- Diff: sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/plans/physical/partitioning.scala --- @@ -193,6 +193,16 @@ case object SinglePartition

[GitHub] spark pull request #16677: [SPARK-19355][SQL] Use map output statistics to i...

2018-06-26 Thread viirya
Github user viirya commented on a diff in the pull request: https://github.com/apache/spark/pull/16677#discussion_r198115993 --- Diff: sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/plans/physical/partitioning.scala --- @@ -193,6 +193,16 @@ case object SinglePartition

[GitHub] spark pull request #16677: [SPARK-19355][SQL] Use map output statistics to i...

2018-06-26 Thread hvanhovell
Github user hvanhovell commented on a diff in the pull request: https://github.com/apache/spark/pull/16677#discussion_r198106419 --- Diff: sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/plans/physical/partitioning.scala --- @@ -193,6 +193,16 @@ case object

[GitHub] spark pull request #16677: [SPARK-19355][SQL] Use map output statistics to i...

2018-06-23 Thread viirya
Github user viirya commented on a diff in the pull request: https://github.com/apache/spark/pull/16677#discussion_r197613554 --- Diff: core/src/main/java/org/apache/spark/shuffle/sort/BypassMergeSortShuffleWriter.java --- @@ -145,10 +145,12 @@ public void write(Iterator<Product2<K, V>> records)

[GitHub] spark pull request #16677: [SPARK-19355][SQL] Use map output statistics to i...

2018-06-23 Thread viirya
Github user viirya commented on a diff in the pull request: https://github.com/apache/spark/pull/16677#discussion_r197613004 --- Diff: sql/catalyst/src/main/scala/org/apache/spark/sql/internal/SQLConf.scala --- @@ -204,6 +204,13 @@ object SQLConf { .intConf

[GitHub] spark pull request #16677: [SPARK-19355][SQL] Use map output statistics to i...

2018-06-23 Thread mridulm
Github user mridulm commented on a diff in the pull request: https://github.com/apache/spark/pull/16677#discussion_r197607227 --- Diff: sql/core/src/main/scala/org/apache/spark/sql/execution/limit.scala --- @@ -93,25 +96,94 @@ trait BaseLimitExec extends UnaryExecNode with

[GitHub] spark pull request #16677: [SPARK-19355][SQL] Use map output statistics to i...

2018-06-23 Thread mridulm
Github user mridulm commented on a diff in the pull request: https://github.com/apache/spark/pull/16677#discussion_r197606877 --- Diff: core/src/main/java/org/apache/spark/shuffle/sort/BypassMergeSortShuffleWriter.java --- @@ -145,10 +145,12 @@ public void write(Iterator<Product2<K, V>>

[GitHub] spark pull request #16677: [SPARK-19355][SQL] Use map output statistics to i...

2018-06-22 Thread viirya
Github user viirya commented on a diff in the pull request: https://github.com/apache/spark/pull/16677#discussion_r197596253 --- Diff: sql/core/src/main/scala/org/apache/spark/sql/execution/exchange/ShuffleExchangeExec.scala --- @@ -247,6 +253,10 @@ object ShuffleExchangeExec {

[GitHub] spark pull request #16677: [SPARK-19355][SQL] Use map output statistics to i...

2018-06-22 Thread viirya
Github user viirya commented on a diff in the pull request: https://github.com/apache/spark/pull/16677#discussion_r197462974 --- Diff: sql/core/src/main/scala/org/apache/spark/sql/execution/exchange/ShuffleExchangeExec.scala --- @@ -247,6 +253,10 @@ object ShuffleExchangeExec {

[GitHub] spark pull request #16677: [SPARK-19355][SQL] Use map output statistics to i...

2018-06-22 Thread viirya
Github user viirya commented on a diff in the pull request: https://github.com/apache/spark/pull/16677#discussion_r197430814 --- Diff: sql/core/src/main/scala/org/apache/spark/sql/execution/limit.scala --- @@ -93,25 +98,95 @@ trait BaseLimitExec extends UnaryExecNode with

[GitHub] spark pull request #16677: [SPARK-19355][SQL] Use map output statistics to i...

2018-06-22 Thread viirya
Github user viirya commented on a diff in the pull request: https://github.com/apache/spark/pull/16677#discussion_r197430376 --- Diff: sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/plans/physical/partitioning.scala --- @@ -193,6 +193,16 @@ case object SinglePartition

[GitHub] spark pull request #16677: [SPARK-19355][SQL] Use map output statistics to i...

2018-06-22 Thread hvanhovell
Github user hvanhovell commented on a diff in the pull request: https://github.com/apache/spark/pull/16677#discussion_r197410930 --- Diff: sql/core/src/main/scala/org/apache/spark/sql/execution/limit.scala --- @@ -93,25 +98,95 @@ trait BaseLimitExec extends UnaryExecNode with

[GitHub] spark pull request #16677: [SPARK-19355][SQL] Use map output statistics to i...

2018-06-22 Thread hvanhovell
Github user hvanhovell commented on a diff in the pull request: https://github.com/apache/spark/pull/16677#discussion_r197410511 --- Diff: sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/plans/physical/partitioning.scala --- @@ -193,6 +193,16 @@ case object

[GitHub] spark pull request #16677: [SPARK-19355][SQL] Use map output statistics to i...

2018-06-22 Thread viirya
Github user viirya commented on a diff in the pull request: https://github.com/apache/spark/pull/16677#discussion_r197388872 --- Diff: sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/plans/physical/partitioning.scala --- @@ -193,6 +193,16 @@ case object SinglePartition

[GitHub] spark pull request #16677: [SPARK-19355][SQL] Use map output statistics to i...

2018-06-22 Thread viirya
Github user viirya commented on a diff in the pull request: https://github.com/apache/spark/pull/16677#discussion_r197366478 --- Diff: sql/core/src/main/scala/org/apache/spark/sql/execution/limit.scala --- @@ -93,25 +98,101 @@ trait BaseLimitExec extends UnaryExecNode with

[GitHub] spark pull request #16677: [SPARK-19355][SQL] Use map output statistics to i...

2018-06-21 Thread hvanhovell
Github user hvanhovell commented on a diff in the pull request: https://github.com/apache/spark/pull/16677#discussion_r197284779 --- Diff: sql/core/src/main/scala/org/apache/spark/sql/execution/limit.scala --- @@ -93,25 +98,101 @@ trait BaseLimitExec extends UnaryExecNode with

[GitHub] spark pull request #16677: [SPARK-19355][SQL] Use map output statistics to i...

2018-06-21 Thread hvanhovell
Github user hvanhovell commented on a diff in the pull request: https://github.com/apache/spark/pull/16677#discussion_r197117604 --- Diff: sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/plans/physical/partitioning.scala --- @@ -193,6 +193,16 @@ case object

[GitHub] spark pull request #16677: [SPARK-19355][SQL] Use map output statistics to i...

2018-06-21 Thread hvanhovell
Github user hvanhovell commented on a diff in the pull request: https://github.com/apache/spark/pull/16677#discussion_r197116936 --- Diff: sql/core/src/main/scala/org/apache/spark/sql/execution/exchange/ShuffleExchangeExec.scala --- @@ -247,6 +253,10 @@ object ShuffleExchangeExec