[GitHub] spark issue #23262: [SPARK-26312][SQL]Replace RDDConversions.rowToRowRdd wit...

2018-12-10 Thread mgaido91
Github user mgaido91 commented on the issue: https://github.com/apache/spark/pull/23262 LGTM --- - To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org For additional commands, e-mail: reviews-h

[GitHub] spark issue #22764: [SPARK-25765][ML] Add training cost to BisectingKMeans s...

2018-12-10 Thread mgaido91
Github user mgaido91 commented on the issue: https://github.com/apache/spark/pull/22764 kindly ping @dbtsai --- - To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org For additional commands, e-mail

[GitHub] spark issue #23057: [SPARK-26078][SQL] Dedup self-join attributes on IN subq...

2018-12-10 Thread mgaido91
Github user mgaido91 commented on the issue: https://github.com/apache/spark/pull/23057 @cloud-fan @gatorsmile may you please take a look at this? Thanks. --- - To unsubscribe, e-mail: reviews-unsubscr

[GitHub] spark issue #23259: [SPARK-26215][SQL][WIP] Define reserved/non-reserved key...

2018-12-10 Thread mgaido91
Github user mgaido91 commented on the issue: https://github.com/apache/spark/pull/23259 +1 for SQL 2011. I downloaded the standard but I couldn't find any section dedicated to, In postgres doc, though, they are stating that they are not following the standard strictly: https

[GitHub] spark pull request #23258: [SPARK-23375][SQL][FOLLOWUP][TEST] Test Sort metr...

2018-12-10 Thread mgaido91
Github user mgaido91 commented on a diff in the pull request: https://github.com/apache/spark/pull/23258#discussion_r240146706 --- Diff: sql/core/src/test/scala/org/apache/spark/sql/execution/metric/SQLMetricsSuite.scala --- @@ -182,10 +182,13 @@ class SQLMetricsSuite extends

[GitHub] spark pull request #23262: [SPARK-26312][SQL]Converting converters in RDDCon...

2018-12-10 Thread mgaido91
Github user mgaido91 commented on a diff in the pull request: https://github.com/apache/spark/pull/23262#discussion_r240134191 --- Diff: sql/core/src/main/scala/org/apache/spark/sql/execution/ExistingRDD.scala --- @@ -17,51 +17,39 @@ package

[GitHub] spark pull request #23262: [SPARK-26312][SQL]Converting converters in RDDCon...

2018-12-10 Thread mgaido91
Github user mgaido91 commented on a diff in the pull request: https://github.com/apache/spark/pull/23262#discussion_r240135694 --- Diff: sql/core/src/main/scala/org/apache/spark/sql/execution/ExistingRDD.scala --- @@ -17,51 +17,39 @@ package

[GitHub] spark issue #22957: [SPARK-25951][SQL] Ignore aliases for distributions and ...

2018-12-10 Thread mgaido91
Github user mgaido91 commented on the issue: https://github.com/apache/spark/pull/22957 retest this please --- - To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org For additional commands, e-mail: reviews

[GitHub] spark pull request #23258: [SPARK-23375][SQL][FOLLOWUP][TEST] Test Sort metr...

2018-12-08 Thread mgaido91
Github user mgaido91 commented on a diff in the pull request: https://github.com/apache/spark/pull/23258#discussion_r240003036 --- Diff: sql/core/src/test/scala/org/apache/spark/sql/execution/metric/SQLMetricsSuite.scala --- @@ -182,10 +182,13 @@ class SQLMetricsSuite extends

[GitHub] spark issue #22957: [SPARK-25951][SQL] Ignore aliases for distributions and ...

2018-12-07 Thread mgaido91
Github user mgaido91 commented on the issue: https://github.com/apache/spark/pull/22957 @cloud-fan @gatorsmile I updated the PR according to the previous suggestions and added a new dedicated test suite. May you please review this again? Thanks

[GitHub] spark issue #23213: [SPARK-26262][SQL] Runs SQLQueryTestSuite on mixed confi...

2018-12-06 Thread mgaido91
Github user mgaido91 commented on the issue: https://github.com/apache/spark/pull/23213 @maropu I'd say so, but I am still not sure what (if there is one) is the difference between `wholeStage=false,sactoryMode=NO_CODEGEN` and `wholeStage=true,factoryMode=NO_CODEGEN`. `wholeStage

[GitHub] spark pull request #23233: [SPARK-26233][SQL][BACKPORT-2.3] CheckOverflow wh...

2018-12-05 Thread mgaido91
Github user mgaido91 closed the pull request at: https://github.com/apache/spark/pull/23233 --- - To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org For additional commands, e-mail: reviews-h...@spark.apache.org

[GitHub] spark pull request #23234: [SPARK-26233][SQL][BACKPORT-2.2] CheckOverflow wh...

2018-12-05 Thread mgaido91
Github user mgaido91 closed the pull request at: https://github.com/apache/spark/pull/23234 --- - To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org For additional commands, e-mail: reviews-h...@spark.apache.org

[GitHub] spark pull request #23232: [SPARK-26233][SQL][BACKPORT-2.4] CheckOverflow wh...

2018-12-05 Thread mgaido91
Github user mgaido91 closed the pull request at: https://github.com/apache/spark/pull/23232 --- - To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org For additional commands, e-mail: reviews-h...@spark.apache.org

[GitHub] spark issue #23232: [SPARK-26233][SQL][BACKPORT-2.4] CheckOverflow when enco...

2018-12-05 Thread mgaido91
Github user mgaido91 commented on the issue: https://github.com/apache/spark/pull/23232 Done, thanks @dongjoon-hyun --- - To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org For additional commands, e-mail

[GitHub] spark issue #23210: [SPARK-26233][SQL] CheckOverflow when encoding a decimal...

2018-12-05 Thread mgaido91
Github user mgaido91 commented on the issue: https://github.com/apache/spark/pull/23210 thanks @cloud-fan @dongjoon-hyun, I created the PRs for the backports. --- - To unsubscribe, e-mail: reviews-unsubscr

[GitHub] spark pull request #23234: [SPARK-26233][SQL][BACKPORT-2.2] CheckOverflow wh...

2018-12-05 Thread mgaido91
GitHub user mgaido91 opened a pull request: https://github.com/apache/spark/pull/23234 [SPARK-26233][SQL][BACKPORT-2.2] CheckOverflow when encoding a decimal value ## What changes were proposed in this pull request? When we encode a Decimal from external source we don't

[GitHub] spark issue #23234: [SPARK-26233][SQL][BACKPORT-2.2] CheckOverflow when enco...

2018-12-05 Thread mgaido91
Github user mgaido91 commented on the issue: https://github.com/apache/spark/pull/23234 cc @cloud-fan @dongjoon-hyun --- - To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org For additional commands, e-mail

[GitHub] spark issue #23233: [SPARK-26233][SQL][BACKPORT-2.3] CheckOverflow when enco...

2018-12-05 Thread mgaido91
Github user mgaido91 commented on the issue: https://github.com/apache/spark/pull/23233 cc @cloud-fan @dongjoon-hyun --- - To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org For additional commands, e-mail

[GitHub] spark pull request #23233: [SPARK-26233][SQL][BACKPORT-2.3] CheckOverflow wh...

2018-12-05 Thread mgaido91
GitHub user mgaido91 opened a pull request: https://github.com/apache/spark/pull/23233 [SPARK-26233][SQL][BACKPORT-2.3] CheckOverflow when encoding a decimal value ## What changes were proposed in this pull request? When we encode a Decimal from external source we

[GitHub] spark issue #23232: [SPARK-26233][SQL][BACKPORT-2.4] CheckOverflow when enco...

2018-12-05 Thread mgaido91
Github user mgaido91 commented on the issue: https://github.com/apache/spark/pull/23232 cc @cloud-fan @dongjoon-hyun --- - To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org For additional commands, e

[GitHub] spark pull request #23232: [SPARK-26233][SQL][BACKPORT-2.4] CheckOverflow wh...

2018-12-05 Thread mgaido91
GitHub user mgaido91 opened a pull request: https://github.com/apache/spark/pull/23232 [SPARK-26233][SQL][BACKPORT-2.4] CheckOverflow when encoding a decimal value When we encode a Decimal from external source we don't check for overflow. That method is useful not only in order

[GitHub] spark issue #23213: [SPARK-26262][SQL] Runs SQLQueryTestSuite on mixed confi...

2018-12-05 Thread mgaido91
Github user mgaido91 commented on the issue: https://github.com/apache/spark/pull/23213 Yes, I am wondering too: which is the difference between: `spark.sql.codegen.wholeStage=false,spark.sql.codegen.factoryMode=NO_CODEGEN` and `spark.sql.codegen.wholeStage=true

[GitHub] spark issue #23217: [SPARK-25829][SQL][FOLLOWUP] Refactor MapConcat in order...

2018-12-04 Thread mgaido91
Github user mgaido91 commented on the issue: https://github.com/apache/spark/pull/23217 retest this please --- - To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org For additional commands, e-mail: reviews

[GitHub] spark pull request #23217: [SPARK-25829][SQL][FOLLOWUP] Refactor MapConcat i...

2018-12-04 Thread mgaido91
Github user mgaido91 commented on a diff in the pull request: https://github.com/apache/spark/pull/23217#discussion_r238700354 --- Diff: sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/util/ArrayBasedMapBuilder.scala --- @@ -47,13 +48,17 @@ class ArrayBasedMapBuilder

[GitHub] spark pull request #22957: [SPARK-25951][SQL] Ignore aliases for distributio...

2018-12-04 Thread mgaido91
Github user mgaido91 commented on a diff in the pull request: https://github.com/apache/spark/pull/22957#discussion_r238696879 --- Diff: sql/core/src/main/scala/org/apache/spark/sql/execution/exchange/EnsureRequirements.scala --- @@ -145,9 +145,14 @@ case class EnsureRequirements

[GitHub] spark pull request #23217: [SPARK-25829][SQL][FOLLOWUP] Refactor MapConcat i...

2018-12-04 Thread mgaido91
Github user mgaido91 commented on a diff in the pull request: https://github.com/apache/spark/pull/23217#discussion_r238690465 --- Diff: sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/util/ArrayBasedMapBuilder.scala --- @@ -47,13 +48,17 @@ class ArrayBasedMapBuilder

[GitHub] spark issue #23217: [SPARK-25829][SQL][FOLLOWUP] Refactor MapConcat in order...

2018-12-04 Thread mgaido91
Github user mgaido91 commented on the issue: https://github.com/apache/spark/pull/23217 cc @cloud-fan --- - To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org For additional commands, e-mail: reviews-h

[GitHub] spark pull request #23217: [SPARK-25829][SQL][FOLLOWUP] Refactor MapConcat i...

2018-12-04 Thread mgaido91
GitHub user mgaido91 opened a pull request: https://github.com/apache/spark/pull/23217 [SPARK-25829][SQL][FOLLOWUP] Refactor MapConcat in order to check properly the limit size ## What changes were proposed in this pull request? The PR starts from the [comment](https

[GitHub] spark pull request #22957: [SPARK-25951][SQL] Ignore aliases for distributio...

2018-12-04 Thread mgaido91
Github user mgaido91 commented on a diff in the pull request: https://github.com/apache/spark/pull/22957#discussion_r238642801 --- Diff: sql/core/src/main/scala/org/apache/spark/sql/execution/exchange/EnsureRequirements.scala --- @@ -145,9 +145,14 @@ case class EnsureRequirements

[GitHub] spark issue #23213: [SPARK-26262][SQL] Run SQLQueryTestSuite with WHOLESTAGE...

2018-12-04 Thread mgaido91
Github user mgaido91 commented on the issue: https://github.com/apache/spark/pull/23213 > I personally think its orthogonal to SPARK-24562. yes I agree. I am just asking if it makes sense to create a framework like that. Now it is only about codegen, but in the future we

[GitHub] spark pull request #22957: [SPARK-25951][SQL] Ignore aliases for distributio...

2018-12-04 Thread mgaido91
Github user mgaido91 commented on a diff in the pull request: https://github.com/apache/spark/pull/22957#discussion_r238630487 --- Diff: sql/core/src/main/scala/org/apache/spark/sql/execution/exchange/EnsureRequirements.scala --- @@ -145,9 +145,14 @@ case class EnsureRequirements

[GitHub] spark issue #23213: [SPARK-26262][SQL] Run SQLQueryTestSuite with WHOLESTAGE...

2018-12-04 Thread mgaido91
Github user mgaido91 commented on the issue: https://github.com/apache/spark/pull/23213 just a question, why didn't we introduce something like what was done in SPARK-24562? I see that these are configs which are valid for all queries, so using what was done in SPARK-24562

[GitHub] spark issue #22450: [SPARK-25454][SQL] Avoid precision loss in division with...

2018-12-04 Thread mgaido91
Github user mgaido91 commented on the issue: https://github.com/apache/spark/pull/22450 @cloud-fan this has been stuck for a while now. Is there something blocking this? Is there something I can do? Thanks

[GitHub] spark pull request #22957: [SPARK-25951][SQL] Ignore aliases for distributio...

2018-12-04 Thread mgaido91
Github user mgaido91 commented on a diff in the pull request: https://github.com/apache/spark/pull/22957#discussion_r238583330 --- Diff: sql/core/src/main/scala/org/apache/spark/sql/execution/exchange/EnsureRequirements.scala --- @@ -145,9 +145,14 @@ case class EnsureRequirements

[GitHub] spark pull request #23210: [SPARK-26233][SQL] CheckOverflow when encoding a ...

2018-12-03 Thread mgaido91
Github user mgaido91 commented on a diff in the pull request: https://github.com/apache/spark/pull/23210#discussion_r238471660 --- Diff: sql/core/src/test/scala/org/apache/spark/sql/DatasetSuite.scala --- @@ -1647,6 +1647,15 @@ class DatasetSuite extends QueryTest

[GitHub] spark pull request #22957: [SPARK-25951][SQL] Ignore aliases for distributio...

2018-12-03 Thread mgaido91
Github user mgaido91 commented on a diff in the pull request: https://github.com/apache/spark/pull/22957#discussion_r238460901 --- Diff: sql/core/src/main/scala/org/apache/spark/sql/execution/exchange/EnsureRequirements.scala --- @@ -145,9 +145,14 @@ case class EnsureRequirements

[GitHub] spark pull request #22957: [SPARK-25951][SQL] Ignore aliases for distributio...

2018-12-03 Thread mgaido91
Github user mgaido91 commented on a diff in the pull request: https://github.com/apache/spark/pull/22957#discussion_r238459238 --- Diff: sql/core/src/main/scala/org/apache/spark/sql/execution/exchange/EnsureRequirements.scala --- @@ -145,9 +145,14 @@ case class EnsureRequirements

[GitHub] spark pull request #23210: [SPARK-26233][SQL] CheckOverflow when encoding a ...

2018-12-03 Thread mgaido91
GitHub user mgaido91 opened a pull request: https://github.com/apache/spark/pull/23210 [SPARK-26233][SQL] CheckOverflow when encoding a decimal value ## What changes were proposed in this pull request? When we encode a Decimal from external source we don't check

[GitHub] spark issue #23057: [SPARK-26078][SQL] Dedup self-join attributes on IN subq...

2018-12-03 Thread mgaido91
Github user mgaido91 commented on the issue: https://github.com/apache/spark/pull/23057 @mccheah this is waiting for reviews by committers --- - To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org

[GitHub] spark pull request #22957: [SPARK-25951][SQL] Ignore aliases for distributio...

2018-12-03 Thread mgaido91
Github user mgaido91 commented on a diff in the pull request: https://github.com/apache/spark/pull/22957#discussion_r238292468 --- Diff: sql/core/src/test/scala/org/apache/spark/sql/execution/PlannerSuite.scala --- @@ -780,6 +780,23 @@ class PlannerSuite extends SharedSQLContext

[GitHub] spark pull request #22957: [SPARK-25951][SQL] Ignore aliases for distributio...

2018-12-03 Thread mgaido91
Github user mgaido91 commented on a diff in the pull request: https://github.com/apache/spark/pull/22957#discussion_r238291824 --- Diff: sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/expressions/Expression.scala --- @@ -223,14 +223,35 @@ abstract class Expression

[GitHub] spark issue #23186: [SPARK-26230][SQL]FileIndex: if case sensitive, validate...

2018-12-03 Thread mgaido91
Github user mgaido91 commented on the issue: https://github.com/apache/spark/pull/23186 LGTM --- - To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org For additional commands, e-mail: reviews-h

[GitHub] spark pull request #23150: [SPARK-26178][SQL] Use java.time API for parsing ...

2018-12-03 Thread mgaido91
Github user mgaido91 commented on a diff in the pull request: https://github.com/apache/spark/pull/23150#discussion_r238195206 --- Diff: sql/catalyst/src/main/scala/org/apache/spark/sql/internal/SQLConf.scala --- @@ -1618,6 +1618,13 @@ object SQLConf { "a Spar

[GitHub] spark pull request #23150: [SPARK-26178][SQL] Use java.time API for parsing ...

2018-12-03 Thread mgaido91
Github user mgaido91 commented on a diff in the pull request: https://github.com/apache/spark/pull/23150#discussion_r238195153 --- Diff: sql/catalyst/src/main/scala/org/apache/spark/sql/internal/SQLConf.scala --- @@ -1618,6 +1618,13 @@ object SQLConf { "a Spar

[GitHub] spark pull request #23186: [SPARK-26230][SQL]FileIndex: if case sensitive, v...

2018-11-30 Thread mgaido91
Github user mgaido91 commented on a diff in the pull request: https://github.com/apache/spark/pull/23186#discussion_r238008395 --- Diff: sql/core/src/main/scala/org/apache/spark/sql/execution/datasources/PartitioningUtils.scala --- @@ -345,15 +346,18 @@ object PartitioningUtils

[GitHub] spark pull request #22957: [SPARK-25951][SQL] Ignore aliases for distributio...

2018-11-30 Thread mgaido91
Github user mgaido91 commented on a diff in the pull request: https://github.com/apache/spark/pull/22957#discussion_r237905754 --- Diff: sql/core/src/test/scala/org/apache/spark/sql/execution/PlannerSuite.scala --- @@ -780,6 +780,23 @@ class PlannerSuite extends SharedSQLContext

[GitHub] spark pull request #23186: [SPARK-26230][SQL]FileIndex: if case sensitive, v...

2018-11-30 Thread mgaido91
Github user mgaido91 commented on a diff in the pull request: https://github.com/apache/spark/pull/23186#discussion_r237889346 --- Diff: sql/core/src/main/scala/org/apache/spark/sql/execution/datasources/PartitioningUtils.scala --- @@ -345,15 +346,18 @@ object PartitioningUtils

[GitHub] spark pull request #23186: [SPARK-26230][SQL]FileIndex: if case sensitive, v...

2018-11-30 Thread mgaido91
Github user mgaido91 commented on a diff in the pull request: https://github.com/apache/spark/pull/23186#discussion_r237889521 --- Diff: sql/core/src/test/scala/org/apache/spark/sql/execution/datasources/FileIndexSuite.scala --- @@ -65,6 +65,34 @@ class FileIndexSuite extends

[GitHub] spark pull request #23186: [SPARK-26230][SQL]FileIndex: if case sensitive, v...

2018-11-30 Thread mgaido91
Github user mgaido91 commented on a diff in the pull request: https://github.com/apache/spark/pull/23186#discussion_r237888926 --- Diff: sql/core/src/main/scala/org/apache/spark/sql/execution/datasources/PartitioningUtils.scala --- @@ -345,15 +346,18 @@ object PartitioningUtils

[GitHub] spark pull request #22957: [SPARK-25951][SQL] Ignore aliases for distributio...

2018-11-30 Thread mgaido91
Github user mgaido91 commented on a diff in the pull request: https://github.com/apache/spark/pull/22957#discussion_r237887275 --- Diff: sql/core/src/test/scala/org/apache/spark/sql/execution/PlannerSuite.scala --- @@ -780,6 +780,23 @@ class PlannerSuite extends SharedSQLContext

[GitHub] spark pull request #22957: [SPARK-25951][SQL] Ignore aliases for distributio...

2018-11-30 Thread mgaido91
Github user mgaido91 commented on a diff in the pull request: https://github.com/apache/spark/pull/22957#discussion_r237787920 --- Diff: sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/expressions/Expression.scala --- @@ -195,14 +195,35 @@ abstract class Expression

[GitHub] spark issue #23183: [SPARK-26226][SQL] Update query tracker to report timeli...

2018-11-30 Thread mgaido91
Github user mgaido91 commented on the issue: https://github.com/apache/spark/pull/23183 retest this please --- - To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org For additional commands, e-mail: reviews

[GitHub] spark issue #23171: [SPARK-26205][SQL] Optimize In for bytes, shorts, ints

2018-11-29 Thread mgaido91
Github user mgaido91 commented on the issue: https://github.com/apache/spark/pull/23171 yes @aokolnychyi , I agree that the work can be done later (not in the scope of this PR). We can maybe just open a new JIRA about it so we won't forget

[GitHub] spark issue #23171: [SPARK-26205][SQL] Optimize In for bytes, shorts, ints

2018-11-29 Thread mgaido91
Github user mgaido91 commented on the issue: https://github.com/apache/spark/pull/23171 @dbtsai I see, it would be great, though, to check which is this threshold. My understanding is that the current solution has better performance even for several hundreds of items. If this number

[GitHub] spark pull request #23171: [SPARK-26205][SQL] Optimize In for bytes, shorts,...

2018-11-29 Thread mgaido91
Github user mgaido91 commented on a diff in the pull request: https://github.com/apache/spark/pull/23171#discussion_r237564039 --- Diff: sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/expressions/predicates.scala --- @@ -335,6 +343,41 @@ case class In(value: Expression

[GitHub] spark pull request #22957: [SPARK-25951][SQL] Ignore aliases for distributio...

2018-11-29 Thread mgaido91
Github user mgaido91 commented on a diff in the pull request: https://github.com/apache/spark/pull/22957#discussion_r237539036 --- Diff: sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/analysis/Analyzer.scala --- @@ -2542,10 +2542,10 @@ object EliminateUnions extends

[GitHub] spark pull request #22957: [SPARK-25951][SQL] Ignore aliases for distributio...

2018-11-29 Thread mgaido91
Github user mgaido91 commented on a diff in the pull request: https://github.com/apache/spark/pull/22957#discussion_r237536540 --- Diff: sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/analysis/Analyzer.scala --- @@ -2542,10 +2542,10 @@ object EliminateUnions extends

[GitHub] spark pull request #22957: [SPARK-25951][SQL] Ignore aliases for distributio...

2018-11-29 Thread mgaido91
Github user mgaido91 commented on a diff in the pull request: https://github.com/apache/spark/pull/22957#discussion_r237526646 --- Diff: sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/analysis/Analyzer.scala --- @@ -2542,10 +2542,10 @@ object EliminateUnions extends

[GitHub] spark pull request #23165: [SPARK-26188][SQL] FileIndex: don't infer data ty...

2018-11-29 Thread mgaido91
Github user mgaido91 commented on a diff in the pull request: https://github.com/apache/spark/pull/23165#discussion_r237518419 --- Diff: sql/core/src/main/scala/org/apache/spark/sql/execution/datasources/PartitioningUtils.scala --- @@ -250,7 +276,13 @@ object PartitioningUtils

[GitHub] spark pull request #23165: [SPARK-26188][SQL] FileIndex: don't infer data ty...

2018-11-29 Thread mgaido91
Github user mgaido91 commented on a diff in the pull request: https://github.com/apache/spark/pull/23165#discussion_r237510736 --- Diff: sql/core/src/main/scala/org/apache/spark/sql/execution/datasources/PartitioningUtils.scala --- @@ -250,7 +276,13 @@ object PartitioningUtils

[GitHub] spark pull request #23165: [SPARK-26188][SQL] FileIndex: don't infer data ty...

2018-11-29 Thread mgaido91
Github user mgaido91 commented on a diff in the pull request: https://github.com/apache/spark/pull/23165#discussion_r237510234 --- Diff: sql/core/src/main/scala/org/apache/spark/sql/execution/datasources/PartitioningUtils.scala --- @@ -94,18 +94,34 @@ object PartitioningUtils

[GitHub] spark issue #22957: [SPARK-25951][SQL] Ignore aliases for distributions and ...

2018-11-29 Thread mgaido91
Github user mgaido91 commented on the issue: https://github.com/apache/spark/pull/22957 retest this please --- - To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org For additional commands, e-mail: reviews

[GitHub] spark pull request #23150: [SPARK-26178][SQL] Use java.time API for parsing ...

2018-11-29 Thread mgaido91
Github user mgaido91 commented on a diff in the pull request: https://github.com/apache/spark/pull/23150#discussion_r237451758 --- Diff: sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/csv/CSVInferSchema.scala --- @@ -23,10 +23,16 @@ import

[GitHub] spark issue #23176: [SPARK-26211][SQL] Fix InSet for binary, and struct and ...

2018-11-29 Thread mgaido91
Github user mgaido91 commented on the issue: https://github.com/apache/spark/pull/23176 LGTM, thanks for the fix! --- - To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org For additional commands, e-mail

[GitHub] spark issue #23176: [SPARK-26211][SQL] Fix InSet for binary, and struct and ...

2018-11-29 Thread mgaido91
Github user mgaido91 commented on the issue: https://github.com/apache/spark/pull/23176 LGTM, thanks. --- - To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org For additional commands, e-mail: reviews-h

[GitHub] spark pull request #21004: [SPARK-23896][SQL]Improve PartitioningAwareFileIn...

2018-11-28 Thread mgaido91
Github user mgaido91 commented on a diff in the pull request: https://github.com/apache/spark/pull/21004#discussion_r237093630 --- Diff: sql/core/src/main/scala/org/apache/spark/sql/execution/datasources/PartitioningAwareFileIndex.scala --- @@ -126,35 +126,32 @@ abstract class

[GitHub] spark pull request #22957: [SPARK-25951][SQL] Ignore aliases for distributio...

2018-11-28 Thread mgaido91
Github user mgaido91 commented on a diff in the pull request: https://github.com/apache/spark/pull/22957#discussion_r237075431 --- Diff: sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/expressions/Expression.scala --- @@ -195,14 +195,26 @@ abstract class Expression

[GitHub] spark pull request #22957: [SPARK-25951][SQL] Ignore aliases for distributio...

2018-11-28 Thread mgaido91
Github user mgaido91 commented on a diff in the pull request: https://github.com/apache/spark/pull/22957#discussion_r237073129 --- Diff: sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/expressions/Expression.scala --- @@ -195,14 +195,26 @@ abstract class Expression

[GitHub] spark pull request #22957: [SPARK-25951][SQL] Ignore aliases for distributio...

2018-11-28 Thread mgaido91
Github user mgaido91 commented on a diff in the pull request: https://github.com/apache/spark/pull/22957#discussion_r237070496 --- Diff: sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/expressions/Expression.scala --- @@ -195,14 +195,26 @@ abstract class Expression

[GitHub] spark issue #23153: [SPARK-26147][SQL] only pull out unevaluable python udf ...

2018-11-28 Thread mgaido91
Github user mgaido91 commented on the issue: https://github.com/apache/spark/pull/23153 a late LGTM as well, thanks @cloud-fan for the patch and thanks @xuanyuanking for the review. --- - To unsubscribe, e-mail

[GitHub] spark pull request #22957: [SPARK-25951][SQL] Ignore aliases for distributio...

2018-11-28 Thread mgaido91
Github user mgaido91 commented on a diff in the pull request: https://github.com/apache/spark/pull/22957#discussion_r237068718 --- Diff: sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/expressions/Expression.scala --- @@ -195,14 +195,26 @@ abstract class Expression

[GitHub] spark pull request #22957: [SPARK-25951][SQL] Ignore aliases for distributio...

2018-11-28 Thread mgaido91
Github user mgaido91 commented on a diff in the pull request: https://github.com/apache/spark/pull/22957#discussion_r237062190 --- Diff: sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/expressions/Expression.scala --- @@ -195,14 +195,26 @@ abstract class Expression

[GitHub] spark issue #22947: [SPARK-24913][SQL] Make AssertNotNull and AssertTrue non...

2018-11-28 Thread mgaido91
Github user mgaido91 commented on the issue: https://github.com/apache/spark/pull/22947 any more comments on this? --- - To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org For additional commands, e-mail

[GitHub] spark issue #22957: [SPARK-25951][SQL] Ignore aliases for distributions and ...

2018-11-28 Thread mgaido91
Github user mgaido91 commented on the issue: https://github.com/apache/spark/pull/22957 cc @cloud-fan @gatorsmile --- - To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org For additional commands, e-mail

[GitHub] spark issue #23057: [SPARK-26078][SQL] Dedup self-join attributes on IN subq...

2018-11-28 Thread mgaido91
Github user mgaido91 commented on the issue: https://github.com/apache/spark/pull/23057 cc @gatorsmile too --- - To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org For additional commands, e-mail: reviews

[GitHub] spark issue #22764: [SPARK-25765][ML] Add training cost to BisectingKMeans s...

2018-11-28 Thread mgaido91
Github user mgaido91 commented on the issue: https://github.com/apache/spark/pull/22764 @dbtsai any luck with this? Thanks. --- - To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org For additional commands

[GitHub] spark issue #23124: [SPARK-25829][SQL] remove duplicated map keys with last ...

2018-11-28 Thread mgaido91
Github user mgaido91 commented on the issue: https://github.com/apache/spark/pull/23124 LGTM too --- - To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org For additional commands, e-mail: reviews-h

[GitHub] spark pull request #21004: [SPARK-23896][SQL]Improve PartitioningAwareFileIn...

2018-11-28 Thread mgaido91
Github user mgaido91 commented on a diff in the pull request: https://github.com/apache/spark/pull/21004#discussion_r236998030 --- Diff: sql/core/src/main/scala/org/apache/spark/sql/execution/datasources/PartitioningAwareFileIndex.scala --- @@ -126,35 +126,32 @@ abstract class

[GitHub] spark issue #23153: [SPARK-26147][SQL] only pull out unevaluable python udf ...

2018-11-27 Thread mgaido91
Github user mgaido91 commented on the issue: https://github.com/apache/spark/pull/23153 the change itself seems fine to me, as @xuanyuanking mentioned, though, we should update the existing tests. What about adding a test in the new suite checking the plans instead of a end-to-end

[GitHub] spark issue #23057: [SPARK-26078][SQL] Dedup self-join attributes on IN subq...

2018-11-26 Thread mgaido91
Github user mgaido91 commented on the issue: https://github.com/apache/spark/pull/23057 any comments @cloud-fan ? --- - To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org For additional commands, e-mail

[GitHub] spark pull request #23124: [SPARK-25829][SQL] remove duplicated map keys wit...

2018-11-26 Thread mgaido91
Github user mgaido91 commented on a diff in the pull request: https://github.com/apache/spark/pull/23124#discussion_r236171035 --- Diff: sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/expressions/collectionOperations.scala --- @@ -646,34 +633,35 @@ case class MapConcat

[GitHub] spark pull request #23124: [SPARK-25829][SQL] remove duplicated map keys wit...

2018-11-26 Thread mgaido91
Github user mgaido91 commented on a diff in the pull request: https://github.com/apache/spark/pull/23124#discussion_r236170759 --- Diff: sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/expressions/collectionOperations.scala --- @@ -546,33 +546,29 @@ case class MapConcat

[GitHub] spark pull request #23124: [SPARK-25829][SQL] remove duplicated map keys wit...

2018-11-23 Thread mgaido91
Github user mgaido91 commented on a diff in the pull request: https://github.com/apache/spark/pull/23124#discussion_r235932502 --- Diff: sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/expressions/collectionOperations.scala --- @@ -546,33 +546,29 @@ case class MapConcat

[GitHub] spark pull request #23124: [SPARK-25829][SQL] remove duplicated map keys wit...

2018-11-23 Thread mgaido91
Github user mgaido91 commented on a diff in the pull request: https://github.com/apache/spark/pull/23124#discussion_r235931894 --- Diff: sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/expressions/collectionOperations.scala --- @@ -646,34 +633,35 @@ case class MapConcat

[GitHub] spark pull request #23124: [SPARK-25829][SQL] remove duplicated map keys wit...

2018-11-23 Thread mgaido91
Github user mgaido91 commented on a diff in the pull request: https://github.com/apache/spark/pull/23124#discussion_r235931588 --- Diff: sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/util/ArrayBasedMapBuilder.scala --- @@ -0,0 +1,118 @@ +/* + * Licensed

[GitHub] spark pull request #23124: [SPARK-25829][SQL] remove duplicated map keys wit...

2018-11-23 Thread mgaido91
Github user mgaido91 commented on a diff in the pull request: https://github.com/apache/spark/pull/23124#discussion_r235866111 --- Diff: sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/expressions/collectionOperations.scala --- @@ -646,34 +633,35 @@ case class MapConcat

[GitHub] spark pull request #23124: [SPARK-25829][SQL] remove duplicated map keys wit...

2018-11-23 Thread mgaido91
Github user mgaido91 commented on a diff in the pull request: https://github.com/apache/spark/pull/23124#discussion_r235865179 --- Diff: sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/expressions/collectionOperations.scala --- @@ -546,33 +546,29 @@ case class MapConcat

[GitHub] spark pull request #23124: [SPARK-25829][SQL] remove duplicated map keys wit...

2018-11-23 Thread mgaido91
Github user mgaido91 commented on a diff in the pull request: https://github.com/apache/spark/pull/23124#discussion_r235867070 --- Diff: sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/util/ArrayBasedMapBuilder.scala --- @@ -0,0 +1,118 @@ +/* + * Licensed

[GitHub] spark issue #23104: [SPARK-26138][SQL] LimitPushDown cross join requires may...

2018-11-22 Thread mgaido91
Github user mgaido91 commented on the issue: https://github.com/apache/spark/pull/23104 @guoxiaolongzte still that doesn't explain why we can push to the right side too. I do believe that it is possible. If the right side contains more than N items, where N is the limit size

[GitHub] spark issue #22875: [SPARK-25867][ML] Remove KMeans computeCost

2018-11-22 Thread mgaido91
Github user mgaido91 commented on the issue: https://github.com/apache/spark/pull/22875 sure, thanks @srowen , no need to apologize at all, thanks for your help reviewing this. --- - To unsubscribe, e-mail: reviews

[GitHub] spark issue #22875: [SPARK-25867][ML] Remove KMeans computeCost

2018-11-21 Thread mgaido91
Github user mgaido91 commented on the issue: https://github.com/apache/spark/pull/22875 retest this please --- - To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org For additional commands, e-mail: reviews

[GitHub] spark pull request #23104: [SPARK-26138][SQL] LimitPushDown cross join requi...

2018-11-21 Thread mgaido91
Github user mgaido91 commented on a diff in the pull request: https://github.com/apache/spark/pull/23104#discussion_r235392255 --- Diff: sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/optimizer/Optimizer.scala --- @@ -459,6 +459,7 @@ object LimitPushDown extends Rule

[GitHub] spark issue #22875: [SPARK-25867][ML] Remove KMeans computeCost

2018-11-21 Thread mgaido91
Github user mgaido91 commented on the issue: https://github.com/apache/spark/pull/22875 retest this please --- - To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org For additional commands, e-mail: reviews

[GitHub] spark issue #23057: [SPARK-26078][SQL] Dedup self-join attributes on IN subq...

2018-11-21 Thread mgaido91
Github user mgaido91 commented on the issue: https://github.com/apache/spark/pull/23057 thanks @viirya , I added a comment --- - To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org For additional commands

[GitHub] spark issue #22875: [SPARK-25867][ML] Remove KMeans computeCost

2018-11-21 Thread mgaido91
Github user mgaido91 commented on the issue: https://github.com/apache/spark/pull/22875 thanks for your review @srowen --- - To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org For additional commands, e

[GitHub] spark issue #22875: [SPARK-25867][ML] Remove KMeans computeCost

2018-11-20 Thread mgaido91
Github user mgaido91 commented on the issue: https://github.com/apache/spark/pull/22875 Yes, it was deprecated in #22756 and it is deprecated since 3.0, so we cannot remove it... --- - To unsubscribe, e-mail

[GitHub] spark issue #22875: [SPARK-25867][ML] Remove KMeans computeCost

2018-11-20 Thread mgaido91
Github user mgaido91 commented on the issue: https://github.com/apache/spark/pull/22875 Yes, the problem is that there is also the `computeCost` of `BisectingKMeans`. I proposed to deprecate it in 2.4 and remove in 3.0, but I didn't manage to have it done for 2.4 (please refer

[GitHub] spark pull request #23093: [SPARK-26127][ML] Remove deprecated setImpurity f...

2018-11-20 Thread mgaido91
Github user mgaido91 commented on a diff in the pull request: https://github.com/apache/spark/pull/23093#discussion_r235016727 --- Diff: mllib/src/main/scala/org/apache/spark/ml/classification/DecisionTreeClassifier.scala --- @@ -91,7 +91,7 @@ class DecisionTreeClassifier @Since

[GitHub] spark issue #22875: [SPARK-25867][ML] Remove KMeans computeCost

2018-11-20 Thread mgaido91
Github user mgaido91 commented on the issue: https://github.com/apache/spark/pull/22875 retest this please --- - To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org For additional commands, e-mail: reviews

  1   2   3   4   5   6   7   8   9   10   >