[GitHub] spark pull request #23258: [SPARK-23375][SQL][FOLLOWUP][TEST] Test Sort metr...

2018-12-10 Thread seancxmao
Github user seancxmao commented on a diff in the pull request: https://github.com/apache/spark/pull/23258#discussion_r240148392 --- Diff: sql/core/src/test/scala/org/apache/spark/sql/execution/metric/SQLMetricsSuite.scala --- @@ -182,10 +182,13 @@ class SQLMetricsSuite extends

[GitHub] spark pull request #23273: [SPARK-25212][SQL][FOLLOWUP][DOC] Fix comments of...

2018-12-10 Thread seancxmao
GitHub user seancxmao opened a pull request: https://github.com/apache/spark/pull/23273 [SPARK-25212][SQL][FOLLOWUP][DOC] Fix comments of ConvertToLocalRelation rule ## What changes were proposed in this pull request? There are some comment issues left when

[GitHub] spark pull request #23258: [SPARK-23375][SQL][FOLLOWUP][TEST] Test Sort metr...

2018-12-09 Thread seancxmao
Github user seancxmao commented on a diff in the pull request: https://github.com/apache/spark/pull/23258#discussion_r240038550 --- Diff: sql/core/src/test/scala/org/apache/spark/sql/execution/metric/SQLMetricsSuite.scala --- @@ -182,10 +182,13 @@ class SQLMetricsSuite extends

[GitHub] spark pull request #23258: [SPARK-23375][SQL][FOLLOWUP][TEST] Test Sort metr...

2018-12-08 Thread seancxmao
Github user seancxmao commented on a diff in the pull request: https://github.com/apache/spark/pull/23258#discussion_r240024723 --- Diff: sql/core/src/test/scala/org/apache/spark/sql/execution/metric/SQLMetricsSuite.scala --- @@ -182,10 +182,13 @@ class SQLMetricsSuite extends

[GitHub] spark issue #23238: [SPARK-25132][SQL][FOLLOWUP][DOC] Add migration doc for ...

2018-12-08 Thread seancxmao
Github user seancxmao commented on the issue: https://github.com/apache/spark/pull/23238 Thank you! @dongjoon-hyun --- - To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org For additional commands, e-mail

[GitHub] spark issue #23258: [SPARK-23375][SQL][FOLLOWUP][TEST] Test Sort metrics whi...

2018-12-08 Thread seancxmao
Github user seancxmao commented on the issue: https://github.com/apache/spark/pull/23258 retest this please.

[GitHub] spark pull request #23258: [SPARK-23375][SQL][FOLLOWUP][TEST] Test Sort metr...

2018-12-07 Thread seancxmao
Github user seancxmao commented on a diff in the pull request: https://github.com/apache/spark/pull/23258#discussion_r239995952 --- Diff: sql/core/src/test/scala/org/apache/spark/sql/execution/metric/SQLMetricsSuite.scala --- @@ -182,10 +182,13 @@ class SQLMetricsSuite extends

[GitHub] spark pull request #23258: [SPARK-23375][SQL][FOLLOWUP][TEST] Test Sort metr...

2018-12-07 Thread seancxmao
Github user seancxmao commented on a diff in the pull request: https://github.com/apache/spark/pull/23258#discussion_r239995901 --- Diff: sql/core/src/test/scala/org/apache/spark/sql/execution/metric/SQLMetricsSuite.scala --- @@ -182,10 +182,13 @@ class SQLMetricsSuite extends

[GitHub] spark pull request #23258: [SPARK-23375][SQL][FOLLOWUP][TEST] Test Sort metr...

2018-12-07 Thread seancxmao
Github user seancxmao commented on a diff in the pull request: https://github.com/apache/spark/pull/23258#discussion_r239995882 --- Diff: sql/core/src/test/scala/org/apache/spark/sql/execution/metric/SQLMetricsSuite.scala --- @@ -26,7 +26,7 @@ import

[GitHub] spark pull request #23224: [SPARK-26277][SQL][TEST] WholeStageCodegen metric...

2018-12-07 Thread seancxmao
Github user seancxmao commented on a diff in the pull request: https://github.com/apache/spark/pull/23224#discussion_r239992863 --- Diff: sql/core/src/test/scala/org/apache/spark/sql/execution/metric/SQLMetricsSuite.scala --- @@ -80,8 +80,10 @@ class SQLMetricsSuite extends

[GitHub] spark pull request #23258: [SPARK-23375][SQL][FOLLOWUP][TEST] Test Sort metr...

2018-12-07 Thread seancxmao
GitHub user seancxmao opened a pull request: https://github.com/apache/spark/pull/23258 [SPARK-23375][SQL][FOLLOWUP][TEST] Test Sort metrics while Sort is missing ## What changes were proposed in this pull request? #20560/[SPARK-23375](https://issues.apache.org/jira/browse/SPARK

[GitHub] spark issue #23224: [SPARK-26277][SQL][TEST] WholeStageCodegen metrics shoul...

2018-12-07 Thread seancxmao
Github user seancxmao commented on the issue: https://github.com/apache/spark/pull/23224 @felixcheung Yes, that makes sense. I have added a commit to check that.

[GitHub] spark pull request #23238: [SPARK-25132][SQL][FOLLOWUP][DOC] Add migration d...

2018-12-07 Thread seancxmao
Github user seancxmao commented on a diff in the pull request: https://github.com/apache/spark/pull/23238#discussion_r239752238 --- Diff: docs/sql-migration-guide-upgrade.md --- @@ -141,6 +141,8 @@ displayTitle: Spark SQL Upgrading Guide - In Spark version 2.3

[GitHub] spark issue #23238: [SPARK-25132][SQL][FOLLOWUP] Add migration doc for case-...

2018-12-06 Thread seancxmao
Github user seancxmao commented on the issue: https://github.com/apache/spark/pull/23238 @HyukjinKwon Would you please kindly take a look at this when you have time?

[GitHub] spark issue #23237: [SPARK-26279][CORE] Remove unused method in Logging

2018-12-05 Thread seancxmao
Github user seancxmao commented on the issue: https://github.com/apache/spark/pull/23237 @HyukjinKwon Close this PR. Thank you!

[GitHub] spark pull request #23237: [SPARK-26279][CORE] Remove unused method in Loggi...

2018-12-05 Thread seancxmao
Github user seancxmao closed the pull request at: https://github.com/apache/spark/pull/23237

[GitHub] spark pull request #23238: [SPARK-25132][SQL][FOLLOWUP] Add migration doc fo...

2018-12-05 Thread seancxmao
GitHub user seancxmao opened a pull request: https://github.com/apache/spark/pull/23238 [SPARK-25132][SQL][FOLLOWUP] Add migration doc for case-insensitive field resolution when reading from Parquet ## What changes were proposed in this pull request? #22148 introduces

[GitHub] spark issue #22184: [SPARK-25132][SQL][DOC] Add migration doc for case-insen...

2018-12-05 Thread seancxmao
Github user seancxmao commented on the issue: https://github.com/apache/spark/pull/22184 @srowen Sorry for the late reply! I'd like to close this PR and file a new one since our SQL doc has changed a lot. Thank you all for your comments and time

[GitHub] spark pull request #22184: [SPARK-25132][SQL][DOC] Add migration doc for cas...

2018-12-05 Thread seancxmao
Github user seancxmao closed the pull request at: https://github.com/apache/spark/pull/22184

[GitHub] spark pull request #23237: [SPARK-26279][CORE] Remove unused method in Loggi...

2018-12-05 Thread seancxmao
GitHub user seancxmao opened a pull request: https://github.com/apache/spark/pull/23237 [SPARK-26279][CORE] Remove unused method in Logging ## What changes were proposed in this pull request? The method `Logging.isTraceEnabled` is not used anywhere. We should remove it to avoid

[GitHub] spark issue #23224: [SPARK-26277][SQL][TEST] WholeStageCodegen metrics shoul...

2018-12-05 Thread seancxmao
Github user seancxmao commented on the issue: https://github.com/apache/spark/pull/23224 @HyukjinKwon Thank you for your comments! I have filed a JIRA and updated the PR title accordingly.

[GitHub] spark pull request #23224: [MINOR][SQL][TEST] WholeStageCodegen metrics shou...

2018-12-04 Thread seancxmao
GitHub user seancxmao opened a pull request: https://github.com/apache/spark/pull/23224 [MINOR][SQL][TEST] WholeStageCodegen metrics should be tested with whole-stage codegen enabled ## What changes were proposed in this pull request

[GitHub] spark issue #22184: [SPARK-25132][SQL][DOC] Add migration doc for case-insen...

2018-11-11 Thread seancxmao
Github user seancxmao commented on the issue: https://github.com/apache/spark/pull/22184 @HyukjinKwon Thank you for your comments. Yes, this is only valid when upgrading from Spark 2.3 to 2.4. I will do

[GitHub] spark pull request #22868: [SPARK-25833][SQL][DOCS] Update migration guide f...

2018-10-29 Thread seancxmao
Github user seancxmao commented on a diff in the pull request: https://github.com/apache/spark/pull/22868#discussion_r229156006 --- Diff: docs/sql-migration-guide-hive-compatibility.md --- @@ -51,6 +51,22 @@ Spark SQL supports the vast majority of Hive features

[GitHub] spark pull request #22868: [SPARK-25833][SQL][DOCS] Update migration guide f...

2018-10-29 Thread seancxmao
Github user seancxmao commented on a diff in the pull request: https://github.com/apache/spark/pull/22868#discussion_r229155030 --- Diff: docs/sql-migration-guide-hive-compatibility.md --- @@ -53,7 +53,20 @@ Spark SQL supports the vast majority of Hive features

[GitHub] spark pull request #22868: [SPARK-25833][SQL][DOCS] Update migration guide f...

2018-10-29 Thread seancxmao
Github user seancxmao commented on a diff in the pull request: https://github.com/apache/spark/pull/22868#discussion_r229150147 --- Diff: docs/sql-migration-guide-hive-compatibility.md --- @@ -51,6 +51,22 @@ Spark SQL supports the vast majority of Hive features

[GitHub] spark issue #22868: [SPARK-25833][SQL][DOCS] Update migration guide for Hive...

2018-10-29 Thread seancxmao
Github user seancxmao commented on the issue: https://github.com/apache/spark/pull/22868 @dongjoon-hyun Do you mean SPARK-25833? Since SPARK-24864 is resolved as Won't Fix, I updated type, priority and title of SPARK-25833

[GitHub] spark pull request #22868: [SPARK-25833][SQL][DOCS] Update migration guide f...

2018-10-29 Thread seancxmao
Github user seancxmao commented on a diff in the pull request: https://github.com/apache/spark/pull/22868#discussion_r228845332 --- Diff: docs/sql-migration-guide-hive-compatibility.md --- @@ -51,6 +51,9 @@ Spark SQL supports the vast majority of Hive features

[GitHub] spark pull request #22851: [SPARK-25797][SQL][DOCS][BACKPORT-2.3] Add migrat...

2018-10-29 Thread seancxmao
Github user seancxmao closed the pull request at: https://github.com/apache/spark/pull/22851

[GitHub] spark issue #22851: [SPARK-25797][SQL][DOCS][BACKPORT-2.3] Add migration doc...

2018-10-29 Thread seancxmao
Github user seancxmao commented on the issue: https://github.com/apache/spark/pull/22851 Closing this. Thank you @dongjoon-hyun @felixcheung

[GitHub] spark issue #22868: [SPARK-25833][SQL][DOCS] Update migration guide for Hive...

2018-10-28 Thread seancxmao
Github user seancxmao commented on the issue: https://github.com/apache/spark/pull/22868 retest this please

[GitHub] spark pull request #22868: [SPARK-25833][SQL][DOCS] Update migration guide f...

2018-10-28 Thread seancxmao
GitHub user seancxmao opened a pull request: https://github.com/apache/spark/pull/22868 [SPARK-25833][SQL][DOCS] Update migration guide for Hive view compatibility ## What changes were proposed in this pull request? Both Spark and Hive support views. However in some cases views

[GitHub] spark issue #22846: [SPARK-25797][SQL][DOCS] Add migration doc for solving i...

2018-10-26 Thread seancxmao
Github user seancxmao commented on the issue: https://github.com/apache/spark/pull/22846 @cloud-fan PR for 2.3 is submitted. Please see #22851.

[GitHub] spark pull request #22851: [SPARK-25797][SQL][DOCS][BACKPORT-2.3] Add migrat...

2018-10-26 Thread seancxmao
GitHub user seancxmao opened a pull request: https://github.com/apache/spark/pull/22851 [SPARK-25797][SQL][DOCS][BACKPORT-2.3] Add migration doc for solving issues caused by view canonicalization approach change ## What changes were proposed in this pull request? Since Spark

[GitHub] spark issue #22846: [SPARK-25797][SQL][DOCS] Add migration doc for solving i...

2018-10-26 Thread seancxmao
Github user seancxmao commented on the issue: https://github.com/apache/spark/pull/22846 @cloud-fan Sure, I will send a new PR for 2.3. Thank you for reviewing this.

[GitHub] spark issue #22846: [SPARK-25797][SQL][DOCS] Add migration doc for solving i...

2018-10-26 Thread seancxmao
Github user seancxmao commented on the issue: https://github.com/apache/spark/pull/22846 @jiangxb1987 @cloud-fan @gatorsmile Could you please kindly review this when you have time?

[GitHub] spark pull request #22846: [SPARK-25797][SQL][DOCS] Add migration doc for so...

2018-10-26 Thread seancxmao
GitHub user seancxmao opened a pull request: https://github.com/apache/spark/pull/22846 [SPARK-25797][SQL][DOCS] Add migration doc for solving issues caused by view canonicalization approach change ## What changes were proposed in this pull request? Since Spark 2.2, view

[GitHub] spark issue #22461: [SPARK-25453][SQL][TEST] OracleIntegrationSuite IllegalA...

2018-09-29 Thread seancxmao
Github user seancxmao commented on the issue: https://github.com/apache/spark/pull/22461 Retest this please.

[GitHub] spark pull request #22461: [SPARK-25453][SQL][TEST] OracleIntegrationSuite I...

2018-09-28 Thread seancxmao
Github user seancxmao commented on a diff in the pull request: https://github.com/apache/spark/pull/22461#discussion_r221411618 --- Diff: docs/sql-programming-guide.md --- @@ -1489,7 +1489,7 @@ See the [Apache Avro Data Source Guide](avro-data-source-guide.html

[GitHub] spark issue #22531: [SPARK-25415][SQL][FOLLOW-UP] Add Locale.ROOT when toUpp...

2018-09-28 Thread seancxmao
Github user seancxmao commented on the issue: https://github.com/apache/spark/pull/22531 @HyukjinKwon Please go ahead since you've already been working on this.

[GitHub] spark issue #22453: [SPARK-20937][DOCS] Describe spark.sql.parquet.writeLega...

2018-09-26 Thread seancxmao
Github user seancxmao commented on the issue: https://github.com/apache/spark/pull/22453 Retest this please.

[GitHub] spark pull request #22453: [SPARK-20937][DOCS] Describe spark.sql.parquet.wr...

2018-09-25 Thread seancxmao
Github user seancxmao commented on a diff in the pull request: https://github.com/apache/spark/pull/22453#discussion_r220407692 --- Diff: docs/sql-programming-guide.md --- @@ -1002,6 +1002,21 @@ Configuration of Parquet can be done using the `setConf` method on `SparkSession

[GitHub] spark pull request #22453: [SPARK-20937][DOCS] Describe spark.sql.parquet.wr...

2018-09-24 Thread seancxmao
Github user seancxmao commented on a diff in the pull request: https://github.com/apache/spark/pull/22453#discussion_r220042478 --- Diff: docs/sql-programming-guide.md --- @@ -1002,6 +1002,21 @@ Configuration of Parquet can be done using the `setConf` method on `SparkSession

[GitHub] spark pull request #22453: [SPARK-20937][DOCS] Describe spark.sql.parquet.wr...

2018-09-24 Thread seancxmao
Github user seancxmao commented on a diff in the pull request: https://github.com/apache/spark/pull/22453#discussion_r220038438 --- Diff: docs/sql-programming-guide.md --- @@ -1002,6 +1002,21 @@ Configuration of Parquet can be done using the `setConf` method on `SparkSession

[GitHub] spark issue #22453: [SPARK-20937][DOCS] Describe spark.sql.parquet.writeLega...

2018-09-23 Thread seancxmao
Github user seancxmao commented on the issue: https://github.com/apache/spark/pull/22453 FYI. I did a brief survey of Parquet decimal support in computing engines at the time of writing. Hive * [HIVE-19069](https://jira.apache.org/jira/browse/HIVE-19069) Hive can't read

[GitHub] spark pull request #22453: [SPARK-20937][DOCS] Describe spark.sql.parquet.wr...

2018-09-23 Thread seancxmao
Github user seancxmao commented on a diff in the pull request: https://github.com/apache/spark/pull/22453#discussion_r219729166 --- Diff: docs/sql-programming-guide.md --- @@ -1002,6 +1002,15 @@ Configuration of Parquet can be done using the `setConf` method on `SparkSession

[GitHub] spark pull request #22453: [SPARK-20937][DOCS] Describe spark.sql.parquet.wr...

2018-09-23 Thread seancxmao
Github user seancxmao commented on a diff in the pull request: https://github.com/apache/spark/pull/22453#discussion_r219721110 --- Diff: docs/sql-programming-guide.md --- @@ -1002,6 +1002,15 @@ Configuration of Parquet can be done using the `setConf` method on `SparkSession

[GitHub] spark pull request #22499: [SPARK-25489][ML][TEST] Refactor UDTSerialization...

2018-09-23 Thread seancxmao
Github user seancxmao commented on a diff in the pull request: https://github.com/apache/spark/pull/22499#discussion_r219698800 --- Diff: mllib/src/test/scala/org/apache/spark/mllib/linalg/UDTSerializationBenchmark.scala --- @@ -18,52 +18,52 @@ package

[GitHub] spark pull request #22461: [SPARK-25453][SQL][TEST] OracleIntegrationSuite I...

2018-09-21 Thread seancxmao
Github user seancxmao commented on a diff in the pull request: https://github.com/apache/spark/pull/22461#discussion_r219539062 --- Diff: docs/sql-programming-guide.md --- @@ -1287,8 +1287,18 @@ bin/spark-shell --driver-class-path postgresql-9.4.1207.jar --jars postgresql-9

[GitHub] spark pull request #22461: [SPARK-25453][SQL][TEST] OracleIntegrationSuite I...

2018-09-21 Thread seancxmao
Github user seancxmao commented on a diff in the pull request: https://github.com/apache/spark/pull/22461#discussion_r219537942 --- Diff: external/docker-integration-tests/src/test/scala/org/apache/spark/sql/jdbc/OracleIntegrationSuite.scala --- @@ -462,6 +468,12 @@ class

[GitHub] spark pull request #22461: [SPARK-25453][SQL][TEST] OracleIntegrationSuite I...

2018-09-21 Thread seancxmao
Github user seancxmao commented on a diff in the pull request: https://github.com/apache/spark/pull/22461#discussion_r219403696 --- Diff: external/docker-integration-tests/src/test/scala/org/apache/spark/sql/jdbc/OracleIntegrationSuite.scala --- @@ -462,6 +464,9 @@ class

[GitHub] spark issue #22497: [SPARK-25487][SQL][TEST] Refactor PrimitiveArrayBenchmar...

2018-09-21 Thread seancxmao
Github user seancxmao commented on the issue: https://github.com/apache/spark/pull/22497 @kiszk @wangyum Thank you!

[GitHub] spark pull request #22461: [SPARK-25453][SQL][TEST] OracleIntegrationSuite I...

2018-09-20 Thread seancxmao
Github user seancxmao commented on a diff in the pull request: https://github.com/apache/spark/pull/22461#discussion_r219386919 --- Diff: external/docker-integration-tests/src/test/scala/org/apache/spark/sql/jdbc/OracleIntegrationSuite.scala --- @@ -442,6 +442,8 @@ class

[GitHub] spark pull request #22499: [SPARK-25489][ML][TEST] Refactor UDTSerialization...

2018-09-20 Thread seancxmao
GitHub user seancxmao opened a pull request: https://github.com/apache/spark/pull/22499 [SPARK-25489][ML][TEST] Refactor UDTSerializationBenchmark ## What changes were proposed in this pull request? Refactor `UDTSerializationBenchmark` to use main method and print the output

[GitHub] spark pull request #22497: [SPARK-25487][SQL][TEST] Refactor PrimitiveArrayB...

2018-09-20 Thread seancxmao
GitHub user seancxmao opened a pull request: https://github.com/apache/spark/pull/22497 [SPARK-25487][SQL][TEST] Refactor PrimitiveArrayBenchmark ## What changes were proposed in this pull request? Refactor PrimitiveArrayBenchmark to use main method and print the output

[GitHub] spark issue #22461: [SPARK-25453][SQL][TEST] OracleIntegrationSuite IllegalA...

2018-09-19 Thread seancxmao
Github user seancxmao commented on the issue: https://github.com/apache/spark/pull/22461 I tested on my Mac, following the guidance at http://spark.apache.org/docs/latest/building-spark.html#running-docker-based-integration-test-suites. ``` ./build/mvn install -DskipTests

[GitHub] spark issue #22461: [SPARK-25453][SQL][TEST] OracleIntegrationSuite IllegalA...

2018-09-19 Thread seancxmao
Github user seancxmao commented on the issue: https://github.com/apache/spark/pull/22461 Jenkins, retest this please.

[GitHub] spark issue #22461: [SPARK-25453][SQL][TEST] OracleIntegrationSuite IllegalA...

2018-09-18 Thread seancxmao
Github user seancxmao commented on the issue: https://github.com/apache/spark/pull/22461 @gatorsmile Thanks a lot!

[GitHub] spark issue #22461: [SPARK-25453][SQL][TEST] OracleIntegrationSuite IllegalA...

2018-09-18 Thread seancxmao
Github user seancxmao commented on the issue: https://github.com/apache/spark/pull/22461 Jenkins, retest this please.

[GitHub] spark pull request #22461: [SPARK-25453] OracleIntegrationSuite IllegalArgum...

2018-09-18 Thread seancxmao
GitHub user seancxmao opened a pull request: https://github.com/apache/spark/pull/22461 [SPARK-25453] OracleIntegrationSuite IllegalArgumentException: Timestamp format must be yyyy-mm-dd hh:mm:ss[.fffffffff] ## What changes were proposed in this pull request? This PR aims

[GitHub] spark issue #22453: [SPARK-20937][DOCS] Describe spark.sql.parquet.writeLega...

2018-09-18 Thread seancxmao
Github user seancxmao commented on the issue: https://github.com/apache/spark/pull/22453 @HyukjinKwon Could you please help review this?

[GitHub] spark pull request #22453: [SPARK-20937][DOCS] Describe spark.sql.parquet.wr...

2018-09-18 Thread seancxmao
GitHub user seancxmao opened a pull request: https://github.com/apache/spark/pull/22453 [SPARK-20937][DOCS] Describe spark.sql.parquet.writeLegacyFormat property in Spark SQL, DataFrames and Datasets Guide ## What changes were proposed in this pull request? Describe

[GitHub] spark pull request #22343: [SPARK-25391][SQL] Make behaviors consistent when...

2018-09-16 Thread seancxmao
Github user seancxmao closed the pull request at: https://github.com/apache/spark/pull/22343

[GitHub] spark issue #22343: [SPARK-25391][SQL] Make behaviors consistent when conver...

2018-09-16 Thread seancxmao
Github user seancxmao commented on the issue: https://github.com/apache/spark/pull/22343 Sure, close this PR. Thank you all for your time and insights.

[GitHub] spark issue #22343: [SPARK-25391][SQL] Make behaviors consistent when conver...

2018-09-12 Thread seancxmao
Github user seancxmao commented on the issue: https://github.com/apache/spark/pull/22343 I agree that correctness is more important. If we should not make behaviors consistent when doing the conversion, I will close this PR. @cloud-fan @gatorsmile what do you think

[GitHub] spark issue #22343: [SPARK-25391][SQL] Make behaviors consistent when conver...

2018-09-11 Thread seancxmao
Github user seancxmao commented on the issue: https://github.com/apache/spark/pull/22343 Setting spark.sql.hive.convertMetastoreParquet=false keeps Hive compatibility but loses the performance benefit. We can do better by enabling the conversion and still keeping Hive compatibility

[GitHub] spark issue #22343: [SPARK-25391][SQL] Make behaviors consistent when conver...

2018-09-11 Thread seancxmao
Github user seancxmao commented on the issue: https://github.com/apache/spark/pull/22343 Could we see this as a behavior change? We can add a legacy conf (e.g. `spark.sql.hive.legacy.convertMetastoreParquet`, may be defined in HiveUtils) to enable users to revert back to the previous

[GitHub] spark issue #22343: [SPARK-25391][SQL] Make behaviors consistent when conver...

2018-09-10 Thread seancxmao
Github user seancxmao commented on the issue: https://github.com/apache/spark/pull/22343 @dongjoon-hyun It is a little complicated. There has been a discussion about this in #22184. Below are some key comments from @cloud-fan and @gatorsmile, just FYI. * https://github.com

[GitHub] spark issue #22343: [SPARK-25391][SQL] Make behaviors consistent when conver...

2018-09-10 Thread seancxmao
Github user seancxmao commented on the issue: https://github.com/apache/spark/pull/22343 Hi, @dongjoon-hyun When we find duplicated field names in the case of convertMetastoreXXX, we have 2 options (1) raise exception as parquet data source. To most of end users, they do

[GitHub] spark issue #22184: [SPARK-25132][SQL][DOC] Add migration doc for case-insen...

2018-09-10 Thread seancxmao
Github user seancxmao commented on the issue: https://github.com/apache/spark/pull/22184 @cloud-fan @gatorsmile I think the old `Upgrading From Spark SQL 2.3.1 to 2.3.2 and above` is not needed since we do not backport SPARK-25132 to branch-2.3. I'm wondering if we need `Upgrading

[GitHub] spark pull request #22343: [SPARK-25391][SQL] Make behaviors consistent when...

2018-09-10 Thread seancxmao
Github user seancxmao commented on a diff in the pull request: https://github.com/apache/spark/pull/22343#discussion_r216212552 --- Diff: sql/core/src/test/scala/org/apache/spark/sql/execution/datasources/parquet/ParquetSchemaSuite.scala --- @@ -1390,7 +1395,11 @@ class

[GitHub] spark issue #22343: [SPARK-25391][SQL] Make behaviors consistent when conver...

2018-09-09 Thread seancxmao
Github user seancxmao commented on the issue: https://github.com/apache/spark/pull/22343 @dongjoon-hyun @HyukjinKwon I created a new JIRA ticket and tried to use a more complete and clear title for this PR. What do you think

[GitHub] spark issue #22262: [SPARK-25175][SQL] Field resolution should fail if there...

2018-09-09 Thread seancxmao
Github user seancxmao commented on the issue: https://github.com/apache/spark/pull/22262 @dongjoon-hyun Thank you!

[GitHub] spark issue #22262: [SPARK-25175][SQL] Field resolution should fail if there...

2018-09-08 Thread seancxmao
Github user seancxmao commented on the issue: https://github.com/apache/spark/pull/22262 > ... we need this duplication check in case-sensitive mode ... Do you mean we may define ORC/Parquet schema with identical field names (even in the same letter case)? Would you please expl

[GitHub] spark issue #22343: [SPARK-25132][SQL][FOLLOW-UP] The behavior must be consi...

2018-09-07 Thread seancxmao
Github user seancxmao commented on the issue: https://github.com/apache/spark/pull/22343 @HyukjinKwon @cloud-fan @gatorsmile Could you please kindly help review this if you have time?

[GitHub] spark pull request #22262: [SPARK-25175][SQL] Field resolution should fail i...

2018-09-07 Thread seancxmao
Github user seancxmao commented on a diff in the pull request: https://github.com/apache/spark/pull/22262#discussion_r216121510 --- Diff: sql/core/src/main/scala/org/apache/spark/sql/execution/datasources/orc/OrcUtils.scala --- @@ -116,6 +116,14 @@ object OrcUtils extends Logging

[GitHub] spark issue #22262: [SPARK-25175][SQL] Field resolution should fail if there...

2018-09-07 Thread seancxmao
Github user seancxmao commented on the issue: https://github.com/apache/spark/pull/22262 @dongjoon-hyun That's all right :). I have reverted to the first commit and adjusted the indentation.

[GitHub] spark issue #22262: [SPARK-25175][SQL] Field resolution should fail if there...

2018-09-06 Thread seancxmao
Github user seancxmao commented on the issue: https://github.com/apache/spark/pull/22262 I updated the PR description. Thank you for pointing out that the PR description should stay focused. I also think it's more clear

[GitHub] spark issue #22262: [SPARK-25175][SQL] Field resolution should fail if there...

2018-09-05 Thread seancxmao
Github user seancxmao commented on the issue: https://github.com/apache/spark/pull/22262 @dongjoon-hyun I have updated the PR description to explain in more detail. As you mentioned, this PR is specific to the case when reading from data source table persisted in metastore

[GitHub] spark issue #22184: [SPARK-25132][SQL][DOC] Add migration doc for case-insen...

2018-09-05 Thread seancxmao
Github user seancxmao commented on the issue: https://github.com/apache/spark/pull/22184 @cloud-fan I've just sent a PR (#22343) for this.

[GitHub] spark pull request #22343: [SPARK-25132][SQL][FOLLOW-UP] The behavior must b...

2018-09-05 Thread seancxmao
GitHub user seancxmao opened a pull request: https://github.com/apache/spark/pull/22343 [SPARK-25132][SQL][FOLLOW-UP] The behavior must be consistent to do the conversion ## What changes were proposed in this pull request? parquet data source tables and hive parquet tables have

[GitHub] spark pull request #22183: [SPARK-25132][SQL][BACKPORT-2.3] Case-insensitive...

2018-09-04 Thread seancxmao
Github user seancxmao closed the pull request at: https://github.com/apache/spark/pull/22183

[GitHub] spark issue #22184: [SPARK-25132][SQL][DOC] Add migration doc for case-insen...

2018-08-29 Thread seancxmao
Github user seancxmao commented on the issue: https://github.com/apache/spark/pull/22184 > My proposal is, parquet data source should provide an option(not SQL conf) to ... You mentioned this option is not SQL conf. Could you give me some advice about where this option sho

[GitHub] spark issue #22184: [SPARK-25132][SQL][DOC] Add migration doc for case-insen...

2018-08-29 Thread seancxmao
Github user seancxmao commented on the issue: https://github.com/apache/spark/pull/22184 @cloud-fan OK, I will do it. Just to confirm, when reading from hive parquet table, if `spark.sql.hive.convertMetastoreParquet` and `spark.sql.caseSensitive` are both set to true, we

[GitHub] spark issue #22262: [SPARK-25175][SQL] Field resolution should fail if there...

2018-08-29 Thread seancxmao
Github user seancxmao commented on the issue: https://github.com/apache/spark/pull/22262 @dongjoon-hyun @cloud-fan @gatorsmile Could you please kindly review this?

[GitHub] spark pull request #22262: [SPARK-25175][SQL] Field resolution should fail i...

2018-08-29 Thread seancxmao
GitHub user seancxmao opened a pull request: https://github.com/apache/spark/pull/22262 [SPARK-25175][SQL] Field resolution should fail if there is ambiguity for ORC native reader ## What changes were proposed in this pull request? This PR aims to make ORC data source native
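
As a rough illustration of the idea in the title of #22262 (field resolution should fail if there is ambiguity), the Python sketch below resolves a requested field name against a file's physical field names. It is not Spark's code: `resolve_field`, `AmbiguousFieldError`, and the `case_sensitive` flag are hypothetical names chosen for this example, and returning `None` for a missing field is an assumption made here.

```python
from typing import Optional, Sequence


class AmbiguousFieldError(ValueError):
    """Hypothetical error: one requested name matches several physical fields."""


def resolve_field(requested: str,
                  physical_fields: Sequence[str],
                  case_sensitive: bool) -> Optional[str]:
    """Return the physical field name that `requested` resolves to, or None."""
    if case_sensitive:
        # Exact match only; differently cased names are distinct fields.
        return requested if requested in physical_fields else None

    # Case-insensitive mode: collect every field equal to the request
    # once letter case is ignored.
    wanted = requested.lower()
    matches = [name for name in physical_fields if name.lower() == wanted]

    if len(matches) > 1:
        # More than one candidate: fail instead of silently picking one.
        raise AmbiguousFieldError(
            f"Found duplicate field(s) '{requested}': {matches} "
            "in case-insensitive mode"
        )
    return matches[0] if matches else None
```

With physical fields `["a", "A"]`, a case-sensitive lookup of `"a"` returns `"a"`, while the same lookup in case-insensitive mode raises `AmbiguousFieldError` rather than choosing one of the two columns.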

[GitHub] spark pull request #22184: [SPARK-25132][SQL][DOC] Add migration doc for cas...

2018-08-27 Thread seancxmao
Github user seancxmao commented on a diff in the pull request: https://github.com/apache/spark/pull/22184#discussion_r213020789 --- Diff: docs/sql-programming-guide.md --- @@ -1895,6 +1895,10 @@ working with timestamps in `pandas_udf`s to get the best performance, see

[GitHub] spark pull request #22184: [SPARK-25132][SQL][DOC] Add migration doc for cas...

2018-08-27 Thread seancxmao
Github user seancxmao commented on a diff in the pull request: https://github.com/apache/spark/pull/22184#discussion_r212894532 --- Diff: docs/sql-programming-guide.md --- @@ -1895,6 +1895,10 @@ working with timestamps in `pandas_udf`s to get the best performance, see

[GitHub] spark pull request #22184: [SPARK-25132][SQL][DOC] Add migration doc for cas...

2018-08-23 Thread seancxmao
Github user seancxmao commented on a diff in the pull request: https://github.com/apache/spark/pull/22184#discussion_r212405373 --- Diff: docs/sql-programming-guide.md --- @@ -1895,6 +1895,10 @@ working with timestamps in `pandas_udf`s to get the best performance, see

[GitHub] spark issue #22184: [SPARK-25132][SQL][DOC] Add migration doc for case-insen...

2018-08-22 Thread seancxmao
Github user seancxmao commented on the issue: https://github.com/apache/spark/pull/22184 @gatorsmile Could you kindly help trigger Jenkins and review?

[GitHub] spark pull request #22184: [SPARK-25132][SQL][DOC] Add migration doc for cas...

2018-08-22 Thread seancxmao
GitHub user seancxmao opened a pull request: https://github.com/apache/spark/pull/22184 [SPARK-25132][SQL][DOC] Add migration doc for case-insensitive field resolution when reading from Parquet ## What changes were proposed in this pull request? #22148 introduces a behavior

[GitHub] spark pull request #22183: [SPARK-25132][SQL][BACKPORT-2.3] Case-insensitive...

2018-08-21 Thread seancxmao
GitHub user seancxmao opened a pull request: https://github.com/apache/spark/pull/22183 [SPARK-25132][SQL][BACKPORT-2.3] Case-insensitive field resolution when reading from Parquet ## What changes were proposed in this pull request? This is a backport of https://github.com

[GitHub] spark issue #22148: [SPARK-25132][SQL] Case-insensitive field resolution whe...

2018-08-20 Thread seancxmao
Github user seancxmao commented on the issue: https://github.com/apache/spark/pull/22148 Thanks!

[GitHub] spark pull request #22148: [SPARK-25132][SQL] Case-insensitive field resolut...

2018-08-19 Thread seancxmao
GitHub user seancxmao opened a pull request: https://github.com/apache/spark/pull/22148 [SPARK-25132][SQL] Case-insensitive field resolution when reading from Parquet ## What changes were proposed in this pull request? Spark SQL returns NULL for a column whose Hive metastore

[GitHub] spark issue #22142: [SPARK-25132][SQL] Case-insensitive field resolution whe...

2018-08-19 Thread seancxmao
Github user seancxmao commented on the issue: https://github.com/apache/spark/pull/22142 Split this into 2 PRs, one each for Parquet and ORC.

[GitHub] spark pull request #22142: [SPARK-25132][SQL] Case-insensitive field resolut...

2018-08-19 Thread seancxmao
Github user seancxmao closed the pull request at: https://github.com/apache/spark/pull/22142

[GitHub] spark pull request #22142: [SPARK-25132][SQL] case-insensitive field resolut...

2018-08-19 Thread seancxmao
GitHub user seancxmao opened a pull request: https://github.com/apache/spark/pull/22142 [SPARK-25132][SQL] case-insensitive field resolution when reading from Parquet/ORC ## What changes were proposed in this pull request? Spark SQL returns NULL for a column whose Hive

[GitHub] spark pull request #21113: [SPARK-13136][SQL][FOLLOW-UP] Fix comment

2018-04-20 Thread seancxmao
GitHub user seancxmao opened a pull request: https://github.com/apache/spark/pull/21113 [SPARK-13136][SQL][FOLLOW-UP] Fix comment ## What changes were proposed in this pull request? Fix comment. Change `BroadcastHashJoin.broadcastFuture` to `BroadcastExchange.relationFuture

[GitHub] spark pull request #20597: [MINOR][TEST] Update from 2.2.0 to 2.2.1 in HiveE...

2018-02-24 Thread seancxmao
Github user seancxmao closed the pull request at: https://github.com/apache/spark/pull/20597

[GitHub] spark pull request #20597: [MINOR][TEST] Update from 2.2.0 to 2.2.1 in HiveE...

2018-02-13 Thread seancxmao
GitHub user seancxmao opened a pull request: https://github.com/apache/spark/pull/20597 [MINOR][TEST] Update from 2.2.0 to 2.2.1 in HiveExternalCatalogVersionsSuite ## What changes were proposed in this pull request? In `HiveExternalCatalogVersionsSuite`, latest version of 2.2.x
