[GitHub] spark pull request #22707: [SPARK-25717][SQL] Insert overwrite a recreated e...

2018-12-10 Thread fjh100456
Github user fjh100456 commented on a diff in the pull request: https://github.com/apache/spark/pull/22707#discussion_r240139057 --- Diff: sql/hive/src/main/scala/org/apache/spark/sql/hive/execution/InsertIntoHiveTable.scala --- @@ -227,18 +227,22 @@ case class InsertIntoHiveTable

[GitHub] spark pull request #22707: [SPARK-25717][SQL] Insert overwrite a recreated e...

2018-12-09 Thread fjh100456
Github user fjh100456 commented on a diff in the pull request: https://github.com/apache/spark/pull/22707#discussion_r240077883 --- Diff: sql/hive/src/test/scala/org/apache/spark/sql/hive/InsertSuite.scala --- @@ -774,4 +774,23 @@ class InsertSuite extends QueryTest

[GitHub] spark pull request #22707: [SPARK-25717][SQL] Insert overwrite a recreated e...

2018-12-09 Thread fjh100456
Github user fjh100456 commented on a diff in the pull request: https://github.com/apache/spark/pull/22707#discussion_r240070378 --- Diff: sql/hive/src/main/scala/org/apache/spark/sql/hive/execution/InsertIntoHiveTable.scala --- @@ -227,18 +227,22 @@ case class InsertIntoHiveTable

[GitHub] spark pull request #22707: [SPARK-25717][SQL] Insert overwrite a recreated e...

2018-12-09 Thread fjh100456
Github user fjh100456 commented on a diff in the pull request: https://github.com/apache/spark/pull/22707#discussion_r240068636 --- Diff: sql/hive/src/test/scala/org/apache/spark/sql/hive/InsertSuite.scala --- @@ -774,4 +774,23 @@ class InsertSuite extends QueryTest

[GitHub] spark pull request #22707: [SPARK-25717][SQL] Insert overwrite a recreated e...

2018-12-09 Thread fjh100456
Github user fjh100456 commented on a diff in the pull request: https://github.com/apache/spark/pull/22707#discussion_r240067762 --- Diff: sql/hive/src/test/scala/org/apache/spark/sql/hive/InsertSuite.scala --- @@ -774,4 +774,23 @@ class InsertSuite extends QueryTest

[GitHub] spark issue #22707: [SPARK-25717][SQL] Insert overwrite a recreated external...

2018-12-06 Thread fjh100456
Github user fjh100456 commented on the issue: https://github.com/apache/spark/pull/22707 Are there any more suggestions? @wangyum @viirya --- - To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org

[GitHub] spark issue #22693: [SPARK-25701][SQL] Supports calculation of table statist...

2018-11-05 Thread fjh100456
Github user fjh100456 commented on the issue: https://github.com/apache/spark/pull/22693 This may not be a good solution, and a better one has been provided by @wangyum. Closing it.

[GitHub] spark pull request #22693: [SPARK-25701][SQL] Supports calculation of table ...

2018-11-05 Thread fjh100456
Github user fjh100456 closed the pull request at: https://github.com/apache/spark/pull/22693

[GitHub] spark pull request #22693: [SPARK-25701][SQL] Supports calculation of table ...

2018-11-05 Thread fjh100456
Github user fjh100456 commented on a diff in the pull request: https://github.com/apache/spark/pull/22693#discussion_r230726170 --- Diff: sql/hive/src/main/scala/org/apache/spark/sql/hive/HiveStrategies.scala --- @@ -115,26 +116,45 @@ class ResolveHiveSerdeTable(session

[GitHub] spark pull request #22707: [SPARK-25717][SQL] Insert overwrite a recreated e...

2018-10-16 Thread fjh100456
Github user fjh100456 commented on a diff in the pull request: https://github.com/apache/spark/pull/22707#discussion_r225756219 --- Diff: sql/hive/src/main/scala/org/apache/spark/sql/hive/execution/InsertIntoHiveTable.scala --- @@ -227,18 +227,22 @@ case class InsertIntoHiveTable

[GitHub] spark issue #22707: [SPARK-25717][SQL] Insert overwrite a recreated external...

2018-10-14 Thread fjh100456
Github user fjh100456 commented on the issue: https://github.com/apache/spark/pull/22707 @wangyum I have added the test. Thank you.

[GitHub] spark pull request #22707: [SPARK-25717][SQL] Insert overwrite a recreated e...

2018-10-12 Thread fjh100456
GitHub user fjh100456 opened a pull request: https://github.com/apache/spark/pull/22707 [SPARK-25717][SQL] Insert overwrite a recreated external and partitioned table may result in incorrect query results ## What changes were proposed in this pull request? Consider

[GitHub] spark pull request #22693: [SPARK-25701][SQL] Supports calculation of table ...

2018-10-11 Thread fjh100456
GitHub user fjh100456 opened a pull request: https://github.com/apache/spark/pull/22693 [SPARK-25701][SQL] Supports calculation of table statistics from partition's catalog statistics. ## What changes were proposed in this pull request? When determine table statistics

[GitHub] spark pull request #22641: [SPARK-25611][SPARK-25612][SQL][TESTS] Improve te...

2018-10-09 Thread fjh100456
Github user fjh100456 commented on a diff in the pull request: https://github.com/apache/spark/pull/22641#discussion_r223914298 --- Diff: sql/hive/src/test/scala/org/apache/spark/sql/hive/CompressionCodecSuite.scala --- @@ -262,7 +261,10 @@ class CompressionCodecSuite extends

[GitHub] spark issue #22641: [SPARK-25611][SPARK-25612][SQL][TESTS] Improve test run ...

2018-10-09 Thread fjh100456
Github user fjh100456 commented on the issue: https://github.com/apache/spark/pull/22641 I think random tests are not a good solution. If the use case does run too slowly, it may be much better to reduce codecs. We have done unit tests on the codec priority

[GitHub] spark pull request #22412: [SPARK-25404][SQL] Staging path may not on the ex...

2018-09-14 Thread fjh100456
Github user fjh100456 commented on a diff in the pull request: https://github.com/apache/spark/pull/22412#discussion_r217613124 --- Diff: sql/hive/src/main/scala/org/apache/spark/sql/hive/execution/SaveAsHiveFile.scala --- @@ -217,12 +217,7 @@ private[hive] trait SaveAsHiveFile

[GitHub] spark pull request #22412: [SPARK-25404][SQL] Staging path may not on the ex...

2018-09-13 Thread fjh100456
GitHub user fjh100456 opened a pull request: https://github.com/apache/spark/pull/22412 [SPARK-25404][SQL] Staging path may not on the expected place when table path contains the stagingDir string ## What changes were proposed in this pull request? As described in [#SPARK

[GitHub] spark issue #22302: [SPARK-21786][SQL][FOLLOWUP] Add compressionCodec test f...

2018-09-05 Thread fjh100456
Github user fjh100456 commented on the issue: https://github.com/apache/spark/pull/22302 The PR [#20120](https://github.com/apache/spark/pull/20120) was closed; [SPARK-23355](https://issues.apache.org/jira/browse/SPARK-23355) resolved it instead. Does it mean the same thing

[GitHub] spark issue #22302: [SPARK-21786][SQL][FOLLOWUP] Add compressionCodec test f...

2018-09-05 Thread fjh100456
Github user fjh100456 commented on the issue: https://github.com/apache/spark/pull/22302 @dongjoon-hyun @maropu I'm so sorry, and I have changed the description. Is it OK now?

[GitHub] spark issue #22302: [SPARK-21786][SQL][FOLLOWUP] Add compressionCodec test f...

2018-09-02 Thread fjh100456
Github user fjh100456 commented on the issue: https://github.com/apache/spark/pull/22302 @maropu I'd update the PR description, thank you!

[GitHub] spark pull request #22302: [SPARK-21786][SQL][FOLLOWUP] Add compressionCodec...

2018-08-31 Thread fjh100456
GitHub user fjh100456 opened a pull request: https://github.com/apache/spark/pull/22302 [SPARK-21786][SQL][FOLLOWUP] Add compressionCodec test for CTAS What changes were proposed in this pull request? Since resolved by @DongJoon in 20522, compressionCodec test for CTAS has been

[GitHub] spark issue #22301: [SPARK-21786][SQL][FOLLOWUP] Add compressionCodec test f...

2018-08-31 Thread fjh100456
Github user fjh100456 commented on the issue: https://github.com/apache/spark/pull/22301 Ok, I will do it.

[GitHub] spark pull request #22301: [SPARK-21786][SQL][FOLLOWUP] Add compressionCodec...

2018-08-31 Thread fjh100456
Github user fjh100456 closed the pull request at: https://github.com/apache/spark/pull/22301

[GitHub] spark pull request #22301: [SPARK-21786][SQL][FOLLOWUP] Add compressionCodec...

2018-08-31 Thread fjh100456
GitHub user fjh100456 opened a pull request: https://github.com/apache/spark/pull/22301 [SPARK-21786][SQL][FOLLOWUP] Add compressionCodec test for CTAS What changes were proposed in this pull request? Since resolved by @dongjoon in [20522](https://github.com/apache/spark/pull

[GitHub] spark pull request #20087: [SPARK-21786][SQL] The 'spark.sql.parquet.compres...

2018-01-22 Thread fjh100456
Github user fjh100456 commented on a diff in the pull request: https://github.com/apache/spark/pull/20087#discussion_r163150428 --- Diff: sql/hive/src/main/scala/org/apache/spark/sql/hive/execution/SaveAsHiveFile.scala --- @@ -55,18 +55,28 @@ private[hive] trait SaveAsHiveFile

[GitHub] spark pull request #20087: [SPARK-21786][SQL] The 'spark.sql.parquet.compres...

2018-01-22 Thread fjh100456
Github user fjh100456 commented on a diff in the pull request: https://github.com/apache/spark/pull/20087#discussion_r163132078 --- Diff: sql/hive/src/main/scala/org/apache/spark/sql/hive/execution/SaveAsHiveFile.scala --- @@ -55,18 +55,28 @@ private[hive] trait SaveAsHiveFile

[GitHub] spark pull request #20087: [SPARK-21786][SQL] The 'spark.sql.parquet.compres...

2018-01-21 Thread fjh100456
Github user fjh100456 commented on a diff in the pull request: https://github.com/apache/spark/pull/20087#discussion_r162836010 --- Diff: sql/hive/src/main/scala/org/apache/spark/sql/hive/execution/SaveAsHiveFile.scala --- @@ -55,18 +55,28 @@ private[hive] trait SaveAsHiveFile

[GitHub] spark pull request #20087: [SPARK-21786][SQL] The 'spark.sql.parquet.compres...

2018-01-20 Thread fjh100456
Github user fjh100456 commented on a diff in the pull request: https://github.com/apache/spark/pull/20087#discussion_r162779218 --- Diff: sql/hive/src/test/scala/org/apache/spark/sql/hive/CompressionCodecSuite.scala --- @@ -0,0 +1,354 @@ +/* + * Licensed to the Apache

[GitHub] spark issue #20087: [SPARK-21786][SQL] The 'spark.sql.parquet.compression.co...

2018-01-19 Thread fjh100456
Github user fjh100456 commented on the issue: https://github.com/apache/spark/pull/20087 @gatorsmile I have fixed the test case for CTAS, but it may not pass until the code from your PR #20120 is merged

[GitHub] spark pull request #20087: [SPARK-21786][SQL] The 'spark.sql.parquet.compres...

2018-01-19 Thread fjh100456
Github user fjh100456 commented on a diff in the pull request: https://github.com/apache/spark/pull/20087#discussion_r162563141 --- Diff: sql/hive/src/test/scala/org/apache/spark/sql/hive/CompressionCodecSuite.scala --- @@ -0,0 +1,321 @@ +/* + * Licensed to the Apache

[GitHub] spark pull request #20087: [SPARK-21786][SQL] The 'spark.sql.parquet.compres...

2018-01-18 Thread fjh100456
Github user fjh100456 commented on a diff in the pull request: https://github.com/apache/spark/pull/20087#discussion_r162551602 --- Diff: sql/hive/src/test/scala/org/apache/spark/sql/hive/CompressionCodecSuite.scala --- @@ -0,0 +1,321 @@ +/* + * Licensed to the Apache

[GitHub] spark pull request #20087: [SPARK-21786][SQL] The 'spark.sql.parquet.compres...

2018-01-18 Thread fjh100456
Github user fjh100456 commented on a diff in the pull request: https://github.com/apache/spark/pull/20087#discussion_r162528298 --- Diff: sql/hive/src/test/scala/org/apache/spark/sql/hive/CompressionCodecSuite.scala --- @@ -0,0 +1,321 @@ +/* + * Licensed to the Apache

[GitHub] spark pull request #20087: [SPARK-21786][SQL] The 'spark.sql.parquet.compres...

2018-01-18 Thread fjh100456
Github user fjh100456 commented on a diff in the pull request: https://github.com/apache/spark/pull/20087#discussion_r162520864 --- Diff: sql/hive/src/test/scala/org/apache/spark/sql/hive/CompressionCodecSuite.scala --- @@ -0,0 +1,321 @@ +/* + * Licensed to the Apache

[GitHub] spark pull request #20087: [SPARK-21786][SQL] The 'spark.sql.parquet.compres...

2018-01-18 Thread fjh100456
Github user fjh100456 commented on a diff in the pull request: https://github.com/apache/spark/pull/20087#discussion_r162519320 --- Diff: sql/hive/src/test/scala/org/apache/spark/sql/hive/CompressionCodecSuite.scala --- @@ -0,0 +1,321 @@ +/* + * Licensed to the Apache

[GitHub] spark issue #20087: [SPARK-21786][SQL] The 'spark.sql.parquet.compression.co...

2018-01-10 Thread fjh100456
Github user fjh100456 commented on the issue: https://github.com/apache/spark/pull/20087 @gatorsmile I've changed the precedence. For Parquet, if `hive.exec.compress.output` is true, the old precedence is kept; otherwise, the compression is taken from HiveOption. For ORC, because

[GitHub] spark issue #20087: [SPARK-21786][SQL] The 'spark.sql.parquet.compression.co...

2018-01-08 Thread fjh100456
Github user fjh100456 commented on the issue: https://github.com/apache/spark/pull/20087 Retest please. @gatorsmile

[GitHub] spark pull request #20087: [SPARK-21786][SQL] The 'spark.sql.parquet.compres...

2018-01-07 Thread fjh100456
Github user fjh100456 commented on a diff in the pull request: https://github.com/apache/spark/pull/20087#discussion_r160086482 --- Diff: sql/hive/src/main/scala/org/apache/spark/sql/hive/execution/SaveAsHiveFile.scala --- @@ -68,6 +68,10 @@ private[hive] trait SaveAsHiveFile

[GitHub] spark pull request #20076: [SPARK-21786][SQL] When acquiring 'compressionCod...

2018-01-04 Thread fjh100456
Github user fjh100456 commented on a diff in the pull request: https://github.com/apache/spark/pull/20076#discussion_r159802320 --- Diff: sql/core/src/main/scala/org/apache/spark/sql/execution/datasources/parquet/ParquetOptions.scala --- @@ -27,7 +28,7 @@ import

[GitHub] spark issue #20076: [SPARK-21786][SQL] When acquiring 'compressionCodecClass...

2018-01-02 Thread fjh100456
Github user fjh100456 commented on the issue: https://github.com/apache/spark/pull/20076 @gatorsmile I have added two test cases. Please review them. Thank you very much.

[GitHub] spark pull request #20076: [SPARK-21786][SQL] When acquiring 'compressionCod...

2018-01-02 Thread fjh100456
Github user fjh100456 commented on a diff in the pull request: https://github.com/apache/spark/pull/20076#discussion_r159218648 --- Diff: sql/catalyst/src/main/scala/org/apache/spark/sql/internal/SQLConf.scala --- @@ -364,7 +366,9 @@ object SQLConf { .createWithDefault

[GitHub] spark pull request #20076: [SPARK-21786][SQL] When acquiring 'compressionCod...

2018-01-02 Thread fjh100456
Github user fjh100456 commented on a diff in the pull request: https://github.com/apache/spark/pull/20076#discussion_r159218522 --- Diff: sql/core/src/test/scala/org/apache/spark/sql/CompressionCodecPrecedenceSuite.scala --- @@ -0,0 +1,60 @@ +/* + * Licensed to the Apache

[GitHub] spark pull request #19218: [SPARK-21786][SQL] The 'spark.sql.parquet.compres...

2017-12-26 Thread fjh100456
Github user fjh100456 closed the pull request at: https://github.com/apache/spark/pull/19218

[GitHub] spark issue #19218: [SPARK-21786][SQL] The 'spark.sql.parquet.compression.co...

2017-12-26 Thread fjh100456
Github user fjh100456 commented on the issue: https://github.com/apache/spark/pull/19218 Please go to #20087

[GitHub] spark pull request #20087: [SPARK-21786][SQL] The 'spark.sql.parquet.compres...

2017-12-26 Thread fjh100456
GitHub user fjh100456 opened a pull request: https://github.com/apache/spark/pull/20087 [SPARK-21786][SQL] The 'spark.sql.parquet.compression.codec' and 'spark.sql.orc.compression.codec' configuration doesn't take effect on hive table writing [SPARK-21786][SQL

[GitHub] spark issue #19218: [SPARK-21786][SQL] The 'spark.sql.parquet.compression.co...

2017-12-26 Thread fjh100456
Github user fjh100456 commented on the issue: https://github.com/apache/spark/pull/19218 I've finished writing the test case for a table containing mixed compression codecs. But I made a mistake: the original branch was deleted by accident, so I will close this PR and create

[GitHub] spark issue #20076: [SPARK-21786][SQL] When acquiring 'compressionCodecClass...

2017-12-26 Thread fjh100456
Github user fjh100456 commented on the issue: https://github.com/apache/spark/pull/20076 @gatorsmile I tested it manually and found that a table-level compression property coming from SQL like the statement below still does not take effect, even though passing table properties to a `hadoopConf

[GitHub] spark issue #20076: [SPARK-21786][SQL] When acquiring 'compressionCodecClass...

2017-12-26 Thread fjh100456
Github user fjh100456 commented on the issue: https://github.com/apache/spark/pull/20076 Does it mean what we do in the test case of the other PR, #19218? @gatorsmile

[GitHub] spark issue #20076: [SPARK-21786][SQL] When acquiring 'compressionCodecClass...

2017-12-25 Thread fjh100456
Github user fjh100456 commented on the issue: https://github.com/apache/spark/pull/20076 Well, I'll revert the renaming. Any comments? @gatorsmile

[GitHub] spark pull request #20076: [SPARK-21786][SQL] When acquiring 'compressionCod...

2017-12-25 Thread fjh100456
Github user fjh100456 commented on a diff in the pull request: https://github.com/apache/spark/pull/20076#discussion_r158632847 --- Diff: sql/core/src/main/scala/org/apache/spark/sql/execution/datasources/parquet/ParquetOptions.scala --- @@ -42,8 +43,15 @@ private[parquet] class

[GitHub] spark pull request #20076: [SPARK-21786][SQL] When acquiring 'compressionCod...

2017-12-25 Thread fjh100456
Github user fjh100456 commented on a diff in the pull request: https://github.com/apache/spark/pull/20076#discussion_r158631203 --- Diff: sql/core/src/main/scala/org/apache/spark/sql/execution/datasources/parquet/ParquetOptions.scala --- @@ -42,8 +43,15 @@ private[parquet] class

[GitHub] spark pull request #20076: [SPARK-21786][SQL] When acquiring 'compressionCod...

2017-12-25 Thread fjh100456
Github user fjh100456 commented on a diff in the pull request: https://github.com/apache/spark/pull/20076#discussion_r158628479 --- Diff: sql/core/src/main/scala/org/apache/spark/sql/execution/datasources/parquet/ParquetOptions.scala --- @@ -42,8 +43,15 @@ private[parquet] class

[GitHub] spark issue #20076: [SPARK-21786][SQL] When acquiring 'compressionCodecClass...

2017-12-24 Thread fjh100456
Github user fjh100456 commented on the issue: https://github.com/apache/spark/pull/20076 cc @gatorsmile No orc configuration found in "sql-programming-guide.md", so I did not add the precedence description to `spark.sql.orc.compres

[GitHub] spark pull request #20076: [SPARK-21786][SQL] When acquiring 'compressionCod...

2017-12-24 Thread fjh100456
GitHub user fjh100456 opened a pull request: https://github.com/apache/spark/pull/20076 [SPARK-21786][SQL] When acquiring 'compressionCodecClassName' in 'ParquetOptions', `parquet.compression` needs to be considered. [SPARK-21786][SQL] When acquiring 'compressionCodecClassName

[GitHub] spark issue #19218: [SPARK-21786][SQL] The 'spark.sql.parquet.compression.co...

2017-12-23 Thread fjh100456
Github user fjh100456 commented on the issue: https://github.com/apache/spark/pull/19218 I mean several files with different compression codecs under the same table directory, like below: ![image](https://user-images.githubusercontent.com/26785576/34318192-6891bf5a-e7fc-11e7-82b9

[GitHub] spark issue #19218: [SPARK-21786][SQL] The 'spark.sql.parquet.compression.co...

2017-12-22 Thread fjh100456
Github user fjh100456 commented on the issue: https://github.com/apache/spark/pull/19218 @gatorsmile I've tested manually. When table-level compression is not configured, it always takes the session-level compression and ignores the compression of the existing files. This seems like a bug, however

[GitHub] spark pull request #19218: [SPARK-21786][SQL] The 'spark.sql.parquet.compres...

2017-12-22 Thread fjh100456
Github user fjh100456 commented on a diff in the pull request: https://github.com/apache/spark/pull/19218#discussion_r158574986 --- Diff: sql/core/src/main/scala/org/apache/spark/sql/execution/datasources/parquet/ParquetOptions.scala --- @@ -42,8 +43,15 @@ private[parquet] class

[GitHub] spark pull request #19218: [SPARK-21786][SQL] The 'spark.sql.parquet.compres...

2017-12-22 Thread fjh100456
Github user fjh100456 commented on a diff in the pull request: https://github.com/apache/spark/pull/19218#discussion_r158492627 --- Diff: sql/hive/src/test/scala/org/apache/spark/sql/hive/InsertSuite.scala --- @@ -35,7 +39,7 @@ case class TestData(key: Int, value: String

[GitHub] spark pull request #19218: [SPARK-21786][SQL] The 'spark.sql.parquet.compres...

2017-12-22 Thread fjh100456
Github user fjh100456 commented on a diff in the pull request: https://github.com/apache/spark/pull/19218#discussion_r158460931 --- Diff: sql/hive/src/test/scala/org/apache/spark/sql/hive/InsertSuite.scala --- @@ -35,7 +39,7 @@ case class TestData(key: Int, value: String

[GitHub] spark pull request #19218: [SPARK-21786][SQL] The 'spark.sql.parquet.compres...

2017-12-22 Thread fjh100456
Github user fjh100456 commented on a diff in the pull request: https://github.com/apache/spark/pull/19218#discussion_r158459550 --- Diff: sql/hive/src/main/scala/org/apache/spark/sql/hive/execution/HiveOptions.scala --- @@ -102,4 +111,18 @@ object HiveOptions

[GitHub] spark pull request #19218: [SPARK-21786][SQL] The 'spark.sql.parquet.compres...

2017-12-22 Thread fjh100456
Github user fjh100456 commented on a diff in the pull request: https://github.com/apache/spark/pull/19218#discussion_r158458747 --- Diff: sql/hive/src/main/scala/org/apache/spark/sql/hive/execution/HiveOptions.scala --- @@ -102,4 +111,18 @@ object HiveOptions

[GitHub] spark pull request #19218: [SPARK-21786][SQL] The 'spark.sql.parquet.compres...

2017-12-22 Thread fjh100456
Github user fjh100456 commented on a diff in the pull request: https://github.com/apache/spark/pull/19218#discussion_r158458003 --- Diff: sql/hive/src/main/scala/org/apache/spark/sql/hive/execution/HiveOptions.scala --- @@ -19,7 +19,16 @@ package

[GitHub] spark pull request #19218: [SPARK-21786][SQL] The 'spark.sql.parquet.compres...

2017-12-22 Thread fjh100456
Github user fjh100456 commented on a diff in the pull request: https://github.com/apache/spark/pull/19218#discussion_r158457806 --- Diff: sql/core/src/main/scala/org/apache/spark/sql/execution/datasources/parquet/ParquetOptions.scala --- @@ -42,8 +43,15 @@ private[parquet] class

[GitHub] spark issue #19218: [SPARK-21786][SQL] The 'spark.sql.parquet.compression.co...

2017-12-21 Thread fjh100456
Github user fjh100456 commented on the issue: https://github.com/apache/spark/pull/19218 @gatorsmile Could you help to review it? Thanks very much!

[GitHub] spark issue #19218: [SPARK-21786][SQL] The 'spark.sql.parquet.compression.co...

2017-12-19 Thread fjh100456
Github user fjh100456 commented on the issue: https://github.com/apache/spark/pull/19218 @gatorsmile @maropu Does it look better now? About the statistics issue, are there any suggestions? @SparkQA Please start the test, thanks

[GitHub] spark issue #19218: [SPARK-21786][SQL] The 'spark.sql.parquet.compression.co...

2017-12-18 Thread fjh100456
Github user fjh100456 commented on the issue: https://github.com/apache/spark/pull/19218 I will change the code following the suggestion of @gatorsmile. It's a little busy these days; I will do it tomorrow

[GitHub] spark issue #19218: [SPARK-21786][SQL] The 'spark.sql.parquet.compression.co...

2017-12-05 Thread fjh100456
Github user fjh100456 commented on the issue: https://github.com/apache/spark/pull/19218 @gatorsmile Thank you very much. I'll fix all the problems later. Also, I'm not very clear what you mean by 'workload'. I changed the test to a benchmark over these two days, and the code

[GitHub] spark issue #19218: [SPARK-21786][SQL] The 'spark.sql.parquet.compression.co...

2017-11-28 Thread fjh100456
Github user fjh100456 commented on the issue: https://github.com/apache/spark/pull/19218 @gatorsmile I've tested the performance of the 'uncompressed', 'snappy', and 'gzip' compression algorithms for Parquet. The input data volumes are 22MB, 220MB, and 1100MB, each run 10 times; finally

[GitHub] spark issue #19218: [SPARK-21786][SQL] The 'spark.sql.parquet.compression.co...

2017-11-01 Thread fjh100456
Github user fjh100456 commented on the issue: https://github.com/apache/spark/pull/19218 cc @gatorsmile @dongjoon-hyun Is it ok now?

[GitHub] spark pull request #19218: [SPARK-21786][SQL] The 'spark.sql.parquet.compres...

2017-10-12 Thread fjh100456
Github user fjh100456 commented on a diff in the pull request: https://github.com/apache/spark/pull/19218#discussion_r144202674 --- Diff: sql/hive/src/test/scala/org/apache/spark/sql/hive/InsertSuite.scala --- @@ -728,4 +732,254 @@ class InsertSuite extends QueryTest

[GitHub] spark issue #19218: [SPARK-21786][SQL] The 'spark.sql.parquet.compression.co...

2017-10-11 Thread fjh100456
Github user fjh100456 commented on the issue: https://github.com/apache/spark/pull/19218 cc @gatorsmile @dongjoon-hyun Is it ok now?

[GitHub] spark issue #6751: [SPARK-8300] DataFrame hint for broadcast join.

2017-10-11 Thread fjh100456
Github user fjh100456 commented on the issue: https://github.com/apache/spark/pull/6751 With `/*+ broadcast(table) */`, it works well, thank you very much.

[GitHub] spark issue #6751: [SPARK-8300] DataFrame hint for broadcast join.

2017-10-10 Thread fjh100456
Github user fjh100456 commented on the issue: https://github.com/apache/spark/pull/6751 Is there an example? I use broadcast like the following, but it produces an error. Would you be so kind as to show me an example? >spark-sql> select a.* from tableA a left oute

[GitHub] spark issue #6751: [SPARK-8300] DataFrame hint for broadcast join.

2017-10-10 Thread fjh100456
Github user fjh100456 commented on the issue: https://github.com/apache/spark/pull/6751 @rxin @marmbrus Is there another way to broadcast a table with spark-sql now, other than `spark.sql.autoBroadcastJoinThreshold`? And if not, is it a good way to broadcast table by user

[GitHub] spark pull request #19218: [SPARK-21786][SQL] The 'spark.sql.parquet.compres...

2017-10-09 Thread fjh100456
Github user fjh100456 commented on a diff in the pull request: https://github.com/apache/spark/pull/19218#discussion_r143624224 --- Diff: sql/hive/src/main/scala/org/apache/spark/sql/hive/execution/SaveAsHiveFile.scala --- @@ -68,6 +68,26 @@ private[hive] trait SaveAsHiveFile

[GitHub] spark pull request #19218: [SPARK-21786][SQL] The 'spark.sql.parquet.compres...

2017-10-09 Thread fjh100456
Github user fjh100456 commented on a diff in the pull request: https://github.com/apache/spark/pull/19218#discussion_r143624210 --- Diff: sql/hive/src/main/scala/org/apache/spark/sql/hive/execution/SaveAsHiveFile.scala --- @@ -68,6 +68,26 @@ private[hive] trait SaveAsHiveFile

[GitHub] spark pull request #19218: [SPARK-21786][SQL] The 'spark.sql.parquet.compres...

2017-10-09 Thread fjh100456
Github user fjh100456 commented on a diff in the pull request: https://github.com/apache/spark/pull/19218#discussion_r143624196 --- Diff: sql/hive/src/main/scala/org/apache/spark/sql/hive/execution/SaveAsHiveFile.scala --- @@ -68,6 +68,26 @@ private[hive] trait SaveAsHiveFile

[GitHub] spark pull request #19218: [SPARK-21786][SQL] The 'spark.sql.parquet.compres...

2017-10-09 Thread fjh100456
Github user fjh100456 commented on a diff in the pull request: https://github.com/apache/spark/pull/19218#discussion_r143624181 --- Diff: sql/hive/src/main/scala/org/apache/spark/sql/hive/execution/SaveAsHiveFile.scala --- @@ -68,6 +68,26 @@ private[hive] trait SaveAsHiveFile

[GitHub] spark issue #19218: [SPARK-21786][SQL] The 'spark.sql.parquet.compression.co...

2017-09-27 Thread fjh100456
Github user fjh100456 commented on the issue: https://github.com/apache/spark/pull/19218 @dongjoon-hyun @gatorsmile I've fixed them. Could you help me review it again? Thanks.

[GitHub] spark issue #19218: [SPARK-21786][SQL] The 'spark.sql.parquet.compression.co...

2017-09-26 Thread fjh100456
Github user fjh100456 commented on the issue: https://github.com/apache/spark/pull/19218 @dongjoon-hyun Thank you very much, I'll fix them as soon as possible.

[GitHub] spark issue #19218: [SPARK-21786][SQL] The 'spark.sql.parquet.compression.co...

2017-09-20 Thread fjh100456
Github user fjh100456 commented on the issue: https://github.com/apache/spark/pull/19218 @gatorsmile @dongjoon-hyun I've fixed it. Could you help me review it again? Thanks.

[GitHub] spark issue #19218: [SPARK-21786][SQL] The 'spark.sql.parquet.compression.co...

2017-09-18 Thread fjh100456
Github user fjh100456 commented on the issue: https://github.com/apache/spark/pull/19218 Thanks for your review. @gatorsmile In the first question I mean that 'parquet.compression' can be found in the `table: Tabledesc` (maybe similar to `catalogtable`), and can also

[GitHub] spark issue #19218: [SPARK-21786][SQL] The 'spark.sql.parquet.compression.co...

2017-09-18 Thread fjh100456
Github user fjh100456 commented on the issue: https://github.com/apache/spark/pull/19218 I encountered two problems: 1. I tried to fix it in the order 'compression' > 'parquet.compression' > 'spark.sql.parquet.compression.codec', but found 'parquet.compression' may com
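
The precedence order tried in the comment above can be sketched roughly as follows. This is only an illustration, not the code from the pull request: the helper name `resolve_parquet_codec` is hypothetical, and it assumes all three settings have already been collected into a single flat mapping, which is a simplification of how Spark actually carries write options, table properties, and session configuration.

```python
from typing import Mapping, Optional

# Keys listed from highest to lowest precedence, as in the comment above.
PRECEDENCE = (
    "compression",
    "parquet.compression",
    "spark.sql.parquet.compression.codec",
)


def resolve_parquet_codec(settings: Mapping[str, str]) -> Optional[str]:
    """Hypothetical helper: return the first configured codec, lower-cased.

    `settings` is an assumed flat view of the three sources; a missing or
    empty value falls through to the next key in PRECEDENCE.
    """
    for key in PRECEDENCE:
        value = settings.get(key)
        if value:
            return value.lower()
    return None
```

With this sketch, a mapping that sets both 'parquet.compression' and 'spark.sql.parquet.compression.codec' resolves to the former, and an empty mapping resolves to `None`, leaving the default to the caller.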

[GitHub] spark issue #19218: [SPARK-21786][SQL] The 'spark.sql.parquet.compression.co...

2017-09-16 Thread fjh100456
Github user fjh100456 commented on the issue: https://github.com/apache/spark/pull/19218 @dongjoon-hyun I have encountered a problem. There are two ways to specify the compression format: 1. CREATE TABLE Test(id int) STORED AS ORC TBLPROPERTIES ('orc.compress'='SNAPPY

[GitHub] spark issue #19218: [SPARK-21786][SQL] The 'spark.sql.parquet.compression.co...

2017-09-15 Thread fjh100456
Github user fjh100456 commented on the issue: https://github.com/apache/spark/pull/19218 @dongjoon-hyun Thank you very much, I'll fix it now.

[GitHub] spark issue #19218: [SPARK-21786][SQL] The 'spark.sql.parquet.compression.co...

2017-09-15 Thread fjh100456
Github user fjh100456 commented on the issue: https://github.com/apache/spark/pull/19218 @gatorsmile 1. Non-partitioned tables do not have this problem, 'spark.sql.parquet.compression.codec' can take effect normally, because the process of writing data differs from

[GitHub] spark issue #19218: [SPARK-21786][SQL] The 'spark.sql.parquet.compression.co...

2017-09-14 Thread fjh100456
Github user fjh100456 commented on the issue: https://github.com/apache/spark/pull/19218 cc @maropu I have added the test. However, all of my local use cases do not work properly, so I'm not sure if the new use case will pass, but I will always be concerned

[GitHub] spark pull request #19218: [SPARK-21786][SQL] The 'spark.sql.parquet.compres...

2017-09-13 Thread fjh100456
GitHub user fjh100456 opened a pull request: https://github.com/apache/spark/pull/19218 [SPARK-21786][SQL] The 'spark.sql.parquet.compression.codec' configuration doesn't take effect on tables with partition field(s) [SPARK-21786][SQL] The 'spark.sql.parquet.compression.codec

[GitHub] spark pull request #19189: [SPARK-21786][SQL] The 'spark.sql.parquet.compres...

2017-09-11 Thread fjh100456
Github user fjh100456 closed the pull request at: https://github.com/apache/spark/pull/19189 --- - To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org For additional commands, e-mail: reviews-h...@spark.apache.org

[GitHub] spark pull request #19189: [SPARK-21786][SQL] The 'spark.sql.parquet.compres...

2017-09-11 Thread fjh100456
GitHub user fjh100456 opened a pull request: https://github.com/apache/spark/pull/19189 [SPARK-21786][SQL] The 'spark.sql.parquet.compression.codec' configuration doesn't take effect on tables with partition field(s) ## What changes were proposed in this pull request? Pass

[GitHub] spark issue #18351: [SPARK-21135][WEB UI] On history server page,duration ...

2017-06-28 Thread fjh100456
Github user fjh100456 commented on the issue: https://github.com/apache/spark/pull/18351 Hi @srowen, what's your opinion about this issue? There's been no progress for days. --- If your project is set up for it, you can reply to this email and have your reply appear on GitHub

[GitHub] spark issue #18351: [SPARK-21135][WEB UI] On history server page,duration ...

2017-06-26 Thread fjh100456
Github user fjh100456 commented on the issue: https://github.com/apache/spark/pull/18351 Solving this problem completely seems to require a big change, which may not be worth making for such a small problem, so I think we should hide it. cc @srowen

[GitHub] spark issue #18351: [SPARK-21135][WEB UI] On history server page,duration ...

2017-06-22 Thread fjh100456
Github user fjh100456 commented on the issue: https://github.com/apache/spark/pull/18351 According to Jerryshao, I should switch to using "Currenttime - StartTime", although in some scenarios the value is wrong. But I tend to hide the duration for incomplete ap
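
The "hide the duration" option discussed in this thread can be sketched in a few lines of Python. This is only an illustration under stated assumptions, not the history server's code: the function name `display_duration` is hypothetical, and a missing end time is assumed to be represented as `None`.

```python
from typing import Optional


def display_duration(start_ms: int, end_ms: Optional[int]) -> Optional[str]:
    """Return a human-readable duration, or None when it should be hidden.

    Hypothetical helper: an incomplete application is assumed to have no
    end time (None), so no duration is shown instead of a misleading 0.
    """
    if end_ms is None or end_ms < start_ms:
        return None  # incomplete or inconsistent: hide the value
    total_seconds = (end_ms - start_ms) // 1000
    minutes, seconds = divmod(total_seconds, 60)
    return f"{minutes} min {seconds} s"


if __name__ == "__main__":
    print(display_duration(0, 125_000))   # completed application
    print(display_duration(1_000, None))  # incomplete application
```

The alternative raised in the thread, computing the current time minus the start time, would replace the `None` branch; the concern voiced there is that it keeps growing for applications that were killed and never recorded an end time.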

[GitHub] spark issue #18351: [SPARK-21135][WEB UI] On history server page,duration ...

2017-06-20 Thread fjh100456
Github user fjh100456 commented on the issue: https://github.com/apache/spark/pull/18351 Oops, I didn't notice the comment. But it seems that if we don't change the "lastUpdated", it would just be a "Currenttime". Even so, I'd like to ask for other contribu

[GitHub] spark issue #18351: [SPARK-21135][WEB UI] On history server page,duration ...

2017-06-20 Thread fjh100456
Github user fjh100456 commented on the issue: https://github.com/apache/spark/pull/18351 Sorry, I forgot to explain that by “Lastupdated” I mean the value of "Filestatus.Getmodificationtime". Maybe the value of the lastUpdated field needs to change too.

[GitHub] spark issue #18351: [SPARK-21135][WEB UI] On history server page,duration ...

2017-06-20 Thread fjh100456
Github user fjh100456 commented on the issue: https://github.com/apache/spark/pull/18351 @jerryshao For example, if you use shell commands to kill the application's submit process from outside, the application is always treated as an "incomplete application" in history

[GitHub] spark issue #18351: [SPARK-21135][WEB UI] On history server page,duration ...

2017-06-20 Thread fjh100456
Github user fjh100456 commented on the issue: https://github.com/apache/spark/pull/18351 @jerryshao I thought about doing it this way, but an application that aborted on an exception will always be treated as an “incompleted application”, if we use “currTimeInMs - startTime”

[GitHub] spark issue #18351: [SPARK-21135][WEB UI] On history server page,duration ...

2017-06-20 Thread fjh100456
Github user fjh100456 commented on the issue: https://github.com/apache/spark/pull/18351 @guoxiaolongzte I have considered this problem, but there is no “EndTime” value for applications still in progress, including the applications aborted on an exception that I mentioned. So in the Scala

[GitHub] spark issue #18351: [SPARK-21135][WEB UI] On history server page,duration ...

2017-06-19 Thread fjh100456
Github user fjh100456 commented on the issue: https://github.com/apache/spark/pull/18351 Yes, it should be. @ajbozarth The screenshot: @zhuoliu ![default](https://user-images.githubusercontent.com/26785576/27312007-89a3eca6-5597-11e7-81fe-7dcff2c2a861.png

[GitHub] spark issue #18351: [SPARK-21135][WEB UI] On history server page,duration ...

2017-06-19 Thread fjh100456
Github user fjh100456 commented on the issue: https://github.com/apache/spark/pull/18351 I have not found a similar problem on other pages. The CompleteTime on this page is already hidden, which indicates that this was considered before. Have you considered the other question

[GitHub] spark pull request #18351: [SPARK-21135][WEB UI] On history server page,du...

2017-06-19 Thread fjh100456
GitHub user fjh100456 opened a pull request: https://github.com/apache/spark/pull/18351 [SPARK-21135][WEB UI] On history server page,duration of incompleted applications should be hidden instead of showing up as 0 ## What changes were proposed in this pull request? Hide
