[GitHub] spark pull request #19449: [SPARK-22219][SQL] Refactor code to get a value f...
Github user asfgit closed the pull request at: https://github.com/apache/spark/pull/19449 --- - To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org For additional commands, e-mail: reviews-h...@spark.apache.org
[GitHub] spark pull request #19449: [SPARK-22219][SQL] Refactor code to get a value f...
Github user gatorsmile commented on a diff in the pull request: https://github.com/apache/spark/pull/19449#discussion_r206760031 --- Diff: sql/core/src/test/scala/org/apache/spark/sql/internal/ExecutorSideSQLConfSuite.scala --- @@ -82,4 +84,22 @@ class ExecutorSideSQLConfSuite extends SparkFunSuite with SQLTestUtils { assert(checks.forall(_ == true)) } } + + test("SPARK-22219: refactor to control to generate comment") { +withSQLConf(StaticSQLConf.CODEGEN_COMMENTS.key -> "false") { + val res = codegenStringSeq(spark.range(10).groupBy(col("id") * 2).count() +.queryExecution.executedPlan) + assert(res.length == 2) + assert(res.forall{ case (_, code) => +!code.contains("* Codegend pipeline") && !code.contains("// input[")}) +} + +withSQLConf(StaticSQLConf.CODEGEN_COMMENTS.key -> "true") { + val res = codegenStringSeq(spark.range(10).groupBy(col("id") * 2).count() +.queryExecution.executedPlan) + assert(res.length == 2) + assert(res.forall{ case (_, code) => +code.contains("* Codegend pipeline") && code.contains("// input[")}) +} --- End diff -- combine these two? ``` Seq(true, false).foreach { flag => ... if (flag) { ... } else { ... } } ``` --- - To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org For additional commands, e-mail: reviews-h...@spark.apache.org
[GitHub] spark pull request #19449: [SPARK-22219][SQL] Refactor code to get a value f...
Github user kiszk commented on a diff in the pull request: https://github.com/apache/spark/pull/19449#discussion_r202869932 --- Diff: sql/catalyst/src/main/scala/org/apache/spark/sql/internal/SQLConf.scala --- @@ -751,6 +751,12 @@ object SQLConf { .booleanConf .createWithDefault(true) + val MAX_CASES_BRANCHES = buildConf("spark.sql.codegen.maxCaseBranches") --- End diff -- Oh..., I will remove this --- - To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org For additional commands, e-mail: reviews-h...@spark.apache.org
[GitHub] spark pull request #19449: [SPARK-22219][SQL] Refactor code to get a value f...
Github user gatorsmile commented on a diff in the pull request: https://github.com/apache/spark/pull/19449#discussion_r202833782 --- Diff: sql/catalyst/src/main/scala/org/apache/spark/sql/internal/SQLConf.scala --- @@ -751,6 +751,12 @@ object SQLConf { .booleanConf .createWithDefault(true) + val MAX_CASES_BRANCHES = buildConf("spark.sql.codegen.maxCaseBranches") --- End diff -- ? --- - To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org For additional commands, e-mail: reviews-h...@spark.apache.org
[GitHub] spark pull request #19449: [SPARK-22219][SQL] Refactor code to get a value f...
Github user gatorsmile commented on a diff in the pull request: https://github.com/apache/spark/pull/19449#discussion_r143385878 --- Diff: sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/expressions/codegen/CodeGenerator.scala --- @@ -929,7 +929,7 @@ class CodegenContext { // be extremely expensive in certain cases, such as deeply-nested expressions which operate over // inputs with wide schemas. For more details on the performance issues that motivated this // flat, see SPARK-15680. -if (SparkEnv.get != null && SparkEnv.get.conf.getBoolean("spark.sql.codegen.comments", false)) { --- End diff -- So far, I do not have a bandwidth to fix it. If anybody is interested in this, please feel free to start it. This requires a design doc at first. --- - To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org For additional commands, e-mail: reviews-h...@spark.apache.org
[GitHub] spark pull request #19449: [SPARK-22219][SQL] Refactor code to get a value f...
Github user kiszk commented on a diff in the pull request: https://github.com/apache/spark/pull/19449#discussion_r143351534 --- Diff: sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/expressions/codegen/CodeGenerator.scala --- @@ -929,7 +929,7 @@ class CodegenContext { // be extremely expensive in certain cases, such as deeply-nested expressions which operate over // inputs with wide schemas. For more details on the performance issues that motivated this // flat, see SPARK-15680. -if (SparkEnv.get != null && SparkEnv.get.conf.getBoolean("spark.sql.codegen.comments", false)) { --- End diff -- Good to hear that @tejasapatil Could you please let us know when it is available as a PR? --- - To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org For additional commands, e-mail: reviews-h...@spark.apache.org
[GitHub] spark pull request #19449: [SPARK-22219][SQL] Refactor code to get a value f...
Github user gatorsmile commented on a diff in the pull request: https://github.com/apache/spark/pull/19449#discussion_r143345610 --- Diff: sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/expressions/codegen/CodeGenerator.scala --- @@ -929,7 +929,7 @@ class CodegenContext { // be extremely expensive in certain cases, such as deeply-nested expressions which operate over // inputs with wide schemas. For more details on the performance issues that motivated this // flat, see SPARK-15680. -if (SparkEnv.get != null && SparkEnv.get.conf.getBoolean("spark.sql.codegen.comments", false)) { --- End diff -- @tejasapatil Maybe just do nothing now. We need to fix the bugs in active session maintenance first. --- - To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org For additional commands, e-mail: reviews-h...@spark.apache.org
[GitHub] spark pull request #19449: [SPARK-22219][SQL] Refactor code to get a value f...
Github user gatorsmile commented on a diff in the pull request: https://github.com/apache/spark/pull/19449#discussion_r143345568 --- Diff: sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/expressions/codegen/CodeGenerator.scala --- @@ -929,7 +929,7 @@ class CodegenContext { // be extremely expensive in certain cases, such as deeply-nested expressions which operate over // inputs with wide schemas. For more details on the performance issues that motivated this // flat, see SPARK-15680. -if (SparkEnv.get != null && SparkEnv.get.conf.getBoolean("spark.sql.codegen.comments", false)) { --- End diff -- I am afraid it might require a lot of code changes if we add SQLConf to `CodegenContext`. --- - To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org For additional commands, e-mail: reviews-h...@spark.apache.org
[GitHub] spark pull request #19449: [SPARK-22219][SQL] Refactor code to get a value f...
Github user viirya commented on a diff in the pull request: https://github.com/apache/spark/pull/19449#discussion_r143326066 --- Diff: sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/expressions/codegen/CodeGenerator.scala --- @@ -929,7 +929,7 @@ class CodegenContext { // be extremely expensive in certain cases, such as deeply-nested expressions which operate over // inputs with wide schemas. For more details on the performance issues that motivated this // flat, see SPARK-15680. --- End diff -- We should move these comments to the SQLConf description. --- - To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org For additional commands, e-mail: reviews-h...@spark.apache.org
[GitHub] spark pull request #19449: [SPARK-22219][SQL] Refactor code to get a value f...
Github user kiszk commented on a diff in the pull request: https://github.com/apache/spark/pull/19449#discussion_r143318406 --- Diff: sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/expressions/codegen/CodeGenerator.scala --- @@ -929,7 +929,7 @@ class CodegenContext { // be extremely expensive in certain cases, such as deeply-nested expressions which operate over // inputs with wide schemas. For more details on the performance issues that motivated this // flat, see SPARK-15680. -if (SparkEnv.get != null && SparkEnv.get.conf.getBoolean("spark.sql.codegen.comments", false)) { --- End diff -- Thank you for your comments. Is it better to pass `SQLConf` to the constructor of `CodegenContext`? --- - To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org For additional commands, e-mail: reviews-h...@spark.apache.org
[GitHub] spark pull request #19449: [SPARK-22219][SQL] Refactor code to get a value f...
Github user tejasapatil commented on a diff in the pull request: https://github.com/apache/spark/pull/19449#discussion_r143312237 --- Diff: sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/expressions/codegen/CodeGenerator.scala --- @@ -929,7 +929,7 @@ class CodegenContext { // be extremely expensive in certain cases, such as deeply-nested expressions which operate over // inputs with wide schemas. For more details on the performance issues that motivated this // flat, see SPARK-15680. -if (SparkEnv.get != null && SparkEnv.get.conf.getBoolean("spark.sql.codegen.comments", false)) { --- End diff -- There are places in codebase which do `SQLConf.get` (esp. `CodeGenerator.scala` which is being modified in this PR). Are you suggesting to change that everywhere like #18568 ? I think it will be good thing to do. --- - To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org For additional commands, e-mail: reviews-h...@spark.apache.org
[GitHub] spark pull request #19449: [SPARK-22219][SQL] Refactor code to get a value f...
Github user gatorsmile commented on a diff in the pull request: https://github.com/apache/spark/pull/19449#discussion_r143307261 --- Diff: sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/expressions/codegen/CodeGenerator.scala --- @@ -929,7 +929,7 @@ class CodegenContext { // be extremely expensive in certain cases, such as deeply-nested expressions which operate over // inputs with wide schemas. For more details on the performance issues that motivated this // flat, see SPARK-15680. -if (SparkEnv.get != null && SparkEnv.get.conf.getBoolean("spark.sql.codegen.comments", false)) { --- End diff -- See the PR: https://github.com/apache/spark/pull/18568 --- - To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org For additional commands, e-mail: reviews-h...@spark.apache.org
[GitHub] spark pull request #19449: [SPARK-22219][SQL] Refactor code to get a value f...
Github user gatorsmile commented on a diff in the pull request: https://github.com/apache/spark/pull/19449#discussion_r143306861 --- Diff: sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/expressions/codegen/CodeGenerator.scala --- @@ -929,7 +929,7 @@ class CodegenContext { // be extremely expensive in certain cases, such as deeply-nested expressions which operate over // inputs with wide schemas. For more details on the performance issues that motivated this // flat, see SPARK-15680. -if (SparkEnv.get != null && SparkEnv.get.conf.getBoolean("spark.sql.codegen.comments", false)) { --- End diff -- `SQLConf` is based on the active spark session. The active spark session is not always correctly set. Thus, we do not encourage the community to use `SQLConf`. --- - To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org For additional commands, e-mail: reviews-h...@spark.apache.org
[GitHub] spark pull request #19449: [SPARK-22219][SQL] Refactor code to get a value f...
Github user tejasapatil commented on a diff in the pull request: https://github.com/apache/spark/pull/19449#discussion_r143291724 --- Diff: sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/expressions/codegen/CodeGenerator.scala --- @@ -929,7 +929,7 @@ class CodegenContext { // be extremely expensive in certain cases, such as deeply-nested expressions which operate over // inputs with wide schemas. For more details on the performance issues that motivated this // flat, see SPARK-15680. -if (SparkEnv.get != null && SparkEnv.get.conf.getBoolean("spark.sql.codegen.comments", false)) { --- End diff -- note to other reviewers: for historical context why this was done, see https://github.com/apache/spark/pull/13421/files#r65268674 --- - To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org For additional commands, e-mail: reviews-h...@spark.apache.org
[GitHub] spark pull request #19449: [SPARK-22219][SQL] Refactor code to get a value f...
GitHub user kiszk opened a pull request: https://github.com/apache/spark/pull/19449 [SPARK-22219][SQL] Refactor code to get a value for "spark.sql.codegen.comments" ## What changes were proposed in this pull request? This PR refactors code to get a value for "spark.sql.codegen.comments" by avoiding `SparkEnv.get.conf`. This PR uses `SQLConf.get.codegenComments` since `SQLConf.get` always returns an instance of `SQLConf`. ## How was this patch tested? Added test case to `DebuggingSuite` You can merge this pull request into a Git repository by running: $ git pull https://github.com/kiszk/spark SPARK-22219 Alternatively you can review and apply these changes as the patch at: https://github.com/apache/spark/pull/19449.patch To close this pull request, make a commit to your master/trunk branch with (at least) the following in the commit message: This closes #19449 commit be4220dcb218366f81f4015af56bee9fade90066 Author: Kazuaki Ishizaki Date: 2017-10-06T17:54:03Z Initial commit --- - To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org For additional commands, e-mail: reviews-h...@spark.apache.org