[GitHub] spark pull request: [SPARK-15165][SQL] Codegen can break because t...
Github user davies commented on the pull request: https://github.com/apache/spark/pull/12939#issuecomment-220093861 Yeah, we could repurpose #12979 for security reason, will review that. --- If your project is set up for it, you can reply to this email and have your reply appear on GitHub as well. If your project does not have this feature enabled and wishes so, or if the feature is enabled but not working, please contact infrastructure at infrastruct...@apache.org or file a JIRA ticket with INFRA. --- - To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org For additional commands, e-mail: reviews-h...@spark.apache.org
[GitHub] spark pull request: [SPARK-15165][SQL] Codegen can break because t...
Github user rxin commented on the pull request: https://github.com/apache/spark/pull/12939#issuecomment-219945292 hm that's true. @davies want to review that one? --- If your project is set up for it, you can reply to this email and have your reply appear on GitHub as well. If your project does not have this feature enabled and wishes so, or if the feature is enabled but not working, please contact infrastructure at infrastruct...@apache.org or file a JIRA ticket with INFRA. --- - To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org For additional commands, e-mail: reviews-h...@spark.apache.org
[GitHub] spark pull request: [SPARK-15165][SQL] Codegen can break because t...
Github user sarutak commented on the pull request: https://github.com/apache/spark/pull/12939#issuecomment-219944159 I have one concern about the whitelisting approach. Even if each single character is safe, it's difficult to ensure any character sequences which consist of those safe characters are always safe. E.g. `*` and `/` are safe themselves but `*/` is not safe. It's difficult to ensure there are no unsafe combination such. The place holder approach I mentioned above (#12979) may be safer because the place holder consists of only `{`, `comment_placeholder`, numbers and `}`. Anyway, I'll try the whitelisting approach. Let's discuss more. --- If your project is set up for it, you can reply to this email and have your reply appear on GitHub as well. If your project does not have this feature enabled and wishes so, or if the feature is enabled but not working, please contact infrastructure at infrastruct...@apache.org or file a JIRA ticket with INFRA. --- - To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org For additional commands, e-mail: reviews-h...@spark.apache.org
[GitHub] spark pull request: [SPARK-15165][SQL] Codegen can break because t...
Github user sarutak commented on the pull request: https://github.com/apache/spark/pull/12939#issuecomment-219938986 `/` is used as the division operator and `*` is used as the multiplication operator so it's good to add those characters but we should `*/` so we need to add `\*(?!/)` and `(?
[GitHub] spark pull request: [SPARK-15165][SQL] Codegen can break because t...
Github user davies commented on the pull request: https://github.com/apache/spark/pull/12939#issuecomment-219935334 Either way works for me. --- If your project is set up for it, you can reply to this email and have your reply appear on GitHub as well. If your project does not have this feature enabled and wishes so, or if the feature is enabled but not working, please contact infrastructure at infrastruct...@apache.org or file a JIRA ticket with INFRA. --- - To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org For additional commands, e-mail: reviews-h...@spark.apache.org
[GitHub] spark pull request: [SPARK-15165][SQL] Codegen can break because t...
Github user davies commented on the pull request: https://github.com/apache/spark/pull/12939#issuecomment-219934463 @sarutak That's true. Should `/` also be common used? --- If your project is set up for it, you can reply to this email and have your reply appear on GitHub as well. If your project does not have this feature enabled and wishes so, or if the feature is enabled but not working, please contact infrastructure at infrastruct...@apache.org or file a JIRA ticket with INFRA. --- - To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org For additional commands, e-mail: reviews-h...@spark.apache.org
[GitHub] spark pull request: [SPARK-15165][SQL] Codegen can break because t...
Github user rxin commented on the pull request: https://github.com/apache/spark/pull/12939#issuecomment-219933348 @davies this is the 2nd security bug with codegen we found already. @sarutak sgtm. --- If your project is set up for it, you can reply to this email and have your reply appear on GitHub as well. If your project does not have this feature enabled and wishes so, or if the feature is enabled but not working, please contact infrastructure at infrastruct...@apache.org or file a JIRA ticket with INFRA. --- - To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org For additional commands, e-mail: reviews-h...@spark.apache.org
[GitHub] spark pull request: [SPARK-15165][SQL] Codegen can break because t...
Github user davies commented on the pull request: https://github.com/apache/spark/pull/12939#issuecomment-219933074 @rxin If there is any new bug found on this, we could switch to white list, otherwise I'd like to have the current solution. --- If your project is set up for it, you can reply to this email and have your reply appear on GitHub as well. If your project does not have this feature enabled and wishes so, or if the feature is enabled but not working, please contact infrastructure at infrastruct...@apache.org or file a JIRA ticket with INFRA. --- - To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org For additional commands, e-mail: reviews-h...@spark.apache.org
[GitHub] spark pull request: [SPARK-15165][SQL] Codegen can break because t...
Github user sarutak commented on the pull request: https://github.com/apache/spark/pull/12939#issuecomment-219932692 O.K. Initially we add some characters to the whitelist and if we need some more characters, we'll consider whether it should be add or not at any time. How about this idea? --- If your project is set up for it, you can reply to this email and have your reply appear on GitHub as well. If your project does not have this feature enabled and wishes so, or if the feature is enabled but not working, please contact infrastructure at infrastruct...@apache.org or file a JIRA ticket with INFRA. --- - To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org For additional commands, e-mail: reviews-h...@spark.apache.org
[GitHub] spark pull request: [SPARK-15165][SQL] Codegen can break because t...
Github user rxin commented on the pull request: https://github.com/apache/spark/pull/12939#issuecomment-219931861 We can just expand the whitelist and add * + and such into that can't we? My main worry is that security is very difficult to get right, and having a whitelist substantially reduces the chance of corner cases that we didn't expect happening. --- If your project is set up for it, you can reply to this email and have your reply appear on GitHub as well. If your project does not have this feature enabled and wishes so, or if the feature is enabled but not working, please contact infrastructure at infrastruct...@apache.org or file a JIRA ticket with INFRA. --- - To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org For additional commands, e-mail: reviews-h...@spark.apache.org
[GitHub] spark pull request: [SPARK-15165][SQL] Codegen can break because t...
Github user sarutak commented on the pull request: https://github.com/apache/spark/pull/12939#issuecomment-219931480 Lots of punctuation characters like `*`, `+` can be used as an operator in expressions so I'm afraid comments in generated code will be difficult to read if characters are removed based on the whitelist. On the other hand, I noticed my another PR (#12979) can keep the readability of the comment and the safety. --- If your project is set up for it, you can reply to this email and have your reply appear on GitHub as well. If your project does not have this feature enabled and wishes so, or if the feature is enabled but not working, please contact infrastructure at infrastruct...@apache.org or file a JIRA ticket with INFRA. --- - To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org For additional commands, e-mail: reviews-h...@spark.apache.org
[GitHub] spark pull request: [SPARK-15165][SQL] Codegen can break because t...
Github user rxin commented on the pull request: https://github.com/apache/spark/pull/12939#issuecomment-219908934 Just remove the character. --- If your project is set up for it, you can reply to this email and have your reply appear on GitHub as well. If your project does not have this feature enabled and wishes so, or if the feature is enabled but not working, please contact infrastructure at infrastruct...@apache.org or file a JIRA ticket with INFRA. --- - To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org For additional commands, e-mail: reviews-h...@spark.apache.org
[GitHub] spark pull request: [SPARK-15165][SQL] Codegen can break because t...
Github user rxin commented on the pull request: https://github.com/apache/spark/pull/12939#issuecomment-219904534 @davies @sarutak I'm wondering if we should go with a whitelist approach, i.e. only whitelisting a-z0-9 and () []. It wouldn't sacrifice readability as much, but would be a lot safer. WDYT? --- If your project is set up for it, you can reply to this email and have your reply appear on GitHub as well. If your project does not have this feature enabled and wishes so, or if the feature is enabled but not working, please contact infrastructure at infrastruct...@apache.org or file a JIRA ticket with INFRA. --- - To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org For additional commands, e-mail: reviews-h...@spark.apache.org
[GitHub] spark pull request: [SPARK-15165][SQL] Codegen can break because t...
Github user davies commented on the pull request: https://github.com/apache/spark/pull/12939#issuecomment-219786749 @sarutak Could you send another PR for 1.6 branch? --- If your project is set up for it, you can reply to this email and have your reply appear on GitHub as well. If your project does not have this feature enabled and wishes so, or if the feature is enabled but not working, please contact infrastructure at infrastruct...@apache.org or file a JIRA ticket with INFRA. --- - To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org For additional commands, e-mail: reviews-h...@spark.apache.org
[GitHub] spark pull request: [SPARK-15165][SQL] Codegen can break because t...
Github user asfgit closed the pull request at: https://github.com/apache/spark/pull/12939 --- If your project is set up for it, you can reply to this email and have your reply appear on GitHub as well. If your project does not have this feature enabled and wishes so, or if the feature is enabled but not working, please contact infrastructure at infrastruct...@apache.org or file a JIRA ticket with INFRA. --- - To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org For additional commands, e-mail: reviews-h...@spark.apache.org
[GitHub] spark pull request: [SPARK-15165][SQL] Codegen can break because t...
Github user davies commented on the pull request: https://github.com/apache/spark/pull/12939#issuecomment-219785529 LGTM, Merging this into master and 2.0, thanks! --- If your project is set up for it, you can reply to this email and have your reply appear on GitHub as well. If your project does not have this feature enabled and wishes so, or if the feature is enabled but not working, please contact infrastructure at infrastruct...@apache.org or file a JIRA ticket with INFRA. --- - To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org For additional commands, e-mail: reviews-h...@spark.apache.org
[GitHub] spark pull request: [SPARK-15165][SQL] Codegen can break because t...
Github user davies commented on a diff in the pull request: https://github.com/apache/spark/pull/12939#discussion_r63562738 --- Diff: sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/util/package.scala --- @@ -162,7 +162,18 @@ package object util { def toCommentSafeString(str: String): String = { val len = math.min(str.length, 128) val suffix = if (str.length > len) "..." else "" -str.substring(0, len).replace("*/", "\\*\\/").replace("\\u", "u") + suffix --- End diff -- Good point, LGTM --- If your project is set up for it, you can reply to this email and have your reply appear on GitHub as well. If your project does not have this feature enabled and wishes so, or if the feature is enabled but not working, please contact infrastructure at infrastruct...@apache.org or file a JIRA ticket with INFRA. --- - To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org For additional commands, e-mail: reviews-h...@spark.apache.org
[GitHub] spark pull request: [SPARK-15165][SQL] Codegen can break because t...
Github user AmplabJenkins commented on the pull request: https://github.com/apache/spark/pull/12939#issuecomment-219669985 Test PASSed. Refer to this link for build results (access rights to CI server needed): https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/58675/ Test PASSed. --- If your project is set up for it, you can reply to this email and have your reply appear on GitHub as well. If your project does not have this feature enabled and wishes so, or if the feature is enabled but not working, please contact infrastructure at infrastruct...@apache.org or file a JIRA ticket with INFRA. --- - To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org For additional commands, e-mail: reviews-h...@spark.apache.org
[GitHub] spark pull request: [SPARK-15165][SQL] Codegen can break because t...
Github user AmplabJenkins commented on the pull request: https://github.com/apache/spark/pull/12939#issuecomment-219669982 Merged build finished. Test PASSed. --- If your project is set up for it, you can reply to this email and have your reply appear on GitHub as well. If your project does not have this feature enabled and wishes so, or if the feature is enabled but not working, please contact infrastructure at infrastruct...@apache.org or file a JIRA ticket with INFRA. --- - To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org For additional commands, e-mail: reviews-h...@spark.apache.org
[GitHub] spark pull request: [SPARK-15165][SQL] Codegen can break because t...
Github user SparkQA commented on the pull request: https://github.com/apache/spark/pull/12939#issuecomment-219669729 **[Test build #58675 has finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/58675/consoleFull)** for PR 12939 at commit [`f2b7adb`](https://github.com/apache/spark/commit/f2b7adb42adc7726d17842a98c1b5d4a1d14067c). * This patch passes all tests. * This patch merges cleanly. * This patch adds no public classes. --- If your project is set up for it, you can reply to this email and have your reply appear on GitHub as well. If your project does not have this feature enabled and wishes so, or if the feature is enabled but not working, please contact infrastructure at infrastruct...@apache.org or file a JIRA ticket with INFRA. --- - To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org For additional commands, e-mail: reviews-h...@spark.apache.org
[GitHub] spark pull request: [SPARK-15165][SQL] Codegen can break because t...
Github user SparkQA commented on the pull request: https://github.com/apache/spark/pull/12939#issuecomment-219650199 **[Test build #58675 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/58675/consoleFull)** for PR 12939 at commit [`f2b7adb`](https://github.com/apache/spark/commit/f2b7adb42adc7726d17842a98c1b5d4a1d14067c). --- If your project is set up for it, you can reply to this email and have your reply appear on GitHub as well. If your project does not have this feature enabled and wishes so, or if the feature is enabled but not working, please contact infrastructure at infrastruct...@apache.org or file a JIRA ticket with INFRA. --- - To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org For additional commands, e-mail: reviews-h...@spark.apache.org
[GitHub] spark pull request: [SPARK-15165][SQL] Codegen can break because t...
Github user sarutak commented on a diff in the pull request: https://github.com/apache/spark/pull/12939#discussion_r63464908 --- Diff: sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/util/package.scala --- @@ -162,7 +162,18 @@ package object util { def toCommentSafeString(str: String): String = { val len = math.min(str.length, 128) val suffix = if (str.length > len) "..." else "" -str.substring(0, len).replace("*/", "\\*\\/").replace("\\u", "u") + suffix --- End diff -- Thanks for the advice. I think "\u" should be escaped too otherwise, the compilation will fail when invalid unicode characters, like `\u002X` or `\u001`, are in literals. --- If your project is set up for it, you can reply to this email and have your reply appear on GitHub as well. If your project does not have this feature enabled and wishes so, or if the feature is enabled but not working, please contact infrastructure at infrastruct...@apache.org or file a JIRA ticket with INFRA. --- - To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org For additional commands, e-mail: reviews-h...@spark.apache.org
[GitHub] spark pull request: [SPARK-15165][SQL] Codegen can break because t...
Github user davies commented on a diff in the pull request: https://github.com/apache/spark/pull/12939#discussion_r63406727 --- Diff: sql/core/src/test/scala/org/apache/spark/sql/SQLQuerySuite.scala --- @@ -2496,4 +2496,28 @@ class SQLQuerySuite extends QueryTest with SharedSQLContext { } } + test("check code injection is prevented") { +// `\u002A/` is `*/` --- End diff -- Could you add more cases: ``` */ *\u002F \u002A/ \\u002A/ \u002A\u002F ``` --- If your project is set up for it, you can reply to this email and have your reply appear on GitHub as well. If your project does not have this feature enabled and wishes so, or if the feature is enabled but not working, please contact infrastructure at infrastruct...@apache.org or file a JIRA ticket with INFRA. --- - To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org For additional commands, e-mail: reviews-h...@spark.apache.org
[GitHub] spark pull request: [SPARK-15165][SQL] Codegen can break because t...
Github user davies commented on a diff in the pull request: https://github.com/apache/spark/pull/12939#discussion_r63404429 --- Diff: sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/util/package.scala --- @@ -162,7 +162,18 @@ package object util { def toCommentSafeString(str: String): String = { val len = math.min(str.length, 128) val suffix = if (str.length > len) "..." else "" -str.substring(0, len).replace("*/", "\\*\\/").replace("\\u", "u") + suffix --- End diff -- We only need to make sure that the comment string does not have `*/` in it, one simpler solution could be ``` str.substring(0, len).replace("*/", "*\\/").replace("u0022/", "u0022\\/") + suffix ``` --- If your project is set up for it, you can reply to this email and have your reply appear on GitHub as well. If your project does not have this feature enabled and wishes so, or if the feature is enabled but not working, please contact infrastructure at infrastruct...@apache.org or file a JIRA ticket with INFRA. --- - To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org For additional commands, e-mail: reviews-h...@spark.apache.org
[GitHub] spark pull request: [SPARK-15165][SQL] Codegen can break because t...
Github user AmplabJenkins commented on the pull request: https://github.com/apache/spark/pull/12939#issuecomment-218702610 Merged build finished. Test PASSed. --- If your project is set up for it, you can reply to this email and have your reply appear on GitHub as well. If your project does not have this feature enabled and wishes so, or if the feature is enabled but not working, please contact infrastructure at infrastruct...@apache.org or file a JIRA ticket with INFRA. --- - To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org For additional commands, e-mail: reviews-h...@spark.apache.org
[GitHub] spark pull request: [SPARK-15165][SQL] Codegen can break because t...
Github user AmplabJenkins commented on the pull request: https://github.com/apache/spark/pull/12939#issuecomment-218702611 Test PASSed. Refer to this link for build results (access rights to CI server needed): https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/58460/ Test PASSed. --- If your project is set up for it, you can reply to this email and have your reply appear on GitHub as well. If your project does not have this feature enabled and wishes so, or if the feature is enabled but not working, please contact infrastructure at infrastruct...@apache.org or file a JIRA ticket with INFRA. --- - To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org For additional commands, e-mail: reviews-h...@spark.apache.org
[GitHub] spark pull request: [SPARK-15165][SQL] Codegen can break because t...
Github user SparkQA commented on the pull request: https://github.com/apache/spark/pull/12939#issuecomment-218702380 **[Test build #58460 has finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/58460/consoleFull)** for PR 12939 at commit [`1140642`](https://github.com/apache/spark/commit/1140642e10273954ee2f7787b92a7c80f5e6ec89). * This patch passes all tests. * This patch merges cleanly. * This patch adds no public classes. --- If your project is set up for it, you can reply to this email and have your reply appear on GitHub as well. If your project does not have this feature enabled and wishes so, or if the feature is enabled but not working, please contact infrastructure at infrastruct...@apache.org or file a JIRA ticket with INFRA. --- - To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org For additional commands, e-mail: reviews-h...@spark.apache.org
[GitHub] spark pull request: [SPARK-15165][SQL] Codegen can break because t...
Github user SparkQA commented on the pull request: https://github.com/apache/spark/pull/12939#issuecomment-218683370 **[Test build #58460 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/58460/consoleFull)** for PR 12939 at commit [`1140642`](https://github.com/apache/spark/commit/1140642e10273954ee2f7787b92a7c80f5e6ec89). --- If your project is set up for it, you can reply to this email and have your reply appear on GitHub as well. If your project does not have this feature enabled and wishes so, or if the feature is enabled but not working, please contact infrastructure at infrastruct...@apache.org or file a JIRA ticket with INFRA. --- - To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org For additional commands, e-mail: reviews-h...@spark.apache.org
[GitHub] spark pull request: [SPARK-15165][SQL] Codegen can break because t...
Github user AmplabJenkins commented on the pull request: https://github.com/apache/spark/pull/12939#issuecomment-218382804 Test PASSed. Refer to this link for build results (access rights to CI server needed): https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/58333/ Test PASSed. --- If your project is set up for it, you can reply to this email and have your reply appear on GitHub as well. If your project does not have this feature enabled and wishes so, or if the feature is enabled but not working, please contact infrastructure at infrastruct...@apache.org or file a JIRA ticket with INFRA. --- - To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org For additional commands, e-mail: reviews-h...@spark.apache.org
[GitHub] spark pull request: [SPARK-15165][SQL] Codegen can break because t...
Github user AmplabJenkins commented on the pull request: https://github.com/apache/spark/pull/12939#issuecomment-218382799 Merged build finished. Test PASSed. --- If your project is set up for it, you can reply to this email and have your reply appear on GitHub as well. If your project does not have this feature enabled and wishes so, or if the feature is enabled but not working, please contact infrastructure at infrastruct...@apache.org or file a JIRA ticket with INFRA. --- - To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org For additional commands, e-mail: reviews-h...@spark.apache.org
[GitHub] spark pull request: [SPARK-15165][SQL] Codegen can break because t...
Github user SparkQA commented on the pull request: https://github.com/apache/spark/pull/12939#issuecomment-218382549 **[Test build #58333 has finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/58333/consoleFull)** for PR 12939 at commit [`7106f23`](https://github.com/apache/spark/commit/7106f234fd4f04ce8e922e643bb343ac635d09d2). * This patch passes all tests. * This patch merges cleanly. * This patch adds no public classes. --- If your project is set up for it, you can reply to this email and have your reply appear on GitHub as well. If your project does not have this feature enabled and wishes so, or if the feature is enabled but not working, please contact infrastructure at infrastruct...@apache.org or file a JIRA ticket with INFRA. --- - To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org For additional commands, e-mail: reviews-h...@spark.apache.org
[GitHub] spark pull request: [SPARK-15165][SQL] Codegen can break because t...
Github user SparkQA commented on the pull request: https://github.com/apache/spark/pull/12939#issuecomment-218368915 **[Test build #58333 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/58333/consoleFull)** for PR 12939 at commit [`7106f23`](https://github.com/apache/spark/commit/7106f234fd4f04ce8e922e643bb343ac635d09d2). --- If your project is set up for it, you can reply to this email and have your reply appear on GitHub as well. If your project does not have this feature enabled and wishes so, or if the feature is enabled but not working, please contact infrastructure at infrastruct...@apache.org or file a JIRA ticket with INFRA. --- - To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org For additional commands, e-mail: reviews-h...@spark.apache.org
[GitHub] spark pull request: [SPARK-15165][SQL] Codegen can break because t...
Github user sarutak commented on a diff in the pull request: https://github.com/apache/spark/pull/12939#discussion_r62718307 --- Diff: sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/util/package.scala --- @@ -162,7 +162,8 @@ package object util { def toCommentSafeString(str: String): String = { val len = math.min(str.length, 128) val suffix = if (str.length > len) "..." else "" -str.substring(0, len).replace("*/", "\\*\\/").replace("\\u", "u") + suffix +str.substring(0, len).replace("*/", "\\*\\/") + .replaceAll("(^|[^])(()*u)", "$1$2") + suffix --- End diff -- Thanks for the suggestion @mhseiden . I tried escaping by `escapeJava` and it may fix this issue but I noticed it escapes all of "\", means the number of "\" will be doubled. For example, "\\\u0022" will be "\\u0022" but I expects only "\" just before "u" will be escaped if the number of "\" is odd. --- If your project is set up for it, you can reply to this email and have your reply appear on GitHub as well. If your project does not have this feature enabled and wishes so, or if the feature is enabled but not working, please contact infrastructure at infrastruct...@apache.org or file a JIRA ticket with INFRA. --- - To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org For additional commands, e-mail: reviews-h...@spark.apache.org
[GitHub] spark pull request: [SPARK-15165][SQL] Codegen can break because t...
Github user mhseiden commented on a diff in the pull request: https://github.com/apache/spark/pull/12939#discussion_r62688712 --- Diff: sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/util/package.scala --- @@ -162,7 +162,8 @@ package object util { def toCommentSafeString(str: String): String = { val len = math.min(str.length, 128) val suffix = if (str.length > len) "..." else "" -str.substring(0, len).replace("*/", "\\*\\/").replace("\\u", "u") + suffix +str.substring(0, len).replace("*/", "\\*\\/") + .replaceAll("(^|[^])(()*u)", "$1$2") + suffix --- End diff -- Is the implementation of `org.apache.commons.lang3.StringEscapeUtils.escapeJava(...)` sufficient to cover this case? --- If your project is set up for it, you can reply to this email and have your reply appear on GitHub as well. If your project does not have this feature enabled and wishes so, or if the feature is enabled but not working, please contact infrastructure at infrastruct...@apache.org or file a JIRA ticket with INFRA. --- - To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org For additional commands, e-mail: reviews-h...@spark.apache.org
[GitHub] spark pull request: [SPARK-15165][SQL] Codegen can break because t...
Github user AmplabJenkins commented on the pull request: https://github.com/apache/spark/pull/12939#issuecomment-218122745 Merged build finished. Test PASSed. --- If your project is set up for it, you can reply to this email and have your reply appear on GitHub as well. If your project does not have this feature enabled and wishes so, or if the feature is enabled but not working, please contact infrastructure at infrastruct...@apache.org or file a JIRA ticket with INFRA. --- - To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org For additional commands, e-mail: reviews-h...@spark.apache.org
[GitHub] spark pull request: [SPARK-15165][SQL] Codegen can break because t...
Github user AmplabJenkins commented on the pull request: https://github.com/apache/spark/pull/12939#issuecomment-218122747 Test PASSed. Refer to this link for build results (access rights to CI server needed): https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/58228/ Test PASSed. --- If your project is set up for it, you can reply to this email and have your reply appear on GitHub as well. If your project does not have this feature enabled and wishes so, or if the feature is enabled but not working, please contact infrastructure at infrastruct...@apache.org or file a JIRA ticket with INFRA. --- - To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org For additional commands, e-mail: reviews-h...@spark.apache.org
[GitHub] spark pull request: [SPARK-15165][SQL] Codegen can break because t...
Github user SparkQA commented on the pull request: https://github.com/apache/spark/pull/12939#issuecomment-218122540 **[Test build #58228 has finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/58228/consoleFull)** for PR 12939 at commit [`15a23aa`](https://github.com/apache/spark/commit/15a23aab32eb0ba409d1555edec6b71a5ae59494). * This patch passes all tests. * This patch merges cleanly. * This patch adds no public classes. --- If your project is set up for it, you can reply to this email and have your reply appear on GitHub as well. If your project does not have this feature enabled and wishes so, or if the feature is enabled but not working, please contact infrastructure at infrastruct...@apache.org or file a JIRA ticket with INFRA. --- - To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org For additional commands, e-mail: reviews-h...@spark.apache.org
[GitHub] spark pull request: [SPARK-15165][SQL] Codegen can break because t...
Github user SparkQA commented on the pull request: https://github.com/apache/spark/pull/12939#issuecomment-218104347 **[Test build #58228 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/58228/consoleFull)** for PR 12939 at commit [`15a23aa`](https://github.com/apache/spark/commit/15a23aab32eb0ba409d1555edec6b71a5ae59494). --- If your project is set up for it, you can reply to this email and have your reply appear on GitHub as well. If your project does not have this feature enabled and wishes so, or if the feature is enabled but not working, please contact infrastructure at infrastruct...@apache.org or file a JIRA ticket with INFRA. --- - To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org For additional commands, e-mail: reviews-h...@spark.apache.org
[GitHub] spark pull request: [SPARK-15165][SQL] Codegen can break because t...
Github user sarutak commented on a diff in the pull request: https://github.com/apache/spark/pull/12939#discussion_r62637582 --- Diff: sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/util/package.scala --- @@ -162,7 +162,8 @@ package object util { def toCommentSafeString(str: String): String = { val len = math.min(str.length, 128) val suffix = if (str.length > len) "..." else "" -str.substring(0, len).replace("*/", "\\*\\/").replace("\\u", "u") + suffix +str.substring(0, len).replace("*/", "\\*\\/") + .replaceAll("(^|[^])(()*u)", "$1$2") + suffix --- End diff -- Yeah, make sense. I've added. --- If your project is set up for it, you can reply to this email and have your reply appear on GitHub as well. If your project does not have this feature enabled and wishes so, or if the feature is enabled but not working, please contact infrastructure at infrastruct...@apache.org or file a JIRA ticket with INFRA. --- - To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org For additional commands, e-mail: reviews-h...@spark.apache.org
[GitHub] spark pull request: [SPARK-15165][SQL] Codegen can break because t...
Github user yhuai commented on a diff in the pull request: https://github.com/apache/spark/pull/12939#discussion_r62572000 --- Diff: sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/util/package.scala --- @@ -162,7 +162,8 @@ package object util { def toCommentSafeString(str: String): String = { val len = math.min(str.length, 128) val suffix = if (str.length > len) "..." else "" -str.substring(0, len).replace("*/", "\\*\\/").replace("\\u", "u") + suffix +str.substring(0, len).replace("*/", "\\*\\/") + .replaceAll("(^|[^])(()*u)", "$1$2") + suffix --- End diff -- How about we also have a comment at here? --- If your project is set up for it, you can reply to this email and have your reply appear on GitHub as well. If your project does not have this feature enabled and wishes so, or if the feature is enabled but not working, please contact infrastructure at infrastruct...@apache.org or file a JIRA ticket with INFRA. --- - To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org For additional commands, e-mail: reviews-h...@spark.apache.org
[GitHub] spark pull request: [SPARK-15165][SQL] Codegen can break because t...
Github user sarutak commented on the pull request: https://github.com/apache/spark/pull/12939#issuecomment-217612432 CC: @rxin , @davies --- If your project is set up for it, you can reply to this email and have your reply appear on GitHub as well. If your project does not have this feature enabled and wishes so, or if the feature is enabled but not working, please contact infrastructure at infrastruct...@apache.org or file a JIRA ticket with INFRA. --- - To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org For additional commands, e-mail: reviews-h...@spark.apache.org
[GitHub] spark pull request: [SPARK-15165][SQL] Codegen can break because t...
Github user AmplabJenkins commented on the pull request: https://github.com/apache/spark/pull/12939#issuecomment-217313174 Test PASSed. Refer to this link for build results (access rights to CI server needed): https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/57927/ Test PASSed. --- If your project is set up for it, you can reply to this email and have your reply appear on GitHub as well. If your project does not have this feature enabled and wishes so, or if the feature is enabled but not working, please contact infrastructure at infrastruct...@apache.org or file a JIRA ticket with INFRA. --- - To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org For additional commands, e-mail: reviews-h...@spark.apache.org
[GitHub] spark pull request: [SPARK-15165][SQL] Codegen can break because t...
Github user AmplabJenkins commented on the pull request: https://github.com/apache/spark/pull/12939#issuecomment-217313173 Merged build finished. Test PASSed. --- If your project is set up for it, you can reply to this email and have your reply appear on GitHub as well. If your project does not have this feature enabled and wishes so, or if the feature is enabled but not working, please contact infrastructure at infrastruct...@apache.org or file a JIRA ticket with INFRA. --- - To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org For additional commands, e-mail: reviews-h...@spark.apache.org
[GitHub] spark pull request: [SPARK-15165][SQL] Codegen can break because t...
Github user SparkQA commented on the pull request: https://github.com/apache/spark/pull/12939#issuecomment-217313015 **[Test build #57927 has finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/57927/consoleFull)** for PR 12939 at commit [`30ad081`](https://github.com/apache/spark/commit/30ad0819103d35b612c52000bcdcd0b5daa798e1). * This patch passes all tests. * This patch merges cleanly. * This patch adds no public classes. --- If your project is set up for it, you can reply to this email and have your reply appear on GitHub as well. If your project does not have this feature enabled and wishes so, or if the feature is enabled but not working, please contact infrastructure at infrastruct...@apache.org or file a JIRA ticket with INFRA. --- - To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org For additional commands, e-mail: reviews-h...@spark.apache.org
[GitHub] spark pull request: [SPARK-15165][SQL] Codegen can break because t...
Github user SparkQA commented on the pull request: https://github.com/apache/spark/pull/12939#issuecomment-217298905 **[Test build #57927 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/57927/consoleFull)** for PR 12939 at commit [`30ad081`](https://github.com/apache/spark/commit/30ad0819103d35b612c52000bcdcd0b5daa798e1). --- If your project is set up for it, you can reply to this email and have your reply appear on GitHub as well. If your project does not have this feature enabled and wishes so, or if the feature is enabled but not working, please contact infrastructure at infrastruct...@apache.org or file a JIRA ticket with INFRA. --- - To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org For additional commands, e-mail: reviews-h...@spark.apache.org
[GitHub] spark pull request: [SPARK-15165][SQL] Codegen can break because t...
GitHub user sarutak opened a pull request: https://github.com/apache/spark/pull/12939 [SPARK-15165][SQL] Codegen can break because toCommentSafeString is not actually safe ## What changes were proposed in this pull request? toCommentSafeString method replaces "\u" with "\\u" to avoid codegen breaking. But if the even number of "\" is put before "u", like "\\u", in the string literal in the query, codegen can break. Following code occurs compilation error. ``` val df = Seq(...).toDF df.select("'u002A/'").show ``` The reason of the compilation error is because "u002A/" is translated into "*/" (the end of comment). Due to this unsafety, arbitrary code can be injected like as follows. ``` val df = Seq(...).toDF // Inject "System.exit(1)" df.select("'u002A/{System.exit(1);}/*'").show ``` ## How was this patch tested? Added new test cases. You can merge this pull request into a Git repository by running: $ git pull https://github.com/sarutak/spark SPARK-15165 Alternatively you can review and apply these changes as the patch at: https://github.com/apache/spark/pull/12939.patch To close this pull request, make a commit to your master/trunk branch with (at least) the following in the commit message: This closes #12939 commit 30ad0819103d35b612c52000bcdcd0b5daa798e1 Author: Kousuke SarutaDate: 2016-05-05T22:04:08Z Made toCommentSafeString method safer --- If your project is set up for it, you can reply to this email and have your reply appear on GitHub as well. If your project does not have this feature enabled and wishes so, or if the feature is enabled but not working, please contact infrastructure at infrastruct...@apache.org or file a JIRA ticket with INFRA. --- - To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org For additional commands, e-mail: reviews-h...@spark.apache.org