[GitHub] spark pull request: [SPARK-15165][SQL] Codegen can break because t...

2016-05-18 Thread davies
Github user davies commented on the pull request:

https://github.com/apache/spark/pull/12939#issuecomment-220093861
  
Yeah, we could repurpose #12979 for security reason, will review that.


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastruct...@apache.org or file a JIRA ticket
with INFRA.
---

-
To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org
For additional commands, e-mail: reviews-h...@spark.apache.org



[GitHub] spark pull request: [SPARK-15165][SQL] Codegen can break because t...

2016-05-18 Thread rxin
Github user rxin commented on the pull request:

https://github.com/apache/spark/pull/12939#issuecomment-219945292
  
hm that's true. @davies want to review that one?



---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastruct...@apache.org or file a JIRA ticket
with INFRA.
---

-
To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org
For additional commands, e-mail: reviews-h...@spark.apache.org



[GitHub] spark pull request: [SPARK-15165][SQL] Codegen can break because t...

2016-05-18 Thread sarutak
Github user sarutak commented on the pull request:

https://github.com/apache/spark/pull/12939#issuecomment-219944159
  
I have one concern about the whitelisting approach. Even if each single 
character is safe, it's difficult to ensure any character sequences which 
consist of those safe characters are always safe. 
E.g. `*` and `/` are safe themselves but `*/` is not safe. It's difficult 
to ensure there are no unsafe combination such.

The place holder approach I mentioned above (#12979) may be safer because 
the place holder consists of only `{`, `comment_placeholder`, numbers and `}`.

Anyway, I'll try the whitelisting approach. Let's discuss more.


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastruct...@apache.org or file a JIRA ticket
with INFRA.
---

-
To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org
For additional commands, e-mail: reviews-h...@spark.apache.org



[GitHub] spark pull request: [SPARK-15165][SQL] Codegen can break because t...

2016-05-18 Thread sarutak
Github user sarutak commented on the pull request:

https://github.com/apache/spark/pull/12939#issuecomment-219938986
  
`/` is used as the division operator and `*` is used as the multiplication 
operator so it's good to add those characters but we should `*/` so we need to 
add `\*(?!/)` and `(?

[GitHub] spark pull request: [SPARK-15165][SQL] Codegen can break because t...

2016-05-18 Thread davies
Github user davies commented on the pull request:

https://github.com/apache/spark/pull/12939#issuecomment-219935334
  
Either way works for me.


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastruct...@apache.org or file a JIRA ticket
with INFRA.
---

-
To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org
For additional commands, e-mail: reviews-h...@spark.apache.org



[GitHub] spark pull request: [SPARK-15165][SQL] Codegen can break because t...

2016-05-18 Thread davies
Github user davies commented on the pull request:

https://github.com/apache/spark/pull/12939#issuecomment-219934463
  
@sarutak That's true. Should `/` also be common used?


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastruct...@apache.org or file a JIRA ticket
with INFRA.
---

-
To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org
For additional commands, e-mail: reviews-h...@spark.apache.org



[GitHub] spark pull request: [SPARK-15165][SQL] Codegen can break because t...

2016-05-18 Thread rxin
Github user rxin commented on the pull request:

https://github.com/apache/spark/pull/12939#issuecomment-219933348
  
@davies this is the 2nd security bug with codegen we found already.

@sarutak sgtm.



---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastruct...@apache.org or file a JIRA ticket
with INFRA.
---

-
To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org
For additional commands, e-mail: reviews-h...@spark.apache.org



[GitHub] spark pull request: [SPARK-15165][SQL] Codegen can break because t...

2016-05-18 Thread davies
Github user davies commented on the pull request:

https://github.com/apache/spark/pull/12939#issuecomment-219933074
  
@rxin If there is any new bug found on this, we could switch to white list, 
otherwise I'd like to have the current solution.


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastruct...@apache.org or file a JIRA ticket
with INFRA.
---

-
To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org
For additional commands, e-mail: reviews-h...@spark.apache.org



[GitHub] spark pull request: [SPARK-15165][SQL] Codegen can break because t...

2016-05-18 Thread sarutak
Github user sarutak commented on the pull request:

https://github.com/apache/spark/pull/12939#issuecomment-219932692
  
O.K. Initially we add some characters to the whitelist and if we need some 
more characters, we'll consider whether it should be add  or not at any time. 
How about this idea?


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastruct...@apache.org or file a JIRA ticket
with INFRA.
---

-
To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org
For additional commands, e-mail: reviews-h...@spark.apache.org



[GitHub] spark pull request: [SPARK-15165][SQL] Codegen can break because t...

2016-05-17 Thread rxin
Github user rxin commented on the pull request:

https://github.com/apache/spark/pull/12939#issuecomment-219931861
  
We can just expand the whitelist and add * + and such into that can't we? 
My main worry is that security is very difficult to get right, and having a 
whitelist substantially reduces the chance of corner cases that we didn't 
expect happening.


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastruct...@apache.org or file a JIRA ticket
with INFRA.
---

-
To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org
For additional commands, e-mail: reviews-h...@spark.apache.org



[GitHub] spark pull request: [SPARK-15165][SQL] Codegen can break because t...

2016-05-17 Thread sarutak
Github user sarutak commented on the pull request:

https://github.com/apache/spark/pull/12939#issuecomment-219931480
  
Lots of punctuation characters like `*`, `+` can be used as an operator in 
expressions so I'm afraid comments in generated code will be difficult to read 
if characters are removed based on the whitelist.

On the other hand, I noticed my another PR (#12979) can keep the 
readability of the comment  and the safety.


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastruct...@apache.org or file a JIRA ticket
with INFRA.
---

-
To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org
For additional commands, e-mail: reviews-h...@spark.apache.org



[GitHub] spark pull request: [SPARK-15165][SQL] Codegen can break because t...

2016-05-17 Thread rxin
Github user rxin commented on the pull request:

https://github.com/apache/spark/pull/12939#issuecomment-219908934
  
Just remove the character.



---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastruct...@apache.org or file a JIRA ticket
with INFRA.
---

-
To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org
For additional commands, e-mail: reviews-h...@spark.apache.org



[GitHub] spark pull request: [SPARK-15165][SQL] Codegen can break because t...

2016-05-17 Thread rxin
Github user rxin commented on the pull request:

https://github.com/apache/spark/pull/12939#issuecomment-219904534
  
@davies @sarutak I'm wondering if we should go with a whitelist approach, 
i.e. only whitelisting a-z0-9 and () []. It wouldn't sacrifice readability as 
much, but would be a lot safer. WDYT?



---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastruct...@apache.org or file a JIRA ticket
with INFRA.
---

-
To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org
For additional commands, e-mail: reviews-h...@spark.apache.org



[GitHub] spark pull request: [SPARK-15165][SQL] Codegen can break because t...

2016-05-17 Thread davies
Github user davies commented on the pull request:

https://github.com/apache/spark/pull/12939#issuecomment-219786749
  
@sarutak Could you send another PR for 1.6 branch?


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastruct...@apache.org or file a JIRA ticket
with INFRA.
---

-
To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org
For additional commands, e-mail: reviews-h...@spark.apache.org



[GitHub] spark pull request: [SPARK-15165][SQL] Codegen can break because t...

2016-05-17 Thread asfgit
Github user asfgit closed the pull request at:

https://github.com/apache/spark/pull/12939


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastruct...@apache.org or file a JIRA ticket
with INFRA.
---

-
To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org
For additional commands, e-mail: reviews-h...@spark.apache.org



[GitHub] spark pull request: [SPARK-15165][SQL] Codegen can break because t...

2016-05-17 Thread davies
Github user davies commented on the pull request:

https://github.com/apache/spark/pull/12939#issuecomment-219785529
  
LGTM,
Merging this into master and 2.0, thanks!


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastruct...@apache.org or file a JIRA ticket
with INFRA.
---

-
To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org
For additional commands, e-mail: reviews-h...@spark.apache.org



[GitHub] spark pull request: [SPARK-15165][SQL] Codegen can break because t...

2016-05-17 Thread davies
Github user davies commented on a diff in the pull request:

https://github.com/apache/spark/pull/12939#discussion_r63562738
  
--- Diff: 
sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/util/package.scala ---
@@ -162,7 +162,18 @@ package object util {
   def toCommentSafeString(str: String): String = {
 val len = math.min(str.length, 128)
 val suffix = if (str.length > len) "..." else ""
-str.substring(0, len).replace("*/", "\\*\\/").replace("\\u", "u") 
+ suffix
--- End diff --

Good point, LGTM


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastruct...@apache.org or file a JIRA ticket
with INFRA.
---

-
To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org
For additional commands, e-mail: reviews-h...@spark.apache.org



[GitHub] spark pull request: [SPARK-15165][SQL] Codegen can break because t...

2016-05-17 Thread AmplabJenkins
Github user AmplabJenkins commented on the pull request:

https://github.com/apache/spark/pull/12939#issuecomment-219669985
  
Test PASSed.
Refer to this link for build results (access rights to CI server needed): 
https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/58675/
Test PASSed.


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastruct...@apache.org or file a JIRA ticket
with INFRA.
---

-
To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org
For additional commands, e-mail: reviews-h...@spark.apache.org



[GitHub] spark pull request: [SPARK-15165][SQL] Codegen can break because t...

2016-05-17 Thread AmplabJenkins
Github user AmplabJenkins commented on the pull request:

https://github.com/apache/spark/pull/12939#issuecomment-219669982
  
Merged build finished. Test PASSed.


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastruct...@apache.org or file a JIRA ticket
with INFRA.
---

-
To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org
For additional commands, e-mail: reviews-h...@spark.apache.org



[GitHub] spark pull request: [SPARK-15165][SQL] Codegen can break because t...

2016-05-17 Thread SparkQA
Github user SparkQA commented on the pull request:

https://github.com/apache/spark/pull/12939#issuecomment-219669729
  
**[Test build #58675 has 
finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/58675/consoleFull)**
 for PR 12939 at commit 
[`f2b7adb`](https://github.com/apache/spark/commit/f2b7adb42adc7726d17842a98c1b5d4a1d14067c).
 * This patch passes all tests.
 * This patch merges cleanly.
 * This patch adds no public classes.


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastruct...@apache.org or file a JIRA ticket
with INFRA.
---

-
To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org
For additional commands, e-mail: reviews-h...@spark.apache.org



[GitHub] spark pull request: [SPARK-15165][SQL] Codegen can break because t...

2016-05-17 Thread SparkQA
Github user SparkQA commented on the pull request:

https://github.com/apache/spark/pull/12939#issuecomment-219650199
  
**[Test build #58675 has 
started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/58675/consoleFull)**
 for PR 12939 at commit 
[`f2b7adb`](https://github.com/apache/spark/commit/f2b7adb42adc7726d17842a98c1b5d4a1d14067c).


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastruct...@apache.org or file a JIRA ticket
with INFRA.
---

-
To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org
For additional commands, e-mail: reviews-h...@spark.apache.org



[GitHub] spark pull request: [SPARK-15165][SQL] Codegen can break because t...

2016-05-16 Thread sarutak
Github user sarutak commented on a diff in the pull request:

https://github.com/apache/spark/pull/12939#discussion_r63464908
  
--- Diff: 
sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/util/package.scala ---
@@ -162,7 +162,18 @@ package object util {
   def toCommentSafeString(str: String): String = {
 val len = math.min(str.length, 128)
 val suffix = if (str.length > len) "..." else ""
-str.substring(0, len).replace("*/", "\\*\\/").replace("\\u", "u") 
+ suffix
--- End diff --

Thanks for the advice.
I think "\u" should be escaped too otherwise, the compilation will fail 
when invalid unicode characters, like `\u002X` or `\u001`, are in literals.


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastruct...@apache.org or file a JIRA ticket
with INFRA.
---

-
To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org
For additional commands, e-mail: reviews-h...@spark.apache.org



[GitHub] spark pull request: [SPARK-15165][SQL] Codegen can break because t...

2016-05-16 Thread davies
Github user davies commented on a diff in the pull request:

https://github.com/apache/spark/pull/12939#discussion_r63406727
  
--- Diff: sql/core/src/test/scala/org/apache/spark/sql/SQLQuerySuite.scala 
---
@@ -2496,4 +2496,28 @@ class SQLQuerySuite extends QueryTest with 
SharedSQLContext {
 }
   }
 
+  test("check code injection is prevented") {
+// `\u002A/` is `*/`
--- End diff --

Could you add more cases:
```
*/
*\u002F
\u002A/
\\u002A/
\u002A\u002F
```


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastruct...@apache.org or file a JIRA ticket
with INFRA.
---

-
To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org
For additional commands, e-mail: reviews-h...@spark.apache.org



[GitHub] spark pull request: [SPARK-15165][SQL] Codegen can break because t...

2016-05-16 Thread davies
Github user davies commented on a diff in the pull request:

https://github.com/apache/spark/pull/12939#discussion_r63404429
  
--- Diff: 
sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/util/package.scala ---
@@ -162,7 +162,18 @@ package object util {
   def toCommentSafeString(str: String): String = {
 val len = math.min(str.length, 128)
 val suffix = if (str.length > len) "..." else ""
-str.substring(0, len).replace("*/", "\\*\\/").replace("\\u", "u") 
+ suffix
--- End diff --

We only need to make sure that the comment string does not have `*/` in it, 
one simpler solution could be

```
str.substring(0, len).replace("*/", "*\\/").replace("u0022/", "u0022\\/") + 
suffix
```


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastruct...@apache.org or file a JIRA ticket
with INFRA.
---

-
To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org
For additional commands, e-mail: reviews-h...@spark.apache.org



[GitHub] spark pull request: [SPARK-15165][SQL] Codegen can break because t...

2016-05-12 Thread AmplabJenkins
Github user AmplabJenkins commented on the pull request:

https://github.com/apache/spark/pull/12939#issuecomment-218702610
  
Merged build finished. Test PASSed.


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastruct...@apache.org or file a JIRA ticket
with INFRA.
---

-
To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org
For additional commands, e-mail: reviews-h...@spark.apache.org



[GitHub] spark pull request: [SPARK-15165][SQL] Codegen can break because t...

2016-05-12 Thread AmplabJenkins
Github user AmplabJenkins commented on the pull request:

https://github.com/apache/spark/pull/12939#issuecomment-218702611
  
Test PASSed.
Refer to this link for build results (access rights to CI server needed): 
https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/58460/
Test PASSed.


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastruct...@apache.org or file a JIRA ticket
with INFRA.
---

-
To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org
For additional commands, e-mail: reviews-h...@spark.apache.org



[GitHub] spark pull request: [SPARK-15165][SQL] Codegen can break because t...

2016-05-12 Thread SparkQA
Github user SparkQA commented on the pull request:

https://github.com/apache/spark/pull/12939#issuecomment-218702380
  
**[Test build #58460 has 
finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/58460/consoleFull)**
 for PR 12939 at commit 
[`1140642`](https://github.com/apache/spark/commit/1140642e10273954ee2f7787b92a7c80f5e6ec89).
 * This patch passes all tests.
 * This patch merges cleanly.
 * This patch adds no public classes.


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastruct...@apache.org or file a JIRA ticket
with INFRA.
---

-
To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org
For additional commands, e-mail: reviews-h...@spark.apache.org



[GitHub] spark pull request: [SPARK-15165][SQL] Codegen can break because t...

2016-05-12 Thread SparkQA
Github user SparkQA commented on the pull request:

https://github.com/apache/spark/pull/12939#issuecomment-218683370
  
**[Test build #58460 has 
started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/58460/consoleFull)**
 for PR 12939 at commit 
[`1140642`](https://github.com/apache/spark/commit/1140642e10273954ee2f7787b92a7c80f5e6ec89).


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastruct...@apache.org or file a JIRA ticket
with INFRA.
---

-
To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org
For additional commands, e-mail: reviews-h...@spark.apache.org



[GitHub] spark pull request: [SPARK-15165][SQL] Codegen can break because t...

2016-05-11 Thread AmplabJenkins
Github user AmplabJenkins commented on the pull request:

https://github.com/apache/spark/pull/12939#issuecomment-218382804
  
Test PASSed.
Refer to this link for build results (access rights to CI server needed): 
https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/58333/
Test PASSed.


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastruct...@apache.org or file a JIRA ticket
with INFRA.
---

-
To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org
For additional commands, e-mail: reviews-h...@spark.apache.org



[GitHub] spark pull request: [SPARK-15165][SQL] Codegen can break because t...

2016-05-11 Thread AmplabJenkins
Github user AmplabJenkins commented on the pull request:

https://github.com/apache/spark/pull/12939#issuecomment-218382799
  
Merged build finished. Test PASSed.


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastruct...@apache.org or file a JIRA ticket
with INFRA.
---

-
To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org
For additional commands, e-mail: reviews-h...@spark.apache.org



[GitHub] spark pull request: [SPARK-15165][SQL] Codegen can break because t...

2016-05-11 Thread SparkQA
Github user SparkQA commented on the pull request:

https://github.com/apache/spark/pull/12939#issuecomment-218382549
  
**[Test build #58333 has 
finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/58333/consoleFull)**
 for PR 12939 at commit 
[`7106f23`](https://github.com/apache/spark/commit/7106f234fd4f04ce8e922e643bb343ac635d09d2).
 * This patch passes all tests.
 * This patch merges cleanly.
 * This patch adds no public classes.


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastruct...@apache.org or file a JIRA ticket
with INFRA.
---

-
To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org
For additional commands, e-mail: reviews-h...@spark.apache.org



[GitHub] spark pull request: [SPARK-15165][SQL] Codegen can break because t...

2016-05-10 Thread SparkQA
Github user SparkQA commented on the pull request:

https://github.com/apache/spark/pull/12939#issuecomment-218368915
  
**[Test build #58333 has 
started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/58333/consoleFull)**
 for PR 12939 at commit 
[`7106f23`](https://github.com/apache/spark/commit/7106f234fd4f04ce8e922e643bb343ac635d09d2).


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastruct...@apache.org or file a JIRA ticket
with INFRA.
---

-
To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org
For additional commands, e-mail: reviews-h...@spark.apache.org



[GitHub] spark pull request: [SPARK-15165][SQL] Codegen can break because t...

2016-05-10 Thread sarutak
Github user sarutak commented on a diff in the pull request:

https://github.com/apache/spark/pull/12939#discussion_r62718307
  
--- Diff: 
sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/util/package.scala ---
@@ -162,7 +162,8 @@ package object util {
   def toCommentSafeString(str: String): String = {
 val len = math.min(str.length, 128)
 val suffix = if (str.length > len) "..." else ""
-str.substring(0, len).replace("*/", "\\*\\/").replace("\\u", "u") 
+ suffix
+str.substring(0, len).replace("*/", "\\*\\/")
+  .replaceAll("(^|[^])(()*u)", "$1$2") + suffix
--- End diff --

Thanks for the suggestion @mhseiden .
I tried escaping by `escapeJava` and it may fix this issue but I noticed it 
escapes all of "\", means the number of "\" will be doubled.
For example, "\\\u0022" will be "\\u0022" but I expects only "\" just 
before "u" will be escaped if the number of "\" is odd.


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastruct...@apache.org or file a JIRA ticket
with INFRA.
---

-
To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org
For additional commands, e-mail: reviews-h...@spark.apache.org



[GitHub] spark pull request: [SPARK-15165][SQL] Codegen can break because t...

2016-05-10 Thread mhseiden
Github user mhseiden commented on a diff in the pull request:

https://github.com/apache/spark/pull/12939#discussion_r62688712
  
--- Diff: 
sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/util/package.scala ---
@@ -162,7 +162,8 @@ package object util {
   def toCommentSafeString(str: String): String = {
 val len = math.min(str.length, 128)
 val suffix = if (str.length > len) "..." else ""
-str.substring(0, len).replace("*/", "\\*\\/").replace("\\u", "u") 
+ suffix
+str.substring(0, len).replace("*/", "\\*\\/")
+  .replaceAll("(^|[^])(()*u)", "$1$2") + suffix
--- End diff --

Is the implementation of 
`org.apache.commons.lang3.StringEscapeUtils.escapeJava(...)` sufficient to 
cover this case?


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastruct...@apache.org or file a JIRA ticket
with INFRA.
---

-
To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org
For additional commands, e-mail: reviews-h...@spark.apache.org



[GitHub] spark pull request: [SPARK-15165][SQL] Codegen can break because t...

2016-05-10 Thread AmplabJenkins
Github user AmplabJenkins commented on the pull request:

https://github.com/apache/spark/pull/12939#issuecomment-218122745
  
Merged build finished. Test PASSed.


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastruct...@apache.org or file a JIRA ticket
with INFRA.
---

-
To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org
For additional commands, e-mail: reviews-h...@spark.apache.org



[GitHub] spark pull request: [SPARK-15165][SQL] Codegen can break because t...

2016-05-10 Thread AmplabJenkins
Github user AmplabJenkins commented on the pull request:

https://github.com/apache/spark/pull/12939#issuecomment-218122747
  
Test PASSed.
Refer to this link for build results (access rights to CI server needed): 
https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/58228/
Test PASSed.


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastruct...@apache.org or file a JIRA ticket
with INFRA.
---

-
To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org
For additional commands, e-mail: reviews-h...@spark.apache.org



[GitHub] spark pull request: [SPARK-15165][SQL] Codegen can break because t...

2016-05-10 Thread SparkQA
Github user SparkQA commented on the pull request:

https://github.com/apache/spark/pull/12939#issuecomment-218122540
  
**[Test build #58228 has 
finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/58228/consoleFull)**
 for PR 12939 at commit 
[`15a23aa`](https://github.com/apache/spark/commit/15a23aab32eb0ba409d1555edec6b71a5ae59494).
 * This patch passes all tests.
 * This patch merges cleanly.
 * This patch adds no public classes.


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastruct...@apache.org or file a JIRA ticket
with INFRA.
---

-
To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org
For additional commands, e-mail: reviews-h...@spark.apache.org



[GitHub] spark pull request: [SPARK-15165][SQL] Codegen can break because t...

2016-05-10 Thread SparkQA
Github user SparkQA commented on the pull request:

https://github.com/apache/spark/pull/12939#issuecomment-218104347
  
**[Test build #58228 has 
started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/58228/consoleFull)**
 for PR 12939 at commit 
[`15a23aa`](https://github.com/apache/spark/commit/15a23aab32eb0ba409d1555edec6b71a5ae59494).


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastruct...@apache.org or file a JIRA ticket
with INFRA.
---

-
To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org
For additional commands, e-mail: reviews-h...@spark.apache.org



[GitHub] spark pull request: [SPARK-15165][SQL] Codegen can break because t...

2016-05-10 Thread sarutak
Github user sarutak commented on a diff in the pull request:

https://github.com/apache/spark/pull/12939#discussion_r62637582
  
--- Diff: 
sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/util/package.scala ---
@@ -162,7 +162,8 @@ package object util {
   def toCommentSafeString(str: String): String = {
 val len = math.min(str.length, 128)
 val suffix = if (str.length > len) "..." else ""
-str.substring(0, len).replace("*/", "\\*\\/").replace("\\u", "u") 
+ suffix
+str.substring(0, len).replace("*/", "\\*\\/")
+  .replaceAll("(^|[^])(()*u)", "$1$2") + suffix
--- End diff --

Yeah, make sense.
I've added.


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastruct...@apache.org or file a JIRA ticket
with INFRA.
---

-
To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org
For additional commands, e-mail: reviews-h...@spark.apache.org



[GitHub] spark pull request: [SPARK-15165][SQL] Codegen can break because t...

2016-05-09 Thread yhuai
Github user yhuai commented on a diff in the pull request:

https://github.com/apache/spark/pull/12939#discussion_r62572000
  
--- Diff: 
sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/util/package.scala ---
@@ -162,7 +162,8 @@ package object util {
   def toCommentSafeString(str: String): String = {
 val len = math.min(str.length, 128)
 val suffix = if (str.length > len) "..." else ""
-str.substring(0, len).replace("*/", "\\*\\/").replace("\\u", "u") 
+ suffix
+str.substring(0, len).replace("*/", "\\*\\/")
+  .replaceAll("(^|[^])(()*u)", "$1$2") + suffix
--- End diff --

How about we also have a comment at here?


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastruct...@apache.org or file a JIRA ticket
with INFRA.
---

-
To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org
For additional commands, e-mail: reviews-h...@spark.apache.org



[GitHub] spark pull request: [SPARK-15165][SQL] Codegen can break because t...

2016-05-07 Thread sarutak
Github user sarutak commented on the pull request:

https://github.com/apache/spark/pull/12939#issuecomment-217612432
  
CC: @rxin , @davies 


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastruct...@apache.org or file a JIRA ticket
with INFRA.
---

-
To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org
For additional commands, e-mail: reviews-h...@spark.apache.org



[GitHub] spark pull request: [SPARK-15165][SQL] Codegen can break because t...

2016-05-05 Thread AmplabJenkins
Github user AmplabJenkins commented on the pull request:

https://github.com/apache/spark/pull/12939#issuecomment-217313174
  
Test PASSed.
Refer to this link for build results (access rights to CI server needed): 
https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/57927/
Test PASSed.


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastruct...@apache.org or file a JIRA ticket
with INFRA.
---

-
To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org
For additional commands, e-mail: reviews-h...@spark.apache.org



[GitHub] spark pull request: [SPARK-15165][SQL] Codegen can break because t...

2016-05-05 Thread AmplabJenkins
Github user AmplabJenkins commented on the pull request:

https://github.com/apache/spark/pull/12939#issuecomment-217313173
  
Merged build finished. Test PASSed.


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastruct...@apache.org or file a JIRA ticket
with INFRA.
---

-
To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org
For additional commands, e-mail: reviews-h...@spark.apache.org



[GitHub] spark pull request: [SPARK-15165][SQL] Codegen can break because t...

2016-05-05 Thread SparkQA
Github user SparkQA commented on the pull request:

https://github.com/apache/spark/pull/12939#issuecomment-217313015
  
**[Test build #57927 has 
finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/57927/consoleFull)**
 for PR 12939 at commit 
[`30ad081`](https://github.com/apache/spark/commit/30ad0819103d35b612c52000bcdcd0b5daa798e1).
 * This patch passes all tests.
 * This patch merges cleanly.
 * This patch adds no public classes.


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastruct...@apache.org or file a JIRA ticket
with INFRA.
---

-
To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org
For additional commands, e-mail: reviews-h...@spark.apache.org



[GitHub] spark pull request: [SPARK-15165][SQL] Codegen can break because t...

2016-05-05 Thread SparkQA
Github user SparkQA commented on the pull request:

https://github.com/apache/spark/pull/12939#issuecomment-217298905
  
**[Test build #57927 has 
started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/57927/consoleFull)**
 for PR 12939 at commit 
[`30ad081`](https://github.com/apache/spark/commit/30ad0819103d35b612c52000bcdcd0b5daa798e1).


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastruct...@apache.org or file a JIRA ticket
with INFRA.
---

-
To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org
For additional commands, e-mail: reviews-h...@spark.apache.org



[GitHub] spark pull request: [SPARK-15165][SQL] Codegen can break because t...

2016-05-05 Thread sarutak
GitHub user sarutak opened a pull request:

https://github.com/apache/spark/pull/12939

[SPARK-15165][SQL] Codegen can break because toCommentSafeString is not 
actually safe

## What changes were proposed in this pull request?

toCommentSafeString method replaces "\u" with "\\u" to avoid codegen 
breaking.
But if the even number of "\" is put before "u", like "\\u", in the string 
literal in the query, codegen can break.

Following code occurs compilation error.

```
val df = Seq(...).toDF
df.select("'u002A/'").show
```

The reason of the compilation error is because "u002A/" is 
translated into "*/" (the end of comment). 

Due to this unsafety, arbitrary code can be injected like as follows.

```
val df = Seq(...).toDF
// Inject "System.exit(1)"
df.select("'u002A/{System.exit(1);}/*'").show
```


## How was this patch tested?

Added new test cases.



You can merge this pull request into a Git repository by running:

$ git pull https://github.com/sarutak/spark SPARK-15165

Alternatively you can review and apply these changes as the patch at:

https://github.com/apache/spark/pull/12939.patch

To close this pull request, make a commit to your master/trunk branch
with (at least) the following in the commit message:

This closes #12939


commit 30ad0819103d35b612c52000bcdcd0b5daa798e1
Author: Kousuke Saruta 
Date:   2016-05-05T22:04:08Z

Made toCommentSafeString method safer




---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastruct...@apache.org or file a JIRA ticket
with INFRA.
---

-
To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org
For additional commands, e-mail: reviews-h...@spark.apache.org