[GitHub] spark issue #17956: [SPARK-18772][SQL] Avoid unnecessary conversion try for ...

2017-05-13 Thread HyukjinKwon
Github user HyukjinKwon commented on the issue: https://github.com/apache/spark/pull/17956 Thank you everybody sincerely. --- If your project is set up for it, you can reply to this email and have your reply appear on GitHub as well. If your project does not have this feature enabled

[GitHub] spark issue #17956: [SPARK-18772][SQL] Avoid unnecessary conversion try for ...

2017-05-13 Thread cloud-fan
Github user cloud-fan commented on the issue: https://github.com/apache/spark/pull/17956 thanks, merging to master/2.2! I think this change is pretty safe, we can discuss 2 things later: 1. if we want to support more special strings like `Inf` 2. if we want to make it

[GitHub] spark issue #17956: [SPARK-18772][SQL] Avoid unnecessary conversion try for ...

2017-05-13 Thread AmplabJenkins
Github user AmplabJenkins commented on the issue: https://github.com/apache/spark/pull/17956 Merged build finished. Test PASSed. --- If your project is set up for it, you can reply to this email and have your reply appear on GitHub as well. If your project does not have this feature

[GitHub] spark issue #17956: [SPARK-18772][SQL] Avoid unnecessary conversion try for ...

2017-05-13 Thread AmplabJenkins
Github user AmplabJenkins commented on the issue: https://github.com/apache/spark/pull/17956 Test PASSed. Refer to this link for build results (access rights to CI server needed): https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/76894/ Test PASSed. ---

[GitHub] spark issue #17956: [SPARK-18772][SQL] Avoid unnecessary conversion try for ...

2017-05-13 Thread SparkQA
Github user SparkQA commented on the issue: https://github.com/apache/spark/pull/17956 **[Test build #76894 has finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/76894/testReport)** for PR 17956 at commit

[GitHub] spark issue #17956: [SPARK-18772][SQL] Avoid unnecessary conversion try for ...

2017-05-13 Thread HyukjinKwon
Github user HyukjinKwon commented on the issue: https://github.com/apache/spark/pull/17956 @gatorsmile, @cloud-fan and @viirya, could you take another look please? I tried to get rid of all the behaviour changes existing in both previous PRs but only leave the change to avoid the

[GitHub] spark issue #17956: [SPARK-18772][SQL] Avoid unnecessary conversion try for ...

2017-05-13 Thread HyukjinKwon
Github user HyukjinKwon commented on the issue: https://github.com/apache/spark/pull/17956 Unfortunately, we already support ``` "NaN" "-Infinity" "Infinity" ``` Now, this PR targets primarily to avoid unnecessary conversion try primarily. --- If

[GitHub] spark issue #17956: [SPARK-18772][SQL] Avoid unnecessary conversion try for ...

2017-05-13 Thread gatorsmile
Github user gatorsmile commented on the issue: https://github.com/apache/spark/pull/17956 Note, we do not support the following cases ``` def floatRecords: Dataset[String] = spark.createDataset(spark.sparkContext.parallelize( """{"f": "18.00"}""" ::

[GitHub] spark issue #17956: [SPARK-18772][SQL] Avoid unnecessary conversion try for ...

2017-05-13 Thread gatorsmile
Github user gatorsmile commented on the issue: https://github.com/apache/spark/pull/17956 We are handling the Json data sources here. Are the following inputs widely used? ``` {"a": "+INF"} {"a": "INF"} {"a": "-INF"} {"a": "NaN"} {"a": "+NaN"} {"a": "-NaN"}

[GitHub] spark issue #17956: [SPARK-18772][SQL] Avoid unnecessary conversion try for ...

2017-05-13 Thread SparkQA
Github user SparkQA commented on the issue: https://github.com/apache/spark/pull/17956 **[Test build #76894 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/76894/testReport)** for PR 17956 at commit

[GitHub] spark issue #17956: [SPARK-18772][SQL] Avoid unnecessary conversion try and ...

2017-05-13 Thread HyukjinKwon
Github user HyukjinKwon commented on the issue: https://github.com/apache/spark/pull/17956 Given the existing logic below: ``` if (lowerCaseValue.equals("nan") || lowerCaseValue.equals("infinity") || lowerCaseValue.equals("-infinity") ||

[GitHub] spark issue #17956: [SPARK-18772][SQL] Avoid unnecessary conversion try and ...

2017-05-13 Thread HyukjinKwon
Github user HyukjinKwon commented on the issue: https://github.com/apache/spark/pull/17956 Yea, Let's focus on the topic. For the cases below: ``` {"a": NaN} {"a": Infinity} {"a": +Infinity} {"a": -Infinity} ``` They are related with

[GitHub] spark issue #17956: [SPARK-18772][SQL] Avoid unnecessary conversion try and ...

2017-05-13 Thread gatorsmile
Github user gatorsmile commented on the issue: https://github.com/apache/spark/pull/17956 BTW, if we want to support the JSON strings like `{"a": "-Infinity"}` as the FLOAT type, I think we also should support the float data in a string. Below is an example. ```Scala def

[GitHub] spark issue #17956: [SPARK-18772][SQL] Avoid unnecessary conversion try and ...

2017-05-13 Thread gatorsmile
Github user gatorsmile commented on the issue: https://github.com/apache/spark/pull/17956 Based on my previous work experience, I am very conservative to introduce any behavior change unless we are sure this is the right thing to do. I do not think this is a high priority PR. I can

[GitHub] spark issue #17956: [SPARK-18772][SQL] Avoid unnecessary conversion try and ...

2017-05-12 Thread HyukjinKwon
Github user HyukjinKwon commented on the issue: https://github.com/apache/spark/pull/17956 1. It is not good to hold off PRs. If a PR looks good and coherent, I think we could merge. 2. "How many JSON writers write a float "INF" as a string?", if it is your worry, I will

[GitHub] spark issue #17956: [SPARK-18772][SQL] Avoid unnecessary conversion try and ...

2017-05-12 Thread gatorsmile
Github user gatorsmile commented on the issue: https://github.com/apache/spark/pull/17956 Let me share my understanding. First, we do not have a deadline for any non-urgent fixes. This PR is an example. How many JSON writers write a float "INF" as a string? Being a

[GitHub] spark issue #17956: [SPARK-18772][SQL] Avoid unnecessary conversion try and ...

2017-05-12 Thread HyukjinKwon
Github user HyukjinKwon commented on the issue: https://github.com/apache/spark/pull/17956 WDYT? I think it'd be faster to get this merged. --- If your project is set up for it, you can reply to this email and have your reply appear on GitHub as well. If your project does not have

[GitHub] spark issue #17956: [SPARK-18772][SQL] Avoid unnecessary conversion try and ...

2017-05-12 Thread HyukjinKwon
Github user HyukjinKwon commented on the issue: https://github.com/apache/spark/pull/17956 Hey @gatorsmile, how about picking up the commits here and opening a PR? --- If your project is set up for it, you can reply to this email and have your reply appear on GitHub as well. If your

[GitHub] spark issue #17956: [SPARK-18772][SQL] Avoid unnecessary conversion try and ...

2017-05-12 Thread gatorsmile
Github user gatorsmile commented on the issue: https://github.com/apache/spark/pull/17956 We really need to be careful when making the code changes. If we support such a feature, we need to support all the data types. This is unexpected to users. --- If your project is set up for

[GitHub] spark issue #17956: [SPARK-18772][SQL] Avoid unnecessary conversion try and ...

2017-05-12 Thread HyukjinKwon
Github user HyukjinKwon commented on the issue: https://github.com/apache/spark/pull/17956 Yea, I am aware of it because we support special floating cases in string and I think we'd expect other doubles could be in string? --- If your project is set up for it, you can reply to this

[GitHub] spark issue #17956: [SPARK-18772][SQL] Avoid unnecessary conversion try and ...

2017-05-12 Thread gatorsmile
Github user gatorsmile commented on the issue: https://github.com/apache/spark/pull/17956 This PR also introduces the following behavior changes. ```Scala def floatRecords: Dataset[String] = spark.createDataset(spark.sparkContext.parallelize(

[GitHub] spark issue #17956: [SPARK-18772][SQL] Avoid unnecessary conversion try and ...

2017-05-12 Thread HyukjinKwon
Github user HyukjinKwon commented on the issue: https://github.com/apache/spark/pull/17956 `allowNonNumericNumbers` option is undocumented and incomplete for now - https://github.com/apache/spark/pull/9724#r44883828 > allowNonNumericNumbers is undocumented for now, since I

[GitHub] spark issue #17956: [SPARK-18772][SQL] Avoid unnecessary conversion try and ...

2017-05-12 Thread gatorsmile
Github user gatorsmile commented on the issue: https://github.com/apache/spark/pull/17956 Before this PR, the option `allowNonNumericNumbers` affects the results of the following four cases: ```Json {"a": NaN} {"a": Infinity} {"a": +Infinity} {"a": -Infinity}

[GitHub] spark issue #17956: [SPARK-18772][SQL] Avoid unnecessary conversion try and ...

2017-05-12 Thread AmplabJenkins
Github user AmplabJenkins commented on the issue: https://github.com/apache/spark/pull/17956 Test PASSed. Refer to this link for build results (access rights to CI server needed): https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/76858/ Test PASSed. ---

[GitHub] spark issue #17956: [SPARK-18772][SQL] Avoid unnecessary conversion try and ...

2017-05-12 Thread AmplabJenkins
Github user AmplabJenkins commented on the issue: https://github.com/apache/spark/pull/17956 Merged build finished. Test PASSed. --- If your project is set up for it, you can reply to this email and have your reply appear on GitHub as well. If your project does not have this feature

[GitHub] spark issue #17956: [SPARK-18772][SQL] Avoid unnecessary conversion try and ...

2017-05-12 Thread SparkQA
Github user SparkQA commented on the issue: https://github.com/apache/spark/pull/17956 **[Test build #76858 has finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/76858/testReport)** for PR 17956 at commit

[GitHub] spark issue #17956: [SPARK-18772][SQL] Avoid unnecessary conversion try and ...

2017-05-12 Thread AmplabJenkins
Github user AmplabJenkins commented on the issue: https://github.com/apache/spark/pull/17956 Merged build finished. Test PASSed. --- If your project is set up for it, you can reply to this email and have your reply appear on GitHub as well. If your project does not have this feature

[GitHub] spark issue #17956: [SPARK-18772][SQL] Avoid unnecessary conversion try and ...

2017-05-12 Thread AmplabJenkins
Github user AmplabJenkins commented on the issue: https://github.com/apache/spark/pull/17956 Test PASSed. Refer to this link for build results (access rights to CI server needed): https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/76857/ Test PASSed. ---

[GitHub] spark issue #17956: [SPARK-18772][SQL] Avoid unnecessary conversion try and ...

2017-05-12 Thread SparkQA
Github user SparkQA commented on the issue: https://github.com/apache/spark/pull/17956 **[Test build #76857 has finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/76857/testReport)** for PR 17956 at commit

[GitHub] spark issue #17956: [SPARK-18772][SQL] Avoid unnecessary conversion try and ...

2017-05-12 Thread AmplabJenkins
Github user AmplabJenkins commented on the issue: https://github.com/apache/spark/pull/17956 Test PASSed. Refer to this link for build results (access rights to CI server needed): https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/76854/ Test PASSed. ---

[GitHub] spark issue #17956: [SPARK-18772][SQL] Avoid unnecessary conversion try and ...

2017-05-12 Thread AmplabJenkins
Github user AmplabJenkins commented on the issue: https://github.com/apache/spark/pull/17956 Merged build finished. Test PASSed. --- If your project is set up for it, you can reply to this email and have your reply appear on GitHub as well. If your project does not have this feature

[GitHub] spark issue #17956: [SPARK-18772][SQL] Avoid unnecessary conversion try and ...

2017-05-12 Thread SparkQA
Github user SparkQA commented on the issue: https://github.com/apache/spark/pull/17956 **[Test build #76854 has finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/76854/testReport)** for PR 17956 at commit

[GitHub] spark issue #17956: [SPARK-18772][SQL] Avoid unnecessary conversion try and ...

2017-05-12 Thread HyukjinKwon
Github user HyukjinKwon commented on the issue: https://github.com/apache/spark/pull/17956 (Ah, it was about Jackson's `JsonParser.Feature.ALLOW_NON_NUMERIC_NUMBERS` ... I see.) --- If your project is set up for it, you can reply to this email and have your reply appear on GitHub as

[GitHub] spark issue #17956: [SPARK-18772][SQL] Avoid unnecessary conversion try and ...

2017-05-12 Thread viirya
Github user viirya commented on the issue: https://github.com/apache/spark/pull/17956 > Do we need to follow allowNonNumericNumbers when parsing """{"a": "-Infinity"}"""? cc @cloud-fan @viirya `allowNonNumericNumbers` supports it and Scala supports it too. It seems to me

[GitHub] spark issue #17956: [SPARK-18772][SQL] Avoid unnecessary conversion try and ...

2017-05-12 Thread HyukjinKwon
Github user HyukjinKwon commented on the issue: https://github.com/apache/spark/pull/17956 I didn't understand what ^ means. Could you elaborate please? `allowNonNumericNumbers` option here for now is incomplete up to my knowledge. --- If your project is set up for it, you can reply

[GitHub] spark issue #17956: [SPARK-18772][SQL] Avoid unnecessary conversion try and ...

2017-05-12 Thread gatorsmile
Github user gatorsmile commented on the issue: https://github.com/apache/spark/pull/17956 Do we need to follow `allowNonNumericNumbers` when parsing ` """{"a": "-Infinity"}"""`? cc @cloud-fan @viirya --- If your project is set up for it, you can reply to this email and have your

[GitHub] spark issue #17956: [SPARK-18772][SQL] Avoid unnecessary conversion try and ...

2017-05-12 Thread gatorsmile
Github user gatorsmile commented on the issue: https://github.com/apache/spark/pull/17956 uh... That is different... This is String. That is Float. nvm. They are different issues. --- If your project is set up for it, you can reply to this email and have your reply appear on GitHub

[GitHub] spark issue #17956: [SPARK-18772][SQL] Avoid unnecessary conversion try and ...

2017-05-12 Thread HyukjinKwon
Github user HyukjinKwon commented on the issue: https://github.com/apache/spark/pull/17956 Yes, thanks. --- If your project is set up for it, you can reply to this email and have your reply appear on GitHub as well. If your project does not have this feature enabled and wishes so, or

[GitHub] spark issue #17956: [SPARK-18772][SQL] Avoid unnecessary conversion try and ...

2017-05-12 Thread gatorsmile
Github user gatorsmile commented on the issue: https://github.com/apache/spark/pull/17956 If we do not upgrade `Jackson`, is that possible we can do it by checking the flag `allowNonNumericNumbers` before [doing the

[GitHub] spark issue #17956: [SPARK-18772][SQL] Avoid unnecessary conversion try and ...

2017-05-12 Thread HyukjinKwon
Github user HyukjinKwon commented on the issue: https://github.com/apache/spark/pull/17956 We should upgrade Jackson accordingly to support that option correctly. This looks reverted due to dependency problem - https://github.com/apache/spark/pull/9759#issuecomment-222759764. It

[GitHub] spark issue #17956: [SPARK-18772][SQL] Avoid unnecessary conversion try and ...

2017-05-12 Thread gatorsmile
Github user gatorsmile commented on the issue: https://github.com/apache/spark/pull/17956 Could we update the JIRA, PR title, and PR description? Also check the JSON flag `allowNonNumericNumbers` when doing the conversion? --- If your project is set up for it, you can

[GitHub] spark issue #17956: [SPARK-18772][SQL] Avoid unnecessary conversion try and ...

2017-05-12 Thread SparkQA
Github user SparkQA commented on the issue: https://github.com/apache/spark/pull/17956 **[Test build #76858 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/76858/testReport)** for PR 17956 at commit

[GitHub] spark issue #17956: [SPARK-18772][SQL] Avoid unnecessary conversion try and ...

2017-05-12 Thread SparkQA
Github user SparkQA commented on the issue: https://github.com/apache/spark/pull/17956 **[Test build #76857 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/76857/testReport)** for PR 17956 at commit

[GitHub] spark issue #17956: [SPARK-18772][SQL] Avoid unnecessary conversion try and ...

2017-05-12 Thread HyukjinKwon
Github user HyukjinKwon commented on the issue: https://github.com/apache/spark/pull/17956 @gatorsmail, I added some more cases I could identify based on http://docs.oracle.com/javase/7/docs/api/java/lang/Double.html#valueOf(java.lang.String) (this looks calling

[GitHub] spark issue #17956: [SPARK-18772][SQL] Avoid unnecessary conversion try and ...

2017-05-12 Thread SparkQA
Github user SparkQA commented on the issue: https://github.com/apache/spark/pull/17956 **[Test build #76854 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/76854/testReport)** for PR 17956 at commit

[GitHub] spark issue #17956: [SPARK-18772][SQL] Avoid unnecessary conversion try for ...

2017-05-12 Thread HyukjinKwon
Github user HyukjinKwon commented on the issue: https://github.com/apache/spark/pull/17956 retest this please --- If your project is set up for it, you can reply to this email and have your reply appear on GitHub as well. If your project does not have this feature enabled and wishes

[GitHub] spark issue #17956: [SPARK-18772][SQL] Avoid unnecessary conversion try for ...

2017-05-12 Thread AmplabJenkins
Github user AmplabJenkins commented on the issue: https://github.com/apache/spark/pull/17956 Test FAILed. Refer to this link for build results (access rights to CI server needed): https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/76851/ Test FAILed. ---

[GitHub] spark issue #17956: [SPARK-18772][SQL] Avoid unnecessary conversion try for ...

2017-05-12 Thread AmplabJenkins
Github user AmplabJenkins commented on the issue: https://github.com/apache/spark/pull/17956 Merged build finished. Test FAILed. --- If your project is set up for it, you can reply to this email and have your reply appear on GitHub as well. If your project does not have this feature

[GitHub] spark issue #17956: [SPARK-18772][SQL] Avoid unnecessary conversion try for ...

2017-05-12 Thread gatorsmile
Github user gatorsmile commented on the issue: https://github.com/apache/spark/pull/17956 This PR description is misleading. This PR is actually a bug fix, I think --- If your project is set up for it, you can reply to this email and have your reply appear on GitHub as well. If your

[GitHub] spark issue #17956: [SPARK-18772][SQL] Avoid unnecessary conversion try for ...

2017-05-12 Thread SparkQA
Github user SparkQA commented on the issue: https://github.com/apache/spark/pull/17956 **[Test build #76851 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/76851/testReport)** for PR 17956 at commit

[GitHub] spark issue #17956: [SPARK-18772][SQL] Avoid unnecessary conversion try for ...

2017-05-12 Thread AmplabJenkins
Github user AmplabJenkins commented on the issue: https://github.com/apache/spark/pull/17956 Merged build finished. Test PASSed. --- If your project is set up for it, you can reply to this email and have your reply appear on GitHub as well. If your project does not have this feature

[GitHub] spark issue #17956: [SPARK-18772][SQL] Avoid unnecessary conversion try for ...

2017-05-12 Thread AmplabJenkins
Github user AmplabJenkins commented on the issue: https://github.com/apache/spark/pull/17956 Test PASSed. Refer to this link for build results (access rights to CI server needed): https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/76846/ Test PASSed. ---

[GitHub] spark issue #17956: [SPARK-18772][SQL] Avoid unnecessary conversion try for ...

2017-05-12 Thread SparkQA
Github user SparkQA commented on the issue: https://github.com/apache/spark/pull/17956 **[Test build #76846 has finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/76846/testReport)** for PR 17956 at commit

[GitHub] spark issue #17956: [SPARK-18772][SQL] Avoid unnecessary conversion try for ...

2017-05-11 Thread viirya
Github user viirya commented on the issue: https://github.com/apache/spark/pull/17956 LGTM except for a minor comment. --- If your project is set up for it, you can reply to this email and have your reply appear on GitHub as well. If your project does not have this feature enabled

[GitHub] spark issue #17956: [SPARK-18772][SQL] Avoid unnecessary conversion try for ...

2017-05-11 Thread AmplabJenkins
Github user AmplabJenkins commented on the issue: https://github.com/apache/spark/pull/17956 Test PASSed. Refer to this link for build results (access rights to CI server needed): https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/76841/ Test PASSed. ---

[GitHub] spark issue #17956: [SPARK-18772][SQL] Avoid unnecessary conversion try for ...

2017-05-11 Thread AmplabJenkins
Github user AmplabJenkins commented on the issue: https://github.com/apache/spark/pull/17956 Merged build finished. Test PASSed. --- If your project is set up for it, you can reply to this email and have your reply appear on GitHub as well. If your project does not have this feature

[GitHub] spark issue #17956: [SPARK-18772][SQL] Avoid unnecessary conversion try for ...

2017-05-11 Thread SparkQA
Github user SparkQA commented on the issue: https://github.com/apache/spark/pull/17956 **[Test build #76841 has finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/76841/testReport)** for PR 17956 at commit

[GitHub] spark issue #17956: [SPARK-18772][SQL] Avoid unnecessary conversion try for ...

2017-05-11 Thread cloud-fan
Github user cloud-fan commented on the issue: https://github.com/apache/spark/pull/17956 LGTM --- If your project is set up for it, you can reply to this email and have your reply appear on GitHub as well. If your project does not have this feature enabled and wishes so, or if the

[GitHub] spark issue #17956: [SPARK-18772][SQL] Avoid unnecessary conversion try for ...

2017-05-11 Thread SparkQA
Github user SparkQA commented on the issue: https://github.com/apache/spark/pull/17956 **[Test build #76846 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/76846/testReport)** for PR 17956 at commit

[GitHub] spark issue #17956: [SPARK-18772][SQL] Avoid unnecessary conversion try for ...

2017-05-11 Thread HyukjinKwon
Github user HyukjinKwon commented on the issue: https://github.com/apache/spark/pull/17956 cc @NathanHowell, @cloud-fan and @viirya. (I just want to note this will not change any input/output but just the exception type and avoid additional conversion try.) --- If your

[GitHub] spark issue #17956: [SPARK-18772][SQL] Avoid unnecessary conversion try for ...

2017-05-11 Thread SparkQA
Github user SparkQA commented on the issue: https://github.com/apache/spark/pull/17956 **[Test build #76841 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/76841/testReport)** for PR 17956 at commit