[GitHub] spark issue #16386: [SPARK-18352][SQL] Support parsing multiline json files

2017-06-06 Thread NathanHowell
Github user NathanHowell commented on the issue: https://github.com/apache/spark/pull/16386 Yep, should be doable without too much effort. On Sun, Jun 4, 2017 at 9:54 PM, Xiao Li wrote: > @NathanHowell It

[GitHub] spark issue #16386: [SPARK-18352][SQL] Support parsing multiline json files

2017-06-04 Thread gatorsmile
Github user gatorsmile commented on the issue: https://github.com/apache/spark/pull/16386 @NathanHowell It sounds like we can also provide multi-line support for JSON. For example, in a single JSON file ``` {"a": 1, "b": 1.1} {"a": 2, "b": 1.1} {"a": 3, "b":
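The kind of input described in this comment, several JSON objects back to back in one file, can be illustrated outside Spark. The following is a minimal sketch in plain Python, not the approach taken in the PR; `iter_json_values` is a hypothetical helper name and the sample text is made up for illustration.

```python
import json


def iter_json_values(text):
    """Yield each top-level JSON value found in `text`, in order.

    Values may be separated by any whitespace, and a value may itself
    span several lines, so records need not sit on one line each.
    """
    decoder = json.JSONDecoder()
    pos = 0
    end = len(text)
    while pos < end:
        # raw_decode does not skip leading whitespace, so skip it here.
        while pos < end and text[pos].isspace():
            pos += 1
        if pos >= end:
            break
        value, pos = decoder.raw_decode(text, pos)
        yield value


# Hypothetical sample: one single-line object followed by a multi-line one.
sample = '{"a": 1, "b": 1.1}\n{\n  "a": 2,\n  "b": 1.1\n}\n'
records = list(iter_json_values(sample))
```

The sketch holds the whole text in memory, which is only reasonable for small inputs.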

[GitHub] spark issue #16386: [SPARK-18352][SQL] Support parsing multiline json files

2017-02-16 Thread cloud-fan
Github user cloud-fan commented on the issue: https://github.com/apache/spark/pull/16386 thanks, merging to master! --- If your project is set up for it, you can reply to this email and have your reply appear on GitHub as well. If your project does not have this feature enabled and

[GitHub] spark issue #16386: [SPARK-18352][SQL] Support parsing multiline json files

2017-02-16 Thread AmplabJenkins
Github user AmplabJenkins commented on the issue: https://github.com/apache/spark/pull/16386 Test PASSed. Refer to this link for build results (access rights to CI server needed): https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/73032/ Test PASSed.

[GitHub] spark issue #16386: [SPARK-18352][SQL] Support parsing multiline json files

2017-02-16 Thread AmplabJenkins
Github user AmplabJenkins commented on the issue: https://github.com/apache/spark/pull/16386 Merged build finished. Test PASSed.

[GitHub] spark issue #16386: [SPARK-18352][SQL] Support parsing multiline json files

2017-02-16 Thread SparkQA
Github user SparkQA commented on the issue: https://github.com/apache/spark/pull/16386 **[Test build #73032 has finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/73032/testReport)** for PR 16386 at commit

[GitHub] spark issue #16386: [SPARK-18352][SQL] Support parsing multiline json files

2017-02-16 Thread AmplabJenkins
Github user AmplabJenkins commented on the issue: https://github.com/apache/spark/pull/16386 Test PASSed. Refer to this link for build results (access rights to CI server needed): https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/73030/ Test PASSed.

[GitHub] spark issue #16386: [SPARK-18352][SQL] Support parsing multiline json files

2017-02-16 Thread AmplabJenkins
Github user AmplabJenkins commented on the issue: https://github.com/apache/spark/pull/16386 Merged build finished. Test PASSed.

[GitHub] spark issue #16386: [SPARK-18352][SQL] Support parsing multiline json files

2017-02-16 Thread SparkQA
Github user SparkQA commented on the issue: https://github.com/apache/spark/pull/16386 **[Test build #73030 has finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/73030/testReport)** for PR 16386 at commit

[GitHub] spark issue #16386: [SPARK-18352][SQL] Support parsing multiline json files

2017-02-16 Thread AmplabJenkins
Github user AmplabJenkins commented on the issue: https://github.com/apache/spark/pull/16386 Test PASSed. Refer to this link for build results (access rights to CI server needed): https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/73029/ Test PASSed.

[GitHub] spark issue #16386: [SPARK-18352][SQL] Support parsing multiline json files

2017-02-16 Thread AmplabJenkins
Github user AmplabJenkins commented on the issue: https://github.com/apache/spark/pull/16386 Merged build finished. Test PASSed.

[GitHub] spark issue #16386: [SPARK-18352][SQL] Support parsing multiline json files

2017-02-16 Thread SparkQA
Github user SparkQA commented on the issue: https://github.com/apache/spark/pull/16386 **[Test build #73029 has finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/73029/testReport)** for PR 16386 at commit

[GitHub] spark issue #16386: [SPARK-18352][SQL] Support parsing multiline json files

2017-02-16 Thread SparkQA
Github user SparkQA commented on the issue: https://github.com/apache/spark/pull/16386 **[Test build #73032 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/73032/testReport)** for PR 16386 at commit

[GitHub] spark issue #16386: [SPARK-18352][SQL] Support parsing multiline json files

2017-02-16 Thread SparkQA
Github user SparkQA commented on the issue: https://github.com/apache/spark/pull/16386 **[Test build #73030 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/73030/testReport)** for PR 16386 at commit

[GitHub] spark issue #16386: [SPARK-18352][SQL] Support parsing multiline json files

2017-02-16 Thread SparkQA
Github user SparkQA commented on the issue: https://github.com/apache/spark/pull/16386 **[Test build #73029 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/73029/testReport)** for PR 16386 at commit

[GitHub] spark issue #16386: [SPARK-18352][SQL] Support parsing multiline json files

2017-02-16 Thread NathanHowell
Github user NathanHowell commented on the issue: https://github.com/apache/spark/pull/16386 @cloud-fan When implementing tests for the other modes I've uncovered an existing bug in schema inference in `DROPMALFORMED` mode: https://issues.apache.org/jira/browse/SPARK-19641. Since it

[GitHub] spark issue #16386: [SPARK-18352][SQL] Support parsing multiline json files

2017-02-15 Thread cloud-fan
Github user cloud-fan commented on the issue: https://github.com/apache/spark/pull/16386 LGTM if the tests pass. It would be good if you could also address https://github.com/apache/spark/pull/16386#discussion_r100679183

[GitHub] spark issue #16386: [SPARK-18352][SQL] Support parsing multiline json files

2017-02-15 Thread AmplabJenkins
Github user AmplabJenkins commented on the issue: https://github.com/apache/spark/pull/16386 Merged build finished. Test FAILed.

[GitHub] spark issue #16386: [SPARK-18352][SQL] Support parsing multiline json files

2017-02-15 Thread SparkQA
Github user SparkQA commented on the issue: https://github.com/apache/spark/pull/16386 **[Test build #72975 has finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/72975/testReport)** for PR 16386 at commit

[GitHub] spark issue #16386: [SPARK-18352][SQL] Support parsing multiline json files

2017-02-15 Thread AmplabJenkins
Github user AmplabJenkins commented on the issue: https://github.com/apache/spark/pull/16386 Test FAILed. Refer to this link for build results (access rights to CI server needed): https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/72975/ Test FAILed.

[GitHub] spark issue #16386: [SPARK-18352][SQL] Support parsing multiline json files

2017-02-15 Thread SparkQA
Github user SparkQA commented on the issue: https://github.com/apache/spark/pull/16386 **[Test build #72975 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/72975/testReport)** for PR 16386 at commit

[GitHub] spark issue #16386: [SPARK-18352][SQL] Support parsing multiline json files

2017-02-10 Thread AmplabJenkins
Github user AmplabJenkins commented on the issue: https://github.com/apache/spark/pull/16386 Test FAILed. Refer to this link for build results (access rights to CI server needed): https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/72729/ Test FAILed.

[GitHub] spark issue #16386: [SPARK-18352][SQL] Support parsing multiline json files

2017-02-10 Thread AmplabJenkins
Github user AmplabJenkins commented on the issue: https://github.com/apache/spark/pull/16386 Merged build finished. Test FAILed.

[GitHub] spark issue #16386: [SPARK-18352][SQL] Support parsing multiline json files

2017-02-10 Thread SparkQA
Github user SparkQA commented on the issue: https://github.com/apache/spark/pull/16386 **[Test build #72729 has finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/72729/testReport)** for PR 16386 at commit

[GitHub] spark issue #16386: [SPARK-18352][SQL] Support parsing multiline json files

2017-02-10 Thread AmplabJenkins
Github user AmplabJenkins commented on the issue: https://github.com/apache/spark/pull/16386 Merged build finished. Test FAILed.

[GitHub] spark issue #16386: [SPARK-18352][SQL] Support parsing multiline json files

2017-02-10 Thread AmplabJenkins
Github user AmplabJenkins commented on the issue: https://github.com/apache/spark/pull/16386 Test FAILed. Refer to this link for build results (access rights to CI server needed): https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/72726/ Test FAILed.

[GitHub] spark issue #16386: [SPARK-18352][SQL] Support parsing multiline json files

2017-02-10 Thread SparkQA
Github user SparkQA commented on the issue: https://github.com/apache/spark/pull/16386 **[Test build #72726 has finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/72726/testReport)** for PR 16386 at commit

[GitHub] spark issue #16386: [SPARK-18352][SQL] Support parsing multiline json files

2017-02-10 Thread NathanHowell
Github user NathanHowell commented on the issue: https://github.com/apache/spark/pull/16386 @cloud-fan I just pushed a few more changes to address some of your comments. I'll be back later next week to continue work.

[GitHub] spark issue #16386: [SPARK-18352][SQL] Support parsing multiline json files

2017-02-10 Thread SparkQA
Github user SparkQA commented on the issue: https://github.com/apache/spark/pull/16386 **[Test build #72729 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/72729/testReport)** for PR 16386 at commit

[GitHub] spark issue #16386: [SPARK-18352][SQL] Support parsing multiline json files

2017-02-10 Thread SparkQA
Github user SparkQA commented on the issue: https://github.com/apache/spark/pull/16386 **[Test build #72726 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/72726/testReport)** for PR 16386 at commit

[GitHub] spark issue #16386: [SPARK-18352][SQL] Support parsing multiline json files

2017-02-10 Thread AmplabJenkins
Github user AmplabJenkins commented on the issue: https://github.com/apache/spark/pull/16386 Merged build finished. Test FAILed.

[GitHub] spark issue #16386: [SPARK-18352][SQL] Support parsing multiline json files

2017-02-10 Thread AmplabJenkins
Github user AmplabJenkins commented on the issue: https://github.com/apache/spark/pull/16386 Test FAILed. Refer to this link for build results (access rights to CI server needed): https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/72694/ Test FAILed.

[GitHub] spark issue #16386: [SPARK-18352][SQL] Support parsing multiline json files

2017-02-10 Thread SparkQA
Github user SparkQA commented on the issue: https://github.com/apache/spark/pull/16386 **[Test build #72694 has finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/72694/testReport)** for PR 16386 at commit

[GitHub] spark issue #16386: [SPARK-18352][SQL] Support parsing multiline json files

2017-02-10 Thread SparkQA
Github user SparkQA commented on the issue: https://github.com/apache/spark/pull/16386 **[Test build #72694 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/72694/testReport)** for PR 16386 at commit

[GitHub] spark issue #16386: [SPARK-18352][SQL] Support parsing multiline json files

2017-02-10 Thread NathanHowell
Github user NathanHowell commented on the issue: https://github.com/apache/spark/pull/16386 Jenkins, retest this please.

[GitHub] spark issue #16386: [SPARK-18352][SQL] Support parsing multiline json files

2017-02-10 Thread AmplabJenkins
Github user AmplabJenkins commented on the issue: https://github.com/apache/spark/pull/16386 Merged build finished. Test FAILed.

[GitHub] spark issue #16386: [SPARK-18352][SQL] Support parsing multiline json files

2017-02-10 Thread AmplabJenkins
Github user AmplabJenkins commented on the issue: https://github.com/apache/spark/pull/16386 Test FAILed. Refer to this link for build results (access rights to CI server needed): https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/72685/ Test FAILed.

[GitHub] spark issue #16386: [SPARK-18352][SQL] Support parsing multiline json files

2017-02-09 Thread SparkQA
Github user SparkQA commented on the issue: https://github.com/apache/spark/pull/16386 **[Test build #72685 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/72685/testReport)** for PR 16386 at commit

[GitHub] spark issue #16386: [SPARK-18352][SQL] Support parsing multiline json files

2017-02-08 Thread AmplabJenkins
Github user AmplabJenkins commented on the issue: https://github.com/apache/spark/pull/16386 Merged build finished. Test PASSed.

[GitHub] spark issue #16386: [SPARK-18352][SQL] Support parsing multiline json files

2017-02-08 Thread AmplabJenkins
Github user AmplabJenkins commented on the issue: https://github.com/apache/spark/pull/16386 Test PASSed. Refer to this link for build results (access rights to CI server needed): https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/72620/ Test PASSed.

[GitHub] spark issue #16386: [SPARK-18352][SQL] Support parsing multiline json files

2017-02-08 Thread SparkQA
Github user SparkQA commented on the issue: https://github.com/apache/spark/pull/16386 **[Test build #72620 has finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/72620/testReport)** for PR 16386 at commit

[GitHub] spark issue #16386: [SPARK-18352][SQL] Support parsing multiline json files

2017-02-08 Thread SparkQA
Github user SparkQA commented on the issue: https://github.com/apache/spark/pull/16386 **[Test build #72620 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/72620/testReport)** for PR 16386 at commit

[GitHub] spark issue #16386: [SPARK-18352][SQL] Support parsing multiline json files

2017-02-08 Thread AmplabJenkins
Github user AmplabJenkins commented on the issue: https://github.com/apache/spark/pull/16386 Merged build finished. Test PASSed.

[GitHub] spark issue #16386: [SPARK-18352][SQL] Support parsing multiline json files

2017-02-08 Thread AmplabJenkins
Github user AmplabJenkins commented on the issue: https://github.com/apache/spark/pull/16386 Test PASSed. Refer to this link for build results (access rights to CI server needed): https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/72603/ Test PASSed.

[GitHub] spark issue #16386: [SPARK-18352][SQL] Support parsing multiline json files

2017-02-08 Thread SparkQA
Github user SparkQA commented on the issue: https://github.com/apache/spark/pull/16386 **[Test build #72603 has finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/72603/testReport)** for PR 16386 at commit

[GitHub] spark issue #16386: [SPARK-18352][SQL] Support parsing multiline json files

2017-02-08 Thread SparkQA
Github user SparkQA commented on the issue: https://github.com/apache/spark/pull/16386 **[Test build #72603 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/72603/testReport)** for PR 16386 at commit

[GitHub] spark issue #16386: [SPARK-18352][SQL] Support parsing multiline json files

2017-02-08 Thread NathanHowell
Github user NathanHowell commented on the issue: https://github.com/apache/spark/pull/16386 Rebased again to pick up the build break hotfix in c618ccdbe9ac103dfa3182346e2a14a1e7fca91a

[GitHub] spark issue #16386: [SPARK-18352][SQL] Support parsing multiline json files

2017-02-08 Thread AmplabJenkins
Github user AmplabJenkins commented on the issue: https://github.com/apache/spark/pull/16386 Merged build finished. Test FAILed.

[GitHub] spark issue #16386: [SPARK-18352][SQL] Support parsing multiline json files

2017-02-08 Thread AmplabJenkins
Github user AmplabJenkins commented on the issue: https://github.com/apache/spark/pull/16386 Test FAILed. Refer to this link for build results (access rights to CI server needed): https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/72601/ Test FAILed.

[GitHub] spark issue #16386: [SPARK-18352][SQL] Support parsing multiline json files

2017-02-08 Thread SparkQA
Github user SparkQA commented on the issue: https://github.com/apache/spark/pull/16386 **[Test build #72601 has finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/72601/testReport)** for PR 16386 at commit

[GitHub] spark issue #16386: [SPARK-18352][SQL] Support parsing multiline json files

2017-02-08 Thread NathanHowell
Github user NathanHowell commented on the issue: https://github.com/apache/spark/pull/16386 I rebased to master and hopefully addressed all of your comments @cloud-fan, please have another look.

[GitHub] spark issue #16386: [SPARK-18352][SQL] Support parsing multiline json files

2017-02-08 Thread SparkQA
Github user SparkQA commented on the issue: https://github.com/apache/spark/pull/16386 **[Test build #72601 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/72601/testReport)** for PR 16386 at commit

[GitHub] spark issue #16386: [SPARK-18352][SQL] Support parsing multiline json files

2017-02-07 Thread cloud-fan
Github user cloud-fan commented on the issue: https://github.com/apache/spark/pull/16386 Hi @NathanHowell, do you have time to work on it? Thanks!

[GitHub] spark issue #16386: [SPARK-18352][SQL] Support parsing multiline json files

2017-02-06 Thread gatorsmile
Github user gatorsmile commented on the issue: https://github.com/apache/spark/pull/16386 Sorry, I missed the ping. Will review it tonight.

[GitHub] spark issue #16386: [SPARK-18352][SQL] Support parsing multiline json files

2017-02-06 Thread cloud-fan
Github user cloud-fan commented on the issue: https://github.com/apache/spark/pull/16386 Can we focus on supporting multiline JSON in this PR? We can leave the improvements to new PRs; otherwise this PR is kind of hard to review.

[GitHub] spark issue #16386: [SPARK-18352][SQL] Support parsing multiline json files

2017-02-01 Thread sameeragarwal
Github user sameeragarwal commented on the issue: https://github.com/apache/spark/pull/16386 cc @gatorsmile can you please take a look?

[GitHub] spark issue #16386: [SPARK-18352][SQL] Support parsing multiline json files

2017-01-23 Thread NathanHowell
Github user NathanHowell commented on the issue: https://github.com/apache/spark/pull/16386 Any other comments?

[GitHub] spark issue #16386: [SPARK-18352][SQL] Support parsing multiline json files

2017-01-10 Thread AmplabJenkins
Github user AmplabJenkins commented on the issue: https://github.com/apache/spark/pull/16386 Test PASSed. Refer to this link for build results (access rights to CI server needed): https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/71147/ Test PASSed.

[GitHub] spark issue #16386: [SPARK-18352][SQL] Support parsing multiline json files

2017-01-10 Thread AmplabJenkins
Github user AmplabJenkins commented on the issue: https://github.com/apache/spark/pull/16386 Merged build finished. Test PASSed.

[GitHub] spark issue #16386: [SPARK-18352][SQL] Support parsing multiline json files

2017-01-10 Thread SparkQA
Github user SparkQA commented on the issue: https://github.com/apache/spark/pull/16386 **[Test build #71147 has finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/71147/testReport)** for PR 16386 at commit

[GitHub] spark issue #16386: [SPARK-18352][SQL] Support parsing multiline json files

2017-01-10 Thread SparkQA
Github user SparkQA commented on the issue: https://github.com/apache/spark/pull/16386 **[Test build #71147 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/71147/testReport)** for PR 16386 at commit

[GitHub] spark issue #16386: [SPARK-18352][SQL] Support parsing multiline json files

2017-01-10 Thread NathanHowell
Github user NathanHowell commented on the issue: https://github.com/apache/spark/pull/16386 Jenkins, retest this please

[GitHub] spark issue #16386: [SPARK-18352][SQL] Support parsing multiline json files

2017-01-10 Thread NathanHowell
Github user NathanHowell commented on the issue: https://github.com/apache/spark/pull/16386 Can someone kick off the tests again? The last failure was in another module (Kafka).

[GitHub] spark issue #16386: [SPARK-18352][SQL] Support parsing multiline json files

2016-12-29 Thread AmplabJenkins
Github user AmplabJenkins commented on the issue: https://github.com/apache/spark/pull/16386 Merged build finished. Test FAILed.

[GitHub] spark issue #16386: [SPARK-18352][SQL] Support parsing multiline json files

2016-12-29 Thread AmplabJenkins
Github user AmplabJenkins commented on the issue: https://github.com/apache/spark/pull/16386 Test FAILed. Refer to this link for build results (access rights to CI server needed): https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/70730/ Test FAILed.

[GitHub] spark issue #16386: [SPARK-18352][SQL] Support parsing multiline json files

2016-12-29 Thread SparkQA
Github user SparkQA commented on the issue: https://github.com/apache/spark/pull/16386 **[Test build #70730 has finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/70730/testReport)** for PR 16386 at commit

[GitHub] spark issue #16386: [SPARK-18352][SQL] Support parsing multiline json files

2016-12-29 Thread NathanHowell
Github user NathanHowell commented on the issue: https://github.com/apache/spark/pull/16386 @HyukjinKwon I just pushed a change that makes the corrupt record handling consistent: if a corrupt record column is defined it will always get the json text for failed records. If `wholeFile`
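One way to picture the behaviour described in this comment: a record that fails to parse still produces a row, and that row's corrupt record column holds the raw JSON text. The sketch below is plain Python rather than Spark code; `parse_record` is a hypothetical helper, and the column name `_corrupt_record` is an assumption made for illustration.

```python
import json


def parse_record(text, corrupt_column="_corrupt_record"):
    """Parse one JSON document.

    On success the parsed value is returned unchanged. On failure a
    single-column row is returned whose `corrupt_column` entry keeps the
    original text, so the failed input is not lost.
    """
    try:
        return json.loads(text)
    except json.JSONDecodeError:
        return {corrupt_column: text}


good = parse_record('{"a": 1}')
bad = parse_record('{"a": 1')  # missing closing brace
```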

[GitHub] spark issue #16386: [SPARK-18352][SQL] Support parsing multiline json files

2016-12-29 Thread SparkQA
Github user SparkQA commented on the issue: https://github.com/apache/spark/pull/16386 **[Test build #70730 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/70730/testReport)** for PR 16386 at commit

[GitHub] spark issue #16386: [SPARK-18352][SQL] Support parsing multiline json files

2016-12-27 Thread HyukjinKwon
Github user HyukjinKwon commented on the issue: https://github.com/apache/spark/pull/16386 Only regarding the comment, https://github.com/apache/spark/pull/16386#issuecomment-269386229, I have a similar (rather combined) idea that we provide another option such as corrupt file name

[GitHub] spark issue #16386: [SPARK-18352][SQL] Support parsing multiline json files

2016-12-27 Thread NathanHowell
Github user NathanHowell commented on the issue: https://github.com/apache/spark/pull/16386 The tests failed for an unrelated reason; it looks like SBT is running out of heap space somewhere.

[GitHub] spark issue #16386: [SPARK-18352][SQL] Support parsing multiline json files

2016-12-27 Thread NathanHowell
Github user NathanHowell commented on the issue: https://github.com/apache/spark/pull/16386 @HyukjinKwon I agree that overloading the corrupt record column is undesirable and `F.input_file_name` is a better way to fetch the filename. It would be nice to extend this concept further

[GitHub] spark issue #16386: [SPARK-18352][SQL] Support parsing multiline json files

2016-12-27 Thread AmplabJenkins
Github user AmplabJenkins commented on the issue: https://github.com/apache/spark/pull/16386 Test FAILed. Refer to this link for build results (access rights to CI server needed): https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/70644/ Test FAILed.

[GitHub] spark issue #16386: [SPARK-18352][SQL] Support parsing multiline json files

2016-12-27 Thread AmplabJenkins
Github user AmplabJenkins commented on the issue: https://github.com/apache/spark/pull/16386 Merged build finished. Test FAILed.

[GitHub] spark issue #16386: [SPARK-18352][SQL] Support parsing multiline json files

2016-12-27 Thread SparkQA
Github user SparkQA commented on the issue: https://github.com/apache/spark/pull/16386 **[Test build #70644 has finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/70644/testReport)** for PR 16386 at commit

[GitHub] spark issue #16386: [SPARK-18352][SQL] Support parsing multiline json files

2016-12-27 Thread SparkQA
Github user SparkQA commented on the issue: https://github.com/apache/spark/pull/16386 **[Test build #70644 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/70644/testReport)** for PR 16386 at commit

[GitHub] spark issue #16386: [SPARK-18352][SQL] Support parsing multiline json files

2016-12-23 Thread NathanHowell
Github user NathanHowell commented on the issue: https://github.com/apache/spark/pull/16386 @srowen It is functionally the same as what you're suggesting. The question is how (or if) it should be first class in the `DataFrameReader` API. If we agree that it should be exposed,

[GitHub] spark issue #16386: [SPARK-18352][SQL] Support parsing multiline json files

2016-12-23 Thread srowen
Github user srowen commented on the issue: https://github.com/apache/spark/pull/16386 Why do we need this at all? Just use `wholeTextFiles` and parse them as JSON.
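The difference between reading a file whole and reading it line by line can be shown without Spark. This is a minimal sketch in plain Python that does not call `wholeTextFiles`; the file name `record.json` and its contents are made up for illustration.

```python
import json
import os
import tempfile

# Hypothetical pretty-printed document: one JSON object spread over four lines.
pretty = '{\n  "a": 1,\n  "b": [1, 2]\n}\n'

with tempfile.TemporaryDirectory() as tmp:
    path = os.path.join(tmp, "record.json")
    with open(path, "w", encoding="utf-8") as f:
        f.write(pretty)

    # Whole-file approach: read the entire file and parse it as one document.
    with open(path, encoding="utf-8") as f:
        whole = json.loads(f.read())

    # Line-by-line approach: count the lines that fail to parse on their own.
    failed_lines = 0
    with open(path, encoding="utf-8") as f:
        for line in f:
            try:
                json.loads(line)
            except json.JSONDecodeError:
                failed_lines += 1
```

None of the four lines is a valid JSON document by itself, which is why a line-oriented reader cannot handle this input.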

[GitHub] spark issue #16386: [SPARK-18352][SQL] Support parsing multiline json files

2016-12-22 Thread HyukjinKwon
Github user HyukjinKwon commented on the issue: https://github.com/apache/spark/pull/16386 > the corrupt column will contain the filename instead of the literal JSON if there is a parsing failure I am worried about changing the behaviour. I understand why it had to be here as

[GitHub] spark issue #16386: [SPARK-18352][SQL] Support parsing multiline json files

2016-12-22 Thread AmplabJenkins
Github user AmplabJenkins commented on the issue: https://github.com/apache/spark/pull/16386 Merged build finished. Test FAILed.

[GitHub] spark issue #16386: [SPARK-18352][SQL] Support parsing multiline json files

2016-12-22 Thread AmplabJenkins
Github user AmplabJenkins commented on the issue: https://github.com/apache/spark/pull/16386 Test FAILed. Refer to this link for build results (access rights to CI server needed): https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/70531/ Test FAILed.

[GitHub] spark issue #16386: [SPARK-18352][SQL] Support parsing multiline json files

2016-12-22 Thread SparkQA
Github user SparkQA commented on the issue: https://github.com/apache/spark/pull/16386 **[Test build #70531 has finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/70531/testReport)** for PR 16386 at commit

[GitHub] spark issue #16386: [SPARK-18352][SQL] Support parsing multiline json files

2016-12-22 Thread NathanHowell
Github user NathanHowell commented on the issue: https://github.com/apache/spark/pull/16386 Hello recent JacksonGenerator.scala committers, please take a look. cc/ @rxin @hvanhovell @clockfly @hyukjinkwon @cloud-fan

[GitHub] spark issue #16386: [SPARK-18352][SQL] Support parsing multiline json files

2016-12-22 Thread SparkQA
Github user SparkQA commented on the issue: https://github.com/apache/spark/pull/16386 **[Test build #70531 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/70531/testReport)** for PR 16386 at commit