[GitHub] spark issue #16386: [SPARK-18352][SQL] Support parsing multiline json files

2017-06-06 Thread NathanHowell
Github user NathanHowell commented on the issue:

https://github.com/apache/spark/pull/16386
  
Yep, should be doable without too much effort.

On Sun, Jun 4, 2017 at 9:54 PM, Xiao Li  wrote:

> @NathanHowell  It sounds like we also
> can provide multi-line support for JSON too. For example, in a single JSON
> file
>
> {"a": 1,
> "b": 1.1}
> {"a": 2, "b": 1.1}
> {"a": 3, "b": 1.1}
>
> When using the wholeFile mode, we only parse the first Json record {"a":
> 1, b": 1.1} but ignore the following records. It sounds like we should
> also parse them too and rename wholeFile to multiLine?
>
> —
> You are receiving this because you were mentioned.
> Reply to this email directly, view it on GitHub
> , or 
mute
> the thread
> 

> .
>



---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastruct...@apache.org or file a JIRA ticket
with INFRA.
---

-
To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org
For additional commands, e-mail: reviews-h...@spark.apache.org



[GitHub] spark issue #16386: [SPARK-18352][SQL] Support parsing multiline json files

2017-06-04 Thread gatorsmile
Github user gatorsmile commented on the issue:

https://github.com/apache/spark/pull/16386
  
@NathanHowell It sounds like we also can provide multi-line support for 
JSON too. For example, in a single JSON file
```
{"a": 1,
"b": 1.1}
{"a": 2, "b": 1.1}
{"a": 3, "b": 1.1}
```

When using the `wholeFile` mode, we only parse the first Json record `{"a": 
1, b": 1.1}` but ignore the following records. It sounds like we should also 
parse them too and rename `wholeFile` to `multiLine`?


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastruct...@apache.org or file a JIRA ticket
with INFRA.
---

-
To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org
For additional commands, e-mail: reviews-h...@spark.apache.org



[GitHub] spark issue #16386: [SPARK-18352][SQL] Support parsing multiline json files

2017-02-16 Thread cloud-fan
Github user cloud-fan commented on the issue:

https://github.com/apache/spark/pull/16386
  
thanks, merging to master!


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastruct...@apache.org or file a JIRA ticket
with INFRA.
---

-
To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org
For additional commands, e-mail: reviews-h...@spark.apache.org



[GitHub] spark issue #16386: [SPARK-18352][SQL] Support parsing multiline json files

2017-02-16 Thread AmplabJenkins
Github user AmplabJenkins commented on the issue:

https://github.com/apache/spark/pull/16386
  
Test PASSed.
Refer to this link for build results (access rights to CI server needed): 
https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/73032/
Test PASSed.


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastruct...@apache.org or file a JIRA ticket
with INFRA.
---

-
To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org
For additional commands, e-mail: reviews-h...@spark.apache.org



[GitHub] spark issue #16386: [SPARK-18352][SQL] Support parsing multiline json files

2017-02-16 Thread AmplabJenkins
Github user AmplabJenkins commented on the issue:

https://github.com/apache/spark/pull/16386
  
Merged build finished. Test PASSed.


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastruct...@apache.org or file a JIRA ticket
with INFRA.
---

-
To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org
For additional commands, e-mail: reviews-h...@spark.apache.org



[GitHub] spark issue #16386: [SPARK-18352][SQL] Support parsing multiline json files

2017-02-16 Thread SparkQA
Github user SparkQA commented on the issue:

https://github.com/apache/spark/pull/16386
  
**[Test build #73032 has 
finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/73032/testReport)**
 for PR 16386 at commit 
[`b801ab0`](https://github.com/apache/spark/commit/b801ab096c3bf426c7d2044291785359ace4922f).
 * This patch passes all tests.
 * This patch merges cleanly.
 * This patch adds no public classes.


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastruct...@apache.org or file a JIRA ticket
with INFRA.
---

-
To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org
For additional commands, e-mail: reviews-h...@spark.apache.org



[GitHub] spark issue #16386: [SPARK-18352][SQL] Support parsing multiline json files

2017-02-16 Thread AmplabJenkins
Github user AmplabJenkins commented on the issue:

https://github.com/apache/spark/pull/16386
  
Test PASSed.
Refer to this link for build results (access rights to CI server needed): 
https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/73030/
Test PASSed.


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastruct...@apache.org or file a JIRA ticket
with INFRA.
---

-
To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org
For additional commands, e-mail: reviews-h...@spark.apache.org



[GitHub] spark issue #16386: [SPARK-18352][SQL] Support parsing multiline json files

2017-02-16 Thread AmplabJenkins
Github user AmplabJenkins commented on the issue:

https://github.com/apache/spark/pull/16386
  
Merged build finished. Test PASSed.


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastruct...@apache.org or file a JIRA ticket
with INFRA.
---

-
To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org
For additional commands, e-mail: reviews-h...@spark.apache.org



[GitHub] spark issue #16386: [SPARK-18352][SQL] Support parsing multiline json files

2017-02-16 Thread SparkQA
Github user SparkQA commented on the issue:

https://github.com/apache/spark/pull/16386
  
**[Test build #73030 has 
finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/73030/testReport)**
 for PR 16386 at commit 
[`58118f2`](https://github.com/apache/spark/commit/58118f2762a3abfb5ca82c156eabb9622844b764).
 * This patch passes all tests.
 * This patch merges cleanly.
 * This patch adds no public classes.


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastruct...@apache.org or file a JIRA ticket
with INFRA.
---

-
To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org
For additional commands, e-mail: reviews-h...@spark.apache.org



[GitHub] spark issue #16386: [SPARK-18352][SQL] Support parsing multiline json files

2017-02-16 Thread AmplabJenkins
Github user AmplabJenkins commented on the issue:

https://github.com/apache/spark/pull/16386
  
Test PASSed.
Refer to this link for build results (access rights to CI server needed): 
https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/73029/
Test PASSed.


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastruct...@apache.org or file a JIRA ticket
with INFRA.
---

-
To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org
For additional commands, e-mail: reviews-h...@spark.apache.org



[GitHub] spark issue #16386: [SPARK-18352][SQL] Support parsing multiline json files

2017-02-16 Thread AmplabJenkins
Github user AmplabJenkins commented on the issue:

https://github.com/apache/spark/pull/16386
  
Merged build finished. Test PASSed.


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastruct...@apache.org or file a JIRA ticket
with INFRA.
---

-
To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org
For additional commands, e-mail: reviews-h...@spark.apache.org



[GitHub] spark issue #16386: [SPARK-18352][SQL] Support parsing multiline json files

2017-02-16 Thread SparkQA
Github user SparkQA commented on the issue:

https://github.com/apache/spark/pull/16386
  
**[Test build #73029 has 
finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/73029/testReport)**
 for PR 16386 at commit 
[`e323317`](https://github.com/apache/spark/commit/e32331794a63e8fcfe60a0901672e69dcfd6fe15).
 * This patch passes all tests.
 * This patch merges cleanly.
 * This patch adds no public classes.


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastruct...@apache.org or file a JIRA ticket
with INFRA.
---

-
To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org
For additional commands, e-mail: reviews-h...@spark.apache.org



[GitHub] spark issue #16386: [SPARK-18352][SQL] Support parsing multiline json files

2017-02-16 Thread SparkQA
Github user SparkQA commented on the issue:

https://github.com/apache/spark/pull/16386
  
**[Test build #73032 has 
started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/73032/testReport)**
 for PR 16386 at commit 
[`b801ab0`](https://github.com/apache/spark/commit/b801ab096c3bf426c7d2044291785359ace4922f).


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastruct...@apache.org or file a JIRA ticket
with INFRA.
---

-
To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org
For additional commands, e-mail: reviews-h...@spark.apache.org



[GitHub] spark issue #16386: [SPARK-18352][SQL] Support parsing multiline json files

2017-02-16 Thread SparkQA
Github user SparkQA commented on the issue:

https://github.com/apache/spark/pull/16386
  
**[Test build #73030 has 
started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/73030/testReport)**
 for PR 16386 at commit 
[`58118f2`](https://github.com/apache/spark/commit/58118f2762a3abfb5ca82c156eabb9622844b764).


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastruct...@apache.org or file a JIRA ticket
with INFRA.
---

-
To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org
For additional commands, e-mail: reviews-h...@spark.apache.org



[GitHub] spark issue #16386: [SPARK-18352][SQL] Support parsing multiline json files

2017-02-16 Thread SparkQA
Github user SparkQA commented on the issue:

https://github.com/apache/spark/pull/16386
  
**[Test build #73029 has 
started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/73029/testReport)**
 for PR 16386 at commit 
[`e323317`](https://github.com/apache/spark/commit/e32331794a63e8fcfe60a0901672e69dcfd6fe15).


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastruct...@apache.org or file a JIRA ticket
with INFRA.
---

-
To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org
For additional commands, e-mail: reviews-h...@spark.apache.org



[GitHub] spark issue #16386: [SPARK-18352][SQL] Support parsing multiline json files

2017-02-16 Thread NathanHowell
Github user NathanHowell commented on the issue:

https://github.com/apache/spark/pull/16386
  
@cloud-fan When implementing tests for the other modes I've uncovered an 
existing bug in schema inference in `DROPMALFORMED` mode: 
https://issues.apache.org/jira/browse/SPARK-19641. Since it is not introduced 
in this set of patches I will open a new pull request once this is one merged. 
You can inspect the fix here: 
https://github.com/NathanHowell/spark/commit/e233fd03346a73b3b447fa4c24f3b12c8b2e53ae


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastruct...@apache.org or file a JIRA ticket
with INFRA.
---

-
To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org
For additional commands, e-mail: reviews-h...@spark.apache.org



[GitHub] spark issue #16386: [SPARK-18352][SQL] Support parsing multiline json files

2017-02-15 Thread cloud-fan
Github user cloud-fan commented on the issue:

https://github.com/apache/spark/pull/16386
  
LGTM if the test can pass. It will be good if you can also address 
https://github.com/apache/spark/pull/16386#discussion_r100679183


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastruct...@apache.org or file a JIRA ticket
with INFRA.
---

-
To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org
For additional commands, e-mail: reviews-h...@spark.apache.org



[GitHub] spark issue #16386: [SPARK-18352][SQL] Support parsing multiline json files

2017-02-15 Thread AmplabJenkins
Github user AmplabJenkins commented on the issue:

https://github.com/apache/spark/pull/16386
  
Merged build finished. Test FAILed.


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastruct...@apache.org or file a JIRA ticket
with INFRA.
---

-
To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org
For additional commands, e-mail: reviews-h...@spark.apache.org



[GitHub] spark issue #16386: [SPARK-18352][SQL] Support parsing multiline json files

2017-02-15 Thread SparkQA
Github user SparkQA commented on the issue:

https://github.com/apache/spark/pull/16386
  
**[Test build #72975 has 
finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/72975/testReport)**
 for PR 16386 at commit 
[`7296f7e`](https://github.com/apache/spark/commit/7296f7e705c0087fd2a03d35a8eb0d85ee22b597).
 * This patch **fails Python style tests**.
 * This patch merges cleanly.
 * This patch adds no public classes.


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastruct...@apache.org or file a JIRA ticket
with INFRA.
---

-
To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org
For additional commands, e-mail: reviews-h...@spark.apache.org



[GitHub] spark issue #16386: [SPARK-18352][SQL] Support parsing multiline json files

2017-02-15 Thread AmplabJenkins
Github user AmplabJenkins commented on the issue:

https://github.com/apache/spark/pull/16386
  
Test FAILed.
Refer to this link for build results (access rights to CI server needed): 
https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/72975/
Test FAILed.


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastruct...@apache.org or file a JIRA ticket
with INFRA.
---

-
To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org
For additional commands, e-mail: reviews-h...@spark.apache.org



[GitHub] spark issue #16386: [SPARK-18352][SQL] Support parsing multiline json files

2017-02-15 Thread SparkQA
Github user SparkQA commented on the issue:

https://github.com/apache/spark/pull/16386
  
**[Test build #72975 has 
started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/72975/testReport)**
 for PR 16386 at commit 
[`7296f7e`](https://github.com/apache/spark/commit/7296f7e705c0087fd2a03d35a8eb0d85ee22b597).


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastruct...@apache.org or file a JIRA ticket
with INFRA.
---

-
To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org
For additional commands, e-mail: reviews-h...@spark.apache.org



[GitHub] spark issue #16386: [SPARK-18352][SQL] Support parsing multiline json files

2017-02-10 Thread AmplabJenkins
Github user AmplabJenkins commented on the issue:

https://github.com/apache/spark/pull/16386
  
Test FAILed.
Refer to this link for build results (access rights to CI server needed): 
https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/72729/
Test FAILed.


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastruct...@apache.org or file a JIRA ticket
with INFRA.
---

-
To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org
For additional commands, e-mail: reviews-h...@spark.apache.org



[GitHub] spark issue #16386: [SPARK-18352][SQL] Support parsing multiline json files

2017-02-10 Thread AmplabJenkins
Github user AmplabJenkins commented on the issue:

https://github.com/apache/spark/pull/16386
  
Merged build finished. Test FAILed.


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastruct...@apache.org or file a JIRA ticket
with INFRA.
---

-
To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org
For additional commands, e-mail: reviews-h...@spark.apache.org



[GitHub] spark issue #16386: [SPARK-18352][SQL] Support parsing multiline json files

2017-02-10 Thread SparkQA
Github user SparkQA commented on the issue:

https://github.com/apache/spark/pull/16386
  
**[Test build #72729 has 
finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/72729/testReport)**
 for PR 16386 at commit 
[`b0a5cc8`](https://github.com/apache/spark/commit/b0a5cc8f0c6475b4fa603036f11a9c1248295b40).
 * This patch **fails Spark unit tests**.
 * This patch merges cleanly.
 * This patch adds no public classes.


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastruct...@apache.org or file a JIRA ticket
with INFRA.
---

-
To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org
For additional commands, e-mail: reviews-h...@spark.apache.org



[GitHub] spark issue #16386: [SPARK-18352][SQL] Support parsing multiline json files

2017-02-10 Thread AmplabJenkins
Github user AmplabJenkins commented on the issue:

https://github.com/apache/spark/pull/16386
  
Merged build finished. Test FAILed.


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastruct...@apache.org or file a JIRA ticket
with INFRA.
---

-
To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org
For additional commands, e-mail: reviews-h...@spark.apache.org



[GitHub] spark issue #16386: [SPARK-18352][SQL] Support parsing multiline json files

2017-02-10 Thread AmplabJenkins
Github user AmplabJenkins commented on the issue:

https://github.com/apache/spark/pull/16386
  
Test FAILed.
Refer to this link for build results (access rights to CI server needed): 
https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/72726/
Test FAILed.


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastruct...@apache.org or file a JIRA ticket
with INFRA.
---

-
To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org
For additional commands, e-mail: reviews-h...@spark.apache.org



[GitHub] spark issue #16386: [SPARK-18352][SQL] Support parsing multiline json files

2017-02-10 Thread SparkQA
Github user SparkQA commented on the issue:

https://github.com/apache/spark/pull/16386
  
**[Test build #72726 has 
finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/72726/testReport)**
 for PR 16386 at commit 
[`0f4686d`](https://github.com/apache/spark/commit/0f4686d39890e84dd8542297eeb688746f80ee55).
 * This patch **fails Spark unit tests**.
 * This patch merges cleanly.
 * This patch adds no public classes.


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastruct...@apache.org or file a JIRA ticket
with INFRA.
---

-
To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org
For additional commands, e-mail: reviews-h...@spark.apache.org



[GitHub] spark issue #16386: [SPARK-18352][SQL] Support parsing multiline json files

2017-02-10 Thread NathanHowell
Github user NathanHowell commented on the issue:

https://github.com/apache/spark/pull/16386
  
@cloud-fan I just pushed a few more changes to address some of your 
comments. I'll be back later next week to continue work.


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastruct...@apache.org or file a JIRA ticket
with INFRA.
---

-
To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org
For additional commands, e-mail: reviews-h...@spark.apache.org



[GitHub] spark issue #16386: [SPARK-18352][SQL] Support parsing multiline json files

2017-02-10 Thread SparkQA
Github user SparkQA commented on the issue:

https://github.com/apache/spark/pull/16386
  
**[Test build #72729 has 
started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/72729/testReport)**
 for PR 16386 at commit 
[`b0a5cc8`](https://github.com/apache/spark/commit/b0a5cc8f0c6475b4fa603036f11a9c1248295b40).


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastruct...@apache.org or file a JIRA ticket
with INFRA.
---

-
To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org
For additional commands, e-mail: reviews-h...@spark.apache.org



[GitHub] spark issue #16386: [SPARK-18352][SQL] Support parsing multiline json files

2017-02-10 Thread SparkQA
Github user SparkQA commented on the issue:

https://github.com/apache/spark/pull/16386
  
**[Test build #72726 has 
started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/72726/testReport)**
 for PR 16386 at commit 
[`0f4686d`](https://github.com/apache/spark/commit/0f4686d39890e84dd8542297eeb688746f80ee55).


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastruct...@apache.org or file a JIRA ticket
with INFRA.
---

-
To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org
For additional commands, e-mail: reviews-h...@spark.apache.org



[GitHub] spark issue #16386: [SPARK-18352][SQL] Support parsing multiline json files

2017-02-10 Thread AmplabJenkins
Github user AmplabJenkins commented on the issue:

https://github.com/apache/spark/pull/16386
  
Merged build finished. Test FAILed.


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastruct...@apache.org or file a JIRA ticket
with INFRA.
---

-
To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org
For additional commands, e-mail: reviews-h...@spark.apache.org



[GitHub] spark issue #16386: [SPARK-18352][SQL] Support parsing multiline json files

2017-02-10 Thread AmplabJenkins
Github user AmplabJenkins commented on the issue:

https://github.com/apache/spark/pull/16386
  
Test FAILed.
Refer to this link for build results (access rights to CI server needed): 
https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/72694/
Test FAILed.


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastruct...@apache.org or file a JIRA ticket
with INFRA.
---

-
To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org
For additional commands, e-mail: reviews-h...@spark.apache.org



[GitHub] spark issue #16386: [SPARK-18352][SQL] Support parsing multiline json files

2017-02-10 Thread SparkQA
Github user SparkQA commented on the issue:

https://github.com/apache/spark/pull/16386
  
**[Test build #72694 has 
finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/72694/testReport)**
 for PR 16386 at commit 
[`aafe7bd`](https://github.com/apache/spark/commit/aafe7bded6e614dddaed74d15cecfb8b1a78a639).
 * This patch **fails Spark unit tests**.
 * This patch merges cleanly.
 * This patch adds no public classes.


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastruct...@apache.org or file a JIRA ticket
with INFRA.
---

-
To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org
For additional commands, e-mail: reviews-h...@spark.apache.org



[GitHub] spark issue #16386: [SPARK-18352][SQL] Support parsing multiline json files

2017-02-10 Thread SparkQA
Github user SparkQA commented on the issue:

https://github.com/apache/spark/pull/16386
  
**[Test build #72694 has 
started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/72694/testReport)**
 for PR 16386 at commit 
[`aafe7bd`](https://github.com/apache/spark/commit/aafe7bded6e614dddaed74d15cecfb8b1a78a639).


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastruct...@apache.org or file a JIRA ticket
with INFRA.
---

-
To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org
For additional commands, e-mail: reviews-h...@spark.apache.org



[GitHub] spark issue #16386: [SPARK-18352][SQL] Support parsing multiline json files

2017-02-10 Thread NathanHowell
Github user NathanHowell commented on the issue:

https://github.com/apache/spark/pull/16386
  
Jenkins, retest this please.


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastruct...@apache.org or file a JIRA ticket
with INFRA.
---

-
To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org
For additional commands, e-mail: reviews-h...@spark.apache.org



[GitHub] spark issue #16386: [SPARK-18352][SQL] Support parsing multiline json files

2017-02-10 Thread AmplabJenkins
Github user AmplabJenkins commented on the issue:

https://github.com/apache/spark/pull/16386
  
Merged build finished. Test FAILed.


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastruct...@apache.org or file a JIRA ticket
with INFRA.
---

-
To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org
For additional commands, e-mail: reviews-h...@spark.apache.org



[GitHub] spark issue #16386: [SPARK-18352][SQL] Support parsing multiline json files

2017-02-10 Thread AmplabJenkins
Github user AmplabJenkins commented on the issue:

https://github.com/apache/spark/pull/16386
  
Test FAILed.
Refer to this link for build results (access rights to CI server needed): 
https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/72685/
Test FAILed.


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastruct...@apache.org or file a JIRA ticket
with INFRA.
---

-
To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org
For additional commands, e-mail: reviews-h...@spark.apache.org



[GitHub] spark issue #16386: [SPARK-18352][SQL] Support parsing multiline json files

2017-02-09 Thread SparkQA
Github user SparkQA commented on the issue:

https://github.com/apache/spark/pull/16386
  
**[Test build #72685 has 
started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/72685/testReport)**
 for PR 16386 at commit 
[`aafe7bd`](https://github.com/apache/spark/commit/aafe7bded6e614dddaed74d15cecfb8b1a78a639).


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastruct...@apache.org or file a JIRA ticket
with INFRA.
---

-
To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org
For additional commands, e-mail: reviews-h...@spark.apache.org



[GitHub] spark issue #16386: [SPARK-18352][SQL] Support parsing multiline json files

2017-02-08 Thread AmplabJenkins
Github user AmplabJenkins commented on the issue:

https://github.com/apache/spark/pull/16386
  
Merged build finished. Test PASSed.


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastruct...@apache.org or file a JIRA ticket
with INFRA.
---

-
To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org
For additional commands, e-mail: reviews-h...@spark.apache.org



[GitHub] spark issue #16386: [SPARK-18352][SQL] Support parsing multiline json files

2017-02-08 Thread AmplabJenkins
Github user AmplabJenkins commented on the issue:

https://github.com/apache/spark/pull/16386
  
Test PASSed.
Refer to this link for build results (access rights to CI server needed): 
https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/72620/
Test PASSed.


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastruct...@apache.org or file a JIRA ticket
with INFRA.
---

-
To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org
For additional commands, e-mail: reviews-h...@spark.apache.org



[GitHub] spark issue #16386: [SPARK-18352][SQL] Support parsing multiline json files

2017-02-08 Thread SparkQA
Github user SparkQA commented on the issue:

https://github.com/apache/spark/pull/16386
  
**[Test build #72620 has 
finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/72620/testReport)**
 for PR 16386 at commit 
[`f71a465`](https://github.com/apache/spark/commit/f71a465cf07fb9c043b2ccd86fa57e8e8ea9dc00).
 * This patch passes all tests.
 * This patch merges cleanly.
 * This patch adds no public classes.


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastruct...@apache.org or file a JIRA ticket
with INFRA.
---

-
To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org
For additional commands, e-mail: reviews-h...@spark.apache.org



[GitHub] spark issue #16386: [SPARK-18352][SQL] Support parsing multiline json files

2017-02-08 Thread SparkQA
Github user SparkQA commented on the issue:

https://github.com/apache/spark/pull/16386
  
**[Test build #72620 has 
started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/72620/testReport)**
 for PR 16386 at commit 
[`f71a465`](https://github.com/apache/spark/commit/f71a465cf07fb9c043b2ccd86fa57e8e8ea9dc00).


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastruct...@apache.org or file a JIRA ticket
with INFRA.
---

-
To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org
For additional commands, e-mail: reviews-h...@spark.apache.org



[GitHub] spark issue #16386: [SPARK-18352][SQL] Support parsing multiline json files

2017-02-08 Thread AmplabJenkins
Github user AmplabJenkins commented on the issue:

https://github.com/apache/spark/pull/16386
  
Merged build finished. Test PASSed.


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastruct...@apache.org or file a JIRA ticket
with INFRA.
---

-
To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org
For additional commands, e-mail: reviews-h...@spark.apache.org



[GitHub] spark issue #16386: [SPARK-18352][SQL] Support parsing multiline json files

2017-02-08 Thread AmplabJenkins
Github user AmplabJenkins commented on the issue:

https://github.com/apache/spark/pull/16386
  
Test PASSed.
Refer to this link for build results (access rights to CI server needed): 
https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/72603/
Test PASSed.


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastruct...@apache.org or file a JIRA ticket
with INFRA.
---

-
To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org
For additional commands, e-mail: reviews-h...@spark.apache.org



[GitHub] spark issue #16386: [SPARK-18352][SQL] Support parsing multiline json files

2017-02-08 Thread SparkQA
Github user SparkQA commented on the issue:

https://github.com/apache/spark/pull/16386
  
**[Test build #72603 has 
finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/72603/testReport)**
 for PR 16386 at commit 
[`30fb509`](https://github.com/apache/spark/commit/30fb509c71ed1f919e0198f2366aa817d96fc0ca).
 * This patch passes all tests.
 * This patch merges cleanly.
 * This patch adds no public classes.


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastruct...@apache.org or file a JIRA ticket
with INFRA.
---

-
To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org
For additional commands, e-mail: reviews-h...@spark.apache.org



[GitHub] spark issue #16386: [SPARK-18352][SQL] Support parsing multiline json files

2017-02-08 Thread SparkQA
Github user SparkQA commented on the issue:

https://github.com/apache/spark/pull/16386
  
**[Test build #72603 has 
started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/72603/testReport)**
 for PR 16386 at commit 
[`30fb509`](https://github.com/apache/spark/commit/30fb509c71ed1f919e0198f2366aa817d96fc0ca).


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastruct...@apache.org or file a JIRA ticket
with INFRA.
---

-
To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org
For additional commands, e-mail: reviews-h...@spark.apache.org



[GitHub] spark issue #16386: [SPARK-18352][SQL] Support parsing multiline json files

2017-02-08 Thread NathanHowell
Github user NathanHowell commented on the issue:

https://github.com/apache/spark/pull/16386
  
Rebased again to pickup the build break hotfix in 
c618ccdbe9ac103dfa3182346e2a14a1e7fca91a


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastruct...@apache.org or file a JIRA ticket
with INFRA.
---

-
To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org
For additional commands, e-mail: reviews-h...@spark.apache.org



[GitHub] spark issue #16386: [SPARK-18352][SQL] Support parsing multiline json files

2017-02-08 Thread AmplabJenkins
Github user AmplabJenkins commented on the issue:

https://github.com/apache/spark/pull/16386
  
Merged build finished. Test FAILed.


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastruct...@apache.org or file a JIRA ticket
with INFRA.
---

-
To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org
For additional commands, e-mail: reviews-h...@spark.apache.org



[GitHub] spark issue #16386: [SPARK-18352][SQL] Support parsing multiline json files

2017-02-08 Thread AmplabJenkins
Github user AmplabJenkins commented on the issue:

https://github.com/apache/spark/pull/16386
  
Test FAILed.
Refer to this link for build results (access rights to CI server needed): 
https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/72601/
Test FAILed.


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastruct...@apache.org or file a JIRA ticket
with INFRA.
---

-
To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org
For additional commands, e-mail: reviews-h...@spark.apache.org



[GitHub] spark issue #16386: [SPARK-18352][SQL] Support parsing multiline json files

2017-02-08 Thread SparkQA
Github user SparkQA commented on the issue:

https://github.com/apache/spark/pull/16386
  
**[Test build #72601 has 
finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/72601/testReport)**
 for PR 16386 at commit 
[`6f8b0c3`](https://github.com/apache/spark/commit/6f8b0c3249729aeecc5cf1275e29afe535cc7238).
 * This patch **fails to build**.
 * This patch merges cleanly.
 * This patch adds no public classes.


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastruct...@apache.org or file a JIRA ticket
with INFRA.
---

-
To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org
For additional commands, e-mail: reviews-h...@spark.apache.org



[GitHub] spark issue #16386: [SPARK-18352][SQL] Support parsing multiline json files

2017-02-08 Thread NathanHowell
Github user NathanHowell commented on the issue:

https://github.com/apache/spark/pull/16386
  
I rebased to master and hopefully addressed all of your comments 
@cloud-fan, please have another look.


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastruct...@apache.org or file a JIRA ticket
with INFRA.
---

-
To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org
For additional commands, e-mail: reviews-h...@spark.apache.org



[GitHub] spark issue #16386: [SPARK-18352][SQL] Support parsing multiline json files

2017-02-08 Thread SparkQA
Github user SparkQA commented on the issue:

https://github.com/apache/spark/pull/16386
  
**[Test build #72601 has 
started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/72601/testReport)**
 for PR 16386 at commit 
[`6f8b0c3`](https://github.com/apache/spark/commit/6f8b0c3249729aeecc5cf1275e29afe535cc7238).


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastruct...@apache.org or file a JIRA ticket
with INFRA.
---

-
To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org
For additional commands, e-mail: reviews-h...@spark.apache.org



[GitHub] spark issue #16386: [SPARK-18352][SQL] Support parsing multiline json files

2017-02-07 Thread cloud-fan
Github user cloud-fan commented on the issue:

https://github.com/apache/spark/pull/16386
  
Hi @NathanHowell , do you have time to work on it? thanks!


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastruct...@apache.org or file a JIRA ticket
with INFRA.
---

-
To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org
For additional commands, e-mail: reviews-h...@spark.apache.org



[GitHub] spark issue #16386: [SPARK-18352][SQL] Support parsing multiline json files

2017-02-06 Thread gatorsmile
Github user gatorsmile commented on the issue:

https://github.com/apache/spark/pull/16386
  
Sorry, I missed the ping. Will review it tonight. 


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastruct...@apache.org or file a JIRA ticket
with INFRA.
---

-
To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org
For additional commands, e-mail: reviews-h...@spark.apache.org



[GitHub] spark issue #16386: [SPARK-18352][SQL] Support parsing multiline json files

2017-02-06 Thread cloud-fan
Github user cloud-fan commented on the issue:

https://github.com/apache/spark/pull/16386
  
can we focus on supporting multiline json in this PR? We can leave the 
improvements in new PRs, or this PR is kind of hard to review.


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastruct...@apache.org or file a JIRA ticket
with INFRA.
---

-
To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org
For additional commands, e-mail: reviews-h...@spark.apache.org



[GitHub] spark issue #16386: [SPARK-18352][SQL] Support parsing multiline json files

2017-02-01 Thread sameeragarwal
Github user sameeragarwal commented on the issue:

https://github.com/apache/spark/pull/16386
  
cc @gatorsmile can you please take a look?


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastruct...@apache.org or file a JIRA ticket
with INFRA.
---

-
To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org
For additional commands, e-mail: reviews-h...@spark.apache.org



[GitHub] spark issue #16386: [SPARK-18352][SQL] Support parsing multiline json files

2017-01-23 Thread NathanHowell
Github user NathanHowell commented on the issue:

https://github.com/apache/spark/pull/16386
  
Any other comments?


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastruct...@apache.org or file a JIRA ticket
with INFRA.
---

-
To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org
For additional commands, e-mail: reviews-h...@spark.apache.org



[GitHub] spark issue #16386: [SPARK-18352][SQL] Support parsing multiline json files

2017-01-10 Thread AmplabJenkins
Github user AmplabJenkins commented on the issue:

https://github.com/apache/spark/pull/16386
  
Test PASSed.
Refer to this link for build results (access rights to CI server needed): 
https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/71147/
Test PASSed.


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastruct...@apache.org or file a JIRA ticket
with INFRA.
---

-
To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org
For additional commands, e-mail: reviews-h...@spark.apache.org



[GitHub] spark issue #16386: [SPARK-18352][SQL] Support parsing multiline json files

2017-01-10 Thread AmplabJenkins
Github user AmplabJenkins commented on the issue:

https://github.com/apache/spark/pull/16386
  
Merged build finished. Test PASSed.


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastruct...@apache.org or file a JIRA ticket
with INFRA.
---

-
To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org
For additional commands, e-mail: reviews-h...@spark.apache.org



[GitHub] spark issue #16386: [SPARK-18352][SQL] Support parsing multiline json files

2017-01-10 Thread SparkQA
Github user SparkQA commented on the issue:

https://github.com/apache/spark/pull/16386
  
**[Test build #71147 has 
finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/71147/testReport)**
 for PR 16386 at commit 
[`9dc084d`](https://github.com/apache/spark/commit/9dc084d5f0938cc8f5aad1233a99d37bf85957fd).
 * This patch passes all tests.
 * This patch merges cleanly.
 * This patch adds no public classes.


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastruct...@apache.org or file a JIRA ticket
with INFRA.
---

-
To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org
For additional commands, e-mail: reviews-h...@spark.apache.org



[GitHub] spark issue #16386: [SPARK-18352][SQL] Support parsing multiline json files

2017-01-10 Thread SparkQA
Github user SparkQA commented on the issue:

https://github.com/apache/spark/pull/16386
  
**[Test build #71147 has 
started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/71147/testReport)**
 for PR 16386 at commit 
[`9dc084d`](https://github.com/apache/spark/commit/9dc084d5f0938cc8f5aad1233a99d37bf85957fd).


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastruct...@apache.org or file a JIRA ticket
with INFRA.
---

-
To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org
For additional commands, e-mail: reviews-h...@spark.apache.org



[GitHub] spark issue #16386: [SPARK-18352][SQL] Support parsing multiline json files

2017-01-10 Thread NathanHowell
Github user NathanHowell commented on the issue:

https://github.com/apache/spark/pull/16386
  
Jenkins, retest this please


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastruct...@apache.org or file a JIRA ticket
with INFRA.
---

-
To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org
For additional commands, e-mail: reviews-h...@spark.apache.org



[GitHub] spark issue #16386: [SPARK-18352][SQL] Support parsing multiline json files

2017-01-10 Thread NathanHowell
Github user NathanHowell commented on the issue:

https://github.com/apache/spark/pull/16386
  
Can someone kick off the tests again? The last failure was in another 
module (Kafka).


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastruct...@apache.org or file a JIRA ticket
with INFRA.
---

-
To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org
For additional commands, e-mail: reviews-h...@spark.apache.org



[GitHub] spark issue #16386: [SPARK-18352][SQL] Support parsing multiline json files

2016-12-29 Thread AmplabJenkins
Github user AmplabJenkins commented on the issue:

https://github.com/apache/spark/pull/16386
  
Merged build finished. Test FAILed.


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastruct...@apache.org or file a JIRA ticket
with INFRA.
---

-
To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org
For additional commands, e-mail: reviews-h...@spark.apache.org



[GitHub] spark issue #16386: [SPARK-18352][SQL] Support parsing multiline json files

2016-12-29 Thread AmplabJenkins
Github user AmplabJenkins commented on the issue:

https://github.com/apache/spark/pull/16386
  
Test FAILed.
Refer to this link for build results (access rights to CI server needed): 
https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/70730/
Test FAILed.


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastruct...@apache.org or file a JIRA ticket
with INFRA.
---

-
To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org
For additional commands, e-mail: reviews-h...@spark.apache.org



[GitHub] spark issue #16386: [SPARK-18352][SQL] Support parsing multiline json files

2016-12-29 Thread SparkQA
Github user SparkQA commented on the issue:

https://github.com/apache/spark/pull/16386
  
**[Test build #70730 has 
finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/70730/testReport)**
 for PR 16386 at commit 
[`9dc084d`](https://github.com/apache/spark/commit/9dc084d5f0938cc8f5aad1233a99d37bf85957fd).
 * This patch **fails Spark unit tests**.
 * This patch merges cleanly.
 * This patch adds no public classes.


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastruct...@apache.org or file a JIRA ticket
with INFRA.
---

-
To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org
For additional commands, e-mail: reviews-h...@spark.apache.org



[GitHub] spark issue #16386: [SPARK-18352][SQL] Support parsing multiline json files

2016-12-29 Thread NathanHowell
Github user NathanHowell commented on the issue:

https://github.com/apache/spark/pull/16386
  
@HyukjinKwon I just pushed a change that makes the corrupt record handling 
consistent: if a corrupt record column is defined it will always get the json 
text for failed records. If `wholeFile` is enabled a warning is emitted.

I think more discussion is needed to figure out the best way to handle 
corrupt records and exceptions, perhaps it can be shelved for now and we can 
pick it up later under another ticket?


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastruct...@apache.org or file a JIRA ticket
with INFRA.
---

-
To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org
For additional commands, e-mail: reviews-h...@spark.apache.org



[GitHub] spark issue #16386: [SPARK-18352][SQL] Support parsing multiline json files

2016-12-29 Thread SparkQA
Github user SparkQA commented on the issue:

https://github.com/apache/spark/pull/16386
  
**[Test build #70730 has 
started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/70730/testReport)**
 for PR 16386 at commit 
[`9dc084d`](https://github.com/apache/spark/commit/9dc084d5f0938cc8f5aad1233a99d37bf85957fd).


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastruct...@apache.org or file a JIRA ticket
with INFRA.
---

-
To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org
For additional commands, e-mail: reviews-h...@spark.apache.org



[GitHub] spark issue #16386: [SPARK-18352][SQL] Support parsing multiline json files

2016-12-27 Thread HyukjinKwon
Github user HyukjinKwon commented on the issue:

https://github.com/apache/spark/pull/16386
  
Only regarding the comment, 
https://github.com/apache/spark/pull/16386#issuecomment-269386229, I have a 
similar (rather combined) idea that we provide another option such as corrupt 
file name optionally (meaning maybe the column appears only when user 
explicitly set for backwards compatibility), don't add a column by 
`columnNameOfCorruptRecord` with a proper documentation in `wholeFile` mode and 
issue a warning message if `columnNameOfCorruptRecord` is set by user in 
`wholeFile` mode. This is a bit complicated idea that might make users confused 
though. I am not sure if it is the best idea.


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastruct...@apache.org or file a JIRA ticket
with INFRA.
---

-
To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org
For additional commands, e-mail: reviews-h...@spark.apache.org



[GitHub] spark issue #16386: [SPARK-18352][SQL] Support parsing multiline json files

2016-12-27 Thread NathanHowell
Github user NathanHowell commented on the issue:

https://github.com/apache/spark/pull/16386
  
The tests failed for an unrelated reason, looks to be running out of heap 
space in SBT somewhere.


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastruct...@apache.org or file a JIRA ticket
with INFRA.
---

-
To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org
For additional commands, e-mail: reviews-h...@spark.apache.org



[GitHub] spark issue #16386: [SPARK-18352][SQL] Support parsing multiline json files

2016-12-27 Thread NathanHowell
Github user NathanHowell commented on the issue:

https://github.com/apache/spark/pull/16386
  
@HyukjinKwon I agree that overloading the corrupt record column is 
undesirable and `F.input_file_name` is a better way to fetch the filename. It 
would be nice to extend this concept further and provide new functions (like 
`F.json_exception`) to retrieve exceptions and their locations, and this would 
work for the base case (parsing a string) as well as `wholeFile`. Plumbing this 
type of change through appears to require thread locale storage (unfortunately) 
but otherwise doesn't look too bad.

The question then is what to put in the corrupt record column, if one is 
defined, when in `wholeFile` mode. To retain consistency with the string paths 
we should really put the entire file in the column. This is problematic for 
large files (>2GB) since Spark SQL doesn't have blob support... so the 
allocations will fail (along with the task) and there is no way for the end 
user to work around this limitation. Functions like `substr` are applied to 
byte arrays and not file streams. Perhaps it's good enough to issue a warning 
(along the lines of "don't define a corrupt record column in wholeFile mode" 
and hope for the best?


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastruct...@apache.org or file a JIRA ticket
with INFRA.
---

-
To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org
For additional commands, e-mail: reviews-h...@spark.apache.org



[GitHub] spark issue #16386: [SPARK-18352][SQL] Support parsing multiline json files

2016-12-27 Thread AmplabJenkins
Github user AmplabJenkins commented on the issue:

https://github.com/apache/spark/pull/16386
  
Test FAILed.
Refer to this link for build results (access rights to CI server needed): 
https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/70644/
Test FAILed.


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastruct...@apache.org or file a JIRA ticket
with INFRA.
---

-
To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org
For additional commands, e-mail: reviews-h...@spark.apache.org



[GitHub] spark issue #16386: [SPARK-18352][SQL] Support parsing multiline json files

2016-12-27 Thread AmplabJenkins
Github user AmplabJenkins commented on the issue:

https://github.com/apache/spark/pull/16386
  
Merged build finished. Test FAILed.


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastruct...@apache.org or file a JIRA ticket
with INFRA.
---

-
To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org
For additional commands, e-mail: reviews-h...@spark.apache.org



[GitHub] spark issue #16386: [SPARK-18352][SQL] Support parsing multiline json files

2016-12-27 Thread SparkQA
Github user SparkQA commented on the issue:

https://github.com/apache/spark/pull/16386
  
**[Test build #70644 has 
finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/70644/testReport)**
 for PR 16386 at commit 
[`aa5a6db`](https://github.com/apache/spark/commit/aa5a6db716c935d81b72a7f11dcb0ead4be42c85).
 * This patch **fails Spark unit tests**.
 * This patch merges cleanly.
 * This patch adds no public classes.


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastruct...@apache.org or file a JIRA ticket
with INFRA.
---

-
To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org
For additional commands, e-mail: reviews-h...@spark.apache.org



[GitHub] spark issue #16386: [SPARK-18352][SQL] Support parsing multiline json files

2016-12-27 Thread SparkQA
Github user SparkQA commented on the issue:

https://github.com/apache/spark/pull/16386
  
**[Test build #70644 has 
started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/70644/testReport)**
 for PR 16386 at commit 
[`aa5a6db`](https://github.com/apache/spark/commit/aa5a6db716c935d81b72a7f11dcb0ead4be42c85).


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastruct...@apache.org or file a JIRA ticket
with INFRA.
---

-
To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org
For additional commands, e-mail: reviews-h...@spark.apache.org



[GitHub] spark issue #16386: [SPARK-18352][SQL] Support parsing multiline json files

2016-12-23 Thread NathanHowell
Github user NathanHowell commented on the issue:

https://github.com/apache/spark/pull/16386
  
@srowen It is functionally the same as what you're suggesting. The question 
is how (or if) it should it be first class in the `DataFrameReader` api. If we 
agree that it should be exposed, either via a new `FileFormat` or an option to 
`JsonFileFormat`, some abstraction is necessary to support reading from 
different RDD classes.

This PR just pushes that boundary a little further and let's the inference 
and parser code work over more types, not just `String`. This may make parsing 
more efficient in the line oriented codepath by avoiding a conversion from 
`Text` and `UTF8String` (in `JsonToStruct`) to `String`, and also lets us parse 
an `InputStream` without requiring all of the data to be in memory. For small 
files it's not likely to have a benefit (if the file is smaller than 4k it will 
be read entirely anyways) but as the file size increases this reduces the 
amount of memory required for parsing, is friendlier (in theory) on the GC and 
let's us consume files larger than 2GB.


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastruct...@apache.org or file a JIRA ticket
with INFRA.
---

-
To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org
For additional commands, e-mail: reviews-h...@spark.apache.org



[GitHub] spark issue #16386: [SPARK-18352][SQL] Support parsing multiline json files

2016-12-23 Thread srowen
Github user srowen commented on the issue:

https://github.com/apache/spark/pull/16386
  
Why do we need this at all? just use `wholeTextFiles` and parse them as 
JSON.


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastruct...@apache.org or file a JIRA ticket
with INFRA.
---

-
To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org
For additional commands, e-mail: reviews-h...@spark.apache.org



[GitHub] spark issue #16386: [SPARK-18352][SQL] Support parsing multiline json files

2016-12-22 Thread HyukjinKwon
Github user HyukjinKwon commented on the issue:

https://github.com/apache/spark/pull/16386
  
> the corrupt column will contain the filename instead of the literal JSON 
if there is a parsing failure

I am worried of changing the behaviour. I understand why it had to be here 
as you described in the description but we have `input_file_name` functions for 
these. I would not expect, at least, there are file names in `_corrupt_record`.

We need to document this around `spark.sql.columnNameOfCorruptRecord` in 
`SQLConf` and `columnNameOfCorruptRecord` in read/writer in Python and Scala if 
this is acceptable.



---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastruct...@apache.org or file a JIRA ticket
with INFRA.
---

-
To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org
For additional commands, e-mail: reviews-h...@spark.apache.org



[GitHub] spark issue #16386: [SPARK-18352][SQL] Support parsing multiline json files

2016-12-22 Thread AmplabJenkins
Github user AmplabJenkins commented on the issue:

https://github.com/apache/spark/pull/16386
  
Merged build finished. Test FAILed.


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastruct...@apache.org or file a JIRA ticket
with INFRA.
---

-
To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org
For additional commands, e-mail: reviews-h...@spark.apache.org



[GitHub] spark issue #16386: [SPARK-18352][SQL] Support parsing multiline json files

2016-12-22 Thread AmplabJenkins
Github user AmplabJenkins commented on the issue:

https://github.com/apache/spark/pull/16386
  
Test FAILed.
Refer to this link for build results (access rights to CI server needed): 
https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/70531/
Test FAILed.


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastruct...@apache.org or file a JIRA ticket
with INFRA.
---

-
To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org
For additional commands, e-mail: reviews-h...@spark.apache.org



[GitHub] spark issue #16386: [SPARK-18352][SQL] Support parsing multiline json files

2016-12-22 Thread SparkQA
Github user SparkQA commented on the issue:

https://github.com/apache/spark/pull/16386
  
**[Test build #70531 has 
finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/70531/testReport)**
 for PR 16386 at commit 
[`7ad5d5b`](https://github.com/apache/spark/commit/7ad5d5be0c7b41112f9f6ad3cb0cf9055de62695).
 * This patch **fails Spark unit tests**.
 * This patch merges cleanly.
 * This patch adds no public classes.


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastruct...@apache.org or file a JIRA ticket
with INFRA.
---

-
To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org
For additional commands, e-mail: reviews-h...@spark.apache.org



[GitHub] spark issue #16386: [SPARK-18352][SQL] Support parsing multiline json files

2016-12-22 Thread NathanHowell
Github user NathanHowell commented on the issue:

https://github.com/apache/spark/pull/16386
  
Hello recent JacksonGenerator.scala commiters, please take a look.

cc/ @rxin @hvanhovell @clockfly @hyukjinkwon @cloud-fan


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastruct...@apache.org or file a JIRA ticket
with INFRA.
---

-
To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org
For additional commands, e-mail: reviews-h...@spark.apache.org



[GitHub] spark issue #16386: [SPARK-18352][SQL] Support parsing multiline json files

2016-12-22 Thread SparkQA
Github user SparkQA commented on the issue:

https://github.com/apache/spark/pull/16386
  
**[Test build #70531 has 
started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/70531/testReport)**
 for PR 16386 at commit 
[`7ad5d5b`](https://github.com/apache/spark/commit/7ad5d5be0c7b41112f9f6ad3cb0cf9055de62695).


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastruct...@apache.org or file a JIRA ticket
with INFRA.
---

-
To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org
For additional commands, e-mail: reviews-h...@spark.apache.org