Github user asfgit closed the pull request at:
https://github.com/apache/spark/pull/11947
---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is
Github user rxin commented on the pull request:
https://github.com/apache/spark/pull/11947#issuecomment-216006137
OK I'm going to merge this in master and manually update the commit message.
Github user HyukjinKwon commented on the pull request:
https://github.com/apache/spark/pull/11947#issuecomment-216003501
LGTM. (Maybe we should not forget, for documentation, that `nullValue` has
higher priority than other options such as `nanValue` if the same value is
given as
Github user rxin commented on the pull request:
https://github.com/apache/spark/pull/11947#issuecomment-215984253
@HyukjinKwon would be great if you can review this. Thanks.
Github user rxin commented on the pull request:
https://github.com/apache/spark/pull/11947#issuecomment-215984080
@falaki can you update the pr description?
Github user koertkuipers commented on the pull request:
https://github.com/apache/spark/pull/11947#issuecomment-215979899
please also provide a way for strings to be converted to null upon reading
Github user AmplabJenkins commented on the pull request:
https://github.com/apache/spark/pull/11947#issuecomment-215947150
Test PASSed.
Refer to this link for build results (access rights to CI server needed):
Github user AmplabJenkins commented on the pull request:
https://github.com/apache/spark/pull/11947#issuecomment-215947147
Merged build finished. Test PASSed.
Github user SparkQA commented on the pull request:
https://github.com/apache/spark/pull/11947#issuecomment-215947097
**[Test build #57423 has
finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/57423/consoleFull)**
for PR 11947 at commit
Github user rxin commented on the pull request:
https://github.com/apache/spark/pull/11947#issuecomment-215943795
LGTM pending tests.
Github user SparkQA commented on the pull request:
https://github.com/apache/spark/pull/11947#issuecomment-215943744
**[Test build #57423 has
started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/57423/consoleFull)**
for PR 11947 at commit
Github user falaki commented on the pull request:
https://github.com/apache/spark/pull/11947#issuecomment-215943595
@rxin done.
Github user rxin commented on the pull request:
https://github.com/apache/spark/pull/11947#issuecomment-215940817
@falaki sorry this no longer merges cleanly. Do you mind bringing it up to
date?
Github user AmplabJenkins commented on the pull request:
https://github.com/apache/spark/pull/11947#issuecomment-215930852
Merged build finished. Test PASSed.
Github user AmplabJenkins commented on the pull request:
https://github.com/apache/spark/pull/11947#issuecomment-215930854
Test PASSed.
Refer to this link for build results (access rights to CI server needed):
Github user SparkQA commented on the pull request:
https://github.com/apache/spark/pull/11947#issuecomment-215930751
**[Test build #57394 has
finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/57394/consoleFull)**
for PR 11947 at commit
Github user SparkQA commented on the pull request:
https://github.com/apache/spark/pull/11947#issuecomment-215926089
**[Test build #57394 has
started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/57394/consoleFull)**
for PR 11947 at commit
Github user rxin commented on the pull request:
https://github.com/apache/spark/pull/11947#issuecomment-215924124
As discussed offline, we should just have a single option for setting null,
another for nan, another for inf and negative inf. Basically just 4.
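
To make the four-option design in this comment concrete, here is a minimal sketch in plain Python. It is not Spark's implementation: the helper `cast_double` and its parameter names and defaults are hypothetical and chosen only for illustration. The null check comes first, which mirrors the precedence noted elsewhere in this thread (`nullValue` wins when the same string is given for several options).

```python
import math
from typing import Optional


def cast_double(token: str,
                null_value: str = "",
                nan_value: str = "NaN",
                positive_inf: str = "Inf",
                negative_inf: str = "-Inf") -> Optional[float]:
    """Hypothetical helper: map one CSV token to a float using four marker options."""
    # The null marker is checked first, so it takes priority over the others.
    if token == null_value:
        return None
    if token == nan_value:
        return math.nan
    if token == positive_inf:
        return math.inf
    if token == negative_inf:
        return -math.inf
    # Anything else is parsed as an ordinary number (raises ValueError if invalid).
    return float(token)
```

With these assumed defaults, an empty token becomes `None`, and passing the same string for both `null_value` and `nan_value` yields `None` rather than NaN.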
Github user koertkuipers commented on the pull request:
https://github.com/apache/spark/pull/11947#issuecomment-215196735
i personally would have been happy with a simple single value for nulls
for all datatypes.
and the usage of that single value should be consistent across
Github user koertkuipers commented on the pull request:
https://github.com/apache/spark/pull/11947#issuecomment-215194241
do these settings roundtrip correctly? say i set doubleNaNValue to "XY",
and i create a dataframe with a Double.NaN in it, does it get written out
correctly as
Github user koertkuipers commented on the pull request:
https://github.com/apache/spark/pull/11947#issuecomment-215192562
hello!
why is there no stringNullValue?
basically i want for a column with type string to read in all empty strings
as nulls. this is what the old option
Github user AmplabJenkins commented on the pull request:
https://github.com/apache/spark/pull/11947#issuecomment-208662536
Test PASSed.
Refer to this link for build results (access rights to CI server needed):
Github user AmplabJenkins commented on the pull request:
https://github.com/apache/spark/pull/11947#issuecomment-208662534
Merged build finished. Test PASSed.
Github user SparkQA commented on the pull request:
https://github.com/apache/spark/pull/11947#issuecomment-208662375
**[Test build #9 has
finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/9/consoleFull)**
for PR 11947 at commit
Github user SparkQA commented on the pull request:
https://github.com/apache/spark/pull/11947#issuecomment-208638423
**[Test build #9 has
started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/9/consoleFull)**
for PR 11947 at commit
Github user SparkQA commented on the pull request:
https://github.com/apache/spark/pull/11947#issuecomment-206023267
**[Test build #55033 has
finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/55033/consoleFull)**
for PR 11947 at commit
Github user AmplabJenkins commented on the pull request:
https://github.com/apache/spark/pull/11947#issuecomment-206023276
Test FAILed.
Refer to this link for build results (access rights to CI server needed):
Github user AmplabJenkins commented on the pull request:
https://github.com/apache/spark/pull/11947#issuecomment-206023274
Merged build finished. Test FAILed.
Github user SparkQA commented on the pull request:
https://github.com/apache/spark/pull/11947#issuecomment-206020403
**[Test build #55033 has
started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/55033/consoleFull)**
for PR 11947 at commit
Github user AmplabJenkins commented on the pull request:
https://github.com/apache/spark/pull/11947#issuecomment-205955742
Test FAILed.
Refer to this link for build results (access rights to CI server needed):
Github user SparkQA commented on the pull request:
https://github.com/apache/spark/pull/11947#issuecomment-205955671
**[Test build #55010 has
finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/55010/consoleFull)**
for PR 11947 at commit
Github user AmplabJenkins commented on the pull request:
https://github.com/apache/spark/pull/11947#issuecomment-205955740
Merged build finished. Test FAILed.
Github user SparkQA commented on the pull request:
https://github.com/apache/spark/pull/11947#issuecomment-205951008
**[Test build #55010 has
started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/55010/consoleFull)**
for PR 11947 at commit
Github user falaki commented on a diff in the pull request:
https://github.com/apache/spark/pull/11947#discussion_r58596297
--- Diff:
sql/core/src/main/scala/org/apache/spark/sql/execution/datasources/csv/CSVInferSchema.scala
---
@@ -177,35 +177,57 @@ private[csv] object
Github user HyukjinKwon commented on a diff in the pull request:
https://github.com/apache/spark/pull/11947#discussion_r57813530
--- Diff:
sql/core/src/main/scala/org/apache/spark/sql/execution/datasources/csv/CSVInferSchema.scala
---
@@ -177,35 +177,57 @@ private[csv] object
Github user cloud-fan commented on the pull request:
https://github.com/apache/spark/pull/11947#issuecomment-202843052
I'm not sure how complicated the use case will be, but having so many
options really scares me...
If we decide to do it, I think we should also add these
Github user cloud-fan commented on a diff in the pull request:
https://github.com/apache/spark/pull/11947#discussion_r57708347
--- Diff:
sql/core/src/test/scala/org/apache/spark/sql/execution/datasources/csv/CSVTypeCastSuite.scala
---
@@ -27,6 +27,8 @@ import
Github user HyukjinKwon commented on the pull request:
https://github.com/apache/spark/pull/11947#issuecomment-202711498
I found both `NaN` and `Infinity` are handled in JSON data source and it
was fixed in this PR,
Github user HyukjinKwon commented on the pull request:
https://github.com/apache/spark/pull/11947#issuecomment-202682552
For the code, overall, it looks good to me. However, I am not used to, and do
not have a lot of experience with, `NaN`, `Inf` or `-Inf`. If the values can
be
Github user HyukjinKwon commented on a diff in the pull request:
https://github.com/apache/spark/pull/11947#discussion_r57657765
--- Diff:
sql/core/src/test/scala/org/apache/spark/sql/execution/datasources/csv/CSVTypeCastSuite.scala
---
@@ -64,17 +66,21 @@ class CSVTypeCastSuite
Github user HyukjinKwon commented on a diff in the pull request:
https://github.com/apache/spark/pull/11947#discussion_r57656879
--- Diff:
sql/core/src/test/scala/org/apache/spark/sql/execution/datasources/csv/CSVSuite.scala
---
@@ -478,4 +479,34 @@ class CSVSuite extends
Github user HyukjinKwon commented on a diff in the pull request:
https://github.com/apache/spark/pull/11947#discussion_r57656806
--- Diff:
sql/core/src/main/scala/org/apache/spark/sql/execution/datasources/csv/CSVOptions.scala
---
@@ -101,3 +125,14 @@ private[sql] class
Github user falaki commented on the pull request:
https://github.com/apache/spark/pull/11947#issuecomment-202570253
@cloud-fan would you take a look at this if you have time?
Github user falaki commented on the pull request:
https://github.com/apache/spark/pull/11947#issuecomment-202502231
ping @HyukjinKwon and @rxin
Github user SparkQA commented on the pull request:
https://github.com/apache/spark/pull/11947#issuecomment-201080503
**[Test build #54113 has
finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/54113/consoleFull)**
for PR 11947 at commit
Github user AmplabJenkins commented on the pull request:
https://github.com/apache/spark/pull/11947#issuecomment-201080508
Test FAILed.
Refer to this link for build results (access rights to CI server needed):
Github user AmplabJenkins commented on the pull request:
https://github.com/apache/spark/pull/11947#issuecomment-201080505
Merged build finished. Test FAILed.
Github user SparkQA commented on the pull request:
https://github.com/apache/spark/pull/11947#issuecomment-201080156
**[Test build #54113 has
started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/54113/consoleFull)**
for PR 11947 at commit
GitHub user falaki opened a pull request:
https://github.com/apache/spark/pull/11947
[SPARK-14143] Options for parsing NaNs, Infinity and nulls for numeric types
## What changes were proposed in this pull request?
1. Adds the following options for parsing type-specific nulls to