[GitHub] spark issue #22234: [SPARK-25241][SQL] Configurable empty values when readin...

2018-09-10 Thread HyukjinKwon
Github user HyukjinKwon commented on the issue:

https://github.com/apache/spark/pull/22234
  
@mmolimar, by the way, let's leave this closed since the newer PR is open. 
You will be credited as the primary author of #22367 anyway.


---

-
To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org
For additional commands, e-mail: reviews-h...@spark.apache.org



[GitHub] spark issue #22234: [SPARK-25241][SQL] Configurable empty values when readin...

2018-09-08 Thread MaxGekk
Github user MaxGekk commented on the issue:

https://github.com/apache/spark/pull/22234
  
@gatorsmile @HyukjinKwon Please, take a look at #22367 





[GitHub] spark issue #22234: [SPARK-25241][SQL] Configurable empty values when readin...

2018-09-08 Thread gatorsmile
Github user gatorsmile commented on the issue:

https://github.com/apache/spark/pull/22234
  
@MaxGekk Could you take this PR over? I think we need to merge this into 
Spark 2.4. If needed, users can restore the previous behavior with the new 
`emptyValue` option. Please also update the migration guide to describe the 
behavior change and explain how to set `emptyValue`.





[GitHub] spark issue #22234: [SPARK-25241][SQL] Configurable empty values when readin...

2018-09-07 Thread HyukjinKwon
Github user HyukjinKwon commented on the issue:

https://github.com/apache/spark/pull/22234
  
Oh no, I mean we fixed a bug.





[GitHub] spark issue #22234: [SPARK-25241][SQL] Configurable empty values when readin...

2018-09-07 Thread MaxGekk
Github user MaxGekk commented on the issue:

https://github.com/apache/spark/pull/22234
  
> cc @MaxGekk for a followup

@HyukjinKwon Do you mean updating the migration guide in master and probably 
in Spark 2.4? I don't think this should be considered a bug, because the 
current and previous versions of Spark can read the saved CSV files correctly. 
Yes, for now empty strings are saved as `""` and `null`s as nothing, but this 
is intended, so that empty strings and nulls can be distinguished on read. The 
produced CSV files are valid, and they can be read by any mature CSV library.
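
The write convention described above (empty string saved as `""`, `null` saved as nothing) can be sketched in plain Python. This is only an illustration under stated assumptions: `encode_field` and `encode_row` are hypothetical helper names, not Spark APIs, and the quoting rules are a simplified RFC 4180 style rather than Spark's actual writer.

```python
import csv
from typing import Optional, Sequence


def encode_field(value: Optional[str]) -> str:
    """Hypothetical helper: None becomes nothing, "" becomes a quoted empty string."""
    if value is None:
        return ""        # null: no characters between the delimiters
    if value == "":
        return '""'      # empty string: explicit quoted empty value
    # Quote only when needed, doubling any embedded quotes.
    if any(ch in value for ch in ',"\n\r'):
        return '"' + value.replace('"', '""') + '"'
    return value


def encode_row(row: Sequence[Optional[str]]) -> str:
    """Join encoded fields with commas (no trailing newline)."""
    return ",".join(encode_field(v) for v in row)


line = encode_row(["a", None, "", "1"])
print(line)                        # a,,"",1

# The line is still ordinary CSV: Python's stdlib reader parses it,
# although it does not tell the two cases apart by itself.
print(next(csv.reader([line])))    # ['a', '', '', '1']
```

The output stays valid CSV for any reader, while a reader that tracks whether a field was quoted can still recover the difference between an empty string and a null.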





[GitHub] spark issue #22234: [SPARK-25241][SQL] Configurable empty values when readin...

2018-09-04 Thread HyukjinKwon
Github user HyukjinKwon commented on the issue:

https://github.com/apache/spark/pull/22234
  
This is quite a corner case (see the cases elaborated in the JIRA 
[SPARK-17916](https://issues.apache.org/jira/browse/SPARK-17916)), and it is 
ambiguous whether to treat this as a bug or as a proper behaviour change; 
however, I don't object if it is worth mentioning.

cc @MaxGekk for a followup





[GitHub] spark issue #22234: [SPARK-25241][SQL] Configurable empty values when readin...

2018-09-04 Thread gatorsmile
Github user gatorsmile commented on the issue:

https://github.com/apache/spark/pull/22234
  
Have we documented the behavior changes in the migration guide? If not, can 
we do it?





[GitHub] spark issue #22234: [SPARK-25241][SQL] Configurable empty values when readin...

2018-09-04 Thread HyukjinKwon
Github user HyukjinKwon commented on the issue:

https://github.com/apache/spark/pull/22234
  
From my understanding, yea. The problem here sounds like an ambiguity in 
empty strings, since they can be interpreted both as empty strings and as 
`null`. To me, this is actually rather a bug, since we can't distinguish them 
and don't respect the empty value. If empty strings are written, they should 
be read back as empty strings.

This PR proposes the ability to explicitly set the empty value to work around 
the behaviour change.
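
A minimal sketch of what a configurable empty value could look like on the read side, in plain Python. Everything here is an assumption made for illustration: `split_fields`, `read_row` and the `empty_value` parameter are hypothetical names, and the parser is deliberately simplified (single-line records only). It is not Spark's implementation.

```python
from typing import List, Optional, Tuple


def split_fields(line: str) -> List[Tuple[str, bool]]:
    """Split one CSV line into (text, was_quoted) pairs.

    Simplified: handles commas, double-quoted fields and doubled quotes,
    but not records that span several lines.
    """
    fields: List[Tuple[str, bool]] = []
    buf: List[str] = []
    quoted = False      # did the current field contain a quoted section?
    in_quotes = False   # are we currently inside quotes?
    i = 0
    while i < len(line):
        ch = line[i]
        if in_quotes:
            if ch == '"':
                if i + 1 < len(line) and line[i + 1] == '"':
                    buf.append('"')   # doubled quote is a literal quote
                    i += 1
                else:
                    in_quotes = False
            else:
                buf.append(ch)
        elif ch == '"':
            in_quotes = True
            quoted = True
        elif ch == ",":
            fields.append(("".join(buf), quoted))
            buf, quoted = [], False
        else:
            buf.append(ch)
        i += 1
    fields.append(("".join(buf), quoted))
    return fields


def read_row(line: str, empty_value: Optional[str] = "") -> List[Optional[str]]:
    """Map unquoted-empty fields to None and quoted-empty fields to empty_value."""
    out: List[Optional[str]] = []
    for text, was_quoted in split_fields(line):
        if text == "" and not was_quoted:
            out.append(None)           # nothing between delimiters: null
        elif text == "":
            out.append(empty_value)    # `""`: the configurable empty value
        else:
            out.append(text)
    return out


print(read_row('a,,"",1'))                     # ['a', None, '', '1']
print(read_row('a,,"",1', empty_value=None))   # ['a', None, None, '1']
```

With the default, written empty strings come back as empty strings; passing `None` mimics a reader that treats both cases as null.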







[GitHub] spark issue #22234: [SPARK-25241][SQL] Configurable empty values when readin...

2018-09-04 Thread gatorsmile
Github user gatorsmile commented on the issue:

https://github.com/apache/spark/pull/22234
  
Did we introduce any behavior change in 
https://github.com/apache/spark/pull/21273? Does this PR resolve it?





[GitHub] spark issue #22234: [SPARK-25241][SQL] Configurable empty values when readin...

2018-08-27 Thread HyukjinKwon
Github user HyukjinKwon commented on the issue:

https://github.com/apache/spark/pull/22234
  
Seems okay but I or someone else should take a closer look before getting 
this in.





[GitHub] spark issue #22234: [SPARK-25241][SQL] Configurable empty values when readin...

2018-08-26 Thread AmplabJenkins
Github user AmplabJenkins commented on the issue:

https://github.com/apache/spark/pull/22234
  
Merged build finished. Test PASSed.





[GitHub] spark issue #22234: [SPARK-25241][SQL] Configurable empty values when readin...

2018-08-26 Thread AmplabJenkins
Github user AmplabJenkins commented on the issue:

https://github.com/apache/spark/pull/22234
  
Test PASSed.
Refer to this link for build results (access rights to CI server needed): 
https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/95274/
Test PASSed.





[GitHub] spark issue #22234: [SPARK-25241][SQL] Configurable empty values when readin...

2018-08-26 Thread SparkQA
Github user SparkQA commented on the issue:

https://github.com/apache/spark/pull/22234
  
**[Test build #95274 has 
finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/95274/testReport)**
 for PR 22234 at commit 
[`0bcdb2a`](https://github.com/apache/spark/commit/0bcdb2a7f2299add11fd78a551027572f80f1ae7).
 * This patch passes all tests.
 * This patch merges cleanly.
 * This patch adds no public classes.





[GitHub] spark issue #22234: [SPARK-25241][SQL] Configurable empty values when readin...

2018-08-26 Thread AmplabJenkins
Github user AmplabJenkins commented on the issue:

https://github.com/apache/spark/pull/22234
  
Test PASSed.
Refer to this link for build results (access rights to CI server needed): 
https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/95271/
Test PASSed.





[GitHub] spark issue #22234: [SPARK-25241][SQL] Configurable empty values when readin...

2018-08-26 Thread SparkQA
Github user SparkQA commented on the issue:

https://github.com/apache/spark/pull/22234
  
**[Test build #95271 has 
finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/95271/testReport)**
 for PR 22234 at commit 
[`bb28db9`](https://github.com/apache/spark/commit/bb28db976fad9316f68a74da4955d08c5b7abaf2).
 * This patch passes all tests.
 * This patch merges cleanly.
 * This patch adds no public classes.





[GitHub] spark issue #22234: [SPARK-25241][SQL] Configurable empty values when readin...

2018-08-26 Thread AmplabJenkins
Github user AmplabJenkins commented on the issue:

https://github.com/apache/spark/pull/22234
  
Test PASSed.
Refer to this link for build results (access rights to CI server needed): 
https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/95270/
Test PASSed.





[GitHub] spark issue #22234: [SPARK-25241][SQL] Configurable empty values when readin...

2018-08-26 Thread SparkQA
Github user SparkQA commented on the issue:

https://github.com/apache/spark/pull/22234
  
**[Test build #95270 has 
finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/95270/testReport)**
 for PR 22234 at commit 
[`3d3f178`](https://github.com/apache/spark/commit/3d3f178a55a8fdb4630916252866a68a98ae17cd).
 * This patch passes all tests.
 * This patch merges cleanly.
 * This patch adds no public classes.





[GitHub] spark issue #22234: [SPARK-25241][SQL] Configurable empty values when readin...

2018-08-26 Thread SparkQA
Github user SparkQA commented on the issue:

https://github.com/apache/spark/pull/22234
  
**[Test build #95274 has 
started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/95274/testReport)**
 for PR 22234 at commit 
[`0bcdb2a`](https://github.com/apache/spark/commit/0bcdb2a7f2299add11fd78a551027572f80f1ae7).





[GitHub] spark issue #22234: [SPARK-25241][SQL] Configurable empty values when readin...

2018-08-26 Thread SparkQA
Github user SparkQA commented on the issue:

https://github.com/apache/spark/pull/22234
  
**[Test build #95271 has 
started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/95271/testReport)**
 for PR 22234 at commit 
[`bb28db9`](https://github.com/apache/spark/commit/bb28db976fad9316f68a74da4955d08c5b7abaf2).





[GitHub] spark issue #22234: [SPARK-25241][SQL] Configurable empty values when readin...

2018-08-26 Thread SparkQA
Github user SparkQA commented on the issue:

https://github.com/apache/spark/pull/22234
  
**[Test build #95270 has 
started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/95270/testReport)**
 for PR 22234 at commit 
[`3d3f178`](https://github.com/apache/spark/commit/3d3f178a55a8fdb4630916252866a68a98ae17cd).





[GitHub] spark issue #22234: [SPARK-25241][SQL] Configurable empty values when readin...

2018-08-26 Thread mmolimar
Github user mmolimar commented on the issue:

https://github.com/apache/spark/pull/22234
  
@MaxGekk I added what you suggested as well.





[GitHub] spark issue #22234: [SPARK-25241][SQL] Configurable empty values when readin...

2018-08-26 Thread AmplabJenkins
Github user AmplabJenkins commented on the issue:

https://github.com/apache/spark/pull/22234
  
Test PASSed.
Refer to this link for build results (access rights to CI server needed): 
https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/95259/
Test PASSed.





[GitHub] spark issue #22234: [SPARK-25241][SQL] Configurable empty values when readin...

2018-08-26 Thread SparkQA
Github user SparkQA commented on the issue:

https://github.com/apache/spark/pull/22234
  
**[Test build #95259 has 
finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/95259/testReport)**
 for PR 22234 at commit 
[`8b51800`](https://github.com/apache/spark/commit/8b5180021d246ab2fdf0824c01b9f180136837ce).
 * This patch passes all tests.
 * This patch merges cleanly.
 * This patch adds no public classes.





[GitHub] spark issue #22234: [SPARK-25241][SQL] Configurable empty values when readin...

2018-08-26 Thread SparkQA
Github user SparkQA commented on the issue:

https://github.com/apache/spark/pull/22234
  
**[Test build #95259 has 
started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/95259/testReport)**
 for PR 22234 at commit 
[`8b51800`](https://github.com/apache/spark/commit/8b5180021d246ab2fdf0824c01b9f180136837ce).





[GitHub] spark issue #22234: [SPARK-25241][SQL] Configurable empty values when readin...

2018-08-26 Thread HyukjinKwon
Github user HyukjinKwon commented on the issue:

https://github.com/apache/spark/pull/22234
  
ok to test





[GitHub] spark issue #22234: [SPARK-25241][SQL] Configurable empty values when readin...

2018-08-26 Thread MaxGekk
Github user MaxGekk commented on the issue:

https://github.com/apache/spark/pull/22234
  
Should the new option also be taken into account here? 
https://github.com/apache/spark/blob/b461acb2d90b734393c27fe7b359e2f2d297b8d4/sql/core/src/main/scala/org/apache/spark/sql/execution/datasources/csv/CSVDataSource.scala#L94





[GitHub] spark issue #22234: [SPARK-25241][SQL] Configurable empty values when readin...

2018-08-25 Thread AmplabJenkins
Github user AmplabJenkins commented on the issue:

https://github.com/apache/spark/pull/22234
  
Can one of the admins verify this patch?




