[GitHub] spark pull request #22858: [SPARK-24709][SQL][2.4] use str instead of basest...

2018-10-27 Thread HyukjinKwon
Github user HyukjinKwon commented on a diff in the pull request:

https://github.com/apache/spark/pull/22858#discussion_r228731178
  
--- Diff: python/pyspark/sql/functions.py ---
@@ -2326,7 +2326,7 @@ def schema_of_json(json):
 >>> df.select(schema_of_json('{"a": 0}').alias("json")).collect()
 [Row(json=u'struct<a:bigint>')]
 """
-if isinstance(json, basestring):
+if isinstance(json, str):
--- End diff --

Yea, we should. These aliases are added only where they are needed, because 
there are so many cases like that (for instance, imap in Python 2 and map in 
Python 3).

It looks like that was added in another PR, in the master branch only.
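
To illustrate the imap/map case mentioned above, here is a minimal sketch of a per-module alias. It is not taken from any Spark branch; it only shows the general pattern of defining a compatibility name in the one module that needs it.

```python
import sys

if sys.version >= '3':
    # Python 3: the built-in map is already lazy, so it can stand in for imap.
    imap = map
else:
    # Python 2: the lazy variant lives in itertools.
    from itertools import imap

squares = imap(lambda x: x * x, [1, 2, 3])
print(list(squares))  # [1, 4, 9]
```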


---

-
To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org
For additional commands, e-mail: reviews-h...@spark.apache.org



[GitHub] spark pull request #22858: [SPARK-24709][SQL][2.4] use str instead of basest...

2018-10-27 Thread cloud-fan
Github user cloud-fan commented on a diff in the pull request:

https://github.com/apache/spark/pull/22858#discussion_r228730582
  
--- Diff: python/pyspark/sql/functions.py ---
@@ -2326,7 +2326,7 @@ def schema_of_json(json):
 >>> df.select(schema_of_json('{"a": 0}').alias("json")).collect()
 [Row(json=u'struct<a:bigint>')]
 """
-if isinstance(json, basestring):
+if isinstance(json, str):
--- End diff --

Shall we apply it to 2.4? I'm not aware of the background: why did we not put
```
if sys.version >= '3':
    basestring = str
```
in 2.4?
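
For readers following along, the alias in question can be exercised with the small sketch below. `is_string` is a hypothetical helper introduced only for illustration; it is not part of pyspark, and this is not the code in either branch.

```python
import sys

# On Python 3 the built-in name `basestring` does not exist, so alias it to str.
# On Python 2 the built-in `basestring` (parent of str and unicode) is used as-is.
if sys.version >= '3':
    basestring = str


def is_string(value):
    # Hypothetical helper: true for any text type of the running interpreter.
    return isinstance(value, basestring)


print(is_string('{"a": 0}'))  # True
print(is_string(0))           # False
```

With the alias in place, the same `isinstance(..., basestring)` check runs unchanged on both major versions.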





[GitHub] spark pull request #22858: [SPARK-24709][SQL][2.4] use str instead of basest...

2018-10-27 Thread HyukjinKwon
Github user HyukjinKwon commented on a diff in the pull request:

https://github.com/apache/spark/pull/22858#discussion_r228713086
  
--- Diff: python/pyspark/sql/functions.py ---
@@ -2326,7 +2326,7 @@ def schema_of_json(json):
 >>> df.select(schema_of_json('{"a": 0}').alias("json")).collect()
 [Row(json=u'struct<a:bigint>')]
 """
-if isinstance(json, basestring):
+if isinstance(json, str):
--- End diff --

The problem here is that we would no longer support unicode in Python 2.
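
A small sketch of the concern, assuming the check is narrowed to `str` as in the diff. `passes_str_check` is a hypothetical name used only here. In Python 2, `unicode` is not a subclass of `str`, so a unicode literal fails the check; in Python 3 every text string is a `str`, so both calls pass.

```python
def passes_str_check(value):
    # Mirrors the narrowed check from the diff: only str instances pass.
    return isinstance(value, str)


# Python 2 prints True then False; Python 3 prints True then True.
print(passes_str_check('{"a": 0}'))
print(passes_str_check(u'{"a": 0}'))
```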





[GitHub] spark pull request #22858: [SPARK-24709][SQL][2.4] use str instead of basest...

2018-10-27 Thread cloud-fan
GitHub user cloud-fan opened a pull request:

https://github.com/apache/spark/pull/22858

[SPARK-24709][SQL][2.4] use str instead of basestring

## What changes were proposed in this pull request?

After backporting https://github.com/apache/spark/pull/22775 to 2.4, the 2.4 
sbt Jenkins QA job is broken; see 
https://amplab.cs.berkeley.edu/jenkins/view/Spark%20QA%20Test/job/spark-branch-2.4-test-sbt-hadoop-2.7/147/console

I checked all the `isinstance` calls in `functions.py`, and all of them use 
`str` to check the string type. I don't know why `basestring` works in master 
and in the 2.4 maven build, but it's safer to follow the existing code.
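
As general background (an assumption about the failure mode, not something confirmed from the Jenkins log): Python resolves global names when a function runs, not when it is defined, so a module that references an undefined name imports cleanly and fails only once that code path is executed. The sketch below uses a made-up name, `undefined_string_type`, as a stand-in for `basestring` on Python 3 so that it behaves the same on either interpreter.

```python
def uses_missing_name(value):
    # `undefined_string_type` is deliberately never defined; it stands in
    # for a name such as `basestring` on an interpreter that lacks it.
    return isinstance(value, undefined_string_type)  # noqa: F821


# Defining the function above succeeded; the error appears only on the call.
try:
    uses_missing_name("x")
except NameError as exc:
    print("NameError:", exc)
```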

## How was this patch tested?

existing test

You can merge this pull request into a Git repository by running:

$ git pull https://github.com/cloud-fan/spark python

Alternatively you can review and apply these changes as the patch at:

https://github.com/apache/spark/pull/22858.patch

To close this pull request, make a commit to your master/trunk branch
with (at least) the following in the commit message:

This closes #22858


commit 2917acd18994c3901c8c5b562cf87964bca879d9
Author: Wenchen Fan 
Date:   2018-10-27T11:12:10Z

use str instead of basestring



