[GitHub] spark issue #22231: [SPARK-25238][PYTHON] lint-python: Upgrade pycodestyle t...
Github user cclauss commented on the issue: https://github.com/apache/spark/pull/22231 @holdenk Can you please do me a big favor and create a separate PR that makes sure that we are doing at least __flake8 . --count --select=E901,E999,F821,F822,F823 --show-source --statistics__ on the whole codebase on Python 2 and Python 3? When I create PRs on this project, I make so many mistakes that it takes pages of conversation and weeks of testing to get them merged. I just want to make sure that the project never backslides on F821s because it has taken [a lot of effort](https://github.com/apache/spark/pulls?q=is%3Apr+author%3Acclauss+is%3Aclosed) to get us clean. Thanks. --- - To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org For additional commands, e-mail: reviews-h...@spark.apache.org
[GitHub] spark issue #22231: [SPARK-25238][PYTHON] lint-python: Upgrade pycodestyle t...
Github user cclauss commented on the issue: https://github.com/apache/spark/pull/22231 Removed the escaped backticks. @HyukjinKwon Undefined names (F821) is a flake8 error, not a PyCodeStyle error. That means that PyCodeStyle on its own (this PR) is unable to detect the problem with `__version__`. --- - To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org For additional commands, e-mail: reviews-h...@spark.apache.org
[GitHub] spark issue #22231: [SPARK-25238][PYTHON] lint-python: Upgrade pycodestyle t...
Github user cclauss commented on the issue: https://github.com/apache/spark/pull/22231 OK... Removed backticks and we are once again testing ./dev/run-tests-jenkins.py --- - To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org For additional commands, e-mail: reviews-h...@spark.apache.org
[GitHub] spark issue #22231: [SPARK-25238][PYTHON] lint-python: Upgrade pycodestyle t...
Github user cclauss commented on the issue: https://github.com/apache/spark/pull/22231 OK... __# noqa__ works within the flake8 wrapper around PyFlakes but does not work when PyFlakes is called outside of flake8. Also added r"comment" as you suggested. --- - To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org For additional commands, e-mail: reviews-h...@spark.apache.org
[GitHub] spark issue #22231: [SPARK-25238][PYTHON] lint-python: Upgrade pycodestyle t...
Github user cclauss commented on the issue: https://github.com/apache/spark/pull/22231 @srowen Following the merge of #22400 there are just three W605s at the bottom of https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/96020/console Should we __# noqa__ those lines or do you have a slicker solution/ Thanks massively for your persistence. --- - To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org For additional commands, e-mail: reviews-h...@spark.apache.org
[GitHub] spark issue #22231: [SPARK-25238][PYTHON] lint-python: Upgrade pycodestyle t...
Github user cclauss commented on the issue: https://github.com/apache/spark/pull/22231 Let's wait until #22400 is merged before proceeding with this PR. --- - To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org For additional commands, e-mail: reviews-h...@spark.apache.org
[GitHub] spark pull request #22400: [SPARK-25238][PYTHON] lint-python: Fix W605 warni...
Github user cclauss commented on a diff in the pull request: https://github.com/apache/spark/pull/22400#discussion_r216926889 --- Diff: dev/create-release/generate-contributors.py --- @@ -88,7 +88,7 @@ def print_indented(_list): def is_release(commit_title): -return re.findall("\[release\]", commit_title.lower()) or \ +return re.findall(r"\[release\]", commit_title.lower()) or \ --- End diff -- 1. Could we use parents to remove line terminating backslashes as recommended in PEP8? 2. Could we get rid of the use of __re__ in this instance with ```python return ("[release]", commit_title.lower() or "preparing spark release" in commit_title.lower() or "preparing development version" in commit_title.lower() or "CHANGES.txt" in commit_title) ``` --- - To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org For additional commands, e-mail: reviews-h...@spark.apache.org
[GitHub] spark issue #22231: [SPARK-25238][PYTHON] lint-python: Upgrade pycodestyle t...
Github user cclauss commented on the issue: https://github.com/apache/spark/pull/22231 @srowen Could you please provide some test cases in Python 2 and Python 3 to prove that the changes that you propose will be properly interpreted without side effects in both? --- - To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org For additional commands, e-mail: reviews-h...@spark.apache.org
[GitHub] spark pull request #22266: [SPARK-25270] lint-python: Add flake8 to find syn...
Github user cclauss commented on a diff in the pull request: https://github.com/apache/spark/pull/22266#discussion_r214201610 --- Diff: dev/lint-python --- @@ -82,6 +82,25 @@ else rm "$PYCODESTYLE_REPORT_PATH" fi +# stop the build if there are Python syntax errors or undefined names +flake8 . --count --select=E901,E999,F821,F822,F823 --show-source --statistics --- End diff -- My take is that we want contributors to be able to locally detect syntax errors and undefined names _before_ they submit a pull request. --- - To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org For additional commands, e-mail: reviews-h...@spark.apache.org
[GitHub] spark pull request #22266: [SPARK-25270] lint-python: Add flake8 to find syn...
Github user cclauss commented on a diff in the pull request: https://github.com/apache/spark/pull/22266#discussion_r214196909 --- Diff: dev/lint-python --- @@ -82,6 +82,26 @@ else rm "$PYCODESTYLE_REPORT_PATH" fi +python -m pip install flake8 --- End diff -- Removed --- - To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org For additional commands, e-mail: reviews-h...@spark.apache.org
[GitHub] spark issue #22266: [SPARK-25270] lint-python: Add flake8 to find syntax err...
Github user cclauss commented on the issue: https://github.com/apache/spark/pull/22266 @HyukjinKwon Done. @viirya Done. --- - To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org For additional commands, e-mail: reviews-h...@spark.apache.org
[GitHub] spark issue #22266: lint-python: Add flake8 to find syntax errors and undefi...
Github user cclauss commented on the issue: https://github.com/apache/spark/pull/22266 These tests should fail until #22265 or similar is merged. --- - To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org For additional commands, e-mail: reviews-h...@spark.apache.org
[GitHub] spark pull request #22266: lint-python: Add flake8 tests to find Python synt...
GitHub user cclauss opened a pull request: https://github.com/apache/spark/pull/22266 lint-python: Add flake8 tests to find Python syntax errors and undefi⦠â¦ned names Add [flake8](http://flake8.pycqa.org) tests to find Python syntax errors and undefined names. __E901,E999,F821,F822,F823__ are the "_showstopper_" flake8 issues that can halt the runtime with a SyntaxError, NameError, etc. Most other flake8 issues are merely "style violations" -- useful for readability but they do not effect runtime safety. * F821: undefined name `name` * F822: undefined name `name` in `__all__` * F823: local variable name referenced before assignment * E901: SyntaxError or IndentationError * E999: SyntaxError -- failed to compile a file into an Abstract Syntax Tree ## What changes were proposed in this pull request? (Please fill in changes proposed in this fix) ## How was this patch tested? (Please explain how this patch was tested. E.g. unit tests, integration tests, manual tests) (If this patch involves UI changes, please attach a screenshot; otherwise, remove this) Please review http://spark.apache.org/contributing.html before opening a pull request. You can merge this pull request into a Git repository by running: $ git pull https://github.com/cclauss/spark patch-3 Alternatively you can review and apply these changes as the patch at: https://github.com/apache/spark/pull/22266.patch To close this pull request, make a commit to your master/trunk branch with (at least) the following in the commit message: This closes #22266 commit 129119803eacbca1a53384893424e4ae75774626 Author: cclauss Date: 2018-08-29T09:46:38Z lint-python: Add flake8 tests to find Python syntax errors and undefined names Add [flake8](http://flake8.pycqa.org) tests to find Python syntax errors and undefined names. __E901,E999,F821,F822,F823__ are the "_showstopper_" flake8 issues that can halt the runtime with a SyntaxError, NameError, etc. Most other flake8 issues are merely "style violations" -- useful for readability but they do not effect runtime safety. * F821: undefined name `name` * F822: undefined name `name` in `__all__` * F823: local variable name referenced before assignment * E901: SyntaxError or IndentationError * E999: SyntaxError -- failed to compile a file into an Abstract Syntax Tree --- - To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org For additional commands, e-mail: reviews-h...@spark.apache.org
[GitHub] spark pull request #22265: Undefined name: from pyspark.util import _excepti...
GitHub user cclauss opened a pull request: https://github.com/apache/spark/pull/22265 Undefined name: from pyspark.util import _exception_message [flake8](http://flake8.pycqa.org) testing of https://github.com/apache/spark on Python 3.7.0 $ __flake8 . --count --select=E901,E999,F821,F822,F823 --show-source --statistics__ ``` ./python/pyspark/java_gateway.py:172:20: F821 undefined name '_exception_message' emsg = _exception_message(e) ^ 1 F821 undefined name '_exception_message' 1 ``` @HyukjinKwon ## What changes were proposed in this pull request? (Please fill in changes proposed in this fix) ## How was this patch tested? (Please explain how this patch was tested. E.g. unit tests, integration tests, manual tests) (If this patch involves UI changes, please attach a screenshot; otherwise, remove this) Please review http://spark.apache.org/contributing.html before opening a pull request. You can merge this pull request into a Git repository by running: $ git pull https://github.com/cclauss/spark patch-2 Alternatively you can review and apply these changes as the patch at: https://github.com/apache/spark/pull/22265.patch To close this pull request, make a commit to your master/trunk branch with (at least) the following in the commit message: This closes #22265 commit 197ef63b57f089bc69c08cc9b975df79a73d Author: cclauss Date: 2018-08-29T09:24:38Z Undefined name: from pyspark.util import _exception_message [flake8](http://flake8.pycqa.org) testing of https://github.com/apache/spark on Python 3.7.0 $ __flake8 . --count --select=E901,E999,F821,F822,F823 --show-source --statistics__ ``` ./python/pyspark/java_gateway.py:172:20: F821 undefined name '_exception_message' emsg = _exception_message(e) ^ 1 F821 undefined name '_exception_message' 1 ``` @HyukjinKwon --- - To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org For additional commands, e-mail: reviews-h...@spark.apache.org
[GitHub] spark issue #22231: [25238][PYTHON] lint-python: Upgrade pycodestyle to v2.4...
Github user cclauss commented on the issue: https://github.com/apache/spark/pull/22231 https://issues.apache.org/jira/browse/SPARK-25238 --- - To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org For additional commands, e-mail: reviews-h...@spark.apache.org
[GitHub] spark pull request #22231: [MINOR][PYTHON] lint-python: Upgrade pycodestyle ...
GitHub user cclauss opened a pull request: https://github.com/apache/spark/pull/22231 [MINOR][PYTHON] lint-python: Upgrade pycodestyle to v2.4.0 See https://pycodestyle.readthedocs.io/en/latest/developer.html#changes for changes made in this release. ## What changes were proposed in this pull request? Upgrade pycodestyle to v2.4.0 ## How was this patch tested? __pycodestyle__ Please review http://spark.apache.org/contributing.html before opening a pull request. You can merge this pull request into a Git repository by running: $ git pull https://github.com/cclauss/spark patch-1 Alternatively you can review and apply these changes as the patch at: https://github.com/apache/spark/pull/22231.patch To close this pull request, make a commit to your master/trunk branch with (at least) the following in the commit message: This closes #22231 commit 00f34cafbe4b3526b48ac794677b9569e58ae01d Author: cclauss Date: 2018-08-25T06:04:29Z [MINOR][PYTHON] lint-python: Upgrade pycodestyle to v2.4.0 See https://pycodestyle.readthedocs.io/en/latest/developer.html#changes for changes made in this release. --- - To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org For additional commands, e-mail: reviews-h...@spark.apache.org
[GitHub] spark pull request #22214: [SPARK-23698][Python] Avoid 'undefined name' by d...
Github user cclauss closed the pull request at: https://github.com/apache/spark/pull/22214 --- - To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org For additional commands, e-mail: reviews-h...@spark.apache.org
[GitHub] spark pull request #22214: [SPARK-23698][Python] Avoid 'undefined name' by d...
GitHub user cclauss opened a pull request: https://github.com/apache/spark/pull/22214 [SPARK-23698][Python] Avoid 'undefined name' by defining __version__ ## What changes were proposed in this pull request? Prevent linters from raising __undefined name '\_\_version\_\_'__ by initializing the variable before setting it via a call to __exec()__. This is the last remaining issue related to the work done in #20838 ## How was this patch tested? * $ __python2 -m flake8 . --count --select=E9,F82 --show-source --statistics__ * $ __python3 -m flake8 . --count --select=E9,F82 --show-source --statistics__ Please review http://spark.apache.org/contributing.html before opening a pull request. You can merge this pull request into a Git repository by running: $ git pull https://github.com/cclauss/spark patch-1 Alternatively you can review and apply these changes as the patch at: https://github.com/apache/spark/pull/22214.patch To close this pull request, make a commit to your master/trunk branch with (at least) the following in the commit message: This closes #22214 commit a077542e67213c0add6093c0bf8c7568b6cb7b32 Author: cclauss Date: 2018-08-24T09:36:07Z [SPARK-23698][Python] Avoid 'undefined name' by defining __version__ Prevent linters from raising __undefined name '\_\_version\_\_'__ by initializing the variable before setting it via a call to __exec()__. This is the last remaining issue related to the work done in #20838 --- - To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org For additional commands, e-mail: reviews-h...@spark.apache.org
[GitHub] spark pull request #22128: Add test_slice() to streaming BasicOperations
Github user cclauss closed the pull request at: https://github.com/apache/spark/pull/22128 --- - To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org For additional commands, e-mail: reviews-h...@spark.apache.org
[GitHub] spark issue #20838: [SPARK-23698] Resolve undefined names in Python 3
Github user cclauss commented on the issue: https://github.com/apache/spark/pull/20838 Thanks massively for this. I doubt that I _ever_ would have gotten to that on my own. This is a test so my proposal would be that _you create a separate PR_ so that we are all assured that it passes in the current codebase. Once that PR has been merged, I can come back and finish this PR. Thanks again. --- - To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org For additional commands, e-mail: reviews-h...@spark.apache.org
[GitHub] spark issue #20838: [SPARK-23698] Resolve undefined names in Python 3
Github user cclauss commented on the issue: https://github.com/apache/spark/pull/20838 This is not working at all... I am wasting way too much time. 5+ months and 80+ comments for 12 lines of code is I do not have the skills to solve the following undefined name 'long' in a satisfactory manner: ``` ./python/pyspark/streaming/dstream.py:405:35: F821 undefined name 'long' return self._sc._jvm.Time(long(timestamp * 1000)) ^ ``` If someone with more skills would be willing to take that one undefined name off my plate and solve it with a test in a separate PR then I would be grateful. I will study that PR carefully and can then proceed with the others that are in this PR. My recommended fix is at https://github.com/apache/spark/pull/20838/files#diff-6c576c52abc0624ccb6a2f45828dc6a7 and my proposed test (it is failing!) is immediately following. --- - To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org For additional commands, e-mail: reviews-h...@spark.apache.org
[GitHub] spark issue #20838: [SPARK-23698] Resolve undefined names in Python 3
Github user cclauss commented on the issue: https://github.com/apache/spark/pull/20838 It was reverted because [__slice_test()__](#22128) was causing the build to fail. --- - To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org For additional commands, e-mail: reviews-h...@spark.apache.org
[GitHub] spark pull request #22128: Add test_slice() to streaming BasicOperations
GitHub user cclauss opened a pull request: https://github.com/apache/spark/pull/22128 Add test_slice() to streaming BasicOperations As suggested in https://github.com/apache/spark/pull/20838#pullrequestreview-139118618 ## What changes were proposed in this pull request? Add a test for slice operations on streams. (Please fill in changes proposed in this fix) ## How was this patch tested? It is a new test being added to the automated test suite. (Please explain how this patch was tested. E.g. unit tests, integration tests, manual tests) (If this patch involves UI changes, please attach a screenshot; otherwise, remove this) Please review http://spark.apache.org/contributing.html before opening a pull request. You can merge this pull request into a Git repository by running: $ git pull https://github.com/cclauss/spark patch-1 Alternatively you can review and apply these changes as the patch at: https://github.com/apache/spark/pull/22128.patch To close this pull request, make a commit to your master/trunk branch with (at least) the following in the commit message: This closes #22128 commit 4094422d58077aa95129a7ec9fddf75c2e3af7a7 Author: cclauss Date: 2018-08-16T23:06:59Z Add test_slice() to streaming BasicOperations As suggested in https://github.com/apache/spark/pull/20838#pullrequestreview-139118618 --- - To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org For additional commands, e-mail: reviews-h...@spark.apache.org
[GitHub] spark issue #20838: [SPARK-23698] Resolve undefined names in Python 3
Github user cclauss commented on the issue: https://github.com/apache/spark/pull/20838 jenkins, retest this please --- - To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org For additional commands, e-mail: reviews-h...@spark.apache.org
[GitHub] spark issue #20838: [SPARK-23698] Resolve undefined names in Python 3
Github user cclauss commented on the issue: https://github.com/apache/spark/pull/20838 @HyukjinKwon Your advise on next steps? --- - To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org For additional commands, e-mail: reviews-h...@spark.apache.org
[GitHub] spark issue #20838: [SPARK-23698] Resolve undefined names in Python 3
Github user cclauss commented on the issue: https://github.com/apache/spark/pull/20838 jenkins, retest this please --- - To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org For additional commands, e-mail: reviews-h...@spark.apache.org
[GitHub] spark issue #20838: [SPARK-23698] Resolve undefined names in Python 3
Github user cclauss commented on the issue: https://github.com/apache/spark/pull/20838 May I please break this PR into pieces as I did in #21959 and #21960 so that we can isolate why tests keep failing? These issues were all discovered using the same (flake8 F821) technique but they are in separate files and cover different undefined names so they could rightfully be part of separate PRs. --- - To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org For additional commands, e-mail: reviews-h...@spark.apache.org
[GitHub] spark issue #20838: [SPARK-23698] Resolve undefined names in Python 3
Github user cclauss commented on the issue: https://github.com/apache/spark/pull/20838 Several are fixed it this PR. Some others are not fixable without resorting to using __#noqa__. --- - To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org For additional commands, e-mail: reviews-h...@spark.apache.org
[GitHub] spark pull request #21960: [SPARK-23698] Remove unused definitions of long a...
Github user cclauss closed the pull request at: https://github.com/apache/spark/pull/21960 --- - To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org For additional commands, e-mail: reviews-h...@spark.apache.org
[GitHub] spark pull request #21959: [SPARK-23698] Define xrange() for Python 3 in dum...
Github user cclauss closed the pull request at: https://github.com/apache/spark/pull/21959 --- - To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org For additional commands, e-mail: reviews-h...@spark.apache.org
[GitHub] spark pull request #21960: [SPARK-23698] Remove unused definitions of long a...
GitHub user cclauss opened a pull request: https://github.com/apache/spark/pull/21960 [SPARK-23698] Remove unused definitions of long and unicode __intlike__ and __unicode__ were defined but not used in the existing code. __basestring()__was removed in Python 3 in favor of __str()__ because all str are Unicode. This simple change resolves an Undefined Name (__long__) and was originally suggested in #20838 which is currently mired in 50+ comments on unrelated modifications. @HyukjinKwon @holdenk ## What changes were proposed in this pull request? (Please fill in changes proposed in this fix) ## How was this patch tested? (Please explain how this patch was tested. E.g. unit tests, integration tests, manual tests) (If this patch involves UI changes, please attach a screenshot; otherwise, remove this) Please review http://spark.apache.org/contributing.html before opening a pull request. You can merge this pull request into a Git repository by running: $ git pull https://github.com/cclauss/spark patch-2 Alternatively you can review and apply these changes as the patch at: https://github.com/apache/spark/pull/21960.patch To close this pull request, make a commit to your master/trunk branch with (at least) the following in the commit message: This closes #21960 commit 20d072e6e19114e485711b882ff5f48e48a884a1 Author: cclauss Date: 2018-08-02T05:24:03Z [SPARK-23698] Remove unused definitions of long and unicode __intlike__ and __unicode__ were defined but not used in the existing code. __basestring()__was removed in Python 3 in favor of __str()__ because all str are Unicode. This simple change resolves an Undefined Name (__long__) and was originally suggested in #20838 which is currently mired in 50+ comments on unrelated modifications. @HyukjinKwon @holdenk --- - To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org For additional commands, e-mail: reviews-h...@spark.apache.org
[GitHub] spark issue #21959: [SPARK-23698] Define xrange() for Python 3 in dumpdata_s...
Github user cclauss commented on the issue: https://github.com/apache/spark/pull/21959 As it says in the commit message, these changes are already in #20838 but that PR has been open for 139 days and has 50+ comments. The only way that I seem to make progress is by opening separate PRs that are easier to review and approve. --- - To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org For additional commands, e-mail: reviews-h...@spark.apache.org
[GitHub] spark pull request #21959: [SPARK-23698] Define xrange() for Python 3 in dum...
GitHub user cclauss opened a pull request: https://github.com/apache/spark/pull/21959 [SPARK-23698] Define xrange() for Python 3 in dumpdata_script.py __xrange()__ was removed in Python 3 in favor of __range()__. This simple change removes three Undefined Names was originally suggested in #20838 which is currently mired in 50+ comments on unrelated modifications. @HyukjinKwon @holdenk ## What changes were proposed in this pull request? (Please fill in changes proposed in this fix) ## How was this patch tested? (Please explain how this patch was tested. E.g. unit tests, integration tests, manual tests) (If this patch involves UI changes, please attach a screenshot; otherwise, remove this) Please review http://spark.apache.org/contributing.html before opening a pull request. You can merge this pull request into a Git repository by running: $ git pull https://github.com/cclauss/spark patch-1 Alternatively you can review and apply these changes as the patch at: https://github.com/apache/spark/pull/21959.patch To close this pull request, make a commit to your master/trunk branch with (at least) the following in the commit message: This closes #21959 commit 15676ab875838cc99f71fab6f8618c886b294b1a Author: cclauss Date: 2018-08-02T05:13:37Z [SPARK-23698] Define xrange() for Python 3 in dumpdata_script.py __xrange()__ was removed in Python 3 in favor of __range()__. This simple change removes three Undefined Names was originally suggested in #20838 which is currently mired in 50+ comments on unrelated modifications. @HyukjinKwon @holdenk --- - To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org For additional commands, e-mail: reviews-h...@spark.apache.org
[GitHub] spark issue #20838: [SPARK-23698] Resolve undefined names in Python 3
Github user cclauss commented on the issue: https://github.com/apache/spark/pull/20838 @holdenk Did we miss the window? I still count 10 undefined names in this repo. --- - To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org For additional commands, e-mail: reviews-h...@spark.apache.org
[GitHub] spark pull request #20838: [SPARK-23698] Resolve undefined names in Python 3
Github user cclauss commented on a diff in the pull request: https://github.com/apache/spark/pull/20838#discussion_r205933934 --- Diff: python/pyspark/streaming/tests.py --- @@ -206,6 +207,22 @@ def func(dstream): expected = [[len(x)] for x in input] self._test_func(input, func, expected) +def test_slice(self): --- End diff -- @holdenk Comments please on __test_slice()__ and on the test results https://github.com/apache/spark/pull/20838#issuecomment-408566860 --- - To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org For additional commands, e-mail: reviews-h...@spark.apache.org
[GitHub] spark pull request #20838: [SPARK-23698] Resolve undefined names in Python 3
Github user cclauss commented on a diff in the pull request: https://github.com/apache/spark/pull/20838#discussion_r205870627 --- Diff: dev/create-release/releaseutils.py --- @@ -149,7 +152,11 @@ def get_commits(tag): if not is_valid_author(author): author = github_username # Guard against special characters -author = unidecode.unidecode(unicode(author, "UTF-8")).strip() +try: # Python 2 +author = unicode(author, "UTF-8") +except NameError: # Python 3 +author = str(author) +author = unidecode.unidecode(author).strip() --- End diff -- > The way it is now [...] it's probably safer. Let's agree to leave this as is in this PR. EOL of Python 2 in 500 daze away so safe is better. --- - To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org For additional commands, e-mail: reviews-h...@spark.apache.org
[GitHub] spark pull request #20838: [SPARK-23698] Resolve undefined names in Python 3
Github user cclauss commented on a diff in the pull request: https://github.com/apache/spark/pull/20838#discussion_r204094593 --- Diff: dev/create-release/releaseutils.py --- @@ -149,7 +152,11 @@ def get_commits(tag): if not is_valid_author(author): author = github_username # Guard against special characters -author = unidecode.unidecode(unicode(author, "UTF-8")).strip() +try: # Python 2 +author = unicode(author, "UTF-8") +except NameError: # Python 3 +author = str(author) +author = unidecode.unidecode(author).strip() --- End diff -- Yes... My other worry in Py2 would be if [__sys.setdefaultencoding()__](https://docs.python.org/2/library/sys.html#sys.setdefaultencoding) was set to somthing other that utf-8. That method was also thankfully dropped in Py3. --- - To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org For additional commands, e-mail: reviews-h...@spark.apache.org
[GitHub] spark issue #20838: [SPARK-23698] Resolve undefined names in Python 3
Github user cclauss commented on the issue: https://github.com/apache/spark/pull/20838 Three (different) failures in a row. Should I break this PR into separate PRs? --- - To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org For additional commands, e-mail: reviews-h...@spark.apache.org
[GitHub] spark issue #21702: [SPARK-23698] Remove raw_input() from Python 2
Github user cclauss commented on the issue: https://github.com/apache/spark/pull/21702 Tested data entry... In __./dev/merge_spark_pr.py__ just after __clean_up()__, I added the lines: ``` while not continue_maybe('y to conntinue'): print('loop') sys.exit() ``` Test: 'y' or 'Y' caused a loop while all others including "", "yes", "n", "N", "0", "1", "." caused an exit() Identical results on both Python 2 and Python 3 --- In __./dev/create-release/releaseutils.py__ just after __yesOrNoPrompt()_ I added the lines: ``` while not yesOrNoPrompt('y to quit'): print('got an 'n'') sys.exit() ``` Test: 'y' caused an exit() while all others including "", "Y", "yes", "n", "N", "0", "1", "." caused a loop Identical results on both Python 2 and Python 3 --- - To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org For additional commands, e-mail: reviews-h...@spark.apache.org
[GitHub] spark pull request #21702: [SPARK-23698] Remove raw_input() from Python 2
Github user cclauss commented on a diff in the pull request: https://github.com/apache/spark/pull/21702#discussion_r199723478 --- Diff: dev/create-release/releaseutils.py --- @@ -49,13 +49,16 @@ print("Install using 'sudo pip install unidecode'") sys.exit(-1) +if sys.version < '3': +input = raw_input --- End diff -- Two downsides to that approach: 1. We stick with the legacy Python syntax which will unnecessarily complicate our lives (and our diffs) [in 18 months](http://pythonclock.org) 2. This approach reduces the linting errors from 10 down to just 2. --- - To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org For additional commands, e-mail: reviews-h...@spark.apache.org
[GitHub] spark pull request #21702: [SPARK-23698] Remove raw_input() from Python 2
GitHub user cclauss opened a pull request: https://github.com/apache/spark/pull/21702 [SPARK-23698] Remove raw_input() from Python 2 Signed-off-by: cclauss ## What changes were proposed in this pull request? Humans will be able to enter text in Python 3 prompts which they can not do today. The Python builtin __raw_input()__ was removed in Python 3 in favor of __input()__. This PR does the same thing in Python 2. ## How was this patch tested? (Please explain how this patch was tested. E.g. unit tests, integration tests, manual tests) (If this patch involves UI changes, please attach a screenshot; otherwise, remove this) flake8 testing Please review http://spark.apache.org/contributing.html before opening a pull request. You can merge this pull request into a Git repository by running: $ git pull https://github.com/cclauss/spark python-fix-raw_input Alternatively you can review and apply these changes as the patch at: https://github.com/apache/spark/pull/21702.patch To close this pull request, make a commit to your master/trunk branch with (at least) the following in the commit message: This closes #21702 commit 960769a735933d58d136ea068954ad83d4731b10 Author: cclauss Date: 2018-07-03T07:10:46Z [SPARK-23698] Remove raw_input() from Python 2 Signed-off-by: cclauss --- - To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org For additional commands, e-mail: reviews-h...@spark.apache.org
[GitHub] spark pull request #20838: [SPARK-23698] Resolve undefined names in Python 3
Github user cclauss commented on a diff in the pull request: https://github.com/apache/spark/pull/20838#discussion_r199701965 --- Diff: dev/merge_spark_pr.py --- @@ -39,6 +39,9 @@ except ImportError: JIRA_IMPORTED = False +if sys.version_info[0] >= 3: +raw_input = input --- End diff -- It creates a new function called __raw_input()__ that is identical to the builtin __input()__. --- - To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org For additional commands, e-mail: reviews-h...@spark.apache.org
[GitHub] spark pull request #20838: [SPARK-23698] Resolve undefined names in Python 3
Github user cclauss commented on a diff in the pull request: https://github.com/apache/spark/pull/20838#discussion_r199701953 --- Diff: python/pyspark/sql/conf.py --- @@ -59,7 +62,7 @@ def unset(self, key): def _checkType(self, obj, identifier): """Assert that an object is of type str.""" -if not isinstance(obj, str) and not isinstance(obj, unicode): +if not isinstance(obj, basestring): --- End diff -- Is there an issue here? --- - To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org For additional commands, e-mail: reviews-h...@spark.apache.org
[GitHub] spark pull request #20838: [SPARK-23698] Resolve undefined names in Python 3
Github user cclauss commented on a diff in the pull request: https://github.com/apache/spark/pull/20838#discussion_r199702001 --- Diff: dev/create-release/releaseutils.py --- @@ -49,6 +49,9 @@ print("Install using 'sudo pip install unidecode'") sys.exit(-1) +if sys.version_info[0] >= 3: +raw_input = input --- End diff -- It creates a new function called __raw_input()__ that is identical to the builtin __input()__. --- - To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org For additional commands, e-mail: reviews-h...@spark.apache.org
[GitHub] spark pull request #20838: [SPARK-23698] Resolve undefined names in Python 3
Github user cclauss commented on a diff in the pull request: https://github.com/apache/spark/pull/20838#discussion_r181192109 --- Diff: python/pyspark/cloudpickle.py --- @@ -802,9 +802,8 @@ def save_not_implemented(self, obj): self.save_reduce(_gen_not_implemented, ()) if PY3: -dispatch[io.TextIOWrapper] = save_file -else: -dispatch[file] = save_file +file = io.TextIOWrapper +dispatch[file] = save_file --- End diff -- This was merged upstream. --- - To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org For additional commands, e-mail: reviews-h...@spark.apache.org
[GitHub] spark issue #20838: [SPARK-23698] Resolve undefined names in Python 3
Github user cclauss commented on the issue: https://github.com/apache/spark/pull/20838 Other issues / suggestions? --- - To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org For additional commands, e-mail: reviews-h...@spark.apache.org
[GitHub] spark issue #20838: [SPARK-23698] Resolve undefined names in Python 3
Github user cclauss commented on the issue: https://github.com/apache/spark/pull/20838 The cloudpickle change is now merged upstream. I would like to avoid @BryanCutler suggestion above because Unicode is really touchy in Python 2 and a lot can change based on [sys.setdefaultencoding()](https://docs.python.org/2/library/sys.html#sys.setdefaultencoding) so I would like to avoid assumptions and make as small modifications as possible. --- - To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org For additional commands, e-mail: reviews-h...@spark.apache.org
[GitHub] spark issue #20838: [SPARK-23698] Resolve undefined names in Python 3
Github user cclauss commented on the issue: https://github.com/apache/spark/pull/20838 @HyukjinKwon Was there something more to do on this one? --- - To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org For additional commands, e-mail: reviews-h...@spark.apache.org
[GitHub] spark pull request #20838: [SPARK-23698] Resolve undefined names in Python 3
Github user cclauss commented on a diff in the pull request: https://github.com/apache/spark/pull/20838#discussion_r175000559 --- Diff: dev/create-release/releaseutils.py --- @@ -49,6 +49,11 @@ print("Install using 'sudo pip install unidecode'") sys.exit(-1) +try: +raw_input # Python 2 --- End diff -- âWhere practical, apply the Python porting best practice: [Use feature detection instead of version detection](https://docs.python.org/3/howto/pyporting.html#use-feature-detection-instead-of-version-detection).â --- - To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org For additional commands, e-mail: reviews-h...@spark.apache.org
[GitHub] spark pull request #20838: [SPARK-23698] Resolve undefined names in Python 3
GitHub user cclauss opened a pull request: https://github.com/apache/spark/pull/20838 [SPARK-23698] Resolve undefined names in Python 3 ## What changes were proposed in this pull request? Fix issues arising from the fact that __file__, __long__, __raw_input()__, __unicode__, __xrange()__, etc. were remove from Python 3. __Undefined names__ have the potential to raise NameError at runtime. Apply the Python porting best practice: [Use feature detection instead of version detection](https://docs.python.org/3/howto/pyporting.html#use-feature-detection-instead-of-version-detection). ## How was this patch tested? * $ __python2 -m flake8 . --count --select=E9,F82 --show-source --statistics__ * $ __python3 -m flake8 . --count --select=E9,F82 --show-source --statistics__ Please review http://spark.apache.org/contributing.html before opening a pull request. You can merge this pull request into a Git repository by running: $ git pull https://github.com/cclauss/spark fix-undefined-names Alternatively you can review and apply these changes as the patch at: https://github.com/apache/spark/pull/20838.patch To close this pull request, make a commit to your master/trunk branch with (at least) the following in the commit message: This closes #20838 commit bdcd740a5144efef0db1f2b6c29b64c85f3d8ef5 Author: cclauss Date: 2018-03-15T23:08:04Z [SPARK-23698] Reduce undefined names in Python 3 --- - To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org For additional commands, e-mail: reviews-h...@spark.apache.org