[GitHub] spark pull request #14180: Wheelhouse and VirtualEnv support

2016-08-02 Thread Stibbons
Github user Stibbons commented on a diff in the pull request: https://github.com/apache/spark/pull/14180#discussion_r73142048 --- Diff: python/pyspark/worker.py --- @@ -19,18 +19,27 @@ Worker that receives input from Piped RDD. """ from __

[GitHub] spark pull request #14180: Wheelhouse and VirtualEnv support

2016-08-02 Thread Stibbons
Github user Stibbons commented on a diff in the pull request: https://github.com/apache/spark/pull/14180#discussion_r73142199 --- Diff: .editorconfig --- @@ -0,0 +1,15 @@ +root = true + +[*] +indent_style = space +indent_size = 4 +end_of_line = lf

[GitHub] spark issue #14180: Wheelhouse and VirtualEnv support

2016-08-02 Thread Stibbons
Github user Stibbons commented on the issue: https://github.com/apache/spark/pull/14180 Yes, I'll be glad ! It is not fully ready yet, I still need to figure out how the script is launched in each situation --- If your project is set up for it, you can reply to this email and have

[GitHub] spark pull request #14567: Python import reorg

2016-08-09 Thread Stibbons
Github user Stibbons commented on a diff in the pull request: https://github.com/apache/spark/pull/14567#discussion_r74102329 --- Diff: python/pyspark/context.py --- @@ -22,22 +22,30 @@ import signal import sys import threading -from threading import RLock

[GitHub] spark issue #14180: Wheelhouse and VirtualEnv support

2016-08-09 Thread Stibbons
Github user Stibbons commented on the issue: https://github.com/apache/spark/pull/14180 Opened #14567 with Pep8, import reorganisations and editorconfig.

[GitHub] spark pull request #14567: Python import reorg

2016-08-09 Thread Stibbons
GitHub user Stibbons opened a pull request: https://github.com/apache/spark/pull/14567 Python import reorg ## What changes were proposed in this pull request? This patch adds a code style validation script following pep8 recommendations. Features: - add

[GitHub] spark pull request #14567: Python import reorg

2016-08-09 Thread Stibbons
Github user Stibbons commented on a diff in the pull request: https://github.com/apache/spark/pull/14567#discussion_r74101512 --- Diff: python/run-tests.py --- @@ -37,11 +45,6 @@ sys.path.append(os.path.join(os.path.dirname(os.path.realpath(__file__)), ".

[GitHub] spark issue #14567: Python import reorg

2016-08-09 Thread Stibbons
Github user Stibbons commented on the issue: https://github.com/apache/spark/pull/14567 Rebased, sorry I had to force push this PR.

[GitHub] spark pull request #14567: [SPARK-16992][PYSPARK] Python Pep8 formatting and...

2016-08-12 Thread Stibbons
Github user Stibbons commented on a diff in the pull request: https://github.com/apache/spark/pull/14567#discussion_r74590160 --- Diff: python/pyspark/cloudpickle.py --- @@ -194,7 +194,7 @@ def save_function(self, obj, name=None): # we'll pickle the actual function

[GitHub] spark pull request #14567: [SPARK-16992][PYSPARK] Python Pep8 formatting and...

2016-08-12 Thread Stibbons
Github user Stibbons commented on a diff in the pull request: https://github.com/apache/spark/pull/14567#discussion_r74590863 --- Diff: python/pep8rc --- @@ -0,0 +1,21 @@ +# Licensed to the Apache Software Foundation (ASF) under one or more +# contributor license agreements

[GitHub] spark issue #14567: [SPARK-16992][PYSPARK] Python Pep8 formatting and import...

2016-08-12 Thread Stibbons
Github user Stibbons commented on the issue: https://github.com/apache/spark/pull/14567 I reworked my Pull Request: - tox.ini and pep8rc are factorized - manual pep8 fixes that are not done by autopep8 are done - there is NO automatic formatting of the code. I propose

[GitHub] spark pull request #14567: [SPARK-16992][PYSPARK] Python Pep8 formatting and...

2016-08-12 Thread Stibbons
Github user Stibbons commented on a diff in the pull request: https://github.com/apache/spark/pull/14567#discussion_r74615638 --- Diff: python/pyspark/ml/param/shared.py --- @@ -25,7 +25,8 @@ class HasMaxIter(Params): Mixin for param maxIter: max number of iterations (>

[GitHub] spark pull request #14567: [SPARK-16992][PYSPARK] Python Pep8 formatting and...

2016-08-12 Thread Stibbons
Github user Stibbons commented on a diff in the pull request: https://github.com/apache/spark/pull/14567#discussion_r74615609 --- Diff: python/pyspark/cloudpickle.py --- @@ -42,17 +42,17 @@ """ --- End diff -- isort/pep8 executed on this file to mak

[GitHub] spark issue #14567: [SPARK-16992] Python Pep8 formatting and import reorgani...

2016-08-10 Thread Stibbons
Github user Stibbons commented on the issue: https://github.com/apache/spark/pull/14567 Done

[GitHub] spark issue #14180: [SPARK-16367][PYSPARK] Wheelhouse and VirtualEnv support

2016-08-10 Thread Stibbons
Github user Stibbons commented on the issue: https://github.com/apache/spark/pull/14180 Rebased, without import reorg and editorconfig files. Still not fully validated.

[GitHub] spark pull request #14180: Wheelhouse and VirtualEnv support

2016-07-13 Thread Stibbons
GitHub user Stibbons opened a pull request: https://github.com/apache/spark/pull/14180 Wheelhouse and VirtualEnv support ## What changes were proposed in this pull request? Support virtualenv and wheel in PySpark, based on SPARK-13587. Full description in [SPARK-16367

[GitHub] spark issue #5408: [SPARK-6764] Add wheel package support for PySpark

2016-07-04 Thread Stibbons
Github user Stibbons commented on the issue: https://github.com/apache/spark/pull/5408 I am working on a new version. Will propose a PR soon

[GitHub] spark issue #14180: Wheelhouse and VirtualEnv support

2016-08-09 Thread Stibbons
Github user Stibbons commented on the issue: https://github.com/apache/spark/pull/14180 Yes I am back from vacation! Can work on it now :)

[GitHub] spark pull request #14567: Python import reorg

2016-08-09 Thread Stibbons
Github user Stibbons commented on a diff in the pull request: https://github.com/apache/spark/pull/14567#discussion_r74100949 --- Diff: python/pyspark/context.py --- @@ -22,22 +22,30 @@ import signal import sys import threading -from threading import RLock

[GitHub] spark issue #14567: [SPARK-16992][PYSPARK] Python Pep8 formatting and import...

2016-08-16 Thread Stibbons
Github user Stibbons commented on the issue: https://github.com/apache/spark/pull/14567 Any update?

[GitHub] spark issue #14918: [SPARK-17360][PYSPARK] Support generator in createDataFr...

2017-01-25 Thread Stibbons
Github user Stibbons commented on the issue: https://github.com/apache/spark/pull/14918 Ok, abandoning

[GitHub] spark pull request #14918: [SPARK-17360][PYSPARK] Support generator in creat...

2017-01-25 Thread Stibbons
Github user Stibbons closed the pull request at: https://github.com/apache/spark/pull/14918

[GitHub] spark pull request #14830: [SPARK-16992][PYSPARK][DOCS] import sort and auto...

2017-02-15 Thread Stibbons
Github user Stibbons commented on a diff in the pull request: https://github.com/apache/spark/pull/14830#discussion_r101262948 --- Diff: examples/src/main/python/ml/decision_tree_classification_example.py --- @@ -65,8 +67,9 @@ predictions.select("predi

[GitHub] spark pull request #14830: [SPARK-16992][PYSPARK][DOCS] import sort and auto...

2017-02-15 Thread Stibbons
Github user Stibbons commented on a diff in the pull request: https://github.com/apache/spark/pull/14830#discussion_r101263394 --- Diff: examples/src/main/python/mllib/naive_bayes_example.py --- @@ -24,16 +24,17 @@ from __future__ import print_function

[GitHub] spark pull request #14830: [SPARK-16992][PYSPARK][DOCS] import sort and auto...

2017-02-15 Thread Stibbons
Github user Stibbons commented on a diff in the pull request: https://github.com/apache/spark/pull/14830#discussion_r101263279 --- Diff: examples/src/main/python/mllib/streaming_linear_regression_example.py --- @@ -25,13 +25,14 @@ # $example off$ from pyspark

[GitHub] spark pull request #14830: [SPARK-16992][PYSPARK][DOCS] import sort and auto...

2017-02-15 Thread Stibbons
Github user Stibbons commented on a diff in the pull request: https://github.com/apache/spark/pull/14830#discussion_r101263182 --- Diff: examples/src/main/python/streaming/network_wordjoinsentiments.py --- @@ -54,22 +54,25 @@ def print_happiest_words(rdd): # Read

[GitHub] spark issue #14830: [SPARK-16992][PYSPARK][DOCS] import sort and autopep8 on...

2017-02-15 Thread Stibbons
Github user Stibbons commented on the issue: https://github.com/apache/spark/pull/14830 Hello. This is actually the result of running the pylint/autopep8 config proposed in #14963. I can shrink this PR a little more by ignoring more rules.

[GitHub] spark pull request #14830: [SPARK-16992][PYSPARK][DOCS] import sort and auto...

2017-02-15 Thread Stibbons
Github user Stibbons commented on a diff in the pull request: https://github.com/apache/spark/pull/14830#discussion_r101262606 --- Diff: examples/src/main/python/ml/count_vectorizer_example.py --- @@ -17,23 +17,26 @@ from __future__ import print_function -from

[GitHub] spark pull request #14567: [SPARK-16992][PYSPARK] Python Pep8 formatting and...

2016-08-19 Thread Stibbons
Github user Stibbons commented on a diff in the pull request: https://github.com/apache/spark/pull/14567#discussion_r75561735 --- Diff: dev/py-validate.sh --- @@ -0,0 +1,110 @@ +#!/usr/bin/env bash --- End diff -- This is the script that formats all Python code

[GitHub] spark pull request #14567: [SPARK-16992][PYSPARK] Python Pep8 formatting and...

2016-08-19 Thread Stibbons
Github user Stibbons commented on a diff in the pull request: https://github.com/apache/spark/pull/14567#discussion_r75561821 --- Diff: python/pyspark/ml/param/shared.py --- @@ -25,7 +25,8 @@ class HasMaxIter(Params): Mixin for param maxIter: max number of iterations (>

[GitHub] spark pull request #14567: [SPARK-16992][PYSPARK] Python Pep8 formatting and...

2016-08-22 Thread Stibbons
Github user Stibbons commented on a diff in the pull request: https://github.com/apache/spark/pull/14567#discussion_r75642520 --- Diff: python/pyspark/cloudpickle.py --- @@ -280,7 +279,7 @@ def extract_code_globals(co): # see if nested function have any global refs

[GitHub] spark issue #14567: [SPARK-16992][PYSPARK] Python Pep8 formatting and import...

2016-08-22 Thread Stibbons
Github user Stibbons commented on the issue: https://github.com/apache/spark/pull/14567 Reworked according to your review. Please keep in mind this is just the first part of a two part pull request, the second will contain a selected part of [this diff](https://github.com/apache

[GitHub] spark issue #14180: [SPARK-16367][PYSPARK] Wheelhouse and VirtualEnv support

2016-09-05 Thread Stibbons
Github user Stibbons commented on the issue: https://github.com/apache/spark/pull/14180 Hello. Can someone help review this PR? I find the way Spark currently handles Python programs really problematic. With this proposal (built on top of #13599), job deployment becomes so much

[GitHub] spark pull request #14963: [SPARK-16992][PYSPARK] Reenable Pylint

2016-09-05 Thread Stibbons
Github user Stibbons commented on a diff in the pull request: https://github.com/apache/spark/pull/14963#discussion_r77542206 --- Diff: dev/lint-python --- @@ -26,30 +67,26 @@ PYLINT_REPORT_PATH="$SPARK_ROOT_DIR/dev/pylint-report.txt" PYLINT_INSTALL_INFO="$SPA

[GitHub] spark pull request #14830: [SPARK-16992][PYSPARK][DOCS] PEP8 on Pyspark docu...

2016-09-05 Thread Stibbons
Github user Stibbons commented on a diff in the pull request: https://github.com/apache/spark/pull/14830#discussion_r77530346 --- Diff: examples/src/main/python/ml/string_indexer_example.py --- @@ -22,6 +22,7 @@ # $example off$ from pyspark.sql import SparkSession

[GitHub] spark pull request #14830: [SPARK-16992][PYSPARK][DOCS] PEP8 on Pyspark docu...

2016-09-05 Thread Stibbons
Github user Stibbons commented on a diff in the pull request: https://github.com/apache/spark/pull/14830#discussion_r77530312 --- Diff: examples/src/main/python/ml/tf_idf_example.py --- @@ -18,7 +18,7 @@ from __future__ import print_function # $example on$ -from

[GitHub] spark pull request #14830: [SPARK-16992][PYSPARK][DOCS] PEP8 on Pyspark docu...

2016-09-05 Thread Stibbons
Github user Stibbons commented on a diff in the pull request: https://github.com/apache/spark/pull/14830#discussion_r77530245 --- Diff: docs/streaming-programming-guide.md --- @@ -1626,10 +1626,10 @@ See the full [source code]({{site.SPARK_GITHUB_URL}}/blob/v{{site.SPARK_VERSION_

[GitHub] spark pull request #14963: [SPARK-16992][PYSPARK] Reenable Pylint

2016-09-05 Thread Stibbons
GitHub user Stibbons opened a pull request: https://github.com/apache/spark/pull/14963 [SPARK-16992][PYSPARK] Reenable Pylint Use a virtualenv for isolation and easy installation. This basically reverts 85a50a6352b72c4619d010e29e3a76774dbc0c71 Might have been

[GitHub] spark issue #14567: [SPARK-16992][PYSPARK] Python Pep8 formatting and import...

2016-09-05 Thread Stibbons
Github user Stibbons commented on the issue: https://github.com/apache/spark/pull/14567 I am not sure if there is a test in pylint on the backslash syntax, there are some cases like with the ```with``` statement where the backslash might not be easily replaceable (see https
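
To illustrate the backslash case raised in this comment, here is a minimal sketch (not code from the pull request). Ordinary expressions can be continued with parentheses, but a `with` statement holding several context managers can only be parenthesized as official syntax from Python 3.10, so older interpreters need a backslash. The names `total`, `left`, `right`, and `pair` are made up for the example.

```python
import io

# Ordinary expressions can be split across lines with parentheses,
# so no backslash is needed here.
total = (1 +
         2 +
         3)

# A `with` statement holding several context managers is different:
# wrapping them in parentheses is only official syntax from Python 3.10,
# so code that must run on older interpreters keeps the backslash.
with io.StringIO("left") as left, \
        io.StringIO("right") as right:
    pair = (left.read(), right.read())
```

A lint rule that forbids every backslash continuation would therefore need an exception for this pattern on interpreters older than 3.10.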

[GitHub] spark pull request #14830: [SPARK-16992][PYSPARK][DOCS] PEP8 on Pyspark docu...

2016-09-05 Thread Stibbons
Github user Stibbons commented on a diff in the pull request: https://github.com/apache/spark/pull/14830#discussion_r77530179 --- Diff: docs/streaming-programming-guide.md --- @@ -1099,7 +1099,7 @@ joinedStream = stream1.join(stream2) {% endhighlight %} -Here

[GitHub] spark pull request #14830: [SPARK-16992][PYSPARK][DOCS] PEP8 on Pyspark docu...

2016-09-05 Thread Stibbons
Github user Stibbons commented on a diff in the pull request: https://github.com/apache/spark/pull/14830#discussion_r77530186 --- Diff: docs/streaming-programming-guide.md --- @@ -1585,7 +1585,7 @@ public class JavaRow implements java.io.Serializable { /** DataFrame

[GitHub] spark pull request #14963: [SPARK-16992][PYSPARK] Reenable Pylint

2016-09-05 Thread Stibbons
Github user Stibbons commented on a diff in the pull request: https://github.com/apache/spark/pull/14963#discussion_r77540952 --- Diff: python/pylintrc --- @@ -84,7 +84,85 @@ enable= # If you would like to improve the code quality of pyspark, remove any of these disabled

[GitHub] spark issue #13599: [SPARK-13587] [PYSPARK] Support virtualenv in pyspark

2016-09-06 Thread Stibbons
Github user Stibbons commented on the issue: https://github.com/apache/spark/pull/13599 @zjffdu any news?

[GitHub] spark issue #14918: [SPARK-17360][PYSPARK] Support generator in createDataFr...

2016-09-06 Thread Stibbons
Github user Stibbons commented on the issue: https://github.com/apache/spark/pull/14918 at last, tests pass ! :)

[GitHub] spark issue #14863: [SPARK-16992][PYSPARK] use map comprehension in doc

2016-09-01 Thread Stibbons
Github user Stibbons commented on the issue: https://github.com/apache/spark/pull/14863 I agree. I would prefer that the Spark examples also "promote" good Python practice, i.e. replacing 'map' and 'filter' with list or map comprehension ('reduce' has no equivalent on com
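
A small sketch of the substitution discussed in this comment, using plain Python collections rather than Spark RDDs (the variable names are illustrative and do not come from the Spark examples). `map` and `filter` each have a direct comprehension form; `reduce` does not, so it stays a function call.

```python
from functools import reduce

values = [1, 2, 3, 4, 5]

# map/filter style: wrapped in list() so the result is a list on Python 3 too.
squares_via_map = list(map(lambda x: x * x, values))
evens_via_filter = list(filter(lambda x: x % 2 == 0, values))

# Comprehension style: same results, no lambda needed.
squares = [x * x for x in values]
evens = [x for x in values if x % 2 == 0]

# A dict comprehension builds a mapping, which map() itself never returns.
square_by_value = {x: x * x for x in values}

# reduce has no comprehension counterpart.
total = reduce(lambda a, b: a + b, values)
```

Note that `map()` yields a sequence of results, never a dict, which is why a dict comprehension is a different construct and not a drop-in replacement for it.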

[GitHub] spark pull request #14918: [SPARK-17360][PYSPARK] Support generator in creat...

2016-09-01 Thread Stibbons
Github user Stibbons commented on a diff in the pull request: https://github.com/apache/spark/pull/14918#discussion_r77174714 --- Diff: python/pyspark/sql/session.py --- @@ -396,14 +398,18 @@ def _createFromLocal(self, data, schema): raise TypeError("schema s

[GitHub] spark pull request #14863: [SPARK-16992][PYSPARK] use map comprehension in d...

2016-09-01 Thread Stibbons
Github user Stibbons commented on a diff in the pull request: https://github.com/apache/spark/pull/14863#discussion_r77165741 --- Diff: examples/src/main/python/ml/vector_slicer_example.py --- @@ -32,8 +32,8 @@ # $example on$ df = spark.createDataFrame

[GitHub] spark pull request #14863: [SPARK-16992][PYSPARK] use map comprehension in d...

2016-09-01 Thread Stibbons
Github user Stibbons commented on a diff in the pull request: https://github.com/apache/spark/pull/14863#discussion_r77165689 --- Diff: examples/src/main/python/ml/quantile_discretizer_example.py --- @@ -29,7 +29,7 @@ .getOrCreate() # $example

[GitHub] spark issue #14863: [SPARK-16992][PYSPARK] use map comprehension in doc

2016-09-01 Thread Stibbons
Github user Stibbons commented on the issue: https://github.com/apache/spark/pull/14863 No, my proposal was wrong. I have updated it

[GitHub] spark pull request #14918: [SPARK-17360][PYSPARK] Support generator in creat...

2016-09-01 Thread Stibbons
Github user Stibbons commented on a diff in the pull request: https://github.com/apache/spark/pull/14918#discussion_r77174968 --- Diff: python/pyspark/sql/tests.py --- @@ -196,7 +199,8 @@ def setUpClass(cls): cls.tempdir = tempfile.NamedTemporaryFile(delete=False

[GitHub] spark issue #14863: [SPARK-16992][PYSPARK] use map comprehension in doc

2016-09-01 Thread Stibbons
Github user Stibbons commented on the issue: https://github.com/apache/spark/pull/14863 This is actually wrong, 'map()' returns a 'list' and not a dict

[GitHub] spark pull request #14918: [SPARK-17360][PYSPARK] Support generator in creat...

2016-09-01 Thread Stibbons
GitHub user Stibbons opened a pull request: https://github.com/apache/spark/pull/14918 [SPARK-17360][PYSPARK] Support generator in createDataFrame ## What changes were proposed in this pull request? Avoid useless iteration within 'data' structure when creating a data frame
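
For context on why generators need care here, the following is a generic Python sketch, not the code of this pull request and not PySpark's implementation. A generator can be consumed only once, so code that looks at the first row (for instance to infer a schema) and then processes every row must avoid a second pass. The helper `peek_first` and the generator `rows` are hypothetical names.

```python
import itertools


def rows():
    """Hypothetical data source that yields rows lazily."""
    for i in range(3):
        yield (i, str(i))


def peek_first(iterable):
    """Return the first item plus an iterator that still yields every item.

    Raises StopIteration if the iterable is empty.
    """
    it = iter(iterable)
    first = next(it)
    # Put the consumed item back in front of the remaining ones.
    return first, itertools.chain([first], it)


first, all_rows = peek_first(rows())
collected = list(all_rows)
```

With this approach the source is traversed a single time, and the first row is still available for inspection before the rest is materialized.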

[GitHub] spark pull request #14918: [SPARK-17360][PYSPARK] Support generator in creat...

2016-09-01 Thread Stibbons
Github user Stibbons commented on a diff in the pull request: https://github.com/apache/spark/pull/14918#discussion_r77174898 --- Diff: python/pyspark/sql/session.py --- @@ -396,14 +398,18 @@ def _createFromLocal(self, data, schema): raise TypeError("schema s

[GitHub] spark pull request #14918: [SPARK-17360][PYSPARK] Support generator in creat...

2016-09-01 Thread Stibbons
Github user Stibbons commented on a diff in the pull request: https://github.com/apache/spark/pull/14918#discussion_r77174813 --- Diff: python/pyspark/sql/session.py --- @@ -373,16 +375,16 @@ def _createFromRDD(self, rdd, schema, samplingRatio): rdd = rdd.map

[GitHub] spark issue #14567: [SPARK-16992][PYSPARK] Python Pep8 formatting and import...

2016-09-01 Thread Stibbons
Github user Stibbons commented on the issue: https://github.com/apache/spark/pull/14567 Rebased. Update: - move ```.editorconfig``` up to the root of the project. This is needed so that editor plugins find it and configure both Scala and Python files. I didn't find

[GitHub] spark issue #14180: [SPARK-16367][PYSPARK] Wheelhouse and VirtualEnv support

2016-09-07 Thread Stibbons
Github user Stibbons commented on the issue: https://github.com/apache/spark/pull/14180 I have written a blog post about this pull request to explain what we can do with it: http://www.great-a-blog.co/wheel-deployment-for-pyspark/

[GitHub] spark issue #14180: [SPARK-16367][PYSPARK] Wheelhouse and VirtualEnv support

2016-08-30 Thread Stibbons
Github user Stibbons commented on the issue: https://github.com/apache/spark/pull/14180 Status for test "standalone install, 'client' deployment": - virtualenv create and pip install Pypi repository: ok (1 min 30 exec) - wheelhouse (Pypi repository): ko, because 'cffi' re

[GitHub] spark pull request #14567: [SPARK-16992][PYSPARK] Python Pep8 formatting and...

2016-09-01 Thread Stibbons
Github user Stibbons commented on a diff in the pull request: https://github.com/apache/spark/pull/14567#discussion_r77177492 --- Diff: .gitignore --- @@ -28,6 +28,7 @@ build/*.jar build/apache-maven* build/scala* build/zinc* +build/venv --- End diff

[GitHub] spark pull request #14567: [SPARK-16992][PYSPARK] Python Pep8 formatting and...

2016-08-31 Thread Stibbons
Github user Stibbons commented on a diff in the pull request: https://github.com/apache/spark/pull/14567#discussion_r76967333 --- Diff: dev/isort.cfg --- @@ -1,9 +1,9 @@ # Licensed to the Apache Software Foundation (ASF) under one or more -# contributor license agreements

[GitHub] spark pull request #14567: [SPARK-16992][PYSPARK] Python Pep8 formatting and...

2016-08-31 Thread Stibbons
Github user Stibbons commented on a diff in the pull request: https://github.com/apache/spark/pull/14567#discussion_r76968414 --- Diff: python/pep8rc --- @@ -0,0 +1,21 @@ +# Licensed to the Apache Software Foundation (ASF) under one or more +# contributor license agreements

[GitHub] spark pull request #14567: [SPARK-16992][PYSPARK] Python Pep8 formatting and...

2016-08-31 Thread Stibbons
Github user Stibbons commented on a diff in the pull request: https://github.com/apache/spark/pull/14567#discussion_r76967736 --- Diff: dev/py-validate.sh --- @@ -0,0 +1,110 @@ +#!/usr/bin/env bash --- End diff -- My point of view: - don't enforce it right away

[GitHub] spark pull request #14567: [SPARK-16992][PYSPARK] Python Pep8 formatting and...

2016-08-31 Thread Stibbons
Github user Stibbons commented on a diff in the pull request: https://github.com/apache/spark/pull/14567#discussion_r76967915 --- Diff: python/.editorconfig --- @@ -0,0 +1,30 @@ +# Licensed to the Apache Software Foundation (ASF) under one or more +# contributor license

[GitHub] spark pull request #14963: [SPARK-16992][PYSPARK] Reenable Pylint

2016-09-09 Thread Stibbons
Github user Stibbons commented on a diff in the pull request: https://github.com/apache/spark/pull/14963#discussion_r78135769 --- Diff: dev/lint-python --- @@ -17,6 +17,47 @@ # limitations under the License. # +VIRTUAL_ENV_DIR="build/venv" +

[GitHub] spark issue #14963: [SPARK-16992][PYSPARK] Virtualenv for Pylint and pep8

2016-09-09 Thread Stibbons
Github user Stibbons commented on the issue: https://github.com/apache/spark/pull/14963 Yes, but for pylint you have many dependencies to update as well (astroid,...). At least with a virtualenv, pip does it for us :) And every time I see a hard-coded external URL I am afraid

[GitHub] spark pull request #14963: [SPARK-16992][PYSPARK] Reenable Pylint

2016-09-09 Thread Stibbons
Github user Stibbons commented on a diff in the pull request: https://github.com/apache/spark/pull/14963#discussion_r78135524 --- Diff: dev/lint-python --- @@ -26,30 +67,26 @@ PYLINT_REPORT_PATH="$SPARK_ROOT_DIR/dev/pylint-report.txt" PYLINT_INSTALL_INFO="$SPA

[GitHub] spark pull request #14963: [SPARK-16992][PYSPARK] Reenable Pylint

2016-09-09 Thread Stibbons
Github user Stibbons commented on a diff in the pull request: https://github.com/apache/spark/pull/14963#discussion_r78135582 --- Diff: dev/requirements.txt --- @@ -1,3 +1,5 @@ jira==1.0.3 PyGithub==1.26.0 Unidecode==0.04.19 +pep8==1.7.0 --- End diff

[GitHub] spark issue #14963: [SPARK-16992][PYSPARK] Reenable Pylint

2016-09-09 Thread Stibbons
Github user Stibbons commented on the issue: https://github.com/apache/spark/pull/14963 Hum I don't see how it was reenabled... Where is it called? And I had many errors to ignore once I have reenabled it on the execution of lint-python. I'll update the title. At least, being

[GitHub] spark pull request #14963: [SPARK-16992][PYSPARK] Virtualenv for Pylint and ...

2016-09-09 Thread Stibbons
Github user Stibbons commented on a diff in the pull request: https://github.com/apache/spark/pull/14963#discussion_r78136301 --- Diff: dev/lint-python --- @@ -26,30 +67,26 @@ PYLINT_REPORT_PATH="$SPARK_ROOT_DIR/dev/pylint-report.txt" PYLINT_INSTALL_INFO="$SPA

[GitHub] spark issue #14830: [SPARK-16992][PYSPARK][DOCS] import sort and autopep8 on...

2016-09-09 Thread Stibbons
Github user Stibbons commented on the issue: https://github.com/apache/spark/pull/14830 Sure !

[GitHub] spark pull request #15026: [SPARK-17472] [PYSPARK] Better error message for ...

2016-09-13 Thread Stibbons
Github user Stibbons commented on a diff in the pull request: https://github.com/apache/spark/pull/15026#discussion_r78669516 --- Diff: python/pyspark/broadcast.py --- @@ -75,7 +75,13 @@ def __init__(self, sc=None, value=None, pickle_registry=None, path=None

[GitHub] spark pull request #15026: [SPARK-17472] [PYSPARK] Better error message for ...

2016-09-12 Thread Stibbons
Github user Stibbons commented on a diff in the pull request: https://github.com/apache/spark/pull/15026#discussion_r78464418 --- Diff: python/pyspark/broadcast.py --- @@ -75,7 +75,13 @@ def __init__(self, sc=None, value=None, pickle_registry=None, path=None

[GitHub] spark issue #14963: [SPARK-16992][PYSPARK] Virtualenv for Pylint and pep8 in...

2016-09-16 Thread Stibbons
Github user Stibbons commented on the issue: https://github.com/apache/spark/pull/14963 Hello, sorry to bother you, but if this patch gets merged, I can work on the pylint errors I had to add to pylint's ignore list and submit new PRs for them. If I reenable most of them, here

[GitHub] spark issue #14918: [SPARK-17360][PYSPARK] Support generator in createDataFr...

2016-09-09 Thread Stibbons
Github user Stibbons commented on the issue: https://github.com/apache/spark/pull/14918 Indeed, I don't think it will be feasible to propagate the generator up to the JVM. It would be cool, because when we have the schema there is no need to iterate several times on the complete

[GitHub] spark issue #14963: [SPARK-16992][PYSPARK] Virtualenv for Pylint and pep8

2016-09-09 Thread Stibbons
Github user Stibbons commented on the issue: https://github.com/apache/spark/pull/14963 Great !

[GitHub] spark issue #14963: [SPARK-16992][PYSPARK] Virtualenv for Pylint and pep8

2016-09-09 Thread Stibbons
Github user Stibbons commented on the issue: https://github.com/apache/spark/pull/14963 Great ! By the way, I am also working on virtualenv and wheel support for PySpark job deployment (see #14180)

[GitHub] spark pull request #13599: [SPARK-13587] [PYSPARK] Support virtualenv in pys...

2016-09-26 Thread Stibbons
Github user Stibbons commented on a diff in the pull request: https://github.com/apache/spark/pull/13599#discussion_r80423208 --- Diff: core/src/main/scala/org/apache/spark/api/python/PythonWorkerFactory.scala --- @@ -69,6 +84,67 @@ private[spark] class PythonWorkerFactory

[GitHub] spark issue #14963: [SPARK-16992][PYSPARK] Virtualenv for Pylint and pep8 in...

2016-10-08 Thread Stibbons
Github user Stibbons commented on the issue: https://github.com/apache/spark/pull/14963 I would love to have a bit more feedback on this matter but it does not seem to interest core developers, sadly :( It's a bit disappointing, seeing how Python support on Spark is great, being

[GitHub] spark issue #14963: [SPARK-16992][PYSPARK] Virtualenv for Pylint and pep8 in...

2016-10-08 Thread Stibbons
Github user Stibbons commented on the issue: https://github.com/apache/spark/pull/14963 sorry I wrote on the wrong pull request, I was talking about the virtualenv support for executor (#14180) :( This one is indeed only to reenable pylint. Merging this would allow me to submit

[GitHub] spark issue #14963: [SPARK-16992][PYSPARK] Virtualenv for Pylint and pep8 in...

2016-09-21 Thread Stibbons
Github user Stibbons commented on the issue: https://github.com/apache/spark/pull/14963 What is the PR dashboard ? I usually rebase this patch once or twice a week, I'll do it tomorrow

[GitHub] spark issue #14918: [SPARK-17360][PYSPARK] Support generator in createDataFr...

2016-09-22 Thread Stibbons
Github user Stibbons commented on the issue: https://github.com/apache/spark/pull/14918 For information, I continue to look at this kind of simple optimisation that does not cost too much. Python is a pretty slow language, very productive in terms of code writing, but inefficient

[GitHub] spark issue #14963: [SPARK-16992][PYSPARK] Virtualenv for Pylint and pep8 in...

2016-09-22 Thread Stibbons
Github user Stibbons commented on the issue: https://github.com/apache/spark/pull/14963 Pull Request rebased

[GitHub] spark issue #14918: [SPARK-17360][PYSPARK] Support generator in createDataFr...

2016-09-22 Thread Stibbons
Github user Stibbons commented on the issue: https://github.com/apache/spark/pull/14918 Rebased

[GitHub] spark issue #14830: [SPARK-16992][PYSPARK][DOCS] import sort and autopep8 on...

2016-09-22 Thread Stibbons
Github user Stibbons commented on the issue: https://github.com/apache/spark/pull/14830 Rebased

[GitHub] spark issue #14567: [SPARK-16992][PYSPARK] Python Pep8 formatting and import...

2016-09-22 Thread Stibbons
Github user Stibbons commented on the issue: https://github.com/apache/spark/pull/14567 Rebased

[GitHub] spark issue #14180: [SPARK-16367][PYSPARK] Wheelhouse and VirtualEnv support

2016-09-22 Thread Stibbons
Github user Stibbons commented on the issue: https://github.com/apache/spark/pull/14180 Rebased.

[GitHub] spark issue #14180: [SPARK-16367][PYSPARK] Wheelhouse and VirtualEnv support

2016-08-18 Thread Stibbons
Github user Stibbons commented on the issue: https://github.com/apache/spark/pull/14180 We are implementing Mesos here (may take a while). While not so many people use it, on paper it looks great ;) Please mail me at gaetan[a t]xeberon.net if that is easier for you

[GitHub] spark issue #14180: [SPARK-16367][PYSPARK] Wheelhouse and VirtualEnv support

2016-08-18 Thread Stibbons
Github user Stibbons commented on the issue: https://github.com/apache/spark/pull/14180 Actually I was waiting for #14567 to be reviewed and merged :( I might have some questions on how Spark deploys Python scripts on YARN or Mesos, if you know how it works

[GitHub] spark issue #14830: [SPARK-16992][PYSPARK] autopep8 on documentation example...

2016-08-27 Thread Stibbons
Github user Stibbons commented on the issue: https://github.com/apache/spark/pull/14830 Yes, I will try to understand how it works and make it beautiful. The goal is to move toward automating this kind of code housekeeping, but it may take some time. I'll continue to submit part

[GitHub] spark pull request #14830: [SPARK-16992][PYSPARK] autopep8 on documentation ...

2016-08-27 Thread Stibbons
Github user Stibbons commented on a diff in the pull request: https://github.com/apache/spark/pull/14830#discussion_r76513509 --- Diff: examples/src/main/python/ml/binarizer_example.py --- @@ -17,9 +17,10 @@ from __future__ import print_function -from

[GitHub] spark pull request #14567: [SPARK-16992][PYSPARK] Python Pep8 formatting and...

2016-08-26 Thread Stibbons
Github user Stibbons commented on a diff in the pull request: https://github.com/apache/spark/pull/14567#discussion_r76393422 --- Diff: python/pyspark/cloudpickle.py --- @@ -280,7 +279,7 @@ def extract_code_globals(co): # see if nested function have any global refs

[GitHub] spark issue #14567: [SPARK-16992][PYSPARK] Python Pep8 formatting and import...

2016-08-26 Thread Stibbons
Github user Stibbons commented on the issue: https://github.com/apache/spark/pull/14567 Changes: - use pycodestyle (replaces the 'pep8' tool since May 2016) - keep using autopep8 for light modifications - add yapf style ```dev/style.yapf.ini``` for people who wish to use

[GitHub] spark issue #14567: [SPARK-16992][PYSPARK] Python Pep8 formatting and import...

2016-08-26 Thread Stibbons
Github user Stibbons commented on the issue: https://github.com/apache/spark/pull/14567 @davies, @holdenk, @srowen, @rxin do you have time to review this PR? Thanks

[GitHub] spark pull request #14805: [MINOR][DOCS] Fix minor typos in python example c...

2016-08-26 Thread Stibbons
Github user Stibbons commented on a diff in the pull request: https://github.com/apache/spark/pull/14805#discussion_r76393985 --- Diff: docs/structured-streaming-programming-guide.md --- @@ -487,11 +487,11 @@ Dataset[Row] csvDF = spark spark = SparkSession

[GitHub] spark pull request #14830: [SPARK-16992][PYSPARK] [DO NOT MERGE] #14567 exec...

2016-08-26 Thread Stibbons
GitHub user Stibbons opened a pull request: https://github.com/apache/spark/pull/14830 [SPARK-16992][PYSPARK] [DO NOT MERGE] #14567 execution example This is a set of files that has been formatted by the script defined in #14567. Not all files are formatted, only

[GitHub] spark pull request #14567: [SPARK-16992][PYSPARK] Python Pep8 formatting and...

2016-08-26 Thread Stibbons
Github user Stibbons commented on a diff in the pull request: https://github.com/apache/spark/pull/14567#discussion_r76391648 --- Diff: .editorconfig --- @@ -0,0 +1,30 @@ +# Licensed to the Apache Software Foundation (ASF) under one or more --- End diff -- Should

[GitHub] spark pull request #14567: [SPARK-16992][PYSPARK] Python Pep8 formatting and...

2016-08-26 Thread Stibbons
Github user Stibbons commented on a diff in the pull request: https://github.com/apache/spark/pull/14567#discussion_r76392662 --- Diff: .isort.cfg --- @@ -1,9 +1,9 @@ # Licensed to the Apache Software Foundation (ASF) under one or more -# contributor license agreements

[GitHub] spark issue #14567: [SPARK-16992][PYSPARK] Python Pep8 formatting and import...

2016-08-26 Thread Stibbons
Github user Stibbons commented on the issue: https://github.com/apache/spark/pull/14567 thanks a lot :)

[GitHub] spark issue #14805: [MINOR][DOCS] Fix minor typos in python example code

2016-08-26 Thread Stibbons
Github user Stibbons commented on the issue: https://github.com/apache/spark/pull/14805 For information, autopep8 will do all the space changes automatically

[GitHub] spark pull request #14567: [SPARK-16992][PYSPARK] Python Pep8 formatting and...

2016-08-26 Thread Stibbons
Github user Stibbons commented on a diff in the pull request: https://github.com/apache/spark/pull/14567#discussion_r76394439 --- Diff: .editorconfig --- @@ -0,0 +1,30 @@ +# Licensed to the Apache Software Foundation (ASF) under one or more --- End diff -- yes

[GitHub] spark issue #14805: [MINOR][DOCS] Fix minor typos in python example code

2016-08-26 Thread Stibbons
Github user Stibbons commented on the issue: https://github.com/apache/spark/pull/14805 Looks good to me. I think it is good to have the spaces around kwargs assignment removed, to distinguish between ``` a = 1 ``` and ``` afunc(a=1) ``` which is a good
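The spacing convention mentioned in this comment can be sketched as follows. This is a minimal illustration of the PEP 8 rule, not code from the pull request; `afunc` and its parameters are placeholder names reused from the comment.

```python
def afunc(a=0, b=0):
    """Toy function with keyword parameters; defaults also use no spaces."""
    return a + b


# Plain assignment: one space on each side of "=".
a = 1

# Keyword arguments in a call: no spaces around "=", so the reader can
# tell at a glance that nothing is being bound in the enclosing scope.
total = afunc(a=1, b=2)
```

Style checkers such as pycodestyle report the spaced form `afunc(a = 1)` under code E251.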
