Repository: spark
Updated Branches:
  refs/heads/master 3b5aaa6a5 -> f17d43b03


[SPARK-6219] [Build] Check that Python code compiles

This PR expands the Python lint checks so that they check for obvious 
compilation errors in our Python code.

For example:

```
$ ./dev/lint-python
Python lint checks failed.
Compiling ./ec2/spark_ec2.py ...
  File "./ec2/spark_ec2.py", line 618
    return (master_nodes,, slave_nodes)
                         ^
SyntaxError: invalid syntax

./ec2/spark_ec2.py:618:25: E231 missing whitespace after ','
./ec2/spark_ec2.py:1117:101: E501 line too long (102 > 100 characters)
```

This PR also bumps up the version of `pep8`. It ignores new types of checks 
introduced by that version bump while fixing problems missed by the older 
version of `pep8` we were using.

Author: Nicholas Chammas <nicholas.cham...@gmail.com>

Closes #4941 from nchammas/compile-spark-ec2 and squashes the following commits:

75e31d8 [Nicholas Chammas] upgrade pep8 + check compile
b33651c [Nicholas Chammas] PEP8 line length


Project: http://git-wip-us.apache.org/repos/asf/spark/repo
Commit: http://git-wip-us.apache.org/repos/asf/spark/commit/f17d43b0
Tree: http://git-wip-us.apache.org/repos/asf/spark/tree/f17d43b0
Diff: http://git-wip-us.apache.org/repos/asf/spark/diff/f17d43b0

Branch: refs/heads/master
Commit: f17d43b033d928dbc46aef8e367aa08902e698ad
Parents: 3b5aaa6
Author: Nicholas Chammas <nicholas.cham...@gmail.com>
Authored: Thu Mar 19 12:46:10 2015 -0700
Committer: Josh Rosen <joshro...@databricks.com>
Committed: Thu Mar 19 12:46:10 2015 -0700

----------------------------------------------------------------------
 dev/lint-python  | 44 +++++++++++++++++++++++++++-----------------
 ec2/spark_ec2.py |  4 ++--
 2 files changed, 29 insertions(+), 19 deletions(-)
----------------------------------------------------------------------


http://git-wip-us.apache.org/repos/asf/spark/blob/f17d43b0/dev/lint-python
----------------------------------------------------------------------
diff --git a/dev/lint-python b/dev/lint-python
index 772f856..fded654 100755
--- a/dev/lint-python
+++ b/dev/lint-python
@@ -19,43 +19,53 @@
 
 SCRIPT_DIR="$( cd "$( dirname "$0" )" && pwd )"
 SPARK_ROOT_DIR="$(dirname "$SCRIPT_DIR")"
-PEP8_REPORT_PATH="$SPARK_ROOT_DIR/dev/pep8-report.txt"
+PATHS_TO_CHECK="./python/pyspark/ ./ec2/spark_ec2.py 
./examples/src/main/python/"
+PYTHON_LINT_REPORT_PATH="$SPARK_ROOT_DIR/dev/python-lint-report.txt"
 
 cd "$SPARK_ROOT_DIR"
 
+# compileall: https://docs.python.org/2/library/compileall.html
+python -B -m compileall -q -l $PATHS_TO_CHECK > "$PYTHON_LINT_REPORT_PATH"
+compile_status="${PIPESTATUS[0]}"
+
 # Get pep8 at runtime so that we don't rely on it being installed on the build 
server.
 #+ See: https://github.com/apache/spark/pull/1744#issuecomment-50982162
 #+ TODOs:
-#+  - Dynamically determine latest release version of pep8 and use that.
-#+  - Download this from a more reliable source. (GitHub raw can be flaky, 
apparently. (?))
+#+  - Download pep8 from PyPI. It's more "official".
 PEP8_SCRIPT_PATH="$SPARK_ROOT_DIR/dev/pep8.py"
-PEP8_SCRIPT_REMOTE_PATH="https://raw.githubusercontent.com/jcrocholl/pep8/1.5.7/pep8.py";
-PEP8_PATHS_TO_CHECK="./python/pyspark/ ./ec2/spark_ec2.py 
./examples/src/main/python/"
+PEP8_SCRIPT_REMOTE_PATH="https://raw.githubusercontent.com/jcrocholl/pep8/1.6.2/pep8.py";
 
+# if [ ! -e "$PEP8_SCRIPT_PATH" ]; then
 curl --silent -o "$PEP8_SCRIPT_PATH" "$PEP8_SCRIPT_REMOTE_PATH"    
-curl_status=$?
+curl_status="$?"
 
-if [ $curl_status -ne 0 ]; then
+if [ "$curl_status" -ne 0 ]; then
     echo "Failed to download pep8.py from \"$PEP8_SCRIPT_REMOTE_PATH\"."
-    exit $curl_status
+    exit "$curl_status"
 fi
-
+# fi
 
 # There is no need to write this output to a file
 #+ first, but we do so so that the check status can
 #+ be output before the report, like with the
 #+ scalastyle and RAT checks.
-python "$PEP8_SCRIPT_PATH" $PEP8_PATHS_TO_CHECK > "$PEP8_REPORT_PATH"
-pep8_status=${PIPESTATUS[0]} #$?
+python "$PEP8_SCRIPT_PATH" --ignore=E402,E731,E241,W503,E226 $PATHS_TO_CHECK 
>> "$PYTHON_LINT_REPORT_PATH"
+pep8_status="${PIPESTATUS[0]}"
+
+if [ "$compile_status" -eq 0 -a "$pep8_status" -eq 0 ]; then
+    lint_status=0
+else
+    lint_status=1
+fi
 
-if [ $pep8_status -ne 0 ]; then
-    echo "PEP 8 checks failed."
-    cat "$PEP8_REPORT_PATH"
+if [ "$lint_status" -ne 0 ]; then
+    echo "Python lint checks failed."
+    cat "$PYTHON_LINT_REPORT_PATH"
 else
-    echo "PEP 8 checks passed."
+    echo "Python lint checks passed."
 fi
 
-rm "$PEP8_REPORT_PATH"
 rm "$PEP8_SCRIPT_PATH"
+rm "$PYTHON_LINT_REPORT_PATH"
 
-exit $pep8_status
+exit "$lint_status"

http://git-wip-us.apache.org/repos/asf/spark/blob/f17d43b0/ec2/spark_ec2.py
----------------------------------------------------------------------
diff --git a/ec2/spark_ec2.py b/ec2/spark_ec2.py
index f848874..c467cd0 100755
--- a/ec2/spark_ec2.py
+++ b/ec2/spark_ec2.py
@@ -1159,8 +1159,8 @@ def real_main():
             if EC2_INSTANCE_TYPES[opts.instance_type] != \
                EC2_INSTANCE_TYPES[opts.master_instance_type]:
                 print >> stderr, \
-                    "Error: spark-ec2 currently does not support having a 
master and slaves with " + \
-                    "different AMI virtualization types."
+                    "Error: spark-ec2 currently does not support having a 
master and slaves " + \
+                    "with different AMI virtualization types."
                 print >> stderr, "master instance virtualization type: 
{t}".format(
                     t=EC2_INSTANCE_TYPES[opts.master_instance_type])
                 print >> stderr, "slave instance virtualization type: 
{t}".format(


---------------------------------------------------------------------
To unsubscribe, e-mail: commits-unsubscr...@spark.apache.org
For additional commands, e-mail: commits-h...@spark.apache.org

Reply via email to