[ https://issues.apache.org/jira/browse/BAHIR-24?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15380419#comment-15380419 ]
ASF GitHub Bot commented on BAHIR-24:
-------------------------------------

GitHub user ckadner opened a pull request:

    https://github.com/apache/bahir/pull/10

    [BAHIR-24] fix MQTT Python code, examples, add tests

    [BAHIR-24: Fix MQTT Python code](https://issues.apache.org/jira/browse/BAHIR-24)

    **Changes in this PR:**

    - remove unnecessary files from `streaming-mqtt/python` (`__init__.py`, `dstream.py`)
    - update all `*.py` files with respect to the modified project structure `pyspark.streaming.mqtt` --> `mqtt` (see Question 1 below)
    - add test cases that were left out from the import and add a shell script to run them (compare to [spark-packages/dstream-mqtt](https://github.com/spark-packages/dstream-mqtt/tree/master/python-tests))
      - `streaming-mqtt/python-tests/run-python-tests.sh`
      - `streaming-mqtt/python-tests/tests.py`
    - modify `MQTTTestUtils.scala` to limit the required disk storage space
    - modify the `bin/run-example` script to set up `PYTHONPATH` to run Python examples
    - update the Spark version we are building against from `2.0.0-SNAPSHOT` to `2.0.1-SNAPSHOT`

    **Open questions:**

    1. Should we preserve the original PySpark package structure (pre-Spark 2.0) so users with existing PySpark-MQTT programs don't need to change their import statements? i.e.
       - `from pyspark.streaming.mqtt import MQTTUtils` vs.
       - `from mqtt import MQTTUtils`
    2. Should we use the `--py-files` option with `spark-submit` as opposed to setting up the `PYTHONPATH` in the `bin/run-example` script? I did not do it for these reasons:
       - the `--py-files` option with individual `*.py` files requires that the example Python scripts be changed to move the import statements after SparkContext initialization
       - alternatively, the `--py-files` option requires creating a zip with all the required `*.py` files, but then we should change our packaged binary jar files to include the Python sources at root level so that users can use `--py-files spark-streaming-mqtt_2.11-2.0.0-SNAPSHOT.jar` without having to create another zip file

You can merge this pull request into a Git repository by running:

    $ git pull https://github.com/ckadner/bahir BAHIR-24_MQTT_Python_fixes

Alternatively you can review and apply these changes as the patch at:

    https://github.com/apache/bahir/pull/10.patch

To close this pull request, make a commit to your master/trunk branch
with (at least) the following in the commit message:

    This closes #10

----
commit 817b959fb19f8c3fba88844d5a2664a7490f0bde
Author: Christian Kadner <ckad...@us.ibm.com>
Date:   2016-07-16T00:49:40Z

    [BAHIR-24] fix MQTT Python code, examples, add tests

----

> Fix MQTT Python code
> --------------------
>
>                 Key: BAHIR-24
>                 URL: https://issues.apache.org/jira/browse/BAHIR-24
>             Project: Bahir
>          Issue Type: Bug
>          Components: Spark Streaming Connectors
>    Affects Versions: 2.0.0
>            Reporter: Christian Kadner
>            Assignee: Christian Kadner
>   Original Estimate: 12h
>  Remaining Estimate: 12h
>
> When the Bahir project was created from Spark revision {{8301fadd8}} the
> Python code (incl. examples) was not updated with respect to the modified
> project structure and test cases were left out from the import.

--
This message was sent by Atlassian JIRA
(v6.3.4#6332)
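
Regarding open question 1 in the pull request description: whichever package layout is chosen, a user program could tolerate both `pyspark.streaming.mqtt` and `mqtt` by trying each import in turn. The following is only a sketch of that idea and is not part of the pull request; the helper name `import_mqtt_utils` and its `candidates` parameter are hypothetical, and it assumes that exactly one of the two module paths from the question is importable at runtime.

```python
import importlib


def import_mqtt_utils(candidates=("pyspark.streaming.mqtt", "mqtt")):
    """Return the MQTTUtils class from the first importable candidate module.

    Hypothetical helper: the default candidates are the two module paths
    named in open question 1 (pre-Spark 2.0 layout first, new layout second).
    """
    errors = []
    for name in candidates:
        try:
            module = importlib.import_module(name)
            return getattr(module, "MQTTUtils")
        except (ImportError, AttributeError) as exc:
            # Remember why this candidate failed and try the next one.
            errors.append("{}: {}".format(name, exc))
    raise ImportError("MQTTUtils not found; tried " + "; ".join(errors))
```

A program would then call `MQTTUtils = import_mqtt_utils()` once, instead of hard-coding either `from pyspark.streaming.mqtt import MQTTUtils` or `from mqtt import MQTTUtils`. This only helps on the user side; it does not settle which layout the project itself should ship.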