[
https://issues.apache.org/jira/browse/BAHIR-24?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15380419#comment-15380419
]
ASF GitHub Bot commented on BAHIR-24:
-------------------------------------
GitHub user ckadner opened a pull request:
https://github.com/apache/bahir/pull/10
[BAHIR-24] fix MQTT Python code, examples, add tests
[BAHIR-24: Fix MQTT Python code](https://issues.apache.org/jira/browse/BAHIR-24)
**Changes in this PR:**
- remove unnecessary files from `streaming-mqtt/python` (`__init__.py`,
`dstream.py`)
- update all `*.py` files to reflect the modified project structure
`pyspark.streaming.mqtt` --> `mqtt` (see Question 1 below)
- add the test cases that were left out of the import, plus a shell script to
run them (compare to
[spark-packages/dstream-mqtt](https://github.com/spark-packages/dstream-mqtt/tree/master/python-tests)):
  - `streaming-mqtt/python-tests/run-python-tests.sh`
  - `streaming-mqtt/python-tests/tests.py`
- modify `MQTTTestUtils.scala` to limit the required disk storage space
- modify the `bin/run-example` script to set up `PYTHONPATH` so that the Python
examples can run
- update the Spark version we are building against from `2.0.0-SNAPSHOT` to
`2.0.1-SNAPSHOT`
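For the package rename above (`pyspark.streaming.mqtt` --> `mqtt`), a user program that has to run against both layouts could try the candidate module names in order. This is only a minimal sketch: `import_first` is a hypothetical helper, not part of Bahir or PySpark, and the two module names are the ones discussed in this PR.

```python
import importlib


def import_first(candidates):
    """Return the first module in `candidates` that can be imported.

    Hypothetical helper for illustration only; not part of Bahir or PySpark.
    """
    for name in candidates:
        try:
            return importlib.import_module(name)
        except ImportError:
            # This layout is not available, try the next candidate.
            continue
    raise ImportError(
        "none of these modules could be imported: %s" % ", ".join(candidates)
    )


# Assumed usage in a PySpark-MQTT program: prefer the new flat layout and
# fall back to the pre-Spark 2.0 package path.
# mqtt_module = import_first(["mqtt", "pyspark.streaming.mqtt"])
# MQTTUtils = mqtt_module.MQTTUtils
```

One caveat: if an unrelated module named `mqtt` is on the path, it would be picked up first, so the order of the candidates matters.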
**Open questions:**
1. Should we preserve the original PySpark package structure (pre-Spark
2.0) so users with existing PySpark-MQTT programs don't need to change their
import statements? i.e.
- `from pyspark.streaming.mqtt import MQTTUtils` vs
- `from mqtt import MQTTUtils`
2. Should we use the `--py-files` option with `spark-submit` instead of
setting up the `PYTHONPATH` in the `bin/run-example` script? I did not, for
these reasons:
- using `--py-files` with individual `*.py` files requires changing the
example Python scripts so that the import statements come after
SparkContext initialization
- alternatively, using `--py-files` requires creating a zip with all
the required `*.py` files, but then we should change our packaged binary jar
files to include the Python sources at root level so that users can use
`--py-files spark-streaming-mqtt_2.11-2.0.0-SNAPSHOT.jar` without having to
create another zip file
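To make the zip alternative from Question 2 concrete, the archive could be built with a few lines of Python. This is a sketch under assumptions, not something this PR adds: `zip_python_sources` is a hypothetical helper, and the directory and archive names in the usage note are placeholders.

```python
import os
import zipfile


def zip_python_sources(source_dir, zip_path):
    """Write every *.py file under source_dir into zip_path.

    Paths inside the archive are relative to source_dir, so files located
    directly in source_dir end up at the root level of the zip.
    Hypothetical helper for illustration only.
    """
    added = []
    with zipfile.ZipFile(zip_path, "w", zipfile.ZIP_DEFLATED) as archive:
        for root, _dirs, files in os.walk(source_dir):
            for file_name in sorted(files):
                if file_name.endswith(".py"):
                    full_path = os.path.join(root, file_name)
                    arcname = os.path.relpath(full_path, source_dir)
                    archive.write(full_path, arcname)
                    added.append(arcname)
    return sorted(added)
```

Assumed usage: `zip_python_sources("streaming-mqtt/python", "mqtt-python.zip")`, then pass the result to `spark-submit --py-files mqtt-python.zip`. Shipping the Python sources at root level inside the binary jar would make this extra step unnecessary.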
You can merge this pull request into a Git repository by running:
$ git pull https://github.com/ckadner/bahir BAHIR-24_MQTT_Python_fixes
Alternatively you can review and apply these changes as the patch at:
https://github.com/apache/bahir/pull/10.patch
To close this pull request, make a commit to your master/trunk branch
with (at least) the following in the commit message:
This closes #10
----
commit 817b959fb19f8c3fba88844d5a2664a7490f0bde
Author: Christian Kadner <[email protected]>
Date: 2016-07-16T00:49:40Z
[BAHIR-24] fix MQTT Python code, examples, add tests
----
> Fix MQTT Python code
> --------------------
>
> Key: BAHIR-24
> URL: https://issues.apache.org/jira/browse/BAHIR-24
> Project: Bahir
> Issue Type: Bug
> Components: Spark Streaming Connectors
> Affects Versions: 2.0.0
> Reporter: Christian Kadner
> Assignee: Christian Kadner
> Original Estimate: 12h
> Remaining Estimate: 12h
>
> When the Bahir project was created from Spark revision {{8301fadd8}}, the
> Python code (incl. examples) was not updated with respect to the modified
> project structure, and test cases were left out of the import.