[ 
https://issues.apache.org/jira/browse/BAHIR-24?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15380419#comment-15380419
 ] 

ASF GitHub Bot commented on BAHIR-24:
-------------------------------------

GitHub user ckadner opened a pull request:

    https://github.com/apache/bahir/pull/10

    [BAHIR-24] fix MQTT Python code, examples, add tests

    [BAHIR-24: Fix MQTT Python code](https://issues.apache.org/jira/browse/BAHIR-24)
    
    
    **Changes in this PR:**
    
    - remove unnecessary files from `streaming-mqtt/python` (`__init__.py`, `dstream.py`)
    - update all `*.py` files to match the modified project structure `pyspark.streaming.mqtt` --> `mqtt` (see Question 1 below)
    - add the test cases that were left out of the import, plus a shell script to run them (compare to [spark-packages/dstream-mqtt](https://github.com/spark-packages/dstream-mqtt/tree/master/python-tests))
      - `streaming-mqtt/python-tests/run-python-tests.sh`
      - `streaming-mqtt/python-tests/tests.py`
    - modify `MQTTTestUtils.scala` to limit the required disk storage space
    - modify the `bin/run-example` script to set up `PYTHONPATH` for running the Python examples
    - update the Spark version we are building against from `2.0.0-SNAPSHOT` to `2.0.1-SNAPSHOT`
    
     
    **Open questions:**
    
    1. Should we preserve the original PySpark package structure (pre-Spark 2.0) so that users with existing PySpark-MQTT programs don't need to change their import statements? That is:
    
      - `from pyspark.streaming.mqtt import MQTTUtils` vs 
      - `from mqtt import MQTTUtils`
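
    A minimal sketch of how a user program could tolerate either layout while this question is open. The helper names `import_first_available` and `load_mqtt_utils` are hypothetical and not part of this PR; only the two module paths come from the question above.

```python
import importlib


def import_first_available(module_names, attribute):
    """Return `attribute` from the first module in `module_names` that imports."""
    errors = []
    for name in module_names:
        try:
            module = importlib.import_module(name)
        except ImportError as exc:
            # Remember why this candidate failed and try the next one.
            errors.append(f"{name}: {exc}")
            continue
        return getattr(module, attribute)
    raise ImportError(
        "none of the candidate modules could be imported: " + "; ".join(errors)
    )


def load_mqtt_utils():
    # Hypothetical helper: try the pre-Spark 2.0 location first,
    # then the flat `mqtt` module introduced by this PR.
    return import_first_available(["pyspark.streaming.mqtt", "mqtt"], "MQTTUtils")
```

    With something like this, existing programs would call `load_mqtt_utils()` once instead of hard-coding one of the two import statements.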
    
    2. Should we use the `--py-files` option with `spark-submit` instead of setting up the `PYTHONPATH` in the `bin/run-example` script? I did not, for these reasons:
    
      - using the `--py-files` option with individual `*.py` files requires changing the example Python scripts so that the import statements come after SparkContext initialization
      - alternatively, the `--py-files` option requires creating a zip with all the required `*.py` files, but then we should change our packaged binary jar files to include the Python sources at root level so that users can use `--py-files spark-streaming-mqtt_2.11-2.0.0-SNAPSHOT.jar` without having to create another zip file
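
    For the zip alternative, a rough sketch of bundling the Python sources so that they sit at the root of the archive. The function name `build_py_files_zip` and the output file name in the usage note are assumptions for illustration; nothing like this is included in the PR.

```python
import zipfile
from pathlib import Path


def build_py_files_zip(source_dir, zip_path):
    """Zip every *.py file under source_dir, with paths relative to source_dir."""
    source_dir = Path(source_dir)
    with zipfile.ZipFile(zip_path, "w", zipfile.ZIP_DEFLATED) as archive:
        for py_file in sorted(source_dir.rglob("*.py")):
            # Store entries relative to source_dir so modules land at the zip root.
            archive.write(py_file, py_file.relative_to(source_dir).as_posix())
    return zip_path
```

    Assuming the sources live in `streaming-mqtt/python`, calling `build_py_files_zip("streaming-mqtt/python", "mqtt-python.zip")` would produce an archive that could then be passed as `--py-files mqtt-python.zip`.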
    


You can merge this pull request into a Git repository by running:

    $ git pull https://github.com/ckadner/bahir BAHIR-24_MQTT_Python_fixes

Alternatively you can review and apply these changes as the patch at:

    https://github.com/apache/bahir/pull/10.patch

To close this pull request, make a commit to your master/trunk branch
with (at least) the following in the commit message:

    This closes #10
    
----
commit 817b959fb19f8c3fba88844d5a2664a7490f0bde
Author: Christian Kadner <ckad...@us.ibm.com>
Date:   2016-07-16T00:49:40Z

    [BAHIR-24] fix MQTT Python code, examples, add tests

----


> Fix MQTT Python code
> --------------------
>
>                 Key: BAHIR-24
>                 URL: https://issues.apache.org/jira/browse/BAHIR-24
>             Project: Bahir
>          Issue Type: Bug
>          Components: Spark Streaming Connectors
>    Affects Versions: 2.0.0
>            Reporter: Christian Kadner
>            Assignee: Christian Kadner
>   Original Estimate: 12h
>  Remaining Estimate: 12h
>
> When the Bahir project was created from Spark revision {{8301fadd8}}, the 
> Python code (incl. examples) was not updated to match the modified 
> project structure, and test cases were left out of the import.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)
