GitHub user freeman-lab opened a pull request:

    https://github.com/apache/spark/pull/3803

    [SPARK-4969] [STREAMING] [PYTHON] Add binaryRecords to streaming

    In Spark 1.2 we added a `binaryRecords` input method for loading flat 
binary data. This format is useful for numerical array data, e.g. in scientific 
computing applications. This PR adds support for the same format in Streaming 
applications, where it is similarly useful, especially for streaming time 
series or sensor data.
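
To make the format concrete, here is a minimal, Spark-free sketch of what "flat binary data" with fixed-length records looks like. The record layout (three little-endian float64 values) and the helper names `encode_records`, `split_records`, and `decode_record` are assumptions made for illustration and are not taken from the PR.

```python
import struct

# Hypothetical record layout: three little-endian float64 values per record.
RECORD_FORMAT = "<3d"
RECORD_LENGTH = struct.calcsize(RECORD_FORMAT)  # bytes per record


def encode_records(rows):
    """Pack rows of numbers into one flat byte string with no header or delimiters."""
    return b"".join(struct.pack(RECORD_FORMAT, *row) for row in rows)


def split_records(data, record_length):
    """Cut a flat byte string into consecutive chunks of record_length bytes."""
    if len(data) % record_length != 0:
        # Rejecting partial records is a choice made for this sketch only.
        raise ValueError("data length is not a multiple of the record length")
    return [data[i:i + record_length] for i in range(0, len(data), record_length)]


def decode_record(record):
    """Unpack one fixed-length record back into a tuple of floats."""
    return struct.unpack(RECORD_FORMAT, record)


rows = [(1.0, 2.0, 3.0), (4.0, 5.0, 6.0)]
blob = encode_records(rows)
records = split_records(blob, RECORD_LENGTH)
decoded = [decode_record(r) for r in records]
```

In a streaming job, the splitting step would presumably be handled by the input method itself, for example something like `ssc.binaryRecordsStream(directory, RECORD_LENGTH).map(decode_record)`, assuming the Python method takes a directory and a record length; the exact signature should be checked against the merged code.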
    
    Summary of additions
    - added `binaryRecordsStream` to Spark Streaming
    - exposed `binaryRecordsStream` in the new PySpark Streaming API
    - added new unit tests in Scala and Python
    
    This required adding an optional Hadoop configuration parameter to `fileStream` 
and `FileInputDStream`, but was otherwise straightforward.
    
    @tdas @davies

You can merge this pull request into a Git repository by running:

    $ git pull https://github.com/freeman-lab/spark streaming-binary-records

Alternatively you can review and apply these changes as the patch at:

    https://github.com/apache/spark/pull/3803.patch

To close this pull request, make a commit to your master/trunk branch
with (at least) the following in the commit message:

    This closes #3803
    
----
commit 8550c2619aba22b40dc109171b395522ccfaaf08
Author: freeman <the.freeman....@gmail.com>
Date:   2014-12-25T09:31:31Z

    Expose additional argument combination

commit ecef0eb8d4bf30627e5b35c40c2f4204e1670390
Author: freeman <the.freeman....@gmail.com>
Date:   2014-12-25T09:34:49Z

    Add binaryRecordsStream to python

commit fe4e803f8810c19aac02e7c8927af1d08b2f0a94
Author: freeman <the.freeman....@gmail.com>
Date:   2014-12-25T09:35:12Z

    Add binaryRecordStream to Java API

commit 36cb0fd576abb20b9c3210774ec9ff0471e2cf48
Author: freeman <the.freeman....@gmail.com>
Date:   2014-12-25T09:35:41Z

    Add binaryRecordsStream to scala

commit 23dd69f318aedbf12cab10380a50d94ce8c3ca92
Author: freeman <the.freeman....@gmail.com>
Date:   2014-12-25T09:35:52Z

    Tests for binaryRecordsStream

commit 9398bcb615c6cbf033b796c0837c99aba83303b4
Author: freeman <the.freeman....@gmail.com>
Date:   2014-12-25T09:40:06Z

    Expose optional hadoop configuration

commit 28bff9bab7be7c2f614a011f6b68e2103234c1df
Author: freeman <the.freeman....@gmail.com>
Date:   2014-12-25T10:02:42Z

    Fix missing arg

commit 8b70fbcf785074c7cde873cf10e8d5f0ea9e3979
Author: freeman <the.freeman....@gmail.com>
Date:   2014-12-25T10:03:01Z

    Reorganization

commit 2843e9de60f23bbce3ac185c09b8575a7513fe0d
Author: freeman <the.freeman....@gmail.com>
Date:   2014-12-25T17:43:20Z

    Add params to docstring

commit 94d90d0fbc576c4e475bb0a053e6c35d53152cf4
Author: freeman <the.freeman....@gmail.com>
Date:   2014-12-25T17:44:09Z

    Spelling

commit 1c739aa67a006a62a6ee8f294ff60568f9031476
Author: freeman <the.freeman....@gmail.com>
Date:   2014-12-25T17:48:04Z

    Simpler default arg handling

commit 029d49c143c7bed603db3ca43b44d212de516df8
Author: freeman <the.freeman....@gmail.com>
Date:   2014-12-25T17:50:42Z

    Formatting

commit a4324a38f8155f6b3e776326925af61f16a2fdfb
Author: freeman <the.freeman....@gmail.com>
Date:   2014-12-25T17:56:45Z

    Line length

commit d3e75b2bad2ba5048b36300cfd61b7cb5c39414b
Author: freeman <the.freeman....@gmail.com>
Date:   2014-12-25T19:29:06Z

    Add tests in python

commit becb34474fd165ee8aae9d207532869bce3ef743
Author: freeman <the.freeman....@gmail.com>
Date:   2014-12-25T19:31:07Z

    Formatting

----


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastruct...@apache.org or file a JIRA ticket
with INFRA.
---

---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org
For additional commands, e-mail: reviews-h...@spark.apache.org
