GitHub user freeman-lab opened a pull request: https://github.com/apache/spark/pull/3803
    [SPARK-4969] [STREAMING] [PYTHON] Add binaryRecords to streaming

    In Spark 1.2 we added a `binaryRecords` input method for loading flat binary data. This format is useful for numerical array data, e.g. in scientific computing applications. This PR adds support for the same format in Streaming applications, where it is similarly useful, especially for streaming time series or sensor data.

    Summary of additions
    - adding `binaryRecordsStream` to Spark Streaming
    - exposing `binaryRecordsStream` in the new PySpark Streaming
    - new unit tests in Scala and Python

    This required adding an optional Hadoop configuration param to `fileStream` and `FileInputStream`, but was otherwise straightforward.

    @tdas @davies

You can merge this pull request into a Git repository by running:

    $ git pull https://github.com/freeman-lab/spark streaming-binary-records

Alternatively you can review and apply these changes as the patch at:

    https://github.com/apache/spark/pull/3803.patch

To close this pull request, make a commit to your master/trunk branch
with (at least) the following in the commit message:

    This closes #3803

----
commit 8550c2619aba22b40dc109171b395522ccfaaf08
Author: freeman <the.freeman....@gmail.com>
Date:   2014-12-25T09:31:31Z

    Expose additional argument combination

commit ecef0eb8d4bf30627e5b35c40c2f4204e1670390
Author: freeman <the.freeman....@gmail.com>
Date:   2014-12-25T09:34:49Z

    Add binaryRecordsStream to python

commit fe4e803f8810c19aac02e7c8927af1d08b2f0a94
Author: freeman <the.freeman....@gmail.com>
Date:   2014-12-25T09:35:12Z

    Add binaryRecordStream to Java API

commit 36cb0fd576abb20b9c3210774ec9ff0471e2cf48
Author: freeman <the.freeman....@gmail.com>
Date:   2014-12-25T09:35:41Z

    Add binaryRecordsStream to scala

commit 23dd69f318aedbf12cab10380a50d94ce8c3ca92
Author: freeman <the.freeman....@gmail.com>
Date:   2014-12-25T09:35:52Z

    Tests for binaryRecordsStream

commit 9398bcb615c6cbf033b796c0837c99aba83303b4
Author: freeman <the.freeman....@gmail.com>
Date:   2014-12-25T09:40:06Z
    Expose optional hadoop configuration

commit 28bff9bab7be7c2f614a011f6b68e2103234c1df
Author: freeman <the.freeman....@gmail.com>
Date:   2014-12-25T10:02:42Z

    Fix missing arg

commit 8b70fbcf785074c7cde873cf10e8d5f0ea9e3979
Author: freeman <the.freeman....@gmail.com>
Date:   2014-12-25T10:03:01Z

    Reorganization

commit 2843e9de60f23bbce3ac185c09b8575a7513fe0d
Author: freeman <the.freeman....@gmail.com>
Date:   2014-12-25T17:43:20Z

    Add params to docstring

commit 94d90d0fbc576c4e475bb0a053e6c35d53152cf4
Author: freeman <the.freeman....@gmail.com>
Date:   2014-12-25T17:44:09Z

    Spelling

commit 1c739aa67a006a62a6ee8f294ff60568f9031476
Author: freeman <the.freeman....@gmail.com>
Date:   2014-12-25T17:48:04Z

    Simpler default arg handling

commit 029d49c143c7bed603db3ca43b44d212de516df8
Author: freeman <the.freeman....@gmail.com>
Date:   2014-12-25T17:50:42Z

    Formatting

commit a4324a38f8155f6b3e776326925af61f16a2fdfb
Author: freeman <the.freeman....@gmail.com>
Date:   2014-12-25T17:56:45Z

    Line length

commit d3e75b2bad2ba5048b36300cfd61b7cb5c39414b
Author: freeman <the.freeman....@gmail.com>
Date:   2014-12-25T19:29:06Z

    Add tests in python

commit becb34474fd165ee8aae9d207532869bce3ef743
Author: freeman <the.freeman....@gmail.com>
Date:   2014-12-25T19:31:07Z

    Formatting

----
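For readers unfamiliar with the "flat binary data" format that the PR description refers to, the following is a minimal sketch in plain Python, not Spark code and not taken from the PR. It assumes the format means a byte buffer made of constant-length records with no delimiters or headers. The record layout (three little-endian doubles per record), the sample readings, and the helper names `split_records` and `decode_record` are all hypothetical and chosen only for illustration.

```python
import struct
from typing import List, Tuple

# Hypothetical record layout: three little-endian float64 values per record.
RECORD_FORMAT = "<3d"
RECORD_LENGTH = struct.calcsize(RECORD_FORMAT)  # bytes per record


def split_records(data: bytes, record_length: int) -> List[bytes]:
    """Split a flat binary buffer into fixed-length records."""
    if record_length <= 0:
        raise ValueError("record_length must be positive")
    if len(data) % record_length != 0:
        raise ValueError("buffer length is not a multiple of record_length")
    return [data[i:i + record_length]
            for i in range(0, len(data), record_length)]


def decode_record(record: bytes, fmt: str = RECORD_FORMAT) -> Tuple[float, ...]:
    """Decode one fixed-length record into a tuple of numbers."""
    return struct.unpack(fmt, record)


# Build a small buffer from made-up sensor readings, then read it back.
readings = [(1.0, 2.0, 3.0), (4.0, 5.0, 6.0)]
buffer = b"".join(struct.pack(RECORD_FORMAT, *r) for r in readings)

records = split_records(buffer, RECORD_LENGTH)
decoded = [decode_record(r) for r in records]
```

In a streaming job, the splitting step would presumably be done by the input method itself (for example `binaryRecordsStream` with a record length argument), leaving only a decode step like `decode_record` to user code; that division of labor is an assumption here, not something the PR text states.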