GitHub user industrial-sloth opened a pull request:
https://github.com/apache/spark/pull/3858
[SPARK-5037] dynamically loaded DStreams implementation and example
This PR adds a new reflection-based method of creating input DStreams to
the Scala StreamingContext, and wires it through to the Python streaming API.
Creating DStream instances directly by reflection runs into trouble with
unwanted objects being pulled into closures, so I worked around this by
defining a new abstract, serializable `ReflectedDStreamFactory` class. The idea
is that one subclasses this with a concrete implementation that directly
instantiates the desired InputDStream; the StreamingContext then uses
reflection to dynamically load that factory implementation. This PR also
includes an example showing how this works with the existing ZeroMQ example
code in both the Scala and Python streaming APIs.
Parameters reach the input DStream indirectly: they are first passed to the
factory constructor, and the factory implementation is then required to pass
them on to the DStream instance. At the moment these parameters are limited to
String type, which I think should cover the majority of use cases, but it
should be possible to generalize this further.
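To make the factory idea concrete, here is a minimal, Spark-free sketch of the
pattern described above. It is not the code in this PR: `FakeInputStream`,
`StreamFactory`, `ZmqLikeFactory` and `loadFactory` are hypothetical names made
up for illustration, and the real `ReflectedDStreamFactory` signature may well
differ. The only assumption carried over from the description is that the
factory is serializable and takes String-only constructor parameters.

```scala
// Hypothetical stand-in for an input DStream; this is NOT Spark's InputDStream.
final case class FakeInputStream(url: String, topic: String)

// Assumed shape of the factory: serializable, String-only constructor parameters.
abstract class StreamFactory(protected val params: Seq[String]) extends Serializable {
  def instantiate(): FakeInputStream
}

// Concrete factory: forwards its constructor parameters to the stream it builds.
class ZmqLikeFactory(args: Seq[String]) extends StreamFactory(args) {
  require(params.length == 2, "expected two parameters: url, topic")
  override def instantiate(): FakeInputStream = FakeInputStream(params(0), params(1))
}

object ReflectedFactoryDemo {
  // Look the factory class up by name and call its (Seq[String]) constructor.
  // The Seq is passed as the single constructor argument, not expanded as varargs.
  def loadFactory(className: String, params: Seq[String]): StreamFactory =
    Class.forName(className)
      .getConstructor(classOf[Seq[String]])
      .newInstance(params)
      .asInstanceOf[StreamFactory]

  def main(args: Array[String]): Unit = {
    // Only the class name and plain strings cross the call boundary here.
    val name = classOf[ZmqLikeFactory].getName
    val stream = loadFactory(name, Seq("tcp://127.0.0.1:1234", "foo")).instantiate()
    println(stream)
  }
}
```

The sketch assumes the classes are compiled as ordinary top-level classes;
classes nested inside another class or a REPL/script wrapper would need an
outer instance and the constructor lookup would fail. Because the caller only
supplies a class name and a list of strings, the same call could in principle
be driven from a non-JVM front end, which is presumably what makes this shape
convenient for the Python API.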
I'm putting this up for comment; suggestions and alternative approaches are
more than welcome.
You can merge this pull request into a Git repository by running:
$ git pull https://github.com/industrial-sloth/spark reflected-dstreams
Alternatively you can review and apply these changes as the patch at:
https://github.com/apache/spark/pull/3858.patch
To close this pull request, make a commit to your master/trunk branch
with (at least) the following in the commit message:
This closes #3858
----
commit 2ffec19c21348934911a56a14799a0ddcae5e4da
Author: industrial-sloth
Date: 2014-12-31T16:54:48Z
dynamically loaded DStreams implementation and example
----