GitHub user davies opened a pull request:
https://github.com/apache/spark/pull/2538
[WIP] [SPARK-2377] Python API for Streaming
This patch brings a Python API for Spark Streaming. It is still a work in progress.
TODO:
- updateStateByKey()
- windowXXX()
This patch is based on work by @giwa.
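For readers unfamiliar with the first TODO item: in the existing Scala Streaming API, updateStateByKey keeps per-key state across batches by calling a user function with the new values for a key and that key's previous state. The plain-Python sketch below models that behavior without Spark. It is not the implementation in this patch, and the names update_state_by_key and running_count are hypothetical, chosen only for illustration.

```python
def update_state_by_key(batches, update_func):
    """Model of per-key state carried across batches (no Spark involved).

    batches: list of batches, each a list of (key, value) pairs.
    update_func(new_values, last_state): returns the new state for a key,
    or None to drop that key from the state.
    Returns a snapshot of the state after each batch.
    """
    state = {}
    snapshots = []
    for batch in batches:
        # Group this batch's values by key.
        grouped = {}
        for key, value in batch:
            grouped.setdefault(key, []).append(value)

        # The update function runs for every key that has new values
        # or existing state, so idle keys are still visited.
        new_state = {}
        for key in set(state) | set(grouped):
            updated = update_func(grouped.get(key, []), state.get(key))
            if updated is not None:
                new_state[key] = updated
        state = new_state
        snapshots.append(dict(state))
    return snapshots


def running_count(new_values, last_count):
    # last_count is None the first time a key is seen.
    return sum(new_values) + (last_count or 0)


batches = [[("a", 1), ("b", 1)], [("a", 1)], []]
snapshots = update_state_by_key(batches, running_count)
```

In this model a key with no new values in a batch still reaches the update function with an empty list, which is why "b" keeps its count through the later batches.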
You can merge this pull request into a Git repository by running:
$ git pull https://github.com/davies/spark streaming
Alternatively you can review and apply these changes as the patch at:
https://github.com/apache/spark/pull/2538.patch
To close this pull request, make a commit to your master/trunk branch
with (at least) the following in the commit message:
This closes #2538
----
commit e8c7bfc556da45d33f9ffecf8c6b802fe7a7e49c
Author: giwa <[email protected]>
Date: 2014-08-11T11:31:59Z
remove export PYSPARK_PYTHON in spark submit
commit bdde697368cee7c06fcbcf4f2102fedf3a58536f
Author: giwa <[email protected]>
Date: 2014-08-11T11:42:08Z
removed unnecessary changes
commit a65f3021fc8aa5f82889a18a728eed3c901996d0
Author: giwa <[email protected]>
Date: 2014-08-11T12:32:28Z
edited the comment to add more precise description
commit 90a6484066ec2c157db6650d470e0b66cf42b342
Author: giwa <[email protected]>
Date: 2014-08-11T23:34:12Z
added mapValues and flatMapValues WIP for glom and mapPartitions test
commit 0704b86a9963c1d62b1934ce2fb47094b3fb03d3
Author: giwa <[email protected]>
Date: 2014-08-14T04:04:26Z
WIP: solved partitioned and None is not recognized
commit 080541a6d77cb85f788c297670cca24fbbc9f9b5
Author: giwa <[email protected]>
Date: 2014-08-14T09:19:46Z
broke something
commit 2112638167e258609551df6e6036f33e08ff82e3
Author: giwa <[email protected]>
Date: 2014-08-15T01:07:10Z
all tests are passed if numSlice is 2 and the number of each input is over 4
commit 536def42b9c8b0b81499e5e06d22b813f18d0bdd
Author: giwa <[email protected]>
Date: 2014-08-15T06:42:34Z
basic function test cases are passed
commit a14c7e1a59370949a5f1eab16e448cc0012fa65e
Author: giwa <[email protected]>
Date: 2014-08-15T06:46:45Z
modified streaming test case to add comment
commit e3033fcdd24258eb3836c0c07e5c959c3dfde7d2
Author: giwa <[email protected]>
Date: 2014-08-15T18:28:39Z
remove waste duplicated code
commit 89ae38a0d6bc299ebb9aa81c7510812874ce7879
Author: giwa <[email protected]>
Date: 2014-08-16T00:10:56Z
added saveAsTextFiles and saveAsPickledFiles
commit ea9c8731b3d997ead7015d721c66231064e19ff9
Author: giwa <[email protected]>
Date: 2014-08-16T05:30:58Z
added TODO comments
commit d8b593b20351d32d4ac3948778bf2ebbab86879f
Author: giwa <[email protected]>
Date: 2014-08-18T07:30:17Z
add comments
commit e7ebb08da3c59102cfad08ce4d687e56d02a0edf
Author: giwa <[email protected]>
Date: 2014-08-18T07:35:50Z
removed wasted print in DStream
commit 636090ac5323cdde6c72d48336b716693a80e010
Author: giwa <[email protected]>
Date: 2014-08-18T20:24:17Z
added sparkContext as input parameter in StreamingContext
commit a3d2379d79fdb8573963564f5c5be98558e495f2
Author: giwa <[email protected]>
Date: 2014-08-18T21:39:45Z
added groupByKey testcase
commit 665bfdb48523ecb7aa5174341a74c55c2088a891
Author: giwa <[email protected]>
Date: 2014-08-18T22:12:31Z
added testcase for combineByKey
commit 5c3a683efb76c49e6441672272bc029ecfbb687a
Author: Ken <[email protected]>
Date: 2014-07-09T01:31:41Z
initial commit for pySparkStreaming
commit e497b9bfe6ba96db46122aa369b5dba528524c2e
Author: Ken Takagiwa <[email protected]>
Date: 2014-07-15T22:41:52Z
comment PythonDStream.PairwiseDStream
commit 6e0d9c749e7ef0067a6cd7ae9d21e8b599e32d54
Author: Ken Takagiwa <[email protected]>
Date: 2014-07-16T00:19:20Z
modify dstream.py to fix indent error
commit 9af03f40bbb9d04cfe66398a8632e4398214e3d7
Author: Ken Takagiwa <[email protected]>
Date: 2014-07-16T04:08:43Z
added reducedByKey not working yet
commit dcf243f1cd0e7e5e47fb5b4ef9f269a344291f1b
Author: Ken Takagiwa <[email protected]>
Date: 2014-07-16T18:07:42Z
implementing transform function in Python
commit c5518b42c6f5b3832a508eb302c34f84cf15b864
Author: Ken Takagiwa <[email protected]>
Date: 2014-07-16T18:12:53Z
modified the code base on comment in https://github.com/tdas/spark/pull/10
commit 375817561de68b54be4b41ddbf6dbfc352d59360
Author: Ken Takagiwa <[email protected]>
Date: 2014-07-16T18:17:02Z
add comment for hack why PYSPARK_PYTHON is needed in spark-submit
commit e551e1355132ed239baf4edd51f3e275222362cc
Author: Ken Takagiwa <[email protected]>
Date: 2014-07-16T18:19:13Z
add comment for hack why PYSPARK_PYTHON is needed in spark-submit
commit 2adca8419495eaaafab2677c8e2ba6f9588dfeb0
Author: Ken Takagiwa <[email protected]>
Date: 2014-07-16T18:24:08Z
remove not implemented DStream functions in python
commit 5594bd43622adeac153642d600ab5585c5f7a2bb
Author: Ken Takagiwa <[email protected]>
Date: 2014-07-16T18:35:59Z
revert pom.xml
commit 490e338374bef5265796332f7b0a5defe6839754
Author: Ken Takagiwa <[email protected]>
Date: 2014-07-16T19:15:06Z
sorted the import following Spark coding convention
commit 856d98e67b7df23f9c86da6c42795238c8dbcdc4
Author: Ken Takagiwa <[email protected]>
Date: 2014-07-16T19:19:42Z
add empty line
commit 4ce4058a216de9118772df8b46085665bf28a51c
Author: Ken Takagiwa <[email protected]>
Date: 2014-07-16T22:40:42Z
remove unused import in python
----