[ https://issues.apache.org/jira/browse/BIGTOP-1181?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13871238#comment-13871238 ]
Sean Mackrory edited comment on BIGTOP-1181 at 1/14/14 9:40 PM:
----------------------------------------------------------------
pyspark is a Python shell for Spark. Here are a couple of quick examples that I tested:
{code}sc.parallelize([1,2,3]).sum(){code}
And assuming you have a dictionary at hdfs:///words:
{code}sc.textFile("/words").filter(lambda w: w.startswith("spar")).take(5){code}
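As a rough cross-check without a cluster, the two computations above can be sketched in plain Python. This is only an illustration of what the calls compute, not PySpark itself: the `words` list is made-up sample data standing in for the lines of hdfs:///words, and `total` and `matches` are hypothetical names.

```python
from itertools import islice

# Plain-Python stand-in for sc.parallelize([1,2,3]).sum()
total = sum([1, 2, 3])

# Made-up sample lines standing in for the contents of hdfs:///words
words = ["spa", "spar", "spare", "spark", "sparkle", "sparrow", "sparse", "spat"]

# Stand-in for .filter(lambda w: w.startswith("spar")).take(5):
# keep matching words lazily and stop after the first five
matches = list(islice((w for w in words if w.startswith("spar")), 5))

print(total)    # 6
print(matches)  # ['spar', 'spare', 'spark', 'sparkle', 'sparrow']
```

In the pyspark shell itself, `sc` is already defined, so the one-liners above can be typed as-is.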
was (Author: mackrorysd):
pyspark is a Python shell for Spark. Here are a couple of quick examples that I tested:
{code}sc.parallelize([1,2,3]).sum(){code}
And assuming you have a dictionary at hdfs:///words:
{code}sc.textFile("/usr/share/dict/words").filter(lambda w: w.startswith("spar")).take(5){code}
> Add pyspark to spark package
> ----------------------------
>
> Key: BIGTOP-1181
> URL: https://issues.apache.org/jira/browse/BIGTOP-1181
> Project: Bigtop
> Issue Type: Bug
> Reporter: Sean Mackrory
> Assignee: Sean Mackrory
> Attachments: 0001-BIGTOP-1181.-Add-pyspark-to-spark-package.patch
>
>
--
This message was sent by Atlassian JIRA
(v6.1.5#6160)