Hi Mohit,

Welcome to the Spark community! We normally review feature proposals as GitHub pull requests. Would you mind submitting one? The contribution process is covered here:
https://cwiki.apache.org/confluence/display/SPARK/Contributing+to+Spark

On Tue, Sep 16, 2014 at 9:16 PM, Mohit Jaggi <mohitja...@gmail.com> wrote:
> https://issues.apache.org/jira/browse/SPARK-3489
>
> Folks,
> I am Mohit Jaggi and I work for Ayasdi Inc. After experimenting with Spark
> for a while and discovering its awesomeness(!) I made an attempt to
> provide a wrapper API that looks like R and/or pandas dataframe.
>
> https://github.com/AyasdiOpenSource/df
>
> "df" uses a collection of RDDs, each element in the collection being a
> column in a dataframe. To make rows from the columns I used zip() in a loop
> but that is not very efficient. I created JIRA 3489 requesting a zip()
> variant that zips a sequence of RDDs. I noticed that it was easy to write
> that code so I wrote that code and it seems to work. I attached the diff to
> the jira. I believe that this API would be useful in general and is not
> specific to "df". Please take a look at the request and the proposed
> solution and let me know what you think.
>
> Cheers,
> Mohit
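
For anyone skimming the thread, the difference between zipping columns in a loop and zipping a whole sequence at once can be sketched roughly as below. This is only an illustration in plain Python: ordinary lists stand in for the column RDDs, and the names `columns`, `rows_by_pairwise_zip` and `rows_by_single_zip` are hypothetical. It is not the code from "df" or from the diff attached to the JIRA.

```python
# Hypothetical stand-in data: each list plays the role of one column RDD.
columns = [
    [1, 2, 3],
    ["a", "b", "c"],
    [0.5, 1.5, 2.5],
]


def rows_by_pairwise_zip(cols):
    """Build rows by zipping in one column at a time, as a loop of zip() calls would."""
    # Start with one-element rows taken from the first column.
    rows = [(value,) for value in cols[0]]
    for col in cols[1:]:
        # Every pass builds a new intermediate collection of widened rows.
        rows = [row + (value,) for row, value in zip(rows, col)]
    return rows


def rows_by_single_zip(cols):
    """Build rows with a single n-ary zip over all columns at once."""
    return list(zip(*cols))


pairwise_rows = rows_by_pairwise_zip(columns)
single_rows = rows_by_single_zip(columns)
```

Both functions yield the same rows. The loop version creates one intermediate collection per column, which is presumably the kind of overhead a zip() variant over a sequence of RDDs would avoid.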