Re: Contributing to PySpark
Hi Krishna,

Thanks for your interest in contributing to PySpark! I don't personally use either of those IDEs, so I'll leave that part for someone else to answer. In general, you can find the documentation on building Spark at http://spark.apache.org/docs/latest/building-spark.html, which includes notes on how to run the Python tests as well. You will also probably want to check out the contributing to Spark guide at https://cwiki.apache.org/confluence/display/SPARK/Contributing+to+Spark.

Cheers,

Holden :)

On Tue, Oct 18, 2016 at 2:16 AM, Krishna Kalyan <krishnakaly...@gmail.com> wrote:
> Hello,
> I am a masters student. Could someone please let me know how to set up my
> dev environment to contribute to PySpark.
> Questions I had were:
> a) Should I use IntelliJ IDEA or PyCharm?
> b) How do I test my changes?
>
> Regards,
> Krishna

--
Cell: 425-233-8271
Twitter: https://twitter.com/holdenkarau
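The build-and-test loop Holden points at boils down to something like the following. This is a sketch paraphrased from the building-spark page; script names and flags vary across Spark versions, so check `python/run-tests --help` in your own checkout rather than taking these literally:

```shell
# Sketch of the typical PySpark dev loop (commands may differ by Spark version).

# Get the source:
git clone https://github.com/apache/spark.git
cd spark

# Build the JVM side once -- PySpark tests need the assembled jars
# to be able to launch a JVM via py4j:
./build/mvn -DskipTests clean package

# Run the full Python test suite:
./python/run-tests

# Or restrict to specific modules, e.g.:
./python/run-tests --modules=pyspark-sql
```

The full clone and Maven build take a while the first time; after that, iterating on pure-Python changes usually only requires rerunning the relevant test module.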
Contributing to PySpark
Hello,

I am a masters student. Could someone please let me know how to set up my dev environment to contribute to PySpark.

Questions I had were:
a) Should I use IntelliJ IDEA or PyCharm?
b) How do I test my changes?

Regards,
Krishna
Re: Contributing to pyspark
1. Yes, because the issues are in JIRA.
2. Nope (at least as far as MLlib is concerned), because most of it is just wrappers around the underlying Scala functions or methods, not implemented in pure Python.
3. I'm not sure about this. It seems to work fine for me!

HTH

On Fri, Jun 12, 2015 at 10:41 AM, Usman Ehtesham Gul <uehtesha...@gmail.com> wrote:
> Hello Manoj,
>
> First of all, thank you for the quick reply. Just a couple more things. I have started reading the link you provided; I will definitely filter JIRA with PySpark. Can you verify:
>
> 1) We fork from GitHub, right? I ask because on GitHub I see it is mirrored and there is no issues section. I am assuming that is because issues are handled in JIRA.
> 2) To contribute to PySpark, we will have to clone the whole project. But if our changes/contributions are only specific to PySpark, we can make those without relying on core Spark and other client libraries, right?
> 3) I think the email u...@spark.apache.org is broken. I am getting mail from mailer-dae...@apache.org that email could not be sent to this address. Can you check this?
>
> Thank you again. Hope to hear from you soon.
>
> Usman
>
> On Jun 12, 2015, at 12:57 AM, Manoj Kumar <manojkumarsivaraj...@gmail.com> wrote:
>> Hi,
>>
>> Thanks for your interest in PySpark. The first thing is to have a look at the how to contribute guide at https://cwiki.apache.org/confluence/display/SPARK/Contributing+to+Spark and filter the JIRAs using the label PySpark.
>>
>> If you have your own improvement in mind, you can file a JIRA, discuss it, and then send a Pull Request.
>>
>> HTH
>>
>> Regards
>>
>> On Fri, Jun 12, 2015 at 9:36 AM, Usman Ehtesham <uehtesha...@gmail.com> wrote:
>>> Hello,
>>>
>>> I am currently taking a course in Apache Spark via edX (https://www.edx.org/course/introduction-big-data-apache-spark-uc-berkeleyx-cs100-1x) and at the same time I am trying to look at the code for PySpark too. I wanted to ask: if I would like to contribute to PySpark specifically, how can I do that?
>>>
>>> I do not intend to contribute to core Apache Spark any time soon (mainly because I do not know Scala), but I am very comfortable in Python. Any tips on how to contribute specifically to PySpark without being affected by other parts of Spark would be greatly appreciated.
>>>
>>> P.S.: I ask this because there is a small change/improvement I would like to propose. Also, since I just started learning Spark, I would like to read and understand the PySpark code as I learn about Spark. :)
>>>
>>> Hope to hear from you soon.
>>>
>>> Usman Ehtesham Gul
>>> https://github.com/ueg1990

--
Godspeed,
Manoj Kumar
http://manojbits.wordpress.com
http://github.com/MechCoder
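Manoj's point in answer 2, that much of PySpark's MLlib is a thin Python layer delegating to Scala, can be pictured with a toy sketch. The class names below are made up for illustration; real PySpark delegates through the py4j bridge to JVM objects rather than to a plain Python stand-in:

```python
# Toy sketch of the "thin Python wrapper" pattern (hypothetical names,
# not actual PySpark internals).

class FakeJavaModel:
    """Stand-in for the py4j proxy of a Scala-side model."""
    def predict(self, features):
        # Pretend the JVM did the real numeric work here.
        return sum(features)

class PythonModelWrapper:
    """Python-side wrapper: holds a handle to the 'JVM' model and
    forwards calls to it, with no real logic of its own."""
    def __init__(self, java_model):
        self._java_model = java_model

    def predict(self, features):
        # In real PySpark this call crosses the py4j bridge into the JVM.
        return self._java_model.predict(list(features))

model = PythonModelWrapper(FakeJavaModel())
print(model.predict([1, 2, 3]))  # -> 6
```

This is why a pure-Python change to such a wrapper often still needs the Scala side built and available: the wrapper has nothing to delegate to otherwise.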
Re: Contributing to pyspark
Hi,

Thanks for your interest in PySpark. The first thing is to have a look at the how to contribute guide at https://cwiki.apache.org/confluence/display/SPARK/Contributing+to+Spark and filter the JIRAs using the label PySpark.

If you have your own improvement in mind, you can file a JIRA, discuss it, and then send a Pull Request.

HTH

Regards

On Fri, Jun 12, 2015 at 9:36 AM, Usman Ehtesham <uehtesha...@gmail.com> wrote:
> Hello,
>
> I am currently taking a course in Apache Spark via edX (https://www.edx.org/course/introduction-big-data-apache-spark-uc-berkeleyx-cs100-1x) and at the same time I am trying to look at the code for PySpark too. I wanted to ask: if I would like to contribute to PySpark specifically, how can I do that?
>
> I do not intend to contribute to core Apache Spark any time soon (mainly because I do not know Scala), but I am very comfortable in Python. Any tips on how to contribute specifically to PySpark without being affected by other parts of Spark would be greatly appreciated.
>
> P.S.: I ask this because there is a small change/improvement I would like to propose. Also, since I just started learning Spark, I would like to read and understand the PySpark code as I learn about Spark. :)
>
> Hope to hear from you soon.
>
> Usman Ehtesham Gul
> https://github.com/ueg1990

--
Godspeed,
Manoj Kumar
http://manojbits.wordpress.com
http://github.com/MechCoder
Contributing to pyspark
Hello,

I am currently taking a course in Apache Spark via edX (https://www.edx.org/course/introduction-big-data-apache-spark-uc-berkeleyx-cs100-1x) and at the same time I am trying to look at the code for PySpark too. I wanted to ask: if I would like to contribute to PySpark specifically, how can I do that?

I do not intend to contribute to core Apache Spark any time soon (mainly because I do not know Scala), but I am very comfortable in Python. Any tips on how to contribute specifically to PySpark without being affected by other parts of Spark would be greatly appreciated.

P.S.: I ask this because there is a small change/improvement I would like to propose. Also, since I just started learning Spark, I would like to read and understand the PySpark code as I learn about Spark. :)

Hope to hear from you soon.

Usman Ehtesham Gul
https://github.com/ueg1990