Awesome! Thanks again for all the guidance, Matei. I look forward to helping out.
Thanks again!

-- Joyce

On Tue, Jul 30, 2013 at 8:56 PM, Matei Zaharia <[email protected]> wrote:

> Cool! The way I'd start is perhaps by adding a new Python example job. For
> example, a good one to implement would be PageRank -- you can look at these
> slides for a Scala version of it:
> http://ampcamp.berkeley.edu/wp-content/uploads/2012/06/matei-zaharia-part-2-amp-camp-2012-standalone-programs.pdf.
> Another possibility is linear regression. But feel free to also come up
> with your own.
>
> There are also a number of Python issues open relating to adding some
> missing API features, but these require a more thorough understanding of
> how PySpark works and possibly some hacking around in pickled data:
> https://spark-project.atlassian.net/browse/SPARK-791?jql=component%20%3D%20PySpark%20AND%20status%20%3D%20Open.
> The easiest one to start with is probably SPARK-838.
>
> Matei
>
> On Jul 30, 2013, at 6:44 AM, Michael Joyce <[email protected]> wrote:
>
> > Hey Matei,
> >
> > I would love to help on the Python API. I'll start taking a look at that.
> > Unfortunately I don't have access to a Windows computer, so I can't be of
> > much use there. I would also be more than happy to work on the JVM stuff
> > as well. If you have a list of stuff to do there (or it wouldn't take too
> > long to compile one), I would gladly take a look.
> >
> > Thanks for all the help!
> >
> > -- Joyce
> >
> > On Mon, Jul 29, 2013 at 4:17 PM, Matei Zaharia <[email protected]> wrote:
> >
> >> Hey Michael,
> >>
> >> Depending on your background, there are quite a few things to do.
> >>
> >> One general area that we might use more help for, if you have experience
> >> there, is the Python API. Part of it can be just to add more examples in
> >> Python, e.g., to show how one can use NumPy or SciPy with it. Another
> >> thing that would be super useful if you also have access to Windows is
> >> this: https://spark-project.atlassian.net/browse/SPARK-649. We want to
> >> make Spark very broadly accessible for science work and it sounds like
> >> your background at JPL is good for that.
> >>
> >> Alternatively, if you prefer to work on the Java VM, there are a bunch of
> >> internal things to do there too -- I can give an overview of what I'd
> >> consider easy to jump into there.
> >>
> >> Matei
> >>
> >> On Jul 29, 2013, at 1:03 PM, Michael Joyce <[email protected]> wrote:
> >>
> >>> Hey Matei,
> >>>
> >>> Truth be told I haven't had much of a chance to look through JIRA and
> >>> the code base to pick a specific part to work on. Is there anything in
> >>> particular that needs some work? I'm more than happy to throw some
> >>> effort at a specific problem if something needs attention. Otherwise I
> >>> can just poke around and try to find a nice niche in which to work so I
> >>> can help out.
> >>>
> >>> Thanks much!
> >>>
> >>> -- Joyce
> >>>
> >>> On Mon, Jul 29, 2013 at 10:55 AM, Matei Zaharia <[email protected]> wrote:
> >>>
> >>>> Hey Michael,
> >>>>
> >>>> Glad to hear you're interested in helping. Are there specific things
> >>>> you'd like to work on? Certainly we will need help with various Apache
> >>>> packaging, etc., so it's good to have more people with experience at
> >>>> Apache.
> >>>>
> >>>> Matei
> >>>>
> >>>> On Jul 29, 2013, at 8:36 AM, Michael Joyce <[email protected]> wrote:
> >>>>
> >>>>> Hi all!
> >>>>>
> >>>>> My name is Michael Joyce. I work at JPL and have heard some great
> >>>>> things about Spark from Chris Mattmann. I figured I would stop by,
> >>>>> say hello, and hopefully throw some helpful contributions at the
> >>>>> project.
> >>>>>
> >>>>> Look forward to helping out!
> >>>>>
> >>>>> -- Joyce
