I guess I'll follow up:

My interest is in distributed systems and databases, and particularly where
those two intersect. I'm a Pig committer by night (and a little by day), and
tech lead of Twitter's data analysis infrastructure team by day (and a
little by night). My interest in Giraph is mostly motivated by the promise
of reusing our existing Hadoop infrastructure to perform a variety of
calculations much more efficiently. For now I'm mostly concerned with
getting things into a state where I'd be comfortable handing Giraph to
the data scientists on the team as a tool; that means integration with our
existing data sources, trimming the memory footprint, and finding and
eliminating a few edge cases that can prevent jobs from starting correctly.
Longer-term, I would like to work on shoring up fault-tolerance and
improving the RPC subsystem.


On Thu, Sep 15, 2011 at 10:26 PM, Jake Mannix <jake.man...@gmail.com> wrote:

> Thanks Avery,
>   Greetings all.  In the other Apache communities with which I'm familiar
> (Mahout and Lucene, in particular), it is customary for new committers to
> give a little background / bio / self-introduction, so I'll carry that over,
> in hopes that it is a fairly universal practice. :)
>   I'm originally a physics nerd, turned mathematician, turned software
> engineer mostly working on search (I built large parts of this
> <http://www.linkedin.com/search/fpsearch?type=people&keywords=jake+mannix>
> search engine, as well as
> this <http://twitter.com/#!/who_to_follow/search/jake%20mannix> one), and
> as such have spent a lot of time in the Apache Lucene
> <http://lucene.apache.org> community (both of the linked-to search
> engines are built on Lucene,
> naturally enough).  Over the past few years, I've been working more trying
> to apply my IR and math skills to machine learning, and as such have been
> working on Apache Mahout <http://mahout.apache.org>, where I'm a committer
> and PMC member, primarily working on distributed matrix computations and,
> more specifically, decompositions (e.g.
> SVD <http://en.wikipedia.org/wiki/Singular_value_decomposition>)
> and topic modeling (e.g.
> LDA <http://en.wikipedia.org/wiki/Latent_Dirichlet_Allocation>).
>   As you might imagine, social graphs play a pretty important role in much
> of the work I've been in, so finding efficient ways to do monstrously large
> graph computations is what brought Apache Giraph to my attention.  I hope to
> spend some of my time (both free and as part of my workday) helping make
> Giraph speedy and CPU+memory+network efficient, by whatever means I can
> think of, and to write up some fun graph applications to go in the
> "examples" area as well.
>   In fact, finding ways of doing stuff which is a bit *outside* the normal
> thought of a BSP graph calculation is one of my motivations for using /
> working with / helping Giraph: I'd love to see how hard it is (and how
> efficient the result is!) to compute truncated matrix SVDs in Giraph, or do
> a big topic-model learning of an LDA model, or any of the various other
> sophisticated machine learning algorithms of which I sadly know very little
> (like really anything to do with gradient boosted decision trees, or
> restricted Boltzmann machines, etc.).
>   Well, that was a bit long (I can be a bit chatty), but there you go.
>  Looking forward to working with the rest of the community here, and
> building some great stuff!  The codebase is pretty huge and impressive
> already, and I'm honored to help out in whatever way I can.
>   Hi!
>   -jake
> On Thu, Sep 15, 2011 at 9:05 PM, Avery Ching <ach...@apache.org> wrote:
>> As an early Apache Incubator project, we need to build and grow the Giraph
>> community of folks interested in large-scale graph processing. Both Jake and
>> Dmitriy have demonstrated exceptional passion and talent in working with
>> Giraph.  The Giraph PPMC has had only positives to say about them in the
>> voting process.  They have both graciously accepted the offered
>> responsibilities and we are pleased to announce that they are now Giraph
>> committers and PPMC members!
>> Thanks,
>> Avery

Dmitriy V Ryaboy
Twitter Analytics

