Hi,

The Dachis Group data analytics team are big users of Pig and, so far, only light users of Mahout. That's changing soon, so we'd like to contribute some of our work and know-how back to the Mahout and Pig communities.
In the near term, we're planning a Pig-Mahout hackday (code-named "Pigout") at our HQ in Austin, TX on Friday, May 11 to coincide with Twitter's Pig hackday (http://www.meetup.com/PigUser/events/62108962/). We're actually hoping to get some sort of remote connection to the Twitter group ;-)

We're really keen on Ted's pig-vector project (https://github.com/tdunning/pig-vector), as we're building a number of classifiers on Mahout's SGD framework, with the bulk of our data stored in Cassandra and processed almost entirely with Pig. We'd love to hear about any planned features for the pig-vector project that we can help out on. Are there any similar Pig-Mahout projects we should know about?

In general, we're reaching out today to see who else in the community is interested in better Pig / Mahout integration and what types of challenges they're facing. Any cool UDFs you'd like to share?

Lastly, let us know if you'll be in the Austin area on Friday, May 11, 2012, as we'd love to join forces with other local Mahout / Pig users ... might even bring in some BBQ and really pig-out!

Cheers,
Timothy Potter
Big Data Architect, Dachis Group
www.socialbusinessindex.com
