On Jul 10, 2012, at 12:58 PM, Daniel Dai wrote:

> Who's the author for Wonderdog?

Jacob Perkins wrote Wonderdog while he was at Infochimps. I'll CC him on the thread because I don't know that he subscribes to the dev list. Jacob - did you want to chime in here?

> Can Russell or the author talk about
> it in our next hackathon? Also we need to discuss with the author about
> it.
>
> On Tue, Jul 10, 2012 at 9:23 AM, Alan Gates <ga...@hortonworks.com> wrote:
>> From https://issues.apache.org/jira/browse/PIG-2803 posted yesterday by
>> Russell. I'm copying it here because I think we need to discuss this and
>> decide what we want to do:
>>
>> I propose to add Wonderdog to Pig contrib/
>> Wonderdog is an Apache 2.0 licensed project that adds Hadoop and Pig
>> integration for ElasticSearch. This lets you index any Pig relation with a
>> single UDF call, which is very powerful. Both writing searchable indexes and
>> loading based on search queries are supported.
>> More information on Wonderdog is available at
>> https://github.com/infochimps-labs/wonderdog and a great introduction to
>> ElasticSearch is available at
>> http://www.elasticsearchtutorial.com/elasticsearch-in-5-minutes.html
>> Wonderdog broke in Pig 0.10.0, and was patched to work here:
>> https://github.com/infochimps-labs/wonderdog/pull/9. Even still, there is the
>> issue of Pig creating schema files when storing and loading JSON that must
>> be manually removed to make Wonderdog go.
>> Moving forward, I would like the Pig project to maintain Wonderdog in
>> contrib/ and verify that it works with each version increment. Wonderdog is
>> an incredibly useful library that is license compatible with Pig itself.
>> Along with ElasticSearch, it adds the ability for any user to index his Pig
>> relations and to load subsets of data by pushing search queries down to
>> ElasticSearch.
>> I use Wonderdog in production and in my book, so I volunteer to do the
>> maintenance on contrib/wonderdog.