On Jul 10, 2012, at 12:58 PM, Daniel Dai wrote:

> Who's the author for Wonderdog?
Jacob Perkins wrote Wonderdog while he was at Infochimps.  I'll CC him on the 
thread because I don't know that he subscribes to the dev list.  Jacob - did 
you want to chime in here?

> Can Russell or the author talk about
> it in our next hackthon? Also we need to discuss with the author about
> it.
> 
> On Tue, Jul 10, 2012 at 9:23 AM, Alan Gates <ga...@hortonworks.com> wrote:
>> From https://issues.apache.org/jira/browse/PIG-2803 posted yesterday by 
>> Russell.  I'm copying it here because I think we need to discuss this and 
>> decide what we want to do:
>> 
>> I propose to add Wonderdog to Pig contrib/
>> Wonderdog is an Apache 2.0 licensed project that adds Hadoop and Pig 
>> integration for ElasticSearch. This lets you index any Pig relation with a 
>> single UDF call, which is very powerful. Both writing searchable indexes and 
>> loading based on search queries is supported.
>> More information on Wonderdog is available at 
>> https://github.com/infochimps-labs/wonderdog and a great introduction to 
>> ElasticSearch is available at 
>> http://www.elasticsearchtutorial.com/elasticsearch-in-5-minutes.html
>> Wonderdog broke in Pig 0.10.0, and was patched to work here: 
>> https://github.com/infochimps-labs/wonderdog/pull/9 Even still, there is the 
>> issue of Pig creating schema files when storing and loading JSON that must 
>> be manually removed to make Wonderdog go.
>> Moving forward, I would like the Pig project to maintain Wonderdog in 
>> contrib/ and verify that it works with each version increment. Wonderdog is 
>> an incredibly useful library that is license compatible with Pig itself. 
>> Along with ElasticSearch, it adds the ability for any user to index his Pig 
>> relations and to load subsets of data by pushing search queries down to 
>> ElasticSearch.
>> I use Wonderdog in production and in my book, so I volunteer to do the 
>> maintenance on contrib/wonderdog.

Reply via email to