I wrote up a Pig-centric tutorial on how to get started collecting data with Avro, processing it with Pig, and publishing it with Voldemort/Sinatra: http://datasyndrome.com/post/13707537045/booting-the-analytics-application-events-ruby
I hope it is helpful to some. There is actually no Hadoop in the intro, as it wasn't necessary to get started using the tools to build things, albeit on smaller data ;) Still, it's nice to pick a set of tools and processes that will scale to hell and back, and then not worry about switching platforms later.

-- 
Russell Jurney
twitter.com/rjurney
[email protected]
datasyndrome.com
