Thanks for this. A few questions:
Is this Ruby code?
require 'rubygems'
require 'avro'

SCHEMA = <<-JSON
{ "type": "record", "name": "Email", "fields" : [ {"name": "message_id",
"type": "int"}, {"name": "topic", "type": "string"}, {"name": "user_id",
"type": "int"} ]}
JSON
And what happened to user_id? (Maybe a typo?)
grunt> DESCRIBE messages
avros: {message_id: int,topic: chararray}
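To double-check that I'm reading the schema right, here is a rough sketch that parses the schema text with Ruby's standard json library (not the avro gem) and compares its field names against the DESCRIBE output. The DESCRIBED_FIELDS list is typed in by hand from the grunt output above, and the constant names are my own, so treat this as an illustration only:

```ruby
require 'json'

# Schema text copied from the snippet quoted above.
SCHEMA = <<-JSON
{ "type": "record", "name": "Email", "fields" : [ {"name": "message_id",
"type": "int"}, {"name": "topic", "type": "string"}, {"name": "user_id",
"type": "int"} ]}
JSON

# Field names declared in the Avro schema.
SCHEMA_FIELDS = JSON.parse(SCHEMA)['fields'].map { |field| field['name'] }

# Field names shown by DESCRIBE (hand-copied, an assumption of this sketch).
DESCRIBED_FIELDS = %w[message_id topic]

# Fields the schema declares but DESCRIBE does not show.
MISSING_FIELDS = SCHEMA_FIELDS - DESCRIBED_FIELDS

puts "declared in schema:    #{SCHEMA_FIELDS.join(', ')}"
puts "missing from DESCRIBE: #{MISSING_FIELDS.join(', ')}"
```

The schema declares three fields, so I would have expected DESCRIBE to list user_id as well.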
-Ayon
________________________________
From: Russell Jurney <[email protected]>
To: [email protected]; [email protected]
Sent: Wednesday, December 7, 2011 2:40 PM
Subject: Booting the analytics application: events -> ruby -> avro -> pig ->
voldemort -> sinatra -> web browser -> user
I wrote up a pig-centric tutorial about how to get started collecting data
with Avro, processing it with Pig and publishing it with Voldemort/Sinatra:
http://datasyndrome.com/post/13707537045/booting-the-analytics-application-events-ruby
I hope it may be helpful for some. There is actually no Hadoop in the
intro, as it wasn't necessary to get started using the tools to build
things - albeit on smaller data ;) Still, it's nice to pick a set of tools
and processes that will scale to hell and back, and then not worry about
switching platforms later.
--
Russell Jurney
twitter.com/rjurney
[email protected]
datasyndrome.com