Hi folks. I've been hacking around on the big pet store idea. So far I've only got the template for the synthetic data set generator:
https://raw.github.com/jayunit100/hadoop-example-jobs/master/src/main/java/org/bigtop/bigpetstore/PetStoreTransactionGeneratorJob.java

1) This is the "first" phase implementation of a MapReduce job that will generate a synthetic data set of transactions in a pet store. It is meant to be configurable, so people can use it to generate as many transactions as they want. I will also add more "products" to it.

2) The next step will be to flesh out the transaction data and then write up aggregations in Hive, Pig, and MapReduce. That will serve as the ETL blueprint.

3) Then the interesting part will come: feeding those ETL'd statistics into an available data store that is Bigtop-supported, i.e. Solr indices and HBase key-values.

At that point the sample application will be ready, and the first iteration of bigtop.blueprints will be ready to share.

If anyone has initial thoughts or wants to jump in, let me know. :)

Jay Vyas
http://jayunit100.blogspot.com
