Okay, that makes sense. Now time for my all-too-often asked Bigtop question:
Where would this project go?

My initial proposal:

1) location: Simply a new submodule under top-level bigtop, called blueprints/, with a single Java application under bigpetstore/ as its first entry.

2) extensibility: Others could then add their own submodules easily, just by creating a new folder.

3) deliverable: The artifact created by this submodule would simply be a jar file, with a shell script for executing the whole pipeline.

4) bootstrap / input data: We could put the CSV input data somewhere in a public S3 bucket, and keep small CSV input files inside the repo as a failsafe, so people can always run it from the git repo alone.

On Thu, Sep 19, 2013 at 5:28 PM, Roman Shaposhnik <[email protected]> wrote:

> On Thu, Sep 19, 2013 at 2:19 PM, Jay Vyas <[email protected]> wrote:
> > Hey bigtop:
> >
> > Another idea, which i have been toying with for some time - is the idea of
> > implementing the old hibernate/ibatis app "jpetstore" for hadoop.
>
> I think providing examples would be very nice. I honestly think that
> perhaps the best place to start would be in Hue, though. Hue already
> comes with simple toy examples for things like Hive/Pig workflows, etc.
>
> Take a look at those.
>
> > I think bigtop might be a good template for this, but not sure if it should
> > go in bigtop itself : i.e. put an entire bigdata workflow into bigtop as an
> > example/template for people to better comprehend how mapreduce ETL plays
> > with adhoc analytics (HIVE/PIG) , and how machine learning (mahout etc)
> > finally interact with end sinks (hbase). etc...
>
> Ah! That actually goes beyond examples and would also be quite appreciated.
> I'd call those 'Bigdata pipelines blueprints'. There I would encourage folks
> to approach it from the Oozie perspective. That's what most of the
> heavyweight Hadoop users seem to be doing -- they've got those complex
> pipelines with ingest coming from the Flume side of things, batch managed
> by Oozie and analytics being provided by Hive/Pig/Spark and most recently
> Solr.
>
> > Not sure if this is in the scope of bigtop but i think, for people getting
> > into the hadoop ecosystem and using bigtop as a venue to do so, an example
> > app of this sort might be particularly useful.
> >
> > Apologies if this is off scope of bigtop but let me know!
>
> Personally I think Bigtop is a really good place for these types of blueprints
> to be developed and tested.
>
> Thanks,
> Roman.
>

--
Jay Vyas
http://jayunit100.blogspot.com
