bump^^ ... Any thoughts on where these blueprints should go and how to organize them? At that point I'll roll it into a JIRA.
On Thu, Sep 19, 2013 at 5:41 PM, Jay Vyas <[email protected]> wrote:
> Okay, that makes sense. Now time for my all-too-often asked bigtop
> question:
>
> Where would this project go?
>
> My proposal:
>
> My initial thoughts are:
>
> 1) location: Simply a new submodule, under top-level bigtop, called
> blueprints/, with a single Java application under bigpetstore/ as the
> submodule.
>
> 2) extensibility: Others could then add their own submodules easily by
> just creating a new folder.
>
> 3) deliverable: The artifact created by this submodule would simply be a
> jar file, with a shell script for executing the whole pipeline.
>
> 4) bootstrap / input data: We could put CSV-delimited input data somewhere
> on a public S3 instance, and keep small input CSV text files as a failsafe
> inside the repo so people can always run it from just the git repo alone.
>
> On Thu, Sep 19, 2013 at 5:28 PM, Roman Shaposhnik <[email protected]> wrote:
>
>> On Thu, Sep 19, 2013 at 2:19 PM, Jay Vyas <[email protected]> wrote:
>> > Hey bigtop:
>> >
>> > Another idea, which I have been toying with for some time, is the idea of
>> > implementing the old hibernate/ibatis app "jpetstore" for hadoop.
>>
>> I think providing examples would be very nice. I honestly think that
>> perhaps the best place to start would be in Hue, though. Hue already
>> comes with simple toy examples for things like Hive/Pig workflows, etc.
>>
>> Take a look at those.
>>
>> > I think bigtop might be a good template for this, but not sure if it should
>> > go in bigtop itself: i.e. put an entire bigdata workflow into bigtop as an
>> > example/template for people to better comprehend how mapreduce ETL plays
>> > with ad hoc analytics (HIVE/PIG), and how machine learning (mahout etc.)
>> > finally interacts with end sinks (hbase), etc...
>>
>> Ah! That actually goes beyond examples and would also be quite
>> appreciated.
>> I'd call those 'Bigdata pipelines blueprints'. There I would encourage folks
>> to approach it from the Oozie perspective. That's what most of the
>> heavyweight Hadoop users seem to be doing: they've got those complex
>> pipelines with ingest coming from the Flume side of things, batch managed
>> by Oozie, and analytics being provided by Hive/Pig/Spark and most recently
>> Solr.
>>
>> > Not sure if this is in the scope of bigtop, but I think, for people getting
>> > into the hadoop ecosystem and using bigtop as a venue to do so, an example
>> > app of this sort might be particularly useful.
>> >
>> > Apologies if this is off scope for bigtop, but let me know!
>>
>> Personally I think Bigtop is a really good place for these types of
>> blueprints to be developed and tested.
>>
>> Thanks,
>> Roman.
>>
>
> --
> Jay Vyas
> http://jayunit100.blogspot.com

--
Jay Vyas
http://jayunit100.blogspot.com
