This is a great idea and this proposal looks good to me.

My only feedback would be:
Would "samples" be more obvious than "blueprints"?

On 09/23/2013 11:47 AM, Jay Vyas wrote:
bump^^ ... Any thoughts on where these blueprints should go and how to
organize them?  At that point I'll roll it into a JIRA.


On Thu, Sep 19, 2013 at 5:41 PM, Jay Vyas <[email protected]
<mailto:[email protected]>> wrote:

    Okay, that makes sense.  Now time for my all-too-often-asked bigtop
    question:

    Where would this project go?

    My proposal:

    My initial thoughts are a

    1) location: Simply a new submodule under the top level of bigtop,
    called blueprints/, with a single Java application under
    bigpetstore/ as the first submodule.

    2) extensibility: Then others could add their own submodules easily
    by just creating a new folder.

    3) deliverable: The artifact created by this submodule would simply
    be a jar file, with a shell script for executing the whole pipeline.

    4) bootstrap / input data: We could put CSV input data somewhere on
    a public S3 bucket, and keep small input CSV text files as a
    failsafe inside the repo so people can always run it from the git
    repo alone.
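The bootstrap idea in 4) could be sketched as part of the pipeline shell script: try the public S3 data first, and fall back to the in-repo sample CSVs. The bucket name and paths below are pure assumptions for illustration, not settled choices:

```shell
#!/bin/sh
# Hypothetical runner sketch for a blueprints/bigpetstore submodule.
S3_INPUT="s3://bigtop-blueprints/bigpetstore/input"   # hypothetical public bucket
LOCAL_INPUT="src/main/resources/input"                # failsafe CSVs inside the repo

# Prefer the public S3 data; fall back to the sample CSVs if it is
# unreachable (or if the hadoop CLI is not available at all).
if hadoop fs -test -d "$S3_INPUT" 2>/dev/null; then
  INPUT="$S3_INPUT"
else
  INPUT="$LOCAL_INPUT"
fi
echo "Using input: $INPUT"
```

This keeps the "always runnable from a bare git clone" property: the script never hard-fails just because the S3 data is missing.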

    On Thu, Sep 19, 2013 at 5:28 PM, Roman Shaposhnik <[email protected]
    <mailto:[email protected]>> wrote:

        On Thu, Sep 19, 2013 at 2:19 PM, Jay Vyas <[email protected]
        <mailto:[email protected]>> wrote:
         > Hey bigtop:
         >
         > Another idea, which i have been toying with for some time -
        is the idea of
         > implementing the old hibernate/ibatis app "jpetstore" for hadoop.

        I think providing examples would be very nice. I honestly think
        that perhaps the best place to start would be in Hue, though.
        Hue already comes with simple toy examples for things like
        Hive/Pig workflows, etc.

        Take a look at those.

         > I think bigtop might be a good template for this, but not
        sure if it should
         > go in bigtop itself : i.e.  put an entire bigdata workflow
        into bigtop as an
         > example/template for people to better comprehend how
        mapreduce ETL plays
         > with adhoc analytics (HIVE/PIG) , and how machine learning
        (mahout etc)
         > finally interact with end sinks (hbase). etc...

        Ah! That actually goes beyond examples and would also be quite
        appreciated.
        I'd call those 'Bigdata pipelines blueprints'. There I would
        encourage folks
        to approach it from the Oozie perspective. That's what most of
        the heavyweight Hadoop users seem to be doing -- they've got
        those complex pipelines with ingest coming from the Flume side
        of things, batch managed by Oozie, and analytics provided by
        Hive/Pig/Spark and, most recently, Solr.
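The ingest -> batch -> analytics shape described above maps naturally onto an Oozie workflow definition. A minimal sketch, assuming hypothetical action names, script names, and parameters (nothing here is a settled design):

```xml
<!-- Hypothetical Oozie workflow sketch; all names/paths are illustrative -->
<workflow-app xmlns="uri:oozie:workflow:0.4" name="bigpetstore-pipeline">
  <start to="etl"/>

  <!-- batch ETL step over the ingested data -->
  <action name="etl">
    <map-reduce>
      <job-tracker>${jobTracker}</job-tracker>
      <name-node>${nameNode}</name-node>
      <configuration>
        <property>
          <name>mapred.input.dir</name>
          <value>${inputDir}</value>
        </property>
        <property>
          <name>mapred.output.dir</name>
          <value>${etlOutputDir}</value>
        </property>
      </configuration>
    </map-reduce>
    <ok to="analytics"/>
    <error to="fail"/>
  </action>

  <!-- ad hoc analytics over the ETL output -->
  <action name="analytics">
    <hive xmlns="uri:oozie:hive-action:0.2">
      <job-tracker>${jobTracker}</job-tracker>
      <name-node>${nameNode}</name-node>
      <script>analytics.q</script>
    </hive>
    <ok to="end"/>
    <error to="fail"/>
  </action>

  <kill name="fail">
    <message>Pipeline failed</message>
  </kill>
  <end name="end"/>
</workflow-app>
```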

         > Not sure if this is in the scope of bigtop, but I think, for
         > people getting into the hadoop ecosystem and using bigtop as
         > a venue to do so, an example app of this sort might be
         > particularly useful.
         >
         > Apologies if this is out of scope of bigtop, but let me know!

        Personally I think Bigtop is a really good place for these types
        of blueprints
        to be developed and tested.

        Thanks,
        Roman.




    --
    Jay Vyas
    http://jayunit100.blogspot.com




--
Jay Vyas
http://jayunit100.blogspot.com
