I agree the proposal sounds very interesting.

I can also help with the HBase side of things.

On the general subject of data generators, you may want to reach out to the
people behind the "BigBench" project (
https://github.com/intel-hadoop/Big-Bench). These are ex-colleagues of mine
from Intel. When I was there they were interested in contributing to
Apache, but had significant problems in that the data generator itself was
licensed under non-free terms incompatible with the ASL. I think they
wanted to move past that but weren't sure exactly how, or whether they had
the bandwidth to do so. I see occasional updates to the repo, so they are
still active in some way.



On Fri, Mar 27, 2015 at 6:42 AM, jay vyas <[email protected]>
wrote:

> Thanks for proposing rj.
>
> I'm in favor, so long as it comes with a BigTop-supported use case, and
> indeed BigTop Bazaar is a lovely use case for HBase!
>
> I'm happy to help you with the HBase side of things. Maybe Andrew can
> collaborate on a reference architecture with us for scale testing of HBase
> via BigTop Bazaar's real-time, IoT style of data generation.
>
> That will be a great complement to the MapReduce and Spark blueprints
> which we already have.
>
>
>
> On Thu, Mar 26, 2015 at 4:22 PM, RJ Nowling <[email protected]> wrote:
>
> > Hi all,
> >
> > Most of you are aware of my work with Jay on BigPetStore, particularly the
> > data generator and Spark pipelines.  Data generators are a great way to
> > load test systems, as Jay has recently done for kubernetes using the BPS
> > data generator.
> >
> > We think they're generally useful to the big data community. Would BigTop
> > be interested in hosting these data generator / load testing tools as
> > released artifacts in their own right?
> >
> > For example, we'd like to set up a web page on the BigTop site with links
> > to:
> >
> > * BPS Data Generator
> > * BPS Spark
> > * BPS Transaction Queue for using the data generator to test streaming
> > services
> >
> > and we'd like to release these as source tarballs, uber JARs, Maven-hosted
> > JARs, and Docker containers (as appropriate).
> >
> > Would this be okay or should everything be released as part of BigTop
> > itself?
> >
> > Secondly, I've been working on a model for simulating customer movements at
> > a conference.  It's designed for development and testing for a real-time
> > streaming analytics application where we didn't have access to data ahead
> > of time.  You can read about it here:
> >
> > http://rnowling.github.io/math/2015/03/24/bigtop-bazaar-model.html
> >
> > I'd like to call it "BigTop Bazaar" and release it through BigTop.  Is the
> > BigTop community interested in having multiple data generators?
> >
> > Thanks,
> > RJ
> >
>
>
>
> --
> jay vyas
>



-- 
Best regards,

   - Andy

Problems worthy of attack prove their worth by hitting back. - Piet Hein
(via Tom White)
