Definitely will be awesome if Andrew can help us craft an idiomatic and meaningful way to stress HBase at scale w/ IoT data.
On Fri, Mar 27, 2015 at 2:48 PM, RJ Nowling <[email protected]> wrote:

> Jay and Andrew, thanks for the feedback! I'd be happy to discuss ways to
> connect BigTop Bazaar to HBase.
>
> It would be great to work with the BigBench project to see if our data
> generators would be of interest.
>
> On Fri, Mar 27, 2015 at 1:17 PM, Andrew Purtell <[email protected]>
> wrote:
>
> > I agree the proposal sounds very interesting.
> >
> > I can also help with the HBase side of things.
> >
> > On the general subject of data generators, you may want to reach out
> > to the people behind the "BigBench" project
> > (https://github.com/intel-hadoop/Big-Bench). These are ex-colleagues
> > of mine from Intel. When I was there they were interested in
> > contributing to Apache, but had significant problems in that the data
> > generator itself was licensed under non-free terms incompatible with
> > the ASL. I think they wanted to move past that but weren't sure
> > exactly how (including having the bandwidth to do so). I see
> > occasional updates to the repo so they are still active in some way.
> >
> > On Fri, Mar 27, 2015 at 6:42 AM, jay vyas <[email protected]>
> > wrote:
> >
> > > Thanks for proposing, RJ.
> > >
> > > I'm in favor, so long as it comes w/ a BigTop-supported use case,
> > > and indeed BigTop Bazaar is a lovely use case for HBase!
> > >
> > > I'm happy to help you with the HBase side of things; maybe Andrew
> > > can collaborate on a reference architecture with us for scale
> > > testing of HBase via BigTop Bazaar's realtime IoT style of data
> > > generation.
> > >
> > > That will be a great blueprint complement to the MapReduce and
> > > Spark blueprints which we already have.
> > >
> > > On Thu, Mar 26, 2015 at 4:22 PM, RJ Nowling <[email protected]>
> > > wrote:
> > >
> > > > Hi all,
> > > >
> > > > Most of you are aware of my work with Jay on BigPetStore,
> > > > particularly the data generator and Spark pipelines. Data
> > > > generators are a great way to load test systems, as Jay has
> > > > recently done for Kubernetes using the BPS data generator.
> > > >
> > > > We think they're generally useful to the big data community.
> > > > Would BigTop be interested in hosting these data generator / load
> > > > testing tools as released artifacts in their own right?
> > > >
> > > > For example, we'd like to set up a web page on the BigTop site
> > > > with links to:
> > > >
> > > > * BPS Data Generator
> > > > * BPS Spark
> > > > * BPS Transaction Queue for using the data generator to test
> > > >   streaming services
> > > >
> > > > and we'd like to release these as source tarballs, uber JARs,
> > > > Maven-hosted JARs, and Docker containers (as appropriate).
> > > >
> > > > Would this be okay or should everything be released as part of
> > > > BigTop itself?
> > > >
> > > > Secondly, I've been working on a model for simulating customer
> > > > movements at a conference. It's designed for development and
> > > > testing for a real-time streaming analytics application where we
> > > > didn't have access to data ahead of time. You can read about it
> > > > here:
> > > >
> > > > http://rnowling.github.io/math/2015/03/24/bigtop-bazaar-model.html
> > > >
> > > > I'd like to call it "BigTop Bazaar" and release it through
> > > > BigTop. Is the BigTop community interested in having multiple
> > > > data generators?
> > > >
> > > > Thanks,
> > > > RJ
> > >
> > > --
> > > jay vyas
> >
> > --
> > Best regards,
> >
> >    - Andy
> >
> > Problems worthy of attack prove their worth by hitting back. - Piet Hein
> > (via Tom White)

--
jay vyas
