Hi, nice to have data generators in Bigtop.
But please do not include it in bigtop-utils, since that package is mandatory. Not everyone needs a data generator.

Olaf

> On 26.08.2015 at 11:25, Jay Vyas <[email protected]> wrote:
>
> Publishing the jar to Bigtop's Maven is probably a good first step. Then apps
> can just include it as needed?
>
> I'm not against packaging if someone wants packages for this. Maybe even
> include it in bigtop util?
>
> Let's move to JIRA.
>
>> On Aug 25, 2015, at 9:41 PM, Konstantin Boudnik <[email protected]> wrote:
>>
>> It is pretty cool indeed!
>>
>> I wonder how it needs to be structured so that it:
>> - is easy to access/use from other components wherever it is needed
>> - doesn't interfere with the rest of the stack
>>
>> I guess one possible way would be to implement the generator as a set of
>> Maven artifacts that could be installed/consumed transparently by just
>> declaring a dependency, e.g. as proposed via a top-level component.
>>
>> Another way is to have a new package like we do for bigtop-utils and such.
>>
>> Perhaps this discussion should be moved to JIRA, or shall we continue on
>> dev@?
>>
>> Cos
>>
>>> On Sun, Aug 23, 2015 at 11:53AM, RJ Nowling wrote:
>>> Hi BigTop,
>>>
>>> I had a discussion with Jay yesterday, and we'd like to propose a new
>>> component for BigTop: BigTop Data Generators.
>>>
>>> BigTop Data Generators would consist of a common set of libraries for
>>> building data generators and three example data generators:
>>>
>>> * BigPetStore transaction generator (moved from BigPetStore)
>>> * BigTop Bazaar -- attendee movement and interactions with booths on a
>>>   showroom floor, at a conference, or at a mall
>>> * BigTop Weatherman -- stochastic weather simulation (temperature, wind
>>>   speed, wind chill, rainfall, etc.) per zip code. (From a model trained on
>>>   NOAA historical weather data)
>>>
>>> We believe that creating a common set of libraries will have several
>>> benefits, including:
>>>
>>> * Easier for others to build their own data generators
>>> * Make data generators smaller and easier to maintain
>>> * Share improvements across the data generators
>>>
>>> More details on the libraries are below.
>>>
>>> BigPetStore will continue to focus on building and maintaining
>>> blueprints, powered by the BigTop Data Generators.
>>>
>>> Our vision is that we get all of Apache coming to BigTop for tools for
>>> building better, more comprehensive blueprints. We want to support these
>>> efforts through data generators and the initial set of blueprints we've
>>> been building.
>>>
>>> If the community is generally in support of this, I can create a top-level
>>> "bigtop-data-generators" directory and put the data generators and
>>> libraries in there.
>>>
>>> Thanks!
>>>
>>> RJ
>>>
>>>
>>> -------
>>> Library details:
>>>
>>> So far, I've extracted the following common libraries:
>>>
>>> * Samplers -- provides classes for PDFs and various samplers
>>> * Name generator -- data set and samplers for generating names
>>> * Location data set -- data set and classes for US zip codes, their
>>>   GPS coordinates, median household incomes, and population sizes
>>> * Product generator -- library for enumerating products from a
>>>   specification file. Comes with default specifications for BigPetStore
>>>
>>> I also expect that I'll add libraries for:
>>>
>>> * Particle simulation -- customer movement in a room
>>> * Latent factor model generation -- generate latent factors and
>>>   customer weights to create something like MovieLens data. Used in Bazaar
>>>   for booth preferences and potentially in BigPetStore for customer item
>>>   preferences
>>>
>>> Most of these libraries came out of the BigPetStore data generator, but the
>>> other generators have been refactored to be based on the standard set of
>>> libraries.
