+1 (non-binding :))

On Sunday, August 23, 2015, Suneel Marthi <[email protected]> wrote:

> +1 from me too; unlike Jay, I am unbiased. Great idea, RJ.
>
> On Sun, Aug 23, 2015 at 1:42 PM, jay vyas <[email protected]> wrote:
>
> > +1 from me; but of course I'm a little biased. Thanks for summing up
> > this idea, RJ!
> >
> > On Sun, Aug 23, 2015 at 12:53 PM, RJ Nowling <[email protected]> wrote:
> >
> > > Hi BigTop,
> > >
> > > I had a discussion with Jay yesterday, and we'd like to propose a
> > > new component for BigTop: BigTop Data Generators.
> > >
> > > BigTop Data Generators would consist of a common set of libraries
> > > for building data generators and three example data generators:
> > >
> > >  * BigPetStore transaction generator (moved from BigPetStore)
> > >  * BigTop Bazaar -- attendee movement and interactions with booths
> > >    on a showroom floor, at a conference, or at a mall
> > >  * BigTop Weatherman -- stochastic weather simulation (temperature,
> > >    wind speed, wind chill, rainfall, etc.) per zip code. (From a
> > >    model trained on NOAA historical weather data)
> > >
> > > We believe that creating a common set of libraries will have
> > > several benefits, including:
> > >
> > >  * Easier for others to build their own data generators
> > >  * Make data generators smaller and easier to maintain
> > >  * Share improvements across the data generators
> > >
> > > More details on the libraries are below.
> > >
> > > BigPetStore will continue to focus on building and maintaining
> > > blueprints, powered by the BigTop Data Generators.
> > >
> > > Our vision is that we get all of Apache coming to BigTop for tools
> > > for building better, more comprehensive blueprints. We want to
> > > support these efforts through data generators and the initial set
> > > of blueprints we've been building.
> > >
> > > If the community is generally in support of this, I can create a
> > > top-level "bigtop-data-generators" directory and put the data
> > > generators and libraries in there.
> > >
> > > Thanks!
> > >
> > > RJ
> > >
> > > -------
> > > Library details:
> > >
> > > So far, I've extracted the following common libraries:
> > >
> > >  * Samplers -- provides classes for PDFs and various samplers
> > >  * Name generator -- data set and samplers for generating names
> > >  * Location data set -- data set and classes for US zip codes,
> > >    their GPS coordinates, median household incomes, and population
> > >    sizes
> > >  * Product generator -- library for enumerating products from a
> > >    specification file. Comes with default specifications for
> > >    BigPetStore
> > >
> > > I also expect that I'll add libraries for:
> > >
> > >  * Particle simulation -- customer movement in a room
> > >  * Latent factor model generation -- generate latent factors and
> > >    customer weights to create something like MovieLens data. Used
> > >    in Bazaar for booth preferences and potentially in BigPetStore
> > >    for customer item preferences
> > >
> > > Most of these libraries came out of the BigPetStore data generator,
> > > but the other generators have been refactored to be based off the
> > > standard set of libraries.
> >
> > --
> > jay vyas
