Thanks all!

Jay and I have been talking about maven artifacts for a while.  We would
appreciate help or some docs on how best to do that.  :)

I set up a Gradle multi-project build within the data generators to make it
easy to build everything and its dependencies in one go.  But that would
horribly complicate things on a BigTop-wide scale.


On Tue, Aug 25, 2015 at 8:41 PM, Konstantin Boudnik <[email protected]> wrote:

> It is pretty cool indeed!
>
> I wonder how it needs to be structured to be:
>  - easy to access/use from other components wherever it is needed
>  - doesn't interfere with the rest of the stack
>
> I guess one possible way would be to implement the generator as a set of
> Maven artifacts that could be installed/consumed transparently by just
> declaring a dependency, e.g. as proposed via a top-level component.
>
> Another way is to have a new package like we do for bigtop-utils and such.
>
> Perhaps this discussion should be moved to JIRA, or shall we continue on
> dev@?
>
> Cos
>
> On Sun, Aug 23, 2015 at 11:53AM, RJ Nowling wrote:
> > Hi BigTop,
> >
> > I had a discussion with Jay yesterday, and we'd like to propose a new
> > component for BigTop: BigTop Data Generators.
> >
> > BigTop Data Generators would consist of a common set of libraries for
> > building data generators and three example data generators:
> >
> >     * BigPetStore transaction generator (moved from BigPetStore)
> >     * BigTop Bazaar -- attendee movement and interactions with booths on
> > a showroom floor, at a conference, or at a mall
> >     * BigTop Weatherman -- stochastic weather simulation (temperature,
> > wind speed, wind chill, rainfall, etc.) per zip code.  (From a model
> > trained on NOAA historical weather data)
> >
> > We believe that creating a common set of libraries will have several
> > benefits including:
> >
> >      * Easier for others to build their own data generators
> >      * Make data generators smaller and easier to maintain
> >      * Share improvements across the data generators
> >
> > More details on the libraries are below.
> >
> > BigPetStore will continue to focus on building and maintaining
> > blueprints, powered by the BigTop Data Generators.
> >
> > Our vision is that we get all of Apache coming to BigTop for tools for
> > building better, more comprehensive blueprints.  We want to support these
> > efforts through data generators and the initial set of blueprints we've
> > been building.
> >
> > If the community is generally in support of this, I can create a
> > top-level "bigtop-data-generators" directory and put the data generators
> > and libraries in there.
> >
> > Thanks!
> >
> > RJ
> >
> >
> > -------
> > Library details:
> >
> > So far, I've extracted the following common libraries:
> >
> >      * Samplers -- provides classes for PDFs and various samplers
> >      * Name generator -- data set and samplers for generating names
> >      * Location data set -- data set and classes for US zip codes, their
> > GPS coordinates, median household incomes, and population sizes
> >      * Product generator -- library for enumerating products from a
> > specification file.  Comes with default specifications for BigPetStore
> >
> > I also expect that I'll add libraries for:
> >
> >       * Particle simulation -- customer movement in a room
> >       * Latent factor model generation -- generate latent factors and
> > customer weights to create something like MovieLens data.  Used in Bazaar
> > for booth preferences and potentially in BigPetStore for customer item
> > preferences
> >
> > Most of these libraries came out of the BigPetStore data generator, but
> > the other generators have been refactored to be based off the standard
> > set of libraries.
>
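
To make the "Samplers" library from the quoted proposal a bit more concrete,
here is a rough sketch in Java of a sampler that draws values from a discrete
PDF.  The names (Sampler, DiscreteSampler, SamplerDemo) and signatures are
made up for illustration and are not the actual library API; treat it as one
possible shape under those assumptions, not the real implementation.

```java
import java.util.LinkedHashMap;
import java.util.Map;
import java.util.Random;

// Hypothetical interface: anything that can draw a random value of type T.
interface Sampler<T> {
    T sample();
}

// Hypothetical sampler over a discrete distribution given as value -> weight
// pairs. Weights must be non-negative but do not need to sum to 1.
class DiscreteSampler<T> implements Sampler<T> {
    private final Map<T, Double> weights;
    private final double total;
    private final Random random;

    DiscreteSampler(Map<T, Double> weights, Random random) {
        double sum = 0.0;
        for (double w : weights.values()) {
            if (w < 0.0) {
                throw new IllegalArgumentException("negative weight: " + w);
            }
            sum += w;
        }
        if (sum <= 0.0) {
            throw new IllegalArgumentException("weights must have a positive sum");
        }
        // Copy into a LinkedHashMap so iteration order is stable across calls.
        this.weights = new LinkedHashMap<>(weights);
        this.total = sum;
        this.random = random;
    }

    @Override
    public T sample() {
        // Pick a point in [0, total) and walk the cumulative weights.
        double r = random.nextDouble() * total;
        T lastPositive = null;
        for (Map.Entry<T, Double> e : weights.entrySet()) {
            if (e.getValue() > 0.0) {
                lastPositive = e.getKey();
            }
            r -= e.getValue();
            if (r < 0.0) {
                return e.getKey();
            }
        }
        // Only reached through floating-point round-off.
        return lastPositive;
    }
}

// Small usage demo; the pet categories and weights are invented.
class SamplerDemo {
    public static void main(String[] args) {
        Map<String, Double> pdf = new LinkedHashMap<>();
        pdf.put("dog", 0.6);
        pdf.put("cat", 0.3);
        pdf.put("fish", 0.1);

        // A fixed seed keeps generated data reproducible between runs.
        Sampler<String> sampler = new DiscreteSampler<>(pdf, new Random(42L));
        for (int i = 0; i < 5; i++) {
            System.out.println(sampler.sample());
        }
    }
}
```

Passing the Random in from outside is deliberate in this sketch: a generator
built from several samplers can share one seeded source and stay reproducible.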
