Hi,

Nice to have data generators in Bigtop.

But please do not include them in bigtop-utils, since that package is mandatory.
Not everyone needs a data generator.

Olaf


> On 26.08.2015 at 11:25, Jay Vyas <[email protected]> wrote:
> 
> Publishing the jar to Bigtop's Maven repository is probably a good first
> step. Then apps can just include it as needed?
> 
> I'm not against packaging if someone wants packages for this. Maybe even
> include it in bigtop-utils?
> 
> Let's move to JIRA.
> 
>> On Aug 25, 2015, at 9:41 PM, Konstantin Boudnik <[email protected]> wrote:
>> 
>> It is pretty cool indeed!
>> 
>> I wonder how it needs to be structured to be:
>> - easy to access/use from other components wherever it is needed
>> - doesn't interfere with the rest of the stack
>> 
>> I guess one possible way would be to implement the generator as a set of
>> Maven artifacts that could be installed/consumed transparently by just
>> declaring a dependency, e.g. as proposed via a top-level component.
>> 
>> Another way is to have a new package like we do for bigtop-utils and such.
>> 
>> Perhaps this discussion should be moved to JIRA, or shall we continue on
>> dev@?
>> 
>> Cos
>> 
>>> On Sun, Aug 23, 2015 at 11:53AM, RJ Nowling wrote:
>>> Hi BigTop,
>>> 
>>> I had a discussion with Jay yesterday, and we'd like to propose a new component
>>> for BigTop: BigTop Data Generators.
>>> 
>>> BigTop Data Generators would consist of a common set of libraries for
>>> building data generators and three example data generators:
>>> 
>>>   * BigPetStore transaction generator (moved from BigPetStore)
>>>   * BigTop Bazaar -- attendee movement and interactions with booths on a
>>> showroom floor, at a conference, or at a mall
>>>   * BigTop Weatherman -- stochastic weather simulation (temperature, wind
>>> speed, wind chill, rainfall, etc.) per zip code.  (From a model trained on
>>> NOAA historical weather data)
>>> 
>>> We believe that creating a common set of libraries will have several
>>> benefits including:
>>> 
>>>    * Easier for others to build their own data generators
>>>    * Make data generators smaller and easier to maintain
>>>    * Share improvements across the data generators
>>> 
>>> More details on the libraries are below.
>>> 
>>> BigPetStore will continue to focus on building and maintaining
>>> blueprints, powered by the BigTop Data Generators.
>>> 
>>> Our vision is that we get all of Apache coming to BigTop for tools for
>>> building better, more comprehensive blueprints.  We want to support these
>>> efforts through data generators and the initial set of blueprints we've been
>>> building.
>>> 
>>> If the community is generally in support of this, I can create a top-level
>>> "bigtop-data-generators" directory and put the data generators and
>>> libraries in there.
>>> 
>>> Thanks!
>>> 
>>> RJ
>>> 
>>> 
>>> -------
>>> Library details:
>>> 
>>> So far, I've extracted the following common libraries:
>>> 
>>>    * Samplers -- provides classes for PDFs and various samplers
>>>    * Name generator -- data set and samplers for generating names
>>>    * Location data set -- data set and classes for US zip codes, their
>>> GPS coordinates, median household incomes, and population sizes
>>>    * Product generator -- library for enumerating products from a
>>> specification file.  Comes with default specifications for BigPetStore
>>> 
>>> I also expect that I'll add libraries for:
>>> 
>>>     * Particle simulation -- customer movement in a room
>>>     * Latent factor model generation -- generate latent factors and
>>> customer weights to create something like MovieLens data.  Used in Bazaar
>>> for booth preferences and potentially in BigPetStore for customer item
>>> preferences
>>> 
>>> Most of these libraries came out of the BigPetStore data generator but the
>>> other generators have been refactored to be based off the standard set of
>>> libraries.
