Uwe,

Thanks for the pointer.

Some differences (based on a quick glance):

- License: GPL (datagenerator) versus Apache (log-synth).

- log-synth generates data using the JSON data model. For playing with
  Drill this is really handy.

- log-synth has file lookup samplers like datagenerator does, but it also
  has a wide variety of skewed sampling against these files.

- log-synth has realistic sampling for dates, time sequences, VINs, SSNs,
  ZIP codes, random walks, names, addresses, browser versions and
  languages. The first three also support lots of additional details. For
  instance, the VIN decodes country of origin and, for some manufacturers,
  also model number, engine size and more.

- log-synth has the ability to sample stateful sequences.

- log-synth can fill in templates for crazy sampling.

- log-synth can use random samplers as the input for other samplers or as
  the parameters of other samplers.

- log-synth is easily extensible and has a very simple Java API in case
  you want to use it from a program.

That said, I would love to work together on this sort of problem if we can
resolve the license issues.

On Tue, Aug 25, 2015 at 12:50 AM, Geercken, Uwe <[email protected]> wrote:

> Hello everybody,
>
> Here is another tool: to generate mass CSV data. It is a java based tool
> and named: datagenerator. Generate data based on word lists, regular
> expressions or random. Also generates columns for date/time that correspond
> to each other (that "make sense")
>
> https://github.com/uwegeercken/datagenerator
>
>
> hope this helps.
>
> Uwe
>
