I think this is a great tool and would be a nice contribution. However, I am not sure about the licensing here. Although the library it uses appears to be AL2 licensed, I do not know whether the Transaction Processing Performance Council (TPC, tpc.org) imposes any restrictions. TPC-H is a benchmark published by the TPC, and their rights might be affected.
We should clarify whether we are allowed to include this code under AL2.

Cheers, Fabian

2015-02-09 16:03 GMT+01:00 Robert Metzger <rmetz...@apache.org>:

> Hi,
>
> we recently added the "flink-contrib" module for user contributed tools
> etc.
>
> On one of the last weekends, I've created a distributed tpch generator,
> based on this library: https://github.com/airlift/tpch (which is from a
> PrestoDB developer and available on Maven central).
>
> You can find my code here:
> https://github.com/rmetzger/scratch/tree/distributed-tpch-generator
>
> It contains two examples:
> a) a full TPC data generator (as a flink program):
> https://github.com/rmetzger/scratch/blob/distributed-tpch-generator/src/main/java/flink/generators/programs/TPCHGenerator.java
>
> b) an example which generates two TPC-H tables on-the-fly to join them:
> https://github.com/rmetzger/scratch/blob/distributed-tpch-generator/src/main/java/flink/generators/programs/TPCHGeneratorExample.java
>
> Before I spend time on integrating it into the "flink-contrib" package, I
> was wondering if the community is willing to accept this contribution to
> Flink.
>
> Best,
> Robert