That is fine, but you have to design the job so that the generated rows are
written in large blocks for optimal performance.
The trickiest part of data generation is the conceptual part, such as choosing
the probability distributions.
You should also check that you use a good random number generator; for some
cases the one built into Java may not be good enough.
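
As a rough illustration of these three points, here is a minimal plain-Java
sketch (no Spark dependency) of what the body of one generator task could look
like. Everything named here is a hypothetical choice for the example, not
something from your setup: the class RowGenerator, the 1 Mi-character block
size, the CSV row layout, and the exponential distribution. It assumes
java.util.SplittableRandom is an acceptable generator for your case and gives
each partition its own stream by splitting from a common seed. In Spark, logic
like this would typically run once per partition, for example inside
mapPartitionsWithIndex.

```java
import java.io.BufferedWriter;
import java.io.IOException;
import java.io.StringWriter;
import java.io.UncheckedIOException;
import java.io.Writer;
import java.util.SplittableRandom;

// Hypothetical helper: generates the rows of one partition of a simulation.
final class RowGenerator {

    // Assumed block size: rows are buffered and handed to the sink in chunks
    // of up to this many characters instead of one small write per row.
    static final int BLOCK_CHARS = 1 << 20;

    // Returns a generator for the given partition. Each partition takes a
    // different split of the same root, so runs are reproducible from
    // baseSeed and partitions do not share a stream.
    static SplittableRandom forPartition(long baseSeed, int partition) {
        SplittableRandom root = new SplittableRandom(baseSeed);
        SplittableRandom rng = root.split();
        for (int i = 0; i < partition; i++) {
            rng = root.split();
        }
        return rng;
    }

    // Inverse transform sampling for an exponential distribution.
    // nextDouble() is in [0, 1), so 1 - u is in (0, 1] and the log is finite.
    static double nextExponential(SplittableRandom rng, double rate) {
        return -Math.log(1.0 - rng.nextDouble()) / rate;
    }

    // Writes `rows` CSV lines of the form "partition,rowIndex,value" to sink.
    static void writePartition(long baseSeed, int partition, long rows,
                               double rate, Writer sink) {
        SplittableRandom rng = forPartition(baseSeed, partition);
        try {
            BufferedWriter out = new BufferedWriter(sink, BLOCK_CHARS);
            for (long i = 0; i < rows; i++) {
                out.write(partition + "," + i + "," + nextExponential(rng, rate));
                out.write('\n');
            }
            out.flush(); // push the last, partially filled block to the sink
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    public static void main(String[] args) {
        StringWriter sink = new StringWriter();
        writePartition(42L, 0, 5, 2.0, sink);
        System.out.print(sink);
    }
}
```

The same seed and partition index always produce the same rows, which makes a
failed task safe to re-run. Whether this generator and this distribution suit
your simulation is exactly the conceptual part you have to decide yourself.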

> On 20. Jun 2017, at 16:04, Esa Heikkinen <esa.heikki...@student.tut.fi> wrote:
> 
> Hi
> 
> 
> Spark is a data analyzer, but would it be possible to use Spark as a data 
> generator or simulator ?
> 
> My simulation can be very large, and I think a parallelized simulation using
> Spark (cloud) could work.
> 
> Is that a good or bad idea?
> 
> Regards
> Esa Heikkinen
> 
