Re: Using Spark as a simulator

2017-07-07 Thread Steve Loughran
On 7 Jul 2017, at 08:37, Esa Heikkinen <esa.heikki...@student.tut.fi> wrote: I only want to simulate a very large "network" with up to millions of parallel, time-synchronized actors (state machines). There is also communication between actors via some (key-value pair) database. I also want th
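To make the quoted idea concrete, here is a minimal sketch of time-synchronized state machines that communicate only through a key-value mailbox. It is plain Python rather than Spark, and everything in it (the `transition` and `tick` names, the two-state "idle"/"active" machine, the ring topology, the "wake" message) is a hypothetical illustration, not something from the thread. The point it shows is that each tick reads only the previous tick's mailbox and writes only the next one, so the order in which actors are processed within a tick does not matter, which is what would let the per-actor work be spread across partitions.

```python
from collections import defaultdict


def transition(state, messages):
    """Toy state machine (assumption): an idle actor that has received
    a message becomes active; an active actor falls back to idle."""
    if state == "idle" and messages:
        return "active"
    if state == "active":
        return "idle"
    return state


def tick(states, mailbox):
    """Advance every actor by one synchronized time step.

    states:  dict mapping actor_id -> current state
    mailbox: dict mapping actor_id -> list of messages sent to it
             during the previous tick (the key-value "database")

    Returns (new_states, new_mailbox).
    """
    actor_count = len(states)
    new_states = {}
    new_mailbox = defaultdict(list)
    for actor_id, state in states.items():
        # Only the OLD mailbox is read here, never the one being built.
        new_state = transition(state, mailbox.get(actor_id, []))
        new_states[actor_id] = new_state
        if new_state == "active":
            # Hypothetical ring topology: notify the next actor.
            neighbour = (actor_id + 1) % actor_count
            new_mailbox[neighbour].append(("wake", actor_id))
    return new_states, dict(new_mailbox)


if __name__ == "__main__":
    states = {0: "idle", 1: "idle", 2: "idle"}
    mailbox = {0: [("wake", -1)]}  # external kick for actor 0
    for step in range(3):
        states, mailbox = tick(states, mailbox)
        print(step, states, mailbox)
```

In a Spark setting, the `states` and `mailbox` dictionaries would presumably become key-value datasets keyed by actor id, with one join or grouping per tick; whether that performs acceptably for millions of actors is exactly the open question of the thread.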

VS: VS: Using Spark as a simulator

2017-07-07 Thread Esa Heikkinen
4 To: Esa Heikkinen Cc: Mahesh Sawaiker; user@spark.apache.org Subject: Re: VS: Using Spark as a simulator Spark dropped Akka some time ago... I think the main issue he will face is a library for simulating the state machines (randomly), storing a huge number of files (HDFS is probably

Re: VS: Using Spark as a simulator

2017-07-07 Thread Jörn Franke
Sent: 21 June 2017 14:45 To: Esa Heikkinen; Jörn Franke Cc: user@spark.apache.org Subject: RE: Using Spark as a simulator Spark can help you to create one large file if needed, but HDFS itself will provide abstraction over such things, so it’s a trivia

VS: Using Spark as a simulator

2017-07-06 Thread Esa Heikkinen
Spark was originally built on it (Akka). Esa From: Mahesh Sawaiker Sent: 21 June 2017 14:45 To: Esa Heikkinen; Jörn Franke Cc: user@spark.apache.org Subject: RE: Using Spark as a simulator Spark can help you to create one large file

RE: Using Spark as a simulator

2017-06-21 Thread Mahesh Sawaiker
object. This way you will get an RDD of Scala objects, which you can then process with functional/set operators. You would want to read about PairRDDs. From: Esa Heikkinen [mailto:esa.heikki...@student.tut.fi] Sent: Wednesday, June 21, 2017 1:12 PM To: Jörn Franke Cc: user@spark.apache.org Subjec

VS: Using Spark as a simulator

2017-06-21 Thread Esa Heikkinen
To: Esa Heikkinen Cc: user@spark.apache.org Subject: Re: Using Spark as a simulator It is fine, but you have to design it so that generated rows are written in large blocks for optimal performance. The trickiest part of data generation is the conceptual part, such as probabilistic

RE: Using Spark as a simulator

2017-06-20 Thread Mahesh Sawaiker
nt in the tables from 1G upwards. From: Esa Heikkinen [mailto:esa.heikki...@student.tut.fi] Sent: Tuesday, June 20, 2017 7:34 PM To: user@spark.apache.org Subject: Using Spark as a simulator Hi, Spark is a data analyzer, but would it be possible to use Spark as a data generator or simulator?

Re: Using Spark as a simulator

2017-06-20 Thread Jörn Franke
It is fine, but you have to design it so that generated rows are written in large blocks for optimal performance. The trickiest part of data generation is the conceptual part, such as the probability distribution etc. You also have to check that you use a good random generator, for some cases
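The two practical points in this reply (write generated rows in large blocks, and take care with the random generator) can be sketched in a few lines. This is plain Python, not Spark code, and the function names, the seed-mixing constant, and the Gaussian row values are assumptions made for illustration. Each partition gets its own seeded generator so that a run is reproducible and partitions do not share one stream; note that distinct seeds alone are not a formal guarantee of statistically independent streams, so a serious simulation should check that against its own requirements.

```python
import random


def generate_partition(partition_index, rows_per_partition, base_seed=42):
    """Generate the rows of one partition with a partition-specific seed.

    The seed mixing below is an illustrative assumption, not a
    recommendation from the thread.
    """
    rng = random.Random(base_seed * 1_000_003 + partition_index)
    rows = []
    for i in range(rows_per_partition):
        # Globally unique row id derived from the partition index.
        row_id = partition_index * rows_per_partition + i
        # Example distribution only: standard normal values.
        rows.append((row_id, rng.gauss(0.0, 1.0)))
    return rows


def in_blocks(rows, block_size):
    """Yield the rows in blocks of up to block_size rows, so that a
    writer can emit a few large chunks instead of many tiny ones."""
    for start in range(0, len(rows), block_size):
        yield rows[start:start + block_size]


if __name__ == "__main__":
    for block in in_blocks(generate_partition(0, 10), 4):
        print(len(block), block[0])
```

In Spark, a function like `generate_partition` would typically be applied per partition, for example through the RDD method `mapPartitionsWithIndex`, which passes the partition index to the function.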

Using Spark as a simulator

2017-06-20 Thread Esa Heikkinen
Hi, Spark is a data analyzer, but would it be possible to use Spark as a data generator or simulator? My simulation can be very large, and I think a parallelized simulation using Spark (cloud) could work. Is that a good or bad idea? Regards, Esa Heikkinen