You can try out a few tricks employed by folks at Lynx Analytics...
Daniel Darabos gave some details at Spark Summit:
https://www.youtube.com/watch?v=zt1LdVj76LU&index=13&list=PL-x35fyliRwhP52fwDqULJLOnqnrN5nDs
On 22.7.2015. 17:00, Louis Hust wrote:
My code is like below:
Map<String, String> t11opt = new HashMap<String, String>();
t11opt.put("url", DB_URL);
t11opt.put("dbtable", "t11");
DataFrame t11 = sqlContext.load("jdbc", t11opt);
t11.registerTempTable("t11");
... the same for t12, t21, t22
You can use the Spark REST job server (or any other solution that provides a
long-running Spark context) so that you won't pay this bootstrap time on each
query.
In addition: if you have some RDD that you want your queries to be executed
on, you can cache this RDD in memory (depends on your cluster memory size
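
To make the point above concrete, here is a rough sketch in plain Java (no Spark dependency) of why a long-lived context with cached data helps: the slow load happens once, and later queries reuse the in-memory copy. Everything here is an assumption for illustration: `LongRunningContextSketch`, `slowLoad`, `table` and `sum` are hypothetical names, not Spark or job server APIs, and the hard-coded rows stand in for whatever the JDBC fetch would return.

```java
import java.util.HashMap;
import java.util.Map;

// Toy stand-in for a long-running context; none of these names are Spark APIs.
class LongRunningContextSketch {
    // Stand-in for tables/RDDs kept in memory for the life of the context.
    private final Map<String, int[]> cachedTables = new HashMap<>();
    private int loadCount = 0; // how many times the slow load path ran

    // Hypothetical expensive step (think: JDBC fetch plus job setup).
    private int[] slowLoad(String table) {
        loadCount++;
        return new int[] {1, 2, 3}; // placeholder rows, not real data
    }

    // Return the cached rows, loading them only on first use.
    int[] table(String name) {
        return cachedTables.computeIfAbsent(name, this::slowLoad);
    }

    // A "query": sum the rows of a table held by this context.
    int sum(String name) {
        int total = 0;
        for (int v : table(name)) {
            total += v;
        }
        return total;
    }

    int loadCount() {
        return loadCount;
    }

    public static void main(String[] args) {
        LongRunningContextSketch ctx = new LongRunningContextSketch();
        // Three queries against the same context; only the first pays the load.
        for (int i = 0; i < 3; i++) {
            System.out.println("sum(t11) = " + ctx.sum("t11"));
        }
        System.out.println("slow loads: " + ctx.loadCount());
    }
}
```

In a real deployment the context would be the Spark context kept alive by the job server, and the cached map would correspond to cached RDDs or tables; the sketch only models the cost pattern, not Spark's actual latency.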
Subject: Re: Is spark suitable for real time query
I did a simple test using Spark in standalone mode (not cluster),
and found that a simple action takes a few seconds, even though the data size
is small, just a few rows.
So will each Spark job cost some time for init or prepare work no matter
what the job is?
I mean if the basic framework of a Spark job will cost seconds
Real-time is, of course, relative but you’ve mentioned microsecond level. Spark
is designed to process large amounts of data in a distributed fashion. No
distributed system I know of could give any kind of guarantees at the
microsecond level.
Robin
> On 22 Jul 2015, at 11:14, Louis Hust wrote
Hi, all
I am using the Spark jar in standalone mode, fetching data from different MySQL
instances and doing some actions, but I found the time is at the second level.
So I want to know whether a Spark job is suitable for real-time queries at the
microsecond level?