We want to extract data from MySQL and run the calculations in Spark SQL.
The SQL explain plan looks like below.
REGIONKEY#177,N_COMMENT#178] PushedFilters: [], ReadSchema:
struct
+- *(20) Sort [r_regionkey#203 ASC NULLS FIRST], false, 0
+- Exchange(coordinator id: 266374831)
We want to extract data from MySQL and run the calculations in Spark SQL.
The SQL explain plan looks like below.
== Parsed Logical Plan ==
'Sort ['revenue DESC NULLS LAST], true
+- 'Aggregate ['n_name], ['n_name, 'SUM(('l_extendedprice * (1 - 'l_discount))) AS revenue#329]
   +- 'Filter ('c_custkey = 'o_cu
Hi, all,
We deploy Spark SQL in standalone mode without HDFS, on one machine with 256 GB
RAM and 64 cores.
The SparkSession properties are set like below:
SparkSession.builder().appName("MYAPP")
    .config("spark.sql.crossJoin.enabled", "true")
    .config("spark.executor.memory", th
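The builder chain above is cut off mid-value, and the original memory settings are unknown. Purely as a configuration sketch with placeholder values (shown in PySpark's builder API, which mirrors the Java one used in the thread), a completed builder for a single 256 GB / 64-core box might look like:

```python
# Configuration sketch only -- all values below are placeholders, not the
# settings from the original message. Requires pyspark.
from pyspark.sql import SparkSession

spark = (
    SparkSession.builder
    .appName("MYAPP")
    .config("spark.sql.crossJoin.enabled", "true")
    .config("spark.executor.memory", "64g")         # placeholder
    .config("spark.driver.memory", "16g")           # placeholder
    .config("spark.sql.shuffle.partitions", "64")   # placeholder: match core count
    .getOrCreate()
)
```

On a single large machine, keeping `spark.sql.shuffle.partitions` near the core count avoids the overhead of the 200-partition default for small data.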
leverage the distributed in-memory engine of Spark.
>
> Paolo
>
> Sent from my Windows Phone
> ------
> From: Louis Hust
> Sent: 26/07/2015 10:28
> To: Shixiong Zhu
> Cc: Jerrick Hoang ; user@spark.apache.org
> Subject: Re: Spark is m
Shixiong Zhu
>
> 2015-07-26 16:16 GMT+08:00 Louis Hust :
>
>> Look at the given url:
>>
>> Code can be found at:
>>
>>
>> https://github.com/louishust/sparkDemo/blob/master/src/main/java/DirectQueryTest.java
>>
>> 2015-07-26 16:14 GMT+08:00 Sh
, it's possible because the overhead of
> Spark dominates for small queries.
>
> Best Regards,
> Shixiong Zhu
>
> 2015-07-26 15:56 GMT+08:00 Jerrick Hoang :
>
>> how big is the dataset? how complicated is the query?
>>
>> On Sun, Jul 26, 2015 at 12:47 AM Louis Hus
Hi, all,
I am using a Spark DataFrame to fetch a small table from MySQL,
and I found it costs much more than accessing MySQL directly using JDBC.
The time cost for Spark is about 2033 ms, versus about 16 ms for direct access.
Code can be found at:
https://github.com/louishust/sparkDemo/blob/master/src/main/java/Di
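The ~2033 ms vs. ~16 ms gap discussed here is dominated by fixed per-query overhead (session setup, planning, scheduling), so measuring a single call is what matters. A minimal, hedged timing harness (plain Python, not Spark; the workload below is a trivial stand-in for the hypothetical DataFrame fetch and direct JDBC query) might look like:

```python
import time

def time_call(fn, *args):
    """Return (result, elapsed_ms) for a single call to fn."""
    start = time.perf_counter()
    result = fn(*args)
    elapsed_ms = (time.perf_counter() - start) * 1000.0
    return result, elapsed_ms

# Stand-in workload: in the thread this would be the Spark DataFrame
# fetch or the direct JDBC query (both hypothetical here).
def small_table_fetch():
    return sum(range(1000))

result, ms = time_call(small_table_fetch)
print(f"result={result}, elapsed={ms:.2f} ms")
```

Timing each path the same way, including the first call rather than only warmed-up repeats, makes the fixed-overhead difference visible.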
sc.textFile("LICENSE").filter(_ contains "Spark").count
>
> This takes less than a second the first time I run it and is instantaneous
> on every subsequent run.
>
> What code are you running?
>
>
> On 22 Jul 2015, at 12:34, Louis Hust wrote:
>
> I
second level.
>
> Robin
>
> > On 22 Jul 2015, at 11:14, Louis Hust wrote:
> >
> > Hi, all
> >
> > I am using a Spark jar in standalone mode, fetching data from different
> MySQL instances and doing some actions, but I found the time cost is at the second level.
> >
> &
Hi, all,
I am using a Spark jar in standalone mode, fetching data from different MySQL
instances and doing some actions, but I found the time cost is at the second level.
So I want to know whether a Spark job is suitable for real-time queries at the
microsecond level?
Hi, all,
I am using Spark 1.4 and find that some SQL is not supported,
especially subqueries, such as subqueries in select items,
in the where clause, and in predicate conditions.
So I want to know whether Spark supports subqueries or I am using
Spark SQL the wrong way?
If subqueries are not supported, is there a plan
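Spark 1.4 did indeed lack most subquery forms (scalar subqueries in the select list, IN/EXISTS in the where clause); the usual workaround was to rewrite the subquery as a join. As an illustration of that rewrite only (this is SQLite from the Python standard library, not Spark; the table and column names echo the TPC-H-style names in the plans above), the two forms return the same rows:

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE customer (c_custkey INTEGER, c_name TEXT);
    CREATE TABLE orders   (o_orderkey INTEGER, o_custkey INTEGER);
    INSERT INTO customer VALUES (1, 'alice'), (2, 'bob'), (3, 'carol');
    INSERT INTO orders   VALUES (10, 1), (11, 1), (12, 3);
""")

# IN-subquery form (not supported in Spark 1.4's WHERE clause):
subquery = conn.execute("""
    SELECT c_name FROM customer
    WHERE c_custkey IN (SELECT o_custkey FROM orders)
    ORDER BY c_name
""").fetchall()

# Equivalent join rewrite (what Spark 1.4 could run):
join = conn.execute("""
    SELECT DISTINCT c.c_name
    FROM customer c JOIN orders o ON c.c_custkey = o.o_custkey
    ORDER BY c.c_name
""").fetchall()

print(subquery)  # [('alice',), ('carol',)]
assert subquery == join
```

The DISTINCT in the join form matters: a customer with several orders would otherwise appear once per order, while the IN form returns each matching customer once.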