Re: Spark is much slower than direct access MySQL

Louis Hust Sun, 26 Jul 2015 01:29:01 -0700

Thanks for your explanation.

2015-07-26 16:22 GMT+08:00 Shixiong Zhu <[email protected]>:


> Oh, I see. That's the total time of executing a query in Spark. Then the
> difference is reasonable, considering Spark has much more work to do, e.g.,
> launching tasks in executors.
>
> Best Regards,
> Shixiong Zhu
>
> 2015-07-26 16:16 GMT+08:00 Louis Hust <[email protected]>:
>
>> Please look at the code, which can be found at:
>>
>>
>> https://github.com/louishust/sparkDemo/blob/master/src/main/java/DirectQueryTest.java
>>
>> 2015-07-26 16:14 GMT+08:00 Shixiong Zhu <[email protected]>:
>>
>>> Could you clarify how you measure the Spark time cost? Is it the total
>>> time of running the query? If so, the difference is plausible, because
>>> Spark's overhead dominates for small queries.
>>>
>>> Best Regards,
>>> Shixiong Zhu
>>>
>>> 2015-07-26 15:56 GMT+08:00 Jerrick Hoang <[email protected]>:
>>>
>>>> How big is the dataset? How complicated is the query?
>>>>
>>>> On Sun, Jul 26, 2015 at 12:47 AM Louis Hust <[email protected]>
>>>> wrote:
>>>>
>>>>> Hi, all,
>>>>>
>>>>> I am using a Spark DataFrame to fetch a small table from MySQL,
>>>>> and I found it takes much longer than accessing MySQL directly using JDBC.
>>>>>
>>>>> The time cost for Spark is about 2033 ms, while direct access takes about 16 ms.
>>>>>
>>>>> Code can be found at:
>>>>>
>>>>>
>>>>> https://github.com/louishust/sparkDemo/blob/master/src/main/java/DirectQueryTest.java
>>>>>
>>>>> Is my Spark configuration wrong? How can I optimise Spark to
>>>>> achieve performance similar to direct access?
>>>>>
>>>>> Any ideas would be appreciated!
>>>>>
>>>>>
>>>
>>
>
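
One way to check the point made in this thread (that fixed overhead dominates when the query itself is small) is to time the first run separately from later runs. The following is only a minimal sketch: the class `QueryTimer`, the method names, and the stand-in `query` task are hypothetical and are not taken from DirectQueryTest.java. In a real test the `Runnable` would wrap the DataFrame fetch or the JDBC call.

```java
// Hypothetical helper, not part of the code linked in this thread.
class QueryTimer {

    /** Runs the task once and returns the elapsed wall-clock time in milliseconds. */
    static double timeOnceMs(Runnable task) {
        long start = System.nanoTime();
        task.run();
        return (System.nanoTime() - start) / 1_000_000.0;
    }

    /**
     * Times the first (cold) run on its own, then averages the following warm runs.
     * Returns an array of {coldMs, warmAverageMs}.
     */
    static double[] coldAndWarmMs(Runnable task, int warmRuns) {
        double cold = timeOnceMs(task);
        double total = 0.0;
        for (int i = 0; i < warmRuns; i++) {
            total += timeOnceMs(task);
        }
        // With no warm runs there is nothing to average.
        double warmAverage = warmRuns > 0 ? total / warmRuns : Double.NaN;
        return new double[] {cold, warmAverage};
    }

    public static void main(String[] args) {
        // Stand-in for the real work (assumption): replace with the Spark or JDBC query.
        Runnable query = () -> {
            long sum = 0;
            for (int i = 0; i < 1_000_000; i++) {
                sum += i;
            }
            if (sum < 0) {
                throw new IllegalStateException("unexpected overflow");
            }
        };

        double[] result = coldAndWarmMs(query, 10);
        System.out.printf("cold: %.3f ms, warm average: %.3f ms%n", result[0], result[1]);
    }
}
```

If the cold figure is far larger than the warm average, most of the measured time is one-time start-up rather than the query itself. This thread does not establish how close a warm Spark query would come to the 16 ms JDBC figure, since per-query work such as launching tasks in executors still applies.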
