Thanks for your explanation.

2015-07-26 16:22 GMT+08:00 Shixiong Zhu <[email protected]>:
> Oh, I see. That's the total time of executing a query in Spark. Then the
> difference is reasonable, considering Spark has much more work to do, e.g.,
> launching tasks in executors.
>
> Best Regards,
> Shixiong Zhu
>
> 2015-07-26 16:16 GMT+08:00 Louis Hust <[email protected]>:
>
>> Look at the given URL:
>>
>> Code can be found at:
>>
>> https://github.com/louishust/sparkDemo/blob/master/src/main/java/DirectQueryTest.java
>>
>> 2015-07-26 16:14 GMT+08:00 Shixiong Zhu <[email protected]>:
>>
>>> Could you clarify how you measure the Spark time cost? Is it the total
>>> time of running the query? If so, it's possible because the overhead of
>>> Spark dominates for small queries.
>>>
>>> Best Regards,
>>> Shixiong Zhu
>>>
>>> 2015-07-26 15:56 GMT+08:00 Jerrick Hoang <[email protected]>:
>>>
>>>> How big is the dataset? How complicated is the query?
>>>>
>>>> On Sun, Jul 26, 2015 at 12:47 AM Louis Hust <[email protected]>
>>>> wrote:
>>>>
>>>>> Hi, all,
>>>>>
>>>>> I am using a Spark DataFrame to fetch a small table from MySQL,
>>>>> and I found it costs much more than accessing MySQL directly using JDBC.
>>>>>
>>>>> The time cost for Spark is about 2033 ms, and for direct access about 16 ms.
>>>>>
>>>>> Code can be found at:
>>>>>
>>>>> https://github.com/louishust/sparkDemo/blob/master/src/main/java/DirectQueryTest.java
>>>>>
>>>>> So is my configuration for Spark wrong? How can I optimise Spark to
>>>>> achieve performance similar to direct access?
>>>>>
>>>>> Any ideas will be appreciated!
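
P.S. For anyone reading this thread later: one way to check how much of a total like the ~2033 ms above is one-time setup rather than per-query cost is to run the same query several times and compare the first run with the later ones. Below is a minimal sketch of such a timing harness in plain Java. The names `QueryTiming`, `timeRuns` and `median` are hypothetical, and the lambda in `main` is only a stand-in that simulates a slow first call with `Thread.sleep`; it does not touch Spark or MySQL, so swap in your own query call.

```java
import java.util.Arrays;
import java.util.function.Supplier;

public class QueryTiming {

    /** Runs the query `runs` times and returns each run's wall-clock time in milliseconds. */
    public static long[] timeRuns(Supplier<?> query, int runs) {
        long[] millis = new long[runs];
        for (int i = 0; i < runs; i++) {
            long start = System.nanoTime();
            query.get();
            millis[i] = (System.nanoTime() - start) / 1_000_000;
        }
        return millis;
    }

    /** Returns the median of the values (the upper median for even-length input). */
    public static long median(long[] values) {
        long[] sorted = values.clone();
        Arrays.sort(sorted);
        return sorted[sorted.length / 2];
    }

    private static void sleep(long ms) {
        try {
            Thread.sleep(ms);
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
        }
    }

    public static void main(String[] args) {
        // Hypothetical stand-in for a real query: the first call is slow
        // (simulated setup cost), later calls are fast. Replace with your own call.
        boolean[] warmedUp = {false};
        Supplier<Integer> query = () -> {
            sleep(warmedUp[0] ? 5 : 200);
            warmedUp[0] = true;
            return 1;
        };

        long[] millis = timeRuns(query, 5);
        long[] laterRuns = Arrays.copyOfRange(millis, 1, millis.length);

        System.out.println("first run: " + millis[0] + " ms");
        System.out.println("median of later runs: " + median(laterRuns) + " ms");
    }
}
```

If the later runs turn out much cheaper than the first, one-time setup dominates the measurement. If they stay about as slow, the cost is per-query overhead such as the task launching Shixiong mentions.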
