Re: Spark as sql engine on S3

Ashok Kumar Fri, 08 Jul 2016 02:49:36 -0700

Hi
As I said, we have been using Hive as our SQL engine for the datasets, but we 
are storing the data externally in Amazon S3.
Now you have suggested Spark Thrift Server.


I started Spark Thrift Server on port 10001 and connected to it with beeline:
Connecting to jdbc:hive2://<host>:10001
Connected to: Spark SQL (version 1.6.1)
Driver: Spark Project Core (version 1.6.1)
Transaction isolation: TRANSACTION_REPEATABLE_READ
Beeline version 1.6.1 by Apache Hive
Can I now access my external tables on S3 the same way I do in Hive, with 
beeline connected to the Hive Thrift Server?
And is the advantage that Spark SQL will be much faster?
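
For illustration, here is a minimal Python sketch of the kind of Hive-style DDL that could declare an external table over S3 data, to be submitted through beeline. The table name, columns, bucket path, the Parquet format and the s3a:// scheme are all hypothetical placeholders, not details from this thread; which S3 scheme works depends on the Hadoop build and credentials in use. If the Thrift Server is configured against the same Hive metastore, tables already defined in Hive would normally be visible without this step.

```python
def external_table_ddl(table, columns, location, stored_as="PARQUET"):
    """Return a Hive-style CREATE EXTERNAL TABLE statement as a string.

    `columns` is a list of (name, type) pairs. Nothing is executed here;
    the string is meant to be pasted into beeline or passed with `-e`.
    """
    column_list = ", ".join(f"{name} {dtype}" for name, dtype in columns)
    return (
        f"CREATE EXTERNAL TABLE IF NOT EXISTS {table} ({column_list}) "
        f"STORED AS {stored_as} LOCATION '{location}'"
    )


if __name__ == "__main__":
    # Hypothetical table and bucket, used only to show the shape of the DDL.
    ddl = external_table_ddl(
        "sales",
        [("id", "BIGINT"), ("amount", "DOUBLE")],
        "s3a://example-bucket/warehouse/sales/",
    )
    print(ddl)
```

Because the table is EXTERNAL, dropping it later should remove only the metadata and leave the files on S3 in place.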
regards

 

    On Friday, 8 July 2016, 6:30, ayan guha <guha.a...@gmail.com> wrote:
 

 Yes, it can. 
On Fri, Jul 8, 2016 at 3:03 PM, Ashok Kumar <ashok34...@yahoo.com> wrote:

Thanks. So basically Spark Thrift Server runs on a port, and clients connect 
to it much like beeline uses JDBC to connect to Hive?
Can Spark Thrift Server access Hive tables?
regards 

    On Friday, 8 July 2016, 5:27, ayan guha <guha.a...@gmail.com> wrote:
 

 Spark Thrift Server works as a JDBC server. You can connect to it from any 
JDBC tool, such as SQuirreL.
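
As a rough sketch of what connecting over JDBC involves, the Python snippet below builds the HiveServer2-style URL and the matching beeline command line. The host name, the "default" database and the helper names are assumptions made for illustration; port 10001 is the one mentioned elsewhere in this thread. The -u, -n and -e options are standard beeline flags for the URL, user name and query.

```python
import shlex


def thrift_jdbc_url(host, port=10001, database="default"):
    """Build a jdbc:hive2 URL; Spark Thrift Server speaks the HiveServer2 protocol."""
    return f"jdbc:hive2://{host}:{port}/{database}"


def beeline_command(host, port=10001, user=None, query=None):
    """Return the argv list for a beeline invocation against the Thrift Server."""
    argv = ["beeline", "-u", thrift_jdbc_url(host, port)]
    if user:
        argv += ["-n", user]   # user name to connect as
    if query:
        argv += ["-e", query]  # run a single statement and exit
    return argv


if __name__ == "__main__":
    # Hypothetical host; prints a shell-quoted command that can be copied into a terminal.
    print(shlex.join(beeline_command("thrift-host.example.com", query="SHOW TABLES")))
```

The same URL can be given to any other JDBC client together with the Hive JDBC driver.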
On Fri, Jul 8, 2016 at 3:50 AM, Ashok Kumar <ashok34...@yahoo.com.invalid> 
wrote:

Hello gurus,
We are storing data externally on Amazon S3.
What is the optimum or best way to use Spark as a SQL engine to access data on S3?
Any info or write-up will be greatly appreciated.
Regards



-- 
Best Regards,
Ayan Guha


   



