Re: Querying registered RDD (AsTable) using JDBC

shahab Mon, 22 Dec 2014 00:43:32 -0800

Thanks Evert for detailing the solution, I do appreciate it. But I
would first try Cheng's suggestion. And thanks Cheng for the help.
I will let you know if I succeed.


best,
/Shahab

On Sun, Dec 21, 2014 at 12:49 PM, Cheng Lian <[email protected]> wrote:

>  Evert - Thanks for the instructions, this is generally useful in other
> scenarios, but I think this isn’t what Shahab needs, because saveAsTable
> actually saves the contents of the SchemaRDD into Hive.
>
> Shahab - As Michael has answered in another thread, you may try
> HiveThriftServer2.startWithContext, which is a quite experimental
> feature. Here is a quick spark-shell sample session:
>
> import org.apache.spark.sql.hive.HiveContext
> import org.apache.spark.sql.catalyst.types._
> import java.sql.Date
>
> val sparkContext = sc
> import sparkContext._
> val sqlContext = new HiveContext(sparkContext)
> import sqlContext._
>
> makeRDD((1, "hello") :: (2, "world") :: Nil).toSchemaRDD.cache().registerTempTable("t")
>
> import org.apache.spark.sql.hive.thriftserver._
> HiveThriftServer2.startWithContext(sqlContext)
>
> Then you can connect to the started server via beeline:
>
> $ ./bin/beeline -u jdbc:hive2://localhost:10000/default
> 0: jdbc:hive2://localhost:10000/default> select * from t;
> +-----+--------+
> | _1  |   _2   |
> +-----+--------+
> | 1   | hello  |
> | 2   | world  |
> +-----+--------+
> 2 rows selected (0.208 seconds)
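>
> The same query can also be issued from any JDBC client rather than
> beeline. A minimal sketch, assuming the Hive JDBC driver
> (org.apache.hive.jdbc.HiveDriver) and its dependencies are on the
> client's classpath, that the server runs without authentication, and
> that the temp table "t" from the session above is still registered;
> the object name QueryTempTable is only illustrative:
>
> ```scala
> import java.sql.DriverManager
>
> object QueryTempTable {
>   def main(args: Array[String]): Unit = {
>     // Assumption: hive-jdbc is on the classpath; load the driver explicitly.
>     Class.forName("org.apache.hive.jdbc.HiveDriver")
>
>     // Same URL as used with beeline above; empty user/password is an
>     // assumption that holds only for an unsecured server.
>     val conn = DriverManager.getConnection(
>       "jdbc:hive2://localhost:10000/default", "", "")
>     try {
>       val stmt = conn.createStatement()
>       val rs = stmt.executeQuery("SELECT * FROM t")
>       // Columns _1 (int) and _2 (string), read by position.
>       while (rs.next()) {
>         println(s"${rs.getInt(1)}\t${rs.getString(2)}")
>       }
>     } finally {
>       conn.close()
>     }
>   }
> }
> ```
>
> Since the table lives only in the HiveContext that started the server,
> the query works only while that spark-shell session is still running.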
>
> Cheng
>
> On 12/20/14 1:09 AM, Evert Lammerts wrote:
>
>  Yes you can, using HiveContext, a metastore and the thriftserver. The
> metastore persists information about your SchemaRDD, and the HiveContext,
> initialised with information on the metastore, can interact with the
> metastore. The thriftserver provides JDBC connections using the metastore.
>
>  Using MySQL as an example backend for the metastore:
>
> 1. Install MySQL
> 2. Create a database: CREATE database hive_metastore CHARSET latin1;
> 3. Create a metastore user: GRANT ALL ON hive_metastore.* TO
> metastore_user IDENTIFIED BY 'password';
> 4. Create a hive-site.xml in your Spark's conf dir: see
> http://pastebin.com/VXcmJWdX for an example
> 5. Download the mysql jdbc driver from
> http://dev.mysql.com/downloads/connector/j/
> 6. Start the spark-shell with the mysql driver on the classpath: $
> ./bin/spark-shell --driver-class-path mysql-connector-java-5.1.34-bin.jar
> 7. Register the table using something like:
> > val sqlct = new org.apache.spark.sql.hive.HiveContext(sc)
> > sqlct.setConf("hive.metastore.warehouse.dir", "/some/path/to/store/tables") // if you're local, i.e. not using HDFS
> > ... // create your SchemaRDD using sqlct
> > rdd.saveAsTable("mytable")
> 8. Start the thriftserver (which provides the JDBC
> connection): $ ./sbin/start-thriftserver.sh
> --driver-class-path mysql-connector-java-5.1.34-bin.jar --conf
> hive.metastore.warehouse.dir=/some/path/to/store/tables
>
>  Something like that should do it. Now you can connect from, for example,
> beeline:
>
>  $ ./bin/beeline
> > !connect jdbc:hive2://localhost:10000
>  > show tables;
>
>  This is a good guide re the metastore regardless of your distribution:
> http://www.cloudera.com/content/cloudera/en/documentation/cdh4/v4-2-0/CDH4-Installation-Guide/cdh4ig_topic_18_4.html
> .
>
>
>
> On Fri Dec 19 2014 at 5:34:49 PM shahab <[email protected]> wrote:
>
>> Hi,
>>
>>  Sorry for repeating the same question, just wanted to clarify the issue
>> :
>>
>>  Is it possible to expose an RDD (or SchemaRDD) to external components
>> (outside Spark) so it can be queried over JDBC (my goal is not to place
>> the RDD back in a database but to use this cached RDD to serve JDBC queries)
>> ?
>>
>>  best,
>>
>>  /shahab
>>
>    
>
