[ https://issues.apache.org/jira/browse/HADOOP-4101?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12644829#action_12644829 ]

Raghotham Murthy commented on HADOOP-4101:
------------------------------------------

Michi and I were discussing this over the weekend. Here's our current thinking 
about the design. Michi, please confirm.

1. Implement a Thrift client/server for Hive. For now, the interface consists 
only of execute and fetch_row. We were able to set up the framework with a 
Thrift server and a Java client that talks to the server. The next step is to 
get the server to run the queries.
Notes: we looked at the metastore code and thought it might be simpler to first 
implement a separate Thrift client/server before merging it with the metastore. 
Some installations might want to run separate instances of the metastore and 
the Hive server, and it's easier to test a smaller interface where we 
understand the code. Also, the metastore code seems to have classes that aren't 
being used at all, and the scripts to start/stop the metastore don't really 
work in non-Facebook installations (we need to file separate JIRAs for those).
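To make the execute/fetch_row contract from point 1 concrete, here is a sketch in plain Java with a toy in-memory implementation. All names here are hypothetical: the real interface would be generated by the Thrift compiler from service/if/hive_service.thrift, and a real server would hand the query off to the query driver rather than return canned rows.

```java
import java.util.Arrays;
import java.util.Iterator;

// Hypothetical sketch of the two-call contract the Thrift service
// would expose. The actual Java client/server stubs would be
// generated into service/src/gen-javabean by the Thrift compiler.
interface HiveServiceSketch {
    // Run a HiveQL statement; rows are pulled afterwards via fetchRow().
    void execute(String query);

    // Return the next result row, or null when the result set is exhausted.
    String fetchRow();
}

// Toy in-memory implementation, only to show how a client drives the
// two calls; a real HiveServer would execute the query for real.
class InMemoryHiveService implements HiveServiceSketch {
    private Iterator<String> rows;

    public void execute(String query) {
        // Pretend every query produces the same two rows.
        rows = Arrays.asList("row1", "row2").iterator();
    }

    public String fetchRow() {
        return rows.hasNext() ? rows.next() : null;
    }
}
```

A client (such as the CLI) would then call execute() once and loop on fetchRow() until it returns null.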

2. Build a JDBC interface that makes calls to the generated Java Thrift 
client. We could also have Python and Perl DBI interfaces that make calls to 
the generated Thrift client code in those languages. So the Thrift interface 
is a generic interface that is not tied to any particular standard (JDBC, DBI, 
etc.).
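The wrapper idea in point 2 amounts to delegation: the JDBC layer forwards work to the Thrift client underneath. A hypothetical sketch (deliberately not the full java.sql.Statement contract, just the delegation pattern; the real wrapper would live under jdbc/src/java/org/apache/hadoop/hive/jdbc):

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical sketch: a JDBC-style statement that forwards to a
// Thrift-generated Hive client. Names are illustrative only.
class HiveStatementSketch {
    // Stand-in for the generated Thrift client (execute + fetch_row).
    interface ThriftClient {
        void execute(String query);
        String fetchRow(); // null when the result set is exhausted
    }

    private final ThriftClient client;

    HiveStatementSketch(ThriftClient client) {
        this.client = client;
    }

    // Analogue of Statement.executeQuery(): run the query on the
    // server, then drain the rows into a result list.
    List<String> executeQuery(String sql) {
        client.execute(sql);
        List<String> rows = new ArrayList<String>();
        for (String row = client.fetchRow(); row != null; row = client.fetchRow()) {
            rows.add(row);
        }
        return rows;
    }
}
```

Because the Thrift interface stays standard-agnostic, a Perl DBI or Python DB-API wrapper would follow the same shape against the generated client in those languages.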

3. The directory structure in the code would be as follows under 
src/contrib/hive; it follows a similar model to the metastore.

service/if/hive_service.thrift
service/include/<headers from thrift>
service/fb303/<scripts for service_ctrl to manage server>
service/src/gen-javabean/<generated java code>
service/src/gen-php/<generated php>
service/src/gen-py/<generated python>
service/src/gen-perl/<generated perl>
service/src/scripts/<ctrl scripts for server>
service/src/java/org/apache/hadoop/hive/service/HiveServer.java
service/src/java/org/apache/hadoop/hive/service/HiveClient.java
jdbc/src/java/org/apache/hadoop/hive/jdbc/<whatever is in current jdbc patch>
dbi/<perl dbi interface calling service/src/gen-perl>
cli/<changed to use HiveClient or HiveJdbc>

4. Next steps
a. Get the server to run queries and return results to the client.
b. Move ql/Driver.java to service, since actually running the query is not 
really part of the query language.
c. Change the CLI to use the service.
d. Verify which parts of the metastore interface are needed by JDBC and 
move/copy those parts over to hive_service. I don't think it makes sense to do 
it the other way around, i.e. put the Hive service into the metastore, since 
the metastore is not the right abstraction for actually running queries.
e. There is common Thrift code in the metastore and service. We should either 
move it to a separate thrift directory or make the metastore use code from 
service.

It would be good to meet up and discuss these in more detail. I'll let Michi 
provide a patch for the Hive server/client and the JDBC wrappers for the Hive 
client.

> Support JDBC connections for interoperability between Hive and RDBMS
> --------------------------------------------------------------------
>
>                 Key: HADOOP-4101
>                 URL: https://issues.apache.org/jira/browse/HADOOP-4101
>             Project: Hadoop Core
>          Issue Type: Improvement
>          Components: contrib/hive
>            Reporter: YoungWoo Kim
>            Priority: Minor
>         Attachments: hadoop-4101.1.patch
>
>
> In many DW and BI systems, the data are currently stored in an RDBMS such as 
> Oracle, MySQL, PostgreSQL, etc. for reporting, charting, and so on.
> It would be useful to be able to import data from an RDBMS and export data to 
> an RDBMS using JDBC connections.
> If Hive supported JDBC connections, it would be much easier to use 3rd-party 
> DW/BI tools.

-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.
