On Mon, Aug 2, 2010 at 2:33 AM, Sonal Goyal <sonalgoy...@gmail.com> wrote:
> Hi Amit,
>
> Hive needs data to be stored in its own namespace. Can you please explain
> why you want to call the database through Hive?
>
> Thanks and Regards,
> Sonal
> www.meghsoft.com
> http://in.linkedin.com/in/sonalgoyal
>
>
> On Mon, Aug 2, 2010 at 11:56 AM, amit jaiswal <amit_...@yahoo.com> wrote:
>>
>> Hi,
>>
>> I have a database and am looking for a way to 'mount' a db table in Hive in
>> such a way that a select query in Hive gets translated to a SQL query for the
>> database. I saw DBInputFormat and Sqoop, but nothing that can create a proxy
>> table in Hive which internally makes db calls.
>>
>> I also tried to use a custom variant of DBInputFormat as the input format for
>> the database table.
>>
>> create table employee (id int, name string) stored as INPUTFORMAT
>> 'mycustominputformat' OUTPUTFORMAT
>> 'org.apache.hadoop.mapred.SequenceFileOutputFormat';
>>
>> select id from employee;
>> This fails while running the Hadoop job because HiveInputFormat only
>> supports FileSplits.
>>
>> HiveInputFormat:
>>     public long getStart() {
>>       if (inputSplit instanceof FileSplit) {
>>         return ((FileSplit) inputSplit).getStart();
>>       }
>>       return 0;
>>     }
>>
>> Any suggestions on whether there is an InputFormat implementation that can
>> be used?
>>
>> -amit
>
>
Maybe the new 'storage handlers' would help. Storage handlers tie
together input formats, SerDes, and create/drop table functions. A
JDBC backend storage handler would be a pretty neat thing.
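
As a rough sketch only: storage handlers are attached to a table with the
STORED BY clause, so a JDBC-backed table might be declared something like
the following. The handler class name and the property keys are hypothetical
placeholders for illustration, not an existing implementation.

    -- hypothetical handler class; someone would have to write it
    CREATE EXTERNAL TABLE employee (id INT, name STRING)
    STORED BY 'com.example.hive.jdbc.JdbcStorageHandler'
    TBLPROPERTIES (
      -- hypothetical property names the handler would read
      'jdbc.url'   = 'jdbc:mysql://dbhost/hr',
      'jdbc.table' = 'employee'
    );

With a handler like that in place, 'select id from employee;' would in
principle be served by the handler's own input format instead of by files
under the warehouse directory.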