Re: Connecting to Hive provided by AWS EMR

Paul Mogren Sun, 28 Jun 2015 10:16:10 -0700

Thanks Venki! I should be using port 9083. Port 10000 services the JDBC
connector, and I didn’t realize they’d be different.


I’m not sure what your other comments refer to, or what I would be
specifying. From my recollection of the docs, I didn’t think I would need
any further Hive config in Drill. At first I left the other default config
elements, but whittled it down to what I had before (except the port
number) and it works.
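
For the archive, a sketch of the working storage plugin config as described above: it is the same as my original config quoted below, with only the port changed from 10000 to 9083. The IP address is specific to my cluster, and 9083 assumes the metastore is listening on its default port.

```json
{
  "type": "hive",
  "enabled": true,
  "configProps": {
    "hive.metastore.uris": "thrift://172.24.7.81:9083",
    "hive.metastore.sasl.enabled": "false"
  }
}
```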


Paul


On 6/26/15, 6:38 PM, "Venki Korukanti" <[email protected]> wrote:

>Hi,
>
>What port is your Hive metastore listening on? The default port is 9083. In
>your case you provided 10000 (as part of hive.metastore.uris). Can you
>double check whether that is the correct one?
>
>Also, you need to provide fs.default.name and other S3-related settings in
>the Hive storage plugin config.
>
>Thanks
>Venki
>
>On Fri, Jun 26, 2015 at 3:12 PM, Paul Mogren <[email protected]>
>wrote:
>
>> I have scoured the Drill website and mailing list, and Google, and have
>> come up with no advice. Can you help?
>>
>> I started up an EMR cluster with AWS Hive 0.13.1 installed,
>>
>> started the metastore service: hive/bin/hive --service metastore,
>>
>> created a table:
>> CREATE TABLE apache_log (
>>   host STRING,
>>   IDENTITY STRING,
>>   USER STRING,
>>   TIME STRING,
>>   request STRING,
>>   STATUS STRING,
>>   SIZE STRING,
>>   referrer STRING,
>>   agent STRING
>> )
>> ROW FORMAT SERDE 'org.apache.hadoop.hive.serde2.RegexSerDe'
>> WITH SERDEPROPERTIES (
>>   "input.regex" = "([^ ]*) ([^ ]*) ([^ ]*) (-|\\[[^\\]]*\\]) ([^ \"]*|\"[^\"]*\") ([0-9]*) ([0-9]*) ([^ \"]*|\"[^\"]*\") ([^ \"]*|\"[^\"]*\")"
>> )
>> STORED AS TEXTFILE;
>>
>> And loaded a small amount of data:
>> LOAD DATA LOCAL INPATH 'access_log_1' OVERWRITE INTO TABLE apache_log;
>>  -- source:
>> http://elasticmapreduce.s3.amazonaws.com/samples/pig-apache/input/access_log_1
>>
>>
>>
>> I can query this data from the Hive console or from SquirrelSQL using the
>> AWS Hive JDBC4 driver from
>> http://docs.aws.amazon.com/ElasticMapReduce/latest/DeveloperGuide/HiveJDBCDriver.html
>>
>> I configured a Drill storage plugin:
>> {
>>   "type": "hive",
>>   "enabled": true,
>>   "configProps": {
>>     "hive.metastore.uris": "thrift://172.24.7.81:10000",
>>     "hive.metastore.sasl.enabled": "false"
>>   }
>> }
>>
>>
>> But all I get from Drill is socket timeouts reading from the Hive
>> metastore, whether I try to query the apache_log table or Drill's
>> INFORMATION_SCHEMA.
>>
>> I have a guess that I need to swap in some AWS-provided Hive-related jar
>> files for others that were included with Drill. Looking for suggestions
>> on that approach, or something else I might be overlooking.
>>
>> Thanks,
>> Paul
>>
>>
