Sure. I was able to run the following command against my remote ES cluster:
hive -i init.hive -f search.hql
Below are the contents of init.hive, search.hql, and the data file in HDFS,
/user/cloudera/hivework/foobar/foobar.data.
I replaced the value of es.nodes with a fake name; other than that, it should
run without problems. I am using the feature called "dynamic/multi-resource
writes". It works in this example, but when I also add the 'es.mapping.id' =
'id' setting, I get the following error:
Caused by: org.elasticsearch.hadoop.rest.EsHadoopInvalidRequest:
Unexpected character ('"' (code 34)): was expecting comma to separate
OBJECT entries at [Source: [B@7be1d686; line: 1, column: 53]
    at org.elasticsearch.hadoop.rest.RestClient.execute(RestClient.java:300)
    at org.elasticsearch.hadoop.rest.RestClient.execute(RestClient.java:278)
----- init.hive -----
set es.nodes=my.remote.escluster;
set es.port=9200;
set es.index.auto.create=yes;
set hive.cli.print.current.db=true;
set hive.exec.mode.local.auto=true;
set mapred.map.tasks.speculative.execution=false;
set mapred.reduce.tasks.speculative.execution=false;
set hive.mapred.reduce.tasks.speculative.execution=false;
add jar /home/cloudera/elasticsearch-hadoop-2.0.0/dist/elasticsearch-hadoop-hive-2.0.0.jar;
----- search.hql -----
use search;
DROP TABLE IF EXISTS foo;
CREATE EXTERNAL TABLE foo (id STRING, bar STRING, bar_type STRING)
ROW FORMAT DELIMITED FIELDS TERMINATED BY ','
LOCATION '/user/cloudera/hivework/foobar';
select * from foo;
DROP TABLE IF EXISTS es_foo;
CREATE EXTERNAL TABLE es_foo (id STRING, bar STRING, bar_type STRING)
STORED BY 'org.elasticsearch.hadoop.hive.EsStorageHandler'
TBLPROPERTIES('es.resource' = 'foo_index/{bar_type}');
INSERT OVERWRITE TABLE es_foo SELECT * FROM foo;
----- /user/cloudera/hivework/foobar/foobar.data -----
1, bar1, first_bar
2, bar2, first_bar
3, foo_bar_1, second_bar
4, foo_bar_12, second_bar
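
For reference, here is a sketch of the variant that produces the error quoted above. It assumes the 'es.mapping.id' setting was added to the TBLPROPERTIES of es_foo; the message does not show where the setting was actually placed. Everything else is unchanged from search.hql.

```sql
-- Same es_foo table as in search.hql, with the document id mapped to the
-- id column. Putting es.mapping.id in TBLPROPERTIES is an assumption; the
-- original message does not say where the setting was added.
DROP TABLE IF EXISTS es_foo;
CREATE EXTERNAL TABLE es_foo (id STRING, bar STRING, bar_type STRING)
STORED BY 'org.elasticsearch.hadoop.hive.EsStorageHandler'
TBLPROPERTIES('es.resource' = 'foo_index/{bar_type}',
              'es.mapping.id' = 'id');
INSERT OVERWRITE TABLE es_foo SELECT * FROM foo;
```

With the elasticsearch-hadoop-hive-2.0.0 jar from init.hive, this combination of a dynamic resource and a mapped id is the one reported to fail with EsHadoopInvalidRequest.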
Jinyuan (Jack) Zhou
On Mon, Jun 16, 2014 at 2:06 PM, Costin Leau <[email protected]> wrote:
> Thanks for sharing - can you also give an example of the table
> initialization in init.hive vs myscript.hql?
>
> Cheers!
>
>
> On 6/16/14 11:19 PM, Jinyuan Zhou wrote:
>
>> Just sharing a solution I learned on the Hive side.
>>
>> The hive CLI has an -i option that takes a file of Hive commands to
>> initialize the session,
>> so I can put a list of set commands as well as the add jar ... command in one
>> file, say init.hive,
>> and then run the CLI like this: hive -i init.hive -f myscript.hql. Note that
>> the table creation HQL inside myscript.hql doesn't have to
>> set the es.* properties as long as they appear in the init.hive file. This
>> solves my problem.
>> Thanks,
>>
>>
>> Jinyuan (Jack) Zhou
>>
>>
>> On Sun, Jun 15, 2014 at 10:24 AM, Jinyuan Zhou <[email protected]> wrote:
>>
>> Thanks Costin,
>> I am aiming to avoid modifying the existing Hadoop cluster and Hive
>> installation, and also to modularize some common es.*
>> properties in a separate common place. I know the first goal can be
>> achieved with the hive CLI --auxpath option and
>> the Hive table's TBLPROPERTIES. For the second goal, I am able to move
>> some es.* settings from the TBLPROPERTIES
>> declaration to Hive's set statements. For example, I can put
>>
>> set es.nodes=my.domain.com
>>
>>
>> in the same HQL file and then skip the es.nodes setting in TBLPROPERTIES in
>> the external table declarations in the SAME
>> HQL file. But I wish I could move the set statement into a separate file. I
>> now realize this is really a Hive question.
>> Regards,
>> Jack
>>
>>
>> On Sun, Jun 15, 2014 at 2:19 AM, Costin Leau <[email protected]> wrote:
>>
>> Could you please raise an issue with some type of example? Due to
>> the way Hadoop (and Hive) works,
>> things tend to be tricky in terms of configuring a job.
>>
>> The configuration needs to be created before a job is submitted
>> which in practice means "dynamic configurations"
>> are basically impossible (this also has some security
>> implications which are simply avoided this way).
>> Thus either one specifies the configuration manually or loads a
>> file from a known location (hive-site.xml, core-site.xml...)
>> upfront, before the job is submitted.
>> This means that when dealing with Hive, Pig, Cascading, etc., unless
>> one adds a pre-processor to the job content (script, flow, etc.),
>> by the time es-hadoop kicks in, the job is already running and
>> thus its changes are discarded.
>>
>> Cheers,
>>
>> On 6/14/14 1:57 AM, Jinyuan Zhou wrote:
>>
>> Hi,
>> I am playing with the Elasticsearch and Hive integration. The
>> documentation says
>> to set configuration such as es.nodes and es.port in
>> TBLPROPERTIES. It works,
>> but it can lead to a lot of redundant code. If I have ten data sets
>> to index into the same ES cluster,
>> I have to repeat this information ten times in
>> TBLPROPERTIES. Even if
>> I use variable substitution, I still have to rewrite the
>> substitution variable for each table definition.
>> What I am looking for is a way to put this information in, say, one file
>> and pass its location, in some way, to the hive CLI
>> so that the Hive Elasticsearch integration gets these settings when
>> trying to find the ES server to talk to.
>> I am not looking to put this information into files like
>> hive-site.xml.
>>
>> Thanks,
>>
>> Jack
>>
>> --
>> You received this message because you are subscribed to the
>> Google Groups "elasticsearch" group.
>> To unsubscribe from this group and stop receiving emails from
>> it, send an email to
>> elasticsearch+unsubscribe@googlegroups.com.
>> To view this discussion on the web visit
>> https://groups.google.com/d/msgid/elasticsearch/7040c805-e845-4b3d-a9fe-5e18d8445f7f%40googlegroups.com.
>> For more options, visit https://groups.google.com/d/optout.
>>
>>
>>
>> --
>> Costin
>>
>>
>> --
>> -- Jinyuan (Jack) Zhou
>>
>>
>
> --
> Costin
>
--
You received this message because you are subscribed to the Google Groups
"elasticsearch" group.
To unsubscribe from this group and stop receiving emails from it, send an email
to [email protected].
To view this discussion on the web visit
https://groups.google.com/d/msgid/elasticsearch/CANBTPCGhqWTJLAWNKmnkMTOWGFizi4wShfvo7V0u0_5HDniDkg%40mail.gmail.com.
For more options, visit https://groups.google.com/d/optout.