I am learning elasticsearch-hadoop and have run into a few issues that I do not understand. I am using ES 1.12 on Windows, elasticsearch-hadoop-2.0.0, and the cloudera-quickstart-vm-5.0.0-0-vmware sandbox with Hive.

1. I loaded only 6 rows into the ES index cars/transactions. Why did Hive return 14 rows instead? See below.
2. "select count(*) from cars2" failed with return code 2. "Group by" and "sum" queries also failed. Did I miss anything? Similar queries succeed against the sample_07 and sample_08 tables that come with Hive.
3. elasticsearch-hadoop-2.0.0 does not seem to work with jetty, the authentication plugin. I got errors when I enabled jetty and set 'es.nodes' = 'superuser:[email protected]'.
4. I could not pipe data from Hive to Elasticsearch either.

*--ISSUE 1*:

--load data to ES
POST: http://localhost:9200/cars/transactions/_bulk
{ "index": {}}
{ "price" : 30000, "color" : "green", "make" : "ford", "sold" : "2014-05-18" }
{ "index": {}}
{ "price" : 15000, "color" : "blue", "make" : "toyota", "sold" : "2014-07-02" }
{ "index": {}}
{ "price" : 12000, "color" : "green", "make" : "toyota", "sold" : "2014-08-19" }
{ "index": {}}
{ "price" : 20000, "color" : "red", "make" : "honda", "sold" : "2014-11-05" }
{ "index": {}}
{ "price" : 80000, "color" : "red", "make" : "bmw", "sold" : "2014-01-01" }
{ "index": {}}
{ "price" : 25000, "color" : "blue", "make" : "ford", "sold" : "2014-02-12" }

CREATE EXTERNAL TABLE cars2 (color STRING, make STRING, price BIGINT, sold TIMESTAMP)
STORED BY 'org.elasticsearch.hadoop.hive.EsStorageHandler'
TBLPROPERTIES('es.resource' = 'cars/transactions', 'es.nodes' = '192.168.128.1', 'es.port'='9200');

HIVE: select * from cars2;
14 rows returned.

     color  make    price  sold
0    red    honda   20000  2014-11-05 00:00:00.0
1    red    honda   10000  2014-10-28 00:00:00.0
2    green  ford    30000  2014-05-18 00:00:00.0
3    green  toyota  12000  2014-08-19 00:00:00.0
4    blue   ford    25000  2014-02-12 00:00:00.0
5    blue   toyota  15000  2014-07-02 00:00:00.0
6    red    bmw     80000  2014-01-01 00:00:00.0
7    red    honda   10000  2014-10-28 00:00:00.0
8    blue   toyota  15000  2014-07-02 00:00:00.0
9    red    honda   20000  2014-11-05 00:00:00.0
10   green  ford    30000  2014-05-18 00:00:00.0
11   green  toyota  12000  2014-08-19 00:00:00.0
12   red    honda   20000  2014-11-05 00:00:00.0
13   red    honda   20000  2014-11-05 00:00:00.0
14   red    bmw     80000  2014-01-01 00:00:00.0

*--ISSUE 2*:

HIVE: select count(*) from cars2;

Your query has the following error(s):
Error while processing statement: FAILED: Execution Error, return code 2 from org.apache.hadoop.hive.ql.exec.mr.MapRedTask

*--ISSUE 4:*

CREATE EXTERNAL TABLE test1 (description STRING)
STORED BY 'org.elasticsearch.hadoop.hive.EsStorageHandler'
TBLPROPERTIES('es.host' = '192.168.128.1', 'es.port'='9200', 'es.resource' = 'test1');

INSERT OVERWRITE TABLE test1 select description from sample_07;

Your query has the following error(s):
Error while processing statement: FAILED: Execution Error, return code 2 from org.apache.hadoop.hive.ql.exec.mr.MapRedTask
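
To make Issue 1 easier to reason about, the pasted Hive rows can be tallied against the six documents sent in the bulk request. The following is only a minimal Python sketch and not part of the original setup: the names `LOADED`, `HIVE_ROWS`, `tally`, and `unexpected` are made up for illustration, and the data is copied by hand from the listings above.

```python
from collections import Counter

# The six documents sent in the bulk request, as (color, make, price, sold).
LOADED = [
    ("green", "ford", 30000, "2014-05-18"),
    ("blue", "toyota", 15000, "2014-07-02"),
    ("green", "toyota", 12000, "2014-08-19"),
    ("red", "honda", 20000, "2014-11-05"),
    ("red", "bmw", 80000, "2014-01-01"),
    ("blue", "ford", 25000, "2014-02-12"),
]

# Rows copied by hand from the pasted "select * from cars2" output
# (indexes 0 through 14), with the time-of-day part dropped.
HIVE_ROWS = [
    ("red", "honda", 20000, "2014-11-05"),
    ("red", "honda", 10000, "2014-10-28"),
    ("green", "ford", 30000, "2014-05-18"),
    ("green", "toyota", 12000, "2014-08-19"),
    ("blue", "ford", 25000, "2014-02-12"),
    ("blue", "toyota", 15000, "2014-07-02"),
    ("red", "bmw", 80000, "2014-01-01"),
    ("red", "honda", 10000, "2014-10-28"),
    ("blue", "toyota", 15000, "2014-07-02"),
    ("red", "honda", 20000, "2014-11-05"),
    ("green", "ford", 30000, "2014-05-18"),
    ("green", "toyota", 12000, "2014-08-19"),
    ("red", "honda", 20000, "2014-11-05"),
    ("red", "honda", 20000, "2014-11-05"),
    ("red", "bmw", 80000, "2014-01-01"),
]


def tally(rows):
    """Count how many times each distinct row appears."""
    return Counter(rows)


def unexpected(rows, loaded):
    """Return the distinct rows that are not among the loaded documents."""
    return sorted(set(rows) - set(loaded))


if __name__ == "__main__":
    # Print each distinct row with the number of times Hive returned it.
    for row, n in sorted(tally(HIVE_ROWS).items()):
        print(n, *row)
    print("not in bulk request:", unexpected(HIVE_ROWS, LOADED))
```

On the pasted data, this shows that most documents come back more than once and that one row (red honda 10000 2014-10-28) is not among the six documents in the bulk request. That may mean the index already held other documents before the load, but this is a guess from the listing and has not been checked against the cluster.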
