I am learning elasticsearch-hadoop and have a few issues that I do not 
understand. I am using ES 1.12 on Windows, elasticsearch-hadoop-2.0.0, and the 
cloudera-quickstart-vm-5.0.0-0-vmware sandbox with Hive.

1. I loaded only 6 rows into the ES index cars/transactions. Why did Hive return 
14 rows instead? See below.
2. "select count(*) from cars2" failed with return code 2. "Group by" and "sum" 
also failed. Did I miss anything? Similar queries succeed when using the 
sample_07 and sample_08 tables that come with Hive.
3. elasticsearch-hadoop-2.0.0 does not seem to work with jetty, the 
authentication plugin. I got errors when I enabled jetty and set 'es.nodes' 
= 'superuser:[email protected]'
4. I could not pipe data from Hive to Elasticsearch either.

*--ISSUE 1:*
--load data to ES
POST: http://localhost:9200/cars/transactions/_bulk
{ "index": {}}
{ "price" : 30000, "color" : "green", "make" : "ford", "sold" : "2014-05-18" }
{ "index": {}}
{ "price" : 15000, "color" : "blue", "make" : "toyota", "sold" : "2014-07-02" }
{ "index": {}}
{ "price" : 12000, "color" : "green", "make" : "toyota", "sold" : "2014-08-19" }
{ "index": {}}
{ "price" : 20000, "color" : "red", "make" : "honda", "sold" : "2014-11-05" }
{ "index": {}}
{ "price" : 80000, "color" : "red", "make" : "bmw", "sold" : "2014-01-01" }
{ "index": {}}
{ "price" : 25000, "color" : "blue", "make" : "ford", "sold" : "2014-02-12" }
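
For anyone reproducing this: the _bulk endpoint expects each action and each document on its own single line, with a newline at the end of the body. A minimal sketch of generating such a body for the six documents, using only the Python standard library (the helper name build_bulk_body is hypothetical and not part of any Elasticsearch client):

```python
import json

# The six documents from the bulk request in this post.
CARS = [
    {"price": 30000, "color": "green", "make": "ford", "sold": "2014-05-18"},
    {"price": 15000, "color": "blue", "make": "toyota", "sold": "2014-07-02"},
    {"price": 12000, "color": "green", "make": "toyota", "sold": "2014-08-19"},
    {"price": 20000, "color": "red", "make": "honda", "sold": "2014-11-05"},
    {"price": 80000, "color": "red", "make": "bmw", "sold": "2014-01-01"},
    {"price": 25000, "color": "blue", "make": "ford", "sold": "2014-02-12"},
]


def build_bulk_body(docs):
    """Return a _bulk body: one action line followed by one source line per document."""
    lines = []
    for doc in docs:
        # Empty "index" action: index and type come from the request URL.
        lines.append(json.dumps({"index": {}}))
        # json.dumps emits no newlines by default, so each document stays on one line.
        lines.append(json.dumps(doc))
    # The bulk API requires the body to be terminated by a newline.
    return "\n".join(lines) + "\n"


body = build_bulk_body(CARS)
```

The resulting string can be sent as the body of the POST to cars/transactions/_bulk with any HTTP client.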

CREATE EXTERNAL TABLE cars2 (color STRING, make STRING, price BIGINT, sold TIMESTAMP)
STORED BY 'org.elasticsearch.hadoop.hive.EsStorageHandler'
TBLPROPERTIES('es.resource' = 'cars/transactions',
'es.nodes' = '192.168.128.1', 'es.port'='9200');

HIVE: select * from cars2;
14 rows returned.

    color  make    price  sold
0   red    honda   20000  2014-11-05 00:00:00.0
1   red    honda   10000  2014-10-28 00:00:00.0
2   green  ford    30000  2014-05-18 00:00:00.0
3   green  toyota  12000  2014-08-19 00:00:00.0
4   blue   ford    25000  2014-02-12 00:00:00.0
5   blue   toyota  15000  2014-07-02 00:00:00.0
6   red    bmw     80000  2014-01-01 00:00:00.0
7   red    honda   10000  2014-10-28 00:00:00.0
8   blue   toyota  15000  2014-07-02 00:00:00.0
9   red    honda   20000  2014-11-05 00:00:00.0
10  green  ford    30000  2014-05-18 00:00:00.0
11  green  toyota  12000  2014-08-19 00:00:00.0
12  red    honda   20000  2014-11-05 00:00:00.0
13  red    honda   20000  2014-11-05 00:00:00.0
14  red    bmw     80000  2014-01-01 00:00:00.0


*--ISSUE 2:*

HIVE: select count(*) from cars2;

Your query has the following error(s):
Error while processing statement: FAILED: Execution Error, return code 2 
from org.apache.hadoop.hive.ql.exec.mr.MapRedTask


*--ISSUE 4:*

CREATE EXTERNAL TABLE test1 (
        description STRING)
STORED BY 'org.elasticsearch.hadoop.hive.EsStorageHandler'
TBLPROPERTIES('es.host' = '192.168.128.1', 'es.port'='9200', 'es.resource' = 'test1');

INSERT OVERWRITE TABLE test1 select description from sample_07;

Your query has the following error(s):

Error while processing statement: FAILED: Execution Error, return code 2 
from org.apache.hadoop.hive.ql.exec.mr.MapRedTask
