Thank you for the response. localhost and 192.168.128.1 are actually the same ES host - I installed ES and the Cloudera VM on XP. I will try your suggestion though and report back. I will also try the table without the timestamp column, as sketched below.
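A minimal sketch of that redefinition, keeping sold as a plain STRING instead of TIMESTAMP; the cars3 name is just a placeholder, everything else as in the cars2 definition quoted below:

-- sold mapped as STRING to sidestep the suspected timestamp handling
CREATE EXTERNAL TABLE cars3 (color STRING, make STRING, price BIGINT, sold STRING)
STORED BY 'org.elasticsearch.hadoop.hive.EsStorageHandler'
TBLPROPERTIES('es.resource' = 'cars/transactions',
              'es.nodes' = '192.168.128.1', 'es.port'='9200');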
> On Jun 13, 2014, at 1:59 PM, Costin Leau <[email protected]> wrote:
>
> Hi,
>
> Sorry for the delayed response, travel and other things got in the way. I
> have tried replicating the issue on my end and couldn't; see below:
>
>> On 6/8/14 8:03 PM, elitem way wrote:
>> I am learning elasticsearch-hadoop. I have a few issues that I do not
>> understand. I am using ES 1.12 on Windows, elasticsearch-hadoop-2.0.0 and
>> the cloudera-quickstart-vm-5.0.0-0-vmware sandbox with Hive.
>>
>> 1. I loaded only 6 rows into the ES index cars/transactions. Why did Hive
>> return 14 rows instead? See below.
>> 2. "select count(*) from cars2" failed with code 2. "Group by" and "sum"
>> also failed. Did I miss anything? Similar queries are successful when
>> using the sample_07 and sample_08 tables that come with Hive.
>> 3. elasticsearch-hadoop-2.0.0 does not seem to work with jetty (the
>> authentication plugin). I got errors when I enabled jetty and set
>> 'es.nodes' = 'superuser:[email protected]'.
>> 4. I could not pipe data from Hive to Elasticsearch either.
>>
>> *--ISSUE 1*:
>> --load data to ES
>> POST: http://localhost:9200/cars/transactions/_bulk
>> { "index": {}}
>> { "price" : 30000, "color" : "green", "make" : "ford", "sold" : "2014-05-18" }
>> { "index": {}}
>> { "price" : 15000, "color" : "blue", "make" : "toyota", "sold" : "2014-07-02" }
>> { "index": {}}
>> { "price" : 12000, "color" : "green", "make" : "toyota", "sold" : "2014-08-19" }
>> { "index": {}}
>> { "price" : 20000, "color" : "red", "make" : "honda", "sold" : "2014-11-05" }
>> { "index": {}}
>> { "price" : 80000, "color" : "red", "make" : "bmw", "sold" : "2014-01-01" }
>> { "index": {}}
>> { "price" : 25000, "color" : "blue", "make" : "ford", "sold" : "2014-02-12" }
>>
>> CREATE EXTERNAL TABLE cars2 (color STRING, make STRING, price BIGINT, sold TIMESTAMP)
>> STORED BY 'org.elasticsearch.hadoop.hive.EsStorageHandler'
>> TBLPROPERTIES('es.resource' = 'cars/transactions',
>> 'es.nodes' = '192.168.128.1', 'es.port'='9200');
>>
>> HIVE: select * from cars2;
>> 14 rows returned.
>>
>>    color  make    price  sold
>> 0  red    honda   20000  2014-11-05 00:00:00.0
>> 1  red    honda   10000  2014-10-28 00:00:00.0
>> 2  green  ford    30000  2014-05-18 00:00:00.0
>> 3  green  toyota  12000  2014-08-19 00:00:00.0
>> 4  blue   ford    25000  2014-02-12 00:00:00.0
>> 5  blue   toyota  15000  2014-07-02 00:00:00.0
>> 6  red    bmw     80000  2014-01-01 00:00:00.0
>> 7  red    honda   10000  2014-10-28 00:00:00.0
>> 8  blue   toyota  15000  2014-07-02 00:00:00.0
>> 9  red    honda   20000  2014-11-05 00:00:00.0
>> 10 green  ford    30000  2014-05-18 00:00:00.0
>> 11 green  toyota  12000  2014-08-19 00:00:00.0
>> 12 red    honda   20000  2014-11-05 00:00:00.0
>> 13 red    honda   20000  2014-11-05 00:00:00.0
>> 14 red    bmw     80000  2014-01-01 00:00:00.0
>
> It looks like you are adding data to localhost:9200 but querying
> 192.168.128.1:9200 - most likely they are different, hence the different
> data set. To double-check, do a query/count through curl on ES and then
> check the data through Hive - that's what we do in our tests.
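For instance, that check can be run with curl against both addresses, using the index/type from this thread - if the two counts differ, the nodes are not the same ES instance:

curl 'http://localhost:9200/cars/transactions/_count'
curl 'http://192.168.128.1:9200/cars/transactions/_count'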
>> *ISSUE2:*
>>
>> HIVE: select count(*) from cars2;
>>
>> Your query has the following error(s):
>> Error while processing statement: FAILED: Execution Error, return code 2
>> from org.apache.hadoop.hive.ql.exec.mr.MapRedTask
>
> Again, since you are querying a different host it's hard to tell what the
> issue is. count(*) works in our tests, but I've seen cases where count
> fails when dealing with the newly introduced types (like timestamp). You
> can use count(1) as an alternative, which should work just fine.
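A sketch of that workaround against the same table; the grouped variant is an assumption, since the thread does not confirm whether it avoids the same failure:

HIVE: select count(1) from cars2;
HIVE: select make, count(1) from cars2 group by make;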
>> *--ISSUE 4:*
>>
>> CREATE EXTERNAL TABLE test1 (
>> description STRING)
>> STORED BY 'org.elasticsearch.hadoop.hive.EsStorageHandler'
>> TBLPROPERTIES('es.host' = '192.168.128.1', 'es.port'='9200',
>> 'es.resource' = 'test1');
>>
>> INSERT OVERWRITE TABLE test1 select description from sample_07;
>>
>> Your query has the following error(s):
>>
>> Error while processing statement: FAILED: Execution Error, return code 2
>> from org.apache.hadoop.hive.ql.exec.mr.MapRedTask
>
> That is because you have an invalid table definition; the resource needs
> to point to an "index/type", not just an index - if you look deep into
> the Hive exception, you should be able to see the actual validation
> message. Since Hive executes things lazily and on the server side,
> there's no other way of reporting the error to the user...
>
> Hope this helps,
>
> --
> Costin
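Putting that together, a corrected definition for issue 4 would point es.resource at an index/type pair. The 'entries' type name is only a placeholder, and es.nodes replaces es.host to match the cars2 definition earlier in the thread:

CREATE EXTERNAL TABLE test1 (
  description STRING)
STORED BY 'org.elasticsearch.hadoop.hive.EsStorageHandler'
TBLPROPERTIES('es.resource' = 'test1/entries',
              'es.nodes' = '192.168.128.1', 'es.port'='9200');

INSERT OVERWRITE TABLE test1 select description from sample_07;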