One more update on the issue:
I tried changing the query to use 'sum' instead:
{
  "size": 0,
  "aggs": {
    "group_by_BodyPart": {
      "terms": {
        "field": "body_part",
        "size": 5,
        "order": { "examcount": "desc" }
      },
      "aggs": {
        "examcount": { "sum": { "field": "ExamRowKey" } }
      }
    }
  }
}
But both of these queries return the entire search result records to the
Mapper method, in spite of "size": 5 being set in the query string.
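
For reference, here is a minimal sketch of what the query is meant to compute, written in plain Python over in-memory records rather than against Elasticsearch. The function name top_body_parts and the sample values ("CHEST", "HEAD") are made up for illustration. The distinct count here is exact, whereas the cardinality aggregation is approximate, and the 'sum' variant assumes ExamRowKey holds numeric strings.

```python
from collections import defaultdict


def top_body_parts(records, size=5, metric="cardinality"):
    """Group records by body_part and rank the groups by a metric on ExamRowKey."""
    groups = defaultdict(list)
    for rec in records:
        groups[rec["body_part"]].append(rec["ExamRowKey"])

    def score(keys):
        if metric == "cardinality":
            # Exact distinct count (the real aggregation is approximate).
            return len(set(keys))
        if metric == "sum":
            # Assumes every key is a numeric string.
            return sum(int(k) for k in keys)
        raise ValueError(f"unsupported metric: {metric}")

    # Equivalent of "order": {"examcount": "desc"} followed by "size": N.
    ranked = sorted(
        ((part, score(keys)) for part, keys in groups.items()),
        key=lambda item: item[1],
        reverse=True,
    )
    return ranked[:size]


# Hypothetical sample data, not taken from the real index.
records = [
    {"body_part": "CHEST", "ExamRowKey": "1"},
    {"body_part": "CHEST", "ExamRowKey": "2"},
    {"body_part": "CHEST", "ExamRowKey": "2"},
    {"body_part": "HEAD", "ExamRowKey": "3"},
]
print(top_body_parts(records))                # [('CHEST', 2), ('HEAD', 1)]
print(top_body_parts(records, metric="sum"))  # [('CHEST', 5), ('HEAD', 3)]
```

This is the shape of result I expect back: at most 5 rows, one per body_part, instead of every matching document.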
Thanks,
Sona
On Monday, August 25, 2014 3:17:03 PM UTC+5:30, Sona Samad wrote:
>
> Thanks Adrien.
>
> ExamRowKey and body_part are both strings, loaded from a CSV file into
> Elasticsearch using Logstash.
>
> - how reproducible is it? I.e., if you run this query 10 times, how many of
> these queries will write such lines to the logs?
> This query returns the error every time it is run on the cluster.
> Other, simpler queries return values, e.g.:
> {
>   "query": {
>     "term": { "ExamRowKey": "4090741090" }
>   }
> }
>
> - is it common that several queries will be executing at the same time on
> your elasticsearch cluster?
> For testing purposes, I was running only the query specified above.
>
> - are there other exceptions in your logs that happen approximately at the
> same time?
> No, the stack trace I posted contains the only errors I got today after
> running the query.
>
> Thanks,
> Sona
>
> On Monday, August 25, 2014 1:34:26 PM UTC+5:30, Adrien Grand wrote:
>>
>> Thanks Sona,
>>
>> This stack trace indicates a bug in the cardinality aggregation. I just
>> opened an issue for it:
>> https://github.com/elasticsearch/elasticsearch/issues/7429
>>
>> In order to help me understand/reproduce this bug, could you please
>> provide the mappings of your ExamRowKey and body_part fields? Also answers
>> to the questions below would help me understand better what is happening:
>> - how reproducible is it? I.e., if you run this query 10 times, how many
>> of these queries will write such lines to the logs?
>> - is it common that several queries will be executing at the same time
>> on your elasticsearch cluster?
>> - are there other exceptions in your logs that happen approximately at
>> the same time?
>>
>> Thanks!
>>
>>
>>
>> On Mon, Aug 25, 2014 at 6:10 AM, Sona Samad <[email protected]> wrote:
>>
>>> Hi Adrien,
>>>
>>> My Elasticsearch version is elasticsearch-1.2.1.
>>>
>>> The Maven dependency for Hadoop:
>>>
>>> <dependency>
>>>   <groupId>org.elasticsearch</groupId>
>>>   <artifactId>elasticsearch-hadoop-mr</artifactId>
>>>   <version>2.0.1</version>
>>> </dependency>
>>>
>>>
>>> The full stack trace is given below:
>>>
>>> [2014-08-25 09:31:58,892][DEBUG][action.search.type ] [Thane Ector] [mr][4], node[1ZbXSvkKQC-kDvgMXuC8iQ], [P], s[STARTED]: Failed to execute [org.elasticsearch.action.search.SearchRequest@6ed78f6d]
>>> org.elasticsearch.search.query.QueryPhaseExecutionException: [mr][4]: query[ConstantScore(cache(_type:logs))],from[0],size[50]: Query Failed [Failed to execute main query]
>>>     at org.elasticsearch.search.query.QueryPhase.execute(QueryPhase.java:162)
>>>     at org.elasticsearch.search.SearchService.executeScan(SearchService.java:215)
>>>     at org.elasticsearch.search.action.SearchServiceTransportAction$19.call(SearchServiceTransportAction.java:444)
>>>     at org.elasticsearch.search.action.SearchServiceTransportAction$19.call(SearchServiceTransportAction.java:441)
>>>     at org.elasticsearch.search.action.SearchServiceTransportAction$23.run(SearchServiceTransportAction.java:517)
>>>     at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1145)
>>>     at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:615)
>>>     at java.lang.Thread.run(Thread.java:745)
>>> Caused by: java.lang.ArrayIndexOutOfBoundsException: 97
>>>     at org.elasticsearch.common.util.BigArrays$IntArrayWrapper.set(BigArrays.java:185)
>>>     at org.elasticsearch.search.aggregations.metrics.cardinality.HyperLogLogPlusPlus$Hashset.values(HyperLogLogPlusPlus.java:499)
>>>     at org.elasticsearch.search.aggregations.metrics.cardinality.HyperLogLogPlusPlus.upgradeToHll(HyperLogLogPlusPlus.java:307)
>>>     at org.elasticsearch.search.aggregations.metrics.cardinality.HyperLogLogPlusPlus.collectLcEncoded(HyperLogLogPlusPlus.java:245)
>>>     at org.elasticsearch.search.aggregations.metrics.cardinality.HyperLogLogPlusPlus.collectLc(HyperLogLogPlusPlus.java:239)
>>>     at org.elasticsearch.search.aggregations.metrics.cardinality.HyperLogLogPlusPlus.collect(HyperLogLogPlusPlus.java:231)
>>>     at org.elasticsearch.search.aggregations.metrics.cardinality.CardinalityAggregator$DirectCollector.collect(CardinalityAggregator.java:204)
>>>     at org.elasticsearch.search.aggregations.metrics.cardinality.CardinalityAggregator.collect(CardinalityAggregator.java:118)
>>>     at org.elasticsearch.search.aggregations.bucket.BucketsAggregator.collectBucketNoCounts(BucketsAggregator.java:74)
>>>     at org.elasticsearch.search.aggregations.bucket.BucketsAggregator.collectExistingBucket(BucketsAggregator.java:63)
>>>     at org.elasticsearch.search.aggregations.bucket.terms.GlobalOrdinalsStringTermsAggregator.collect(GlobalOrdinalsStringTermsAggregator.java:98)
>>>     at org.elasticsearch.search.aggregations.AggregationPhase$AggregationsCollector.collect(AggregationPhase.java:157)
>>>     at org.elasticsearch.common.lucene.MultiCollector.collect(MultiCollector.java:60)
>>>     at org.apache.lucene.search.Weight$DefaultBulkScorer.scoreAll(Weight.java:193)
>>>     at org.apache.lucene.search.Weight$DefaultBulkScorer.score(Weight.java:163)
>>>     at org.apache.lucene.search.BulkScorer.score(BulkScorer.java:35)
>>>     at org.apache.lucene.search.IndexSearcher.search(IndexSearcher.java:621)
>>>     at org.elasticsearch.search.internal.ContextIndexSearcher.search(ContextIndexSearcher.java:175)
>>>     at org.apache.lucene.search.IndexSearcher.search(IndexSearcher.java:309)
>>>     at org.elasticsearch.search.query.QueryPhase.execute(QueryPhase.java:116)
>>>     ... 7 more
>>> [2014-08-25 09:31:58,894][DEBUG][action.search.type ] [Thane Ector] All shards failed for phase: [init_scan]
>>>
>>> Thanks,
>>> Sona
>>>
>>>
>>> On Friday, August 22, 2014 5:07:33 PM UTC+5:30, Sona Samad wrote:
>>>
>>>> Hi,
>>>>
>>>> I was trying to run the query below from Hadoop MapReduce:
>>>>
>>>> {
>>>>   "aggs": {
>>>>     "group_by_body_part": {
>>>>       "terms": {
>>>>         "field": "body_part",
>>>>         "size": 5,
>>>>         "order": { "examcount": "desc" }
>>>>       },
>>>>       "aggs": {
>>>>         "examcount": {
>>>>           "cardinality": {
>>>>             "field": "ExamRowKey"
>>>>           }
>>>>         }
>>>>       }
>>>>     }
>>>>   }
>>>> }
>>>>
>>>> The query returns more than 5 records, even though the size is set to 5.
>>>> Also, the result is not aggregated; instead, entire records from the
>>>> index are passed as values to the mapper.
>>>>
>>>> Also the following error is logged:
>>>>
>>>> [2014-08-22 16:06:21,459][DEBUG][action.search.type ] [Algrim the Strong] All shards failed for phase: [init_scan]
>>>> [2014-08-22 16:26:38,875][DEBUG][action.search.type ] [Algrim the Strong] [mr][0], node[r9u9daW_TkqTBBeazKJQNw], [P], s[STARTED]: Failed to execute [org.elasticsearch.action.search.SearchRequest@31b5b771]
>>>> org.elasticsearch.search.query.QueryPhaseExecutionException: [mr][0]: query[ConstantScore(cache(_type:logs))],from[0],size[50]: Query Failed [Failed to execute main query]
>>>>     at org.elasticsearch.search.query.QueryPhase.execute(QueryPhase.java:162)
>>>>     at org.elasticsearch.search.SearchService.executeScan(SearchService.java:215)
>>>>     at org.elasticsearch.search.action.SearchServiceTransportAction$19.call(SearchServiceTransportAction.java:444)
>>>>     at org.elasticsearch.search.action.SearchServiceTransportAction$19.call(SearchServiceTransportAction.java:441)
>>>>     at org.elasticsearch.search.action.SearchServiceTransportAction$23.run(SearchServiceTransportAction.java:517)
>>>>     at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1145)
>>>>     at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:615)
>>>>     at java.lang.Thread.run(Thread.java:745)
>>>> Caused by: java.lang.ArrayIndexOutOfBoundsException
>>>>
>>>>
>>>> Could you please help me create the correct query?
>>>>
>>>> Thanks,
>>>> Sona
>>>>
>>>> --
>>> You received this message because you are subscribed to the Google
>>> Groups "elasticsearch" group.
>>> To unsubscribe from this group and stop receiving emails from it, send
>>> an email to [email protected].
>>> To view this discussion on the web visit
>>> https://groups.google.com/d/msgid/elasticsearch/2e0101ea-3f95-4a5e-94c8-161f0b2d0fa1%40googlegroups.com
>>>
>>> For more options, visit https://groups.google.com/d/optout.
>>>
>>
>>
>>
>> --
>> Adrien Grand
>>
>