One more update on the issue:

I tried changing the query to use 'sum' instead of 'cardinality':

{
  "size": 0,
  "aggs": {
    "group_by_BodyPart": {
      "terms": {
        "field": "body_part",
        "size": 5,
        "order": { "examcount": "desc" }
      },
      "aggs": {
        "examcount": { "sum": { "field": "ExamRowKey" } }
      }
    }
  }
}

But both of these queries return the entire search result records to the Mapper 
method, in spite of "size": 5 being given in the query string.
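
For comparison, here is a minimal Python sketch of how the same aggregation could be built, and its buckets read, if the request were sent straight to the _search endpoint rather than through the MapReduce input format. It is only an illustration under assumptions: the helper names (build_top_body_parts_query, top_buckets) and the sample response are hypothetical, and the response layout (aggregations, then the aggregation name, then buckets with key, doc_count and a sub-aggregation value) is the usual shape of a terms aggregation response, not output captured from this cluster.

```python
import json


def build_top_body_parts_query(metric="cardinality", size=5):
    """Build the terms aggregation from this thread as a plain dict.

    `metric` is the sub-aggregation type, e.g. "cardinality" or "sum".
    """
    return {
        "size": 0,  # ask for no hits, only aggregation results
        "aggs": {
            "group_by_body_part": {
                "terms": {
                    "field": "body_part",
                    "size": size,
                    "order": {"examcount": "desc"},
                },
                "aggs": {
                    "examcount": {metric: {"field": "ExamRowKey"}},
                },
            }
        },
    }


def top_buckets(response, agg_name="group_by_body_part", metric_name="examcount"):
    """Return (bucket key, metric value) pairs from a terms aggregation response."""
    buckets = response["aggregations"][agg_name]["buckets"]
    return [(b["key"], b[metric_name]["value"]) for b in buckets]


if __name__ == "__main__":
    # Request body that would be POSTed to the index's _search endpoint.
    print(json.dumps(build_top_body_parts_query(), indent=2))

    # Hypothetical response, made up purely to show where the buckets live.
    sample_response = {
        "aggregations": {
            "group_by_body_part": {
                "buckets": [
                    {"key": "chest", "doc_count": 12, "examcount": {"value": 9}},
                    {"key": "knee", "doc_count": 7, "examcount": {"value": 5}},
                ]
            }
        }
    }
    print(top_buckets(sample_response))
```

Nothing in this sketch goes through elasticsearch-hadoop; it only shows the request body and where the (at most) five buckets would appear in a response.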


Thanks,
Sona


On Monday, August 25, 2014 3:17:03 PM UTC+5:30, Sona Samad wrote:
>
> Thanks Adrien.
>
> ExamRowKey and body_part are strings uploaded from a CSV file to 
> Elasticsearch using Logstash.
>
> - how reproducible is it? I.e., if you run this query 10 times, how many of 
> these queries will write such lines to the logs?
>     This query returns the error each time it is run in the cluster. 
>     Other simple queries return values, e.g.:
>        {
>          "query": {
>            "term": { "ExamRowKey": "4090741090" }
>          }
>        }
>
> - is it common that several queries will be executing at the same time on 
> your elasticsearch cluster?
>     For testing purposes, I was running only the query specified above.
>
> - are there other exceptions in your logs that happen approximately at the 
> same time?
>     No, the stack trace I have posted is the only error I got today after 
> running the query.
>
> Thanks,
> Sona
>
> On Monday, August 25, 2014 1:34:26 PM UTC+5:30, Adrien Grand wrote:
>>
>> Thanks Sona,
>>
>> This stack trace indicates a bug in the cardinality aggregation. I just 
>> opened an issue for it: 
>> https://github.com/elasticsearch/elasticsearch/issues/7429
>>
>> In order to help me understand/reproduce this bug, could you please 
>> provide the mappings of your ExamRowKey and body_part fields? Also, answers 
>> to the questions below would help me better understand what is happening:
>>  - how reproducible is it? I.e., if you run this query 10 times, how many 
>> of these queries will write such lines to the logs?
>>  - is it common that several queries will be executing at the same time 
>> on your elasticsearch cluster?
>>  - are there other exceptions in your logs that happen approximately at 
>> the same time?
>>
>> Thanks!
>>
>>
>>
>> On Mon, Aug 25, 2014 at 6:10 AM, Sona Samad <[email protected]> wrote:
>>
>>> Hi Adrien,
>>>  
>>> My Elasticsearch version is elasticsearch-1.2.1.
>>>  
>>> The Maven dependency for hadoop:
>>>  
>>> <dependency>
>>>   <groupId>org.elasticsearch</groupId>
>>>   <artifactId>elasticsearch-hadoop-mr</artifactId>
>>>   <version>2.0.1</version>
>>> </dependency> 
>>>  
>>>  
>>> The full stack trace is given below:
>>>  
>>> [2014-08-25 09:31:58,892][DEBUG][action.search.type       ] [Thane Ector] [mr][4], node[1ZbXSvkKQC-kDvgMXuC8iQ], [P], s[STARTED]: Failed to execute [org.elasticsearch.action.search.SearchRequest@6ed78f6d]
>>> org.elasticsearch.search.query.QueryPhaseExecutionException: [mr][4]: query[ConstantScore(cache(_type:logs))],from[0],size[50]: Query Failed [Failed to execute main query]
>>>  at org.elasticsearch.search.query.QueryPhase.execute(QueryPhase.java:162)
>>>  at org.elasticsearch.search.SearchService.executeScan(SearchService.java:215)
>>>  at org.elasticsearch.search.action.SearchServiceTransportAction$19.call(SearchServiceTransportAction.java:444)
>>>  at org.elasticsearch.search.action.SearchServiceTransportAction$19.call(SearchServiceTransportAction.java:441)
>>>  at org.elasticsearch.search.action.SearchServiceTransportAction$23.run(SearchServiceTransportAction.java:517)
>>>  at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1145)
>>>  at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:615)
>>>  at java.lang.Thread.run(Thread.java:745)
>>> Caused by: java.lang.ArrayIndexOutOfBoundsException: 97
>>>  at org.elasticsearch.common.util.BigArrays$IntArrayWrapper.set(BigArrays.java:185)
>>>  at org.elasticsearch.search.aggregations.metrics.cardinality.HyperLogLogPlusPlus$Hashset.values(HyperLogLogPlusPlus.java:499)
>>>  at org.elasticsearch.search.aggregations.metrics.cardinality.HyperLogLogPlusPlus.upgradeToHll(HyperLogLogPlusPlus.java:307)
>>>  at org.elasticsearch.search.aggregations.metrics.cardinality.HyperLogLogPlusPlus.collectLcEncoded(HyperLogLogPlusPlus.java:245)
>>>  at org.elasticsearch.search.aggregations.metrics.cardinality.HyperLogLogPlusPlus.collectLc(HyperLogLogPlusPlus.java:239)
>>>  at org.elasticsearch.search.aggregations.metrics.cardinality.HyperLogLogPlusPlus.collect(HyperLogLogPlusPlus.java:231)
>>>  at org.elasticsearch.search.aggregations.metrics.cardinality.CardinalityAggregator$DirectCollector.collect(CardinalityAggregator.java:204)
>>>  at org.elasticsearch.search.aggregations.metrics.cardinality.CardinalityAggregator.collect(CardinalityAggregator.java:118)
>>>  at org.elasticsearch.search.aggregations.bucket.BucketsAggregator.collectBucketNoCounts(BucketsAggregator.java:74)
>>>  at org.elasticsearch.search.aggregations.bucket.BucketsAggregator.collectExistingBucket(BucketsAggregator.java:63)
>>>  at org.elasticsearch.search.aggregations.bucket.terms.GlobalOrdinalsStringTermsAggregator.collect(GlobalOrdinalsStringTermsAggregator.java:98)
>>>  at org.elasticsearch.search.aggregations.AggregationPhase$AggregationsCollector.collect(AggregationPhase.java:157)
>>>  at org.elasticsearch.common.lucene.MultiCollector.collect(MultiCollector.java:60)
>>>  at org.apache.lucene.search.Weight$DefaultBulkScorer.scoreAll(Weight.java:193)
>>>  at org.apache.lucene.search.Weight$DefaultBulkScorer.score(Weight.java:163)
>>>  at org.apache.lucene.search.BulkScorer.score(BulkScorer.java:35)
>>>  at org.apache.lucene.search.IndexSearcher.search(IndexSearcher.java:621)
>>>  at org.elasticsearch.search.internal.ContextIndexSearcher.search(ContextIndexSearcher.java:175)
>>>  at org.apache.lucene.search.IndexSearcher.search(IndexSearcher.java:309)
>>>  at org.elasticsearch.search.query.QueryPhase.execute(QueryPhase.java:116)
>>>  ... 7 more
>>> [2014-08-25 09:31:58,894][DEBUG][action.search.type       ] [Thane Ector] All shards failed for phase: [init_scan]
>>>  
>>> Thanks,
>>> Sona
>>>  
>>>
>>> On Friday, August 22, 2014 5:07:33 PM UTC+5:30, Sona Samad wrote:
>>>
>>>> Hi,
>>>>
>>>> I was trying to run the query below from Hadoop MapReduce:
>>>>
>>>> {
>>>>   "aggs": {
>>>>     "group_by_body_part": {
>>>>       "terms": {
>>>>         "field": "body_part",
>>>>         "size": 5,
>>>>         "order": { "examcount": "desc" }
>>>>       },
>>>>       "aggs": {
>>>>         "examcount": {
>>>>           "cardinality": {
>>>>             "field": "ExamRowKey"
>>>>           }
>>>>         }
>>>>       }
>>>>     }
>>>>   }
>>>> }
>>>>
>>>> The query returns more than 5 records, even though the size is given 
>>>> as 5. 
>>>> Also, the result is not aggregated; instead, the entire record from 
>>>> the index is returned as the value to the mapper.
>>>>
>>>> Also the following error is logged:
>>>>
>>>> [2014-08-22 16:06:21,459][DEBUG][action.search.type       ] [Algrim the Strong] All shards failed for phase: [init_scan]
>>>> [2014-08-22 16:26:38,875][DEBUG][action.search.type       ] [Algrim the Strong] [mr][0], node[r9u9daW_TkqTBBeazKJQNw], [P], s[STARTED]: Failed to execute [org.elasticsearch.action.search.SearchRequest@31b5b771]
>>>> org.elasticsearch.search.query.QueryPhaseExecutionException: [mr][0]: query[ConstantScore(cache(_type:logs))],from[0],size[50]: Query Failed [Failed to execute main query]
>>>>         at org.elasticsearch.search.query.QueryPhase.execute(QueryPhase.java:162)
>>>>         at org.elasticsearch.search.SearchService.executeScan(SearchService.java:215)
>>>>         at org.elasticsearch.search.action.SearchServiceTransportAction$19.call(SearchServiceTransportAction.java:444)
>>>>         at org.elasticsearch.search.action.SearchServiceTransportAction$19.call(SearchServiceTransportAction.java:441)
>>>>         at org.elasticsearch.search.action.SearchServiceTransportAction$23.run(SearchServiceTransportAction.java:517)
>>>>         at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1145)
>>>>         at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:615)
>>>>         at java.lang.Thread.run(Thread.java:745)
>>>> Caused by: java.lang.ArrayIndexOutOfBoundsException
>>>>
>>>>
>>>> Could you please help me create the correct query?
>>>>
>>>> Thanks,
>>>> Sona
>>>>
>>>>  -- 
>>> You received this message because you are subscribed to the Google 
>>> Groups "elasticsearch" group.
>>> To unsubscribe from this group and stop receiving emails from it, send 
>>> an email to [email protected].
>>> To view this discussion on the web visit 
>>> https://groups.google.com/d/msgid/elasticsearch/2e0101ea-3f95-4a5e-94c8-161f0b2d0fa1%40googlegroups.com
>>>  
>>> <https://groups.google.com/d/msgid/elasticsearch/2e0101ea-3f95-4a5e-94c8-161f0b2d0fa1%40googlegroups.com?utm_medium=email&utm_source=footer>
>>> .
>>>
>>> For more options, visit https://groups.google.com/d/optout.
>>>
>>
>>
>>
>> -- 
>> Adrien Grand
>>  
>
