Hi Martijn,

A couple of hours ago I tried to submit a bug to the ES GitHub issues, and 
while writing up the steps to reproduce I realized one more thing.

*It happens only if you update the same child document within one bulk 
request.*

I could not reproduce the "arithmetic progression" effect by curling my 
localhost, but it is still reproducible from Java code doing a bulk update 
(script + upsert doc). 
I understand that bulk-updating the same document is a pretty ugly thing, 
and I was surprised that it worked normally (without version-conflict 
exceptions) from the Java client. 

In case it is helpful, here are the steps and queries to curl your 
localhost with a parent-child setup.
Unfortunately I don't know how to create a curl command with bulk updates. 


#Create index "test" with parent-child mappings
curl -XPUT localhost:9200/test -d '{"mappings":{"root":{"properties":{"country":{"type":"string"}}},"metric":{"_parent":{"type":"root"},"properties":{"count":{"type":"long"}}}}}'

#Index parent document:
curl -XPUT localhost:9200/test/root/1 -d '{"country":"de"}'

#Index child document:
curl -XPUT 'http://localhost:9200/test/metric/1?parent=1' -d '{"count":1}'
#Update child document:
curl -XPOST 'http://localhost:9200/test/metric/1/_update?parent=1' -d '{"script":"ctx._source.count+=ct", "params":{"ct":1}}'
#Query with benchmark query, it should return 2
curl -XGET localhost:9200/test/_search -d '{"size":0,"query":{"match_all":{}},"aggs":{"requests":{"sum":{"field":"count"}}}}'
#Query with child aggregation query, expected 2
curl -XGET localhost:9200/test/metric/_search -d '{"size":0,"query":{"match_all":{}},"aggs":{"child":{"children":{"type":"metric"},"aggs":{"requests":{"sum":{"field":"count"}}}}}}'
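
For the bulk case, my best guess at a curl equivalent is the sketch below. It is untested and I have not confirmed that it reproduces the problem. It assumes the standard _bulk endpoint with "update" actions, the same index and types as above, and a scratch file name (bulk_updates.json) that I made up for illustration. It sends four script+upsert updates for the same child document in one request:

```shell
# Untested sketch: update the SAME child document four times in one bulk request.
# Each update takes two lines: the action metadata, then the update payload.
BULK_FILE=bulk_updates.json   # hypothetical scratch file name
: > "$BULK_FILE"              # start with an empty file
for i in 1 2 3 4; do
  printf '%s\n' '{"update":{"_index":"test","_type":"metric","_id":"1","_parent":"1"}}' >> "$BULK_FILE"
  printf '%s\n' '{"script":"ctx._source.count+=ct","params":{"ct":1},"upsert":{"count":1}}' >> "$BULK_FILE"
done

# The bulk body is newline-delimited JSON, so send it with --data-binary
# (plain -d strips the newlines when reading from a file).
curl -s --max-time 10 -XPOST 'localhost:9200/_bulk?refresh=true' --data-binary "@$BULK_FILE" \
  || echo "Elasticsearch not reachable on localhost:9200"
```

After that, the two search queries above can be run again to compare the plain sum with the child aggregation sum.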



Thank you

On Tuesday, October 21, 2014 3:33:35 PM UTC+2, Martijn v Groningen wrote:
>
> Hi Vlad,
>
> What you're describing shouldn't happen. The child docs should get 
> detached. I think this is a bug.
> Let me verify and get back to you.
>
> Martijn
>
> On 21 October 2014 13:26, Vlad Vlaskin <[email protected]> wrote:
>
>> After some experiments I believe I found the cause of the discrepancy 
>> problem:
>>
>> *Elasticsearch does not detach the old versions of a child document from 
>> the parent-child aggregation after an update, and still counts them in 
>> the children aggregation.*
>>
>> E.g. my child document was updated 4 times by script (within one bulk 
>> update), so it has gone through 4 versions:
>> { "count": 1}, { "count": 2}, { "count": 3}, { "count": 4}
>>
>> A query for the child document (after refresh) shows the proper version: 
>> {"count": 4}
>>
>> But the child aggregation {"sum":{"field":"count"}} shows 10, because:
>>
>> 1 + 2 + 3 + 4 = 10
>>
>> This holds consistently (e.g. after 5 updates you get 15). 
>>
>> That explains the behavior described in my original post.
>>
>>
>>
>>
>>
>> On Tuesday, October 21, 2014 3:18:47 AM UTC+2, Vlad Vlaskin wrote:
>>>
>>> Dear ES group,
>>> we've been using ES in production for a while and eagerly test all 
>>> new features, such as cardinality and others.
>>>
>>> We are trying data modeling with parent-child relations (ES version 
>>> 1.4.0.Beta1, 8 nodes, EC2 r3.xlarge, SSD, lots of RAM, etc.)
>>> with the following data model: 
>>> *Parent*
>>> {
>>>   "key": "value"  
>>> }
>>>
>>> and a timeline with children, holding metrics:
>>>
>>> *Child* (type "metrics")
>>> {
>>>   "day": "2014-10-20",
>>>   "count": 10
>>> }
>>>
>>> We update metric documents and index them with script+upsert.
>>> The problem is that the query below *yields two different results in a 
>>> round-robin fashion.*
>>> E.g. the first time you call it you receive the first number, a second 
>>> later you receive the second number, then the first again, etc. 
>>>
>>> {
>>>     "size": 0,
>>>     "query": {
>>>         "match_all": {}
>>>     },
>>>     "aggs": {
>>>         "MY_FIELD": {
>>>             "terms": {
>>>                 "field": "FIELD-XYZ"             // parent term aggregation
>>>             },
>>>             "aggs": {
>>>                 "children": {
>>>                     "children": {
>>>                         "type": "metrics"        // child aggregation of type "metrics"
>>>                     },
>>>                     "aggs": {
>>>                         "requests": {
>>>                             "sum": {
>>>                                 "field": "count" // target aggregation within child documents
>>>                             } 
>>>                         }
>>>                     }
>>>                 }
>>>             }
>>>         }
>>>     }
>>> }
>>>
>>>  Result A: 
>>> "aggregations": {
>>>       "MY_FIELD": {
>>>          "doc_count_error_upper_bound": 0,
>>>          "buckets": [
>>>             {
>>>                "key": "xx",
>>>                "doc_count": 283322,
>>>                "children": {
>>>                   "doc_count": 3740372,
>>>                   "requests": {
>>>                      "value": *5801652297*
>>>                   }
>>>                }
>>>             }
>>>          ]
>>>       }
>>>    }
>>>
>>> Result B:
>>> "aggregations": {
>>>       "MY_FIELD": {
>>>          "doc_count_error_upper_bound": 0,
>>>          "buckets": [
>>>             {
>>>                "key": "xx",
>>>                "doc_count": 302421,
>>>                "children": {
>>>                   "doc_count": 1877361,
>>>                   "requests": {
>>>                      "value": *2965346170*
>>>                   }
>>>                }
>>>             }
>>>          ]
>>>       }
>>>    }
>>>
>>> The switching back and forth between A and B is stable and 
>>> reproducible. 
>>> The ES logs are clean. 
>>>
>>> Could someone help with some ideas here?
>>>
>>> Thank you!
>>>
>>> Vlad
>>>
>>>  -- 
>> You received this message because you are subscribed to the Google Groups 
>> "elasticsearch" group.
>> To unsubscribe from this group and stop receiving emails from it, send an 
>> email to [email protected] <javascript:>.
>> To view this discussion on the web visit 
>> https://groups.google.com/d/msgid/elasticsearch/2ce80724-b3d9-4d58-b54e-15727f999564%40googlegroups.com
>>  
>> <https://groups.google.com/d/msgid/elasticsearch/2ce80724-b3d9-4d58-b54e-15727f999564%40googlegroups.com?utm_medium=email&utm_source=footer>
>> .
>>
>> For more options, visit https://groups.google.com/d/optout.
>>
>
>
>
> -- 
> Met vriendelijke groet,
>
> Martijn van Groningen 
>
