Hi Martijn,

Could you help with another question concerning this topic?

I read that ES stores parent-child relations on the heap. Could this bug 
prevent some objects from being GC-ed, i.e. is there a memory leak? 
And what happens if the heap is exhausted while more parent-child 
relations keep coming in? 

The reason I'm asking is that our cluster (8 r3.xlarge nodes, etc.) went down 
after 2 days of updating parent-child relations. 
Index volume is tiny, but the number of child documents updated is huge. 

Thank you.

Vlad 


On Tuesday, October 21, 2014 4:38:55 PM UTC+2, Martijn v Groningen wrote:
>
> Hi Vlad,
>
> I opened: https://github.com/elasticsearch/elasticsearch/pull/8180
>
> Many thanks for reporting this issue!
> Apart from this bug, the parent/child model works well, so I recommend 
> keeping it. I don't know exactly when the next 1.4 release will be out, but 
> I expect it within a week or two.
>
> Martijn 
>
>
> On 21 October 2014 16:17, Vlad Vlaskin <[email protected]> wrote:
>
>> Hi Martijn,
>>
>> Great news, thank you!
>>
>> Would you recommend keeping the parent-child data model and waiting for a 
>> release? (Do you have a feeling for the date?)
>>
>> Thank you
>>
>> Vlad
>>
>>
>>
>> On Tuesday, October 21, 2014 4:01:47 PM UTC+2, Martijn v Groningen wrote:
>>>
>>> Hi Vlad, 
>>>
>>> I reproduced it. The children agg doesn't take documents marked as 
>>> deleted into account properly.
>>>
>>> When documents are deleted, they are initially marked as deleted before 
>>> they're removed from the index. This also applies to updates, because an 
>>> update translates into an index + delete. 
>>>
>>> The issue you're experiencing can also happen when not using the bulk 
>>> api. It may just be a bit less likely to manifest.
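
The effect described above can be illustrated with a toy model that does not touch Elasticsearch at all. The sketch below assumes a hypothetical list of stored versions of one child document, each tagged with a deleted flag (1 = marked as deleted, 0 = live). The data layout and variable names are made up for illustration and do not reflect how Lucene actually stores documents.

```shell
#!/bin/sh
# Toy model (hypothetical layout): one "<count> <deleted>" pair per stored version.
# Four writes leave three superseded versions marked as deleted and one live version.
versions='1 1
2 1
3 1
4 0'

# Sum that ignores the deleted flag (the buggy behaviour described in the thread).
sum_all=$(printf '%s\n' "$versions" | awk '{ s += $1 } END { print s }')

# Sum over live versions only (the expected behaviour).
sum_live=$(printf '%s\n' "$versions" | awk '$2 == 0 { s += $1 } END { print s }')

echo "ignoring deletion marks: $sum_all"
echo "live documents only:     $sum_live"
```

The first sum comes out as 10 and the second as 4, which matches the discrepancy reported earlier in the thread for a child document with four versions.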
>>>
>>> The fix for this bug is small. I'll open a PR soon.
>>>
>>> Martijn
>>>
>>> On 21 October 2014 15:51, Vlad Vlaskin <[email protected]> wrote:
>>>
>>>> Hi Martijn,
>>>>
>>>> A couple of hours ago I tried to file a bug in the ES GitHub issues, and 
>>>> while writing up the steps to reproduce I realized one more thing.
>>>>
>>>> *It happens only if you update the same child document within one bulk 
>>>> request.*
>>>>
>>>> I didn't manage to reproduce the "arithmetic progression" effect by 
>>>> curling my localhost, but it is still reproducible from Java code doing 
>>>> bulk updates (script + upsert doc). 
>>>> I understand that bulk-updating the same document is pretty ugly, and I 
>>>> was surprised that it worked normally (without version-conflict 
>>>> exceptions) from the Java client. 
>>>>
>>>> In case it is helpful, here are the steps and queries to curl your 
>>>> localhost with a parent-child setup.
>>>> Unfortunately I don't know how to write a curl command with bulk updates. 
>>>>
>>>>
>>>> # Create index "test" with parent-child mappings
>>>> curl -XPUT localhost:9200/test -d '{"mappings":{"root":{"properties":{"country":{"type":"string"}}},"metric":{"_parent":{"type":"root"},"properties":{"count":{"type":"long"}}}}}'
>>>>
>>>> # Index parent document
>>>> curl -XPUT localhost:9200/test/root/1 -d '{"country":"de"}'
>>>>
>>>> # Index child document
>>>> curl -XPUT 'http://localhost:9200/test/metric/1?parent=1' -d '{"count":1}'
>>>>
>>>> # Update child document
>>>> curl -XPOST 'http://localhost:9200/test/metric/1/_update?parent=1' -d '{"script":"ctx._source.count+=ct", "params":{"ct":1}}'
>>>>
>>>> # Query with benchmark query; it should return 2
>>>> curl -XGET localhost:9200/test/_search -d '{"size":0,"query":{"match_all":{}},"aggs":{"requests":{"sum":{"field":"count"}}}}'
>>>>
>>>> # Query with child aggregation query; expected 2
>>>> curl -XGET localhost:9200/test/metric/_search -d '{"size":0,"query":{"match_all":{}},"aggs":{"child":{"children":{"type":"metric"},"aggs":{"requests":{"sum":{"field":"count"}}}}}}'
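
For the bulk case that the steps above do not cover, a request that updates the same child document several times might look roughly like the sketch below. It assumes the `test` index and `metric` type from the steps above, and the ES 1.x `_bulk` format of an `update` action line (with `_parent`) followed by a script/upsert line. The temporary file and the loop are illustrative only, and the request has not been verified against 1.4.0.Beta1.

```shell
#!/bin/sh
# Build a newline-delimited bulk body that updates child document 1 four times.
bulk_file=$(mktemp)
for i in 1 2 3 4; do
  # Action line: which document to update, routed via its parent.
  printf '%s\n' '{"update":{"_index":"test","_type":"metric","_id":"1","_parent":"1"}}' >> "$bulk_file"
  # Payload line: increment script plus an upsert document used if the child does not exist yet.
  printf '%s\n' '{"script":"ctx._source.count+=ct","params":{"ct":1},"upsert":{"count":1}}' >> "$bulk_file"
done

# Show the generated body.
cat "$bulk_file"

# To send it (assumes a node listening on localhost:9200):
# curl -XPOST localhost:9200/_bulk --data-binary "@$bulk_file"
```

Using `--data-binary` rather than `-d` matters here, because `-d` strips the newlines that the bulk format relies on when it reads the body from a file.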
>>>>
>>>>
>>>>
>>>> Thank you
>>>>
>>>> On Tuesday, October 21, 2014 3:33:35 PM UTC+2, Martijn v Groningen 
>>>> wrote:
>>>>>
>>>>> Hi Vlad,
>>>>>
>>>>> What you're describing shouldn't happen. The child docs should get 
>>>>> detached. I think this is a bug.
>>>>> Let me verify and get back to you.
>>>>>
>>>>> Martijn
>>>>>
>>>>> On 21 October 2014 13:26, Vlad Vlaskin <[email protected]> wrote:
>>>>>
>>>>>> After some experiments I believe I have found the cause of the 
>>>>>> discrepancy:
>>>>>>
>>>>>> *After a child document has been updated, Elasticsearch does not detach 
>>>>>> its old versions from the parent and still counts them in the child 
>>>>>> aggregation.*
>>>>>>
>>>>>> E.g. my child document was updated 4 times by script (within a batch 
>>>>>> update), so it has 4 versions:
>>>>>> { "count": 1}, { "count": 2}, { "count": 3}, { "count": 4}
>>>>>>
>>>>>> Querying the child document (after refresh) shows the proper version: 
>>>>>> {"count": 4}
>>>>>>
>>>>>> But the child aggregation {"sum":{"field":"count"}} shows 10, because:
>>>>>>
>>>>>> 1 + 2 + 3 + 4 = 10
>>>>>>
>>>>>> The pattern holds consistently (e.g. for 5 versions you get 15). 
>>>>>>
>>>>>> It explains the behavior here.
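
The "10 for 4 versions, 15 for 5" pattern is what one would expect if every stored version 1..n is summed instead of only the latest one, since 1 + 2 + ... + n = n(n+1)/2. A quick check of that arithmetic, assuming the count starts at 1 and grows by 1 per update as in the example above (the function name is made up for illustration):

```shell
#!/bin/sh
# Sum over all versions 1..n, as if superseded versions were still counted.
inflated_sum() {
  version_count=$1
  echo $(( version_count * (version_count + 1) / 2 ))
}

for n in 4 5; do
  echo "$n versions: latest count = $n, sum over all versions = $(inflated_sum "$n")"
done
```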
>>>>>>
>>>>>>
>>>>>>
>>>>>>
>>>>>>
>>>>>> On Tuesday, October 21, 2014 3:18:47 AM UTC+2, Vlad Vlaskin wrote:
>>>>>>>
>>>>>>> Dear ES group,
>>>>>>> We've been using ES in production for a while and eagerly test all 
>>>>>>> new features, such as cardinality.
>>>>>>>
>>>>>>> We are trying data modeling with parent-child relations (ES version 
>>>>>>> 1.4.0.Beta1, 8 nodes, EC2 r3.xlarge, SSD, lots of RAM, etc.), 
>>>>>>> with the following data model: 
>>>>>>> *Parent*
>>>>>>> {
>>>>>>>   "key": "value"  
>>>>>>> }
>>>>>>>
>>>>>>> and a timeline with children, holding metrics:
>>>>>>>
>>>>>>> *Child* (type "metrics")
>>>>>>> {
>>>>>>>   "day": "2014-10-20",
>>>>>>>   "count": 10
>>>>>>> }
>>>>>>>
>>>>>>> We update metric documents and index them properly with 
>>>>>>> script+upsert.
>>>>>>> The problem is that the query below *yields 2 different results 
>>>>>>> in a round-robin way.*
>>>>>>> E.g. the first time you call it you receive the first number, a second 
>>>>>>> later you receive the second, then the first again, etc. 
>>>>>>>
>>>>>>> {
>>>>>>>     "size": 0,
>>>>>>>     "query": {
>>>>>>>         "match_all": {}
>>>>>>>     },
>>>>>>>     "aggs": {
>>>>>>>         "MY_FIELD": {
>>>>>>>             "terms": {
>>>>>>>                 "field": "FIELD-XYZ"             // parent term aggregation
>>>>>>>             },
>>>>>>>             "aggs": {
>>>>>>>                 "children": {
>>>>>>>                     "children": {
>>>>>>>                         "type": "metrics"        // child aggregation of type "metrics"
>>>>>>>                     },
>>>>>>>                     "aggs": {
>>>>>>>                         "requests": {
>>>>>>>                             "sum": {
>>>>>>>                                 "field": "count" // target aggregation within child documents
>>>>>>>                             } 
>>>>>>>                         }
>>>>>>>                     }
>>>>>>>                 }
>>>>>>>             }
>>>>>>>         }
>>>>>>>     }
>>>>>>> }
>>>>>>>
>>>>>>>  Result A: 
>>>>>>> "aggregations": {
>>>>>>>       "MY_FIELD": {
>>>>>>>          "doc_count_error_upper_bound": 0,
>>>>>>>          "buckets": [
>>>>>>>             {
>>>>>>>                "key": "xx",
>>>>>>>                "doc_count": 283322,
>>>>>>>                "children": {
>>>>>>>                   "doc_count": 3740372,
>>>>>>>                   "requests": {
>>>>>>>                      "value": *5801652297*
>>>>>>>                   }
>>>>>>>                }
>>>>>>>             }
>>>>>>>          ]
>>>>>>>       }
>>>>>>>    }
>>>>>>>
>>>>>>> Result B:
>>>>>>> "aggregations": {
>>>>>>>       "MY_FIELD": {
>>>>>>>          "doc_count_error_upper_bound": 0,
>>>>>>>          "buckets": [
>>>>>>>             {
>>>>>>>                "key": "xx",
>>>>>>>                "doc_count": 302421,
>>>>>>>                "children": {
>>>>>>>                   "doc_count": 1877361,
>>>>>>>                   "requests": {
>>>>>>>                      "value": *2965346170*
>>>>>>>                   }
>>>>>>>                }
>>>>>>>             }
>>>>>>>          ]
>>>>>>>       }
>>>>>>>    }
>>>>>>>
>>>>>>> The switching between A and B is stable and reproducible. 
>>>>>>> The ES logs are clean. 
>>>>>>>
>>>>>>> Could someone suggest some ideas here?
>>>>>>>
>>>>>>> Thank you!
>>>>>>>
>>>>>>> Vlad
>>>>>>>
>>>>>>>  -- 
>>>>>> You received this message because you are subscribed to the Google 
>>>>>> Groups "elasticsearch" group.
>>>>>> To unsubscribe from this group and stop receiving emails from it, 
>>>>>> send an email to [email protected].
>>>>>> To view this discussion on the web visit https://groups.google.com/d/
>>>>>> msgid/elasticsearch/2ce80724-b3d9-4d58-b54e-15727f999564%40goo
>>>>>> glegroups.com 
>>>>>> <https://groups.google.com/d/msgid/elasticsearch/2ce80724-b3d9-4d58-b54e-15727f999564%40googlegroups.com?utm_medium=email&utm_source=footer>
>>>>>> .
>>>>>>
>>>>>> For more options, visit https://groups.google.com/d/optout.
>>>>>>
>>>>>
>>>>>
>>>>>
>>>>> -- 
>>>>> Kind regards,
>>>>>
>>>>> Martijn van Groningen 
>>>>>
>>>
>>>
>>>
>>> -- 
>>> Kind regards,
>>>
>>> Martijn van Groningen 
>>>
>>
>
>
> -- 
> Kind regards,
>
> Martijn van Groningen 
>

