Hi Les

The release notes for 1.3.0 (http://www.elasticsearch.org/downloads/1-3-0/) 
do not mention performance improvements to parent/child queries, and I did 
not find any in the notes for 1.4.0 
(http://www.elasticsearch.org/downloads/1-4-0/) either. 
How did you find out that there are significant performance improvements to 
parent/child queries? What kind of improvements were made, and how 
significant are they? 

Thanks a lot for the help.

Xiaolin.

On Wednesday, December 10, 2014 10:12:32 AM UTC-8, Les Barstow wrote:
>
> There is in fact a performance difference between has_parent and other 
> filters, as well as a difference in memory/cache use - especially in 
> earlier versions of ES. This is due to the way in which ES has to query the 
> parent/child relationship.
>
> I do believe that there are some significant performance improvements to 
> parent/child documents in 1.3.0+ - check the release notes. Also, I believe 
> there might have been some tuning and monitoring additions in the newer 
> versions that might help you. (I'm a user of our cluster, not so much an 
> administrator, so I'm not so sure on the latter...)
>
> --
> Les Barstow, Senior Software Engineer
> Return Path, Inc.
>
> On Tue, Dec 9, 2014 at 7:53 PM, Xiaolin Xie <[email protected]> wrote:
>
>> Hi Elasticsearch developers,
>>
>> I am new to ES. We have some performance issues with our Elasticsearch 
>> system, and we would like to get some ideas/thoughts about them from 
>> you.
>>
>> Here is our use case: we have three types of documents in one index: 
>> “campaign_group”, “campaign”, and “ad”. “campaign_group” is the parent of 
>> “campaign”, and “campaign” is the parent of “ad”. Each document type has 
>> about 10 simple properties of types such as string, long, and short. All 
>> three document types have a property “user” (long) and a property 
>> “run_status” (short). Documents are hashed by “user”, so documents with the 
>> same “user” are mapped to the same shard. 
>>
>> We have about 1.4 billion documents in total. We have 200 shards, 3 
>> master nodes, and 21 data nodes, and each shard has two replicas.  The 
>> total data size is 1.5TB. We are running Elasticsearch 1.2.1.
>>
>> Queries are sent to a specific shard using routing. The following 
>> query (1) checks the run_status of “ad” documents (run_status is a short), 
>> and it takes about 100 milliseconds. Query (2) checks both the run_status 
>> of “ad” and the run_status of its parent, and it takes about 2000 
>> milliseconds. It looks like there is a performance issue with the 
>> has_parent filter.
>>
>> Do you have any thoughts about this problem? Is it expected (because 
>> ES cannot support has_parent well)? Or could something else cause this 
>> problem? Or should we upgrade our Elasticsearch version?
>>
>> Please let me know if you need any other information about our use 
>> cases. 
>>
>> Any thoughts/ideas will be highly appreciated.
>>
>> ======================== Query (1) ========================
>>
>> {
>>   "filter":{
>>     "and":[
>>       {
>>         "term":{
>>           "user":1436594776581528
>>         }
>>       },
>>       {
>>         "terms":{
>>           "run_status":[
>>             1
>>           ]
>>         }
>>       }
>>     ]
>>   },
>>   "sort":{
>>     "_uid":"desc"
>>   },
>>   "size":1000000,
>>   "from":0
>> }
>>
>> ======================== Query (2) ========================
>>
>> {
>>   "filter":{
>>     "and":[
>>       {
>>         "term":{
>>           "user":1436594776581528
>>         }
>>       },
>>       {
>>         "terms":{
>>           "run_status":[
>>             1
>>           ]
>>         }
>>       },
>>       {
>>         "has_parent":{
>>           "parent_type":"campaign",
>>           "filter":{
>>             "terms":{
>>               "run_status":[
>>                 1
>>               ]
>>             }
>>           }
>>         }
>>       }
>>     ]
>>   },
>>   "sort":{
>>     "_uid":"desc"
>>   },
>>   "size":1000000,
>>   "from":0
>> }
>>
>> -- 
>> You received this message because you are subscribed to the Google Groups 
>> "elasticsearch" group.
>> To unsubscribe from this group and stop receiving emails from it, send an 
>> email to [email protected].
>> To view this discussion on the web visit 
>> https://groups.google.com/d/msgid/elasticsearch/220b1d9a-da80-416c-8b8d-d7cc3efc8b5a%40googlegroups.com
>> For more options, visit https://groups.google.com/d/optout.
>>
>
>
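
For anyone comparing the two quoted queries: they are identical apart from the has_parent clause, so the gap between roughly 100 ms and 2000 ms is presumably attributable to that clause. Below is a minimal Python sketch that builds both request bodies so the difference is explicit. The helper names base_query and with_parent_status are hypothetical (they are not part of any Elasticsearch client), and the code only constructs the JSON; it does not send anything to a cluster.

```python
import copy
import json


def base_query(user_id, run_status):
    """Build the body of query (1): filter by user and run_status."""
    return {
        "filter": {
            "and": [
                {"term": {"user": user_id}},
                {"terms": {"run_status": [run_status]}},
            ]
        },
        "sort": {"_uid": "desc"},
        "size": 1000000,
        "from": 0,
    }


def with_parent_status(query, parent_type, run_status):
    """Return a copy of `query` with a has_parent clause appended, as in query (2)."""
    result = copy.deepcopy(query)  # leave the original body untouched
    result["filter"]["and"].append(
        {
            "has_parent": {
                "parent_type": parent_type,
                "filter": {"terms": {"run_status": [run_status]}},
            }
        }
    )
    return result


# Values taken from the quoted queries.
query1 = base_query(1436594776581528, 1)
query2 = with_parent_status(query1, "campaign", 1)

if __name__ == "__main__":
    print(json.dumps(query2, indent=2))
```

Running each body against the same routed shard, with and without the extra clause, would be one way to confirm how much of the latency the has_parent filter accounts for.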

