Hi Les,

The release notes for 1.3.0 (http://www.elasticsearch.org/downloads/1-3-0/) do not mention performance improvements to parent/child queries, and I did not find any in the 1.4.0 notes (http://www.elasticsearch.org/downloads/1-4-0/) either. How did you find out that there are significant performance improvements to parent/child queries? What kind of improvements were made, and how significant are they?

Thanks a lot for the help.

Xiaolin

On Wednesday, December 10, 2014 10:12:32 AM UTC-8, Les Barstow wrote:
>
> There is in fact a performance difference between has_parent and other
> filters, as well as a difference in memory/cache use - especially in
> earlier versions of ES. This is due to the way in which ES has to query the
> parent/child relationship.
>
> I do believe that there are some significant performance improvements to
> parent/child documents in 1.3.0+ - check the release notes. Also, I believe
> there might have been some tuning and monitoring additions in the newer
> versions that might help you. (I'm a user of our cluster, not so much an
> administrator, so I'm not so sure on the latter...)
>
> --
> Les Barstow, Senior Software Engineer
> Return Path, Inc.
>
> On Tue, Dec 9, 2014 at 7:53 PM, Xiaolin Xie <[email protected]> wrote:
>
>> Hi Elasticsearch developers,
>>
>> I am new to ES. We have had some performance issues with our Elasticsearch
>> system, and we would like to get some ideas/thoughts about this issue from
>> you.
>>
>> Here is our use case: we have three types of documents in one index:
>> “campaign_group”, “campaign”, and “ad”. “campaign_group” is the parent of
>> “campaign”, and “campaign” is the parent of “ad”. Each document type has
>> about 10 simple properties, such as string, long, short. All three document
>> types have a property “user” (long) and a property “run_status” (short).
>> Documents are hashed by “user”, so documents with the same “user” are
>> mapped to the same shard.
>>
>> We have about 1.4 billion documents in total. We have 200 shards, 3
>> master nodes, and 21 data nodes, and each shard has two replicas. The
>> total data size is 1.5TB. We are running elasticsearch 1.21.
>>
>> Queries are made against a specific shard by routing. The following
>> query (1) checks the run_status of “ad” documents (run_status is a short
>> type), and it takes about 100 milliseconds.
>> Query (2) checks both the run_status of “ad” and the run_status of its
>> parent, and it takes about 2000 milliseconds. It looks like there is a
>> performance issue with the has_parent filter.
>>
>> Do you have any thoughts about this problem? Is it expected (because
>> ES cannot support has_parent well)? Or could something else cause this
>> problem? Or should we upgrade our Elasticsearch version?
>>
>> Please let me know if you need any other information about our use
>> cases.
>>
>> Any thoughts/ideas will be highly appreciated.
>>
>> ======================== Query(1) ========================
>>
>> {
>>   "filter": {
>>     "and": [
>>       {
>>         "term": {
>>           "user": 1436594776581528
>>         }
>>       },
>>       {
>>         "terms": {
>>           "run_status": [1]
>>         }
>>       }
>>     ]
>>   },
>>   "sort": {
>>     "_uid": "desc"
>>   },
>>   "size": 1000000,
>>   "from": 0
>> }
>>
>> ======================== Query(2) ========================
>>
>> {
>>   "filter": {
>>     "and": [
>>       {
>>         "term": {
>>           "user": 1436594776581528
>>         }
>>       },
>>       {
>>         "terms": {
>>           "run_status": [1]
>>         }
>>       },
>>       {
>>         "has_parent": {
>>           "parent_type": "campaign",
>>           "filter": {
>>             "terms": {
>>               "run_status": [1]
>>             }
>>           }
>>         }
>>       }
>>     ]
>>   },
>>   "sort": {
>>     "_uid": "desc"
>>   },
>>   "size": 1000000,
>>   "from": 0
>> }
>>
--
You received this message because you are subscribed to the Google Groups "elasticsearch" group.
To unsubscribe from this group and stop receiving emails from it, send an email to [email protected].
To view this discussion on the web visit https://groups.google.com/d/msgid/elasticsearch/85c1c4aa-e43e-47e2-ac62-87495f385245%40googlegroups.com.
For more options, visit https://groups.google.com/d/optout.
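
The two request bodies in the original post differ only in the has_parent clause, so one way to time them side by side is to generate both from the same inputs. The following is a minimal Python sketch under that assumption: `build_ad_query` and its parameters are hypothetical names for illustration, not part of any Elasticsearch client, and the body uses the top-level "filter" / "and" syntax exactly as quoted in the thread (ES 1.x era). Sending the request, including the routing value, is left out.

```python
import json


def build_ad_query(user, run_status, parent_run_status=None, size=1000000):
    """Build the filter-only search body quoted in the thread.

    Hypothetical helper. With parent_run_status=None the result mirrors
    Query(1); otherwise a has_parent clause on the "campaign" type is
    appended, which mirrors Query(2).
    """
    clauses = [
        {"term": {"user": user}},
        {"terms": {"run_status": list(run_status)}},
    ]
    if parent_run_status is not None:
        clauses.append({
            "has_parent": {
                "parent_type": "campaign",
                "filter": {"terms": {"run_status": list(parent_run_status)}},
            }
        })
    return {
        "filter": {"and": clauses},
        "sort": {"_uid": "desc"},
        "size": size,
        "from": 0,
    }


# Same user and run_status as in the original post.
query_1 = build_ad_query(1436594776581528, [1])
query_2 = build_ad_query(1436594776581528, [1], parent_run_status=[1])

# Print the has_parent variant as JSON for inspection.
print(json.dumps(query_2, indent=2))
```

Because both bodies come from one function, any timing difference between them can be attributed to the has_parent clause rather than to an accidental difference in the other filters.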
