There is in fact a performance difference between has_parent and other filters, as well as a difference in memory/cache use, especially in earlier versions of ES. This is due to the extra work ES has to do to resolve the parent/child relationship at query time.
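If you want to isolate the cost of has_parent on your own cluster, it may help to generate both request bodies from the same base filter, so the has_parent clause is the only difference between them. Here is a minimal sketch of that idea. It only builds the JSON bodies from your quoted queries and does not talk to ES; the function name `build_ad_query` and its parameters are my own, hypothetical, and not part of any ES client.

```python
import json


def build_ad_query(user_id, run_status, parent_type=None, size=1000000):
    """Build a filter-only request body shaped like the quoted queries.

    Hypothetical helper: when parent_type is given, a has_parent clause
    with the same run_status check is appended to the "and" filter.
    """
    filters = [
        {"term": {"user": user_id}},
        {"terms": {"run_status": [run_status]}},
    ]
    if parent_type is not None:
        filters.append({
            "has_parent": {
                "parent_type": parent_type,
                "filter": {"terms": {"run_status": [run_status]}},
            }
        })
    return {
        "filter": {"and": filters},
        "sort": {"_uid": "desc"},
        "size": size,
        "from": 0,
    }


# Query (1): run_status of the "ad" documents only.
query_1 = build_ad_query(1436594776581528, 1)
# Query (2): the same filter plus the has_parent check on "campaign".
query_2 = build_ad_query(1436594776581528, 1, parent_type="campaign")

print(json.dumps(query_2, indent=2))
```

Sending both bodies with the same routing value and comparing the timings should then show how much of the 2000 ms comes from the has_parent clause alone.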
I do believe there were some significant performance improvements to parent/child documents in 1.3.0+; check the release notes. I also believe the newer versions added some tuning and monitoring options that might help you. (I'm a user of our cluster, not so much an administrator, so I'm less sure about the latter.)

--
Les Barstow, Senior Software Engineer
Return Path, Inc.

On Tue, Dec 9, 2014 at 7:53 PM, Xiaolin Xie <[email protected]> wrote:
> Hi Elasticsearch developers,
>
> I am new to ES. We have had some performance issues with our Elasticsearch
> system, and we would like to get some ideas/thoughts about this issue from
> you.
>
> Here is our use case: we have three types of documents in one index:
> “campaign_group”, “campaign”, and “ad”. “campaign_group” is the parent of
> “campaign”, and “campaign” is the parent of “ad”. Each document type has
> about 10 simple properties, such as string, long, short. All three document
> types have a property “user” (long) and a property “run_status” (short).
> Documents are hashed by “user”, so documents with the same “user” are
> mapped to the same shard.
>
> We have about 1.4 billion documents in total. We have 200 shards, 3 master
> nodes, and 21 data nodes, and each shard has two replicas. The total data
> size is 1.5TB. We are running elasticsearch 1.21.
>
> Queries are sent to a specific shard by routing. The following query (1)
> checks the run_status of “ad” documents (run_status is a short type), and it
> takes about 100 milliseconds. Query (2) checks both the run_status of “ad”
> and the run_status of its parent, and it takes about 2000 milliseconds. It
> looks like there are some performance issues with the has_parent filter.
>
> Do you have any thoughts about this problem? Is it expected (because
> ES cannot support has_parent well)? Or could something else cause this
> problem? Or should we upgrade our Elasticsearch version?
>
> Please let me know if you need any other information about our use
> case.
>
> Any thoughts/ideas will be highly appreciated.
>
> ======================== Query(1) ========================
>
> {
>   "filter": {
>     "and": [
>       {
>         "term": {
>           "user": 1436594776581528
>         }
>       },
>       {
>         "terms": {
>           "run_status": [1]
>         }
>       }
>     ]
>   },
>   "sort": {
>     "_uid": "desc"
>   },
>   "size": 1000000,
>   "from": 0
> }
>
> ======================== Query(2) ========================
>
> {
>   "filter": {
>     "and": [
>       {
>         "term": {
>           "user": 1436594776581528
>         }
>       },
>       {
>         "terms": {
>           "run_status": [1]
>         }
>       },
>       {
>         "has_parent": {
>           "parent_type": "campaign",
>           "filter": {
>             "terms": {
>               "run_status": [1]
>             }
>           }
>         }
>       }
>     ]
>   },
>   "sort": {
>     "_uid": "desc"
>   },
>   "size": 1000000,
>   "from": 0
> }
>
> --
> You received this message because you are subscribed to the Google Groups
> "elasticsearch" group.
> To unsubscribe from this group and stop receiving emails from it, send an
> email to [email protected].
> To view this discussion on the web visit
> https://groups.google.com/d/msgid/elasticsearch/220b1d9a-da80-416c-8b8d-d7cc3efc8b5a%40googlegroups.com.
> For more options, visit https://groups.google.com/d/optout.
