Querying records one by one is an uncommon task for an RDBMS, which is designed to manage relationships between tables. It is more a task for a key/value store.
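To make the access pattern concrete, here is a minimal Python sketch of fetching one record by a unique key. The sample records and the helper names (`kv_get`, `index_get`) are hypothetical. It contrasts a hash-based key/value lookup with a binary search over sorted keys, which is roughly how a B-tree style index narrows down a key. It is only an illustration, not how MySQL or ES are implemented.

```python
import bisect

# Hypothetical sample records keyed by a unique field (e.g. a phone number).
records = [
    ("13800000003", {"name": "Zhang"}),
    ("13800000001", {"name": "Li"}),
    ("13800000002", {"name": "Wang"}),
]

# Key/value style: a hash map gives expected O(1) lookup per key.
kv_store = dict(records)


def kv_get(key):
    """Return the record stored under key, or None if the key is absent."""
    return kv_store.get(key)


# Index style: keys kept in sorted order and searched with binary search,
# O(log n) comparisons per lookup.
sorted_records = sorted(records, key=lambda r: r[0])
sorted_keys = [k for k, _ in sorted_records]


def index_get(key):
    """Binary-search the sorted keys and return the matching record, or None."""
    i = bisect.bisect_left(sorted_keys, key)
    if i < len(sorted_keys) and sorted_keys[i] == key:
        return sorted_records[i][1]
    return None
```

Both helpers return the same record for the same key. They differ only in how the key is located.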
Nevertheless, MySQL is slow in a key/value store scenario. There are faster products, e.g. memcached, for which MySQL offers an integration.

ES is faster than MySQL if you provide enough RAM, load the docs into direct memory (I/O buffer) and into the field data cache, and optimize the configuration (plus balance the workload between nodes), so that ES can work very close to an in-memory key/value store.

The complexity is O(1) * len(dict) for all term lookups. This means the lookup itself does not depend on key count or on key length, but Lucene needs to scan the term dictionary, unless you use an FST for terms, see https://issues.apache.org/jira/browse/LUCENE-3069 and also http://blog.mikemccandless.com/2013/09/lucene-now-has-in-memory-terms.html

With an FST, the complexity would be O(1), roughly speaking. In practice, there are more factors to consider (e.g. whether the index is write-once or open for ongoing modifications, which require frequent and expensive cache rebuilding).

Next: doc values. Uninverting a field (see http://blog.trifork.com/2011/10/27/introducing-lucene-index-doc-values/) needs vastly more memory than the inverted index, so doc values keep this data on disk and the search runs not only in memory but also on disk. If you want key lookup only, this is fast enough, because no field cache (re)building is required. This is best for frequently changing data.

Jörg

On Tue, Mar 3, 2015 at 4:31 AM, Zhantong Mou <[email protected]> wrote:

> My scenario: one billion records. I query one record by one field, and that
> field is unique.
>
> How does ES ensure the performance? What is the algorithm? I am very
> confused.
>
> On Monday, March 2, 2015 at 10:16:53 PM UTC+8, Jörg Prante wrote:
>>
>> I do not have an answer because your question is speculative. Without
>> having facts about your MySQL query type and speed and your scenario, it is
>> not possible to discuss alternatives.
>>
>> Fact is, MySQL is very limited, search is slow, it is not a search
>> engine.
>> Lucene has plenty of advantages in search over RDBMS, not only
>> inverted indexing. As said, if you do not want inverted indexing, you can
>> choose doc values.
>> http://www.elasticsearch.org/guide/en/elasticsearch/guide/current/doc-values.html
>>
>> Jörg
>>
>> On Mon, Mar 2, 2015 at 1:28 PM, Zhantong Mou <[email protected]> wrote:
>>
>>> Thank you for your answer. But I have another question.
>>>
>>> The one billion records come from one MySQL table. MySQL uses a B-tree
>>> or other structures to ensure the response time of queries.
>>> If the data of one field is almost unique, the normal inverted index
>>> cannot improve the query speed. ES is based on Lucene. The inverted index
>>> maps items (terms) to docs. With one billion items it cannot improve
>>> performance.
>>> Lucene adds an item index on top of the normal inverted index. The items
>>> are split into many shards, but I think ES cannot improve the speed in
>>> this case.
>>>
>>> What is your opinion?
>>>
>>> Thanks
>>>
>>> On Monday, March 2, 2015 at 5:03:38 PM UTC+8, Jörg Prante wrote:
>>>>
>>>> What are the queries, what is the speed of queries?
>>>>
>>>> A growing number of shards is not related to speed, it is related
>>>> to scalability. This means, even with large document count, the search
>>>> response time can be kept low by creating indices that span multiple nodes.
>>>> If you increase number of replica shards, you can serve more searches in
>>>> parallel.
>>>>
>>>> ES can not replace MySQL, because ES is not a relational database
>>>> system.
>>>>
>>>> ES is faster than MySQL's direct index because ES queries can operate
>>>> in-memory. If you do not want inverted index, choose doc values.
>>>>
>>>> Jörg
>>>>
>>>> On Mon, Mar 2, 2015 at 8:54 AM, Zhantong Mou <[email protected]> wrote:
>>>>
>>>>> I have one billion records in MySQL. The data is like names, ID
>>>>> cards, phone numbers. They are almost unique.
>>>>> Can ElasticSearch, which is based on an inverted index, ensure the
>>>>> speed of queries?
>>>>> Can it be shown that the number of shards improves the speed of the query?
>>>>> If ES can replace MySQL, how does ES ensure the performance? I think
>>>>> structured data cannot perform better in ES than in MySQL, because ES
>>>>> is based on an inverted index.
--
You received this message because you are subscribed to the Google Groups "elasticsearch" group.
To unsubscribe from this group and stop receiving emails from it, send an email to [email protected].
To view this discussion on the web visit https://groups.google.com/d/msgid/elasticsearch/CAKdsXoFq%2B7dQH4NMPWd7P0ukVywe2cGJg4pOMS%3DGiT6wZk6Uxw%40mail.gmail.com.
For more options, visit https://groups.google.com/d/optout.
