I am sorry, I don't have any numbers. But you will probably find that high-performance databases, such as Cassandra, HBase, MongoDB and maybe even SQL servers hosted on platinum-grade hardware, can outperform the Zest runtime's ability to construct the entities in memory, i.e. the (de)serialization is the slower part. Some years ago, as I vaguely recall, the then Qi4j runtime maxed out somewhere around 3,000-5,000 reads/second for relatively simple entities.

For querying, the result set is a collection of EntityReference, in principle the identity of each Entity, which is then read from the ES. Querying in Zest is done via a fluent API (a DSL, if you like), which describes the query in a typesafe manner. The query subsystem translates that into the underlying query system's native language and executes the query. Of course, the query is translated according to the same subsystem's indexing algorithm, and there might be room for clever work in making this faster. Again, I don't have the numbers, but my gut feeling is that it is an order of magnitude (or more) slower than a direct lookup in a fast ES.

There are (or used to be) some performance tests for ES testing somewhere in the source code. I would be delighted if those could be automated, so that a comparison (table or graph) could be auto-published by the CI build system.

Maybe not as much help as you hoped for.

Niclas

On Jun 14, 2016 12:24, "zhuangmz08" <[email protected]> wrote:

Hi,

OK, so writing entities and reading entities are separated, both in theory and in the physical implementation.

1. It's acceptable to occupy a large amount of storage space (disk is cheap). All entities are stored in a SINGLE table of the SQL database, or in a SINGLE collection of a SINGLE database in Mongo. What are the key factors for writing? Which MapEntityStore is faster at writing entities? I mean, which is better for production use?

2. Is reading speed related to the Indexer? I know something about search engines (Apache Solr). Could you explain more about the querying? When the query string matches some index, how does it interact with the entity database? Do we need to query the entity database internally? I would like to know the factors impacting read speed. Which is better for production use, OpenRDF or ElasticSearch?

Thanks a lot.

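
The two-step read path described in Niclas's reply (the query yields entity references, and each reference is then resolved against the entity store) can be sketched roughly as follows. This is a simplified illustration in plain Java, not the Zest API: `NameIndex`, `EntityStore` and `query` are hypothetical names, and the "serialized state" is just a string.

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

// Hypothetical, simplified model of the two-step read path; not the Zest API.
final class TwoStepQuerySketch {

    // Stand-in for an index/query engine: maps a property value to entity identities.
    static final class NameIndex {
        private final Map<String, List<String>> idsByName = new HashMap<>();

        void index(String name, String identity) {
            idsByName.computeIfAbsent(name, k -> new ArrayList<>()).add(identity);
        }

        // Step 1: the query returns references (identities) only, never entity state.
        List<String> findByName(String name) {
            return idsByName.getOrDefault(name, Collections.emptyList());
        }
    }

    // Stand-in for an entity store: a plain identity -> serialized state lookup.
    static final class EntityStore {
        private final Map<String, String> stateById = new HashMap<>();

        void put(String identity, String serializedState) {
            stateById.put(identity, serializedState);
        }

        // Step 2: a direct lookup by identity; returns null if the entity is absent.
        String get(String identity) {
            return stateById.get(identity);
        }
    }

    // Runs the query against the index, then resolves every reference in the store.
    static List<String> query(NameIndex index, EntityStore store, String name) {
        List<String> result = new ArrayList<>();
        for (String identity : index.findByName(name)) {
            String state = store.get(identity);
            if (state != null) {
                result.add(state);
            }
        }
        return result;
    }

    public static void main(String[] args) {
        NameIndex index = new NameIndex();
        EntityStore store = new EntityStore();
        store.put("id-1", "{\"name\":\"Alice\"}");
        index.index("Alice", "id-1");
        System.out.println(query(index, store, "Alice"));
    }
}
```

In this picture, read speed depends on both steps: how fast the index answers the query, and how fast the store can load and deserialize each referenced entity.
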
------------------ Original Message ------------------
From: "Niclas Hedhman" <[email protected]>
Sent: Tuesday, June 14, 2016, 11:02 AM
To: "dev" <[email protected]>
Subject: Re: Large Scale Entity Store Database?

In Zest, storage/retrieval and indexing/query are separate concerns. (Disk is cheap.) Just like it is on the world-wide web.

Now, the relatively simple Entity Stores that are based on the MapEntityStore might be particularly wasteful with storage space, depending on the underlying engine. However, nothing stops you from creating a "native" ES for your favorite storage engine.

The Indexing/Query systems are much more complex (compare a website's store/retrieve with Google's Search), and it is not trivial to make an indexing extension that is complete (native queries are available as a compromise). In Zest 2.x and earlier, the default is to index all properties, and you can turn some of them off. In 3.x we intend to change the default to off, and you indicate what needs indexing.

Final note: the requirement on the entity stores is that any "unknown" state is preserved, so that an update will not modify such state. This is due to the fact that entities of the same identity can have more than one (possibly incompatible) type. This complicates traditional ORM techniques quite a bit.

Cheers
Niclas

On Jun 14, 2016 09:06, "zhuangmz08" <[email protected]> wrote:

> Hi,
> I dug into the Postgres table and found that entities are actually
> stored as JSON-format strings, which seems to use the SQL database as a
> document database. I'm wondering how efficient queries are achieved. I'm
> going to insert and query millions of entities. Have you ever tested the
> performance? Should I use the Mongo-backed Entity Store instead?
> Thanks a lot.
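
As an illustration of the "unknown state" requirement in Niclas's final note, here is a minimal sketch of an update that touches only the properties the current entity type knows about and carries everything else over unchanged. It is plain Java, not Zest code: the `update` helper and the map-based state are assumptions made for the example.

```java
import java.util.LinkedHashMap;
import java.util.Map;

// Hypothetical sketch: stored entity state modeled as a simple property map.
final class PreserveUnknownStateSketch {

    // Returns the new stored state: every stored property is kept, and only
    // the properties present in `changes` are overwritten or added.
    static Map<String, Object> update(Map<String, Object> stored,
                                      Map<String, Object> changes) {
        Map<String, Object> merged = new LinkedHashMap<>(stored);
        merged.putAll(changes);
        return merged;
    }

    public static void main(String[] args) {
        Map<String, Object> stored = new LinkedHashMap<>();
        stored.put("name", "Alice");
        stored.put("salary", 1000); // written by another type; unknown to this view

        Map<String, Object> changes = new LinkedHashMap<>();
        changes.put("name", "Alicia"); // this type only knows about "name"

        System.out.println(update(stored, changes));
    }
}
```

A mapping that rewrote the whole record from the fields of a single class would drop `salary` here, which is the kind of loss the requirement is meant to rule out.
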
