Hi,

On 17 December 2013 22:43, Reza Jalili <[email protected]> wrote:
> Forwarding to the open group
>
>
>>Hi Toby,
>>
>>I've just started to take a look at elasticsearch.org / .com
>>
>>Do you know:
>>How does oak compare with elasticsearch open source
>>search/data store?

ElasticSearch is only a distributed, elastic search index based on Lucene, so comparing it with Oak as a whole is not a like-for-like comparison. It is not a data store.

However, many large applications, especially in the OpenData field, have used it as a data store, since its resilience to unforeseen failures is high, mainly due to:

* close to real-time indexing, with latency often around 50ms between an update and its availability in the index.
* replication and sharding with no single point of failure.
* a write-ahead log on write, giving it automated recovery.
* true elasticity.

The datastore that results from an ElasticSearch deployment can be considered a flat datastore with no inherent structure and no versioning, i.e. billions of documents in a bucket. If you were brave, you could write an ElasticSearchMK.

>
>>What are the dimensions and features that are fair to compare and
>>understand?

It would be fair to compare the SolrCloud component of a full Oak deployment with ElasticSearch. You will find differences in schema support, replication mechanism, deployment and indexing.

Solr has schema capabilities, ES has none. SolrCloud replicates segment data, ES replicates the index update commands after committing to a write-ahead log. SolrCloud requires several components, including ZooKeeper as an HA cluster. ES is a single jar that self-discovers peers and has no single bookkeeping instance. SolrCloud will index documents (PDF etc.). ES indexes keywords and streams of tokens, leaving you to perform the conversion from document to token.

Storing Lucene indexes in Oak (as mentioned below) is reminiscent of earlier work that led to ElasticSearch. There are some talks on the ElasticSearch site that describe the issues with making Lucene-based indexes scale. It would not be a like-for-like comparison to compare all of Oak with ElasticSearch, as they are very different beasts.

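In case it helps to make the write-ahead log point concrete, here is a toy sketch of the general idea: log the update command durably first, apply it to the index second, and replay the log on startup to recover. This is only an illustration under my own assumptions, not how ES actually implements it; `ToyIndex`, the JSON-lines log format and the file name are all hypothetical.

```python
import json
import os
import tempfile


class ToyIndex:
    """Hypothetical in-memory inverted index guarded by an append-only log."""

    def __init__(self, log_path):
        self.log_path = log_path
        self.postings = {}  # token -> set of document ids
        self._replay()

    def _replay(self):
        # Recovery: rebuild the in-memory state from every logged command.
        if not os.path.exists(self.log_path):
            return
        with open(self.log_path, encoding="utf-8") as log:
            for line in log:
                self._apply(json.loads(line))

    def _apply(self, command):
        for token in command["tokens"]:
            self.postings.setdefault(token, set()).add(command["id"])

    def index(self, doc_id, tokens):
        command = {"id": doc_id, "tokens": list(tokens)}
        # Write the command to the log and force it to disk first,
        # and only then update the in-memory index.
        with open(self.log_path, "a", encoding="utf-8") as log:
            log.write(json.dumps(command) + "\n")
            log.flush()
            os.fsync(log.fileno())
        self._apply(command)

    def search(self, token):
        return sorted(self.postings.get(token, set()))


if __name__ == "__main__":
    path = os.path.join(tempfile.mkdtemp(), "toy.wal")
    first = ToyIndex(path)
    first.index("doc1", ["oak", "lucene"])
    first.index("doc2", ["lucene"])
    # Simulate a crash: drop the first instance and start again on the same log.
    recovered = ToyIndex(path)
    print(recovered.search("lucene"))
```

Note that the caller passes tokens rather than documents, which mirrors the point above about doing the document-to-token conversion yourself. The same logged commands are also what you would ship to a replica if you replicate update commands rather than segment data.
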
HTH (I am not a core Oak contributor, but have had experience using ES and SolrCloud in the past)

Best Regards
Ian

>>
>>Thanks for your help,
>>-reza
>>
>>
>>
>>
>>On 12/17/13 1:46 AM, "Tommaso Teofili" <[email protected]> wrote:
>>
>>>IIRC the Lucene index data is stored under /oak:index/lucene/:data
>>>
>>>Regards,
>>>Tommaso
>
