You should definitely consider using ES as a primary data source, but as 
with any database, make sure to:

- Replicate data across your cluster.
- Take daily snapshots and store them on another machine or in another 
  data center.
- Monitor your cluster.

Regarding the point about data not being available immediately, that's not 
quite true. You can control the write consistency so that you're sure the 
data is actually persisted on the number of nodes you want and available 
through the Get API. Making a document searchable happens asynchronously 
(on the next index refresh), but the document itself can be fetched by ID 
immediately.
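
As a rough sketch of what that looks like on the wire (assuming the 1.x 
REST API, where index requests accept a "consistency" query parameter with 
the values one, quorum or all; the host, index, type and ID below are made 
up for illustration), the two requests could be built like this:

```python
from urllib.parse import quote, urlencode

# Values accepted by the 1.x "consistency" parameter on write requests.
CONSISTENCY_LEVELS = ("one", "quorum", "all")


def _doc_path(index, doc_type, doc_id):
    # Percent-encode each path segment so unusual IDs stay in one segment.
    return "/".join(quote(part, safe="") for part in (index, doc_type, doc_id))


def index_url(base, index, doc_type, doc_id, consistency="quorum"):
    """URL for a PUT that indexes one document at a given consistency level."""
    if consistency not in CONSISTENCY_LEVELS:
        raise ValueError("consistency must be one of %r" % (CONSISTENCY_LEVELS,))
    query = urlencode({"consistency": consistency})
    return "%s/%s?%s" % (base.rstrip("/"), _doc_path(index, doc_type, doc_id), query)


def get_url(base, index, doc_type, doc_id):
    """URL for a GET by ID, which does not wait for an index refresh."""
    return "%s/%s" % (base.rstrip("/"), _doc_path(index, doc_type, doc_id))


# Hypothetical names, for illustration only.
print(index_url("http://localhost:9200", "orders", "order", "42", "all"))
print(get_url("http://localhost:9200", "orders", "order", "42"))
```

A PUT to the first URL followed directly by a GET to the second should 
return the document, even though a search may not see it until the next 
refresh. Check the parameter name against the docs for the version you 
run, since it may differ between releases.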

Elasticsearch has previously had issues with index corruption, mostly 
caused by OOM errors and split brain. To avoid split brain, set 
minimum_master_nodes to a quorum of your master-eligible nodes (more than 
half of them). To avoid OOM errors, use the latest ES version, where a 
circuit breaker has been introduced.
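
To make "a reasonable value" concrete: the usual rule of thumb is a 
majority of the master-eligible nodes, i.e. (N / 2) + 1 with integer 
division. A small sketch of that arithmetic follows; the function names 
are my own, and the setting name assumes the zen discovery key used in 
elasticsearch.yml on 1.x.

```python
def minimum_master_nodes(master_eligible_nodes):
    """Smallest majority of the master-eligible nodes: (N // 2) + 1."""
    if master_eligible_nodes < 1:
        raise ValueError("need at least one master-eligible node")
    return master_eligible_nodes // 2 + 1


def yaml_setting(master_eligible_nodes):
    """Render the value as a line for elasticsearch.yml (1.x setting name)."""
    quorum = minimum_master_nodes(master_eligible_nodes)
    return "discovery.zen.minimum_master_nodes: %d" % quorum


for n in (2, 3, 5):
    print(n, "master-eligible nodes ->", yaml_setting(n))
```

Note that with only two master-eligible nodes the quorum is 2, so losing 
either node leaves the cluster without a master. Three master-eligible 
nodes is the smallest setup that tolerates one failure.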

Also read the blog posts from people who advise against using ES as a 
primary data source, such as this one, so you can make a better-informed 
decision: http://igor.kupczynski.info/2014/06/26/elastic-cap.html

Lasse

On Monday, November 17, 2014 1:32:14 PM UTC+1, Yann Barraud wrote:
>
> Hi,
>
> I've been speaking recently with various persons about using Es as a 
> primary database. Here are the main blockers I heard of : 
>
>    - problems with data corruption. Is it still real for 1.4 version ? 
>
>
>    - data not being available immediately (because of indexing, which is 
>    normal)
>
>
> Do you have further insights about this topic ? 
>
> Cheers,
> Yann
>
