Re: Building an ERP with Elasticsearch. Am I crazy?

Jilles van Gurp Tue, 26 Aug 2014 03:15:44 -0700

This is the generally accepted dogma and it has some merit. However, having 
two storage systems is more than a bit annoying. If you are aware of the 
limitations and caveats, elasticsearch is actually a perfectly good 
document store that happens to have a deeply integrated querying engine. 
This is useful since most setups that add a second store end up with a 
much less capable querying engine on that side, plus additional latency 
and architectural complexity from pumping data around to elasticsearch.

Elasticsearch CRUD operations on a single document are atomic, and a get 
by id is realtime, i.e. you can read your own writes across the cluster 
(search results lag behind by the refresh interval). If you use the 
version attribute during updates, you can detect version conflicts and 
prevent overwriting updates with stale data as well. This is similar to 
the model you would find in e.g. couchdb and similar document stores. 
There are not that many sharded and replicated, horizontally scalable 
document stores out there and even fewer with decent querying ability.
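
As a rough illustration of the version check, here is a minimal Python 
sketch of the read-modify-write loop. It assumes the 1.x behaviour where an 
index request carrying a version parameter is rejected with a conflict 
(HTTP 409) if the stored version differs. FakeStore, get_doc, put_doc and 
update_with_retry are hypothetical names standing in for a real HTTP 
client, so the example runs without a cluster.

```python
class VersionConflict(Exception):
    """Raised when the supplied version is stale (a 409 from elasticsearch)."""


class FakeStore:
    """Hypothetical in-memory stand-in for a single elasticsearch document."""

    def __init__(self, source):
        self.source = dict(source)
        self.version = 1

    def get_doc(self):
        # Mirrors the _version/_source pair returned by a get by id.
        return {"_version": self.version, "_source": dict(self.source)}

    def put_doc(self, source, version):
        # Mirrors an index request with ?version=N: stale versions are rejected.
        if version != self.version:
            raise VersionConflict("expected %d, got %d" % (self.version, version))
        self.source = dict(source)
        self.version += 1
        return self.version


def update_with_retry(store, mutate, max_attempts=5):
    """Re-read the document and re-apply `mutate` until the versioned write wins."""
    for _ in range(max_attempts):
        doc = store.get_doc()
        new_source = mutate(doc["_source"])
        try:
            return store.put_doc(new_source, doc["_version"])
        except VersionConflict:
            continue  # somebody else wrote first; retry against fresh data
    raise VersionConflict("gave up after %d attempts" % max_attempts)
```

The point of the loop is that a stale write is never applied silently: it 
either succeeds against the version it read, or it starts over.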

The caveat is that elasticsearch is not as battle tested as other solutions 
in this space and that various people have shown that ways exist to cause 
an elasticsearch cluster to lose data, to corrupt data, etc. So, you need 
to be prepared to recover from such situations. That means you need 
backups (e.g. use the snapshots feature) and a plan for when things go 
bad.
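
For the snapshots part, a minimal Python sketch of the two requests 
involved might look like the following. It assumes the 1.x snapshot REST 
endpoints (_snapshot/<repo> and _snapshot/<repo>/<name>) and a shared 
filesystem repository whose location is mounted on every node. The 
function names, repository name and path are made up for illustration, and 
send() only works against a reachable cluster.

```python
import json
import urllib.request


def register_fs_repo(base_url, repo, location):
    """Build the request that registers a shared-filesystem snapshot repository."""
    body = {"type": "fs", "settings": {"location": location}}
    return ("PUT", "%s/_snapshot/%s" % (base_url, repo), body)


def create_snapshot(base_url, repo, name):
    """Build the request that takes a snapshot and waits for it to finish."""
    url = "%s/_snapshot/%s/%s?wait_for_completion=true" % (base_url, repo, name)
    return ("PUT", url, None)


def send(request):
    """Send one (method, url, body) tuple and decode the JSON response."""
    method, url, body = request
    data = json.dumps(body).encode("utf-8") if body is not None else None
    req = urllib.request.Request(
        url, data=data, method=method,
        headers={"Content-Type": "application/json"},
    )
    with urllib.request.urlopen(req) as resp:
        return json.loads(resp.read().decode("utf-8"))
```

Usage would be something like send(register_fs_repo(...)) once, then 
send(create_snapshot(...)) from a scheduled job. A backup only counts once 
a restore from it has actually been tried.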

The flip side is that other solutions have issues as well. Postgresql 
clustering is brand new and probably has issues, and if you use it in 
non-clustered mode, the failure scenarios get even more interesting. I use 
Mariadb Galera cluster and it sucks big time and it needs a lot of 
handholding during upgrades.  Couchdb doesn't shard and shares server 
failure scenarios with elasticsearch. Mongodb and cassandra each have had 
their share of issues related to data corruption and data loss in the 
recent past and both have recently fixed major issues related to that. So, 
there are lots of solutions out there and none of them are perfect.

Elasticsearch has several major areas where it needs improvement (and which 
are indeed being worked on in recent versions):
1) it has many ways it can run out of memory. If you skim through the 
release notes of recent versions, you'll see a lot of fixes related to that 
including the use of e.g. circuit breakers. The problem with OOMs is that 
they can cause a cascading cluster failure where one node becomes slow, 
eventually drops out of the cluster and then other nodes start having the 
same issues. I've personally seen Kibana kill our cluster on two occasions. 
In both cases the logs of all nodes were full of OOMs and the cluster died 
while simply clicking through different dashboards in Kibana. This has not 
happened with the current 1.3.x version (yet) but that doesn't mean it is 
impossible.
2) split brain situations, where a quorum is lost but the loss is not 
detected, are fairly easy to trigger. Every time I do a rolling update, the 
cluster takes several seconds to catch up with the fact that I'm shutting 
down nodes. I have a three node cluster. One node down means my cluster 
should be yellow. Two nodes down means red and it should no longer accept 
writes. The 
problem is that during those few seconds, the cluster status may not 
reflect reality and nodes may in fact be accepting writes when they 
shouldn't. 
3) A full cluster restart needs a lot of handholding. The reason for this 
is that most of the failure scenarios relate to there not being a quorum 
and detecting that. For example, if you simply restart the nodes one by one 
quickly you will easily get your cluster in a red state where it should no 
longer be accepting writes. The problem as described above is that 
detecting this relies on timeouts and there may be some nodes that continue 
to write for a few seconds after they should have stopped doing that. By 
the time your cluster goes red, it's too late and you are going to have to 
manually decide which shards you want to lose. That's why you need to keep 
an eye on cluster status during rolling updates. Imagine somebody power 
cycling your elasticsearch cluster or worse, rebooting the switch 
that connects your nodes.
4) Elasticsearch under load may throw 503s occasionally. I've seen this 
happen on our test infrastructure a couple of times and it worries me. This 
is not something you want to see when you are writing customer data. 
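
Keeping an eye on cluster status during a rolling restart (point 3) can be 
scripted. The Python sketch below assumes the cluster health endpoint 
(_cluster/health with wait_for_status and timeout) and its status, 
timed_out and number_of_nodes response fields. restart_node and get_health 
are hypothetical callables supplied by whoever runs the restart. As noted 
above, the reported status can lag reality by a few seconds, so this 
narrows the window rather than closing it.

```python
import json
import urllib.request


def fetch_health(base_url, wait_for="green", timeout="30s"):
    """Ask the health API to block until the status is reached or it times out."""
    url = "%s/_cluster/health?wait_for_status=%s&timeout=%s" % (
        base_url, wait_for, timeout)
    with urllib.request.urlopen(url) as resp:
        return json.loads(resp.read().decode("utf-8"))


def safe_to_restart_next(health, expected_nodes):
    """True only if the cluster is green and every expected node has rejoined."""
    return (
        health.get("status") == "green"
        and not health.get("timed_out", False)
        and health.get("number_of_nodes") == expected_nodes
    )


def rolling_restart(nodes, restart_node, get_health):
    """Restart nodes one at a time; stop as soon as the cluster is not healthy."""
    for node in nodes:
        restart_node(node)
        if not safe_to_restart_next(get_health(), len(nodes)):
            raise RuntimeError("cluster not green after restarting %s" % node)
```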

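For the occasional 503 under load (point 4), one common client-side 
approach is to retry with an increasing pause. The Python sketch below is 
only that: ServiceUnavailable and write_with_backoff are hypothetical 
names, and do_write stands for whatever call actually performs the index 
request. It assumes the write uses an explicit document id, so that 
repeating it does not create a duplicate.

```python
import time


class ServiceUnavailable(Exception):
    """Hypothetical error a client wrapper raises when it sees HTTP 503."""


def write_with_backoff(do_write, max_attempts=5, base_delay=0.1, sleep=time.sleep):
    """Call `do_write` until it succeeds, doubling the pause after each 503."""
    for attempt in range(max_attempts):
        try:
            return do_write()
        except ServiceUnavailable:
            if attempt == max_attempts - 1:
                raise  # out of retries: surface the failure to the caller
            sleep(base_delay * (2 ** attempt))
```

Retrying hides short hiccups. It does not help when the node is out of 
memory, so the final failure still has to be handled by the caller.
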
Mitigation for these issues typically involves using specialized nodes for 
read and write traffic and cluster management. Additionally, you need to 
heavily tweak things to make certain failure scenarios less likely. Out of 
the box, there is a lot of stuff that can go wrong.  
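
To make the specialized nodes idea concrete, here is a small Python sketch 
that generates per-role settings. It assumes the 1.x setting names 
node.master, node.data and discovery.zen.minimum_master_nodes, with the 
latter set to a majority of the master-eligible nodes. The mapping of 
"read and write traffic and cluster management" onto client, data and 
master roles is my reading of the paragraph above, and the function names 
are made up.

```python
def quorum(master_eligible_nodes):
    """Smallest majority of the master-eligible nodes."""
    return master_eligible_nodes // 2 + 1


def node_settings(role, master_eligible_nodes):
    """Return elasticsearch.yml-style settings for one hypothetical role."""
    roles = {
        "master": {"node.master": True, "node.data": False},   # cluster management
        "data": {"node.master": False, "node.data": True},     # holds the shards
        "client": {"node.master": False, "node.data": False},  # routes requests
    }
    settings = dict(roles[role])
    settings["discovery.zen.minimum_master_nodes"] = quorum(master_eligible_nodes)
    return settings


def to_yaml(settings):
    """Render flat settings as `key: value` lines, sorted for stable output."""
    lines = []
    for key in sorted(settings):
        value = settings[key]
        if isinstance(value, bool):
            value = "true" if value else "false"
        lines.append("%s: %s" % (key, value))
    return "\n".join(lines)
```

With three master-eligible nodes the quorum comes out at two, which 
matches the three node example above: one node down is survivable, two 
down is not.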

We're actually deprecating our mariadb architecture and switching to an 
elasticsearch only architecture. I'm well aware that I'm taking a risk here 
and I have a backup plan for most of those risks. This includes changing 
plans and switching to couchdb or a similar document store if elasticsearch 
proves not to be up to the task. However, so far so good. 

On Tuesday, August 26, 2014 6:55:10 AM UTC+2, Mo wrote:
>
> In general use elasticsearch only as a secondary index. Have a copy of 
> data somewhere else which is more reliable. Elasticsearch often runs into 
> index corruption issues which are hard to resolve.
>
>
> On Mon, Aug 25, 2014 at 9:30 PM, <[email protected]> wrote:
>
>>
>> On Tuesday, August 26, 2014 6:46:12 AM UTC+8, Raphael Waldmann wrote:
>>>
>>> Hi, 
>>>
>>> First I would like to thank all of you for Elastic. I am thinking of 
>>> using it in an ERP that I am building. What do you think about this? Am I 
>>> crazy?
>>>
>>> Has anyone faced this? I really don't think that I am comfortable enough 
>>> to do this: trading the problems that I already know for new problems 
>>> that I really don't know how to deal with. 
>>>
>>> I believe that nosql will prevail over traditional sql, but I don't know 
>>> if I am ready for this task.
>>>
>>> So how do you think I should integrate (or not) postgresql with 
>>> ELASTICSEARCH?
>>>
>>
>> Do you plan to use ES to index data in postgresql?
>>
>> I have a similar idea: I want to use ES instead of a data warehouse.
>>
>> Some problems I can see:
>> 1) Data in an RDBMS is stored in tables connected by relationships. You 
>> can use very complex SQL to query a complex result; how do you do that 
>> in ES? 
>> 2) If you want to run some analysis algorithms on existing data, how do 
>> you run them in ES?
>> 3) If your data is big enough, will searching for one keyword in the 
>> '_all' field be slow in ES? 
>>
>>
>> Thanks.
>> -Terrs
>>
>> Thanks again,
>>>
>>>
>>> rsw1981
>>>
>>  -- 
>> You received this message because you are subscribed to the Google Groups 
>> "elasticsearch" group.
>> To unsubscribe from this group and stop receiving emails from it, send an 
>> email to [email protected] <javascript:>.
>> To view this discussion on the web visit 
>> https://groups.google.com/d/msgid/elasticsearch/f5500235-46e8-4c6c-8597-e42d7401d22a%40googlegroups.com
>>  
>> <https://groups.google.com/d/msgid/elasticsearch/f5500235-46e8-4c6c-8597-e42d7401d22a%40googlegroups.com?utm_medium=email&utm_source=footer>
>> .
>>
>> For more options, visit https://groups.google.com/d/optout.
>>
>
>
