From my understanding, which admittedly is limited, there is still 
potential to lose data with Elasticsearch.

Even with the new Snapshot API running regularly, if all indexes get 
corrupted, there is no guarantee of a 100% backup and restore, because 
you would lose any data that was added or updated after your last 
snapshot.
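
To make that window concrete, here is a rough sketch of a scheduled snapshot call plus the resulting exposure window. It only builds the request for the snapshot endpoint (PUT /_snapshot/<repository>/<snapshot>) and does not send it. The node address, the repository name "my_backup" and the snapshot naming scheme are my assumptions, and the repository would have to be registered beforehand.

```python
import json
from datetime import datetime, timedelta, timezone
from urllib.request import Request

ES_URL = "http://localhost:9200"  # assumed node address
REPOSITORY = "my_backup"          # hypothetical repository, registered beforehand


def build_snapshot_request(now, indices="_all"):
    """Build (but do not send) a PUT request that creates a timestamped snapshot."""
    name = "snap-" + now.strftime("%Y%m%d-%H%M%S")  # snapshot names must be lowercase
    url = f"{ES_URL}/_snapshot/{REPOSITORY}/{name}?wait_for_completion=true"
    body = json.dumps({"indices": indices}).encode("utf-8")
    return Request(url, data=body, method="PUT",
                   headers={"Content-Type": "application/json"})


def exposure_window(last_snapshot, failure):
    """Writes made during this interval are not covered by any snapshot."""
    return failure - last_snapshot
```

Whatever the schedule, writes made inside exposure_window(last_snapshot, failure) are gone unless they can be replayed from some other source.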



On Tuesday, 14 January 2014 13:23:55 UTC, amos.wood wrote:
>
> For one of our projects, we also use Elasticsearch as the sole database.  The 
> only consideration to make is that while gets by id are real-time, all 
> other searches are subject to the "refresh interval" setting of a 
> particular index/table.  We overcame this problem by:
>
> 1. Setting the refresh_interval to 25ms.
> 2. Pausing for 25ms after a write to our service before returning a 
> successful write to the client.
> 3. Putting an automatic retry mechanism on particular calls.  This helped 
> when the index servers had heavy traffic and the refresh actually took 
> more than 25ms.  This scenario happened when a client wrote a record 
> and immediately wanted to get it by a field other than its id.
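
For what it's worth, the three steps above could be sketched roughly like this. The index_doc and search callables are hypothetical stand-ins for whatever client code talks to the cluster, and the retry count is an assumption; only the refresh_interval index setting itself comes from Elasticsearch.

```python
import time

REFRESH_INTERVAL_S = 0.025  # mirrors refresh_interval = 25ms from step 1

# Step 1: body for PUT /<index>/_settings
REFRESH_SETTINGS = {"index": {"refresh_interval": "25ms"}}


def write_then_wait(index_doc, doc, sleep=time.sleep):
    """Step 2: index the document, then pause one refresh interval before acknowledging."""
    result = index_doc(doc)  # index_doc is a hypothetical callable
    sleep(REFRESH_INTERVAL_S)
    return result


def search_with_retry(search, attempts=5, sleep=time.sleep):
    """Step 3: retry a search that may have run before the refresh happened."""
    hits = []
    for attempt in range(attempts):
        hits = search()  # search is a hypothetical callable returning a list of hits
        if hits:
            return hits
        if attempt < attempts - 1:
            sleep(REFRESH_INTERVAL_S)  # give the refresh another interval to complete
    return hits
```

Note that the retry cannot tell "not refreshed yet" from "no such record", so it only makes sense on calls where a hit is expected.
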
>
> On Monday, January 13, 2014 11:24:00 AM UTC-6, Eugene Strokin wrote:
>>
>> It seems like you don't really need search, just filtering, so 
>> you'd use a subset of the features of Elasticsearch. But why do you think 
>> you cannot use ES as a DB? What would be your concern?
>> Just so you know, I have used ES as the only storage for one of my 
>> projects for over a year now, for a Big Data/big-traffic application. And 
>> if you do things right, you should be all right as well.
>>
>> Eugene
>>
>> On Monday, January 13, 2014 5:31:24 AM UTC-5, Xie Lebing wrote:
>>>
>>> Not a joke.
>>>
>>> We have an event log (userid, timestamp, action, entity ....) which 
>>> records players' essential activities and is used for customer service. The 
>>> volume is around 10-15 million rows a day and is kept for 3 months. The search 
>>> conditions can be complicated, such as userid + time range + activities, 
>>> or time range + activities, and so on.
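
As an illustration, both of those search shapes map onto a single filter-only query. This is a sketch that assumes the field names from the log description (userid, timestamp, action) and uses the bool/filter form of the query DSL found in later Elasticsearch releases; older releases expressed the same thing with a filtered query.

```python
def build_event_query(start, end, actions, userid=None):
    """Build a search body for: [userid +] time range + activities."""
    filters = [
        {"range": {"timestamp": {"gte": start, "lt": end}}},
        {"terms": {"action": actions}},
    ]
    if userid is not None:
        # Most selective clause first; this only affects readability here.
        filters.insert(0, {"term": {"userid": userid}})
    return {"query": {"bool": {"filter": filters}}}
```

Calling build_event_query("2014-01-01", "2014-01-08", ["login", "purchase"], userid=42) covers the first case, and leaving out userid covers the second.
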
>>>
>>> Currently 3 solutions are considered:
>>>
>>> 1. Use a MongoDB cluster to hold the data.
>>> 2. Use ES to index the log and for searching. Easy to set up and 
>>> maintain.
>>> 3. Use HBase, but we would have to create multiple "indexes".
>>>
>>> Any ideas about that? Thanks!
>>>
>>>
>>>  
