Hi,

Recently I was looking for a real-time analytics tool, and I found ES.

Basically, we want to show aggregate information for multiple customers, and 
this information has to be shown in real time. The information expires 
after one month. We want to use aggregations for the analytics.

Documents have around 5 to 10 fields, basically all integers and 
integer arrays. We want to try indexing around 120M documents every day (this 
can increase). I did a POC writing 1000 docs/s using the ES Java API 
with bulk inserts, and the performance decreased with every write.
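
To put the numbers above side by side, here is a small back-of-the-envelope sketch in plain Java (no ES dependency). It assumes the load is spread evenly over the day and that "one month" means 30 days; both are simplifications, and the class and method names are only illustrative. Under those assumptions, 120M docs/day works out to about 1,389 docs/s on average, so the 1000 docs/s used in the POC is below the sustained rate we would need, and roughly 3.6 billion documents would be retained at any time.

```java
// Rough sizing arithmetic for the ingest target described above.
// Assumptions (not from any ES API): even load across the day, 30-day retention.
class IngestRate {
    static final long SECONDS_PER_DAY = 24L * 60 * 60; // 86,400

    // Average documents per second needed to index docsPerDay in one day.
    static double requiredDocsPerSecond(long docsPerDay) {
        return (double) docsPerDay / SECONDS_PER_DAY;
    }

    // Total documents held at once if each document lives retentionDays.
    static long retainedDocs(long docsPerDay, int retentionDays) {
        return docsPerDay * retentionDays;
    }

    public static void main(String[] args) {
        long docsPerDay = 120_000_000L; // target from the post
        double pocRate = 1000.0;        // rate used in the POC

        double required = requiredDocsPerSecond(docsPerDay);
        System.out.printf("required average rate: %.0f docs/s%n", required);
        System.out.printf("POC rate covers %.0f%% of that%n", 100.0 * pocRate / required);
        System.out.printf("documents retained over 30 days: %d%n",
                retainedDocs(docsPerDay, 30));
    }
}
```

Peak traffic would likely be higher than this average, so the real target rate is probably above this figure.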

The POC was very simple: a developer machine with an Intel Core i5, a 250 GB SSD, and 16 GB RAM 
(heap Xmx = 8 GB). I used a configuration similar to this one: 
https://gist.github.com/reyjrar/4364063

The structure for indexing is: http://localhost:port/category/campaign_id/

We want to deploy a cluster on EC2 for this POC. Does anyone have advice 
about:
- Which EC2 instance type to use, and how many?
- Replica strategy and number of shards?
- How to structure the indices for these documents (we were thinking of creating an index for each 
campaign_id)?
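
To make the index-structure question more concrete, here is a sketch of one alternative we could compare against an index per campaign_id: one index per day, so that data older than a month can be dropped by deleting whole indices. This is only an idea, not something we have tested. The "analytics-" prefix, the date pattern, and the 30-day retention are assumptions for illustration, and the code only computes index names; it does not call ES.

```java
import java.time.LocalDate;
import java.time.format.DateTimeFormatter;
import java.util.ArrayList;
import java.util.List;

// Naming helper for a hypothetical one-index-per-day layout.
class DailyIndexNames {
    static final String PREFIX = "analytics-"; // hypothetical index prefix
    static final DateTimeFormatter FMT = DateTimeFormatter.ofPattern("yyyy.MM.dd");

    // Name of the index that would hold documents for the given day.
    static String indexFor(LocalDate day) {
        return PREFIX + day.format(FMT);
    }

    // True if the index for indexDay is older than the retention window.
    static boolean isExpired(LocalDate indexDay, LocalDate today, int retentionDays) {
        return indexDay.isBefore(today.minusDays(retentionDays));
    }

    // Names of all indices still inside the retention window, newest first.
    static List<String> liveIndices(LocalDate today, int retentionDays) {
        List<String> names = new ArrayList<>();
        for (int i = 0; i <= retentionDays; i++) {
            names.add(indexFor(today.minusDays(i)));
        }
        return names;
    }

    public static void main(String[] args) {
        LocalDate today = LocalDate.now();
        System.out.println("write to:   " + indexFor(today));
        System.out.println("oldest live: " + indexFor(today.minusDays(30)));
        System.out.println("live index count: " + liveIndices(today, 30).size());
    }
}
```

With this layout campaign_id would be a regular field that queries filter on, instead of being part of the index name.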



