Hi, recently I was searching for a real-time analytics tool and found ES.
We want to show aggregate information for multiple customers, and this information has to be shown in real time. The information expires after one month. Basically, we want to use aggregations for analytics. Each document has around 5 to 10 fields, all of them integers or integer arrays.

We want to index around 120M documents every day (this can increase). I did a POC writing 1000 docs/s using the ES Java API with bulk inserts, and performance decreased as the writes went on. The POC was very simple: a developer machine with an Intel Core i5, 250 GB SSD, and 16 GB RAM (Xmx=8g). I used a configuration similar to this one: https://gist.github.com/reyjrar/4364063

The structure for indexing is: http://localhost:port/category/campaign_id/

We want to deploy a cluster in EC2 for this POC. Does anyone have advice about:
- Which EC2 instance type to take, and how many?
- Replica strategy and number of shards?
- Structure for indexing documents (we were thinking of creating an index for each campaign_id)
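For a rough sense of scale, the figures above can be turned into an average ingest rate and a steady-state document count. This is only a back-of-the-envelope sketch: it assumes "one month" means 30 days and that writes are spread evenly over the day, which real traffic probably is not.

```python
# Back-of-the-envelope sizing from the figures in this post.
DOCS_PER_DAY = 120_000_000   # stated daily volume (may grow)
RETENTION_DAYS = 30          # assumption: "one month" taken as 30 days
POC_RATE = 1_000             # docs/s written in the POC

SECONDS_PER_DAY = 24 * 60 * 60

avg_rate = DOCS_PER_DAY / SECONDS_PER_DAY      # average docs/s needed
retained_docs = DOCS_PER_DAY * RETENTION_DAYS  # docs held at steady state
poc_docs_per_day = POC_RATE * SECONDS_PER_DAY  # what 1000 docs/s sustains

print(f"average ingest rate: {avg_rate:.0f} docs/s")
print(f"documents retained:  {retained_docs:,}")
print(f"POC rate covers:     {poc_docs_per_day:,} docs/day")
```

So the 1000 docs/s used in the POC is below the average rate that 120M documents per day implies, before accounting for peaks or growth.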