Not really, the smaller server you mentioned before would be suitable.
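
To put rough numbers on the sizing advice quoted below, here is a minimal Python sketch; it is not anything shipped with Elasticsearch. It assumes the 50GB shard ceiling and the roughly 2 billion docs per shard figure mentioned in the thread. The function names, the `docs` index prefix and the sample inputs are hypothetical, chosen only for illustration.

```python
import math
from datetime import date

# Figures taken from the thread; treat them as rules of thumb, not exact limits.
MAX_SHARD_GB = 50                    # recommended ceiling per shard
MAX_DOCS_PER_SHARD = 2_000_000_000   # approximate Lucene per-shard doc limit


def monthly_index_name(prefix: str, day: date) -> str:
    """Build a time-based index name such as 'docs-2015.04' (naming is an assumption)."""
    return f"{prefix}-{day:%Y.%m}"


def primary_shards_needed(monthly_gb: float, monthly_docs: int) -> int:
    """Smallest primary shard count that keeps each shard under both ceilings."""
    by_size = math.ceil(monthly_gb / MAX_SHARD_GB)
    by_docs = math.ceil(monthly_docs / MAX_DOCS_PER_SHARD)
    return max(1, by_size, by_docs)


if __name__ == "__main__":
    # Hypothetical month: 250GB of primary data, 200 million documents.
    print(monthly_index_name("docs", date(2015, 4, 23)))
    print(primary_shards_needed(250, 200_000_000))
```

With those assumed inputs, `primary_shards_needed(250, 200_000_000)` gives 5, so a monthly index such as `docs-2015.04` would get five primaries rather than a single routed shard.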
On 23 April 2015 at 11:04, Jack Park <jackp...@topicquests.org> wrote:

> That starts to argue for lots of smaller servers, maybe even with smaller
> SSDs. Say, a low power i3 with 16 or 23gb ram, and a 128gb SSD. Is that
> right?
>
> On Wed, Apr 22, 2015 at 2:56 PM, Mark Walkom <markwal...@gmail.com> wrote:
>
>> If you are using time series data then you should be using time series
>> indices. As Fred pointed out, routing an entire month's worth of data to a
>> single shard is not going to scale.
>>
>> Also, we recommend that you keep shard size below 50GB; this helps with
>> recovery and distribution. There is also a hard 2 billion doc per shard
>> limit in the underlying Lucene engine; if you hit this then you may lose
>> data.
>>
>> On 23 April 2015 at 03:12, Kimbro Staken <ksta...@kstaken.com> wrote:
>>
>>> Hello Fred,
>>>
>>> I have clusters as large as 200 billion documents/130TB. Sharing
>>> experiences on that would require a book, but a couple of quick things
>>> that jumped out at me.
>>>
>>> 1. Do not go the huge server route. Elasticsearch works best when you
>>> scale it horizontally. The 64GB route is a much better option.
>>>
>>> 2. If I understand correctly you're routing an entire month's data to a
>>> single shard? By doing that you're directing all activity on that shard to
>>> a single machine, or small set of machines if you have replicas. That has
>>> to be much slower than if you were to do something like use a monthly index
>>> with a reasonable number of shards to spread that load across the cluster.
>>> That is also creating shard sizes that are fairly large, and if you have
>>> month to month variation in data rates you'll end up with "lumpy" shard
>>> sizes which will definitely cause issues if you ever run your cluster low
>>> on disk space.
>>>
>>> 3. Get off of ES 1.3 as fast as you can. 8TB spread across 37 machines
>>> is very low density; as you push more data in you don't want to be on ES
>>> 1.3.
>>>
>>> 4. If you're not already using doc_values start looking into it now.
>>> Managing heap memory is, let's be nice and call it, "a challenge", and
>>> fielddata can eat heap in ways that will make your head spin.
>>>
>>> Kimbro Staken
>>>
>>> On Wed, Apr 22, 2015 at 1:14 AM, <fdevilla...@synthesio.com> wrote:
>>>
>>>> Hi list,
>>>>
>>>> I've been using ES in production since 0.17.6 with clusters up to 64
>>>> virtual machines and 20T data (including 3 replicas). We're now thinking
>>>> about pushing things a bit further and I wondered if people here had
>>>> similar experience / needs as we do.
>>>>
>>>> Our current index is 1.1 billion unique documents, 8TB data (including
>>>> 1 replica) on 37 physical machines (32 data nodes, 3 master nodes and 2
>>>> nodes dedicated to http requests) with ES 1.3 (upgrade to 1.5 already
>>>> planned). We're indexing about 2500 new documents / second and everything's
>>>> fine so far.
>>>>
>>>> Our goal is to index (and search) about 30 billion more documents (the
>>>> backdata) + about 200 million new documents each month.
>>>>
>>>> Our company is providing analytics dashboards to their clients, and
>>>> they mostly browse their data on a monthly scale, so we're routing
>>>> documents monthly. Each shard makes between 200 and 250G. The index is made
>>>> of 128 shards, which makes about 10 years of data with 1 month per shard.
>>>> Considering what we already have, we should reach 240T of data (and
>>>> counting) with a single replica after we index all our backdata.
>>>>
>>>> So, my questions here:
>>>>
>>>> - Has someone here the same use / amount of data as we do?
>>>>
>>>> - Is ES the right technology to do realtime, lightspeed queries
>>>>   (filtered queries and high cardinality aggregations) on such an amount of
>>>>   data?
>>>>
>>>> - What were the traps to avoid?
>>>>   Is it better to add lots of medium
>>>>   machines (12 core Xeon E5-1650 v2, 64G RAM, 1.8T SAS 15k hard drives) or a
>>>>   few huge machines with petabytes of RAM, terabytes of SSD and multiple ES
>>>>   processes?
>>>>
>>>> Any feedback on similar situation is indeed appreciated.
>>>>
>>>> Have a nice day,
>>>> Fred
>>>>
>>>> --
>>>> You received this message because you are subscribed to the Google
>>>> Groups "elasticsearch" group.
>>>> To unsubscribe from this group and stop receiving emails from it, send
>>>> an email to elasticsearch+unsubscr...@googlegroups.com.
>>>> To view this discussion on the web visit
>>>> https://groups.google.com/d/msgid/elasticsearch/6865703f-2302-4fe0-b929-eb9fbe55a84a%40googlegroups.com.
>>>> For more options, visit https://groups.google.com/d/optout.