Are you saying that a cluster of 10 servers with 2 CPUs and 7.5 GB RAM each (20 CPUs and 75 GB RAM in total) would be more powerful than the 3 servers with 8 CPUs and 30 GB RAM each (24 CPUs and 90 GB RAM in total)? Assuming the data were spread equally across the nodes.
By the way, what about shard allocation? Currently I use the default of 5 shards and 1 replica. Could this be a potential target for optimisation? What should the shard scheme look like on a cluster with a larger number of nodes? (A sketch of setting this via an index template follows at the end of the thread, below.)

Regards,

On Friday, September 12, 2014 12:11:32 PM UTC+3, Mark Walkom wrote:
>
> The answer is: it depends on what sort of use case you have.
> But if you are experiencing problems like these, then usually it's due to
> the cluster being at capacity and needing more resources.
>
> You may find it cheaper to move to more numerous, smaller nodes that you
> can distribute the load across, as that is where ES excels and also how
> many other big data platforms operate.
>
> Regards,
> Mark Walkom
>
> Infrastructure Engineer
> Campaign Monitor
> email: [email protected]
> web: www.campaignmonitor.com
>
>
> On 12 September 2014 19:01, Pavel P <[email protected]> wrote:
>
>> Java version is 1.7.0_55.
>> Elasticsearch is 1.3.1.
>>
>> Well, the cost of the whole setup is the question.
>> Currently it's about $1000 per month on AWS. Do we really need to pay a
>> lot more than $1000/month to support 1.5 TB of data?
>>
>> Could you briefly describe how many nodes you would expect to handle
>> that much data?
>>
>> The side question is: how do the really big "Big Data" setups work when
>> they search or aggregate over data far larger than 1.5 TB? Or is that,
>> again, a matter of the size of the architecture?
>>
>> Regards,
>>
>> On Friday, September 12, 2014 11:53:35 AM UTC+3, Mark Walkom wrote:
>>>
>>> That's a lot of data for 3 nodes!
>>> You really need to adjust your infrastructure: add more nodes, more
>>> RAM, or alternatively remove some old indexes (delete or close them).
>>>
>>> What ES and Java versions are you running?
>>>
>>> Regards,
>>> Mark Walkom
>>>
>>> Infrastructure Engineer
>>> Campaign Monitor
>>> email: [email protected]
>>> web: www.campaignmonitor.com
>>>
>>>
>>> On 12 September 2014 18:48, Pavel P <[email protected]> wrote:
>>>
>>>> Hi,
>>>>
>>>> Again I have an issue with the power of the cluster.
>>>>
>>>> I have a cluster of 3 servers, each with 30 GB RAM, 8 CPUs and a 1 TB
>>>> disk attached.
>>>>
>>>> <https://lh4.googleusercontent.com/-W1AVatn9Cq0/VBKzYgR3QKI/AAAAAAAAAJc/S3TWMBqqqX0/s1600/ES_cluster.png>
>>>>
>>>> There are 1,323,957,069 docs (1.64 TB) there; the document
>>>> distribution is as follows:
>>>>
>>>> <https://lh5.googleusercontent.com/-kjlQG7xBfIw/VBKwCt8sKQI/AAAAAAAAAJQ/s8kuqouFUkQ/s1600/Screen%2BShot%2B2014-09-12%2Bat%2B11.33.49%2BAM.png>
>>>>
>>>> All 3 nodes are data nodes.
>>>>
>>>> The indexing throughput is roughly 10-20k documents per minute
>>>> (it's a logstash -> elasticsearch setup; we store various logs in the
>>>> cluster).
>>>>
>>>> My concerns are the following:
>>>>
>>>> 1. When I load the Kibana index page, the document-types panel takes
>>>> about a minute to load. Is that OK?
>>>> 2. For the document type user_account, when I try to build a terms
>>>> panel for the field "message.raw" (a string of 20-30 characters), my
>>>> cluster gets stuck.
>>>> In the logs I find the following:
>>>>
>>>> [2014-09-11 08:03:34,507][ERROR][indices.fielddata.breaker] [morbius]
>>>>> New used memory 6499531395 [6gb] from field [message.raw] would be
>>>>> larger than configured breaker: 6414558822 [5.9gb], breaking
>>>>
>>>> But despite the breaker, when it tries to compute that terms pie, the
>>>> cluster stops indexing incoming documents. The queue grows. Then heap
>>>> exceptions appear, and the only way I have found to recover is to
>>>> reboot the cluster. (See the doc_values sketch after the thread below.)
>>>>
>>>> *My question is the following:*
>>>>
>>>> It looks like I have quite powerful servers and a correct
>>>> configuration (my ES_HEAP_SIZE is set to 15g), yet they are still
>>>> unable to process the 1.5 TB of information, or they do so quite
>>>> slowly. Do you have any advice on how to overcome that and make my
>>>> cluster respond faster? How should I adjust the infrastructure?
>>>>
>>>> What hardware would I need to handle 1.5 TB in a reasonable amount
>>>> of time?
>>>>
>>>> Any thoughts are welcome.
>>>>
>>>> Regards,
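
On the shard question above: the 5-shards/1-replica default is set per index, and the shard count can only be chosen at index-creation time, so in a logstash setup the usual lever is an index template that applies to every new daily index. Below is a minimal sketch using the Python elasticsearch client against the 1.x-era template API; the template name, host URL and shard numbers are illustrative assumptions, not values from the thread.

    # Sketch: choose shard/replica counts for future logstash-* indices
    # via an index template (shard count is fixed once an index exists).
    # Assumptions: elasticsearch-py client, a node at localhost:9200,
    # and e.g. a 10-data-node cluster; tune numbers to your node count.
    from elasticsearch import Elasticsearch

    es = Elasticsearch(["http://localhost:9200"])

    es.indices.put_template(
        name="logstash_shards",        # hypothetical template name
        body={
            "template": "logstash-*",  # 1.x-era index-name pattern
            "settings": {
                # e.g. one primary per data node; replicas add search
                # capacity and redundancy at the cost of disk and
                # indexing work
                "index.number_of_shards": 10,
                "index.number_of_replicas": 1,
            },
        },
    )

Note that existing indices keep their current layout; only indices created after the template is installed pick it up.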
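For the terms-panel problem quoted above: the breaker trips because a terms aggregation on message.raw loads all of the field's values into on-heap fielddata. On ES 1.3 a not_analyzed string field can instead be mapped with doc_values, which keeps that structure on disk. A hedged sketch, again via a template (the type and field names mirror the thread; the template name is made up); this too only affects newly created indices, so older ones would need reindexing:

    # Sketch: map message.raw as not_analyzed with doc_values so terms
    # aggregations read a disk-based structure instead of filling the
    # heap (the ~6 GB that trips the breaker in the quoted log).
    es.indices.put_template(
        name="logstash_docvalues",     # hypothetical template name
        body={
            "template": "logstash-*",
            "mappings": {
                "user_account": {      # document type from the thread
                    "properties": {
                        "message": {
                            "type": "string",
                            "fields": {
                                "raw": {
                                    "type": "string",
                                    "index": "not_analyzed",
                                    "doc_values": True,  # on-disk fielddata
                                }
                            },
                        }
                    }
                }
            },
        },
    )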
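To see how much heap that field actually consumes per node, before and after any mapping change, the _cat/fielddata and node-stats APIs (both present in ES 1.x) can be queried; a small sketch with the same assumed client:

    # Per-node fielddata heap usage for the problem field.
    print(es.cat.fielddata(fields="message.raw", v=True))

    # Total fielddata memory and eviction counts per node.
    stats = es.nodes.stats(metric="indices", index_metric="fielddata")
    for node in stats["nodes"].values():
        print(node["name"], node["indices"]["fielddata"])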
