Heavy aggregations = lots of RAM.
For storage, use SSDs if you can.

The only rule of thumb is to get the best hardware you can afford.

Regards,
Mark Walkom

Infrastructure Engineer
Campaign Monitor
email: [email protected]
web: www.campaignmonitor.com


On 4 August 2014 13:09, John Cherniavsky <[email protected]> wrote:

> SAN question aside - what are the guidelines on balancing CPU/RAM/storage
> so that no one component is the obvious bottleneck?
>
> I know it depends on workload, so
>
> * For aggregation heavy workloads, about how much RAM : Storage?
>
> * For high-volume but smaller queries (individual log retrieval), what's
> the right CPU : Storage ratio for spinning disk? Too much CPU and all the
> extra queries are waiting on the disks to return; too much disk and the CPU
> can't keep up (or does that never happen?)
>
> Obviously every configuration is different - so does anyone have
> guidelines or past experience?
>
> On Sunday, August 3, 2014 1:49:09 PM UTC-7, Jörg Prante wrote:
>>
>> A. There are many unknown factors regarding "SAN storage", e.g. what are
>> the latency and the IOPS? Most SANs are black boxes and do not scale with
>> the number of connected hosts, so you should test yours thoroughly to make
>> an educated decision. There is no simple "yes" or "no". As a matter of fact,
>> I would never use a SAN, only local storage, because a SAN comes with the
>> risk of being a bottleneck.
>>
>> B. No matter the specifications, you should first test whether your
>> configuration meets your performance requirements; there is no "yes" or
>> "no". The minimum number of nodes is 3, to avoid situations like split
>> brain.
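>>
>> For example, with three master-eligible nodes, a minimal elasticsearch.yml
>> sketch (assuming the 1.x zen discovery settings) would pin the quorum
>> explicitly:
>>
>>   # quorum for 3 master-eligible nodes: (3 / 2) + 1 = 2
>>   discovery.zen.minimum_master_nodes: 2
>>
>> This stops two isolated halves of the cluster from each electing a master.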
>>
>> C. You should expect more throughput if you can decouple client workload
>> from server workload, but that also depends on your workload pattern and
>> your tests. For example, if you must preprocess data before indexing, or
>> postprocess search results, additional nodes will be a great help.
>>
>> Jörg
>>
>>
>> On Sun, Aug 3, 2014 at 9:26 PM, sirkubax <[email protected]>
>> wrote:
>>
>>> Hi,
>>>
>>> I'm testing/planning an implementation for 16 TB of log data (1 month of
>>> daily indexes, about 530 GB/day). Indexes are deleted after 1 month (the
>>> TTL is 1 month).
>>>
>>> Document sizes vary from a few bytes to 1 MB (average ~3 kB).
>>>
>>> We have 2 data centers, and the requirement is to provide access to the
>>> dataset when one of them is down.
>>>
>>> My current implementation looks like this:
>>>
>>>   cluster.routing.allocation.awareness.attributes: datacenter
>>>
>>>   cluster.routing.allocation.awareness.force.datacenter.values:
>>> datacenterA,datacenterB
>>>
>>> So the indexes are located on nodes in datacenterA and datacenterB.
>>> There is 1 replica for each index, so the index/replica pair is balanced
>>> between locations.
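>>>
>>> For completeness, a sketch of the per-node side of that setup (assuming
>>> the attribute name "datacenter" as above, and 1.x-style node attributes):
>>>
>>>   # elasticsearch.yml on every node in datacenter A
>>>   node.datacenter: datacenterA
>>>   cluster.routing.allocation.awareness.attributes: datacenter
>>>   cluster.routing.allocation.awareness.force.datacenter.values: datacenterA,datacenterB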
>>>
>>> The problem A:
>>>
>>>  I have been offered SAN storage space that could be provided to any of
>>> the ES node machines. In the current index/replica scenario, I need 2 * 16
>>> TB = 32 TB of disk storage. With RAID 1, that makes 64 TB of "real world"
>>> disk storage.
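>>>
>>> Roughly, assuming ~530 GB/day and 30-day retention, the arithmetic is:
>>>
>>>   530 GB/day * 30 days    ~ 15.9 TB of primary data
>>>   * 2 (one replica)       ~ 31.8 TB on disk
>>>   * 2 (RAID 1 mirroring)  ~ 63.6 TB of raw disk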
>>>
>>> Providing "independent, high quality" storage may (if ES allowed it)
>>> reduce that to the required 16 TB. I said "if ES allowed it" because, to my
>>> current knowledge, nodes cannot "share" a dataset. If many nodes run on
>>> common storage, each creates its own unique path. Is that correct?
>>>
>>>  Could I run an ES cluster where indexes have no replica, but where a
>>> nodeX failure still does not affect the cluster's access to nodeX's dataset?
>>>
>>> In my current idea of the no-replica scenario, powering off (or a failure
>>> of) NodeXDatacenterA would make datasetX unavailable for reading in the
>>> cluster, at least until I start NodeXDatacenterB, which would have access
>>> to datasetX (the same path configuration). Of course NodeXDatacenterA and
>>> NodeXDatacenterB could not both run at the same time.
>>>
>>> I can only guess that the workaround suggested above is not "in the ES
>>> philosophy of shared storage and self-balancing". It would make upgrading a
>>> single node problematic, make the cluster less fault-tolerant, etc.
>>>
>>>  What makes me consider this solution is that I have available some
>>> "24-core, 64 GB RAM, limited disk storage" machines and 16 TB of SAN
>>> storage that I could mount on those machines.
>>>
>>>  Do you have any suggestions on using SAN storage? Is it a good idea
>>> at all?
>>>
>>>  The problem B: Design
>>>
>>>  My current idea for building the environment is to order N (6-8? or
>>> more) machines with big HDDs and run a "normal ES cluster" with shards and
>>> replicas stored locally.
>>>
>>> The question is: how many of them would be enough? :)
>>>
>>> With 24 cores, 64 GB RAM and 4 TB each, that would make 4 machines for a
>>> minimal cluster in a single datacenter, and 8 machines total for both
>>> datacenters. What do you think about the likely performance?
>>>
>>> Actually, to be storage-safe, I would go for 6-8 TB of disk storage per
>>> machine. That would allow running on fewer than 4 nodes while operating
>>> in a single datacenter.
>>>
>>> I wonder if 64GB RAM would be enough.
>>>
>>> The whole process of acquiring new servers takes time - is there a "good
>>> practice" guide to determine the minimum number of servers in the cluster?
>>>
>>>  How many shards would you suggest?
>>>
>>>  Question C:
>>>
>>>  I have seen performance advice to make the "client" ES nodes machines
>>> without data storage, so they would not suffer from I/O issues. If I have
>>> 2 of them, how would you scale that?
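>>>
>>> A client-only node would be configured roughly like this (assuming the
>>> 1.x settings):
>>>
>>>   # elasticsearch.yml for a "client" node
>>>   node.master: false   # not eligible as master
>>>   node.data: false     # holds no shards; only routes requests and
>>>                        # reduces/aggregates search responses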
>>>
>>> Do you think it's worth having 2 client-only machines, or would it be
>>> better to add 2 more "complete" nodes with data storage to the ES cluster
>>> (so 10 nodes instead of 8)?
>>>
>>>
>>>
>>>  --
>>> You received this message because you are subscribed to the Google
>>> Groups "elasticsearch" group.
>>> To unsubscribe from this group and stop receiving emails from it, send
>>> an email to [email protected].
>>> To view this discussion on the web visit
>>> https://groups.google.com/d/msgid/elasticsearch/0565daed-f398-48da-be62-8646844581d0%40googlegroups.com.
>>> For more options, visit https://groups.google.com/d/optout.
>>>
