Hi,

I'm testing and planning an implementation for 16 TB of log data (one 
month's worth, in daily indexes of about 530 GB/day). Indexes are deleted 
after one month (the TTL is one month).

Document sizes vary from a few bytes to 1 MB (average ~3 KB).

We have two data centers, and the requirement is to keep the dataset 
accessible when one of them is down.

My current implementation looks like this:

  cluster.routing.allocation.awareness.attributes: datacenter

  cluster.routing.allocation.awareness.force.datacenter.values: datacenterA,datacenterB

So the indexes are located on nodes in both datacenterA and datacenterB. 
There is one replica for each index, so primaries and replicas are balanced 
between the two locations.
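For completeness, the awareness settings only take effect if each node also 
declares its own datacenter attribute. A minimal per-node sketch (the node 
name is hypothetical):

```yaml
# elasticsearch.yml on a node in datacenterA (node name is illustrative)
node.name: node1-datacenterA
node.datacenter: datacenterA  # custom attribute the awareness settings refer to

# the same awareness settings on every node in the cluster
cluster.routing.allocation.awareness.attributes: datacenter
cluster.routing.allocation.awareness.force.datacenter.values: datacenterA,datacenterB
```

With forced awareness, ES will not allocate both copies of a shard to the 
same datacenter, so losing one site still leaves a full copy on the other.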

The problem A: SAN storage

I have been offered SAN storage space that could be attached to any of the 
ES node machines. In the index/replica scenario, I need 2 × 16 TB = 32 TB 
of disk storage. If that is on RAID 1, it comes to 64 TB of "real world" 
disk storage.

Providing "independent, high quality" storage might (if ES allowed it) 
reduce that to the required 16 TB. I say "if ES allowed it", because as far 
as I know, nodes cannot "share" a dataset. If many nodes run on common 
storage, each creates its own unique data path. Is that correct?

Could I run an ES cluster where indexes have no replicas, but a nodeX 
failure still would not affect the accessibility of nodeX's dataset to the 
cluster?

In my current idea of the no-replica scenario, powering off (or failure of) 
"NodeXDatacenterA" would make datasetX unavailable for reading in the 
cluster, at least until I started "NodeXDatacenterB", which would have 
access to datasetX (the same path configuration). Of course, 
NodeXDatacenterA and NodeXDatacenterB could never both run at the same time.
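To make the cold-standby idea concrete, this is roughly what the standby 
node's configuration would look like (the mount point and node names are 
hypothetical, and this is only a sketch of my idea, not something I know ES 
supports safely):

```yaml
# elasticsearch.yml on the standby twin in datacenterB
# WARNING: only ONE of the two twins may run at any time --
# ES does not arbitrate access to the shared path, and running
# both nodes against the same data risks index corruption.
node.name: nodeX-datacenterB
path.data: /mnt/san/nodeX       # same SAN volume that nodeX-datacenterA uses
index.number_of_replicas: 0     # no replicas; the SAN itself is the redundancy
```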

I suspect that the workaround suggested above goes against the ES 
philosophy of shared-nothing storage and self-balancing. It would make 
upgrading a single node problematic, be less fault-tolerant, etc.

What makes me consider this solution is that I have some "24-core, 64 GB 
RAM, limited disk storage" machines available, plus 16 TB of SAN storage 
that I could mount on those machines.

Do you have any suggestions on SAN storage usage? Is it a good idea at all?

The problem B: Design

My current idea for building the environment is to order N (6-8? or more) 
machines with big HDDs and run a "normal ES cluster" with shards and 
replicas stored locally.

The question is: how many of them would be enough? :)

At 24 cores, 64 GB RAM, and 4 TB each, that makes 4 machines for a minimal 
cluster in a single datacenter, and 8 machines total for both datacenters. 
What do you think about the likely performance?

Actually, to be safe on storage, I would go for 6-8 TB of disk per machine. 
That would allow operating on fewer than 4 nodes while running in a single 
datacenter.

I wonder if 64 GB of RAM would be enough.

The whole process of acquiring new servers takes time - is there a "good 
practice" guide for determining the minimum number of servers in the 
cluster?

How many shards would you suggest?
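To make the shard question concrete, here is the arithmetic I am working 
from, with illustrative numbers only: at 530 GB/day and 1 replica, each 
daily index holds 1060 GB in total. With, say, 12 primary shards, that is 
roughly 530 / 12 ≈ 44 GB per primary shard. Expressed as per-index defaults 
(the values are placeholders, not a recommendation):

```yaml
# elasticsearch.yml -- default settings for newly created daily indexes
index.number_of_shards: 12    # ~44 GB per primary at 530 GB/day (illustrative)
index.number_of_replicas: 1   # one full copy per datacenter with forced awareness
```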

Question C:

I have seen performance advice recommending that "client" ES nodes be 
machines without data storage, so they do not suffer from I/O issues. If I 
had two of them, how would you scale that?

Do you think it's worth having 2 client-only machines, or would 2 more 
"complete" nodes with data storage be better as extra nodes in the ES 
cluster (so 10 nodes instead of 8)?
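For reference, a dedicated client node in this setup would be configured 
roughly like this (a sketch; such a node holds no data and is not 
master-eligible, acting only as a coordinating front end for requests):

```yaml
# elasticsearch.yml for a client-only (coordinating) node
node.data: false     # holds no shards, so no local I/O load
node.master: false   # never elected master
```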



-- 
You received this message because you are subscribed to the Google Groups 
"elasticsearch" group.