Hello John!
I'm just wondering how often one of your cluster nodes fails, crashes,
or goes down, or how often you see a disk failure. I'm looking for some
sort of probability of hardware failure.
Thank you.
On 01/19/2016 09:21 PM, John Sumsion wrote:
I have a 24-node cluster, with vnodes set to 256.
'nodetool status <ks>' looks like this for our keyspace:
--  Address  Load       Tokens  Owns (effective)  Host ID                               Rack
UN  <ip01>   588.23 GB  256     11.0%             0c8708a7-b962-4fc9-996c-617da642d9ee  1a
UN  <ip02>   601.33 GB  256     11.3%             5ef60730-0b01-4a8b-a578-d828cdf78a1f  1b
UN  <ip03>   613.02 GB  256     11.5%             dddc78b1-7dc2-4e9f-8e8a-1b52595aa0e3  1a
UN  <ip04>   620.76 GB  256     11.7%             87ac93ff-dc8e-4cd5-842c-0389ce016d70  1b
UN  <ip05>   631.81 GB  256     11.9%             8e1416aa-3e75-4ab5-a2a6-49d26f514115  1d
UN  <ip06>   634.65 GB  256     11.9%             3c97f722-16f5-455c-8f58-71c07ad93d25  1b
UN  <ip07>   634.79 GB  256     11.9%             3e3d41bd-d6e8-4a7e-aee2-7ea16b1dadb9  1d
UN  <ip08>   637.05 GB  256     12.0%             2f26f19a-c88f-4cbe-b865-155c0b66bff0  1b
UN  <ip09>   637.83 GB  256     12.0%             6385e073-5b48-49b3-a85b-e7511fa8b3a0  1a
UN  <ip10>   638.05 GB  256     12.1%             382681e5-c060-4594-ae2a-062a324c12d4  1d
UN  <ip11>   660.22 GB  256     12.4%             ea6aad23-7d93-4989-8898-7505df51298f  1d
UN  <ip12>   674.98 GB  256     12.6%             7d372371-c23f-4235-9e3c-cf030fb52ab3  1a
UN  <ip13>   676.22 GB  256     12.7%             41c4cb98-91ae-43a6-9bc4-11aa6106faad  1d
UN  <ip14>   680.15 GB  256     12.7%             65ac3aef-8a9b-423d-83fb-ed8e41f88ccc  1a
UN  <ip15>   681.35 GB  256     12.8%             e38efc6a-e7eb-4d8e-9069-a0b099bea96e  1d
UN  <ip16>   693.19 GB  256     13.0%             2b9a5d3e-8529-47fe-8d2c-13553a8df91f  1b
UN  <ip17>   696.92 GB  256     13.0%             46382cd1-402c-4200-858c-100dade03fc5  1d
UN  <ip18>   698.17 GB  256     13.1%             a68107e7-8e1a-469e-8dd1-e2d87445fd47  1b
UN  <ip19>   698.92 GB  256     13.1%             662338a7-1f5c-4eaa-926e-9e9fda926504  1a
UN  <ip20>   699.26 GB  256     13.1%             e7c15c56-80e6-4961-9cd9-c1302fbf2026  1a
UN  <ip21>   702.98 GB  256     13.2%             461baba0-60f3-423a-a5cf-e0c482da2dbf  1b
UN  <ip22>   710.27 GB  256     13.3%             ffa9700d-50ef-4b23-92d9-18f8029c8cd6  1d
UN  <ip23>   740.63 GB  256     13.8%             d9c6e2a1-2193-4f32-8426-3bd7ad8bf679  1a
UN  <ip24>   744.12 GB  256     13.9%             ff841094-7624-4dc5-b480-f39138b7f17c  1b
First, the difference in disk usage between 588 GB (lowest) and 744 GB
(highest) is significant: 156 GB. It's probably some skewed pattern in
our partition keys, but we can't confirm that until we get the data
loaded.
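(As a rough illustration of how we measure that spread - a minimal
sketch that just parses the 'nodetool status' columns shown above, not
tooling we actually run:)

    # load_spread.py - minimal sketch: report the load spread across nodes.
    # Assumes data lines shaped like the output above:
    #   UN  <ip>  588.23 GB  256  11.0%  <host-id>  <rack>
    import sys

    loads = []
    for line in sys.stdin:
        parts = line.split()
        # "UN" = Up/Normal; the load value sits in columns 3-4 ("588.23 GB").
        if len(parts) >= 4 and parts[0] == "UN" and parts[3] == "GB":
            loads.append(float(parts[2]))

    if loads:
        lo, hi = min(loads), max(loads)
        print("lowest %.2f GB, highest %.2f GB, spread %.2f GB"
              % (lo, hi, hi - lo))

Run as, e.g., 'nodetool status <ks> | python load_spread.py' (the
script name is just a placeholder).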
Maybe someone will advise against using vnodes altogether, but we need
to be able to add 3 nodes for extra capacity, and we'd rather not have
to rewrite the vnode token-assignment code just to figure out a
rack-safe token reassignment.
Given the above, is there any way to manually adjust tokens (while
still using vnodes) so that we can balance the disk usage out? If so,
is there an easy way to do that in a rack-safe manner?
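(To sketch what that token-assignment code might look like - purely
hypothetical, assuming Murmur3Partitioner's token range of -2^63 to
2^63-1 and the rack names above; it spaces tokens evenly and
interleaves racks so that consecutive ranges land on different racks:)

    # Minimal sketch: evenly spaced Murmur3 tokens handed out round-robin
    # across racks. Hypothetical helper, not an existing Cassandra tool;
    # node and rack names are placeholders.
    MIN_TOKEN = -2**63   # Murmur3Partitioner token range start
    RANGE = 2**64        # total size of the token space

    def evenly_spaced_tokens(num_nodes, tokens_per_node):
        total = num_nodes * tokens_per_node
        return [MIN_TOKEN + (i * RANGE) // total for i in range(total)]

    def assign(nodes_by_rack, tokens_per_node):
        # Interleave racks (1a, 1b, 1d, 1a, ...) so consecutive tokens
        # belong to nodes on different racks.
        racks = [list(nodes) for nodes in nodes_by_rack.values()]
        order = []
        while any(racks):
            for rack in racks:
                if rack:
                    order.append(rack.pop(0))
        tokens = evenly_spaced_tokens(len(order), tokens_per_node)
        assignment = {node: [] for node in order}
        for i, tok in enumerate(tokens):
            assignment[order[i % len(order)]].append(tok)
        return assignment

    if __name__ == "__main__":
        nodes = {"1a": ["ip01", "ip03"],
                 "1b": ["ip02", "ip04"],
                 "1d": ["ip05", "ip06"]}
        for node, toks in assign(nodes, 4).items():
            print(node, toks[:2], "...")

The interleaving is what should buy the rack safety: with racks cycling
1a, 1b, 1d, the replicas for any given range would fall on three
different racks under NetworkTopologyStrategy.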
Thanks,
John...
--
Thanks,
Serj