[jira] [Commented] (CASSANDRA-10430) "Load" report from "nodetool status" is inaccurate
[ https://issues.apache.org/jira/browse/CASSANDRA-10430?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15448933#comment-15448933 ]

Mike Torra commented on CASSANDRA-10430:
----------------------------------------

I am running a relatively small cluster using DataStax Community Cassandra 3.5 in EC2, and I regularly experience this issue. Even without running repair after restarting all nodes, the reported 'load' eventually diverges quite a bit. Is it really supposed to reflect disk usage? I found that not to be the case, so instead I depend on collectd reporting disk usage on my nodes.

$ nodetool status
Datacenter: ap-southeast
========================
Status=Up/Down
|/ State=Normal/Leaving/Joining/Moving
--  Address         Load       Tokens  Owns (effective)  Host ID                               Rack
UN  52.76.137.45    72.89 GB   256     100.0%            47747850-8edb-4f26-9fc0-41cbd9763dd3  1a
UN  52.77.178.30    63.64 GB   256     100.0%            e9817aff-0d12-489e-aa6e-4960e0c43404  1a
UN  52.77.175.217   82.93 GB   256     100.0%            56f44708-cd29-4937-8450-86fe8dbc7445  1b
Datacenter: eu-west
===================
Status=Up/Down
|/ State=Normal/Leaving/Joining/Moving
--  Address         Load       Tokens  Owns (effective)  Host ID                               Rack
UN  52.31.103.14    64.44 GB   256     100.0%            26479115-07fa-4d3a-bbb5-cf491b509946  1b
UN  52.30.151.214   36.61 GB   256     100.0%            298b143c-a2a9-45bb-b9c2-68675a0a46e0  1c
UN  52.210.34.43    48.55 GB   256     100.0%            a723bdf4-8575-4adf-ae13-891deb4bc986  1a
Datacenter: us-east
===================
Status=Up/Down
|/ State=Normal/Leaving/Joining/Moving
--  Address         Load       Tokens  Owns (effective)  Host ID                               Rack
UN  52.205.224.43   141.15 GB  256     100.0%            35b4cf08-fb44-4b2e-869d-707b939e646d  1e
UN  52.204.232.195  1.15 TB    256     100.0%            dfb048f4-c61f-4b77-9d24-5cbf9080a923  1d
UN  52.205.186.242  797.57 GB  256     100.0%            71204c7a-6455-441c-a6a3-282672e01736  1b
Datacenter: us-west-2
=====================
Status=Up/Down
|/ State=Normal/Leaving/Joining/Moving
--  Address         Load       Tokens  Owns (effective)  Host ID                               Rack
UN  52.26.238.177   76.71 GB   256     100.0%            15e0550a-4798-4dc1-95b2-b5749ebece56  2c
UN  52.43.246.80    59.43 GB   256     100.0%            28b009e3-928e-457c-98cb-c39c201b3a7f  2a
UN  52.42.227.38    99.6 GB    256     100.0%            49ec7e6d-b392-464f-918b-09e0cc329c31  2b

> "Load" report from "nodetool status" is inaccurate
> --------------------------------------------------
>
>                 Key: CASSANDRA-10430
>                 URL: https://issues.apache.org/jira/browse/CASSANDRA-10430
>             Project: Cassandra
>          Issue Type: Bug
>          Components: Tools
>         Environment: Cassandra v2.1.9 running on 6 node Amazon AWS, vnodes enabled.
>            Reporter: julia zhang
>         Attachments: system.log.2.zip, system.log.3.zip, system.log.4.zip
>
>
> After running an incremental repair, nodetool status reports unbalanced load across the cluster.
> $ nodetool status mykeyspace
> ==
> ||Status||Address||Load||Tokens||Owns (effective)||Host ID||Rack||
> |UN|10.1.1.1|1.13 TB|256|48.5%|a4477534-a5c6-4e3e-9108-17a69aebcfc0|RAC1|
> |UN|10.1.1.2|2.58 TB|256|50.5%|1a7c3864-879f-48c5-8dde-bc00cf4b23e6|RAC2|
> |UN|10.1.1.3|1.49 TB|256|51.5%|27df5b30-a5fc-44a5-9a2c-1cd65e1ba3f7|RAC1|
> |UN|10.1.1.4|250.97 GB|256|51.9%|9898a278-2fe6-4da2-b6dc-392e5fda51e6|RAC3|
> |UN|10.1.1.5|1.88 TB|256|49.5%|04aa9ce1-c1c3-4886-8d72-270b024b49b9|RAC2|
> |UN|10.1.1.6|1.3 TB|256|48.1%|6d5d48e6-d188-4f88-808d-dcdbb39fdca5|RAC3|
> It seems that only 10.1.1.4 reports the correct "Load". There are no hints in the
> cluster, and the report remains the same after running "nodetool cleanup" on each
> node. "nodetool cfstats" shows that the number of keys is evenly distributed, and the
> physical disk holding Cassandra data on each node reports about the same usage.
> "nodetool status" reports these inaccurately large storage loads until we restart
> each node; after the restart, the "Load" report matches what we see on disk.
> We did not see this behavior until upgrading to v2.1.9.

--
This message was sent by Atlassian JIRA
(v6.3.4#6332)
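The comparison described in this comment (reported "Load" versus disk usage measured independently) could be scripted roughly as follows. This is only a sketch, not anything from the ticket: the function names are hypothetical, it assumes the column layout shown above, and it assumes these nodetool versions print binary multiples labelled KB/MB/GB/TB.

```python
import re

# Assumption: nodetool in these versions labels binary multiples as KB/MB/GB/TB.
UNIT_BYTES = {"bytes": 1, "KB": 1024, "MB": 1024**2, "GB": 1024**3, "TB": 1024**4}
NODE_STATE = re.compile(r"[UD][NLJM]")  # status/state pair, e.g. UN = Up/Normal


def parse_status_loads(status_text):
    """Map each node address to its reported Load in bytes (hypothetical helper)."""
    loads = {}
    for line in status_text.splitlines():
        fields = line.split()
        # Data rows look like: UN <address> <size> <unit> <tokens> ...
        if len(fields) >= 4 and NODE_STATE.fullmatch(fields[0]) and fields[3] in UNIT_BYTES:
            loads[fields[1]] = int(float(fields[2]) * UNIT_BYTES[fields[3]])
    return loads


def load_vs_disk(reported_bytes, disk_bytes):
    """Ratio of nodetool's Load to measured disk usage (1.0 means they agree)."""
    return reported_bytes / disk_bytes


SAMPLE = """\
Datacenter: us-east
===================
Status=Up/Down
|/ State=Normal/Leaving/Joining/Moving
--  Address         Load       Tokens  Owns (effective)  Host ID                               Rack
UN  52.205.224.43   141.15 GB  256     100.0%            35b4cf08-fb44-4b2e-869d-707b939e646d  1e
UN  52.204.232.195  1.15 TB    256     100.0%            dfb048f4-c61f-4b77-9d24-5cbf9080a923  1d
"""

if __name__ == "__main__":
    for address, load in parse_status_loads(SAMPLE).items():
        print(address, load)
```

The disk figure passed to load_vs_disk would come from whatever measures the data directory on each node, such as the collectd metrics mentioned in the comment; a ratio well above 1.0 on a node would match the symptom reported here.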
[jira] [Commented] (CASSANDRA-10430) "Load" report from "nodetool status" is inaccurate
[ https://issues.apache.org/jira/browse/CASSANDRA-10430?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15148816#comment-15148816 ]

clint martin commented on CASSANDRA-10430:
------------------------------------------

Thank you!
[jira] [Commented] (CASSANDRA-10430) "Load" report from "nodetool status" is inaccurate
[ https://issues.apache.org/jira/browse/CASSANDRA-10430?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15148657#comment-15148657 ]

Marcus Eriksson commented on CASSANDRA-10430:
---------------------------------------------

The incremental repair issue was fixed in CASSANDRA-10831.

> "Load" report from "nodetool status" is inaccurate
> --------------------------------------------------
>
>                 Key: CASSANDRA-10430
>                 URL: https://issues.apache.org/jira/browse/CASSANDRA-10430
>             Project: Cassandra
>          Issue Type: Bug
>          Components: Tools
>         Environment: Cassandra v2.1.9 running on 6 node Amazon AWS, vnodes enabled.
>            Reporter: julia zhang
>             Fix For: 2.1.x
>
>         Attachments: system.log.2.zip, system.log.3.zip, system.log.4.zip
[jira] [Commented] (CASSANDRA-10430) "Load" report from "nodetool status" is inaccurate
[ https://issues.apache.org/jira/browse/CASSANDRA-10430?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15148655#comment-15148655 ]

clint martin commented on CASSANDRA-10430:
------------------------------------------

I am also experiencing this issue, using DSE 4.7.3 (Cassandra 2.1.8.689). Load was reported correctly until I switched my cluster to use incremental repair.

# nodetool status
Datacenter: DC1
===============
Status=Up/Down
|/ State=Normal/Leaving/Joining/Moving
--  Address        Load     Tokens  Owns  Host ID                               Rack
UN  172.16.10.250  1.76 TB  1       ?     88280120-c7d6-401e-8a75-5726cbb081e8  RAC1
UN  172.16.10.251  2.28 TB  1       ?     3812bbd5-d63d-4bf1-a22b-6c31ce279018  RAC1
UN  172.16.10.252  2.05 TB  1       ?     59028151-892a-4896-89b7-a368cceaddd6  RAC1

I only have 1.3TB of raw space on each of these nodes, and am only actually using approximately 385G to 468G of raw space on each node.
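The figures in this comment can be checked with a little arithmetic. The sketch below is only an illustration using the numbers quoted above; the function names are hypothetical, and it assumes 1 TB = 1024 GB, which the comment does not state.

```python
# Figures quoted in the comment above.
RAW_CAPACITY_TB = 1.3
ACTUAL_USED_GB = (385, 468)  # approximate range of real usage per node
REPORTED_TB = {
    "172.16.10.250": 1.76,
    "172.16.10.251": 2.28,
    "172.16.10.252": 2.05,
}

GB_PER_TB = 1024  # assumption: binary units


def exceeds_capacity(reported_tb, capacity_tb):
    """Addresses whose reported Load is larger than the node's raw disk capacity."""
    return sorted(addr for addr, load in reported_tb.items() if load > capacity_tb)


def min_overreport_factor(reported_tb, max_used_gb):
    """Lower bound on how far Load overstates usage: smallest Load / largest usage."""
    return min(reported_tb.values()) * GB_PER_TB / max_used_gb


if __name__ == "__main__":
    print(exceeds_capacity(REPORTED_TB, RAW_CAPACITY_TB))
    print(min_overreport_factor(REPORTED_TB, max(ACTUAL_USED_GB)))
```

Under these assumptions every node reports more data than its disk can hold, and even the most favorable pairing of smallest reported Load with largest actual usage overstates usage by more than a factor of three.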
[jira] [Commented] (CASSANDRA-10430) "Load" report from "nodetool status" is inaccurate
[ https://issues.apache.org/jira/browse/CASSANDRA-10430?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14994613#comment-14994613 ]

Jim Witschey commented on CASSANDRA-10430:
------------------------------------------

Pinging [~yukim], any idea why this might be happening?
[jira] [Commented] (CASSANDRA-10430) "Load" report from "nodetool status" is inaccurate
[ https://issues.apache.org/jira/browse/CASSANDRA-10430?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14943835#comment-14943835 ]

Philip Thompson commented on CASSANDRA-10430:
---------------------------------------------

[~yukim], any idea what the issue could be?
[jira] [Commented] (CASSANDRA-10430) "Load" report from "nodetool status" is inaccurate
[ https://issues.apache.org/jira/browse/CASSANDRA-10430?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14943830#comment-14943830 ]

julia zhang commented on CASSANDRA-10430:
-----------------------------------------

We have since switched to full repair (nodetool repair -pr keyspace), and nodetool no longer reports invalid "Load".
[jira] [Commented] (CASSANDRA-10430) "Load" report from "nodetool status" is inaccurate
[ https://issues.apache.org/jira/browse/CASSANDRA-10430?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14943714#comment-14943714 ]

Philip Thompson commented on CASSANDRA-10430:
---------------------------------------------

You say that after restart the numbers become correct. How long after that until they become invalid again? Can you include the system.log from one of the nodes this is affecting?

> "Load" report from "nodetool status" is inaccurate
> --------------------------------------------------
>
>                 Key: CASSANDRA-10430
>                 URL: https://issues.apache.org/jira/browse/CASSANDRA-10430
>             Project: Cassandra
>          Issue Type: Bug
>          Components: Tools
>         Environment: Cassandra v2.1.9 running on 6 node Amazon AWS, vnodes enabled.
>            Reporter: julia zhang
>             Fix For: 2.1.x