128GB RAM -> that's good news, you have plenty of room to increase the Cassandra heap size. You can start with, let's say, 12GB in jvm.options, or 24GB if you use G1 GC.

Let us know if the node starts and if DEBUG/TRACE is useful.

You can also try the "strace -f -p ..." command to see if the process is doing something when it's stuck, but Cassandra has a lot of threads...

On Friday, 9 November 2018 at 19:13:51 UTC+1, Francesco Messere <f.mess...@list-group.com> wrote:

Hi Romain,

yes, I modified the .yaml after the issue.

The problem is this: if I restart a node in DC-FIRENZE, it does not start up. I tried first one node and then the second one, with the same results.
These are the server resources:

memory: 128GB

free
              total        used        free      shared  buff/cache   available
Mem:      131741388    13858952    72649704      124584    45232732   116825040
Swap:      16777212           0    16777212

cpu

Architecture:          x86_64
CPU op-mode(s):        32-bit, 64-bit
Byte Order:            Little Endian
CPU(s):                24
On-line CPU(s) list:   0-23
Thread(s) per core:    1
Core(s) per socket:    12
Socket(s):             2
NUMA node(s):          2
Vendor ID:             GenuineIntel
CPU family:            6
Model:                 79
Model name:            Intel(R) Xeon(R) CPU E5-2650 v4 @ 2.20GHz
Stepping:              1
CPU MHz:               1213.523
CPU max MHz:           2900.0000
CPU min MHz:           1200.0000
BogoMIPS:              4399.97
Virtualization:        VT-x
L1d cache:             32K
L1i cache:             32K
L2 cache:              256K
L3 cache:              30720K
NUMA node0 CPU(s):     0,2,4,6,8,10,12,14,16,18,20,22
NUMA node1 CPU(s):     1,3,5,7,9,11,13,15,17,19,21,23

There is nothing in the server logs.

On Monday I will activate debug and try again to start up the Cassandra node.

Thanks
Francesco Messere

On 09/11/2018 18:51, Romain Hardouin wrote:

Ok, so all nodes in Firenze are down. I thought only one was KO.

After a first look at cassandra.yaml, the only issue I saw is seeds: the line you commented out was correct (one seed per DC). But I guess you modified it after the issue.

You should fix the swap issue. Also, can you add more heap to Cassandra? By the way, what are the specs of the servers (RAM, CPU, etc.)?

Did you check the Linux system log? And Cassandra's debug.log? You can even enable TRACE logs in logback.xml ( https://github.com/apache/cassandra/blob/cassandra-3.11.3/conf/logback.xml#L100 ) and then try to restart a node in Firenze to see where it blocks, but if it's due to low resources, a hardware issue or swap, it won't be useful. Let's give it a try anyway.

On Friday, 9 November 2018 at 18:20:57 UTC+1, Francesco Messere <f.mess...@list-group.com> wrote:

Hi Romain,

you are right, it is not possible to work in these towns; fortunately I live in Pisa :-).

I saw the errors and corrected them, except the swap one.

The process gets stuck; I let it run for 1 day without results.
This is the output of nodetool status from the nodes that are up and running (DC-MILANO):

/conf/CASSANDRA_SHARE_PROD_conf/bin/cassandra-3.11.3/bin/nodetool -h 192.168.71.210 -p 17052 status

Datacenter: DC-FIRENZE
======================
Status=Up/Down
|/ State=Normal/Leaving/Joining/Moving
--  Address          Load      Tokens  Owns (effective)  Host ID                               Rack
DN  192.168.204.175  ?         256     100.0%            a3c8626e-afab-413e-a153-cccfd0b26d06  RACK1
DN  192.168.204.176  ?         256     100.0%            67738ca8-f1f5-46a9-9d23-490bbebcffaa  RACK1

Datacenter: DC-MILANO
=====================
Status=Up/Down
|/ State=Normal/Leaving/Joining/Moving
--  Address          Load      Tokens  Owns (effective)  Host ID                               Rack
UN  192.168.71.210   5.95 GiB  256     100.0%            210f0cdd-abee-4fc0-abd3-ecdab618606e  RACK1
UN  192.168.71.211   5.83 GiB  256     100.0%            96c30edd-4e6c-4952-82d4-dfdf67f6a06f  RACK1

and this is the describecluster command output:

/conf/CASSANDRA_SHARE_PROD_conf/bin/cassandra-3.11.3/bin/nodetool -h 192.168.71.210 -p 17052 describecluster

Cluster Information:
    Name: CASSANDRA_3
    Snitch: org.apache.cassandra.locator.GossipingPropertyFileSnitch
    DynamicEndPointSnitch: enabled
    Partitioner: org.apache.cassandra.dht.Murmur3Partitioner
    Schema versions:
        6bdd4617-658e-375e-8503-7158df833495: [192.168.71.210, 192.168.71.211]

        UNREACHABLE: [192.168.204.175, 192.168.204.176]

Attached is the cassandra.yaml file.

Regards
Francesco Messere

On 09/11/2018 17:48, Romain Hardouin wrote:

Hi Francesco,

it can't work! Milano and Firenze, oh boy, Calcio vs Calcio Storico X-D

Ok, more seriously, "Updating topology ..." is not a problem. But you have low resources and system misconfiguration:

- Small heap size: 3.867GiB
  From the logs: "Unable to lock JVM memory (ENOMEM). This can result in part of the JVM being swapped out, especially with mmapped I/O enabled. Increase RLIMIT_MEMLOCK or run Cassandra as root."
- System settings: swap should be disabled, bad system limits, etc.
  From the logs: "Cassandra server running in degraded mode. Is swap disabled? : false, Address space adequate?
: true, nofile limit adequate? : true, nproc limit adequate? : false"

For system tuning see https://tobert.github.io/pages/als-cassandra-21-tuning-guide.html

You said "Cassandra node did not startup". What is the problem exactly? Is the process stuck, or does it die?

What do you see with "nodetool status" on the nodes that are up and running?

Btw, cassandra-topology.properties is not required with GossipingPropertyFileSnitch (unless you are migrating from PropertyFileSnitch).

Best,
Romain

On Friday, 9 November 2018 at 11:34:16 UTC+1, Francesco Messere <f.mess...@list-group.com> wrote:

Hi to all,

I have a problem with a distributed cluster configuration. This is a test environment:

Cassandra version is 3.11.3
2 sites, Milan and Florence
2 servers on each site
1 common "cluster-name" and 2 DCs

The first installation and startup went ok; all the nodes were present in the cluster.

The issue started after a server reboot in the FLORENCE DC: the Cassandra node did not start up, and the last line written in system.log is

INFO [ScheduledTasks:1] 2018-11-09 10:36:54,306 TokenMetadata.java:498 - Updating topology for all endpoints that have changed

The only way I found to correct this is to clean up the node, remove it from the cluster and re-join it.

How can I solve it?

Here are the configuration files:

less cassandra-topology.properties

# Unless required by applicable law or agreed to in writing, software
# distributed under the License is distributed on an "AS IS" BASIS,
# WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
# See the License for the specific language governing permissions and
# limitations under the License.
# Cassandra Node IP=Data Center:Rack
192.168.204.175=DC-FIRENZE:RACK1
192.168.204.176=DC-FIRENZE:RACK1
192.168.71.210=DC-MILANO:RACK1
192.168.71.211=DC-MILANO:RACK1

# default for unknown nodes
default=DC-FIRENZE:r1

# Native IPv6 is supported, however you must escape the colon in the IPv6 Address
# Also be sure to comment out JVM_OPTS="$JVM_OPTS -Djava.net.preferIPv4Stack=true"
# in cassandra-env.sh
#fe80\:0\:0\:0\:202\:b3ff\:fe1e\:8329=DC1:RAC3

cassandra-rackdc.properties

# Unless required by applicable law or agreed to in writing, software
# distributed under the License is distributed on an "AS IS" BASIS,
# WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
# See the License for the specific language governing permissions and
# limitations under the License.

# These properties are used with GossipingPropertyFileSnitch and will
# indicate the rack and dc for this node
dc=DC-FIRENZE
rack=RACK1

# Add a suffix to a datacenter name. Used by the Ec2Snitch and Ec2MultiRegionSnitch
# to append a string to the EC2 region name.
#dc_suffix=

# Uncomment the following line to make this snitch prefer the internal ip when possible, as the Ec2MultiRegionSnitch does.
# prefer_local=true

Attached is the system.log file.

Regards
Francesco Messere

---------------------------------------------------------------------
To unsubscribe, e-mail: user-unsubscr...@cassandra.apache.org
For additional commands, e-mail: user-h...@cassandra.apache.org
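
Note on the "degraded mode" warning quoted in the thread: the startup message lists four checks, and the ones reporting "false" are the system settings still to be fixed. A minimal sketch of reading that line, assuming the message keeps exactly the format quoted above; the names parse_degraded_mode and settings_to_fix are made up for this illustration and are not part of Cassandra:

```python
import re

# Matches pairs such as "Is swap disabled? : false" inside the warning.
CHECK_PATTERN = re.compile(r"([^,.?]+\?)\s*:\s*(true|false)")


def parse_degraded_mode(line):
    """Return {check question: bool} for a 'degraded mode' log line."""
    return {q.strip(): v == "true" for q, v in CHECK_PATTERN.findall(line)}


def settings_to_fix(line):
    """List the checks that reported false, in the order they appear."""
    return [q for q, ok in parse_degraded_mode(line).items() if not ok]


# The warning exactly as quoted from system.log earlier in the thread.
LOG_LINE = (
    "Cassandra server running in degraded mode. Is swap disabled? : false, "
    "Address space adequate? : true, nofile limit adequate? : true, "
    "nproc limit adequate? : false"
)

if __name__ == "__main__":
    for question in settings_to_fix(LOG_LINE):
        print("needs attention:", question)
```

For the line quoted in this thread, the two checks that come back false are the swap one and the nproc limit, which matches Romain's remarks about swap and bad system limits.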