Re: Cluster configuration issue

Romain Hardouin Fri, 09 Nov 2018 11:13:32 -0800

 128GB RAM -> that's good news, you have plenty of room to increase the 
Cassandra heap size. You can start with, let's say, 12GB in jvm.options, or 
24GB if you use G1 GC. Let us know if the node starts and if DEBUG/TRACE is 
useful. You can also try the "strace -f -p ..." command to see if the process 
is doing something when it's stuck, but Cassandra has a lot of threads...
    On Friday, 9 November 2018 at 19:13:51 UTC+1, Francesco Messere 
<f.mess...@list-group.com> wrote:  
 
  
Hi Romain 
 
 
Yes, I modified the .yaml after the issue.
 
The problem is this: if I restart a node in DC-FIRENZE, it does not start up. 
I tried first one node and then the second one, with the same results.


 
 
These are the server resources:
 
Memory: 128 GB
 

 free
               total        used        free      shared  buff/cache   available
 Mem:      131741388    13858952    72649704      124584    45232732   116825040
 Swap:      16777212           0    16777212
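
A minimal Python sketch, assuming the default `free` layout shown above (values in KiB), of how one might check from a script whether swap is still configured on a node. The function name `swap_total_kib` is hypothetical and not part of any Cassandra or Linux tool.

```python
# Sample copied from the `free` output in this message.
FREE_OUTPUT = """\
               total        used        free      shared  buff/cache   available
 Mem:      131741388    13858952    72649704      124584    45232732   116825040
 Swap:      16777212           0    16777212
"""

def swap_total_kib(free_output):
    """Return the total swap reported by `free`, or None if no Swap row exists."""
    for line in free_output.splitlines():
        fields = line.split()
        if fields and fields[0] == "Swap:":
            return int(fields[1])
    return None

total = swap_total_kib(FREE_OUTPUT)
if total:
    print(f"swap is still configured: {total} KiB")
else:
    print("no swap configured")
```

A non-zero total here is consistent with the "Is swap disabled? : false" warning quoted further down the thread.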
 

 
 CPU:
 Architecture:          x86_64
 CPU op-mode(s):        32-bit, 64-bit
 Byte Order:            Little Endian
 CPU(s):                24
 On-line CPU(s) list:   0-23
 Thread(s) per core:    1
 Core(s) per socket:    12
 Socket(s):             2
 NUMA node(s):          2
 Vendor ID:             GenuineIntel
 CPU family:            6
 Model:                 79
 Model name:            Intel(R) Xeon(R) CPU E5-2650 v4 @ 2.20GHz
 Stepping:              1
 CPU MHz:               1213.523
 CPU max MHz:           2900.0000
 CPU min MHz:           1200.0000
 BogoMIPS:              4399.97
 Virtualization:        VT-x
 L1d cache:             32K
 L1i cache:             32K
 L2 cache:              256K
 L3 cache:              30720K
 NUMA node0 CPU(s):     0,2,4,6,8,10,12,14,16,18,20,22
 NUMA node1 CPU(s):     1,3,5,7,9,11,13,15,17,19,21,23 

 
 
There is nothing in the server logs. 
 
 
On Monday I will enable debug logging and try again to start the Cassandra node. 
 
 
Thanks
 
Francesco Messere
 

 
 

 
 
 On 09/11/2018 18:51, Romain Hardouin wrote:
  
 
  Ok so all nodes in Firenze are down. I thought only one was KO.  
  After a first look at cassandra.yaml the only issue I saw is seeds: the line 
you commented out was correct (one seed per DC). But I guess you modified it 
after the issue.  
  You should fix the swap issue.  
  Also can you add more heap to Cassandra? By the way, what are the specs of 
servers (RAM, CPU, etc)?  
  Did you check the Linux system log? And Cassandra's debug.log? You can even 
enable TRACE logs in logback.xml ( 
https://github.com/apache/cassandra/blob/cassandra-3.11.3/conf/logback.xml#L100 
) then try to restart a node in Firenze to see where it blocks. If it's due 
to low resources, a hardware issue or swap, it won't be useful, but let's give 
it a try anyway. 
  
  
      On Friday, 9 November 2018 at 18:20:57 UTC+1, Francesco Messere 
<f.mess...@list-group.com> wrote:  
  
     
Hi Romain,
 
You are right, it is not possible to work in these towns; fortunately I live in 
Pisa :-).
 
I saw the errors and corrected them, except the swap one.
 
The process gets stuck; I let it run for 1 day without results.
 
 This is the output of nodetool status from the nodes that are up and running 
(DC-MILANO)
 
 /conf/CASSANDRA_SHARE_PROD_conf/bin/cassandra-3.11.3/bin/nodetool -h 192.168.71.210 -p 17052 status
 Datacenter: DC-FIRENZE
 ======================
 Status=Up/Down
 |/ State=Normal/Leaving/Joining/Moving
 --  Address          Load       Tokens       Owns (effective)  Host ID                               Rack
 DN  192.168.204.175  ?          256          100.0%            a3c8626e-afab-413e-a153-cccfd0b26d06  RACK1
 DN  192.168.204.176  ?          256          100.0%            67738ca8-f1f5-46a9-9d23-490bbebcffaa  RACK1
 Datacenter: DC-MILANO
 =====================
 Status=Up/Down
 |/ State=Normal/Leaving/Joining/Moving
 --  Address          Load       Tokens       Owns (effective)  Host ID                               Rack
 UN  192.168.71.210   5.95 GiB   256          100.0%            210f0cdd-abee-4fc0-abd3-ecdab618606e  RACK1
 UN  192.168.71.211   5.83 GiB   256          100.0%            96c30edd-4e6c-4952-82d4-dfdf67f6a06f  RACK1
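
A hedged Python sketch of how the status output above could be scanned for down nodes, for example in a monitoring script. It assumes only the layout visible in this message (a "Datacenter:" header followed by rows that start with a two-letter status/state code); the function name `down_nodes` is hypothetical.

```python
# Shortened sample based on the nodetool status output in this message.
STATUS_OUTPUT = """\
Datacenter: DC-FIRENZE
======================
Status=Up/Down
|/ State=Normal/Leaving/Joining/Moving
--  Address          Load       Tokens  Owns (effective)  Host ID  Rack
DN  192.168.204.175  ?          256     100.0%            a3c8626e-afab-413e-a153-cccfd0b26d06  RACK1
DN  192.168.204.176  ?          256     100.0%            67738ca8-f1f5-46a9-9d23-490bbebcffaa  RACK1
Datacenter: DC-MILANO
=====================
Status=Up/Down
|/ State=Normal/Leaving/Joining/Moving
--  Address          Load       Tokens  Owns (effective)  Host ID  Rack
UN  192.168.71.210   5.95 GiB   256     100.0%            210f0cdd-abee-4fc0-abd3-ecdab618606e  RACK1
UN  192.168.71.211   5.83 GiB   256     100.0%            96c30edd-4e6c-4952-82d4-dfdf67f6a06f  RACK1
"""

def down_nodes(status_output):
    """Return (datacenter, address) pairs for rows whose status letter is D."""
    datacenter = None
    down = []
    for raw in status_output.splitlines():
        line = raw.strip()
        if line.startswith("Datacenter:"):
            datacenter = line.split(":", 1)[1].strip()
            continue
        fields = line.split()
        if len(fields) < 2 or len(fields[0]) != 2:
            continue
        status, state = fields[0][0], fields[0][1]
        # Node rows use U/D for status and N/L/J/M for state; skip header rows.
        if status in "UD" and state in "NLJM" and status == "D":
            down.append((datacenter, fields[1]))
    return down

for dc, address in down_nodes(STATUS_OUTPUT):
    print(f"{address} is down in {dc}")
```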
 
 And this is the describecluster command output:
 
/conf/CASSANDRA_SHARE_PROD_conf/bin/cassandra-3.11.3/bin/nodetool -h 192.168.71.210 -p 17052 describecluster
 Cluster Information:
         Name: CASSANDRA_3
         Snitch: org.apache.cassandra.locator.GossipingPropertyFileSnitch
         DynamicEndPointSnitch: enabled
         Partitioner: org.apache.cassandra.dht.Murmur3Partitioner
         Schema versions:
                 6bdd4617-658e-375e-8503-7158df833495: [192.168.71.210, 192.168.71.211]
 
                 UNREACHABLE: [192.168.204.175, 192.168.204.176]
 
 The cassandra.yaml file is attached. 
 
 Regards
 Francesco Messere
 
 
 
  On 09/11/2018 17:48, Romain Hardouin wrote:
  
 
        Hi Francesco, it can't work! Milano and Firenze, oh boy, Calcio vs 
Calcio Storico X-D 
  Ok more seriously, "Updating topology ..." is not a problem. But you have low 
resources and system misconfiguration: 
   - Small heap size: 3.867GiB.
     From the logs: "Unable to lock JVM memory (ENOMEM). This can result in 
     part of the JVM being swapped out, especially with mmapped I/O enabled. 
     Increase RLIMIT_MEMLOCK or run Cassandra as root." 
   - System settings: swap should be disabled, bad system limits, etc.
     From the logs: "Cassandra server running in degraded mode. Is swap 
     disabled? : false,  Address space adequate? : true,  nofile limit 
     adequate? : true, nproc limit adequate? : false"
   For system tuning see 
https://tobert.github.io/pages/als-cassandra-21-tuning-guide.html 
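
A minimal Python sketch, assuming the warning has exactly the form quoted above, of how the failing startup checks could be pulled out of the "degraded mode" log line. The function name `failed_checks` and the regular expression are illustrative assumptions, not anything Cassandra provides.

```python
import re

# Matches pairs such as "Is swap disabled? : false" in the startup warning.
CHECK_RE = re.compile(r"([A-Za-z][A-Za-z ]*\?)\s*:\s*(true|false)")

def failed_checks(log_line):
    """Return the names of the startup checks that the log line reports as false."""
    return [name.strip()
            for name, value in CHECK_RE.findall(log_line)
            if value == "false"]

# Log line copied from the message above.
LINE = ("Cassandra server running in degraded mode. Is swap disabled? : false,  "
        "Address space adequate? : true,  nofile limit adequate? : true, "
        "nproc limit adequate? : false")

for check in failed_checks(LINE):
    print("needs attention:", check)
```

For the line quoted in this thread, the two checks reported are the swap setting and the nproc limit.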
  
  You said "Cassandra node did not startup". What is the problem exactly? Is 
the process stuck or does it die? What do you see with "nodetool status" on 
nodes that are up and running?  
  Btw cassandra-topology.properties is not required with 
GossipingPropertyFileSnitch (unless you are migrating from 
PropertyFileSnitch). 
  
  Best, 
  Romain 
  
      On Friday, 9 November 2018 at 11:34:16 UTC+1, Francesco Messere 
<f.mess...@list-group.com> wrote:  
  
     
Hi to all, 
 
 I have a problem with a distributed cluster configuration.
 This is a test environment:
 Cassandra version is 3.11.3
 2 sites: Milan and Florence
 2 servers on each site 
 
 1 common "cluster-name" and 2 DCs 
 
 The first installation and startup go OK: all the nodes are present in the 
cluster. 
 
 The issue starts after a server reboot in the FLORENCE DC. 
 
 Cassandra node did not startup and the last line written in system.log is 
 
 INFO  [ScheduledTasks:1] 2018-11-09 10:36:54,306 TokenMetadata.java:498 - Updating topology for all endpoints that have changed
 
 
 
 The only fix I have found is to clean up the node, remove it from the 
cluster and re-join it.
  
How can I solve it?
 

 
 
Here are the configuration files: 
 
 
less cassandra-topology.properties 
 # Unless required by applicable law or agreed to in writing, software
 # distributed under the License is distributed on an "AS IS" BASIS,
 # WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
 # See the License for the specific language governing permissions and
 # limitations under the License.
 
 # Cassandra Node IP=Data Center:Rack
 192.168.204.175=DC-FIRENZE:RACK1
 192.168.204.176=DC-FIRENZE:RACK1
 192.168.71.210=DC-MILANO:RACK1
 192.168.71.211=DC-MILANO:RACK1
 
 # default for unknown nodes
 default=DC-FIRENZE:r1
 
 # Native IPv6 is supported, however you must escape the colon in the IPv6 Address
 # Also be sure to comment out JVM_OPTS="$JVM_OPTS -Djava.net.preferIPv4Stack=true"
 # in cassandra-env.sh
#fe80\:0\:0\:0\:202\:b3ff\:fe1e\:8329=DC1:RAC3
 

 
 
cassandra-rackdc.properties
 
 
# Unless required by applicable law or agreed to in writing, software
 # distributed under the License is distributed on an "AS IS" BASIS,
 # WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
 # See the License for the specific language governing permissions and
 # limitations under the License.
 
 # These properties are used with GossipingPropertyFileSnitch and will
 # indicate the rack and dc for this node
 dc=DC-FIRENZE
 rack=RACK1
 
 # Add a suffix to a datacenter name. Used by the Ec2Snitch and Ec2MultiRegionSnitch
 # to append a string to the EC2 region name.
 #dc_suffix=
 
 # Uncomment the following line to make this snitch prefer the internal ip when possible, as the Ec2MultiRegionSnitch does.
 # prefer_local=true
 
 

 
 
The system.log file is attached. 
 
 
Regards 
 
 
Francesco Messere
 

 
   
---------------------------------------------------------------------
 To unsubscribe, e-mail: user-unsubscr...@cassandra.apache.org
 For additional commands, e-mail: user-h...@cassandra.apache.org   
