I think I know where my problem is coming from. I took a look at the Cassandra log on each node and saw something related to bootstrap: it says that the node is a seed, so there will be no bootstrapping. I had actually made a mistake: in the cassandra.yaml file, each node had two IPs listed as seeds, the IP of the machine itself and the IP of the real seed server. Once I removed the local IP, the problem seems to be fixed.
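A minimal sketch of that misconfiguration check, assuming (as the log message above indicates) that a node whose own IP appears in its seed list is treated as a seed and skips bootstrap, so it never streams existing data. The helper name and the hard-coded IPs (taken from this thread) are illustrative only:

```shell
#!/bin/bash
# Illustrative helper: returns 0 when a node's own IP is in its seed list,
# i.e. the node would consider itself a seed and skip bootstrap.
lists_self_as_seed() {
  local own_ip="$1" seeds="$2" s list
  IFS="," read -ra list <<< "$seeds"
  for s in "${list[@]}"; do
    [ "$s" = "$own_ip" ] && return 0
  done
  return 1
}

# The broken configuration described above: the node's own IP is in seeds.
if lists_self_as_seed "40.0.0.168" "40.0.0.168,40.0.0.149"; then
  echo "misconfigured: node will skip bootstrap"
fi
```

With the local IP removed from the seed list, the function returns non-zero and the node bootstraps normally.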
2015-09-07 18:01 GMT+02:00 ICHIBA Sara <ichi.s...@gmail.com>:

> Thank you all for your answers.
>
> @Alain:
>
> >> Can you detail actions performed, like how you load data?
>
> I have an HAProxy in front of my Cassandra database, so I am sure that my application queries a different coordinator each time.
>
> >> What are scaleup/scaledown, and do you let the node decommission fully (streams finished, node removed from nodetool status)?
>
> I am using the OpenStack platform to autoscale the Cassandra cluster. In OpenStack, the combination of Ceilometer + Heat lets users automate the deployment of their applications and supervise their resources. They can order a scaleup (adding new nodes automatically) when resources (CPU, RAM, ...) are needed, or a scaledown (removing unnecessary VMs automatically). So with Heat I can automatically spawn a cluster of 2 Cassandra VMs (create the cluster and configure each Cassandra server from a template). My cluster can go from 2 nodes to 6 based on the workload. When there is a scaledown action, Heat automatically executes a script on the node and decommissions it before removing it.
>
> >> Also, I am not sure of the meaning of this --> "I'm assigning each of my nodes a different token based on its IP address (the token is A+B+C+D and the IP is A.B.C.D)".
>
> Look at this:
>
> [root@demo-server-seed-wgevseugyjd7 ~]# nodetool status bng
> Datacenter: DC1
> ===============
> Status=Up/Down
> |/ State=Normal/Leaving/Joining/Moving
> --  Address     Load       Tokens  Owns (effective)  Host ID                               Rack
> UN  40.0.0.149  789.03 KB  189     100.0%            bd0b2616-18d9-4bc2-a80b-eebd67474712  RAC1
> UN  40.0.0.168  300.38 KB  208     100.0%            ebd9732b-ebfc-4a6c-b354-d7df860b57b0  RAC1
>
> The node with address 40.0.0.149 has the token count 189 = 40+0+0+149, and the node with address 40.0.0.168 has the token count 208 = 40+0+0+168. This way I am sure that each node in my cluster will have a different token count.
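The A.B.C.D to A+B+C+D derivation quoted above can be reproduced in a couple of lines of bash; the helper name `ip_to_tokens` is illustrative. Worth noting, hedged: what the thread's configuration script actually sets with this sum is `num_tokens`, i.e. how many vnodes the node claims, not specific token values; with Murmur3Partitioner the token values themselves are picked randomly.

```shell
#!/bin/bash
# Split an IPv4 address on dots and sum the four octets,
# as done for num_tokens in this thread.
ip_to_tokens() {
  local a b c d
  IFS="." read -r a b c d <<< "$1"
  echo $((a + b + c + d))
}

ip_to_tokens 40.0.0.149   # prints 189
ip_to_tokens 40.0.0.168   # prints 208
```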
> I don't know what would happen if all the nodes had the same token count.
>
> >> Aren't you using RandomPartitioner or Murmur3Partitioner?
>
> I am using the default one, which is:
>
> partitioner: org.apache.cassandra.dht.Murmur3Partitioner
>
> In order to configure Cassandra on each node I am using this script:
>
> inputs:
>   - name: IP
>   - name: SEED
> config: |
>   #!/bin/bash -v
>   cat << EOF >> /etc/resolv.conf
>   nameserver 8.8.8.8
>   nameserver 192.168.5.1
>   EOF
>
>   DEFAULT=${DEFAULT:-/etc/cassandra/default.conf}
>   CONFIG=/etc/cassandra/conf
>   # Split the node's IP into octets and sum them for num_tokens
>   IFS="." read a b c d <<< "$IP"
>   s=$((a + b + c + d))
>   sed -i -e "s/^cluster_name.*/cluster_name: 'Cassandra cluster for freeradius'/" $CONFIG/cassandra.yaml
>   sed -i -e "s/^num_tokens.*/num_tokens: $s/" $CONFIG/cassandra.yaml
>   sed -i -e "s/^listen_address.*/listen_address: $IP/" $CONFIG/cassandra.yaml
>   sed -i -e "s/^rpc_address.*/rpc_address: 0.0.0.0/" $CONFIG/cassandra.yaml
>   sed -i -e "s/^broadcast_address.*/broadcast_address: $IP/" $CONFIG/cassandra.yaml
>   sed -i -e "s/broadcast_rpc_address.*/broadcast_rpc_address: $IP/" $CONFIG/cassandra.yaml
>   sed -i -e "s/^commitlog_segment_size_in_mb.*/commitlog_segment_size_in_mb: 32/" $CONFIG/cassandra.yaml
>   sed -i -e "s/# JVM_OPTS=\"\$JVM_OPTS -Djava.rmi.server.hostname=<public name>\"/JVM_OPTS=\"\$JVM_OPTS -Djava.rmi.server.hostname=$IP\"/" $CONFIG/cassandra-env.sh
>   sed -i -e "s/- seeds.*/- seeds: \"$SEED\"/" $CONFIG/cassandra.yaml
>   sed -i -e "s/^endpoint_snitch.*/endpoint_snitch: GossipingPropertyFileSnitch/" $CONFIG/cassandra.yaml
>   echo MAX_HEAP_SIZE="4G" >> $CONFIG/cassandra-env.sh
>   echo HEAP_NEWSIZE="800M" >> $CONFIG/cassandra-env.sh
>   service cassandra stop
>   rm -rf /var/lib/cassandra/data/system/*
>   service cassandra start

2015-09-07 16:30 GMT+02:00 Ryan Svihla <r...@foundev.pro>:

>> If that's what tracing is telling you, then it's fine and just a product of data distribution (note your token counts aren't identical anyway).
>> If you're doing CL ONE queries directly against particular nodes and getting different results, it sounds like dropped mutations, streaming errors, and/or timeouts. Does running repair, or reading at CL ALL, give you an accurate total record count?
>>
>> nodetool tpstats should help identify dropped mutations after bootstrap, but you also want to monitor the logs for any errors (in general this is always good advice for any system). There could be a myriad of problems with bootstrapping new nodes; usually this is related to underprovisioning.
>>
>> On Mon, Sep 7, 2015 at 8:19 AM Alain RODRIGUEZ <arodr...@gmail.com> wrote:
>>
>>> Hi Sara,
>>>
>>> Can you detail actions performed, like how you load data, what scaleup/scaledown are, and say whether you let it decommission fully (streams finished, node removed from nodetool status), etc.? This would help us to help you :).
>>>
>>> Also, what happens if you run "CONSISTENCY LOCAL_QUORUM;" (or ALL) before your SELECT? If not using cqlsh, set the consistency level of your client to LOCAL_QUORUM or ALL and try the SELECT again.
>>>
>>> Also, I am not sure of the meaning of this --> "I'm assigning each of my nodes a different token based on its IP address (the token is A+B+C+D and the IP is A.B.C.D)". Aren't you using RandomPartitioner or Murmur3Partitioner?
>>>
>>> C*heers,
>>>
>>> Alain
>>>
>>> 2015-09-07 12:01 GMT+02:00 Edouard COLE <edouard.c...@rgsystem.com>:
>>>
>>>> Please, don't mail me directly.
>>>> I read your answer, but I cannot help any more,
>>>> and answering with "Sorry, I can't help" is pointless :)
>>>> Wait for the community to answer.
>>>>
>>>> From: ICHIBA Sara [mailto:ichi.s...@gmail.com]
>>>> Sent: Monday, September 07, 2015 11:34 AM
>>>> To: user@cassandra.apache.org
>>>> Subject: Re: cassandra scalability
>>>>
>>>> When there's a scaledown action, I make sure to decommission the node first.
>>>> But still, I don't understand why I'm seeing this behaviour. Is it normal? What do you normally do to remove a node? Is it related to tokens? I'm assigning each of my nodes a different token based on its IP address (the token is A+B+C+D and the IP is A.B.C.D).
>>>>
>>>> 2015-09-07 11:28 GMT+02:00 ICHIBA Sara <ichi.s...@gmail.com>:
>>>>
>>>> At the beginning it looks like this:
>>>>
>>>> [root@demo-server-seed-k6g62qr57nok ~]# nodetool status
>>>> Datacenter: DC1
>>>> ===============
>>>> Status=Up/Down
>>>> |/ State=Normal/Leaving/Joining/Moving
>>>> --  Address     Load       Tokens  Owns  Host ID                               Rack
>>>> UN  40.0.0.208  128.73 KB  248     ?     6e7788f9-56bf-4314-a23a-3bf1642d0606  RAC1
>>>> UN  40.0.0.209  114.59 KB  249     ?     84f6f0be-6633-4c36-b341-b968ff91a58f  RAC1
>>>> UN  40.0.0.205  129.53 KB  245     ?     aa233dc2-a8ae-4c00-af74-0a119825237f  RAC1
>>>>
>>>> [root@demo-server-seed-k6g62qr57nok ~]# nodetool status service_dictionary
>>>> Datacenter: DC1
>>>> ===============
>>>> Status=Up/Down
>>>> |/ State=Normal/Leaving/Joining/Moving
>>>> --  Address     Load       Tokens  Owns (effective)  Host ID                               Rack
>>>> UN  40.0.0.208  128.73 KB  248     68.8%             6e7788f9-56bf-4314-a23a-3bf1642d0606  RAC1
>>>> UN  40.0.0.209  114.59 KB  249     67.8%             84f6f0be-6633-4c36-b341-b968ff91a58f  RAC1
>>>> UN  40.0.0.205  129.53 KB  245     63.5%             aa233dc2-a8ae-4c00-af74-0a119825237f  RAC1
>>>>
>>>> The query "select * from service_dictionary.table1;" gave me:
>>>> 70 rows from 40.0.0.205
>>>> 64 rows from 40.0.0.209
>>>> 54 rows from 40.0.0.208
>>>>
>>>> 2015-09-07 11:13 GMT+02:00 Edouard COLE <edouard.c...@rgsystem.com>:
>>>>
>>>> Could you provide the result of:
>>>> - nodetool status
>>>> - nodetool status YOURKEYSPACE
>>
>> --
>> Regards,
>>
>> Ryan Svihla
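A sketch of the check the thread converges on: coordinators returning different counts (70/64/54 above) at CL ONE point to missing replicas, which a repair or a CL ALL read should converge. (The effective ownership figures above sum to roughly 200%, which suggests a replication factor of 2 on this 3-node cluster.) The helper name `counts_agree` is illustrative; on a live cluster the counts would come from something like `cqlsh $host -e "CONSISTENCY ONE; SELECT count(*) FROM service_dictionary.table1;"` per node:

```shell
#!/bin/bash
# Illustrative helper: succeed only when every per-coordinator count matches.
counts_agree() {
  local first="$1" c
  shift
  for c in "$@"; do
    [ "$c" = "$first" ] || return 1
  done
  return 0
}

# The per-node counts reported in this thread at CL ONE:
if ! counts_agree 70 64 54; then
  echo "inconsistent replicas: run repair or read at CONSISTENCY ALL"
fi
```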