I think I know where my problem is coming from. I took a look at the Cassandra log on each node and saw something related to bootstrap: it says that the node is a seed, so there will be no bootstrapping. I had actually made a mistake: in the cassandra.yaml file, each node had two IPs listed as seeds, the IP of the machine itself and the IP of the real seed server. Once I removed the local IP, the problem seems to be fixed.
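A minimal sketch of that misconfiguration check, assuming (as the log message above indicates) that a node whose own IP appears in its seed list is treated as a seed and skips bootstrap, so it never streams existing data. The helper name and the hard-coded IPs (taken from this thread) are illustrative only:

```shell
#!/bin/bash
# Illustrative helper: returns 0 when a node's own IP is in its seed list,
# i.e. the node would consider itself a seed and skip bootstrap.
lists_self_as_seed() {
  local own_ip="$1" seeds="$2" s list
  IFS="," read -ra list <<< "$seeds"
  for s in "${list[@]}"; do
    [ "$s" = "$own_ip" ] && return 0
  done
  return 1
}

# The broken configuration described above: the node's own IP is in seeds.
if lists_self_as_seed "40.0.0.168" "40.0.0.168,40.0.0.149"; then
  echo "misconfigured: node will skip bootstrap"
fi
```

With the local IP removed from the seed list, the function returns non-zero and the node bootstraps normally.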
2015-09-07 18:01 GMT+02:00 ICHIBA Sara <ichi.s...@gmail.com>:

> Thank you all for your answers.
>
> @Alain:
>
> >> Can you detail actions performed, like how you load data?
>
> I have an HAProxy in front of my Cassandra database, so I am sure that my application queries a different coordinator each time.
>
> >> What are scaleup/scaledown, and do you let the node decommission fully (streams finished, node removed from nodetool status)?
>
> I am using the OpenStack platform to autoscale the Cassandra cluster. In OpenStack, the combination of Ceilometer + Heat lets users automate the deployment of their applications and supervise their resources. They can order a scaleup (adding new nodes automatically) when resources (CPU, RAM, ...) are needed, or a scaledown (removing unnecessary VMs automatically). So with Heat I can automatically spawn a cluster of 2 Cassandra VMs (create the cluster and configure each Cassandra server from a template). My cluster can go from 2 nodes to 6 based on the workload. When there is a scaledown action, Heat automatically executes a script on the node and decommissions it before removing it.
>
> >> Also, I am not sure of the meaning of this --> "I'm assigning each of my nodes a different token based on its IP address (the token is A+B+C+D and the IP is A.B.C.D)".
>
> Look at this:
>
> [root@demo-server-seed-wgevseugyjd7 ~]# nodetool status bng
> Datacenter: DC1
> ===============
> Status=Up/Down
> |/ State=Normal/Leaving/Joining/Moving
> --  Address     Load       Tokens  Owns (effective)  Host ID                               Rack
> UN  40.0.0.149  789.03 KB  189     100.0%            bd0b2616-18d9-4bc2-a80b-eebd67474712  RAC1
> UN  40.0.0.168  300.38 KB  208     100.0%            ebd9732b-ebfc-4a6c-b354-d7df860b57b0  RAC1
>
> The node with address 40.0.0.149 has the token count 189 = 40+0+0+149, and the node with address 40.0.0.168 has the token count 208 = 40+0+0+168. This way I am sure that each node in my cluster will have a different token count.
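The A.B.C.D to A+B+C+D derivation quoted above can be reproduced in a couple of lines of bash; the helper name `ip_to_tokens` is illustrative. Worth noting, hedged: what the thread's configuration script actually sets with this sum is `num_tokens`, i.e. how many vnodes the node claims, not specific token values; with Murmur3Partitioner the token values themselves are picked randomly.

```shell
#!/bin/bash
# Split an IPv4 address on dots and sum the four octets,
# as done for num_tokens in this thread.
ip_to_tokens() {
  local a b c d
  IFS="." read -r a b c d <<< "$1"
  echo $((a + b + c + d))
}

ip_to_tokens 40.0.0.149   # prints 189
ip_to_tokens 40.0.0.168   # prints 208
```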
> I don't know what would happen if all the nodes had the same token count.
>
> >> Aren't you using RandomPartitioner or Murmur3Partitioner?
>
> I am using the default one, which is:
>
> partitioner: org.apache.cassandra.dht.Murmur3Partitioner
>
> In order to configure Cassandra on each node I am using this script:
>
> inputs:
>   - name: IP
>   - name: SEED
> config: |
>   #!/bin/bash -v
>   cat << EOF >> /etc/resolv.conf
>   nameserver 8.8.8.8
>   nameserver 192.168.5.1
>   EOF
>
>   DEFAULT=${DEFAULT:-/etc/cassandra/default.conf}
>   CONFIG=/etc/cassandra/conf
>   # Split the node's IP into octets and sum them for num_tokens
>   IFS="." read a b c d <<< "$IP"
>   s=$((a + b + c + d))
>   sed -i -e "s/^cluster_name.*/cluster_name: 'Cassandra cluster for freeradius'/" $CONFIG/cassandra.yaml
>   sed -i -e "s/^num_tokens.*/num_tokens: $s/" $CONFIG/cassandra.yaml
>   sed -i -e "s/^listen_address.*/listen_address: $IP/" $CONFIG/cassandra.yaml
>   sed -i -e "s/^rpc_address.*/rpc_address: 0.0.0.0/" $CONFIG/cassandra.yaml
>   sed -i -e "s/^broadcast_address.*/broadcast_address: $IP/" $CONFIG/cassandra.yaml
>   sed -i -e "s/broadcast_rpc_address.*/broadcast_rpc_address: $IP/" $CONFIG/cassandra.yaml
>   sed -i -e "s/^commitlog_segment_size_in_mb.*/commitlog_segment_size_in_mb: 32/" $CONFIG/cassandra.yaml
>   sed -i -e "s/# JVM_OPTS=\"\$JVM_OPTS -Djava.rmi.server.hostname=<public name>\"/JVM_OPTS=\"\$JVM_OPTS -Djava.rmi.server.hostname=$IP\"/" $CONFIG/cassandra-env.sh
>   sed -i -e "s/- seeds.*/- seeds: \"$SEED\"/" $CONFIG/cassandra.yaml
>   sed -i -e "s/^endpoint_snitch.*/endpoint_snitch: GossipingPropertyFileSnitch/" $CONFIG/cassandra.yaml
>   echo MAX_HEAP_SIZE="4G" >> $CONFIG/cassandra-env.sh
>   echo HEAP_NEWSIZE="800M" >> $CONFIG/cassandra-env.sh
>   service cassandra stop
>   rm -rf /var/lib/cassandra/data/system/*
>   service cassandra start

2015-09-07 16:30 GMT+02:00 Ryan Svihla <r...@foundev.pro>:

>> If that's what tracing is telling you, then it's fine and just a product of data distribution (note your token counts aren't identical anyway).
>> If you're doing CL ONE queries directly against particular nodes and getting different results, it sounds like dropped mutations, streaming errors, and/or timeouts. Does running repair, or reading at CL ALL, give you an accurate total record count?
>>
>> nodetool tpstats should help identify dropped mutations after bootstrap, but you also want to monitor the logs for any errors (in general this is always good advice for any system). There could be a myriad of problems with bootstrapping new nodes; usually this is related to underprovisioning.
>>
>> On Mon, Sep 7, 2015 at 8:19 AM Alain RODRIGUEZ <arodr...@gmail.com> wrote:
>>
>>> Hi Sara,
>>>
>>> Can you detail actions performed, like how you load data, what scaleup/scaledown are, and say whether you let it decommission fully (streams finished, node removed from nodetool status), etc.? This would help us to help you :).
>>>
>>> Also, what happens if you run "CONSISTENCY LOCAL_QUORUM;" (or ALL) before your SELECT? If not using cqlsh, set the consistency level of your client to LOCAL_QUORUM or ALL and try the SELECT again.
>>>
>>> Also, I am not sure of the meaning of this --> "I'm assigning each of my nodes a different token based on its IP address (the token is A+B+C+D and the IP is A.B.C.D)". Aren't you using RandomPartitioner or Murmur3Partitioner?
>>>
>>> C*heers,
>>>
>>> Alain
>>>
>>> 2015-09-07 12:01 GMT+02:00 Edouard COLE <edouard.c...@rgsystem.com>:
>>>
>>>> Please, don't mail me directly.
>>>> I read your answer, but I cannot help any more,
>>>> and answering with "Sorry, I can't help" is pointless :)
>>>> Wait for the community to answer.
>>>>
>>>> From: ICHIBA Sara [mailto:ichi.s...@gmail.com]
>>>> Sent: Monday, September 07, 2015 11:34 AM
>>>> To: user@cassandra.apache.org
>>>> Subject: Re: cassandra scalability
>>>>
>>>> When there's a scaledown action, I make sure to decommission the node first.
>>>> But still, I don't understand why I'm seeing this behaviour. Is it normal? What do you normally do to remove a node? Is it related to tokens? I'm assigning each of my nodes a different token based on its IP address (the token is A+B+C+D and the IP is A.B.C.D).
>>>>
>>>> 2015-09-07 11:28 GMT+02:00 ICHIBA Sara <ichi.s...@gmail.com>:
>>>>
>>>> At the beginning it looks like this:
>>>>
>>>> [root@demo-server-seed-k6g62qr57nok ~]# nodetool status
>>>> Datacenter: DC1
>>>> ===============
>>>> Status=Up/Down
>>>> |/ State=Normal/Leaving/Joining/Moving
>>>> --  Address     Load       Tokens  Owns  Host ID                               Rack
>>>> UN  40.0.0.208  128.73 KB  248     ?     6e7788f9-56bf-4314-a23a-3bf1642d0606  RAC1
>>>> UN  40.0.0.209  114.59 KB  249     ?     84f6f0be-6633-4c36-b341-b968ff91a58f  RAC1
>>>> UN  40.0.0.205  129.53 KB  245     ?     aa233dc2-a8ae-4c00-af74-0a119825237f  RAC1
>>>>
>>>> [root@demo-server-seed-k6g62qr57nok ~]# nodetool status service_dictionary
>>>> Datacenter: DC1
>>>> ===============
>>>> Status=Up/Down
>>>> |/ State=Normal/Leaving/Joining/Moving
>>>> --  Address     Load       Tokens  Owns (effective)  Host ID                               Rack
>>>> UN  40.0.0.208  128.73 KB  248     68.8%             6e7788f9-56bf-4314-a23a-3bf1642d0606  RAC1
>>>> UN  40.0.0.209  114.59 KB  249     67.8%             84f6f0be-6633-4c36-b341-b968ff91a58f  RAC1
>>>> UN  40.0.0.205  129.53 KB  245     63.5%             aa233dc2-a8ae-4c00-af74-0a119825237f  RAC1
>>>>
>>>> The query "select * from service_dictionary.table1;" gave me:
>>>> 70 rows from 40.0.0.205
>>>> 64 rows from 40.0.0.209
>>>> 54 rows from 40.0.0.208
>>>>
>>>> 2015-09-07 11:13 GMT+02:00 Edouard COLE <edouard.c...@rgsystem.com>:
>>>>
>>>> Could you provide the result of:
>>>> - nodetool status
>>>> - nodetool status YOURKEYSPACE
>>
>> --
>> Regards,
>>
>> Ryan Svihla
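A sketch of the check the thread converges on: coordinators returning different counts (70/64/54 above) at CL ONE point to missing replicas, which a repair or a CL ALL read should converge. (The effective ownership figures above sum to roughly 200%, which suggests a replication factor of 2 on this 3-node cluster.) The helper name `counts_agree` is illustrative; on a live cluster the counts would come from something like `cqlsh $host -e "CONSISTENCY ONE; SELECT count(*) FROM service_dictionary.table1;"` per node:

```shell
#!/bin/bash
# Illustrative helper: succeed only when every per-coordinator count matches.
counts_agree() {
  local first="$1" c
  shift
  for c in "$@"; do
    [ "$c" = "$first" ] || return 1
  done
  return 0
}

# The per-node counts reported in this thread at CL ONE:
if ! counts_agree 70 64 54; then
  echo "inconsistent replicas: run repair or read at CONSISTENCY ALL"
fi
```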