Re: Follow-up post on cassandra configuration with some experiments on GC tuning

2010-08-29 Thread Carsten Krebs

 
 Also, note that lack of saw-toothing is not a goal in and of itself
 and may even be bad. For example, with respect to the young generation
 the situation is essentially:
 
 (1) The larger the young generation, the more significant the saw-tooth.
 (2) The larger the young generation, the more efficient the GC (if the
 application behaves according to the weak generational hypothesis -
 google it if you want a ref) because less data is promoted to old gen
 and because the overhead of stop-the-world is lessened.
 (3) The larger the young generation, the longer the pause times to do
 collections of the young generation.
 
In this regard, what I personally miss in Mikio's - however nice - analysis is 
the effect on application stop times of the garbage collection runs in the 
cases tested. In most cases I prefer low pauses due to garbage collection and 
don't care too much about the shape of the memory usage, and I guess that's the 
reason why the low-pause collector is used by default for running Cassandra. 
For myself, I have mixed feelings about the low-pause collector, because I 
found it difficult to find young-generation sizings that suit different load 
patterns. Therefore I mostly prefer the throughput collector, which adaptively 
sizes the young generation and does a good job of avoiding that too much data 
is promoted to the tenured generation. I would be interested in the differences 
in stop times between the different GC variants when running Cassandra. Is the 
low-pause collector really much better with regard to getting stable response 
times, even if I use the -XX:+UseParallelOldGC and -XX:MaxGCPauseMillis=nnn 
flags? Any experiences with this?
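One way to actually compare the stop times between the collectors would be HotSpot's GC logging flags (these existed in the Java 6 era JVMs Cassandra ran on; a sketch, not a complete logging setup):

```shell
# Log each collection plus the total time application threads were stopped,
# so the pause behaviour of CMS vs. the throughput collector can be compared.
JVM_OPTS="$JVM_OPTS -verbose:gc \
  -XX:+PrintGCDetails \
  -XX:+PrintGCTimeStamps \
  -XX:+PrintGCApplicationStoppedTime"
```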

Regards,

Carsten



Re: TokenRange contains endpoints without any port information?

2010-08-09 Thread Carsten Krebs

On 08.08.2010, at 14:47 aaron morton wrote:
 
 What sort of client side load balancing were you thinking of? I just use 
 round robin DNS to distribute clients around the cluster, and have them 
 recycle their connections every so often. 
 
I was thinking about using this method to give the client the ability to learn 
which nodes are part of the cluster, and then using this information to 
automatically adapt the set of nodes used by the client when a node is added 
to or removed from the cluster.

Why do you prefer round robin DNS for load balancing? 
One advantage I see is that the client does not have to take care of the node 
set, and especially not of managing it. The reason I was thinking about 
client-side load balancing was to avoid having to write additional tools that 
monitor all nodes in the cluster and change the DNS entry if a node fails - 
and this as fast as possible, to prevent clients from trying to use a dead 
node. But while writing this, I no longer think that this is a good point: it 
is really just a matter of some sort of retry logic, which is needed in the 
client anyway.
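The kind of client-side distribution discussed here could be sketched as a simple round-robin cursor over the known node set; the class and method names below are hypothetical, not part of any Cassandra client library:

```java
import java.util.List;
import java.util.concurrent.atomic.AtomicInteger;

// Minimal sketch of client-side round-robin node selection.
// A real client would also refresh the node list periodically
// and skip nodes that failed recently (the retry logic above).
public class NodeSelector {
    private final List<String> nodes;            // currently known cluster nodes
    private final AtomicInteger cursor = new AtomicInteger();

    public NodeSelector(List<String> nodes) {
        this.nodes = nodes;
    }

    // Pick the next node in round-robin order.
    public String next() {
        int i = Math.abs(cursor.getAndIncrement() % nodes.size());
        return nodes.get(i);
    }

    public static void main(String[] args) {
        NodeSelector s = new NodeSelector(List.of("10.0.0.1", "10.0.0.2", "10.0.0.3"));
        System.out.println(s.next()); // 10.0.0.1
        System.out.println(s.next()); // 10.0.0.2
        System.out.println(s.next()); // 10.0.0.3
        System.out.println(s.next()); // wraps back to 10.0.0.1
    }
}
```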

Carsten



TokenRange contains endpoints without any port information?

2010-08-08 Thread Carsten Krebs

I'm wondering why a TokenRange returned by describe_ring(keyspace) of the 
Thrift API contains endpoints consisting only of an address, omitting any port 
information.
My first thought was that this method could be used to expose information 
about the ring structure to the client, e.g. to do some client-side load 
balancing. But now I'm not sure about this anymore. Additionally, when looking 
into the code, I guess the address returned as part of the TokenRange is the 
address of the storage service, which could differ from the Thrift address; 
that in turn would make the returned endpoint useless for the client.
What is the purpose of this method, and why is the port information omitted?

TIA,

Carsten