Re: Follow-up post on cassandra configuration with some experiments on GC tuning
Also, note that lack of saw-toothing is not a goal in and of itself and may even be bad. For example, with respect to the young generation the situation is essentially:

(1) The larger the young generation, the more pronounced the saw-tooth.
(2) The larger the young generation, the more efficient the GC (if the application behaves according to the weak generational hypothesis - google it if you want a ref), because less data is promoted to the old generation and because the stop-the-world overhead is amortized over more allocation.
(3) The larger the young generation, the longer the pause times for collections of the young generation.

In this regard, what I personally miss in Mikio's - however nice - analysis is the effect of the garbage collection runs on application stop times in the cases tested. In most cases I prefer low pauses from garbage collection and don't care too much about the shape of the memory usage graph, and I guess that's the reason why the low pause collector is used by default for running cassandra.

I myself have mixed feelings about the low pause collector, because I found it difficult to find young generation sizings that suit different load patterns. Therefore I mostly prefer the throughput collector, which sizes the young generation adaptively and does a good job of preventing too much data from being promoted to the tenured generation.

I would be interested in how the stop times of the different GC variants compare when running cassandra. Is the low pause collector really much better for achieving stable response times, even compared to the throughput collector with the -XX:+UseParallelOldGC and -XX:MaxGCPauseMillis=nnn flags? Any experiences with this?

Regards,
Carsten
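PS: For anyone who wants to measure this themselves, GC logging makes the stop times visible. These are standard HotSpot flags; the log path is illustrative only:

    -Xloggc:/var/log/cassandra/gc.log
    -verbose:gc -XX:+PrintGCDetails
    # logs every stop-the-world pause, not just the GC cycles themselves:
    -XX:+PrintGCApplicationStoppedTime

The two setups under discussion would then roughly be the low pause collector, as set by the stock cassandra start script:

    -XX:+UseParNewGC -XX:+UseConcMarkSweepGC -XX:+CMSParallelRemarkEnabled

versus the throughput collector with a pause time goal (the 200ms goal is just an example value, not a recommendation):

    -XX:+UseParallelGC -XX:+UseParallelOldGC -XX:MaxGCPauseMillis=200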
Re: TokenRange contains endpoints without any port information?
On 08.08.2010, at 14:47, aaron morton wrote:

> What sort of client side load balancing were you thinking of? I just use
> round robin DNS to distribute clients around the cluster, and have them
> recycle their connections every so often.

I was thinking about using this method to give the client the ability to learn which nodes are part of the cluster, and to use this information to automatically adapt the set of nodes used by the client whenever a node is added to or removed from the cluster.

Why do you prefer round robin DNS for load balancing? One advantage I see is that the client does not have to take care of the node set, and especially not of managing it.

The reason I was thinking about client side load balancing was to avoid having to write additional tooling that monitors all nodes in the cluster and changes the DNS entry when a node fails - and does so as fast as possible, to keep clients from trying to use a dead node. But while writing this, I no longer think that this is a good point: it really comes down to some sort of retry logic, which is needed in the client anyway.

Carsten
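PS: What I had in mind, as a rough sketch against the Thrift-generated Java bindings (the framed transport and the port are assumptions matching the usual defaults; error handling and retries omitted):

    import java.util.HashSet;
    import java.util.Set;
    import org.apache.cassandra.thrift.Cassandra;
    import org.apache.cassandra.thrift.TokenRange;
    import org.apache.thrift.protocol.TBinaryProtocol;
    import org.apache.thrift.transport.TFramedTransport;
    import org.apache.thrift.transport.TSocket;
    import org.apache.thrift.transport.TTransport;

    public class RingDiscovery
    {
        // Ask one live node for the full ring and collect the distinct
        // endpoint addresses; a client could call this periodically to
        // refresh its node set instead of relying on a DNS entry.
        // Caveat from the other mail: these may be the storage addresses,
        // not the thrift ones, and they carry no port.
        public static Set<String> discoverNodes(String seedHost, int thriftPort,
                                                String keyspace) throws Exception
        {
            TTransport transport = new TFramedTransport(new TSocket(seedHost, thriftPort));
            transport.open();
            try
            {
                Cassandra.Client client = new Cassandra.Client(new TBinaryProtocol(transport));
                Set<String> nodes = new HashSet<String>();
                for (TokenRange range : client.describe_ring(keyspace))
                    nodes.addAll(range.getEndpoints());
                return nodes;
            }
            finally
            {
                transport.close();
            }
        }
    }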
TokenRange contains endpoints without any port information?
I'm wondering why a TokenRange returned by describe_ring(keyspace) of the thrift API contains endpoints consisting only of an address, omitting any port information. My first thought was that this method could be used to expose some information about the ring structure to the client, e.g. for some client side load balancing. But now I'm not so sure about this anymore. Additionally, looking at the code, I guess the address returned as part of the TokenRange is the address of the storage service, which could differ from the thrift address, which in turn would make the returned endpoint useless for the client. What is the purpose of this method, and why is the port information omitted?

TIA,
Carsten
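PS: For reference, this is the definition in the interface file (quoted from memory from interface/cassandra.thrift, so treat it as a sketch):

    struct TokenRange {
        1: required string start_token,
        2: required string end_token,
        3: required list<string> endpoints,
    }

So endpoints is just a list of bare address strings - there is simply no field where a port could go.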