Re: Replication factor and performance questions

2012-11-10 Thread B. Todd Burruss
@oleg, to answer your last question: a Cassandra node should never ask another node for information it doesn't have. It uses the key and the partitioner to determine where the data is located before ever contacting another node. On Mon, Nov 5, 2012 at 9:45 AM, Andrey Ilinykh ailin...@gmail.com
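To illustrate the point about the key and the partitioner, here is a minimal sketch of how a node could work out the replicas for a key locally, without asking any other node. It is a toy model, not Cassandra's code: the MD5-based `token_for`, the evenly spaced four-node `RING`, and the "owner plus the next nodes clockwise" placement are all assumptions made for illustration.

```python
import bisect
import hashlib

TOKEN_SPACE = 2**127  # assumed size of the toy token space


def token_for(key: bytes) -> int:
    """Toy partitioner (assumption): map a key to a token via its MD5 digest."""
    return int.from_bytes(hashlib.md5(key).digest(), "big") % TOKEN_SPACE


def replicas_for_token(token: int, ring, rf: int):
    """Return the rf nodes responsible for a token.

    ring is a list of (token, node_name) pairs sorted by token. The owner is
    the first node whose token is >= the given token, wrapping around to the
    start of the ring; the other replicas are the next nodes clockwise.
    """
    tokens = [t for t, _ in ring]
    start = bisect.bisect_left(tokens, token) % len(ring)
    count = min(rf, len(ring))
    return [ring[(start + i) % len(ring)][1] for i in range(count)]


def replicas_for(key: bytes, ring, rf: int):
    """Locate replicas for a key using only local knowledge of the ring."""
    return replicas_for_token(token_for(key), ring, rf)


# Hypothetical 4-node ring with evenly spaced tokens.
RING = [(i * (TOKEN_SPACE // 4), f"node{i + 1}") for i in range(4)]

if __name__ == "__main__":
    print(replicas_for(b"some-row-key", RING, 2))
```

Every node that knows the ring layout computes the same answer for the same key, which is why no lookup request to another node is needed before routing.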

Re: Replication factor and performance questions

2012-11-05 Thread Michael Kjellman
Rule of thumb is to try to keep nodes under 400GB. Compactions, repairs, move operations, etc. become a nightmare otherwise. How much data do you expect to have on each node? It also depends on caches, bloom filters, etc. On 11/5/12 8:57 AM, Oleg Dulin oleg.du...@gmail.com wrote: I have 4 nodes at my

Re: Replication factor and performance questions

2012-11-05 Thread Bryan
Our compactions/repairs have already become nightmares and we have not approached the levels of data you describe here (~200 GB). Do you have any pointers or case studies for optimizing this? On Nov 5, 2012, at 12:00 PM, Michael Kjellman wrote: Rule of thumb is to try to keep nodes under 400GB.

Re: Replication factor and performance questions

2012-11-05 Thread Oleg Dulin
Should be all under 400 GB on each. My question is -- is there additional overhead with replicas making requests to one another for keys they don't have? How much of an overhead is that? On 2012-11-05 17:00:37 +, Michael Kjellman said: Rule of thumb is to try to keep nodes under

Re: Replication factor and performance questions

2012-11-05 Thread Andrey Ilinykh
You will have one extra hop. Not a big deal, actually. And many client libraries (Astyanax, for example) are token aware, so they are smart enough to call the right node. On Mon, Nov 5, 2012 at 9:12 AM, Oleg Dulin oleg.du...@gmail.com wrote: Should be all under 400 GB on each. My question is --
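As a rough feel for how often that extra hop occurs, here is a back-of-the-envelope sketch. It assumes a client that is not token aware and picks its coordinator uniformly at random, and that a coordinator holding a replica can answer without forwarding (roughly the case for a read at consistency level ONE). The function name is hypothetical, and real behaviour also depends on consistency level and snitch settings.

```python
from fractions import Fraction


def extra_hop_probability(num_nodes: int, replication_factor: int) -> Fraction:
    """Chance that a uniformly random coordinator is not a replica for the key."""
    if not 1 <= replication_factor <= num_nodes:
        raise ValueError("replication factor must be between 1 and num_nodes")
    return 1 - Fraction(replication_factor, num_nodes)


if __name__ == "__main__":
    # The 4-node cluster from the original question, for each possible RF.
    for rf in range(1, 5):
        print(f"RF={rf}: extra hop on {extra_hop_probability(4, rf)} of requests")
```

Under these assumptions, a higher replication factor makes the extra hop less likely for such a client, and a token-aware client avoids it altogether by contacting a replica directly.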