Re: RFC: Cassandra Virtual Nodes

2012-03-17 Thread Radim Kolar
I don't like that every node will have the same portion of data. 1. We are using nodes with different HW sizes (number of disks). 2. Especially with the ordered partitioner there tend to be hotspots, and you must assign a smaller portion of data to nodes holding hotspots.
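
To make point 1 concrete, here is a minimal sketch (not from the thread, and not Cassandra's implementation) of one way a virtual-node scheme could give bigger machines a bigger share: each node takes a number of tokens proportional to its disk count. The names `token`, `build_ring`, `ownership`, the 64-bit token space and the `vnodes_per_disk` parameter are all assumptions made up for illustration.

```python
import hashlib

RING_SIZE = 2 ** 64  # hypothetical token space, chosen only for this sketch


def token(label: str) -> int:
    """Map a label to a position on the ring (first 8 bytes of its MD5)."""
    digest = hashlib.md5(label.encode("utf-8")).digest()
    return int.from_bytes(digest[:8], "big")


def build_ring(disks_per_node, vnodes_per_disk=64):
    """Give each node a number of tokens proportional to its disk count."""
    ring = []
    for node, disks in disks_per_node.items():
        for i in range(disks * vnodes_per_disk):
            ring.append((token(f"{node}#{i}"), node))
    ring.sort()
    return ring


def ownership(ring):
    """Fraction of the ring owned by each node.

    A token owns the range from the previous token (exclusive) up to
    itself (inclusive); the first token's range wraps around the ring.
    """
    owned = {}
    prev = ring[-1][0] - RING_SIZE  # predecessor of the first token, wrapped
    for tok, node in ring:
        owned[node] = owned.get(node, 0) + (tok - prev)
        prev = tok
    return {node: size / RING_SIZE for node, size in owned.items()}


if __name__ == "__main__":
    shares = ownership(build_ring({"small": 1, "large": 4}))
    for node, share in sorted(shares.items()):
        print(f"{node}: {share:.1%} of the ring")
```

With these assumptions the share of each node only approximates its weight, because token positions are pseudo-random; more tokens per disk tighten the approximation. Hotspots under an ordered partitioner (point 2) are not addressed by this sketch.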

Re: RFC: Cassandra Virtual Nodes

2012-03-17 Thread Sam Overton
On 17 March 2012 11:15, Radim Kolar h...@filez.com wrote: I don't like that every node will have the same portion of data. 1. We are using nodes with different HW sizes (number of disks). 2. Especially with the ordered partitioner there tend to be hotspots, and you must assign a smaller portion of

Re: RFC: Cassandra Virtual Nodes

2012-03-17 Thread Zhu Han
On Sat, Mar 17, 2012 at 7:38 AM, Sam Overton s...@acunu.com wrote: Hello cassandra-dev, This is a long email. It concerns a significant change to Cassandra, so deserves a thorough introduction. *The summary is*: we believe virtual nodes are the way forward. We would like to add virtual

Re: RFC: Cassandra Virtual Nodes

2012-03-17 Thread Eric Evans
On Sat, Mar 17, 2012 at 11:15 AM, Radim Kolar h...@filez.com wrote: I don't like that every node will have the same portion of data. 1. We are using nodes with different HW sizes (number of disks). 2. Especially with the ordered partitioner there tend to be hotspots, and you must assign a smaller

Re: RFC: Cassandra Virtual Nodes

2012-03-17 Thread Edward Capriolo
I agree having smaller regions would help the rebalancing situation both with RP and BOP. However, I am not sure if dividing tables across disks will give any better performance. You will have more seeking spindles and can possibly subdivide token ranges into separate files. But fs cache will

Re: RFC: Cassandra Virtual Nodes

2012-03-17 Thread Eric Evans
On Sat, Mar 17, 2012 at 3:22 PM, Zhu Han schumi@gmail.com wrote: On Sat, Mar 17, 2012 at 7:38 AM, Sam Overton s...@acunu.com wrote: This is a long email. It concerns a significant change to Cassandra, so deserves a thorough introduction. *The summary is*: we believe virtual nodes are the

Re: RFC: Cassandra Virtual Nodes

2012-03-17 Thread Peter Schuller
*The summary is*: we believe virtual nodes are the way forward. We would like to add virtual nodes to Cassandra and we are asking for comments, criticism and collaboration! I am very happy to see some momentum on this, and I would like to go even further than what you propose. The main reasons

Re: RFC: Cassandra Virtual Nodes

2012-03-17 Thread Peter Schuller
Point of clarification: My use of the term bucket is completely unrelated to the term bucket used in the CRUSH paper. -- / Peter Schuller (@scode, http://worldmodscode.wordpress.com)