Re: about hash calculation in reference page at docs.basho.com
very clear explanation, Charlie. Thanks again! nice weekend. -donald On Tue, Jan 29, 2013 at 1:26 PM, Charlie Voiselle cvoise...@basho.comwrote: Donald: Since the list of partitions can easily be segmented using a filter on partition values that are greater than the hashed value, it makes for a simple determination of the ordered preflist. For Example: Hash = 1045375627425331784151332358177649483819648417632 Parition| P H | List | +---+---+ 0 | False | B | 182687704666362864775460604089535377456991567872| False | B | 365375409332725729550921208179070754913983135744| False | B | 548063113999088594326381812268606132370974703616| False | B | 730750818665451459101842416358141509827966271488| False | B | 913438523331814323877303020447676887284957839360| False | B | 1096126227998177188652763624537212264741949407232 | True | A | 1278813932664540053428224228626747642198940975104 | True | A | +---+---+ Then so long as order is maintained, your sorted preflist is simply A ++ B. It would be a (if only slightly) more complex algorithm, requiring determination by lookahead if you were in the correct partition, in order to have the partitions names be starting values. To answer your second question, the partitions are named by dividing the integer keyspace by the num partitions to create an incrementor[1]. This incrementor is used as the stepping factor in a sequence from 0 to RINGTOP value -- `trunc(math:pow(2,160)-1)`. You can call these functions yourself in the `riak attach` as follows: (dev1@127.0.0.1)1 Inc = chash:ring_increment(8). 182687704666362864775460604089535377456991567872 (dev1@127.0.0.1)2 [{IndexAsInt} || IndexAsInt -lists:seq(0,( trunc( math:pow(2,160) -1) -1),Inc)]. [{0}, {182687704666362864775460604089535377456991567872}, {365375409332725729550921208179070754913983135744}, {548063113999088594326381812268606132370974703616}, {730750818665451459101842416358141509827966271488}, {913438523331814323877303020447676887284957839360}, {1096126227998177188652763624537212264741949407232}, {1278813932664540053428224228626747642198940975104}]} Hope this helps! Charlie Voiselle --- [0] https://github.com/basho/riak_core/blob/master/src/chash.erl#L88 [1] https://github.com/basho/riak_core/blob/master/src/chash.erl#L162 On Jan 29, 2013, at 9:47 AM, Donald Yan dyan@gmail.com wrote: Hi Charlie, That really helps to clear the confusion. Thanks again! Then it comes to the next question :-) Just curious to know: is there a special reason why using the end value as the partition's name? Normally, most people I think would pick the beginning value as the partition's name in common logic. And a second question related to the above question is how this end value is picked: partition named 0 is actually ending with number 0, and so forth for the other partitions. This is like the 160-bit space got right-shifted one number relative to 0-started numbering system. Is there a special reason for it in Riak, or simply want to use 1-started numbering system. wr, donald On Mon, Jan 28, 2013 at 3:21 PM, Charlie Voiselle cvoise...@basho.comwrote: Donald: Since we are traversing the ring in a clockwise fashion, the data is written into the next partition you would run into moving forward. You can see how the partitions are chosen in* *chash:ordered_from/2 at https://github.com/basho/riak_core/blob/master/src/chash.erl#L138. The preference list is built using the partitions that come after the hashed value. If you consider the partition's name to be the end value that the partition holds, the an object that hashes to 1045375627425331784151332358177649483819648417632 being assigned to 1096126227998177188652763624537212264741949407232 makes sense. That partition would hold items that hash from 913438523331814323877303020447676887284957839361 to 1096126227998177188652763624537212264741949407232 Does this help? Thanks, Charlie Voiselle On Jan 24, 2013, at 12:05 AM, Donald Yan dyan@gmail.com wrote: Hi there, I'm reading this page at http://docs.basho.com/riak/latest/references/appendices/concepts/Replication/#Understanding-replication-by-example and have a question about the example given in the page. I quote it here: The DocIdx hash is a 160 bit integer: (dev1@127.0.0.1)5 I:160/integer = DocIdx. 183,28,67,173,80,128,26,94,190,198,65,15,27,243,135,127,121,101,255,96 (dev1@127.0.0.1)6 I.1045375627425331784151332358177649483819648417632 The node looks up the hashed key in the ring which returns a list of “preferred” partitions for the given key. (node1@127.0.0.1) Preflist = riak_core_ring:preflist(DocIdx,
Re: about hash calculation in reference page at docs.basho.com
Hi Charlie, That really helps to clear the confusion. Thanks again! Then it comes to the next question :-) Just curious to know: is there a special reason why using the end value as the partition's name? Normally, most people I think would pick the beginning value as the partition's name in common logic. And a second question related to the above question is how this end value is picked: partition named 0 is actually ending with number 0, and so forth for the other partitions. This is like the 160-bit space got right-shifted one number relative to 0-started numbering system. Is there a special reason for it in Riak, or simply want to use 1-started numbering system. wr, donald On Mon, Jan 28, 2013 at 3:21 PM, Charlie Voiselle cvoise...@basho.comwrote: Donald: Since we are traversing the ring in a clockwise fashion, the data is written into the next partition you would run into moving forward. You can see how the partitions are chosen in* *chash:ordered_from/2 at https://github.com/basho/riak_core/blob/master/src/chash.erl#L138. The preference list is built using the partitions that come after the hashed value. If you consider the partition's name to be the end value that the partition holds, the an object that hashes to 1045375627425331784151332358177649483819648417632 being assigned to 1096126227998177188652763624537212264741949407232 makes sense. That partition would hold items that hash from 913438523331814323877303020447676887284957839361 to 1096126227998177188652763624537212264741949407232 Does this help? Thanks, Charlie Voiselle On Jan 24, 2013, at 12:05 AM, Donald Yan dyan@gmail.com wrote: Hi there, I'm reading this page at http://docs.basho.com/riak/latest/references/appendices/concepts/Replication/#Understanding-replication-by-example and have a question about the example given in the page. I quote it here: The DocIdx hash is a 160 bit integer: (dev1@127.0.0.1)5 I:160/integer = DocIdx. 183,28,67,173,80,128,26,94,190,198,65,15,27,243,135,127,121,101,255,96 (dev1@127.0.0.1)6 I.1045375627425331784151332358177649483819648417632 The node looks up the hashed key in the ring which returns a list of “preferred” partitions for the given key. (node1@127.0.0.1) Preflist = riak_core_ring:preflist(DocIdx, Ring). [{1096126227998177188652763624537212264741949407232, 'dev1@127.0.0.1'}, {1278813932664540053428224228626747642198940975104, 'dev2@127.0.0.1'}, {0, 'dev1@127.0.0.1'}, {182687704666362864775460604089535377456991567872, 'dev2@127.0.0.1'}, {365375409332725729550921208179070754913983135744, 'dev3@127.0.0.1'}, {548063113999088594326381812268606132370974703616, 'dev1@127.0.0.1'}, {730750818665451459101842416358141509827966271488, 'dev2@127.0.0.1'}, {913438523331814323877303020447676887284957839360, 'dev3@127.0.0.1'}] I question is since the key is calculated as 1045375627425331784151332358177649483819648417632, shouldn't the prefer list be like the following instead? [{913438523331814323877303020447676887284957839360, 'dev3@127.0.0.1'}, {1096126227998177188652763624537212264741949407232, 'dev1@127.0.0.1'}, {1278813932664540053428224228626747642198940975104, 'dev2@127.0.0.1'}, {0, 'dev1@127.0.0.1'}, {182687704666362864775460604089535377456991567872, 'dev2@127.0.0.1'}, {365375409332725729550921208179070754913983135744, 'dev3@127.0.0.1'}, {548063113999088594326381812268606132370974703616, 'dev1@127.0.0.1'}, {730750818665451459101842416358141509827966271488, 'dev2@127.0.0.1'}] because value 1045375627425331784151332358177649483819648417632 falls between 913438523331814323877303020447676887284957839360 (partition 5) and 1096126227998177188652763624537212264741949407232 (partition 6)? Please Riak experts help to clear my confusion here. Why partition 6, 7, 0 are on the top 3 of the list? Appreciate in advance. -donald ___ riak-users mailing list riak-users@lists.basho.com http://lists.basho.com/mailman/listinfo/riak-users_lists.basho.com ___ riak-users mailing list riak-users@lists.basho.com http://lists.basho.com/mailman/listinfo/riak-users_lists.basho.com
Re: about hash calculation in reference page at docs.basho.com
Donald: Since the list of partitions can easily be segmented using a filter on partition values that are greater than the hashed value, it makes for a simple determination of the ordered preflist. For Example: Hash = 1045375627425331784151332358177649483819648417632 Parition| P H | List | +---+---+ 0 | False | B | 182687704666362864775460604089535377456991567872| False | B | 365375409332725729550921208179070754913983135744| False | B | 548063113999088594326381812268606132370974703616| False | B | 730750818665451459101842416358141509827966271488| False | B | 913438523331814323877303020447676887284957839360| False | B | 1096126227998177188652763624537212264741949407232 | True | A | 1278813932664540053428224228626747642198940975104 | True | A | +---+---+ Then so long as order is maintained, your sorted preflist is simply A ++ B. It would be a (if only slightly) more complex algorithm, requiring determination by lookahead if you were in the correct partition, in order to have the partitions names be starting values. To answer your second question, the partitions are named by dividing the integer keyspace by the num partitions to create an incrementor[1]. This incrementor is used as the stepping factor in a sequence from 0 to RINGTOP value -- `trunc(math:pow(2,160)-1)`. You can call these functions yourself in the `riak attach` as follows: (dev1@127.0.0.1)1 Inc = chash:ring_increment(8). 182687704666362864775460604089535377456991567872 (dev1@127.0.0.1)2 [{IndexAsInt} || IndexAsInt -lists:seq(0,( trunc( math:pow(2,160) -1) -1),Inc)]. [{0}, {182687704666362864775460604089535377456991567872}, {365375409332725729550921208179070754913983135744}, {548063113999088594326381812268606132370974703616}, {730750818665451459101842416358141509827966271488}, {913438523331814323877303020447676887284957839360}, {1096126227998177188652763624537212264741949407232}, {1278813932664540053428224228626747642198940975104}]} Hope this helps! Charlie Voiselle --- [0] https://github.com/basho/riak_core/blob/master/src/chash.erl#L88 [1] https://github.com/basho/riak_core/blob/master/src/chash.erl#L162 On Jan 29, 2013, at 9:47 AM, Donald Yan dyan@gmail.com wrote: Hi Charlie, That really helps to clear the confusion. Thanks again! Then it comes to the next question :-) Just curious to know: is there a special reason why using the end value as the partition's name? Normally, most people I think would pick the beginning value as the partition's name in common logic. And a second question related to the above question is how this end value is picked: partition named 0 is actually ending with number 0, and so forth for the other partitions. This is like the 160-bit space got right-shifted one number relative to 0-started numbering system. Is there a special reason for it in Riak, or simply want to use 1-started numbering system. wr, donald On Mon, Jan 28, 2013 at 3:21 PM, Charlie Voiselle cvoise...@basho.com wrote: Donald: Since we are traversing the ring in a clockwise fashion, the data is written into the next partition you would run into moving forward. You can see how the partitions are chosen in chash:ordered_from/2 at https://github.com/basho/riak_core/blob/master/src/chash.erl#L138. The preference list is built using the partitions that come after the hashed value. If you consider the partition's name to be the end value that the partition holds, the an object that hashes to 1045375627425331784151332358177649483819648417632 being assigned to 1096126227998177188652763624537212264741949407232 makes sense. That partition would hold items that hash from 913438523331814323877303020447676887284957839361 to 1096126227998177188652763624537212264741949407232 Does this help? Thanks, Charlie Voiselle On Jan 24, 2013, at 12:05 AM, Donald Yan dyan@gmail.com wrote: Hi there, I'm reading this page at http://docs.basho.com/riak/latest/references/appendices/concepts/Replication/#Understanding-replication-by-example and have a question about the example given in the page. I quote it here: The DocIdx hash is a 160 bit integer: (dev1@127.0.0.1)5 I:160/integer = DocIdx. 183,28,67,173,80,128,26,94,190,198,65,15,27,243,135,127,121,101,255,96 (dev1@127.0.0.1)6 I. 1045375627425331784151332358177649483819648417632 The node looks up the hashed key in the ring which returns a list of “preferred” partitions for the given key. (node1@127.0.0.1) Preflist = riak_core_ring:preflist(DocIdx, Ring). [{1096126227998177188652763624537212264741949407232, 'dev1@127.0.0.1'}, {1278813932664540053428224228626747642198940975104, 'dev2@127.0.0.1'}, {0,
about hash calculation in reference page at docs.basho.com
Hi there, I'm reading this page at http://docs.basho.com/riak/latest/references/appendices/concepts/Replication/#Understanding-replication-by-example and have a question about the example given in the page. I quote it here: The DocIdx hash is a 160 bit integer: (dev1@127.0.0.1)5 I:160/integer = DocIdx. 183,28,67,173,80,128,26,94,190,198,65,15,27,243,135,127,121,101,255,96 (dev1@127.0.0.1)6 I.1045375627425331784151332358177649483819648417632 The node looks up the hashed key in the ring which returns a list of “preferred” partitions for the given key. (node1@127.0.0.1) Preflist = riak_core_ring:preflist(DocIdx, Ring). [{1096126227998177188652763624537212264741949407232, 'dev1@127.0.0.1'}, {1278813932664540053428224228626747642198940975104, 'dev2@127.0.0.1'}, {0, 'dev1@127.0.0.1'}, {182687704666362864775460604089535377456991567872, 'dev2@127.0.0.1'}, {365375409332725729550921208179070754913983135744, 'dev3@127.0.0.1'}, {548063113999088594326381812268606132370974703616, 'dev1@127.0.0.1'}, {730750818665451459101842416358141509827966271488, 'dev2@127.0.0.1'}, {913438523331814323877303020447676887284957839360, 'dev3@127.0.0.1'}] I question is since the key is calculated as 1045375627425331784151332358177649483819648417632, shouldn't the prefer list be like the following instead? [{913438523331814323877303020447676887284957839360, 'dev3@127.0.0.1'}, {1096126227998177188652763624537212264741949407232, 'dev1@127.0.0.1'}, {1278813932664540053428224228626747642198940975104, 'dev2@127.0.0.1'}, {0, 'dev1@127.0.0.1'}, {182687704666362864775460604089535377456991567872, 'dev2@127.0.0.1'}, {365375409332725729550921208179070754913983135744, 'dev3@127.0.0.1'}, {548063113999088594326381812268606132370974703616, 'dev1@127.0.0.1'}, {730750818665451459101842416358141509827966271488, 'dev2@127.0.0.1'}] because value 1045375627425331784151332358177649483819648417632 falls between 913438523331814323877303020447676887284957839360 (partition 5) and 1096126227998177188652763624537212264741949407232 (partition 6)? Please Riak experts help to clear my confusion here. Why partition 6, 7, 0 are on the top 3 of the list? Appreciate in advance. -donald ___ riak-users mailing list riak-users@lists.basho.com http://lists.basho.com/mailman/listinfo/riak-users_lists.basho.com