Re: about hash calculation in reference page at docs.basho.com

2013-02-03 Thread Donald Yan
very clear explanation, Charlie.  Thanks again!  nice weekend.

-donald

On Tue, Jan 29, 2013 at 1:26 PM, Charlie Voiselle cvoise...@basho.comwrote:

 Donald:

 Since the list of partitions can easily be segmented using a filter on 
 partition values
 that are greater than the hashed value, it makes for a simple determination 
 of the ordered
 preflist.

 For Example: Hash = 1045375627425331784151332358177649483819648417632

 Parition| P  H | List  |
 +---+---+
 0   | False |   B   |
 182687704666362864775460604089535377456991567872| False |   B   |
 365375409332725729550921208179070754913983135744| False |   B   |
 548063113999088594326381812268606132370974703616| False |   B   |
 730750818665451459101842416358141509827966271488| False |   B   |
 913438523331814323877303020447676887284957839360| False |   B   |
 1096126227998177188652763624537212264741949407232   | True  |   A   |
 1278813932664540053428224228626747642198940975104   | True  |   A   |
 +---+---+

 Then so long as order is maintained, your sorted preflist is simply A ++ B.

 It would be a (if only slightly) more complex algorithm, requiring 
 determination by
 lookahead if you were in the correct partition, in order to have the 
 partitions names be
 starting values.

 To answer your second question, the partitions are named by dividing the 
 integer keyspace
 by the num partitions to create an incrementor[1].  This incrementor is used 
 as the
 stepping factor in a sequence from 0 to RINGTOP value -- 
 `trunc(math:pow(2,160)-1)`.

 You can call these functions yourself in the `riak attach` as follows:

 (dev1@127.0.0.1)1 Inc = chash:ring_increment(8).
 182687704666362864775460604089535377456991567872

 (dev1@127.0.0.1)2 [{IndexAsInt} || IndexAsInt -lists:seq(0,( trunc( 
 math:pow(2,160) -1)
 -1),Inc)].
 [{0},
 {182687704666362864775460604089535377456991567872},
 {365375409332725729550921208179070754913983135744},
 {548063113999088594326381812268606132370974703616},
 {730750818665451459101842416358141509827966271488},
 {913438523331814323877303020447676887284957839360},
 {1096126227998177188652763624537212264741949407232},
 {1278813932664540053428224228626747642198940975104}]}



 Hope this helps!

 Charlie Voiselle


 ---
 [0] https://github.com/basho/riak_core/blob/master/src/chash.erl#L88
 [1] https://github.com/basho/riak_core/blob/master/src/chash.erl#L162





 On Jan 29, 2013, at 9:47 AM, Donald Yan dyan@gmail.com wrote:

 Hi Charlie,

 That really helps to clear the confusion.  Thanks again!

 Then it comes to the next question :-)  Just curious to know: is there a
 special reason why using the end value as the partition's name?  Normally,
 most people I think would pick the beginning value as the partition's name
 in common logic.

 And a second question related to the above question is how this end value
 is picked: partition named 0 is actually ending with number 0, and so
 forth for the other partitions.  This is like the 160-bit space got
 right-shifted one number relative to 0-started numbering system.  Is there
 a special reason for it in Riak, or simply want to use 1-started numbering
 system.


 wr, donald

 On Mon, Jan 28, 2013 at 3:21 PM, Charlie Voiselle cvoise...@basho.comwrote:

 Donald:

 Since we are traversing the ring in a clockwise fashion, the data is
 written into the next partition you would run into moving forward.  You can
 see how the partitions are chosen in* *chash:ordered_from/2 at
 https://github.com/basho/riak_core/blob/master/src/chash.erl#L138.  The
 preference list is built using the partitions that come after the hashed
 value.

 If you consider the partition's name to be the end value that the
 partition holds, the an object that hashes to
 1045375627425331784151332358177649483819648417632 being assigned to
 1096126227998177188652763624537212264741949407232 makes sense. That
 partition would hold items that hash from
 913438523331814323877303020447676887284957839361 to
 1096126227998177188652763624537212264741949407232


 Does this help?

 Thanks,
 Charlie Voiselle


 On Jan 24, 2013, at 12:05 AM, Donald Yan dyan@gmail.com wrote:

 Hi there,

 I'm reading this page at
 http://docs.basho.com/riak/latest/references/appendices/concepts/Replication/#Understanding-replication-by-example
  and
 have a question about the example given in the page.  I quote it here:
 

 The DocIdx hash is a 160 bit integer:

 (dev1@127.0.0.1)5 I:160/integer = DocIdx.
 183,28,67,173,80,128,26,94,190,198,65,15,27,243,135,127,121,101,255,96
 (dev1@127.0.0.1)6 I.1045375627425331784151332358177649483819648417632

 The node looks up the hashed key in the ring which returns a list of
 “preferred” partitions for the given key.

 (node1@127.0.0.1) Preflist = riak_core_ring:preflist(DocIdx, 

Re: about hash calculation in reference page at docs.basho.com

2013-01-29 Thread Donald Yan
Hi Charlie,

That really helps to clear the confusion.  Thanks again!

Then it comes to the next question :-)  Just curious to know: is there a
special reason why using the end value as the partition's name?  Normally,
most people I think would pick the beginning value as the partition's name
in common logic.

And a second question related to the above question is how this end value
is picked: partition named 0 is actually ending with number 0, and so
forth for the other partitions.  This is like the 160-bit space got
right-shifted one number relative to 0-started numbering system.  Is there
a special reason for it in Riak, or simply want to use 1-started numbering
system.


wr, donald

On Mon, Jan 28, 2013 at 3:21 PM, Charlie Voiselle cvoise...@basho.comwrote:

 Donald:

 Since we are traversing the ring in a clockwise fashion, the data is
 written into the next partition you would run into moving forward.  You can
 see how the partitions are chosen in* *chash:ordered_from/2 at
 https://github.com/basho/riak_core/blob/master/src/chash.erl#L138.  The
 preference list is built using the partitions that come after the hashed
 value.

 If you consider the partition's name to be the end value that the
 partition holds, the an object that hashes to
 1045375627425331784151332358177649483819648417632 being assigned to
 1096126227998177188652763624537212264741949407232 makes sense. That
 partition would hold items that hash from
 913438523331814323877303020447676887284957839361 to
 1096126227998177188652763624537212264741949407232


 Does this help?

 Thanks,
 Charlie Voiselle


 On Jan 24, 2013, at 12:05 AM, Donald Yan dyan@gmail.com wrote:

 Hi there,

 I'm reading this page at
 http://docs.basho.com/riak/latest/references/appendices/concepts/Replication/#Understanding-replication-by-example
  and
 have a question about the example given in the page.  I quote it here:
 

 The DocIdx hash is a 160 bit integer:

 (dev1@127.0.0.1)5 I:160/integer = DocIdx.
 183,28,67,173,80,128,26,94,190,198,65,15,27,243,135,127,121,101,255,96
 (dev1@127.0.0.1)6 I.1045375627425331784151332358177649483819648417632

 The node looks up the hashed key in the ring which returns a list of
 “preferred” partitions for the given key.

 (node1@127.0.0.1) Preflist = riak_core_ring:preflist(DocIdx, Ring).
 [{1096126227998177188652763624537212264741949407232, 'dev1@127.0.0.1'},
 {1278813932664540053428224228626747642198940975104, 'dev2@127.0.0.1'},
 {0, 'dev1@127.0.0.1'},
 {182687704666362864775460604089535377456991567872, 'dev2@127.0.0.1'},
 {365375409332725729550921208179070754913983135744, 'dev3@127.0.0.1'},
 {548063113999088594326381812268606132370974703616, 'dev1@127.0.0.1'},
 {730750818665451459101842416358141509827966271488, 'dev2@127.0.0.1'},
 {913438523331814323877303020447676887284957839360, 'dev3@127.0.0.1'}]


 

 I question is since the key is  calculated as 
 1045375627425331784151332358177649483819648417632, shouldn't the prefer
 list be like the following instead?
 
 [{913438523331814323877303020447676887284957839360, 'dev3@127.0.0.1'},
 {1096126227998177188652763624537212264741949407232, 'dev1@127.0.0.1'},
 {1278813932664540053428224228626747642198940975104, 'dev2@127.0.0.1'},
 {0, 'dev1@127.0.0.1'},
 {182687704666362864775460604089535377456991567872, 'dev2@127.0.0.1'},
 {365375409332725729550921208179070754913983135744, 'dev3@127.0.0.1'},
 {548063113999088594326381812268606132370974703616, 'dev1@127.0.0.1'},
 {730750818665451459101842416358141509827966271488, 'dev2@127.0.0.1'}]
 
 because value 1045375627425331784151332358177649483819648417632 falls
 between 913438523331814323877303020447676887284957839360 (partition 5)
 and 1096126227998177188652763624537212264741949407232 (partition 6)?

 Please Riak experts help to clear my confusion here.  Why partition 6, 7,
 0 are on the top 3 of the list?

 Appreciate in advance.


 -donald
 ___
 riak-users mailing list
 riak-users@lists.basho.com
 http://lists.basho.com/mailman/listinfo/riak-users_lists.basho.com



___
riak-users mailing list
riak-users@lists.basho.com
http://lists.basho.com/mailman/listinfo/riak-users_lists.basho.com


Re: about hash calculation in reference page at docs.basho.com

2013-01-29 Thread Charlie Voiselle
Donald:

Since the list of partitions can easily be segmented using a filter on 
partition values
that are greater than the hashed value, it makes for a simple determination of 
the ordered
preflist.

For Example: Hash = 1045375627425331784151332358177649483819648417632

Parition| P  H | List  |
+---+---+ 
0   | False |   B   |
182687704666362864775460604089535377456991567872| False |   B   |
365375409332725729550921208179070754913983135744| False |   B   |
548063113999088594326381812268606132370974703616| False |   B   |
730750818665451459101842416358141509827966271488| False |   B   |
913438523331814323877303020447676887284957839360| False |   B   |
1096126227998177188652763624537212264741949407232   | True  |   A   |
1278813932664540053428224228626747642198940975104   | True  |   A   |
+---+---+

Then so long as order is maintained, your sorted preflist is simply A ++ B.

It would be a (if only slightly) more complex algorithm, requiring 
determination by
lookahead if you were in the correct partition, in order to have the partitions 
names be
starting values.

To answer your second question, the partitions are named by dividing the 
integer keyspace
by the num partitions to create an incrementor[1].  This incrementor is used as 
the
stepping factor in a sequence from 0 to RINGTOP value -- 
`trunc(math:pow(2,160)-1)`.

You can call these functions yourself in the `riak attach` as follows:

(dev1@127.0.0.1)1 Inc = chash:ring_increment(8).
182687704666362864775460604089535377456991567872

(dev1@127.0.0.1)2 [{IndexAsInt} || IndexAsInt -lists:seq(0,( trunc( 
math:pow(2,160) -1)
-1),Inc)]. 
[{0}, 
{182687704666362864775460604089535377456991567872},
{365375409332725729550921208179070754913983135744},
{548063113999088594326381812268606132370974703616},
{730750818665451459101842416358141509827966271488},
{913438523331814323877303020447676887284957839360},
{1096126227998177188652763624537212264741949407232},
{1278813932664540053428224228626747642198940975104}]}


Hope this helps! 
Charlie Voiselle

--- 
[0] https://github.com/basho/riak_core/blob/master/src/chash.erl#L88 
[1] https://github.com/basho/riak_core/blob/master/src/chash.erl#L162





On Jan 29, 2013, at 9:47 AM, Donald Yan dyan@gmail.com wrote:

 Hi Charlie, 
 
 That really helps to clear the confusion.  Thanks again!
 
 Then it comes to the next question :-)  Just curious to know: is there a 
 special reason why using the end value as the partition's name?  Normally, 
 most people I think would pick the beginning value as the partition's name in 
 common logic.  
 
 And a second question related to the above question is how this end value is 
 picked: partition named 0 is actually ending with number 0, and so forth 
 for the other partitions.  This is like the 160-bit space got right-shifted 
 one number relative to 0-started numbering system.  Is there a special reason 
 for it in Riak, or simply want to use 1-started numbering system.
 
 
 wr, donald
 
 On Mon, Jan 28, 2013 at 3:21 PM, Charlie Voiselle cvoise...@basho.com wrote:
 Donald:
 
 Since we are traversing the ring in a clockwise fashion, the data is written 
 into the next partition you would run into moving forward.  You can see how 
 the partitions are chosen in chash:ordered_from/2 at 
 https://github.com/basho/riak_core/blob/master/src/chash.erl#L138.  The 
 preference list is built using the partitions that come after the hashed 
 value.
 
 If you consider the partition's name to be the end value that the partition 
 holds, the an object that hashes to 
 1045375627425331784151332358177649483819648417632 being assigned to 
 1096126227998177188652763624537212264741949407232 makes sense. That partition 
 would hold items that hash from 
 913438523331814323877303020447676887284957839361 to 
 1096126227998177188652763624537212264741949407232
 
 
 Does this help?
 
 Thanks,
 Charlie Voiselle
 
 
 On Jan 24, 2013, at 12:05 AM, Donald Yan dyan@gmail.com wrote:
 
 Hi there,
 
 I'm reading this page at 
 http://docs.basho.com/riak/latest/references/appendices/concepts/Replication/#Understanding-replication-by-example
  and have a question about the example given in the page.  I quote it here:
 
 The DocIdx hash is a 160 bit integer:
 
 (dev1@127.0.0.1)5 I:160/integer = DocIdx.
 183,28,67,173,80,128,26,94,190,198,65,15,27,243,135,127,121,101,255,96
 (dev1@127.0.0.1)6 I.
 1045375627425331784151332358177649483819648417632
 The node looks up the hashed key in the ring which returns a list of 
 “preferred” partitions for the given key.
 
 (node1@127.0.0.1) Preflist = riak_core_ring:preflist(DocIdx, Ring).
 [{1096126227998177188652763624537212264741949407232, 'dev1@127.0.0.1'},
 {1278813932664540053428224228626747642198940975104, 'dev2@127.0.0.1'},
 {0, 

about hash calculation in reference page at docs.basho.com

2013-01-23 Thread Donald Yan
Hi there,

I'm reading this page at
http://docs.basho.com/riak/latest/references/appendices/concepts/Replication/#Understanding-replication-by-example
and
have a question about the example given in the page.  I quote it here:


The DocIdx hash is a 160 bit integer:

(dev1@127.0.0.1)5 I:160/integer = DocIdx.
183,28,67,173,80,128,26,94,190,198,65,15,27,243,135,127,121,101,255,96
(dev1@127.0.0.1)6 I.1045375627425331784151332358177649483819648417632

The node looks up the hashed key in the ring which returns a list of
“preferred” partitions for the given key.

(node1@127.0.0.1) Preflist = riak_core_ring:preflist(DocIdx, Ring).
[{1096126227998177188652763624537212264741949407232, 'dev1@127.0.0.1'},
{1278813932664540053428224228626747642198940975104, 'dev2@127.0.0.1'},
{0, 'dev1@127.0.0.1'},
{182687704666362864775460604089535377456991567872, 'dev2@127.0.0.1'},
{365375409332725729550921208179070754913983135744, 'dev3@127.0.0.1'},
{548063113999088594326381812268606132370974703616, 'dev1@127.0.0.1'},
{730750818665451459101842416358141509827966271488, 'dev2@127.0.0.1'},
{913438523331814323877303020447676887284957839360, 'dev3@127.0.0.1'}]




I question is since the key is  calculated as 
1045375627425331784151332358177649483819648417632, shouldn't the prefer
list be like the following instead?

[{913438523331814323877303020447676887284957839360, 'dev3@127.0.0.1'},
{1096126227998177188652763624537212264741949407232, 'dev1@127.0.0.1'},
{1278813932664540053428224228626747642198940975104, 'dev2@127.0.0.1'},
{0, 'dev1@127.0.0.1'},
{182687704666362864775460604089535377456991567872, 'dev2@127.0.0.1'},
{365375409332725729550921208179070754913983135744, 'dev3@127.0.0.1'},
{548063113999088594326381812268606132370974703616, 'dev1@127.0.0.1'},
{730750818665451459101842416358141509827966271488, 'dev2@127.0.0.1'}]

because value 1045375627425331784151332358177649483819648417632 falls
between 913438523331814323877303020447676887284957839360 (partition 5)
and 1096126227998177188652763624537212264741949407232 (partition 6)?

Please Riak experts help to clear my confusion here.  Why partition 6, 7, 0
are on the top 3 of the list?

Appreciate in advance.


-donald
___
riak-users mailing list
riak-users@lists.basho.com
http://lists.basho.com/mailman/listinfo/riak-users_lists.basho.com