Hi Sage,
       I have created a new pull request 
(https://github.com/ceph/ceph/pull/3386) , which changed the crush_ln to give 
more 32 digits, thus we don't need to do left shift now.
        You win the bet :), the distribution is much better now,  originally 
the std_dev is 88~121, but now with 32 more digits, we can reduce it to 
12.4895(even lower than expected?????).

        Also note that 8 more digits and 32 more digits show almost the same 
result.

For 32 more digits:

expect                  169.664
osd     weight  count   adjusted
0       1       179     179
1       1.75    291     166
2       3.062   507     165
3       5.359   937     174
4       9.379   1581    168
5       16.41   2771    168
6       28.72   4847    168
7       50.27   8690    172
8       87.96   14924   169
9       153.9   26069   169
10      269.4   45706   169
11      471.4   79754   169
12      825     139847  169
13      1444    245707  170
14      2527    428190  169
std dev 12.4895
     vs 12.6005 (expected)

For 8 more digits

expect                  169.664
osd     weight  count   adjusted
0       1       179     179
1       1.75    290     165
2       3.062   507     165
3       5.359   936     174
4       9.379   1580    168
5       16.41   2771    168
6       28.72   4846    168
7       50.27   8690    172
8       87.96   14923   169
9       153.9   26068   169
10      269.4   45706   169
11      471.4   79754   169
12      825     139847  169
13      1444    245710  170
14      2527    428193  169
std dev 12.5935
     vs 12.5983 (expected).



                                                                                
                Xiaoxi


-----Original Message-----
From: Sage Weil [mailto:[email protected]] 
Sent: Friday, January 16, 2015 10:22 AM
To: Chen, Xiaoxi; Blinick, Stephen L
Cc: [email protected]
Subject: straw2 and ln calculation

Hi everyone,

I took at look at the optimized straw2 code today.  There was one typo
(s/i/u/) and it looks like it works as well as the previous version.  I've 
pushed the latest code and some additional tests to the wip-crush-straw2 branch.

I went on to do some tests of the variance in the resulting distribution when 
the weights are all identical vs somewhat variable vs very skewed. I found that 
for a more 'typical' distribution of weights (e.g, .5 - 3) the std deviation 
stays flat.  However, for very large skews, there are interesting effects that 
emerge based on very small differences in the ln value.  For example, here we 
adjust the table value into a negative integer in mapper.c:

                ln = crush_ln(u) - 0x10000;

If I adjust that 0x10000 value only slightly, and generate a distribution 
across a set of scaled weights (e.g., 1, 2, 4, 8, 16, 32), I will see the low 
weights either more heavily or less heavily weighted than they should be.  
0x10000 skews them a bit low, 0xffff skews them a bit high.  e.g., for 0x10000 
I get

expect                  169.664
osd     weight  count   adjusted
0       1       102     102    <-- here
1       1.75    212     121
2       3.062   442     144
3       5.359   858     160
4       9.379   1483    158
5       16.41   2673    162
6       28.72   4756    165
7       50.27   8620    171
8       87.96   14851   168
9       153.9   26019   169
10      269.4   45674   169
11      471.4   79768   169
12      825     139949  169
13      1444    245958  170
14      2527    428635  169
std dev 88.6993
     vs 12.1484 (expected)

While for 0xffff I get:

expect                  169.664
osd     weight  count   adjusted
0       1       272     272
1       1.75    395     225
2       3.062   596     194
3       5.359   1009    188
4       9.379   1665    177
5       16.41   2869    174
6       28.72   4936    171
7       50.27   8769    174
8       87.96   14981   170
9       153.9   26102   169
10      269.4   45729   169
11      471.4   79737   169
12      825     139719  169
13      1444    245486  170
14      2527    427735  169
std dev 121.243
     vs 13.1205 (expected)

This could be written off to chance, but I see the same effect amplified for 
0xfffe and 0x10001.

I suppose if we have to choose I think skewing high makes more sense since it 
means the high weight items won't skew high and overfill (although in practice 
that doesn't seem to be happening.. it's only on the low end that things get 
wonky.. the high end looks quite good).

In any case, two things:

1) Maybe someone can check my math on the stddev calculation.  I'm scaling the 
actual placement count by the weight (adj = count/weight) and then doing the 
stddev of the adjusted values.  But the expected value is always about 1/3 or 
1/4 of that.  I can't tell if that's because the hash is weak or because I'm 
doing something wrong with the calculation.  Notably, if I plug in rand() with 
equal weights for a sanity check, I still get a stddev of 953 vs expected 
249... about what I see from straw2.  Strangely when I use straw I get a bit 
better than taht, 476.  Am I crazy?  See:

 ./unittest_crush_wrapper --gtest_filter=CrushWrapper.straw2

2) Given how sensitive the ln value is to 0xffff vs 0x10000, I'm highly 
suspicious that slightly better precision will help us out.  Notably, we do

                ln = crush_ln(u) - 0x10000;

                /*
                 * divide by 16.16 fixed-point weight
                 */
                draw = (ln << 32) / bucket->item_weights[i];

...so some extra bits of precision (instead of having the lower 32 zeroed) 
would probably help a lot.

Can crush_ln() be modifed to provide more bits of precision without increasing 
the size of the tables?  I don't really understand the math it's based on, but 
it looks like the same curve is being added in with different precision or 
something.. and a lot of bits in LH are being thrown out here

    LH >>= (48-12);

If there were 8 more bits returned from crush_ln and we shifted by 8 fewer bits 
in bucket_straw2_choose, for example, I bet we could eliminate some of the 
low-weight effect I'm seeing.

To see what I'm seeing, you can run

 ./unittest_crush_wrapper --gtest_filter=CrushWrapper.straw2_stddev

Any input would be helpful, thanks!
sage

--
To unsubscribe from this list: send the line "unsubscribe ceph-devel" in
the body of a message to [email protected]
More majordomo info at  http://vger.kernel.org/majordomo-info.html

Reply via email to