I've done some more testing of find_applet_by_name.  The current code has
two tweakable parameters.  KNOWN_APPNAME_OFFSETS is the number of known
offsets stored in the applet_nameofs array (plus one, because we don't
store the zero offset).  This defaults to 8, which is too large if only a
few applets configured.  This test in the code:

   if (KNOWN_APPNAME_OFFSETS > 0 && NUM_APPLETS > 2*KNOWN_APPNAME_OFFSETS) {

reverts to a simple linear search with no stored offsets if there are
fewer than 16 applets.  The hardcoded multiplier 2 is the second thing
that can be tweaked.

I ran tests for various values of NUM_APPLETS, KNOWN_APPNAME_OFFSETS
and the multiplier.  The results are in the attached file test1.
With the default settings this gives:

 applets     cycles    bloat
   13        224.07       0
   28        249.35     106
   48        289.86     106
   72        356.59     106
   96        415.25     106
  133        520.57     109
  172        612.75     109
  214        764.11     109
  281        941.46     109
  364       1162.98     107
  494       1468.60     107

'cycles' is the average number of cycles for a call to find_applet_by_name
as reported by 'valgrind --tool=callgrind'.  The original code using a
binary search averaged 353 cycles per call.  'bloat' is the bloat in
bytes relative to KNOWN_APPNAME_OFFSETS = 0 (so it doesn't include the
savings that result from not storing offsets for all names).

No, BusyBox doesn't have 494 applets (yet):  I invented some fake ones.

With only a few applets configured the bloat will exceed the savings
arising from not storing offsets for all names (which is about 1.75 *
NUM_APPLETS).  This could be reduced by increasing the multiplier.
Reducing the number of stored offsets would also help.

With a large number of applets it might be valuable to increase the
number of stored offsets to reduce the execution time.

After staring at the numbers and the code for a while I came up with
an alternative approach.  The execution time of find_applet_by_name
depends, in large part, on the number of names between successive
known offsets.  Call that the block size.  It's currently
NUM_APPLETS/KNOWN_APPNAME_OFFSETS.

Suppose we make the block size configurable (APPNAME_BLOCK_SIZE) and
derive the number of offsets from that, NUM_APPLETS/APPNAME_BLOCK_SIZE.
Results of tests using this approach are in the attached file test2.
Some highlights:

block size       25              30              35              40

 applets   cycles  bloat   cycles  bloat   cycles  bloat   cycles  bloat
   13      224.07     0    224.07     0    224.07     0    224.07     0
   28      462.83     0    462.83     0    462.83     0    462.83     0
   48      840.88     0    840.88     0    840.88     0    840.88     0
   72      703.86    57    703.86    57    703.86    57   1212.37     0
   96      678.07    90    678.07    90    941.25    57    941.25    57
  133      665.84   114    740.55    93    927.43    91    927.43    91
  172      745.96   118    829.50   114    951.37    93    951.37    93
  214      756.17   109    818.73   122    900.33   118   1028.08   114
  281      798.10   138    863.99   130    949.66   109   1062.82   122
  364      861.78   148    910.64   140   1016.15   132   1060.04   128
  494      937.61   168    970.37   139   1052.08   148   1125.15   140

In general, compared with the fixed number of offsets, this trades
peformance for bytes at low numbers of applets and bytes for performance
at high numbers.  A block size in the low 30s looks like a reasonable
compromise.

Ron
Test results for find_applet_by_name

The columns are:

 1 number of applets (NUM_APPLETS)
 2 number of known offsets (KNOWN_APPNAME_OFFSETS)
 3 multiplier in test NUM_APPLETS > MULTIPLIER*KNOWN_APP_NAME_OFFSETS
 4 number of names that need to be tested (NUM_APPLETS/KNOWN_APPNAME_OFFSETS)
 5 average number of cycles per call
 6 bloat in bytes relative to KNOWN_APPNAME_OFFSETS = 0

13 0 2 13.0 224.07 0
13 0 4 13.0 224.07 0
13 0 8 13.0 224.07 0
13 2 2 6.5 171.94 54
13 2 4 6.5 171.94 54
13 2 8 13.0 224.07 0
13 4 2 3.2 166.08 90
13 4 4 13.0 224.07 0
13 4 8 13.0 224.07 0
13 8 2 13.0 224.07 0
13 8 4 13.0 224.07 0
13 8 8 13.0 224.07 0
13 16 2 13.0 224.07 0
13 16 4 13.0 224.07 0
13 16 8 13.0 224.07 0
13 32 2 13.0 224.07 0
13 32 4 13.0 224.07 0
13 32 8 13.0 224.07 0
28 0 2 28.0 478.79 0
28 0 4 28.0 478.79 0
28 0 8 28.0 478.79 0
28 2 2 14.0 299.62 54
28 2 4 14.0 299.62 54
28 2 8 14.0 299.62 54
28 4 2 7.0 234.66 90
28 4 4 7.0 234.66 90
28 4 8 28.0 478.79 0
28 8 2 3.5 249.35 106
28 8 4 28.0 478.79 0
28 8 8 28.0 478.79 0
28 16 2 28.0 478.79 0
28 16 4 28.0 478.79 0
28 16 8 28.0 478.79 0
28 32 2 28.0 478.79 0
28 32 4 28.0 478.79 0
28 32 8 28.0 478.79 0
48 0 2 48.0 804.88 0
48 0 4 48.0 804.88 0
48 0 8 48.0 804.88 0
48 2 2 24.0 474.84 54
48 2 4 24.0 474.84 54
48 2 8 24.0 474.84 54
48 4 2 12.0 318.94 90
48 4 4 12.0 318.94 90
48 4 8 12.0 318.94 90
48 8 2 6.0 289.86 106
48 8 4 6.0 289.86 106
48 8 8 48.0 804.88 0
48 16 2 3.0 367.29 138
48 16 4 48.0 804.88 0
48 16 8 48.0 804.88 0
48 32 2 48.0 804.88 0
48 32 4 48.0 804.88 0
48 32 8 48.0 804.88 0
72 0 2 72.0 1220.15 0
72 0 4 72.0 1220.15 0
72 0 8 72.0 1220.15 0
72 2 2 36.0 703.29 57
72 2 4 36.0 703.29 57
72 2 8 36.0 703.29 57
72 4 2 18.0 442.22 90
72 4 4 18.0 442.22 90
72 4 8 18.0 442.22 90
72 8 2 9.0 356.59 106
72 8 4 9.0 356.59 106
72 8 8 9.0 356.59 106
72 16 2 4.5 409.25 138
72 16 4 4.5 409.25 138
72 16 8 72.0 1220.15 0
72 32 2 2.2 643.81 202
72 32 4 72.0 1220.15 0
72 32 8 72.0 1220.15 0
96 0 2 96.0 1714.42 0
96 0 4 96.0 1714.42 0
96 0 8 96.0 1714.42 0
96 2 2 48.0 941.25 57
96 2 4 48.0 941.25 57
96 2 8 48.0 941.25 57
96 4 2 24.0 554.67 90
96 4 4 24.0 554.67 90
96 4 8 24.0 554.67 90
96 8 2 12.0 415.25 106
96 8 4 12.0 415.25 106
96 8 8 12.0 415.25 106
96 16 2 6.0 445.23 138
96 16 4 6.0 445.23 138
96 16 8 96.0 1714.42 0
96 32 2 3.0 664.51 202
96 32 4 96.0 1714.42 0
96 32 8 96.0 1714.42 0
133 0 2 133.0 2525.99 0
133 0 4 133.0 2525.99 0
133 0 8 133.0 2525.99 0
133 2 2 66.5 1395.69 57
133 2 4 66.5 1395.69 57
133 2 8 66.5 1395.69 57
133 4 2 33.2 766.14 93
133 4 4 33.2 766.14 93
133 4 8 33.2 766.14 93
133 8 2 16.6 520.57 109
133 8 4 16.6 520.57 109
133 8 8 16.6 520.57 109
133 16 2 8.3 497.34 141
133 16 4 8.3 497.34 141
133 16 8 8.3 497.34 141
133 32 2 4.2 677.36 205
133 32 4 4.2 677.36 205
133 32 8 133.0 2525.99 0
172 0 2 172.0 3231.88 0
172 0 4 172.0 3231.88 0
172 0 8 172.0 3231.88 0
172 2 2 86.0 1765.25 57
172 2 4 86.0 1765.25 57
172 2 8 86.0 1765.25 57
172 4 2 43.0 952.66 93
172 4 4 43.0 952.66 93
172 4 8 43.0 952.66 93
172 8 2 21.5 612.75 109
172 8 4 21.5 612.75 109
172 8 8 21.5 612.75 109
172 16 2 10.8 538.36 141
172 16 4 10.8 538.36 141
172 16 8 10.8 538.36 141
172 32 2 5.4 696.73 205
172 32 4 5.4 696.73 205
172 32 8 172.0 3231.88 0
214 0 2 214.0 4113.62 0
214 0 4 214.0 4113.62 0
214 0 8 214.0 4113.62 0
214 2 2 107.0 2228.89 57
214 2 4 107.0 2228.89 57
214 2 8 107.0 2228.89 57
214 4 2 53.5 1223.72 93
214 4 4 53.5 1223.72 93
214 4 8 53.5 1223.72 93
214 8 2 26.8 764.11 109
214 8 4 26.8 764.11 109
214 8 8 26.8 764.11 109
214 16 2 13.4 610.39 141
214 16 4 13.4 610.39 141
214 16 8 13.4 610.39 141
214 32 2 6.7 749.06 205
214 32 4 6.7 749.06 205
214 32 8 214.0 4113.62 0
281 0 2 281.0 5501.04 0
281 0 4 281.0 5501.04 0
281 0 8 281.0 5501.04 0
281 2 2 140.5 2942.55 60
281 2 4 140.5 2942.55 60
281 2 8 140.5 2942.55 60
281 4 2 70.2 1566.05 93
281 4 4 70.2 1566.05 93
281 4 8 70.2 1566.05 93
281 8 2 35.1 941.46 109
281 8 4 35.1 941.46 109
281 8 8 35.1 941.46 109
281 16 2 17.6 697.67 141
281 16 4 17.6 697.67 141
281 16 8 17.6 697.67 141
281 32 2 8.8 779.29 205
281 32 4 8.8 779.29 205
281 32 8 8.8 779.29 205
364 0 2 364.0 7202.46 0
364 0 4 364.0 7202.46 0
364 0 8 364.0 7202.46 0
364 2 2 182.0 3843.45 58
364 2 4 182.0 3843.45 58
364 2 8 182.0 3843.45 58
364 4 2 91.0 2005.85 91
364 4 4 91.0 2005.85 91
364 4 8 91.0 2005.85 91
364 8 2 45.5 1162.98 107
364 8 4 45.5 1162.98 107
364 8 8 45.5 1162.98 107
364 16 2 22.8 817.09 139
364 16 4 22.8 817.09 139
364 16 8 22.8 817.09 139
364 32 2 11.4 842.67 203
364 32 4 11.4 842.67 203
364 32 8 11.4 842.67 203
494 0 2 494.0 9702.74 0
494 0 4 494.0 9702.74 0
494 0 8 494.0 9702.74 0
494 2 2 247.0 5128.48 58
494 2 4 247.0 5128.48 58
494 2 8 247.0 5128.48 58
494 4 2 123.5 2641.12 91
494 4 4 123.5 2641.12 91
494 4 8 123.5 2641.12 91
494 8 2 61.8 1468.60 107
494 8 4 61.8 1468.60 107
494 8 8 61.8 1468.60 107
494 16 2 30.9 970.34 139
494 16 4 30.9 970.34 139
494 16 8 30.9 970.34 139
494 32 2 15.4 920.82 203
494 32 4 15.4 920.82 203
494 32 8 15.4 920.82 203
Test results for find_applet_by_name

The columns are:

 1 number of applets (NUM_APPLETS)
 2 number of names that need to be tested (APPNAME_BLOCK_SIZE)
 3 number of known offsets (NUM_APPLETS/APPNAME_BLOCK_SIZE)
 4 average number of cycles per call
 5 bloat in bytes relative to known offsets = 0

13 25 0 224.07 0
13 30 0 224.07 0
13 35 0 224.07 0
13 40 0 224.07 0
13 45 0 224.07 0
13 50 0 224.07 0
28 25 0 462.83 0
28 30 0 462.83 0
28 35 0 462.83 0
28 40 0 462.83 0
28 45 0 462.83 0
28 50 0 462.83 0
48 25 0 840.88 0
48 30 0 840.88 0
48 35 0 840.88 0
48 40 0 840.88 0
48 45 0 840.88 0
48 50 0 840.88 0
72 25 2 703.86 57
72 30 2 703.86 57
72 35 2 703.86 57
72 40 0 1212.37 0
72 45 0 1212.37 0
72 50 0 1212.37 0
96 25 3 678.07 90
96 30 3 678.07 90
96 35 2 941.25 57
96 40 2 941.25 57
96 45 2 941.25 57
96 50 0 1714.42 0
133 25 5 665.84 114
133 30 4 740.55 93
133 35 3 927.43 91
133 40 3 927.43 91
133 45 2 1353.95 57
133 50 2 1353.95 57
172 25 6 745.96 118
172 30 5 829.50 114
172 35 4 951.37 93
172 40 4 951.37 93
172 45 3 1225.35 91
172 50 3 1225.35 91
214 25 8 756.17 109
214 30 7 818.73 122
214 35 6 900.33 118
214 40 5 1028.08 114
214 45 4 1232.25 93
214 50 4 1232.25 93
281 25 11 798.10 138
281 30 9 863.99 130
281 35 8 949.66 109
281 40 7 1062.82 122
281 45 6 1170.01 118
281 50 5 1344.27 114
364 25 14 861.78 148
364 30 12 910.64 140
364 35 10 1016.15 132
364 40 9 1060.04 128
364 45 8 1162.96 107
364 50 7 1291.48 120
494 25 19 937.61 168
494 30 16 970.37 139
494 35 14 1052.08 148
494 40 12 1125.15 140
494 45 10 1272.46 132
494 50 9 1339.46 128
_______________________________________________
busybox mailing list
[email protected]
http://lists.busybox.net/mailman/listinfo/busybox

Reply via email to