By self-optimizing, I mean:

  1. You have N locations in your code where two different implementations are 
possible (it could be more than two, but I'm trying to keep this simple).
  2. The global performance of your code/module depends on the _combination_ of 
choices made at those N locations (and could be platform-specific too).
  3. N is much bigger than 1, say 10, which gives 2^10 = 1024 combinations to 
try; too many to do manually.
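
To make the setup concrete, here is a minimal Nim sketch of one such location, assuming the combination is stored as a bit-vector in a constant. The names `flags`, `flagSet` and `double` are hypothetical and only serve the illustration; the two implementations are deliberately trivial.

```nim
const
  numFlags = 10           # N, as in the example above
  flags = 0b0000000101    # hypothetical combination: flags 0 and 2 are set

# True if bit i of the combination is set.
proc flagSet(bits, i: int): bool =
  ((bits shr i) and 1) == 1

# Location 0: one of two interchangeable implementations is chosen
# at compile time, so the unused one is never compiled in.
when flagSet(flags, 0):
  proc double(x: int): int = x shl 1
else:
  proc double(x: int): int = x + x

# Number of distinct combinations of N two-way choices.
const numCombinations = 1 shl numFlags

echo double(21), " ", numCombinations
```

Each additional location would get its own `when flagSet(flags, i)` branch, which is why the whole combination fits in a single small integer.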



So what you could do is:

  1. Create a benchmark that evaluates the performance of your code.
  2. Create a batch file that will re-compile your code and run the benchmark 
in a loop, until some stop condition is detected ("stop file" present?)
  3. Create a file that will contain the set of flag values, together with 
benchmark results (initially empty).
  4. Read the file using staticRead() and put the result in "const"s, so that 
it can be used in "when" (I _assume_ this is possible).
  5. When the file is empty, assume all flags are false.
  6. When the file is not empty, read the last line and "binary increment" the 
bit-vector (presumably represented as a uint16, since you have 10 flags).
  7. At the end of the benchmark, add a line with the current flags and result.
  8. When the last combination has been tried, create the stop condition ("stop 
file").
  9. Start the batch file, and go take a long break (or let it run on someone 
else's computer)
  10. Check the result file to find out the best combination of flags.
  11. Either hard-code this combination, or put it in a separate file per 
compilation target...
  12. Done.
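
Steps 4 to 6 and 8 could look roughly like the Nim sketch below. It is only a sketch under stated assumptions: the line format ("flags as decimal, then the benchmark result"), the file name and the helpers `nextFlags` and `allTried` are all hypothetical. To keep the example self-contained, a string literal stands in for the file contents.

```nim
import std/strutils

const numFlags = 10  # N, as in the description

# Return the next combination to try, given the results file's contents.
# Assumed (hypothetical) line format: "<flags as decimal> <benchmark result>".
# An empty file means no run has happened yet, so all flags are false (0).
proc nextFlags(contents: string): int =
  var last = ""
  for line in contents.splitLines():
    if line.strip().len > 0:
      last = line              # remember the last non-empty line
  if last.len == 0:
    result = 0
  else:
    result = parseInt(last.splitWhitespace()[0]) + 1  # "binary increment"

# True once every combination 0 ..< 2^numFlags has been tried.
proc allTried(next: int): bool =
  next >= (1 shl numFlags)

# Stand-in for the file contents. In the real setup this would be
# staticRead("results.txt") (hypothetical file name), which is what makes
# `flags` a compile-time constant usable in `when`.
const sampleContents = "0 12.5\n1 11.9\n"
const flags = nextFlags(sampleContents)

when allTried(flags):
  echo "all combinations tried; create the stop file"
else:
  echo "benchmarking combination ", flags
```

Because `nextFlags` runs at compile time, every pass of the batch loop recompiles the program with a different `flags` value; the program itself only has to append its own line to the results file when the benchmark finishes.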



I suspect this general idea might have come up here before. Maybe someone has 
already coded something like this?
