* speedup possible on resource gradients: Just so we know how much we can save, I ran the same test (256x256 map, 4 AICastor, 60s, CXXFLAGS="-O3 -march=athlon-xp"), but I "hacked" the code so that map.syncStep() is executed twice. This makes two resource gradients get computed instead of one, and the two are not the same, to avoid cache-optimization bias. (He-he, I should not forget not to commit it...)

Simple: cpu usage graph:
100.0 % | *
 98.5 % |
 95.0 % |
 93.5 % |
 90.0 % |
 88.5 % |
 85.0 % |
 83.5 % |
 80.0 % |
 78.5 % |
 75.0 % |
 73.5 % |
 70.0 % |
 68.5 % |
 65.0 % |
 63.5 % |
 60.0 % |
 58.5 % |
 55.0 % |
 53.5 % | *
 50.0 % | *
 48.5 % | *
 45.0 % | **
 43.5 % | ****
 40.0 % | *****
 38.5 % | ****************
 35.0 % | *********
 33.5 % | *****************
 30.0 % | *******************************
 28.5 % | *********
 25.0 % | *
 23.5 % |
 20.0 % |
 18.5 % |
 15.0 % |
 13.5 % |
 10.0 % |
  8.5 % |
  5.0 % |
  3.5 % |
  0.0 % |

Double: cpu usage graph:
100.0 % | *
 98.5 % |
 95.0 % |
 93.5 % |
 90.0 % |
 88.5 % |
 85.0 % |
 83.5 % |
 80.0 % |
 78.5 % |
 75.0 % |
 73.5 % |
 70.0 % |
 68.5 % |
 65.0 % | *
 63.5 % |
 60.0 % | *
 58.5 % | *
 55.0 % | *
 53.5 % | **
 50.0 % | ******
 48.5 % | *************
 45.0 % | *****************
 43.5 % | *********
 40.0 % | *******
 38.5 % | *********
 35.0 % | *********************
 33.5 % | *******
 30.0 % | *
 28.5 % |
 25.0 % |
 23.5 % |
 20.0 % |
 18.5 % |
 15.0 % |
 13.5 % |
 10.0 % |
  8.5 % |
  5.0 % |
  3.5 % |
  0.0 % |

The graphs are a bit difficult to interpret, so I took the 90% quantile. With the simple computation we reach 40% CPU, and with the double computation we reach 50% CPU. We can then fairly expect that the resource gradients take 10% of the CPU on my computer (AMD Athlon XP 3200+, with low-latency but down-clocked RAM and FSB due to a bug).
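
For anyone who wants to repeat the comparison, here is a minimal sketch of how a 90% quantile can be read off a binned CPU-usage histogram. It is not the script used for the numbers above: the function name histogram_quantile and the two toy histograms are made up for illustration, and the rule used (the lowest bin at which the cumulative count reaches 90% of all samples) is only one reasonable definition of the quantile.

```python
def histogram_quantile(bins, q):
    """Return the lowest bin label whose cumulative count reaches fraction q.

    bins: list of (cpu_percent_label, sample_count) pairs, in any order.
    q:    fraction in (0, 1], e.g. 0.9 for the 90% quantile.
    """
    if not 0.0 < q <= 1.0:
        raise ValueError("q must be in (0, 1]")
    ordered = sorted(bins)  # ascending by CPU-usage label
    total = sum(count for _, count in ordered)
    if total == 0:
        raise ValueError("histogram has no samples")
    threshold = q * total
    running = 0
    for label, count in ordered:
        running += count
        if running >= threshold:
            return label
    return ordered[-1][0]  # only reachable through rounding error


# Hypothetical toy data, NOT the measurements from the graphs above.
single_run = [(30.0, 5), (35.0, 3), (40.0, 2)]
double_run = [(40.0, 4), (45.0, 4), (50.0, 2)]

q_single = histogram_quantile(single_run, 0.9)
q_double = histogram_quantile(double_run, 0.9)

# The gap between the two quantiles estimates the cost of the extra
# gradient computation, in percent of CPU.
estimated_cost = q_double - q_single
print(q_single, q_double, estimated_cost)
```

The star counts in the ASCII graphs are probably scaled, so feeding them into a function like this may land one bin away from a value read off by eye.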
