I ran some performance benchmarks with pybench on an ARMv7 board. To
prevent third-party processes from interfering, the board was running
Ubuntu in single-user mode with the stock glibc. I'll run another set
of benchmarks with our glibc rebuilt with the proposed flags, and a
further set on my NSLU2 (XScale/ARMv5) to see what sort of performance
hit we're going to see there.
With our current CFLAGS:
* Round 1 done in 62.165 seconds.
* Round 2 done in 62.229 seconds.
* Round 3 done in 61.994 seconds.
* Round 4 done in 61.616 seconds.
* Round 5 done in 62.371 seconds.
* Round 6 done in 63.191 seconds.
* Round 7 done in 62.180 seconds.
* Round 8 done in 62.165 seconds.
* Round 9 done in 61.906 seconds.
* Round 10 done in 62.977 seconds.
Test minimum average operation overhead
-------------------------------------------------------------------------------
BuiltinFunctionCalls: 1302ms 1317ms 2.58us 3.114ms
BuiltinMethodLookup: 871ms 871ms 0.83us 3.645ms
CompareFloats: 837ms 974ms 0.81us 4.171ms
CompareFloatsIntegers: 963ms 1052ms 1.17us 3.112ms
CompareIntegers: 657ms 659ms 0.37us 6.304ms
CompareInternedStrings: 667ms 670ms 0.45us 16.016ms
CompareLongs: 564ms 566ms 0.54us 3.641ms
CompareStrings: 550ms 556ms 0.56us 10.780ms
CompareUnicode: 548ms 551ms 0.74us 8.163ms
ComplexPythonFunctionCalls: 1699ms 1750ms 8.75us 5.260ms
ConcatStrings: 1017ms 1099ms 2.20us 6.975ms
ConcatUnicode: 4336ms 4720ms 15.73us 4.897ms
CreateInstances: 1454ms 1463ms 13.06us 4.242ms
CreateNewInstances: 1267ms 1283ms 15.28us 3.627ms
CreateStringsWithConcat: 742ms 750ms 0.75us 10.567ms
CreateUnicodeWithConcat: 772ms 778ms 1.94us 4.172ms
DictCreation: 695ms 695ms 1.74us 4.172ms
DictWithFloatKeys: 1081ms 1086ms 1.21us 7.899ms
DictWithIntegerKeys: 767ms 771ms 0.64us 10.566ms
DictWithStringKeys: 668ms 672ms 0.56us 10.567ms
ForLoops: 671ms 672ms 26.88us 0.685ms
IfThenElse: 533ms 534ms 0.40us 7.898ms
ListSlicing: 557ms 563ms 40.19us 0.642ms
NestedForLoops: 764ms 771ms 0.51us 0.247ms
NestedListComprehensions: 1129ms 1148ms 95.63us 1.012ms
NormalClassAttribute: 761ms 771ms 0.64us 5.271ms
NormalInstanceAttribute: 697ms 697ms 0.58us 5.278ms
PythonFunctionCalls: 693ms 698ms 2.11us 3.125ms
PythonMethodCalls: 1568ms 1574ms 7.00us 1.566ms
Recursion: 1037ms 1043ms 20.86us 5.249ms
SecondImport: 1380ms 1382ms 13.82us 2.052ms
SecondPackageImport: 1408ms 1411ms 14.11us 2.052ms
SecondSubmoduleImport: 1600ms 1602ms 16.02us 2.052ms
SimpleComplexArithmetic: 1581ms 1584ms 1.80us 4.171ms
SimpleDictManipulation: 778ms 782ms 0.65us 5.248ms
SimpleFloatArithmetic: 1415ms 1418ms 1.07us 6.303ms
SimpleIntFloatArithmetic: 659ms 660ms 0.50us 6.304ms
SimpleIntegerArithmetic: 659ms 661ms 0.50us 6.305ms
SimpleListComprehensions: 932ms 947ms 78.88us 1.015ms
SimpleListManipulation: 642ms 646ms 0.55us 6.840ms
SimpleLongArithmetic: 621ms 637ms 0.97us 3.111ms
SmallLists: 984ms 1000ms 1.47us 4.172ms
SmallTuples: 1038ms 1043ms 1.93us 4.700ms
SpecialClassAttribute: 754ms 755ms 0.63us 5.272ms
SpecialInstanceAttribute: 818ms 819ms 0.68us 5.277ms
StringMappings: 1426ms 1428ms 5.66us 4.474ms
StringPredicates: 1303ms 1326ms 1.89us 15.920ms
StringSlicing: 809ms 864ms 1.54us 9.299ms
TryExcept: 658ms 660ms 0.29us 7.895ms
TryFinally: 1530ms 1532ms 9.58us 4.263ms
TryRaiseExcept: 1196ms 1203ms 18.79us 4.172ms
TupleSlicing: 728ms 733ms 2.79us 0.424ms
UnicodeMappings: 703ms 705ms 19.59us 3.839ms
UnicodePredicates: 1390ms 1396ms 2.58us 19.107ms
UnicodeProperties: 1723ms 1729ms 4.32us 15.925ms
UnicodeSlicing: 973ms 1467ms 2.99us 8.250ms
WithFinally: 1457ms 1458ms 9.11us 4.259ms
WithRaiseExcept: 1663ms 1681ms 21.02us 5.357ms
-------------------------------------------------------------------------------
Totals: 60691ms 62279ms
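As a cross-check, the mean of the ten round times above should agree
with the average column of the Totals line (62279ms). A minimal Python
sketch, with the round times copied by hand from the output above (the
variable names are mine, not pybench's):

```python
from statistics import mean

# Round times in seconds, copied from the "current CFLAGS" run above.
current_rounds = [62.165, 62.229, 61.994, 61.616, 62.371,
                  63.191, 62.180, 62.165, 61.906, 62.977]

# Arithmetic mean over the ten rounds.
current_mean = mean(current_rounds)
print(f"Average of {current_mean:.3f} seconds per test run")
```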
With the proposed CFLAGS:
* Round 1 done in 60.513 seconds.
* Round 2 done in 60.353 seconds.
* Round 3 done in 61.784 seconds.
* Round 4 done in 60.537 seconds.
* Round 5 done in 60.090 seconds.
* Round 6 done in 59.704 seconds.
* Round 7 done in 60.323 seconds.
* Round 8 done in 60.244 seconds.
* Round 9 done in 60.026 seconds.
* Round 10 done in 58.853 seconds.
Average of 60.243 seconds per test run
Test minimum average operation overhead
-------------------------------------------------------------------------------
BuiltinFunctionCalls: 1234ms 1242ms 2.43us 1.986ms
BuiltinMethodLookup: 846ms 867ms 0.83us 2.322ms
CompareFloats: 918ms 1066ms 0.89us 2.654ms
CompareFloatsIntegers: 876ms 974ms 1.08us 1.986ms
CompareIntegers: 681ms 682ms 0.38us 3.988ms
CompareInternedStrings: 694ms 694ms 0.46us 10.148ms
CompareLongs: 548ms 548ms 0.52us 2.320ms
CompareStrings: 564ms 564ms 0.56us 6.895ms
CompareUnicode: 562ms 562ms 0.75us 5.259ms
ComplexPythonFunctionCalls: 1632ms 1710ms 8.55us 3.338ms
ConcatStrings: 960ms 1099ms 2.20us 5.021ms
ConcatUnicode: 4146ms 4732ms 15.77us 3.836ms
CreateInstances: 1411ms 1433ms 12.79us 2.719ms
CreateNewInstances: 1296ms 1314ms 15.65us 2.496ms
CreateStringsWithConcat: 760ms 763ms 0.76us 6.674ms
CreateUnicodeWithConcat: 728ms 751ms 1.88us 2.655ms
DictCreation: 644ms 645ms 1.61us 2.653ms
DictWithFloatKeys: 914ms 916ms 1.02us 4.999ms
DictWithIntegerKeys: 723ms 723ms 0.60us 6.674ms
DictWithStringKeys: 690ms 690ms 0.57us 6.676ms
ForLoops: 687ms 687ms 27.48us 0.460ms
IfThenElse: 553ms 553ms 0.41us 4.999ms
ListSlicing: 558ms 560ms 40.00us 0.653ms
NestedForLoops: 780ms 781ms 0.52us 0.165ms
NestedListComprehensions: 987ms 1044ms 87.01us 0.669ms
NormalClassAttribute: 779ms 780ms 0.65us 3.342ms
NormalInstanceAttribute: 719ms 721ms 0.60us 3.350ms
PythonFunctionCalls: 713ms 715ms 2.17us 1.998ms
PythonMethodCalls: 1509ms 1523ms 6.77us 1.025ms
Recursion: 944ms 945ms 18.90us 3.322ms
SecondImport: 1343ms 1346ms 13.46us 1.342ms
SecondPackageImport: 1352ms 1355ms 13.55us 1.375ms
SecondSubmoduleImport: 1589ms 1594ms 15.94us 1.373ms
SimpleComplexArithmetic: 909ms 914ms 1.04us 2.764ms
SimpleDictManipulation: 712ms 714ms 0.59us 3.474ms
SimpleFloatArithmetic: 961ms 966ms 0.73us 4.156ms
SimpleIntFloatArithmetic: 527ms 527ms 0.40us 4.162ms
SimpleIntegerArithmetic: 527ms 527ms 0.40us 4.165ms
SimpleListComprehensions: 874ms 887ms 73.88us 0.701ms
SimpleListManipulation: 588ms 589ms 0.50us 4.509ms
SimpleLongArithmetic: 610ms 611ms 0.93us 2.068ms
SmallLists: 962ms 968ms 1.42us 2.768ms
SmallTuples: 1024ms 1039ms 1.92us 3.112ms
SpecialClassAttribute: 775ms 776ms 0.65us 3.344ms
SpecialInstanceAttribute: 893ms 899ms 0.75us 3.349ms
StringMappings: 1379ms 1396ms 5.54us 3.128ms
StringPredicates: 2549ms 2549ms 3.64us 12.080ms
StringSlicing: 743ms 846ms 1.51us 6.583ms
TryExcept: 728ms 728ms 0.32us 5.211ms
TryFinally: 1459ms 1461ms 9.13us 2.770ms
TryRaiseExcept: 1217ms 1235ms 19.29us 2.763ms
TupleSlicing: 650ms 679ms 2.59us 0.449ms
UnicodeMappings: 689ms 692ms 19.22us 4.300ms
UnicodePredicates: 1027ms 1028ms 1.90us 14.498ms
UnicodeProperties: 1294ms 1305ms 3.26us 12.075ms
UnicodeSlicing: 928ms 1316ms 2.69us 5.849ms
WithFinally: 1377ms 1379ms 8.62us 2.769ms
WithRaiseExcept: 1626ms 1631ms 20.39us 3.467ms
That being said, the initial results, while an improvement, are not very
impressive, and I suspect we'll see a reduction in performance on the
NSLU2, since the code would be tuned for features its core doesn't have.
I'll post more results once I have rebuilt glibc.
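To put a rough number on the improvement, the two averages can be
compared directly. The sketch below is only illustrative: the helper
name percent_change and the choice of sample tests are mine, and the
figures are copied by hand from the two runs above (average run time,
and the minimum column for the individual tests).

```python
# Average run times in seconds, taken from the two pybench runs above.
current_avg = 62.279   # "Totals" average (62279ms) with the current CFLAGS
proposed_avg = 60.243  # "Average of 60.243 seconds per test run"


def percent_change(before, after):
    """Relative change from before to after, in percent (negative = faster)."""
    return (after - before) / before * 100.0


overall = percent_change(current_avg, proposed_avg)
print(f"overall: {overall:+.1f}%")

# Minimum times in ms (current, proposed) for a few hand-picked tests.
samples = {
    "SimpleComplexArithmetic": (1581, 909),
    "SimpleFloatArithmetic": (1415, 961),
    "StringPredicates": (1303, 2549),
}
changes = {name: percent_change(b, a) for name, (b, a) in samples.items()}
for name, change in changes.items():
    print(f"{name}: {change:+.1f}%")
```

On these figures the overall gain is only about 3%, even though some
individual tests move much more than that in either direction.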
--
armel gcc default optimisations
https://bugs.launchpad.net/bugs/303232