Hello all,
I have recently proposed adding a git hook
(https://gem5-review.googlesource.com/c/public/gem5/+/21739) to make sure that
commit messages follow the guidelines in CONTRIBUTING, with the ultimate goal
being to make sure that the gem5 tags used are valid. I have also checked
whether previous commits would have passed if this hook had been applied, and
only a tiny percentage of them would.
Although typos were frequent, the main issue was that the tags in MAINTAINERS
do not correspond to reality, and many other tags were being used instead.
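To make the kind of check concrete, here is a minimal sketch of validating the
tags in a commit subject. It is not the code in the change linked above; the
function name check_subject, the "tag1,tag2: summary" subject layout, and the
exact-match comparison are assumptions made for illustration.

```python
def check_subject(subject, valid_tags):
    """Return (ok, bad_tags) for a commit subject line.

    Assumes (hypothetically) that a subject looks like 'tag1,tag2: summary'
    and that tags must match the valid set exactly.
    """
    prefix, sep, _ = subject.partition(":")
    if not sep:
        # No colon at all, so there is no tag prefix to validate.
        return (False, [])
    tags = [tag.strip() for tag in prefix.split(",")]
    bad = [tag for tag in tags if tag not in valid_tags]
    return (not bad, bad)


if __name__ == "__main__":
    valid = {"mem", "cpu", "arch-arm"}
    print(check_subject("mem,cpu: Fix a corner case", valid))
    print(check_subject("arm: Fix a corner case", valid))
```

A subject tagged 'arm' would be rejected here because only 'arch-arm' is in the
valid set, which is exactly the kind of mismatch described above.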
I ran a quick count over all tags used since around 2013 (the last 5000
commits) to see which tags have been used and how often. I manually removed
some noise (fixed simple tag typos and ignored obvious outliers, which were
less than 0.1% of the tags), and all tags were treated as lowercase. This is
the result:
{'mem': 575, 'systemc': 466, 'cpu': 368, 'arm': 347, 'ruby': 327,
'arch-arm': 251, 'sim': 214, 'stats': 194, 'mem-cache': 193, 'config': 193,
'dev': 192, 'base': 187, 'x86': 172, 'tests': 144, 'scons': 142, 'dev-arm':
127, 'misc': 113, 'arch': 93, 'kvm': 79, 'python': 73, 'util': 70, 'configs':
67, 'mem-ruby': 63, 'syscall_emul': 53, 'sim-se': 52, 'style': 48, 'ext': 48,
'gpu-compute': 46, 'arch-riscv': 43, 'riscv': 35, 'sparc': 32, 'power': 26,
'cpu-o3': 23, 'alpha': 21, 'arch-x86': 19, 'mips': 18, 'slicc': 17, 'hsail':
16, 'regressions': 13, 'system-arm': 12, 'o3': 11, 'learning_gem5': 11,
'syscall-emul': 10, 'fastmodel': 10, 'isa': 8, '': 7, 'ps2': 7, 'kern': 7,
'test': 6, 'dist': 5, 'tlm': 5, 'pwr': 5, 'cache': 4, 'energy': 4, 'gpu': 4,
'probe': 3, 'virtio': 3, 'arch-mips': 3, 'cpu/o3': 3, 'net': 3, 'cpu-minor': 3,
'syscall emulation': 3, 'revert "cpu': 3, 'arch-power': 3, 'null': 3, 'build':
3, 'copyright': 2, 'swig': 2, 'arch-sparc': 2, 'x86 regressions': 2,
'cpu-tester': 2, 'loader': 2, 'arch-hsail': 2, 'cpuid': 2, 'minor': 2, 'o3
cpu': 2, 'syscall': 2, 'proto': 2, 'vnc': 2, 'hsail-x86': 2, 'arch-alpha': 1,
'base simple cpu': 1, 'testlib': 1, 'gdb': 1, 'pci': 1, 'docs': 1, 'tarch': 1,
'x86 cpuid': 1, 'testers': 1, 'debug': 1, 'util/regress': 1, 'branch
predictor': 1, 'devices': 1, 'arch/x86': 1, 'kmi': 1, 'x86 isa': 1, 'cpu.
arch': 1, 'sym': 1, 'rcs scripts': 1, 'regress': 1, 'ide': 1, 'cache recorder':
1, 'o3 iew': 1, 'unittest': 1, 'pseudo inst': 1, 'build opts': 1, 'pl011': 1,
'hmc': 1, 'decoder': 1, 'learning-gem5': 1, 'o3cpu': 1, 'sconstruct': 1,
'cpu-kvm': 1, 'ruby sequencer': 1, 'ext lib': 1, 'mem-garnet': 1,
'arch-generic': 1, 'options': 1}
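For anyone who wants to reproduce a count like the one above, here is a rough
sketch, not the script I actually used. It assumes the subject lines have
already been collected (for example with `git log --format=%s -n 5000`) and
that tags are the comma-separated prefix before the first colon; count_tags is
a hypothetical name.

```python
from collections import Counter


def count_tags(subjects):
    """Count lowercase tags found before the first ':' of each subject."""
    counts = Counter()
    for subject in subjects:
        prefix, sep, _ = subject.partition(":")
        if not sep:
            continue  # no colon, so no tag prefix to count
        for tag in prefix.split(","):
            # Lowercase everything, as in the analysis above.
            counts[tag.strip().lower()] += 1
    return counts


if __name__ == "__main__":
    sample = ["mem,cpu: Fix a thing", "MEM: Another fix", "No tag here"]
    print(count_tags(sample))
```

Splitting on the first colon without further cleanup also explains entries
such as '' and 'revert "cpu' in the raw result.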
Keeping only the tags currently listed in MAINTAINERS (not including dev-arm,
which shall be added soon), this is the list:
{'mem': 575, 'cpu': 367, 'arch-arm': 251, 'sim': 214, 'stats': 194,
'mem-cache': 193, 'dev': 192, 'base': 187, 'tests': 144, 'scons': 142, 'misc':
113, 'arch': 93, 'python': 73, 'util': 70, 'configs': 67, 'mem-ruby': 63,
'sim-se': 51, 'ext': 48, 'gpu-compute': 46, 'arch-riscv': 43, 'cpu-o3': 23,
'arch-x86': 19, 'system-arm': 12, 'arch-mips': 3, 'arch-power': 3, 'cpu-minor':
3, 'arch-hsail': 2, 'arch-sparc': 2, 'arch-alpha': 1, 'cpu-kvm': 1,
'mem-garnet': 1, 'learning-gem5': 1}
Therefore these gem5 tags have not been used in a long time:
{'dev-virtio', 'system-alpha', 'sim-power', 'system', 'cpu-simple'}
'dev-virtio' has been used as 'virtio' (3)
'system-alpha' has been used as 'alpha' (21) and 'arch-alpha' (1)
'power' (26) has been used for both 'sim-power' and 'arch-power', which are
completely different
'system' and 'cpu-simple' seem to have been substituted by other tags
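The comparison against MAINTAINERS can be sketched as follows. This is only an
illustration under the assumption that the valid tags are available as a plain
set; split_by_validity is a hypothetical helper, and the sample data is a small
subset of the numbers above.

```python
def split_by_validity(counts, valid_tags):
    """Return (counts restricted to valid tags, valid tags never used)."""
    used_valid = {tag: n for tag, n in counts.items() if tag in valid_tags}
    unused = set(valid_tags) - set(counts)
    return used_valid, unused


if __name__ == "__main__":
    # Small subset of the counts above, for illustration only.
    counts = {"mem": 575, "virtio": 3, "alpha": 21, "arch-alpha": 1}
    valid = {"mem", "dev-virtio", "system-alpha", "arch-alpha"}
    used_valid, unused = split_by_validity(counts, valid)
    print(used_valid)
    print(sorted(unused))
```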
=============================================
Further analysis:
Except for arm and riscv, where the results were roughly tied, there seems to
be a preference for omitting "arch-" in ISA-specific commits, so I think these
tags should all be replaced, i.e., 'arch-x86' becomes 'x86', 'arch-hsail'
becomes 'hsail' and so on. That said, I also agree with the counterargument
that these tags reflect the path to the dir, and therefore 'arch-' should stay.
The same applies to "mem-ruby", which is far more often written as "ruby".
Maybe 'learning-gem5' should be moved to 'misc' due to its low frequency.
Also, move 'mem-garnet' to 'misc': although it belongs to 'mem', I do not
know if the maintainer is familiar with it.
The usage of 'misc' has been fair, overall, although sometimes a new tag
'style' has been used instead, and other times other tags should have been
applied instead.
'dev-virtio' is very rare, so it could be incorporated into 'dev'.
'cpu-minor' is very rare, so it could be incorporated into 'cpu'.
There is no maintainer for 'cpu-o3'. Does it really need to be separate from
'cpu'?
I do not know what to think of the 'system', 'system-alpha', and 'system-arm'
tags.
=============================================
Given all that information, I propose remodeling the list of tags. This is what
I am thinking:
[alpha, arch, arm, base, configs, cpu, cpu-o3, dev, dev-arm, ext, gpu-compute,
hsail, kvm, learning-gem5, mem, mem-cache, mips, misc, power, python, riscv,
ruby, scons, sim, sim-se, sim-power, slicc, sparc, stats, system, system-arm,
systemc, tests, util, x86]
Of course, this is mostly from a statistical point of view, so it may not
reflect the knowledge or the wishes of the maintainers, so I ask you: what do
you think?
Regards,
Daniel