Hi Gabe,
I have an experimental patch that adds support for HDF5. It's pretty
efficient for stat traces and has good Python support. The file format
conceptually behaves like a file system with groups (~directories) and
datasets (~files). The format itself supports efficient lookups and is
designed to scale to very large datasets.
Datasets are N-dimensional matrices. In gem5's HDF5 backend, I use one
dimension for time, so the dimensionality becomes n+1, where n is the
dimensionality of the original stat (e.g., a scalar stat is stored as a
vector).
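To make the group/dataset layout and the extra time dimension concrete, here is a minimal sketch using h5py. It is not the code from the patch: the group path, the stat names (ipc, hist), and the chunk size are made up for illustration, and the file lives in memory only.

```python
import h5py
import numpy as np

# In-memory HDF5 file (core driver, nothing is written to disk).
with h5py.File("stats_sketch.h5", "w", driver="core",
               backing_store=False) as f:
    # Groups behave like directories. This path is a hypothetical example.
    grp = f.require_group("system/cpu")

    # A scalar stat (n = 0) becomes a 1-D dataset that grows along time.
    ipc = grp.create_dataset("ipc", shape=(0,), maxshape=(None,),
                             dtype="f8", chunks=(64,))

    # A vector stat with 4 entries (n = 1) becomes a 2-D dataset: (time, 4).
    hist = grp.create_dataset("hist", shape=(0, 4), maxshape=(None, 4),
                              dtype="f8", chunks=(64, 4))

    def append_dump(dset, value):
        """Append one stat dump as a new row along axis 0 (time)."""
        n = dset.shape[0]
        dset.resize(n + 1, axis=0)
        dset[n] = value

    for t in range(3):
        append_dump(ipc, 1.0 + t)
        append_dump(hist, np.arange(4) * t)

    ipc_shape = ipc.shape
    hist_shape = hist.shape
    last_hist = hist[hist.shape[0] - 1].tolist()
```

After three dumps, the scalar stat has shape (3,) and the vector stat has shape (3, 4), so a time series for any stat is a single slice along axis 0.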
Last time I tested my HDF5 backend, I got the following sizes:
100 dumps: 20.7 KiB / dump
1000 dumps: 6.5 KiB / dump
10000 dumps: 4.5 KiB / dump
As you can see, the startup cost is pretty high, but in terms of file
size you quickly outperform the default text format.
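The per-dump numbers above can be turned into total file sizes and into the incremental cost of each additional dump. This is just arithmetic on the three measurements quoted here. Reading the first difference as "startup cost" is an assumption, not something that was measured separately.

```python
# Per-dump sizes quoted above: {number of dumps: KiB per dump}.
per_dump_kib = {100: 20.7, 1000: 6.5, 10000: 4.5}

# Total file size implied by each measurement, in KiB.
total_kib = {n: n * size for n, size in per_dump_kib.items()}

# Incremental cost per additional dump between consecutive measurements.
runs = sorted(total_kib)
incremental = {}
for lo, hi in zip(runs, runs[1:]):
    incremental[(lo, hi)] = (total_kib[hi] - total_kib[lo]) / (hi - lo)

for n in runs:
    print(f"{n:>6} dumps: {total_kib[n]:9.1f} KiB total")
for (lo, hi), cost in incremental.items():
    print(f"{lo} -> {hi} dumps: {cost:.2f} KiB per extra dump")
```

The first 100 dumps take about 2 MiB in total, while each later dump adds roughly 4 to 5 KiB, which is why the per-dump average falls so quickly as the number of dumps grows.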
The backend has the following known limitations:
* Bulky for single stat dumps
* Slower than text
* Doesn't support multiple concurrent writers (i.e., you can't fork and
write to the same file in the parent and child)
* SimObject structure inferred from stat names
Some of the performance issues are likely caused by gem5 not traversing
the stat hierarchy in a predictable way. If we were to do a depth-first
traversal, we could potentially decrease the cost of lookups in the HDF5
file by reusing group pointers.
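The potential saving from a predictable traversal order can be illustrated with a small pure-Python model. It is a sketch under assumptions, not the backend's code: the stat names are hypothetical, the group path is taken to be everything before the last dot in the name, and a "lookup" means opening one group that was not already open for the previous stat.

```python
def split_stat(name):
    """Split a dotted stat name into (group path components, dataset name)."""
    *groups, leaf = name.split(".")
    return tuple(groups), leaf


def count_group_lookups(names, reuse=True):
    """Count group lookups needed to visit every stat in the given order.

    With reuse=True, groups opened for the previous stat stay open, so only
    the path components that differ from the previous path are looked up.
    """
    lookups = 0
    prev = ()
    for name in names:
        groups, _ = split_stat(name)
        if not reuse:
            lookups += len(groups)
            continue
        common = 0
        for a, b in zip(prev, groups):
            if a != b:
                break
            common += 1
        lookups += len(groups) - common
        prev = groups
    return lookups


# Hypothetical stat names, listed in an interleaved order.
stats = [
    "system.cpu.ipc",
    "system.mem.reads",
    "system.cpu.cpi",
    "system.mem.writes",
    "system.cpu.icache.misses",
]

# Sorting by path components visits each subtree contiguously,
# which is the order a depth-first traversal would produce.
ordered = sorted(stats, key=lambda s: s.split("."))

no_reuse = count_group_lookups(stats, reuse=False)
interleaved = count_group_lookups(stats, reuse=True)
depth_first = count_group_lookups(ordered, reuse=True)
```

For these five names the model counts 11 lookups without reuse, 7 with reuse in the interleaved order, and 4 with reuse in depth-first order. How much that matters in practice depends on the real cost of a group lookup in the HDF5 library, which this model does not capture.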
Cheers,
Andreas
On 16/09/2017 00:25, Jason Lowe-Power wrote:
Hi Gabe,
There's currently no functionality for that. However, I believe there have
been some efforts in the past to get something like this to work, though
none of them were committed.
I've heard that there may be some people working on better stats support.
So, if you're thinking about improving the stats I would make sure that
there aren't duplicate efforts :). Personally, I would *love* to see
something like database support. Pandas (http://pandas.pydata.org/) would
be even cooler, IMO.
Cheers,
Jason
On Fri, Sep 15, 2017 at 3:42 PM Gabe Black <[email protected]> wrote:
Hi folks. This may be documented somewhere already, but is there a way to
collect stats to a database rather than to a text file? That would be
helpful when collecting stats periodically to get a graph over time, which
tends to produce a lot of output that needs to be processed before it's
useful.
Gabe
_______________________________________________
gem5-dev mailing list
[email protected]
http://m5sim.org/mailman/listinfo/gem5-dev