> On Jan. 25, 2013, 4:13 p.m., Nathan Binkert wrote:
> > Seems like overkill to me.  If you do this, then you can't do any math 
> > using SQL and you have to suck out values to do anything.  If that's the 
> > attitude, why even bother using sqlite at all?
> 
> Ali Saidi wrote:
>     You can't do math in sql, but that probably wasn't what you wanted to do 
> anyway. You probably want to suck the data back in the python class hierarchy 
> and manipulate it there. I think the ideal situation would be to pickle the 
> objects and not use sql, however that was much slower. The slowest (and 
> largest) was having a sql table of stat,x,y,value columns which meant reading 
> a large array took forever.
> 
> Nathan Binkert wrote:
>     Interesting.  When I was doing tons of sampling, doing the math in SQL 
> was exactly what I wanted to do because I could do queries in moments 
> compared to loading several gigabytes of data and then processing it.  All of 
> the context stuff and the stuff in util/stats/db.py was to do that.  The nice 
> thing about the database is that you can build up a very large database of 
> stats across many experiments that have many samples, and with SQL, you can 
> really quickly query those stats.  If you're just trying to have something be 
> a binary format, you may as well just serialize as json (or msgpack) and gzip 
> the whole file.  I, personally, found the SQL thing to be awesome.  I could 
> regenerate complex graphs in moments.  (Not to mention the fact that SQL 
> actually implements tons of useful operations.)

Storing the binary data in SQL is a sensible middle ground at this point: you
avoid the scenario you describe of having to unzip/unserialize the whole file,
and can fetch just the data you need through queries. You still have to
unzip/unserialize those fields before you can manipulate them, but the benefit
is that the size of the database stays manageable.
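
To make the query-then-decompress flow concrete, here is a minimal sketch. It
is not the code in src/python/m5/stats/sql.py; the table name vector_stats, the
column names and the helpers store_vector/load_vector are all made up for
illustration, and it assumes Python 3.

```python
import sqlite3
import zlib
from array import array

# Hypothetical schema (an assumption, not the patch's actual layout):
# one row per vector stat, with the values held as a zlib-compressed blob.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE vector_stats (name TEXT PRIMARY KEY, data BLOB)")


def store_vector(conn, name, values):
    """Pack values as 8-byte doubles, compress them and store the blob."""
    raw = array("d", values).tobytes()
    conn.execute(
        "INSERT OR REPLACE INTO vector_stats (name, data) VALUES (?, ?)",
        (name, zlib.compress(raw)),
    )


def load_vector(conn, name):
    """Fetch a single stat by name and decompress only that blob."""
    row = conn.execute(
        "SELECT data FROM vector_stats WHERE name = ?", (name,)
    ).fetchone()
    if row is None:
        raise KeyError(name)
    values = array("d")
    values.frombytes(zlib.decompress(row[0]))
    return values.tolist()


# Stat names below are invented for the example.
store_vector(conn, "cache.miss_latency", [0.1, 0.2, 1e-9])
store_vector(conn, "cache.hits", [1.0, 2.0, 3.0])
latencies = load_vector(conn, "cache.miss_latency")
```

Only the requested row is read and decompressed, so the rest of the database
is never touched. The trade-off Nathan points out still applies: SQL cannot do
arithmetic on the compressed field itself.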

We tried a range of options and this seemed like a sensible starting point. If
someone wants to extend or modify it going forward, that is of course very
welcome.


- Andreas


-----------------------------------------------------------
This is an automatically generated e-mail. To reply, visit:
http://reviews.gem5.org/r/1646/#review3936
-----------------------------------------------------------


On Jan. 15, 2013, 10:36 a.m., Andreas Hansson wrote:
> 
> -----------------------------------------------------------
> This is an automatically generated e-mail. To reply, visit:
> http://reviews.gem5.org/r/1646/
> -----------------------------------------------------------
> 
> (Updated Jan. 15, 2013, 10:36 a.m.)
> 
> 
> Review request for Default.
> 
> 
> Description
> -------
> 
> Changeset 9499:bc23f2c316fc
> ---------------------------
> stats: Store vector stats using doubles and compress with zlib
> 
> This patch changes any arrays of values to be stored as an array of doubles,
> rather than floats, in the SQL database. This is required as floats lose too
> much accuracy. For example, if the stats are read from the database and
> injected back into gem5's stats system, then formulas can be recalculated. If
> floats are used, these formulas evaluate to values different from those
> originally calculated when creating the SQL database.
> 
> As doubles take up twice the space of floats (8 bytes vs. 4 bytes), the SQL
> database becomes larger. The end result is that, without compression, the
> database is larger than the text-based output. Therefore, as the vector
> storage is already not human readable, we compress this field using zlib.
> zlib has been in the Python standard library since version 1.5.1, so it is
> already covered by the gem5 build prerequisites.
> 
> 
> Diffs
> -----
> 
>   src/python/m5/stats/sql.py PRE-CREATION 
> 
> Diff: http://reviews.gem5.org/r/1646/diff/
> 
> 
> Testing
> -------
> 
> 
> Thanks,
> 
> Andreas Hansson
> 
>
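
For readers who want to see the accuracy argument from the patch description
in practice, the sketch below round-trips the same values through 4-byte
floats and 8-byte doubles using the standard struct module. The sample values
and the roundtrip helper are invented for illustration and are not taken from
the patch.

```python
import struct
import zlib


def roundtrip(values, fmt):
    """Pack values with the given struct type code ('f' or 'd') and unpack."""
    layout = "<%d%s" % (len(values), fmt)
    packed = struct.pack(layout, *values)
    return list(struct.unpack(layout, packed)), packed


# Invented sample data; real stats vectors will look different.
values = [0.1 * i for i in range(1000)]

as_float, float_bytes = roundtrip(values, "f")
as_double, double_bytes = roundtrip(values, "d")

# Largest absolute difference introduced by each storage format.
float_error = max(abs(a - b) for a, b in zip(values, as_float))
double_error = max(abs(a - b) for a, b in zip(values, as_double))

# Doubles take exactly twice the raw space; zlib is lossless, so compressing
# the packed bytes does not change the values that come back out.
compressed = zlib.compress(double_bytes)
restored = zlib.decompress(compressed)
```

Python floats are already doubles, so the double round trip returns the
original values exactly, while the float round trip does not. That is why
formulas recalculated from float storage can differ from the ones computed
when the database was created.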

_______________________________________________
gem5-dev mailing list
[email protected]
http://m5sim.org/mailman/listinfo/gem5-dev
