Here is a list of things that need to be done:
1. Each time a file is opened add parameters to open() to a structure
of some sort. This is the mode and the file name as well as the the
returned fd. You might be able to extend the fd_map structure to hold
this info.
2. Each time a file is closed remove the entry from that structure
3. When Process::serialize() is called also serialize the newly
recorded data as well as the current position in each file
4. When Process::unserialize() is called restore the newly recorded
data and use it to reopen each file and seek the the appropriate place
in the file
5. Do the same sort of thing for the input file (process.input).
Ali
On Nov 15, 2007, at 12:38 AM, Rick Strong wrote:
All right, I am in the process on understanding how it all works.
Where is a good place to start. I am right now looking through sim/
process.* and sim/syscall_emul* to work backwards to where all the
information is stored. If someone has insight on this system and
could offer a brief description of how it works, it would be very
helpful.
-Richard
Nathan Binkert wrote:
When you fix this, pretty please submit a diff :)
I'm pretty sure I figured it out and I'm pretty sure it is related
to file I/O. When we restore from a checkpoint we don't reopen and
seek to the appropriate place in any files we were reading from/
writing to. I bet what is happening is that the benchmark attempts
to read some input data (or maybe write some data) and the file
descriptor is invalid when M5 passes the syscall through to the
host OS. The OS returns an error code which alters the path of the
benchmark and it exits early. It shouldn't be too hard to fix but
I don't have time to do it at the moment. You would need to keep
track of all the open files paths and modes and add the paths/
modes to the checkpoint along with the current position (via
tell()). Upon restoring from a checkpoint you would reopen the
files and seek() to the appropriate place in the file.
Ali
On Nov 14, 2007, at 10:02 PM, Rick Strong wrote:
When I take a checkpoint in AtomicSimpleCPU (m5_2.0b4) at
curTick=100015476500 (approx. 200,000,000 insts into the binary)
in mcf, and resume execution in any CPU model, I get an exit
syscall (syscall trace included below) at cycle 100522711000
(approx 1014345 insts into execution). What is strange is that if
I run AtomicSimpleCPU through this point (from start), I have no
problems. Any ideas on either the problem or how to debug?
It turns out that the same problem happens for checkpoints in
twolf about 200,000,000 insts into the binary. A resume has some
file i/o and an untimely exit. Both problems seem related to file
i/o and then an exit call. Is it possible that some system call
is not implemented and defaulting to exit. I included the syscall
trace for twolf for any interested parties:
I have resumed both checkpoints, immediately created new
checkpoints, and they diff clean (except for order of the ptable
entries).
I am right now working on getting an EXEC trace for mcf, one from
checkpoint and one executing from the beginning to find any
differences.
TWOLF syscall trace
"
100285445500: system.cpu: pc 4832275812 syscall read called w/
arguments 4,5368834056,8192,1
100285445500: system.cpu: syscall read returns 18446744073709551615
100286500500: system.cpu: pc 4832275812 syscall read called w/
arguments 4,5368834056,8192,5
100286500500: system.cpu: syscall read returns 18446744073709551615
100287514000: system.cpu: pc 4832260836 syscall close called w/
arguments 0,4831383888,1,1048576
100287514000: system.cpu: syscall close returns 0
100287679500: system.cpu: pc 4832260628 syscall write called w/
arguments 1,5368796680,172,1048576
TimberWolfSC version:v4.3a date:Mon Jan 25 18:50:36 EST 1988
Standard Cell Placement and Global Routing Program
Authors: Carl Sechen, Bill Swartz
Yale University
100287679500: system.cpu: syscall write returns 172
100287726500: system.cpu: pc 4832260836 syscall close called w/
arguments 1,4831383888,172,0
" MCF SYSCALL TRACE "
100519102000: system.cpu: syscall read called w/arguments
3,5368799240,8192,7
100519102000: system.cpu: syscall read returns
18446744073709551615
100521401500: system.cpu: syscall obreak called w/arguments
5374902272,0,0,1048576
100521401500: global: Break Point changed to: 0X1405E8000
100521401500: system.cpu: syscall obreak returns 5374902272
100521680500: system.cpu: syscall close called w/arguments
0,4831387472,1,1048576
100521680500: system.cpu: syscall close returns 0
100521846000: system.cpu: syscall write called w/arguments
1,5368778616,119,1048576
100521846000: system.cpu: syscall write returns 119
100521893000: system.cpu: syscall close called w/arguments
1,4831387472,119,0
100521893000: system.cpu: syscall close returns 0
100522014000: system.cpu: syscall close called w/arguments
2,4831387472,0,1048576
100522014000: system.cpu: syscall close returns
18446744073709551615
100522187500: system.cpu: syscall close called w/arguments
3,4831387472,1,1048576
100522187500: system.cpu: syscall close returns 0
100522357000: system.cpu: syscall obreak called w/arguments
5368815616,0,0,1048576
100522357000: global: Break Point changed to: 0X14001A000
100522357000: system.cpu: syscall obreak returns 5368815616
100522623500: system.cpu: syscall sigprocmask called w/
arguments 1,18446744073709547831,0,0
warn: ignoring syscall sigprocmask(1, 18446744073709547831, ...)
100522623500: system.cpu: syscall sigprocmask returns 0
100522711000: system.cpu: syscall exit called w/arguments
18446744073709551615,5368739848,2,0
_______________________________________________
m5-users mailing list
m5-users@m5sim.org
http://m5sim.org/cgi-bin/mailman/listinfo/m5-users
_______________________________________________
m5-users mailing list
m5-users@m5sim.org
http://m5sim.org/cgi-bin/mailman/listinfo/m5-users
_______________________________________________
m5-users mailing list
m5-users@m5sim.org
http://m5sim.org/cgi-bin/mailman/listinfo/m5-users
_______________________________________________
m5-users mailing list
m5-users@m5sim.org
http://m5sim.org/cgi-bin/mailman/listinfo/m5-users
_______________________________________________
m5-users mailing list
m5-users@m5sim.org
http://m5sim.org/cgi-bin/mailman/listinfo/m5-users