I was thinking about this a little more, and I think if we really want to get it right we need a more comprehensive change that better mirrors what's going on inside the OS, where dup'd fds are really just different pointers to the same underlying file object (including the file pointer). So most of the stuff in FdMap should be in a separate file object so that multiple fds can point to it, and the fdMap itself should just have the host fd and a pointer to the file object. Then dupFunc() would call dup() to get a new host fd, then create a new fdMap entry that has the new host fd but just copies the pointer from the original fdMap entry. Obviously the serialize/unserialize gets more complicated now, and if you really want to do it right you'll need some kind of ref counting on the file objects so that a dup()/close() sequence does the right thing... I didn't say it would be a small change, but IMO this is what it will take to really do it right.
And I see that the Process field is really fd_map, but that's actually a historical anomaly that goes against our style... it should be fdMap. Steve On Fri, Jan 21, 2011 at 11:01 AM, Richard Strong <rstr...@cs.ucsd.edu>wrote: > I agree with your point about dup. I imagine we can keep an additional > field that determines whether a file descriptor was created by dup and if > so, re-open it with dup at the point of checkpoint restore. I will work on a > fix and post the diff. > > Best, > -Rick > > > On Thu, Jan 20, 2011 at 10:32 PM, Steve Reinhardt <ste...@gmail.com>wrote: > >> Hi Rick, >> >> Thanks for figuring this out. The last one is clearly just my dumb >> mistake; I just committed a fix. The serialization/unserialization of mode >> is an obvious enough oversight too. >> >> I'm not quite so sure about the dup fix though... I think the original >> sim_fd_obj() is doing the right thing, it's just that perhaps that's not >> what dupFunc() needs or wants. That just means dupFunc() shouldn't be >> calling it though. Maybe dupFunc() needs to call a different function that >> doesn't exist. However, I'm not so sure about that, because I'm not >> convinced the overall strategy for checkpointing and restoring dup'd fds is >> even right. At first glance, it looks like when we unserialize a pair of >> dup'd fds, we'll end up opening the file twice, rather than calling dup() >> again, which seems more correct. >> >> Steve >> >> On Thu, Jan 20, 2011 at 11:41 AM, Richard Strong <rstr...@cs.ucsd.edu>wrote: >> >>> Hi all, >>> >>> I uncovered 3 bugs when trying to resume in ALPHA_SE from a checkpoint >>> for the spec2006 benchmarks. They are: >>> >>> (1) Unserializing curTick failed because the checkpoint file wrote >>> curTick()=, but the unserializing code was expecting curTick= >>> >>> (2) The mode bit was not being stored in the file descriptor checkpoint. >>> I added a feature to store the mode bit. >>> >>> (3) The dup function was duplicating the wrong file descriptor. >>> >>> Let me know what you think, I give the diff below along with a traceflag >>> that allowed me to find the problem. >>> >>> Best, >>> -Rick >>> >>> diff --git a/src/sim/SConscript b/src/sim/SConscript >>> --- a/src/sim/SConscript >>> +++ b/src/sim/SConscript >>> @@ -77,3 +77,4 @@ >>> TraceFlag('Thread') >>> TraceFlag('Timer') >>> TraceFlag('VtoPhys') >>> +TraceFlag('FileDescriptors') >>> diff --git a/src/sim/process.cc b/src/sim/process.cc >>> --- a/src/sim/process.cc >>> +++ b/src/sim/process.cc >>> @@ -268,6 +268,7 @@ >>> int >>> Process::alloc_fd(int sim_fd, string filename, int flags, int mode, bool >>> pipe) >>> { >>> + DPRINTF(FileDescriptors, "alloc_fd fd:%d filename:%s\n", sim_fd, >>> filename); >>> // in case open() returns an error, don't allocate a new fd >>> if (sim_fd == -1) >>> return -1; >>> @@ -295,6 +296,7 @@ >>> void >>> Process::free_fd(int tgt_fd) >>> { >>> + DPRINTF(FileDescriptors, "free_fd fd:%d name:%s\n", tgt_fd, "NULL"); >>> Process::FdMap *fdo = &fd_map[tgt_fd]; >>> if (fdo->fd == -1) >>> warn("Process::free_fd: request to free unused fd %d", tgt_fd); >>> @@ -324,8 +326,16 @@ >>> { >>> if (tgt_fd > MAX_FD) >>> panic("sim_fd_obj called in fd out of range."); >>> + >>> >>> - return &fd_map[tgt_fd]; >>> + //return &fd_map[tgt_fd]; >>> + for (int x = 0; x <= MAX_FD; x++) { >>> + if (fd_map[x].fd == tgt_fd){ >>> + return &fd_map[x]; >>> + } >>> + } >>> + panic("sim_fd_obj could not find system fd."); >>> + return NULL; >>> } >>> >>> bool >>> @@ -458,6 +468,7 @@ >>> if (fdo->fd != -1) { >>> fdo->fileOffset = lseek(fdo->fd, 0, SEEK_CUR); >>> } else { >>> + DPRINTF(FileDescriptors, "find_file_offsets fd:%d >>> filename:%s\n", fdo->fd, fdo->filename); >>> fdo->filename = "NULL"; >>> fdo->fileOffset = 0; >>> } >>> @@ -480,6 +491,7 @@ >>> SERIALIZE_SCALAR(flags); >>> SERIALIZE_SCALAR(readPipeSource); >>> SERIALIZE_SCALAR(fileOffset); >>> + SERIALIZE_SCALAR(mode); >>> } >>> >>> void >>> @@ -491,6 +503,7 @@ >>> UNSERIALIZE_SCALAR(flags); >>> UNSERIALIZE_SCALAR(readPipeSource); >>> UNSERIALIZE_SCALAR(fileOffset); >>> + UNSERIALIZE_SCALAR(mode); >>> } >>> >>> void >>> diff --git a/src/sim/serialize.cc b/src/sim/serialize.cc >>> --- a/src/sim/serialize.cc >>> +++ b/src/sim/serialize.cc >>> @@ -411,7 +411,7 @@ >>> { >>> const string §ion = name(); >>> Tick tick; >>> - paramIn(cp, section, "curTick", tick); >>> + paramIn(cp, section, "curTick()", tick); >>> curTick(tick); >>> >>> mainEventQueue.unserialize(cp, "MainEventQueue"); >>> >>> >>> _______________________________________________ >>> m5-dev mailing list >>> m5-dev@m5sim.org >>> http://m5sim.org/mailman/listinfo/m5-dev >>> >>> >> >> _______________________________________________ >> m5-dev mailing list >> m5-dev@m5sim.org >> http://m5sim.org/mailman/listinfo/m5-dev >> >> > > _______________________________________________ > m5-dev mailing list > m5-dev@m5sim.org > http://m5sim.org/mailman/listinfo/m5-dev > >
_______________________________________________ m5-dev mailing list m5-dev@m5sim.org http://m5sim.org/mailman/listinfo/m5-dev