On Jan 27, 2004, at 3:47 PM, Cory Spencer wrote:
Perhaps someone with a bit more familiarity with the Parrot IO subsystem
could give me some guidance here. I'm currently trying to get a new
'peek' opcode working, and I'm having difficulties getting the io_unix
layer implemented correctly.
As far as I know, I'd get a call down into the io_unix layer when the
ParrotIO object isn't buffered. What I want to be able to do is to
read()/fread() a character off of the io->fd file descriptor, copy it into
the buffer, then ungetc() it back onto the stream.
You can't push a character back onto a Unix file descriptor. In order
to emulate this for parrot, you'll need some storage hanging off of the
ParrotIO structure to store the pushed back characters, and then
munge the read methods to pull data from here before reading from the
real descriptor, if there has been anything pushed back. This is
essentially what the C std. lib. buffered IO API does--the core Unix IO
API doesn't provide this functionality. For parrot, I think that we
should only do this for the io_buf layer (and maybe the io_stdio
layer), which is the buffered IO layer, and already has a buffer which
can be used for this purpose. I don't think it's appropriate for the
io_unix layer--I see that as a direct wrapper around the Unix API.
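To make the "storage hanging off of the structure" idea concrete, here is a minimal sketch of what I mean. The names (PushbackIO, pb_unread, pb_read, pb_peek) and the fixed capacity are made up for illustration and are not the actual ParrotIO layout or API; the only real calls are POSIX read() and assert().

```c
#include <assert.h>
#include <stddef.h>
#include <sys/types.h>
#include <unistd.h>

#define PUSHBACK_MAX 16 /* arbitrary capacity chosen for this sketch */

/* Hypothetical stand-in for the relevant part of a ParrotIO-like struct. */
typedef struct {
    int fd;                               /* raw Unix file descriptor */
    unsigned char pushback[PUSHBACK_MAX]; /* pushed-back bytes, used as a stack */
    size_t npushed;                       /* number of bytes currently stored */
} PushbackIO;

/* Push one byte back; returns 0 on success, -1 if the storage is full. */
static int pb_unread(PushbackIO *io, unsigned char c)
{
    assert(io != NULL);
    if (io->npushed == PUSHBACK_MAX)
        return -1;
    io->pushback[io->npushed++] = c;
    return 0;
}

/* Read up to len bytes. Pushed-back bytes are handed out first (most
 * recently pushed first). Only if none were pending do we call read()
 * on the descriptor, so a call may return a short count. */
static ssize_t pb_read(PushbackIO *io, void *buf, size_t len)
{
    unsigned char *out = buf;
    size_t got = 0;

    assert(io != NULL);
    while (got < len && io->npushed > 0)
        out[got++] = io->pushback[--io->npushed];
    if (got > 0 || len == 0)
        return (ssize_t)got;
    return read(io->fd, out, len);
}

/* Peek at the next byte without consuming it.
 * Returns the byte (0..255), or -1 on EOF or error. */
static int pb_peek(PushbackIO *io)
{
    unsigned char c;

    assert(io != NULL);
    if (io->npushed > 0)
        return io->pushback[io->npushed - 1];
    if (read(io->fd, &c, 1) != 1)
        return -1;
    io->pushback[io->npushed++] = c; /* keep it for the next pb_read() */
    return c;
}
```

Note that in this sketch the storage is only ever filled by a push-back or a peek, never by ordinary reads; a real buffered layer would reuse its existing read buffer instead.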
Unfortunately, however, ungetc requires a (FILE *), while the ParrotIO
object carries around only a raw file descriptor (I think).
Yes, the C std. lib. IO API is a wrapper on top of the core OS IO
routines (for Unix or Windows), and we're using the core IO routines to
implement our IO functionality, rather than going through an extra
layer. (And io_stdio is based on the C std. lib., and I believe is
provided so that it can be used on systems for which none of the other
base layers is available--non-Unix and non-Windows.)
I've seen some instances where people will cast the raw descriptor to a
(FILE *)
I can't imagine where that would ever work. A (FILE *) is a pointer to
a struct which stores various bits of data, including the actual file
descriptor. A file descriptor is just an integer, and isn't going to be
interpretable as a pointer to such a struct--using the core Unix IO
API, no such struct will have been created anywhere in memory. So you
can't get a FILE* via any sort of casting, at least not on Unix
platforms.
however the man page for ungetc warns ominously in its BUGS
section that:
It is not advisable to mix calls to input functions from
the stdio library with low-level calls to read() for the
file descriptor associated with the input stream; the
results will be undefined and very probably not what you
want.
This is warning about something else. It's saying don't use the C API
to do IO on a FILE*, and also use the Unix IO API on the descriptor
which is the fileno() of that FILE*. But even this you wouldn't do
by casting--you'd either get the descriptor from the FILE* using
fileno(), or use fdopen() to create a FILE* from a descriptor.
But in any event, we don't want to use the C std. lib. IO API inside of
the io_unix layer.
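For reference, this is roughly what the legitimate conversions look like, as opposed to a cast. The wrapper names (stream_to_fd, fd_to_stream, stream_peek) are just for illustration; fileno(), fdopen(), getc() and ungetc() are the real POSIX/C library calls. This is a sketch of the stdio route only, not something I'm suggesting for io_unix.

```c
#define _POSIX_C_SOURCE 200809L /* for fileno() and fdopen() */
#include <assert.h>
#include <stdio.h>
#include <unistd.h>

/* Stream to descriptor: fileno() reads the descriptor out of the FILE. */
static int stream_to_fd(FILE *fp)
{
    assert(fp != NULL);
    return fileno(fp);
}

/* Descriptor to stream: fdopen() allocates a new FILE wrapping fd.
 * The mode must be compatible with how fd was opened.
 * Returns NULL on failure. */
static FILE *fd_to_stream(int fd, const char *mode)
{
    return fdopen(fd, mode);
}

/* Peek via stdio: read one character and push it straight back.
 * Only one character of push-back is guaranteed by the C standard.
 * Returns the character, or EOF. */
static int stream_peek(FILE *fp)
{
    int c = getc(fp);
    if (c != EOF)
        ungetc(c, fp);
    return c;
}
```

Once a descriptor has been wrapped with fdopen(), all further input should go through the FILE*, which is exactly what the man page warning is about.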
That being said, what is the best course for buffering such characters at
the io_unix layer? I apparently am not able to use the standard library
functions to do so (additionally, they only guarantee that you can peek
and replace a single character).
As I said above, I think we'd only want to do this for the io_buf
layer, though others may disagree. If we do want to do it at the
io_unix layer, then we can just copy down a bunch of code from io_buf,
because we will be making the io_unix layer a buffered layer (with the
difference being that the buffer would only be populated in the case of
pushing back read items, and not during reads).
JEff