Re: IO Multiplexing
On Nov 12, 2:21 pm, stefa...@cox.net (Stefan O'Rear) wrote:
> On Thu, Nov 11, 2010 at 05:47:46PM -0800, Ben Goldberg wrote:
> > I would like to know, is perl6 going to have something like select (with arguments created by fileno/vec), or something like IO::Select (with which the user doesn't need to know about the implementation, which happens to be done with fileno/vec/select), or only an event loop.
>
> I will be unhappy if Perl 6 doesn't provide all three. However, the spec in question doesn't really exist. The IO synopsis is garbage and should be deleted and rewritten from scratch by someone who knows what they're talking about.
>
> > I would recommend that there NOT be any sort of fileno exposed to the user, unless he goes out of the way to get it -- any function (in particular, posix functions) should simply take a perl filehandle, and that function in turn would pull out the fileno (or fail appropriately, if the filehandle doesn't have a fileno). If users want to know if filehandles correspond to the same underlying file, then there could be a method -- perhaps $fh.uses_same_desciptor($fh2), or somesuch.
>
> This goes without saying. One caveat - it should be possible to pass integer file descriptors.

For functions in the posix module, sure; but for everywhere else, if the user wants to pass integer file descriptors to a function which is expecting an IO object, maybe not. But it should be easy (maybe even trivial) to create a new IO object from an integer file descriptor.

> > If there's a select() builtin (and I'd much rather that there not be -- it should be hidden away in a class, like perl5's IO::Select), I'd very much hope that it would take and return Sets of filehandles, not vec packed strings. I'd prefer there not be one[**]
>
> Absolutely not. TIMTOWTDI. There will be a select() function (in the POSIX module, not the default namespace!), and it will take parameters as close as possible to POSIX.1's definition. Perl without system calls is not Perl.
> I'm totally fine with discouraging casual use, though, which is why it shouldn't be in the default namespace.

+1 :)

> > If there's something like perl5's IO::Select, it should be able to just work regardless of whether the perl filehandles are sockets, regular files, or user-created pure-perl filehandles (which might never block, or which might use one or more normal filehandles internally, which in turn might potentially block). This is what I'd prefer.
>
> That is a good doctoral thesis topic.

Require that IO objects provide a uniform interface by means of which the multiplexer object can determine blockability.

Any user-defined IO handle which doesn't wait on external events will never block, and it can say so. A user-defined IO handle which waits on a built-in IO handle can say what handle it will need to wait upon, and what operation it will need to wait for. A user-defined IO handle which waits for a signal can say what signal it is waiting for. A user-defined IO handle which waits for a condition variable can say what variable it's waiting on, and what kind of state change it's looking for.

Given the complexity of looking for other things (in particular, IPC that doesn't use streams), the simple solution is to require that they write a class for it, and have that class (internally) create a kernel thread which will block on the desired IPC, and write a byte to a pipe... this allows the perl-level IO handle to say to the multiplexer that it's waiting on the read end of that pipe. This is roughly the same technique that we would use to safely detect signals.

> > Lastly, if perl6 has an efficient enough built-in event loop, and sufficiently lightweight coroutines (or maybe I should say fibers?), then we might not need to have any kind of explicit multiplexing.
>
> TIMTOWTDI. Perl without system calls is not Perl.

Yeah, but if it's more efficient to not directly use those system calls, then what's the point of having them?
> > For example, any time user code does a read operation on a handle that isn't (from the user code's point of view) in nonblocking mode, the filehandle implementation would tell the event loop to yield to it when the handle becomes readable, then it would yield to the event loop, then (once it gets back control) read from the handle.[*]
>
> This is why S16 is junk - too much blue-sky thinking, not enough pragmatism and practical experience.

S16 has almost nothing in it about actual IO (it does talk about how STDIN, STDOUT, STDERR, and ARGV will be replaced with $*IN, $*OUT, $*ERR, and $*ARGFILES, and how they will be dynamically overrideable). Did you mean to say S32::IO?

While S32::IO describes what methods and roles IO objects will have/do, the only thing it says about multiplexing is that both of perl5's select() operators will disappear. It's fairly obvious that select() / select(EXPR) will be replaced with the dynamically overrideable
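The pipe-based wakeup described in this reply (a helper thread blocks on the awkward event source and writes a byte to a pipe, so the multiplexer only ever waits on the pipe's read end) can be sketched in Python rather than Perl 6. This is only an illustration under stated assumptions: `PipeBackedHandle`, `take` and `demo` are made-up names, a `queue.Queue` stands in for the non-stream IPC, and using `select.select` on a pipe assumes a POSIX system.

```python
import collections
import os
import queue
import select
import threading


class PipeBackedHandle:
    """Hypothetical handle for a non-stream event source (here a queue.Queue)."""

    def __init__(self, source):
        self._source = source
        self._items = collections.deque()   # append/popleft are thread-safe
        self._rfd, self._wfd = os.pipe()
        # The helper thread does the blocking wait so the multiplexer never has to.
        threading.Thread(target=self._pump, daemon=True).start()

    def _pump(self):
        while True:
            item = self._source.get()        # blocks on the stand-in "IPC" source
            self._items.append(item)
            os.write(self._wfd, b"\0")       # one wakeup byte per pending item

    def fileno(self):
        # What the handle tells the multiplexer: wait for my pipe to be readable.
        return self._rfd

    def take(self):
        os.read(self._rfd, 1)                # consume the matching wakeup byte
        return self._items.popleft()


def demo():
    source = queue.Queue()
    handle = PipeBackedHandle(source)
    source.put("hello")
    # select() accepts any object with a fileno() method.
    readable, _, _ = select.select([handle], [], [], 5.0)
    return handle.take() if handle in readable else None
```

The multiplexer never learns what the handle is really waiting for; it only sees a readable file descriptor, which is the uniform interface the reply asks for.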
IO Multiplexing
I would like to know, is perl6 going to have something like select (with arguments created by fileno/vec), or something like IO::Select (with which the user doesn't need to know about the implementation, which happens to be done with fileno/vec/select), or only an event loop.

I would recommend that there NOT be any sort of fileno exposed to the user, unless he goes out of the way to get it -- any function (in particular, posix functions) should simply take a perl filehandle, and that function in turn would pull out the fileno (or fail appropriately, if the filehandle doesn't have a fileno). If users want to know if filehandles correspond to the same underlying file, then there could be a method -- perhaps $fh.uses_same_desciptor($fh2), or somesuch.

If there's a select() builtin (and I'd much rather that there not be -- it should be hidden away in a class, like perl5's IO::Select), I'd very much hope that it would take and return Sets of filehandles, not vec packed strings. I'd prefer there not be one[**]

If there's something like perl5's IO::Select, it should be able to just work regardless of whether the perl filehandles are sockets, regular files, or user-created pure-perl filehandles (which might never block, or which might use one or more normal filehandles internally, which in turn might potentially block). This is what I'd prefer.

Lastly, if perl6 has an efficient enough built-in event loop, and sufficiently lightweight coroutines (or maybe I should say fibers?), then we might not need to have any kind of explicit multiplexing.
For example, any time user code does a read operation on a handle that isn't (from the user code's point of view) in nonblocking mode, the filehandle implementation would tell the event loop to yield to it when the handle becomes readable, then it would yield to the event loop, then (once it gets back control) read from the handle.[*]

This provides lots of convenience, but it would resemble Java IO before NIO -- except with one fiber per handle instead of one thread per handle. Coroutines/green threads/fibers are much lighter weight than real threads, but often aren't as fast as a well-written select() loop specially written for the user's task.

Thus, I'd hope for perl6 to have an IO::Select, and automatically-yielding [*] blocking IO, and not have a select() builtin. [**]

[*] This is a simplification:

A) If a user explicitly marks a filehandle as not yielding to other coroutines, it would do a blocking read (or whatever) instead of going through the event loop rigmarole.

B) If perl6 was compiled with an asynchronous IO library (or is on Windows and is not using stdio and has (Read|Write)FileEx support), then it might start the async IO operation, tell the event loop to wake it when the operation completes, then yield to the event loop.

C) Depending on circumstances, it *may* be more efficient to have the event loop itself do the reading or other IO, and schedule the fibers for which the IO was done, than to have the fibers do the IO. TMTOWTDI. This would be especially important if perl is compiled with async IO -- the event loop might first wait for the fds to be readable/etc., *then* start the async IO for those fds, then schedule the fibers for which the performed IO has completed, thus minimizing the number of outstanding async IO operations.
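The automatically-yielding read described above can be sketched with Python generators standing in for fibers. This is a toy under stated assumptions, not a proposal for how perl6 would do it: `reader`, `run` and `demo` are made-up names, each fiber waits on at most one descriptor, no two fibers share a descriptor, and `select.select` on pipes assumes a POSIX system.

```python
import os
import select


def reader(fd, out):
    """Fiber: every 'blocking' read first yields to the loop until fd is readable."""
    while True:
        yield fd                       # park until the loop sees fd readable
        data = os.read(fd, 4096)       # will not block now
        if not data:                   # EOF
            return
        out.append(data)


def run(fibers):
    """Tiny event loop: maps each awaited descriptor to the fiber parked on it."""
    waiting = {}
    for fiber in fibers:
        try:
            waiting[next(fiber)] = fiber
        except StopIteration:
            pass
    while waiting:
        readable, _, _ = select.select(list(waiting), [], [])
        for fd in readable:
            fiber = waiting.pop(fd)
            try:
                waiting[next(fiber)] = fiber   # resume; it parks on its next fd
            except StopIteration:
                pass                           # fiber finished


def demo():
    r1, w1 = os.pipe()
    r2, w2 = os.pipe()
    got1, got2 = [], []
    os.write(w2, b"second")
    os.write(w1, b"first")
    os.close(w1)
    os.close(w2)
    run([reader(r1, got1), reader(r2, got2)])
    os.close(r1)
    os.close(r2)
    return b"".join(got1), b"".join(got2)
```

From the point of view of `reader`, the code reads like ordinary blocking IO; the `yield` is the part a language-level implementation would hide inside the filehandle.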
[**] The main reason I'd prefer that perl6 not have a select() builtin is that every time it's called perl would need to convert user-level Sets of filehandles into the underlying implementation's versions of them (fd_sets on unixy systems, fd_sets and/or an event queue handle on Windows), and then back to perl Set objects, and free up the implementation version of the filehandle set... this is inefficient.

A well-written IO::Select-like object could create (potentially empty) versions of the OS's set of filehandles when it's created, add to that set as needed, and NOT destroy that implementation-specific set until the IO::Select object itself is destroyed.

Perl5's IO::Select does this with the packed bitsets that it creates to pass to select. It could improve its efficiency by using fd_sets instead of bitstrings, and by calling the C select(2) instead of the perl select().

Better still would be epoll. In this case, avoiding repeated setup makes an object multiplexer model enormously more efficient than something like select().

Similarly, on Windows, if we use WSAEventSelect or WSAAsyncSelect to create readability/writability/etc. events for IO operations we want to wait on, and [WSA]WaitForMultipleEvents as the blocking operation, then having an object multiplexer (which keeps events from one call to the next) is far better than a simple subroutine (which needs to cancel those events after it blocks and before it returns).
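The point of footnote [**], that a multiplexer object can keep its OS-level registration alive between waits while a bare select() call must rebuild it each time, is the design Python's standard `selectors` module already follows, so it makes a convenient illustration. This is a sketch in Python rather than Perl 6; `demo` and the `"b"` label are illustrative names, and which backend `DefaultSelector` picks (epoll, kqueue, poll or select) depends on the platform.

```python
import selectors
import socket


def demo():
    sel = selectors.DefaultSelector()      # epoll on Linux, kqueue on BSD/macOS
    a, b = socket.socketpair()
    # Registration happens once; the OS-level interest set persists in `sel`.
    sel.register(b, selectors.EVENT_READ, data="b")
    results = []
    for msg in (b"one", b"two"):
        a.send(msg)
        # Each wait reuses the existing registration; nothing is rebuilt here.
        for key, _events in sel.select(timeout=5):
            results.append((key.data, key.fileobj.recv(16)))
    sel.unregister(b)
    sel.close()
    a.close()
    b.close()
    return results
```

With `select.select` the caller would pass the full list of handles on every call, and the kernel-facing sets would be built and torn down each time, which is the per-call cost the footnote objects to.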
Re: Tweaking junctions
On Oct 22, 6:41 pm, dam...@conway.org (Damian Conway) wrote:
> Dave Whipp wrote:
> > When this issue has been raised in the past, the response has been that junctions are not really intended to be useful outside of the narrow purpose for which they were introduced.
>
> Hmm. There are intentions, and then there are intentions. I know what I intended when I invented the original idea, and it wasn't just the narrow purpose for which they were added to Perl 6. :-)
>
> > > Problem 2 could be solved by defining a new (and public!) C<.eigenstates> method in the Junction class. [...]
> >
> > I think that your proposed solution is a bit too specific:
>
> That's because I didn't explain Part B of my nefarious plan! namely that, if you'd only give me proper eigenstates, I'd give you an even nicer alternative.
>
> I actually think that the meta doesn't belong on the operator at all (though I have no problem with that idea in itself). Instead, I think the meta should be placed on the data (which, of course, is what any(), all(), one(), and none() already do). So I'm going to go on to propose that we create a fifth class of Junction: the transjunction, with corresponding keyword C<every>.

[snip]

I'm probably missing something, but wouldn't it have been easier to write that module by using eval STRING to create all of those infix operators? Start with a list of the names of the operators, generate a string containing all four argument variations for each operator, then eval it.
Lazy Strings and Regexes
I know that perl6 has / will have lazy strings, since (in S32::Containers) the List role defines a cat method, which returns a Cat object, which does the Str interface, but generates the string lazily.

First, are Cat objects documented anywhere else?

Secondly, if a regular expression match is done on a lazy string, is that lazy string turned into a normal string?

If we can efficiently match against a lazy string, and if this doesn't turn the lazy string into a (large) normal string, then the best way to process a file might be something similar to:

    my $fh = open ... err die;
    my $contents = cat($fh.lines);

followed by matching on $contents.

Better still would be to provide a way for filehandles to be directly asked to produce a lazy Str which reflects the file.
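As a rough analogy only (this is Python, not Perl 6, and a memory map is not the same thing as a Cat object), the idea of matching against a string-like view of a file without first building one large string looks like this with `mmap` and `re`. The helper name `first_match` is made up for the example, patterns must be bytes, and the file must be non-empty because a zero-length mapping is rejected.

```python
import mmap
import re


def first_match(path, pattern):
    """Return the first match of a bytes pattern in the file at `path`, or None."""
    with open(path, "rb") as fh:
        # Length 0 maps the whole file; pages are loaded by the OS on demand.
        with mmap.mmap(fh.fileno(), 0, access=mmap.ACCESS_READ) as view:
            match = re.search(pattern, view)
            # Copy the matched bytes out before the mapping is closed.
            return match.group(0) if match else None
```

The regex engine scans the mapping directly, so only the matched slice is ever copied into an ordinary bytes object; whether a perl6 regex over a lazy Str could behave similarly is exactly the open question in the post.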
threads?
Has there been any decision yet over what model(s) of threads perl6 will support? Will they be POSIX-like? ithread-like? green-thread-like?

It is my hope that more than one model will be supported... something that would allow the most lightweight threads possible to be used where possible, and ithread-like behavior for backwards compatibility, and perhaps something in between for cases where the lightest threads won't work, but ithreads are too slow.

If perl6 can statically (at compile time) analyse subroutines and methods and determine if they're reentrant, then it could automatically use the lightest weight threads when it knows that the entry sub won't have side effects or alter global data.

If an otherwise-reentrant subroutine calls other subs which have been labelled by their authors as thread-safe, then that top subroutine can also be assumed to be thread-safe. This would be when the intermediate weight threads might be used.

If thread-unsafe subroutines are called, then something like ithreads might be used.

To allow the programmer to force perl6 to use lighter threads than it would choose by static analysis, he should be able to declare methods, subs, and blocks to be reentrant or thread-safe, even if they don't look that way to the compiler. Of course, he would be doing so at his own risk, but he should be allowed to do it (maybe with a warning).
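The selection rule proposed above can be written down as a small decision function. This is a sketch in Python with entirely hypothetical names (`Sub`, `touches_globals`, `declared`, `thread_model`, and the labels "light", "intermediate" and "ithread-like"); it assumes the static analysis has already produced the `touches_globals` flag and that the call graph has no cycles.

```python
from dataclasses import dataclass, field
from typing import List, Optional


@dataclass
class Sub:
    name: str
    touches_globals: bool = False      # what static analysis would report
    declared: Optional[str] = None     # programmer override: "reentrant" or "thread-safe"
    calls: List["Sub"] = field(default_factory=list)


def thread_model(sub):
    """Pick the lightest thread model the rules in the post would allow."""
    # A declaration wins over analysis, at the programmer's own risk.
    if sub.declared == "reentrant":
        return "light"
    if sub.declared == "thread-safe":
        return "intermediate"
    if sub.touches_globals:
        return "ithread-like"
    callee_models = {thread_model(callee) for callee in sub.calls}
    if callee_models <= {"light"}:     # also true when there are no callees
        return "light"
    if callee_models <= {"light", "intermediate"}:
        return "intermediate"
    return "ithread-like"
```

For instance, a sub that only calls a pure sub and one declared thread-safe comes out as "intermediate", while calling anything that alters global data without a declaration pushes the caller to "ithread-like".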