Re: Proposal for groups
"BMK" == Bradley M Kuhn [EMAIL PROTECTED] writes: BMK If we do this, please also make perl6-internals-design-monitor BMK or something like that, which is a list that simply redistributes BMK mail from perl6-internals-design to its subscribers. In other BMK words, only perl6-internals-design post would go there, but no BMK subscriber could post. Just be careful about the perl6-all redirection. Don't allow registration on both redirection lists. Hmm, How would this work? Headers would be re-written? How would 'critical' comments get to the -internals-design list? chaim -- Chaim FrenkelNonlinear Knowledge, Inc. [EMAIL PROTECTED] +1-718-236-0183
Re: Opcodes (was Re: The external interface for the parser piece)
"DS" == Dan Sugalski [EMAIL PROTECTED] writes: The "add" op would, in C code, do something like: void add() { P6Scaler *addend; P6Scaler *adder; addend = pop(); adder = pop(); push addend-vtable-add(addend, adder); } it would be up to the addend-vtable-add() to figure out how to do the actual addition, and what type to return. DS Yup. I think it'll be a little more complex than that in the call, DS something like: addend- vtable-(add[typeof adder])(adder); DS The extra level of indirection may hurt in the general case, but I think DS it's a win to call the "add an int scalar to me" function rather than have DS a generic "add this scalar to me" function that figures out the type of the DS scalar passed and then Does The Right Thing. I hope. (Yeah, I'm betting DS that the extra indirect will be cheaper than the extra code. But I'm not DS writing that in stone until we can do some benchmarking) Is all that really necessary? Why not a non-vtbl function that knows how to add numeric types? I would have wanted to limit the vtbl to self manipulation functions. Set, get, convert, etc. Cross object operations would/should be outside the realm of the object. (It seems like trying to lift yourself by the bootstraps.) chaim -- Chaim FrenkelNonlinear Knowledge, Inc. [EMAIL PROTECTED] +1-718-236-0183
Re: SvPV*
"JH" == Jarkko Hietaniemi [EMAIL PROTECTED] writes: 2) An attached table of attributes and ranges to which they apply? Uses less memory for sparse attributes, but means that it's hard work every time we have to interrogate or shuffle characters as we need to check all the ranges each time to see if the characters we are manipulating have metadata. JH I believe this alternative has been discussed once in a while. Which JH ranges an operation affects is a log(N) operation on the character JH position (binary search), and the ranges can also be kept sorted among JH themselves on (primary key start position, secondary key end JH position), so that finding out the victim ranges is also a log(N). JH Admittedly, log(N) tends to be larger than 1, and certainly larger JH than 0 :-) Also, using UTF-8 (or any variable length encoding) is JH a pain since you can't any more just happily offset to the data. JH One could also implement SVs as balanced trees, splitting and merging JH as the scalar grows and shrinks. I'd offer the possiblity that there are two (or perhaps more) different problems here. One is the current bunch of bytes (string, executable to be twiddled) Another which the attribute on strings seems to be structured data. Squeezing attributes onto a buffer, seems to be shoehorning a more general problem onto a specific implementation. Getting an efficient representation of a meaningful structure should be done a new data type. (I'm thinking of representing COBOL records/data, or even XML documents) chaim -- Chaim FrenkelNonlinear Knowledge, Inc. [EMAIL PROTECTED] +1-718-236-0183
Re: SvPV*
"NC" == Nicholas Clark [EMAIL PROTECTED] writes: NC Have I misunderstood you if I suggest that "two or more" is actually a NC continuous range of representation from NC 1 (contiguous linear) string data with 0 or more attribute attached to each NC character where the string's text is the backbone NC [and the global and local order of the characters in string is crucial NCto the value and equality with other variables] NC 2 structured data (eg XML) where the string's text is just part of the data NC held in the structure, and you could sort the data in different ways NC without changing its value NC Are those end members in a continuum? or are hybrids of the 2 impossible? NC Am I barking up the wrong tree completely? That's one way of looking at it. But I'm more inclined to think of the structured data type as a layer above the raw bits. I see the association of attributes with the underlying data as an extra 'service'. If for no other reason, there are many ways of having the attributes distribute across, deletions, additions, and moves. That is a policy decision that should not be done at the perl internal level. chaim -- Chaim FrenkelNonlinear Knowledge, Inc. [EMAIL PROTECTED] +1-718-236-0183
Re: TIL redux (was Re: What will the Perl6 code name be?)
Language confusion.  Ariel was discussing TIL; you are discussing
threaded execution.  Two different concepts.

chaim

"JvV" == John van V [EMAIL PROTECTED] writes:

JvV On 28 Oct 2000 08:06:57 +0200, Ariel Scolnicov [EMAIL PROTECTED] wrote:

    threaded code is so much slower; this can also be seen as an
    indictment of threaded code).

JvV Now I am really confused.  This directly contradicts the Threaded
JvV Perl RFC.

--
Chaim Frenkel                               Nonlinear Knowledge, Inc.
[EMAIL PROTECTED]                                    +1-718-236-0183
Re: TIL redux (was Re: What will the Perl6 code name be?)
[cc'ed to perl6-internals]

"AS" == Ariel Scolnicov [EMAIL PROTECTED] writes:

AS A TIL doesn't stand in the way. You just don't get the same
AS advantages (e.g. code compactness, e.g. reasonably easy peephole
AS optimisation for unthreaded code) if you try to compile to a TIL.
AS Which is not surprising, since Perl is not Forth.

Sorry, I'm not following.  Why do you lose all these?

Why is a TIL not compact?  (Hard to imagine anything more compact.)

What does a TIL have to do with peephole optimization?  A word has to
be done or not.  If there are some magic combinations of operations
that are done very regularly, a new word that does that combo could be
provided.

If the representation doesn't allow for certain optimizations, the TIL
is not the optree, but rather the final executable form.  The compiler
could in fact create new words optimized just for the job.

chaim
--
Chaim Frenkel                               Nonlinear Knowledge, Inc.
[EMAIL PROTECTED]                                    +1-718-236-0183
Re: Threaded Perl bytecode (was: Re: stackless python)
"AT" == Adam Turoff [EMAIL PROTECTED] writes: AT On Tue, Oct 24, 2000 at 10:55:29AM -0400, Chaim Frenkel wrote: I don't see it. I would find it extremely akward to allow thread 1:*foo = \one_foo; thread 2:*foo = \other_foo; [...] copy the foo body to a new location. replace the old foo body with an indirection (I believe this is atomic.) AT Actually, that shouldn't be awkward, if both threads have their own AT private symbol tables. As you pointed out below, we lose the use of bytecode threading. As all lookups need to go through the symbol table. Actually, we probably lose any pre-compilation wins, since all function lookups need to go through the symbol table. AT In any case, that's a different kind of threading. IPC threading AT has nothing to do with bytecode threading (for the most part). AT What you describe here is IPC threading. Larry was talking about AT bytecode threading. :-) No, I was pointing out the interaction between the two. If you want the two execution threads to be able to have seperate meanings for *foo, then we need to have seperate symbol tables. If we want them shared then we need mutexes and dynamic lookups. If we want to have shared optrees (or threaded bytecode) we need to prevent this or find a usable workaround. AT Bytecode threading is a concept pioneered in Forth ~30 years ago. Forth AT compiles incredibly easily into an intermediate representation. That AT intermediate representation is executed by a very tight interpreter AT (on the order of a few dozen instructions) that eliminates the need for AT standard sub calls (push the registers on the stack, push the params AT onto the stack and JSR). The interpreter is optional. There does not need to be an interpreter. The pointers can be machine level JSR. Which removes the interpreter loop and runs a machine speed. AT Forth works by passing all parameters on the data stack, so there's no AT explicit need to do the JSR or save registers. 
"Function" calls are done AT by adding bytecode that simply says "continue here" where "here" is the AT current definition of the sub being called. As a result, the current AT definition of a function is hard-coded into be the callee when the caller AT is compiled. [*] Err, from my reading there is no come from. The PC is saved on the execution stack and restored when the inner loop exits the current nesting level. Actually, being able to use registers would do wonders to speed up the calls. I vaguely recall that the sparc has some sort of register windows and that most of the parameters can be passed in a register. (At this point we are at a major porting effort. But the inner loop TIL would be the easiest to port.) AT The problem with AT *main::localtime = \foo; AT *foo = \bar; AT when dealing with threaded bytecode is that the threading specifically AT eliminates the indirection in the name of speed. Because Perl expects AT this kind of re-assignment to be done dynamically, threaded bytecodes AT aren't a good fit without accepting a huge bunch of modifications to Perl AT behavior as we know it. (Using threaded bytecodes as an intermediate AT interpretation also confound optimization, since so little of the AT context is saved. Intentionally.) Actually from my reading one doesn't have to lose it entirely. All TIL functions have some sort of header. The pointer to a TIL function points to the first entry/pointer in the function. In the case of real machine level code, the pointer is to a routine that actually invokes the function. In the case of a higher level TIL function, the pointer is to a function that nests the inner loop. In the event of redirecting a function. This pointer can be redirected to an appropriate routine that either fixes up the original code or simply redirects to the new version. 
(And the old code can be reactivated easily) AT *: Forth is an interactive development environment of sorts and AT the "current definition of a sub" may change over time, but the AT previously compiled calling functions won't be updated after the AT sub is redefined, unless they're recompiled to use the new definition. The current defintion of a sub doesn't change. Only a new entry in the dictonary (symbol table) now points at a new body. If the definition is deleted, the old value reappears. This is no different than what happens in postscript. One makes the decision to either do /foo { ... } def and take the lookup hits, or /foo { ... } bind def and locks in the current meanings. chaim -- Chaim FrenkelNonlinear Knowledge, Inc. [EMAIL PROTECTED] +1-718-236-0183
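The "very tight interpreter" being argued about is small enough to show
whole.  Below is a minimal direct-threaded sketch in C: a word is an
array of function pointers and the inner loop just fetches and calls.
This deliberately omits what real Forth has (indirect threading, a
return stack for nested high-level words); all names are invented:

```c
#include <assert.h>
#include <stddef.h>

/* A "word" is a NULL-terminated array of pointers to primitives. */
typedef void (*CodeFn)(void);

static long stack[32];        /* the Forth-style parameter stack */
static int sp = 0;

static void push(long v) { stack[sp++] = v; }
long pop_val(void)       { return stack[--sp]; }

/* primitives: plain C functions, no argument marshalling, no JSR setup */
static void lit2(void)   { push(2); }
static void lit3(void)   { push(3); }
static void add_op(void) { long b = pop_val(); push(pop_val() + b); }

/* compiled form of "2 3 +" */
static const CodeFn word_2_3_plus[] = { lit2, lit3, add_op, NULL };

/* the entire inner interpreter */
void run(const CodeFn *ip) {
    while (*ip)
        (*ip++)();
}

long demo_run(void) {
    run(word_2_3_plus);
    return pop_val();
}
```

Replacing the loop with real machine-level calls, as Chaim suggests, is
subroutine threading: the pointer array becomes a sequence of native
call instructions and the `run` loop disappears.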
Re: [not quite an RFC] shared bytecode/optree
"BS" == Benjamin Stuhl [EMAIL PROTECTED] writes: BS 1. Bytecode can just be mmap'ed or read in, no playing BS around with relocations on loading or games with RVAs BS (which can't be used anyway, since variable RVAs vary based BS on what's been allocated or freed earlier). (What is an RVA?) And how does the actual runtime use a relocatable pointer? If it is an offset, then any access becomes an add. And depending upon the source of the pointer, it would either be a real address or an offset. Or if everything is a handle, then each access requires two fetches. And I don't see where you avoided the relocation. The handle table that would come in with the bytecode would need to be adjusted to reflect the real address. I vaguly can see a TIL that uses machine code linkage (real machine code jumps) that perhaps could use relative addressing as not needing relocation. But I'm not sure that all architectures support long enough relative jumps/calls. Doing the actual relocation should be quite fast. I believe that all current executables have to be relocated upon loading. Not to mention the calls to shared modules/dlls. chaim -- Chaim FrenkelNonlinear Knowledge, Inc. [EMAIL PROTECTED] +1-718-236-0183
Re: Threaded Perl bytecode (was: Re: stackless python)
"KF" == Ken Fox [EMAIL PROTECTED] writes: KF Adam Turoff wrote: when dealing with threaded bytecode is that the threading specifically eliminates the indirection in the name of speed. KF Yes. Chaim was saying that for the functions that need indirection, KF they could use stubs. You don't need to guess in advance which ones KF need indirection because at run-time you can just copy the old code KF to a new location and *write over* the old location with a "fetch pointer KF and tail call" stub. All bytecode pointers stay the same -- they just KF point to the stub now. The only restriction on this technique is that KF the no sub body can be smaller than the indirection stub. (We could KF easily make a single bytecode op that does a symbol table lookup KF and tail call so I don't see any practical restrictions at all.) We may not even need to copy the body. If the header of the function is target location, the header could any one of nop, nest another inner loop lookup current symbol fixup caller or jump to new target. (Hmm, with Q::S, it could be all of them in constant time.) chaim -- Chaim FrenkelNonlinear Knowledge, Inc. [EMAIL PROTECTED] +1-718-236-0183
Re: Threaded Perl bytecode (was: Re: stackless python)
"AT" == Adam Turoff [EMAIL PROTECTED] writes: Wouldn't just the appearance of *foo = \other_foo, be enough to tell the compiler to treat all foo's (or perhaps if there were some dataflow analysis some region of code) to use indirection? AT You're forgetting eval "*foo = \other_foo" and the like. :-) AT And, if modules are threaded when they're bytecompiled, it's rather AT difficult to intuit some random invocant's use of *foo = \other AT (at runtime) to add the required indirection post hoc. AT Perl's expected behavior almost requires unthreaded bytecode. Threaded AT bytecode could work, if it's explicitly requested (thus making AT *main::localtime = *foo = \bar; a fatal error, at the user's request.) I don't see it. I would find it extremely akward to allow thread 1: *foo = \one_foo; thread 2: *foo = \other_foo; Rather, this style should be done via a variable indirection {$foo}. I don't see much of a speed hit. Until the *foo assignment is actually done, all threading is directly to the body of foo. When the *foo assignment is done, copy the foo body to a new location. replace the old foo body with an indirection (I believe this is atomic.) And optionally, the indirection could be to a fixup routine, that would adjust the caller to directly point at the new body. chaim -- Chaim FrenkelNonlinear Knowledge, Inc. [EMAIL PROTECTED] +1-718-236-0183
Re: Threaded Perl bytecode (was: Re: stackless python)
"JvV" == John van V [EMAIL PROTECTED] writes: JvV If this is the case, the code underlying the treading would utilize normal functions to poll the concurrent event streams and programmers could JvV choose between the threads and functions depending on their levels of comfort. JvV This is good for my hypothetical remote sensing devices because the byte code interpreter would be single threaded, presumably smaller, and JvV a lot easier to test. Different threads. I believe the original was Threaded Interpreter Code, (think forth) i.e. using pointers (or direct machine calls) to other body of code made up of pointers or a real piece of code. You seem to be thinking of threaded execution. chaim -- Chaim FrenkelNonlinear Knowledge, Inc. [EMAIL PROTECTED] +1-718-236-0183
Re: Threaded Perl bytecode (was: Re: stackless python)
"AT" == Adam Turoff [EMAIL PROTECTED] writes: AT If Perl bytecode were to become threaded, it would be rather troublesome. AT It would probably require some attribute or early compile time AT declaration (in main::BEGIN) to tag specific subs/builtins to be AT overridden at runtime. It would also force a runtime error when AT attempting to override a threaded sub; that it can't be overridden AT anymore violates the principle of least surprise wrt Perl5. AT It would also mean that if anything was overriden anywhere, no AT module code could be read in as bytecode, since it may need to be AT rethreaded to incorporate overrideable subs/builtins. I'm missing something here. Wouldn't just the appearance of *foo = \other_foo, be enough to tell the compiler to treat all foo's (or perhaps if there were some dataflow analysis some region of code) to use indirection? chaim -- Chaim FrenkelNonlinear Knowledge, Inc. [EMAIL PROTECTED] +1-718-236-0183
Re: RFC 334 (v1) Perl should allow specially attributed subs to be called as C functions
There is an intermediate method: have our own execution and data
stacks.  Basically, build a TIL interpreter.  This might be
intermediate in speed between raw machine code and the perl vararg
calls.

If not intermediate in speed, I suspect it would involve
cleaner-looking code.  All functions would look the same; all calls
between internal functions would look the same.

chaim

"DS" == Dan Sugalski [EMAIL PROTECTED] writes:

    Do you need this "deep carnal knowledge" to make it efficient, or
    just to make the thing fly at all?

DS Depends on how we implement the thing.
DS If we build up real function code at runtime, we can encapsulate all the
DS state we need in a single function pointer, and don't need to have the
DS caller pass in any state at all. This is good, since it means the caller
DS can do:
DS     (*foo)(1, 2, 3)
DS to call the perl sub whose function pointer's stored in foo. (Well,
DS assuming I got the syntax right) The bad thing is we need to know how to
DS generate the function header and whatever bits we need to have to pull
DS arguments off the stack and out of registers and turn 'em into perl PMCs,
DS then call the real perl code.
DS The alternate method is to use something like we've got now with the
DS perl_call_* stuff, where you pass in a pointer to the real perl function
DS to a generic (vararg'd) C function, like:
DS     perl_call(perl_cv_ptr, 1, 2, 3);
DS the bad bit about that is it means that calls to perl functions are
DS different than calls to C functions, and I'm trying not to do that--I
DS really do want to be able to get real function pointers that can be used in

--
Chaim Frenkel                               Nonlinear Knowledge, Inc.
[EMAIL PROTECTED]                                    +1-718-236-0183
Re: RFC 334 (v1) Perl should allow specially attributed subs to be called as C functions
"DS" == Dan Sugalski [EMAIL PROTECTED] writes: DS At 01:50 PM 10/10/00 -0400, Chaim Frenkel wrote: There is an intermediate method, have our own execution and data stack. Basically build a TIL interpreter. This might be intermediate in speed between raw machine code and the perl vararg calls. DS Perl functions that are called from outside will have to have some DS sort of interpreter attached to 'em. I can see either a default DS interpreter, or the one they were compiled into being valid as a DS choice. Hmm, I was probably off base. I was thinking of multiple implementations of the runops methodology. Not of how to call into perl only between perl funcs. If at all possible there should be only one way to write the function and one way to call it. So that the innards can be independent of the actual implementation. A Calling from the outside should be extremely limited. One shouldnt' be allowed to straddle the fence. If not intermediate in speed, I suspect it would involve cleaner looking code. All functions would look the same all calls between internal functions would look the same. DS If there's no hit, I'd love to have all perl functions callable from DS outside. I'm not sure that'll be the case, though I'm all for it... I don't think you want that. Calling a perlop directly has to mean that the caller is signing in blood that there are no guarentees of when it would break. We might want someone peeking behind the curtain to supply an expected version, and we could either accomadate it or break it early enough to give sufficient time to adjust the code. chaim -- Chaim FrenkelNonlinear Knowledge, Inc. [EMAIL PROTECTED] +1-718-236-0183
Re: RFC 361 (v1) Simplifying split()
"TC" == Tom Christiansen [EMAIL PROTECTED] writes: TC The reason that this was done this way was for the common TC case of someone forgetting to chop an incoming line and TC then splitting on whitespace. TC while () { TC @F = split; TC ... TC } Interesting. I thought it was to make it more natural. When splitting on whitespace, one is interested in the non-space tokens. Effectively the leading and trailing whitespace isn't there. chaim -- Chaim FrenkelNonlinear Knowledge, Inc. [EMAIL PROTECTED] +1-718-236-0183
Re: RFC 326 (v1) Symbols, symbols everywhere
"DS" == Dan Sugalski [EMAIL PROTECTED] writes: DS For the internals, though... DS This would be very useful, and it's a feature I'd really like to implement. DS Basically you're asking for pre-computed, indirect, shared hash keys. This DS sounds like a Good Plan to me. Why precomputed? Any 'interned' string has a unique value (e.g. address). Though wouldn't they have to be garbage collected? Short lived hashes with constantly changing keys, the shared hash keys would keep growing. Actually, this might be something useful at the user level. Many times I do this @record{@keys} = new_values(); Using a set of 'intern'ed strings might make it more efficient. And unless we are able to note that @keys is always the same,the hashes would have to keep getting recomputed. With the symbols we might be able to recognize the constant set. chaim -- Chaim FrenkelNonlinear Knowledge, Inc. [EMAIL PROTECTED] +1-718-236-0183
Re: RFC 301 (v1) Cache byte-compiled programs and modules
"RA" == Russ Allbery [EMAIL PROTECTED] writes: RA Michael Maraist [EMAIL PROTECTED] writes: I suggested this a while ago, and the response was that automatically writing files is a security risk. You should extend your RFC to describe a caching directory or configuration. RA This will be completely impossible to implement in some installation RA environments, such as AFS or read-only remote NFS mounts. I really don't RA like software that tries to play dynamic compilation tricks; please just RA compile at installation time and then leave it alone. This isn't really a problem. Purify does this already. If it is not allowed to write into the original directory, You give it a cache directory and it does appropriate magic to map to the original file and associate the correct user to the cached version. eg. cache_dir/usr/local/perl/.../Posix.pm.1050:22 # George did this one And this would be disabled under -T chaim -- Chaim FrenkelNonlinear Knowledge, Inc. [EMAIL PROTECTED] +1-718-236-0183
Re: RFC 214 (v1) Emit warnings and errors based on unoptimized code
The discussion was about optimized code, where the original location
would have been lost.  A multiline constant seems like something that
could keep the location information.

chaim

"TC" == Tom Christiansen [EMAIL PROTECTED] writes:

    Yup.  Worst case we can always point at where the line starts, if
    it still exists.

TC Make it:
TC     Division by zero error on statement beginning at line xx
TC Consider multiline constants -- where do you say the warning occurred?
TC     print <<EOF;
TC     blah
TC     blah
TC     $fred
TC     blah
TC     blah

--
Chaim Frenkel                               Nonlinear Knowledge, Inc.
[EMAIL PROTECTED]                                    +1-718-236-0183
Re: RFC 227 (v1) Extend the window to turn on taint mode
"JH" == Jarkko Hietaniemi [EMAIL PROTECTED] writes: (Someone remind me, What is the point of -T if not running setuid?) JH Being paranoid is never a bad idea because They are always out to get you. That's fine, but tell me what security breach can be caused by not having a -T? The perl code is available to be read. So what can a perl program do that the black hat couldn't by tweaking the code? The code is running under the black hat's priviledges and uid. chaim -- Chaim FrenkelNonlinear Knowledge, Inc. [EMAIL PROTECTED] +1-718-236-0183
Re: RFC 227 (v1) Extend the window to turn on taint mode
"AT" == Adam Turoff [EMAIL PROTECTED] writes: AT The crux of my proposal/request is that when perl6 innards are AT designed, -T processing is handled the same way -p and -i are. AT That is, option processing should start out cleaner than what AT is in 5.7.0 or what was in 5.004 (at least, wrt -T). I'll agree with you. But unless you propose to change the semantics of when and where arguments are processed, you have timing problems. With no command line arguments perl can honor everything on the #! line. With command line arguments, perl has to process them to know how to interpret the script. The only other mechanism that might be worthwhile would be for perl to notice the -T and then give up and re-exec itself with an added -T at the front of the line. This would be workable as long as none of the -M's do anything to change state. chaim -- Chaim FrenkelNonlinear Knowledge, Inc. [EMAIL PROTECTED] +1-718-236-0183
Re: RFC 227 (v1) Extend the window to turn on taint mode
"JH" == Jarkko Hietaniemi [EMAIL PROTECTED] writes: JH It may not be. Think CGI. JH The code is running under what ever poor security measures the silly JH subclued webmaster set it up to be, and has access to which ever files JH yadayadayada. No command line switches there. Only the #!. If the subclued webmaster has perl in his cgi-bin directory, -T is his least worry. Hmm, or are you thinking of a shell script that's calling perl? Then he has lots of holes to worry about. chaim -- Chaim FrenkelNonlinear Knowledge, Inc. [EMAIL PROTECTED] +1-718-236-0183
Re: RFC 227 (v1) Extend the window to turn on taint mode
"DS" == Dan Sugalski [EMAIL PROTECTED] writes: DS Any time the code being executed isn't being run as the person asking for DS its execution you can have problems. Think daemons in perl, or DS client-server code. (Like CGI programs, or mailing-list managers) Jobs run DS automagically by privileged users (and arguably not automagically) can be DS targets. Think odd filenames in /tmp and cron jobs owned by root. But these all lack command line switches that are passed to perl. If one is debugging a script with a #! -T, then just rerun the job and add a -T right at the front. chaim -- Chaim FrenkelNonlinear Knowledge, Inc. [EMAIL PROTECTED] +1-718-236-0183
Re: RFC 214 (v1) Emit warnings and errors based on unoptimized code
I don't believe that the optree has to be bloated.  I think having the
actual line number in the optree vs. having a separate structure
mapping offsets to line numbers are the same.  Then an error report
would be akin to

    error_me_this(current_op_address, "Foo didn't baz in time. Aborting");

Having the bloat in a separate memory area lets it get swapped out
until absolutely necessary.  And it helps preserve locality for the
optree.

chaim

"DS" == Dave Storrs [EMAIL PROTECTED] writes:

DS As to solving problem #1 (which is, arguably, the bigger problem),
DS suppose we add a new switch to perl? I propose we add the -H switch
DS (mnemonic: *H*elpful errors/warnings). When -H is set, the optree would
DS be generated with a sufficient amount of bloat that it could report the
DS errors/warnings at the exact spot they occur, even down to reporting the
DS appropriate failure line in a multiline statement. We don't worry about
DS bloat or slowdown, because the assumption is that -H is only used during
DS debugging or when speed doesn't matter, and it will be turned off when
DS the code goes to production.

--
Chaim Frenkel                               Nonlinear Knowledge, Inc.
[EMAIL PROTECTED]                                    +1-718-236-0183
Re: RFC 227 (v1) Extend the window to turn on taint mode
I vaguely recall when Chip put that in.  He worked pretty hard to
adjust the command line/#! option processing.  (Something about unsafe
operations already being done before the script is read.)

You are asking for the first line of the input script to be read
before any of the command line arguments are processed; this line
would then be searched for the -T and have that propagated to the
front of any command line arguments.  Sounds messy.

Hmm, got one.  '-S' searches the PATH for the script.

(Someone remind me, what is the point of -T if not running setuid?)

chaim

"PRL" == Perl6 RFC Librarian [EMAIL PROTECTED] writes:

PRL Perl complains when the -T flag is used with the #!
PRL mechanism, and perl is explicitly invoked on the
PRL commandline without the -T flag:

PRL This RFC proposes that when Perl is explicitly invoked
PRL on the commandline, and runs a script that contains the
PRL -T option on the #! line, Perl should just turn on
PRL taint mode and not complain about it.

PRL =head1 MIGRATION ISSUES

PRL None.

--
Chaim Frenkel                               Nonlinear Knowledge, Inc.
[EMAIL PROTECTED]                                    +1-718-236-0183
Re: RFC 214 (v1) Emit warnings and errors based on unoptimized code
"NT" == Nathan Torkington [EMAIL PROTECTED] writes: NT I take this oblique comment to mean that it'd bloat the op-tree too NT much? NT I was thinking of this over lunch. I want to be able to strip the NT instruction sequence of line number, package, etc. information, in the NT name of a smaller memory footprint and smaller distributed bytecode. NT It'd make debugging tricky, but if there was still a sequence number NT ("error at opcode #1590") preserved, the user could produce an NT unstripped executable and then use the sequence number to see where NT the problem was. I don't see the problem. Seperate the location from the optree. The optree is opimized up the gazoo. A seperate table, which would be paged out as not needed until the problem strikes, would have the cross reference between line numbers/files and op code File cross-reference (might be just the %INC) file#, filename from, to, file#, line# ... If the optimizer moves some opcodes around, it would slice and dice the relevent offset records to keep track. (I saw this in the Stratus VOS compiler/linker) Thoughts? chaim -- Chaim FrenkelNonlinear Knowledge, Inc. [EMAIL PROTECTED] +1-718-236-0183
Re: RFCs for thread models
"SWM" == Steven W McDougall [EMAIL PROTECTED] writes: SWM If you actually compile a Perl program, like SWM$a = $b SWM and then look at the op tree, you won't find the symbol "$b", or "b" SWM anywhere in it. The fetch() op does not have the name of the variable SWM $b; rather, it holds a pointer to the value for $b. Where did you get this idea from? P5 currently does many lookups for names. All globals. Lexicals live elsewhere. SWM If each thread is to have its own value for $b, then the fetch() op SWM can't hold a pointer to *the* value. Instead, it must hold a pointer SWM to a map that indexes from thread ID to the value of $b for that SWM thread. Thread IDs tend to be sparse, so the map can't be implemented SWM as an array. It will have to be a hash, or a B*-tree, or a balanced SWM B-tree, or the like. You are imagining an implementation and then arguing against it. What about a simple block of reserved data per'stack frame' and the $b becomes an offset into that area? And then there are all the other offset for variables that are in outer scopes. Here is my current 'guess'. A single pointer to the thread interpreters private data. A thread stack (either machine or implemented) A thread private area for evaled code op trees (and Damian specials :-) A thread private file scope lexical area The lexical variables would live on the stack in some frame, with outer scope lexicals directly addressable (I don't recall all of the details but this is standard compiler stuff, I think the dragon book covers this in detail) The shared variables (e.g. main::*) would live in the well protected global area. Now where sub recursive() { my $a :shared; ; return recursive() } would put $a or even which $a is meant, is left as an excersize for someone brighter than me. chaim -- Chaim FrenkelNonlinear Knowledge, Inc. [EMAIL PROTECTED] +1-718-236-0183
Re: RFC 178 (v2) Lightweight Threads
"NI" == Nick Ing-Simmons [EMAIL PROTECTED] writes: NI Chaim Frenkel [EMAIL PROTECTED] writes: NI Well if you want to place that restriction on perl6 so be it but in perl5 NI I can say NI tie $a[4],'Something'; That I didn't realize. NI Indeed that is exactly how tied arrays work - they (automatically) add NI 'p' magic (internal tie) to their elements. Hmm, I always understood a tied array to be the _array_, not each individual element. NI Tk apps do this all the time : NI $parent->Label(-textvariable => \$somehash{'Foo'}); NI The reference is just to get the actual element rather than a copy. NI Tk then ties the actual element so it can see STORE ops and update the NI label. Would it be a loss to not allow the elements? The tie would then be to the aggregate. I might argue that under threading, tying to the aggregate may be 'more' correct for coherency (locking the aggregate before accessing.) chaim -- Chaim Frenkel          Nonlinear Knowledge, Inc. [EMAIL PROTECTED] +1-718-236-0183
Re: RFC 178 (v2) Lightweight Threads
"AB" == Alan Burlison [EMAIL PROTECTED] writes: AB Chaim Frenkel wrote: You aren't being clear here. fetch($a) fetch($a) fetch($b) ... add ... store($a) store($a) Now all of the perl internals are done 'safely' but the result is garbage. You don't even know the result of the addition. AB Sorry you are right, I wasn't clear. You are correct - the final value AB of $a will depend on the exact ordering of the FETCHes and STOREs in the AB two threads. As I said - tough. The problem is that defining a AB 'statement' is hard. Does map or grep constitute a single statement? I AB bet most perl programmers would say 'Yes'. However I suspect it AB wouldn't be practical to make it auto-locking in the manner you AB describe. In that case you aren't actually making anyone's life easier AB by adding auto-locking, as they now have a whole new problem to solve - AB remembering which operations are and aren't auto-locking. Explicit AB locks don't require a feat of memory - they are there for all to see in AB the code. I want to make _my_ life easier. I don't expect to have mucho communication between threads; the more communication, the more lossage in performance due to the sheer handshaking. So with minimal interaction, why bother with the sprinkling of locks? You in effect are telling _all_ users that they must do lock($a) ... unlock($a) around all :shared variables. Now if that has to be done, why not do it automatically. AB The other issue is that auto-locking operations will inevitably be done AB inside explicitly locked sections. This is firstly inefficient as it AB adds another level of locking, and secondly may well be prone to causing AB deadlocks. Aha, you might have missed one of my comments. A lock() within a scope would turn off the auto locking. The user _knows_ what he wants and is now willing to accept responsibility. Doing that store of a value in $h, or pushing something onto @queue, is going to be a complex operation.
If you are going to keep a lock on %h while the entire expression/statement completes, then you have essentially given me an atomic operation, which is what I would like. AB And you have given me something that I don't like, which is to make AB every shared hash a serialisation point. I'm sure you've done a lot of work in the core and serialization. But haven't you seen that when you want to lock an entry deep in the heart of some chain, you have to _first_ lock the chain to prevent changes? So:

    lock(aggregate)
    fetch(key)
    lock(chain)
    fetch(chain)
    lock(value)
    fetch(value)
    unlock(value)
    unlock(chain)
    unlock(key)
    unlock(aggregate)

Actually, these could be read locks, and they might be freed as soon as they aren't needed, but I'm not sure that the rule to keep holding might not be better (e.g. promotion to an exclusive); these algorithms have already been worked on for quite a long time. AB If I'm thinking of speeding up an app that uses a shared hash by AB threading it I'll see limited speedup because under your scheme, AB any accesses will be serialised by that damn automatic lock that I AB DON'T WANT! Then just use lock() in the scope. AB A more common approach to locking hashes is to have a lock per AB chain - this allows concurrent updates to happen as long as they AB are on different chains. Don't forget that the aggregate needs to be locked before trying to lock the chain. The aggregate may disappear underneath you unless you lock it down. AB Also, I'm not clear what type of automatic lock you are intending AB to cripple me with - an exclusive one or a read/write lock for AB example. My shared variable might be mainly read-only, so AB automatically taking out an exclusive lock every time I fetch its AB value isn't really helping me much. I agree with having a read-only vs. exclusive. But do all platforms provide this type of locking? If not, would the overhead of implementing it kill any performance wins? Does promoting a read-only to an exclusive cost that much?
So the automatic lock would be a read-only, and the store promotes to an exclusive during the short storage period. AB I think what I'm trying to say is please stop trying to be helpful AB by adding auto locking, because in most cases it will just get in AB the way. Here we are arguing about which is the more common access method: multiple shared or singular shared. This is an experience issue. Those times that I've done threaded code, I've kept the sharing down to a minimum. Mostly work queues, or using a variable to create a critical section. AB If you *really* desperately want it, I think it should be optional, e.g. AB    my $a : shared, auto lock; AB or somesuch. This will probably be fine for those people who are using AB threads but who don't actually understand what they
Re: RFC 178 (v2) Lightweight Threads
"AB" == Alan Burlison [EMAIL PROTECTED] writes: AB Chaim Frenkel wrote: What tied scalar? All you can contain in an aggregate is a reference to a tied scalar. The bucket in the aggregate is a regular bucket. No? AB So you don't intend being able to roll back anything that has been AB modified via a reference then? And if you do intend to allow this, how AB will you know when to stop chasing references? What happens if there AB are circular references? How much time do you think it will take to AB scan a 4Gb array to find out which elements need to be checkpointed? AB Please consider carefully the potential consequences of your proposal. No scanning. I was considering that all variables would, on a store, safe-store the previous value in a thread-specific holding area[*]. Then upon a deadlock/rollback, the changed values would be restored. (This restoration should be valid, since the change could not have taken place without an exclusive lock on the variable.) Then the execution stack and program counter would be reset to the checkpoint. And then restarted. chaim [*] Think of it as the transaction log. -- Chaim Frenkel          Nonlinear Knowledge, Inc. [EMAIL PROTECTED] +1-718-236-0183
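The "thread-specific holding area" above is essentially an undo log, and the mechanism is small enough to sketch. This is an illustration under assumed names (`UndoLog`, `tx_store`, `tx_rollback`), with a fixed-size log for brevity: each store first records the address and prior value, and rollback replays the log newest-first to restore the checkpoint state.

```c
#include <assert.h>

#define LOG_MAX 64

/* One undo record: where a store happened and what was there before. */
typedef struct { long *addr; long old; } UndoEntry;
typedef struct { UndoEntry e[LOG_MAX]; int n; } UndoLog;

/* Store through the log: remember the previous value, then write. */
static void tx_store(UndoLog *log, long *addr, long val)
{
    log->e[log->n].addr = addr;
    log->e[log->n].old  = *addr;
    log->n++;
    *addr = val;
}

/* Rollback: undo stores in reverse order back to the checkpoint. */
static void tx_rollback(UndoLog *log)
{
    while (log->n > 0) {
        log->n--;
        *log->e[log->n].addr = log->e[log->n].old;
    }
}
```

No scanning is needed at rollback time: only the variables that were actually stored to appear in the log, which answers the 4Gb-array objection for the write path (though reads via references are a separate question).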
Re: RFC 178 (v2) Lightweight Threads
"AB" == Alan Burlison [EMAIL PROTECTED] writes: AB Please consider carefully the potential consequences of your proposal. I just realized that no one has submitted a language-level proposal for how deadlocks are detected, delivered to the perl program, how they are to be recovered from, what happens to the held locks, etc. chaim -- Chaim Frenkel          Nonlinear Knowledge, Inc. [EMAIL PROTECTED] +1-718-236-0183
Re: RFC 178 (v2) Lightweight Threads (multiversioning)
"r" == raptor [EMAIL PROTECTED] writes: r ]- what if we don't use "locks", but multiple versions of the same variable r !!! What I have in mind : r If there is transaction-based variables THEN we can use multiversioning r mechanism like some DB - Interbase for example. r Check here : http://216.217.141.125/document/InternalsOverview.htm r just thoughts, i've not read the whole discussion. Doesn't really help. You just move the problem to commit time. Remember, the final result has to be _as if_ all of the interleaved changes were done serially (one thread finishing before the other). If this cannot be done, then one or the other thread has to be notified of deadlock and the relevant changes thrown away. (As a former boss liked to say, "Work is conserved." Or perhaps TANSTAAFL.) chaim -- Chaim Frenkel          Nonlinear Knowledge, Inc. [EMAIL PROTECTED] +1-718-236-0183
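"Moving the problem to commit time" is exactly what optimistic multiversioning does, and it can be shown in a few lines. This is a toy sketch with invented names (`MvCell`, `MvTxn`): each cell carries a version, a transaction remembers the version it read, and commit succeeds only if nobody else committed in between — the losing transaction gets the "thrown away, retry" outcome described above instead of a lock wait.

```c
#include <assert.h>
#include <stdbool.h>

/* A versioned cell: the value plus a counter bumped on every commit. */
typedef struct { long value; unsigned version; } MvCell;

/* A (single-cell) transaction: the version it read, and its new value. */
typedef struct { MvCell *cell; unsigned seen; long newval; } MvTxn;

static void mv_begin(MvTxn *t, MvCell *c)
{
    t->cell = c;
    t->seen = c->version;        /* snapshot the version we read */
}

static bool mv_commit(MvTxn *t)
{
    if (t->cell->version != t->seen)
        return false;            /* someone committed first: abort/retry */
    t->cell->value = t->newval;  /* first committer wins */
    t->cell->version++;
    return true;
}
```

Two transactions that read the same version cannot both commit; serializing them is still the program's problem, just surfaced as a commit failure rather than a deadlock.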
Re: RFC 178 (v2) Lightweight Threads
(We are not (quite) discussing what to do for Perl6 any longer. I'm going through a learning phase here. I.e. where are my thoughts miswired.) "AB" == Alan Burlison [EMAIL PROTECTED] writes: Actually, I wasn't. I was considering the locking/deadlock handling part of database engines. (Map row -> variable.) AB Locking, transactions and deadlock detection are all related, but aren't AB the same thing. Relational databases and procedural programming AB languages aren't the same thing. Beware of misleading comparisons. You are conflating what I'm saying. Doing locking and deadlock detection is the mapping. Transactions/rollback is what I was suggesting perl could use to accomplish under-the-covers recovery. How on earth does a compiler recognize checkpoints (or whatever they are called) in an expression? AB If you are talking about SQL it doesn't. You have to explicitly say AB where you want a transaction completed (COMMIT) or aborted (ROLLBACK). AB Rollback goes back to the point of the last COMMIT. Sorry, I meant 'C', and Nick pointed out the correct term was sequence point. I'm probably way off base, but this was what I had in mind. (I. == Internal)

    I.Object    - A non-tied scalar or aggregate object
    I.Expression - An expression (no function calls) involving only I.Objects
    I.Operation - (non-io operators) operating on I.Expressions
    I.Function  - A function that is made up of only I.Operations/I.Expressions
    I.Statement - A statement made up of only I.Functions, I.Operations and I.Expressions

AB And if the aggregate contains a tied scalar - what then? The only way AB of knowing this would be to check every item of an aggregate before AB starting. I think not. What tied scalar? All you can contain in an aggregate is a reference to a tied scalar. The bucket in the aggregate is a regular bucket. No? Because if we can recover, we can take locks in arbitrary order and simply retry on deadlock. A variable could put its prior value into an undo log for use in recovery. AB Nope.
AB Which one of the competing transactions wins? Do you want a AB nondeterministic outcome? It is already non-deterministic. Even if you lock up the wazoo, depending upon how the threads get there the value can be anything.

    Thread A         Thread B
    lock($a);        lock($a);
    $a=2;            $a=5;
    unlock($a);      unlock($a);

Is the value 5 or 2? It doesn't matter. All that a sequence of locking has to accomplish is to make them look as if one or the other completed in sequence. (I've got a reference here somewhere to this definition of consistency.) The approach that I was suggesting is somewhat akin to (what I understand) a versioning approach to transactions would take. AB Deadlocks are the bane of any DBA's life. Not any of the DBAs that I'm familiar with. They just let the application programmers duke it out. AB If you get a deadlock it means your application is broken - it is AB trying to do two things which are mutually inconsistent at the AB same time. Sorry, that doesn't mean anything. There may be more than one application in a database. And they may have very logical things that they need done in a different order. The deadlock could quite well be the effect of the database engine. (I know Sybase does this, or at least did a few revisions ago; it took the locks it needed on an index a bit late.) A deadlock is not a sin or something wrong. Avoiding it is a useful (extremely useful) optimization. Working with it might be another approach. I think of it like I think of Ethernet's back-off and retry. AB If you feel that automatically resolving this class of problem is AB an appropriate thing for perl to do. Because I did it already in a simple situation. I wrote a layer that handled database interactions. Given a set of database operations, I saved a queue of all operations. If a deadlock occurred I retried it until successful _unless_ I had already returned some data to the client. Once some data was returned I cleaned out the queue. The recovery was invisible to the client.
Since no data ever left my service layer, no external effects/changes could have been made. Similarly, all of the locking and deadlocks here could be internal to perl, and never visible to the user; so even if taking out a series of locks does deadlock, perl can recover. Again, this is probably too expensive and complex, but it isn't something that is completely infeasible. chaim -- Chaim Frenkel          Nonlinear Knowledge, Inc. [EMAIL PROTECTED] +1-718-236-0183
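The retry layer described above can be sketched as a small loop. All names here are invented for illustration (`Op`, `run_with_retry`, `flaky`): queued operations are replayed on "deadlock" until they succeed, but only while no data has escaped to the client, since after that the retry would no longer be invisible.

```c
#include <assert.h>
#include <stdbool.h>

/* An operation over some state; returns false when it hits a deadlock
 * (we assume it has already rolled its own changes back in that case). */
typedef bool (*Op)(void *state);

static bool run_with_retry(Op op, void *state, const bool *output_sent,
                           int max_tries)
{
    while (max_tries-- > 0) {
        if (op(state))
            return true;        /* committed cleanly */
        if (*output_sent)
            return false;       /* client already saw data: give up */
        /* otherwise just replay -- recovery stays invisible */
    }
    return false;
}

/* Demo op: "deadlocks" twice, then succeeds on the third attempt. */
static bool flaky(void *state)
{
    int *tries = state;
    return ++*tries >= 3;
}
```

The `output_sent` flag is the whole trick: it marks the point past which side effects have left the layer and transparent recovery stops being valid.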
Re: RFC 178 (v2) Lightweight Threads
"AB" == Alan Burlison [EMAIL PROTECTED] writes: my $a :shared; $a += $b; AB If you read my suggestion carefully, you would see that I explicitly AB covered this case and said that the internal consistency of $a would AB always be maintained (it would have to be otherwise the interpreter AB would explode), so two threads both adding to a shared $a would result AB in $a being updated appropriately - it is just that you wouldn't know AB the order in which the two additions were made. You aren't being clear here. fetch($a) fetch($a) fetch($b) ... add ... store($a) store($a) Now all of the perl internals are done 'safely' but the result is garbage. You don't even know the result of the addition. Without some of this minimal consistency, every shared variable, even those without cross-variable consistency requirements, will need locks sprinkled around. AB I think you are getting confused between the locking needed within the AB interpreter to ensure that it's internal state is always consistent and AB sane, and the explicit application-level locking that will have to be in AB multithreaded perl programs to make them function correctly. AB Interpreter consistency and application correctness are *not* the same AB thing. I just said the same thing to someone else. I've been assuming that perl would make sure it doesn't dump core. I've been arguing for having perl do a minimal guarantee at the user level. my %h :shared; $h{$xyz} = $somevalue; my @queue :shared; push(@queue, $b); AB Again, all of these would have to be OK in an interpreter that ensured AB internal consistency. The trouble is if you want to update both $a, %h AB and @queue in an atomic fashion - then the application programmer MUST AB state his intent to the interpreter by providing explicit locking around AB the 3 updates. Sorry, internal consistency isn't enough. Doing that store of a value in $h, or pushing something onto @queue, is going to be a complex operation.
If you are going to keep a lock on %h while the entire expression/statement completes, then you have essentially given me an atomic operation, which is what I would like. I think we all would agree that an op is atomic. +, op=, push, delete, exists, etc. Yes? Then let's go on from there. chaim -- Chaim Frenkel          Nonlinear Knowledge, Inc. [EMAIL PROTECTED] +1-718-236-0183
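The fetch/add/store interleaving above is the classic lost-update race, and the fix being argued for — one lock spanning the whole op — is easy to show in C. This is a generic illustration, not Perl internals: `Shared`, `add_racy`, and `add_atomic` are made-up names.

```c
#include <assert.h>
#include <pthread.h>

typedef struct { pthread_mutex_t mu; long value; } Shared;

/* The broken shape: fetch and store are each internally "safe", but
 * nothing spans them, so two threads can both read the old value and
 * one addition is lost. */
static void add_racy(Shared *a, long b)
{
    long t = a->value;   /* fetch($a)            */
    a->value = t + b;    /* add ... store($a)    */
}

/* The atomic shape: one lock held across fetch, add, and store, which
 * is exactly what "keep the lock for the whole statement" buys. */
static void add_atomic(Shared *a, long b)
{
    pthread_mutex_lock(&a->mu);
    a->value += b;
    pthread_mutex_unlock(&a->mu);
}
```

With `add_atomic`, two threads adding to the same variable always produce the sum of both additions; only the order is unspecified, which is the guarantee the post is asking for.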
Re: RFC 130 (v4) Transaction-enabled variables for Perl6
"JH" == Jarkko Hietaniemi [EMAIL PROTECTED] writes: I don't think we can do this immediately. Can you come up with the right API and/or hooks that are needed so that it might be retrofitted? JH I think language magic helping to do the user level data locking is JH a dead-in-the-water idea. For some very simple problems it may work, JH but that support need not be implemented by internals. I'm not advocating language magic. Just what level of support from the core is needed to be able to do this properly. The inter-thread communication and (the actual) locking would need to be in the core. Other callbacks and notifications would be delivered by the core. I think this is similar to what was done for perldb. Or what might happen for n-D matrix/tensors/arrays/whatevers. A pluggable replacement module for doing transactions. Though I don't think you would mind having

    sub mycritical : lock { }        # critical section here
or
    sub onlyone : method, lock { }   # lock the object/class

chaim -- Chaim Frenkel          Nonlinear Knowledge, Inc. [EMAIL PROTECTED] +1-718-236-0183
Re: RFC 178 (v2) Lightweight Threads
"NI" == Nick Ing-Simmons [EMAIL PROTECTED] writes: NI The snag with attempting to automate such things is illustrated by :
NI     thread A              thread B
NI     $a = $a + $b++;       $b = $b + $a++;
NI So we need to 'lock' both $a and $b both sides. NI So thread A will attempt to acquire locks on $a,$b (say) NI and (in this case by symmetry but perhaps just by bad luck) thread B will NI go for locks on $b,$a - opposite order. They then both get 1st lock NI they wanted and stall waiting for the 2nd. We are then in NI a "classic" deadly embrace. NI So the 'dragons' that Dan alludes to are those of intuiting the locks NI and the sequence of the locks to acquire, deadlock detection and backoff, ... Agreed. But for a single 'statement', it may be possible to gather all the objects needing a lock and then grab them in order (say by address). Also the thread doesn't need to make any changes until all the locks are available, so a backoff algorithm may work. This would keep a _single_ statement 'consistent'. But wouldn't do anything for anything more complex. chaim -- Chaim Frenkel          Nonlinear Knowledge, Inc. [EMAIL PROTECTED] +1-718-236-0183
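The "grab them in order, say by address" idea above is a standard deadlock-avoidance technique and fits in a few lines. A minimal sketch with invented names (`lock_pair`, `unlock_pair`): by sorting the two locks by pointer address before acquiring, every thread takes them in the same global order, so the A-then-B vs. B-then-A embrace cannot arise.

```c
#include <assert.h>
#include <pthread.h>

/* Acquire two mutexes in a canonical (address) order, so that any two
 * threads locking the same pair can never deadlock on each other. */
static void lock_pair(pthread_mutex_t *x, pthread_mutex_t *y)
{
    if (x > y) { pthread_mutex_t *t = x; x = y; y = t; }
    pthread_mutex_lock(x);           /* lower address first, always */
    if (x != y)
        pthread_mutex_lock(y);       /* guard against x == y */
}

static void unlock_pair(pthread_mutex_t *x, pthread_mutex_t *y)
{
    pthread_mutex_unlock(x);
    if (x != y)
        pthread_mutex_unlock(y);
}
```

Thread A calling `lock_pair(&a, &b)` and thread B calling `lock_pair(&b, &a)` both acquire in the same internal order. As the post notes, this only covers locks known up front for a single statement; anything spanning statements still needs detection and backoff.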
Re: RFC 178 (v2) Lightweight Threads
"DS" == Dan Sugalski [EMAIL PROTECTED] writes: DS Well, there'll be safe access to individual variables when perl needs to DS access them, but that's about it. DS Some things we can guarantee to be atomic. The auto increment/decrement DS operators can be reasonably guaranteed atomic, for example. But I don't DS think we should go further than "instantaneous access to shared data will DS see consistent internal data structures". This is going to be tricky. A list of atomic guarantees by perl will be needed. $a[++$b]; pop(@a); push(@a, @b); Will these be atomic? And given that users will be doing the locking, what do you see for handling deadlock detection and recovery/retry? chaim -- Chaim Frenkel          Nonlinear Knowledge, Inc. [EMAIL PROTECTED] +1-718-236-0183
Re: RFC 136 (v2) Implementation of hash iterators
"TH" == Tom Hughes [EMAIL PROTECTED] writes: TH I guess we can translate all uses of keys and values when doing TH the p52p6 conversion - so that this: TH foreach my $x (keys %y) TH { TH $y{$x+1} = 1; TH } TH becomes: TH foreach my $x (@temp = keys %y) TH { TH $y{$x+1} = 1; TH } TH Of course we'd have to make sure the optimiser realised that the TH assignment caused the list to be flattened so it didn't try and TH elide the 'useless' assignment ;-) Why not

    lock(%y);
    foreach my $x (keys %y)
    {
        $y{$x+1} = 1;
    }
    unlock(%y);

Hmm, I just realized, perhaps we can just punt. Any p5 program that doesn't use Threads can be left alone. Using p5 threads would then need manual intervention. chaim -- Chaim Frenkel          Nonlinear Knowledge, Inc. [EMAIL PROTECTED] +1-718-236-0183
Re: RFC 178 (v2) Lightweight Threads
"UG" == Uri Guttman [EMAIL PROTECTED] writes: UG i don't see how you can do atomic ops easily. assuming interpreter UG threads as the model, an interpreter could run in the middle of another UG and corrupt it. most perl ops do too much work for any easy way to make UG them atomic without explicit locks/mutexes. leave the locking to the UG coder and keep perl clean. in fact the whole concept of transactions in UG perl makes me queasy. leave that to the RDBMS and their ilk. If this is true, then give up on threads. Perl will have to do atomic operations, if for no other reason than to keep from core dumping and maintaining sane states. If going from an int to a bigint is not atomic, woe to anyone using threads. If it is atomic, then the ++ has to be atomic, since the actual operation isn't complete until it finishes. Think ++$a (before: int; after the ++, the value is bigint). Some series of points (I can't remember what they are called in C) where operations are considered to have completed will have to be defined; between these points operations will have to be atomic. chaim -- Chaim Frenkel          Nonlinear Knowledge, Inc. [EMAIL PROTECTED] +1-718-236-0183
Re: RFC 136 (v2) Implementation of hash iterators
"TH" == Tom Hughes [EMAIL PROTECTED] writes: TH I wasn't just talking about the threaded case though - the point TH which I was making was that of what happens if a single threaded TH program alters a hash in the middle of iterating it. TH Currently keys and values are flattened when they are seen so any TH change to the hash is not reflected in the resulting list. If we TH are iterating instead of flattening then we need to address that TH somehow. I'd rather not have the expansion performed. Some other mechanism, either under the covers or perhaps even specified in the language. The only real issue is if the change affects the iterator order. Changes to values should be allowed without any adverse effects. Changes to the iterator order (inserted/deleted keys, push/pop) can be handled either as "don't do that", or queued up until the iterator is done or past the affected point. I'm partial to the don't-do-that approach. It can easily be handled at the user level. delete @hash{@delete_these}; @hash{keys %add_these} = values %add_these; chaim (Hmm, push(%hash, %add_these)) -- Chaim Frenkel          Nonlinear Knowledge, Inc. [EMAIL PROTECTED] +1-718-236-0183
Re: RFC 136 (v2) Implementation of hash iterators
"DS" == Dan Sugalski [EMAIL PROTECTED] writes: This could be a lot more efficient than modifying the vtbl and filling up the stack with the keys. I really am suspicious of replacing the vtbl entry; there may be more than one thread working its way through the hash. DS Or have a "next" vtable function that takes a token and returns the next DS entry in the variable. Each iterator keeps its own "current token" and the DS variable's just responsible for figuring out what should get returned next. DS We could also have a "prev" entry to walk backwards, if we wanted. The problem to be handled is how to modify the hash/array while the iterator is live. (Multiple active iterators, in multiple threads.) Even my versioning suggestion is problematic. We have the problem of inconsistent views of the hash. Even if iterator A only looks at version A (e.g. generation number), what version should its changes apply toward? Or should we simply make an iterator lock out access until completed? How about this: iterators lock the hash upon the first access and release it when either finished, reset or destroyed. Your mechanism is more like a seek, read, tell sequence without any guarantees between accesses. chaim -- Chaim Frenkel          Nonlinear Knowledge, Inc. [EMAIL PROTECTED] +1-718-236-0183
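The "lock on first access, release on finish/reset/destroy" proposal can be sketched concretely. This is a toy illustration, not hash internals: the "hash" is just an array of keys, and `Hash`, `Iter`, and `iter_next` are invented names. The point is the lock's lifetime, which spans the whole iteration rather than each individual `next` call.

```c
#include <assert.h>
#include <pthread.h>

/* Toy "hash": a lock plus a flat list of keys. */
typedef struct { pthread_mutex_t mu; int nkeys; int keys[16]; } Hash;

/* An iterator: its position, and whether it currently holds the lock. */
typedef struct { Hash *h; int pos; int live; } Iter;

/* Returns 1 and fills *key while entries remain; takes the hash lock on
 * the first call and drops it only when the iteration is exhausted. */
static int iter_next(Iter *it, int *key)
{
    if (!it->live) {
        pthread_mutex_lock(&it->h->mu);   /* first access: lock the hash */
        it->live = 1;
    }
    if (it->pos < it->h->nkeys) {
        *key = it->h->keys[it->pos++];
        return 1;
    }
    pthread_mutex_unlock(&it->h->mu);     /* finished: release */
    it->live = 0;
    return 0;
}
```

A reset or destroy path would need the same unlock step; and as the post implies, a second iterator in another thread simply blocks until the first finishes, which is the cost of this scheme.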
Re: YAVTBL: yet another vtbl scheme
"BS" == Benjamin Stuhl [EMAIL PROTECTED] writes: BS variables BS "know" how to perform ops upon themselves. An operation is BS separate from the data it operates on. Therefore, I propose BS the following vtbl scheme, with two goals: The point is to avoid the switches. There is no need for a type flag, in the normal high speed operation. The operator that's in the right place at the right time gets called. BS opcodes. Every op checks to see if it is overloaded, and if BS it is, calls that. Some ops don't need to (ie, vec() can BS just do a set_string and add an OVERLOAD for the bitwise BS ops). But that's against the point. Nothing has to check. The only operation that is called is the correct one. chaim -- Chaim FrenkelNonlinear Knowledge, Inc. [EMAIL PROTECTED] +1-718-236-0183
Re: RFC 178 (v1) Lightweight Threads
"SWM" == Steven W McDougall [EMAIL PROTECTED] writes: Aha, I get it. -internals has been assuming that one _must_ specify the sharing. You want it to be inferred. I think that's asking for too much DWIMery. SWM Question: Can the interpreter determine when a variable becomes SWM shared? SWM Answer: No. Then neglecting to put a :shared attribute on a shared SWM variable will crash the interpreter. This doesn't seem very Perlish. Err, no. It won't crash the interpreter. It'll make the script operate incorrectly. SWM Answer: Yes. Then the interpreter can take the opportunity to install SWM a mutex on the variable. But do we want to? Perhaps we want to take away the value from the other thread. There are two different accesses that you weren't clear about. One is access to the name ($a). The other is to the contents. In the case of regular thingees, the name is the thingee. But the case is different for references. There are two accesses needed: one to the name holding the reference, the other to the referenced item. I think what you are discussing is probably a reference. my $a :shared; my $b; $a = \$b; Otherwise, what is the problem? In this case, putting a reference into a shared variable would wrap the reference. Would this satisfy your needs? chaim -- Chaim Frenkel          Nonlinear Knowledge, Inc. [EMAIL PROTECTED] +1-718-236-0183
Re: RFC 178 (v1) Lightweight Threads
"SWM" == Steven W McDougall [EMAIL PROTECTED] writes: SWM All I want the language to guarantee is internal thread-safety. SWM Everything else is the user's problem. Somehow I would have thought that goes without saying. But I don't agree; saying all the rest is a user issue is too short-sighted. The job of perl is to make the easy things easy, and the hard things possible. Single-thingee access mediation should be done automatically by perl. The multi-thingee complex mediation should have the user step in, since solving it (correctly and efficiently) is a complex problem. chaim -- Chaim Frenkel          Nonlinear Knowledge, Inc. [EMAIL PROTECTED] +1-718-236-0183
Re: RFC 136 (v2) Implementation of hash iterators
"PRL" == Perl6 RFC Librarian [EMAIL PROTECTED] writes: PRL =head2 Freezing state for keys and values efficiently PRL I believe this problem can be solved by using the vtable for the PRL hash to wrap any mutating functions with a completion routine that PRL will advance the iterator to completion, creating a temporary list PRL of copied keys/values that it can then continue to iterate over. Have versioned hash entries. Iterators would know what version of the hash they are operating on. This could be a lot more efficient than modifying the vtbl and filling up the stack with the keys. I really am suspicious of replacing the vtbl entry, there may be more than one thread working its way through the hash. chaim -- Chaim FrenkelNonlinear Knowledge, Inc. [EMAIL PROTECTED] +1-718-236-0183
Re: RFC 178 (v2) Lightweight Threads
"SWM" == Steven W McDougall [EMAIL PROTECTED] writes: PRL All threads see the same compiled subroutines Why? Why not allow two different threads to have a different view of the universe? SWM 1. That's how it works in compiled languages. You have one .exe, and SWM all threads run it. Perl is not C. One of its strengths is its introspection. SWM 2. Thread programming is difficult to begin with. A language where SWM different threads see different code could be *very* difficult to SWM program in. I'm thinking of threads as fork on steroids. Fork doesn't let you easily share things. What we really should get is the isolation of fork, but with the ease of sharing what is necessary. And I don't know about you, but I don't see what is morally wrong with having one thread using foo and getting 7 back and another using foo and getting a -42. PRL All threads share the same global variables _All_ or only as requested by the user (a la :shared)? SWM All. Dan has gone through this with perl5 and he really would rather not have to go through that. He would like the amount of data that needs protection reduced. You are also creating problems when I don't want mediation. What if I know better than perl and I want to use a single item to protect a critical section? PRL Threads can share block-scoped lexicals by passing a reference to a PRL lexical into a thread, by declaring one subroutine within the scope of PRL another, or with closures. Sounds complex to me. Why not make it simply visible by marking it as such? SWM These are the ways in which one subroutine can get access to the SWM lexical variables of another in Perl5. RFC178 specifies that these SWM mechanisms work across threads. References are a completely different animal than access. A data item is independent of a thread. It is a chunk of memory. If a thread can see it, then it is available. PRL The interpreter guarantees data coherence It can't, don't even try. What if I need two or more variables kept in sync?
The user has to mediate. Perl can't determine this. SWM Data coherence just means that the interpreter won't crash or corrupt SWM its internal data representation. RFC178 uses the term *data SWM synchronization* for coordinating access to multiple variables between SWM threads. Then this RFC seems to be confusing two things. This is for -internals; we don't even have any internal structures yet, so how can you be protecting them? If you are working at the language level, this is the wrong forum. Perhaps I'm archaic, but I really wouldn't mind if the thread model basically copied the fork() model and required those variables that have to live across threads to be marked as :shared. SWM Sigh...if that's the best I can get, I'll take it. I'm not the decider here, I'm just pointing out another way to look at the problem. I really don't think you want to have _all_ variables actually visible. Even if they were, you will most likely have only a limited number that you want visible. chaim -- Chaim Frenkel          Nonlinear Knowledge, Inc. [EMAIL PROTECTED] +1-718-236-0183
Re: A tentative list of vtable functions
"DS" == Dan Sugalski [EMAIL PROTECTED] writes: DS Okay, here's a list of functions I think should go into variable vtables. DS Functions marked with a * will take an optional type offset so we can DS handle asking for various permutations of the basic type.
DS   type
DS   name
What are these used for?
DS   get_bool
Is this allowed to return a non-true/false result? Or is everything true or false?
DS   get_string *
DS   get_int *
DS   get_float *
What does the optional type argument do? What about collection types? Why not simply collapse these into a single one with an option argument? What should a get_* do if inappropriate for its type?
DS   get_value
What does this do that the above three do not?
DS   set_string *
DS   set_int *
DS   set_float *
DS   set_value
DS   add *
DS   subtract *
DS   multiply *
DS   divide *
DS   modulus *
Where is the argument to be added/subtracted/etc.? On the stack?
DS   clone (returns a new copy of the thing in question)
Isn't this a deep-copy problem? (Should be near the top of the vtbl.)
DS   new (creates a new thing)
Why does the thing have to do a new? (Move earlier.)
DS   concatenate
DS   is_equal (true if this thing is equal to the parameter thing)
DS   is_same (True if this thing is the same thing as the parameter thing)
How does THIS figure out how to get THAT to give a usable value?
DS   logical_or
DS   logical_and
DS   logical_not
DS   bind (For =~)
DS   repeat (For x)
Are these so that operators can be overridden? DS Anyone got anything to add before I throw together the base vtable RFC? Are you going to fully specify the expected input and results in the RFC? chaim -- Chaim Frenkel          Nonlinear Knowledge, Inc. [EMAIL PROTECTED] +1-718-236-0183
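The shape of the vtable being debated can be sketched in C. This is a rough illustration only — the slot names are a tiny subset chosen for the example, not Dan's actual list, and `P6Variable`/`P6Vtable` are assumed names. Each variable carries a pointer to a table of function pointers, so an op like `add` dispatches through the operand's own table with no type switch, matching the head's `addend->vtable->add(addend, adder)` pattern.

```c
#include <assert.h>

typedef struct P6Variable P6Variable;

/* One function pointer per operation: the variable "knows" how to
 * perform ops on itself. */
typedef struct {
    long (*get_int)(P6Variable *self);
    void (*set_int)(P6Variable *self, long v);
    void (*add)(P6Variable *self, P6Variable *other);
} P6Vtable;

struct P6Variable {
    const P6Vtable *vtable;
    long ival;                 /* body for a plain integer scalar */
};

static long iv_get(P6Variable *s)          { return s->ival; }
static void iv_set(P6Variable *s, long v)  { s->ival = v; }

/* The addend's table decides how to do the addition and what type
 * results; here it asks the other operand for an int via ITS table. */
static void iv_add(P6Variable *s, P6Variable *o)
{
    s->ival += o->vtable->get_int(o);
}

static const P6Vtable iv_vtable = { iv_get, iv_set, iv_add };
```

A bigint or tied scalar would supply a different table with the same slots, and the calling op code would not change — that is the "nothing has to check" property argued for in the YAVTBL thread above.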
Re: A tentative list of vtable functions
"NI" == Nick Ing-Simmons [EMAIL PROTECTED] writes: NI Dan Sugalski [EMAIL PROTECTED] writes: is_equal (true if this thing is equal to the parameter thing) is_same (True if this thing is the same thing as the parameter thing) NI is_equal in what sense? (String, Number, ...) NI and how is is_same different from just comparing addresses of the things? Proxies? Wrappers? The proxy might want to answer on behalf of the proxied. chaim -- Chaim FrenkelNonlinear Knowledge, Inc. [EMAIL PROTECTED] +1-718-236-0183
Re: RFC 130 (v4) Transaction-enabled variables for Perl6
"KF" == Ken Fox [EMAIL PROTECTED] writes: KF Chaim Frenkel wrote: You are now biting off quite a bit. KF What good is half a transaction? If transactions are to be useful, KF they should be fully supported -- including rolling back stuff some KF third party module did to its internal variables. (Maybe that's a KF little extreme ;) Definitely, especially if the third party already wrote something into a file or sent it out over the network. I believe that this will increase the deadlock possibilities. Without a transaction, it might have been possible to get in and out of a subroutine without holding the lock except within the subroutine. With a transaction, all variables touched within the dynamic scope of the transaction will have to be locked. KF Dead lock detection? Throw an exception and re-try the transaction? You haven't handled the problem of external changes. Re-trying may not be a valid option. KF I don't think there's any good answer for this. Joe Programmer must avoid KF shooting himself in the foot. (BTW, if there's a possibility for deadlock KF with a transaction, there's the possiblity of data corruption without KF a transaction. I don't think transactions increase your exposure.) Hmm, no, a transaction needs to hold the locks longer. A non-transaction version would simply lock the variable until it was done. Different problem space. chaim -- Chaim Frenkel          Nonlinear Knowledge, Inc. [EMAIL PROTECTED] +1-718-236-0183
Re: RFC 130 (v4) Transaction-enabled variables for Perl6
You are now biting off quite a bit. This is the generic problem that all database systems have to solve to handle transactions. I believe that this will increase the deadlock possibilities. Without a transaction, it might have been possible to get in and out of a subroutine without holding the lock except within the subroutine. With a transaction, all variables touched within the dynamic scope of the transaction will have to be locked. chaim "KF" == Ken Fox [EMAIL PROTECTED] writes: KF I was thinking about this same problem while reading RFC 130. It seems KF like transactions and exceptions are closely related. Should we combine KF the two?

    KF try transaction {
    KF     ...
    KF }

KF That's a really interesting extension to exceptions -- code that has KF no side-effects unless it finishes. BTW, how useful are transactions on KF internal variables if we don't provide I/O transactions too? KF Since the "transaction" keyword can only appear after "try", it KF doesn't have to appear as a new global keyword. KF This type of language feature strikes me as something similar to the KF Pascalish "with" proposal -- the "transaction" keyword triggers munging KF of the variables used in the following block. Obviously the munging KF is very different between these, but if we allow the general concept KF of munging the intermediate code (parse tree or OP tree or whatever), KF then both "with" and "transaction" might be implemented from user code KF outside the core. -- Chaim Frenkel  Nonlinear Knowledge, Inc. [EMAIL PROTECTED] +1-718-236-0183
Re: RFC 130 (v4) Transaction-enabled variables for Perl6
"SF" == Steve Fink [EMAIL PROTECTED] writes: SF Or what about a variable attribute: SF my $x : transactional SF and making the effect completely lexical? Why would other scopes need to SF see such variables? You haven't handled the multiple variable variety. You will need to able to have a group of variables be transaction protected. chaim -- Chaim FrenkelNonlinear Knowledge, Inc. [EMAIL PROTECTED] +1-718-236-0183
Re: RFC 146 (v1) Remove socket functions from core
"TC" == Tom Christiansen [EMAIL PROTECTED] writes: TC Perl is *not* fun when it supplies nothing by default, the way C does(n't). TC If you want a language that can do nothing by itself, fine, but don't TC call it Perl. Given these: I'm not sure that we are talking about the same thing. My understanding of -internals (and Dan) is that all the current perl (or whatever Larry leaves) will continue to be there. It is an internals issue where it really lives. So if socket() is removed from the core (the executable). Perl upon noticing a socket() without a user specified use that might override it. Will transparently make it available along with all the associated constants. If a performance hit arises, then -internals will address the issue. But nothing in the language will change. I made the suggestino a while back, that if this is true for core. It might be feasible for non-core modules (assuming some sort of registry) so that an implicit use might be performed. (I'm ignoring the problems of multiple versions or multiple conflicting routines of the same name.) Are we still far apart? chaim -- Chaim FrenkelNonlinear Knowledge, Inc. [EMAIL PROTECTED] +1-718-236-0183
Re: RFC 155 (v1) Remove geometric functions from core
I don't think that you should require a use. That is too violent a change. Moving things that were in the core of Perl5 out should be invisible to the user. I strenuously object to having to add a use for every stupid module. Anything that is part of the shipped perl should not need a use. The entire set of constants and namespace should be immediately available. The only possible use for a use for core functions would be to pass options or perhaps to select a non-default version. Modules that are from CPAN or local should be able to be promoted to autoloadable by some simple mechanism. chaim -- Chaim Frenkel  Nonlinear Knowledge, Inc. [EMAIL PROTECTED] +1-718-236-0183
Re: RFC 127 (v1) Sane resolution to large function returns
I'm missing what you are trying to say. Are you suggesting that $foo = @bar no longer mean ($foo = scalar(@bar)) == 3? I wasn't suggesting going that far. Just a little more DWIM. So that

    ($foo, @bar, $baz) = (1,2,3)        # $foo = 1, @bar = (2,3), $baz = undef
                                        # or: $foo = 1, @bar = (2), $baz = 3
    ($foo, @bar, $baz) = (1,(2,3),4)    # $foo = 1, @bar = (2,3), $baz = 4

But

    ($foo, $baz, @bar) = (1,(2,3),4)    # $foo = 1, $baz = 2, @bar = (3,4)

Actually, looking at it like that makes it an ugly situation. The 'new' expectation would be to have it become

    # $foo = 1, $baz = 2, @bar = (4)

*blech*, I'm glad that you're doing the thinking. chaim "LW" == Larry Wall [EMAIL PROTECTED] writes: LW Chaim Frenkel writes: LW : LW P.S. I think we *could* let @foo and %bar return an object ref in scalar LW : LW context, as long as the object returned overloads itself to behave as LW : LW arrays and hashes currently do in scalar context. LW : LW : Isn't this an internals issue? LW Not completely. The scalar value would visibly be a built-in object:

    LW @bar = (0,1,2);
    LW $foo = @bar;    # now means \@bar, not (\@bar)->num
    LW print ref $foo, $foo->num, $foo->str, ($foo->bool ? "true" : "false");
    LW ^D
    LW ARRAY3(0,1,2)true

LW One implication of this approach is that we'd break the rule that says LW references are always true. Not clear if that's a problem. It's basically LW already broken with bool overloading, and defined still works. -- Chaim Frenkel  Nonlinear Knowledge, Inc. [EMAIL PROTECTED] +1-718-236-0183
Re: RFC 127 (v1) Sane resolution to large function returns
"LW" == Larry Wall [EMAIL PROTECTED] writes: LW Dan Sugalski writes: LW : And do we want to consider making this (and its ilk) Do The Right Thing? LW : LW :(@foo, @bar) = (@bar, @foo); LW We certainly want to consider it, though perhaps not in -internals. LW You can talk about passing @bar and @foo around as lazy lists, and LW maybe even do lazy list-flattening, but I don't see how that works yet, LW even in the absence of overlap. The basic issue here may come LW down to whether the LHS of an assignment can supply a prototype for the LW entire assignment that forces everything to be treated as objects LW rather than lists. LW That is, right now, we can only have a scalar assignment prototype of ($), LW and a list assignment prototype of (@). We need a prototype (not just LW for assignment) that says "all the rest of these arguments are objects", LW so we don't have to use prototypes like (;\@\@\@\@\@\@\@\@\@\@\@\@\@\@\@). LW Or (\@*) for short. Isn't that what Damian's named (whatever, formerly known as prototypes) does? @_ in the absence of a named argument list would continue to act _as if_ the argument list were flattened. With a named argument list, it would make the actual refs on the stack visible. The question that I think Dan proposed is how much breakage would infering (@foo, @bar) = (@bar, @foo) to mean treat RHS as objects, cause. Wouldn't having and @ anywhere but the last position in the list would be a useful indicator. I can see someone (Probably Randal or Tom) wanting to initialize a list that way. But for the rest of us, it isn't that useful. LW P.S. I think we *could* let @foo and %bar return an object ref in scalar LW context, as long as the object returned overloads itself to behave as LW arrays and hashes currently do in scalar context. Isn't this an internals issue? chaim -- Chaim FrenkelNonlinear Knowledge, Inc. [EMAIL PROTECTED] +1-718-236-0183
Re: RFC 127 (v1) Sane resolution to large function returns
"DS" == Dan Sugalski [EMAIL PROTECTED] writes: DS At 02:25 PM 8/24/00 -0400, Chaim Frenkel wrote: But ($foo, $baz, @bar) = (1,(2,3),4) # $foo = 1 $baz=2, @bar=(3,4) Actually, looking at it like that makes it an ugly situation. The 'new' expectation would be to have it become # $foo=1 $baz=2 @bar=(4) DS Wouldn't that be $baz = 3, since the middle list would be taken in scalar DS context? Don't make me go there. I already abhor, sub foo { (3,2,1) } $x = foo; # $x == 1 I wish Larry would let it go away, a quick death. I can see the horror from the internals standpoint. You have to special case a literal 'list'. chaim -- Chaim FrenkelNonlinear Knowledge, Inc. [EMAIL PROTECTED] +1-718-236-0183
Re: RFC 136 (v1) Implementation of hash iterators
"LW" == Larry Wall [EMAIL PROTECTED] writes: LW Dan Sugalski writes: LW : I have had the "Well, Duh!" flash, though, and now do realize that having LW : multiple iterators over a hash or array simultaneously could be rather handy. LW You can also have the opposite "Well, Duh!" flash and realize that most LW DBM implementations only support a single iterator at a time. For some LW definition of support. That's the main reason for Perl's current LW limitation. I don't understand this. Are there DBM's that don't understand nextkey? Isn't this the another version of having an indirection? DBM's that don't allow multiple iterators means the porter to the DBM has to supply a wrapper that does. chaim -- Chaim FrenkelNonlinear Knowledge, Inc. [EMAIL PROTECTED] +1-718-236-0183
Re: Do threads support SMP?
This belongs on -internals. The threading model will probably be identical for all ports. And my suspicion is that -internals will use whatever the platform provides. I don't think we want to write a portable threading capability. chaim "SWM" == Steven W McDougall [EMAIL PROTECTED] writes: SWM Does Perl6 support Symmetric MultiProcessing (SMP)? SWM This is a *huge* issue. It affects everything else that we do with SWM threads. -- Chaim FrenkelNonlinear Knowledge, Inc. [EMAIL PROTECTED] +1-718-236-0183
Re: RFC 130 (v1) Transaction-enabled variables for Perl6
I don't think you should even attempt to version/transaction protect a tied variable. Anything that leaves the memory or could leave the memory (e.g. socket write) should probably not be versioned. Unless the tied variable somehow is able to tie itself into the transaction manager. It is up for grabs. But if it participates then as far as the 'caller' or user is concerned it looks like a variable and acts like a variable. It must be a variable. chaim "d" == dLux [EMAIL PROTECTED] writes: d /--- On Thu, Aug 17, 2000 at 06:17:51PM -0400, Chaim Frenkel wrote: d | Though this is a tough problem especially in the face of threads. d | Though versioned variables may be able to remove most of the d | locking d | issues and move it down into the commit phase. d Yes, but you can give strange results when using a versioned tied d variable. d That's why I thought versioning and more intelligent locking has d quite a lot overhead. d \--- -- Chaim FrenkelNonlinear Knowledge, Inc. [EMAIL PROTECTED] +1-718-236-0183
Re: RFC 130 (v1) Transaction-enabled variables for Perl6
Think about deadlock, extra overhead, and time taken to take the lock. If a set of variables should be locked as a unit, only one mutex should be assigned. What to do about overlapping members? chaim "DLN" == David L Nicol [EMAIL PROTECTED] writes: DLN I wrote a transaction-enabled database in perl 5, using fork() DLN for my multithreading and flock() for my mutexes. It continues DLN to work just fine. DLN Threading gives us mutexes. Are you just saying that DLN every variable should have its own mutex, and perl's = assignment DLN operator should implicitly set the mutex? DLN Giving every variable its own mutex would mean we could have DLN greater control if needed --- something like flock(SH|EX|NB|UN) DLN semantics on anything that can be an Lvalue. DLN How about rewriting it as an extension to flock() involving mutexes DLN and sending it to perl6-language? -- Chaim Frenkel  Nonlinear Knowledge, Inc. [EMAIL PROTECTED] +1-718-236-0183
Re: RFC 35 (v1) A proposed internal base format for perl
"LW" == Larry Wall [EMAIL PROTECTED] writes: LW On the other hand, targeting JVM and IL.NET might keep us honest enough. What is IL.NET? chaim -- Chaim FrenkelNonlinear Knowledge, Inc. [EMAIL PROTECTED] +1-718-236-0183
Re: RFC 35 (v1) A proposed internal base format for perl
"DS" == Dan Sugalski [EMAIL PROTECTED] writes: A TIL doesn't have to be machine code. A first pass for a port that does the TIL inner loop in C, should be quite portable. DS Okay, now I'm confused. I was under the impression that doing a TIL really DS required getting down 'n dirty with assembly-level jmps and jsrs, the DS alternative being the op-loop we've got now, more or less. DS Or are you thinking of having the op functions just goto the start of the DS next one on the list? I'd think you'd blow your stack reasonably soon that DS way. (Why do I sense my reading list is about to get larger?) Err, no. Think of it this way. A TIL level sub ^TIL header code# ptr to real Function ^funcA # Start of a til function ^funcB # start of a til function ^funcC So all pointers point at a real function. In the lowest level case, it is pure machine code to be executed. If it is a TIL level sub, the pointer is to a routine that pushes the current TIL program counter and reenters the inner loop. This is with an inner loop. The dispatching could be sped up at the cost of space by converting the pointers into real calls, and replacing calls to push functions with real pushes. chaim -- Chaim FrenkelNonlinear Knowledge, Inc. [EMAIL PROTECTED] +1-718-236-0183
Welcome to Perl Vtbls
This discussion of vtbls and how to do cross products has me confused. I thought there were two different _real_ vtbls and an op tree. The low-level core 'objects' have a vtbl that handles operations on the object itself. The higher level perl user objects have a vtbl that handles method dispatch. Again, operations on the object itself. Cross operations (addition, concatenation, etc.) are handled in the optree. I can't see how objectA's vtbl can handle a cross-operation to objectB's vtbl. Enlightenment sought. chaim -- Chaim Frenkel  Nonlinear Knowledge, Inc. [EMAIL PROTECTED] +1-718-236-0183
Re: pragmas as compile-time-only
"LW" == Larry Wall [EMAIL PROTECTED] writes: LW No, they're only stored once per statement, as far as I recall. This LW is a great way to handle all sorts of lexically scoped things, provided LW they don't require finer specificity than a statement. Each new LW statement just rams a new cop pointer into curcop and you're done with LW it. Think of it as a funny kind of vtbl pointer. You potentially LW change a whole bunch of semantics by one pointer assignment. Any LW opcode within the statement can look up anything it likes in the LW current lexical context merely by following the curcop pointer back. What about those lexical pragmas that are manipulated at runtime. I don't think you want to have the values stored in the optree (or its replacement) Otherwise we get into major issues in handling the optree between threads (and reducing sharing.) A "parallel" structure to store those items that cover a range of statements (or parts of an optree) should do it. chaim -- Chaim FrenkelNonlinear Knowledge, Inc. [EMAIL PROTECTED] +1-718-236-0183
Re: RFC 35 (v1) A proposed internal base format for perl
"DS" == Dan Sugalski [EMAIL PROTECTED] writes: DS How much of the current base of target ports are you willing to give up in DS the first cut for fast? The TIL suggestion, amongst others, has the DS potential to speed things up rather a lot, but it has the disadvantage of DS requiring intimate knowledge of each target port. My preference is to get DS a snappy interpreter and leave the Java JIT-equivalents to the various DS chip/OS vendors, but I'd bet the TIL style would be faster. I don't think so. A TIL doesn't have to be machine code. A first pass for a port that does the TIL inner loop in C, should be quite portable. A faster port, that intimately understands the compiler and how to play with it, can be done at lesiure. chaim -- Chaim FrenkelNonlinear Knowledge, Inc. [EMAIL PROTECTED] +1-718-236-0183
Re: Internal Filename Representations (was Re: Summary of I/O related RFCs)
"JH" == Jarkko Hietaniemi [EMAIL PROTECTED] writes: JH I agree with Dan: people do seem to get into "have to" mode awfully JH soon and easily. The proposed framework is supposed to make it easy JH to handle file names and make Perl internals (well, not internal- JH internals, but the IO layer) better support multiplatform situations, JH both when scripts have to run on multiple platforms, and when JH an application is running on multiple platforms _simultaneously_, JH servers-clients, peers-peers, etc. Fine. Then all the 'parts' of a generic filename (or should that be a resource name) are pure strings? Who supplies the logic to scrunch them together into an understandable (to the os) form? And how do we make it easy to pass in a name to open? $fh = open Perl::Canonical(Host="remote" ,OS="VMS" ,Device= "DAO" ,Path=("path", "to", "resource") ,Name= "filename" ,Type="txt" ,Version= ":oldest" ); Seems messy. chaim -- Chaim FrenkelNonlinear Knowledge, Inc. [EMAIL PROTECTED] +1-718-236-0183
Re: Ideas On what a SV is
"DS" == Dan Sugalski [EMAIL PROTECTED] writes: DS At 10:26 PM 8/12/00 -0400, Chaim Frenkel wrote: If we are going to have pre-compiled modules (bytecode or machine) why bother with AUTOLOAD? The "registry" would handle the 'autoloading' DS It can't, though. We don't know at compile time (especially if we're DS compiling just a module) what AUTOLOAD could be called with. Sorry, What I meant was, the need for AUTOLOAD went down. Pulling in modules as needed, for speed reasons. If items are loaded on demand (socket, dbm, http, whatever) so can arbitrary pieces of code. Blue Sky All compiled code is in a special file that allows mmap access.[1] Then an unrecognized subroutine, can be checked against the registry. And an automagical use can be performed. /Blue Sky Actually that's a little too automagical, it might make typos turn into wierd calls. chaim [1] I'm remembering my MVS days with all code in PDS with the compiled code arranged on disk to quickly load itself into memory. chaim -- Chaim FrenkelNonlinear Knowledge, Inc. [EMAIL PROTECTED] +1-718-236-0183
Re: Program lifecycle
You may also want to be able to short-circuit some of the steps, especially where the startup time may outweigh the win of optimization. And there could be different execution engines: machine level, bytecode (and perhaps straight out of the syntax tree). Hmm, might that make some debugging easier? chaim "DS" == Dan Sugalski [EMAIL PROTECTED] writes: DS The structure I've been thinking of looks like:

    DS     Program Text
    DS          |
    DS          V
    DS    +-----------+
    DS    | Lex/parse |
    DS    +-----------+
    DS          |
    DS     Syntax tree
    DS          |
    DS          V
    DS    +-----------+
    DS    | Bytecoder |
    DS    +-----------+
    DS          |
    DS      Bytecodes
    DS          |
    DS          V
    DS    +-----------+
    DS    | Optimizer |
    DS    +-----------+
    DS          |
    DS     Optimized
    DS     bytecodes
    DS          |
    DS          V
    DS    +-----------+
    DS    | Execution |
    DS    |  Engine   |
    DS    +-----------+

DS With each box being replaceable, and the process being freezable between DS boxes. The lexer and parser probably ought to be separated, thinking about DS it, and we probably want to allow folks to wedge at least C code into each DS bit. (I'm not sure whether allowing you to write part of the optimizer in DS perl would be a win, but I suppose if it was saving the byte stream to disk...) -- Chaim Frenkel  Nonlinear Knowledge, Inc. [EMAIL PROTECTED] +1-718-236-0183
Re: Program lifecycle
"NT" == Nathan Torkington [EMAIL PROTECTED] writes: NT - source filters munge the pure source code NT - cpp-like macros would work with token streams NT - pretty printers need unmunged tokens in an unoptimized tree, which NTmay well be unfeasible I was thinking of macros as being passed some arguments but then can either manipulate the raw source code or ask the lexer/parser for parsed tokens. chaim -- Chaim FrenkelNonlinear Knowledge, Inc. [EMAIL PROTECTED] +1-718-236-0183
Re: Method call optimization.
"GB" == Graham Barr [EMAIL PROTECTED] writes: GB On Thu, Aug 10, 2000 at 04:54:25PM -0400, Dan Sugalski wrote: At 08:31 PM 8/10/00 +, Nick Ing-Simmons wrote: You just re-invented "look up the name in a hash table" ;-) You now have one big hash table rather than several small ones. Which _may_ give side benefits - but I doubt it. If we prefigure a bunch of things (hash values of method names, store package stash pointers in objects, and pre-lookup CVs for objects that are typed) it'll save us maybe one level of pointer indirection and a type comparison. If the object isn't the same type we pay for a type comparison and hashtable lookup. Not free, but not expensive either. 'Specially if we get way too clever and cache the new CV and type in the opcode for the next time around, presuming we'll have the same type the next time through. GB Or if someone has defined a new sub, you don't know it was not the GB one you stashed. Leave a stub behind at the old address that would trigger the repointing. (I'm not clear what to do for refcounting the old address) GB I am not sure it got into perl5, but pre-computing the hash value of GB the method name and storing it in the op is one thing. Maybe also trying GB a bit harder to keep package hashes more flat. GB Another thing that may help is if all the keys in package hashes are shared GB and also shared with constant method names in the op tree. Then when GB scanning the chain you only need do a pointer comparison and not a GB string compare. I don't think anything should be in the op tree. The optree (or whatever the engine is) should only be operations. Data should be either in the object, vtbl, or stack. chaim -- Chaim FrenkelNonlinear Knowledge, Inc. [EMAIL PROTECTED] +1-718-236-0183
Re: pragmas as compile-time-only
"GB" == Graham Barr [EMAIL PROTECTED] writes: A different op would be a better performance win. Even those sections that didn't want the check has to pay for it. GB That may not be completly true. You would in effect be increasing the GB size of code for perl itself. Whether or not it would be a win would GB depend on how many times the extra code caused a cache miss and a fetch GB from main memory. GB As Chip says, human intuition is a very bad benchmark. Does the cache hit/miss depend on the nearness of the code or simply on code path? Obviously having the checked version be a wrapper of the base op and near it on the same page would be a VM win. chaim -- Chaim FrenkelNonlinear Knowledge, Inc. [EMAIL PROTECTED] +1-718-236-0183
Re: Ramblings on base class for SV etc.
"DS" == Dan Sugalski [EMAIL PROTECTED] writes: DS The problem with that is that vtables need to be constant, and the list of DS subs in a package isn't constant, nor is it always known at compile time. Why? All you need is a constant location. Given that perl has to see _all_ subroutine names each new name is assigned an offset. (Or to save space an indirection to the offset.) If even the actual method is a variable, then we fall back to a name lookup to determine the offset. DS I'm all for wedging a pointer to the package symbol table into an object, DS and I'm all for pre-hashing the method name into the method call op, and DS I'm all for linking the @ISAs (or whatever they end up being) so that a DS change to a parent package propagates a weak ref into the child packages DS assuming there's no method of that name already. I'm not sure what else we DS can do, though. We will have to wait for the code to shape up. chaim -- Chaim FrenkelNonlinear Knowledge, Inc. [EMAIL PROTECTED] +1-718-236-0183
Multi-object locks (was Re: RFC 35 / Re: perl6-internals-gc sublist)
"JT" == John Tobey [EMAIL PROTECTED] writes: JT SVs are never downgraded, so no one's source and destination are JT another's respective destination and source. Maybe the above sequence JT isn't exactly right, but if we adhere to strict rules for lock JT sequencing, there won't be a deadlock, right? Which brings to mind, (probably more appropriate to -language) the user needs a mechanism to handle multi-object locking, or a clean method to order his lock aquisition. chaim -- Chaim FrenkelNonlinear Knowledge, Inc. [EMAIL PROTECTED] +1-718-236-0183
Re: Ramblings on base class for SV etc.
"DS" == Dan Sugalski [EMAIL PROTECTED] writes: DS I was thinking of that--the int portion has the number of keys/entries for DS both hashes and arrays, the char * holds the pointer to the sparse array DS vector or something. Sounds like a good candidate for Implementation DS Defined Data... Sounds like one of the vtbl entries would be dump_self Hmm, will vtbl get rid of all the magic hacks? chaim -- Chaim FrenkelNonlinear Knowledge, Inc. [EMAIL PROTECTED] +1-718-236-0183
pragmas as compile-time-only
Are there any (p5p) pragmas that have a runtime effect? Would requiring/limiting pragmas to have a strictly compile time effect have a win for internals (performance, development, whatever)? Would there be any breakage? For example, warnings: rather than checking the flag in each op, a wrapper op that does the check would dispatch to the underlying op. chaim -- Chaim Frenkel  Nonlinear Knowledge, Inc. [EMAIL PROTECTED] +1-718-236-0183
Re: RFC 38 (v1) Standardise Handling Of Abnormal Numbers
If one does 'use tristate' I would hope they know what they are doing or asking for. And in fact, I would want the warning turned off. One of the pains in doing sybperl is that I have to liberally sprinkle $^W=0 or do a lot of defined() ? :, to avoid the warnings. chaim "GB" == Graham Barr [EMAIL PROTECTED] writes: GB On Mon, Aug 07, 2000 at 06:05:30AM -0400, Chaim Frenkel wrote: What are the issues doing it through the vtbl of 'self'? Though if the op does it, there would be a different op under the tristate pragma. GB Not true. Right now you get a warning for use of uninit when undef GB is used. How about under tristate, these warnings just result in GB the return of undef. GB just a thought. -- Chaim Frenkel  Nonlinear Knowledge, Inc. [EMAIL PROTECTED] +1-718-236-0183
Re: Ramblings on base class for SV etc.
"NI" == Nick Ing-Simmons [EMAIL PROTECTED] writes: Hmm, will vtbl get rid of all the magic hacks? NI The "mg.c" 'magic hacks' are in essence applying vtable semantics (they NI are even called vtables in the sources) to a subset of "values". NI So yes vtables mean evrything is "magic" so nothing needs "special magic"... Some 'official' method of passing on calls will be needed. So that it is easier to write magic. I don't recall anyone commenting on my suggestion: Have every Package generate a vtbl for each subroutine in the package. Then when something is blessed into the package (if this is retained for OO) then the objects vtbl becomes the precompiled merger of vtbls based upon the inheritence tree. If we use a TIL for the engine then the vtbl is a direct pointer. Otherwise it can be trampoline code. chaim -- Chaim FrenkelNonlinear Knowledge, Inc. [EMAIL PROTECTED] +1-718-236-0183
Re: RFC 35 / Re: perl6-internals-gc sublist
Why waste space on flags? The vtbl should handle that. There are a limited number of valid combinations; the vtbl could alter the behavior, trading space for speed, with no need to check flags. Alter the vtbl instead. (Perhaps a simple state machine.) And why waste the GC data? I suspect that most variables will not be shared. Put the sync data into a separate area and then access the external data. If the sync data is in the SV, I believe there is a race condition between the destruction and grabbing a lock. And why carry around IV/NV and ptr? Make it into a union, to allow room. The string/number duality should be handled by the vtbl. So a string that is never accessed as a number doesn't waste the space. And numbers that are rarely accessed as a string save some room. And as needed the vtbl can be promoted to a duality version that maintains both. chaim "NI" == Nick Ing-Simmons [EMAIL PROTECTED] writes: NI So my own favourite right now allows a little more - to keep data and token NI together for cache, and to avoid extra malloc() in simple cases. NI Some variant like:

    NI struct {
    NI     vtable_t *vtable;     // _new_ - explicit type
    NI     IV        flags;      // sv_flags
    NI     void     *sync_data;  // _new_ - for threads etc.
    NI     IV        GC_data;    // REFCNT
    NI     void     *ptr;        // SvPV, SvRV
    NI     IV        iv;         // SvIV, SvCUR/SvOFF
    NI     NV        nv;         // SvNV
    NI }

NI The other extreme might be just a pointer with LS 3 bits snaffled for NI "mark" and two "vital" flags - but that just means above lives via the NI pointer and everything is one more de-ref away _AND_ needs masking NI except for "all flags zero" case (which had better be the common one). NI As I recall my LISP it has two pointers + flags -- Chaim Frenkel  Nonlinear Knowledge, Inc. [EMAIL PROTECTED] +1-718-236-0183
Re: RFC 38 (v1) Standardise Handling Of Abnormal Numbers
I was thinking of RFC'ing tri-state logic. Would it be worthwhile to make it separate or to extend your RFC? I'd like to be able to mimic what the rules for nulls are in databases. (I only know Sybase's rules, so there may be differences between vendors. Could someone that is aware of the ANSI standard chime in?)

    number + NULL = NULL
    number * NULL = NULL
    number - NULL = NULL
    number / NULL = NULL
    string + NULL = string
    value == NULL : false
    NULL  == NULL : false
    value != NULL : false
    NULL  != NULL : false

In Sybase, summing or averaging a column with nulls simply ignores the nulls. How this would affect the proposed reduce function, or perhaps the map/grep functions, I don't have a clear proposal. (I also found an interesting effect in Sybase: NaN != NaN is false and NaN == NaN is false.) chaim "PRL" == Perl6 RFC Librarian [EMAIL PROTECTED] writes: PRL The handling of various abnormal numeric entities like infinities PRL (positive, negative), not-a-numbers (NaNs, various kinds of those, PRL signaling, non-signaling), epsilons (positive and negative) is left to PRL the native math libraries. A more concerted effort to standardise the PRL behaviour of Perl across the platforms would be desirable. -- Chaim Frenkel  Nonlinear Knowledge, Inc. [EMAIL PROTECTED] +1-718-236-0183
Re: Language RFC Summary 4th August 2000
"BCW" == Bryan C Warnock [EMAIL PROTECTED] writes: BCW On Fri, 04 Aug 2000, Uri Guttman wrote: "s" == skud [EMAIL PROTECTED] writes: s Up for grabs: s - s Formats out of core BCW Somehow, I missed this message. BCW I don't think that's a language issue. Whether Perl continues to BCW support formats certainly is, but its location within Perl is more of BCW an internals thing. Not quite. It reflects up to the language. Are all non-core items, hard coded into the language or are they able to be recognized as an already installed module. This would help avoid the proliferation of uses. And let perl find the right use. The only need for use would then be to customize the behavior. chaim -- Chaim FrenkelNonlinear Knowledge, Inc. [EMAIL PROTECTED] +1-718-236-0183
Re: Improving tie() (Re: RFC 15 (v1) Stronger typing through tie.)
"DS" == Dan Sugalski [EMAIL PROTECTED] writes: DS The language semantics of tie strongly impact the internals. tie() is DS basically a declaration that the rules are completely different (and DS unknown at compile time) for the tied variable. Shoots down optimization a DS bunch, since access to a tied varible has to be treated as a function call DS rather than as access to data with known behaviours. Why? The vtbl for a tied variable would do all the work. Either the pointer is to a springboard into perl code, or internal code, or the XS replacement code. We might be able to add a hint hook to the module (or the vtbl) that would help optimization and the compiler. chaim -- Chaim FrenkelNonlinear Knowledge, Inc. [EMAIL PROTECTED] +1-718-236-0183
Re: date interface (was Re: perl6 requirements, on bootstrap)
Versions, dear boy. Versions. Don't forget versions. We will need them. (This really belongs on -internals. Reply-to: adjusted) And while we're here, does anyone understand kpathsea? Would it be a win? I think it would. chaim "DS" == Dan Sugalski [EMAIL PROTECTED] writes: DS lexer saw a name on the list it'd automagically mark that shared library DS for loading. DS So if the list had: DS localtime|$|@|op|time.so|localtime DS perl would know that the localtime function took a scalar, returned a list, DS is called like an opcode, and lives in time.so with a name of localtime. If DS (and only if) you used localtime, perl would load in time.so for you. In DS the optree (or bytecode stream or whatever) perl would have the DS I_cant_believe_its_not_an_opcode opcode with a pointer to the function we DS loaded in from time.so. DS This way it looks like an opcode, talks like an opcode, looks like an DS opcode, but isn't an opcode taking up valuable space. (Not to mention DS making the optimizer more complex--the fewer the opcodes the easier its DS likely to be) DS I've got an RFC started on this. DS The list would presumably be added to occasionally when a module is installed -- Chaim Frenkel  Nonlinear Knowledge, Inc. [EMAIL PROTECTED] +1-718-236-0183
Re: $^O and $^T variables (on-the-fly tainting)
"DS" == Dan Sugalski [EMAIL PROTECTED] writes:

DS $Config{osname}, I think. I'm not thrilled with that, mainly because it means loading up Config, which ain't cheap.

Why not have $Config hard-coded into the executable? Perl has to have it or know it. So why not make it part of the executable? Then moving an executable around would carry the correct luggage.

However, this got me thinking. Here is an idea I'd like to see: the existence of a $^T variable for controlling tainting in the same way that $^W controls warnings. Now *that* would be cool. I realize the current implementation of tainting requires that it start with the interpreter, but hey, we're rewriting the internals, right?

DS So put in an RFC. :) Seriously, finer-grained control over tainting's not unreasonable, and I can think of a few ways to do it, but they need to be designed in *now*, not later.

Just remember Larry's dislike of making untainting easy.

I'd rather not have multiple characters. An option hash or even a longer namespace would be more readable.

$Perl::Warnings{undef} = 1;
$Perl::Tainting = 1;

chaim
--
Chaim Frenkel, Nonlinear Knowledge, Inc. [EMAIL PROTECTED] +1-718-236-0183
Re: RFC: On-the-fly tainting via $^T
Please explain how having a "no taint" block would still keep the spirit of not making untainting easy. Just add a "no taint" at the top of one's code, and the -T goes away.

chaim

"DS" == Dan Sugalski [EMAIL PROTECTED] writes:

DS I think I'd prefer to leave untainting to regexes.

DS What I was thinking of was something along the lines of a lexically scoped pragma--"use taint"/"no taint". (We could do this by sticking in an opcode to set/unset the tainting status, as well as the warning status, and so on.)

DS Taint checking is disabled in a no taint block. Whether we still set the taint status on a scalar could depend on the -T switch, so data would still be tainted in a no taint block.

--
Chaim Frenkel, Nonlinear Knowledge, Inc. [EMAIL PROTECTED] +1-718-236-0183
Inner loop (was Re: type-checking [Was: What is Perl?])
May I offer an alternative? Why do an interpreter? I remember reading good things about threaded interpreters (e.g. Forth). So why not do a TIL? Compile it to machine calls/jumps. It should be much faster than the inner run loop, and it would fit in with Dan and Nick's "keep it in cache" goal.

So there could be several different runtime stages. On machines where perl knows how to assemble machine instructions, do it in raw executable code. On machines where perl doesn't know how to, write a small assembler-language stub that does the threaded code. On those machines where we can't even do that, interpret the threaded code in C.

chaim

"DS" == Dan Sugalski [EMAIL PROTECTED] writes:

DS I was thinking that, since the compiler has most of the information, a "type check" opcode could be used, and inserted only where needed. If, for example, you had:

DS my ($foo, $bar);
DS my ($here, $there) : Place;
DS $foo = $bar;
DS $here = $there;

--
Chaim Frenkel, Nonlinear Knowledge, Inc. [EMAIL PROTECTED] +1-718-236-0183