Re: Wikipedia example
Hey there, sorry about not responding. My mailer hid this message from me. I was actually about to reply asking what the deal was. ;)

chromatic wrote:
On Tuesday 03 October 2006 13:41, Aaron Sherman wrote:
This contains the Makefile, README, .pg grammar, a -harness.pir that executes the parser on a sample string and dumps the parse tree, and a -stress.pir that runs 50,000 trial runs to see how fast PGE is (not too shabby is the answer, as it comes in at about 1/2 the time of a P::RD version for the simple example, and gets a bigger lead the more complex the input expression).

I can't get this to work. If I run 'make' in the target directory, I get a PASM file (with the .pir extension). Then if I run either of the PIR files, I get:

  $ parrot wptest-harness.pir
  error:imcc:syntax error, unexpected LABEL, expecting $end in file 'wptest.pir' line 3

That's odd. I double-checked the diffs to make sure I didn't send out an old version. This code works fine on my box, which I just updated to r14904.

For me, line 3 of wptest.pir is "main:". Should that be a syntax error? That's the very first line of code output by:

  ../../../parrot -o wptest.pir ../../../compilers/pge/pgc.pir wptest.pg

so if there's a problem with it, I'm not sure that I could actually fix it. Any pointers appreciated!

Here's what I get:

  $ ../../../parrot wptest-harness.pir
  Parsing simple expression: 1+(1+1)
  Match results begin:
  VAR1 = PMC 'PGE::Match' = 1+(1+1) @ 0 {
    expr = PMC 'PGE::Match' = + @ 1 {
      type = infix:+
      [0] = PMC 'PGE::Match' = 1 @ 0 {
        number = PMC 'PGE::Match' = 1 @ 0
        type = term:
      }
      [1] = PMC 'PGE::Match' = (1+1) @ 2 {
        expr = PMC 'PGE::Match' = 1+1 @ 3 {
          expr = PMC 'PGE::Match' = + @ 4 {
            type = infix:+
            [0] = PMC 'PGE::Match' = 1 @ 3 {
              number = PMC 'PGE::Match' = 1 @ 3
              type = term:
            }
            [1] = PMC 'PGE::Match' = 1 @ 5 {
              number = PMC 'PGE::Match' = 1 @ 5
              type = term:
            }
          }
        }
        type = term:
      }
    }
  }
  match complete
Re: Wikipedia example
Markus Triska wrote: Aaron Sherman writes: +Written in 2006 by Aaron Sherman, and distrbuted Typo: distributed You are correct, sir. This was not, in fact some strange attempt to seize control of the Parrot codebase ;)
Re: requirements gathering on mini transformation language
chromatic wrote: On Thursday 28 September 2006 14:51, Markus Triska wrote: Allison Randal writes: mini transformation language to use in the compiler tools. For what purpose, roughly? I've some experience with rule-based peep-hole optimisations. If it's in that area, I volunteer. That's part of it, but mostly it's for transforming one tree-based representation of a program into another. See for example Pheme's lib/*.tg files. I'm confused. I thought that this is what TGE did. Is TGE going away, or are we talking about something that extends TGE in some way?
Wikipedia example
article:
+
+http://en.wikipedia.org/wiki/Parser_Grammar_Engine
+
+This code is here so that others can benefit from a simple example,
+and so that anyone who updates PGE can see if it affects the ability
+to handle the example given in that article.
+
+Written in 2006 by Aaron Sherman, and distrbuted under the same terms
+as the rest of the Parrot distribution that this should have come
+with.
Re: LLVM and HLVM
On 8/23/06, peter baylies [EMAIL PROTECTED] wrote: On 8/22/06, John Siracusa [EMAIL PROTECTED] wrote: Has anyone looked at LLVM lately? [...] On the other hand, Parrot built quite nicely on x86-64, although I think I like the 32-bit build (which also built just fine, albeit without ICU) better due to the excellent JIT support. Not sure if the list will let this through, since I'm subscribed under another account, but here's the problem with that: llvm is a very light layer, but it's yet another layer. To put it between parrot and hardware would mean that parrot is JITing to LLVM byte-code, which is JITing to machine code. Not really ideal. -- Aaron Sherman Senior Systems Engineer and Toolsmith [EMAIL PROTECTED] or [EMAIL PROTECTED]
Re: LLVM and HLVM
John Siracusa wrote: On 8/23/06 4:09 PM, Aaron Sherman wrote: here's the problem with that: llvm is a very light layer, but it's yet another layer. To put it between parrot and hardware would mean that parrot is JITing to LLVM byte-code, which is JITing to machine code. Not really ideal. ...unless LLVM does a much better job of native code generation than the existing Parrot code, that is. Optimization seems to be LLVM's thing. Keep in mind that you're not talking about some HLL generating LLVM bytecode. You're talking about Parrot reading in Parrot byte code, JITing to LLVM and then going through that dance again. The amount of lossage in those layers of translation simply cannot be worth whatever the difference is between LLVM optimization and Parrot's JIT, since Parrot will already have generated code that makes it MORE difficult to optimize. I'll buy it if I see numbers, but I'm highly skeptical.
Re: End the Hollerith Tyranny? (linelength.t)
On Mon, 2006-08-21 at 08:45 -0700, Chip Salzenberg wrote:
On Mon, Aug 21, 2006 at 10:48:59AM -0400, Will Coleda wrote:
The way you phrase the question, you're not going to get any of these answers. Who is programming parrot on their *physical* VT100? =-). The primary reason for an 80 column limit is developer convenience, I think.

Well, that's fair. Many of us are old enough to have used such limited hardware, but it's all surely been relegated to the trash heap by now. So: Would anyone be inconvenienced by exceeding 80 columns regularly; and, how?

I typically measure my screen real estate in discrete 80-col units. My layout of terminals, editors (emacs, xemacs, vim and gvim, mixed fairly liberally for different purposes) and other applications is suited to editing in 80-column units. When I have to re-size a window to larger than that, it's a pain, but not a terribly hurtful one.

I like the Gnome Style document for reference here. They talk about 8-space tabs, but it's the same issue as 80-column text:

  Using 8-space tabs for indentation provides a number of benefits. It makes the code easier to read, since the indentation is clearly marked. It also helps you keep your code honest by forcing you to split functions into more modular and well-defined chunks - if your indentation goes too far to the right, then it means your function is designed badly and you should split it to make it more modular or re-think it.

Just my $0.02.

P.S.: Seems only fair that if we're sticking with C89, we stick with 1989 terminal sizes. =-)

Some selectivity is in order, or we'll have to target 1989 memory sizes, disk capacities, and network bandwidth. There are times that I use emacs, not because it's the right tool for the job that *it* is doing, but because I have to use it over a lossy line, and NOTHING that I've found squeezes quite so much out of curses-like rendering over a slow line. Why? Because it was designed to do multi-window editing over 1200-9600 bps modem lines without making its users want to kill themselves. There is something to be said for working well in low-resource environments, especially when talking about a VM that might need to be placed into the control circuitry of, say, an elevator control system.
Re: PMC flags
On Wed, 2005-05-04 at 01:51, Leopold Toetsch wrote: Aaron Sherman wrote: On Mon, 2005-05-02 at 08:58 +0200, Leopold Toetsch wrote: Here is some example P5 source from pp_pow in pp.c: I presume that Ponie eventually will run Parrot opcodes. pp_pow() is like all these functions part of the perl runloop. Therefore it'll be infix .MMD_POW, P1, P2, P3 Sorry, I wasn't being clear with my example and the fact that it was just that. Yes, I understand that the opcode pow will become Parrot... well, at least I think I understand that, but I'm not 100% sure. There's some very hairy magic in pp.c and pp_hot.c that Perl code relies on, and in some places (e.g. pp_add), its behavior is not compatible with Parrot in some very fundamental ways (e.g. all addition is done as unsigned integers if/when possible to gain the overflow semantics of C8X uint, check the novella-length comments in pp.c for details). ALL THAT ASIDE, however, I did understand that the goal was not explicitly to leave pp_pow as it is. I was using it as an example of the kind of code that uses the Perl 5 core. XS, the Perl 5 runtime (e.g. the regexp engine), parser, etc. all rely on the same sort of constructs that I used pp_pow to outline. If you're writing a compiler from scratch, I can see that being mostly true. However, in trying to port Perl 5 guts over to Parrot, there's a lot of code that relies on knowing what's going on (e.g. if a value has ever been a number, etc.) Most of the guts are called from the runloop. But there is of course XS code that messes with SV internals. Ignore code that messes with SV internals. Code in many parts of the runtime that almost certainly won't be re-written use the SV interface provided by sv.h correctly, doing tons of flag tests all the time. Literally every operation requires several flag tests! -- Aaron Sherman [EMAIL PROTECTED] Senior Systems Engineer and Toolsmith It's the sound of a satellite saying, 'get me down!' -Shriekback
Re: PMC flags
On Mon, 2005-05-02 at 08:58 +0200, Leopold Toetsch wrote:
Nicholas Clark [EMAIL PROTECTED] wrote:
  1 bit for SVf_IOK
  1 bit for SVf_NOK
  1 bit for SVf_POK
  1 bit for SVf_ROK

I'd not mess around with (or introduce) flag bits. The more that this would only cover perl5 PMCs. Presuming that the Perl5 PMCs are subtypes of Parrot core PMCs, I'd do ... [... code doing a string isa check on the type ...] The vtable functions C<isa> and C<does>, which now take a string, are a bit heavy-weighted and might get an extension in the long run that takes an integer flag.

Unless this happens, this would be a HUGE performance hit. After all, Sv*OK is called all over the place in the Perl 5 code, including many places where performance is an issue.

[...] 2 bits to say what we're storing in the union
The vtable is that information:
  INTVAL i = VTABLE_get_integer(interpreter, pmc);
  FLOATVAL n = VTABLE_get_number(interpreter, pmc);

Here is some example P5 source from pp_pow in pp.c:

  if (SvIOK(TOPm1s)) {
      bool baseuok = SvUOK(TOPm1s);
      UV baseuv;
      if (baseuok) {
          baseuv = SvUVX(TOPm1s);
      } else {
          IV iv = SvIVX(TOPm1s);

and here that is, run through the C pre-processor:

  if ((((*(sp-1)))->sv_flags & 0x0001)) {
      char baseuok = ((((*(sp-1)))->sv_flags & (0x0001|0x8000)) == (0x0001|0x8000));
      UV baseuv;
      if (baseuok) {
          baseuv = ((XPVUV*) ((*(sp-1)))->sv_any)->xuv_uv;
      } else {
          IV iv = ((XPVIV*) ((*(sp-1)))->sv_any)->xiv_iv;

Notice that there is exactly no function calling going on there. To change that to (pseudocode):

  if (isa_int_test(TOPm1s)) {
      bool baseuok = isa_uint_test(TOPm1s);
      UV baseuv;
      if (baseuok) {
          baseuv = invoke_uint_vtable_get(TOPm1s);
      } else {
          IV iv = invoke_int_vtable_get(TOPm1s);

Well... even after JIT compilation, function call overhead is function call overhead, no?

just do the right thing. Usually there is no need to query the PMC what it is.

If you're writing a compiler from scratch, I can see that being mostly true.
However, in trying to port Perl 5 guts over to Parrot, there's a lot of code that relies on knowing what's going on (e.g. if a value has ever been a number, etc.)
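For readers puzzling over what the expanded macros above actually compute: they are pure flag-bit arithmetic, with no function calls at all. A minimal illustrative sketch in Python (the helper names are hypothetical; the constants are the ones shown in the preprocessed output above):

```python
# Flag constants as they appear in the expanded pp_pow code above.
SVf_IOK = 0x0001     # value holds a valid integer
SVf_IVisUV = 0x8000  # the integer slot holds an unsigned value

def sv_iok(flags):
    """SvIOK-style test: does this value have a valid integer?"""
    return bool(flags & SVf_IOK)

def sv_uok(flags):
    """SvUOK-style test: integer valid AND stored as unsigned."""
    return (flags & (SVf_IOK | SVf_IVisUV)) == (SVf_IOK | SVf_IVisUV)

print(sv_iok(SVf_IOK))               # True
print(sv_uok(SVf_IOK))               # False
print(sv_uok(SVf_IOK | SVf_IVisUV))  # True
```

The point of the thread is precisely that each of these is a couple of machine instructions, so replacing them with vtable method calls is not a free exchange.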
Common LISP
I forwarded the Common LISP notice to a friend of mine who works on CMUCL internals, and he suggested: [they should think about starting with] CMUCL and retarget [sic] it for the new VM. That way he gets all the type inference for free [which would increase performance] Just thought I'd pass that along, in case it's of interest. I'm not a CL guy at all, so I wouldn't even know how type inference helps performance. -- Aaron Sherman [EMAIL PROTECTED] Senior Systems Engineer and Toolsmith It's the sound of a satellite saying, 'get me down!' -Shriekback
Re: I wish to understand the JIT machine code generator
On Thu, 2005-04-14 at 01:50 -0400, [EMAIL PROTECTED] wrote:
[EMAIL PROTECTED] [EMAIL PROTECTED] wrote:
I have been trying to examine the i386 code generator to see how feasible it would be to create an AMD64 code generator. [...] I'm going to copy the i386 path to an a64 path and have at it. I'm hoping it won't be much of a stretch to get 64-bit code generated -- although REASONABLE 64-bit code is another problem. But first I want to ask if anybody else is doing this already.

Just one thought on stylistic conventions. GCC uses the following naming conventions for AMD processors, and I imagine that they have had to do this because they have discovered over the years that it makes sense:

  k6
      AMD K6 CPU with MMX instruction set support.
  k6-2, k6-3
      Improved versions of AMD K6 CPU with MMX and 3dNOW! instruction set support.
  athlon, athlon-tbird
      AMD Athlon CPU with MMX, 3dNOW!, enhanced 3dNOW! and SSE prefetch instructions support.
  athlon-4, athlon-xp, athlon-mp
      Improved AMD Athlon CPU with MMX, 3dNOW!, enhanced 3dNOW! and full SSE instruction set support.
  k8, opteron, athlon64, athlon-fx
      AMD K8 core based CPUs with x86-64 instruction set support. (This supersets MMX, SSE, SSE2, 3dNOW!, enhanced 3dNOW! and 64-bit instruction set extensions.)

Since Parrot's JIT will need to think about the CPU in ways that are roughly analogous to the way GCC's RTL-to-machine generator thinks about it, perhaps you'll want to start off with similar categorization. Also, looking at the m-* files in the GCC source tree could easily give you some pointers on what it is that you might want to be thinking about for optimal code-gen under Parrot. It would be awesome if someone figured out a way to translate GCC's m-* templates into some sort of starting point for Parrot JIT definitions, but that might be too science-fictiony to make sense, as GCC is defining translations for RTL sequences and Parrot is defining translations for PBC ops, which are very different.
Re: A sketch of the security model
On Thu, 2005-04-14 at 09:11, Dan Sugalski wrote: At 10:03 PM -0400 4/13/05, Michael Walter wrote: On 4/13/05, Dan Sugalski [EMAIL PROTECTED] wrote: All security is done on a per-interpreter basis. (really on a per-thread basis, but since we're one-thread per interpreter it's essentially the same thing) Just to get me back on track: Does this mean that when you spawn a thread, a separate interpreter runs in/manages that thread, or something else? We'd decided that each thread has its own interpreter. Parrot doesn't get any lighter-weight than an interpreter, since trying to have multiple threads of control share an interpreter seems to be a good way to die a horrible death. So to follow up on Michael's question: does this mean that you spawn a new thread, instance an interpreter, and then begin executing shared code? What about data? I assume that all has to be shared, since shared data is a fundamental piece of any threaded application's assumptions. -- Aaron Sherman [EMAIL PROTECTED] Senior Systems Engineer and Toolsmith It's the sound of a satellite saying, 'get me down!' -Shriekback
Re: A sketch of the security model
On Thu, 2005-04-14 at 13:22 -0400, Dan Sugalski wrote: Anyway, a number of people I deeply respect (and who do this sort of thing for a living, at deep levels) have told me flat-out that we're better not having a security system than we are trying to roll our own, and the common response to We're lifting VMS' has been Good. Do that. Well, if you were lifting VMS's security model, that would be fine, but you're really not. You're lifting the idea of VMS's security model. A security model is a many-fold thing (I've only so far discussed the highest and most user-visible level because they are the bits most applicable to Parrot). You're talking about cherry-picking certain bits and re-designing the rest to fit. I have NO PROBLEM with that, but I want to make sure that you don't think this is the easy way to go. It's not. You're biting off a HUGE amount of work, and your first 2 attempts will likely be utterly wrong (if history is any guide, not because you're not smart and capable). I think it would be easier to start from scratch, personally. I understand your concerns, but I don't think you run any less risk by creating a new VM security model out of an OS security model than you do by creating a new one. They both create many opportunities to make a mistake. That's not been the general consensus I've seen from people doing security research and implementation. They both create many opportunities to make a mistake. Really. Go ask the folks at Microsoft who lifted VMS for NT's security model, and then go ask the folks at Sun who rolled their own with Java. Both have had significant pain. If you really want to reduce the chances that you'll make a mistake, swipe the security model from JVM or CLR and start with that. At least those have been tested in the large, and map closer to what Parrot wants to do than VMS. The problem is twofold with those. 
First, there's some indications that they're busted,

They're not busted so much as in many places they have needed significant work. I think that the general consensus right now is that JVM is fairly well sorted out in 1.5, and CLR is moving along well. I would say that at an infrastructure level they're both more than sufficient models, and that's all you're going to lift anyway (unless you were considering lifting code from Mono, which I'm not sure is workable license-wise).

and second (and more importantly) they're both very coarse-grained, and that leads to excessive privs being handed out, which increases your exposure to damage.

That's fine. Merging down either JVM or CLR's privs into a granularity that you're happy with should work fine, and again, privs are only a small part of the security model. If you want a better picture, these sources might be useful:

http://developer.intel.com/technology/itj/2003/volume07issue01/art05_security/vol7iss1_art05.pdf
http://java.sun.com/docs/books/security/
http://www.arctecgroup.net/ISB0705GP.pdf
http://www.arctecgroup.net/ISB0706GP.pdf
http://www.arctecgroup.net/ISB0707GP.pdf

Don't get me wrong. I loved VMS back in the day. It was a pain in the ass at times, but what isn't. It's just that it's not a VM trying to execute byte-code... it's an operating system which directly manages hardware.

Yeah, but don't forget that for all intents and purposes parrot is an OS trying to execute bytecode, VMS security was interesting because it was one of the first systems to substantially abstract the security of the system from the security of the hardware.

You don't get to touch hardware because you're user-land, so you have a very different set of concerns. You do, however, have roughly the same set of concerns as the JVM and CLR. That's why I suggested them. If you don't like them, that's cool, I was only trying to save those of you who have enough time to think about something as large as security infrastructure some time and pain.
I don't have that kind of spare time, so I bow to your superior ability to manage your schedule.
Re: Pugs Q for the Parrot FAQ?
On Thu, 2005-03-31 at 12:04, Nicholas Clark wrote: Patches welcome, as I'm not sure of the best way to phrase the cross language stuff to follow on smoothly. Also, Parrot provides access to Perl 6 from other languages and to those other languages from Perl 6 at run-time, a feature which is both complex and highly beneficial for all concerned. -- Aaron Sherman [EMAIL PROTECTED] Senior Systems Engineer and Toolsmith It's the sound of a satellite saying, 'get me down!' -Shriekback
Re: Pugs Q for the Parrot FAQ?
On Wed, 2005-03-30 at 14:58, Nicholas Clark wrote:
Based on the wheat on IRC this evening, is this question/answer worth adding to the Parrot FAQ on parrotcode.org?

Q: Pugs is going great shakes - why not just toss Parrot and run Perl 6 on Pugs?

A: Autrijus Tang, the lead on the Pugs project, notes that an *unoptimised* Parrot is already 30% faster than Haskell. Add compiler optimisation and a few planned optimisations and Parrot will beat Pugs for speed hands down. Autrijus thinks that Pugs could be made faster with some Haskell compiler tricks, but it's harder work and less effective than the Parrot optimisations we already know how to do.

Good answer, and other than adding a bit about cross-language usage I'd stop there (memory issues are important but complex, and you've already made your point with this brief answer). The next question is:

Q: OK, so Parrot is fast... Pugs can back-end to Parrot, right?

A: Yes (though at this time, that's in the early stages). Still, the ultimate goal is for Perl 6 to be self-hosting (that is, written in itself) in order to improve introspection, debugger capabilities, compile-time semantic modulation, etc. For this reason, Pugs will probably be the compiler that first compiles the ultimate Perl 6 compiler, but thereafter Pugs will no longer be the primary reference implementation. This is documented by the Pugs team at http://svn.perl.org/perl6/pugs/trunk/docs/01Overview.html

-- Aaron Sherman [EMAIL PROTECTED] Senior Systems Engineer and Toolsmith It's the sound of a satellite saying, 'get me down!' -Shriekback
Re: Python, Parrot, and lexical scopes
On Mon, 2004-10-18 at 07:55, Sam Ruby wrote:
I've been trying to make sense of Python's scoping in the context of Parrot, and posted a few thoughts on my weblog:
http://www.intertwingly.net/blog/2004/10/18/Python-Parrot-and-Lexical-Scopes

It seems like everything on that page boils down to: all functions are module-scoped closures. Your example:

Consider the following scope1.py:

  from scope2 import *
  print f(), foo
  foo = 1
  print f(), foo

and scope2.py:

  foo = 2
  def f():
      return foo

The expected output is:

  2 2
  2 1

is also useful for context, but I don't think you need the Perl translation to explain it.

-- 781-324-3772 [EMAIL PROTECTED] http://www.ajs.com/~ajs
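The "module-scoped closures" claim can be demonstrated in a single file by standing in for the two modules with `types.ModuleType`. This is an illustrative reconstruction of the scope1/scope2 example, not the original test:

```python
import types

# scope2 defines foo and a function f that reads scope2's global foo.
scope2 = types.ModuleType("scope2")
exec("foo = 2\ndef f():\n    return foo", scope2.__dict__)

# scope1's `from scope2 import *` copies the names, but f's globals
# remain bound to scope2's module dict - f is a module-scoped closure.
f = scope2.f
foo = scope2.foo          # local copy of the name, value 2

print(f(), foo)           # 2 2
foo = 1                   # rebinds scope1's copy only; scope2.foo is untouched
print(f(), foo)           # 2 1
```

Rebinding `foo` in the importing scope never affects what `f` sees, which is exactly the behavior the weblog post puzzles over.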
rx_compile and FSAs
I've done quite a lot of thinking about Parrot's rx_compile op, as I was thinking about implementing it. However, I've come to the conclusion that the definition of the op as it stands is too shallow. Please consider this definition and let me know if implementing it would be worth it to Parrot as a whole, or if this is a case of being too generic.

rx_compile out P0, in S0, in I0
    Produce an FSA continuation (see fsa_to_continuation) in P0 which matches the regular expression in S0. The syntax of the regular expression is determined by the type value in I0. This type must be a valid type as returned by rx_load_type. This op is simply a combination of rx_to_fsa and fsa_to_continuation.

rx_to_fsa out P0, in S0, in I0
    Pass the regular expression string parameter to the specified regular expression parser based on the type parameter (as returned from rx_load_type). Returns an FSA PMC.

rx_load_type out I0, in S0
    Dynamically load an rx compiler by name and set the first parameter to the identifier for that compiler (for use with rx_compile). The default, minimalistic regular expression syntax has identifier 0. The compiler itself must invoke fsa_new at some point in order to generate its return value.

fsa_new out P0, in I0, in P1, in P3, in I1
    Given inputs: max state, alphabet, transition matrix and start state, produce an FSA output object which implements the requirements. The alphabet can be one of several values, TBD, but note that characters are only one possible type of alphabet. The transition matrix is an array of arrays, each of which contains 3 to 5 values: start state, input range(s), target state, an optional integer set to 1 or 0 to indicate whether this is a final state, and an optional integer set to 0, 1 or 2 to indicate whether this state consumes its input and records it (0), consumes its input and does not record it (1), or does not consume the input token. Input ranges are going to be similar values to the alphabet. More detail may be added to the transition matrix as needed. Note that target state may end up needing to be either an integer state value or a continuation.

fsa_to_continuation out P0, in P1
    Take as input a valid FSA and return a continuation which, when invoked, will simulate the FSA on its input parameter, and return an FSA status PMC. If the FSA has never been compiled to Parrot, this op compiles it; otherwise the same continuation is returned. For all intents and purposes, assume that this is a black box, and although in most cases the continuation returned will invoke newly generated bytecode (representing the FSA's state transitions), that need not be the case.

fsa_minimize out P0, in P1
    Attempt to minimize the FSA and return a new FSA as output.

The goal here is to provide a generic FSA mechanism which can accommodate the various regular expression syntaxes as input, or can be generated by hand by a compiler writer who wishes to have control over the details (or provide FSA semantics over more complex tokens than just characters). Either way, the result is an FSA which can be executed (aka simulated) by invoking its continuation.

-- Aaron Sherman [EMAIL PROTECTED] Senior Systems Engineer and Toolsmith It's the sound of a satellite saying, 'get me down!' -Shriekback
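As a rough illustration of the transition-matrix idea behind fsa_new, here is a minimal sketch in Python. It deliberately simplifies the proposal: a plain (state, symbol) -> state map instead of the 3-to-5-element rows, single characters instead of input ranges, and a returned closure standing in for the continuation fsa_to_continuation would produce:

```python
def make_fsa(transitions, start, finals):
    """Build a matcher from a deterministic FSA description.

    transitions: dict mapping (state, symbol) -> next state
    start:       the start state
    finals:      set of accepting states
    """
    def run(s):
        state = start
        for ch in s:
            key = (state, ch)
            if key not in transitions:   # no transition: reject
                return False
            state = transitions[key]
        return state in finals
    return run

# FSA for the regular expression a b*  (an 'a' followed by zero or more 'b's)
match = make_fsa({(0, 'a'): 1, (1, 'b'): 1}, start=0, finals={1})
print(match("abbb"))   # True
print(match("ba"))     # False
```

The proposal's extra row fields (final-state flag, consume/record mode) and non-character alphabets layer onto this same core loop.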
Re: rx_compile and FSAs
On Wed, 2004-10-13 at 10:29, Leopold Toetsch wrote: Aaron Sherman [EMAIL PROTECTED] wrote: I've done quite a lot of thinking about Parrot's rx_compile op, as I was thinking about implementing it. Given that rx_compile syntax and semantics aren't really final and second that compiling a rx takes substantial time, I'd do something like this: [...] You can experiment with needed methods, implement new ... You can subclass the Rx_Compiler, implement it in PIR and what not. Eventually for gaining the last bit of speed, we could make opcodes for the methods. Sounds good. I need to look at the NCI stuff. I was going to skip over that at first and build the default rx compiler (value 0) inline and focus on the FSA-to-bytecode implementation, but if you think that that's going to need up-front engineering, I'll look at it now (well, now as in when I get home tonight). -- 781-324-3772 [EMAIL PROTECTED] http://www.ajs.com/~ajs
Re: rx_compile and FSAs
On Wed, 2004-10-13 at 10:44, Matt Fowles wrote:
I am of the opinion that we should treat regular expressions as simply another language. Thus one can register different compilers for different regex syntaxes and we do not need to add more opcodes for them.

That is essentially what I've proposed, however it is important to realize 2 things:

* Just because Parrot exports a set of regex syntaxes does not mean that those are the syntaxes that the user of a language will see. Language designers might pre-process down to one of those, or they might simply avoid all of them.

* I'm shifting the interface from rx_compile (which is still there, though possibly as a PMC, given Leo's comments) to new_fsa, and letting language designers write their own rx compiler. Because states in the FSA can be continuations (which, yes, means it's not a true FSA, because it's not finite) you should be able to implement arbitrarily complex regexes, and even Perl 6 rules (which are not regular expressions or true FSAs at all because they maintain state as they recurse infinitely) without stepping outside of this framework.

I am certain that I will make many mistakes while implementing this, as I'm learning how to create and manage FSAs, but putting an implementation in place seems to me to be the right way to start.

This also has the advantage of placing their internals in a black box off to the side. So, the regex compiler can choose to aggressively compile and optimize from the start or do it lazily at its whim while hiding behind the interface that the compile opcode already presents.

Again, this is all true, though it is also possible for the compiler writer to take control by directly building and managing the FSA (e.g. directing when/if minimization takes place, which is an optimization that MUST take place before Parrot bytecode is generated). The key here is that languages will be able to use diverse regular expression syntaxes AND directly generated FSAs (e.g. for matching high-level objects rather than characters), but use them all in the same way. This is a fundamentally OO approach to FSA management, which was inspired for me by the good work done by some of the excellent FSA toolkits out there (which, woefully, are mostly in C++, and not truly open source, so we cannot use them directly).

Several things are standing in my way right now, but I'm confident that I'll find solutions:

1. The concept of an alphabet is, as yet, vague. Obviously ASCII characters, ISO-Latin-1 characters and Unicode code points are all possible input ranges, but so too are any finite range of integers or enumerated values. More research will be required to find out how best to special-case the common cases and make them efficient, using the already discussed low-level Parrot opcodes for string matching.

2. It's not clear how a continuation state behaves when it invokes its return continuation... how it directs the FSA to its next state is probably my trickiest problem, as it must be a mechanism that is compatible with the concept of an FSA (e.g. you may have to declare what states any given continuation can trigger upon return, or simply pass multiple return continuations, one for each possible next state after the continuation).

3. I don't yet know if my extended transition matrix is sufficient for modern pattern-matching (a la Perl 5). Certainly it's not sufficient for look-behind assertions, but that particular case might have to be a continuation state, rather than a regular node in the FSA. Other than that sort of thing, I do want all of Perl 5's regular expression syntax to be representable in my Parrot FSAs (which, in case anyone was wondering, I'm thinking strongly of translating directly from the NDFA, rather than translating to the corresponding DFA first... that is a harder problem, but probably one more suited to this application, which incidentally opens the door to possible auto-parallelization later on).

I'll be working on these over the weekend at my parents' place (nothing like the ocean to make thinking easier).

-- 781-324-3772 [EMAIL PROTECTED] http://www.ajs.com/~ajs
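On the NDFA-direct-simulation idea in point 3: the standard technique is to track the *set* of live states rather than a single state, which avoids the DFA construction entirely. A hedged sketch in Python (illustrative names, not the proposed Parrot interface):

```python
def nfa_match(transitions, start, finals, s):
    """Simulate an NDFA directly on input string s.

    transitions: dict (state, symbol) -> set of next states;
                 epsilon moves are keyed by (state, None).
    """
    def closure(states):
        # Expand a state set through epsilon transitions.
        stack, seen = list(states), set(states)
        while stack:
            st = stack.pop()
            for nxt in transitions.get((st, None), ()):
                if nxt not in seen:
                    seen.add(nxt)
                    stack.append(nxt)
        return seen

    current = closure({start})
    for ch in s:
        step = set()
        for st in current:
            step |= transitions.get((st, ch), set())
        current = closure(step)
    return bool(current & finals)

# NDFA for (a|b)*abb
trans = {
    (0, 'a'): {0, 1}, (0, 'b'): {0},
    (1, 'b'): {2},
    (2, 'b'): {3},
}
print(nfa_match(trans, 0, {3}, "aabb"))  # True
print(nfa_match(trans, 0, {3}, "abab"))  # False
```

Each input symbol touches every live state once, which is also what makes the auto-parallelization aside plausible: the per-state work within a step is independent.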
Re: Getting a char from stdin without waiting for \n
On Wed, 2004-10-13 at 12:10, Matt Diephouse wrote: I'm still working on a new Forth implementation... Forth has a word `key` that gets one character from stdin. It shouldn't wait for a newline to get the character. Is there any way to implement this currently in PIR? You can't do this in a standard, portable way in C, so I doubt that PIR has such a mechanism. Here's a reference from the comp.lang.c FAQ: http://www.eskimo.com/~scs/C-faq/q19.1.html -- 781-324-3772 [EMAIL PROTECTED] http://www.ajs.com/~ajs
Re: --pirate and coroutines
On Mon, 2004-10-11 at 14:03, Sam Ruby wrote: Here's a script that will run in both Python and Perl. It simply will return different results. print 1 + 2,\n,; print 45%s8 % 7,\n,; print 45 / 7 ,\n,; print ['a','b','c'],\n,; print ['a'] + ['b'],\n,; print 9 + None ,\n,; That last line fails under my python, but ignoring that: example 1: I don't get the complexity here. Python's add is a special case function that doesn't have much to do with the Parrot addition operators except in that, in some cases, it calls them (and in other cases, it calls ops like join). We have to distinguish the job of the compiler writer from the job of the Parrot interpreter writer here, and I don't think this is Parrot's concern. example 2: This is just abusing the difference between % in Python and in Perl. No shock there. Don't confuse syntax with semantics. example 3: Python does integer math on integers, Perl does floating-point. Again, not an issue. I'll stop there. I think it's clear that there's two desires here. One I think is reasonable, and one is (IMHO) not. To want to be able to pass a PyString to a Perl function is fine. To want to be able to pass a PyString to a Perl function and have it mutate any code that uses it into Python semantics just doesn't make sense to me. The caller's semantics will drive how such things behave, and those semantics will be imposed by the compiler in many cases, not Parrot. If, for example, the caller turns: a + b into: [...] join result, a, b [...] then that's what happens, and if a happens to be a PerlString, too bad that it doesn't think that + can mean concatenation. If your compiler turns everything into an object, then the surprises that develop from using variables that are alien to your object tree are not, in fact, surprising. Your concerns about hash behavior make more sense to me, and I do think that hashing should be a core property of all PMCs. 
For those who don't agree, go take a look at the mess that is the C++ STL, and try this: std::hash_map<std::string, int> myhash; I love the STL in concept, but in practice it's full of gotchas like this, because the object tree is too disjoint, because the foundation was never set in such a way that the later pieces fit. -- 781-324-3772 [EMAIL PROTECTED] http://www.ajs.com/~ajs
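The "hashing should be a core property of all PMCs" point can be illustrated by contrast with Python, where hashability is a default property of every object, so any value works as a dictionary key out of the box; no per-type hash functor needs wiring up as with the pre-standard STL hash_map. This is an illustrative sketch, not Parrot code:

```python
# Every Python object hashes by default (object identity for plain
# instances), so heterogeneous keys just work -- the property the
# post argues PMCs should share.

class Anything:
    pass

a = Anything()
d = {a: 1, "string": 2, (1, 2): 3}   # instance, string, and tuple keys
print(d[a], d["string"], d[(1, 2)])  # 1 2 3
```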
Re: [pir-mode.el] cannot edit
On Mon, 04 Oct 2004 21:39:59 +0100, Piers Cawley [EMAIL PROTECTED] wrote: Stéphane Payrard [EMAIL PROTECTED] writes: On Fri, Oct 01, 2004 at 06:09:37PM +0200, Jerome Quelin wrote: This function is defined in emacs: line-beginning-position is a built-in function. (line-beginning-position &optional N) switch to emacs. :) Or patch pir-mode.el, your choice. That should be something like:

(defun line-beginning-position (&optional n)
  "Return the character position of the first character on the current line.
With argument N not nil or 1, move forward N - 1 lines first.
If scan reaches end of buffer, return that position."
  (save-excursion
    (if n (forward-line (1- n)))
    (beginning-of-line)
    (point)))

no? -- Aaron Sherman Senior Systems Engineer and Toolsmith [EMAIL PROTECTED] or [EMAIL PROTECTED]
Re: [pir-mode.el] cannot edit
On Fri, 2004-10-01 at 18:22, John Paul Wallington wrote: Jerome Quelin [EMAIL PROTECTED] writes: And the minibuffer tells me: Symbol's function definition is void: line-beginning-position I'm using xemacs 21.4.14 You could put something like: (defalias 'line-beginning-position 'point-at-bol) in your XEmacs init file. Sorry, I missed that comment before I wrote my own. Good catch. -- 781-324-3772 [EMAIL PROTECTED] http://www.ajs.com/~ajs
Re: Metaclasses
On Mon, 4 Oct 2004 13:24:58 -0400, Dan Sugalski [EMAIL PROTECTED] wrote: On Mon, 4 Oct 2004 11:45:50 -0400, Dan Sugalski [EMAIL PROTECTED] wrote: Okay, color me officially confused. I'm working on the assumption that metaclasses are needed, but I don't, as yet, understand them. At 12:09 PM -0400 10/4/04, Michael Walter wrote: http://members.rogers.com/mcfletch/programming/metaclasses.pdf I do have that one. Unfortunately it's the PDF of slides and, while it looks like if I was at the talk it'd all make sense, without the talk that goes with 'em... not so much sense. A metaclass is simply an object which represents the class itself and can perform operations on the class. One (IMHO bad) reason to do this is for aspect oriented programming, AKA making object oriented programming even harder to debug. This is where you modify a class on-the-fly to inject behaviors or conditions before or after events (usually method invocations, specifically). Metaclasses can also be used to do things like sub-class on the fly (e.g. mix-ins). When you do this, you invoke a method on the metaclass which requests a new class (a sort of copy constructor for classes) with a new set of behaviors (usually via an inheritance mechanism). Perl 6, for example, will be able to say:

my Dog $spot .= new;
my Dog $greyhound := $spot but Animal::Fast;

or something that looks strikingly like that. In this case, $greyhound isn't really a Dog, it's an anonymous class type which was generated by the `but` and instantiated from $spot. More importantly, you could do the same thing when you say:

my Dog $spot .= new;
my Dog $dogbiscuit .= new;
$dogbiscuit.race($spot but Animal::Fast);

Here, the Dog.race method takes, we presume, a Dog, but we're passing it something that has an additional role attached (Animal::Fast). Everything still works, but the role might change how this particular dog works in ways that the original Dog designer didn't have in mind. 
IMHO this is the correct reason (though there are other reasons, mostly dealing with introspection and debugging) to want metaclasses. Mixins for Python, Ruby and Perl 6 become trivial given this mechanism. Of course, I'm not SURE you need to put this in Parrot... it really depends on how valuable it is to be able to do it the same way in all compilers. If you want to embrace this in Parrot, you're probably going to want to use something akin to the Perl 6 model, since it's designed to be able to emulate the Python, Ruby and Scheme models. -- Aaron Sherman Senior Systems Engineer and Toolsmith [EMAIL PROTECTED] or [EMAIL PROTECTED]
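The mix-in-on-the-fly mechanism described above can be modeled in a few lines of Python, since Python's `type` builtin is exactly the kind of metaclass copy-constructor being discussed. The names here (Dog, Fast, but_role) are illustrative stand-ins for the Perl 6 `but` operation, not a real Parrot or Perl 6 API:

```python
# A sketch of "$spot but Animal::Fast": build an anonymous subclass
# that mixes in a role, then rebuild the instance under the new class.

class Dog:
    def __init__(self, name):
        self.name = name
    def speed(self):
        return "normal"

class Fast:              # the role, akin to Animal::Fast
    def speed(self):
        return "fast"

def but_role(obj, role):
    """Return a copy of obj reblessed into an anonymous class that
    mixes `role` into obj's class -- a copy constructor for classes."""
    anon = type(obj.__class__.__name__ + "+" + role.__name__,
                (role, obj.__class__), {})
    clone = anon.__new__(anon)
    clone.__dict__.update(obj.__dict__)
    return clone

spot = Dog("Spot")
greyhound = but_role(spot, Fast)

print(isinstance(greyhound, Dog))   # True: still usable as a Dog
print(greyhound.speed())            # fast: the role overrides behavior
```

As in the Dog.race example, anything expecting a Dog still accepts the result, but the attached role changes behavior the original class designer never anticipated.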
Re: rand opcodes are deprecated
On Tue, 2004-09-28 at 03:53, Leopold Toetsch wrote: We already have the Random PMC with vtables to create random numbers. There's really no need to have opcodes too. If there aren't serious arguments for keeping these opcodes, they'll be removed for the release. Didn't you and I specifically have this discussion when I wrote those opcodes? I don't recall the justification at the time, but I thought that we'd reached an understanding then. The opcodes only act as a front-end to the singleton PMC, so there's no duplication of the implementation. -- 781-324-3772 [EMAIL PROTECTED] http://www.ajs.com/~ajs
Re: Namespaces again
On Mon, 2004-09-27 at 13:04, Chip Salzenberg wrote: For Perl, I get that. But for Python, AFAICT, namespaces are *supposed* to be in the same, er, namespace, as variables. No? Yes, and what's more the suggestion of using :: in Parrot won't work perfectly either (I'm pretty sure that there are LISP variants, possibly including Scheme, that use : in identifiers). Rather than trying to shuffle through the keyboard and find that special character that can be used, why not have each language do it the way that language is comfortable (e.g. place it in the regular namespace as a variable like Python, or place it in the regular namespace but append noise like Perl, or hide it in some creative way for other languages). For the most part, there's no performance penalty in having a callback that the language/library/compiler provides, because access to the objects in question will be via a PMC, and only LOOKUP of that PMC will be via namespace, no? In that way, you could:

namespace_register perl5_namespace_callback, "Perl5"
namespace_register python_namespace_callback, "Python"
[...]
namespace_lookup P6, "F\0o::Bar", "Perl5"
namespace_lookup P7, "foo.bar", "Python"

the namespace callback could take a string and return whatever Parrot needs to look up a namespace (Array PMC?), having encoded it according to Parrot's rules. That way, you can solve this however you like (heck, put a \0 between them if you want... in fact, I kind of like that). -- 781-324-3772 [EMAIL PROTECTED] http://www.ajs.com/~ajs
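The per-language callback scheme sketched in PIR above can be modeled in a few lines of Python. The function names follow the post's invented ops, and the split rules are illustrative; none of this is real Parrot API:

```python
# Each language registers a callback that turns its own namespace
# syntax into a keyed form the runtime can use directly; only the
# LOOKUP goes through the callback.

_callbacks = {}

def namespace_register(callback, language):
    _callbacks[language] = callback

def namespace_lookup(name, language):
    # returns the encoded key (here a plain list of path segments)
    return _callbacks[language](name)

namespace_register(lambda s: s.split("::"), "Perl5")
namespace_register(lambda s: s.split("."), "Python")

print(namespace_lookup("Foo::Bar", "Perl5"))   # ['Foo', 'Bar']
print(namespace_lookup("foo.bar", "Python"))   # ['foo', 'bar']
```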
Re: Namespaces again
On Tue, 2004-09-28 at 12:05, Jeff Clites wrote: On Sep 28, 2004, at 7:02 AM, Aaron Sherman wrote: why not have each language do it the way that language is comfortable (e.g. place it in the regular namespace as a variable like Python or place it in the regular namespace, but append noise like Perl or hide it in some creative way for other languages). That's similar in spirit to what I proposed of allowing PMC-subclassing of the default ParrotNamespace, so that namespaces created from different languages (often implicitly) could have different behaviors. But I'd keep the pulling-apart of F\0o::Bar into [F\0o; Bar] a compile-time task, so that at runtime the work's already been done (since the compiler knows what language it's compiling). Sounds reasonable; though you can't ALWAYS do it in advance, you certainly want to do as much as possible up-front. -- 781-324-3772 [EMAIL PROTECTED] http://www.ajs.com/~ajs
Re: Why lexical pads
On Fri, 2004-09-24 at 10:03, KJ wrote: So, my question is, why would one need lexical pads anyway (why are they there)? They are there so that variables can be found by name in a lexically scoped way. One example, in Perl 5, of this need is:

my $foo = 1;
return sub { $foo++ };

Here, you keep this pad around for use by the anon sub (and anyone else who still has access to that lexical scope) to find and modify the same $foo every time. In this case it doesn't look like a by-name lookup, and once optimized, it probably won't be, but remember that you are allowed to say:

perl -le 'sub x {my $foo = 1; return sub { ${"foo"}++ } } $x=x(); print $x->(), $x->(), $x->()'

Which prints 012 because of the ability to find foo by name. Of course, you can emulate this behavior, but in doing so, you're going to have to invent the pad :) Someone else suggested that you need this for string eval, but you don't really. You need it for by-name lookups, which string evals just happen to also need. If you can't do by-name lookups, then string eval doesn't need pads (and thus won't be able to access locals). -- 781-324-3772 [EMAIL PROTECTED] http://www.ajs.com/~ajs
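The same closure-over-a-pad idea transcribes directly into Python, with the pad made explicit as a dictionary so the by-name lookup is visible. A sketch only; real pads carry more state than a dict:

```python
# Each call to x() creates a fresh lexical scope (a "pad"); the inner
# function keeps that pad alive, so every invocation finds and bumps
# the same counter by name.

def x():
    pad = {"foo": 0}          # explicit pad: variables findable by name
    def bump():
        val = pad["foo"]      # by-name lookup into the enclosing pad
        pad["foo"] += 1
        return val
    return bump

counter = x()
print(counter(), counter(), counter())   # 0 1 2 -- one shared foo
print(x()())                             # 0 -- a brand new pad
```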
Re: Why lexical pads
On Fri, 2004-09-24 at 12:36, Jeff Clites wrote: Ha, your example is actually wrong (but tricked me for a second). Here's a simpler case to demonstrate that you can't look up lexicals by name (in Perl5): You are, of course, correct. If I'd been ignorant of that in the first place, this would be much less embarrassing ;-) However, the point is still sound, and that WILL work in P6, as I understand it. -- Aaron Sherman [EMAIL PROTECTED] Senior Systems Engineer and Toolsmith "It's the sound of a satellite saying, 'get me down!'" -Shriekback
Re: Synopsis 9 draft 1
On Mon, 2004-09-13 at 07:19, [EMAIL PROTECTED] wrote: On Mon, 6 Sep 2004, Aaron Sherman wrote: Sized low-level types are named most generally by appending the number of bits to a generic low-level type name: [...] int1 int2 int4 int8 int16 int32 int64 Ok, so Parrot doesn't have those. Parrot has int. The above generic low-level types are specific instances of a more general specification-based type system, with features grouped roughly as: Martin, I don't think you can reasonably have the integer registers typed so as to allow for multiple storage representations. For one, the very fact that they lack such baggage is what makes them useful. It *may* make sense to provide an unsigned, 64-bit integer somewhere, though (I hesitate to say as an alternate register type, since that touches so much of Parrot). The real question is this: is this just a Perl 6 thing (if so, then it's fodder for the newly created p6c, and we should drop it), or will/should other high level languages be defining sized integer types through Parrot? If so, then I don't think the current idea of having an Integer PMC is going to be as generic as suggested. If you think that the limitation of not having a handy 64-bit type on a 32-bit system is no big deal, check out the convolutions one Python user suggests under Windows just to store the time: http://aspn.activestate.com/ASPN/Cookbook/Python/Recipe/303344 Yeah, you're gonna want to not do that ;-) -- 781-324-3772 [EMAIL PROTECTED] http://www.ajs.com/~ajs
Re: multiple languages clarification - newbie
On Wed, 2004-09-08 at 18:02, Richard Jolly wrote: Can you really do this:

#!/usr/bin/perl6
use __Python::sys;             # whatever syntax
sys.stdout.write('hi there');  # perl6 syntax matches python syntax

There's some confusion in the responses between syntax merging (not appropriate for p6i, and not really what you asked) and access to modules in other languages. Yes, what you ask is essentially possible. It's not clear that you would use '.', though the way Python does module hierarchies, maybe (it's sort of closer to a method invocation than a sub-namespace). Parrot represents namespaces, calling conventions and high level data in generic ways so that they can be moved between languages which might have different syntax and semantics. Because of that, something like:

foo("bar", 1, new Chunder);

in Perl 5 / Ponie could easily call a foo that was defined in some other Parrot language, say Python:

def foo(x, y, z):
    print "String was %s, number was %d, other %s" % (x, y, z.yawn())

In the case of the string, Python would get a PMC which it could perform all of the normal string operations on, even though it would be a Perlish string object rather than a pythonish string object. Same basic deal in terms of the number. In the case of the object, Python would get a PMC object which has a means of invoking a method. Python would invoke yawn on z with no parameters and the Perl:

package Chunder;
...
sub yawn { my $self = shift; print "Now yawning\n" }

would be invoked as normal. -- Aaron Sherman [EMAIL PROTECTED] Senior Systems Engineer and Toolsmith "It's the sound of a satellite saying, 'get me down!'" -Shriekback
Re: No Autoconf, dammit!
On Wed, 2004-09-08 at 12:40, Larry Wall wrote: have to be careful to separate architectural parameters from policy parameters. An architectural parameter says your integers are 32 bits. A policy parameter says you want to install the documentation in the /foo/bar/baz directory. Cross compilation has to nail down the architectural parameters while potentially deferring decisions on policy to a later installation step. Actually, I'd say that they're all architectural parameters, and you want to put them all in the database (e.g. you should define that for Fedora Linux Core 2, the default documentation area is /usr/share/man, but for SunOS 3, it's /usr/man). What the person compiling the program OVERRIDES is their call (and in some contexts, the size of integers is a POLICY decision, not architectural). -- 781-324-3772 [EMAIL PROTECTED] http://www.ajs.com/~ajs
Cross compiling (extracting knowledge from autoconf?)
On Wed, 2004-09-08 at 17:40, Rhys Weatherley wrote: On Thursday 09 September 2004 02:40 am, Larry Wall wrote: An interesting question would be whether we can bootstrap a Parrot cross-compile database using autoconf's *data* without buying into the shellism of autoconf. Or give someone the tool to extract the data from the autoconf database themselves, so we don't have to ship it. What autoconf database? Autoconf uses probing for cross-compilation as well. Well, that's one of the big problems with autoconf: it's NOT a database. For example:

# AC_FUNC_GETMNTENT
# -----------------
AN_FUNCTION([getmntent], [AC_FUNC_GETMNTENT])
AC_DEFUN([AC_FUNC_GETMNTENT],
[# getmntent is in -lsun on Irix 4, -lseq on Dynix/PTX, -lgen on Unixware.
AC_CHECK_LIB(sun, getmntent, LIBS="-lsun $LIBS",
  [AC_CHECK_LIB(seq, getmntent, LIBS="-lseq $LIBS",
    [AC_CHECK_LIB(gen, getmntent, LIBS="-lgen $LIBS")])])
AC_CHECK_FUNCS(getmntent)
])

There's knowledge encoded in that, but it's not abstracted sufficiently. Some assumptions could be made, and autoconf's knowledge could be distilled a bit and then extracted into the [cross-]compiling database that would be needed by Parrot. -- 781-324-3772 [EMAIL PROTECTED] http://www.ajs.com/~ajs
Re: No-C, no programming project: Some configure investigation
On Tue, 2004-09-07 at 08:00, Jens Rieks wrote: On Tuesday 07 September 2004 07:52, Robert Schwebel wrote: Would autoconf/automake be an option for the C part of parrot? No, it's only available on a few systems. Ok, this is probably a moot conversation because Metaconfig (http://www.linux-mag.com/2002-12/compile_03.html) was written by Larry Wall for rn, and the Perl community has some serious social inertia when it comes to switching to any other configuration tool. That said, autoconf is "only available on a few systems." A "few" being defined as everything I've ever heard of. Seriously, I've never come across any system that lacked autoconf support AND that a high-level language like those that would target Parrot ran on. If you're referring to the number of systems that have autoconf itself actually INSTALLED by default, that's just as moot as the fact that almost no systems have Metaconfig installed. You never run Metaconfig or autoconf as an end-user/installer, you run the resulting [Cc]onfigure script. autoconf (+automake, etc.) is an excellent tool, and while Metaconfig is somewhat more limited, it too is an excellent tool, especially for handling high level languages. Neither tool is wrong for the job, but I expect that people who install Parrot will not be shocked by the classic '-des -Dprefix=/usr' type of invocation. -- 781-324-3772 [EMAIL PROTECTED] http://www.ajs.com/~ajs
Re: Namespaces
On Tue, 2004-09-07 at 09:26, Dan Sugalski wrote: *) Namespaces are hierarchical So we can have [foo; bar; baz] for a namespace. Woo hoo and all that. It'd map to the equivalent perl namespace of foo::bar::baz. [...] It's also possible to hoist a sub-space up a few levels, so that the [IO] space and the [__INTERNAL; Perl5; IO] namespace are the same thing. This sounds fine, except for the higher level question of who controls the root. That is, does a Python module decide to define its bits in the Python space AND export them to the root? The other way to go is to say:

#!/usr/bin/perl6
use __Python::os;

Which has the interesting result that no one ever need touch the root. There's simply a search path that each language uses that would default to [] and its own [__Internal;$language]. Alternate names are fine. I'm seriously tempted to make it [\0\0] Heehee. I don't think __INTERNAL is so bad. *) Each language has its own private second-level namespace. Core library code goes in here. So perl 5 builtins, for example, would hang off of [__INTERNAL; perl5] unless it wants something lower-down. Ok, so that's where split would go, right? Does that mean that if, in Python, I wanted to use Perl 5's split, I'd just have to:

import __INTERNAL.perl5
list = split '\\ba(?=b)', the_string, 5

? That's some nifty beans! Sounds great, Dan. -- 781-324-3772 [EMAIL PROTECTED] http://www.ajs.com/~ajs
Re: No-C, no programming project: Some configure investigation
On Tue, 2004-09-07 at 11:59, Andrew Dougherty wrote: Both autoconf and metaconfig assume a unix-like environment. Ambitious plans for parrot's configure include non-unix environments too, such as VMS and all the ports where perl5 uses a manually-generated config.* template. autoconf assumes m4 and shell and some other primitive tools, all of which have GNU ports to just about everything I've had to touch. VMS is the bastard child of autoconf right now, but back when VMS was a platform that folks used, it certainly was supported (as late as 1999 it worked great). I don't think it's the UNIXishness of the platform as much as the popularity of the platform that guides autoconf support (e.g. any platform for which patches are contributed). The guy who tried to update the autoconf 1.x version this year kind of kicked the autoconf guys in the shins verbally and got nowhere as a result. He then forked it and has since apparently dropped support. That said, if you have a manually generated template as you do for Perl 5, you can do the same for autoconf, no? I'm not advocating autoconf here, just exploring the lay of the land. Personally, I think Parrot should do whatever makes it easiest for maintainers. -- 781-324-3772 [EMAIL PROTECTED] http://www.ajs.com/~ajs
SDL usage broken?
When I try to run one of the SDL examples (any of them), I get:

SDL::fetch_layout warning: layout 'Pixels' not found!
Segmentation fault

When I edit runtime/parrot/library/SDL.imc and add the call to _set_Pixels_layout in at line 60, I remove that warning, but still get the seg-fault:

Program received signal SIGSEGV, Segmentation fault.
[Switching to Thread -150897856 (LWP 15959)]
0x08161133 in ins_writes2 (ins=0x8985c20, t=75) at imcc/instructions.c:138
138         if (ins->opnum == w_special[i + (strchr(types, t) - types)])
(gdb) bt
#0  0x08161133 in ins_writes2 (ins=0x8985c20, t=75) at imcc/instructions.c:138
#1  0x08162fd4 in analyse_life_block (interpreter=0x879d008, bb=0x8aeff98, r=0x8983398) at imcc/cfg.c:583
#2  0x08162e3f in analyse_life_symbol (interpreter=0x879d008, unit=0x897ed48, r=0x8983398) at imcc/cfg.c:523
#3  0x08162d86 in life_analysis (interpreter=0x879d008, unit=0x897ed48) at imcc/cfg.c:499
#4  0x08164137 in imc_reg_alloc (interpreter=0x879d008, unit=0x897ed48) at imcc/reg_alloc.c:172
#5  0x0815efdc in imc_compile_unit (interp=0x879d008, unit=0x897ed48) at imcc/imc.c:111
#6  0x0815ef33 in imc_compile_all_units (interp=0x879d008) at imcc/imc.c:68
#7  0x0815edc6 in compile_file (interp=0x879d008, file=0x8b3ff90) at imcc.l:920
#8  0x0816cb8b in imcc_compile_file (interp=0x879d008, s=0x897e4d0 "library/SDL/Surface.imc") at imcc/parser_util.c:574
#9  0x081fb259 in pcf_p_It (interpreter=0x879d008, self=0x893b0c8) at src/nci.c:1862
#10 0x081d16a4 in Parrot_Compiler_invoke (interpreter=0x879d008, pmc=0x893b0c8, code_ptr=0x8926e50) at classes/compiler.c:56
#11 0x080a38f8 in Parrot_load_bytecode (interpreter=0x879d008, filename=0x8976de0 "library/SDL/Surface.imc") at src/packfile.c:3103
#12 0x080ede8c in Parrot_load_bytecode_sc (cur_opcode=0x8980b70,

Hope this helps! -- Aaron Sherman [EMAIL PROTECTED] Senior Systems Engineer and Toolsmith "It's the sound of a satellite saying, 'get me down!'" -Shriekback
Re: Synopsis 9 draft 1
Taking this to p6i, in order to get Parroty for a few On Thu, 2004-09-02 at 19:47, Larry Wall wrote: =head1 Overview This synopsis summarizes the non-existent Apocalypse 9, which discussed in detail the design of Perl 6 data structures. [...] =head1 Sized types Sized low-level types are named most generally by appending the number of bits to a generic low-level type name:

int1
int2
int4
int8
int16
int32 (aka int on 32-bit machines)
int64 (aka int on 64-bit machines)

Ok, so Parrot doesn't have those. Parrot has int. Presumably this means that when the high-level language programmer (Perl 6 here, but that's just an example) tries to get lower level by explicitly using a sized type, they're going to have to be working in a PMC of some type like PerlInt16, which (for reasons of overflow behavior and a few other things) can almost never be optimized down into an integer register. It seems to me that this causes a dilemma for high-level languages where providing to the user what appears to be finer grained control over implementation actually makes them work at a higher level of abstraction. How would Parrot expect languages to implement such features? Should there be a set of (highly JIT-optimizable) PMCs that present sized type features, should the core register types be sizable somehow or should languages just be left to roll their own PMCs that do whatever they want? -- Aaron Sherman [EMAIL PROTECTED] Senior Systems Engineer and Toolsmith "It's the sound of a satellite saying, 'get me down!'" -Shriekback
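The dilemma above can be made concrete with a sketch: a hypothetical PerlInt16-style PMC whose overflow behavior is part of its contract. The class name and the wrap-to-16-bits rule here are my assumptions for illustration, not anything S9 or Parrot specifies:

```python
# A 16-bit "sized type" modeled as an object: it must carry its
# overflow semantics with it, which is exactly the baggage that keeps
# it from being optimized down into a bare integer register.

class Int16:
    def __init__(self, value):
        # reduce into the signed 16-bit range [-32768, 32767]
        masked = value & 0xFFFF
        self.value = masked - 0x10000 if masked >= 0x8000 else masked

    def __add__(self, other):
        return Int16(self.value + int(other))

    def __int__(self):
        return self.value

print(int(Int16(32767) + 1))   # -32768: wrap-around a raw register lacks
```

The irony the post describes is visible here: asking for "lower level" control produces a heavier, more abstract object.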
Re: No-C, no programming project: Some configure investigation
On Mon, 2004-09-06 at 12:42, Dan Sugalski wrote: Right now configure.pl pulls a bunch of configuration information straight out of the current perl configuration. We need to stop that, and this is as good a time as any. If someone could go through and make a list of what info configure.pl pulls from perl, I'll start writing (or snagging :) the probing code to do it ourselves, so we can be perl-free, at least from a configuration standpoint. I think right now that info is all in config/init/data.pl, and it's actually fairly well documented. Here's all of the variables that rely on the %Config data from Perl's Config.pm:

optimize => $optimize ? $Config{optimize} : '',
# Compiler -- used to turn .c files into object files.
# (Usually cc or cl, or something like that.)
cc => $Config{cc},
ccflags => $Config{ccflags},
ccwarn => exists($Config{ccwarn}) ? $Config{ccwarn} : '',
# Flags used to indicate this object file is to be compiled
# with position-independent code suitable for dynamic loading.
cc_shared => $Config{cccdlflags},  # e.g. -fpic for GNU cc.
# Linker, used to link object files (plus libraries) into
# an executable. It is usually $cc on Unix-ish systems.
# VMS and Win32 might use Link.
# Perl5's Configure doesn't distinguish linking from loading, so
# make a reasonable guess at defaults.
link => $Config{cc},
linkflags => $Config{ldflags},
# ld: Tool used to build dynamically loadable libraries. Often
# $cc on Unix-ish systems, but apparently sometimes it's ld.
ld => $Config{ld},
ldflags => $Config{ldflags},
libs => $Config{libs},
exe => $Config{_exe},  # executable files extension
ld_shared => $Config{lddlflags},
ar => $Config{ar},
ranlib => $Config{ranlib},
make => $Config{make},
make_set_make => $Config{make_set_make},

-- Aaron Sherman [EMAIL PROTECTED] Senior Systems Engineer and Toolsmith "It's the sound of a satellite saying, 'get me down!'" -Shriekback
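Probing one of these values ourselves, instead of copying it from Perl's %Config, looks roughly like this. The sketch is in Python rather than the Perl that configure.pl actually uses, and probe_cc with its candidate list is an invented name, not part of Parrot's configure:

```python
# Find a working C compiler by trying candidates until one can
# actually compile a trivial program -- the same kind of probing
# Configure/autoconf do.

import os
import shutil
import subprocess
import tempfile

def probe_cc(candidates=("cc", "gcc", "clang")):
    for cc in candidates:
        if shutil.which(cc) is None:
            continue
        with tempfile.TemporaryDirectory() as d:
            src = os.path.join(d, "try.c")
            with open(src, "w") as f:
                f.write("int main(void) { return 0; }\n")
            result = subprocess.run(
                [cc, src, "-o", os.path.join(d, "try")],
                capture_output=True)
            if result.returncode == 0:
                return cc    # first compiler that really works
    return None              # nothing usable found

print(probe_cc(("no-such-cc-xyz",)))   # None when no candidate exists
```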
Re: No-C, no programming project: Some configure investigation
On Mon, 2004-09-06 at 18:29, Aaron Sherman wrote: On Mon, 2004-09-06 at 12:42, Dan Sugalski wrote: If someone could go through and make a list of what info configure.pl pulls from perl, I'll start writing (or snagging :) the probing code to do it ourselves, so we can be perl-free, at least from a configuration standpoint.

optimize => $optimize ? $Config{optimize} : '',
# Compiler -- used to turn .c files into object files.
# (Usually cc or cl, or something like that.)
cc => $Config{cc},
ccflags => $Config{ccflags},
ccwarn => exists($Config{ccwarn}) ? $Config{ccwarn} : '',
# Flags used to indicate this object file is to be compiled
# with position-independent code suitable for dynamic loading.
cc_shared => $Config{cccdlflags},  # e.g. -fpic for GNU cc.
# Linker, used to link object files (plus libraries) into
# an executable. It is usually $cc on Unix-ish systems.
# VMS and Win32 might use Link.
# Perl5's Configure doesn't distinguish linking from loading, so
# make a reasonable guess at defaults.
link => $Config{cc},
linkflags => $Config{ldflags},
# ld: Tool used to build dynamically loadable libraries. Often
# $cc on Unix-ish systems, but apparently sometimes it's ld.
ld => $Config{ld},
ldflags => $Config{ldflags},
libs => $Config{libs},
exe => $Config{_exe},  # executable files extension
ld_shared => $Config{lddlflags},
ar => $Config{ar},
ranlib => $Config{ranlib},
make => $Config{make},
make_set_make => $Config{make_set_make},

Add to that:

archname (used in several config/auto/*.pl files)
sig_name (used in config/auto/*.pl files, but also t/ and lib/)

./config/auto/pack.pl:39:if (($] >= 5.006) && ($size == $longsize) && ($size == $Config{longsize}) ) {
./config/auto/pack.pl:45:elsif ($size == 8 || $Config{use64bitint} eq 'define') {

parrotbug/ uses a bunch of Config fields too. And then there's everything that uses an i_* field from Configure::Data, but I only SEE i_malloc getting called there. 
And finally, I don't know what uses config_lib.pasm, but it seems to write out a copy of Perl's Config with some extra stuff added in to runtime/parrot/include/config.fpmc, so anything that references that structure too... anyone know what references that? -- Aaron Sherman [EMAIL PROTECTED] Senior Systems Engineer and Toolsmith It's the sound of a satellite saying, 'get me down!' -Shriekback
Re: No-C, no programming project: Some configure investigation
On Mon, 2004-09-06 at 18:29, Aaron Sherman wrote: I think right now that info is all in config/init/data.pl, and it's Scratch that. I was grepping through the tree for 'Config{', which turns out not to catch the way %Config is used in most of the tree... I'll have a look and get you the details. -- Aaron Sherman [EMAIL PROTECTED] Senior Systems Engineer and Toolsmith "It's the sound of a satellite saying, 'get me down!'" -Shriekback
Re: Semantics for regexes
On Wed, 2004-09-01 at 17:00, Larry Wall wrote: : Let's get concrete: : : rule foo { a $x:=(b*) c } : "abbabc" : : So, if I understand Parrot and Perl 6 correctly (heh, fat chance), a : slight modification to the calling convention of the closure that : represents a rule (possibly even a raw .Closure) could add a pad that : the callee is expected to fill in with any hypotheticals defined during : execution. Okay, except that hypotheticality is an attribute of a variable's value, not of the pad it's in. Yes, I think I got that part, and perhaps I was being unclear or am still missing something. Here's what I was saying, a slightly different way: As you enter a rule, you establish a new, free-floating pad. It *is* stored on the current pad stack (so that its variables are available to the rule and its closures), but, more importantly, it is part of the rule's state because it is stored in C<$0>. When you bind a hypothetical it goes into this pad. When you unbind a hypothetical (fail/backtrack) it is deleted from this pad (its value doesn't just get undef). When you return from the rule (and this is the key), you return C<$0>, which, along with other state, contains a reference to this pad (and the pad, of course, contains a circular reference to C<$0>). The caller can now do one of two things:

* Push this pad onto its stack. Pro: simple and fast.
* Copy each variable from this pad in a smart way, searching up the pad stack for a candidate variable to replace, and defaulting to storing it in the inner-most pad as a new lexical.

I think the second one is the one you are describing (and described in A5). The first is, IMHO, the cleaner solution, but I'm not suggesting anything really, just pointing out the options. My real point is that if you just establish such a free-floating hypopad (sounds like something Dr. 
McCoy would use) in the rule, then you get all of the hypothetical/backtracking behavior that you want, regardless of how the caller integrates the variables with its scope. It also keeps rules from having to search up through existing scope levels themselves, keeping their complexity constrained to what they know best: matching regular expressions and grammars. Perl's calling conventions manage all of the extra complexity on return, and that's probably where stack-walking code should go anyway. : Essentially every close-paren triggers binding, and every back-track : over a close-paren triggers clearing. Yes, that's essentially correct. My quibble was simply that it may be hard to keep track of what to clear out in the case of calling a failure continuation. I'm not sure if that's going to be true or not, as thinking in terms of failure continuations hurts my brain ;-) Still, I'm 99% sure that what I describe above puts all of the "what to clear" state in the pad that you return. Nice and easy. A side point to Dan: In reading P6PE, I don't see an op for deleting an entry from a pad. At least for this, and I think for some other things that aren't coming to mind right now, it's probably going to be needed. If it's already there, but just not in P6PE, cool and thanks! ;-) -- 781-324-3772 [EMAIL PROTECTED] http://www.ajs.com/~ajs
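A toy model of the free-floating hypopad makes the bind-on-close-paren / delete-on-backtrack behavior concrete. Match, bind, and unbind are illustrative names, not PGE's real machinery:

```python
# Binding a hypothetical records it in the match's own pad; a
# backtrack past the close-paren deletes the entry entirely (the
# variable doesn't just get undef).

class Match:
    def __init__(self):
        self.pad = {}          # the hypothetical pad, part of $0's state

    def bind(self, name, value):
        self.pad[name] = value

    def unbind(self, name):
        del self.pad[name]     # the binding ceases to exist

m = Match()
m.bind("x", "bb")          # close-paren reached: $x := "bb"
m.unbind("x")              # backtrack over the close-paren
print("x" in m.pad)        # False -- gone, not undef
m.bind("x", "b")           # re-match with a shorter capture
print(m.pad["x"])          # b
```

Returning the whole Match (with its pad) then leaves the caller free to choose either integration strategy from the two-bullet list above.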
Re: Semantics for regexes
On Thu, 2004-09-02 at 11:27, Felix Gallo wrote: Although the next regex engine has to deal with the horribly crufty new perl6 syntax Keep in mind that Perl 6 regexen are really just Perl 5 regexen with a call stack and backtracking control. Absolutely everything else that I see in P6 is either just a different syntax for the same thing (e.g. character classes) or unrelated to the actual regex engine itself (e.g. hypotheticals). There's nothing that I see in this that would slow down a mundane regexp OTHER than Unicode, and in many respects P5 has already taken that hit. Now, that's not to say that:

rule perl6_program { <perl5_program> | <perl6_statement>* }

is supposed to run as fast as a Perl 5 regexp, but that's a WHOLE other beast, and we don't expect it to be as simple as a regexp. Under the hood, I would expect that P6 regexen will be broken down into matching and flow control parts, and handed off to Parrot differently. While there might or might not be an op for the matching part, the flow control part is just code (though code with significant magic, I will admit). In other words, you might see:

rule { a <b> c }

get broken down into:

rule { {pasm('regexp P0, P1, "a"')} <b> {pasm('regexp P0, P1, "c"')} }

As Dan points out, regexp might not exist, or (this seems more likely) might just be a call-back into a tiny regexp compiler that generates Parrot bytecode for the convenience of languages for which regex is not a core feature that the compiler would want to get its hands dirty with. -- 781-324-3772 [EMAIL PROTECTED] http://www.ajs.com/~ajs
Re: Semantics for regexes
Ok, I get it now, thanks Larry. I do still think that you can do what I suggest, but I realize that it's not as easy as handing around a single pad, you would actually need to maintain either a list of pads (outside of the built-in pad stack, probably inside of C<$0>) or a list of C<$0>s, each with their own pad. If you do that, there's something really NIFTY that falls out of it: premature (possibly temporary) exit from a rule results in the restoration of all hypotheticals to their pre-rule state (because the pads in which they live are no longer active (and possibly gone)). Should you resume such a rule (e.g. because you had gone off to handle a signal or exception), the hypotheticals all pick up their states again and proceed. This kind of atomic hypothetical-updating / reversion could be very valuable (and fast!), since exposing the state of hypotheticals at the moment of a signal or exception doesn't really make a lot of sense, and could cause some programs to fail in surprising ways. Thanks for entertaining this brain-detour of mine. I return you to your regularly scheduled language design without further ado. Sorry to inhabit your 1% unsureness, but that's precisely where I am. Never doubted it ;-) -- 781-324-3772 [EMAIL PROTECTED] http://www.ajs.com/~ajs
Re: perl6 garbage collector?
On Mon, 2004-08-30 at 14:40, Ozgun Erdogan wrote:

> Currently, we're using perl-5.6.1 and are having problems with memory leaks - thanks to reference counting. You'll have to break reference loops explicitly. If only I had known where those circular references are. I have a circular ref. detector tool, but it still doesn't get them. The thing is, you could do an SvREFCNT_inc, and boom you have a memory leak.

Ok, you're no longer talking about Perl (the language) but rather about Perl 5's internals. Different beast. This is not the right list for debugging that kind of thing, so I won't go into it, but suffice to say that if you have trouble managing your references through XS, incorporating Parrot's GC into Perl 5 would be near impossible. That's not intended as a slight, believe me; I put myself in the same category (reference counting in Perl 5 is very difficult to grok from the docs, as the docs make some assumptions about how much you know about how Perl constructs scopes). All that aside, Ponie is your friend. As Ponie matures, it will provide what you need, and your XS could be transitioned over into Parrot bytecode. For now, if I were you I would upgrade to 5.8.x and try to make sure that every value that you move between your XS and Perl is properly mortal (see the perlapi, perlguts and perlxs man pages).

--
781-324-3772 [EMAIL PROTECTED] http://www.ajs.com/~ajs
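The leak pattern described in this exchange is generic to reference counting, not specific to Perl 5's internals. Here is a minimal Python sketch (Python objects are also refcounted) of how a cycle defeats pure refcounting, plus the two usual remedies: breaking the loop explicitly, or holding one side weakly. The Node class is invented purely for illustration.

```python
import weakref

class Node:
    """Illustrative only -- stands in for any refcounted structure."""
    def __init__(self, name):
        self.name = name
        self.peer = None

# A reference cycle: under pure reference counting (Perl 5's model),
# neither refcount ever drops to zero, so neither object is freed.
a, b = Node("a"), Node("b")
a.peer = b
b.peer = a

# Remedy 1 -- break the loop explicitly, as the thread suggests:
a.peer = None

# Remedy 2 -- hold one side weakly so the cycle never forms:
c, d = Node("c"), Node("d")
c.peer = d
d.peer = weakref.ref(c)   # does not contribute to c's refcount
```

CPython papers over this with a separate cycle collector; a runtime with only plain refcounting (like Perl 5) never reclaims the `a`/`b` pair unless the program breaks the loop itself.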
Re: Library loading
On Sat, 2004-08-28 at 16:17, Dan Sugalski wrote: Time to finish this one and ensconce the API into the embedding interface. That reminds me, I was reading P6PE yesterday, and I came across a scary bit on loading of shared libraries. The statement was made that Parrot would search the current directory first. Perhaps this was an over-simplification, but if not, PLEASE, re-consider. Security implications aside (and they're huge), Parrot should probably be searching its installation area (possibly overridden by an environment variable) followed by whatever system path (e.g. LD_LIBRARY_PATH, ldconfig or whatever your OS uses) is given to Parrot externally, so as not to modify the behavior of a program based on the current directory of the user running it. -- 781-324-3772 [EMAIL PROTECTED] http://www.ajs.com/~ajs
Re: Proposal for a new PMC layout and more
On Wed, 2004-09-01 at 11:17, Leopold Toetsch wrote:

> Comments welcome,

Honestly, much of this goes beyond my meager understanding of Parrot internals, but I've read it, and most of it seems reasonable. Just one point where you may not have considered a logical alternative:

> =head2 2.6. Morphing Undefs
>
> Currently all binary (and other) opcodes need an existing destination PMC. The normal sequence a compiler emits is something like this:
>
>     $P0 = new Undef
>     $P0 = a + b

Since you've lopped a lot of space off of PMCs, Undefs could be made large enough to fit a basic buffer PMC (3 words). In that case, they could always be upgraded in-place to integer PMCs, float PMCs, very simple objects, references and buffers. Everything else would need to go through a copy-upgrade. The trade-off is that all PMCs would be 3 words unless special code was emitted that avoided this for smaller (integer, float, reference) PMCs. I'm not saying that this is a BETTER plan, just an idea to think about and a different set of trade-offs.

--
781-324-3772 [EMAIL PROTECTED] http://www.ajs.com/~ajs
Re: Semantics for regexes
On Wed, 2004-09-01 at 16:07, Larry Wall wrote:

> I see one other potential gotcha with respect to backtracking and closures. In P6, a closure can declare a hypothetical variable that is restored only if the closure exits unsuccessfully. Within a rule, an embedded closure is unsuccessful if it is backtracked over. But that implies that you can't know whether you have a successful return until the entire regex is matched, all the way down, and all the way back out the top, or at least out far enough that you know you can't backtrack into this closure. Abstractly, the closure doesn't return until the entire rest of the match is decided. Internally, of course, the closure probably returns as soon as you run into the end of it.

Let's get concrete:

    rule foo { a $x:=(b*) c }

    "abbabc"

So, if I understand Parrot and Perl 6 correctly (heh, fat chance), a slight modification to the calling convention of the closure that represents a rule (possibly even a raw .Closure) could add a pad that the callee is expected to fill in with any hypotheticals defined during execution. The following would happen in the example above:

    store_lex "bb" into hypopad($x) after "abb"
    find "a" and fail the rule, backtracking (clear hypopad($x))
    store_lex "b" into hypopad($x) after backtracking over one "b"
    find "b" next and fail the rule, backtracking again (clear)
    store_lex "b" into hypopad($x) after the second "ab"
    find "c" and succeed rule foo, return hypopad

Essentially every close-paren triggers binding, and every back-track over a close-paren triggers clearing. Because this is all part of the calling convention for a rule, there's no difference between a rule passing back hypotheticals to its caller and a sub-rule doing so to the rule which called IT. Is that workable? Does it address your concern, Larry, or did I miss your point?

--
781-324-3772 [EMAIL PROTECTED] http://www.ajs.com/~ajs
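The bind-on-close-paren / clear-on-backtrack sequence described here can be sketched as a toy matcher. This Python sketch hard-codes the pattern a (b+) c (using b+, as the author corrects in a follow-up) and uses a plain dict as the "hypopad"; all names are illustrative, and this is not how PGE actually works.

```python
def match_foo(s):
    """Toy backtracking match of  a (b+) c , binding the capture into a
    'hypothetical pad' that is cleared whenever the engine backtracks
    over the close-paren."""
    trace = []
    for start in range(len(s)):
        if s[start] != "a":
            continue
        end = start + 1
        while end < len(s) and s[end] == "b":
            end += 1                              # greedily consume b's
        for stop in range(end, start + 1, -1):    # then give them back
            hypopad = {"$x": s[start + 1:stop]}   # bind at the close-paren
            trace.append(("bind", hypopad["$x"]))
            if stop < len(s) and s[stop] == "c":
                return hypopad, trace             # rule succeeds; pad survives
            trace.append(("clear", None))         # backtracked over the paren
    return None, trace

pad, trace = match_foo("abbabc")
# bind "bb", clear, bind "b", clear, bind "b", succeed -- the same
# sequence the post walks through
```

The point of the sketch is the lifetime rule: a binding exists from the moment the group closes until either the rule succeeds (the pad is returned to the caller) or the engine backtracks over the group (the pad entry is discarded).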
Re: Semantics for regexes
On Wed, 2004-09-01 at 16:33, Aaron Sherman wrote: rule foo { a $x:=(b*) c } In the rest of my message I acted as if that read: rule foo { a $x:=(b+) c } so, we may as well pretend that that's what I meant to say ;-) -- 781-324-3772 [EMAIL PROTECTED] http://www.ajs.com/~ajs
OT: SPF problem with the list?
Please let me know who is appropriate for this, and whatever you do, please don't reply to / CC the list. We don't need to bog down the works with discussion of spam filtering. I'm noticing that mail from perl6-* is showing up with this header:

    Received-SPF: softfail (mail.ajs.com: transitioning domain of perl.org does not designate 63.251.223.186 as permitted sender) client-ip=63.251.223.186; [EMAIL PROTECTED]; helo=lists.develooper.com;

That is added by my local SPF-checker. It seems that x6.develooper.com [63.251.223.186], which is sending out this mail, is not in perl.org's SPF record (which would be fine if perl.org had no SPF record, but it does). There's an easy way to say "and all of this other domain's MXes too" in SPF, which is probably what was intended. This is causing my spam filtering to slightly bump p6 mail toward spam (though so far, I don't think I've gotten any false positives). I take a somewhat proprietary interest in perl.org working well for various historical/sentimental reasons, so I'd be happy to help with any debugging / diagnosing of this if that would help.

--
781-324-3772 [EMAIL PROTECTED] http://www.ajs.com/~ajs
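For reference, the "other domain's MXes too" fix alluded to above is SPF's mx and include mechanisms. A hypothetical zone-file sketch (this is not perl.org's actual record of the time):

```
; Hypothetical sketch only -- not perl.org's actual DNS record.
; "mx" authorizes this domain's MX hosts, and "include:" pulls in
; another domain's whole policy: the "all of this other domain's
; MXes too" mechanism mentioned above.
perl.org.  IN  TXT  "v=spf1 mx include:develooper.com ~all"
```

With a record like this, mail relayed through the other domain's designated senders would pass instead of soft-failing.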
Re: NCI and callback functions
Leopold Toetsch wrote: Leopold Toetsch wrote: Stephane Peiry wrote: g_return_val_if_fail (G_IS_OBJECT (gobject), 0); Fails here gtk shouldn't make assumptions on the user_data argument IMHO. The whole idea behind callbacks is that there is a userdata argument that gets passed through transparently. GTK is taking a gobject only. So there is currently no way to use the existing callback scheme. Can't you wrap what you want to pass in a GObject?
Re: Something to ponder
On Wed, 2004-08-18 at 15:57, Felix Gallo wrote:

> Dan writes: sub foo :come_from('+', int, int) {} One problem with MMD in general, and return specifically, is 'what happens if multiple M match the same D requirements?' i.e.,

That's a question, not a problem. It's easy to answer questions ;-) I assume we're talking about first-to-match only, but I haven't looked at the code. You could always go look at the MMD code in the Parrot source... However, I'm not sure what Dan meant there. Perhaps he mis-spoke, or perhaps I don't understand this at all... that's a calling signature, not a return signature. I would expect:

    sub foo :come_from('+', int) {...} # Handle integer returns
    sub foo :come_from('+', num) {...} # Handle floating point

That's very different from a come_from that would operate on the calling signature (which needs return continuations too, but differently). Or did we just switch to talking about something different while I wasn't looking?

> If the answer is 'all get executed', this could be useful for any languages interested in implementing aspect-oriented programming as a first class language feature, e.g.

You can build one from the other trivially, though, and that doesn't affect, in the slightest, how first class the feature is in a language that uses Parrot, only how interchangeable it is between languages. That, of course, dodges the question of how much aspect-oriented programming is an attempt to bring the beauty of Intercal to ugly, usable programming languages...

> sub debug_log :come_from(:benchmark_me) {
>     my $function_name = shift;
>     print STDERR "debug: $function_name at " . time() . "\n";
> }

Ok, this is starting to look like people speaking seriously about using Intercal's COME FROM (http://c2.com/cgi/wiki?ComeFrom)... can we just step back and take a deep breath of AIR please? Seriously, this is starting to creep me out.

--
781-324-3772 [EMAIL PROTECTED] http://www.ajs.com/~ajs
Re: Something to ponder
On Tue, 2004-08-17 at 16:22, Felix Gallo wrote:

> On Tue, Aug 17, 2004 at 04:08:34PM -0400, Dan Sugalski wrote:
> > 1) We're going to have MMD for functions soon 2) Function invocation and return continuation invocation's essentially identical 3) Therefore returning from a sub/method can do MMD return based on the return values
>
>     $x      -\
>     @mylist -+--- $obj.mymmdsub;
>     %hash   -/

How very fungible of you ;-) Still, I think that's a nice APPLICATION, but the concept is more flexible, if I understand it correctly. It would be something that would look more like a cross between exception handling and a switch statement. I would think it would look more like (again, Perlish example):

    $sock.peername() does returntype(
        Net::Address::IP -> $ip   { die "Remote host unresolvable: '$ip'"; },
        Net::Address     -> $addr { die "Non IP unresolvable address: '$addr'"; },
        Str              -> $_    { print "Seemingly valid hostname: '$_'\n"; });

Of course, that's just Perl. Perhaps Python would add something that would look like:

    returnswitch:
        sock.peername()
    returncase os.net.addr.ip:
        lambda ip: raise OSError, "Unresolvable %s" % str(ip)
    returncase os.net.addr:
        lambda addr: raise OSError, "Unresolvable non-IP %s" % str(addr)
    returncase str:
        lambda name: print "Seemingly valid hostname: '%s'" % name

My Python skills are still developing, so pardon me if I've gotten it wrong, and I'm just inventing os.net.addr.ip for purposes of pseudo-code. Is that the kind of thing you had in mind, Dan, or am I misunderstanding how return continuations work?

--
781-324-3772 [EMAIL PROTECTED] http://www.ajs.com/~ajs
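The return-type-switch idea above can be sketched with ordinary runtime dispatch. In this Python sketch all names (return_switch, the address classes) are invented for illustration; the point is only the most-specific-case-first matching on whatever a call returned.

```python
def return_switch(value, cases):
    """Dispatch on the type of a returned value, most specific case
    first -- the shape of 'MMD on return values', not a real Parrot
    or Perl 6 API."""
    for typ, handler in cases:
        if isinstance(value, typ):
            return handler(value)
    raise TypeError("no return case matched %r" % (value,))

class NetAddress:               # stand-ins for the Net::Address types above
    pass

class NetAddressIP(NetAddress):
    pass

cases = [
    (NetAddressIP, lambda ip: "unresolvable IP"),
    (NetAddress,   lambda a:  "unresolvable non-IP"),
    (str,          lambda s:  "hostname: %s" % s),
]
```

For example, `return_switch("example.com", cases)` falls through the two address cases and lands on the str handler, just as the Str arm fires last in the Perlish example.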
Re: What Unicode means to us
On Mon, 2004-08-09 at 14:14, Dan Sugalski wrote:

> Additionally if we have source text which is Latin-n, EBCDIC, ASCII, or whatever we must be able to convert it with no loss to Unicode. (Which I believe is now doable with Unicode 4.0) Losslessly converting Unicode to ASCII/EBCDIC/whatever is *not* required, which is fine as it's theoretically (and often practically) impossible.

Can I suggest instead: If we have source text which is comprised of a non-Unicode character set, we must be able to convert it with minimal loss to Unicode (minimal being defined as zero for all Unicode-subset character sets). Converting Unicode to non-Unicode character sets will be lossless where possible, and will attempt to encode the name of the character in ASCII characters into the target character set. An example would be the conversion of the UTF-8 string (in Perl 5 notation):

    "foo \x{263a} bar"

to the ASCII representation:

    "foo {SMILING FACE, WHITE} bar"

There are 4 possible failure modes, each resulting in a conversion exception:

1) the ASCII name is not available
2) the ASCII name cannot be converted into the target character set (recursive name-lookups are not allowed, nor would they be very useful)
3) a VM parameter requesting exceptions on failed character-set conversions has been set to a true value
4) the source is a PMC and that PMC has a property indicating that exceptions should be generated on failed conversions.

This just seems a bit more useful in the general case to me, while allowing the language implementation the option of requesting an exception either globally or per-PMC. Thoughts?

--
781-324-3772 [EMAIL PROTECTED] http://www.ajs.com/~ajs
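The name-substitution conversion proposed above is easy to sketch with the Unicode character database. A Python sketch follows; note that the official Unicode name for U+263A is WHITE SMILING FACE (word order differing slightly from the post's rendering), and a real implementation would also have to cover failure modes 2-4.

```python
import unicodedata

def to_ascii_with_names(s):
    """Replace every non-ASCII character with its Unicode name in
    braces -- a sketch of the proposed conversion, handling only
    failure mode 1 (no name available)."""
    out = []
    for ch in s:
        if ord(ch) < 128:
            out.append(ch)
        else:
            try:
                out.append("{%s}" % unicodedata.name(ch))
            except ValueError:   # failure mode 1: no name available
                raise UnicodeError("no name for U+%04X" % ord(ch))
    return "".join(out)
```

So `to_ascii_with_names("foo \u263a bar")` yields "foo {WHITE SMILING FACE} bar", and an unnamed code point raises instead of silently substituting a default character.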
Re: What Unicode means to us
I don't want to argue per se (that doesn't do anyone any good), so if your mind is made up, that's cool... still, I think there's some value in exploring the options, so read on if you're so inclined.

On Wed, 2004-08-11 at 04:40, Dan Sugalski wrote:

> > Converting Unicode to non-Unicode character sets will be lossless where possible, and will attempt to encode the name of the character in ASCII characters into the target character set.
> Gack. No, I think this'd be a bad idea as the default behavior.

Well ok, why not make an exception the default behavior then? Just reverse what I suggested from the default to the option. It's still mighty handy for a language (any Parrot-based language) to be able to render a meaningful string in any ASCII-capable encoding from any Unicode subset. I think the only problem would be in the realm of directionality of script, but I assume that all non L-R scripts have some convention for injecting snippets of L-R, just as en-US injects R-L, easy as .

> What's right is up in the air -- I'm figuring we'll either throw an exception or substitute in a default character, but the full expansion's definitely way too much.

That's too bad, as "This was converted from Ｕnicode" becoming "This was converted from {FULLWIDTH LATIN CAPITAL LETTER U}nicode" seems much more reasonable than choosing some poor ASCII character to act as the fallback. If someone does something stupid like converting a 5MB document in UTF-8 encoded Cyrillic into ASCII, then they're going to get a huge result, but that's no less useful than 3MB of text that looks like "** ***-**. ***'* *", I would think, and perhaps more useful for certain purposes (e.g. it could still be deciphered and/or re-assembled). The other way to go would be some sort of standardized low-level notation to represent encoding and codepoint such as:

    "This was converted from {U+FF35}nicode"

That's less readable, but arguably more reversible and/or precise. Certainly more easily automatically detected.
For example, the following Perl 5 code could reverse such a transformation:

    s{\{(.)\+([a-f\d]+)\}}{
        character(target_encoding  => $target_encoding,
                  source_encoding  => abbrv_to_encoding($1),
                  source_codepoint => hex("0x" . $2))
    }eg;

assuming, of course, a function character() and a function abbrv_to_encoding(), which attempt to generate a character in a target encoding based on a character in a source encoding, and to return an encoding ID/name/object/whatever based on a one-character abbreviation, respectively. It would be ideal if other tactics could be used, like the GB 2312 encoding in ASCII described in RFC 1842. Of course, the above could be permuted this way:

    "This was converted from {G+~{:Ky2;S{#,NpJ)l6HK!#~}}"

But that starts to get deeper into character set and encoding transformation than my head is capable of coping with at this stage (I'm really just learning about these topics). I fear I'm walking down a road that ends in my suggesting that every non-Unicode string has a MIME header, but rest assured that that's not my goal. I just wanted to suggest a useful alternative to throwing an exception on incompatible type conversion, especially for those client languages (e.g. m4) in which an exception will either have to be ignored or treated as fatal.

--
781-324-3772 [EMAIL PROTECTED] http://www.ajs.com/~ajs
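The Unicode half of that reversal is simple to sketch in Python; the function name here is invented, and unlike the Perl snippet above it handles only the {U+XXXX} form, not per-charset abbreviations like the GB 2312 example.

```python
import re

def from_codepoint_notation(s):
    """Reverse the {U+XXXX} notation back into characters -- the
    Unicode-only subset of the transformation sketched above."""
    return re.sub(r"\{U\+([0-9A-Fa-f]+)\}",
                  lambda m: chr(int(m.group(1), 16)),
                  s)
```

Running it on the post's example, `from_codepoint_notation("This was converted from {U+FF35}nicode")` restores the fullwidth U, which is the "arguably more reversible" property being claimed.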
Re: We have spawn, and now we need exec
On Thu, 2004-08-05 at 13:43, Dan Sugalski wrote:

> Cool. On the Unix platforms we exec off 'sh' and pass in parameters (so we get command parameters split up right, IIRC). I'm presuming we don't do the same for Windows, so I'll make it the plain command and hope it all works out.

Well, that's one way you can do it, but it causes a ton of headaches, e.g. because exec "echo user's text goes here" gets shell interpretation and fails (the apostrophe opens an unterminated shell quote), so by way of example only, Perl 5 allows for both usages depending on what you pass it. Parrot could easily make the distinction based on being passed a string value or a PMC array of some sort, and end up with roughly the same functionality as Perl (though Perl itself would not use this as-is, as it decides further based on the content of the string, and will call raw exec(2) on the results of splitting the string on whitespace if no shell metacharacters occur, but I think that's a bit too much Perlishness to put in Parrot). Either way, Parrot really HAS to provide a raw POSIX exec, as it cannot be faked from a shell-using variant correctly.

--
781-324-3772 [EMAIL PROTECTED] http://www.ajs.com/~ajs
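The string-versus-list distinction described above can be sketched as a small dispatcher. This Python sketch approximates the Perl-5-style heuristic, it is not Perl's actual implementation, and the metacharacter set is approximate.

```python
import subprocess

# Approximate set of characters that force shell interpretation
SHELL_META = set("&|;<>()$`\\\"'*?[]#~=%")

def strategy(cmd):
    """Decide how to launch: a list bypasses the shell entirely; a
    string uses the shell only if it contains shell metacharacters,
    and is otherwise split on whitespace and exec'd directly."""
    if isinstance(cmd, (list, tuple)):
        return "execvp"
    if set(cmd) & SHELL_META:
        return "shell"
    return "split-and-execvp"

def run(cmd):
    """Launch using the chosen strategy (illustrative only)."""
    how = strategy(cmd)
    if how == "execvp":
        return subprocess.run(list(cmd))     # raw exec, no shell involved
    if how == "shell":
        return subprocess.run(cmd, shell=True)
    return subprocess.run(cmd.split())
```

Note that the list form sidesteps the quoting problem entirely: `run(["echo", "user's text goes here"])` never touches a shell, which is why the raw POSIX exec path cannot be faked by a shell-using variant.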
Re: We have spawn, and now we need exec
On Thu, 2004-08-05 at 14:11, Aaron Sherman wrote:

> Parrot could easily make the distinction based on being passed a string value or a PMC array of some sort and end up with roughly the same functionality as Perl (though Perl itself would not use this as-is, as it decides further based on the content of the string, and will call raw exec(2) on the results of splitting the string on whitespace if no shell metacharacters occur, but I think that's a bit too much Perlishness to put in Parrot).

"Run-on sentence from hell" barely begins to describe the horror... I'm so sorry about that. Please, do me a favor and breathe whenever you feel like it ;-)

--
781-324-3772 [EMAIL PROTECTED] http://www.ajs.com/~ajs
Re: Python builtin namespace
On Thu, 2004-07-15 at 22:46, Dan Sugalski wrote: And language builtin namespaces in general. We need a standard, and now's as good a time as any, so... All language-specific builtin functions go into the _core_Language namespace. (So for Python it's _core_Python, Perl 5 is _core_Perl5, and so on) In the specific case of Perl 5 and 6, aren't builtins in the same (Parrot) namespace as user-defined functions? In Perl 5, you can access builtins through the CORE:: (Perl) namespace, which certainly is visible to all Perl programs. Would there be some sort of namespace aliasing for this sort of thing (or the Microsoft naming scheme for C#/.Net Framework for that matter)? -- Aaron Sherman [EMAIL PROTECTED] Senior Systems Engineer and Perl Toolsmith http://www.ajs.com/~ajs/resume.html
[Fwd: Re: Layering PMCs]
I sent this message out a few days ago, but never saw it show up on the list... Just to recap a) option #1 seemed best to me b) this will all happen at the parrot level c) languages will almost never change an object to read-only d) there are some reasons that old access to an object should not become read-only e) true read-onlyness will probably most often be optimized at the language level by storing cached values in typed registers anyway -- Aaron Sherman [EMAIL PROTECTED] Senior Systems Engineer and Perl Toolsmith http://www.ajs.com/~ajs/resume.html ---BeginMessage--- On Sat, 2004-05-29 at 15:29, Dan Sugalski wrote: The problem with the first scheme is that anything that has a handle on the PMC will not get the new layers. Not a good thing. I like the first scheme. The question that comes up is: when does something get layered? That is: if I have code that says: new_thread_increment_every_minute(foo) become_read_only(foo) Then, you have two choices of semantic: 1. foo is read-only retroactively and will throw an exception in one minute 2. You've given the new thread a read-write version of foo, and the read-only version created after that now has the property of changing every minute. I would see this as being very useful for several types of read-only access to data that DOES change (an accumulator for a random number entropy pool, for example). High level languages on the other hand, should probably not expose this directly. They will create a variable and tag it as read only at the same time, and to the programmer there's no difference. If they do allow for run-time read-only-ification, they can always build their own high-level abstraction around this core PMC. The only problem I see with this is that high level languages might want to cache the value of a read-only variable in a typed register. If read only really is read only, that's valid, but if it's only an interface restriction it's not. 
There you're going to have some semantic boundaries between languages that might be unfortunate. How much of a problem that is, I'm not sure. As for threading, I think the simple layering is easiest, and again, you create the PMC layered if you want that functionality (e.g. for locking). -- Aaron Sherman [EMAIL PROTECTED] Senior Systems Engineer and Toolsmith It's the sound of a satellite saying, 'get me down!' -Shriekback ---End Message---
Re: Layering PMCs
On Sat, 2004-05-29 at 15:29, Dan Sugalski wrote: The problem with the first scheme is that anything that has a handle on the PMC will not get the new layers. Not a good thing. I like the first scheme. The question that comes up is: when does something get layered? That is: if I have code that says: new_thread_increment_every_minute(foo) become_read_only(foo) Then, you have two choices of semantic: 1. foo is read-only retroactively and will throw an exception in one minute 2. You've given the new thread a read-write version of foo, and the read-only version created after that now has the property of changing every minute. I would see this as being very useful for several types of read-only access to data that DOES change (an accumulator for a random number entropy pool, for example). High level languages on the other hand, should probably not expose this directly. They will create a variable and tag it as read only at the same time, and to the programmer there's no difference. If they do allow for run-time read-only-ification, they can always build their own high-level abstraction around this core PMC. The only problem I see with this is that high level languages might want to cache the value of a read-only variable in a typed register. If read only really is read only, that's valid, but if it's only an interface restriction it's not. There you're going to have some semantic boundaries between languages that might be unfortunate. How much of a problem that is, I'm not sure. As for threading, I think the simple layering is easiest, and again, you create the PMC layered if you want that functionality (e.g. for locking). -- Aaron Sherman [EMAIL PROTECTED] Senior Systems Engineer and Toolsmith It's the sound of a satellite saying, 'get me down!' -Shriekback
Re: Please become ID verified.
On Mon, May 24, 2004 at 09:48:45PM -0400, Uri Guttman wrote: is there a paypal PMC in the plans? will it be multi-accounted? will it have built in auth support? what about rounding errors? In case it was not obvious, the Paypal message was a scam to get people's passwords. The offending host appears to be 81.196.122.75. -- Aaron Sherman [EMAIL PROTECTED] finger [EMAIL PROTECTED] for GPG info. Fingerprint: www.ajs.com/~ajs6DC1 F67A B9FB 2FBA D04C 619E FC35 5713 2676 CEAF Visit my Mushroom Journals at http://mush.ajs.com/
RE: Events (I think we need a new name)
On Fri, 2004-05-14 at 06:27, Rachwal Waldemar-AWR001 wrote: It seems the name 'event' is not as bad. So, maybe 'Pevent', stands for 'parrot event'? One advantage... it'd be easy searchable. I recall a pain whenever I searched for 'thread', or 'Icon'. If you're talking about search engines, then of course parrot event works just fine. If you're talking about searching your code, then that's another matter. -- Aaron Sherman [EMAIL PROTECTED] Senior Systems Engineer and Toolsmith It's the sound of a satellite saying, 'get me down!' -Shriekback
Re: Events (I think we need a new name)
On Wed, 2004-05-12 at 12:08, Dan Sugalski wrote: It does, though, sound like we might want an alternate name for this stuff. While event is the right thing in some places it isn't in others (like the whole attribute/property mess) we may be well-served choosing another name. I'm open to suggestions here... How about skippy? Seriously, I would say that event is about as abstract as it comes. Even the proposed message is, in some ways, LESS abstract. What's the specific sort of case events don't seem to cover? The setting of a property? -- Aaron Sherman [EMAIL PROTECTED] Senior Systems Engineer and Toolsmith It's the sound of a satellite saying, 'get me down!' -Shriekback
Re: Patch: Do rand() and srand()
On Wed, 2004-04-28 at 09:54, Aaron Sherman wrote: A simple implementation of rand() and srand() which may not be ideal for Perl. Also included is the test file for random ops. If anyone can think of a good way to ALWAYS know that a number we got back was random, throw that into the test ;-) Was this going to get sucked up into CVS? I'm just having to keep patching it back in each time I update for now. -- Aaron Sherman [EMAIL PROTECTED] Senior Systems Engineer and Toolsmith It's the sound of a satellite saying, 'get me down!' -Shriekback
Re: Events design question: Handles for repeating events?
On Tue, 2004-05-04 at 09:25, Dan Sugalski wrote:

> Okay, I'm working up the design for the event and IO system so we can get that underway (who, me, avoid the unpleasantness of strings? Nah... :) and I've come across an interesting question. The way things are going to work with specific actions a program has asked to be done, such as a disk read or write, is that you get back a handle PMC that represents the request, and you can wait on that handle for the request to be completed. The sequence goes something like:
>
>     write Px, Py, Sz   # Return handle, file, and data to write
>     waitfor Px         # Wait for the request to finish

So, all Parrot IO will be asynchronous? Does that mean that there's no way to perform an atomic read or write?

--
Aaron Sherman [EMAIL PROTECTED] Senior Systems Engineer and Toolsmith
It's the sound of a satellite saying, 'get me down!' -Shriekback
Re: Events design question: Handles for repeating events?
On Tue, 2004-05-04 at 11:36, Dan Sugalski wrote:

> At 11:25 AM -0400 5/4/04, Aaron Sherman wrote:
> > So, all Parrot IO will be asynchronous? Does that mean that there's no way to perform an atomic read or write?
> Yes, and there isn't now anywhere anyway so it's not a big deal.

I was speaking in terms of Parrot. Obviously, at the OS level some writes are guaranteed atomic (e.g. POSIX dictates that writes of PIPE_BUF or fewer bytes are atomic on a pipe, but that's neither here nor there) and others are not. What I was asking was more in terms of what could happen to Parrot while your write is in an unknown state. Specifically, I'm concerned that I might want to say:

    become immune to any events
    perform write
    re-sensitize to events

But, if writes are implemented using the event-handling system, won't that mean that you can't actually do that? Here's one scenario for a filter that I think demonstrates my concern:

    read event handler: perform synchronous write

This simple example might perform a partial write, then get a read event and queue up a second write, perform a partial write on that, then queue up a third write due to another event. You can feel free to tell me there's some obvious way this is avoided, as I admit I'm no expert in the domain of asynchronous IO management.

--
Aaron Sherman [EMAIL PROTECTED] Senior Systems Engineer and Toolsmith
It's the sound of a satellite saying, 'get me down!' -Shriekback
Re: Bit ops on strings
On Sat, 2004-05-01 at 04:57, Jarkko Hietaniemi wrote:

> > If Jarkko tells me you can do bitwise operations with unicode text now in Perl 5, well... we'll support it there, too, though we shan't like it at all.
> We can and I don't like it at all [...] None of it anything I want to propagate anywhere.

Please correct me if I'm wrong here, but I'm going to lay out my understanding as a set of assertions:

* Parrot will be able to convert any encoding to any other encoding
* though some conversions will result in an exception, that's still a defined behavior
* We've agreed that only raw binary 8-bit strings make sense for bit vector operations

So it seems to me that the obvious way to go is to have all bit-string operations first convert to raw bytes (possibly throwing an exception) and then proceed to do their work. This means that UTF-8 strings will be handled just fine, and (as I understand it) some subset of Unicode-at-large will be handled as well. In other words, the burden goes on the conversion functions, not on the bit ops. It's not that it's going to be meaningful in the general case, but if you have code like:

    sub foo() { return "\x01" +| "\x02" }

I would expect to get the bit-string "\x03" back even though strings may default to Unicode in Perl 6. You could put this on the shoulders of the client language (by saying that the operands must be pre-converted), but that seems to be contrary to Parrot's usual MO. Let me know. I'm happy to do it either way, and I'll look at modifying the other bit-string operators if they don't conform to the decision.

--
Aaron Sherman [EMAIL PROTECTED] Senior Systems Engineer and Toolsmith
It's the sound of a satellite saying, 'get me down!' -Shriekback

signature.asc Description: This is a digitally signed message part
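The convert-to-raw-bytes-first rule proposed here can be sketched directly. A Python sketch; the function name is invented, and the NUL-padding of the shorter operand mirrors Perl 5's string | semantics.

```python
def string_bitor(a, b):
    """Bitwise OR of two strings under the rule proposed above:
    convert both operands to raw 8-bit bytes first (raising if any
    code point has no 8-bit form), then operate on the bytes."""
    for s in (a, b):
        for ch in s:
            if ord(ch) > 0xFF:
                raise UnicodeError(
                    "code point U+%04X has no raw 8-bit form" % ord(ch))
    ab, bb = a.encode("latin-1"), b.encode("latin-1")
    n = max(len(ab), len(bb))
    ab, bb = ab.ljust(n, b"\0"), bb.ljust(n, b"\0")   # pad shorter with NULs
    return bytes(x | y for x, y in zip(ab, bb)).decode("latin-1")
```

This gives `string_bitor("\x01", "\x02") == "\x03"` and `string_bitor("A", "B") == "C"` (0x41 | 0x42 = 0x43), while a code point like \x{100} raises a conversion exception, which is the behavior the thread settles on.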
Re: Bit ops on strings
On Sat, 2004-05-01 at 11:26, Jarkko Hietaniemi wrote:

> As for codepoints outside of \x00-\xff, I vote exception.

I don't think there's any other logical choice, but I think it's just an encoding conversion exception, not a special bit-op exception (that's arm-waving, I have not looked at Parrot's exception model yet... miles to go...)

> > This means that UTF-8 strings will be handled just fine, and (as I
> Please don't mix encodings and code points. That strings might be serialized or stored as UTF-8 should have no consequence with bitops.

What I meant was that UTF-8 IS going to be represented in a way that will guarantee you won't get an exception when trying to do bit-ops. All bets are off for many other encodings. While you're right that you might get lucky, that wasn't really the point I was making. Many languages (Perl included, I think) are going to encode strings as UTF-8 by default, and this means that in the general case, we should not expect exceptions to be thrown around any time we do a bit-op, and 'A'|'B' will still be 'C' :-)

> Of course. But I would expect a horrible flaming death for "\x{100}" +| "\x02".

Well, if you consider a string conversion exception to be horrible flaming death, then I hate to see what you do with a divide-by-zero ;-) None of your response sounds overly scary to me, so I'll start looking at what Parrot does NOW for bit-string-ops and see if it needs to mutate to fit this model. Then I'll add in the rest. Then I get to see what evil Dan and Leo perform upon my patch ;-)

--
Aaron Sherman [EMAIL PROTECTED] Senior Systems Engineer and Toolsmith
It's the sound of a satellite saying, 'get me down!' -Shriekback

signature.asc Description: This is a digitally signed message part
Re: Bit ops on strings
On Sat, 2004-05-01 at 14:18, Jeff Clites wrote:

> On May 1, 2004, at 8:26 AM, Jarkko Hietaniemi wrote:
>
> Just FYI, the way I implemented bitwise-not so far, was to bitwise-not code points 0x{00}-0x{FF} as uint8-sized things, 0x{100}-0x{FFFF} as uint16-sized things, and above 0x{FFFF} as uint32-sized things (but then bit-masking them to make sure that they fell into a valid code point range). That's pretty arbitrary, but if you bitwise-not as though everything were 32-bits wide, you'll end up with a string containing no assigned code points at all (they'll all be above 0x10FFFF). But from a text point of view, bitwise-not on a string isn't a sensible operation no matter how you slice it (that is, even for 0x{00}-0x{FF}), so one flavor of arbitrary is just about as good as any other. We could also make anything above 0x{FF} map to either 0x{00} or 0x{FF}, or mask it with 0xFF to push it into that range. It's all pretty meaningless, as text transformations go, and I can't imagine anyone using it for anything, except maybe weak encryption.

I think Dan and I were both thinking in terms of bit-vector operations on byte-streams for any purpose that would require such a beast. In Perl, you have the vec function to make this slightly easier. This is one of those places where thinking about strings as text is highly misleading. They're used for an awful lot more.

> Exactly. And also realize that if you bitwise-not (or shift or something similar) the bytes of a UTF-8 serialization of something, the result isn't going to be valid UTF-8, so you'd be hard-pressed to lay text semantics down on top of it.

How are you defining valid UTF-8? Is there a codepoint in UTF-8 between \x00 and \xff that isn't valid? Is there a reason to ever do bitwise operations on anything other than 8-bit codepoints?
I'm beginning to wonder if we're going to be square-rooting strings, and taking the array-th root of a hash :)

Strings are not numbers, but there's a heck of a lot of code out there that treats existing strings as bit-vectors (note: bit vectors are not numbers either), and that code needs to be supported, no? Now, shift operations aren't usually part of the package, but I figured that as long as we were going to have the rest of the bit-manipulators, finishing off the set would be of value.

More to the point, I said all of this at the beginning of this thread. You should not, at this point, be confused about the scope of what I want to do, as it was very narrowly and clearly defined up-front.

-- 
Aaron Sherman [EMAIL PROTECTED]
Senior Systems Engineer and Toolsmith
It's the sound of a satellite saying, 'get me down!' -Shriekback
Re: Bit ops on strings
On Sat, 2004-05-01 at 15:09, Jarkko Hietaniemi wrote:

How are you defining valid UTF-8? Is there a codepoint in UTF-8 between \x00 and \xff that isn't valid? Is there a reason to ever do

Like, half of them? \x80 .. \xff are all invalid as UTF-8.

Heh, damn Ken Thompson and his placemat! I am too new to UCS and UTF-8, and had thought it was always 8-bit. I stand corrected, having read up on the UTF-8 and Unicode FAQ.

Jeff, yeah, I have to take back my statement. If Perl defaults to UTF-8, then it's not a valid assumption that a UTF-8 input string won't throw an exception. I still think that's OK, and better than representation-expanding to the larger representation and doing the bit-op in that, since that means that bit-vectors would have to be valid in enum_stringrep_one, _two and _four as sort of alternate data structures. I don't think we want to go there.

For everything else, as Jeff correctly points out, this has nothing to do with encoding.

Only in the sense that default encoding in a language like (only one example) Perl 6 dictates what representation you will have to expect to be the common case.

bitwise operations on anything other than 8-bit codepoints?

I am very confused. THIS IS WHAT WE ALL SEEM TO BE SAYING. BITOPS ONLY ON EIGHT-BIT DATA. AM I WRONG?

No, it's not, and could you please not get emotional about this? It's what you, Dan and I have been saying, but I was responding to Jeff, who said:

Just FYI, the way I implemented bitwise-not so far, was to bitwise-not code points 0x{00}-0x{FF} as uint8-sized things, 0x{100}-0x{} as uint16-sized things, and 0x{} as uint32-sized things (but then bit-masking them with 0xF to make sure that they fell into a valid code point range).

It was kind of important that I deal with the fact that I was proposing a very different behavior for bit-shifting than exists currently for boolean operations, I thought.
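Jarkko's correction — that no byte in \x80-\xff is valid UTF-8 on its own — checks out against the encoding's structure (illustrative Python; the helper name is invented for this sketch):

```python
# In UTF-8, 0x00-0x7F are complete one-byte sequences; 0x80-0xBF are
# continuation bytes and 0xC0-0xFF are lead bytes of multi-byte
# sequences, so no byte in 0x80-0xFF is a valid string by itself.
def standalone_valid(byte_val: int) -> bool:
    try:
        bytes([byte_val]).decode("utf-8")
        return True
    except UnicodeDecodeError:
        return False

assert all(standalone_valid(b) for b in range(0x80))
assert not any(standalone_valid(b) for b in range(0x80, 0x100))
```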
The question becomes: should I CHANGE the existing bit-ops so that they don't work on representations in two or four bytes, for symmetry?

If this continues to be so contentious, I'm tempted to agree with the nay-sayers and say that Parrot shouldn't do bit-vectors on strings, and we should just implement a bit-vector class later on. Perl will just have to suffer the overhead of translation. This just IS NOT important enough to waste this many brain cells on.

-- 
Aaron Sherman [EMAIL PROTECTED]
Senior Systems Engineer and Toolsmith
It's the sound of a satellite saying, 'get me down!' -Shriekback
Re: Bit ops on strings
On Fri, 2004-04-30 at 10:42, Dan Sugalski wrote: Bitstring operations ought only be valid on binary data, though, unless someone can give me a good reason why we ought to allow bitshifting on Unicode. (And then give me a reasoned argument *how*, too) 100% agree. If you want to play games with any other encoding, you may proceed to write your own damn code ;-) -- Aaron Sherman [EMAIL PROTECTED] Senior Systems Engineer and Toolsmith It's the sound of a satellite saying, 'get me down!' -Shriekback
RE: Bit ops on strings
On Fri, 2004-04-30 at 09:47, Butler, Gerald wrote:

If I may interject for a moment:

Let me start by saying that I have not drunk the Unicode cool-aid. I'm not at all certain that the overhead required to do all of what Parrot wants to do is warranted, BUT that's beside the point. Parrot is doing things the way it's doing them, and the time for debate was a few months, or at latest weeks, ago, as far as I can tell.

I have been following the discussion of strings on this list over the last few weeks. It seems that there is somewhat of a disconnect in various definitions of what is a string. It seems as though there needs to be a hierarchy to this with a little more clear definition. May I humbly propose the following: 1. String - low-level, abstract, base class (or in Perl6 terms role -- I think) which represents a logically contiguous series of Parrot Int

You say that you think there should be a hierarchy, but you're just throwing out broad concepts and applying them equally to terminology, representation and implementation. As such, there is no good way to respond to what you suggest, nor any way to determine how much work you are proposing be performed in order to bend existing code to your suggested paradigm. A string is what Dan described in his various postings on strings. Nuff said.

###

Aside from the rest of your message, and bearing no logical impact on the rest of it, I'd like to call out:

The information contained in this e-mail message is privileged and/or confidential and is intended only for the use of the individual or entity named above. If the reader of this message is not the intended recipient, or the employee or agent responsible to deliver it to the intended recipient, you are hereby notified that any dissemination, distribution or copying of this communication is strictly prohibited. If you have received this communication in error, please immediately notify us by telephone (330-668-5000), and destroy the original message. Thank you.
Need I point out http://www.goldmark.org/jeff/stupid-disclaimers/ -- Aaron Sherman [EMAIL PROTECTED] Senior Systems Engineer and Toolsmith It's the sound of a satellite saying, 'get me down!' -Shriekback
RE: Bit ops on strings
On Fri, 2004-04-30 at 12:18, Butler, Gerald wrote: Now, we have people talking about doing LSL/LSR on Strings. That is 100% inconsistent with that definition of a String. Not at all, and keep in mind that I didn't propose this out of the blue. bands, bxors and bors are existing string ops and have been for a long time. I was just proposing rounding out the bit operator set. Go check out the ops/bit.ops in CVS. It's even well documented. I don't think Dan was being at all contradictory or inconsistent in his string postings, given that those ops were already there. I may have problems with the extent to which Parrot embraces abstraction, but inconsistency is not one of them. -- Aaron Sherman [EMAIL PROTECTED] Senior Systems Engineer and Toolsmith It's the sound of a satellite saying, 'get me down!' -Shriekback
Bit ops on strings
bit.ops defines some ops on strings, and not others. I was wondering if anyone thinks the following would be useful (I'm offering to write them, as it won't be much work): lsls(inout STR, in INT) lsrs(inout STR, in INT) and, of course, their appropriate permutations. For those who haven't looked at bit.ops, lsl and lsr are logical shift left and logical shift right. Doing this operation on strings (as bands, bors and bxors do) would allow the full range of bit-manipulation to be done quickly on strings-as-bitfields (though, of course, it's already possible even without these operations). I don't see shls and shrs being useful (or terribly meaningful), but correct me if I'm wrong there. Of course, there's the small matter that shifting left might grow your string, but this should not be a major concern for Parrot. I don't think shifting right should shrink the string. -- Aaron Sherman [EMAIL PROTECTED] Senior Systems Engineer and Toolsmith It's the sound of a satellite saying, 'get me down!' -Shriekback
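The proposed lsls/lsrs semantics — shift a whole string as one big bit vector, growing on a left shift that carries out, never shrinking on a right shift — can be sketched in Python by round-tripping through an integer (illustrative only; this is not the proposed Parrot implementation, and the function names are made up):

```python
def lsl_bytes(s: bytes, n: int) -> bytes:
    """Logical shift left of a byte string treated as one bit vector.
    The string grows when bits would otherwise be shifted off the top."""
    val = int.from_bytes(s, "big") << n
    nbytes = max(len(s), (val.bit_length() + 7) // 8)
    return val.to_bytes(nbytes, "big")

def lsr_bytes(s: bytes, n: int) -> bytes:
    """Logical shift right; the string keeps its original length."""
    val = int.from_bytes(s, "big") >> n
    return val.to_bytes(len(s), "big")

print(lsl_bytes(b"\x80", 1))      # carry out grows the string: b'\x01\x00'
print(lsr_bytes(b"\x01\x00", 8))  # no shrink: b'\x00\x01'
```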
Re: MMD performance (was: keyed vtables and mmd)
On Thu, 2004-04-29 at 03:33, Leopold Toetsch wrote: As Dan already said there is no performance hit (at least if the MMD tables don't blow the caches). Good stuff! One thing leaps to mind when you mention the cache though... keep in mind that blowing L2 cache (which we might be in no danger of doing at all, but I'm just bringing it up) might be WORSE than you would think on P4 and beyond because of hyperthreading. -- Aaron Sherman [EMAIL PROTECTED] Senior Systems Engineer and Toolsmith It's the sound of a satellite saying, 'get me down!' -Shriekback
Patch: Do rand() and srand()
A simple implementation of rand() and srand(), which may not be ideal for Perl. Also included is the test file for random ops. If anyone can think of a good way to ALWAYS know that a number we got back was random, throw that into the test ;-)

Perl 5 mandates that it calls srand if you call rand without first calling srand. Since not all Parrot client-languages will want that behavior, it's not in this version of rand, but that leaves Perl having to maintain separate state. In future, it would be nice to add a special rsrand or the like, which checks to see if srand has already been called. For now this should be sufficient for anyone who expects a rand op, and it's not an onerous amount of state for Perl to store. The real concern is that Perl and Foolanguage might both srand(), but that's not something I'm gonna think too hard about just now, and probably is a matter for library maintainers in those languages anyway.

Oh, one more thing: I added op numbers for the sqrt ops since they were causing me to be given some warnings during build. Feel free to ignore them if you don't want sqrt to have op numbers.

-- 
Aaron Sherman [EMAIL PROTECTED]
Senior Systems Engineer and Toolsmith
It's the sound of a satellite saying, 'get me down!' -Shriekback

Index: ops/math.ops
===
RCS file: /cvs/public/parrot/ops/math.ops,v
retrieving revision 1.18
diff -u -r1.18 math.ops
--- ops/math.ops	27 Apr 2004 15:48:20 -	1.18
+++ ops/math.ops	27 Apr 2004 20:10:16 -
@@ -1392,6 +1392,58 @@

 =back

+=item B<rand>(out NUM, in NUM)
+
+=item B<rand>(out NUM)
+
+=item B<rand>(out INT, in INT)
+
+=item B<rand>(out INT)
+
+=item B<srand>(in NUM)
+
+Generate random numbers based on the Random PMC.
+
+=cut
+
+inline op rand(out NUM, in NUM) {
+    FLOATVAL n = $2;
+    PMC * r = pmc_new_noinit(interpreter, enum_class_Random);
+    $1 = VTABLE_get_number(interpreter, r);
+    $1 *= $2;
+    goto NEXT();
+}
+
+inline op rand(out INT, in INT) {
+    INTVAL n = $2;
+    PMC * r = pmc_new_noinit(interpreter, enum_class_Random);
+    FLOATVAL resultnum;
+    resultnum = VTABLE_get_number(interpreter, r);
+    $1 = (INTVAL)(resultnum * (FLOATVAL)n);
+    goto NEXT();
+}
+
+inline op rand(out NUM) {
+    PMC * r = pmc_new_noinit(interpreter, enum_class_Random);
+    $1 = VTABLE_get_number(interpreter, r);
+    goto NEXT();
+}
+
+inline op rand(out INT) {
+    PMC *r = pmc_new_noinit(interpreter, enum_class_Random);
+    $1 = VTABLE_get_integer(interpreter, r);
+    goto NEXT();
+}
+
+inline op srand(in INT) {
+    INTVAL i = $1;
+    PMC * r = pmc_new_noinit(interpreter, enum_class_Random);
+    VTABLE_set_integer_native(interpreter, r, i);
+    goto NEXT();
+}
+
+=back
+
 =cut

###

Index: ops/ops.num
===
RCS file: /cvs/public/parrot/ops/ops.num,v
retrieving revision 1.36
diff -u -r1.36 ops.num
--- ops/ops.num	22 Apr 2004 09:17:38 -	1.36
+++ ops/ops.num	27 Apr 2004 20:10:16 -
@@ -1451,3 +1451,15 @@
 fetchmethod_p_p_s      1424
 fetchmethod_p_p_sc     1425
 setref_p_p             1426
+sqrt_n_i               1427
+sqrt_n_ic              1428
+sqrt_n_n               1429
+sqrt_n_nc              1430
+rand_n_n               1431
+rand_n_nc              1432
+rand_i_i               1433
+rand_i_ic              1434
+rand_n                 1435
+rand_i                 1436
+srand_i                1437
+srand_ic               1438

#! perl -w
# Copyright: 2001-2003 The Perl Foundation.  All Rights Reserved.
# $Id$

=head1 NAME

t/op/random.t - Random numbers

=head1 SYNOPSIS

	% perl t/op/random.t

=head1 DESCRIPTION

Tests random number generation

=cut

use Parrot::Test tests => 5;
use Test::More;
use Parrot::Config;
use Config;

output_is(<<'CODE', <<OUT, "generate random int");
	rand I0
	print "Called random just fine\n"
	end
CODE
Called random just fine
OUT

output_is(<<'CODE', <<OUT, "generate random 10>int>=0");
	rand I0, 10
	ge I0, 10, BROKE
	lt I0, 0, BROKE
	print "Called random just fine\n"
	exit 0
BROKE:
	print "Failure: random number "
	print I0
	print " is not in range 0..9\n"
	end
CODE
Called random just fine
OUT

output_is(<<'CODE', <<OUT, "generate random num");
	rand N0
	print "Called random just fine\n"
	end
CODE
Called random just fine
OUT

output_is(<<'CODE', <<OUT, "generate random 10>num>=0");
	rand N0, 10.0
	ge N0, 10.0, BROKE
	lt N0, 0, BROKE
	print "Called random just fine\n"
	exit 0
BROKE:
	print "Failure: random number "
	print N0
	print " is not in range 0.0..10.0\n"
	end
CODE
Called random just fine
OUT

output_is(<<'CODE', <<OUT, "Seed RNG");
	srand 1
	print "Seeded the rng just fine\n"
	end
CODE
Seeded the rng just fine
OUT

1; # HONK
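The integer flavor of the op scales a [0,1) float into [0,n), the usual way to get a bounded random integer from a uniform float source. The same idiom in Python (illustrative only; not the Parrot source, and the function name is invented):

```python
import random

def rand_int(n: int) -> int:
    # Mirrors the op's logic: take a float in [0.0, 1.0) from the RNG
    # and scale it, so the result always lands in 0 .. n-1.
    return int(random.random() * n)

samples = [rand_int(10) for _ in range(1000)]
assert all(0 <= x < 10 for x in samples)
```

This is also why the test file can only check the range, not the value: as the patch mail jokes, there's no good way to ALWAYS know that a number you got back was random.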
Re: Patch: Do rand() and srand()
On Wed, 2004-04-28 at 10:01, Jens Rieks wrote:

That's the reason why we have a Random PMC (classes/random.pmc). I'm still not sure if we need a rand/srand OP for random numbers. As you already mentioned, srand uses a global state, and I believe that it will cause trouble sooner or later.

If you check out the patch, you will notice that the Random PMC (enum_class_Random) is the underlying implementation. This is just a functional interface. There is no state actually maintained in these functions.

The reason for the ops is to avoid having 99 of 100 mathish functions in your native language defined as:

	foo: emit_parrot(foo, return, args)

but rand defined as:

	rand: emit_parrot(set, return, Random)

It's just a potential point of confusion.

-- 
Aaron Sherman [EMAIL PROTECTED]
Senior Systems Engineer and Toolsmith
It's the sound of a satellite saying, 'get me down!' -Shriekback
Re: keyed vtables and mmd
On Wed, 2004-04-28 at 11:33, Dan Sugalski wrote: We toss the keyed variants for everything but get and set. And... we move *all* the operator functions out of the vtable and into the MMD system. [...] Comments? Only one question. What's the performance hit likely to be and is there any way around that performance hit for code that doesn't want to take it? -- Aaron Sherman [EMAIL PROTECTED] Senior Systems Engineer and Toolsmith It's the sound of a satellite saying, 'get me down!' -Shriekback
Re: File stat info
On Wed, 2004-04-28 at 11:56, Dan Sugalski wrote:

	stat [PINS]x, Sy, Iz
	stat Px, Sy

[...] The returned PMC in the two-arg case could be a hash/array pmc and allow string-keyed access to elements. If we do that, then the names correspond to the constant names that follow.

	NAME       Filename, no extension or path
	EXTENSION  File extension

This represents a world-view that is not universal. Rather than making Parrot into a lens through which system features need to be de-coded, why not provide a set of modular native-friendly tools with which to perform such operations? After all, in UNIX-land you can't know what the extension is (just look at the filenames auto.home, .bash_logout and foo.tar.gz).

If you have a POSIX view by default, but provide a set of opcodes that specialize in Win32, Darwin, VMS, PalmOS, then you can avoid these points of confusion. Heck, you might even provide this abstraction as yet another layer, if it's really helpful. But most languages/system libraries that don't come out of Microsoft expect a POSIX view of the world, so that's probably a reasonable default.

-- 
Aaron Sherman [EMAIL PROTECTED]
Senior Systems Engineer and Toolsmith
It's the sound of a satellite saying, 'get me down!' -Shriekback
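The ambiguity Aaron cites is visible with any splitext-style helper: on UNIX an "extension" is a naming convention, not a file attribute, so mechanical splitting gives answers nobody intends (illustrative Python):

```python
import os.path

# "Extension" is a convention, not a property of the file.
for name in ("auto.home", ".bash_logout", "foo.tar.gz"):
    print(name, os.path.splitext(name))
# auto.home    -> ('auto', '.home')     probably not an extension at all
# .bash_logout -> ('.bash_logout', '')  a dotfile, no extension
# foo.tar.gz   -> ('foo.tar', '.gz')    the "real" extension is .tar.gz
```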
Re: keyed vtables and mmd
On Wed, 2004-04-28 at 12:33, Dan Sugalski wrote:

At 12:21 PM -0400 4/28/04, Aaron Sherman wrote:

Since we're specifically talking about Perl here (and probably not Perl 5, since its overloading model is baroque and probably has to be managed by the compiler, not Parrot)

Actually, Perl 5's overloading gets handled this way too. Overloaded operations *can't* be handled by the compiler in dynamic languages, and none of them do so.

Hmmm, I thought we were on the same page here, but I'll back up and define terms if needed. When I talk about a runtime construct being handled by the compiler vs handled by Parrot, I mean that the compiler will have to generate code that knows how to deal with the construct, rather than relying on Parrot's native constructs. That might be (as is the case with Perl 5 right now) that the construct is built into a runtime library, or it might be that the compiler generates special code inline. You seem to be replying to a point I would not make, e.g., that the compiler would have to somehow determine at compile-time what would happen. Clearly that's impossible.

, I was under the impression that for types that are non-objecty,

Types that are non-PMC won't check. PMC types will.

Ok, so in Boston you suggested that every variable declared by a high level language would have to be a PMC, and that INT registers, for example, were only for the compilers and Parrot libraries to use... would that not be the case for a Java int or a Perl 6 int, and/or has it changed since then? I'm not arguing anything here, just trying to wrap my head around the scope of this change's impact.

-- 
Aaron Sherman [EMAIL PROTECTED]
Senior Systems Engineer and Toolsmith
It's the sound of a satellite saying, 'get me down!' -Shriekback
Re: File stat info
On Wed, 2004-04-28 at 12:26, Dan Sugalski wrote: NAME Filename, no extension or path EXTENSION File extension This represents a world-view that is not universal. Rather than making Parrot into a lens through which system features need to be de-coded, why not provide a set of modular native-friendly tools with which to perform such operations? Because you end up with 78 kinds of portability hell if you don't, as everyone rolls their own way to handle this. Oh, don't get me wrong! I'm not saying an abstraction isn't all keen and such, I'm just wondering why we're abstracting farther out than POSIX when the right way, as you point out has never been a matter of consensus, and many client languages will be presenting POSIX semantics through their standard libraries anyway, which they will have to massage your representation back into. I'm OK with adding a TYPE to the stat array as well, though more for an it's a file/socket/device/directory type thing, rather than an it's an application/x-pdf file! thing. Well, since no OS I know of except for MacOS/Darwin has a reliable way to determine the ACTUAL type of a file, that's wise. ## ALTERNATE RESPONSE You didn't go far enough. Leave stat alone, back up 12 paces and write a vfs layer for Parrot that comes in at a level of abstraction WAY above the core POSIX/Win32/etc ops and provides a generic way to access URIs, mailboxes, files, shared memory regions, etc, etc. Why abstract within the arbitrary constraints of a POSIX-type stat model? Why assume that something has a name rather than a locator? Why not provide an abstract concept of type that encompasses all of MIME? Why not have permissions/ACL/security be a totally separate object which can understand SSL/TLS authentication models, pam, etc.? 
The obvious response is that you want to ship Parrot before the Y3k bug becomes a problem ;-) I understand that, and perhaps that's a reason to speculate about such a beast but implement it after 1.0; that doesn't invalidate the point, though.

-- 
Aaron Sherman [EMAIL PROTECTED]
Senior Systems Engineer and Toolsmith
It's the sound of a satellite saying, 'get me down!' -Shriekback
Re: File stat info
On Wed, 2004-04-28 at 13:40, Dan Sugalski wrote: ALTERNATE RESPONSE This is where you go mad, right? :) Usually ;-) Why abstract within the arbitrary constraints of a POSIX-type stat model? I wasn't, actually. There's a good sprinkling of VMSisms in that list, and I'm all for adding more stuff if need be. (I forgot to note the various flavors of symlink, as well as the link count in cases where it can be determined, as well as user and group of the file itself) Yeah, noticed the VMSism (ACLs, version (mentioned later), a separate change dir bit), and being an old VMS hacker I approved in spirit, if not in action. VMS was nice for when it was used. It's too bad it's being maintained as a legacy now, and not the OS it could have been. If you scrap the places that you've factored out things that will have to be un-factored in the common case (filenames were the biggie), it's fine... just don't expect people to do anything with it except extract the POSIX semantics... after all, it took 15 years to get to the point that POSIX could unify file semantics as much as it did Keeping a niche open for ACLs is probably smart, esp. in the Windows world. -- Aaron Sherman [EMAIL PROTECTED] Senior Systems Engineer and Toolsmith It's the sound of a satellite saying, 'get me down!' -Shriekback
Re: keyed vtables and mmd
Ok, nuff said. I think there are slightly too many definitions that we're not agreeing on (though, I suspect if we ironed those out, we'd be in violent agreement). As for INT/PMC thing I'm pretty sure all of my concerns come down to: compilers can really screw each other over, but then we knew that, and there will have to be conventions to prevent it. -- Aaron Sherman [EMAIL PROTECTED] Senior Systems Engineer and Toolsmith It's the sound of a satellite saying, 'get me down!' -Shriekback
Re: File stat info
On Wed, 2004-04-28 at 14:51, Dan Sugalski wrote:

At 8:08 PM +0200 4/28/04, Jerome Quelin wrote:

Dan Sugalski wrote: [...] CTIME Creation time

Will unixen use this for change time? (also spelled ctime too)

Nope, for that they use mtime. We can expand the names to skip the confusion.

*scratch head*... I'll dig out a man-page 'cause I don't want to sign on to the POSIX site just this sec:

	time_t  st_atime;  /* time of last access */
	time_t  st_mtime;  /* time of last modification */
	time_t  st_ctime;  /* time of last change */

There's no creation time listed, and mtime and ctime are most certainly not the same thing. Now, if you want to add a creation time, that's fine, but I recommend against calling it ctime, as that's a sort of well-defined word in these parts.

Should be OWNER_CD? Should be SYSTEM_CD? Should be OTHER_CD?

Yep. Cut'n'paste error. :(

I didn't even see that. Being dyslexic, my eye skips over that kind of error very easily. I just see it as my own mistake ;-)

-- 
Aaron Sherman [EMAIL PROTECTED]
Senior Systems Engineer and Toolsmith
It's the sound of a satellite saying, 'get me down!' -Shriekback
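The mtime/ctime split Aaron is pointing at is observable directly: a metadata-only change like chmod advances st_ctime (inode change time) but leaves st_mtime (content modification time) alone. A small check (illustrative Python; this is the POSIX behavior, and timestamp granularity varies by filesystem):

```python
import os
import tempfile
import time

fd, path = tempfile.mkstemp()
os.write(fd, b"hello")        # content write sets mtime (and ctime)
os.close(fd)
before = os.stat(path)

time.sleep(1.1)               # outlast coarse timestamp granularity
os.chmod(path, 0o600)         # metadata-only change: ctime moves, mtime doesn't
after = os.stat(path)

print("mtime unchanged:", after.st_mtime == before.st_mtime)
print("ctime advanced: ", after.st_ctime > before.st_ctime)
os.unlink(path)
```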
Re: File stat info
On Wed, 2004-04-28 at 15:42, Dan Sugalski wrote: At 10:32 PM +0300 4/28/04, Jarkko Hietaniemi wrote: I think you'll find ACL use is increasing, not decreasing. They've been tacked on to most recent filesystems, and they're coming into But AFAIK, Windows is the only place where the use of ACLs is encouraged in the native API. Everywhere else I look, they seem to be an add-on that you can use if you want to tie yourself to a particular set of extensions. This is why, for example, AIX has had ACLs forever, but I can't name one product that uses them (other than backup and restore software ;-) This is true. But good luck in trying to map between the ACL schema of different systems :-( Yech, good point. I'm not even sure you can do any sort of sane abstraction there. Sure you can. It's just at a much higher level of abstraction than stat. You could very easily say this is a file permission object and ask it can I do X to this file? or can user do X to this file where user might be a process or uid_t or whatever. That's perfectly reasonable as a core system abstraction layer, I was just waving the keep the native access too flag, since I've seen too many systems abstract away the native system to the point that no reasonable integration can occur between the language and its surroundings (e.g. Java). -- Aaron Sherman [EMAIL PROTECTED] Senior Systems Engineer and Toolsmith It's the sound of a satellite saying, 'get me down!' -Shriekback
Patch: don't build docs for .*.ops
See attached patch, which prevents the docs/Makefile from including invalid targets that just happen to be editor temp files (emacs temp files have a # character, which really boggles make).

-- 
Aaron Sherman [EMAIL PROTECTED]
Senior Systems Engineer and Toolsmith
It's the sound of a satellite saying, 'get me down!' -Shriekback

Index: config/gen/makefiles.pl
===
RCS file: /cvs/public/parrot/config/gen/makefiles.pl,v
retrieving revision 1.30
diff -u -r1.30 makefiles.pl
--- config/gen/makefiles.pl	19 Apr 2004 11:31:44 -	1.30
+++ config/gen/makefiles.pl	27 Apr 2004 19:52:18 -
@@ -84,7 +84,7 @@

     # set up docs/Makefile, partly based on the .ops in the root dir
     opendir OPS, "ops" or die "opendir ops: $!";
-    my @ops = sort grep { /\.ops$/ } readdir OPS;
+    my @ops = sort grep { !/^\./ && /\.ops$/ } readdir OPS;
     closedir OPS;

     my $pod = join " ", map { my $t = $_; $t =~ s/\.ops$/.pod/; "ops/$t" } @ops;
Re: hyper op - proof of concept
On Fri, 2004-04-23 at 15:34, Dan Sugalski wrote:

At 3:25 PM -0400 4/23/04, Aaron Sherman wrote:

That I did not know about, but noticed Dan pointing it out too. I'm still learning a lot here,

It might be best, for everyone's peace of mind, blood pressure, and general edification, to take a(nother) run through the documentation. The stuff in docs/pdds isn't too out of date (mostly) and all the opcodes have POD, so you can do something like:

Yeah, I've been plowing through it a piece at a time. I'm currently still mowing down the DOD docs, which (given that I've been in application space for the last 8 years, and the world of GC has changed radically in that time) are a hard read. There are 14,304 lines of POD in the docs subdir and its immediate subdirs. That's a fair amount of reading, especially for something as dense as technical documentation.

While diving in feet-first does get you going, looking for the rocks and deep water first is never ill-advised... :)

Is that really what I'm doing? It's also the case that there's a HUGE amount of documentation and source code, and I doubt that ANYONE coming to this list and asking questions will understand all of it. I would be so egotistical as to even suggest that I've read more of the source and docs than most who will be asking questions in the next few years. Given that, getting the stupid stuff out of the way now, and putting it in a highly indexed form (e.g. a mailing list FAQ) that people on the list can be pointed at, might save EVEN MORE blood pressure.

-- 
Aaron Sherman [EMAIL PROTECTED]
Senior Systems Engineer and Toolsmith
It's the sound of a satellite saying, 'get me down!' -Shriekback
Re: A12: The dynamic nature of a class
On Fri, 2004-04-23 at 08:42, Dan Sugalski wrote: Since any type potentially has assignment behaviour, it has to be a constructor. For example, if you've got the Joe class set such that assigning to it prints the contents to stderr, this: my Joe $foo; $foo = 12; should print 12 to stderr. Can't do that if you've not put at least a minimally constructed thing in the slot. Yes, and to make that statement a bit more generic, I would suggest that: my X $y; is, as far as I can tell: my X $y = undef; except that the explicit assignment might have a different signature (being as you are providing a parameter to the constructor, even if it's undef). There was a long thread a LONG time ago on what passing undef meant for signature matching and how that interacted with default arguments. I'm sorry, but I'm not able to recall the result at this point. If I have time, I'll chase it down. -- Aaron Sherman [EMAIL PROTECTED] Senior Systems Engineer and Toolsmith It's the sound of a satellite saying, 'get me down!' -Shriekback
Re: hyper op - proof of concept
Note: We've moved past hyper-ops (I hope!), but there are still some details in this post that deserve a response on tangential topics.

On Wed, 2004-04-21 at 11:52, Leopold Toetsch wrote:

Aaron Sherman [EMAIL PROTECTED] wrote: That's unrealistic.

No. A real test.

Sorry, I was not clear enough. Yes, of course, non-Parrot Perl 5 is going to be slow at this, but we expect that, and your results showed nothing surprising. What might be interesting is to compare Parrot to Parrot doing this with and without a hyper-operator. That's all I was trying to say.

As for the DOD: you have an excellent point, but it extends far beyond the hyper-operators. I'm starting to think that front-ends like the Python compiler or the Perl 6 compiler are going to need controls over the DOD for just the reasons you cite. After all, they know when they are about to start doing some large looping operation that's all highly constrained with respect to allocation. It would make sense to gather the resources they need, lock down DOD, do what they need to do, and then unlock the DOD...

-- 
Aaron Sherman [EMAIL PROTECTED]
Senior Systems Engineer and Toolsmith
It's the sound of a satellite saying, 'get me down!' -Shriekback
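The "lock down the collector around an allocation-heavy loop" idea is the same pattern CPython exposes today through its gc module (illustrative Python only — Parrot's actual knobs were the sweepoff/sweepon ops mentioned later in the thread; the context-manager name here is invented):

```python
import gc
from contextlib import contextmanager

@contextmanager
def collector_paused():
    """Disable automatic collection around an allocation-heavy region,
    then re-enable it (and collect once) on the way out."""
    was_enabled = gc.isenabled()
    gc.disable()
    try:
        yield
    finally:
        if was_enabled:
            gc.enable()
            gc.collect()

with collector_paused():
    # Churns through many short-lived allocations without collector passes.
    data = [[i] for i in range(100_000)]
```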
Re: hyper op - proof of concept
On Fri, 2004-04-23 at 14:52, Leopold Toetsch wrote: Aaron Sherman [EMAIL PROTECTED] wrote: What might be interesting is to compare Parrot to Parrot doing this with and without a hyper-operator. That's all I was trying to say. I'd posted that as well. Here again with an O3 build of parrot: Oops, missed that. Thanks! I'm shocked by the difference in performance... it makes me wonder how efficient the optimization+JIT is when the two operations are SO different. I must simply not understand what's going on at the lowest level here. More investigation needed on my part, as I'm sure this will be an important point for me to understand in later topics that I'll run into writing Parrot code. As for the DOD: you have an excellent point, but it extends far beyond the hyper-operators. I'm starting to think that front-ends like the Python compiler or the Perl 6 compiler are going to need controls over the DOD for just the reasons you cite. After all, they know when they are about to start doing some large looping operation that's all highly constrained with respect to allocation. It would make sense to gather the resources they need, lock down DOD, do what they need to do and then unlock the DOD... Well, it's unlikely that we can expose all the details the more that such details may change. We could have a generalized version of such an operation though: i_need_now_x_pmcs_and_wont_dispose_any_start 10 # ... deep clone code or loop i_need_now_x_pmcs_and_wont_dispose_any_end er EOPCODETOOLONG :) Heh, yeah I getcha. It would be interesting, but as you point out it's ugly and specialized. sweep 1 sweepoff ... deep clone or some such sweepon That I did not know about, but noticed Dan pointing it out too. I'm still learning a lot here, and while I know it's frustrating, I hope to condense what I learn into some usable forms (perhaps adding to the FAQ as I suggested to Dan). I don't always agree with the two of you, but that's not required. 
I just need to understand enough that I can get the work done that I want to do, and make it efficient enough that people actually USE it ;-) -- Aaron Sherman [EMAIL PROTECTED] Senior Systems Engineer and Toolsmith It's the sound of a satellite saying, 'get me down!' -Shriekback
Missing math ops?
I'm trying to write some code, and I'm not finding certain ops. Now, perhaps this is just that I don't know how to look for them, or perhaps they have yet to be written, so please pardon my ignorance. These are things that seem fairly atomic, and which exist in the C library. If they truly don't exist, perhaps this is a good place for me to jump in and get to know the code rather that just talking :) rand/srand sqrt -- Aaron Sherman [EMAIL PROTECTED] Senior Systems Engineer and Toolsmith It's the sound of a satellite saying, 'get me down!' -Shriekback
Re: hyper op - proof of concept
On Wed, 2004-04-21 at 13:51, Larry Wall wrote:

In any event, it is absolutely my intent that the builtin array types of Perl 6 support PDL directly, both in terms of efficiency and flexibility. You ain't seen Apocalypse 9 yet, but that's what it's all about. Straight from my rfc list file:

Ok, the combination of Dan's (perhaps overzealous) emphasis on the dynamic nature of Parrot's client languages and my assumption that we had learned all there was to learn about the storage of aggregates misled me here.

That said, I now see why hyper goes in Parrot... maybe. It depends on how dynamic Perl is about lazy arrays (e.g. my int @foo = 1..Inf) and what happens when I:

	my int @foo = 1..3;
	$foo[0] = URI::AutoFetch.new("http://numberoftheweek.math.gov/");

If that's polymorphic, we're hosed. If it's an auto-conversion, then we're good. I like the polymorphic version for a lot of reasons, but I'll understand if we can't get that.

Thanks all!

-- 
Aaron Sherman [EMAIL PROTECTED]
Senior Systems Engineer and Toolsmith
It's the sound of a satellite saying, 'get me down!' -Shriekback
A12: The dynamic nature of a class
Ok, so I got to thinking about Parrot and compilation last night. Then something occurred to me, and I'm not sure how it works. When Perl sees:

    class Joe { my $.a; method b {...} }
    my Joe $j;

many things happen, and some of them will require knowing what the result of the previous thing is. More to the point, Perl 6's compiler will have to parse class Joe, create a new object of type Class, parse and execute the following block/closure in class MetaClass, assign the result into the new Class object named Joe, and then continue parsing, needing access to the values that were just created in order to further parse the declaration of $j. There are several ways this can be accomplished:

1. Have a feedback loop between Parrot and Perl 6 that allows the compiler to execute a chunk of bytecode, get the result as a PMC, and store it for future use. This will probably be needed regardless of which option is chosen, but may not be ideal.

2. Have a pseudo Perl 6 interpreter in the compiler which can execute a limited subset of Perl 6 that is allowed inside of class and module definitions (Larry implied that they were not limited in this way, but if they were, compilation could be optimized a bit).

3. Attempt to build a one-shot bytecode stream that outputs a bytecode stream representing the program. This would be the fastest in the general case, and would make pre-bytecoded libraries much easier to implement. However, it would also mean that class and module definitions could not affect the grammar of the language, and Larry has said that won't be the case :-(

To me, #2 looks most attractive, but requires some duplication of effort. How easy would it be to interact with Parrot in the way that #1 proposes?

-- Aaron Sherman [EMAIL PROTECTED] Senior Systems Engineer and Toolsmith It's the sound of a satellite saying, 'get me down!' -Shriekback
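The feedback loop of option #1, where the compiler executes a chunk of code it just generated and keeps the resulting value around for later parsing decisions, can be sketched in miniature. This is a toy illustration only, with hypothetical names, not the real Perl 6 or Parrot toolchain:

```python
# Toy sketch of option #1: the "compiler" executes a class body at
# compile time and records the result so later parsing can consult it.
# ToyCompiler and its methods are hypothetical names for illustration.

class ToyCompiler:
    def __init__(self):
        self.classes = {}               # symbol table built during compilation

    def compile_class(self, name, body_source):
        # "Parse and execute the following block": run the body now
        # and capture the attributes/methods it declares.
        namespace = {}
        exec(body_source, {}, namespace)     # the feedback loop
        self.classes[name] = namespace       # store the result for future use

    def compile_decl(self, type_name, var_name):
        # Further parsing needs access to what the class body produced.
        if type_name not in self.classes:
            raise SyntaxError(f"unknown type {type_name}")
        return (var_name, sorted(self.classes[type_name]))

compiler = ToyCompiler()
compiler.compile_class("Joe", "a = None\ndef b(self): ...")
print(compiler.compile_decl("Joe", "j"))
```

The point of the sketch is only the control flow: compilation pauses, runs code, and resumes with the result in hand, which is exactly what makes option #3's ahead-of-time stream hard.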
Re: A12: The dynamic nature of a class
On Thu, 2004-04-22 at 11:22, Dan Sugalski wrote: At 10:48 AM -0400 4/22/04, Aaron Sherman wrote: More to the point, Perl 6's compiler will have to parse class Joe, create a new object of type Class, parse and execute the following block/closure in class MetaClass, assign the result into the new Class object named Joe and then continue parsing, needing access to the values that were just created in order to further parse the declaration of $j. Erm... no. Not even close, really. There's really nothing at all special about this--it's a very standard user-defined type issue, dead-common compiler stuff. You could, if you wanted, really complicate it, but there's no reason to and unless someone really messes up we're not going to. Just no need.

That's not at all what A12 said. And, I quote:

One of the big advances in Perl 5 was that a program could be in charge of its own compilation via use statements and BEGIN blocks. A Perl program isn't a passive thing that a compiler has its way with, willy-nilly. It's an active thing that negotiates with the compiler for a set of semantics. In Perl 6 we're not shying away from that, but taking it further, and at the same time hiding it in a more declarative style. So you need to be aware that, although many of the things we'll be talking about here look like declarations, they trigger Perl code that runs during compilation.

This is in direct contradiction to what I'm hearing from you, Dan. What's the scoop?

-- Aaron Sherman [EMAIL PROTECTED] Senior Systems Engineer and Toolsmith It's the sound of a satellite saying, 'get me down!' -Shriekback
Re: A12: The dynamic nature of a class
On Thu, 2004-04-22 at 14:44, Dan Sugalski wrote: At 1:05 PM -0400 4/22/04, Aaron Sherman wrote: This is in direct contradiction to what I'm hearing from you, Dan. What's the scoop? The scoop is that my Joe $foo; emits the code that, at runtime, finds the class ID of whatever Joe's in scope, instantiates a new object of that class, and sticks it into the $foo lexical slot that's in scope at runtime.

Right, ok, good. I gotcha. But according to A12 as I understand it, the part BEFORE that, which looked innocently like a definition:

    class Joe { my $.a; method b {...} }

would actually get turned into a BEGIN block that executes the body of the class definition as a closure in class MetaClass and stores the result into a new object (named Joe) of class Class. Perl 6's compiler does not (by default, at least) know how to run code. It just knows how to translate that text into bytecode (or IMCC or something). So it will need SOMETHING to execute, possibly multiple times, with parsing going on before, after, or during some of those executions. During is the hard one. That means you have to actually call back from Parrot into the Perl 6 compiler. But even the simple:

    eval eval 'eval 1'

causes that problem. How does Ponie deal with that? Does it simply act as an interpreter for the first pass and then do code-gen a la -MO? If so, that's a nice dodge, but putting a full Perl 6 interpreter into Perl 6's compiler seems to me to be a tad heavy-weight. Thoughts? Am I missing a simple way to get around this?

-- Aaron Sherman [EMAIL PROTECTED] Senior Systems Engineer and Toolsmith It's the sound of a satellite saying, 'get me down!' -Shriekback
Re: hyper op - proof of concept
On Wed, 2004-04-21 at 15:46, Larry Wall wrote: On Wed, Apr 21, 2004 at 03:15:37PM -0400, Dan Sugalski wrote: : The math folks tell me it makes sense. I can come up with a : half-dozen non-contrived examples, and will if I have to. :-P I've said this before, and I'll keep repeating it till it sinks in. The math folks are completely, totally, blazingly untrustworthy on this subject. [...] They can't have my »« without a fight.

Ah... now you see the true face of the age-old Linguistics-Mathematics wars! ;-) But seriously, to summarize what I've learned from this thread:

* my int @foo will compile down into an efficient representation

* PDL (and its like) will be able to use this to efficiently perform high-level operations on arrays, but only built-in operations

* If someone (e.g. PDL) wants to implement other operations and their hyper-equivalent, they can do it in a high-level language like P6 or as run-time loadable Parrot opcodes (which PDL will certainly have to do, since most of their ops are in an ancient and gigantic Fortran lib)

Sounding like "problem solved" to me! Thanks, Larry.

-- Aaron Sherman [EMAIL PROTECTED] Senior Systems Engineer and Toolsmith It's the sound of a satellite saying, 'get me down!' -Shriekback
Re: A12: The dynamic nature of a class
On Thu, 2004-04-22 at 15:37, Luke Palmer wrote: But Perl 6 is tightly coupled with Parrot. Perl 6 will be a Parrot program (even if it calls out to C a lot), and can therefore use the compreg opcodes. That means that any code executing in Parrot can call back out to the Perl 6 compiler, and obviously the Perl 6 compiler can call out to parrot. Clearly my question was garbled the first time, as this answer is exactly what I was looking for. Thanks! -- Aaron Sherman [EMAIL PROTECTED] Senior Systems Engineer and Toolsmith It's the sound of a satellite saying, 'get me down!' -Shriekback
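The compreg mechanism Luke describes, a named registry of compilers that any running code can look up and call back into, is easy to state in the abstract. A minimal sketch of the pattern (hypothetical names, not Parrot's actual compreg API):

```python
# Sketch of a compreg-style compiler registry: running code can look up
# a compiler by language name and call back out to it, and the compiler
# can run the code it produced. All names here are illustrative.

compilers = {}

def compreg(name, compiler):
    """Register a compiler under a language name."""
    compilers[name] = compiler

def compile_and_run(lang, source):
    """Any running code can call back out to a registered compiler."""
    code = compilers[lang](source)   # compiler: source -> callable
    return code()

# Register a trivial "compiler" that turns source into a thunk.
compreg("Toy", lambda src: (lambda: eval(src)))

print(compile_and_run("Toy", "2 + 3"))
```

The key property, matching the thread, is the bidirectionality: the registry lets compiled code reach the compiler just as easily as the compiler reaches the code.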
Re: hyper op - proof of concept
On Tue, 2004-04-20 at 18:06, Leopold Toetsch wrote: Aaron Sherman [EMAIL PROTECTED] wrote: This horse is getting a bit ripe, so I'm going to skip most of the detail. I think we all agree on most of the basics, we just disagree on what to do with them. That's cool. I do want to pick a couple of small nits though:

Well, yes. Except for the special case, which is nice though:

    $ time parrot ih.imc  #[1]
    real 0m0.370s
    $ time perl i.pl      #[2]
    real 0m5.656s

That's unrealistic. In P6, you should be able to take:

    @a »+« @b

and turn it into:

    # Trivial example of hyper-operation, untested pseudo-IMCC
    # Just take __Perl_Ary_a and add it to __Perl_Ary_b and put
    # the result in tmp5
    .local int tmp1
    tmp1 = 0
    .local int tmp2
    tmp2 = __Perl_Ary_a        # int = pmc yields the element count
    .local int tmp3
    tmp3 = __Perl_Ary_b
    .local int tmp4
    .local pmc tmp5
    tmp5 = new .PerlArray
    # We auto-extend here... that may not be P6's eventual MO
    # but it's enough to get the point across
    if tmp2 >= tmp3 goto AutoExtend_HYPER_1
    __Perl_Ary_a = tmp3
    tmp4 = tmp3
    goto PRE_HYPER_1
  AutoExtend_HYPER_1:
    __Perl_Ary_b = tmp2
    tmp4 = tmp2
  PRE_HYPER_1:
    tmp5 = tmp4
  BEGIN_HYPER_1:
    if tmp1 >= tmp4 goto END_HYPER_1
    tmp5[tmp1] = __Perl_Ary_a[tmp1] + __Perl_Ary_b[tmp1]
  CONT_HYPER_1:
    # I forget if there's an inc op
    tmp1 = tmp1 + 1
    goto BEGIN_HYPER_1
  END_HYPER_1:

Are we seriously suggesting that after JIT, that's going to be as slow as raw Perl, or even any slower than:

    .local pmc tmp1
    hyper tmp1 = __Perl_Ary_a + __Perl_Ary_b

?! If so, I'm curious to know why. It seems to me that you're just moving the work from the Perl 6 compiler all the way down to the JIT, but the resulting code is the same, no? I would agree that a bulk array copy and iterators should go in Parrot. That much would speed up many things (especially the above code).
Putting Perl 6 features into Parrot without factoring out their modular essence would seem to me to result in a great deal of duplication, but now I'm starting to get close to that horse again. -- Aaron Sherman [EMAIL PROTECTED] Senior Systems Engineer and Toolsmith It's the sound of a satellite saying, 'get me down!' -Shriekback
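Independent of where the loop lives (compiler-emitted code versus a hyper opcode), the semantics of the pseudo-IMCC sketch above, extend the shorter array and then add elementwise, can be stated compactly. A Python sketch, assuming zero-fill as the extension rule (the thread leaves P6's actual rule open):

```python
def hyper_add(a, b, fill=0):
    """Elementwise add; the shorter operand is padded (zero-fill assumed)."""
    n = max(len(a), len(b))
    a = a + [fill] * (n - len(a))   # the AutoExtend step from the sketch
    b = b + [fill] * (n - len(b))
    return [x + y for x, y in zip(a, b)]   # the BEGIN/END loop

print(hyper_add([1, 2, 3], [10, 20]))
```

Either implementation strategy, open-coded loop or single opcode, must produce exactly this result; the debate above is only about where that loop is generated.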
Re: hyper op - proof of concept
On Wed, 2004-04-21 at 10:13, Simon Glover wrote: Absolutely -- I really, _really_ want to be able to use hyper ops with fixed size, floating point arrays, and to have that be as fast as possible, as that should make it possible to implement something like PDL in the core.

Mistake. You don't want to have to convert to-and-from arrays of PMCs in order to do those ops, and regardless of what kind of hyper-nifty-mumbo-jumbo you put into Parrot, that's exactly what you're going to have to do. In fact, Parrot Data Language (if there were such a thing) would likely introduce its own runtime-loadable opcode set to operate on a new PMC type called a piddle. Then, each client language could define (in a module/library) its own means of interacting with a piddle. For example, in Perl you might:

    multi method new(Class $class, int @ary) {...}
    multi method new(Class $class, float @ary) {...}
    multi method new(Class $class, int $value) {...}
    multi method new(Class $class, Octets $value: %*_) {...}

and then you would override BUILD in order to emit your special piddle opcodes. Then, in user-space:

    my PDL::Piddle $foo = [1,2,3,4,5,6];

does what you expect, and $foo + $bar is special.

-- Aaron Sherman [EMAIL PROTECTED] Senior Systems Engineer and Toolsmith It's the sound of a satellite saying, 'get me down!' -Shriekback
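The conversion cost being argued here, moving data between boxed PMC-style arrays and packed native buffers, shows up in any language that has both representations. A sketch using Python's stdlib `array` module as the "packed" side of the analogy:

```python
from array import array

# Boxed, fully general list (PMC-array analogue): each element is an object.
boxed = [1.0, 2.0, 3.0]

# Packed native buffer (piddle-like analogue): one contiguous run of C doubles.
packed = array("d", boxed)      # this conversion copies every element

# Elementwise work on the packed form stays in native doubles...
doubled = array("d", (x * 2 for x in packed))

# ...but moving back to the boxed world is another full copy.
assert doubled.tolist() == [2.0, 4.0, 6.0]
assert doubled.itemsize == 8    # 8-byte C double per element
print("ok")
```

The two copies bracketing the real work are exactly the overhead the message says a piddle type would avoid by keeping the data packed end to end.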