Re: Second try: Builtins
Aaron Sherman wrote:

> On Sat, 2002-09-07 at 14:22, Smylers wrote:
>
> > Should that C<+> be there?  I would expect chomp only to remove a
> > single line-break.
>
> Note that this is in paragraph (e.g. C<$/=''>) mode

Ah, yes.  I quoted the wrong case above.  The final branch deals with the case when C<$/> (or equivalent) is set:

    } else {
        $string =~ s/{[$irs]}+$//;
        return $0;
    }

If C<$irs = "\n"> then I'd only expect a single trailing newline to be removed, but that substitution still looks as though it'll get rid of as many as are there.

> > In a scalar context does C<reverse> still return a string with
> > characters reversed?
>
> Yes, but that would be:
>
>     sub reverse($string) {
>         return join '', reverse([split //, $string]);
>     }

Perl 5's C<reverse> is sensitive to the context in which it is called rather than the number of arguments.  This is an 'element' reversal with only one element:

    $ perl -wle 'print reverse qw<abc>'

This is a 'character' reversal even though several strings have been passed:

    $ perl -wle 'print scalar reverse qw<abc def>'

So a C<reverse> with a single array parameter could be either type.

Smylers
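For reference, the Perl 5 behaviour Smylers describes can be sketched as below. This is only an illustration of Perl 5 semantics, not of how the Perl 6 builtin will be written; reverse_chars() is a hypothetical name introduced here for the split/join formulation.

```perl
use strict;
use warnings;

my @words = qw(abc def);

# List context: the elements are reversed, each string is left intact.
my @elements = reverse @words;

# Scalar context: the arguments are concatenated, then the characters
# of the resulting string are reversed.
my $chars = scalar reverse @words;

# The split/join formulation, written as Perl 5.
# reverse_chars() is a hypothetical name, not a proposed builtin.
sub reverse_chars {
    my ($string) = @_;
    return join '', reverse split //, $string;
}

print "@elements\n";
print "$chars\n";
print reverse_chars('abc'), "\n";
```

The same call, reverse @words, gives either result depending only on the context it is evaluated in, which is why a single array parameter does not settle which reversal is meant.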
Re: Second try: Builtins
On Sat, 2002-09-07 at 14:22, Smylers wrote:

> Aaron Sherman wrote:
>
> > sub chomp($string is rw){
> >     [...]
> >     } elsif $irs.length == 0 {
> >         $string =~ s/ \n+ $ //;
>
> Should that C<+> be there?  I would expect chomp only to remove a
> single line-break.

Note that this is in paragraph (e.g. C<$/=''>) mode

> > sub reverse(@list) {
> >     my @r;
> >     my $last = @list.length - 1;
> >     for(my $i=$last;$i >= 0;$i--) {
> >         @r[$last-$i] = @list[$i];
> >     }
> >     return *@r;
> > }
>
> In a scalar context does C<reverse> still return a string with
> characters reversed?

Yes, but that would be:

    sub reverse($string) {
        return join '', reverse([split //, $string]);
    }

Though this example is too inefficient, it does demonstrate the point.

-- 
Aaron Sherman [EMAIL PROTECTED]
http://www.ajs.com/~ajs
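The difference between the two chomp modes can be checked against Perl 5, which is the behaviour the draft is modelled on. A minimal sketch, assuming Perl 5 semantics only; chomp_with() is a hypothetical helper that localises $/ around a single chomp.

```perl
use strict;
use warnings;

# Perl 5 behaviour only; the Perl 6 chomp in this thread is still a draft.
# chomp_with() is a hypothetical helper, not part of any proposal.
sub chomp_with {
    my ($sep, $string) = @_;
    local $/ = $sep;      # record separator in effect for this call only
    chomp $string;
    return $string;
}

# Normal mode: at most one trailing separator is removed.
my $line_mode = chomp_with("\n", "text\n\n\n");

# Paragraph mode ($/ = ''): every trailing newline is removed.
my $para_mode = chomp_with('', "text\n\n\n");

printf "line mode leaves %d newline(s)\n", ($line_mode =~ tr/\n//);
printf "paragraph mode leaves %d newline(s)\n", ($para_mode =~ tr/\n//);
```

So the C<+> is appropriate for the empty-separator branch, while a branch handling an ordinary separator should strip one occurrence at most.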
Re: Second try: Builtins
On Sat, 2002-09-07 at 10:53, Sean O'Rourke wrote:

> On Sat, 7 Sep 2002, Chuck Kulchar wrote:
>
> > Also, how do these perl6 builtins in perl6 work with the current
> > P6C/Builtins.pm?  (also, why are some that are already defined in
> > pure pasm/part of the parrot core redefined as perl6 code?)
>
> For the moment, they don't.  Eventually, I expect there will be some
> sort of a header file with the builtin declarations (P6C parses and
> interprets function declarations for this very reason), and a .pbc
> file containing their code.  As for why they're written in perl 6
> code, I expect it's easier to define their semantics in Perl than in
> assembly.

Correct as far as it goes.  The more general answer is that one of the goals of this re-write (as I was led to believe) was that the Perl internals would be maintainable.  If we write the well over 150 Perl 5 builtins in Parrot assembly, I think we can kiss that wish goodbye.

Some of this will have to be done in assembly, but hopefully only a very small and modular core (e.g. my proposal earlier on how to handle pack, sprintf and chr).  The rest will be the subject of increasingly powerful optimizations that the compiler will have to perform for user code anyway.

Ultimately I would hope that the only builtins that will be represented 100% in assembly will be those that have a 1-to-1 mapping in the parrot instruction set (e.g. scalar).

BTW: Current status is that I'm preparing to make some changes to the compiler tonight.  After that, I'll be ready to issue a patch against the current tree.  Over the weekend I focused on getting all of the builtins to compile cleanly and I implemented a few other small pieces.  We now have a sprintf that can handle C<'%d'> and C<'%s'> along with some simple modifiers, so C<printf("%02d%% of % 6s\n")> should work.

I'm making heavy use of C<given>, on the assumption that it will make the code easy to optimize.

-- 
Aaron Sherman [EMAIL PROTECTED]
http://www.ajs.com/~ajs
Re: Second try: Builtins
On Mon, Sep 09, 2002 at 05:36:42PM -0400, Aaron Sherman wrote:

> Correct as far as it goes.  The more general answer is that one of
> the goals of this re-write (as I was led to believe) was that the
> Perl internals would be maintainable.  If we write the well over 150
> Perl 5 builtins in Parrot assembly, I think we can kiss that wish
> goodbye.

This may sound a bit arm wavy, but when I'm off messing up the core of perl5, I don't find the perl5 ops are the maintenance problem.  Most of the op functions are quite small (partly due to the use of macros) but they are all nicely self contained.  (And all in 6 (4 before 5.8) source files, out of a total of 36 source files)

The writhing mass of yuck comes from the interaction of the bits in the various utility functions that they call in the other 26 or so files.  Plus the 2 files of the regexp engine, and the 2 files of the parser, which I attempt to avoid lest I go insane.

Hence I don't think that writing the perl builtins in parrot assembly (or at least the majority that really need to go really fast) would be a maintenance nightmare.  Although being able to write them in perl and having an inliner and optimiser that is good enough to produce results better than calling out to general purpose parrot assembler would be nice.

Although my biased opinion is that it's probably best to write the perl builtins as tidy C code rather than parrot assembler.  But I know C better.

Nicholas Clark
-- 
Even better than the real thing: http://nms-cgi.sourceforge.net/
Re: Second try: Builtins
On Mon, 2002-09-09 at 17:52, Nicholas Clark wrote:

> On Mon, Sep 09, 2002 at 05:36:42PM -0400, Aaron Sherman wrote:
>
> > Correct as far as it goes.  The more general answer is that one of
> > the goals of this re-write (as I was led to believe) was that the
> > Perl internals would be maintainable.  If we write the well over
> > 150 Perl 5 builtins in Parrot assembly, I think we can kiss that
> > wish goodbye.
>
> This may sound a bit arm wavy, but when I'm off messing up the core
> of perl5, I don't find the perl5 ops are the maintenance problem.
> Most of the op functions are quite small (partly due to the use of
> macros) but they are all nicely self contained.  (And all in 6 (4
> before 5.8) source files, out of a total of 36 source files)

Keep in mind that the majority of Perl 5 builtins are of the form:

    munge parameters...
    call libc function of same name...
    munge return values

In Perl 6 those will mostly be the same.  Many of them will be moved out to modules (e.g. the filehandle functions) but many others will remain in the core (e.g. chdir, getppid, etc) and simply be wrappers around the C functions.  When the general-purpose interface for C is defined, these functions will be implemented in a fairly short period of time.

Those that are left are internal Perl utilities that I break down into several categories: string, math, list, internal and misc.

Of these, about 30-50% will probably be pure Perl.  Another small percentage will be assembly wrappers that call a one-for-one parrot function (e.g. exit).  The rest will be a complex mix of Perl and assembly (e.g. sprintf, which is mostly Perl but needs assembly for low-level type conversion).

> Although my biased opinion is that it's probably best to write the
> perl builtins as tidy C code rather than parrot assembler.  But I
> know C better.

Yeah, that would be ideal for speed.  I am willing to concede that that's the way we'll have to go for some things, eventually.

However, until we have a pure Perl library (or as much so as we can), I don't think we'll know where we need the speed boost most.  What's more, this will force the compiler to optimize as strongly as possible, which can only benefit users.

-- 
Aaron Sherman [EMAIL PROTECTED]
http://www.ajs.com/~ajs
Re: Second try: Builtins
Aaron Sherman wrote:

> Of these, about 30-50% will probably be pure Perl.  Another small
> percentage will be assembly wrappers that call a one-for-one parrot
> function (e.g. exit).  The rest will be a complex mix of Perl and
> assembly (e.g. sprintf, which is mostly Perl but needs assembly for
> low-level type conversion).

I'm just providing the necessary infrastructure inside imcc.  The format of the current Builtins will probably change slightly.  I need the global label (i.e. entry point) of the function for bsr fixup.

Sean did propose:

    .extern sub_name _label

Plus something like (mine):

    .sub sub_name {
        saveall
        .param PerlUndef Arg1
        ...
    }

for PIR subs.  (Current imcc parses »ret« as end of sub, which we might change)

There is no real need to use PASM at all in the function, but imcc (0.0.9) parses PASM instructions too.

BTW: are there any thoughts about PackFile_FixupTable?

leo
Re: Second try: Builtins
Aaron Sherman wrote:

> sub chomp($string is rw){
>     my $irs = ${/}; # XXX What is $/ now?
>     if defined $irs {
>         if $irs.isa(Object) {
>             return undef;
>         } elsif $irs.length == 0 {
>             $string =~ s/ \n+ $ //;

Should that C<+> be there?  I would expect chomp only to remove a single line-break.

> sub reverse(@list) {
>     my @r;
>     my $last = @list.length - 1;
>     for(my $i=$last;$i >= 0;$i--) {
>         @r[$last-$i] = @list[$i];
>     }
>     return *@r;
> }

In a scalar context does C<reverse> still return a string with characters reversed?

Smylers
Re: Second try: Builtins
> > # INTERNAL q, qq, qw
> > # XXX - how do I do quote-like operators? I know I saw someone say...
> > # Need to do: qr (NEVER(qr)) and qx
>
> presumably the way the perl5 tokeniser does them - by parsing the
> string into a series of concatenated constants and variables, with
> some optionally fed through uc/ucfirst/lc/lcfirst/quotemeta
> (And scalar and list interpolators breaking back out to the real
> parser)

Actually, the way P6C does it is very different from the way perl5 does it; P6C matches straight through rather than by finding the end and then reparsing the middle.  This is because perl6 needs to handle stuff like %var{var}, which wouldn't work at all in perl5.  Check out P6C/Parser.pm and P6C/Tree/String.pm for more details.

Also, how do these perl6 builtins in perl6 work with the current P6C/Builtins.pm?  (also, why are some that are already defined in pure pasm/part of the parrot core redefined as perl6 code?)

Joseph F. Ryan
[EMAIL PROTECTED]
Re: Second try: Builtins
On Sat, 7 Sep 2002, Chuck Kulchar wrote:

> Also, how do these perl6 builtins in perl6 work with the current
> P6C/Builtins.pm?  (also, why are some that are already defined in
> pure pasm/part of the parrot core redefined as perl6 code?)

For the moment, they don't.  Eventually, I expect there will be some sort of a header file with the builtin declarations (P6C parses and interprets function declarations for this very reason), and a .pbc file containing their code.  As for why they're written in perl 6 code, I expect it's easier to define their semantics in Perl than in assembly.

/s
Re: Second try: Builtins
On Fri, 2002-09-06 at 09:29, Nicholas Clark wrote:

> On Fri, Sep 06, 2002 at 01:34:56AM -0400, Aaron Sherman wrote:
>
> > # INTERNAL q, qq, qw
> > # XXX - how do I do quote-like operators? I know I saw someone say...
> > # Need to do: qr (NEVER(qr)) and qx
>
> presumably the way the perl5 tokeniser does them - by parsing the
> string into a series of concatenated constants and variables, with
> some optionally fed through uc/ucfirst/lc/lcfirst/quotemeta
> (And scalar and list interpolators breaking back out to the real
> parser)

Ok, so I guess that all has to go in the parser, not the builtins.  Perl 5 already provides builtin functions that are the back-ends for things like C<>, C<qx>, etc, so those will be in the builtins, and the parser can just call them.  It may already do so; I've not had time to look.

> > sub chomp($string is rw){
> >     my $irs = ${/}; # XXX What is $/ now?
>
> per file handle.  So does that mean each string needs a property to
> hold what the record separator for the file handle it was read from
> at the time of reading?

That's not terribly useful, since filehandles will auto-chomp in Perl 6 anyway.  I propose the alternate:

    sub chomp($string is rw, $sep //= rx/\n/) { ... }
    sub chomp(@strings is rw, $sep //= rx/\n/) { ... }

This will mean that you can:

    chomp(@lines=<>)

but not:

    chomp($line1, $line2, $line3);

Is that going to be a problem?

> (Well, as record separators could be regexps^Wpatterns actually I
> think an offset to the start of the record separator will do)

I forgot they could be patterns.  Need to go fix that!

> > sub index($string, $substr, int $pos //= 0) {
> >     # XXX - slow dumb way... need to break out Knuth
> [...]
> I think that string in string searches are common functionality that
> ought to be implemented in the parrot core.  Rather than every
> language and extension that needs them having to re-implement the
> wheel.

Good point.  I will look at what parrot does now, and consider moving index over to the Internal section.

> Why is socketpair never?  It's a real Unix system call that provides

Because I was tired :)

I'm going to cut that section down to just a list of the functions that need to be moved over to the IO modules as one long comment.  It's just there for a reminder right now.

Thanks for the comments and answers to some of my questions.  The feedback has been quite helpful!
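The "slow dumb way" for index mentioned above can be sketched in Perl 5 as a plain left-to-right scan. This is an illustration under stated assumptions, not the code from CORE.p6m: naive_index() is a hypothetical name, and it only roughly follows the Perl 5 index() contract (offset of the first match at or after $pos, or -1), without reproducing every edge case.

```perl
use strict;
use warnings;

# naive_index() is a hypothetical name used for illustration only.
# It tries every candidate start position in turn, so the worst case
# costs on the order of length($string) * length($substr) comparisons.
sub naive_index {
    my ($string, $substr, $pos) = @_;
    $pos = 0 if !defined $pos || $pos < 0;

    # Last offset at which $substr could still fit inside $string.
    my $last_start = length($string) - length($substr);

    for my $start ($pos .. $last_start) {
        return $start
            if substr($string, $start, length $substr) eq $substr;
    }
    return -1;    # no match at or after $pos
}

print naive_index("hello world", "world"), "\n";
print naive_index("abcabc", "abc", 1), "\n";
```

A scan like this is easy to specify in pure Perl, which is the argument for writing it there first; the cost in the worst case is the argument for moving it into the parrot core later.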
Second try: Builtins
This is still a monolith, but it's getting better.  It's now stored in P6C/Builtins/CORE.p6m in my tree.

More functions are coded, and I now differentiate between the functions that need external support (e.g. POSIX/libc functions) and those that just need to be written (e.g. sort).

I think I've covered all of the comments (other than breaking up the file and making it part of the compilation process, which I'll work on this weekend, and then submit this as a patch to p6i).

If anyone wants to take a crack at answering any of the questions that I've marked with XXX, it would be much appreciated.

I'm out of town for the weekend, but will be back and catching up on mail Sunday night.

    #
    # The core built-ins for Perl 6.
    #
    # Written in 2002 by Aaron Sherman [EMAIL PROTECTED]
    # This file can be distributed/modified under the same terms as Perl itself.

    module CORE;

    # So how are we doing export? I'll look up the Exegeses later
    # export:
    #	acos alarm asin atan2 bless caller chdir chmod chomp chomp
    #	chomp chop chop chop chown chr chroot cos cos crypt dbmclose
    #	dbmopen dump endgrent endhostent endnetent endprotoent
    #	endpwent endservent eval exec exp fork format formline
    #	getgrent getgrgid getgrnam gethostbyaddr gethostbyname
    #	gethostent getlogin getnetbyaddr getnetbyname getnetent
    #	getpgrp getppid getpriority getprotobyname getprotobynumber
    #	getprotoent getpwent getpwnam getpwuid getservbyname
    #	getservbyport getservent glob gmtime grep hex index int join
    #	kill lc lcfirst length link local localtime log log10 lstat
    #	map mkdir msgctl msgget msgrcv msgsnd oct open opendir ord
    #	pack pipe pop pos printf prototype push quotemeta rand read
    #	readlink readpipe ref rename reset reverse reverse rindex
    #	rmdir scalar select select select semctl semget semop setgrent
    #	sethostent setnetent setpgrp setpriority setprotoent setpwent
    #	setservent shift shmctl shmget shmread shmwrite sin sleep sort
    #	sort sort splice split split sprintf sqrt srand stat study
    #	symlink syscall system tan times truncate uc ucfirst umask
    #	umask unlink unpack unshift untie utime utime vec wait waitpid
    #	warn write

    # XXX - This marker is used all over to indicate potential problems and
    #       questions about how Perl 6 works.
    #
    # XXX High-level questions:
    #
    #	When declaring:
    #		sub foo($a, $b) {...}
    #	and
    #		sub foo($a, *@rest) { ... }
    #	What is the correct order, and/or is this even valid? I need to know,
    #	given the way I did sort and reverse in order to handle exploded
    #	argument lists and arrays efficiently.
    #
    #	Generally need to know how the interface to libc will work, so
    #	that all of this junk can be implemented.
    #
    #	Do I need to @array is rw? I would think not
    #
    #	Need to nail down when I should not be using *@array, e.g. return?

    # Some internal only markers for various sorts of unimplemented functionality
    sub UNIMP($func) { die "Unimplemented: $func" }
    sub LIBC($func,*@args) { die "Unimplemented call to external code: $func" }
    sub NEVER($func) { die "Obsolete in Perl 6: $func" }

    # Internal/IMC
    # Functions that are implemented in IMC and/or the parser directly

    # INTERNAL abs
    # INTERNAL defined
    # INTERNAL delete
    # INTERNAL die
    # INTERNAL do
    # INTERNAL each
    # INTERNAL eval(string)
    # INTERNAL exists
    # INTERNAL exit
    # INTERNAL goto
    # INTERNAL keys
    # INTERNAL last
    # INTERNAL lock
    # INTERNAL m
    # INTERNAL my
    # INTERNAL next
    # INTERNAL no
    # INTERNAL our
    # INTERNAL package
    # INTERNAL print
    # INTERNAL q, qq, qw
    # XXX - how do I do quote-like operators? I know I saw someone say...
    # Need to do: qr (NEVER(qr)) and qx
    # INTERNAL redo
    # INTERNAL return
    # INTERNAL s
    # INTERNAL sleep
    # INTERNAL sub
    # INTERNAL substr
    # INTERNAL time
    # INTERNAL undef
    # INTERNAL use
    # INTERNAL values
    # INTERNAL wantarray
    # INTERNAL y

    # Math
    # Mathematical functions and conversions

    sub atan2(real $y, real $x) { return LIBC("atan2",$y,$x) }
    sub cos(real $num //= $_) { return LIBC("cos",$num) }
    sub exp(real $num //= $_) { return LIBC("exp",$num) }
    sub log(real $num //= $_) { return LIBC("log",$num) }
    sub sin(real $num //= $_) { return LIBC("sin",$num) }
    sub sqrt(real $num //= $_) { return LIBC("sqrt",$num) }

    # From perlfunc
    sub acos(real $num //= $_) { atan2( sqrt(1 - ($num * $num)), $num ) }
    sub tan(real $num //= $_) { return sin($num) / cos($num) }
    sub log10(real $num //= $_) { return log($num)/log(10) }
    sub asin(real $num //= $_) { atan2($num, sqrt(1 - $num * $num)) }

    # Conversions
    sub int(int $num //= $_) { $num }
    sub hex($string //= $_) {
        my($tmp) = ($string =~ /^[0x]?([a-fA-F0-9]+)/);
        return 0 unless defined($tmp) && $tmp.length;
        my $bit = 0;
        my $result = 0;
        for(my $i = $tmp.length-1;$i>=0;$i--) {
            my $n = substr($tmp,$i,1);
            given $n {
                when 'a' .. 'f', 'A' .. 'F' {