[pugs] regexp bug?

2005-04-15 Thread BRTHZI Andrs
Hi,
This code:
my $a='A';
$a ~~ s:perl5:g/A/{chr(65535)}/;
say $a.bytes;
Outputs 0. Why?
Bye,
  Andras


Re: [pugs] regexp bug?

2005-04-15 Thread BRTHZI Andrs
Hi,
 This code:

 my $a='A';
 $a ~~ s:perl5:g/A/{chr(65535)}/;
 say $a.bytes;

 Outputs 0. Why?


 \uFFFF is not a legal Unicode codepoint.  chr(65535) should raise an 
exception of some type.  So the above code does seem to show a possible 
bug. But as chr(65535) is an undefined char, who knows what the 
code is actually doing.

In my opinion (which may be wrong), \uFFFF can be stored as a UTF-8 
character; it should be 0xEF~0xBF~0xBF. If I do it outside the regexp (I 
mean say chr(65535).bytes), it works well.
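
The three-byte form is easy to check outside Pugs. A minimal sketch in Python (used here only as a neutral reference; it is not Pugs code and says nothing about what Pugs does internally) that encodes code point 65535 as UTF-8 and counts the bytes:

```python
# Encode the code point 65535 (U+FFFF) as UTF-8 and inspect the result.
# Python is only a reference implementation here, not Pugs.
encoded = chr(65535).encode("utf-8")

print(encoded.hex())   # hex digits of the encoded bytes
print(len(encoded))    # number of bytes in the encoding
```

If the substitution inside the regexp produced the same string, .bytes would be expected to report the same length as the second line printed here.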

Another bug I've found; it's not related to regexps, but it's still a 
Unicode character one:

  say chr(0x10).bytes;
The answer:
  pugs: encodeUTF8: ord returned a value above 0x10
And if I start to increment $b, I will get:
  pugs: Prelude.chr: bad argument
I don't understand it, as I thought that Unicode characters are in the range 
of 0x-0x7FFF. Is Haskell not supporting the whole set?

There is a Unicode version, called UCS-2, that is just between 
0x0000-0xFFFF, but it still does not answer the question.

[...]
Meanwhile, I've found this:
http://std.dkuug.dk/jtc1/sc2/wg2/docs/n2175.htm
It can be the answer to my question.
Bye,
  Andras



Re: [perl #34984] [PATCH] Fix segfault with const

2005-04-15 Thread Leopold Toetsch
Nick Glencross [EMAIL PROTECTED] wrote:

 This patch fixes a problem which can occur in this example:

 .sub test
 .const float a = 12
 print a
 print_newline
 .end

Ah yep.


 +if (t != 'P' && t != val->set)
 +IMCC_fataly(interp, E_TypeError,
 +"const types do not match");

I think we could be a bit more graceful here for an I/N mismatch and, for
the above case, set the constant's val->set to 'N'.

leo


[perl #34994] [TODO] make useful parts of Parrot config available at runtime

2005-04-15 Thread via RT
# New Ticket Created by  Leopold Toetsch 
# Please include the string:  [perl #34994]
# in the subject line of all future correspondence about this issue. 
# URL: https://rt.perl.org/rt3/Ticket/Display.html?id=34994 


A Python example first:

$ python
>>> import sys
>>> print sys.maxint
2147483647
>>> help(sys)
...

I'd like to have a similar[1] capability inside Parrot. The general plan 
was already discussed when "make install" first came up.

1) We have runtime/parrot/include/config.fpmc, which is a frozen 
image of the config hash generated by config_lib.pasm. Creating the 
frozen image already needs parrot (a possibly already existing parrot or 
miniparrot in the long run). But locating this file needs the library or 
include path, which resides in this file. We have a typical chicken-and-egg 
problem.

2) instead of creating e.g. src/revision.c (and possibly other similar 
files), we create a C-readable representation of the frozen config hash 
and re-link with that file: src/parrot_config.c. Now we have parrot$EXE 
with the config inside.

3) make install creates src/parrot_config_install.c and links that 
into parrot_install$EXE, which during installation becomes 
.../bin/parrot$EXE. With this step we get rid of the problem with 
runtime vs build directory library usage.

4) at program start the frozen config string gets thawed and we populate 
appropriate namespaces[2] with hash entries. Language folks and our 
fearless leader may please define the term "appropriate" :)

5) along with bringing the config online, some cleanup and renaming 
wouldn't harm e.g. iv vs opcode_t, intvalsize vs intsize vs 
opcode_t_size ...

6) the config information could be available as attributes of the 
respective namespace:

   ns = getclass ParrotInterpreter
   PINTVAL_size = ns.INTVAL_size   # getattribute shortcut

or with global namespace ops

   PINTVAL_size = find_global ParrotInterpreter, INTVAL_size
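
Steps 4 and 6 can be sketched in a few lines. This is only an illustration in Python, not Parrot code: the JSON string stands in for the frozen config image (the real image is a frozen PMC), and FROZEN_CONFIG, thaw_config and the key values are hypothetical names and numbers chosen for the example.

```python
import json
from types import SimpleNamespace

# Stand-in for the frozen config string that would be linked into the
# executable. JSON is an assumption made for this sketch only.
FROZEN_CONFIG = '{"INTVAL_size": 4, "FLOATVAL_size": 8}'

def thaw_config(frozen):
    """Thaw the embedded string into a namespace, one attribute per key."""
    return SimpleNamespace(**json.loads(frozen))

# Hypothetical namespace object, mirroring "ParrotInterpreter" above.
ParrotInterpreter = thaw_config(FROZEN_CONFIG)

# Attribute access plays the role of the getattribute shortcut.
print(ParrotInterpreter.INTVAL_size)
```

Because the string is part of the program itself, nothing has to be located on disk before the config can be read, which is the point of step 2.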

Comments, improvements, takers welcome,
leo

[1] w/o Python quirks

>>> sys.maxint = 2
>>> sys.maxint
2


[2] a remark about namespaces

We currently have:

/   (namespace root)
   __parrot_core ... MMD multi subs
   Integer
   Float
     Parrot PMC class namespaces

The PMC class namespaces should probably reside under __parrot_core to 
get rid of the namespace pollution. Or alternatively, we prepend two 
underscores:

/
   __parrot_core
   __ParrotInterpreter   aka __sys
   __ParrotIO            aka __io



Re: [pugs] regexp bug?

2005-04-15 Thread BRTHZI Andrs
Hi,
Yes, the value 0xFFFF can be stored as either a 3 byte UTF-8 string or a 2 
byte UCS-2 value, but the Unicode standard specifically says that the 
values 0xFFFF, 0xFFFE and 0xFEFF are NOT valid codepoints and should 
never appear in a Unicode string.  0xFFFF is reserved for out-of-band 
signaling (such as the -1 returned by getc()) and 0xFFFE and 0xFEFF are 
specifically reserved for out-of-band marking a UCS-2 file as being 
either big-endian or little-endian, but are specifically not considered 
part of the data.  chr() is currently defined to mean "convert an int 
value to a Unicode codepoint". That's why I said that chr(65535) should 
raise an exception; it's an argument error similar to sqrt(-1).
Thanks, I didn't know about it. I thought they just don't appear in UTF-8 
coded strings, but you're right. I recommend that it raise an exception, too.

Bye,
  Andras


a PMC question?

2005-04-15 Thread bloves
hi,folks.

I am reading the PMC C source code and some documentation
(http://www.perl.com/pub/a/2002/01/30/pmcs.html).



Some questions:



* Has this PMC design changed?

* Can anybody offer advice on learning the PMC C source code and the theory behind PMCs?



Thanks.




p2p is protocol or compiler ?

[perl #34991] More const weirdness

2005-04-15 Thread via RT
# New Ticket Created by  Nick Glencross 
# Please include the string:  [perl #34991]
# in the subject line of all future correspondence about this issue. 
# URL: https://rt.perl.org/rt3/Ticket/Display.html?id=34991 


Here's another odd one, which looks const-related. Uncommenting the '+=' 
line causes a compile error when 'c' is subsequently used.

If the const is made a float, then the problem goes away. Bringing the 
const into the function, it compiles, but the result is wrong (it looks 
like perhaps it doesn't expand the const, and the tracing shows that it 
adds 0).

Regards,

Nick

p.s. If this is drawing focus away from important topics, please let me 
know. I recall that there was debate about macros etc., but I can't 
remember if consts were here to stay or not...


.const int c = 12

.sub test

.local float a
a = 96

# Uncomment this line, and the c symbol is 'forgotten'
# a += c
   
print a
print_newline
print c
print_newline

end
.end



Re: Parrot bytecode reentrancy

2005-04-15 Thread Nigel Sandever
On Thu, 31 Mar 2005 21:17:39 -0500, [EMAIL PROTECTED] (MrJoltCola) 
wrote:
 At 05:57 PM 3/31/2005, Nigel Sandever wrote:
 Is Parrot bytecode reentrant?
 
 Yes.
 
 That is, if I want to have two instances of a class in each of two 
 threads, will
 the bytecode for the class need to be loaded twice?
 
 No, just once.
 
 Also, will it be possible to pass objects (handles/references) between 
 threads?
 
 Yes, otherwise threads are no more useful than processes.
 
 -Melvin
 
Thanks. Another question arises.

When a sub that closes over a variable 

my $closure = 0;
sub do_something {
return $closure++;
}

is called from two threads, do the threads share a single closure or each get 
their own separate closure?

njs




Re: [perl #34978] lib/Parrot/Test.pm should not use in commands

2005-04-15 Thread Jens Rieks
Thank you, applied!

jens


Some PMC's Questions

2005-04-15 Thread bloves mr
hi,folks.
I am reading the PMC C source code and some documentation
(http://www.perl.com/pub/a/2002/01/30/pmcs.html).

Some questions:

* Has this PMC design changed?
* Can anybody offer advice on learning the PMC C source code and the theory behind PMCs?

Thanks.
/*
  p2p is a protocol or a compiler?
*/


Re: [pugs] regexp bug?

2005-04-15 Thread BRTHZI Andrs
Hi,
my $a='A';
$a ~~ s:perl5:g/A/{chr(65535)}/;
say $a.bytes;
Outputs 0. Why?
\uFFFF is not a legal Unicode codepoint.  chr(65535) should raise an 
exception of some type.  So the above code does seem to show a possible 
bug. But as chr(65535) is an undefined char, who knows what the 
code is actually doing.
It seems that it gives back 0 in the 0xE000-0x range. Do you still 
think it's normal?

Some Unicode code points are invalid and should not be used. [...] It 
can't be 0xFFFF or 0xFFFE, it can't be both <= 0xDFFF and >= 0xD800, and 
it can't be > 0x10FFFF and it can't be less than 0.

  http://www.elfdata.com/plugin/unicodefaqdata.html
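
The rule in that quote can be written down as a small check. This is a sketch in Python, not Pugs or Haskell code, and it assumes my reading of the quote: the excluded values are 0xFFFF and 0xFFFE, the surrogate range 0xD800 to 0xDFFF, anything above 0x10FFFF, and anything below 0. The function name is made up for the example.

```python
def is_valid_code_point(cp):
    """Return True if cp passes the checks assumed in this sketch."""
    if cp < 0 or cp > 0x10FFFF:      # outside the Unicode range
        return False
    if 0xD800 <= cp <= 0xDFFF:       # surrogate range
        return False
    if cp in (0xFFFE, 0xFFFF):       # the two excluded values
        return False
    return True

print(is_valid_code_point(65535))
print(is_valid_code_point(ord("A")))
```

A chr() that raised an exception would presumably apply a test of this shape to its argument before building the string.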
Bye,
  Andras


Re: [pugs] regexp bug?

2005-04-15 Thread Mark A. Biggar
BRTHZI Andrs wrote:
Hi,
This code:
my $a='A';
$a ~~ s:perl5:g/A/{chr(65535)}/;
say $a.bytes;
Outputs 0. Why?
Bye,
  Andras
\uFFFF is not a legal Unicode codepoint.  chr(65535) should raise an 
exception of some type.  So the above code does seem to show a possible 
bug. But as chr(65535) is an undefined char, who knows what the 
code is actually doing.

--
[EMAIL PROTECTED]
[EMAIL PROTECTED]


Re: [pugs] regexp bug?

2005-04-15 Thread Mark A. Biggar
BRTHZI Andrs wrote:
Hi,
  This code:
 
  my $a='A';
  $a ~~ s:perl5:g/A/{chr(65535)}/;
  say $a.bytes;
 
  Outputs 0. Why?
 
 
  \uFFFF is not a legal Unicode codepoint.  chr(65535) should raise an 
exception of some type.  So the above code does seem to show a possible 
bug. But as chr(65535) is an undefined char, who knows what the 
code is actually doing.

In my opinion (which may be wrong), \uFFFF can be stored as a UTF-8 
character; it should be 0xEF~0xBF~0xBF. If I do it outside the regexp (I 
mean say chr(65535).bytes), it works well.

Another bug I've found; it's not related to regexps, but it's still a 
Unicode character one:

  say chr(0x10).bytes;
The answer:
  pugs: encodeUTF8: ord returned a value above 0x10
And if I start to increment $b, I will get:
  pugs: Prelude.chr: bad argument
I don't understand it, as I thought that Unicode characters are in the range 
of 0x-0x7FFF. Is Haskell not supporting the whole set?

There is a Unicode version, called UCS-2, that is just between 
0x0000-0xFFFF, but it still does not answer the question.

[...]
Meanwhile, I've found this:
http://std.dkuug.dk/jtc1/sc2/wg2/docs/n2175.htm
It can be the answer to my question.
Yes, the value 0xFFFF can be stored as either a 3 byte UTF-8 string or a 2 
byte UCS-2 value, but the Unicode standard specifically says that the 
values 0xFFFF, 0xFFFE and 0xFEFF are NOT valid codepoints and should 
never appear in a Unicode string.  0xFFFF is reserved for out-of-band 
signaling (such as the -1 returned by getc()) and 0xFFFE and 0xFEFF are 
specifically reserved for out-of-band marking a UCS-2 file as being 
either big-endian or little-endian, but are specifically not considered 
part of the data.  chr() is currently defined to mean "convert an int 
value to a Unicode codepoint". That's why I said that chr(65535) should 
raise an exception; it's an argument error similar to sqrt(-1).

--
[EMAIL PROTECTED]
[EMAIL PROTECTED]


Re: Hyper operator corner case?

2005-04-15 Thread Thomas Sandlaß
John Williams wrote:
Good point.  Another one is: how does the meta_operator determine the
identity value for user-defined operators?
Does it have to? The definition of the identity value---BTW, I like
the term neutral value better because identity also is a relation
between two values---is that $x my_infix_op $neutral == $x.
So the generic implementation that copies surplus elements is correct
with respect to the resulting value. You shouldn't expect the operator
to be called as many times as there are elements in the bigger data
structure, though. It's called only for positions where both structures
have actual values. But that is the same as short-circuiting && and ||.
And somewhat the reverse of autothreading from junctive values.
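
To make the "copies surplus elements" behaviour concrete, here is a minimal sketch in Python (not Perl 6 and not any actual implementation; the function name hyper and the _MISSING marker are invented for the example). The operator is called only where both lists have a value, and the leftover elements of the longer list are copied through unchanged, so no neutral value is ever needed:

```python
from itertools import zip_longest

_MISSING = object()  # marks positions where one list has run out

def hyper(op, left, right):
    """Apply op pairwise; copy the surplus elements of the longer list."""
    result = []
    for a, b in zip_longest(left, right, fillvalue=_MISSING):
        if a is _MISSING:
            result.append(b)          # only the right side has a value
        elif b is _MISSING:
            result.append(a)          # only the left side has a value
        else:
            result.append(op(a, b))   # op is called only here
    return result

print(hyper(lambda x, y: x + y, [1, 2, 3], [10, 20]))
```

An operator with side effects would observe two calls here, not three, which is the caveat about call counts made above.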

I believe the fine points fall out like this:
   @a + 1    # replicate
   @a + (1)  # replicate: (1) is still scalar
   @a + [1]  # extend: [1] is an array (and will auto-deref)
I think they fall out naturally from typing and dispatch. But note
that the » « operator has three args. I haven't made the op a dispatch
selector. If the my_infix_op from above needs to handle neutral elements
by itself just tell the dispatcher by defining
infix_circumfix_meta_operator:{'»','«'}:(List,List,&my_infix_op:) and
construct the neutral elements when one of the lists runs out of elements.
I hope the syntax I used does what I want to express. Note that in
:(List,List,&my_infix_op:) the first two elements are types while
&my_infix_op is a sub value. In that sense my op was actually wrong
but it was nice for wording my sentence. So the generic name should read
infix_circumfix_meta_operator:{'»','«'}:(List,List:Code) or perhaps
infix_circumfix_meta_operator:{'»','«'}:(List,List:&) if & is considered
as the code sigil. Hmm, then we could also have :(@,@:&) meaning the
same type spec?
BTW, starting from these type specs I come (back) to the suggestion of using
» « for hypering function calls and/or their arguments. Has that been decided?
I'm not sure if specialisation on values is covered by the :() syntax.
E.g. one could implement infix:*:(0,Any) to return 0 without evaluating
the Any term at all! But this needs either lazy evaluation in the functional
paradigm or code morphing 'x() * y()' to '(($t = x()) != 0) ?? $t * y() :: 0'
or some such. On assembler level this morphing reduces to an additional
check of a register for zero. But I'm not sure if the type system and the
optimizer will be *that* strong in the near future ;)
Regards
--
TSa (Thomas Sandlaß)


Re: Parrot/PUGS Hack-a-thon at the Austrian Perl Workshop

2005-04-15 Thread BRTHZI Andrs
Hi,
There will be a Parrot/PUGS Hack-a-thon at the Austrian Perl Workshop, which
takes place on 9th and 10th June in Vienna, Austria.
Autrijus Tang, Chip Salzenberg and Leo Toetsch will be there. You should be
there too :-)
I'll be there, too. ;)
Bye,
  Andras


Re: A sketch of the security model

2005-04-15 Thread Shevek
Someone's pointed this thread out to me, so I'm going to shove an oar in
following a few posts. I've done a fair bit of security work, so feel
free to ask me to explain, justify or provide references for anything.

On Wed, 2005-04-13 at 17:01 -0400, Dan Sugalski wrote:
 All security is done on a per-interpreter basis. (really on a 
 per-thread basis, but since we're one-thread per interpreter it's 
 essentially the same thing)

What you actually mean (or what I believe you _should_ mean) is
per-context, in the lambda-calculus sense of context. See notes below
about continuations.

 QUOTAs are limits on the number of resources or operations that an 
 interpreter can allocate or perform, either in absolute terms (i.e. 
 allocate no more than 10M of memory) or relative terms (i.e. can do 
 only 10 IO operations per second). Quotas are tracked by parrot, and 
 cover:

The ability to manipulate and exceed QUOTAs should be controlled in
dynamic context.

 PRIVILEGEs are permissions to do certain things. Parrot will have a 
 number of privileges it checks before doing dangerous operations, and 
 user code may also assign and check privileges.
 
 Normally parrot runs with no quotas and no privilege checking. This 
 is the fastest way to run. Code may at any time enable privilege 

Actually, you can do privilege checking in an efficient engine, even
using most of the reflection systems, with almost no overhead. See Java.

 and/or quota checking. Once enabled code must have proper privileges 
 to disable it again.

Typically AllPermission, otherwise you have the ability to perform
privilege escalation.

 Each running thread has two sets of privileges -- the active 
 privileges and the enableable privileges. Active privs are what's 
 actually in force at the moment, and can be dropped at any time. The 
 enableable privs are ones that code can turn on. It's possible to 
 have an active priv that's not in the enableable set, in which case 
 the current running code is allowed to do something but as soon as 
 the privilege is dropped it can't be re-enabled.

Enableable privileges are usually called static privileges and are
usually defined as the privileges held statically by the current object,
or if we read ahead to your next point, subroutine.

 Additionally, subroutines may be marked as having privileges, which 
 means that as long as control is inside the sub the priv in question 
 is enabled. This allows for code that has elevated privs, generally 
 system-level code.

Please no. Privileges should be explicitly granted. You have just
described the Unix SUID model, where as long as control is inside a
root-owned daemon (for daemon, read subroutine), the root privilege is
enabled. This always leads to privilege escalation and is BAD.

What you _should_ mean, according to all prior research, is that "No
code may be inside that routine and still hold a privilege not held by
the routine." In shorter form, "The dynamic (current) privilege set must
not exceed the static privilege set of any routine on the stack." A
slightly different formulation applies for data inspection systems. See
footnote.
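
As a rough illustration of that rule, here is a Python sketch (not Parrot code; the function name and the privilege names are invented). Each routine on the stack contributes its static privilege set, and the dynamic set is their intersection, so it can never exceed what any one routine on the stack holds:

```python
def effective_privileges(stack):
    """Intersect the static privilege sets of every routine on the stack.

    stack is a list of frozensets, outermost caller first. An empty
    stack yields no privileges, which is a choice made for this sketch.
    """
    if not stack:
        return frozenset()
    return frozenset.intersection(*stack)

system_code = frozenset({"read_file", "open_socket"})
user_code = frozenset({"read_file"})

# System code called from user code: only the shared privilege survives.
print(sorted(effective_privileges([user_code, system_code])))
```

Contrast this with the SUID-style model, where entering system_code would switch on "open_socket" regardless of who called it.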

 Continuations, when taken, capture the current set of active and 
 enableable privs, and when invoked those privs are put into place. 
 (This is a spot that will require some thought, since there's a 
 potential for privilege leaks which worries me here) Non-continuation 
 invokables (subs and methods) maintain the current set of privs, plus 
 possibly adding the sub-specific privs.

If you perform the above step correctly, then capturing a context and
including it in future access control checks is not hard. Java does this
by capturing a current AccessControlContext when a new ClassLoader is
created in a thread to be used in a different thread. No code loaded by
that ClassLoader IN ANY THREAD may exceed the privileges of the thread
which created the classloader at the time it created it.

 It's actually pretty straightforward, the hard part being the whole 
 "don't screw up when implementing" thing, along with designing the 
 base set of privs. Personally I think taking the VMS priv and quota 
 system as a base is a good way to go -- it's well-respected and 
 well-tested, and so far as I know theoretically sound. Unix's priv 
 model's a lot more primitive, and I don't think it's the one to take. 
 (We could invent our own, but history shows that people who invent 
 their own security system invent ones that suck, so that looks like 
 something worth avoiding)

Better systems to inspect would be Java (stack inspection), Perl5 (data
inspection). Please do not confuse the choice of privilege set and logic
over it (authorisation system) with the mechanism for identifying the
current set of privileges (identification of current principal).

The key difference in security between stack inspection and data
inspection systems for the purposes of parrot is that stack inspection
considers for security purposes the dynamic context of the 

Re: A sketch of the security model

2005-04-15 Thread Shevek
On Wed, 2005-04-13 at 17:51 -0400, Aaron Sherman wrote:
 On Wed, 2005-04-13 at 17:01, Dan Sugalski wrote:
  So here's what I was thinking of for Parrot's security and quota 
  model. (Note that none of this is actually *implemented* yet...)
 [...]
  It's actually pretty straightforward, the hard part being the whole 
  "don't screw up when implementing" thing, along with designing the 
  base set of privs. Personally I think taking the VMS priv and quota 
  system as a base is a good way to go -- it's well-respected and 
  well-tested, and so far as I know theoretically sound. Unix's priv 
  model's a lot more primitive, and I don't think it's the one to take. 
  (We could invent our own, but history shows that people who invent 
  their own security system invent ones that suck, so that looks like 
  something worth avoiding)
 
 VMS at least *is* a priv-based security model, but VMS privs are not
 appropriate for parrot on the whole.

The best known model for privileges (logic of authorisation over) is
that of Oracle, RT, etc, where access over privileges is transitive.
Will find good references on request/when I have more time. Bad
references are available from Ravi Sandhu, but he doesn't handle
transitivity or modification of rights well, if at all.

S.




Re: [perl #34994] [TODO] make useful parts of Parrot config available at runtime

2005-04-15 Thread Leopold Toetsch
Steven Philip Schubiger wrote:
[ cc'ed list, so that folks know about takers ]
On 15 Apr, Leopold Toetsch wrote:
: 5) along with bringing the config online, some cleanup and renaming 
: wouldn't harm e.g. iv vs opcode_t, intvalsize vs intsize vs 
: opcode_t_size ...

This part seems appealing to me, but bear in mind, I've never tampered
with the Parrot C sources, although I've been heavily involved in other
C-based projects (GNU coreutils et al.)
That stuff is all in Perl code under the config dir, e.g:
$ find config -type f | xargs grep -w intsize
And do you have more examples or should I follow my guts?
I think we should have:
  INTVAL_t   # type of the INTVAL
  FLOATVAL_t
  INTVAL_size
  int_size   # native c type
and so on. See also include/parrot/datatypes.h
Steven
leo


Re: New language: Parrot Common Lisp

2005-04-15 Thread Leopold Toetsch
Cory Spencer [EMAIL PROTECTED] wrote:

 I'd like to announce the creation of the Parrot Common Lisp project, which
 aims to implement a significant subset of the Common Lisp language.

Wow. I can even do something with it:

$ ../parrot lisp.imc
- (+ 2 5)
7
- (list 1 2 3)
(1 . (2 . (3 . NIL)))

Ehem, that's almost all I know about Lisp.

  Depending on the system (I develop on both x86/Linux and g4/OS X),
  you'll get a Bus Error, Segmentation Fault or some other random error
  if you don't disable the GC.

  (If anyone is able to track down aforementioned DOD/GC problems,
  you'll earn my eternal gratitude.)

Can you please provide a code snippet that exhibits the error.

 -c

leo


[SVN ci] MMD 23 - convert subtract MMD functions and opcodes

2005-04-15 Thread Leopold Toetsch
Continuing the MMD infix plan, we now have:
1) the subtract MMD functions are converted to the new function signature:
  PMC* subtract(PMC* value, PMC* dest)
If dest isn't NULL it's set to the result of the operation and the 
result is returned. This is the existing behavior. The TODO new n_sub 
opcode will return a new destination with the result as needed by 
languages like Python or Lisp.

2) There are now distinct infix variants of subtract, with i_ 
prepended to the function name:

  void i_subtract(PMC *value)
3) during opcode generation, the sub opcode is converted according to:
  sub Px, Py, Pz  =>  infix .MMD_SUBTRACT,   Px, Py, Pz
  sub Px, Py      =>  infix .MMD_I_SUBTRACT, Px, Py
  sub Px, Px, Py  =>  infix .MMD_I_SUBTRACT, Px, Py
I'm not quite sure if the latter is technically correct or useful. It 
might cause a problem when operators are overloaded. OTOH it can save a 
compare if (dest == SELF) 
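
The three conversion cases in item 3 can be modelled as a small function. This is a Python sketch for illustration only, not the actual opcode generator; register names are plain strings and rewrite_sub is a made-up name. A three-operand sub whose destination equals its left operand is treated like the two-operand form:

```python
def rewrite_sub(*operands):
    """Map the operands of a 'sub' opcode to an (mmd_function, operands) pair."""
    if len(operands) == 3:
        dest, left, right = operands
        if dest == left:
            # sub Px, Px, Py: destination is also the left operand
            return ("MMD_I_SUBTRACT", (dest, right))
        return ("MMD_SUBTRACT", (dest, left, right))
    if len(operands) == 2:
        # sub Px, Py: the two-operand form
        return ("MMD_I_SUBTRACT", operands)
    raise ValueError("sub takes two or three operands")

print(rewrite_sub("P0", "P1", "P2"))
print(rewrite_sub("P0", "P1"))
print(rewrite_sub("P0", "P0", "P1"))
```

The third case is the one in doubt: with overloaded operators, the i_subtract variant and the three-operand variant need not behave the same.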

4) Tcl and Python scalars use the inherited subtract MMD of Parrot core 
types Integer, Float, Complex, and BigInt. The old (duplicated, 
cut'n'pasted) variants of subtract were simply deleted from the Tcl and 
Python dynamic classes.

5) for type promotion on Integer overflow, I've changed the bignum 
vtables. We now have:

  PMC* VTABLE_get_bignum(INTERP, SELF)
which returns a new big integer of the appropriate type e.g. a PyLong. 
Along with morph these two functions are enough to preserve the HLLs 
view of types. There is a new test t/dynclass/pyint_26 that shows 
correct promotion of PyInt to PyLong.

6) during changing the scalar classes I found a lot of unused functions 
and vtables. E.g.
  - get_bool_keyed*  # unused, unneeded
  - set_bool_keyed*  # same
  - set_number
  - set_string   # no vtable slots, we have assign anyway
This is partially cleaned up now.

7) make test succeeds, this includes t/dynclass/py*.t
cd languages/tcl
TEST_PROG_ARGS=-G make test  shows 46/228 failing, with DOD enabled 
almost all fail.

I don't know yet, what's going on here. It seems that TclParser is the 
culprit. It creates during class_init a lot of strings e.g. bs_nl, 
which are declared static in that file. But these strings aren't 
anchored anywhere or registered with Parrot's DOD registry.

leo
PS please make realclean so that vtable changes are propagated


Re: A sketch of the security model

2005-04-15 Thread Shevek
On Thu, 2005-04-14 at 09:51 -0700, Dave Whipp wrote:
 Dan Sugalski wrote:
 
  All security is done on a per-interpreter basis. (really on a per-thread 
  basis, but since we're one-thread per interpreter it's essentially the 
  same thing)
 ...
 * Number of open files
 * IO operations/sec
 * IO operations total
 ...
 
 Can an application get more resources simply by spawning threads? If 

Well, given that a child thread's dynamic access control context should
include the dynamic context of the parent thread at the point where the
thread was spawned, No.

What I describe is a (provably) correct implementation.

 the answer is "no, parent and child must divide/share their quotas" then 
 there is a load balancing problem. If the answer is "yes", then there's 

There is no load balancing problem assuming you are synchronized on the
thread-create point, which is not a major overhead, since that pretty
much has to be a synchronization point in the kernel anyway.
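
A small sketch of the "No" answer, in Python rather than Parrot, with invented class and function names: the child threads draw on the same quota object as their parent, so spawning more threads does not create more resources. Each worker writes to its own slot of the counts list, so only the quota itself needs a lock.

```python
import threading

class Quota:
    """A resource budget shared by a thread and every thread it spawns."""
    def __init__(self, limit):
        self._remaining = limit
        self._lock = threading.Lock()

    def take(self, amount):
        """Grant amount units if available; return True on success."""
        with self._lock:
            if amount > self._remaining:
                return False
            self._remaining -= amount
            return True

def run_workers(quota, workers, requests_each):
    """Spawn workers that all draw on one quota; return units granted."""
    counts = [0] * workers            # one slot per worker, no sharing

    def worker(index):
        for _ in range(requests_each):
            if quota.take(1):
                counts[index] += 1

    threads = [threading.Thread(target=worker, args=(i,))
               for i in range(workers)]
    for t in threads:
        t.start()
    for t in threads:
        t.join()
    return sum(counts)

# Ten threads ask for 50 units in total from a budget of 20.
print(run_workers(Quota(20), workers=10, requests_each=5))
```

How the budget is split between the threads varies from run to run; the total that is granted does not.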

 no real protection at all. A threads-per-second limit isn't an answer 
 here, either (a malicious app could sit around for a few hours, 
 launching threads at a low intensity, until it has enough to bring down 
 the system).
 
 Is a thread really the right thing to apply these limits to? It seems to 

Limits are applied to privilege sets, not to threads.

 me that there needs to be some sort of token (cf. cash; cf capability) 
 that an application can obtain/spend/refresh to do these ops. An 

Yes, that's about the same.

 application could share its token(s) with any threads it creates. It 
 could probably even loan its token to a background thread that does 
 some operation on behalf of many other threads.

Preferably not. I fear the concept of being able to hand out privileges
to low privilege threads. If the low privilege thread has access to a
(willing) object with static privileges allowing the operation, then
that object should perform the operation on behalf of the thread in a
dynamic context created by a 'grant' operation (See Fournet and Gordon,
2003). If the low privilege thread is made up entirely of low privilege
objects, then it shouldn't have the privilege under any circumstances.

S.




Re: A sketch of the security model

2005-04-15 Thread Shevek
On Wed, 2005-04-13 at 22:03 -0400, Michael Walter wrote:
 Dan,
 
 On 4/13/05, Dan Sugalski [EMAIL PROTECTED] wrote:
  All security is done on a per-interpreter basis. (really on a
  per-thread basis, but since we're one-thread per interpreter it's
  essentially the same thing)
 Just to get me back on track: Does this mean that when you spawn a
 thread, a separate interpreter runs in/manages that thread, or
 something else?
 
  Each running thread has two sets of privileges -- the active
  privileges and the enableable privileges. Active privs are what's
  actually in force at the moment, and can be dropped at any time. The
  enableable privs are ones that code can turn on. It's possible to
  have an active priv that's not in the enableable set, in which case
  the current running code is allowed to do something but as soon as
  the privilege is dropped it can't be re-enabled.
 
 How can dropping a privilege for the duration of a (dynamic) scope be
 implemented? Does this need to be implemented via a parrot intrinsic,
 such as:
 
   without_privs(list_of_privs, code_to_be_run_without_these_privs);
 
 ..or is it possible to do so with the primitives you sketched out above?

This is usually done by creating a function f(code) { code() } without
any static privileges in list_of_privs. To evaluate a function g()
without those privileges, evaluate f(g), and the natural mechanisms of
the interpreter will ensure that these privileges are not held during
g().
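
The f(code) { code() } trick can be modelled like this. It is a Python toy with invented names (Context, Routine, holds), not Parrot: every call pushes the routine's static privilege set onto a stack, and a privilege counts as held only if every routine on the stack has it.

```python
class Context:
    """Tracks the static privilege sets of the routines currently running."""
    def __init__(self):
        self.stack = []

    def holds(self, priv):
        # A privilege is held only if every routine on the stack has it.
        return all(priv in frame for frame in self.stack)

class Routine:
    """A callable with a fixed (static) set of privileges."""
    def __init__(self, static_privs, body):
        self.static_privs = frozenset(static_privs)
        self.body = body

    def __call__(self, ctx, *args):
        ctx.stack.append(self.static_privs)
        try:
            return self.body(ctx, *args)
        finally:
            ctx.stack.pop()   # returning removes this routine's restriction

# g wants to use "io"; f is the wrapper f(code) { code() } without "io".
g = Routine({"io", "net"}, lambda ctx: ctx.holds("io"))
f = Routine({"net"}, lambda ctx, code: code(ctx))

ctx = Context()
print(g(ctx))       # g called directly
print(f(ctx, g))    # g called through f
print(g(ctx))       # g called directly again, after f has returned
```

No dedicated without_privs intrinsic is needed in this model; the restriction also ends by itself once f returns and leaves the stack.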

  Additionally, subroutines may be marked as having privileges, which
  means that as long as control is inside the sub the priv in question
  is enabled. This allows for code that has elevated privs, generally
  system-level code.
 
 Must the code marking a subroutine have any privilege other than
 the one it is marking the subroutine with?
 
  ... Non-continuation
  invokables (subs and methods) maintain the current set of privs, plus
  possibly adding the sub-specific privs.
 
 Same for closures?

Closures may also capture a concept of the current context, which is
used when they are evaluated. This is critical in, for example, the case
of system code with higher static privileges returning a closure to a
low privilege object which may evaluate it at any time.

a) The closure must not have any privileges not held by the low
privilege object, so clearly it cannot just hold its static privilege
set, it must capture a current context.

b) If it does wish to have higher privilege (very common), it may grant
(Fournet+Gordon,2003) these privileges in a dynamic scope bounded below
by itself.
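
Point (a) can be sketched as follows, again as a Python toy with invented names and not as Parrot code. The closure records the privilege sets that were on the stack when it was created, and every later check intersects that captured context with the stack of whoever evaluates it. The grant operation from point (b) is not modelled here.

```python
def make_closure(creator_stack, body):
    """Capture the privilege sets on the stack at creation time."""
    captured = tuple(creator_stack)

    def closure(caller_stack):
        # Checks see both the captured context and the caller's stack.
        frames = captured + tuple(caller_stack)
        held = frozenset.intersection(*frames) if frames else frozenset()
        return body(held)

    return closure

low = frozenset({"read"})
system = frozenset({"read", "write"})

# System code, called by a low privilege object, creates the closure,
# so the captured stack contains both privilege sets.
closure = make_closure([low, system], lambda held: sorted(held))

# Even when evaluated later from a fully privileged stack, the closure
# holds no more than the low privilege object did at creation time.
print(closure([system]))
```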

S.




Re: Parrot bytecode reentrancy

2005-04-15 Thread Leopold Toetsch
Nigel Sandever [EMAIL PROTECTED] wrote:

 When a sub that closes over a variable

   my $closure = 0;
   sub do_something {
   return $closure++;
   }

 is called from two threads, do the threads share a single closure or
 each get their own separate closure?

AFAIK: the closure bytecode is shared, the Closure PMC with the lexical
pad is distinct. But that all isn't implemented yet.
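
The split between shared code and a distinct pad has a close analogue in Python, shown here purely as an illustration; it involves no threads and says nothing about how Parrot will implement it. Two closures made from the same source share one compiled code object, while each carries its own captured variable:

```python
def make_counter():
    closure = 0

    def do_something():
        nonlocal closure
        value = closure
        closure += 1
        return value      # same result as "return $closure++"

    return do_something

a = make_counter()
b = make_counter()

print(a.__code__ is b.__code__)   # do both share one code object?
print(a(), a(), b())              # does each keep its own counter?
```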

 njs

leo


Re: A sketch of the security model

2005-04-15 Thread Shevek
On Thu, 2005-04-14 at 09:11 -0400, Dan Sugalski wrote:
 At 10:03 PM -0400 4/13/05, Michael Walter wrote:

Each running thread has two sets of privileges -- the active
   privileges and the enableable privileges. Active privs are what's
   actually in force at the moment, and can be dropped at any time. The
   enableable privs are ones that code can turn on. It's possible to
   have an active priv that's not in the enableable set, in which case
   the current running code is allowed to do something but as soon as
   the privilege is dropped it can't be re-enabled.
 
 How can dropping a privilege for the duration of a (dynamic) scope be
 implemented? Does this need to be implemented via a parrot intrinsic,
 such as:
 
without_privs(list_of_privs, code_to_be_run_without_these_privs);
 
 ..or is it possible to do so with the primitives you sketched out above?
 
 When a priv is dropped it stays dropped until it's reinstated. If 
 code drops a priv that it can't re-enable then the priv is gone. 
 (There are going to be issues with privileges attached to 
 continuations, since this could potentially mean that dropped privs 
 get un-dropped when you invoke a return continuation, though dropping 
 a privilege could ripple up the return continuation chain)

Reinstating privileges when you return is normal, since potentially
malicious code and data has now been removed from the stack.

If you do NOT do it this way, then every piece of code must know the
privileges of every child piece of code it calls (bye-bye virtual base
classes with user implementations).

See http://research.microsoft.com/~adg/Publications/MSR-TR-2001-103.pdf

The ability to explicitly reenable a privilege via an opcode, rather
than via the removal of the malicious party from the computation (by
return) is almost definitely a bad idea. If you protect this opcode
using some security mechanism, you will rapidly find that security
mechanism can supersede the functionality provided by the opcode.

Additionally, subroutines may be marked as having privileges, which
   means that as long as control is inside the sub the priv in question
   is enabled. This allows for code that has elevated privs, generally
   system-level code.
 
 Must the code marking a subroutine have any privilege other than
 the one it is marking the subroutine with?
 
 Dunno, that's something we'll need to work out. It's possible that 
 sub marking needs to be done externally -- that is, it's bytecode 
 metadata or something like that which requires system privileges of 
 some sort to set. (Though there are issues with that) Marking code as 
 privileged is really a system administration task, though we've not 
 really put much thought into administering a parrot system yet.

Actually, what usually happens is that subroutines (etc) are associated
with a responsible party (principal), and privileges are granted to the
principal; thus finding out the privileges of an opcode requires an
extra indirection. This is not a problem.

... Non-continuation
   invokables (subs and methods) maintain the current set of privs, plus
   possibly adding the sub-specific privs.
 Same for closures?
 
 Yeah, I think so.

No, as before. You cannot execute based only on static privileges - this
is what Unix does, and the Unix model is broken. You need either a stack
inspection or a data inspection model, or a combination of the two. Ask
me if you want formal descriptions or implementation details of these
models.

S.




$*CWD instead of chdir() and cwd()

2005-04-15 Thread Michael G Schwern
I was doing some work on Parrot::Test today and was replacing this code
with something more cross platform.

# Run the command in a different directory
my $command = 'some command';
$command = "cd $dir && $command" if $dir;
system($command);

I replaced it with this.

my $orig_dir = cwd;
chdir $dir if $dir;
system $command;
chdir $orig_dir;

Go into some new directory temporarily, run something, go back to the
original.

Hmm.  Set a global to a new value temporarily and then return to the
original value.  Sounds a lot like local.  So why not use it?

{
local chdir $dir if $dir;
system $command;
}

But localizing a function call makes no sense, especially if it has side
effects.  Well, the current working directory is just a filepath.  Scalar
data.  Why have a function to change a scalar?  Just change it directly.
Now local() makes perfect sense.

{
local $CWD = $dir if $dir;
system $command;
}

And this is exactly what File::chdir does.  $CWD is a tied scalar.
Changing it changes the current working directory.  Reading it tells you
what the current working directory is.  Localizing it allows you to
safely change the cwd temporarily, for example within the scope of a
subroutine.  It eliminates both chdir() and cwd().
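
A minimal sketch of the tied-scalar mechanism described above, using only core modules. This is not File::chdir's actual source; the package name `Sketch::CwdScalar` is hypothetical, and dying on a failed chdir is a choice made for this sketch.

```perl
use strict;
use warnings;
use Cwd ();
use File::Spec;

# Hypothetical tie class; not File::chdir's real implementation.
package Sketch::CwdScalar;

sub TIESCALAR { my ($class) = @_; return bless {}, $class }

# Reading the variable reports the current working directory.
sub FETCH { return Cwd::getcwd() }

# Assigning to the variable changes directory. `local` first stores
# undef into the fresh copy, so undef is silently ignored.
sub STORE {
    my ($self, $dir) = @_;
    return unless defined $dir;
    chdir $dir or die "Can't chdir to $dir: $!";
    return;
}

package main;

our $CWD;
tie $CWD, 'Sketch::CwdScalar';

my $start = $CWD;
my $inside;
{
    local $CWD = File::Spec->tmpdir;   # chdir for this block only
    $inside = $CWD;
}
my $restored = $CWD;                   # scope exit stored the old value back

print "start:    $start\n";
print "inside:   $inside\n";
print "restored: $restored\n";
```

Leaving the block makes perl store the saved value back into the variable, which is what performs the chdir back to the original directory.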

Error handling is simple, a failed chdir returns undef and sets errno.

$CWD = $dir err die "Can't chdir to $dir: $!";

I encourage Perl 6 to adapt $*CWD similar to File::chdir and simply eliminate
chdir() and cwd().  They're just an unlocalizable store and fetch for global
data.

As a matter of fact, Autrijus is walking me through implementing it in Pugs
right now.



Re: should we change [^a-z] to <-[a..z]> instead of <-[a-z]>?

2005-04-15 Thread Juerd
David Wheeler skribis 2005-04-14 21:32 (-0700):
 I was going to say that that was inconsistent, but since you never need 
 to repeat a letter in a character class, well, I guess it isn't. But 
 the first person to write <[a...]> gets what's comin' to 'em.

Given ASCII, <[\x20...]> would then be everything except control
characters. Handy!

By the way, does ...5 mean -Inf..5? ;)


Juerd
-- 
http://convolution.nl/maak_juerd_blij.html
http://convolution.nl/make_juerd_happy.html 
http://convolution.nl/gajigu_juerd_n.html


Re: Some PMC's Questions

2005-04-15 Thread Leopold Toetsch
Bloves Mr [EMAIL PROTECTED] wrote:
 hi,folks.
 I am reading PMC C source code and reading some document(
 http://www.perl.com/pub/a/2002/01/30/pmcs.html).

Despite that the text is rather old, it's remarkably valid still.

 Some questions:

 *this PMC design have changed?

The internal layout of the PMC structure has changed, yes. And it
will likely change in the future. The internals of vtable calls and PMC
structure data access are now hidden inside macros:

  SELF->data                =>  PMC_data(SELF)
  SELF->cache.int_val       =>  PMC_int_val(SELF)
  $1->vtable->get_bool()    =>  VTABLE_get_bool(INTERP, $1)

and so on. For details you might consult include/parrot/pmc.h.

 *any body offer some advice that learn PMC C source code and PMC's theory?

Just have a look at existing PMCs in classes. Commonly used core classes
are a good place to begin, e.g.:

  classes/integer.pmc            ... the Integer PMC
  classes/resizablepmcarray.pmc  ... standard PMC array

or even

  classes/tqueue.pmc             ... experimental thread-safe queue

 Thanks.

leo


[perl #34999] [TODO] remove more old stuff

2005-04-15 Thread via RT
# New Ticket Created by  Leopold Toetsch 
# Please include the string:  [perl #34999]
# in the subject line of all future correspondence about this issue. 
# URL: https://rt.perl.org/rt3/Ticket/Display.html?id=34999 


Some outdated files:

   lib/Parrot/PackFile/*
   lib/Parrot/PackFile.pm
   lib/Parrot/PackFile2.*

what is:

   lib/Parrot/String.pm  old packfile code?
   lib/Parrot/Types.pm   same?
   lib/Parrot/Key.pm     same?

Do we still need:

   lib/Parrot/PMC.pm
   lib/Parrot/Makefile.PL

and what about the

   chartypes

directory, seems to be created in lib/Parrot/Distribution.pm

Already discussed:

   classes/pmc2c.pl   old PMC compiler
   classes/pmcarray.pmc   wrapper for PerlArray

leo



Re: should we change [^a-z] to <-[a..z]> instead of <-[a-z]>?

2005-04-15 Thread Aaron Sherman
On Thu, 2005-04-14 at 21:32 -0700, David Wheeler wrote:
 On Apr 14, 2005, at 7:06 PM, Patrick R. Michaud wrote:
 
  So, <[a.z]>  matches a, ., and z,
  while   <[a..z]> matches characters a through z inclusive.
 
 I was going to say that that was inconsistent, but since you never need 
 to repeat a letter in a character class, well, I guess it isn't. But 
 the first person to write <[a...]> gets what's comin' to 'em.

A silly question: is there a canonical character set from which we
extract these ranges? Are we hard-coding Unicode here, or is there some
way for the user to specify the character set for ranges?




Re: Test::Expect

2005-04-15 Thread Ricardo SIGNES
* Adrian Howard [EMAIL PROTECTED] [2005-04-14T15:37:07]
 On 14 Apr 2005, at 11:36, Leon Brocard wrote:
 Oh, I forgot to mention to perl-qa that I wrote Test::Expect:
   http://search.cpan.org/dist/Test-Expect/
 
 It's nice. Already used it :-)

Does anyone who has used both Test::Expect and Test::Output feel like
giving a simple comparison?

-- 
rjbs




Re: [pugs] regexp bug?

2005-04-15 Thread hv
Mark A. Biggar [EMAIL PROTECTED] wrote:
:BÁRTHÁZI András wrote:
:
: Hi,
: 
: This code:
: 
: my $a='A';
: $a ~~ s:perl5:g/A/{chr(65535)}/;
: say $a.bytes;
: 
: Outputs 0. Why?
: 
: Bye,
:   Andras
: 
:
:\uFFFF is not a legal unicode codepoint.  chr(65535) should raise an 
:exception of some type.  So the above code does seem to show a possible 
:bug. But as that chr(65535) is an undefined char, who knows what the 
:code is actually doing.

In perl5 at least, we support a wider concept of codepoints than the
Unicode consortium. This allows us to use strings for a wider variety
of things than just Unicode text (eg version strings, bit vectors etc).
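
A small Perl 5 check of that claim, offered as a sketch: perl accepts chr(65535) as a one-character string even though U+FFFF is a Unicode noncharacter, and its internal (lax) UTF-8 form is three bytes. `utf8::encode` is a core function; the variable names are only for illustration.

```perl
use strict;
use warnings;

my $char  = chr(0xFFFF);      # 65535, a Unicode noncharacter
my $bytes = $char;
utf8::encode($bytes);         # in-place conversion to UTF-8 octets

my $hex = join ' ', map { sprintf('%02X', ord($_)) } split //, $bytes;
print length($char), " character, ", length($bytes), " bytes: $hex\n";
# prints "1 character, 3 bytes: EF BF BF"
```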

In perl6 the greatly expanded set of types will presumably allow us
to distinguish actual Unicode data from more arbitrary sequences of
codepoints, and I'd normally expect that the more constrained type
would be a subtype of the less constrained type. In this case that
means I'd expect Unicode string to be a subtype of something like
codepoint sequence.

(In fact it'd probably be useful to have more levels than that - there
are times when you need the Unicode concepts for things like [[:digit:]],
but may be able to get better performance by avoiding the checks for
'legal Unicode codepoint'.)

On the other hand you will probably be able to achieve the things p5
overloads onto strings using packed integer arrays, so maybe this all
represents unnecessary complications. In which case maybe 'relaxed'
variants of Unicode strings aren't needed. We will probably still want
other sorts of strings though, such as ASCII.

Hugo


Re: should we change [^a-z] to <-[a..z]> instead of <-[a-z]>?

2005-04-15 Thread Brao Tich
- Original Message - 
From: Aaron Sherman [EMAIL PROTECTED]
To: David Wheeler [EMAIL PROTECTED]
Cc: Perl6 Language List perl6-language@perl.org
Sent: Friday, April 15, 2005 2:00 PM
Subject: Re: should we change [^a-z] to <-[a..z]> instead of <-[a-z]>?


 On Thu, 2005-04-14 at 21:32 -0700, David Wheeler wrote:
  On Apr 14, 2005, at 7:06 PM, Patrick R. Michaud wrote:
 
   So, <[a.z]>  matches a, ., and z,
   while   <[a..z]> matches characters a through z inclusive.
 
  I was going to say that that was inconsistent, but since you never need
  to repeat a letter in a character class, well, I guess it isn't. But
  the first person to write <[a...]> gets what's comin' to 'em.

 A silly question: is there a canonical character set from which we
 extract these ranges? Are we hard-coding Unicode here, or is there some
 way for the user to specify the character set for ranges?


<delurk>
even sillier question:
if <[a.z]> matches a, . and z
and <[a...]> matches all characters from a onwards (for some definition
of 'all')

how will the range \x21 .. \x2e be written?
<[!..\.]>? (i.e. . escaped?)
</delurk>

brao



[RFC] some doubtable MMDs?

2005-04-15 Thread Leopold Toetsch
I'm not quite sure, but it seems that some of the MMD functions may 
better be vtable methods:

- bitwise_sh[rl]*    shift by anything other than int?
- bitwise_lsr        is missing generally

or even just a plain opcode only:

- logical_{or,and,xor}  return a PMC depending on the boolean value

What are HLLs expecting of these infix operations?

OTOH it might be useful that the current get_<type>_keyed operations 
(postcircumfix:[]) become MMD subroutines:

  Px = Py[Pz]    Pz = String, Int, Key, Slice, ...

Comments welcome,
leo


<[]> ugly and hard to type

2005-04-15 Thread Juerd
Am I the only one who thinks <[a-z]> is ugly and hard to type because of
the nested brackets? The same goes for <{...}>. The latter can't easily
be fixed, I think, but the former perhaps can. If there are more who
think it needs to, that is. And <{}> is a bit easier to type because all
four are shifted (US QWERTY and US Dvorak), while with <[]> I really
have to think hard about when to press and when to release the shift
key.

\letter[] could well replace <[]>, and \LETTER[] would then replace
<-[]>. This is consistent with many other \letters.

c for character is taken
r for range is taken by carriage return
a for any is taken by alarm (bell)
l for list is taken by lcfirst

m is available, but I can't think of a mnemonic :)

\m[a..z]  \M[a..z]

And to replace <[a..z]-[aoeui]> (does that construct even exist?),
[ \m[a..z]  \M[aoeui] ]. IMO, that's the only step backwards.

a would best communicate its function. Is the beep thing used enough?
(\cG still does that thing if \a is gone.)


Juerd
-- 
http://convolution.nl/maak_juerd_blij.html
http://convolution.nl/make_juerd_happy.html 
http://convolution.nl/gajigu_juerd_n.html


Re: <[]> ugly and hard to type

2005-04-15 Thread Patrick R. Michaud
On Fri, Apr 15, 2005 at 02:58:44PM +0200, Juerd wrote:
 Am I the only one who thinks <[a-z]> is ugly and hard to type because of
 the nested brackets? The same goes for <{...}>. The latter can't easily
 be fixed, I think, but the former perhaps can. 

Part of the thinking behind this is that the <[...]> construct
is likely to be less common in p6 rules than [...] was in p5 regular
expressions.  For unicode reasons, one typically should be writing
<alpha> instead of [a-z] anyway.  

But yes, I understand the difficulty of typing <[...]> on non-US
keyboards.  :-)

 \letter[] could well replace <[]>, and \LETTER[] would then replace
 <-[]>. This is consistent with many other \letters.
 
 c for character is taken
 r for range is taken by carriage return
 a for any is taken by alarm (bell)
 l for list is taken by lcfirst

Actually, \L[...] is gone -- see S05 and A05.  I'm not sure if \a
exists, I haven't seen any reference to it in p6 rules.  (One could
claim that it's carried over from p5, but rules are so far different
from regexes that I'm hesitant to make that assumption.)  We could
certainly declare \a to be something else.

This isn't a vote from me either in favor or against this idea...
I'm just clarifying and making sure the discussion is up-to-date
with the relevant specs.

Pm


Re: should we change [^a-z] to <-[a..z]> instead of <-[a-z]>?

2005-04-15 Thread Matthew Walton

 <delurk>
 even sillier question:
 if <[a.z]> matches a, . and z
 and <[a...]> matches all characters from a including (for some
 definition of 'all')

 how will be range \x21 .. \x2e written?
 <[!..\.]>? (i.e. . escaped?)
 </delurk>

I was assuming from Larry's mail that <[a...]> would parse as either:

  1) a character class containing the range from 'a' to '.' (what that
     means is a bit mind-bending for a Friday afternoon)
  2) a character class containing 'a' then a range from '.' to... oh, an
     error

Which way might be ambiguous, but could of course be defined in the
grammar. It hadn't occurred to me that ... for the range to infinity would
be allowed or useful here. I suppose it could just mean 'up to the end of
the available codepoints'.

I do love the idea of <[a..f]> type ranges though. It's just what the
three dots mean that's got me confused.
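
To make the ambiguity concrete, here is a toy Perl 5 expander for the body of such a character class. It is only a sketch under one assumed rule (scan left to right, and treat `X..Y` as a range wherever it matches); the function name `expand_class` is hypothetical, and this is not how PGE or Pugs parse classes.

```perl
use strict;
use warnings;

# Expand the body of a class such as 'a..e' into its member characters.
# Assumed rule: scan left to right; "X..Y" is a range, anything else is
# a single literal character.
sub expand_class {
    my ($body) = @_;
    my @chars;
    while (length $body) {
        if ($body =~ s/\A(.)\.\.(.)//s) {
            # An end point below the start point yields an empty range.
            push @chars, map { chr($_) } ord($1) .. ord($2);
        }
        else {
            $body =~ s/\A(.)//s;
            push @chars, $1;
        }
    }
    return @chars;
}

my @plain = expand_class('a.z');    # a . z
my @range = expand_class('a..e');   # a b c d e
my @three = expand_class('a...');   # 'a' to '.': empty under this rule

print "a.z  => @plain\n";
print "a..e => @range\n";
print "a... => ", scalar(@three), " characters\n";
```

Under this rule three dots fall into reading 1) above, and because '.' sorts before 'a' the class silently matches nothing, which is one argument for making it an error or a warning.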



Re: should we change [^a-z] to <-[a..z]> instead of <-[a-z]>?

2005-04-15 Thread Steven Philip Schubiger
On 14 Apr, Larry Wall wrote:

: In writing some character class translation, I realized that
: 
: <-[a-z]>
: 
: and its ilk are rather hard to read because of the two hyphens
: that mean different things.  We can't use <![a-z]> because that's a
: 0-width lookahead.  Given that we're trying to get rid of special
: exceptions, and - in character classes is weird, and we already
: use .. for ranges everywhere else, and nobody is going to put a
: repeated character into a character class, I'm wondering if
: 
: <-[a..z]>
: 
: should be allowed/encouraged/required.  It greatly improves the
: readability in my estimation.  The only problem with requiring .. is
: that people *will* write [a-z] out of habit, and we would probably
: have to outlaw the - form for many years before everyone would get
: used to the .. form.  So maybe we allow - but warn if not backslashed.
: 
: Larry

I think, if we bear in mind, as it has been stressed previously, that
many changes concerning regular expressions have been introduced and
require users to assimilate themselves accordingly, it doesn't seem
unreasonable requiring to write double-dot instead of a hyphen; it also
fits the Principle of least surprise idiom nicely, in my opinion.

Nevertheless, as mentioned by David, <[a...]> would become rather
confusing, first to people and secondly to the compiler; although,
regardless of whether we assume dot precedes double-dot or vice versa,
there would be an expansion enforced (what I'd expect), perhaps
accompanied by a warning.

I agree on a warning upon non-escaped hyphen.

Steven


Re: should we change [^a-z] to <-[a..z]> instead of <-[a-z]>?

2005-04-15 Thread Rafael Garcia-Suarez
Aaron Sherman wrote in perl.perl6.language :

 A silly question: is there a canonical character set from which we
 extract these ranges? Are we hard-coding Unicode here, or is there some
 way for the user to specify the character set for ranges?

Perl 5 forces [a-z] (or [i-j] for that matter) to be a range of
lowercase alphabetic characters, even on EBCDIC platforms (where it's
not).
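
A short Perl 5 illustration of why the special case is needed. The code values are those of EBCDIC code page 037, written out by hand here so that nothing depends on an encoding module; the variable names are only for illustration.

```perl
use strict;
use warnings;

# Lowercase letters in EBCDIC code page 037 sit in three separate runs.
my %ebcdic;
@ebcdic{ 'a' .. 'i' } = 0x81 .. 0x89;
@ebcdic{ 'j' .. 'r' } = 0x91 .. 0x99;
@ebcdic{ 's' .. 'z' } = 0xA2 .. 0xA9;

my %is_letter = map { $_ => 1 } values %ebcdic;

# Treating [i-j] as a raw code-value range would sweep in the gap.
my @raw     = $ebcdic{i} .. $ebcdic{j};
my @letters = grep { $is_letter{$_} } @raw;

printf "raw range 0x%02X..0x%02X covers %d code values, %d of them letters\n",
    $ebcdic{i}, $ebcdic{j}, scalar @raw, scalar @letters;
# prints "raw range 0x89..0x91 covers 9 code values, 2 of them letters"
```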


Re: should we change [^a-z] to <-[a..z]> instead of <-[a-z]>?

2005-04-15 Thread Patrick R. Michaud
On Fri, Apr 15, 2005 at 01:01:58PM -, Rafael Garcia-Suarez wrote:
 Aaron Sherman wrote in perl.perl6.language :
 
  A silly question: is there a canonical character set from which we
  extract these ranges? Are we hard-coding Unicode here, or is there some
  way for the user to specify the character set for ranges?
 
 Perl 5 forces [a-z] (or [i-j] for that matter) to be a range of
 lowercase alphabetic characters, even on EBCDIC platforms (where it's
 not).

At the moment, PGE (the part that implements the rule engine) is
deferring such questions to Parrot, and otherwise assuming Unicode.
Plus, S02 explicitly indicates that Perl is written in Unicode
and has consistent Unicode semantics, so I think that's what we should
go with.  It's certainly the way the compiler will go, at least
initially.

Pm


Re: [perl #34994] [TODO] make useful parts of Parrot config available at runtime

2005-04-15 Thread Steven Philip Schubiger
On 15 Apr, Leopold Toetsch wrote:

: That stuff is all in Perl code under the config dir, e.g:
: 
: $ find config -type f | xargs grep -w intsize

This clarifies some of my unapproved assumptions, although src has
some files containing these keywords too.

: I think we should have:
: 
:INTVAL_t   # type of the INTVAL
:FLOATVAL_t
:INTVAL_size
:int_size   # native c type
: 
: and so on. See also include/parrot/datatypes.h

I will.

: leo 

Steven



[perl #35000] [PATCH] README.win32 icu 3.2

2005-04-15 Thread François
# New Ticket Created by  François PERRAD 
# Please include the string:  [perl #35000]
# in the subject line of all future correspondence about this issue. 
# URL: https://rt.perl.org/rt3/Ticket/Display.html?id=35000 



small mistake in [perl #34986] :
with ICU 3.2, the library icudata.lib is renamed icudt.lib.

Francois Perrad.

--- README.win32.orig   2005-04-15 11:08:34.0 +0200
+++ README.win32        2005-04-15 11:25:50.0 +0200
@@ -65,7 +65,7 @@
 mkdir C:\usr\lib\data
 set PATH=%PATH%;C:\usr\lib\icu\bin
 cd <parrot directory>
-perl Configure.pl --icushared=C:\usr\lib\icu\lib\icudata.lib 
C:\usr\lib\icu\lib\icuuc.lib --icuheaders=C:\usr\lib\icu\include 
--icudatadir=C:\usr\local\icu\data
+perl Configure.pl --icushared=C:\usr\lib\icu\lib\icudt.lib 
C:\usr\lib\icu\lib\icuuc.lib --icuheaders=C:\usr\lib\icu\include 
--icudatadir=C:\usr\local\icu\data
 
 With MinGW32, use icu-3.2-Win32-msvc6.zip.
 
@@ -112,9 +112,9 @@
 
 With the ActiveState Perl distribution, tell Configure.pl to use gcc :
 
-perl Configure.pl --cc=gcc --icushared=C:\usr\lib\icu\lib\icudata.lib 
C:\usr\lib\icu\lib\icuuc.lib --icuheaders=C:\usr\lib\icu\include 
--icudatadir=C:\usr\local\icu\data
-
-Nota: Use only the ICU binary distribution. 
+perl Configure.pl --cc=gcc --icushared=C:\usr\lib\icu\lib\icudt.lib 
C:\usr\lib\icu\lib\icuuc.lib --icuheaders=C:\usr\lib\icu\include 
--icudatadir=C:\usr\local\icu\data
+or
+perl Configure.pl --cc=gcc --without-icu
 
 =item Intel C++
 


[PATCH] Minor spelling punctuation errors

2005-04-15 Thread Steven Philip Schubiger
I've corrected a few spelling and punctuation errors;
since I'm not done yet, I'd like to know, whether I should 
continue, or if the general consensus is, that it's mostly 
needless nitpicking.

Punctuation has only been corrected, if punctuation was already
partly present; if totally absent, I didn't mind, as punctuation
does not always add up to readability.

Steven

--- src/builtin.c   Fri Apr 15 14:24:06 2005
+++ src/builtin.c   Fri Apr 15 13:04:58 2005
@@ -4,7 +4,7 @@
 
 =head1 NAME
 
-src/builtin.c - Bultin Methods
+src/builtin.c - Builtin Methods
 
 =head1 SYNOPSIS
 
--- src/datatypes.c Fri Apr 15 14:24:27 2005
+++ src/datatypes.c Fri Apr 15 14:34:40 2005
@@ -1,6 +1,5 @@
 /*
-Copyright: (c) 2002 Leopold Toetsch [EMAIL PROTECTED]
-License:  Artistic/GPL, see README and LICENSES for details
+Copyright: (c) 2002-2004 The Perl Foundation.  All Rights Reserved.
 $Id: datatypes.c,v 1.11 2004/09/08 00:33:58 dan Exp $
 
 =head1 NAME
@@ -10,7 +9,7 @@
 =head1 DESCRIPTION
 
 The functions in this file are used in .ops files to access the C<enum>
-and C<string> constants for Parrot and native data types defined iin
+and C<string> constants for Parrot and native data types defined in
 F<include/parrot/datatypes.h>.
 
 =head2 Functions

--- src/debug.c Fri Apr 15 14:24:34 2005
+++ src/debug.c Fri Apr 15 13:30:21 2005
@@ -749,7 +749,7 @@
 PDB_line_t *line;
 long ln,i;
 
-/* If no line number was specified set it at the current line */
+/* If no line number was specified, set it at the current line */
 if (command && *command) {
 ln = atol(command);
 
@@ -944,7 +944,7 @@
 /* PDB_find_breakpoint
  *
  * Find breakpoint number N; returns NULL if the breakpoint doesn't
- * exist or if no breakpoint was specified
+ * exist or if no breakpoint was specified.
  *
  */
 /*
@@ -1470,8 +1470,8 @@
 dest[size++] = 'P';
 goto INTEGER;
 case PARROT_ARG_IC:
-/* If the opcode jumps and this is the last argument
-   means this is a label */
+/* If the opcode jumps and this is the last argument,
+   that means this is a label */
 if ((j == info->arg_count - 1) &&
 (info->jump & PARROT_JUMP_RELATIVE))
 {
@@ -1888,7 +1888,7 @@
 
 =over 4
 
-=item * This should take the line get an instruction, get the opcode for
+=item * This should take the line, get an instruction, get the opcode for
 that instruction and check that is the correct one.
 
 =item * Decide what to do with macros if anything.
@@ -2265,7 +2265,8 @@
 =item C<static void
 dump_string(Interp *interpreter, STRING* s)>
 
-Description.
+Dumps the buflen, flags, bufused, strlen, offset associated
+with a string and the string itself.
 
 =cut
 
--- src/dod.c   Fri Apr 15 14:24:42 2005
+++ src/dod.c   Fri Apr 15 13:41:18 2005
@@ -97,13 +97,13 @@
 ++arena_base->num_extended_PMCs;
 /*
  * XXX this basically invalidates the high-priority marking
- * of PMCs by putting all PMCs onto the front of the list
+ * of PMCs by putting all PMCs onto the front of the list.
  * The reason for this is the by far better cache locality
- * when aggregates and their contents are marked together
+ * when aggregates and their contents are marked together.
  *
  * To enable high priority marking again we should probably
  * use a second pointer chain, which is, when not empty,
- * processed first
+ * processed first.
  */
 if (tptr || hi_prio) {
 if (PMC_next_for_GC(tptr) == tptr) {
@@ -177,7 +177,7 @@
 if (*dod_flags & (PObj_is_special_PMC_FLAG << nm)) {
 /* All PMCs that need special treatment are handled here.
  * For normal PMCs, we don't touch the PMC memory itself
- * so that caches stay clean
+ * so that caches stay clean.
  */
 #if GC_VERBOSE
 if (PObj_report_TEST(obj)) {
@@ -210,7 +210,7 @@
 PObj_live_SET(obj);
 
 /* if object is a PMC and contains buffers or PMCs, then attach
- * the PMC to the chained mark list
+ * the PMC to the chained mark list.
  */
 if (PObj_is_special_PMC_TEST(obj)) {
 mark_special(interpreter, (PMC*) obj);
@@ -305,7 +305,7 @@
  * but t/library/dumper* fails w/o this marking.
  *
  * It seems that the Class PMC gets DODed - these should
- * get created as constant PMCs
+ * get created as constant PMCs.
  */
 for (i = 1; i < (unsigned int)enum_class_max; i++) {
 VTABLE *vtable;
@@ -404,10 +404,10 @@
  * First phase of mark is finished. Now if we are the owner
  * of a shared pool, we must run the mark phase of other
  * interpreters in our pool, so that live shared PMCs in that
- * interpreter are appended to our mark_ptrs chain
+ * interpreter are appended to our mark_ptrs chain.
  *
  * If there is a count of 

[SVN ci] MMD 24 - add converted

2005-04-15 Thread Leopold Toetsch
The MMD subroutines for add are done.

* removed all mathematical functions from Tcl scalars - all is inherited now

I forgot to mention in MMD 23:

* If you have an overridden __add or __subtract function, either defined 
as @MULTI or registered via mmdvtregister, these functions must now 
return the destination PMC. For not yet converted MMD infix operations, 
the return result is ignored, but returning it does no harm either.

leo


Re: New language: Parrot Common Lisp

2005-04-15 Thread Chip Salzenberg
According to Cory Spencer:
 I'd like to announce the creation of the Parrot Common Lisp project

Excellent!

   * It's not a compiler yet, although I've got plans for that down the
 road.

(declare (type PerlString s)) ?  :-)
-- 
Chip Salzenberg- a.k.a. -[EMAIL PROTECTED]
 Open Source is not an excuse to write fun code
then leave the actual work to others.


MMD 25 - multiply

2005-04-15 Thread Leopold Toetsch
One more, and my fingers & brain are getting tired of these changes.
If someone wants to continue (and complete it during night here ;-), 
it's a simple job:

1) vtable.tbl
   - change existing signature of next infix operation
   - add inplace variant directly below it
2) imcc/parser_util.c:is_infix()
   - add the compare case for the MMD
3) make realclean; perl Configure.pl ... && make -s
4) fix all compiler errors in classes and dynclasses by looking at 
already converted functions and adding the inplace variants

4a) remove code from dynclasses/py*.pmc, if it's the same as the Parrot 
core base class, or adapt code

5) make test
6) svn ci

Thanks,
leo


Re: $*CWD instead of chdir() and cwd()

2005-04-15 Thread Larry Wall
On Fri, Apr 15, 2005 at 03:11:59AM -0700, Michael G Schwern wrote:
: Error handling is simple, a failed chdir returns undef and sets errno.
: 
:   $CWD = $dir err die "Can't chdir to $dir: $!";

Offhand, I guess my main semantic problem with it is that if a chdir
fails, you aren't in an undefined location, which the new value of $CWD
would seem to indicate.  You're just where you were.  Then the user
either has to remember that, or there still has to be some other
means of finding out the real location.

The other problem with it is the fact that people will assign relative
paths to it and expect to get the relative path back out instead
of the absolute path.

: I encourage Perl 6 to adapt $*CWD similar to File::chdir and simply eliminate
: chdir() and cwd().  They're just an unlocalizable store and fetch for global
: data.

Your assumption there is a bit inaccurate--in P6 you are allowed to
temporize (localize) the effects of functions and methods that are
prepared to deal with it.  However, I agree that it's nice to have an
easily interpolatable value.  So I think I'd rather see $CWD always
return the current absolute path even after failure, and

temp chdir($dir) err fail "Can't chdir to $dir: $!";

be made to work as a temporizable function at some point, via the TEMP
mechanism described in A4.

Larry


Truely temporary variables

2005-04-15 Thread Aaron Sherman
Among the various ways of declaring variables, will Perl 6 have a way to
say, "this variable is highly temporary, and may be re-declared within
the same scope, or in a nested scope without concern"? I often find
myself doing:

my $sql = q{...};
...do some DB stuff...
my $sql = q{...};
...do more DB stuff...

This of course results in re-defining $sql, so I take out the second
"my", but then at some point I remove the first one, and strict chews me
out over not declaring $sql, so I make it "my" again.

This is a cycle I've repeated with dozens of variations on more
occasions than I care to (could?) count.

What I'd really like to say is:

throwawaytmpvar $sql = q{...};
throwawaytmpvar $sql = q{...};

without problems. Of course, throwawaytmpvar is a bit long, but you
get the idea.

It should probably be illegal to:

throwawaytmpvar $sql = q{...};
my $sql = q{...}; # Error: temporary became normal lexical

or for that matter even give it a new type:

throwawaytmpvar int $i = 0;
throwawaytmpvar str $i = "oops"; # Error: redefinition of type

There might be other assumptions that this implies. For example, it
might be considered always thread-private and might be required to be a
core, unboxed type. These extra assumptions are only worth it if they
enhance the optimization possibilities surrounding such a value.

-- 
Aaron Sherman [EMAIL PROTECTED]
Senior Systems Engineer and Toolsmith
It's the sound of a satellite saying, 'get me down!' -Shriekback




Re: Macros [was: Whither use English?]

2005-04-15 Thread Larry Wall
On Fri, Apr 15, 2005 at 12:45:14PM +1200, Sam Vilain wrote:
: Larry Wall wrote:
:  Well, only if you stick to a standard dialect.  As soon as you start
:  defining your own macros, it gets a little trickier.
: 
: Interesting, I hadn't considered that.
: 
: Having a quick browse through some of the discussions about macros, many
: of the macros I saw[*] looked something like they could be conceptualised
: as referring to the part of the AST where they were defined.
: 
: ie, making the AST more of an Abstract Syntax Graph.  And macros like
: 'free' (ie, stack frame and scope-less) subs, with only the complication
: of variable binding.  The ability to have recursive macros would then
: relate to this graph-ness.

That is one variety of macro.

: What are the shortcomings of this view of macros, as 'smart' (symbol
: binding) AST shortcuts?

The biggest problem with smart things is they're harder for not-so-smart
people to understand.

: The ability to know exactly what source corresponds to a given point on
: the AST, as well as knowing straight after parse time (save for string
: eval, of course) what each token in the source stream relates to is one
: thing that I'm aiming to have work with Perldoc.  I'm hoping this will
: assist I18N efforts and other uses like smart editors.

Yes, that's an important quality for many kinds of tools, whether
documentation, debugging, or refactoring.

: By smart editors, I'm talking about something that uses Perl/PPI as its
: grammar parsing engine, and it highlights the code based on where each
: token in the source stream ended up on the AST.  This would work
: completely with source that munges grammars (assuming the grammars are
: working ;).  Then, use cases like performing L10N for display to non-
: English speakers would be 'easy'.  I can think of other side-benefits
: to such regularity of the language, such as allowing Programatica-
: style systems for visually identifying 'proof-carrying code' and
: 'testing certificates' (see http://xrl.us/programatica).

Glad you think it's 'easy'.  Maybe you should 'just do it' for us.  :-)

: macros that run at compile time, and insert strings back into the
: document source seem hackish and scary to these sorts of prospects.

We also allow (but discourage) textual substitution macros.  They're
essentially just lexically scoped source filters, and suffer the
same problems as source filters, except for the fact that you can
more easily limit the damage to a small patch of code.  The problem
is that the original patch of text has to be stored in the AST along
with the new chunk of AST generated by the reparse, and it's not at
all clear how a tool should handle that conflict.  It's better to only
parse once whenever possible, and just make sure the original text
remains attached to the appropriate place in the AST.  More basically,
it's usually better to cooperate with the parser than to lie to it.

: But then, one man's hackish and scary is another man's elegant
: simplicity, I guess.
: 
: * - in particular, messages like this:
: - http://xrl.us/fr78
: 
: but this one gives me a hint that there is more to the story... I
: don't grok the intent of 'is parsed'
: - http://xrl.us/fr8a

This is mostly talked about in the relevant Apocalypses, and maybe
the Synopses.  See dev.perl.org for more.

Larry


Re: Truely temporary variables

2005-04-15 Thread Juerd
Aaron Sherman skribis 2005-04-15 11:45 (-0400):
 What I'd really like to say is:
   throwawaytmpvar $sql = q{...};
   throwawaytmpvar $sql = q{...};

I like the idea and propose "a", aliased "an", for this.

 It should probably be illegal to:
   throwawaytmpvar $sql = q{...};
   my $sql = q{...}; # Error: temporary became normal lexical
 or for that matter even give it a new type:
   throwawaytmpvar int $i = 0;
   throwawaytmpvar str $i = "oops"; # Error: redefinition of type

Giving it a new type should be valid. That is, I think the variable is
more useful if the old one is thrown away and a new one is created. This
can perhaps be optimized by re-using the same thing if it has no
external references anymore.

In fact,

a Str $foo = $foo;

is a nice way to indicate that from now on, you don't care about its
numeric value anymore.

All in all, I think a|an can just be my without warnings and then do
what you want. 

Hm. Funny idea just occurred to me. What if something in ALLCAPS, or
better, just Ucfirst would disable all warnings for just that thing?

my $foo;
say $foo;  # warning about undef $foo
Say $foo;  # no warning

$closed_fh.print(Int($foo));  # just a warning about the closed fh

my $foo;   # warning about new $foo masking first
My $foo;   # no warning

If you think this looks much like PHP's @, you're right. It's not so bad
an idea, actually. The problem with PHP is that everything's a warning
and almost nothing actually dies.

No, Ucfirst it can't be, I think. And ALLCAPS is ugly. @ is taken (and
ugly). Suggestions?


Juerd
-- 
http://convolution.nl/maak_juerd_blij.html
http://convolution.nl/make_juerd_happy.html 
http://convolution.nl/gajigu_juerd_n.html


Re: Truely temporary variables

2005-04-15 Thread Larry Wall
On Fri, Apr 15, 2005 at 11:45:16AM -0400, Aaron Sherman wrote:
: Among the various ways of declaring variables, will Perl 6 have a way to
: say, this variable is highly temporary, and may be re-declared within
: the same scope, or in a nested scope without concern? I often find
: myself doing:
: 
:   my $sql = q{...};
:   ...do some DB stuff...
:   my $sql = q{...};
:   ...do more DB stuff...
: 
: This of course results in re-defining $sql, so I take out the second
: my, but then at some point I remove the first one, and strict chews me
: out over not declaring $sql, so I make it my again.
: 
: This is a cycle I've repeated with dozens of variations on more
: occasions than I care to (could?) count.

And at that point, why not just change it to this?

my $sql;
$sql = q{...};
...do some DB stuff...
$sql = q{...};
...do more DB stuff...

It seems to me that assignment does a pretty good job of clobbering a
variable's value without the need to redeclare the container.  If you
really want to program in a definitional paradigm that requires every
new definition to have a declaration, then you ought to be giving
different definitions different names, seems like, or putting each
of them into its own scope.  Or write yourself a macro.  Or just turn
off the redefinition warning...

It doesn't seem to rise to the level of a new keyword for me.

Larry


Re: Truely temporary variables

2005-04-15 Thread Larry Wall
On Fri, Apr 15, 2005 at 06:04:32PM +0200, Juerd wrote:
: No, Ucfirst it can't be, I think. And ALLCAPS is ugly. @ is taken (and
: ugly). Suggestions?

Maybe we could define an ok operator that suppresses only the
*first* warning produced by its argument(s).  Then if you get multiple
warnings, you at least get some indication that you've overgeneralized,
even if the wrong warning comes out.  Or maybe it only suppresses
the first warning till you get a second warning, and then it prints both.

Larry
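
The "suppress only the first warning" idea above can be sketched outside Perl 6. This is a minimal illustration in Python using the standard warnings module; the name ok and the helper noisy are hypothetical, and nothing here is a proposed Perl 6 implementation.

```python
import warnings

def ok(thunk):
    """Run thunk(); swallow the first warning it emits, re-issue the rest."""
    with warnings.catch_warnings(record=True) as caught:
        warnings.simplefilter("always")    # record every warning, no dedup
        result = thunk()
    for w in caught[1:]:                   # everything after the first one
        warnings.warn_explicit(w.message, w.category, w.filename, w.lineno)
    return result

def noisy():
    """Hypothetical helper that emits two warnings and returns a value."""
    warnings.warn("first")
    warnings.warn("second")
    return 42

answer = ok(noisy)   # "first" is dropped; "second" goes to the normal handler
```

The second variant mentioned above (print both once a second warning shows up) would only need the loop to start at caught[0] whenever len(caught) > 1.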


Re: Truely temporary variables

2005-04-15 Thread Patrick R. Michaud
On Fri, Apr 15, 2005 at 09:17:13AM -0700, Larry Wall wrote:
 On Fri, Apr 15, 2005 at 06:04:32PM +0200, Juerd wrote:
 : No, Ucfirst it can't be, I think. And ALLCAPS is ugly. @ is taken (and
 : ugly). Suggestions?
 
 Maybe we could define an ok operator that suppresses only the
 *first* warning produced by its argument(s).  Then if you get multiple
 warnings, you at least get some indication that you've overgeneralized,
 even if the wrong warning comes out.  Or maybe it only suppresses
 the first warning till you get a second warning, and then it prints both.

And after the third warning, it sends you to your room with no supper.

Pm


Re: should we change [^a-z] to -[a..z] instead of -[a-z]?

2005-04-15 Thread Rod Adams
David Wheeler wrote:
But the first person to write [a...] gets what's comin' to 'em.
Is that nothing (since '.' lt 'a'), or everything after 'a'?
-- Rod Adams



Re: [pugs] regexp bug?

2005-04-15 Thread Larry Wall
On Fri, Apr 15, 2005 at 12:56:14AM -0700, Mark A. Biggar wrote:
: Yes, the value 0xFFFF can be stored as either a 3-byte UTF-8 string or a 
: 2-byte UCS-2 value, but the Unicode standard specifically says that the 
: values 0xFFFF, 0xFFFE and 0xFEFF are NOT valid codepoints and should 
: never appear in a Unicode string.  0xFFFF is reserved for out-of-band 
: signaling (such as the -1 returned by getc()) and 0xFFFE and 0xFEFF are 
: specifically reserved for out-of-band marking of a UCS-2 file as being 
: either big-endian or little-endian, but are specifically not considered 
: part of the data.  chr() is currently defined to mean convert an int 
: value to a Unicode codepoint. That's why I said that chr(65535) should 
: return an exception, it's an argument error similar to sqrt(-1).

It has to at least be possible to Think Bad Thoughts in Perl.
It doesn't have to be the default, though.  But there has to be
some way of allowing illegal characters to be talked about, or
you can't write programs that talk about them.  It's like saying
it's okay to be an executioner as long as you don't kill anyone...

Larry
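
To make the byte-level versus codepoint-level distinction concrete, here is a small sketch in Python (not Perl 6, and not what Pugs does internally). It only shows that U+FFFF has a well-defined three-byte UTF-8 form, as claimed earlier in the thread, and that a program can still talk about the character as a unit. The helper is_noncharacter is an illustrative name, based on the Unicode definition of noncharacters.

```python
# Illustrative only: Python's chr/str, not Perl 6's chr() or Pugs' encoder.
NONCHAR = chr(0xFFFF)                  # U+FFFF, a Unicode noncharacter

utf8 = NONCHAR.encode("utf-8")         # byte-level view: 3 bytes
utf16 = NONCHAR.encode("utf-16-be")    # 2-byte view, comparable to UCS-2

def is_noncharacter(cp: int) -> bool:
    """True for U+FDD0..U+FDEF and the last two code points of each plane."""
    return 0xFDD0 <= cp <= 0xFDEF or (cp & 0xFFFE) == 0xFFFE

print(len(NONCHAR), utf8.hex(), utf16.hex(), is_noncharacter(ord(NONCHAR)))
# prints: 1 efbfbf ffff True
```

A chr() that raises by default could apply a check like is_noncharacter first, with an opt-out for programs that need to name such characters.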


Re: Parrot bytecode reentrancy

2005-04-15 Thread Nigel Sandever
15/04/2005 10:35:56, Leopold Toetsch [EMAIL PROTECTED] wrote:

Nigel Sandever [EMAIL PROTECTED] wrote:

 When a sub that closes over a variable

  my $closure = 0;
  sub do_something {
  return $closure++;
  }

 is called from two threads, do the threads share a single closure or
 each get their own separate closure?

AFAIK: the closure bytecode is shared, 

Great.

the Closure PMC with the lexical
pad is distinct. 

I think that makes perfect sense. No implicit sharing.

But that all isn't implemented yet.


Understood. I am being premature in thinking about this. 

But this is where I come unstuck. What would this mean/do when called from 2 
threads?

my $closure :shared = 0;
sub do_something {
return $closure++;
}

or this:

our $closure :shared = 0;
sub do_something {
return $closure++;
}

It struck me a while back that there is a contradiction in the idea of a shared 
'my' variable. 

I want to say lexical, but a var declared with 'our' is in some sense lexical. 

Where I am going is that shared implies global. Access can be constrained by 
requiring a lexical declaration using 'our', but 'my' variables should not be 
able to be marked 'shared'.

One nice thing that falls out of that is that no 'my' vars would ever be 
shared, which means they never require semaphore checks. That would mean that a 
non-threaded app running on a multi-threaded build of Parrot need never incur a 
penalty of semaphore checks if it always uses 'my'. *I think*?

In effect, all vars declared 'our' would be implicitly shared, (and would 
require semaphoring), removing the need for a 'shared' attribute. 

In P5, lexicals are already quicker than globals, so any additional penalty 
added to globals because of multithreading will not affect any single-threaded 
code that is striving for ultimate performance, because they would already be 
utilising lexicals. 

Equally, things like filehandles are inherently process-global in scope and 
therefore sharable between threads and require semaphore checks. 

I only throw this into the thought-pot because there seems to me to be a 
natural symmetry between the concept of 'global' and the concept of 'shared'.

I won't argue the case for this, but I thought that if I mention it, it might 
also make some sense to others when the time comes for this stuff to be 
designed and implemented.

 njs

leo


njs
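
The two readings discussed above (one closure shared by every thread, versus a distinct closure per thread) can be sketched in Python. This is only an analogy: Python closures are not Parrot Closure PMCs, and the lock stands in for the "semaphore check". All names below are hypothetical.

```python
import threading

def make_counter():
    count = 0                        # the closed-over variable
    lock = threading.Lock()          # stands in for the semaphore check
    def do_something():
        nonlocal count
        with lock:                   # needed only because count may be shared
            value = count
            count += 1
        return value
    return do_something

# Case 1: one closure shared by all threads (the ":shared" / "our" reading).
shared = make_counter()

def shared_worker():
    for _ in range(1000):
        shared()

# Case 2: each thread builds its own closure (the distinct-pad reading).
private_totals = [None] * 4

def private_worker(i):
    mine = make_counter()
    for _ in range(1000):
        mine()
    private_totals[i] = mine()

threads = [threading.Thread(target=shared_worker) for _ in range(4)]
threads += [threading.Thread(target=private_worker, args=(i,)) for i in range(4)]
for t in threads:
    t.start()
for t in threads:
    t.join()

shared_total = shared()
print(shared_total, private_totals)
```

In case 2 the lock is pure overhead, which is the cost the message above hopes 'my' variables could avoid.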







Re: Truely temporary variables

2005-04-15 Thread Rod Adams
Larry Wall wrote:
On Fri, Apr 15, 2005 at 06:04:32PM +0200, Juerd wrote:
: No, Ucfirst it can't be, I think. And ALLCAPS is ugly. @ is taken (and
: ugly). Suggestions?
Maybe we could define an ok operator that suppresses only the
*first* warning produced by its argument(s).  Then if you get multiple
warnings, you at least get some indication that you've overgeneralized,
even if the wrong warning comes out.  Or maybe it only suppresses
the first warning till you get a second warning, and then it prints both.
Wouldn't some form of trait make more sense:

   my $sql = '...' is ok;

Only trick would be getting "is ok" to bind to the thing in the 
preceding expression that produces the warning the programmer was 
expecting. Certainly

   {my $sql = '...'} is ok;

gets the point across that warnings are somewhat ignorable for the block, 
but that starts getting to look a lot like

   {my $sql = '...'} CATCH {default};

Except that one is run-time, the other compile-time.

So one could interpret this thread as a cry for a compile-time exception 
handler. I see some interesting uses for this in conjunction with 
C<eval>, but I doubt I'm seeing the whole story.

-- Rod Adams


Re: Truely temporary variables

2005-04-15 Thread Juerd
Rod Adams skribis 2005-04-15 11:53 (-0500):
 Wouldn't some form of trait make more sense:
my $sql = '...' is ok;

Depends. A unary ok operator would let you pinpoint very easily,
*without* using parens:

ok $fh.print($foo); # no warnings about print (closed fh?)
# but warning about undef $foo remains

$fh.print(ok $foo);  # warn about printing thingies, but not about
 # undef $foo

say $foo, $bar, ok $baz, $quux;  # complain about everything, except
 # what has to do with $baz

my $foo;
ok my $foo = "foo $bar baz";  # warn about $bar, but not the masking
my $foo = ok "foo $bar baz";  # other way around


Juerd
-- 
http://convolution.nl/maak_juerd_blij.html
http://convolution.nl/make_juerd_happy.html 
http://convolution.nl/gajigu_juerd_n.html


Re: Truely temporary variables

2005-04-15 Thread Luke Palmer
Aaron Sherman writes:
 Among the various ways of declaring variables, will Perl 6 have a way to
 say, this variable is highly temporary, and may be re-declared within
 the same scope, or in a nested scope without concern? I often find
 myself doing:
 
   my $sql = q{...};
   ...do some DB stuff...
   my $sql = q{...};
   ...do more DB stuff...

There's a pretty common idiom for this:

{
my $sql = q{...};
# ... do some DB stuff ...
}
{
my $sql = q{...};
# ... do more DB stuff ...
}

You see it in test suites all over the CPANdom.  

Luke


Re: [pugs] regexp bug?

2005-04-15 Thread mark . a . biggar

Isn't that what the difference between byte-level and codepoint-level access to 
strings is all about?  If you want to work with values that are illegal 
codepoints then you should be working at the byte-level not the 
codepoint-level, at least by default.

--
Mark Biggar
[EMAIL PROTECTED]
[EMAIL PROTECTED]
[EMAIL PROTECTED]


 On Fri, Apr 15, 2005 at 12:56:14AM -0700, Mark A. Biggar wrote:
 : Yes, the value 0xFFFF can be stored as either a 3-byte UTF-8 string or a 
 : 2-byte UCS-2 value, but the Unicode standard specifically says that the 
 : values 0xFFFF, 0xFFFE and 0xFEFF are NOT valid codepoints and should 
 : never appear in a Unicode string.  0xFFFF is reserved for out-of-band 
 : signaling (such as the -1 returned by getc()) and 0xFFFE and 0xFEFF are 
 : specifically reserved for out-of-band marking of a UCS-2 file as being 
 : either big-endian or little-endian, but are specifically not considered 
 : part of the data.  chr() is currently defined to mean convert an int 
 : value to a Unicode codepoint. That's why I said that chr(65535) should 
 : return an exception, it's an argument error similar to sqrt(-1).
 
 It has to at least be possible to Think Bad Thoughts in Perl.
 It doesn't have to be the default, though.  But there has to be
 some way of allowing illegal characters to be talked about, or
 you can't write programs that talk about them.  It's like saying
 it's okay to be an executioner as long as you don't kill anyone...
 
 Larry


Re: [perl #35000] [PATCH] README.win32 icu 3.2

2005-04-15 Thread chromatic
On Fri, 2005-04-15 at 05:38 -0700, François PERRAD wrote:

 small mistake in [perl #34986] :
 with ICU 3.2, the library icudata.lib is renamed icudt.lib.

Thanks, applied.

-- c



Re: [perl #34984] [PATCH] Fix segfault with const

2005-04-15 Thread Nick Glencross
Leopold Toetsch via RT wrote:
Nick Glencross [EMAIL PROTECTED] wrote:
 

This patch fixes a problem which can occur in this example:
   

 

.sub test
   .const float a = 12
   print a
   print_newline
.end
   

Ah yep.
 

+if (t != 'P' && t != val->set)
+IMCC_fataly(interp, E_TypeError,
+"const types do not match");
   

I think, we could be a bit more graceful here for I/N mismatch and set
for the above case the constant val->set to 'N'.
 

Yes, I was planning to do something a bit more thorough, but fixing the 
immediate segfault was the first challenge.

I've looked over the code a bit more now, and see that the value is 
still stored textually at this point, so setting the type as you've said 
is pretty simple. It's a shame that strings can be in a number of 
different formats, and probably quoted, preventing this from working for 
them too.

Anyhow, here's a new patch for you to review, and perhaps apply...?
Cheers,
Nick
Index: imcc/symreg.c
===
--- imcc/symreg.c   (revision 7843)
+++ imcc/symreg.c   (working copy)
@@ -307,6 +307,7 @@
 INS(interp, unit, "set_p_pc", "", r, 2, 0, 1);
 return NULL;
 }
+
 /* Makes a new identifier constant with value val */
 SymReg *
 mk_const_ident(Interp *interp,
@@ -314,6 +315,16 @@
 {
 SymReg *r;
 
+// Forbid assigning a string to anything other than a string const
+// for now
+if (t != 'S' && val->set == 'S')
+IMCC_fataly(interp, E_TypeError,
+"bad const initialisation");
+
+// Cast value to const type
+if (t == 'S' || t == 'I')
+val->set = t;
+
 if (global) {
 if (t == 'P') {
 IMCC_fataly(interp, E_SyntaxError,


Statement modifier scope

2005-04-15 Thread Paul Seamons
The following chunks behave the same in Perl 5.6 as in Perl 5.8.  Notice the 
output of branching statement modifiers vs. looping statement modifiers. 

perl -e '$f=1; {local $f=2; print $f} print " - $f\n"'
  # prints 2 - 1

perl -e '$f=1; {local $f=2 if 1; print $f} print " - $f\n"'
  # prints 2 - 1

perl -e '$f=1; {local $f=2 unless 0; print $f} print " - $f\n"'
  # prints 2 - 1

perl -e '$f=1; {local $f=2 for 1; print $f} print " - $f\n"'
  # prints 1 - 1

perl -e '$f=1; {local $f=2 until 1; print $f} print " - $f\n"'
  # prints 1 - 1

perl -e '$f=1; {local $f=2 while !$n++; print $f} print " - $f\n"'
  # prints 1 - 1

It appears that there is an implicit block around statements with looping 
statement modifiers.  perlsyn does state that the control variables of the 
for statement modifier are locally scoped, but doesn't really mention that 
the entire statement is as well.  I'm not sure if this was in the original 
design spec or if it flowed out of the implementation details, but either way 
it seems to represent an inconsistency in the treatment of locality with 
regards to braces (ok I guess there are several in Perl5).

So the question is, what will it be like for Perl6?  It would seem that all of 
the following should hold true because of scoping being tied to the blocks.

pugs -e 'our $f=1; {temp $f=2; print $f}; say " - $f"'
   # should print 2 - 1 (currently prints 2 - 2 - but that is a compiler issue)

pugs -e 'our $f=1; {temp $f=2 if 1; print $f}; say " - $f"'
   # should print 2 - 1 (currently dies with parse error)

pugs -e 'our $f=1; {temp $f=2 for 1; print $f}; say " - $f"'
   # hopefully prints 2 - 1 (currently dies with parse error)

As a side note - pugs does work with:

pugs -e 'our $f=1; {$f=2 for 1; print $f}; say " - $f"'
  # prints 2 - 2 (as it should.  It seems that statement modifiers don't
  # currently work with declarations - but that is a compiler issue - not a
  # language issue.)

I have wanted to do this in Perl5 but couldn't, and would love to be able to do 
it in Perl6:

my %h = <a 1 b 2 c 3>;
{
  temp %h{$_} ++ for %h.keys;
  %h.say; # values are incremented still
}
%h.say; # values are back to original values

Paul


Re: New language: Parrot Common Lisp

2005-04-15 Thread Cory Spencer

 (If anyone is able to track down aforementioned DOD/GC problems,
 you'll earn my eternal gratitude.)
Can you please provide a code snippet that exhibits the error.
Just running the program gives me errors on both Linux/x86 and OS X. 
Running with GC disabled works fine.

On OS X with GC enabled:
forge:~/svn/parrot-lisp/trunk$ parrot lisp.pbc
Can't find method '__set_string_native' for object 'LispSymbol'
On OS X with GC disabled:
forge:~/svn/parrot-lisp/trunk$ parrot -G lisp.pbc
-
On Linux with GC enabled:
anvil:~/svn/parrot-lisp/trunk$ parrot lisp.pbc
Can't find method '__set_string_native' for object 'LispSymbol'
On Linux with GC disabled:
anvil:~/svn/parrot-lisp/trunk$ parrot lisp.pbc
-
This is on the Parrot checked out of Subversion this morning (revision 
7846).  Which OS/build number were you using?

-c


Re: Statement modifier scope

2005-04-15 Thread Juerd
Paul Seamons skribis 2005-04-15 11:50 (-0600):
 my %h = <a 1 b 2 c 3>;
 {
   temp %h{$_} ++ for %h.keys;

Just make that two lines. Is that so bad?

temp %h;
%h.values »++;

   %h.say; # values are incremented still
 }
 %h.say; # values are back to original values


Juerd
-- 
http://convolution.nl/maak_juerd_blij.html
http://convolution.nl/make_juerd_happy.html 
http://convolution.nl/gajigu_juerd_n.html


Re: Truely temporary variables

2005-04-15 Thread Brent 'Dax' Royal-Gordon
Aaron Sherman [EMAIL PROTECTED] wrote:
 What I'd really like to say is:

 throwawaytmpvar $sql = q{...};
 throwawaytmpvar $sql = q{...};

Anything wrong with:

   my $sql = q{...};
   temp $sql = q{...};
   temp $sql = q{...};

(Assuming C<temp> is made to work on lexicals, of course.)

-- 
Brent 'Dax' Royal-Gordon [EMAIL PROTECTED]
Perl and Parrot hacker

I used to have a life, but I liked mail-reading so much better.


Re: Statement modifier scope

2005-04-15 Thread Paul Seamons
On Friday 15 April 2005 11:57 am, Juerd wrote:
 Paul Seamons skribis 2005-04-15 11:50 (-0600):
  my %h = <a 1 b 2 c 3>;
  {
temp %h{$_} ++ for %h.keys;

 Just make that two lines. Is that so bad?

 temp %h;
 %h.values »++;


For the given example, your code fits perfectly.  A more common case I have 
had to deal with is more like this:

my %h = <a 1 b 2 c 3>;
my %other = <a one b two>;
{
  temp %h{$_} = %other{$_} for %other.keys;
  %h.say;
}

Ideally that example would print
aone
btwo
c3

It isn't possible any more to do something like
{
  temp %h = (%h, %other);
}
because that second %h is now hidden from scope (I forget which Apocalypse or 
mail thread I saw it in).  Plus for huge hashes it just isn't very efficient.

I'd like to temporarily put the values of one hash into another (without 
wiping out all of the modified hash's values like temp %h would do), run 
some code, leave scope and have the modified hash go back to normal.  In 
perl5 I've had to implement that programmatically by saving existing values 
into yet another hash - running the code - putting them back.  It works but 
there are all sorts of issues with defined vs exists.

So yes - your code fits the limited example I gave.  But I'd still like the 
other item to work.

Paul
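
The save / run / restore workaround described above, including the defined-vs-exists wrinkle, can be sketched as follows. This is a Python illustration under stated assumptions, not Perl 5 or Perl 6 code: temp_items and _MISSING are hypothetical names, and a sentinel object plays the role of "key did not exist" as distinct from "key existed with an undefined value".

```python
from contextlib import contextmanager

_MISSING = object()   # marks keys that did not exist before the override

@contextmanager
def temp_items(target, overrides):
    """Temporarily overlay `overrides` onto `target`; undo on scope exit."""
    saved = {k: target.get(k, _MISSING) for k in overrides}
    target.update(overrides)
    try:
        yield target
    finally:
        for k, old in saved.items():
            if old is _MISSING:
                target.pop(k, None)   # key was absent: remove it again
            else:
                target[k] = old       # key existed (even if None): restore

h = {"a": 1, "b": 2, "c": 3}
other = {"a": "one", "b": "two", "d": None}

with temp_items(h, other):
    inside = dict(h)    # a/b overridden, c untouched, d temporarily present
after = dict(h)         # back to the original three entries
```

Only the overridden keys are copied, so the cost depends on the size of the override hash rather than on the size of the hash being modified.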


Re: Truely temporary variables

2005-04-15 Thread Juerd
Brent 'Dax' Royal-Gordon skribis 2005-04-15 11:15 (-0700):
 Anything wrong with:

Yes, moving things around breaks it, as does removing the first. There
is no real dependency on the first $sql and it'd be great if declaration
wouldn't add one.

   temp $sql = q{...};
   my $sql = q{...};
   temp $sql = q{...};


Juerd
-- 
http://convolution.nl/maak_juerd_blij.html
http://convolution.nl/make_juerd_happy.html 
http://convolution.nl/gajigu_juerd_n.html


Re: [perl #34984] [PATCH] Fix segfault with const

2005-04-15 Thread Nick Glencross
Leopold Toetsch via RT wrote:

I think, we could be a bit more graceful here for I/N mismatch and set
for the above case the constant val->set to 'N'.
   

Let me redo that...  I've just sent the wrong attachment which had a 
typo in it ...

[This should really address rare but possible Unicode strings, shouldn't 
it?]

Nick
Index: imcc/symreg.c
===
--- imcc/symreg.c   (revision 7843)
+++ imcc/symreg.c   (working copy)
@@ -307,6 +307,7 @@
 INS(interp, unit, "set_p_pc", "", r, 2, 0, 1);
 return NULL;
 }
+
 /* Makes a new identifier constant with value val */
 SymReg *
 mk_const_ident(Interp *interp,
@@ -314,6 +315,16 @@
 {
 SymReg *r;
 
+// Forbid assigning a string to anything other than a string const
+// for now
+if (t != 'S' && val->set == 'S')
+IMCC_fataly(interp, E_TypeError,
+"bad const initialisation");
+
+// Cast value to const type
+if (t == 'N' || t == 'I')
+val->set = t;
+
 if (global) {
 if (t == 'P') {
 IMCC_fataly(interp, E_SyntaxError,


Re: Truely temporary variables

2005-04-15 Thread chromatic
On Fri, 2005-04-15 at 11:21 -0500, Patrick R. Michaud wrote:

 On Fri, Apr 15, 2005 at 09:17:13AM -0700, Larry Wall wrote:

  Maybe we could define an ok operator that suppresses only the
  *first* warning produced by its argument(s).  Then if you get multiple
  warnings, you at least get some indication that you've overgeneralized,
  even if the wrong warning comes out.  Or maybe it only suppresses
  the first warning till you get a second warning, and then it prints both.

 And after the third warning, it sends you to your room with no supper.

Talk about a strict permission system.  If that's the case, I want a
"I'm the human here, darnit!" option to bypass it.

-- c



Re: Statement modifier scope

2005-04-15 Thread Juerd
Paul Seamons skribis 2005-04-15 12:16 (-0600):
 For the given example, your code fits perfectly.  A more common case I have 
 had to deal with is more like this:
 my %h = <a 1 b 2 c 3>;
 my %other = <a one b two>;
 {
   temp %h{$_} = %other{$_} for %other.keys;

Either

temp %h;
%h{$_} = %other{$_} for %other.keys;

or

temp %h;
%h{ %other.keys } = %other.values;

or even

temp %h{ %other.keys } = %other.values;

should work well already?
 
   %h.say;
 }

I think it's hard to find an example that can't easily be rewritten as
something that already works. Gather/take solves most.


Juerd
-- 
http://convolution.nl/maak_juerd_blij.html
http://convolution.nl/make_juerd_happy.html 
http://convolution.nl/gajigu_juerd_n.html


Re: Statement modifier scope

2005-04-15 Thread Paul Seamons

 temp %h;
 %h{ %other.keys } = %other.values;

 or even

 temp %h{ %other.keys } = %other.values;

 should work well already?

Almost - but not quite.

In Perl5
perl -MData::Dumper -e '%h=qw(a 1 b 2); {local %h; $h{a}="one"; print Dumper 
\%h} print Dumper \%h;'
$VAR1 = {
  'a' => 'one'
};
$VAR1 = {
  'a' => '1',
  'b' => '2'
};

I'm imagining the behavior would be the same with Perl6.  Notice that 'b' is 
gone in the first print.  I only want to temporarily modify some values 
(the ones from the %other hash).  I don't want the contents of the %h to be 
identical to %other - I already have %other.

So in Perl5 this does work:

perl -MData::Dumper -e '%h=qw(a 1 b 2); {local %h=%h; $h{a}="one"; print 
Dumper \%h} print Dumper \%h;'
$VAR1 = {
  'a' => 'one',
  'b' => '2'
};
$VAR1 = {
  'a' => '1',
  'b' => '2'
};
But this won't work in Perl6 (temp $var = $var doesn't work in Perl6) and 
again it may be fine for small hashes with only a little data - but for a 
huge hash (1000+ keys) it is very inefficient.

This is good discussion - but it isn't the real focus of the original message 
in the thread - the question is about the local (temp) scoping of looping 
statement modifiers in Perl6.

Though, I do appreciate your trying to get my example working as is.

Paul


Re: Truely temporary variables

2005-04-15 Thread Aaron Sherman
On Fri, 2005-04-15 at 13:10, Luke Palmer wrote:
 Aaron Sherman writes:
  Among the various ways of declaring variables, will Perl 6 have a way to
  say, this variable is highly temporary, and may be re-declared within
  the same scope, or in a nested scope without concern? I often find
  myself doing:
  
  my $sql = q{...};
  ...do some DB stuff...
  my $sql = q{...};
  ...do more DB stuff...
 
 There's a pretty common idiom for this:
 
 {
 my $sql = q{...};
 # ... do some DB stuff ...
 }
 {
 my $sql = q{...};
 # ... do more DB stuff ...
 }
 
 You see it in test suites all over the CPANdom.  

You see it all over my code too... it is always possible to simulate
many kinds of trickery that way. For example, if you want to write a
loop with a counter that is visible one statement after the loop
completes, you can say:

{
my int $i;
loop $i=0;...;$i++ {
...
}
do_stuff($i);
}

But isn't:

loop my int $i=0;...;$i++ {
...;
LAST{do_stuff($i)}
}

much cleaner? I think so, if for no other reason than it explicitly says
what it means. That's one of the reasons that LAST is so handy.

So too would my mythical declarator prevent a few steps that are
otherwise quite easy, but cumbersome in the large.

Whatever, though. It was a simple suggestion, and seems to have sparked
FAR more controversy than the small win warrants.

-- 
Aaron Sherman [EMAIL PROTECTED]
Senior Systems Engineer and Toolsmith
It's the sound of a satellite saying, 'get me down!' -Shriekback




Re: Statement modifier scope

2005-04-15 Thread Paul Seamons
On Friday 15 April 2005 12:28 pm, Juerd wrote:
 temp %h{ %other.keys } = %other.values;

Oops missed that - I like that for solving this particular problem.  It does 
even work in Perl5:

perl -MData::Dumper -e '%h=qw(a 1 b 2); {local @h{qw(a b)}=("one","two"); 
print Dumper \%h} print Dumper \%h'
$VAR1 = {
  'a' => 'one',
  'b' => 'two'
};
$VAR1 = {
  'a' => '1',
  'b' => '2'
};

I had never thought to do a hash slice in a local.  That is great!!!

Thank you very much!  Wish I'd known about that three years ago.

But, it still doesn't answer the original question about scoping in the 
looping statement modifiers.

Paul


Re: [perl #34984] [PATCH] Fix segfault with const

2005-04-15 Thread Nicholas Clark
On Fri, Apr 15, 2005 at 07:26:56PM +0100, Nick Glencross wrote:

 +// Forbid assigning a string to anything other than a string const
 +// for now

In future, please don't use C99 comments.

(apart from that, I don't have the knowledge to comment on this patch)

Nicholas Clark


Re: [pugs] regexp bug?

2005-04-15 Thread Larry Wall
On Fri, Apr 15, 2005 at 05:12:54PM +, [EMAIL PROTECTED] wrote:

: Isn't that what the difference between byte-level and codepoint-level
: access to strings is all about.  If you want to work with values that
: are illegal codepoints then you should be working at the byte-level
: not the codepoint-level, at least by default.

Sure, but there's no guarantee you have access to a lower level,
depending on the interface presented by the object in question, and
you shouldn't probably have to know that anyway, if there's a useful
abstraction level at which "illegal character" means something as
a unit to the higher level.  The fact is that U+FFFF is an illegal
character regardless of the encoding, and I'd like to be able to
talk about it as a character, without having to know whether it's
an illegal UTF-8 byte sequence, or an illegal UTF-16 byte sequence,
or a 256-bit integer stored somewhere that you just aren't allowed
to think about certain values of.

In short, legal Unicode strings should probably be viewed as a
constrained subtype of strings, not as a storage type.  I know you've
known Ada from its infancy. :-)  Perl 6 makes the same distinction, and
can presumably get at the unconstrained type for any constrained type.
So if you hand me a Unicode string with arbitrary value restrictions,
there had better be a way to view that string without the arbitrary
restrictions.  You need to be able to determine somehow that types
Even or Odd have a storage class of type Int.

Larry
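
The "constrained subtype with a recoverable storage type" idea above can be illustrated with the Even example. This is a loose Python analogy, not Perl 6 subtype semantics; storage_type is a hypothetical helper that just walks the class hierarchy to the first built-in base.

```python
class Even(int):
    """An int constrained to even values; the storage type is still int."""
    def __new__(cls, value):
        if value % 2:
            raise ValueError(f"{value} is not even")
        return super().__new__(cls, value)

def storage_type(obj):
    """Return the first built-in base class, i.e. the unconstrained type."""
    for klass in type(obj).__mro__:
        if klass.__module__ == "builtins":
            return klass

four = Even(4)
raw = int(four)              # the same value, viewed without the constraint
print(storage_type(four), raw + 1)
```

By the same analogy, a "legal Unicode string" type would reject U+FFFF on construction while the underlying string type could still hold it.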


Re: should we change [^a-z] to -[a..z] instead of -[a-z]?

2005-04-15 Thread Larry Wall
On Fri, Apr 15, 2005 at 11:28:31AM -0500, Rod Adams wrote:
: David Wheeler wrote:
: 
: But the first person to write [a...] gets what's comin' to 'em.
: 
: Is that nothing (since '.' lt 'a'), or everything after 'a'?

Might as well make it everything after 'a' for consistency.  One could
also view the last dot as a special version of the ordinary any dot,
and read it "a to whatever".

Larry


Re: Statement modifier scope

2005-04-15 Thread Juerd
Paul Seamons skribis 2005-04-15 12:41 (-0600):
 In Perl5
 perl -MData::Dumper -e '%h=qw(a 1 b 2); {local %h; $h{a}="one"; print Dumper 
 \%h} print Dumper \%h;'
 $VAR1 = {
   'a' => 'one'
 };
 $VAR1 = {
   'a' => '1',
   'b' => '2'
 };
 I'm imagining the behavior would be the same with Perl6.  Notice that 'b' is 

I'm imagining it will be different, as I expect temp to not hide the old
thing. I'm not sure it will.


Juerd
-- 
http://convolution.nl/maak_juerd_blij.html
http://convolution.nl/make_juerd_happy.html 
http://convolution.nl/gajigu_juerd_n.html


Re: Statement modifier scope

2005-04-15 Thread Larry Wall
I would like to get rid of all those implicit scopes.  The only
exception would be that any topicalizing modifier allocates a private
lexical $_ scoped to just that statement.  But dynamic scoping may
happen only at explicit block boundaries.

I can see the argument for the other side, where any deferred
code is treated as a kind of closure regardless of whether there are
explicit curlies around it.  That would solve certain problems like
defining the scopes of the lexicals in

$a = $x ?? my $y :: my $z;

or the infamous

my $x = 1 if $y;

to extend only to the subexpressions in which they find themselves.
But it's not what naive users expect, and it's hard to explain, so I
think we should stick with explicit curlies for most of our scoping
needs, even if it means letting certain variables hang around undefined
because their initialization was never executed.

Larry


Various questions

2005-04-15 Thread Philip Taylor
I've been working on a C-to-Parrot compiler (actually an IMC backend
for the LCC compiler), tentatively named Carrot, over the past week. It
can currently do some reasonably useful things, like running the Cola
compiler (with only a very small amount of cheating), but it has raised 
a few queries:

* I can usually handle unsigned numbers by pretending they're signed and 
using 'I' registers, but some things appear to be awkward without new 
ops - in particular, div and cmod, and le/lt/ge/gt comparisons. (As far 
as I can tell, those are the only ones C would need; everything else 
should work fine with the signed variants).

I've added divu/leu/etc ops to math.ops/cmp.ops (and just made them cast 
their operands into UINTVALs) - is that a reasonable thing to do? Would 
they be better in a new .ops file?

* Should there be an 'isatty' op/method? (or is there something else 
that isatty(fileno(file)) (which Cola's lexer uses) should do, in 
order to return a reasonable answer?)

* Is it possible to merge PBC files together, like load_bytecode but at 
compile-time?

The compiler converts .c to .pbc (via .imc), then the linker just 
creates a program full of load_bytecode, so the actual linking gets done 
at run-time, which isn't very nice when you try moving/deleting one of 
the .pbcs. (And lcc always deletes the .pbcs, since it assumes they're 
temporary files.)

* How efficient are PMC method calls? (And are performance concerns 
documented anywhere, like "op calls are roughly n times faster than 
methods", so compiler-writers could avoid implementing things in stupid 
ways, or is it too early to be doing that?)

I've been using [gs]et_integer_keyed_int on a PMC to allow pointer 
access. Since it reads whole ints, it probably crashes unnecessarily 
when e.g. reading chars at unlucky addresses - but IMC code like "val = 
mem.read_i1(ptr)" feels unpleasantly inefficient, particularly in 
string-processing loops.

Hmm... Should I just accept that C-on-Parrot will always be relatively 
slow, since its concept of memory is slightly incompatible with 
Parrot's, and anybody who wants speed can use a native C compiler, so I 
can stop worrying about it? :-)

Thanks,
--
Philip Taylor
[EMAIL PROTECTED]
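
On the first question above (unsigned numbers held in signed 'I' registers), the following Python sketch shows why division and ordering comparisons need unsigned variants while addition does not. It assumes a 32-bit register width purely for illustration (Parrot's INTVAL may be wider), and divu/ltu here are plain functions, not the ops added to math.ops/cmp.ops.

```python
BITS = 32                        # assumed width; Parrot's INTVAL may differ
MASK = (1 << BITS) - 1

def to_unsigned(x):
    """Reinterpret a signed BITS-wide value as unsigned."""
    return x & MASK

def to_signed(x):
    """Reinterpret the low BITS bits of x as a two's-complement value."""
    x &= MASK
    return x - (1 << BITS) if x >> (BITS - 1) else x

def divu(a, b):
    """Unsigned division on values stored as signed integers."""
    return to_signed(to_unsigned(a) // to_unsigned(b))

def ltu(a, b):
    """Unsigned less-than on values stored as signed integers."""
    return to_unsigned(a) < to_unsigned(b)

def add(a, b):
    """Addition gives the same bit pattern signed or unsigned."""
    return to_signed(a + b)

# -1 is the bit pattern 0xFFFFFFFF, i.e. 4294967295 when read as unsigned.
print(divu(-1, 2), ltu(-1, 1), add(-1, 1))
# prints: 2147483647 False 0
```

Signed division would give 0 for -1 / 2 in C, and a signed comparison would say -1 < 1, which is why those ops cannot simply be reused.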


Re: Statement modifier scope

2005-04-15 Thread Paul Seamons
 I'm imagining it will be different, as I expect temp to not hide the old
 thing. I'm not sure it will.

That is another good question.  I just searched through the S and A's and 
couldn't find if temp will blank it out.  I am thinking it will act like 
local.  Each of the declarations my, our and local currently set the value to 
undefined (unless set = to something).  I imagine that temp and let will 
behave the same.

In which case local %h; and let %h would each allocate a new, empty variable
in addition to the original variable (which is hidden but still retains its
contents).

Paul


Re: [RFC] some doubtable MMDs?

2005-04-15 Thread Larry Wall
On Fri, Apr 15, 2005 at 02:38:36PM +0200, Leopold Toetsch wrote:
: I'm not quite sure, but it seems that some of the MMD functions may 
: better be vtable methods:
: 
: - bitwise_sh[rl]*   shift by anything other than int?
: - bitwise_lsr       is missing generally
: 
: or even just a plain opcode only:
: 
: - logical_{or,and,xor}  return a PMC depending on the boolean value
: 
: What are HLLs expecting of these infix operations?

Perl 6 tends to distinguish these as different operators, though Perl 5
did overload the bitwise ops on both strings and numbers, which newbies
found confusing in ambiguous cases, which is why we changed it.

: OTOH it might be useful that the current get_type_keyed operations 
: (postcircumfix:[]) become MMD subroutines:
: 
:   Px = Py[Pz]    Pz = String, Int, Key, Slice, ...

At the moment, the Perl 6 optimizer is explicitly allowed to optimize
array indices with the assumption that the subscript is a scalar
(or slice) of integer, or something that converts to integer.  I'd be
interested to know if that policy will actually buy us any performance.
If it always has to go through MMD anyway, maybe it doesn't.  But
array indexing code tends to be pretty hot, so if we can keep it
somewhat optimizable and/or jittable, that'd be nice.

Larry


Re: [pugs] regexp bug?

2005-04-15 Thread Nicholas Clark
On Fri, Apr 15, 2005 at 09:34:58AM -0700, Larry Wall wrote:

 It doesn't have to be the default, though.  But there has to be
 some way of allowing illegal characters to be talked about, or
 you can't write programs that talk about them.  It's like saying

Thoughtcrime acceptable. Doubleplusgood.

Nicholas Clark


Re: $*CWD instead of chdir() and cwd()

2005-04-15 Thread Michael G Schwern
Thus spake Larry Wall:
 Offhand, I guess my main semantic problem with it is that if a chdir
 fails, you aren't in an undefined location, which the new value of $CWD
 would seem to indicate.  You're just where you were.  Then the user
 either has to remember that, or there still has to be some other
 means of finding out the real location.

To be clear:  Only the store operation will return undef on failure.  
Additional fetches on $CWD will continue to return the cwd.

$CWD = '/path/which/exists';
$CWD = '/i/do/not/exist' err warn $!;
print $CWD;

This prints /path/which/exists/.


 The other problem with it is the fact that people will assign relative
 paths to it and expect to get the relative path back out instead
 of the absolute path.

I honestly never had this problem until I sat down and thought about it. :)
THEN I got all confused and started to do things like $CWD .= '/subdir';
instead of simply $CWD = 'subdir';.  But the rule is simple and natural.
It takes a relative or absolute directory and ALWAYS returns an absolute 
path.  Lax in what inputs it accepts, strict in what it emits.  This is no
more to remember than what chdir() and cwd() would do.

The result from $CWD would simply be a Dir object similar to Ken Williams' 
Path::Class or Ruby's Dir object.  One of the methods would be .relative.

I didn't bring up @CWD because I thought it would be too much in one sitting.
Basically it allows you to do this:

pop @CWD;   # chdir ('..');
push @CWD, 'dir';   # chdir ('dir');
print $CWD[0];  # (File::Spec->splitdir(abs_path()))[0];
# ie. What top level directory am I in?

and all sorts of other operations that would normally involve a lot of
splitdir'ing.

And then there's %CWD which I'm toying with being a per-volume chdir like
you can do on Windows but that may be too much of a questionable thing.


 Your assumption there is a bit inaccurate--in P6 you are allowed to
 temporize (localize) the effects of functions and methods that are
 prepared to deal with it.  

Yeah, we were talking about it on #perl6 a bit.  That seems to me the more
bizarre idea than assigning to something which can fail.  Localizing an
assignment is easy, there's just one thing to revert.  But function calls can
do lots of things.  Just how much does it reverse?  I guess if it's used
sensibly on sharp functions, such as chdir, and the behavior is 
user-definable it can work but I don't know if the behavior will ever
be obvious for anything beyond the trivial.

FWIW, my prompting to write File::chdir was a desire for local chdir.
So if temp chdir can be made to work that would solve most of the problem.

If nothing else perhaps chdir() should be eliminated and cwd() simply takes
an argument to make it a getter/setter.


 However, I agree that it's nice to have an
 easily interpolatable value.  So I think I'd rather see $CWD always
 return the current absolute path even after failure

The problem there is it leaves $CWD without an error mechanism and thus
becomes an unverifiable operation.  You have to use chdir() if you want to
error check and $CWD is reduced to a scripting feature.

It could throw an exception but then you have to wrap everything in a try
block.  Unless Perl 6 is going this route for I/O errors in general I'd
rather not.

I'll give the error mechanism some more thought.


Anyhow, I encourage folks to play with File::chdir and see what they think
of the idea.  I'm fixing up the Windows nits in the tests now.
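
For anyone who wants to poke at these semantics without installing
File::chdir, here is a rough Python sketch of the behaviour described in
this message: a store takes a relative or absolute path, a fetch always
gives back an absolute path, a failed store leaves you where you were, and
a scoped change is undone on exit. The names Cwd and temp_chdir are made up
for illustration, and this is not how File::chdir itself is implemented.

```python
import os
import tempfile
from contextlib import contextmanager


class Cwd:
    """Hypothetical stand-in for $CWD: lax on input, strict on output."""

    def get(self):
        # A fetch always reports the real, absolute working directory.
        return os.getcwd()

    def set(self, path):
        # A store accepts relative or absolute paths. On failure it
        # returns False and the process stays where it was.
        try:
            os.chdir(path)
        except OSError:
            return False
        return True


@contextmanager
def temp_chdir(path):
    """Rough analogue of a localized chdir: restore the old directory on exit."""
    old = os.getcwd()
    os.chdir(path)
    try:
        yield os.getcwd()
    finally:
        os.chdir(old)


def demo():
    cwd = Cwd()
    start = cwd.get()
    with tempfile.TemporaryDirectory() as base:
        os.mkdir(os.path.join(base, "subdir"))
        with temp_chdir(base):
            ok = cwd.set("subdir")                 # relative store succeeds
            absolute = os.path.isabs(cwd.get())    # fetch is absolute
            before = cwd.get()
            failed = cwd.set("i/do/not/exist")     # failed store
            unchanged = cwd.get() == before        # still in the old place
    restored = cwd.get() == start                  # scope exit undid the chdir
    return ok, absolute, failed, unchanged, restored
```

The sketch reports failure through the return value of the store, which is
the design under debate in this thread; raising an exception instead would
only change the body of set().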



Re: Statement modifier scope

2005-04-15 Thread Juerd
Paul Seamons skribis 2005-04-15 13:42 (-0600):
 Each of the declarations my, our and local currently set the value to 
 undefined (unless set = to something).

That's not true.

use strict;
$::foo = 5;
our $foo;
print $foo;  # 5


Juerd
-- 
http://convolution.nl/maak_juerd_blij.html
http://convolution.nl/make_juerd_happy.html 
http://convolution.nl/gajigu_juerd_n.html


nbsp in \s, <?ws> and <>

2005-04-15 Thread Juerd
Is there a <?ws>-like thingy that is always \s+?

Do \s and <?ws> match non-breaking whitespace, U+00A0?

How about:

U+0008  backspace
U+00A0  no break space (Repeated for overview)
U+1361  ethiopic wordspace
U+2000  en quad
U+2001  em quad
U+2002  en space
U+2003  em space
U+2004  three per em space
U+2005  four per em space
U+2006  six per em space
U+2007  figure space
U+2008  punctuation space
U+2009  thin space 
U+200A  hair space
U+200B  zero width space
U+202F  narrow no break space
U+205F  medium mathematic space
U+2060  word joiner (What is that, anyway?)
U+3000  ideographic space
U+FEFF  zero width non-breaking space

\s is said (in S05) to match any unicode whitespace, but letting it
match NBSP and then using \s for splitting things is wrong, I think.

Are the contents of <> split using <?ws>? (Is $foo, where $foo is
foo\xA0bar, one or two elements?)


Juerd
-- 
http://convolution.nl/maak_juerd_blij.html
http://convolution.nl/make_juerd_happy.html 
http://convolution.nl/gajigu_juerd_n.html


Re: $*CWD instead of chdir() and cwd()

2005-04-15 Thread Juerd
Michael G Schwern skribis 2005-04-15 13:12 (-0700):
 To be clear:  Only the store operation will return undef on failure.  
 Additional fetches on $CWD will continue to return the cwd.

Still breaks

$ref = \($CWD = $foo);

I'm not sure this breakage matters, but if it breaks one thing, it's
likely to break more than just that one thing, and I wonder how much
attention this has been given.

Hm, but $CWD++ is nice! Especially if after photos9 it goes to photos10,
and not photot0. How does string ++ work in Perl 6, anyway?

 The problem there is it leaves $CWD without an error mechanism and thus
 becomes an unverifiable operation.  You have to use chdir() if you want to
 error check and $CWD is reduced to a scripting feature.

Well, after failure it can be cwd() but false without breaking any real
code, because normally, you'd never if (cwd) { ... }, simply because
there's ALWAYS a cwd. If this is done, the thing returned by the STORE
can still be an lvalue and thus be properly reffed.

This would mean you'd use or instead of err, but I don't understand the
point of err meaning error together with the introduction of
true-but-false values anyway. Low-prec // should imo just be spelled
dor. But it's too late for that, of course.


Juerd
-- 
http://convolution.nl/maak_juerd_blij.html
http://convolution.nl/make_juerd_happy.html 
http://convolution.nl/gajigu_juerd_n.html


Re: $*CWD instead of chdir() and cwd()

2005-04-15 Thread Larry Wall
On Fri, Apr 15, 2005 at 01:12:46PM -0700, Michael G Schwern wrote:
: Thus spake Larry Wall:
:  Offhand, I guess my main semantic problem with it is that if a chdir
:  fails, you aren't in an undefined location, which the new value of $CWD
:  would seem to indicate.  You're just where you were.  Then the user
:  either has to remember that, or there still has to be some other
:  means of finding out the real location.
: 
: To be clear:  Only the store operation will return undef on failure.  

That doesn't square with the notion that an assignment returns the
actual lvalue:

($new = $old) =~ s/foo/bar/;

: Additional fetches on $CWD will continue to return the cwd.
: 
:   $CWD = '/path/which/exists';
:   $CWD = '/i/do/not/exist' err warn $!;
:   print $CWD;
: 
: This prints /path/which/exists/.

Except that the err should be looking at $CWD, not some other return value
of the assignment.

:  The other problem with it is the fact that people will assign relative
:  paths to it and expect to get the relative path back out instead
:  of the absolute path.
: 
: I honestly never had this problem until I sat down and thought about it. :)
: THEN I got all confused and started to do things like $CWD .= '/subdir';
: instead of simply $CWD = 'subdir';.  But the rule is simple and natural.
: It takes a relative or absolute directory and ALWAYS returns an absolute 
: path.  Lax in what inputs it accepts, strict in what it emits.  This is no
: more to remember than what chdir() and cwd() would do.
: 
: The result from $CWD would simply be a Dir object similar to Ken Williams' 
: Path::Class or Ruby's Dir object.  One of the methods would be .relative.
: 
: I didn't bring up @CWD because I thought it would be too much in one sitting.
: Basically it allows you to do this:
: 
:   pop @CWD;   # chdir ('..');
:   push @CWD, 'dir';   # chdir ('dir');
:   print $CWD[0];  # (File::Spec->splitdir(abs_path()))[0];
:   # ie. What top level directory am I in?
: 
: and all sorts of other operations that would normally involve a lot of
: splitdir'ing.
: 
: And then there's %CWD which I'm toying with being a per-volume chdir like
: you can do on Windows but that may be too much of a questionable thing.

You could multiplex both the array and hash roles into the object
returned by $CWD, much like the $/ pattern match result object can
be subscripted as either $/[1] or $/<mantissa>.  $CWD would itself
behave like a string in string context, but $CWD[] would get you to
the array value, and $CWD{} the hash value for systems that have
more than one current directory.

:  Your assumption there is a bit inaccurate--in P6 you are allowed to
:  temporize (localize) the effects of functions and methods that are
:  prepared to deal with it.  
: 
: Yeah, we were talking about it on #perl6 a bit.  That seems to me the more
: bizarre idea than assigning to something which can fail.  Localizing an
: assignment is easy, there's just one thing to revert.  But function calls can
: do lots of things.  Just how much does it reverse?  I guess if it's used
: sensibly on sharp functions, such as chdir, and the behavior is 
: user-definable it can work but I don't know if the behavior will ever
: be obvious for anything beyond the trivial.

The function reverses whatever its TEMP property's closure knows how
to reverse.  It's up to the function to know what its side effects are
and arrange to undo them.

: FWIW, my prompting to write File::chdir was a desire for local chdir.
: So if temp chdir can be made to work that would solve most of the problem.
: 
: If nothing else perhaps chdir() should be eliminated and cwd() simply takes
: an argument to make it a getter/setter.

If you're going to throw away the verb then the noun might as well be
a variable.  But I like verbs for their readability, even if the verb
is push.  Note that push could be made to work with temp as well:

temp push $CWD, subdir err fail ...

This would automatically pop $CWD at the end of the dynamic scope.

:  However, I agree that it's nice to have an
:  easily interpolatable value.  So I think I'd rather see $CWD always
:  return the current absolute path even after failure
: 
: The problem there is it leaves $CWD without an error mechanism and thus
: becomes an unverifiable operation.  You have to use chdir() if you want to
: error check and $CWD is reduced to a scripting feature.

That was my point.  And if you look back at what you wrote, you just
called $CWD an operation.  It's not--it's a noun.  I like nouns,
but I also like verbs, and unlike in Perl 5 we don't have to rely on
the magical side effects of certain mystical nouns to do localization
any more.

But I don't understand what you mean by a scripting feature, or
how getting reduced to one is antithetical to a blissful existence.

: It could throw an exception but then you have to wrap everything in a try
: block.  Unless Perl 6 is going this 

Heredocs: How equal are bunches of spaces to tabs?

2005-04-15 Thread Juerd
Pasted from pugs/examples/cookbook/01-00introduction.p6:

# XXX - question: How equal are bunches of spaces to tabs?
#   -- I'd say that's a question for perl6lang


Juerd
-- 
http://convolution.nl/maak_juerd_blij.html
http://convolution.nl/make_juerd_happy.html 
http://convolution.nl/gajigu_juerd_n.html


Re: $*CWD instead of chdir() and cwd()

2005-04-15 Thread chromatic
On Fri, 2005-04-15 at 23:52 +0200, Juerd wrote:

 Well, after failure it can be cwd() but false without breaking any real
 code, because normally, you'd never if (cwd) { ... }, simply because
 there's ALWAYS a cwd.

Not always -- try removing a directory that's the pwd of another
process.

-- c



Re: nbsp in \s, <?ws> and <>

2005-04-15 Thread Juerd
Aaron Sherman skribis 2005-04-15 18:20 (-0400):
  Is there a <?ws>-like thingy that is always \s+?
 Not sure what that means exactly.

<?ws> is \s* or \s+, depending on its surroundings.

 Thankfully, NBSP (U+00A0) is not Unicode whitespace.

Thanks for sharing this information!


Juerd
-- 
http://convolution.nl/maak_juerd_blij.html
http://convolution.nl/make_juerd_happy.html 
http://convolution.nl/gajigu_juerd_n.html


Re: nbsp in \s, <?ws> and <>

2005-04-15 Thread Larry Wall
On Fri, Apr 15, 2005 at 11:44:03PM +0200, Juerd wrote:
: Is there a <?ws>-like thingy that is always \s+?

Not currently, since \s+ is there.  <?ws> used to be that, but
currently is defined as the magical whitespace matcher used by :words.

: Do \s and <?ws> match non-breaking whitespace, U+00A0?

Yes.

: How about:
: 
: U+0008  backspace
: U+00A0  no break space (Repeated for overview)
: U+1361  ethiopic wordspace
: U+2000  en quad
: U+2001  em quad
: U+2002  en space
: U+2003  em space
: U+2004  three per em space
: U+2005  four per em space
: U+2006  six per em space
: U+2007  figure space
: U+2008  punctuation space
: U+2009  thin space 
: U+200A  hair space
: U+200B  zero width space
: U+202F  narrow no break space
: U+205F  medium mathematic space
: U+2060  word joiner (What is that, anyway?)
: U+3000  ideographic space
: U+FEFF  zero width non-breaking space

Yes, any Unicode whitespace, but you seem to have a different list than
I do.  Outside of the standard ASCIIish control-character whitespace,
I count only the \pZ characters, not the \pC characters, so I don't have
to tell you what a word-joiner is, since it's a \p[Cf] character.  :-)

I will also gleefully ignore the existence of BOMs.

So I make it:

0020;SPACE;Zs;0;WS;;;;;N;;;;;
00A0;NO-BREAK SPACE;Zs;0;CS;<noBreak> 0020;;;;N;NON-BREAKING SPACE;;;;
1680;OGHAM SPACE MARK;Zs;0;WS;;;;;N;;;;;
180E;MONGOLIAN VOWEL SEPARATOR;Zs;0;WS;;;;;N;;;;;
2000;EN QUAD;Zs;0;WS;2002;;;;N;;;;;
2001;EM QUAD;Zs;0;WS;2003;;;;N;;;;;
2002;EN SPACE;Zs;0;WS;<compat> 0020;;;;N;;;;;
2003;EM SPACE;Zs;0;WS;<compat> 0020;;;;N;;;;;
2004;THREE-PER-EM SPACE;Zs;0;WS;<compat> 0020;;;;N;;;;;
2005;FOUR-PER-EM SPACE;Zs;0;WS;<compat> 0020;;;;N;;;;;
2006;SIX-PER-EM SPACE;Zs;0;WS;<compat> 0020;;;;N;;;;;
2007;FIGURE SPACE;Zs;0;WS;<noBreak> 0020;;;;N;;;;;
2008;PUNCTUATION SPACE;Zs;0;WS;<compat> 0020;;;;N;;;;;
2009;THIN SPACE;Zs;0;WS;<compat> 0020;;;;N;;;;;
200A;HAIR SPACE;Zs;0;WS;<compat> 0020;;;;N;;;;;
200B;ZERO WIDTH SPACE;Zs;0;BN;;;;;N;;;;;
2028;LINE SEPARATOR;Zl;0;WS;;;;;N;;;;;
2029;PARAGRAPH SEPARATOR;Zp;0;B;;;;;N;;;;;
202F;NARROW NO-BREAK SPACE;Zs;0;WS;<noBreak> 0020;;;;N;;;;;
205F;MEDIUM MATHEMATICAL SPACE;Zs;0;WS;<compat> 0020;;;;N;;;;;
3000;IDEOGRAPHIC SPACE;Zs;0;WS;<wide> 0020;;;;N;;;;;

: \s is said (in S05) to match any unicode whitespace, but letting it
: match NBSP and then using \s for splitting things is wrong, I think.

Perhaps the default word split should not be based on \s then.
It's just one more difference, in addition to trimming leading and
trailing whitespace like awk.

: Are the contents of <> split using <?ws>? (Is $foo, where $foo is
: foo\xA0bar, one or two elements?)

That is using the default word splitter (or it *is* the default word
splitter), so if the default word split is based on +[\s]-[\xA0]
it would be one element.

Of course, the ZERO WIDTH SPACE is a nasty critter for anyone using
whitespace to separate tokens.  That and maybe thin spaces probably
merit warnings in Perl code where they might cause visual ambiguity.

Larry
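
The difference between splitting on \s and splitting on breaking
whitespace only can be tried out today in other languages. Below is a
small Python sketch, not Perl 6 and not a statement of what the default
word splitter will end up doing. It assumes the characters to exclude are
the three marked noBreak in the list above (U+00A0, U+2007, U+202F); the
function names are invented for the example.

```python
import re

# The three characters marked noBreak in the list above.
NO_BREAK = "\u00a0\u2007\u202f"

# "Whitespace, but not a no-break space": inside the class, [^\S...]
# reads as "not non-whitespace and not one of NO_BREAK".
BREAKING_WS = re.compile(r"[^\S" + NO_BREAK + r"]+")


def split_words(text):
    """Split on runs of breaking whitespace, trimming both ends like awk."""
    return [w for w in BREAKING_WS.split(text) if w]


def split_naive(text):
    """Split on any run of whitespace, which also breaks at no-break spaces."""
    return [w for w in re.split(r"\s+", text) if w]
```

With the input "foo\u00a0bar baz", split_words keeps foo and bar glued
together as one element, while split_naive returns three elements.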


Re: nbsp in \s, <?ws> and <>

2005-04-15 Thread Juerd
Larry Wall skribis 2005-04-15 15:38 (-0700):
 : Do \s and <?ws> match non-breaking whitespace, U+00A0?
 Yes.

That makes \s+ and \s*, and thus <?ws> very useless for anything but
trimming whitespace. For splitting (including word wrapping), it'd do
exactly the wrong thing.

 : \s is said (in S05) to match any unicode whitespace, but letting it
 : match NBSP and then using \s for splitting things is wrong, I think.
 Perhaps the default word split should not be based on \s then.

It'd have to.


Juerd
-- 
http://convolution.nl/maak_juerd_blij.html
http://convolution.nl/make_juerd_happy.html 
http://convolution.nl/gajigu_juerd_n.html


Re: nbsp in \s, <?ws> and <>

2005-04-15 Thread Larry Wall
On Sat, Apr 16, 2005 at 12:46:47AM +0200, Juerd wrote:
: Larry Wall skribis 2005-04-15 15:38 (-0700):
:  : Do \s and <?ws> match non-breaking whitespace, U+00A0?
:  Yes.
: 
: That makes \s+ and \s*, and thus <?ws> very useless for anything but
: trimming whitespace. For splitting (including word wrapping), it'd do
: exactly the wrong thing.

Maybe we just need a <bws> for breaking white space, or some such.
<?ws> is primarily used in pattern matching with :w, where a
non-breaking space in the input would presumably be matched by a
non-breaking space in the pattern, or maybe an explicit <nbsp>.
As long as patterns (with or without :w) treat non-breaking spaces
as ordinary matching characters, it should work out, methinks.
Though it's probably a hair more readable to use an explicit <nbsp>...

Larry


Re: Heredocs: How equal are bunches of spaces to tabs?

2005-04-15 Thread Larry Wall
On Sat, Apr 16, 2005 at 12:11:24AM +0200, Juerd wrote:
: Pasted from pugs/examples/cookbook/01-00introduction.p6:
: 
: # XXX - question: How equal are bunches of spaces to tabs?
: #   -- I'd say that's a question for perl6lang

This seems to be singularly short on context, but if it has to do with
trimming leading whitespace from heredocs, A2 already discusses this.

Larry


Re: nbsp in \s, <?ws> and <>

2005-04-15 Thread Mark Reed
I thought we had just established that nbsp is not in Unicode's definition
of whitespace.  So why should \s match it?



On 2005-04-15 18:56, Larry Wall [EMAIL PROTECTED] wrote:

 On Sat, Apr 16, 2005 at 12:46:47AM +0200, Juerd wrote:
 : Larry Wall skribis 2005-04-15 15:38 (-0700):
 :  : Do \s and <?ws> match non-breaking whitespace, U+00A0?
 :  Yes. 
 : 
 : That makes \s+ and \s*, and thus <?ws> very useless for anything but
 : trimming whitespace. For splitting (including word wrapping), it'd do
 : exactly the wrong thing.
 
 Maybe we just need a <bws> for breaking white space, or some such.
 <?ws> is primarily used in pattern matching with :w, where a
 non-breaking space in the input would presumably be matched by a
 non-breaking space in the pattern, or maybe an explicit <nbsp>.
 As long as patterns (with or without :w) treat non-breaking spaces
 as ordinary matching characters, it should work out, methinks.
 Though it's probably a hair more readable to use an explicit <nbsp>...
 
 Larry 
 




Comparing rationals/floats

2005-04-15 Thread gcomnz
More questions stemming from cookbook work... Decimal Comparisons:

The most common recipe around for comparisons is to use sprintf to cut
the decimals to size and then compare strings. Seems ugly.

The non-stringification way to do it is usually along the lines of: 

if (abs($value1 - $value2) < abs($value1 * epsilon))

(From Mastering Algorithms with Perl errata)

I'm wondering though, if C<$value1 == $value2> is always wrong (or
almost always wrong) then should it be smarter and:

a. throw a warning
b. DWIM using overloaded operators (as in reduce precision then compare)
c. throw a warning but have other comparison operators just for this
case to make sure you know what you're doing

I'd vote for b., but I don't know enough about the problem domain to
know if that is safe, and realistically I just want to write the
cookbook entry rather than start a math-geniuses flame war ;-)

Which leads to another question: Are there $value.precision() and
$value.accuracy() methods available for decimals? I'd really rather
not do the string comparison if it can be avoided, maybe it's just the
purist in me saying leave the numbers be :-)

Apologies in advance if this is somewhere I missed. I did a lot of searching.

Marcus Adair
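
For comparison, here is the same relative-epsilon test written out as a
small Python sketch. The function name approx_equal and the default
epsilon of 1e-9 are choices made for this example, not anything taken
from the Perl 6 design documents. One wrinkle worth noting: because the
tolerance is scaled by the first value, the test as written says that 0.0
does not equal 0.0, so the sketch adds an exact-equality shortcut.

```python
import math


def approx_equal(a, b, epsilon=1e-9):
    """Relative comparison: true when |a - b| < |a * epsilon|."""
    if a == b:
        # Covers 0.0 vs 0.0, where the scaled tolerance would be zero.
        return True
    return abs(a - b) < abs(a * epsilon)


def demo():
    total = 0.1 + 0.2
    return (
        total == 0.3,               # plain == on binary floats
        approx_equal(total, 0.3),   # relative-epsilon comparison
        math.isclose(total, 0.3),   # the standard-library equivalent
        approx_equal(0.0, 0.0),
    )
```

Python ships this idea as math.isclose, which makes the tolerance
symmetric in its two arguments and takes a separate absolute tolerance
for values near zero.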


Re: [pugs] Quoting constructs

2005-04-15 Thread Roie Marianer
On Friday 15 April 2005 3:27 am, Larry Wall wrote:
 On Fri, Apr 15, 2005 at 03:27:27AM +0300, Roie Marianer wrote:
 :  %hash a $key_b c   :key a $value_b c 
 :  %hash« a $key_b c »:key« a $value_b c »
 :
 : Just to be certain, these are both equivalent to
 :
 :  @hash{'a', $key_b, 'c'} key = ['a', $value_b, 'c']
 :
 : in Perl 5, right?

 Close.  It's actually more like:

 @hash{split  , a $key_b c}key = [split  , a $value_b c]

I actually knew that, but in my head $key_b and $value_b were single words. 
But according to S02, the interpolation is protected by quotes. That is, if 
$key_b is q0/printf Hello, world\n or die/, that's four words, correct? Or 
is it just if the quotes actually appear in the quoting construct? Basically 
I'm wondering if there's a detailed specification of how  should work.

Several only-slightly-related questions about interpolating:

1. qq x$varx eq $var? (That's how it works in Perl5, anyway)

2. If the delimiter is not a single character (I think this only applies to 
), does a backslash protect the first character or both? For example, in
 some words \ or die
Is that three words ['some', 'words', ''] with the  ending the construct, 
or is that ['some', 'words', '', 'or', 'die']? (and the rest of the file 
is interpolated and split into words)

3. Are -style delimiters allowed in other quoting constructs? Is 
qHello the string Hello, or the string Hello followed by the 
greater-than sign? (As you can probably tell, I haven't implemented  yet 
at all.)

My head hurts. :-)

By the way, something tells me perl6-compiler isn't the best place for this 
discussion. Is there a secret group of people that discusses cornercases for 
perl6, and if so could someone tell me on what list they live?
-- 
-Roie
v2sw6+7CPhw5ln5pr4/6$ck2ma8+9u7/8LSw2l6Fi2e2+8t4TNDSb8/4Aen4+7g5Za22p7/8
[ http://www.hackerkey.com ]


Re: Comparing rationals/floats

2005-04-15 Thread Doug McNutt
At 16:18 -0700 4/15/05, gcomnz wrote:
More questions stemming from cookbook work... Decimal Comparisons:

The most common recipe around for comparisons is to use sprintf to cut
the decimals to size and then compare strings. Seems ugly.

The non-stringification way to do it is usually along the lines of:

if (abs($value1 - $value2) < abs($value1 * epsilon))

(From Mastering Algorithms with Perl errata)

I'm wondering though, if C<$value1 == $value2> is always wrong (or
almost always wrong) then should it be smarter and:
SNIP
Marcus Adair

I have longed for an OO class that might be called measurement. An object 
would include a float, a unit of measure, and an estimate of accuracy.

Mathematical operations would be overloaded so that the result of a calculation 
would appropriately handle propagation of the argument's accuracies into the 
result. It might even do unit conversions but that's another subject. Coercion 
of a float into a measurement would be automatic with infinite precision 
assumed.

Given the new class it is easy to adjust comparison operators to calculate 
within experimental error.
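
A minimal sketch of such a class, in Python rather than Perl, might look
like the following. Everything here is an assumption made for
illustration: the names Measurement, error and agrees_with are invented,
errors are taken to be independent so that they combine in quadrature,
only addition and subtraction are covered, and mismatched units are
rejected rather than converted.

```python
import math
from dataclasses import dataclass


@dataclass(frozen=True)
class Measurement:
    value: float
    error: float = 0.0   # absolute uncertainty; 0.0 means "exact"
    unit: str = ""       # a plain label, no conversions

    @staticmethod
    def coerce(x):
        # A bare number becomes a measurement with no uncertainty.
        return x if isinstance(x, Measurement) else Measurement(float(x))

    def _common_unit(self, other):
        if self.unit and other.unit and self.unit != other.unit:
            raise ValueError("unit conversion is outside this sketch")
        return self.unit or other.unit

    def __add__(self, other):
        other = Measurement.coerce(other)
        unit = self._common_unit(other)
        # Independent absolute errors add in quadrature.
        err = math.hypot(self.error, other.error)
        return Measurement(self.value + other.value, err, unit)

    __radd__ = __add__

    def __sub__(self, other):
        other = Measurement.coerce(other)
        unit = self._common_unit(other)
        err = math.hypot(self.error, other.error)
        return Measurement(self.value - other.value, err, unit)

    def agrees_with(self, other):
        """Equal within experimental error: the difference is no larger
        than the combined uncertainty of the two values."""
        diff = self - other
        return abs(diff.value) <= diff.error
```

For example, 10.0 +/- 0.3 m and 10.4 +/- 0.4 m agree, since they differ by
0.4 against a combined uncertainty of 0.5. An overloaded == could call the
same test, at the cost of an equality that is no longer transitive.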

-- 

-- Life begins at ovulation. Ladies should endeavor to get every young life 
fertilized. --


Announcing Test::TAP::Model and Test::TAP::HTMLMatrix

2005-04-15 Thread Yuval Kogman
Hola...

The code used to generate pugs smoke HTMLs (like
http://nothingmuch.woobling.org/pugs_test_status/ - warning around
800K), was refactored into two perl (5) modules, now (that is, when
your mirror has synched) available on the CPAN.

This code is authored by many of the pugs authors. If you feel the
need to discuss it, I think #perl6 on freenode is the place. In any
case, I'm not authoritative, as this code is not only mine.

In order to honor the fine tradition of releng breakage, both 0.01
versions are crummy. Use 0.02. Sorry =(

The two darcs repos for these modules are:

http://nothingmuch.woobling.org/Test-TAP-Model
http://nothingmuch.woobling.org/Test-TAP-HTMLMatrix

Test::TAP::Model wraps around Test::Harness::Straps and gives a sort
of souped up DOM to the TAP data that was collected, and
Test::TAP::HTMLMatrix creates the HTML using this DOM and a Petal
template.

Ciao!

-- 
 ()  Yuval Kogman [EMAIL PROTECTED] 0xEBD27418  perl hacker 
 /\  kung foo master: /methinks long and hard, and runs away: neeyah!!!





Re: $*CWD instead of chdir() and cwd()

2005-04-15 Thread Michael G Schwern
On Fri, Apr 15, 2005 at 11:52:38PM +0200, Juerd wrote:
  becomes an unverifiable operation.  You have to use chdir() if you want to
  error check and $CWD is reduced to a scripting feature.
 
 Well, after failure it can be cwd() but false without breaking any real
 code, because normally, you'd never if (cwd) { ... }, simply because
 there's ALWAYS a cwd. If this is done, the thing returned by the STORE
 can still be an lvalue and thus be properly reffed.

Good idea!


