Re: Decrement of Numbers in Strings (Was: [svn:perl6-synopsis] r14460 - doc/trunk/design/syn)

2008-04-23 Thread Ph. Marek
On Wednesday, 23 April 2008, Larry Wall wrote:
> On Wed, Apr 23, 2008 at 04:03:01PM +0100, Smylers wrote:
> : The algorithm for increment and decrement on strings sounds really good,
> : however I'm concerned that dealing with all that has made the common
> : case of integer decrement a little less intuitive where the integer
> : happens to be stored in a string, for example in this case:
> :
> :perl -wle '$a = 10; $b = shift; $a--; $b--; print "$a $b"' 10
> :
> : Perl 5 prints "9 9", but Perl 6 will print "9 09".
>
> On the other hand, "09" has the advantage of still having the numeric
> value 9.  But the converse is not true if the user was expecting a
> string decrement, since decrementing "10" in Perl 5 to get 9 is also
> counterintuitive if you were expecting "09".  So Perl 5 just punts,
> which is the third option.  In any case, there's always something
> to explain to a beginner.  But I think in Perl 6 we're leaning more
> toward preserving information than Perl 5 did.
But that doesn't really work for loops.

Imagine (excuse my perl5)
$a = "100";
$a-- for(1 .. 40);

So ($a eq "060")?
Then you'll have the problem that this gets (or might get) interpreted as 
octal somewhere; if not in perl6 directly (because of different base 
specifications), you're likely to get problems when passing that to other 
programs, eg. via system().
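
To make the concern concrete, here is a minimal Python sketch of a width-preserving string decrement. The helper name decrement_numeric_string is hypothetical, it only handles non-negative decimal digit strings, and it is an approximation of the behaviour under discussion, not the algorithm from the synopsis.

```python
def decrement_numeric_string(text):
    """Decrement a string of decimal digits, keeping its width (hypothetical helper)."""
    if not text.isdigit():
        raise ValueError("expected a string of decimal digits")
    value = int(text) - 1
    if value < 0:
        raise ValueError("cannot decrement below zero in this sketch")
    # zfill pads with leading zeros, so "100" becomes "099" rather than "99".
    return str(value).zfill(len(text))


counter = "100"
for _ in range(40):
    counter = decrement_numeric_string(counter)

# A consumer that reads a leading zero as an octal marker sees a different number.
as_decimal = int(counter, 10)
as_octal = int(counter, 8)
```

After the loop the counter is the string "060"; read as decimal and as octal it gives two different values, which is the hazard when such a string is handed to another program.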


I think that's a can of worms, and I'd be +1 on TSa:
> If the programmer really wants to decrement "10" to "09" she has 
> to cast that to Str: ("10" as Str)--. So we have "10".HOW === Str
> but "10".WHAT === Num Str. 
Regards,

Phil


Re: multi method dispatching of optional arguments (further refined)

2006-09-04 Thread Ph. Marek
On Tuesday 05 September 2006 07:52, Trey Harris wrote:
> I don't think you're dumb; the Synopses just require that you intuit
> certain things from each other, from examples in other Synopses, and so on
> in a Perlish sort of way; what you're looking for is not spelled out
> explicitly.  It can be found by noticing how you specify subtypes, along
> with noticing that subtypes can be specified as parameter types.  There's
> also an example showing explicitly what you want in S12.
Ok, I'll try to dive through the documentation before asking questions.

> It's just
>
> multi sub SomeThing(Num $a where {$^a == 4}, Num $b) { $b + 2  }
> multi sub SomeThing(Num $a, Num $b where {$^b == 3}) { $a + 1  }
> multi sub SomeThing(Num $a, Num $b)  { $a * $b }
>
> Yes, the signatures are different--the first two multis specify subtypes
> as their signatures, the last specifies a canonical type.
Thank you *very* much! That clears it up.


Regards,

Phil


Re: multi method dispatching of optional arguments (further refined)

2006-09-04 Thread Ph. Marek
On Monday 04 September 2006 16:21, Audrey Tang wrote:
> 2006/9/4, Ph. Marek <[EMAIL PROTECTED]>:
> > Excuse me for getting into this thread with only minor knowledge about
> > perl6, but will there be MMD based on the *value* of parameters? Like
> > Haskell has.
>
> Why, yes, see the various Unpacking sections in S06, as well as "where"
> type constraints.  We're st^H^Hadapting as much as we can. :-)
Hello Audrey!


I now had a look at http://dev.perl.org/perl6/doc/design/syn/S06.html but 
didn't find what I meant.
Sorry if I'm just dumb and don't understand you (or S06); I'll try to explain 
what I mean.


In Haskell you can e.g. write:

someThing :: Int -> Int -> Int
someThing a b
  | a == 4    = b+2
  | b == 3    = a+1
  | otherwise = a*b
or
anotherThing :: Int -> Int -> Int
anotherThing 4 b = b+2
anotherThing a 3 = a+1
anotherThing a b = a*b


In Perl5 this looks like

sub SomeThing
{
  my($a, $b)=@_;

  return $b+2 if ($a == 4);
  return $a+1 if ($b == 3);
  return $a*$b;
}

Which is a bit wrong IMO, because the condition should be first.
But
sub SomeThing
{
  my($a, $b)=@_;

  if ($a == 4) { return $b+2 }
  if ($b == 3) { return $a+1 }
  return $a*$b;
}
is a bit of a hassle with the {} and repeated if()s.


What I am asking is whether there will be some multimethod dispatch depending 
on the *value*, not the *type*, of parameters.
Perl6 could possibly do something with "given"; but matching on multiple 
variables seems to be verbose, too.
I'm looking for something in the way of

sub SomeThing(Num $a, Num $b) where $a==4 is $b+2;
sub SomeThing(Num $a, Num $b) where $b==3 is $a+1;
sub SomeThing(Num $a, Num $b) { return $a * $b }

but without specifying the signature multiple times (or maybe we should, since 
it's MMD). Now

sub SomeThing(Num $a, Num $b) 
{
  if $a==4 { return $b+2;}
  if $b==3 { return $a+1;}
 return $a * $b;
}

would almost do what I want, but I don't know if the compiler would optimize 
that in the way it could for direct MMD depending on types.
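
As an illustration of dispatching on values rather than types, here is a minimal Python sketch. The helper make_dispatcher and the guard/body pairing are assumptions made for this example; they show the idea of "first matching case wins" and say nothing about how Perl 6 would implement or optimize it.

```python
def make_dispatcher(*cases):
    """Return a function that runs the body of the first case whose guard accepts the arguments."""
    def dispatch(*args):
        for guard, body in cases:
            if guard(*args):
                return body(*args)
        raise TypeError("no case matches the given arguments")
    return dispatch


# The three cases mirror the SomeThing example: a == 4, then b == 3, then the default.
some_thing = make_dispatcher(
    (lambda a, b: a == 4, lambda a, b: b + 2),
    (lambda a, b: b == 3, lambda a, b: a + 1),
    (lambda a, b: True,   lambda a, b: a * b),
)
```

The conditions come first and the signature is written only once, which is the visual property asked for above; the cost is a linear scan over the guards.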


Regards,

Phil


Re: multi method dispatching of optional arguments (further refined)

2006-09-03 Thread Ph. Marek
On Sunday 03 September 2006 14:25, Mark Stosberg wrote:
> Luke Palmer wrote:
> > On 9/3/06, Mark Stosberg <[EMAIL PROTECTED]> wrote:
> >> Note that the variant /with/ the parameter can be considered an exact
> >> match, but the variant /without/ it cannot be considered an exact
> >> match.
Excuse me for getting into this thread with only minor knowledge about perl6, 
but will there be MMD based on the *value* of parameters? Like Haskell has.

I don't know about a possible syntax, but sometimes it's a very nice way to 
dispatch to different parts.

(I know that that's possible with if statements, but they have a disadvantage:
they're not so visually "dispatching", if you know what I mean).


Regards,

Phil


Re: Do chained comparisons short-circuit?

2006-01-18 Thread Ph. Marek
On Thursday 19 January 2006 04:25, Luke Palmer wrote:
> On 1/19/06, Joe Gottman <[EMAIL PROTECTED]> wrote:
> >Suppose I have code that looks like this:
> >
> > my ($x, $y, $z) = (1, 2, 3);
> >
> > say "sorted backward" if ++$x > ++$y > ++$z;
> >
> > Will $z be incremented even though the chained comparison is known to be
> > false after ++$x and ++$y are compared?
>
> I don't see a reason for chained comparisons not to short-circuit,
> besides the surprise factor.  But anyone who knows about &&, and
> understands chained comparisons as expanding to &&, should understand
> short-circuiting behavior.
Although that may lead to _longer_ code, which (when extended) is likely to be 
broken:

$x++; $y++; $z++;
say "sorted backward" if $x > $y > $z;

To be honest, in this example it mostly doesn't matter; if $x > $y, then 
($x+1) > ($y+1). But in many quickly written scripts I did some numeric 
operation to force the value to numeric, even if I got a parameter like 
"string" (which becomes 0 when numified).


How about some flag saying "don't short-circuit this"?
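
For comparison, Python defines chained comparisons as short-circuiting, and the difference to evaluating all operands first can be observed with side effects. The sketch below is only an illustration in another language; the helper noted is hypothetical and records which operands were evaluated.

```python
def noted(log, name, value):
    """Record that an operand was evaluated, then return its value."""
    log.append(name)
    return value


x, y, z = 1, 2, 3

# Chained form: the third operand is skipped once the first comparison fails.
lazy_log = []
lazy_result = noted(lazy_log, "x", x + 1) > noted(lazy_log, "y", y + 1) > noted(lazy_log, "z", z + 1)

# "Don't short-circuit": evaluate every operand first, then compare.
eager_log = []
a, b, c = (noted(eager_log, name, value + 1) for name, value in (("x", x), ("y", y), ("z", z)))
eager_result = a > b > c
```

Both forms give the same truth value here, but only the eager form touches the third operand, which matters when the operands have side effects such as ++.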


Regards,

Phil


Re: reduce metaoperator on an empty list

2005-06-07 Thread Ph. Marek
On Tuesday 07 June 2005 23:41, Luke Palmer wrote:
> On 6/7/05, Larry Wall <[EMAIL PROTECTED]> wrote:
> > Okay, I've made up my mind.  The "err" option is not tenable because
> > it can cloak real exceptions, and having multiple versions of reduce is
> > simply multiplying entities without adding much power.  So let's allow
> > an optional "identvalue" trait on operators.  If it's there, reduce
> > can use it.  If it's not, reduce returns failure on 0 args.  Built-in
> > addition will have an identity value of 0, while multiplication will
> > have an identity value of 1.  String concatenation will have "".
> > We can go as far as having -Inf on [<] and +Inf on [>]
>
> < and > still don't make sense as reduce operators.  Observe the table:
>
> # of args   |   Return (type)
> 0   |   -Inf
> 1   |   Num  (the argument)
> 2   |   bool
> ... |   bool
How about using initvalue twice for empty array, ie. always pad to at least 
two values?

So
 $bool = [<] @empty_array; # is false (-Inf < -Inf)
 $bool = [<=] @empty_array; # is true (-Inf <= -Inf)

Which would make some sort of sense - in an empty array there's no right 
element that's bigger than its left neighbour ...

And if the case [<] @empty_array should return true it's easy to use ?? ::.
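
A minimal Python sketch of this padding idea follows. The function name chained and the decision to pad only the empty list (a one-element list is left alone and counts as trivially ordered) are assumptions made for the example, not something the thread settles.

```python
import math
import operator


def chained(compare, values, pad=-math.inf):
    """Fold a comparison over neighbouring elements; an empty list is padded to (pad, pad)."""
    padded = list(values)
    if not padded:
        padded = [pad, pad]
    # all() over the neighbouring pairs models a chained comparison.
    return all(compare(left, right) for left, right in zip(padded, padded[1:]))


empty_less = chained(operator.lt, [])
empty_less_equal = chained(operator.le, [])
```

With this padding the empty list is false under < and true under <=, matching the two lines above.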


Just my €0.02.


Regards,

Phil



Re: Zero-day rules implementation status in Pugs

2005-05-09 Thread Ph. Marek
On Monday 09 May 2005 19:36, Autrijus Tang wrote:
> On Mon, May 09, 2005 at 10:51:53PM +1000, Damian Conway wrote:
> > Autrijus wrote:
> > >/me eagerly awaits new revelation from Damian...
> >
> > Be careful what you wish for. Here's draft zero. ;-)
>
> ...and here is my status report of the Zero-Day exploit, err,
> implementation, in Pugs. :-)
That's great.
I'm just waiting for the next time, when you announce the implementation 
before the draft.

I'm really looking forward to meeting you in Vienna next month.


Regards,

Phil



Re: S5 and overlap

2004-09-21 Thread Ph. Marek
> > # With the new :ov (:overlap) modifier, the current rule will match at
> > all possible character positions (including overlapping) and return all
> > matches in a list context, or a disjunction of matches in a scalar
> > context. The first match at any position is returned.
> >
> > $str = "abracadabra";
> >
> > @substrings = $str ~~ m:overlap/ a (.*) a /;
> >
> > # bracadabr cadabr dabr br
>
> Maybe I'm wrong here, but I'd get
Just found the answer, sorry.

But that gets me to the next question, ie I don't understand the difference 
between exhaustive and overlap.

Is it that overlap fixes the first point of the pattern match and does further 
scanning for all possibilities, and exhaustive then *after* this processing 
searches for another first point?


Regards,

Phil


S5 and overlap

2004-09-21 Thread Ph. Marek
> # With the new :ov (:overlap) modifier, the current rule will match at all
> possible character positions (including overlapping) and return all matches
> in a list context, or a disjunction of matches in a scalar context. The
> first match at any position is returned.   
> 
> $str = "abracadabra";
> 
> @substrings = $str ~~ m:overlap/ a (.*) a /;
> 
> # bracadabr cadabr dabr br

Maybe I'm wrong here, but I'd get
$str = "abracadabra";
 bracadabr
 cadabr
   dabr
 br
(so far identical), but then I'd also expect
  bracad
 cad
   d
  brac
 c
  br

which gets me to the question, if there'll be some elements multiple times in 
the array (they should), and in which order they appear (first match to 
(nth .. 1st) match, 2nd to (nth .. 2nd)) and so on ...
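
The two readings can be enumerated with a short Python sketch. It assumes that "overlap" means the first (greedy) match at each start position and that the longer list means every start/end combination; the function names are hypothetical and the ordering of the second list is just one possible choice.

```python
import re


def overlap_matches(text):
    """First (greedy) match of a(.*)a at every start position."""
    pattern = re.compile(r"a(.*)a")
    results = []
    for start in range(len(text)):
        match = pattern.match(text, start)   # anchored at this start position
        if match:
            results.append(match.group(1))
    return results


def all_matches(text):
    """Every substring between an 'a' and any later 'a', longest first per start."""
    positions = [index for index, char in enumerate(text) if char == "a"]
    return [text[i + 1:j] for i in positions for j in reversed(positions) if j > i]


overlap = overlap_matches("abracadabra")
everything = all_matches("abracadabra")
```

The first list has the four entries quoted from the synopsis; the second has ten entries and contains "br" twice, once for each place it occurs.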

BTW: will
  $str = "abracadabra";
 
  @substrings = $str ~~ m:overlap/ a (.*) (b|d) /;
get some empty strings as well (I believe it should)?


Regards,

Phil



Re: push with lazy lists

2004-07-18 Thread Ph. Marek
On Friday 16 July 2004 18:23, Jonadab the Unsightly One wrote:
> > Please take my words as my understanding, ie. with no connection to
> > mathematics or number theory or whatever. I'll just say what I
> > believe is practical.
>
> [...]
>
> > I'd believe that infinity can be integer, ie. has no numbers after
> > the comma; and infinity is in the natural numbers (?), which are a
> > subset of integers.
>
> If that were the case, 0/Inf would == 0.
Isn't that so?
0/+Inf == 0
0/-Inf  == 0 (or -0, if you wish :-)
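
IEEE 754 floating point, as exposed by Python, behaves this way; the snippet is only a check of float arithmetic and makes no claim about what Perl 6 specifies.

```python
import math

positive = 0 / math.inf    # 0.0
negative = 0 / -math.inf   # -0.0, which still compares equal to 0

# copysign exposes the sign bit that == cannot see.
negative_sign = math.copysign(1.0, negative)
```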

> Also, if that were the case, 0..Inf would be a finite list.  (It is
> trivial to prove that 0..N is a finite list with finite cardinality
> for all natural numbers N.  So if you set N equal to Inf, 0..Inf would
> have finite cardinality, if Inf is a natural number.)
>
> This is obviously some new definition of Inf of which I was not
> previously aware.
Well, after reading my sentence once more, I see what may have caused some 
trouble.
Inf is not in N; but *in my understanding* it fits naturally as an extension 
to N, that is, Inf is (or can be) an integer, as it is "after" N...

This won't be written in math books, I know.

> Also, if that were the case, 0..Inf would be a finite list.  (It is
> trivial to prove that 0..N is a finite list with finite cardinality
> for all natural numbers N.  So if you set N equal to Inf, 0..Inf would
> have finite cardinality, if Inf is a natural number.)
If I extend the natural numbers N with Inf to a new set NI (N with Inf), then 
0 .. n (for n in NI) need not be finite ...


Sorry for my (very possibly wrong) opinion ...


Regards,

Phil



Re: push with lazy lists

2004-07-14 Thread Ph. Marek
On Wednesday 14 July 2004 08:39, David Storrs wrote:
> > To repeat Dave and myself - if
> > @x = 1 .. Inf;
> > then
> > rand(@x)
> > should be Inf, and so
> > print $x[rand(@x)];
> > should give Inf, as the infinite element of @x is Inf.

Please take my words as my understanding, ie. with no connection to mathematics 
or number theory or whatever. I'll just say what I believe is practical.

> Does it even make sense to take the Infiniteth element of an
> array?...after all, array indices are integers, and Inf is not an
> integer.  
I'd believe that infinity can be integer, ie. has no numbers after the comma; 
and infinity is in the natural numbers (?), which are a subset of integers.

> If we allow it, should we also allow people to take the 
> NaNth element of an array?  
NaN is already a "number" (internal representation), so it doesn't get 
converted.
As there is no NaNth element, it would return either undef (as in (0,1,2)[8] ) 
or an exception, as it is no numeric index.

> How about the 'foobar'th element? 
'foobar' is converted to a number, so the 0th element is taken.

> What happens if I take the Infiniteth element of a finite list?
undef, as in 8th element of (1,2,3).

> I think I would prefer if using Inf as an array index resulted in a
> trappable error.
That's a possibility. It could raise an exception as with NaN.

To summarize:
@x= ('a', 5 .. Inf, 'b');
$x[0] is 'a'
$x['foo'] is 'a'
$x[-1] is 'b'
$x[2] is 6
$x[2002] is 2006
I believe these are clear and understandable.

$x[Inf] is 'b'
$x[-2] is Inf
$x[-10] is Inf
These would result in simply interpolating the indices.

$x[NaN] gets an exception
because NaN is already of numeric type (as in $x=tan(pi/2)), but can not be 
associated to any index.


So I'd propose to solve this argument based on "can be used as an index".
An infinite array (and even a finite one) can be asked for an infinite index - 
which has a value for infinite arrays.
This is just so there's no special coding for some indices necessary - imagine 
a lookup like
@x = (10,9,9,8,8,8,6,3,2,1,1,1,1,0);
$number = scalar(<STDIN>)+0;
print $x[10/$number];
which would work for *any* input, and just give undef for most of them.
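
A minimal Python sketch of such a "never fails" lookup is given below. Python raises on division by zero and on out-of-range indices, so the sketch maps those cases by hand; treating 10/0 as Inf and returning None for any unusable index are assumptions made to mirror the behaviour proposed above.

```python
import math

TABLE = (10, 9, 9, 8, 8, 8, 6, 3, 2, 1, 1, 1, 1, 0)


def lookup(table, number):
    """Return table[10 / number], or None when that index cannot be used."""
    try:
        index = 10 / number
    except ZeroDivisionError:
        index = math.inf                 # assumption: 10/0 behaves like Inf
    if math.isnan(index) or math.isinf(index):
        return None                      # no element at an infinite index of a finite table
    try:
        return table[int(index)]         # negative indices count from the end
    except IndexError:
        return None
```

For example, an input of 5 selects index 2, while 0 or 0.5 fall outside the table and give None instead of an error.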


Regards,

Phil


BTW: is it possible to define a look-up table as in
@x = (1, 2, 3, 4, 5, Inf .. Inf)
to get everything from [5] on to be Inf?



Re: push with lazy lists

2004-07-13 Thread Ph. Marek
> >--- Larry Wall <[EMAIL PROTECTED]> wrote:
> >>  The hard part being to pick a random number in [0,Inf) uniformly. :-)
> >
> >Half of all numbers in [0, Inf) are in the range [Inf/2, Inf). Which
> >collapses to the range [Inf, Inf). Returning Inf seems to satisfy the
> >uniform distribution requirement: if you have a number you're waiting
> >to see returned, just wait a bit longer...
>
> I like the 1/n trick used in the Perl Cookbook (Picking a Random Line from
> a File).  We could apply the same idea here:
>
>   rand($_)<1 && ($chosen=$_) for 1...Inf;
I don't believe that that could give you a value ...
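
For reference, the 1/n trick is reservoir sampling with a reservoir of one. The Python sketch below (the function name pick_one is hypothetical) shows why it needs the stream to end: the choice is only known after the last item has been seen.

```python
import random


def pick_one(stream, rng=random):
    """Pick one item uniformly from a finite stream without knowing its length."""
    chosen = None
    for count, item in enumerate(stream, start=1):
        # Keep the new item with probability 1/count; the first item is always kept.
        if rng.random() * count < 1:
            chosen = item
    return chosen   # only reached when the stream is exhausted


sample = pick_one(range(1, 101), random.Random(0))
```

On a stream that never ends, the loop never finishes and no value is ever returned.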

> All right, it would take a bit longer for your program to run, but that's
> a performance issue for them to sort out on *-internals.
Like, it would take a bit longer than your lifetime :-)?

>-David "sure Moore's Law will deal with it in a year or two" Green
'And my new '986 does the infinite loop in under 3.5 seconds' :-)


To repeat Dave and myself - if
@x = 1 .. Inf;
then
rand(@x)
should be Inf, and so
print $x[rand(@x)];
should give Inf, as the infinite element of @x is Inf.


But maybe we could get an index of Inf working like -1 (ie. the last value): 
@x = 1 .. Inf;
push @x, "a";
print $x[Inf];
would print an "a" ...

although, on this line of reasoning,
print $x[rand(@x)];
would always print "a" 


I believe that an array should get an .rand-Method, which could do the right 
thing.
@x= (1 .. Inf, "b", -Inf .. -1, "c", 1 .. Inf);
print $x[rand(@x)],"\n" while (1);
could give
Inf
Inf
-Inf
b
c
Inf
-Inf
and so on - a "random" element of a random part of the array, and an infinite 
list gives Inf (or -Inf) as a random element (as explained above in this 
thread).

So an array would have to know of how many "pieces" it is constructed, and 
then choose an element among the pieces ...

I'd think that's reasonable, isn't it?
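
A rough Python sketch of this piecewise choice follows. It assumes that each infinite range is represented only by the infinity it runs towards and that pieces are chosen uniformly; both are simplifications made for illustration, and random_element is a hypothetical name.

```python
import math
import random


def random_element(pieces, rng=random):
    """Pick a piece uniformly, then an element of it; an infinite piece yields its infinity."""
    piece = rng.choice(pieces)
    if isinstance(piece, float) and math.isinf(piece):
        return piece
    return rng.choice(piece)


# Stand-in for (1 .. Inf, "b", -Inf .. -1, "c", 1 .. Inf): infinite ranges become +/-Inf.
pieces = [math.inf, ["b"], -math.inf, ["c"], math.inf]
rng = random.Random(1)
draws = [random_element(pieces, rng) for _ in range(200)]
```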


Regards,

Phil



Re: push with lazy lists

2004-07-12 Thread Ph. Marek
On Thursday 08 July 2004 05:25, Larry Wall wrote:
> : say @x[rand];  # how about now?
>
> Well, that's always going to ask for @x[0], which isn't a problem.
> However, if you say rand(@x), it has to calculate the number of
> elements in @x, which could take a little while...
I'd expect rand(@x) to be rand(1)*@x = rand(1)*Inf = Inf or NaN.

Case 1 (Inf) would give Inf (which can be argued, since infinitely many more 
elements are bigger than any given finite number), and case 2 could give an 
exception ...
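
With IEEE 754 floats the two cases fall out of the multiplication directly. The Python snippet below fixes the random factor by hand to show both; scaled_random is a hypothetical helper, and Python yields NaN silently here rather than raising.

```python
import math


def scaled_random(unit_random, size):
    """Scale a random factor from [0, 1) by the number of elements."""
    return unit_random * size


inf_case = scaled_random(0.5, math.inf)   # any non-zero factor gives Inf
nan_case = scaled_random(0.0, math.inf)   # a factor of exactly 0 gives NaN
```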


Regards,

Phil


Re: question regarding rules and bytes vs characters

2004-07-11 Thread Ph. Marek
> : Hello everybody,
> :
> : I'm about to teach myself perl6 (after using perl5 for some time).
>
> I'm also trying to learn perl6 after using perl5 for some time.  :-)
I wouldn't even try to compare you and me  :-)

> Pretty close.  The way it's set up currently, $len is a reference
> to a variable external to the rule, so $len is likely to fail under
> stricture unless you've declared "my $len" somewhere.  To make the
> variable automatically scope to the rule, you have to use $?len
> these days.
ok.

> : And furthermore is perl6 said to be unicode-ready.
> : So I put the :u0-modifier in the data-regex; will that DWIM if I try to
> : match a unicode-string with that rule?
>
> It should.  However (and this is a really big however), you'll have
> to be very careful that something earlier hasn't converted one form
> of Unicode to another on you.  For instance, if your string came in
> as UTF-8, and your I/O layer translated it internally to UTF-32 or
> some such, you're just completely hosed.  When you're working at the
> bytes level, you must know the encoding of your string.
>
> So the natural reaction is to open your I/O handle :raw to get binary
> data into your string.  Then you try to match Unicode graphemes with [
> :u2 . ] and discover that *that* doesn't work.  Which is obvious when
> you consider that Perl has no way of knowing which Unicode encoding
> the binary data is in, so it's gonna consider it to be something like
> Latin-1 unless you tell it otherwise.  So you'll probably have to
> cast the binary string to whatever its actual encoding is (potentially
> lying about the binary parts, which we may or may not get away with,
> depending on who validates the string when), or maybe we just need
> to define rules like  and  for use
> under the :u0 regime.
Of course the file must be opened in binary mode - else the line-endings etc. 
can be destroyed in the binary data, which is bad.

So Perl/Parrot can't autodetect the kind of encoding.
But maybe it should be possible to do something like
[:utf16be_codepoint]? Len: $?len:=(\d+) \n
$?data:=([:raw .]<$len>) \n
ie. say that the conversion to unicode is optional??

> : Is anything known about the internals of pattern matching whether the
> : hypothetical variables will consume (double) space?
> : I'm asking because I imagine getting a tag like "Len: 2" and then
> : having problems with 256MB RAM. Matching shouldn't be a problem according
> : to apo 5 (see the chapter "RFC 093: Regex: Support for incremental
> : pattern matching") but I'll maybe have troubles using the matched data?
>
> My understanding is that Parrot implements copy-on-write, so you should
> be okay there.
ok, thank you.

> Even the late ones?  :-)
even them - this is the *only* answer I received.

Again:
> : Thank you for all answers!

> Larry
Phil


question regarding rules and bytes vs characters

2004-05-31 Thread Ph. Marek
Hello everybody,

I'm about to teach myself perl6 (after using perl5 for some time).

One of my first questions deals with regexes.


I'd like to parse data of the form
Len: 15\n
(15 bytes data)\n
Len: 5\n
(5 bytes data)\n
\n
OtherTag: some value here\n
and so on, where the data can (and will) be binary.

I'd try for something like
my $data_tag= rule { 
Len\: $len:=(\d+) \n 
$data:=([:u0 .]<$len>)\n  # these are bytes
};

Is that correct?
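
For comparison, here is how the same length-prefixed format can be parsed at the byte level in Python. This is a sketch of the format described above, not of Perl 6 rules; parse_records is a hypothetical name and the error handling is an assumption.

```python
import re

HEADER = re.compile(rb"Len: (\d+)\n")


def parse_records(buffer):
    """Return the payloads of consecutive 'Len: N' records and the offset where parsing stopped."""
    records = []
    offset = 0
    while True:
        header = HEADER.match(buffer, offset)
        if header is None:
            break                                  # not a Len record: stop here
        start = header.end()
        end = start + int(header.group(1))
        if buffer[end:end + 1] != b"\n":
            raise ValueError("record is not terminated by a newline")
        records.append(buffer[start:end])          # payload bytes, may contain newlines
        offset = end + 1
    return records, offset


data = b"Len: 5\nab\ncd\nLen: 2\nxy\n\nOtherTag: some value here\n"
payloads, stopped_at = parse_records(data)
```

Because the payload is sliced by byte count rather than matched character by character, embedded newlines or arbitrary binary values inside it are not a problem.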

Furthermore, perl6 is said to be Unicode-ready.
So I put the :u0-modifier in the data-regex; will that DWIM if I try to match 
a unicode-string with that rule?


Is anything known about the internals of pattern matching whether the 
hypothetical variables will consume (double) space?
I'm asking because I imagine getting a tag like "Len: 2" and then 
having problems with 256MB RAM. Matching shouldn't be a problem according to 
apo 5 (see the chapter "RFC 093: Regex: Support for incremental pattern 
matching") but I'll maybe have troubles using the matched data?


Thank you for all answers!


Regards,

Phil


Re: The Sort Problem

2004-02-13 Thread Ph. Marek
On Friday, 13 February 2004 01:40, Larry Wall wrote:
> On Thu, Feb 12, 2004 at 04:29:58PM -0500, Uri Guttman wrote:
> : again, confusing. why should the order of a binary operator mean so
> : much? the order of a sort key is either ascending or descending. that is
> : what coders want to specify. translating that to the correct operator
> : (cmp or <=>) and the correct binary order is not the same as specifying
> : the key sort order and key type (int, string, float).
>
> Uri is dead on with this one, guys.
As I read these mails, I get the feeling that something like this is 
wanted:

Key generation:
@unsorted_temp = map {
   $k1=$_.func1('a');   # ASC
   $k2=$_.func2('we');  # DESC
   [ $_, $k1, $k2 ];
 } @unsorted;
Now we've got an array with keys and the objects.
Sorting:
@sorted = sort {
  $a->[1] cmp $b->[1] ||
  $b->[2] <=> $a->[2]
} @unsorted_temp;


These things would have to be said in P6.
So approx.:
@sorted = @unsorted.sort(
  keys => [ { $_.func1('a'); },
            { $_.func2('we'); } ],
  cmp => [ cmp, <=> ],
  order => [ "asc", "desc"],
  key_generation => "lazy",
);

That would explain what I want.
Maybe we could turn the parts around:

@sorted = @unsorted.sort(
  1 => [ { $_.func1('a'); }, cmp, "asc"],
  2 => [ { $_.func2('we'); }, <=>, "desc"],
);

or maybe use a hash instead of an array:

@sorted = @unsorted.sort(
  1 => [ key => { $_.func1('a'); }, op => cmp, order => "asc"],
  2 => [ key => { $_.func2('we'); }, op => <=>, order => "desc"],
);


Is that too verbose? I don't think so; I've stumbled often enough on $a <=> 
$b vs. $b <=> $a and similar, and the above just tells what should be done.
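
The "say the key and the direction, not the operand order" idea can be sketched in Python with one stable sort per criterion, applied from the least significant key to the most significant. The helper sort_by and the (key, descending) pairs are assumptions for this example.

```python
def sort_by(items, *criteria):
    """Sort by several (key_function, descending) criteria, most significant first."""
    result = list(items)
    # Stable sorts applied in reverse order of significance compose into a multi-key sort.
    for key, descending in reversed(criteria):
        result.sort(key=key, reverse=descending)
    return result


people = [("bob", 30), ("alice", 25), ("bob", 40), ("alice", 35)]
by_name_then_age_desc = sort_by(
    people,
    (lambda person: person[0], False),   # name, ascending
    (lambda person: person[1], True),    # age, descending
)
```

Nobody has to remember whether the comparison was written as a-versus-b or b-versus-a; the direction is stated next to the key.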


Regards,

Phil



Re: The Sort Problem

2004-02-12 Thread Ph. Marek
> ...
> so here is a (very rough and probably broken) syntax idea building on
> that:
>
> sort :key { :descend :string .foo('bar').substr( 10, 3) }
>
>  :key {  :int .foo('baz') }
>  :key {  :float .foo('amount') } @unsorted ;
I see a kind of problem here: If the parts of the key are not fixed length but 
can vary you can put them in strings *only* after processing all and 
verifying the needed length.

Example:
sort :key { :descend :string .foo('bar') }
  :key {  :int .foo('baz') }
  :key {  :float .foo('amount') } @unsorted ;

Now .foo('bar') isn't bounded with any length - so you don't know how much 
space to reserve.


And I believe that 
- generating keys on every list element
- storing them into an array (array of array) and
- after having processed all checking the length, and
- now generate the to-be-sorted-strings 
- sort

isn't the optimal way.
BTW: this requires that *all* keys are generated.
In cases like
- by name,
- by age,
- by height,
- by number of toes left,
- and finally sort by the social security number

most of the extractions (and possibly database queries or calculations or 
whatever) will not be done - at least in the current form of
sort { $a->{"name"} cmp $b->{"name"} ||
 $a->{"age"} <=> $b->{"age"} || 
...

That is to say, I very much like the syntax you propose, but I'm not sure if 
pre-generating *every* key-part is necessarily a speed-up.

If there are expensive calculations you can always cut them short by 
pre-calculating them into a hash by object, and just query this in sort.
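
A minimal Python sketch of lazy, cached key generation follows: later key parts are computed only when earlier parts tie, and each part is computed at most once per element. The names lazy_sort, by_name and by_number are hypothetical, and the call log exists only to show how often the second key is needed.

```python
from functools import cmp_to_key


def lazy_sort(items, key_functions):
    """Sort by several keys, computing each key part on demand and caching it."""
    cache = {}

    def key_part(index, level):
        slot = (index, level)
        if slot not in cache:
            cache[slot] = key_functions[level](items[index])
        return cache[slot]

    def compare(left, right):
        for level in range(len(key_functions)):
            a, b = key_part(left, level), key_part(right, level)
            if a != b:
                return -1 if a < b else 1
        return 0

    order = sorted(range(len(items)), key=cmp_to_key(compare))
    return [items[index] for index in order]


second_key_calls = []


def by_name(person):
    return person[0]


def by_number(person):
    second_key_calls.append(person)   # record how often the second key is needed
    return person[1]


people = [("carol", 3), ("alice", 1), ("bob", 2), ("alice", 0)]
result = lazy_sort(people, [by_name, by_number])
```

Only the two entries that share a name ever have their second key computed; the price is a comparison callback per pair instead of one key tuple per element.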


Also I fear that the amount of memory necessary to sort an array of length N 
is not N*2 (unsorted, sorted), but more like N*3 (unsorted, keys, sorted), 
which could cause trouble on bigger arrays.


Regards,

Phil



calling functions/class methods

2004-01-30 Thread Ph. Marek
Hello everybody,

first of all please forgive me if I'm using the wrong words - I'm not up to 
date about the (current) meanings of methods, functions, etc.


I read the article
http://www.cuj.com/documents/s=8042/cuj0002meyers/

There is stated (short version - read article for details):
In C++ there are member functions, which are called via
object.member(parameter),
and non-member (possibly friend) functions, which are called via
function(object,parameter).

I wondered whether perl6 could do both:
- When called via object.member, look for a member function; if it is not 
found, look for a function with this name, which takes an object as first 
parameter.
- When called the other way, look first for the function, then for a member.

So both ways are possible, and in the (not-interfering) normal situation (only 
one of member/function defined) it would support encapsulation, in that a 
caller does not need to know if this function was a member or not.
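
The two lookup orders can be sketched in Python as follows. The helpers call_method_first and call_function_first, the Account class and the functions table are all hypothetical and exist only to illustrate the fallback; no real language resolves calls exactly like this.

```python
def call_method_first(obj, name, functions, *args):
    """object.member style: try a method, then a free function taking the object first."""
    method = getattr(obj, name, None)
    if callable(method):
        return method(*args)
    if name in functions:
        return functions[name](obj, *args)
    raise AttributeError(f"no method or function named {name!r}")


def call_function_first(obj, name, functions, *args):
    """function(object) style: try a free function, then a method."""
    if name in functions:
        return functions[name](obj, *args)
    method = getattr(obj, name, None)
    if callable(method):
        return method(*args)
    raise AttributeError(f"no function or method named {name!r}")


class Account:
    def __init__(self, balance):
        self.balance = balance

    def deposit(self, amount):
        self.balance += amount
        return self.balance


def describe(account, prefix):
    return f"{prefix}: {account.balance}"


functions = {"describe": describe}
account = Account(10)
after_deposit = call_method_first(account, "deposit", functions, 5)
description = call_method_first(account, "describe", functions, "now")
```

The caller writes the same call whether describe is a member or not; the two orders only differ when both a member and a function of that name exist.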


I fear that I'm on a completely wrong track, or that this has been decided - 
but I didn't find something about this.


Regards,

Phil



Re: Next Apocalypse

2003-09-15 Thread Ph. Marek
> Because there are some assertions that can lead the optimizer to make some
> fundamental assumptions, and if those assumptions get violated or
> redefined while you're in the middle of executing a function that makes
> use of those assumptions, well...
>
> Changing a function from pure to impure, adding an overloaded operator, or
> changing the core structure of a class can all result in code that needs
> regeneration. That's no big deal for code you haven't executed yet, but if
> you have:
>
> a = 1;
> b = 12;
> foo();
> c = a + b;
>
> and a and b are both passive classes, that can get transformed to
>
> a = 1;
> b = 12;
> foo();
> c = 13;
>
> but if foo changes the rules of the game (adding an overloaded + to a or
> b's class) then the code in that sub could be incorrect.
>
> You can, of course, stop even potential optimization once the first "I can
> change the rules" operation is found, but since even assignment can change
> the rules that's where we are right now. We'd like to get better by
> optimizing based on what we can see at compile time, but that's a very,
> very difficult thing to do.
How about retaining some "debug" info (line numbers come to mind), but only at 
expression level??
So in your example if foo() changed the + operator, it would return into the 
calling_sub() at expression 4 (numbered from 1 here :-), notice that 
something has changed, recompile the sub, and continue processing at 
expression 4.
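
Very roughly, the idea is a guard after each call that could change the rules. The Python sketch below models the "rules" with a version counter; the Rules class, its override_add method and the hand-folded constant are assumptions made for illustration and are far simpler than a real optimizer.

```python
class Rules:
    """Hypothetical stand-in for the operator definitions the optimizer relied on."""

    def __init__(self):
        self.version = 0
        self.add = lambda left, right: left + right

    def override_add(self, implementation):
        self.add = implementation
        self.version += 1              # any change invalidates earlier assumptions


def calling_sub(rules, foo):
    a = 1
    b = 12
    assumed_version = rules.version    # the folded constant below relies on this
    foo(rules)
    if rules.version == assumed_version:
        c = 13                         # optimized path: a + b folded ahead of time
    else:
        c = rules.add(a, b)            # rules changed: continue on the generic path
    return c


unchanged = calling_sub(Rules(), lambda rules: None)
changed = calling_sub(Rules(), lambda rules: rules.override_add(lambda left, right: left - right))
```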

Phil



Re: regex matching from a position ?

2003-02-12 Thread Ph. Marek
> Phil, please see the perlfunc entry for "pos" and the perlre section
> on \G.  This is what you need.
Thanks a lot! I know about pos but thought it was read-only.
And \G is relatively new, isn't it? Certainly wasn't 
existing in '97 when I learned perl :-) 
And the "basics" are seldom read again in the docs...


Thank you very much, although it's still 32% slower:


2505792 bytes to do ...
Benchmark: timing 1000000 iterations of from_start, pos, re_dyn, re_once, substr, substr_set...
from_start:  2 wallclock secs ( 1.06 usr +  0.00 sys =  1.06 CPU) @ 943396.23/s (n=1000000)
       pos:  0 wallclock secs ( 1.55 usr +  0.01 sys =  1.56 CPU) @ 641025.64/s (n=1000000)
    re_dyn:  7 wallclock secs ( 6.13 usr +  0.00 sys =  6.13 CPU) @ 163132.14/s (n=1000000)
   re_once:  2 wallclock secs ( 1.22 usr +  0.00 sys =  1.22 CPU) @ 819672.13/s (n=1000000)
    substr:  2 wallclock secs ( 2.39 usr +  0.01 sys =  2.40 CPU) @ 416666.67/s (n=1000000)
substr_set:  3 wallclock secs ( 3.10 usr +  0.00 sys =  3.10 CPU) @ 322580.65/s (n=1000000)
               Rate re_dyn substr_set substr  pos re_once from_start
re_dyn     163132/s     --       -49%   -61% -75%    -80%       -83%
substr_set 322581/s    98%         --   -23% -50%    -61%       -66%
substr     416667/s   155%        29%     -- -35%    -49%       -56%
pos        641026/s   293%        99%    54%   --    -22%       -32%
re_once    819672/s   402%       154%    97%  28%      --       -13%
from_start 943396/s   478%       192%   126%  47%     15%         --


Regards,

Phil

#!/usr/bin/perl

use Benchmark qw(cmpthese);


$pos=500;
$runs=1000000;
$_=`cat /etc/* 2> /dev/null`;
study $_;

print length($_), " bytes to do ...\n";

cmpthese($runs,
{
  "from_start"  => sub { m/\S*\s+(\S+)/; },
  "re_dyn"  => sub { m/^[\x00-\xff]{$pos}\S*\s+(\S+)/; },
  "re_once" => sub { m/^[\x00-\xff]{$pos}\S*\s+(\S+)/o; },
  "substr" => sub { substr($_,$pos) =~ m/\S*\s+(\S+)/; },
  "substr_set" => sub { $tmp=substr($_,$pos); $tmp =~ m/\S*\s+(\S+)/; },
  "pos"  => sub { pos($_) = $pos; m/\G\S*\s+(\S+)/; },  # set the start offset that \G anchors to
}
);




regex matching from a position ?

2003-02-11 Thread Ph. Marek
Hello everybody,

I sometimes have the task of analysing a string 
starting from a given position, where this position 
changes after each iteration (like index() does).


As this is perl there are MTOWTDIIP but I'd like 
to know the fastest.

So I used Benchmark.pm to find that out. (script attached)


Excerpt from script:
  "from_start"  => sub { m/\S*\s+(\S+)/; },
  "re_dyn"  => sub { m/^[\x00-\xff]{$pos}\S*\s+(\S+)/; },
  "re_once" => sub { m/^[\x00-\xff]{$pos}\S*\s+(\S+)/o; },
  "substr" => sub { substr($_,$pos) =~ m/\S*\s+(\S+)/; },
  "substr_set" => sub { $tmp=substr($_,$pos); $tmp =~ m/\S*\s+(\S+)/; },

from_start is for comparison only, showing the speed it should have.
re_once is for comparison too, as the index can't be adjusted
(and dynamically recompiling via eval() for changing indexes can't be fast enough).


Results:

2505792 bytes to do ...
Benchmark: timing 1000000 iterations of from_start, re_dyn, re_once, substr, substr_set...
from_start:  1 wallclock secs ( 1.26 usr + -0.01 sys =  1.25 CPU) @ 800000.00/s (n=1000000)
    re_dyn:  9 wallclock secs ( 6.52 usr +  0.00 sys =  6.52 CPU) @ 153374.23/s (n=1000000)
   re_once:  1 wallclock secs ( 1.26 usr +  0.01 sys =  1.27 CPU) @ 787401.57/s (n=1000000)
    substr:  4 wallclock secs ( 2.36 usr +  0.02 sys =  2.38 CPU) @ 420168.07/s (n=1000000)
substr_set:  5 wallclock secs ( 3.23 usr +  0.00 sys =  3.23 CPU) @ 309597.52/s (n=1000000)
               Rate re_dyn substr_set substr re_once from_start
re_dyn     153374/s     --       -50%   -63%    -81%       -81%
substr_set 309598/s   102%         --   -26%    -61%       -61%
substr     420168/s   174%        36%     --    -47%       -47%
re_once    787402/s   413%       154%    87%      --        -2%
from_start 800000/s   422%       158%    90%      2%         --


So: every possibility is *much* slower than necessary!
So I propose (I know that I'm a bit late, but who cares ... :-) 
a new option for regexes (like each, case-insensitive, 
and match-multiple-times) which allows specifying a 
position to start matching. That should be *no* overhead!
eg:
$text.m:from500:i /\s*(\S+)/;


Currently the substr() is the fastest available option - unless somebody
has more imagination than me (which I take as given).

So, is there a faster possibility, is that no problem for perl6, 
or will something like this be implemented?
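
For comparison, Python's compiled patterns accept a start offset directly, which is the kind of option asked for here. The sketch below uses the same pattern as the benchmark; the helper next_word is hypothetical, and nothing is implied about the speed of any Perl implementation.

```python
import re

WORD = re.compile(r"\S*\s+(\S+)")


def next_word(text, position):
    """Match at the given offset; return the captured word and where it starts."""
    match = WORD.match(text, position)   # match() with pos anchors at that offset
    if match is None:
        return None, position
    return match.group(1), match.start(1)


text = "alpha beta gamma"
first, first_at = next_word(text, 0)
second, second_at = next_word(text, first_at)
third, third_at = next_word(text, second_at)
```

Each call continues from the offset returned by the previous one, without copying the string as the substr variants do.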



Regards,

Phil


#!/usr/bin/perl

use Benchmark qw(cmpthese);


$pos=500;
$runs=1000000;
$_=`cat /etc/* 2> /dev/null`;
study $_;

print length($_), " bytes to do ...\n";

cmpthese($runs,
{
  "from_start"  => sub { m/\S*\s+(\S+)/; },
  "re_dyn"  => sub { m/^[\x00-\xff]{$pos}\S*\s+(\S+)/; },
  "re_once" => sub { m/^[\x00-\xff]{$pos}\S*\s+(\S+)/o; },
  "substr" => sub { substr($_,$pos) =~ m/\S*\s+(\S+)/; },
  "substr_set" => sub { $tmp=substr($_,$pos); $tmp =~ m/\S*\s+(\S+)/; },
}
);

  





hyper/vector operation operator

2002-11-27 Thread Ph. Marek
Hello everyone!


First of all - I do not closely follow perl6/parrot development. I read "this 
week on perl6" on www.perl.com but that's it - so if I'm completely off the 
track, let me know.


Regarding the discussions about the hyper operator (eg adding elements of 2 
arrays into another array) I've had the following idea: use "=>"

- in perl5 there is an operator "=>" which is used in associative array 
assignment. In perl6 this means "pairs" IIRC, which could get interpreted as 
"add pairs of numbers"
- it has a nice visual feeling: combines 2 elements (two lines) into 1 (one 
end).


So a usage could be
@a = @b =>+ @b;
@a = @b =+> @b;
@a = @b +=> @b;
where the 2nd form would be the most intuitive (from reading this source).
Hmm, that would leave us with
@a =+>= @b;
which ain't as pretty.
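
Whatever the spelling ends up being, the operation itself is an element-wise application of a binary operator. A minimal Python sketch, with the hypothetical helper hyper and the assumption that both arrays must have the same length:

```python
import operator


def hyper(op, left, right):
    """Apply a binary operator to corresponding elements of two equally long lists."""
    if len(left) != len(right):
        raise ValueError("both lists must have the same length in this sketch")
    return [op(x, y) for x, y in zip(left, right)]


b = [1, 2, 3]
a = hyper(operator.add, b, b)            # like the proposed "@a = @b =+> @b"
a = hyper(operator.add, a, [10, 20, 30]) # like the assignment form "@a =+>= @b"
```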


What do you think?


Regards,

Phil