Re: exegesis 5 question: matching negative, multi-byte strings

2002-10-02 Thread Markus Laire

On 1 Oct 2002 at 18:47, [EMAIL PROTECTED] wrote:

   all text up to, but not including the string union.
 
  rule getstuffbeforeunion { (.*?) union | (.*) }
  
  a union => a
  b => b
 
 hmm... well, it works, but it's not very efficient. It basically
 scans the whole string to the end to see if there is a union string, and
 then backtracks to take the alternative. And hence, it's not very scalable.
 It also doesn't 'complexify' very well.

What about

Perl 5:   /(.*?)(?:union|$)/
Perl 6:   /(.*?) [union | $$]/

or if you want to exclude 'union' from the match

Perl 5:   /(.*?)(?=union|$)/
Perl 6:   /(.*?) [after: union | $$]/

IMHO those should scan the string one char at a time until 'union' or
end-of-string, which is the optimal solution.
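
A minimal Perl 5 sketch of the second (lookahead) form, wrapped in a
helper so it can be tried on the two sample inputs. The helper name
before_union is hypothetical, and the \A and \z anchors and the /s flag
are additions of this sketch, assumed so that the match starts at the
beginning of the string and '.' can cross newlines.

```perl
use strict;
use warnings;

# Return the text before the first 'union', or the whole string
# when 'union' does not occur at all.
sub before_union {
    my ($text) = @_;
    # (.*?) grows one character at a time until the lookahead sees
    # either 'union' or the true end of the string (\z).
    my ($prefix) = $text =~ /\A(.*?)(?=union|\z)/s;
    return $prefix;
}

print "[", before_union("a union = a"), "]\n";   # text before 'union'
print "[", before_union("b"), "]\n";             # no 'union': whole string
```

Because the lookahead consumes nothing, 'union' itself stays outside
both the capture and the overall match.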

-- 
Markus Laire 'malaire' [EMAIL PROTECTED]





Re: Subject-Oriented Programming

2002-10-02 Thread Andy Wardley

On Mon, Sep 30, 2002 at 11:22:02PM -0400, Michael G Schwern wrote:
 Last year at JAOO I stumbled on this thing called Subject-Oriented
 Programming which looked interesting.  

There are a bunch of advanced programming techniques like this that
all fit under the same umbrella:

  * Subject Oriented Programming (IBM)
  * Aspect Oriented Programming (Xerox Parc)
  * Composition Filters
  * Adaptive Programming - Demeter Method and Propagation Patterns

They're all attempting to achieve the same goal - a clear separation of
concerns - and although the details are different, the underlying
principle is the same.

They are all metaprogramming layers which allow you to put wrappers around 
your code, and in particular your objects.  These wrappers intercept method 
calls and can modify them, redirect them, and otherwise jiggle around with 
them in all sorts of interesting ways.

For example, AOP identifies cross-cutting aspects (like error handling,
logging, etc) which cut across an entire system.  It's considered bad
form to implement logging in each of the 20 object classes you're using
because you've lost the separation of concerns.  If you want to change
the way logging is handled, you have to go and change 20 classes.  You 
can use inheritance or delegation to achieve a clear separation of 
concerns, but these have their own problems.  The inheritance route
tends to lead to large and cumbersome object hierarchies that are fragile
to change and beset with the problems of multiple inheritance.  Delegation
is sometimes better, but tends to lead to highly fragmented programs
where it is difficult to see the wood for the trees.

AOP tackles this problem by allowing you to define different aspects
of your program (e.g. database access, user interface, error handling,
logging, etc.) and then weave them together by defining join points
and leaving the pre-processor to join up the dots.  Template processing
systems are a simple example of AOP.  If you write a CGI script with
Perl code and HTML fragments interleaved then you make it hard to read,
update and understand.  Better is to define your Perl code in one place
(the implementation aspect, or model in another parlance) and your
HTML markup in another (the presentation aspect, or view) and then leave
it to your favourite template processor to weave the two together.

The approach of SOP is to define an object-like entity, the subject,
which is a wrapper around one or more objects.  This acts like a facade
around the inner objects, allowing method calls to the subject to be 
directed to the appropriate inner object, possibly with various forms
of manipulation taking place en route.

For example, you might have an employee object defined by your accounts 
department which includes payroll number, employee information, salary,
etc.  You might want to reuse this object in another application but 
without exposing salary details for employees.  By defining a subject
to enclose the object, you can effectively hide these fields.  You can 
also create composite subjects where an employee has a dozen fields,
methods, etc., which are implemented by 3 different underlying object 
classes.  Or you might want to implement some kind of security layer in 
your subject so that only certain privileged people have access to 
sensitive information.   You do all these things in the subject wrapper
and don't need to change your underlying objects.
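
As a rough Perl 5 sketch of that employee example: the class names
Employee and EmployeeSubject, the %exposed whitelist and the use of
AUTOLOAD are all assumptions made for illustration, not part of any SOP
toolkit. The subject forwards only whitelisted methods to the inner
object, which itself is left unchanged.

```perl
use strict;
use warnings;

package Employee;
sub new    { my ($class, %fields) = @_; return bless {%fields}, $class }
sub name   { return $_[0]{name} }
sub salary { return $_[0]{salary} }

package EmployeeSubject;
our $AUTOLOAD;
my %exposed = map { $_ => 1 } qw(name);   # methods visible through the subject

sub new { my ($class, $inner) = @_; return bless { inner => $inner }, $class }

# Every method call on the subject lands here and is either forwarded
# to the inner object or refused.
sub AUTOLOAD {
    my ($self, @args) = @_;
    (my $method = $AUTOLOAD) =~ s/.*:://;
    return if $method eq 'DESTROY';
    die "method '$method' is not exposed by this subject\n"
        unless $exposed{$method};
    return $self->{inner}->$method(@args);
}

package main;

my $employee = Employee->new(name => 'Alice', salary => 50_000);
my $subject  = EmployeeSubject->new($employee);

print $subject->name, "\n";
my $ok = eval { $subject->salary; 1 };
print "salary is hidden\n" unless $ok;
```

A composite subject or a security check would go in the same place: the
dispatch code decides which inner object, if any, receives the call.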

Composition Filters are very similar, but with filters in place of subjects.
You compose collections of objects and define how the method calls to them
should be filtered.

Adaptive Programming is slightly different in that it defines a methodology
for modelling a problem domain, and provides tools for generating code (C++)
to implement it.  The Demeter Method is a best-practice approach for 
designing your underlying objects so that they fit nicely together.
Propagation patterns are used to say, in effect, the shoulder bone is 
connected to the arm bone, you turn the handle and out comes your code,
with everything connected together and working in harmony as it should be.

This is a form of Generative Programming, which is the general term for
any process whereby you define a high-level model of a program and have
generators actually write the program for you (or more commonly, just the 
wiring code between existing objects).  This is slightly different from 
the usual metaprogramming approach of AOP and SOP which make extensive use
of C++ templates to pre-process your program code.

As for Perl implementations... hmmm.

The key step is to identify/implement a mechanism whereby we can 
put hooks into object vtables, allowing user code to intercept methods
called against objects.  Ruby has an AOP module (AspectR ISTR) which 
does this, making it trivially easy to intercept calls to a particular 
object method and perform some action on the way.  For things like debug 
tracing, logging, security layers, etc., this is invaluable.
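
In Perl 5 a similar kind of hook can be approximated by replacing the
entry in the package symbol table. The sketch below is only an
illustration under that assumption: wrap_method and the Account class
are hypothetical names, and this is not how AspectR or any CPAN module
is implemented.

```perl
use strict;
use warnings;

package Account;
sub new     { my ($class) = @_; return bless { balance => 0 }, $class }
sub deposit { my ($self, $amount) = @_; return $self->{balance} += $amount }

package main;

my @log;   # trace messages collected by the advice

# Replace Package::method with a wrapper that runs $before and then
# calls the original implementation with the same arguments.
sub wrap_method {
    my ($package, $method, $before) = @_;
    my $original = $package->can($method)
        or die "no method $method in $package\n";
    no strict 'refs';
    no warnings 'redefine';
    *{"${package}::${method}"} = sub {
        $before->($method, @_);
        return $original->(@_);
    };
}

wrap_method('Account', 'deposit', sub {
    my ($method, $self, @args) = @_;
    push @log, "$method(@args)";
});

my $account = Account->new;
$account->deposit(10);
$account->deposit(5);
print "$_\n" for @log;
```

The Account class never mentions logging; the tracing concern lives
entirely in the code passed to wrap_method.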

With such a 

Re: Interfaces

2002-10-02 Thread Michael G Schwern

On Tue, Oct 01, 2002 at 05:04:29PM -0700, Michael Lazzaro wrote:
 On Tuesday, October 1, 2002, at 02:49  PM, Michael Lazzaro wrote:
 Which implies, I assume, that interface is not the default state of 
 a class method, e.g. we do need something like method foo() is 
 interface { ... } to declare any given method
 
 Flippin' hell, never mind.  You're almost certainly talking about a 
 style like:
 
   interface Vehicle {
   method foo () { ... }
   method bar () { ... }
   }

Definitely not that.

 - or -
   class Vehicle is interface {
   ...
   }


snip

   class Vehicle {
   method foo () is interface { ... }
   method bar () is interface { ... }
   method zap () is private { ... }
   }

Perhaps both of the above, but only if methods-as-interfaces have to be
explicitly declared.  I like the class Vehicle is interface as a shorthand
for declaring every method of a class to be an interface.

It depends on if method signatures are enforced on subclasses by default, or
if you have to explicitly declare yourself to be an interface.  That's up in
the air in my mind.

Orthogonal to that decision is if a subclass should be able to explicitly
ignore its parent's interface and conditions.  I started by leaning towards
yes, now I'm thinking no.


-- 

Michael G. Schwern   [EMAIL PROTECTED]   http://www.pobox.com/~schwern/
Perl Quality Assurance  [EMAIL PROTECTED] Kwalitee Is Job One
Plus I remember being impressed with Ada because you could write an
infinite loop without a faked up condition.  The idea being that in Ada
the typical infinite loop would be normally be terminated by detonation.
-- Larry Wall in [EMAIL PROTECTED]



Re: Interfaces

2002-10-02 Thread Michael G Schwern

On Tue, Oct 01, 2002 at 04:01:26PM -0700, Michael Lazzaro wrote:
 On Tue, Oct 01, 2002 at 03:43:22PM -0400, Trey Harris wrote:
 You want something like
 
   class Car is Vehicle renames(drive => accel)
             is MP3_Player renames(drive => mp3_drive);
 
 I *really* like this, but would the above be better coded as:
 
   class Car is Vehicle renames(drive => accel)
             has MP3_Player renames(drive => mp3_drive);
 
 ... implying a container relationship with automatic delegation?  

That would simply be another way to do it.  One is multiple inheritance, the
other is delegation.  Both should be in the language.
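
For comparison, here is roughly how the delegation flavour looks when
written by hand in Perl 5 today. This is a sketch only: the method
bodies are placeholders, and the renaming is done manually rather than
by any renames() syntax.

```perl
use strict;
use warnings;

package Vehicle;
sub new   { my ($class) = @_; return bless {}, $class }
sub drive { return 'vehicle drive' }

package MP3_Player;
sub new   { my ($class) = @_; return bless {}, $class }
sub drive { return 'mp3 drive' }

package Car;
our @ISA = ('Vehicle');                    # is-a: inheritance

sub new {
    my ($class) = @_;
    my $self = Vehicle::new($class);
    $self->{player} = MP3_Player->new;     # has-a: contained object
    return $self;
}

# Delegate under a different name so the two drive methods do not clash.
sub mp3_drive { my ($self) = @_; return $self->{player}->drive }

package main;

my $car = Car->new;
print $car->drive, "\n";       # inherited from Vehicle
print $car->mp3_drive, "\n";   # delegated to the contained MP3_Player
```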


 Among the other considerations is that if you simply said
 
   class Car is Vehicle has MP3_Player;
 
 the inheritance chain could assume that Car.drive === Vehicle.drive, 
 because is-a (inheritance) beats has-a (containment or delegation).  If 
 you needed to, you should still be able to call $mycar.MP3_Player.drive 
 to DWYM, too.

This, too, is another way to do it, but I like Trey's original solution much
better.  When you use your MP3 Car as a Vehicle, the Vehicle methods win.
When you use it like an MP3_Player, the MP3_Player methods win.  No need to
expose the underlying MP3_Player object to the user.  YMMV.


-- 

Michael G. Schwern   [EMAIL PROTECTED]   http://www.pobox.com/~schwern/
Perl Quality Assurance  [EMAIL PROTECTED] Kwalitee Is Job One
List context isn't dangerous.  Misquoting Gibson is dangerous.
-- Ziggy



Re: Interfaces

2002-10-02 Thread Michael G Schwern

On Tue, Oct 01, 2002 at 02:49:49PM -0700, Michael Lazzaro wrote:
 My musing is that the behavior of a class in different contexts is
 itself an interface, in the sense of being a contract between a
 class/subclass and its users
 
 Ah HA!  Contract!  Return values can be enforced via a simple DBC post
 condition, no need to invent a whole new return value signature.
 
 I think I get it, but can you give some pseudocode? If you want a 
 method to return a list of Zoo animals in list context, and a Zoo 
 object in Zoo object context, what would that look like?

The trick is having some way of getting at the return value. Class::Contract
does this by having a magic value() function you can call in a
post-condition.  I can't think of anything better, so...

class Animals;
method zoo {
    ...

    # I have no idea what actual post condition syntax will look like
    post {
        given want {
            when 'LIST' { grep { $^thing.isa('Zoo::Animal') } value() }
            default     { value().isa('Zoo') }
        }
    }
}

It might be nice if the return value was the topic of the post condition,
but that leads to problems of how you deal with lists as topics which I
don't know if they've been solved.
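
The same idea can be sketched in today's Perl 5 with wantarray and a
closure. This is an illustration only and not Class::Contract's API:
with_post_condition is a hypothetical helper, and the calling context
and return value are passed to the check explicitly instead of through
a magic value() function.

```perl
use strict;
use warnings;

package Zoo::Animal;
sub new { my ($class) = @_; return bless {}, $class }

package Zoo;
sub new { my ($class) = @_; return bless {}, $class }

package main;

# Wrap $body so that $check sees the return value after every call.
sub with_post_condition {
    my ($body, $check) = @_;
    return sub {
        my $list_context = wantarray;
        my @value = $list_context ? $body->(@_) : scalar $body->(@_);
        $check->($list_context, @value)
            or die "post-condition failed\n";
        return $list_context ? @value : $value[0];
    };
}

my $zoo = with_post_condition(
    # body: a list of animals in list context, a Zoo object otherwise
    sub { return wantarray ? (Zoo::Animal->new, Zoo::Animal->new) : Zoo->new },
    # post-condition: one check per calling context
    sub {
        my ($list_context, @value) = @_;
        if ($list_context) {
            my @bad = grep { !$_->isa('Zoo::Animal') } @value;
            return !@bad;
        }
        return $value[0]->isa('Zoo');
    },
);

my @animals = $zoo->();
my $object  = $zoo->();
print scalar(@animals), " animals, one ", ref($object), "\n";
```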


 (I'm assuming that DBC postconditions on a method would be treated, 
 internally, as part of the overall signature/prototype of the method: 
 i.e. if you override the method in a subclass, all original 
 postconditions would still remain attached to it (though the new method 
 might itself add additional postconditions.))

That's how I understand it works.


-- 

Michael G. Schwern   [EMAIL PROTECTED]   http://www.pobox.com/~schwern/
Perl Quality Assurance  [EMAIL PROTECTED] Kwalitee Is Job One
I'm a man, but I can change... if I have to.
-- Red Green



Re: exegesis 5 question: matching negative, multi-byte strings

2002-10-02 Thread esp5

On Wed, Oct 02, 2002 at 10:39:17AM +0300, Markus Laire wrote:
 On 1 Oct 2002 at 18:47, [EMAIL PROTECTED] wrote:
 
all text up to, but not including the string union.
  
   rule getstuffbeforeunion { (.*?) union | (.*) }
   
   a union => a
   b => b
  
  hmm... well, it works, but it's not very efficient. It basically
  scans the whole string to the end to see if there is a union string, and
  then backtracks to take the alternative. And hence, it's not very scalable.
  It also doesn't 'complexify' very well.
 
 What about
 
 Perl 5:   /(.*?)(?:union|$)/
 Perl 6:   /(.*?) [union | $$]/
 
 or if you want to exclude 'union' from the match
 
 Perl 5:   /(.*?)(?=union|$)/
 Perl 6:   /(.*?) [after: union | $$]/
 

that's exceedingly slow, at least by my benchmark. So far, I've got 4 
possibilities:

use Benchmark qw(timethese);

# $line holds the input text being scanned (its contents are not shown here).
my $regex1 = qr{(?:(?!union).)*}sx;
my $regex2 = qr{(?:[^u]+|u[^n]|un[^i]|uni[^o]|unio[^n])*}sx;
my $regex3 = qr{(?:[^u]+|(?!union).)*}sx;
my $regex4 = qr{(.*?)(?=union|$)}sx;

timethese(
    10,
    {
        'questionbang'  => sub { ($line =~ m($regex1)); },
        'questionbang2' => sub { ($line =~ m($regex3)); },
        'alternation'   => sub { ($line =~ m($regex2)); },
        'nongreedy'     => sub { ($line =~ m($regex4)); },
    }
);


which come out:

alternation:    8 wallclock secs ( 7.71 usr +  0.00 sys =  7.71 CPU) @ 12970.17/s (n=10)
questionbang:  17 wallclock secs (16.05 usr +  0.00 sys = 16.05 CPU) @ 6230.53/s (n=10)
questionbang2:  8 wallclock secs ( 7.74 usr +  0.00 sys =  7.74 CPU) @ 12919.90/s (n=10)
nongreedy:     41 wallclock secs (41.74 usr +  0.00 sys = 41.74 CPU) @ 2395.78/s (n=10)


So yes, a form can be constructed out of ?! which is of approximately equal 
speed to the alternation.

However, in straight C, the corresponding time is:

2.31u 0.02s 0:02.37 98.3%

which tells me that a lot of optimisation could be made with a generic 
mechanism for (non)matching multi-byte character classes. The problem has 
to be dealt with anyways when considering unicode... And which form would people
rather type:

(-[^u]+|(?!union).)*

or
-[^'union']*

I'd say the second scores over the first in intuition, if nothing else...

Ed



Re: Subject-Oriented Programming

2002-10-02 Thread Michael Lazzaro


On Wednesday, October 2, 2002, at 03:11  AM, Andy Wardley wrote:
 There are a bunch of advanced programming techniques like this that
 all fit under the same umbrella:

   * Subject Oriented Programming (IBM)
   * Aspect Oriented Programming (Xerox Parc)
   * Composition Filters
   * Adaptive Programming - Demeter Method and Propagation Patterns

For those interested in exploring the concept, a perl5 implementation 
of AOP, by Marcel Grunauer, exists on CPAN:

http://search.cpan.org/author/MARCEL/Aspect-0.08/

MikeL




Re: exegesis 5 question: matching negative, multi-byte strings

2002-10-02 Thread Joe Gottman



 On 1 Oct 2002 at 18:47, [EMAIL PROTECTED] wrote:

all text up to, but not including the string union.

How about (Perl6)

  /(.*?) union {$pos -= length('union');}/

   This gets everything up to and including the first instance of 'union',
then gets rid of the bit at the end that we don't want.  The capturing
parentheses ensure that we return the part of the string we want, and we
manually reset $pos to the correct position.  This is easy to understand,
and very extensible.
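
A rough Perl 5 analogue of the same trick, for comparison. It is a
sketch under the assumption that stepping the match position back is
all that is needed; it uses the built-in @- array and pos() rather than
a code block inside the regex, and the sample text is made up.

```perl
use strict;
use warnings;

my $text = "select a union select b";
my $prefix;

# Find 'union' with /g so that Perl records the match position.
if ($text =~ /union/g) {
    my $start = $-[0];                    # offset where 'union' begins
    $prefix = substr($text, 0, $start);   # everything before it
    pos($text) = $start;                  # step back so 'union' is not consumed
    print "[$prefix]\n";
    print "next match starts at offset ", pos($text), "\n";
}
```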

Joe Gottman