Re: PDL-P: Re: Reduce [was: Re: Random items (old p5p issues)]

2000-08-05 Thread Ariel Scolnicov

Tuomas Lukka [EMAIL PROTECTED] writes:

 On 4 Aug 2000, Ariel Scolnicov wrote:
 
  Karl Glazebrook [EMAIL PROTECTED] writes:
  
   OK I will rise to the bait
   
   I think it's a bit unfair to say that PDL people have failed to 'bite',
   there was quite a bit of discussion on our list after your post. Also
   some concern about how much of perl6 is vapourware.
   
   I am game to take part in discussions. 
   
   It has always been apparent to me that Numerical Python is better integrated
   than PDL. Some language changes in core python WERE made to accommodate it,
   also Python had less syntax clutter to get around.
   
   I definitely support embedding many of the key PDL ideas into the language
   - the key one is a much easier syntax for a multi-dim slice. We are currently
   driven to
   
   $a->slice("10:100,30:200");
   
   compared to IDL AND NumPy: a[10:100,30:200]
  
  Perl doesn't have multi-dimensional arrays (yet, I hope), but it
  *does* spell `:' as "..", even today.  @x[7..9] is a 3-element list,
  which I don't see as any different from @x[7:9].  Does the slice share 
  the elements of @a in your example?
 
 Well, first of all, 
 
   10:100, 30:200
 
 is not the same: in Perl it comes out as
 
   10..100, 30..200
 
   10, 11, ..., 100, 30, 31, ..., 200
 
 whereas what we want is
 
   Span(10, 100), Span(30, 200)
 
 where Span is some suitable object telling that this span is a parameter.
 There are also other syntaxes for slice we would like to have but these
 can probably be kludged.
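
The distinction between a flattened list and a pair of span objects can be sketched in Python, which is roughly what the NumPy syntax quoted above relies on. This is only an illustration: `Recorder` is a hypothetical stand-in for an array and simply returns whatever its subscript received.

```python
class Recorder:
    """Hypothetical stand-in for an array: it only reports what the subscript received."""

    def __getitem__(self, key):
        return key


a = Recorder()

# Two spans arrive as two separate slice objects, one per dimension.
spans = a[10:100, 30:200]
print(spans)      # (slice(10, 100, None), slice(30, 200, None))

# The flattened reading, analogous to Perl's (10..100, 30..200):
flat = [*range(10, 101), *range(30, 201)]
print(len(flat))  # 262 plain integers, with the boundary between spans lost
```

Because each span survives as an object, the receiver can still tell where one dimension ends and the next begins, which the flattened list cannot express.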

I think we're confusing 2 separate issues here.  PDL seems to deal
with both, but let's keep them separate:

(1) Decent span objects for generating subarrays (e.g. 1:10:3, which
should probably be 1..10:3, or whatever the iterator sequence
works out as).

(2) Multidimensional arrays.  Today Perl does *not* have these.  As
Tuomas points out, the `,' operator is already too overloaded to
separate indices.  Note: I say that Perl has no multidimensional
arrays.  A list-of-lists is *not* a multidimensional array.  A
multidimensional array can be *implemented* as a list-of-lists,
but not efficiently.

Naturally, there are interesting interactions between (1) and (2),
e.g. when slicing a multidimensional array (consider, for instance,
what an operator should look like that returns the diagonal of a
square matrix).

A third issue:

(3) An "rvalue" slice is a *copy* of the elements sliced from the
array; an "lvalue" slice consists of the elements themselves, with 
a different indexing scheme.  Naturally, Perl should have both...
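
As a rough illustration of the copy/alias distinction in (3), here is a sketch in Python using only the standard library. Slicing an `array` copies the elements, while slicing a `memoryview` of the same array gives a re-indexed window onto the original storage. The variable names are made up for the example.

```python
from array import array

data = array("d", [0.0, 1.0, 2.0, 3.0, 4.0, 5.0])

# "rvalue" slice: an independent copy of elements 1..3.
copy = data[1:4]
copy[0] = 99.0      # does not touch data

# "lvalue" slice: the same elements, re-indexed from 0.
view = memoryview(data)[1:4]
view[0] = -1.0      # writes through to data[1]

print(data.tolist())  # [0.0, -1.0, 2.0, 3.0, 4.0, 5.0]
print(copy.tolist())  # [99.0, 2.0, 3.0]
```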


[...]

  Regarding multi-dimensional arrays, the PDL porters are undoubted
  champions; what is required?
 
 Well, the PDL distro is our answer to that ;) ;)

But Shirley Perl could make your job a little less horrible?  At the
very least, if `:' were a binary operator, you could overload it to
generate non-string slice indices.

-- 
Ariel Scolnicov|"GCAAGAATTGAACTGTAG"| [EMAIL PROTECTED]
Compugen Ltd.  |Tel: +972-2-6795059 (Jerusalem) \ We recycle all our Hz
72 Pinhas Rosen St.|Tel: +972-3-7658514 (Main office)`-
Tel-Aviv 69512, ISRAEL |Fax: +972-3-7658555 http://3w.compugen.co.il/~ariels



Re: PDL-P: Re: Reduce [was: Re: Random items (old p5p issues)]

2000-08-04 Thread Karl Glazebrook


OK I will rise to the bait

I think it's a bit unfair to say that PDL people have failed to 'bite',
there was quite a bit of discussion on our list after your post. Also
some concern about how much of perl6 is vapourware.

I am game to take part in discussions. 

It has always been apparent to me that Numerical Python is better integrated
than PDL. Some language changes in core python WERE made to accommodate it,
also Python had less syntax clutter to get around.

I definitely support embedding many of the key PDL ideas into the language
- the key one is a much easier syntax for a multi-dim slice. We are currently
driven to

$a->slice("10:100,30:200");

compared to IDL AND NumPy: a[10:100,30:200]

I'd propose simply building the a:b syntax into the core of Perl6. It's
convenient and almost standard.

perl6 should provide simple arrays, but they should be allowed to be
replaced with objects with no change of syntax. So

@a[10:100,30:200];

Would work whether @a was a perl list of lists or a PDL compact array.

So would @a * @b

Loop unrolling sounds really good, there should be hooks for objects to
provide their own implementation. Proper overloading and ability to 
overload by arg type are required, i.e.

sub myfunc{ float x, complex y }
sub myfunc{ float x, float y }


There should also be hooks for slices, for example if one is implementing
a complex object (e.g. representing a map) - one might want a slice in
physical units instead of array indices.

I'd even propose getting rid of @a for arrays and $a for scalars and just
making them "a". I've never really liked that feature of perl - I am sure
some users agree and some disagree - might be worth taking a straw poll.
In this age where everything may (or may not) be an object are $ and @ really
required? There are too many object types and not enough funny symbols..
even with Unicode.

Karl Glazebrook


Jeremy Howard wrote:
 
   BTW, I'd like to see a more lightweight currying mechanism too. The
   challenge is to find a 'perlish' but not heavyweight approach...
 
  Ah, good.  I assume that having established the challenge, you'll be
  rising to it? :-)
 
 Yes of course. But I want to first of all see the following RFCs that
 Damian has promised:
 
  * Built-ins: min() and max() functions and acceptors
 
  * Built-ins: reduce() function
 
  * Data structures: Semi-finite (lazy) lists
 
  * Subroutines: higher order functions
 
  * Subroutines: lazy evaluation of argument lists
 
  * Superpositions: vector operations via superpositions
 
 Damian is likely to write these in a way that is nicely integrated together,
 based on past experience. What I'd then like to do is to see how these fit
 together to fill in the stuff I mentioned earlier today:
 <quote>
 - Matrix ops
 - Support for lazy evaluation
 - Compile time expression unrolling (e.g. so that $a = sum(@b*@c+@d) does
 just one loop and no memory copy, as would occur with expression templates
 in C++)
 - Ability to specify infinite lists (e.g. like in Haskell)
 - Generic programming (iterators, algorithms, etc., e.g. like in the STL)
 </quote>
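
The "one loop and no memory copy" item can be illustrated by hand in Python. This sketch only contrasts the two evaluation orders for `sum(b*c+d)`; the fusion is written out manually here, not derived by a compiler as the list item asks for, and the data are made up.

```python
b = [1.0, 2.0, 3.0]
c = [4.0, 5.0, 6.0]
d = [0.5, 0.5, 0.5]

# Unfused: each operator materialises a full temporary list.
bc = [x * y for x, y in zip(b, c)]
bcd = [x + y for x, y in zip(bc, d)]
unfused = sum(bcd)

# Fused: a single pass over the inputs, with no intermediate lists.
fused = sum(x * y + z for x, y, z in zip(b, c, d))

print(unfused, fused)  # 33.5 33.5
```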
 
 I think the way I'd like to do this is to try and implement a couple of
 interesting bits of code that I've found are good tests of numerical
 programming environments. Stuff that just looks beautiful in Mathematica
 (which supports functional, rule-based, and procedural programming), but is
 full of loops and control structures in most languages. That way any bits
 that are missing will be pretty obvious (at least bits that matter to me!)
 
 I've tried to get input from PDL porters by cross-posting a couple of times,
 but haven't got much of a bite yet. I'm nervous about finding that we either
 reinvent the wheel, or break useful stuff that they've done... Are there any
 PDL gurus here who are interested in getting involved in some of these
 perl6-language issues?



Re: PDL-P: Re: Reduce [was: Re: Random items (old p5p issues)]

2000-08-04 Thread Karl Glazebrook


Also on the issue of loop unrolling and efficient looping.

PDL has what we call 'threading'.

This allows a C-level function to specify the dimensionality of
the arguments it accepts. For example a function addtoline() which
hypothetically adds a constant to a row vector might
have a 'signature'

a(n); b(); [o]c(n);

So a(n) is a 1D input, b() is a scalar (0D) which addtoline might add
to all the elements of a(n) and c(n) is the output 1D.

What is really useful is that if you add extra dimensions they
get looped over automatically, at the C-level so really fast.

e.g. a(100,10), b(10), c(100,10)

- adds to all 10 rows of a.
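
The looping rule can be sketched in plain Python. This shows the semantics only: PDL does the looping in C, while the `thread` helper below is a hypothetical pure-Python stand-in that handles exactly one extra dimension.

```python
def addtoline(a_row, b):
    """Core operation for the signature a(n); b(); [o]c(n): add scalar b to a 1-D row."""
    return [x + b for x in a_row]


def thread(core, a, b):
    """Loop the core over one extra dimension of a.

    a is a list of rows; b is either a single scalar (reused for every row)
    or a list holding one scalar per row.
    """
    bs = b if isinstance(b, list) else [b] * len(a)
    if len(bs) != len(a):
        raise ValueError("extra dimensions do not match")
    return [core(row, bi) for row, bi in zip(a, bs)]


a = [[0, 1, 2], [10, 11, 12]]          # 2 rows, each with n = 3
print(thread(addtoline, a, 5))         # [[5, 6, 7], [15, 16, 17]]
print(thread(addtoline, a, [1, 100]))  # [[1, 2, 3], [110, 111, 112]]
```

The core function never sees the extra dimension; only the wrapper does, which is what lets the same core serve inputs of any higher dimensionality.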

ALSO the way it is implemented is that the array pointers are calculated
by C macros in such a way as to support transpositions and slicing
with zero memory overhead. Thus if I want to add one to every *column*
of a, in a slice:

addtoline $a->xchg(0,1)->slice("10:20,20:40"), 10, $c

here xchg creates a virtual transposition of the first two dims, and slice
creates a virtual slice. This is all done by storing extra info in the
$a object.
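
One common way to get transposition and slicing without copying is to keep an offset, a shape and per-dimension strides next to a shared flat buffer. The Python sketch below assumes that scheme. `View2D` is a hypothetical class, its row-major layout is an arbitrary choice, and it is not a description of PDL's actual internals. Bounds checking is left out for brevity.

```python
class View2D:
    """Hypothetical 2-D view: shared flat storage plus offset, shape and strides."""

    def __init__(self, data, shape, strides=None, offset=0):
        self.data = data
        self.shape = shape
        # Default to row-major: stepping one row skips shape[1] elements.
        self.strides = strides if strides is not None else (shape[1], 1)
        self.offset = offset

    def xchg(self):
        """Swap the two dimensions without moving any data."""
        return View2D(self.data, self.shape[::-1], self.strides[::-1], self.offset)

    def slice(self, r0, r1, c0, c1):
        """Inclusive sub-block r0..r1, c0..c1, still sharing storage."""
        offset = self.offset + r0 * self.strides[0] + c0 * self.strides[1]
        return View2D(self.data, (r1 - r0 + 1, c1 - c0 + 1), self.strides, offset)

    def _pos(self, i, j):
        return self.offset + i * self.strides[0] + j * self.strides[1]

    def __getitem__(self, ij):
        return self.data[self._pos(*ij)]

    def __setitem__(self, ij, value):
        self.data[self._pos(*ij)] = value


data = list(range(12))         # 3 x 4, element (i, j) stored at 4*i + j
a = View2D(data, (3, 4))
t = a.xchg()                   # 4 x 3 transposed view
s = t.slice(1, 2, 0, 1)        # rows 1..2, columns 0..1 of the transpose
s[0, 1] = -1                   # lands on a[1, 1]
print(a[1, 1], t[1, 1], len(data))  # -1 -1 12
```

Each view costs a few integers of bookkeeping; the 12-element buffer is never duplicated.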

I think these ideas would be of use in any discussion of perl6 numerical
efficiency - there are other ways I guess. The core idea is to try and stay
in compiled loops.

The other advantage of this 'threading' is that it then automatically parallelizes
the problem - we even have an experimental PDL implementation which can use
multiple CPUs to do 'threads'.

One problem we continually face in PDL is that we do all this at the C-level
- but then we run into problems where if we have pure-perl PDL functions they
can't do these tricks. 

Another problem, though, is that while one can usually write many complicated
multi-dim problems with threading tricks, and avoid loops, it is sometimes a
bit taxing on the brain! One often wishes one could just write it as C/Fortran
style loops and have the language figure out how to do the loops efficiently.

Anyway some integration of concepts for handling large numerical computation
into the core would definitely be a good thing.

Karl Glazebrook



Re: PDL-P: Re: Reduce [was: Re: Random items (old p5p issues)]

2000-08-04 Thread Tim Jenness

On Fri, 4 Aug 2000, Tuomas Lukka wrote:

 On 4 Aug 2000, Ariel Scolnicov wrote:
 
 Well, first of all, 
 
   10:100, 30:200
 
 is not the same: in Perl it comes out as
 
   10..100, 30..200
 
   10, 11, ..., 100, 30, 31, ..., 200
 

Additionally, in general it would not necessarily have to be a range of
integers. The range could be specified as floating point if we are
specifying a slice in physical coordinates.
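
A floating-point span has to be mapped onto array indices at some point. The Python sketch below assumes a simple linear axis on which pixel `i` sits at `origin + i * step`; the function name, the axis parameters and the rule of keeping only pixels whose centres fall inside the range are all assumptions made for illustration.

```python
import math


def span_to_indices(lo, hi, origin, step, n):
    """Map the physical range [lo, hi] onto inclusive pixel indices.

    Pixel i is centred at origin + i * step, with step > 0 and n pixels in total.
    """
    first = max(0, math.ceil((lo - origin) / step))
    last = min(n - 1, math.floor((hi - origin) / step))
    return first, last


# Hypothetical axis: 100 pixels starting at 0.5, spaced 0.25 units apart.
first, last = span_to_indices(2.0, 3.1, origin=0.5, step=0.25, n=100)
print(first, last)  # 6 10
```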

-- 
Tim Jenness
JCMT software engineer/Support scientist
http://www.jach.hawaii.edu/~timj