Hi Luis,

To answer your other excellent question:

Dimension-moving operations work by taking the original ndarray and making a 
new one with a different "dope vector" 
(https://en.wikipedia.org/wiki/Dope_vector). That idea is worth spelling out a 
bit: it's the specification for how to treat a single linear array of values 
as an n-dimensional thing, so it contains the size of each dimension and the 
increment needed to take one step along it (for short, "dims" and "incs").
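
As a rough illustration of the "dims" and "incs" idea (in plain Python rather 
than PDL, with names such as offset and at made up for the example, and not 
reflecting PDL's internal structures), a flat list plus a dope vector is 
enough to address an n-dimensional element:

```python
# Hypothetical illustration: a "dope vector" is just dims plus incs
# describing how to walk a single flat list of values.
data = list(range(6))   # the single linear array of values: 0..5
dims = [3, 2]           # dim 0 has 3 elements, dim 1 has 2
incs = [1, 3]           # one step along dim 0 moves 1 slot, along dim 1 moves 3

def offset(index, incs, start=0):
    """Flat position of an n-dimensional index, given the incs."""
    return start + sum(i * inc for i, inc in zip(index, incs))

def at(data, index, incs):
    """Value stored at an n-dimensional index."""
    return data[offset(index, incs)]

print(at(data, (2, 1), incs))  # offset is 2*1 + 1*3 = 5, so this prints 5
```

Here dim 0 is the fastest-varying one, which is why its inc is 1.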

That "incs" idea is fundamental to how something like mv operates: the 
underlying locations of the values don't change at all (which makes this a 
"virtual affine", or "vaffine" for short, operation), but the order of the 
dims has changed, and their incs have moved with them. So for the typical 
"over" manipulation, one mv()s another dim into the 0 slot (which also moves 
its "inc" there). Then when the "over" operation is called on the result, it 
goes along the new dim 0, broadcasting over all the other dims separately 
(which includes the "real", or underlying, dim 0, i.e. the one with an "inc" 
of 1).
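
A minimal sketch of that mechanism, again in plain Python and only as an 
analogy: the mv and sum_over_dim0 functions below are hypothetical stand-ins, 
not PDL's implementation. Moving a dim only permutes the dims and incs lists, 
and the "over" operation always walks slot 0 while looping over the rest:

```python
from itertools import product

def mv(dims, incs, src, dst):
    """Return new (dims, incs) with dim `src` moved to slot `dst`.
    The data itself is never touched."""
    dims, incs = list(dims), list(incs)
    dims.insert(dst, dims.pop(src))
    incs.insert(dst, incs.pop(src))
    return dims, incs

def sum_over_dim0(data, dims, incs):
    """Sum along slot 0, looping ("broadcasting") over all remaining dims."""
    out = {}
    for rest in product(*(range(d) for d in dims[1:])):
        base = sum(i * inc for i, inc in zip(rest, incs[1:]))
        out[rest] = sum(data[base + k * incs[0]] for k in range(dims[0]))
    return out

data = list(range(6))          # values 0..5, never copied or reordered
dims, incs = [3, 2], [1, 3]    # 3x2, with the underlying dim 0 fastest

print(sum_over_dim0(data, dims, incs))    # {(0,): 3, (1,): 12}

mdims, mincs = mv(dims, incs, 1, 0)       # dims [2, 3], incs [3, 1]
print(sum_over_dim0(data, mdims, mincs))  # {(0,): 3, (1,): 5, (2,): 7}
```

In the second call the loop over the remaining dims steps with an inc of 1, 
i.e. it is broadcasting over the underlying dim 0.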

You will note that while no copy is made of the underlying ndarray's data, 
there's still the cost of making that intermediate ndarray. For large data, 
that's negligible compared to the main operation, but avoiding it would be 
cleaner, and would also increase the expressive power of the language.

A further idea that occurred to me while writing the above is that "einops" 
might be a meta-function. It might take two metaparameters, the "einops" spec 
and the underlying operation (somewhat like Ingo's PDL::Dims does), then the 
real parameters for that operation. The meta-function would look up the 
underlying op's vtable, as proposed in 
https://github.com/PDLPorters/pdl/issues/506, and set up a broadcast over the 
given real parameters but modified according to the einops spec, with no 
intermediate array needed. Put another way, it would do something similar to 
the little-used broadcast method (which can set up explicit broadcasting over 
a "broadcast group" numbered >0, which is how reorder is implemented, and 
which is mind-bendingly confusing).

This use of the vtable is somewhat similar to the idea that is finally forming 
for me as to how to implement loop-fusion: using deferred evaluation or some 
other means (possibly another meta-function), one would have a series of (for 
now, only scalar) operations. The broadcast over the real parameters would be 
set up, with a quasi-broadcast set up for each of the fused operations, with 
the inputs and outputs of each being as appropriate (probably the input/output 
of each non-first/last one would be a single-valued [t] ndarray). Then a 
modified version of each stage's readdata (one that didn't broadcast like a 
normal one) would be called for each fused operation, and then the 
meta-function's "broadcast move one step along" would be called.
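
To make the shape of that loop concrete, here is a toy sketch in plain 
Python. It is only an analogy: fused_apply is a made-up name, ordinary 
callables stand in for each stage's modified readdata, and a local variable 
stands in for the single-valued temporary. Nothing here reflects how PDL 
would actually implement it:

```python
import math

def fused_apply(stages, values):
    """Apply a chain of scalar stages in a single pass over the data."""
    out = []
    for v in values:           # the one shared loop, moving one step at a time
        t = v                  # single-valued temporary passed between stages
        for stage in stages:   # each fused operation runs on just this element
            t = stage(t)
        out.append(t)          # only the last stage's result is stored
    return out

# Three scalar operations fused into one loop: sqrt((x + 1) * 2)
stages = [lambda x: x + 1.0, lambda x: x * 2.0, math.sqrt]
result = fused_apply(stages, [1.0, 7.0])
print(result)  # [2.0, 4.0]
```

The point of the arrangement is that no full-sized intermediate array is 
created between the stages.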

This meta-function idea implies that a broadcast should become more of a 
first-class entity than it currently is.

Thoughts on any of the above most welcome, as always!

Best regards,
Ed
________________________________
From: Luis Mochan <moc...@icf.unam.mx>
Sent: 14 January 2025 04:55
To: Ed . <ej...@hotmail.com>
Cc: Ingo Schmid <ingo...@gmx.at>; pdl-general@lists.sourceforge.net 
<pdl-general@lists.sourceforge.net>
Subject: Re: [Pdl-general] [Pdl-devel] Pdlpp modules

Hi Ed,

On Tue, Jan 14, 2025 at 02:44:36AM +0000, Ed . wrote:
> Hi Luis,
>
> I did in fact look at ApplyDim when you posted it on CPAN, because on the IRC 
> channel we have a bot that reports whenever a module starting with "PDL" is 
> uploaded to CPAN. Please believe me when I say I know extremely well that 
> it's not currently possible to skip mv/reorder/whatever. What I'm saying is 
> it ought to be possible to tell PDL to just do its "*over" operation along a 
> different dimension.

I don't know how PDL actually moves dimensions around. Would doing *over
operations along different dimensions be less expensive than
reordering and then operating over the first dimension?

> This evening, it occurred to me a possible answer to the problem of how to 
> express such a thing might lie in "einops" 
> (https://github.com/arogozhnikov/einops). That would certainly also offer 
> ideas on how to express dim-naming stuff.

Einstein's notation is compact and expressive. It is great generally,
but not always, and it is cumbersome where it can't be used directly. For
example, making a weighted sum of products of elements of a and b with
weights h, sum_i(h_i a_i b_i), is written in Einstein's notation as
g_{ij} a_i b_j  where g_{ij} is a special metric, a matrix with diagonal
elements h_i and zeroes elsewhere. I guess computationally you wouldn't want
to do that (multiply every a_i by every b_j and then the result by
g_{ij} which is mostly zeroes and do a double sum over i and j).
I haven't looked though at any details of einops yet.
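
A small check of that weighted-sum example, in plain Python with made-up 
values for h, a and b: the direct single sum and the double sum through a 
diagonal g_{ij} give the same number, but the second form does n*n 
multiplications instead of n.

```python
# Hypothetical values, chosen only for illustration.
h = [2.0, 3.0, 5.0]   # weights h_i
a = [1.0, 2.0, 3.0]
b = [4.0, 5.0, 6.0]
n = len(h)

# Direct form: sum_i h_i a_i b_i  (n terms)
direct = sum(h[i] * a[i] * b[i] for i in range(n))

# Einstein form: g_{ij} a_i b_j with g diagonal  (n*n terms, mostly zero)
g = [[h[i] if i == j else 0.0 for j in range(n)] for i in range(n)]
double = sum(g[i][j] * a[i] * b[j] for i in range(n) for j in range(n))

print(direct, double)  # 128.0 128.0
```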

Best regards,
Luis


>
> Best regards,
> Ed
>
> ________________________________
> From: Luis Mochan <moc...@icf.unam.mx>
> Sent: 14 January 2025 02:23
> To: Ed . <ej...@hotmail.com>
> Cc: Ingo Schmid <ingo...@gmx.at>; pdl-general@lists.sourceforge.net 
> <pdl-general@lists.sourceforge.net>
> Subject: Re: [Pdl-general] [Pdl-devel] Pdlpp modules
>
> On Mon, Jan 13, 2025 at 07:33:34PM +0000, Ed . wrote:
> > I have some awareness of PDL::Dims...
> >
> > And on the subject of capabilities I think should be stolen into core PDL, 
> > I believe "apply an operation over dims other than 0" is another.
>
> Have you looked at PDL::ApplyDim? It doesn't skip the mv nor reorder
> operations, it just hides them away so the user doesn't have to keep
> track of how the dimensions were shuffled before applying an operation
> nor how to unshuffle them afterwards. The user just has to state which
> dimension(s) should come to the front or to the back before applying
> the operation.
>
> Regards,
> Luis
>
>
> --
>
>                                                                   o
> W. Luis Mochán,                      | tel:(52)(777)329-1734     /<(*)
> Instituto de Ciencias Físicas, UNAM  | fax:(52)(777)317-5388     `>/   /\
> Av. Universidad s/n CP 62210         |                           (*)/\/  \
> Cuernavaca, Morelos, México          | moc...@fis.unam.mx   /\_/\__/
> GPG: 791EB9EB, C949 3F81 6D9B 1191 9A16  C2DF 5F0A C52B 791E B9EB

_______________________________________________
pdl-general mailing list
pdl-general@lists.sourceforge.net
https://lists.sourceforge.net/lists/listinfo/pdl-general