Re: [Jprogramming] partitioned sort

'Pascal Jasmin' via Programming Wed, 15 Apr 2015 09:31:37 -0700

that is very cool

no idea if this is more efficient, (or if your original /:~ ,: 1 version


  1 0 0 0 1 0 0 ([: ; +/\@:[ <@(/:~)/. ]) 100 300 200 400 3 2 1 
100 200 300 400 1 2 3 


it boxes and immediately razes just to avoid fills.

The reason I bring it up is that though /: is impressively fast, its still at 
best an O(n) function, and I don't believe repeated portions of tacit functions 
are optimized away.  ie:

a=. 1e6
  timespacex '3 : ''b + b =. +/y'' a' 
0.00197536 9856 
  timespacex '(+/ + +/) a' 
0.00354816 3840 
 
though this tacit version avoids the duplicate /:@:] 

  1 0 0 0 1 0 0 (] {~  [ (] {~ +/\@[ /:@:{~  ]) /:@]) 100 300 200 400 3 2 1 
100 200 300 400 1 2 3 


there is still 2 /: operations, (and 3 '{') which I would be surprised to learn 
can be expected to be faster than cut raze even if its applied at full rank.

A very useful special code opportunity would be ([: ; <@f) as a shortcut to 
avoid fills without actually boxing then razing.



----- Original Message -----
From: Raul Miller <[email protected]>
To: Programming forum <[email protected]>
Cc: 
Sent: Wednesday, April 15, 2015 11:34 AM
Subject: [Jprogramming] partitioned sort

Someone at Dyalog APL (I'm going to guess Brian Becker, but I could
easily be wrong) posted this tweet:

https://twitter.com/dyalogapl/status/588351556329803776

which was basically about this APL expression:

      psort←{g←⍋⍵ ⋄ ⍵[g[⍋(+\⍺)[g]]]} ⋄ 1 0 0 0 1 0 0 psort 100 300 200 400 3 2 1

100 200 300 400 1 2 3

I don't know if that's going to be readable. Here's a direct link into
an executable APL environment:

http://tinyurl.com/partitionedsortexample

So... what's going on here?

Well, first off, partioned sort is basically this:

   1 0 0 0 1 0 0 ([: ; <@/:~;.1) 100 300 200 400 3 2 1
100 200 300 400 1 2 3

This could be described as "sort within each partition". But this
approach uses boxing, which winds up being inefficient when you need a
lot of boxes. And it's worth while to be able to think about
alternatives to boxing.

So here's another approach:

   1 0 0 0 1 0 0 (] {~ /:@] {~ +/\@[ /:@:{~ /:@]) 100 300 200 400 3 2 1
100 200 300 400 1 2 3

Translated to explicit J, that might look something like this:

psort=:4 :0
  gy=: /:y
  sx=: +/\x
  pz=: gy { sx
  gz=: /: pz
  ord=: gz { gy
  ord { y
)

   1 0 0 0 1 0 0 psort 100 300 200 400 3 2 1
100 200 300 400 1 2 3

Note that a general principle which is relevant here is that /: /: y
represents counting integers in the same order as the original
argument.  And /: /: /: y is the same as /: y. (We are not actually
using /: /: /: but the result of +/\x is similar in character to /: /:
+/\ x and thinking about things in this way might help you understand
how this ordering works.)

Note also that I've made all the intermediate results of this explicit
implementation be global, so they can be inspected:

   gy
6 5 4 0 2 1 3

This is the indices we need to sort our right argument. This order is
mildly complex, and should perhaps be thought of as "unstructured".
(Its complexity reflects the complexity of the right argument.)

   sx
1 1 1 1 2 2 2

These are key values to distinguish which partition things go in,
based on our left argument.

   pz
2 2 2 1 1 1 1

This is what our partition keys would look like, when they get sorted
using in right argument order. In this case, they are non-overlapping
because the value ranges in our example problem do not overlap between
partitions.

   gz
3 4 5 6 0 1 2

If you look closely, you'll see that in this case these are just
counting up in sub-sequences for each partition. In this example, just
like pz, the complexity here is that of our left argument because in
this example the range of values in each of our partitions do not
overlap.

But what we have here are the indices we would need select sorted
values (if y were sorted) into our partitions (without otherwise
disturbing their order).

   ord
0 2 1 3 6 5 4

So this is our original sorting keys (gy) sorted into our sorted
partition order. Using these indices puts our original y argument into
the desired order - all without boxing.

Which is kind of neat.

Thanks,

-- 
Raul
----------------------------------------------------------------------
For information about J forums see http://www.jsoftware.com/forums.htm 
 

----- Original Message -----
From: Raul Miller <[email protected]>
To: Programming forum <[email protected]>
Cc: 
Sent: Wednesday, April 15, 2015 11:34 AM
Subject: [Jprogramming] partitioned sort

Someone at Dyalog APL (I'm going to guess Brian Becker, but I could
easily be wrong) posted this tweet:

https://twitter.com/dyalogapl/status/588351556329803776

which was basically about this APL expression:

      psort←{g←⍋⍵ ⋄ ⍵[g[⍋(+\⍺)[g]]]} ⋄ 1 0 0 0 1 0 0 psort 100 300 200 400 3 2 1

100 200 300 400 1 2 3

I don't know if that's going to be readable. Here's a direct link into
an executable APL environment:

http://tinyurl.com/partitionedsortexample

So... what's going on here?

Well, first off, partioned sort is basically this:

   1 0 0 0 1 0 0 ([: ; <@/:~;.1) 100 300 200 400 3 2 1
100 200 300 400 1 2 3

This could be described as "sort within each partition". But this
approach uses boxing, which winds up being inefficient when you need a
lot of boxes. And it's worth while to be able to think about
alternatives to boxing.

So here's another approach:

   1 0 0 0 1 0 0 (] {~ /:@] {~ +/\@[ /:@:{~ /:@]) 100 300 200 400 3 2 1
100 200 300 400 1 2 3

Translated to explicit J, that might look something like this:

psort=:4 :0
  gy=: /:y
  sx=: +/\x
  pz=: gy { sx
  gz=: /: pz
  ord=: gz { gy
  ord { y
)

   1 0 0 0 1 0 0 psort 100 300 200 400 3 2 1
100 200 300 400 1 2 3

Note that a general principle which is relevant here is that /: /: y
represents counting integers in the same order as the original
argument.  And /: /: /: y is the same as /: y. (We are not actually
using /: /: /: but the result of +/\x is similar in character to /: /:
+/\ x and thinking about things in this way might help you understand
how this ordering works.)

Note also that I've made all the intermediate results of this explicit
implementation be global, so they can be inspected:

   gy
6 5 4 0 2 1 3

This is the indices we need to sort our right argument. This order is
mildly complex, and should perhaps be thought of as "unstructured".
(Its complexity reflects the complexity of the right argument.)

   sx
1 1 1 1 2 2 2

These are key values to distinguish which partition things go in,
based on our left argument.

   pz
2 2 2 1 1 1 1

This is what our partition keys would look like, when they get sorted
using in right argument order. In this case, they are non-overlapping
because the value ranges in our example problem do not overlap between
partitions.

   gz
3 4 5 6 0 1 2

If you look closely, you'll see that in this case these are just
counting up in sub-sequences for each partition. In this example, just
like pz, the complexity here is that of our left argument because in
this example the range of values in each of our partitions do not
overlap.

But what we have here are the indices we would need select sorted
values (if y were sorted) into our partitions (without otherwise
disturbing their order).

   ord
0 2 1 3 6 5 4

So this is our original sorting keys (gy) sorted into our sorted
partition order. Using these indices puts our original y argument into
the desired order - all without boxing.

Which is kind of neat.

Thanks,

-- 
Raul
----------------------------------------------------------------------
For information about J forums see http://www.jsoftware.com/forums.htm
----------------------------------------------------------------------
For information about J forums see http://www.jsoftware.com/forums.htm

Re: [Jprogramming] partitioned sort

Reply via email to