[Jprogramming] partitioned sort

Raul Miller Wed, 15 Apr 2015 08:36:01 -0700

Someone at Dyalog APL (I'm going to guess Brian Becker, but I could
easily be wrong) posted this tweet:


https://twitter.com/dyalogapl/status/588351556329803776

which was basically about this APL expression:

      psort←{g←⍋⍵ ⋄ ⍵[g[⍋(+\⍺)[g]]]} ⋄ 1 0 0 0 1 0 0 psort 100 300 200 400 3 2 1

100 200 300 400 1 2 3

I don't know if that's going to be readable. Here's a direct link into
an executable APL environment:

http://tinyurl.com/partitionedsortexample

So... what's going on here?

Well, first off, partioned sort is basically this:

   1 0 0 0 1 0 0 ([: ; <@/:~;.1) 100 300 200 400 3 2 1
100 200 300 400 1 2 3

This could be described as "sort within each partition". But this
approach uses boxing, which winds up being inefficient when you need a
lot of boxes. And it's worth while to be able to think about
alternatives to boxing.

So here's another approach:

   1 0 0 0 1 0 0 (] {~ /:@] {~ +/\@[ /:@:{~ /:@]) 100 300 200 400 3 2 1
100 200 300 400 1 2 3

Translated to explicit J, that might look something like this:

psort=:4 :0
  gy=: /:y
  sx=: +/\x
  pz=: gy { sx
  gz=: /: pz
  ord=: gz { gy
  ord { y
)

   1 0 0 0 1 0 0 psort 100 300 200 400 3 2 1
100 200 300 400 1 2 3

Note that a general principle which is relevant here is that /: /: y
represents counting integers in the same order as the original
argument.  And /: /: /: y is the same as /: y. (We are not actually
using /: /: /: but the result of +/\x is similar in character to /: /:
+/\ x and thinking about things in this way might help you understand
how this ordering works.)

Note also that I've made all the intermediate results of this explicit
implementation be global, so they can be inspected:

   gy
6 5 4 0 2 1 3

This is the indices we need to sort our right argument. This order is
mildly complex, and should perhaps be thought of as "unstructured".
(Its complexity reflects the complexity of the right argument.)

   sx
1 1 1 1 2 2 2

These are key values to distinguish which partition things go in,
based on our left argument.

   pz
2 2 2 1 1 1 1

This is what our partition keys would look like, when they get sorted
using in right argument order. In this case, they are non-overlapping
because the value ranges in our example problem do not overlap between
partitions.

   gz
3 4 5 6 0 1 2

If you look closely, you'll see that in this case these are just
counting up in sub-sequences for each partition. In this example, just
like pz, the complexity here is that of our left argument because in
this example the range of values in each of our partitions do not
overlap.

But what we have here are the indices we would need select sorted
values (if y were sorted) into our partitions (without otherwise
disturbing their order).

   ord
0 2 1 3 6 5 4

So this is our original sorting keys (gy) sorted into our sorted
partition order. Using these indices puts our original y argument into
the desired order - all without boxing.

Which is kind of neat.

Thanks,

-- 
Raul
----------------------------------------------------------------------
For information about J forums see http://www.jsoftware.com/forums.htm

[Jprogramming] partitioned sort

Reply via email to