Thanks for the encouragement. I have persisted and have the following to
report on. The remark on the constants was on the dot - when I scaled the
algorithm to larger size inputs, the recursive implementation performed
better.

(Side-remark: I was pleasantly surprised to see this thread discussed in
NYCJUG  [1]. )

Summary: Maybe the remarks in [1] are pertinent. On the other hand, creating
efficient algorithms with J means recasting the algorithms in a different
form altogether to benefit from J's execution style. These require different
insights. In the below two examples, with FFT, I was able to get the hint
crucial to implement fft7 that can be efficiently translated to J's
programming style while for quicksort, I still lack those insights that can
help.

For the new sort implementation proposal, I am not sure that with special
coding and swap as indicated, I can get a performance comparable to the
grade operator of J. Still, it helped me to use these other operators that I
did not get a chance to explore so far. (Well, I should admit that sorting
is much more primitive algorithm than FFT is. )

Hope this work is interesting for others as it is for me!!

Regards,
Yuva

Report:


1) I tried implementing original quick sort in a J friendly way. For this, I
wanted to 'stop' a scan and so implement the 3-way partition (<,=,>) in a
different way. Initially I had not much success until I read about the
special coding [2]. Herein it is mentioned that f i. 0: family has special
coding. I understand that in the implementation, this special coding
effectively stops at a desired point. Further, C. can be used to perform a
swap. With these ideas, I got this far:
   qs1 =: C.~ <@:(({. i.&1@:< ]) , ({. i:&1@:> ]) )
   qs1 a [smoutput a =. (#?.#) i.20
6 3 19 15 10 14 0 7 12 17 16 4 13 2 1 9 18 5 11 8
6 3 5 15 10 14 0 7 12 17 16 4 13 2 1 9 18 19 11 8

This performs a unit swap based on the pivot as the first element. Indices 2
& 17 are swapped. Now this process has to continue on the sub-array starting
from Index 3 up to & including index 16. Continuing this way, one partition
can be achieved. Open issues are how to extract the sub-array, what is the
stopping condition and how to move the pivot to the right position to set up
the recursion at the higher level. I was not able to code this in a
satisfactory way so far. (There are assumption in this code that can prevent
its generalization too)

2) So, I turned my attention to FFT. (The reason for qsort and FFT is here
[3] - the top 10 algorithms of the 20the Century). Here the FFT lab was
helpful. My first version of the recursive fft was like this:

roots=:[:^ (-0j2p1)&% * i...@-:

en =: 13 : '((# y) $ 1 0)#y'
od =: 13 : '((# y) $ 0 1)#y'
pr =: 13 : '((fft4(en y))+(roots # y)*(fft4(od y))) '
sr =: 13 : '((fft4(en y))-(roots # y)*(fft4(od y))) '
tr =: pr,sr
fft4 =: tr ^: (1<#)

Well, this was not as good as the dfft which from the FFT lab was this

dfft=: 3 : '+/ y * ^ (#y) %~ (- o. 0j2 ) * */~ i.#y' NB. from the FFT lab

This was a surprise and so I laboured to find a good FFT implementation in
the J 'native' way. My breakthrough came from an article in DDJ [4].

fft71 =: ( ,@,:~@:$:@en + ,@,:~@:$:@od * ([:^_0j2p1&%...@# * i...@#))^: (1<#)
fft7 =: +...@fft71

(The reason for fft7 is because the fft71 computation gave complex
conjugates which I could not easily resolve.)

What is nice about fft7 is that it is a good competitor to fftw!! It scales
well ( I have tried it for a small set of inputs, maybe there are more that
need to be tested).
To test this, I created a simple test harness [5] that for inputs of length
2^n creates random inputs of specified number, applies the algorithms to the
inputs, and creates an average of time & space values.

B7 =: > 'fftw ';'fft7 '
(12;13;1000) test B7
+---------------------------------------------+-----+
|+--+----------+---------+---------+---------+|fftw |
||0 |fftw      |fft7     |fftw     |fft7     ||fft7 |
|+--+----------+---------+---------+---------+|     |
||12|0.00282423|0.0515584|823808   |902144   ||     |
|+--+----------+---------+---------+---------+|     |
||13|0.00504044|0.105105 |1.64301e6|1.80326e6||     |
|+--+----------+---------+---------+---------+|     |
+---------------------------------------------+-----+

In the first box, the first column is the exponent value (input size = 2^n),
the second two are time while the last two are space used.

In comparison, dfft implementation explodes on space already at 2^12 :
   dfft 42 + i.2^12
|out of memory: dfft
|   +/y*^(#y)%~(-o.0j2)    **/~i.#y

For an idea of time/space performance,

B6 =: > 'dfft ';'fftw ';'fft7 '
(11;11;10) test B6
+---------------------------------------------------------+-----+
|+--+-------+----------+---------+---------+------+------+|dfft |
||0 |dfft   |fftw      |fft7     |dfft     |fftw  |fft7  ||fftw |
|+--+-------+----------+---------+---------+------+------+|fft7 |
||11|1.20241|0.00155679|0.0268432|7.04942e8|414208|451584||     |
|+--+-------+----------+---------+---------+------+------+|     |
+---------------------------------------------------------+-----+

This shows that fft7 is quite good in its execution.


Regarding use of this work, I have seen other threads that are looking for a
FFTW implementation in 64 bit. Based on further tests, fft7 proposed could
be used if found accurate enough for use.

   (fftw -: fft7) i.2^9
1

Beyond 9, the two functions donot agree with difference in the tolerance of
match.

Deriving fft7 was an interesting exercise and helped me learn the way to
optimize J functions.

[1] http://www.jsoftware.com/jwiki/NYCJUG/2009-04-14
[2] http://www.jsoftware.com/help/dictionary/special.htm
[3] http://amath.colorado.edu/resources/archive/topten.pdf
[4] http://www.ddj.com/cpp/199500857
[5] Test harness to check performance of the algorithms:

NB. x test y
NB. 0{x : lower bound
NB. 1{x : upper bound
NB. 2{x : sample size : default = 10
NB. y   : list of algorithms to compare
NB. output: exponent; time ; space
test =: 4 : 0  NB. test harness to check the space and time.
 r =. 0 0$''
 A =. y
 r =. r, (0; <"1 A , A)
 'c d m'=. 3{.x,<10
 try.
  for_k. c+ i. >: d-c do.
   n =.  ": ({~ #?#)"1 ((m, 2^k) $ 42+ i. 2^k)
   n1 =. A ,/"1/  n
   et =. <"0 (+/%#)"1 time "1  n1
   es =. <"0 (+/%#)"1 space "1   n1
   r =. r, (k;et,es)
  end.
 catch.
  smoutput 'execution error encountered'
 end.
 r;A
)

On Sun, Apr 12, 2009 at 9:12 AM, Roger Hui <[email protected]> wrote:

> I encourage you to persist in this effort.  J is suitable
> for testing sorting ideas, for the same reasons that
> it is suited for testing other programming ideas:
> interactive environment, conciseness, expressiveness,
> etc.  One of the first uses of APL (the predecessor of J)
> was in sorting; see Chapter 6 of Ken Iverson's
> "A Programming Language".
> http://www.jsoftware.com/jwiki/Doc/A_Programming_Language
>
> There are a few existing essays on sorting:
> http://www.jsoftware.com/jwiki/Essays/Quicksort
> http://www.jsoftware.com/jwiki/Essays/Sorting_versus_Grading
> http://www.jsoftware.com/jwiki/Essays/Order_Statistics
>
> You do have to be careful when you are comparing times.
> Two algorithms may be the same order but the fact
> may be obscured by the large constants involved.
>
>
>
> ----- Original Message -----
> From: Yuvaraj Athur Raghuvir <[email protected]>
> Date: Thursday, April 9, 2009 8:54
> Subject: Re: [Jprogramming] Efficient Quick Sort?
> To: Programming forum <[email protected]>
>
> > Hello,
> >
> > Maybe I should add some more context on why I am doing this - I am
> > (re-)reading up on algorithms and thought I shall use J to
> > experiment. Am I
> > doing the right thing by choosing J to implement and test these
> > rather low
> > level algorithms? Maybe for full control on memory and accesses,
> > I should
> > choose C++ instead.
> > Comments?
> >
> > Regards,
> > Yuva
> >
> > p.s: Not sure if this is programming or chat. For now I am
> > continuing this
> > thread.
> >
> > On Tue, Apr 7, 2009 at 12:25 PM, Yuvaraj Athur Raghuvir <
> > [email protected]> wrote:
> >
> > > Hello,
> > >
> > > Inspired by the quicksort defined at
> > > http://jsoftware.com/help/dictionary/d212.htm as:
> > > quicksortr=: (($:@(<#[) , (=#[) , $:@(>#[)) ({~ ?...@#)) ^:
> > (1<#) NB. random
> > > pivot
> > >
> > > I wrote a selection sort as so:
> > > selectsort =: ((<./),$:@(]-. <./))^: (1<#)
> > >
> > > Yes, I know that grade(/:)  is the J way of doing
> > sorting. I am just trying
> > > this out.
> > >
> > > To eliminate the random pivot selection, I did the following:
> > > quicksort=: (($:@(<#[) , (=#[) , $:@(>#[)) (0{])) ^: (1<#)
> > >
> > >
> > > To test, I created these simple verbs:
> > > cmd =: 'ts ''n1=.selectsort a''';'ts ''n0=.(/:a){a''';'a =.
> > (10 ? 10) { 22+
> > > i. 10'
> > > cmd =: ('ts ''n3=.quicksort a''';'ts ''n2=.quicksortr a'''),cmd
> > > cmd =: ('n3-:n2';'n2-:n1';'n1-:n0'),cmd
> > >
> > > test=: 4 : ',. > ". L:0 (x # ,. <y)' NB. Am I doing too
> > much work here?
> > >
> > >     b =.100000 test cmd
> > >    (+/ % #) ;"1 (3 4 5 6 {"1  b)
> > > 3.01884e_5 3590.4 3.35261e_5 3589.82 2.14828e_5 4744.79
> > 6.48059e_6 1384.96
> > >
> > > The data shows that quicksort is faster than quicksortr but is
> > slower than
> > > the selectsort on the average.
> > >
> > > This is expected since the quicksort does a full scan to
> > choose the items
> > > lesser/greater than the pivot element.
> > >
> > > Is it possible to implement a quicksort faster than the
> > selection sort that
> > > is proposed here?
> > >
> > > Regards,
> > > Yuva
> > >
> > > p.s: Interestingly, I see that irrespective of the number of
> > trials, I
> > > loose the first three match results. So, I have
> > >    +/;3&{."1 b
> > > 299997
> > >    +/;3&{."1 (100 test cmd)
> > > 297
> > >    +/;3&{."1 (10 test cmd)
> > > 27
> > >
> > > What could be wrong?
> ----------------------------------------------------------------------
> For information about J forums see http://www.jsoftware.com/forums.htm
>
----------------------------------------------------------------------
For information about J forums see http://www.jsoftware.com/forums.htm

Reply via email to