Roger, thanks for the explanation. I was just getting to that point in the C code.
No wonder I couldn't figure it out at first. I was expecting to see a sort algorithm. The C code is remarkably clear once the intent is known. I had to step through it to visualize what it was doing. Count up the occurrences of each character yv[256]; ... DO(n, ++yv[*wv++];); .. DO(p, DO(yv[j], *v++=j;); yv[j]=0; ++j;); For each occurrence of the character, add it to the output array On Wed, Mar 5, 2014 at 5:19 PM, Roger Hui <rogerhui.can...@gmail.com> wrote: > Good answers. For /:~x vs. g{x, the explanations are: > > - Indexing must check for index error. Sorting does not. > - Indexing uses random read access over a large chunk of memory (i.e. > x). Sort does not. > > A more detailed explanation: To sort over a small known universe (and > characters definitely qualify), you basically compute #/.~x (the ordering > is wrong, but you get the idea). In C: > > I4 count[M]; > memset(count,0x00,sizeof(count)); > for(i=0;i<n;++i)count[*x++]=1; > > > This is lightning fast on modern CPUs: sequential read access and no branch > prediction fails. (And the ordering is correct.) Once having the counts, > as Henry said, you can do count#a. or in C: > > for(i=0;i<M;++i){m=count[j]; for(j=0;j<m;++j)*z++=i;} > > > Also lightning fast with very localized reads. > > It's ironic that in school sorting is an application with heavy emphasis on > comparisons, counting # of comparisons, etc. In the method above, there is > not a single comparison involving x. I once told someone that I can sort > 4-byte integers and 8-byte IEEE floats in linear time. He looked at me > like I was crazy, presumably remembering from school that sorting was > PROVEN to take n log n comparisons. > > As for why sorting is faster than grading, see > http://www.jsoftware.com/jwiki/Essays/Sorting_versus_Grading > > Now, for those of you who know C (or other scalar language), is there a > faster way to find the minimum of a vector of small integers x (2-byte > integers, say) than the following: > > min=-32768; > for(i=0;i<n;++i){if(min>*x)min=*x; ++x;} > > > I know an alternative which is 70% faster. No fancy SSE instructions. No > multicore. No loop unrolling. > ---------------------------------------------------------------------- > For information about J forums see http://www.jsoftware.com/forums.htm ---------------------------------------------------------------------- For information about J forums see http://www.jsoftware.com/forums.htm