On Tuesday, October 6, 2015 at 11:44:52 AM UTC-4, Scott Jones wrote:
>
> Well, I asked about how to get it as fast as possible.
> Turns out, leading_zeros does just what I want, using the lzcnt*
> instruction :-)
> If you don't count the unnecessary frame setup (pushq %rbp; movq %rsp,
> %rbp) and frame pop/return (popq %rbp ; ret), the whole thing I want boils
> down to 4 instructions, nicely parameterized by Julia on the type ;-)
>
What this means is that I can take the column index, row index, and input array index of each entry and pack them tightly into a work vector (with the indices of zero input values already removed). That vector can then be sorted very quickly with quicksort, and the result is effectively stable: because each packed value includes the input array index, there are no duplicate keys. Love the way Julia does a lot of the work for me that I used to have to do by hand ;-) Hats off, again, to the core team and all the contributors!
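
For anyone following along, here is a rough sketch of the packing idea, not my actual code. It assumes current Julia 1.x syntax, a single UInt64 per entry, and a field order of column, then row, then input index. The names bits_needed, pack_entries and unpack_entry are made up for illustration; only leading_zeros and sort! with alg = QuickSort come from Base.

```julia
# Bits needed to represent n (hypothetical helper built on leading_zeros).
bits_needed(n::T) where {T<:Unsigned} = 8 * sizeof(T) - leading_zeros(n)

# Pack (col, row, idx) into one UInt64, most significant field first, so that
# sorting the packed keys orders entries by column, then row, then input index.
function pack_entries(cols, rows, vals)
    n = length(vals)
    cbits = bits_needed(UInt64(maximum(cols)))
    rbits = bits_needed(UInt64(maximum(rows)))
    ibits = bits_needed(UInt64(n))
    cbits + rbits + ibits <= 64 || error("indices do not fit in 64 bits")
    work = UInt64[]
    for i in 1:n
        vals[i] == 0 && continue   # drop zero input values before sorting
        key = (UInt64(cols[i]) << (rbits + ibits)) |
              (UInt64(rows[i]) << ibits) |
              UInt64(i)
        push!(work, key)
    end
    return work, rbits, ibits
end

# Recover (col, row, idx) from a packed key.
function unpack_entry(key::UInt64, rbits, ibits)
    idx = key & ((UInt64(1) << ibits) - 1)
    row = (key >> ibits) & ((UInt64(1) << rbits) - 1)
    col = key >> (rbits + ibits)
    return Int(col), Int(row), Int(idx)
end

# Made-up sample data: entry 2 is zero, entries 1 and 3 share (col, row).
cols = [2, 1, 2, 1, 1]
rows = [1, 3, 1, 2, 3]
vals = [5.0, 0.0, 7.0, 1.0, 2.0]

work, rbits, ibits = pack_entries(cols, rows, vals)
sort!(work; alg = QuickSort)   # every key is unique, so stability is moot
sorted = [unpack_entry(k, rbits, ibits) for k in work]
```

In this sample the two entries that share column 2, row 1 come out in input order (index 1 before index 3), even though quicksort itself is not a stable algorithm, because the input index sits in the low bits of each key.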
