Hey all, I finally got around to playing around with a suffix array implementation and wanted to share. It's a pure julia port of sais <https://sites.google.com/site/yuta256/sais>, by Yuta Mori and while not the bleeding edge fastest suffix array sorting algorithm out there, I personally think it's the best bang for your buck (in terms of a relatively simple implementation and platform independence).
Compared to state of the art, the sais algorithm is consistently within 1.5-2.5x of the fastest (see detailed benchmarks here <https://code.google.com/p/libdivsufsort/wiki/SACA_Benchmarks>), and handily beats all other algorithms in terms of memory usage. The native julia implementation is around 10x faster than the Java sais implementation and only 1.2-1.3x slower than the native C. I still need to spin up a direct comparison of this julia version vs. Go's standard library implementation, which is based on the qsufsort algorithm (though benchmarks show sais always beating it). There's still some more work on the interface to do and implementing some algorithms that build on the suffix array, so for now the only functionality is getting the sorted suffix array for a string/corpus. I'm also planning on playing around with some distributed/parallel features for the core algorithm to see what other shenanigans are possible. https://github.com/quinnj/SuffixArrays.jl -Jacob
