I agree, this is really, really cool. :) Feel like working on the WinCE port? ;)
I think we've had suggestions before that we use a similar type of index-based compression. The ZIP/LZSS block compression we use now works really well, I believe, and gets module sizes down to the same range of about 1.5 to 2 MB for an OT & NT set in Latin-1. (Unicode texts will be a little larger until we get SCSU working and in use.) I assume index-based compression could actually improve standard search times if we rewrote the search engine to make use of it directly, whereas ZIP compression increases search times. But with regular expression searches, index-based compression would increase search times even more, because we'd end up having to reconstruct every verse anyway. --Chris
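
P.S. To make the search trade-off concrete, here is a rough Python sketch of the idea, not anything from our actual code. It assumes the simplest possible index-based scheme: every distinct word gets an integer ID and each verse is stored as a list of IDs. All names here (build_index, word_search, regex_search) and the lowercase, punctuation-free sample verses are made up for illustration.

```python
import re

def build_index(verses):
    """Assign each distinct word an integer ID and encode verses as ID lists."""
    vocab = {}    # word -> ID
    words = []    # ID -> word
    encoded = []  # one list of IDs per verse
    for text in verses:
        ids = []
        for w in text.split():
            if w not in vocab:
                vocab[w] = len(words)
                words.append(w)
            ids.append(vocab[w])
        encoded.append(ids)
    return vocab, words, encoded

def decode(words, ids):
    """Reconstruct the verse text from its word IDs."""
    return " ".join(words[i] for i in ids)

def word_search(vocab, encoded, word):
    """Plain word search: compare IDs directly, no verse is ever decoded."""
    wid = vocab.get(word)
    if wid is None:
        return []  # word never occurs anywhere in the text
    return [n for n, ids in enumerate(encoded) if wid in ids]

def regex_search(words, encoded, pattern):
    """Regex search: every verse has to be rebuilt as text before matching."""
    rx = re.compile(pattern)
    return [n for n, ids in enumerate(encoded) if rx.search(decode(words, ids))]

# Hypothetical sample data: three verses, lowercased, punctuation removed.
verses = [
    "in the beginning god created the heaven and the earth",
    "and the earth was without form and void",
    "and god said let there be light",
]
vocab, words, encoded = build_index(verses)

print(word_search(vocab, encoded, "earth"))
print(regex_search(words, encoded, r"without\s+form"))
```

The word search only touches integers, which is where the speedup would come from. The regex search calls decode() on every verse, so it pays the full reconstruction cost on top of the match itself.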