> Since clucene isn't aiming for either UTF-16 OR UTF-32, I don't
> believe you'll be able to. A better approach would be to get the size
> of "content" and set a value based on that.
FWIW, I just did this for both searching and index creation. I over-allocated the length by 500, which is overkill. I don't really think it needs to be over-allocated at all: content will be UTF-8, which always uses at least one byte per character, whereas we're converting to UCS-2 or UCS-4, which both use exactly one unit per character, so using the length exactly should always work, I think.

Anyway, the result is a 3 second decrease in indexing time for ESV, which is fairly substantial, I think. I'd definitely recommend doing something like this.

Matthew
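P.S. For anyone who wants to check the sizing argument, here is a rough sketch of it in isolation. It is not the actual patch and does not use any CLucene or SWORD calls: utf8ToWide is a made-up helper, and it assumes well-formed UTF-8 and a wchar_t wide enough to hold each code point (it does not produce surrogate pairs). The only point is that a buffer sized to the UTF-8 byte length, plus one for the terminator, is never too small.

```cpp
#include <cassert>
#include <cstddef>
#include <string>
#include <vector>

// Hypothetical helper (not part of CLucene or SWORD): decodes UTF-8 into one
// wchar_t per code point. Returns the number of wide characters written,
// excluding the terminator.
std::size_t utf8ToWide(const std::string& utf8, std::vector<wchar_t>& out) {
    // Every code point takes at least one byte in UTF-8, so the byte length
    // (plus one for the terminator) is always enough room.
    out.assign(utf8.size() + 1, L'\0');

    std::size_t n = 0;
    for (std::size_t i = 0; i < utf8.size();) {
        unsigned char lead = static_cast<unsigned char>(utf8[i]);
        std::size_t len = 1;
        unsigned long cp = lead;
        if (lead >= 0xF0)      { len = 4; cp = lead & 0x07; }
        else if (lead >= 0xE0) { len = 3; cp = lead & 0x0F; }
        else if (lead >= 0xC0) { len = 2; cp = lead & 0x1F; }
        // Fold in the continuation bytes (6 payload bits each).
        for (std::size_t k = 1; k < len && i + k < utf8.size(); ++k) {
            cp = (cp << 6) | (static_cast<unsigned char>(utf8[i + k]) & 0x3F);
        }
        out[n++] = static_cast<wchar_t>(cp);
        i += len;
    }
    assert(n <= utf8.size());  // the sizing argument: never more units than bytes
    out[n] = L'\0';
    return n;
}
```

Calling utf8ToWide(content, buf) with a std::string and a std::vector<wchar_t> sizes buf from content.size() alone, so there is no fixed maximum and no extra padding to pick.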