Erick Erickson [erickerick...@gmail.com] wrote:
> Of course I wouldn't be doing the work so I really don't have much of
> a vote, but it's not clear to me at all that enough people would actually
> have a use-case for 2b+ docs in a single shard to make it
> worthwhile. At that scale GC potentially becomes really unpleasant for
> instance....

Over the last few years we have seen a few such use cases here on the mailing list, and I would be very surprised if their number does not keep rising. Currently the rewards of a complete overhaul do not justify the work, but that is slowly changing. At the very least I find it prudent not to limit new Lucene/Solr interfaces to ints.

As for GC: Right now a lot of structures are single-array oriented (for example, using one long array to represent the bits in a bitset), which might not work well with current garbage collectors. A change to higher limits also means rethinking such approaches: if the garbage collector prefers objects below a certain size, then split the arrays into chunks of that size. Likewise, iterations over structures whose size is linear in the index size could be threaded. These are issues even with the current 2b limitation.

- Toke Eskildsen
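
P.S. To make the chunking idea above concrete, here is a minimal sketch of a bitset that is addressed by a long and stored as several bounded-size long arrays instead of one large array. It is not Lucene or Solr code: the class name ChunkedLongBitSet, the chunkShift parameter and the use of a parallel stream for counting are all assumptions made for illustration, and the right chunk size would depend on the garbage collector in use.

```java
import java.util.Arrays;

/** Hypothetical sketch, not Lucene code: a long-indexed bitset stored in bounded-size chunks. */
final class ChunkedLongBitSet {
    private final long[][] chunks;  // each chunk is a separate, bounded-size heap object
    private final int chunkShift;   // log2(words per chunk); an assumed tuning knob
    private final int chunkMask;    // selects the word offset inside a chunk
    private final long numBits;

    ChunkedLongBitSet(long numBits, int chunkShift) {
        if (numBits < 0 || chunkShift < 0 || chunkShift > 30) {
            throw new IllegalArgumentException("bad size or chunk shift");
        }
        this.numBits = numBits;
        this.chunkShift = chunkShift;
        this.chunkMask = (1 << chunkShift) - 1;

        long numWords = (numBits + 63) >>> 6;          // 64 bits per long word
        int wordsPerChunk = 1 << chunkShift;
        long numChunks = (numWords + wordsPerChunk - 1) >>> chunkShift;
        if (numChunks > Integer.MAX_VALUE - 8) {
            throw new IllegalArgumentException("too many chunks; use a larger chunkShift");
        }
        chunks = new long[(int) numChunks][];
        long remaining = numWords;
        for (int i = 0; i < chunks.length; i++) {
            // The last chunk may be shorter than the others.
            int len = (int) Math.min(wordsPerChunk, remaining);
            chunks[i] = new long[len];
            remaining -= len;
        }
    }

    void set(long index) {
        checkIndex(index);
        long word = index >>> 6;
        chunks[(int) (word >>> chunkShift)][(int) (word & chunkMask)] |= 1L << (index & 63);
    }

    boolean get(long index) {
        checkIndex(index);
        long word = index >>> 6;
        long bits = chunks[(int) (word >>> chunkShift)][(int) (word & chunkMask)];
        return (bits & (1L << (index & 63))) != 0;
    }

    /** Counts set bits; chunks are independent, so the scan can be split across threads. */
    long cardinality() {
        return Arrays.stream(chunks).parallel().mapToLong(chunk -> {
            long count = 0;
            for (long w : chunk) {
                count += Long.bitCount(w);
            }
            return count;
        }).sum();
    }

    private void checkIndex(long index) {
        if (index < 0 || index >= numBits) {
            throw new IndexOutOfBoundsException("index " + index + " outside [0, " + numBits + ")");
        }
    }
}
```

With chunkShift = 17, for example, each chunk holds 131072 longs (1 MiB), so a bitset for several billion documents becomes a few hundred medium-sized arrays rather than one very large one. Whether that actually helps would have to be measured against the collector in question.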