Yes. But how can we be sure that the first 20 or 512 characters of a document are in the same language as the whole document?
I think the language identifier must process the whole document to reliably identify its main language.
The number of characters analyzed seems like it would be a good configuration option. Folks who want to do a better job of language identification can set it higher, so that more text is analyzed.
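
To make the idea concrete, here is a rough sketch of what such an option might look like. It is not the actual identifier: the name `analyze_length`, the function `identify_language`, and the tiny stopword lists are all made up for illustration, and a real identifier would use a proper statistical model. The only point is that the amount of text examined is a parameter, with `None` meaning "the whole document".

```python
from typing import Optional

# Hypothetical stopword lists, for illustration only; a real identifier
# would use something like n-gram profiles instead.
STOPWORDS = {
    "en": {"the", "and", "of", "is", "to"},
    "fr": {"le", "la", "et", "est", "les"},
}


def identify_language(text: str, analyze_length: Optional[int] = None) -> str:
    """Guess the language from at most `analyze_length` leading characters.

    `analyze_length` is the assumed configuration option; None means
    the whole document is analyzed.
    """
    sample = text if analyze_length is None else text[:analyze_length]
    words = sample.lower().split()
    # Score each language by how many of its stopwords appear in the sample.
    scores = {
        lang: sum(word in stopwords for word in words)
        for lang, stopwords in STOPWORDS.items()
    }
    return max(scores, key=scores.get)


# A document with a short English header followed by a French body.
doc = "The history of the project is here. " + (
    "Le chat est sur la table et les enfants jouent dans le jardin. " * 3
)

short_guess = identify_language(doc, analyze_length=20)  # header only
full_guess = identify_language(doc)                      # whole document
```

With this sample document, the 20-character setting sees only the English header, while the unlimited setting sees the mostly French body, which is exactly the concern raised above. A higher setting costs more processing time per document, so leaving the choice to the user seems reasonable.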
Doug
