There are ways to identify the language of very short texts.  In fact,
one can even detect where the language changes within a document that
contains text in multiple languages.

That's not to say that Twitter uses such methods, just that it's
possible to identify languages in tweet-size documents.
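
To make that concrete, here is a minimal sketch of one common approach,
character trigram profiles compared by cosine similarity. The technique
is my choice for illustration, not a claim about what Twitter does. The
two training samples and the names SAMPLES, PROFILES and guess_language
are assumptions invented for this example; a real system would build its
profiles from much larger corpora.

```python
from collections import Counter
import math

# Tiny illustrative training samples (assumption: real profiles would be
# built from far larger corpora, with one entry per supported language).
SAMPLES = {
    "en": "the quick brown fox jumps over the lazy dog and this is "
          "what the people think about that",
    "de": "der schnelle braune fuchs springt über den faulen hund und "
          "das ist was die leute darüber denken",
}

def trigrams(text):
    """Count overlapping character trigrams, padding the ends with spaces."""
    padded = f"  {text.lower()} "
    return Counter(padded[i:i + 3] for i in range(len(padded) - 2))

def cosine(a, b):
    """Cosine similarity between two trigram count vectors."""
    dot = sum(a[g] * b[g] for g in a.keys() & b.keys())
    norm = (math.sqrt(sum(v * v for v in a.values()))
            * math.sqrt(sum(v * v for v in b.values())))
    return dot / norm if norm else 0.0

# One trigram profile per language.
PROFILES = {lang: trigrams(sample) for lang, sample in SAMPLES.items()}

def guess_language(text):
    """Return the language whose profile is most similar to the text."""
    grams = trigrams(text)
    return max(PROFILES, key=lambda lang: cosine(grams, PROFILES[lang]))
```

Calling guess_language() on each sentence or fixed-size window of a
longer document, rather than on the whole thing, is one simple way to
spot where the language changes.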

On Oct 26, 6:19 am, Nicole Simon <> wrote:
> The language selection is useless, even when limited to English.
> The problem is probably that the usual methods of identifying a
> language are more or less based on longer text - and not text stripped
> down to 140 chars or less.
> If you want to do detection, e.g. in search, rather get all
> results and apply common-sense methods, like grepping for
> special words which are most likely only used in your
> language of choice.
> For a real 'select your choice here' option it is not going to work.
> At the current rate, this is hurting more than helping.
> I instruct users in my book to use searches that
> limit themselves, i.e. to use German words in the search if possible.
> Nicole
> --
> My German Twitter site
> Contact:
> skype: nicole.simon /
> phone: +49 451 899 75 03 / mobile: +49 179 499 7076
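
For comparison, the "grep for special words" idea from the quoted
message could be sketched roughly as below. The marker-word list, the
min_hits threshold and the function names are all hypothetical, picked
only to illustrate the idea; a usable list would need to be longer and
chosen with care.

```python
import re

# Hypothetical marker words (assumption): common German words that are
# unlikely to appear in tweets written in other languages.
GERMAN_MARKERS = {"und", "nicht", "ich", "das", "ist", "auch", "für", "über"}

def looks_german(tweet, min_hits=2):
    """Return True if the tweet contains at least min_hits marker words."""
    words = re.findall(r"\w+", tweet.lower())
    hits = sum(1 for word in words if word in GERMAN_MARKERS)
    return hits >= min_hits

def filter_german(tweets, min_hits=2):
    """Keep only the tweets that look German, preserving their order."""
    return [tweet for tweet in tweets if looks_german(tweet, min_hits)]
```

Requiring two hits instead of one is a cheap guard against a single
coincidental match, at the cost of missing very short tweets.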
