I managed to find a dataset on the Ethnologue website, though it doesn't look like it is open source, so I need to check with them exactly how I'm allowed to use it: http://www.ethnologue.com/codes/download-code-tables
Thanks for the explanation, Philippe. I know it is not an easy issue. I'm looking for different resources on the web, so any specific links or feedback would be helpful.

On 17 September 2016 at 13:35, Philippe Verdy <[email protected]> wrote:

> Not all languages are sorted, only those for which there are released data
> in CLDR.
> And languages frequently belong to several countries/territories at the
> same time, with different official or recognized status (itself independent
> of the number of actual speakers, which is very frequently only roughly
> estimated).
> Some countries give official statistics about their national or
> regional languages, but frequently these stats are old, or underestimated
> or overestimated for political reasons, or some languages are merged as if
> they were only one, or simply discarded if they are considered locally as
> secondary languages, even if the official language is only superficially
> understood but counted as the primary one.
> Statistics also forget native speakers living abroad in a
> diaspora, and secondary learners of a language taught in foreign countries.
>
> 2016-09-17 11:19 GMT+02:00 Mats Blakstad <[email protected]>:
>
>> Hi
>>
>> Is there any dataset that contains all languages in the world sorted by
>> country/territory?
>>
>> I found this at Unicode, but it seems to contain only the most spoken
>> languages in each country and not the smaller ones:
>> http://www.unicode.org/cldr/charts/latest/supplemental/territory_language_information.html
>>
>> Thanks in advance for help.
>>
>> Best regards
>> Mats Blakstad

