Did I mix up the Canaries and the Balearics? I'll have to look again for the articles about Catalan that described its 4 main dialects. I was probably remembering one of them as being in the Canaries, but you may be right that it is really the Balearics. May I reformulate the examples?
The problem with language tags is that using ISO 3166 codes (which are defined by administrative divisions rather than by linguistic regions) is a workaround. These unstable administrative codes do not match language usage well. And there is evidence that:
- administrative regions that have a code in ISO 3166 cover several linguistic regions, which will still need to be distinguished in language tags.
- some linguistic regions cover several countries. The country distinction is not much help when the real limits are linguistic areas.

I mentioned Catalan, which is a good example: one of its 4 main dialects is spoken in 3 countries (Spain, Andorra, France). Having to use a country code after the main language code but before a region code is a hack for separating those languages appropriately. Codes like "ca-ES", "ca-AD", "ca-FR" will not help to make the appropriate distinctions between the 4 main dialects of Catalan.

I could say the same thing about the 4 main dialects of Breton, all within the same administrative region of France (Brittany), where the next level of encoding in ISO 3166 is the numeric department: the four variants of Breton are not correctly identified by the purely administrative definition of French departments (which make absolutely no sense as linguistic regions). If you think about classifying the various dialects of languages in Borneo, Africa, Mexico, Brazil or China, you'll find the same caveats: ISO 3166 does not offer a correct way to encode linguistic regions for use in RFC 3066 language tags...

So we are left with NOT using any RFC 3066 code for these variants, and using specific language codes instead. Unfortunately, those variants are then not easy to group together in software that does not consider the specific language variants and will propose a default "standard" form.
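To make the mismatch concrete, here is a small sketch in Python. It is only an illustration, not real linguistic data: the labels "dialect-1" to "dialect-4" are placeholders, the only placement taken from the discussion above is that one dialect spans Spain, Andorra and France, and putting the other three in Spain alone is purely my assumption for the example. The helper names are made up as well.

```python
def split_tag(tag):
    """Split an RFC 3066 style tag into (primary language, other subtags)."""
    parts = tag.lower().split("-")
    return parts[0], parts[1:]

# HYPOTHETICAL mapping, for illustration only.
# "dialect-1" stands for the dialect said to be spoken in three countries;
# placing the other three in Spain alone is an assumption, not a fact.
DIALECT_TAGS = {
    "dialect-1": ["ca-ES", "ca-AD", "ca-FR"],
    "dialect-2": ["ca-ES"],
    "dialect-3": ["ca-ES"],
    "dialect-4": ["ca-ES"],
}

def dialects_per_tag(dialect_tags):
    """Invert the mapping: for each language tag, list the dialects it covers."""
    result = {}
    for dialect, tags in dialect_tags.items():
        for tag in tags:
            result.setdefault(tag, []).append(dialect)
    return result

def ambiguous_tags(dialect_tags):
    """Return the tags that cannot tell two or more dialects apart."""
    inverted = dialects_per_tag(dialect_tags)
    return sorted(tag for tag, dialects in inverted.items() if len(dialects) > 1)

if __name__ == "__main__":
    for tag, dialects in sorted(dialects_per_tag(DIALECT_TAGS).items()):
        print(tag, "->", ", ".join(dialects))
    print("ambiguous:", ambiguous_tags(DIALECT_TAGS))
```

Under these assumptions a single tag ends up covering several dialects while a single dialect needs three different tags, so the country subtag neither separates the dialects nor groups them.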
For correct linguistic classification, it seems then that the Ethnologue classification would offer a better model, if it proposed an appropriate encoding and not only a classification by groups and names. So RFC 3066 language tags (not ISO 639 language _codes_) are for now a nightmare to handle, and the problem is made even more serious by the inclusion of ISO 3166, which was clearly not designed for language classification but for administrative and legal usages...

