The granularity of a language-code system can be chosen to make data 
collection less rude. Depending on what data is being collected, and how it 
is being collected, a system of language codes can easily outgrow its 
usefulness. The primary function of Open Government websites is to attract 
data tourists.

For example, a US Government agency (I forget the exact name) involved with 
geo-location also uses the ISO 639 three-letter codes (347 of them). This 
is public information. Open Government websites "write" in the ISO 639 
two-letter codes (151 of them). A sample of 150+ Government websites showed 
only about 68 of these two-letter codes actually in use. The code list can 
be cut down to the codes actually in use with a single SQL table; there is 
very little need to attempt a SPARQL solution.

http://www.rustprivacy.org/faca/languages.php

Drop-down lists (for example) and SQL, XML, and CSV versions of the table 
are available for download.
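
To illustrate the kind of reduction meant by "an SQL table", here is a 
minimal sketch using Python's built-in sqlite3 module. It is not the code 
or the schema behind the page above: the table names (lang, seen), the 
function name codes_in_use, and the sample rows are all assumptions made up 
for this example, and the "seen" values are invented rather than taken from 
the survey of websites.

```python
import sqlite3

# Hypothetical sample data for illustration only. The codes are real
# ISO 639 two-letter codes, but the "seen" list is invented and is NOT
# the survey result described in this post.
ALL_CODES = [
    ("en", "English"), ("es", "Spanish"), ("fr", "French"),
    ("de", "German"), ("zh", "Chinese"), ("la", "Latin"),
]
SEEN_ON_SITES = ["en", "es", "en", "zh", "es", "en"]


def codes_in_use(all_codes, seen):
    """Return (code, name, hits) for codes that were actually observed."""
    conn = sqlite3.connect(":memory:")
    # Full code list (assumed table layout).
    conn.execute(
        "CREATE TABLE lang (code TEXT PRIMARY KEY, name TEXT NOT NULL)"
    )
    # One row per observation of a code on a website.
    conn.execute("CREATE TABLE seen (code TEXT NOT NULL)")
    conn.executemany("INSERT INTO lang VALUES (?, ?)", all_codes)
    conn.executemany("INSERT INTO seen VALUES (?)", [(c,) for c in seen])
    # The inner join drops every code that was never observed.
    rows = conn.execute(
        "SELECT lang.code, lang.name, COUNT(*) AS hits "
        "FROM lang JOIN seen ON seen.code = lang.code "
        "GROUP BY lang.code, lang.name "
        "ORDER BY hits DESC, lang.code"
    ).fetchall()
    conn.close()
    return rows


if __name__ == "__main__":
    for code, name, hits in codes_in_use(ALL_CODES, SEEN_ON_SITES):
        print(code, name, hits)
```

The reduced result is what would feed a short drop-down list; the full 
table stays available for anyone who needs a code that is not in common use.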

--Gannon 


