On 25 August 2011 20:13, Kevin Brubeck Unhammer <[email protected]> wrote:
> "Jimmy O'Regan" <[email protected]>
> writes:
>
>> $ wget http://downloads.dbpedia.org/3.7/ca/mappingbased_properties_ca.nt.bz2
>> $ bzgrep '/demonym' mappingbased_properties_ca.nt.bz2 |
>>   perl -MURI::Escape '-MUnicode::Escape qw(unescape)' -ane \
>>   'if (m!<http://dbpedia.org/resource/([^>]*)> <http://dbpedia.org/ontology/demonym> "([^"]*)"\@ca .!) {
>>      print uri_unescape($1)."\t".unescape($2)."\n";
>>    }'
>>
>> gives things like:
>> Alcover       Alcoverenc, alcoverenca
>> Aiguamúrcia   Aiguamurcienc, aiguamurcienca
>> Amer  Amerencs, amerenques
>> Almoster      Almosterenc, almosterenca
>> L'Albiol      Albiolenc, albiolenca
>> Alforja       Alforgenc, alforgenca
>> Argelaguer    Argelaguenc, argelaguenca
>> L'Arboç       Arbocenc, arbocenca
>> Arbúcies      Arbucienc, arbucienca
>> Albinyana     Albinyanenc, albinyanenca
>>
>> ...of course, it's not all /that/ neat and tidy:
>>
>> Newcastle_upon_Tyne   Geordie
>> Encarnación_(Paraguai)        encarnacero/a
>> Kristiansand  kristiansander
>> Bodø  bodøværing
>> Haugesund     haugesundar, -er
>
> Demonym is a name for someone who's from a certain place?

Yes.

> In that case,
> at least the last three should be correct and "official"[1].

Yes, but they're not Catalan, even though they're tagged as such
("foo"@ca).

Extracting assertions from Wikipedia isn't a neat and tidy process.
The DBpedia extraction framework doesn't have a facility for
specifying the language of a particular piece of text, and the
templates themselves don't have a standardised way of denoting it,
so the output is bound to be a bit noisy.
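
For anyone who would rather not install the Perl modules, here is a
rough Python sketch of the same extraction. It assumes the dump uses
percent-encoded resource names and \uXXXX escapes in literals, as the
one-liner above does; the function names (unescape_ntriples, demonyms,
read_demonyms) are made up for illustration. It only handles \u and \U
escapes, and like the original it trusts the @ca tag, so the
non-Catalan entries still come through.

```python
import bz2
import re
from urllib.parse import unquote

# Same shape as the regex in the Perl one-liner: a DBpedia resource,
# the demonym property, and a literal tagged "@ca".
TRIPLE = re.compile(
    r'<http://dbpedia\.org/resource/([^>]*)> '
    r'<http://dbpedia\.org/ontology/demonym> '
    r'"([^"]*)"@ca \.'
)

# N-Triples escapes of the form \uXXXX or \UXXXXXXXX.
ESCAPE = re.compile(r'\\u([0-9A-Fa-f]{4})|\\U([0-9A-Fa-f]{8})')


def unescape_ntriples(text):
    """Decode \\uXXXX and \\UXXXXXXXX escapes; other escapes are left alone."""
    return ESCAPE.sub(lambda m: chr(int(m.group(1) or m.group(2), 16)), text)


def demonyms(lines):
    """Yield (place, demonym) pairs from an iterable of N-Triples lines."""
    for line in lines:
        m = TRIPLE.search(line)
        if m:
            yield unquote(m.group(1)), unescape_ntriples(m.group(2))


def read_demonyms(path):
    """Read a .nt.bz2 dump and return the (place, demonym) pairs as a list."""
    with bz2.open(path, "rt", encoding="utf-8") as handle:
        return list(demonyms(handle))
```

Calling read_demonyms("mappingbased_properties_ca.nt.bz2") on the
downloaded file should give pairs along the lines of the listing
above, ready to be joined with a tab.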

-- 
<Sefam> Are any of the mentors around?
<jimregan> yes, they're the ones trolling you

