Hi,
Thanks. I will try this out when I'm less busy.
What about making some kind of add-on to Apertium to handle proper
names? It should be far easier than the finite-state transducer for
transliteration that already exists, shouldn't it?
Yours,
Per Tunedal

On Fri, May 31, 2013, at 15:06, Jimmy O'Regan wrote:
> On 30 May 2013 18:47, Francis Tyers <[email protected]> wrote:
> > On Thu, 30 May 2013 at 19:42 +0200, Per Tunedal wrote:
> >> The most difficult part would be to find the names. Perhaps someone has
> >> any ideas?
> >
> > In Icelandic--English, regular expressions are used. See e.g. pardefs
> > for "persons" and "lastnames" in is.dix
> >
> > This is not altogether recommended though, as regular expressions slow
> > down your transducer. What you could do instead is run them over a large
> > corpus and then mass-add the matches after a superficial check.
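
A rough sketch of the corpus step described above, in Python. It is an illustration only, not how the Icelandic--English pair does it: the function name candidate_names, the min_count threshold, and the "capitalised word that is not sentence-initial" heuristic are all assumptions made for this example. It only produces a frequency-sorted list for a human to check; turning checked names into dictionary entries is left out, since the paradigm names depend on the language pair.

```python
import re
from collections import Counter

# Runs of letters (any script), optionally joined by internal hyphens.
WORD = re.compile(r"[^\W\d_]+(?:-[^\W\d_]+)*")
# Naive sentence splitter: end punctuation followed by whitespace.
# Abbreviations such as "Mr. Smith" will be split wrongly.
SENTENCE_END = re.compile(r"[.!?]\s+")


def candidate_names(text, min_count=2):
    """Return (word, count) pairs for capitalised, non-sentence-initial words.

    Hypothetical helper: the output is a list of candidates for manual
    checking, not a list of confirmed proper names.
    """
    counts = Counter()
    for sentence in SENTENCE_END.split(text):
        words = WORD.findall(sentence)
        # Skip the first word: it is capitalised whether or not it is a name.
        for word in words[1:]:
            if word[0].isupper() and word[1:].islower():
                counts[word] += 1
    return [(w, c) for w, c in counts.most_common() if c >= min_count]


if __name__ == "__main__":
    sample = (
        "Yesterday Jón met Anna in Reykjavik. "
        "Anna said that Jón was late. The weather was bad."
    )
    for word, count in candidate_names(sample, min_count=1):
        print(f"{count}\t{word}")
```

Raising min_count on a large corpus should cut down the noise before the superficial check. Names that only ever occur sentence-initially are missed by this heuristic.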
> 
> Census data is easy to find, gazetteers for NER are easy to find,
> en.wiktionary has categories for names
> (http://en.wiktionary.org/wiki/Category:Surnames_by_language
> http://en.wiktionary.org/wiki/Category:Male_given_names_by_language
> http://en.wiktionary.org/wiki/Category:Female_given_names_by_language),
> as do en.wikipedia (http://en.wikipedia.org/wiki/Category:Surnames
> http://en.wikipedia.org/wiki/Category:Given_names), da.wikipedia
> (http://da.wikipedia.org/wiki/Kategori:Efternavne
> http://da.wikipedia.org/wiki/Kategori:Fornavne), and sv.wikipedia
> (http://sv.wikipedia.org/wiki/Kategori:Efternamn
> http://sv.wikipedia.org/wiki/Kategori:Förnamn), and Europarl has
> speaker annotation which contains the name of the speaker.
> 
> -- 
> <Sefam> Are any of the mentors around?
> <jimregan> yes, they're the ones trolling you
> 
> ------------------------------------------------------------------------------
> Get 100% visibility into Java/.NET code with AppDynamics Lite
> It's a free troubleshooting tool designed for production
> Get down to code-level detail for bottlenecks, with <2% overhead.
> Download for free and get started troubleshooting in minutes.
> http://p.sf.net/sfu/appdyn_d2d_ap2
> _______________________________________________
> Apertium-stuff mailing list
> [email protected]
> https://lists.sourceforge.net/lists/listinfo/apertium-stuff
