2014-07-08 21:53 GMT+02:00 Marco A.G.Pinto <marcoagpi...@mail.telepac.pt>:

>  Alberto sent me via e-mail the most recent tag dictionary for pt_PT:
> https://dl.dropboxusercontent.com/u/30674540/pt-preao.dump_20140708.gz
>

Comparing the two dictionaries, I see pros and cons in each.

1) Number of entries

Freeling: 1,257,867 entries
pt-preao: 1,061,895 entries

2) The tags are slightly different.

3) Freeling's verb tags carry more information. Some grammar rules that
depend on it would be infeasible with pt-preao.

Freeling:
colhemos colher VMIP1P0
colhemos colher VMIS1P0

pt-preao:
colhemos colher VIPP
colhemos colher VIHP
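
To make the difference concrete, here is a minimal Python sketch that
decodes a Freeling verb tag position by position. It assumes the Portuguese
tags follow the Spanish EAGLES-style layout (V, type, mood, tense, person,
number, gender), which is not confirmed by the Portuguese documentation.
The function name and the value tables are my own, for illustration only.

```python
# Assumption: Portuguese Freeling verb tags use the same 7-position layout
# as the Spanish tagset: V + type + mood + tense + person + number + gender.
MOOD = {"I": "indicative", "S": "subjunctive", "M": "imperative",
        "N": "infinitive", "G": "gerund", "P": "participle"}
TENSE = {"P": "present", "I": "imperfect", "F": "future",
         "S": "past", "C": "conditional"}
NUMBER = {"S": "singular", "P": "plural"}


def decode_verb_tag(tag):
    """Split a 7-character verb tag into named features (hypothetical helper)."""
    if len(tag) != 7 or tag[0] != "V":
        raise ValueError("not a 7-character verb tag: %r" % tag)
    return {
        "type": tag[1],
        "mood": MOOD.get(tag[2]),      # None when the code is unknown or '0'
        "tense": TENSE.get(tag[3]),
        "person": int(tag[4]) if tag[4] in ("1", "2", "3") else None,
        "number": NUMBER.get(tag[5]),
    }


for tag in ("VMIP1P0", "VMIS1P0"):
    print(tag, decode_verb_tag(tag))
```

Under that assumption, the two readings of "colhemos" differ only in tense,
and both carry an explicit person field that a rule could test.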

4) pt-preao contains verbs with hyphenated suffixes, which are absent from
Freeling. Whichever dictionary we choose, the tokenizer and the
disambiguator must be adapted to match.

abafar-lha abafar VN
abafar-lhas abafar VN
abafar-lhe abafar VN
abafar-lhe-ei abafar VIFS
abafar-lhe-eis abafar VIFP
abafar-lhe-emos abafar VIFP
abafar-lhe-ia abafar VCHS
abafar-lhe-iam abafar VCHP
...
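
A rough way to see how many such forms a dictionary contains is to filter
the dump for verb entries whose word form has a hyphen. The sketch below
assumes one whitespace-separated "form lemma tag" entry per line, as in the
samples above, and that verb tags start with "V"; the function names are
hypothetical.

```python
def parse_entry(line):
    """Split one dictionary line into (form, lemma, tag)."""
    form, lemma, tag = line.split()
    return form, lemma, tag


def hyphenated_verb_forms(lines):
    """Return the word forms of verb entries that contain a hyphen."""
    forms = []
    for line in lines:
        if not line.strip():
            continue  # skip blank lines
        form, lemma, tag = parse_entry(line)
        # Assumption: every verb tag begins with "V" in both tagsets.
        if tag.startswith("V") and "-" in form:
            forms.append(form)
    return forms


sample = [
    "colhemos colher VIPP",
    "abafar-lha abafar VN",
    "abafar-lhe-ei abafar VIFS",
]
print(hyphenated_verb_forms(sample))
```

Running the same filter over both dumps would show how much of the entry
count difference comes from these forms.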

I would certainly choose Freeling, but that's up to you.

Regards,
Jaume Ortolà

>  Is it possible to add this one?
>
> About creating the rules: do I really need to use the online rule editor,
> or can I still edit grammar.xml directly and commit?
>
>
> Thanks!
>
> Kind regards,
>      >Marco A.G.Pinto
>        -----------------------
>
>
> On 08/07/2014 20:43, Jaume Ortolà i Font wrote:
>
> Marco,
>
>  I have committed a Portuguese tagger dictionary built from the Freeling
> dictionary. It's more than enough to start. Now you have more than a million
> tagged word forms.
>
>  Once the code is updated, you will be able to write your own rules in
> the online rule editor:
> http://community.languagetool.org/ruleEditor2/index
>
>  The Freeling tags for Portuguese are not fully documented [1], but they
> are similar to Spanish or Catalan.
>
>   Regards,
> Jaume Ortolà
>
>  [1]
> http://nlp.lsi.upc.edu/freeling/index.php?option=com_content&task=view&id=18&Itemid=47
>
>
>
> 2014-07-08 20:56 GMT+02:00 Marco A.G.Pinto <marcoagpi...@mail.telepac.pt>:
>
>>  Alberto Simões from Minho University told me that the Freeling tagger is
>> not up to date.
>>
>> The people at Minho University are in charge of the official pt_PT speller
>> for OpenOffice, LibreOffice and Mozilla.
>>
>> They maintain the dictionary and release regular updates.
>>
>> Marcin, if you or anyone needs information, please contact Alberto.
>>
>> Thanks!
>>
>> Kind regards,
>>       >Marco A.G.Pinto
>>        ------------------------
>>
>>
>
>
> --
>
>
> ------------------------------------------------------------------------------
> Open source business process management suite built on Java and Eclipse
> Turn processes into business applications with Bonita BPM Community Edition
> Quickly connect people, data, and systems into organized workflows
> Winner of BOSSIE, CODIE, OW2 and Gartner awards
> http://p.sf.net/sfu/Bonitasoft
> _______________________________________________
> Languagetool-devel mailing list
> Languagetool-devel@lists.sourceforge.net
> https://lists.sourceforge.net/lists/listinfo/languagetool-devel
>
>
