NB: All decisions herein are submitted to be approved by the PMC.
=========================================================================
Chair: Francis Tyers
Participants:
Felipe Sánchez-Martínez
Víctor Sánchez-Cartagena
Xavier Ivars-Ribes
Miquel Esplà Gomis
Lluís Villarejo
Antonio Toral
Jim O'Regan
Tihomir Rangelov
Gema Ramírez-Sánchez
Sergio Ortiz-Rojas (via chat)
=========================================================================
AGENDA
* Welcome everyone!
* Apertium in 2010:
- GSoC: 9 projects!!!
- GCI: 92 tasks
- Other projects:
+ Icelandic-English (Spectie - NILS),
+ sme-nob (Unhammer - Norwegian government),
+ Tradubi (DLSI & final project students),
+ web application to report errors (Softvalencià),
+ hybrid Apertium-Moses prototype (to be released),
+ TM&Apertium (UOC: http://apertium.uoc.edu and UPV:
http://babel.cc.upv.es),
+ multi-domain management (to be released - UOC)
- Improvements:
+ new pairs (26 stable pairs!!!!)
+ technological improvements -- lexical selection in progress :'(
- Community: more people, almost same results
- Publications: http://wiki.apertium.org/wiki/Publications
- Events where Apertium was present: MT-Marathon, LREC (no
presentation), EAMT, UPV & UA seminars
- Funding:
Plans for 2011:
* Election results coming soon! Mikel = president. PMC = 7 wise men
* GSoC 2011?
* GCI 2011?
* Other projects:
- Luxembourg Apertium Workshop,
- Vitaka thesis (elicitation of data for Apertium),
- Espla thesis (integration of Apertium in CAT),
- joregan (EAMT Basque-English)
* Improvements: new pairs/ technological improvements
* Community: more people, more results!!
* Publications:
* Events where Apertium will be present: WMT-2011 (Edinburgh), EAMT
(Belgium), ESSLLI (Slovenia, Lexical resources & CG), EAMT-School
(Russia)
* Funding:
=========================================================================
The main conclusions/comments were:
* GSoC 2010: 9 students granted.
- For GSoC 2011: it was proposed that votes from community members and
mentors be consultative rather than binding. The President and the PMC
will decide the final projects for GSoC. People agreed.
* GCI: it was a good experience, time consuming but very interesting (92
tasks - 15 students - 2/3 mentors). Take a look at the project rankings.
* Other projects:
- Tradubi and scaleMT: Make sure the service is up to date and figure
out a way of providing all the pairs.
- Softcatalà: Participants suggested that Apertium should have an
error-reporting system similar to (or the same as) the one used by
Softvalencià or Softcatalà. Xavi said it would be easy to do and that
filtering bad feedback would be a must.
- Prompsit's hybrid MT: an Apertium-Moses hybrid system is being
developed by Prompsit. Plans to make it available during 2011 were
reported. Participants discussed whether it should be released under
the Apertium project or the Moses project. Sergio suggested Apertium
as "the software can be viewed as apertium components" and "it and
uses a lot of apertium". Other participants suggested Moses, to
increase Apertium's visibility in the Moses community. People would
have to go to Apertium to download the data in any case. Also, data
changes more rapidly than code. No decision was made.
- Apertium + TM: Villarejo commented that this project was instigated
by UOC and that the first version was available although it was very
limited (short sentences, short TM and only perfect matches) and
that a new version was coming soon. Gema asked Sergio to sum up
the improvements in the new version. Sergio said:
"The module consists of a couple of programs, currently written in
python. One goes after the deformatter and the other before the
reformatter. The module uses a class of fulltext index to match
and edit distance to evaluate accuracy.
The first program reads the translation memory in TMX,
processes text and matches full sentences but doing smart things
with numbers, for example the resulting matched segments are
encapsulated as superblanks and encoded with base64 just to not
have problems with escaping characters. The other program only
unescapes these blocks before reformatter. That's all in a
nutshell."
- Multi-domain vocabulary management: A project instigated by UOC was
started in 2010 to introduce domain-adapted terms into Apertium,
associated with domain codes agreed on by some universities.
The integration into a language package, the data formats and some
work on the data have already been done, and a first version will
soon be available through the UOC MT service.
Nobody could remember the proposal for integration and formats
(discussed by almost all PMC members in the mail thread "Diccionario
por áreas temáticas para la UOC"). Gema committed to resend it (or
send it for the first time) to Fran and Jim.
* New pairs:
- Icelandic--English (is-en)
- Macedonian--Bulgarian (mk-bg)
- Macedonian--English (mk-en)
- Italian--Catalan (it-ca)
- And in incubator: Czech--Polish, French--Portuguese,
Catalan--Sardinian and Dutch--Afrikaans
- Participants highlighted the importance of having a language pair
maintainer.
Participants suggested that some work on the web be done.
* Community: we have 324 people registered on SourceForge, but almost the
same people are active. How people can be encouraged to be active is a
concern of the participants. Sergio pointed out that the SourceForge
SVN server does not work and asked whether we are going to self-host
on the new server. Everyone was in favour of self-hosting, but the SVN
repository is large and would be difficult to self-host. Self-hosting
the web and translators is easier. The Bytemark server is very limited
(400MB). Pairs cannot be installed there. Participants suggested using
the web services, starting with xixona/elx, with ScaleMT running the
language pairs. Felipe suggested having a page per translator on the
web with a direct-download link to a SourceForge mirror and a
paragraph of text describing what the user can expect from the pair.
Everyone else agreed this would be a good idea.
* The Apertium elections are about to finish. Mikel will be the
president. The PMC members will be announced soon. Unhammer is running
the show.
* GSoC 2011: besides the proposal for voting changes, Fran asked for
someone else to organise GSoC this year. He said "I don't mind doing
it but it is pretty much a full time job herding cats... I mean
mentors/students". Nobody stepped forward.
* Other projects going on in Apertium for 2011:
- Luxembourg Apertium Workshop: a 3-day workshop on Apertium aimed at
translators from the European Community will be held in Luxembourg
on 2-4 May.
- Víctor explained his thesis project and the progress on a hybrid
Apertium-Moses system based on data generation from dictionaries and
rule patterns. Fran suggested he look at the EAMT 2009 Breton-French
paper.
- Miquel explained his thesis project and the motivation behind it
(integrating MT in CAT tools). The FP7 project "HelpUTrans" was
mentioned. Apertium will be one of the systems included as a test
case.
* Funding:
- Felipe mentioned the crowdfunding initiatives that have appeared
recently in Spain (such as www.lanzanos.com). Having no legal
entity such as an association is always an obstacle to accessing
different kinds of funding.
* Web:
- Update Tinylex / online dictionary lookup to include new pairs.
- Participants agreed that the Apertium website needs improvement and
changes. Xavi was asked to do them, and participants suggested paying
for this from the remaining Apertium funds.
* Participants agreed that we should continue to be present at
conferences and workshops to make Apertium visible.
- Fran promises to stop writing articles to concentrate on his
thesis. Except maybe one article to EAMT if someone else does all
the work.
* Participants decided to stop here.
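Editorial sketch of the Apertium + TM module: Sergio's description above
(one program after the deformatter that wraps matched segments as
base64-encoded superblanks, and one before the reformatter that unwraps
them) could look roughly like the Python below. This is only an
illustration under stated assumptions, not the actual module: the
"tm:" marker, the function names and the number-slot trick are all
hypothetical, a plain dictionary lookup stands in for the full-text
index and edit-distance scoring that Sergio mentions, TMX parsing is
omitted, and numbers are assumed to appear in the same order in source
and target.

```python
import base64
import re

NUM = re.compile(r"\d+")
SLOT = "\0"    # internal placeholder for a number (hypothetical choice)
MARK = "tm:"   # hypothetical marker identifying a wrapped TM match
BLOCK = re.compile(r"\[tm:([A-Za-z0-9+/=]+)\]")


def build_index(pairs):
    """Map number-masked source sentences to number-masked targets."""
    return {NUM.sub(SLOT, src): NUM.sub(SLOT, tgt) for src, tgt in pairs}


def wrap_match(sentence, index):
    """Return a base64 superblank if the sentence is in the TM,
    otherwise return the sentence unchanged for normal translation."""
    numbers = NUM.findall(sentence)
    target = index.get(NUM.sub(SLOT, sentence))
    if target is None or target.count(SLOT) != len(numbers):
        return sentence
    # Re-insert the numbers of the input sentence, in order.
    parts = target.split(SLOT)
    filled = "".join(p + n for p, n in zip(parts, numbers + [""]))
    payload = base64.b64encode(filled.encode("utf-8")).decode("ascii")
    return "[" + MARK + payload + "]"


def unwrap(text):
    """Decode every wrapped block back into plain text."""
    return BLOCK.sub(
        lambda m: base64.b64decode(m.group(1)).decode("utf-8"), text
    )


if __name__ == "__main__":
    index = build_index([("Page 3 of 10", "Página 3 de 10")])
    wrapped = wrap_match("Page 7 of 12", index)
    print(wrapped)
    print(unwrap(wrapped))
```

Encoding the matched translation means no characters inside the block
need escaping on the way through the pipeline, which is the reason
Sergio gives for using base64.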
=========================================================================
If anyone has any comments / amendments, please email them to me
offlist. When the minutes are passed by the PMC I will add them to the
Wiki. I think it would be good to have a page with a record of
'extraordinary meetings' / 'minutes'.
Fran