Hello,

I have not posted to kde-devel in a long time, and I hope this message will be
well received even though it is not directly related to KDE. Some of you may
remember that a few years ago I mentioned the possible release, under an open
source licence, of the natural language analyzer I work on at my job, which
could be useful to KDE, for example in Nepomuk.

Well, here we are at last: we are offering a (paid) internship to help us
clean up the code and prepare the release before putting it online.

You will find the internship offer below.

Regards,

Gaël


Internship: Release of multilingual linguistic analysis software under a
Free/Libre Open Source Software Licence
(Compulsory internship with internship agreement, Master 1 or 2 level)

CEA LIST
Vision and Content Engineering Laboratory

The internship will take place on the LVIC premises at Nano-INNOV, located in
Palaiseau, 25 km south of Paris, France.


TOPIC

Context

The LVIC has been developing the multilingual linguistic analyzer LIMA since
2002. It is now a very modular tool able to analyse texts (tokenization,
morphological, syntactic and semantic parsing) in languages as diverse as
English, French, Arabic, Chinese, Spanish, German and Italian. LIMA currently
represents more than 100,000 lines of code (excluding linguistic resources).
LIMA is already used in several industrial products, but the CEA LIST has
decided to distribute it under a Free/Libre Open Source Software (FLOSS)
licence to facilitate its use and dissemination and to get faster feedback
from a broader community of users.
LIMA is written in standard C++. It makes extensive use of the Boost and Qt
libraries and is cross-platform (GNU/Linux and MS Windows so far). Its
architecture makes it easy to extend and to integrate into applications.

Objectives

This release, which is part of the ASFALDA project (funded by the French
National Research Agency), requires improving several aspects of the software
before it is distributed:
- API documentation;
- User documentation;
- Unit tests;
- Functional tests.

LIMA depends on linguistic resources to operate (dictionaries, parsing rules,
...). Although the laboratory owns some of them, others come from commercial
sources and may not be distributed freely. Another goal for the intern will
thus be to produce alternative resources from freely available linguistic
resources.

The intern will work on these topics in order to make LIMA available on a
software forge by the end of the internship. The selected candidate will have
a good level in C++, an understanding of the issues involved in releasing
software (testing, documentation, ...) and, ideally, experience contributing
to a free software project.


Internship duration: 4 to 6 months

Required level of study: Master 1 or 2.

Contact:
Gaël de Chalendar
Mail: [email protected]
Phone: +33 6 76 36 70 31
Skype: kleagg
XMPP: [email protected]

