Hello,

I have not posted to kde-devel in a long time, and I hope this message will be well received even though it is not directly related to KDE. Some of you may remember that a few years ago I mentioned the possible release, under an open source licence, of the natural language analyzer I work on at my job, which could be useful to KDE, for example in Nepomuk.
Well, here we are at last: we are offering a (paid) internship to help us clean up the code and prepare it before putting it online. You will find the internship proposal below.

Regards,
Gaël

Internship: Release of a multilingual linguistic analysis software under a Free/Libre Open Source Software Licence
(Compulsory internship with internship agreement, Master 1 or 2 level)

CEA LIST
Vision and Content Engineering Laboratory

The internship will take place on the premises of LVIC at Nano-INNOV, located in Palaiseau, 25 km south of Paris, France.

TOPIC

Context

Since 2002, LVIC has been developing the multilingual linguistic analyzer LIMA. It is now a very modular tool able to analyse texts (tokenization, morphological, syntactic and semantic parsing) in languages as diverse as English, French, Arabic, Chinese, Spanish, German and Italian. LIMA currently represents more than 100,000 lines of code (excluding linguistic resources). LIMA is already used in several industrial products, but CEA LIST has decided to distribute it under a Free/Libre Open Source Software (FLOSS) Licence to facilitate its use and dissemination, and to get faster feedback from a broader community of users.

LIMA is written in standard C++. It makes extensive use of the boost and Qt libraries and is cross-platform (GNU/Linux and MS Windows so far). Its architecture makes it easy to extend and to integrate into applications.

Objectives

This release, which is part of the ASFALDA project (funded by the French National Research Agency), requires further improvements to the software before its distribution, in several areas:
- API documentation;
- User documentation;
- Unit tests;
- Functional tests.

LIMA depends on linguistic resources to operate (dictionaries, parsing rules, ...). Although the laboratory owns some of them, others come from commercial resources and may not be distributed freely. Another goal for the intern will thus be to produce alternative resources from freely available linguistic resources.
The intern will work on these topics so that LIMA can be made available on a software forge by the end of the internship.

The selected candidate will have a good level in C++ and an understanding of issues related to software release (testing, documentation, ...), and will ideally have participated in a free software project.

Internship duration: 4 to 6 months
Education required: Master 1 or 2.

Contact: Gaël de Chalendar
Mail: [email protected]
Phone: +33 6 76 36 70 31
Skype: kleagg
XMPP: [email protected]
