Readers of the MT-list may be interested in the following call:

---
Call for contributions

NIPS 2006 Workshop
MACHINE LEARNING FOR MULTILINGUAL INFORMATION ACCESS
====================================================
http://ilt.iit.nrc.ca/MLIA/

Description:
------------
In many different settings, accessing information available in
different languages is a challenge. In Europe, the wide variety of
languages is clearly a bottleneck for efficient circulation and access
to information. More than half of EU citizens cannot hold a
conversation in a language other than their mother tongue. Even in an
officially bilingual country like Canada, less than one in five are
considered to have a good enough command of both official languages
(2001 census data).

The traditional paradigm for addressing this issue is to perform human
translation on a massive scale, and rely on monolingual information
access technology. Although this model has worked reasonably well in
the past, the rapid increase in the amount of information produced
(and, in Europe, in the number of languages covered) raises questions
as to its sustainability.

Machine Learning has the potential to help develop and deploy
technology that provides:
1. access to information across different languages,
2. usable translation from one language to another.

We are interested in Machine Learning techniques addressing for
example the following problems:
* Word alignment
* Machine translation
* Multilingual lexicon and terminology extraction
* Cross-lingual information retrieval
* Cross-lingual categorisation

Goals of the workshop:
----------------------
Multilingual applications are also emerging as a promising application
for some Machine Learning techniques, for example the use of Kernel
CCA for Cross-Language applications, or large-margin approaches to
word alignment. This new trend converges with a well-established
interest of the Natural Language Processing community for learning
approaches.

The purpose of this workshop is to provide a forum for discussion of
current developments at the intersection between multilingual
processing and machine learning. This includes developing new
techniques to address various multilingual information access problems
(e.g. translation), but also scaling up existing techniques to the
available NLP data, developing tools for cross-language information
retrieval, etc.

We will promote discussions of some inter-related key issues in
applying Machine Learning to Multilingual problems:
* SCALING UP:
  - Applying ML to 100 million words corpora (e.g. SMT)
  - Deploying ML solutions on new language pairs
* SCARCE RESOURCES:
  - Languages or domains with limited bilingual corpora
  - Bootstrapping limited resources
* EVALUATION:
  - Design of better performance measures
  - Optimisation of application-specific measures
  - Learning human evaluation
* PRIOR LINGUISTIC KNOWLEDGE:
  - Modelling and using linguistic knowledge in ML
  - The continuum between all-data (SMT) and all prior knowledge
    (handcrafted rules)

Submission instructions:
------------------------
Researchers interested in presenting their work at the workshop should
send an email to:
  mlia (at) nrc-cnrc.gc.ca
(preferably plain text) with the following information:
- Title
- Author(s)
- Abstract (around 1 page)

Schedule:
Submission deadline: 29 October 2006
Notification: 6 November 2006
Workshop date: 8 or 9 December 2006

Co-organisers:
--------------
Cyril Goutte, National Research Council Canada
Nicola Cancedda, Xerox Research Centre Europe
Marc Dymetman, Xerox Research Centre Europe
George Foster, National Research Council Canada

Workshop format:
----------------
We intend to leave a good part of the workshop to panel discussions
that will address relevant topics in multilingual information access
(MIA), as well as invited talks presenting some important MIA problems
and associated challenges for Machine Learning.

For each half day, we will start with either a keynote or a short
tutorial, continue with a few shorter technical presentations, and end
with a panel discussion (topics to be decided depending on the
confirmed list of speakers).

Invited speakers:
- Dan Melamed (Courant Institute, NYU)
- John Shawe-Taylor (ECS, U. of Southampton, UK), tbc
- Ralf Steinberger (JRC, Ispra, Italy)
- Wray Buntine (HIIT, Helsinki, Finland), tbc

Related work:
-------------
Past NIPS workshops have addressed related topics such as learning
with structured data, or the use of Machine Learning for Natural
Language Processing. There is also some ongoing interest within the
European network of excellence Pascal, as exemplified by the recent
workshop on intelligent information access. However none of these
specifically target multilingual aspects. We believe there is
sufficient interest and genuine need on this particular aspect to
justify a specific focus on multilingual information access.

The newly started European project SMART (Statistical Multilingual
Analysis for Retrieval and Translation) is specifically targeting
advanced machine learning techniques for multilingual applications.
