[Mt-list] 1st CfP: LREC 2010 Workshop on Language Resources (LRs) and Human Language Technologies (HLT) for Semitic Languages - Status, Updates and Prospects

info Wed, 27 Jan 2010 01:18:17 -0800

[apologies for cross-postings]


CALL FOR PAPERS

*Workshop on Language Resources (LRs) and Human Language Technologies (HLT) for Semitic Languages - Status, Updates, and Prospects*

To be held in conjunction with the 7th International Language Resources and Evaluation Conference (LREC 2010)


*17 May 2010, Mediterranean Conference Centre, Valletta, Malta*

*Deadline for submission: 26 February 2010*

Description

The Semitic family includes languages and dialects spoken by a large number of native speakers (around 300 million). Prominent members of this family are Arabic (and its varieties), Hebrew, Amharic, Tigrinya, Aramaic, Maltese and Syriac. Their shared ancestry is apparent through pervasive cognate sharing, a rich and productive pattern-based morphology, and similar syntactic constructions. In addition, there are several languages which are used in the same geographic area such as Amazigh or Coptic, which, while not Semitic, have common features with Semitic languages, such as borrowed vocabulary.

The recent surge in computational work for processing Semitic languages, particularly Modern Standard Arabic (MSA) and Modern Hebrew (MH), has brought modest improvements in terms of actual empirical results for various language processing components (e.g., morphological analyzers, parsers, named entity recognizers, audio transcriptions, etc.). Apparently, reusing existing approaches developed for English or French for processing Semitic language text/speech, e.g., Arabic parsing, is not as straightforward as initially thought. Apart from the limited availability of suitable language resources, there is increasing evidence that Semitic languages demand modeling approaches and annotations that deviate from those found suitable for English/French. Issues such as the pattern-based morphology, the frequently head-initial syntactic structure, the importance of the interface between morphology and syntax, and the difference between spoken and written forms (especially in Colloquial Arabic(s)) exemplify the kind of challenges that may arise when processing Semitic languages. For language technologies, such as information retrieval and machine translation, these challenges are compounded by sparse data and often result in poorer performance than for other languages.

This Workshop intends to follow on topics of paramount importance for Semitic-language NLP that were discussed at previous events (LREC, MEDAR/NEMLAR Conferences, the workshops of the ACL Special Interest Group for Semitic languages, etc.) and which are worth revisiting.

The workshop will bring together people who are actively involved in Semitic language processing in a mono- or cross/multilingual context, and give them an opportunity to update the community through reports on completed or ongoing work as well as on the availability of LRs, evaluation protocols and campaigns, products and core technologies (in particular open source ones). We also invite authors to address other languages spoken in the Semitic language area (languages such as Amazigh, Coptic, etc.). This should enable participants to develop a common view on where we stand and to foster the discussion of the future of this research area. Particular attention will be paid to activities involving technologies such as Machine Translation and Cross-Lingual Information Retrieval/Extraction, Summarization, etc. Evaluation methodologies and resources for evaluation of HLT will also be a main focus.

We expect to elaborate on the HLT state of the art, identify problems of common interest, and debate a potential roadmap for the Semitic languages. Issues related to sharing of resources, tools, standards, sharing and dissemination of information and expertise, adoption of current best practices, setting up joint projects and technology transfer mechanisms will be an important part of the workshop.


Topics of Interest

This full-day workshop is not intended to be a mini-conference, but a real workshop aiming at concrete results that should clarify the situation of Semitic languages with respect to Language Resources and Evaluation. We expect to launch at least two evaluation campaigns: comparative evaluations of morphology taggers and of named entity recognizers.


Among the many issues to be addressed, below follow a few suggestions:

- Issues in the design, the acquisition, creation, management, access, distribution, use of Language Resources, in particular in a bilingual/multilingual setting (Standard Arabic, Hebrew, Colloquial Arabic, Amazigh, Coptic, Maltese, etc.)

- Impact on LR collections/processing and NLP of the crucial issues related to "code switching" between different dialects and languages

- Specific issues related to the above-mentioned languages such as the role of morphology, named entities, corpus alignment, etc.

- Multilinguality issues including relationship between Colloquial and Standard Arabic

- Exploitation of LR in different types of applications

- Industrial LR requirements and community's response

- Benchmarking of systems and products; resources for benchmarking and evaluation for written and spoken language processing

- Focus on some key technologies such as MT (all approaches e.g. Statistical, Example-Based, etc.), Information Retrieval, Speech Recognition, Spoken Documents Retrieval, CLIR, Question-Answering, Summarization, etc.

- Local, regional, and international activities and projects and needs, possibilities, forms, initiatives of/for regional and international cooperation.

We invite submissions on computational approaches to processing text/speech in all Semitic and Semitic-area languages. The call is open for all kinds of computational work, e.g., work on computational linguistic processing components (e.g., analyzers, taggers, parsers), on state-of-the-art NLP applications and systems, on leveraging resource and tool creation for the Semitic language family, and on using computational tools to gain new linguistic insight. We especially welcome submissions on work that crosses individual language boundaries, heightens awareness amongst Semitic-language researchers of shared challenges and breakthroughs, and highlights issues and solutions common to any subset of the Semitic language family.

Workshop general chair:
Khalid Choukri, [email protected], ELRA/ELDA, Paris, France

Workshop co-chairs:
Owen Rambow, Columbia University, New York, USA
Bente Maegaard, University of Copenhagen, Denmark
Ibrahim A. Al-Kharashi, Computer and Electronics Research Institute, King Abdulaziz City for Science and Technology, Saudi Arabia



Organizing Committee information

The Organizing, Program, and the Scientific Committees will be listed on the web pages.


Important Dates

Deadline for abstract submissions:    26 February 2010
Notification of acceptance:        15 March 2010
Final version of accepted paper:    11 April 2010
Workshop full-day:            17 May 2010

Submission Details

Submissions should comply with LREC standards (including the LREC Map initiative) and must be in English. Abstracts for workshop contributions should not exceed four A4 pages (excluding references). An additional title page should state: the title; author(s); affiliation(s); and contact author's e-mail address, as well as postal address, telephone and fax numbers.

Submission will use the LREC START facility:
https://www.softconf.com/lrec2010/SemiticLanguages2010/

Expected deadline is 26 February 2010.

Submitted papers will be judged based on relevance to the workshop aims, as well as the novelty of the idea, technical quality, clarity of presentation, and expected impact on future research within the area of focus.

Registration to LREC'2010 will be required for participation, so potential participants are invited to refer to the main conference website for all details not covered in the present call (http://www.lrec-conf.org/lrec2010/).

Formatting instructions for the final full version of papers will be sent to authors after notification of acceptance and will be identical to LREC main conference instructions.

When submitting a paper through the START page, authors will be kindly asked to provide relevant information about the resources that have been used for the work described in their paper or that are the outcome of their research. For further information on this new initiative, please refer to
http://www.lrec-conf.org/lrec2010/?LREC2010-Map-of-Language-Resources
