===============================================================
                COLING 2004 WORKSHOP
   Computational Approaches to Arabic Script-based Languages

                Saturday, August 28, 2004
                  University of Geneva
                   Geneva, Switzerland
===============================================================

CONFERENCE WEBSITE: http://www.issco.unige.ch/coling2004/

WORKSHOP DESCRIPTION

  Recently, there has been a surge of interest in the study of the languages
of the Middle East, especially Arabic, Persian (Farsi), Pashto and Urdu.
This sudden and urgent interest is manifested by the availability of funding
for rapid development of practical systems for processing large volumes of
data in these languages. Computational applications for proper name
identification, entity recognition, categorization, information retrieval,
summarization, machine translation and other implementations are currently
in high demand. This comes at a time when advances in formal and
computational linguistics over the last fifty years are being consolidated,
while work on machine learning and statistical methods has been showing
great promise.

  Although there exists a considerable body of work in computational
linguistics specifically targeted to these Middle Eastern languages, much of
the research and development has been the result of initiatives by
individual research establishments or industry firms. Furthermore, the use
of the Arabic script gives rise to certain issues that are common to all
these languages, even though they belong to distinct language families. In
particular, these languages share properties such as the absence of
capitalization, right-to-left writing direction, lack of clear word
boundaries, complex word structure, a high degree of ambiguity due to the
non-representation of short vowels in the writing system, and related
encoding issues.

  The goal of this workshop is to provide a forum for those involved in the
development of NLP systems in Arabic script languages to exchange ideas,
approaches and implementations of computational systems; to discuss the
common challenges faced by all practitioners; and to assess the state of the
art in the field. In addition, one of the aims of the workshop is to
identify promising areas for future collaborative research in the
development of NLP systems for Arabic script languages. Solutions that are
designed to solve the specific problems of these languages could very well
have wider applications and relevance to the rest of the NLP community. 

WORKSHOP TOPICS

Authors of papers in any area of NLP in Arabic script-based languages are
encouraged to submit. We welcome submissions dealing with language-specific
issues, as well as discussions of challenges imposed by the use of the
Arabic script. Papers may address, but are not limited to, any of the
following topics:

* Morphological analysis
* Syntactic ambiguity resolution
* Relevance of shallow parsing
* Machine translation from and to Arabic script languages
* Sense disambiguation
* Homograph resolution
* Semantic analysis
* Entity recognition
* Information retrieval
* Classification of documents
* Text mining
* Summarization
* Statistical approaches
* Speech recognition and generation
* Lexical databases
* Knowledge and domain representation
* Spelling and grammar checking tools

SUBMISSION REQUIREMENTS

Papers should be original, previously unpublished work and should not
identify the author(s). They should emphasize completed work rather than
intended work. Papers that are being submitted to other conferences must
reflect this fact on the title page. 

Submissions should be no longer than 8 pages (including figures and
references). Email submissions (PS or PDF) are preferred and should be sent
to both [EMAIL PROTECTED] and [EMAIL PROTECTED] by midnight on the due
date. Submissions should be in English. The papers should be attached to an
email giving contact information for the author(s) and the paper's title.
Formatting requirements for the final version of accepted papers will be
posted as soon as they become available.

Hardcopy submissions should be sent to:
Ali Farghaly
SYSTRAN Software, Inc.
9333 Genesee Ave, Pl 1
San Diego, CA 92121
USA

IMPORTANT DATES

Submissions due: March 25th, 2004
Notification date: April 25th, 2004
Deadline for camera-ready copy: May 25th, 2004

ORGANIZING COMMITTEE

This workshop is organized by
Ali Farghaly (SYSTRAN Software, Inc.)
Karine Megerdoomian (Inxight Software and University of California San
Diego)

The call for papers as well as future information on the workshop can be
found at http://members.cox.net/karinem/COLING2004

PROGRAM COMMITTEE

Jan W. Amtrup, Bowne Global Solutions
Tim Buckwalter, Linguistic Data Consortium
Violetta Cavalli-Sforza, Carnegie Mellon University
Joseph Dichy, Lyon University
Andrew Freeman, University of Washington
Nizar Habash, University of Maryland, College Park
Masayo Iida, Inxight Software, Inc.
Simin Karimi, University of Arizona
Martin Kay, Stanford University
Kevin Knight, USC/Information Sciences Institute
Farhad Oroumchian, University of Wollongong in Dubai
Ahmed Rafea, The American University in Cairo
Jean Senellart, SYSTRAN Software
Rémi Zajac, SYSTRAN Software





