Call for papers: Second Workshop on Computation and Written Language (CAWL
2024)

CAWL 2024 will be held in conjunction with LREC-COLING 2024 on May 21 in
Torino, Italy. The workshop will feature an invited talk by Nizar Habash
(NYU Abu Dhabi), and has a special theme for workshop submissions: Writing
Systems of Africa. Annual CAWL workshops are organized under the guidance
of the newly formed ACL Special Interest Group on Writing Systems and
Written Language (SIGWrit). We welcome submissions of scientific papers to
be presented at the workshop and archived in the ACL Anthology. Please see
explicit submission guidelines below, including details on topics of
interest and the special workshop theme, and see the workshop webpage
https://sigwrit.org/workshops/cawl2024/ for additional relevant information.

Most work in NLP focuses on language in its canonical written form. This
has often led researchers to ignore the differences between written and
spoken language or, worse, to conflate the two. Instances of conflation are
statements like “Chinese is a logographic language” or “Persian is a
right-to-left language”, variants of which can be found frequently in the
ACL Anthology. These statements confuse properties of the language with
properties of its writing system. Ignoring differences between written and
spoken language leads, among other things, to conflating different words
that are spelled the same (e.g., English bass), or to treating a single word
with multiple spellings as several different words (e.g., Japanese umai
‘tasty’, which can be written 旨い, うまい, ウマい, or 美味い).

Furthermore, methods for dealing with written language issues (e.g.,
various kinds of normalization or conversion) or for recognizing text input
(e.g., OCR and handwriting recognition, or text entry methods) are often
regarded as precursors to NLP rather than as fundamental parts of the
enterprise, despite the fact that most NLP methods rely centrally on
representations derived from text rather than (spoken) language. This
general lack of consideration of writing has led much of the research on
such topics to appear largely outside of ACL venues, in conferences or
journals of neighboring fields such as speech technology (e.g., text
normalization) or human-computer interaction (e.g., text entry).

This workshop will bring together researchers who are interested in the
relationship between written and spoken language, the properties of written
language, the ways in which writing systems encode language, and
applications specifically focused on characteristics of writing systems.
Topics of interest include but are not limited to:

   - Text entry
   - Text tokenization
   - Disambiguation of abbreviations and homographs
   - Grapheme-to-phoneme conversion, transliteration, and diacritization
   - Text normalization for speech and for processing "informal" genres of
   text
   - Computational study of literary devices involving writing systems,
   such as eye dialect
   - Information-theoretic and machine-learning approaches to decipherment
   - Methods for specialized text genres, e.g., clinical notes
   - Optical character (incl. handwriting) recognition and historical
   document processing
   - Orthographic representation for unwritten languages
   - Spelling error detection and correction
   - Script normalization and encoding
   - Writing system typology and its relevance to speech and language
   processing

We invite submissions on the relationship between written and spoken
language, the properties of written language, the ways in which writing
systems encode language, and applications specifically focused on
characteristics of writing systems.

Additionally, we particularly encourage, and will prioritize, papers on the
special theme of the workshop: Writing Systems of Africa. African languages
make use of a wide variety of writing systems, from those based on the
Perso-Arabic or Latin scripts throughout Africa, the Ge'ez script in the
Horn of Africa, or the Tifinagh script for Berber languages in North
Africa, to recently invented writing systems such as the Adlam alphabet
created for Fula. Issues arising from the adaptation of scripts to new
languages, such as Ajami or orthographies using the Latin script, would be
of interest. For example, the primary language of instruction in the
schools of Mali is French, so that speakers of Bambara, despite not
generally being taught to read that language in the schools, will often
make use of either the Latin script that they learned via French in school
or the Perso-Arabic (Ajami) script from religious instruction to write
their language. Bambara is also sometimes written with the modern N'Ko
script. Given this diversity of options, Bambara written language can be
extremely varied, presenting major challenges to corpus building and
automatic language processing methods.

Important dates:

Paper submission deadline: February 22, 2024 (anywhere in the world)
Notification of acceptance: March 25, 2024
Camera-ready paper due: April 5, 2024
Workshop date: May 21, 2024

Submission Guidelines

Please submit short (4-page) or long (8-page) papers in PDF format to
https://softconf.com/lrec-coling2024/cawl2024/. Both short and long paper
submissions will be reviewed in the same process. Authors should follow the
formatting guidelines of LREC-COLING 2024, available in the authors kit
(https://lrec-coling-2024.org/authors-kit/), and we will follow the paper
submission and reviewing policies detailed in the LREC-COLING 2024 call for
papers (https://lrec-coling-2024.org/2nd-call-for-papers/). Note that, as
with the main conference, reviewing is double-anonymous, i.e., reviewers
will not know author identity and vice versa, hence no author information
should be included in the papers; self-reference that identifies the
authors should be avoided or anonymised. Accepted papers will appear in the
workshop proceedings in the ACL Anthology.

For questions about the submission guidelines, please contact the workshop
organizers at [email protected].

Organizers:

   - Kyle Gorman <https://wellformedness.com/>, Graduate Center, City
   University of New York & Google, USA
   - Emily Prud’hommeaux <http://cs.bc.edu/~prudhome/>, Boston College, USA
   - Brian Roark <https://lanzaroark.org/brian-roark/>, Google, USA
   - Richard Sproat <https://rws.xoba.com/>, Google DeepMind, Japan

Program Committee:

   - David Ifeoluwa Adelani <https://dadelani.github.io/>, University
   College London, UK
   - Manex Agirrezabal <https://manexagirrezabal.github.io/>, University of
   Copenhagen, Denmark
   - Sina Ahmadi <https://sinaahmadi.github.io/>, George Mason University,
   USA
   - Cecilia Alm <https://www.rit.edu/directory/coagla-cecilia-alm>,
   Rochester Institute of Technology, USA
   - Mark Aronoff <https://linguistics.stonybrook.edu/faculty/mark.aronoff/>,
   Stony Brook University, USA
   - Steven Bedrick
   <https://www.ohsu.edu/school-of-medicine/csee/steven-bedrick>, Oregon
   Health & Science University, USA
   - Taylor Berg-Kirkpatrick <https://cseweb.ucsd.edu/~tberg/>, UC San
   Diego, USA
   - Amalia Gnanadesikan
   <https://scholar.google.com/citations?user=HkNhAoAAAAAJ&hl=en>,
   University of Maryland, USA
   - Christian Gold
   <https://www.fernuni-hagen.de/english/research/clusters/catalpa/about-catalpa/members/christian.gold.shtml>,
   CATALPA, FernUniversität in Hagen, Germany
   - Alexander Gutkin <https://research.google/people/AlexanderGutkin/>,
   Google, UK
   - Nizar Habash
   <https://nyuad.nyu.edu/en/academics/divisions/science/faculty/nizar-habash.html>,
   NYU Abu Dhabi, United Arab Emirates
   - Yannis Haralambous
   <https://www.imt-atlantique.fr/en/person/yannis-haralambous>, IMT
   Atlantique & CNRS Lab-STICC, France
   - Cassandra Jacobs <https://www.acsu.buffalo.edu/~cxjacobs/>, University
   at Buffalo, USA
   - Martin Jansche
   <https://scholar.google.com/citations?user=z8yPdQQAAAAJ&hl=en>, Amazon,
   UK
   - Kathryn Kelley
   <https://www.unibo.it/sitoweb/kathrynerin.kelley/research>, Università
   di Bologna, Italy
   - George Kiraz <https://www.ias.edu/scholars/george-kiraz>, Princeton
   University, USA
   - Christo Kirov <https://ckirov.github.io/>, Google, USA
   - Jordan Kodner <https://jkodner05.github.io/>, Stony Brook University,
   USA
   - Anoop Kunchukuttan <http://anoopk.in/>, Microsoft, India
   - Yang Li <https://npuliyang.github.io/>, Northwestern Polytechnical
   University, China
   - Constantine Lignos <https://lignos.org/>, Brandeis University, USA
   - Zoey Liu <https://zoeyliu18.github.io/>, University of Florida, USA
   - Jalal Maleki <https://liu.se/en/employee/jalma87>, Linköping
   University, Sweden
   - M. Willis Monroe <https://www.willismonroe.com/>, University of New
   Brunswick, Canada
   - Gerald Penn <http://www.cs.toronto.edu/~gpenn/>, University of
   Toronto, Canada
   - Yuval Pinter <https://www.cs.bgu.ac.il/~pintery/>, Ben-Gurion
   University of the Negev, Israel
   - William Poser <https://billposer.org/>, independent scholar, Canada
   - Shruti Rijhwani <https://shrutirij.github.io/>, Google, USA
   - Maria Ryskina <https://ryskina.github.io/>, MIT, USA
   - Anoop Sarkar
   <https://www.sfu.ca/computing/people/faculty/anoopsarkar.html>, Simon
   Fraser University, Canada
   - Lane Schwartz <http://dowobeha.github.io/>, University of Alaska,
   Fairbanks, USA
   - Djamé Seddah <http://pauillac.inria.fr/~seddah/>, Sorbonne University
   & Inria, France
   - Shuming Shi
   <https://scholar.google.com/citations?user=Lg31AKMAAAAJ&hl=en>, Tencent,
   China
   - Claytone Sikasote <https://csikasote.github.io/>, University of Zambia
   (UNZA), Zambia
   - Fabio Tamburini <https://corpora.ficlit.unibo.it/People/Tamburini/>,
   University of Bologna, Italy
   - Kumiko Tanaka-Ishii <https://www.cl.rcast.u-tokyo.ac.jp/Top.html>,
   University of Tokyo, Japan
   - Lawrence Wolf-Sonkin
   <https://aclanthology.org/people/l/lawrence-wolf-sonkin/>, Google, USA
   - Martha Yifiru Tachbelie
   <https://scholar.google.com/citations?user=9N37SgoAAAAJ>, Addis Ababa
   University, Ethiopia