Mail from ILUG-BOM list (Non-Digest Mode)
_______________________________________________

Dear Colleague,

I thought you might find this event interesting.
Please forward it to other colleagues who may
also want to attend.

Regards,
- Durgesh Rao
  Research Scientist, NCST.

                  NCST invites you to an Invited Talk on

      "Corpus Linguistics and Information Extraction"
           

Speaker: Dr. Tony McEnery, Lancaster University, UK

Date and Time: Thursday, August 10, 2000, 10.00 a.m.

Venue: Lecture Theatre, NCST Juhu

Abstract:
---------

Information Extraction (IE) deals with techniques to extract
structured information from unstructured text in natural language.
Sample applications include analysis of financial news to track
strategic events such as mergers, tie-ups and acquisitions in a
specific sector; extraction of relevant personal profiles from
resumes or matrimonial ads; and extraction of symptoms and
diagnoses from medical reports.

The urgency and potential of this technology can be gauged 
from the fact that over 90 per cent of the information on 
the World Wide Web occurs not in structured data format, 
but as unstructured text in natural languages like English, 
and increasingly, many other languages.

Corpus Linguistics combines ideas from Linguistics and
Statistics to devise tools and techniques to systematically collect,
annotate and analyze corpora, or large collections of text.
It is thus an important tool for Language Engineering.

This talk seeks to explore how Information Extraction and Corpus Linguistics 
can each benefit from the techniques of the other. The goals of 
information extraction and of corpus linguistics have thus far had 
little in common. However, both are concerned with processing 
large bodies of text. It is timely to explore how one can 
contribute to the other. This talk will offer just such an exploration,
in the light of the recent workshop on this topic at the 
LREC 2000 conference in Athens, Greece.

About the Speaker:
------------------

Dr. Tony McEnery is Reader in Multilingual Corpus Linguistics in the
Department of Linguistics, Lancaster University. He is a member of a
team that has been working on corpus linguistics and corpus-based
approaches to language engineering for over twenty years. He has worked
on a wide range of European and Asian languages, and is currently
engaged in work designed to create an integrated environment for the
construction and exploitation of corpora of South Asian languages. 

Dr. McEnery is the author of "Corpus Linguistics", one of the
standard works on this topic, and has published widely on corpus-based
approaches to natural language processing. He is currently on a
visit to India to explore avenues of technical collaboration for
building platforms for multilingual language engineering, with
a focus on Indian and other South Asian languages.

For more info: http://www.ling.lancs.ac.uk/staff/tony/tony.htm

From:
Durgesh Rao, Research Scientist, KBCS,
National Centre for Software Technology,
Gulmohar Road 9, Juhu, Mumbai 400049.
Phone: (022)6201606  Fax: (022)6210139  
Email: [EMAIL PROTECTED]

----------------------------------------------------------------------
Durgesh D Rao (DDR)                      Email: [EMAIL PROTECTED]
NCST, Gulmohar Rd 9, Juhu, Mumbai 400049 Ph:6201606x372(o), 6450092(r)
----------------------------------------------------------------------
