FYI!

-------------------------

COMPUTER SCIENCE
COLLOQUIUM

Kernel Methods for Word Sense Disambiguation and Abbreviation Expansion in 
the Medical Domain

MAHESH JOSHI
Computer Science Graduate Student

Friday, June 23, 2006
10:00 a.m.

HELLER HALL 306

ABSTRACT

Word Sense Disambiguation (WSD) is the problem of automatically deciding 
the correct meaning of an ambiguous word based on the surrounding context 
in which it appears. The automatic expansion of abbreviations having 
multiple possible expansions can be viewed as a special form of WSD with 
the multiple expansions acting as "senses" of the ambiguous abbreviation. 
Both are significant problems even in a specialized domain such as medical 
text, which includes abstracts of articles in scholarly medical journals 
and clinical notes taken by physicians.

Most popular approaches to WSD involve supervised machine learning methods, 
which require a set of manually annotated examples of disambiguated words 
or abbreviations to learn patterns that help in disambiguating future 
unseen instances. However, manual annotation limits the amount of labeled 
data that can be made available to supervised machine learning algorithms 
such as Support Vector Machines (SVMs). Kernel methods 
for SVMs provide an elegant framework to incorporate knowledge from 
unlabeled data into the SVM learners. This thesis explores the application 
of kernel methods to two datasets from the medical domain, one containing 
ambiguous words and the other containing ambiguous abbreviations. We have 
developed two classes of semantic kernels for SVMs, Latent Semantic 
Analysis based Kernels and Word Association Kernels, that are learned from 
unlabeled text containing the ambiguous words or abbreviations using 
unsupervised methods. We have found that our semantic kernels improve the 
accuracy of SVMs on the task of WSD in the medical domain. In particular, 
our second class of kernels (Word Association Kernels) performs better, and 
the improvements are significant when the sense distribution for an 
ambiguous word is balanced.
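
For readers unfamiliar with the idea of learning a kernel from unlabeled 
text, here is a minimal sketch in Python. It is NOT the kernel from the 
thesis: it assumes a simple co-occurrence-based similarity, and the function 
names, the toy sentences, and the whitespace tokenization are all 
hypothetical choices made for illustration. Each word gets a vector of 
co-occurrence counts from unlabeled contexts, a context is the sum of its 
words' vectors, and the kernel is the dot product of two context vectors.

```python
from collections import Counter, defaultdict


def cooccurrence_vectors(unlabeled_contexts):
    """Map each word to counts of the words it shares a context with.

    Assumption: contexts are plain strings split on whitespace.
    """
    vectors = defaultdict(Counter)
    for context in unlabeled_contexts:
        words = context.lower().split()
        for i, w in enumerate(words):
            for j, v in enumerate(words):
                if i != j:
                    vectors[w][v] += 1
    return vectors


def context_vector(context, vectors):
    """Represent a context as the sum of its words' co-occurrence vectors."""
    total = Counter()
    for w in context.lower().split():
        # Words never seen in the unlabeled text contribute nothing.
        total.update(vectors.get(w, Counter()))
    return total


def association_kernel(a, b, vectors):
    """Dot product of two context vectors (an illustrative kernel only)."""
    va = context_vector(a, vectors)
    vb = context_vector(b, vectors)
    return sum(count * vb[word] for word, count in va.items())


# Toy unlabeled text (hypothetical, for illustration only).
unlabeled = [
    "the patient had a cold and a fever",
    "cold air and ice in winter",
    "fever and cough in the patient",
]
vectors = cooccurrence_vectors(unlabeled)

# "fever" and "cough" share no surface words, yet the kernel is non-zero
# because they occur with similar neighbours in the unlabeled text.
print(association_kernel("fever", "cough", vectors))
```

Because the kernel is a dot product of explicit feature vectors, a Gram 
matrix built from it is symmetric and positive semi-definite, so it could 
be handed to an SVM implementation that accepts precomputed kernels.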


