Revision: 18386
          http://sourceforge.net/p/gate/code/18386
Author:   markagreenwood
Date:     2014-10-14 15:40:19 +0000 (Tue, 14 Oct 2014)
Log Message:
-----------
some documentation for the new linguistic simplifier plugin

Modified Paths:
--------------
    userguide/trunk/misc-creole.tex
    userguide/trunk/recent-changes.tex

Modified: userguide/trunk/misc-creole.tex
===================================================================
--- userguide/trunk/misc-creole.tex 2014-10-14 15:02:34 UTC (rev 18385)
+++ userguide/trunk/misc-creole.tex 2014-10-14 15:40:19 UTC (rev 18386)
@@ -3585,3 +3585,44 @@
 The `Log4J Level: ALL' tool adds a new entry to the Tools menu which switches
 the Log4J level of all loggers and appenders to ALL so that you can quickly see
 all logging activity in both the GUI and the log files.
+
+%
+%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%
+\sect[sec:misc-creole:linguistic-simplifier]{Linguistic Simplifier}
+%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%
+%
+This plugin provides a linguistically based document simplifier and is based
+upon work supported by the EU \htlink{http://www.forgetit-project.eu/}{ForgetIT}
+project.
+
+The idea behind this plugin is to simplify sentences by removing words or
+phrases which are not required to convey the main point of the sentence.
+This can be viewed as a first step in document summarization and also
+mirrors the way people remember conversations: the details and not the
+exact words used. The approach presented here accomplishes this task
+using a number of linguistically motivated rules in conjunction with WordNet.
+Example sentences which can be simplified include:
+
+\begin{itemize}
+\item For some reason people will actually buy a pink coloured car.
+\item The tub of ice-cream was unusually large in size.
+\item There was a big explosion, which shook the windows, and people ran into the street.
+\item The function of this department is the collection of accounts.
+\end{itemize}
+
+For best results the PR should be run after running the following pre-processing
+PRs: tokenizer, sentence splitter, POS tagger, morphological analyser, and the
+noun chunker.
The output of the PR is stored as \verb|Redundant| annotations (in the
+annotation set specified by the \verb|annotationSetName| runtime parameter). To produce
+a simplified document the text under each \verb|Redundant| annotation should be removed,
+and replaced, if present, by the annotation's \verb|replacement| feature. Two document
+exporter plugins are also provided to output simplified documents as either plain text
+or HTML.
+
+The plugin contains a demo application (available from the Ready-Made menu if
+the plugin has been loaded), which allows the techniques to be demonstrated.
+The performance of the approach can be improved by passing a WordNet LR
+instance to the PR as a runtime parameter. This is not provided in the demo
+application, as it is not possible to provide this in an easily portable way.
+See Section \ref{sec:misc-creole:wn} for details of how to load WordNet into
+GATE.

Modified: userguide/trunk/recent-changes.tex
===================================================================
--- userguide/trunk/recent-changes.tex 2014-10-14 15:02:34 UTC (rev 18385)
+++ userguide/trunk/recent-changes.tex 2014-10-14 15:40:19 UTC (rev 18386)
@@ -24,6 +24,7 @@
 \rcSubsect{October 2014}

 A new plugin for simplifying sentences using linguistic rules and information.
+See Section \ref{sec:misc-creole:linguistic-simplifier} for more details.

 \rcSubsect{September 2014}
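
The Linguistic Simplifier section above describes how a simplified document is produced: the text under each \verb|Redundant| annotation is removed and, where the annotation carries a \verb|replacement| feature, that value is put in its place. The following is a minimal sketch of that step in plain Java. It does not use the GATE API: the \verb|Span| class, the \verb|simplify| and \verb|spanOver| methods, and the spans in \verb|main| are all hypothetical and hand-picked for illustration, and are not the output of the plugin. It also assumes the spans do not overlap.

```java
import java.util.ArrayList;
import java.util.Comparator;
import java.util.List;

public class RedundantSpanDemo {

    /** Hypothetical stand-in for a Redundant annotation (not a GATE class). */
    static final class Span {
        final int start;           // offset of the first character covered
        final int end;             // offset just past the last character covered
        final String replacement;  // null when there is no replacement feature

        Span(int start, int end, String replacement) {
            this.start = start;
            this.end = end;
            this.replacement = replacement;
        }
    }

    /**
     * Removes the text under each span, inserting the replacement where one
     * is given. Assumes the spans do not overlap.
     */
    static String simplify(String text, List<Span> spans) {
        List<Span> ordered = new ArrayList<>(spans);
        // Process spans from the end of the text backwards so that the
        // offsets of spans not yet processed remain valid.
        ordered.sort(Comparator.comparingInt((Span s) -> s.start).reversed());
        StringBuilder out = new StringBuilder(text);
        for (Span s : ordered) {
            out.replace(s.start, s.end, s.replacement == null ? "" : s.replacement);
        }
        return out.toString();
    }

    /** Illustration helper: a span over the first occurrence of phrase in text. */
    static Span spanOver(String text, String phrase, String replacement) {
        int start = text.indexOf(phrase);
        return new Span(start, start + phrase.length(), replacement);
    }

    public static void main(String[] args) {
        // Hand-picked spans; the real plugin may mark different text.
        String a = "The tub of ice-cream was unusually large in size.";
        List<Span> spansA = new ArrayList<>();
        spansA.add(spanOver(a, " in size", null));
        System.out.println(simplify(a, spansA));

        String b = "For some reason people will actually buy a pink coloured car.";
        List<Span> spansB = new ArrayList<>();
        spansB.add(spanOver(b, "For some reason people", "People"));
        spansB.add(spanOver(b, " coloured", null));
        System.out.println(simplify(b, spansB));
    }
}
```

Working backwards through the spans avoids having to adjust offsets after each edit. In a GATE pipeline the offsets and the replacement string would instead be read from the \verb|Redundant| annotations in the configured annotation set, or the bundled document exporters could be used directly.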