Revision: 16009
          http://gate.svn.sourceforge.net/gate/?rev=16009&view=rev
Author:   markagreenwood
Date:     2012-08-12 11:50:44 +0000 (Sun, 12 Aug 2012)
Log Message:
-----------
added the missing documentation of the different multiplier types you can use with the numbers tagger

Modified Paths:
--------------
    userguide/trunk/misc-creole.tex

Modified: userguide/trunk/misc-creole.tex
===================================================================
--- userguide/trunk/misc-creole.tex 2012-08-12 01:27:14 UTC (rev 16008)
+++ userguide/trunk/misc-creole.tex 2012-08-12 11:50:44 UTC (rev 16009)
@@ -453,6 +453,7 @@
 four thousand one hundred and two & 4102\\
 3 million & 3000000\\
 f\"{u}nfundzwanzig & 25\\
+4 score & 80\\
 \hline
 \end{tabular}
 \caption{Numbers Tagger Examples}
@@ -507,6 +508,7 @@
  <word value="2">hundreds</word>
  <word value="3">thousand</word>
  <word value="3">thousands</word>
+ <word value
 </multipliers>
 <conjunctions>
  <word whole="true">and</word>
@@ -526,8 +528,29 @@
 configuration file can be seen in Figure \ref{fig:numbers:example}. This
 configuration file specifies a handful of words and multipliers and a single
 conjunction. It also imports another configuration file (in the same format)
-defining Unicode symbols. Most of the configuration file is self-explanatory,
-however, the conjunctions needs further clarification. In English conjunctions
+defining Unicode symbols.
+
+The words are self-explanatory but the multipliers and conjunctions need further
+clarification.
+
+There are three possible types of multiplier:
+
+\begin{itemize}
+\item \textbf{e}: This is the default multiplier type (i.e. it is used if the type
+is missing) and signifies base 10 exponential notation. For example, if the specified
+value is 2 then this is expanded to $\times 10^2$, hence converting the text ``3 hundred'' into
+$3 \times 10^2$ or 300.
+\item \textbf{/}: This type allows you to define fractions. For example, you would define a half using the value 2 (i.e.
+you divide by 2). This allows text such as ``three halves'' to be normalized to 1.5 (i.e. $3/2$). Note that
+you can also use this type of multiplier to specify multiples greater than one. For example, the text ``four score''
+should be normalized to 80 as a score is a group of 20. To specify such a multiplier we use the fraction type
+with a value of 0.05. This leads to the normalized value being calculated as $4/0.05$, which is 80. To determine the
+value use the simple formula $(100/\mathit{multiple})/100$.
+\item \textbf{\^{}}: Multipliers of this type allow you to specify powers. For example, you could define ``squared'' with
+a value of 2 to allow the text ``three squared'' to be normalized to the number 9.
+\end{itemize}
+
+In English conjunctions
 are whole words, that is they require white space on either side of them, e.g.
 three hundred and one. In other languages, however, numbers can be joined into
 a single word using a conjunction. For example, in German the conjunction `und'

This was sent by the SourceForge.net collaborative development platform, the world's largest Open Source development site.

_______________________________________________
GATE-cvs mailing list
GATE-cvs@lists.sourceforge.net
https://lists.sourceforge.net/lists/listinfo/gate-cvs
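
The arithmetic behind the three multiplier types documented in this commit can be sketched as follows. This is only an illustration of the calculations described in the added text, not the Numbers Tagger's actual implementation: the function names `apply_multiplier` and `fraction_value_for_multiple` are hypothetical, and the type strings `"e"`, `"/"` and `"^"` are simply taken from the item labels in the diff.

```python
def apply_multiplier(number, value, kind="e"):
    """Combine a number with a multiplier (hypothetical helper, not GATE code).

    kind "e": base 10 exponent, number * 10**value (the default type)
    kind "/": fraction, number / value
    kind "^": power, number ** value
    """
    if kind == "e":
        return number * 10 ** value
    if kind == "/":
        return number / value
    if kind == "^":
        return number ** value
    raise ValueError("unknown multiplier type: %r" % (kind,))


def fraction_value_for_multiple(multiple):
    """Value to use with the "/" type so that dividing by it multiplies by `multiple`.

    This is the formula (100/multiple)/100 from the text, which equals 1/multiple.
    """
    return (100 / multiple) / 100


if __name__ == "__main__":
    print(apply_multiplier(3, 2))        # "3 hundred": 3 * 10**2
    print(apply_multiplier(3, 2, "/"))   # "three halves": 3 / 2
    print(apply_multiplier(3, 2, "^"))   # "three squared": 3 ** 2
    score = fraction_value_for_multiple(20)
    # "four score": 4 / 0.05; rounded because 0.05 is not exact in binary floating point
    print(round(apply_multiplier(4, score, "/"), 6))
```

Defining "score" through the fraction type means a multiple of 20 is stored as its reciprocal, 0.05, which is why the division by the configured value ends up multiplying by 20.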