Revision: 17953
http://sourceforge.net/p/gate/code/17953
Author: ian_roberts
Date: 2014-05-11 12:46:36 +0000 (Sun, 11 May 2014)
Log Message:
-----------
Mark's changes from trunk
Modified Paths:
--------------
userguide/branches/release-8.0/social-media.tex
Modified: userguide/branches/release-8.0/social-media.tex
===================================================================
--- userguide/branches/release-8.0/social-media.tex 2014-05-11 07:43:33 UTC (rev 17952)
+++ userguide/branches/release-8.0/social-media.tex 2014-05-11 12:46:36 UTC (rev 17953)
@@ -120,7 +120,9 @@
for another component that can split up multi-word hashtags.
\item ``Emoticons'' such as \verb!:-D! can be treated as a single token. This
requires a gazetteer of emoticons to be run before the tokeniser, an example
- gazetteer is provided in the Twitter plugin.
+ gazetteer is provided in the Twitter plugin. This gazetteer also normalises
+ the emoticons to help with classification, machine learning, etc. For example,
+ \verb!:-D! and \verb!8D! are both normalised to \verb!:D!.
\end{itemize}
The ``Tweet Normaliser'' PR uses a spelling correction dictionary to correct
@@ -155,6 +157,8 @@
apostrophe). Camel-cased hashtags (\verb!#CamelCasedHashtag!) are split at
case changes.
+More details, and an example use case, can be found in \cite{Maynard14a}.
+
%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%
\sect[sec:social:twitie]{The TwitIE Pipeline}
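
The two behaviours touched by this change (emoticon normalisation and
splitting camel-cased hashtags at case changes) can be illustrated with a
minimal Python sketch. This is not GATE code: the lookup table and both
function names are hypothetical, and the table holds only the three forms
mentioned in the text above, not the contents of the gazetteer shipped with
the Twitter plugin.

```python
import re

# Hypothetical lookup table, for illustration only. It contains just the
# forms named in the text; the real gazetteer is not reproduced here.
EMOTICON_NORMAL_FORMS = {
    ":-D": ":D",
    "8D": ":D",
    ":D": ":D",
}


def normalise_emoticon(token):
    """Return the normal form of a known emoticon, or the token unchanged."""
    return EMOTICON_NORMAL_FORMS.get(token, token)


def split_camel_case_hashtag(hashtag):
    """Split a hashtag into words at each lower-to-upper case change."""
    body = hashtag.lstrip("#")
    # Insert a space at every position preceded by a lowercase letter and
    # followed by an uppercase letter, then split on whitespace.
    return re.sub(r"(?<=[a-z])(?=[A-Z])", " ", body).split()
```

Under these assumptions, normalise_emoticon(":-D") and
normalise_emoticon("8D") both return ":D", and
split_camel_case_hashtag("#CamelCasedHashtag") returns the three words
Camel, Cased and Hashtag.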
_______________________________________________
GATE-cvs mailing list
[email protected]
https://lists.sourceforge.net/lists/listinfo/gate-cvs