Good day, Peter,

We are learning UIMA Ruta and are running into some problems with it. As I posted on Stack Overflow, a lot of the data in our documents does not fit the traditional natural-language mold: we have plenty of alphanumeric data such as file hashes, email addresses, domain names, and so on. We tried to rework the JFlex lexer and rebuild ruta-core, but we are now struggling to get the modified build working in the Ruta Workbench. Is there a better way to parse out and annotate this kind of data? A file containing sentences, or tabular data with MD5 hashes, would be a good example of the input we need to handle.
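
To make it concrete, here is a rough sketch of the kind of rule we were hoping to write. The type names (Md5Hash, EmailAddress) are placeholders of our own, and we may well be misreading the documentation on simple regular-expression rules:

    PACKAGE uima.ruta.example;

    // Placeholder types for the alphanumeric data we want to annotate.
    DECLARE Md5Hash;
    DECLARE EmailAddress;

    // Simple regular-expression rules, if we understand them correctly,
    // match on the document text itself rather than on the seeded tokens,
    // so the default JFlex-based tokenization would not get in the way.
    "[0-9a-fA-F]{32}" -> Md5Hash;
    "[A-Za-z0-9._%+-]+@[A-Za-z0-9.-]+[.][A-Za-z]{2,}" -> EmailAddress;

If plain regular-expression rules like these are the recommended approach for this kind of data, we would gladly drop the custom lexer entirely.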
Thank you,
Fran
