The regex in the token file should look like this:
/\S+/
I think the missing / / delimiters are what is causing the delimiter errors...
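In case it helps, here is a minimal sketch of how the token file could be written from a shell so that the slashes and the backslash end up in the file literally. It assumes the file is named token and sits in the current directory, as in your command; the count.pl line is copied from your message and left commented out so the snippet runs even where NSP is not installed.

```shell
# Write the regex, including its / / delimiters, on a single line.
# Single quotes and the %s format keep the backslash literal.
printf '%s\n' '/\S+/' > token

# Show what was actually written to the file.
cat token

# Then rerun the original command (uncomment where count.pl is available):
# count.pl --ngram=1 --token=token ocount.txt Documents
```

If cat shows the line without the surrounding slashes, the file still has the form that count.pl reported as having no delimiters.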
On Thu, May 12, 2016 at 2:11 AM, amir.jad...@yahoo.com [ngram] <
ngram@yahoogroups.com> wrote:
I'm running count.pl on a set of Unicode documents. I created a new file ('token')
which contains '\S+' in order to match any characters except whitespace.
Here is the output:
⇒ count.pl --ngram=1 --token=token ocount.txt Documents
Ignoring regex with no delimiters: \S+
No token definitions to work wi