The regex in the token file should look like this: /\S+/
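As a minimal sketch, assuming the token file is named 'token' as in the original message and sits in the current directory, it could be recreated from the shell like this. The quoting is the only subtle part: single quotes keep the shell from altering the backslash.

```shell
# Write the token definition with its / / delimiters, one regex per line.
# Single quotes stop the shell from interpreting the backslash in \S.
printf '%s\n' '/\S+/' > token

# Show what actually ended up in the file; it should be the delimited regex.
cat token

# Then re-run the command from the original message, for example:
#   count.pl --ngram=1 --token=token ocount.txt Documents
```

If the file still triggers the warning, it may be worth checking it for stray characters (a byte-order mark or Windows line endings) with something like `od -c token`.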
I think not having the / / is causing the delimiter errors...

On Thu, May 12, 2016 at 2:11 AM, amir.jad...@yahoo.com [ngram] <ngram@yahoogroups.com> wrote:

> I'm running count.pl on a set of Unicode documents. I created a new
> file ('token') which contains '\S+' in order to match any characters but
> spaces.
>
> Here is the output:
>
> ⇒ count.pl --ngram=1 --token=token ocount.txt Documents
> Ignoring regex with no delimiters: \S+
> No token definitions to work with.
> Type count.pl --help for help.
>
> What's the problem?!