Re: [ngram] Ignoring regex with no delimiters

2016-05-12 Thread Ted Pedersen tpede...@d.umn.edu [ngram]
The regex in token should look like this:

/\S+/

I think not having the / / is causing the delimiter errors...

On Thu, May 12, 2016 at 2:11 AM, amir.jad...@yahoo.com [ngram] <ngram@yahoogroups.com> wrote:
> I'm running count.pl on a set of unicode documents. Create a new
> file('token') wh
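
To illustrate the point about delimiters, here is a rough Python sketch of the kind of check that would produce the "Ignoring regex with no delimiters" message. It is not count.pl's actual code: the helper name read_token_regexes and the exact acceptance rule (a line must start and end with /) are assumptions made for illustration only.

```python
import re


def read_token_regexes(lines):
    """Hypothetical helper, not part of count.pl.

    Splits token-file lines into regexes wrapped in / / (accepted,
    returned without the slashes) and everything else (ignored).
    """
    accepted, ignored = [], []
    for line in lines:
        line = line.strip()
        if not line:
            continue  # skip blank lines
        if len(line) >= 2 and line.startswith("/") and line.endswith("/"):
            accepted.append(line[1:-1])  # drop the surrounding slashes
        else:
            ignored.append(line)  # assumed to trigger the warning
    return accepted, ignored


# Token file as in the original post: no delimiters, so nothing is accepted.
accepted, ignored = read_token_regexes([r"\S+"])
for regex in ignored:
    print("Ignoring regex with no delimiters:", regex)

# Token file as suggested in the reply: the regex is accepted.
accepted, ignored = read_token_regexes([r"/\S+/"])
tokens = re.findall(accepted[0], "two  words")
print(tokens)
```

Under these assumptions, a token file containing only \S+ leaves no usable token definitions, while /\S+/ gives one definition that matches each run of non-whitespace characters.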

[ngram] Ignoring regex with no delimiters

2016-05-12 Thread amir.jad...@yahoo.com [ngram]
I'm running count.pl on a set of Unicode documents. I created a new file ('token') which contains '\S+' in order to match any character except whitespace. Here is the output:

⇒ count.pl --ngram=1 --token=token ocount.txt Documents
Ignoring regex with no delimiters: \S+
No token definitions to work wi