I'm running count.pl on a set of Unicode documents. I created a new file ('token') that contains '\S+' in order to match any run of non-whitespace characters. Here is the output:
⇒ count.pl --ngram=1 --token=token ocount.txt Documents
Ignoring regex with no delimiters: \S+
No token definitions to work with.
Type count.pl --help for help.

What's the problem?!
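
Going by the "Ignoring regex with no delimiters" message, the pattern in the token file probably needs to be wrapped in delimiters. Below is a minimal sketch of how the file could be rewritten, assuming count.pl expects each regex on its own line enclosed in Perl-style forward slashes (that delimiter choice is an assumption here, not something the output above confirms):

```shell
# Write the pattern to the token file, wrapped in forward slashes.
# The slash delimiters are an assumption based on the error message.
# '%s\n' prints the argument literally, so the backslash in \S is preserved.
printf '%s\n' '/\S+/' > token

# Show the file contents to confirm what count.pl will read.
cat token
```

With the file in that form, the same count.pl command as above can be rerun to check whether the "no delimiters" warning goes away.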