Hi Christos,

In order to count as you describe, you just need to use the --newLine option.
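
To make the effect concrete, here is a minimal Python sketch of what "not spanning the new-line character" means for bigrams. It is only an illustration, not how count.pl is implemented: the function name `count_bigrams_per_line` is hypothetical, and it assumes plain whitespace tokenization and one sentence per line, which may differ from NSP's own token rules.

```python
from collections import Counter


def count_bigrams_per_line(text):
    """Count bigrams, never pairing tokens from two different lines."""
    counts = Counter()
    for line in text.splitlines():
        # Assumption: tokens are whitespace-separated (NSP's tokenizer may differ).
        tokens = line.split()
        # zip pairs each token with its successor on the same line only,
        # so the last token of a line is never joined to the next line.
        for left, right in zip(tokens, tokens[1:]):
            counts[f"{left}<>{right}"] += 1
    return counts


if __name__ == "__main__":
    sample = "Vincent loves Honey Bunny\nA women snorts\n"
    for bigram, n in count_bigrams_per_line(sample).items():
        print(bigram, n)
```

With the two example sentences on separate lines, the sketch yields the five bigrams asked for and no Bunny<>A. With count.pl itself the same result should come from adding --newLine to the usual invocation, for example `count.pl --newLine bigrams.txt input.txt` (file names hypothetical; check count.pl --help for the exact argument order).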

If you run count.pl --help you can see all the command line options. Among them is

...
  --newLine    Prevents n-grams from spanning across the new-line character.

which should do exactly as you wish!

Happy Counting, :)
Ted

On Thu, Feb 5, 2009 at 8:29 AM, christos.braeunle <christos.braeu...@yahoo.com> wrote:
> Hello
>
> I started using the NSP package and I am really impressed by its power.
> First of all, thanks for that great tool!
>
> Now I have run into a problem when building ngrams. I want to tell count.pl
> not to create ngrams across the end of a sentence.
>
> For example: I have two sentences.
>
> Vincent loves Honey Bunny
> A women snorts
>
> Now when building bigrams I would like to get:
>
> Vincent<>loves
> loves<>Honey
> Honey<>Bunny
> A<>women
> women<>snorts
>
> so I want the bigram Bunny<>A not to be created (and not counted).
>
> Is there a way to achieve this?
>
> I hope my question is understandable and has not been asked before.
>
> If I missed some relevant documentation, I would be glad to be pointed
> to it.
>
> Thanks a lot
>
> Christos Bräunle
>

--
Ted Pedersen
http://www.d.umn.edu/~tpederse