Hi Christos,

In order to count as you describe, you just need to use the --newLine option.

If you run

count.pl --help

you can see all the command line options. Among them is ...

 --newLine          Prevents n-grams from spanning across the
                    new-line character.

which should do exactly as you wish!
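
To make the effect concrete, here is a rough Python sketch of what "not spanning the new-line character" means for your two sentences. This is only an illustration of the counting behaviour, not how count.pl is implemented: the function name bigrams_per_line is made up for this example, and splitting on whitespace is an assumption that stands in for count.pl's own tokenization.

```python
from collections import Counter


def bigrams_per_line(text):
    """Count bigrams, never pairing tokens from different lines.

    Hypothetical helper for illustration; not part of NSP.
    """
    counts = Counter()
    for line in text.splitlines():
        # Assumption: plain whitespace tokenization, which is enough
        # for this example but is not count.pl's token definition.
        tokens = line.split()
        # Pair each token with its successor on the same line only,
        # so the last token of a line never joins the next line's first.
        counts.update(zip(tokens, tokens[1:]))
    return counts


text = "Vincent loves Honey Bunny\nA women snorts\n"
counts = bigrams_per_line(text)

# Print in an NSP-like "left<>right<>count" layout (simplified).
for (left, right), n in counts.items():
    print(f"{left}<>{right}<>{n}")
```

Because each line is processed separately, the sketch yields the five bigrams you listed, and Bunny<>A is never formed. Note that this only helps if every sentence sits on its own line in the input file.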

Happy Counting, :)
Ted

On Thu, Feb 5, 2009 at 8:29 AM, christos.braeunle
<christos.braeu...@yahoo.com> wrote:
> Hello
>
> I started using the NSP package and I am really impressed by its power.
> First of all, thanks for that great tool!
>
> Now I have run into a problem when building n-grams. I want to tell count.pl
> not to create n-grams across the end of a sentence.
>
> For example, I have two sentences:
>
> Vincent loves Honey Bunny
> A women snorts
>
> Now when building bigrams I would like to get:
>
> Vincent<>loves
> loves<>Honey
> Honey<>Bunny
> A<>women
> women<>snorts
>
> So I want the bigram Bunny<>A not to be created (and not to be counted).
>
> Is there a way to achieve this?
>
> I hope my question is understandable and has not been asked before.
>
> If I missed some relevant documentation, I would be glad to be pointed
> to it.
>
> Thanks a lot
>
> Christos Bräunle
>
> 



-- 
Ted Pedersen
http://www.d.umn.edu/~tpederse
