I would run count.pl without a stop list (or perhaps with just a normal stop list) and then filter the output with another program (e.g. sed), keeping only the lines that contain XTOPOX. This one-liner would do the trick:
sed -ne '/XTOPOX/p' count-output.cnt

--- In ngram@yahoogroups.com, "ftwaroch" <f.a.twar...@...> wrote:
>
> Dear ngram group,
>
> Thanks for a great tool! I started playing with count.pl a couple of
> days ago and wondered if it was possible to do the opposite of a
> stopword list. My intention is to create an n-gram file that contains
> only n-grams with a certain item (I investigate placenames in a text.)
> I replaced all known placenames with a dummy value XTOPOX, and
> defined a stoplist file -
>
> @stop.mode=AND
> /[^XTOP]/
>
> This is not very clean approach as all patterns that are not XTOP are
> returned and I get noise back as well, see example:
>
> example_out.txt
>
> 16
> to<>XTOPOX<>2 2 4
> XTOPOX<>.<>2 4 3
> Tudur<>XTOPOX<>1 1 4
> XTOPOX<>on<>1 4 1
> XTOPOX<>,<>1 4 1
> OF<>TO<>1 1 1
> XX<>THE<>1 1 1
> CHAPTER<>XX<>1 2 1
> ,<>XTOPOX<>1 2 4
> TO<>DAY<>1 1 1
> X<>LLYWELYN<>1 1 1
> T<>.<>1 1 3
> ,<>T<>1 2 1
> CHAPTER<>X<>1 2 1
>
>
> Is there approach to that? If you have any pointers for me I would be
> very happy.
> many thanks,
>
> Florian
>
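P.S. The sed filter matches XTOPOX anywhere on the line, so a token that merely contains that string would also get through. If that turns out to matter, here is a rough awk sketch that only keeps lines where one of the tokens is exactly the marker. The function name filter_ngrams is made up for illustration, and the sketch assumes the output layout shown in the quoted example: tokens separated by <> with the counts in the last field.

```shell
# filter_ngrams TOKEN FILE   (hypothetical helper name, not part of count.pl)
# Prints only the n-gram lines in which one token is exactly TOKEN.
# Assumes tokens are separated by "<>" and the final field holds the counts.
filter_ngrams() {
  awk -F'<>' -v tok="$1" '
    {
      # Fields 1 .. NF-1 are the tokens; field NF is the counts.
      # The header line with the total has NF == 1, so it is never printed.
      for (i = 1; i < NF; i++) {
        # Appending "" forces a string comparison rather than a numeric one.
        if (($i "") == (tok "")) { print; next }
      }
    }
  ' "$2"
}
```

It would be called like the one-liner, for example: filter_ngrams XTOPOX count-output.cnt > placenames.cnt (placenames.cnt being just an example output name).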