Dear ngram group,

Thanks for a great tool! I started playing with count.pl a couple of days ago and wondered whether it is possible to do the opposite of a stopword list. My intention is to create an n-gram file that contains only n-grams with a certain item (I am investigating placenames in a text). I replaced all known placenames with a dummy value, XTOPOX, and defined a stoplist file:
@stop.mode=AND
/[^XTOP]/

This is not a very clean approach, as all patterns that are not XTOP are returned and I get noise back as well. See this example:

example_out.txt:

16
to<>XTOPOX<>2 2 4
XTOPOX<>.<>2 4 3
Tudur<>XTOPOX<>1 1 4
XTOPOX<>on<>1 4 1
XTOPOX<>,<>1 4 1
OF<>TO<>1 1 1
XX<>THE<>1 1 1
CHAPTER<>XX<>1 2 1
,<>XTOPOX<>1 2 4
TO<>DAY<>1 1 1
X<>LLYWELYN<>1 1 1
T<>.<>1 1 3
,<>T<>1 2 1
CHAPTER<>X<>1 2 1

Is there a better approach to this? If you have any pointers for me I would be very happy.

Many thanks,
Florian
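
The noise in the output above seems to come from the regex itself: /[^XTOP]/ is a character class, so it matches any token containing at least one character other than X, T, O or P. Tokens spelled only with those four letters (TO, XX, X, T) therefore do not count as stop words, and in AND mode an n-gram appears to survive as soon as one of its tokens is not a stop word. One possible workaround, independent of the stoplist, is to post-filter the count.pl output. The following is only a minimal Python sketch under that assumption; the names `is_stopped` and `filter_ngrams` are hypothetical helpers and not part of the ngram package, and the leading total-count line is simply dropped rather than recomputed.

```python
import re

# The stoplist regex from the post, used here only to illustrate
# which tokens it treats as stop words.
STOP_RE = re.compile(r"[^XTOP]")


def is_stopped(token):
    """True if the token contains any character other than X, T, O, P."""
    return STOP_RE.search(token) is not None


def filter_ngrams(lines, target="XTOPOX"):
    """Keep only n-gram lines in which at least one token equals `target`."""
    kept = []
    for line in lines:
        line = line.strip()
        if "<>" not in line:
            # Skip blank lines and the leading total-count line (e.g. "16").
            continue
        # Tokens are separated by "<>"; the final field holds the counts.
        *tokens, _counts = line.split("<>")
        if target in tokens:
            kept.append(line)
    return kept


# A few lines taken from the example output in the post.
sample = [
    "16",
    "to<>XTOPOX<>2 2 4",
    "XTOPOX<>.<>2 4 3",
    "OF<>TO<>1 1 1",
    "CHAPTER<>XX<>1 2 1",
    ",<>XTOPOX<>1 2 4",
]

for line in filter_ngrams(sample):
    print(line)

# Tokens that escape the stop regex, which explains the noise:
print([t for t in ["OF", "TO", "XX", "THE", "XTOPOX"] if not is_stopped(t)])
```

The counts on each kept line are left exactly as count.pl wrote them, so they still refer to the unfiltered text.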