Hi Patrick,

You need to pre-process the text (data cleaning) to remove punctuation before running count.pl. In the same way, you need to post-process the output to get the bigrams or trigrams in the format you want.
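As a rough illustration of both steps, here is a minimal Python sketch. It is not part of the package: the function names strip_punctuation and ngram_words are made up for this example, and it assumes each output line has the form shown in your message (words joined by <>, with the numbers in the last field). Lines without <> are skipped, on the assumption that they carry no n-gram.

```python
def strip_punctuation(text, marks=".,"):
    """Pre-processing: delete every character in `marks` from the raw text.

    Note that deletion joins the neighbours, so "3.5" becomes "35".
    """
    return text.translate(str.maketrans("", "", marks))


def ngram_words(line, sep="<>"):
    """Post-processing: return only the words of one output line.

    Assumes the last `sep`-delimited field holds the statistics.
    Returns None for lines that contain no separator.
    """
    parts = line.strip().split(sep)
    if len(parts) < 2:
        return None
    return " ".join(parts[:-1])


if __name__ == "__main__":
    # Clean the text before it is counted.
    print(strip_punctuation("Mobile phones, cellular phones."))

    # Reduce output lines to plain word lists.
    for line in ["mobile<>phones<>100 280 384", "cellular<>phones<>96 214 384"]:
        words = ngram_words(line)
        if words is not None:
            print(words)
```

You would run the first function over your corpus file before counting, and the second over each line of the output file to produce the version without statistics. The same function handles trigrams, since it keeps every field except the last.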
Thanks,
Ying

semiotica24 wrote:
>
> Sorry for the basic questions:
> 1. I need 2 versions of output for each list of bigrams and trigrams
> that I create using the various measures in count.pl and statistic.pl:
> one with the default statistics and one without. How do I format to
> exclude the statistics?
> e.g.:
> mobile<>phones<>100 280 384
> cellular<>phones<>96 214 384
>
> mobile phones
> cellular phones
>
> 2. I need to remove punctuation . and , I've tried within my stopword
> list, but I don't have the tags quite right. How should I enter into
> my stop file?
>
> Thanks!
>
> Patrick