In answer to your first question...

On Wed, Aug 17, 2011 at 1:30 PM, semiotica24 <semiotic...@yahoo.com> wrote:
>
> Sorry for the basic questions:
> 1. I need 2 versions of output for each list of bigrams and trigrams that I
> create using the various measures in count.pl and statistic.pl: one with the
> default statistics and one without. How do I format to exclude the statistics?
> e.g.:
> mobile<>phones<>100 280 384
> cellular<>phones<>96 214 384
>
> mobile phones
> cellular phones
>
There isn't a built-in way to do this, but the following script will work for bigrams...

marengo(17): cat c.pl
while (<>) {
    # keep the two tokens and drop the counts; the first line of the
    # file (the total bigram count) has no <> and is skipped
    if (/^(\S+)<>(\S+)<>/) {
        print "$1 $2\n";
    }
}

marengo(18): cat out
7
i<>have<>1 2 1
news<>i<>1 1 2
have<>news<>1 1 1
like<>ngrams<>1 1 1
i<>like<>1 2 1
friends<>i<>1 1 2
my<>friends<>1 1 1

marengo(19): perl c.pl out
i have
news i
have news
like ngrams
i like
friends i
my friends

Hope this helps!
Ted

>
> 2. I need to remove punctuation . and , I've tried within my stopword list,
> but I don't have the tags quite right. How should I enter into my stop file?
>
> Thanks!
>
> Patrick
>

--
Ted Pedersen
http://www.d.umn.edu/~tpederse
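
Since the question also mentions trigrams, the bigram script above could probably be generalized along the following lines. This is only a rough sketch: it assumes every n-gram line has the token<>token<>...<>numbers layout shown in the example output, so that the last <>-separated field always holds the counts or scores. The name strip_stats is made up for illustration and is not part of count.pl or statistic.pl.

```perl
#!/usr/bin/perl
use strict;
use warnings;

# Return the tokens of one n-gram line joined by single spaces, or undef
# for lines that are not n-grams (e.g. the total count on the first line).
# Assumption: the last <>-separated field holds the counts/scores.
sub strip_stats {
    my ($line) = @_;
    chomp $line;
    my @fields = split /<>/, $line;
    return if @fields < 2;    # no <> separator, so not an n-gram line
    pop @fields;              # discard the trailing numbers
    return join ' ', @fields;
}

# Read the files named on the command line (or standard input).
while (my $line = <>) {
    my $ngram = strip_stats($line);
    print "$ngram\n" if defined $ngram;
}
```

Saved under some name such as strip.pl (again just a placeholder), it would be run the same way as c.pl, e.g. perl strip.pl out, and should handle bigram and trigram files alike because it does not hard-code the number of tokens.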