Hi Patrick,

You need to pre-process the text (data cleaning) to remove
punctuation before running count.pl. In the same way, you
need to post-process the output to get the bigrams or trigrams
into the format you want.
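
As a rough illustration of both steps, here is a minimal Python sketch.
It is not part of NSP; the function names strip_punctuation and
ngram_only are made up for this example. It assumes that only periods
and commas need to go, that each n-gram line has the form shown in your
message (tokens separated by <>, with the numbers in the last field),
and that any line without <> (such as a total-count line) can be skipped.

```python
import re


def strip_punctuation(text):
    """Pre-processing: replace periods and commas with a space.

    Using a space rather than an empty string keeps words apart when
    the punctuation mark was the only separator (e.g. "end.Start").
    """
    return re.sub(r"[.,]", " ", text)


def ngram_only(line):
    """Post-processing: 'mobile<>phones<>100 280 384' -> 'mobile phones'.

    Returns None for lines that contain no '<>' separator
    (assumed here to be header lines rather than n-grams).
    """
    line = line.rstrip("\n")
    if "<>" not in line:
        return None
    fields = line.split("<>")
    # The last field holds the numbers; everything before it is a token.
    return " ".join(fields[:-1])


if __name__ == "__main__":
    sample = ["mobile<>phones<>100 280 384", "cellular<>phones<>96 214 384"]
    for entry in sample:
        print(ngram_only(entry))
```

Note that replacing every period also splits numbers such as 3.5, so
the pattern may need adjusting for your data.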

Thanks,
Ying

semiotica24 wrote:
>
> Sorry for the basic questions:
> 1. I need 2 versions of output for each list of bigrams and trigrams 
> that I create using the various measures in count.pl and statistic.pl: 
> one with the default statistics and one without. How do I format to 
> exclude the statistics?
> e.g.:
> mobile<>phones<>100 280 384
> cellular<>phones<>96 214 384
>
> mobile phones
> cellular phones
>
> 2. I need to remove punctuation (. and ,). I've tried within my stopword 
> list, but I don't have the tags quite right. How should I enter them into 
> my stop file?
>
> Thanks!
>
> Patrick
>
> 


