I'm trying to analyze thousands of different cancer datasets by running the same Python program on each of them. I use Windows XP, Python 2.7 and IDLE. I already have the input files in a directory, and I want to learn the syntax for the quickest way to run the program over all of these datasets.
As an example, for the sample Python program below, I don't want to have to open the program each time and change filename and countfile by hand. A computer could do this much quicker than I ever could. Thanks in advance!

    import string

    filename = 'draft1.txt'
    countfile = 'draft1_output.txt'

    def add_word(counts, word):
        if counts.has_key(word):
            counts[word] += 1
        else:
            counts[word] = 1

    def get_word(item):
        word = ''
        item = item.strip(string.digits)
        item = item.lstrip(string.punctuation)
        item = item.rstrip(string.punctuation)
        word = item.lower()
        return word

    def count_words(text):
        text = ' '.join(text.split('--'))  # replace '--' with a space
        items = text.split()  # leaves in leading and trailing punctuation,
                              # '--' not recognised by split() as a word separator
        counts = {}
        for item in items:
            word = get_word(item)
            if not word == '':
                add_word(counts, word)
        return counts

    infile = open(filename, 'r')
    text = infile.read()
    infile.close()
    counts = count_words(text)

    outfile = open(countfile, 'w')
    outfile.write("%-18s%s\n" % ("Word", "Count"))
    outfile.write("=======================\n")
    counts_list = counts.items()
    counts_list.sort()
    for word in counts_list:
        outfile.write("%-18s%d\n" % (word[0], word[1]))
    outfile.close()
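One possible way to avoid editing filename and countfile by hand is to wrap the file handling in a function and loop over the directory with the standard glob module. The following is only a minimal sketch, not a definitive solution. It assumes that every input file ends in .txt, that all inputs sit in a single directory, and that each output should be named after its input with '_output.txt' appended to the base name. The names process_file and process_directory, and the pattern and suffix parameters, are made up for illustration. The sketch is written so that it should run under Python 2.7 as well as Python 3.

```python
import glob
import os
import string


def get_word(item):
    # Same cleanup as the original program: drop surrounding digits,
    # then surrounding punctuation, then lower-case the result.
    item = item.strip(string.digits)
    item = item.strip(string.punctuation)
    return item.lower()


def count_words(text):
    # Treat '--' as a word separator, then count each cleaned word.
    counts = {}
    for item in text.replace('--', ' ').split():
        word = get_word(item)
        if word:
            counts[word] = counts.get(word, 0) + 1
    return counts


def process_file(filename, countfile):
    # Read one input file and write its word counts to countfile.
    with open(filename, 'r') as infile:
        counts = count_words(infile.read())
    with open(countfile, 'w') as outfile:
        outfile.write("%-18s%s\n" % ("Word", "Count"))
        outfile.write("=======================\n")
        for word, count in sorted(counts.items()):
            outfile.write("%-18s%d\n" % (word, count))


def process_directory(input_dir, pattern='*.txt', suffix='_output.txt'):
    # Hypothetical helper: run process_file on every matching file in
    # input_dir and return the list of output files that were written.
    written = []
    for filename in sorted(glob.glob(os.path.join(input_dir, pattern))):
        if filename.endswith(suffix):
            continue  # skip outputs left over from an earlier run
        countfile = os.path.splitext(filename)[0] + suffix
        process_file(filename, countfile)
        written.append(countfile)
    return written
```

With this in place, a single call such as process_directory(r'C:\data\cancer') would handle every .txt file in that folder; the path here is a placeholder, not a real location. Skipping names that already end in the output suffix keeps a second run from counting the words in its own earlier output files.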