#Here's my try:
# Map each vowel's code point to None so str.translate() deletes it (Python 3).
vowel_killer_dict = {ord(a): None for a in 'aeiou'}

def devocalize(word):
    return word.translate(vowel_killer_dict)

vowelled = ['him', 'ham', 'hum', 'fun', 'fan']
vowelled = set(vowelled)

# Devocalize each word exactly once and remember the result.
devocalise_dict = {}
for a in vowelled:
    devocalise_dict[a] = devocalize(a)

unvowelled = set(devocalise_dict.values())

for lex in unvowelled:
    d = {}
    d[lex] = [word for word in vowelled if devocalise_dict[word] == lex]
    print(lex, " ".join(d[lex]))
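
A possible further speed-up, offered as a sketch rather than a finished answer: the last loop above still walks through the whole word set once for every consonantal form, so the work grows roughly as (number of words) x (number of forms). Collecting the words into a dictionary of lists in a single pass avoids that rescanning. This assumes Python 3; group_by_skeleton and VOWEL_TABLE are names I made up for illustration, and the vowel set 'aeiou' follows my attempt (your original used "aiou").

```python
from collections import defaultdict

# Translation table that deletes the vowels (assumed vowel set: 'aeiou').
VOWEL_TABLE = str.maketrans("", "", "aeiou")

def devocalize(word):
    """Return word with its vowels removed."""
    return word.translate(VOWEL_TABLE)

def group_by_skeleton(words):
    """Map each consonantal form to the sorted list of words that reduce to it."""
    groups = defaultdict(list)
    for word in set(words):  # set() drops duplicate words
        groups[devocalize(word)].append(word)
    return {lex: sorted(forms) for lex, forms in groups.items()}

if __name__ == "__main__":
    vowelled = ['him', 'ham', 'hum', 'fun', 'fan']
    for lex, forms in sorted(group_by_skeleton(vowelled).items()):
        print(lex, " ".join(forms))
```

Each word is devocalized once and appended to exactly one list, so the running time should grow about linearly with the number of words instead of with words times forms.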
--- On Mon, 21.12.09, Emad Nawfal (عمـ نوفل ـاد)
<[email protected]> wrote:
From: Emad Nawfal (عمـ نوفل ـاد) <[email protected]>
Subject: [Tutor] How can I make this run faster?
To: "tutor" <[email protected]>
Date: Monday, 21 December 2009, 8:40
Dear Tutors,
The purpose of this script is to see how many vocalized forms map to a single
consonantal form. For example, the form "fn" could be fan, fin, fun.
The input is a large list (taken from a file) that has ordinary words. The
script creates a devocalized list, then compares the two lists.
The problem: It takes over an hour to process 1 file. The average file size is
400,000 words.
Question: How can I make it run faster? I have a large number of files.
Note: I'm not a programmer, so please avoid very technical terms.
Thank you in anticipation.
def devocalize(word):
    vowels = "aiou"
    return "".join([letter for letter in word if letter not in vowels])

vowelled = ['him', 'ham', 'hum', 'fun', 'fan']  # input, usually a large list of around 500,000 items
vowelled = set(vowelled)
unvowelled = set([devocalize(word) for word in vowelled])
for lex in unvowelled:
    d = {}
    d[lex] = [word for word in vowelled if devocalize(word) == lex]
    print lex, " ".join(d[lex])
--
لا أعرف مظلوما تواطأ الناس علي هضمه ولا زهدوا في إنصافه كالحقيقة.....محمد
الغزالي
"No victim has ever been more repressed and alienated than the truth"
Emad Soliman Nawfal
Indiana University, Bloomington
_______________________________________________
Tutor maillist - [email protected]
To unsubscribe or change subscription options:
http://mail.python.org/mailman/listinfo/tutor