2009/12/21 Alan Gauld <[email protected]>

> "Emad Nawfal (عمـ نوفل ـاد)" <[email protected]> wrote
>
>> def devocalize(word):
>>     vowels = "aiou"
>
> Should this include 'e'?
>
>>     return "".join([letter for letter in word if letter not in vowels])
>
> It's probably faster to use a regular expression replacement.
> Simply replace any vowel with the empty string.
>
>> vowelled = ['him', 'ham', 'hum', 'fun', 'fan']  # input, usually a large list of around 500,000 items
>> vowelled = set(vowelled)
>
> How do you process the file? Do you read it all into memory and
> then convert it to a set? Or do you process each line (one word
> per line?) and add the words to the set one by one? The latter
> is probably faster.
>
>> unvowelled = set([devocalize(word) for word in vowelled])
>> for lex in unvowelled:
>>     d = {}
>>     d[lex] = [word for word in vowelled if devocalize(word) == lex]
>
> I think you could remove the comprehensions and do all of
> this inside a single loop. This is one of those cases where a single
> explicit loop is faster than 2 comprehensions and a loop.
>
> But the only way to be sure is to test/profile to see where the
> slowdown occurs.
>
> HTH,
>
> --
> Alan Gauld
> Author of the Learn to Program web site
> http://www.alan-g.me.uk/
>
> _______________________________________________
> Tutor maillist - [email protected]
> To unsubscribe or change subscription options:
> http://mail.python.org/mailman/listinfo/tutor
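For reference, the two suggestions in the quoted message (a regular expression replacement and a single explicit loop) could be combined roughly as below. This is only a sketch, not the script Bob posted: the names `VOWELS` and `group_by_skeleton` are made up for illustration, the vowel set assumes 'e' is added as Alan asks, and for real Semitic-language data the character class would have to be replaced with whatever vowel marks the input actually uses.

```python
import re
from collections import defaultdict

# Assumption: the English example vowels from the post, plus 'e'.
VOWELS = re.compile(r"[aeiou]")

def devocalize(word):
    """Return the word with every vowel replaced by the empty string."""
    return VOWELS.sub("", word)

def group_by_skeleton(words):
    """Map each devocalized form to the sorted list of words that produce it."""
    groups = defaultdict(set)
    for word in words:
        # One pass over the input: devocalize() is called once per word,
        # and the set drops duplicate words automatically.
        groups[devocalize(word)].add(word)
    return {lex: sorted(forms) for lex, forms in groups.items()}

vowelled = ['him', 'ham', 'hum', 'fun', 'fan']
d = group_by_skeleton(vowelled)
print(d)  # {'hm': ['ham', 'him', 'hum'], 'fn': ['fan', 'fun']}
```

The quoted loop calls devocalize() on every word once for each unvowelled form, so its cost grows with the product of the two set sizes, and it also rebinds `d` to a fresh dictionary on every iteration, so only the last key survives. The single loop above touches each word once. Whether the regular expression itself beats the original join is something only profiling on the real data can confirm.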
Thank you so much, Bob and Alan.

The script is meant to process Semitic languages, so I was just giving
examples from English. I totally forgot the 'e'. Bob's script runs
perfectly.

I'm a non-programmer in the sense that I know how to do basic things,
but I'm not a professional. For example, my script does what I want, but
when I needed to look into efficiency, I got stuck.

Thank you all for the help.

--
لا أعرف مظلوما تواطأ الناس علي هضمه ولا زهدوا في إنصافه كالحقيقة.....محمد الغزالي
"No victim has ever been more repressed and alienated than the truth"

Emad Soliman Nawfal
Indiana University, Bloomington
