2009/1/24 Emad Nawfal (عماد نوفل) <emadnaw...@gmail.com>:
>
> 2009/1/23 Emad Nawfal (عماد نوفل) <emadnaw...@gmail.com>
>>
>> On Fri, Jan 23, 2009 at 6:57 PM, Andre Engels <andreeng...@gmail.com>
>> wrote:
>>>
>>> I made an error in my program... Sorry, it should be:
>>>
>>> def hasRoot(word, root): # This order I find more logical
>>>     loc = 0
>>>     for letter in root:
>>>         loc = word.find(letter,loc) # I missed the ,loc here...
>>>         if loc == -1:
>>>             return False
>>>     return True
>>>
>>> # main
>>>
>>> infile = open("myCorpus.txt").read().split()
>>> query = "ktb"
>>> outcome = [word for word in infile if hasRoot(word,query)]
>>>
>>> --
>>> André Engels, andreeng...@gmail.com
>>
>> Thank you so much. bktab is a legal Arabic word. I also found the word
>> bmktbha in the corpus. I would have missed that.
>> Thank you again.
>> --
>> "No victim has ever been more repressed and alienated than the truth"
>> (Muhammad al-Ghazali)
>>
>> Emad Soliman Nawfal
>> Indiana University, Bloomington
>> http://emnawfal.googlepages.com
>
> Hi again,
> If I want to use a regular expression to find the root ktb in all its
> derivations, would this be a good way around it:
>
> >>> x = re.compile("[a-z]*k[a-z]*t[a-z]*b[a-z]*")
> >>> text = "hw syktbha ghda wlktab ktb"
> >>> re.findall(x, text)
> ['syktbha', 'wlktab', 'ktb']
Yes, that looks correct. A regular expression solution is also easier
to adapt. For example, the little that I know of Arabic makes me believe
that _between_ the letters of a root there may only be vowels. If that's
correct, the RE can be changed to
"[a-z]*k[aeiou]*t[aeiou]*b[a-z]*"

--
André Engels, andreeng...@gmail.com
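
To make the difference between the two patterns concrete, here is a small
sketch that runs both on the sample text from the question. It assumes a
transliteration in which a, e, i, o, u are the only vowel symbols, which may
not hold for the scheme actually used in the corpus. The second input string,
including the made-up form kxtb, is hypothetical and only there to show where
the patterns disagree.

```python
import re

# Sample text from the question (transliterated Arabic).
text = "hw syktbha ghda wlktab ktb"

# Loose pattern: any lowercase letters may sit between the root consonants.
loose = re.compile(r"[a-z]*k[a-z]*t[a-z]*b[a-z]*")

# Stricter pattern: only vowels may sit between the root consonants.
# Assumption: a, e, i, o, u are the only vowel symbols in the transliteration.
strict = re.compile(r"[a-z]*k[aeiou]*t[aeiou]*b[a-z]*")

# Both patterns agree on the sample text.
print(loose.findall(text))   # ['syktbha', 'wlktab', 'ktb']
print(strict.findall(text))  # ['syktbha', 'wlktab', 'ktb']

# Hypothetical input: "kxtb" has a consonant between k and t.
probe = "kitab kxtb"
print(loose.findall(probe))   # ['kitab', 'kxtb']
print(strict.findall(probe))  # ['kitab']
```

On the original sample the two patterns return the same words; they only
diverge when a consonant falls between two root letters, so which one is
appropriate depends on whether that can happen in real derivations.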