2009/9/17 bob gailer <bgai...@gmail.com>

> Emad Nawfal (عماد نوفل) wrote:
>
>> Hi Tutors,
>> I want to color-code the different parts of the word in a morphologically
>> complex natural language. The file I have looks like this, where the first
>> column is the word, and the second is the composite part of speech tag. For
>> example, Al is a DETERMINER, wlAy is a NOUN and At is a PLURAL NOUN SUFFIX
>>
>> Al+wlAy+At DET+NOUN+NSUFF_FEM_PL
>> Al+mtHd+p DET+ADJ+NSUFF_FEM_SG
>>
>> The output I want is one in which the word has no plus signs, and each
>> segment is color-coded with a grammatical category. For example, the noun is
>> red, the det is green, and the suffix is orange. Like on this page here:
>> http://docs.google.com/View?id=df7jv9p9_3582pt63cc4
>> I am stuck with the html part and I don't know where to start. I have no
>> experience with html, but I have this skeleton (which may not be the right
>> thing anyway).
>> Any help with materials, modules, suggestions appreciated.
>>
>> This skeleton of my program is as follows:
>>
>> #############
>> RED = ("NOUN", "ADJ")
>> GREEN = ("DET", "DEMON")
>> ORANGE = ("NSUFF", "VSUFF", "ADJSUFF")
>>
>
> Instead of that use a dictionary:
>
> colors = dict(NOUN="RED", ADJ="RED", DET="GREEN", DEMON="GREEN",
>               NSUFF="ORANGE", VSUFF="ORANGE", ADJSUFF="ORANGE")
>
>> # print html head
>> def print_html_head():
>>     # print the head of the html page
>> def print_html_tail():
>>     # print the tail of the html page
>>
>> def color(segment, color):
>>     # STUCK HERE should take a color, and a segment for example
>>
>> # main
>> import sys
>> infile = open(sys.argv[1]) # takes as input the POS-tagged file
>> print_html_head()
>> for line in infile:
>>     line = line.split()
>>     if len(line) != 2: continue
>>     word = line[0]
>>     pos = line[1]
>>     zipped = zip(word.split("+"), pos.split("+"))
>>     for x, y in zipped:
>>         if y in DET:
>>             color(x, "#FF0000")
>>         else:
>>             color(x, "#0000FF")
>>
>> print_html_tail()
>>
>> --
>> لا أعرف مظلوما تواطأ الناس علي هضمه ولا زهدوا في إنصافه كالحقيقة.....محمد الغزالي
>> "No victim has ever been more repressed and alienated than the truth"
>>
>> Emad Soliman Nawfal
>> Indiana University, Bloomington
>> --------------------------------------------------------
>>
>> _______________________________________________
>> Tutor maillist - Tutor@python.org
>> To unsubscribe or change subscription options:
>> http://mail.python.org/mailman/listinfo/tutor
>
> --
> Bob Gailer
> Chapel Hill NC
> 919-636-4239
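A minimal sketch of how the dictionary suggestion above might be wired into the skeleton, in case it helps anyone reading the thread later. It is written for Python 3 and is only one possible approach. The names color_segment, colorize_line, write_page and DEFAULT_COLOR are hypothetical, and so is the rule of looking up only the part of a tag before the first underscore (so NSUFF_FEM_PL is treated as NSUFF); none of these come from the original posts.

```python
from html import escape

# Tag-to-color mapping, following the dictionary suggested above
# (lower-case CSS color names are used here).
COLORS = dict(NOUN="red", ADJ="red", DET="green", DEMON="green",
              NSUFF="orange", VSUFF="orange", ADJSUFF="orange")
DEFAULT_COLOR = "black"  # assumption: fallback for tags not in COLORS


def color_segment(segment, tag):
    """Wrap one word segment in a <span> colored according to its tag."""
    # Assumption: features after the first underscore are ignored,
    # so NSUFF_FEM_PL is looked up as NSUFF.
    base = tag.split("_")[0]
    shade = COLORS.get(base, DEFAULT_COLOR)
    return '<span style="color:%s">%s</span>' % (shade, escape(segment))


def colorize_line(line):
    """Convert one 'word tags' line to colored HTML; None if malformed."""
    fields = line.split()
    if len(fields) != 2:
        return None
    word, pos = fields
    pairs = zip(word.split("+"), pos.split("+"))
    # Joining without a separator drops the plus signs from the word.
    return "".join(color_segment(seg, tag) for seg, tag in pairs)


def write_page(lines, out):
    """Write a bare-bones HTML page with one colored word per line."""
    out.write("<html><body>\n")
    for line in lines:
        html_word = colorize_line(line)
        if html_word is not None:
            out.write(html_word + "<br>\n")
    out.write("</body></html>\n")
```

With a tagged file, something like write_page(open(sys.argv[1]), sys.stdout) (after import sys) would print the page to standard output. html.escape is applied because transliterated segments may contain characters such as < or & that would otherwise be read as markup.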
Thank you all. This is great help. I just started looking into html two days ago. Thank you again.

--
لا أعرف مظلوما تواطأ الناس علي هضمه ولا زهدوا في إنصافه كالحقيقة.....محمد الغزالي
"No victim has ever been more repressed and alienated than the truth"

Emad Soliman Nawfal
Indiana University, Bloomington
--------------------------------------------------------