Hi Vince, Thanks very much for the one-line version--unfortunately, I still get errors. The overall script runs over every text file in a directory, but as soon as it hits a text file without a <genre> tag, it gives this error:
Traceback (most recent call last):
  File "C:\Users\tylersc\Desktop\Tyler2\Tyler\words_per_review_IMDB_9-13-10.py", line 168, in <module>
    main(".","output.csv")
  File "C:\Users\tylersc\Desktop\Tyler2\Tyler\words_per_review_IMDB_9-13-10.py", line 166, in main
    os.path.walk(top_level_dir, reviewDirectory, writer )
  File "C:\Python26\lib\ntpath.py", line 259, in walk
    func(arg, top, names)
  File "C:\Users\tylersc\Desktop\Tyler2\Tyler\words_per_review_IMDB_9-13-10.py", line 162, in reviewDirectory
    reviewFile( dirname+'/'+fileName, args )
  File "C:\Users\tylersc\Desktop\Tyler2\Tyler\words_per_review_IMDB_9-13-10.py", line 74, in reviewFile
    rgenre = re.split(r';', rf.info["genre"])
KeyError: 'genre'

I'm about to give what may be too much information--I really thought there must be a way to say "don't choke if you don't find any rgenres because rf.info["genre"] was empty". But maybe I need to define the "None" condition earlier?

Basically a text file has this structure:

<info>
<title>High Noon</title>
<genre>Drama;Western</genre> # But this tag doesn't exist for all text files
# etc
</info>
<review>
<author>u493498</author>
<rating>9 out of 10</rating>
<summary>A great flick</summary>
<text>blah blah blah</text>
# etc
</review>
# next review--all about the movie featured in the info tags

-----Original Message-----
From: Vince Spicer <vi...@vinces.ca>
To: aenea...@priest.com
Cc: tutor@python.org
Sent: Mon, Sep 13, 2010 9:08 pm
Subject: Re: [Tutor] If/elif/else when a list is empty

On Mon, Sep 13, 2010 at 9:58 PM, <aenea...@priest.com> wrote:

Hi,

I'm parsing IMDB movie reviews (each movie is in its own text file). In my script, I'm trying to extract genre information. Movies have up to three categories of genres--but not all have a "genre" tag and that fact is making my script abort whenever it encounters a movie text file that doesn't have a "genre" tag.

I thought the following should solve it, but it doesn't.
The basic question is how I say "if genre information doesn't exist at all, just make rg1=rg2=rg3='NA'"?

rgenre = re.split(r';', rf.info["genre"]) # When movies have genre information they store it as <genre>Drama;Western;Thriller</genre>

if len(rgenre)>0:
    if len(rgenre)>2:
        rg1=rgenre[0]
        rg2=rgenre[1]
        rg3=rgenre[2]
    elif len(rgenre)==2:
        rg1=rgenre[0]
        rg2=rgenre[1]
        rg3="NA"
    elif len(rgenre)==1:
        rg1=rgenre[0]
        rg2="NA"
        rg3="NA"
else len(rgenre)<1: # I was hoping this would take care of the "there is no genre information" scenario but it doesn't
    rg1=rg2=rg3="NA"

This probably does a weird nesting thing, but even the simpler versions I have tried don't work.

Thanks very much for any help!

Tyler

_______________________________________________
Tutor maillist - Tutor@python.org
To unsubscribe or change subscription options:
http://mail.python.org/mailman/listinfo/tutor

Hey Tyler, you can simplify this with a one-liner.

rg1, rg2, rg3 = rgenre + ["NA"]*(3-len(rgenre[:3]))

Hope that helps. If you have any questions, feel free to ask.

Vince
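
For reference, the traceback above shows the KeyError being raised by rf.info["genre"] itself, so no length check placed after that line ever gets a chance to run. Below is a minimal sketch of one way around it, assuming rf.info behaves like a plain dict (the thread does not show its type). The names split_genres, slots and filler are hypothetical and introduced only for illustration. Note also that splitting an empty string yields [''] (length 1), so the sketch filters out empty pieces before padding.

```python
def split_genres(info, slots=3, filler="NA"):
    """Return exactly `slots` genre strings, padded with `filler`.

    `info` is assumed to be a dict-like object such as rf.info.
    """
    # dict.get returns the default ("") instead of raising KeyError
    # when the file has no <genre> tag.
    raw = info.get("genre", "")
    # Drop empty pieces: "".split(";") gives [""], not [].
    genres = [g.strip() for g in raw.split(";") if g.strip()]
    # Truncate first so unpacking never sees more than `slots` items.
    genres = genres[:slots]
    return genres + [filler] * (slots - len(genres))


# Example with a made-up info dict shaped like the <info> block above.
rg1, rg2, rg3 = split_genres({"title": "High Noon", "genre": "Drama;Western"})
print("%s %s %s" % (rg1, rg2, rg3))  # prints: Drama Western NA
```

A try/except KeyError around the lookup would work as well; the .get form just keeps the missing-tag case and the empty-tag case on the same path.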