Hi Vince,

Thanks very much for the one-line version--unfortunately, I still get errors. 
The overall script runs over every text file in a directory, but as soon as it 
hits a text file without a <genre> tag, it gives this error:

Traceback (most recent call last):
  File "C:\Users\tylersc\Desktop\Tyler2\Tyler\words_per_review_IMDB_9-13-10.py", line 168, in <module>
    main(".","output.csv")
  File "C:\Users\tylersc\Desktop\Tyler2\Tyler\words_per_review_IMDB_9-13-10.py", line 166, in main
    os.path.walk(top_level_dir, reviewDirectory, writer )
  File "C:\Python26\lib\ntpath.py", line 259, in walk
    func(arg, top, names)
  File "C:\Users\tylersc\Desktop\Tyler2\Tyler\words_per_review_IMDB_9-13-10.py", line 162, in reviewDirectory
    reviewFile( dirname+'/'+fileName, args )
  File "C:\Users\tylersc\Desktop\Tyler2\Tyler\words_per_review_IMDB_9-13-10.py", line 74, in reviewFile
    rgenre = re.split(r';', rf.info["genre"])
KeyError: 'genre'

I'm about to give what may be too much information--I really thought there must 
be a way to say "don't choke if you don't find any rgenres because 
rf.info["genre"] was empty". But maybe I need to define the "None" condition 
earlier?
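
One way to express "don't choke if the genre is missing" is to look the key up 
with a default instead of indexing it directly. The sketch below assumes that 
rf.info behaves like a plain dict (the KeyError in the traceback suggests it 
does, but that is an assumption); split_genres is a hypothetical helper name, 
not something from the script.

```python
def split_genres(info, slots=3, filler="NA"):
    """Return exactly `slots` genre strings, padded with `filler`.

    `info` is assumed to be dict-like, standing in for rf.info.
    """
    # .get() returns None when the key is absent; "or" turns None
    # (or an empty value) into an empty string, so nothing raises.
    raw = info.get("genre") or ""
    # Split on ';' and discard empty pieces such as the one from "".
    genres = [g.strip() for g in raw.split(";") if g.strip()]
    # Pad generously, then cut back to the number of slots wanted.
    return (genres + [filler] * slots)[:slots]


# A file with a <genre> tag and one without it.
rg1, rg2, rg3 = split_genres({"title": "High Noon", "genre": "Drama;Western"})
na1, na2, na3 = split_genres({"title": "High Noon"})
```

With this shape the missing-tag case is handled at the lookup itself, so no 
later if/elif chain has to test for it.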

Basically a text file has this structure:
<info>
<title>High Noon</title>
<genre>Drama;Western</genre> # But this tag doesn't exist for all text files
# etc
</info>
<review>
<author>u493498</author>
<rating>9 out of 10</rating>
<summary>A great flick</summary>
<text>blah blah blah</text>
# etc
</review>
# next review--all about the movie featured in the info tags






-----Original Message-----
From: Vince Spicer <vi...@vinces.ca>
To: aenea...@priest.com
Cc: tutor@python.org
Sent: Mon, Sep 13, 2010 9:08 pm
Subject: Re: [Tutor] If/elif/else when a list is empty





On Mon, Sep 13, 2010 at 9:58 PM, <aenea...@priest.com> wrote:

Hi,
 
I'm parsing IMDB movie reviews (each movie is in its own text file). In my 
script, I'm trying to extract genre information. Movies have up to three 
categories of genres--but not all have a "genre" tag and that fact is making my 
script abort whenever it encounters a movie text file that doesn't have a 
"genre" tag. 
 
I thought the following should solve it, but it doesn't. The basic question is 
how do I say "if genre information doesn't exist at all, just make rg1=rg2=rg3='NA'"?
 
# When movies have genre information they store it as
# <genre>Drama;Western;Thriller</genre>
rgenre = re.split(r';', rf.info["genre"])
 
if len(rgenre)>0:
    if len(rgenre)>2:
        rg1=rgenre[0]
        rg2=rgenre[1]
        rg3=rgenre[2]
    elif len(rgenre)==2:
        rg1=rgenre[0]
        rg2=rgenre[1]
        rg3="NA"
    elif len(rgenre)==1:
        rg1=rgenre[0]
        rg2="NA"
        rg3="NA"
# I was hoping this would take care of the "there is no genre information"
# scenario but it doesn't
else len(rgenre)<1:
    rg1=rg2=rg3="NA"
 
This probably does a weird nesting thing, but even the simpler versions I have 
tried don't work. 
 
Thanks very much for any help!
 
Tyler
      




_______________________________________________
Tutor maillist  -  Tutor@python.org
To unsubscribe or change subscription options:
http://mail.python.org/mailman/listinfo/tutor




Hey Tyler, you can simplify this with a one-liner.


rg1, rg2, rg3 = rgenre[:3] + ["NA"]*(3-len(rgenre[:3]))
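
To see how padding like this behaves, here is a small sketch. One detail worth 
knowing: re.split on an empty string returns [''] rather than [], so a 
len(rgenre)<1 test never fires. pad_genres is a hypothetical helper name used 
only for illustration, and dropping empty strings is my assumption about what 
is wanted.

```python
import re


def pad_genres(rgenre, slots=3, filler="NA"):
    # Drop empty strings, keep at most `slots` entries, then pad the rest.
    kept = [g for g in rgenre if g][:slots]
    return kept + [filler] * (slots - len(kept))


# Splitting an empty string gives a one-element list, not an empty one.
empty_split = re.split(r';', '')
print(empty_split, len(empty_split))

# Padding for no genres, two genres, and more than three genres.
print(pad_genres(empty_split))
print(pad_genres(re.split(r';', 'Drama;Western')))
print(pad_genres(re.split(r';', 'Drama;Western;Thriller;Noir')))
```

Note that this only helps once rgenre exists; a missing "genre" key raises 
KeyError on the lookup itself, before any padding runs.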


Hope that helps, if you have any questions feel free to ask.


Vince
