On Tue, Feb 2, 2010 at 4:19 PM, Kent Johnson <ken...@tds.net> wrote:
> On Tue, Feb 2, 2010 at 9:33 AM, Norman Khine <nor...@khine.net> wrote:
>> On Tue, Feb 2, 2010 at 1:27 PM, Kent Johnson <ken...@tds.net> wrote:
>>> On Tue, Feb 2, 2010 at 4:16 AM, Norman Khine <nor...@khine.net> wrote:
>>>
>>>> here are the changes:
>>>>
>>>> import re
>>>> file=open('producers_google_map_code.txt', 'r')
>>>> data = repr( file.read().decode('utf-8') )
>>>
>>> Why do you use repr() here?
>>
>> i have latin-1 chars in the producers_google_map_code.txt' file and
>> this is the only way to get it to read the data.
>>
>> is this incorrect?
>
> Well, the repr() call is after the file read. If your data is latin-1
> you should decode it as latin-1, not utf-8:
> data = file.read().decode('latin-1')
>
> Though if the decode('utf-8') succeeds, and you do have non-ascii
> characters in the data, they are probably encoded in utf-8, not
> latin-1. Are you sure you have latin-1?
>
> The repr() call converts back to ascii text, maybe that is what you want?
>
> Perhaps you put in the repr because you were having trouble printing?
>
> It smells of programming by guess rather than a correct solution to
> some problem. What happens if you take it out?
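
A small sketch of the utf-8 versus latin-1 point quoted above. It uses Python 3 syntax (the thread itself is Python 2), and the sample string is made up for illustration rather than taken from the real file. It only shows why a successful decode('utf-8') suggests the data is not latin-1:

```python
# Python 3 sketch; the sample text is an assumption, not data from the real file.
sample = "Caf\u00e9 M\u00fcnster"  # "Café Münster"

utf8_bytes = sample.encode("utf-8")
latin1_bytes = sample.encode("latin-1")

# Decoding with the codec that produced the bytes round-trips cleanly.
assert utf8_bytes.decode("utf-8") == sample
assert latin1_bytes.decode("latin-1") == sample

# latin-1 maps every possible byte to a character, so it never raises,
# even on UTF-8 data; the result is silently garbled text instead.
mojibake = utf8_bytes.decode("latin-1")

# utf-8 is strict: these non-ASCII latin-1 bytes are not valid UTF-8,
# so the decode fails instead of returning wrong text.
try:
    latin1_bytes.decode("utf-8")
    latin1_decodes_as_utf8 = True
except UnicodeDecodeError:
    latin1_decodes_as_utf8 = False
```

So a decode('latin-1') that "works" proves little, while a decode('utf-8') that works on data with accented characters is fairly good evidence that the file is UTF-8.
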
When I take the repr() call out, I get an empty list, whereas both

data = repr( file.read().decode('latin-1') )

and

data = repr( file.read().decode('utf-8') )

return the full list.

Here is the file:
http://cdn.admgard.org/documents/producers_google_map_code.txt

>
> Kent
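
One possible explanation for the empty list, offered as a guess since the regular expression is not shown in this thread: repr() turns real newlines into the two characters backslash and n, so a pattern whose '.' has to cross a line break can match the repr() text but not the raw text. The sketch below uses Python 3 syntax, a made-up sample string, and a hypothetical pattern:

```python
import re

# Made-up sample spanning several lines; not the contents of the real file.
text = "addMarker(\n  'Caf\u00e9'\n);"

# Hypothetical pattern; the regex actually used is not shown in the thread.
pattern = r"addMarker\((.*?)\);"

# On the raw text '.' does not match '\n', so nothing is found.
plain = re.findall(pattern, text)

# repr() replaces each newline with a literal backslash + 'n',
# so the whole record sits on one "line" and the pattern matches.
escaped = re.findall(pattern, repr(text))

# re.DOTALL lets '.' match newlines, so repr() is not needed.
dotall = re.findall(pattern, text, re.DOTALL)
```

If this is what is happening, passing re.DOTALL (or writing the pattern to allow newlines) would give the full list on the properly decoded text. Note that in Python 2, repr() of a unicode string also escapes non-ASCII characters (for example \xe9), so any matches taken from the repr() text would contain escape sequences rather than the real characters.
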